Author (Chinese): 蕭凡凱
Author (English): Hsiao, Fan Kai
Thesis Title (Chinese): 基於人事時地分析之社群相簿摘要系統
Thesis Title (English): Automatic Album Summarization Based on 4W: Who, When, Where, and What
Advisor (Chinese): 林嘉文
Advisor (English): Lin, Chia Wen
Committee Members (Chinese): 許秋婷、鄭文皇、胡敏君
Committee Members (English): Hsu, Chiou Ting; Cheng, Wen Huang; Hu, Min Chun
Degree: Master's
University: National Tsing Hua University
Department: Institute of Communications Engineering
Student ID: 102064546
Publication Year: 2015 (ROC 104)
Graduation Academic Year: 103 (2014/15)
Language: English
Pages: 39
Keywords (Chinese): 個人相簿摘要、相簿分析、影像推薦
Keywords (English): Album summarization, Album analysis, Image recommendation
Statistics:
  • Recommendations: 0
  • Views: 613
  • Rating: *****
  • Downloads: 7
  • Bookmarks: 0
  In this thesis, we propose a method that automatically summarizes an album according to the who, when, where, and what it contains. Given a particular viewer, the proposed method first computes the social relationships within the album, then considers the viewer's relationships with the album's characters, the social network, and the 4W factors, and automatically generates a summary tailored to that viewer.
  Our method consists of two main parts. In the first step, we extract the who, when, where, and what from the album. "Who" covers the characters in the album and the relationships among them: face detection and clustering identify the character groups, and co-occurrence and appearance frequency, combined with the album's social relationships, yield a character social matrix. "What" refers to the visual content of the photos: local and global features are used to cluster photos into a hierarchy, producing the album's scene structure. "When" covers the time-based events in the album: the distribution of time gaps between photos determines the event boundaries. "Where" is the shooting location of each photo, i.e., its GPS information. Our observations, together with prior studies, indicate that these four factors suffice to fully describe, analyze, and manage an album.
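The "when" step above, splitting an album into events using time gaps between photos, can be sketched as follows. This is a minimal illustration: the fixed 8-hour threshold is an assumption for demonstration, whereas the thesis derives the boundary from the distribution of time gaps itself.

```python
from datetime import datetime, timedelta

def split_events(timestamps, gap_hours=8.0):
    """Split photo timestamps into events.

    A new event starts whenever the gap to the previous photo exceeds
    `gap_hours` (an illustrative fixed threshold; the thesis instead
    determines the boundary from the time-gap distribution).
    """
    if not timestamps:
        return []
    ts = sorted(timestamps)
    events = [[ts[0]]]
    for prev, cur in zip(ts, ts[1:]):
        if cur - prev > timedelta(hours=gap_hours):
            events.append([cur])    # large gap -> start a new event
        else:
            events[-1].append(cur)  # small gap -> same event
    return events

# Three morning photos on July 4, two evening photos on July 5.
photos = [datetime(2015, 7, 4, h) for h in (9, 10, 11)] + \
         [datetime(2015, 7, 5, 20), datetime(2015, 7, 5, 21)]
events = split_events(photos)
print(len(events))  # → 2
```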
  Since our goal is a viewer-centric personalized album summary, the second step takes the viewer as input and uses the constructed character social matrix to measure the degree of interaction between the viewer and the other people in the album. Starting from this person-centric score and combining it with the events, locations, and scene structure, we formulate an optimization problem and solve it with a greedy algorithm. Summary selection thus accounts for all 4W factors simultaneously and produces the best summary for each particular viewer.
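A greedy solver of this flavor can be sketched as follows. The set-score function here is a toy stand-in: the thesis's actual objective combines viewer-character affinity ("who") with event, location, and scene-structure terms, which are not reproduced here.

```python
def greedy_summarize(photos, score, k):
    """Greedily pick k photos maximizing a set-score function.

    `score(S)` evaluates a candidate summary S; at each step the photo
    with the largest marginal gain is added. This mirrors the greedy
    solution strategy described in the abstract, not the exact objective.
    """
    summary = []
    remaining = list(photos)
    while remaining and len(summary) < k:
        best = max(remaining, key=lambda p: score(summary + [p]) - score(summary))
        summary.append(best)
        remaining.remove(best)
    return summary

# Toy objective: cover as many distinct (event, location) pairs as possible.
photos = [("e1", "paris"), ("e1", "paris"), ("e1", "tokyo"), ("e2", "tokyo")]
coverage = lambda S: len(set(S))
print(greedy_summarize(photos, coverage, 2))  # → [('e1', 'paris'), ('e1', 'tokyo')]
```

Greedy selection is a natural fit here because coverage-style objectives reward diversity: once a photo's event/location pair is covered, near-duplicates contribute no marginal gain.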
  In our experiments, we conduct a subjective evaluation using an online personal album system that simulates how viewers browse albums: each user enters his or her identity and views the corresponding summary. The experiments demonstrate the practicality of album summarization for personal albums and show that, compared with other methods, ours lets viewers understand the album's story more completely and provides a better viewing experience.
In this paper, we present an automatic personal album summarization system. Given a query viewer, the proposed method combines the 4W factors, who, when, where, and what, to compute an optimization function and output the summary.
Our method consists of two parts. The first part is album structure extraction, in which we extract the following structures. For "who" is in the album, we compute the characters' correlation matrix and appearance frequencies, and further apply community detection to find close relationships. "When" the photos were taken helps split the album into time-based events. "Where" the photos were taken is extracted from GPS data, which helps maintain geolocation diversity. The last factor is "what" the image contents are about: the contents are analyzed with local and global features, which form hierarchical structures in the album called shots and scenes. In the second part, given a viewer query, we combine the 4W factors into an optimization function. Solving it with a greedy algorithm yields a summary that depends on who the viewer is.
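The character correlation computation described in the abstracts can be sketched as a co-occurrence matrix built from per-photo character lists (e.g., the output of face detection and clustering). This is an illustrative stand-in for the thesis's character social matrix, not its exact formulation.

```python
import itertools

def cooccurrence_matrix(photo_characters, n_characters):
    """Build a character co-occurrence matrix from per-photo character sets.

    Characters appearing in the same photo get their pairwise counts
    incremented; the diagonal holds each character's appearance frequency.
    A community-detection step (as in the thesis) could then run on this
    matrix to find closely related character groups.
    """
    M = [[0] * n_characters for _ in range(n_characters)]
    for chars in photo_characters:
        for c in chars:
            M[c][c] += 1                      # appearance frequency
        for a, b in itertools.combinations(sorted(chars), 2):
            M[a][b] += 1                      # symmetric co-occurrence
            M[b][a] += 1
    return M

album = [{0, 1}, {0, 1, 2}, {2}]
M = cooccurrence_matrix(album, 3)
print(M[0][1])  # → 2 : characters 0 and 1 appear together in two photos
```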
For evaluation, we build an online album viewing system that models how users typically browse albums, and conduct a subjective assessment: each viewer inputs who he or she is, and the system shows the corresponding summary. The results demonstrate the effectiveness of the proposed personal album summarization method. Compared with other methods, ours lets viewers understand more of the album content and is preferred by viewers.
Abstract (Chinese) 2
Abstract 3
Contents 4
Chapter 1 Introduction 5
1.1 Research Background and Motivation 5
1.2 Research Objective 7
1.3 Thesis Organization 8
Chapter 2 Related Work 10
2.1 Visual Summarization Approach 10
2.2 Personal Album Summarization Approach 12
Chapter 3 Proposed Method 15
3.1 Overview 15
3.2 Album Structure Extraction 15
3.3 The Photo Selection Algorithm 23
Chapter 4 Experiments and Discussion 28
4.1 Data Set and Environment Setting 28
4.2 Subjective Assessment 29
4.3 Discussion 35
Chapter 5 Conclusion 36
Reference 37
[1] J. C. Platt, M. Czerwinski, and B.A. Field, “PhotoTOC: Automatic clustering for browsing personal photographs,” Microsoft Research Technical Report MSR-TR-2002-17, 2002.
[2] J. C. Platt, “AutoAlbum: Clustering digital photographs using probabilistic model merging,” in Proc. IEEE Workshop on Content-based Access of Image and Video Libraries, pages 96 – 100, 2000.
[3] P. Sinha, H. Pirsiavash, and R. Jain, “Personal photo album summarization,” in Proc. ACM International Conference on Multimedia, 2009.
[4] J. Yang, J. Luo, J. Yu, and T. Huang, “Photo Stream Alignment and Summarization for Collaborative Photo Collection and Sharing,” IEEE Transactions on Multimedia, vol. 14, no. 6, pp.1642 – 1651, 2012.
[5] P. Obrador, R. de Oliveira, and N. Oliver, “Supporting personal photo storytelling for social albums,” in Proc. ACM International Conference on Multimedia, 2010.
[6] G. Kim and E. P. Xing. “Reconstructing Storyline Graphs for Image Recommendation from Web Community Photos,” IEEE Conference on Computer Vision and Pattern Recognition, 2014.
[7] L. Kennedy and M. Naaman, “Generating diverse and representative image search results for landmarks,” in Proc. Conf. 19th World Wide Web, pp. 297-306, 2008.
[8] I. Simon, N. Snavely and S. Seitz, “Scene summarization for online image collections,” in Proc. ICCV, pp. 1 – 8, 2007.
[9] I. Dhillon and D. Modha, “Concept Decompositions for Large Sparse Text Data Using Clustering,” Machine Learning, vol. 42, no. 1, pp. 143–175, 2001.
[10] L. Cao, X. Jin, Z. Yin, A. Pozo, J. Luo, J. Han, and T. Huang, “RankCompete: Simultaneous ranking and clustering of information networks,” Neurocomputing. Vol. 95, pp. 98-104, Oct. 2012.
[11] Y. Jing and S. Baluja, “Visualrank: Applying pagerank to large-scale image search,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 30, no. 11, pp. 1877-1890, Nov.2008.
[12] P. Obrador and N. Moroney, “Automatic image selection by means of a hierarchical scalable collection representation,” in Proc. IS&T/SPIE Electronic Imaging, Visual Communications and Image Processing, vol. 7257, pp. 72570W-1–12, 2009.
[13] A. C. Loui and A. E. Savakis, “Automated event clustering and quality screening of consumer pictures for digital albuming,” IEEE Trans. on Multimedia, pp. 390–402, Sept. 2003.
[14] M. Naaman, Y. J. Song, A. Paepcke, “Automatic organization for digital photographs with geographic coordinates,” in Proc. of the 4th ACM/IEEE-CS joint conference on Digital libraries, JCDL ’04, pp. 53–62. ACM, 2004.
[15] Microsoft Face API. Available: http://www.projectoxford.ai/demo/face
[16] S. Whittaker, O. Bergman, and P. Clough. “Easy on that trigger dad: a study of long term family photo retrieval,” Journal Personal and Ubiquitous Computing, 14, 1:31–43, 2010.
[17] S. M. Omohundro, “Best-first model merging for dynamic learning and recognition,” in Advances in Neural Information Processing Systems, vol. 4, pp. 958–969, 1992.
[18] D. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. J. on Comput. Vision, vol. 60, no. 2, pp. 91–110, 2004.
[19] L. Zheng, S. Wang, Z. Liu, and Q. Tian, “Packing and padding: Coupled multi-index for accurate image retrieval,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2014.
[20] A. Torralba, K. P. Murphy, W. T. Freeman, and M. A. Rubin, “Context-based vision system for place and object recognition,” in Proc. Int. Conf. Computer Vision, 2003.
[21] X. Qian, Y. Xue, X. Yang, Y. Y. Tang, X. Hou and T. Mei, “Landmark summarization with diverse viewpoints”, IEEE Trans. Circuits Syst. Video Technol, 2015.
[22] C. M. Tsai, L.W. Kang, C. W. Lin and W. S. Lin, “Scene-Based Movie Summarization Via Role-Community Networks,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 23, no.11, pp. 1927-1940, 2013.
[23] S. Gao, I. W.-H. Tsang, and L.T. Chia, “Kernel sparse representation for image classification and face recognition,” in Proc. European Conf. Computer Vision, 2010.
[24] C. S. Chang, W. J. Liao, and Y. S. Chen, “A mathematical theory for clustering in metric spaces,” under review.