
Detailed Record

Author (Chinese): 李竣穎
Author (English): Li, Jun-Ying
Thesis Title (Chinese): 以語意結構為基礎產生電影摘要
Thesis Title (English): A Semantic-Based Framework for Movie Summarization
Advisor (Chinese): 林嘉文
Advisor (English): Lin, Chia-Wen
Committee Members (Chinese): 林嘉文, 朱威達, 王鈺強, 葉梅珍
Degree: Master's
University: National Tsing Hua University
Department: Institute of Communications Engineering
Student ID: 100064533
Year of Publication (ROC calendar): 103 (2014)
Graduation Academic Year: 102
Language: English
Number of Pages: 45
Keywords (Chinese): 電影摘要, 影片摘要
Keywords (English): movie summarization, video abstract
Abstract:
Thanks to advances in technology, users can conveniently access vast amounts of multimedia content through computers, laptops, smartphones, and other devices. Among these media, video is easy to access, but a viewer who is riding public transportation, on a flight, or otherwise pressed for time may be unable to finish a one- to two-hour movie, let alone grasp its plot in a short period. This makes video summarization especially important.
In this work, we propose a new dynamic video summarization method that departs from conventional approaches. First, the method uses the movie's screenplay to classify each scene of the input video and to check whether the scene's category contains attractive or important events. Next, the system builds a social network whose basic units are the role (character) combinations appearing in each scene, and uses this network to analyze how scenes are arranged and the causal relationships in their ordering. The event-detection results and the inter-scene structural relationships produced by these two stages, together with each scene's visual features, are fed into a machine learning model that extracts scenes that are semantically important and temporally coherent. From these candidates, the most representative set of scenes is chosen as the movie summary; we formulate this selection step as a mathematical optimization problem and solve it with an optimization algorithm to produce the final summary.
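The final selection step is cast as a knapsack-style optimization, and Section 3.6.4 of the thesis solves it with a greedy algorithm. The Python sketch below illustrates one plausible form of such a greedy heuristic: scenes with learned importance scores are picked by importance-per-second until a summary-length budget is exhausted. The Scene fields, the scoring, and the budget are illustrative assumptions, not the thesis's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Scene:
    scene_id: int       # position of the scene in the movie
    importance: float   # hypothetical learned importance score
    duration: float     # scene length in seconds

def greedy_knapsack_summary(scenes, budget_seconds):
    """Greedily pick scenes by importance-per-second until the time budget is used.

    This is a standard greedy heuristic for the 0/1 knapsack problem,
    shown only as a sketch of the kind of selection the thesis describes.
    """
    ranked = sorted(scenes, key=lambda s: s.importance / s.duration, reverse=True)
    selected, used = [], 0.0
    for scene in ranked:
        if used + scene.duration <= budget_seconds:
            selected.append(scene)
            used += scene.duration
    # Return the summary in original temporal order to preserve the storyline.
    return sorted(selected, key=lambda s: s.scene_id)

if __name__ == "__main__":
    demo = [Scene(0, 0.9, 120), Scene(1, 0.4, 60), Scene(2, 0.8, 90), Scene(3, 0.2, 30)]
    summary = greedy_knapsack_summary(demo, budget_seconds=200)
    print([s.scene_id for s in summary])
```

Re-sorting the chosen scenes into temporal order keeps the summary consistent with the storyline structure that the role-based network analysis is meant to capture.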
Table of Contents:
摘要 (Chinese Abstract) ii
Abstract iii
Contents iv
Chapter 1 Introduction 6
Chapter 2 Related Work 9
2.1 Background 9
2.2 Nomenclature 10
Chapter 3 Proposed Method 13
3.1 Scene detection 16
3.2 Scene-topic detection 17
3.2.1 Probabilistic Latent Semantic Analysis 17
3.3 Scene-event detection 19
3.4 Role-based social network analysis 20
3.4.1 Global RC-based Movie Analysis 20
3.4.2 Local RC-based Movie Analysis 22
3.5 Scene classification 23
3.6 Optimized Movie Summarization 24
3.6.1 Topic-event Weight 24
3.6.2 Calculating Importance of Scenes 26
3.6.3 Simplify the Content of Scene 26
3.6.4 Using Greedy Algorithm for Solving Knapsack Problem 27
Chapter 4 Experimental Results 32
4.1 Learning Model Predict Data 32
4.2 Movie Highlight Event Weight Prediction 35
4.3 Objective Assessment 37
4.4 Subjective Assessment 41
4.5 User interface 43
Chapter 5 Conclusion 44
References 45
Electronic full text: restricted to internal access only.