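Author (Chinese): 趙芳譽
Author (English): Chao, Fang Yu
Thesis title (Chinese): 融合社群媒體上文字、影像及圖片屬性的使用者興趣探勘技術
Thesis title (English): Mining User Interests from Social Media: Fusion of Textual and Visual Features
Advisor (Chinese): 林嘉文
Advisor (English): Lin, Chia Wen
Committee members (Chinese): 賴尚宏; 孫民; 曾新穆
Committee members (English): Lai, Shang Hong; Sun, Min; Tseng, Hsin Mu
Degree: Master's
University: National Tsing Hua University (國立清華大學)
Department: Department of Electrical Engineering
Student ID: 102061519
Year of publication (ROC calendar): 104 (2015)
Graduation academic year: 104
Language: Chinese, English
Number of pages: 43
Keywords (Chinese): 主題模型; 興趣發現; 社群網站分析
Keywords (English): Topic model; Social media analysis; Interest mining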
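In this thesis, we propose a method that combines textual and visual features of user-generated social media content to find the distribution of a user's interests. This interest distribution can be used for personalized advertisement recommendation or for recommending social content.
The framework consists of three steps: feature extraction, model training, and user interest discovery. We select well-organized boards of popular Pinterest users as training and test data, covering three popular main topics recommended by Pinterest and twelve finer-grained topics. For each pin we extract a term-document matrix as the textual feature, a bag-of-visual-words model as the low-level visual feature, and image attributes as the mid-level visual feature, in order to reduce the semantic gap between textual descriptions and low-level visual features.
After feature extraction, we apply a word selection step to filter out words whose topic distribution is not sufficiently distinct. The three types of features, expressed as new term-document matrices, are then used to train topic models with DLDA (discriminative latent Dirichlet allocation). Finally, a representative-distribution selection method determines the final topic distribution of each input document. In the prediction phase, we use additional pins from popular users to test classification accuracy, and additional pins from ordinary users to test how the recommendation system performs in practice.
The experimental results show that the method improves performance, and the image recommendation demonstration confirms its feasibility on real data.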
This thesis proposes a framework that jointly uses textual and visual features of user-generated social media data to mine the distribution of user interests. The mined distribution can be used for personalized ad recommendation or social content recommendation. The proposed framework consists of three steps: feature extraction, model training, and user interest mining. We choose boards from popular users on Pinterest to collect training and test data. For each pin we extract term-document matrices as textual features, bag of visual words (BoVW) as low-level visual features, and attributes as mid-level visual features to bridge the semantic gap between low-level visual features and textual descriptions. After feature extraction, a word selection process is applied to filter out words with an ambiguous distribution. The new term-document matrices of the three types of features are then used to train topic models using discriminative latent Dirichlet allocation (DLDA). Finally, a representative distribution selection method is applied to choose the final topic distribution of each input document.
In the prediction phase, pins from other popular users are used to evaluate the classification accuracy, and pins from other common users are used to evaluate the recommendation performance. Our experimental results show the efficacy of the proposed method. In addition, the image recommendation demonstration verifies the feasibility of applying our method to real data.
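The abstract does not state how a word's distribution is judged to be ambiguous, so the following is only an illustrative sketch of the textual part of the pipeline, not the author's implementation. It builds a term-document count matrix from tokenized pin descriptions and drops words whose counts are spread almost evenly across topic labels. The normalized-entropy criterion, the `max_entropy` threshold, the function names, and the toy data are all assumptions introduced for this example.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts vocab[i] in docs[j]."""
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    matrix = [[0] * len(docs) for _ in vocab]
    for j, doc in enumerate(docs):
        for w, c in Counter(doc).items():
            matrix[index[w]][j] = c
    return vocab, matrix


def normalized_topic_entropy(counts_per_topic):
    """Entropy of a word's topic distribution, scaled to [0, 1]."""
    total = sum(counts_per_topic)
    k = len(counts_per_topic)
    if total == 0 or k < 2:
        return 0.0
    probs = [c / total for c in counts_per_topic if c > 0]
    return -sum(p * math.log(p) for p in probs) / math.log(k)


def select_words(docs, labels, max_entropy=0.8):
    """Keep words whose topic distribution is concentrated (assumed criterion)."""
    topics = sorted(set(labels))
    t_index = {t: i for i, t in enumerate(topics)}
    vocab, matrix = term_document_matrix(docs)
    kept = []
    for w, row in zip(vocab, matrix):
        # Sum the word's counts over all documents sharing a topic label.
        per_topic = [0] * len(topics)
        for j, c in enumerate(row):
            per_topic[t_index[labels[j]]] += c
        if normalized_topic_entropy(per_topic) <= max_entropy:
            kept.append(w)
    return kept


# Hypothetical pin descriptions and board topics, for illustration only.
docs = [
    ["chocolate", "cake", "recipe"],
    ["cake", "idea"],
    ["dress", "idea"],
    ["summer", "dress"],
]
labels = ["food", "food", "fashion", "fashion"]
kept = select_words(docs, labels)
print(kept)
```

In this toy data the word "idea" appears equally often under both topics, so it is filtered out, while topic-specific words such as "cake" and "dress" are kept. The reduced vocabulary would then define the new term-document matrix passed to topic model training.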
Abstract (in Chinese)
Abstract
Contents
Chapter 1. Introduction
1.1 Research Background
1.2 Motivation and Objective
1.3 Thesis Organization
Chapter 2. Related Works
2.1 Topic Models
2.2 Multimedia Topic Exploration
Chapter 3. Proposed Method
3.1 Overview
3.2 Feature Extraction
3.3 Topic Model Training
3.4 Prediction
Chapter 4. Experiments and Discussion
4.1 Dataset
4.2 Confirmation
4.3 Comparison
4.4 Applications
4.5 Discussion
Chapter 5. Conclusion
References

(Full text is available for internal viewing only)