
Detailed Record

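Author (Chinese): 鄭婷予
Author (English): Cheng, Ting-Yu
Thesis title (Chinese): 處理半監督式分群方法中的使用者觀點屬性與資料屬性的非線性關係
Thesis title (English): On Coping with Nonlinear Correlation between Perception Features and Data Features in Semi-supervised Clustering
Advisor (Chinese): 吳尚鴻
Advisor (English): Wu, Shan Hung
Committee members (Chinese): 張正尚; 林守德; 陳銘憲
Committee members (English): Chang, Cheng Shang; Lin, Shou De; Chen, Ming Syan
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 102062564
Year of publication (ROC calendar): 104
Academic year of graduation: 103
Language: English
Number of pages: 19
Keywords (Chinese): 分群; 半監督式分群; 使用者觀點; 非線性; 非線性嵌入; 取樣偏差; 觀點向量
Keywords (English): clustering; semi-supervised clustering; user perception; nonlinear; nonlinear embedding; sampling bias; perception vector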
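Semi-supervised clustering methods are used to find clusterings that agree with side information provided by users. However, because of sampling bias, the results found by many clustering methods often differ greatly from the clustering the user actually has in mind. Traditional instance-level side information may fall on data points that mislead the algorithm, leading to wrong clustering results. To address this problem, a related paper proposed using feature-level side information: perception vectors. However, the current method assumes that the relationship between data features and user perception features is linear. This assumption prevents the current method from capturing the nonlinear relationship between the two in some applications. In this thesis, we propose two nonlinear methods, Nonlinear Perception Embedded (NPE) and Neural Network (NN), to capture the nonlinear relationship between data features and user perception features and to obtain better clustering performance.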
Semi-supervised clustering algorithms have been proposed to identify data clusters that align with side information provided by users. However, the identified clusters are often still far from the true clusters perceived by users, mainly because of sampling bias: traditional instance-level side information may cover only a few, non-randomly sampled instances that mislead the algorithms toward wrong clusters. To overcome this problem, a related work proposes learning from feature-level side information, namely perception vectors. However, the existing method assumes a linear correlation between the data features and the perception features, and therefore cannot capture the nonlinear correlation that arises in some applications. In this thesis, we propose two approaches, Nonlinear Perception Embedded (NPE) and Neural Network (NN), that capture the nonlinear correlation between data features and perception features and achieve better clustering performance.
Title page (Chinese): 處理半監督式分群方法中的使用者觀點屬性與資料屬性的非線性關係
Title page (English): On Coping with Nonlinear Correlation between Perception Features and Data Features in Semi-supervised Clustering
Acknowledgments
Abstract
Abstract (Chinese)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 PE Clustering
2.1. Neural Network Approach (NN)
2.2. Linear PE (LPE) Review
2.3. Nonlinear PE (NPE)
Chapter 3 Performance Evaluation
3.1. Baselines and Settings
3.2. Parameter Tuning
3.3. Real Datasets
3.4. Evaluation Metrics
3.5. Results
Chapter 4 Conclusion
Bibliography

 
 
 
 