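Author (Chinese): 戴瑞克
Author (English): Derek Davis
Thesis title (Chinese): SociRank : 基於社群媒體影響力之新聞重要性排序
Thesis title (English): SociRank: Ranking prevalent topics using social media factors
Advisor (Chinese): 陳宜欣
Advisor (English): Chen, Yi-Shin
Oral defense committee (Chinese): 蘇豐文; 彭文志
Oral defense committee (English): Soo, Von-Wun; Peng, Wen-Chih
Degree: Master's
Institution: National Tsing Hua University
Department: Institute of Information Systems and Applications
Student ID: 101065427
Year of publication (ROC calendar): 103
Graduation academic year (ROC calendar): 102
Language: English
Number of pages: 48
Keywords (Chinese): 資訊過濾; 主題偵測; 主題排序; 社群網路分析
Keywords (English): information filtering; topic identification; topic ranking; social network analysis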
Historically, information about daily events has been provided by mass media sources, specifically news media. Today, social media services such as Twitter provide an enormous amount of user-generated data, which has great potential to contain informative news-related content. For this content to be useful, however, we must find a way to filter out noise and capture only the information that, based on its content similarity to news media, may be considered useful or valuable. Even after noise is removed, the remaining data still poses a problem of information overload. A person cannot process huge amounts of information all at once, so the most valuable information must be prioritized for consumption. To achieve prioritization, the information must be ranked in order of estimated importance. The temporal prevalence of a particular topic in news media is one significant factor of importance and may be considered the media focus of the topic. The topic's temporal prevalence in social media, specifically Twitter, indicates user interest and may be considered its user attention. Furthermore, the interaction between the social media users who mention the topic indicates the strength of the community discussing it and may be considered the user interaction. We propose an unsupervised method called SociRank, which identifies news topics that are prevalent in both social and news media and then ranks these topics, taking into account media focus, user attention, and user interaction as measures of importance.
1 Introduction
2 Related Work
3 Methodology
3.1 Pre-processing
3.1.1 News
3.1.2 Tweets
3.2 Key Term Graph Definition
3.2.1 Term Document Frequency
3.2.2 Key Term Extraction
3.2.3 Key Term Similarity
3.3 Key Term Graph Construction
3.3.1 Outlier Detection
3.3.2 Betweenness
3.3.3 Transitivity
3.3.4 Graph Clustering
3.4 Content Selection
3.4.1 Node Weighting
3.4.2 User Attention Selection
3.4.3 Media Focus Selection
3.5 User Interaction Definition
3.5.1 Reciprocity
3.5.2 User Interaction
3.6 Ranking
4 Experiment and Results
4.1 Co-Occurrence Measure
4.2 IQR Coefficient
4.3 Node Weighting
4.4 Method Evaluation
4.4.1 Topic Selection Evaluation
4.4.2 Topic Ranking Evaluation
4.5 Ranked List Comparison
5 Conclusion
(Full text not authorized for public access)