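Author (Chinese): 戴瑜廷
Author (English): Tai, Yu Ting
Thesis title (Chinese): 應用文字探勘之自動化新聞文本分析以探討社會對新聞事件之反應
Thesis title (English): Automatic Content Analysis Using Text Mining to Investigate How News Events Trigger the Response of Society
Advisor (Chinese): 林福仁
Advisor (English): Lin, Fu Ren
Oral defense committee (Chinese): 雷松亞; 徐茉莉
Oral defense committee (English): Ray, Soumya; Shmueli, Galit
Degree: Master's
University: National Tsing Hua University
Department: Institute of Service Science
Student ID: 101078505
Year of publication (ROC calendar): 104 (2015)
Academic year of graduation (ROC calendar): 103
Language: English
Number of pages: 90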
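Keywords (translated from Chinese): news summarization; text mining; content analysis; food safety; focused conversation; social learning
Keywords (English): Summarization; Text Mining; Automatic Content Analysis; Food safety; Focused Conversation Method; ORID; Social Learning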
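In recent years, food safety problems have emerged one after another, with the plasticizer, toxic starch, and fake oil incidents breaking out in succession. However, the volume of related reports is so large that the general public can hardly read all of the information effectively. Moreover, since these incidents all concern food safety, whether society has learned from past incidents and responds differently when a similar incident occurs is also a question worth exploring.
Yet news readers find it hard to grasp, from unstructured information, how society's reactions differ between events. The purpose of this study is therefore to automatically analyze multiple events under the same topic and to examine how society responds to news events.
This study proposes an automated text analysis system that analyzes multiple news events belonging to the same topic. First, using clustering, it presents what each stakeholder said at each stage of an event along two dimensions: event development stage and stakeholder. Next, the system uses summarization to extract the key points of an event's development and provide a news summary for each single event. Finally, the Focused Conversation Method (ORID) is used to measure the effectiveness of the system and, at the same time, to explore readers' reactions to the events.
With the automated text analysis system proposed in this study, the general public can understand the development of news events more quickly and effectively, and look back on their feelings, thoughts, and actions at the time the events occurred.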
In recent years, food safety crises have occurred one after another. Three major food safety events took place in sequence: “Plasticizer”, “Poison starch”, and “Fake oil”. However, the related news reports are too numerous for readers to digest efficiently. In addition, it is worth asking whether society learns from past experience and responds differently when a similar event happens again.
This study proposes a system that automatically analyzes related news reports belonging to the same topic. First, it uses clustering to present the opinions of each stakeholder in each period of an event's development. Second, the system extracts the important content of news reports using summarization and provides readers with a summary of each news event. Finally, the study combines the system with the Focused Conversation Method (ORID) to evaluate the effectiveness of the system and to explore readers' responses to the news events.
With the proposed system, readers can follow the development of a news event efficiently and recall their feelings, thoughts, and reactions at the time the event happened.
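The two presentation steps described in the abstract (a stakeholder-by-stage view of opinions, then an extractive summary) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the English stop-word list, the toy statements, and the scoring rule are hypothetical and are not taken from the thesis. In particular, the grid below groups statements by labels that are already given, standing in for the clustering step, and the summarizer uses a simple average term-frequency score rather than whatever method the thesis implements.

```python
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real system for Chinese news would need
# word segmentation and a Chinese stop-word list instead.
STOPWORDS = {"the", "a", "of", "in", "to", "and", "is", "was", "were", "on", "for"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [w for w in words if w and w not in STOPWORDS]


def opinion_grid(statements):
    """Arrange (stage, stakeholder, text) triples into a two-dimensional grid.

    Assumption: stage and stakeholder labels are already known; this stands
    in for the clustering step and does not perform any clustering itself.
    """
    grid = defaultdict(list)
    for stage, stakeholder, text in statements:
        grid[(stage, stakeholder)].append(text)
    return dict(grid)


def summarize(sentences, k=1):
    """Return the k sentences with the highest average term frequency,
    kept in their original order."""
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


# Made-up toy statements, not data from the thesis.
statements = [
    ("outbreak", "government", "The agency recalled the tainted oil."),
    ("outbreak", "manufacturer", "The manufacturer denied selling tainted oil."),
    ("aftermath", "government", "The agency fined the manufacturer."),
]
grid = opinion_grid(statements)
summary = summarize([text for _, _, text in statements], k=1)
```

Keying the grid by a `(stage, stakeholder)` pair keeps both dimensions explicit, so the same structure could be filled by a clustering step in place of the hand-assigned labels used here.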
Chapter 1 Introduction 1
1.1 Research Background 1
1.2 Research Motivation 3
1.3 Research Objectives 4
Chapter 2 Literature Review 5
2.1 Automatic Content Analysis 5
2.2 Text Summarization 6
2.3 Clustering Algorithm 8
2.3.1 Hierarchical Cluster Analysis (HCA) 8
2.3.2 Other Clustering Methods 9
2.4 Focused Conversation Method (ORID) 11
Chapter 3 System Framework and Methodology 14
3.1 Definition 15
3.2 System Architecture 16
3.3 Data Acquisition 18
3.4 Preprocessing 19
3.4.1 Word segmentation 19
3.4.2 Term Aggregation 19
3.4.3 Feature Selection 21
3.5 Opinion Extraction 22
3.6 Clustering 27
3.7 Summarization 28
3.8 Content Analysis 28
Chapter 4 System Implementation and Results 30
4.1 Data Source 30
4.2 System Implementation 30
4.3 Results 33
Chapter 5 Evaluation and Results 39
5.1 Evaluation Design 39
5.2 Evaluation Results 42
5.2.1 The understanding of news events 42
5.2.2 The change of response of each reader for three events 44
5.2.3 The change of response of stakeholders for three events 46
5.3 Discussions 50
Chapter 6 Conclusion and Future Work 51
References 53
Appendix A. Contents Presented to Subject in Round 1 57
Appendix B. Summarization Results and Contents Presented to Subject in Round 2 60
Appendix C. ORID Interview Transcript in Round 1 64
Appendix D. ORID Interview Transcript in Round 2 72
Appendix E. Opinions of Stakeholders Cross Three Events 84