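Author (Chinese): 郝德
Author (English): De Hao
Thesis title (Chinese): 應用文件探勘技術進行媒體框架自動化分析
Thesis title (English): Automatic Content Analysis of Media Framing by Text Mining Techniques
Advisor (Chinese): 林福仁
Advisor (English): Fu-ren Lin
Oral defense committee (Chinese): 林福仁, 楊叔卿, 李永銘, 雷松亞
Oral defense committee (English): Fu-ren Lin, Shwu-Ching Young, Yung-Ming Li, Soumya Ray
Degree: Master's
Institution: National Tsing Hua University
Department: Institute of Service Science
Student ID: 101078506
Year of publication (ROC calendar): 103
Academic year of graduation: 102
Language: English
Number of pages: 64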
Keywords (Chinese): 文字探勘; 媒體框架; 立法; 分群; 奇異值分解 (SVD); 隱含狄利克雷分配 (LDA)
Keywords (English): Text mining; media framing; legislative; clustering; Singular Value Decomposition (SVD); Latent Dirichlet Allocation (LDA)
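Abstract (translated from Chinese):
Political news in Taiwan carries considerable influence on every platform. In the
course of producing news, however, reporters and media companies make selective or
slanted editorial choices, to some degree, according to their own political leanings
and opinions, and important information can be lost as a result.
The Parliamentary Library of Taiwan's Legislative Yuan provides complete and
detailed meeting records, including verbatim transcripts and video. This open data
helps the general public and scholars study and follow what happens inside the
Legislative Yuan, and it helps reduce the number of important issues missed by people
who rely on mass media alone. The volume of information is nevertheless too large to
track and study manually. To narrow the distance between the public and the
Legislative Yuan, this study proposes applying text mining techniques to legislative
texts and news texts to analyze media frames.
This study attempts to design a media framing system that requires minimal human
involvement. Most earlier studies needed experts to predefine the representative
vocabulary of each frame for a specific issue, a process that is itself a form of
bias. In our process, only a small number of stop words must be predefined by
experts; the remaining stages (SVD, clustering, and LDA) involve no human
intervention.
The system's results were evaluated by experts and by members of the general
public, who confirmed that it helps experts produce quantitative evidence of media
frames and discover media frames they had not anticipated. The general public and
experts can use the system to track and monitor how news media frame events.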
Abstract (English):
Political news is one of the most popular topics across media platforms in Taiwan,
such as TV channels, Internet news, and newspapers. News is produced through a
process of selection and rephrasing by reporters and media firms. Reporters'
personal political leanings and opinions may influence that process, and important
messages may be lost as a result.
In Taiwan, the website of the Parliamentary Library of the Legislative Yuan provides
detailed records of activities inside the Legislative Yuan, including transcripts and
video recordings of interpellations, conference speeches, interim proposals, and
legislative proposals. Although a complete record is available online, the volume of
legislative documents is far too large for citizens to make sense of. Better-organized
information would reduce the cognitive load readers face in understanding which
issues have been discussed by legislators and reported by the media. To narrow
the gap between legislative documents and the general public, this study proposes a text
mining mechanism that automatically clusters legislative and news documents by media
frame and then reports the proportion of each frame for a given source.
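
The last step described above, reporting the proportion of each frame per source, might be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the function name `frame_proportions`, the source labels, and the frame ids are hypothetical, and the (source, frame) pairs are assumed to come from an earlier clustering step.

```python
from collections import Counter, defaultdict


def frame_proportions(assignments):
    """Return {source: {frame: share}} from (source, frame) pairs."""
    counts = defaultdict(Counter)
    for source, frame in assignments:
        counts[source][frame] += 1
    return {
        source: {frame: n / sum(c.values()) for frame, n in c.items()}
        for source, c in counts.items()
    }


# Hypothetical cluster assignments: one (source, frame id) pair per document.
docs = [
    ("legislature", 0), ("legislature", 0), ("legislature", 1), ("legislature", 2),
    ("newspaper_a", 0), ("newspaper_a", 1), ("newspaper_a", 1), ("newspaper_a", 1),
]
shares = frame_proportions(docs)
for source, dist in shares.items():
    print(source, dist)
```

Comparing the resulting shares across sources is one simple way to see whether a news outlet emphasizes a frame more or less than the legislative record does.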
This study aims to develop an automatic clustering system that determines media
frames with minimal human intervention. Domain expert labeling is not used to
increase classification accuracy. Only the stop words and a few minor variables in
the feature selection stage require human adjustment according to the input files;
the rest of the system processes input documents with no expert labeling
or adjustment.
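
As a rough sketch of what such a low-supervision pipeline could look like, the toy example below removes stop words, builds TF-IDF vectors, and groups documents with plain k-means. It is an illustration under several assumptions, not the system from the thesis: the stop list, the English sample sentences, and all function names are hypothetical, whitespace tokenization stands in for Chinese word segmentation, and the SVD and LDA stages named in the keywords are omitted for brevity.

```python
import math
from collections import Counter

# Hypothetical stop list; the only hand-tuned input in this sketch.
STOP_WORDS = {"the", "a", "of", "on", "to", "and"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return one L2-normalized TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    n = len(docs)
    df = Counter(w for toks in tokenized for w in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[w] * math.log(n / df[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain k-means; centroids start at the first k vectors (deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each document to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


docs = [
    "the trade pact review of the committee on trade",
    "nuclear plant safety and the referendum",
    "a trade pact protest and review",
    "the nuclear plant referendum vote",
]
labels = kmeans(tfidf_vectors(docs), k=2)
print(labels)
```

No frame vocabulary is supplied in advance here; the grouping depends only on the stop list and the choice of `k`, which mirrors the kind of limited human input described above.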
Experimental results, evaluated by domain experts, show that the system proposed in
this study can assist political domain experts by providing hard evidence of media
framing. People can monitor and discover media framing phenomena using concrete
numbers and data.
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
Chapter 2 Literature Review
2.1 Application domains
2.2 Text mining
Chapter 3 Research Method
Chapter 4 System Architecture and Implementation
4.1 System Architecture
4.2 Raw Data Collection
4.3 Pre-Processing
4.4 Dimension Reduction
4.5 Multi-stage Clustering
4.5.1 K-means Clustering
4.6 Topic Distribution Modeling
Chapter 5 Evaluation Results
5.1 Data Source
5.2 Evaluation Criteria
5.3 Interview Design
5.3.1 Sample selection
5.3.2 Interview Question
5.4 Framing Results Shown by Bar Charts
5.5 Evaluation of Interview
5.5.1 Data Sense making
5.5.2 Value added service
5.6 Discussion of Interview Result
Chapter 6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work
Reference
(Full text is restricted to internal viewing.)