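Author (Chinese): 周士堯
Author (English): Chou, Shih-Yao
Thesis title (Chinese): 應用文件探勘技術進行立法文本自動化分析
Thesis title (English): Automatic Content Analysis of Legislative Documents by Text Mining Techniques
Advisor (Chinese): 林福仁
Advisor (English): Lin, Fu-Ren
Oral defense committee (Chinese): 雷松亞; 鄭興
Oral defense committee (English): Ray, Soumya; Cheng, Hsing
Degree: Master's
University: National Tsing Hua University
Department: Institute of Service Science
Student ID: 100078515
Year of publication (ROC calendar): 102
Academic year of graduation: 101
Language: English
Number of pages: 48
Keywords (Chinese): 文件探勘; 支持向量機; 立法表現; 分類; 兩階段分群
Keywords (English): text mining; SVM; legislative performance; classification; two-stage clustering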
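The Parliamentary Library website of the Legislative Yuan offers an open and objective channel through which citizens can follow what happens in the Legislative Yuan each day, such as legislators' interpellations. However, the volume of public information is very large and poorly organized, so the general public may be unable to digest it effectively or to form a clear picture of legislators' performance from it, which defeats the purpose of this open channel. To overcome this difficulty, this study uses text mining techniques to identify the category of each legislator's legislative activity and then present each legislator's performance across domains.
Building on a legislative classification framework constructed by experts from the Institute of Political Science at National Sun Yat-sen University, this study performs feature extraction through two-stage clustering and then uses a support vector machine (SVM) to build a model that automatically assigns each legislative activity to the most suitable category.
To keep the system sustainable, this study also ran an experiment comparing the category labels contributed by political experts with those contributed by the general public. The results showed no significant difference, which supports letting the general public maintain and update the classification directly over the Internet in the future.
The automatic classification method proposed in this study, complemented by radar-chart visualization, is intended to help citizens better understand the activities of the Legislative Yuan and legislators' performance. Experimental results show that the method can effectively and automatically identify categories of legislative performance, so that the Parliamentary Library's public legislative information can be used on an ongoing basis to monitor legislators' performance across various dimensions.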
The Parliamentary Library website of Taiwan’s Legislative Yuan provides a fair and objective channel for the public to track the daily activities of the Legislative Yuan, including legislators’ inquiries. However, the quantity of generated documents is so large that the general public may not be able to keep up with the legislative performance of each legislator from these contents. To narrow the gap between the generation of legislative documents and the public's ability to make sense of them, this study proposes a text mining mechanism that automatically classifies the legislative documents associated with each legislator and then presents the proportion of each legislator's activity that falls in each category.
This study first adopts a basic legislative categorical structure established by domain experts. A two-stage clustering procedure is then applied to perform feature selection on the legislative documents, and the SVM method is used to build a model that assigns each new document to the appropriate category.
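The two-stage clustering step can be illustrated with a minimal sketch, assuming the usual reading of the technique (agglomerative clustering to obtain initial centroids, followed by k-means refinement). The record does not give the thesis's actual implementation, so the function names, the centroid linkage, the Euclidean distance, and the toy vectors below are all assumptions made for illustration; the SVM step is not shown.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def hierarchical_seeds(points, k):
    """Stage 1 (assumed): merge the two closest clusters until k remain,
    then return the centroids of those k clusters."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = euclidean(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [centroid(c) for c in clusters]

def kmeans(points, seeds, max_iter=100):
    """Stage 2 (assumed): k-means refinement starting from the stage-1 centroids."""
    centers = [list(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda c: euclidean(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = centroid(members)
    return labels, centers

def two_stage_clustering(points, k):
    """Run stage 1 to obtain seeds, then stage 2 to refine them."""
    return kmeans(points, hierarchical_seeds(points, k))

# Hypothetical toy vectors (for example, weights of four terms on two topics).
toy_points = [[1.0, 0.1], [0.9, 0.0], [0.1, 1.0], [0.0, 0.9]]
toy_labels, toy_centers = two_stage_clustering(toy_points, 2)
print(toy_labels)  # [0, 0, 1, 1]
```

Seeding k-means from a hierarchical pass avoids random initialization, at the cost of a quadratic number of distance computations per merge in stage 1, which is acceptable only for small vocabularies in a sketch like this.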
To keep the classification categories up to date, this study also evaluates the difference between the labels assigned by domain experts and those assigned by the general public. If the two sets of labels do not differ significantly, the general public can be invited via the Internet to maintain the categories of newly generated legislative documents.
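One simple way to quantify how closely two groups of labelers agree is Cohen's kappa, sketched below. This record does not say which statistic or significance test the thesis used, so kappa is a swapped-in illustration rather than the author's method, and the category names in the example are hypothetical.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two label sequences over the same documents.

    Returns 1.0 for perfect agreement and about 0.0 for agreement
    no better than chance.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    # Fraction of documents on which both labelers chose the same category.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each labeler's category frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both labelers used one and the same category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical labels for four documents.
expert_labels = ["finance", "education", "defense", "finance"]
public_labels = ["finance", "education", "finance", "finance"]
print(round(cohens_kappa(expert_labels, public_labels), 3))  # 0.556
```

Kappa measures agreement rather than testing for a significant difference, so it would complement, not replace, whatever test the experiment relies on.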
Experimental results show the effectiveness of the proposed text mining mechanism, which automatically classifies legislative documents to reveal legislators’ performance. With this result, people can monitor legislators and track their legislative activities using information from the Parliamentary Library of the Legislative Yuan, and update their perception of legislative performance in various categories.
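The per-legislator summary described in this abstract, the share of each legislator's documents that falls in each category, can be computed from classifier output along the following lines. This is a minimal sketch: the input format (pairs of legislator and predicted category) and all names in it are assumptions, not the thesis's data model.

```python
from collections import Counter, defaultdict

def category_proportions(predictions):
    """Map each legislator to {category: share of that legislator's documents}.

    `predictions` is an iterable of (legislator, predicted_category) pairs,
    one pair per classified document (hypothetical input format).
    """
    counts = defaultdict(Counter)
    for legislator, category in predictions:
        counts[legislator][category] += 1
    proportions = {}
    for legislator, per_category in counts.items():
        total = sum(per_category.values())
        proportions[legislator] = {
            category: n / total for category, n in per_category.items()
        }
    return proportions

# Hypothetical classifier output for two legislators.
example = [
    ("Legislator A", "finance"),
    ("Legislator A", "finance"),
    ("Legislator A", "education"),
    ("Legislator B", "defense"),
]
for name, shares in category_proportions(example).items():
    print(name, shares)
```

Each inner dictionary holds the values that would be plotted on one legislator's radar chart, one axis per category.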
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objective
Chapter 2 Literature Review
2.1 Lack of Research for Information Technique in Political Science
2.2 Legislative Categorical Structure Initialization
2.3 Support Vector Machine (SVM)
2.4 Two-stage Clustering
2.4.1 Stage 1 (hierarchical clustering)
2.4.2 Stage 2 (k-means clustering)
Chapter 3 Research Framework
3.1 System Architecture
3.2 Pre-processing
3.3 Showing Relevant Keywords by Clustering
3.4 Categorical Labeling by Domain Experts
3.5 Automatic Classification
Chapter 4 System Implementation and Experimental Design
4.1 Data Sources
4.2 System Implementation
4.3 Evaluation Criteria
4.4 Experimental Design
4.4.1 Experiment A: The Evaluation of Classification Result
4.4.2 Experiment B: The Evaluation between Expert and Public Labeling
Chapter 5 Experimental Results
5.1 The Evaluation of Classification Results
5.2 The Comparison between Experts and Public Labeling
5.3 The Discussion of Experimental Results
5.4 The Legislators’ Performance Shown by Radar Chart
Chapter 6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work
References
Appendix