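Author (Chinese): 高定慧
Author (English): Kao, Ting-Hui
Thesis title (Chinese): 結合語料庫與知識庫之動詞語意框解歧與習得
Thesis title (English): Combining corpus statistics and knowledge base to disambiguate and acquire verb frames
Advisor (Chinese): 張俊盛
Advisor (English): Chang, Jason S.
Oral defense committee (Chinese): 陳信希, 張嘉惠
Oral defense committee (English): Chen, Hsin-Hsi; Chang, Chia-Hui
Degree: Master's
Institution: National Tsing Hua University (國立清華大學)
Department: Department of Computer Science (資訊工程學系)
Student ID: 100062654
Year of publication (ROC calendar): 102 (2013)
Academic year of graduation: 101
Language: English
Number of pages: 52
Keywords (translated from Chinese): verb frame acquisition; word sense disambiguation; knowledge base; expectation-maximization algorithm
Keywords (English): Verb Frame Generation; Word Sense Disambiguation; Knowledge Base; Expectation-Maximization Algorithm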
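Verbs play a pivotal role in syntax, and for language learners, learning the associated verb frames is an important task. Unfortunately, most current online dictionaries, such as the Longman dictionary, give only very coarse, schematic representations of verb frames, namely the commonly listed labels "something" and "somebody". Such coarse labels do not help language learners effectively. In this thesis, we propose a method for automatically acquiring verb semantic frames that provides a more comprehensive and easier-to-understand representation. We first extract the arguments related to a verb, based on syntactic relations obtained from a corpus, and assemble them into verb argument tuples; we then use a knowledge base to determine the semantic category of each verb argument. Next, we resolve ambiguity by composing verb semantic tuples and estimating their probabilities of occurrence, and finally obtain the verb frames. We developed a prototype system, FrameFinder, that automatically generates verb semantic frames using this method, and used a manually compiled dictionary of verb patterns as the gold standard. The evaluation results show that the method achieves satisfactory accuracy on common verbs.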
Verb frames are very important for language learners, since they capture the semantics and word usage associated with verbs. Unfortunately, most online dictionaries, such as the Longman Dictionary, show verb frames with broad semantic categories (i.e., "something" and "somebody") that are not very informative. In this work, we introduce a method for automatically generating more comprehensive verb frames. The method involves extracting verb argument tuples based on grammatical relations acquired from a parsed corpus, obtaining the intended semantic categories for each argument from a knowledge base, estimating the probability of each semantically labeled tuple, and finally generating verb frames. We present a prototype system, FrameFinder, that applies the method to generate verb frames automatically. Evaluation on a set of verbs with manually compiled semantic patterns shows that the method extracts verb frames with high accuracy for the high-frequency verbs that matter most for language learning.
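This record does not give the thesis's actual formulas, so the following is only a minimal sketch of how estimating the probabilities of semantically labeled tuples could resolve ambiguity. It assumes a simple expectation-maximization loop over one verb's argument tuples. The toy knowledge base, the example tuples for "eat", and the function name em_semantic_tuples are all hypothetical and are not taken from FrameFinder.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy knowledge base: each noun maps to its candidate semantic
# classes (stand-ins for the classes a real knowledge base would supply).
CANDIDATES = {
    "boy": ["person"],
    "chef": ["person"],
    "apple": ["food", "plant"],
    "rice": ["food", "plant"],
    "soup": ["food"],
}

# Hypothetical (subject, object) argument tuples observed for the verb "eat".
TUPLES = [("boy", "apple"), ("chef", "rice"), ("boy", "soup"), ("chef", "apple")]


def em_semantic_tuples(tuples, candidates, iterations=20):
    """Estimate P(semantic tuple) for one verb with a simple EM loop."""
    # Every semantic labeling that each observed tuple could take.
    options = [list(product(*(candidates[w] for w in t))) for t in tuples]
    labels = {lab for opts in options for lab in opts}
    prob = {lab: 1.0 / len(labels) for lab in labels}  # uniform start

    for _ in range(iterations):
        counts = defaultdict(float)
        # E-step: split each observation across its candidate labelings
        # in proportion to the current probability estimates.
        for opts in options:
            total = sum(prob[lab] for lab in opts)
            for lab in opts:
                counts[lab] += prob[lab] / total
        # M-step: renormalize the expected counts into probabilities.
        n = sum(counts.values())
        prob = {lab: counts[lab] / n for lab in labels}
    return prob


probs = em_semantic_tuples(TUPLES, CANDIDATES)
subject_class, object_class = max(probs, key=probs.get)
print(f"[{subject_class}] eat [{object_class}]")
```

In this toy data the unambiguous tuple (boy, soup) pulls probability mass toward the (person, food) reading, so the ambiguous objects "apple" and "rice" end up labeled as food rather than plant. A real system would also need smoothing, a convergence test, and a threshold for deciding which frames to keep.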
Abstract (Chinese)
ABSTRACT
Acknowledgements
List of Figures
List of Tables
CHAPTER 1 INTRODUCTION
CHAPTER 2 RELATED WORK
CHAPTER 3 METHOD
    3.1 Problem Statement
    3.2 Extracting tuples of verb arguments
    3.3 Obtaining semantic roles
    3.4 Generating verb frames
CHAPTER 4 EXPERIMENTAL SETTING
    4.1 Verb frame Generation Methods Compared
    4.2 Evaluation metric
    4.3 Evaluation verb frame and Relevance judgments
CHAPTER 5 EVALUATION RESULTS
CHAPTER 6 FUTURE WORK AND SUMMARY
REFERENCES
APPENDIX A - Stanford Dependency Annotation
APPENDIX B - WordNet lexicographer classes with descriptions
APPENDIX C - CPA to WordNet Transformation