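Author (Chinese): 程尚謙
Author (English): Cheng, Shang-Chien
Thesis title (Chinese): 發展以WordNet為本的詞彙語意資料集
Thesis title (English): Developing a Word Sense Dataset Based on WordNet Hierarchy
Advisor (Chinese): 張俊盛
Advisor (English): Chang, Jason S.
Committee members (Chinese): 柯淑津; 許永真
Committee members (English): Ker, Sue J.; Hsu, Yung-jen
Degree: Master's
Institution: National Tsing Hua University
Department: Institute of Information Systems and Applications
Student ID: 105065509
Year of publication (ROC calendar): 107 (2018)
Academic year of graduation (ROC calendar): 106
Language: English
Number of pages: 33
Keywords (translated from Chinese): word sense disambiguation; semantic network; parallel corpus
Keywords (English): Word Sense Disambiguation; WSD; WordNet; Parallel Corpus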
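This thesis proposes a method that uses WordNet and a parallel corpus to disambiguate English word senses, providing WordNet with sense-relevant translations and bilingual example sentences. The results can support English learning through the learner's native language (e.g., Chinese) and can also be applied to other word sense disambiguation problems. The different senses of a polysemous word are usually translated into different words in another language, and our method relies mainly on this property to determine the sense. The method involves extracting corresponding English and Chinese word translations from a parallel corpus, training a classifier that distinguishes senses by their different translations, and finally selecting representative example sentences for each sense. We developed the extracted lexical-semantic data into a search system, LanguageNet, which lets users look up bilingual synonyms and usage examples for the different senses of a polysemous word. The evaluation results show that the proposed method achieves good accuracy in extracting sense-relevant translations and bilingual example sentences.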
We introduce a method for disambiguating word senses based on WordNet and a parallel corpus. The method can be used to provide accurate sense-relevant translations and bilingual examples that support word sense disambiguation and assist English learning through the learner's native language (e.g., Chinese). In our approach, different translations of a word determine the specificity of the senses. The method involves extracting word translations, training a classifier to distinguish words into groups of senses based on translations, and selecting sense-relevant example sentences. We present a prototype system, LanguageNet, that applies the proposed method to display bilingual synonyms and sense-relevant examples for the senses of a given word. The evaluation on a set of polysemous words shows that the method performs well at finding sense-relevant translations and bilingual examples.
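The three steps named in the abstract (extracting word translations, grouping occurrences into senses by translation, and selecting example sentences) can be illustrated with a minimal sketch. This is not the thesis implementation: the toy sentences, the alignment links, and the sense labels are invented for illustration. The lookup table `TRANSLATION_TO_SENSE` stands in for the trained classifier, and choosing the shortest sentence stands in for whatever example-selection criteria the thesis actually uses.

```python
from collections import Counter, defaultdict

# Toy word-aligned parallel data (all invented for illustration):
# (English tokens, Chinese tokens, alignment links as (en_index, zh_index)).
ALIGNED = [
    (["he", "sat", "by", "the", "bank"],
     ["他", "坐", "在", "河岸", "旁"],
     [(0, 0), (1, 1), (4, 3)]),
    (["the", "bank", "approved", "the", "loan"],
     ["銀行", "批准", "了", "貸款"],
     [(1, 0), (2, 1), (4, 3)]),
    (["she", "works", "at", "a", "bank"],
     ["她", "在", "銀行", "工作"],
     [(0, 0), (1, 3), (4, 2)]),
]

# Hypothetical translation-to-sense table; a stand-in for a trained classifier.
TRANSLATION_TO_SENSE = {
    "銀行": "bank (financial institution)",
    "河岸": "bank (river side)",
}


def extract_translations(aligned, target):
    """Count the Chinese words aligned to the English word `target`."""
    counts = Counter()
    for en, zh, links in aligned:
        for i, j in links:
            if en[i] == target:
                counts[zh[j]] += 1
    return counts


def group_examples_by_sense(aligned, target, translation_to_sense):
    """Group bilingual sentence pairs by the sense implied by the translation."""
    groups = defaultdict(list)
    for en, zh, links in aligned:
        for i, j in links:
            if en[i] == target and zh[j] in translation_to_sense:
                sense = translation_to_sense[zh[j]]
                groups[sense].append((" ".join(en), "".join(zh)))
    return dict(groups)


def pick_example(examples):
    """Pick the pair with the shortest English sentence (a simple heuristic)."""
    return min(examples, key=lambda pair: len(pair[0]))


if __name__ == "__main__":
    print(extract_translations(ALIGNED, "bank"))
    for sense, examples in group_examples_by_sense(
            ALIGNED, "bank", TRANSLATION_TO_SENSE).items():
        print(sense, "->", pick_example(examples))
```

The sketch only shows why distinct translations are a useful signal: occurrences of "bank" aligned to 銀行 and to 河岸 fall into separate groups, each of which can then supply its own bilingual example.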
Abstract
Acknowledgements
Contents
List of Figures
List of Tables

Chapter 1 Introduction
Chapter 2 Related Work
Chapter 3 Methodology
3.1 Problem Statement
3.2 Aligning and Extracting Word Translations
3.3 Performing Classification for WSD Based on Translations
3.3.1 Sense Categories
3.3.2 Training a Model for Extracting WSD Features
3.3.3 Classifying Chinese Translation to WordNet Senses
3.4 Selecting Sense Relevant Example Sentences
3.5 Run-Time: Providing Sense Relevant Example Sentences

Chapter 4 Experiment and Evaluation
4.1 Data Sets and Tools
4.2 Experimental Setting
4.3 Evaluation and Discussion
4.3.1 Evaluation Metrics
4.3.2 Evaluation Results

Chapter 5 Conclusion and Future Work

References