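Author (Chinese): 陳映竹
Author (English): Chen, Ying-Zhu
Thesis title (Chinese): 自動化產生同步雙語文法樣式
Thesis title (English): Learning to Extract Bilingual Grammar Patterns
Advisor (Chinese): 張俊盛
Advisor (English): Chang, Jason S.
Committee members (Chinese): 杜海倫; 賴淑麗
Committee members (English): Tu, Hai-Lun; Lai, Shu-Li
Degree: Master's
University: National Tsing Hua University
Department: Department of Computer Science
Student ID: 107062547
Year of publication (ROC calendar): 109 (2020)
Academic year of graduation: 108
Language: English
Number of pages: 40
Keywords (Chinese): 同步雙語文法樣式; 序列標註; 計算機輔助語言學習
Keywords (English): Bilingual Synchronous Grammar Patterns; Sequence Labeling; Computer Assisted Language Learning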
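This thesis proposes a method that uses a sequence labeling model to automatically identify English grammar patterns and to extract synchronous bilingual grammar patterns, which can be used to support language learning. In our method, English example sentences are converted into word sequences marked with grammar pattern labels, which serve as training data for the sequence labeling model. The method involves training a sequence labeling model to automatically identify English grammar patterns, producing manually annotated data, building a word translation table, and designing a procedure that uses the annotated data and the translation table to extract bilingual grammar patterns. At run time, the system takes the word queried by the user and displays English-Chinese synchronous grammar patterns ranked by usage frequency, along with related example sentences. We present a prototype website, FamiliarPatterns, that helps language learners learn the correct grammar patterns of words. A preliminary evaluation on randomly selected example sentences shows that our method achieves reasonably good accuracy.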
We introduce a method for automatically identifying English grammar patterns using sequence labeling and for extracting bilingual Synchronous Grammar Patterns (SGPs) to assist language learning. In our approach, English sentences are transformed into sequences of words tagged with grammar pattern labels, which serve as training data for a sequence labeling model. The method involves training a model to automatically identify English grammar patterns, generating annotated SGP data, creating a phrase table, and developing a method for extracting SGPs using the phrase table. At run time, the system takes a queried word and displays the corresponding English and Chinese synchronous grammar patterns, ranked by frequency, together with example sentences. We present a prototype, FamiliarPatterns, which applies the method to help learners adhere to correct word usage. Blind evaluation on a set of randomly sampled sentence pairs shows that the method performs reasonably well.
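The run-time step described in the abstract (look up a queried word, then show its synchronous patterns ranked by frequency with an example) can be illustrated with a minimal sketch. This is not the thesis implementation: the record layout, the names `RECORDS`, `build_index`, and `suggest`, and the sample patterns and sentences are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical annotated SGP records (illustrative data, not from the thesis):
# (headword, English pattern, Chinese pattern, (English example, Chinese example))
RECORDS = [
    ("apologize", "V for n", "為 n 道歉",
     ("He apologized for the delay.", "他為延誤道歉。")),
    ("apologize", "V for n", "為 n 道歉",
     ("She apologized for the mistake.", "她為這個錯誤道歉。")),
    ("apologize", "V to n", "向 n 道歉",
     ("He apologized to his teacher.", "他向老師道歉。")),
]


def build_index(records):
    """Count how often each (English, Chinese) pattern pair occurs per headword."""
    index = defaultdict(Counter)
    examples = {}
    for word, en_pattern, zh_pattern, example in records:
        pair = (en_pattern, zh_pattern)
        index[word][pair] += 1
        # Keep the first example seen for each pattern pair.
        examples.setdefault((word, pair), example)
    return index, examples


def suggest(word, index, examples, top_n=3):
    """Return up to top_n pattern pairs for a word, most frequent first."""
    counts = index.get(word, Counter())
    return [
        (en, zh, count, examples[(word, (en, zh))])
        for (en, zh), count in counts.most_common(top_n)
    ]


if __name__ == "__main__":
    index, examples = build_index(RECORDS)
    for en, zh, count, (en_ex, zh_ex) in suggest("apologize", index, examples):
        print(f"{en} | {zh} | count={count} | {en_ex} / {zh_ex}")
```

Ranking by raw counts is the simplest reading of "ranked by frequency"; the selection of representative patterns and examples in the thesis itself may use different criteria.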
Abstract
Abstract (Chinese)
Acknowledgements
Contents
List of Figures
List of Tables
1 Introduction
2 Related Work
3 Methodology
3.1 Problem Statement
3.2 Extracting English Grammar Patterns
3.2.1 Transforming Raw Data into a Training Dataset
3.2.2 Training a Sequence Labeling Model
3.3 Discovering Chinese Counterparts of English Grammar Patterns
3.3.1 Building a Phrase Table
3.3.2 Generating Annotated Data
3.3.3 Aligning English Grammar Patterns to Chinese
3.4 Selecting Representative SGPs and Examples
4 Experiments and Evaluation
4.1 Datasets
4.2 Training Process
4.3 Evaluation and Discussion
4.3.1 Evaluation Metrics
4.3.2 Evaluation Results and Discussion
5 Conclusion and Future Work
References
 
 
 
 