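Level Up: A Suggestion Tool for Raising Writing Proficiency Level | National Tsing Hua University Theses and Dissertations Full-Text Image System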


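Author (Chinese):	許靜媛
Author (English):	Hsu, Jing-Yuan
Thesis title (Chinese):	Level Up：提升寫作等級的提示工具
Thesis title (English):	Level Up: Learning to Improve Learners' Writing Proficiency Level
Advisor (Chinese):	張俊盛
Advisor (English):	Chang, Jason S.
Committee members (Chinese):	劉奕汶、顏安孜、黃芸茵、高宏宇
Committee members (English):	Liu, Yi-Wen; Yan, An-Zi; Huang, Yun-Yin; Gao, Hong-Yu
Degree:	Master's
University:	National Tsing Hua University
Department:	Institute of Information Systems and Applications
Student ID:	109065537
Year of publication (ROC calendar):	111 (2022)
Graduation academic year (ROC calendar):	110
Language:	English
Number of pages:	36
Keywords (translated from Chinese):	English grammar analysis, English grammar improvement, language model, English word suggestion, computer-assisted language writing system
Keywords (English):	Grammatical Error Correction, Language Model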

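This thesis proposes a method that uses synonymous phrases to raise the grammatical level of a learner's writing. We analyze the sentence a user enters, extract its words and phrases, and generate higher-level synonyms for them, raising the writing level while keeping the meaning of the sentence unchanged.
The method involves training a writing-level classifier, parsing sentences and identifying phrases, and automatically classifying the level of each phrase in order to build a phrasebank annotated with grammatical levels. At run time, we parse the learner's input sentence and, based on the words or phrases identified, use a language model (LM) to recommend more advanced synonyms. We present a prototype grammar suggestion system, Level Up, which applies this method to a very large corpus and to learners' sentences or essays to assist their writing. Experimental results on public datasets show that, for the collocation errors learners commonly make, our system achieves better results than today's most representative grammatical error correction systems.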

We introduce a method for learning to generate higher-proficiency sentences that retain the meaning of a given sentence.
In our approach, phrases in the sentence are replaced with paraphrases chosen to raise the proficiency level while preserving the meaning.
The method involves automatically learning to classify phrases from a training set of phrase/level pairs, parsing sentences into dependency-based phrases, classifying these phrases into different levels, and storing the resulting phrase/level pairs in a phrasebank.
At run time, we identify phrases in the given sentence and replace each with a higher-level phrase from the phrasebank, ensuring that the replacements retain the meaning of the sentence. We present a prototype system, Level Up, that applies the method to a web corpus for improving writing proficiency.

Blind evaluation on a set of real examples shows that the method significantly outperforms existing writing assistance services. Our methodology cleanly supports combining expert knowledge in level-graded datasets with large-scale corpus data, resulting in a level-graded tool and resources for second and foreign language learning.
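
The run-time step described in the abstract (look up each phrase in a level-tagged phrasebank and swap in a higher-level paraphrase with the same meaning) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Entry` type, the `meaning` key, the toy phrasebank, and the CEFR levels assigned to each phrase are hypothetical and are not taken from the thesis. In particular, the shared `meaning` key is only a stand-in for the meaning-preservation check, and simple string matching is a stand-in for dependency-based phrase identification.

```python
import re
from dataclasses import dataclass

# CEFR levels in ascending order of proficiency.
CEFR_ORDER = ["A1", "A2", "B1", "B2", "C1", "C2"]


@dataclass(frozen=True)
class Entry:
    phrase: str
    level: str    # level a classifier would assign (made up here)
    meaning: str  # hypothetical key grouping phrases that are paraphrases


# Toy phrasebank with invented levels, for illustration only.
PHRASEBANK = [
    Entry("very big", "A1", "LARGE"),
    Entry("huge", "A2", "LARGE"),
    Entry("enormous", "B1", "LARGE"),
    Entry("get better", "A1", "IMPROVE"),
    Entry("improve", "A2", "IMPROVE"),
]


def level_rank(level):
    """Map a CEFR label to its position in the ascending scale."""
    return CEFR_ORDER.index(level)


def higher_level_paraphrases(phrase, bank):
    """Return same-meaning entries at a strictly higher level, highest first."""
    source = next((e for e in bank if e.phrase == phrase), None)
    if source is None:
        return []
    candidates = [
        e for e in bank
        if e.meaning == source.meaning
        and level_rank(e.level) > level_rank(source.level)
    ]
    return sorted(candidates, key=lambda e: level_rank(e.level), reverse=True)


def level_up(sentence, bank):
    """Replace every known phrase with its highest-level paraphrase, in one pass."""
    # Try longer phrases first so multi-word entries win over shorter ones.
    phrases = sorted((e.phrase for e in bank), key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, phrases)) + r")\b")

    def swap(match):
        candidates = higher_level_paraphrases(match.group(1), bank)
        return candidates[0].phrase if candidates else match.group(1)

    return pattern.sub(swap, sentence)


print(level_up("The very big storm passed and the weather will get better.", PHRASEBANK))
# prints: The enormous storm passed and the weather will improve.
```

A single regular-expression pass is used so that a replacement is never itself re-matched and replaced again; a fuller system would presumably rank several candidates in context rather than always taking the highest level.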

Abstract
Abstract (Chinese)
Acknowledgments
Contents
List of Figures
List of Tables
1 Introduction
2 Related Work
3 Methodology
3.1 Problem Statement
3.2 Learning to Construct a set of Level-tagged Phrases
3.2.1 Training a classifier on a set of level-tagged phrases
3.2.2 Retrieving phrases from corpus
3.2.3 Tagging phrases using level classifier
3.3 Run-Time Sentence Rephrasing and Filtering
4 Experiment
4.1 Dataset and Toolkits
4.2 Data Preprocessing
4.3 Model Implementation
4.3.1 CEFR Classifier
4.3.2 Masked Language Model
4.3.3 Textual Entailment Model
4.4 Evaluation Metrics
4.5 Evaluation Sentences and Their Relevance Judgments
5 Evaluation Results
6 Conclusion and Future Work
Reference
Appendices

Øistein E Andersen, Helen Yannakoudakis, Fiona Barker, and Tim Parish. Developing and testing a self-assessment and tutoring system. In Proceedings of the eighth workshop on innovative use of NLP for building educational applications, pages 32–41, 2013.
Colin Bannard and Chris Callison-Burch. Paraphrasing with bilingual parallel corpora. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, ACL '05, pages 597–604, USA, 2005. Association for Computational Linguistics. doi: 10.3115/1219840.1219914. URL https://doi.org/10.3115/1219840.1219914.

Inge Bartning, Maisa Martin, and Ineke Vedder. Communicative proficiency and linguistic development, volume 1. Lulu.com, 2010.
Regina Barzilay and Lillian Lee. Learning to paraphrase: An unsupervised approach using multiple-sequence alignment. CoRR, cs.CL/0304006, 2003. URL http://arxiv.org/abs/cs/0304006.
Samuel R. Bowman, Gabor Angeli, Christopher Potts, and Christopher D. Manning. A large annotated corpus for learning natural language inference. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, 2015.
Thorsten Brants. Web 1T 5-gram version 1. http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13, 2006.
Jill Burstein, Andrea Horbach, Ekaterina Kochmar, Ronja Laarmann-Quante, Claudia Leacock, Nitin Madnani, Ildikó Pilán, Helen Yannakoudakis, and Torsten Zesch, editors. Proceedings of the 16th Workshop on Innovative Use of NLP for Building Educational Applications, Online, April 2021. Association for Computational Linguistics. URL https://aclanthology.org/2021.bea-1.0.
BNC Consortium et al. British national corpus. Oxford Text Archive Core Collection, 2007.
Elozino Egonmwan and Yllias Chali. Transformer and seq2seq model for paraphrase generation. In Proceedings of the 3rd Workshop on Neural Generation and Translation, pages 249–255, Hong Kong, November 2019. Association for Computational Linguistics. doi: 10.18653/v1/D19-5627. URL https://aclanthology.org/D19-5627.
Matt Gardner, Joel Grus, Mark Neumann, Oyvind Tafjord, Pradeep Dasigi, Nelson F. Liu, Matthew Peters, Michael Schmitz, and Luke S. Zettlemoyer. AllenNLP: A deep semantic natural language processing platform. 2017.
Wen-Bin Han, Jhih-Jie Chen, Chingyu Yang, and Jason S Chang. Level-up: Learning to improve proficiency level of essays. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 207–212, 2019.

Julia Hancke and Detmar Meurers. Exploring CEFR classification for German based on rich linguistic modeling. Learner Corpus Research, pages 54–56, 2013.
Matthew Honnibal, Ines Montani, Sofie Van Landeghem, and Adriane Boyd. spaCy: Industrial-strength Natural Language Processing in Python, 2020. URL https://doi.org/10.5281/zenodo.1212303.
Elma Kerz, Daniel Wiechmann, Yu Qiao, Emma Tseng, and Marcus Ströbel. Automated classification of written proficiency levels on the CEFR-scale through complexity contours and RNNs. In Proceedings of the 16th Workshop on Innovative Use of NLP for Building Educational Applications, pages 199–209, 2021.
Zichao Li, Xin Jiang, Lifeng Shang, and Hang Li. Paraphrase generation with deep reinforcement learning. arXiv preprint arXiv:1711.00279, 2017.
Zichao Li, Xin Jiang, Lifeng Shang, and Qun Liu. Decomposable neural paraphrase generation, 2019.
Dekang Lin. Dependency-based evaluation of minipar. In Treebanks, pages 317–329. Springer, 2003.

Council of Europe. Council for Cultural Co-operation. Education Committee. Modern Languages Division. Common European framework of reference for languages: Learning, teaching, assessment. Cambridge University Press, 2001.
Aaditya Prakash, Sadid A. Hasan, Kathy Lee, Vivek V. Datla, Ashequl Qadir, Joey Liu, and Oladimeji Farri. Neural paraphrase generation with stacked residual LSTM networks. CoRR, abs/1610.03098, 2016. URL http://arxiv.org/abs/1610.03098.

Nigel D Turton and John Brian Heaton. Longman dictionary of common errors. Longman, 1996.

Sowmya Vajjala and Kaidi Lõo. Automatic CEFR level prediction for Estonian learner text. In Proceedings of the third workshop on NLP for computer-assisted language learning, pages 113–127, Uppsala, Sweden, November 2014. LiU Electronic Press. URL https://aclanthology.org/W14-3509.
Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, and Alexander M. Rush. Transformers: State-of-the-art natural language processing.
Helen Yannakoudakis, Øistein E Andersen, Ardeshir Geranpayeh, Ted Briscoe, and Diane Nicholls. Developing an automated writing placement system for ESL learners. Applied Measurement in Education, 31(3):251–267, 2018.
