
Detailed Record

Author (Chinese): 鍾幸芸
Author (English): Chung, Hsin-Yun
Title (Chinese): 輔助寫作的文法提示系統
Title (English): Grammar Level Auto-Complete for Assistive Writing
Advisor (Chinese): 張俊盛
Advisor (English): Chang, Jason S.
Committee Members (Chinese): 張智星, 鍾曉芳
Committee Members (English): Jang, Jyh-Shing; Chung, Siaw-Fong
Degree: Master
University: National Tsing Hua University
Department: Institute of Information Systems and Applications
Student ID: 110065701
Publication Year (ROC calendar): 113 (2024)
Graduation Academic Year: 112
Language: English
Pages: 35
Keywords (Chinese): 文法提示、文字生成
Keywords (English): Grammar Pattern, Text Generation
This thesis presents GrammarGenie, a writing-suggestion system that automatically predicts the grammar pattern likely to follow an incomplete sentence. We crawl example sentences and their grammar patterns from a dictionary, convert them into the required format as training data, and build the system by fine-tuning the large language model T5 (Text-to-Text Transfer Transformer). Experimental results show that, in addition to predicting the grammar patterns of held-out dictionary example sentences very well, the system also demonstrates excellent predictive ability in practical use.
We present a method that automatically generates the corresponding grammar pattern for a given incomplete sentence: from a partial sentence, the system predicts the grammar patterns most likely to follow. The method involves crawling example sentences from a dictionary, converting them into pairs of incomplete sentences and grammar patterns, and using this data to fine-tune a large language model to complete the incomplete sentences. At run time, the system receives a partial sentence and outputs the highest-probability grammar pattern. Evaluation on a set of open-course transcripts shows that the system has excellent predictive capability on average. Our methodology supports combining many example sentences, resulting in improved model accuracy.
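The thesis text itself is not part of this record, but the pipeline the abstracts outline (convert dictionary example sentences into training pairs, fine-tune T5, decode the most probable pattern at run time) can be sketched roughly as follows. This is a minimal illustration assuming the Hugging Face Transformers and PyTorch libraries; the "t5-base" checkpoint, the "predict pattern:" prompt prefix, the COBUILD-style pattern "V on n", and the hyperparameters are illustrative assumptions, not details drawn from the thesis.

# Minimal sketch (not the thesis implementation): fine-tune T5 on
# (partial sentence -> grammar pattern) pairs, then predict at run time.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")  # checkpoint is an assumption
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# Hypothetical training pair: an incomplete sentence and the COBUILD-style
# grammar pattern that continues it.
source = "predict pattern: She insisted"
target = "V on n"

inputs = tokenizer(source, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids

# One gradient step; a real run loops over all converted dictionary examples.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()

# Run time: beam search decodes the highest-probability grammar pattern.
model.eval()
query = tokenizer("predict pattern: He apologized", return_tensors="pt")
with torch.no_grad():
    pattern_ids = model.generate(**query, num_beams=4, max_new_tokens=16)
print(tokenizer.decode(pattern_ids[0], skip_special_tokens=True))

Because both steps are framed as text-to-text generation, a single sequence-to-sequence checkpoint serves as predictor and ranker alike: beam search scores candidate patterns by sequence probability, yielding the highest-probability pattern the abstract refers to.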
Abstract i
Chinese Abstract ii
Acknowledgements iii
Contents iv
List of Figures vi
List of Tables vii
1 Introduction 1
2 Related Work 5
3 Method 9
3.1 Problem Statement 9
3.2 Prepare Training Data and Train Model 10
3.3 Run-Time Grammar Pattern Predicting 14
4 Experiment 16
4.1 Datasets and Pre-trained Model 17
4.2 Systems Compared 18
4.3 Test Data 19
4.4 Evaluation Metrics 20
5 Evaluation Results 23
5.1 Results of Automatic Evaluation 23
5.2 Results of Human Evaluation 24
6 Conclusion and Future Work 31
References 33



Peter F. Brown, Vincent J. Della Pietra, Peter V. Desouza, Jennifer C. Lai, and Robert L. Mercer. Class-based n-gram models of natural language. Computational Linguistics, 18(4):467–480, 1992.
Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901, 2020.
Jim Chang and Jason S. Chang. WriteAhead2: Mining lexical grammar patterns for assisted writing. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, pages 106–110, 2015.
Mia Xu Chen, Benjamin N. Lee, Gagan Bansal, Yuan Cao, Shuyuan Zhang, Justin Lu, Jackie Tsay, Yinan Wang, Andrew M. Dai, Zhifeng Chen, et al. Gmail Smart Compose: Real-time assisted writing. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2287–2295, 2019.
Dan Jurafsky and James H. Martin. Speech and Language Processing (3rd ed. draft). 2024. URL https://web.stanford.edu/~jurafsky/slp3/.
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
S. Hunston and G. Francis. Pattern Grammar: A Corpus-driven Approach to the Lexical Grammar of English. John Benjamins Publishing Company, 2000. ISBN 9789027222732. URL https://books.google.com.tw/books?id=nqqh46Q0uVMC.
Susan Hunston, G. Francis, and Elizabeth Manning. Collins COBUILD Grammar Patterns 1: Verbs. HarperCollins, 1996. ISBN 0003750620.
Susan Hunston, G. Francis, and Elizabeth Manning. Collins COBUILD Grammar Patterns 2: Nouns and Adjectives. HarperCollins, 1998. ISBN 9780003750676.
Chin-Yew Lin. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pages 74–81, 2004.
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space, 2013. URL https://arxiv.org/abs/1301.3781.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318, 2002.
Jeffrey Pennington, Richard Socher, and Christopher D. Manning. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543, 2014.
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140):1–67, 2020.
Tzu-Hsi Yen, Jian-Cheng Wu, Jim Chang, Joanne Boisson, and Jason S. Chang. WriteAhead: Mining grammar patterns in corpora for assisted writing. In Proceedings of ACL-IJCNLP 2015 System Demonstrations, pages 139–144, 2015.
Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, and Yoav Artzi. BERTScore: Evaluating text generation with BERT. arXiv preprint arXiv:1904.09675, 2019.
 
 
 
 