Author: Kuo, Chin-ting
Thesis title: Boosting Factual Consistency and High Coverage in Unsupervised Abstractive Summarization
Advisor: CHEN, YI-SHIN
Oral defense committee: Tsai, Tzong-Han; Peng, Wen-Chih
Keywords: Abstractive Summarization; Keyword Extraction; Reinforcement Learning; Unsupervised Learning; Coverage; Factual Consistency
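With the rapid growth of pre-trained models, abstractive summarization has gradually become the mainstream approach to summarization tasks, and the problem of generated summaries being inconsistent with the source document has become more apparent: a summary must be faithful to the source and should not fabricate content. Building on prior research in unsupervised abstractive summarization, this thesis strengthens the consistency of information between summary and source by adding a factual consistency scoring mechanism. In addition, we propose a new keyword extraction method that uses a dependency parser to find the keywords that are modified most often; these keywords guide the information that the unsupervised summary needs to cover. Evaluated with FEQA and ROUGE, the experimental results show significant improvements in both factual consistency and coverage of key information.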
Abstractive summarization has gained importance with the rapid growth of pre-trained language models. However, these models sometimes generate summaries containing information that is inconsistent with the original document. We refer to this critical problem as factual inconsistency. This research proposes an unsupervised abstractive summarization method that uses reinforcement learning to improve factual consistency and coverage. It includes a novel method for maintaining factual consistency between the generated summary and the original document, as well as a novel method for ranking keywords; the keywords support the model by tracking how much of the source information the summary covers. The experimental results validate the method, which outperforms existing methods.
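The keyword ranking idea mentioned in the abstract (favoring the words that are modified most often in a dependency parse) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy parse, the `MODIFIER_LABELS` set, and the `rank_keywords` helper are hypothetical and are not taken from the thesis, whose actual extraction rule is not specified here. A real pipeline would obtain the head indices and labels from a dependency parser.

```python
from collections import Counter

# Hypothetical toy parse: (token, index of head token, dependency label).
# In practice these triples would come from a dependency parser.
PARSE = [
    ("The", 2, "det"),
    ("new", 2, "amod"),
    ("summarizer", 3, "nsubj"),
    ("covers", 3, "ROOT"),
    ("important", 6, "amod"),
    ("factual", 6, "amod"),
    ("details", 3, "dobj"),
]

# Assumed set of labels that count as "modifying" a head word.
MODIFIER_LABELS = {"amod", "compound", "nmod", "advmod"}


def rank_keywords(parse, top_k=3):
    """Rank head words by how many modifiers attach to them."""
    counts = Counter()
    for _token, head, label in parse:
        if label in MODIFIER_LABELS:
            # Credit the head word that this token modifies.
            counts[parse[head][0]] += 1
    return counts.most_common(top_k)


if __name__ == "__main__":
    for word, n_modifiers in rank_keywords(PARSE):
        print(word, n_modifiers)
```

In this toy sentence, "details" receives two modifiers and "summarizer" one, so "details" ranks first. Which labels count as modifiers is a design choice that would need to be tuned for the parser and corpus in use.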
1 Introduction
2 Related Work
3 Framework
3.1 Agent Group
3.1.1 Summarizer
3.1.2 Keyword Extraction
3.1.3 Masking Process
3.2 Environment Group: Score Models
3.2.1 Factual Consistency
3.2.2 Coverage
3.2.3 Fluency and Brevity
4 Reward and Training
4.1 Reinforcement Learning
4.2 Training Order
4.3 Scorer Weight Setting
5 Experiment
5.1 Dataset
5.2 Experimental Setup
5.3 Evaluation Metrics
6 Result and Analysis
6.1 RQ1: Coverage Evaluation
6.1.1 ROUGE Score
6.1.2 Keyword Selection
6.2 RQ2: Factual Consistency Evaluation
6.2.1 FEQA Score
6.2.2 Human Evaluation
7 Conclusion & Future Work
8 Appendix
References

