Author: Chen, Li-Kuang
Thesis title: Identification and Classification of Rhetorical Function Expressions in Academic Articles
Advisor: Chang, Jason S.
Keywords: rhetorical function; sequence labelling; move; computer-assisted English learning
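Rhetorical function expressions (RFEs, also known as "move" expressions) play a crucial role in building persuasive arguments in academic writing, yet existing learning resources are limited and usually present such expressions without context. In this thesis, we present an RFE dataset of academic text and two models that, given an academic article, respectively identify and classify the RFEs in it, so as to give learners more intuitive writing assistance. We annotate RFEs in an existing dataset semi-automatically, using an existing set of academic phrases, and study the effect of three annotation strategies. Using this dataset, we train a sequence-labelling model for identifying RFEs in academic writing; the model achieves fairly good results under automatic evaluation metrics. We also show that, in a multi-task setting, using RFE tagging as an auxiliary task effectively improves the accuracy of rhetorical category classification. This work provides a complete workflow design for RFE data annotation, training, and evaluation, and offers baselines for future research.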
Rhetorical function expressions (RFEs) play a crucial role in constructing persuasive arguments, but existing learning resources provide a limited number of examples and often present them without context. We present an RFE-annotated dataset extracted from scholarly texts and a model that identifies RFEs in a given academic article.
Using a set of existing rhetorical phrases, we develop an RFE dataset semi-heuristically and investigate the effectiveness of three annotation strategies. We train and evaluate a sequence-tagging model for identifying RFEs in scholarly writing, which shows promising results under automated evaluation. We also demonstrate that using RFE tagging as an auxiliary task is effective in improving rhetorical category classification in a multi-task setting. Our work offers a comprehensive workflow design for RFE data annotation, training, and evaluation, and provides baselines for future research.
Abstract (Chinese)
Acknowledgements (Chinese)
Abstract
Contents
List of Figures
List of Tables
1 Introduction
2 Related Work
3 Methodology
3.1 The System Pipeline
3.1.1 Annotating Rhetorical Function Expressions
3.1.2 Learning to Identify RFEs
3.1.3 Classification with RFE-Tagging as an Auxiliary Task
3.1.4 Inference
4 Experiments
4.1 Data Labelling and Preprocessing
4.1.1 Datasets
4.1.2 Tagging Rhetorical Function Expressions in Sentences
4.2 Evaluation Metrics
4.3 Experimental Settings
5 Results and Discussion
5.1 Sequence Tagging
5.2 Classification
6 Conclusion
Bibliography
A Complete List of Patterns
A.1 Rhetorical Categories of Patterns
A.2 Patterns
A.2.1 Formulaic Patterns
A.2.2 Agentivity Patterns
A.3 Lexicons
A.3.1 Visualization of Relation Between Tag Types and Sentence Labels
A.3.2 Visualization of Relation Between Tag Types and Sentence Labels
B Human Evaluation Details
C Precision and Recall Scores of Classification Models
