Detailed Record

Author (Chinese): 邱序之
Author (English): Chiu, Hsu-Chih
Title (Chinese): 增強思維鏈提示技術在空間理解的應用:以家具描述檔生成為例
Title (English): Application of Reinforcement Chain of Thought Prompting in Spatial Understanding: A Case Study of Furniture Description File Generation
Advisor (Chinese): 李哲榮
Advisor (English): Lee, Che-Rung
Committee Members (Chinese): 韓永楷、陳柏安
Committee Members (English): Hon, Wing-Kai; Chen, Po-An
Degree: Master's
Institution: National Tsing Hua University
Department: Institute of Information Systems and Applications
Student ID: 111065505
Year of Publication (ROC calendar): 113 (2024)
Graduation Academic Year: 112
Language: English
Number of Pages: 41
Keywords (Chinese): 提示詞工程、增強式思維鏈、家具描述檔生成
Keywords (English): Prompt Engineering; Reinforcement Chain of Thought (R-CoT); Furniture Description File Generation
Abstract: Large language models (LLMs) have shown their strength in solving a wide range of natural language processing tasks. However, owing to their lack of domain-specific knowledge, LLMs often perform poorly in expert-system applications unless guided by suitable prompting techniques. In this paper, we introduce a new prompt engineering technique, called Reinforcement Chain of Thought (R-CoT), which incorporates critical feedback and interactive steps to improve Chain of Thought (CoT) prompting. By leveraging the automation capabilities of LLMs, this technique reduces the need for extensive manual data annotation and can be applied effectively in domain-specific expert systems. To demonstrate its capability, we use R-CoT to generate furniture description files, which can be used to reconstruct 3D models from 2D images. To generate these files accurately, an LLM must fully understand the relationships among furniture parts. Besides furniture description file generation, we also conducted experiments with ChatGPT on the GSM8K math problem-solving dataset and the CSQA commonsense reasoning dataset. Experimental results show that when 30 furniture description files are generated, R-CoT reaches an accuracy of 98.2%, while Auto-CoT achieves only 72.4%, Zero-Shot-CoT 68.1%, and CoT 92.8%. In addition, R-CoT achieves an accuracy of 80.7% on the CSQA dataset and 45.6% on the GSM8K dataset.
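The abstract describes R-CoT as augmenting Chain of Thought prompting with critical feedback and interactive correction steps. Below is a minimal sketch of such a generate-critique-revise prompting loop, included for illustration only: the thesis's actual prompts and procedure are not reproduced in this record, and the chat helper, model choice, prompt wording, "OK" verdict convention, and three-round cap are all assumptions of this sketch.

# A minimal sketch of a generate-critique-revise ("R-CoT"-style) prompting loop.
# Assumptions: the OpenAI Python SDK (v1+) is installed, OPENAI_API_KEY is set,
# and the prompts, model choice, and stopping rule below are illustrative only;
# they are not the thesis's actual protocol.
from openai import OpenAI

client = OpenAI()

def chat(prompt: str) -> str:
    """Send one single-turn prompt and return the model's text reply."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model, not the one used in the thesis
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content or ""

def r_cot(task: str, max_rounds: int = 3) -> str:
    """Answer with CoT, then iteratively critique and revise the reasoning."""
    # Initial Chain-of-Thought answer.
    answer = chat(f"{task}\nLet's think step by step.")
    for _ in range(max_rounds):
        # Critical feedback: ask the model to review its own reasoning.
        critique = chat(
            "Review the reasoning below for errors or violated constraints. "
            "Reply with exactly 'OK' if it is correct; otherwise list the problems.\n\n"
            f"Task: {task}\n\nReasoning:\n{answer}"
        )
        if critique.strip().upper().startswith("OK"):
            break  # the reviewer found no problems; stop iterating
        # Interactive correction: revise the answer using the critique.
        answer = chat(
            f"Task: {task}\n\nPrevious attempt:\n{answer}\n\n"
            f"Feedback:\n{critique}\n\nProduce a corrected answer, step by step."
        )
    return answer

if __name__ == "__main__":
    print(r_cot("List the parts of a four-legged chair and their spatial relations."))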
Chinese Abstract 1
Abstract 2
List of Figures 5
List of Tables 6
Chapter 1 Introduction 7
Chapter 2 Related Work 11
2.1 Chain of Thought (CoT) 11
2.2 Zero-Shot Chain of Thought Prompting 12
2.3 Automatic Chain of Thought (Auto-CoT) Prompting 13
2.4 3D Object Construction and Furniture Description Files 14
Chapter 3 Method 17
3.1 Reinforcement Chain of Thought (R-CoT) Prompting 17
3.2 Furniture Description File Generation System 19
3.2.1 FDF Generator 20
3.2.2 Database 22
3.2.3 Subsystems 23
3.2.3.1 Visualizer 23
3.2.3.2 Palette 25
3.3 Generating Furniture Description Files Using CoT 26
Chapter 4 Experiments 27
4.1 Benchmark Tasks 27
4.2 Furniture Description File Generation 28
4.3 Ablation Study 32
Chapter 5 Conclusion and Future Work 33
References 35
Chapter 6 Appendix 38
6.1 Correction Feedback on the Understanding of Furniture Parts by Large Language Models (Steps 1 & 2) 38
6.2 Correction Feedback on the Overall Annotation of Furniture Description Files by Large Language Models (Steps 3 & 4) 39