
Detailed Record

Author (Chinese): 邱聖雅
Author (English): Chiu, Sheng-Ya
Title (Chinese): 基於湯普森抽樣的個人化間隔重複學習
Title (English): Personalizing Spaced Repetition Learning Using Thompson Sampling
Advisor (Chinese): 吳尚鴻
Advisor (English): Wu, Shan-Hung
Committee Members (Chinese): 彭文志, 黃慶育
Committee Members (English): Peng, Wen-Chih; Huang, Chin-Yu
Degree: Master's
University: National Tsing Hua University
Department: Institute of Information Systems and Applications
Student ID: 111065514
Publication Year (ROC): 113 (2024)
Graduation Academic Year: 112
Language: English
Pages: 33
Keywords (Chinese): 間隔重複, 湯普森抽樣, 個人化學習
Keywords (English): spaced repetition, Thompson sampling, personalized learning
Abstract (Chinese): When different people learn the same new concept, multiple factors, such as prior knowledge and innate aptitude for particular fields, affect their comprehension and thus their learning outcomes. Learning outcomes are usually represented by the recall probability, i.e., the probability of remembering the concept after some time has passed.
Previous work optimized learning outcomes by computing appropriate spaced repetition intervals and estimating each individual's recall probability. Compared with cramming, properly scheduled repeated reviews strengthen memories more effectively. Related studies trained a single, fixed model on data pooled from many different users and estimated individuals' recall probabilities from that model.
However, people differ in learning ability. In this thesis, we propose a personalized method that uses Thompson sampling to estimate each person's recall probability for each concept. Instead of a single fixed model, we exploit the sample efficiency of Thompson sampling to update the weights of the memory model after each recall result, thereby approximating an individual learner's memory strength for each concept. The method schedules reviews of multiple concepts within a limited time to achieve better learning outcomes. We evaluated it in simulation and demonstrated the effectiveness of our algorithm over previous approaches, further confirming that personalization factors influence the learning process.
Abstract: Individuals have varying cognitive strengths. When different people learn the same new concept, factors such as prior knowledge and motivation affect their engagement and comprehension, which in turn influence learning outcomes. Learning outcomes are often represented by the recall probability: the probability of remembering the concept after some time has passed since last encountering it.
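A standard way to formalize this in the spaced-repetition literature (a common modeling assumption, not necessarily the exact memory model used in this thesis) is the exponential forgetting curve, under which recall probability decays with the time elapsed since the last review:

\[
  p(\text{recall} \mid \Delta t) = 2^{-\Delta t / h},
\]

where \(\Delta t\) is the time since the concept was last reviewed and \(h\) is a half-life parameter capturing the learner's memory strength for that concept; each successful review typically increases \(h\).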
Previous work aimed to optimize learning outcomes by computing proper spaced repetition intervals and estimating individuals' recall probabilities. Compared to cramming, properly scheduled repeated reviews of a concept strengthen the memory more efficiently. Earlier research trained a single global model on data from many users and estimated every individual's recall probability from this one-size-fits-all model.
However, people have varying cognitive strengths and may be naturally talented in different fields. In this paper, we propose a personalized method that estimates each individual's recall probability for each concept using Thompson sampling. Instead of a global model, we develop a sample-efficient algorithm that updates the weights of the memory model after each observed recall result, progressively approximating an individual learner's memory strength for each concept. The approach schedules multiple concepts for review within a limited time to achieve better memory retention. We conducted evaluations in simulation and demonstrated the effectiveness of our proposed algorithm compared to previous approaches, further verifying that personalization factors affect each individual's learning.
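To make the scheduling idea concrete, below is a minimal sketch of a Thompson-sampling review scheduler, assuming the exponential forgetting curve above and a bootstrap-style particle approximation of each concept's posterior over its memory half-life. All names (Concept, select_concept, the particle count and time units) are illustrative assumptions, not the thesis's actual algorithm or implementation.

import math
import random

class Concept:
    def __init__(self, name):
        self.name = name
        # Particle approximation of the posterior over the half-life h (hours).
        self.particles = [random.uniform(1.0, 48.0) for _ in range(100)]
        self.last_review = 0.0

    def sample_recall(self, now):
        # Thompson step: sample one plausible half-life from the posterior,
        # then predict recall probability via the exponential forgetting curve.
        h = random.choice(self.particles)
        dt = now - self.last_review
        return 2.0 ** (-dt / h)

    def update(self, now, recalled):
        # Resample particles in proportion to the likelihood of the observed
        # recall outcome, sharpening the posterior around plausible half-lives.
        dt = now - self.last_review
        weights = []
        for h in self.particles:
            p = 2.0 ** (-dt / h)
            weights.append(max(p if recalled else 1.0 - p, 1e-6))
        self.particles = random.choices(self.particles, weights=weights,
                                        k=len(self.particles))
        self.last_review = now

def select_concept(concepts, now):
    # Review the concept whose sampled recall probability is lowest,
    # i.e., the one most at risk of being forgotten under the sampled model.
    return min(concepts, key=lambda c: c.sample_recall(now))

# Example: schedule five reviews across three concepts in simulated time.
concepts = [Concept(n) for n in ("word_a", "word_b", "word_c")]
now = 0.0
for _ in range(5):
    now += 6.0                          # six simulated hours elapse
    chosen = select_concept(concepts, now)
    recalled = random.random() < 0.7    # stand-in for the learner's answer
    chosen.update(now, recalled)
    print(f"t={now:.0f}h: reviewed {chosen.name}, recalled={recalled}")

Sampling from the posterior, rather than using a fixed point estimate, is what lets the scheduler keep exploring concepts whose memory strength is still uncertain while exploiting what it has already learned about the others.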
Contents
Abstract (Chinese) I
Abstract II
Contents III
List of Figures V
List of Tables VI
1 Introduction 1
2 Related work 3
2.1 The forgetting curve and spaced repetition learning 3
2.2 Multi-armed bandit (MAB) 6
3 Method 9
3.1 Algorithm overview 9
3.2 Personalized scheduler 10
3.2.1 Relation to Thompson Sampling 11
4 Experiments 17
4.1 Settings 17
4.1.1 Environment 17
4.1.2 Evaluation metric 19
4.1.3 Baselines 19
4.2 Results 21
4.2.1 Overall performance 21
4.2.2 Timeline analysis 23
4.2.3 Parameter sensitivity 24
5 Conclusion 28
(The full text will be available for external access after 2026/08/06.)