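Author (Chinese): 楊庭量
Author (English): Yang, Ting Liang
Thesis title (Chinese): 使用機器學習技術預測大學生是否續讀同校研究所之研究-以國立清華大學資工系為例
Thesis title (English): Churn Prediction in Undergraduate Students Continuing Their Graduate Study at the Same University-A Case Study of Department of Computer Science, National Tsing Hua University
Advisor (Chinese): 黃婷婷
Advisor (English): Huang, Ting Ting
Oral examination committee (Chinese): 王廷基; 賴尚宏
Oral examination committee (English): Wang, Ting Chi; Lai, Shang Hong
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 102062504
Year of publication (ROC calendar): 105 (2016)
Academic year of graduation: 104
Language: Chinese
Number of pages: 47
Keywords (Chinese): 機器學習; 流失預測; 續讀; 學生續讀; 客戶流失
Keywords (English): machine learning; churn prediction; retention; retention of students; churn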
Leading universities in Taiwan have competed with one another for years, and as higher education becomes increasingly international, that competition has expanded beyond domestic institutions. The most fundamental way for a university to become more competitive is to recruit talented students and build its research strength through long-term training. Retaining outstanding undergraduates for graduate study at the same university therefore helps raise the university's research level and reputation.

This research applies machine learning to predict whether students will continue to graduate school at the same university after completing their undergraduate degree, using graduates of the Department of Computer Science, National Tsing Hua University, as the data source. Using classification methods such as the J48 decision tree, random forest, and support vector machine, we build churn prediction models to identify likely churners and analyze the most important features that influence student churn, providing the university with a reference for planning its future development strategy.
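As a rough illustration of how a decision tree learner can surface "important features", the sketch below ranks categorical features by information gain, the entropy-based measure underlying the gain-ratio split criterion of C4.5 (which Weka's J48 implements). It is not the thesis's code: the feature names `gpa_band` and `did_project`, the label `stayed`, and the six records are all hypothetical and invented purely for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature, target):
    """Entropy reduction obtained by splitting rows on one categorical feature."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r[target])
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Toy records, invented for illustration only (NOT data from the thesis).
students = [
    {"gpa_band": "high", "did_project": "yes", "stayed": "yes"},
    {"gpa_band": "high", "did_project": "yes", "stayed": "yes"},
    {"gpa_band": "high", "did_project": "no", "stayed": "no"},
    {"gpa_band": "low", "did_project": "yes", "stayed": "yes"},
    {"gpa_band": "low", "did_project": "no", "stayed": "no"},
    {"gpa_band": "low", "did_project": "no", "stayed": "no"},
]

# Rank the hypothetical features from most to least informative.
ranking = sorted(
    ["gpa_band", "did_project"],
    key=lambda f: information_gain(students, f, "stayed"),
    reverse=True,
)
print(ranking)
```

In this toy set, `did_project` separates the two classes perfectly, so it ranks first. A full C4.5-style learner would additionally normalize the gain by the split's own entropy (gain ratio) and recurse on each subset.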
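1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Research Process
1.4 Thesis Organization
2 Literature Review
2.1 Customer Churn Management
2.2 Machine Learning
2.3 Applications of Machine Learning Methods in Other Industries
3 Analysis Techniques for Student Churn
3.1 Data Collection and Preliminary Analysis
3.1.1 Data Sources
3.1.2 Data Aggregation and Processing in Compliance with the Personal Data Protection Act
3.1.3 Obtaining Labels for the Training Set
3.1.4 Full List of Data Items
3.1.5 Preliminary Data Processing
3.2 Feature Selection
3.3 Applying Machine Learning Classification Techniques
4 Experimental Results
4.1 Analysis of Important Features
4.2 Prediction Model Results
5 Conclusion and Future Work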