Author: Chang, Yun-Chen (張芸禎)
Title: Employee Attrition Prediction Using Random Forest for Imbalanced Dataset
Title (Chinese): 使用隨機森林預測不平衡資料集的員工流失
Advisor: Sheu, Jang-Ping (許健平)
Committee members: Hong, Yao-Win (洪樂文); Chen, Yuh-Shyan (陳裕賢)
Degree: Master's
Institution: National Tsing Hua University (國立清華大學)
Department: Department of Computer Science (資訊工程學系)
Student ID: 106062502
Year of publication: 2019 (ROC year 108)
Academic year of graduation: 108
Language: English
Number of pages: 27
Keywords: Employee Attrition Prediction; Machine Learning; k-Nearest Neighbor; Decision Tree; Random Forest; Support Vector Machine
Employee attrition is a significant problem for companies. The labor market is competitive, and the required skills are in high demand. When an employee leaves, the company loses that person's productivity, and the productivity of colleagues suffers as well. Predicting employee turnover helps the human resources department take action to strengthen internal policies. In this thesis, we use the Random Forest model to predict employee attrition. We use the Adaptive Synthetic Sampling Approach (ADASYN) to address the class imbalance problem, in which one class has far fewer samples than the other. We use two datasets and three experiments to verify the reliability of the proposed model. The first experiment shows that Random Forest outperforms the other classification methods. The second experiment indicates that ADASYN sampling can significantly improve the performance of the prediction model. The third experiment shows the effect of different oversampling ratios on model performance. Together, the experimental results validate that the proposed model can accurately predict employee attrition.
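To make the oversampling step concrete, the following is a minimal sketch of the ADASYN procedure described by He et al. (2008), written with the Python standard library only. It is not the thesis author's implementation. The function names `k_nearest` and `adasyn`, the toy data, and the choice to interpolate only toward minority-class neighbors are assumptions made for illustration. The parameter `beta` plays the role of the oversampling ratio mentioned in the abstract.

```python
import math
import random


def k_nearest(point, candidates, k):
    """Return the k candidates closest to `point` (Euclidean distance)."""
    return sorted(candidates, key=lambda c: math.dist(point, c))[:k]


def adasyn(minority, majority, beta=1.0, k=5, seed=0):
    """Generate synthetic minority samples, favoring hard-to-learn points.

    beta is the oversampling ratio; 1.0 aims for a fully balanced set.
    """
    rng = random.Random(seed)
    # Total number of synthetic samples to create (G in the paper).
    total = round((len(majority) - len(minority)) * beta)
    if total <= 0 or len(minority) < 2:
        return []

    # r_i: share of majority samples among the k nearest neighbors of x_i.
    labeled = [(p, 0) for p in majority] + [(p, 1) for p in minority]
    ratios = []
    for x in minority:
        others = [item for item in labeled if item[0] is not x]
        neighbors = sorted(others, key=lambda item: math.dist(x, item[0]))[:k]
        n_majority = sum(1 for _, label in neighbors if label == 0)
        ratios.append(n_majority / len(neighbors))

    # Normalize the ratios into a distribution over minority points.
    ratio_sum = sum(ratios)
    if ratio_sum == 0:
        weights = [1 / len(minority)] * len(minority)  # no hard points
    else:
        weights = [r / ratio_sum for r in ratios]

    synthetic = []
    for x, w in zip(minority, weights):
        # Assumption: interpolate toward minority-class neighbors only.
        peers = [p for p in minority if p is not x]
        near = k_nearest(x, peers, k)
        for _ in range(round(w * total)):
            z = rng.choice(near)
            lam = rng.random()  # interpolation factor in [0, 1)
            synthetic.append(tuple(a + lam * (b - a) for a, b in zip(x, z)))
    return synthetic


# Toy data: one minority cluster sits inside the majority grid (hard to
# learn), the other sits far away from it (easy to learn).
majority = [(float(i), float(j)) for i in range(5) for j in range(4)]
hard = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
easy = [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0)]
synthetic = adasyn(hard + easy, majority, beta=1.0, k=3)
print(len(synthetic), "synthetic minority samples generated")
```

In a pipeline like the one the abstract outlines, the synthetic samples would be appended to the training split only, and the classifier would then be fitted on the rebalanced data. Because per-point counts are rounded, the number of generated samples can differ slightly from the target for some values of `beta`.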
Table of contents:
Abstract
List of Contents
I. Introduction
II. Related Works
III. Employee Attrition Prediction Model
IV. Experiment and Performance Analysis
V. Conclusion
References
(The full text has not been authorized for public release.)