
Detailed Record

Author (Chinese): 王怡文
Author (English): Wang, Yi-Wen
Title (Chinese): 結合特徵選擇與集成學習模型於預測問題之研究
Title (English): Combining Feature Selection and Ensemble Learning Models to Prediction Problems
Advisor (Chinese): 蘇朝墩
Advisor (English): Su, Chao-Ton
Committee Members (Chinese): 蕭宇翔, 許俊欽
Committee Members (English): Hsiao, Yu-Hsiang; Hsu, Chun-Chin
Degree: Master's
University: National Tsing Hua University
Department: Department of Industrial Engineering and Engineering Management
Student ID: 110034526
Publication Year (ROC calendar): 112 (2023)
Graduation Academic Year: 111
Language: Chinese
Number of Pages: 52
Keywords (Chinese): 特徵選擇, 正則化, 集成學習, 預測
Keywords (English): Feature Selection, Regularization, Ensemble Learning, Prediction
Feature selection plays a key and indispensable role in prediction problems: selecting an appropriate subset of features can improve prediction accuracy while reducing model complexity and training time. Ensemble learning models are likewise widely recognized; by combining the predictions of multiple base models, they improve accuracy, reduce the risk of overfitting, and generalize well. This study investigates combining feature selection with ensemble learning models for prediction problems. Three common regularization models (LASSO, Ridge, Elastic Net) and six ensemble learning models (Random Forest, Extra Trees, AdaBoost, Gradient Boosting, XGBoost, CatBoost) are compared and evaluated, and validated on three datasets from different domains, to identify the combination best suited to prediction problems. The experimental results show that LASSO combined with CatBoost is the best pairing of a regularization method and an ensemble learning model: it selects features effectively and delivers accurate predictions while lowering model complexity and training cost, making the model more practical. This study therefore provides an effective and efficient approach to prediction problems, and the combination can be further optimized and applied to a wider range of fields in the future.
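To make the workflow described in the abstract concrete, the following is a minimal sketch of a LASSO-plus-CatBoost pipeline, assuming scikit-learn and the catboost package are available. The synthetic dataset, the hold-out split, and all hyperparameter values are illustrative assumptions, not the settings or data used in the thesis.

```python
# Minimal sketch of LASSO-based feature selection followed by CatBoost prediction.
# Illustrative only: a synthetic regression dataset stands in for the thesis
# datasets (energy efficiency, body fat, vehicle emissions).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, r2_score
from catboost import CatBoostRegressor  # pip install catboost

# Synthetic data: 20 features, only 8 of which are informative.
X, y = make_regression(n_samples=500, n_features=20, n_informative=8,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)

# Normalize features, then fit cross-validated LASSO; features whose
# coefficients are shrunk to exactly zero are dropped.
scaler = StandardScaler().fit(X_train)
lasso = LassoCV(cv=5, random_state=0).fit(scaler.transform(X_train), y_train)
selected = np.flatnonzero(lasso.coef_ != 0)
print(f"LASSO kept {selected.size} of {X.shape[1]} features")

# Train CatBoost on the selected features only and evaluate on the hold-out set.
model = CatBoostRegressor(iterations=500, depth=6, learning_rate=0.1,
                          random_seed=0, verbose=0)
model.fit(X_train[:, selected], y_train)
pred = model.predict(X_test[:, selected])
print(f"RMSE: {mean_squared_error(y_test, pred) ** 0.5:.3f}")
print(f"R^2 : {r2_score(y_test, pred):.3f}")
```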
Abstract (Chinese) I
Abstract (English) II
Acknowledgements III
Table of Contents IV
List of Figures VI
List of Tables VII
Chapter 1 Introduction 1
1.1 Research Background and Motivation 1
1.2 Research Objectives 2
1.3 Research Framework 3
Chapter 2 Literature Review 5
2.1 Feature Selection 5
2.1.1 Filter Methods 6
2.1.2 Wrapper Methods 7
2.1.3 Embedded Methods 8
2.1.4 Regularization Models 8
2.2 Ensemble Learning 10
2.2.1 Bagging 10
2.2.2 Boosting 11
2.2.3 Stacking 12
2.3 Summary of Related Literature 13
Chapter 3 Research Methodology 15
3.1 Research Procedure 15
3.2 Data Normalization 16
3.3 Regularization Models 17
3.3.1 LassoCV 17
3.3.2 RidgeCV 18
3.3.3 ElasticNetCV 18
3.4 Ensemble Learning Models 19
3.4.1 Random Forest 19
3.4.2 Extra Trees 20
3.4.3 AdaBoost 20
3.4.4 Gradient Boosting 21
3.4.5 XGBoost 23
3.4.6 CatBoost 24
3.5 Cross-Validation 26
3.6 Evaluation Metrics 27
Chapter 4 Experimental Results 29
4.1 Experimental Environment 29
4.2 Datasets 29
4.2.1 Energy Efficiency 30
4.2.2 Body Fat Prediction 31
4.2.3 Vehicle Emissions 32
4.3 Experimental Results 33
4.3.1 Feature Selection 33
4.3.2 Model Prediction Results 38
Chapter 5 Conclusions and Recommendations 44
5.1 Conclusions 44
5.2 Future Research Directions 45
References 46