Author (English): Peng, Chieh-Ying
Thesis title (English): A Comparative Study of Utilizing Feature Selection in Medical Classification Problems
Advisor (English): Su, Chao-Ton
Keywords (English): Machine Learning; Feature Selection; Filter Method; Wrapper Method; Embedded Method; Hybrid Feature Selection; Mental Disorders; Breast Cancer
In the era of big data, handling large datasets effectively is crucial. Machine learning and AI offer solutions, but redundant features can degrade model performance. Feature selection, whether by filter, wrapper, embedded, or hybrid methods, aims to improve model efficiency and interpretability by retaining only the most informative subset of features. The right method depends on the characteristics of the data, the requirements of the model, and the available computational resources. Proper feature selection simplifies models, improves predictive accuracy, and supports decision-making.
This study investigates the impact of feature selection on the performance of classification models. The methods compared are Pearson correlation, recursive feature elimination, Lasso, Ridge, ElasticNet, and a hybrid approach. On the mental disorder and breast cancer datasets, ElasticNet emerged as the best feature selection method, significantly improving model performance. Notably, ElasticNet combined with a backpropagation neural network (BPNN) achieved outstanding accuracy and F1 scores. The findings highlight the importance of feature selection in model training and point to further research on how well these methods generalize across domains.
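To make the filter approach concrete, the following is a minimal sketch of Pearson-correlation feature selection using only the Python standard library. It is an illustration, not the thesis's implementation: the function names (`pearson`, `select_by_correlation`), the toy data, the feature names, and the 0.5 threshold are all assumptions introduced here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_by_correlation(rows, labels, names, threshold=0.5):
    """Keep features whose absolute correlation with the label meets the threshold.

    The threshold is a hypothetical cut-off chosen for this example.
    """
    scores = {}
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        scores[name] = pearson(column, labels)
    selected = [name for name in names if abs(scores[name]) >= threshold]
    return selected, scores


# Toy data (invented for illustration): two informative features, one uninformative.
names = ["f_signal", "f_inverse", "f_noise"]
rows = [
    [1.0, 9.0, 3.0],
    [2.0, 8.0, 1.0],
    [3.0, 7.0, 4.0],
    [4.0, 6.0, 1.0],
    [5.0, 5.0, 5.0],
    [6.0, 4.0, 2.0],
]
labels = [0, 0, 0, 1, 1, 1]

selected, scores = select_by_correlation(rows, labels, names, threshold=0.5)
print(selected)  # ['f_signal', 'f_inverse']
```

A filter of this kind scores each feature independently of any classifier, which makes it cheap but blind to feature interactions. Wrapper and embedded methods such as recursive feature elimination or ElasticNet instead tie selection to a fitted model.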
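Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Objectives
1.3 Research Structure
Chapter 2 Literature Review
2.1 Feature Selection
2.1.1 Filter Method
2.1.2 Wrapper Method
2.1.3 Embedded Method
2.1.4 Hybrid Feature Selection
2.2 Classification Models
Chapter 3 Methodology
3.1 Research Procedure
3.2 Feature Selection
3.2.1 Filter Method
3.2.2 Wrapper Method
3.2.3 Hybrid Feature Selection
3.2.4 Embedded Method: Lasso, Ridge, ElasticNet
3.3 Classification Models
3.3.1 Adaptive Boosting
3.3.2 Extreme Gradient Boosting
3.3.3 K-Nearest Neighbors
3.3.4 Support Vector Machine
3.3.5 Backpropagation Neural Network
3.4 Model Evaluation Metrics
Chapter 4 Case Study
4.1 Datasets
4.1.1 Mental Disorders
4.1.2 Breast Cancer
4.2 Experimental Results
4.2.1 Feature Selection
4.2.2 Model Prediction Results
4.3 Discussion of Results
Chapter 5 Conclusions and Recommendations
5.1 Summary
5.2 Future Research Directions
References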

