Detailed Record

Author (Chinese): 張家豪
Author (English): Chang, Chia Hao
Title (Chinese): XGBoost 在高維度模型的研究
Title (English): A Study of XGBoost in High-Dimensional Model
Advisor (Chinese): 銀慶剛
Advisor (English): Ing, Ching-Kang
Committee Members (Chinese): 鄭又仁、黃信誠、俞淑惠
Committee Members (English): Cheng, Yu-Jen; Huang, Hsin-Cheng; Yu, Shu-Hui
Degree: Master
University: National Tsing Hua University
Department: Institute of Statistics
Student ID: 106024518
Publication Year (ROC era): 108 (2019)
Academic Year of Graduation: 107
Language: English
Number of Pages: 38
Keywords (Chinese): 極限梯度提升、高維度訊息準則、模型選擇
Keywords (English): extreme gradient boosting, high-dimensional information criterion, model selection
Abstract (Chinese):
With the rapid development of machine learning methods, more and more semiconductor companies have adopted such algorithms for product yield analysis. Among a large number of machines, often only a few cause quality degradation, and to reduce downtime and repair costs, the problematic machines must be identified quickly from a small number of samples. Machine learning methods perform very well in prediction, but their use for model selection remains at a preliminary stage. This motivates us to develop a model selection algorithm for this family of machine learning methods, and we use both simulated and real data to show that the proposed algorithm performs well on the model selection problem. We use a method based on eXtreme Gradient Boosting and, drawing on the ideas of Ing and Lai (2011), construct a model selection algorithm. We first use extreme gradient boosting to screen the model, rank the variables by the SHAP values (SHapley Additive exPlanations) proposed by Lundberg and Lee (2017), and then apply the high-dimensional information criterion to eliminate false-positive variables. Simulation results show that our method performs well not only when the model is correctly specified but also when it is misspecified. An application to real wafer yield data further demonstrates the practicality of our method.
Abstract (English):
With the vigorous development of machine learning methods, many semiconductor companies have introduced such algorithms for product yield analysis. Among a large number of machines, often only a few cause quality degradation; to reduce downtime and maintenance costs, the problematic machines must be identified efficiently from a small number of samples. Although machine learning methods excel at prediction, their use for model selection is still at a preliminary stage. We therefore focus on a model selection algorithm for this family of machine learning methods and use both simulated and real-world data to show that the proposed procedure performs well. We use eXtreme Gradient Boosting (XGBoost) and build a model selection algorithm based on the ideas of Ing and Lai (2011). We first use XGBoost to select the model, rank the variables by their SHAP values (SHapley Additive exPlanations) [Lundberg and Lee (2017)], and then apply the high-dimensional information criterion to eliminate false-positive variables. Simulation results show that our method not only performs well under a correctly specified model but also maintains this performance under model misspecification. An application to wafer yield data further shows that our method is practical.
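The abstract describes a three-step procedure: screen variables with XGBoost, rank them by SHAP values, and then prune false positives with a high-dimensional information criterion. The sketch below illustrates one way such a pipeline could be wired together in Python, assuming a binary 0/1 response, the xgboost, shap, and scikit-learn packages, and an HDIC-type penalty of the form deviance + k * c * log(n) * log(p); the helper name, the logistic refit, and the tuning constant c are illustrative assumptions, and the exact criterion and stopping rule used in the thesis may differ.

```python
import numpy as np
import xgboost as xgb
import shap
from sklearn.linear_model import LogisticRegression


def xgb_shap_hdic_select(X, y, c=2.0, max_size=20):
    """Screen with XGBoost, rank by SHAP, then trim with an HDIC-type criterion (sketch)."""
    n, p = X.shape

    # Step 1: fit XGBoost on all candidate variables.
    booster = xgb.XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
    booster.fit(X, y)

    # Step 2: rank variables by mean absolute SHAP value.
    # For a binary XGBoost model, TreeExplainer returns an (n_samples, n_features) array.
    shap_values = shap.TreeExplainer(booster).shap_values(X)
    ranking = np.argsort(-np.abs(shap_values).mean(axis=0))

    # Step 3: walk along the SHAP ranking, refit a logistic model on the top-k
    # variables, and keep the subset minimizing
    #   HDIC(k) = deviance_k + k * c * log(n) * log(p),
    # an HDIC-type penalty in the spirit of Ing and Lai (2011).
    best_subset, best_hdic = [], np.inf
    for k in range(1, min(max_size, p) + 1):
        subset = ranking[:k]
        fit = LogisticRegression(max_iter=1000).fit(X[:, subset], y)
        prob = np.clip(fit.predict_proba(X[:, subset])[:, 1], 1e-12, 1 - 1e-12)
        deviance = -2.0 * np.sum(y * np.log(prob) + (1 - y) * np.log(1 - prob))
        hdic = deviance + k * c * np.log(n) * np.log(p)
        if hdic < best_hdic:
            best_subset, best_hdic = list(subset), hdic
    return best_subset
```

For an n-by-p NumPy design matrix X and a 0/1 response y, xgb_shap_hdic_select(X, y) returns the indices of the retained variables; increasing c makes the criterion more conservative about false positives.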
Contents
List of Figures iv
List of Tables iv
1 Introduction 1
2 Literature Reviews 2
2.1 Lasso . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2.2 Tree-Based Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2.1 Classification and Regression Trees . . . . . . . . . . . . . . . 3
2.2.2 Random Forests . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2.3 Gradient Tree Boosting . . . . . . . . . . . . . . . . . . . . . . 6
2.2.4 eXtreme Gradient Boosting (XGBoost) . . . . . . . . . . . . . 7
2.3 Chebyshev Greedy Algorithms (CGA) . . . . . . . . . . . . . . . . . . 9
3 Model Selection Procedure 10
4 Simulation Studies 15
4.1 Simple Logistic Regression Models . . . . . . . . . . . . . . . . . . . 17
4.2 Model Misspecification . . . . . . . . . . . . . . . . . . . . . . . . . . 20
5 Real Data Analysis 24
References 33
Appendix A 35
Appendix B 36
References

Amaldi, E. and Kann, V. (1998). On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems. Theoretical Computer Science, 209(1), 237–260.
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984). Classification and regression trees. Wadsworth Advanced Books and Software, Belmont, CA.
Chen, T. and Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794.
Chen, T. and He, T. (2015). Higgs boson discovery with boosted trees. NIPS Workshop on High-Energy Physics and Machine Learning, 69–80.
Chen, Y.-L., Dai, C.-S. and Ing, C.-K. (2019). High-dimensional model selection via Chebyshev greedy algorithms. Working paper.
Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 1348–1360.
Hastie, T., Tibshirani, R. and Friedman, J. H. (2009). The elements of statistical learning, 2nd ed. Springer, New York.
Ing, C.-K. and Lai, T. L. (2011). A stepwise regression method and consistent model selection for high-dimensional sparse linear models. Statistica Sinica, 21, 1473–1513.
Lin, S. C. (2018). High-dimensional location-dispersion models with application to root cause analysis in wafer fabrication processes. Master's thesis, Institute of Statistics, National Tsing Hua University, Hsinchu. Retrieved from https://hdl.handle.net/11296/zh226d
Liu, B., Wei, Y., Zhang, Y. and Yang, Q. (2017). Deep neural networks for high dimension, low sample size data. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2287–2293.
Lundberg, S. M. and Lee, S.-I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 4768–4777.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B (Methodological), 267–288.
Xu, Z., Huang, G., Weinberger, K. Q. and Zheng, A. X. (2014). Gradient boosted feature selection. Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 522–531.
Yamada, M., Jitkrittum, W., Sigal, L., Xing, E. P. and Sugiyama, M. (2014). High-dimensional feature selection by feature-wise kernelized lasso. Neural Computation, 26(1), 185–207.
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1417–1429.