
Detailed Record

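Author (Chinese): 金賢達
Author (English): Jin, Xian-Da
Thesis title (Chinese): 二元出項效用最大化預測之最佳子集選擇:一個對德國銀行貸款業務數據的應用
Thesis title (English): Best Subset Utility-maximizing Prediction of Binary Outcome: An application to data from a German bank's loan operations
Advisors (Chinese): 陳樂昱; 冼芻蕘
Advisors (English): Chen, Le-Yu; Sin, Chor-Yiu
Oral examination committee (Chinese): 蘇俊華; 楊睿中; 廖仁哲
Oral examination committee (English): Su, Jiun-Hua; Yang, Jui-Chung; Liao, Jen-Che
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Economics
Student ID: 107072470
Year of publication (ROC calendar): 111 (2022)
Academic year of graduation: 110
Language: English
Number of pages: 41
Keywords (Chinese): 最優子集選擇; 二元選擇; l_0 EUM; 最高加權分數; 混合整數優化
Keywords (English): Best subset selection; Binary choice; l_0-norm empirical utility maximization (l_0-EUM); Maximum weighted score; Mixed integer optimization (MIO)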
We study a variable selection problem for the utility-maximizing prediction of binary outcomes. Building on the framework of Chen and Lee (2018), we construct l_0-norm constrained maximum weighted score (utility-maximizing) prediction rules. In contrast to the conventional approach, which maximizes prediction accuracy, our procedure is designed to maximize the lender's utility and can be implemented through mixed integer optimization. We then derive theoretical non-asymptotic upper bounds on the risk that hold when the number of observations is large. Empirically, applying our approach to a German banking dataset, we show that it attains higher revenue (utility). Our results compare favorably with those of Lieli and White (2010), who do not consider the variable selection problem.
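To make the idea of an l_0-constrained maximum weighted score rule concrete, the following is a minimal sketch and not the thesis's implementation. It swaps mixed integer optimization for exhaustive search over a small coefficient grid, which is feasible only for toy problems. The function names, the grids, the always-included intercept in column 0, and the weights `w` (standing in for the lender's utility from each correct decision) are all assumptions made for illustration; this record does not state how the thesis defines them.

```python
from itertools import combinations, product


def weighted_score(y, X, beta, w):
    """Average weight earned by the rule: predict 1 if x'beta >= 0, else 0."""
    total = 0.0
    for yi, xi, wi in zip(y, X, w):
        index = sum(b * v for b, v in zip(beta, xi))
        pred = 1 if index >= 0 else 0
        if pred == yi:  # a correct decision earns its (hypothetical) weight
            total += wi
    return total / len(y)


def best_subset_rule(y, X, w, q,
                     slope_grid=(-1.0, -0.5, 0.5, 1.0),
                     intercept_grid=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    """Brute-force stand-in for MIO (assumption, for illustration only).

    Column 0 of X is treated as an intercept that is always included.
    At most q of the remaining columns may carry a nonzero coefficient,
    which is the l_0 constraint.
    """
    p = len(X[0])
    best_beta, best_val = None, float("-inf")
    for k in range(q + 1):
        for subset in combinations(range(1, p), k):
            for b0 in intercept_grid:
                for coefs in product(slope_grid, repeat=k):
                    beta = [0.0] * p
                    beta[0] = b0
                    for j, c in zip(subset, coefs):
                        beta[j] = c
                    val = weighted_score(y, X, beta, w)
                    if val > best_val:  # keep the first maximizer found
                        best_beta, best_val = beta, val
    return best_beta, best_val


if __name__ == "__main__":
    # Toy data: only the covariate in column 1 separates the outcomes.
    X = [[1, -2.0, 0.3], [1, -1.0, -0.7], [1, 1.0, 0.5], [1, 2.0, -0.2]]
    y = [0, 0, 1, 1]
    w = [1.0, 1.0, 2.0, 2.0]
    print(best_subset_rule(y, X, w, q=1))
```

On the toy data, with `q=1`, the search keeps only the covariate in column 1 and leaves the other slope at zero. Enumeration grows combinatorially in the number of covariates, which is why an exact solver such as a mixed integer optimizer is needed at realistic scale.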
1 Introduction
2 Literature Review
2.1 Maximum score estimator
2.2 MIO algorithm
2.3 Penalized generalized linear model (PGLM)
2.4 Penalized support vector machine (PSVM)
3 Best subset utility-maximizing binary prediction rule
3.1 Estimation model
3.2 Theoretical properties of the prediction rule
4 Implementation via mixed integer optimization
4.1 An algorithm to deal with the maximization problem
4.2 The branch-and-bound method
5 Application to real data
5.1 The data set
5.2 Specifications
5.3 Data-based calculations
5.4 Estimation results
6 Conclusion
Appendix
A. Proof of Theorem 3.1
B. The empirical results of a particular repetition
References

Bartik, T.J., Butler, J.S., & Liu, J.T. (1992). Maximum score estimates of the determinants of residential mobility: implications for the value of residential attachment and neighborhood amenities. Journal of Urban Economics, 32(2), 233-256.
Bertsimas, D., King, A., & Mazumder, R. (2016). Best subset selection via a modern optimization lens. The Annals of Statistics, 44(2), 813-852.
Blundell, R., Fry, V., & Walker, I. (1988). Modelling the take-up of means-tested benefits: the case of housing benefits in the United Kingdom. The Economic Journal, 98(390), 58-74.
Caudill, S.B. (2003). Predicting discrete outcomes with the maximum score estimator: The case of the NCAA men’s basketball tournament. International Journal of Forecasting, 19(2), 313-317.
Chen, L.Y., & Lee, S. (2018). Best subset binary prediction. Journal of Econometrics, 206(1), 39-56.
Chen, L.Y., & Lee, S. (2018). Exact computation of GMM estimators for instrumental variable quantile regression models. Journal of Applied Econometrics, 33(4), 553-567.
Chen, L.Y., & Lee, S. (2021). Binary classification with covariate selection through ℓ0-penalised empirical risk minimisation. The Econometrics Journal, 24(1), 103-120.
Conforti, M., Cornuéjols, G., & Zambelli, G. (2014). Integer Programming. Berlin: Springer.
Cover, T.M. (1965). Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Transactions on Electronic Computers, EC-14(3), 326-334.
Danilov, D., & Magnus, J.R. (2004). On the harm that ignoring pretesting can cause. Journal of Econometrics, 122(1), 27-46.
Das, S. (1991). A semiparametric structural analysis of the idling of cement kilns. Journal of Econometrics, 50(3), 235-256.
Elliott, G., & Lieli, R.P. (2013). Predicting binary outcomes. Journal of Econometrics, 174(1), 15-26.
Florios, K., & Skouras, S. (2008). Exact computation of max weighted score estimators. Journal of Econometrics, 146(1), 86-91.
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1-22.
Fung, G.M., & Mangasarian, O.L. (2004). A feature selection Newton method for support vector machine classification. Computational Optimization and Applications, 28(2), 185-202.
Granger, C.W.J., & Machina, M.J. (2006). Forecasting and decision theory. Handbook of Economic Forecasting, 1, 81-98.
Greenshtein, E. (2006). Best subset selection, persistence in high-dimensional statistical learning and optimization under l1 constraint. The Annals of Statistics, 34(5), 2367-2386.
Horowitz, J.L. (1993). Semiparametric estimation of a work-trip mode choice model. Journal of Econometrics, 58(1-2), 49-70.
Kitagawa, T., & Tetenov, A. (2018). Who should be treated? Empirical welfare maximization methods for treatment choice. Econometrica, 86(2), 591-616.
Kosorok, M.R. (2008). Introduction to Empirical Processes and Semiparametric Inference. New York: Springer.
Lee, S., Liao, Y., Seo, M.H., & Shin, Y. (2021). Sparse HP filter: finding kinks in the COVID-19 contact rate. Journal of Econometrics, 220(1), 158-180.
Lee, S., Liao, Y., Seo, M.H., & Shin, Y. (2021). Factor-driven two-regime regression. The Annals of Statistics, 49(3), 1656-1678.
Li, C.Z. (1996). Semiparametric estimation of the binary choice model for contingent valuation. Land Economics, 72(4), 462-473.
Lieli, R.P., & White, H. (2010). The construction of empirical credit scoring rules based on maximization principles. Journal of Econometrics, 157(1), 110-119.
Little, J.D., Murty, K.G., Sweeney, D.W., & Karel, C. (1963). An algorithm for the traveling salesman problem. Operations Research, 11(6), 972-989.
Magnus, J.R., & Durbin, J. (1999). Estimation of regression coefficients of interest when other regression coefficients are of no interest. Econometrica, 67(3), 639-643.
Manski, C.F. (1975). Maximum score estimation of the stochastic utility model of choice. Journal of Econometrics, 3(3), 205-228.
Manski, C.F. (1985). Semiparametric analysis of discrete response: Asymptotic properties of the maximum score estimator. Journal of Econometrics, 27(3), 313-333.
McCullagh, P., & Nelder, J. (1989). Generalized Linear Models, Second Edition. Boca Raton: Chapman and Hall/CRC.
Nelder, J.A., & Wedderburn, R.W. (1972). Generalized linear models. Journal of the Royal Statistical Society: Series A (General), 135(3), 370-384.
Shin, Y., & Todorov, Z. (2021). Exact computation of maximum rank correlation estimator. The Econometrics Journal, 24(3), 589-607.
Simon, N., Friedman, J., Hastie, T., & Tibshirani, R. (2011). Regularization paths for Cox’s proportional hazards model via coordinate descent. Journal of Statistical Software, 39(5), 1-13.
Simon, N., Friedman, J., & Hastie, T. (2013). A blockwise descent algorithm for group-penalized multiresponse and multinomial regression. arXiv:1311.6529.
Smith, F.W. (1968). Pattern classifier design by linear programming. IEEE Transactions on Computers, C-17(4), 367-372.
Su, J.H. (2021). Model selection in utility-maximizing binary prediction. Journal of Econometrics, 223(1), 96-124.
Talagrand, M. (1994). Sharper bounds for Gaussian and empirical processes. The Annals of Probability, 22(1), 28-76.
Tibshirani, R., Bien, J., Friedman, J., Hastie, T., Simon, N., Taylor, J., & Tibshirani, R.J. (2012). Strong rules for discarding predictors in lasso-type problems. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 74(2), 245-266.
Vapnik, V.N., & Chervonenkis, A.Y. (1971). On uniform convergence of the frequencies of events to their probabilities. Theory of Probability and Its Applications, 16(2), 264-280.
 
 
 
 