|
Acock, A., & Stavig, G. (1979). A Measure of Association for Nonparametric Statistics. Social Forces, 57(4), 1381-1386. doi:10.2307/2577276
Aggarwal, C. C., & Reddy, C. K. (Eds.). (2013). Data clustering: Algorithms and applications. Chapman and Hall/CRC.
Albert, A., & Anderson, J. A. (1984). On the existence of maximum likelihood estimates in logistic regression models. Biometrika, 71(1), 1-10.
Breiman, L. (2003). Manual–setting up, using, and understanding random forests V4.0.
Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and regression trees. Belmont, CA: Wadsworth International Group.
Bustamante, C. D., Francisco, M., & Burchard, E. G. (2011). Genomics for the world. Nature, 475(7355), 163-165.
Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321-357.
Chawla, N. V., Japkowicz, N., & Kotcz, A. (2004). Editorial: Special issue on learning from imbalanced data sets. ACM SIGKDD Explorations Newsletter, 6(1), 1-6.
Drummond, C., & Holte, R. C. (2000). Exploiting the cost (in)sensitivity of decision tree splitting criteria. In ICML (Vol. 1, No. 1).
Flach, P. A. (2003). The geometry of ROC space: Understanding machine learning metrics through ROC isometrics. In ICML (pp. 194-201).
Galimberti, G., & Soffritti, G. (2011). Tree-based methods and decision trees. Modern Analysis of Customer Surveys: With Applications Using R, 283-307.
He, H., & Garcia, E. A. (2009). Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21(9), 1263-1284.
Hoens, T., & Chawla, N. (2010). Generating diverse ensembles to counter the problem of class imbalance. Advances in Knowledge Discovery and Data Mining, 488-499.
Hosmer Jr, D. W., Lemeshow, S., & Sturdivant, R. X. (2013). Applied logistic regression (Vol. 398). John Wiley & Sons.
Hothorn, T., Hornik, K., & Zeileis, A. (2006). Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics, 15(3), 651-674.
Hothorn, T., Hornik, K., Strobl, C., & Zeileis, A. (2015). Package ‘party’. Package Reference Manual for Party Version 0.9-998, 16, 37.
Japkowicz, N. (2001). Concept-learning in the presence of between-class and within-class imbalances. In Conference of the Canadian Society for Computational Studies of Intelligence (pp. 67-77). Springer Berlin Heidelberg.
Kass, G. V. (1980). An exploratory technique for investigating large quantities of categorical data. Applied Statistics, 119-127.
Kiritchenko, S., & Matwin, S. (2011). Email classification with co-training. In Proceedings of the 2011 Conference of the Center for Advanced Studies on Collaborative Research (pp. 301-312). IBM Corp.
Kumar, M., & Sheshadri, H. (2012). On the classification of imbalanced datasets. International Journal of Computer Applications, 44.
Liaw, A., & Wiener, M. (2002). Classification and regression by randomForest. R News, 2(3), 18-22.
Lin, M., Lucas Jr, H. C., & Shmueli, G. (2013). Research commentary–too big to fail: Large samples and the p-value problem. Information Systems Research, 24(4), 906-917.
Meyer, D., Zeileis, A., Hornik, K., Meyer, M. D., & KernSmooth, S. (2007). The vcd package. Retrieved October, 3, 2007.
Nguyen, G. H., Bouzerdoum, A., & Phung, S. L. (2009). Learning pattern classification tasks with imbalanced data sets. In Pattern recognition. InTech.
Phua, C., Alahakoon, D., & Lee, V. (2004). Minority report in fraud detection: Classification of skewed data. ACM SIGKDD Explorations Newsletter, 6(1), 50-59.
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1, 81-106.
Quinlan, J. R. (1993). C4.5: Programs for machine learning. Morgan Kaufmann.
Rokach, L., & Maimon, O. (2005). Decision trees. Data Mining and Knowledge Discovery Handbook, 165-192.
Shmueli, G. (2010). To explain or to predict? Statistical Science, 289-310.
Shmueli, G., Patel, N. R., & Bruce, P. C. (2016). Data mining for business analytics: Concepts, techniques, and applications with XLMiner. John Wiley & Sons.
Sun, Y., Wong, A. K., & Kamel, M. S. (2009). Classification of imbalanced data: A review. International Journal of Pattern Recognition and Artificial Intelligence, 23(04), 687-719.
Therneau, T. M., Atkinson, B., & Ripley, B. (2010). rpart: Recursive partitioning. R package version, 3, 1-46.
Weiss, G. M. (2004). Mining with rarity: A unifying framework. ACM SIGKDD Explorations Newsletter, 6(1), 7-19.
Weiss, G. M. (2009). Mining with rare cases. In Data Mining and Knowledge Discovery Handbook (pp. 747-757). Springer US.
Yahav, I., Shmueli, G., & Mani, D. (2016). A tree-based approach for addressing self-selection in impact studies with big data. MIS Quarterly, 40(4), 819-848. |