Author: Mach, Patrizia
Thesis title: A Data Mining Approach to Surveying Academic Literature on Behavioral Big Data in Operations Management Research
Advisor: Shmueli, Galit
Oral examination committee: Lee, Hsiao-Hui; Lin, Furen
Keywords: Behavioral Big Data; Data Mining; Operations Management; Academic Literature; Big Data; Classification
Behavioral big data has recently become a focus of operations management research, as the field attempts to aid decision making using traditional mathematical modelling that captures human behavior. To explore the depth of research on this topic, it is common to conduct a comprehensive literature review of published papers. However, manually identifying which papers in the large and growing pool of published operations management research contain behavioral big data is both time- and labor-intensive, and a manual review does not explicitly define the factors that identify a paper as pertaining to behavioral big data. This research provides an efficient data mining method for surveying a vast number of research articles across different operations management journals, one that objectively classifies papers as relevant or not. We find that the model detected a larger number of papers and classified them correctly when the journal's content and structure were similar to those of the training set. Although it was not possible to automate the entire procedure, classifying documents without manually reading each paper takes less time to identify a narrower subgroup for close examination and, by requiring an explicit set of conditions for identifying behavioral big data, provides a more efficient and structured literature review process.
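The abstract does not spell out how the screening is scored. As a minimal sketch, assuming a classifier assigns each paper a relevance score between 0 and 1 and a cut-off value turns scores into "relevant" or "not relevant", the evaluation measures named in the table of contents (sensitivity, specificity, precision, recall) can be computed as below. The scores, labels, and function names are hypothetical illustrations and are not taken from the thesis.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(
    scores: Sequence[float], labels: Sequence[int], cutoff: float
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) when papers scoring >= cutoff are flagged relevant."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= cutoff else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute the usual screening measures; undefined ratios are reported as 0.0."""
    def ratio(num: int, den: int) -> float:
        return num / den if den else 0.0

    recall = ratio(tp, tp + fn)  # recall is the same quantity as sensitivity
    return {
        "sensitivity": recall,
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "recall": recall,
    }


# Hypothetical scores for six papers; label 1 means the paper was manually
# judged to contain behavioral big data, 0 means it was not.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

for cutoff in (0.5, 0.35):
    counts = confusion_counts(scores, labels, cutoff)
    print(cutoff, counts, screening_metrics(*counts))
```

In this toy data, lowering the cut-off from 0.5 to 0.35 recovers the one relevant paper that was missed, at the cost of a longer list to read by hand. This is the trade-off a reviewer would weigh when the goal is a narrower subgroup for close examination.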
Table of Contents
Acknowledgement
List of Figures
List of Tables
Abstract
Chinese Abstract
1. Introduction
1.1. Motivation
2. Data
2.1. Extraction
2.2. Labeling
2.3. Exploration
2.4. Challenges
3. Data Mining Approach
3.1. General Approach
3.2. Classification Algorithms
3.2.1. Classification Tree
3.2.2. Random Forest
3.2.3. Boosted Tree
3.2.4. Logistic Regression
3.2.5. LASSO logistic regression
3.2.6. Ensemble
3.3. Performance Evaluation
3.3.1. Sensitivity and Specificity
3.3.2. ROC Curves
3.3.3. Confusion Matrices and Cut-off Value
3.3.4. Precision and Recall
4. Data Analysis and Results
4.1. Data Pre-Screening
4.2. Data Partitioning
4.3. Benchmark Model
4.4. Model Training and Evaluation
4.4.1. Classification Tree
4.4.2. Random Forest
4.4.3. Gradient Boosted Tree
4.4.4. Logistic Regression
4.4.5. LASSO Regression (L1 regularization)
4.4.6. Ensemble
5. Classifying test sets
5.1. Classifying Test Set: 2017 MS Journal
5.2. Classifying Test Set: 2017 MSOM Journal
5.3. Classifying Test Set: 2017 POM Journal
6. Conclusion
6.1. Limitations
6.2. Recommendations
6.3. Future Work
References

