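A Data Mining Approach to Surveying Academic Literature on Behavioral Big Data in Operations Management Research | National Tsing Hua University Theses and Dissertations Full-Text Image System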


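Author (Chinese):	麥慧芬
Author (English):	Mach, Patrizia
Thesis title (Chinese):	應用資料探勘方法於辨別行為大數據之運籌管理文獻
Thesis title (English):	A Data Mining Approach to Surveying Academic Literature on Behavioral Big Data in Operations Management Research
Advisor (Chinese):	徐茉莉
Advisor (English):	Shmueli, Galit
Oral defense committee (Chinese):	李曉惠、林福仁
Oral defense committee (English):	Lee, Hsiao-Hui; Lin, Furen
Degree:	Master's
Institution:	National Tsing Hua University
Department:	International Master of Business Administration (IMBA)
Student ID:	106077429
Year of publication (ROC calendar):	108 (2019)
Academic year of graduation (ROC calendar):	107
Language:	English
Number of pages:	60
Keywords (translated from Chinese):	applied data, big data, literature review, operations management
Keywords (English):	Behavioral Big Data, Data Mining, Operations Management, Academic Literature, Big Data, Classification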

Abstract

Behavioral big data has become a recent focus in operations management research, which seeks to aid decision making through traditional mathematical modelling that captures human behavior. To explore the depth of research on this topic, it is common to conduct a comprehensive literature review of published papers. However, manually identifying the papers that contain behavioral big data in a large and growing pool of published operations management research is both time- and labor-intensive, and it does not yield an explicit definition of the factors that mark a paper as pertaining to behavioral big data. This research provides an efficient data mining method for surveying a vast number of research articles across different operations management journals and objectively classifying each paper as relevant or not. We find that the model detected a larger number of papers and classified them correctly when the journal's content and structure were similar to those of the training set. Although it was not possible to automate the entire procedure, classifying documents without manually reading each paper reduces the time needed to identify a narrower subgroup for close examination and, by requiring a set of conditions for identifying behavioral big data, provides a more efficient and structured literature review process.
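The screening step described in the abstract (score each paper, keep those above a cut-off for close reading, then check how well the shortlist matches manual labels) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the paper IDs, scores, labels, cut-off values, and function names are hypothetical and are not taken from the thesis, whose own models and data are not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Sequence


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # relevant papers flagged as relevant
    fp: int  # irrelevant papers flagged as relevant
    tn: int  # irrelevant papers flagged as irrelevant
    fn: int  # relevant papers flagged as irrelevant

    @property
    def sensitivity(self) -> float:
        """Share of truly relevant papers that were flagged (same as recall)."""
        return _ratio(self.tp, self.tp + self.fn)

    @property
    def specificity(self) -> float:
        """Share of truly irrelevant papers that were correctly left out."""
        return _ratio(self.tn, self.tn + self.fp)

    @property
    def precision(self) -> float:
        """Share of flagged papers that are truly relevant."""
        return _ratio(self.tp, self.tp + self.fp)


def confusion_at_cutoff(scores: Sequence[float], labels: Sequence[int],
                        cutoff: float) -> ConfusionMatrix:
    """Flag a paper as relevant when its score is at or above the cut-off."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        flagged = score >= cutoff
        if flagged and label == 1:
            tp += 1
        elif flagged and label == 0:
            fp += 1
        elif not flagged and label == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionMatrix(tp, fp, tn, fn)


def shortlist(paper_ids: Sequence[str], scores: Sequence[float],
              cutoff: float) -> List[str]:
    """Return the papers a reviewer would still read manually."""
    return [pid for pid, score in zip(paper_ids, scores) if score >= cutoff]


# Hypothetical classifier scores and manual labels (1 = relevant).
paper_ids = ["p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8"]
scores = [0.92, 0.81, 0.40, 0.35, 0.77, 0.10, 0.55, 0.05]
labels = [1, 1, 1, 0, 0, 0, 1, 0]

for cutoff in (0.5, 0.3):
    cm = confusion_at_cutoff(scores, labels, cutoff)
    print(f"cutoff={cutoff}: shortlist={shortlist(paper_ids, scores, cutoff)}")
    print(f"  sensitivity={cm.sensitivity:.2f} "
          f"specificity={cm.specificity:.2f} precision={cm.precision:.2f}")
```

In this toy data, lowering the cut-off from 0.5 to 0.3 catches every relevant paper but lengthens the shortlist that still has to be read by hand, which is the trade-off a reviewer would weigh when choosing a cut-off value.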

Table of Contents
Acknowledgement
List of Figures
List of Tables
Abstract
Chinese Abstract
1. Introduction
1.1. Motivation
2. Data
2.1. Extraction
2.2. Labeling
2.3. Exploration
2.4. Challenges
3. Data Mining Approach
3.1. General Approach
3.2. Classification Algorithms
3.2.1. Classification Tree
3.2.2. Random Forest
3.2.3. Boosted Tree
3.2.4. Logistic Regression
3.2.5. LASSO Logistic Regression
3.2.6. Ensemble
3.3. Performance Evaluation
3.3.1. Sensitivity and Specificity
3.3.2. ROC Curves
3.3.3. Confusion Matrices and Cut-off Value
3.3.4. Precision and Recall
4. Data Analysis and Results
4.1. Data Pre-Screening
4.2. Data Partitioning
4.3. Benchmark Model
4.4. Model Training and Evaluation
4.4.1. Classification Tree
4.4.2. Random Forest
4.4.3. Gradient Boosted Tree
4.4.4. Logistic Regression
4.4.5. LASSO Regression (L1 Regularization)
4.4.6. Ensemble
5. Classifying Test Sets
5.1. Classifying Test Set: 2017 MS Journal
5.2. Classifying Test Set: 2017 MSOM Journal
5.3. Classifying Test Set: 2017 POM Journal
6. Conclusion
6.1. Limitations
6.2. Recommendations
6.3. Future Work
References

