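Author: 邱熙証 (Ciou, Si-Jheng)
Thesis title: Establishing a Big Data Analysis Framework and Computing Equilibrium Market Share with Vehicle Model (建立大數據分析架構並計算市占率均衡問題 - 以汽車市場為例)
Advisor: 李雨青 (Lee, Yu-Ching)
Oral examination committee: 林仁彥, 林國義
Degree: Master's
Institution: National Tsing Hua University (國立清華大學)
Department: Department of Industrial Engineering and Engineering Management
Student ID: 104034551
Year of publication (ROC calendar): 106 (2017)
Academic year of graduation: 105
Language: English
Number of pages: 70
Keywords: Big Data Methodology and Applications (大數據方法論及應用)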
  Data analysis on scalable frameworks has been a popular and widely used technique for many years. It can process and analyze data that general-purpose software cannot handle. For parallel computing and distributed storage, Hadoop is one of the more robust and capable tools.
  We apply big data analysis on Hadoop to uncover latent information and patterns. This study aims to provide a method for connecting earlier game-based models with existing scalable computing systems, using vehicle data as the case study. First, we build our own cluster and install the Hadoop ecosystem on it. Second, a three-step pre-analysis helps us understand the overall shape of the data. Third, we construct a mathematical model of vehicle market share based on the pre-analysis results and our research question. Finally, we analyze matrix size, computing time, and matrix splitting to evaluate the physical resources required to solve the mathematical optimization model on a distributed platform.
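  The matrix splitting mentioned in the abstract can be illustrated with a small local sketch. The thesis full text is not available in this record, so the code below is not the author's implementation: it is a minimal pure-Python emulation, under our own assumptions, of block-partitioned matrix multiplication written as a map step and a reduce step. All function names (`split_blocks`, `map_phase`, `reduce_phase`, `assemble`, `block_matmul`) and the `block` parameter are hypothetical, and no Hadoop API is used.

```python
from collections import defaultdict


def split_blocks(matrix, block):
    """Split a dense matrix (list of rows) into {(block_row, block_col): sub-matrix}."""
    rows, cols = len(matrix), len(matrix[0])
    blocks = {}
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            blocks[(r0 // block, c0 // block)] = [
                row[c0:c0 + block] for row in matrix[r0:r0 + block]
            ]
    return blocks


def multiply_dense(a, b):
    """Plain in-memory product of two small blocks."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_dense(a, b):
    """Element-wise sum of two blocks of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def map_phase(a_blocks, b_blocks):
    """Emit ((i, j), partial product) for every pair A[i, k], B[k, j] sharing k."""
    for (i, k), a_blk in a_blocks.items():
        for (k2, j), b_blk in b_blocks.items():
            if k == k2:
                yield (i, j), multiply_dense(a_blk, b_blk)


def reduce_phase(pairs):
    """Sum the partial products that share the same output block key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    result = {}
    for key, partials in grouped.items():
        total = partials[0]
        for partial in partials[1:]:
            total = add_dense(total, partial)
        result[key] = total
    return result


def assemble(blocks):
    """Stitch the output blocks back into one dense matrix."""
    n_block_rows = max(i for i, _ in blocks) + 1
    n_block_cols = max(j for _, j in blocks) + 1
    out = []
    for i in range(n_block_rows):
        for r in range(len(blocks[(i, 0)])):
            row = []
            for j in range(n_block_cols):
                row.extend(blocks[(i, j)][r])
            out.append(row)
    return out


def block_matmul(a, b, block=2):
    """Multiply a and b by splitting both into block x block fragments."""
    pairs = map_phase(split_blocks(a, block), split_blocks(b, block))
    return assemble(reduce_phase(pairs))
```

  In this sketch the fragment size `block` controls the trade-off the abstract alludes to: smaller fragments mean less memory per task but more partial products to shuffle and sum, while larger fragments mean fewer, heavier tasks.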
Chinese Abstract I
Abstract II
List of Illustrations VI
List of Tables VIII
Chapter 1 Introduction 1
1.1 Background and Motivation 1
1.2 Framework and Organization 5
Chapter 2 Literature Review 6
2.1 Big Data 6
2.2 Hadoop 7
2.3 Recent Research 7
Chapter 3 Methodology 9
3.1 Related Work 9
3.1.1 Big Data Applications 9
3.1.2 Commercial Hadoop Platforms 10
3.1.3 Consumer Behavior and Nash Equilibrium 11
3.1.4 Big Data Feature of the 4v and Equilibrium Problem of the Market Share 12
3.2 Analytic Tools 13
3.3 Establishing the Big Data Analysis Framework for Vehicle Data 16
3.3.1 Cluster Building 17
3.3.2 Big Data Analysis Framework 18
Chapter 4 Mathematical Model 20
4.1 Definition of Mathematical Symbols 20
4.2 Formulation of the Equations for a Mathematical Programming 21
Chapter 5 Case Study 25
5.1 Specification of the Cluster We Built 25
5.2 Pre-analysis Result of the Data 28
5.2.1 Data Collection 29
5.2.2 Data Cleaning 30
5.2.3 Data Mining 31
5.3 Discussion of the Distributed Computing for the Matrix Operations 34
5.3.1 Classifying where to store objects (local RAM or HDFS) in accordance with resources 34
5.3.2 Select the appropriate size of fragments to split Big Data 36
5.3.3 Simplify the algorithm to decrease the number of MapReduce functions 37
5.4 Procedure of the Distributed Computing 40
5.4.1 Matrix-Matrix Multiplication 40
5.4.2 Matrix Transposition 42
5.4.3 Inverse Matrix 43
5.5 Analysis of the Size of the Input Parameters and Outputs 45
5.5.1 Analysis of the Matrix Operations 45
5.5.2 Comparison with the Matrix Operations 54
Chapter 6 Conclusion 57
References 59
Appendix 63
Adomavicius, G., & Tuzhilin, A. (2005). Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE transactions on knowledge and data engineering, 17(6), 734-749.
Ahmad, T., Ahmad, R., Masud, S., & Nilofer, F. (2016, August). Framework to extract context vectors from unstructured data using big data analytics. In Contemporary Computing (IC3), 2016 Ninth International Conference on (pp. 1-6). IEEE.
Brynjolfsson, E., Hu, Y. J., & Smith, M. D. (2006). From niches to riches: Anatomy of the long tail. Sloan Management Review, 47(4), 67-71.
Chen, M., Mao, S., & Liu, Y. (2014). Big data: a survey. Mobile Networks and Applications, 19(2), 171-209.
Cox, M., & Ellsworth, D. (1997, August). Managing big data for scientific visualization. In ACM Siggraph (Vol. 97, pp. 146-162).
Chen, H., Chiang, R. H., & Storey, V. C. (2012). Business Intelligence and Analytics: From Big Data to Big Impact. MIS quarterly, 36(4), 1165-1188.
Dean, J., & Ghemawat, S. (2008). MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1), 107-113.
Dean, J., & Ghemawat, S. (2010). MapReduce: a flexible data processing tool. Communications of the ACM, 53(1), 72-77.
Einav, L., & Levin, J. (2014). Economics in the age of big data. Science, 346(6210), 1243089.
Ghemawat, S., Gobioff, H., & Leung, S. T. (2003, October). The Google file system. In ACM SIGOPS operating systems review (Vol. 37, No. 5, pp. 29-43). ACM.
Gunnels, J., Lin, C., Morrow, G., & Van De Geijn, R. (1998, March). A flexible class of parallel matrix multiplication algorithms. In Parallel Processing Symposium, 1998. IPPS/SPDP 1998. Proceedings of the First Merged International... and Symposium on Parallel and Distributed Processing 1998 (pp. 110-116). IEEE.
Hasani, Z. (2017, April). Implementation of Infrastructure for Streaming Outlier Detection in Big Data. In World Conference on Information Systems and Technologies (pp. 503-511). Springer, Cham.
He, W., Zha, S., & Li, L. (2013). Social media competitive analysis and text mining: A case study in the pizza industry. International Journal of Information Management, 33(3), 464-472.
Iskhakov, F., Lee, J., Rust, J., Schjerning, B., & Seo, K. (2016). Comment on “Constrained Optimization Approaches to Estimation of Structural Models”. Econometrica, 84(1), 365-370.
Janis, I. L., & Mann, L. (1977). Decision making: A psychological analysis of conflict, choice, and commitment. Free Press.
Karau, H., Konwinski, A., Wendell, P., & Zaharia, M. (2015). Learning Spark: Lightning-fast big data analysis. O'Reilly Media.
Liu, S., & Zhang, D. (2016, October). The Short-Term Traffic Flow Prediction Based on MapReduce. In Bio-Inspired Computing-Theories and Applications (pp. 448-453). Springer, Singapore.
Lohr, S. (2012). The age of big data. New York Times, 11.
Landset, S., Khoshgoftaar, T. M., Richter, A. N., & Hasanin, T. (2015). A survey of open source tools for machine learning with big data in the Hadoop ecosystem. Journal of Big Data, 2(1), 1.
McAfee, A., Brynjolfsson, E., Davenport, T. H., Patil, D. J., & Barton, D. (2012). Big data. The management revolution. Harvard Bus Rev, 90(10), 61-67.
Marr B. (2015, July 21) Turning on the tap: Discuss commercial Hadoop platforms. Wellesley Information Services. Retrieved June 14, 2016, from http://data-informed.com/10-top-commercial-hadoop-platforms/
McKelvey, R. D., & McLennan, A. (1996). Computation of equilibria in finite games. Handbook of computational economics, 1, 87-142.
Ning, C., & You, F. (2017). Data‐Driven Adaptive Nested Robust Optimization: General Modeling Framework and Efficient Computational Solution Algorithm for Decision Making under Uncertainty. AIChE Journal.
Pang, B., & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and trends in information retrieval, 2(1-2), 1-135.
Pang, J. S., Su, C. L., & Lee, Y. C. (2015). A constructive approach to estimating pure characteristics demand models with pricing. Operations Research, 63(3), 639-659.
Radhika, T. V., Gouda, K. C., & Kumar, S. S. (2016, October). Big data research in climate science. In Communication and Electronics Systems (ICCES), International Conference on (pp. 1-6). IEEE.
Richtárik, P., & Takáč, M. (2016). Parallel coordinate descent methods for big data optimization. Mathematical Programming, 156(1-2), 433-484.
Sarnovsky, M., Butka, P., & Huzvarova, A. (2017, January). Twitter data analysis and visualizations using the R language on top of the Hadoop platform. In Applied Machine Intelligence and Informatics (SAMI), 2017 IEEE 15th International Symposium on (pp. 000327-000332). IEEE.
Sitto, K., & Presser, M. (2015). Field Guide to Hadoop: An Introduction to Hadoop, Its Ecosystem, and Aligned Technologies. O'Reilly Media.
Sharmila, K., & Manickam, S. V. (2016). Diagnosing Diabetic Dataset using Hadoop and K-means Clustering Techniques. Indian Journal of Science and Technology, 9(40).
Shvachko, K., Kuang, H., Radia, S., & Chansler, R. (2010, May). The hadoop distributed file system. In 2010 IEEE 26th symposium on mass storage systems and technologies (MSST) (pp. 1-10). IEEE.
Su, C. L., & Judd, K. L. (2012). Constrained optimization approaches to estimation of structural models. Econometrica, 80(5), 2213-2230.
Tufekci, Z. (2014). Big questions for social media big data: Representativeness, validity and other methodological pitfalls. arXiv preprint arXiv:1403.7400.
Wu, X., Zhu, X., Wu, G. Q., & Ding, W. (2014). Data mining with big data. IEEE transactions on knowledge and data engineering, 26(1), 97-107.
Yang, X., Liu, S., Feng, K., Zhou, S., & Sun, X. H. (2016, October). Visualization and Adaptive Subsetting of Earth Science Data in HDFS: A Novel Data Analysis Strategy with Hadoop and Spark. In Big Data and Cloud Computing (BDCloud), Social Computing and Networking (SocialCom), Sustainable Computing and Communications (SustainCom)(BDCloud-SocialCom-SustainCom), 2016 IEEE International Conferences on (pp. 89-96). IEEE.
Zhang, J. (2015, March). Big data issues in medical imaging informatics. In SPIE Medical Imaging (pp. 941803-941803). International Society for Optics and Photonics.
Zhang, Y., Chen, M., Mao, S., Hu, L., & Leung, V. C. (2014). Cap: Community activity prediction based on big data analysis. IEEE Network, 28(4), 52-57.
Zheng, K., Yang, Z., Zhang, K., Chatzimisios, P., Yang, K., & Xiang, W. (2016). Big data-driven optimization for mobile networks toward 5G. IEEE Network, 30(1), 44-51.
(The full text of this thesis has not been authorized for public access.)