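Author (romanized): Ciou, Si-Jheng
Title (translated from Chinese): Establishing a Big Data Analysis Framework and Computing the Market Share Equilibrium Problem: The Case of the Automobile Market
Title (English): Establishing a Big Data Analysis Framework and Computing Equilibrium Market Share with Vehicle Model
Advisor (romanized): Lee, Yu-Ching
Keywords (English): Big Data Methodology and Applications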
  Data analysis on scalable frameworks has been a popular and widely used technique for many years. It can process and analyze data that general-purpose software cannot handle. Hadoop is one of the robust, well-established tools for parallel computing and distributed storage.
  We use big data analysis on Hadoop to discover latent information and patterns. This paper aims to provide a method for connecting an advanced game-based model with an existing scalable computing system, using vehicle data as the case study. First, we build a cluster and install the Hadoop ecosystem ourselves. Second, a three-step pre-analysis of the data helps us understand its overall structure. Third, we construct a mathematical model of vehicle market share based on the pre-analysis results and our research question. Finally, we analyze matrix size, computing time, and matrix splitting to evaluate the physical resources required to solve the mathematical optimization model on a distributed platform.
Chinese Abstract
Abstract
List of Illustrations
List of Tables
Chapter 1 Introduction
1.1 Background and Motivation
1.2 Framework and Organization
Chapter 2 Literature Review
2.1 Big Data
2.2 Hadoop
2.3 Recent Research
Chapter 3 Methodology
3.1 Related Work
3.1.1 Big Data Applications
3.1.2 Commercial Hadoop Platforms
3.1.3 Consumer Behavior and Nash Equilibrium
3.1.4 Big Data Features of the 4V and Equilibrium Problem of the Market Share
3.2 Analytic Tools
3.3 Establishing the Big Data Analysis Framework for Vehicle Data
3.3.1 Cluster Building
3.3.2 Big Data Analysis Framework
Chapter 4 Mathematical Model
4.1 Definition of Mathematical Symbols
4.2 Formulation of the Equations for a Mathematical Program
Chapter 5 Case Study
5.1 Specification of the Cluster We Built
5.2 Pre-analysis Result of the Data
5.2.1 Data Collection
5.2.2 Data Cleaning
5.2.3 Data Mining
5.3 Discussion of the Distributed Computing for the Matrix Operations
5.3.1 Classifying where to store objects (local RAM or HDFS) in accordance with resources
5.3.2 Select the appropriate size of fragments to split Big Data
5.3.3 Simplify the algorithm to decrease the number of MapReduce functions
5.4 Procedure of the Distributed Computing
5.4.1 Matrix-Matrix Multiplication
5.4.2 Matrix Transposition
5.4.3 Inverse Matrix
5.5 Analysis of the Size of the Input Parameters and Outputs
5.5.1 Analysis of the Matrix Operations
5.5.2 Comparison with the Matrix Operations
Chapter 6 Conclusion
References
Appendix
