Author (Chinese): 歐陽憲毅
Author (English): OU YANG, SHIAN-YI
Thesis title (Chinese): 應用機器學習技術於車輛偵測-以ETC數據為例
Thesis title (English): Machine Learning Techniques for Vehicle Detection based on ETC data
Advisor (Chinese): 蘇朝墩
Advisor (English): Su, Chao-Ton
Oral defense committee (Chinese): 陳穆臻; 蕭宇翔
Oral defense committee (English): Chen, Mu-Chen; Hsiao, Yu-Hsiang
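Degree: Master's
Institution: National Tsing Hua University (國立清華大學)
Department: Department of Industrial Engineering and Engineering Management
Student ID: 106034522
Year of publication (ROC calendar): 108 (2019)
Academic year of graduation (ROC calendar): 107
Language: English
Number of pages: 58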
Keywords (Chinese): 機器學習; 資料挖礦; 大數據; 基因演算法
Keywords (English): Machine Learning; Data Mining; Big Data; Genetic Algorithm
As technology advances, the volume of available data keeps growing and the problems to be solved become more complex. Problems that could once be solved with statistical analysis are becoming increasingly difficult to solve that way. We therefore propose an analysis procedure that applies machine learning techniques to big data problems.
This study uses feature selection to identify the key factors that affect the detection rate, then combines a neural network with a genetic algorithm to optimize the settings of those factors, and uses association rules to extract meaningful rules.
Once the key factors affecting the detection rate have been identified, they can be handed to the product design department for further analysis. The optimization results show that, under ideal conditions, the detection rate can be raised from 89% to 93%. If the company can reach this target, it will save labor costs and further improve the detection rate.
Finally, the study offers companies a reference approach to choose from when they face complex and diverse information in the future.
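The optimization step described in the abstract (a trained model scores candidate factor settings, and a genetic algorithm searches for the best-scoring combination) can be sketched roughly as follows. This is an illustration only, not the thesis's implementation: `surrogate_score` is a hypothetical stand-in for the trained neural network, and its target values, the number of factors, and all genetic algorithm parameters are invented for the example.

```python
import random


def surrogate_score(settings):
    """Hypothetical stand-in for a trained neural network.

    Maps factor settings (each scaled to [0, 1]) to a score to maximize.
    The target values below are made up for illustration.
    """
    targets = (0.2, 0.7, 0.5)
    penalty = sum((s - t) ** 2 for s, t in zip(settings, targets))
    return 1.0 - penalty


def genetic_search(fitness, n_genes=3, pop_size=30, generations=60,
                   mutation_rate=0.2, seed=0):
    """Maximize `fitness` over vectors in [0, 1]^n_genes with a simple GA."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: carry the two best individuals over unchanged.
        elite = sorted(population, key=fitness, reverse=True)[:2]
        children = []
        while len(children) < pop_size - len(elite):
            # Tournament selection: best of three random individuals.
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            # Uniform crossover: each gene comes from either parent.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # Gaussian mutation, clipped back into [0, 1].
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = elite + children
    return max(population, key=fitness)


best = genetic_search(surrogate_score)
best_score = surrogate_score(best)
print(best, best_score)
```

In a real application the surrogate would be replaced by the fitted model's prediction function, and the gene bounds by the feasible range of each factor.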
1. Introduction
1.1 Research Background and Motivation
1.2 Research Purposes
1.3 Research Architecture
2. Related Work
2.1 Electronic Toll Collection System
2.2 Artificial Neural Network
2.3 Random Forest
2.4 Adaptive Boosting Decision Trees
2.5 Logistic Regression
2.6 Support Vector Machine
2.7 Feature Selection and its Applications
2.8 Genetic Algorithm
2.9 Integration of Neural Network and Genetic Algorithm
2.10 Association Rules
3. Research Methodology
3.1 Proposed Procedure
3.2 Data Preprocessing
3.3 Feature Selection Methods
3.3.1 Important Input Variables in Neural Network
3.3.2 Random Forest
3.3.3 Adaptive Boosting Decision Trees
3.3.4 Support Vector Machine
3.3.5 Logistic Regression
3.4 Optimal Setting by Neural Network and Genetic Algorithm
3.5 Association Rule
4. Implementation
4.1 Case Description
4.2 Data Collection and Observation
4.3 Data Preprocessing
4.4 Implementation of Feature Selection
4.4.1 Feature Selection using Neural Networks
4.4.2 Feature Selection using Random Forest
4.4.3 Feature Selection using Adaptive Boosting Decision Trees
4.4.4 Feature Selection using Support Vector Machine
4.4.5 Feature Selection using Logistic Regression
4.4.6 Summary of Important Attributes
4.4.7 Comparison of Full and Reduced Models
4.5 Combining Neural Networks and Genetic Algorithms for Optimizing the Settings for Important Features
4.5.1 Data Preprocessing
4.5.2 Training a Neural Network Model
4.5.3 Applying a Genetic Algorithm to Optimize the Settings for Important Features
4.5.4 Association Rule Implementation
4.6 Results and Discussion
5. Conclusion
5.1 Conclusion
5.2 Future Study
References
Appendix
(The full text of this thesis has not been authorized for public access.)