Detailed Record

Author (Chinese): 張文騰
Author (English): Chang, Wen-Teng
Title (Chinese): 基於梯度下降之二次多項分類器
Title (English): Gradient-based Quadratic Multiform Separation
Advisors (Chinese): 銀慶剛, 范可輝
Advisors (English): Ing, Ching-Kang; Fan, Michael
Committee Members (Chinese): 盧鴻興, 俞淑惠
Committee Members (English): Lu, Horng-Shing; Yu, Shu-Hui
Degree: Master
Institution: 國立清華大學 (National Tsing Hua University)
Department: 統計學研究所 (Institute of Statistics)
Student ID: 108024507
Publication Year (ROC calendar): 110 (2021)
Graduation Academic Year: 109
Language: English
Number of Pages: 40
Keywords (Chinese): Quadratic Multiform Separation, adaptive moment estimation, supervised learning, classification model
Keywords (English): Quadratic Multiform Separation, QMS, Adam, Supervised Learning, Classification
Abstract (Chinese): Classification in machine learning is a form of supervised learning that aims to assign data to different classes. Many classification methods are in wide use today, such as k-nearest neighbors, random forest, and support vector machines. Each classifier has its own strengths and weaknesses, and no single classifier performs best on every problem. This thesis focuses on Quadratic Multiform Separation (QMS), a new classification method proposed by Michael Fan et al. in 2019. Its novel concepts, rich mathematical structure, and innovative definition of the loss function set it apart from existing classification methods. Building on the QMS framework, we propose using Adam, a gradient-based optimization algorithm, to minimize the QMS-specific loss function and obtain the final classifier. In addition, we offer model-tuning suggestions by exploring the relationship between the hyperparameters and accuracy. Empirical results show that the accuracy of QMS is on par with existing classification algorithms, and its performance comes close to that of gradient boosting classifiers, the perennial winners of machine learning competitions.
Abstract (English): Classification, as a supervised learning task, is an important topic in machine learning. It aims at categorizing a set of data into classes. Several classification methods are in common use nowadays, such as k-nearest neighbors, random forest, and support vector machine. Each of them has its own pros and cons, and none of them is invincible for all kinds of problems. In this thesis, we focus on Quadratic Multiform Separation (QMS), a classification method recently proposed by Michael Fan et al. (2019). Its fresh concept, rich mathematical structure, and innovative definition of the loss function set it apart from existing classification methods. Inspired by QMS, we propose utilizing a gradient-based optimization method, Adam, to obtain a classifier that minimizes the QMS-specific loss function. In addition, we provide suggestions for model tuning through explorations of the relationship between hyperparameters and accuracy. Our empirical results show that QMS performs as well as most classification methods in terms of accuracy, and its performance is almost comparable to that of gradient boosting algorithms, which have won numerous machine learning competitions.
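To make the training procedure described in the abstract concrete, below is a minimal sketch (in PyTorch) of how a QMS-style classifier could be trained with Adam. It assumes quadratic member functions of the form f_j(x) = ||W_j x + b_j||², with a sample assigned to the class whose member-function value is smallest; the rank parameter, the 784/10 dimensions (matching MNIST), and the cross-entropy surrogate loss are illustrative stand-ins, since the thesis's QMS-specific loss (with hyperparameters q and α) is not reproduced on this page.

```python
# Minimal sketch of a QMS-style classifier trained with Adam.
# Assumptions: quadratic member functions f_j(x) = ||W_j x + b_j||^2,
# prediction by the smallest member value, and a cross-entropy surrogate
# loss standing in for the thesis's QMS-specific loss.
import torch
import torch.nn as nn

class QMSClassifier(nn.Module):
    def __init__(self, n_features: int, n_classes: int, rank: int = 16):
        super().__init__()
        # One affine map per class; the member value is its squared norm.
        self.maps = nn.ModuleList(
            [nn.Linear(n_features, rank) for _ in range(n_classes)]
        )

    def member_values(self, x: torch.Tensor) -> torch.Tensor:
        # Shape (batch, n_classes); a smaller value means a better fit.
        return torch.stack([(m(x) ** 2).sum(dim=1) for m in self.maps], dim=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assign each sample to the class with the smallest member value.
        return self.member_values(x).argmin(dim=1)

model = QMSClassifier(n_features=784, n_classes=10)  # MNIST-sized example
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()  # stand-in for the QMS-specific loss

def train_step(x: torch.Tensor, y: torch.Tensor) -> float:
    optimizer.zero_grad()
    # Negate member values so the smallest one gets the largest logit.
    logits = -model.member_values(x)
    loss = loss_fn(logits, y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Repeating train_step over mini-batches yields the gradient-based training loop the abstract refers to; in the thesis itself the surrogate loss above would be replaced by the QMS-specific loss, where the hyperparameters q and α enter, while the Adam update is unchanged.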
Abstract
Abstract (Chinese)
Acknowledgement
Contents
1 Introduction
2 Machine Learning Algorithms
2.1 k-Nearest Neighbors (k-NN)
2.2 Logistic Regression
2.3 Random Forest (RF)
2.4 Support Vector Machine (SVM)
2.5 eXtreme Gradient Boosting (XGBoost)
2.6 Artificial Neural Network (ANN)
3 Quadratic Multiform Separation
3.1 Introduction
3.2 Multiform Separation
3.2.1 Member Functions
3.2.2 Multiform Separation
3.2.3 Quadratic Multiform Separation
3.3 Classification Using QMS
3.3.1 Learning Structure
3.3.2 Learning Algorithm
3.4 Schematic Diagram
4 QMS Optimization and Properties
4.1 Optimization Algorithm
4.1.1 Mathematical Derivation of Gradients
4.1.2 Training Algorithm and Results
4.2 Properties of Hyperparameters
4.2.1 Weight Hyperparameter q
4.2.2 Control Hyperparameter α
5 Empirical Model Performance
5.1 Description of Datasets
5.1.1 MNIST
5.1.2 Fashion MNIST
5.1.3 Dogs vs. Cats
5.1.4 Census Income
5.1.5 Dry Bean
5.2 Performance Evaluation
6 Conclusion
References
[1] Phil Simon. Too Big to Ignore: The Business Case for Big Data. Wiley, 1st edition, 2013.
[2] Tom M. Mitchell. Machine Learning. McGraw-Hill Series in Computer Science. McGraw-Hill, 1st edition, 1997.
[3] Ko-Hui Michael Fan, Chih-Chung Chang, Ye-Hwa Chen, and Kuang-Hsiao-Yin Kongguoluo. U.S. Patent Application 17/148,860, filed January 14, 2021.
[4] Ko-Hui Michael Fan, Chih-Chung Chang, and Kuang-Hsiao-Yin Kongguoluo. U.S. Patent Application 17/230,283, filed April 14, 2021.
[5] Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. An Introduction to Statistical Learning with Applications in R. Springer Texts in Statistics, Vol. 103. Springer, 2013.
[6] Leo Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.
[7] Laura Raileanu and Kilian Stoffel. Theoretical comparison between the Gini index and information gain criteria. Annals of Mathematics and Artificial Intelligence, 41:77–93, 2004.
[8] Galit Shmueli, Peter C. Bruce, Inbal Yahav, Nitin R. Patel, and Kenneth C. Lichtendahl Jr. Data Mining for Business Analytics: Concepts, Techniques, and Applications in R. Wiley, 1st edition, 2017.
[9] William S. Noble. What is a support vector machine? Nature Biotechnology, 24(12):1565–1567, December 2006.
[10] Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. CoRR, abs/1603.02754, 2016.
[11] Anil K. Jain, Jianchang Mao, and K. M. Mohiuddin. Artificial neural networks: A tutorial. IEEE Computer, 29(3):31–44, 1996.
[12] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2017.
[13] Yann LeCun and Corinna Cortes. MNIST handwritten digit database, 2010. http://yann.lecun.com/exdb/mnist/
[14] Han Xiao, Kashif Rasul, and Roland Vollgraf. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747, 2017.
[15] Kaggle. Dogs vs. Cats. https://www.kaggle.com/c/dogs-vs-cats/overview, 2013.
[16] Dheeru Dua and Casey Graff. UCI Machine Learning Repository: Census Income dataset, 2017.
[17] Murat Koklu and Ilker Ali Ozkan. Multiclass classification of dry beans using computer vision and machine learning techniques. Computers and Electronics in Agriculture, 174:105507, 2020.
(The full text will be open to external access after 2026/07/21.)