
Detailed Record

Author (Chinese): 施彥池
Author (English): Shih, Yan-Chih
Title (Chinese): 應用混和式演算法優化卷積神經網路
Title (English): Optimization of Convolutional Neural Network Using Hybrid Algorithm
Advisor (Chinese): 葉維彰
Advisor (English): Yeh, Wei-Chang
Committee Members (Chinese): 張桂琥, 鍾武勳
Committee Members (English): Cheng, Kuei-hu; Chung, Wu-Hsun
Degree: Master
Institution: National Tsing Hua University
Department: Industrial Engineering and Engineering Management
Student ID: 105034519
Publication Year (ROC calendar): 107 (2018)
Graduation Academic Year: 106
Language: Chinese
Number of Pages: 53
Keywords (Chinese): image recognition, convolutional neural network, improved simplified swarm optimization, stochastic gradient descent
Keywords (English): Image recognition, convolutional neural network, Improved Simplified Swarm Optimization
Abstract (translated from Chinese):
In recent years, convolutional neural networks (CNNs) have shown outstanding performance in large-scale image processing and have therefore been widely applied to image recognition. A CNN consists of one or more convolutional layers with a classic fully connected neural network on top, together with the associated weights and pooling layers. Weight sharing shortens the time needed to learn features and reduces the image dimensionality, which makes CNNs more efficient than other classifiers.
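For concreteness, here is a minimal sketch of the kind of CNN described above, written with the tf.keras API (TensorFlow is cited by the thesis [28]); the layer sizes and the 28x28 grayscale input are illustrative assumptions, not the architecture used in the thesis.

import tensorflow as tf

model = tf.keras.Sequential([
    # A convolutional layer reuses one small kernel at every spatial
    # position (weight sharing), so it has far fewer parameters than a
    # fully connected layer over the same input.
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    # Pooling reduces the spatial dimensions of the feature maps.
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    # The classic fully connected network on top produces class scores.
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

Each MaxPooling2D halves the feature-map height and width, which is the dimensionality reduction the paragraph above credits with making CNNs efficient.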
A CNN is trained with the back-propagation (BP) algorithm, which adjusts the weights and biases so that the error shrinks step by step. The most common optimizer for BP is stochastic gradient descent (SGD), with Adam, Adadelta, and others also in use; researchers have shown that any of these optimizers used alone easily becomes trapped in a local optimum. Many remedies have been proposed, such as adjusting the partial derivatives in the BP algorithm, modifying the input and output of the activation function, or applying soft computing to correct the model. Of these approaches, this study focuses on exploring and improving the soft-computing one.
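As a reference point, the SGD update that these optimizers build on is simply a step against the gradient. The following self-contained Python sketch applies it to a toy squared-error loss; the learning rate, loss, and single training sample are illustrative, not the thesis's settings.

import numpy as np

def loss_gradient(w, x, y):
    # Gradient of the squared error 0.5 * (w @ x - y)**2 with respect to w.
    return (w @ x - y) * x

w = np.zeros(3)                        # weights being trained
x, y = np.array([1.0, 2.0, 3.0]), 2.0  # one training sample and its target
lr = 0.01                              # learning rate
for _ in range(100):
    w -= lr * loss_gradient(w, x, y)   # move against the gradient

Because each step follows only the local gradient, the iterate settles in whichever basin it starts in; that is the local-optimum problem the soft-computing hybrids discussed next are meant to address.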
Past research on soft-computing approaches is limited, and most of it uses particle swarm optimization (PSO). Among these works, the hybrid algorithm proposed by Hayder M. Albeahdili trains neural networks with notably strong classification performance: combining PSO with SGD yields higher image-classification accuracy than the improvements proposed for the original optimizers, confirming that soft computing can keep back-propagation from settling in a local optimum.
This study proposes combining improved simplified swarm optimization (ISSO) with SGD to train a convolutional neural network, letting the two optimizers train the network in random alternation until the iterations finish. Because ISSO's solutions are memoryless, the search can stay at the global best (gbest), and over many iterations the best loss value is reached; together, these two properties let the hybrid algorithm build a better prediction model and raise classification accuracy. Finally, the proposed method is compared with the PSO-SGD of the benchmark paper and with the commonly used optimizers Adam, Adadelta, RMSProp, and Momentum, confirming that it overcomes the weakness of the BP algorithm and holds an advantage in improving image-classification accuracy.
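To make the alternation concrete, the following self-contained Python sketch randomly interleaves an SGD step with an SSO-style update toward the personal and global best on a toy quadratic loss. The cg/cp/cw thresholds, the SGD usage ratio, and the toy loss are illustrative assumptions; the thesis's actual fitness function, particle encoding, and parameter values are those described in its Chapters 3 and 4.

import random

def loss(w):                   # toy stand-in for the CNN loss function
    return sum(v * v for v in w)

def grad(w):                   # gradient of the toy loss
    return [2 * v for v in w]

def hybrid_step(x, pbest, gbest, lr=0.1, cg=0.4, cp=0.7, cw=0.9, sgd_ratio=0.5):
    if random.random() < sgd_ratio:               # gradient-based move (SGD)
        return [xi - lr * gi for xi, gi in zip(x, grad(x))]
    new = []
    for j, xj in enumerate(x):                    # SSO-style update, variable by variable
        r = random.random()                       # one memoryless draw per variable
        if r < cg:
            new.append(gbest[j])                  # copy from the global best
        elif r < cp:
            new.append(pbest[j])                  # copy from the personal best
        elif r < cw:
            new.append(xj)                        # keep the current value
        else:
            new.append(random.uniform(-1, 1))     # random restart of this variable
    return new

x = [random.uniform(-1, 1) for _ in range(5)]
pbest, gbest = list(x), list(x)
for _ in range(200):
    x = hybrid_step(x, pbest, gbest)
    if loss(x) < loss(pbest):
        pbest = list(x)
    if loss(pbest) < loss(gbest):
        gbest = list(pbest)
print("best loss:", loss(gbest))

The draw that decides between gbest, pbest, the current value, and a random restart does not depend on the search history, which is the memoryless property the abstract credits with keeping the search anchored at the global best.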
English Abstract:
In recent years, the convolutional neural network (CNN) has been widely used in the field of image recognition owing to its excellent performance in large-scale image processing. A CNN consists of one or more convolutional layers and a classic fully connected neural network on top, together with the associated weights and pooling layers. Because the image dimensionality is reduced, training time decreases and higher accuracy is achieved.
The CNN model uses back-propagation (BP) to train the weights and biases so that the error becomes smaller and smaller. However, repeated studies have found that the BP algorithm has a fatal flaw: researchers have proved that it easily falls into a local optimum. There are many solutions to this problem, such as adjusting the partial derivatives of the BP algorithm or correcting the activation function. Some researchers propose soft computing to train the model, and this research focuses on how to improve the CNN model using soft computing.
This study proposes improved simplified swarm optimization (ISSO) combined with SGD to train convolutional neural networks. The hybrid algorithm can establish a better prediction model and improve classification accuracy. To validate the proposed method, we compare it against the PSO-SGD of the benchmark paper as well as the commonly used optimizers Adam, Adadelta, RMSProp, and Momentum, showing that the proposed method can resolve the plight of the BP algorithm.
Table of Contents

Abstract I
English Abstract II
Table of Contents III
List of Figures VI
List of Tables VIII
Chapter 1: Introduction 1
1.1 Research Background and Motivation 1
1.2 Research Framework 3
Chapter 2: Literature Review 5
2.1 Machine Learning 5
2.1.1 Supervised Learning 5
2.1.2 Unsupervised Learning 5
2.1.3 Reinforcement Learning 6
2.2 Deep Learning 7
2.3 Artificial Neural Networks 7
2.3.1 Single-Layer Perceptron 7
2.3.2 Multilayer Perceptron 8
2.4 Convolutional Neural Networks 8
2.4.1 Convolutional Layer 9
2.4.2 Pooling Layer 12
2.4.3 Activation Function 13
2.4.4 Fully Connected Layer 14
2.4.5 Softmax Function 14
2.4.6 Loss Function 15
2.4.7 Gradient Descent 15
2.4.8 Forward and Back-Propagation 16
2.5 Improved Simplified Swarm Optimization 17
2.6 Soft Computing Applied to Convolutional Neural Networks 18
Chapter 3: Research Method 19
3.1 Data Allocation 19
3.2 Particle Encoding 20
3.3 Fitness Function 21
3.4 Initial Solution Generation 22
3.5 Improved Simplified Swarm Optimization with Stochastic Gradient Descent 22
3.5.1 The SGD Algorithm 23
3.5.2 The ISSO Algorithm 25
3.6 Hybrid Algorithm Update Procedure 27
Chapter 4: Experimental Results and Analysis 29
4.1 Benchmark Datasets 29
4.2 Experimental Design 31
4.2.1 Learning Rate and SGD Usage-Ratio Parameter Design 32
4.2.2 ISSO Parameter Design 37
4.3 Experimental Scenarios 38
4.4 Experimental Data 40
4.4.1 Experimental Results 40
4.4.2 Experimental Analysis 47
Chapter 5: Conclusions and Future Work 49
5.1 Conclusions 49
5.2 Future Research Directions 49
References 50

[1] A. Krizhevsky, I. Sutskever, G.E. Hinton, Imagenet classification with deep convolutional neural networks, in: Advances in neural information processing systems, 2012, pp. 1097-1105.
[2] Z. Zhong, L. Jin, Z. Xie, High performance offline handwritten Chinese character recognition using GoogLeNet and directional feature maps, in: Document Analysis and Recognition (ICDAR), 2015 13th International Conference on, IEEE, 2015, pp. 846-850.
[3] L. Wang, S. Guo, W. Huang, Y. Qiao, Places205-vggnet models for scene recognition, arXiv preprint arXiv:1508.01667, (2015).
[4] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, 86 (1998) 2278-2324.
[5] H.M. Albeahdili, T. Han, N.E. Islam, Hybrid Algorithm for the Optimization of Training Convolutional Neural Network.
[6] A. Toshev, C. Szegedy, Deeppose: Human pose estimation via deep neural networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 1653-1660.
[7] R.A. Jacobs, Increased rates of convergence through learning rate adaptation, Neural networks, 1 (1988) 295-307.
[8] A. Van Ooyen, B. Nienhuis, Improving the convergence of the back-propagation algorithm, Neural Networks, 5 (1992) 465-471.
[9] V. Nair, G.E. Hinton, Rectified linear units improve restricted boltzmann machines, in: Proceedings of the 27th international conference on machine learning (ICML-10), 2010, pp. 807-814.
[10] T. Yamasaki, T. Honma, K. Aizawa, Efficient Optimization of Convolutional Neural Networks Using Particle Swarm Optimization, in: Multimedia Big Data (BigMM), 2017 IEEE Third International Conference on, IEEE, 2017, pp. 70-73.
[11] P.R. Lorenzo, J. Nalepa, M. Kawulok, L.S. Ramos, J.R. Pastor, Particle swarm optimization for hyper-parameter selection in deep neural networks, in: Proceedings of the Genetic and Evolutionary Computation Conference, ACM, 2017, pp. 481-488.
[12] D. Zang, J. Ding, J. Cheng, D. Zhang, K. Tang, A Hybrid Learning Algorithm for the Optimization of Convolutional Neural Network, in: International Conference on Intelligent Computing, Springer, 2017, pp. 694-705.
[13] W.-C. Yeh, An improved simplified swarm optimization, Knowledge-Based Systems, 82 (2015) 60-69.
[14] T. Hastie, R. Tibshirani, J. Friedman, Overview of supervised learning, in: The elements of statistical learning, Springer, 2009, pp. 9-41.
[15] T.D. Sanger, Optimal unsupervised learning in a single-layer linear feedforward neural network, Neural networks, 2 (1989) 459-473.
[16] R.S. Sutton, A.G. Barto, Reinforcement learning: An introduction, MIT Press, Cambridge, MA, 1998.
[17] Y. Freund, L. Mason, The alternating decision tree learning algorithm, in: ICML, 1999, pp. 124-133.
[18] D.T. Larose, K-nearest neighbor algorithm, Discovering Knowledge in Data: An Introduction to Data Mining, (2005) 90-106.
[19] J.A. Suykens, J. Vandewalle, Least squares support vector machine classifiers, Neural processing letters, 9 (1999) 293-300.
[20] D. Silver, A. Huang, C.J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, Mastering the game of Go with deep neural networks and tree search, Nature, 529 (2016) 484-489.
[21] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, M. Riedmiller, Playing atari with deep reinforcement learning, arXiv preprint arXiv:1312.5602, (2013).
[22] C.J.C.H. Watkins, Learning from delayed rewards, in, King's College, Cambridge, 1989.
[23] C.J. Watkins, P. Dayan, Technical note: Q-learning, Machine Learning, 8 (1992) 279-292.
[24] F. Rosenblatt, The perceptron: A probabilistic model for information storage and organization in the brain, Psychological review, 65 (1958) 386.
[25] Y. Bengio, P. Simard, P. Frasconi, Learning long-term dependencies with gradient descent is difficult, IEEE transactions on neural networks, 5 (1994) 157-166.
[26] Y. LeCun, B. Boser, J.S. Denker, D. Henderson, R.E. Howard, W. Hubbard, L.D. Jackel, Backpropagation applied to handwritten zip code recognition, Neural computation, 1 (1989) 541-551.
[27] G.E. Nasr, E. Badr, C. Joun, Cross Entropy Error Function in Neural Networks: Forecasting Gasoline Demand, in: FLAIRS Conference, 2002, pp. 381-384.
[28] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G.S. Corrado, A. Davis, J. Dean, M. Devin, Tensorflow: Large-scale machine learning on heterogeneous distributed systems, arXiv preprint arXiv:1603.04467, (2016).
[29] 武妍, 王守覺, A new back-propagation algorithm with fast convergence (in Chinese), Journal of Tongji University (Natural Science), 32 (2004) 1092-1095.
[30] W. Yeh, Study on quickest path networks with dependent components and apply to RAP, Rep. NSC, (2008) 97-2221.
[31] C.-L. Huang, A particle-based simplified swarm optimization algorithm for reliability redundancy allocation problems, Reliability Engineering & System Safety, 142 (2015) 221-230.
[32] W.-C. Yeh, Novel swarm optimization for mining classification rules on thyroid gland data, Information Sciences, 197 (2012) 65-76.
[33] W.-C. Yeh, Y.-M. Yeh, C.-H. Chou, Y.-Y. Chung, X. He, A radio frequency identification network design methodology for the decision problem in Mackay Memorial Hospital based on swarm optimization, in: Evolutionary Computation (CEC), 2012 IEEE Congress on, IEEE, 2012, pp. 1-7.
[34] W.-C. Yeh, Simplified swarm optimization in disassembly sequencing problems with learning effects, Computers & Operations Research, 39 (2012) 2168-2177.
[35] W. Gao, C. Song, J. Jiang, C. Zhang, Simplified Particle Swarm Optimization Algorithm Based on Improved Learning Factors, in: International Symposium on Neural Networks, Springer, 2017, pp. 321-328.
[36] W.C. Yeh, Y.T. Yang, C.M. Lai, A Hybrid Simplified Swarm Optimization Method for Imbalanced Data Feature Selection, Australian Academy of Business and Economics Review, 2 (2017) 263-275.
[37] W.-C. Yeh, Y.-M. Yeh, P.-C. Chang, Y.-C. Ke, V. Chung, Forecasting wind power in the Mai Liao Wind Farm based on the multi-layer perceptron artificial neural network model with improved simplified swarm optimization, International Journal of Electrical Power & Energy Systems, 55 (2014) 741-748.
[38] X. Zhang, W.-c. Yeh, Y. Jiang, Y. Huang, Y. Xiao, L. Li, A Case Study of Control and Improved Simplified Swarm Optimization for Economic Dispatch of a Stand-Alone Modular Microgrid, Energies, 11 (2018) 793.
[39] I. Sutskever, J. Martens, G. Dahl, G. Hinton, On the importance of initialization and momentum in deep learning, in: International conference on machine learning, 2013, pp. 1139-1147.
[40] D.P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980, (2014).
[41] M.D. Zeiler, ADADELTA: an adaptive learning rate method, arXiv preprint arXiv:1212.5701, (2012).
[42] T. Tieleman, G. Hinton, Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude, COURSERA: Neural networks for machine learning, 4 (2012) 26-31.
[43] A.R. Syulistyo, D.M.J. Purnomo, M.F. Rachmadi, A. Wibowo, Particle swarm optimization (PSO) for training optimization on convolutional neural network (CNN), Jurnal Ilmu Komputer dan Informasi, 9 (2016) 52-58.
[44] D.M. Hawkins, The problem of overfitting, Journal of chemical information and computer sciences, 44 (2004) 1-12.
[45] S. Lawrence, C.L. Giles, Overfitting and neural networks: conjugate gradient and backpropagation, in: Neural Networks, 2000. IJCNN 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on, IEEE, 2000, pp. 114-119.
[46] X. Glorot, Y. Bengio, Understanding the difficulty of training deep feedforward neural networks, in: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010, pp. 249-256.