
Detailed Record

Author (Chinese): 林怡萍
Author (English): Lin, Yi-Ping
Title (Chinese): 應用簡化群體演算法優化卷積神經網路超參數
Title (English): Convolutional Neural Network Hyperparameter Optimization Using Simplified Swarm Optimization
Advisor (Chinese): 葉維彰
Advisor (English): Yeh, Wei-Chang
Committee members (Chinese): 梁韵嘉, 賴智明
Committee members (English): Liang, Yun-Chia; Lai, Chyh-Ming
Degree: Master's
University: National Tsing Hua University (國立清華大學)
Department: Department of Industrial Engineering and Engineering Management
Student ID: 108034511
Year of publication (ROC calendar): 110 (2021)
Graduation academic year: 109
Language: English
Number of pages: 42
Keywords (Chinese): 機器學習、圖像辨識、卷積神經網路、簡化群體演算法、超參數優化
Keywords (English): Machine Learning, Image Recognition, Convolutional Neural Networks, Simplified Swarm Optimization, Hyperparameter Optimization
Abstract: Image recognition technology is receiving increasing attention across industries, and among the machine learning approaches applied to image recognition, Convolutional Neural Networks (CNNs) are widely used. Although existing CNN architectures achieve good performance, finding the best-performing network architecture for a specific application is not easy. To improve CNN performance, some studies modify the network architecture while others optimize the hyperparameters; many of these designs are produced manually, which requires domain expertise and considerable time. This study therefore applies Simplified Swarm Optimization (SSO) to the hyperparameter optimization of the LeNet model. In addition to three standard benchmark datasets (MNIST, Fashion-MNIST, and CIFAR-10), a real wafer dataset is used to validate the practical application. The experimental results show that on all four datasets the proposed method achieves higher accuracy than the original LeNet architecture and other metaheuristic algorithms, and once training is complete, a good hyperparameter configuration can be found in a very short time. An analysis of the output shape of the feature map after each layer shows that, surprisingly, most of the shapes are rectangular rather than square, which suggests that the method can easily extract image features; the analysis can also reveal layers in the architecture that are not contributing. The contribution of this study, validated on a real dataset, is to give users a simpler and faster way to obtain more accurate results with an existing model, and the approach can also be applied to other datasets, other CNN architectures, or other deep learning networks.
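The SSO search described in the abstract can be illustrated with a minimal sketch. Note this is not the thesis's actual encoding or implementation: the search space, the update parameters cg/cp/cw, and the toy fitness used in place of validation accuracy are all assumptions for illustration. The core SSO update, drawing each variable from gBest, pBest, the current solution, or a random value according to a single uniform draw, follows the general scheme published by Yeh.

```python
import random

# Illustrative LeNet-style hyperparameter search space (values are
# assumptions for this sketch, not the thesis's actual encoding).
SPACE = {
    "conv1_filters": [4, 6, 8, 16],
    "conv2_filters": [8, 16, 32],
    "kernel_size": [3, 5, 7],
    "fc_units": [64, 84, 120, 256],
}
KEYS = list(SPACE.keys())


def random_solution():
    """Pick one candidate value per hyperparameter."""
    return [random.choice(SPACE[k]) for k in KEYS]


def sso_step(x, pbest, gbest, cg=0.4, cp=0.7, cw=0.9):
    """One SSO update: for each variable, a uniform draw decides whether
    it copies gBest, pBest, keeps its current value, or is re-randomized."""
    new = []
    for i, k in enumerate(KEYS):
        rho = random.random()
        if rho < cg:
            new.append(gbest[i])
        elif rho < cp:
            new.append(pbest[i])
        elif rho < cw:
            new.append(x[i])
        else:
            new.append(random.choice(SPACE[k]))
    return new


def optimize(fitness, n_particles=8, n_iters=20):
    """Run SSO; `fitness` would be validation accuracy of the trained
    CNN in the real setting (here any score-the-solution callable)."""
    xs = [random_solution() for _ in range(n_particles)]
    pbests = [x[:] for x in xs]
    gbest = max(pbests, key=fitness)
    for _ in range(n_iters):
        for j in range(n_particles):
            xs[j] = sso_step(xs[j], pbests[j], gbest)
            if fitness(xs[j]) > fitness(pbests[j]):
                pbests[j] = xs[j][:]
        gbest = max(pbests + [gbest], key=fitness)
    return gbest
```

In the thesis's setting, evaluating `fitness` means training the candidate LeNet configuration and measuring validation accuracy, which is why the small sampling test and training-parameter settings in Chapter 4 matter for keeping the search budget manageable.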
Abstract (Chinese) i
Abstract (English) ii
Acknowledgements iii
Contents iv
List of Figures vi
List of Tables vii
1. INTRODUCTION 1
1.1 Background and Motivation 1
1.2 Research Framework 4
2. LITERATURE REVIEW 7
2.1 Image Recognition 7
2.2 CNN 8
2.2.1 Convolution Layer 9
2.2.2 Pooling or Subsampling Layer 10
2.2.3 Fully Connected Layer 11
2.2.4 Activation Function 12
2.3 LeNet 12
2.4 Hyperparameter Optimization Approaches 14
2.5 Simplified Swarm Optimization 15
3. METHODOLOGY 17
3.1 Encoding Strategy 17
3.2 Fitness Function 19
3.3 Initialization of Solution 20
3.4 Proposed LeNet-SSO 20
3.4.1 Notations of LeNet-SSO 20
3.4.2 Update Mechanism of LeNet-SSO 21
3.4.3 Stopping Criteria of LeNet-SSO 22
3.4.4 Pseudocode and Flowchart 22
4. EXPERIMENTS 24
4.1 Datasets 24
4.2 Parameter Settings 25
4.2.1 Small Sampling Test 25
4.2.2 ANOVA Test 27
4.2.3 Training Parameters and Detailed Setting 29
4.3 Experimental Results 30
4.3.1 Results on Existing Datasets 30
4.3.2 Results on Wafer Defect Detection 33
4.3.3 Comparison between LeNet-SSO and LeNet-PSO 34
5. CONCLUSION AND FUTURE WORK 37
5.1 Conclusion 37
5.2 Future Work 38
REFERENCES 39
