[1] L. D. Harmon, "Artificial neuron," Science, vol. 129, no. 3354, pp. 962-963, 1959.
[2] F. Rosenblatt, "The perceptron: A probabilistic model for information storage and organization in the brain," Psychological Review, vol. 65, no. 6, p. 386, 1958.
[3] T. D. Sanger, "Optimal unsupervised learning in a single-layer linear feedforward neural network," Neural Networks, vol. 2, no. 6, pp. 459-473, 1989.
[4] W. Zaremba, I. Sutskever, and O. Vinyals, "Recurrent neural network regularization," arXiv preprint arXiv:1409.2329, 2014.
[5] Y. LeCun et al., "Backpropagation applied to handwritten zip code recognition," Neural Computation, vol. 1, no. 4, pp. 541-551, 1989.
[6] G. Tesauro, D. S. Touretzky, and T. Leen, Advances in Neural Information Processing Systems 7. MIT Press, 1995.
[7] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, vol. 25, pp. 1097-1105, 2012.
[8] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[9] C. Szegedy et al., "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1-9.
[10] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770-778.
[11] H. Faris, I. Aljarah, and S. Mirjalili, "Training feedforward neural networks using multi-verse optimizer for binary classification problems," Applied Intelligence, vol. 45, no. 2, pp. 322-332, 2016.
[12] S. Mirjalili, "How effective is the Grey Wolf optimizer in training multi-layer perceptrons," Applied Intelligence, vol. 43, no. 1, pp. 150-161, 2015.
[13] I. Aljarah, H. Faris, and S. Mirjalili, "Optimizing connection weights in neural networks using the whale optimization algorithm," Soft Computing, vol. 22, no. 1, pp. 1-15, 2018.
[14] H. Robbins and S. Monro, "A stochastic approximation method," The Annals of Mathematical Statistics, vol. 22, no. 3, pp. 400-407, 1951.
[15] A. Toshev and C. Szegedy, "DeepPose: Human pose estimation via deep neural networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 1653-1660.
[16] R. A. Jacobs, "Increased rates of convergence through learning rate adaptation," Neural Networks, vol. 1, no. 4, pp. 295-307, 1988.
[17] A. Van Ooyen and B. Nienhuis, "Improving the convergence of the back-propagation algorithm," Neural Networks, vol. 5, no. 3, pp. 465-471, 1992.
[18] D. Zang, J. Ding, J. Cheng, D. Zhang, and K. Tang, "A hybrid learning algorithm for the optimization of convolutional neural network," in International Conference on Intelligent Computing, 2017: Springer, pp. 694-705.
[19] H. M. Albeahdili, T. Han, and N. E. Islam, "Hybrid algorithm for the optimization of training convolutional neural network," Int. J. Adv. Comput. Sci. Appl., vol. 1, no. 6, pp. 79-85, 2015.
[20] M. Črepinšek, S.-H. Liu, and M. Mernik, "Exploration and exploitation in evolutionary algorithms: A survey," ACM Computing Surveys (CSUR), vol. 45, no. 3, pp. 1-33, 2013.
[21] G. Xu, "An adaptive parameter tuning of particle swarm optimization algorithm," Applied Mathematics and Computation, vol. 219, no. 9, pp. 4560-4569, 2013.
[22] S. Mirjalili, S. Z. M. Hashim, and H. M. Sardroudi, "Training feedforward neural networks using hybrid particle swarm optimization and gravitational search algorithm," Applied Mathematics and Computation, vol. 218, no. 22, pp. 11125-11137, 2012.
[23] X.-S. Yang, Nature-Inspired Optimization Algorithms. Academic Press, 2020.
[24] D. H. Wolpert and W. G. Macready, "No free lunch theorems for optimization," IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67-82, 1997.
[25] W.-C. Yeh, "An improved simplified swarm optimization," Knowledge-Based Systems, vol. 82, pp. 60-69, 2015.
[26] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436-444, 2015.
[27] C. Farabet, C. Couprie, L. Najman, and Y. LeCun, "Learning hierarchical features for scene labeling," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1915-1929, 2012.
[28] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler, "Joint training of a convolutional network and a graphical model for human pose estimation," Advances in Neural Information Processing Systems, vol. 27, pp. 1799-1807, 2014.
[29] T. Mikolov, A. Deoras, D. Povey, L. Burget, and J. Černocký, "Strategies for training large scale neural network language models," in 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011: IEEE, pp. 196-201.
[30] G. Hinton et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82-97, 2012.
[31] T. N. Sainath et al., "Improvements to deep convolutional neural networks for LVCSR," in 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013: IEEE, pp. 315-320.
[32] J. Ma, R. P. Sheridan, A. Liaw, G. E. Dahl, and V. Svetnik, "Deep neural nets as a method for quantitative structure–activity relationships," Journal of Chemical Information and Modeling, vol. 55, no. 2, pp. 263-274, 2015.
[33] T. Ciodaro, D. Deva, J. De Seixas, and D. Damazio, "Online particle detection with neural networks based on topological calorimetry information," in Journal of Physics: Conference Series, 2012, vol. 368, no. 1: IOP Publishing, p. 012030.
[34] D. H. Hubel and T. N. Wiesel, "Receptive fields, binocular interaction and functional architecture in the cat's visual cortex," The Journal of Physiology, vol. 160, no. 1, pp. 106-154, 1962.
[35] S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," in International Conference on Machine Learning, 2015: PMLR, pp. 448-456.
[36] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, "Rethinking the Inception architecture for computer vision," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2818-2826.
[37] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi, "Inception-v4, Inception-ResNet and the impact of residual connections on learning," in Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[38] Y. LeCun et al., "Handwritten digit recognition with a back-propagation network," Advances in Neural Information Processing Systems, vol. 2, 1989.
[39] K. Jarrett, K. Kavukcuoglu, M. A. Ranzato, and Y. LeCun, "What is the best multi-stage architecture for object recognition?," in 2009 IEEE 12th International Conference on Computer Vision, 2009: IEEE, pp. 2146-2153.
[40] T. Mikolov, S. Kombrink, L. Burget, J. Černocký, and S. Khudanpur, "Extensions of recurrent neural network language model," in 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011: IEEE, pp. 5528-5531.
[41] X. Glorot, A. Bordes, and Y. Bengio, "Deep sparse rectifier neural networks," in Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011: JMLR Workshop and Conference Proceedings, pp. 315-323.
[42] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in ICML, 2010.
[43] N. Qian, "On the momentum term in gradient descent learning algorithms," Neural Networks, vol. 12, no. 1, pp. 145-151, 1999.
[44] J. Duchi, E. Hazan, and Y. Singer, "Adaptive subgradient methods for online learning and stochastic optimization," Journal of Machine Learning Research, vol. 12, no. 7, 2011.
[45] M. D. Zeiler, "ADADELTA: An adaptive learning rate method," arXiv preprint arXiv:1212.5701, 2012.
[46] G. Hinton, N. Srivastava, and K. Swersky, "Neural networks for machine learning, lecture 6a: Overview of mini-batch gradient descent," Coursera, 2012.
[47] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
[48] T. Dozat, "Incorporating Nesterov momentum into Adam," 2016.
[49] J. Ma and D. Yarats, "Quasi-hyperbolic momentum and Adam for deep learning," arXiv preprint arXiv:1810.06801, 2018.
[50] S. J. Reddi, S. Kale, and S. Kumar, "On the convergence of Adam and beyond," arXiv preprint arXiv:1904.09237, 2019.
[51] J. Lucas, S. Sun, R. Zemel, and R. Grosse, "Aggregated momentum: Stability through passive damping," arXiv preprint arXiv:1804.00325, 2018.
[52] D. J. Montana and L. Davis, "Training feedforward neural networks using genetic algorithms," in IJCAI, 1989, vol. 89, pp. 762-767.
[53] R. Mendes, P. Cortez, M. Rocha, and J. Neves, "Particle swarms for feedforward neural network training," in Proceedings of the 2002 International Joint Conference on Neural Networks (IJCNN'02), 2002, vol. 2: IEEE, pp. 1895-1899.
[54] Y. Chhabra, S. Varshney, and A. Wadhwa, "Hybrid particle swarm training for convolution neural network (CNN)," in 2017 Tenth International Conference on Contemporary Computing (IC3), 2017: IEEE, pp. 1-3.
[55] E. Y. Sari and A. Sunyoto, "Optimization of weight backpropagation with particle swarm optimization for student dropout prediction," in 2019 4th International Conference on Information Technology, Information Systems and Electrical Engineering (ICITISEE), 2019: IEEE, pp. 423-428.
[56] A. R. Syulistyo, D. M. J. Purnomo, M. F. Rachmadi, and A. Wibowo, "Particle swarm optimization (PSO) for training optimization on convolutional neural network (CNN)," Jurnal Ilmu Komputer dan Informasi, vol. 9, no. 1, pp. 52-58, 2016.
[57] M. H. Khalifa, M. Ammar, W. Ouarda, and A. M. Alimi, "Particle swarm optimization for deep learning of convolution neural network," in 2017 Sudan Conference on Computer Science and Information Technology (SCCSIT), 2017: IEEE, pp. 1-5.
[58] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.
[59] W.-C. Yeh, "A two-stage discrete particle swarm optimization for the problem of multiple multi-level redundancy allocation in series systems," Expert Systems with Applications, vol. 36, no. 5, pp. 9192-9200, 2009.
[60] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proceedings of ICNN'95 - International Conference on Neural Networks, 1995, vol. 4: IEEE, pp. 1942-1948.
[61] W.-C. Yeh, W.-W. Chang, and Y. Y. Chung, "A new hybrid approach for mining breast cancer pattern using discrete particle swarm optimization and statistical method," Expert Systems with Applications, vol. 36, no. 4, pp. 8204-8211, 2009.
[62] Y. K. Ever, "Using simplified swarm optimization on path planning for intelligent mobile robot," Procedia Computer Science, vol. 120, pp. 83-90, 2017.
[63] C.-L. Huang, "A particle-based simplified swarm optimization algorithm for reliability redundancy allocation problems," Reliability Engineering & System Safety, vol. 142, pp. 221-230, 2015.
[64] W.-C. Yeh, "Novel swarm optimization for mining classification rules on thyroid gland data," Information Sciences, vol. 197, pp. 65-76, 2012.
[65] W.-C. Yeh, "New parameter-free simplified swarm optimization for artificial neural network training and its application in the prediction of time series," IEEE Transactions on Neural Networks and Learning Systems, vol. 24, no. 4, pp. 661-665, 2013.
[66] W.-C. Yeh, C.-M. Lai, and M.-H. Tsai, "Nurse scheduling problem using simplified swarm optimization," in Journal of Physics: Conference Series, 2019, vol. 1411, no. 1: IOP Publishing, p. 012010.
[67] W.-C. Yeh, Y.-M. Yeh, P.-C. Chang, Y.-C. Ke, and V. Chung, "Forecasting wind power in the Mai Liao Wind Farm based on the multi-layer perceptron artificial neural network model with improved simplified swarm optimization," International Journal of Electrical Power & Energy Systems, vol. 55, pp. 741-748, 2014.
[68] X. Zhang, W.-C. Yeh, Y. Jiang, Y. Huang, Y. Xiao, and L. Li, "A case study of control and improved simplified swarm optimization for economic dispatch of a stand-alone modular microgrid," Energies, vol. 11, no. 4, p. 793, 2018.
[69] C.-M. Lai, W.-C. Yeh, and C.-Y. Chang, "Gene selection using information gain and improved simplified swarm optimization," Neurocomputing, vol. 218, pp. 331-338, 2016.
[70] X. Glorot and Y. Bengio, "Understanding the difficulty of training deep feedforward neural networks," in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010: JMLR Workshop and Conference Proceedings, pp. 249-256.