Author (Chinese): 洪旻夆
Author (English): Hong, Min-Fong
Title (Chinese): 於可微分網路架構搜索探討選擇運算之合理性
Title (English): Toward a Reasonable Candidate Operation Selection for Differentiable Architecture Search
Advisor (Chinese): 李濬屹
Advisor (English): Lee, Chun-Yi
Committee Members (Chinese): 周志遠, 蔡一民
Committee Members (English): Chou, Chi-Yuan; Tsai, Yi-Min
Degree: Master's
University: National Tsing Hua University
Department: Computer Science (資訊工程學系)
Student ID: 108062532
Publication Year (ROC): 109 (2020)
Graduation Academic Year: 109
Language: English
Number of Pages: 30
Keywords (Chinese): 神經網路架構搜索, 可微分神經網路架構搜索
Keywords (English): neural architecture search, differentiable architecture search
Abstract:
Differentiable architecture search has established an efficient framework for searching for a component cell, and a few follow-up works have endeavored to improve it from different aspects. However, most of them overlooked the bias problem inherent in the differentiable framework. In this thesis, we are the first to propose removing the zero operation from the candidate operation set during the search phase, which eliminates its domination problem and enables the remaining operations to genuinely compete with one another. We further propose applying the separable convolution only once, rather than stacking it, to weaken its unfair advantage and thereby encourage the search process to select more diverse operations. Our propositions are applied to several representative differentiable architecture search methods, demonstrating their effectiveness in mitigating the bias problem while delivering promising performance on CIFAR-10.
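The two propositions summarized in the abstract map onto small, concrete changes to a DARTS-style search space. Below is a minimal PyTorch sketch, assuming the conventions of the public DARTS reference implementation (a PRIMITIVES list of candidate operation names, a 'none' zero operation, and a SepConv module that stacks the depthwise-separable block twice); SepConvOnce is a hypothetical name used here for illustration, not code released with the thesis.

import torch.nn as nn

# Candidate operations for the search phase. Removing 'none' keeps the
# zero operation from dominating the architecture weights, so the
# remaining operations genuinely compete with one another.
PRIMITIVES = [
    # 'none',  # zero operation removed from the candidate set
    'max_pool_3x3',
    'avg_pool_3x3',
    'skip_connect',
    'sep_conv_3x3',
    'sep_conv_5x5',
    'dil_conv_3x3',
    'dil_conv_5x5',
]

class SepConvOnce(nn.Module):
    """Depthwise-separable convolution applied a single time.

    The DARTS reference implementation stacks this block twice inside
    its SepConv; using it once weakens the separable convolution's
    unfair advantage over the other candidate operations.
    """
    def __init__(self, c_in, c_out, kernel_size, stride, padding):
        super().__init__()
        self.op = nn.Sequential(
            nn.ReLU(inplace=False),
            nn.Conv2d(c_in, c_in, kernel_size, stride=stride,
                      padding=padding, groups=c_in, bias=False),  # depthwise
            nn.Conv2d(c_in, c_out, kernel_size=1, bias=False),    # pointwise
            nn.BatchNorm2d(c_out, affine=True),
        )

    def forward(self, x):
        return self.op(x)

In a DARTS-style mixed operation, the softmax over the architecture parameters would then be taken over this reduced candidate set, so no probability mass is reserved for the zero operation.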
Contents
1 Introduction
2 Related Work
3 Background
  3.1 NAS
  3.2 DARTS
  3.3 PDARTS
  3.4 PC-DARTS
4 Methodology
  4.1 Removal of Zero Operation
  4.2 Modification of the Separable Operation
5 Experiments
  5.1 Experimental Setups
  5.2 Performance on DARTS, PDARTS, and PC-DARTS
  5.3 Ablation Analysis
6 Conclusion
References