Detailed Record

Author (Chinese): 鄭鈺敏
Author (English): Cheng, Yu-Min
Title (Chinese): 預測早期退出點以加速深度卷積網路
Title (English): Predicting Early Exiting for Fast Inference in Deep Convolutional Neural Networks
Advisor (Chinese): 金仲達
Advisor (English): King, Chung-Ta
Committee Members (Chinese): 江振瑞、李濬屹
Committee Members (English): Jiang, Jehn-Ruey; Lee, Chun-Yi
Degree: Master's
Institution: National Tsing Hua University
Department: Computer Science
Student ID: 107062541
Year of Publication (ROC calendar): 109 (2020)
Academic Year of Graduation: 109
Language: English
Number of Pages: 32
Keywords (Chinese): 卷積神經網路、加速、分支、早期退出、預測器
Keywords (English): convolution neural network, speed up, branch, early exit, predictor
Usage statistics:
  • Recommendations: 0
  • Views: 332
  • Rating: *****
  • Downloads: 27
  • Bookmarks: 0
Abstract (Chinese): Convolutional neural networks have proven effective at solving many complex and practical problems. To tackle ever harder problems, the networks we use keep growing deeper and wider, so speeding up their inference and lowering their energy consumption has become an urgent need. One effective approach, proposed in the BranchyNet paper, adds side branches to the network so that inputs that can be inferred with high confidence exit the network early. However, this architecture also forces inputs that are hard to classify correctly to pass through many branches before computation finishes, adding extra latency. Moreover, when an input enters a branch, its intermediate results must be buffered so that, if it cannot exit at that branch, it can return to the network's main path and continue. These problems become even more severe when such an architecture is deployed on edge devices. Executing the branches concurrently may alleviate them, but at the cost of extra computing resources and energy. In this thesis, we advocate placing a lightweight predictor in front of each branch to decide whether an input should enter that branch. If the predictions are correct, every input takes the shortest path to its proper exit. To examine the feasibility of this approach, we experiment with different predictors and placement strategies. The results show that our method reduces the overall computation time while sacrificing very little accuracy.
Abstract (English): Convolutional neural networks (CNNs) have been shown to be effective in solving many complex and practical problems. As increasingly deeper and larger networks are attempted for solving more complex problems, the need to speed up their inference time and reduce their energy consumption is becoming imperative. One promising approach is to allow test instances to exit the neural network early if they can be inferred with high confidence, as exemplified in BranchyNet, which adds extra side-branch classifiers for early exiting. The problem with architectures such as BranchyNet is that test instances that are hard to infer may have to go through many side branches before they can exit, incurring extra delays in the network. Moreover, intermediate results have to be buffered when such instances enter a branch, so that if they cannot exit at that branch, they can return to the main path. The problems become more severe if edge devices are to adopt such architectures. Concurrent execution of the branches may mitigate the problems, but at the cost of extra computing resources and energy consumption. In this thesis, we propose to use lightweight predictors, or classifiers, in front of the branches to determine whether a given instance should enter a branch or not. If predicted accurately, test instances will go through the shortest path to their right exits. We examine different predictors and strategies for placing the predictors. Extensive experiments show that the proposed approach can reduce the total execution time while sacrificing minimal inference accuracy.
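The abstract describes two mechanisms: a BranchyNet-style side branch that lets a confident sample exit early based on the entropy of its branch classifier, and a lightweight predictor placed in front of the branch to decide whether a sample should enter it at all. The following sketch, written in PyTorch, illustrates how the two could be combined; the module structure, layer sizes, thresholds, and the single-sample (batch of one) assumption are illustrative choices made here, not code from the thesis.

# Minimal sketch (not the thesis code): a BranchyNet-style side branch
# guarded by a lightweight exit predictor, per the abstract's description.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedSideBranch(nn.Module):
    def __init__(self, in_channels: int, num_classes: int,
                 entropy_threshold: float = 0.5):
        super().__init__()
        self.entropy_threshold = entropy_threshold
        # Lightweight predictor (assumed design): one linear layer on globally
        # pooled features that guesses whether this sample could exit here.
        self.exit_predictor = nn.Linear(in_channels, 1)
        # Side-branch classifier, as in BranchyNet.
        self.branch = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(in_channels, num_classes),
        )

    def forward(self, x: torch.Tensor):
        """Return class logits if the sample exits here, else None (the caller
        then continues along the main path). Assumes a batch of one, as in
        latency-critical edge inference."""
        pooled = x.mean(dim=(2, 3))                      # [1, C] global average pool
        if torch.sigmoid(self.exit_predictor(pooled)).item() < 0.5:
            # Predictor says this branch is unlikely to succeed: skip it, so no
            # intermediate results need to be buffered for this sample.
            return None
        logits = self.branch(x)
        probs = F.softmax(logits, dim=1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum().item()
        # BranchyNet-style exit rule: exit only when confidence is high enough
        # (entropy below the threshold); otherwise fall back to the trunk.
        return logits if entropy < self.entropy_threshold else None

# Hypothetical usage: a 32-channel feature map from an intermediate trunk block.
branch = GatedSideBranch(in_channels=32, num_classes=10)
out = branch(torch.randn(1, 32, 16, 16))   # logits if it exits here, else None

In use, a branchy network's forward pass would call such a module after an intermediate block, return immediately when it yields logits, and otherwise continue down the trunk; when the predictor vetoes the branch, the branch classifier and the buffering of its inputs are skipped entirely, which is the latency saving targeted in the thesis.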
Table of Contents

Chinese Abstract i
Abstract ii
Acknowledgements iii
1 Introduction 1
2 Related Work 4
2.1 Parameter Pruning and Sharing ................. 4
2.2 Low-rank Factorization ........................ 4
2.3 Transferred/Compact Convolutional Filters ..... 5
2.4 Knowledge Distillation ........................ 5
2.5 Early-exiting ................................. 5
3 Method 7
3.1 NN-based Entropy Predictor .................... 7
3.2 NN-based Exit Predictor ....................... 10
4 Evaluation 12
4.1 Branchy-LeNet ................................. 13
4.2 Branchy-AlexNet ............................... 15
4.3 Branchy-ResNet ................................ 18
4.4 Branchy-ResNeXt ............................... 20
4.5 Branchy-DenseNet .............................. 22
5 Discussion 26
6 Conclusion 28
References

[1] Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1 (pp. 1097–1105).
[2] Teerapittayanon, S., McDanel, B., & Kung, H. (2016). BranchyNet: Fast inference via early exiting from deep neural networks. In 2016 23rd International Conference on Pattern Recognition (ICPR).
[3] Gong, Y., Liu, L., Yang, M., & Bourdev, L. (2014). Compressing deep convolutional networks using vector quantization. arXiv preprint arXiv:1412.6115.
[4] Wu, J., Leng, C., Wang, Y., Hu, Q., & Cheng, J. (2016). Quantized convolutional neural networks for mobile devices. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4820-4828).
[5] Cheng, Y., Wang, D., Zhou, P., & Zhang, T. (2017). A survey of model compression and acceleration for deep neural networks. arXiv preprint arXiv:1710.09282.
[6] Vanhoucke, V., Senior, A., & Mao, M. Z. (2011). Improving the speed of neural networks on CPUs.
[7] Gupta, S., Agrawal, A., Gopalakrishnan, K., & Narayanan, P. (2015, June). Deep learning with limited numerical precision. In International Conference on Machine Learning (pp. 1737-1746).
[8] Srinivas, S., & Babu, R. V. (2015). Data-free parameter pruning for deep neural networks. arXiv preprint arXiv:1507.06149.
[9] Han, S., Pool, J., Tran, J., & Dally, W. (2015). Learning both weights and connections for efficient neural network. In Advances in neural information processing systems (pp. 1135-1143).
[10] Ullrich, K., Meeds, E., & Welling, M. (2017). Soft weight-sharing for neural network compression. arXiv preprint arXiv:1702.04008.
[11] Rigamonti, R., Sironi, A., Lepetit, V., & Fua, P. (2013). Learning separable filters. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2754-2761).
[12] Jaderberg, M., Vedaldi, A., & Zisserman, A. (2014). Speeding up convolutional neural networks with low rank expansions. arXiv preprint arXiv:1405.3866.
[13] Cohen, T., & Welling, M. (2016, June). Group equivariant convolutional networks. In International conference on machine learning (pp. 2990-2999).
[14] Buciluǎ, C., Caruana, R., & Niculescu-Mizil, A. (2006, August). Model compression. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 535-541).
[15] Xie, S., Girshick, R., Dollár, P., Tu, Z., & He, K. (2017). Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1492-1500).
[16] Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4700-4708).
[17] Jacob, B., Kligys, S., Chen, B., Zhu, M., Tang, M., Howard, A., ... & Kalenichenko, D. (2018). Quantization and training of neural networks for efficient integer-arithmetic-only inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2704-2713).
[18] Courbariaux, M., Hubara, I., Soudry, D., El-Yaniv, R., & Bengio, Y. (2016). Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or -1. arXiv preprint arXiv:1602.02830.
[19] Rastegari, M., Ordonez, V., Redmon, J., & Farhadi, A. (2016, October). XNOR-Net: ImageNet classification using binary convolutional neural networks. In European Conference on Computer Vision (pp. 525-542). Springer, Cham.
[20] Lahoud, F., Achanta, R., Márquez-Neila, P., & Süsstrunk, S. (2019). Self-binarizing networks. arXiv preprint arXiv:1902.00730.
[21] Sainath, T. N., Kingsbury, B., Sindhwani, V., Arisoy, E., & Ramabhadran, B. (2013, May). Low-rank matrix factorization for deep neural network training with high-dimensional output targets. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 6655-6659). IEEE.
[22] Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531.
[23] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
[24] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778).
[25] Panda, P., Sengupta, A., & Roy, K. (2017). Energy-efficient and improved image recognition with conditional deep learning. ACM Journal on Emerging Technologies in Computing Systems (JETC), 13(3), 1-21.
[26] Shafiee, M. S., Shafiee, M. J., & Wong, A. (2019, June). Dynamic Representations Toward Efficient Inference on Deep Neural Networks by Decision Gates. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (pp. 667-675). IEEE.
[27] Wu, Z., Nagarajan, T., Kumar, A., Rennie, S., Davis, L. S., Grauman, K., & Feris, R. (2018). Blockdrop: Dynamic inference paths in residual networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 8817-8826).
[28] Huang, G., Chen, D., Li, T., Wu, F., van der Maaten, L., & Weinberger, K. Q. (2017). Multi-scale dense networks for resource efficient image classification. arXiv preprint arXiv:1703.09844.