
Detailed Record Display

Author (Chinese): 魏聖修
Author (English): Wei, Sheng-Hsiu
Title (Chinese): 一個考慮到過濾器重複特性的二值化神經網路卷積結果靈活共享方法的研究
Title (English): A Flexible Result Sharing Approach Using Filter Repetitions to Binarized Neural Networks Optimization
Advisor (Chinese): 王俊堯
Advisor (English): Wang, Chun-Yao
Committee Members (Chinese): 張世杰、陳勇志
Committee Members (English): Chang, Shih-Chieh; Chen, Yung-Chih
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 108062537
Year of Publication (ROC calendar): 110 (2021)
Academic Year of Graduation: 109
Language: English
Number of Pages: 34
Keywords (Chinese): 二值化神經網路、過濾器重複特性、卷積結果共享方法
Keywords (English): Binarized Neural Networks; Filter Repetitions; Convolutional Result Sharing
Abstract (Chinese):
Convolutional Neural Networks (CNNs) achieve excellent accuracy on problems in fields such as computer vision and artificial intelligence (AI). Their variant with binary weights and binary input features, Binarized Neural Networks (BNNs), can be realized more efficiently on edge devices. In the BNN inference model, the originally complex multiply-and-accumulate (MAC) operations can be simplified to XNOR-Popcount operations. Because the weights in the filters are binary, a complete filter can be decomposed into multiple partial filters, and the repetitions among these partial filters can be exploited through result sharing to reduce the number of required XNOR-Popcount operations.
Therefore, in this thesis we propose a flexible convolutional result sharing approach that reuses and shares stored computation results among partial filters. We also implemented the proposed approach on an FPGA platform. The experimental results show that, compared with the original approach, our method reduces the number of required XNOR-Popcount operations in the hardware implementation to less than 1% of the original, and the number of required LUTs on different network layers is reduced to between 50.6% and 23.6% of the original.
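As a concrete illustration of the MAC-to-XNOR-Popcount simplification described in the abstract, the following minimal Python sketch (not taken from the thesis) shows the standard identity for ±1-valued vectors: encoding +1 as bit 1 and -1 as bit 0, the dot product of two n-element binary vectors equals 2·popcount(XNOR(a, w)) − n. The bit encoding and function names are illustrative assumptions.

# Minimal sketch: dot product of +/-1 vectors via XNOR-Popcount.
# The +1 -> bit 1, -1 -> bit 0 encoding is an assumed convention, not the thesis's.

def mac_dot(a, w):
    # Reference multiply-and-accumulate over +/-1 values.
    return sum(x * y for x, y in zip(a, w))

def xnor_popcount_dot(a_bits, w_bits, n):
    # Equivalent dot product: 2 * popcount(XNOR(a, w)) - n.
    xnor = ~(a_bits ^ w_bits) & ((1 << n) - 1)  # bitwise XNOR restricted to n bits
    return 2 * bin(xnor).count("1") - n

# Example: a = [+1, -1, +1, +1], w = [+1, +1, -1, +1]
a, w = [1, -1, 1, 1], [1, 1, -1, 1]
a_bits = int("".join("1" if x > 0 else "0" for x in a), 2)
w_bits = int("".join("1" if x > 0 else "0" for x in w), 2)
assert mac_dot(a, w) == xnor_popcount_dot(a_bits, w_bits, len(a))  # both evaluate to 0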
Abstract (English):
Convolutional Neural Networks (CNNs) provide excellent accuracy, especially in fields such as computer vision and artificial intelligence (AI), and their variant with binary weights and binary inputs, Binarized Neural Networks (BNNs), can be realized more efficiently on edge devices. In the BNN inference model, the original complex multiplication and accumulation (MAC) operations can be simplified to XNOR-Popcount operations. Because the weights in the filters are binarized, a complete filter can be decomposed into several partial filters, and the repetitions of partial filters can be exploited through result sharing to reduce the number of required XNOR-Popcount operations.
Thus, in this work, we propose a flexible result sharing approach that reuses the computed results among partial filters. We also implemented the proposed approach on an FPGA platform. The experimental results show that the number of required XNOR-Popcount operations can be reduced to less than 1% of the original, and the required number of LUTs is reduced to between 50.6% and 23.6% on different layers, compared with the implementation without the proposed approach.
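In the same spirit, the sharing idea itself can be sketched as follows: split each binary filter into partial filters, detect partial filters that repeat across the filter set, compute each distinct partial result once, and reuse it for every filter that contains that partial filter. The column-wise split, the dictionary cache, and all names below are illustrative assumptions; the thesis's actual decomposition, the second-level sharing of Section 3.2, and the FPGA realization of Chapter 4 differ in detail.

# Minimal sketch of result sharing among repeated partial filters on one input window.
# Column-wise splitting of 3x3 binary filters is an assumed granularity for illustration.
import random

random.seed(0)
K, N_FILTERS = 3, 8
filters = [[[random.choice([-1, 1]) for _ in range(K)] for _ in range(K)]
           for _ in range(N_FILTERS)]                                    # binary KxK filters
patch = [[random.choice([-1, 1]) for _ in range(K)] for _ in range(K)]   # one KxK input window

def col_dot(f, c):
    # Dot product of column c of filter f with column c of the input window.
    return sum(f[r][c] * patch[r][c] for r in range(K))

# Naive evaluation: every filter computes all K column dot products itself.
naive = [sum(col_dot(f, c) for c in range(K)) for f in filters]

# Sharing: compute each distinct (partial filter, column position) result once and reuse it.
cache = {}
shared = []
for f in filters:
    total = 0
    for c in range(K):
        key = (tuple(f[r][c] for r in range(K)), c)   # the partial filter and where it sits
        if key not in cache:                          # compute only on first occurrence
            cache[key] = col_dot(f, c)
        total += cache[key]
    shared.append(total)

assert naive == shared
print(f"column dot products: naive {K * N_FILTERS}, with sharing {len(cache)}")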
Table of Contents:
1 Introduction --- 1
2 Preliminaries --- 4
2.1 Convolutional Neural Networks --- 4
2.2 Binarized Neural Networks and Optimizations of Operators --- 6
2.3 Filter Repetition --- 9
3 Proposed Approach --- 11
3.1 Flexible Result Sharing Approach --- 11
3.1.1 Filter Decomposition --- 11
3.1.2 Input Feature Map Decomposition --- 12
3.1.3 Partial-results Sharing --- 13
3.2 Second-level Flexible Result Sharing Approach --- 17
4 Architecture Design for Realization in FPGAs --- 21
4.1 The Original Circuit without Sharing --- 22
4.2 The Optimized Circuit with the Flexible Result Sharing Approach --- 23
4.3 The Optimized Circuit with the Second-level Flexible Result Sharing Approach --- 24
5 Experimental Evaluation --- 26
5.1 Experimental Setup --- 26
5.2 Experimental Results --- 27
6 Conclusion and Future Work --- 31