
Detailed Record

Author (Chinese): 張雅鈞
Author (English): Chang, Ya-Chun
Thesis Title (Chinese): 用於二值化神經網路推論的卷積結果共享方法
Thesis Title (English): A Convolutional Result Sharing Approach for Binarized Neural Network Inference
Advisor (Chinese): 王俊堯
Advisor (English): Wang, Chun-Yao
Committee Members (Chinese): 江介宏, 溫宏斌
Committee Members (English): Jiang, Jie-Hong; Wen, Hung-Pin
Degree: Master's
University: National Tsing Hua University
Department: Department of Computer Science
Student ID: 106062554
Year of Publication (ROC calendar): 108 (2019)
Academic Year of Graduation: 107
Language: English
Number of Pages: 30
Keywords (Chinese): 卷積神經網路 (convolutional neural network), 二值化神經網路 (binarized neural network), 近似運算 (approximate computing)
Keywords (English): convolutional neural network, binarized neural network, approximate computing
Binarized neural networks (BNNs) enable a more efficient realization of convolutional neural networks (CNNs) on mobile platforms. During inference, the multiply-accumulate operations in a BNN can be simplified into XNOR-popcount operations, and these XNOR-popcount operations account for most of the computation in a BNN. To reduce the number of operations required in the convolution layers of a BNN, we decompose the 3-D filters into 2-D filters and exploit repeated filters, inverse filters, and similar filters to share convolutional results. By sharing convolutional results, the number of operations in the convolution layers of a BNN can be reduced effectively. Experimental results show that the number of operations in the convolution layers of BNNs can be reduced by about 60% for CIFAR-10 and SVHN, while keeping the accuracy loss within 1% of the originally trained networks.
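To make the XNOR-popcount reduction concrete, here is a minimal sketch, assuming a packed-bit encoding with +1 mapped to bit 1 and -1 mapped to bit 0 (the encoding and the function name xnor_popcount_dot are illustrative assumptions, not part of the thesis): the dot product of two length-n ±1 vectors equals 2·popcount(XNOR(a, b)) − n.

```python
# Minimal sketch (illustrative only, not the thesis code): with weights and
# activations constrained to +1/-1 and encoded as bits (+1 -> 1, -1 -> 0),
# a length-n binary dot product equals 2 * popcount(XNOR(a, b)) - n.

def xnor_popcount_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two ±1 vectors packed into n-bit integers."""
    xnor = ~(a_bits ^ b_bits) & ((1 << n) - 1)  # bitwise XNOR, masked to n bits
    return 2 * bin(xnor).count("1") - n         # 2 * popcount - n

# Example: a = [+1, -1, +1, +1] -> 0b1011, b = [+1, +1, -1, +1] -> 0b1101.
# Plain dot product: 1 - 1 - 1 + 1 = 0, matching the XNOR-popcount result.
assert xnor_popcount_dot(0b1011, 0b1101, 4) == 0
```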

The binary-weight, binary-input binarized neural network (BNN) offers a much more efficient way to implement convolutional neural networks (CNNs) on mobile platforms.
During inference, the multiply-accumulate operations in BNNs can be reduced to XNOR-popcount operations.
As a result, XNOR-popcount operations dominate the computation in BNNs.
To reduce the number of required operations in convolution layers of BNNs, we decompose 3-D filters into 2-D filters and exploit the repeated filters, inverse filters, and similar filters to share convolutional results.
By sharing the convolutional results, the number of operations in convolution layers of BNNs can be reduced effectively.
Experimental results show that the number of operations can be reduced by about 60% for CIFAR-10 and SVHN on BNNs while keeping the accuracy loss within 1% of the originally trained networks.
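The sharing idea described in the abstract can be sketched as follows; this is a hypothetical illustration under assumed names (share_conv, canonical), data layout, and use of SciPy, not the implementation evaluated in the thesis. Each 3-D filter is split into its 2-D channel slices; slices on the same input channel that are identical, or element-wise inverses of each other, map to one canonical key, so their 2-D convolution is computed only once and reused (negated for inverse filters) when assembling the 3-D results.

```python
# Hypothetical sketch of convolutional result sharing (names, layout, and use of
# SciPy are assumptions for illustration; this is not the thesis implementation).
import numpy as np
from scipy.signal import correlate2d

def canonical(w2d):
    """Map a ±1 2-D filter and its element-wise inverse to one key plus a sign."""
    pos = tuple(map(tuple, np.asarray(w2d)))
    neg = tuple(map(tuple, -np.asarray(w2d)))
    return (pos, 1) if pos <= neg else (neg, -1)

def share_conv(inputs, filters):
    """inputs: (C, H, W) array in ±1; filters: (F, C, k, k) array in ±1."""
    cache = {}                           # (channel, canonical 2-D filter) -> 2-D result
    outputs = []
    for f in filters:                    # each 3-D filter
        acc = 0
        for c, w2d in enumerate(f):      # each 2-D slice of that filter
            canon, sign = canonical(w2d) # detect repeated / inverse 2-D filters
            key = (c, canon)
            if key not in cache:         # a unique 2-D convolution is computed only once
                cache[key] = correlate2d(inputs[c], np.asarray(canon), mode="valid")
            acc = acc + sign * cache[key]  # share (and possibly negate) the result
        outputs.append(acc)
    return np.stack(outputs), len(cache)   # 3-D results and number of unique 2-D convs
```

On layers where many 2-D slices repeat or invert one another, the number of cached 2-D convolutions is much smaller than F × C, which is the source of the operation reduction reported above.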
Chinese Abstract i
Abstract ii
Acknowledgment iii
Contents iv
List of Tables vi
List of Figures vii
1 Introduction 1
2 Backgrounds 4
2.1 Convolutional Neural Networks 4
2.2 Binarized Neural Networks 5
3 Proposed Scheme 7
3.1 Convolutional Result Sharing with Filter Repetitions 7
3.1.1 2-D Filter ID 7
3.1.2 Inverse Bit 8
3.1.3 Decomposition from 3-D to 2-D 8
3.1.4 Filter Dependency Graph 10
3.1.5 Convolutional Result Sharing 12
3.1.6 Reduction on XNOR-Popcount Operations 13
3.2 Convolutional Result Sharing with Filter Approximation 14
3.2.1 Degree of Similarity of Filters 15
3.2.2 Degree of Importance of Filters 16
3.2.3 Filter Similarity Graph 16
3.2.4 Filter Approximation 17
3.2.5 ILP Formulation 20
3.3 Overall Flow 20
4 Experimental Evaluation 23
4.1 Experimental Setup 23
4.2 Experimental Results 23
5 Conclusion 28
Bibliography 29