Author: Chang, Ya-Chun
Thesis Title: A Convolutional Result Sharing Approach for Binarized Neural Network Inference
Advisor: Wang, Chun-Yao
Oral Defense Committee: Jiang, Jie-Hong; Wen, Hung-Pin
Keywords: convolutional neural network; binarized neural network; approximate computing
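Binarized neural networks (BNNs) allow convolutional neural networks (CNNs) to be implemented more efficiently on mobile platforms. During inference, the multiply-accumulate operations in a BNN can be simplified to XNOR-popcount operations, which account for most of the computation in BNNs. To reduce the number of operations required in the convolution layers of BNNs, we decompose 3-D filters into 2-D filters and use repeated filters, inverse filters, and similar filters to share convolutional results. Sharing convolutional results effectively reduces the number of operations in the convolution layers of BNNs. Experimental results show that the number of operations in the convolution layers of BNNs can be reduced by about 60% for CIFAR-10 and SVHN, while keeping the accuracy loss within 1% of the originally trained networks.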

The binarized neural network (BNN), with binary weights and binary inputs, enables a much more efficient implementation of convolutional neural networks (CNNs) on mobile platforms.
During inference, the multiply-accumulate operations in BNNs can be reduced to XNOR-popcount operations.
Thus, XNOR-popcount operations account for most of the computation in BNNs.
To reduce the number of operations required in the convolution layers of BNNs, we decompose 3-D filters into 2-D filters and exploit repeated, inverse, and similar filters to share convolutional results.
By sharing the convolutional results, the number of operations in convolution layers of BNNs can be reduced effectively.
Experimental results show that the number of operations can be reduced by about 60% for CIFAR-10 and SVHN on BNNs while keeping the accuracy loss within 1% of the originally trained networks.
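The two ideas in the abstract, XNOR-popcount arithmetic and result sharing across repeated and inverse 2-D filters, can be illustrated with a minimal sketch. It is not the thesis's implementation: the bit encoding (bit 1 for +1, bit 0 for -1), the function names, and the per-channel cache are all assumptions made for illustration, and the similar-filter approximation is left out.

```python
def xnor_popcount_dot(w_bits, x_bits, n):
    """Dot product of two {-1,+1} vectors packed into n-bit ints.

    Assumed encoding: bit 1 means +1, bit 0 means -1.
    """
    mask = (1 << n) - 1
    matches = bin(~(w_bits ^ x_bits) & mask).count("1")  # popcount of XNOR
    return 2 * matches - n


def canonical(filter_bits, n):
    """Map a 2-D filter to (filter_id, inverse_bit).

    A filter and its bitwise complement get the same id; the inverse bit
    records which of the two it is. This naming is hypothetical.
    """
    inverse = filter_bits ^ ((1 << n) - 1)
    return (filter_bits, 0) if filter_bits <= inverse else (inverse, 1)


def conv_outputs_direct(filters, patches, n):
    """Reference: one XNOR-popcount per 2-D filter, no sharing."""
    return [
        sum(xnor_popcount_dot(f2d, patches[c], n) for c, f2d in enumerate(f3d))
        for f3d in filters
    ]


def conv_outputs_shared(filters, patches, n):
    """Same outputs, but each distinct (channel, filter_id) is computed once.

    filters[f][c] is the 2-D slice of 3-D filter f in channel c.
    patches[c] is the input patch of channel c at one output position.
    Returns (outputs, number of XNOR-popcount operations performed).
    """
    cache = {}
    outputs = []
    for f3d in filters:
        total = 0
        for c, f2d in enumerate(f3d):
            filter_id, inverse_bit = canonical(f2d, n)
            key = (c, filter_id)
            if key not in cache:
                cache[key] = xnor_popcount_dot(filter_id, patches[c], n)
            # An inverse filter flips every weight, so its result is negated.
            total += -cache[key] if inverse_bit else cache[key]
        outputs.append(total)
    return outputs, len(cache)


# Three 3-D filters with two channels each, 3x3 kernels flattened to 9 bits.
n = 9
filters = [
    [0b101010101, 0b000111000],
    [0b010101010, 0b000111000],  # channel 0 is the inverse of filter 0's
    [0b101010101, 0b111000111],  # channel 1 is the inverse of filter 0's
]
patches = [0b111000111, 0b101101101]
shared, ops = conv_outputs_shared(filters, patches, n)
print(shared, "operations:", ops, "of", sum(len(f) for f in filters))
```

In this toy case only two of the six 2-D convolutions are distinct once inverses are folded together, so the remaining four are obtained by reuse and sign flips; the savings on a real network depend on how often 2-D filters actually repeat.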
Chinese Abstract
Abstract
Acknowledgment
Contents
List of Tables
List of Figures
1 Introduction
2 Backgrounds
2.1 Convolutional Neural Networks
2.2 Binarized Neural Networks
3 Proposed Scheme
3.1 Convolutional Result Sharing with Filter Repetitions
3.1.1 2-D Filter ID
3.1.2 Inverse Bit
3.1.3 Decomposition from 3-D to 2-D
3.1.4 Filter Dependency Graph
3.1.5 Convolutional Result Sharing
3.1.6 Reduction on XNOR-Popcount Operations
3.2 Convolutional Result Sharing with Filter Approximation
3.2.1 Degree of Similarity of Filters
3.2.2 Degree of Importance of Filters
3.2.3 Filter Similarity Graph
3.2.4 Filter Approximation
3.2.5 ILP Formulation
3.3 Overall Flow
4 Experimental Evaluation
4.1 Experimental Setup
4.2 Experimental Results
5 Conclusion
Bibliography
