Author: Hsu, Chen-Han
Thesis title: A Reconstructing Spike-Based Convolution Neural Network (SCNN) Accelerator with Input Sparsity Mechanism
Advisor: Tang, Kea-Tiong
Oral defense committee: Huang, Chao-Tsung
Liu, Ren-Shuo
Lu, Chih-Cheng
Keywords: Spiking neural network; Convolution neural network; accelerator; sparsity mechanism
Artificial intelligence technology has flourished in recent years and is applied to a broad range of tasks. Convolutional neural networks in particular are widely used in image-processing tasks such as recognition and classification, so demand for low-power chips for edge devices has grown accordingly. However, as the tasks become more complex, the parameter count and computation of the neural network models increase significantly, which raises chip power consumption. As a result, spiking neural networks (SNNs) have received growing attention in recent years.
SNNs, which are inspired by the human brain and feature simple functions and low data density, have become an important research topic. They have several low-power properties, such as event-driven operation, binarized data, and high input sparsity. In this research, we propose a spike-based CNN accelerator with a Spatiotemporal Parallel Data Flow that computes data in the spatial and temporal domains simultaneously, which reduces the number of memory accesses and therefore the overall energy consumption. In addition, we design a sparsity-aware, event-driven circuit and propose an early skip mechanism for pooling operations to reduce power consumption and computation time. The hardware resources can be reconfigured to suit networks of different sizes or different layers within a network. The accelerator achieves an energy efficiency of 54.77 TOPS/W and an area efficiency of 2.57 GOPS/kmm2 in a 40 nm process at a clock frequency of 300 MHz. It has better energy efficiency and area utilization than other spike-based convolutional neural network accelerators.
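The two sparsity ideas named in the abstract can be illustrated in software with a minimal sketch. The abstract does not describe the circuits, so everything here is an assumption made for illustration: the function names `sparse_accumulate` and `pool_early_skip`, the reading of "event-driven" as adding a weight only when its input spike is 1, and the reading of "early skip" as ending a max-pooling window at the first spike (valid because the maximum over binary values is 1 as soon as any input is 1).

```python
def sparse_accumulate(spikes, weights):
    """Accumulate weights only for inputs that spiked (hypothetical helper).

    Returns (partial_sum, number_of_additions_performed).
    """
    total = 0
    adds = 0
    for spike, weight in zip(spikes, weights):
        if spike:  # zero inputs trigger no work at all
            total += weight
            adds += 1
    return total, adds


def pool_early_skip(window):
    """Max pooling over a window of binary spikes (hypothetical helper).

    Stops reading at the first spike, since the result is already 1.
    Returns (pooled_output, number_of_inputs_read).
    """
    reads = 0
    for spike in window:
        reads += 1
        if spike:
            return 1, reads
    return 0, reads


if __name__ == "__main__":
    print(sparse_accumulate([0, 1, 0, 1], [3, 5, 7, 2]))
    print(pool_early_skip([0, 1, 0, 0]))
```

In this sketch the saving grows with input sparsity for accumulation (fewer additions) and with spike density for pooling (fewer reads), which is one plausible reason for combining the two mechanisms.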
Chapter 1 Introduction
1.1 Research Motivation and Objectives
1.2 Chapter Overview
Chapter 2 Literature Review
2.1 Convolutional Neural Network Accelerators
2.1.1 Deep Convolutional Neural Network Accelerator Architectures
2.1.2 Data Reuse
2.1.3 Data Movement
2.2 Spiking Convolutional Neural Networks
2.3 Spiking Neural Network Accelerators
2.3.1 Architectures and Accelerators with Spatial-First Dataflow
2.3.2 Architectures and Accelerators with Temporal-First Dataflow
2.4 Research Motivation
Chapter 3 Spiking Convolutional Neural Network Accelerator Design
3.1 Definition of Time Steps
3.2 Accelerator Architecture
3.3.1 Spatiotemporal Parallel Dataflow
3.3.2 Neuron Model and Circuit
3.4 Sparsity Zero-Skipping Mechanism and Circuit Design
3.4.1 Sparsity Zero-Skipping and Event-Driven Mechanisms
3.4.2 Circuit Design for Zero-Skipping and Event-Driven Mechanisms
3.5 Early-Skip Mechanism and Circuit for Pooling Layers
3.6 Reconfigurable Processing Element Array and Computation Circuit Architecture
Chapter 4 Experimental Results
4.1 Experimental Setup
4.2 Functional Measurement of the Spiking Convolution Accelerator
4.3 Performance Analysis of the Proposed Methods on VGG Networks
4.3.1 Effect of Spatiotemporal Parallel Dataflow on Memory Access Count
4.3.2 Sparsity Zero-Skipping and Pooling-Layer Early-Skip Mechanisms
4.4 Chip Specifications and Comparison with Other SCNN Accelerators
Chapter 5 Conclusion and Future Work

