[1] J. Albericio, P. Judd, T. Hetherington, T. Aamodt, N. E. Jerger, and A. Moshovos. Cnvlutin: Ineffectual-neuron-free deep neural network computing. In 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), pages 1–13, 2016.

[2] Y. Bengio, N. Léonard, and A. C. Courville. Estimating or propagating gradients through stochastic neurons for conditional computation. CoRR, abs/1308.3432, 2013.

[3] L. Cavigelli and L. Benini. Extended bit-plane compression for convolutional neural network accelerators. In 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), pages 279–283. IEEE, 2019.

[4] C.-T. Huang, P.-C. Tseng, and L.-G. Chen. Flipping structure: An efficient VLSI architecture for lifting-based discrete wavelet transform. In Asia-Pacific Conference on Circuits and Systems, pages 383–388 vol. 1, 2002.

[5] C.-T. Huang, P.-C. Tseng, and L.-G. Chen. Generic RAM-based architectures for two-dimensional discrete wavelet transform with line-based method. IEEE Transactions on Circuits and Systems for Video Technology, pages 910–920, 2005.

[6] Y. Chen, T. Krishna, J. S. Emer, and V. Sze. 14.5 Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks. In 2016 IEEE International Solid-State Circuits Conference (ISSCC), pages 262–263, 2016.

[7] M. Courbariaux, Y. Bengio, and J.-P. David. BinaryConnect: Training deep neural networks with binary weights during propagations. In NIPS, 2015.

[8] G. Georgiadis. Accelerating convolutional neural networks via activation map compression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 7085–7095, 2019.

[9] S. Han, H. Mao, and W. J. Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. CoRR, abs/1510.00149, 2016.

[10] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 2016.

[11] Y. He, X. Zhang, and J. Sun. Channel pruning for accelerating very deep neural networks. In IEEE International Conference on Computer Vision (ICCV), pages 1398–1406, 2017.

[12] M. Horowitz. 1.1 Computing's energy problem (and what we can do about it). In 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), pages 10–14. IEEE, 2014.

[13] J. Lee, J. Lee, D. Han, J. Lee, G. Park, and H.-J. Yoo. 7.7 LNPU: A 25.3TFLOPS/W sparse deep-neural-network learning processor with fine-grained mixed precision of FP8-FP16. In 2019 IEEE International Solid-State Circuits Conference (ISSCC), pages 142–144. IEEE, 2019.

[14] J. Kim, M. Sullivan, E. Choukse, and M. Erez. Bit-plane compression: Transforming data for better compression in many-core architectures. In 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), pages 329–340, 2016.

[15] J. Lee, C. Kim, S. Kang, D. Shin, S. Kim, and H. Yoo. UNPU: A 50.6TOPS/W unified deep neural network accelerator with 1b-to-16b fully-variable weight bit-precision. In 2018 IEEE International Solid-State Circuits Conference (ISSCC), pages 218–220, 2018.

[16] Z. Liu, B. Wu, W. Luo, X. Yang, W. Liu, and K.-T. Cheng. Bi-Real Net: Enhancing the performance of 1-bit CNNs with improved representational capability and advanced training algorithm. In ECCV, 2018.

[17] M. W. Marcellin, M. J. Gormish, A. Bilgin, and M. P. Boliek. An overview of JPEG-2000. In Data Compression Conference (DCC), pages 523–544, 2000.

[18] F. Mentzer, E. Agustsson, M. Tschannen, R. Timofte, and L. V. Gool. Conditional probability models for deep image compression. In 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 4394–4402, 2018.

[19] M. Rastegari, V. Ordonez, J. Redmon, and A. Farhadi. XNOR-Net: ImageNet classification using binary convolutional neural networks. In ECCV, 2016.

[20] M. Rhu, M. O'Connor, N. Chatterjee, J. Pool, Y. Kwon, and S. W. Keckler. Compressing DMA engine: Leveraging activation sparsity for training deep neural networks. In 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA), pages 78–91, 2018.

[21] W. Wen, C. Wu, Y. Wang, Y. Chen, and H. Li. Learning structured sparsity in deep neural networks. In Advances in Neural Information Processing Systems 29 (NIPS), pages 2074–2082, 2016.

[22] I. H. Witten, R. M. Neal, and J. G. Cleary. Arithmetic coding for data compression. Communications of the ACM, 30(6):520–540, 1987.

[23] B.-F. Wu and C.-F. Lin. A high-performance and memory-efficient pipeline architecture for the 5/3 and 9/7 discrete wavelet transform of JPEG2000 codec. IEEE Transactions on Circuits and Systems for Video Technology, pages 1615–1628, 2005.

[24] H. Yamauchi, S. Okada, K. Taketa, T. Ohyama, Y. Matsuda, T. Mori, S. Okada, T. Watanabe, Y. Matsuo, Y. Yamada, T. Ichikawa, and Y. Matsushita. Image processor capable of block-noise-free JPEG2000 compression with 30 frames/s for digital camera applications. In 2003 IEEE International Solid-State Circuits Conference (ISSCC), Digest of Technical Papers, pages 46–477 vol. 1, 2003.

[25] S. Zhang, Z. Du, L. Zhang, H. Lan, S. Liu, L. Li, Q. Guo, T. Chen, and Y. Chen. Cambricon-X: An accelerator for sparse neural networks. In 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pages 1–12, 2016.