|
[1] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553): 436–444, 2015. [2] Minh-Thang Luong, Hieu Pham, and Christopher D Manning. Effective approaches to attention-based neural machine translation. In arXiv, 2015. [3] Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012. [4] Volodymyr Mnih, et al. Human-level control through deep reinforcement learning. Nature, 518(7540): 529, 2015. [5] Olga Russakovsky, et al. Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 115.3: 211-252, 2015. [6] A. Krizhevsky, and et al., “Imagenet classification with deep convolutional neural networks.” In NIPS, 2012. [7] S. Han, and et al., “Learning both Weights and Connections for Efficient Neural Networks.” In NIPS,2015. [8] W. Wen, and et al., “Learning Structured Sparsity in Deep Neural Network”. In NIPS, 2016. [9] J. Luo, and et al., “ThiNet-A Filter Level Pruning Method for Deep Neural Network Compression.” In ICCV, 2017. [10] M. Courbariaux, and et al., “Binaryconnect: Training deep neural networks with binary weights during propagations.” In NIPS, 2015. [11] S. Zhou, and et al., “Dorefa-net: Training low bitwidth convolutional neural networks with low bitwidth gradients.” In arXiv:1606.06160, 2016. [12] Z. Cai, and et al., “Deep learning with low precision by half-wave gaussian quantization.” In CVPR, 2017. [13] X. Lin, and et al., “Towards Accurate Binary Convolutional Neural Network.” In NIPS, 2017. [14] D. Miyashita, and et al., “Convolutional neural networks using logarithmic data representation.” In arXiv, 2016. [15] A. Zhou, and et al. “Incremental network quantization: Towards lossless cnns with low-precision weights.” In ICLR, 2017. [16] M. Rastegar, and et al. “Xnor-net: Imagenet classification using binary convolutional neural networks.” In ECCV, 2016. [17] Y. Dong, and et al. “Learning Accurate Low-Bit Deep Neural Networks with Stochastic Quantization.” In BMVC, 2017. [18] F. Li and B. Liu. “Ternary weight networks.” In NIPS Workshop on EMDNN, 2016. [19] S. K. Esser, and et al.”Learned Step Size Quantization.”In ICLR, 2020. [20] X. Zhao, and et al. “Linear Symmetric Quantization of Neural Networks For Lowprecision Integer Hardware.” In ICLR, 2020. [21] Boris Murmann. 2020. Mixed-signal computing for deep neural network inference. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 29, 1 (2020), 3ś13. [22] G. Hinton, O. Vinyals, J. Dean, Distilling the Knowledge in a Neural Network, 2015. [23] M. Jaderberg, and et al., “Speeding up convolutional neural networks with low rank expansions.” In arXiv, 2014. [24] X. Zhang, and et al., “Accelerating very deep convolutional networks for classification and detection.” In TPAMI, 38(10):1943-1955, 2015. [25] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014. [26] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1–9, 2015 [27] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385, 2015. [28] Mark Sandler, and et al., “MobileNetV2: Inverted Residuals and Linear Bottlenecks.” In CVPR, 2018. [29] S. Jeloka, N. B. Akesh, D. Sylvester, and D. Blaauw, “A 28 nm configurable memory (TCAM/BCAM/SRAM) using push-rule 6T bit cell enabling logic-inmemory,” IEEE Journal of Solid-State Circuits, vol. 51, no. 4, pp. 1009–1021, 2016. [30] A. Subramaniyan, J. Wang, E. R. Balasubramanian, D. Blaauw, D. Sylvester, and R. Das, “Cache automaton,” in International Symposium on Microarchitecture, 2017, pp. 259–272 [31] J. Zhang, Z. Wang, and N. Verma, “In-memory computation of a machine-learning classifier in a standard 6T SRAM array,” IEEE Journal of Solid-State Circuits, vol. 52, no. 4, pp. 915–924, 2017. [32] S. Mittal, R. Wang, and J. Vetter, “DESTINY: A Comprehensive Tool with 3D and Multi-level Cell Memory Modeling Capability,” Journal of Low Power Electronics and Applications, vol. 7, no. 3, p. 23, 2017. [33] J. P. Kulkarni, J. Keane, K.-H. Koo, S. Nalam, Z. Guo, E. Karl, and K. Zhang, “5.6 Mb/mm2 1R1W 8T SRAM Arrays Operating Down to 560 mV Utilizing SmallSignal Sensing With Charge Shared Bitline and Asymmetric Sense Amplifier in 14 nm FinFET CMOS Technology,” IEEE Journal of Solid-State Circuits, vol. 52, no. 1, pp. 229–239, 2016. [34] M. Qazi, K. Stawiasz, L. Chang, and A. P. Chandrakasan, “A 512kb 8T SRAM macro operating down to 0.57 V with an AC-coupled sense amplifier and embedded data-retention-voltage sensor in 45 nm SOI CMOS,” IEEE Journal of Solid-State Circuits, vol. 46, no. 1, pp. 85–96, 2010. [35] J. Kulkarni, M. Khellah, J. Tschanz, B. Geuskens, R. Jain, S. Kim, and V. De, “Dual-V CC 8T-bitcell SRAM array in 22nm tri-gate CMOS for energy-efficient operation across wide dynamic voltage range,” in 2013 Symposium on VLSI Technology. IEEE, 2013, pp. C126–C127. [36] Y. Zhang, L. Xu, K. Yang, Q. Dong, S. Jeloka, D. Blaauw, and D. Sylvester, “Recryptor: A reconfigurable in-memory cryptographic Cortex-M0 processor for IoT,” in 2017 Symposium on VLSI Circuits. IEEE, 2017, pp. C264–C265. [37] S. Mittal, “A Survey of ReRAM-based Architectures for Processing-in-memory and Neural Networks,” Machine learning and knowledge extraction, vol. 1, p. 5, 2018. [38] B. Feinberg, U. K. R. Vengalam, N. Whitehair, S. Wang, and E. Ipek, “Enabling scientific computing on memristive accelerators,” in 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA). IEEE, 2018, pp. 367–382. [39] D. Fujiki, S. Mahlke, and R. Das, “Duality cache for data parallel acceleration,” in Proceedings of the 46th International Symposium on Computer Architecture, 2019, pp. 397–410 [40] Xin Si; Yung-Ning Tu; Wei-Hsing Huang; Jian-Wei Su; Pei-Jung Lu; Jing-Hong Wang, et al. “A 28nm 64Kb 6T SRAM Computing-in-Memory macro with 8b MAC operation for AI edge chips”. In ISSCC, 2020. In press. [41] Yi Cai; Tianqi Tang: Lixue Xia; Boxun Li; Yu Wang, et al. “Low Bit-width Convolutional Neural Network on RRAM.” In TCAD, 2020. [42] Vinay Joshi, et al.”Accurate Deep Neural Network Inference Using Computational Phase-change Memory.” In Nature Communications, 2020. [43] P. Adam, and et al., “PyTorch: An Imperative Style, High-Performance Deep Learning Library”, 2019. [44] Martín Abadi, et al. TensorFlow: Large-scale machine learning on heterogeneous systems, Software available from tensorflow.org., 2015. [45] Yangqing Jia, and et al. “Caffe: Convolutional Architecture for Fast Feature Embedding.” In CVPR, 2014 |