[1] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015.
[2] Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, 2012.
[3] Minh-Thang Luong, Hieu Pham, and Christopher D. Manning. Effective approaches to attention-based neural machine translation. arXiv preprint, 2015.
[4] Dario Amodei, Rishita Anubhai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, et al. Deep Speech 2: End-to-end speech recognition in English and Mandarin. arXiv preprint, 2015.
[5] V. Mnih et al. Human-level control through deep reinforcement learning. Nature, 2015.
[6] Olga Russakovsky et al. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3):211–252, 2015.
[7] J. Hu, L. Shen, and G. Sun. Squeeze-and-excitation networks. In CVPR, 2018.
[8] Geoffrey Hinton and Ruslan Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507, 2006.
[9] Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, 2012.
[10] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, 2016.
[11] Y.-H. Chen, T. Krishna, J. S. Emer, and V. Sze. Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks. IEEE Journal of Solid-State Circuits, 52(1):127–138, 2017.
[12] Y. Chen et al. DaDianNao: A machine-learning supercomputer. In MICRO, 2014.
[13] S. Yin et al. A high energy efficient reconfigurable hybrid neural network processor for deep learning applications. IEEE Journal of Solid-State Circuits, 53(4):968–982, 2018.
[14] A. Shafiee et al. ISAAC: A convolutional neural network accelerator with in-situ analog arithmetic in crossbars. In ISCA, 2016.
[15] P. Chi et al. PRIME: A novel processing-in-memory architecture for neural network computation in ReRAM-based main memory. In ISCA, 2016.
[16] M. Lin et al. DL-RSIM: A simulation framework to enable reliable ReRAM-based accelerators for deep learning. In ICCAD, 2018.
[17] C.-X. Xue et al. A 1Mb multibit ReRAM computing-in-memory macro with 14.6ns parallel MAC computing time for CNN-based AI edge processors. In ISSCC, 2019.
[18] G. Hinton, O. Vinyals, and J. Dean. Distilling the knowledge in a neural network. arXiv preprint, 2015.
[19] M. Jaderberg et al. Speeding up convolutional neural networks with low rank expansions. arXiv preprint, 2014.
[20] R. Krishnamoorthi. Quantizing deep convolutional networks for efficient inference: A whitepaper. arXiv preprint, 2018.
[21] S. Zhou et al. DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients. arXiv preprint arXiv:1606.06160, 2016.
[22] R. Banner et al. Post training 4-bit quantization of convolutional networks for rapid-deployment. In NeurIPS, 2019.
[23] S. Han et al. Learning both weights and connections for efficient neural networks. In NIPS, 2015.
[24] H. Li et al. Pruning filters for efficient ConvNets. In ICLR, 2017.
[25] Z. Liu, J. Li, Z. Shen, G. Huang, S. Yan, and C. Zhang. Learning efficient convolutional networks through network slimming. In ICCV, 2017.
[26] S. Narang et al. Exploring sparsity in recurrent neural networks. In ICLR, 2017.
[27] P. Molchanov et al. Importance estimation for neural network pruning. In CVPR, 2019.
[28] N. Lee et al. SNIP: Single-shot network pruning based on connection sensitivity. In ICLR, 2019.
[29] H. Yang et al. Filter pruning via geometric median for deep convolutional neural networks acceleration. In CVPR, 2019.
[30] Q. Zhang et al. Learning compact networks via similarity-aware channel pruning. In MIPR, 2020.
[31] J.-H. Luo et al. An entropy-based pruning method for CNN compression. arXiv preprint, 2017.
[32] L. Hang et al. Feature statistics guided efficient filter pruning. In IJCAI, 2020.
[33] X. Si et al. A 28nm 64Kb 6T SRAM computing-in-memory macro with 8b MAC operation for AI edge chips. In ISSCC, 2020.
[34] S. Han et al. EIE: Efficient inference engine on compressed deep neural network. In ISCA, 2016.
[35] S. Zhang et al. Cambricon-X: An accelerator for sparse neural networks. In MICRO, 2016.
[36] J. Lin et al. Learning the sparsity for ReRAM: Mapping and pruning sparse neural network for ReRAM-based accelerator. In ASP-DAC, 2019.
[37] H. Ji et al. ReCom: An efficient resistive accelerator for compressed deep neural networks. In DATE, 2018.
[38] P. Wang et al. SNrram: An efficient sparse neural network computation architecture based on resistive random-access memory. In DAC, 2019.
[39] W. Wen et al. Learning structured sparsity in deep neural networks. In NIPS, 2016.
[40] S. Srinivas et al. Data-free parameter pruning for deep neural networks. In BMVC, 2015.
[41] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 2011.
[42] T. Zhang et al. A systematic DNN weight pruning framework using alternating direction method of multipliers. In ECCV, 2018.
[43] T. Zhang et al. StructADMM: A systematic, high-efficiency framework of structured weight pruning for DNNs. arXiv preprint, 2018.
[44] H. Wang, Q. Zhang, Y. Wang, L. Yu, and H. Hu. Structured pruning for efficient ConvNets via incremental regularization. In IJCNN, 2019.
[45] H. Yang et al. DeepHoyer: Learning sparser neural network with differentiable scale-invariant sparsity measures. In ICLR, 2020.
[46] S.-H. Sie et al. MARS: Multi-macro architecture SRAM CIM-based accelerator with co-designed compressed neural networks. arXiv preprint arXiv:2010.12861, 2020.
[47] T.-W. Chin et al. Towards efficient model compression via learned global ranking. In CVPR, 2020.
[48] A. Kusupati et al. Soft threshold weight reparameterization for learnable sparsity. In ICML, 2020.
[49] T. Yang et al. Designing energy-efficient convolutional neural networks using energy-aware pruning. In CVPR, 2017.
[50] H. Yang et al. Energy-constrained compression for deep neural networks via weighted sparse projection and layer input masking. In ICLR, 2019.
[51] H. Yang et al. ECC: Platform-independent energy-constrained deep neural network compression via a bilinear regression model. In CVPR, 2019.
[52] J. Shi et al. SASL: Saliency-adaptive sparsity learning for neural network acceleration. arXiv preprint, 2020.