|
|