[1] RISC-V, RISC-V Foundation, 2010. [Online]. Available: https://riscv.org
[2] H. Lin, P. Chen, Y.-S. Hwang, and J.-K. Lee, "Devise Rust compiler optimizations on RISC-V architectures with SIMD instructions," in Proceedings of the 48th International Conference on Parallel Processing: Workshops, 2019, pp. 1–7.
[3] RISC-V Vector Extension, RISC-V Foundation, 2018. [Online]. Available: https://github.com/riscv/riscv-v-spec
[4] T. Chen, T. Moreau, Z. Jiang, L. Zheng, E. Yan, H. Shen, M. Cowan, L. Wang, Y. Hu, L. Ceze et al., "TVM: An automated end-to-end optimizing compiler for deep learning," in 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), 2018, pp. 578–594.
[5] TensorFlow, Google, 2015. [Online]. Available: https://www.tensorflow.org/
[6] PyTorch, Facebook, 2016. [Online]. Available: https://pytorch.org/
[7] MXNet, Apache Software Foundation, 2015. [Online]. Available: https://mxnet.apache.org/
[8] Core ML, Apple, 2017. [Online]. Available: https://developer.apple.com/documentation/coreml
[9] ONNX, ONNX Project Contributors, 2017. [Online]. Available: https://onnx.ai/
[10] J. Ragan-Kelley, C. Barnes, A. Adams, S. Paris, F. Durand, and S. Amarasinghe, "Halide: A language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines," in ACM SIGPLAN Notices, vol. 48, no. 6. ACM, 2013, pp. 519–530.
[11] J. Roesch, S. Lyubomirsky, L. Weber, J. Pollock, M. Kirisame, T. Chen, and Z. Tatlock, "Relay: A new IR for machine learning frameworks," in Proceedings of the 2nd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages. ACM, 2018, pp. 58–68.
[12] A. Lu, C.-L. Lee, Y.-M. Chang, P.-Y. Chen, H.-W. Sung, H. Lin, S.-C. Wang, and J.-K. Lee, "Enabling TVM on RISC-V architectures with SIMD instructions," 2019.
[13] J.-K. Lee, A. Lu, Y.-M. Chang, C.-L. Lee, P.-Y. Chen, and S.-C. Wang, "Supporting TVM on RISC-V architectures," 2018.
[14] J.-K. Lee, C.-C. Yang, A. Lu, P.-Y. Chen, Y.-M. Chang, C. Chang, Y.-R. Chen, H. Liao, C.-L. Lee, S.-H. Lu, and S.-C. Wang, "Supporting TVM on RISC-V architectures with SIMD computations," 2019.
[15] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner et al., "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[16] A. Krizhevsky, "One weird trick for parallelizing convolutional neural networks," arXiv preprint arXiv:1404.5997, 2014.
[17] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, "MobileNets: Efficient convolutional neural networks for mobile vision applications," arXiv preprint arXiv:1704.04861, 2017.
[18] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, "MobileNetV2: Inverted residuals and linear bottlenecks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 4510–4520.
[19] F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, "SqueezeNet: AlexNet-level accuracy with 50× fewer parameters and <0.5 MB model size," arXiv preprint arXiv:1602.07360, 2016.
[20] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 779–788.
[21] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," arXiv preprint arXiv:1804.02767, 2018.
[22] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, "Rethinking the inception architecture for computer vision," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2818–2826.