|
[1] “NCNN: a high-performance neural network inference framework optimized for the mobile platform.” https://github.com/Tencent/ncnn.
[2] Albert Chiou and Mat Laibowitz. “Cache Coherent Interconnect Network.”
[3] Neil Parris. “Cache Coherency Fundamentals.” https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/extended-system-coherency---part-1---cache-coherency-fundamentals, 2013.
H. Li, Z. Lin, X. Shen, J. Brandt, and G. Hua. “A convolutional neural network cascade for face detection.” In IEEE Conf. Comput. Vis. Pattern Recognit., 2015, pp. 5325–5334.
[4] “Snoop Control Unit.” https://developer.arm.com/docs/100486/latest/snoop-control-unit.
M. Motamedi, D. Fong, and S. Ghiasi. “Machine intelligence on resource-constrained IoT devices: The case of thread granularity optimization for CNN inference.” ACM Trans. Embedded Comput. Syst., vol. 16, no. 5s, pp. 1–19, 2017.
[5] Forrest N Iandola, Song Han, Matthew W Moskewicz, Khalid Ashraf, William J Dally, and Kurt Keutzer. “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size.” arXiv preprint arXiv:1602.07360, 2016.
[6] Andrew G Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. “MobileNets: Efficient convolutional neural networks for mobile vision applications.” arXiv preprint arXiv:1704.04861, 2017.
[7] “Tengine is a lite, high performance, modular inference engine for embedded device.” https://github.com/OAID/Tengine.
[8] “Compute Library: A Software Library for Computer Vision and Machine Learning.” https://developer.arm.com/technologies/compute-library.
[9] “Arm big.LITTLE technology is a heterogeneous processing architecture that uses two types of processor.” https://www.arm.com/why-arm/technologies/big-little.
[10] “Intel Lakefield is packed with more than one type of CPU core to create a more stable and better rounded system.” https://www.techradar.com/news/intel-lakefield-video-guides-us-inside-its-first-hybrid-processor?region-switch=1551470279.
Hsin-Yu Ho, et al. “An Effective Early Multi-core System Shared Cache Design Method Based on Reuse-distance Analysis.” National Tsing Hua University, 2017.
[11] J. Wu, C. Leng, Y. Wang, Q. Hu, and J. Cheng. “Quantized convolutional neural networks for mobile devices.” arXiv preprint arXiv:1512.06473, 2015.
[12] Ji Lin, Yongming Rao, Jiwen Lu, and Jie Zhou. “Runtime neural pruning.” In NIPS, 2017.
[13] Linpeng Tang, Yida Wang, Theodore L Willke, and Kai Li. “Scheduling computation graphs of deep learning models on manycore CPUs.” arXiv preprint arXiv:1807.09667, 2018.
[14] Siqi Wang, Gayathri Ananthanarayanan, Yifan Zeng, Neeraj Goel, Anuj Pathania, and Tulika Mitra. “High-Throughput CNN Inference on Embedded ARM big.LITTLE Multi-Core Processors.” arXiv preprint arXiv:1903.05898, 2019.
[15] B. Lewis and D. J. Berg. “Multithreaded Programming with Pthreads.” Prentice Hall, 1998.
|