Detailed Record
Author (Chinese): 宋家宇
Author (English): Sung, Chia-Yu
Title (Chinese): 利用 NNAPI 加速 NNEF 模型的執行
Title (English): Supports and Experiments with NNEF Models for NNAPI
Advisor (Chinese): 李政崑
Advisor (English): Lee, Jenq-Kuen
Committee members (Chinese): 陳呈瑋, 洪明郁
Committee members (English): Chen, Cheng-Wei; Hung, Ming-Yu
Degree: Master's
University: National Tsing Hua University (國立清華大學)
Department: Department of Computer Science (資訊工程學系)
Student ID: 106062561
Publication year: 2019 (ROC 108)
Graduation academic year: 107
Language: English
Number of pages: 50
Keywords (Chinese): 人工智慧模型編譯器 (AI model compilers)
Keywords (English): AI Model Compilers, NNEF, NNAPI
Abstract:
In recent years, deep learning models have become popular and are widely used in applications such as image recognition, speech recognition, ADAS, and AIoT. Many neural network frameworks are available for running AI models, especially for inference, including TensorFlow, Caffe, MXNet, PyTorch, Core ML, TensorFlow Lite, and NNAPI. With so many frameworks emerging, there is a growing need for an exchange format that spans them. To serve this purpose, the Khronos Group has drafted a standard known as the Neural Network Exchange Format (NNEF). However, because NNEF is new, many of the converting tools needed to make exchange among AI frameworks possible are still missing.
In this thesis, we fill this gap by devising NNEF-to-NNAPI converting tools. Our work allows NNEF models to execute inference on both host and Android platforms, and to flexibly invoke the Android Neural Networks API (NNAPI) on the Android platform to speed up inference operations. The tool invokes NNAPI by dividing the input NNEF model into multiple sub-models and letting NNAPI execute those sub-models. We propose an algorithm, based on classic breadth-first search, that decides how the input model is divided: it goes through every possible division of the input model and determines which one is the best choice. We also add two rules of our own to speed up the search. First, if we can be certain that it is better to let NNAPI compute an operator, we assign that operator to NNAPI without considering the other choice. Second, we restrict the maximum number of elements in the queue, so that the time complexity of the algorithm is bounded.
Preliminary experiments on a real Android smartphone show that our NNEF support on NNAPI improves performance by 4x to 200x over running the NNEF models on Android without invoking NNAPI. The experiments include well-known AI models such as LeNet, AlexNet, MobileNetV1, MobileNetV2, VGG-16, and VGG-19.
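The bounded breadth-first search described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the operator representation, the per-operator cost dictionaries, and the "certainly better" test (a plain cost comparison here) are hypothetical stand-ins for whatever cost model the thesis actually uses, and the sketch ignores the overhead of crossing a sub-model boundary.

```python
from collections import deque

def best_partition(ops, cpu_cost, nnapi_cost, nnapi_supported, max_queue=64):
    """Bounded breadth-first search over CPU/NNAPI assignments.

    ops             -- operators of the model in topological order
    cpu_cost        -- dict: op -> estimated time on the CPU path
    nnapi_cost      -- dict: op -> estimated time on NNAPI
    nnapi_supported -- set of operators NNAPI can execute
    max_queue       -- cap on the queue size (rule 2: bounds complexity)
    """
    # Each state is (assignment tuple so far, accumulated cost).
    queue = deque([((), 0.0)])
    for op in ops:
        next_queue = []
        for assignment, cost in queue:
            if op in nnapi_supported and nnapi_cost[op] < cpu_cost[op]:
                # Rule 1: NNAPI is certainly better for this operator,
                # so assign it directly and skip the CPU alternative.
                next_queue.append((assignment + ("NNAPI",),
                                   cost + nnapi_cost[op]))
                continue
            # Otherwise branch on both choices (NNAPI only if supported).
            next_queue.append((assignment + ("CPU",), cost + cpu_cost[op]))
            if op in nnapi_supported:
                next_queue.append((assignment + ("NNAPI",),
                                   cost + nnapi_cost[op]))
        # Rule 2: keep only the cheapest states so the queue stays bounded.
        queue = deque(sorted(next_queue, key=lambda s: s[1])[:max_queue])
    return min(queue, key=lambda s: s[1])
```

For example, with a three-operator model where NNAPI supports and is faster on the first two operators, the search assigns those two to NNAPI and leaves the unsupported third operator on the CPU, yielding one NNAPI sub-model followed by one CPU sub-model.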
Table of Contents:
Chapter 1 Introduction 1
Chapter 2 Background 5
Section 1 NNEF 5
Section 2 NNAPI 6
Chapter 3 NNEF-RT Overview 8
Chapter 4 Deploying operators to NNAPI 11
Section 1 NNEF2NNAPI 11
Section 2 Creating Sub-model(s) 13
Chapter 5 Experimental Results 22
Chapter 6 Conclusion 39
Section 1 Conclusion 39
Section 2 Future Work 39
Appendix Converting operators from NNEF to NNAPI 42
Section 1 Operators with same format 42
Section 2 Operators with different attributes 43
Section 3 Operators with different names 44
Section 4 Operators with different tensor shapes 45
Section 5 Operators with different data layouts 47
Section 6 Operators with variations 50
References:
“TensorFlow,” https://www.tensorflow.org/.
“Caffe,” http://caffe.berkeleyvision.org/.
“MXNet,” https://mxnet.apache.org/.
“PyTorch,” https://pytorch.org/.
“Core ML,” https://developer.apple.com/documentation/coreml/.
“TensorFlow Lite,” https://www.tensorflow.org/lite/.
“NNAPI,” https://developer.android.com/ndk/guides/neuralnetworks.
C.-L. Lee, M.-Y. Hsu, B.-S. Lu, and J.-K. Lee, “Enable the flow for gpgpu-sim simulators with fixed-point instructions,” in Proceedings of the 47th International Conference on Parallel Processing Companion. ACM, 2018, p. 12.
“The Khronos Group,” https://www.khronos.org/.
“NNEF Overview,” https://www.khronos.org/nnef.
“Google,” https://www.google.com/.
“Android Studio,” https://developer.android.com/studio/.
Y. LeCun, L. Bottou, Y. Bengio, P. Haffner et al., “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “Mobilenets: Efficient convolutional neural networks for mobile vision applications,” arXiv preprint arXiv:1704.04861, 2017.
M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “Mobilenetv2: Inverted residuals and linear bottlenecks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 4510–4520.
K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
“Qualcomm Snapdragon Neural Processing Engine,” https://developer.qualcomm.com/docs/snpe/overview.html.
“JSON (JavaScript Object Notation),” https://json.org/.
“MNIST,” http://yann.lecun.com/exdb/mnist/.
“CIFAR-10,” https://www.cs.toronto.edu/~kriz/cifar.html.
“ImageNet,” http://www.image-net.org/.