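Author (Chinese): 余俊佑
Author (English): Yu, Jun-You
Thesis title (Chinese): Blockly架構上針對TVM的排程優化介面設計
Thesis title (English): Support nn Blockly for TVM Scheduling
Advisor (Chinese): 李政崑
Advisor (English): Lee, Jenq-Kuen
Committee members (Chinese): 游逸平, 吳中如
Committee members (English): You, Yi-Ping; Wu, Chung-Ju
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 105062611
Year of publication (ROC calendar): 108 (2019)
Academic year of graduation: 107
Language: Chinese
Number of pages: 44
Keywords (Chinese): 深度學習 (deep learning)
Keywords (English): Deep Learning, TVM, GUI, Blockly, Node.js, NNEF, ONNX, NNVM, LeNet, AlexNet, TVM schedule API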
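Abstract (translated from the Chinese):
In recent years, deep learning (DL) has become an important research topic in academia and is now widely applied to image recognition, speech recognition, and text analysis. However, running deep learning requires substantial computing time and raises software integration problems. Researchers in the SAMPL group at the University of Washington developed TVM, a deep learning compiler that optimizes for CPUs and GPUs and supports many neural network frameworks, including TensorFlow, Keras, Caffe, and MXNet. As the number of frameworks grows, integration and optimization become more complex. Many researchers are therefore looking for a balance between framework conversion and optimization.
In this thesis, we use a tool similar to a graphical user interface (GUI), AI Model Blockly, which effectively addresses software compatibility problems. Blockly is a JavaScript library that runs in the web client; through a visual programming language (VPL), it uses customized blocks and block combinations to reduce programming complexity. The blocks are then converted into the NNEF specification format, and Node.js server-side software handles the interaction between users and developers. On the server side, we load the NNEF description into ONNX to convert it into a trained-model format, load that into NNVM to standardize the computation graph, and finally pass it to TVM for optimization.
We combine Blockly with TVM and experiment with several deep learning models (for example, LeNet and AlexNet) and with the TVM schedule API, for example split, fuse, reorder, vectorization, and parallel, to achieve different kinds of optimization.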
Abstract (English):
In recent years, deep learning (DL) has become an important research topic in academia. It has been widely used in image recognition, speech recognition, and sentence analysis. However, running a deep learning model requires a lot of computing time and raises software integration problems.
Researchers in the SAMPL group at the University of Washington developed TVM, a deep learning compiler that optimizes for CPUs and GPUs. It supports many neural network (NN) frameworks for artificial intelligence (AI), including TensorFlow, Keras, Caffe, MXNet, and PyTorch. As the number of frameworks grows, the integration and optimization problems for NN frameworks become more complicated.
Therefore, many researchers are studying how to balance framework conversion against optimization. Our goal is to provide a research tool that addresses these problems efficiently.
In this thesis, we use a graphical user interface (GUI), AI Model Blockly, which effectively addresses software compatibility issues. Blockly is a JavaScript library that runs in the web client. It reduces programming complexity by letting users work in a visual programming language (VPL) and combine customized blocks.
Blockly converts the blocks into the Neural Network Exchange Format (NNEF), and Node.js server-side software then transfers data between the user and the developer.
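To make the block-to-NNEF step concrete, here is a minimal sketch of turning a list of block records into NNEF-style text. It is not the thesis implementation, which runs as JavaScript in the browser; Python is used here only for illustration. The record fields (`op`, `name`, `inputs`, `attrs`), the function names, and the `lenet_fragment` graph are all assumptions made for this example, and the output has not been checked against an NNEF parser.

```python
# Hypothetical block records standing in for what a Blockly workspace might export.
# The field names ("op", "name", "inputs", "attrs") are assumptions for this sketch.


def format_value(value):
    """Render a Python value as an NNEF-style literal."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, str):
        return "'" + value + "'"
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(format_value(v) for v in value) + "]"
    return str(value)


def blocks_to_nnef(graph_name, blocks, outputs):
    """Emit one assignment per block inside an NNEF-style graph definition."""
    # Blocks with the "external" operation become the graph's input parameters.
    external = [b["name"] for b in blocks if b["op"] == "external"]
    lines = ["version 1.0;", ""]
    lines.append(
        "graph %s( %s ) -> ( %s )"
        % (graph_name, ", ".join(external), ", ".join(outputs))
    )
    lines.append("{")
    for block in blocks:
        # Positional tensor inputs come first, then named attributes.
        args = list(block.get("inputs", []))
        args += [
            "%s = %s" % (key, format_value(val))
            for key, val in block.get("attrs", {}).items()
        ]
        lines.append("    %s = %s(%s);" % (block["name"], block["op"], ", ".join(args)))
    lines.append("}")
    return "\n".join(lines)


blocks = [
    {"op": "external", "name": "input", "attrs": {"shape": [1, 1, 28, 28]}},
    {"op": "variable", "name": "filter1",
     "attrs": {"shape": [6, 1, 5, 5], "label": "conv1/filter"}},
    {"op": "conv", "name": "conv1", "inputs": ["input", "filter1"]},
    {"op": "relu", "name": "output", "inputs": ["conv1"]},
]
nnef_text = blocks_to_nnef("lenet_fragment", blocks, ["output"])
print(nnef_text)
```

Keeping the blocks as plain records and generating the text in a single pass means the server only ever receives a text file, which fits the file-transfer role the abstract gives to Node.js.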
We use the Node.js API and Socket.io to transfer different kinds of files on the server side, and we can also connect to different servers over SSH, which makes integration more flexible.
We then transfer the NNEF description to the server through the Node.js API. On the server side, we load the NNEF description into Open Neural Network Exchange (ONNX) and convert it to a trained-model format. The model is then loaded into NNVM, which standardizes the computation graph. Finally, it is loaded into TVM for optimization.
We combine Blockly with TVM and experiment with a variety of deep learning models (e.g., LeNet and AlexNet) and TVM schedule APIs. For example, split, fuse, reorder, vectorization, parallel, and tile can be set as schedule parameters through the Blockly GUI, which yields different kinds of applications and optimizations.
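The schedule primitives named above are loop transformations. The sketch below shows in plain Python what split, fuse, reorder, and tile do to a two-level loop nest. It does not call TVM, and the function names are chosen only for this illustration. Vectorization and parallel are left out because they change how iterations are executed, not which index pairs are visited.

```python
def baseline(n, m):
    """Visit every (i, j) pair in the original row-major loop order."""
    return [(i, j) for i in range(n) for j in range(m)]


def split_inner(n, m, factor):
    """split: break the j loop into an outer loop and an inner loop of length `factor`."""
    visited = []
    for i in range(n):
        for j_outer in range((m + factor - 1) // factor):
            for j_inner in range(factor):
                j = j_outer * factor + j_inner
                if j < m:  # guard the tail when factor does not divide m
                    visited.append((i, j))
    return visited


def fuse(n, m):
    """fuse: merge the i and j loops into a single loop of length n * m."""
    return [divmod(fused, m) for fused in range(n * m)]


def reorder(n, m):
    """reorder: swap the two loops so that j becomes the outer loop."""
    return [(i, j) for j in range(m) for i in range(n)]


def tile(n, m, tile_i, tile_j):
    """tile: split both loops, then iterate tile by tile."""
    visited = []
    for i_outer in range(0, n, tile_i):
        for j_outer in range(0, m, tile_j):
            for i in range(i_outer, min(i_outer + tile_i, n)):
                for j in range(j_outer, min(j_outer + tile_j, m)):
                    visited.append((i, j))
    return visited


reference = baseline(4, 6)
# split and fuse keep the visiting order; reorder and tile change only the order.
print(split_inner(4, 6, 4) == reference)
print(fuse(4, 6) == reference)
print(sorted(reorder(4, 6)) == reference)
print(sorted(tile(4, 6, 2, 3)) == reference)
```

Every variant visits the same set of index pairs, so the computed result is unchanged. Only the traversal order differs, and that order is what affects cache behavior and how readily the loops can be vectorized or run in parallel.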
Table of contents:
Abstract
Contents
List of Figures
List of Listings
List of Tables
1 Introduction
1.1 Introduction
1.2 Overview of the Thesis
2 Background Architectures
2.1 Blockly
2.2 Node.js
2.3 NNEF
2.4 ONNX
2.5 TVM
3 Design and Implementation
3.1 Client-Side Application
3.2 Server implementation
3.3 TVM scheduling
3.4 Execution architecture diagram
4 Experimental results
5 Conclusion
5.1 Conclusion
5.2 Future Work
6 Bibliography
(The full text has not been authorized for public access.)