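Author (Chinese): 簡士桓
Author (English): Chien, Shih-Huan
Thesis title (Chinese): ViennaCL++: 針對機器學習運用之線性代數加速程式庫建立 OpenCL C++ 流程
Thesis title (English): ViennaCL++: Enabling OpenCL C++ Flow in Linear Algebraic Acceleration Library for Machine Learning
Advisor (Chinese): 李政崑
Advisor (English): Lee, Jenq-Kuen
Committee members (Chinese): 蘇泓萌, 陳鵬升
Committee members (English): Su, Hong-Men; Chen, Peng-Sheng
Degree: Master's
University: National Tsing Hua University
Department: Institute of Information Systems and Applications
Student ID: 105065528
Year of publication (ROC calendar): 107 (2018)
Academic year of graduation: 106
Language: English
Number of pages: 37
Keywords: Heterogeneous Computing, OpenCL, C++, SPIR-V, Eigen, ViennaCL, TensorFlow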
Heterogeneous computing on devices such as CPUs, GPUs, DSPs, FPGAs, and other specialized hardware accelerators has received great attention in recent years because of the growing demand for high-performance computing. Heterogeneous multi-core systems can greatly benefit compute-intensive applications such as deep learning training and inference. Today, the deep learning application ecosystem relies heavily on framework libraries to handle the underlying implementation of deep learning and machine learning computations. TensorFlow is one such machine learning framework library; it uses Eigen, a C++ header library for linear algebraic acceleration, for its core computational kernels. OpenCL, in turn, is an open standard for parallel programming on heterogeneous multi-core systems, and its latest specification supports the advanced OpenCL C++ kernel language.
This thesis presents a new software flow that enables heterogeneous computing through OpenCL and OpenCL C++ on TensorFlow/Eigen using the newly proposed ViennaCL++ linear algebraic library. ViennaCL++ extends ViennaCL, a C++ linear algebraic acceleration library that supports an OpenCL backend and data interfacing with Eigen. ViennaCL++ uses the state-of-the-art OpenCL 2.1/2.2 standards together with the SPIR-V flow and the OpenCL C++ kernel language. We also present enhancements to the OpenCL C++ computational kernels in ViennaCL++, including code refactoring, function templates, vector/matrix class templates, move semantics, and shared virtual memory. Our evaluation shows that, compared with the baseline, the OpenCL C++ kernels in ViennaCL++ perform similarly on level 1 BLAS benchmarks, run 3 to 12 times faster on level 2 BLAS benchmarks, and run 30 to 67 times faster on level 3 BLAS benchmarks, demonstrating that our scheme effectively enables acceleration of TensorFlow/Eigen through OpenCL.
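The three BLAS levels named in the evaluation correspond to vector-vector, matrix-vector, and matrix-matrix operations. The following is a minimal host-side sketch of one representative operation per level, written as plain C++ function templates over `std::vector`. It is an illustration only: the names `axpy`, `gemv`, and `gemm` follow common BLAS naming, the row-major storage layout is an assumption, and none of this is taken from the thesis, ViennaCL++, or an OpenCL C++ kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Level 1 BLAS (vector-vector): y <- alpha * x + y.
// Illustrative helper, not part of ViennaCL++.
template <typename T>
void axpy(T alpha, const std::vector<T>& x, std::vector<T>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] += alpha * x[i];
    }
}

// Level 2 BLAS (matrix-vector): returns A * x.
// A is rows x cols, stored row-major (an assumption of this sketch).
template <typename T>
std::vector<T> gemv(const std::vector<T>& A, const std::vector<T>& x,
                    std::size_t rows, std::size_t cols) {
    assert(A.size() == rows * cols && x.size() == cols);
    std::vector<T> y(rows, T(0));
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            y[i] += A[i * cols + j] * x[j];
        }
    }
    return y;
}

// Level 3 BLAS (matrix-matrix): returns A * B.
// A is m x k and B is k x n, both row-major; the result is m x n.
template <typename T>
std::vector<T> gemm(const std::vector<T>& A, const std::vector<T>& B,
                    std::size_t m, std::size_t k, std::size_t n) {
    assert(A.size() == m * k && B.size() == k * n);
    std::vector<T> C(m * n, T(0));
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t p = 0; p < k; ++p) {
            for (std::size_t j = 0; j < n; ++j) {
                C[i * n + j] += A[i * k + p] * B[p * n + j];
            }
        }
    }
    return C;
}
```

For square problems of size n, the level 1 routine performs on the order of n operations on n data elements, the level 2 routine n² operations on n² elements, and the level 3 routine n³ operations on n² elements, so level 3 does the most arithmetic per element loaded.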
Abstract
Contents
List of Figures
List of Tables
1 Introduction
1.1 Introduction
1.2 Overview of the Thesis
2 Background
2.1 OpenCL
2.2 SPIR-V
2.3 TensorFlow
2.4 ViennaCL
3 The Design and Workflow of ViennaCL++
3.1 Software Flow of ViennaCL++
3.2 OpenCL C++ through SPIR-V Compilation Flow
3.3 Interaction of ViennaCL++ with TensorFlow/Eigen
4 Enhancement of Computational Kernels
4.1 Cross-language Code Refactoring of Computation Kernels
4.1.1 OpenCL C++ Standard Library Headers
4.1.2 Explicit Address Space Classes
4.1.3 Refactoring for Change of Naming Conventions
4.2 OpenCL C++ Function Templates
4.3 OpenCL C++ Vector/Matrix Class Templates
4.4 Move Semantics
4.5 Enabling Shared Virtual Memory in ViennaCL++
5 Experimental Methodology and Results
5.1 Experimental Environment
5.2 Experimental Methodology
5.3 Experimental Results
6 Conclusion and Future Works
6.1 Conclusion
6.2 Future Works
[1] SPIR Overview, Khronos Group. [Online]. Available: https://www.khronos.org/spir/
[2] The OpenCL Specification, version 1.2, Khronos OpenCL Working Group, 2012. [Online]. Available: http://www.khronos.org/registry/cl/spec/opencl-1.2.pdf
[3] The OpenCL Specification, version 2.0, Khronos OpenCL Working Group, 2015. [Online]. Available: https://www.khronos.org/registry/cl/specs/opencl-2.0.pdf
[4] CUDA C Programming Guide, NVIDIA, 2016. [Online]. Available: http://docs.nvidia.com/cuda/cuda-c-programming-guide/
[5] The SYCL Specification, version 1.2, Khronos OpenCL Working Group, 2015. [Online]. Available: https://www.khronos.org/registry/SYCL/specs/sycl-1.2.pdf
[6] L. Dagum and R. Menon, “OpenMP: an industry standard API for shared-memory programming,” Computational Science & Engineering, IEEE, vol. 5, no. 1, pp. 46–55, 1998.
[7] The OpenCL C++ Specification, version 1.0, Khronos OpenCL Working Group, 2018. [Online]. Available: https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_Cxx.pdf
[8] The OpenCL Specification, version 2.2, Khronos OpenCL Working Group, 2018. [Online]. Available: https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_API.pdf
[9] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. J. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Józefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. G. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. A. Tucker, V. Vanhoucke, V. Vasudevan, F. B. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: Large-scale machine learning on heterogeneous distributed systems,” CoRR, vol. abs/1603.04467, 2016. [Online]. Available: http://arxiv.org/abs/1603.04467
[10] B. Jacob and G. Guennebaud, Eigen. [Online]. Available: http://eigen.tuxfamily.org/index.php
[11] K. Rupp, P. Tillet, F. Rudolf, J. Weinbub, A. Morhammer, T. Grasser, A. Jüngel, and S. Selberherr, “ViennaCL—linear algebra library for multi- and many-core architectures,” SIAM Journal on Scientific Computing, vol. 38, no. 5, pp. S412–S439, 2016. [Online]. Available: https://doi.org/10.1137/15M1026419
[12] OpenCL C++ Compiler Reference Implementation, Khronos OpenCL Working Group, 2018. [Online]. Available: https://github.com/KhronosGroup/SPIR/tree/spirv-1.1
[13] OpenCL C++ Standard Library Reference Implementation, Khronos OpenCL Working Group, 2018. [Online]. Available: https://github.com/KhronosGroup/libclcxx
[14] Bazel, Google. [Online]. Available: https://bazel.build/