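Author (Chinese): 萬子豪
Thesis title (Chinese): 以多核心圖形處理器實現雅可比-大衛森演算法 (Implementing the Jacobi-Davidson Algorithm on a Multi-core Graphics Processing Unit)
Advisor (Chinese): 陳人豪
Degree: Master's
Institution: 國立新竹教育大學 (National Hsinchu University of Education)
Department: Master's Program, Department of Applied Mathematics
Student ID: 10124210
Year of publication (ROC calendar): 103 (2014)
Academic year of graduation: 102
Language: Chinese
Chinese keywords: graphics processing unit; Jacobi-Davidson algorithm
Foreign-language keywords: Graphics Processing Unit; Jacobi-Davidson Method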
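Implementing the Jacobi-Davidson Algorithm on a Multi-core Graphics Processing Unit


Abstract (Chinese)
  The Jacobi-Davidson Method converges very quickly when solving large sparse eigenvalue problems. In recent years, however, problem sizes have grown steadily, so even with this fast convergence the computation still incurs a substantial research cost. Using a graphics processing unit (GPU) as a coprocessor to reduce that cost has therefore become increasingly important.
  This thesis investigates how to accelerate the Jacobi-Davidson Method with a GPU. The work covers basic linear algebra operations, namely matrix multiplication, the vector inner product, and the solution of large sparse linear systems, and analyzes performance before and after GPU acceleration.
  The results show that the GPU computations are correct. For the basic linear algebra operations, the GPU achieves a speedup of 1.95 to 4.638 times. The overall computation time of the Jacobi-Davidson Method, however, is close to that of the CPU version. Possible causes are the time spent on memory transfers and the relatively low clock rate of the GPU used in these experiments.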
Accelerating the Jacobi-Davidson Method Using a Multi-core Graphics Processing Unit



Abstract
  The Jacobi-Davidson Method (JDM) converges rapidly when solving
large sparse eigenvalue problems. However, because the matrices
involved are very large, the computation remains costly. This motivates
us to employ a graphics processing unit (GPU) to accelerate the JDM.
Within the framework of the Compute Unified Device Architecture
(CUDA), several linear algebra operations, including matrix-matrix
multiplication, the vector inner product, and the solution of a sparse
linear system, are accelerated on the GPU. To evaluate the performance
of our code, we run these operations and the overall JDM both with and
without the GPU. The results show that the solutions computed by the
GPU are correct. Moreover, the GPU versions of these linear algebra
operations achieve a speedup of 1.95 to 4.63 times over the CPU
versions. However, the performance of the overall JDM on the GPU is
comparable to that on the CPU. This may be due to the overhead of
memory transfers in our GPU code, or to the lower clock rate of our
GPU.
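
The building blocks named in the abstract can be illustrated with a short serial sketch. This is not the thesis code: it is a minimal Python illustration that assumes the matrix is stored in compressed sparse row (CSR) form and that the sparse linear solver is a plain Jacobi iteration (the table of contents lists a Jacobi Method benchmark in Section 4.6). The function names, tolerance, and the small test matrix are all hypothetical. The comments note where the independent work lies, which is the part a CUDA kernel would distribute across threads.

```python
# Illustrative serial sketch only. All names, the tolerance, and the
# 3x3 test matrix are assumptions made for this example.

def csr_matvec(indptr, indices, data, x):
    """Compute y = A x for a matrix A stored in CSR form.

    Each row is computed independently of the others, so a GPU version
    could assign one thread to each row.
    """
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        s = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            s += data[k] * x[indices[k]]
        y[i] = s
    return y


def dot(u, v):
    """Inner product of two vectors.

    On a GPU this is usually written as a parallel reduction.
    """
    return sum(a * b for a, b in zip(u, v))


def jacobi_solve(indptr, indices, data, b, tol=1e-10, max_iter=1000):
    """Solve A x = b with the Jacobi iteration x <- x + D^{-1} (b - A x).

    D is the diagonal of A. Returns the approximate solution and the
    number of updates performed.
    """
    n = len(b)
    diag = [0.0] * n
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            if indices[k] == i:
                diag[i] = data[k]
    if any(d == 0.0 for d in diag):
        raise ValueError("Jacobi iteration needs a nonzero diagonal")

    x = [0.0] * n
    for it in range(max_iter):
        ax = csr_matvec(indptr, indices, data, x)
        r = [b[i] - ax[i] for i in range(n)]      # residual b - A x
        if dot(r, r) ** 0.5 < tol:                # stop on small residual norm
            return x, it
        # Every component is updated independently of the others.
        x = [x[i] + r[i] / diag[i] for i in range(n)]
    return x, max_iter


# Hypothetical test matrix: tridiagonal with 4 on the diagonal and -1 off it.
# It is strictly diagonally dominant, so the Jacobi iteration converges.
INDPTR = [0, 2, 5, 7]
INDICES = [0, 1, 0, 1, 2, 1, 2]
DATA = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]

x_true = [1.0, 2.0, 3.0]
rhs = csr_matvec(INDPTR, INDICES, DATA, x_true)
x_approx, n_updates = jacobi_solve(INDPTR, INDICES, DATA, rhs)
```

In this sketch the matrix-vector product dominates each Jacobi step, which is consistent with benchmarking that operation separately. A GPU implementation would also have to copy the vectors between host and device memory, the overhead that the abstract names as a possible reason for the limited overall speedup.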
Table of Contents
Chapter 1 Introduction
  1.1 Research Background
  1.2 Research Motivation
  1.3 Research Objectives
  1.4 Thesis Organization
Chapter 2 CUDA Background
  2.1 CUDA
    2.1.1 CUDA Parallel Programming Model
    2.1.2 CUDA Memory Model
    2.1.3 NVIDIA GeForce GT 740M Hardware Overview
    2.1.4 CUDA Kernel
    2.1.5 CUDA Runtime API
    2.1.6 The __syncthreads() Function
  2.2 CUDA Parallelization Methods
Chapter 3 Parallelizing the Jacobi-Davidson Method
  3.1 Introduction to the Jacobi-Davidson Method
  3.2 The Jacobi-Davidson Method
  3.3 Parallelizing the Jacobi-Davidson Method with CUDA
Chapter 4 Experimental Results
  4.1 Experimental Environment
  4.2 Experimental Procedure
  4.3 Test Problem
  4.4 Matrix-Vector Multiplication
  4.5 Vector Inner Product
  4.6 Jacobi Method
  4.7 Jacobi-Davidson Method
Chapter 5 Conclusion
References

List of Figures
Figure 1-1 Comparison of GPU and CPU peak floating-point performance
Figure 1-2 Comparison of GPU and CPU memory bandwidth
Figure 1-3 Large number of execution units
Figure 2-1 Parallel programming model
Figure 2-2 CUDA memory model
Figure 2-3 Parallel computation process
Figure 4-1 Line chart of GPU and CPU matrix-vector multiplication times
Figure 4-2 Line chart of GPU and CPU vector inner product times
Figure 4-3 Line chart of GPU and CPU Jacobi Method times
Figure 4-4 Line chart of GPU and CPU Jacobi-Davidson Method times

List of Tables
Table 2-1 Comparison of GPU and CPU times before and after optimization
Table 4-1 Overview of the CPU and GPU architectures
Table 4-2 Eigenvalue results
Table 4-3 Performance comparison of CPU and GPU matrix-vector multiplication
Table 4-4 Performance comparison of CPU and GPU vector inner product
Table 4-5 Performance comparison of the CPU and GPU Jacobi Method
Table 4-6 Performance comparison of the CPU and GPU Jacobi-Davidson Method