Accelerate Reed-Solomon Codes on GPUs | National Tsing Hua University Theses and Dissertations Full-Text System


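Author (Chinese):	袁帥
Thesis title (English):	Accelerate Reed-Solomon Codes on GPUs
Thesis title (Chinese):	應用GPGPU加速Reed-Solomon Erasure Code的編解碼 (Accelerating Reed-Solomon Erasure Code Encoding and Decoding with GPGPU)
Advisor (Chinese):	周志遠
Oral defense committee (Chinese):	李哲榮, 林俊淵
Degree:	Master's
Institution:	National Tsing Hua University
Department:	Department of Computer Science
Student ID:	101062468
Year of publication (ROC calendar):	103 (2014 CE)
Academic year of graduation:	102
Language:	English
Number of pages:	47
Keywords (translated from Chinese):	Reed-Solomon codes, erasure codes, GPGPU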

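Reed-Solomon codes are a redundancy solution widely used in cloud storage systems. Compared with replication, the traditional redundancy solution, they guarantee the system's fault tolerance while substantially reducing storage overhead. However, Reed-Solomon encoding and decoding are computationally complex and consume a large amount of computing time. In this thesis, we use the GPU as an accelerator and explore several techniques for accelerating Reed-Solomon encoding and decoding on GPUs. We also implement a GPU version of Reed-Solomon codes in CUDA and evaluate its performance. For comparison, we measure the performance of Jerasure, the best CPU implementation known to us, on an Intel Xeon CPU. Our optimized GPU version achieves a speedup of more than 14 times.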
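The encoding and decoding the abstract refers to can be illustrated with a small CPU-side sketch. This is not the thesis's CUDA implementation. It is a minimal Python illustration under assumptions that do not come from this record: the primitive polynomial 0x11D for GF(2^8), Cauchy-style parity rows, and all function names are chosen here for illustration only. It shows the two Galois field multiplication styles named in the table of contents (loop-based and log/exp table-based), encoding as a matrix multiplication, and decoding by inverting the rows of the coding matrix that correspond to the surviving chunks.

```python
# Minimal sketch of Reed-Solomon erasure coding over GF(2^8).
# Assumptions (illustrative, not taken from the thesis): primitive polynomial
# 0x11D, Cauchy parity rows, and every function name below.
PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1

def gf_mul_loop(a, b):
    """Loop-based (Russian peasant) multiplication in GF(2^8)."""
    result = 0
    while b:
        if b & 1:
            result ^= a        # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY     # reduce modulo the primitive polynomial
        b >>= 1
    return result

# Log/exp tables built from the generator element 2.
EXP = [0] * 510
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x = gf_mul_loop(x, 2)
for i in range(255, 510):
    EXP[i] = EXP[i - 255]      # duplicated so summed logs need no modulo

def gf_mul(a, b):
    """Table-based multiplication: a * b = exp(log a + log b)."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

def gf_inv(a):
    """Multiplicative inverse of a nonzero element."""
    return EXP[255 - LOG[a]]

def mat_vec(matrix, chunks):
    """Multiply a coding matrix by a list of equally sized chunks."""
    out = []
    for row in matrix:
        acc = [0] * len(chunks[0])
        for coeff, chunk in zip(row, chunks):
            for pos, byte in enumerate(chunk):
                acc[pos] ^= gf_mul(coeff, byte)
        out.append(acc)
    return out

def cauchy_rows(k, m):
    """m parity rows with entries 1 / (x_i + y_j), x_i = k + i, y_j = j."""
    return [[gf_inv((k + i) ^ j) for j in range(k)] for i in range(m)]

def invert(matrix):
    """Gauss-Jordan inversion over GF(2^8); subtraction is XOR."""
    n = len(matrix)
    aug = [list(row) + [int(r == c) for c in range(n)]
           for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = gf_inv(aug[col][col])
        aug[col] = [gf_mul(scale, v) for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [v ^ gf_mul(factor, p)
                          for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def encode(data_chunks, m):
    """Compute m parity chunks from k data chunks."""
    return mat_vec(cauchy_rows(len(data_chunks), m), data_chunks)

def decode(k, m, survivors):
    """Rebuild the k data chunks from any k surviving chunks.

    survivors maps chunk index to chunk; 0..k-1 are data, k..k+m-1 parity.
    """
    identity = [[int(r == c) for c in range(k)] for r in range(k)]
    generator = identity + cauchy_rows(k, m)
    picked = sorted(survivors)[:k]
    sub = [generator[i] for i in picked]
    return mat_vec(invert(sub), [survivors[i] for i in picked])

data = [list(b"GPUs"), list(b"code"), list(b"eras")]
parity = encode(data, m=2)
# Simulate losing data chunks 0 and 2; one data chunk and both parities remain.
survivors = {1: data[1], 3: parity[0], 4: parity[1]}
recovered = decode(3, 2, survivors)
assert recovered == data
```

In this example, k = 3 data chunks and m = 2 parity chunks tolerate the loss of any two chunks while storing (k + m) / k = 5/3 of the original data. Triple replication, which also tolerates two losses, stores 3 times the original data. The inner loops of `mat_vec` are independent per byte position, which is the kind of data parallelism a GPU implementation can exploit.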

1 Introduction
2 Related Works
3 Background
3.1 Reed-Solomon Coding Mechanism
3.2 Brief Introduction of Galois Field
4 Accelerating Operations over Galois Field
4.1 GPU Implementation: Loop-based or Table-based?
4.1.1 Overview of the Loop-based Method
4.1.2 Overview of the Table-based Methods
4.1.3 Comparison between the Loop-based and the Log&exp Table-based Method
4.2 Further Improvement of the Log&exp Table-based Method
5 Accelerating Matrix Multiplication
5.1 Square-Tiling Algorithm
5.2 Generalized Tiling Algorithm
5.3 Further Improvement of Tiling Algorithm
6 Accelerating Decoding Matrix Generation
7 Reducing Data Transfer Overhead
7.1 Using Pinned Host Memory
7.2 Using CUDA Streaming
8 Experiment
8.1 Experiment Setup
8.2 Evaluation Metrics
8.3 Overall Performance Evaluation
8.3.1 Step-by-step Improvement
8.3.2 GPU vs. CPU
8.4 Accelerating Operations over Galois Field
8.4.1 GPU Implementation: Loop-based or Table-based?
8.4.2 Further Improvement of the Log&exp Table-based Method
8.5 Accelerating Matrix Multiplication
8.6 Reducing Data Transfer Overhead
8.6.1 Using Pinned Host Memory
8.6.2 Using CUDA Streaming
9 Conclusion

[1] D. Borthakur, R. Schmidt, R. Vadali, S. Chen, and P. Kling. HDFS RAID. In Hadoop User Group Meeting, 2010.
[2] X. Chu and K. Zhao. Practical random linear network coding on GPUs. In GPU Solutions to Multi-scale Problems in Science and Engineering, pages 115-130. Springer, 2013.
[3] CUDA C Programming Guide. NVIDIA Corporation, July 2012.
[4] M. L. Curry, A. Skjellum, H. L. Ward, and R. Brightwell. Accelerating Reed-Solomon coding in RAID systems with GPUs. In Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on, pages 1-6. IEEE, 2008.
[5] A. Fikes. Storage architecture and challenges. Talk at the Google Faculty Summit, 2010.
[6] D. Ford, F. Labelle, F. I. Popovici, M. Stokely, V.-A. Truong, L. Barroso, C. Grimes, and S. Quinlan. Availability in globally distributed storage systems. In OSDI, pages 61-74, 2010.
[7] S. Ghemawat, H. Gobioff, and S.-T. Leung. The Google file system. In ACM SIGOPS Operating Systems Review, volume 37, pages 29-43. ACM, 2003.
[8] B. J. Gimmestad. The Russian peasant multiplication algorithm: A generalization. The Mathematical Gazette, 75(472):169-171, 1991.
[9] K. M. Greenan, E. L. Miller, and T. J. Schwarz. Optimizing Galois field arithmetic for diverse processor architectures and applications. In Modeling, Analysis and Simulation of Computers and Telecommunication Systems, 2008. MASCOTS 2008. IEEE International Symposium on, pages 1-10. IEEE, 2008.
[10] C. Huang and L. Xu. Fast software implementation of finite field operations. Technical report, Citeseer, 2003.
[11] S. Kalcher and V. Lindenstruth. Accelerating Galois field arithmetic for Reed-Solomon erasure codes in storage applications. In Cluster Computing (CLUSTER), 2011 IEEE International Conference on, pages 290-298. IEEE, 2011.
[12] CUDA C Best Practices Guide. NVIDIA, Santa Clara, CA, 2012.
[13] J. S. Plank, S. Simmerman, and C. D. Schuman. Jerasure: A library in C/C++ facilitating erasure coding for storage applications, version 1.2. University of Tennessee, Tech. Rep. CS-08-627, 23, 2008.
[14] J. S. Plank and L. Xu. Optimizing Cauchy Reed-Solomon codes for fault-tolerant network storage applications. In Network Computing and Applications, 2006. NCA 2006. Fifth IEEE International Symposium on, pages 173-180. IEEE, 2006.
[15] I. Reed and G. Solomon. Polynomial codes over certain finite fields. Journal of the Society for Industrial & Applied Mathematics, 8(2):300-304, 1960.
[16] T. S. Schwarz and E. L. Miller. Store, forget, and check: Using algebraic signatures to check remotely administered storage. In Distributed Computing Systems, 2006. ICDCS 2006. 26th IEEE International Conference on, pages 12-12. IEEE, 2006.
[17] H. Shojania and B. Li. Pushing the envelope: Extreme network coding on the GPU. In Distributed Computing Systems, 2009. ICDCS'09. 29th IEEE International Conference on, pages 490-499. IEEE, 2009.
[18] H. Shojania, B. Li, and X. Wang. Nuclei: GPU-accelerated many-core network coding. In INFOCOM 2009, IEEE, pages 459-467. IEEE, 2009.
[19] K. Shvachko, H. Kuang, S. Radia, and R. Chansler. The Hadoop distributed file system. In Mass Storage Systems and Technologies (MSST), 2010 IEEE 26th Symposium on, pages 1-10. IEEE, 2010.
[20] W. A. Wulf and S. A. McKee. Hitting the memory wall: Implications of the obvious. ACM SIGARCH Computer Architecture News, 23(1):20-24, 1995.

(Full text is restricted to internal viewing.)