
Detailed Record

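Author (Chinese): 袁帥
Thesis title (English): Accelerate Reed-Solomon Codes on GPUs
Thesis title (Chinese): 應用GPGPU加速Reed-Solomon Erasure Code的編解碼 (Accelerating Reed-Solomon erasure code encoding and decoding with GPGPU)
Advisor (Chinese): 周志遠
Oral defense committee (Chinese): 李哲榮, 林俊淵
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 101062468
Year of publication (ROC): 103 (2014)
Academic year of graduation: 102
Language: English
Number of pages: 47
Keywords (Chinese): 里德-所羅門碼 (Reed-Solomon codes); 抹除碼 (erasure codes); 通用圖形處理器 (GPGPU)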
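Reed-Solomon codes are a redundancy solution widely used in cloud storage systems. Compared with replication, the traditional redundancy solution, they preserve the system's fault tolerance while effectively reducing storage overhead. However, encoding and decoding Reed-Solomon codes is computationally complex and consumes a large amount of computation time. In this thesis, we use the GPU as an accelerator and explore several techniques for using it to accelerate Reed-Solomon encoding and decoding. We also implement a GPU version of Reed-Solomon codes in CUDA and evaluate its performance. For comparison, we measure the performance of Jerasure, the best CPU implementation known to us, on an Intel Xeon CPU. Our optimized GPU version ultimately achieves a speedup of more than 14x.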
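As a rough illustration of why encoding is compute-heavy, the sketch below shows the two Galois-field multiplication styles named in the thesis outline (loop-based and log&exp table-based) and uses them in a small erasure-coding example. It is not the thesis's CUDA implementation. The field GF(2^8), the primitive polynomial 0x11D, the Cauchy-style coding matrix, the use of Python, and every function name are assumptions made for illustration only.

```python
# Illustrative sketch only; all parameters and names are assumptions,
# not taken from the thesis. Field: GF(2^8), primitive polynomial 0x11D.
PRIM_POLY = 0x11D


def gf_mul_loop(a, b):
    """Loop-based (shift-and-add) multiplication in GF(2^8)."""
    product = 0
    while b:
        if b & 1:
            product ^= a          # addition in GF(2^w) is XOR
        a <<= 1
        if a & 0x100:             # reduce when the degree reaches 8
            a ^= PRIM_POLY
        b >>= 1
    return product


# Build log/exp tables using 2 as the generator of the multiplicative group.
EXP = [0] * 510                   # doubled so LOG[a] + LOG[b] needs no modulo
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x = gf_mul_loop(_x, 2)
for _i in range(255, 510):
    EXP[_i] = EXP[_i - 255]


def gf_mul_table(a, b):
    """Log&exp table-based multiplication in GF(2^8)."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    """Multiplicative inverse of a non-zero element."""
    return EXP[255 - LOG[a]]


def cauchy_coeff(i, t, m):
    """Entry (i, t) of a Cauchy-style coding matrix with m parity rows."""
    return gf_inv(i ^ (m + t))


def encode(data, m):
    """Compute m parity chunks from k equally sized data chunks."""
    size = len(data[0])
    parity = []
    for i in range(m):
        row = bytearray(size)
        for t, chunk in enumerate(data):
            c = cauchy_coeff(i, t, m)
            for j in range(size):
                row[j] ^= gf_mul_table(c, chunk[j])
        parity.append(bytes(row))
    return parity


def recover_one(data, lost, parity0, m):
    """Rebuild data[lost] from parity chunk 0 and the surviving data chunks."""
    acc = bytearray(parity0)
    for t, chunk in enumerate(data):
        if t == lost:
            continue              # the lost chunk is unavailable
        c = cauchy_coeff(0, t, m)
        for j in range(len(acc)):
            acc[j] ^= gf_mul_table(c, chunk[j])
    inv = gf_inv(cauchy_coeff(0, lost, m))
    return bytes(gf_mul_table(inv, v) for v in acc)
```

Every output byte costs one field multiplication and one XOR per data chunk, which is the matrix-multiplication workload a GPU can parallelize. Recovering several lost chunks at once would require inverting a submatrix of the coding matrix, which this sketch omits.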
1 Introduction
2 Related Works
3 Background
3.1 Reed-Solomon Coding Mechanism
3.2 Brief Introduction of Galois Field
4 Accelerating Operations over Galois Field
4.1 GPU Implementation: Loop-based or Table-based?
4.1.1 Overview of the Loop-based Method
4.1.2 Overview of the Table-based Methods
4.1.3 Comparison between the Loop-based and the Log&exp Table-based Method
4.2 Further Improvement of the Log&exp Table-based Method
5 Accelerating Matrix Multiplication
5.1 Square-Tiling Algorithm
5.2 Generalized Tiling Algorithm
5.3 Further Improvement of Tiling Algorithm
6 Accelerating Decoding Matrix Generation
7 Reducing Data Transfer Overhead
7.1 Using Pinned Host Memory
7.2 Using CUDA Streaming
8 Experiment
8.1 Experiment Setup
8.2 Evaluation Metrics
8.3 Overall Performance Evaluation
8.3.1 Step-by-step Improvement
8.3.2 GPU vs. CPU
8.4 Accelerating Operations over Galois Field
8.4.1 GPU Implementation: Loop-based or Table-based?
8.4.2 Further Improvement of the Log&exp Table-based Method
8.5 Accelerating Matrix Multiplication
8.6 Reducing Data Transfer Overhead
8.6.1 Using Pinned Host Memory
8.6.2 Using CUDA Streaming
9 Conclusion
(The full text is restricted to internal access.)