
Detailed Record

Author (Chinese): 張展榕
Author (English): Chang, Chan-Jung
Title (Chinese): ECS2: 利用直接且平行的I/O路徑提升使用GPU加速糾刪碼的儲存系統效能
Title (English): ECS2: A Fast Erasure Coding Library for GPU-Accelerated Storage Systems With Parallel & Direct IO
Advisor (Chinese): 周志遠
Advisor (English): Chou, Jerry
Committee Members (Chinese): 李哲榮、賴冠州
Committee Members (English): Lee, Che-Rung; Lai, Kuan-Chou
Degree: Master's
University: National Tsing Hua University (國立清華大學)
Department: Computer Science (資訊工程學系)
Student ID: 107062522
Publication Year (ROC calendar): 109 (2020)
Graduation Academic Year: 108
Language: English
Pages: 29
Keywords (Chinese): 儲存裝置、糾刪碼、可靠性、效能、平行I/O
Keywords (English): Storage system; Erasure code; Reliability; Performance; Parallel I/O
Statistics:
  • Recommendations: 1
  • Views: 444
  • Rating: *****
  • Downloads: 6
  • Bookmarks: 0
Abstract (Chinese):
As data volumes grow at a rapid rate, there is an urgent need for reliable, large-scale, and cost-effective storage systems. Erasure coding has drawn increasing attention because it preserves data reliability at a higher storage cost-efficiency, and it has been widely adopted in many distributed, large-scale storage systems such as Azure cloud storage and HDFS. The price of adopting erasure coding, however, is higher computational complexity. Many studies have shown that erasure-coding computation can be greatly accelerated on GPUs, which in turn shifts the performance bottleneck to the data transfer between storage devices and the GPU. In this work, we design and implement ECS2, a fast GPU-accelerated erasure coding library that lets users strengthen data protection through a storage-system-like programming interface. Using the latest GPUDirect technology from Nvidia GPUs, the library shortens the I/O path by bypassing the CPU and host memory, reducing both the computation and I/O overhead. Using synthetic I/O traces derived from real storage system traces, we verify that GPUDirect reduces I/O latency by 10%~20% and improves overall throughput by up to 70%.
Abstract (English):
As data volume keeps increasing at a rapid rate, there is an urgent need for large, reliable, and cost-effective storage systems. Erasure coding has drawn increasing attention because of its ability to ensure data reliability with higher storage efficiency, and it has been widely adopted in many distributed and large-scale storage systems, such as Azure cloud storage and HDFS. However, the storage efficiency of erasure coding comes at the price of higher computing complexity. While many studies have shown that the coding computations can be significantly accelerated using GPUs, the overhead of data transfer between storage devices and GPUs becomes a new performance bottleneck. In this work, we designed and implemented ECS2, a fast erasure coding library for GPU-accelerated storage that lets users enhance their data protection with transparent IO performance and a storage-system-like programming interface. By taking advantage of the latest GPUDirect technology supported on Nvidia GPUs, our library removes the CPU and host-memory copies from the IO path, so that both the computing and IO overhead of coding can be minimized. Using synthetic IO workloads based on real storage system traces, we show that IO latency can be reduced by 10%~20% with GPUDirect technology, and the overall IO throughput of a storage system can be improved by up to 70%.
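The abstract attributes ECS2's speed to offloading the coding arithmetic to the GPU. As a rough, hypothetical illustration of why this workload suits a GPU, the CUDA sketch below computes one XOR parity block over k data blocks: every output byte is independent, which is the same structure Reed-Solomon encoding over GF(2^8) has once its multiply tables are in place. Nothing here is ECS2 code; the kernel, block geometry, and sizes are invented for the example.

#include <cuda_runtime.h>
#include <stdint.h>
#include <stdio.h>

// Hypothetical sketch: one parity block as the XOR of k data blocks.
// Real Reed-Solomon coding replaces the XOR with a table-driven GF(2^8)
// multiply-accumulate, but the parallel structure is identical: each
// output byte depends only on the bytes at the same offset in each block.
__global__ void xor_parity(const uint8_t *data,   // k blocks, back to back
                           uint8_t *parity,       // one output block
                           int k, size_t block_size)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i >= block_size) return;
    uint8_t p = 0;
    for (int b = 0; b < k; ++b)
        p ^= data[b * block_size + i];            // fold the blocks together
    parity[i] = p;
}

int main(void)
{
    const int k = 4;                    // data blocks per stripe (example)
    const size_t block_size = 1 << 20;  // 1 MiB per block (example)

    uint8_t *d_data, *d_parity;
    cudaMalloc(&d_data, k * block_size);
    cudaMalloc(&d_parity, block_size);
    cudaMemset(d_data, 0xAB, k * block_size);     // stand-in payload

    const int threads = 256;
    const int blocks = (int)((block_size + threads - 1) / threads);
    xor_parity<<<blocks, threads>>>(d_data, d_parity, k, block_size);
    cudaDeviceSynchronize();

    cudaFree(d_data);
    cudaFree(d_parity);
    return 0;
}

Each thread reads k bytes and writes one, with no synchronization, so the kernel is bandwidth-bound rather than compute-bound; this is exactly why, as the abstract observes, the bottleneck moves from the coding computation to getting the data onto the GPU in the first place.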
Table of Contents:
1 Introduction 1
2 Related Works 4
3 Approach 5
3.1 ECS2 System Architecture . . . . . . . . . . . . . . . . . . . . . . 5
3.2 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 7
4 Implementations 10
4.1 Direct Memory Copy Technique . . . . . . . . . . . . . . . . . . . 10
4.2 GPU Accelerated Erasure Coding . . . . . . . . . . . . . . . . . . 12
4.3 ECS2 Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
5 Experiments 16
5.1 Testbed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
5.2 Performance Analysis on IO Behavior . . . . . . . . . . . . . . . . 17
5.3 Performance Analysis on Erasure Code Configuration . . . . . . . . 19
5.4 Performance Analysis on System Architecture . . . . . . . . . . . . 20
5.5 Performance Analysis on Real Workload . . . . . . . . . . . . . . . 22
6 Conclusions 25
References 26
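Chapter 4.1 ("Direct Memory Copy Technique") is not reproduced on this page, and the abstract names GPUDirect only generically. As one hedged illustration of what a direct storage-to-GPU read looks like, the C sketch below uses NVIDIA's cuFile (GPUDirect Storage) API to DMA a file region straight into device memory with no host staging buffer; the specific API choice, file path, and sizes are assumptions made for this example, not ECS2's actual implementation.

// Hedged sketch (not ECS2 code): read a file region from NVMe directly
// into GPU memory with NVIDIA's cuFile (GPUDirect Storage) API, skipping
// the CPU copy and the host bounce buffer on the IO path.
#define _GNU_SOURCE                      // for O_DIRECT
#include <cufile.h>
#include <cuda_runtime.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const size_t size = 1 << 20;                  // 1 MiB read (example)
    const char *path = "/mnt/nvme/stripe.bin";    // hypothetical file

    cuFileDriverOpen();                           // bring up the GDS driver

    int fd = open(path, O_RDONLY | O_DIRECT);     // GDS requires O_DIRECT
    if (fd < 0) { perror("open"); return 1; }

    CUfileDescr_t descr;
    memset(&descr, 0, sizeof(descr));
    descr.handle.fd = fd;
    descr.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;
    CUfileHandle_t handle;
    cuFileHandleRegister(&handle, &descr);

    void *dev_buf;
    cudaMalloc(&dev_buf, size);
    cuFileBufRegister(dev_buf, size, 0);          // pin buffer for DMA

    // Storage-to-GPU DMA: the data never lands in host memory.
    ssize_t n = cuFileRead(handle, dev_buf, size, 0 /*file off*/, 0 /*dev off*/);
    printf("read %zd bytes straight into GPU memory\n", n);

    cuFileBufDeregister(dev_buf);
    cudaFree(dev_buf);
    cuFileHandleDeregister(handle);
    close(fd);
    cuFileDriverClose();
    return 0;
}

Compared with a conventional read() followed by cudaMemcpy(), this path removes both the CPU copy and the host bounce buffer, which is the kind of IO-latency saving the abstract reports for GPUDirect.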
 
 
 
 