
Detailed Record

Author (Chinese): 葉庭安
Author (English): Yeh, Ting-An
Thesis Title (Chinese): KubeShare: 在容器雲中實現一等且共用的GPU資源管理提升系統效能與資源使用率
Thesis Title (English): KubeShare: A Framework to Support GPUs as First-Class and Shared Resources in Container Clouds for Maximizing System Throughput and Resource Utilization
Advisor (Chinese): 周志遠
Advisor (English): Chou, Jerry
Committee Members (Chinese): 李哲榮、洪士灝
Committee Members (English): Lee, Che-Rung; Hung, Shih-Hao
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 107062568
Publication Year (ROC): 109 (2020)
Graduation Academic Year: 108 (2019-2020)
Language: English
Number of Pages: 39
Keywords (Chinese): 雲計算、圖形處理器、容器、排程
Keywords (English): Cloud computing, GPU, Container, Scheduling
Abstract:
Containers have emerged as a new cloud technology, replacing virtual machines (VMs) for deploying and operating distributed applications. As a growing number of cloud applications, such as deep learning and high-performance computing workloads, come to rely on the high computing throughput of GPUs, efficiently supporting GPUs in container clouds has become essential. While GPU virtualization has been studied extensively for VMs, little work has been done for containers. A key challenge is the lack of support for sharing a GPU among multiple concurrent containers: with exclusive allocation, a GPU sits underutilized whenever a single application cannot saturate it, owing to the burstiness of GPU workloads and limited memory bandwidth. To overcome this issue, we designed and implemented KubeShare, which extends Kubernetes to support GPU sharing with fine-grained allocation; KubeShare is the first solution to make GPUs first-class resources for scheduling and allocation in Kubernetes. Using real deep learning workloads, we demonstrate that KubeShare significantly increases GPU utilization, improves overall system throughput by roughly 2x, and incurs less than 10% performance overhead during container initialization and execution.
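For readers who want a concrete picture of what fine-grained, first-class GPU allocation looks like in practice, the sketch below submits a fractional-GPU workload to a KubeShare-enabled cluster through the official Kubernetes Python client. It is a minimal illustration, not the authoritative API: the SharePod custom resource, its kubeshare.nthu/v1 API group, and the kubeshare/gpu_request, kubeshare/gpu_limit, and kubeshare/gpu_mem annotations follow the project's public repository (https://github.com/NTHU-LSALAB/KubeShare/), and the exact names may differ between versions.

    # Minimal sketch: creating a KubeShare "SharePod" that asks for half a GPU.
    # The custom-resource schema below is modeled on the KubeShare repository;
    # treat the group/version and annotation keys as illustrative.
    from kubernetes import client, config

    config.load_kube_config()  # authenticate with the local kubeconfig

    sharepod = {
        "apiVersion": "kubeshare.nthu/v1",
        "kind": "SharePod",
        "metadata": {
            "name": "mnist-trainer",
            "annotations": {
                # Fractional compute-time share: request 0.5 of a GPU,
                # capped at 1.0, with an explicit GPU memory budget in bytes.
                "kubeshare/gpu_request": "0.5",
                "kubeshare/gpu_limit": "1.0",
                "kubeshare/gpu_mem": "3221225472",  # 3 GiB
            },
        },
        "spec": {
            "containers": [{
                "name": "trainer",
                "image": "nvidia/cuda:10.0-base",
                "command": ["nvidia-smi", "-L"],
            }]
        },
    }

    # SharePods are namespaced custom objects, so they are created through
    # the CustomObjectsApi rather than the core Pod API.
    client.CustomObjectsApi().create_namespaced_custom_object(
        group="kubeshare.nthu",
        version="v1",
        namespace="default",
        plural="sharepods",
        body=sharepod,
    )

Because the GPU share is declared in the workload specification itself, the KubeShare scheduler can see it and pack several such SharePods onto one physical device, which is the mechanism behind the utilization and throughput gains reported above.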
Table of Contents:
1 Introduction . . . . . . . . . . . . . . . . . . . . 1
2 Background on Kubernetes . . . . . . . . . . . . . . 4
2.1 Architecture . . . . . . . . . . . . . . . . . . . 4
2.2 Custom Device . . . . . . . . . . . . . . . . . . 6
3 Problems & Requirements . . . . . . . . . . . . . . 8
3.1 Resource Fragmentation . . . . . . . . . . . . . . 8
3.2 Implicit and Late Binding . . . . . . . . . . . . 9
3.3 Requirements . . . . . . . . . . . . . . . . . . 10
4 KubeShare . . . . . . . . . . . . . . . . . . . . . 11
4.1 Architecture Overview . . . . . . . . . . . . . . 11
4.2 First-Class Resource Specification . . . . . . . . 14
4.3 Resource & Locality Aware Scheduling . . . . . . 15
4.4 vGPU Lifecycle Management . . . . . . . . . . . . 17
4.5 GPU Usage Isolation & Elastic Allocation . . . . 19
4.6 System Compatibility & Flexibility . . . . . . . 21
5 Experimental Evaluation . . . . . . . . . . . . . . 23
5.1 Testbed & Workload . . . . . . . . . . . . . . . 23
5.2 GPU Isolation . . . . . . . . . . . . . . . . . . 24
5.3 GPU Sharing Benefits . . . . . . . . . . . . . . . 26
5.4 GPU Sharing Overhead . . . . . . . . . . . . . . 29
5.5 Job Interference . . . . . . . . . . . . . . . . 30
6 Related Works . . . . . . . . . . . . . . . . . . . 34
7 Conclusions . . . . . . . . . . . . . . . . . . . . 36