
Detailed Record

Author (Chinese): 黃振維
Author (English): Harriman, Wilbert
Title (Chinese): 機器學習與系統工程的結合的案例探討
Title (English): Study of Two Cases Where Machine Learning Meets Systems
Advisor (Chinese): 吳尚鴻
Advisor (English): Wu, Shan-Hung
Committee Members (Chinese): 黃俊龍、李哲榮
Committee Members (English): Huang, Jiun-Long; Lee, Che-Rung
Degree: Master's
University: National Tsing Hua University
Department: Computer Science
Student ID: 111062528
Year of Publication (ROC calendar): 112 (2023)
Graduation Academic Year: 112
Language: English
Number of Pages: 33
Keywords (Chinese): 資料庫系統、深度學習
Keywords (English): Database System; Deep Learning
The rapid advancement of machine learning has led to a shift in how software systems are built. For instance, these systems can improve input processing by uncovering hidden patterns in the input features without explicit programming. Concurrently, machine learning models have been growing exponentially in size, largely owing to the scalability of the recently introduced Transformer architecture. Hardware resources, however, have struggled to keep pace with this growth, leading to astronomical costs when deploying large machine learning models.

In this work, we focus on how machine learning and software systems can enhance each other. Specifically, we demonstrate how database systems can optimize transaction processing by estimating transaction latencies prior to execution; this approach reduces service level agreement (SLA) violations by 36%. Furthermore, we show how systems-building expertise can substantially reduce the GPU memory consumption of large machine learning models at deployment time, allowing us to run a 120 GB model on an 8 GB GPU.
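The two directions above can be made concrete with short sketches. The first illustrates latency-aware transaction sequencing: a learned estimator predicts each transaction's latency before execution, and the sequencer reorders or drops transactions accordingly. This is a minimal sketch only; the `Txn` fields, the `estimate_ms` callable, and the `drop`/`reorder`/`hybrid` policy names are assumptions loosely modeled on the chapter titles (3.1 Drop, 3.2 Reorder, 3.3 Hybrid), not the thesis's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Txn:
    txn_id: int
    features: list      # inputs for the learned latency estimator
    deadline_ms: float  # per-transaction SLA deadline

def sequence(batch: List[Txn],
             estimate_ms: Callable[[Txn], float],
             policy: str = "hybrid") -> List[Txn]:
    """Order (and optionally drop) transactions *before* execution,
    using predicted latencies rather than observed ones."""
    # Reorder: schedule predicted-short transactions first (shortest-job-first),
    # which reduces queueing delay for everything behind them.
    txns = sorted(batch, key=estimate_ms) if policy in ("reorder", "hybrid") else list(batch)
    if policy == "reorder":
        return txns

    admitted: List[Txn] = []
    elapsed = 0.0
    for txn in txns:
        est = estimate_ms(txn)
        # Drop: reject a transaction whose predicted completion time already
        # misses its SLA deadline, freeing capacity for ones that can still meet it.
        if elapsed + est > txn.deadline_ms:
            continue
        admitted.append(txn)
        elapsed += est
    return admitted
```

Similarly, the 120 GB-on-8 GB result points at layer-level memory offloading (Section 5.2): keep the weights off-GPU and stream one Transformer layer at a time through the device, so peak GPU usage is roughly one layer plus activations. The sketch below assumes each layer has been serialized to its own file (`layer_files` is a hypothetical argument) and ignores the IO/compute overlap the thesis evaluates in Section 6.3; combined with post-training quantization (Sections 2.2.2 and 6.4), the per-layer footprint shrinks further.

```python
import torch

@torch.no_grad()
def offloaded_forward(layer_files, hidden, device="cuda"):
    """Run a model whose weights exceed GPU memory by streaming layers:
    load one layer, run it, free it. Trades repeated disk/PCIe transfers
    for a GPU footprint of roughly one layer plus activations."""
    for path in layer_files:
        layer = torch.load(path, map_location="cpu")  # stage weights on the CPU side
        layer = layer.to(device)                      # move only this layer onto the GPU
        hidden = layer(hidden)                        # compute, then discard the layer
        del layer
        torch.cuda.empty_cache()                      # return freed blocks to the allocator
    return hidden
```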
Abstract
摘要 (Chinese Abstract)
1. Introduction ----- 1
2. Background ----- 5
2.1 Database Management System ----- 5
2.1.1 Deterministic Database Management System ----- 5
2.1.2 Deterministic Latency Estimator ----- 7
2.1.3 Service Level Agreement ----- 8
2.2 Machine Learning ----- 8
2.2.1 Neural Network ----- 9
2.2.2 Quantization ----- 10
2.2.3 Memory Offloading ----- 11
3. Enhanced Sequencer for Database Systems ----- 12
3.1 Drop ----- 13
3.2 Reorder ----- 13
3.3 Hybrid ----- 14
4. Evaluation: ML for Database Systems ----- 14
4.1 Performance on SLA Violations ----- 16
4.2 System Throughput ----- 17
5. Reducing Memory Usage for Large Transformer Models ----- 18
5.1 Low-Rank Decomposition ----- 18
5.2 Layer Offloading ----- 21
6. Evaluation: Systems for ML ----- 22
6.1 GPU Memory Usage ----- 23
6.2 Inference Latency ----- 24
6.3 IO vs. Compute ----- 25
6.4 Post-Training Quantization ----- 26
6.5 Quantized Inference Latency ----- 28
6.6 Deployment Cost Analysis ----- 29
7. Conclusions and Future Work ----- 30
8. References ----- 31