
Detailed Record

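Author (Chinese): 林冠吾
Author (English): Lin, Kuan-Wu
Thesis title (Chinese): FastQuery調整及動態分配利用混和平行機制
Thesis title (English): FastQuery Tuning and Dynamic Scheduling with Hybrid Parallel Mechanism
Advisor (Chinese): 周志遠
Advisor (English): Chou, Jerry
Committee members (Chinese): 李哲榮; 蕭宏章
Committee members (English): Che-Rung Lee; Hung-Chang Hsiao
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 101062584
Year of publication (ROC calendar): 103 (2014)
Academic year of graduation: 103
Language: English
Number of pages: 43
Keywords (Chinese, translated): FastQuery; I/O performance; parameter tuning
Keywords (English): I/O performance; Tuning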
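Abstract (translated from the Chinese): FastQuery is a parallel indexing and query system that we developed to accelerate the analysis and visualization of scientific data. We have applied it to many different high-performance computing applications and demonstrated its flexibility and capability with a trillion-scale simulation. Our experiments show, however, that I/O results are strongly affected by parameter settings. In this thesis we describe how to tune these parameters, analyze each one, and discuss its effect. We show that the tuned parameters have a major impact on the results.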
Abstract (English): FastQuery is a parallel indexing and querying system we developed to accelerate the analysis and visualization of scientific data. We have applied it to a wide variety of HPC applications and, in our previous work, demonstrated its capability and scalability on a petascale trillion-particle simulation. Through that experience, we found that the performance of reading and writing data with FastQuery, as with many other HPC applications, can be significantly affected by tunable parameters throughout the parallel I/O stack. In this paper, we describe how we tuned the performance of FastQuery on a Lustre parallel file system. We study and analyze the impact of parameters and tunable settings at the file system, MPI-IO library, and HDF5 library levels of the I/O stack. We demonstrate that a combined optimization strategy significantly improves the performance and I/O bandwidth of FastQuery. In our tests with a trillion-particle dataset, the time to index the dataset was reduced by more than half. We also propose a hybrid architecture that overlaps CPU and I/O time. FastQuery builds indexes iteratively, so overall performance is bounded by the cost of each iteration. We combine threads with MPI to overcome this limitation. The results show that the hybrid architecture overlaps CPU and I/O time and significantly improves performance.
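The idea of overlapping CPU and I/O time across iterations can be illustrated with a minimal Python sketch. This is not FastQuery's implementation: the thesis combines threads with MPI, whereas the sketch shows only the thread-level overlap within a single process. The names `build_index`, `run_pipeline`, and `write` are hypothetical, and the toy histogram merely stands in for index construction. Note also that in CPython a thread overlaps blocking I/O with computation but does not run two CPU-bound tasks in parallel.

```python
import queue
import threading


def build_index(chunk):
    """Stand-in for CPU-bound index construction: a histogram with bins of width 10."""
    hist = {}
    for value in chunk:
        hist[value // 10] = hist.get(value // 10, 0) + 1
    return hist


def run_pipeline(chunks, write):
    """Build one index per chunk while a writer thread drains finished indexes."""
    # Bounded queue: the compute side stalls if the writer falls too far behind.
    pending = queue.Queue(maxsize=2)

    def writer():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more work
                break
            write(*item)  # the (hypothetical) I/O step for one iteration

    thread = threading.Thread(target=writer)
    thread.start()
    try:
        for i, chunk in enumerate(chunks):
            # Compute iteration i, then hand it off so that iteration i + 1
            # can be computed while iteration i is still being written.
            pending.put((i, build_index(chunk)))
    finally:
        pending.put(None)
        thread.join()


# Usage: collect "written" indexes in memory instead of touching a file system.
written = []
run_pipeline([[1, 2, 15], [20, 21, 22], [5, 35]],
             lambda i, index: written.append((i, index)))
```

With a single writer thread and a FIFO queue, indexes are written in the order they were built. The queue bound is an arbitrary choice that limits how many finished indexes wait in memory.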
1 Introduction
2 Background and Motivation
  2.1 Searching in Scientific Data
  2.2 Tuning parallel I/O performance
  2.3 Application Use Case: VPIC
3 Tuning Options
  3.1 FastQuery overview
  3.2 FastQuery Parallel I/O Strategy
  3.3 HDF5 and MPI-IO Collective I/O
  3.4 Lustre Striping
4 Hybrid Parallelism Architecture
5 Experimental Setup
  5.1 Tuning optimization Testbed
  5.2 Datasets
  5.3 Methodology
6 Tuning Optimization Experimental Results
  6.1 Overall Performance Evaluation
  6.2 Stripe Count
  6.3 Stripe Size & Collective Buffering
  6.4 Thread Count per MPI Task
7 Hybrid Parallelism Experimental Results
  7.1 Overall
  7.2 Monitor
  7.3 Histogram dispatch
  7.4 Overall dispatch
8 Conclusions