Author (Chinese): 梁郁珮
Author (English): Liang, Yu-Pei
Title (Chinese): 基於寫入限制之超大型儲存裝置之資料管理策略
Title (English): Efficient Data Management for Large-scale Computer Systems with Write-constrained Memory and Storage Devices
Advisor (Chinese): 石維寬
Advisor (English): Shih, Wei-Kuan
Committee members (Chinese): 王廷基; 何宗易; 謝孫源; 許富皓; 徐讚昇; 張原豪; 衛信文; 謝仁偉
Committee members (English): Wang, Ting-Chi; Ho, Tsung-Yi; Hsieh, Sun-Yuan; Hsu, Fu-Hau; Hsu, Tsan-sheng; Chang, Yuan-Hao; Wei, Hsin-Wen; Hsieh, Jen-Wei
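Degree: Doctoral
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 106062804
Year of publication (ROC calendar): 109 (2020)
Graduation academic year (ROC calendar): 108
Language: English
Number of pages: 76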
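Keywords (Chinese): 大規模儲存系統 (large-scale storage systems); 索引管理 (index management); 非揮發性記憶體 (non-volatile memory); 疊瓦式硬碟 (shingled magnetic recording drive)
Keywords (English): large-scale computer systems; index management; non-volatile memory; shingled magnetic recording drive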
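Abstract (translated from the Chinese):
In recent years, the growth of emerging big data applications has generated enormous amounts of digital information. For data at this scale, conventional memory and storage technologies can no longer balance cost against capacity requirements. Fortunately, many emerging memory and storage devices are now maturing. They open new possibilities for future large-scale storage systems and are strong candidates to become the mainstream storage systems of the next generation.

This dissertation adopts non-volatile memory as the main memory and shingled magnetic recording drives as the storage of a computer system. These emerging technologies differ from conventional ones in many respects. The most serious difference is their write constraints, which can directly affect data access performance and device lifetime. Indexing, meanwhile, is a common technique for managing large volumes of data, so speeding up index management is an effective way to manage such data efficiently. This dissertation therefore designs index management and acceleration schemes that address the write constraints of each of the two devices, in order to improve the performance of future large-scale storage systems.

First, for non-volatile memory: because managing large volumes of data frequently relies on sorting, we design a new sorting algorithm, B*-sort, which both improves sorting performance on non-volatile memory and maximizes the memory's lifetime. Second, to reduce the cost of index management on shingled magnetic recording drives, we take the widely used B+-tree as our subject and design a B+-tree index management scheme for these drives, called SW-B+-tree, so that large volumes of data can be managed efficiently on them. Finally, experiments show that both B*-sort and SW-B+-tree lighten the load on the overall system and thereby improve performance.
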
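
Why write counts matter for sorting on write-constrained memory can be illustrated with a small sketch. It is not the B*-sort algorithm, whose details this record does not give. It assumes a plain, unbalanced binary search tree in which every key is stored once and never moved, and compares its write count with an in-place insertion sort. All function and variable names are hypothetical.

```python
import random


def insertion_sort_writes(keys):
    """Sort a copy in place; return (sorted list, number of element writes)."""
    a = list(keys)
    writes = 0
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        while j >= 0 and a[j] > x:
            a[j + 1] = a[j]  # every shift rewrites one memory cell
            writes += 1
            j -= 1
        if j + 1 != i:
            a[j + 1] = x  # place the element in its final slot for this pass
            writes += 1
    return a, writes


def tree_sort_writes(keys):
    """Tree sort (assumed stand-in): each key is written once, plus one link per node."""
    n = len(keys)
    key = [None] * n   # key storage, one write per key
    left = [-1] * n    # child links, each written at most once
    right = [-1] * n
    writes = 0
    for i, k in enumerate(keys):
        key[i] = k
        writes += 1
        if i == 0:
            continue  # the first key becomes the root
        cur = 0
        while True:  # descend using reads only
            if k < key[cur]:
                if left[cur] == -1:
                    left[cur] = i
                    writes += 1
                    break
                cur = left[cur]
            else:
                if right[cur] == -1:
                    right[cur] = i
                    writes += 1
                    break
                cur = right[cur]
    # In-order traversal: reads only; the output buffer is not counted.
    out, stack = [], []
    cur = 0 if n else -1
    while stack or cur != -1:
        while cur != -1:
            stack.append(cur)
            cur = left[cur]
        cur = stack.pop()
        out.append(key[cur])
        cur = right[cur]
    return out, writes


rng = random.Random(0)
data = [rng.randrange(10_000) for _ in range(200)]
sorted_a, writes_a = insertion_sort_writes(data)
sorted_b, writes_b = tree_sort_writes(data)
print("insertion sort writes:", writes_a)
print("tree sort writes:", writes_b)
```

For n keys the tree variant performs 2n - 1 writes (n keys plus n - 1 links), while the in-place sort's write count grows with the number of inversions in the input. The price is extra reads: on already sorted input an unbalanced tree degenerates into a chain, so reads grow quadratically.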
Abstract (English):
In recent years, the volume of digital data has grown rapidly with the emergence of many big data applications. Conventional memory and storage can no longer balance cost and performance while meeting the resulting capacity requirements. Fortunately, novel memory and storage technologies have been developed for large-scale data applications. This dissertation considers a computer architecture that uses non-volatile random access memory (NVRAM) as main memory, to reduce system power consumption, and shingled magnetic recording (SMR) drives as storage, to enlarge storage capacity. However, both devices have write constraints that may significantly affect performance and lifetime. Therefore, to manage massive data efficiently on such an architecture, this dissertation focuses on index management and proposes methods for mitigating the overhead at two levels of the memory hierarchy. First, to accelerate index search operations, this dissertation proposes an NVRAM-friendly sorting algorithm called B*-sort, which not only improves sorting performance on NVRAM but also maximizes NVRAM lifetime during sorting. Second, to mitigate the overhead of managing indexes on SMR drives, this dissertation proposes a sequential-write-constrained B+-tree index scheme named SW-B+-tree. Both approaches are evaluated through a series of experiments that demonstrate the effectiveness of the designs.

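
The sequential-write constraint of SMR drives mentioned in the abstract can be sketched with a toy zone model. This is an illustration under stated assumptions, not the SW-B+-tree design: it assumes a zone that accepts writes only at its write pointer and must be reset before any earlier block can be rewritten. The class `SmrZone` and its methods are hypothetical names.

```python
class SmrZone:
    """Toy model of one sequential-write zone (an assumption, not a drive API)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = []          # len(self.blocks) plays the role of the write pointer
        self.blocks_written = 0   # physical block writes performed so far

    def append(self, block):
        """Write one block at the write pointer; the only write the zone allows."""
        if len(self.blocks) >= self.capacity:
            raise ValueError("zone is full")
        self.blocks.append(block)
        self.blocks_written += 1

    def reset(self):
        """Rewind the write pointer, discarding the zone's contents."""
        self.blocks = []

    def update_in_place(self, index, block):
        """Emulate an in-place update by read-modify-write of the whole zone."""
        if not 0 <= index < len(self.blocks):
            raise IndexError("no such block")
        staged = list(self.blocks)  # read the zone into a buffer
        staged[index] = block       # modify one block
        self.reset()
        for b in staged:            # rewrite every block sequentially
            self.append(b)


zone = SmrZone(capacity=8)
for i in range(8):
    zone.append(f"node{i}")         # 8 sequential writes fill the zone

before = zone.blocks_written
zone.update_in_place(2, "node2-updated")
extra = zone.blocks_written - before
print("physical writes for one logical update:", extra)  # prints 8
```

In this model a single logical block update costs as many physical writes as the zone holds blocks. An index such as a B+-tree, which updates nodes in place on every insertion or split, triggers this cost repeatedly, which is the kind of overhead a sequential-write-aware index scheme aims to reduce.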
Abstract (Chinese)
Abstract
Acknowledgement
Table of Contents
List of Figures
List of Tables
1 Introduction
1.1 Overview
1.2 Background and Related Work
1.2.1 Non-volatile Random Access Memory
1.2.2 Sorting Algorithms on NVRAM-based Computer Systems
1.2.3 Shingled Magnetic Recording Technology
1.2.4 The Write Amplification Problem
1.2.5 B+-tree Index Scheme
1.3 Organization
2 An NVRAM-friendly Sorting Algorithm
2.1 Motivation
2.2 Write-once B*-Sort
2.2.1 Basic Concept of B*-Sort
2.2.2 Tunnel List to Reduce Read Complexity in the Worst Case
2.2.3 Showcasing B*-Sort
2.2.4 The Operations of B*-Sort
2.3 B*-Sort Properties and Analysis
2.3.1 Analysis of Write Complexity
2.3.2 Analysis of Read Complexity
2.3.3 Remark of Threshold
2.3.4 Analysis of Memory Footprint
2.4 Experimental Evaluation
2.4.1 Experimental Setup
2.4.2 Experimental Results
2.4.3 Advanced Analysis
2.5 Summary
3 A Sequential-write-constrained B+-tree Index Scheme
3.1 Motivation
3.2 Sequential Write B+-tree
3.2.1 Design Philosophy
3.2.2 System Overview
3.2.3 Designed Components in SW-B+-tree
3.2.4 Overhead Analysis
3.3 Performance Evaluation
3.3.1 Experiment Setup
3.3.2 Experimental Results
3.4 Summary
4 Conclusion and Future Work
4.1 Conclusion
4.2 Future Work
Bibliography
List of Publications
 
 
 
 