
Detailed Record

Author (Chinese): 連翊涵
Author (English): Lien, Yi-Han
Title (Chinese): 針對基於隔行磁紀錄的硬碟效能提升之隔行感知資料管理策略
Title (English): Interlace-aware Data Management Strategy for Performance Enhancement of IMR-based Hard-disk Drives
Advisor (Chinese): 石維寬
Advisor (English): Shih, Wei-Kuan
Committee Members (Chinese): 周志遠、逄愛君、許富皓、張原豪、謝仁偉、陳郁方、梁郁珮、陳彥廷
Committee Members (English): Chou, Chi-Yuan; Pang, Ai-Chun; Hsu, Fu-Hau; Chang, Yuan-Hao; Hsieh, Jen-Wei; Chen, Yu-Fang; Liang, Yu-Pei; Chen, Yen-Ting
Degree: Doctor of Philosophy
University: National Tsing Hua University
Department: Department of Computer Science
Student ID: 108062801
Publication Year (ROC): 113
Graduation Academic Year: 112
Language: English
Number of Pages: 61
Keywords (Chinese): 隔行磁紀錄、硬碟、檔案系統、自平衡樹、資料管理策略
Keywords (English): Interlaced Magnetic Recording; Hard-disk Drive; File System; Self-balanced Tree; B^epsilon-tree; Data Management Strategy
Statistics:
  • Recommendations: 0
  • Views: 29
  • Rating: *****
  • Downloads: 0
  • Bookmarks: 0
Abstract (Chinese): With the rapid growth of emerging large-scale applications such as cloud services, big data, and machine learning, the demand for high-capacity, cost-effective storage devices has been rising in recent years. Hard-disk drives (HDDs) are a representative class of low-cost storage devices, and in pursuit of higher capacity, many researchers have increased the amount of data stored per unit area through different track-layout designs. Among these, Interlaced Magnetic Recording (IMR) divides the tracks into top and bottom tracks and partially overlaps two top tracks onto each bottom track to increase data density. However, this design means that updating a bottom track overwrites its two neighboring top tracks and destroys their data. Before a bottom track is updated, the two adjacent top tracks must therefore be backed up, and their data written back after the bottom-track update completes. As a result, updating the data of a single bottom track requires additionally writing two top tracks; this phenomenon, known as write amplification, degrades system performance. Many studies have proposed methods to mitigate write amplification, but the existing approaches are all device-level solutions; because they do not consider the characteristics of the stored data, their improvement is limited. This dissertation therefore allocates space according to data characteristics to break through the performance ceiling of existing methods.

The dissertation consists of two parts. The first part starts from the file system's perspective. Observing that different file types have different update frequencies, it proposes a space-allocation method that reduces write amplification; moreover, by being aware of file-system characteristics, it places data that are likely to be accessed together in nearby locations, effectively reducing seek time. After proposing a solution for allocating file data, we turn to space allocation for index structures. In large-scale file systems and databases, index structures are widely used to speed up data lookup. Among them, the B^epsilon-tree, an extension of the B-tree and B^+-tree, has been receiving increasing attention. Like the B^+-tree, it stores key-value pairs only in the leaf nodes; internal nodes store only keys, together with a special buffer design that reduces the cost of frequent tree rebalancing. The B^epsilon-tree completes an access request by encoding insert, delete, and lookup requests as messages and placing them in the root node's buffer. When the root's buffer reaches its capacity limit, the messages are flushed down into a child node's buffer, and only when they reach a leaf node are the messages opened and the corresponding requests executed. In other words, tree rebalancing can be triggered only when messages are flushed into a leaf node. However, because the messages in a buffer are ordered only by arrival time, deciding which child to flush to requires traversing the entire buffer to count the messages destined for each child. This not only incurs heavy read overhead; when the messages are scattered, updating the parent's buffer may also touch an excessive amount of space. Furthermore, we observe that buffers are frequently updated data, yet without careful space planning they may be placed on the bottom tracks of an IMR drive, which inevitably causes severe write amplification. The second part of this dissertation therefore redesigns how messages are managed within the buffers and, by being aware of each tree node's characteristics, allocates appropriate space for it, thereby reducing write amplification and improving read and write performance.
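The read-modify-write penalty described above can be captured by a tiny cost model. This is an illustrative sketch only; the function names and the flat per-track cost model are assumptions, not taken from the dissertation.

```python
def imr_update_cost(track: str) -> dict:
    """Track I/O needed to update one track on an IMR drive.

    A top track is updated in place: one write, no extra reads.
    A bottom track triggers read-modify-write (RMW): read the two
    overlapping top tracks, write the bottom track, then write the
    two top tracks back.
    """
    if track == "top":
        return {"reads": 0, "writes": 1}
    if track == "bottom":
        return {"reads": 2, "writes": 3}  # 1 bottom write + 2 top rewrites
    raise ValueError("track must be 'top' or 'bottom'")


def write_amplification(track: str) -> float:
    """Physical track writes issued per logical track write."""
    return imr_update_cost(track)["writes"] / 1.0


print(write_amplification("bottom"))  # 3.0
```

Even in this crude model, a single logical bottom-track update costs three physical writes, which is why placing frequently updated data on bottom tracks is so costly.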
Abstract (English): Interlaced Magnetic Recording (IMR) is an emerging recording technology for hard-disk drives (HDDs) that provides larger storage capacity at a lower cost. By partially overlapping (interlacing) each bottom track with two adjacent top tracks, IMR-based HDDs successfully increase the data density while incurring some hardware write constraints. To update a bottom track, the data on its two adjacent top tracks must be read and rewritten to avoid losing their valid data, resulting in additional overhead for performing read-modify-write (RMW) operations. Researchers have therefore proposed various data management schemes in recent years to mitigate this overhead, aiming to improve write performance. However, these designs do not take into account the data characteristics of the file system, a crucial layer of the operating system for storing data to and retrieving data from HDDs. Consequently, the write performance improvement is limited by the unawareness of the spatial locality and hotness of data. The dissertation is divided into two parts. The first part proposes a file-system-aware data management scheme called FSIMR to improve system write performance. Noticing that data in the same directory exhibit high spatial locality and are mostly updated at the same time, FSIMR logically partitions the IMR-based HDD into fixed-sized zones; data belonging to the same directory are arranged into one zone to reduce the seek time for to-be-updated data. Furthermore, cold data within a zone are arranged on bottom tracks and updated in an out-of-place manner to eliminate write amplification. After proposing a solution for file data space allocation, the dissertation considers space allocation for index structures. In large-scale file systems or databases, index structures are widely used to enhance data retrieval speed.
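The directory-based placement idea can be pictured with a toy allocator that maps every file to a zone keyed by its parent directory, so files that tend to be accessed together stay physically close. The `ZoneAllocator` class and its round-robin assignment policy are hypothetical stand-ins for illustration, not the actual FSIMR design.

```python
import os


class ZoneAllocator:
    """Toy directory-to-zone mapper in the spirit of FSIMR's zoning."""

    def __init__(self, num_zones: int):
        self.num_zones = num_zones
        self.dir_to_zone = {}   # directory path -> zone id
        self.next_zone = 0

    def zone_for(self, path: str) -> int:
        """Return the zone id for a file, keyed by its parent directory."""
        directory = os.path.dirname(path)
        if directory not in self.dir_to_zone:
            # Round-robin assignment of a zone to each new directory;
            # a real scheme would also track zone free space and hotness.
            self.dir_to_zone[directory] = self.next_zone % self.num_zones
            self.next_zone += 1
        return self.dir_to_zone[directory]


alloc = ZoneAllocator(num_zones=8)
# Files in the same directory land in the same zone (short seeks between them).
assert alloc.zone_for("/var/log/syslog") == alloc.zone_for("/var/log/kern.log")
# Files in different directories go to different zones.
assert alloc.zone_for("/home/a/notes.txt") != alloc.zone_for("/var/log/syslog")
```

Within each zone, FSIMR additionally distinguishes hot from cold data, steering cold data to bottom tracks; that per-zone policy is omitted here.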
Among these, the B^epsilon-tree, an extension of the B-tree and B^+-tree, has gained attention for its use of specialized buffers to reduce the cost of frequent tree balancing. Similar to the B^+-tree, the B^epsilon-tree stores key-value pairs in the leaf nodes while storing only keys in the internal nodes. To be more precise, each internal node is divided into a pivot area (for storing keys) and a buffer area. The B^epsilon-tree encodes access requests into messages and adds them to the root node's buffer, which completes the access request. Once the root node's buffer reaches its capacity limit, the messages are flushed to one of its sub-nodes, until they reach a leaf node, where the messages are opened to execute the corresponding requests. In other words, the tree-balancing routine occurs only when messages reach a leaf node, which can greatly improve write performance. However, messages in a buffer are sorted only by their arrival time. As a result, the entire buffer must be traversed during a flush routine to find the sub-node to flush to. This design not only incurs significant read overhead but also involves more buffer updates than necessary when the messages in the parent buffer (the one flushing messages) are scattered. Moreover, the buffer is the most frequently updated data, which can cause significant write amplification if it is not placed strategically, e.g., if it ends up on the bottom tracks of the IMR-based HDD. Therefore, the second part of this dissertation proposes a B^epsilon-tree-aware data management strategy for IMR-based HDDs to mitigate write amplification and increase read and write performance. This approach redesigns message management within the buffers and allocates space according to the update characteristics of different node types.
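The message path described here can be sketched with a toy two-level structure. Everything below (the `Node` class, its capacity, and the flush policy of picking the most-populated child) is an illustrative simplification, not the dissertation's design; note that `_flush` must scan the entire buffer to count messages per child, which is exactly the read overhead the text criticizes.

```python
from collections import Counter


class Node:
    """Toy B^epsilon-tree-style node with an arrival-ordered message buffer."""

    def __init__(self, pivots=None, children=None, capacity=4):
        self.pivots = pivots or []      # keys separating the children
        self.children = children or []  # internal node: child nodes; leaf: empty
        self.buffer = []                # internal node: (op, key, value) messages
        self.data = {}                  # leaf node: key -> value
        self.capacity = capacity

    def is_leaf(self):
        return not self.children

    def child_index(self, key):
        for i, pivot in enumerate(self.pivots):
            if key < pivot:
                return i
        return len(self.pivots)

    def upsert(self, op, key, value=None):
        if self.is_leaf():
            self._apply((op, key, value))
            return
        self.buffer.append((op, key, value))  # buffering completes the request
        if len(self.buffer) >= self.capacity:
            self._flush()

    def _flush(self):
        # Full scan of the arrival-ordered buffer to count messages per
        # child -- the costly step the dissertation's second part targets.
        counts = Counter(self.child_index(k) for _, k, _ in self.buffer)
        target, _ = counts.most_common(1)[0]
        moved = [m for m in self.buffer if self.child_index(m[1]) == target]
        self.buffer = [m for m in self.buffer if self.child_index(m[1]) != target]
        child = self.children[target]
        for op, key, value in moved:
            if child.is_leaf():
                child._apply((op, key, value))  # messages "opened" at the leaf
            else:
                child.upsert(op, key, value)    # may cascade another flush

    def _apply(self, msg):
        op, key, value = msg
        if op == "insert":
            self.data[key] = value
        elif op == "delete":
            self.data.pop(key, None)


root = Node(pivots=[50], children=[Node(), Node()], capacity=3)
for key in (10, 60, 20):
    root.upsert("insert", key, str(key))
# The third insert filled the buffer; keys 10 and 20 (both < 50) were
# flushed and applied to the left leaf, while key 60 stays buffered.
```

A partitioned buffer, as proposed in Chapter 4, would keep per-child message groups so the flush target can be chosen without this full scan.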
Abstract (Chinese) I
Abstract III
Acknowledgements (Chinese) V
Contents VI
List of Figures VIII
List of Tables X
1 Introduction 1
2 Background 9
2.1 IMR-based Hard-disk Drives . . . . . . . . . . . . . . . . . . . . . . 9
2.2 File System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3 Bϵ-tree Index Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3 Interlace-aware User Data Management for IMR-based Hard-disk Drives 14
3.1 Observation and Motivation . . . . . . . . . . . . . . . . . . . . . . 14
3.2 File-system-aware Data Management: FSIMR . . . . . . . . . . . . 16
3.2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2.2 Directory-based Zone Allocation . . . . . . . . . . . . . . . . 19
3.2.3 Zone Management . . . . . . . . . . . . . . . . . . . . . . . 21
3.2.4 Directory-based Garbage Collection . . . . . . . . . . . . . . 24
3.3 Overhead Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.3.1 Space Utilization . . . . . . . . . . . . . . . . . . . . . . . . 26
3.3.2 Time Cost . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.4 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . . 28
3.4.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . 28
3.4.2 Experimental Results . . . . . . . . . . . . . . . . . . . . . . 30
4 Interlace-aware Index Data Management for IMR-based Hard-disk Drives 37
4.1 Observation and Motivation . . . . . . . . . . . . . . . . . . . . . . 37
4.2 Bϵ-tree-aware Data Management . . . . . . . . . . . . . . . . . . . 41
4.2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.2.2 Partitioned Buffer Scheme . . . . . . . . . . . . . . . . . . . 43
4.2.3 Zone Management . . . . . . . . . . . . . . . . . . . . . . . 45
4.3 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . . 47
4.3.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . 47
4.3.2 Experimental Results . . . . . . . . . . . . . . . . . . . . . . 48
5 Conclusion 52
Bibliography 54
6 Publication List 60
(The full text of this thesis is not authorized for public access.)