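Author (Chinese): 陳至萱
Author (English): Chen, Chih-Hsuan
Thesis title (Chinese): 大規模儲存系統疊瓦式磁碟之外部排序
Thesis title (English): Facilitating External Sorting on Active SMR-based Large-Scale Storage Systems
Advisor (Chinese): 石維寬
Advisor (English): Shih, Wei-Kuan
Committee members (Chinese): 徐讚昇; 張原豪; 衛信文
Committee members (English): Hsu, Tsan-Sheng; Chang, Yuan-Hao; Wei, Hsin-Wen
Degree: Master's
University: National Tsing Hua University
Department: Institute of Information Systems and Applications
Student ID: 107065532
Year of publication (ROC calendar): 109
Graduation academic year: 108
Language: English
Number of pages: 23
Keywords (Chinese): 疊瓦式磁碟; 外部排序; 大規模儲存系統
Keywords (English): External Sorting; Shingled Magnetic Recording Drives; Large-Scale Storage Systems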
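In the era of big data, how to compute on and store massive amounts of data has become a central issue in data-intensive computing. Performing data-intensive computation well requires several fundamental data processing techniques, one of which is external sorting. External sorting is also widely used in database management systems and in the Hadoop framework.
Meanwhile, to store such large volumes of data, shingled magnetic recording (SMR) drives have been proposed to increase the areal data density of conventional hard disks. As the name suggests, tracks are laid down like overlapping roof shingles, with each track covering part of an adjacent track. This raises the track density per unit area and therefore the amount of data that can be stored per unit area.
Because SMR drives can increase the capacity of conventional hard disks without major technological changes, they are regarded as a very promising technology for big data processing. However, the overlapped track layout restricts the order of writes: data must be written sequentially to avoid corrupting existing data. This constrains the data flow during writes and may degrade performance when external sorting is run on such drives.
Based on this observation, this thesis proposes an external sorting method for SMR drives in large-scale storage systems, aiming to exploit the characteristics of SMR drives and the strengths of external sorting to improve system performance.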
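The sequential-write restriction described above can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not the thesis's design: the class `SMRBand`, its methods, and the one-unit-per-track cost accounting are all hypothetical. It only shows why an in-place update inside a band of overlapped tracks costs more physical writes than an append.

```python
class SMRBand:
    """Toy model of one SMR band (hypothetical; not taken from the thesis).

    Tracks overlap like shingles, so rewriting track i damages every
    later track in the band that already holds data.
    """

    def __init__(self, num_tracks):
        self.tracks = [None] * num_tracks
        self.write_pointer = 0    # next track that can be written safely
        self.tracks_written = 0   # physical track writes performed so far

    def append(self, data):
        """Sequential write at the write pointer: exactly one track write."""
        if self.write_pointer >= len(self.tracks):
            raise ValueError("band is full")
        self.tracks[self.write_pointer] = data
        self.write_pointer += 1
        self.tracks_written += 1

    def update(self, index, data):
        """In-place update: rewrite the target and every valid track after it."""
        if not 0 <= index < self.write_pointer:
            raise IndexError("track has not been written yet")
        # Read the tracks that the overwrite would damage.
        tail = self.tracks[index + 1:self.write_pointer]
        self.tracks[index] = data
        self.tracks_written += 1
        # Write the saved tracks back in sequential order.
        for position, saved in enumerate(tail, start=index + 1):
            self.tracks[position] = saved
            self.tracks_written += 1


band = SMRBand(num_tracks=8)
for i in range(8):
    band.append(f"run-{i}")
sequential_cost = band.tracks_written                 # one write per append
band.update(2, "run-2-new")
update_cost = band.tracks_written - sequential_cost   # target plus its tail
```

In this model a single logical update to track 2 of a full 8-track band triggers writes to tracks 2 through 7, which is the kind of write amplification an SMR-aware sorting strategy would try to avoid.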
The rapid pace of change in computing has produced several pioneering developments, one of which is big data computing. To support data-intensive computing, a number of fundamental data processing techniques have been developed, among them external sorting, which is widely used in database management systems (DBMS) and the Hadoop framework. To store the sheer amount of data involved, shingled magnetic recording (SMR) drives have been proposed; they increase the areal density of conventional hard disk drives (HDDs) by overlapping adjacent tracks. Because SMR drives raise HDD capacity without major modifications, they are considered a promising candidate for big data applications. However, the overlapped layout of SMR drives introduces problems such as write amplification and the sequential write constraint. This observation motivates us to propose an SMR-based External Merge Sort (SMR-EMS) strategy that addresses these obstacles. The proposed strategy aims to mitigate the impact of the sequential write constraint and to reduce write traffic, in order to improve the performance of external sorting on SMR drives. Experiments were performed to demonstrate the improvement of the proposed strategy over the conventional one.
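For readers unfamiliar with the conventional baseline, the following is a minimal sketch of a textbook external merge sort, assuming integer keys and a hypothetical `memory_limit` parameter that caps how many values are held in memory at once. It is not the SMR-EMS strategy from the thesis; the function name and the run-file layout are illustrative assumptions.

```python
import heapq
import os
import tempfile


def external_merge_sort(values, memory_limit):
    """Yield `values` in sorted order, buffering at most `memory_limit` items.

    Generic two-phase external merge sort (illustrative sketch only).
    """
    run_paths = []
    buffer = []

    def flush():
        # Sort the in-memory buffer and write it out sequentially as one run.
        buffer.sort()
        fd, path = tempfile.mkstemp(suffix=".run")
        with os.fdopen(fd, "w") as run_file:
            run_file.writelines(f"{v}\n" for v in buffer)
        run_paths.append(path)
        buffer.clear()

    # Phase 1: run generation. Each full buffer becomes a sorted run on disk.
    for v in values:
        buffer.append(v)
        if len(buffer) >= memory_limit:
            flush()
    if buffer:
        flush()

    # Phase 2: k-way merge. Stream all runs and merge them lazily.
    files = [open(path) for path in run_paths]
    try:
        streams = [(int(line) for line in f) for f in files]
        yield from heapq.merge(*streams)
    finally:
        for f in files:
            f.close()
        for path in run_paths:
            os.remove(path)


data = [9, 4, 7, 1, 8, 2, 6, 3, 5, 0]
result = list(external_merge_sort(data, memory_limit=3))
```

Both phases write and read each run strictly sequentially, which is why external merge sort is a natural workload to study on drives that favor sequential writes.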
Abstract (Chinese)
Abstract
1 Introduction
1.1 Introduction
2 Background and Motivation
2.1 Background
2.2 Motivation
3 SMR-Based External Merge Sort Strategy
3.1 Overview
3.2 Active-sort Caching Design
3.3 Analysis
3.3.1 Write Amplification Amount
4 Performance Evaluation
4.1 Experiment Setup
4.2 Experimental Results
5 Conclusion
References
(The full text is not authorized for public access.)