
Detailed Record

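Author (Chinese): 蔡 靖
Author (English): Tsai, Ching
Thesis title (Chinese): 先決式資料庫系統之前瞻性資料劃分技術
Thesis title (English): Look Ahead Data Partitioning for Deterministic Database Systems
Advisor (Chinese): 吳尚鴻
Advisor (English): Wu, Shan Hung
Committee members (Chinese): 彭文志; 吳怡樂; 李之屏
Committee members (English): Peng, Wen-Chih; Wu, Yi-Leh; Lee, Chris
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 104062587
Year of publication (ROC calendar): 106 (2017)
Academic year of graduation: 105
Language: Chinese
Number of pages: 25
Keywords (Chinese): 資料劃分; 資料遷移; 負載平衡; 資料存取; 線上交易處理
Keywords (English): data partition; data migration; load balancing; data access; OLTP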
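Today's large-scale distributed relational databases use data partitioning to place data on different machine nodes in order to achieve high throughput and high availability. Executing transactions on such a distributed database faces two major challenges:
1) When a transaction involves data on more than one machine node, the nodes must synchronize to guarantee that the relational database processes the transaction correctly. Frequent synchronization, however, degrades overall performance, so partitioning schemes give priority to placing data that is repeatedly accessed together on the same machine node, which reduces the synchronization needed during transaction execution.
2) In real database access patterns, different segments of the data are accessed at different frequencies, and the access frequency of a given piece of data also varies over time. Because of this inherent property, at some point in time the data on one or even several machine nodes will be accessed heavily, and overall performance becomes limited because a single node cannot handle the sudden increase in workload.
To cope with the load imbalance that such a data partitioning causes, the data must be dynamically repartitioned according to the new data access pattern, remedying the two problems of poor data partitioning described above. This thesis proposes Look Ahead data Partitioning (LAP), a novel data repartitioning technique that jointly plans the data partitioning and carries out the data migration. Experiments under varying data access patterns verify that the proposed technique partitions data correctly, improving the overall throughput of the database system and lowering its response time, and that it reduces performance recovery time by 60% compared with the state-of-the-art technique.

Keywords: data partitioning, data migration, load balancing, data access, online transaction processing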
Today's large distributed relational database systems partition data across multiple machines to achieve high scalability and high availability. However, processing transactions on such a distributed database poses two major challenges:
1) To ensure the correctness of a relational database, the system must synchronize a transaction's execution result among multiple machines whenever the transaction acquires transactional resources on different machines. Because synchronization degrades the performance of the whole system, distributed database systems partition data so that frequently co-accessed data is placed at the same physical location. Most transactions can then execute locally without synchronization overhead.
2) In real-world workloads, different partitions of the database are accessed at different frequencies, and these frequencies also change over time. As a result, at any given point in time one or more machines may face very frequent data access. This severely hurts system performance, because the sudden burst of load on these machines exceeds their computing capacity.
To handle the load imbalance caused by bad data placement, the system can dynamically generate a new partition plan according to the new data access pattern and reconfigure the data placement. In this paper, we identify a common blind spot of previous state-of-the-art data repartitioning techniques. We present LAP (Look Ahead data Partitioning), a new data partitioning approach that combines transaction routing and data partitioning into a single step. Our experimental evaluation shows that LAP not only partitions data so as to improve throughput and reduce latency under a time-varying workload, but also outperforms the state-of-the-art technique, reducing response time by 60%.
Keywords: data partition, data migration, load balancing, data access, OLTP.
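The two challenges named in the abstract can be made concrete with a small sketch. It is not the LAP algorithm; it only scores a given partition plan by the two quantities the abstract discusses: how many transactions span more than one node (and so need synchronization) and how many record accesses each node serves. The function name `evaluate_plan`, the record keys, and the toy workload are all hypothetical and chosen for illustration.

```python
from collections import Counter


def evaluate_plan(plan, transactions):
    """Score a partition plan (hypothetical helper, not part of LAP).

    plan: dict mapping record key -> node id
    transactions: iterable of tuples of record keys accessed together
    Returns (number of distributed transactions, per-node access counts).
    """
    distributed = 0
    load = Counter()
    for keys in transactions:
        # A transaction is distributed if its records live on more than one node.
        nodes = {plan[k] for k in keys}
        if len(nodes) > 1:
            distributed += 1
        # Count every record access against the node that stores the record.
        for k in keys:
            load[plan[k]] += 1
    return distributed, dict(load)


# Toy workload (assumed): each tuple lists the records one transaction touches.
transactions = [("a", "b"), ("a", "b"), ("c", "d"), ("a", "c")]

# Plan 1 separates the co-accessed pairs; plan 2 keeps them together.
plan_split = {"a": 0, "b": 1, "c": 0, "d": 1}
plan_grouped = {"a": 0, "b": 0, "c": 1, "d": 1}

split_result = evaluate_plan(plan_split, transactions)
grouped_result = evaluate_plan(plan_grouped, transactions)

print("split:  ", split_result)
print("grouped:", grouped_result)
```

In this toy workload both plans put the same number of accesses on each node, but the grouped plan leaves only one transaction spanning two nodes instead of three. A repartitioning technique has to weigh both quantities as the access pattern shifts over time.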
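Table of Contents
Acknowledgments
Abstract (English)
Abstract (Chinese)
Chapter 1 Introduction
Chapter 2 Background
Section 1 Stored Procedure Analysis and Transaction Routing
Section 2 Data Partitioning
Chapter 3 Key Ideas
Section 1 Main Observations
Section 2 Batch Transaction Processing
Section 3 Look-Ahead Data Partitioning
Section 4 Record Location Tracking
Chapter 4 Implementation Details
Section 1 System Assumptions
Section 2 Fault Tolerance of the Lookup Table
Section 3 System Architecture
Chapter 5 Evaluation
Section 1 Experimental Environment
Section 2 Single-Point Permanent Workload Change
Section 3 Time-Varying Workload
Section 4 Tuning the Parameter α
Chapter 6 Conclusion
Chapter 7 References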
(The full text of this thesis has not been authorized for public access.)