Author (Chinese): 游傑宇
Author (English): Yu, Chieh-Yu
Thesis Title (Chinese): 應用於雲端多層式架構服務之高效群組式容錯系統
Thesis Title (English): Efficient Group Fault Tolerance for Multi-tier Services in Cloud Environments
Advisor (Chinese): 李哲榮
Advisor (English): Lee, Che-Rung
Committee Members (Chinese): 林郁翔, 鍾武君
Degree: Master's
University: National Tsing Hua University
Department: Department of Computer Science
Student ID: 106062608
Year of Publication (ROC): 108 (2019)
Graduation Academic Year (ROC): 107
Language: English
Number of Pages: 32
Keywords (Chinese): 容錯系統, 虛擬化, 多層式架構服務
Keywords (English): fault tolerance, virtualization, multi-tier services
Chinese Abstract (translated):
Fault tolerance is a key technology for keeping long-running, continuously operating services highly available. In this era of rapidly developing cloud computing, it is commonly realized through virtualization. However, most virtualization-based fault tolerance systems focus only on protecting a single server, so multiple servers that are each protected by their own fault tolerance system suffer severe performance degradation when they exchange large volumes of messages. A better solution is group fault tolerance, which organizes all virtual machines into one group and synchronizes their states to avoid latency accumulation.
In this thesis, we implement and demonstrate a high-performance group fault tolerance system, and we propose methods to improve its performance during the failover process. Experiments show that, compared with servers each protected by their own fault tolerance system, the throughput improves by 6.52 times and the transmission latency is only 12% of that of the individually protected configuration. We obtain similar results for more complex virtualized multi-tier services. In addition, both the failover process and the system downtime are carefully optimized so that they approach the failover time of single-server fault tolerance.
Abstract:
Fault tolerance is the key technology to achieve high availability for non-stop and long-lasting services, and in the era of cloud computing it is usually carried out by virtualization technology. However, most virtualization-based fault tolerance methods focus only on the resilience of a single server, which causes great performance degradation for services that have heavy communication among multiple nodes. One of the solutions is the Group Fault Tolerance (Group FT) technique, which synchronizes a group of VMs within a single fault-tolerance state to avoid the latency accumulation problem. In this paper, we present an efficient implementation of Group FT, as well as methods to enhance the performance of Group FT's failover process. Experiments show that with Group FT, the system throughput can be increased by 6.52 times and the latency can be reduced to 12% for the OLTP workload in SysBench, compared to the fault tolerance for individual servers. Similar results are also shown for more complicated synthetic multi-tier architectures. Moreover, the downtime and the duration of the failover process for Group FT are also optimized so that they are comparable to those of Individual FT.
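The abstracts above describe the core idea behind Group FT: rather than letting every protected VM run its own independent checkpoint-and-acknowledge cycle, all VMs of a multi-tier service are driven through the same fault-tolerance epoch, so buffered network output is released for the whole group at once and per-hop latency does not accumulate. The Python sketch below is only a conceptual illustration of such an epoch-synchronized scheme, not the thesis's hypervisor-level implementation; the names VMChannel and GroupFTCoordinator, and all method signatures, are hypothetical.

    # Conceptual sketch (assumed names, not the thesis implementation) of
    # epoch-synchronized checkpointing for a group of VMs.
    from __future__ import annotations

    import time
    from dataclasses import dataclass, field


    @dataclass
    class VMChannel:
        """Replication channel between one primary VM and its backup host."""
        name: str
        epoch: int = 0
        buffered_output: list = field(default_factory=list)

        def capture_dirty_state(self) -> bytes:
            # A real system would copy dirty memory pages and device state here.
            return f"{self.name}-epoch-{self.epoch}".encode()

        def transfer(self, snapshot: bytes) -> None:
            # Send the incremental snapshot to the backup host (omitted).
            pass

        def release_output(self) -> None:
            # Packets buffered during the epoch are released only after the
            # backup has acknowledged the checkpoint.
            self.buffered_output.clear()


    class GroupFTCoordinator:
        """Drives every VM in the group through the same checkpoint epoch,
        so no VM's output waits on another VM's unacknowledged state."""

        def __init__(self, channels: list[VMChannel]) -> None:
            self.channels = channels

        def run_epoch(self) -> None:
            # 1. Pause all primaries at a common barrier and capture their state.
            snapshots = {ch.name: ch.capture_dirty_state() for ch in self.channels}
            # 2. Transfer every incremental snapshot (sequential here for brevity).
            for ch in self.channels:
                ch.transfer(snapshots[ch.name])
            # 3. After all backups acknowledge, release buffered output for the
            #    whole group and advance the shared epoch counter.
            for ch in self.channels:
                ch.release_output()
                ch.epoch += 1


    if __name__ == "__main__":
        group = GroupFTCoordinator([VMChannel("web"), VMChannel("app"), VMChannel("db")])
        for _ in range(3):       # three synchronized epochs
            group.run_epoch()
            time.sleep(0.05)     # epoch interval between checkpoints

In an individual-FT setup, each channel would release its buffered output as soon as its own backup acknowledged, so a request crossing several tiers could wait on several independent checkpoint intervals; the shared barrier in run_epoch is what removes that accumulation.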
Chinese Abstract
Abstract
Contents
List of Figures
List of Tables
1 Introduction
2 Background
2.1 Related Works
3 Implementation
3.1 High Level Description
3.2 Implementation of Synchronization
3.3 Group FT Failover Handling
3.3.1 All-failover
3.3.2 Partial-failover
3.3.3 Partial-failover with dynamic resync
4 Performance Evaluation
4.1 Latency and throughput of Group FT
4.2 Synthetic multi-tier architectures
4.2.1 Linear topology
4.2.2 Load-balancing topology
4.2.3 Multicasting topology
4.3 Run stage length
4.4 Service disruption time
4.5 Performance degradation by dynamic resync
5 Conclusion
6 Bibliography
[1] Example: HTTP load balancing to three real web servers. https://help.fortinet.com/fos50hlp/54/Content/FortiOS/fortigate-load-balancing-52/ldb-example1.htm, 2018.
[2] Cuju: An open source project for virtualization-based fault tolerance. https://cuju-ft.github.io/cuju-web/home.html, 2019.
[3] Joel F. Bartlett. A nonstop kernel. In Proceedings of the Eighth ACM Symposium on Operating Systems Principles, SOSP '81, pages 22-29, New York, NY, USA, 1981. ACM.
[4] T. C. Bressoud and F. B. Schneider. Hypervisor-based fault tolerance. SIGOPS Oper. Syst. Rev., 29(5):1-11, December 1995.
[5] M. C. Caraman, S. A. Moraru, S. Dan, and C. Grama. Continuous disaster tolerance in the IaaS clouds. In 2012 13th International Conference on Optimization of Electrical and Electronic Equipment (OPTIM), pages 1226-1232, May 2012.
[6] Mihai Caraman, Sorin Aurel Moraru, Stefan Dan, and Dominic Mircea Kristaly. Romulus: Disaster tolerant system based on kernel virtual machines. 20th International DAAAM Symposium: Intelligent Manufacturing and Automation: Theory, Practice and Education, 1, November 2009.
[7] Christopher Clark, Keir Fraser, Steven Hand, Jacob Gorm Hansen, Eric Jul, Christian Limpach, Ian Pratt, and Andrew Warfield. Live migration of virtual machines. In Proceedings of the 2nd Conference on Symposium on Networked Systems Design and Implementation - Volume 2, NSDI'05, pages 273-286, Berkeley, CA, USA, 2005. USENIX Association.
[8] Brendan Cully, Geoffrey Lefebvre, Dutch T. Meyer, Mike Feeley, Norman C. Hutchinson, and Andrew Warfield. Remus: High availability via asynchronous virtual machine replication (best paper). In Jon Crowcroft and Michael Dahlin, editors, NSDI, page 161. USENIX Association, 2008.
[9] Djellel Eddine Difallah, Andrew Pavlo, Carlo Curino, and Philippe Cudre-Mauroux. OLTP-Bench: An extensible testbed for benchmarking relational databases. Proc. VLDB Endow., 7(4):277-288, December 2013.
[10] YaoZu Dong, Wei Ye, YunHong Jiang, Ian Pratt, ShiQing Ma, Jian Li, and HaiBing Guan. COLO: Coarse-grained lock-stepping virtual machines for non-stop service. In Proceedings of the 4th Annual Symposium on Cloud Computing, SOCC '13, pages 3:1-3:16, New York, NY, USA, 2013. ACM.
[11] Patrick Hunt, Mahadev Konar, Flavio P. Junqueira, and Benjamin Reed. ZooKeeper: Wait-free coordination for internet-scale systems. In Proceedings of the 2010 USENIX Conference on USENIX Annual Technical Conference, USENIXATC'10, pages 11-11, Berkeley, CA, USA, 2010. USENIX Association.
[12] Priti Kumari and Parmeet Kaur. A survey of fault tolerance in cloud computing. Journal of King Saud University - Computer and Information Sciences, 2018.
[13] K. Lee, I. Lai, and C. Lee. Optimizing back-and-forth live migration. In 2016 IEEE/ACM 9th International Conference on Utility and Cloud Computing (UCC), pages 49-54, Dec 2016.
[14] Michael Nelson, Beng-Hong Lim, and Greg Hutchins. Fast transparent migration for virtual machines. In Proceedings of the Annual Conference on USENIX Annual Technical Conference, ATEC '05, pages 25-25, Berkeley, CA, USA, 2005. USENIX Association.
[15] Shriram Rajagopalan, Brendan Cully, Ryan O'Connor, and Andrew Warfield. SecondSite: Disaster tolerance as a service. SIGPLAN Not., 47(7):97-108, March 2012.
[16] S. Ren, Y. Zhang, L. Pan, and Z. Xiao. Phantasy: Low-latency virtualization-based fault tolerance via asynchronous prefetching. IEEE Transactions on Computers, 68(02):225-238, Feb 2019.
[17] Yifeng Sun. Protection Mechanisms for Virtual Machines on Virtualized Servers. PhD thesis, Stony Brook University, 2017.
[18] Yoshiaki Tamura, Koji Sato, Seiji Kihara, and Satoshi Moriai. Kemari: Virtual machine synchronization for fault tolerance. June 2008.
[19] P. Tsao, Y. Sun, L. Chen, and C. Cho. Efficient virtualization-based fault tolerance. In 2016 International Computer Symposium (ICS), pages 114-119, Dec 2016.
[20] Y. Ueno, N. Miyaho, S. Suzuki, and K. Ichihara. Performance evaluation of a disaster recovery system and practical network system applications. In 2010 Fifth International Conference on Systems and Networks Communications, pages 195-200, Aug 2010.
[21] Timothy Wood, Emmanuel Cecchet, K. K. Ramakrishnan, Prashant Shenoy, and Jacobus Van der Merwe. Disaster recovery as a cloud service: Economic benefits and deployment challenges. Aug 2010.
[22] Timothy Wood, Emmanuel Cecchet, K. K. Ramakrishnan, Prashant Shenoy, Jacobus van der Merwe, and Arun Venkataramani. Disaster recovery as a cloud service: Economic benefits and deployment challenges. In Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, HotCloud'10, pages 8-8, Berkeley, CA, USA, 2010. USENIX Association.
[23] Timothy Wood, H. Andres Lagar-Cavilla, K. K. Ramakrishnan, Prashant Shenoy, and Jacobus Van der Merwe. PipeCloud: Using causality to overcome speed-of-light delays in cloud-based disaster recovery. In Proceedings of the 2nd ACM Symposium on Cloud Computing, SOCC '11, pages 17:1-17:13, New York, NY, USA, 2011. ACM.
[24] Hsuan-Heng Wu. Implementation and optimization of group virtual machine fault tolerance. Master's thesis, National Taiwan University, July 2017.
[25] J. Zhu, Z. Jiang, Z. Xiao, and X. Li. Optimizing the performance of virtual machine synchronization for fault tolerance. IEEE Transactions on Computers, 60(12):1718-1729, Dec 2011.
 
 
 
 