Model-Based Survivability Analysis of a Virtualized System
Transient survivability analysis of a virtualized system (VS) is critical to the wide deployment of cloud services. Existing research on VS availability and/or reliability has focused on steady-state analysis. This paper presents a continuous-time Markov chain (CTMC) model, together with closed-form solutions, for analyzing the survivability of both the cloud service and the VS after a service breakdown. A service breakdown may be caused by software rejuvenation of a virtual machine (VM) and/or a VM monitor (VMM), or by VM and/or VMM bugs. The VS applies two techniques to improve service survivability: VM failover and live VM migration. The proposed model and the defined survivability metrics not only enable quantitative assessment of system survivability but also provide insight into where to invest effort in system recovery strategies. A numerical sensitivity analysis is carried out to study the impact of key parameters on system survivability.
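To make the notion of transient (as opposed to steady-state) CTMC analysis concrete, the following is a minimal sketch in Python. It is not the model of this paper: the three-state chain (service down, failover in progress, service recovered), the rates `DETECT_RATE` and `FAILOVER_RATE`, and the function name `transient_probabilities` are all hypothetical and chosen only for illustration. The sketch computes state probabilities at a time t after a breakdown by uniformization, a standard numerical technique, rather than by closed-form solution.

```python
import math


def transient_probabilities(Q, pi0, t, tol=1e-12, max_terms=10000):
    """Transient state probabilities pi(t) of a CTMC with generator Q.

    Uses uniformization: pi(t) = sum_k Poisson(lam*t; k) * pi0 * P^k,
    where P = I + Q / lam and lam is the largest exit rate.
    Intended for moderate lam*t (exp(-lam*t) must not underflow).
    """
    n = len(Q)
    lam = max(-Q[i][i] for i in range(n))  # uniformization rate
    if lam == 0.0 or t == 0.0:
        return list(pi0)
    # Transition matrix of the uniformized discrete-time chain.
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
         for i in range(n)]
    weight = math.exp(-lam * t)  # Poisson probability of k = 0 jumps
    vec = list(pi0)              # pi0 * P^k, starting at k = 0
    result = [weight * v for v in vec]
    accumulated = weight         # Poisson mass summed so far
    k = 0
    while 1.0 - accumulated > tol and k < max_terms:
        k += 1
        vec = [sum(vec[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= lam * t / k
        accumulated += weight
        result = [r + weight * v for r, v in zip(result, vec)]
    return result


# Hypothetical rates (per minute), NOT taken from the paper.
DETECT_RATE = 2.0     # breakdown detected, failover starts
FAILOVER_RATE = 0.5   # failover completes, service recovered

# States: 0 = service down, 1 = failover in progress, 2 = recovered (absorbing).
Q_EXAMPLE = [
    [-DETECT_RATE, DETECT_RATE, 0.0],
    [0.0, -FAILOVER_RATE, FAILOVER_RATE],
    [0.0, 0.0, 0.0],
]


def recovery_probability(t):
    """Probability that service has recovered by time t after a breakdown."""
    return transient_probabilities(Q_EXAMPLE, [1.0, 0.0, 0.0], t)[2]
```

Under these assumptions, `recovery_probability(t)` plays the role of a simple transient survivability measure: it rises from 0 at the moment of breakdown toward 1 as t grows, and varying the two rates shows how sensitive recovery is to each one.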