Software rejuvenation policies for cluster systems under varying workload
This paper analyzes two software rejuvenation policies for cluster server systems under varying workload: fixed rejuvenation and delayed rejuvenation. To achieve a higher average throughput, we propose the delayed rejuvenation policy, which postpones the rejuvenation of individual nodes until off-peak hours. We build analytic models based on the well-known paradigm of Markov chains. Because the Markov models are large, we use stochastic Petri nets to automate both their specification and their solution. The deterministic time to trigger rejuvenation is approximated by a 20-stage Erlang distribution. Numerical solutions of the models show that, in the setting studied, fixed rejuvenation occasionally yields a higher throughput, but the delayed rejuvenation policy generally outperforms the fixed rejuvenation policy, by up to 11%. We also compare the steady-state system availabilities of the two policies.
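The Erlang approximation mentioned above can be illustrated with a minimal sketch. A deterministic delay T cannot be represented directly in a Markov chain, so it is replaced by k exponential stages in series, each with rate k/T; the resulting Erlang distribution has mean T and a coefficient of variation of 1/sqrt(k). The function name `erlang_approximation` and the 24-hour interval used in the usage note are illustrative assumptions, not values taken from the paper; only the stage count of 20 comes from the text.

```python
import math


def erlang_approximation(deterministic_time: float, stages: int = 20):
    """Moments of the k-stage Erlang distribution that stands in for a
    deterministic delay (hypothetical helper, for illustration only).

    Returns (stage_rate, mean, variance, coefficient_of_variation).
    """
    if deterministic_time <= 0:
        raise ValueError("deterministic_time must be positive")
    if stages < 1:
        raise ValueError("stages must be at least 1")

    # Each of the k exponential stages gets rate k / T, so the stage
    # means (T / k each) add up to the deterministic delay T.
    stage_rate = stages / deterministic_time

    mean = stages / stage_rate            # equals T
    variance = stages / stage_rate ** 2   # equals T^2 / k
    cv = math.sqrt(variance) / mean       # equals 1 / sqrt(k)
    return stage_rate, mean, variance, cv
```

For example, `erlang_approximation(24.0, 20)` (an assumed 24-hour rejuvenation interval) keeps the mean at 24 hours with a coefficient of variation of about 0.224. Adding stages tightens the distribution around T, at the cost of a larger state space in the underlying Markov chain.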