Availability analysis of blade server systems

Journal Article

The successful development and marketing of commercial high-availability systems requires the ability to evaluate the availability of systems. Specifically, one should be able to demonstrate that projected customer requirements are met, to identify availability bottlenecks, to evaluate and compare different configurations, and to evaluate and compare different designs. For evaluation approaches based on analytic modeling, these systems are often sufficiently complex that state-space methods are not effective due to the large number of states, whereas combinatorial methods are inadequate for capturing all significant dependencies. The two-level hierarchical decomposition proposed here is suitable for the availability modeling of blade server systems such as IBM BladeCenter®, a commercial, high-availability multicomponent system comprising up to 14 separate blade servers contained within a chassis that provides shared subsystems such as power and cooling. This approach is based on an availability model that combines a high-level fault tree model with a number of lower-level Markov models. It is used to determine component-level contributions to downtime as well as steady-state availability for both standalone and clustered blade servers. Sensitivity of the results to input parameters is examined, extensions to the models are described, and availability bottlenecks and possible solutions are identified. © Copyright 2008 by International Business Machines Corporation.
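To illustrate the general idea of a two-level hierarchical availability model, the sketch below combines simple lower-level Markov models with a small top-level fault tree. It is not the authors' model: the subsystem structure, the two-state (up/down) Markov chains, the assumption of independent subsystems, and every MTTF/MTTR figure are hypothetical and chosen only for illustration.

```python
from math import prod


def two_state_availability(mttf_hours, mttr_hours):
    """Steady-state availability of a two-state (up/down) Markov model.

    With failure rate lam = 1/MTTF and repair rate mu = 1/MTTR,
    the steady-state probability of the 'up' state is mu / (lam + mu).
    """
    failure_rate = 1.0 / mttf_hours
    repair_rate = 1.0 / mttr_hours
    return repair_rate / (failure_rate + repair_rate)


def and_gate(unavailabilities):
    """Fault-tree AND gate: output fails only if every input fails.

    Assumes the inputs are statistically independent.
    """
    return prod(unavailabilities)


def or_gate(unavailabilities):
    """Fault-tree OR gate: output fails if any input fails.

    Assumes the inputs are statistically independent.
    """
    return 1.0 - prod(1.0 - u for u in unavailabilities)


# Lower level: one Markov model per component (hypothetical MTTF/MTTR values).
a_power = two_state_availability(mttf_hours=100_000, mttr_hours=4)
a_fan = two_state_availability(mttf_hours=50_000, mttr_hours=4)
a_blade = two_state_availability(mttf_hours=20_000, mttr_hours=6)

# Upper level: fault tree over the component unavailabilities.
u_power = and_gate([1 - a_power] * 2)    # hypothetical redundant pair of power modules
u_cooling = and_gate([1 - a_fan] * 2)    # hypothetical redundant pair of blowers
u_system = or_gate([u_power, u_cooling, 1 - a_blade])

availability = 1 - u_system
downtime_minutes_per_year = u_system * 365 * 24 * 60

if __name__ == "__main__":
    print(f"steady-state availability: {availability:.6f}")
    print(f"expected downtime: {downtime_minutes_per_year:.1f} min/year")
```

In this toy setup the single non-redundant blade dominates the downtime, which is the kind of component-level contribution the abstract refers to. A realistic lower-level model would use richer Markov chains to capture dependencies that simple gates cannot express.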

Cited Authors

  • Smith, WE; Trivedi, KS; Tomek, LA; Ackaret, J

Published Date

  • December 1, 2008

Published In

Volume / Issue

  • 47 / 4

Start / End Page

  • 621 - 640

International Standard Serial Number (ISSN)

  • 0018-8670

Digital Object Identifier (DOI)

  • 10.1147/SJ.2008.5386524

Citation Source

  • Scopus