Resilience in computer systems and networks
The term resilience is used differently by different communities. In general engineering systems, fast recovery from a degraded system state is often termed as resilience. Computer networking community defines it as the combination of trustworthiness (dependability, security, performability) and tolerance (survivability, disruption tolerance, and traffic tolerance). Dependable computing community defined resilience as the persistence of service delivery that can justifiably be trusted, when facing changes. In this paper, resilience definitions of systems and networks will be presented. Metrics for resilience will be compared with dependability metrics such as availability, performance, performability. Simple examples will be used to show quantification of resilience via probabilistic analytic models. Copyright 2009 ACM.