Toward Predictive Fault Tolerance in a Core-Router System: Anomaly Detection Using Correlation-Based Time-Series Analysis

Published

Journal Article

© 2017 IEEE. Fault tolerance is used in communication systems to ensure high reliability and rapid error recovery. The effectiveness of most proactive fault-tolerant mechanisms depends on whether anomalies can be accurately detected before a failure occurs. However, traditional anomaly detection techniques fail to detect outliers when the monitored data involve temporal measurements and the constituent features exhibit significantly different statistical characteristics. We describe the design of an anomaly detector that monitors the time-series data of a complex core-router system. Anomaly detection techniques are compared in terms of their effectiveness in detecting different types of anomalies. A hybrid method based on feature categorization is proposed to overcome the difficulty of detecting anomalies in features with different statistical characteristics. Furthermore, a correlation analyzer is implemented to remove irrelevant and redundant features. Three types of synthetic anomalies, generated using a small amount of real data from a commercial telecom system, are used to validate the proposed anomaly detector.
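
The abstract does not specify how the correlation analyzer works. As a rough illustration of the general idea of removing irrelevant and redundant features, the sketch below drops constant columns and then greedily discards any feature whose absolute Pearson correlation with an already kept feature reaches a threshold. The function name `prune_redundant_features`, the 0.95 threshold, and the greedy ordering are all assumptions made for this example, not details taken from the article.

```python
import numpy as np

def prune_redundant_features(X, threshold=0.95):
    """Return the column indices kept after greedy correlation-based pruning.

    Hypothetical helper for illustration only; not the article's analyzer.
    X is a (samples x features) array of time-series measurements.
    """
    X = np.asarray(X, dtype=float)
    # Assumption: a constant column is treated as irrelevant. It would also
    # produce NaN correlations, so it is removed before computing them.
    varying = [j for j in range(X.shape[1]) if np.std(X[:, j]) > 0]
    if not varying:
        return []
    # Pearson correlation between the remaining columns.
    corr = np.atleast_2d(np.corrcoef(X[:, varying], rowvar=False))
    kept = []  # positions within `varying`
    for i in range(len(varying)):
        # Keep a feature only if it is not strongly correlated
        # with any feature that has already been kept.
        if all(abs(corr[i, k]) < threshold for k in kept):
            kept.append(i)
    return [varying[i] for i in kept]

# Toy data: a ramp, a linear copy of the ramp, a sine wave, a constant.
t = np.arange(50.0)
X = np.column_stack([t, 2 * t + 1, np.sin(t), np.ones(50)])
print(prune_redundant_features(X))
```

The result of a greedy pass depends on column order, so a real system would likely need a more principled choice of which feature in a correlated group to keep.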

Cited Authors

  • Jin, S; Zhang, Z; Chakrabarty, K; Gu, X

Published Date

  • October 1, 2018

Published In

  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Volume / Issue

  • 37 / 10

Start / End Page

  • 2111 - 2124

International Standard Serial Number (ISSN)

  • 0278-0070

Digital Object Identifier (DOI)

  • 10.1109/TCAD.2017.2775240

Citation Source

  • Scopus