Unsupervised Two-Stage Root-Cause Analysis with Transfer Learning for Integrated Systems

Journal Article

The growing complexity of integrated systems makes root-cause analysis increasingly difficult. To address this challenge, advances in machine learning (ML) have been leveraged in recent years to design ML-based techniques for root-cause analysis. However, most of these methods require root-cause labels for defective samples, and these labels must be obtained through analysis by human experts. In this paper, we propose a multi-algorithm two-stage clustering method with transfer learning for unsupervised root-cause analysis. First, we propose a two-stage clustering method that applies multiple clustering algorithms to accommodate both numerical and categorical data and uses the silhouette score for model selection. Next, we propose a double-bootstrapping method for data selection that transfers valuable information from a source product to a target product with insufficient data. In the first bootstrapping step, a random forest model is built to select effective source data. In the second bootstrapping step, a clustering ensemble is applied to the two-stage clustering to further improve the accuracy of root-cause analysis. Two case studies based on network products demonstrate the superior performance of the proposed approach compared to other state-of-the-art methods.
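
The abstract mentions silhouette-based model selection over clusterings of mixed numerical and categorical data. The following is a minimal sketch of that one idea only, not the paper's two-stage method or its transfer-learning steps. The Gower-style distance, the simple k-medoids routine, the function names, and the toy samples are all assumptions introduced for illustration.

```python
def gower_distance(a, b):
    """Assumed mixed-type distance: numeric features contribute their absolute
    difference (assumed pre-scaled to [0, 1]); string features contribute 0/1."""
    total = 0.0
    for x, y in zip(a, b):
        if isinstance(x, str):
            total += 0.0 if x == y else 1.0
        else:
            total += abs(x - y)
    return total / len(a)


def k_medoids(dist, k, max_iter=100):
    """Tiny k-medoids on a precomputed distance matrix.
    Initialisation with the first k samples is an arbitrary, deterministic choice."""
    n = len(dist)
    medoids = list(range(k))
    labels = [0] * n
    for _ in range(max_iter):
        # Assign every sample to its nearest medoid.
        labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster becomes empty
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to the others.
            new_medoids.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels


def silhouette_score(dist, labels):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        b = min(sum(dist[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(labels)


def select_clustering(samples, candidate_ks):
    """Cluster with each candidate k and keep the result with the best silhouette."""
    dist = [[gower_distance(a, b) for b in samples] for a in samples]
    best = None
    for k in candidate_ks:
        labels = k_medoids(dist, k)
        score = silhouette_score(dist, labels)
        if best is None or score > best[2]:
            best = (k, labels, score)
    return best


# Hypothetical defective-sample records: two scaled measurements plus a component tag.
samples = [
    (0.10, 0.20, "fan"), (0.15, 0.25, "fan"), (0.12, 0.22, "fan"),
    (0.90, 0.80, "psu"), (0.85, 0.90, "psu"), (0.95, 0.85, "psu"),
]
best_k, best_labels, best_score = select_clustering(samples, candidate_ks=(2, 3))
print(best_k, best_labels, round(best_score, 3))
```

In this sketch the candidates differ only in the number of clusters. A multi-algorithm setup like the one the abstract describes would instead score the outputs of several clustering algorithms with the same silhouette function.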

Cited Authors

  • Pan, R; Li, X; Chakrabarty, K

Published Date

  • January 1, 2022

Electronic International Standard Serial Number (EISSN)

  • 1937-4151

International Standard Serial Number (ISSN)

  • 0278-0070

Digital Object Identifier (DOI)

  • 10.1109/TCAD.2022.3176998

Citation Source

  • Scopus