Root-Cause Analysis with Semi-Supervised Co-Training for Integrated Systems
Root-cause analysis for integrated systems has become increasingly challenging due to their growing complexity. To tackle these challenges, machine learning (ML) has been applied to enhance root-cause analysis. Nonetheless, ML-based root-cause analysis usually requires abundant training data with root causes labeled by human experts, which are difficult or even impossible to obtain. To overcome this drawback, a semi-supervised co-training method is proposed for root-cause analysis in this article, which only requires a small portion of labeled data. First, a random forest is trained with labeled data. Next, we propose a co-training technique to learn from unlabeled data with semi-supervised learning, which pre-labels a subset of these data automatically and then retrains each decision tree in the random forest. In addition, a robust framework is proposed to avoid over-fitting. We further apply initialization by clustering and feature selection to improve the diagnostic performance. With two case studies from industry, the proposed approach shows superior performance against other state-of-the-art methods by saving up to 67% of labeling efforts.
Duke Scholars
Published In
DOI
EISSN
ISSN
Publication Date
Volume
Issue
Related Subject Headings
- Design Practice & Management
- 4612 Software engineering
- 4606 Distributed computing and systems software
- 4009 Electronics, sensors and digital hardware
- 1006 Computer Hardware
- 0803 Computer Software
Citation
Published In
DOI
EISSN
ISSN
Publication Date
Volume
Issue
Related Subject Headings
- Design Practice & Management
- 4612 Software engineering
- 4606 Distributed computing and systems software
- 4009 Electronics, sensors and digital hardware
- 1006 Computer Hardware
- 0803 Computer Software