
Domain Adaptive Log Anomaly Prediction for Hadoop System

Publication, Journal Article
Xie, Y; Yang, K
Published in: IEEE Internet of Things Journal
October 15, 2022

Hadoop provides a powerful platform for reliable, scalable, and distributed processing of massive data sets across a cluster of computers. Log data record the events that take place in a Hadoop system and help to explain system activities and diagnose problems. However, system upgrades and updates often change the syntax and patterns of logs, rendering machine-learning models that were designed for the legacy system ineffective. Retraining the models from scratch on new data sets might improve their accuracy, but annotating new data sets is often time-consuming and labor-intensive. In this article, we propose a domain adaptive log anomaly prediction framework called LogAT, which transfers learned knowledge from the existing labeled data set (source domain) to the new unlabeled data set (target domain) by adopting an unsupervised domain adaptation method. Furthermore, a hierarchical anomaly knowledge graph is constructed to represent the domain knowledge that facilitates the subsequent detection and diagnosis of system faults. Extensive experiments on public and real-world data sets validate the effectiveness of the proposed framework as well as of each module. Our results show that LogAT outperforms state-of-the-art methods for predicting log anomalies and achieves a considerable improvement in AUC-ROC score on different Hadoop application data sets.
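The abstract reports results as AUC-ROC scores. As a reminder of what that metric measures, here is a minimal sketch in plain Python: the score is the probability that a randomly chosen anomalous sample receives a higher anomaly score than a randomly chosen normal one, with ties counted as half. The function name `auc_roc` and the toy labels and scores are illustrative assumptions; they are not taken from the paper and do not reproduce the LogAT evaluation.

```python
def auc_roc(labels, scores):
    """Pairwise AUC-ROC: fraction of (anomalous, normal) pairs ranked correctly.

    labels: iterable of 0 (normal) or 1 (anomalous)
    scores: iterable of predicted anomaly scores, higher means more anomalous
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # anomalous sample ranked above normal sample
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two normal and two anomalous log sequences.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc_roc(labels, scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

The quadratic pairwise loop is chosen for readability; on large data sets a rank-based computation or a library routine would be the usual choice.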

Published In

IEEE Internet of Things Journal

DOI

10.1109/JIOT.2022.3178873

EISSN

2327-4662

Publication Date

October 15, 2022

Volume

9

Issue

20

Start / End Page

20778 / 20787

Related Subject Headings

  • 46 Information and computing sciences
  • 40 Engineering
  • 1005 Communications Technologies
  • 0805 Distributed Computing
 

Citation

APA: Xie, Y., & Yang, K. (2022). Domain Adaptive Log Anomaly Prediction for Hadoop System. IEEE Internet of Things Journal, 9(20), 20778–20787. https://doi.org/10.1109/JIOT.2022.3178873
Chicago: Xie, Y., and K. Yang. “Domain Adaptive Log Anomaly Prediction for Hadoop System.” IEEE Internet of Things Journal 9, no. 20 (October 15, 2022): 20778–87. https://doi.org/10.1109/JIOT.2022.3178873.
ICMJE: Xie Y, Yang K. Domain Adaptive Log Anomaly Prediction for Hadoop System. IEEE Internet of Things Journal. 2022 Oct 15;9(20):20778–87.
MLA: Xie, Y., and K. Yang. “Domain Adaptive Log Anomaly Prediction for Hadoop System.” IEEE Internet of Things Journal, vol. 9, no. 20, Oct. 2022, pp. 20778–87. Scopus, doi:10.1109/JIOT.2022.3178873.
NLM: Xie Y, Yang K. Domain Adaptive Log Anomaly Prediction for Hadoop System. IEEE Internet of Things Journal. 2022 Oct 15;9(20):20778–20787.
