Mining multidimensional contextual outliers from categorical relational data

Publication, Journal Article
Tang, G; Pei, J; Bailey, J; Dong, G
Published in: Intelligent Data Analysis
September 8, 2015

A wide range of methods have been proposed for detecting different types of outliers in both the full attribute space and its subspaces. However, the interpretability of outliers, that is, explaining in what ways and to what extent an object is an outlier, remains a critical issue. In this paper, we focus on improving the interpretability of outliers. In particular, we develop a notion of multidimensional contextual outliers to model the context of an outlier, and propose a framework for contextual outlier detection. Intuitively, a contextual outlier is a small group of objects that share strong similarity with a significantly larger reference group of objects on some attributes, but deviate dramatically on some other attributes. In contextual outlier detection, we identify not only the outliers, but also their associated contextual information, including (1) the reference group of objects relative to which the detected object(s) is/are an outlier; (2) the attributes defining the unusual behavior of the outlier(s) compared with the reference group; (3) the population of similar outliers sharing the same context; and (4) the outlier degree, which measures the population ratio between the reference group and the outlier group. We present an algorithm and conduct extensive experiments to evaluate our approach.
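To make the four pieces of contextual information concrete, the following is a minimal illustrative sketch in Python. It is not the algorithm from the paper. It assumes a single fixed set of context attributes and one behavior attribute, treats the most common behavior value within a context as the reference group, and computes the outlier degree as the reference-group size divided by the outlier-group size. The function name `contextual_outliers`, its parameters, the `min_degree` threshold and the toy records are all hypothetical.

```python
from collections import Counter, defaultdict


def contextual_outliers(rows, context_attrs, behavior_attr, min_degree=3.0):
    """Flag small groups that agree with a larger group on context_attrs
    but differ from it on behavior_attr (illustrative sketch only)."""
    # Partition records by their values on the shared (context) attributes
    # and count how often each behavior value occurs in each partition.
    by_context = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in context_attrs)
        by_context[key][row[behavior_attr]] += 1

    findings = []
    for key, counts in by_context.items():
        # Assumption: the most common behavior value is the reference group.
        ref_value, ref_size = counts.most_common(1)[0]
        for value, size in counts.items():
            if value == ref_value:
                continue
            # Outlier degree: population ratio of reference to outlier group.
            degree = ref_size / size
            if degree >= min_degree:
                findings.append({
                    "context": dict(zip(context_attrs, key)),  # shared attributes
                    "reference": {behavior_attr: ref_value},   # (1) reference group
                    "outlier": {behavior_attr: value},         # (2) deviating attribute
                    "outlier_population": size,                # (3) similar outliers
                    "degree": degree,                          # (4) outlier degree
                })
    return findings


# Hypothetical toy data: categorical records with two attributes.
rows = (
    [{"dept": "cardiology", "drug": "A"}] * 8
    + [{"dept": "cardiology", "drug": "B"}] * 2
    + [{"dept": "oncology", "drug": "C"}] * 3
)
found = contextual_outliers(rows, ["dept"], "drug", min_degree=3.0)
for f in found:
    print(f)
```

In this toy data the two cardiology records with drug B share the context `dept = cardiology` with eight records that use drug A, so they form an outlier group measured against that reference group. A full method would also have to search over which attributes form the context, which this sketch leaves fixed.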

Published In

Intelligent Data Analysis

DOI

10.3233/IDA-150764

EISSN

1571-4128

ISSN

1088-467X

Publication Date

September 8, 2015

Volume

19

Issue

5

Start / End Page

1171 / 1192

Related Subject Headings

  • Artificial Intelligence & Image Processing
  • 4611 Machine learning
  • 4605 Data management and data science
  • 4602 Artificial intelligence
  • 1702 Cognitive Sciences
  • 0804 Data Format
  • 0801 Artificial Intelligence and Image Processing
 

Citation

APA: Tang, G., Pei, J., Bailey, J., & Dong, G. (2015). Mining multidimensional contextual outliers from categorical relational data. Intelligent Data Analysis, 19(5), 1171–1192. https://doi.org/10.3233/IDA-150764
Chicago: Tang, G., J. Pei, J. Bailey, and G. Dong. “Mining multidimensional contextual outliers from categorical relational data.” Intelligent Data Analysis 19, no. 5 (September 8, 2015): 1171–92. https://doi.org/10.3233/IDA-150764.
ICMJE: Tang G, Pei J, Bailey J, Dong G. Mining multidimensional contextual outliers from categorical relational data. Intelligent Data Analysis. 2015 Sep 8;19(5):1171–92.
MLA: Tang, G., et al. “Mining multidimensional contextual outliers from categorical relational data.” Intelligent Data Analysis, vol. 19, no. 5, Sept. 2015, pp. 1171–92. Scopus, doi:10.3233/IDA-150764.
NLM: Tang G, Pei J, Bailey J, Dong G. Mining multidimensional contextual outliers from categorical relational data. Intelligent Data Analysis. 2015 Sep 8;19(5):1171–1192.