Mining sequence classifiers for early prediction

Publication, Conference
Xing, Z; Dong, G; Pei, J; Yu, PS
Published in: Society for Industrial and Applied Mathematics - 8th SIAM International Conference on Data Mining 2008, Proceedings in Applied Mathematics 130
January 1, 2008

Supervised learning on sequence data, also known as sequence classification, has been well recognized as an important data mining task with many significant applications. Since temporal order is important in sequence data, in many critical applications of sequence classification such as medical diagnosis and disaster prediction, early prediction is a highly desirable feature of sequence classifiers. In early prediction, a sequence classifier should use a prefix of a sequence as short as possible to make a reasonably accurate prediction. To the best of our knowledge, early prediction on sequence data has not been studied systematically. In this paper, we identify the novel problem of mining sequence classifiers for early prediction. We analyze the problem and the challenges. As the first attempt to tackle the problem, we propose two interesting methods. The sequential classification rule (SCR) method mines a set of sequential classification rules as a classifier. A so-called early-prediction utility is defined and used to select features and rules. The generalized sequential decision tree (GSDT) method adopts a divide-and-conquer strategy to generate a classification model. We conduct an extensive empirical evaluation on several real data sets. Interestingly, our two methods achieve accuracy comparable to that of the state-of-the-art methods, but typically need to use only very short prefixes of the sequences. The results clearly indicate that early prediction is highly feasible and effective. Copyright © by SIAM.
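
The early-prediction setting described in the abstract can be illustrated with a toy sketch: read a sequence one symbol at a time and commit to a label as soon as some rule matches the prefix seen so far. This is not the paper's SCR or GSDT method. The rule format (a symbol pattern paired with a label), the subsequence-matching criterion, and the names `is_subsequence` and `early_predict` are all assumptions made for illustration, and the sketch does not model the early-prediction utility the abstract mentions.

```python
from typing import List, Sequence, Tuple

# Hypothetical rule format (an assumption, not taken from the paper):
# a pattern of symbols paired with the class label it predicts.
Rule = Tuple[Tuple[str, ...], str]


def is_subsequence(pattern: Sequence[str], prefix: Sequence[str]) -> bool:
    """Return True if pattern occurs in prefix in order (gaps allowed)."""
    it = iter(prefix)
    # `sym in it` advances the iterator, so matches must respect order.
    return all(sym in it for sym in pattern)


def early_predict(
    sequence: Sequence[str], rules: List[Rule], default: str
) -> Tuple[str, int]:
    """Scan growing prefixes and stop at the first one matched by a rule.

    Returns the predicted label and the prefix length that was consumed.
    If no rule ever matches, returns the default label and the full length.
    """
    for n in range(1, len(sequence) + 1):
        prefix = sequence[:n]
        for pattern, label in rules:
            if is_subsequence(pattern, prefix):
                return label, n
    return default, len(sequence)


# Toy rules and sequences, invented for this example.
rules = [(("a", "b"), "pos"), (("c",), "neg")]
print(early_predict("xabyyy", rules, "unknown"))  # ('pos', 3)
print(early_predict("xxcab", rules, "unknown"))   # ('neg', 3)
print(early_predict("xyz", rules, "unknown"))     # ('unknown', 3)
```

The returned prefix length is the quantity an early-prediction method tries to keep small while keeping accuracy comparable to a classifier that reads the whole sequence.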

Published In: Society for Industrial and Applied Mathematics - 8th SIAM International Conference on Data Mining 2008, Proceedings in Applied Mathematics 130
DOI: 10.1137/1.9781611972788.59
ISBN: 9781605603179
Publication Date: January 1, 2008
Volume: 2
Start / End Page: 644 / 655

Citation

APA: Xing, Z., Dong, G., Pei, J., & Yu, P. S. (2008). Mining sequence classifiers for early prediction. In Society for Industrial and Applied Mathematics - 8th SIAM International Conference on Data Mining 2008, Proceedings in Applied Mathematics 130 (Vol. 2, pp. 644–655). https://doi.org/10.1137/1.9781611972788.59

Chicago: Xing, Z., G. Dong, J. Pei, and P. S. Yu. “Mining sequence classifiers for early prediction.” In Society for Industrial and Applied Mathematics - 8th SIAM International Conference on Data Mining 2008, Proceedings in Applied Mathematics 130, 2:644–55, 2008. https://doi.org/10.1137/1.9781611972788.59.

ICMJE: Xing Z, Dong G, Pei J, Yu PS. Mining sequence classifiers for early prediction. In: Society for Industrial and Applied Mathematics - 8th SIAM International Conference on Data Mining 2008, Proceedings in Applied Mathematics 130. 2008. p. 644–55.

MLA: Xing, Z., et al. “Mining sequence classifiers for early prediction.” Society for Industrial and Applied Mathematics - 8th SIAM International Conference on Data Mining 2008, Proceedings in Applied Mathematics 130, vol. 2, 2008, pp. 644–55. Scopus, doi:10.1137/1.9781611972788.59.

NLM: Xing Z, Dong G, Pei J, Yu PS. Mining sequence classifiers for early prediction. Society for Industrial and Applied Mathematics - 8th SIAM International Conference on Data Mining 2008, Proceedings in Applied Mathematics 130. 2008. p. 644–655.
