Scholars@Duke publication: Efficient matching of substrings in uncertain sequences

Efficient matching of substrings in uncertain sequences

Publication, Conference

Li, Y; Bailey, J; Kulik, L; Pei, J

Published in: SIAM International Conference on Data Mining 2014, SDM 2014

January 1, 2014

Substring matching is fundamental to data mining methods for sequential data. It involves checking the existence of a short subsequence within a longer sequence, ensuring no gaps within a match. Whilst a large amount of existing work has focused on substring matching and mining techniques for certain sequences, there are only a few results for uncertain sequences. Uncertain sequences provide powerful representations for modelling sequence behavioural characteristics in emerging domains, such as bioinformatics, sensor streams and trajectory analysis. In this paper, we focus on the core problem of computing substring matching probability in uncertain sequences and propose an efficient dynamic programming algorithm for this task. We demonstrate that our approach is competitive theoretically, as well as effective and scalable experimentally. Our results contribute towards a foundation for adapting classic sequence mining methods to deal with uncertain data.
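
To make the problem statement concrete, the following is a minimal sketch of one textbook way to compute a substring matching probability. It is not the algorithm proposed in the paper. It assumes a character-level uncertainty model in which each position of the sequence carries an independent probability distribution over symbols; that model, and the hypothetical names build_automaton and match_probability, are assumptions made for illustration only. The sketch runs a dynamic program over the states of a pattern-matching automaton and accumulates the probability mass that reaches the full-match state.

```python
from typing import Dict, List


def build_automaton(pattern: str, alphabet: List[str]) -> List[Dict[str, int]]:
    """Return delta[q][ch]: the length of the longest prefix of `pattern`
    that is a suffix of pattern[:q] + ch (built by direct search)."""
    m = len(pattern)
    delta = []
    for q in range(m):
        row = {}
        for ch in alphabet:
            s = pattern[:q] + ch
            k = min(m, len(s))
            while k > 0 and not s.endswith(pattern[:k]):
                k -= 1
            row[ch] = k
        delta.append(row)
    return delta


def match_probability(pattern: str, uncertain_seq: List[Dict[str, float]]) -> float:
    """Probability that `pattern` occurs as a gap-free substring of a sequence
    whose positions are independent distributions over symbols (assumed model)."""
    if not pattern:
        return 1.0
    m = len(pattern)
    alphabet = sorted({ch for dist in uncertain_seq for ch in dist} | set(pattern))
    delta = build_automaton(pattern, alphabet)

    state_prob = [0.0] * m   # state_prob[q]: mass with q pattern symbols matched, no full match yet
    state_prob[0] = 1.0
    matched = 0.0            # absorbing mass: a full match has already occurred

    for dist in uncertain_seq:
        nxt = [0.0] * m
        for q, p in enumerate(state_prob):
            if p == 0.0:
                continue
            for ch, pc in dist.items():
                k = delta[q][ch]
                if k == m:
                    matched += p * pc
                else:
                    nxt[k] += p * pc
        state_prob = nxt
    return matched


# Three positions, each 'a' or 'b' with probability 0.5.
seq = [{"a": 0.5, "b": 0.5}] * 3
print(match_probability("ab", seq))  # 0.5 (aab, aba, abb, bab out of 8 worlds)
```

Under the stated independence assumption, the running time of this sketch is proportional to the sequence length times the pattern length times the alphabet size, which avoids enumerating every possible realisation of the uncertain sequence.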

Duke Scholars

Author: Jian Pei (Computer Science)

Published In

SIAM International Conference on Data Mining 2014, SDM 2014

DOI

10.1137/1.9781611973440.88

Publication Date

January 1, 2014

Volume

2

Start / End Page

767 / 775

Citation

APA

Li, Y., Bailey, J., Kulik, L., & Pei, J. (2014). Efficient matching of substrings in uncertain sequences. In SIAM International Conference on Data Mining 2014, SDM 2014 (Vol. 2, pp. 767–775). https://doi.org/10.1137/1.9781611973440.88

Chicago

Li, Y., J. Bailey, L. Kulik, and J. Pei. “Efficient matching of substrings in uncertain sequences.” In SIAM International Conference on Data Mining 2014, SDM 2014, 2:767–75, 2014. https://doi.org/10.1137/1.9781611973440.88.

ICMJE

Li Y, Bailey J, Kulik L, Pei J. Efficient matching of substrings in uncertain sequences. In: SIAM International Conference on Data Mining 2014, SDM 2014. 2014. p. 767–75.

MLA

Li, Y., et al. “Efficient matching of substrings in uncertain sequences.” SIAM International Conference on Data Mining 2014, SDM 2014, vol. 2, 2014, pp. 767–75. Scopus, doi:10.1137/1.9781611973440.88.

NLM

Li Y, Bailey J, Kulik L, Pei J. Efficient matching of substrings in uncertain sequences. SIAM International Conference on Data Mining 2014, SDM 2014. 2014. p. 767–775.
