Efficient matching of substrings in uncertain sequences

Publication, Conference
Li, Y; Bailey, J; Kulik, L; Pei, J
Published in: SIAM International Conference on Data Mining 2014, SDM 2014
January 1, 2014

Substring matching is fundamental to data mining methods for sequential data. It involves checking the existence of a short subsequence within a longer sequence, ensuring no gaps within a match. Whilst a large amount of existing work has focused on substring matching and mining techniques for certain sequences, there are only a few results for uncertain sequences. Uncertain sequences provide powerful representations for modelling sequence behavioural characteristics in emerging domains, such as bioinformatics, sensor streams and trajectory analysis. In this paper, we focus on the core problem of computing substring matching probability in uncertain sequences and propose an efficient dynamic programming algorithm for this task. We demonstrate our approach is both competitive theoretically, as well as effective and scalable experimentally. Our results contribute towards a foundation for adapting classic sequence mining methods to deal with uncertain data.
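To make the core problem concrete, the following is a minimal sketch of one standard way to compute the probability that a pattern occurs at least once, without gaps, in an uncertain sequence. It is not the algorithm from the paper, which this record does not describe. It assumes a character-level uncertainty model in which each position carries an independent probability distribution over symbols, and it runs a dynamic program over the states of a pattern-matching automaton. The function name `match_probability` and the dictionary-based input format are hypothetical choices made for illustration.

```python
from typing import Dict, List


def match_probability(pattern: str, uncertain_seq: List[Dict[str, float]]) -> float:
    """Probability that `pattern` occurs at least once as a contiguous substring.

    Assumption (not taken from the paper): each position of the uncertain
    sequence is an independent distribution {symbol: probability}.
    """
    m = len(pattern)
    if m == 0:
        return 1.0  # the empty pattern trivially matches

    alphabet = set(pattern)
    for dist in uncertain_seq:
        alphabet.update(dist)

    # delta[q][c]: length of the longest prefix of `pattern` that is a suffix
    # of pattern[:q] + c. Built naively for clarity rather than speed.
    delta: List[Dict[str, int]] = []
    for q in range(m):
        row = {}
        for c in alphabet:
            s = pattern[:q] + c
            k = len(s)
            while k > 0 and not s.endswith(pattern[:k]):
                k -= 1
            row[c] = k
        delta.append(row)

    # state[q]: probability of being in automaton state q with no full match
    # seen so far; state[m] is absorbing and accumulates the match probability.
    state = [0.0] * (m + 1)
    state[0] = 1.0
    for dist in uncertain_seq:
        nxt = [0.0] * (m + 1)
        nxt[m] = state[m]  # once matched, always matched
        for q in range(m):
            if state[q] == 0.0:
                continue
            for c, p in dist.items():
                nxt[delta[q][c]] += state[q] * p
        state = nxt
    return state[m]


if __name__ == "__main__":
    coin = {"A": 0.5, "B": 0.5}
    print(match_probability("AB", [coin, coin, coin]))
```

With three positions that are each "A" or "B" with equal probability, four of the eight possible strings (AAB, ABA, ABB, BAB) contain "AB", so the sketch reports 0.5. Its cost is proportional to the sequence length times the pattern length times the number of symbols per position, plus the naive automaton construction.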

Published In

SIAM International Conference on Data Mining 2014, SDM 2014

DOI

10.1137/1.9781611973440.88

Publication Date

January 1, 2014

Volume

2

Start / End Page

767 / 775
 

Citation

APA: Li, Y., Bailey, J., Kulik, L., & Pei, J. (2014). Efficient matching of substrings in uncertain sequences. In SIAM International Conference on Data Mining 2014, SDM 2014 (Vol. 2, pp. 767–775). https://doi.org/10.1137/1.9781611973440.88
Chicago: Li, Y., J. Bailey, L. Kulik, and J. Pei. “Efficient matching of substrings in uncertain sequences.” In SIAM International Conference on Data Mining 2014, SDM 2014, 2:767–75, 2014. https://doi.org/10.1137/1.9781611973440.88.
ICMJE: Li Y, Bailey J, Kulik L, Pei J. Efficient matching of substrings in uncertain sequences. In: SIAM International Conference on Data Mining 2014, SDM 2014. 2014. p. 767–75.
MLA: Li, Y., et al. “Efficient matching of substrings in uncertain sequences.” SIAM International Conference on Data Mining 2014, SDM 2014, vol. 2, 2014, pp. 767–75. Scopus, doi:10.1137/1.9781611973440.88.
NLM: Li Y, Bailey J, Kulik L, Pei J. Efficient matching of substrings in uncertain sequences. SIAM International Conference on Data Mining 2014, SDM 2014. 2014. p. 767–775.
