Scholars@Duke publication: On shortest unique substring queries

On shortest unique substring queries

Publication , Conference

Pei, J; Wu, WCH; Yeh, MY

Published in: Proceedings - International Conference on Data Engineering

August 15, 2013

In this paper, we tackle a novel type of interesting queries - shortest unique substring queries. Given a (long) string S and a query point q in the string, can we find a shortest substring containing q that is unique in S? We illustrate that shortest unique substring queries have many potential applications, such as information retrieval, bioinformatics, and event context analysis. We develop efficient algorithms for online query answering. First, we present an algorithm to answer a shortest unique substring query in O(n) time using a suffix tree index, where n is the length of string S. Second, we show that, using O(n·h) time and O(n) space, we can compute a shortest unique substring for every position in a given string, where h is variable theoretically in O(n) but on real data sets often much smaller than n and can be treated as a constant. Once the shortest unique substrings are pre-computed, shortest unique substring queries can be answered online in constant time. In addition to the solid algorithmic results, we empirically demonstrate the effectiveness and efficiency of shortest unique substring queries on real data sets. © 2013 IEEE.

Duke Scholars

Author Jian Pei Computer Science

Published In

Proceedings - International Conference on Data Engineering

DOI

10.1109/ICDE.2013.6544887

ISSN

1084-4627

Publication Date

August 15, 2013

Start / End Page

937 / 948

Citation

APA

Chicago

ICMJE

MLA

NLM

Pei, J., Wu, W. C. H., & Yeh, M. Y. (2013). On shortest unique substring queries. In Proceedings - International Conference on Data Engineering (pp. 937–948). https://doi.org/10.1109/ICDE.2013.6544887

Pei, J., W. C. H. Wu, and M. Y. Yeh. “On shortest unique substring queries.” In Proceedings - International Conference on Data Engineering, 937–48, 2013. https://doi.org/10.1109/ICDE.2013.6544887.

Pei J, Wu WCH, Yeh MY. On shortest unique substring queries. In: Proceedings - International Conference on Data Engineering. 2013. p. 937–48.

Pei, J., et al. “On shortest unique substring queries.” Proceedings - International Conference on Data Engineering, 2013, pp. 937–48. Scopus, doi:10.1109/ICDE.2013.6544887.

Pei J, Wu WCH, Yeh MY. On shortest unique substring queries. Proceedings - International Conference on Data Engineering. 2013. p. 937–948.

Published In

Proceedings - International Conference on Data Engineering

DOI

10.1109/ICDE.2013.6544887

ISSN

1084-4627

Publication Date

August 15, 2013

Start / End Page

937 / 948