Skip to main content

On shortest unique substring queries

Publication ,  Conference
Pei, J; Wu, WCH; Yeh, MY
Published in: Proceedings - International Conference on Data Engineering
August 15, 2013

In this paper, we tackle a novel type of interesting queries - shortest unique substring queries. Given a (long) string S and a query point q in the string, can we find a shortest substring containing q that is unique in S? We illustrate that shortest unique substring queries have many potential applications, such as information retrieval, bioinformatics, and event context analysis. We develop efficient algorithms for online query answering. First, we present an algorithm to answer a shortest unique substring query in O(n) time using a suffix tree index, where n is the length of string S. Second, we show that, using O(n·h) time and O(n) space, we can compute a shortest unique substring for every position in a given string, where h is variable theoretically in O(n) but on real data sets often much smaller than n and can be treated as a constant. Once the shortest unique substrings are pre-computed, shortest unique substring queries can be answered online in constant time. In addition to the solid algorithmic results, we empirically demonstrate the effectiveness and efficiency of shortest unique substring queries on real data sets. © 2013 IEEE.

Duke Scholars

Published In

Proceedings - International Conference on Data Engineering

DOI

ISSN

1084-4627

ISBN

9781467349086

Publication Date

August 15, 2013

Start / End Page

937 / 948
 

Citation

APA
Chicago
ICMJE
MLA
NLM
Pei, J., Wu, W. C. H., & Yeh, M. Y. (2013). On shortest unique substring queries. In Proceedings - International Conference on Data Engineering (pp. 937–948). https://doi.org/10.1109/ICDE.2013.6544887
Pei, J., W. C. H. Wu, and M. Y. Yeh. “On shortest unique substring queries.” In Proceedings - International Conference on Data Engineering, 937–48, 2013. https://doi.org/10.1109/ICDE.2013.6544887.
Pei J, Wu WCH, Yeh MY. On shortest unique substring queries. In: Proceedings - International Conference on Data Engineering. 2013. p. 937–48.
Pei, J., et al. “On shortest unique substring queries.” Proceedings - International Conference on Data Engineering, 2013, pp. 937–48. Scopus, doi:10.1109/ICDE.2013.6544887.
Pei J, Wu WCH, Yeh MY. On shortest unique substring queries. Proceedings - International Conference on Data Engineering. 2013. p. 937–948.

Published In

Proceedings - International Conference on Data Engineering

DOI

ISSN

1084-4627

ISBN

9781467349086

Publication Date

August 15, 2013

Start / End Page

937 / 948