Skip to main content
Journal cover image

Ranking queries on uncertain data

Publication ,  Journal Article
Hua, M; Pei, J; Lin, X
Published in: VLDB Journal
February 1, 2011

Uncertain data is inherent in a few important applications. It is far from trivial to extend ranking queries (also known as top-k queries), a popular type of queries on certain data, to uncertain data. In this paper, we cast ranking queries on uncertain data using three parameters: rank threshold k, probability threshold p, and answer set size threshold l. Systematically, we identify four types of ranking queries on uncertain data. First, a probability threshold top-k query computes the uncertain records taking a probability of at least p to be in the top-k list. Second, a top-(k, l) query returns the top-l uncertain records whose probabilities of being ranked among top-k are the largest. Third, the p-rank of an uncertain record is the smallest number k such that the record takes a probability of at least p to be ranked in the top-k list. A rank threshold top-k query retrieves the records whose p-ranks are at most k. Last, a top-(p, l) query returns the top-l uncertain records with the smallest p-ranks. To answer such ranking queries, we present an efficient exact algorithm, a fast sampling algorithm, and a Poisson approximation-based algorithm. To answer top-(k, l) queries and top-(p, l) queries, we propose PRist+, a compact index. An efficient index construction algorithm and efficacious query answering methods are developed for PRist+. An empirical study using real and synthetic data sets verifies the effectiveness of the probabilistic ranking queries and the efficiency of our methods. © 2010 Springer-Verlag.

Duke Scholars

Published In

VLDB Journal

DOI

EISSN

0949-877X

ISSN

1066-8888

Publication Date

February 1, 2011

Volume

20

Issue

1

Start / End Page

129 / 153

Related Subject Headings

  • Information Systems
  • 4605 Data management and data science
  • 0806 Information Systems
  • 0805 Distributed Computing
  • 0804 Data Format
 

Citation

APA
Chicago
ICMJE
MLA
NLM
Hua, M., Pei, J., & Lin, X. (2011). Ranking queries on uncertain data. VLDB Journal, 20(1), 129–153. https://doi.org/10.1007/s00778-010-0196-4
Hua, M., J. Pei, and X. Lin. “Ranking queries on uncertain data.” VLDB Journal 20, no. 1 (February 1, 2011): 129–53. https://doi.org/10.1007/s00778-010-0196-4.
Hua M, Pei J, Lin X. Ranking queries on uncertain data. VLDB Journal. 2011 Feb 1;20(1):129–53.
Hua, M., et al. “Ranking queries on uncertain data.” VLDB Journal, vol. 20, no. 1, Feb. 2011, pp. 129–53. Scopus, doi:10.1007/s00778-010-0196-4.
Hua M, Pei J, Lin X. Ranking queries on uncertain data. VLDB Journal. 2011 Feb 1;20(1):129–153.
Journal cover image

Published In

VLDB Journal

DOI

EISSN

0949-877X

ISSN

1066-8888

Publication Date

February 1, 2011

Volume

20

Issue

1

Start / End Page

129 / 153

Related Subject Headings

  • Information Systems
  • 4605 Data management and data science
  • 0806 Information Systems
  • 0805 Distributed Computing
  • 0804 Data Format