Efficient Indexes for Diverse Top-k Range Queries
© 2020 Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. All rights reserved. Let P be a set of n (non-negatively) weighted points in Rd. We consider the problem of computing a subset of (at most) k diverse and high-valued points of P that lie inside a query range, a problem relevant to many areas such as search engines, recommendation systems, and online stores. The diversity and value of a set of points are measured as functions (say average or minimum) of their pairwise distances and weights, respectively. We study both bicriteria and constrained optimization problems. In the former, we wish to return a set of k points that maximize a weighted sum of their value and diversity measures, and in the latter, we wish to return a set of at most k points that maximize their value and satisfy a diversity constraint. We obtain three main types of results in this paper: Near-linear time (0.5-ϵ)-approximation algorithms for the bicriteria optimization problem in the offline setting. Near-linear size indexes for the bicriteria optimization problem that for a query rectangle return a (0.5-ϵ)-approximate solution in time O(k polylog(n)). The indexes can be constructed in O(n polylog(n)) time. Near-linear size indexes for answering constrained optimization range queries. For a query rectangle, a 0.5O(d)-approximate solution can be computed in O(k polylog(n)) time. If we allow some of the returned points to lie at most ϵ outside of the query rectangle then an (1-ϵ)-approximate solution can be computed in O(k polylog(n)) time. The indexes are constructed in O(n polylog(n)) and nO(1/ϵd) time, respectively.
Agarwal, PK; Sintos, S; Steiger, A
Proceedings of the Acm Sigact Sigmod Sigart Symposium on Principles of Database Systems
Start / End Page
International Standard Book Number 13 (ISBN-13)
Digital Object Identifier (DOI)