The pipelined set cover problem
A classical problem in query optimization is to find the optimal ordering of a set of possibly correlated selections. We provide an abstraction of this problem as a generalization of set cover called pipelined set cover, where the sets are applied sequentially to the elements to be covered and the elements covered at each stage are discarded. We show that several natural heuristics for this NP-hard problem, such as the greedy set-cover heuristic and a local-search heuristic, can be analyzed using a linear-programming framework. These heuristics lead to efficient algorithms for pipelined set cover that can be applied to order possibly correlated selections in conventional database systems as well as data-stream processing systems. We use our linear-programming framework to show that the greedy and local-search algorithms are 4-approximations for pipelined set cover. We extend our analysis to minimize the lp-norm of the costs paid by the sets, where p ≥ 2 is an integer, to examine the improvement in performance when the total cost has increasing contribution from initial sets in the pipeline. Finally, we consider the online version of pipelined set cover and present a competitive algorithm with a logarithmic performance guarantee. Our analysis framework may be applicable to other problems in query optimization where it is important to account for correlations. © 2005 Springer-Verlag.
Duke Scholars
Altmetric Attention Stats
Dimensions Citation Stats
Published In
DOI
EISSN
ISSN
Publication Date
Volume
Start / End Page
Related Subject Headings
- Artificial Intelligence & Image Processing
- 46 Information and computing sciences
Citation
Published In
DOI
EISSN
ISSN
Publication Date
Volume
Start / End Page
Related Subject Headings
- Artificial Intelligence & Image Processing
- 46 Information and computing sciences