Faster query answering in probabilistic databases using read-once functions

Published

Conference Paper

A boolean expression is in read-once form if each of its variables appears exactly once. When the variables denote independent events in a probability space, the probability of the event denoted by the whole expression in read-once form can be computed in polynomial time (whereas the general problem for arbitrary expressions is #P-complete). Known approaches to checking read-once property seem to require putting these expressions in disjunctive normal form. In this paper, we tell a better story for a large subclass of boolean event expressions: those that are generated by conjunctive queries without self-joins and on tuple-independent probabilistic databases. We first show that given a tuple-independent representation and the provenance graph of an SPJ query plan without self-joins, we can, without using the DNF of a result event expression, efficiently compute its co-occurrence graph. From this, the read-once form can already, if it exists, be computed efficiently using existing techniques. Our second and key contribution is a complete, efficient, and simple to implement algorithm for computing the read-once forms (whenever they exist) directly, using a new concept, that of co-table graph, which can be significantly smaller than the cooccurrence graph. © 2011 ACM.

Full Text

Duke Authors

Cited Authors

  • Roy, S; Perduca, V; Tannen, V

Published Date

  • March 11, 2011

Published In

  • Acm International Conference Proceeding Series

Start / End Page

  • 232 - 243

International Standard Book Number 13 (ISBN-13)

  • 9781450305297

Digital Object Identifier (DOI)

  • 10.1145/1938551.1938582

Citation Source

  • Scopus