Large-scale uncertainty management systems: Learning and exploiting your data

Conference Paper

The database community has made rapid strides in capturing, representing, and querying uncertain data. Probabilistic databases capture the inherent uncertainty in derived tuples as probability estimates. Data acquisition and stream systems can produce succinct summaries of very large, time-varying datasets. This tutorial addresses the natural next step in harnessing uncertain data: how can we efficiently and quantifiably determine what, how, and how much to learn in order to make good decisions based on the imprecise information available? The material in this tutorial is drawn from a range of fields, including database systems, control and information theory, operations research, convex optimization, and statistical learning. The tutorial focuses on the natural constraints imposed in a database context and on the demands that imprecise information makes from an optimization point of view. We look both into the past and into the future: we discuss general tools and techniques that can serve as a guide to database researchers and practitioners, and we enumerate the challenges that lie ahead. © 2009 ACM.
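To make the abstract's first point concrete, here is a minimal sketch of the tuple-independent model that many probabilistic databases use, in which each tuple carries a marginal probability of existing. The table, column names, and predicate below are hypothetical illustrations, not material from the tutorial itself.

```python
# Hypothetical tuple-independent probabilistic table:
# each row is (id, temperature, existence probability).
readings = [
    ("sensor_1", 71.0, 0.9),
    ("sensor_2", 68.5, 0.6),
    ("sensor_3", 72.3, 0.4),
]

def prob_any(rows, predicate):
    """P(at least one qualifying tuple exists), assuming tuple independence."""
    p_none = 1.0
    for _, temp, p in rows:
        if predicate(temp):
            p_none *= (1.0 - p)
    return 1.0 - p_none

# Probability that some reading exceeds 70 degrees:
# sensor_1 (p=0.9) and sensor_3 (p=0.4) qualify,
# so P = 1 - (1 - 0.9) * (1 - 0.4) = 0.94.
print(round(prob_any(readings, lambda t: t > 70.0), 2))
```

A decision layer of the kind the tutorial surveys would consume such probabilities, for example acting only when the query probability clears a threshold.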

Cited Authors

  • Babu, S; Guha, S; Munagala, K

Published Date

  • January 1, 2009

Published In

  • SIGMOD/PODS '09: Proceedings of the International Conference on Management of Data and the 28th Symposium on Principles of Database Systems

Start / End Page

  • 995 - 998

International Standard Book Number 13 (ISBN-13)

  • 9781605585543

Digital Object Identifier (DOI)

  • 10.1145/1559845.1559964

Citation Source

  • Scopus