Perturbation analysis of database queries
We present a system, Perada, for parallel perturbation analysis of database queries. Perturbation analysis considers the results of a query evaluated with (a typically large number of) different parameter settings, to help discover leads and evaluate claims from data. Perada simplifies the development of general, ad hoc perturbation analysis by providing a flexible API to support a variety of optimizations such as grouping, memoization, and pruning; by automatically optimizing performance through run-time observation, learning, and adaptation; and by hiding the complexity of concurrency and failures from its developers. We demonstrate Perada's efficacy and efficiency with real workloads applying perturbation analysis to computational journalism.
Duke Scholars
Altmetric Attention Stats
Dimensions Citation Stats
Published In
DOI
EISSN
Publication Date
Volume
Issue
Start / End Page
Related Subject Headings
- 4605 Data management and data science
- 0807 Library and Information Studies
- 0806 Information Systems
- 0802 Computation Theory and Mathematics
Citation
Published In
DOI
EISSN
Publication Date
Volume
Issue
Start / End Page
Related Subject Headings
- 4605 Data management and data science
- 0807 Library and Information Studies
- 0806 Information Systems
- 0802 Computation Theory and Mathematics