
DPXPlain: Privately Explaining Aggregate Query Answers

Publication, Conference
Tao, Y; Gilad, A; Machanavajjhala, A; Roy, S
Published in: Proceedings of the VLDB Endowment
January 1, 2022

Differential privacy (DP) is the state-of-the-art, rigorous notion of privacy for answering aggregate database queries while protecting sensitive information in the data. However, DP makes it harder for users to understand the trends and anomalies they observe in query results: is an unexpected answer due to the data itself, or to the noise that must be added to preserve DP? In the second case, even the users' observation about the query results may be wrong. In the first case, can we still mine interesting explanations from the sensitive data while protecting its privacy? To address these challenges, we present DPXPlain, a three-phase framework that is, to the best of our knowledge, the first system for explaining group-by aggregate query answers with DP. In its three phases, DPXPlain (a) answers a group-by aggregate query with DP, (b) lets users compare the aggregate values of two groups and assesses, with high probability, whether this comparison holds or is flipped by the DP noise, and (c) provides an explanation table containing the approximately ‘top-k’ explanation predicates along with their relative influences and ranks in the form of confidence intervals, while guaranteeing DP in all steps. An extensive experimental analysis of DPXPlain with multiple use cases on real and synthetic data shows that it efficiently provides insightful explanations with good accuracy and utility.
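
To make phases (a) and (b) of the abstract concrete, here is a minimal Python sketch. It is not the DPXPlain implementation: it assumes a COUNT query, the standard Laplace mechanism, a publicly known list of groups, and a conservative union-bound confidence interval. All function and parameter names (`noisy_group_counts`, `difference_interval`, `comparison_holds`, `epsilon`, `confidence`) are hypothetical and chosen only for illustration. Phase (c), the explanation table, is not covered.

```python
import math
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_group_counts(rows, group_key, groups, epsilon, rng):
    """Phase (a), sketched: group-by COUNT released with the Laplace mechanism.

    Assumptions (not taken from the paper): `groups` is a public domain of
    group values, and neighboring databases differ by one row, so each count
    has sensitivity 1 and disjoint groups can each use the full epsilon.
    """
    true_counts = Counter(row[group_key] for row in rows)
    scale = 1.0 / epsilon
    return {g: true_counts[g] + laplace_noise(scale, rng) for g in groups}


def difference_interval(noisy_a, noisy_b, epsilon, confidence=0.95):
    """Phase (b), sketched: interval for the true difference of two counts.

    Uses P(|Laplace(b)| > x) = exp(-x / b) and a union bound over the two
    noise terms, which gives margin = 2 * b * ln(2 / beta). This is a
    conservative bound, not necessarily the interval used by DPXPlain.
    It only post-processes released values, so it costs no extra privacy.
    """
    scale = 1.0 / epsilon
    beta = 1.0 - confidence
    margin = 2.0 * scale * math.log(2.0 / beta)
    diff = noisy_a - noisy_b
    return diff - margin, diff + margin


def comparison_holds(interval):
    """The observed ordering is trusted only if the interval excludes zero."""
    low, high = interval
    return low > 0 or high < 0


if __name__ == "__main__":
    rng = random.Random(7)
    rows = [{"dept": "A"}] * 120 + [{"dept": "B"}] * 80
    counts = noisy_group_counts(rows, "dept", ["A", "B"], epsilon=1.0, rng=rng)
    interval = difference_interval(counts["A"], counts["B"], epsilon=1.0)
    print(counts, interval, comparison_holds(interval))
```

With a small `epsilon` the margin grows, so a gap between two groups that looks clear in the noisy answer may no longer exclude zero. That is the situation the abstract describes as a comparison possibly flipped by DP noise.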

Published In

Proceedings of the VLDB Endowment

DOI

10.14778/3561261.3561271

EISSN

2150-8097

Publication Date

January 1, 2022

Volume

16

Issue

1

Start / End Page

113 / 126

Related Subject Headings

  • 4605 Data management and data science
  • 0807 Library and Information Studies
  • 0806 Information Systems
  • 0802 Computation Theory and Mathematics
 

Citation

APA: Tao, Y., Gilad, A., Machanavajjhala, A., & Roy, S. (2022). DPXPlain: Privately Explaining Aggregate Query Answers. In Proceedings of the VLDB Endowment (Vol. 16, pp. 113–126). https://doi.org/10.14778/3561261.3561271
Chicago: Tao, Y., A. Gilad, A. Machanavajjhala, and S. Roy. “DPXPlain: Privately Explaining Aggregate Query Answers.” In Proceedings of the VLDB Endowment, 16:113–26, 2022. https://doi.org/10.14778/3561261.3561271.
ICMJE: Tao Y, Gilad A, Machanavajjhala A, Roy S. DPXPlain: Privately Explaining Aggregate Query Answers. In: Proceedings of the VLDB Endowment. 2022. p. 113–26.
MLA: Tao, Y., et al. “DPXPlain: Privately Explaining Aggregate Query Answers.” Proceedings of the VLDB Endowment, vol. 16, no. 1, 2022, pp. 113–26. Scopus, doi:10.14778/3561261.3561271.
NLM: Tao Y, Gilad A, Machanavajjhala A, Roy S. DPXPlain: Privately Explaining Aggregate Query Answers. Proceedings of the VLDB Endowment. 2022. p. 113–126.
