DP-PQD: Privately Detecting Per-Query Gaps In Synthetic Data Generated By Black-Box Mechanisms

Publication, Journal Article
Patwa, S; Sun, D; Gilad, A; Machanavajjhala, A; Roy, S
Published in: Proceedings of the VLDB Endowment
January 1, 2023

Synthetic data generation methods, and in particular, private synthetic data generation methods, are gaining popularity as a means to make copies of sensitive databases that can be shared widely for research and data analysis. Some of the fundamental operations in data analysis include analyzing aggregated statistics, e.g., count, sum, or median, on a subset of data satisfying some conditions. When synthetic data is generated, users may be interested in knowing if their aggregated queries generating such statistics can be reliably answered on the synthetic data, for instance, to decide if the synthetic data is suitable for specific tasks. However, the standard data generation systems do not provide “per-query” quality guarantees on the synthetic data, and the users have no way of knowing how much the aggregated statistics on the synthetic data can be trusted. To address this problem, we present a novel framework named DP-PQD (differentially-private per-query decider) to detect if the query answers on the private and synthetic datasets are within a user-specified threshold of each other while guaranteeing differential privacy. We give a suite of private algorithms for per-query deciders for count, sum, and median queries, analyze their properties, and evaluate them experimentally.
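
The per-query decision described in the abstract can be illustrated for a count query with a minimal sketch. This is not the authors' algorithm. It assumes the synthetic dataset is treated as fixed and public, that neighboring private datasets differ in one row (so the count gap has sensitivity 1), and that a plain Laplace mechanism is acceptable. The names `count_query` and `dp_count_decider` and the parameters `tau` and `epsilon` are hypothetical and chosen only for illustration.

```python
import random


def count_query(rows, predicate):
    """Count the rows that satisfy the predicate."""
    return sum(1 for row in rows if predicate(row))


def dp_count_decider(private_rows, synthetic_rows, predicate, tau, epsilon, rng=None):
    """Return True if the noisy count gap is within the threshold tau.

    Illustrative assumption: the gap |q(D) - q(D_s)| changes by at most 1
    when one private row is added or removed, so Laplace noise with
    scale 1/epsilon gives epsilon-differential privacy for the decision.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    gap = abs(count_query(private_rows, predicate)
              - count_query(synthetic_rows, predicate))
    # The difference of two i.i.d. Exp(rate=epsilon) draws is Laplace(0, 1/epsilon).
    # Floating-point samplers like this one are not hardened for production use.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return (gap + noise) <= tau


private_rows = [{"age": a} for a in (25, 31, 47, 52, 68)]
synthetic_rows = [{"age": a} for a in (22, 29, 35, 41, 44)]
is_senior = lambda row: row["age"] >= 45

# True means "the synthetic answer appears to be within tau of the private answer".
decision = dp_count_decider(private_rows, synthetic_rows, is_senior,
                            tau=5, epsilon=1.0)
```

Because the output is a single noisy bit, a smaller `epsilon` makes the decision less reliable when the true gap is close to `tau`; sum and median queries would need different sensitivity analyses than the one assumed here.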

Published In

Proceedings of the VLDB Endowment

DOI

10.14778/3617838.3617844

EISSN

2150-8097

Publication Date

January 1, 2023

Volume

17

Issue

1

Start / End Page

65 / 78

Related Subject Headings

  • 4605 Data management and data science
  • 0807 Library and Information Studies
  • 0806 Information Systems
  • 0802 Computation Theory and Mathematics
 

Citation

APA
Patwa, S., Sun, D., Gilad, A., Machanavajjhala, A., & Roy, S. (2023). DP-PQD: Privately Detecting Per-Query Gaps In Synthetic Data Generated By Black-Box Mechanisms. Proceedings of the VLDB Endowment, 17(1), 65–78. https://doi.org/10.14778/3617838.3617844

Chicago
Patwa, S., D. Sun, A. Gilad, A. Machanavajjhala, and S. Roy. “DP-PQD: Privately Detecting Per-Query Gaps In Synthetic Data Generated By Black-Box Mechanisms.” Proceedings of the VLDB Endowment 17, no. 1 (January 1, 2023): 65–78. https://doi.org/10.14778/3617838.3617844.

ICMJE
Patwa S, Sun D, Gilad A, Machanavajjhala A, Roy S. DP-PQD: Privately Detecting Per-Query Gaps In Synthetic Data Generated By Black-Box Mechanisms. Proceedings of the VLDB Endowment. 2023 Jan 1;17(1):65–78.

MLA
Patwa, S., et al. “DP-PQD: Privately Detecting Per-Query Gaps In Synthetic Data Generated By Black-Box Mechanisms.” Proceedings of the VLDB Endowment, vol. 17, no. 1, Jan. 2023, pp. 65–78. Scopus, doi:10.14778/3617838.3617844.

NLM
Patwa S, Sun D, Gilad A, Machanavajjhala A, Roy S. DP-PQD: Privately Detecting Per-Query Gaps In Synthetic Data Generated By Black-Box Mechanisms. Proceedings of the VLDB Endowment. 2023 Jan 1;17(1):65–78.
