Scholars@Duke publication: PreFair: Privately Generating Justifiably Fair Synthetic Data

Publication, Conference

Pujol, D; Gilad, A; Machanavajjhala, A

Published in: Proceedings of the VLDB Endowment

January 1, 2023

When a database is protected by Differential Privacy (DP), its usability is limited in scope. In this scenario, generating a synthetic version of the data that mimics the properties of the private data allows users to perform any operation on the synthetic data, while maintaining the privacy of the original data. Therefore, multiple works have been devoted to devising systems for DP synthetic data generation. However, such systems may preserve or even magnify properties of the data that make it unfair, rendering the synthetic data unfit for use. In this work, we present PreFair, a system that allows for DP fair synthetic data generation. PreFair extends the state-of-the-art DP data generation mechanisms by incorporating a causal fairness criterion that ensures fair synthetic data. We adapt the notion of justifiable fairness to fit the synthetic data generation scenario. We further study the problem of generating DP fair synthetic data, showing its intractability and designing algorithms that are optimal under certain assumptions. We also provide an extensive experimental evaluation, showing that PreFair generates synthetic data that is significantly fairer than the data generated by leading DP data generation mechanisms, while remaining faithful to the private data.

Duke Scholars

Author: Ashwinkumar Venkatanaga Machanavajjhala, Computer Science

Published In

Proceedings of the VLDB Endowment

DOI

10.14778/3583140.3583168

EISSN

2150-8097

Publication Date

January 1, 2023

Volume

16

Issue

6

Start / End Page

1573 / 1586

Related Subject Headings

4605 Data management and data science
0807 Library and Information Studies
0806 Information Systems
0802 Computation Theory and Mathematics

Citation

APA

Pujol, D., Gilad, A., & Machanavajjhala, A. (2023). PreFair: Privately Generating Justifiably Fair Synthetic Data. In Proceedings of the VLDB Endowment (Vol. 16, pp. 1573–1586). https://doi.org/10.14778/3583140.3583168

Chicago

Pujol, D., A. Gilad, and A. Machanavajjhala. “PreFair: Privately Generating Justifiably Fair Synthetic Data.” In Proceedings of the VLDB Endowment, 16:1573–86, 2023. https://doi.org/10.14778/3583140.3583168.

ICMJE

Pujol D, Gilad A, Machanavajjhala A. PreFair: Privately Generating Justifiably Fair Synthetic Data. In: Proceedings of the VLDB Endowment. 2023. p. 1573–86.

MLA

Pujol, D., et al. “PreFair: Privately Generating Justifiably Fair Synthetic Data.” Proceedings of the VLDB Endowment, vol. 16, no. 6, 2023, pp. 1573–86. Scopus, doi:10.14778/3583140.3583168.

NLM

Pujol D, Gilad A, Machanavajjhala A. PreFair: Privately Generating Justifiably Fair Synthetic Data. Proceedings of the VLDB Endowment. 2023. p. 1573–1586.
