Differentially Private Significance Tests for Regression Coefficients
Many data producers seek to provide users access to confidential data without unduly compromising data subjects’ privacy and confidentiality. One general strategy is to require users to do analyses without seeing the confidential data; for example, analysts only get access to synthetic data or query systems that provide disclosure-protected outputs of statistical models. With synthetic data or redacted outputs, the analyst never really knows how much to trust the resulting findings. In particular, if the user did the same analysis on the confidential data, would regression coefficients of interest be statistically significant or not? We present algorithms for assessing this question that satisfy differential privacy. We describe conditions under which the algorithms should give accurate answers about statistical significance. We illustrate the properties of the proposed methods using artificial and genuine data. Supplementary materials for this article are available online.
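As a rough illustration of how a significance question can be answered under differential privacy, the sketch below splits the data into disjoint subsets, checks in each subset whether the coefficient's t-statistic exceeds a cutoff, and releases the fraction of "significant" subsets after adding Laplace noise to the count. This is a generic subsample-and-vote construction and is not claimed to be the algorithm of the article. The function names, the 1.96 cutoff, and the assumption that neighboring datasets differ in the values of one record are all choices made for this example. The released fraction also describes significance at the smaller subset sample size, not at the full sample size.

```python
import numpy as np

def ols_t_stat(X, y, j):
    """t-statistic of coefficient j from an OLS fit of y on X.

    X is assumed to already contain an intercept column if one is wanted.
    """
    n, p = X.shape
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)          # unbiased residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)     # estimated covariance of beta
    return beta[j] / np.sqrt(cov[j, j])

def dp_significant_fraction(X, y, j, n_subsets, epsilon, threshold=1.96, rng=None):
    """Noisy fraction of disjoint subsets in which |t| > threshold.

    Hypothetical helper for illustration only. The partition is drawn
    independently of the data values, so changing one record changes at most
    one subset's vote, and the count has sensitivity 1.
    """
    rng = np.random.default_rng() if rng is None else rng
    parts = np.array_split(rng.permutation(len(y)), n_subsets)
    votes = [abs(ols_t_stat(X[part], y[part], j)) > threshold for part in parts]
    # Laplace mechanism on the count: scale = sensitivity / epsilon = 1 / epsilon.
    noisy_count = sum(votes) + rng.laplace(scale=1.0 / epsilon)
    # Clipping is post-processing and does not affect the privacy guarantee.
    return float(np.clip(noisy_count / n_subsets, 0.0, 1.0))
```

A value near 1 suggests the coefficient is reliably significant even in small subsets, while a value near 0 suggests it is not. Intermediate values are harder to interpret and depend on the number of subsets and on epsilon.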
Related Subject Headings
- Statistics & Probability
- 4905 Statistics
- 1403 Econometrics
- 0104 Statistics