Utility cost of formal privacy for releasing national employer-employee statistics

Publication, Conference
Haney, S; Graham, M; Machanavajjhala, A; Kutzbach, M; Abowd, JM; Vilhuber, L
Published in: Proceedings of the ACM SIGMOD International Conference on Management of Data
May 9, 2017

National statistical agencies around the world publish tabular summaries based on combined employer-employee (ER-EE) data. The privacy of both the individuals and the business establishments that feature in these data is protected by law in most countries. These data are currently released using a variety of statistical disclosure limitation (SDL) techniques that do not reveal the exact characteristics of particular employers and employees, but lack provable privacy guarantees limiting inferential disclosures. In this work, we present novel algorithms for releasing tabular summaries of linked ER-EE data with formal, provable guarantees of privacy. We show that state-of-the-art differentially private algorithms add too much noise for the output to be useful. Instead, we identify the privacy requirements mandated by current interpretations of the relevant laws, and formalize them using the Pufferfish framework. We then develop new privacy definitions that are customized to ER-EE data and satisfy the statutory privacy requirements. We implement the experiments in this paper on production data gathered by the U.S. Census Bureau. An empirical evaluation of utility for these data shows that for reasonable values of the privacy-loss parameter ϵ ≥ 1, the additive error introduced by our provably private algorithms is comparable to, and in some cases better than, the error introduced by existing SDL techniques that have no provable privacy guarantees. For some complex queries currently published, however, our algorithms do not have utility comparable to the existing traditional SDL algorithms. Those queries are fodder for future research.
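
To make the link between the privacy-loss parameter ϵ and additive error concrete, here is a minimal sketch of the textbook Laplace mechanism applied to a single count. It is not the algorithm developed in the paper, and it does not implement the Pufferfish-based definitions mentioned above. The function names (`laplace_noise`, `noisy_count`, `expected_abs_error`) and the default `sensitivity=1.0` are assumptions made for illustration; in linked ER-EE data, the appropriate sensitivity depends on what is being protected.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two i.i.d. exponentials with rate 1/scale is Laplace(0, scale).
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Release a count with Laplace noise of scale sensitivity / epsilon.

    Assumption (illustrative): changing one protected record alters the
    count by at most `sensitivity`.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def expected_abs_error(epsilon, sensitivity=1.0):
    """Expected absolute additive error of the Laplace mechanism (equals its scale)."""
    return sensitivity / epsilon


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs draw the same noise
    true_count = 1000       # hypothetical cell count, not taken from the paper
    for eps in (0.1, 1.0, 4.0):
        errors = [abs(noisy_count(true_count, eps, rng=rng) - true_count)
                  for _ in range(20000)]
        print(eps, sum(errors) / len(errors), expected_abs_error(eps))
```

In this sketch the expected absolute error is `sensitivity / epsilon`, so smaller values of ϵ (stronger privacy) mean proportionally larger additive error. That trade-off is what the abstract's utility comparison at ϵ ≥ 1 is measuring.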

Published In

Proceedings of the ACM SIGMOD International Conference on Management of Data

DOI

10.1145/3035918.3035940

ISSN

0730-8078

Publication Date

May 9, 2017

Volume

Part F127746

Start / End Page

1339 / 1354
 

Citation

APA
Haney, S., Graham, M., Machanavajjhala, A., Kutzbach, M., Abowd, J. M., & Vilhuber, L. (2017). Utility cost of formal privacy for releasing national employer-employee statistics. In Proceedings of the ACM SIGMOD International Conference on Management of Data (Vol. Part F127746, pp. 1339–1354). https://doi.org/10.1145/3035918.3035940

Chicago
Haney, S., M. Graham, A. Machanavajjhala, M. Kutzbach, J. M. Abowd, and L. Vilhuber. “Utility cost of formal privacy for releasing national employer-employee statistics.” In Proceedings of the ACM SIGMOD International Conference on Management of Data, Part F127746:1339–54, 2017. https://doi.org/10.1145/3035918.3035940.

ICMJE
Haney S, Graham M, Machanavajjhala A, Kutzbach M, Abowd JM, Vilhuber L. Utility cost of formal privacy for releasing national employer-employee statistics. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. 2017. p. 1339–54.

MLA
Haney, S., et al. “Utility cost of formal privacy for releasing national employer-employee statistics.” Proceedings of the ACM SIGMOD International Conference on Management of Data, vol. Part F127746, 2017, pp. 1339–54. Scopus, doi:10.1145/3035918.3035940.

NLM
Haney S, Graham M, Machanavajjhala A, Kutzbach M, Abowd JM, Vilhuber L. Utility cost of formal privacy for releasing national employer-employee statistics. Proceedings of the ACM SIGMOD International Conference on Management of Data. 2017. p. 1339–1354.
