Accounting for intruder uncertainty due to sampling when estimating identification disclosure risks in partially synthetic data

Publication, Journal Article
Drechsler, J; Reiter, JP
Published in: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
January 1, 2008

Partially synthetic data comprise the units originally surveyed with some collected values, such as sensitive values at high risk of disclosure or values of key identifiers, replaced with multiple draws from statistical models. Because the original records remain on the file, intruders may be able to link those records to external databases, even though values are synthesized. We illustrate how statistical agencies can evaluate the risks of identification disclosures before releasing such data. We compute risk measures when intruders know who is in the sample and when the intruders do not know who is in the sample. We use classification and regression trees to synthesize data from the U.S. Current Population Survey. © 2008 Springer-Verlag Berlin Heidelberg.

Published In

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

DOI

10.1007/978-3-540-87471-3_19

EISSN

1611-3349

ISSN

0302-9743

Publication Date

January 1, 2008

Volume

5262 LNCS

Start / End Page

227 / 238

Related Subject Headings

  • Artificial Intelligence & Image Processing
  • 46 Information and computing sciences

Citation

APA
Drechsler, J., & Reiter, J. P. (2008). Accounting for intruder uncertainty due to sampling when estimating identification disclosure risks in partially synthetic data. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 5262 LNCS, 227–238. https://doi.org/10.1007/978-3-540-87471-3_19

Chicago
Drechsler, J., and J. P. Reiter. “Accounting for intruder uncertainty due to sampling when estimating identification disclosure risks in partially synthetic data.” Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 5262 LNCS (January 1, 2008): 227–38. https://doi.org/10.1007/978-3-540-87471-3_19.

ICMJE
Drechsler J, Reiter JP. Accounting for intruder uncertainty due to sampling when estimating identification disclosure risks in partially synthetic data. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2008 Jan 1;5262 LNCS:227–38.

MLA
Drechsler, J., and J. P. Reiter. “Accounting for intruder uncertainty due to sampling when estimating identification disclosure risks in partially synthetic data.” Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5262 LNCS, Jan. 2008, pp. 227–38. Scopus, doi:10.1007/978-3-540-87471-3_19.

NLM
Drechsler J, Reiter JP. Accounting for intruder uncertainty due to sampling when estimating identification disclosure risks in partially synthetic data. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2008 Jan 1;5262 LNCS:227–238.
