Scholars@Duke publication: Using Statistics to Protect Privacy

Privacy, Big Data, and the Public Good: Frameworks for Engagement

Using Statistics to Protect Privacy

Publication, Chapter

Karr, AF; Reiter, JP

January 1, 2013

Those who generate data (for example, official statistics agencies, survey organizations, and principal investigators, henceforth all called agencies) have a long history of providing access to their data to researchers, policy analysts, decision makers, and the general public. At the same time, these agencies are obligated ethically and often legally to protect the confidentiality of data subjects' identities and sensitive attributes. Simply stripping names, exact addresses, and other direct identifiers typically does not suffice to protect confidentiality. When the released data include variables that are readily available in external files, such as demographic characteristics or employment histories, ill-intentioned users (henceforth called intruders) may be able to link records in the released data to records in external files, thereby compromising the agency's promise of confidentiality to those who provided the data. In response to this threat, agencies have developed an impressive variety of strategies for reducing the risks of unintended disclosures, ranging from restricting data access to altering data before release. Strategies that fall into the latter category are known as statistical disclosure limitation (SDL) techniques. Most SDL techniques have been developed for data derived from probability surveys or censuses. Even in complete form, these data would not typically be thought of as big data, with respect to scale (numbers of cases and attributes), complexity of attribute types, or structure: most datasets are released, if not actually structured, as flat files.
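The linkage threat described in the abstract can be illustrated with a minimal sketch. It is not taken from the chapter: the field names (`zip`, `birth_year`, `sex`, `income`, `name`), the toy records, and the helper `link_records` are all hypothetical, and exact matching on quasi-identifiers is only one simple way an intruder might attempt linkage.

```python
from collections import defaultdict

# Hypothetical quasi-identifiers: variables assumed to appear in both the
# released file and an external file available to the intruder.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link_records(released, external, keys=QUASI_IDENTIFIERS):
    """Return (name, released_record) pairs for quasi-identifier
    combinations that occur exactly once in each file."""

    def index(rows):
        groups = defaultdict(list)
        for row in rows:
            groups[tuple(row[k] for k in keys)].append(row)
        return groups

    rel, ext = index(released), index(external)
    matches = []
    for key, rows in rel.items():
        # A unique match in both files re-identifies the released record.
        if len(rows) == 1 and len(ext.get(key, [])) == 1:
            matches.append((ext[key][0]["name"], rows[0]))
    return matches


# Toy released data: direct identifiers stripped, sensitive income kept.
released = [
    {"zip": "27708", "birth_year": 1970, "sex": "F", "income": 82000},
    {"zip": "27708", "birth_year": 1985, "sex": "M", "income": 41000},
    {"zip": "27708", "birth_year": 1985, "sex": "M", "income": 56000},
]

# Toy external file with names and the same quasi-identifiers.
external = [
    {"name": "Person A", "zip": "27708", "birth_year": 1970, "sex": "F"},
    {"name": "Person B", "zip": "27708", "birth_year": 1985, "sex": "M"},
]

linked = link_records(released, external)
for name, record in linked:
    print(name, "->", record["income"])
```

In this toy example only the first released record is linked, because its quasi-identifier combination is unique in both files; the other two released records share a combination, so an exact match cannot tell them apart. Altering data before release aims, among other things, to reduce the number of such unique matches.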

Duke Scholars

Author: Jerome P. Reiter, Statistical Science

DOI

10.1017/CBO9781107590205.017

Publication Date

January 1, 2013

Start / End Page

276 / 295

Citation

APA: Karr, A. F., & Reiter, J. P. (2013). Using Statistics to Protect Privacy. In Privacy, Big Data, and the Public Good: Frameworks for Engagement (pp. 276–295). https://doi.org/10.1017/CBO9781107590205.017

Chicago: Karr, A. F., and J. P. Reiter. “Using Statistics to Protect Privacy.” In Privacy, Big Data, and the Public Good: Frameworks for Engagement, 276–95, 2013. https://doi.org/10.1017/CBO9781107590205.017.

ICMJE: Karr AF, Reiter JP. Using Statistics to Protect Privacy. In: Privacy, Big Data, and the Public Good: Frameworks for Engagement. 2013. p. 276–95.

MLA: Karr, A. F., and J. P. Reiter. “Using Statistics to Protect Privacy.” Privacy, Big Data, and the Public Good: Frameworks for Engagement, 2013, pp. 276–95. Scopus, doi:10.1017/CBO9781107590205.017.

NLM: Karr AF, Reiter JP. Using Statistics to Protect Privacy. Privacy, Big Data, and the Public Good: Frameworks for Engagement. 2013. p. 276–295.
