ℓ-Diversity: Privacy beyond k-anonymity
Publishing data about individuals without revealing sensitive information about them is an important problem. In recent years, a new definition of privacy called k-anonymity has gained popularity. In a k-anonymized dataset, each record is indistinguishable from at least k - 1 other records with respect to certain "identifying" attributes. In this paper we show with two simple attacks that a k-anonymized dataset has some subtle, but severe privacy problems. First, we show that an attacker can discover the values of sensitive attributes when there is little diversity in those sensitive attributes. Second, attackers often have background knowledge, and we show that k-anonymity does not guarantee privacy against attackers using background knowledge. We give a detailed analysis of these two attacks and we propose a novel and powerful privacy definition called ℓ-diversity. In addition to building a formal foundation for ℓ-diversity, we show in an experimental evaluation that ℓ-diversity is practical and can be implemented efficiently. © 2006 IEEE.
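The first attack described above can be illustrated with a minimal Python sketch. It checks k-anonymity as defined in the abstract (every combination of "identifying" attribute values is shared by at least k records) and then counts the distinct sensitive values in each such group. The table, the attribute names (`zip`, `age`, `condition`), and the helper functions are hypothetical and chosen only for illustration. Counting distinct values is one simple reading of "diversity", assumed here because the abstract does not state the formal definition of ℓ-diversity.

```python
from collections import defaultdict


def group_by_identifying(records, identifying):
    """Group records that share the same values on the identifying attributes."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[attr] for attr in identifying)
        groups[key].append(record)
    return groups


def is_k_anonymous(records, identifying, k):
    """True if every group of indistinguishable records has at least k members."""
    groups = group_by_identifying(records, identifying)
    return all(len(group) >= k for group in groups.values())


def min_distinct_sensitive(records, identifying, sensitive):
    """Smallest number of distinct sensitive values found in any group.

    Assumption: "diversity" is measured as a plain count of distinct values.
    """
    groups = group_by_identifying(records, identifying)
    return min(
        (len({r[sensitive] for r in group}) for group in groups.values()),
        default=0,
    )


# Hypothetical, already generalized table (illustrative values only).
records = [
    {"zip": "130**", "age": "<30", "condition": "cancer"},
    {"zip": "130**", "age": "<30", "condition": "cancer"},
    {"zip": "130**", "age": "<30", "condition": "cancer"},
    {"zip": "148**", "age": ">=40", "condition": "flu"},
    {"zip": "148**", "age": ">=40", "condition": "heart disease"},
    {"zip": "148**", "age": ">=40", "condition": "cancer"},
]

identifying = ["zip", "age"]
k_ok = is_k_anonymous(records, identifying, k=3)
diversity = min_distinct_sensitive(records, identifying, "condition")
```

In this hypothetical table the records form two groups of three, so the table is 3-anonymous, yet every record in the first group has the same condition. An attacker who knows that a person falls in that group learns the sensitive value without identifying the person's exact record, which is the lack-of-diversity problem the abstract describes.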