Proximity-aware local-recoding anonymization with MapReduce for scalable big data privacy preservation in cloud

Publication, Journal Article
Zhang, X; Dou, W; Pei, J; Nepal, S; Yang, C; Liu, C; Chen, J
Published in: IEEE Transactions on Computers
August 1, 2015

Cloud computing provides promising, scalable IT infrastructure to support the processing of a variety of big data applications in sectors such as healthcare and business. Data sets in such applications, such as electronic health records, often contain privacy-sensitive information, which raises privacy concerns if the information is released or shared with third parties in the cloud. A practical and widely adopted technique for data privacy preservation is to anonymize data via generalization so that it satisfies a given privacy model. However, most existing privacy-preserving approaches are tailored to small-scale data sets and often fall short when encountering big data, due to their insufficiency or poor scalability. In this paper, we investigate the local-recoding problem for big data anonymization against proximity privacy breaches and attempt to identify a scalable solution to this problem. Specifically, we present a proximity privacy model that allows for semantic proximity of sensitive values and multiple sensitive attributes, and we model the problem of local recoding as a proximity-aware clustering problem. To address this problem, we propose a scalable two-phase clustering approach consisting of a t-ancestors clustering algorithm (similar to k-means) and a proximity-aware agglomerative clustering algorithm. We design the algorithms with MapReduce to gain high scalability by performing data-parallel computation in the cloud. Extensive experiments on real-life data sets demonstrate that our approach significantly improves the capability of defending against proximity privacy breaches, as well as the scalability and time efficiency of local-recoding anonymization, over existing approaches.
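The two-phase idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: quasi-identifiers are numeric tuples and there is a single numeric sensitive value; plain means stand in for the t-ancestors step; the map and reduce steps are ordinary in-process functions rather than MapReduce jobs; and `merge_score`, its `weight` parameter, and all other names are hypothetical choices made only for illustration.

```python
from collections import defaultdict

def map_assign(record, centers):
    """Map step: emit (index of the nearest center, record)."""
    qid = record["qid"]
    dists = [sum((a - b) ** 2 for a, b in zip(qid, c)) for c in centers]
    return dists.index(min(dists)), record

def reduce_center(records):
    """Reduce step: new center = mean of the records' quasi-identifiers."""
    n = len(records)
    dims = len(records[0]["qid"])
    return tuple(sum(r["qid"][d] for r in records) / n for d in range(dims))

def partition(records, centers, rounds=5):
    """Phase 1 (assumed): k-means-like partitioning via map/reduce steps."""
    groups = defaultdict(list)
    for _ in range(rounds):
        groups = defaultdict(list)
        for rec in records:
            key, value = map_assign(rec, centers)
            groups[key].append(value)
        # Keep the old center when a partition received no records.
        centers = [reduce_center(groups[i]) if groups[i] else centers[i]
                   for i in range(len(centers))]
    return [g for g in groups.values() if g]

def merge_score(a, b, weight):
    """Hypothetical score: prefer merges whose quasi-identifiers are close
    and whose combined sensitive values are spread far apart."""
    ca, cb = reduce_center(a), reduce_center(b)
    qid_gap = sum((x - y) ** 2 for x, y in zip(ca, cb)) ** 0.5
    sens = [r["sensitive"] for r in a + b]
    return qid_gap - weight * (max(sens) - min(sens))

def agglomerate(clusters, k, weight=1.0):
    """Phase 2 (assumed): greedily merge the smallest cluster into its
    best-scoring partner until every cluster holds at least k records."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > 1 and min(len(c) for c in clusters) < k:
        i = min(range(len(clusters)), key=lambda j: len(clusters[j]))
        small = clusters.pop(i)
        best = min(clusters, key=lambda c: merge_score(small, c, weight))
        best.extend(small)
    return clusters

def recode(cluster):
    """Local recoding: replace each quasi-identifier by the cluster's
    (min, max) range, leaving the sensitive value unchanged."""
    dims = len(cluster[0]["qid"])
    ranges = tuple((min(r["qid"][d] for r in cluster),
                    max(r["qid"][d] for r in cluster)) for d in range(dims))
    return [{"qid": ranges, "sensitive": r["sensitive"]} for r in cluster]

def anonymize(records, centers, k):
    """Run both phases, then generalize each resulting cluster."""
    out = []
    for part in partition(records, centers):
        for cluster in agglomerate([[r] for r in part], k):
            out.extend(recode(cluster))
    return out
```

In this sketch, each record is a dictionary such as `{"qid": (21,), "sensitive": 10}`, and `anonymize(records, centers, k)` returns the same number of records with each quasi-identifier replaced by a range. A partition that receives fewer than `k` records cannot reach the size threshold here; a fuller treatment would need to merge such partitions across phase-1 boundaries.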

Published In

IEEE Transactions on Computers

DOI

10.1109/TC.2014.2360516

ISSN

0018-9340

Publication Date

August 1, 2015

Volume

64

Issue

8

Start / End Page

2293 / 2307

Related Subject Headings

  • Computer Hardware & Architecture
  • 4606 Distributed computing and systems software
  • 4009 Electronics, sensors and digital hardware
  • 1006 Computer Hardware
  • 0805 Distributed Computing
  • 0803 Computer Software
 

Citation

APA
Zhang, X., Dou, W., Pei, J., Nepal, S., Yang, C., Liu, C., & Chen, J. (2015). Proximity-aware local-recoding anonymization with MapReduce for scalable big data privacy preservation in cloud. IEEE Transactions on Computers, 64(8), 2293–2307. https://doi.org/10.1109/TC.2014.2360516

Chicago
Zhang, X., W. Dou, J. Pei, S. Nepal, C. Yang, C. Liu, and J. Chen. “Proximity-aware local-recoding anonymization with MapReduce for scalable big data privacy preservation in cloud.” IEEE Transactions on Computers 64, no. 8 (August 1, 2015): 2293–2307. https://doi.org/10.1109/TC.2014.2360516.

ICMJE
Zhang X, Dou W, Pei J, Nepal S, Yang C, Liu C, et al. Proximity-aware local-recoding anonymization with MapReduce for scalable big data privacy preservation in cloud. IEEE Transactions on Computers. 2015 Aug 1;64(8):2293–307.

MLA
Zhang, X., et al. “Proximity-aware local-recoding anonymization with MapReduce for scalable big data privacy preservation in cloud.” IEEE Transactions on Computers, vol. 64, no. 8, Aug. 2015, pp. 2293–307. Scopus, doi:10.1109/TC.2014.2360516.

NLM
Zhang X, Dou W, Pei J, Nepal S, Yang C, Liu C, Chen J. Proximity-aware local-recoding anonymization with MapReduce for scalable big data privacy preservation in cloud. IEEE Transactions on Computers. 2015 Aug 1;64(8):2293–2307.
