Scholars@Duke publication: Data-Blind ML: Building privacy-aware machine learning models without direct data access

Publication, Conference

Pastorino, J; Biswas, AK

Published in: Proceedings 2021 IEEE 4th International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2021

January 1, 2021

Traditional Machine Learning (ML) pipeline development requires the ML practitioner to access the data directly in order to analyze, clean, and preprocess it, and then to develop an ML model, train it, and evaluate its performance. When the data owner has no infrastructure for in-house development, such pipelines are outsourced. Data commonly carries privacy constraints that make outsourcing laborious and potentially expensive, requiring, among other things, contract drafting and infrastructure improvements. Traditional approaches rely either on anonymization, which does not entirely protect against identity disclosure, or on synthetic data generation, which requires expertise not necessarily available to the organization. In this paper, we present Data-Blind ML, an automated framework, fueled by synthetic generative learning and distributed computing paradigms, which enables an organization to outsource the development and training of ML models without sharing any sample from the real dataset. In addition, the framework allows the ML practitioner to get feedback on the model's performance against the real data without accessing it directly.
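The workflow the abstract describes can be illustrated with a small sketch. This is not the paper's implementation: the per-class Gaussian sampler, the nearest-centroid classifier, and every function name below (`make_real_data`, `synthesize`, `evaluate_on_real`, `train_nearest_centroid`) are assumptions chosen only to show the division of roles. The data owner generates synthetic samples and scores submitted models against the real data, while the practitioner sees only the synthetic samples and the returned score. A plain Gaussian sampler carries no formal privacy guarantee.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Data owner side: in a real deployment the real data stays in this scope ---
def make_real_data(n=400):
    """Stand-in for the owner's private dataset (two Gaussian classes)."""
    X0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(n // 2, 2))
    X1 = rng.normal(loc=[3.0, 3.0], scale=1.0, size=(n // 2, 2))
    X = np.vstack([X0, X1])
    y = np.array([0] * (n // 2) + [1] * (n // 2))
    return X, y

def synthesize(X, y, n_per_class=200):
    """Hypothetical generator: fit one Gaussian per class and sample from it."""
    Xs, ys = [], []
    for label in np.unique(y):
        Xc = X[y == label]
        mean, cov = Xc.mean(axis=0), np.cov(Xc, rowvar=False)
        Xs.append(rng.multivariate_normal(mean, cov, size=n_per_class))
        ys.append(np.full(n_per_class, label))
    return np.vstack(Xs), np.concatenate(ys)

def evaluate_on_real(predict, X_real, y_real):
    """Owner runs the submitted model and returns only an aggregate accuracy."""
    return float(np.mean(predict(X_real) == y_real))

# --- Practitioner side: sees synthetic samples and the returned score only ---
def train_nearest_centroid(X, y):
    """Train a nearest-centroid classifier and return its predict function."""
    labels = np.unique(y)
    centroids = np.stack([X[y == label].mean(axis=0) for label in labels])

    def predict(X_new):
        # Distance from every sample to every class centroid.
        d = np.linalg.norm(X_new[:, None, :] - centroids[None, :, :], axis=2)
        return labels[np.argmin(d, axis=1)]

    return predict

X_real, y_real = make_real_data()
X_syn, y_syn = synthesize(X_real, y_real)      # owner releases synthetic data
model = train_nearest_centroid(X_syn, y_syn)   # practitioner trains on it
accuracy = evaluate_on_real(model, X_real, y_real)  # owner returns the score
print(f"accuracy on real data: {accuracy:.2f}")
```

In this sketch the only values crossing from owner to practitioner are `X_syn`, `y_syn`, and `accuracy`, which mirrors the abstract's claim that no real sample is shared while feedback on real-data performance is still available.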

Duke Scholars

Author Javier Pastorino Pierre R. Lamond Department of Electrical and Computer Engin ...

Published In

Proceedings 2021 IEEE 4th International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2021

DOI

10.1109/AIKE52691.2021.00020

Publication Date

January 1, 2021

Start / End Page

95 / 98

Citation

APA

Pastorino, J., & Biswas, A. K. (2021). Data-Blind ML: Building privacy-aware machine learning models without direct data access. In Proceedings 2021 IEEE 4th International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2021 (pp. 95–98). https://doi.org/10.1109/AIKE52691.2021.00020

Chicago

Pastorino, J., and A. K. Biswas. “Data-Blind ML: Building privacy-aware machine learning models without direct data access.” In Proceedings 2021 IEEE 4th International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2021, 95–98, 2021. https://doi.org/10.1109/AIKE52691.2021.00020.

ICMJE

Pastorino J, Biswas AK. Data-Blind ML: Building privacy-aware machine learning models without direct data access. In: Proceedings 2021 IEEE 4th International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2021. 2021. p. 95–8.

MLA

Pastorino, J., and A. K. Biswas. “Data-Blind ML: Building privacy-aware machine learning models without direct data access.” Proceedings 2021 IEEE 4th International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2021, 2021, pp. 95–98. Scopus, doi:10.1109/AIKE52691.2021.00020.

NLM

Pastorino J, Biswas AK. Data-Blind ML: Building privacy-aware machine learning models without direct data access. Proceedings 2021 IEEE 4th International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2021. 2021. p. 95–98.
