Scholars@Duke publication: Post-breach Recovery: Protection against White-box Adversarial Examples for Leaked DNN Models

Publication, Conference

Shan, S; Ding, W; Wenger, E; Zheng, H; Zhao, BY

Published in: Proceedings of the ACM Conference on Computer and Communications Security

November 7, 2022

Server breaches are an unfortunate reality on today's Internet. In the context of deep neural network (DNN) models, they are particularly harmful, because a leaked model gives an attacker "white-box" access to generate adversarial examples, a threat model that has no practical robust defenses. For practitioners who have invested years and millions into proprietary DNNs, e.g., for medical imaging, this seems like an inevitable disaster looming on the horizon. In this paper, we consider the problem of post-breach recovery for DNN models. We propose Neo, a new system that creates new versions of leaked models, alongside an inference-time filter that detects and removes adversarial examples generated on previously leaked models. The classification surfaces of different model versions are slightly offset (by introducing hidden distributions), and Neo detects the overfitting of attacks to the leaked model used in their generation. We show that across a variety of tasks and attack methods, Neo is able to filter out attacks from leaked models with very high accuracy, and provides strong protection (7-10 recoveries) against attackers who repeatedly breach the server. Neo performs well against a variety of strong adaptive attacks, with only a slight drop in the number of breaches recoverable, and demonstrates potential as a complement to DNN defenses in the wild.

Duke Scholars

Emily Wenger (Author), Electrical and Computer Engineering

Published In

Proceedings of the ACM Conference on Computer and Communications Security

DOI

10.1145/3548606.3560561

ISSN

1543-7221

Publication Date

November 7, 2022

Start / End Page

2611 / 2625

Citation

APA: Shan, S., Ding, W., Wenger, E., Zheng, H., & Zhao, B. Y. (2022). Post-breach Recovery: Protection against White-box Adversarial Examples for Leaked DNN Models. In Proceedings of the ACM Conference on Computer and Communications Security (pp. 2611–2625). https://doi.org/10.1145/3548606.3560561

Chicago: Shan, S., W. Ding, E. Wenger, H. Zheng, and B. Y. Zhao. “Post-breach Recovery: Protection against White-box Adversarial Examples for Leaked DNN Models.” In Proceedings of the ACM Conference on Computer and Communications Security, 2611–25, 2022. https://doi.org/10.1145/3548606.3560561.

ICMJE: Shan S, Ding W, Wenger E, Zheng H, Zhao BY. Post-breach Recovery: Protection against White-box Adversarial Examples for Leaked DNN Models. In: Proceedings of the ACM Conference on Computer and Communications Security. 2022. p. 2611–25.

MLA: Shan, S., et al. “Post-breach Recovery: Protection against White-box Adversarial Examples for Leaked DNN Models.” Proceedings of the ACM Conference on Computer and Communications Security, 2022, pp. 2611–25. Scopus, doi:10.1145/3548606.3560561.

NLM: Shan S, Ding W, Wenger E, Zheng H, Zhao BY. Post-breach Recovery: Protection against White-box Adversarial Examples for Leaked DNN Models. Proceedings of the ACM Conference on Computer and Communications Security. 2022. p. 2611–2625.
