
Perturbing across the feature hierarchy to improve standard and strict blackbox attack transferability

Publication, Conference

Inkawhich, N; Liang, KJ; Wang, B; Inkawhich, M; Carin, L; Chen, Y

Published in: Advances in Neural Information Processing Systems

January 1, 2020

We consider the blackbox transfer-based targeted adversarial attack threat model in the realm of deep neural network (DNN) image classifiers. Rather than focusing on crossing decision boundaries at the output layer of the source model, our method perturbs representations throughout the extracted feature hierarchy to resemble other classes. We design a flexible attack framework that allows for multilayer perturbations and demonstrates state-of-the-art targeted transfer performance between ImageNet DNNs. We also show the superiority of our feature space methods under a relaxation of the common assumption that the source and target models are trained on the same dataset and label space, in some instances achieving a 10× increase in targeted success rate relative to other blackbox transfer methods. Finally, we analyze why the proposed methods outperform existing attack strategies and show an extension of the method in the case when limited queries to the blackbox model are allowed.
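To make the idea of perturbing several layers at once concrete, the following is a minimal sketch and not the authors' implementation. This record does not give the paper's actual objective, so the sketch assumes a stand-in: a toy ReLU network in NumPy, a squared Euclidean distance between the source input's features and the features of one target-class example at every layer, and signed gradient steps under an L-infinity budget. All names (`forward`, `feature_loss`, `multilayer_feature_attack`, `layer_weights`, `eps`, `step`) are hypothetical.

```python
import numpy as np

def forward(x, weights):
    """Return (pre-activations, ReLU activations) for every layer of a toy MLP."""
    pre, feats, h = [], [], x
    for W in weights:
        z = W @ h
        h = np.maximum(z, 0.0)
        pre.append(z)
        feats.append(h)
    return pre, feats

def feature_loss(x, weights, targets, layer_weights):
    """Weighted sum over layers of 0.5 * ||features(x) - target features||^2."""
    _, feats = forward(x, weights)
    return sum(lw * 0.5 * np.sum((f - t) ** 2)
               for lw, f, t in zip(layer_weights, feats, targets))

def feature_loss_grad(x, weights, targets, layer_weights):
    """Gradient of feature_loss with respect to x, by manual backpropagation."""
    pre, feats = forward(x, weights)
    grad_h = np.zeros_like(feats[-1])
    for i in reversed(range(len(weights))):
        # Direct contribution of layer i's feature term (assumed objective).
        grad_h = grad_h + layer_weights[i] * (feats[i] - targets[i])
        grad_z = grad_h * (pre[i] > 0)   # ReLU derivative
        grad_h = weights[i].T @ grad_z   # propagate to the layer's input
    return grad_h

def multilayer_feature_attack(x, weights, targets, layer_weights,
                              eps=0.1, step=0.01, iters=50):
    """Signed-gradient descent on the multilayer feature loss within an
    L-infinity ball of radius eps around x; returns the best iterate seen."""
    best_x = x.copy()
    best_loss = feature_loss(x, weights, targets, layer_weights)
    x_adv = x.copy()
    for _ in range(iters):
        g = feature_loss_grad(x_adv, weights, targets, layer_weights)
        x_adv = np.clip(x_adv - step * np.sign(g), x - eps, x + eps)
        cur = feature_loss(x_adv, weights, targets, layer_weights)
        if cur < best_loss:
            best_x, best_loss = x_adv.copy(), cur
    return best_x

# Usage with random stand-ins for a source model and two inputs.
rng = np.random.default_rng(0)
weights = [0.5 * rng.normal(size=(16, 8)), 0.5 * rng.normal(size=(4, 16))]
x_source, x_target = rng.normal(size=8), rng.normal(size=8)
_, target_feats = forward(x_target, weights)
x_adv = multilayer_feature_attack(x_source, weights, target_feats, [1.0, 1.0])
```

In this sketch, setting an entry of `layer_weights` to zero drops that layer from the objective, which is one simple way to compare single-layer and multilayer perturbations. Whether the resulting perturbation transfers to a separate blackbox model is not something the toy example demonstrates.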

Duke Scholars

Author: Lawrence Carin, Electrical and Computer Engineering

Author: Yiran Chen, Electrical and Computer Engineering

Published In

Advances in Neural Information Processing Systems

ISSN

1049-5258

Publication Date

January 1, 2020

Volume

2020-December

Related Subject Headings

4611 Machine learning
1702 Cognitive Sciences
1701 Psychology

Citation

APA: Inkawhich, N., Liang, K. J., Wang, B., Inkawhich, M., Carin, L., & Chen, Y. (2020). Perturbing across the feature hierarchy to improve standard and strict blackbox attack transferability. In Advances in Neural Information Processing Systems (Vol. 2020-December).

Chicago: Inkawhich, N., K. J. Liang, B. Wang, M. Inkawhich, L. Carin, and Y. Chen. “Perturbing across the feature hierarchy to improve standard and strict blackbox attack transferability.” In Advances in Neural Information Processing Systems, Vol. 2020-December, 2020.

ICMJE: Inkawhich N, Liang KJ, Wang B, Inkawhich M, Carin L, Chen Y. Perturbing across the feature hierarchy to improve standard and strict blackbox attack transferability. In: Advances in Neural Information Processing Systems. 2020.

MLA: Inkawhich, N., et al. “Perturbing across the feature hierarchy to improve standard and strict blackbox attack transferability.” Advances in Neural Information Processing Systems, vol. 2020-December, 2020.

NLM: Inkawhich N, Liang KJ, Wang B, Inkawhich M, Carin L, Chen Y. Perturbing across the feature hierarchy to improve standard and strict blackbox attack transferability. Advances in Neural Information Processing Systems. 2020.
