Concept whitening for interpretable image recognition
What does a neural network encode about a concept as we traverse through its layers? Interpretability in machine learning is undoubtedly important, but the computations of neural networks are very difficult to understand. Attempts to see inside their hidden layers can be misleading, unusable, or rely on the latent space possessing properties that it may not have. Here, rather than attempting to analyse a neural network post hoc, we introduce a mechanism, called concept whitening (CW), that alters a given layer of the network so that we can better understand the computation leading up to that layer. When a concept whitening module is added to a convolutional neural network, the latent space is whitened (that is, decorrelated and normalized) and the axes of the latent space are aligned with known concepts of interest. Through experiments, we show that CW provides a much clearer understanding of how the network gradually learns concepts over the layers. CW is an alternative to a batch normalization layer in that it normalizes, and also decorrelates (whitens), the latent space. CW can be used in any layer of the network without hurting predictive performance.
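To make the idea concrete, the sketch below shows a minimal PyTorch-style module in the spirit of CW: it whitens convolutional activations (decorrelates and normalizes them, ZCA-style) and then applies an orthogonal rotation intended to align the resulting axes with concepts. This is an illustrative assumption, not the authors' released implementation; the class name `ConceptWhitening2d` is hypothetical, and the concept-alignment step that optimizes the rotation against labelled concept examples is omitted and the rotation is left as an identity buffer for brevity.

```python
import torch
import torch.nn as nn


class ConceptWhitening2d(nn.Module):
    """Sketch of a concept-whitening-style layer: whiten, then rotate.

    Hypothetical illustration: whitening uses per-batch ZCA statistics with
    running estimates (as in batch norm); the orthogonal `rotation` would be
    learned so each axis tracks a known concept, but that step is elided here.
    """

    def __init__(self, num_features, eps=1e-5, momentum=0.1):
        super().__init__()
        self.eps = eps
        self.momentum = momentum
        # Running statistics, analogous to a batch normalization layer.
        self.register_buffer("running_mean", torch.zeros(num_features))
        self.register_buffer("running_wm", torch.eye(num_features))
        # Orthogonal rotation aligning whitened axes with concepts
        # (fixed to the identity in this sketch).
        self.register_buffer("rotation", torch.eye(num_features))

    def forward(self, x):
        # x: (N, C, H, W); treat every spatial position as a sample of a
        # C-dimensional feature vector.
        n, c, h, w = x.shape
        flat = x.permute(1, 0, 2, 3).reshape(c, -1)

        if self.training:
            mean = flat.mean(dim=1, keepdim=True)
            centered = flat - mean
            cov = centered @ centered.t() / centered.shape[1]
            cov = cov + self.eps * torch.eye(c, device=x.device)
            # ZCA whitening matrix Sigma^{-1/2} via eigendecomposition.
            eigvals, eigvecs = torch.linalg.eigh(cov)
            wm = eigvecs @ torch.diag(eigvals.rsqrt()) @ eigvecs.t()
            with torch.no_grad():
                self.running_mean.mul_(1 - self.momentum).add_(
                    self.momentum * mean.squeeze(1)
                )
                self.running_wm.mul_(1 - self.momentum).add_(self.momentum * wm)
        else:
            mean = self.running_mean.unsqueeze(1)
            centered = flat - mean
            wm = self.running_wm

        whitened = wm @ centered            # decorrelated, unit-variance features
        rotated = self.rotation @ whitened  # axes intended to match concepts
        return rotated.reshape(c, n, h, w).permute(1, 0, 2, 3)
```

Since the abstract presents CW as an alternative to batch normalization, a natural way to use a module like this would be as a drop-in replacement for a `BatchNorm2d` layer in a trained CNN, followed by fine-tuning; in the actual method the rotation is additionally optimized so that each latent axis responds to a predefined concept of interest.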