Scholars@Duke publication: Exploring the cloud of variable importance for the set of all good models

Exploring the cloud of variable importance for the set of all good models

Publication, Journal Article

Dong, J; Rudin, C

Published in: Nature Machine Intelligence

December 1, 2020

Variable importance is central to scientific studies, including the social sciences and causal inference, healthcare and other domains. However, current notions of variable importance are often tied to a specific predictive model. This is problematic: what if there were multiple well-performing predictive models, and a specific variable is important to some of them but not to others? In that case, we cannot tell from a single well-performing model if a variable is always important, sometimes important, never important or perhaps only important when another variable is not important. Ideally, we would like to explore variable importance for all approximately equally accurate predictive models within the same model class. In this way, we can understand the importance of a variable in the context of other variables, and for many good models. This work introduces the concept of a variable importance cloud, which maps every variable to its importance for every good predictive model. We show properties of the variable importance cloud and draw connections to other areas of statistics. We introduce variable importance diagrams as a projection of the variable importance cloud into two dimensions for visualization purposes. Experiments with criminal justice, marketing data and image classification tasks illustrate how variables can change dramatically in importance for approximately equally accurate predictive models.
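The idea of a variable importance cloud can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' method: the model class is plain linear regression, the set of "good models" is approximated by randomly sampling coefficient vectors near the least-squares fit and keeping those whose loss is within a factor (1 + eps) of the best loss, and importance is measured by a simple permutation score standing in for the paper's importance measure. The function names, the synthetic data, and the parameters `eps`, `n_candidates` and `scale` are all hypothetical.

```python
import numpy as np


def mse(X, y, w):
    """Mean squared error of the linear model with coefficients w."""
    return float(np.mean((X @ w - y) ** 2))


def permutation_importance(X, y, w, j, rng):
    """Increase in loss when column j is shuffled (illustrative importance score)."""
    X_perm = X.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])
    return mse(X_perm, y, w) - mse(X, y, w)


def importance_cloud(X, y, eps=0.05, n_candidates=2000, scale=0.5, seed=0):
    """Return sampled good models and one importance vector per good model."""
    rng = np.random.default_rng(seed)
    w_best, *_ = np.linalg.lstsq(X, y, rcond=None)
    best_loss = mse(X, y, w_best)

    # Candidate models: the least-squares fit plus random perturbations of it.
    noise = scale * rng.standard_normal((n_candidates, X.shape[1]))
    candidates = np.vstack([w_best, w_best + noise])

    # Keep only models that are approximately as accurate as the best one.
    good = np.array([w for w in candidates if mse(X, y, w) <= (1 + eps) * best_loss])

    # One row per good model, one column per variable.
    cloud = np.array(
        [[permutation_importance(X, y, w, j, rng) for j in range(X.shape[1])]
         for w in good]
    )
    return good, cloud


# Synthetic data (assumed): two strongly correlated variables that can
# substitute for each other in a good model.
data_rng = np.random.default_rng(1)
n = 500
x0 = data_rng.standard_normal(n)
x1 = x0 + 0.1 * data_rng.standard_normal(n)
X = np.column_stack([x0, x1])
y = x0 + x1 + 0.5 * data_rng.standard_normal(n)

good_models, cloud = importance_cloud(X, y)
print("good models:", len(good_models))
print("importance range per variable:", cloud.min(axis=0), cloud.max(axis=0))
```

With two variables, plotting the first column of `cloud` against the second gives a two-dimensional picture in the spirit of the variable importance diagram described above. Because the two synthetic variables are nearly interchangeable, the sampled good models can place quite different importance on each of them.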

Duke Scholars

Author: Cynthia D. Rudin, Computer Science

Published In

Nature Machine Intelligence

DOI

10.1038/s42256-020-00264-0

EISSN

2522-5839

Publication Date

December 1, 2020

Volume

2

Issue

12

Start / End Page

810 / 824

Related Subject Headings

46 Information and computing sciences
40 Engineering

Citation

APA

Dong, J., & Rudin, C. (2020). Exploring the cloud of variable importance for the set of all good models. Nature Machine Intelligence, 2(12), 810–824. https://doi.org/10.1038/s42256-020-00264-0

Chicago

Dong, J., and C. Rudin. “Exploring the cloud of variable importance for the set of all good models.” Nature Machine Intelligence 2, no. 12 (December 1, 2020): 810–24. https://doi.org/10.1038/s42256-020-00264-0.

ICMJE

Dong J, Rudin C. Exploring the cloud of variable importance for the set of all good models. Nature Machine Intelligence. 2020 Dec 1;2(12):810–24.

MLA

Dong, J., and C. Rudin. “Exploring the cloud of variable importance for the set of all good models.” Nature Machine Intelligence, vol. 2, no. 12, Dec. 2020, pp. 810–24. Scopus, doi:10.1038/s42256-020-00264-0.

NLM

Dong J, Rudin C. Exploring the cloud of variable importance for the set of all good models. Nature Machine Intelligence. 2020 Dec 1;2(12):810–824.
