A Path to Simpler Models Starts With Noise.

Publication, Journal Article
Semenova, L; Chen, H; Parr, R; Rudin, C
Published in: Advances in neural information processing systems
December 2023

The Rashomon set is the set of models that perform approximately equally well on a given dataset, and the Rashomon ratio is the fraction of all models in a given hypothesis space that are in the Rashomon set. Rashomon ratios are often large for tabular datasets in criminal justice, healthcare, lending, education, and in other areas, which has practical implications about whether simpler models can attain the same level of accuracy as more complex models. An open question is why Rashomon ratios often tend to be large. In this work, we propose and study a mechanism of the data generation process, coupled with choices usually made by the analyst during the learning process, that determines the size of the Rashomon ratio. Specifically, we demonstrate that noisier datasets lead to larger Rashomon ratios through the way that practitioners train models. Additionally, we introduce a measure called pattern diversity, which captures the average difference in predictions between distinct classification patterns in the Rashomon set, and motivate why it tends to increase with label noise. Our results explain a key aspect of why simpler models often tend to perform as well as black box models on complex, noisier datasets.
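To make the two definitions above concrete, here is a minimal sketch of how a Rashomon ratio could be computed for a small, finite hypothesis space. It is an illustration only, not the authors' experimental setup: the hypothesis space (single-feature threshold classifiers), the synthetic data, the additive tolerance `epsilon`, and the helper names `stump_losses` and `rashomon_ratio` are all assumptions introduced here.

```python
import random


def rashomon_ratio(losses, epsilon):
    """Fraction of models whose loss is within epsilon of the best loss.

    Assumes a finite hypothesis space, one loss value per model, and an
    additive tolerance (an assumption made for this sketch).
    """
    best = min(losses)
    in_set = [loss <= best + epsilon for loss in losses]
    return sum(in_set) / len(losses)


def stump_losses(X, y, thresholds):
    """0-1 loss of every single-feature threshold classifier, both polarities."""
    losses = []
    n = len(y)
    for j in range(len(X[0])):
        for t in thresholds:
            # Classifier that predicts 1 when feature j exceeds threshold t.
            errors = sum(int(row[j] > t) != label for row, label in zip(X, y))
            err = errors / n
            losses.append(err)
            # Same split with the predicted labels flipped.
            losses.append(1.0 - err)
    return losses


# Hypothetical synthetic data: the label depends only on the first feature.
rng = random.Random(0)
X = [[rng.random() for _ in range(3)] for _ in range(500)]
y = [int(row[0] > 0.5) for row in X]

thresholds = [k / 10 for k in range(1, 10)]
losses = stump_losses(X, y, thresholds)
ratio = rashomon_ratio(losses, epsilon=0.05)
print(f"{len(losses)} stumps, Rashomon ratio = {ratio:.3f}")
```

Flipping a fraction of the labels in `y` before calling `stump_losses` is one simple way to explore how this ratio responds to label noise on toy data. The outcome depends on the hypothesis space and on `epsilon`, so such an experiment should not be read as a reproduction of the paper's results.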

Published In

Advances in neural information processing systems

ISSN

1049-5258

Publication Date

December 2023

Volume

36

Start / End Page

3362 / 3401

Related Subject Headings

  • 4611 Machine learning
  • 1702 Cognitive Sciences
  • 1701 Psychology
 

Citation

APA: Semenova, L., Chen, H., Parr, R., & Rudin, C. (2023). A Path to Simpler Models Starts With Noise. Advances in Neural Information Processing Systems, 36, 3362–3401.
Chicago: Semenova, Lesia, Harry Chen, Ronald Parr, and Cynthia Rudin. “A Path to Simpler Models Starts With Noise.” Advances in Neural Information Processing Systems 36 (December 2023): 3362–3401.
ICMJE: Semenova L, Chen H, Parr R, Rudin C. A Path to Simpler Models Starts With Noise. Advances in neural information processing systems. 2023 Dec;36:3362–401.
MLA: Semenova, Lesia, et al. “A Path to Simpler Models Starts With Noise.” Advances in Neural Information Processing Systems, vol. 36, Dec. 2023, pp. 3362–401.
NLM: Semenova L, Chen H, Parr R, Rudin C. A Path to Simpler Models Starts With Noise. Advances in neural information processing systems. 2023 Dec;36:3362–3401.
