Scholars@Duke publication: Subjectivity in the Creation of Machine Learning Models

Publication, Journal Article

Cummings, ML; Li, S

Published in: Journal of Data and Information Quality

May 13, 2021

Transportation analysts are inundated with requests to apply popular machine learning modeling techniques to datasets to uncover never-before-seen relationships that could potentially revolutionize safety, congestion, and mobility. However, the results from such models can be influenced not just by biases in the underlying data but also by practitioner-induced biases. To demonstrate the significant number of subjective judgments made in the development and interpretation of machine learning models, we developed Logistic Regression and Neural Network models for transportation-focused datasets, including those looking at driving injuries/fatalities and pedestrian fatalities. We then developed five different representations of feature importance for each dataset, including different feature interpretations commonly used in the machine learning community. Twelve distinct judgments were highlighted in the development and interpretation of these models, which produced inconsistent results. Such inconsistencies can lead to very different interpretations of the results and, in turn, to errors of commission and omission, with significant cost and safety implications if policies are erroneously adapted from such outcomes.
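To illustrate how one such judgment can change a feature-importance ranking, the following is a minimal sketch on synthetic data, not the authors' datasets, models, or code. It fits a logistic regression with plain gradient descent and ranks two features by raw coefficient magnitude and by standardized coefficient magnitude. These are two common choices and are not necessarily among the five representations used in the paper. The feature names, scales, and true coefficients are all assumptions made for illustration.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, steps=5000):
    """Logistic regression by batch gradient descent; returns (weights, bias)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
        w -= lr * (X.T @ (p - y)) / len(y)       # gradient of mean log-loss w.r.t. w
        b -= lr * np.mean(p - y)                 # gradient of mean log-loss w.r.t. b
    return w, b

# Synthetic data (hypothetical): one unit-scale feature, one large-scale feature.
rng = np.random.default_rng(0)
n = 2000
feature_names = ["unit_scale_feature", "large_scale_feature"]  # illustrative names
X = np.column_stack([rng.normal(0, 1, n), rng.normal(0, 100, n)])
true_logits = 1.0 * X[:, 0] + 0.03 * X[:, 1]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logits))).astype(float)

# Fit on standardized features, then convert back to raw-unit coefficients.
sd = X.std(axis=0)
X_std = (X - X.mean(axis=0)) / sd
w_std, _ = fit_logistic(X_std, y)
w_raw = w_std / sd

# Two importance rankings from the same fitted model (most important first).
rank_raw = np.argsort(-np.abs(w_raw))
rank_std = np.argsort(-np.abs(w_std))

print("Ranked by raw coefficient:         ", [feature_names[i] for i in rank_raw])
print("Ranked by standardized coefficient:", [feature_names[i] for i in rank_std])
```

In this sketch the two rankings disagree on the top feature even though they come from the same model, because the raw coefficient depends on the units of each feature while the standardized coefficient does not. Which ranking to report is a practitioner's choice of the kind the abstract describes.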

Published In

Journal of Data and Information Quality

DOI

10.1145/3418034

EISSN

1936-1963

ISSN

1936-1955

Publication Date

May 13, 2021

Volume

13

Issue

2

Related Subject Headings

4610 Library and information studies
4605 Data management and data science
08 Information and Computing Sciences

Citation

APA

Cummings, M. L., & Li, S. (2021). Subjectivity in the Creation of Machine Learning Models. Journal of Data and Information Quality, 13(2). https://doi.org/10.1145/3418034

Chicago

Cummings, M. L., and S. Li. “Subjectivity in the Creation of Machine Learning Models.” Journal of Data and Information Quality 13, no. 2 (May 13, 2021). https://doi.org/10.1145/3418034.

ICMJE

Cummings ML, Li S. Subjectivity in the Creation of Machine Learning Models. Journal of Data and Information Quality. 2021 May 13;13(2).

MLA

Cummings, M. L., and S. Li. “Subjectivity in the Creation of Machine Learning Models.” Journal of Data and Information Quality, vol. 13, no. 2, May 2021. Scopus, doi:10.1145/3418034.

NLM

Cummings ML, Li S. Subjectivity in the Creation of Machine Learning Models. Journal of Data and Information Quality. 2021 May 13;13(2).
