Cynthia D. Rudin

Earl D. McLean, Jr. Professor
Computer Science
LSRC D207, Durham, NC 27708

Selected Publications


Learning From Alarms: A Robust Learning Approach for Accurate Photoplethysmography-Based Atrial Fibrillation Detection Using Eight Million Samples Labeled With Imprecise Arrhythmia Alarms.

Journal Article IEEE journal of biomedical and health informatics · May 2024 Atrial fibrillation (AF) is a common cardiac arrhythmia with serious health consequences if not detected and treated early. Detecting AF using wearable devices with photoplethysmography (PPG) sensors and deep neural networks has demonstrated some success u ... Full text Cite

Evaluating Pre-trial Programs Using Interpretable Machine Learning Matching Algorithms for Causal Inference

Conference Proceedings of the AAAI Conference on Artificial Intelligence · March 25, 2024 After a person is arrested and charged with a crime, they may be released on bail and required to participate in a community supervision program while awaiting trial. These 'pretrial programs' are common throughout the United States, but very little resear ... Full text Cite

AsymMirai: Interpretable Mammography-based Deep Learning Model for 1-5-year Breast Cancer Risk Prediction.

Journal Article Radiology · March 2024 Background Mirai, a state-of-the-art deep learning-based algorithm for predicting short-term breast cancer risk, outperforms standard clinical risk models. However, Mirai is a black box, risking overreliance on the algorithm and incorrect diagnoses. Purpos ... Full text Link to item Cite

OKRidge: Scalable Optimal k-Sparse Ridge Regression.

Journal Article Advances in neural information processing systems · December 2023 We consider an important problem in scientific discovery, namely identifying sparse governing equations for nonlinear dynamical systems. This involves solving sparse ridge regression problems to provable optimality in order to determine which terms drive t ... Cite

A Path to Simpler Models Starts With Noise.

Journal Article Advances in neural information processing systems · December 2023 The Rashomon set is the set of models that perform approximately equally well on a given dataset, and the Rashomon ratio is the fraction of all models in a given hypothesis space that are in the Rashomon set. Rashomon ratios are often large for tabular dat ... Cite

Impact of Cannabis Use on Immune Cell Populations and the Viral Reservoir in People With HIV on Suppressive Antiretroviral Therapy.

Journal Article J Infect Dis · November 28, 2023 BACKGROUND: Human immunodeficiency virus (HIV) infection remains incurable due to the persistence of a viral reservoir despite antiretroviral therapy (ART). Cannabis (CB) use is prevalent amongst people with HIV (PWH), but the impact of CB on the latent HI ... Full text Link to item Cite

Interpretable algorithmic forensics.

Journal Article Proceedings of the National Academy of Sciences of the United States of America · October 2023 One of the most troubling trends in criminal investigations is the growing use of "black box" technology, in which law enforcement rely on artificial intelligence (AI) models or algorithms that are either too complex for people to understand or they simply ... Full text Cite

An Interpretable, Flexible, and Interactive Probabilistic Framework for Melody Generation

Conference Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · August 6, 2023 The fast-growing demand for algorithmic music generation is found throughout entertainment, art, education, etc. Unfortunately, most recent models are practically impossible to interpret or musically fine-tune, as they use deep neural networks with thousan ... Full text Cite

Effects of epileptiform activity on discharge outcome in critically ill patients in the USA: a retrospective cross-sectional study.

Journal Article The Lancet. Digital health · August 2023 Background Epileptiform activity is associated with worse patient outcomes, including increased risk of disability and death. However, the effect of epileptiform activity on neurological outcome is confounded by the feedback between treatment with ... Full text Cite

Prediction of tensile performance for 3D printed photopolymer gyroid lattices using structural porosity, base material properties, and machine learning

Journal Article Materials and Design · August 1, 2023 Advancements in additive manufacturing (AM) technology and three-dimensional (3D) modeling software have enabled the fabrication of parts with combinations of properties that were impossible to achieve with traditional manufacturing techniques. Porous desi ... Full text Cite

Tensile performance data of 3D printed photopolymer gyroid lattices.

Journal Article Data in brief · August 2023 Additive manufacturing has provided the ability to manufacture complex structures using a wide variety of materials and geometries. Structures such as triply periodic minimal surface (TPMS) lattices have been incorporated into products across many fields d ... Full text Cite

Applied machine learning as a driver for polymeric biomaterials design.

Journal Article Nature communications · August 2023 Polymers are ubiquitous to almost every aspect of modern society and their use in medical products is similarly pervasive. Despite this, the diversity in commercial polymers used in medicine is stunningly low. Considerable time and resources have been exte ... Full text Cite

Optimal Sparse Regression Trees

Conference Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023 · June 27, 2023 Regression trees are one of the oldest forms of AI models, and their predictions can be made without a calculator, which makes them broadly useful, particularly for high-stakes applications. Within the large literature on regression trees, there has been l ... Cite

Optimal Sparse Regression Trees.

Conference Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence · June 2023 Regression trees are one of the oldest forms of AI models, and their predictions can be made without a calculator, which makes them broadly useful, particularly for high-stakes applications. Within the large literature on regression trees, there has been l ... Full text Cite

In Pursuit of Interpretable, Fair and Accurate Machine Learning for Criminal Recidivism Prediction

Journal Article Journal of Quantitative Criminology · June 1, 2023 Objectives: We study interpretable recidivism prediction using machine learning (ML) models and analyze performance in terms of prediction ability, sparsity, and fairness. Unlike previous works, this study trains interpretable models that output probabilit ... Full text Cite

A user interface to communicate interpretable AI decisions to radiologists

Conference Progress in Biomedical Optics and Imaging - Proceedings of SPIE · January 1, 2023 Tools for computer-aided diagnosis based on deep learning have become increasingly important in the medical field. Such tools can be useful, but require effective communication of their decision-making process in order to safely and meaningfully guide clin ... Full text Cite

Variable Importance Matching for Causal Inference

Conference Proceedings of Machine Learning Research · January 1, 2023 Our goal is to produce methods for observational causal inference that are auditable, easy to troubleshoot, accurate for treatment effect estimation, and scalable to high-dimensional data. We describe a general framework called Model-to-Match that achieves ... Cite

The Mechanical Bard: An Interpretable Machine Learning Approach to Shakespearean Sonnet Generation

Conference Proceedings of the Annual Meeting of the Association for Computational Linguistics · January 1, 2023 We consider the automated generation of sonnets, a poetic form constrained according to meter, rhyme scheme, and length. Sonnets generally also use rhetorical figures, expressive language, and a consistent theme or narrative. Our constrained decoding appro ... Cite

Missing Values and Imputation in Healthcare Data: Can Interpretable Machine Learning Help?

Conference Proceedings of Machine Learning Research · January 1, 2023 Missing values are a fundamental problem in data science. Many datasets have missing values that must be properly handled because the way missing values are treated can have large impact on the resulting machine learning model. In medical applications, the ... Cite

This Looks Like Those: Illuminating Prototypical Concepts Using Multiple Visualizations

Conference Advances in Neural Information Processing Systems · January 1, 2023 We present ProtoConcepts, a method for interpretable image classification combining deep learning and case-based reasoning using prototypical parts. Existing work in prototype-based image classification uses a “this looks like that” reasoning process, whic ... Cite

The Rashomon Importance Distribution: Getting RID of Unstable, Single Model-based Variable Importance

Conference Advances in Neural Information Processing Systems · January 1, 2023 Quantifying variable importance is essential for answering high-stakes questions in fields like genetics, public policy, and medicine. Current methods generally calculate variable importance for a given model trained on a given dataset. However, for a give ... Cite

Why black box machine learning should be avoided for high-stakes decisions, in brief

Journal Article Nature Reviews Methods Primers · December 1, 2022 Full text Cite

How to see hidden patterns in metamaterials with interpretable machine learning

Journal Article Extreme Mechanics Letters · November 1, 2022 Machine learning models can assist with metamaterials design by approximating computationally expensive simulators or solving inverse design problems. However, past work has usually relied on black box deep neural networks, whose reasoning processes are op ... Full text Cite

Fast Optimization of Weighted Sparse Decision Trees for use in Optimal Treatment Regimes and Optimal Policy Design.

Conference CEUR workshop proceedings · October 2022 Sparse decision trees are one of the most common forms of interpretable models. While recent advances have produced algorithms that fully optimize sparse decision trees for prediction, that work does not address policy design, because the alg ... Cite

MALTS: Matching After Learning to Stretch

Journal Article Journal of Machine Learning Research · August 1, 2022 We introduce a flexible framework that produces high-quality almost-exact matches for causal inference. Most prior work in matching uses ad-hoc distance metrics, often leading to poor quality matches, particularly when there are irrelevant covariates. In t ... Cite

Data solidarity for machine learning for embryo selection: a call for the creation of an open access repository of embryo data.

Journal Article Reproductive biomedicine online · July 2022 The last decade has seen an explosion of machine learning applications in healthcare, with mixed and sometimes harmful results despite much promise and associated hype. A significant reason for the reversal in the reported benefit of these applications is ... Full text Cite

Towards a comprehensive evaluation of dimension reduction methods for transcriptomic data visualization.

Journal Article Communications biology · July 2022 Dimension reduction (DR) algorithms project data from high dimensions to lower dimensions to enable visualization of interesting high-dimensional structure. DR algorithms are widely used for analysis of single-cell transcriptomic data. Despite widespread u ... Full text Cite

On the Existence of Simpler Machine Learning Models

Conference ACM International Conference Proceeding Series · June 21, 2022 It is almost always easier to find an accurate-but-complex model than an accurate-yet-simple model. Finding optimal, sparse, accurate models of various forms (linear models with integer coefficients, decision sets, rule lists, decision trees) is generally ... Full text Cite

Causal Rule Sets for Identifying Subgroups with Enhanced Treatment Effects

Journal Article INFORMS Journal on Computing · May 1, 2022 A key question in causal inference analyses is how to find subgroups with elevated treatment effects. This paper takes a machine learning approach and introduces a generative model, causal rule sets (CRS), for interpretable subgroup discovery. A CRS model ... Full text Cite

Fast Sparse Classification for Generalized Linear and Additive Models.

Journal Article Proceedings of machine learning research · March 2022 We present fast classification techniques for sparse generalized linear and additive models. These techniques can handle thousands of features and thousands of observations in minutes, even in the presence of many highly correlated features. For fast spars ... Cite

A holistic approach to interpretability in financial lending: Models, visualizations, and summary-explanations

Journal Article Decision Support Systems · January 1, 2022 Lending decisions are usually made with proprietary models that provide minimally acceptable explanations to users. In a future world without such secrecy, what decision support tools would one want to use for justified lending decisions? This question is ... Full text Cite

Interpretable machine learning: Fundamental principles and 10 grand challenges

Journal Article Statistics Surveys · January 1, 2022 Interpretability in machine learning (ML) is crucial for high stakes decisions and troubleshooting. In this work, we provide fundamental principles for interpretable ML, and dispel common misunderstandings that dilute the importance of this crucial topic. ... Full text Cite

Rethinking Nonlinear Instrumental Variable Models through Prediction Validity

Journal Article Journal of Machine Learning Research · January 1, 2022 Instrumental variables (IV) are widely used in the social and health sciences in situations where a researcher would like to measure a causal effect but cannot perform an experiment. For valid causal inference in an IV model, there must be external (exogen ... Cite

Interpretable Deep Learning Models for Better Clinician-AI Communication in Clinical Mammography

Conference Progress in Biomedical Optics and Imaging - Proceedings of SPIE · January 1, 2022 There is increasing interest in using deep learning and computer vision to help guide clinical decisions, such as whether to order a biopsy based on a mammogram. Existing networks are typically black box, unable to explain how they make their predictions. ... Full text Cite

Fast Sparse Decision Tree Optimization via Reference Ensembles.

Journal Article Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence · January 2022 Sparse decision tree optimization has been one of the most fundamental problems in AI since its inception and is a challenge at the core of interpretable machine learning. Sparse decision tree optimization is computationally hard, and despite steady effort ... Full text Cite

TimberTrek: Exploring and Curating Sparse Decision Trees with Interactive Visualization

Conference Proceedings - 2022 IEEE Visualization Conference - Short Papers, VIS 2022 · January 1, 2022 Given thousands of equally accurate machine learning (ML) models, how can users choose among them? A recent ML technique enables domain experts and data scientists to generate a complete Rashomon set for sparse decision trees-a huge set of almost-optimal i ... Full text Cite

FasterRisk: Fast and Accurate Interpretable Risk Scores

Conference Advances in Neural Information Processing Systems · January 1, 2022 Over the last century, risk scores have been the most popular form of predictive model used in healthcare and criminal justice. Risk scores are sparse linear models with integer coefficients; often these models can be memorized or placed on an index card. ... Cite

Data Poisoning Attacks on Off-Policy Policy Evaluation Methods

Conference Proceedings of the 38th Conference on Uncertainty in Artificial Intelligence, UAI 2022 · January 1, 2022 Off-policy Evaluation (OPE) methods are a crucial tool for evaluating policies in high-stakes domains such as healthcare, where exploration is often infeasible, unethical, or expensive. However, the extent to which such methods can be trusted under adversa ... Cite

Exploring the Whole Rashomon Set of Sparse Decision Trees.

Conference Advances in neural information processing systems · January 2022 In any given machine learning problem, there might be many models that explain the data almost equally well. However, most learning algorithms return only one of these models, leaving practitioners with no practical way to explore alternative models that m ... Cite

Data Poisoning Attacks on Off-Policy Policy Evaluation Methods

Conference Proceedings of Machine Learning Research · January 1, 2022 Off-policy Evaluation (OPE) methods are a crucial tool for evaluating policies in high-stakes domains such as healthcare, where exploration is often infeasible, unethical, or expensive. However, the extent to which such methods can be trusted under adversa ... Cite

A supervised machine learning semantic segmentation approach for detecting artifacts in plethysmography signals from wearables.

Journal Article Physiological measurement · December 2021 Objective. Wearable devices equipped with plethysmography (PPG) sensors provided a low-cost, long-term solution to early diagnosis and continuous screening of heart conditions. However PPG signals collected from such devices often suffer from corrup ... Full text Cite

A case-based interpretable deep learning model for classification of mass lesions in digital mammography

Journal Article Nature Machine Intelligence · December 1, 2021 Interpretability in machine learning models is important in high-stakes decisions such as whether to order a biopsy based on a mammographic exam. Mammography poses important challenges that are not present in other computer vision tasks: datasets are small ... Full text Cite

A Theory of Statistical Inference for Ensuring the Robustness of Scientific Results

Journal Article Management Science · October 1, 2021 Inference is the process of using facts we know to learn about facts we do not know. A theory of inference gives assumptions necessary to get from the former to the latter, along with a definition for and summary of the resulting uncertainty. Any one theor ... Full text Cite

Ethical Implementation of Artificial Intelligence to Select Embryos in in Vitro Fertilization

Conference AIES 2021 - Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society · July 21, 2021 AI has the potential to revolutionize many areas of healthcare. Radiology, dermatology, and ophthalmology are some of the areas most likely to be impacted in the near future, and they have received significant attention from the broader research community. ... Full text Cite

There once was a really bad poet, it was automated but you didn’t know it

Journal Article Transactions of the Association for Computational Linguistics · July 8, 2021 Limerick generation exemplifies some of the most difficult challenges faced in poetry generation, as the poems must tell a story in only five lines, with constraints on rhyme, stress, and meter. To address these challenges, we introduce LimGen, a novel and ... Full text Cite

IAIA-BL: A Case-based Interpretable Deep Learning Model for Classification of Mass Lesions in Digital Mammography

Journal Article · March 23, 2021 Interpretability in machine learning models is important in high-stakes decisions, such as whether to order a biopsy based on a mammographic exam. Mammography poses important challenges that are not present in other computer vision tasks: datasets are smal ... Link to item Cite

dame-flame: A Python Library Providing Fast Interpretable Matching for Causal Inference

Journal Article · January 5, 2021 dame-flame is a Python package for performing matching for observational causal inference on datasets containing discrete covariates. This package implements the Dynamic Almost Matching Exactly (DAME) and Fast Large-Scale Almost Matching Exactly (FLAME) al ... Open Access Link to item Cite

FLAME: A fast large-scale almost matching exactly approach to causal inference

Journal Article Journal of Machine Learning Research · January 1, 2021 A classical problem in causal inference is that of matching, where treatment units need to be matched to control units based on covariate information. In this work, we propose a method that computes high quality almost-exact matches for high-dimensional ca ... Open Access Cite

Regulating greed over time in multi-armed bandits

Journal Article Journal of Machine Learning Research · January 1, 2021 In retail, there are predictable yet dramatic time-dependent patterns in customer behavior, such as periodic changes in the number of visitors, or increases in customers just before major holidays. The current paradigm of multi-armed bandit analysis does n ... Cite

Playing codenames with language graphs and word embeddings

Journal Article Journal of Artificial Intelligence Research · January 1, 2021 Although board games and video games have been studied for decades in artificial intelligence research, challenging word games remain relatively unexplored. Word games are not as constrained as games like chess or poker. Instead, word game strategy is defi ... Full text Cite

Understanding how dimension reduction tools work: An empirical approach to deciphering T-SNE, UMAP, TriMap, and PaCMAP for data visualization

Journal Article Journal of Machine Learning Research · January 1, 2021 Dimension reduction (DR) techniques such as t-SNE, UMAP, and TriMap have demonstrated impressive visualization performance on many real-world datasets. One tension that has always faced these methods is the trade-off between preservation of global structur ... Cite

Interpretable, not black-box, artificial intelligence should be used for embryo selection.

Journal Article Human reproduction open · January 2021 Artificial intelligence (AI) techniques are starting to be used in IVF, in particular for selecting which embryos to transfer to the woman. AI has the potential to process complex data sets, to be better at identifying subtle but important patterns, and to ... Full text Cite

Concept whitening for interpretable image recognition

Journal Article Nature Machine Intelligence · December 1, 2020 What does a neural network encode about a concept as we traverse through the layers? Interpretability in machine learning is undoubtedly important, but the calculations of neural networks are very challenging to understand. Attempts to see inside their hid ... Full text Cite

Exploring the cloud of variable importance for the set of all good models

Journal Article Nature Machine Intelligence · December 1, 2020 Variable importance is central to scientific studies, including the social sciences and causal inference, healthcare and other domains. However, current notions of variable importance are often tied to a specific predictive model. This is problematic: what ... Full text Cite

Cryo-ZSSR: multiple-image super-resolution based on deep internal learning

Journal Article · November 22, 2020 Single-particle cryo-electron microscopy (cryo-EM) is an emerging imaging modality capable of visualizing proteins and macro-molecular complexes at near-atomic resolution. The low electron-doses used to prevent sample radiation damage, result in images whe ... Link to item Cite

Towards Practical Lipschitz Bandits

Conference FODS 2020 - Proceedings of the 2020 ACM-IMS Foundations of Data Science Conference · October 19, 2020 Stochastic Lipschitz bandit algorithms balance exploration and exploitation, and have been used for a variety of important task domains. In this paper, we present a framework for Lipschitz bandit methods that adaptively learns partitions of context- and arm ... Full text Cite

AI reflections in 2019

Journal Article Nature Machine Intelligence · January 17, 2020 Full text Cite

Adaptive Hyper-box Matching for Interpretable Individualized Treatment Effect Estimation

Journal Article CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI 2020) · 2020 Open Access Cite

Almost-Matching-Exactly for Treatment Effect Estimation under Network Interference

Journal Article INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108 · 2020 Cite

Almost-Matching-Exactly for Treatment Effect Estimation under Network Interference

Conference Proceedings of Machine Learning Research · January 1, 2020 We propose a matching method that recovers direct treatment effects from randomized experiments where units are connected in an observed network, and units that share edges can potentially influence each others' outcomes. Traditional treatment effect estim ... Cite

Adaptive Hyper-box Matching for Interpretable Individualized Treatment Effect Estimation

Conference Proceedings of Machine Learning Research · January 1, 2020 We propose a matching method for observational data that matches units with others in unit-specific, hyper-box-shaped regions of the covariate space. These regions are large enough that many matches are created for each unit and small enough that the treat ... Cite

PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models

Conference Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · January 1, 2020 The primary aim of single-image super-resolution is to construct a high-resolution (HR) image from a corresponding low-resolution (LR) input. In previous approaches, which have generally been supervised, the training objective typically measures a pixel-wi ... Full text Cite

A transformer approach to contextual sarcasm detection in Twitter

Conference Proceedings of the Annual Meeting of the Association for Computational Linguistics · January 1, 2020 Understanding tone in Twitter posts will be increasingly important as more and more communication moves online. One of the most difficult, yet important tones to detect is sarcasm. In the past, LSTM and transformer architecture models have been used to tac ... Full text Cite

A Transformer Approach to Contextual Sarcasm Detection in Twitter

Conference FIGURATIVE LANGUAGE PROCESSING · 2020 Cite

Bandits for BMO functions

Conference 37th International Conference on Machine Learning, ICML 2020 · January 1, 2020 We study the bandit problem where the underlying expected reward is a Bounded Mean Oscillation (BMO) function. BMO functions are allowed to be discontinuous and unbounded, and are useful in modeling signals with infinities in the domain. We develop a tools ... Cite

Generalized and scalable optimal sparse decision trees

Conference 37th International Conference on Machine Learning, ICML 2020 · January 1, 2020 Decision tree optimization is notoriously difficult from a computational perspective but essential for the field of interpretable machine learning. Despite efforts over the past 40 years, only recently have optimization breakthroughs been made that have al ... Cite

Modeling recovery curves with application to prostatectomy.

Journal Article Biostatistics (Oxford, England) · October 2019 In many clinical settings, a patient outcome takes the form of a scalar time series with a recovery curve shape, which is characterized by a sharp drop due to a disruptive event (e.g., surgery) and subsequent monotonic smooth rise towards an asymptotic lev ... Full text Cite

Do Simpler Models Exist and How Can We Find Them?

Conference Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining · July 25, 2019 Full text Cite

Learning optimized risk scores

Journal Article Journal of Machine Learning Research · June 1, 2019 Risk scores are simple classification models that let users make quick risk predictions by adding and subtracting a few small numbers. These models are widely used in medicine and criminal justice, but are difficult to learn from data because they need to ... Cite

Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead.

Journal Article Nature machine intelligence · May 2019 Black box machine learning models are currently being used for high stakes decision-making throughout society, causing problems throughout healthcare, criminal justice, and in other domains. People have hoped that creating methods for explaining these blac ... Full text Cite

Interpretable Almost-Exact Matching for Causal Inference.

Journal Article Proceedings of machine learning research · April 2019 Matching methods are heavily used in the social and health sciences due to their interpretability. We aim to create the highest possible quality of treatment-control matches for categorical data in the potential outcomes framework. The method proposed in t ... Cite

The big data newsvendor: Practical insights from machine learning

Journal Article Operations Research · January 1, 2019 We investigate the data-driven newsvendor problem when one has n observations of p features related to the demand as well as historical demand data. Rather than a two-step process of first estimating a demand distribution then optimizing for the optimal or ... Full text Cite

Interpretable almost-matching-exactly with instrumental variables

Journal Article 35th Conference on Uncertainty in Artificial Intelligence, UAI 2019 · January 1, 2019 Uncertainty in the estimation of the causal effect in observational studies is often due to unmeasured confounding, i.e., the presence of unobserved covariates linki ... Cite

Reducing exploration of dying arms in mortal bandits

Conference 35th Conference on Uncertainty in Artificial Intelligence, UAI 2019 · January 1, 2019 Mortal bandits have proven to be extremely useful for providing news article recommendations, running automated online advertising campaigns, and for other applications where the set of avai ... Cite

All Models are Wrong, but Many are Useful: Learning a Variable's Importance by Studying an Entire Class of Prediction Models Simultaneously.

Journal Article Journal of machine learning research : JMLR · January 2019 Variable importance (VI) tools describe how much covariates contribute to a prediction model's accuracy. However, important variables for one well-performing model (for example, a linear model f(x) = xᵀβ with a fixed coef ... Cite

Interpretable almost-matching-exactly with instrumental variables

Conference 35th Conference on Uncertainty in Artificial Intelligence, UAI 2019 · January 1, 2019 Uncertainty in the estimation of the causal effect in observational studies is often due to unmeasured confounding, i.e., the presence of unobserved covariates linking treatments and outcomes. Instrumental Variables (IV) are commonly used to reduce the eff ... Cite

Reducing exploration of dying arms in mortal bandits

Conference 35th Conference on Uncertainty in Artificial Intelligence, UAI 2019 · January 1, 2019 Mortal bandits have proven to be extremely useful for providing news article recommendations, running automated online advertising campaigns, and for other applications where the set of available options changes over time. Previous work on this problem sho ... Cite

This looks like that: Deep learning for interpretable image recognition

Conference Advances in Neural Information Processing Systems · January 1, 2019 When we are faced with challenging image classification tasks, we often explain our reasoning by dissecting the image, and pointing out prototypical aspects of one class or another. The mounting evidence for each of the classes helps us make our final deci ... Cite

Interpretable Almost-Matching-Exactly With Instrumental Variables

Conference Proceedings of Machine Learning Research · January 1, 2019 Uncertainty in the estimation of the causal effect in observational studies is often due to unmeasured confounding, i.e., the presence of unobserved covariates linking treatments and outcomes. Instrumental Variables (IV) are commonly used to reduce the eff ... Cite

Reducing Exploration of Dying Arms in Mortal Bandits

Conference Proceedings of Machine Learning Research · January 1, 2019 Mortal bandits have proven to be extremely useful for providing news article recommendations, running automated online advertising campaigns, and for other applications where the set of available options changes over time. Previous work on this problem sho ... Cite

An Application of Matching After Learning To Stretch (MALTS) to the ACIC 2018 Causal Inference Challenge Data

Journal Article Observational Studies · January 1, 2019 In the learning-to-match framework for causal inference, a parameterized distance metric is trained on a holdout train set so that the matching yields accurate estimated conditional average treatment effects. This way, the matching can be as accurate as ot ... Full text Cite

Interpretable Image Recognition with Hierarchical Prototypes

Conference Proceedings of the AAAI Conference on Human Computation and Crowdsourcing · January 1, 2019 Vision models are interpretable when they classify objects on the basis of features that a person can directly understand. Recently, methods relying on visual feature prototypes have been developed for this purpose. However, in contrast to how humans categ ... Full text Cite

NTIRE 2018 challenge on single image super-resolution: Methods and results

Conference IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops · December 13, 2018 This paper reviews the 2nd NTIRE challenge on single image super-resolution (restoration of rich details in a low resolution image) with focus on proposed solutions and results. The challenge had 4 tracks. Track 1 employed the standard bicubic downscaling ... Full text Cite

New techniques for preserving global structure and denoising with low information loss in single-image super-resolution

Conference IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops · December 13, 2018 This work identifies and addresses two important technical challenges in single-image super-resolution: (1) how to upsample an image without magnifying noise and (2) how to preserve large scale structure when upsampling. We summarize the techniques we deve ... Full text Cite

Learning customized and optimized lists of rules with mathematical programming

Journal Article Mathematical Programming Computation · December 1, 2018 We introduce a mathematical programming approach to building rule lists, which are a type of interpretable, nonlinear, and logical machine learning classifier involving IF-THEN rules. Unlike traditional decision tree algorithms like CART and C5.0, this met ... Full text Cite

MALTS: Matching After Learning to Stretch

Journal Article Journal of Machine Learning Research 23(240) (2022) 1-42 · November 18, 2018 We introduce a flexible framework that produces high-quality almost-exact matches for causal inference. Most prior work in matching uses ad-hoc distance metrics, often leading to poor quality matches, particularly when there are irrelevant covariates. In t ... Link to item Cite

Optimized scoring systems: Toward trust in machine learning for healthcare and criminal justice

Journal Article Interfaces · September 1, 2018 Questions of trust in machine-learning models are becoming increasingly important as these tools are starting to be used widely for high-stakes decisions in medicine and criminal justice. Transparency of models is a key aspect affecting trust. Th ... Full text Cite

Interpretable Almost Matching Exactly for Causal Inference

Journal Article · June 18, 2018 We aim to create the highest possible quality of treatment-control matches for categorical data in the potential outcomes framework. Matching methods are heavily used in the social sciences due to their interpretability, but most matching methods do not pa ... Link to item Cite

A Shared Vision for Machine Learning in Neuroscience.

Journal Article J Neurosci · February 14, 2018 With ever-increasing advancements in technology, neuroscientists are able to collect data in greater volumes and with finer resolution. The bottleneck in understanding how the brain works is consequently shifting away from the amount and type of data we ca ... Full text Link to item Cite

Learning certifiably optimal rule lists for categorical data

Journal Article Journal of Machine Learning Research · January 1, 2018 We present the design and implementation of a custom discrete optimization technique for building rule lists over a categorical feature space. Our algorithm produces rule lists with optimal training performance, according to the regularized empirical risk, ... Cite

Deep learning for case-based reasoning through prototypes: A neural network that explains its predictions

Conference 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 · January 1, 2018 Deep neural networks are widely used for classification. These deep models often suffer from a lack of interpretability - they are particularly difficult to understand because of their non-linear nature. As a result, neural networks are often treated as “b ... Cite

Direct learning to rank and rerank

Conference International Conference on Artificial Intelligence and Statistics, AISTATS 2018 · January 1, 2018 Learning-to-rank techniques have proven to be extremely useful for prioritization problems, where we rank items in order of their estimated probabilities, and dedicate our limited resources to the top-ranked items. This work exposes a serious problem with ... Cite

An optimization approach to learning falling rule lists

Conference International Conference on Artificial Intelligence and Statistics, AISTATS 2018 · January 1, 2018 A falling rule list is a probabilistic decision list for binary classification, consisting of a series of if-then rules with antecedents in the if clauses and probabilities of the desired outcome (“1”) in the then clauses. Just as in a regular decision lis ... Cite

Association of an Electroencephalography-Based Risk Score With Seizure Probability in Hospitalized Patients.

Journal Article JAMA neurology · December 2017 Importance: Continuous electroencephalography (EEG) use in critically ill patients is expanding. There is no validated method to combine risk factors and guide clinicians in assessing seizure risk. Objective: To use seizure risk factors from E ... Full text Cite

Optimized risk scores

Conference Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · August 13, 2017 Risk scores are simple classification models that let users quickly assess risk by adding, subtracting, and multiplying a few small numbers. Such models are widely used in healthcare and criminal justice, but are often built ad hoc. In this paper, we prese ... Full text Cite

Learning certifiably optimal rule lists

Conference Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · August 13, 2017 We present the design and implementation of a custom discrete optimization technique for building rule lists over a categorical feature space. Our algorithm provides the optimal solution, with a certificate of optimality. By leveraging algorithmic bounds, ... Full text Cite

A Bayesian framework for learning rule sets for interpretable classification

Journal Article Journal of Machine Learning Research · August 1, 2017 We present a machine learning algorithm for building classifiers that are comprised of a small number of short rules. These are restricted disjunctive normal form models. An example of a classifier of this form is as follows: If X satisfies (condition A AN ... Cite

Interpretable classification models for recidivism prediction

Journal Article Journal of the Royal Statistical Society. Series A: Statistics in Society · June 1, 2017 We investigate a long-debated question, which is how to create predictive models of recidivism that are sufficiently accurate, transparent and interpretable to use for decision making. This question is complicated as these models are used to support differ ... Full text Cite

The World Health Organization Adult Attention-Deficit/Hyperactivity Disorder Self-Report Screening Scale for DSM-5.

Journal Article JAMA psychiatry · May 2017 Importance: Recognition that adult attention-deficit/hyperactivity disorder (ADHD) is common, seriously impairing, and usually undiagnosed has led to the development of adult ADHD screening scales for use in community, workplace, and primary care se ... Full text Cite

Scalable Bayesian rule lists

Conference 34th International Conference on Machine Learning, ICML 2017 · January 1, 2017 We present an algorithm for building probabilistic rule lists that is two orders of magnitude faster than previous work. Rule list algorithms are competitors for decision tree algorithms. They are associative classifiers, in that they are built from pre-mi ... Cite

Learning cost-effective and interpretable treatment regimes

Conference Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017 · January 1, 2017 Decision makers, such as doctors and judges, make crucial decisions such as recommending treatments to patients, and granting bail to defendants on a daily basis. Such decisions typically involve weighing the potential ben ... Cite

Learning cost-effective and interpretable treatment regimes

Conference Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017 · January 1, 2017 Decision makers, such as doctors and judges, make crucial decisions such as recommending treatments to patients, and granting bail to defendants on a daily basis. Such decisions typically involve weighing the potential benefits of taking an action against ... Cite

Bayesian inference of arrival rate and substitution behavior from sales transaction data with stockouts

Conference Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · August 13, 2016 When an item goes out of stock, sales transaction data no longer reflect the original customer demand, since some customers leave with no purchase while others substitute alternative products for the one that was out of stock. Here we develop a Bayesian hi ... Full text Cite

Bayesian rule sets for interpretable classification

Conference Proceedings - IEEE International Conference on Data Mining, ICDM · July 2, 2016 A Rule Set model consists of a small number of short rules for interpretable classification, where an instance is classified as positive if it satisfies at least one of the rules. The rule set provides reasons for predictions, and also descriptions of a pa ... Full text Cite

Prediction uncertainty and optimal experimental design for learning dynamical systems.

Journal Article Chaos (Woodbury, N.Y.) · June 2016 Dynamical systems are frequently used to model biological systems. When these models are fit to data, it is necessary to ascertain the uncertainty in the model fit. Here, we present prediction deviation, a metric of uncertainty that determines the extent t ... Full text Cite

The factorized self-controlled case series method: An approach for estimating the effects of many drugs on many outcomes

Journal Article Journal of Machine Learning Research · June 1, 2016 We provide a hierarchical Bayesian model for estimating the effects of transient drug exposures on a collection of health outcomes, where the effects of all drugs on all outcomes are estimated simultaneously. The method possesses properties that allow it t ... Cite

Learning classification models of cognitive conditions from subtle behaviors in the digital Clock Drawing Test

Journal Article Machine Learning · March 1, 2016 The Clock Drawing Test—a simple pencil and paper test—has been used for more than 50 years as a screening tool to differentiate normal individuals from those with cognitive impairment, and has proven useful in helping to diagnose cognitive dysfunction asso ... Full text Cite

Supersparse linear integer models for optimized medical scoring systems

Journal Article Machine Learning · March 1, 2016 Scoring systems are linear classification models that only require users to add, subtract and multiply a few small numbers in order to make a prediction. These models are in widespread use by the medical community, but are difficult to learn from data beca ... Full text Cite

Clinical Prediction Models for Sleep Apnea: The Importance of Medical History over Symptoms.

Journal Article Journal of clinical sleep medicine : JCSM : official publication of the American Academy of Sleep Medicine · February 2016 Study objective: Obstructive sleep apnea (OSA) is a treatable contributor to morbidity and mortality. However, most patients with OSA remain undiagnosed. We used a new machine learning method known as SLIM (Supersparse Linear Integer Models) to test ... Full text Cite

A Computational Model of Inhibition of HIV-1 by Interferon-Alpha.

Journal Article PloS one · January 2016 Type 1 interferons such as interferon-alpha (IFNα) inhibit replication of Human immunodeficiency virus (HIV-1) by upregulating the expression of genes that interfere with specific steps in the viral life cycle. This pathway thus represents a potential targ ... Full text Cite

CRAFT: ClusteR-specific Assorted Feature selecTion

Conference Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, AISTATS 2016 · January 1, 2016 We present a hierarchical Bayesian framework for clustering with cluster-specific feature selection. We derive a simplified model, CRAFT, by analyzing the asymptotic behavior of the log posterior formulations in a nonparametric MAP-based clustering setting ... Cite

A Bayesian approach to learning scoring systems

Journal Article Big Data · December 1, 2015 We present a Bayesian method for building scoring systems, which are linear models with coefficients that have very few significant digits. Usually the construction of scoring systems involves manual effort - humans invent the full scoring system without us ... Full text Cite

The latent state hazard model, with application to wind turbine reliability

Journal Article Annals of Applied Statistics · December 1, 2015 We present a new model for reliability analysis that is able to distinguish the latent internal vulnerability state of the equipment from the vulnerability caused by temporary external sources. Consider a wind farm where each turbine is running under the e ... Full text Cite

Generalization bounds for learning with linear, polygonal, quadratic and conic side knowledge

Journal Article Machine Learning · September 17, 2015 In this paper, we consider a supervised learning setting where side knowledge is provided about the labels of unlabeled examples. The side knowledge has the effect of reducing the hypothesis space, leading to tighter generalization bounds, and thus possibl ... Full text Cite

Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model

Journal Article Annals of Applied Statistics · September 1, 2015 We aim to produce predictive models that are not only accurate, but are also interpretable to human experts. Our models are decision lists, which consist of a series of if … then … statements (e.g., if high blood pressure, then stroke) that discretize a ... Full text Cite

Finding Patterns with a Rotten Core: Data Mining for Crime Series with Cores

Journal Article Big Data · March 1, 2015 One of the most challenging problems facing crime analysts is that of identifying crime series, which are sets of crimes committed by the same individual or group. Detecting crime series can be an important step in predictive policing, as knowledge of a pa ... Full text Cite

Falling rule lists

Conference Journal of Machine Learning Research · January 1, 2015 Falling rule lists are classification models consisting of an ordered list of if-then rules, where (i) the order of rules determines which example should be classified by each rule, and (ii) the estimated probability of success decreases monotonically down ... Cite

Turning prediction tools into decision tools

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2015 Arguably, the main stumbling block in getting machine learning algorithms used in practice is the fact that people do not trust them. There could be many reasons for this, for instance, perhaps the models are not sparse or transparent, or perhaps the model ... Cite

Reactive point processes: A new approach to predicting power failures in underground electrical systems

Journal Article Annals of Applied Statistics · January 1, 2015 Reactive point processes (RPPs) are a new statistical model designed for predicting discrete events in time based on past history. RPPs were developed to handle an important problem within the domain of electrical grid reliability: short-term prediction of ... Full text Cite

Tire changes, fresh air, and yellow flags: Challenges in predictive analytics for professional racing

Journal Article Big Data · June 1, 2014 Our goal is to design a prediction and decision system for real-time use during a professional car race. In designing a knowledge discovery process for racing, we faced several challenges that were overcome only when domain knowledge of racing was carefull ... Full text Cite

A statistical learning theory framework for supervised pattern discovery

Conference SIAM International Conference on Data Mining 2014, SDM 2014 · January 1, 2014 This paper formalizes a latent variable inference problem we call supervised pattern discovery, the goal of which is to find sets of observations that belong to a single "pattern." We discuss two versions of the problem and prove uniform risk bounds for b ... Full text Cite

The Bayesian case model: A generative approach for case-based reasoning and prototype classification

Conference Advances in Neural Information Processing Systems · January 1, 2014 We present the Bayesian Case Model (BCM), a general framework for Bayesian case-based reasoning (CBR) and prototype classification and clustering. BCM brings the intuitive power of CBR to a Bayesian generative framework. The BCM learns prototypes, the "qui ... Cite

Box drawings for learning with imbalanced data

Conference Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · January 1, 2014 The vast majority of real world classification problems are imbalanced, meaning there are far fewer data from the class of interest (the positive class) than from other classes. We propose two machine learning algorithms to handle highly imbalanced classif ... Full text Cite

On combining machine learning with decision making

Conference Machine Learning · January 1, 2014 We present a new application and covering number bound for the framework of "Machine Learning with Operational Costs (MLOC)," which is an exploratory form of decision theory. The MLOC framework incorporates knowledge about how a predictive model will be us ... Full text Cite

Approximating the crowd

Journal Article Data Mining and Knowledge Discovery · January 1, 2014 The problem of "approximating the crowd" is that of estimating the crowd's majority opinion by querying only a subset of it. Algorithms that approximate the crowd can intelligently stretch a limited budget for a crowdsourcing task. We present an algorithm, ... Full text Cite

Learning about meetings

Journal Article Data Mining and Knowledge Discovery · January 1, 2014 Most people participate in meetings almost every day, multiple times a day. The study of meetings is important, but also challenging, as it requires an understanding of social signals and complex interpersonal dynamics. Our aim in this work is to use a dat ... Full text Cite

Analytics for power grid distribution reliability in New York City

Journal Article Interfaces · January 1, 2014 We summarize the first major effort to use analytics for preemptive maintenance and repair of an electrical distribution network. This is a large-scale multiyear effort between scientists and students at Columbia University and the Massachusetts Institute ... Full text Cite

Modeling weather impact on a secondary electrical grid

Conference Procedia Computer Science · January 1, 2014 Weather can cause problems for underground electrical grids by increasing the probability of serious "manhole events" such as fires and explosions. In this work, we compare a model that incorporates weather features associated with the dates of serious eve ... Full text Cite

Machine learning for science and society

Journal Article Machine Learning · January 1, 2014 The special issue on "Machine Learning for Science and Society" showcases machine learning work with influence on our current and future society. These papers address several key problems such as how we perform repairs on critical infrastructure, how we pr ... Full text Cite

Robust optimization using machine learning for uncertainty sets

Conference International Symposium on Artificial Intelligence and Mathematics, ISAIM 2014 · January 1, 2014 Our goal is to build robust optimization problems that make decisions about the future, and where complex data from the past are used to model uncertainty. In robust optimization (RO) generally ... Cite

Generalization bounds for learning with linear and quadratic side knowledge

Conference International Symposium on Artificial Intelligence and Mathematics, ISAIM 2014 · January 1, 2014 In this paper, we consider a supervised learning setting where side knowledge is provided about the labels of unlabeled examples. The side knowledge has the effect of reducing the hypothesis sp ... Cite

Toward a theory of pattern discovery

Conference International Symposium on Artificial Intelligence and Mathematics, ISAIM 2014 · January 1, 2014 This paper formalizes a latent variable inference problem we call supervised pattern discovery, the goal of which is to find sets of observations that belong to a single “pattern.” We discuss t ... Cite

Growing a list

Journal Article Data Mining and Knowledge Discovery · December 1, 2013 It is easy to find expert knowledge on the Internet on almost any topic, but obtaining a complete overview of a given topic is not always easy: information can be scattered across many sources and must be aggregated to be useful. We introduce a method for ... Full text Cite

Learning theory analysis for association rules and sequential event prediction

Journal Article Journal of Machine Learning Research · November 1, 2013 We present a theoretical analysis for prediction algorithms based on association rules. As part of this analysis, we introduce a problem for which rules are particularly natural, called "sequential event prediction." In sequential event prediction, events ... Cite

Sequential event prediction

Journal Article Machine Learning · November 1, 2013 In sequential event prediction, we are given a "sequence database" of past event sequences to learn from, and we aim to predict the next event within a current event sequence. We focus on applications where the set of the past events has predictive power a ... Full text Cite

Learning to detect patterns of crime

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · October 31, 2013 Our goal is to automatically detect patterns of crime. Among a large set of crimes that happen every year in a major city, it is challenging, time-consuming, and labor-intensive for crime analysts to determine which ones may have been committed by the same ... Full text Cite

The rate of convergence of AdaBoost

Journal Article Journal of Machine Learning Research · August 1, 2013 The AdaBoost algorithm was designed to combine many "weak" hypotheses that perform slightly better than random guessing into a "strong" hypothesis that has very low error. We study the rate at which AdaBoost iteratively converges to the minimum of the "exp ... Cite

Machine learning with operational costs

Journal Article Journal of Machine Learning Research · June 1, 2013 This work proposes a way to align statistical modeling with decision making. We provide a method that propagates the uncertainty in predictive modeling to the uncertainty in operational cost, where operational cost is the amount spent by the practitioner i ... Cite

Predicting power failures with reactive point processes

Conference AAAI Workshop - Technical Report · January 1, 2013 Cite

Machine learning for meeting analysis

Conference AAAI Workshop - Technical Report · January 1, 2013 Most people participate in meetings almost every day, multiple times a day. The study of meetings is important, but also challenging, as it requires an understanding of social signals and complex interpersonal dynamics. Our aim in this work is to use a data-d ... Cite

An interpretable stroke prediction model using rules and Bayesian analysis

Conference AAAI Workshop - Technical Report · January 1, 2013 We aim to produce predictive models that are not only accurate, but are also interpretable to human experts. We introduce a generative model called the Bayesian List Machine for fitting decision lists, a type of interpretable classifier, to data. We use th ... Cite

Detecting patterns of crime with Series Finder

Conference AAAI Workshop - Technical Report · January 1, 2013 Many crimes can happen every day in a major city, and figuring out which ones are committed by the same individual or group is an important and difficult data mining challenge. To do this, we propose a pattern detection algorithm called Series Finder, that ... Cite

Supersparse linear integer models for predictive scoring systems

Conference AAAI Workshop - Technical Report · January 1, 2013 Cite

The influence of operational cost on estimation

Conference International Symposium on Artificial Intelligence and Mathematics, ISAIM 2012 · December 1, 2012 This work concerns the way that statistical models are used to make decisions. In particular, we aim to merge the way estimation algorithms are designed with how they are used for a subsequent task. Our methodology considers the operational cost of carryin ... Cite

An integer optimization approach to associative classification

Conference Advances in Neural Information Processing Systems · December 1, 2012 We aim to design classifiers that have the interpretability of association rules yet have predictive power on par with the top machine learning algorithms for classification. We propose a novel mixed integer optimization (MIO) approach called Ordered Rules ... Cite

Selective sampling of labelers for approximating the crowd

Conference AAAI Fall Symposium - Technical Report · December 1, 2012 In this paper, we present CrowdSense, an algorithm for estimating the crowd's majority opinion by querying only a subset of it. CrowdSense works in an online fashion where examples come one at a time and it dynamically samples subsets of labelers based on ... Cite

How to reverse-engineer quality rankings

Journal Article Machine Learning · September 1, 2012 A good or bad product quality rating can make or break an organization. However, the notion of "quality" is often defined by an independent rating company that does not make the formula for determining the rank of a product publicly available. In order to ... Full text Cite

Bayesian hierarchical rule modeling for predicting medical conditions

Journal Article Annals of Applied Statistics · June 1, 2012 We propose a statistical modeling technique, called the Hierarchical Association Rule Model (HARM), that predicts a patient's possible future medical conditions given the patient's current and past history of reported conditions. The core of our technique ... Full text Cite

Machine learning for the New York City power grid

Journal Article IEEE Transactions on Pattern Analysis and Machine Intelligence · January 1, 2012 Power companies can benefit from the use of knowledge discovery methods and statistical machine learning for preventive maintenance. We introduce a general process for transforming historical electrical grid data into models that aim to predict the risk of ... Full text Cite

The machine learning and traveling repairman problem

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · October 31, 2011 The goal of the Machine Learning and Traveling Repairman Problem (ML&TRP) is to determine a route for a "repair crew," which repairs nodes on a graph. The repair crew aims to minimize the cost of failures at the nodes, but the failure probabilities are not ... Full text Cite

On equivalence relationships between classification and ranking algorithms

Journal Article Journal of Machine Learning Research · October 1, 2011 We demonstrate that there are machine learning algorithms that can achieve success for two separate tasks simultaneously, namely the tasks of classification and bipartite ranking. This means that advantages gained from solving one task can be carried over ... Cite

Data quality assurance and performance measurement of data mining for preventive maintenance of power grid

Conference Proceedings of the 1st International Workshop on Data Mining for Service and Maintenance, KDD4Service 2011 - Held in Conjunction with SIGKDD'11 · September 15, 2011 Ensuring reliability as the electrical grid morphs into the "smart grid" will require innovations in how we assess the state of the grid, for the purpose of proactive maintenance, rather than reactive maintenance; in the future, we will not only react to f ... Full text Cite

Estimation of system reliability using a semiparametric model

Conference IEEE 2011 EnergyTech, ENERGYTECH 2011 · August 17, 2011 An important problem in reliability engineering is to predict the failure rate, that is, the frequency with which an engineered system or component fails. This paper presents a new method of estimating failure rate using a semiparametric model with Gaussia ... Full text Cite

21st-century data miners meet 19th-century electrical cables

Journal Article Computer · June 1, 2011 Researchers can repurpose even extremely raw historical data for use in prediction. ... Full text Cite

Sequential event prediction with association rules

Conference Journal of Machine Learning Research · January 1, 2011 We consider a supervised learning problem in which data are revealed sequentially and the goal is to determine what will next be revealed. In the context of this problem, algorithms based on association rules have a distinct advantage over classical statis ... Cite

The rate of convergence of AdaBoost

Conference Journal of Machine Learning Research · January 1, 2011 The AdaBoost algorithm of Freund and Schapire (1997) was designed to combine many "weak" hypotheses that perform slightly better than a random guess into a "strong" hypothesis that has very low error. We study the rate at which AdaBoost iteratively conver ... Cite

A process for predicting manhole events in Manhattan

Journal Article Machine Learning · July 1, 2010 We present a knowledge discovery and data mining process developed as part of the Columbia/Con Edison project on manhole event prediction. This process can assist with real-world prioritization problems that involve raw data in the form of noisy documents ... Full text Cite

Online coordinate boosting

Conference 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops 2009 · December 1, 2009 We present a new online boosting algorithm for updating the weights of a boosted classifier, which yields a closer approximation to the edges found by Freund and Schapire's AdaBoost algorithm than previous online boosting algorithms. We contribute a new wa ... Full text Cite

Report cards for manholes: Eliciting expert feedback for a learning task

Conference 8th International Conference on Machine Learning and Applications, ICMLA 2009 · December 1, 2009 We present a manhole profiling tool, developed as part of the Columbia/Con Edison machine learning project on manhole event prediction, and discuss its role in evaluating our machine learning model in three important ways: elimination of outliers, eliminat ... Full text Cite

Margin-based ranking and an equivalence between AdaBoost and RankBoost

Journal Article Journal of Machine Learning Research · November 30, 2009 We study boosting algorithms for learning to rank. We give a general margin-based bound for ranking based on covering numbers for the hypothesis space. Our bound suggests that algorithms that maximize the ranking margin will generalize well. We then descri ... Cite

The P-norm push: A simple convex ranking algorithm that concentrates at the top of the list

Journal Article Journal of Machine Learning Research · November 30, 2009 We are interested in supervised ranking algorithms that perform especially well near the top of the ranked list, and are only required to perform sufficiently well on the rest of the list. In this work, we provide a general form of convex objective that gi ... Cite

Reducing noise in labels and features for a real world dataset: Application of NLP corpus annotation methods

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · July 21, 2009 This paper illustrates how a combination of information extraction, machine learning, and NLP corpus annotation practice was applied to a problem of ranking vulnerability of structures (service boxes, manholes) in the Manhattan electrical grid. By adapting ... Full text Cite

Arabic morphological tagging, diacritization, and lemmatization using lexeme models and feature ranking

Conference ACL-08: HLT - 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference · January 1, 2008 We investigate the tasks of general morphological tagging, diacritization, and lemmatization for Arabic. We show that for all tasks we consider, both modeling the lexeme explicitly, and retuning the weights of individual classifiers for the specific task, ... Full text Cite

Arabic morphological tagging, diacritization, and lemmatization using lexeme models and feature ranking

Conference Proceedings of the Annual Meeting of the Association for Computational Linguistics · January 1, 2008 We investigate the tasks of general morphological tagging, diacritization, and lemmatization for Arabic. We show that for all tasks we consider, both modeling the lexeme explicitly, and retuning the weights of individual classifiers for the specific task, ... Cite

Analysis of boosting algorithms using the smooth margin function

Journal Article Annals of Statistics · December 1, 2007 We introduce a useful tool for analyzing boosting algorithms called the "smooth margin function," a differentiable approximation of the usual margin for boosting algorithms. We present two boosting algorithms based on this smooth margin, "coordinate ascent ... Full text Cite

Ranking with a P-norm push

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2006 We are interested in supervised ranking with the following twist: our goal is to design algorithms that perform especially well near the top of the ranked list, and are only required to perform sufficiently well on the rest of the list. Towards this goal, ... Full text Cite

Re-Ranking Algorithms for Name Tagging

Conference HLT-NAACL 2006 - Computationally Hard Problems and Joint Inference in Speech and Language Processing, Proceedings of the Workshop · January 1, 2006 Integrating information from different stages of an NLP processing pipeline can yield significant error reduction. We demonstrate how re-ranking can improve name tagging in a Chinese information extraction system by incorporating information from relation ... Cite

Margin-based ranking meets boosting in the middle

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2005 We present several results related to ranking. We give a general margin-based bound for ranking based on the L∞ covering number of the hypothesis space. Our bound suggests that algorithms that maximize the ranking margin generalize well. We then describe a ... Full text Cite

The dynamics of AdaBoost: Cyclic behavior and convergence of margins

Journal Article Journal of Machine Learning Research · December 1, 2004 In order to study the convergence properties of the AdaBoost algorithm, we reduce AdaBoost to a nonlinear iterated map and study the evolution of its weight vectors. This dynamical systems approach allows us to understand AdaBoost's convergence properties ... Cite

Boosting based on a smooth margin

Journal Article Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) · January 1, 2004 We study two boosting algorithms, Coordinate Ascent Boosting and Approximate Coordinate Ascent Boosting, which are explicitly designed to produce maximum margins. To derive these algorithms, we introduce a smooth approximation of the margin that one can ma ... Full text Cite

On the dynamics of boosting

Conference Advances in Neural Information Processing Systems · January 1, 2004 In order to understand AdaBoost's dynamics, especially its ability to maximize margins, we derive an associated simplified nonlinear iterated map and analyze its behavior in low-dimensional cases. We find stable cycles for these cases, which can explicitly ... Cite