Journal Article · International Journal of Mechanical Sciences · June 1, 2025
In the field of metamaterial research, irregular structures offer a novel and less conventional approach compared to traditional periodic designs. Designing irregular metamaterials is challenging when it comes to ensuring interconnectivity, which is essent ...
Journal Article · Computer Methods in Applied Mechanics and Engineering · May 15, 2025
Manipulating the dispersive characteristics of vibrational waves is beneficial for many applications, e.g., high-precision instruments. Architected hierarchical phononic materials have shown promise for tuning elastodynamic waves and vibrations over m ...
Conference · Proceedings of the AAAI Conference on Artificial Intelligence · April 11, 2025
Dimension reduction (DR) algorithms have proven to be extremely useful for gaining insight into large-scale high-dimensional datasets, particularly finding clusters in transcriptomic data. The initial phase of these DR methods often involves converting the ...
Conference · Proceedings of the AAAI Conference on Artificial Intelligence · April 11, 2025
Health outcomes depend on complex environmental and sociodemographic factors whose effects change over location and time. Only recently has fine-grained spatial and temporal data become available to study these effects, namely the MEDSAT dataset of English ...
Journal Article · Journal of the American Medical Informatics Association : JAMIA · April 2025
Objective: Prediction of mortality in intensive care unit (ICU) patients typically relies on black box models (that are unacceptable for use in hospitals) or hand-tuned interpretable models (that might lead to a loss in performance). We aim to bri ...
Journal Article · Computers and Structures · December 1, 2024
Acoustic metamaterials are a subject of increasing study and utility. Through designed combinations of geometries with material properties, acoustic metamaterials can be built to arbitrarily manipulate acoustic waves for various applications. Despite the t ...
Journal Article · Nature Machine Intelligence · October 1, 2024
Rapid, reliable and accurate interpretation of medical time series signals is crucial for high-stakes clinical decision-making. Deep learning methods offered unprecedented performance in medical signal processing but at a cost: they were compute intensive ...
Journal Article · eLife · September 9, 2024
Understanding the interplay between the HIV reservoir and the host immune system may yield insights into HIV persistence during antiretroviral therapy (ART) and inform strategies for a cure. Here, we applied machine learning (ML) approaches to cross-sectio ...
Conference · Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · August 24, 2024
Music composition and analysis is an inherently creative task, involving a combination of heart and mind. However, the vast majority of algorithmic music models completely ignore the "heart" component of music, resulting in output that often lacks the rich ...
Journal Article · Physiological measurement · August 2024
Objective. Physiological data are often low quality, which compromises the effectiveness of related health monitoring. The primary goal of this study is to develop a robust foundation model that can effectively handle low-quality issues in physi ...
Journal Article · Annals of clinical and translational neurology · July 2024
Background/objectives: Epileptiform activity (EA), including seizures and periodic patterns, worsens outcomes in patients with acute brain injuries (e.g., aneurysmal subarachnoid hemorrhage [aSAH]). Randomized control trials (RCTs) assessing anti-se ...
Journal Article · Stochastic Systems · June 1, 2024
We study the problem of predicting congestion risk in intensive care units (ICUs). Congestion is associated with poor service experience, high costs, and poor health outcomes. By predicting future congestion, decision makers can initiate preventive measure ...
Journal Article · Genomics Proteomics Bioinformatics · May 9, 2024
Despite the success of antiretroviral therapy, human immunodeficiency virus (HIV) cannot be cured because of a reservoir of latently infected cells that evades therapy. To understand the mechanisms of HIV latency, we employed an integrated single-cell RNA ...
Journal Article · IEEE journal of biomedical and health informatics · May 2024
Atrial fibrillation (AF) is a common cardiac arrhythmia with serious health consequences if not detected and treated early. Detecting AF using wearable devices with photoplethysmography (PPG) sensors and deep neural networks has demonstrated some success u ...
Conference · Proceedings of machine learning research · May 2024
Interpretability is crucial for doctors, hospitals, pharmaceutical companies and biotechnology corporations to analyze and make decisions for high stakes problems that involve human health. Tree-based methods have been widely adopted for survival analys ...
Conference · Proceedings of the AAAI Conference on Artificial Intelligence · March 25, 2024
After a person is arrested and charged with a crime, they may be released on bail and required to participate in a community supervision program while awaiting trial. These 'pretrial programs' are common throughout the United States, but very little resear ...
Journal Article · Radiology · March 2024
Background: Mirai, a state-of-the-art deep learning-based algorithm for predicting short-term breast cancer risk, outperforms standard clinical risk models. However, Mirai is a black box, risking overreliance on the algorithm and incorrect diagnoses. Purpos ...
Conference · Proceedings of Machine Learning Research · January 1, 2024
Even if a model is not globally sparse, it is possible for decisions made from that model to be accurately and faithfully described by a small number of features. For instance, an application for a large loan might be denied to someone because they have no ...
Conference · Proceedings of Machine Learning Research · January 1, 2024
Recent advancements in statistical and reinforcement learning methods have contributed to superior patient care strategies. However, these methods face substantial challenges in high-stakes contexts, including missing data, stochasticity, and the need for ...
Conference · Proceedings of Machine Learning Research · January 1, 2024
Many modern causal questions ask how treatments affect complex outcomes that are measured using wearable devices and sensors. Current analysis approaches require summarizing these data into scalar statistics (e.g., the mean), but these summaries can be mis ...
Conference · Proceedings of Machine Learning Research · January 1, 2024
The Rashomon Effect, coined by Leo Breiman, describes the phenomenon that there exist many equally good predictive models for the same dataset. This phenomenon happens for many real datasets and when it does, it sparks both magic and consternation, but mos ...
Conference · IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops · January 1, 2024
Digital mammography is essential to breast cancer detection, and deep learning offers promising tools for faster and more accurate mammogram analysis. In radiology and other high-stakes environments, uninterpretable ("black box") deep learning models are u ...
Conference · Proceedings of SPIE the International Society for Optical Engineering · January 1, 2024
In the realm of metamaterial research, the exploration of random structures presents an innovative path less traveled, compared to the conventional focus on periodic designs. Our study introduces a novel framework for generating random metamaterials using ...
Chapter · January 1, 2024
Schenkerian Analysis (SchA) is a uniquely expressive method of music analysis, combining elements of melody, harmony, counterpoint, and form to describe the hierarchical structure supporting a work of music. However, despite its powerful analytical utility ...
Conference · Advances in Neural Information Processing Systems · January 1, 2024
Parametric dimensionality reduction methods have gained prominence for their ability to generalize to unseen datasets, an advantage that traditional approaches typically lack. Despite their growing popularity, there remains a prevalent misconception among ...
Conference · Advances in Neural Information Processing Systems · January 1, 2024
Sparsity is a central aspect of interpretability in machine learning. Typically, sparsity is measured in terms of the size of a model globally, such as the number of variables it uses. However, this notion of sparsity is not particularly relevant for decis ...
Conference · Advances in Neural Information Processing Systems · January 1, 2024
Noise in data significantly influences decision-making in the data science process. In fact, it has been shown that noise in data generation processes leads practitioners to find simpler models. However, an open question still remains: what is the degree o ...
Conference · Advances in Neural Information Processing Systems · January 1, 2024
We present ProtoViT, a method for interpretable image classification combining deep learning and case-based reasoning. This method classifies an image by comparing it to a set of learned prototypes, providing explanations of the form “this looks like that. ...
Conference · Advances in Neural Information Processing Systems · January 1, 2024
Many important datasets contain samples that are missing one or more feature values. Maintaining the interpretability of machine learning models in the presence of such missing data is challenging. Singly or multiply imputing missing values complicates the ...
Conference · Advances in Neural Information Processing Systems · January 1, 2024
Survival analysis is an important research topic with applications in healthcare, business, and manufacturing. One essential tool in this area is the Cox proportional hazards (CPH) model, which is widely used for its interpretability, flexibility, and pred ...
Journal Article · Advances in neural information processing systems · December 2023
We consider an important problem in scientific discovery, namely identifying sparse governing equations for nonlinear dynamical systems. This involves solving sparse ridge regression problems to provable optimality in order to determine which terms drive t ...
Journal Article · Advances in neural information processing systems · December 2023
The Rashomon set is the set of models that perform approximately equally well on a given dataset, and the Rashomon ratio is the fraction of all models in a given hypothesis space that are in the Rashomon set. Rashomon ratios are often large for tabular dat ...
Journal Article · Advances in neural information processing systems · December 2023
In real applications, interaction between machine learning models and domain experts is critical; however, the classical machine learning paradigm that usually produces only a single model does not facilitate such interaction. Approximating and exploring t ...
Journal Article · J Infect Dis · November 28, 2023
BACKGROUND: Human immunodeficiency virus (HIV) infection remains incurable due to the persistence of a viral reservoir despite antiretroviral therapy (ART). Cannabis (CB) use is prevalent amongst people with HIV (PWH), but the impact of CB on the latent HI ...
Journal Article · Proceedings of the National Academy of Sciences of the United States of America · October 2023
One of the most troubling trends in criminal investigations is the growing use of "black box" technology, in which law enforcement rely on artificial intelligence (AI) models or algorithms that are either too complex for people to understand or they simply ...
Conference · Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · August 4, 2023
The fast-growing demand for algorithmic music generation is found throughout entertainment, art, education, etc. Unfortunately, most recent models are practically impossible to interpret or musically fine-tune, as they use deep neural networks with thousan ...
Journal Article · The Lancet. Digital health · August 2023
Background: Epileptiform activity is associated with worse patient outcomes, including increased risk of disability and death. However, the effect of epileptiform activity on neurological outcome is confounded by the feedback between treatment with ...
Journal Article · Materials and Design · August 1, 2023
Advancements in additive manufacturing (AM) technology and three-dimensional (3D) modeling software have enabled the fabrication of parts with combinations of properties that were impossible to achieve with traditional manufacturing techniques. Porous desi ...
Journal Article · Data in brief · August 2023
Additive manufacturing has provided the ability to manufacture complex structures using a wide variety of materials and geometries. Structures such as triply periodic minimal surface (TPMS) lattices have been incorporated into products across many fields d ...
Journal Article · Nature communications · August 2023
Polymers are ubiquitous to almost every aspect of modern society and their use in medical products is similarly pervasive. Despite this, the diversity in commercial polymers used in medicine is stunningly low. Considerable time and resources have been exte ...
Conference · Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023 · June 27, 2023
Regression trees are one of the oldest forms of AI models, and their predictions can be made without a calculator, which makes them broadly useful, particularly for high-stakes applications. Within the large literature on regression trees, there has been l ...
Conference · Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence · June 2023
Regression trees are one of the oldest forms of AI models, and their predictions can be made without a calculator, which makes them broadly useful, particularly for high-stakes applications. Within the large literature on regression trees, there has been l ...
Journal Article · Journal of Quantitative Criminology · June 1, 2023
Objectives: We study interpretable recidivism prediction using machine learning (ML) models and analyze performance in terms of prediction ability, sparsity, and fairness. Unlike previous works, this study trains interpretable models that output probabilit ...
Journal Article · Journal of Machine Learning Research · January 1, 2023
We develop a method for understanding specific predictions made by (global) predictive models by constructing (local) models tailored to each specific observation (these are also called “explanations” in the literature). Unlike existing work that “explains ...
Conference · Progress in Biomedical Optics and Imaging Proceedings of SPIE · January 1, 2023
Tools for computer-aided diagnosis based on deep learning have become increasingly important in the medical field. Such tools can be useful, but require effective communication of their decision-making process in order to safely and meaningfully guide clin ...
Conference · Proceedings of Machine Learning Research · January 1, 2023
Our goal is to produce methods for observational causal inference that are auditable, easy to troubleshoot, accurate for treatment effect estimation, and scalable to high-dimensional data. We describe a general framework called Model-to-Match that achieves ...
Conference · Proceedings of the Annual Meeting of the Association for Computational Linguistics · January 1, 2023
We consider the automated generation of sonnets, a poetic form constrained according to meter, rhyme scheme, and length. Sonnets generally also use rhetorical figures, expressive language, and a consistent theme or narrative. Our constrained decoding appro ...
Conference · Proceedings of Machine Learning Research · January 1, 2023
Missing values are a fundamental problem in data science. Many datasets have missing values that must be properly handled because the way missing values are treated can have a large impact on the resulting machine learning model. In medical applications, the ...
Conference · Advances in Neural Information Processing Systems · January 1, 2023
We present ProtoConcepts, a method for interpretable image classification combining deep learning and case-based reasoning using prototypical parts. Existing work in prototype-based image classification uses a “this looks like that” reasoning process, whic ...
Conference · Advances in Neural Information Processing Systems · January 1, 2023
Quantifying variable importance is essential for answering high-stakes questions in fields like genetics, public policy, and medicine. Current methods generally calculate variable importance for a given model trained on a given dataset. However, for a give ...
Journal Article · AIS Transactions on Human-Computer Interaction · January 1, 2023
We typically think of artificial intelligence (AI) as focusing on empowering machines with human capabilities so that they can function on their own, but, in truth, much of AI focuses on intelligence augmentation (IA), which is to augment human capabilitie ...
Journal Article · Extreme Mechanics Letters · November 1, 2022
Machine learning models can assist with metamaterials design by approximating computationally expensive simulators or solving inverse design problems. However, past work has usually relied on black box deep neural networks, whose reasoning processes are op ...
Conference · CEUR workshop proceedings · October 2022
Sparse decision trees are one of the most common forms of interpretable models. While recent advances have produced algorithms that fully optimize sparse decision trees for prediction, that work does not address policy design, because the alg ...
Journal Article · Journal of Machine Learning Research · August 1, 2022
We introduce a flexible framework that produces high-quality almost-exact matches for causal inference. Most prior work in matching uses ad-hoc distance metrics, often leading to poor quality matches, particularly when there are irrelevant covariates. In t ...
Journal Article · Reproductive biomedicine online · July 2022
The last decade has seen an explosion of machine learning applications in healthcare, with mixed and sometimes harmful results despite much promise and associated hype. A significant reason for the reversal in the reported benefit of these applications is ...
Journal Article · Communications biology · July 2022
Dimension reduction (DR) algorithms project data from high dimensions to lower dimensions to enable visualization of interesting high-dimensional structure. DR algorithms are widely used for analysis of single-cell transcriptomic data. Despite widespread u ...
Conference · ACM International Conference Proceeding Series · June 21, 2022
It is almost always easier to find an accurate-but-complex model than an accurate-yet-simple model. Finding optimal, sparse, accurate models of various forms (linear models with integer coefficients, decision sets, rule lists, decision trees) is generally ...
Journal Article · INFORMS Journal on Computing · May 1, 2022
A key question in causal inference analyses is how to find subgroups with elevated treatment effects. This paper takes a machine learning approach and introduces a generative model, causal rule sets (CRS), for interpretable subgroup discovery. A CRS model ...
Journal Article · Proceedings of machine learning research · March 2022
We present fast classification techniques for sparse generalized linear and additive models. These techniques can handle thousands of features and thousands of observations in minutes, even in the presence of many highly correlated features. For fast spars ...
Journal Article · Decision Support Systems · January 1, 2022
Lending decisions are usually made with proprietary models that provide minimally acceptable explanations to users. In a future world without such secrecy, what decision support tools would one want to use for justified lending decisions? This question is ...
Journal Article · Statistics Surveys · January 1, 2022
Interpretability in machine learning (ML) is crucial for high stakes decisions and troubleshooting. In this work, we provide fundamental principles for interpretable ML, and dispel common misunderstandings that dilute the importance of this crucial topic. ...
Journal Article · Journal of machine learning research : JMLR · January 2022
Instrumental variables (IV) are widely used in the social and health sciences in situations where a researcher would like to measure a causal effect but cannot perform an experiment. For valid causal inference in an IV model, there must be external (exogen ...
Conference · Progress in Biomedical Optics and Imaging Proceedings of SPIE · January 1, 2022
There is increasing interest in using deep learning and computer vision to help guide clinical decisions, such as whether to order a biopsy based on a mammogram. Existing networks are typically black box, unable to explain how they make their predictions. ...
Journal Article · Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence · January 2022
Sparse decision tree optimization has been one of the most fundamental problems in AI since its inception and is a challenge at the core of interpretable machine learning. Sparse decision tree optimization is computationally hard, and despite steady effort ...
Conference · Proceedings - 2022 IEEE Visualization Conference - Short Papers, VIS 2022 · January 1, 2022
Given thousands of equally accurate machine learning (ML) models, how can users choose among them? A recent ML technique enables domain experts and data scientists to generate a complete Rashomon set for sparse decision trees, a huge set of almost-optimal i ...
Conference · Advances in Neural Information Processing Systems · January 1, 2022
Over the last century, risk scores have been the most popular form of predictive model used in healthcare and criminal justice. Risk scores are sparse linear models with integer coefficients; often these models can be memorized or placed on an index card. ...
Conference · Proceedings of the 38th Conference on Uncertainty in Artificial Intelligence, UAI 2022 · January 1, 2022
Off-policy Evaluation (OPE) methods are a crucial tool for evaluating policies in high-stakes domains such as healthcare, where exploration is often infeasible, unethical, or expensive. However, the extent to which such methods can be trusted under adversa ...
Conference · Advances in neural information processing systems · January 2022
In any given machine learning problem, there might be many models that explain the data almost equally well. However, most learning algorithms return only one of these models, leaving practitioners with no practical way to explore alternative models that m ...
Conference · Proceedings of Machine Learning Research · January 1, 2022
Off-policy Evaluation (OPE) methods are a crucial tool for evaluating policies in high-stakes domains such as healthcare, where exploration is often infeasible, unethical, or expensive. However, the extent to which such methods can be trusted under adversa ...
Journal Article · Physiological measurement · December 2021
Objective. Wearable devices equipped with photoplethysmography (PPG) sensors provided a low-cost, long-term solution to early diagnosis and continuous screening of heart conditions. However, PPG signals collected from such devices often suffer from corrup ...
Journal Article · Nature Machine Intelligence · December 1, 2021
Interpretability in machine learning models is important in high-stakes decisions such as whether to order a biopsy based on a mammographic exam. Mammography poses important challenges that are not present in other computer vision tasks: datasets are small ...
Journal Article · Management Science · October 1, 2021
Inference is the process of using facts we know to learn about facts we do not know. A theory of inference gives assumptions necessary to get from the former to the latter, along with a definition for and summary of the resulting uncertainty. Any one theor ...
Conference · AIES 2021: Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society · July 21, 2021
AI has the potential to revolutionize many areas of healthcare. Radiology, dermatology, and ophthalmology are some of the areas most likely to be impacted in the near future, and they have received significant attention from the broader research community. ...
Journal Article · Transactions of the Association for Computational Linguistics · July 8, 2021
Limerick generation exemplifies some of the most difficult challenges faced in poetry generation, as the poems must tell a story in only five lines, with constraints on rhyme, stress, and meter. To address these challenges, we introduce LimGen, a novel and ...
Journal Article · March 23, 2021
Interpretability in machine learning models is important in high-stakes decisions, such as whether to order a biopsy based on a mammographic exam. Mammography poses important challenges that are not present in other computer vision tasks: datasets are smal ...
Journal Article · January 5, 2021
dame-flame is a Python package for performing matching for observational causal inference on datasets containing discrete covariates. This package implements the Dynamic Almost Matching Exactly (DAME) and Fast Large-Scale Almost Matching Exactly (FLAME) al ...
Open Access
Journal Article · Journal of Machine Learning Research · January 1, 2021
A classical problem in causal inference is that of matching, where treatment units need to be matched to control units based on covariate information. In this work, we propose a method that computes high quality almost-exact matches for high-dimensional ca ...
Open Access
Journal Article · Journal of Machine Learning Research · January 1, 2021
In retail, there are predictable yet dramatic time-dependent patterns in customer behavior, such as periodic changes in the number of visitors, or increases in customers just before major holidays. The current paradigm of multi-armed bandit analysis does n ...
Journal Article · Journal of Artificial Intelligence Research · January 1, 2021
Although board games and video games have been studied for decades in artificial intelligence research, challenging word games remain relatively unexplored. Word games are not as constrained as games like chess or poker. Instead, word game strategy is defi ...
Journal Article · Journal of Machine Learning Research · January 1, 2021
Dimension reduction (DR) techniques such as t-SNE, UMAP, and TriMap have demonstrated impressive visualization performance on many real-world datasets. One tension that has always faced these methods is the trade-off between preservation of global structur ...
Journal Article · Human reproduction open · January 2021
Artificial intelligence (AI) techniques are starting to be used in IVF, in particular for selecting which embryos to transfer to the woman. AI has the potential to process complex data sets, to be better at identifying subtle but important patterns, and to ...
Journal Article · Nature Machine Intelligence · December 1, 2020
What does a neural network encode about a concept as we traverse through the layers? Interpretability in machine learning is undoubtedly important, but the calculations of neural networks are very challenging to understand. Attempts to see inside their hid ...
Journal Article · Nature Machine Intelligence · December 1, 2020
Variable importance is central to scientific studies, including the social sciences and causal inference, healthcare and other domains. However, current notions of variable importance are often tied to a specific predictive model. This is problematic: what ...
Journal Article · November 22, 2020
Single-particle cryo-electron microscopy (cryo-EM) is an emerging imaging modality capable of visualizing proteins and macro-molecular complexes at near-atomic resolution. The low electron doses used to prevent sample radiation damage result in images whe ...
Conference · FODS 2020: Proceedings of the 2020 ACM-IMS Foundations of Data Science Conference · October 19, 2020
Stochastic Lipschitz bandit algorithms balance exploration and exploitation, and have been used for a variety of important task domains. In this paper, we present a framework for Lipschitz bandit methods that adaptively learns partitions of context- and arm ...
Conference · Proceedings of Machine Learning Research · January 1, 2020
We propose a matching method that recovers direct treatment effects from randomized experiments where units are connected in an observed network, and units that share edges can potentially influence each other's outcomes. Traditional treatment effect estim ...
Conference · Proceedings of Machine Learning Research · January 1, 2020
We propose a matching method for observational data that matches units with others in unit-specific, hyper-box-shaped regions of the covariate space. These regions are large enough that many matches are created for each unit and small enough that the treat ...
Conference · Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · January 1, 2020
The primary aim of single-image super-resolution is to construct a high-resolution (HR) image from a corresponding low-resolution (LR) input. In previous approaches, which have generally been supervised, the training objective typically measures a pixel-wi ...
Conference · Proceedings of the Annual Meeting of the Association for Computational Linguistics · January 1, 2020
Understanding tone in Twitter posts will be increasingly important as more and more communication moves online. One of the most difficult, yet important, tones to detect is sarcasm. In the past, LSTM and transformer architecture models have been used to tac ...
Conference · 37th International Conference on Machine Learning, ICML 2020 · January 1, 2020
We study the bandit problem where the underlying expected reward is a Bounded Mean Oscillation (BMO) function. BMO functions are allowed to be discontinuous and unbounded, and are useful in modeling signals with infinities in the domain. We develop a tools ...
Conference · 37th International Conference on Machine Learning, ICML 2020 · January 1, 2020
Decision tree optimization is notoriously difficult from a computational perspective but essential for the field of interpretable machine learning. Despite efforts over the past 40 years, only recently have optimization breakthroughs been made that have al ...
Journal Article · Biostatistics (Oxford, England) · October 2019
In many clinical settings, a patient outcome takes the form of a scalar time series with a recovery curve shape, which is characterized by a sharp drop due to a disruptive event (e.g., surgery) and subsequent monotonic smooth rise towards an asymptotic lev ...
Journal Article · Journal of Machine Learning Research · June 1, 2019
Risk scores are simple classification models that let users make quick risk predictions by adding and subtracting a few small numbers. These models are widely used in medicine and criminal justice, but are difficult to learn from data because they need to ...
Journal Article · Nature machine intelligence · May 2019
Black box machine learning models are currently being used for high stakes decision-making throughout society, causing problems throughout healthcare, criminal justice, and in other domains. People have hoped that creating methods for explaining these blac ...
Journal Article · Proceedings of machine learning research · April 2019
Matching methods are heavily used in the social and health sciences due to their interpretability. We aim to create the highest possible quality of treatment-control matches for categorical data in the potential outcomes framework. The method proposed in t ...
Journal Article · Operations Research · January 1, 2019
We investigate the data-driven newsvendor problem when one has n observations of p features related to the demand as well as historical demand data. Rather than a two-step process of first estimating a demand distribution then optimizing for the optimal or ...
Journal Article · Journal of machine learning research : JMLR · January 2019
Variable importance (VI) tools describe how much covariates contribute to a prediction model's accuracy. However, important variables for one well-performing model (for example, a linear model f(x) = x^T β with a fixed coef ...
Conference · 35th Conference on Uncertainty in Artificial Intelligence, UAI 2019 · January 1, 2019
Uncertainty in the estimation of the causal effect in observational studies is often due to unmeasured confounding, i.e., the presence of unobserved covariates linking treatments and outcomes. Instrumental Variables (IV) are commonly used to reduce the eff ...
Conference · 35th Conference on Uncertainty in Artificial Intelligence, UAI 2019 · January 1, 2019
Mortal bandits have proven to be extremely useful for providing news article recommendations, running automated online advertising campaigns, and for other applications where the set of available options changes over time. Previous work on this problem sho ...
Conference · Advances in Neural Information Processing Systems · January 1, 2019
When we are faced with challenging image classification tasks, we often explain our reasoning by dissecting the image, and pointing out prototypical aspects of one class or another. The mounting evidence for each of the classes helps us make our final deci ...
Conference · Proceedings of Machine Learning Research · January 1, 2019
Uncertainty in the estimation of the causal effect in observational studies is often due to unmeasured confounding, i.e., the presence of unobserved covariates linking treatments and outcomes. Instrumental Variables (IV) are commonly used to reduce the eff ...
Conference · Proceedings of Machine Learning Research · January 1, 2019
Mortal bandits have proven to be extremely useful for providing news article recommendations, running automated online advertising campaigns, and for other applications where the set of available options changes over time. Previous work on this problem sho ...
Journal Article · Observational Studies · January 1, 2019
In the learning-to-match framework for causal inference, a parameterized distance metric is trained on a holdout train set so that the matching yields accurate estimated conditional average treatment effects. This way, the matching can be as accurate as ot ...
Conference · Proceedings of the AAAI Conference on Human Computation and Crowdsourcing · January 1, 2019
Vision models are interpretable when they classify objects on the basis of features that a person can directly understand. Recently, methods relying on visual feature prototypes have been developed for this purpose. However, in contrast to how humans categ ...
Conference · IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops · December 13, 2018
This paper reviews the 2nd NTIRE challenge on single image super-resolution (restoration of rich details in a low resolution image) with focus on proposed solutions and results. The challenge had 4 tracks. Track 1 employed the standard bicubic downscaling ...
Conference · IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops · December 13, 2018
This work identifies and addresses two important technical challenges in single-image super-resolution: (1) how to upsample an image without magnifying noise and (2) how to preserve large scale structure when upsampling. We summarize the techniques we deve ...
Journal Article · Mathematical Programming Computation · December 1, 2018
We introduce a mathematical programming approach to building rule lists, which are a type of interpretable, nonlinear, and logical machine learning classifier involving IF-THEN rules. Unlike traditional decision tree algorithms like CART and C5.0, this met ...
Journal Article · Journal of Machine Learning Research 23(240) (2022) 1-42 · November 18, 2018
We introduce a flexible framework that produces high-quality almost-exact matches for causal inference. Most prior work in matching uses ad-hoc distance metrics, often leading to poor quality matches, particularly when there are irrelevant covariates. In t ...
Journal Article · Interfaces · September 1, 2018
Questions of trust in machine-learning models are becoming increasingly important as these tools are starting to be used widely for high-stakes decisions in medicine and criminal justice. Transparency of models is a key aspect affecting trust. Th ...
Journal Article · June 18, 2018
We aim to create the highest possible quality of treatment-control matches for categorical data in the potential outcomes framework. Matching methods are heavily used in the social sciences due to their interpretability, but most matching methods do not pa ...
Journal Article · J Neurosci · February 14, 2018
With ever-increasing advancements in technology, neuroscientists are able to collect data in greater volumes and with finer resolution. The bottleneck in understanding how the brain works is consequently shifting away from the amount and type of data we ca ...
Journal Article · Journal of Machine Learning Research · January 1, 2018
We present the design and implementation of a custom discrete optimization technique for building rule lists over a categorical feature space. Our algorithm produces rule lists with optimal training performance, according to the regularized empirical risk, ...
Conference · 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 · January 1, 2018
Deep neural networks are widely used for classification. These deep models often suffer from a lack of interpretability - they are particularly difficult to understand because of their non-linear nature. As a result, neural networks are often treated as “b ...
Conference · International Conference on Artificial Intelligence and Statistics, AISTATS 2018 · January 1, 2018
Learning-to-rank techniques have proven to be extremely useful for prioritization problems, where we rank items in order of their estimated probabilities, and dedicate our limited resources to the top-ranked items. This work exposes a serious problem with ...
Conference · International Conference on Artificial Intelligence and Statistics, AISTATS 2018 · January 1, 2018
A falling rule list is a probabilistic decision list for binary classification, consisting of a series of if-then rules with antecedents in the if clauses and probabilities of the desired outcome (“1”) in the then clauses. Just as in a regular decision lis ...
Journal Article · JAMA neurology · December 2017
Importance: Continuous electroencephalography (EEG) use in critically ill patients is expanding. There is no validated method to combine risk factors and guide clinicians in assessing seizure risk. Objective: To use seizure risk factors from E ...
Conference · Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · August 13, 2017
Risk scores are simple classification models that let users quickly assess risk by adding, subtracting, and multiplying a few small numbers. Such models are widely used in healthcare and criminal justice, but are often built ad hoc. In this paper, we prese ...
Conference · Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · August 13, 2017
We present the design and implementation of a custom discrete optimization technique for building rule lists over a categorical feature space. Our algorithm provides the optimal solution, with a certificate of optimality. By leveraging algorithmic bounds, ...
Journal Article · Journal of Machine Learning Research · August 1, 2017
We present a machine learning algorithm for building classifiers that are comprised of a small number of short rules. These are restricted disjunctive normal form models. An example of a classifier of this form is as follows: If X satisfies (condition A AN ...
Journal Article · Journal of the Royal Statistical Society Series A Statistics in Society · June 1, 2017
We investigate a long-debated question, which is how to create predictive models of recidivism that are sufficiently accurate, transparent and interpretable to use for decision making. This question is complicated as these models are used to support differ ...
Journal Article · JAMA psychiatry · May 2017
Importance: Recognition that adult attention-deficit/hyperactivity disorder (ADHD) is common, seriously impairing, and usually undiagnosed has led to the development of adult ADHD screening scales for use in community, workplace, and primary care se ...
Conference · 34th International Conference on Machine Learning, ICML 2017 · January 1, 2017
We present an algorithm for building probabilistic rule lists that is two orders of magnitude faster than previous work. Rule list algorithms are competitors for decision tree algorithms. They are associative classifiers, in that they are built from pre-mi ...
Conference · Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017 · January 1, 2017
Decision makers, such as doctors and judges, make crucial decisions such as recommending treatments to patients, and granting bail to defendants on a daily basis. Such decisions typically involve weighing the potential benefits of taking an action against ...
Conference · Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · August 13, 2016
When an item goes out of stock, sales transaction data no longer reflect the original customer demand, since some customers leave with no purchase while others substitute alternative products for the one that was out of stock. Here we develop a Bayesian hi ...
Conference · Proceedings IEEE International Conference on Data Mining, ICDM · July 2, 2016
A Rule Set model consists of a small number of short rules for interpretable classification, where an instance is classified as positive if it satisfies at least one of the rules. The rule set provides reasons for predictions, and also descriptions of a pa ...
Journal Article · Chaos (Woodbury, N.Y.) · June 2016
Dynamical systems are frequently used to model biological systems. When these models are fit to data, it is necessary to ascertain the uncertainty in the model fit. Here, we present prediction deviation, a metric of uncertainty that determines the extent t ...
Journal Article · Journal of Machine Learning Research · June 1, 2016
We provide a hierarchical Bayesian model for estimating the effects of transient drug exposures on a collection of health outcomes, where the effects of all drugs on all outcomes are estimated simultaneously. The method possesses properties that allow it t ...
Journal Article · Machine Learning · March 1, 2016
The Clock Drawing Test—a simple pencil and paper test—has been used for more than 50 years as a screening tool to differentiate normal individuals from those with cognitive impairment, and has proven useful in helping to diagnose cognitive dysfunction asso ...
Journal Article · Machine Learning · March 1, 2016
Scoring systems are linear classification models that only require users to add, subtract and multiply a few small numbers in order to make a prediction. These models are in widespread use by the medical community, but are difficult to learn from data beca ...
Journal Article · Journal of clinical sleep medicine : JCSM : official publication of the American Academy of Sleep Medicine · February 2016
Study objective: Obstructive sleep apnea (OSA) is a treatable contributor to morbidity and mortality. However, most patients with OSA remain undiagnosed. We used a new machine learning method known as SLIM (Supersparse Linear Integer Models) to test ...
Journal Article · PloS one · January 2016
Type 1 interferons such as interferon-alpha (IFNα) inhibit replication of Human immunodeficiency virus (HIV-1) by upregulating the expression of genes that interfere with specific steps in the viral life cycle. This pathway thus represents a potential targ ...
Conference · Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, AISTATS 2016 · January 1, 2016
We present a hierarchical Bayesian framework for clustering with cluster-specific feature selection. We derive a simplified model, CRAFT, by analyzing the asymptotic behavior of the log posterior formulations in a nonparametric MAP-based clustering setting ...
Journal Article · Big Data · December 1, 2015
We present a Bayesian method for building scoring systems, which are linear models with coefficients that have very few significant digits. Usually the construction of scoring systems involves manual effort - humans invent the full scoring system without us ...
Journal Article · Annals of Applied Statistics · December 1, 2015
We present a new model for reliability analysis that is able to distinguish the latent internal vulnerability state of the equipment from the vulnerability caused by temporary external sources. Consider a wind farm where each turbine is running under the e ...
Journal Article · Machine Learning · September 17, 2015
In this paper, we consider a supervised learning setting where side knowledge is provided about the labels of unlabeled examples. The side knowledge has the effect of reducing the hypothesis space, leading to tighter generalization bounds, and thus possibl ...
Journal Article · Annals of Applied Statistics · September 1, 2015
We aim to produce predictive models that are not only accurate, but are also interpretable to human experts. Our models are decision lists, which consist of a series of if ... then ... statements (e.g., if high blood pressure, then stroke) that discretize a ...
Journal Article · Big Data · March 1, 2015
One of the most challenging problems facing crime analysts is that of identifying crime series, which are sets of crimes committed by the same individual or group. Detecting crime series can be an important step in predictive policing, as knowledge of a pa ...
Conference · Journal of Machine Learning Research · January 1, 2015
Falling rule lists are classification models consisting of an ordered list of if-then rules, where (i) the order of rules determines which example should be classified by each rule, and (ii) the estimated probability of success decreases monotonically down ...
Conference · Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics · January 1, 2015
Arguably, the main stumbling block in getting machine learning algorithms used in practice is the fact that people do not trust them. There could be many reasons for this, for instance, perhaps the models are not sparse or transparent, or perhaps the model ...
Journal Article · Annals of Applied Statistics · January 1, 2015
Reactive point processes (RPPs) are a new statistical model designed for predicting discrete events in time based on past history. RPPs were developed to handle an important problem within the domain of electrical grid reliability: short-term prediction of ...
Journal Article · Big Data · June 1, 2014
Our goal is to design a prediction and decision system for real-time use during a professional car race. In designing a knowledge discovery process for racing, we faced several challenges that were overcome only when domain knowledge of racing was carefull ...
Conference · SIAM International Conference on Data Mining 2014, SDM 2014 · January 1, 2014
This paper formalizes a latent variable inference problem we call supervised pattern discovery, the goal of which is to find sets of observations that belong to a single "pattern." We discuss two versions of the problem and prove uniform risk bounds for b ...
Conference · Advances in Neural Information Processing Systems · January 1, 2014
We present the Bayesian Case Model (BCM), a general framework for Bayesian case-based reasoning (CBR) and prototype classification and clustering. BCM brings the intuitive power of CBR to a Bayesian generative framework. The BCM learns prototypes, the "qui ...
Conference · Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · January 1, 2014
The vast majority of real world classification problems are imbalanced, meaning there are far fewer data from the class of interest (the positive class) than from other classes. We propose two machine learning algorithms to handle highly imbalanced classif ...
Conference · Machine Learning · January 1, 2014
We present a new application and covering number bound for the framework of "Machine Learning with Operational Costs (MLOC)," which is an exploratory form of decision theory. The MLOC framework incorporates knowledge about how a predictive model will be us ...
Journal Article · Data Mining and Knowledge Discovery · January 1, 2014
The problem of "approximating the crowd" is that of estimating the crowd's majority opinion by querying only a subset of it. Algorithms that approximate the crowd can intelligently stretch a limited budget for a crowdsourcing task. We present an algorithm, ...
Journal Article · Data Mining and Knowledge Discovery · January 1, 2014
Most people participate in meetings almost every day, multiple times a day. The study of meetings is important, but also challenging, as it requires an understanding of social signals and complex interpersonal dynamics. Our aim in this work is to use a dat ...
Journal Article · Interfaces · January 1, 2014
We summarize the first major effort to use analytics for preemptive maintenance and repair of an electrical distribution network. This is a large-scale multiyear effort between scientists and students at Columbia University and the Massachusetts Institute ...
Conference · Procedia Computer Science · January 1, 2014
Weather can cause problems for underground electrical grids by increasing the probability of serious "manhole events" such as fires and explosions. In this work, we compare a model that incorporates weather features associated with the dates of serious eve ...
Journal Article · Machine Learning · January 1, 2014
The special issue on "Machine Learning for Science and Society" showcases machine learning work with influence on our current and future society. These papers address several key problems such as how we perform repairs on critical infrastructure, how we pr ...
Conference · International Symposium on Artificial Intelligence and Mathematics, ISAIM 2014 · January 1, 2014
Our goal is to build robust optimization problems that make decisions about the future, and where complex data from the past are used to model uncertainty. In robust optimization (RO) generally, the goal is to create a policy for decision-making that is ro ...
Conference · International Symposium on Artificial Intelligence and Mathematics, ISAIM 2014 · January 1, 2014
This paper formalizes a latent variable inference problem we call supervised pattern discovery, the goal of which is to find sets of observations that belong to a single “pattern.” We discuss two versions of the problem and prove uniform risk bounds for bo ...
Conference · International Symposium on Artificial Intelligence and Mathematics, ISAIM 2014 · January 1, 2014
In this paper, we consider a supervised learning setting where side knowledge is provided about the labels of unlabeled examples. The side knowledge has the effect of reducing the hypothesis space, leading to tighter generalization bounds, and thus possibl ...
Journal Article · Data Mining and Knowledge Discovery · December 1, 2013
It is easy to find expert knowledge on the Internet on almost any topic, but obtaining a complete overview of a given topic is not always easy: information can be scattered across many sources and must be aggregated to be useful. We introduce a method for ...
Journal Article · Journal of Machine Learning Research · November 1, 2013
We present a theoretical analysis for prediction algorithms based on association rules. As part of this analysis, we introduce a problem for which rules are particularly natural, called "sequential event prediction." In sequential event prediction, events ...
Journal Article · Machine Learning · November 1, 2013
In sequential event prediction, we are given a "sequence database" of past event sequences to learn from, and we aim to predict the next event within a current event sequence. We focus on applications where the set of the past events has predictive power a ...
Conference · Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics · October 31, 2013
Our goal is to automatically detect patterns of crime. Among a large set of crimes that happen every year in a major city, it is challenging, time-consuming, and labor-intensive for crime analysts to determine which ones may have been committed by the same ...
Journal Article · Journal of Machine Learning Research · August 1, 2013
The AdaBoost algorithm was designed to combine many "weak" hypotheses that perform slightly better than random guessing into a "strong" hypothesis that has very low error. We study the rate at which AdaBoost iteratively converges to the minimum of the "exp ...
Journal Article · Journal of Machine Learning Research · June 1, 2013
This work proposes a way to align statistical modeling with decision making. We provide a method that propagates the uncertainty in predictive modeling to the uncertainty in operational cost, where operational cost is the amount spent by the practitioner i ...
Conference · AAAI Workshop Technical Report · January 1, 2013
Most people participate in meetings almost every day, multiple times a day. The study of meetings is important, but also challenging, as it requires an understanding of social signals and complex interpersonal dynamics. Our aim in this work is to use a data-d ...
Conference · AAAI Workshop Technical Report · January 1, 2013
We aim to produce predictive models that are not only accurate, but are also interpretable to human experts. We introduce a generative model called the Bayesian List Machine for fitting decision lists, a type of interpretable classifier, to data. We use th ...
Conference · AAAI Workshop Technical Report · January 1, 2013
Many crimes can happen every day in a major city, and figuring out which ones are committed by the same individual or group is an important and difficult data mining challenge. To do this, we propose a pattern detection algorithm called Series Finder, that ...
Conference · International Symposium on Artificial Intelligence and Mathematics, ISAIM 2012 · December 1, 2012
This work concerns the way that statistical models are used to make decisions. In particular, we aim to merge the way estimation algorithms are designed with how they are used for a subsequent task. Our methodology considers the operational cost of carryin ...
Conference · Advances in Neural Information Processing Systems · December 1, 2012
We aim to design classifiers that have the interpretability of association rules yet have predictive power on par with the top machine learning algorithms for classification. We propose a novel mixed integer optimization (MIO) approach called Ordered Rules ...
Conference · AAAI Fall Symposium Technical Report · December 1, 2012
In this paper, we present CrowdSense, an algorithm for estimating the crowd's majority opinion by querying only a subset of it. CrowdSense works in an online fashion where examples come one at a time and it dynamically samples subsets of labelers based on ...
Journal Article · Machine Learning · September 1, 2012
A good or bad product quality rating can make or break an organization. However, the notion of "quality" is often defined by an independent rating company that does not make the formula for determining the rank of a product publicly available. In order to ...
Journal Article · Annals of Applied Statistics · June 1, 2012
We propose a statistical modeling technique, called the Hierarchical Association Rule Model (HARM), that predicts a patient's possible future medical conditions given the patient's current and past history of reported conditions. The core of our technique ...
Journal Article · IEEE Transactions on Pattern Analysis and Machine Intelligence · January 1, 2012
Power companies can benefit from the use of knowledge discovery methods and statistical machine learning for preventive maintenance. We introduce a general process for transforming historical electrical grid data into models that aim to predict the risk of ...
Conference · Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics · October 31, 2011
The goal of the Machine Learning and Traveling Repairman Problem (ML&TRP) is to determine a route for a "repair crew," which repairs nodes on a graph. The repair crew aims to minimize the cost of failures at the nodes, but the failure probabilities are not ...
Journal Article · Journal of Machine Learning Research · October 1, 2011
We demonstrate that there are machine learning algorithms that can achieve success for two separate tasks simultaneously, namely the tasks of classification and bipartite ranking. This means that advantages gained from solving one task can be carried over ...
Conference · Proceedings of the 1st International Workshop on Data Mining for Service and Maintenance, KDD4Service 2011, Held in Conjunction with SIGKDD 11 · September 15, 2011
Ensuring reliability as the electrical grid morphs into the "smart grid" will require innovations in how we assess the state of the grid, for the purpose of proactive maintenance, rather than reactive maintenance; in the future, we will not only react to f ...
Conference · IEEE 2011 Energytech, Energytech 2011 · August 17, 2011
An important problem in reliability engineering is to predict the failure rate, that is, the frequency with which an engineered system or component fails. This paper presents a new method of estimating failure rate using a semiparametric model with Gaussia ...
Conference · Journal of Machine Learning Research · January 1, 2011
We consider a supervised learning problem in which data are revealed sequentially and the goal is to determine what will next be revealed. In the context of this problem, algorithms based on association rules have a distinct advantage over classical statis ...
Conference · Journal of Machine Learning Research · January 1, 2011
The AdaBoost algorithm of Freund and Schapire (1997) was designed to combine many "weak" hypotheses that perform slightly better than a random guess into a "strong" hypothesis that has very low error. We study the rate at which AdaBoost iteratively conver ...
Journal Article · Machine Learning · July 1, 2010
We present a knowledge discovery and data mining process developed as part of the Columbia/Con Edison project on manhole event prediction. This process can assist with real-world prioritization problems that involve raw data in the form of noisy documents ...
Conference · 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops 2009 · December 1, 2009
We present a new online boosting algorithm for updating the weights of a boosted classifier, which yields a closer approximation to the edges found by Freund and Schapire's AdaBoost algorithm than previous online boosting algorithms. We contribute a new wa ...
Conference · 8th International Conference on Machine Learning and Applications, ICMLA 2009 · December 1, 2009
We present a manhole profiling tool, developed as part of the Columbia/Con Edison machine learning project on manhole event prediction, and discuss its role in evaluating our machine learning model in three important ways: elimination of outliers, eliminat ...
Journal Article · Journal of Machine Learning Research · November 30, 2009
We study boosting algorithms for learning to rank. We give a general margin-based bound for ranking based on covering numbers for the hypothesis space. Our bound suggests that algorithms that maximize the ranking margin will generalize well. We then descri ...
Journal Article · Journal of Machine Learning Research · November 30, 2009
We are interested in supervised ranking algorithms that perform especially well near the top of the ranked list, and are only required to perform sufficiently well on the rest of the list. In this work, we provide a general form of convex objective that gi ...
Conference · Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics · July 21, 2009
This paper illustrates how a combination of information extraction, machine learning, and NLP corpus annotation practice was applied to a problem of ranking vulnerability of structures (service boxes, manholes) in the Manhattan electrical grid. By adapting ...
Conference · ACL 08 HLT, 46th Annual Meeting of the Association for Computational Linguistics, Human Language Technologies, Proceedings of the Conference · January 1, 2008
We investigate the tasks of general morphological tagging, diacritization, and lemmatization for Arabic. We show that for all tasks we consider, both modeling the lexeme explicitly, and retuning the weights of individual classifiers for the specific task, ...
Conference · Proceedings of the Annual Meeting of the Association for Computational Linguistics · January 1, 2008
We investigate the tasks of general morphological tagging, diacritization, and lemmatization for Arabic. We show that for all tasks we consider, both modeling the lexeme explicitly, and retuning the weights of individual classifiers for the specific task, ...
Journal Article · Annals of Statistics · December 1, 2007
We introduce a useful tool for analyzing boosting algorithms called the "smooth margin function," a differentiable approximation of the usual margin for boosting algorithms. We present two boosting algorithms based on this smooth margin, "coordinate ascent ...
Conference · Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics · January 1, 2006
We are interested in supervised ranking with the following twist: our goal is to design algorithms that perform especially well near the top of the ranked list, and are only required to perform sufficiently well on the rest of the list. Towards this goal, ...
Conference · HLT NAACL 2006, Computationally Hard Problems and Joint Inference in Speech and Language Processing, Proceedings of the Workshop · January 1, 2006
Integrating information from different stages of an NLP processing pipeline can yield significant error reduction. We demonstrate how re-ranking can improve name tagging in a Chinese information extraction system by incorporating information from relation ...
Conference · Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2005
We present several results related to ranking. We give a general margin-based bound for ranking based on the L∞ covering number of the hypothesis space. Our bound suggests that algorithms that maximize the ranking margin generalize well. We then describe a ...
Journal Article · Journal of Machine Learning Research · December 1, 2004
In order to study the convergence properties of the AdaBoost algorithm, we reduce AdaBoost to a nonlinear iterated map and study the evolution of its weight vectors. This dynamical systems approach allows us to understand AdaBoost's convergence properties ...
Journal Article · Lecture Notes in Artificial Intelligence Subseries of Lecture Notes in Computer Science · January 1, 2004
We study two boosting algorithms, Coordinate Ascent Boosting and Approximate Coordinate Ascent Boosting, which are explicitly designed to produce maximum margins. To derive these algorithms, we introduce a smooth approximation of the margin that one can ma ...
Conference · Advances in Neural Information Processing Systems · January 1, 2004
In order to understand AdaBoost's dynamics, especially its ability to maximize margins, we derive an associated simplified nonlinear iterated map and analyze its behavior in low-dimensional cases. We find stable cycles for these cases, which can explicitly ...