Ricardo Henao

Associate Professor of Biostatistics & Bioinformatics
Biostatistics & Bioinformatics, Division of Translational Biomedical
Duke Box 90984, Durham, NC 27710
140 Science Drive, Durham, NC 27710

Selected Publications


A Rapid Host Response Blood Test for Bacterial/Viral Infection Discrimination Using a Portable Molecular Diagnostic Platform

Journal Article Open Forum Infectious Diseases · December 11, 2024 Background: Difficulty discriminating bacterial versus viral etiologies of infection drives unwarranted antibacterial prescripti ...

A conditional multi-label model to improve prediction of a rare outcome: An illustration predicting autism diagnosis.

Journal Article J Biomed Inform · September 2024 OBJECTIVE: This study aimed to develop a novel approach using routinely collected electronic health records (EHRs) data to improve the prediction of a rare event. We illustrated this using an example of improving early prediction of an autism diagnosis, gi ...

Image Quality Assessment Using Convolutional Neural Network in Clinical Skin Images.

Journal Article JID Innov · July 2024 The image quality received for clinical evaluation is often suboptimal. The goal is to develop an image quality analysis tool to assess patient- and primary care physician-derived images using deep learning model. Dataset included patient- and primary care ...

Text Feature Adversarial Learning for Text Generation With Knowledge Transfer From GPT2.

Journal Article IEEE Trans Neural Netw Learn Syst · May 2024 Text generation is a key component of many natural language tasks. Motivated by the success of generative adversarial networks (GANs) for image generation, many text-specific GANs have been proposed. However, due to the discrete nature of text, these text ...

Translating ethical and quality principles for the effective, safe and fair development, deployment and use of artificial intelligence technologies in healthcare.

Journal Article J Am Med Inform Assoc · February 16, 2024 OBJECTIVE: The complexity and rapid pace of development of algorithmic technologies pose challenges for their regulation and oversight in healthcare settings. We sought to improve our institution's approach to evaluation and governance of algorithmic techn ...

Trans-Balance: Reducing demographic disparity for prediction models in the presence of class imbalance.

Journal Article J Biomed Inform · January 2024 INTRODUCTION: Risk prediction, including early disease detection, prevention, and intervention, is essential to precision medicine. However, systematic bias in risk estimation caused by heterogeneity across different demographic groups can lead to inapprop ...

Pathogen class-specific transcriptional responses derived from PBMCs accurately discriminate between fungal, bacterial, and viral infections.

Journal Article PLoS One · 2024 Immune responses during acute infection often contain canonical elements which are shared across the responses to an array of agents within a given pathogen class (i.e., respiratory viral infection). Identification of these shared, canonical elements acros ...

305. PBMC-Derived Transcriptomic Signatures Accurately Discriminate Between Viral, Bacterial, and Fungal Infections and can be Translated to Real-World Human Infections

Journal Article Open Forum Infectious Diseases · November 27, 2023 Background: Analysis of host gene expression patterns (‘signatures’) can provide diagnostic information to determine the etiolog ...

A Deep-Learning Algorithm to Predict Short-Term Progression to Geographic Atrophy on Spectral-Domain Optical Coherence Tomography.

Journal Article JAMA Ophthalmol · November 1, 2023 IMPORTANCE: The identification of patients at risk of progressing from intermediate age-related macular degeneration (iAMD) to geographic atrophy (GA) is essential for clinical trials aimed at preventing disease progression. DeepGAze is a fully automated a ...

Facilitating Harmonization of Variables in Framingham, MESA, ARIC, and REGARDS Studies Through a Metadata Repository.

Journal Article Circ Cardiovasc Qual Outcomes · November 2023 BACKGROUND: High-quality research in cardiovascular prevention, as in other fields, requires inclusion of a broad range of data sets from different sources. Integrating and harmonizing different data sources are essential to increase generalizability, samp ...

Machine learning functional impairment classification with electronic health record data.

Journal Article J Am Geriatr Soc · September 2023 BACKGROUND: Poor functional status is a key marker of morbidity, yet is not routinely captured in clinical encounters. We developed and evaluated the accuracy of a machine learning algorithm that leveraged electronic health record (EHR) data to provide a s ... Open Access

Deep-Learning-Based Screening and Ancillary Testing for Thyroid Cytopathology.

Journal Article Am J Pathol · September 2023 Thyroid cancer is the most common malignant endocrine tumor. The key test to assess preoperative risk of malignancy is cytologic evaluation of fine-needle aspiration biopsies (FNABs). The evaluation findings can often be indeterminate, leading to unnecessa ...

Neural Insights for Digital Marketing Content Design

Conference Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining · August 6, 2023 In digital marketing, experimenting with new website content is one of the key levers to improve customer engagement. However, creating successful marketing content is a manual and time-consuming process that lacks clear guiding principles. This paper seek ...

Learning Hierarchical Document Graphs From Multilevel Sentence Relations.

Journal Article IEEE Trans Neural Netw Learn Syst · August 2023 Organizing the implicit topology of a document as a graph, and further performing feature extraction via the graph convolutional network (GCN), has proven effective in document analysis. However, existing document graphs are often restricted to expressing ...

Serum Metabolites Are Associated With HFpEF in Biopsy-Proven Nonalcoholic Fatty Liver Disease.

Journal Article J Am Heart Assoc · July 18, 2023 Background: Nonalcoholic fatty liver disease (NAFLD) and heart failure with preserved ejection fraction (HFpEF) share common risk factors, including obesity and diabetes. They are also thought to be mechanistically linked. The aim of this study was to defin ...

Development and validation of a REcurrent Liver cAncer Prediction ScorE (RELAPSE) following liver transplantation in patients with hepatocellular carcinoma: Analysis of the US Multicenter HCC Transplant Consortium.

Journal Article Liver Transpl · July 1, 2023 HCC recurrence following liver transplantation (LT) is highly morbid and occurs despite strict patient selection criteria. Individualized prediction of post-LT HCC recurrence risk remains an important need. Clinico-radiologic and pathologic data of 4981 pa ...

Few-Shot Composition Learning for Image Retrieval with Prompt Tuning

Journal Article Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023 · June 27, 2023 We study the problem of composition learning for image retrieval, for which we learn to retrieve target images with search queries in the form of a composition of a reference image and a modification text that describes desired modifications of the image. ...

Thyroid Cytopathology Cancer Diagnosis from Smartphone Images Using Machine Learning.

Journal Article Mod Pathol · June 2023 We examined the performance of deep learning models on the classification of thyroid fine-needle aspiration biopsies using microscope images captured in 2 ways: with a high-resolution scanner and with a mobile phone camera. Our training set consisted of im ...

Calibration and Uncertainty in Neural Time-to-Event Modeling.

Journal Article IEEE Trans Neural Netw Learn Syst · April 2023 Models for predicting the time of a future event are crucial for risk assessment, across a diverse range of applications. Existing time-to-event (survival) models have focused primarily on preserving pairwise ordering of estimated event times (i.e., relati ...

Deep Learning in Dermatology: A Systematic Review of Current Approaches, Outcomes, and Limitations.

Journal Article JID Innov · January 2023 Artificial intelligence (AI) has recently made great advances in image classification and malignancy prediction in the field of dermatology. However, understanding the applicability of AI in clinical dermatology practice remains challenging owing to the va ...

Pushing the Efficiency Limit Using Structured Sparse Convolutions

Conference Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023 · January 1, 2023 Weight pruning is among the most popular approaches for compressing deep convolutional neural networks. Recent work suggests that in a randomly initialized deep neural network, there exist sparse subnetworks that achieve performance comparable to the origi ...

Federated Domain Adaptation for Named Entity Recognition via Distilling with Heterogeneous Tag Sets

Conference Proceedings of the Annual Meeting of the Association for Computational Linguistics · January 1, 2023 Federated learning involves collaborative training with private data from multiple platforms, while not violating data privacy. We study the problem of federated domain adaptation for Named Entity Recognition (NER), where we seek to transfer knowledge acro ...

Multidimensional machine learning models predicting outcomes after trauma.

Journal Article Surgery · December 2022 BACKGROUND: An emerging body of literature supports the role of individualized prognostic tools to guide the management of patients after trauma. The aim of this study was to develop advanced modeling tools from multidimensional data sources, including imm ...

Collaborative Anomaly Detection

Preprint · September 20, 2022

TAMC: A deep-learning approach to predict motif-centric transcriptional factor binding activity based on ATAC-seq profile.

Journal Article PLoS Comput Biol · September 2022 Determining transcriptional factor binding sites (TFBSs) is critical for understanding the molecular mechanisms regulating gene expression in different biological conditions. Biological assays designed to directly mapping TFBSs require large sample size an ...

Use of Machine Learning-Based Software for the Screening of Thyroid Cytopathology Whole Slide Images.

Journal Article Arch Pathol Lab Med · July 1, 2022 CONTEXT: The use of whole slide images (WSIs) in diagnostic pathology presents special challenges for the cytopathologist. Informative areas on a direct smear from a thyroid fine-needle aspiration biopsy (FNAB) smear may be spread across a large area com ...

A Multidimensional Bioinformatic Platform for the Study of Human Response to Surgery.

Journal Article Ann Surg · June 1, 2022 OBJECTIVE: To design and establish a prospective biospecimen repository that integrates multi-omics assays with clinical data to study mechanisms of controlled injury and healing. BACKGROUND: Elective surgery is an opportunity to understand both the system ...

Current Landscape of Generative Adversarial Networks for Facial Deidentification in Dermatology: Systematic Review and Evaluation.

Journal Article JMIR Dermatol · May 27, 2022 BACKGROUND: Deidentifying facial images is critical for protecting patient anonymity in the era of increasing tools for automatic image analysis in dermatology. OBJECTIVE: The aim of this paper was to review the current literature in the field of automatic ...

GENESIS: Gene-Specific Machine Learning Models for Variants of Uncertain Significance Found in Catecholaminergic Polymorphic Ventricular Tachycardia and Long QT Syndrome-Associated Genes.

Journal Article Circ Arrhythm Electrophysiol · April 2022 BACKGROUND: Cardiac channelopathies such as catecholaminergic polymorphic tachycardia and long QT syndrome predispose patients to fatal arrhythmias and sudden cardiac death. As genetic testing has become common in clinical practice, variants of uncertain s ...

Convolutional neural network to identify symptomatic Alzheimer's disease using multimodal retinal imaging.

Journal Article Br J Ophthalmol · March 2022 BACKGROUND/AIMS: To develop a convolutional neural network (CNN) to detect symptomatic Alzheimer's disease (AD) using a combination of multimodal retinal images and patient data. METHODS: Colour maps of ganglion cell-inner plexiform layer (GC-IPL) thicknes ... Open Access

Privacy Protection With Facial Deidentification Machine Learning Methods: Can Current Methods Be Applied to Dermatology?

Journal Article Iproceedings · December 17, 2021 Background: In the era of increasing tools for automatic image analysis in dermatology, new machine learning models require high-quality image data sets. Facial image data are needed for de ...

Privacy Protection With Facial Deidentification Machine Learning Methods: Can Current Methods Be Applied to Dermatology? (Preprint)

Journal Article · December 3, 2021 BACKGROUND: In the era of increasing tools for automatic image analysis in dermatology, new machine learning models require high-quality image data sets. Facial image data are needed for de ...

The importance of weight stabilization amongst those with overweight or obesity: Results from a large health care system.

Journal Article Prev Med Rep · December 2021 Data on patterns of weight change among adults with overweight or obesity are minimal. We aimed to examine patterns of weight change and associated hospitalizations in a large health system, and to develop a model to predict 2-year significant weight gain. ...

Discriminating Bacterial and Viral Infection Using a Rapid Host Gene Expression Test.

Journal Article Crit Care Med · October 1, 2021 OBJECTIVES: Host gene expression signatures discriminate bacterial and viral infection but have not been translated to a clinical test platform. This study enrolled an independent cohort of patients to describe and validate a first-in-class host response b ... Open Access

Evaluation of an RNAseq-Based Immunogenomic Liquid Biopsy Approach in Early-Stage Prostate Cancer.

Journal Article Cells · September 28, 2021 The primary objective of this study is to detect biomarkers and develop models that enable the identification of clinically significant prostate cancer and to understand the biologic implications of the genes involved. Peripheral blood samples (1018 patien ...

Antibody signatures of asymptomatic Plasmodium falciparum malaria infections measured from dried blood spots.

Journal Article Malar J · September 23, 2021 BACKGROUND: Screening malaria-specific antibody responses on protein microarrays can help identify immune factors that mediate protection against malaria infection, disease, and transmission, as well as markers of past exposure to both malaria parasites an ...

Glycemic Control Predicts Severity of Hepatocyte Ballooning and Hepatic Fibrosis in Nonalcoholic Fatty Liver Disease.

Journal Article Hepatology · September 2021 BACKGROUND AND AIMS: Whether glycemic control, as opposed to diabetes status, is associated with the severity of NAFLD is open for study. We aimed to evaluate whether degree of glycemic control in the years preceding liver biopsy predicts the histological ... Open Access

Assessment of the Feasibility of Using Noninvasive Wearable Biometric Monitoring Sensors to Detect Influenza and the Common Cold Before Symptom Onset.

Journal Article JAMA Netw Open · September 1, 2021 IMPORTANCE: Currently, there are no presymptomatic screening methods to identify individuals infected with a respiratory virus to prevent disease spread and to predict their trajectory for resource allocation. OBJECTIVE: To evaluate the feasibility of usin ... Open Access

Validation of a Host Gene Expression Test for Bacterial/Viral Discrimination in Immunocompromised Hosts.

Journal Article Clin Infect Dis · August 16, 2021 BACKGROUND: Host gene expression has emerged as a complementary strategy to pathogen detection tests for the discrimination of bacterial and viral infection. The impact of immunocompromise on host-response tests remains unknown. We evaluated a host-respons ... Open Access

Adaptive Multi-Channel Event Segmentation and Feature Extraction for Monitoring Health Outcomes.

Journal Article IEEE Trans Biomed Eng · August 2021 OBJECTIVE: To develop a multi-channel device event segmentation and feature extraction algorithm that is robust to changes in data distribution. METHODS: We introduce an adaptive transfer learning algorithm to classify and segment events from non-stationar ...

CPT to RVU conversion improves model performance in the prediction of surgical case length.

Journal Article Sci Rep · July 8, 2021 Methods used to predict surgical case time often rely upon the current procedural terminology (CPT) code as a nominal variable to train machine-learned models, however this limits the ability of the model to incorporate new procedures and adds complexity a ... Open Access

The host transcriptional response to Candidemia is dominated by neutrophil activation and heme biosynthesis and supports novel diagnostic approaches.

Journal Article Genome Med · July 5, 2021 BACKGROUND: Candidemia is one of the most common nosocomial bloodstream infections in the United States, causing significant morbidity and mortality in hospitalized patients, but the breadth of the host response to Candida infections in human patients rema ... Open Access

Gradient Importance Learning for Incomplete Observations

Journal Article · July 5, 2021 Though recent works have developed methods that can generate estimates (or imputations) of the missing entries in a dataset to facilitate downstream analysis, most depend on assumptions that may not align with real-world applications and could suffer from ...

Efficient Classification of Very Large Images with Tiny Objects

Journal Article · June 4, 2021 An increasing number of applications in computer vision, specially, in medical imaging and remote sensing, become challenging when the goal is to classify very large images with tiny informative objects. Specifically, these classification tasks face two ke ...

Variational Disentanglement for Rare Event Modeling.

Journal Article Proc AAAI Conf Artif Intell · May 18, 2021 Combining the increasing availability and abundance of healthcare data and the current advances in machine learning methods have created renewed opportunities to improve clinical decision support systems. However, in healthcare risk prediction applications ...

An atlas connecting shared genetic architecture of human diseases and molecular phenotypes provides insight into COVID-19 susceptibility.

Journal Article Genome Med · May 17, 2021 BACKGROUND: While genome-wide associations studies (GWAS) have successfully elucidated the genetic architecture of complex human traits and diseases, understanding mechanisms that lead from genetic variation to pathophysiology remains an important challeng ... Open Access

Enabling counterfactual survival analysis with balanced representations

Conference ACM CHIL 2021 - Proceedings of the 2021 ACM Conference on Health, Inference, and Learning · April 8, 2021 Balanced representation learning methods have been applied successfully to counterfactual inference from observational data. However, approaches that account for survival outcomes are relatively limited. Survival data are frequently encountered across dive ...

Affinitention nets: Kernel perspective on attention architectures for set classification with applications to medical text and images

Conference ACM CHIL 2021 - Proceedings of the 2021 ACM Conference on Health, Inference, and Learning · April 8, 2021 Set classification is the task of predicting a single label from a set comprising multiple instances. The examples we consider are pathology slides represented by sets of patches and medical text data represented by sets of word embeddings. State-of-the-ar ...

Malignancy Prediction and Lesion Identification from Clinical Dermatological Images

Journal Article · April 2, 2021 We consider machine-learning-based malignancy prediction and lesion identification from clinical dermatological images, which can be indistinctly acquired via smartphone or dermoscopy capture. Additionally, we do not assume that images contain single lesio ... Open Access

Serum Bile Acid, Vitamin E, and Serotonin Metabolites Are Associated With Future Liver-Related Events in Nonalcoholic Fatty Liver Disease.

Journal Article Hepatol Commun · April 2021 Identifying patients at higher risk for poor outcomes from nonalcoholic fatty liver disease (NAFLD) remains challenging. Metabolomics, the comprehensive measurement of small molecules in biological samples, has the potential to reveal novel noninvasive bio ...

A blood-based host gene expression assay for early detection of respiratory viral infection: an index-cluster prospective cohort study.

Journal Article The Lancet. Infectious diseases · March 2021 Background: Early and accurate identification of individuals with viral infections is crucial for clinical management and public health interventions. We aimed to assess the ability of transcriptomic biomarkers to identify naturally acquired respira ... Open Access

Dysregulated transcriptional responses to SARS-CoV-2 in the periphery.

Journal Article Nat Commun · February 17, 2021 SARS-CoV-2 infection has been shown to trigger a wide spectrum of immune responses and clinical manifestations in human hosts. Here, we sought to elucidate novel aspects of the host response to SARS-CoV-2 infection through RNA sequencing of peripheral bloo ... Open Access

Weakly supervised instance learning for thyroid malignancy prediction from whole slide cytopathology images.

Journal Article Medical image analysis · January 2021 We consider machine-learning-based thyroid-malignancy prediction from cytopathology whole-slide images (WSI). Multiple instance learning (MIL) approaches, typically used for the analysis of WSIs, divide the image (bag) into patches (instances), which are u ... Open Access

Machine-learning-based multiple abnormality prediction with large-scale chest computed tomography volumes.

Journal Article Medical image analysis · January 2021 Machine learning models for radiology benefit from large-scale data sets with high quality labels for abnormalities. We curated and analyzed a chest computed tomography (CT) data set of 36,316 volumes from 19,993 unique patients. This is the largest multip ...

Impact of the COVID-19 pandemic on patterns of outpatient cardiovascular care.

Journal Article Am Heart J · January 2021 BACKGROUND: The coronavirus disease 2019 (COVID-19) pandemic brought about abrupt changes in the way health care is delivered, and the impact of transitioning outpatient clinic visits to telehealth visits on processes of care and outcomes is unclear. METHO ...

Counterfactual Representation Learning with Balancing Weights

Journal Article Proceedings of Machine Learning Research · January 1, 2021 A key to causal inference with observational data is achieving balance in predictive features associated with each treatment type. Recent literature has explored representation learning to achieve this goal. In this work, we discuss the pitfalls of these s ...

Wasserstein contrastive representation distillation

Journal Article Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · January 1, 2021 The primary goal of knowledge distillation (KD) is to encapsulate the information of a model learned from a teacher network into a student network, with the latter being more compact than the former. Existing work, e.g., using Kullback-Leibler divergence f ...

Machine Learning Prediction of Surgical Intervention for Small Bowel Obstruction

Journal Article · 2021 ABSTRACT: Small bowel obstruction (SBO) results in >350,000 operations and >$2 billion annual health care expenditures in the US. Prompt, effective identification of patients at high/low surgery risk could improve survival, lower complication rates ...

SpanPredict: Extraction of Predictive Document Spans with Neural Attention

Conference NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference · January 1, 2021 In many natural language processing applications, identifying predictive text can be as important as the predictions themselves. When predicting medical diagnoses, for example, identifying predictive content in clinical notes not only enhances interpretabi ...

A comparison of host response strategies to distinguish bacterial and viral infection.

Journal Article PLoS One · 2021 OBJECTIVES: Compare three host response strategies to distinguish bacterial and viral etiologies of acute respiratory illness (ARI). METHODS: In this observational cohort study, procalcitonin, a 3-protein panel (CRP, IP-10, TRAIL), and a host gene expressi ... Open Access

An atlas connecting shared genetic architecture of human diseases and molecular phenotypes provides insight into COVID-19 susceptibility.

Journal Article medRxiv · December 22, 2020 While genome-wide associations studies (GWAS) have successfully elucidated the genetic architecture of complex human traits and diseases, understanding mechanisms that lead from genetic variation to pathophysiology remains an important challenge. Methods a ...

Chromatin remodeling in peripheral blood cells reflects COVID-19 symptom severity.

Journal Article bioRxiv · December 9, 2020 SARS-CoV-2 infection triggers highly variable host responses and causes varying degrees of illness in humans. We sought to harness the peripheral blood mononuclear cell (PBMC) response over the course of illness to provide insight into COVID-19 physiology. ... Open Access

Variational Disentanglement for Rare Event Modeling.

Conference ArXiv · September 17, 2020 Combining the increasing availability and abundance of healthcare data and the current advances in machine learning methods have created renewed opportunities to improve clinical decision support systems. However, in healthcare risk prediction applications ...

Identification of Undetected Monogenic Cardiovascular Disorders.

Journal Article J Am Coll Cardiol · August 18, 2020 BACKGROUND: Monogenic diseases are individually rare but collectively common, and are likely underdiagnosed. OBJECTIVES: The purpose of this study was to estimate the prevalence of monogenic cardiovascular diseases (MCVDs) and potentially missed diagnoses ...

Average Weighted Accuracy: Pragmatic Analysis for a Rapid Diagnostics in Categorizing Acute Lung Infections (RADICAL) Study.

Journal Article Clin Infect Dis · June 10, 2020 Patient management relies on diagnostic information to identify appropriate treatment. Standard evaluations of diagnostic tests consist of estimating sensitivity, specificity, positive/negative predictive values, likelihood ratios, and accuracy. Although u ... Open Access

Previously Derived Host Gene Expression Classifiers Identify Bacterial and Viral Etiologies of Acute Febrile Respiratory Illness in a South Asian Population.

Journal Article Open Forum Infect Dis · June 2020 BACKGROUND: Pathogen-based diagnostics for acute respiratory infection (ARI) have limited ability to detect etiology of illness. We previously showed that peripheral blood-based host gene expression classifiers accurately identify bacterial and viral ARI i ... Open Access

Neural Conditional Event Time Models

Journal Article · April 3, 2020 Event time models predict occurrence times of an event of interest based on known features. Recent work has demonstrated that neural networks achieve state-of-the-art event time predictions in a variety of settings. However, standard event time models supp ...

Application of a machine learning algorithm to predict malignancy in thyroid cytopathology.

Journal Article Cancer Cytopathol · April 2020 BACKGROUND: The Bethesda System for Reporting Thyroid Cytopathology (TBSRTC) comprises 6 categories used for the diagnosis of thyroid fine-needle aspiration biopsy (FNAB). Each category has an associated risk of malignancy, which is important in the manage ...

Variational Learning of Individual Survival Distributions.

Journal Article Proc ACM Conf Health Inference Learn (2020) · April 2020 The abundance of modern health data provides many opportunities for the use of machine learning techniques to build better statistical models to improve clinical decision making. Predicting time-to-event distributions, also known as survival analysis, play ...

Survival cluster analysis

Journal Article ACM CHIL 2020 - Proceedings of the 2020 ACM Conference on Health, Inference, and Learning · February 4, 2020 Conventional survival analysis approaches estimate risk scores or individualized time-to-event distributions conditioned on covariates. In practice, there is often great population-level phenotypic heterogeneity, resulting from (unknown) subpopulations wit ...

Increased Glutaminolysis Marks Active Scarring in Nonalcoholic Steatohepatitis Progression.

Journal Article Cell Mol Gastroenterol Hepatol · 2020 BACKGROUND & AIMS: Nonalcoholic steatohepatitis (NASH) occurs in the context of aberrant metabolism. Glutaminolysis is required for metabolic reprograming of hepatic stellate cells (HSCs) and liver fibrogenesis in mice. However, it is unclear how changes i ... Open Access

Learning autoencoders with relational regularization

Journal Article 37th International Conference on Machine Learning, ICML 2020 · January 1, 2020 A new algorithmic framework is proposed for learning autoencoders of data distributions. We minimize the discrepancy between the model and target distributions, with a relational regularization on the learnable latent prior. This regularization penalizes t ...

Survival cluster analysis.

Conference CHIL · 2020

Sequence generation with optimal-transport-enhanced reinforcement learning

Conference AAAI 2020 - 34th AAAI Conference on Artificial Intelligence · January 1, 2020 Reinforcement learning (RL) has been widely used to aid training in language generation. This is achieved by enhancing standard maximum likelihood objectives with user-specified reward functions that encourage global semantic consistency. We propose a prin ...

Students Need More Attention: BERT-based Attention Model for Small Data with Application to Automatic Patient Message Triage

Journal Article Proceedings of Machine Learning Research · January 1, 2020 Small and imbalanced datasets commonly seen in healthcare represent a challenge when training classifiers based on deep learning models. So motivated, we propose a novel framework based on BioBERT (Bidirectional Encoder Representations from Transformers fo ... Cite

Weakly supervised cross-domain alignment with optimal transport

Journal Article 31st British Machine Vision Conference, BMVC 2020 · January 1, 2020 Cross-domain alignment between image objects and text sequences is key to many visual-language tasks, and it poses a fundamental challenge to both computer vision and natural language processing. This paper investigates a novel approach for the identificat ... Cite

Integrating task specific information into pretrained language models for low resource fine tuning

Conference Findings of the Association for Computational Linguistics Findings of ACL: EMNLP 2020 · January 1, 2020 Pretrained Language Models (PLMs) have improved the performance of natural language understanding in recent years. Such models are pretrained on large corpora, which encode the general prior knowledge of natural languages but are agnostic to information ch ... Cite

Neural Conditional Event Time Models

Conference Proceedings of Machine Learning Research · January 1, 2020 Event time models predict occurrence times of an event of interest based on known features. Recent work has demonstrated that neural networks achieve state-of-the-art event time predictions in biomedical applications, where event time models are frequently ... Cite

Hierarchical infinite factor models for improving the prediction of surgical complications for geriatric patients

Journal Article Annals of Applied Statistics · December 1, 2019 Nearly a third of all surgeries performed in the United States occur for patients over the age of 65; these older adults experience a higher rate of postoperative morbidity and mortality. To improve the care for these patients, we aim to identify and chara ... Full text Cite

Validation of a host response test to distinguish bacterial and viral respiratory infection.

Journal Article EBioMedicine · October 2019 BACKGROUND: Distinguishing bacterial and viral respiratory infections is challenging. Novel diagnostics based on differential host gene expression patterns are promising but have not been translated to a clinical platform nor extensively tested. Here, we v ... Full text Open Access Link to item Cite

Identifying Smoking Environments From Images of Daily Life With Deep Learning.

Journal Article JAMA Netw Open · August 2, 2019 IMPORTANCE: Environments associated with smoking increase a smoker's craving to smoke and may provoke lapses during a quit attempt. Identifying smoking risk environments from images of a smoker's daily life provides a basis for environment-based interventi ... Full text Link to item Cite

Thyroid Cancer Malignancy Prediction From Whole Slide Cytopathology Images

Journal Article Proceedings of Machine Learning Research, 2019, Vol. 106 · March 29, 2019 We consider preoperative prediction of thyroid cancer based on ultra-high-resolution whole-slide cytopathology images. Inspired by how human experts perform diagnosis, our approach first identifies and classifies diagnostic image regions containing informa ... Open Access Link to item Cite

Pilot study of myocardial ischemia-induced metabolomic changes in emergency department patients undergoing stress testing.

Journal Article PLoS One · 2019 BACKGROUND: The heart is a metabolically active organ, and plasma acylcarnitines are associated with long-term risk for myocardial infarction. We hypothesized that myocardial ischemia from cardiac stress testing will produce dynamic changes in acylcarnitin ... Full text Open Access Link to item Cite

A host gene expression approach for identifying triggers of asthma exacerbations.

Journal Article PLoS One · 2019 RATIONALE: Asthma exacerbations often occur due to infectious triggers, but determining whether infection is present and whether it is bacterial or viral remains clinically challenging. A diagnostic strategy that clarifies these uncertainties could enable ... Full text Open Access Link to item Cite

Combining deep learning methods and human knowledge to identify abnormalities in computed tomography (CT) reports

Conference Progress in Biomedical Optics and Imaging - Proceedings of SPIE · January 1, 2019 Many researchers in the field of machine learning have addressed the problem of detecting anomalies within Computed Tomography (CT) scans. Training these machine learning algorithms requires a dataset of CT scans with identified anomalies (labels), usually ... Full text Cite

Classifying abnormalities in computed tomography radiology reports with rule-based and natural language processing models

Conference Progress in Biomedical Optics and Imaging - Proceedings of SPIE · January 1, 2019 Purpose: When conducting machine learning algorithms on classification and detection of abnormalities for medical imaging, many researchers are faced with the problem that it is hard to get enough labeled data. This is especially difficult for modalities s ... Full text Cite

Deep learning of 3D computed tomography (CT) images for organ segmentation using 2D multi-channel SegNet model

Conference Progress in Biomedical Optics and Imaging - Proceedings of SPIE · January 1, 2019 Purpose: To accurately segment organs from 3D CT image volumes using a 2D, multi-channel SegNet model consisting of a deep Convolutional Neural Network (CNN) encoder-decoder architecture. Method: We trained a SegNet model on the extended cardiac-Torso (XCAT) ... Full text Cite

Communication-Efficient stochastic gradient MCMC for neural networks

Conference 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019 · January 1, 2019 Learning probability distributions on the weights of neural networks has recently proven beneficial in many applications. Bayesian methods such as Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) offer an elegant framework to reason about model uncer ... Cite

Improving textual network learning with variational homophilic embeddings

Journal Article Advances in Neural Information Processing Systems · January 1, 2019 The performance of many network learning applications crucially hinges on the success of network embedding algorithms, which aim to encode rich network information into low-dimensional vertex-based vector representations. This paper considers a novel varia ... Cite

Thyroid Cancer Malignancy Prediction From Whole Slide Cytopathology Images

Journal Article Proceedings of Machine Learning Research · January 1, 2019 We consider preoperative prediction of thyroid cancer based on ultra-high-resolution whole-slide cytopathology images. Inspired by how human experts perform diagnosis, our approach first identifies and classifies diagnostic image regions containing informa ... Cite

Kernel-based approaches for sequence modeling: Connections to neural methods

Journal Article Advances in Neural Information Processing Systems · January 1, 2019 We investigate time-dependent data analysis from the perspective of recurrent kernel machines, from which models with hidden units and gated memory cells arise naturally. By considering dynamic gating of the memory cell, a model closely related to the long ... Cite

Serum Interleukin-8, Osteopontin, and Monocyte Chemoattractant Protein 1 Are Associated With Hepatic Fibrosis in Patients With Nonalcoholic Fatty Liver Disease.

Journal Article Hepatol Commun · November 2018 The severity of hepatic fibrosis is the primary predictor of liver-related morbidity and mortality in patients with nonalcoholic fatty liver disease (NAFLD). Unfortunately, noninvasive serum biomarkers for NAFLD-associated fibrosis are limited. We analyzed ... Full text Link to item Cite

A crowdsourced analysis to identify ab initio molecular signatures predictive of susceptibility to viral infection.

Journal Article Nat Commun · October 24, 2018 The response to respiratory viruses varies substantially between individuals, and there are currently no known molecular predictors from the early stages of infection. Here we conduct a community-based analysis to determine whether pre- or early post-expos ... Full text Open Access Link to item Cite

RAB11FIP5 Expression and Altered Natural Killer Cell Function Are Associated with Induction of HIV Broadly Neutralizing Antibody Responses.

Journal Article Cell · October 4, 2018 HIV-1 broadly neutralizing antibodies (bnAbs) are difficult to induce with vaccines but are generated in ∼50% of HIV-1-infected individuals. Understanding the molecular mechanisms of host control of bnAb induction is critical to vaccine design. Here, we pe ... Full text Link to item Cite

Unsupervised Analysis of Transcriptomics in Bacterial Sepsis Across Multiple Datasets Reveals Three Robust Clusters.

Journal Article Crit Care Med · June 2018 OBJECTIVES: To find and validate generalizable sepsis subtypes using data-driven clustering. DESIGN: We used advanced informatics techniques to pool data from 14 bacterial sepsis transcriptomic datasets from eight different countries (n = 700). SETTING: Re ... Full text Link to item Cite

A community approach to mortality prediction in sepsis via gene expression analysis.

Journal Article Nat Commun · February 15, 2018 Improved risk stratification and prognosis prediction in sepsis is a critical unmet need. Clinical severity scores and available assays such as blood lactate reflect global illness severity with suboptimal performance, and do not specifically reveal the un ... Full text Open Access Link to item Cite

Deconvolutional latent-variable model for text sequence matching

Journal Article 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 · January 1, 2018 A latent-variable model is introduced for text matching, inferring sentence representations by jointly optimizing generative and discriminative objectives. To alleviate typical optimization challenges in latent-variable models for text, we employ deconvolu ... Cite

Multi-Label Learning from Medical Plain Text with Convolutional Residual Models

Journal Article Proceedings of Machine Learning Research · January 1, 2018 Predicting diagnoses from Electronic Health Records (EHRs) is an important medical application of multi-label learning. We propose a convolutional residual model for multi-label classification from doctor notes in EHR data. A given patient may have multipl ... Cite

Adversarial time-to-event modeling

Journal Article 35th International Conference on Machine Learning, ICML 2018 · January 1, 2018 Modern health data science applications leverage abundant molecular and electronic health data, providing opportunities for machine learning to build statistical models to support clinical practice. Time-to-event analysis, also called survival analysis, st ... Cite

Joint embedding of words and labels for text classification

Journal Article ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) · January 1, 2018 Word embeddings are effective intermediate representations for capturing semantic regularities between words, when learning the representations of text sequences. We propose to view text classification as a label-word joint embedding problem: each label is ... Full text Cite

NasH: Toward end-to-end neural architecture for generative semantic hashing

Journal Article ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) · January 1, 2018 Semantic hashing has become a powerful paradigm for fast similarity search in many information retrieval systems. While fairly successful, previous techniques generally require two-stage training, and the binary constraints are handled ad-hoc. In this pape ... Full text Open Access Cite

Baseline needs more love: On simple word-embedding-based models and associated pooling mechanisms

Journal Article ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) · January 1, 2018 Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations. However, there has not been a rigorous evaluation regarding the added value of sop ... Full text Cite

Variational Inference and Model Selection with Generalized Evidence Bounds

Conference INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80 · 2018 Cite

JointGAN: Multi-domain joint distribution learning with generative adversarial nets

Journal Article 35th International Conference on Machine Learning, ICML 2018 · January 1, 2018 A new generative adversarial network is developed for joint distribution matching. Distinct from most existing approaches, that only learn conditional distributions, the proposed model aims to learn a joint distribution of multiple random variables (domain ... Cite

Improved semantic-aware network embedding with fine-grained word alignment

Journal Article Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 · January 1, 2018 Network embeddings, which learn low-dimensional representations for each vertex in a large-scale network, have received considerable attention in recent years. For a wide range of applications, vertices in a network are typically accompanied by rich textua ... Full text Cite

Branched chain amino acid transaminase 1 (BCAT1) is overexpressed and hypomethylated in patients with non-alcoholic fatty liver disease who experience adverse clinical events: A pilot study.

Journal Article PLoS One · 2018 BACKGROUND AND OBJECTIVES: Although the burden of non-alcoholic fatty liver disease (NAFLD) continues to increase worldwide, genetic factors predicting progression to cirrhosis and decompensation in NAFLD remain poorly understood. We sought to determine wh ... Full text Open Access Link to item Cite

χ² generative adversarial network

Conference 35th International Conference on Machine Learning, ICML 2018 · January 1, 2018 To assess the difference between real and synthetic data, Generative Adversarial Networks (GANs) are trained using a distribution discrepancy measure. Three widely employed measures are information-theoretic divergences, integral probability metrics, and H ... Cite

Supplementary material for "χ² Generative Adversarial Net"

Conference 35th International Conference on Machine Learning, ICML 2018 · January 1, 2018 Cite

Variational inference and model selection with generalized evidence bounds

Conference 35th International Conference on Machine Learning, ICML 2018 · January 1, 2018 Recent advances on the scalability and flexibility of variational inference have made it successful at unravelling hidden patterns in complex data. In this work we propose a new variational bound formulation, yielding an estimator that extends beyond the c ... Cite

A miRNA Host Response Signature Accurately Discriminates Acute Respiratory Infection Etiologies.

Journal Article Front Microbiol · 2018 Background: Acute respiratory infections (ARIs) are the leading indication for antibacterial prescriptions despite a viral etiology in the majority of cases. The lack of available diagnostics to discriminate viral and bacterial etiologies contributes to th ... Full text Open Access Link to item Cite

Gaussian process based independent analysis for temporal source separation in fMRI.

Journal Article Neuroimage · May 15, 2017 Functional Magnetic Resonance Imaging (fMRI) gives us a unique insight into the processes of the brain, and opens up for analyzing the functional activation patterns of the underlying sources. Task-inferred supervised learning with restrictive assumptions ... Full text Link to item Cite

Reply to Kim et al.

Journal Article Am J Gastroenterol · May 2017 Full text Link to item Cite

Nasopharyngeal Protein Biomarkers of Acute Respiratory Virus Infection.

Journal Article EBioMedicine · March 2017 Infection of respiratory mucosa with viral pathogens triggers complex immunologic events in the affected host. We sought to characterize this response through proteomic analysis of nasopharyngeal lavage in human subjects experimentally challenged with infl ... Full text Open Access Link to item Cite

Adversarial feature matching for text generation

Journal Article 34th International Conference on Machine Learning, ICML 2017 · January 1, 2017 The Generative Adversarial Network (GAN) has achieved great success in generating realistic (real-valued) synthetic data. However, convergence issues and difficulties dealing with discrete data hinder the applicability of GAN to text. We propose a framewor ... Cite

Stochastic Gradient Monomial Gamma Sampler

Conference INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70 · 2017 Cite

VAE Learning via Stein Variational Gradient Descent

Conference ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017) · January 1, 2017 Link to item Cite

Adversarial Symmetric Variational Autoencoder

Conference ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017) · January 1, 2017 Link to item Cite

Adversarial symmetric variational autoencoder

Journal Article Advances in Neural Information Processing Systems · January 1, 2017 A new form of variational autoencoder (VAE) is developed, in which the joint distribution of data and codes is considered in two (symmetric) forms: (i) from observed data fed through the encoder to yield codes, and (ii) from latent codes drawn from a simpl ... Cite

Deconvolutional paragraph representation learning

Journal Article Advances in Neural Information Processing Systems · January 1, 2017 Learning latent representations from long text sequences is an important first step in many natural language processing applications. Recurrent Neural Networks (RNNs) have become a cornerstone for this challenging task. However, the quality of sentences du ... Cite

Stein Variational Autoencoder.

Journal Article CoRR · 2017 Cite

ALICE: Towards Understanding Adversarial Learning for Joint Distribution Matching

Conference ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017) · January 1, 2017 Link to item Cite

Learning generic sentence representations using convolutional neural networks

Conference EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings · January 1, 2017 We propose a new encoder-decoder approach to learn distributed sentence representations that are applicable to multiple purposes. The model is learned by using a convolutional neural network as an encoder to map an input sentence into a continuous vector, ... Full text Cite

Stochastic gradient monomial gamma sampler

Journal Article 34th International Conference on Machine Learning, ICML 2017 · January 1, 2017 Recent advances in stochastic gradient techniques have made it possible to estimate posterior distributions from large datasets via Markov Chain Monte Carlo (MCMC). However, when the target posterior is multimodal, mixing performance is often poor. This re ... Cite

VAE learning via Stein variational gradient descent

Conference Advances in Neural Information Processing Systems · January 1, 2017 A new method for learning variational autoencoders (VAEs) is developed, based on Stein variational gradient descent. A key advantage of this approach is that one need not make parametric assumptions about the form of the encoder distribution. Performance i ... Cite

ALICE: Towards understanding adversarial learning for joint distribution matching

Conference Advances in Neural Information Processing Systems · January 1, 2017 We investigate the non-identifiability issues associated with bidirectional adversarial training for joint distribution matching. Within a framework of conditional entropy, we propose both adversarial and non-adversarial approaches to learn desirable match ... Cite

Vitamin D is Not Associated With Severity in NAFLD: Results of a Paired Clinical and Gene Expression Profile Analysis.

Journal Article Am J Gastroenterol · November 2016 OBJECTIVES: The pathogenesis of nonalcoholic fatty liver disease (NAFLD) is complex. Vitamin D (VitD) has been implicated in NAFLD pathogenesis because it has roles in immune modulation, cell differentiation and proliferation, and regulation of inflammatio ... Full text Link to item Cite

Dynamic Poisson factor analysis

Conference Proceedings - IEEE International Conference on Data Mining, ICDM · July 2, 2016 We introduce a novel dynamic model for discrete time-series data, in which the temporal sampling may be nonuniform. The model is specified by constructing a hierarchy of Poisson factor analysis blocks, one for the transitions between latent states and the ... Full text Cite

Triply stochastic variational inference for non-linear beta process factor analysis

Conference Proceedings - IEEE International Conference on Data Mining, ICDM · July 2, 2016 We propose a non-linear extension to factor analysis with beta process priors for improved data representation ability. This non-linear Beta Process Factor Analysis (nBPFA) allows data to be represented as a non-linear transformation of a standard sparse f ... Full text Cite

Electronic health record analysis via deep Poisson factor models

Journal Article Journal of Machine Learning Research · April 1, 2016 Electronic Health Record (EHR) phenotyping utilizes patient data captured through normal medical practice, to identify features that may represent computational medical phenotypes. These features may be used to identify at-risk patients and improve predict ... Cite

Differential evolution of peripheral cytokine levels in symptomatic and asymptomatic responses to experimental influenza virus challenge.

Journal Article Clin Exp Immunol · March 2016 Exposure to influenza virus triggers a complex cascade of events in the human host. In order to understand more clearly the evolution of this intricate response over time, human volunteers were inoculated with influenza A/Wisconsin/67/2005 (H3N2), and then ... Full text Link to item Cite

Host gene expression classifiers diagnose acute respiratory illness etiology.

Journal Article Sci Transl Med · January 20, 2016 Acute respiratory infections caused by bacterial or viral pathogens are among the most common reasons for seeking medical care. Despite improvements in pathogen-based diagnostics, most patients receive inappropriate antibiotics. Host response biomarkers of ... Full text Open Access Link to item Cite

Laplacian Hamiltonian Monte Carlo

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2016 We propose a Hamiltonian Monte Carlo (HMC) method with Laplace kinetic energy, and demonstrate the connection between slice sampling and the proposed HMC method in one-dimensional cases. Based on this connection, one can perform slice sampling using a numeric ... Full text Cite

Bayesian dictionary learning with Gaussian processes and sigmoid belief networks

Conference IJCAI International Joint Conference on Artificial Intelligence · January 1, 2016 In dictionary learning for analysis of images, spatial correlation from extracted patches can be leveraged to improve characterization power. We propose a Bayesian framework for dictionary learning, with spatial location dependencies captured by imposing a ... Cite

Towards unifying Hamiltonian Monte Carlo and slice sampling

Conference Advances in Neural Information Processing Systems · January 1, 2016 We unify slice sampling and Hamiltonian Monte Carlo (HMC) sampling, demonstrating their connection via the Hamiltonian-Jacobi equation from Hamiltonian mechanics. This insight enables extension of HMC and slice sampling to a broader family of samplers, cal ... Cite

Variational autoencoder for deep learning of images, labels and captions

Conference Advances in Neural Information Processing Systems · January 1, 2016 A novel variational autoencoder is developed to model images, as well as associated labels or captions. The Deep Generative Deconvolutional Network (DGDN) is used as a decoder of the latent image features, and a deep Convolutional Neural Network (CNN) is u ... Cite

Learning sigmoid belief networks via Monte Carlo expectation maximization

Conference Artificial Intelligence and Statistics · 2016 Cite

Cancers of unknown primary origin (CUP) are characterized by chromosomal instability (CIN) compared to metastasis of know origin.

Journal Article BMC Cancer · March 19, 2015 BACKGROUND: Cancers of unknown primary (CUPs) constitute ~5% of all cancers. The tumors have an aggressive biological and clinical behavior. The aim of the present study has been to uncover whether CUPs exhibit distinct molecular features compared to metas ... Full text Link to item Cite

Non-Gaussian discriminative factor models via the max-margin rank-likelihood

Journal Article 32nd International Conference on Machine Learning, ICML 2015 · January 1, 2015 We consider the problem of discriminative factor analysis for data that are in general non-Gaussian. A Bayesian model based on the ranks of the data is proposed. We first introduce a new max-margin version of the rank-likelihood. A discriminative factor mo ... Open Access Cite

Deep temporal sigmoid belief networks for sequence modeling

Journal Article Advances in Neural Information Processing Systems · 2015 Cite

Learning deep sigmoid belief networks with data augmentation

Conference Artificial Intelligence and Statistics · 2015 Cite

A multitask point process predictive model

Conference 32nd International Conference on Machine Learning, ICML 2015 · January 1, 2015 Point process data are commonly observed in fields like healthcare and the social sciences. Designing predictive models for such event streams is an under-explored problem, due to often scarce training data. In this work we propose a multitask point proces ... Cite

Large-scale Bayesian multi-label learning via topic-based label embeddings

Conference Advances in Neural Information Processing Systems · January 1, 2015 We present a scalable Bayesian multi-label learning model based on learning low-dimensional label embeddings. Our model assumes that each label vector is generated as a weighted combination of a set of topics (each topic being a distribution over labels), w ... Cite

Deep Poisson factor modeling

Conference Advances in Neural Information Processing Systems · January 1, 2015 We propose a new deep architecture for topic modeling, based on Poisson Factor Analysis (PFA) modules. The model is composed of a Poisson distribution to model observed vectors of counts, as well as a deep hierarchy of hidden binary units. Rather than usin ... Cite

An integrated transcriptome and expressed variant analysis of sepsis survival and death.

Journal Article Genome Med · 2014 BACKGROUND: Sepsis, a leading cause of morbidity and mortality, is not a homogeneous disease but rather a syndrome encompassing many heterogeneous pathophysiologies. Patient factors including genetics predispose to poor outcomes, though current clinical ch ... Full text Open Access Link to item Cite

Bayesian nonlinear support vector machines and discriminative factor modeling

Conference Advances in Neural Information Processing Systems · January 1, 2014 A new Bayesian formulation is developed for nonlinear support vector machines (SVMs), based on a Gaussian process and with the SVM hinge loss expressed as a scaled mixture of normals. We then integrate the Bayesian SVM into a factor model, in which feature ... Cite

A flexible statistical model for alignment of label-free proteomics data--incorporating ion mobility and product ion information.

Journal Article BMC Bioinformatics · December 16, 2013 BACKGROUND: The goal of many proteomics experiments is to determine the abundance of proteins in biological samples, and the variation thereof in various physiological conditions. High-throughput quantitative proteomics, specifically label-free LC-MS/MS, a ... Full text Open Access Link to item Cite

Latent protein trees

Journal Article Annals of Applied Statistics · June 1, 2013 Unbiased, label-free proteomics is becoming a powerful technique for measuring protein expression in almost any biological sample. The output of these measurements after preprocessing is a collection of features and their associated intensities for each sa ... Full text Open Access Cite

A network of substrates of the E3 ubiquitin ligases MDM2 and HUWE1 control apoptosis independently of p53.

Journal Article Sci Signal · May 7, 2013 In the intrinsic pathway of apoptosis, cell-damaging signals promote the release of cytochrome c from mitochondria, triggering activation of the Apaf-1 and caspase-9 apoptosome. The ubiquitin E3 ligase MDM2 decreases the stability of the proapoptotic facto ... Full text Open Access Link to item Cite

Patient clustering with uncoded text in electronic medical records.

Journal Article AMIA Annu Symp Proc · 2013 We propose a mixture model for text data designed to capture underlying structure in the history of present illness section of electronic medical records data. Additionally, we propose a method to induce bias that leads to more homogeneous sets of diagnose ... Link to item Cite

Hierarchical factor modeling of proteomics data

Journal Article 2012 IEEE 2nd International Conference on Computational Advances in Bio and Medical Sciences, ICCABS 2012 · May 8, 2012 This paper presents a hierarchical Bayesian factor model specifically designed to model the known correlation structure of both peptides and proteins in unbiased, label-free proteomics. The model utilizes partial identification information from peptide seq ... Full text Cite

Efficient hierarchical clustering for continuous data

Journal Article · April 20, 2012 We present a new sequential Monte Carlo sampler for coalescent-based Bayesian hierarchical clustering. Our model is appropriate for modeling non-i.i.d. data and offers a substantial reduction of computational cost when compared to the original sampler wit ... Link to item Cite

Predictive active set selection methods for Gaussian processes

Journal Article Neurocomputing · March 15, 2012 We propose an active set selection framework for Gaussian process classification for cases when the dataset is large enough to render its inference prohibitive. Our scheme consists of a two step alternating procedure of active set update rules and hyperpar ... Full text Cite

Down-regulation of microRNAs controlling tumourigenic factors in follicular thyroid carcinoma.

Journal Article J Mol Endocrinol · February 2012 The molecular determinants of thyroid follicular nodules are incompletely understood and assessment of malignancy is a diagnostic challenge. Since microRNA (miRNA) analyses could provide new leads to malignant progression, we characterised the global miRNA ... Full text Link to item Cite

Gene expression of the endolymphatic sac.

Journal Article Acta Otolaryngol · December 2011 CONCLUSION: The endolymphatic sac is part of the membranous inner ear and is thought to play a role in the fluid homeostasis and immune defense of the inner ear; however, the exact function of the endolymphatic sac is not fully known. Many of the detected ... Full text Link to item Cite

Sparse linear identifiable multivariate modeling

Journal Article Journal of Machine Learning Research · March 1, 2011 In this paper we consider sparse and identifiable linear latent variable (factor) and linear Bayesian network models for parsimonious analysis of multivariate data. We propose a computationally efficient method for joint parameter and model inference, and ... Cite

PASS-GP: Predictive active set selection for Gaussian processes

Journal Article Proceedings of the 2010 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2010 · November 24, 2010 We propose a new approximation method for Gaussian process (GP) learning for large data sets that combines inline active set selection with hyperparameter optimization. The predictive probability of the label is used for ranking the data points. We use the ... Full text Cite

Molecular signatures of thyroid follicular neoplasia.

Journal Article Endocr Relat Cancer · September 2010 The molecular pathways leading to thyroid follicular neoplasia are incompletely understood, and the diagnosis of follicular tumors is a clinical challenge. To provide leads to the pathogenesis and diagnosis of the tumors, we examined the global transcripto ... Full text Link to item Cite

Semi-Supervised Kernel PCA

Journal Article CoRR · 2010 Cite

Bayesian sparse factor models and DAGs inference and comparison

Conference Advances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference · January 1, 2009 In this paper we present a novel approach to learn directed acyclic graphs (DAGs) and factor models within the same framework while also allowing for model comparison between them. For this purpose, we exploit the connection between factor models and DAGs ... Cite

Myocardial ischemia detection using Hidden Markov principal component analysis

Journal Article IFMBE Proceedings · January 1, 2008 This paper introduces a new temporal version of Principal Component Analysis by using a Hidden Markov Model in order to obtain optimized representations of observed data through time. The novelty of the proposed method consists mainly in the way in which a ... Full text Cite

Probabilistic kernel principal component analysis through time

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2006 This paper introduces a temporal version of Probabilistic Kernel Principal Component Analysis by using a hidden Markov model in order to obtain optimized representations of observed data through time. Recently introduced Probabilistic Kernel Principal Com ... Full text Cite

Kernel Principal Component analysis through time for voice disorder classification.

Journal Article Conf Proc IEEE Eng Med Biol Soc · 2006 Kernel Principal Component analysis is a nonlinear generalization of the popular linear multivariate analysis method. However, this method assumes that the observed data is independent, a disadvantage for many practical applications. In order to overcome t ... Full text Link to item Cite

Kernel Principal Component analysis through time for voice disorder classification.

Journal Article Conference proceedings : ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Conference · 2006 Kernel Principal Component analysis is a nonlinear generalization of the popular linear multivariate analysis method. However, this method assumes that the observed data is independent, a disadvantage for many practical applications. In order to overcome t ... Cite

Kernel principal component analysis through time for voice disorder classification

Conference 2006 28TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-15 · January 1, 2006 Link to item Cite

Active Learning on the Classification of Voice Pathologies

Conference Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH · January 1, 2004 This article studies the usefulness of the support vector machine (SVM) algorithm for the active classification of voice records into normal and pathologic sets. In practice, each of the samples employed in classifier training must be ... Cite