Latent protein trees

Publication, Journal Article

Henao, R; Thompson, JW; Moseley, MA; Ginsburg, GS; Carin, L; Lucas, JE

Published in: Annals of Applied Statistics

June 1, 2013

Unbiased, label-free proteomics is becoming a powerful technique for measuring protein expression in almost any biological sample. The output of these measurements after preprocessing is a collection of features and their associated intensities for each sample. Subsets of features within the data are from the same peptide, subsets of peptides are from the same protein, and subsets of proteins are in the same biological pathways; therefore, there is the potential for very complex and informative correlational structure inherent in these data. Recent attempts to utilize these data often focus on the identification of single features that are associated with a particular phenotype that is relevant to the experiment. However, to date, there have been no published approaches that directly model what we know to be multiple different levels of correlation structure. Here we present a hierarchical Bayesian model which is specifically designed to model such correlation structure in unbiased, label-free proteomics. This model utilizes partial identification information from peptide sequencing and database lookup as well as the observed correlation in the data to appropriately compress features into latent proteins and to estimate their correlation structure. We demonstrate the effectiveness of the model using artificial/benchmark data and in the context of a series of proteomics measurements of blood plasma from a collection of volunteers who were infected with two different strains of viral influenza. © Institute of Mathematical Statistics, 2013.

Duke Scholars

Author Ricardo Henao Biostatistics & Bioinformatics, Division of Translational Bi ...

Author J. Will Thompson Pharmacology & Cancer Biology

Author Martin Arthur Moseley III Cell Biology

Author Geoffrey Steven Ginsburg Medicine, Cardiology

Author Lawrence Carin Electrical and Computer Engineering

Published In

Annals of Applied Statistics

DOI

10.1214/13-AOAS639

EISSN

1941-7330

ISSN

1932-6157

Publication Date

June 1, 2013

Volume

7

Issue

2

Start / End Page

691 / 713

Related Subject Headings

Statistics & Probability
4905 Statistics
1403 Econometrics
0104 Statistics

Citation

APA

Henao, R., Thompson, J. W., Moseley, M. A., Ginsburg, G. S., Carin, L., & Lucas, J. E. (2013). Latent protein trees. Annals of Applied Statistics, 7(2), 691–713. https://doi.org/10.1214/13-AOAS639

Chicago

Henao, R., J. W. Thompson, M. A. Moseley, G. S. Ginsburg, L. Carin, and J. E. Lucas. “Latent protein trees.” Annals of Applied Statistics 7, no. 2 (June 1, 2013): 691–713. https://doi.org/10.1214/13-AOAS639.

ICMJE

Henao R, Thompson JW, Moseley MA, Ginsburg GS, Carin L, Lucas JE. Latent protein trees. Annals of Applied Statistics. 2013 Jun 1;7(2):691–713.

MLA

Henao, R., et al. “Latent protein trees.” Annals of Applied Statistics, vol. 7, no. 2, June 2013, pp. 691–713. Scopus, doi:10.1214/13-AOAS639.

NLM

Henao R, Thompson JW, Moseley MA, Ginsburg GS, Carin L, Lucas JE. Latent protein trees. Annals of Applied Statistics. 2013 Jun 1;7(2):691–713.
