Hierarchical factor modeling of proteomics data
This paper presents a hierarchical Bayesian factor model designed to capture the known correlation structure of both peptides and proteins in unbiased, label-free proteomics. The model uses partial identification information from peptide sequencing and database lookup, together with the correlation observed in the data set, to appropriately compress features into metaproteins and to estimate correlation structure. Although peptide-to-phenotype associations can be computed from hypothesis testing or multiple regression summaries, to date there have been no published approaches that directly model the multiple levels of correlation structure known to exist. We test the proposed model using publicly available benchmark data and a recent study of volunteers who were infected with two different strains of viral influenza. © 2012 IEEE.