Mixture model normalization for non-targeted gas chromatography/mass spectrometry metabolomics data.

Published online

Journal Article

BACKGROUND: Metabolomics offers a unique integrative perspective for health research, reflecting genetic and environmental contributions to disease-related phenotypes. Identifying robust associations in population-based or large-scale clinical studies demands large numbers of subjects and therefore sample batching for gas chromatography/mass spectrometry (GC/MS) non-targeted assays. When samples are run over weeks or months, technical noise due to batch and run order threatens data interpretability. Application of existing normalization methods to metabolomics is challenged by unsatisfied modeling assumptions and, notably, failure to address batch-specific truncation of low-abundance compounds.

RESULTS: To curtail technical noise and make GC/MS metabolomics data amenable to analyses describing biologically relevant variability, we propose mixture model normalization (mixnorm) that accommodates truncated data and estimates per-metabolite batch and run-order effects using quality control samples. Mixnorm outperforms other approaches across many metrics, including improved correlation of non-targeted and targeted measurements and superior performance when metabolite detectability varies according to batch. For some metrics, particularly when truncation is less frequent for a metabolite, mean centering and median scaling demonstrate comparable performance to mixnorm.

CONCLUSIONS: When quality control samples are systematically included in batches, mixnorm is uniquely suited to normalizing non-targeted GC/MS metabolomics data due to explicit accommodation of batch effects, run order and varying thresholds of detectability. Especially in large-scale studies, normalization is crucial for drawing accurate conclusions from non-targeted GC/MS metabolomics data.
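To make the comparison concrete, the following is a minimal sketch of per-batch mean centering, one of the baseline approaches the abstract names. It is not the authors' mixnorm method and does not model truncation or run order. The function name `batch_center`, the use of `None` for non-detected values, and the toy data are all assumptions made for illustration; the optional `stat` argument is likewise hypothetical and may not match how the paper defines median scaling.

```python
from statistics import mean


def batch_center(values, batches, stat=mean):
    """Subtract each batch's centre from that batch's observed values.

    values  : log-abundances for one metabolite; None marks a non-detect
    batches : batch labels, same length as values
    stat    : function computing the batch centre (mean by default)
    """
    # Collect the observed (non-missing) values per batch.
    observed = {}
    for v, b in zip(values, batches):
        if v is not None:
            observed.setdefault(b, []).append(v)

    # One centre per batch, computed from detected values only.
    centres = {b: stat(vs) for b, vs in observed.items()}

    # Non-detects stay missing: this baseline ignores truncation entirely.
    return [None if v is None else v - centres[b]
            for v, b in zip(values, batches)]


# Toy data (hypothetical): batch A has one value below its detection threshold.
log_abund = [10.0, 12.0, None, 13.0, 15.0, 14.0]
batch = ["A", "A", "A", "B", "B", "B"]
centered = batch_center(log_abund, batch)
```

In this toy case the centre of batch A is computed from only the two detected values, so it is pulled upward relative to the unobserved full distribution. That bias from batch-specific truncation is the limitation the abstract says existing methods fail to address.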

Cited Authors

  • Reisetter, AC; Muehlbauer, MJ; Bain, JR; Nodzenski, M; Stevens, RD; Ilkayeva, O; Metzger, BE; Newgard, CB; Lowe, WL; Scholtens, DM

Published Date

  • February 2, 2017

Published In

Volume / Issue

  • 18 / 1

Start / End Page

  • 84 -

PubMed ID

  • 28153035

PubMed Central ID

  • 28153035

Electronic International Standard Serial Number (EISSN)

  • 1471-2105

Digital Object Identifier (DOI)

  • 10.1186/s12859-017-1501-7

Language

  • eng

Conference Location

  • England