Systematic Error Removal Using Random Forest for Normalizing Large-Scale Untargeted Lipidomics Data.

Publication, Journal Article
Fan, S; Kind, T; Cajka, T; Hazen, SL; Tang, WHW; Kaddurah-Daouk, R; Irvin, MR; Arnett, DK; Barupal, DK; Fiehn, O
Published in: Anal Chem
March 5, 2019

Large-scale untargeted lipidomics experiments involve the measurement of hundreds to thousands of samples. Such data sets are usually acquired on one instrument over days or weeks of analysis time. This extended acquisition introduces a variety of systematic errors, including batch differences, longitudinal drifts, and even instrument-to-instrument variation. The resulting technical variance can obscure the true biological signal and hinder biological discoveries. To combat this issue, we present a novel normalization approach based on quality control pool samples (QC), called systematic error removal using random forest (SERRF), which eliminates unwanted systematic variation in large sample sets. We compared SERRF with 15 other commonly used normalization methods using six lipidomics data sets from three large cohort studies (832, 1162, and 2696 samples). SERRF reduced the average technical errors for these data sets to 5% relative standard deviation. We conclude that SERRF outperforms other existing methods and can significantly reduce the unwanted systematic variation, revealing biological variance of interest.
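
The abstract reports technical error as relative standard deviation (RSD) in QC samples. As a rough illustration of that metric and of how pooled QC injections can be used to remove a batch difference, here is a minimal Python sketch. It is not SERRF: the random forest model is replaced by a simple per-batch QC median scaling, and the function names (`rsd_percent`, `qc_batch_scale`) and the intensity values are hypothetical, chosen only for this example.

```python
from statistics import mean, median, stdev


def rsd_percent(values):
    """Relative standard deviation: sample standard deviation / mean, in percent."""
    return 100.0 * stdev(values) / mean(values)


def qc_batch_scale(intensities, batches, is_qc):
    """Scale each batch so its QC median matches the overall QC median.

    A deliberately simple stand-in for QC-based normalization; it is NOT SERRF.
    Assumes every batch contains at least one QC sample.
    """
    target = median(x for x, q in zip(intensities, is_qc) if q)
    factors = {}
    for b in set(batches):
        qc_in_batch = [
            x for x, bb, q in zip(intensities, batches, is_qc) if q and bb == b
        ]
        factors[b] = target / median(qc_in_batch)
    return [x * factors[b] for x, b in zip(intensities, batches)]


# Hypothetical intensities for one lipid: the second batch reads about 20% high.
intensities = [100.0, 102.0, 98.0, 150.0, 120.0, 122.4, 117.6, 90.0]
batches = [1, 1, 1, 1, 2, 2, 2, 2]
is_qc = [True, True, True, False, True, True, True, False]

corrected = qc_batch_scale(intensities, batches, is_qc)
qc_before = [x for x, q in zip(intensities, is_qc) if q]
qc_after = [x for x, q in zip(corrected, is_qc) if q]

print(f"QC RSD before: {rsd_percent(qc_before):.1f}%")
print(f"QC RSD after:  {rsd_percent(qc_after):.1f}%")
```

Because the QC pool is the same material in every injection, any spread in its measured intensity is technical rather than biological, which is why QC RSD is a natural yardstick for comparing normalization methods.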

Published In

Anal Chem

DOI

10.1021/acs.analchem.8b05592

EISSN

1520-6882

Publication Date

March 5, 2019

Volume

91

Issue

5

Start / End Page

3590 / 3596

Location

United States

Related Subject Headings

  • Scientific Experimental Error
  • Quality Control
  • Lipidomics
  • Datasets as Topic
  • Analytical Chemistry
  • 4004 Chemical engineering
  • 3401 Analytical chemistry
  • 3205 Medical biochemistry and metabolomics
  • 0399 Other Chemical Sciences
  • 0301 Analytical Chemistry
 

Citation

APA: Fan, S., Kind, T., Cajka, T., Hazen, S. L., Tang, W. H. W., Kaddurah-Daouk, R., … Fiehn, O. (2019). Systematic Error Removal Using Random Forest for Normalizing Large-Scale Untargeted Lipidomics Data. Anal Chem, 91(5), 3590–3596. https://doi.org/10.1021/acs.analchem.8b05592
Chicago: Fan, Sili, Tobias Kind, Tomas Cajka, Stanley L. Hazen, WH Wilson Tang, Rima Kaddurah-Daouk, Marguerite R. Irvin, Donna K. Arnett, Dinesh K. Barupal, and Oliver Fiehn. “Systematic Error Removal Using Random Forest for Normalizing Large-Scale Untargeted Lipidomics Data.” Anal Chem 91, no. 5 (March 5, 2019): 3590–96. https://doi.org/10.1021/acs.analchem.8b05592.
ICMJE: Fan S, Kind T, Cajka T, Hazen SL, Tang WHW, Kaddurah-Daouk R, et al. Systematic Error Removal Using Random Forest for Normalizing Large-Scale Untargeted Lipidomics Data. Anal Chem. 2019 Mar 5;91(5):3590–6.
MLA: Fan, Sili, et al. “Systematic Error Removal Using Random Forest for Normalizing Large-Scale Untargeted Lipidomics Data.” Anal Chem, vol. 91, no. 5, Mar. 2019, pp. 3590–96. PubMed, doi:10.1021/acs.analchem.8b05592.
NLM: Fan S, Kind T, Cajka T, Hazen SL, Tang WHW, Kaddurah-Daouk R, Irvin MR, Arnett DK, Barupal DK, Fiehn O. Systematic Error Removal Using Random Forest for Normalizing Large-Scale Untargeted Lipidomics Data. Anal Chem. 2019 Mar 5;91(5):3590–3596.