
Finding diagnostic biomarkers in proteomic spectra.

Publication, Conference
Pratapa, PN; Patz, EF; Hartemink, AJ
Published in: Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing
January 2006

In seeking to find diagnostic biomarkers in proteomic spectra, two significant problems arise. First, not only is there noise in the measured intensity at each m/z value, but there is also noise in the measured m/z value itself. Second, the potential for overfitting is severe: it is easy to find features in the spectra that accurately discriminate disease states but have no biological meaning. We address these problems by developing and testing a series of steps for pre-processing proteomic spectra and extracting putatively meaningful features before presentation to feature selection and classification algorithms. These steps include an HMM-based latent spectrum extraction algorithm for fusing the information from multiple replicate spectra obtained from a single tissue sample, a simple algorithm for baseline correction based on a segmented convex hull, a peak identification and quantification algorithm, and a peak registration algorithm to align peaks from multiple tissue samples into common peak registers. We apply these steps to MALDI spectral data collected from normal and tumor lung tissue samples, and then compare the performance of feature selection with FDR followed by classification with an SVM, versus joint feature selection and classification with Bayesian sparse multinomial logistic regression (SMLR). The SMLR approach outperformed FDR+SVM, but both were effective in achieving good diagnostic accuracy with a small number of features. Some of the selected features have previously been investigated as clinical markers for lung cancer diagnosis; some of the remaining features are excellent candidates for further research.
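As an illustration of one of the pre-processing steps named above, baseline correction based on a segmented convex hull might be sketched as follows. The abstract does not give the algorithm's details, so everything here is an assumption: the function names `lower_hull_indices` and `segmented_hull_baseline`, the equal-width segmentation, the `n_segments` parameter, and the synthetic spectrum are hypothetical and are not taken from the paper. The sketch splits the spectrum into segments, computes the lower convex hull of each segment, interpolates the hull linearly to form a baseline, and subtracts it.

```python
import numpy as np


def lower_hull_indices(x, y):
    """Indices of the lower convex hull of points (x[i], y[i]).

    Assumes x is strictly increasing (monotone-chain scan, left to right).
    """
    hull = []
    for i in range(len(x)):
        # Drop the last hull vertex while it lies on or above the chord
        # from the vertex before it to the new point i.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (x[b] - x[a]) * (y[i] - y[a]) - (y[b] - y[a]) * (x[i] - x[a])
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    return np.array(hull)


def segmented_hull_baseline(mz, intensity, n_segments=4):
    """Estimate a baseline as the piecewise lower convex hull of the spectrum.

    n_segments is a hypothetical tuning parameter: more segments let the
    baseline follow a non-convex background more closely.
    """
    mz = np.asarray(mz, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    baseline = np.empty_like(intensity)
    bounds = np.linspace(0, len(mz), n_segments + 1).astype(int)
    for start, stop in zip(bounds[:-1], bounds[1:]):
        if stop == start:
            continue  # empty segment (only when n_segments > len(mz))
        xs, ys = mz[start:stop], intensity[start:stop]
        idx = lower_hull_indices(xs, ys)
        # Linear interpolation between hull vertices gives the baseline.
        baseline[start:stop] = np.interp(xs, xs[idx], ys[idx])
    return baseline


# Synthetic example (made-up data): a decaying background plus one peak.
mz = np.linspace(1000.0, 10000.0, 500)
background = 50.0 * np.exp(-(mz - 1000.0) / 3000.0)
peak = 100.0 * np.exp(-0.5 * ((mz - 4000.0) / 30.0) ** 2)
intensity = background + peak

baseline = segmented_hull_baseline(mz, intensity, n_segments=4)
corrected = intensity - baseline
```

Because the hull lies on or below every point in its segment, the corrected intensities are non-negative up to floating-point rounding. Segmenting is one plausible way to cope with a background that is not convex over the full m/z range; the paper's actual segmentation rule may differ.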

Published In

Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing

EISSN

2335-6936

ISSN

2335-6928

Publication Date

January 2006

Start / End Page

279 / 290

Related Subject Headings

  • Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization
  • Proteomics
  • Lung Neoplasms
  • Humans
  • Computational Biology
  • Biomarkers, Tumor
  • Biomarkers
  • Algorithms
 

Citation

APA: Pratapa, P. N., Patz, E. F., & Hartemink, A. J. (2006). Finding diagnostic biomarkers in proteomic spectra. In Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing (pp. 279–290).
Chicago: Pratapa, Pallavi N., Edward F. Patz, and Alexander J. Hartemink. “Finding diagnostic biomarkers in proteomic spectra.” In Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing, 279–90, 2006.
ICMJE: Pratapa PN, Patz EF, Hartemink AJ. Finding diagnostic biomarkers in proteomic spectra. In: Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing. 2006. p. 279–90.
MLA: Pratapa, Pallavi N., et al. “Finding diagnostic biomarkers in proteomic spectra.” Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing, 2006, pp. 279–90.
NLM: Pratapa PN, Patz EF, Hartemink AJ. Finding diagnostic biomarkers in proteomic spectra. Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing. 2006. p. 279–290.
