
Finding diagnostic biomarkers in proteomic spectra.

Publication, Conference
Pratapa, PN; Patz, EF; Hartemink, AJ
Published in: Pac Symp Biocomput
2006

In seeking to find diagnostic biomarkers in proteomic spectra, two significant problems arise. First, not only is there noise in the measured intensity at each m/z value, but there is also noise in the measured m/z value itself. Second, the potential for overfitting is severe: it is easy to find features in the spectra that accurately discriminate disease states but have no biological meaning. We address these problems by developing and testing a series of steps for pre-processing proteomic spectra and extracting putatively meaningful features before presentation to feature selection and classification algorithms. These steps include an HMM-based latent spectrum extraction algorithm for fusing the information from multiple replicate spectra obtained from a single tissue sample, a simple algorithm for baseline correction based on a segmented convex hull, a peak identification and quantification algorithm, and a peak registration algorithm to align peaks from multiple tissue samples into common peak registers. We apply these steps to MALDI spectral data collected from normal and tumor lung tissue samples, and then compare the performance of feature selection with FDR followed by classification with an SVM, versus joint feature selection and classification with Bayesian sparse multinomial logistic regression (SMLR). The SMLR approach outperformed FDR+SVM, but both were effective in achieving good diagnostic accuracy with a small number of features. Some of the selected features have previously been investigated as clinical markers for lung cancer diagnosis; some of the remaining features are excellent candidates for further research.
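The abstract names a baseline correction step based on a segmented convex hull but gives no details of it. As a rough illustration only, and not the authors' algorithm, the following Python sketch estimates a baseline as the lower convex hull of each segment of a spectrum, so that it can be subtracted from the measured intensities. The function names, the equal-size segmentation, and the `n_segments` parameter are all assumptions made for this example, which also assumes the m/z values are strictly increasing.

```python
import numpy as np

def lower_hull_baseline(mz, intensity):
    """Baseline for one segment: the lower convex hull of the (mz, intensity) points.

    Hypothetical helper for illustration; assumes mz is strictly increasing.
    """
    hull = []  # indices of points on the lower hull, left to right
    for i in range(len(mz)):
        # Drop the last hull point while it lies on or above the chord
        # joining the point before it to the new point i.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = ((mz[b] - mz[a]) * (intensity[i] - intensity[a])
                     - (intensity[b] - intensity[a]) * (mz[i] - mz[a]))
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    # Linearly interpolate between hull vertices to get a value at every m/z.
    return np.interp(mz, mz[hull], intensity[hull])

def segmented_hull_baseline(mz, intensity, n_segments=4):
    """Apply the lower-hull baseline independently to equal-size segments.

    The equal-size split is an assumption of this sketch, not taken from the paper.
    """
    mz = np.asarray(mz, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    baseline = np.empty_like(intensity)
    for idx in np.array_split(np.arange(len(mz)), n_segments):
        if len(idx) == 0:
            continue  # more segments requested than data points
        baseline[idx] = lower_hull_baseline(mz[idx], intensity[idx])
    return baseline

# Usage: corrected = intensity - segmented_hull_baseline(mz, intensity)
```

Because the hull never rises above the data, the corrected intensities are non-negative. A fixed split can place a segment boundary on a peak, which would pull the baseline up to the peak's height at that point, so a practical version would need to choose boundaries more carefully.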


Published In: Pac Symp Biocomput
ISSN: 2335-6928
Publication Date: 2006
Start / End Page: 279 / 290
Location: United States

Related Subject Headings

  • Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization
  • Proteomics
  • Lung Neoplasms
  • Humans
  • Computational Biology
  • Biomarkers, Tumor
  • Biomarkers
  • Algorithms
 

Citation

APA: Pratapa, P. N., Patz, E. F., & Hartemink, A. J. (2006). Finding diagnostic biomarkers in proteomic spectra. In Pac Symp Biocomput (pp. 279–290). United States.
Chicago: Pratapa, Pallavi N., Edward F. Patz, and Alexander J. Hartemink. “Finding diagnostic biomarkers in proteomic spectra.” In Pac Symp Biocomput, 279–90, 2006.
ICMJE: Pratapa PN, Patz EF, Hartemink AJ. Finding diagnostic biomarkers in proteomic spectra. In: Pac Symp Biocomput. 2006. p. 279–90.
MLA: Pratapa, Pallavi N., et al. “Finding diagnostic biomarkers in proteomic spectra.” Pac Symp Biocomput, 2006, pp. 279–90.
NLM: Pratapa PN, Patz EF, Hartemink AJ. Finding diagnostic biomarkers in proteomic spectra. Pac Symp Biocomput. 2006. p. 279–290.
