Scholars@Duke publication: Classifying noisy and incomplete medical data by a differential latent semantic indexing approach

Springer Optimization and Its Applications

Classifying noisy and incomplete medical data by a differential latent semantic indexing approach

Publication, Chapter

Chen, L; Zeng, J; Pei, J

January 1, 2007

It is well recognized that medical datasets are often noisy and incomplete because of difficulties in data collection and integration. Noise and incompleteness in medical data pose substantial challenges for accurate classification. A differential latent semantic indexing (DLSI) approach, an improvement on the standard LSI method, has been proposed for information retrieval and has demonstrated improved performance over the standard LSI approach. The key idea is that DLSI adapts to the unique characteristics of each individual record or document. Using experimental results on real datasets, we show that DLSI outperforms the standard LSI method on noisy and incomplete medical datasets. The results strongly indicate that the DLSI approach is also capable of analyzing numerical medical data.

Duke Scholars

Author Jian Pei Computer Science

DOI

10.1007/978-0-387-69319-4_10

Publication Date

January 1, 2007

Volume

7

Start / End Page

169 / 176

Citation

APA: Chen, L., Zeng, J., & Pei, J. (2007). Classifying noisy and incomplete medical data by a differential latent semantic indexing approach. In Springer Optimization and Its Applications (Vol. 7, pp. 169–176). https://doi.org/10.1007/978-0-387-69319-4_10

Chicago: Chen, L., J. Zeng, and J. Pei. “Classifying noisy and incomplete medical data by a differential latent semantic indexing approach.” In Springer Optimization and Its Applications, 7:169–76, 2007. https://doi.org/10.1007/978-0-387-69319-4_10.

ICMJE: Chen L, Zeng J, Pei J. Classifying noisy and incomplete medical data by a differential latent semantic indexing approach. In: Springer Optimization and Its Applications. 2007. p. 169–76.

MLA: Chen, L., et al. “Classifying noisy and incomplete medical data by a differential latent semantic indexing approach.” Springer Optimization and Its Applications, vol. 7, 2007, pp. 169–76. Scopus, doi:10.1007/978-0-387-69319-4_10.

NLM: Chen L, Zeng J, Pei J. Classifying noisy and incomplete medical data by a differential latent semantic indexing approach. Springer Optimization and Its Applications. 2007. p. 169–176.
