MSIseq: Software for Assessing Microsatellite Instability from Catalogs of Somatic Mutations.

Publication, Journal Article
Huang, MN; McPherson, JR; Cutcutache, I; Teh, BT; Tan, P; Rozen, SG
Published in: Sci Rep
August 26, 2015

Microsatellite instability (MSI) is a form of hypermutation that occurs in some tumors due to defects in cellular DNA mismatch repair. MSI is characterized by frequent somatic mutations (i.e., cancer-specific mutations) that change the length of simple repeats (e.g., AAAAA..., GATAGATAGATA...). Clinical MSI tests evaluate the lengths of a handful of simple repeat sites, while next-generation sequencing can assay many more sites and offers a much more complete view of their somatic mutation frequencies. Using somatic mutation data from the exomes of a 361-tumor training set, we developed classifiers to determine MSI status based on four machine-learning frameworks. All frameworks had high accuracy, and after choosing one we determined that it had >98% concordance with clinical tests in a separate 163-tumor test set. Furthermore, this classifier retained high concordance even when classifying tumors based on subsets of whole-exome data. We have released a CRAN R package, MSIseq, based on this classifier. MSIseq is faster and simpler to use than software that requires large files of aligned sequenced reads. MSIseq will be useful for genomic studies in which clinical MSI test results are unavailable and for detecting possible misclassifications by clinical tests.
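
As a rough illustration of the idea in the abstract (a somatic mutation catalog, rather than aligned reads, carries the signal), the Python sketch below counts insertions/deletions that fall in simple repeats, normalizes by the amount of sequence assayed, and applies a cutoff. This is not the MSIseq classifier or its API: the `Mutation` type, the function names, the single-feature rule, and the cutoff value of 1.0 are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Mutation:
    """One somatic mutation from a catalog (hypothetical minimal schema)."""
    kind: str               # "SNV" or "INDEL"
    in_simple_repeat: bool  # True if the site lies in a simple repeat


def repeat_indel_rate(mutations: Iterable[Mutation], captured_mb: float) -> float:
    """Indels in simple repeats per megabase of assayed sequence."""
    if captured_mb <= 0:
        raise ValueError("captured_mb must be positive")
    n = sum(1 for m in mutations if m.kind == "INDEL" and m.in_simple_repeat)
    return n / captured_mb


def classify_msi(mutations: Iterable[Mutation], captured_mb: float,
                 cutoff: float = 1.0) -> str:
    """Label a tumor by a single illustrative threshold.

    The default cutoff is an arbitrary placeholder, not a value
    taken from the paper.
    """
    rate = repeat_indel_rate(mutations, captured_mb)
    return "MSI" if rate > cutoff else "non-MSI"


# Two toy tumors, each assayed over 30 Mb of exome.
unstable = [Mutation("INDEL", True)] * 60 + [Mutation("SNV", False)] * 200
stable = [Mutation("INDEL", True)] * 3 + [Mutation("SNV", False)] * 40

print(classify_msi(unstable, captured_mb=30.0))
print(classify_msi(stable, captured_mb=30.0))
```

Normalizing by assayed length is what would let the same rule be applied to subsets of whole-exome data, since the rate, unlike the raw count, does not shrink with the size of the target region.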


Published In

Sci Rep

DOI

10.1038/srep13321

EISSN

2045-2322

Publication Date

August 26, 2015

Volume

5

Start / End Page

13321

Location

England

Related Subject Headings

  • Software
  • Sequence Analysis, DNA
  • Sensitivity and Specificity
  • Reproducibility of Results
  • Pattern Recognition, Automated
  • Molecular Sequence Data
  • Microsatellite Instability
  • Machine Learning
  • Humans
  • Databases, Genetic
 

Citation

APA
Huang, M. N., McPherson, J. R., Cutcutache, I., Teh, B. T., Tan, P., & Rozen, S. G. (2015). MSIseq: Software for Assessing Microsatellite Instability from Catalogs of Somatic Mutations. Sci Rep, 5, 13321. https://doi.org/10.1038/srep13321

Chicago
Huang, Mi Ni, John R. McPherson, Ioana Cutcutache, Bin Tean Teh, Patrick Tan, and Steven G. Rozen. “MSIseq: Software for Assessing Microsatellite Instability from Catalogs of Somatic Mutations.” Sci Rep 5 (August 26, 2015): 13321. https://doi.org/10.1038/srep13321.

ICMJE
Huang MN, McPherson JR, Cutcutache I, Teh BT, Tan P, Rozen SG. MSIseq: Software for Assessing Microsatellite Instability from Catalogs of Somatic Mutations. Sci Rep. 2015 Aug 26;5:13321.

MLA
Huang, Mi Ni, et al. “MSIseq: Software for Assessing Microsatellite Instability from Catalogs of Somatic Mutations.” Sci Rep, vol. 5, Aug. 2015, p. 13321. PubMed, doi:10.1038/srep13321.

NLM
Huang MN, McPherson JR, Cutcutache I, Teh BT, Tan P, Rozen SG. MSIseq: Software for Assessing Microsatellite Instability from Catalogs of Somatic Mutations. Sci Rep. 2015 Aug 26;5:13321.
