A maximum-likelihood base caller for DNA sequencing.
The procedures used to sequence the human genome involve the electrophoretic separation of mixtures of deoxyribonucleic acid (DNA) fragments tagged with reporting groups, usually fluorescent dyes. Each fluorescent pulse arriving from the optical detector corresponds to a nucleotide (base) in the DNA sequence, and the subsequent process of identifying bases from these pulses is known as base calling. Generating longer and more accurate sequences in the base-calling process will reduce the high cost of DNA sequencing. This paper presents an automated base-calling algorithm, referred to as the maximum-likelihood base caller (MLB), which is based on maximum-likelihood equalization for digital communication channels. On 125 experimental datasets, MLB averaged up to 40% fewer errors than the widely used ABI base caller from the Applied Biosystems Division of PE Corporation. MLB's accuracy rivaled that of another well-known base caller, Phred, and surpassed it on datasets with high background noise.
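The abstract does not describe MLB's internals, so the following is only a generic illustration of the underlying idea, maximum-likelihood sequence estimation for a channel with intersymbol interference, and not the authors' algorithm. It assumes a known finite-impulse-response channel, additive Gaussian noise (so the most likely sequence is the one with the smallest squared error), and a toy two-symbol alphabet instead of four bases. The function name `mlse_viterbi` and all parameters are hypothetical.

```python
def mlse_viterbi(received, taps, alphabet=(-1.0, 1.0)):
    """Return the symbol sequence whose channel output is closest, in
    squared error, to `received`, for a known FIR channel `taps`.

    Assumption: symbols before the first sample are zero (idle channel).
    """
    memory = len(taps) - 1
    start = (0.0,) * memory
    # Map each state (the last `memory` symbols, most recent first)
    # to the best (accumulated cost, decoded path) that reaches it.
    survivors = {start: (0.0, [])}
    for y in received:
        candidates = {}
        for state, (cost, path) in survivors.items():
            for symbol in alphabet:
                window = (symbol,) + state
                # Noise-free output the channel would produce for this window.
                predicted = sum(h * x for h, x in zip(taps, window))
                new_cost = cost + (y - predicted) ** 2
                new_state = window[:memory]
                best = candidates.get(new_state)
                if best is None or new_cost < best[0]:
                    candidates[new_state] = (new_cost, path + [symbol])
        survivors = candidates
    # The lowest-cost survivor is the maximum-likelihood sequence.
    return min(survivors.values(), key=lambda entry: entry[0])[1]


# Symbols [1, -1, -1, 1, 1, -1] sent through taps [1.0, 0.5],
# with a small perturbation added to each sample by hand.
noisy = [1.1, -0.4, -1.6, 0.6, 1.4, -0.6]
decoded = mlse_viterbi(noisy, taps=[1.0, 0.5])
```

In a base-calling setting the alphabet would be the four bases and the channel model would have to capture peak shape and spacing in the electropherogram; those details are outside what the abstract states.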
Related Subject Headings
- Sequence Analysis, DNA
- Likelihood Functions
- Humans
- Databases, Factual
- DNA
- Biomedical Engineering
- Base Sequence
- Algorithms
- 4603 Computer vision and multimedia computation