Identifying structural motifs in proteins.
In biological macromolecules, structural patterns (motifs) are often repeated across different molecules. Detecting these common motifs in a new molecule can provide useful clues to its functional properties. First, we formulate the problem of identifying a given structural motif (pattern) in a target protein (example) and discuss the notion of complete matches vis-a-vis partial matches. We describe the precise error criterion that has to be minimized and discuss different metrics for evaluating the quality of partial matches. Second, we present a new polynomial-time algorithm for matching a given motif in a target protein. We also use the sequence and (if available) secondary structure information to annotate the points in the motif and the target protein, thus reducing the size of the search space. Our algorithm guarantees the detection of a perfect match, if one is present. When no perfect match exists, the algorithm still computes very good matches. Unlike other methods, the error minimized by our algorithm translates directly to root mean square deviation (RMSD), the most commonly accepted metric for structure matching in biological macromolecules. The algorithm does not involve any preprocessing and is suitable for detecting both small and large motifs in the target protein. We also present experiments exploring the quality of the matches found by the algorithm, and we examine its performance in matching (both fully and partially) active sites in proteins.
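The RMSD metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's matching algorithm: it only evaluates the RMSD of a motif against a candidate match, assuming that the point correspondence is already known (paired by index) and that the two point sets are already superimposed. The function name `rmsd` and the sample coordinates are hypothetical, chosen for illustration only.

```python
import math


def rmsd(motif, matched):
    """Root mean square deviation between two lists of (x, y, z) points.

    Assumption: points are paired by index and the two sets are already
    superimposed; no optimal rotation or translation is computed here.
    """
    if len(motif) != len(matched) or not motif:
        raise ValueError("point sets must be non-empty and of equal length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(motif, matched):
        # Squared Euclidean distance between corresponding points.
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(motif))


# Hypothetical three-point motif (coordinates in arbitrary units).
motif = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

# A perfect match coincides with the motif, so its RMSD is zero.
perfect = list(motif)

# A candidate displaced by 2 units along z deviates by 2 at every point.
shifted = [(x, y, z + 2.0) for x, y, z in motif]

print(rmsd(motif, perfect))
print(rmsd(motif, shifted))
```

In practice a candidate match would first be optimally superimposed on the motif before the deviation is measured; that step is omitted here to keep the sketch self-contained.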
Related Subject Headings
- Sequence Alignment
- Proteins
- Molecular Structure
- Models, Molecular
- Computer Simulation
- Binding Sites
- Amino Acid Sequence
- Algorithms