
Structure-based querying of proteins using wavelets

Publication, Conference
Marsolo, K; Parthasarathy, S; Ramamohanarao, K
Published in: International Conference on Information and Knowledge Management, Proceedings
December 1, 2006

The ability to retrieve molecules based on structural similarity has use in many applications, from disease diagnosis and treatment to drug discovery and design. In this paper, we present a method to represent protein molecules that allows for the fast, flexible and efficient retrieval of similar structures, based on either global or local attributes. We begin by computing the pair-wise distance between amino acids, transforming each 3D structure into a 2D distance matrix. We normalize this matrix to a specific size and apply a 2D wavelet decomposition to generate a set of approximation coefficients, which serves as our global feature vector. This transformation reduces the overall dimensionality of the data while still preserving spatial features and correlations. We test our method by running queries on three different protein data sets that have been used previously in the literature, basing our comparisons on labels taken from the SCOP database. We find that our method significantly outperforms existing approaches, in terms of retrieval accuracy, memory utilization and execution time. Specifically, using a k-d tree and running a 10-nearest-neighbor search on a dataset of 33,000 proteins against itself, we see an average accuracy of 89% at the SCOP SuperFamily level and a total query time that is up to 350 times faster than previously published techniques. In addition to processing queries based on global similarity, we also propose innovative extensions to effectively match proteins based solely on shared local substructures, allowing for a more flexible query interface. Copyright 2006 ACM.
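The pipeline described in the abstract (pair-wise distance matrix, normalization to a fixed size, 2D wavelet decomposition, nearest-neighbor query) can be illustrated with a minimal sketch. This is not the authors' implementation; several choices are assumptions because the abstract does not specify them: the Haar wavelet, the 64 x 64 normalized size, three decomposition levels, nearest-index resampling, and the function names themselves. A brute-force neighbor search stands in for the k-d tree to keep the example dependent on NumPy only.

```python
import numpy as np

def distance_matrix(coords):
    """Pair-wise Euclidean distances between residues: (n, 3) -> (n, n)."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def resize_nearest(mat, size):
    """Resample a square matrix to size x size by nearest-index sampling.
    The resampling scheme is an assumption, not taken from the paper."""
    n = mat.shape[0]
    idx = (np.arange(size) * n) // size
    return mat[np.ix_(idx, idx)]

def haar_approximation(mat, levels):
    """Approximation (low-pass) coefficients of an orthonormal 2D Haar
    transform. Each level sums 2 x 2 blocks and divides by 2, halving
    both dimensions. The matrix side must be divisible by 2**levels."""
    out = mat.astype(float)
    for _ in range(levels):
        out = (out[0::2, 0::2] + out[0::2, 1::2]
               + out[1::2, 0::2] + out[1::2, 1::2]) / 2.0
    return out

def global_feature(coords, size=64, levels=3):
    """Fixed-length global feature vector for one structure.
    size and levels are hypothetical defaults chosen for illustration."""
    dist = distance_matrix(np.asarray(coords, dtype=float))
    return haar_approximation(resize_nearest(dist, size), levels).ravel()

def nearest_neighbors(features, query, k):
    """Indices of the k feature vectors closest to query (brute force,
    used here in place of the k-d tree mentioned in the abstract)."""
    dists = np.linalg.norm(features - query, axis=1)
    return np.argsort(dists)[:k]

# Synthetic stand-ins for residue coordinates of chains of varying length.
rng = np.random.default_rng(0)
chains = [rng.normal(size=(n, 3)) for n in (50, 80, 120, 200)]
features = np.stack([global_feature(c) for c in chains])
print(features.shape)
print(nearest_neighbors(features, features[2], k=2))
```

With these assumed settings every structure, regardless of chain length, is reduced to an 8 x 8 block of approximation coefficients (64 values), which is what makes a fixed-dimension index such as a k-d tree applicable.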


Published In

International Conference on Information and Knowledge Management, Proceedings

DOI

10.1145/1183614.1183622

ISBN

9781595934338

Publication Date

December 1, 2006

Start / End Page

24 / 33
 

Citation

APA: Marsolo, K., Parthasarathy, S., & Ramamohanarao, K. (2006). Structure-based querying of proteins using wavelets. In International Conference on Information and Knowledge Management, Proceedings (pp. 24–33). https://doi.org/10.1145/1183614.1183622

Chicago: Marsolo, K., S. Parthasarathy, and K. Ramamohanarao. “Structure-based querying of proteins using wavelets.” In International Conference on Information and Knowledge Management, Proceedings, 24–33, 2006. https://doi.org/10.1145/1183614.1183622.

ICMJE: Marsolo K, Parthasarathy S, Ramamohanarao K. Structure-based querying of proteins using wavelets. In: International Conference on Information and Knowledge Management, Proceedings. 2006. p. 24–33.

MLA: Marsolo, K., et al. “Structure-based querying of proteins using wavelets.” International Conference on Information and Knowledge Management, Proceedings, 2006, pp. 24–33. Scopus, doi:10.1145/1183614.1183622.

NLM: Marsolo K, Parthasarathy S, Ramamohanarao K. Structure-based querying of proteins using wavelets. International Conference on Information and Knowledge Management, Proceedings. 2006. p. 24–33.
