Skip to main content

Multiresolution indexing of XML for frequent queries

Publication ,  Journal Article
He, H; Yang, J
Published in: Proceedings International Conference on Data Engineering
June 1, 2004

XML and other types of semi-structured data are typically represented by a labeled directed graph. To speed up path expression queries over the graph, a variety of structural indexes have been proposed. They usually work by partitioning nodes in the data graph into equivalence classes and storing equivalence classes as index nodes. A(k)-index introduces the concept of local bisimilarity for partitioning, allowing the trade-off between index size and query answering power. However, all index nodes in A(k)-index have the same local similarity k, which cannot take advantage of the fact that a workload may contain path expressions of different lengths, or that different parts of the data graph may have different local similarity requirements. To overcome these limitations, we propose M(k)- and M*(k)-indexes. The basic M(k)-index is workload-aware: Like the previously proposed D(k)-index, it allows different index nodes to have different local similarity requirements, providing finer partitioning only for parts of the data graph targeted by longer path expressions. Unlike D(k)-index, M(k)-index is never over-refined for irrelevant index or data nodes. However, the workload-aware feature still incurs overrefinement due to over-qualified parent index nodes. Moreover, fine partitions penalize the performance of short path expressions. To solve these problems, we further propose the M*(k)-index. An M*(k)-index consists of a collection of indexes whose nodes are organized in a partition hierarchy, allowing successively coarser partitioning information to co-exist with the finest partitioning information required. Experiments show that our indexes are superior to previously proposed indexes in terms of index size and query performance.

Duke Scholars

Published In

Proceedings International Conference on Data Engineering

Publication Date

June 1, 2004

Volume

20

Start / End Page

683 / 694
 

Citation

APA
Chicago
ICMJE
MLA
NLM
He, H., & Yang, J. (2004). Multiresolution indexing of XML for frequent queries. Proceedings International Conference on Data Engineering, 20, 683–694.
He, H., and J. Yang. “Multiresolution indexing of XML for frequent queries.” Proceedings International Conference on Data Engineering 20 (June 1, 2004): 683–94.
He H, Yang J. Multiresolution indexing of XML for frequent queries. Proceedings International Conference on Data Engineering. 2004 Jun 1;20:683–94.
He, H., and J. Yang. “Multiresolution indexing of XML for frequent queries.” Proceedings International Conference on Data Engineering, vol. 20, June 2004, pp. 683–94.
He H, Yang J. Multiresolution indexing of XML for frequent queries. Proceedings International Conference on Data Engineering. 2004 Jun 1;20:683–694.

Published In

Proceedings International Conference on Data Engineering

Publication Date

June 1, 2004

Volume

20

Start / End Page

683 / 694