Flexible infrastructure for gathering XML statistics and estimating query cardinality

Publication Type Journal Article
School or College College of Engineering
Department Computing, School of
Creator Freire, Juliana
Other Author Ramanath, Maya; Zhang, Lingzhi
Title Flexible infrastructure for gathering XML statistics and estimating query cardinality
Date 2004
Description A key component of XML data management systems is the result size estimator, which estimates the cardinalities of user queries. Estimated cardinalities are needed in a variety of tasks, including query optimization and cost-based storage design, and they can also give users early feedback about the expected outcome of their queries. In [2], we proposed StatiX. In contrast to previously proposed result estimators, which use specialized data structures and estimation algorithms, StatiX uses histograms to uniformly capture both the structural and value skew present in documents. It also leverages schema information to produce high-quality and concise statistical summaries. In particular, it exploits XML Schema transformations [1] to obtain statistics at different granularities.
Type Text
Publisher Institute of Electrical and Electronics Engineers (IEEE)
First Page 857
Subject Query cardinality; StatiX++; XML schema
Subject LCSH XML (Document markup language); Database management; Query languages (Computer science); Internet searching
Language eng
Bibliographic Citation Freire, J., Ramanath, M., & Zhang, L. (2004). Flexible infrastructure for gathering XML statistics and estimating query cardinality. Proceedings of the 20th International Conference on Data Engineering, ICDE 2004, 30 Mar - 2 Apr 2004, Boston, MA, USA, 857.
Rights Management © 2004 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Format Medium application/pdf
Format Extent 36,289 bytes
Identifier ir-main,12358
ARK ark:/87278/s6668x9p
Setname ir_uspace
ID 702709
Reference URL https://collections.lib.utah.edu/ark:/87278/s6668x9p