Browsing by Author "CHEN, Ting"
Now showing 1 - 2 of 2
Results Per Page
Sort Options
- ItemFrom region encoding to extended Dewey: on efficient processing of XML twig pattern matching(2005-07-04T09:27:43Z) LU, Jiaheng; LING, Tok Wang; CHAN, Chee-Yong; CHEN, TingFinding all the occurrences of a twig pattern in an XML database is a core operation for efficient evaluation of XML queries. A number of algorithms have been proposed to process a twig query based on region encoding labeling scheme. While region encoding supports efficient determination of ancestor-descendant (or parent-child) relationship between two elements, we observe that the information within a single label is very limited. In this paper, we propose a new labeling scheme, called extended Dewey. This is a powerful labeling scheme, since from the label of an element alone, we can derive all the elements names along the path from the root to the element. Based on extended Dewey, we design a novel holistic twig join algorithm, called TJFast. Unlike all previous algorithms based on region encoding, to answer a twig query, TJFast only needs to access the labels of the leaf query nodes. Through this, not only do we reduce disk access, but we also support the efficient evaluation of queries with wildcards in branching nodes, which is very difficult to be answered by algorithms based on region encoding. Finally, we report our experimental results to show that our algorithms are superior to previous approaches in terms of the number of elements scanned, the size of intermediate results and query performance.
- ItemTJFast: Efficient Processing of XML Twig Pattern Matching(2004-11-29T06:46:05Z) LU, Jiaheng; CHEN, Ting; LING, Tok WangFinding all the occurrences of a twig pattern in an XML database is a core operation for efficient evaluation of XML queries. Previous twig join algorithms were proposed as an I/O optimal solution when the twig pattern only involves ancestor-descendant relationships. In this paper, we propose a new efficient holistic twig join algorithm, namely TJFast, which is based on a variation of Dewey ID labeling scheme. In order to answer a twig query, TJFast only needs to access the labels of elements whose tags appear in the leaf nodes of query. TJFast guarantees the I/O optimality for queries with only ancestor-descendant edges connecting branching nodes and their child nodes. In other words, unlike previous algorithms, the optimality of TJFast allows parent-child edges to appear in some edges. Besides TJFast, we also present a novel streaming strategy, called tag+level streaming, where two elements appear in the same stream if and only if they have the same tag name and level number. We develop a twig join algorithm, called TJFast+ based on tag+level streaming. We show that TJFast+ can identify even larger query class to guarantee I/O optimality than TJFast. Therefore, in this paper, by proposing TJFast/TJFast, we significantly enlarge the optimal query class compared to previous algorithms. Finally, we report our experimental results to show that our algorithms are superior to previous approaches in terms of the number of elements scanned, the size of intermediate results and query performance.