TJFast: Efficient Processing of XML Twig Pattern Matching

dc.contributor.authorLU, Jiahengen_US
dc.contributor.authorCHEN, Tingen_US
dc.contributor.authorLING, Tok Wangen_US
dc.date.accessioned2004-11-29T06:46:05Zen_US
dc.date.accessioned2017-01-23T06:59:42Z
dc.date.available2004-11-29T06:46:05Zen_US
dc.date.available2017-01-23T06:59:42Z
dc.date.issued2004-11-29T06:46:05Zen_US
dc.description.abstractFinding all the occurrences of a twig pattern in an XML database is a core operation for efficient evaluation of XML queries. Previous twig join algorithms were proposed as an I/O optimal solution when the twig pattern only involves ancestor-descendant relationships. In this paper, we propose a new efficient holistic twig join algorithm, namely TJFast, which is based on a variation of Dewey ID labeling scheme. In order to answer a twig query, TJFast only needs to access the labels of elements whose tags appear in the leaf nodes of query. TJFast guarantees the I/O optimality for queries with only ancestor-descendant edges connecting branching nodes and their child nodes. In other words, unlike previous algorithms, the optimality of TJFast allows parent-child edges to appear in some edges. Besides TJFast, we also present a novel streaming strategy, called tag+level streaming, where two elements appear in the same stream if and only if they have the same tag name and level number. We develop a twig join algorithm, called TJFast+ based on tag+level streaming. We show that TJFast+ can identify even larger query class to guarantee I/O optimality than TJFast. Therefore, in this paper, by proposing TJFast/TJFast, we significantly enlarge the optimal query class compared to previous algorithms. Finally, we report our experimental results to show that our algorithms are superior to previous approaches in terms of the number of elements scanned, the size of intermediate results and query performance.en_US
dc.format.extent387216 bytesen_US
dc.format.mimetypeapplication/pdfen_US
dc.identifier.urihttps://dl.comp.nus.edu.sg/xmlui/handle/1900.100/1516en_US
dc.language.isoenen_US
dc.relation.ispartofseriesTR21/04en_US
dc.titleTJFast: Efficient Processing of XML Twig Pattern Matchingen_US
dc.typeTechnical Reporten_US
Files
Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
upload.pdf
Size:
378.14 KB
Format:
Adobe Portable Document Format
Description:
License bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
license.txt
Size:
1.53 KB
Format:
Plain Text
Description: