ICRA: Effective Semantics for Ranked XML Keyword Search

dc.contributor.author: CHEN, Bo
dc.contributor.author: LU, Jiaheng
dc.contributor.author: LING, Tok Wang
dc.date.accessioned: 2007-06-04T01:04:52Z
dc.date.accessioned: 2017-01-23T07:00:23Z
dc.date.available: 2007-06-04T01:04:52Z
dc.date.available: 2017-01-23T07:00:23Z
dc.date.issued: 2007-05-30
dc.description.abstract: Keyword search is a user-friendly way to query XML databases. Most previous efforts in this area focus on keyword proximity search in XML based on either a tree data model or a graph (or digraph) data model. The tree data model cannot capture connections such as ID references in XML databases. In contrast, techniques based on the graph (or digraph) data model capture these connections but are generally inefficient to compute. In this paper, we propose an interconnected object trees model for keyword search that achieves the efficiency of the tree model while capturing connections such as ID references in XML by fully exploiting the property and schema information of XML databases. In particular, we propose ICA (Interested Common Ancestor) semantics to find all predefined interested objects that contain all query keywords. We also introduce novel IRA (Interested Related Ancestors) semantics to capture the conceptual connections between interested objects and to include more objects that contain only some of the query keywords. We then study a novel ranking metric, RelevanceRank, which dynamically assigns higher ranks to objects that are more relevant to a given keyword query according to the conceptual connections in IRAs. We design and analyze efficient algorithms for keyword search based on our data model, and experimental results show that our approach is efficient and outperforms most existing systems in terms of result quality. A prototype of our ICRA system (ICRA = ICA + IRA) on the updated 321M DBLP data is available at http://xmldb.ddns.comp.nus.edu.sg/.
dc.format.extent: 6625370 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: https://dl.comp.nus.edu.sg/xmlui/handle/1900.100/2553
dc.language.iso: en
dc.relation.ispartofseries: TRC5/07
dc.title: ICRA: Effective Semantics for Ranked XML Keyword Search
dc.type: Technical Report
Files
Original bundle
Name: TRC5-07.pdf
Size: 6.32 MB
Format: Adobe Portable Document Format