Browsing by Author "BAO, Zhifeng"
Now showing 1 - 3 of 3
- Automatic XML Keyword Query Refinement (2009-06-23T02:36:06Z)
  BAO, Zhifeng; LU, Jiaheng; LING, Tok Wang; MENG, Xiaofeng
  Existing XML keyword search methods focus on how to find relevant and meaningful data fragments for a keyword query, assuming each keyword is intended as part of it. However, user queries usually contain irrelevant or mismatched terms, spelling errors, etc., which cause the search results to be either empty or meaningless. In this paper, we introduce the problem of automatic XML keyword query refinement, where automatic means the search engine should be able to adaptively decide whether a query Q needs to be refined during the processing of Q, and at the same time find a list of promising refined query candidates and their matching results over XML data, without any user interaction or a second try. To achieve this goal, we build a primary framework which consists of two core parts: (1) We build a novel query ranking model to evaluate the quality of a refined query RQ, which takes into account the relevance of RQ with respect to Q over XML data, the morphological/semantic similarity between Q and RQ, and the dependence of the keywords of RQ in XML data. (2) We treat the exploration of RQ candidates and the generation of their matching results as a single problem solved while processing Q, which is accomplished optimally within a one-time scan of the related keyword inverted lists. Finally, an extensive empirical study verifies the efficiency and effectiveness of our framework.
- Fast SLCA and ELCA Computation for XML Keyword Query Based on Set Intersection Operation (2011-07-21T08:22:45Z)
  ZHOU, Junfeng; BAO, Zhifeng; WANG, Wei; TOK, Wang Ling; CHEN, Ziyang; LIN, Xudong; GUO, Jingfeng
  In this paper, we focus on efficient keyword query processing on XML data based on SLCA and ELCA semantics. We design the keyword inverted list so that it efficiently locates the nodes that directly or indirectly contain a given keyword. We propose a family of algorithms based on the set intersection operation to accelerate SLCA and ELCA computation. In essence, the problem of SLCA and ELCA computation becomes finding the set of common nodes that appear in all inverted lists and checking their satisfiability. We show that any existing set intersection algorithm can be adopted for SLCA and ELCA computation, and that our optimization techniques can be used together with any existing search method to improve the overall performance. Experiments verify that our methods outperform existing methods by more than two orders of magnitude in most cases.
- Schema-independence in XML Keyword Search (2014-06-24)
  LE, Thuy Ngoc; BAO, Zhifeng; LING, Tok Wang
  XML keyword search has attracted a lot of interest, with typical search based on the lowest common ancestor (LCA). However, in this paper, we show that meaningful answers can be found beyond LCA and should be independent of the schema design used for the same data content. Therefore, we propose a new semantics, called CR (Common Relative), which not only finds more answers beyond LCA but also returns answers that are independent of the schema design. To find answers based on the CR semantics, we propose an approach with new strategies for indexing and processing. Experimental results show that the CR semantics can improve recall significantly and that the answer set is independent of the schema design.
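
As an illustration of the set-intersection idea summarized in the SLCA/ELCA entry above, the following is a minimal sketch and not the authors' algorithm. It assumes nodes are identified by Dewey labels represented as Python tuples, and that each keyword's inverted list stores every node that directly or indirectly contains the keyword. The function names (`ancestors_or_self`, `build_lists`, `slca`) and the sample data are hypothetical. Only SLCA is covered; ELCA needs an additional check that is omitted here.

```python
def ancestors_or_self(dewey):
    """Return the node and all of its ancestors, e.g. (1, 2, 3) -> {(1,), (1, 2), (1, 2, 3)}."""
    return {dewey[:i] for i in range(1, len(dewey) + 1)}


def build_lists(direct_matches):
    """Expand each keyword's direct matches into an ancestor-closed set of nodes.

    This is an assumed, simplified stand-in for an inverted list that records
    nodes containing a keyword directly or indirectly.
    """
    return {
        kw: set().union(*(ancestors_or_self(node) for node in nodes))
        for kw, nodes in direct_matches.items()
    }


def slca(direct_matches, query):
    """Compute smallest LCAs: common nodes with no descendant that is also common."""
    if not query:
        return []
    lists = build_lists(direct_matches)
    # Nodes that appear in every keyword's list contain all query keywords.
    common = set.intersection(*(lists.get(kw, set()) for kw in query))
    # Because every list is ancestor-closed, a common node has a common
    # descendant exactly when it is the parent of some common node.
    parents = {node[:-1] for node in common if len(node) > 1}
    return sorted(common - parents)


# Hypothetical document: root (1,), two papers (1, 1) and (1, 2),
# each with a title child (.., 1) and an author child (.., 2).
matches = {
    "xml": [(1, 1, 1), (1, 2, 1)],
    "bao": [(1, 1, 2)],
    "ling": [(1, 2, 2)],
}

print(slca(matches, ["xml", "bao"]))
```

In this sample, only the first paper contains both `xml` and `bao`, so the query resolves to that paper's node rather than to the root. A production implementation would intersect sorted lists with a merge or galloping strategy instead of materializing Python sets.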