BROAD: Diversified Keyword Search in Databases

ZHAO, Feng; ZHANG, Xiaolong; TUNG, Anthony K. H.; CHEN, Gang

Files

TRD3-11.pdf(2.06 MB)

Date

2011-03-21T08:18:20Z

Abstract

Keyword search in databases has received a lot of attention in the database community, as it is an effective approach for querying a database without knowing its underlying schema. However, keyword search queries often return too many results. One standard solution is to rank results such that the “best” results appear first. Still, this approach can suffer from a redundancy problem, where many high-ranking results in fact come from the same part of the database while results in other parts of the database are missed completely. In this paper, we propose the BROAD system, which allows users to perform diverse, hierarchical browsing on keyword search results. Our system partitions the answer trees in the keyword search results by selecting k diverse representatives from the trees, separating the answer trees into k groups based on their similarity to the representatives, and then recursively applying the partitioning to each group. By constructing a summarized result for the answer trees in each of the k groups, we provide a way for users to quickly locate the results that they desire. Technically, our solution consists of three components. First, a new distance metric is used to capture both semantic and structural dissimilarity between answer trees. Second, based on this metric, we propose a tree-based algorithm to efficiently achieve result diversification. Finally, by coupling our partitioning solution with result summarization techniques, we allow users to decide which partition to drill down into in order to obtain their intended answers. Extensive experiments were conducted, and the results validate the feasibility and the efficiency of our system.
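
The partitioning loop described in the abstract (pick k diverse representatives, assign each answer tree to its closest representative, recurse into each group) can be illustrated with a minimal sketch. This is not the report's algorithm: the greedy farthest-first selection below is a swapped-in generic heuristic, the Jaccard distance over node-label sets is a hypothetical stand-in for the report's semantic and structural metric, and all names (`jaccard_distance`, `pick_representatives`, `partition`, `leaf_size`) are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, FrozenSet, List, Optional, Sequence

Tree = FrozenSet[str]  # assumption: an answer tree reduced to its set of node labels


def jaccard_distance(a: Tree, b: Tree) -> float:
    """Hypothetical stand-in metric: 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pick_representatives(items: Sequence[Tree], k: int,
                         dist: Callable[[Tree, Tree], float]) -> List[int]:
    """Greedy farthest-first selection; returns indices into items."""
    reps = [0]  # start from the first item (e.g. the top-ranked result)
    min_d = [dist(items[0], x) for x in items]  # distance to nearest chosen rep
    while len(reps) < min(k, len(items)):
        nxt = max(range(len(items)), key=lambda i: min_d[i])
        if min_d[nxt] == 0:
            break  # every remaining item coincides with a chosen representative
        reps.append(nxt)
        min_d = [min(min_d[i], dist(items[nxt], items[i]))
                 for i in range(len(items))]
    return reps


def partition(items: Sequence[Tree], k: int,
              dist: Callable[[Tree, Tree], float],
              leaf_size: int = 2,
              representative: Optional[Tree] = None) -> Dict[str, Any]:
    """Recursively split items into up to k groups around diverse representatives."""
    node: Dict[str, Any] = {"representative": representative,
                            "items": list(items), "children": []}
    if len(items) <= leaf_size:
        return node  # small enough to show to the user directly
    reps = pick_representatives(items, k, dist)
    if len(reps) < 2:
        return node  # nothing to separate: all items are at distance 0
    groups: Dict[int, List[Tree]] = {r: [] for r in reps}
    for x in items:
        nearest = min(reps, key=lambda r: dist(items[r], x))
        groups[nearest].append(x)
    node["children"] = [partition(groups[r], k, dist, leaf_size, items[r])
                        for r in reps]
    return node


# Toy input: two results from one "part" of the data, two from another.
demo_trees = [frozenset({"a", "b"}), frozenset({"a", "b", "c"}),
              frozenset({"x", "y"}), frozenset({"x", "y", "z"})]
hierarchy = partition(demo_trees, k=2, dist=jaccard_distance)
```

In this toy run the two children of `hierarchy` are headed by `{"a", "b"}` and `{"x", "y"}`, so a user browsing the top level sees one representative from each region of the data rather than two near-duplicates. A summarization step, which the abstract describes but this sketch omits, would attach a summary to each child node.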

URI

https://dl.comp.nus.edu.sg/xmlui/handle/1900.100/3348

Collections

Technical Reports