BROAD: Diversified Keyword Search in Databases

dc.contributor.author: ZHAO, Feng [en_US]
dc.contributor.author: ZHANG, Xiaolong [en_US]
dc.contributor.author: TUNG, Anthony K. H. [en_US]
dc.contributor.author: CHEN, Gang [en_US]
dc.date.accessioned: 2011-03-21T08:18:20Z [en_US]
dc.date.accessioned: 2017-01-23T07:00:17Z
dc.date.available: 2011-03-21T08:18:20Z [en_US]
dc.date.available: 2017-01-23T07:00:17Z
dc.date.issued: 2011-03-21T08:18:20Z [en_US]
dc.description.abstract: Keyword search in databases has received a lot of attention in the database community as it is an effective approach for querying a database without knowing its underlying schema. However, keyword search queries often return too many results. One standard solution is to rank results such that the “best” results appear first. Still, this approach can suffer from a redundancy problem, where many high-ranking results in fact come from the same part of the database while results in other parts of the database are missed completely. In this paper, we propose the BROAD system, which allows users to perform diverse, hierarchical browsing on keyword search results. Our system partitions the answer trees in the keyword search results by selecting k diverse representatives from the trees, separating the answer trees into k groups based on their similarity to the representatives, and then recursively applying the partitioning to each group. By constructing a summarized result for the answer trees in each of the k groups, we provide a way for users to quickly locate the results that they desire. Technically, our solution consists of three components. First, a new distance metric is used to capture both semantic and structural dissimilarity between answer trees. Second, based on this metric, we propose a tree-based algorithm to efficiently achieve result diversification. Finally, by coupling our partitioning solution with result summarization techniques, we allow users to decide which partition to drill down into in order to obtain their intended answers. Extensive experiments were conducted and the results validate the feasibility and the efficiency of our system. [en_US]
dc.format.extent: 2163470 bytes [en_US]
dc.format.mimetype: application/pdf [en_US]
dc.identifier.uri: https://dl.comp.nus.edu.sg/xmlui/handle/1900.100/3348 [en_US]
dc.language.iso: en [en_US]
dc.relation.ispartofseries: TRD3/11 [en_US]
dc.title: BROAD: Diversified Keyword Search in Databases [en_US]
dc.type: Technical Report [en_US]
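
The abstract describes partitioning answer trees by picking k diverse representatives, grouping the remaining trees around them, and recursing into each group. This record does not give the report's distance metric or its tree-based algorithm, so the following is only a minimal sketch under stated assumptions: answer trees are stood in for by sets of labels, the distance is plain Jaccard distance, and representatives are chosen with a greedy farthest-first heuristic. The names `jaccard`, `pick_representatives`, and `partition` are hypothetical and are not taken from BROAD.

```python
def jaccard(a, b):
    """Assumed stand-in distance: 1 - |a & b| / |a | b| on label sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def pick_representatives(items, k, dist):
    """Greedy farthest-first choice of up to k item indices.

    This is a generic diversification heuristic, not the report's algorithm.
    """
    chosen = [0]
    while len(chosen) < min(k, len(items)):
        candidates = [i for i in range(len(items)) if i not in chosen]
        # Take the candidate whose closest representative is farthest away.
        nxt = max(
            candidates,
            key=lambda i: min(dist(items[i], items[j]) for j in chosen),
        )
        chosen.append(nxt)
    return chosen


def partition(items, k, dist):
    """Build a browsing hierarchy: each node keeps its items and child groups."""
    items = list(items)
    node = {"items": items, "children": []}
    if k < 2 or len(items) <= k:
        return node  # small enough to show directly
    reps = pick_representatives(items, k, dist)
    groups = {r: [] for r in reps}
    for item in items:
        # Assign each item to its nearest representative (ties go to the first).
        nearest = min(reps, key=lambda r: dist(item, items[r]))
        groups[nearest].append(item)
    for r, members in groups.items():
        if not members:
            continue
        if len(members) < len(items):
            child = partition(members, k, dist)
        else:
            # Group did not shrink; stop here to guarantee termination.
            child = {"items": members, "children": []}
        child["representative"] = items[r]
        node["children"].append(child)
    return node


# Toy "answer trees", each reduced to a set of labels (illustrative data only).
trees = [
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"x", "y"}),
    frozenset({"x", "z"}),
    frozenset({"p", "q"}),
]
hierarchy = partition(trees, 2, jaccard)
```

Each child of `hierarchy` carries a `representative` that a user interface could display, or summarize, before the user drills down into that group. A real system would replace `jaccard` with a metric that also accounts for tree structure, as the abstract indicates.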
Files
Original bundle
Name: TRD3-11.pdf
Size: 2.06 MB
Format: Adobe Portable Document Format