Browsing by Author "LI, Luochen"
- Discovering Semantics from Data-Centric XML (2013-06-17). LI, Luochen; LE, Thuy Ngoc; WU, Huayu; LING, Tok Wang; BRESSAN, Stephane. In database applications in general, and in applications using XML in particular, the availability of a conceptual schema or of elements of semantics constitutes invaluable leverage for improving the effectiveness, and sometimes the efficiency, of many tasks, including query processing, keyword search, and schema and data integration. The Object-Relationship-Attribute model for Semi-Structured data (ORA-SS) is a conceptual model designed to capture the semantics of the important constructs in a variety of data models in general and in semi-structured data models in particular. It is specifically intended to capture the semantics of object classes, object identifiers, relationship types, object attributes and relationship attributes underlying XML schemas and data. For a given application, we refer to the set of instances of these semantic concepts as the ORA-semantics of the application. While ORA-SS can be used a priori for the design of new applications, we are interested, in this work, in automatically discovering the ORA-semantics from existing XML data and XML schemas. In this paper, we present a novel approach to automatically discover the ORA-semantics from XML schemas and XML data. The approach we propose is based on a set of techniques and heuristics that identify the different semantic concepts. We empirically and comparatively evaluate the effectiveness of the approach. We show by experiments that the semantics discovered by our approach have an accuracy of more than 90%.
- Discovering Semantics from XML (2012-03-26T01:54:23Z). LI, Luochen; LE, Thuy Ngoc; LING, Tok Wang; WU, Huayu; BRESSAN, Stephane. In database applications in general, and in applications using XML in particular, the availability of a conceptual schema or of elements of semantics constitutes invaluable leverage for improving the effectiveness, and sometimes the efficiency, of many tasks, including query processing, keyword search, and schema and data integration. The Object-Relationship-Attribute model for Semi-Structured data (ORA-SS) is a conceptual model designed to capture the semantics of the important constructs in a variety of data models in general and in semi-structured data models in particular. It is specifically intended to capture the semantics of object classes, object identifiers, relationship types, object attributes and relationship attributes underlying XML schemas and data. For a given application, we refer to the set of instances of these semantic concepts as the ORA-semantics of the application. While ORA-SS can be used a priori for the design of new applications, we are interested, in this work, in automatically discovering the ORA-semantics from existing XML data and schemas. In this paper, we present a novel approach to automatically discover the ORA-semantics from XML schemas and XML data. The approach we propose is based on a set of techniques and heuristics that identify the different semantic concepts. We empirically and comparatively evaluate the effectiveness of the approach.
- From Revisiting the LCA-based Approach to a New Semantics-based Approach for XML Keyword Search (2011-05-30). LE, Thuy Ngoc; WU, Huayu; LING, Tok Wang; LI, Luochen. Most keyword search approaches for data-centric XML documents, such as SLCA and MLCA, are based on the computation of Lowest Common Ancestors (LCA). In this paper, we show that the LCA is not always a correct search model for processing keyword queries over general XML data. In particular, when an XML database contains relationships among objects, which is quite common in practical data, LCA-based search may not be able to find the desired answers for many keyword queries. We propose to use the semantics instead of the structure of XML data to perform keyword search, and show that semantics-based search can solve the problems of the LCA-based approach. To the best of our knowledge, this is the first work to point out serious problems of the LCA-based XML keyword search approach and to propose an approach that addresses those problems by performing XML keyword search based on semantics rather than on the hierarchical structure of XML data.
- Problems of LCA and Impact of ORA-semantics in XML Keyword Search (2012-03-26T02:13:57Z). LE, Thuy Ngoc; WU, Huayu; LING, Tok Wang; LI, Luochen. Most keyword search approaches for data-centric XML documents are based on the computation of Lowest Common Ancestors (LCA). However, LCA-based search methods depend heavily on the hierarchical structure of XML data. Because a relationship among objects in XML data can be represented in different hierarchical structures, these methods may not be able to find the desired answers for many keyword queries. In this paper, we first point out serious problems of the LCA-based approach that are due to its unawareness of the semantics of objects, relationships and attributes, referred to as ORA-semantics. Through detailed analysis of these problems, we show the impact of ORA-semantics in XML keyword search. We then propose an ORA-semantics based approach with rules to infer expected answers for XML keyword queries. Experimental results show that our ORA-semantics based approach can resolve the problems of the LCA-based approach, and thus can be a promising research direction for XML keyword search.
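
The last two abstracts above argue that LCA-based keyword search depends on how XML data happens to be nested. The following is a minimal sketch of that dependence, not the authors' algorithm or implementation. It assumes nodes are identified by Dewey labels (tuples of child positions), so that the LCA of a set of nodes is the longest common prefix of their labels. The helper names (`lca`, `lca_candidates`, `smallest_lcas`) and the student/course document are hypothetical and do not come from the papers.

```python
from itertools import product


def lca(labels):
    """Return the longest common prefix of Dewey labels (tuples of ints)."""
    prefix = []
    for parts in zip(*labels):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return tuple(prefix)


def lca_candidates(matches_per_keyword):
    """LCA of every combination that picks one matching node per keyword."""
    return {lca(combo) for combo in product(*matches_per_keyword)}


def smallest_lcas(candidates):
    """Keep only candidates with no proper descendant among the candidates."""
    return {
        c for c in candidates
        if not any(d != c and d[:len(c)] == c for d in candidates)
    }


# Hypothetical document: two students nested under the root (0,), each
# containing its own copy of the same course.
#   (0,0) student: (0,0,0) name "Anna", (0,0,1,0) course title "Database"
#   (0,1) student: (0,1,0) name "Bill", (0,1,1,0) course title "Database"
matches = {
    "Anna": [(0, 0, 0)],
    "Bill": [(0, 1, 0)],
    "Database": [(0, 0, 1, 0), (0, 1, 1, 0)],
}

# Query {Anna, Database}: the smallest LCA is Anna's student node.
anna_db = smallest_lcas(lca_candidates([matches["Anna"], matches["Database"]]))

# Query {Anna, Bill}: the only LCA is the document root, although both
# students take the same course. The shared course is not reported because
# it is duplicated under each student instead of being a common ancestor.
anna_bill = smallest_lcas(lca_candidates([matches["Anna"], matches["Bill"]]))

print(anna_db, anna_bill)
```

If the same data were nested with courses above students, the query on the two student names would instead return the course node. In this sketch the answer changes with the chosen hierarchy rather than with the underlying relationship between students and courses.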