DSpace Repository

Discovering Semantics from Data-Centric XML

dc.contributor.author LI, Luochen en_US
dc.contributor.author LE, Thuy Ngoc en_US
dc.contributor.author WU, Huayu en_US
dc.contributor.author LING, Tok Wang en_US
dc.contributor.author BRESSAN, Stephane en_US
dc.date.accessioned 2013-06-18T04:09:30Z en_US
dc.date.accessioned 2017-01-23T07:00:07Z
dc.date.available 2013-06-18T04:09:30Z en_US
dc.date.available 2017-01-23T07:00:07Z
dc.date.issued 2013-06-17 en_US
dc.identifier.uri http://hdl.handle.net/1900.100/4159 en_US
dc.description.abstract In database applications in general, and in applications using XML in particular, the availability of a conceptual schema or of elements of semantics constitutes invaluable leverage for improving the effectiveness, and sometimes the efficiency, of many tasks, including query processing, keyword search, and schema and data integration. The Object-Relationship-Attribute model for Semi-Structured data (ORA-SS) is a conceptual model designed to capture the semantics of the important constructs in a variety of data models in general and in semi-structured data models in particular. It is specifically intended to capture the semantics of object classes, object identifiers, relationship types, object attributes and relationship attributes underlying XML schemas and data. For a given application, we refer to the set of instances of these semantic concepts as the ORA-semantics of the application. While ORA-SS can be used a priori for the design of new applications, we are interested, in this work, in automatically discovering the ORA-semantics from existing XML data and XML schemas. In this paper, we present a novel approach to automatically discover the ORA-semantics from XML schemas and XML data. The approach we propose is based on a set of techniques and heuristics that identify the different semantic concepts. We empirically and comparatively evaluate the effectiveness of the approach. We show by experiments that the semantics discovered by our approach has more than 90% accuracy. en_US
dc.format.extent 1654608 bytes en_US
dc.format.mimetype application/pdf en_US
dc.language.iso en en_US
dc.relation.ispartofseries ;TRA6/13 en_US
dc.title Discovering Semantics from Data-Centric XML en_US
dc.type Technical Report en_US

