Correlation-based Attribute Outlier Detection in XML
dc.contributor.author | KOH, Judice L. Y. | en_US |
dc.contributor.author | LEE, Mong Li | en_US |
dc.contributor.author | HSU, Wynne | en_US |
dc.contributor.author | ANG, Wee Tiong | en_US |
dc.date.accessioned | 2007-07-06T08:18:24Z | en_US |
dc.date.accessioned | 2017-01-23T07:00:24Z | |
dc.date.available | 2007-07-06T08:18:24Z | en_US |
dc.date.available | 2017-01-23T07:00:24Z | |
dc.date.issued | 2007-06-25 | en_US |
dc.description.abstract | Outlier detection, the problem of identifying deviating patterns in data sets, has important applications in data cleaning, fraud detection, and stock market analysis. Compared to relational data models, the hierarchical structure of semi-structured data such as XML provides semantically meaningful neighborhoods that can advance existing outlier detection methods. In this paper, we propose XODDS (XML Outlier Detection from Data Subspace), a systematic framework for outlier detection in XML data. XODDS utilizes the correlation between attributes to adaptively identify outliers. XODDS consists of four key steps: (1) attribute aggregation defines summarizing elements in the hierarchical XML structures, (2) subspace identification determines contextually informative neighborhoods for outlier detection, (3) outlier scoring computes the extent of outlier-ness using correlation-based metrics, and (4) outlier identification adaptively determines the optimal thresholds distinguishing outliers from non-outliers. Experimental results on both synthetic and real-world data sets indicate that XODDS is effective in detecting outliers in XML data. | en_US |
dc.format.extent | 1387015 bytes | en_US |
dc.format.mimetype | application/pdf | en_US |
dc.identifier.uri | https://dl.comp.nus.edu.sg/xmlui/handle/1900.100/2557 | en_US |
dc.language.iso | en | en_US |
dc.relation.ispartofseries | TRB6/07 | en_US |
dc.subject | Outlier Detection | en_US |
dc.subject | XML | en_US |
dc.subject | Data Mining | en_US |
dc.title | Correlation-based Attribute Outlier Detection in XML | en_US |
dc.type | Technical Report | en_US |