Correlation-based Attribute Outlier Detection in XML

dc.contributor.author: KOH, Judice L. Y. (en_US)
dc.contributor.author: LEE, Mong Li (en_US)
dc.contributor.author: HSU, Wynne (en_US)
dc.contributor.author: ANG, Wee Tiong (en_US)
dc.date.accessioned: 2007-07-06T08:18:24Z (en_US)
dc.date.accessioned: 2017-01-23T07:00:24Z
dc.date.available: 2007-07-06T08:18:24Z (en_US)
dc.date.available: 2017-01-23T07:00:24Z
dc.date.issued: 2007-06-25 (en_US)
dc.description.abstract: Outlier detection, the problem of identifying deviating patterns in data sets, has important applications in data cleaning, fraud detection, and stock market analysis. Compared to relational data models, the hierarchical structure of semi-structured data such as XML provides semantically meaningful neighborhoods that can advance existing outlier detection methods. In this paper, we propose a systematic framework, XODDS (XML Outlier Detection from Data Subspace), for outlier detection in XML data. XODDS utilizes the correlation between attributes to adaptively identify outliers. XODDS consists of four key steps: (1) attribute aggregation defines summarizing elements in the hierarchical XML structures, (2) subspace identification determines contextually informative neighborhoods for outlier detection, (3) outlier scoring computes the extent of outlier-ness using correlation-based metrics, and (4) outlier identification adaptively determines the optimal thresholds distinguishing the outliers from non-outliers. Experimental results on both synthetic and real-world data sets indicate that XODDS is effective in detecting outliers in XML data. (en_US)
dc.format.extent: 1387015 bytes (en_US)
dc.format.mimetype: application/pdf (en_US)
dc.identifier.uri: https://dl.comp.nus.edu.sg/xmlui/handle/1900.100/2557 (en_US)
dc.language.iso: en (en_US)
dc.relation.ispartofseries: TRB6/07 (en_US)
dc.subject: Outlier Detection (en_US)
dc.subject: XML (en_US)
dc.subject: Data Mining (en_US)
dc.title: Correlation-based Attribute Outlier Detection in XML (en_US)
dc.type: Technical Report (en_US)
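
The four steps named in the abstract can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not the XODDS algorithm: the abstract does not specify the aggregation rule, the subspace definition, the correlation metric, or the threshold procedure. Here a record is summarized by its leaf children, a subspace is taken to be the records sharing one parent element, the score is the conditional frequency of a target value given a context value, and the threshold is the midpoint of the widest gap between distinct scores. All function names, tag names, and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def leaf_values(record):
    # Step 1 (aggregation, simplified assumption): summarize a record
    # as {tag: text} over its leaf children.
    return {c.tag: (c.text or "").strip() for c in record if len(c) == 0}


def score_subspace(records, context, target):
    # Step 3 (scoring, simplified assumption): conditional frequency of the
    # target value given the context value, counted inside one subspace.
    rows = [leaf_values(r) for r in records]
    rows = [row for row in rows if context in row and target in row]
    pair_counts = Counter((row[context], row[target]) for row in rows)
    context_counts = Counter(row[context] for row in rows)
    return [
        (row, pair_counts[(row[context], row[target])] / context_counts[row[context]])
        for row in rows
    ]


def gap_threshold(scores):
    # Step 4 (identification, simplified assumption): cut at the midpoint of
    # the widest gap between distinct scores; None if all scores are equal.
    distinct = sorted(set(scores))
    if len(distinct) < 2:
        return None
    gaps = [(b - a, (a + b) / 2) for a, b in zip(distinct, distinct[1:])]
    return max(gaps)[1]


def find_outliers(xml_text, record_tag, context, target):
    root = ET.fromstring(xml_text)
    outliers = []
    # Step 2 (subspace, simplified assumption): records that share a parent
    # element form one neighborhood.
    for parent in root.iter():
        records = parent.findall(record_tag)
        if not records:
            continue
        scored = score_subspace(records, context, target)
        cut = gap_threshold([score for _, score in scored])
        if cut is None:
            continue
        outliers.extend((row, score) for row, score in scored if score < cut)
    return outliers


def emp(title, grade):
    # Hypothetical sample record.
    return f"<emp><title>{title}</title><grade>{grade}</grade></emp>"


SAMPLE = (
    "<company><dept>"
    + emp("engineer", "G5") * 5
    + emp("engineer", "G1")
    + emp("clerk", "G1") * 4
    + "</dept></company>"
)
RESULT = find_outliers(SAMPLE, "emp", "title", "grade")
print(RESULT)
```

In this toy data the value G1 is common overall, yet rare among engineers, so only the engineer/G1 record falls below the cut. That is the kind of context-dependent deviation that a correlation-based score is meant to surface, which a plain frequency count over grade alone would miss.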
Files
Original bundle
Name: TRB6-07.pdf
Size: 1.32 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.53 KB
Format: Plain Text