School of Computing
Browsing School of Computing by Subject "Data Mining"
- Correlation-based Attribute Outlier Detection in XML (2007-06-25)
  KOH, Judice L. Y.; LEE, Mong Li; HSU, Wynne; ANG, Wee Tiong

  Outlier detection, the problem of identifying deviating patterns in data sets, has important applications in data cleaning, fraud detection, and stock market analysis. Compared with relational data models, the hierarchical structure of semi-structured data such as XML provides semantically meaningful neighborhoods that can improve on existing outlier detection methods. In this paper, we propose XODDS (XML Outlier Detection from Data Subspace), a systematic framework for outlier detection in XML data. XODDS uses the correlation between attributes to adaptively identify outliers. It consists of four key steps: (1) attribute aggregation defines summarizing elements in the hierarchical XML structure, (2) subspace identification determines contextually informative neighborhoods for outlier detection, (3) outlier scoring computes the extent of outlier-ness using correlation-based metrics, and (4) outlier identification adaptively determines the optimal thresholds that distinguish outliers from non-outliers. Experimental results on both synthetic and real-world data sets indicate that XODDS is effective in detecting outliers in XML data.
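
  The four steps named in the abstract can be illustrated with a toy pipeline. This is only a sketch under stated assumptions, not the XODDS implementation: the abstract does not give the actual metrics or thresholding rule, so the lift-style co-occurrence ratio and the largest-gap cutoff below are stand-ins, and the element names (`company`, `dept`, `employee`, `title`, `grade`) and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def _emp(title, grade, n):
    # Helper that repeats one hypothetical employee record n times.
    return f"<employee><title>{title}</title><grade>{grade}</grade></employee>" * n


# Hypothetical sample data: in dept A, (Engineer, G5) is a rare combination;
# in dept B the same combination is the norm.
SAMPLE = (
    "<company>"
    f"<dept name='A'>{_emp('Engineer', 'G2', 4)}{_emp('Manager', 'G5', 4)}"
    f"{_emp('Engineer', 'G5', 1)}</dept>"
    f"<dept name='B'>{_emp('Engineer', 'G5', 3)}{_emp('Manager', 'G7', 3)}</dept>"
    "</company>"
)


def aggregate(dept):
    # Step 1 (assumed form): each <employee> is a summarizing element whose
    # leaf values form one attribute tuple.
    return [(e.findtext("title"), e.findtext("grade")) for e in dept.findall("employee")]


def subspaces(root):
    # Step 2 (assumed form): the parent <dept> defines the neighborhood.
    return {d.get("name"): aggregate(d) for d in root.findall("dept")}


def score(records):
    # Step 3 (stand-in metric): lift = P(a, b) / (P(a) * P(b)) within one
    # subspace. Values well below 1 mean the pair co-occurs far less often
    # than the individual attribute frequencies would suggest.
    n = len(records)
    pair = Counter(records)
    left = Counter(a for a, _ in records)
    right = Counter(b for _, b in records)
    return {(a, b): pair[(a, b)] * n / (left[a] * right[b]) for (a, b) in pair}


def adaptive_threshold(scores):
    # Step 4 (stand-in rule): cut at the midpoint of the largest gap between
    # consecutive sorted scores; return None when no gap exists.
    s = sorted(scores)
    if len(s) < 2:
        return None
    gap, i = max((s[i + 1] - s[i], i) for i in range(len(s) - 1))
    return None if gap == 0 else (s[i] + s[i + 1]) / 2


def detect(xml_text):
    # Run the four steps and return (subspace, attribute pair) outliers.
    root = ET.fromstring(xml_text)
    scored = {name: score(recs) for name, recs in subspaces(root).items()}
    cut = adaptive_threshold([v for s in scored.values() for v in s.values()])
    if cut is None:
        return []
    return sorted(
        (name, pair) for name, s in scored.items() for pair, v in s.items() if v < cut
    )


if __name__ == "__main__":
    print(detect(SAMPLE))
```

  On the sample data, only the (Engineer, G5) pair in dept A falls below the cutoff; the same pair in dept B scores high because it is common in that neighborhood, which is the kind of context sensitivity the hierarchical subspaces are meant to provide. A largest-gap cutoff will flag something whenever any two scores differ, so a real system would need a more careful rule.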