Finding Time-lagged 3D Clusters

Date
2008-06-19
Abstract
Existing 3D clustering algorithms for $gene\times sample\times time$ expression data do not consider the \emph{time lags} between correlated gene expression patterns. In addition, they either ignore correlation on \emph{time subseries}, disregard the \emph{continuity} of the time series, or validate only pure shifting or pure scaling coherent patterns rather than general \emph{shifting-and-scaling patterns}. In this paper, we propose a novel 3D cluster model, the $S^2D^3$ Cluster, to address these problems, where $S^2$ denotes the shifting-and-scaling correlation and $D^3$ the 3-dimensional $gene\times sample\times time$ data. Within the $S^2D^3$ Cluster model, the expression levels of genes are shifting-and-scaling coherent in both a sample subspace and a time subseries, with arbitrary time lags. We develop a 3D clustering algorithm, $LagMiner$, for identifying interesting $S^2D^3$ Clusters that satisfy constraints on regulation ($\gamma$), coherence ($\epsilon$), minimum number of genes ($MinG$), minimum sample subspace size ($MinS$) and minimum time period length ($MinT$). Experimental results on both synthetic and real-life datasets show that $LagMiner$ is effective, scalable and parameter-robust. Although we use gene expression data in this paper, our model and algorithm can be applied to any other data in which both spatial and temporal coherence are sought.
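To make the notion of a time-lagged shifting-and-scaling pattern concrete, the following is a minimal Python sketch. It is not the $LagMiner$ algorithm, and it does not implement the $\gamma$, $\epsilon$, $MinG$, $MinS$ or $MinT$ constraints. It assumes, purely for illustration, that two expression profiles $x$ and $y$ are coherent with lag $\ell$ when $y_{t+\ell} \approx a\,x_t + b$ for some scaling factor $a$ and shift $b$. The function name `best_lagged_affine_fit` and the toy data are hypothetical.

```python
import numpy as np

def best_lagged_affine_fit(x, y, max_lag):
    """Illustrative only: find the lag in 0..max_lag for which
    y[t + lag] is best explained by scale * x[t] + shift.

    Returns (lag, scale, shift, max_abs_residual).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best = None
    for lag in range(max_lag + 1):
        # Number of time points where x[t] and y[t + lag] both exist.
        n = min(len(x), len(y) - lag)
        if n < 3:
            break  # too few overlapping points for a meaningful fit
        xs, ys = x[:n], y[lag:lag + n]
        # Least-squares line through (xs, ys): slope = scaling, intercept = shift.
        scale, shift = np.polyfit(xs, ys, 1)
        resid = float(np.max(np.abs(ys - (scale * xs + shift))))
        if best is None or resid < best[3]:
            best = (lag, float(scale), float(shift), resid)
    return best

# Toy profiles (made up): y repeats x two time steps later, scaled by 2 and shifted by 1.
x = [1, 3, 2, 5, 4, 6, 2, 1]
y = [0, 0] + [2 * v + 1 for v in x]

lag, scale, shift, resid = best_lagged_affine_fit(x, y, max_lag=3)
print(lag, round(scale, 6), round(shift, 6))
```

On this toy input the sketch recovers a lag of 2 with a scaling factor of about 2 and a shift of about 1. A pure shifting pattern corresponds to the special case $a = 1$, and a pure scaling pattern to $b = 0$.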