Finding Time-lagged 3D Clusters
Date
2008-06-19
Abstract
Existing 3D clustering algorithms on $gene\times sample\times time$ expression data do not consider the \emph{time lags} between correlated gene expression patterns. In addition, they either ignore correlation on \emph{time subseries}, disregard the \emph{continuity} of the time series, or validate only pure shifting or pure scaling coherent patterns instead of the general \emph{shifting-and-scaling patterns}. In this paper, we propose a novel 3D cluster model, the $S^2D^3$ Cluster, to address these problems, where $S^2$ reflects the shifting-and-scaling correlation and $D^3$ the 3-dimensional $gene\times sample\times time$ data. Within the $S^2D^3$ Cluster model, the expression levels of genes are shifting-and-scaling coherent in both the sample subspace and time subseries with arbitrary time lags. We develop a 3D clustering algorithm, $LagMiner$, for identifying interesting $S^2D^3$ Clusters that satisfy the constraints of regulation ($\gamma$), coherence ($\epsilon$), minimum gene number ($MinG$), minimum sample subspace size ($MinS$) and minimum time period length ($MinT$). Experimental results on both synthetic and real-life datasets show that $LagMiner$ is effective, scalable and parameter-robust. While we use gene expression data in this paper, our model and algorithm can be applied to any other data where both spatial and temporal coherence are pursued.
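
As a rough illustration of what a time-lagged shifting-and-scaling pattern means for a single pair of expression profiles, the Python sketch below searches for the lag at which one profile is best approximated by a scaled and shifted copy of the other. It is not the $LagMiner$ algorithm and does not use the paper's definitions of $\gamma$ or $\epsilon$. The helper names `fit_shift_scale` and `best_lag`, the least-squares fit, the maximum-residual score and the toy data are all assumptions made for this example.

```python
from typing import List, Optional, Tuple


def fit_shift_scale(x: List[float], y: List[float]) -> Tuple[float, float, float]:
    """Least-squares fit of y ~ scale * x + shift (hypothetical helper).

    Returns (scale, shift, max_abs_residual).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x)
    if var_x == 0:
        raise ValueError("x is constant; the scaling factor is undefined")
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    scale = cov_xy / var_x
    shift = mean_y - scale * mean_x
    residual = max(abs(b - (scale * a + shift)) for a, b in zip(x, y))
    return scale, shift, residual


def best_lag(
    x: List[float], y: List[float], max_lag: int, min_len: int
) -> Optional[Tuple[int, float, float, float]]:
    """Try lags 0..max_lag, modelling y[t + lag] ~ scale * x[t] + shift.

    Assumes x and y have the same length. min_len plays a role loosely
    similar to a minimum time-period length (an assumption, not the
    paper's MinT). Returns (lag, scale, shift, max_abs_residual) for the
    lag with the smallest residual, or None if no lag leaves enough points.
    """
    best = None
    for lag in range(max_lag + 1):
        xs = x[: len(x) - lag]  # leading part of x
        ys = y[lag:]            # y delayed by `lag` time points
        if len(xs) < min_len:
            break
        scale, shift, residual = fit_shift_scale(xs, ys)
        if best is None or residual < best[3]:
            best = (lag, scale, shift, residual)
    return best


# Toy data (invented): y repeats x two time points later, scaled by 2 and shifted by 1.
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 2.0, 1.0]
y = [0.0, 0.0, 3.0, 7.0, 5.0, 11.0, 9.0, 13.0]

print(best_lag(x, y, max_lag=3, min_len=4))
```

On the toy data the search recovers a lag of two time points with a scaling factor of 2 and a shift of 1. A full 3D method would additionally have to search over gene sets, sample subspaces and contiguous time subseries, which this pairwise sketch does not attempt.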