The Algebra and Analysis of Adaptive-Binning
No Thumbnail Available
Date
2002-08-01T00:00:00Z
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
Histograms are commonly used in content-based image retrieval systems to represent the
distributions of colors in images.
It is a common understanding that histograms that adapt to images can represent their color
distributions
more efficiently than do histograms with fixed binnings.
However, existing systems almost exclusively adopt fixed-binning histograms because among
existing well-known dissimilarity measures, only the computationally expensive Earth Mover's
Distance (EMD) can compare histograms with different binnings.
Another major concern is that fixed-binning histograms have been regarded as vectors in a linear
vector space so that powerful algorithms such as clustering, PCA, and SVD can be applied to
process and analyze the histograms.
Unfortunately, this approach is not satisfactory because the natural distance measure in linear
vector space, the Euclidean distance, is less reliable than other non-Euclidean dissimilarities
are in measuring histogram dissimilarity, thus compromising the effectiveness and reliability of
the approach.
This technical report addresses the above issues in a mathematically sound manner.
In this article, adaptive histograms are formally defined and are provided with a well-defined
set of operations.
Properties of the operations on adaptive histograms are analyzed, leading to mathematically
sound definitions of similarity, dissimilarity, and mean of histograms.
Extensive test results in \cite{Leow-cvpr2001} have shown that the use of adaptive histograms
produces the best overall performance, in terms of good accuracy, small number of bins, no empty
bin, and efficient computation, compared to existing methods in histogram retrieval and
classification tasks.