The Algebra and Analysis of Adaptive-Binning

No Thumbnail Available
Date
2002-08-01T00:00:00Z
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
Histograms are commonly used in content-based image retrieval systems to represent the distributions of colors in images. It is a common understanding that histograms that adapt to images can represent their color distributions more efficiently than do histograms with fixed binnings. However, existing systems almost exclusively adopt fixed-binning histograms because among existing well-known dissimilarity measures, only the computationally expensive Earth Mover's Distance (EMD) can compare histograms with different binnings. Another major concern is that fixed-binning histograms have been regarded as vectors in a linear vector space so that powerful algorithms such as clustering, PCA, and SVD can be applied to process and analyze the histograms. Unfortunately, this approach is not satisfactory because the natural distance measure in linear vector space, the Euclidean distance, is less reliable than other non-Euclidean dissimilarities are in measuring histogram dissimilarity, thus compromising the effectiveness and reliability of the approach. This technical report addresses the above issues in a mathematically sound manner. In this article, adaptive histograms are formally defined and are provided with a well-defined set of operations. Properties of the operations on adaptive histograms are analyzed, leading to mathematically sound definitions of similarity, dissimilarity, and mean of histograms. Extensive test results in \cite{Leow-cvpr2001} have shown that the use of adaptive histograms produces the best overall performance, in terms of good accuracy, small number of bins, no empty bin, and efficient computation, compared to existing methods in histogram retrieval and classification tasks.
Description
Keywords
Citation