A Similarity Measure between Vector Sequences with Application to Word Image Retrieval
José A. Rodriguez, Florent Perronnin, Josep Llados, Gemma Sanchez
This article proposes a novel similarity measure between vector sequences. Recently, a model-based approach was introduced to address this problem. It consists of modeling each sequence with a continuous Hidden Markov Model (C-HMM)
and computing a probabilistic measure of similarity between C-HMMs. In this paper we propose to model sequences with semi-continuous HMMs (SC-HMMs): the
Gaussians of the SC-HMMs are constrained to belong to a shared pool of Gaussians. This constraint provides two major benefits. First, the a priori information contained in the common set of Gaussians leads to a more accurate
estimate of the HMM parameters. Second, the computation of a probabilistic similarity between two SC-HMMs can be simplified to a Dynamic Time Warping (DTW) between their mixture weight vectors, which significantly reduces
the computational cost. Experimental results on a handwritten word retrieval task show that the proposed similarity outperforms both the traditional DTW between the original sequences and the model-based approach that uses C-HMMs. We also show that this increase in accuracy can be traded against a significant reduction of the computational cost (by up to a factor of 100).
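To make the idea of a DTW between mixture weight vectors concrete, here is a minimal sketch. It assumes each SC-HMM is summarized as a sequence of per-state mixture weight vectors over the shared pool of Gaussians, and it uses one minus the Bhattacharyya coefficient as the local cost. The abstract does not specify the local measure, the step pattern, or any normalization, so these choices, along with the names `local_cost` and `dtw_cost`, are illustrative assumptions rather than the authors' implementation.

```python
import math


def local_cost(w1, w2):
    """Assumed local cost: 1 - Bhattacharyya coefficient of two weight vectors.

    Both vectors are expected to be non-negative and sum to 1, so the cost
    lies in [0, 1] and is 0 when the vectors are identical.
    """
    bc = sum(math.sqrt(a * b) for a, b in zip(w1, w2))
    return 1.0 - bc


def dtw_cost(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of mixture weight vectors.

    Uses the standard step pattern (insertion, deletion, match) with no
    path-length normalization; both are assumptions made for this sketch.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # acc[i][j] holds the best cost of aligning the first i and j states.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = local_cost(seq_a[i - 1], seq_b[j - 1])
            acc[i][j] = c + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


# Hypothetical weight sequences over a shared pool of two Gaussians.
model_a = [[1.0, 0.0], [0.0, 1.0]]
model_b = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # same shape, one state repeated
model_c = [[0.0, 1.0], [1.0, 0.0]]              # states in reversed order
```

In this toy setting, `dtw_cost(model_a, model_b)` is zero because the warping absorbs the repeated state, while `dtw_cost(model_a, model_c)` is larger because the state order differs. Each local cost depends only on vectors whose length equals the size of the Gaussian pool, which is consistent with the reduced computational cost described above.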
CVPR 2009 (Computer Vision & Pattern Recognition), Miami, Florida, USA, June 20-26, 2009