Handwritten word-image retrieval with synthesized typed queries
Florent Perronnin, José A. Rodriguez
We propose a new method for handwritten word-spotting which requires neither prior training nor the collection of examples of the query word. More precisely, a model is trained on the fly with images rendered from the searched words in
one or multiple computer fonts. To reduce the mismatch between the typed-text prototypes and the candidate handwritten images, we make use of: (i) local gradient histogram (LGH) features, which were shown to model word shapes
robustly, and (ii) semi-continuous hidden Markov models (SC-HMMs), in which the typed-text models are constrained to a vocabulary of handwritten shapes, thus learning a link between the two types of data. Experiments show that the proposed method is effective in retrieving handwritten words, and a comparison with alternative methods reveals that the contributions of both the LGH features and the SC-HMM are crucial. To the best of the authors' knowledge, this is the first work to address this problem in a non-trivial manner.
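To illustrate the kind of frame-level descriptor the abstract refers to, the following is a minimal sketch of local gradient histogram (LGH) features in Python with NumPy: a window slides along the word image, each window is split into a grid of cells, and a gradient-orientation histogram (weighted by gradient magnitude) is built per cell. The window size, step, grid, and bin counts here are illustrative assumptions, not the parameters used in the paper.

```python
import numpy as np

def lgh_features(img, win=16, step=4, grid=(4, 4), bins=8):
    """Sketch of LGH frame features for a grayscale word image.

    A window of width `win` slides along the image with stride `step`;
    each window is split into grid[0] x grid[1] cells, and each cell
    contributes a `bins`-bin orientation histogram weighted by
    gradient magnitude. Returns one feature vector per frame.
    """
    gy, gx = np.gradient(img.astype(float))        # image gradients
    mag = np.hypot(gx, gy)                         # gradient magnitude
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)    # orientation in [0, 2*pi)
    h, w = img.shape
    frames = []
    for x0 in range(0, max(w - win, 0) + 1, step):
        m = mag[:, x0:x0 + win]
        a = ang[:, x0:x0 + win]
        # Cell boundaries inside the window.
        ys = np.linspace(0, h, grid[0] + 1, dtype=int)
        xs = np.linspace(0, m.shape[1], grid[1] + 1, dtype=int)
        feat = []
        for i in range(grid[0]):
            for j in range(grid[1]):
                cm = m[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
                ca = a[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
                hist, _ = np.histogram(ca, bins=bins,
                                       range=(0, 2 * np.pi), weights=cm)
                feat.append(hist)
        v = np.concatenate(feat)
        n = np.linalg.norm(v)
        frames.append(v / n if n > 0 else v)  # L2-normalize each frame
    return np.asarray(frames)
```

In a word-spotting pipeline of this kind, the resulting sequence of frame vectors would be fed to the HMM, both for the typed-text prototypes rendered from computer fonts and for the candidate handwritten word images.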
ICDAR 2009 (International Conference on Document Analysis and Recognition). Barcelona, Spain, July 26-29, 2009.
Full paper available on ICDAR 2009 Website