An empirical study of fusion operators for multi-modal image retrieval
Gabriela Csurka, Stéphane Clinchant
In this paper we present an empirical study of late fusion
operators for multimodal image retrieval. To this end,
we consider two experts, one based on textual and one on
visual similarities between documents, and study how
to go beyond simple score averaging. The main
idea is to exploit the correlation between the two experts
by encoding, explicitly or implicitly, an "and" and an "or"
operator in an efficient way. We show through several experiments
that the operators that combine both of these
aspects generally outperform the ones that consider only one
of them. Based on this observation, we propose several generalized
versions of the most classical fusion operators and compare
them on ImageCLEF benchmark datasets, both in an
unsupervised and in a supervised framework.
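
To make the "and"/"or" intuition concrete, the following is a minimal sketch, not the operators actually evaluated in the paper. It assumes that each expert returns one similarity score per document, that scores are normalized to [0, 1], that a product stands in for "and", and that a maximum stands in for "or". All function names and the `alpha` and `beta` weights are hypothetical choices made for illustration.

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1] so the two experts are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_average(text, visual, alpha=0.5):
    """Baseline late fusion: weighted average of the two experts."""
    return [alpha * t + (1 - alpha) * v for t, v in zip(text, visual)]


def fuse_and(text, visual):
    """Assumed 'and' operator: the product is high only if both scores are high."""
    return [t * v for t, v in zip(text, visual)]


def fuse_or(text, visual):
    """Assumed 'or' operator: the maximum is high if either score is high."""
    return [max(t, v) for t, v in zip(text, visual)]


def fuse_and_or(text, visual, beta=0.5):
    """Hypothetical combination of both aspects (beta is an illustrative weight)."""
    both = fuse_and(text, visual)
    either = fuse_or(text, visual)
    return [beta * a + (1 - beta) * o for a, o in zip(both, either)]


def rank(scores):
    """Return document indices sorted by decreasing fused score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    # Toy scores for three documents (made up for this example).
    text_scores = [0.9, 0.8, 0.1]
    visual_scores = [0.1, 0.8, 0.9]
    print(rank(fuse_and_or(text_scores, visual_scores)))
```

In this toy setting the "or" operator alone cannot separate a document that only one expert likes from a document both experts like, whereas the combined operator favors the document on which the experts agree.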
10th Workshop on Content-Based Multimedia Indexing, Annecy, France, June 27-29, 2012.