Semantic Combination of Textual and Visual Information in Multimedia Retrieval

Julien Ah-Pine, Stéphane Clinchant, Gabriela Csurka
The goal of this paper is to introduce a set of techniques, which we call semantic combination, for efficiently fusing text and image retrieval systems in the context of multimedia information access. These techniques emerge from the observation that image and textual queries are expressed at different semantic levels and that a single image query is often ambiguous. Overall, the semantic combination techniques overcome a conceptual barrier rather than a technical one: these methods can be seen as a combination of late fusion and image reranking. Although simple, this approach has not been used before. We assess the proposed techniques against late and cross-media fusion using four different ImageCLEF datasets. Compared to late fusion, performance increases significantly on two datasets and remains similar on the other two.
ICMR - International Conference on Multimedia Retrieval (ACM) - Trento, Italy - April 17-20, 2011

