Information-Based Models for Ad Hoc IR

Stéphane Clinchant, Eric Gaussier
In this paper, we introduce the family of information-based models for ad hoc information retrieval. These models draw their inspiration from a long-standing hypothesis in IR: the difference between a word's behavior at the document level and at the collection level carries information about the word's significance for the document. This hypothesis has been exploited in the 2-Poisson mixture models, in the notion of eliteness in BM25, and more recently in Divergence from Randomness (DFR) models. We show here that, combined with notions related to burstiness, it can lead to simpler and better models.
ACM SIGIR 2010 (Special Interest Group on Information Retrieval), Geneva, Switzerland, 19-23 July 2010