Publications
Authors:
  • Stéphane Clinchant, Eric Gaussier
Citation:
ICTIR, 3rd International Conference on the Theory of Information Retrieval, Bertinoro, Italy, 12-14 September 2011.
Abstract:
In this paper, we introduce a new heuristic constraint for
PRF models, referred to as the Document Frequency (DF) constraint,
which is validated through a series of experiments with an oracle. We then
analyze, from a theoretical point of view, state-of-the-art PRF models
according to their relation with this constraint. This analysis reveals that
the standard mixture model for PRF in the language modeling family
does not satisfy the DF constraint, in contrast to several recently
proposed models. Lastly, we perform tests, which further validate the
constraint, with a simple family of tf-idf functions based on a parameter controlling the satisfaction of the DF constraint.
Year:
2011
Report number:
2011/045