Publications
Authors:
  • Cyril Goutte, Hervé Dejean, Eric Gaussier, Jean-Michel Renders, Nicola Cancedda
Citation:
Proc. of Sixth Conference on Natural Language Learning (CoNLL-2002), Taipei, Taiwan, 24-25 August, 2002.
Abstract:
We address the problem of using partially labelled data, e.g. large collections where only a few items are annotated,
for extracting entities. Our approach relies on a combination of probabilistic models, which we use to model
the generation of entities and their context, and kernel machines, which implement powerful categorisers
based on a similarity measure and some labelled data. This combination takes the form of the so-called
Fisher kernels, which implement a similarity based on an underlying probabilistic model. Such kernels are
compared with transductive inference, an alternative approach to combining labelled and unlabelled data, again
coupled with Support Vector Machines. Experiments are performed on a database of abstracts extracted from
Medline.
Year:
2002
Report number:
2002/024