Learning from partially labelled data -- with confidence
Eric Gaussier, Cyril Goutte
In this paper, we propose a unifying treatment of several strategies for training mixture models from label-deficient data. After reviewing different approaches to estimating classification models on partially labelled data using mixture models, we identify a number of problems that lead us to propose a new EM variant. The aim is to handle unlabelled data better and to provide a more confident discrimination decision. The approach is illustrated by an experimental comparison of the different models on the Leptograpsus crab data.
Proceedings of Learning with Partially Classified Training Data, ICML 2005 workshop, Bonn, Germany, 7 August 2005.
xrce_confidence.pdf (912.81 kB)