Hybrid Techniques for Training HMM POS Taggers.
Ted Briscoe, Greg Grefenstette
We describe and experimentally evaluate a hybrid technique for training part-of-speech taggers that combines
training on small quantities of unambiguously tagged material with maximum likelihood
re-estimation over the target untagged corpus. This approach, unlike previous ones employing re-estimation,
does not involve skilled manipulation of the initial parameters of the model or the use of sophisticated models
of suffix-tag probabilities derived from unambiguously-tagged material. We conclude that this technique can
yield usefully accurate taggers for several languages, but that the conditions required for success are difficult
to state precisely.
Technical report MLTT-007 (May 1994)
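
As a rough illustration of the kind of hybrid training the abstract describes, the sketch below seeds an HMM with smoothed relative-frequency counts from a tiny tagged sample and then runs Baum-Welch (maximum likelihood) re-estimation over an untagged corpus. It is not the report's implementation: the function names, the add-one smoothing, the integer encoding of words and tags, and the toy data are all assumptions made for the example.

```python
import math

def normalise(row):
    total = sum(row)
    return [x / total for x in row]

def init_from_tagged(tagged_sents, n_tags, n_words, smooth=1.0):
    """Initial HMM parameters from tagged data (add-`smooth` smoothing, an assumption)."""
    pi = [smooth] * n_tags
    A = [[smooth] * n_tags for _ in range(n_tags)]   # tag -> tag transitions
    B = [[smooth] * n_words for _ in range(n_tags)]  # tag -> word emissions
    for sent in tagged_sents:
        prev = None
        for word, tag in sent:
            B[tag][word] += 1
            if prev is None:
                pi[tag] += 1
            else:
                A[prev][tag] += 1
            prev = tag
    return normalise(pi), [normalise(r) for r in A], [normalise(r) for r in B]

def forward_backward(obs, pi, A, B):
    """Scaled forward and backward probabilities for one sentence."""
    N, T = len(pi), len(obs)
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            row = [pi[j] * B[j][obs[0]] for j in range(N)]
        else:
            row = [sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                   for j in range(N)]
        scale.append(sum(row))
        alpha.append([x / scale[t] for x in row])
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N))
                   / scale[t + 1] for i in range(N)]
    return alpha, beta, scale

def reestimate(sents, pi, A, B):
    """One Baum-Welch iteration; also returns the log-likelihood under the input model."""
    N, V = len(pi), len(B[0])
    pi_acc = [0.0] * N
    A_acc = [[0.0] * N for _ in range(N)]
    B_acc = [[0.0] * V for _ in range(N)]
    loglik = 0.0
    for obs in sents:
        alpha, beta, scale = forward_backward(obs, pi, A, B)
        loglik += sum(math.log(c) for c in scale)
        for t, w in enumerate(obs):
            for i in range(N):
                gamma = alpha[t][i] * beta[t][i]  # posterior of tag i at position t
                B_acc[i][w] += gamma
                if t == 0:
                    pi_acc[i] += gamma
                if t + 1 < len(obs):
                    for j in range(N):
                        A_acc[i][j] += (alpha[t][i] * A[i][j] * B[j][obs[t + 1]]
                                        * beta[t + 1][j] / scale[t + 1])
    return (normalise(pi_acc), [normalise(r) for r in A_acc],
            [normalise(r) for r in B_acc], loglik)

# Toy data (hypothetical). Tags: 0=DET, 1=NOUN, 2=VERB.
# Words: 0=the, 1=dog, 2=barks, 3=cat, 4=sleeps.
tagged = [[(0, 0), (1, 1), (2, 2)]]        # one (word, tag) sentence
untagged = [[0, 3, 4], [0, 1, 4], [0, 3, 2]]

pi, A, B = init_from_tagged(tagged, n_tags=3, n_words=5)
history = []
for _ in range(5):
    pi, A, B, loglik = reestimate(untagged, pi, A, B)
    history.append(loglik)
```

Each entry in `history` is the log-likelihood of the untagged corpus under the model before that iteration's update, so the sequence should not decrease. Note that re-estimation maximises likelihood rather than tagging accuracy, which is consistent with the abstract's caution that the conditions for success are hard to state precisely.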