Minimum Error Rate Training by Sampling the Translation Lattice
Samidh Chatterjee, Nicola Cancedda
Minimum Error Rate Training is the most widely used algorithm for training the parameters of log-linear models in state-of-the-art Statistical Machine Translation systems. In its original formulation, the algorithm uses N-best lists output by the decoder to grow the translation pool that shapes the surface on which the actual optimization is performed. Recent work has extended the algorithm to use the entire translation lattice built by the decoder instead of N-best lists. We propose here a third, intermediate way, which consists of growing the translation pool using samples randomly drawn from the translation lattice. We empirically measure an improvement in BLEU scores compared to training using N-best lists, without suffering the increase in computational complexity associated with operating on the whole lattice.
EMNLP (Conference on Empirical Methods in Natural Language Processing), MIT Stata Center, Massachusetts, USA, 9-11 October 2010
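
The intermediate approach summarized in the abstract, growing the translation pool from random samples of the lattice, can be illustrated with a minimal Python sketch. The abstract does not specify the sampling distribution, so the sketch assumes a simple random walk in which each outgoing edge is chosen with probability proportional to the exponential of its log-linear score. The toy lattice, the phrases, the feature values, and the names `LATTICE`, `sample_path`, and `grow_pool` are all hypothetical and are not taken from the paper.

```python
import math
import random

# Toy lattice: node -> list of (next_node, phrase, feature_vector).
# Node names, phrases, and feature values are invented for illustration.
LATTICE = {
    "s": [("a", "the house", (1.0, 0.2)), ("b", "the home", (0.8, 0.5))],
    "a": [("e", "is small", (0.9, 0.1)), ("e", "is little", (0.4, 0.3))],
    "b": [("e", "is small", (0.7, 0.4))],
    "e": [],
}


def dot(weights, features):
    """Log-linear score of one edge: weighted sum of its features."""
    return sum(w * f for w, f in zip(weights, features))


def sample_path(lattice, weights, rng, start="s", final="e"):
    """Draw one translation by a random walk from start to final.

    Assumption: each outgoing edge is picked with probability
    proportional to exp(weights . features) of that edge.
    """
    node = start
    words = []
    total = [0.0] * len(weights)
    while node != final:
        edges = lattice[node]
        probs = [math.exp(dot(weights, feats)) for _, _, feats in edges]
        next_node, phrase, feats = rng.choices(edges, weights=probs, k=1)[0]
        words.append(phrase)
        # Accumulate the feature vector of the whole path.
        total = [t + f for t, f in zip(total, feats)]
        node = next_node
    return " ".join(words), tuple(total)


def grow_pool(pool, lattice, weights, n_samples, seed=0):
    """Add sampled (translation, features) pairs to the pool; duplicates merge."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        pool.add(sample_path(lattice, weights, rng))
    return pool


if __name__ == "__main__":
    pool = grow_pool(set(), LATTICE, weights=(1.0, 1.0), n_samples=50)
    for translation, features in sorted(pool):
        print(translation, features)
```

In a full training loop, the pool would be carried over between iterations and the weights re-optimized against it after each round of sampling; the sketch covers only the pool-growing step.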