2005/042 - Translating with non-contiguous phrases
- Michel Simard, Nicola Cancedda, Bruno Cavestro, Marc Dymetman, Eric Gaussier, Cyril Goutte, Philippe Langlais, Kenji Yamada, Arne Mauser
HLT/EMNLP: Human Language Technology Conference / Conference on Empirical Methods in Natural Language Processing, Vancouver, Canada, October 6-8, 2005.
This paper presents a phrase-based statistical machine translation method based on non-contiguous phrases, i.e. phrases with gaps. A method for producing such phrases from word-aligned corpora is proposed. A statistical translation model that deals with such phrases is also presented, as well as a training method based on the maximization of translation accuracy, as measured with the NIST evaluation metric. Translations are produced by means of a beam-search decoder. Experimental results are presented that demonstrate how the proposed method allows better generalization from the training data.
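
To make the notion of a phrase with gaps concrete, the following minimal Python sketch shows one possible way to represent such a phrase and find where it matches in a sentence. It is an illustration only, not the paper's phrase-extraction or decoding method: the `GAP` marker, the `find_matches` helper, and the assumption that each gap stands for exactly one word are all hypothetical choices made for this example.

```python
from typing import List, Optional, Sequence

# Hypothetical gap marker: here one GAP stands for exactly one arbitrary word.
GAP = None


def find_matches(phrase: Sequence[Optional[str]], words: Sequence[str]) -> List[int]:
    """Return every start index at which `phrase` matches `words`.

    A phrase element matches a word if it is a GAP or equal to that word.
    """
    starts = []
    for start in range(len(words) - len(phrase) + 1):
        window = words[start:start + len(phrase)]
        if all(p is GAP or p == w for p, w in zip(phrase, window)):
            starts.append(start)
    return starts


# The French negation "ne ... pas" wraps around the verb, so it cannot be
# captured as a contiguous phrase without also fixing the verb.
sentence = "je ne veux pas danser".split()
ne_pas = ["ne", GAP, "pas"]
matches = find_matches(ne_pas, sentence)
```

With this representation, the single phrase `ne _ pas` covers any verb placed in the gap, which is the kind of generalization from training data that the abstract refers to.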