Definite Noun Phrases in Statistical Machine Translation into Scandinavian Languages

Sara Stymne
The Scandinavian languages have an unusual definite noun phrase (NP) structure, in which a noun suffix is one way of expressing definiteness. This is problematic for statistical machine translation from languages with different NP structures. We show that translation from English and Italian into Danish, Swedish, and Norwegian can be improved by simple source-side transformations of definite NPs, with small adjustments to the preprocessing strategy depending on the language pair. We also explored target-side transformations, with mixed results.
EAMT-2011: the 15th Annual Conference of the European Association for Machine Translation, May 30-31, 2011, Leuven, Belgium.