Definite Noun Phrases in Statistical Machine Translation into Danish
Authors: Sara Stymne
Workshop on Extracting and Using Constructions in NLP, Odense, Denmark, 14 May 2009.
The full text is available online.
There are two ways to express definiteness in Danish, which makes it problematic for statistical machine translation (SMT) from English, since the wrong realisation can be chosen. We present a method for identifying and transforming English definite NPs which would likely be expressed in a different way in Danish. The transformed English is used for training a phrase-based SMT system. We show significant improvements of translation quality, of up to 21.1% relative on Bleu, by performing identification and transformation of definiteness, compared to a baseline trained on original English, in two different domains.
Year: 2009
Report number: 2009/024