A Dataset for Assessing Machine Translation Evaluation Metrics
Lucia Specia, Nicola Cancedda, Marc Dymetman
We describe a dataset containing 16,000 translations produced by four machine translation systems and manually annotated for quality by professional translators. This dataset can be used in a range of tasks assessing machine translation evaluation metrics, from basic correlation analysis to the training and testing of learning-based metrics. By providing a standard dataset for such tasks, we expect to encourage the development of better MT evaluation metrics.
LREC 2010 (Seventh International Conference on Language Resources and Evaluation), Malta, 17-23 May 2010
Full paper available on LREC 2010 website