Improving the Confidence of Machine Translation Quality Estimates
Lucia Specia, Zhuoran Wang, Marco Turchi, John Shawe-Taylor, Craig Saunders
We investigate the problem of estimating the quality of the output of machine translation systems at the sentence level when reference translations are not available. The focus is on automatically identifying a threshold to map
a continuous predicted score into "good"/"bad" categories for filtering out bad-quality cases in a translation post-editing task. We use the theory of Inductive Confidence Machines (ICM) to identify this threshold according to a
confidence level that is expected for a given task. Experiments show that this approach gives improved estimates when compared to those based on classification or regression algorithms without ICM.
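The core idea of thresholding a continuous quality score at a desired confidence level can be illustrated with a simplified sketch. The function below is a hypothetical stand-in for the ICM machinery: given a calibration set of predicted scores and true good/bad labels, it picks the lowest cutoff at which the kept sentences still meet the requested confidence (the function name, data, and selection rule are illustrative assumptions, not the paper's actual algorithm).

```python
import numpy as np

def confidence_threshold(cal_scores, cal_labels, confidence=0.9):
    """Pick a cutoff on predicted quality scores so that, on the
    calibration set, at least `confidence` of the sentences kept
    (score >= cutoff) are truly 'good' (label 1).
    Simplified illustration, not the ICM method itself."""
    order = np.argsort(-cal_scores)          # best-scored sentences first
    scores, labels = cal_scores[order], cal_labels[order]
    good = np.cumsum(labels == 1)            # good sentences among the top-k
    kept = np.arange(1, len(scores) + 1)     # k = number of sentences kept
    precision = good / kept
    ok = np.where(precision >= confidence)[0]
    if ok.size == 0:
        return np.inf                        # no cutoff reaches the confidence
    return scores[ok.max()]                  # lowest score still meeting it

# Usage: filter new MT output, keeping only sentences above the cutoff
cal_scores = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2])
cal_labels = np.array([1, 1, 1, 1, 0, 1, 0, 0])
cutoff = confidence_threshold(cal_scores, cal_labels, confidence=0.8)
```

Sentences scoring below the cutoff would be routed away from the post-editing queue as likely bad-quality cases.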
MT Summit 2009 (Machine Translation Summit XII), Ottawa, Ontario, Canada, August 26-30, 2009.