Reducing parameter space for word alignment
Hervé Dejean, Eric Gaussier, Cyril Goutte, Kenji Yamada
This paper presents experimental results on reducing the parameter space of a word alignment algorithm. We use
IBM Model 4 as the baseline. We apply a word lemmatizer and a term extraction algorithm to
preprocess the training corpus and thereby reduce the model's parameter space. These additional components
yield an improvement in alignment error rate over the baseline.
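The abstract reports results in terms of alignment error rate (AER) without defining it. The sketch below uses the definition common in the word-alignment literature, AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where A is the set of hypothesized links, S the sure reference links, and P the possible reference links (taken to include S). Whether the paper uses exactly this variant is an assumption, and the function name and toy link sets are hypothetical illustrations, not data from the paper.

```python
def alignment_error_rate(hypothesis, sure, possible):
    """AER under the common definition (assumed, not confirmed by the paper).

    Each argument is an iterable of (source_index, target_index) links.
    """
    a = set(hypothesis)
    s = set(sure)
    p = set(possible) | s  # possible links are taken to include the sure links
    if not a and not s:
        return 0.0  # nothing to align and nothing hypothesized
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


# Toy example (hypothetical links, for illustration only).
sure = {(0, 0), (1, 1)}
possible = {(2, 1)}
hypothesis = {(0, 0), (1, 1), (2, 2)}

print(alignment_error_rate(hypothesis, sure, possible))
```

Lower values are better: a hypothesis that contains every sure link and no link outside the possible set scores 0.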
Available from the NAACL/HLT Workshop Building and Using Parallel
Texts: Data Driven Machine Translation and Beyond website (http://www.cs.unt.edu/~rada/wpt/).
NAACL/HLT Workshop Building and Using Parallel Texts: Data Driven Machine Translation and Beyond (http://www.cs.unt.edu/~rada/wpt/), Edmonton, Canada, May 31, 2003.
Dejean.pdf (41.78 kB)