A geometric view on bilingual lexicon extraction from comparable corpora

Eric Gaussier, Jean-Michel Renders, Irina Matveeva, Cyril Goutte, Hervé Dejean
In this study we adopt a geometric view on bilingual lexicon extraction from comparable corpora. This view makes it possible to re-interpret the methods proposed so far and to identify unresolved problems. We then motivate and formulate three new methods, partly inspired by latent semantic analysis, that aim at solving these problems. Finally, we evaluate these methods, showing their strengths and weaknesses. Our final results show a significant gain in the accuracy of extracted lexicons.
42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain, July 25-26, 2004.

