Publications
Authors:
  • Madalina Barbaiani, Nicola Cancedda, Chris Dance, Szilard Fazekas, Tamas Gaal, Eric Gaussier
Citation:
Finite-State Methods and Natural Language Processing, Potsdam, 14-16 September, 2007
Abstract:
This article describes an HMM-based word-alignment method that can selectively enforce a contiguity constraint. The method applies directly to the extraction of a bilingual terminological lexicon from a parallel corpus, and can also serve as a preliminary step for extracting phrase pairs in a Phrase-Based Statistical Machine Translation system. Contiguous source words composing terms are aligned to contiguous target-language words. The HMM is transformed into a Weighted Finite State Transducer (WFST), and the contiguity constraints are enforced by additional, dedicated multi-tape WFSTs that encode them. The proposed method is especially well suited to settings where basic linguistic resources (morphological analyzers, part-of-speech taggers and term extractors) are available for the source language only.
Year:
2007
Report number:
2007/021