Finite-State Based Reductionist Parsing for French.

Jean-Pierre Chanod, Pasi Tapanainen
This paper describes a robust finite-state based parser applied to French. The non-deterministic tokeniser includes a finite-state automaton for simple tokens and a lexical transducer for encoding a wide variety of multiword expressions. The lexicon attaches morpho-syntactic tags to each token and alternative clause boundaries in between. The parser can parse technical manuals with high accuracy: in a test sample, 95% of both functional and part-of-speech tags were correct. The average number of parses per sentence is low: more than 92% of sentences produce four or fewer parses, including the correct one.
András Kornai (ed.): Extended Finite State Models of Language. Cambridge University Press.


finite-state.pdf (97.38 kB)