Publications
Authors:
  • Greg Grefenstette
Citation:
ECAI '96 Workshop on "Extended Finite State Models of Language". August 11-12, 1996 Budapest, pp. 65-69
Abstract:
For a number of language processing tasks, such as information retrieval and information extraction,
pertinent information can be extracted from text without doing a full parse of the individual sentences.
The most common restriction of the parser is to adopt a non-recursive model of the language treated, which
allows an implementation of the parser using efficient finite-state tools at the cost of missing some coverage.
These light parsers allow the successive introduction of symbols into the input string wherever specified
regular expressions of words and/or part-of-speech tags match. Recent advances in finite-state expression
compilation make writing mark-up transducers simpler, leading to quicker implementations of layered
finite-state parsers. The resulting parsers are easier to create and maintain.

In this article, we describe a light parsing method using recently created finite-state operators. Two applications
of this parser are described: grouping adjacent syntactically-related units, and extracting non-adjacent n-ary
grammatical relations. A system for evaluating the parser over a large corpus is described.
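
The mark-up idea in the abstract (inserting symbols into the input wherever a regular expression over words and part-of-speech tags matches, one layer after another) can be illustrated with a minimal sketch. This is not the paper's implementation: it uses Python's `re` module as a stand-in for compiled finite-state transducers, and the `word/TAG` input format, the tag names (DET, ADJ, NOUN, VERB), the bracket symbols, and both patterns are assumptions chosen for illustration.

```python
import re

# Assumed input format: tokens written as word/TAG, separated by single spaces.
# Tag names and bracket symbols below are hypothetical, not taken from the paper.

# Layer 1: noun group = optional determiner, any adjectives, one or more nouns.
NOUN_GROUP = re.compile(
    r"(?<!\S)(?:\S+/DET )?(?:\S+/ADJ )*\S+/NOUN(?: \S+/NOUN)*(?!\S)"
)

# Layer 2: verb group = a single verb token (kept deliberately simple).
VERB_GROUP = re.compile(r"(?<!\S)\S+/VERB(?!\S)")


def mark(pattern: re.Pattern, label: str, text: str) -> str:
    """Wrap every match of `pattern` in [label ... label] boundary symbols."""
    return pattern.sub(lambda m: f"[{label} {m.group(0)} {label}]", text)


def light_parse(tagged: str) -> str:
    """Apply the mark-up layers in sequence; each layer sees the previous output."""
    layered = mark(NOUN_GROUP, "NG", tagged)
    layered = mark(VERB_GROUP, "VG", layered)
    return layered
```

Calling `light_parse("the/DET light/ADJ parser/NOUN marks/VERB noun/NOUN groups/NOUN")` brackets the two noun groups and the verb while leaving every original token in place, which is what allows later layers to match over symbols introduced by earlier ones. The patterns are non-recursive, so nested structures are not covered, in line with the coverage trade-off the abstract mentions.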
Year:
1996
Report number:
1996-012