Syntactic Analysis of a Natural Language Using Linguistic Rules and Corpus-Based Patterns.
Pasi Tapanainen, Timo Järvinen
We are concerned with the syntactic annotation of unrestricted text. We combine a rule-based analysis with
subsequent exploitation of empirical data.
The rule-based surface syntactic analyser leaves some amount of ambiguity in the output that is resolved
using empirical patterns. We have implemented a system for generating and applying corpus-based
patterns. Some patterns describe the main constituents in the sentence and some the local context of
each syntactic function. There are several (partly) redundant patterns, and the "pattern" parser selects
the analysis of the sentence that matches the strictest possible pattern(s).
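
The selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the tag names (SUBJ, MAINV, DET, OBJ), the example patterns, and the use of pattern length as a proxy for "strictness" are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Pattern = Tuple[str, ...]

def matches(pattern: Pattern, analysis: Sequence[str]) -> bool:
    """True if the pattern occurs as a contiguous run of tags in the analysis."""
    n = len(pattern)
    return any(tuple(analysis[i:i + n]) == pattern
               for i in range(len(analysis) - n + 1))

def strictness(analysis: Sequence[str], patterns: Sequence[Pattern]) -> int:
    """Length of the longest matching pattern (assumed proxy for strictness)."""
    return max((len(p) for p in patterns if matches(p, analysis)), default=0)

def select_analysis(candidates: Sequence[List[str]],
                    patterns: Sequence[Pattern]) -> List[str]:
    """Pick the candidate analysis that matches the strictest pattern."""
    return max(candidates, key=lambda a: strictness(a, patterns))

# Hypothetical corpus-derived patterns; several are partly redundant.
patterns = [("SUBJ", "MAINV"), ("DET", "OBJ"), ("MAINV", "DET", "OBJ")]

# Two analyses left ambiguous by a rule-based parser (illustrative only).
candidates = [
    ["SUBJ", "MAINV", "DET", "SUBJ"],
    ["SUBJ", "MAINV", "DET", "OBJ"],
]

best = select_analysis(candidates, patterns)
```

Here the second candidate is chosen because it matches the three-tag pattern, while the first matches only a two-tag pattern. Ties are resolved in favour of the earlier candidate, which is a further simplification.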
The system is applied to an experimental corpus. We present the results and discuss possible refinements
of the method from a linguistic point of view.
COLING'94, Aug. 5-9, Kyoto, Japan, Vol. I, pp. 629-634