Evaluation Techniques for Automatic Semantic Extraction: Comparing Syntactic and Window Based Approaches

Greg Grefenstette
As large on-line corpora become more prevalent, a number of attempts have been made to automatically extract thesaurus-like relations directly from text using knowledge-poor methods. In the absence of any specific application, comparing the results of these attempts is difficult. Here we propose an evaluation method using gold standards, i.e., pre-existing hand-compiled resources, as a means of comparing extraction techniques. Using this evaluation method, we compare two semantic extraction techniques which produce similar word lists, one using syntactic context of words, and the other using windows of heuristically tagged words. The two techniques are very similar except that in one case selective natural language processing, a partial syntactic analysis, is performed. On a 4-megabyte corpus, syntactic contexts produce significantly better results against the gold standards for the most characteristic words in the corpus, while windows produce better results for rare words.
Workshop on Acquisition of Lexical Knowledge from Text (ACL SIGLEX), Columbus, Ohio. Corpus Processing for Lexical Acquisition, Eds.: Bran Boguraev and James Pustejovsky, MIT Press, 1996. ISBN: 026202392X
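
The gold-standard comparison described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the window size, the use of Jaccard overlap as the similarity measure, and the top-1 hit criterion are all hypothetical stand-ins. It shows only the window-based side; a syntactic variant would replace `window_contexts` with contexts drawn from a partial parse.

```python
from collections import defaultdict


def window_contexts(tokens, size=2):
    """Map each word to the set of words seen within `size` positions of it."""
    contexts = defaultdict(set)
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - size), min(len(tokens), i + size + 1)
        for j in range(lo, hi):
            if j != i:
                contexts[word].add(tokens[j])
    return contexts


def jaccard(a, b):
    """Jaccard overlap of two context sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest_neighbor(word, contexts):
    """Return the word whose contexts overlap most with those of `word`."""
    candidates = [w for w in contexts if w != word]
    if not candidates:
        return None
    # Ties are broken alphabetically (largest word wins) to stay deterministic.
    return max(candidates, key=lambda w: (jaccard(contexts[word], contexts[w]), w))


def gold_hit_rate(targets, contexts, gold):
    """Fraction of target words whose nearest neighbor is listed in the gold resource.

    `gold` is a hypothetical dict mapping a word to the set of related words
    found in a hand-compiled resource such as a thesaurus.
    """
    if not targets:
        return 0.0
    hits = sum(
        1 for w in targets if nearest_neighbor(w, contexts) in gold.get(w, set())
    )
    return hits / len(targets)
```

Running `gold_hit_rate` once with window contexts and once with syntactically derived contexts, over the same target words and the same gold resource, would give two directly comparable scores, which is the kind of comparison the abstract describes.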