A Formalism for Universal Segmentation of Text

Julien Quint
Sumo is a formalism for universal segmentation of text. Its purpose is to provide a framework for the creation of segmentation applications. It is called "universal" because the formalism itself is independent of the language of the documents to be processed and of the levels of segmentation (e.g. words, sentences, paragraphs, morphemes...) considered by the target application. This framework relies on a layered structure representing the possible segmentations of the document. This structure and the tools to manipulate it are described, followed by detailed examples highlighting some features of Sumo.
Coling 2000
2000
2000/038