  

 

FINITE STATE TECHNOLOGY

Research in Finite State Technology concentrates on tools for specifying and manipulating finite state automata (acceptors, transducers, and multi-tape machines). Our tools (xfst, twolc, lexc) are built on a software library that provides algorithms for creating automata from regular expressions and equivalent formalisms. The library includes both classical operations, such as union and composition, and newer algorithms, such as replacement and local sequentialisation. Over the years, the products of our research have come to be used around the world in many linguistic applications, such as morphological analysis, tokenisation, and shallow parsing, for a wide variety of natural languages. The xfst tool has been licensed to over 70 universities worldwide. Many components have been incorporated into commercial software.
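To make the notion of composition concrete, the following is a minimal sketch in Python of the classical product construction for two transducers. It is not the Xerox library or the xfst tool: the `FST` class, the `compose` function, and the arc representation are hypothetical names chosen for illustration, and the sketch assumes epsilon-free, unweighted, single-character arcs.

```python
from collections import deque


class FST:
    """Toy epsilon-free transducer (hypothetical class, not the Xerox library)."""

    def __init__(self, start, finals, arcs):
        self.start = start
        self.finals = set(finals)
        # arcs maps a state to a list of (input_symbol, output_symbol, next_state)
        self.arcs = arcs

    def apply(self, word):
        """Return the set of output strings that the transducer pairs with `word`."""
        results = set()
        queue = deque([(self.start, 0, "")])
        while queue:
            state, pos, out = queue.popleft()
            if pos == len(word):
                if state in self.finals:
                    results.add(out)
                continue
            for sym_in, sym_out, nxt in self.arcs.get(state, []):
                if sym_in == word[pos]:
                    queue.append((nxt, pos + 1, out + sym_out))
        return results


def compose(a, b):
    """Product construction: the result maps x to z when a maps x to y and b maps y to z."""
    start = (a.start, b.start)
    arcs = {}
    finals = set()
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        p, q = state
        if p in a.finals and q in b.finals:
            finals.add(state)
        for sym_in, mid_a, p_next in a.arcs.get(p, []):
            for mid_b, sym_out, q_next in b.arcs.get(q, []):
                # An arc pair matches when a's output symbol is b's input symbol.
                if mid_a == mid_b:
                    nxt = (p_next, q_next)
                    arcs.setdefault(state, []).append((sym_in, sym_out, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return FST(start, finals, arcs)


# Illustrative machines: one rewrites every "a" as "b", the other every "b" as "c".
a_to_b = FST(0, [0], {0: [("a", "b", 0)]})
b_to_c = FST(0, [0], {0: [("b", "c", 0)]})
a_to_c = compose(a_to_b, b_to_c)
```

Under these assumptions, `a_to_c.apply("aaa")` yields the single output string `"ccc"`, while an input containing any other symbol yields no output. A production library would also handle epsilon transitions, weights, and multi-tape machines, which this sketch leaves out.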

 

SELECTED PROJECTS, TOOLS, and DEMOS :

Weighted Finite-State Compiler - WFSC (Creating and manipulating multi-tape weighted finite-state machines with symbol classes)
Xerox Finite State Programming (Info: Regular expressions, automata, literature)
Xerox Finite State Compiler (Demo: Enter a regular expression and get an automaton)
Arabic Morphological Processing (Demo)
Tools for Natural Language Processing
Internships (All of XRCE)

 

LICENSING :

Xfst, twolc, and lexc are now available, under a non-commercial license, in the book Finite State Morphology (Beesley and Karttunen, 2003, CSLI Publications) which documents their use. The finite-state software has also been licensed commercially.

 

PUBLICATIONS :

References to finite state methods in Natural Language Processing (outdated)
Publications from XRCE researchers

 

PEOPLE :

Ken Beesley, Tamás Gaál, André Kempe - Permanent researchers
Lauri Karttunen - Founder (now at Palo Alto Research Center, CA, USA)
Eric Boumaour, Tibor Csáki, Szilárd Fazekas, Franck Guingne, Florent Nicart, Pasi Tapanainen - Project alumni

 

We welcome comments and questions.
