The linguistic annotation of naturally occurring text can be seen as a progression of transformations of the
original text, with each step abstracting away surface differences. Tokenization is one of the earliest steps in
this progression in natural language processing.
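
As a rough illustration of tokenization as an early transformation, the following minimal sketch splits raw text into word and punctuation tokens and then abstracts away one surface difference (letter case). The pattern and the names `TOKEN_PATTERN`, `tokenize`, and `normalize` are assumptions made for this example, not a method described in the source; real tokenizers also have to handle abbreviations, numbers, and language-specific conventions.

```python
import re

# Hypothetical pattern, chosen for illustration only: a run of word
# characters (optionally joined by internal apostrophes or hyphens),
# or any single character that is neither a word character nor whitespace.
TOKEN_PATTERN = re.compile(r"\w+(?:['-]\w+)*|[^\w\s]")


def tokenize(text):
    """Split raw text into a list of word and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


def normalize(tokens):
    """Abstract away one surface difference: letter case."""
    return [token.lower() for token in tokens]


if __name__ == "__main__":
    tokens = tokenize("The cat sat. the CAT sat!")
    print(tokens)
    print(normalize(tokens))
```

After normalization, the two sentences in the example yield the same word tokens even though their surface forms differ, which is the kind of abstraction later annotation steps can build on.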
Syntactic Wordclass Tagging. Kluwer Academic Publishers, Dordrecht, 1999