Language technologies and patent search and classification

David Hull, Salah Ait-Mokhtar, Mathieu Chuat, Andreas Eisele, Eric Gaussier, Greg Grefenstette, Pierre Isabelle, Christer Samuelsson, Frederique Segond
This paper describes research on a number of developments in language technologies aimed at improving patent processing procedures within patent offices and in subsequent patent database search systems. The aspects of patent processing covered are (1) OCR correction, to assist the conversion of paper documents to electronic versions, and (2) text classification, to assist in allocating new patent applications to the correct technical experts. The aspects of patent searching covered are (3) terminology enrichment, linking information access, to help formulate queries in other languages for searching documents in those languages, and, beyond that, to support more sophisticated search activities, for example in patent information centres, which complement the low-cost or free offerings on the Internet.
World Patent Information 23 (2001) 265-268