Document Image Retrieval without OCRing Using a Video Scanning System
Ercan Kuruoglu, Vern Tan
We propose a technique for efficient document retrieval from digital libraries containing document images
that are stored with token-based compression. The technique uses the layout information supplied by
the relative positions of the character tokens on the page of a "query" paper document to retrieve the original
document from the image database. The query image is captured from a paper document by a multimedia
system composed of a PC and a video scanning tool. This technique avoids OCRing both the query document
and the documents in the database; moreover, it avoids decompressing the token-based compressed
documents in the database, thereby achieving significant savings in time and computation.
The technique makes it possible to retrieve the original document stored in a digital library
using part of a previously produced paper copy.
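
The idea of matching on token layout rather than recognized text can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes each page is already available as a list of (x, y) token positions in reading order, the kind of positional data a token-based compressed format typically stores. The function names, the offset-histogram signature, and the cosine score are all hypothetical choices made for illustration.

```python
from collections import Counter
from math import sqrt


def layout_signature(tokens):
    """Histogram of quantized offsets between consecutive token positions.

    tokens: list of (x, y) positions in reading order (assumed input format).
    Offsets are divided by the median non-zero horizontal step, so a page
    captured at a different scale yields the same signature.
    """
    offsets = [(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(tokens, tokens[1:])]
    steps = sorted(abs(dx) for dx, _ in offsets if dx != 0)
    scale = steps[len(steps) // 2] if steps else 1.0
    # Quantize to half-steps of the median spacing (an arbitrary choice).
    return Counter((round(2 * dx / scale), round(2 * dy / scale))
                   for dx, dy in offsets)


def cosine(a, b):
    """Cosine similarity between two sparse histograms."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_tokens, database):
    """Return the id of the stored page whose layout best matches the query."""
    query_sig = layout_signature(query_tokens)
    return max(database,
               key=lambda doc_id: cosine(query_sig,
                                         layout_signature(database[doc_id])))


# Toy data: two stored pages and a query that is page "A" captured at 2x scale.
database = {
    "A": [(0, 0), (10, 0), (20, 0), (45, 0), (0, 20), (10, 20)],
    "B": [(0, 0), (30, 0), (40, 0), (0, 20), (10, 20), (20, 20)],
}
query = [(2 * x, 2 * y) for x, y in database["A"]]
best_match = retrieve(query, database)
```

In this toy setup the query is matched using token positions alone. No character is recognized and no page image is reconstructed, which is the property the abstract emphasizes.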
E. Kuruoglu & V. Tan, Document Image Retrieval without OCRing Using a Video Scanning System, Proc. ACM International Workshop on
Multimedia Information Retrieval 2000, 30th October - 4th November 2000, Los Angeles, California. ACM, New York.