Internship

Image retrieval/categorization (2)

Unit: Grenoble/TVPA

Florent Perronnin - Florent.Perronnin@xrce.xerox.com
Marco Bressan - Marco.Bressan@xrce.xerox.com

Duration: 3-6 months
Start Date: Feb 2009 and after

The main research lines within the Textual and Visual Pattern Analysis (TVPA) area at XRCE are categorization and retrieval of text and images; multimodal and hybrid pattern analysis (text, images, cross-lingual); image clustering and visualization; machine learning; document object detection and image aesthetics. Much of our research along these lines has been turned into innovative solutions with proven scientific and commercial performance. Examples include Xerox GVC (image categorization engine), CategoriX (text categorization engine) and Xerox AIE (Automatic Image Enhancement).

The goal of this internship is to invent, implement and evaluate algorithms for the automatic categorization of scanned documents. While current systems typically perform page-level classification, we are interested in addressing the problem raised by multi-page documents. The main challenge will be to automatically uncover the hidden structure of such documents.

The successful candidate will have strong theoretical and practical knowledge of computer vision and machine learning, as well as good programming skills in C/C++.

XRCE provides an informal and relaxed working environment situated in the Parc de Maupertuis in Meylan. Successful candidates will be given the freedom and flexibility to find their own solutions and to work in a way that suits them, with the guidance and support of experienced full-time Xerox researchers. They will thereby gain an introduction to the field of commercial research in a world-class research laboratory.

The Xerox Research Centre Europe (XRCE) is a young, dynamic research organization that aims to create innovative document technologies to support growth in Xerox content and document management services across the different Xerox businesses.


XRCE is a multicultural and multidisciplinary organization set in Grenoble, France. Our domains of research stretch from the social sciences to computing. We have renowned expertise in natural language applications, work practice studies, image-based document processing, distributed applications and knowledge management agents. The diversity of cultures and disciplines at XRCE makes it an interesting and stimulating environment to work in, often leading to unexpected discoveries!

XRCE is part of the Xerox Innovation Group, made up of 800 researchers and engineers in four world-renowned research and technology centres. Xerox is an equal opportunity employer.

The Grenoble site is set in a park in the heart of the French Alps, in a stunning location only a few kilometres from the city centre. The city of Grenoble has a large scientific community made up of national research institutes (CNRS, Universities, INRIA) and private industries. Stimulated also by the presence of a large student community, Grenoble has become a resolutely modern city, with a rich heritage and a vibrant cultural scene. It is a lively and cosmopolitan place, offering a host of leisure opportunities. Winter sports resorts just half an hour from campus and three natural parks at the city limits make running, skiing, trekking, climbing and paragliding easily available.
Grenoble is close to both the Swiss and Italian borders.