Phone : +33 (0)4 76 61 51 98 Fax : +33 (0)4 76 61 50 99 Guillaume.Bouchard@xrce.xerox.com
My main research interest is statistical learning for text understanding and user modeling. I work with Cedric Archambeau, Onno Zoeter and Jean-Marc Andreoli, applying data mining techniques to several applications, including print infrastructure optimization and content creation modeling. I'm also a member of the EU PASCAL-2 Network of Excellence (NoE).
The amount of time we spend writing every day is striking: emails, reports, presentations, administrative tasks, chat, SMS, blogs, questions, notes, etc. In fact, most people (at least most of those that I know!) spend several hours each day writing text. Most of the time, these texts are written on computers, smartphones or tablets, but the automated support that exists is rather basic (word completion, spelling correction). I believe we can develop much better technology to support everyday writing by learning what to write and proposing drafts automatically based on the context. I'm looking at several business applications where texts are regularly composed and contain a potentially large amount of repeated information, e.g. texts that are created by combining several sources of information and that could be constructed by a series of copy-paste operations and limited customization based on simple syntactic transformation rules. If we can successfully propose meaningful texts to users, it will open the door to many exciting user interfaces. Since the early days of computer science, tools to improve communication between humans and machines have been developed, but we are still far from having a common language for smooth communication. My vision (or hope...) is that the next generation of computer languages will actually be natural language.
Graphical models (a.k.a. Bayesian networks) are graphs that encode conditional independencies between random variables. They provide an efficient "language" for expressing complex probability distributions, and many real-world systems can be modeled using this paradigm. In several applications, I have exploited the power of this representation to understand, compare and improve existing algorithms. Many machine learning tasks can be expressed in terms of inference in a probabilistic model. In many situations (e.g. in graphical models), exact inference is intractable but can be approximated efficiently. Originally coming from the MCMC community, my research nowadays focuses on variational techniques, such as variational mean-field algorithms.
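As an illustration of the variational mean-field idea (my own minimal sketch, not code from any specific paper of mine), here is naive mean-field for a small Ising model p(s) ∝ exp(Σ_ij J_ij s_i s_j + Σ_i h_i s_i) with s_i ∈ {-1, +1}: the intractable distribution is approximated by a fully factorized one, whose parameters are found by coordinate ascent on the variational bound.

```python
import math

def mean_field(J, h, iters=100):
    """Naive mean-field for an Ising model.

    J: symmetric coupling matrix (n x n, zero diagonal), h: external fields.
    Returns the approximate magnetizations m_i = E_q[s_i] under the
    factorized approximation q(s) = prod_i q_i(s_i).
    Coordinate-ascent update: m_i <- tanh(h_i + sum_j J_ij m_j).
    """
    n = len(h)
    m = [0.1] * n  # small symmetric-breaking initialization
    for _ in range(iters):
        for i in range(n):
            field = h[i] + sum(J[i][j] * m[j] for j in range(n) if j != i)
            m[i] = math.tanh(field)
    return m
```

For example, with a positive coupling J = [[0, 0.5], [0.5, 0]] and positive fields h = [0.2, 0.2], the two magnetizations converge to the same positive fixed point of m = tanh(0.2 + 0.5 m).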
In machine learning, generative and discriminative methods each have advantages and drawbacks. My main contributions to this domain are the definition of a hybrid generative-discriminative estimation technique (called the Generative-Discriminative Tradeoff) and the proof of its optimality under weak conditions. This was roughly the main topic of my PhD thesis, but I'm still working on it, mainly on inference, which is in general a difficult optimization problem.
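To make the tradeoff concrete, here is an illustrative sketch (my simplified example, not the exact GDT estimator) of a hybrid objective for a two-class Gaussian model: a convex combination of the generative (joint) and discriminative (conditional) log-likelihoods, controlled by a parameter lam in [0, 1].

```python
import math

def gauss_logpdf(x, mu, sigma2):
    """Log-density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * sigma2) - (x - mu) ** 2 / (2 * sigma2)

def hybrid_loglik(data, mu0, mu1, sigma2, prior1, lam):
    """(1 - lam) * joint log-likelihood + lam * conditional log-likelihood.

    lam = 0 recovers the purely generative criterion, lam = 1 the purely
    discriminative one; intermediate values interpolate between the two.
    Model: class prior p(y=1) = prior1, class-conditional Gaussians with
    means mu0, mu1 and shared variance sigma2.
    """
    total = 0.0
    for x, y in data:
        lj0 = math.log(1 - prior1) + gauss_logpdf(x, mu0, sigma2)  # log p(x, y=0)
        lj1 = math.log(prior1) + gauss_logpdf(x, mu1, sigma2)      # log p(x, y=1)
        log_joint = lj1 if y == 1 else lj0
        log_marg = math.log(math.exp(lj0) + math.exp(lj1))         # log p(x)
        log_cond = log_joint - log_marg                            # log p(y | x)
        total += (1 - lam) * log_joint + lam * log_cond
    return total
```

By construction the objective is linear in lam, so maximizing it for a fixed lam selects parameters anywhere between the maximum-likelihood (generative) and conditional-likelihood (discriminative) fits.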
Publications since I joined XRCE
Publications prior to 2005
I received an engineering degree in mathematics from the Institut National des Sciences Appliquées and a master's degree in Applied Mathematics, with a specialization in probability theory, in 2001. In May 2005, I received a PhD in statistics from INRIA Rhône-Alpes and Université Joseph Fourier (France). My PhD supervisors were Gilles Celeux, from the SELECT team, and Bill Triggs, from the LEAR project. My grant was provided by the European LAVA project.
Matlab code to demonstrate the upper bound to the log-sum-exp function
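As a language-neutral sketch of one such inequality (my illustrative Python reimplementation, not the referenced Matlab code itself): for any real α, since Σ_k e^{x_k−α} ≤ Π_k (1 + e^{x_k−α}), we get log Σ_k e^{x_k} ≤ α + Σ_k log(1 + e^{x_k−α}).

```python
import math

def log_sum_exp(xs):
    """Numerically stable exact log-sum-exp (max-shift trick)."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def lse_upper_bound(xs, alpha):
    """Upper bound on log-sum-exp, valid for any real alpha.

    Follows from sum_k e^{x_k - alpha} <= prod_k (1 + e^{x_k - alpha}),
    hence log sum_k e^{x_k} <= alpha + sum_k log(1 + e^{x_k - alpha}).
    """
    return alpha + sum(math.log1p(math.exp(x - alpha)) for x in xs)
```

The bound holds for every choice of α; tightening it amounts to minimizing the right-hand side over α, which makes it usable inside variational inference schemes where the exact log-sum-exp (softmax normalizer) is inconvenient.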