Guillaume Bouchard

Phone: +33 (0)4 76 61 51 98
Fax: +33 (0)4 76 61 50 99
Guillaume.Bouchard@xrce.xerox.com



My main research interest is statistical learning for text understanding and user modeling. I work with Cedric Archambeau, Onno Zoeter and Jean-Marc Andreoli, applying data mining techniques to several applications, including print infrastructure optimization and content creation modeling. I'm also a member of the EU PASCAL-2 Network of Excellence (NoE).

I recently organized:

Research

Support for text generation

The amount of time we spend writing every day is striking: emails, reports, presentations, administrative tasks, chat, SMS, blogs, questions, notes, etc. In fact, most people (at least most of those that I know!) spend several hours each day writing some text. Most of the time, these texts are written on computers, smartphones or tablets, but the automated support that exists is rather basic (word completion, spelling correction). I believe we can develop much better technology to support everyday writing by learning what to write and proposing drafts automatically based on the context. I'm looking at several business applications where texts are composed regularly and contain a potentially large amount of repeated information, e.g. texts that combine several sources of information and that could be constructed by a series of copy-paste operations followed by limited customization based on simple syntactic transformation rules. If we can successfully propose meaningful texts to users, it would open the door to many exciting user interfaces. Since the early days of computer science, tools to improve communication between humans and machines have been developed, but we are still far from having a common language in which to communicate smoothly. My vision (or hope...) is that the next generation of programming languages will actually be natural language.

Graphical models and probabilistic inference

Graphical models (such as Bayesian networks) are graphs that encode conditional independencies between random variables. They provide an efficient "language" for expressing complex probability distributions, and many real-world systems can be modeled using this paradigm. In several applications, I have used this representation to understand, compare and improve existing algorithms. Many machine learning tasks can be expressed as inference in a probabilistic model. In many situations (e.g. in graphical models), exact inference is intractable but can be approximated efficiently. Having originally come from the MCMC community, I now focus my research on variational techniques, such as variational mean-field algorithms.
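As a small illustration of how a graph encodes conditional independencies, here is a minimal Python sketch of a three-variable chain A -> B -> C. The probability tables and the function names are hypothetical values chosen only for this example; they do not come from any of the applications mentioned above. The joint distribution factorizes along the graph, and brute-force enumeration shows that C is independent of A once B is known.

```python
from itertools import product

# Hypothetical conditional probability tables for the chain A -> B -> C.
# All variables are binary; the numbers are illustrative assumptions.
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # indexed [a][b]
p_c_given_b = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}  # indexed [b][c]


def joint(a, b, c):
    """Joint probability given by the factorization p(a) p(b|a) p(c|b)."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]


def prob_c_given(c, b, a=None):
    """P(C=c | B=b), or P(C=c | B=b, A=a) if a is given, by enumeration."""
    a_values = (0, 1) if a is None else (a,)
    numerator = sum(joint(x, b, c) for x in a_values)
    denominator = sum(joint(x, b, z) for x in a_values for z in (0, 1))
    return numerator / denominator


# The factorized joint sums to one over all configurations.
total = sum(joint(a, b, c) for a, b, c in product((0, 1), repeat=3))

# Conditioning on A changes nothing once B is known: C is independent of A given B.
max_gap = max(
    abs(prob_c_given(c, b, a) - prob_c_given(c, b))
    for a, b, c in product((0, 1), repeat=3)
)
print(total, max_gap)
```

Enumeration like this scales exponentially with the number of variables, which is why approximate inference becomes necessary for larger models.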

Automatic classification

In machine learning, generative and discriminative methods both have advantages and drawbacks. My main contributions to this domain are the definition of a hybrid generative-discriminative estimation technique (called the Generative-Discriminative Tradeoff) and the proof of its optimality under weak conditions. This was roughly the main topic of my PhD thesis, but I'm still working on it, mainly on inference, which is in general a difficult optimization problem.
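One common way to write such a tradeoff is as a convex combination of the joint (generative) and conditional (discriminative) log-likelihoods. The formula below is an illustrative assumption, not necessarily the exact definition used in the thesis; the symbols \theta (model parameters), \lambda (tradeoff weight) and (x_i, y_i) (training pairs) are introduced here only for this sketch.

```latex
\hat{\theta}_{\lambda}
  = \arg\max_{\theta} \sum_{i=1}^{n}
    \Big[ (1-\lambda)\,\log p(x_i, y_i \mid \theta)
        + \lambda\,\log p(y_i \mid x_i, \theta) \Big],
  \qquad \lambda \in [0, 1],
\qquad\text{where}\quad
p(y \mid x, \theta) = \frac{p(x, y \mid \theta)}{\sum_{y'} p(x, y' \mid \theta)}.
```

With \lambda = 0 this reduces to the usual generative (maximum likelihood) estimate, and with \lambda = 1 to the purely discriminative one; intermediate values interpolate between the two.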

Publications

Recent

  • M. E. Khan, B. Marlin, G. Bouchard, K. Murphy. Variational bounds for mixed-data factor analysis (2010). Advances in Neural Information Processing Systems (NIPS). [pdf]
  • G. Convertino, B. Hanrahan, N. Kong, T. Weksteen, E. H. Chi, G. Bouchard, C. Archambeau. Mail2Wiki: low-cost sharing and organization on wikis. Workshop on Collective Intelligence in Organizations: Tools and Studies and ACM Group 10 Conference; 2010 November 7; Sanibel Island FL. [link]
  • G. Bouchard. Modèles hybrides génératifs-discriminatifs: théorie et applications. 2010. Invited speaker at journées MAS 2010. [slides]
  • G. Bouchard. Named Entity Generation using Sampling-based Structured Prediction (2010). Generation Challenge special track at the 6th International Natural Language Generation Conference (INLG 2010). [pdf]
  • P. Liang, F. Bach, G. Bouchard, M. I. Jordan. Asymptotically optimal regularization in smooth parametric models (2010). Advances in Neural Information Processing Systems (NIPS). [pdf]

Publications since I joined XRCE

Publications prior to 2005

Education

I received an engineering degree in mathematics from Institut National des Sciences Appliquées and a master's degree in Applied Mathematics with a specialization in probability theory in 2001. I received a PhD in statistics from INRIA Rhône-Alpes and Université Joseph Fourier (France) in May 2005. My PhD supervisors were Gilles Celeux, from the SELECT team, and Bill Triggs, from the LEAR project. My grant was provided by the European LAVA project.

Code

MATLAB code demonstrating the upper bound on the log-sum-exp function
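A bound of this kind can also be checked numerically. The sketch below is written in Python rather than MATLAB and uses one known bound, log sum_k exp(x_k) <= alpha + sum_k log(1 + exp(x_k - alpha)), which holds for any real alpha. Whether this is exactly the bound demonstrated by the code above is an assumption, and the function names and test values are hypothetical.

```python
import math


def log_sum_exp(x):
    """Numerically stable log(sum_k exp(x_k))."""
    m = max(x)
    return m + math.log(sum(math.exp(v - m) for v in x))


def softplus(t):
    """Numerically stable log(1 + exp(t))."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def lse_upper_bound(x, alpha):
    """alpha + sum_k log(1 + exp(x_k - alpha)); an upper bound for any real alpha."""
    return alpha + sum(softplus(v - alpha) for v in x)


# Illustrative inputs (assumed values, not taken from the original code).
x = [1.0, -0.5, 2.0]
exact = log_sum_exp(x)
bounds = {alpha: lse_upper_bound(x, alpha) for alpha in (-1.0, 0.0, 1.0, 2.0, 3.0)}

for alpha, bound in bounds.items():
    print(f"alpha={alpha:+.1f}  bound={bound:.4f}  exact={exact:.4f}")
```

The bound replaces a function that couples all the x_k with a sum of one-dimensional terms, and alpha can be tuned to tighten it.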