European Conference on Machine Learning & Practice of Knowledge Discovery in Databases ECML-PKDD September 15-19

Tomi Silander presenting paper co-authored by Arvind Agarwal and Saurabh Kataria (XRCW): “Multitask Learning for Sequence Labeling Tasks”.

Anna Stavrianou presenting paper co-authored by Caroline Brun, Tomi Silander and Claude Roux: “NLP-based Feature Extraction for Automated Tweet Classification”.

Publication: NLP-based Feature Extraction for Automated Tweet Classification

Authors: Anna Stavrianou, Caroline Brun, Tomi Silander, Claude Roux

Seminar: "Perceptual annotation: measuring human vision to improve computer vision"; 15 September 2014 11:00AM

Speaker: Walter Scheirer, post-doctoral researcher at Harvard University, Cambridge, MA, U.S.A.

Abstract: For many problems in computer vision, human learners are considerably better than machines. Humans possess highly accurate internal recognition and learning mechanisms that are not yet understood, and they frequently have access to more extensive training data through a lifetime of unbiased experience with the visual world. In this talk, I will propose the use of visual psychophysics to directly leverage the abilities of human subjects to build better machine learning systems. First, I will describe an advanced online psychometric testing platform to make new kinds of annotation data available for learning. Second, I will develop a technique for harnessing these new kinds of information - "perceptual annotations" - for support vector machines. A key intuition for this approach is that while it may remain infeasible to dramatically increase the amount of data and high-quality labels available for the training of a given system, measuring the exemplar-by-exemplar difficulty and pattern of errors of human annotators can provide important information for regularizing the solution of the system at hand. In practice, such models generalize quite well to big data problems. A case study for the problem of unconstrained face detection highlights this observation: the approach yields state-of-the-art results on the challenging FDDB data set.

More information on XRCE seminars.

Publication: Learning mobility user choice and demand models from public transport fare collection data

Authors: Frédéric Roulland, Luis Ulloa, Arturo Mondragon, Michael Niemaz, Guillaume Bouchard, Victor Ciriza

CLEF 2014 Conference and Labs of the Evaluation Forum; 15 - 18 September 2014, Sheffield - UK

Information Access Evaluation meets Multilinguality, Multimodality, and Interaction

Boris Chidlovskii presenting: "Assembling Heterogeneous Domain Adaptation Methods for Image Classification", co-authored by Gabriela Csurka and Boris Chidlovskii

Abstract: In this paper we report the contribution of the XRCE team to the Domain Adaptation Challenge organized in the framework of the ImageCLEF 2014 competition. We describe our approach to building an image classification system when weak image annotation in the target domain is compensated by massively annotated images in source domains. Our approach uses several heterogeneous domain adaptation methods, followed by late fusion of their individual predictions. One big class of domain adaptation methods addresses a selective reuse of instances from source domains for the target domain. We adopt adaptive boosting for weighting source instances, which learns a combination of weak classifiers in the target domain. Another class of methods aims to transform both target and source domains into a common space. In this class we focused on metric learning approaches aimed at reducing distances between images from the same class and increasing distances between images from different classes, regardless of whether they come from the source or target domain. Combining the above approaches with a "brute-force" SVM-based approach, we obtain a set of heterogeneous classifiers for class prediction of target instances. In order to improve the overall accuracy, we combine individual classifiers through different versions of majority voting. We describe different series of experiments, including those submitted for the official competition, and analyze their results.
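
The late-fusion step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the XRCE implementation: the function names `majority_vote` and `weighted_vote`, the tie-breaking rule, the example labels and the weights are all assumptions made for illustration. The sketch shows two plausible "versions of majority voting": a plain vote and a vote weighted by a per-classifier confidence.

```python
from collections import Counter, defaultdict


def majority_vote(predictions):
    """Fuse label predictions from several classifiers by plain majority.

    predictions: one list of labels per classifier, all of equal length.
    Ties go to the earliest classifier whose vote is among the top labels
    (an assumed tie-breaking rule, not taken from the paper).
    """
    fused = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        fused.append(next(v for v in votes if counts[v] == top))
    return fused


def weighted_vote(predictions, weights):
    """Fuse predictions, counting each classifier's vote with its weight.

    weights: one non-negative number per classifier, for example a
    validation accuracy (hypothetical choice).
    """
    fused = []
    for votes in zip(*predictions):
        scores = defaultdict(float)
        for label, weight in zip(votes, weights):
            scores[label] += weight
        best = max(scores.values())
        fused.append(next(v for v in votes if scores[v] == best))
    return fused


# Illustrative predictions of three classifiers on three target images.
preds = [
    ["cat", "dog", "car"],   # e.g. a boosting-based classifier
    ["cat", "car", "car"],   # e.g. a metric-learning classifier
    ["dog", "dog", "bike"],  # e.g. an SVM baseline
]
plain = majority_vote(preds)
weighted = weighted_vote(preds, [0.2, 0.3, 0.9])
```

With equal weights the weighted variant reduces to the plain vote; unequal weights let a single strong classifier override two weaker ones that agree with each other.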