Research Seminar: NLP and machine learning, Low-rank Matrix Learning for Compositional Objects, Strings and Trees
Speaker: Xavier Carreras, senior research scientist, Xerox Research Centre Europe
Abstract: I will present a framework that casts the learning problem as low-rank matrix learning, using the well-known nuclear norm regularizer to favor low-rank parameter matrices.
I will then illustrate applications of this framework in NLP tasks. First, I will show a method to learn word embeddings tailored for specific linguistic relations, which results in word vectors that are both very compact and predictive.
Then I will show the importance of low-rank regularization in conjunctive feature spaces, and specifically how our method can propagate weights to conjunctions not observed during training, which results in large improvements in a named entity extraction task.
Finally, I will move to structured prediction and show that low-rank matrix learning can be used to induce the states of weighted automata for sequence tagging tasks.
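
As background on the nuclear norm regularizer mentioned in the abstract: the nuclear norm of a matrix is the sum of its singular values, and penalizing it tends to drive small singular values to exactly zero, which lowers the rank. The sketch below is only an illustration of that general effect, not the speaker's method. It assumes NumPy, and the function names (nuclear_norm, singular_value_threshold) and the toy data are hypothetical, chosen for this example. It applies singular value soft-thresholding, the proximal step associated with the nuclear norm penalty, to a noisy rank-2 matrix.

```python
import numpy as np


def nuclear_norm(W):
    """Return the nuclear norm of W, i.e. the sum of its singular values."""
    return np.linalg.svd(W, compute_uv=False).sum()


def singular_value_threshold(W, tau):
    """Soft-threshold the singular values of W by tau.

    This is the proximal operator of tau * nuclear_norm: every singular
    value is reduced by tau, and those below tau become exactly zero.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    # Scale the columns of U by the shrunken singular values, then recombine.
    return (U * s_shrunk) @ Vt


# Toy data (an assumption for illustration): a rank-2 matrix plus small noise.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
noisy = low_rank + 0.01 * rng.standard_normal((20, 15))

shrunk = singular_value_threshold(noisy, tau=0.5)

# An explicit tolerance keeps the rank estimate robust to round-off.
rank_before = np.linalg.matrix_rank(noisy, tol=1e-8)
rank_after = np.linalg.matrix_rank(shrunk, tol=1e-8)
print("rank before:", rank_before, "rank after:", rank_after)
print("nuclear norm before:", nuclear_norm(noisy), "after:", nuclear_norm(shrunk))
```

In a learning setting, a step like this would typically alternate with gradient steps on a loss over the parameter matrix; the threshold tau plays the role of the regularization strength.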