Tutorial
Learning with Dependencies between Several Response Variables:
From Hierarchical Bayes and Multitask Learning to Structured Prediction and Relational Learning
Volker Tresp, Siemens Corporate Technologies
Kai Yu, NEC Laboratories America
We analyze situations where modeling several response variables for a
given input improves the prediction accuracy for each individual
response variable. Interestingly, this setting has appeared in
different contexts, and a number of different but related approaches have
been proposed. All of these approaches make some assumptions about the
dependency structure between the response variables.
Here is a small selection of labels describing relevant work:
multitask learning, multi-class
classification, multi-label prediction, hierarchical Bayes, inductive
transfer learning, hierarchical linear models, mixed effect models,
partial least squares, canonical correlation analysis, maximal
covariance regression, multivariate regression, structured prediction,
relational learning, …
The large number of approaches is confusing for the novice, and often
even for the expert. In this tutorial we systematically introduce some
of the major approaches and describe them from a common viewpoint.
Organizers:
Volker Tresp received a Diploma
degree from the University of Göttingen, Germany, in 1984, and a
Ph.D. from Yale University in 1989. Since 1989 he has headed a
research team in machine learning at Siemens, Corporate Research and
Technology. In 1994 he was a visiting scientist at the Massachusetts
Institute of Technology's Center for Biological and Computational
Learning. Since 2003 he has given a lecture each summer on
machine learning and data mining at the University of Munich. He has
been involved in all leading program committees in machine learning and
is on the organizing committee of the annual Learning Workshop. He was
co-editor of Neural Information Processing Systems (NIPS*13), MIT
Press, 2000, and co-chair of the industrial track of KDD 2008.
Kai Yu
is a researcher at NEC Laboratories America in California. Before
joining NEC Labs, he was a senior research scientist at Siemens
Corporate Technology in the lovely city of Munich, Germany. He obtained
his Ph.D. in computer science from the University of Munich (LMU),
supervised by Prof. Hans-Peter Kriegel and Dr. Volker Tresp (Siemens
AG), and received B.Sc. and M.Sc. degrees, both in electrical
engineering, from Nanjing University, China, in 1998 and 2000,
respectively.
Slides (final)
Outline
Introduction and Motivation
We provide a number of examples of
modeling with several response variables.
Hierarchical Bayes and Mixed Models
Problem Settings and Simple Solutions
Here we discuss situations where
hierarchical Bayesian models are appropriate, but also present simple
alternatives.
Hierarchical Bayes - Mixed Models
We discuss hierarchical Bayes using
models with fixed basis functions. We briefly introduce mixed models,
which are the frequentist equivalent of hierarchical Bayesian models.
Nonparametric Hierarchical Bayes
We explain in which situations a
nonparametric approach is more suitable and present hierarchical Bayes
Gaussian processes and Dirichlet process mixture models.
Projection Methods
Asymptotically, the hierarchical
Bayesian solution corresponds to the derivation of new basis functions
that are linear combinations of the original basis functions.
Projection methods also derive new basis functions, but with a
different motivation. Principal component regression is a well-known
approach that is based on a matrix decomposition of the design matrix.
When several outputs are available, it is desirable to include output
information. We will discuss canonical correlation analysis regression,
maximal covariance regression, partial least squares, and multi-output
regularized projection.
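The difference between an input-only projection and one that uses output information can be sketched in a few lines of NumPy. The sketch below is an illustration under stated assumptions, not the tutorial's own formulation: the function names (pcr_fit, maxcov_fit, predict) and the synthetic data are hypothetical, and "maximal covariance regression" is read here as taking the projection from the singular value decomposition of the cross-covariance matrix between inputs and outputs, which may differ in detail from the version presented in the slides.

```python
import numpy as np


def _center(X, Y):
    """Subtract column means; return centered data and the means."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    return X - x_mean, Y - y_mean, x_mean, y_mean


def _regress_on_projection(Xc, Yc, V):
    """Least squares of Yc on the derived basis functions Z = Xc @ V."""
    Z = Xc @ V
    B, *_ = np.linalg.lstsq(Z, Yc, rcond=None)
    return V @ B  # weights in the original input space, shape (d, m)


def pcr_fit(X, Y, k):
    """Principal component regression: projection from the design matrix only."""
    Xc, Yc, x_mean, y_mean = _center(X, Y)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = _regress_on_projection(Xc, Yc, Vt[:k].T)
    return W, x_mean, y_mean


def maxcov_fit(X, Y, k):
    """Assumed variant: projection from the SVD of the cross-covariance Xc^T Yc.

    Requires k <= min(d, m), the rank bound of the cross-covariance matrix.
    """
    Xc, Yc, x_mean, y_mean = _center(X, Y)
    U, _, _ = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    W = _regress_on_projection(Xc, Yc, U[:, :k])
    return W, x_mean, y_mean


def predict(model, X):
    W, x_mean, y_mean = model
    return (X - x_mean) @ W + y_mean


# Synthetic data (hypothetical): the highest-variance input is irrelevant,
# while all three responses depend on a single lower-variance input.
rng = np.random.default_rng(0)
n, d = 500, 10
X = rng.normal(size=(n, d))
X[:, 0] *= 3.0                                  # dominant but uninformative
A = np.array([[1.0, -2.0, 0.5]])                # shared dependence on input 1
Y = X[:, [1]] @ A + 0.1 * rng.normal(size=(n, 3))

mse_pcr = np.mean((predict(pcr_fit(X, Y, 1), X) - Y) ** 2)
mse_maxcov = np.mean((predict(maxcov_fit(X, Y, 1), X) - Y) ** 2)
print("training MSE, one component, PCR:   ", mse_pcr)
print("training MSE, one component, maxcov:", mse_maxcov)
```

With a single component, the input-only projection follows the dominant but uninformative input direction, whereas the output-aware projection is pulled toward the input that the responses share. The comparison uses training error on toy data and is meant only to illustrate the motivation, not to rank the methods.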
Structured Prediction
The motivation behind structured
prediction models is different: for an object, several labels or
annotations need to be predicted, and there exist dependencies between
the labels. We present some generic approaches to structured prediction
modeling. We classify applications and models with respect to their
dimensionality in input space and output space and discuss CRF models,
recommendation systems, and topic models.
A Relational View
Multivariate modeling is often
performed in situations that really require a relational view. We
motivate the transition from multivariate modeling to relational
modeling and present basic models for relational recommendation tasks
and social network analysis.