Tutorial

Learning with Dependencies between Several Response Variables:
From Hierarchical Bayes and Multitask Learning to Structured Prediction and Relational Learning

 
Volker Tresp, Siemens Corporate Technologies
Kai Yu, NEC Laboratories America

 
We analyze situations where modeling several response variables for a given input improves the prediction accuracy for each individual response variable. Interestingly, this setting has appeared in different contexts, and a number of different but related approaches have been proposed. In all of these approaches, some assumptions about the dependency structure between the response variables are made.
 
Here is a small selection of labels describing relevant work:
multitask learning, multi-class classification, multi-label prediction, hierarchical Bayes, inductive transfer learning, hierarchical linear models, mixed effect models, partial least squares, canonical correlation analysis, maximal covariance regression, multivariate regression, structured prediction, relational learning, …
 
The large number of approaches is confusing for the novice, and often even for the expert. In this tutorial we systematically introduce some of the major approaches and describe them from a common viewpoint.
 

Organizers:
Volker Tresp received a Diploma degree from the University of Göttingen, Germany, in 1984, and a Ph.D. from Yale University in 1989. Since 1989 he has headed a research team in machine learning at Siemens, Corporate Research and Technology. In 1994 he was a visiting scientist at the Massachusetts Institute of Technology's Center for Biological and Computational Learning. Each summer since 2003 he has given a lecture on machine learning and data mining at the University of Munich. He has served on all leading program committees in machine learning and is on the organizing committee of the annual Learning Workshop. He is co-editor of Neural Information Processing Systems (NIPS*13), MIT Press, 2000, and he was co-chair of the industrial track of KDD 2008.

Kai Yu is a researcher at NEC Laboratories America in California. Before joining NEC Labs, he was a senior research scientist at Siemens Corporate Technology in the lovely city of Munich, Germany. He obtained his Ph.D. in computer science from the University of Munich (LMU), supervised by Prof. Hans-Peter Kriegel and Dr. Volker Tresp (Siemens AG), and received his B.Sc. and M.Sc. degrees, both in electrical engineering, from Nanjing University, China, in 1998 and 2000, respectively.

Slides (final)
 
 
Outline
Introduction and Motivation
We provide a number of examples of modeling with several response variables.
Hierarchical Bayes and Mixed Models
    Problem Settings and Simple Solutions
Here we discuss situations where hierarchical Bayesian models are appropriate, but we also present simple alternatives.
    Hierarchical Bayes - Mixed Models
We discuss hierarchical Bayes using models with fixed basis functions. We briefly introduce mixed models, which are the frequentist equivalent of hierarchical Bayesian models.
    Nonparametric Hierarchical Bayes
We explain in which situations a nonparametric approach is more suitable and present hierarchical Bayes Gaussian processes and Dirichlet process mixture models.
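As an illustration of the fixed-basis-function setting described under Hierarchical Bayes - Mixed Models, the following minimal sketch shows how a prior shared across tasks shrinks one task's weight estimate. It is not taken from the tutorial slides. It assumes a linear-Gaussian model in which each task's weights are drawn from a common Gaussian prior N(mu, Sigma) and the observations carry Gaussian noise with variance noise_var; the function name and all parameter names are hypothetical. In a full hierarchical Bayesian treatment, mu and Sigma would themselves be learned from all tasks; here they are simply given.

```python
import numpy as np

def task_posterior_mean(Phi, y, mu, Sigma, noise_var):
    """Posterior mean of one task's weights under a shared Gaussian prior.

    Assumed model (illustrative only, not the tutorial's exact formulation):
        w ~ N(mu, Sigma)                 prior shared by all tasks
        y = Phi @ w + eps,  eps ~ N(0, noise_var * I)
    Phi is the (n, d) matrix of fixed basis functions evaluated at the
    task's n inputs, and y holds the n observed responses.
    """
    Sigma_inv = np.linalg.inv(Sigma)
    # Posterior precision: data term plus prior term.
    A = Phi.T @ Phi / noise_var + Sigma_inv
    # Precision-weighted combination of data evidence and prior mean.
    b = Phi.T @ y / noise_var + Sigma_inv @ mu
    return np.linalg.solve(A, b)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Phi = rng.normal(size=(5, 3))          # a task with only 5 data points
    y = Phi @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=5)
    mu = np.zeros(3)                       # hypothetical shared prior mean
    w = task_posterior_mean(Phi, y, mu, np.eye(3), noise_var=0.01)
    print(w)
```

With a very broad prior the estimate approaches the task's own least-squares fit, and with a very tight prior it approaches the shared mean; tasks with little data are therefore pulled toward what the other tasks suggest.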
Projection Methods
Asymptotically, the hierarchical Bayesian solution corresponds to the derivation of new basis functions that are linear combinations of the original basis functions. Projection methods also derive new basis functions, but with a different motivation. Principal component regression is a well-known approach that is based on a matrix decomposition of the design matrix. When several outputs are available, it is desirable to include output information. We will discuss canonical correlation analysis regression, maximal covariance regression, partial least squares, and multi-output regularized projection.
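A minimal sketch of principal component regression with several outputs may make the projection idea concrete. It is not taken from the tutorial slides; the function names and the choice of NumPy are assumptions made for illustration. The new basis functions are the top k right singular vectors of the centered design matrix, so they are chosen from the inputs alone; the output-aware methods listed above would choose the projection directions differently but could reuse the same regression step.

```python
import numpy as np

def pcr_fit(X, Y, k):
    """Principal component regression with k components.

    X: (n, d) design matrix, Y: (n, m) matrix of m response variables.
    Returns weights W of shape (d, m) plus the means used for centering.
    """
    x_mean = X.mean(axis=0)
    y_mean = Y.mean(axis=0)
    Xc = X - x_mean
    Yc = Y - y_mean
    # Thin SVD of the centered design matrix: Xc = U @ diag(s) @ Vt.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T                    # (d, k) projection directions
    Z = Xc @ V_k                      # (n, k) projected inputs
    # Least squares in the projected space. The columns of Z are
    # orthogonal with squared norms s[:k]**2, so no matrix inverse is needed.
    B = (Z.T @ Yc) / (s[:k, None] ** 2)
    W = V_k @ B                       # map back to the original inputs
    return W, x_mean, y_mean

def pcr_predict(X, W, x_mean, y_mean):
    """Predict all m responses for new inputs X."""
    return (X - x_mean) @ W + y_mean

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 6))
    Y = X @ rng.normal(size=(6, 3)) + 0.1 * rng.normal(size=(100, 3))
    W, x_mean, y_mean = pcr_fit(X, Y, k=4)
    print(pcr_predict(X[:2], W, x_mean, y_mean))
```

All m outputs share the same k projected features, which is what ties the individual regression problems together. When k equals the rank of the centered design matrix, the fit coincides with ordinary least squares.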
Structured Prediction
The motivation behind structured prediction models is different: for an object, several labels or annotations need to be predicted, and there exist dependencies between the labels. We present some generic approaches to structured prediction modeling. We classify applications and models with respect to their dimensionality in input space and output space, and we discuss conditional random field (CRF) models, recommendation systems, and topic models.
A Relational View
Often, multivariate modeling is performed in situations that really require a relational view. We motivate the transition from multivariate modeling to relational modeling and present basic models for relational recommendation tasks and social network analysis.