NIPS SISO 2008: "Structured Input - Structured Output" Workshop Schedule
7.30-7.35 Welcome Remarks, Karsten Borgwardt (MPI Biological Cybernetics & MPI Developmental Biology)
7.35-8.30 1st Keynote Speech: Recent Advances in Learning Sparse Structured Input/Output Model: Models, Algorithms, and Applications, (PDF), Eric Xing (Carnegie Mellon University)
In many high-dimensional structured input/output problems, such as genome-phenome association analysis, where both input and output can contain tens of thousands, sometimes even millions of inter-related features, learning a sparse and consistent structured predictive function can be of paramount importance for both robustness and interpretability of the model. Despite its impotence, this problem has not been extensively explored in machine learning and statistics literature. In this talk, I will present some recent results along this line. I will first present a family of sparse structured regression models in the contexts of uncovering true associations between linked genetic variations (inputs) and networked phenotypes (outputs), and reverse engineering time-varying networks from time series data, which can be cast as efficiently solvable convex optimization problems and yield parsimonious and possibly consistent maximum likelihood estimates of the model. Then I will present another class of new models known as the maximum entropy discrimination Markov networks, which address the same problem in the maximum margin paradigm, but using a entropic regularizer that lead to a distribution of structured prediction function that are simultaneously primal and dual sparse (i.e., with few support vectors, and of low effective feature dimension), and can be efficiently solved via a novel algorithm that builds on variational inference and existing solvers for the maximum margin Markov network (which is a special case of our proposed model).
8.30-8.50 Coffee Break
8.50-9.15 Integrating Ontological Prior Knowledge into Relational Learning, (PDF), Stefan Reckow, Volker Tresp (MPI Psychiatry & Siemens AG)
Ontologies represent an important source of prior information which lends itself to the integration into statistical modeling. This paper discusses approaches towards employing ontological knowledge for relational learning. Our analysis is based on the IHRM model that performs relational learning by including latent variables that can be interpreted as cluster variables of the entities in the domain. We apply our approach to the modeling of yeast genomic data and demonstrate that the inclusion of ontologies as prior knowledge in relational learning can lead to significantly improved results and to better interpretable clustering structures.
9.15-9.40 Relation-Prediction in Multi-Relational Domains using Matrix-Factorization, (PDF), Christoph Lippert, Stefan-Hagen Weber, Yi Huang, Volker Tresp, Matthias Schubert, Hans-Peter Kriegel (Siemens AG & LMU München)
The paper is concerned with relation prediction in multi-relational domains using matrix factorization. While most past predictive models focussed on one single relation type between two entity types, in the paper a generalized model is presented that is able to deal with an arbitrary number of relation types and entity types in a domain of interest. The novel multi-relational matrix factorization is domain independent and highly scalable. We validate the performance of our approach using two real-world data sets, i.e. user-movie recommendations and gene function prediction.
9.40-10.05 Logistic Regression for Graph Classification, (PDF), Nino Shervashidze, Koji Tsuda (MPI Biological Cybernetics)
In this paper we deal with graph classification. We propose a new algorithm for performing sparse logistic regression for graphs, which is comparable in accuracy with other methods of graph classification and produces probabilistic output in addition. Sparsity is required for the reason of interpretability, which is often necessary in domains such as bioinformatics or chemoinformatics
10.05-10.30 Graphical Multi-Task Learning, (PDF), Daniel Sheldon (Cornell University)
We investigate the problem of learning multiple tasks that are related according to a network structure, using the multi-task kernel framework proposed in (Evgeniou et al., 2006). Our method combines a graphical task kernel with an arbitrary base kernel. We demonstrate its effectiveness on a real ecological application that inspired this work.
10.30-15.30 Snow Break
15.30-16.30 2nd Keynote Speech: Joint Learning of Multiple Structured Output Prediction Tasks, (PDF), Yasemin Altun (MPI Biological Cybernetics)
Joint Learning of Multiple Structured Output Prediction Tasks We investigate the problem of learning multiple structured output prediction tasks jointly. This problem surfaces during the inference of classifiers for complex tasks, such as machine translation, protein tertiary structure prediction, which consist of multiple inter-dependent structured prediction problems. Instead of the predominant approach, where subtasks are solved in a cascaded manner and their outputs are used as features for subsequent tasks, we propose learning these tasks jointly using techniques from structured prediction learning and multi task learning. In particular, we propose optimizing a loss function that enforces a low dimensional feature representation shared across structured tasks and we provide an algorithm to discover this representation along with the corresponding parameters simultaneously. Our experimental results show advantages over the cascaded approach.
16.30-16.50 Coffee Break
16.50-17.15 Learning to Predict Combinatorial Structures, (PDF), Thomas Gaertner, Shankar Vembu (Fraunhofer IAIS)
We consider the problem of predicting combinatorial structures such as directed cycles, partially ordered sets, and other graph classes. Assumptions made by existing structured prediction algorithms preclude their applicability to this problem. We present an algorithm that overcomes these limitations.
17.15-17.40 Joint Kernel Support Estimation for Structured Prediction, (PDF), Christoph Lampert, Matthew Blaschko (MPI Biological Cybernetics)
We present a new technique for structured prediction that works in a hybrid gener- ative/discriminative way, using a one-class support vector machine to model the joint probability of (input, output)-pairs in a joint reproducing kernel Hilbert space. Compared to discriminative techniques, like conditional random fields or structured out- put SVMs?, the proposed method has the ad- vantage that its training time depends only on the number of training examples, not on the size of the label space. Due to its gener- ative aspect, it is also very tolerant against ambiguous, incomplete or incorrect labels. Experiments on realistic data show that our method works efficiently and robustly in sit- uations that discriminative techniques have problems with or that are computationally infeasible for them.
17.40-18.05 Learning Optimal Subsets with Implicit User Preferences, (PDF), Yunsong Guo, Carla Gomes (Cornell University)
In this paper we study the problem of learning an optimal subset from a larger ground set of items, where the optimality criterion is defined by an unknown preference function. We model the problem as a discriminative structural learning problem and solve it using a Structural Support Vector Machine (SSVM) that optimizes a ``set accuracy'' performance measure representing set similarities. Our approach departs from previous approaches since we do not explicitly learn a pre-defined preference function. Experimental results on both a synthetic block selection problem and a real-world face image subset selection problem show that our method significantly outperforms previous approaches.
18.05-18.30 Learning Structural Support Vector Machines with Latent Variables, (PDF), Chun-Na Yu, Thorsten Joachims (Cornell University)
It is well known in statistics and machine learning that the combination of latent variables and observed variables offer more expressive power than models with observed variables alone. Structural SVMs? have excellent performance in many structured prediction tasks, and yet currently they do not support the use of latent variables. In this work we extend Structural SVMs? to include latent variables, and provide an efficient algorithm for solving the optimization problem of our proposed formulation. We apply our new algorithm to the problem of discriminative motif finding in yeast DNA and some initial results are presented.