Sparse Linear Models: Bayesian Inference and Experimental Design
ICML 2008 Tutorial
Tutorial Speaker
Matthias W. Seeger, Max Planck Institute for Biological Cybernetics, Tuebingen, Germany
Description
Linear models are ubiquitous in statistics, and the notion of enforcing parameter sparsity, through prior distributions or regularization, is of fundamental importance, underlying concepts such as feature selection, selective shrinkage, or automatic relevance determination. Sparse linear models have been successfully used in a wide range of applications, such as signal processing (compressed sensing), learning control, systems biology, low-level image modelling, graphical model structure learning, and sensor networks, among many others. Significant progress has been made recently in understanding sparse estimation methods (compressed sensing), as well as in constructing approximation methods to Bayesian inference for continuous variable models (expectation propagation, variational mean field Bayes).
In this tutorial, several recent inference approximations for the sparse linear model will be introduced, and relations between them will be clarified. The main motivation will be (Bayesian) experimental design (or active learning): the uncertainty (covariance) estimates obtained through inference are used in order to guide data sampling or modify the measurement architecture, with the aim of reducing uncertainty as rapidly as possible. Experimental design can lead to large cost reductions in practice, where designs chosen by human experts can be overly regular and redundant.
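The design idea above can be illustrated with a minimal sketch. It assumes the (approximate) posterior over the coefficients is a Gaussian N(h, Sigma) and that measurement noise is Gaussian with known variance. Under these assumptions, the information gain of a candidate measurement vector x is 0.5 * log(1 + x' Sigma x / noise_var). The function names are illustrative and are not taken from the tutorial's Matlab code. For the sparse linear model, the Gaussian would come from one of the inference approximations and would typically be recomputed after each new measurement, rather than updated in closed form as done here.

```python
import numpy as np


def information_gain(Sigma, x, noise_var):
    """Entropy reduction (in nats) from observing y = x @ u + noise,
    when the current belief about u is Gaussian with covariance Sigma."""
    return 0.5 * np.log1p(x @ Sigma @ x / noise_var)


def select_next_measurement(Sigma, candidates, noise_var):
    """Score every candidate row and return the index of the most informative one."""
    scores = np.array([information_gain(Sigma, x, noise_var) for x in candidates])
    return int(np.argmax(scores)), scores


def update_posterior(h, Sigma, x, y, noise_var):
    """Condition the Gaussian N(h, Sigma) on a new scalar observation y
    taken along x (exact rank-one update for a Gaussian model)."""
    Sx = Sigma @ x
    denom = noise_var + x @ Sx
    h_new = h + Sx * (y - x @ h) / denom
    Sigma_new = Sigma - np.outer(Sx, Sx) / denom
    return h_new, Sigma_new
```

In a sequential loop, one would call `select_next_measurement`, acquire the corresponding observation, and then refresh the posterior before scoring the remaining candidates again. Directions along which the posterior variance is largest receive the highest scores, which is how the covariance estimate guides sampling.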
Methods such as sparse Bayesian learning (SBL), expectation propagation (EP), and variational mean field Bayes (VMFB) will be explored. It is possible to compare them directly for the sparse linear model, touching on important statistical principles such as scale mixtures, convex duality, and moment matching. In this case, all methods have essentially the same backbone, whose numerically stable implementation will be discussed. Moreover, I will show how to implement Bayesian sequential design, based on these approximations.
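As a rough orientation, the three principles can be written out for a Laplace sparsity prior. The notation below (design matrix $X$, coefficients $u$, noise variance $\sigma^2$, prior scale $\tau$, site parameters $b_i, \pi_i$) is an assumption made for this sketch and may differ from the notation used in the tutorial.

```latex
% Sparse linear model with Laplace prior sites (assumed notation)
\begin{align*}
  y &= X u + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I), \qquad
  P(u) = \prod_i t_i(u_i), \quad t_i(u_i) = \tfrac{\tau}{2}\, e^{-\tau |u_i|}.
\intertext{Scale mixture (SBL): each site is a mixture of zero-mean Gaussians,}
  t_i(u_i) &= \int_0^\infty N(u_i \mid 0, \gamma_i)\,
              \tfrac{\tau^2}{2}\, e^{-\tau^2 \gamma_i / 2}\, d\gamma_i.
\intertext{Convex duality (direct site bounding): each site is a maximum over Gaussian-form lower bounds,}
  e^{-\tau |u_i|} &= \max_{\gamma_i > 0}
     \exp\!\Big( -\frac{u_i^2}{2 \gamma_i} - \frac{\tau^2 \gamma_i}{2} \Big),
     \qquad \text{attained at } \gamma_i = |u_i| / \tau.
\intertext{Moment matching (EP): each site is replaced by a Gaussian-form factor
  $\tilde t_i(u_i) = \exp(b_i u_i - \tfrac{1}{2} \pi_i u_i^2)$, updated so that $Q$ has the same
  mean and variance on $u_i$ as the tilted distribution
  $\hat P_i(u_i) \propto Q(u_i)\, \tilde t_i(u_i)^{-1}\, t_i(u_i)$.
  In all three cases the approximate posterior is Gaussian:}
  Q(u) &= N(u \mid h, \Sigma), \qquad
  \Sigma^{-1} = \sigma^{-2} X^\top X + \operatorname{diag}(\pi), \qquad
  h = \Sigma \big( \sigma^{-2} X^\top y + b \big),
\end{align*}
% with \pi_i = 1 / \gamma_i and b = 0 for the scale-mixture and bounding variants.
```

The shared Gaussian form in the last line is one way to read the "same backbone" remark: the methods differ in how the site parameters are chosen, while the linear algebra for $h$ and $\Sigma$ is common to all of them.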
Content
The tutorial will consist of two parts. The outline is roughly:
- The Sparse Linear Model
- Approximate Inference
  - Expectation Propagation: Moment Matching
  - Sparse Bayesian Learning: Scale Mixtures
  - Direct Site Bounding: Convex Duality
  - Variational Mean Field Bayes
  - Numerical Representation
- Sequential Bayesian Design
- Measuring Images
- Outlook
The break between the two parts will be somewhere in the approximate inference part. Bayesian sequential design will be discussed here, serving as motivation for the added complexity of the Bayesian machinery. A running example will be the problem of how to optimize filters for linear measurements of natural images, which has applications in compressed sensing and magnetic resonance imaging.
Audience
The tutorial content is geared towards practitioners. Theoretical questions, such as recent compressed sensing results about asymptotics of estimators, will not be covered. An ideal participant has a real hands-on interest in understanding and working with these methods, i.e. in opening up the notorious black box. Prior experience with sparsity-favouring models, variational inference approximations, or numerical matrix computation is a plus, but not essential.
Course Material
Here are the slides and Matlab code I will use during the tutorial. This material is subject to change before the tutorial is held.
Speaker Biography
Matthias W. Seeger received the M.S. degree in Computer Science from the University of Karlsruhe, Germany, in 1999 and the Ph.D. degree in Informatics from the University of Edinburgh, U.K., in 2003. From 2003 to 2005, he was a Research Associate in the Electrical Engineering and Computer Science Department, University of California at Berkeley. Since 2005, he has been with the Max Planck Institute for Biological Cybernetics, Tuebingen, Germany. He is an expert in approximate Bayesian inference for continuous variable models, with experience in applications including Gaussian process models, systems biology, image coding, image measurement, and magnetic resonance image reconstruction. He also designed and maintains the open source software package LHOTSE for high-performance numerical computations.
References
- Tipping, M. Sparse Bayesian Learning and the Relevance Vector Machine. Journal of Machine Learning Research 1 (2001), 211--244. The seminal paper on sparse Bayesian learning. Contains a useful characterization of sparsity arising from scale mixtures and automatic relevance determination.
- Palmer, A.; Wipf, D.; Kreutz-Delgado, K.; Rao, B. Variational EM Algorithms for Non-Gaussian Latent Variable Models. Neural Information Processing Systems 18 (2006). Clarifies different approaches to approximate inference for the sparse linear model (scale mixtures, convex duality). Proves equivalence of convex site bounding and variational mean field Bayes.
- Wipf, D. A New View of Automatic Relevance Determination. Neural Information Processing Systems 20 (2008). Provides an efficient, convergent double-loop algorithm for SBL, calling re-weighted Lasso in inner loops.
- Minka, T. Expectation Propagation for Approximate Bayesian Inference. Uncertainty in Artificial Intelligence 17 (2001). Seminal paper on the expectation propagation algorithm.
- Seeger, M. Expectation Propagation for Exponential Families. Technical Report (2005), University of California at Berkeley. General view on expectation propagation and its relationship to Opper and Winther's expectation consistent framework.
- Seeger, M. Bayesian Inference and Optimal Design in the Sparse Linear Model. Journal of Machine Learning Research (2008, in print). Expectation propagation for the sparse linear model, driving experimental design.