Series Foreword

Preface

1 A Tutorial Introduction

1.1 Data Representation and Similarity
1.2 A Simple Pattern Recognition Algorithm
1.3 Some Insights From Statistical Learning Theory
1.4 Hyperplane Classifiers
1.5 Support Vector Classification
1.6 Support Vector Regression
1.7 Kernel Principal Component Analysis
1.8 Empirical Results and Implementations

I CONCEPTS AND TOOLS

2 Kernels

2.1 Product Features
2.2 The Representation of Similarities in Linear Spaces
2.3 Examples and Properties of Kernels
2.4 The Representation of Dissimilarities in Linear Spaces
2.5 Summary
2.6 Problems

3 Risk and Loss Functions

3.1 Loss Functions
3.2 Test Error and Expected Risk
3.3 A Statistical Perspective
3.4 Robust Estimators
3.5 Summary
3.6 Problems

4 Regularization

4.1 The Regularized Risk Functional
4.2 The Representer Theorem
4.3 Regularization Operators
4.4 Translation Invariant Kernels
4.5 Translation Invariant Kernels in Higher Dimensions
4.6 Dot Product Kernels
4.7 Multi-Output Regularization
4.8 Semiparametric Regularization
4.9 Coefficient Based Regularization
4.10 Summary
4.11 Problems

5 Elements of Statistical Learning Theory

5.1 Introduction
5.2 The Law of Large Numbers
5.3 When Does Learning Work: the Question of Consistency
5.4 Uniform Convergence and Consistency
5.5 How to Derive a VC Bound
5.6 A Model Selection Example
5.7 Summary
5.8 Problems

6 Optimization

6.1 Convex Optimization
6.2 Unconstrained Problems
6.3 Constrained Problems
6.4 Interior Point Methods
6.5 Maximum Search Problems
6.6 Summary
6.7 Problems

II SUPPORT VECTOR MACHINES

7 Pattern Recognition

7.1 Separating Hyperplanes
7.2 The Role of the Margin
7.3 Optimal Margin Hyperplanes
7.4 Nonlinear Support Vector Classifiers
7.5 Soft Margin Hyperplanes
7.6 Multi-Class Classification
7.7 Variations on a Theme
7.8 Experiments
7.9 Summary
7.10 Problems

8 Single-Class Problems: Quantile Estimation and Novelty Detection

8.1 Introduction
8.2 A Distribution's Support and Quantiles
8.3 Algorithms
8.4 Optimization
8.5 Theory
8.6 Discussion
8.7 Experiments
8.8 Summary
8.9 Problems

9 Regression Estimation

9.1 Linear Regression with Insensitive Loss Function
9.2 Dual Problems
9.3 ν-SV Regression
9.4 Convex Combinations and ℓ1-Norms
9.5 Parametric Insensitivity Models
9.6 Applications
9.7 Summary
9.8 Problems

10 Implementation

10.1 Tricks of the Trade
10.2 Sparse Greedy Matrix Approximation
10.3 Interior Point Algorithms
10.4 Subset Selection Methods
10.5 Sequential Minimal Optimization
10.6 Iterative Methods
10.7 Summary
10.8 Problems

11 Incorporating Invariances

11.1 Prior Knowledge
11.2 Transformation Invariance
11.3 The Virtual SV Method
11.4 Constructing Invariance Kernels
11.5 The Jittered SV Method
11.6 Summary
11.7 Problems

12 Learning Theory Revisited

12.1 Concentration of Measure Inequalities
12.2 Leave-One-Out Estimates
12.3 PAC-Bayesian Bounds
12.4 Operator-Theoretic Methods in Learning Theory
12.5 Summary
12.6 Problems

III KERNEL METHODS

13 Designing Kernels

13.1 Tricks for Constructing Kernels
13.2 String Kernels
13.3 Locality-Improved Kernels
13.4 Natural Kernels
13.5 Summary
13.6 Problems

14 Kernel Feature Extraction

14.1 Introduction
14.2 Kernel PCA
14.3 Kernel PCA Experiments
14.4 A Framework for Feature Extraction
14.5 Algorithms for Sparse KFA
14.6 KFA Experiments
14.7 Summary
14.8 Problems

15 Kernel Fisher Discriminant

15.1 Introduction
15.2 Fisher's Discriminant in Feature Space
15.3 Efficient Training of Kernel Fisher Discriminants
15.4 Probabilistic Outputs
15.5 Experiments
15.6 Summary
15.7 Problems

16 Bayesian Kernel Methods

16.1 Bayesics
16.2 Inference Methods
16.3 Gaussian Processes
16.4 Implementation of Gaussian Processes
16.5 Laplacian Processes
16.6 Relevance Vector Machines
16.7 Summary
16.8 Problems

17 Regularized Principal Manifolds

17.1 A Coding Framework
17.2 A Regularized Quantization Functional
17.3 An Algorithm for Minimizing R_reg[f]
17.4 Connections to Other Algorithms
17.5 Uniform Convergence Bounds
17.6 Experiments
17.7 Summary
17.8 Problems

18 Pre-Images and Reduced Set Methods

18.1 The Pre-Image Problem
18.2 Finding Approximate Pre-Images
18.3 Reduced Set Methods
18.4 Reduced Set Selection Methods
18.5 Reduced Set Construction Methods
18.6 Sequential Evaluation of Reduced Set Expansions
18.7 Summary
18.8 Problems

A Addenda

A.1 Data Sets
A.2 Proofs

B Mathematical Prerequisites

B.1 Probability
B.2 Linear Algebra
B.3 Functional Analysis

References

Index

Notation and Symbols



Last modified November 30, 2001