Book contents
- Frontmatter
- Contents
- Acknowledgements
- List of contributors
- Foreword
- 1 Introduction
- 2 On-line Learning and Stochastic Approximations
- 3 Exact and Perturbation Solutions for the Ensemble Dynamics
- 4 A Statistical Study of On-line Learning
- 5 On-line Learning in Switching and Drifting Environments with Application to Blind Source Separation
- 6 Parameter Adaptation in Stochastic Optimization
- 7 Optimal On-line Learning in Multilayer Neural Networks
- 8 Universal Asymptotics in Committee Machines with Tree Architecture
- 9 Incorporating Curvature Information into On-line Learning
- 10 Annealed On-line Learning in Multilayer Neural Networks
- 11 On-line Learning of Prototypes and Principal Components
- 12 On-line Learning with Time-Correlated Examples
- 13 On-line Learning from Finite Training Sets
- 14 Dynamics of Supervised Learning with Restricted Training Sets
- 15 On-line Learning of a Decision Boundary with and without Queries
- 16 A Bayesian Approach to On-line Learning
- 17 Optimal Perceptron Learning: an On-line Bayesian Approach
13 - On-line Learning from Finite Training Sets
Published online by Cambridge University Press: 28 January 2010
Abstract
We analyse online gradient descent learning from finite training sets at non-infinitesimal learning rates η for both linear and non-linear networks. In the linear case, exact results are obtained for the time-dependent generalization error of networks with a large number of weights N, trained on p = αN examples. This allows us to study in detail the effects of finite training set size α on, for example, the optimal choice of learning rate η. We also compare online and offline learning, for respective optimal settings of η at given final learning time. Online learning turns out to be much more robust to input bias and actually outperforms offline learning when such bias is present; for unbiased inputs, online and offline learning perform almost equally well. Our analysis of online learning for non-linear networks (namely, soft-committee machines) advances the theory to more realistic learning scenarios. Dynamical equations are derived for an appropriate set of order parameters; these are exact in the limiting case of either linear networks or infinite training sets. Preliminary comparisons with simulations suggest that the theory captures some effects of finite training sets, but may not yet account correctly for the presence of local minima.
Introduction
The analysis of online (gradient descent) learning, which is one of the most common approaches to supervised learning found in the neural networks community, has recently been the focus of much attention. The characteristic feature of online learning is that the weights of a network (‘student’) are updated each time a new training example is presented, such that the error on this example is reduced.
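The update scheme described above can be illustrated with a minimal sketch, assuming a linear student trained on a fixed set of p = αN noise-free examples generated by a linear teacher. The 1/√N scaling of outputs, the Gaussian unbiased inputs, sampling with replacement from the training set, and all function names (`make_training_set`, `online_step`, `train_online`, `generalization_error`) are assumptions made for illustration; they are not taken from the chapter.

```python
import random


def dot(a, b):
    """Plain inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def make_training_set(teacher, p, rng):
    """Draw p examples (x, y) with unbiased Gaussian inputs (assumed) and
    noise-free outputs y = teacher.x / sqrt(N) from a linear teacher."""
    n = len(teacher)
    data = []
    for _ in range(p):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        data.append((x, dot(teacher, x) / n ** 0.5))
    return data


def online_step(w, x, y, eta):
    """One online gradient descent step on the squared error of a single
    example; for moderate eta this reduces the error on that example."""
    n = len(w)
    err = y - dot(w, x) / n ** 0.5
    scale = eta * err / n ** 0.5
    return [wi + scale * xi for wi, xi in zip(w, x)]


def generalization_error(w, teacher):
    """Generalization error for unit-variance, unbiased inputs (assumed):
    half the squared distance between student and teacher, divided by N."""
    n = len(w)
    return 0.5 * sum((wi - ti) ** 2 for wi, ti in zip(w, teacher)) / n


def train_online(n=50, alpha=2.0, eta=0.5, steps=2000, seed=0):
    """Train on a finite set of p = alpha * N examples, presenting one
    randomly chosen example (with replacement) per update."""
    rng = random.Random(seed)
    teacher = [rng.gauss(0.0, 1.0) for _ in range(n)]
    data = make_training_set(teacher, int(alpha * n), rng)
    w = [0.0] * n
    errors = [generalization_error(w, teacher)]
    for _ in range(steps):
        x, y = rng.choice(data)
        w = online_step(w, x, y, eta)
        errors.append(generalization_error(w, teacher))
    return w, errors


if __name__ == "__main__":
    _, errors = train_online()
    print("initial error:", errors[0])
    print("final error:  ", errors[-1])
```

Because each example can be presented many times, the weights are shaped by the particular finite training set rather than by a stream of fresh examples; varying `alpha` and `eta` in this sketch gives a rough feel for the dependence on training set size and learning rate that the chapter analyses exactly.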
- Type: Chapter
- Information: On-Line Learning in Neural Networks, pp. 279-302. Publisher: Cambridge University Press. Print publication year: 1999