Aspects of the numerical analysis of neural networks

S.W. Ellacott

doi:10.1017/S0962492900002439

Published online by Cambridge University Press: 07 November 2008

S.W. Ellacott: Affiliation: Department of Mathematical Sciences, University of Brighton, Moulsecoomb, Brighton BN2 4GJ, England

Abstract

This article starts with a brief introduction to neural networks for those unfamiliar with the basic concepts, together with a very brief overview of mathematical approaches to the subject. This is followed by a more detailed look at three areas of research which are of particular interest to numerical analysts.

The first area is approximation theory. If K is a compact set in ℝ^n, for some n, then it is proved that a semilinear feedforward network with one hidden layer can uniformly approximate any continuous function in C(K) to any required accuracy. A discussion of known results and open questions on the degree of approximation is included. We also consider the relevance of radial basis functions to neural networks.
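
In symbols, the uniform approximation property described above is usually written as follows. This is a sketch of the standard form of the statement, not a quotation from the article: the names N, c_j, w_j, θ_j and σ do not appear in the abstract, and the hypotheses on the activation function σ (for instance, that it is a continuous sigmoidal function) are assumptions made here for illustration.

```latex
% Assumed notation: \sigma is the hidden-unit activation, w_j \in \mathbb{R}^n the
% hidden-layer weights, \theta_j the biases, c_j the output weights.
\text{For every } f \in C(K) \text{ and every } \varepsilon > 0
\text{ there exist } N \in \mathbb{N},\; c_j, \theta_j \in \mathbb{R},\; w_j \in \mathbb{R}^n
\text{ such that}
\[
  \sup_{x \in K} \left| f(x) - \sum_{j=1}^{N} c_j \, \sigma\!\left( w_j^{T} x + \theta_j \right) \right| < \varepsilon .
\]
```

The number of hidden units N generally grows as ε shrinks; how fast it must grow is the "degree of approximation" question mentioned above.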

The second area considered is that of learning algorithms. A detailed analysis of one popular algorithm (the delta rule) will be given, indicating why one implementation leads to a stable numerical process, whereas an initially attractive variant (essentially a form of steepest descent) does not. Similar considerations apply to the backpropagation algorithm. The effect of filtering and other preprocessing of the input data will also be discussed systematically.
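
The contrast between the two implementations can be illustrated with a small experiment on a single linear unit. The sketch below rests on an assumption: it takes the stable implementation to be the per-pattern update (weights changed after each training pattern) and the steepest descent variant to be the update that sums the corrections over all patterns before changing the weights. The abstract does not spell out either update, and the function names, the data and the learning rate `eta` are all hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5


def delta_rule_online(xs: List[Vector], ys: List[float],
                      eta: float, epochs: int) -> Vector:
    """Per-pattern delta rule: update the weights after every pattern."""
    w = [0.0] * len(xs[0])
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = y - dot(w, x)
            w = [wi + eta * err * xi for wi, xi in zip(w, x)]
    return w


def delta_rule_batch(xs: List[Vector], ys: List[float],
                     eta: float, epochs: int) -> Vector:
    """Summed update: one steepest descent step on the total squared error."""
    w = [0.0] * len(xs[0])
    for _ in range(epochs):
        step = [0.0] * len(w)
        for x, y in zip(xs, ys):
            err = y - dot(w, x)
            step = [si + err * xi for si, xi in zip(step, x)]
        w = [wi + eta * si for wi, si in zip(w, step)]
    return w


# Hypothetical data: four unit-length patterns with targets generated
# exactly by the weight vector W_TRUE.
XS: List[Vector] = [[1.0, 0.0], [1.0, 0.0], [0.6, 0.8], [0.8, 0.6]]
W_TRUE: Vector = [2.0, -1.0]
YS: List[float] = [dot(W_TRUE, x) for x in XS]

if __name__ == "__main__":
    for name, rule in (("online", delta_rule_online), ("batch", delta_rule_batch)):
        w = rule(XS, YS, 1.0, 50)
        print(name, "error after 50 epochs:", distance(w, W_TRUE))
```

With unit-length patterns and `eta = 1.0`, each per-pattern step multiplies the weight error by I - eta·x·xᵀ, whose eigenvalues lie in [-1, 1] whenever eta·‖x‖² ≤ 2, so the error cannot grow. The summed step multiplies the error by I - eta·Σ x·xᵀ, and the largest eigenvalue of Σ x·xᵀ grows with the number of similar patterns; for this data it exceeds 2/eta, so the same learning rate makes the iteration diverge. Reducing `eta` (to 0.5 here) restores convergence of the summed update.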

Finally some applications of neural networks to numerical computation are considered.

Type: Research Article
Information: Acta Numerica, Volume 3, January 1994, pp. 145–202

DOI: https://doi.org/10.1017/S0962492900002439
Copyright: © Cambridge University Press 1994
