Theory of algorithms for unconstrained optimization

Jorge Nocedal

doi:10.1017/S0962492900002270

Theory of algorithms for unconstrained optimization

Published online by Cambridge University Press: 07 November 2008

Jorge Nocedal

Show author details

Jorge Nocedal: Affiliation:
Department of Electrical Engineering and Computer ScienceNorthwestern University, Evanston, IL 60208USA, E-mail [email protected]

Article contents

Extract
References

Get access

Rights & Permissions

Extract

A few months ago, while preparing a lecture to an audience that included engineers and numerical analysts, I asked myself the question: from the point of view of a user of nonlinear optimization routines, how interesting and practical is the body of theoretical analysis developed in this field? To make the question a bit more precise, I decided to select the best optimization methods known to date – those methods that deserve to be in a subroutine library – and for each method ask: what do we know about the behaviour of this method, as implemented in practice? To make my task more tractable, I decided to consider only algorithms for unconstrained optimization.

Type: Research Article
Information: Acta Numerica , Volume 1 , January 1992 , pp. 199 - 242

DOI: https://doi.org/10.1017/S0962492900002270 [Opens in a new window]
Copyright: Copyright © Cambridge University Press 1992

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Article purchase

Temporarily unavailable

References

REFERENCES

Akaike, H. (1959), ‘On a successive transformation of probability distribution and its application to the analysis of the optimum gradient method’, Ann. Inst. Statist. Math. 11, 1–17.CrossRef Google Scholar

Al-Baali, M. (1985), ‘Descent property and global convergence of the Fletcher–Reeves method with inexact line search’, IMA J. Numer. Anal. 5, 121–124.CrossRef Google Scholar

Al-Baali, M. (1990), ‘Variational Quasi-Newton Methods for Unconstrained Optimization’, Technical Report, Department of Mathematics, University of Damascus (Syria).Google Scholar

Bank, R.E. and Rose, D.J. (1981) ‘Global approximate Newton methods’, Numer. Math. 37, 279–295.CrossRef Google Scholar

Baptist, P. and Stoer, J. (1977), ‘On the relation between quadratic termination and convergence properties of minimization algorithms, Part II, Applications’, Numer. Math. 28, 367–392.CrossRef Google Scholar

Beale, E.M.L. (1972), ‘A derivation of conjugate gradients’, in Numerical Methods for Nonlinear Optimization (Lootsma, F.A., ed.), Academic Press (New York).Google Scholar

Brown, P.N. and Saad, Y. (1989), ‘Globally convergent techniques in nonlinear Newton–Krylov algorithms’, Technical Report UCRL-102434, Lawrence Livermore National Laboratory.Google Scholar

Brown, P.N. and Saad, Y. (1990), ‘Hybrid Krylov methods for nonlinear systems of equations’, SIAM J. Sci. Stat. Comput. 11, 450–481.CrossRef Google Scholar

Broyden, G.G., Dennis, J.E. and Moré, J.J. (1973), ‘On the local and superlinear convergence of quasi-Newton methods’, J. Inst. Math. Applics 12, 223–246.CrossRef Google Scholar

Buckley, A. and LeNir, A. (1983), ‘QN-like variable storage conjugate gradients’, Math. Program. 27, 155–175.CrossRef Google Scholar

Burmeister, W. (1973), ‘Die Konvergenzordnung des Fletcher–Powell Algorithmus’, Z. Angew. Math. Mech. 53, 693–699.CrossRef Google Scholar

Byrd, R.H. and Nocedal, J. (1989), ‘A tool for the analysis of quasi-Newton methods with application to unconstrained minimization’, SIAM J. Numer. Anal. 26, 727–739.CrossRef Google Scholar

Byrd, R., Liu, D. and Nocedal, J. (1990), ‘On the Behavior of Broyden's Class of Quasi-Newton Methods’, Technical Report NAM 01, Northwestern University, EECS Dept.Google Scholar

Byrd, R.H., Nocedal, J. and Yuan, Y. (1987), ‘Global convergence of a class of quasi-Newton methods on convex problems’, SIAM J. Numer. Anal. 24, 1171–1190.CrossRef Google Scholar

Byrd, R.H., Tapia, R.A. and Zhang, Y. (1990), ‘An SQP Augmented Lagrangian Bfgs Algorithm for Constrained Optimization’, Technical Report, University of Colorado (Boulder).Google Scholar

Cohen, A. (1972), ‘Rate of convergence of several conjugate gradient algorithms’, SIAM J. Numer. Anal. 9, 248–259.CrossRef Google Scholar

Conn, A.R., Gould, N.I.M., and Toint, Ph.L. (1988a), ‘Global convergence of a class of trust region algorithms for optmization with simple bounds’, SIAM J. Numer. Anal. 25, 433–460.CrossRef Google Scholar

Conn, A.R., Gould, N.I.M., and Toint, Ph.L. (1988b), ‘Testing a class of methods for solving minimization problems with simple bounds on the variables’, Math. Comput. 50, 399–430.CrossRef Google Scholar

Conn, A.R., Gould, N.I.M., and Toint, Ph.L. (1991), ‘Convergence of quasi-Newton matrices generated by the symmetric rank one update’, Math. Prog. 2, 177–195.CrossRef Google Scholar

Crowder, H.P. and Wolfe, P. (1972), ‘Linear convergence of the conjugate gradient method’, IBM J. Res. Dev. 16, 431–433.CrossRef Google Scholar

Davidon, W.C. (1959), ‘Variable metric methods for minimization’, Argonne National Lab Report (Argonne, IL).Google Scholar

Davidon, W.C. (1975), ‘Optimally conditioned optimization algorithms without line searches’, Math. Prog. 9, 1–30.CrossRef Google Scholar

Dembo, R.S., Eisenstat, S.C., and Steihaug, T. (1982), ‘Inexact Newton methods’, SIAM J. Numer. Anal. 19, 400–408.CrossRef Google Scholar

Dennis, J.E. Jr and Moré, J.J. (1974), ‘A characterization of superlinear convergence and its application to quasi-Newton methods’, Math. Comp. 28, 549–560.CrossRef Google Scholar

Dennis, J.E. Jr and Moré, J.J. (1977), ‘Quasi-Newton methods, motivation and theory’, SIAM Review 19, 46–89.CrossRef Google Scholar

Dennis, J.E. Jr and Schnabel, R.B. (1983), Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice-Hall (Englewood Cliffs, NJ).Google Scholar

Dennis, J.E. Jr and Schnabel, R.B. (1987), ‘A view of unconstrained optimization’, Technical Report CU-CS-376-87, University of Colorado (Boulder), to appear in: Handbooks in Operations Research and Management Science vol. 1: Optimization (G.L. Nemhauser, A.H.G. Rinnooy Kan, and M.J. Todd, eds), North-Holland (Amsterdam).Google Scholar

Dennis, J.E. Jr and Walker, H.F. (1981), ‘Convergence theorems for least-change secant update methods’, SIAM J. Numer. Anal. 18, 949–987; 19, 443.CrossRef Google Scholar

Dennis, J.E. Jr and Wolkowicz, H. (1991), ‘Sizing and least change secant methods’, Technical Report CORR 90-02, Department of Combinatorics and Optimization, University of Waterloo, Waterloo, Canada.Google Scholar

Dixon, L.C.W. (1972), ‘The choice of step length, a crucial factor in the performance of variable metric algorithms’, in Numerical Methods for Nonlinear Optimization (Lootsma, F.A., ed.), Academic Press (London).Google Scholar

Eisenstat, S. and Walker, H. (1990), ‘Globally convergent inexact Newton methods’, Yale Technical Report.Google Scholar

El Hallabi, M. and Tapia, R.A. (1989), ‘A global convergence theory for arbitrary norm trust-region methods for nonlinear equations’, Technical Report TR87-25, Rice University, Department of Mathematical Sciences.Google Scholar

Fiacco, A.V. and McCormick, G.P. (1968), Nonlinear Programmming, John Wiley & Sons (New York).Google Scholar

Fletcher, R. (1987), Practical Methods of Optimization vol. 1: Unconstrained Optimization John Wiley & Sons (New York).Google Scholar

Fletcher, R. (1991), ‘A new variational result for quasi-Newton formulae’, SIAM J. Optimization 1(1), 18–21.CrossRef Google Scholar

Fletcher, R. and Powell, M.J.D. (1963), ‘A rapidly convergent descent method for minimization’, Comput. J. 6, 163–168.CrossRef Google Scholar

Fletcher, R. and Reeves, C. (1964), ‘Function minimization by conjugate gradients’, Comput. J. 7, 149–154.CrossRef Google Scholar

Gay, D. (1983), ‘Subroutines for unconstrained minimization using a model/ trust-region approach’, ACM Trans. Math. Soft. 9(4), 503–524.CrossRef Google Scholar

Gilbert, J.C. and Lemaréchal, C. (1989), ‘Some numerical experiments with variable storage quasi-Newton algorithms’, Math. Program. 45, 407–436.CrossRef Google Scholar

Gilbert, J.C. and Nocedal, J. (1990), ‘Global convergence properties of conjugate gradient methods for optimization’, Rapport de Recherche, INRIA (Paris).Google Scholar

Gill, P.E. and Murray, W. (1979), ‘Conjugate-gradient methods for large-scale nonlinear optimization’, Technical report SOL 79-15, Dept. of Operations Research, Stanford University.CrossRef Google Scholar

Gill, P. E., Murray, W. and Wright, M. H. (1981), Practiced Optimization, Academic Press (London).Google Scholar

Goldfarb, D., and Idnani, A. (1983), ‘A numerically stable dual method for solving strictly convex quadric programs’, Math. Prog. 27, 1–33.CrossRef Google Scholar

Griewank, A. (1991), ‘The global convergence of partitioned BFGS on problems with convex decompositions and Lipschitzian gradients’, Math. Prog. 50, 141–175.CrossRef Google Scholar

Griewank, A. and Toint, Ph.L. (1982a), ‘On the unconstrained optimization of partially separable objective functions’, in Nonlinear Optimization 1981 (Powell, M.J.D., ed.), Academic Press (London), 301–312.Google Scholar

Griewank, A. and Toint, Ph.L. (1982b), ‘Local convergence analysis of partitioned quasi-Newton updates’, Numer. Math. 39, 429–448.CrossRef Google Scholar

Griewank, A. and Toint, Ph.L. (1982c), ‘Partitioned variable metric updates for large structured optimization problems’, Numer. Math. 39, 119–137.CrossRef Google Scholar

Griewank, A. and Toint, Ph.L. (1984), ‘Numerical experiments with partially separable optimization problems’, in Numerical Analysis: Proceedings Dundee 1983 (Lecture Notes in Mathematics 1066) (Griffiths, D.F., ed.), Springer Verlag, (Berlin), 203–220.CrossRef Google Scholar

Grippo, L., Lampariello, F. and Lucidi, S. (1990a), ‘A quasi-discrete Newton algorithm with a nonmonote stabilization technique’, J. Optim. Theory Appl. 64, 485–500.CrossRef Google Scholar

Grippo, L., Lampariello, F. and Lucidi, S. (1990b), ‘A class of nonmonotone stabilization methods in unconstrained optimization’, Technical Report R-290, Consiglio Nazionale delle Ricerche.Google Scholar

Güler, O. (1989), ‘Optimal algorithms for smooth convex programming’, Working Paper Series No. 89-17, Department of Management Sciences, The University of Iowa.Google Scholar

Hamming, R.W. (1971), Introduction to Applied Numerical Analysis, McGraw-Hill (New York).Google Scholar

Hu, Y.F. and Storey, C. (1991), ‘On optimally and near-optimally conditioned quasi-Newton updates’, Technical Report A141, Department of Mathematical Sciences, Loughborough University of Technology (Leicestershire).Google Scholar

Khalfan, H., Byrd, R.H., and Schnabel, R.B. (1990), ‘A theoretical and experimental study of the symmetric rank one update’, Technical Report CU-CS-489-90, University of Colorado (Boulder).CrossRef Google Scholar

Lalee, M. and Nocedal, J. (1991), ‘Automatic column scaling strategies for quasi-Newton methods’, Report NAM 04, EECS Department, Northwestern University (Evanston, IL).Google Scholar

Lemaréchal, C. (1981), ‘A view of line searches’, in Optimization and Optimal Control (Lecture Notes in Control and Information Science 30) (Auslander, A., Oettli, W. and Stoer, J., eds) Springer Verlag (Berlin) 59–78.CrossRef Google Scholar

Liu, D.C. and Nocedal, J. (1989), ‘On the limited memory BFGS method for large scale optimization’, Math. Program. 45, 503–528.CrossRef Google Scholar

Luenberger, D.G. (1984), Linear and Nonlinear Programming, 2nd edition, Addison-Wesley (Reading, MA).Google Scholar

Luksšan, L. (1991a), ‘Computational experience with improved conjugate gradient methods for unconstrained minimization’, Technical Report 488, Institute of Computer and Information Sciences, Czechoslovak Academy of Sciences (Prague).Google Scholar

Luksšan, L. (1991b), ‘On variationally derived scaling and preconvex variable metric updates’, Technical Report 496, Institute of Computer and Information Sciences, Czechoslovak Academy of Sciences (Prague).Google Scholar

Martínez, J.M. (1990), ‘Local convergence theory of inexact Newton methods based on structured least change updates’, Math. Comp. 55, 143–168.CrossRef Google Scholar

Martínez, J.M. (1991), ‘On the relation between two local convergence theories of least change update methods’, Technical Report IMECC-UNICAMP.CrossRef Google Scholar

Moré, J. (1983), ‘Recent developments in algorithms and software for trust region methods’, in Mathematical Programming, The State of the Art (Bachem, A., Grotschel, M. and Korte, G., eds), Springer-Verlag (Berlin), 256–287.Google Scholar

Moré, J. and Sorensen, D.C. (1984), ‘Newton's method’, in Studies in Numerical Analysis (Golub, G.H., ed.), The Mathematical Association of America (Providence RI), 29–82.Google Scholar

Moré, J., and Thuente, D.J. (1990) ‘On line search algorithms with guaranteed sufficient decrease’, Mathematics and Computer Science Division Preprint MCS-P153-0590, Argonne National Laboratory (Argonne, IL).Google Scholar

Nash, S.G. (1985), ‘Preconditioning of truncated-Newton methods’, SIAM J. Sci. Stat. Comput, 6, 599–616.CrossRef Google Scholar

Nazareth, J.L. and Mifflin, R.B. (1991), ‘The least prior deviation quasi-Newton update’, Technical Report, Department of Pure and Applied Mathematics, Washington State University.Google Scholar

Nemirovsky, A.S. and Yudin, D. B. (1983), Problem Complexity and Method Efficiency, Wiley (New York).Google Scholar

Nesterov, Y.E. (1983), ‘A method of solving a convex programming problem with convergence rate O(1/k ²)’, Sov, Math. Dokl. 27, 372–376.Google Scholar

Nesterov, Y.E. (1988), ‘On an approach to the construction of optimal methods of minimization of smooth convex functions’, Ekonom. i Matem. 24, 509–517.Google Scholar

Nocedal, J. (1980), ‘Updating quasi-Newton matrices with limited storage’, Math. Comput. 35, 773–782.CrossRef Google Scholar

Nocedal, J. (1990), ‘The performance of several algorithms for large scal unconstrained optimization’, in Large-Scale Numerical Optimization (Coleman, T.F. and Li, Y., eds), SIAM (Philadelphia), 138–151.Google Scholar

Nocedal, J. and Yuan, Y. (1991), ‘Analysis of a self-scaling quasi-Newton method’, Technical Report NAM-02, Department of Electrical Engineering and Computer Science, Northwestern University (Evanston, IL).Google Scholar

O'Leary, D.P. (1982), ‘A discrete Newton algorithm for minimizing a function of many variables’, Math. Prog. 23, 20–33.CrossRef Google Scholar

Oren, S.S. (1982), ‘Perspectives on self-scaling variable metric algorithms’, J. Opt. Theory Appl. 37, 137–147.CrossRef Google Scholar

Oren, S.S. and Luenberger, D.G. (1974), ‘Self-scaling variable metric(SSVM) Algoriths I: Criteria and sufficient conditions for scaling a class of algorithms’, Management Sci. 20, 845–862.CrossRef Google Scholar

Ortega, J.M. and Rheinboldt, W.C. (1970), Iterative Solution of Nonlinear Equations in Several Variables, Academic Press (New York).Google Scholar

Osborne, M.R. and Sun, L.P. (1988), ‘A new approach to the symmetric rank-one updating algorithm’, Report NMO/01, Department of Statistics, IAS, Australian National University.Google Scholar

Ostrowski, A. (1966), Solution of Equations and Systems of Equations, second edition, Academic Press (New York).Google Scholar

Pearson, J.D. (1969), ‘Variable metric methods of minimization’, Comput. J. 12, 171–178.CrossRef Google Scholar

Polak, E. and Ribière, G. (1969), ‘Note sur la convergence de methodes de directions conjugées’, Rev. Française Informat Recherche Operationelle, 3e Année 16, 35–43.Google Scholar

Powell, M.J.D. (1971), ‘On the convergence of the variable metric algorithm’, J. Inst. Math. Applics 7, 21–36.CrossRef Google Scholar

Powell, M.J.D. (1972), ‘Some properties of the variable metric method’, in Numerical Methods for Nonlinear Optimization (Lootsma, F.A., ed.), Academic Press (London).Google Scholar

Powell, M.J.D. (1976a), ‘Some global convergence properties of a variable metric algorithm for minimization without exact line searches’, in Nonlinear Programming, SIAM-AMS Proceedings, Vol. IX, (Cottle, R.W. and Lemke, C.E., eds), SIAM Publications (Philadelphia).Google Scholar

Powell, M.J.D. (1976b), ‘Some convergence properties of the conjugate gradient method’, Math. Program. 11, 42–49.CrossRef Google Scholar

Powell, M.J.D. (1977), ‘Restart procedures of the conjugate gradient method’, Math. Program. 2, 241–254J.CrossRef Google Scholar

Powell, M.J.D. (1983), ‘On the rate of convergence of variable metric algorithms for unconstrained optimization’, Report DAMTP 1983/NA7, Department of Applied Mathematics and Theoretical Physics, University of Cambridge (Cambridge).Google Scholar

Powell, M.J.D. (1984a), ‘Nonconvex minimization calculations and the conjugate gradient method’, in Lecture Notes in Mathematics vol. 1066, Springer-Verlag (Berlin), 122–141.Google Scholar

Powell, M.J.D. (1984b), ‘On the global convergence of trust region algorithms for unconstrained minimization’, Math. Program. 29, 297–303.CrossRef Google Scholar

Powell, M.J.D. (1985), ‘Convergence properties of algorithms for nonlinear optimization’, Report DAMTP 1985/NA1, Department of Applied Mathematics and Theoretical Physics, University of Cambridge (Cambridge).Google Scholar

Powell, M.J.D. (1986), ‘How bad are the BFGS and DFP methods when the objective function is quadratic?’, Math. Prog. 34, 34–47.CrossRef Google Scholar

Powell, M.J.D. (1987), ‘Update conjugate directions by the BFGS formula’, Math. Prog. 38, 29–46.CrossRef Google Scholar

Pu, D.G. and Yu, W.C. (1988), ‘On the convergence property of the DFP algorithm’, J. Qüfu Normol University 14(3), 63–69.Google Scholar

Ritter, K. (1980), ‘On the rate of superlinear convergence of a class of variable metric methods’, Numer. Math. 35, 293–313.CrossRef Google Scholar

Schnabel, R.B. (1978), ‘Optimal conditioning in the convex class of rank two updates’, Math. Prog. 15, 247–260.CrossRef Google Scholar

Schnabel, R.B. (1989), ‘Sequential and parallel methods for unconstrained optimization’, in Mathematical Programming, Recent Developments and Applications (Iri, M. and Tanabe, K., eds), Kluwer (Deventer), 227–261.Google Scholar

Schuller, G. (1974), ‘On the order of convergence of certain quasi-Newton methods’, Numer. Math. 23, 181–192.CrossRef Google Scholar

Shanno, D.F. (1978a), ‘On the convergence of a new conjugate gradient algorithm’, SIAM J. Numer. Anal. 15, 1247–1257.CrossRef Google Scholar

Shanno, D.F. (1978b), ‘Conjugate gradient methods with inexact searches’, Math. Operations Res. 3, 244–256.CrossRef Google Scholar

Shanno, D.F. and Phua, K.H. (1980), ‘Remark on algorithm 500: minimization of unconstrained multivariate functions’, ACM Trans, on Math. Software 6, 618–622.CrossRef Google Scholar

Siegel, D. (1991), ‘Modifying the BFGS update by a new column scaling technique’, Technical Report DAMTP 1991/NA5, Department of Applied Mathematics and Theoretical Physics, University of Cambridge.Google Scholar

Steihaug, T. (1983), ‘The conjugate gradient method and trust regions in large scale optimization’, SIAM J. Numer. Anal. 20, 626–637.CrossRef Google Scholar

Stoer, J. (1977), ‘On the relation between quadratic termination and convergence properties of minimization algorithms’, Numer. Math. 28, 343–366.CrossRef Google Scholar

Toint, Ph.L. (1983), ‘VE08AD, a routine for partially separable optimization with bounded variables’, Harwell Subroutine Library, AERE (UK).Google Scholar

Toint, Ph.L. (1986a), ‘A view of nonlinear optimization in a large number of variables’, Technical Report 86/16, Facultés Universitaires de Namur.Google Scholar

Toint, Ph.L. (1986b), ‘Global convergence of the partioned BFGS algorithm for convex partially separable optimization’, Math. Prog. 36, 290–306.CrossRef Google Scholar

Torczon, V. (1991), ‘On the convergence of the multidimensional search algorithm’, SIAM J. Optimization 1(1), 123–145.CrossRef Google Scholar

Touati-Ahmed, D. and Storey, C. (1990), ‘Efficient hybrid conjugate gradient techniques’, J. Optimization Theory Appl. 64, 379–397.CrossRef Google Scholar

Warth, W. and Werner, J. (1977), ‘Effiziente Schrittweitenfunktionen bei unrestringierten Optimierungsaufgaben’, Computing 19, 1, 59–72.CrossRef Google Scholar

Werner, J. (1978), ‘Uber die globale konvergenz von Variable-Metric Verfahren mit nichtexakter Schrittweitenbestimmung’, Numer. Math. 31, 321–334.CrossRef Google Scholar

Werner, J. (1989), ‘Global convergence of quasi-Newton methods with practical line searches’, NAM-Bericht 67, Institut für Numerische und Angewandte Mathematik der Universität Göttingen.Google Scholar

Wilkinson, J.H. (1965), The Algebraic Eigenvalue Problem, Oxford University Press (London).Google Scholar

Wolfe, P. (1969), ‘Convergence conditions for ascent methods’, SIAM Review 11, 226–235.CrossRef Google Scholar

Wolfe, P. (1971), ‘Convergence conditions for ascent methods. II: Some corrections’, SIAM Review 13, 185–188.CrossRef Google Scholar

Yuan, Y. (1991), ‘A modified BFGS algorithm for unconstrained optimization’, IMA J. Numer. Anal. 11(3), 325–332.CrossRef Google Scholar

Yuan, Y. and Byrd, R. (1991), ‘Nonquasi-Newton updates for unconstrained optimization’, Technical Report, Department of Computer Science, University of Colorado (Boulder).Google Scholar

Zhang, Y. and Tewarson, R.P. (1988), ‘Quasi-Newton algorithms with updates from the pre-convex part of Broyden's family’, IMA J. Numer. Anal. 8, 487–509.CrossRef Google Scholar

Zoutendijk, G. (1970), ‘Nonlinear Programming, Computational Methods’, in Integer and Nonlinear Programming (Abadie, J., ed.), North-Holland (Amsterdam), 37–86.Google Scholar

Article contents

Theory of algorithms for unconstrained optimization

Extract

Access options

Article purchase

Temporarily unavailable

References

REFERENCES

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests