Book contents
- Frontmatter
- Contents
- Contributors
- Introduction
- Part 1 Graphical Structure
- Part 2 Language Restrictions
- Part 3 Algorithms and their Analysis
- 6 Tree-Reweighted Message Passing
- 7 Tractable Optimization in Machine Learning
- 8 Approximation Algorithms
- 9 Kernelization Methods for Fixed-Parameter Tractability
- Part 4 Tractability in Some Specific Areas
- Part 5 Heuristics
7 - Tractable Optimization in Machine Learning
from Part 3 - Algorithms and their Analysis
Published online by Cambridge University Press: 05 February 2014
Summary
Machine learning and data analysis have driven explosive growth of interest in methods for large-scale optimization. Many widely used techniques, such as stochastic gradient methods, date back several decades, but their practical success has given them renewed importance in machine learning. First-order methods had already been studied and theoretically analyzed in substantial detail before interior-point methods came to dominate optimization; interest in them then skyrocketed with the rise of applications in machine learning, signal processing, and related areas. This chapter is a brief introduction to this vast and flourishing area of large-scale optimization.
Introduction
Machine Learning (ML) broadly encompasses a variety of adaptive, autonomous, and intelligent tasks where one must “learn” to predict from observations and feedback. Throughout its evolution, ML has drawn heavily and successfully on optimization algorithms; this relation to optimization is not surprising as “learning” and “adapting” ultimately involve problems where some quality function must be optimized.
But the interaction between ML and optimization is now undergoing rapid change. The increased size, complexity, and variety of ML problems not only prompt a refinement of existing optimization techniques, but also spur the development of new methods tuned to the specific needs of ML applications.
In particular, ML applications must usually cope with large-scale data, which forces us to prefer “simpler,” perhaps less accurate but more scalable algorithms. Such methods can also crunch through more data, and may actually be better suited for learning – for a more precise characterization see [11]. The use of possibly less accurate methods is also grounded in pragmatic concerns: modeling limitations, observational noise, uncertainty, and computational errors are pervasive in real data.
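The scalability trade-off described above can be made concrete with a minimal sketch (not from the chapter itself): a stochastic gradient method for one-dimensional least squares. Each update uses the gradient of a single randomly drawn sample, so the per-step cost is independent of the dataset size — "simpler," noisier, but scalable. The function name, step size, and toy data below are illustrative choices, not anything prescribed by the text.

```python
import random

def sgd_least_squares(data, lr=0.01, steps=100, seed=0):
    """Minimize (1/n) * sum over (x, y) of (w*x - y)^2 by stochastic gradient.

    Each step samples one (x, y) pair and follows the gradient of that
    single sample's loss, so the cost per iteration does not grow with
    the dataset -- the scalability argument made in the text.
    """
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        x, y = rng.choice(data)          # draw one sample
        grad = 2.0 * (w * x - y) * x     # gradient of that sample's loss
        w -= lr * grad                   # noisy descent step
    return w

# Toy data generated from y = 3x; SGD should recover w close to 3.
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]
w = sgd_least_squares(data, lr=0.05, steps=500)
```

The individual steps are noisy (each uses a single-sample gradient), yet the iterates contract toward the true coefficient — a small-scale illustration of why inexact-but-cheap updates can suffice when data, noise, and modeling error already limit attainable accuracy.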
Tractability: Practical Approaches to Hard Problems, pp. 202-230. Publisher: Cambridge University Press. Print publication year: 2014.