Moderate deviations inequalities for Gaussian process regression

Jialin Li; Ilya O. Ryzhov

doi:10.1017/jpr.2023.30


Part of: Design of experiments; Stochastic processes

Published online by Cambridge University Press: 05 June 2023


Jialin Li*
Affiliation: University of Toronto
Ilya O. Ryzhov**
Affiliation: University of Maryland
*Postal address: Rotman School of Management, University of Toronto, Ontario, Canada. Email address: [email protected]
**Postal address: Robert H. Smith School of Business, University of Maryland, College Park, MD 20742, USA. Email address: [email protected]


Abstract

Gaussian process (GP) regression is widely used to model an unknown function on a continuous domain by interpolating a discrete set of observed design points. We develop a theoretical framework for proving new moderate deviations inequalities for the different types of error probabilities that arise in GP regression. Two examples of broad interest are the probability of falsely ordering a pair of points (incorrectly estimating one point as being better than the other) and the tail probability of the estimation error at an arbitrary point. Our inequalities connect these probabilities to the mesh norm, which measures how well the design points fill the space.
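The quantities named in the abstract can be made concrete with a small numerical sketch. The full article is not reproduced on this page, so everything below is an illustrative assumption rather than the authors' construction: a one-dimensional domain [0, 1], a squared-exponential kernel with a fixed length scale, noise-free observations, a mesh norm approximated on a finite grid, and a false-ordering probability computed under the GP posterior. All function names are hypothetical.

```python
import math
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between 1-D point sets (assumed kernel choice)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def gp_posterior(x_design, y_design, x_query, jitter=1e-8):
    """Posterior mean and covariance of a zero-mean GP given noise-free data.

    A small jitter is added to the design kernel matrix for numerical stability.
    """
    K = rbf_kernel(x_design, x_design) + jitter * np.eye(len(x_design))
    K_s = rbf_kernel(x_query, x_design)
    mean = K_s @ np.linalg.solve(K, y_design)
    cov = rbf_kernel(x_query, x_query) - K_s @ np.linalg.solve(K, K_s.T)
    return mean, cov


def mesh_norm(x_design, x_grid):
    """Grid approximation of the mesh norm (fill distance): the largest
    distance from any grid point to its nearest design point."""
    dists = np.abs(x_grid[:, None] - x_design[None, :])
    return dists.min(axis=1).max()


def false_ordering_probability(mean, cov):
    """Posterior probability that the true ordering of two points is the
    reverse of the ordering of their posterior means (assumed reading of
    'false ordering')."""
    gap = abs(mean[0] - mean[1])
    var = cov[0, 0] + cov[1, 1] - 2.0 * cov[0, 1]
    if var <= 0.0:  # difference is numerically deterministic
        return 0.0
    z = gap / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # standard normal upper tail


def true_function(x):
    """Hypothetical test function standing in for the unknown function."""
    return np.sin(2.0 * np.pi * x)


grid = np.linspace(0.0, 1.0, 1001)
pair = np.array([0.30, 0.40])  # the two points being compared
for n in (5, 9):
    x = np.linspace(0.0, 1.0, n)  # equally spaced design points
    mean, cov = gp_posterior(x, true_function(x), pair)
    print(n, mesh_norm(x, grid), false_ordering_probability(mean, cov))
```

Running the loop with more design points shrinks the approximate mesh norm; the printed probabilities indicate how the posterior uncertainty about the ordering of the chosen pair responds under these assumptions. The sketch says nothing about the rates established in the article.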

Keywords

Gaussian process regression; moderate deviations; interpolation theory

MSC classification

Primary: 60G15: Gaussian processes

Secondary: 62K99: None of the above, but in this section

Type: Original Article
Information: Journal of Applied Probability, Volume 61, Issue 1, March 2024, pp. 172–197

DOI: https://doi.org/10.1017/jpr.2023.30
Copyright: © The Author(s), 2023. Published by Cambridge University Press on behalf of Applied Probability Trust


