PERFORMANCE OF EMPIRICAL RISK MINIMIZATION FOR LINEAR REGRESSION WITH DEPENDENT DATA

Christian Brownlees; Guðmundur Stefán Guðmundsson

doi:10.1017/S0266466623000348

Published online by Cambridge University Press: 10 November 2023

Christian Brownlees*: Universitat Pompeu Fabra and Barcelona SE
Guðmundur Stefán Guðmundsson: Aarhus University
*: Address correspondence to Christian Brownlees, Department of Economics and Business, Universitat Pompeu Fabra and Barcelona SE, Barcelona, Spain; e-mail: [email protected].

Abstract

This paper establishes bounds on the performance of empirical risk minimization for large-dimensional linear regression. We generalize existing results by allowing the data to be dependent and heavy-tailed. The analysis covers both the cases of identically and heterogeneously distributed observations. Our analysis is nonparametric in the sense that the relationship between the regressand and the regressors is not specified. The main results of this paper show that the empirical risk minimizer achieves the optimal performance (up to a logarithmic factor) in a dependent data setting.

Type: ARTICLES
Information: Econometric Theory, First View, pp. 1–30

DOI: https://doi.org/10.1017/S0266466623000348
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright: © The Author(s), 2023. Published by Cambridge University Press

Footnotes

We have benefited from discussions with Liudas Giraitis, Emmanuel Guerre, Petra Laketa, Gabor Lugosi, Stanislav Nagy, Jordi Llorens-Terrazas, Yaping Wang, and Geert Mesters, as well as seminar participants at the Granger Center, Nottingham University and the School of Economics and Finance, Queen Mary University of London. We would also like to thank the Co-Editor Liangjun Su and two anonymous referees for their useful comments. Christian Brownlees acknowledges support from the Spanish Ministry of Science and Technology (Grant No. MTM2012-37195); the Severo Ochoa Programme for Centres of Excellence in R&D (Barcelona School of Economics CEX2019-000915-S), funded by MCIN/AEI/10.13039/501100011033; and the Ayudas Fundación BBVA Proyectos de Investigación Científica en Matemáticas 2021. Guðmundur Stefán Guðmundsson acknowledges financial support from the Danish National Research Foundation (DNRF Chair Grant No. DNRF154).

