
UNIFORM-IN-SUBMODEL BOUNDS FOR LINEAR REGRESSION IN A MODEL-FREE FRAMEWORK

Published online by Cambridge University Press: 04 June 2021

Arun K. Kuchibhotla*
Affiliation:
Carnegie Mellon University
Lawrence D. Brown
Affiliation:
University of Pennsylvania
Andreas Buja
Affiliation:
University of Pennsylvania
Edward I. George
Affiliation:
University of Pennsylvania
Linda Zhao
Affiliation:
University of Pennsylvania
*
Address correspondence to Arun K. Kuchibhotla, Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA; e-mail: [email protected].

Abstract


For the last two decades, high-dimensional data and methods have proliferated throughout the literature. Yet, the classical technique of linear regression has not lost its usefulness in applications. In fact, many high-dimensional estimation techniques can be seen as variable selection that leads to a smaller set of variables (a “submodel”) where classical linear regression applies. We analyze linear regression estimators resulting from model selection by proving estimation error and linear representation bounds uniformly over sets of submodels. Based on deterministic inequalities, our results provide “good” rates when applied to both independent and dependent data. These results are useful in meaningfully interpreting the linear regression estimator obtained after exploring and reducing the variables and also in justifying post-model-selection inference. All results are derived under no model assumptions and are nonasymptotic in nature.
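To illustrate the kind of estimator the paper studies, the following is a minimal Python sketch (not taken from the paper) of post-model-selection linear regression: a data-driven selection step, here a cross-validated lasso from scikit-learn, picks a submodel M, and classical OLS is then refit on the selected columns. The simulated data, the use of LassoCV, and the variable names are illustrative assumptions; the paper's results bound the estimation error of such refitted OLS estimators uniformly over submodels, without assuming a correct linear model.

```python
# Illustrative sketch (assumptions: simulated Gaussian design, lasso-based selection).
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 1.0                       # only the first 5 variables carry signal
y = X @ beta + rng.standard_normal(n)

# Step 1: variable selection (here: cross-validated lasso).
selector = LassoCV(cv=5).fit(X, y)
M = np.flatnonzero(selector.coef_)   # indices of the selected submodel

# Step 2: classical OLS refit on the selected submodel.
beta_M, *_ = np.linalg.lstsq(X[:, M], y, rcond=None)

# beta_M estimates the best linear predictor within submodel M; the paper's
# bounds control the error of such estimators uniformly over submodels.
print(dict(zip(M.tolist(), np.round(beta_M, 2).tolist())))
```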

Type
ARTICLES
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2021. Published by Cambridge University Press

Footnotes

We would like to thank Abhishek Chakrabortty for discussions that led to Remark 4.5. We would also like to thank the reviewers and the Editor for their constructive comments, which have led to a better presentation.
