SUPERCONSISTENCY OF TESTS IN HIGH DIMENSIONS

Published online by Cambridge University Press:  28 October 2022

Anders Bredahl Kock*
Affiliation:
University of Oxford
David Preinerstorfer
Affiliation:
University of St. Gallen
* Address correspondence to Anders Bredahl Kock, University of Oxford, 10 Manor Road, Oxford OX1 3UQ, UK; e-mail: [email protected].

Abstract

To assess whether there is some signal in a big database, aggregate tests for the global null hypothesis of no effect are routinely applied in practice before more specialized analysis is carried out. Although a plethora of aggregate tests is available, each test has its strengths but also its blind spots. In a Gaussian sequence model, we study whether it is possible to obtain a test with substantially better consistency properties than the likelihood ratio (LR; i.e., Euclidean norm-based) test. We establish an impossibility result, showing that in the high-dimensional framework we consider, the set of alternatives for which a test may improve upon the LR test (i.e., its superconsistency points) is always asymptotically negligible in a relative volume sense.
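
As a rough sketch of the setting, the Gaussian sequence model and the Euclidean norm-based test can be written as follows. The notation is an assumption made here for illustration and is not taken from the abstract: $d$ is the dimension, $y$ the observation, $\theta$ the unknown mean vector, $\alpha$ the nominal level, and $\chi^2_{d,\,1-\alpha}$ the $(1-\alpha)$-quantile of the chi-squared distribution with $d$ degrees of freedom.

```latex
% Assumed notation (illustrative only): d, y, \theta, \varepsilon, \alpha.
\begin{align*}
  & y_i = \theta_i + \varepsilon_i, \qquad
    \varepsilon_i \overset{\text{i.i.d.}}{\sim} N(0,1), \qquad i = 1, \dots, d, \\
  & H_0 : \theta = 0 \quad \text{versus} \quad H_1 : \theta \neq 0, \\
  & \text{reject } H_0 \iff \|y\|_2^2 = \sum_{i=1}^{d} y_i^2 > \chi^2_{d,\,1-\alpha}.
\end{align*}
```

Under $H_0$ the statistic $\|y\|_2^2$ follows a chi-squared distribution with $d$ degrees of freedom, so this rejection rule has size $\alpha$ for every fixed $d$. The high-dimensional framework of the abstract corresponds to letting $d$ grow.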

Type
MISCELLANEA
Copyright
© The Author(s), 2022. Published by Cambridge University Press

Footnotes

We are grateful for the comments of the Editor, a Co-Editor, four referees, and the participants of the “High Voltage Econometrics” workshop, which helped to improve the previous version of the manuscript.
