
Recursive bias estimation for multivariate regression smoothers

Published online by Cambridge University Press:  10 October 2014

Pierre-André Cornillon
Affiliation:
IRMAR, UMR 6625, Univ. Rennes 2, 35043 Rennes, France
N. W. Hengartner
Affiliation:
Stochastics Group, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
E. Matzner-Løber
Affiliation:
Lab. Mathématiques Appliquées, Agrocampus Ouest et Univ. Rennes 2, 35043 Rennes, France. [email protected]

Abstract

This paper presents a practical and simple fully nonparametric multivariate smoothing procedure that adapts to the underlying smoothness of the true regression function. Our estimator is easily computed by successive application of existing base smoothers (without the need to select an optimal smoothing parameter), such as thin-plate spline or kernel smoothers. The resulting smoother has better out-of-sample predictive capabilities than the underlying base smoother, or competing structurally constrained models (MARS, GAM), for small dimension (3 ≤ d ≤ 7) and moderate sample size (n ≤ 1000). Moreover, our estimator is still useful when d > 10 and, to our knowledge, no other adaptive fully nonparametric regression estimator is available without a structural assumption such as additivity. On a real example, the Boston Housing data, our method reduces the out-of-sample prediction error by 20%. An R package, ibr, available on CRAN, implements the proposed multivariate nonparametric method in R.
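
To make the recursion concrete, here is a minimal R sketch of the idea: start from a deliberately oversmoothing base smoother, then repeatedly smooth the current residuals to estimate the remaining bias and subtract it. This is an illustrative reconstruction, not the ibr package's code; the helper names nw_smoother and ibr_sketch, the Gaussian kernel, the bandwidth h, and the fixed iteration count iter are all assumptions made for this example.

# Hypothetical helper: builds the n x n Nadaraya-Watson smoothing matrix
# for a Gaussian kernel; a large bandwidth h yields an oversmoothed pilot fit.
nw_smoother <- function(X, h) {
  D2 <- as.matrix(dist(X))^2         # squared pairwise Euclidean distances
  W  <- exp(-D2 / (2 * h^2))         # Gaussian kernel weights
  sweep(W, 1, rowSums(W), "/")       # row-normalize so each row sums to 1
}

# Hypothetical sketch of the recursion: each pass smooths the residuals to
# estimate the bias of the current fit and adds the correction back in.
ibr_sketch <- function(X, y, h = 1, iter = 20) {
  S <- nw_smoother(X, h)
  m <- S %*% y                       # pilot (oversmoothed) estimate
  for (k in seq_len(iter)) {
    m <- m + S %*% (y - m)           # bias correction via residual smoothing
  }
  drop(m)
}

# Toy usage on simulated 3-dimensional data
set.seed(1)
n <- 200
X <- matrix(runif(n * 3), n, 3)
y <- sin(2 * pi * X[, 1]) + X[, 2]^2 + rnorm(n, sd = 0.3)
fit <- ibr_sketch(X, y, h = 0.8, iter = 50)
mean((y - fit)^2)                    # residual error shrinks as iter grows

The number of iterations plays the role of the smoothing parameter: too few passes leave the pilot's bias, too many drive the fit toward interpolating the data. The ibr package mentioned above selects this stopping point automatically from the data, which is what makes the procedure adaptive.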

Type
Research Article
Copyright
© EDP Sciences, SMAI 2014


References

B. Abdous, Computationally efficient classes of higher-order kernel functions. Can. J. Statist. 23 (1995) 21–27.
L. Breiman, Using adaptive bagging to debias regressions. Technical Report 547, Dept. of Statistics, UC Berkeley (1999).
L. Breiman and J. Friedman, Estimating optimal transformations for multiple regression and correlation. J. Amer. Statist. Assoc. 80 (1985) 580–598.
P. Bühlmann and B. Yu, Boosting with the L2 loss: regression and classification. J. Amer. Statist. Assoc. 98 (2003) 324–339.
P.-A. Cornillon, N. Hengartner and E. Matzner-Løber, Recursive bias estimation and L2 boosting. Technical report, arXiv:0801.4629 (2008).
P.-A. Cornillon, N. Hengartner and E. Matzner-Løber, ibr: Iterative Bias Reduction. CRAN (2010). http://cran.r-project.org/web/packages/ibr/index.html
P.-A. Cornillon, N. Hengartner, N. Jégou and E. Matzner-Løber, Iterative bias reduction: a comparative study. Statist. Comput. (2012).
P. Craven and G. Wahba, Smoothing noisy data with spline functions. Numer. Math. 31 (1979) 377–403.
M. Di Marzio and C. Taylor, On boosting kernel regression. J. Statist. Plann. Inference 138 (2008) 2483–2498.
R. Eubank, Nonparametric Regression and Spline Smoothing, 2nd edition. Dekker (1999).
W. Feller, An Introduction to Probability Theory and Its Applications, vol. 2. Wiley (1966).
J. Friedman, Multivariate adaptive regression splines. Ann. Statist. 19 (1991) 1–67.
J. Friedman, Greedy function approximation: a gradient boosting machine. Ann. Statist. 29 (2001) 1189–1232.
J. Friedman and W. Stuetzle, Projection pursuit regression. J. Amer. Statist. Assoc. 76 (1981) 817–823.
J. Friedman, T. Hastie and R. Tibshirani, Additive logistic regression: a statistical view of boosting. Ann. Statist. 28 (2000) 337–407.
C. Gu, Smoothing Spline ANOVA Models. Springer (2002).
L. Györfi, M. Kohler, A. Krzyżak and H. Walk, A Distribution-Free Theory of Nonparametric Regression. Springer-Verlag (2002).
D. Harrison and D. Rubinfeld, Hedonic prices and the demand for clean air. J. Environ. Econ. Manag. 5 (1978) 81–102.
T. Hastie and R. Tibshirani, Generalized Additive Models. Chapman & Hall (1995).
R.A. Horn and C.R. Johnson, Matrix Analysis. Cambridge University Press (1985).
C. Hurvich, J. Simonoff and C.-L. Tsai, Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion. J. Roy. Statist. Soc. B 60 (1998) 271–294.
O. Lepski, Asymptotically minimax adaptive estimation. I: Upper bounds. Optimally adaptive estimates. Theory Probab. Appl. 37 (1991) 682–697.
K.-C. Li, Asymptotic optimality for C_p, C_L, cross-validation and generalized cross-validation: discrete index set. Ann. Statist. 15 (1987) 958–975.
G. Ridgeway, Additive logistic regression: a statistical view of boosting: discussion. Ann. Statist. 28 (2000) 393–400.
L. Schwartz, Analyse IV: applications à la théorie de la mesure. Hermann (1993).
W. Stuetzle and Y. Mittal, Some comments on the asymptotic behavior of robust smoothers, in Smoothing Techniques for Curve Estimation, edited by T. Gasser and M. Rosenblatt. Springer-Verlag (1979) 191–195.
J. Tukey, Exploratory Data Analysis. Addison-Wesley (1977).
F. Utreras, Convergence rates for multivariate smoothing spline functions. J. Approx. Theory 52 (1988) 1–27.
J. Wendelberger, Smoothing Noisy Data with Multivariate Splines and Generalized Cross-Validation. Ph.D. thesis, University of Wisconsin (1982).
S. Wood, Thin plate regression splines. J. Roy. Statist. Soc. B 65 (2003) 95–114.
Y. Yang, Combining different procedures for adaptive regression. J. Multivariate Anal. 74 (2000) 135–161.