
Automatic selection of reliability estimates for individual regression predictions

Published online by Cambridge University Press: 01 March 2010

Zoran Bosnić, University of Ljubljana, Faculty of Computer and Information Science, Tržaška 25, Ljubljana, Slovenia
Igor Kononenko, University of Ljubljana, Faculty of Computer and Information Science, Tržaška 25, Ljubljana, Slovenia

Abstract

In machine learning and its risk-sensitive applications (e.g. medicine, engineering, business), reliability estimates for individual predictions provide more information about the individual prediction error (the difference between the true label and the regression prediction) than the average accuracy of the predictive model (e.g. relative mean squared error). They also enable users to distinguish between more and less reliable predictions. Empirical evaluations of the existing individual reliability estimates have shown that an estimate's performance depends on the regression model used and on the particular problem domain. In this paper, we address that problem directly: we propose and empirically evaluate two approaches for automatically selecting the most appropriate estimate for a given domain and regression model, namely the internal cross-validation approach and the meta-learning approach. In our tests, both approaches produced dynamically chosen reliability estimates that outperformed the individual reliability estimates. The best results were achieved with the internal cross-validation procedure, where the selected reliability estimates correlated significantly and positively with the prediction error in 73% of the experiments. In addition, preliminary testing of the proposed methodology on a medical domain demonstrated its potential for use in practice.
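The internal cross-validation selection described above can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the regressor (a plain k-nearest-neighbour model), the two candidate reliability estimates (`local_variance` and `neighbor_distance`), and the synthetic heteroscedastic data are all assumptions chosen to keep the example self-contained, whereas the paper evaluates a broader set of estimates and regression models. The core idea is the same: on internal cross-validation folds, correlate each candidate estimate with the observed absolute prediction error, then select the estimate with the strongest correlation.

```python
# Sketch (assumed, simplified): pick the reliability estimate that best
# correlates with prediction error via internal cross-validation.
import random
import statistics

def knn_predict(train, x, k=5):
    """Plain k-nearest-neighbour regression prediction."""
    neighbours = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in neighbours) / k

def local_variance(train, x, k=5):
    """Candidate estimate: variance of the k nearest labels (higher = less reliable)."""
    neighbours = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return statistics.pvariance([y for _, y in neighbours])

def neighbor_distance(train, x, k=5):
    """Candidate estimate: mean distance to the k nearest points (density proxy)."""
    dists = sorted(abs(p[0] - x) for p in train)[:k]
    return sum(dists) / k

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb) if sa and sb else 0.0

def select_estimate(data, estimates, folds=5):
    """Internal CV: return the estimate whose values correlate best with |error|."""
    scores = {}
    for name, est in estimates.items():
        vals, errs = [], []
        for i in range(folds):
            test = data[i::folds]
            train = [p for j, p in enumerate(data) if j % folds != i]
            for x, y in test:
                vals.append(est(train, x))
                errs.append(abs(knn_predict(train, x) - y))
        scores[name] = pearson(vals, errs)
    return max(scores, key=scores.get), scores

# Synthetic heteroscedastic data: noise grows with x, so a good reliability
# estimate should track the local error level.
random.seed(0)
xs = [random.uniform(0, 10) for _ in range(200)]
data = [(x, 2 * x + random.gauss(0, 1 + 0.2 * x)) for x in xs]

estimates = {"local_variance": local_variance,
             "neighbor_distance": neighbor_distance}
best, scores = select_estimate(data, estimates)
print(best, {k: round(v, 2) for k, v in scores.items()})
```

Because the synthetic noise level varies across the input space, the estimate that captures local label variability should obtain the higher correlation and be selected for subsequent use on new predictions.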

Type: Articles
Copyright: © Cambridge University Press 2010

