
On the prediction of claim duration for income protection insurance policyholders

Published online by Cambridge University Press:  04 November 2013

Qing Liu*
Affiliation:
Centre for Actuarial Studies, Faculty of Business and Economics, The University of Melbourne
David Pitt
Affiliation:
Department of Applied Finance and Actuarial Studies, Faculty of Business and Economics, Macquarie University
Xueyuan Wu
Affiliation:
Centre for Actuarial Studies, Faculty of Business and Economics, The University of Melbourne
*Correspondence to: Qing Liu, Centre for Actuarial Studies, Faculty of Business and Economics, The University of Melbourne, VIC 3010, Australia.

Abstract

This paper explores how modern data mining techniques can be applied to better understand Australian Income Protection Insurance (IPI). We provide a fast and objective method of scoring claims into different portfolios using available rating factors. Results from fitting several prediction models are compared using both the conventional prediction error loss function and a modified loss function. Using a misclassification plot, we demonstrate that all the data mining methods under consideration have clear predictive power. We also point out that this predictability can be masked when only the conventional prediction error function is examined. We then suggest using stepwise regression to reduce the number of variables used in the data mining methods. In addition to this variable selection method, we use principal components analysis to better understand the rating factors that drive the claim durations of insured lives. We also discuss and compare how different variable combining techniques can be used to weight the available predictor variables. One interesting finding is that principal components analysis and the weighted combination prediction model give very consistent results in identifying the most significant variables for explaining claim durations.

Type
Papers
Copyright
Copyright © Institute and Faculty of Actuaries 2013 

