TREE-BASED MACHINE LEARNING METHODS FOR MODELING AND FORECASTING MORTALITY

Dorethe Skovgaard Bjerre

doi:10.1017/asb.2022.11

Published online by Cambridge University Press: 20 May 2022

Affiliation: Dorethe Skovgaard Bjerre, CREATES and Department of Economics and Business Economics, Aarhus University, Fuglesangs Allé 4, 8210 Aarhus V, Denmark. E-mail: [email protected]

Abstract

Machine learning has recently entered the mortality literature as a way to improve the forecasts of stochastic mortality models. This paper proposes two pure tree-based machine learning models, random forests and gradient boosting, based on the differenced log-mortality rates, to produce more accurate mortality forecasts. These forecasts are compared with forecasts from traditional stochastic mortality models and with forecasts from random forest and gradient boosting variants of the stochastic models. The comparisons are based on the Model Confidence Set procedure. The results show that the pure tree-based models significantly outperform all other models in the majority of cases considered. To address the lack of interpretability associated with machine learning models, we demonstrate how to extract information about the relationships uncovered by the tree-based models. For this purpose, we consider variable importance, partial dependence plots, and variable split conditions. Results from the in-sample fit suggest that tree-based models can be very useful tools for detecting patterns within and between variables that are not commonly identifiable with traditional methods.
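As a rough illustration of the target variable named in the abstract, the sketch below computes differenced log-mortality rates and converts a path of predicted differences back into rates. It is a minimal sketch under stated assumptions, not the paper's implementation: the toy rates, the feature set (age, year, cohort), and the helper names `differenced_log_rates` and `rebuild_rates` are all hypothetical, and the tree-based model that would be fitted to these rows is left out.

```python
import math

# Toy central death rates m[age][year]; the numbers are made up for illustration.
rates = {
    60: {2000: 0.0100, 2001: 0.0098, 2002: 0.0095},
    61: {2000: 0.0110, 2001: 0.0107, 2002: 0.0105},
}

def differenced_log_rates(rates):
    """Return rows (age, year, cohort, d_log_m).

    d_log_m = log m[age, year] - log m[age, previous year].
    Assumes the years available for each age have no gaps.
    The (age, year, cohort) features are an assumed, illustrative choice.
    """
    rows = []
    for age, by_year in sorted(rates.items()):
        years = sorted(by_year)
        for prev, curr in zip(years, years[1:]):
            d = math.log(by_year[curr]) - math.log(by_year[prev])
            rows.append((age, curr, curr - age, d))
    return rows

def rebuild_rates(last_rate, predicted_diffs):
    """Cumulate predicted log-differences from the last observed rate."""
    out = []
    log_m = math.log(last_rate)
    for d in predicted_diffs:
        log_m += d
        out.append(math.exp(log_m))
    return out

rows = differenced_log_rates(rates)
```

In a forecasting setting, the last element of each row would serve as the response for a regression model, and `rebuild_rates` would map the model's predicted differences back to the rate scale.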

Keywords

Mortality; random forests; gradient boosting; stochastic mortality models; model confidence set

Type: Research Article
Information: ASTIN Bulletin: The Journal of the IAA, Volume 52, Issue 3, September 2022, pp. 765–787

DOI: https://doi.org/10.1017/asb.2022.11
Copyright: © The Author(s), 2022. Published by Cambridge University Press on behalf of The International Actuarial Association
