GEOGRAPHIC RATEMAKING WITH SPATIAL EMBEDDINGS

Published online by Cambridge University Press:  04 October 2021

Christopher Blier-Wong
Affiliation:
École d’actuariat, Université Laval, Quebec, Canada; Centre de recherche en données massives, Université Laval, Quebec, Canada
Hélène Cossette
Affiliation:
École d’actuariat, Université Laval, Quebec, Canada; Centre de recherche en données massives, Université Laval, Quebec, Canada; Centre interdisciplinaire en modélisation mathématique, Université Laval, Quebec, Canada
Luc Lamontagne
Affiliation:
Département d’informatique et de génie logiciel, Université Laval, Quebec, Canada; Centre de recherche en données massives, Université Laval, Quebec, Canada
Etienne Marceau*
Affiliation:
École d’actuariat, Université Laval, Quebec, Canada; Centre de recherche en données massives, Université Laval, Quebec, Canada; Centre interdisciplinaire en modélisation mathématique, Université Laval, Quebec, Canada

Abstract

Spatial data are a rich source of information for actuarial applications: knowledge of a risk’s location could improve an insurance company’s ratemaking, reserving or risk management processes. Relying on historical geolocated loss data is problematic for areas where such data are limited or unavailable. In this paper, we construct spatial embeddings within a complex convolutional neural network representation model using external census data and use them as inputs to a simple predictive model. Compared to spatial interpolation models, our approach leads to smaller predictive bias and reduced variance in most situations. This method also enables us to generate rates in territories with no historical experience.

Type
Research Article
Copyright
© The Author(s), 2021. Published by Cambridge University Press on behalf of The International Actuarial Association

Supplementary material: PDF

Blier-Wong et al. supplementary material
