Hostname: page-component-7bb8b95d7b-pwrkn Total loading time: 0 Render date: 2024-09-28T20:21:45.091Z Has data issue: false hasContentIssue false

Valuation of guaranteed minimum accumulation benefits (GMABs) with physics-inspired neural networks

Published online by Cambridge University Press:  13 May 2024

Donatien Hainaut*
Affiliation:
UCLouvain, LIDAM, Ottignies-Louvain-la-Neuve, Belgium
*
Corresponding author: Donatien Hainaut; Email: [email protected]
Rights & Permissions [Opens in a new window]

Abstract

Guaranteed minimum accumulation benefits (GMABs) are retirement savings vehicles that protect the policyholder against downside market risk. This article proposes a valuation method for these contracts based on physics-inspired neural networks (PINNs), in the presence of multiple financial and biometric risk factors. A PINN integrates principles from physics into its learning process to enhance its efficiency in solving complex problems. In this article, the driving principle is the Feynman–Kac (FK) equation, which is a partial differential equation (PDE) governing the GMAB price in an arbitrage-free market. In our context, the FK PDE depends on multiple variables and is difficult to solve using classical finite difference approximations. In comparison, PINNs constitute an efficient alternative that can evaluate GMABs with various specifications without the need for retraining. To illustrate this, we consider a market with four risk factors. We first derive a closed-form expression for the GMAB that serves as a benchmark for the PINN. Next, we propose a scaled version of the FK equation that we solve using a PINN. Pricing errors are analyzed in a numerical illustration.

Type
Original Research Paper
Copyright
© The Author(s), 2024. Published by Cambridge University Press on behalf of Institute and Faculty of Actuaries

1. Introduction

A physics-inspired neural network (PINN) incorporates principles from physics into its learning process, enhancing its efficiency in solving complex scientific problems. Compared to existing approaches, PINNs reduce the reliance on large datasets and provide accurate predictions even with sparse or noisy measurements. Researchers have employed PINNs to address a diverse range of problems, including fluid dynamics, solid mechanics, heat transfer, and quantum mechanics.

In this study, we demonstrate that PINNs can be used for the valuation and design of specific variable annuities (VAs) known as guaranteed minimum accumulation benefits (GMABs). This retirement savings vehicle guarantees the policyholder the higher of the account value or a specified amount upon maturity, assuming survival. The proposed model incorporates the following risk factors: interest rates, equity prices, and mortality rates. We also consider four asset classes: cash, zero-coupon bonds, stocks, and mortality-linked bonds. Our goal is to develop a single neural network capable of pricing GMABs with various specifications, including different investment strategies and maturities.

In summary, a PINN serves as an approximate solution to a nonlinear partial differential equation (PDE) that describes the dynamics of a model. The neural network is trained by minimizing the approximation error at sampled points within the PDE domain and on its boundaries. This method originates from the work of Lee & Kang (Reference Lee and Kang1990) and has since been applied to various nonlinear PDEs. For example, Raissi et al. (Reference Raissi, Perdikaris and Karniadakis2019) used PINNs for solving two main classes of mathematical problems: data-driven solutions and data-driven discovery of PDEs. In the field of physics, Carleo & Troyer (Reference Carleo and Troyer2017) and Cai (Reference Cai2018) employed PINNs to accurately approximate quantum many-body wave functions. For further information and recent developments and applications of PINNs, we recommend referring to Cuomo et al. (Reference Cuomo, Schiano Di Cola, Giampaolo, Rozza, Raissi and Piccialli2022) for a comprehensive literature review.

The literature on the valuation of financial and actuarial liabilities using neural networks is relatively recent. Hejazi & Jackson (Reference Hejazi and Jackson2016) developed a neural network to price and estimate the "Greeks" for a large portfolio of VAs. Doyle & Groendyke (Reference Doyle and Groendyke2019) priced and hedged equity-linked contracts using neural networks. They constructed a dataset of variable annuity prices with various features through Monte Carlo (MC) simulations. The neural network was then estimated by minimizing prediction errors using a scaled conjugate gradient descent. Sirignano & Spiliopoulos (Reference Sirignano and Spiliopoulos2018) introduced a deep Galerkin method (DGM) based on a network architecture inspired by long short-term memory (LSTM) neural cells. Their approach was tested on a class of high-dimensional free boundary PDEs and on high-dimensional Hamilton–Jacobi–Bellman PDEs and Burgers’ equation. Gatta et al. (Reference Gatta, Di Cola, Giampaolo and Piccialli F.Cuomo2023) evaluated a suitable PINN for the pricing of American multiasset options and proposed a novel algorithmic technique for free boundary training. Al-Aradi et al. (Reference Al-Aradi, Correia, Jardim, de Freitas Naiff and Saporito2022) extended the DGM in several directions to solve Fokker–Planck and Hamilton–Jacobi–Bellman equations. More recently, Jiang et al. (Reference Jiang, Sirignano and Cohen2023) demonstrated, under mild assumptions, the convergence of the deep Galerkin and PINNs method for solving PDEs. Glau & Wunderlich (Reference Glau and Wunderlich2022) formalized and analyzed the deep parametric PDE method for solving high-dimensional parametric PDEs.

A close alternative to PINN is developed in Weinan et al. (Reference Weinan, Han and Jentzen2017), where they solve parabolic PDEs in high dimensions using neural networks connected to backward stochastic differential equations (BSDEs). This approach employs two separate networks for the solution and its gradient. They apply their algorithm to price options in a multivariate Black and Scholes model. Beck et al. (Reference Beck, Weinan and Jentzen2019) introduce a similar method for solving high-dimensional PDEs based on a connection between fully nonlinear second-order PDEs and second-order backward stochastic differential equations. They illustrate the accuracy of their method with the Black and Scholes and Hamilton–Jacobi–Bellman equations. Using this principle, Barigou & Delong (Reference Barigou and Delong2022) evaluate equity-linked life insurances with neural networks by solving a BSDE, building upon previous results from Delong et al. (Reference Delong, Dhaene and Barigou2019). Unlike PINNs, BSDE methods rely on two networks instead of one, and any modification of contract specifications requires retraining.

Another valuation approach based on neural networks relies on simulations. Buehler et al. (Reference Buehler, Gonon, Teichmann and Wood2019) present a framework for pricing and hedging derivatives using neural networks that are trained on simulated sample paths of risk factors. In this case, the neural network approximates the optimal investment strategy, and the value is determined by averaging discounted payoffs obtained through simulations. Similarly, Horvath et al. (Reference Horvath, Teichmann and Zuric2021) studied the performance of deep hedging under rough volatility models, while Biagini et al. (Reference Biagini, Gonon and Reitsam2023) developed a neural network approximation using simulations for the superhedging price and replicating strategy. As with BSDE approaches, any modification of contract specifications requires retraining. For a comprehensive review of algorithms based on neural networks for stochastic control and PDEs in finance, we recommend referring to Germain et al. (Reference Germain, Pham and Warin2021).

In the financial and insurance industry, four dominant numerical methods are commonly used for pricing contingent claim contracts: MC simulations, bi- or trinomial trees, PDE solving, or fast Fourier transform (FFT) inversion. For situations involving more than two risk factors, trees, PDE solving, or FFT methods become too complex, making MC simulations the only viable alternative. In contrast, the valuation using a PINN does not require the simulation of sample paths of underlying risk factors. Furthermore, once the PINN is trained, valuation becomes nearly instantaneous. Another noteworthy feature is the ability to parameterize a PINN with contract features such as asset-mix or contract durations. This provides a quick solution for understanding the impact of asset-liability management (ALM) on pricing without the need to retrain the network for each setting. This presents a significant advantage compared to BSDE or deep hedging approaches.

This article makes the following contributions. First, we develop a closed-form expression for the price of a GMAB with participation in a market consisting of five classes of assets, including a mortality bond for hedging exposure to longevity risk. This analytical formula enables us to assess the pricing accuracy of a large set of contracts using PINNs without the need for MC simulations. Next, we introduce a specific type of neural network in which intermediate layers are fed with both the output of the previous layer and the initial input vector. Such a network better captures the nonlinear behavior of GMAB prices compared to classical feed-forward networks. Third, we consider risk factors (interest and mortality rates) driven by mean-reverting processes, whereas the existing literature on PINNs predominantly focuses on high-dimensional geometric processes. Lastly, we train the network for various investment strategies, asset mixes, and contract durations. In this sense, our model is parametric and offers significant flexibility compared with BSDE or deep hedging methods, which require retraining for each specific setting.

The article is structured as follows. The next section introduces the financial and biometric risk factors. We consider a multivariate Brownian setting to maintain the analytical tractability needed for benchmarking the PINN. In Section 3, we develop the closed-form expressions for zero-coupon bonds, survival probabilities, and pure endowments. Section 4 provides detailed specifications of the GMAB, including the assumption that assets are invested in a portfolio of cash, bonds, stocks, and mortality bonds. We offer the analytical formula for pricing the GMAB within this framework. Section 5 presents a scaled version of the Feynman–Kac equation that GMAB prices satisfy. In Section 6, we outline the architecture of the neural network used to solve this PDE and explain the training procedure. The final section offers a comprehensive analysis of pricing errors based on a validation set of 500,000 contracts with various features.

2. Risk factors

Our objective is to demonstrate the effectiveness of PINNs in valuing GMABs and establishing investment strategies for them. To validate our methodology, we compare the exact GMAB prices to the estimates generated by the neural network. To facilitate this comparison, we consider Brownian financial and biometric risk factors and derive a closed-form expression for the GMAB price. An alternative validation approach would involve evaluating GMAB prices through MC simulations, but this method is computationally intensive and less accurate. This section introduces the dynamics of risk factors, while Section 3 provides analytical expressions for bond prices and survival probabilities. Finally, we obtain the analytical value of the GMAB in Section 4.

The GMABs provide policyholders with a promise that the value of their investment, conditionally to their survival, will reach a minimum predetermined amount at expiry, regardless of how the underlying investments perform. The value of GMABs depends both on biometrical and financial risk factors that we detail in this section. We consider an hybrid market in which we invest in cash, stocks, interest rate bonds, and mortality bonds. The stock indice and the interest rate are, respectively, denoted by $\left (S_{t}\right )_{t\geq 0}$ , $\left (r_{t}\right )_{t\geq 0}$ . The insured is $x_{\mu }$ year old, and the reference force of mortality is denoted by $\left (\mu _{x+t}\right )_{t\geq 0}$ . The insurer can also invest in mortality bonds written on a reference population of age $x_{\lambda }$ and mortality rates $\left (\lambda _{x_{\lambda }+t}\right )_{t\geq 0}$ .

The financial and biometric processes are defined on a probability space $\left ({\Omega },\mathcal{F},\mathbb{P}\right )$ associated to four independant Brownian motions under $\mathbb{P}$ , ${\tilde{\boldsymbol{W}}}_{t}=\left (\tilde{W}_{t}^{(1)},\tilde{W}_{t}^{(2)},\tilde{W}_{t}^{(3)},\tilde{W}_{t}^{(4)}\right )_{t\geq 0}^{\top }$ . The state variables $\left (S_{t},r_{t},\mu _{x_{\mu }+t},\lambda _{x_{\lambda }+t}\right )$ are driven by the following SDEs:

(1) \begin{eqnarray} d\left (\begin{array}{c} S_{t}\\ r_{t}\\ \mu _{x_{\mu }+t}\\ \lambda _{x_{\lambda }+t} \end{array}\right ) & = & \left (\begin{array}{c} \left (r_{t}+\nu _{S}\sigma _{S}\right )\,S_{t}\\ \kappa _{r}\left (\gamma _{r}(t)-\nu _{r}\frac{\sigma _{r}}{\kappa _{\kappa }}-r_{t}\right )\\ \kappa _{\mu }\left (\gamma _{\mu }(t)-\nu _{\mu }\frac{\sigma _{\mu }(t)}{\kappa _{\mu }}-\mu _{x_{\mu }+t}\right )\\ \kappa _{\lambda }\left (\gamma _{\lambda }(t)-\nu _{\lambda }\frac{\sigma _{\lambda }(t)}{\kappa _{\lambda }}-\lambda _{x_{\lambda }+t}\right ) \end{array}\right )dt+\left (\begin{array}{c} S_{t}\sigma _{S}\Sigma _{S}^{\top }\\ \sigma _{r}\Sigma _{r}^{\top }\\ \sigma _{\mu }(t)\Sigma _{\mu }^{\top }\\ \sigma _{\lambda }(t)\Sigma _{\lambda }^{\top } \end{array}\right )d{\tilde{\boldsymbol{W}}}_{t}\,, \end{eqnarray}

where $\kappa _{r}$ , $\kappa _{\mu }$ , $\kappa _{\lambda }$ , $\sigma _{S}$ , and $\sigma _{r}$ belong to $\mathbb{R}^{+}$ , whereas $\gamma _{r}(t)$ , $\gamma _{\mu }(t)$ , $\gamma _{\lambda }(t)$ , $\sigma _{\mu }(t)$ , and $\sigma _{\lambda }(t)$ are positive functions of time. $\gamma _{r}(t)$ $\gamma _{\mu }(t)$ , and $\gamma _{\lambda }(t)$ are fitted to term structures of interest and mortality rates. Details about the estimation procedure are provided in Appendix. Notice that $\gamma _{\mu }(t)$ , $\gamma _{\lambda }(t)$ , $\sigma _{\mu }(t)$ , and $\sigma _{\lambda }(t)$ depend on the initial age of the insured and of the hedging population; $x_{\mu },x_{\lambda }\in [0,x_{max})$ , $x_{max}\gt 0$ .

In our framework, mortality rates have a mean-reverting Gaussian dynamic, which offers a good trade-off between complexity and analytical tractability. Milevsky & Promislow (Reference Milevsky and Promislow2001) were among the firsts to consider mean-reverting stochastic processes for modeling mortality. Luciano & Vigna (Reference Luciano and Vigna2005) developed a jump-diffusion affine model for modeling rates and showed that constant mean reverting process are not adapted for describing the mortality. Luciano & Vigna (Reference Luciano and Vigna2008) calibrated various time homogeneous mean-reverting models to different generations in the UK population and investigate their empirical appropriateness. Jevtic et al. (Reference Jevtic, Luciano and Vigna2013) studied and calibrated a cohort-based model which captures the characteristics of a mortality surface with a continuous-time factor approach. Zeddouk & Devolder (Reference Zeddouk and Devolder2020) modeled the force of mortality with mean reverting affine processes, where the level of reversion is time dependent. Hainaut (Reference Hainaut2023) proposed a mortality model based on a mean reverting random field indexed by time and age.

In Equation (1), the parameters $\nu _{S}$ , $\nu _{r}$ , $\nu _{\mu }$ , and $\nu _{\lambda }$ tune the risk premiums $\nu _{S}\sigma _{S}$ , $-\nu _{r}\sigma _{r}$ , $-\nu _{\mu }\sigma _{\mu }(t)$ , $-\nu _{\lambda }\sigma _{\lambda }(t)$ of processes. Furthermore, we assume that the standard deviation of mortality rates is related to age through the relations $\sigma _{\mu }(t)=\alpha _{\mu }e^{\beta _{\mu }(x_{\mu }+t)}$ and $\sigma _{\lambda }(t)=\alpha _{\lambda }e^{\beta _{\lambda }(x_{\lambda }+t)}$ . $\Sigma _{S}$ , $\Sigma _{r}$ , $\Sigma _{\mu }$ , and $\Sigma _{\lambda }$ are vectors such that $\Sigma =\left (\Sigma _{S}^{\top },\Sigma _{r}^{\top }\Sigma _{\mu }^{\top },\Sigma _{\lambda }^{\top }\right )$ is the (upper) Choleski decomposition of the correlation matrix:

\begin{eqnarray*} \Sigma & =\left (\begin{array}{c} \Sigma _{S}^{\top }\\[5pt] \Sigma _{r}^{\top }\\[5pt] \Sigma _{\mu }^{\top }\\[5pt] \Sigma _{\lambda }^{\top } \end{array}\right )= & \left (\begin{array}{c@{\quad}c@{\quad}c@{\quad}c} \epsilon _{SS} & \epsilon _{Sr} & \epsilon _{S\mu } & \epsilon _{S\lambda }\\[5pt] 0 & \epsilon _{rr} & \epsilon _{r\mu } & \epsilon _{r\lambda }\\[5pt] 0 & 0 & \epsilon _{\mu \mu } & \epsilon _{\mu \lambda }\\[5pt] 0 & 0 & 0 & 1 \end{array}\right )\quad,\quad \left (\begin{array}{c@{\quad}c@{\quad}c@{\quad}c} 1 & \rho _{Sr} & \rho _{S\mu } & \rho _{S\lambda }\\[5pt] \rho _{Sr} & 1 & \rho _{r\mu } & \rho _{r\lambda }\\[5pt] \rho _{S\mu } & \rho _{r\mu } & 1 & \rho _{\mu \lambda }\\[5pt] \rho _{S\lambda } & \rho _{r\lambda } & \rho _{\mu \lambda } & 1 \end{array}\right )=\Sigma \Sigma ^{\top }\,, \end{eqnarray*}

where $\rho _{Sr,}\,\rho _{S\mu },\,\rho _{S\lambda },\rho _{r\mu },\rho _{r\lambda }$ , and $\rho _{\mu \lambda }\in ({-}1,1)$ . This model incorporates the correlation between financial and mortality shocks, which can be relevant in the context of events such as a pandemic like COVID-19. We assume that costs of risk, noted $\boldsymbol{\theta }=\left (\theta _{j}\right )_{j=1,\ldots,4}^{\top }$ , are constant, i.e., the Brownian motions with drift under $\mathbb{P}$ ,

\begin{eqnarray*} dW_{t}^{(j)} & = & d\tilde{W}_{t}^{(j)}+\theta _{j}dt\,, \end{eqnarray*}

are Brownian motions under $\mathbb{Q}$ . To keep identical dynamics under $\mathbb{P}$ and $\mathbb{Q}$ , the risk premium parameters, stored in a vector $\boldsymbol{\theta }=\left (\theta _{1},\theta _{2},\theta _{3},\theta _{4}\right )^{\top }$ , are such that

\begin{eqnarray*} \nu _{S} & = & \epsilon _{SS}\theta _{1}+\epsilon _{Sr}\theta _{2}+\epsilon _{S\mu }\theta _{3}+\epsilon _{S\lambda }\theta _{4}\,,\\ \nu _{r} & = & -\left (\epsilon _{rr}\theta _{2}+\epsilon _{r\mu }\theta _{3}+\epsilon _{r\lambda }\theta _{4}\right )\,,\\ \nu _{\mu } & = & -\left (\epsilon _{\mu \mu }\theta _{3}+\epsilon _{\mu \lambda }\theta _{4}\right )\,,\\ \nu _{\lambda } & = & -\theta _{4}\,. \end{eqnarray*}

Under the risk neutral measure $\mathbb{Q}$ , financial and biometric processes are ruled by the following SDEs:

(2) \begin{eqnarray} d\left (\begin{array}{c} S_{t}\\ r_{t}\\ \mu _{x_{\mu }+t}\\ \lambda _{x_{\lambda }+t} \end{array}\right ) & = & \left (\begin{array}{c} r_{t}\,S_{t}\\ \kappa _{r}\left (\gamma _{r}(t)-r_{t}\right )\\ \kappa _{\mu }\left (\gamma _{\mu }(t)-\mu _{x_{\mu }+t}\right )\\ \kappa _{\lambda }\left (\gamma _{\lambda }(t)-\lambda _{x_{\lambda }+t}\right ) \end{array}\right )dt+\left (\begin{array}{c} S_{t}\sigma _{S}\Sigma _{S}^{\top }\\[6pt] \sigma _{r}\Sigma _{r}^{\top }\\[6pt] \sigma _{\mu }(t)\Sigma _{\mu }^{\top }\\[6pt] \sigma _{\lambda }(t)\Sigma _{\lambda }^{\top } \end{array}\right )d\boldsymbol{W}_{t}\,. \end{eqnarray}

We will use later these equations to build the valuation equation fulfilled by the GMAB. As mentioned earlier, we have chosen a pure Brownian model specifically for the purpose of benchmarking exact GMAB prices against PINN predictions. The following two sections present intermediate analytical results that are essential for deriving a closed-form expression for the GMAB.

3. Bonds and survival probabilities

We denote by $P(t,T)$ the present value of a zero-coupon bond that pays one monetary unit at expiry date, $T$ . The survival probabilities of an $x_{z}+t$ year old individual up to time $t$ are noted $\,_{T}p_{x_{z}+t}^{z}$ , for $z\in \{\mu,\lambda \}$ . Pure endowments provide a lump sum payment at expiry in case of survival and nothing if the policyholder dies before the term. We denote them by $_{T}E_{t}^{z}$ for $z\in \{\mu,\lambda \}$ . Zero-coupon bonds, survival probabilities, and pure endowents are valued by the following expectations under the risk neutral measure:

\begin{equation*} \begin {cases} P(t,T) & =\mathbb {E}^{\mathbb {Q}}\left (e^{-\int _{t}^{T}r_{s}ds}|\mathcal {F}_{t}\right )\,,\\ \,_{T}p_{x_{z}+t}^{z} & =\mathbb {E}^{\mathbb {Q}}\left (e^{-\int _{t}^{T}z_{x_{z}+s}ds}\,|\,\mathcal {F}_{t}\right )\;z\in \{\mu,\lambda \},\\ _{T}E_{t}^{z} & =N_{t}^{\mu }\mathbb {E}^{\mathbb {Q}}\left (e^{-\int _{t}^{T}(r_{s}+z_{x_{z}+s})ds}\,|\,\mathcal {F}_{t}\right )\;z\in \{\mu,\lambda \}. \end {cases} \end{equation*}

The model being affine, we derive the closed-form expressions of these products. In the remainder of this article, we adopt the following notation

\begin{align*} B_{y}(t,T) & =\frac{1-e^{-y(T-t)}}{y}\,, \end{align*}

where $y\in \mathbb{R}^{+}$ is a positive parameter. Various combined integrals of $B_{y}(t,T)$ , $\sigma _{\mu }(t)$ , and $\sigma _{\lambda }(t)$ are provided in Appendix B and serve to analytically evaluate the GMAB. From Proposition 4 in Appendix A, $\gamma _{r}(T)$ matches the initial yield curve of interest rate if

\begin{eqnarray*} \gamma _{r}(T) & = & -\frac{1}{\kappa _{r}}\partial _{T}^{2}\ln P(0,T)-\partial _{T}\ln P(0,T)+\frac{\sigma _{r}^{2}}{2\kappa _{r}^{2}}\left (1-e^{-2\kappa _{r}T}\right )\,, \end{eqnarray*}

where $-\partial _{T}\ln P(0,T)$ is the instantaneous forward rate. In a similar manner, for a given initial mortality curve $\,_{T}p_{x_{z}}^{z}$ , the function $\gamma _{z}(u)$ is such that

\begin{eqnarray*} \gamma _{z}(T) & = & -\frac{1}{\kappa _{z}}\partial _{T}^{2}\ln \,_{T}p_{x_{z}}^{z}-\partial _{T}\ln \,_{T}p_{x_{z}}^{z}+\frac{\alpha _{z}^{2}e^{2\beta _{z}x_{z}}}{2\kappa _{z}(\kappa _{z}+\beta _{z})}\left (e^{2\beta _{z}T}-e^{-2\kappa _{z}T}\right )\,. \end{eqnarray*}

Bond prices, survival probabilities, and endowments admit closed-form expressions presented in the next proposition.

Proposition 1. The price at time $t\leq T$ of a discount bond of maturity $T$ is linked to the initial interest rate curve at time $t=0$ by the relation

(3) \begin{eqnarray} P(t,T) & = & \exp \left (-r_{t}B_{\kappa _{r}}(t,T)-\left (\partial _{t}\ln P(0,t)\right )B_{\kappa _{r}}(t,T)+\ln \frac{P(0,T)}{P(0,t)}\right )\\ & & \times \exp \left (-\frac{\sigma _{r}^{2}}{4\kappa _{r}}\left (\left (1-e^{-2\kappa _{r}t}\right )B_{\kappa _{r}}(t,T)^{2}\right )\right )\,.\nonumber \end{eqnarray}

In a similar manner, we can show that, when alive at age $x_{z}+t$ , the survival probability up to time $T$ depends on the initial survival term structure as follows for $z\in \{\mu,\lambda \}$

(4) \begin{eqnarray} &&\,_{T}p_{x+t}^{z}=\exp \left (-z_{x_{z}+t}B_{\kappa _{z}}(t,T)-\left (\partial _{t}\ln \,_{t}p_{x_{z}}^{z}\right )B_{\kappa _{z}}(t,T)+\ln \frac{\,_{T}p_{x_{z}}^{z}}{\,_{t}p_{x_{z}}^{z}}\right )\times \\ && \quad \exp \left (\frac{\alpha _{z}^{2}e^{2\beta _{z}(x_{z}+T)}}{2\kappa _{z}^{2}}\left (B_{2\beta _{z}}(t,T)-2B_{2\beta _{z}+\kappa _{z}}(t,T)+B_{2\beta _{z}+2\kappa _{z}}(t,T)\right )\right )\times \nonumber \\ && \quad \exp \left (\frac{\alpha _{z}^{2}e^{2\beta _{z}x_{z}}}{2\kappa _{z}(\kappa _{z}+\beta _{z})}\left (e^{2\beta _{z}T}B_{2\beta _{z}+\kappa _{z}}(t,T)-e^{2\beta _{z}T}B_{2\beta _{z}}(t,T)\right )\right )\times \nonumber \\ && \quad \exp \left (\frac{\alpha _{z}^{2}e^{2\beta _{z}x_{z}}}{2\kappa _{z}(\kappa _{z}+\beta _{z})}\left (e^{-2\kappa _{z}t}B_{2\kappa _{z}}(t,T)-e^{-\kappa _{z}(T+t)}B_{\kappa _{z}}(t,T)\right )\right )\,,\nonumber \end{eqnarray}

whereas the pure endowment contracts are valued by

(5) \begin{eqnarray} && _{T}E_{t}^{\mu }=N_{t}^{\mu }\:_{T}p_{x_{\mu }+t}^{\mu }\:P(t,T)\,\exp \left (\frac{\sigma _{r}\Sigma _{r}^{\top }\Sigma _{\mu }\alpha _{\mu }e^{\beta _{\mu }\left (x_{\mu }+T\right )}}{\kappa _{\mu }\kappa _{r}}\right )\\ & & \quad \times \exp \left (B_{\beta _{\mu }}(t,T)-B_{\kappa _{\mu }+\beta _{\mu }}(t,T)-B_{\kappa _{r}+\beta _{\mu }}(t,T)+B_{\kappa _{\mu }+\kappa _{r}+\beta _{\mu }}(t,T)\right )\,.\nonumber \end{eqnarray}

and

(6) \begin{eqnarray} & & _{T}E_{t}^{\lambda }=N_{t}^{\lambda }\:_{T}p_{x_{\lambda }+t}^{\lambda }\:P(t,T)\,\exp \left (\frac{\sigma _{r}\Sigma _{r}^{\top }\Sigma _{\lambda }\alpha _{\lambda }e^{\beta _{\lambda }\left (x_{\lambda }+T\right )}}{\kappa _{\lambda }\kappa _{r}}\right )\\ & & \quad \times \exp \left (B_{\beta _{\lambda }}(t,T)-B_{\kappa _{\lambda }+\beta _{\lambda }}(t,T)-B_{\kappa _{r}+\beta _{\lambda }}(t,T)+B_{\kappa _{\lambda }+\kappa _{r}+\beta _{\lambda }}(t,T)\right )\,.\nonumber \end{eqnarray}

The sketch of the proof is provided in Appendix A. Using standard developments, the bond price dynamic is equal to

(7) \begin{equation} \frac{dP(t,T)}{P(t,T)}=\left (r_{t}+B_{\kappa _{r}}(t,T)\sigma _{r}\nu _{r}\right )\,dt-B_{\kappa _{r}}(t,T)\sigma _{r}\Sigma _{r}^{\top }d\tilde{\boldsymbol{W}}_{t}\,. \end{equation}

We assume that policyholder’s savings is invested in cash, stocks, zero-coupon bonds, and in mortality bonds written on a reference population of age $x_{\lambda }$ at time $t=0$ . We denote by $D(t,T)$ , the value of this mortality linked bond expiring at time $T$ . Its return depends on the benchmark mortality rate, $\lambda _{x_{\lambda }+t}$ , and is valued at by the following expectation:

\begin{eqnarray*} D(t,T) & = & \mathbb{E^{Q}}\left (e^{-\int _{t}^{T}r_{s}ds-\int _{0}^{T}\lambda _{x_{\lambda }+s}ds}|\mathcal{F}_{t}\right )\\ & = & e^{-\int _{0}^{t}\lambda _{x_{\lambda }+s}ds}\mathbb{E^{Q}}\left (e^{-\int _{t}^{T}\left (r_{s}+\lambda _{x_{\lambda }+s}\right )ds}|\mathcal{F}_{t}\right )\,. \end{eqnarray*}

This product is designed in such a way that, in the event of over- or undermortality, the bondholder realizes a capital loss or gain at maturity. The dynamics of this bond are described in the next proposition, which is proven in Appendix A.

Proposition 2. Under the risk neutral measure $\mathbb{Q}$ , the mortality bond earns on average the risk-free rate and its price obeys to the SDE:

(8) \begin{eqnarray} \frac{dD(t,T)}{D(t,T)} & = & r_{t}dt-\left (B_{\kappa _{r}}(t,T)\sigma _{r}\Sigma _{r}^{\top }+B_{\kappa _{\lambda }}(t,T)\sigma _{\lambda }(t)\Sigma _{\lambda }^{\top }\right )d\boldsymbol{W}_{t} \end{eqnarray}

Under $\mathbb{P}$ , the risk premium of the mortality bond is proportional to interest and mortality volatilities:

(9) \begin{eqnarray} \frac{dD(t,T)}{D(t,T)} & = & \left (r_{t}+B_{\kappa _{r}}(t,T)\sigma _{r}\nu _{r}+B_{\kappa _{\lambda }}(t,T)\sigma _{\lambda }(t)\nu _{\lambda }\right )dt\\ & & -\left (B_{\kappa _{r}}(t,T)\sigma _{r}\Sigma _{r}^{\top }+B_{\kappa _{\lambda }}(t,T)\sigma _{\lambda }(t)\Sigma _{\lambda }^{\top }\right )d\tilde{\boldsymbol{W}}_{t}\nonumber \end{eqnarray}

4. Contract and analytical valuation

GMABs are retirement savings products which promises in case of survival, the maximum between investments, and a guaranteed capital. The policyholder savings are invested in cash, stocks, risk-free and mortality bonds of maturity $T$ . We denote by $\left (A_{t}\right )_{t\geq 0}$ the total value of these investments. The percentages of stocks, risk-free and mortality bonds are assumed constant and denoted by $\boldsymbol{\pi }=\left (\pi _{S},\pi _{P},\pi _{D}\right )_{t\geq 0}$ . We assume that the investment policy is self-financed, the dynamic of assets under management is equal to

\begin{eqnarray*} \frac{dA_{t}}{A_{t}} & = & \pi _{S}\frac{dS_{t}}{S_{t}}+\pi _{P}\frac{dP_{t}}{P_{t}}+\pi _{D}\frac{dD_{t}}{D_{t}}+\left (1-\pi _{S}-\pi _{P}-\pi _{D}\right )r_{t}\,dt\,, \end{eqnarray*}

where $dS_{t}$ , $dP(t,T)$ , and $dD(t,T)$ are, respectively, provided in Equations (1), (7), and (9). We can therefore reformulate the differential of $A_{t}$ under $\mathbb{P}$ as follows

\begin{eqnarray*} \frac{dA_{t}}{A_{t}} & = & \left (r_{t}+\boldsymbol{\pi }^{\top }\boldsymbol{\nu }_{A}(t)\right )dt+\boldsymbol{\pi }^{\top }\Sigma _{A}(t)d\tilde{\boldsymbol{W}}_{t} \end{eqnarray*}

where $\boldsymbol{\nu }_{A}(t)$ is a time-varying vector of risk premiums,

\begin{eqnarray*} \boldsymbol{\nu }_{A}(t) & = & \left (\begin{array}{c} \sigma _{S}\nu _{S}\\ B_{\kappa _{r}}(t,T)\sigma _{r}\nu _{r}\\ B_{\kappa _{r}}(t,T)\sigma _{r}\nu _{r}+B_{\kappa _{\lambda }}(t,T)\sigma _{\lambda }(t)\nu _{\lambda } \end{array}\right )\,, \end{eqnarray*}

and $\Sigma _{A}(t)$ is a volatility $3\times 4$ -matrix :

\begin{equation*} \Sigma _{A}(t)=\left (\begin {array}{c} \sigma _{S}\Sigma _{S}^{\top }\\ -B_{\kappa _{r}}(t,T)\sigma _{r}\Sigma _{r}^{\top }\\ -B_{\kappa _{r}}(t,T)\sigma _{r}\Sigma _{r}^{\top }-B_{\kappa _{\lambda }}(t,T)\sigma _{\lambda }(t)\Sigma _{\lambda }^{\top } \end {array}\right )\,. \end{equation*}

Under the risk neutral measure, investments earn on average the risk-free rate and the differential of $A_{t}$ is ruled by

\begin{eqnarray*} \frac{dA_{t}}{A_{t}} & = & r_{t}dt+\boldsymbol{\pi }^{\top }\Sigma _{A}(t)d\boldsymbol{W}_{t}. \end{eqnarray*}

The contract is subscribed by an individual of age $x_{\mu }$ and guarantees at the expiry date, denoted by $T^{*}$ , a payout equal to the maximum between a guaranteed capital $G_{m}$ and the portfolio $A_{T^{*}}$ , in the event of survival. However, the benefit is eventually bounded by $G_{M}$ . In practice, $G_{m}$ is often set to the initial investment, $A_{0}$ , or to a percentage of $A_{0}$ . Whereas, the participation is most of the time unbounded ( $G_{M}=+\infty$ ). We denote by $\tau _{\mu }\in \mathbb{R}^{+}$ , the random time of insured’s death and $N_{t}^{\mu }=\textbf{1}_{\{\tau _{\mu }\geq t\}}$ . The payoff in case of survival is

(10) \begin{eqnarray} H(A_{T^{*}},G_{m}) & = & \left (G_{T^{*}}+\left (A_{T^{*}}-G_{m}\right )_{+}-\left (A_{T^{*}}-G_{M}\right )_{+}\right ) \end{eqnarray}

The fair value of such a policy, denoted by $L_{t}$ , is equal to the expected discounted cash flows under the chosen risk neutral measure, noted $\mathbb{Q}$ . This is a function of all state variables:

(11) \begin{eqnarray} & & L_{t}=N_{t}^{\mu }\mathbb{E}^{\mathbb{Q}}\left (e^{-\int _{t}^{T^{*}}(r_{s}+\mu _{x_{\mu }+s})ds}G_{m}\,|\,\mathcal{F}_{t}\right )\\ & & \quad \,+N_{t}^{\mu }\mathbb{E}^{\mathbb{Q}}\left (e^{-\int _{t}^{T^{*}}(r_{s}+\mu _{x_{\mu }+s})ds}\left (\left (A_{T^{*}}-G_{m}\right )_{+}\right )\,|\,\mathcal{F}_{t}\right )\nonumber \\ & & \quad \,-N_{t}^{\mu }\mathbb{E}^{\mathbb{Q}}\left (e^{-\int _{t}^{T^{*}}(r_{s}+\mu _{x_{\mu }+s})ds}\left (\left (A_{T^{*}}-G_{M}\right )_{+}\right )\,|\,\mathcal{F}_{t}\right ).\nonumber \end{eqnarray}

In order to obtain a closed form expression of the saving contract (10), we perform a change of measure using as Radon–Nykodym derivative:

(12) \begin{eqnarray} \left .\frac{d\mathbb{F}}{d\mathbb{Q}}\right |_{T^{*}} & =\mathbb{E}^{\mathbb{Q}}\left (\frac{d\mathbb{F}}{d\mathbb{Q}}|\mathcal{F}_{T^{*}}\right )= & \frac{e^{-\int _{0}^{T^{*}}(r_{s}+\mu _{x_{\mu }+s})ds}}{\mathbb{E}^{\mathbb{Q}}\left (e^{-\int _{0}^{T^{*}}(r_{s}+\mu _{x_{\mu }+s})ds}|\mathcal{F}_{0}\right )}\,. \end{eqnarray}

From Equations (A6), this change of measure is rewritten as follows:

\begin{eqnarray*} & & \left .\frac{d\mathbb{F}}{d\mathbb{Q}}\right |_{T^{*}}=\exp \left (-\frac{\sigma _{r}^{2}\epsilon _{rr}^{2}}{2}\int _{0}^{T^{*}}B_{\kappa _{r}}(u,T^{*})^{2}du-\sigma _{r}\epsilon _{rr}\int _{0}^{T^{*}}B_{\kappa _{r}}(u,T^{*})\,dW_{u}^{(2)}\right )\\ & & \quad \times{{\exp \left (-\int _{0}^{T^{*}}\left (\sigma _{r}\epsilon _{r\mu }B_{\kappa _{r}}(u,T^{*})+\sigma _{\mu }(u)\epsilon _{\mu \mu }B_{\kappa _{\mu }}(u,T^{*})\right )dW_{u}^{(3)}\right )}}\\ & & \quad \times \exp \left (-\frac{1}{2}\int _{0}^{T^{*}}\left (\sigma _{r}\epsilon _{r\mu }B_{\kappa _{r}}(u,T^{*})+\sigma _{\mu }(u)\epsilon _{\mu \mu }B_{\kappa _{\mu }}(u,T^{*})\right )^{2}du\right )\\ & & \quad \times{{\exp \left (-\int _{0}^{T^{*}}\left (\sigma _{r}\epsilon _{r\lambda }B_{\kappa _{r}}(u,T^{*})+\sigma _{\mu }(u)\epsilon _{\mu \lambda }B_{\kappa _{\mu }}(u,T^{*})+\sigma _{\lambda }(u)B_{\kappa _{\lambda }}(u,T^{*})\right )dW_{u}^{(4)}\right )}}\\ & & \quad \times{{\exp \left (-\frac{1}{2}\int _{0}^{T^{*}}\left (\sigma _{r}\epsilon _{r\lambda }B_{\kappa _{r}}(u,T^{*})+\sigma _{\mu }(u)\epsilon _{\mu \lambda }B_{\kappa _{\mu }}(u,T^{*})+\sigma _{\lambda }(u)B_{\kappa _{\lambda }}(u,T^{*})\right )^{2}du\right )}} \end{eqnarray*}

We recognize a Doleans–Dade exponential and then under the measure $\mathbb{F}.$ We denote by $\boldsymbol{W}_{t}^{\mathbb{F}}$ the vector of $\mathbb{F}$ -Brownian motions $\left (W_{t}^{(1)\mathbb{F}},W_{t}^{(2)\mathbb{F}},W_{t}^{(3)\mathbb{F}},W_{t}^{(4)\mathbb{F}}\right )$ , defined by

(13) \begin{eqnarray} d\boldsymbol{W}_{t}^{\mathbb{F}} & = & d\boldsymbol{W}_{t}+\left (\sigma _{r}B_{\kappa _{r}}(t,T^{*})\Sigma _{r}+\sigma _{\mu }(t)B_{\kappa _{\mu }}(t,T^{*})\Sigma _{\mu }+\sigma _{\lambda }(t)B_{\kappa _{\lambda }}(t,T^{*})\Sigma _{\lambda }\right )dt\,. \end{eqnarray}

The dynamic of the total asset under the $\mathbb{F}$ - measure is then equal to

(14) \begin{align} & \frac{dA_{t}}{A_{t}}=r_{t}dt+\boldsymbol{\pi }^{\top }\Sigma _{A}(t)d\boldsymbol{W}_{t}^{\mathbb{F}}-\boldsymbol{\pi }^{\top }\Sigma _{A}(t)\times \\ & \left (\sigma _{r}B_{\kappa _{r}}(t,T^{*})\Sigma _{r}+\sigma _{\mu }(t)B_{\kappa _{\mu }}(t,T^{*})\Sigma _{\mu }+\sigma _{\lambda }(t)B_{\kappa _{\lambda }}(t,T^{*})\Sigma _{\lambda }\right )dt\nonumber \end{align}

As $A_{T^{*}}=\frac{A_{T^{*}}}{P(T^{*},T^{*})}$ , we focus on the dynamics of $d\frac{A_{t}}{P(t,T^{*})}$ in order to prove that this ratio is a log-normal process.

Proposition 3. The quantity $\frac{A_{t}}{P(t,T^{*})}$ has a log-normal distribution, $\ln \frac{A_{T^{*}}/P(T^{*},T^{*})}{A_{t}/P(t,T^{*})}$ $N\left (\mu _{\mathbb{F}}(t),v_{\mathbb{F}}(t)\right )$ , with a time dependent variance equal to

(15) \begin{eqnarray} & & v_{\mathbb{F}}(t)=\int _{t}^{T^{*}}\left (B_{\kappa _{r}}(u,T^{*})\sigma _{r}\Sigma _{r}^{\top }+\boldsymbol{\pi }^{\top }\Sigma _{A}(u)\right )\left (B_{\kappa _{r}}(u,T^{*})\sigma _{r}\Sigma _{r}+\Sigma _{A}(u)^{\top }\boldsymbol{\pi }\right )du\,, \end{eqnarray}

and a time dependent mean given by

(16) \begin{eqnarray} & & \mu _{\mathbb{F}}(t)=-\frac{1}{2}v_{\mathbb{F}}(t)-\sigma _{r}\Sigma _{r}^{\top }\Sigma _{\mu }\int _{t}^{T^{*}}\sigma _{\mu }(u)B_{\kappa _{r}}(u,T^{*})B_{\kappa _{\mu }}(u,T^{*})\,du\\ & & \qquad -\sigma _{r}\Sigma _{r}^{\top }\Sigma _{\lambda }\int _{t}^{T^{*}}\sigma _{\lambda }(u)B_{\kappa _{r}}(u,T^{*})B_{\kappa _{\lambda }}(u,T^{*})du\nonumber \\ & & \qquad -\boldsymbol{\pi }^{\top }\int _{t}^{T^{*}}\Sigma _{A}(u)\sigma _{\mu }(u)B_{\kappa _{\mu }}(u,T^{*})du\,\Sigma _{\mu }\nonumber \\ & & \qquad -\boldsymbol{\pi }^{\top }\int _{t}^{T^{*}}\Sigma _{A}(u)\sigma _{\lambda }(u)B_{\kappa _{\lambda }}(u,T^{*})du\,\Sigma _{\lambda }\,.\nonumber \end{eqnarray}

The proof is provided in Appendix A, while the full analytical expressions of $v_{\mathbb{F}}(t)$ and $\mu _{\mathbb{F}}(t)$ are detailed in Appendix C. As $\frac{A_{t}}{P(t,T^{*})}$ is log-normal under the forward measure, we can evaluate the call options embedded in the GMAB.

Corollary 1. Let us denote by $\Phi (.)$ is the cumulative distribution function (cdf) of a $N(0,1)$ . If we define

(17) \begin{equation} \begin{cases} d_{2}(t)=\frac{\ln \left (\frac{G_{T^{*}}}{A_{t}/P(t,T^{*})}\right )-\mu _{\mathbb{F}}(t)}{\sqrt{v_{\mathbb{F}}(t)}} & \,,\,d_{1}(t)=d_{2}(t)-\sqrt{v_{\mathbb{F}}(t)}\negmedspace,\\ d_{2}^{M}(t)=\frac{\ln \left (\frac{G_{M}}{A_{t}/P(t,T^{*})}\right )-\mu _{\mathbb{F}}(t)}{\sqrt{v_{\mathbb{F}}(t)}} & \,,\,d_{1}^{M}(t)=d_{2}^{M}(t)-\sqrt{v_{\mathbb{F}}(t)}\negmedspace \end{cases} \end{equation}

then expected positive differences between $A_{T^{*}}$ , $G_{T^{*}}$ , and $G_{M}$ under the forward measure are given by

\begin{equation*} \begin {array}{c} \mathbb {E}^{\mathbb {F}}\left (\left (A_{T^{*}}-G_{m}\right )_{+}|\mathcal {F}_{t}\right )=\frac {A_{t}}{P(t,T^{*})}e^{\mu _{\mathbb {F}}(t)+\frac {v_{\mathbb {F}}(t)}{2}}\Phi \left (-d_{1}(t)\right )-G_{m}\Phi \left (-d_{2}(t)\right )\,,\\ \mathbb {E}^{\mathbb {F}}\left (\left (A_{T^{*}}-G_{M}\right )_{+}|\mathcal {F}_{t}\right )=\frac {A_{t}}{P(t,T^{*})}e^{\mu _{\mathbb {F}}(t)+\frac {v_{\mathbb {F}}(t)}{2}}\Phi \left (-d_{1}^{M}(t)\right )-G_{M}\Phi \left (-d_{2}^{M}(t)\right )\,. \end {array} \end{equation*}

This corollary is proven using standard Black & Scholes developments. This last result allows us to infer that the market value of the GMAB (11) is given by

(18) \begin{eqnarray} & & L_{t}=\,_{T^{*}}E_{t}^{\mu }\,G_{m}\\ & & +\,_{T^{*}}E_{t}^{\mu }\left [\frac{A_{t}e^{\mu _{\mathbb{F}}(t)+\frac{v_{\mathbb{F}}(t)}{2}}}{P(t,T^{*})}\Phi \left (-d_{1}(t)\right )-G_{m}\,\Phi \left (-d_{2}(t)\right )\right ]\nonumber \\ & & -\,_{T^{*}}E_{t}^{\mu }\left [\frac{A_{t}e^{\mu _{\mathbb{F}}(t)+\frac{v_{\mathbb{F}}(t)}{2}}}{P(t,T^{*})}\Phi \left (-d_{1}^{M}(t)\right )-G_{M}\Phi \left (-d_{2}^{M}(t)\right )\right ]\,,\nonumber \end{eqnarray}

where $\,_{T^{*}}E_{t}^{\mu }$ is given by Equation (5). This closed form is used to measure the accuracy of the PINN for predicting GMAB prices.

5. Scaled Feynman–Kac equation

A PINN integrates principles from physics, including PDEs governing the behavior of state variables. In a financial context, we adopt the Feynman–Kac (FK) equation as our guiding principle. The FK equation is a PDE satisfied by all assets traded in an arbitrage-free market. In this section, we construct this equation for the model under consideration in this article. In the following section, we utilize a neural network to solve it for various investment strategies, contract maturities, and bond maturities. To lighten future developments, we denote the vector of state variables, augmented with the total asset value, by

\begin{equation*} \boldsymbol{y}_{t}=\left (S_{t},r_{t},\mu _{x_{\mu }+t},\lambda _{x_{\lambda }+t},A_{t}\right )^{\top }\,. \end{equation*}

Under the risk neutral measure $\mathbb{Q}$ , this multivariate process is ruled by the following SDE:

\begin{eqnarray*} \text{d}\boldsymbol{y}_{t} & = & \boldsymbol{\mu} _{\boldsymbol{y}}(t,\boldsymbol{y}_{t})dt+\Sigma _{\boldsymbol{y}}(t,\boldsymbol{y}_{t})d\boldsymbol{W}_{t}\,, \end{eqnarray*}

where $\boldsymbol{\mu} _{\boldsymbol{y}}(.)$ is a vector of dimension 5 and $\Sigma _{\boldsymbol{y}}(.)$ is a $5\times 4$ matrix:

\begin{equation*} \boldsymbol{\mu}_{\tilde{\boldsymbol{y}}}(t,\boldsymbol{y}_{t})=\left (\begin {array}{c} \begin {array}{c} \begin {array}{c} r_{t}\,S_{t}\\ \kappa _{r}\left (\gamma _{r}(t)-r_{t}\right )\\ \kappa _{\mu }\left (\gamma _{\mu }(t)-\mu _{x_{\mu }+t}\right )\\ \kappa _{\lambda }\left (\gamma _{\lambda }(t)-\lambda _{x_{\lambda }+t}\right )\\ r_{t}A_{t} \end {array}\end {array}\end {array}\right )\;,\;\Sigma _{\boldsymbol{y}}(t,\boldsymbol{y}_{t})=\left (\begin {array}{c} S_{t}\sigma _{S}\Sigma _{S}^{\top }\\[5pt] \sigma _{r}\Sigma _{r}^{\top }\\[5pt] \sigma _{\mu }(t)\Sigma _{\mu }^{\top }\\[5pt] \sigma _{\lambda }(t)\Sigma _{\lambda }^{\top }\\[5pt] A_{t}\boldsymbol {\pi }^{\top }\Sigma _{A}(t) \end {array}\right )\,. \end{equation*}

The GMAB is a function of time and state variables parameterized by the investment policy $\boldsymbol{\pi }$ , the duration of the contract $T^{*}$ , and the maturity of bonds, $T$ . As we aim to understand the relation between assets and liabilities, we emphasize the dependence to investment parameters by denoting the contract price as follows:

\begin{equation*} L_{t}=V\left (t,\boldsymbol{y}_{t},N_{t}^{\mu }\,|\,\boldsymbol {\pi },T^{*},T\right )\,. \end{equation*}

Under the assumption of arbitrage-free market, all traded securities, including the insurance contract, earn on average the risk-free rate, i.e. :

(19) \begin{align} \mathbb{E}\left (dL_{t+}\,|\,\mathcal{F}_{t}\right ) & =L_{t}\,r_{t}\,dt\,. \end{align}

Let us, respectively, denote the gradient and the Hessian of $V(.)$ with respect to $\boldsymbol{y}$ by $\nabla _{\boldsymbol{y}}V$ and $\mathcal{H}_{\boldsymbol{y}}(V)$ :

\begin{eqnarray*} \nabla _{\boldsymbol{y}}V=\left (\begin{array}{c} \partial _{S}V\\ \partial _{r}V\\ \partial _{\mu }V\\ \partial _{\lambda }V\\ \partial _{A}V \end{array}\right ) & \:,\: & \mathcal{H}_{\boldsymbol{y}}(V)=\left (\begin{array}{ccccc} \partial _{SS}V & \partial _{Sr}V & \partial _{S\mu }V & \partial _{S\lambda }V & \partial _{SA}V\\ \partial _{Sr}V & \partial _{rr}V & \partial _{r\mu }V & \partial _{r\lambda }V & \partial _{rA}V\\ \partial _{S\mu }V & \partial _{r\mu }V & \partial _{\mu \mu }V & \partial _{\mu \lambda }V & \partial _{\mu A}V\\ \partial _{S\lambda }V & \partial _{r\lambda }V & \partial _{\mu \lambda }V & \partial _{\lambda \lambda }V & \partial _{\lambda A}V\\ \partial _{SA}V & \partial _{rA}V & \partial _{\mu A}V & \partial _{\lambda A}V & \partial _{AA}V \end{array}\right )\,. \end{eqnarray*}

Applying the Itô’s lemma to $V\left (.\right )$ allows us to rewrite (19) as a partial differential valuation equation:

(20) \begin{align} 0= & \partial _{t}V-\left (r_{t}+\mu _{x+t}\right )\,V+\boldsymbol{\mu} _{\boldsymbol{y}}(t,\boldsymbol{y}_{t})^{\top }\nabla _{\boldsymbol{y}}V\\ & +\frac{1}{2}\text{tr}\left (\Sigma _{\boldsymbol{y}}(t,\boldsymbol{y}_{t})\Sigma _{\boldsymbol{y}}(t,\boldsymbol{y}_{t})^{\top }\mathcal{H}_{\boldsymbol{y}}(V)\right )\,.\nonumber \end{align}

This last expression is called the “Feynman–Kac (FK) equation”. The boundary constraints on $V(.)$ are

(21) \begin{equation} \begin{cases} V(T^{*},\boldsymbol{y}_{T^{*}},1\,|\,\boldsymbol{\pi },T^{*},T) & =H\left (A_{T^{*}},G_{m},G_{M}\right )\,,\\ V(t,\boldsymbol{y}_{t},0\,|\,\boldsymbol{\pi },T^{*},T) & =0\,. \end{cases} \end{equation}

Notice that we have 5 state variables and 4 Brownian motions. The model is, therefore, over-specified, and we can remove one state variable from $\boldsymbol{y}$ , for example, the stock price, which is redundant with respect to the total asset value, $A_{t}$ . Nonetheless, we have retained all state variables in the numerical illustration to observe how the PINN handles this naive over-specification.

Due to the numerous risk factors involved, solving numerically the PDE (20) is numerically challenging. Implementing either an explicit or implicit finite difference method necessitates constructing a 6-dimensional grid in the spacetime of time and state variables, with relatively fine meshes to ensure convergence. In practice, this becomes intractable. As an alternative, we approximate the solution to the FK Equation (20) using a neural network that takes as input the time, state variables, asset mix, and durations of GMAB and bonds. The mathematical definition of a neural network will be revisited in the following section, but it is worth noting that a neural network employs bounded activation functions, such as sigmoid or hyperbolic tangents. To mitigate convergence issues associated with the vanishing gradient problem, we must standardize and scale the network’s inputs. This preprocessing step slightly modifies the FK equation. Let us define the vectors $\boldsymbol{a},\boldsymbol{b}\in \mathbb{R}^{5}$ as follows

(22) \begin{eqnarray} \boldsymbol{a} & = & \left (a_{S},a_{r},a_{\mu },a_{\lambda },a_{A}\right )^{\top }\,,\\ \boldsymbol{b} & = & \left (b_{S},b_{r},b_{\mu },b_{\lambda },b_{A}\right )^{\top }\,.\nonumber \end{eqnarray}

$\boldsymbol{a}$ and $\boldsymbol{b}$ are chosen to normalize a random sample of state variables. This point is discussed in the next section. The vector of centered and scaled state variables is denoted by

(23) \begin{eqnarray} \tilde{\boldsymbol{y}_{t}} & = & \boldsymbol{a}+\boldsymbol{b}\odot \boldsymbol{y}_{t}\,, \end{eqnarray}

where $\odot$ is the elementwise product. In a similar manner, we center and scale the time with coefficients $a_{h}$ and $b_{h}$ that will depends on the horizon of valuation. The centered scaled time and durations are defined as follows:

(24) \begin{equation} \begin{cases} \tilde{t} & =a_{h}+b_{h}t\,,\\ \tilde{T} & =a_{h}+b_{h}T\,,\\ \tilde{T}^{*} & =a_{h}+b_{h}T^{*}. \end{cases} \end{equation}

We do not need to scale the vector $\boldsymbol{\pi }\in [0,1]^{3}$ . The normalized state variables are ruled by the SDE:

\begin{eqnarray*} d{{\tilde{\boldsymbol{y}}_{t}}} & = & \boldsymbol{\mu} _{\tilde{\boldsymbol{y}}}(t,{\tilde{\boldsymbol{y}}}_{t})dt+\Sigma _{{\tilde{\boldsymbol{y}}}}(t,{\tilde{\boldsymbol{y}}}_{t})d\boldsymbol{W}_{t}\,, \end{eqnarray*}

where $\mu _{\tilde{y}}(.)$ is a $5\times$ vector and $\Sigma _{{\tilde{\boldsymbol{y}}}}(.)$ is a 5 $\times$ 4 matrix:

\begin{equation*} \boldsymbol{\mu} _{\tilde{\boldsymbol{y}}}(t,\tilde{\boldsymbol{y}}_{t})=\left (\begin{array}{c} \left (\tilde {S}_{t}-a_{S}\right )\frac {\tilde {r}_{t}-a_{r}}{b_{r}}\\ \kappa _{r}\left (b_{r}\gamma _{r}(t)+a_{r}-\tilde {r}_{t}\right )\\ \kappa _{\mu }\left (b_{\mu }\gamma _{\mu }(t)+a_{\mu }-\tilde {\mu }_{x_{\mu }+t}\right )\\ \kappa _{\lambda }\left (b_{\lambda }\gamma _{\lambda }(t)+a_{\lambda }-\tilde {\lambda }_{x_{\lambda }+t}\right )\\ \left (\tilde {A}_{t}-a_{A}\right )\frac {\tilde {r}_{t}-a_{r}}{b_{r}} \end{array}\right )\;,\;\Sigma _{\tilde{\boldsymbol{y}}}(t,\tilde{\boldsymbol{y}}_{t})=\left(\begin{array}{c} \left (\tilde {S}_{t}-a_{S}\right )\sigma _{S}\Sigma _{S}^{\top }\\ b_{r}\sigma _{r}\Sigma _{r}^{\top }\\ b_{\mu }\sigma _{\mu }(t)\Sigma _{\mu }^{\top }\\ b_{\lambda }\sigma _{\lambda }(t)\Sigma _{\lambda }^{\top }\\ \left (\tilde {A}_{t}-a_{A}\right )\boldsymbol {\pi }_{t}^{\top }\Sigma _{A}(t) \end{array}\right)\,. \end{equation*}

The value of the contract may be seen as a function of time and rescaled state variables, $L_{t}=V(\tilde{t},\tilde{\boldsymbol{y}}_{t},N_{t}^{\mu },|\,\boldsymbol{\pi },\tilde{T}^{*},\tilde{T})$ . The valuation Equation (20) is then rewritten in rescaled form:

(25) \begin{align} 0= & b_{h}\,\partial _{\tilde{t}}V-\left (\frac{\tilde{r}_{t}-a_{r}}{b_{r}}+\frac{\tilde{\mu }_{x+t}-a_{\mu }}{b_{\mu }}\right )\,V+\boldsymbol{\mu} _{\tilde{\boldsymbol{y}}}(t,\tilde{\boldsymbol{y}}_{t})^{\top }\nabla _{{\tilde{\boldsymbol{y}}}}V\\ & +\frac{1}{2}\text{tr}\left (\Sigma _{{\tilde{\boldsymbol{y}}}}(t,{\tilde{\boldsymbol{y}}}_{t})\Sigma _{{\tilde{\boldsymbol{y}}}}(t,{\tilde{\boldsymbol{y}}}_{t})^{\top }\mathcal{H}_{{\tilde{\boldsymbol{y}}}}(V)\right )\,,\nonumber \end{align}

where $\nabla _{{\tilde{\boldsymbol{y}}}}V$ and $\mathcal{H}_{{\tilde{\boldsymbol{y}}}}(V)$ are, respectively, the gradient and the Hessian of $V$ with respect to standardized state variables, ${\tilde{\boldsymbol{y}}}$ . The rescaled version of boundary conditions (21) is

(26) \begin{equation} \begin{cases} V\left (\tilde{T}^{*},{\tilde{\boldsymbol{y}}}_{T^{*}},1\,|\,\boldsymbol{\pi },\tilde{T}^{*},\tilde{T}\right ) & =H\left (\left (\tilde{A}_{T^{*}}-a_{A}\right )/b_{A},G_{T^{*}}\right )\,,\\ V\left (\tilde{t},{\tilde{\boldsymbol{y}}}_{t},0\,|\,\boldsymbol{\pi },\tilde{T}^{*},\tilde{T}\right ) & =0\,. \end{cases} \end{equation}

The next section explains how to solve Equation (25) with a neural network. Notice that the stock and total asset values exhibit geometric dynamics. An alternative formulation of the FK equation may be derived by considering log-stock and log-asset prices as state variables. This has the advantage of partially reducing the nonlinearity of the FK PDE. However, in the existing literature, such a transformation is not applied and does not adversely affect the accuracy of PINNs. For instance, Sirignano & Spiliopoulos (Reference Sirignano and Spiliopoulos2018), Glau & Wunderlich (Reference Glau and Wunderlich2022), or Gatta et al. (Reference Gatta, Di Cola, Giampaolo and Piccialli F.Cuomo2023) evaluate European or American basket options by solving the Feynman–Kac equation in a multivariate geometric market without prior log-transform.

In practice, a log-transform is deemed unnecessary because the partial derivatives of the approximated GMAB prices are computed precisely through automatic differentiation, not finite differences. This ensures the robustness of the valuation of the Feynman–Kac equation.

Figure 1 Feed forward network with skip connections toward intermediate layers.

6. Neural networks

We approximate the value function solving Equation (25) using a particular type of neural network. This network takes as input a vector of dimension 11, denoted by ${\boldsymbol{z}}$ , containing the scaled time and state variables, the asset mix vector, and scaled contract and bond maturities:

(27) \begin{eqnarray} {\boldsymbol{z}} & :\!= & \left (\tilde{t},{\tilde{\boldsymbol{y}}}_{t},\boldsymbol{\pi },\tilde{T}^{*},\tilde{T}\right )^{\top }\,. \end{eqnarray}

There is no systematic procedure for determining the optimal structure of a neural network. As proven by Hornik (Reference Hornik1991), we know that single-layer neural networks can approximate regular functions arbitrarily well, but achieving reasonable accuracy may require a high number of neurons. An alternative approach is provided by feed-forward networks, where information flows forward through several neural layers. However, these networks often struggle to replicate non-linear functions with a limited number of layers.

To address this challenge, we adopt a network with an architecture illustrated in Fig. 1 This network is a basic version of recurrent neural networks. During the valuation of the network response, the outputs of neurons in hidden layers, added to the initial data vector, serve as inputs for the immediate next layer. The long short-term memory (LSTM) networks are built on the same principle and are powerful tools for time-series prediction (see, e.g., Denuit et al., Reference Denuit, Hainaut and Trufin2019, Chapter 8). The deep Galerkin method (DGM) network, initially proposed by Sirignano & Spiliopoulos (Reference Sirignano and Spiliopoulos2018) for PINN pricing of basket options, belongs to the family of LSTM models. Recurrent neural networks better replicate non-linear functions than classical feed-forward networks. The same network as the one proposed in this article is successfully applied in Hainaut & Casas (Reference Hainaut and Casas2024) for pricing European options in the Heston model.

The network used in this work presents the same drawbacks as a classical recurrent architecture. A model with too many layers becomes untrainable due to the phenomenon of the “vanishing gradient,” as revealed by Hochreiter et al. (Reference Hochreiter, Bengio, Frasconi and Schmidhuber2001). The lagged responses from hidden neural layers pass through several intermediate activation functions, having an indirect feedback effect on the final response of the network. This effect may become marginal when considering too many intermediate layers, rendering the network untrainable.

Definition Let $l$ , $n_{0}$ , $n_{1}$ , …, $n_{l}\in \mathbb{N}$ be, respectively, the number of layers and neurons in each layer. $n_{0}$ is also the size of input vector. The activation function of layer $k=1,2,\ldots,l$ is noted $\phi _{k}(.)\,{:}\,\mathbb{R}\to \mathbb{R}$ . Let $C_{1}\in \mathbb{R}^{n_{1}}\times \mathbb{R}^{n_{0}}$ , ${\boldsymbol{c}}_{1}\in \mathbb{R}^{n_{1}}$ , $C_{k}\in \mathbb{R}^{n_{k}}\times \mathbb{R}^{n_{0}+n_{k-1}}$ , ${\boldsymbol{c}}_{k}\in \mathbb{R}^{n_{k}}$ for $k=2,\ldots,l-1$ , $C_{l}\in \mathbb{R}^{n_{l}}\times \mathbb{R}^{n_{l-1}}$ , ${\boldsymbol{c}}_{l}\in \mathbb{R}^{n_{l}}$ be neural weights defining the input, intermediate, and output layers. We define the following functions

\begin{equation*} \begin {cases} \psi _{k}({\boldsymbol{x}})=\phi _{k}\left (C_{k}{\boldsymbol{x}}+{\boldsymbol{c}}_{k}\right )\,, & k=1,l\\ \psi _{k}({\boldsymbol{x}},{\boldsymbol{z}})=\phi _{k}\left (C_{k}\left (\begin {array}{c} {\boldsymbol{x}}\\ {\boldsymbol{z}} \end {array}\right )+{\boldsymbol{c}}_{k}\right )\,, & k=2,\ldots,l-1 \end {cases} \end{equation*}

where activation functions $\phi _{k}(.)$ are applied componentwise. The residual neural network is a function $F\,{:}\,\mathbb{R}^{n_{0}}\to \mathbb{R}^{n_{l}}$ defined by

\begin{eqnarray*} F({\boldsymbol{z}}) & = & \psi _{l-1}\circ \psi _{l-1}\left (.,{\boldsymbol{z}}\right )\ldots \circ \psi _{2}\left (.,{\boldsymbol{z}}\right )\circ \psi _{1}({\boldsymbol{z}})\,. \end{eqnarray*}

After having chosen a network architecture, the model is trained by minimizing a loss function that is proportional to errors of approximation. This error is measured by replacing $V(.)$ with $F(.)$ in the scaled FK Equation (25), at random points in the domain.

At time $t\in [0,T^{*}]$ , the domain of the state vector $\boldsymbol{y}_{t}=\left (S_{t},r_{t},\mu _{x_{\mu }+t},\lambda _{x_{\mu }+t},A_{t}\right )^{\top }$ is $\,\mathbb{R}^{+}\times \mathbb{R}\times \mathbb{R}^{+\,3}$ . We approximate this domain by a closed convex subspace:

\begin{align*} \mathcal{D}^{\boldsymbol{y}}\, & \,{:}\,[S_{l},S_{u}]\times [r_{l},r_{u}]\times [\mu _{l},\mu _{u}]\times [\lambda _{l},\lambda _{u}]\times [A_{l},A_{u}]\,. \end{align*}

In order to fit the neural network, we draw a sample of $n_{\mathcal{D}}$ realizations of $\boldsymbol{y}_{t}$ in $\mathcal{D}^{\boldsymbol{y}}$ , at random times. We also sample parameters $\left (\boldsymbol{\pi }_{j},T_{j}^{*},T_{j}\right )_{j=1,\ldots,,n_{\mathcal{D}}}$ under the constraints:

\begin{equation*} \begin {cases} \boldsymbol {\pi }_{j}\in [0,1]\times [0,1]\times [0,1]\,,\\ T_{j}^{*}\in [0,T_{max}^{*}]\,,\,T_{j}\in [0,T_{max}]\\ t_{j}\leq T_{j}^{*}\leq T_{j}\,. \end {cases} \end{equation*}

The set of sampled state variables and parameters is noted $\mathcal{S}_{\mathcal{D}}=\left (t_{j},\boldsymbol{y}_{j},\pi _{j},T_{j}^{*},T_{j}\right )_{j=1,\ldots,n_{\mathcal{D}}}$ . We next center and scale state variables and times. The mean and standard deviation of sampled state variables are computed with the standard formulas:

\begin{equation*} \bar {\boldsymbol{y}}=\frac {1}{n_{\mathcal {D}}}\sum _{j=1}^{n_{\mathcal {D}}}\boldsymbol{y}_{j}\quad,\quad {\boldsymbol{S}}_{y}=\sqrt {\frac {1}{n_{\mathcal {D}}-1}\sum _{j=1}^{n_{\mathcal {D}}}\left (\boldsymbol{y}_{j}-\bar{\boldsymbol{y}}\right )^{2}}\,. \end{equation*}

The scaling vectors (22) are defined by $\boldsymbol{a}=-\frac{\bar{\boldsymbol{y}}}{{\boldsymbol{S}}_{\boldsymbol{y}}}$ and $\boldsymbol{b}=\frac{1}{{\boldsymbol{S}}_{\boldsymbol{y}}}$ . For $1,\ldots,n_{\mathcal{D}}$ , we define $\tilde{\boldsymbol{y}_{j}}=\boldsymbol{a}+\boldsymbol{b}\odot \boldsymbol{y}_{j}\,$ . The times, contract, and bond maturities are also rescaled with relations (24) and coefficients

\begin{eqnarray*} a_{h}= & -\frac{1}{2}\:,\: & b_{h}=\frac{1}{T_{max}^{*}}\,. \end{eqnarray*}

The set of sampled normalized state variables and parameters is denoted by

\begin{equation*} \tilde {\mathcal {S}}_{\mathcal {D}}=\left (\tilde {t}_{j},\tilde {\boldsymbol{y}}_{j},\pi _{j},\tilde {T}_{j}^{*},\tilde {T}_{j}\right )_{j=1,\ldots,n_{\mathcal {D}}}\,. \end{equation*}

During the training phase, the error of approximation is measured for all points of this sample sets with the scaled FK Equation (26). The error at expiry is measured with another sample set, denoted by $\mathcal{\tilde{S}}_{T^{*}}$ , of $n_{T^{*}}$ realizations of state variables and parameters:

\begin{equation*} \mathcal {\tilde {S}}_{T^{*}}=\left (\tilde {\boldsymbol{y}}_{k},\pi _{k},\tilde {T}_{k}^{*},\tilde {T}_{k}\right )_{k=1,\ldots,n_{T^{*}}}\,. \end{equation*}

Let us denote by $\boldsymbol{\Theta },$ the vector containing all neural weights $\left (C_{k},{\boldsymbol{c}}_{k}\right )_{k=1,\ldots,l}$ . At points of $\tilde{\mathcal{S}}_{\mathcal{D}}$ , we define for $j=1,\ldots,n_{\mathcal{D}}$ , the error in Equation (25) when $V$ is replaced by the neural network as follows:

(28) \begin{align} & e_{j}^{\mathcal{D}}(\boldsymbol{\Theta })=b_{h}\,\partial _{\tilde{t}_{j}}F-\left (\frac{\tilde{r}_{j}-a_{r}}{b_{r}}+\frac{\tilde{\mu }_{j}-a_{\mu }}{b_{\mu }}\right )\,F\\ & \quad +\boldsymbol{\mu} _{\tilde{\boldsymbol{y}}}(t_{j},\tilde{\boldsymbol{y}}_{j})^{\top }\nabla _{{\tilde{\boldsymbol{y}}}}F+\frac{1}{2}\text{tr}\left (\Sigma _{{\tilde{\boldsymbol{y}}}}(t_{j},\tilde{\boldsymbol{y}}_{j})\Sigma _{{\tilde{\boldsymbol{y}}}}(t_{j},\tilde{\boldsymbol{y}}_{j})^{\top }\mathcal{H}_{{\tilde{\boldsymbol{y}}}}(F)\right )\,.\nonumber \end{align}

We refer to $e_{j}^{\mathcal{D}}$ as the error on the interior domain since it measures the goodness of fit before expiry. The average quadratic loss on $\tilde{\mathcal{S}}_{\mathcal{D}}$ is the first component of the total loss function used to fit the neural network,

(29) \begin{equation} \mathcal{L}_{D}\left (\boldsymbol{\Theta }\right )=\frac{1}{n_{\mathcal{D}}}\sum _{j=1}^{n_{\mathcal{D}}}e_{j}^{\mathcal{D}}(\boldsymbol{\Theta })^{2}\,. \end{equation}

Since the error $e_{j}^{\mathcal{D}}(\boldsymbol{\Theta })$ depends on first- and second-order partial derivatives, special attention must be given to the accuracy of their calculation. Computing these derivatives using a standard finite difference method may introduce numerical instabilities. Therefore, we opt for an alternative approach known as automatic or algorithmic differentiation. Automatic differentiation leverages the fact that every computer calculation executes a sequence of elementary arithmetic operations and elementary functions, allowing us to compute partial derivatives with accuracy up to the working precision. Automatic differentiation is implemented in TensorFlow through functions such as GradientTape(.) and tape.gradient(.). For a more in-depth introduction and perspectives, we refer the reader to van Merriënboer et al. (Reference van Merriënboer, Breuleux and Bergeron2018).

On the boundary sample set $\mathcal{\tilde{S}}_{T^{*}}$ , we define the error $e_{k}^{T^{*}}(\boldsymbol{\Theta })$ for $k=1,\ldots,n_{T^{*}}$ , as the difference between the output of the neural network and the payoff at expiry:

\begin{eqnarray*} e_{k}^{T^{*}}(\boldsymbol{\Theta }) & = & F(\tilde{T}_{k}^{*},{\tilde{\boldsymbol{y}}}_{k},\boldsymbol{\pi }_{k},\tilde{T}_{k}^{*},\tilde{T}_{k})-H\left (\left (\tilde{A}_{k}^{*}-a_{S}\right )/b_{S},G_{k}\right )\,. \end{eqnarray*}

The average quadratic loss at expiry is the second component of the total loss function.

(30) \begin{equation} \mathcal{L}_{T^{*}}\left (\boldsymbol{\Theta }\right )=\frac{1}{n_{T^{*}}}\sum _{k=1}^{n_{T^{*}}}e_{k}^{T^{*}}(\boldsymbol{\Theta })^{2}\,. \end{equation}

The optimal network weights are found by minimizing the losses in $\tilde{\mathcal{S}}_{\mathcal{D}}$ and $\mathcal{\tilde{S}}_{T^{*}}$ ,

(31) \begin{align} \boldsymbol{\Theta }_{opt} & =\arg \min _{\boldsymbol{\Theta }}\left [\mathcal{L}_{D}\left (\boldsymbol{\Theta }\right )+\mathcal{L}_{T^{*}}\left (\boldsymbol{\Theta }\right )\right ]\:. \end{align}

In practice, the minimization is performed with a gradient descent algorithm. For an introduction, we refer, e.g., to Denuit et al. (Reference Denuit, Hainaut and Trufin2019), Section 1.6.

7. Numerical illustration

We fit a Nelson–Siegel model to the Belgian state yield curveFootnote 1 on the 1/9/23. Initial survival probabilities, $\,_{t}p_{x}^{\mu }$ and $\sigma _{\mu }(t)$ , are fitted to male Belgian mortality rates.Footnote 2 Details are provided in Appendix D. Parameters of survival probabilities $\,_{t}p_{x}^{\lambda }$ , of the reference population for the mortality bond, are assumed proportional to these of $\,_{t}p_{x}^{\mu }$ . Reference ages, $x_{\mu }$ and $x_{\lambda }$ , are set to $50$ years. Other market parameters are estimated from daily time series of the Belgian stock index BEL 20 and of the 1 year Belgian state yield from the 1/9/10 to the 1/9/23. The correlations $\rho _{S\,\mu }$ and $\rho _{r\,\mu }$ are set to –5% and 0%. Parameter estimates are reported in Table 1.

Table 1. Financial and biometric parameters

We first build a sample set, $\mathcal{S_{D}}$ , of contract features, asset-mixes, and durations. We randomly draw $n_{\mathcal{D}}=100,000$ combinations of risk factors and investment-duration parameters. Most of these quantities are drawn from uniform distributions whose characteristics are reported in Table 2. The contract and bond maturities are drawn from exponential distributions to ensure a smoother allocation of durations than what would be obtained with uniform distributions. As $\pi _{S}$ , $\pi _{P}$ , and $\pi _{D}$ are randomly distributed in $[0,1]$ , we allow for short positions in cash. Using the same distributions as for $\mathcal{S_{D}}$ , we generate a boundary sample set $\mathcal{S}_{T^{*}}$ of size, $n_{T^{*}}=50,000$ . The last step consists in scaling and centering the sample sets, as described in the previous section. Notice that setting $G_{m}=100$ is not constraining. As the payoff is piecewise linear, we can use the rule of thumb to evaluate any contract with another minimum capital. If $V(A_{t},G_{m})$ is the GMAB price at time $t$ , for an asset value $A_{t}$ and a guarantee $G_{m}$ , the value of a contract with a guaranteed capital $G_{m}^{'}\neq G_{m}$ is equal to:

Table 2. Model parameters and features of contracts

\begin{equation*} V(A_{t},G_{m}^{'})=\frac {G_{m}^{'}}{G_{m}}V\left (\frac {G_{m}}{G_{m}^{'}}A_{t},G_{m}\right )\,. \end{equation*}

The neural network is implemented in TensorFlow using Python. As mentioned earlier, the necessary partial derivatives for evaluating the inner loss are computed through automatic differentiation. This procedure is implemented in TensorFlow functions like GradientTape(.) and tape.gradient(.), allowing for accurate derivative calculations without relying on finite differences. To train the network, we employ the ADAM optimizer and utilize random batches containing 34,000 and 17,000 samples from $\mathcal{\tilde{S}_{D}}$ and $\mathcal{\tilde{S}}_{T^{*}}$ , respectively. Over the course of one epoch, neural weights are updated three times. We run a total of 26,000 epochs and adjust the ADAM learning rate based on the pattern provided in Table 3, to expedite calibration. The training is conducted on a laptop equipped with an AMD Ryzen 7 5825U processor and 16 Gb of RAM (without a CUDA compatible GPU). The training time depends on the network configuration. As indicated in Table 4, running 200 epochs takes approximately 12–18 minutes for networks with 3 and 5 hidden layers with 32 neurons, respectively. The entire calibration process spans from 26 to 40 hours. Choosing 32 neurons per hidden layer strikes a good balance between accuracy and training time. Tests conducted with 16 neurons over 8000 epochs revealed that this configuration performed less effectively compared to the 32-neuron networks.

Table 3. Learning rate in function of Epochs

Table 4. Information about training time and loss values after training for three neural configurations

Table 4 reports the losses on $\mathcal{\tilde{S}_{D}}$ , $\mathcal{\tilde{S}}_{T^{*}}$ and their total, computed with Equations (29) and (30). We observe a significant reduction of losses between 3 and 4 hidden layers models with 32 neurons. The reduction is relatively smaller when we switch from a 4 to a 5 layers model.

Table 5 presents statistics about relative errorsFootnote 3 between PINN and exact prices obtained with formula (18). These statistics are computed for the 100,000 contracts in the training data set $\tilde{\mathcal{S}}_{\mathcal{D}}$ and for a validation set of same size and built in the same manner. The similarity of figures on both sets does not reveal that we overfit the training dataset. The average relative errors are small and around 0.15% for 4 and 5 hidden layers models. The 95% confidence level is also below 1% for these networks which is quite acceptable.

Table 5. Relative errors computed on the 100 000 contracts of the training set $\tilde{\mathcal{S}}_{\mathcal{D}}$ and on a validation set of same size

In the remainder of this section, we focus on the network with 4 hidden layers as it yields the lowest relative errors on training and validation sets. Fig. 2 compares exact and approximated PINN prices for an 8-year GMAB. The upper plots show the values at issuance and at expiry for various values of the total asset. Market conditions are as of September 1, 2023. Across a wide range of asset values, the exact and PINN prices closely align. The largest deviations are observed for the lowest and highest values of $A_{t}$ . The lower plots display the value at issuance of the same contract with $A_{0}\in \{90,100\}$ for $r_{0}$ ranging from 0% and 8%. We observe that the spreads between true and approximated prices depend on both parameters. However, it is noteworthy that the error remains under 1% in most cases.

Figure 2 Comparison of exact and PINN prices of a $T^{*}=T=8$ years GMAB with: $S_{0}=100$ , $\boldsymbol{\pi }=(0.50,0.25,0.25)$ , $(r_{0},\mu _{0},\lambda _{0})$ of Table 2. Upper plots: values at issuance and expiry, for $A_{t}\in [23.82,414.15]$ . Lower plots: values at issuance with $A_{0}=100$ and 90, for $r_{0}\in [0.00,0.08]$ .

To understand the circumstances in which relative errors are the highest, we analyze them by subgroups of contracts. In order to have a sufficient number of representatives in each sub-category, we generate a third validation set, this time consisting of 500,000 contracts instead of 100,000. Table 6 presents the averages and standard deviations of relative errors for 20 classes of contracts grouped by asset values. We observe that the highest errors occur with the lowest asset values. The left heatmap in Fig. 3 displays relative errors by categories of asset values and residual contract maturities, $T^{*}-t$ . We draw two conclusions from this graph. First, for $A_{t}\leq 80$ , the error increases with the residual contract duration and may be significant. Second, the errors in the usual pricing range, around the guarantee level $G_{m}=100$ , remain close to 0.01% regardless of the time horizon.

Table 6. Mean and standard deviations (std) of relative errors for contracts grouped by asset values

It is worth noting that for low asset values, GMAB is deeply out of the money, and the contract price is very close to that of a pure endowment for which we have a closed-form expression. One way to improve accuracy would be to add a third penalty term to the loss function (31). This term would penalize the quadratic spread between PINN and endowment prices for low asset values. However, we deliberately avoided this solution to observe how the network behaves when provided with minimal information.

Table 7. Mean and standard deviations (std) of relative errors for contracts grouped by interest rates

Figure 3 Left plot: heatmap of relative errors per asset and maturity classes. Right plot: heatmap of relative errors per interest rate and maturity classes.

Table 7 presents the means and standard deviations of errors for 10 groups of contracts, categorized by ranges of interest rates, $r_{t}$ . Regardless of the category, the average relative error remains close to 0.14%. The right heatmap in Fig. 3 suggests that, on average, errors increase with maturity. However, this observation should be qualified by the noticeable variability of errors between consecutive interest rate classes for the same maturity. This variability indicates that the increase in errors is partly attributed to other factors, such as the low asset values present within each of the interest rate classes.

Tables 8 and 9 provide information about errors for contracts categorized by classes of mortality rates. Similar to interest rates, the average error remains small and close to 0.14% across all class. The heatmaps of Fig. 4 show a slight increase in errors with the contract time to expiry. Once again, we observe a higher variability of errors for longer maturities. As mentioned earlier, this variability signals that the increase in errors is influenced by factors beyond mortality rates.

Table 8. Mean and standard deviations (std) of relative errors for contracts grouped by mortality rates $\mu _{x+t}$

Table 9. Mean and standard deviations (std) of relative errors for contracts grouped by hedging mortality rates $\lambda _{x+t}$

Figure 4 Left plot: heatmap of relative errors per $\mu _{x+t}$ (in $^{\circ }$ /oo) and maturity classes. Right plot: heatmap of relative errors per $\lambda _{x+t}$ (in $^{\circ }$ /oo) and maturity classes.

Tables 10 and 11 provide statistics on relative errors for contracts categorized by fractions of stocks and bonds (including interest rate and mortality bonds). The left plot in Fig. 5 displays a heatmap of errors based on these two parameters. Interestingly, we do not observe any particular trend, and the pricing error remains acceptable regardless of the asset mix. This indicates that the PINN can be used for ALM purposes, particularly in approximating the impact of changes in asset mix on contract fair values.

Table 10. Mean and standard deviations (std) of relative errors for contracts grouped by fraction of stocks

Table 11. Mean and standard deviations (std) of relative errors for contracts grouped by fraction of bonds

Figure 5 Heatmap of relative errors per $\pi _{S}$ and $(\pi _{P}+\pi _{D})$ asset allocation classes.

Nevertheless, the right heatmap of Fig. 5 reveals that the error increases with the time to expiry. To illustrate this, we compare the exact and PINN prices of a 5- and 8-year contract in Fig. 6, where $\pi _{S}$ ranges from 0 to 1 and $\pi _{D}=\pi _{M}=\frac{1-\pi _{S}}{2}$ . This graph shows that the PINN captures the trend of the relationship between prices and the fraction of stocks, but the error may reach 1% for certain combinations of parameters and risk factors. While this error may appear relatively high, it should be considered in the context of existing alternatives. For instance, in risk management, a common approach to valuing a contract at a future date relies on the least aquare Monte-Carlo (LSMC) method. As demonstrated in Hainaut & Akbaraly (Reference Hainaut and Akbaraly2023) for a similar contract, the average valuation error often exceeds 1% and can even surpass 3% in extreme cases. Replacing LSMC with the PINN in such a context would clearly reduce the overall valuation error.

Figure 6 Comparison of exact and PINN prices of a 5- and 8-year GMAB with: $S_{0}=100$ , $\pi _{S}\in [0,1]$ and $\pi _{D}=\pi _{M}=\frac{1-\pi _{S}}{2}$ . $(r_{0},\mu _{0},\lambda _{0})$ of Table 2. Upper plots: values at issuance and expiry, for $A_{t}\in [23.82,414.15]$ . Lower plots: values at issuance with $A_{0}=100$ and 90, for $r_{0}\in [0.00,0.08]$ .

We conclude from the error analysis that the average accuracy of a PINN is sufficient for risk management and ALM purposes. The significant advantage of using this approach is the substantial time savings once the calibration is completed. Computing prices for a sample set of 500,000 contracts, each with varying durations, asset mixes, and market conditions, takes less than 45 seconds. In contrast, evaluating the same portfolio using a standard MC method would require a considerable amount of time because we would need to regenerate a sufficient number of sample paths for each initial combination of risk factors. The PINN model is capable of pricing an impressive range of contracts, and with additional computing power, it is likely possible to further improve both accuracy and the diversity of contracts.

8. Conclusions

In this work, we explore the capability of PINNs to price GMABs. The first part of the article introduces market and biometric risk factors within a Gaussian framework. We also develop an analytical formula for GMAB pricing to serve as a benchmark for the PINN. Next, we propose a scaled version of the Feynman–Kac (FK) equation, which helps mitigate the vanishing gradient phenomenon during training. An approximate solution is constructed using a neural network with skip connections. We conclude with a detailed analysis of pricing errors.

This study highlights several advantages of PINNs. First, they excel in evaluating products exposed to multiple risk factors. Alternative procedures, such as those based on finite differences or FFT, become challenging and time-consuming when dealing with more than two risk factors. Second, PINNs offer parameterization options, allowing flexibility in incorporating contract features such as duration or asset mix. This flexibility sets them apart from other methods that require retraining for each unique combination of contract features. Third, once trained, PINNs provide nearly instantaneous valuation. We can appraise a portfolio of 500,000 GMABs with various market conditions and features in less than a minute. In contrast, performing the same exercise with a MC method would entail simulating sample paths for each initial risk factor combination.

However, PINNs also have some drawbacks. First, there is no systematic procedure to determine the optimal network architecture and training set size. After several attempts, we opted for a network with direct connections between input and hidden layers. Second, the training time can be lengthy, although parallelization on GPUs may help reduce it. On a positive note, updating the network to adapt to new market conditions is expected to be less time-consuming when initialized with neural weights obtained after the initial training.

Numerical results are undeniably encouraging. Within the typical range of risk factors, a PINN with four intermediate layers approximates prices with a relative error smaller than ten basis points. Furthermore, the PINN effectively captures the relationship between GMAB prices, risk factors, and asset mix. Its accuracy is sufficient to conduct sensitivity analyses on contract specifications and test various asset allocation strategies.

However, we do observe non-negligible errors for very low asset values. In these cases, the embedded call option in the GMAB is deeply out of the money, and the contract price closely resembles that of a pure endowment. Additionally, we notice that errors tend to slightly increase with the contract’s maturity. While errors may appear relatively high in certain configurations, it is crucial to consider these results in the context of existing alternatives. For example, in risk management, the common approach for valuing contracts at a future date relies on the LSMC method, which is known to be significantly less accurate (as demonstrated in Hainaut & Akbaraly, Reference Hainaut and Akbaraly2023).

This article lays the groundwork for future experiments. First, errors for deep out-of-the-money contracts can be reduced by introducing a third penalty term into the loss function (31). As mentioned earlier, in such cases, the GMAB value is nearly identical to the value of a pure endowment, which can be calculated analytically. Therefore, we can include a term in the loss function that penalizes the quadratic spread between PINN and endowment prices when the asset value falls below a certain threshold. We deliberately avoided this solution initially to assess how the network performs with minimal information. Second, it would be interesting to expand the range of parameters and contract features to generalize the procedure. Lastly, we have not explored all possible neural architectures, and there may be room for optimization to enhance accuracy and reduce computing time.

Data availability statements

The data and code that support the findings of this study are available from the corresponding author (DH) upon reasonable request.

Financial support

The first author thanks the FNRS (Fonds de la recherche scientifique) for the financial support through the EOS project Asterisk (research grant 40007517).

Competing interests

The author declares none.

Appendix A

The next proposition enable us to infer the expressions of $\gamma _{r}(t)$ and $\gamma _{z\in \{\mu,\lambda \}}(t)$ matching the initial term structures of interest and mortality rates.

Proposition 4. At time $0\leq t\leq T$ , the value of the discount bond of maturity $T$ is equal to

(A1) \begin{eqnarray} P(t,T) & = & \exp \left (-r_{t}B_{\kappa _{r}}(t,T)-\int _{t}^{T}\gamma _{r}(u)\left (1-e^{-\kappa _{r}(T-u)}\right )\,du\right )\\ & & \quad \quad \times \exp \left (\frac{\sigma_{r}^{2}}{2}\int _{t}^{T}B_{\kappa _{r}}(u,T)^{2}du\right )\,,\nonumber \end{eqnarray}

The survival probability up to time $T$ for $z\in \{\mu,\lambda \}$ , is given by

(A2) \begin{eqnarray} \,_{T}p_{x_{z}+t}^{z} & = & \exp \left (-z_{x_{z}+t}B_{\kappa _{z}}(t,T)-\int _{t}^{T}\gamma _{z}(u)\left (1-e^{-\kappa _{z}(T-u)}\right )\,du\right )\\ & & \quad \quad \times \exp \left (\frac{1}{2}\int _{t}^{T}\left (\sigma _{z}(u)B_{\kappa _{z}}(u,T)\right )^{2}du\right )\,,\nonumber \end{eqnarray}

The pure endowments admit the following expressions:

(A3) \begin{eqnarray} & & _{T}E_{t}^{\mu }=\textbf{1}_{\{\tau _{\mu }\geq t\}}\exp \left (-r_{t}B_{\kappa _{r}}(t,T)-\mu _{x_{\mu }+t}B_{\kappa _{\mu }}(t,T)+\frac{\sigma _{r}^{2}}{2}\int _{t}^{T}B_{\kappa _{r}}(u,T)^{2}du\right )\times \\ & & \exp \left (-\int _{t}^{T}\gamma _{r}(u)\left (1-e^{-\kappa _{r}(T-u)}\right )\,du-\int _{t}^{T}\gamma _{\mu }(u)\left (1-e^{-\kappa _{\mu }(T-u)}\right )\,du\right )\times \nonumber \\ & & \exp \left (\sigma _{r}\Sigma _{r}^{\top }\Sigma _{\mu }\int _{t}^{T}\sigma _{\mu }(u)B_{\kappa _{\mu }}(u,T)B_{\kappa _{r}}(u,T)du+\frac{1}{2}\int _{t}^{T}\left (\sigma _{\mu }(u)B_{\kappa _{\mu }}(u,T)\right )^{2}du\right )\,,\nonumber \end{eqnarray}

and

(A4) \begin{eqnarray} & & _{T}E_{t}^{\lambda }=\textbf{1}_{\{\tau _{\lambda }\geq t\}}\exp \left (-r_{t}B_{\kappa _{r}}(t,T)-\lambda _{x_{\lambda }+t}B_{\kappa _{\lambda }}(t,T)+\frac{\sigma _{r}^{2}}{2}\int _{t}^{T}B_{\kappa _{r}}(u,T)^{2}du\right )\times \\ & & \exp \left (-\int _{t}^{T}\gamma _{r}(u)\left (1-e^{-\kappa _{r}(T-u)}\right )\,du-\int _{t}^{T}\gamma _{\lambda }(u)\left (1-e^{-\kappa _{\lambda }(T-u)}\right )\,du\right )\times \nonumber \\ & & \exp \left (\sigma _{r}\Sigma _{r}^{\top }\Sigma _{\lambda }\int _{t}^{T}\sigma _{\lambda }(u)B_{\kappa _{\lambda }}(u,T)B_{\kappa _{r}}(u,T)du+\frac{1}{2}\int _{t}^{T}\left (\sigma _{\lambda }(u)B_{\kappa _{\lambda }}(u,T)\right )^{2}du\right )\,,\nonumber \end{eqnarray}

Sketch of the proof

We can show by direct differentation that interest and mortality rates are equal to

(A5) \begin{eqnarray} \left (\begin{array}{c} r_{s}\\ \mu _{x_{\mu }+s}\\ \lambda _{x_{\lambda }+s} \end{array}\right ) & = & \left (\begin{array}{c} e^{-\kappa _{r}(s-t)}r_{t}\\ e^{-\kappa _{\mu }(s-t)}\mu _{x_{\mu }+t}\\ e^{-\kappa _{\lambda }(s-t)}\lambda _{x_{\lambda }+t} \end{array}\right )+\left (\begin{array}{c} \kappa _{r}\int _{t}^{s}\gamma _{r}(u)e^{-\kappa _{r}(s-u)}du\\ \kappa _{\mu }\int _{t}^{s}\gamma _{\mu }(u)e^{-\kappa _{\mu }(s-u)}du\\ \kappa _{\lambda }\int _{t}^{s}\gamma _{\lambda }(u)e^{-\kappa _{\lambda }(s-u)}du \end{array}\right )\\ & & +\left (\begin{array}{c} \sigma _{r}\Sigma _{r}^{\top }\int _{t}^{s}e^{-\kappa _{r}(s-u)}d\boldsymbol{W}_{u}\\ \Sigma _{\mu }^{\top }\int _{t}^{s}\sigma _{\mu }(u)e^{-\kappa _{\mu }(s-u)}d\boldsymbol{W}_{u}\\ \Sigma _{\lambda }^{\top }\int _{t}^{s}\sigma _{\lambda }(u)e^{-\kappa _{\lambda }(s-u)}d\boldsymbol{W}_{u} \end{array}\right )\nonumber \end{eqnarray}

The integrals of interest and mortality rates are obtained by direct integration

(A6) \begin{eqnarray} \left (\begin{array}{c} \int _{t}^{T}r_{s}ds\\ \int _{t}^{T}\mu _{x_{\mu }+s}ds\\ \int _{t}^{T}\lambda _{x_{\lambda }+s}ds \end{array}\right ) & = & \left (\begin{array}{c} r_{t}B_{\kappa _{r}}(t,T)\\ \mu _{x_{\mu }+t}B_{\kappa _{\mu }}(t,T)\\ \lambda _{x_{\lambda }+t}B_{\kappa _{\lambda }}(t,T) \end{array}\right )+\left (\begin{array}{c} \int _{t}^{T}\gamma _{r}(u)\left (1-e^{-\kappa _{r}(T-u)}\right )\,du\\ \int _{t}^{T}\gamma _{\mu }(u)\left (1-e^{-\kappa _{\mu }(T-u)}\right )\,du\\ \int _{t}^{T}\gamma _{\lambda }(u)\left (1-e^{-\kappa _{\lambda }(T-u)}\right )\,du \end{array}\right )\\ & & +\left (\begin{array}{c} \sigma _{r}\Sigma _{r}^{\top }\int _{t}^{s}B_{\kappa _{r}}(u,T)\,d\boldsymbol{W}_{u}\\ \Sigma _{\mu }^{\top }\int _{t}^{s}\sigma _{\mu }(u)B_{\kappa _{\mu }}(u,T)\,d\boldsymbol{W}_{u}\\ \Sigma _{\lambda }^{\top }\int _{t}^{s}\sigma _{\lambda }(u)B_{\kappa _{\lambda }}(u,T)\,d\boldsymbol{W}_{u} \end{array}\right )\,.\nonumber \end{eqnarray}

The results follow from the log-normality of $e^{-\int _{t}^{T}z_{s}ds}$ for $z\in \{r,\mu,\lambda \}$ , from $\epsilon _{rr}^{2}+\epsilon _{r\mu }^{2}+\epsilon _{r\lambda }^{2}=1$ and $\epsilon _{\mu \mu }^{2}+\epsilon _{\mu \lambda }^{2}=1$ .

end

From Equation (A1), we deduce that the function $\gamma _{r}(u)$ must satisfy the next relation to match the initial yield curve of zero-coupon bond:

(A7) \begin{eqnarray} \int _{0}^{T}\gamma _{r}(u)\left (1-e^{-\kappa _{r}(T-u)}\right )\,du & = & -\ln P(0,T)-r_{0}B_{\kappa _{r}}(0,T)+\frac{\sigma _{r}^{2}}{2}\int _{0}^{T}B_{\kappa _{r}}(u,T)^{2}du\,. \end{eqnarray}

Deriving twice this expression leads to the following useful reformulation of $\gamma _{r}(T)$ :

(A8) \begin{eqnarray} \gamma _{r}(T) & = & -\frac{1}{\kappa _{r}}\partial _{T}^{2}\ln P(0,T)-\partial _{T}\ln P(0,T)+\frac{\sigma _{r}^{2}}{2\kappa _{r}^{2}}\left (1-e^{-2\kappa _{r}T}\right )\,, \end{eqnarray}

where $-\partial _{T}\ln P(0,T)$ is the instantaneous forward rate. For a given initial mortality curve $\,_{T}p_{x_{z}}^{z}$ , $z\in \{\mu,\lambda \}$ , we show in a similar manner that the function $\gamma _{z}(u)$ satisfies the relation

(A9) \begin{eqnarray} \gamma _{z}(T) & = & -\frac{1}{\kappa _{z}}\partial _{T}^{2}\ln \,_{T}p_{x_{z}}^{z}-\partial _{T}\ln \,_{T}p_{x_{z}}^{z}+\frac{1}{\kappa _{z}}\int _{0}^{T}\sigma _{z}(u)^{2}\left (e^{-2\kappa _{z}(T-u)}\right )du\nonumber \\ & = & -\frac{1}{\kappa _{z}}\partial _{T}^{2}\ln \,_{T}p_{x_{z}}^{z}-\partial _{T}\ln \,_{T}p_{x_{z}}^{z}+\frac{\alpha _{z}^{2}e^{2\beta _{z}x_{z}}}{2\kappa _{z}(\kappa _{z}+\beta _{z})}\left (e^{2\beta _{z}T}-e^{-2\kappa _{z}T}\right )\,. \end{eqnarray}

Equations (A8) and (A9) allows us to rewrite bond prices, survival probabilities and endowments as function of initial term structures of mortality and interest rates. Analytical expressions are provided in Proposition 1 the proof is summarized below.

Proposition 1: sketch of the proof

By direct integration of Equations (A8) and (A9), we obtain that

\begin{eqnarray*} \int _{t}^{T}\gamma _{r}(u)\left (1-e^{-\kappa _{r}(T-u)}\right )\,du & = & \left (\partial _{t}\ln P(0,t)\right )B_{\kappa _{r}}(t,T)-\ln \frac{P(0,T)}{P(0,t)}+\frac{\sigma _{r}^{2}}{2\kappa _{r}^{2}}(T-t)\\ & & -\frac{\sigma _{r}^{2}}{2\kappa _{r}^{2}}B_{\kappa _{r}}(t,T)-\frac{\sigma _{r}^{2}}{4\kappa _{r}}e^{-2\kappa _{r}t}B_{\kappa _{r}}(t,T)^{2}\,, \end{eqnarray*}

and

\begin{eqnarray*} & & \int _{t}^{T}\gamma _{z}(u)\left (1-e^{-\kappa _{z}(T-u)}\right )\,du=\left (\partial _{t}\ln \,_{t}p_{x_{z}}^{z}\right )B_{\kappa _{z}}(t,T)-\ln \frac{\,_{T}p_{x_{z}}^{z}}{\,_{t}p_{x_{z}}^{z}}+\frac{\alpha _{z}^{2}e^{2\beta _{z}x_{z}}}{2\kappa _{z}(\kappa _{z}+\beta _{z})}\\ & & \times \left (e^{2\beta _{z}T}B_{2\beta _{z}}(t,T)-e^{-2\kappa _{z}t}B_{2\kappa _{z}}(t,T)-e^{2\beta _{z}T}B_{2\beta _{z}+\kappa _{z}}(t,T)+e^{-\kappa _{z}(T+t)}B_{\kappa _{z}}(t,T)\right ) \end{eqnarray*}

Combining these expressions with these of Proposition 4.

end

Proposition 2 : sketch of the proof

Let us denote the mortality indexed discount factor by $\,_{T}m_{t}^{\lambda }=\mathbb{E^{Q}}\left (e^{-\int _{t}^{T}\left (r_{s}+\lambda _{x_{\lambda }+s}\right )ds}|\mathcal{F}_{t}\right )$ such that $D(t,T)=e^{-\int _{0}^{t}\lambda _{x_{\lambda }+s}ds}\,_{T}m_{t}^{\lambda }$ . The differential of $D(t,T)$ then equal to

(A10) \begin{eqnarray} dD(t,T) & = & -\lambda _{x_{\lambda }+t}\,_{T}m_{t}^{\lambda }\,dt+e^{-\int _{0}^{t}\lambda _{x_{\lambda }+s}ds}d\,_{T}m_{t}^{\lambda }\,. \end{eqnarray}

From Proposition 5, we infer that the mortality indexed discount factor is ruled by the SDE:

\begin{equation*} \frac {d\,_{T}m_{x+t}^{\lambda }}{\,_{T}m_{x+t}^{\lambda }}=\left (r_{t}+\lambda _{x_{\lambda }+t}\right )dt-B_{\kappa _{r}}(t,T)\sigma _{r}\Sigma _{r}^{\top }d{\boldsymbol{W}}_{t}-B_{\kappa _{\lambda }}(t,T)\sigma _{\lambda }(t)\Sigma _{\lambda }^{\top }d{\boldsymbol{W}}_{t}\,. \end{equation*}

From Equation (A10), we infer the dynamics under $\mathbb{Q}$ , Equation (8). As $d\boldsymbol{W}_{t}=d\tilde{\boldsymbol{W}}_{t}+\boldsymbol{\theta }dt$ with

\begin{eqnarray*} \nu _{r} & = & -\Sigma _{r}^{\top }\boldsymbol{\theta }\,,\\ \nu _{\lambda } & = & -\Sigma _{\lambda .}^{\top }\boldsymbol{\theta }\,, \end{eqnarray*}

we obtain Equation (9).

end

The next result presents the dynamics of the discount bond and endowment under the pricing measure. This is a direct consequence of the Itô’s lemma applied to Proposition 4.

Proposition 5. Under the risk neutral measure $\mathbb{Q}$ , the dynamics of the zero-coupon bond and of the pure endowment at time $t\leq T$ are given by

(A11) \begin{equation} \begin{cases} \frac{dP(t,T)}{P(t,T)} & =\,r_{t}\,dt-B_{\kappa _{r}}(t,T)\sigma _{r}\Sigma _{r}^{\top }d\boldsymbol{W}_{t}\,,\\ \frac{d\,_{T}E_{t}^{\mu }}{_{T}E_{t}^{\mu }} & =\left (r_{t}+\mu _{x_{\mu }+t}\right )dt+\,d\textbf{1}_{\{\tau _{\mu }\geq t\}}-B_{\kappa _{r}}(t,T)\sigma _{r}\Sigma _{r}^{\top }d\boldsymbol{W}_{t}\\ & -B_{\kappa _{\mu }}(t,T)\sigma _{\mu }(t)\Sigma _{\mu }^{\top }d\boldsymbol{W}_{t}\\ \frac{d\,_{T}E_{t}^{\lambda }}{\,_{T}E_{t}^{\lambda }} & =\,\left (r_{t}+\lambda _{x+t}\right )dt+\,d\textbf{1}_{\{\tau _{\lambda }\geq t\}}-B_{\kappa _{r}}(t,T)\sigma _{r}\Sigma _{r}^{\top }d\boldsymbol{W}_{t}\\ & -B_{\kappa _{\lambda }}(t,T)\sigma _{\lambda }(t)\Sigma _{\lambda }^{\top }d\boldsymbol{W}_{t} \end{cases} \end{equation}

As $\mathbb{E}^{\mathbb{Q}}\left (d\textbf{1}_{\{\tau _{\mu }\geq t\}}\right )=-\mu _{x+t}dt$ for, we check that the pure endowment has a return equal to the risk-free rate: $\mathbb{E}^{\mathbb{Q}}\left (d\,_{T}E_{t}^{\mu }\right )={}_{T}E_{t}^{\mu }\,r_{t}\,dt$ .

Proposition 3: sketch of the proof

From the Itô’s lemma, the dynamic of $\frac{1}{P(t,T^{*})}$ under the risk neutral measure $\mathbb{Q}$ is equal to

\begin{eqnarray*} d\frac{1}{P(t,T^{*})} & = & -\frac{r_{t}}{P(t,T^{*})}dt+\frac{B_{\kappa _{r}}(t,T^{*})^{2}\sigma _{r}^{2}}{P(t,T^{*})}dt+\frac{B_{\kappa _{r}}(t,T^{*})}{P(t,T^{*})}\sigma _{r}\Sigma _{r}^{\top }d\boldsymbol{W}_{t}\,. \end{eqnarray*}

From Equation (13), we infer that under $\mathbb{F}$ , this ratio is ruled by

(A12) \begin{eqnarray} & & d\frac{1}{P(t,T^{*})}=-\frac{r_{t}}{P(t,T^{*})}dt+\frac{B_{\kappa _{r}}(t,T^{*})^{2}\sigma _{r}^{2}}{P(t,T^{*})}dt+\frac{B_{\kappa _{r}}(t,T^{*})}{P(t,T^{*})}\sigma _{r}\Sigma _{r}^{\top }d\boldsymbol{W}_{t}^{\mathbb{F}}\\ & & -\frac{B_{\kappa _{r}}(t,T^{*})}{P(t,T^{*})}\sigma _{r}\left (\sigma _{r}B_{\kappa _{r}}(t,T^{*})+\sigma _{\mu }(t)\Sigma _{r}^{\top }\Sigma _{\mu }B_{\kappa _{\mu }}(t,T^{*})+\sigma _{\lambda }(t)\Sigma _{r}^{\top }\Sigma _{\lambda }B_{\kappa _{\lambda }}(t,T^{*})\right )dt\,.\nonumber \end{eqnarray}

On the other hand, the differential of $\frac{A_{t}}{P(t,T^{*})}$ is related to $dA_{t}$ and $d\left (1/P(t,T^{*})\right )$ by the relation:

\begin{eqnarray*} d\left (\frac{A_{t}}{P(t,T^{*})}\right ) & = & A_{t}\,d\left (\frac{1}{P(t,T^{*})}\right )+\frac{dA_{t}}{P(t,T^{*})}+\boldsymbol{\pi }^{\top }\Sigma _{A}(t)\Sigma _{r}\sigma _{r}B_{\kappa _{r}}(t,T^{*})\frac{A_{t}}{P(t,T^{*})}\,dt\,. \end{eqnarray*}

From Equations (A12) and (14), we rewrite this last expression as follows:

\begin{eqnarray*} & & \frac{d\left (A_{t}/P(t,T^{*})\right )}{A_{t}/P(t,T^{*})}=-\boldsymbol{\pi }^{\top }\Sigma _{A}(t)\left (\sigma _{\mu }(t)B_{\kappa _{\mu }}(t,T^{*})\Sigma _{\mu }+\sigma _{\lambda }(t)B_{\kappa _{\lambda }}(t,T^{*})\Sigma _{\lambda }\right )dt\\ & & \quad -B_{\kappa _{r}}(t,T^{*})\sigma _{r}\left (\sigma _{\mu }(t)\Sigma _{r}^{\top }\Sigma _{\mu }B_{\kappa _{\mu }}(t,T^{*})+\sigma _{\lambda }(t)\Sigma _{r}^{\top }\Sigma _{\lambda }B_{\kappa _{\lambda }}(t,T^{*})\right )dt\\ & & \quad +\left (B_{\kappa _{r}}(t,T^{*})\sigma _{r}\Sigma _{r}^{\top }+\boldsymbol{\pi }^{\top }\Sigma _{A}(t)\right )d\boldsymbol{W}_{t}^{\mathbb{F}}\,. \end{eqnarray*}

Applying the Itô’s lemma to $\ln \left (A_{t}/P(t,T^{*})\right )$ , leads to the following result:

\begin{eqnarray*} & & d\ln \left (A_{t}/P(t,T^{*})\right )=-\boldsymbol{\pi }^{\top }\Sigma _{A}(t)\left (\sigma _{\mu }(t)B_{\kappa _{\mu }}(t,T^{*})\Sigma _{\mu }+\sigma _{\lambda }(t)B_{\kappa _{\lambda }}(t,T^{*})\Sigma _{\lambda }\right )dt\\ & & \quad -B_{\kappa _{r}}(t,T^{*})\sigma _{r}\left (\sigma _{\mu }(t)\Sigma _{r}^{\top }\Sigma _{\mu }B_{\kappa _{\mu }}(t,T^{*})+\sigma _{\lambda }(t)\Sigma _{r}^{\top }\Sigma _{\lambda }B_{\kappa _{\lambda }}(t,T^{*})\right )dt\\ & & \quad -\frac{1}{2}\left (B_{\kappa _{r}}(t,T^{*})\sigma _{r}\Sigma _{r}^{\top }+\boldsymbol{\pi }^{\top }\Sigma _{A}(t)\right )\left (B_{\kappa _{r}}(t,T^{*})\sigma _{r}\Sigma _{r}+\Sigma _{A}(t)^{\top }\boldsymbol{\pi }\right )dt\\ & & \quad +\left (B_{\kappa _{r}}(t,T^{*})\sigma _{r}\Sigma _{r}^{\top }+\boldsymbol{\pi }^{\top }\Sigma _{A}(t)\right )d\boldsymbol{W}_{t}^{\mathbb{F}}\,. \end{eqnarray*}

This last equation emphasizes that $\ln \frac{A_{T^{*}}/P(T^{*},T^{*})}{A_{t}/P(t,T^{*})}\sim N\left (\mu _{\mathbb{F}}(t),v_{\mathbb{F}}(t)\right )$ .

end

Appendix B

This appendix provides various integrals used for calculating analytical prices. For $t\leq T^{*}$ , $T^{*}\leq T$ , $T^{*}\leq S$ and $y\in \mathbb{R}^{+}$ , we have the following expressions:

\begin{eqnarray*} & & \int _{t}^{T^{*}}B_{y}(u,S)\,du=\frac{1}{y}\left (T^{*}-t\right )-\frac{1}{y}e^{-y(S-T^{*})}B_{y}(t,T^{*})\,,\\ & & \int _{t}^{T^{*}}B_{y}(u,S)^{2}\,du=\frac{1}{y^{2}}\left [\left (T^{*}-t\right )-2e^{-y(S-T^{*})}B_{y}(t,T^{*})\right .\\ & & \quad \quad \quad \quad \quad \left .+e^{-2y(S-T^{*})}B_{2y}(t,T^{*})\right ]\,,\\ & & \int _{t}^{T^{*}}B_{y}(u,S)B_{y}(u,T)\,du=\frac{1}{y^{2}}\left [\left (T^{*}-t\right )-e^{-y(S-T^{*})}B_{y}(t,T^{*})\right .\\ & & \quad \quad \quad \quad \quad \left .-e^{-y(T-T^{*})}B_{y}(t,T^{*})+e^{-y(S+T)}e^{2y\,T^{*}}B_{2y}(t,T^{*})\right ]\,. \end{eqnarray*}

The combined integrals of $B_{y}(t,T)$ , $\sigma _{\mu }(t)$ and $\sigma _{\lambda }(t)$ are:

\begin{eqnarray*} & & \int _{t}^{T^{*}}\sigma _{z}(u)B_{y}(u,S)du=\frac{\alpha _{z}}{y}e^{\beta _{z}\left (x_{z}+T^{*}\right )}B_{\beta _{z}}(t,T^{*})\\ & & \quad \quad \quad \quad \quad \quad \quad \quad -\frac{\alpha _{z}}{y}e^{\beta _{z}\left (x_{z}+T^{*}\right )-y\left (S-T^{*}\right )}B_{\beta _{z}+y}(t,T^{*})\,,\\ & & \int _{t}^{T^{*}}\sigma _{z}(u)B_{y}(u,S)B_{\kappa _{z}}(u,T)\,du=\frac{\alpha _{z}e^{\beta _{z}\left (x_{z}+T^{*}\right )}}{y\kappa _{z}}\left (B_{\beta _{z}}(t,T^{*})\right .\\ & & \quad \quad \quad \quad \quad \quad \quad \quad -e^{-y(S-T^{*})}B_{\beta _{z}+y}(t,T^{*})-e^{-\kappa _{z}(T-T^{*})}B_{\beta _{z}+\kappa _{z}}(t,T^{*})\\ & & \quad \quad \quad \quad \quad \quad \quad \quad \left .+e^{-y(S-T^{*})-\kappa _{z}(T-T^{*})}B_{y+\kappa _{z}+\beta _{z}}(t,T^{*})\right )\,,\\ & & \int _{t}^{T^{*}}\sigma _{z}(u)^{2}B_{\kappa _{z}}(u,S)B_{\kappa _{z}}(u,T)\,du=\left (\frac{\alpha _{z}e^{\beta _{z}\left (x_{z}+T^{*}\right )}}{\kappa _{z}}\right )^{2}\left (B_{2\beta _{z}}(t,T^{*})\right .\\ & & \quad \quad \quad \quad \quad \quad \quad \quad -e^{-\kappa _{z}(S-T^{*})}B_{2\beta _{z}+\kappa _{z}}(t,T^{*})-e^{-\kappa _{z}(T-T^{*})}B_{2\beta _{z}+\kappa _{z}}(t,T^{*})\\ & & \quad \quad \quad \quad \quad \quad \quad \quad \left .+e^{-\kappa _{z}\left (S-T^{*}\right )-\kappa _{z}\left (T-T^{*}\right )}B_{2(\kappa _{z}+\beta _{z})}(t,T^{*})\right )\,,\\ & & \int _{t}^{T^{*}}\sigma _{\mu }(u)\sigma _{\lambda }(u)B_{\kappa _{\mu }}(u,S)B_{\kappa _{\lambda }}(u,T)\,du=\frac{\alpha _{\mu }\alpha _{\lambda }e^{\beta _{\mu }(x_{\mu }+T^{*})+\beta _{\lambda }(x_{\lambda }+T^{*})}}{\kappa _{\mu }\kappa _{\lambda }}\\ & & \quad \quad \quad \quad \quad \quad \quad \quad \left (B_{\beta _{\mu }+\beta _{\lambda }}(t,T^{*})-e^{-\kappa _{\mu }(S-T^{*})}B_{\beta _{\mu }+\beta _{\lambda }+\kappa _{\mu }}(t,T^{*})\right .\\ & & \quad \quad \quad \quad \quad \quad \quad \quad -e^{-\kappa _{\lambda }(T-T^{*})}B_{\beta _{\mu }+\beta _{\lambda }+\kappa _{\lambda }}(t,T^{*})\\ & & \quad \quad \quad \quad \quad \quad \quad \quad \left .+e^{-\kappa _{\mu }(S-T^{*})-\kappa _{\lambda }(T-T^{*})}B_{\kappa _{\mu }+\kappa _{\lambda }+\beta _{\mu }+\beta _{\lambda }}(t,T^{*})\right )\,. \end{eqnarray*}

Appendix C

This appendix, provides the full analytical expressions of the variance and mean of the lognormal random process, $\ln \frac{A_{T^{*}}/P(T^{*},T^{*})}{A_{t}/P(t,T^{*})}$ (see Proposition 3). The variance $v_{\mathbb{F}}(t)$ is developed as the following sum:

(C1) \begin{eqnarray} & & v_{\mathbb{F}}(t)=\int _{t}^{T^{*}}\sigma _{r}^{2}B_{\kappa _{r}}(u,T^{*})^{2}\,du+2\boldsymbol{\pi }^{\top }\int _{t}^{T^{*}}\Sigma _{A}(u)B_{\kappa _{r}}(u,T^{*})\,du\,\Sigma _{r}\sigma _{r}\\ & & \quad +\boldsymbol{\pi }^{\top }\int _{t}^{T^{*}}\Sigma _{A}(u)\Sigma _{A}(u)^{\top }\,du\,\boldsymbol{\pi }\,.\nonumber \end{eqnarray}

The integral in the second term of the right hand side is a $3\times 4$ matrix equal to

(C2) \begin{eqnarray} & & \int _{t}^{T^{*}}\Sigma _{A}(u)B_{\kappa _{r}}(u,T^{*})\,du=\\ & & \quad \left (\begin{array}{c} \sigma _{S}\Sigma _{S}^{\top }\int _{t}^{T^{*}}B_{\kappa _{r}}(u,T^{*})\,du\\ -\sigma _{r}\Sigma _{r}^{\top }\int _{t}^{T^{*}}B_{\kappa _{r}}(u,T)B_{\kappa _{r}}(u,T^{*})\,du\\ \left (\begin{array}{c} -\sigma _{r}\Sigma _{r}^{\top }\int _{t}^{T^{*}}B_{\kappa _{r}}(u,T)B_{\kappa _{r}}(u,T^{*})\,du-\Sigma _{\lambda }^{\top }\\ \times \int _{t}^{T^{*}}\sigma _{\lambda }(u)B_{\kappa _{r}}(u,T^{*})B_{\kappa _{\lambda }}(u,T)du \end{array}\right )\end{array}\right )\,.\nonumber \end{eqnarray}

The integral in the third term of the right hand side of Equation (3) is a symmetric $3\times 3$ matrix:

(C3) \begin{equation} \int _{t}^{T^{*}}\Sigma _{A}(u)\Sigma _{A}(u)^{\top }\,du=\left (\begin{array}{ccc} c_{11}(t) & c_{12}(t) & c_{13}(t)\\ c_{12}(t) & c_{22}(t) & c_{23}(t)\\ c_{13}(t) & c_{23}(t) & c_{33}(t) \end{array}\right )\,, \end{equation}

with the following entries:

\begin{eqnarray*} & & c_{11}(t)=\sigma _{S}^{2}\left (T^{*}-t\right )\,,\\ & & c_{12}(t)=-\sigma _{S}\Sigma _{S}^{\top }\Sigma _{r}\sigma _{r}\int _{t}^{T^{*}}B_{\kappa _{r}}(u,T)\,du\,,\\ & & c_{13}(t)=-\sigma _{S}\Sigma _{S}^{\top }\Sigma _{r}\sigma _{r}\int _{t}^{T^{*}}B_{\kappa _{r}}(u,T)du-\sigma _{S}\Sigma _{S}^{\top }\Sigma _{\lambda }\int _{t}^{T^{*}}\sigma _{\lambda }(u)B_{\kappa _{\lambda }}(u,T)du\,,\\ & & c_{22}(t)=\sigma _{r}^{2}\int _{t}^{T^{*}}B_{\kappa _{r}}(u,T)^{2}du\,,\\ & & c_{23}(t)=\sigma _{r}^{2}\int _{t}^{T^{*}}B_{\kappa _{r}}(u,T)^{2}du+\sigma _{r}\Sigma _{r}^{\top }\Sigma _{\lambda }\int _{t}^{T^{*}}\sigma _{\lambda }(u)B_{\kappa _{r}}(u,T)B_{\kappa _{\lambda }}(u,T)du\,,\\ & & c_{33}(t)=\sigma _{r}^{2}\int _{t}^{T^{*}}B_{\kappa _{r}}(u,T)^{2}du+\int _{t}^{T^{*}}\sigma _{\lambda }(u)^{2}B_{\kappa _{\lambda }}(u,T)^{2}du\\ & & \qquad +2\sigma _{r}\Sigma _{r}^{\top }\Sigma _{\lambda }\int _{t}^{T^{*}}\sigma _{\lambda }(u)B_{\kappa _{r}}(u,T)B_{\kappa _{\lambda }}(u,T)du\,. \end{eqnarray*}

The mean function, $\mu _{\mathbb{F}}(t)$ , is developed as follows

(C4) \begin{eqnarray} & & \mu _{\mathbb{F}}(t)=-\frac{1}{2}v_{\mathbb{F}}(t)-\sigma _{r}\Sigma _{r}^{\top }\Sigma _{\mu }\int _{t}^{T^{*}}\sigma _{\mu }(u)B_{\kappa _{r}}(u,T^{*})B_{\kappa _{\mu }}(u,T^{*})\,du\\ & & \qquad -\sigma _{r}\Sigma _{r}^{\top }\Sigma _{\lambda }\int _{t}^{T^{*}}\sigma _{\lambda }(u)B_{\kappa _{r}}(u,T^{*})B_{\kappa _{\lambda }}(u,T^{*})du\nonumber \\ & & \qquad -\boldsymbol{\pi }^{\top }\int _{t}^{T^{*}}\Sigma _{A}(u)\sigma _{\mu }(u)B_{\kappa _{\mu }}(u,T^{*})du\,\Sigma _{\mu }\nonumber \\ & & \qquad -\boldsymbol{\pi }^{\top }\int _{t}^{T^{*}}\Sigma _{A}(u)\sigma _{\lambda }(u)B_{\kappa _{\lambda }}(u,T^{*})du\,\Sigma _{\lambda }\,.\nonumber \end{eqnarray}

The integrals in the third and fourth terms of Equation (C4) are respectively equal to

(C5) \begin{eqnarray} & & \int _{t}^{T^{*}}\Sigma _{A}(u)\sigma _{\mu }(u)B_{\kappa _{\mu }}(u,T^{*})du=\\ & & \quad \left (\begin{array}{c} \sigma _{S}\Sigma _{S}^{\top }\int _{t}^{T^{*}}\sigma _{\mu }(u)B_{\kappa _{\mu }}(u,T^{*})\,du\\ -\sigma _{r}\Sigma _{r}^{\top }\int _{t}^{T^{*}}\sigma _{\mu }(u)B_{\kappa _{r}}(u,T)B_{\kappa _{\mu }}(u,T^{*})\,du\\ \left (\begin{array}{c} -\sigma _{r}\Sigma _{r}^{\top }\int _{t}^{T^{*}}\sigma _{\mu }(u)B_{\kappa _{r}}(u,T)B_{\kappa _{\mu }}(u,T^{*})du-\Sigma _{\lambda }^{\top }\\ \times \int _{t}^{T^{*}}\sigma _{\mu }(u)\sigma _{\lambda }(u)B_{\kappa _{\mu }}(u,T^{*})B_{\kappa _{\lambda }}(u,T)\,du \end{array}\right ) \end{array}\right )\,,\nonumber \end{eqnarray}

and

(C6) \begin{eqnarray} & & \int _{t}^{T^{*}}\Sigma _{A}(u)\sigma _{\lambda }(u)B_{\kappa _{\lambda }}(u,T^{*})du=\\ & & \quad \left (\begin{array}{c} \sigma _{S}\Sigma _{S}^{\top }\int _{t}^{T^{*}}\sigma _{\lambda }(u)B_{\kappa _{\lambda }}(u,T^{*})\,du\\ -\sigma _{r}\Sigma _{r}^{\top }\int _{t}^{T^{*}}\sigma _{\lambda }(u)B_{\kappa _{r}}(u,T)B_{\kappa _{\lambda }}(u,T^{*})\,du\\ \left (\begin{array}{c} -\sigma _{r}\Sigma _{r}^{\top }\int _{t}^{T^{*}}\sigma _{\lambda }(u)B_{\kappa _{r}}(u,T)B_{\kappa _{\lambda }}(u,T^{*})du-\Sigma _{\lambda }^{\top }\\ \times \int _{t}^{T^{*}}\sigma _{\lambda }(u)^{2}B_{\kappa _{\lambda }}(u,T^{*})B_{\kappa _{\lambda }}(u,T)\,du \end{array}\right ) \end{array}\right )\,.\nonumber \end{eqnarray}

The integrals in Equations (C2), (C3), (C5) and (C6) are detailed in Appendix B.

Appendix D

In the numerical illustration, we model the initial yield curve with the Nelson-Siegel (NS) model. In this framework, initial instantaneous forward rates are provided by the following function:

\begin{eqnarray*} f(0,t):\!=-\partial _{t}\ln P(0,t) & = & b_{0}^{(r)}+\left (b_{10}^{(r)}+b_{11}^{(r)}t\right )\exp \left (-c_{1}^{(r)}t\right )\,. \end{eqnarray*}

Parameters $\{b_{0},b_{10},b_{11},c_{1}\}$ are estimated by minimizing the quadratic spread between market and model zero-coupon yields:

\begin{eqnarray*} P(0,t) & = & \exp \left (b_{0}^{(r)}+\frac{1}{t}\frac{b_{10}^{(r)}}{c_{1}^{(r)}}\left (1-e^{-c_{1}^{(r)}t}\right )+\frac{1}{t}\frac{b_{11}^{(r)}}{\left (c_{1}^{(r)}\right )^{2}}\left (1-\left (c_{1}^{(r)}t+1\right )e^{-c_{1}^{(r)}t}\right )\right )\,. \end{eqnarray*}

We fit the NS model to the yield curve of Belgian state bonds observed on the first of September 2023 and obtain estimates reported in Table D.1.

The volatility of mortality rates is fitted by least square minimization of spreads between $\sigma _{x}(.)$ and empirical deviations of variations of mortality rates by cohort (ages between 20 and 90 years from 1950 to 2020). If $\mu _{x}^{(y)}$ is the observed mortality rates at age $x$ during the calendar year $y$ , we denote by $\Delta \mu _{x}^{(y)}=\mu _{x}^{(y)}-\mu _{x-1}^{(y-1)}$ and by $S_{x}$ the standard deviation of $\Delta \mu _{x}^{(y)}$ for $y$ =1950 to 2020. The $\alpha _{\mu }$ and $\beta _{\mu }$ are obtained by minimizing the sum

\begin{equation*} \alpha _{\mu },\beta _{\mu }=\arg \min \sum _{x=20}^{90}\left (S_{x}-\alpha _{\mu }e^{\beta _{\mu }x}\right )^{2}\,. \end{equation*}

On the other hand, the initial curve of survival probabilities is described by a Makeham’s model, i.e.

\begin{eqnarray*} _{t}p_{x}^{\mu } & = & \exp -\int _{x}^{x+t}\left (a^{(\mu )}+b^{(\mu )}\left (c^{(\mu )}\right )^{s}\right )ds\\ & = & \exp ({-}a^{(\mu )}t)\exp \left (-\frac{b^{(\mu )}}{\ln c^{(\mu )}}\left (\left (c^{(\mu )}\right )^{x+t}-\left (c^{(\mu )}\right )^{x}\right )\right )\,. \end{eqnarray*}

where $a^{(\mu )},b^{(\mu )},c^{(\mu )}\in \mathbb{R}^{+}$ . These parameters and the reversion speed $\kappa _{u}$ are obtained by least square minimization of spreads between prospective and model survival probabilities. Prospective survival probabilities are computed with a Lee-Carter model fitted to Belgian mortality rates from 1950 to 2020 for 0 to 105 years, male population. Model $\,_{t}p_{x}$ are computed with Equation (25) for $x=20$ years old. Estimated parameters are provided in Table D.2.

Table D.1. Nelson-Siegel parameters, Belgian state bonds, 1/9/23

Table D.2 Mortality parameters, Belgian male mortality rates, year 2020

Footnotes

1 Source: national bank of Belgium (www.nbb.be).

2 Source: Human mortality database (www.mortality.org)

3 By relative absolute error, we mean $\frac{|\text{Exact price}-\text{PINN price}|}{\text{Exact price}}$ .

References

Al-Aradi, A., Correia, A., Jardim, G., de Freitas Naiff, D., & Saporito, Y. (2022). Extensions of the deep Galerkin method. Applied Mathematics and Computation, 430, 127287.CrossRefGoogle Scholar
Barigou, K., & Delong, L. (2022). Pricing equity-linked life insurance contracts with multiple risk factors by neural networks. Journal of Computational and Applied Mathematics, 404, 113922.CrossRefGoogle Scholar
Beck, C., Weinan, E., & Jentzen, A. (2019). Machine learning approximation algorithms for high dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations. Journal of Nonlinear Science, 29(4), 1563–1519.CrossRefGoogle Scholar
Biagini, F., Gonon, L., & Reitsam, T. (2023). Neural network approximation for superhedging prices. Mathematical Finance, 33(1), 146184.CrossRefGoogle Scholar
Buehler, H., Gonon, L., Teichmann, J., & Wood, B. (2019). Deep hedging. Quantitative Finance, 19(8), 121.CrossRefGoogle Scholar
Cai, Z. (2018). Approximating quantum many-body wave-functions using artificial neural networks. Physical Review B, 97(3), 035116.CrossRefGoogle Scholar
Carleo, G., & Troyer, M. (2017). Solving the quantum many-body problem with artificial neural networks. Science, 355(6325), 602606.CrossRefGoogle ScholarPubMed
Cuomo, S., Schiano Di Cola, V., Giampaolo, F., Rozza, G., Raissi, M., & Piccialli, F. (2022). Scientific machine learning through physics-informed neural networks: Where we are and what’s next. Journal of Scientific Computing, 92(3), 88.CrossRefGoogle Scholar
Delong, L., Dhaene, J., & Barigou, K. (2019). Fair valuation of insurance liability cash-flow streams in continuous time: Theory. Insurance: Mathematics and Economics, 88, 196208.Google Scholar
Denuit, M., Hainaut, D., & Trufin, J. (2019). Effective statistical learning for actuaries III. Neural networks and extensions. Springer Nature Switzerland.CrossRefGoogle Scholar
Doyle, D., & Groendyke, C. (2019). Using neural networks to price and hedge variable annuity guarantees. Risks, 7(1), 1.CrossRefGoogle Scholar
Gatta, F., Di Cola, V. S., Giampaolo, F., & Piccialli F.Cuomo, S. (2023). Meshless methods for American option pricing through physics-informed neural networks. Engineering Analysis with Boundary Elements, 151, 6882.CrossRefGoogle Scholar
Germain, M., Pham, H., & Warin, X. (2021). Neural networks-based algorithms for stochastic control and PDE’s in finance. In Capponi, A., & Lehalle, C.-A.(Eds.), Machine learning and data sciences for financial markets: A guide to contemporary practices. Cambridge University Press.Google Scholar
Glau, K., & Wunderlich, L. (2022). The deep parametric PDE method and applications to option pricing. Applied Mathematics and Computation, 432, 127355.CrossRefGoogle Scholar
Hainaut, D. (2023). A calendar year mortality model in continuous time. ASTIN Bulletin, 53(2), 351376.CrossRefGoogle Scholar
Hainaut, D., & Akbaraly, A. (2023). Risk management with local least-square Monte-Carlo. ASTIN Bulletin, 53(3), 489514.CrossRefGoogle Scholar
Hainaut, D., & Casas, A. (2024). Option pricing in the Heston model with physics inspired neural networks. UCLouvain working paper. https://dial.uclouvain.be/pr/boreal/object/boreal:284660.Google Scholar
Hejazi, S. A., & Jackson, K. R. (2016). A neural network approach to efficient valuation of large portfolios of variable annuities. Insurance: Mathematics and Economics, 70(p), 169181.Google Scholar
Hochreiter, S., Bengio, Y., Frasconi, P., & Schmidhuber, J. (2001) Gradient flow in recurrent nets: The difficulty of learning long-term dependencies. In S. C. Kremer & J. F. Kolen (Eds.), A field guide to dynamical recurrent neural networks. IEEE Press.Google Scholar
Hornik, K. (1991). Approximation capabilities of multilayer feedforward networks. Neural Networks, 4(2), 251257.CrossRefGoogle Scholar
Horvath, B., Teichmann, J., & Zuric, Z. (2021). Deep hedging under rough volatility. Risk, 9(7), 138.CrossRefGoogle Scholar
Jevtic, P., Luciano, E., & Vigna, E. (2013). Mortality surface by means of continuous time cohort models. Insurance: Mathematics and Economics, 53(1), 122133.Google Scholar
Jiang, Q., Sirignano, J., & Cohen, S. (2023). Global convergence of Deep Galerkin and PINNs method for solving PDE. arXiv preprint.Google Scholar
Lee, H., & Kang, I. S. (1990). Neural algorithm for solving differential equations. Journal of Computational Physics, 91(1), 110131.CrossRefGoogle Scholar
Luciano, E., & Vigna, E. (2005). Non mean reverting affine processes for stochastic mortality. ICER working paper 4/2005.CrossRefGoogle Scholar
Luciano, E., & Vigna, E. (2008). Mortality risk via affine stochastic intensities: Calibration and empirical relevance. MPRA Paper No. 59627.Google Scholar
Milevsky, M., & Promislow, S. (2001). Mortality derivatives and the option to annuitise. Insurance: Mathematics and Economics, 29(3), 299318.Google Scholar
Raissi, M., Perdikaris, P., & Karniadakis, G. E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378, 686707.CrossRefGoogle Scholar
Sirignano, J., & Spiliopoulos, K. (2018). DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics, 375, 13391364.CrossRefGoogle Scholar
van Merriënboer, B., Breuleux, O., & Bergeron, A. (2018). Automatic differentiation in ML: Where we are and where we should be going. In 32th NeurIPS proceedings. Advances in Neural Information Processing Systems, 31 (pp. 111).Google Scholar
Weinan, E., Han, J., & Jentzen, A. (2017). Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Communications in Mathematics and Statistics, 5(4), 349380.Google Scholar
Zeddouk, F., & Devolder, P. (2020). Mean reversion in stochastic mortality: Why and how? European Actuarial Journal, 10(2), 499525.CrossRefGoogle Scholar
Figure 0

Figure 1 Feed forward network with skip connections toward intermediate layers.

Figure 1

Table 1. Financial and biometric parameters

Figure 2

Table 2. Model parameters and features of contracts

Figure 3

Table 3. Learning rate in function of Epochs

Figure 4

Table 4. Information about training time and loss values after training for three neural configurations

Figure 5

Table 5. Relative errors computed on the 100 000 contracts of the training set $\tilde{\mathcal{S}}_{\mathcal{D}}$ and on a validation set of same size

Figure 6

Figure 2 Comparison of exact and PINN prices of a $T^{*}=T=8$ years GMAB with: $S_{0}=100$, $\boldsymbol{\pi }=(0.50,0.25,0.25)$, $(r_{0},\mu _{0},\lambda _{0})$ of Table 2. Upper plots: values at issuance and expiry, for $A_{t}\in [23.82,414.15]$. Lower plots: values at issuance with $A_{0}=100$ and 90, for $r_{0}\in [0.00,0.08]$.

Figure 7

Table 6. Mean and standard deviations (std) of relative errors for contracts grouped by asset values

Figure 8

Table 7. Mean and standard deviations (std) of relative errors for contracts grouped by interest rates

Figure 9

Figure 3 Left plot: heatmap of relative errors per asset and maturity classes. Right plot: heatmap of relative errors per interest rate and maturity classes.

Figure 10

Table 8. Mean and standard deviations (std) of relative errors for contracts grouped by mortality rates $\mu _{x+t}$

Figure 11

Table 9. Mean and standard deviations (std) of relative errors for contracts grouped by hedging mortality rates $\lambda _{x+t}$

Figure 12

Figure 4 Left plot: heatmap of relative errors per $\mu _{x+t}$ (in $^{\circ }$/oo) and maturity classes. Right plot: heatmap of relative errors per $\lambda _{x+t}$ (in $^{\circ }$/oo) and maturity classes.

Figure 13

Table 10. Mean and standard deviations (std) of relative errors for contracts grouped by fraction of stocks

Figure 14

Table 11. Mean and standard deviations (std) of relative errors for contracts grouped by fraction of bonds

Figure 15

Figure 5 Heatmap of relative errors per $\pi _{S}$ and $(\pi _{P}+\pi _{D})$ asset allocation classes.

Figure 16

Figure 6 Comparison of exact and PINN prices of a 5- and 8-year GMAB with: $S_{0}=100$, $\pi _{S}\in [0,1]$ and $\pi _{D}=\pi _{M}=\frac{1-\pi _{S}}{2}$. $(r_{0},\mu _{0},\lambda _{0})$ of Table 2. Upper plots: values at issuance and expiry, for $A_{t}\in [23.82,414.15]$. Lower plots: values at issuance with $A_{0}=100$ and 90, for $r_{0}\in [0.00,0.08]$.

Figure 17

Table D.1. Nelson-Siegel parameters, Belgian state bonds, 1/9/23

Figure 18

Table D.2 Mortality parameters, Belgian male mortality rates, year 2020