1. Introduction
Exposure to the uncertain dynamics of volatility is a desirable feature of most trading strategies and has naturally generated wide interest in volatility derivatives. From a theoretical viewpoint, an adequate financial model should reproduce the volatility dynamics accurately and consistently with those of the asset price; any discrepancy may otherwise lead to arbitrage opportunities. Despite extensive research, implied volatility surfaces from options on the VIX and the S&P 500 index still display discrepancies, betraying the lack of a proper modelling framework. This issue is well known as the SPX–VIX joint calibration problem and has motivated a number of creative modelling innovations in the past fifteen years. Reconciling both implied volatilities requires additional factors to enrich the variance curve dynamics, as argued by Bergomi [Reference Bergomi13, Reference Bergomi14], who proposed the multi-factor model
for the forward variance, with $W^1,\ldots, W^N$ correlated Brownian motions and coefficients $c_1,\kappa_1,\ldots, c_N, \kappa_N \gt 0$. Gatheral [Reference Gatheral26] recognised the importance of the additional factor to disentangle different aspects of the implied volatility and to allow humps in the variance curve, and introduced a mean-reverting version—the double CEV model—where the instantaneous mean of the variance follows a CEV model itself. Although promising, these attempts fell short of reproducing jointly the short-time behaviour of the SPX and VIX implied volatilities. A variety of new models were suggested to tackle this issue, both with continuous paths [Reference Barletta, Nicolato and Pagliarani7, Reference Fouque and Saporito24, Reference Goutte, Ismail and Pham29] and with jumps [Reference Baldeaux and Badran6, Reference Carr and Madan18, Reference Cont and Kokholm19, Reference Kokholm and Stisen40, Reference Pacati, Pompa and Reno43, Reference Papanicolaou and Sircar47], incorporating novel ideas and increased complexity such as regime-switching volatility dynamics. Model-free bounds were also obtained in [Reference De Marco and Henry-Labordère21, Reference Guyon31, Reference Guyon, Menegaux and Nutz32, Reference Papanicolaou46], shedding light on the links between VIX and SPX and the difficulty of capturing them both simultaneously.
Getting rid of the restraining Markovian assumption that burdens classical stochastic volatility models has permitted the emergence of rough volatility models, which consistently agree with stylised facts under both the historical and the pricing measures [Reference Alòs, León and Vives4, Reference Bayer8, Reference Bayer, Friz and Gatheral9, Reference Bennedsen, Lunde and Pakkanen11, Reference Fukasawa25, Reference Gatheral, Jaisson and Rosenbaum27]. A large portion of the toolbox developed for Markovian diffusion models is not available any longer, and asymptotic methods [Reference Guennoun, Jacquier, Roome and Shi30, Reference Horvath, Jacquier and Lacombe33, Reference Jacquier, Pakkanen and Stone36, Reference Jacquier and Pannier37]—and more recently path-dependent PDE methods [Reference Bayer, Qiu and Yao10, Reference Bonesini, Jacquier and Pannier16, Reference Jacquier and Zuric38, Reference Pannier44, Reference Pannier and Salvi45, Reference Viens and Zhang50]—thus play a prominent role in understanding the theoretical properties and numerical aspects of these models. Since the fit of the spot implied volatility skew is extremely accurate under this class of models [Reference Gatheral, Jaisson and Rosenbaum27], it seems reasonable to expect good results when calibrating VIX options. Moreover, the newly established hedging formula by Viens and Zhang [Reference Viens and Zhang50] shows that a rough volatility market is complete if it also contains a proxy of the volatility of the asset; this acts as an additional motivation for our work. Still, [Reference Jacquier, Martini and Muguruza35] showed that the rough Bergomi model is too close to lognormal to jointly calibrate both markets. Its younger sister [Reference Horvath, Jacquier and Tankov34] added a stochastic volatility-of-volatility component, generating a smile sandwiched between the bid–ask prices when calibrating VIX, but the joint calibration is not provided. By incorporating a Zumbach effect, the quadratic rough Heston model [Reference Gatheral, Jaisson and Rosenbaum28] achieves good results for the joint calibration at one given date. Further numerical methods were developed in [Reference Bonesini, Callegaro and Jacquier15, Reference Bourgey and De Marco17, Reference Rosenbaum and Zhang49]. However, the lack of analytical tractability of rough volatility models is holding back the progress of theoretical results on the VIX, with the notable exception of large deviations results from [Reference Forde, Gerhold and Smith23, Reference Lacombe, Muguruza and Stone41] and the small-time asymptotics of [Reference Alòs, García-Lorite and Muguruza2].
In the latter, $\mathcal{F}_T$ -measurable random variables (with volatility derivatives in mind) are written in the form of exponential martingales thanks to the Clark–Ocone formula, allowing the application of established asymptotic methods from [Reference Alòs, León and Vives4]. An expression for the short-time limit at-the-money (ATM) implied volatility skew is derived, yielding an analytical criterion that a model should satisfy to reproduce the correct short-time behaviour. The proposed mixed rough Bergomi model does meet the requirement of positive skew of the VIX implied volatility, backing its implementation with theoretical evidence. And indeed, the fits are rather satisfying. This model is built by replacing the exponential kernels of the Bergomi model (1) ( $t\mapsto \mathrm{e}^{-\kappa t}$ ) with fractional kernels of the type $t\mapsto t^{H-\frac{1}{2}}$ with $H\in(0,\frac{1}{2})$ , but is limited to a single factor, i.e. $W^1=W^2$ . As a result, numerical computations under this model induce a linear smile, or equivalently a null curvature, unfortunately inconsistent with market observations. To remedy this, we incorporate Bergomi’s and Gatheral’s insights on multi-factor models (integrated by [Reference De Marco20, Reference Lacombe, Muguruza and Stone41] into rough volatility models) and extend [Reference Alòs, García-Lorite and Muguruza2] to the multi-factor case; we also compute the short-time ATM implied volatility curvature, deriving a second criterion for a more accurate model choice. In summary, the present paper goes beyond [Reference Alòs, García-Lorite and Muguruza2] in three ways:
• We consider multi-factor models, which are far more efficient for VIX calibration but complicate the analysis.
• We compute the second derivative of the implied volatility to discriminate better between models; this turns out to be considerably more technical than the skew.
• We provide detailed proofs of all of our results at three levels: abstract model, generic rough volatility model for the VIX, and two-factor rough Bergomi model, checking carefully that all the assumptions are satisfied, proving technical lemmas applicable to our setting, and exhibiting definite formulas at all three levels of generality.
We gather in Section 2 our abstract framework and assumptions. The main results, which provide the short-time limits of the implied volatility level, skew, and curvature, are contained in Section 3. Our framework covers a wide range of underlying assets, including VIX (Section 4) and stock options (Section 5); see in particular Propositions 1 and 4. We further provide a detailed analysis of the two-factor rough Bergomi model (1). Closed-form expressions that depend explicitly on the parameters of the model are provided in Proposition 3 for the VIX and Corollary 1 for the stock. These expressions give insight into the interplay between the different parameters, and make the calibration task easier by allowing us to fit some stylised facts before performing numerical computations. For instance, different combinations of parameters can yield positive or negative curvature. All the proofs are gathered in Section 6, starting with useful lemmas and then following the order of the sections.
Notation. For an integer $N\in\mathbb{N}$ and a vector $\mathbf{x}\in\mathbb{R}^N$ , we define $|\mathbf{x}| \,:\!=\, \sum_{i=1}^{N}x_i$ and ${\left\|{\mathbf{x}}\right\|}^2 \,:\!=\, \sum_{i=1}^{N}x_i^2$ . We fix a finite time horizon $T \gt 0$ and let $\mathbb{T}\,:\!=\,[0,T]$ . For all $p\ge1$ , $L^p$ stands for the space $L^p(\Omega)$ for some reference sample space $\Omega$ . As we consider rough volatility models, the Hurst parameter $H\in (0,\frac{1}{2})$ is a fundamental quantity and we shall write $H_+\,:\!=\,H+\frac{1}{2}$ and $H_-\,:\!=\,H-\frac{1}{2}$ .
2. Framework
We consider a square-integrable strictly positive process $(A_t)_{t\in\mathbb{T}}$, adapted to the natural filtration $(\mathcal{F}_t)_{t\in\mathbb{T}}$ of an N-dimensional Brownian motion $\mathbf{W}=(W^1,...,W^N)$ defined on a probability space $(\Omega,\mathcal{F},\mathbb{P})$. We further introduce the true $(\mathcal{F}_t)_{t\in\mathbb{T}}$-martingale conditional expectation process $M_t \,:\!=\, \mathbb{E}[A_T\,|\,\mathcal{F}_t] = \mathbb{E}_t[A_T]$, for $t\in\mathbb{T}$, where $\mathbb{E}_t$ denotes the conditional expectation with respect to $\mathcal{F}_t$.
The set $\mathbb{D}^{1,2}$ will denote the domain of the Malliavin derivative operator D with respect to the Brownian motion $\mathbf{W}$ , while $\mathrm{D}^i$ indicates the Malliavin derivative operator with respect to $W^i$ . It is well known that $\mathbb{D}^{1,2}$ is a dense subset of $L^{2}(\Omega)$ and that D is a closed and unbounded operator from $L^{2}(\Omega)$ into $L^{2}(\mathbb{T}\times\Omega)$ . Analogously we define the sets of Malliavin differentiable processes $\mathbb{L}^{n,2}\,:\!=\,L^{2}(\mathbb{T};\mathbb{D}^{n,2})$ . We refer to [Reference Nualart42] for more details on Malliavin calculus. Assuming $A_T\in\mathbb{D}^{1,2}$ , the Clark–Ocone formula [Reference Nualart42, Theorem 1.3.14] reads, for each $t\in\mathbb{T}$ ,
where each component of $\mathbf{m}$ is $m^i_s\,:\!=\,\mathbb{E}\left[\mathrm{D}^{i}_s A_T |\mathcal{F}_s\right]$ . Since M is a martingale, we may rewrite (2) as
where $\boldsymbol\phi_{s} \,:\!=\, \mathbf{m}_s / M_s$ is defined whenever $M_s\neq0$ almost surely. If $\boldsymbol\phi=(\phi^1,...,\phi^N)$ belongs to $\mathbb{L}^{n,2}$ , then the following processes are well defined for all $t \lt T$ :
Note that all the processes depend implicitly on T, which will be crucial when we study the short-time limit as T tends to zero.
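Explicitly, combining the Clark–Ocone formula with the definition of $\boldsymbol\phi$, the martingale M admits the representation
$$ M_t = \mathbb{E}[A_T] + \sum_{i=1}^{N}\int_0^t m^i_s\, \mathrm{d}W^i_s = \mathbb{E}[A_T] + \sum_{i=1}^{N}\int_0^t M_s\, \phi^i_s\, \mathrm{d}W^i_s, \qquad t\in\mathbb{T}, $$
or, in differential form, $\mathrm{d}M_t = M_t\, \boldsymbol\phi_t \cdot \mathrm{d}\mathbf{W}_t$.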
2.1. Level, skew, and curvature
Since M is a strictly positive martingale process, we can use it as an underlying to introduce options. A standard practice is to work with its logarithm $\mathfrak{M} \,:\!=\,\log(M)$, so that $\mathfrak{M}_T = \log\mathbb{E}_T[A_T] = \log(A_T)$ and $\mathfrak{M}_0 = \log\mathbb{E}[A_T]$. Under no-arbitrage arguments, the price $\Pi_t$ at time t of a European call option with maturity T and log-strike $k\geq 0$ is equal to $\Pi_t(k) = \mathbb{E}_t\big[(A_T - \mathrm{e}^{k})^+\big]$,
and the ATM value is denoted by $\Pi_t\,:\!=\,\Pi_t(\mathfrak{M}_0) = \mathbb{E}_t[(A_T - M_t)^+]$. We adapt the usual definitions of ATM implied volatility level, skew, and curvature to the case where the underlying is a general process (later specified for the VIX and the S&P). Denote by $\mathrm{BS}(t,x,k,\sigma)$ the Black–Scholes price of a European call option at time $t\in\mathbb{T}$, with maturity T, log-stock x, log-strike k, and volatility $\sigma$. Its closed-form expression reads
$$ \mathrm{BS}(t,x,k,\sigma) = \mathrm{e}^{x}\,\mathcal{N}\big(d_{+}(x,k,\sigma)\big) - \mathrm{e}^{k}\,\mathcal{N}\big(d_{-}(x,k,\sigma)\big), $$
with $d_{\pm }(x,k,\sigma) \,:\!=\,\frac{x-k}{\sigma \sqrt{T-t}}\pm\frac{\sigma \sqrt{T-t}}{2}$, where $\mathcal{N}$ denotes the Gaussian cumulative distribution function.
Definition 1.
• For any $k\in\mathbb{R}$ , the implied volatility $\mathcal{I}_{T}(k)$ is the unique non-negative solution to $\Pi_0(k)=\mathrm{BS}\big(0,\mathfrak{M}_0, k, \mathcal{I}_{T}(k)\big)$ ; we omit the k-dependence when considering it ATM ( $k=\mathfrak{M}_0$ ).
• The ATM implied skew $\mathcal{S}$ and curvature $\mathcal{C}$ at time zero are defined as
$$ \mathcal{S}_{T}\,:\!=\,\partial_{k} \mathcal{I}_{T}(k)\big|_{k=\mathfrak{M}_0} \qquad\text{and}\qquad \mathcal{C}_{T}\,:\!=\, \partial_{k}^2 \mathcal{I}_{T}(k)\big|_{k=\mathfrak{M}_0}. $$
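In practice, these quantities are estimated by inverting the Black–Scholes formula and differencing around the ATM log-strike. The following is a minimal numerical sketch (not part of the paper's analysis), assuming a pricing routine `price_fn` for $k\mapsto\Pi_0(k)$, e.g. a Monte Carlo estimator, is available:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def bs_call(x, k, sigma, T):
    """Black-Scholes call price in the log-variables of the text (zero rates)."""
    s = sigma * np.sqrt(T)
    d_plus = (x - k) / s + s / 2.0
    d_minus = d_plus - s
    return np.exp(x) * norm.cdf(d_plus) - np.exp(k) * norm.cdf(d_minus)

def implied_vol(price, x, k, T):
    """Invert the Black-Scholes formula for the implied volatility I_T(k)."""
    return brentq(lambda sig: bs_call(x, k, sig, T) - price, 1e-8, 10.0)

def atm_level_skew_curvature(price_fn, x0, T, dk=1e-3):
    """ATM implied volatility level, skew and curvature by central differences."""
    ks = (x0 - dk, x0, x0 + dk)
    iv = [implied_vol(price_fn(k), x0, k, T) for k in ks]
    skew = (iv[2] - iv[0]) / (2.0 * dk)
    curvature = (iv[2] - 2.0 * iv[1] + iv[0]) / dk ** 2
    return iv[1], skew, curvature
```

The curvature estimate is sensitive to the choice of dk and of the strikes used around the ATM point, an issue we return to in Remark 6.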
2.2. Examples
The framework (3) encompasses a large class of models, including stochastic volatility models ubiquitous in quantitative finance. Consider a stock price process $(S_t)_{t\in\mathbb{T}}$ satisfying $\mathrm{d}S_t = S_t \sqrt{v_t}\, \mathrm{d}B_t$, with $B \,:\!=\, \sum_{i=1}^{N} \rho_i W^i$,
where v is a stochastic process adapted to $(\mathcal{F}_t)_{t\in\mathbb{T}}$ , $\boldsymbol\rho\,:\!=\,(\rho_1,\cdots,\rho_N)\in[{-}1,1]^N$ with $\boldsymbol\rho\boldsymbol\rho^\top =1$ .
2.2.1. Asset price
For $N=2$ , the model (3) corresponds to a one-dimensional stochastic volatility model under the identification $A=M=S$ , $\phi^1= \rho_1 \sqrt{v}$ , and $\phi^2 = \rho_2 \sqrt{v}$ , and v is a process driven by $W^1$ . Our analysis generalises [Reference Alòs, León and Vives4, Equation (2.1)] to the multi-factor case (in the continuous-path case). We refer to Section 5 for the details in the multi-factor setting and the analysis of the implied volatility.
2.2.2. VIX
The VIX is defined as $\mathrm{VIX}_{T}=\sqrt{\frac{1}{\Delta}\int_T^{T+\Delta}\mathbb{E}_T[v_t] \mathrm{d}t}$, where $\Delta$ is one month. The representation (2) yields that the underlying is the VIX future $M_t = \mathbb{E}_t[\mathrm{VIX}_T]$, for $t\in\mathbb{T}$.
2.2.3. Asian options
For Asian options, the process of interest is $\mathcal{A}_T\,:\!=\,\frac{1}{T}\int_0^T S_t \mathrm{d}t$ . Using (2) we find
2.2.4. Multi-factor rough Bergomi
Rough volatility models can be written as $v_t=f(\mathbf{W}^H_t)$ , where $\mathbf{W}^H$ is an N-dimensional fractional Brownian motion with correlated components and $f\,:\,\mathbb{R}^N\to\mathbb{R}$ . For instance, in the two-factor rough Bergomi model,
with $\chi \in (0,1)$ , $\nu, \eta, v_0 \gt 0$ . In Example 2.2.2 we set $A=\mathrm{VIX}$ and hence $N=2$ , but in the asset price case we set $A=S$ and therefore $N=3$ even though the variance depends on only two factors.
2.3. General assumptions
We introduce the following broad assumptions, which are key to our entire analysis; in Section 4 we provide sufficient conditions to simplify them in the VIX case:
(H1) $A\in \mathbb{L}^{4,p}$ .
(H2) $\displaystyle\frac{1}{M_t}\in L^p$ , for all $p \gt 1$ , and all $t\in\mathbb{T}$ .
(H3) The term $\displaystyle \mathbb{E}_t\left[ \int_{t}^{T}\frac{|\boldsymbol\Theta_{s}|}{\mathfrak{u}_{s}^2}\mathrm{d}s\right]$ is well defined for all $t\in\mathbb{T}$ .
(H4) The term $\displaystyle \frac{1}{\sqrt{T}} \mathbb{E}\left[ \int_0^T \frac{|\boldsymbol\Theta_s|}{\mathfrak{u}_s^2} \mathrm{d}s\right]$ tends to zero as T tends to zero.
(H5) There exists $p\ge1$ such that $\sup_{T\in[0,1]}\mathfrak{u}_0^p \lt \infty$ almost surely and, for all random variables $Z\in L^p$ and all $i\in [\![ 1,N ]\!]$ , the following terms are well defined and tend to zero as T tends to zero:
$$ \int_0^T \mathbb{E}\left[Z \left( \mathbb{E}_s \left[ \frac{1}{u_0} \int_0^T \mathrm{D}^i_s {\left\|{\boldsymbol\phi_r}\right\|}^2 \mathrm{d}r \right]\right)^2\right] \mathrm{d}s. $$
There exists $\lambda\in({-}\frac{1}{2},0]$ such that the following hold:
$\boldsymbol{(\mathrm{H}}_{\boldsymbol{6}}^{\lambda}\boldsymbol{)}$ The following expressions converge to zero as T tends to zero:
$\boldsymbol{(\mathrm{H}}_{\boldsymbol{7}}^{\lambda}\boldsymbol{)}$ The random variable $\mathfrak{K}_T\,:\!=\,\displaystyle \frac{\int_0^T |\boldsymbol\Theta_s|\mathrm{d}s}{T^{\frac{1}{2}+\lambda} \mathfrak{u}_{0}^{3}}$ is such that $\mathbb{E}[\mathfrak{u}_0^2 \mathfrak{K}_T]$ tends to zero and $\mathbb{E}[\mathfrak{K}_T]$ has a finite limit as T tends to zero.
There exists $\gamma\in({-}1,0]$ such that the following hold:
$\boldsymbol{(\mathrm{H}}_{\boldsymbol{8}}^{\gamma}\boldsymbol{)}$ The following expressions converge to zero as T tends to zero:
$\boldsymbol{(\mathrm{H}}_{\boldsymbol{9}}^{\gamma}\boldsymbol{)}$ The random variables
are such that $\mathbb{E}\big[(\mathfrak{u}_0^6+\mathfrak{u}_0^4+\mathfrak{u}_0^2) \mathfrak{H}^1_T + (\mathfrak{u}_0^4+\mathfrak{u}_0^2) \mathfrak{H}_T^2 \big]$ tends to zero and both $\mathbb{E}[\mathfrak{H}^1_T]$ and $\mathbb{E}[\mathfrak{H}^2_T]$ have a finite limit as T tends to zero.
Remark 1.
• $\boldsymbol{(\mathrm{H}_{1})}$ requires A to be four times Malliavin differentiable. This is necessary to prove the curvature formula, as the proof relies on the Clark–Ocone formula (2) and on three applications of the anticipative Itô formula.
• When the underlying is the stock price (as in Section 2.2.1), it satisfies Equation (3) where $\phi$ corresponds to its volatility $\sqrt{v}$ . One can then directly make assumptions on the variance process, as in [Reference Alòs and León3–Reference Alòs and Shiraya5]. We make this explicit in Proposition 4 for example. In the case of the VIX (Section 4.1) we refrain from doing the same, since $\phi$ is much more intricate. Nevertheless, sufficient conditions are given by $\boldsymbol{(\overline{\mathrm{C}}}\boldsymbol{)}$ .
3. Main results
We gather here our main asymptotic results for the general framework above, with the proofs postponed to Section 6.2 to ease the flow. The first theorem states that the small-time limit of the implied volatility is equal to the limit of the forward volatility. This is well known for Markovian stochastic volatility models [Reference Alòs and Shiraya5, Reference Berestycki, Busca and Florent12] and in a one-factor setting [Reference Alòs, García-Lorite and Muguruza2]. To streamline the call to the assumptions, we shall group them using mixed subscript notation; for example $\boldsymbol{(\mathrm{H}_{123})}$ corresponds to $\boldsymbol{(\mathrm{H}_{1})}$ - $\boldsymbol{(\mathrm{H}_{2})}$ - $\boldsymbol{(\mathrm{H}_{3})}$ , and we further write $\boldsymbol{(\overline{\mathrm{H}}}^{\lambda}\boldsymbol{)}$ to mean $\boldsymbol{(\mathrm{H}_{12345})}$ - $\boldsymbol{(\mathrm{H}}_{\boldsymbol{67}}^{\lambda}\boldsymbol{)}$ and $\boldsymbol{(\overline{\mathrm{H}}}^{\lambda\gamma}\boldsymbol{)}$ as short for $\boldsymbol{(\mathrm{H}_{12345})}$ - $\boldsymbol{(\mathrm{H}}_{\boldsymbol{67}}^{\lambda}\boldsymbol{)}$ - $\boldsymbol{(\mathrm{H}}_{\boldsymbol{89}}^{\gamma}\boldsymbol{)}$ .
Theorem 1. If $\boldsymbol{(\mathrm{H}_{12345})}$ hold, then
Note that we did not assume the limit of $\mathbb{E}[u_0]$ to be finite. The proof, in Section 6.2.1, builds on arguments from [Reference Alòs and Shiraya5, Proposition 3.1]. We then turn our attention to the ATM skew, defined in Definition 1. This short-time asymptotic is reminiscent of [Reference Alòs, León and Vives4, Proposition 6.2] and [Reference Alòs, García-Lorite and Muguruza2, Theorem 8].
Theorem 2. If there exists $\lambda\in({-}\frac{1}{2},0]$ such that $\boldsymbol{(\overline{\mathrm{H}}}^{\lambda}\boldsymbol{)}$ are satisfied, then
Note that (7) still holds without $\boldsymbol{(\mathrm{H}}_{\boldsymbol{7}}^{\lambda}\boldsymbol{)}$ , but in that case both sides are infinite. In the rough-volatility setting of Section 2.2.1 with $v_t=f(\mathbf{W}^H_t)$ , $\lambda$ corresponds to $H-\frac{1}{2}$ so that (7) matches the slope of the observed ATM skew of SPX implied volatility. We prove this theorem in Section 6.2.2. We also provide the short-term curvature, in the following theorem, which is proved in Section 6.2.3.
Theorem 3. If there exist $\lambda\in({-}\frac{1}{2},0]$ and $\gamma\in({-}1,\lambda]$ ensuring $\boldsymbol{(\overline{\mathrm{H}}}^{\lambda\gamma}\boldsymbol{)}$ , then
The limit still holds without $\boldsymbol{(\mathrm{H}}_{\boldsymbol{9}}^{\gamma}\boldsymbol{)}$ , but in that case the second and third terms are infinite.
Note that $\boldsymbol{(\mathrm{H}}_{\boldsymbol{7}}^{\lambda}\boldsymbol{)}$ with $\lambda\ge\gamma$ guarantees that $T^{-\gamma} \mathcal{S}_T$ converges. By Theorem 2,
4. Asymptotic results in the VIX case
As advertised, our framework includes the VIX case where
for $v_r\in\mathbb{D}^{3,2}$ for all $r\in[0,T+\Delta]$ , and we provide simple sufficient conditions for $\boldsymbol{(\overline{\mathrm{H}}}^{\lambda\gamma}\boldsymbol{)}$ to hold.
4.1. A generic volatility model
Consider the following four conditions, which we gather under the notation $\boldsymbol{(\overline{\mathrm{C}}}\boldsymbol{)}$ . There exist $H \in (0,\frac{1}{2})$ and $X\in L^p$ for all $p \gt 1$ such that the following hold:
$\boldsymbol{(\mathrm{C}_{1})}$ For all $t\ge0$, $\frac{1}{M_t} \le X$ almost surely.
$\boldsymbol{(\mathrm{C}_{2})}$ For all $i,j,k\in [\![ 1, N ]\!]$ and $t\le s\le y\le T \le r$, we have, almost surely:
• $v_r\le X$ ,
• $\mathrm{D}^i_y v_r \le X (r-y)^{H_{-}}$ ,
• $\mathrm{D}^j_s \mathrm{D}^i_y v_r \le X (r-s)^{H_{-}} (r-y)^{H_{-}}$ ,
• $\mathrm{D}^k_t \mathrm{D}^j_s \mathrm{D}^i_y v_r \le X (r-t)^{H_{-}} (r-s)^{H_{-}} (r-y)^{H_{-}}$ .
$\boldsymbol{(\mathrm{C}_{3})}$ For all $p \gt 1$, $\mathbb{E}[u_s^{-p}]$ is uniformly bounded in s and T, with $s\le T$.
$\boldsymbol{(\mathrm{C}_{4})}$ For all $i,j,k\in [\![1, N ]\!]$ and $r\ge0$, the mappings $y\mapsto\mathrm{D}^i_y v_r$, $s\mapsto \mathrm{D}_s^j \mathrm{D}^i_y v_r$, and $t\mapsto \mathrm{D}^k_t \mathrm{D}^j_s \mathrm{D}^i_y v_r$ are almost surely continuous in a neighbourhood of zero.
Recall the notation $H_-$ and $H_+$ from the introduction. We compute the level, skew, and curvature of the VIX implied volatility in a model which satisfies the sufficient conditions. Let us define the following limits:
Proposition 1. Under $\boldsymbol{(\overline{\mathrm{C}}}\boldsymbol{)}$ , the following limits hold:
Remark 2. Our results stand under the fairly general set of assumptions $\boldsymbol{(\overline{\mathrm{C}}}\boldsymbol{)}$. If v is a reasonably well-behaved function of an N-dimensional Gaussian Volterra process $(W^{1,H},\cdots, W^{N,H})$, then these should be relatively easy to check, as Proposition 2 suggests. For other rough stochastic volatility models, such as the rough Heston model [Reference El Euch and Rosenbaum22, Reference Jaisson and Rosenbaum39], it might be harder to verify the assumptions. Indeed, the latter is not even known to be Malliavin differentiable to this day, and thus does not lie within the scope of the present study.
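For instance, in the one-factor case $v_r = f(W^{H}_r)$ with $W^{H}_r = \int_0^r (r-s)^{H_{-}}\, \mathrm{d}W^1_s$ and f three times continuously differentiable, the chain rule for the Malliavin derivative gives, for $y\le r$ and $s\le r$,
$$ \mathrm{D}^1_y v_r = f'(W^{H}_r)\, (r-y)^{H_{-}}, \qquad \mathrm{D}^1_s \mathrm{D}^1_y v_r = f''(W^{H}_r)\, (r-s)^{H_{-}} (r-y)^{H_{-}}, $$
and similarly for the third derivative, so that the bounds in $\boldsymbol{(\mathrm{C}_{2})}$ hold as soon as f and its first three derivatives are dominated by a random variable lying in every $L^p$; for exponential functions of Gaussian processes this domination follows from Lemma 7 below.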
We split the proof into two steps, collected in Section 6.3. First we show that $\boldsymbol{(\mathrm{C}_{1})}$ , $\boldsymbol{(\mathrm{C}_{2})}$ , $\boldsymbol{(\mathrm{C}_{3})}$ are sufficient to apply our main theorems, as they imply $\boldsymbol{(\overline{\mathrm{H}}}^{\lambda\gamma}\boldsymbol{)}$ for any $\lambda\in({-}\frac{1}{2},0]$ and $\gamma\in({-}1, 3H-\frac{1}{2}]$ . Thanks to $\boldsymbol{(\mathrm{C}_{4})}$ we can also compute the limits—after a rigorous statement of convergence results—starting with $\mathcal{I}_T$ and the skew with $\lambda=0$ . Restricting H to $(0,1/6)$ , which is the most relevant regime for rough volatility models, we can set $\gamma=3H-\frac{1}{2} \lt \lambda$ and compute the short-time curvature, with only the second term in $\boldsymbol{(\mathrm{H}}_{\boldsymbol{9}}^{\gamma}\boldsymbol{)}$ contributing to the limit. The curvature limit in Proposition 1 is finite by the last item of $\boldsymbol{(\mathrm{C}_{2})}$ .
Remark 3. In the regime $H\in[1/6,1/2)$ , the rescaling becomes $\gamma=0$ , and many more terms that would just vanish when $H \lt 1/6$ now make a non-trivial contribution in the limit. Informally (that is, without a proof), the limit reads
4.2. The two-factor rough Bergomi
We consider the two-factor exponential model
where $H\in(0,\frac{1}{2}]$, $W^{i,H}_t= \int_0^t (t-s)^{H_{-}} \mathrm{d} W^i_s$, $W^1,W^2$ are independent Brownian motions, the Wick exponential is defined as $\mathcal{E}(X)\,:\!=\,\exp\{X-\frac{1}{2} \mathbb{E}[X^2]\}$ for any random variable X, and $\chi\in[0,1]$, $\overline{\chi}\,:\!=\,1-\chi$, $v_0,\nu,\eta \gt 0$, $\rho\in[{-}1,1]$, $\overline{\rho}=\sqrt{1-\rho^2}$. This model extends the Bergomi model [Reference Bergomi14], replacing the exponential kernel with a fractional one, and extends the rough Bergomi model [Reference Bayer, Friz and Gatheral9] to the two-factor case. It combines Bergomi’s insights on the need for several factors with the benefits of rough volatility. As proved in Section 6.4.1, it satisfies our conditions.
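For numerical experiments, the variance process can be simulated directly. The sketch below is not the authors' scheme: it assumes the mixed specification $v_t = v_0 \big(\chi\, \mathcal{E}(\nu W^{1,H}_t) + \overline{\chi}\, \mathcal{E}\big(\eta(\rho W^{1,H}_t + \overline{\rho} W^{2,H}_t)\big)\big)$, consistent with the parameters listed above and with the special cases of Remark 4, and it uses a crude Riemann discretisation of the Volterra integral (a hybrid-type scheme would be more accurate).

```python
import numpy as np

def simulate_two_factor_rough_bergomi(H=0.1, chi=0.5, nu=1.2, eta=0.8, rho=-0.3,
                                      v0=0.04, T=0.5, n_steps=500, n_paths=2000,
                                      seed=0):
    """Simulate paths of the variance v_t on [0, T] under the assumed mixed
    specification; W^{i,H}_t = int_0^t (t-s)^{H-1/2} dW^i_s is approximated by a
    truncated Riemann sum over the Brownian increments."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    t = np.arange(1, n_steps + 1) * dt
    dW1 = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
    dW2 = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)

    # Kernel matrix K[j, l] = (t_j - t_l)^{H - 1/2} for l < j (no singular diagonal term)
    K = np.zeros((n_steps, n_steps))
    for j in range(1, n_steps):
        K[j, :j] = (t[j] - t[:j]) ** (H - 0.5)
    W1H = dW1 @ K.T
    W2H = dW2 @ K.T

    var_WH = t ** (2 * H) / (2 * H)        # Var[W^{i,H}_t] = t^{2H} / (2H)
    rho_bar = np.sqrt(1.0 - rho ** 2)
    wick1 = np.exp(nu * W1H - 0.5 * nu ** 2 * var_WH)
    wick2 = np.exp(eta * (rho * W1H + rho_bar * W2H) - 0.5 * eta ** 2 * var_WH)
    v = v0 * (chi * wick1 + (1.0 - chi) * wick2)
    return t, v                            # v has shape (n_paths, n_steps)
```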
Proposition 2. If $\rho\in({-}\sqrt{2}/2,1]$ , the model (10) satisfies $\boldsymbol{(\overline{\mathrm{C}}}\boldsymbol{)}$ .
The restriction of the range of $\rho$ is equivalent to $\rho+\overline{\rho} \gt 0$ , a necessary requirement in the proof. Proposition 1 therefore applies and we obtain the following limits, as proved in Section 6.4.2.
Proposition 3. Let $\psi(\rho,\nu,\eta,\chi)\,:\!=\,\sqrt{ (\chi\nu+\overline{\chi}\eta\rho)^2+ \overline{\chi}^2\eta^2\overline{\rho}^2}$ . If $H\in(0,\frac{1}{6})$ and $\rho\in({-}\frac{\sqrt{2}}{2},1]$ , then
The limits depend explicitly on the parameters of the model $(H,\chi,\nu,\eta,\rho)$ and can be used to gain insight into their impact on the quantities of interest.
Remark 4.
• In the case $\rho=1$ (and hence $\overline{\rho}=0$ ), the above limits simplify to
\begin{align*} \lim_{T\downarrow 0} \mathcal{I}_{T} & = \frac{\Delta^{H_{-}}}{2H_+}\left(\chi\nu+\overline{\chi}\eta\right),\\[3pt] \lim_{T\downarrow 0} \mathcal{S}_T & =\frac{1}{2} \frac{H_+\Delta^{H_{-}}}{\chi\nu+\overline{\chi}\eta} \left[\frac{\chi\nu^2+\overline{\chi}\eta^2}{2H}-\left(\frac{\chi\nu+\overline{\chi}\eta}{H_{+}}\right)^2\right],\\[3pt] \lim_{T\downarrow 0} \frac{\mathcal{C}_T}{T^{3H-\frac{1}{2}}} & = \frac{128 \Delta^{-2H} H_+^2}{3-18H} \frac{\chi\nu^3+\overline{\chi}\eta^3}{(\chi\nu+\overline{\chi}\eta)^2}. \end{align*}
• If we set $\rho=0$ (and hence $\overline{\rho}=1$ ), we obtain
\begin{align*} \lim_{T\downarrow 0} \mathcal{I}_{T} & = \frac{\Delta^{H_{-}}}{2H_+}\sqrt{\chi^2\nu^2+\overline{\chi}^2\eta^2},\\[3pt] \lim_{T\downarrow0} \mathcal{S}_T & = \frac{H_+\Delta^{H_{-}}}{2(\chi^2\nu^2+\overline{\chi}^2\eta^2)^{3/2}} \\[3pt] & \qquad \Bigg\{\chi^3\nu^4 \left[ \frac{1}{2H}-\frac{\chi}{H_+^2}\right]- \frac{2 \chi\nu^2\overline{\chi}^2\eta^2}{H_+^2} + \overline{\chi}^3 \eta^4\left(\frac{1}{2H}-\frac{1}{H_+^2}\right) \Bigg\},\\[3pt] \lim_{T\downarrow 0} \frac{\mathcal{C}_T}{T^{3H-\frac{1}{2}}} & = \frac{128 \Delta^{-2H} H_+^2}{3-18H} \frac{\chi^4\nu^6+\overline{\chi}^4\eta^6}{(\chi^2\nu^2+\overline{\chi}^2\eta^2)^{5/2}}. \end{align*}
• When $\rho=-1$ (not covered per se by the proposition), the above limits simplify to
\begin{align*} \lim_{T\downarrow0} \mathcal{I}_T & = \frac{\Delta^{H_{-}}}{2H+1} \lvert\chi\nu-\overline{\chi}\eta\rvert,\\[3pt] \lim_{T\downarrow0} \mathcal{S}_T &= \frac{H_{+}\Delta^{H_{-}}}{2\lvert\chi\nu-\overline{\chi}\eta\rvert} \left[ \frac{\chi\nu^2+\overline{\chi}\eta^2}{2H} - \left(\frac{\chi\nu-\overline{\chi}\eta}{H_{+}}\right)^2\right], \\[3pt] \lim_{T\downarrow 0} \frac{\mathcal{C}_T}{T^{3H-\frac{1}{2}}} & = \frac{128 \Delta^{-2H} H_+^2}{3-18H} \frac{\chi\nu^3-\overline{\chi}\eta^3}{(\chi\nu-\overline{\chi}\eta)^2} \,{\rm sgn}(\chi\nu-\overline{\chi}\eta). \end{align*}
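The three special cases above translate directly into code for quick numerical exploration. A small helper (not from the paper; it merely transcribes Remark 4, takes $\Delta$ as one month in years by default, and is only meaningful for $H\in(0,\frac{1}{6})$):

```python
import numpy as np

def remark4_limits(rho, H, chi, nu, eta, Delta=1.0 / 12.0):
    """Short-time VIX limits of Remark 4: returns (level, skew, coefficient of
    T^{3H-1/2} in the curvature) for rho in {1, 0, -1}; requires 0 < H < 1/6."""
    Hp, Hm = H + 0.5, H - 0.5
    cb = 1.0 - chi                                        # \bar{chi}
    c_curv = 128.0 * Delta ** (-2 * H) * Hp ** 2 / (3.0 - 18.0 * H)
    if rho == 1:
        a = chi * nu + cb * eta
        level = Delta ** Hm / (2.0 * Hp) * a
        skew = 0.5 * Hp * Delta ** Hm / a * ((chi * nu ** 2 + cb * eta ** 2) / (2 * H) - (a / Hp) ** 2)
        curv = c_curv * (chi * nu ** 3 + cb * eta ** 3) / a ** 2
    elif rho == 0:
        q = chi ** 2 * nu ** 2 + cb ** 2 * eta ** 2
        level = Delta ** Hm / (2.0 * Hp) * np.sqrt(q)
        skew = Hp * Delta ** Hm / (2.0 * q ** 1.5) * (
            chi ** 3 * nu ** 4 * (1 / (2 * H) - chi / Hp ** 2)
            - 2.0 * chi * nu ** 2 * cb ** 2 * eta ** 2 / Hp ** 2
            + cb ** 3 * eta ** 4 * (1 / (2 * H) - 1 / Hp ** 2))
        curv = c_curv * (chi ** 4 * nu ** 6 + cb ** 4 * eta ** 6) / q ** 2.5
    elif rho == -1:
        a = chi * nu - cb * eta
        level = Delta ** Hm / (2.0 * Hp) * abs(a)
        skew = Hp * Delta ** Hm / (2.0 * abs(a)) * ((chi * nu ** 2 + cb * eta ** 2) / (2 * H) - (a / Hp) ** 2)
        curv = c_curv * (chi * nu ** 3 - cb * eta ** 3) / a ** 2 * np.sign(a)
    else:
        raise ValueError("Remark 4 only covers rho in {1, 0, -1}")
    return level, skew, curv
```

For example, `remark4_limits(0, H=0.1, chi=0.5, nu=1.2, eta=0.8)` returns the limiting level, skew, and curvature coefficient in the uncorrelated case.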
Some tedious yet straightforward manipulations allow us to obtain some information about the sign of the limiting curvature.
Lemma 1. For any $\eta,\nu \gt 0$, $\chi \in [0,1]$, there exists $\rho^*_{\chi,\nu,\eta} \lt 0$ such that $\lim_{T\downarrow 0} \frac{\mathcal{C}_T}{T^{3H-\frac{1}{2}}}$ is strictly positive when $\rho \gt \rho^*_{\chi,\nu,\eta}$ and strictly negative when $\rho \lt \rho^*_{\chi,\nu,\eta}$. When
we have $\rho^*_{\chi,\nu,\eta} \lt -1$, and hence the limiting curvature is strictly positive for all $\rho \in [{-}1,1]$.
Proof. The expression we are interested in, given in Proposition 3, and ignoring the obviously strictly positive multiplicative factor, reads
where
These surprising simplifications show that $\Phi$ is in fact a polynomial in $\rho$ of order three, with a strictly positive leading coefficient $\alpha_3$ , so that $\Phi^{\prime\prime}$ is linear and increasing in $\rho$ and is such that
Now,
Since $\Phi^{\prime}$ is an upward parabola with strictly positive minimum, it is always strictly positive; hence $\Phi$ is a strictly increasing function (of $\rho$ ), and the lemma follows. Let $\rho^{*}_{\chi, \nu, \eta}$ denote the unique solution to $\Phi(\rho^{*}_{\chi, \nu, \eta})=0$ ; there is an explicit closed-form expression for this solution, but its exact representation is messy and not particularly informative. We can, however, provide an upper bound. Indeed,
As soon as $\Phi({-}1) \gt 0$ , clearly $\rho^{*}_{\chi, \nu, \eta} \lt -1$ , so the limiting curvature is always strictly positive. The sign of $\Phi({-}1)$ is given by that of $\left(\chi\nu - \overline{\chi}\eta\right)\left(\chi\nu^3 - \overline{\chi}\eta^3\right)$ , which is an upward parabola in $\chi$ .
5. The stock smile under multi-factor models
We use the setting of Section 2.2.1 to apply our results to an asset price of the form $\mathrm{d}S_t = S_t \sqrt{v_t}\, \mathrm{d}B_t$,
where B is correlated with the other N Brownian motions as $B= \sum_{i=1}^{N} \rho_i W^i$ with $\sum_{i=1}^N \rho_i^2=1,\,\rho_i\in[{-}1, 1]$ for all $i\in [\![1,N ]\!]$. The volatility is a function of $(N-1)$ Brownian motions, such that the stock price features one additional and independent source of randomness. To fit this model into (3) we set $A=S$ and identify $\phi^i$ with $\rho_i \sqrt v$. We modify the notation slightly to differentiate from the VIX framework: the implied volatility is denoted by $\widehat{\mathcal{I}}_{T}$ and the skew by $\widehat{\mathcal{S}}_{T}$. We do not consider the curvature in this setting, for lack of an explicit formula. The proofs of the following proposition and corollary are postponed to Section 6.5.
Proposition 4. Assume that there exist $H\in(0,\frac{1}{2})$ and a random variable $X\in L^p$, for all $p\ge1$, such that, for all $0\le s\le y$ and $j\in [\![ 1,N ]\!]$, the following hold:
(i) $v_s\le X$ ;
(ii) $\mathrm{D}_s^j v_y\le X (y-s)^{H_{-}}$ ;
(iii) $\sup_{s\le T} \mathbb{E}[u_s^{-p}] \lt \infty$ ;
(iv) $\limsup_{T\downarrow0} \mathbb{E}\big[(\sqrt{v_T/v_0}-1)^2 \big]=0$ .
Then the short-time limits of the implied volatility and skew are
Remark 5.
• The second limit is finite because of the condition (ii).
• The one-dimensional version ( $N=2$ ) agrees with [Reference Alòs, León and Vives4, Theorem 6.3] up to the sign, because the authors of [Reference Alòs, León and Vives4] differentiate with respect to the spot x rather than the log-strike k.
In the two-factor rough Bergomi model (10) we can compute the short-time skew more explicitly. Recall from Example 2.2.4 that this amounts to setting $N=3$ and defining, for all $t\ge0$,
where $W^{i,H}_t = \int_0^t (t-s)^{H_{-}} \,\mathrm{d} W^i_s$ , for $i=1,2$ and $B = \sum_{i=1}^3 \rho_i W^i$ , with $W^1,W^2,W^3$ being independent Brownian motions. Hence $W^3$ influences only the asset price, not the variance.
Corollary 1. In the two-factor rough Bergomi model we have the short-time skew limit
5.1. Tips for joint calibration in the two-factor rough Bergomi model
Assuming we can observe the short-time limit of the spot ATM implied volatility, we obtain $v_0$ for free, while the slope of its skew gives us H by (11). Next, we simplify the expressions from Proposition 3 in the case $\chi=\frac{1}{2}$. Denote by $\mathcal{I}_0$, $\mathcal{S}_0$, and $\mathcal{C}_0$ the three limits of Proposition 3, and let $H_{\pm} \,:\!=\, H\pm\frac{1}{2}$, $\alpha\,:\!=\,\eta\rho$, $\beta\,:\!=\,\eta\overline{\rho}$. Introduce further the normalised parameters
so that, defining $\widetilde\psi(\widetilde\alpha, \widetilde\beta)\,:\!=\,\sqrt{(1+\widetilde\alpha)^2+\widetilde\beta^2}$ , we have, after simplifications,
where the constants $C_I, C_\mathcal{S}, C_\mathcal{C}$ only depend on $\Delta$ and H. Provided we can observe an approximation of these three limits, we can numerically solve for $\nu,\widetilde\alpha,\widetilde\beta$ in a system of three equations. Alternatively, since all three quantities contain the factor $\nu$, any quotient of two of them is a function of $\widetilde\alpha,\widetilde\beta$ only, which we can plot and match to observed data. Both methods allow us to deduce $\nu,\widetilde\alpha,\widetilde\beta$, in turn yielding $\eta$ and $\rho$. Finally, we are left with $\rho_1$ and $\rho_2$ to play with so that the right-hand side of (11) matches the market observations.
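A minimal sketch of the first method, assuming the three limiting formulas of Proposition 3 with $\chi=\frac{1}{2}$ are supplied as callables `level_limit`, `skew_limit`, `curv_limit` of $(\nu,\widetilde\alpha,\widetilde\beta)$; these names, the starting point, and the observed triple are placeholders and not part of the paper:

```python
import numpy as np
from scipy.optimize import least_squares

def calibrate_short_time(obs, level_limit, skew_limit, curv_limit):
    """Match observed short-time limits obs = (I0, S0, C0) to the asymptotic
    formulas, solving for (nu, alpha_tilde, beta_tilde)."""
    I0, S0, C0 = obs

    def residuals(p):
        nu, a, b = p
        return [level_limit(nu, a, b) - I0,
                skew_limit(nu, a, b) - S0,
                curv_limit(nu, a, b) - C0]

    # beta_tilde >= 0 since beta = eta * rho_bar >= 0 (assuming a positive normalisation)
    sol = least_squares(residuals, x0=np.array([1.0, 0.0, 0.5]),
                        bounds=([1e-6, -np.inf, 0.0], np.inf))
    return sol.x  # eta and rho then follow from the definitions of alpha_tilde and beta_tilde
```

The quotient method removes $\nu$ and reduces the problem to two equations in $(\widetilde\alpha,\widetilde\beta)$.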
Remark 6. We are not here—as in fact in most papers related to asymptotics—advocating the use of these formulae for actual direct option pricing, since they are asymptotics. In particular, this raises several calibration issues (shared with most results on the topic): (i) very short-maturity options on the VIX are hardly available, and the computation of the curvature, in particular, is a matter of personal choice (the result will change drastically depending on the number of data points around the ATM), which is left to the trader; (ii) such asymptotic formulae serve to provide some intuition about the roles of the model parameters, in particular on which one helps for each part of the smile. One key message of our result, for example, is that the model is able to disentangle (over short time horizons) the role of H and that of $\nu, \eta,\rho$ , and to a certain extent the role of $\nu$ and that of $\eta,\rho$ . Compared to simpler models (one-factor (rough) Bergomi, classical Heston), we have more parameters here, and our results should be combined with more asymptotics (for smile wings and large expiries) to be fully meaningful. Unfortunately, these are not fully available yet, and we would rather leave a full-scale numerical calibration scheme to future endeavours.
6. Proofs
6.1. Useful results
We start by adapting to the multivariate case a well-known decomposition formula and then prove a lemma which will be used extensively in the rest of the proofs. Both proofs build on the multidimensional anticipative Itô formula [Reference Nualart42, Theorem 3.2.4].
Proposition 5. (Price decomposition.) Under $\boldsymbol{(\mathrm{H}_{123})}$ , the following decomposition formula holds, for all $t\in\mathbb{T}$ , for the price (5), with $u_t$ defined in (4) and $G\,:\!=\,(\partial_{x}^2 - \partial_{x})\mathrm{BS}$ :
Proof. Define $\widehat{\mathrm{BS}}(t,x,k,\sigma^2 (T-t)) \,:\!=\, \mathrm{BS}(t,x,k,\sigma)$ and write for simplicity $\widehat{\mathrm{BS}}_t\,:\!=\,\widehat{\mathrm{BS}}\left(t,\mathfrak{M}_t,k, Y_t \right)=\mathrm{BS}\left(t,\mathfrak{M}_t,k, u_t \right)$ , where we recall that $Y_t=u_t^2(T-t)$ . Note that $\Pi_T= \widehat{\mathrm{BS}}_T$ ; hence $\Pi_t=\mathbb{E}_t\left[\widehat{\mathrm{BS}}_T\right]$ by no-arbitrage arguments. Thanks to $\boldsymbol{(\mathrm{H}_{1})}$ and $\boldsymbol{(\mathrm{H}_{2})}$ , we can then apply a multidimensional anticipative Itô’s formula [Reference Nualart42, Theorem 3.2.4] with respect to $(t,\mathfrak{M},Y)$ :
with $\boldsymbol\Theta$ as in (4). The derivatives of the Black–Scholes price read as follows (for simplicity, we omit the argument):
Putting everything together, using the gamma–vega–delta relation
and applying conditional expectation, we obtain
where $\mathcal{L}_{\mathrm{BS}}(s,u_{s}) \,:\!=\, \frac{1}{2}\left[u_{s}^2\left(\partial_{x}^2-\partial_{x}\right)+\partial_{s}\right]\mathrm{BS}(s,\mathfrak{M}_s,k,u_{s})$ is the Black–Scholes operator applied to the Black–Scholes function. Since $\mathcal{L}_{\mathrm{BS}}(s,u_{s})=0$ by construction and
the last term in (13) is well defined by $\boldsymbol{(\mathrm{H}_{3})}$ and the proposition follows.
Lemma 2. For all $t\in\mathbb{T}$ , let $J_t \,:\!=\, \int_t^T a_s \mathrm{d}s$ , for some adapted process $a\in \mathbb{L}^{1,2}$ , and let $\mathfrak{L}\,:\!=\,\sum_{i=1}^n c_i \partial_x^i$ be a linear combination of partial derivatives, with weights $c_i\in\mathbb{R}$ . Then, writing for clarity $\mathrm{BS}_t\,:\!=\,\mathrm{BS}(t,\mathfrak{M}_t,\mathfrak{M}_0,u_{t})$ , we have
Remark 7. We will use this lemma freely below, with the justification that the condition $a\in \mathbb{L}^{1,2}$ is always satisfied thanks to $\boldsymbol{(\mathrm{H}_{1})}$ .
Proof. As in the proof of Proposition 5, we define $\widehat{\mathrm{BS}}(t,x,k,\sigma^2 (T-t)) \,:\!=\, \mathrm{BS}(t,x,k,\sigma)$ and write for simplicity $\widehat{\mathrm{BS}}_t\,:\!=\,\widehat{\mathrm{BS}}\left(t,\mathfrak{M}_t,\mathfrak{M}_0, Y_t \right)=\mathrm{BS}\left(t,\mathfrak{M}_t,\mathfrak{M}_0, u_t \right)$. Define $\widehat{\mathrm{P}}(t,x,k,y,j)\,:\!=\,\mathfrak{L} \widehat {\mathrm{BS}}(t,x,k,y)\,j$ and denote $\widehat{\mathrm{P}}_t\,:\!=\,\widehat{\mathrm{P}}\left(t,\mathfrak{M}_t,\mathfrak{M}_0,Y_t, J_t\right)$ for simplicity. We then apply the multidimensional anticipative Itô’s formula [Reference Nualart42, Theorem 3.2.4] with respect to $(t,\mathfrak{M},Y,J)$:
One first notices that $\widehat{\mathrm{P}}_0 = \mathfrak{L} \widehat{\mathrm{BS}}_0 J_0$ and $\widehat{\mathrm{P}}_T=0$. Moreover we observe that $\int_0^T \partial_j \widehat{\mathrm{P}}_s \,\mathrm{d} J_s = -\int_0^T \mathfrak{L} \widehat{\mathrm{BS}}_s a_s \mathrm{d}s$, which corresponds to the left-hand side of (14), and
Since $\mathfrak{L}$ is a linear operator, the partial derivatives in s, x, and u cancel as in the proof of Proposition 5. That means we are left with
Since $\partial_x^n \mathrm{BS}(s,x,u)= \partial_x^n \widehat{\mathrm{BS}}(s,x,u^2(T-s))$ for any $n\in\mathbb N$ , summing everything and taking expectations imply the claim.
We adapt and clarify [Reference Alòs, León and Vives4, Lemma 4.1] to obtain a convenient bound for the partial derivatives of G. For notational simplicity, since $\sigma$ and $T-t$ are fixed, we write $\varsigma\,:\!=\,\sigma\sqrt{T-t}$ and $\mathfrak{G}(x,k,\varsigma)\,:\!=\,G(t,x,k,\sigma)$ .
Proposition 6. For any $n\in\mathbb{N}$ and $p\in\mathbb{R}$ , there exists $C_{n,p} \gt 0$ independent of x and $\varsigma$ such that, for all $\varsigma \gt 0$ and $x\in\mathbb{R}\setminus\left\{0, \frac{\varsigma^2}{2}\right\}$ ,
If $x=0$ , then for any $n\in\mathbb{N}$ the bound (15) holds with $p=n$ .
If $x=\frac{1}{2} \varsigma^2$ , there exists a strictly positive constant $C_{n}$ independent of $\varsigma$ such that
The following simplification (and extension) will be useful later.
Corollary 2. For any $n\in\mathbb{N}$ , there exists a non-negative $C_{n,k}$ independent of x and $\varsigma$ such that, for all $\varsigma \gt 0$ and $x\in\mathbb{R}$ ,
Proof of Proposition 6. We first consider the case $k=0$ . Since
where $d_+(x,\varsigma) \,:\!=\,d_+(x,0,\sigma)= \frac{x}{\varsigma}+\frac{\varsigma}{2}$, direct computation (by induction) yields, for any $n\in\mathbb{N}$,
where, for each j, $P_{j}$ is a polynomial of degree j independent of $\varsigma$ .
Since $d_+(\frac{\varsigma^2}{2},\varsigma)=\partial_x d_+(\frac{\varsigma^2}{2},\varsigma)=0$ , $\partial_x^2 d_+(\frac{\varsigma^2}{2},\varsigma) = -\frac{1}{\varsigma^{2}}$ , the induction simplifies to
for some constant $C_n \gt 0$ independent of $\varsigma$ , proving the third statement in the proposition.
Similarly, if $x=0$ , simplifications occur which yield, for any $n\in\mathbb{N}$ ,
and the second statement in the proposition follows.
Finally, in the general case $x\in\mathbb{R}\setminus\left\{0,\frac{\varsigma^2}{2}\right\}$ , we can rewrite (16) for any $p\in\mathbb{R}$ as
For each $n \in \mathbb{N}, p\in\mathbb{N}$ , $H_{n,p}$ is a two-dimensional function consisting only of powers of $\varsigma^2$ and $x^2/\varsigma^2$ . Since the exponential factor contains these very same terms, there exists a strictly positive constant $C_{n,p}$ , independent of x and $\varsigma$ , such that
proving the proposition in the case $k=0$ .
The case $k\in\mathbb{R}$ follows directly from the observation that $\mathfrak{G}(x,k,\varsigma) = \mathfrak{G}(x-k,0,\varsigma)\, \mathrm{e}^k$. Finally, since $\partial_k d_+(x,k,\sigma) = - \partial_x d_+(x,k,\sigma)$ and $\partial_k^2 d_+(x,k,\sigma) = - \partial_x^2 d_+(x,k,\sigma)$, the same simplifications occur if we take a partial derivative with respect to k instead of x.
6.2. Proofs of the main results
6.2.1. Proof of Theorem 1: level
To prove this result, we draw insights from the proofs of [Reference Alòs, García-Lorite and Muguruza2, Theorem 8] and [Reference Alòs and Shiraya5, Proposition 3.1]. By definition
and we write $\widetilde{\mathrm{BS}}(x) \,:\!=\, \mathrm{BS}(0,x,x,u_0)$ . Using Proposition 5 at time 0, we see that $\Pi_0=\Gamma_T$ , where
which is a deterministic path. The fundamental theorem of calculus reads
where $G_t\,:\!=\, G(t,\mathfrak{M}_t,\mathfrak{M}_0,u_t)$ . We can deal with the integral by computing $\overleftarrow{\mathrm{BS}}^{\prime}$ and $\partial_x G$ explicitly:
Since $\Gamma:\mathbb{R}_+\to\mathbb{R}$ and $\overleftarrow{\mathrm{BS}}:\mathbb{R}\to\mathbb{R}$ are continuous, the following is uniformly bounded for all $T\le1$ :
Therefore, by $\boldsymbol{(\mathrm{H}_{4})}$ we obtain
Since $\Gamma_0=\mathbb{E}\left[\widetilde{\mathrm{BS}}(\mathfrak{M}_0)\right]$ and $u_0=\overleftarrow{\mathrm{BS}}\left(\widetilde{\mathrm{BS}}(\mathfrak{M}_0)\right)$ , we have
The Clark–Ocone formula yields
and by the gamma–vega–delta relation (12) we have
which in turn implies
Define $\Lambda_r\,:\!=\,\mathbb{E}_r \Big[\widetilde{\mathrm{BS}}(\mathfrak{M}_0)\Big]$ , so that the difference we are interested in from (17) reads, after we apply the standard Itô’s formula,
The stochastic integral above has zero expectation by the same argument as used for [Reference Alòs and Shiraya5, Proposition 3.1]. Moreover, $\boldsymbol{(\mathrm{H}}_{\boldsymbol{5}}\boldsymbol{)}$ states that $\mathfrak{u}_0$ is dominated almost surely by $Z\in L^p$ , and therefore so are $\Lambda$ and
by continuity. Plugging in the expression for $U^i$ from (19), we apply $\boldsymbol{(\mathrm{H}}_{\boldsymbol{5}}\boldsymbol{)}$ to conclude that the second integral of (20) tends to zero.
6.2.2. Proof of Theorem 2: skew
This proof follows from arguments similar to those of [Reference Alòs, León and Vives4, Proposition 5.1]. We recall that $\Pi_0(k)= \mathrm{BS}\big(0,\mathfrak{M}_0,k,\mathcal{I}_{T}(k)\big)$ . On the one hand, by the chain rule we have
On the other hand, the decomposition obtained in Proposition 5 yields
which in particular also holds for $k=\mathfrak{M}_0$ . Performing simple algebraic manipulations and using the derivatives of the Black–Scholes function ATM as in [Reference Alòs, León and Vives4, Proposition 5.1], we find the following (remember we drop the k-dependence in $\mathcal{I}_{T}$ when ATM):
By (23), this in turn yields
where $L\,:\!=\,(\frac{1}{2}+\partial_{k})\frac{1}{2}\partial_{x} G$ . We write $L_s\,:\!=\,L(s,\mathfrak{M}_s,\mathfrak{M}_0,u_s)$ for simplicity and apply Lemma 2 to $L_s\, \int_s^T |\boldsymbol\Theta_r|\mathrm{d}r$ , which yields
We combine (18) with the bound $\partial_k \partial_x^n G(t,x,k,\sigma) \le C \big(\sigma\sqrt{T-t}\big)^{-n-2}$ from Corollary 2 to obtain
and both converge to zero by $\boldsymbol{(\mathrm{H}}_{\boldsymbol{6}}^{\lambda}\boldsymbol{)}$ . We are left with $R_1$ . From Section 6.6, we have
and therefore by (18),
This yields
where $\displaystyle\mathfrak{K}_T\,:\!=\,\frac{\int_0^T |\boldsymbol\Theta_s|\mathrm{d}s}{2 T^{\frac{1}{2}+\lambda} \mathfrak{u}_0^3}$ . Furthermore,
not only is finite but converges to zero as T goes to zero. Hence,
We can finally conclude by $\boldsymbol{(\mathrm{H}}_{\boldsymbol{7}}^{\lambda}\boldsymbol{)}$ that
which has a finite limit.
6.2.3. Proof of Theorem 3: curvature
Step 1. Let us start by simply taking a second derivative with respect to k (we write $\mathrm{BS}(\mathfrak{M}_0,\mathcal{I}_{T}(k))$ as short for $\mathrm{BS}(0,\mathfrak{M}_0,\mathfrak{M}_0,\mathcal{I}_{T}(k))$ ):
Taking the derivative with respect to k in (24) and equating with the above formula yields
A similar expression is presented in [Reference Alòs and León3], and we notice that $T_1$ and $T_2$ in the expression above, after being multiplied by $T^{-\lambda}$ , are identical to those from [Reference Alòs and León3, Equation (25)] and can therefore be dealt with in the same way. Step 1 shows that $T^{-\lambda} T_1$ tends to zero as $T\downarrow 0$ , and Step 2 yields $T_2= - \frac{1}{2} \partial_{k} \mathcal{I}_{T}(k)$ .
Step 2. Recall that $L = \frac{1}{2} \left(\frac{1}{2} + \partial_{k} \right) \partial_{x} G$. We need the anticipative Itô’s formula (Lemma 2) twice on $T_3$. Indeed, even though the bound on $\partial_{x}^n G$ worsens as n increases, it is more than compensated for by the additional integrations. The terms with more integrals (i.e. more regularity) tend to zero as T goes to zero, by $\boldsymbol{(\mathrm{H}}_{\boldsymbol{8}}^{\gamma}\boldsymbol{)}$, and we compute the others in closed form. For clarity we write $L_s = L (s,\mathfrak{M}_s,\mathfrak{M}_0,u_s )$ for all $s\ge0$. By a first application of Lemma 2 on $\partial_{k} L_s \int_s^T |\boldsymbol\Theta_r| \mathrm{d}r$ we obtain
To deal with $S_2$ , we apply Lemma 2 again on $(\partial_{x}^3 - \partial_{x}^2) \partial_{k} L_s \int_s^T |\boldsymbol\Theta_r|\left( \int_r^T |\boldsymbol\Theta_{y}|\mathrm{d}y \right) \mathrm{d}r =: H_s Z_s$ , which yields
We will deal with these terms in the last step. For $S_3$ , we apply Lemma 2 once more to
and obtain
Step 3. We now evaluate the derivative at $k=\mathfrak{M}_0$ and drop the k-dependence. To summarise,
where
We recall once again the bound $\partial_x^n G(t,x,k,\sigma) \le C \big(\sigma\sqrt{T-t}\big)^{-n-1}$ as $T-t$ goes to zero. We observe that H and $\widetilde H$ consist of derivatives of G up to the sixth and the fourth order, respectively; therefore $S_2^b, S_2^c,S_3^b,S_3^c$ tend to zero by $\boldsymbol{(\mathrm{H}}_{\boldsymbol{8}}^{\gamma}\boldsymbol{)}$ . In order to deal with $S_1$ , $S_2^a$ , and $S_3^a$ , we use the explicit partial derivatives from Section 6.6 and (18); as in the proof of Theorem 2, $\boldsymbol{(\mathrm{H}}_{\boldsymbol{9}}^{\gamma}\boldsymbol{)}$ implies that only the higher derivatives of $\mathfrak{u}_0$ remain in the limit:
Hence, to conclude, the claim follows from
6.3. Proof of Proposition 1: VIX asymptotics
In this section, we will repeatedly interchange the Malliavin derivative and conditional expectation, which is justified by [Reference Nualart42, Proposition 1.2.8].
Proposition 7. In the case where $A=\mathrm{VIX}$ , the conditions in $\boldsymbol{(\overline{\mathrm{C}}}\boldsymbol{)}$ imply the assumptions $\boldsymbol{(\overline{\mathrm{H}}}^{\lambda\gamma}\boldsymbol{)}$ for any $\lambda\in({-}\frac{1}{2},0]$ and $\gamma\in({-}1, 3H-\frac{1}{2}]$ .
Proof. We write $a\lesssim b$ when there exists $X\in L^p$ such that $a\le Xb$ almost surely, and $a\approx b$ if $a\lesssim b$ and $b\lesssim a$ . The assumption $\boldsymbol{(\mathrm{H}_{1})}$ is given by the first item of $\boldsymbol{(\mathrm{C}_{2})}$ , and $\boldsymbol{(\mathrm{H}_{2})}$ corresponds to $\boldsymbol{(\mathrm{C}_{1})}$ . Since $1/M$ is dominated, so is $1/\mathrm{VIX}$ . We then have, for $i=1,2$ and by Cauchy–Schwarz,
If $H \lt \frac{1}{2}$ , then the incremental function $x\mapsto (x+\Delta)^{H_{+}} -x^{H_{+}}$ is decreasing by concavity. For $j=1,2$ and $t\le s$ , this implies by domination of $1/M$ that $\phi^i\approx m^i$ is also dominated and
Combining these two estimates, we obtain
It is clear by now that indices and sums do not influence the estimates, so we informally drop them for more clarity and continue with the higher derivatives:
where the first and second terms behave like $T-s$ . For $t\le s\le y\le T$ , we deduce from (25) that $\mathrm{D}_t\mathrm{D}_s \phi_y$ consists of five terms, of which four behave like $(T-s)$ , and only one features three derivatives:
If $H\ge \frac{1}{6}$ , then concavity implies $\mathrm{D}_t \Theta_s \lesssim (T-s) $ . Otherwise, if $H \lt \frac{1}{6}$ , then
In the second derivative of $\Theta$ , the first and second terms behave like $(T-s)$ and $\mathrm{D}_t\Theta_s\lesssim (T-s)+(T-s)^{(3H+\frac{1}{2})\wedge 1}$ respectively; hence we focus on $\int_s^T \mathrm{D}_w \mathrm{D}_t \mathrm{D}_s \phi_y \mathrm{d}y$ , where the new term is
If $H\ge \frac{1}{4}$ , then $\mathrm{D}_t \Theta_s \lesssim (T-s)$ by concavity. Otherwise, when $H \lt \frac{1}{4}$ ,
where the last inequality holds by yet again the same concavity argument.
This yields a rule for checking that the quantities in our assumptions indeed converge. We summarise the above estimates in the case $H\le\frac{1}{2}$ : there exists $Z\in L^p$ such that for $s\le T$ and T small enough,
hold almost surely. Thanks to the Cauchy–Schwarz inequality we can disentangle the numerators (integrals and derivatives of $\boldsymbol\Theta$ ) and denominators (powers of u) of the assumptions, which are both uniformly bounded in $L^p$ . We can easily deduce that $\boldsymbol{(\mathrm{H}_{3})}$ , $\boldsymbol{(\mathrm{H}_{4})}$ , $\boldsymbol{(\mathrm{H}}_{\boldsymbol{5}}\boldsymbol{)}$ , $\boldsymbol{(\mathrm{H}}_{\boldsymbol{6}}^{\lambda}\boldsymbol{)}$ , $\boldsymbol{(\mathrm{H}}_{\boldsymbol{8}}^{\gamma}\boldsymbol{)}$ are satisfied (convergence to zero). In $\boldsymbol{(\mathrm{H}}_{\boldsymbol{7}}^{\lambda}\boldsymbol{)}$ , $\mathbb{E}[\mathfrak{K}_T]$ behaves as $T^{-\lambda}$ , so it converges for any $\lambda\in({-}\frac{1}{2},0]$ , and the uniform $L^2$ bound is satisfied thanks to $\boldsymbol{(\mathrm{C}_{3})}$ . Moreover, in the limit the first term in $\boldsymbol{(\mathrm{H}}_{\boldsymbol{9}}^{\gamma}\boldsymbol{)}$ behaves as $T^{-\gamma}$ and the second behaves as $T^{3H-\frac{1}{2}-\gamma}$ ; therefore both assumptions are satisfied for any $\lambda\in({-}\frac{1}{2},0]$ and $\gamma\in({-}1, 3H-\frac{1}{2}]$ . Similarly, $\boldsymbol{(\mathrm{C}_{3})}$ ensures the uniform $L^2$ bounds.
6.3.1. Convergence lemmas
We require some preliminaries before we dive into the computations. We present three versions of integral convergence tailored to our purposes, which are essential for computing the limits in Theorems 1, 2, and 3. The conditions they require hold thanks to the continuity of $\boldsymbol{(\mathrm{C}_{4})}$ . Recall the local Taylor theorem: if a function $g({\cdot})$ is continuous on $[0, \delta]$ for some $\delta \gt 0$ , then there exists a continuous function $\varepsilon({\cdot})$ on $[0, \delta]$ with $\lim_{x\downarrow 0}\varepsilon(x)=0$ such that $g(x) = g(0) + \varepsilon(x)$ for any $x \in [0, \delta]$ .
Lemma 3. If $f\,:\,\mathbb{R}_+^2\to\mathbb{R}$ is such that $f(T,\cdot)$ is continuous on $[0,\delta_0]$ for some $\delta_0 \gt 0$ and $\lim\limits_{T\downarrow0}f(T,0)=f(0,0)$ , then
Proof. For $T \lt \delta_0$ , we can write
where the function $\varepsilon_0$ is continuous on $[0,\delta_0]$ and converges to zero at the origin. Hence, for any $\eta_0 \gt 0$ , there exists $\widetilde \delta_0 \gt 0$ such that, for any $y\le\widetilde \delta_0$ , $|\varepsilon_0(y)| \lt \eta_0$ . For all $T \lt \widetilde \delta_0 \wedge \delta_0$ ,
Since $\eta_0$ can be taken as small as desired, the fact that $\lim_{T\downarrow0} f(T,0)=f(0,0)$ concludes the proof.
Lemma 4. Let $f\,:\,\mathbb{R}^3_+\to\mathbb{R}$ be such that, for each $y\le T$ , $f(T,y,\cdot)$ is continuous on $[0,\delta_0]$ with $\delta_0 \gt 0$ , $f(T,\cdot,0)$ is continuous on $[0,\delta_1]$ with $\delta_1 \gt 0$ , and $\lim_{T\downarrow0}f(T,0,0)=f(0,0,0)$ . Then
Proof. For $T \lt \delta_0\wedge\delta_1$ , we can write
where $\varepsilon_{1}({\cdot})$ is continuous on $[0,\delta_1]$ and $\varepsilon_{0}({\cdot})$ is continuous on $[0,\delta_0]$ , and both are null at the origin. For any $\eta_{1} \gt 0$ , there exists $\widetilde{\delta}_{1} \gt 0$ such that, for any $y \in [0, \widetilde{\delta}_{1}]$ , we have $|\varepsilon_{1}(y)| \lt \eta_{1}$ . Therefore, for the first integral, we have, for $T \lt \widetilde{\delta}_{1}\wedge\delta_0\wedge\delta_1$ ,
Likewise, since $\varepsilon_{0}({\cdot})$ tends to zero at the origin, for any $\eta_{0} \gt 0$ there exists $\widetilde{\delta}_{0} \gt 0$ such that, for any $y \in [0, \widetilde{\delta}_{0}]$ , we have $|\varepsilon_{0}(y)| \lt \eta_{0}$ . Therefore, for the second integral, we have, for $T \lt \widetilde{\delta}_{0}\wedge\delta_0\wedge\delta_1$ ,
Since $\eta_1$ and $\eta_0$ can be taken as small as desired, taking the limit of f(T, 0, 0) as T goes to zero concludes the proof.
Lemma 5. Let $f\,:\,\mathbb{R}^4_+\to\mathbb{R}$ be such that, for all $0\le s\le y\le T$ , the functions $f(T,y,s,\cdot)$ , $f(T,y,\cdot,0)$ , $f(T,\cdot,0,0)$ are continuous on $[0,\delta_0]$ , $[0,\delta_1]$ , $[0,\delta_2]$ , respectively, for some $\delta_0,\delta_1,\delta_2 \gt 0$ , and $\lim_{T\downarrow0}f(T,0,0,0)=f(0,0,0,0)$ . Then the following limit holds:
Proof. For $T \lt \delta_0\wedge\delta_1\wedge\delta_2$ , we can write
where the function $\varepsilon_2$ is continuous on $[0,\delta_2]$ , the function $\varepsilon_{1}$ is continuous on $[0,\delta_1]$ , and the function $\varepsilon_{0}$ is continuous on $[0,\delta_0]$ , all converging to zero at the origin. By the same argument as in the previous proof, for any $\eta_0,\eta_1,\eta_2 \gt 0$ , there exists $\widetilde\delta \gt 0$ such that for all $T\le\widetilde \delta$ , we have $|\varepsilon_0(T)|\le \eta_0$ , $|\varepsilon_1(T)|\le \eta_1$ , and $|\varepsilon_2(T)|\le \eta_2$ . This implies
Since $\eta_2$ , $\eta_1$ and $\eta_0$ can be taken as small as desired, taking the limit of f(T,0,0,0) as T goes to zero concludes the proof.
To apply these lemmas, we will use a modified version of the martingale convergence theorem, which holds in our setting thanks to domination provided by $\boldsymbol{(\mathrm{C}_{1})}$ and $\boldsymbol{(\mathrm{C}_{2})}$ and the continuity of $\boldsymbol{(\mathrm{C}_{4})}$ .
Lemma 6. Let $(X_t)_{t\ge0}$ be almost surely continuous in a neighbourhood of zero, with $\sup_{t\le 1} |X_t|\le Z\in L^1$ . Then the conditional expectation process $(\mathbb{E}_t[X_t])_{t\ge0}$ is also almost surely continuous in a neighbourhood of zero. In particular,
Remark 8. The process $(X_t)_{t\ge0}$ is not necessarily adapted.
Proof. All the limits are taken in the almost sure sense. Let $\delta \gt 0$ be such that X is continuous on $[0,\delta]$ , and fix $t \lt \delta$ . We set a sequence $\{t_n\}_{n\in\mathbb{N}}$ on $[0,\delta]$ which converges to t as n goes infinity. Assume first that $\{t_n\}_{n\in\mathbb{N}}$ is a monotone sequence. Since $\mathcal{F}_{t_n}$ tends monotonically to $\mathcal{F}_t$ and X is dominated, the classical martingale convergence theorem (MCT) asserts that $\lim_{n\uparrow\infty} \mathbb{E}_{t_n}[X_t] = \mathbb{E}_t[X_t]$ . For fixed $n\in\mathbb{N}$ and any $\mathfrak{q}\ge |t_n-t|$ ,
Let us fix $\varepsilon \gt 0$ . By the MCT, there exists $n_0\in\mathbb{N}$ such that, if $n\ge n_0$ , then
and by dominated convergence there exists $\delta^{\prime} \gt 0$ with $\mathbb{E}_t \Big[\sup_{|\mathfrak{p}-t|\le \delta^{\prime}} |X_{\mathfrak{p}} -X_t| \Big] \lt \varepsilon$ . There exists $n_1\in\mathbb{N}$ such that $|t_n-t|\le\delta^{\prime}$ for all $n\ge n_1$ ; thus if $n\ge n_0\vee n_1$ , then (29) yields $\mathbb{E}_{t_n}[|X_{t_n}-X_t|] \lt 2\varepsilon$ and
Now we consider the general case where $\{t_n\}_{n\in\mathbb{N}}$ is not monotone. From every subsequence of $\{t_n\}_{n\in\mathbb{N}}$ , one can extract a further subsequence which is monotone. Let us call this sub-subsequence $\{t_{n_k}\}_{k\in\mathbb{N}}$ . Therefore, (30) holds with $t_{n_k}$ instead of $t_{n}$ . Since every subsequence of $(\mathbb{E}_{t_n}[X_{t_n}])_{n\in\mathbb{N}}$ has a further subsequence that converges to the same limit, the original sequence also converges to this limit.
For convenience, we use the following definition.
Definition 2. Let $k,n\in\mathbb{N}$ with $k\le n$ . For a function $f\,:\,\mathbb{R}_+^n\to\mathbb{R}$ , we define
Notice that the right-hand sides of (26), (27), and (28) correspond to
respectively.
6.3.2. Proof of Proposition 1
Let us recall some important quantities:
We also recall that $J_i$ and $G_{ij}$ , $i,j\in [\![ 1,N ]\!]$ , were defined in (9). In this proof we will define $f(0)\,:\!=\,\lim_{x\downarrow0}f(x)$ , for every $f\,:\,\mathbb{R}_+\to\mathbb{R}$ , as soon as the limit exists and even if f is not actually continuous around zero. In this way we make it continuous, which allows us to apply the convergence lemmas.
Level. By $\boldsymbol{(\mathrm{C}_{1})}$ and the MCT, $\lim_{y\downarrow0}\mathbb{E}_y[\mathrm{VIX}_T]=\mathbb{E}[\mathrm{VIX}_T]$ and $(M_y)_{y\ge0}$ is continuous around zero, almost surely. By $\boldsymbol{(\mathrm{C}_{4})}$ and the dominated convergence theorem (DCT), we have $\lim_{y\downarrow0} \int_T^{T+\Delta} \mathrm{D}^i_y v_r\mathrm{d}r=\int_T^{T+\Delta} \mathrm{D}^i_0 v_r\mathrm{d}r$ and $\big(\int_T^{T+\Delta} \mathrm{D}^i_y v_r\mathrm{d}r)_{y\ge0}$ is continuous around zero, almost surely. Let $i\in [\![ 1,N ]\!]$ ; from $\boldsymbol{(\mathrm{C}_{1})}$ and $\boldsymbol{(\mathrm{C}_{2})}$ we also obtain that almost surely
for some $X\in L^2$ . Therefore it is dominated, and by Lemma 6, almost surely $m_y^i$ is continuous at zero and
Since $M_y \gt 0$ for all $y\le T$ , $\phi^i$ is also continuous at zero and $\lim_{y \le T\downarrow0} \phi^i_y = J_i / (2\Delta\mathrm{VIX}_0^2)$ . By virtue of Theorem 1 and Lemma 3, we obtain
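For context, the factor $2\Delta\mathrm{VIX}_0^2$ can be traced back to the chain rule for the Malliavin derivative: assuming the usual definition $\mathrm{VIX}_T^2 = \frac{1}{\Delta}\int_T^{T+\Delta} \mathbb{E}_T[v_r]\,\mathrm{d}r$, for $y\le T$ we have
$\mathrm{D}^i_y \mathrm{VIX}_T = \frac{\mathrm{D}^i_y \big(\mathrm{VIX}_T^2\big)}{2\,\mathrm{VIX}_T} = \frac{1}{2\Delta\,\mathrm{VIX}_T} \int_T^{T+\Delta} \mathbb{E}_T\big[\mathrm{D}^i_y v_r\big]\,\mathrm{d}r,$
which, combined with $M_y\to\mathrm{VIX}_0$, is consistent with the limit $J_i/(2\Delta\mathrm{VIX}_0^2)$ above.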
Skew. To obtain the skew limit we need to compute a few Malliavin derivatives. For all $i,j\in [\![ 1,N ]\!]$ ,
which yields
Based on $\boldsymbol{(\mathrm{C}_{1})}$ , $\boldsymbol{(\mathrm{C}_{2})}$ , and $\boldsymbol{(\mathrm{C}_{4})}$ , for each $T\ge0$ , $A^{ij}_T$ , $B^{ij}_T$ , and $C^{ij}_T$ are dominated and almost surely continuous in both arguments. For each $s\ge0$ , Lemma 6 and the DCT yield, almost surely, that $(\mathrm{D}^j_s \phi^i_y)_{y\ge0}$ and $(\mathrm{D}^j_s \phi^i_0)_{s\ge0}$ are continuous around zero. In particular,
By the DCT again this yields
Therefore $\phi_s^j \mathrm{D}_s^j (\phi_y^i)^2$ satisfies the continuity requirements of f(T, y, s) in Lemma 4. We combine this lemma with the limits above to see that, almost surely,
We also recall that $\lim_{T\downarrow0} u_0 =\frac{{\left\|{\boldsymbol J}\right\|}}{2\Delta\mathrm{VIX}_0^2}$ almost surely; hence, with $\boldsymbol{(\mathrm{C}_{2})}$ and $\boldsymbol{(\mathrm{C}_{3})}$ , the DCT implies
Curvature. We now turn our attention to the curvature. By the same arguments as above we have
For the last term of (8) we need to go one step further and compute more Malliavin derivatives, since
Thus we zoom in on the last term of the display above:
We expand $Q_1^{ijk}(t,s,y,T)$ further:
Some additional computations lead to
We notice, crucially, that we have already justified the continuity of $\phi$ and $\mathrm{D}\phi$ around zero in the proofs of level and skew, respectively. Furthermore, by Lemma 6, the first two terms in $\Upsilon^{ijk}$ as well as $Q_2,Q_3,Q_4,Q_5$ all converge to some finite limit as $t\le s\le y\downarrow0$ and are continuous around zero, almost surely. Similarly, $\beta_T$ and the second term in $\alpha_T$ are almost surely continuous around zero, and their conditional expectation converges almost surely to some finite limit as $t\le s\le y\downarrow0$ by the DCT and Lemma 6. Taking the limit as T goes to zero afterwards, we see that all of the aforementioned terms tend to a finite limit. On the other hand, by $\boldsymbol{(\mathrm{C}_{4})}$ , the DCT, and Lemma 6 we know that the conditional expectation of the first term in $\alpha_T$ is almost surely continuous around zero, and its limit is
Since $\gamma \lt 0$ , only this term contributes in the limit:
where we applied the DCT at the end. Moreover, we know by $\boldsymbol{(\mathrm{C}_{2})}$ that this limit is finite for $\gamma=3H-\frac{1}{2}$ ; hence the conditions of Lemma 5 are satisfied. We also recall that $\lim_{T\downarrow0} u_0 =\frac{{\left\|{\boldsymbol J}\right\|}}{2\Delta\mathrm{VIX}_0^2}$ almost surely; hence Lemma 5 yields the almost sure limit
The first two terms in (8) tend to zero since $\gamma \lt 0$ ; hence Theorem 3 and the DCT yield the final result:
6.4. Proofs in the two-factor rough Bergomi model
6.4.1. Proof of Proposition 2
We start with a useful lemma for Gaussian processes.
Lemma 7. If B is an almost surely continuous Gaussian process with ${\left\|{B}\right\|}_T\,:\!=\,\sup\limits_{t\le T}|B_t|$, then $\mathbb{E}[\mathrm{e}^{p \|B\|_T}]$ is finite for all $p\in\mathbb{R}$.
Proof. The Borell–TIS inequality asserts that $\mathbb{E}[{\left\|{B}\right\|}_T] \lt \infty$ and
where $\sigma^2_T\,:\!=\,\sup_{t\le T} \mathbb{E}[B_t^2]$ ; see [Reference Adler and Taylor1, Theorem 2.1.1]. We then follow the proof of [Reference Adler and Taylor1, Theorem 2.1.2]:
The Borell–TIS inequality in particular reads as follows:
After a change of variable this yields
which is finite as desired.
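In compressed form, writing $\mu\,:\!=\,\mathbb{E}[{\left\|{B}\right\|}_T]$ and taking $p \gt 0$ (the case $p\le0$ is trivial since ${\left\|{B}\right\|}_T\ge0$), the layer-cake formula and the Borell–TIS tail bound give
$\mathbb{E}\big[\mathrm{e}^{p{\left\|{B}\right\|}_T}\big] = \int_{\mathbb{R}} p\,\mathrm{e}^{pv}\,\mathbb{P}\big({\left\|{B}\right\|}_T \gt v\big)\,\mathrm{d}v \le \mathrm{e}^{p\mu} + p\,\mathrm{e}^{p\mu}\int_0^{\infty}\exp\Big\{px-\frac{x^2}{2\sigma_T^2}\Big\}\mathrm{d}x \lt \infty.$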
By the above lemma, ${\left\|{v}\right\|}_T\in L^p$ , so that we can compute its Malliavin derivatives
Without explicitly computing further derivatives, one notices that $\boldsymbol{(\mathrm{C}_{4})}$ holds and that there exist $C \gt 0$ and a random variable $X=C{\left\|{\mathcal{E}_r^1+\mathcal{E}_r^2}\right\|}_T\in L^p$ for all $p \gt 1$ such that $\mathrm{D}^i_y v_r \le X (r-y)^{H_{-}}$ , $\mathrm{D}^j_s \mathrm{D}^i_y v_r \le X (r-s)^{H_{-}} (r-y)^{H_{-}}$ , and $\mathrm{D}^k_t \mathrm{D}^j_s \mathrm{D}^i_y v_r \le X (r-t)^{H_{-}} (r-s)^{H_{-}} (r-y)^{H_{-}}$ , implying $\boldsymbol{(\mathrm{C}_{2})}$ . The following lemma yields $\boldsymbol{(\mathrm{C}_{1})}$ .
Lemma 8. In the two-factor rough Bergomi model (10) with $0\leq T_1 \lt T_2$ ,
is finite for all $p \gt 1$ . In particular, $1/M$ is dominated in $L^p$ .
Proof. We first apply an $\exp$ – $\log$ identity, then Jensen’s inequality (using the concavity of the logarithm function), to obtain
We further bound $\log\mathbb{E}_y [v_r]$ as follows, using the concavity of the logarithm and (10):
which we now compute as
Let us deal with the first term of (34), as the second one is analogous. We have
which is clearly bounded below for all $0\le u\le T_1$ . Moreover, by Fubini’s theorem,
is a Gaussian process. Since $\exp\{\cdot\}$ is increasing, $\sup_{t\in[0,T]} \exp\{\overline{B}_t\} = \exp\{\sup_{t\in[0,T]} \overline{B}_t\}$ ; thus
by Lemma 7, which concludes the proof.
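For the reader’s convenience, the exp–log and Jensen step at the start of the proof can be written schematically, for a generic positive process v and $p \gt 0$, as
$\Big(\frac{1}{T_2-T_1}\int_{T_1}^{T_2} \mathbb{E}_y[v_r]\,\mathrm{d}r\Big)^{-p} = \exp\Big\{{-}p\log\Big(\frac{1}{T_2-T_1}\int_{T_1}^{T_2} \mathbb{E}_y[v_r]\,\mathrm{d}r\Big)\Big\} \le \exp\Big\{{-}\frac{p}{T_2-T_1}\int_{T_1}^{T_2} \log\mathbb{E}_y[v_r]\,\mathrm{d}r\Big\},$
since the logarithm is concave and the normalised Lebesgue measure on $[T_1,T_2]$ is a probability measure.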
Combining (33) and (35), we obtain $\mathbb{E}_y[\mathrm{D}^i_y v_r]$ , $i=1,2$ . The following lemma proves that $\boldsymbol{(\mathrm{C}_{3})}$ is satisfied.
Lemma 9. For any $p \gt 1$ , $ \mathbb{E}[u_s^{-p}]$ is uniformly bounded in s and T, with $s\le T$ .
Proof. Since $\nu,\eta,\rho+\overline{\rho} \gt 0$ , we have $\mathrm{D}^1_y v_r+\mathrm{D}^2_y v_r \gt 0$ almost surely for all $y\le r$ . Moreover, VIX and $1/\mathrm{VIX}$ are dominated by some $X \in L^p$ for all $p \gt 1$ , so, almost surely and independently of the sign of the numerator, we obtain
Therefore, using that $1/M$ is dominated by X and Jensen’s inequality, we get
Hence we turn our attention to
using Jensen’s inequality, the Cauchy–Schwarz inequality, and the fact that $\mathrm{e}^{p\mathbb{E}_y[\log(X)]}\le \mathbb{E}_y[X^p]$ (itself a consequence of Jensen’s inequality, since $p\,\mathbb{E}_y[\log X] = \mathbb{E}_y[\log X^p] \le \log\mathbb{E}_y[X^p]$). Convexity and (33) imply
We focus on the first term; the other can be treated identically. From (35) we have
Let us start with
By Taylor’s theorem, $\log(T+\Delta-s)-\log(\Delta) = \frac{T-s}{\Delta} + \varepsilon(T-s)$ , where $\varepsilon:\mathbb{R}_+\to\mathbb{R}_+$ is such that $\varepsilon(x)/x$ tends to zero at the origin. We conclude that
is uniformly bounded. Now we study the second term of (38):
Therefore the following is uniformly bounded:
For the last term, by the stochastic Fubini theorem [Reference Protter48, Theorem 65], we get
Standard Gaussian computations then yield
The incremental function $x\mapsto (x+\Delta)^{H_{+}}-x^{H_{+}}$ is decreasing, since its derivative $H_{+}\big[(x+\Delta)^{H_{+}-1}-x^{H_{+}-1}\big]$ is non-positive by concavity of $x\mapsto x^{H_{+}}$; hence $(T+\Delta-t)^{H_{+}} - (T-t)^{H_{+}} \le \Delta^{H_{+}}$, and we obtain
which implies that (39) is uniformly bounded. We have thus shown that (37) is uniformly bounded in s, T.
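The standard Gaussian computation invoked above is, presumably, the Laplace transform of a Wiener integral: for a deterministic $g\in L^2([0,s])$,
$\mathbb{E}\Big[\exp\Big\{\int_0^s g(t)\,\mathrm{d}W_t\Big\}\Big] = \exp\Big\{\frac{1}{2}\int_0^s g(t)^2\,\mathrm{d}t\Big\},$
applied here to the deterministic kernel obtained after the stochastic Fubini step.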
Coming back to (36), by the Cauchy–Schwarz inequality we have
which is uniformly bounded for all $s\le T$ , and this concludes the proof.
6.4.2. Proof of Proposition 3
Level. We start with the derivatives
and recall, from the definitions in (9),
We also note that $\mathbb{E}[\mathcal{E}^i_t]=1$ . This yields the norm
with the function $\psi$ defined in the proposition, which grants us the first limit by Proposition 1. To simplify the notation below, we introduce $\mathfrak{w} \,:\!=\, \chi\nu + \overline{\chi}\eta\rho$ .
Skew. We compute the further derivatives
Similarly to J, we recall that $G_{ij}= \int_0^{\Delta} \mathbb{E}\big[\mathrm{D}^j_0 \mathrm{D}^i_0 v_r \big]\mathrm{d}r$ , so that
Notice that $\mathrm{VIX}^2_0=v_0$ ; thus we have
Finally, by Proposition 1 we obtain
Curvature. For the final limit we go one step further:
We notice that
By the curvature limit in Proposition 1, we have
which yields the claim.
6.5. Proofs for the stock price
6.5.1. Proof of Proposition 4
Since $\phi$ and $u^{-p}$ are dominated thanks to conditions (i) and (iii) respectively, and with the same notation as in the proof of Proposition 7, we obtain by (ii), as T goes to zero,
Under our three assumptions it is straightforward to see that $\boldsymbol{(\mathrm{H}_{12345})}$ are satisfied. Moreover, the terms in $\boldsymbol{(\mathrm{H}}_{\boldsymbol{6}}^{\lambda}\boldsymbol{)}$ behave as $T^{2H-\lambda}$ and the one in $\boldsymbol{(\mathrm{H}}_{\boldsymbol{7}}^{\lambda}\boldsymbol{)}$ as $T^{H_{-}-\lambda}$; setting $\lambda=H_-$, the former vanishes, since $2H-H_-=H_+ \gt 0$, while the latter is of order one and yields a non-trivial limit.
Let us have a look at the short-time implied volatility. By Lemma 3 and the continuity of v we have $\lim_{T\downarrow0} u_0 = \sqrt{\sum_{i=1}^N v_0 \rho_i^2}=\sqrt{v_0}$ almost surely; hence by Theorem 1 and the DCT,
We then turn our attention to the short-time skew. With $\lambda=H_-$ , Theorem 2 and the DCT imply
where we used $\sum_{i=1}^N \rho_i^2=1$. For any $j\in [\![ 1,N ]\!]$, the Cauchy–Schwarz inequality yields
where $\mathbb{E}\big[ (\mathrm{D}^j_s v_y)^2 \big]^\frac{1}{2}\le C (y-s)^{H_{-}}$ for some finite constant C by (ii). Therefore,
Since the fraction is equal to $((H+\frac32)H_+)^{-1}$ and $\limsup_{T\downarrow0} \mathbb{E}\big[(\sqrt{v_t/v_0}-1)^2 \big]$ vanishes by (iv), we obtain
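For the record, the constant $((H+\frac32)H_+)^{-1}$ presumably stems from the elementary identity
$\int_0^T\!\!\int_0^y (y-s)^{H_{-}}\,\mathrm{d}s\,\mathrm{d}y = \int_0^T \frac{y^{H_{+}}}{H_{+}}\,\mathrm{d}y = \frac{T^{H+\frac32}}{\big(H+\frac32\big)H_{+}},$
using $H_{-}+1=H_{+}$.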
6.5.2. Proof of Corollary 1
Since
Lemmas 7 and 8 show that the assumptions (i)–(iii) of Proposition 4 hold. Moreover, v has almost surely continuous paths; hence $\sqrt{\frac{v_t}{v_0}}$ tends to one almost surely and (iv) holds by the reverse Fatou lemma. For $0\le s\le y$, (33) implies
and clearly $\mathbb{E}[\mathrm{D}^3_s v_y] =0$ . Therefore, Proposition 4 implies
6.6. Partial derivatives of the Black–Scholes function
Recall the Black–Scholes formula (6) and assume that $\varsigma\,:\!=\,\sigma\sqrt{T-t} \gt 0$ is fixed. Then
so that (we drop the dependence on t and $\sigma$ in the $G({\cdot})$ notation)
Now define
We then have
For the partial derivatives, we note that $\partial_{x} G = \frac{1}{\varsigma\sqrt{2\pi}}\, \partial_{x} f\, \mathrm{e}^{f}$, which implies the ATM formula
and furthermore,
We further define the partial derivatives appearing in the proof of Theorem 2, after (24):
Using $\partial_k f = 1-\partial_x f$ and $\partial_{xk} f = - \partial_{xx} f = -\partial_{kk}f$ , we compute
Finally, we need the derivatives featuring in the proof of Theorem 3. We start with
The next partial derivative yields
and differentiating one last time we reach
We conclude that
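As an aside, the classical relations underlying these manipulations can be checked symbolically. The minimal sympy sketch below assumes the standard undiscounted Black–Scholes call written in log-spot x, log-strike k, and total volatility $\varsigma$; this parameterisation and the test point are ours for illustration and may differ from the exact form of (6) and of the function f above.

import sympy as sp

# Hypothetical parameterisation: undiscounted call in log-spot x, log-strike k,
# total volatility varsigma (not necessarily identical to (6) in the paper).
x, k, vs = sp.symbols('x k varsigma', real=True)
Phi = lambda z: (1 + sp.erf(z / sp.sqrt(2))) / 2        # standard normal cdf
phi = lambda z: sp.exp(-z**2 / 2) / sp.sqrt(2 * sp.pi)  # standard normal pdf
d1 = (x - k) / vs + vs / 2
d2 = d1 - vs
G = sp.exp(x) * Phi(d1) - sp.exp(k) * Phi(d2)

pt = {x: 0.12, k: 0.05, vs: 0.3}                        # arbitrary test point
checks = {
    "e^x phi(d1) = e^k phi(d2)": sp.exp(x) * phi(d1) - sp.exp(k) * phi(d2),
    "G_x + G_k = G (homogeneity)": sp.diff(G, x) + sp.diff(G, k) - G,
    "G_x = e^x Phi(d1) (delta)": sp.diff(G, x) - sp.exp(x) * Phi(d1),
}
for name, expr in checks.items():
    print(f"{name}: {float(expr.subs(pt)):+.2e}")       # all residuals ~ 0

All three residuals evaluate to zero up to floating-point error; the second identity reflects the homogeneity of the call price in $(\mathrm{e}^x,\mathrm{e}^k)$.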
Funding information
A. P. acknowledges financial support from the EPSRC CDT in Financial Computing and Analytics, while A. P. and A. J. acknowledge support from the EPSRC EP/T032146/1 grant. Part of the project was carried out while A. P. was at Imperial.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process of this article.