1. Introduction
A fundamental issue in modelling biological populations is the risk of a population or species becoming extinct. The corresponding issue in modelling infectious spread is extinction of infection from a population. In either case, a random variable of particular interest is the persistence time until extinction occurs.
A substantial body of recent research addresses the issue via the semi-rigorous WKB (Wentzel–Kramers–Brillouin) approach; see the review papers [Reference Assaf and Meerson2, Reference Ovaskainen and Meerson20]. Such work focuses specifically on the supercitical case (defined precisely in Section 2 below), so that the expected persistence time grows exponentially with population size. For many naturally one-dimensional models (processes that can be approximated, in a sense made precise in Section 2.3, by the solution of a one-dimensional ordinary differential equation), it can be shown that in the supercritical case, as a system size parameter N tends to infinity, the expected persistence time $\tau$ starting from quasi-equilibrium satisfies
where we use $\sim$ to denote that the ratio of the two sides converges to 1 as $N \to \infty$ . Here A and K are constants whose values do not depend upon N, with explicit formulae available for A and K in terms of parameters of the process.
For multidimensional models, it is usually only possible to establish results of the cruder form $\ln \tau \sim A N$ , and to evaluate the leading-order constant A via numerical solution of a system of ordinary differential equations. An exception is provided by [Reference Clancy9], where a relation of the form (1) and explicit formulae for A and K are obtained using the WKB approach for a susceptible–infectious–susceptible (SIS) infection model in a heterogeneous population, a naturally multidimensional process.
In a different vein, a rather more rigorous approach is taken in [Reference Chazottes, Collet and Méléard7]. For a general class of multitype birth–death processes, the existence of a quasi-stationary distribution is proved and a bound established for the total variation distance between the process conditioned to non-extinction before time t and the quasi-stationary distribution, As a by-product of this analysis, it is shown that there exist constants $d_1 \gt d_2 \gt 0$ such that for all sufficiently large N, the mean persistence time $\tau$ starting from quasi-stationarity satisfies $\exp ( d_2 N ) \le \tau \le \exp ( d_1 N )$ .
A limitation of both the WKB approach and the approach of [Reference Chazottes, Collet and Méléard7] is that the models considered generally assume that individuals’ lifetimes or infectious periods are exponentially distributed. This is not usually biologically realistic, the assumption being rather for reasons of mathematical tractability. In [Reference Ball, Britton and Neal5] the effects of relaxing this assumption were considered for a class of (one-dimensional) birth–death processes, via application of a network insensitivity result of [Reference Zachary23]. The subcritical and critical cases were considered in addition to the more widely studied supercritical case. In the critical case it was found that the expected persistence time $\tau_0$ , starting from one initial individual, depends logarithmically upon the system size parameter N as $N \to \infty$ , whereas in the subcritical case, $\tau_0$ was shown to converge to a finite limit as $N \to \infty$ . In [Reference Clancy9], the multitype version of the insensitivity result of [Reference Zachary23] was applied to allow for non-exponentially distributed infectious periods in the SIS infection model in a heterogeneous population, in the supercritical case.
In the current paper, we consider a particular class of multitype birth–death processes, defined in Section 2. We build on the approach of [Reference Ball, Britton and Neal5], so we do not need to assume exponentially distributed lifetimes. For this class of processes we are able to obtain unusually precise results, including a result of the form (1) in the supercritical case, with explicit formulae for the constants A and K. The only multidimensional process for which such a precise result has previously been available is the heterogeneous population SIS model of [Reference Clancy9]. Not only is the class of processes considered here considerably more general than in [Reference Clancy9], but the methods of proof are quite different, and in contrast to [Reference Clancy9] we consider the subcritical and critical cases as well as the supercritical case. The class of processes is more restricted than that of [Reference Chazottes, Collet and Méléard7], but our results, unlike those of [Reference Chazottes, Collet and Méléard7], do not require exponentially distributed lifetimes. In estimating the mean persistence time starting from a single individual, our approach is fully rigorous, unlike the WKB method of [Reference Clancy9]. In estimating the mean persistence time starting from quasi-stationarity, we present only a heuristic sketch proof, referring the reader to [Reference Ball, Britton and Neal4] for full details of the proof in a specific one-dimensional setting.
The remainder of the paper is structured as follows. In Section 2, we define our general modelling framework and give two illustrative examples, before describing two standard approximating processes that provide motivation and intuition for what follows. In Section 3, we define a restarted version of our general process and show how this restarted process may be used to analyse extinction times for the original process of interest. In Sections 4 and 5, we consider the subcritical and critical cases, respectively, obtaining limiting results (Theorems 1, 2, and 3) for the expected extinction time starting from a single initial individual. We move on in Section 6 to the supercritical case, for which we again obtaining a limiting result (Theorem 4) for the expected extinction time starting from a single initial individual, before going on to outline the corresponding result (the formula (49)) for the expected extinction time starting from quasi-equilibrium. Finally, in Section 7 we present some concluding discussion.
2. The model
We consider a sequence of multitype birth–death processes indexed by N, where N is an overall system size parameter, our interest being in the limiting behaviour as $N \to \infty$ . The state of the process at time $t \ge 0$ is denoted by
where k is the number of types of individual, and for $i=1,2,\ldots,k$ , $X_i^{(N)} (t)$ is the number of type i individuals present at time t. We assume that the state space S is either the whole of ${\mathbb Z}_+^k$ , or the finite set $\left\{ \boldsymbol{{x}} = \left( x_1 , x_2 , \ldots , x_k \right) :\ 0 \le x_i \le N_i \mbox{ for } i=1,2,\ldots,k \right\}$ for some $\boldsymbol{{N}} = \left( N_1 , N_2 , \ldots , N_k \right)$ , and that the extinction state $\boldsymbol{{x}} = {\bf 0}$ is absorbing, while $C = S \setminus \{ {\bf 0} \}$ forms a single communicating class. In the case of finite state space, noting that $N_i$ is the upper bound for the number of individuals of type i, we take $N = N_1 + N_2 + \cdots + N_k$ , set $f_i = N_i / N$ for $i=1,2,\ldots,k$ , and assume that $f_1 , f_2 , \ldots , f_k$ are all strictly positive. For simplicity we assume that $f_1 , f_2 , \ldots , f_k$ do not vary with N, so when we write, for instance, $N \to \infty$ , it is implicit that N increases through a subsequence of integers such that $N f_i$ is always integer-valued for all i. To avoid overly cumbersome notation, we suppress the superscript N from now on.
We assume that all birth and death rates scale with N in the manner of the ‘density-dependent’ processes of Chapter 11 of [Reference Ethier and Kurtz13], so that when N is large the scaled process $\boldsymbol{{X}} (t) / N$ may be approximated by a deterministic process (see Section 2.3 for details). We allow the birth and death rates for type i individuals to depend upon (i) the (scaled) number of type i individuals present, and (ii) the (scaled) total number of individuals of all types. More precisely, if we denote by $\xi_1 , \xi_2 , \ldots , \xi_k$ a set of independent homogeneous Poisson processes of rate 1, then for $i=1,2,\ldots,k$ , new type i individuals are born at the points of the time-transformed process
for some functions $b_0 , b_1 , \ldots , b_k$ from ${\mathbb R}^+$ to ${\mathbb R}^+$ . Each newly born type i individual is assigned a ‘lifeforce’ distributed as a non-negative random variable $Q_i$ , independent of other individuals’ lifeforce variables and of the processes $\xi_1 , \xi_2 , \ldots , \xi_k$ , with $\mathbb{E} \left[ Q_i \right] = 1$ for $i=1,2,\ldots,k$ . Each type i individual’s lifeforce reduces at rate
for some functions $d_0 , d_1 , \ldots , d_k$ from ${\mathbb R}^+$ to ${\mathbb R}^+$ , until it reaches zero, at which point the individual is removed from the population (dies). Note that if $d_0 (y) = 1$ and $d_i(y) = y / \alpha_i$ for $i=1,\ldots,k$ , for some constants $\alpha_1 , \alpha_2 , \ldots , \alpha_k \gt 0$ , then the time that a type i individual remains alive in the population is distributed as $\alpha_i Q_i$ , so the lifeforce random variables are simply scaled lifetimes. More generally, the functions $d_0 , d_1 , \ldots , d_k$ may be interpreted as competition effects, so that $d_0$ represents the effect on an individual’s lifespan of competition from the entire population, while for $i=1,2,\ldots,k$ , $d_i$ represents the effect on a type i individual’s lifespan of competition from individuals of its own type. Alternatively, these could be ‘safety in numbers’ effects, whereby the presence of others enhances an individual’s survival chances.
The model that we have defined may be regarded as a piecewise deterministic Markov process [Reference Davis11], in which the state of the process at time t is given by $\boldsymbol{{X}}(t)$ together with the remaining lifeforce values at time t for every individual alive in the population at that time. However, our approach avoids the need for the general machinery of piecewise deterministic Markov processes, so we will not pursue this. In the case that $Q_i$ is exponentially distributed for $i=1,2,\ldots,k$ , $\left\{ \boldsymbol{{X}} (t) :\ t \ge 0 \right\}$ is a continuous-time Markov process with transition rates as follows (where $\boldsymbol{{e}}_i$ denotes the unit vector with ith element equal to 1 and all other elements equal to 0):
To see this, we use an argument based on a construction of [Reference Sellke21]. Suppose that $\boldsymbol{{X}}(t) = \boldsymbol{{x}}$ for some $\boldsymbol{{x}} \in {\mathbb Z}_+^k$ . Since each $Q_i$ has constant hazard rate equal to 1, it follows from the expression (3) that the probability that a particular type i individual is removed in the time interval $[t , t + \delta t )$ is $N d_0 \left( \sum_{j=1}^k x_j / N \right) \left( d_i \left( x_i / N \right) / x_i \right) \delta t + o ( \delta t )$ . Independence of the lifeforce random variables of different individuals then implies that the probability that exactly one individual is removed in the time interval $[t , t + \delta t )$ , and they are of type i, is $N d_0 \left( \sum_{j=1}^k x_j / N \right) d_i \left( x_i / N \right) \delta t + o ( \delta t )$ . Similarly, the expression (2) together with independence of the processes $\xi_1 , \xi_2 , \ldots , \xi_k$ implies that the probability that exactly one individual is born in the time interval $[t , t + \delta t )$ , and they are of type i, is $N b_0 \left( \left. \sum_{j=1}^k x_j \right/ N \right) b_i ( x_i / N ) \delta t + o ( \delta t )$ . Independence between the lifeforce random variables and the processes $\xi_i$ then yields the required result.
2.1. Examples
Two examples of processes that illustrate the biological usefulness of the framework described above are as follows.
Example 1. The susceptible–infectious–susceptible (SIS) infection model of [Reference Clancy8, Reference Clancy9] with heterogeneous susceptibilities and infectious periods. This process has finite state space, with $N_i$ now representing the (constant) total number of type i individuals in the population, while $X_i (t)$ is the number of infectious type i individuals at time t. Rate functions are $b_0(y) = \beta y$ , $d_0 (y) = 1$ , and for $i=1,2,\ldots,k$ , $b_i (y) = \mu_i ( f_i - y )$ , $d_i (y) = y/\alpha_i$ . Here $\beta$ is an overall infection rate parameter, while for $i=1,2,\ldots,k$ , $\mu_i$ gives the level of susceptibility of type i individuals, $\alpha_i$ is the mean infectious period of type i individuals, and $f_i = N_i / N$ is the proportion of all individuals that are of type i. We assume that $\beta \gt 0$ and that $\alpha_i , \mu_i , f_i \gt 0$ for $i=1,2,\ldots, k$ , and without loss of generality we scale the $\mu_i$ values so that $\sum_{i=1}^k \mu_i f_i = 1$ .
Example 2. The multitype birth–death process with linear birth rates and quadratic death rates described in [Reference Chazottes, Collet and Méléard7, Section 2.2]. This process has state space ${\mathbb Z}_+^k$ , and may be obtained by setting $b_0(y) = \lambda y$ , $d_0(y) = \mu + \kappa y$ , and for $i=1,2,\ldots,k$ , $b_i (y) = 1$ , $d_i (y) = y$ , with parameters $\lambda , \mu , \kappa \gt 0$ . In [Reference Chazottes, Collet and Méléard7] it is additionally assumed that the lifeforce variables $Q_i$ are exponentially distributed.
2.2. Assumptions
In order to prove our results, we adopt the following assumptions throughout.
In the case that $S = {\mathbb Z}_+^k$ we set $\tilde C_0 = \tilde C_1 = \cdots = \tilde C_k = (0,\infty)$ , while in the case of finite state space we set $\tilde C_0 = (0,1)$ and $\tilde C_i = \left( 0 , f_i \right)$ for $i=1,2,\ldots,k$ . To ensure that ${\bf 0}$ is an absorbing state, that $C = S \setminus \{ {\bf 0} \}$ is a communicating class, and that the process cannot leave S, we require that
and for $i=1,2,\ldots,k$ ,
In the case of finite state space, we additionally require
We further assume that
Finally, define
In Section 3 we will define a modified version of our process, restarted in a particular way whenever the original process is absorbed at the state $\boldsymbol{{x}} = {\bf 0}$ . The stationary distribution of this restarted process is found to be proportional to $\{ a ( \boldsymbol{{x}} ) : \boldsymbol{{x}} \in S \}$ ; see Section 3. We assume that for all sufficiently large N,
The significance of the conditions (10) and (11) is discussed in Section 3. Note that in the case of a finite state space, these two conditions are trivially satisfied. For Example 2 of Section 2.1, we verify (10) as follows. Recalling the multinomial formula
we have
The condition (11) follows by similar arguments.
We assume the conditions (4)–(8), (10), (11) throughout. For the supercritical case, we require the further assumptions (35)–(39), set out in Section 6.
2.3. Deterministic and branching process approximations
We now describe two approximating processes, valid in different regimes, that are useful in motivating our methods and giving some intuition behind our results.
Firstly, when N is large and, for $i=1,2,\ldots,k$ , $X_i(t)$ is of comparable order to N, the scaled process $\boldsymbol{{X}}(t) / N$ may be approximated by the deterministic process $\boldsymbol{{y}}(t) = \left( y_1 (t) , y_2 (t) , \ldots , y_k (t) \right)$ defined by
for $i=1,2,\ldots,k$ . That is, in the deterministic process, the (scaled) number of individuals of type i at time t, $y_i (t)$ , consists of all those individuals born into group i at time $t-s$ who remain alive at time t, integrated over $s \ge 0$ . Individuals are born into group i at time $t-s$ at rate $b_0 \left( \sum_{j=1}^k y_j (t-s) \right) b_i ( y_i (t-s) )$ . The proportion of such individuals remaining alive at time t is equal to $\mathbb{P} \left( Q_i \gt \int_{u=0}^s \eta_i (t-u) \, du \right)$ , where $\eta_i (t-u)$ is the rate of reduction of each type i individual’s lifeforce at time $t-u$ , given by $\eta_i (t-u) = \left. d_0 \left( \sum_{j=1}^k y_j (t-u) \right) d_i ( y_i (t-u) ) \right/ y_i (t-u)$ .
When the $Q_i$ are exponentially distributed, (12) is equivalent to the system
The assumptions (4) and (5) ensure that the system (13) has an equilibrium point at $\boldsymbol{{y}} = {\bf 0}$ , corresponding to extinction. Assuming that $\boldsymbol{{X}}(0)/N \to \boldsymbol{{y}}_0$ as $N \to \infty$ , where $\boldsymbol{{y}}_0 \ne \textbf{0}$ , it follows immediately from [Reference Ethier and Kurtz13, Section 11.2] that if the $Q_i$ are exponentially distributed, then under certain conditions on the functions $b_0 , b_1 , \ldots , b_k$ and $d_0 , d_1 , \ldots , d_k$ , in the limit as $N \to \infty$ the scaled process $\boldsymbol{{X}}(t)/N$ converges in probability over finite time intervals to the process $\boldsymbol{{y}}(t)$ satisfying the ordinary differential equations (13) with initial condition $\boldsymbol{{y}}(0)=\boldsymbol{{y}}_0$ . We will not need to make use of this result; rather the approximating deterministic process is used merely to provide intuitive motivation.
Secondly, when N is large and, for $i=1,2,\ldots,k$ , $X_i(t)$ is small compared to N, the process $\boldsymbol{{X}}(t)$ may be approximated by a (linear) multitype branching process as follows. Suppose that $\boldsymbol{{X}} (t) = \boldsymbol{{x}}$ for some $\boldsymbol{{x}} \in {\mathbb Z}_+^k$ . Then, recalling the assumptions (5) and (7), the rate (3) at which each type i individual’s lifeforce reduces may be approximated by
Hence each type i individual in the approximating branching process lives for a time distributed as $\left( d_0 (0) d_i^\prime (0) \right)^{-1} Q_i$ . During its lifetime, the rate at which each type i individual gives birth to type j individuals in the approximating branching process is given, from the expression (2), recalling the assumptions (4) and (7), by
for $j=1,2,\ldots,k$ .
The mean number of type j offspring produced by a type i individual over its lifetime in this approximating process is
Denoting by M the matrix with elements $m_{ij}$ , the assumptions (5) and (8) ensure that all elements of M are finite and strictly positive. Further, it follows from the expression (14) that M is of rank 1, with Perron–Frobenius eigenvalue $R_0$ given by its trace, viz.
This type of branching process approximation is made rigorous for certain epidemic models in, for example, [Reference Ball3, Reference Ball and O’Neill6]. We will not make use of such results, as we employ the approximating branching process only for intuitive motivation and in the heuristic argument of Section 6.2. In the context of epidemic modelling, the matrix M is referred to as the next-generation matrix and its Perron–Frobenius eigenvalue, $R_0$ , as the basic reproduction number. In general, the next-generation matrix M need not be of rank 1; this is a consequence of our model assumptions, in particular the fact that in the approximating branching process, the rate at which type i individuals give birth to type j individuals depends only upon j.
Denote by $\omega_i$ the probability that the multitype branching process initiated by a single individual of type i produces a finite number of progeny, and by $\phi_i ( \cdot )$ the moment generating function of the lifeforce random variable $Q_i$ ; then from standard theory of multitype branching processes, e.g. [Reference Mode19], the probabilities $\omega_i$ satisfy
For $R_0 \le 1$ , we have $\omega_1 = \omega_2 = \cdots = \omega_k = 1$ , while for $R_0 \gt 1$ , $\boldsymbol{\omega} = \left( \omega_1 , \omega_2 , \ldots , \omega_k \right)$ is the unique solution of the equations (16) in $[0,1]^k \setminus \{ {\bf 1} \}$ , and $\omega_i \lt 1$ for $i=1,2,\ldots,k$ (see [Reference Mode19, Section 1.7]).
This approximating multitype branching process can alternatively be viewed as a single-type branching process, as follows. Each individual has lifetime distributed according to a mixture distribution, distributed as $\left( d_0(0) d_i^\prime(0) \right)^{-1} Q_i$ with probability $b_i(0) / \sum_j b_j(0)$ for $i=1,2,\ldots,k$ , and during its lifetime gives birth at rate $b_0 ^\prime(0) \sum_j b_j(0)$ . Denote by $\omega$ the probability that this single-type branching process produces a finite number of progeny. Then for $R_0 \le 1$ , we have $\omega = 1$ , while for $R_0 \gt 1$ , $\omega$ is the unique solution in [0,1) of
This single-type branching process is equivalent to the preceding multitype branching process provided that the multitype process is initiated by a single individual which is of type i with probability $b_i(0) / \sum_j b_j (0)$ for $i=1,2,\ldots,k$ . Consequently, we have the relationship
Note that combining the relationship (18) with the equations (16) gives
so that each probability $\omega_i$ ( $i=1,2,\ldots,k$ ) can be expressed in terms of the single probability $\omega$ .
A link between the approximating deterministic process and the approximating branching process is as follows. Recalling the assumptions (4) and (5), with $\delta_{ij}$ denoting the Kronecker delta, the Jacobian $J ( {\bf 0} )$ of the deterministic system (13) at $\boldsymbol{{y}} = {\bf 0}$ has elements
From [Reference Diekmann, Heesterbeek and Roberts12, Theorem A.1] or [Reference Van den Driessche and Watmough22, Theorem 2], all eigenvalues of $J ( {\bf 0} )$ have strictly negative real part precisely when $R_0 \lt 1$ , and conversely $J ( {\bf 0} )$ has at least one eigenvalue with strictly positive real part precisely when $R_0 \gt 1$ . That is, subcriticality of the approximating branching process implies local stability of the extinction equilibrium point $\boldsymbol{{y}} = {\bf 0}$ of the deterministic system (13), while supercriticality of the branching process implies local instability of the extinction point.
3. The restarted process
In order to analyse the long-term behaviour of our process, we follow [Reference Ball, Britton and Neal5, Reference Hernández-Suárez and Castillo-Chavez14] in considering first the stationary behaviour of a modified process without extinction. Specifically, we introduce a regeneration step as follows. Whenever the process reaches the state $\boldsymbol{{x}} = {\bf 0}$ , it remains there for an exponentially distributed time of mean 1, after which a birth occurs, being of type i with probability $\rho_i$ for some distribution $\boldsymbol{\rho} = \left( \rho_1 , \rho_2 , \ldots , \rho_k \right)$ , and the process then continues as before. Taking
ensures that for the case of exponentially distributed lifeforce variables $Q_i$ , the restarted process is reversible [Reference Kelly16]. We can then apply detailed balance conditions to find its stationary distribution $\boldsymbol{\pi} = \{ \pi ( \boldsymbol{{x}} ) :\ \boldsymbol{{x}} \in S \}$ . Recalling the definition (9) of $a ( \boldsymbol{{x}} )$ , we find that
with
The assumptions (10) and (11) imply, respectively, that $\boldsymbol{\pi}$ is a proper distribution and that the condition (11) of [Reference Zachary23] is satisfied. From [Reference Zachary23, Theorem 2] it now follows that $\boldsymbol{\pi}$ is the stationary distribution of numbers of individuals of each type in the system, regardless of the distributions of the $Q_i$ (recall that $\mathbb{E} [Q_i]=1$ for $i=1,2, \dots,k$ ).
The stationary distribution $\boldsymbol{\pi}$ of the restarted process, restricted to C, may be used to approximate the quasi-stationary distribution $\boldsymbol{{q}}$ of the original process (assuming that a unique quasi-stationary distribution exists). For the classic (single-type) SIS epidemic model, this gives the approximation of [Reference Kryscio and Lefèvre17, Section 3.1]. It has been shown [Reference Clancy and Pollett10, Reference Keilson and Ramaswamy15] that for general single-type birth–death processes (with exponentially distributed lifetimes), $\boldsymbol{\pi}$ restricted to C is a lower bound for $\boldsymbol{{q}}$ in the sense of likelihood ratio ordering of distributions.
The mean time between regenerations is $1 / \pi ( {\bf 0} )$ . Denoting by $\tau_i$ the mean time taken for the process to first hit state $\bf 0$ starting from a single type i individual (which is the same for the original process as for the restarted version) and recalling the form (20) of the probabilities $\rho_i$ and the formula (22), we thus have
To investigate mean persistence time, we consider the asymptotic behaviour of the sum (23) as $N \to \infty$ .
4. The subcritical case
For the subcritical case $R_0 \lt 1$ , we have the following.
Theorem 1. Suppose that $R_0 \lt 1$ , that the assumptions (4)–(8) are satisfied, and that $b_0^\prime , b_1 , b_2 , \ldots , b_k$ are non-increasing functions, while $d_0 , d_1^\prime , d_2^\prime , \ldots , d_k^\prime$ are non-decreasing functions.
Then the conditions (10), (11) are automatically satisfied, and recalling that $\tau_i$ denotes the mean time to extinction starting from a single type i individual, we have
Proof. First, note that the conditions of the theorem imply that
Now from the multinomial theorem, we have
and so
so that the condition (10) is satisfied, and from Equation (23) it follows that
A similar argument shows that under the conditions of the theorem,
so that the condition (11) is satisfied.
Next, for any $m \in \mathbb{N}$ , we have
Since m is fixed, then as $N \to \infty$ , for each $r \le m$ we have $N b_0 (r/N) \to r b_0^\prime(0)$ and $d_0 ((r+1)/N) \to d_0 (0)$ , and for $i=1,2,\ldots,k$ we have $b_i (r/N) \to b_i (0)$ and $N d_i((r+1)/N) \to (r+1) d_i^\prime(0)$ . Note also that for $l \le m$ , for all sufficiently large N, $\left\{ \boldsymbol{{x}} \in C : \sum_i x_i = l \right\} = \left\{ \boldsymbol{{x}} \in {\mathbb Z}_+^k : \sum_i x_i = l \right\}$ . Hence
and letting $m \to \infty$ we obtain
The result follows.
Remark 2. In the symmetric case where $b_1 = b_2 = \cdots = b_k$ and $d_1 = d_2 = \cdots = d_k$ , we have $\tau_1 = \tau_2 = \cdots = \tau_k$ , and so for $i=1,2,\ldots,k$ we have
In particular, for Example 2 of Section 2.1, with $R_0 = \lambda k / \mu$ , we have that for $i=1,2,\ldots,k$ ,
Remark 3. For Example 1 of Section 2.1, the heterogeneous SIS model, with $R_0 = \beta \sum_{i=1}^k \alpha_i \mu_i f_i$ , Theorem 1 yields that
It seems intuitively clear that for this model, $\tau_i$ will depend upon i only through the mean infectious period $\alpha_i$ , and that larger $\alpha_i$ values will correspond to larger values of $\tau_i$ . Given the form of (24), a natural conjecture then is that $\tau_i \to - \left( \alpha_i / R_0 \right) \log \left( 1 - R_0 \right)$ as $N \to \infty$ . For more general processes satisfying the conditions of Theorem 1, we correspondingly conjecture that $\tau_i \to - \left( \log \left( 1 - R_0 \right) \right) / \left( d_0 (0) d_i^\prime (0) R_0 \right)$ as $N \to \infty$ .
5. The critical case
For the critical case $R_0 = 1$ , we present results for two models, Example 1 of Section 2.1 and an extension of Example 2. The algebra is rather more intricate and model-specific than in the subcritical case.
Theorem 2. For the SIS infection model with heterogeneous susceptibilities and infectious periods, Example 1 of Section 2.1, if $R_0=1$ then
Proof. For $\boldsymbol{{x}} \in C$ and $\sum_{i=1}^k x_i=l$ ,
Now,
where
(For $x \in \mathbb{R}$ , $\lfloor x \rfloor$ denotes the greatest integer $\le x$ and, for future reference, $\lceil x \rceil$ denotes the smallest integer $\ge x$ .) Using (25) and the multinomial theorem,
since $R_0=\beta \sum_{i=1}^k \alpha_i \mu_i f_i =1$ .
Turning to $B_N$ , note that by the AM/GM inequality, if $x_i>0$ then
and that this inequality holds trivially if $x_i=0$ . Therefore,
Further, if $l \gt \lfloor \sqrt{N} \rfloor$ and $x_1+x_2+\dots+x_k=l$ , then at least one of $x_1, x_2, \dots, x_k$ is $\ge \sqrt{N}/k$ . Hence, for all sufficiently large N, for such l,
and using the multinomial theorem, it follows from (27) that
since $R_0=1$ . Thus,
for all sufficiently large N, which together with (26) implies
Fix $V>0$ . Then it follows from (25) that for any $\boldsymbol{{x}} \in C$ with $\sum_{i=1}^k x_i=l$ , where $l \le \lfloor V\sqrt{N} \rfloor$ ,
Thus, for $l \le \lfloor V\sqrt{N} \rfloor$ and $R_0=1$ , application of the multinomial theorem yields
whence
Straightforward analysis (see Appendix A) yields
Therefore,
and letting $V \downarrow 0$ ,
Remark 4. Theorem 2 for the case $k=1$ corresponds to the relation (3.7) of [Reference Ball, Britton and Neal5].
We consider now an extension of Example 2 of Section 2.1, in which $d_0(y)=\mu+\kappa y^{\eta}$ , where $\eta \gt 0$ . Note from (15) that $R_0=k \lambda/\mu$ .
Theorem 3. For the above extension of Example 2, if $R_0=1$ , then for $i=1,2,\ldots,k$ ,
Proof. For $\boldsymbol{{x}} \in C$ and $\sum_{i=1}^k x_i=l$ ,
where $\theta=\kappa/\mu$ . Thus, using the multinomial theorem,
and, since $R_0=1$ ,
Now
where
Clearly,
Also,
so
Note that in conjunction with (31) this shows that the condition (10) is satisfied, and a similar argument shows that the condition (11) is satisfied. Further, it follows that
Fix $L>0$ . Then
Straightforward analysis (see Appendix A) yields that, for $\eta, L>0$ ,
so
From (32) and letting $L \downarrow 0$ in (34), it follows that
Recalling Remark 2, from the symmetry of the process we have $\tau_1 = \tau_2 = \cdots = \tau_k$ , and the result follows.
6. The supercritical case
Our assumptions of Section 2.2 ensure that the system (13) has an equilibrium point at $\boldsymbol{{y}} = {\bf 0}$ . We now consider the supercritical case $R_0 \gt 1$ under the following further assumptions:
-
1.
(35) -
2.
(36) -
3.
(37) -
4.
(38) -
5.
(39)
We have already seen that when $R_0 \gt 1$ the Jacobian of the system (13) at $\boldsymbol{{y}} = {\bf 0}$ has at least one eigenvalue with strictly positive real part, so that the extinction equilibrium point is locally unstable; the condition (36) ensures that the equilibrium point $\boldsymbol{{y}}^*$ is locally stable.
6.1. Time to extinction starting from a single individual
Corresponding to Theorem 1 for the subcritical case, in the supercritical case we have the following.
Theorem 4. Suppose that $R_0 \gt 1$ , and that the assumptions (4)–(8), (10), (11) together with the assumptions (35)–(39) are all satisfied. With $\delta_{ij}$ denoting the Kronecker delta, denote by H the matrix with elements $h_{ij}$ given by
for $i,j = 1,2,\ldots,k$ . Recalling that $\tau_i$ denotes the mean time to extinction starting from a single individual of type i, as $N \to \infty$ we have
where
and
with
Proof. Letting $\boldsymbol{{y}} = \boldsymbol{{x}} / N$ and $\tilde a ( \boldsymbol{{y}} ) = a ( N \boldsymbol{{y}} )$ , the formula (9) may be written as
so that
For some $i=1,2,\ldots,k$ , consider the sum $\sum_{r=1}^{N y_i} \log \left( \left. b_i ( r/N ) \right/ d_i (r/N) \right)$ for large N. We can approximate this sum by an integral via the trapezium rule, with the complication that $d_i (0) = 0$ , so that the integrand diverges at the lower end. We deal with this by regularizing the integrand as follows:
where we have made use of the assumption (39) to control the order of the approximation error in the trapezium rule. From the assumption (8) we have that $d_i^\prime (0) \gt 0$ , so applying l’Hôpital’s rule to the term $\lim_{u \to 0} ( u / d_i (u) )$ and Stirling’s formula to the factorial term yields
The term $\sum_{r=1}^{N \sum_j y_j} \log \left( b_0 (r/N) / d_0 (r/N) \right)$ in (44) may be treated similarly, noting that from the assumptions (4), (8) we have $b_0 (0) = 0$ , $d_0 (0) \gt 0$ , and $b_0^\prime (0) \gt 0$ , and making use of the assumption (38) in invoking the trapezium rule. Equation (44) now becomes
That is, as $N \to \infty$ ,
where
and
It is clear from the form of (46) that for large N, contributions to the sum (23) will be dominated by terms $a ( N \boldsymbol{{y}} ) = \tilde a (\boldsymbol{{y}})$ where $f ( \boldsymbol{{y}} )$ achieves its maximal value. Recalling from (21) that the stationary distribution of the scaled restarted process is proportional to $\{ \tilde a(\boldsymbol{{y}}) :\ N \boldsymbol{{y}} \in S \}$ , one would intuitively expect this stationary distribution to be peaked around a unique mode that converges to the deterministic stable equilibrium point $\boldsymbol{{y}}^*$ as $N \to \infty$ , and to become increasingly strongly peaked as N increases. We will now show that $f ( \boldsymbol{{y}} )$ does indeed have a unique maximum point at $\boldsymbol{{y}}^*$ .
For $i=1,2,\ldots,k$ ,
so the condition for a stationary point of $f ( \boldsymbol{{y}} )$ is precisely the condition for an equilibrium point of the deterministic system (13). We have assumed that the system (13) has exactly two equilibrium points, at $\bf 0$ and $\boldsymbol{{y}}^*$ . We now show that $\boldsymbol{{y}}^*$ is a local maximum of $f ( \boldsymbol{{y}} )$ , and hence the unique global maximum.
The elements $h_{ij} = \frac{\partial^2 f}{\partial y_i \partial y_j}$ of the Hessian matrix H of $f ( \boldsymbol{{y}} )$ at $\boldsymbol{{y}}^*$ are given by (40). Since all off-diagonal elements of H are equal, application of the matrix determinant lemma shows that the determinant $\mbox{det} (-H)$ is given by the formula (43). Noting that $\boldsymbol{{y}}^*$ is an equilibrium point of (13), for $i=1,2,\ldots,k$ we have $b_0 \left( \sum_l y_l^* \right) b_i ( y_i^* ) = d_0 \left( \sum_l y_l^* \right) d_i ( y_i^* )$ , and so
for $i,j = 1,2,\ldots,k$ . That is, denoting by B the diagonal matrix with entries $b_{ii} = b_0 \left( \sum_l y_l^* \right) b_i \left( y_i^* \right)$ for $i=1,2,\ldots,k$ , we have $H = B^{-1} J ( \boldsymbol{{y}}^* )$ , where $J ( \boldsymbol{{y}}^* )$ is the Jacobian of the system (13) evaluated at $\boldsymbol{{y}}^*$ . Setting $\Sigma = - H^{-1}$ and noting that $\Sigma = \Sigma^T$ , it follows that $\Sigma$ satisfies the Lyapunov equation $J ( \boldsymbol{{y}}^* ) \Sigma + \Sigma J ( \boldsymbol{{y}}^* )^T = - 2 B$ . From [Reference Laub18, Theorem 13.21], this Lyapunov equation has a unique solution $\Sigma$ . Since all diagonal elements of B are strictly positive and, by the assumption (36), the eigenvalues of $J ( \boldsymbol{{y}}^* )$ all have negative real part, it follows from [Reference Laub18, Theorem 13.24] that $\Sigma$ is positive definite. Hence H is negative definite, and $\boldsymbol{{y}}^*$ is a local maximum point of $f ( \boldsymbol{{y}} )$ .
Writing $\tilde C = \tilde C_1 \times \tilde C_2 \times \cdots \times \tilde C_k$ , then as $N \to \infty$ , from (46) we obtain
Now the condition (10) implies that for sufficiently large N the integral on the right-hand side of (47) is convergent, and the conditions (38), (39) ensure that the o(1) term converges uniformly to zero within a neighbourhood of $\boldsymbol{{y}}^*$ . We can therefore apply the multivariate Laplace approximation to this integral, yielding
So finally,
where
and the result follows.
6.2. Time to extinction starting from quasi-equilibrium
In the supercritical case, under the assumptions (35)–(36), as well as the mean time to extinction starting from a single individual, another quantity of interest is the mean time to extinction starting from close to the quasi-equilibrium state $N \boldsymbol{{y}}^*$ . Denoting by $\tau$ this mean extinction time starting close to quasi-equilibrium, we follow [Reference Ball, Britton and Neal5] in outlining a heuristic derivation of the asymptotic form of $\tau$ .
Recall from Section 2.3 that in its initial phase, the process initiated from a single individual may be approximated by a multitype branching process. In the supercritical case $R_0 \gt 1$ , this branching process either goes extinct quickly, or takes off to produce an infinite number of progeny. When the approximating branching process takes off, the original multitype birth–death process will move quickly towards the quasi-equilibrium state $N \boldsymbol{{y}}^*$ , and spend a long time fluctuating close to this state before eventually going extinct. This can be seen from the form of the equilibrium distribution $\pi ( \boldsymbol{{x}} )$ , since we have seen in Section 6.1 that for large N, the probability mass is concentrated close to $N \boldsymbol{{y}}^*$ .
For a process initiated by a single individual of type i ( $i=1,2,\ldots,k$ ), denote by $\zeta_i$ the expected time to extinction of the process conditional upon rapid extinction, and by $\sigma_i$ the expected time taken to first attain a small neighbourhood of $N \boldsymbol{{y}}^*$ conditional upon the process taking off, and recall that $\omega_i$ denotes the extinction probability of the approximating branching process. Then in the limit as $N \to \infty$ , we have
where both $\zeta_i$ and $\sigma_i$ are negligible in comparison to $\tau$ , so
Recalling the relationship (18), from Theorem 4 we now obtain
where A is given by (41), and
with $K_0$ given by Equation (42) and $\omega$ being the solution to Equation (17).
The above heuristic argument is made rigorous for the case of the SIS infection model in $k=1$ dimension, with the additional condition that lifeforce random variables are of finite variance, in [Reference Ball, Britton and Neal4, Appendix B]. The proof is rather lengthy, even in this particular case. It is conjectured in [Reference Ball, Britton and Neal4] that the finite variance condition may not be necessary.
Example 1. For the heterogeneous-population SIS infection model (Example 1 of Section 2.1), (49) reduces, after some algebraic simplification, to the formula (6) of [Reference Clancy9].
Example 2. For Example 2 of Section 2.1, the multitype birth–death process with linear birth rates and quadratic death rates, we obtain
Note that for this process, if lifeforce variables are exponentially distributed, then the total number of individuals $\sum_{i=1}^k X_i (t)$ is a one-dimensional Markov process, and the formulae (51), (52) can alternatively be obtained from Equation (57) of [Reference Assaf and Meerson1].
6.3. Exponentially distributed lifeforces
In the case that the lifeforce random variables $Q_i$ are exponentially distributed, so that $\boldsymbol{{X}}(t)$ is a Markov process, the equations (16) become
Setting $D = \left( \left. b_0^\prime (0) \right/ d_0 (0) \right) \sum_{j=1}^k ( 1 - \omega_j ) b_j (0)$ , we may write the equations (53) as
Substituting from (54) back into both sides of (53) and rearranging, we find that for $R_0 \gt 1$ , D is the unique positive solution of
Recalling the relationship (18), it is then straightforward to show that
so that (50) becomes
The formula (49) with K given by (55) can alternatively be obtained via the WKB approximation method, following the same steps as in [Reference Clancy9]. Although neither the argument of Section 6.2 nor the WKB method is entirely rigorous, the fact that the two methods are in agreement provides some reassurance.
7. Discussion
In this paper, we have greatly extended the class of multidimensional processes for which precise asymptotic formulae such as (1) are available. We have found that in the subcritical case, expected persistence time remains bounded as the system size parameter N grows to infinity; in the examples considered in the critical case, expected persistence time grows logarithmically with N; and in the supercritical case, expected persistence time grows exponentially with N. This is as one might expect. What is more surprising is that we were able to obtain such precise and explicit formulae as those provided by Theorem 4 and (49). The key to our approach is that the restarted process of Section 3 is time-reversible, although the details of our proofs are also dependent upon the particular form of the transition rate functions. The extension of our results to more general restart-reversible processes is the subject of ongoing work.
It is worth noting that while formulae such as (41) and (42) may appear formidable, their evaluation is essentially trivial. This contrasts with the standard characterization of mean persistence time from quasi-stationarity in terms of an eigenvalue of the transition rate matrix, in which the eigenvalue equation appears simple, but evaluation of the relevant eigenvalue can present a formidable computational challenge.
For supercritical processes, arguably of greatest interest is the mean extinction time from quasi-equilibrium, $\tau$ . For the mean extinction time $\tau_i$ starting from a single individual of type i, Theorems 1, 2, and 4 provide limiting and asymptotic formulae only for the linear combination $\sum_{i=1}^k b_i (0) \tau_i$ . In the subcritical case, we conjectured in Section 4 that $\tau_i \to - \left( \log \left( 1 - R_0 \right) \right) / \left( d_0 (0) d_i^\prime (0) R_0 \right)$ as $N \to \infty$ for $i=1,2,\ldots,k$ . In the supercritical case, the heuristic arguments of Section 6.2 imply that $\tau_i \sim ( 1 - \omega_i ) \tau$ as $N \to \infty$ (the relationship (48)), which together with (49) gives the asymptotic form of $\tau_i$ for $i=1,2,\ldots,k$ .
We have focused here upon processes with a single absorbing state at the origin. Another interesting class of population processes is those for which absorption occurs at the boundary $A_0 = \{ \boldsymbol{{x}} \in {\mathbb Z}_+^k :\ x_i = 0 \mbox{ for some } i \}$ . The study of such processes is a topic for future work.
Appendix A. Proofs of Equations (29) and (33)
Consider first Equation (29). Fix $V, \gamma>0$ and for $N \gt (V\gamma)^2$ let
so
By Taylor’s theorem with the Lagrange form of the remainder, for $x \in (0,1)$ ,
Hence,
Also, $1-\frac{r\gamma}{N} \ge 1-\frac{V \gamma}{N^{\frac{1}{2}}}$ for $r=0,1,\dots, \lfloor V\sqrt{N} \rfloor$ , so
Thus,
which together with (56) implies
and (29) follows.
Turning to Equation (33), recall that $\eta, L>0$ , and let $c_N=\lfloor LN^{\frac{\eta}{1+\eta}} \rfloor$ and
For $x>0$ ,
so
where
Now,
Also, $\lim_{N \to \infty}N^{-\eta}c_N^{\eta+1} = L^{\eta+1}$ , so $\lim_{N \to \infty} d_N=\frac{\theta L^{\eta+1}}{\eta+1}$ . Further,
so $\lim_{N \to \infty} e_N=0$ . Thus,
and (33) follows.
Acknowledgement
It is a pleasure to thank two anonymous reviewers for detailed remarks that improved the presentation of this paper.
Funding information
There are no funding bodies to thank in relation to the creation of this article.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process of this article.