
The real-time growth rate of stochastic epidemics on random intersection graphs

Published online by Cambridge University Press:  24 March 2025

Carolina Fransson*
Affiliation:
Stockholm University
Maddalena Donà*
Affiliation:
University of Groningen
*Postal address: Stockholms universitet, 106 91 Stockholm, Sweden. Email: [email protected]
**Postal address: University of Groningen, PO Box 72, 9700 AB Groningen, The Netherlands. Email: [email protected]

Abstract

This paper is concerned with the growth rate of susceptible–infectious–recovered epidemics with general infectious period distribution on random intersection graphs. This type of graph is characterised by the presence of cliques (fully connected subgraphs). We study epidemics on random intersection graphs with a mixed Poisson degree distribution and show that in the limit of large population sizes the number of infected individuals grows exponentially during the early phase of the epidemic, as is generally the case for epidemics on asymptotically unclustered networks. The Malthusian parameter is shown to satisfy a variant of the classical Euler–Lotka equation. To obtain these results we construct a coupling of the epidemic process and a continuous-time multitype branching process, where the type of an individual is (essentially) given by the length of its infectious period. Asymptotic results are then obtained via an embedded single-type Crump–Mode–Jagers branching process.

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

In the earliest epidemic models, it is assumed that the disease spreads in a population consisting of indistinguishable individuals exhibiting homogeneous mixing. Since the advent of those early models, there has been considerable interest in incorporating realistic elements from real-world social structures that depart from the simplistic assumption of homogeneity. Such realistic features may take the form both of heterogeneity in social behaviour (some individuals may have a higher proclivity to being socially active than others, or the population may exhibit a more complex social structure than homogeneous mixing) and of biological differences in the ‘susceptibility’ and ‘infectivity’ of individuals.

To give some examples, for deterministic epidemic models this has been manifested through models where the population is stratified into a relatively small number of classes and individuals interact with each other at a rate that is determined by their classes. Individuals may, for instance, be spatially separated or stratified by age or sex. This typically gives rise to a system of differential equations, which governs the dynamics of the epidemic [Reference Heesterbeek, Britton and Diekmann18, Reference Watson33].

A similar development of increasingly complex social structures has taken place in the field of stochastic epidemic modelling on networks. In particular, a large body of epidemic models that aim to capture the tendency of individuals who know each other to have mutual acquaintances has appeared in the literature. In the context of models where the social network of the population is fully specified by a graph, this means that the graph is clustered (i.e. it contains a considerable number of triangles and other short circuits). Some examples include the great circle model [Reference Ball, Mollison and Scalia-Tomba1, Reference Ball and Neal3, Reference Neal25] and the closely related small-world network model [Reference Watts and Strogatz34], where individuals typically have both local contacts in a local environment, which exhibits clustering, and global contacts.

In a similar vein, several models that include the presence of small, closely connected groups, or cliques, with intense within-clique interactions have been introduced [Reference Ball, Mollison and Scalia-Tomba1, Reference Becker and Dietz11]. A clique may, for instance, represent a household, workplace, or school. Models with this feature have been investigated in various forms; see, for instance, [Reference Ball and Neal2, Reference Ball, Pellis and Trapman4, Reference Ball and Sirl6–Reference Ball, Sirl and Trapman8, Reference Pellis, Ball and Trapman30], to name a few.

In the present paper, we study the real-time growth rate of an epidemic that spreads on a random graph whose structure, like that of the above-mentioned models, is characterised by the presence of small (possibly overlapping) highly connected cliques. During the early phase of an epidemic, the number of infectious individuals typically grows exponentially; this is the case for many theoretical models and has also been observed in empirical data [Reference Dye15, Reference Nishiura and Chowell27]. The growth rate is one of the most readily available attributes of an emerging epidemic and it is arguably one of the most natural parameters by which to describe the seriousness of the epidemic. For many models of epidemics on random graphs with clustering, obtaining results that concern the real-time growth rate is, however, more challenging than analysing the final outcome of the epidemic. The reason for this is that results on the final outcome of an epidemic may be obtained without taking the actual chain of transmission into account. This idea was first mentioned in a paper devoted to epidemic modelling [Reference Ludwig24] but was, however, implicitly present in earlier literature on percolation [Reference Broadbent and Hammersley14, Reference Frisch and Hammersley17]. For this reason, many models lend themselves more readily to analysis of the final outcome than of the real-time growth rate. In [Reference Pellis, Ferguson and Fraser31] the authors proposed approximate methods for estimating the so-called household reproduction number based on observations of the real-time growth rate in a population structured into small (possibly overlapping) communities, both in the Markovian case and under the arguably strong assumption that the total “infectivity” of an infectious individual and the time points at which the individual transmits the disease are independent. 
On a related note, [Reference Ball and Shaw5] provided methods to estimate the within-household infection rate for a susceptible–infectious–recovered (SIR) epidemic among a population of households from the observed real-time growth rate.

As mentioned before, the real-time growth rate of an epidemic in a population with households, schools, and workplaces has previously been studied in [Reference Pellis, Ferguson and Fraser31], where (among other things) heuristic results similar to those presented here were obtained. In this paper we provide rigorous proofs of these results. It is worth pointing out that the methods employed here can be applied to a more general class of random graphs with cliques than the model considered in this paper, and also a more general class of household/school/workplace models than that studied in [Reference Pellis, Ferguson and Fraser31]. In particular, previous results concern only the case where the clique or household sizes are bounded and do not trivially extend to the setting with unbounded clique sizes, whereas the current paper deals with the unbounded case.

The key tool of this paper is a single-type branching process, which we embed in the epidemic process. Our approach is inspired by [Reference Iksanov and Meiners19], where a similar embedding was used to obtain the polynomial rate of convergence of multi-type branching processes. The techniques employed here are also related to what [Reference Sagitov32] calls regenerative Galton–Watson processes and to the concepts of local infectious clumps and global contacts in [Reference Ball and Neal2]; see also [Reference Olofsson29], which treats multitype branching processes with local dependencies.

To be more specific about the graph model, here we consider the real-time growth rate of epidemics on a random intersection graph [Reference Karoński, Scheinerman and Singer-Cohen23] in which the degree distributions are mixed Poisson. Simply put, a random intersection graph is constructed by dividing the nodes of the graph into groups (a node may belong to zero, one, or several groups) and then connecting nodes that belong to the same group, so that the groups form fully connected (possibly overlapping) subgraphs. Thus, a random intersection graph does, in general, contain a non-negligible number of short circuits, which makes the widely used branching process approximation of the early phase of the epidemic somewhat delicate. Epidemics on graphs of this type have previously been studied in [Reference Ball, Sirl and Trapman9], where expressions for the asymptotic probability of a major outbreak, the final size of a major outbreak, and a threshold parameter were derived. Epidemics on random intersection graphs were also studied in [Reference Britton, Deijfen, Lagerås and Lindholm13], where the clustering of the underlying network is tunable.

This paper is structured as follows. In Section 2.1 we present the notation conventions and abbreviations. Section 2.2 contains an introduction to the underlying graph model, in Section 2.3 we define the epidemic model, and in Section 2.4 we define the approximating branching process. The main results are presented in Section 3. Sections 3.1 and 4 contain some background theory and proofs of the main results.

2. Epidemics on random intersection graphs

2.1. Notation and abbreviations

This section contains a summary of the notation conventions and abbreviations that will be used frequently in this paper.

For any $B\subset \mathbb{R}$ and $x\in \mathbb{R}$ we use the notation $B_{\geq x}=B\cap [ x, \infty)$ , and $B_{>x}$ , $B_{\leq x}$ , and $B_{< x}$ are defined analogously.

For $x\in\mathbb{R}$ , $\lfloor x \rfloor=\sup\mathbb{Z}_{\leq x}$ . For real numbers x and y, $x\vee y=\max(x,y) $ and $\log_+(x)=\log(1\vee x)$ . For any $n\in \mathbb{Z}_{\geq 1}$ , $[n]=\{1,\ldots, n\}$ .

Let $f\colon\mathbb{R}\to\mathbb{R}$ and $g\colon\mathbb{R}\to\mathbb{R}_{> 0}$. We write $f(x)=\mathcal{O}(g(x))$ as $x\to \infty$ to indicate that $\limsup_{x\to\infty}\vert f(x)\vert/g(x)<\infty$, and $f(x)=o(g(x))$ as $x\to\infty$ to indicate that $\limsup_{x\to\infty}\vert f(x)\vert/g(x)=0$. Similarly, $f(x)= \Theta (g(x))$ as $x\to \infty$ if $f(x)=\mathcal{O}(g(x))$ and $\liminf_{x\to\infty}\vert f(x) \vert/g(x)>0$.

For a random variable X and an event A, we use the notation $E(X;A)=E(X\mathbf{1}(A))$ where $\mathbf{1}(A)$ is the indicator of A. For a non-negative random variable X, we say that the random variable Y has mixed Poisson distribution with intensity given by (the law of) X if $(Y\mid X=x)\sim \mathrm{Po}(x)$ , and we denote it by MP(X). For any non-negative integrable random variable X with $\mathbb{E}(X)>0$ , we denote the size-biased version of X by $\overline{X}$ , i.e. for any Borel set $B\subset \mathbb{R}$ ,

(2.1) \begin{gather} \mathbb{P}(\overline{X} \in B) = \frac{\mathbb{E}(X;\,X\in B)}{\mathbb{E}(X)}.\end{gather}
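To make the size biasing in (2.1) concrete, the following Python sketch (not part of the paper; the helper name `size_biased_sample` is ours) samples from $\overline{X}$ by rejection, under the assumption that X is bounded above by a known constant: a proposal x drawn from the law of X is accepted with probability proportional to x.

```python
import random

def size_biased_sample(draw_x, upper_bound, rng=random):
    """Draw from the size-biased law (2.1) by rejection:
    accept a proposal x with probability x / upper_bound.
    Assumes X is supported on [0, upper_bound]."""
    while True:
        x = draw_x()
        if rng.random() * upper_bound <= x:
            return x

# Example: X uniform on {1, 2, 3}; then P(Xbar = k) = k/6 and
# E(Xbar) = E(X^2)/E(X) = (14/3)/2 = 7/3.
random.seed(1)
draws = [size_biased_sample(lambda: random.choice([1, 2, 3]), 3.0)
         for _ in range(20000)]
print(sum(draws) / len(draws))  # close to 7/3
```

The same identity $\mathbb{E}(\overline{X})=\mathbb{E}(X^2)/\mathbb{E}(X)$ reappears below in the definition of $\mu_{\overline{A}}$.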

We will make frequent use of the abbreviations MP (mixed Poisson) and SIR (susceptible $\to$ infectious $\to$ recovered). Throughout this paper, $G_n$ denotes a random graph on n vertices generated via the random intersection graph model. We say that an event occurs with high probability (w.h.p.) if the probability of the event tends to 1 as $n\to\infty$ , where n is the number of vertices of the graph $G_n$ under consideration.

2.2. The random intersection graph with mixed Poisson degrees

We consider a random intersection graph model where the degrees of the nodes follow a mixed Poisson distribution. Epidemics on this particular type of graph have previously been investigated in [Reference Ball, Sirl and Trapman9], which used a branching process coupling to derive expressions for the asymptotic probability of a major outbreak (i.e. that a fraction $ \Theta (1)$ of the population contracts the disease in the limit as $n\to\infty$ , where n is the population size), the final size of a major outbreak, and a threshold parameter. In the present paper, we focus on the (exponential) real-time growth rate of an epidemic on a random intersection graph in the early phase of a major outbreak. We give a somewhat brief description of this graph model and refer the reader to [Reference Ball, Sirl and Trapman9] for a more detailed account.

A graph $G_n$ on n vertices can be constructed via the mixed Poisson random intersection graph model as follows (see Figure 1 for an illustration of this construction). Let A and B be two random variables with expected values $\mathbb{E}(A)=\mu_A$ and $\mathbb{E}(B)=\mu_B$ . We make the following assumption on A and B.

Figure 1. Construction of $G_n$ for $n=17$ with $\vert V_n'\vert=\vert\{v_1',\ldots,v_6'\}\vert=6$ cliques. Top: the auxiliary graph $G_n^{\text{aux}}$ . Bottom: the resulting directed graph $G_n$ .

Assumption 2.1.

  (i) $\mathbb{P}(A\geq 0)=\mathbb{P}(B\geq 0)=1$.

  (ii) $\mathbb{P}(A= 0)<1$ and $\mathbb{P}(B=0)<1$.

  (iii) $\mathbb{E}(A^2\log_+A)<\infty$ and $\mathbb{E}(B^2\log_+B)<\infty$.

We will refer to the condition of Assumption 2.1(iii) as the $x^2\log x$ -condition.

Remark 2.1. This version of the random intersection graph can be constructed under less strict assumptions than the $x^2\log x$ -condition (see [Reference Ball, Sirl and Trapman9]); it is, however, needed here for the approximating branching process to satisfy the classical $x\log x$ -condition (see [Reference Jagers20, p. 10]).

Let $\{A_k\}_k$ and $\{B_k\}_k$ be two sequences of independent copies of A and B, respectively. Further, let $V_{n}=\{v_1,\ldots, v_n\}$ be the vertex set of $G_n$ , and assign the weight $A_i$ to the vertex $v_i$ , $i=1,\ldots,n$ . As an intermediate step, we construct an auxiliary graph $G_n^{\textrm{aux}}$ with vertex set $V_n\cup V_n'$ , where $V'_{n}=\{v_1', \ldots, v_m'\}$ and $m=m(n)\,:\!=\,\lfloor n\mu_A/\mu_B\rfloor$ . Assign the weight $B_j$ to the vertex $v_j'$ , $j=1,\ldots, m$ . Given the weights of the vertices of $V_n$ and $V_n'$ , for each pair $v_i, v_j'$ of vertices of $G_n^{\text{aux}}$ such that $v_i\in V_n$ and $ v_j'\in V_n'$ , let the number of edges of $G_n^{\text{aux}}$ shared by $v_i$ and $v_j'$ have distribution $\mathrm{Po}({A_iB_j}/{n\mu_A})$ , independently for pairs $v_i,v_j'$ . Thus, in $G_n^{\text{aux}}$ the degree of $v_i\in V_n$ has distribution

(2.2) \begin{gather} \mathrm{Po}\bigg(A_i\frac{\mu_B^{(n)}\lfloor n\mu_A/\mu_B\rfloor}{n\mu_A}\bigg),\end{gather}

where $\mu_B^{(n)}\,:\!=\,{\sum_{j=1}^m B_j}/{m}$ . Similarly, in $G_n^{\text{aux}}$ the degree of $v_j'\in V_n'$ has distribution

(2.3) \begin{gather} \mathrm{Po}\bigg(B_j\frac{\mu_A^{(n)}}{\mu_A}\bigg),\end{gather}

where $\mu_A^{(n)}\,:\!=\,{\sum_{i=1}^n A_i}/{n}$ . There are no edges of $G_n^{\text{aux}}$ between pairs $v_{i_1},v_{i_2}\in V_n$ . Similarly, there are no edges of $G_n^{\text{aux}}$ between pairs of vertices of $ V_n'$ .

We now obtain the graph $G_n$ from $G_n^{\text{aux}}$ by letting two distinct vertices $v_{i_1}, v_{i_2}\in V_n$ share an edge in $G_n$ if and only if $v_{i_1}$ and $v_{i_2}$ have at least one common neighbour in $V_n'$ in $G_n^{\text{aux}}$. Next, we replace each edge of the undirected graph $G_n$ by two directed edges pointing in opposite directions. The reason for this modification is that in the epidemic model considered in this paper (see Section 2.3) infectious contacts are directed.

For later reference, we note that every vertex $v'\in V_n'$ corresponds to a clique $\mathcal{C}$, with vertex set $V_{\mathcal{C}}$ consisting of the neighbours of $v'$ in $G_n^{\text{aux}}$, and with edge set $E_{\mathcal{C}}$ that makes $\mathcal{C}$ a simple directed complete graph (that is, for any pair $u,v \in V_{\mathcal{C}}$ of distinct vertices there are precisely two edges $(u,v), (v,u)\in E_{\mathcal{C}}$, and $E_{\mathcal{C}}$ contains no self-loops).
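The two-step construction of Section 2.2 can be sketched in Python as follows. This is an illustrative simulation, not part of the paper's formal development: the function names are ours, and a simple Knuth-style sampler stands in for a library Poisson generator (adequate for the small means $A_iB_j/(n\mu_A)$ involved).

```python
import math
import random
from itertools import combinations

def poisson(lam, rng):
    """Knuth's Poisson sampler (adequate for the small means used here)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def intersection_graph(n, draw_A, draw_B, mu_A, mu_B, rng):
    """Sketch of Section 2.2: sample vertex weights A_i and group weights
    B_j, build the auxiliary bipartite multigraph with Po(A_i B_j / (n mu_A))
    edges per pair (v_i, v'_j), then fully connect the members of each
    group.  Returns (undirected edges of G_n, group memberships)."""
    m = math.floor(n * mu_A / mu_B)
    A = [draw_A() for _ in range(n)]
    B = [draw_B() for _ in range(m)]
    groups = []
    for j in range(m):
        members = {i for i in range(n)
                   if poisson(A[i] * B[j] / (n * mu_A), rng) > 0}
        groups.append(members)
    edges = {(u, v) for g in groups for u, v in combinations(sorted(g), 2)}
    return edges, groups
```

To match the directed graph of the text, each undirected edge returned here would then be replaced by the two opposite directed edges.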

2.3. The epidemic model

We consider a stochastic SIR epidemic on $G_n$ . In the SIR model, individuals are classified as susceptible (S), infectious (I), or recovered (R) depending on their current health status. An individual who is classified as infectious can transmit the disease to other individuals in the population; if an infectious individual contacts a susceptible individual then transmission occurs and the susceptible individual immediately becomes infectious. An infectious individual will eventually recover from the disease after some period of time, which we refer to as the infectious period of the individual in question (in our model we allow for the infectious period of an individual to be $\infty$ , which means that once the individual in question has contracted the disease it will remain infectious forever). Recovered individuals are fully immune to the disease; once recovered, an individual plays no further role in the spread of the disease. For simplicity, we assume that the epidemic starts with one initial infectious case and that the rest of the population is initially fully susceptible to the disease. We assume that the disease spreads in a population of size n, where the underlying social network of the population is represented by $G_n$ . In our model, a ‘close contact’ (a contact that results in transmission if a susceptible individual is contacted by an infectious individual) can only occur between individuals who are neighbours in $G_n$ . Throughout the paper, we will use the terms individual and vertex interchangeably.

To be precise, the epidemic process can be defined as follows. Let I be a random variable with support in $\mathbb{R}_{\geq 0}\cup\{\infty\}$ . Each vertex $v_i$ of $G_n$ is equipped with an infectious period $I_i$ , where $\{I_i\}_i$ is a sequence of independent copies of I. Let $\{T_{ij}\}_{ij}$ be a sequence of independent and identically distributed random variables with support in $[0,\infty)$ , and assume that they are independent of the infectious periods $(I_i)_{i \geq 1}$ . Here, $T_{ij}$ represents the time elapsed from the event that $v_i$ contracts the disease (which might or might not occur) to the event that $v_i$ contacts $v_j$ . In many standard models, $T_{ij}$ are exponentially distributed. For each (directed) edge $(v_i, v_j)$ , we equip $(v_i, v_j)$ with the transmission weight

$$T_{ij}' \,:\!=\,\begin{cases} T_{ij} & \text{if } T_{ij}\leq I_i, \\ \infty & \text{if } T_{ij}> I_i.\end{cases}$$

The transmission weight $T_{ij}'$ represents the time elapsed from the event that $v_i$ contracts the disease to the event that $v_i$ makes an infectious contact with $v_j$ , which results in transmission of the disease to $v_j$ if $v_j$ is still susceptible. We make the following assumption on the distribution of $T'_{ij}$ .

Assumption 2.2. $\mathbb{P}(T'_{ij}=0)<1/(\mu_{\overline{A}}\mu_{\overline{B}})$ and the distribution of $T'_{ij}$ is non-lattice (i.e. $\mathbb{P}(T_{ij}'\in\{\infty, 0,s,2s,\ldots\})<1$ for any $s>0$).

The first part of Assumption 2.2, $\mathbb{P}(T'_{ij}=0)<1/(\mu_{\overline{A}}\mu_{\overline{B}})$ , ensures that the approximating branching process does not explode (i.e. that the branching process population does not grow infinitely large in finite time).

A path $\varsigma=(v_{i_1}, v_{i_2}, \ldots,v_{i_k})$ is any finite sequence of vertices of $G_n$ such that $(v_{i_{r}} ,v_{i_{r+1}})$ is an edge of $G_n$, $r=1,\ldots, k-1$. We define the length $\ell(\varsigma)$ of a path $\varsigma=(v_{i_1}, v_{i_2}, \ldots,v_{i_k})$ as $\ell(\varsigma)=\sum_{r=1}^{k-1} T_{i_ri_{r+1}}'$. Denote the collection of all paths from a vertex u to a vertex v by $\Sigma_{uv}$. The distance (transmission time) from u to v is given by $d(u,v)\,:\!=\,\min_{\varsigma}\ell(\varsigma)$, where the minimum is taken over all paths $\varsigma\in\Sigma_{uv}$. We make the conventions $d(u,u)=0$ and $d(u,v)=\infty$ if $\Sigma_{uv}$ is empty for two distinct vertices u and v.
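Since all transmission weights are non-negative, the distance $d(u,\cdot)$ is a standard shortest-path quantity and can be computed in simulations with Dijkstra's algorithm. The sketch below is our own illustration (the names `transmission_times` and `out_edges` are hypothetical); edges with weight $T'_{ij}=\infty$ are simply skipped.

```python
import heapq
import math

def transmission_times(source, out_edges):
    """Dijkstra over the transmission weights T'_{ij}.
    out_edges[u] is a list of (v, t) pairs with t = T'_{uv}
    (math.inf when T_{uv} > I_u, i.e. no infectious contact).
    Returns a dict mapping each reachable vertex v to d(source, v)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue                      # stale heap entry
        for v, t in out_edges.get(u, []):
            if t == math.inf:
                continue                  # contact never occurs
            nd = d + t
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

In the notation of the text, vertex $v_i$ has contracted the disease by time t exactly when `transmission_times` run from $u_*$ assigns it a value at most t.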

Remark 2.2. Strictly speaking, d is a quasi-distance rather than a distance since it is not symmetric. We do, however, abuse terminology for convenience.

The initial infected case $u_*$ is then selected according to some rule; a common choice, which we will adhere to here, is to select the initial case uniformly at random. We assume that the initial case $u_*$ contracts the disease at time 0. The time evolution of an outbreak can now be fully specified as follows. An individual $v_i$ , $i=1,\ldots, n$ , has contracted the disease at time $t\geq 0$ if and only if $d(u_*,v_i)\leq t$ , and $v_i$ has recovered from the disease at t if and only if $d(u_*,v_i)+I_i\leq t$ .

We will also need the within-clique distance. Let $\mathcal{C}$ be a clique of $G_n$ . For two vertices u and v let $\Sigma^{\mathcal{C}}_{uv}$ be the collection of paths from u to v restricted to $\mathcal{C}$ . That is, a path $\varsigma$ from u to v belongs in $\Sigma^{\mathcal{C}}_{uv}$ whenever every edge of $\varsigma$ is also an edge of $E_{\mathcal{C}}$ . Note that $\Sigma_{uv}^{\mathcal{C}}$ is empty if u and v are not both members of $\mathcal{C}$ . The distance from u to v restricted to $\mathcal{C}$ is given by $d_{\mathcal{C}}(u,v)\,:\!=\,\min_{\varsigma}\ell(\varsigma)$ , where the minimum is taken over all paths $\varsigma\in\Sigma_{uv}^{\mathcal{C}}$ . As before, $d_{\mathcal{C}}(u,u)=0$ holds whenever $u\in V_{\mathcal{C}}$ , and $d_{\mathcal{C}}(u,v)=\infty$ if $\Sigma_{uv}^{\mathcal{C}}$ is empty.

For any clique $\mathcal{C}$ , we refer to the first individuals of $\mathcal{C}$ to contract the disease as the primary cases of $\mathcal{C}$ . That is, given that the disease reaches $\mathcal{C}$ (i.e. $\min_{w\in V_{\mathcal{C}}}d(u_*,w)<\infty$ ), a vertex $u\in V_{\mathcal{C}}$ is a primary case of $\mathcal{C}$ if $d(u_*,u)=\min_{w\in V_{\mathcal{C}}}d(u_*,w)$ . If $v\in V_{\mathcal{C}}$ contracts the disease but is not a primary case of $\mathcal{C}$ , we say that v is a secondary case of $\mathcal{C}$ , regardless of whether v is infected directly by a primary case or via some other path (which may or may not go via u). Notice that, if we assume that the transmission weight distribution has no atoms, then the primary case is almost surely unique.

2.4. Branching process approximation

At the beginning of an epidemic outbreak, with high probability, infectious individuals contact individuals that are still susceptible. For this reason, in those early stages the epidemic graph is structured as a tree and can therefore be approximated by a branching process. We point out that a salient feature of branching processes is that the lives of individuals that belong to different branches of the branching process tree are independent (conditioned on their types in the multitype case). In the epidemic process, even in the large-population limit, the infectious individuals in a clique ‘compete’ in transmitting the disease to the remaining susceptible individuals in the clique. Therefore, naive attempts to couple a finite-type branching process with the epidemic process will in general give rise to non-local dependencies between the individuals of the branching process tree.

For our purposes, we need a more formal description of a general branching process Z (see, e.g., [Reference Jagers21] for a full formal framework).

Let $\mathbb{N}$ be the set of positive integers, $\mathbb{N}^0\,:\!=\,\{\emptyset\}$, and $\mathbb{N}_0\,:\!=\, \mathbb{N} \cup \{0\}$. The set of all (potential) individuals $\mathcal{S}$ corresponds to the set of all finite tuples of positive integers, i.e. $\mathcal{S}\,:\!=\, \bigcup_{n \geq 0} \mathbb{N}^n$. An individual x is identified with a tuple $(x_1, \ldots, x_n)\in \mathcal{S}$, indicating that x is the $x_n$th child of the $\ldots$ of the $x_2$nd child of the $x_1$st child of the ancestor, the latter being $a_0\,:\!=\,\emptyset$. We will refer to the common ancestor $a_0\in \mathcal{S}$ as the first ancestor of Z. We write $\vert x\vert=n$ to indicate that x belongs to the nth generation of Z, that is, $x \in \mathbb{N}^n$, and we say that $a_0$ is in generation 0. If $x=(x_1,\ldots,x_n)$ and $y=(y_1,\ldots,y_m)$, we denote by xy the concatenated individual $(x_1,\ldots,x_n,y_1,\ldots,y_m)$. If $x=(x_1,\ldots,x_{n-1},x_n)$, then $mx=(x_1,\ldots,x_{n-1})$ is the mother of x. The individuals of Z can be partially ordered by descent. If $x=m^n y$ for some $n\in \mathbb{N}_0$, then we write $x\preceq y$ (or equivalently $y\succeq x$) to indicate that x is an ancestor of y (we make the convention that an individual is an ancestor of itself and that $m^0x\,:\!=\,x$), and $x\prec y$ (or $y\succ x$) to indicate that $x\preceq y$ and $x\neq y$. Similarly, for $\mathcal J\subset \mathcal S$ and $x\in\mathcal S$ we write $\mathcal J\prec x$ to indicate that $y\prec x$ for some $y\in\mathcal J$, and $\mathcal J\preceq x$ to indicate that $y\preceq x$ for some $y\in\mathcal J$. Note that if $x \in \mathcal J$ and there exists $y\in\mathcal J$ such that $y \prec x$, then we still write $\mathcal J \prec x$.

Let $(\Omega_{\emptyset}, \mathcal{A}_{\emptyset})$ be a measurable space. Using the terminology in [Reference Jagers21], we call $\omega \in \Omega_\emptyset$ a possible life career of an individual; any property of an individual can be seen as a measurable function on $(\Omega_{\emptyset}, \mathcal{A}_{\emptyset})$. For any individual $x\in \mathcal{S}$, we denote by $\tau_x \in [0, \infty]$ the time at which x is born; we say that x is realised if $\tau_x<\infty$, while $\tau_x=\infty$ means that x and its (possible) descendants are never born. At birth, any child gets a type in the type space E, which is equipped with a countably generated $\sigma$-algebra $\mathscr{E}$. We denote by $\sigma(x) \in E$ the type of x. For each $x \in \mathcal{S}$ we create an independent copy $(\Omega_x, \mathcal{A}_x)$ of the space $(\Omega_{\emptyset}, \mathcal{A}_{\emptyset})$; each space $(\Omega_x, \mathcal{A}_x)$ carries a point process $\xi_x$ on $\mathbb{R}_+ \times E$, defined as

$$\xi_x(A\times B) \,:\!=\, \#\{k\in\mathbb{N}\colon\tau_{xk}-\tau_x\in A,\sigma(xk)\in B\}, \quad\text{$A \in \mathscr{B},\, B \in \mathscr{E}$},$$

with $\mathscr{B}$ denoting the Borel $\sigma$-algebra of $\mathbb{R}_+$, and with xk being the concatenated individual $(x_1,\ldots,x_n,k)$. A point (t, r) of $\xi_x$ indicates a type-r individual produced by x at time t, and there is a one-to-one correspondence between the children of x and the points of $\xi_x$.

The population space of Z is then $(\Omega,\mathcal{A}) \,:\!=\,\big(E\times\prod_{x\in\mathcal{S}}\Omega_x,\mathscr{E}\times\prod_{x\in\mathcal{S}}\mathcal{A}_x \big)$ . An element of $(\Omega, \mathcal{A})$ consists of a type for the ancestor and a life career for all individuals of $\mathcal{S}$ .

The epidemic process on $G_n$ can be approximated, in its early phase, by a multi-type branching process whose individuals correspond to infectious individuals in the epidemic process. Following the description above, we will now define a branching process Z, similar to the one presented in [Reference Ball, Sirl and Trapman9], that describes the initial course of the epidemic. From now on, we will talk about individuals of Z and infectious individuals of the epidemic model interchangeably, keeping in mind their correspondence.

The type space of Z is $\mathcal{T}_\theta\,:\!=\,\mathcal{T}\cup\{\theta\}$, where $\mathcal{T}$ is the support of the generic infectious period I, and the extra point $\theta$ is an atom for the reproduction kernel of Z, in the sense of [Reference Nummelin28, Definition 4.3]. If an infectious individual is, in the epidemic process, the secondary case of a clique of size two, then the corresponding element in the branching process is assigned type $\theta$. The type of any other individual, apart from the ancestor (i.e. the first infectious individual), is taken to be the length of its infectious period. We notice that the infectious period $I_v$ of $v \in \mathcal{S}$ is an independent copy of the generic infectious period if $\sigma(v)=\theta$ or if $v=a_0$; $I_v=\sigma(v)$ otherwise. For this reason, $\theta$-individuals generate independent and identically distributed branches of the branching process. This feature is crucial for the embedded process defined in Section 3.1. The individuals of Z are divided into generations by attributing all secondary cases in a clique to the primary case of the clique in question, even though it may well be that a secondary case does not get infected directly by the primary case. In other words, if we follow the epidemic trail from v back to the initial case $v_*$ then the generation of v in Z corresponds, in the epidemic process, to the number of cliques that have to be traversed to reach $v_*$ (including the cliques of v and $v_*$).

Each space $(\Omega_v, \mathcal{A}_v)$ , $v \in \mathcal{S}$ , is equipped with a point process $\xi_v$ . The law of $\xi_v$ can be described as follows. Given the type of an individual v, its point process of reproduction is independent of the lives of the individuals that do not stem from v. A point (t, r) of $\xi_v$ corresponds to a secondary type-r case $u \in V_{\mathcal{C}}$ for which the time elapsed since the corresponding primary case v contracted the disease is $d_{\mathcal{C}}(v,u)=t$ .

Observe that the law of the life of $a_0$ is usually different from the laws of the other branching process individuals, since the initial case is assumed to be selected uniformly at random from the population, whereas individuals of subsequent generations represent infectious cases whose degree distribution is size biased. For this reason, the initial case is a member of MP(A) cliques, while the number of cliques that a non-initial case is a member of is distributed as $\overline{D}$, where D is distributed as MP(A). It is readily verified (see also [Reference Ball, Sirl and Trapman9]) that, if D has MP(A) distribution, then $\overline{D}-1$ has MP$(\overline{A})$ distribution. Thus, asymptotically, a non-initial case is a member of MP$(\overline{A})$ cliques that are not yet affected by the disease. Note that the size-biased distribution appears here because the probability that a vertex belongs to a clique is proportional to its number of edges. Similarly, the size of a clique (excluding the primary case of the clique) reached during the early phase of the epidemic has asymptotic distribution MP$(\overline{B})$. Here, the size biasing is present because the size of a clique is proportional to the number of edges of its corresponding vertex in the auxiliary graph; see the construction in Section 2.2. Thus, the points of $\xi_{a_0}$ correspond to the secondary cases when the corresponding primary case has infectious period $I_{a_0}$ and is a member of MP(A) cliques of independent sizes with distribution MP$(\overline{B})$. Similarly, if $v\neq a_0$, the points of $\xi_{v}$ correspond to the secondary cases when the corresponding primary case has infectious period $I_{v}$ and is a member of MP$(\overline{A})$ cliques, each of size MP$(\overline{B})$.
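The clique structure seen by a newly infected case can be sketched in a few lines of Python. This is an illustration of the description above, not code from the paper: the function names are ours, the intensities A and Abar are passed in as sampling closures, and a Knuth-style sampler is used for the Poisson step of the mixed Poisson draw.

```python
import math
import random

def mixed_poisson(draw_intensity, rng):
    """Draw from MP(X): a Poisson variate whose mean is itself drawn
    from (the law of) X, via Knuth's method."""
    lam = draw_intensity()
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def new_clique_sizes(draw_mean, draw_Bbar, rng):
    """Sizes (excluding the primary case) of the not-yet-affected cliques
    joined by a newly infected case: MP(draw_mean) cliques, each with
    MP(Bbar) further members.  Pass the law of A for the initial case
    and the law of Abar for every later case."""
    n_cliques = mixed_poisson(draw_mean, rng)
    return [mixed_poisson(draw_Bbar, rng) for _ in range(n_cliques)]
```

With constant intensities $\overline{A}=\overline{B}=2$, say, the expected total number of secondary cases produced this way is $2\cdot 2=4$, matching the product form that reappears in the mean offspring calculations.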

The number $\vert Z_t\vert$ of individuals of Z that are alive at time $t\geq 0$ corresponds to the number of infectious individuals at t and is given by the cardinality of the set $\{u\in\mathcal{S}\colon \tau_u\leq t < \tau_u+I_u \}$ .

If the extinction probability of Z is strictly smaller than 1 then Z is said to be supercritical, and we say that we are in the supercritical regime.

3. Main results

In this section we present the main results and give a rough outline of the ideas behind the proofs. These proofs rely on asymptotic results on finite-type branching processes (see, e.g., [Reference Nerman26]) via a coupling of an epidemic process on $G_n$ and a single-type branching process.

The main idea of this section is to embed a single-type branching process Y in Z by letting the type- $\theta$ individuals of Z be the individuals of Y. This allows us to employ the almost sure (a.s.) asymptotic results (see Section 3.1 for an overview) that are available for single-type branching processes.

We now need to introduce some extra notation. Let $F_k$ , $k\in\mathbb{N}_{\geq 2}$ , be the cumulative distribution function of the transmission time from the primary case in a clique of size k to another (specific) member of the clique. That is to say, for a clique $\mathcal{C}$ with $\vert V_{\mathcal{C}}\vert=k$ and two fixed individuals $u, v\in V_{\mathcal{C}}$ , $F_k$ is the cumulative distribution function of the within-clique distance $d_{\mathcal{C}}(u,v)$ . Let $p^{\overline{B}}_k\,:\!=\,\mathbb{P}(\mathrm{MP}(\overline{B})=k)$ and define the vector $\Gamma^\top=\mu_{\overline{A}}(1\cdot p_1^{\overline{B}},2\cdot p_2^{\overline{B}},\ldots)$ , where $\mu_{\overline{A}}=\mathbb{E}(\overline{A})=\mathbb{E}(A^2)/\mu_A$ . For a clique $\mathcal{C}$ of size $\vert V_{\mathcal{C}}\vert=k$ and any $\lambda\geq0$ , let

(3.1) \begin{gather} \mathcal{L}_k^{(\lambda)}\,:\!=\,\int_{\mathbb{R}_{\geq 0}}{\mathrm{e}}^{-t\lambda}F_k({\mathrm{d}} t)\end{gather}

be the Laplace transform of the transmission time within $\mathcal{C}$ , and define the vector

(3.2) \begin{gather} L^{(\lambda)}=\big(\mathcal L_{2}^{(\lambda)},\mathcal L_{3}^{(\lambda)},\ldots\big)^\top.\end{gather}

Note that the mass at infinity of $F_k$ , which corresponds to the event that no transmission takes place, is not included in the integral in (3.1); in particular, for $\lambda=0$ , $\mathcal{L}_k^{(0)}$ may be strictly smaller than 1.

As we will prove in Section 4, the Malthusian parameter can be found by solving the Euler–Lotka equation.

Definition 3.1. The Malthusian parameter is the unique solution $\alpha>0$ of $\Gamma\cdot L^{(\alpha)}=1$ .
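In general this equation must be solved numerically. As a purely illustrative sketch, suppose (unrealistically, since $F_k$ in general depends on k and may be defective) that every within-clique transmission time is $\mathrm{Exp}(\beta)$ ; then $\Gamma\cdot L^{(\lambda)}=\mu_{\overline{A}}\mu_{\overline{B}}\,\beta/(\beta+\lambda)$ , which is continuous and strictly decreasing in $\lambda$ , and bisection recovers the closed-form solution $\alpha=\beta(\mu_{\overline{A}}\mu_{\overline{B}}-1)$ . All parameter values below are our own illustrative choices.

```python
MU_A_BAR, MU_B_BAR, BETA = 2.0, 1.5, 1.0   # illustrative parameter values (our choice)
R0 = MU_A_BAR * MU_B_BAR                   # Gamma . L^(0) under the toy assumption below

def euler_lotka(lam):
    # Gamma . L^(lambda) when every within-clique transmission time is Exp(BETA),
    # so every Laplace transform in the vector L equals BETA / (BETA + lambda)
    return R0 * BETA / (BETA + lam)

def malthusian(lo=0.0, hi=100.0, tol=1e-12):
    # euler_lotka is continuous and strictly decreasing, so bisection applies
    assert euler_lotka(lo) > 1.0, "subcritical: no Malthusian parameter"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if euler_lotka(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha = malthusian()
print(round(alpha, 6))  # closed form in this toy case: BETA * (R0 - 1) = 2.0
```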

We are now ready to state our main results.

Theorem 3.1. Under Assumptions 2.1(i)–(iii) and 2.2, and if the Malthusian parameter $\alpha>0$ exists, then there exists a non-negative integrable random variable W such that ${\vert Z_t\vert}/{{\mathrm{e}}^{\alpha t}}\overset{\mathrm{a.s.}}{\to}W$ , where W satisfies $\mathbb{P}(\{\vert Z_t\vert\not\to 0\}\Delta\{W>0\})=0$ .

Here, $\Delta$ denotes the symmetric difference, i.e. $A\Delta B=(A\setminus B)\cup (B\setminus A)$ , and $\vert Z_t\vert$ denotes the number of individuals of Z that are alive at time t. In the notation $\{\vert Z_t\vert\not\to 0\}$ it is implicit that the limit is taken as t tends to $\infty$ . So, $\{\vert Z_t\vert\not\to 0\}$ is the event that the branching process population of Z ultimately avoids extinction.
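The exponential growth in Theorem 3.1 can be illustrated in the simplest toy special case, which is not the model of this paper: a Markovian pure-birth (Yule) process, in which every infectious individual generates new cases at rate 1 and never recovers, so that $\alpha=1$ and $\mathbb{E}(\vert Z_t\vert)={\mathrm{e}}^{t}$ . The sketch below checks the mean growth by simulation.

```python
import math, random

rng = random.Random(7)

def yule_size(T, rng):
    # Pure-birth (Yule) process: with z individuals alive, the time to the
    # next birth is Exp(z), since each individual gives birth at rate 1.
    z, t = 1, 0.0
    while True:
        t += rng.expovariate(z)
        if t > T:
            return z
        z += 1

T, reps = 4.0, 20_000
mean_Z = sum(yule_size(T, rng) for _ in range(reps)) / reps
# Here alpha = 1 and E(Z_T) = e^T, so the ratio below should be close to 1
print(round(mean_Z / math.exp(T), 2))
```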

Theorem 3.2. Let $(G_n)_n$ be a sequence of graphs generated via the random intersection graph model and assume that the assumptions of Theorem 3.1 hold. Let $(\varepsilon_n)_{n\geq 1}$ be a sequence in $(0,\infty)$ that satisfies $\varepsilon_n\log(n)\to\infty$ as $n\to\infty$ . Then, for any $q\geq 2$ , $q\neq 3$ , that satisfies $\mathbb{E}(A^{q})<\infty $ and $\mathbb{E}(B^{q})<\infty$ there exist couplings of the epidemic process on the $G_n$ and Z such that the two processes agree w.h.p. until at least $n^{\gamma-\varepsilon_n}$ individuals have contracted the disease. Here, $\gamma = \min\big(\frac{1}{2}, \big({q-\frac32}\big)/{q}\big)$ .

3.1. Branching processes counted with random characteristics

This section contains a brief overview of some preliminaries from the theory of branching processes, which we include for completeness. More detailed accounts of the branching process theory can be found in [Reference Nerman26], and also in the more recent paper [Reference Iksanov and Meiners19].

Before proceeding, we introduce some notation in addition to that of Section 2.4. To arrive at Theorem 3.1, as anticipated above, we embed a single-type branching process Y into the above-described branching process Z. To this end, we partition the individuals of Z into blocks. Let $\mathcal S_\theta\subset \mathcal S$ be the set of the type- $\theta$ individuals of Z. For $x\in\mathcal{S}_\theta$ , define the block $\mathscr B_x$ as follows:

(3.3) \begin{gather} \mathscr B_x \,:\!=\, \{y\in\mathcal{S}\colon x\preceq y, \text{ and whenever } x\prec z \text{ for some } z\in\mathcal{S}_\theta \text{ then } z\not \preceq y\}.\end{gather}

In words, for any $x\in\mathcal{S}_\theta$ the block $\mathscr B_x$ is the set of descendants of x for which the line of descent back to x does not contain an individual of type $\theta$ . The embedded branching process Y is then obtained by letting the individuals of Y be the individuals of $\mathcal S_\theta$ and the children of $x\in\mathcal{S}_\theta$ (seen as an individual of Y) the type- $\theta$ children of individuals of $\mathscr B_x$ (in Z). That is, if we define $\mathscr J_n$ , $n\geq 1$ , recursively as $\mathscr J_0=\{a_0 \}$ and

$$\mathscr J_n = \{x\in\mathcal S_\theta\colon x\succ\mathscr J_{n-1} \text{ and whenever }\mathscr J_{n-1}\prec z\prec x \text{ then } z\not\in\mathcal S_\theta\},$$

then $\mathscr J_n$ consists of the individuals of generation n of Y, $n\geq 0$ . Since we are interested in the number of infected individuals at each time point $t\in\mathbb{R}_{\geq 0}$ , we count the population of the embedded single-type branching process Y with a certain random characteristic (see, e.g., [Reference Nerman26]), which provides the link between the size $\vert Z_t\vert$ of the branching process population of Z at t and the embedded branching process Y. Here we consider the special case where the random characteristic $\phi $ is defined as

(3.4) \begin{gather} \phi_x(t) = \vert\{y\in\mathscr B_x\colon\tau_y\leq t<\tau_y+I_y\}\vert\end{gather}

for each $x\in \mathscr J=\cup_{n\geq 0}\mathscr J_n$ , and we say that $Y^\phi_t\,:\!=\,\sum_{x\in\mathscr{J}}\phi_x(t-\tau_x)$ is the branching process population of Y counted with the characteristic $\phi$ . In words, $\phi_x(t)$ can be thought of as the number of infectious individuals which belong to the block $\mathscr B_x$ at $\tau_x+t$ , where $\tau_x$ is the time point when x contracts the disease. Thus, the total population size $\vert Z_t\vert$ of the approximating branching process Z at the time point t can be recovered from the embedded single-type branching process Y via the relation ${\vert Z_t\vert}={Y^\phi_t}$ .

Remark 3.1. It is worth pointing out that the embedding technique employed here does not require the presence of cliques of size two in the underlying graph model. Indeed, in a more general setting, an embedding of a single-type branching process may be obtained by letting the individuals of the single-type branching process be represented by the vertices that are the last to be infected in their clique, if all clique members get infected. This embedding relies on the observation that if a clique $\mathcal{C}$ (of size $\vert V_{\mathcal{C}}\vert=d\geq 2$ , say) is fully infected then the dth individual of $\mathcal{C}$ to be infected does not compete with the other infected cases of $\mathcal{C}$ in transmitting the disease to the remaining susceptible individuals of $\mathcal{C}$ . Thus, given that v is the dth infected case of $\mathcal{C}$ , the infectious period of v is independent of the actual paths of transmission within $\mathcal{C}$ .

We begin by stating an asymptotic result for single-type branching processes where the type of the ancestor is the same as the type of the other individuals. To this end, let $\tilde Z$ be a branching process that behaves like a copy of Z, where Z is the branching process in Section 2.4, except that the first ancestor of $\tilde Z$ is of type $\theta$ . Further, let $\tilde Y$ be the corresponding embedded single-type branching process. In what follows, we reuse the notation of Section 2.4 for ease of exposition. That is, we denote the type space of $\tilde Z$ by $\mathcal{T}_\theta$ , $\mathcal{S}$ denotes the space of individuals of $\tilde Z$ , the block $\mathscr B_x$ and the random characteristic $\phi$ are analogous to the definitions in (3.3) and (3.4), respectively, and so forth.

Let the random measure $\xi$ be the point process of reproduction on $\mathbb{R}_{\geq 0}$ of a generic individual of the single-type branching process $\tilde Y$ , and let $\xi^{(\alpha)}=\int_{\mathbb{R}_{\geq 0}}{\mathrm{e}}^{-\alpha t}\xi({\mathrm{d}} t) =\sum_{x\in \mathscr J}{\mathrm{e}}^{-\alpha\tau_x}$ , where $\mathscr J=\cup_{n\geq 0} \mathscr J_n$ is the space of all individuals of $\tilde Y$ and $\alpha$ is the Malthusian parameter, i.e. $\mathbb{E}(\xi^{(\alpha)})=1$ . We define the measure $\eta$ on $\mathbb{R}_{\geq 0}$ by $\eta(t)=\eta[0,t]\,:\!=\,\mathbb{E}(\xi(t))$ . Theorem 3.3 is a special case of [Reference Nerman26, Theorem 5.4] and will lead us to the a.s. convergence of Theorem 3.1. In order to state Theorem 3.3 we need the following conditions.

Condition 3.1. (Finite mean age at childbearing.) The mean age at childbearing $\beta$ defined by $\beta \,:\!=\, \mathbb{E}\big(\int_{\mathbb{R}_{\geq 0}}t{\mathrm{e}}^{-\alpha t}\xi({\mathrm{d}} t)\big)$ is finite.

Condition 3.2. ( $x\log x$ .) The random variable $\xi^{(\alpha)}\log_+(\xi^{(\alpha)})$ has finite expectation.

Condition 3.3. There exists some non-negative, real-valued, non-increasing, integrable function $g_1$ such that $\int_{\mathbb{R}_{\geq 0}}{{\mathrm{e}}^{-\alpha t}}/{g_1(t)}\,\eta({\mathrm{d}} t)<\infty$ .

Condition 3.4. There exists some non-negative, real-valued, non-increasing, integrable function $g_2$ such that the expectation of $\sup_{t\geq 0}{{\mathrm{e}}^{-\alpha t}\phi(t)}/({g_2(t)\wedge 1})$ is finite. Recall that $\phi$ is the random characteristic defined in (3.4).

Theorem 3.3. ([Reference Nerman26, Theorem 5.4].) Under Conditions 3.1–3.4, almost surely, ${\vert\tilde Y_t^\phi\vert}/{{\mathrm{e}}^{\alpha t}}\to \widehat W m_\infty$ as $t\to\infty$ , where the random variable $\widehat W$ has mean $\mathbb{E}(\widehat W)=1$ and $\mathbb{P}(\{\widehat W=0\})=\mathbb{P}(\vert \tilde Z_t\vert\to 0)$ , and $m_\infty\in (0,\infty)$ is a constant that depends on $\phi$ .

Remark 3.2. Under the conditions of Theorem 3.3, applying Theorem 3.3 to each of the children of the first ancestor of Z gives

(3.5) \begin{equation} \frac{\vert Z_t\vert}{{\mathrm{e}}^{\alpha t}} = \frac{\vert Y_t^\phi\vert}{{\mathrm{e}}^{\alpha t}} \to W \,:\!=\, (\widehat W^{(1)}{\mathrm{e}}^{-\alpha\tau_1} + \cdots + \widehat W^{(J)}{\mathrm{e}}^{-\alpha\tau_J})m_\infty \end{equation}

almost surely as $t\to\infty$ , where J is the number of children of the first ancestor of Z that are born in $[0,\infty)$ , the time points $\tau_1,\ldots,\tau_J$ are the birth times of those children, and $\widehat W^{(1)},\ldots,\widehat W^{(J)}$ are J copies of $\widehat W$ (which are not independent in general).

4. Proofs

In Section 4.1 we prove Theorem 3.1 by showing that there is a coupling of the branching process Z and a single-type branching process whose Malthusian parameter is given in Definition 3.1. In Section 4.2 we prove Theorem 3.2. The main step in the proof is to establish upper bounds on the total variation distance between the degree distribution in (2.2) and Po $(\overline{A})$ , and between the distribution in (2.3) and Po $(\overline{B})$ .

4.1. Proof of Theorem 3.1

Recall that the random measure $\xi$ (defined in Section 3.1) is the point process of reproduction on $\mathbb{R}_{\geq 0}$ of a generic individual v of $\tilde Y$ , and that the measure $\eta$ on $\mathbb{R}_{\geq 0}$ is defined as $\eta(t)=\eta[0,t]\,:\!=\,\mathbb{E}(\xi(t))$ . Also recall that $\Gamma=(\gamma_k)_k$ is the vector with elements of the form $\gamma_k= \mu_{\overline{A}}k p_k^{\overline{B}}$ , where $p^{\overline{B}}_k=\mathbb{P}(\mathrm{MP}(\overline{B})=k)$ for $k\in\mathbb{Z}_{\geq 1}$ , and that $L^{(\alpha)}$ is the vector displayed in (3.2).

Lemma 4.1. In the supercritical regime, the Malthusian parameter $\alpha>0$ of Y exists if and only if $\mathbb{P}(T_{ij}'=0)<1/(\mu_{\overline{A}}\mu_{\overline{B}})$ and is then the unique solution of $\Gamma\cdot L^{(\alpha)}=1$ .

Proof. Here, $*$ denotes convolution, i.e. for two cumulative distribution functions F and G, $F*G(t) = \int_{-\infty}^{\infty}G(t-s)\,F({\mathrm{d}} s)$ . Note that convolution is commutative. The expected number of children of v that are $\theta$ -individuals born up to time t is given by

(4.1) \begin{equation} \eta(t) = \gamma_1F_2(t) + \sum_{r}\sum_{(m_1,\ldots,m_r)}\gamma_1\gamma_{m_1}\gamma_{m_2}\cdots\gamma_{m_r}\,F_{m_1+1} *\cdots*F_{m_r+1}*F_{2}(t), \end{equation}

where the sums run over $\mathbb{Z}_{\geq 1}$ and $\mathbb{Z}_{\geq 1}^r$ . To see this, we look at v’s offspring in Z. If v belongs to a clique of size two, then we can have a $\theta$ -newborn in the first generation, and this is taken into account by the first addend of (4.1). After this, we consider the second generation (corresponding to $r=1$ in the formula) and we look for all possible paths that pass via a clique of generic size $m_1+1$ and end on a clique of size two, and so on. Taking the Laplace transform of the right-hand side in (4.1) and writing this in vector form gives that the Malthusian parameter $\alpha$ of Y is the solution of

(4.2) \begin{gather} \int_{\mathbb{R}_{\geq 0}}{\mathrm{e}}^{-\alpha t}\eta({\mathrm{d}} t) = \gamma_1\mathcal{L}_{2}^{(\alpha)} \sum_{n=0}^\infty\big(\Gamma_{\geq 2}\cdot L^{(\alpha)}_{\geq 3}\big)^n=1, \end{gather}

where $\Gamma_{\geq 2}^\top=(\gamma_2,\gamma_3,\ldots)$ and the elements of $L^{(\alpha)}_{\geq 3}=\big(\mathcal L_{3}^{(\alpha)},\mathcal L_{4}^{(\alpha)},\ldots\big)^\top$ are defined in (3.1). Since $\Gamma_{\geq 2}\cdot L^{(\alpha)}_{\geq 3}<1$ (shown below), (4.2) reduces to

$$\frac{\gamma_1\mathcal{L}_{2}^{(\alpha)}}{1-\Gamma_{\geq 2}\cdot L^{(\alpha)}_{\geq 3} }=1.$$

That is, $\Gamma\cdot L^{(\alpha)}=1$ .

It remains to show that $\Gamma_{\geq 2}\cdot L^{(\alpha)}_{\geq 3}<1$ . By the $x^2\log x$ assumption, for all $\lambda\geq 0$ , $\Gamma\cdot L^{(\lambda)}\leq\Gamma\cdot(1,1,\ldots,1)^\top<\infty$ and, since the approximating branching process is supercritical, $\Gamma\cdot L^{(0)}>1$ . Since $\Gamma\cdot L^{(\lambda)}$ is continuous and strictly decreasing in $\lambda$ with $\Gamma\cdot L^{(\lambda)}\to \mathbb{P}(T_{ij}'=0)\mu_{\overline{A}}\mu_{\overline{B}}<1$ as $\lambda\to\infty$ , there exists an $\alpha$ such that $\gamma_1\mathcal{L}_2^{(\alpha)}+\Gamma_{\geq 2}\cdot L^{(\alpha)}_{\geq 3}=1$ ; given that $\gamma_1>0$ , for that $\alpha$ we have $\Gamma_{\geq 2}\cdot L^{(\alpha)}_{\geq 3}<1$ . We conclude that the Malthusian parameter exists and is unique.

To proceed, we need some additional notation and terminology. For two kernels $K_1$ and $K_2$ (defined on the same measurable space $(E, \mathcal E)$ ), we define the convolution kernel $K_1 K_2$ as

$$K_1K_2(r, A)\,:\!=\,\int_EK_1(r,{\mathrm{d}} s)K_2(s, A), \quad A\in\mathcal E,\,r\in E.$$

For any $m\geq 1$ , $K_1^m\,:\!=\,K_1K_1^{m-1}$ and $K_1^0\,:\!=\,I$ , where I is the identity kernel $I(r, A)\,:\!=\,\mathbf{1}(r\in A)$ for any $A \in \mathcal E$ . If f is an $\mathcal E$ -measurable function on E then we define the function $K_1f$ as $K_1f(\cdot)\,:\!=\,\int_Ef(s)K_1(\cdot,{\mathrm{d}} s)$ , and similarly, for a measure $\eta$ on $(E, \mathcal E)$ we define $\eta K(\cdot )=\int \eta({\mathrm{d}} s)K(s,\cdot)$ . For any $A\in\mathcal E$ , let $I_A$ be the kernel $I_A(r, B)=I(r, A\cap B)$ .
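On a finite type space these kernels are matrices, the convolution $K_1K_2$ is a matrix product, and a potential kernel $\sum_{n\geq 0}M^n$ is a Neumann series, equal to $(I-M)^{-1}$ when the spectral radius of M is below 1. The following sketch uses illustrative numbers of our own choosing.

```python
def mat_mul(K1, K2):
    # Kernel convolution on a finite type space is a matrix product:
    # (K1 K2)(r, s) = sum_u K1(r, u) * K2(u, s)
    n = len(K1)
    return [[sum(K1[r][u] * K2[u][s] for u in range(n)) for s in range(n)]
            for r in range(n)]

def identity(n):
    return [[1.0 if r == s else 0.0 for s in range(n)] for r in range(n)]

def potential_kernel(M, terms=200):
    # G = sum_{m >= 0} M^m as a truncated Neumann series; the truncation error
    # is negligible here because the spectral radius of M is well below 1
    n = len(M)
    G, P = identity(n), identity(n)
    for _ in range(terms):
        P = mat_mul(P, M)
        G = [[G[r][s] + P[r][s] for s in range(n)] for r in range(n)]
    return G

# Illustrative strictly substochastic kernel on a two-point type space
M = [[0.2, 0.3],
     [0.1, 0.4]]
G = potential_kernel(M)
# Sanity check: G agrees with (I - M)^{-1} = (1/0.45) * [[0.6, 0.3], [0.1, 0.8]]
print(round(G[0][0], 4), round(G[1][1], 4))
```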

Recall that we denote (a generic copy of) the point process of reproduction on $\mathcal T_\theta\times\mathbb{R}_{\geq 0}$ of a type-r individual of $\tilde Z$ by $\xi_r$ , and let $\mu(r,A\times B)=\mathbb{E}(\xi_r(A\times B))$ be the expected number of offspring of a type in $A\subset \mathcal{T}_\theta$ produced by a type-r individual (born at time 0) in $B\subset \mathbb{R}_{\geq 0}$ . For $\lambda\in\mathbb{R}$ , define the kernel $K_{(\lambda)}(r, {\mathrm{d}} s\times{\mathrm{d}} t)\,:\!=\,{\mathrm{e}}^{-\lambda t}\mu(r,{\mathrm{d}} s\times{\mathrm{d}} t)$ . Also let $\widehat K (r,{\mathrm{d}} s)\,:\!=\,\int_{\mathbb{R}_{\geq 0}}K_{(\alpha)}(r,{\mathrm{d}} s\times{\mathrm{d}} t)$ , and let $G_\theta =\sum_{n=0}^\infty(I_{\{\theta\}^{\mathrm{c}}}\widehat K)^n$ be the potential kernel of $I_{\{\theta \}^{\mathrm{c}}}\widehat K$ . Here, $\{\theta\}^{\mathrm{c}}=\mathcal T_\theta\setminus\{\theta\}$ and

(4.3) \begin{equation} I_{\{\theta\}^{\mathrm{c}}}(r, B)=I(r, B\cap \mathcal T_\theta\setminus \{\theta\}).\end{equation}

Remark 4.1. Note that, for any $A\subset \mathcal{T}_\theta$ and $s\in\mathcal{T}_\theta$ , $G_\theta(s, A)$ is the expected number of descendants of an individual u of $\tilde Z$ , $\sigma(u)=s$ , that are members of the same block as u and whose type belongs to A discounted by their time of birth. Similarly, $G_\theta(s, \{\theta\})$ is the expected number of type- $\theta$ individuals stemming from u whose mother belongs to the same block as u discounted by their time of birth.

Define the function h by $h(x)= G_\theta(x, \{\theta \})$ and the measure $\pi$ by $\pi(A)= \widehat K G_\theta (\theta,A)$ . The kernel $\widehat{K}$ is 1-recurrent in the sense of [Reference Nummelin28, Definition 3.2]; see [Reference Jagers21, Reference Jagers22]. Then, from [Reference Nummelin28, Proposition 4.6], it follows that h is harmonic for $\widehat{K}$ , i.e. $h=\widehat{K}h$ . Similarly, $\pi$ is invariant for $\widehat{K}$ , i.e. $\pi= \pi \widehat{K}$ .

In order to use the finite-type branching process toolbox, we need to verify that the mean age at childbearing $\beta$ of Y is finite, i.e. that $\beta=\int t{\mathrm{e}}^{-\alpha t} \eta({\mathrm{d}} t)<\infty$ , where $\eta$ is the measure in (4.1).

Lemma 4.2. $0<\beta<\infty$ .

Proof. Let $\varepsilon>0$ be small so that $\Gamma_{\geq 2}\cdot L^{(\alpha-\varepsilon)}_{\geq 3}<1$ ; note that such an $\varepsilon$ exists: $\Gamma_{\geq 2}\cdot L^{(\lambda)}_{\geq 3}$ is continuous and $\Gamma_{\geq 2}\cdot L^{(\alpha)}_{\geq 3}<1$ , as shown in the proof of Lemma 4.1. Let the constant $C_\varepsilon$ be such that $C_\varepsilon{\mathrm{e}}^{-(\alpha-\varepsilon)t} \geq t{\mathrm{e}}^{-\alpha t}$ for all $t\geq 0$ . Then

\begin{equation*} \beta = \int t{\mathrm{e}}^{-\alpha t}\eta({\mathrm{d}} t) \leq C_\varepsilon\int{\mathrm{e}}^{-(\alpha-\varepsilon)t}\eta({\mathrm{d}} t) = C_\varepsilon\gamma_1\mathcal{L}_{2}^{(\alpha-\varepsilon)}\sum_{n=0}^\infty \big(\Gamma_{\geq 2}\cdot L^{(\alpha-\varepsilon)}_{\geq 3}\big)^n < \infty. \end{equation*}

4.1.1. Optimal lines and the $x\log x$ condition

In this paper, we consider two ways of dividing the individuals of the approximating branching process $\tilde Z$ into generations: either generation n consists of the individuals of $\mathcal S_n\,:\!=\,\{x\in\mathcal S\colon\vert x\vert=n\}$ (i.e. of the individuals separated from the first ancestor by a line of descent of length n), or generation n consists of the individuals of $\mathscr J_n$ , which leads us to the embedded branching process $\tilde Y$ . There is a close connection between these two ways of viewing generations and the concept of stopping lines (see [Reference Jagers21] or [Reference Biggins and Kyprianou12]).

Following [Reference Jagers21], we say that a set of individuals $L\subset \mathcal S$ is a stopping line if, for any pair $y,x\in L$ , $x\not \prec y$ . In other words, a stopping line L cuts across the branching process tree Z in the sense that if $x\in L$ then no descendants or ancestors of x (apart from the individual x itself) are members of L. For any $x\in \mathcal S$ , let $\mathcal G_x$ be the $\sigma$ -algebra generated by the lives (infectious periods and reproduction processes) of the ancestors of x (including x), and for any non-random stopping line $\ell$ , let $\mathcal G_\ell\,:\!=\,\sigma(\cup_{x\not\succeq\ell} \mathcal G_x)$ be the $\sigma$ -algebra generated by the lives of the individuals that do not have an ancestor in $\ell$ . Mirroring the concept of optimal stopping times, we say that a line L is optimal if, for any non-random stopping line $\ell$ , the event $\{L\preceq \ell\}$ belongs to $\mathcal G_\ell$ . Here, $L\preceq \ell$ means that for any $x\in \ell$ there is $y\in L$ such that $y\preceq x $ . Note that $\mathscr J_n$ is an optimal line for each $n\geq 0$ . Note also that $\mathcal S_n$ is an optimal line, $n\geq 0$ .

For each $n\in\mathbb{Z}_{\geq 0}$ , define

(4.4) \begin{gather} \widehat W_n\,:\!=\,\sum_{x\in\mathscr J_n}{\mathrm{e}}^{-\alpha \tau_x}\end{gather}

and

(4.5) \begin{gather} \widetilde{W}_n=\frac{1}{h(\theta)}\sum_{\vert u\vert=n}{\mathrm{e}}^{-\alpha\tau_u}h(\sigma_u),\end{gather}

where the sums run over all individuals of generation n of $\tilde Y$ and $\tilde Z$ , respectively. It is well known (see, e.g., [Reference Biggins and Kyprianou12]) that $\{\widetilde{W}_n \}_{n\in \mathbb{Z}_{\geq 0} }$ is a martingale with respect to $\mathcal F\,:\!=\,\{\mathcal F_n \}_n$ , where $\mathcal F_n\,:\!=\,\mathcal G_{\mathcal S_n}$ is generated by the lives of the individuals up to generation n (of $\tilde Z$ ) for $n\geq 1$ (we make the convention that $\mathcal G_{\varnothing}$ is the trivial $\sigma$ -algebra). Similarly, $\{ \widehat W_n \}_{n\in \mathbb{Z}_{\geq 0} }$ is a martingale with respect to $\{\mathcal G_{\mathscr J_n} \}_n$ . For later reference, we now state a special case of results presented in [Reference Biggins and Kyprianou12].

Proposition 4.1. (cf. [Reference Biggins and Kyprianou12, Theorem 6.1 and Lemma 6.2].) Let $\{\widehat W_n\}_n$ and $\{\widetilde{W}_n\}_n$ be as in (4.4) and (4.5). Then, with probability 1, the limits $\lim_n\widetilde{W}_n$ and $\lim_n\widehat W_n$ exist and $\lim_n\widetilde{W}_n=\lim_n\widehat W_n$ .

Now, the $x\log x$ condition for the single-type branching process $\tilde Y$ takes the form

(4.6) \begin{gather} \mathbb{E}( \widehat W_1\log_+ \widehat W_1)<\infty.\end{gather}

The following two lemmas assert that $\tilde Y$ satisfies the $x\log x$ condition under the $x^2\log x$ condition.

Lemma 4.3. Let J have MP(X) distribution, where X is a non-negative integrable random variable with $\mathbb{P}(X=0)<1$ . Then $\mathbb{E}(X^2\log_+X)<\infty$ is necessary and sufficient for $\mathbb{E}(\overline{J}\log_+\overline{J})<\infty$ to hold.

Proof. First, note that $\mathbb{E}(\overline{J}\log_+\overline{J}) = \mathbb{E}(J^2\log_+J)/\mathbb{E}(J) = \mathbb{E}(J^2\log_+J)/\mathbb{E}(X)$ .

Since $x\mapsto x^2\log_+ x$ is convex, necessity now follows from Jensen’s inequality for conditional expectations. Sufficiency follows from

\begin{align*} \mathbb{E}(J^2\log_+ J) & = \mathbb{E}\Bigg(\sum_{k\geq 0}k^2\log_+k\frac{X^k{\mathrm{e}}^{-X}}{k!}\Bigg) \\ & = \mathbb{E}\Bigg(X\sum_{k\geq 0}(k+1)\log_+(k+1)\frac{X^k{\mathrm{e}}^{-X}}{k!}\Bigg) \\ & = \mathbb{E}\Bigg(X\sum_{k\geq 0}\log_+(k+1)\frac{X^k{\mathrm{e}}^{-X}}{k!}\Bigg) + \mathbb{E}\Bigg(X\sum_{k\geq 0}k\log_+(k+1)\frac{X^k{\mathrm{e}}^{-X}}{k!}\Bigg) \\ & = \mathbb{E}\Bigg(X\sum_{k\geq 0}\log_+(k+1)\frac{X^k{\mathrm{e}}^{-X}}{k!}\Bigg) + \mathbb{E}\Bigg(X^2\sum_{k\geq 0}\log_+(k+2)\frac{X^k{\mathrm{e}}^{-X}}{k!}\Bigg) \\ & \leq \mathbb{E}(X\log_+(X+1)) + \mathbb{E}(X^2\log_+(X+2)), \end{align*}

where the inequality follows from Jensen’s inequality applied to the concave (on $\mathbb{R}_{\geq 0}$ ) functions $x\mapsto \log_+ (x+1)$ and $x\mapsto \log_+ (x+2)$ .

Lemma 4.4. Under Assumption 2.1, $\{\widehat W_n\}_n$ satisfies the $x\log x$ condition in (4.6).

Proof. It is known (see [Reference Iksanov and Meiners19, Proposition 4.1]) that the inequality in (4.6) holds if and only if $\{\widehat W_n\}_n$ is uniformly integrable, which is also equivalent to $\mathbb{E}(\widehat W)=1$ , where $\widehat W$ is the almost sure limit $\lim_n \widehat W_n$ . By Proposition 4.1, $\lim_n \widehat W_n=\lim_n \widetilde{W}_n$ almost surely. Thus, in order to verify that $\{\widehat W_n\}_n$ is uniformly integrable it is sufficient to verify that the almost sure limit of $\widetilde{W}_n$ has mean 1.

Now note that h is bounded: for any $x\in \mathcal{T}_\theta$ we have

$$ h(x) = G_\theta(x,\{\theta\})\leq 1 + \mathbb{E}(\overline{A})\mathbb{E}(\overline{B})\pi(\{\theta\}) = 1 + \mathbb{E}(\overline{A})\mathbb{E}(\overline{B}). $$

Combining this with Assumption 2.1, Lemma 4.3 and [Reference Biggins and Kyprianou12, Corollary 2.1], the $x\log x$ condition holds for Y if we can show that, almost surely,

(4.7) \begin{gather} \sup_{x>2}\bigg(\frac{\sum_i\mathbf{1}(H(\varsigma_i)\geq x^{-1})}{\log_+(x)}\bigg)<\infty, \end{gather}

where $\varsigma=(\varsigma_0,\varsigma_1, \ldots)$ , $\varsigma_0=(\theta,0)$ , is a Markov chain on $\mathcal T_\theta\times\mathbb{R}_{\geq 0}$ with transition measure given by

\begin{align*} R((r ,0), A\times B) & \,:\!=\,\frac{1}{h(r)}\int_{A\times B} h(s) K_{(\alpha)}(r, {\mathrm{d}} s\times{\mathrm{d}} t), \\ R((r ,t), A\times B) & \,:\!=\,R((r ,0), A\times (B-t )_{\geq 0}), \end{align*}

where $B-t=\{b-t\colon b\in B\}$ and $H((r,t))={\mathrm{e}}^{-\alpha t}h(r)$ .

Let $p_1\colon\mathcal T_\theta\times\mathbb{R}_{\geq 0}\to\mathcal T_\theta$ and $p_2\colon \mathcal T_\theta\times\mathbb{R}_{\geq 0}\to\mathbb{R}_{\geq 0}$ be the projections onto the first and second coordinate, respectively, and put $\varsigma_j'=p_1(\varsigma_j)$ and $\varsigma_j''=p_2(\varsigma_j)$ for $j\geq 0$ . Then $\{\varsigma_0',\varsigma_1',\ldots\}$ is a Markov chain on $\mathcal T_\theta$ with transition measure $R_1$ :

$$R_1(r,B) = \frac{1}{h(r)}\int_B h(s)K_{(\alpha)}(r,{\mathrm{d}} s\times\mathbb{R}_{\geq 0}) = \frac{1}{h(r)}\int_B h(s)\widehat K(r,{\mathrm{d}} s)$$

for (measurable) $B\subset \mathcal T_\theta$ .

Now, it is easily verified that $\theta$ is a (positive) recurrent state of $\{\varsigma_0',\varsigma_1',\ldots\}$ . Indeed, let $M_i$ , $i\geq 1$ , be the number of steps until $\{p_1(\varsigma_0),p_1(\varsigma_1),\ldots\}$ revisits $\theta$ for the ith time, given that $p_1(\varsigma_0)=\theta$ : $M_0\,:\!=\,0$ and, for $i\geq 1$ , $M_i=\inf\{m > M_{i-1}\colon p_1(\varsigma_m)=\theta\}$ . Then, with M denoting a generic copy of the i.i.d. increments $M_i-M_{i-1}$ , for $m\geq 1$ (recall that $I_{\{\theta\}^{\mathrm{c}}}$ is the kernel in (4.3)),

\begin{align*} \mathbb{P}(M=m) & = R_1(I_{\{\theta\}^{\mathrm{c}}}R_1)^{m-1}(\theta,\{\theta\}) \\ & = \int_{\mathbb{R}_{\geq 0}^{m-1}}K_{(\alpha)}(r_{m-1},\{\theta\}\times\mathbb{R}_{\geq 0}) K_{(\alpha)}(r_{m-2},{\mathrm{d}} r_{m-1}\times\mathbb{R}_{\geq 0})\cdots \\ & \qquad\qquad \times K_{(\alpha)}(r_{1},{\mathrm{d}} r_{2}\times\mathbb{R}_{\geq 0})K_{(\alpha)}(\theta,{\mathrm{d}} r_{1}\times\mathbb{R}_{\geq 0}) \\ & = \gamma_1\mathcal{L}_{2}^{(\alpha)}\big(\Gamma_{\geq 2}\cdot L^{(\alpha)}_{\geq 3}\big)^{m-1} = (1-\Gamma_{\geq 2}\cdot L^{(\alpha)}_{\geq 3})\big(\Gamma_{\geq 2}\cdot L^{(\alpha)}_{\geq 3}\big)^{m-1}. \end{align*}

Thus, $\mathbb{P}(M=\infty)=0$ and M is geometrically distributed, and hence the inequality in (4.7) holds almost surely by the law of large numbers applied to $\{\varsigma_{M_k}''+\cdots+\varsigma_{M_{k+1}-1}''\}_k$ .

We now turn our attention to Conditions 3.3 and 3.4.

Lemma 4.5. Conditions 3.3 and 3.4 are satisfied under Assumption 2.1.

Proof. The mapping $t\mapsto{\mathrm{e}}^{-\varphi t}$ satisfies Conditions 3.3 and 3.4 if $\varphi>0$ is small enough. Indeed, let $\varphi\in(0,\alpha)$ be small, so that $\Gamma_{\geq 2}\cdot L^{(\alpha-\varphi)}_{\geq 3} <1$ , and put $g(t)={\mathrm{e}}^{-\varphi t}$ . Then

$$ \int_0^\infty\frac{1}{g(t)}{\mathrm{e}}^{-\alpha t}\eta({\mathrm{d}} t) = \int_0^\infty{\mathrm{e}}^{-(\alpha-\varphi)t}\eta({\mathrm{d}} t) = \gamma_1\mathcal{L}_{2}^{(\alpha-\varphi)}\sum_{n=0}^\infty\big(\Gamma_{\geq 2}\cdot L^{(\alpha-\varphi)}_{\geq 3}\big)^n<\infty. $$

In the following, x is a generic type- $\theta$ individual of Z with $\phi=\phi_x$ , $\mathscr B=\mathscr B_x$ , and $\tau_x=0$ . Clearly, since ${\mathrm{e}}^{-\alpha t}/g(t)$ is non-increasing in t we have

$$\frac{{\mathrm{e}}^{-\alpha t}\phi(t)}{g(t)} \leq \sum_{y\in\mathscr B}\frac{{\mathrm{e}}^{-\alpha\tau_y }}{g(\tau_y)}$$

for any $t\geq 0$ . Thus

$$ \mathbb{E}\bigg(\sup_{t\geq 0}\frac{{\mathrm{e}}^{-\alpha t}\phi(t)}{g(t)}\bigg) \leq \mathbb{E}\Bigg(\sum_{y\in\mathscr B}{\mathrm{e}}^{-(\alpha-\varphi)\tau_y}\Bigg) = \sum_{n=0}^\infty\big(\Gamma_{\geq 2}\cdot L^{(\alpha-\varphi)}_{\geq 3}\big)^n < \infty. $$

If the random characteristic $\phi$ is as in (3.4), then the same function g also satisfies Condition 3.4.

Proof of Theorem 3.1. Assume that Assumptions 2.1(i)–(iii) and 2.2 hold. By Lemma 4.1, the Malthusian parameter for the single-type branching process $\tilde Y$ is the unique solution $\alpha>0$ of $\Gamma\cdot L^{(\alpha)} = 1$ . By Lemma 4.2, Condition 3.1 holds; by Lemma 4.4, Condition 3.2 holds; and by Lemma 4.5, Conditions 3.3 and 3.4 hold. Hence, the conditions of Theorem 3.3 are satisfied, and by Remark 3.2 the convergence in (3.5) holds.

4.2. Proof of Theorem 3.2

Here we describe a probabilistically equivalent construction of $G_n^{\text{aux}}$ , which was defined in Section 2.2. This alternative way of constructing $G_n^{\text{aux}}$ is useful in the branching process approximation of the epidemic process since it allows us to run the epidemic process and construct $G_n$ in unison.

Throughout this section we denote the probability measure of A by p, i.e. $p([0, x])=\mathbb{P}(A\leq x)$ for $x\in[0,\infty)$ . Similarly, we denote the probability measure of the size-biased version $\overline{A}$ of A by $\overline{p}$ . Given the weights $A_1,\ldots, A_n$ , let $A_{(n)}$ be a random variable that follows the empirical distribution of $A_1,\ldots, A_n$ , and let $p_n$ be the corresponding probability measure, i.e.

\begin{equation*} p_n([0,x]) = \mathbb{P}(A_{(n)}\leq x\mid A_1,\ldots,A_n) = \frac{1}{n}\sum_{i=1}^n\mathbf{1}(A_i\leq x)\end{equation*}

for $x\in[0,\infty)$ . Also let $\overline{A}_{(n)}$ denote the size-biased version of $A_{(n)}$ , and let $\overline{p}_n$ be the corresponding probability measure (see the definition in (2.1)), i.e.

$$\overline{p}_n([0, x])=\frac{1}{n\mu_A^{(n)}}\sum_{i=1}^nA_i\mathbf{1}(A_i\leq x).$$

Similarly, conditioned on the weights $B_1,\ldots, B_m$ , let $B_{(n)}$ be a random variable that follows the empirical distribution of $B_1,\ldots, B_m$ , and let $\overline{B}_{(n)}$ be the size-biased version of $B_{(n)}$ .
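Sampling from $\overline{p}_n$ amounts to a weighted draw from the realised weights; a small sketch, with illustrative weights of our own choosing:

```python
import random

rng = random.Random(42)

# Illustrative realised weights B_1, ..., B_m (our numbers)
B = [0.5, 1.0, 1.5, 3.0]
m = len(B)
mu_B = sum(B) / m                     # plays the role of mu_B^{(n)} in the text

# A draw of the size-biased index: k is selected with probability B_k / (m * mu_B)
draws = rng.choices(range(m), weights=B, k=100_000)

# Empirical frequency of the heaviest weight; its theoretical probability is 3.0/6.0
freq = draws.count(3) / len(draws)
print(abs(freq - 0.5) < 0.02)
```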

To construct $G_n^{\text{aux}}$ , start by picking some vertex u of $V_n$ according to some rule (e.g. uniformly at random). Typically, u represents the initial case of the epidemic. Put $\mathcal E_0=\mathcal E_0'=\mathcal N_0=\varnothing$ . The component of $G_n^{\text{aux}}$ that contains u is now constructed by iterating the following steps for $t=1,2,\ldots$ :

  (i) If $t=1$, let $v=u$ be the vertex that is currently being explored. Generate the downshifted (i.e. reduced by one) group degree D of v from the distribution given in (2.2). Here, D represents the (additional) number of cliques of which v is a member.

  (ii) Draw D elements, $B_{(1)},\ldots,B_{(D)}$ , from the multiset of (realised) weights $\{B_1,\ldots,B_m\}$ independently with replacement. The probability of selecting $B_k\in \{B_1,\ldots,B_m\}$ in a specific draw is given by $B_k/m\mu_B^{(n)}$ , where $\mu_B^{(n)}=\sum_{k=1}^mB_k/m$ . In other words, we generate D independent copies of $\overline{B}_{(n)}$ .

  (iii) Let $v'_{(j)}\in V_n'$ be the vertex that corresponds to $B_{(j)}$ , $j=1,\ldots, D$ . For each $v_{(j)}'\in\{v'_{(1)},\ldots,v'_{(D)}\}$ , if $v'_{(j)}$ is not a member of the set $\mathcal E'_{t-1}\subset V_n'$ of hitherto explored vertices, carry out the following steps (a) to (d). If $v_{(j)}'\in \mathcal E'_{t-1}$ then the clique that corresponds to $v_{(j)}'$ is already explored, so $v_{(j)}'$ is excluded from these steps.

    (a) Generate the downshifted clique size $D_j'$ of $v_{(j)}'$ from the distribution given in (2.3) with $B_{(j)}$ in place of $B_j$ .

    (b) Select $D_j'$ elements $A_{(1)}^j,\ldots,A_{(D_j')}^j$ from the multiset of (realised) weights $\{A_1,\ldots,A_n\}$ independently with replacement. The probability of selecting $A_k$ in a specific draw is given by $A_k/n\mu_A^{(n)}$ , where $\mu_A^{(n)}=\sum_{k=1}^nA_k/n$ . In other words, we generate $D'_j$ independent copies of $\overline{A}_{(n)}$ .

    (c) Let $v_k^j\in V_n$ be the vertex that corresponds to $A_{(k)}^j$ , $k=1,\ldots, D_j'$ .

    (d) Add an edge between each pair of distinct vertices in $\{v\}\cup\{v_k^j\}_{k=1}^{D'_j}\setminus\mathcal E_{t-1}$ to $G_n$ .

  (iv) Update the set $\mathcal E_t'$ of explored cliques by putting $\mathcal E_{t}'=\mathcal E_{t-1}'\cup\{v'_{(1)},\ldots,v'_{(D)}\}$ .

  (v) Update the set $\mathcal N_t$ of neighbours of explored vertices by putting $\mathcal N_{t}=\mathcal N_{t-1}\cup \{v_k^j\colon j=1,\ldots,D \text{ and } k=1,\ldots, D'_j \}$ .

  (vi) Update the set $\mathcal E_t$ of explored vertices by putting $\mathcal E_{t}=\mathcal E_{t-1}\cup\{v\}$ .

  (vii) If $\mathcal N_t=\mathcal E_t$ then the construction of the component is complete and we exit the algorithm. Otherwise, update v by picking some new vertex in $\mathcal N_t\setminus\mathcal E_t$ . If $G_n^{\text{aux}}$ is constructed as the epidemic progresses, then the new vertex v is the tth non-initial case.
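The exploration loop can be sketched in code. The following is a simplified illustration only: the size-biased weight draws of steps (ii) and (b) are replaced by uniform draws, the degree distributions (2.2) and (2.3) by Poisson laws with fixed means, and all parameter values are arbitrary.

```python
import math
import random

def poisson(lam, rng):
    """Poisson sample via Knuth's method (adequate for small means)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def explore_component(n, m, mean_groups, mean_clique, seed=0):
    """Simplified sketch of the exploration algorithm on n vertices and
    m cliques; returns the explored component and the clique edges."""
    rng = random.Random(seed)
    explored, explored_cliques = set(), set()
    to_explore = {rng.randrange(n)}            # the initial case u
    edges = set()
    while to_explore:                          # step (vii): pick next case
        v = to_explore.pop()
        explored.add(v)                        # step (vi)
        D = poisson(mean_groups, rng)          # step (i): group degree
        for c in [rng.randrange(m) for _ in range(D)]:
            if c in explored_cliques:          # step (iii): skip explored cliques
                continue
            explored_cliques.add(c)            # step (iv)
            size = poisson(mean_clique, rng)   # step (a): clique size
            members = {rng.randrange(n) for _ in range(size)}
            for u in members - {v}:            # step (d): clique edges
                edges.add((min(u, v), max(u, v)))
            to_explore |= members - explored - {v}   # step (v)
    return explored, edges

comp, edges = explore_component(n=1000, m=600, mean_groups=1.2, mean_clique=1.5)
```

Every newly seen clique member is queued for later exploration, so the loop visits each vertex of the component exactly once and terminates after at most n iterations.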

If the nodes $v_{(1)}',\ldots,v_{(D)}'$ are not distinct or $\{v_{(1)}',\ldots,v_{(D)}'\} \cap \mathcal E'_{t-1} \neq \varnothing$ in step (iii) for some iteration $t\in\{1,2,\ldots\}$ then the coupling of the approximating branching process and the epidemic process breaks down. Similarly, if in some iteration t the $v_k^j$ are not all distinct, or $v_k^j\in \mathcal E_{t-1}$ for some $j\in\{1,\ldots, D\}$ and $k\in\{ 1,\ldots, D_j'\}$ , then the coupling breaks down.

The following lemma, which is a variant of the birthday problem, ensures that w.h.p. the coupling of the approximating branching process and the epidemic process holds during the first $o(\sqrt n)$ steps of the construction algorithm. The proof is included here for completeness.

Lemma 4.6. Let $j_n=o(\sqrt{n})$ , and let A have finite second moment. Suppose that we draw elements from the multiset $\{A_1,\ldots,A_n\}$ independently with replacement, and that the probability that $A_i\in\{A_1,\ldots,A_n\}$ is selected in a specific draw is proportional to $A_i$ . Then the first $j_n$ drawn elements are distinct with $\mathbb{P}$ -probability tending to 1 as $n\to\infty$ .

Proof. For $k=1,2,\ldots$ , let $A^{(k)}$ be the kth element that is drawn from $\{A_1,\ldots, A_n\}$ , and let $\mathbb{E}_n(\cdot)$ be the conditional expectation operator given $A_1,\ldots, A_n$ . Conditioned on $A_1,\ldots, A_n$ , for $k\geq 2$ the probability that $A^{(k)}$ is not distinct from $A^{(j)}$ for some $j\in\{1,\ldots, k-1\}$ is less than or equal to

$$ \mathbb{E}_n\bigg(\frac{A^{(1)}+\cdots+A^{(k-1)}}{A_1+\cdots+A_n} \bigg) = (k-1){\frac{A_1^2+\cdots+A_n^2}{(A_1+\cdots+A_n)^2}}. $$

Thus, by the union bound, conditioned on $\{A_1,\ldots, A_n\}$ the probability that the first $j_n$ drawn elements are not distinct is less than or equal to

\begin{align*} \sum_{k=1}^{j_n}(k-1){\frac{A_1^2+\cdots+A_n^2}{(A_1+\cdots+A_n)^2}} & = \frac{(j_n-1)j_n(A_1^2+\cdots+A_n^2)}{2(A_1+\cdots+A_n)^2} \\ & = \bigg(\frac{(j_n-1)j_n}{n}\bigg)\left(\frac{(A_1^2+\cdots+A_n^2)/n}{2\big(\mu_A^{(n)}\big)^2}\right). \end{align*}

Since A has finite second moment,

$$\frac{(A_1^2+\cdots+A_n^2)/n}{2\big(\mu_A^{(n)}\big)^2}$$

converges to $\mathbb{E}(A^2)/2\mu_A^2$ in $\mathbb{P}$ -probability as $n\to\infty$ . Hence,

$$ \mathbb{P}\big(A^{(1)},\ldots,A^{(j_n)}\text{ are distinct}\big) \geq 1 - \mathbb{E}\left(1\wedge\bigg(\frac{(j_n-1)j_n}{n}\bigg) \frac{(A_1^2+\cdots+A_n^2)/n}{2\big(\mu_A^{(n)}\big)^2}\right), $$

where the right-hand side tends to 1 as $n\to\infty$ .
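The statement of Lemma 4.6 can also be checked by simulation. In the following sketch the weights are taken i.i.d. Exp(1), which has a finite second moment, and the parameter values are arbitrary:

```python
import bisect
import random

def all_distinct(weights, j, rng):
    """Draw j indices with replacement, index i chosen with probability
    proportional to weights[i]; report whether all j draws are distinct."""
    cum, acc = [], 0.0
    for w in weights:
        acc += w
        cum.append(acc)
    seen = set()
    for _ in range(j):
        idx = bisect.bisect_left(cum, rng.random() * acc)
        if idx in seen:
            return False
        seen.add(idx)
    return True

rng = random.Random(1)
n, trials = 40000, 200
j = int(n ** 0.25)                # j_n grows strictly slower than sqrt(n)
weights = [rng.expovariate(1.0) for _ in range(n)]
frac_distinct = sum(all_distinct(weights, j, rng) for _ in range(trials)) / trials
```

With $j_n \approx n^{1/4}$ the per-trial collision probability is of order $j_n^2/n$, so nearly all trials produce distinct draws, in agreement with the lemma.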

In the construction algorithm described previously, the weights of explored vertices in $V_n$ and $V_n'$ are generated from the distributions of $\overline{A}_{(n)}$ and $\overline{B}_{(n)}$ , whereas the weights in the approximating branching process are generated from the distributions of $\overline{A}$ and $\overline{B}$ . Therefore, in order to prove Theorem 3.2 we find upper bounds on the coupling error between MP $(\overline{A}_{(n)})$ and MP $(\overline{A})$ , and between MP $(\overline{B}_{(n)})$ and MP $(\overline{B})$ , which we state in Proposition 4.2 and Lemma 4.7.

Given the weights $A_1,\ldots, A_{n}$ , for a coupling $\mathscr C$ of A and $A_{(n)}$ we denote the corresponding probability measure and expectation by $\mathbb{P}_{\mathscr C}$ and $\mathbb{E}_{\mathscr C}$ , respectively. A coupling of two random variables with distributions MP $(\overline{A})$ and MP $(\overline{A}_{(n)})$ can be constructed by first constructing a coupling $\mathscr C$ of their respective intensities $\overline{A}$ and $\overline{A}_{(n)}$ , then generating a joint realisation $(\overline{A}',\overline{A}_{(n)}')$ of these intensities according to $\mathscr C$ and in the next step using these intensities to generate two random variables from the Poisson distribution. If in the last step a maximal coupling is used then the (conditional) probability of a miscoupling is given by $\tfrac{1}{2}d_{\mathrm{TV}}(\mathrm{Po}(\overline{A}'),\mathrm{Po}(\overline{A}_{(n)}'))$ . Here, $d_{\mathrm{TV}}$ denotes the total variation distance, i.e. for $a,b\in\mathbb{R}_{\geq 0}$ ,

$$d_{\mathrm{TV}}(\mathrm{Po}(a),\mathrm{Po}(b)) =\sum_{k\geq 0}\bigg\vert\frac{a^k{\mathrm{e}}^{-a}}{k!}-\frac{b^k{\mathrm{e}}^{-b}}{k!}\bigg\vert.$$

Thus, given the distribution of $A_{(n)}$ , the probability of a miscoupling is given by $\frac12\mathbb{E}_{\mathscr C}(d_{\mathrm{TV}}(\mathrm{Po}(\overline{A})$ , $\mathrm{Po}(\overline{A}_{(n)})))$ .

Let the coupling $\mathscr C_n$ of $\overline{A}$ and $\overline{A}_{(n)}$ be given by

(4.8) \begin{gather} \mathscr C_n \,:\!=\, \underset{\mathscr C}{\text{argmin}}\,\mathbb{E}_{\mathscr C} \big\vert\sqrt{\overline{A}}-\sqrt{\overline{A}_{(n)}}\big\vert\end{gather}

for each $n\geq 1$ , where the minimum extends over all couplings $\mathscr C$ of $\overline{A}$ and $\overline{A}_{(n)}$ .

In the following proposition, $A_1,\ldots,A_n$ are random with respect to $\mathbb{P}$ . Thus, under $\mathbb{P}$ , $\mathscr C_n$ is a coupling of $\overline{p}$ and the random probability measure $\overline{p}_n$ .

Proposition 4.2. Assume that $\mathbb{E}(A^q)<\infty$ for $q>\frac{3}{2}$ . Let $(\varepsilon_n)_{n\geq 1}$ be a sequence in $(0,\infty)$ such that if $q\neq 3$ then $\varepsilon_n\log(n)\to\infty$ as $n\to\infty$ and if $q=3$ then $\varepsilon_n\log(n)-\log(\log(n))\to\infty$ as $n\to\infty$ . Then $\mathbb{P}(\mathbb{E}_{\mathscr C_n}(d_{\mathrm{TV}}(\mathrm{Po}(\overline{A}),\mathrm{Po}(\overline{A}_{(n)}))) \geq n^{-\gamma+\varepsilon_n}) \to 0$ as $n\to\infty$ , where $\gamma=\frac{1}{2}\wedge\big({q-\frac32}\big)/{q}$ .

Remark 4.2. Proposition 4.2 also holds if A, $\overline{A}$ , and $\overline{A}_{(n)}$ are replaced by B, $\overline{B}$ , and $\overline{B}_{(n)}$ .

Proof of Proposition 4.2. We have (the first inequality following from [Reference Barbour, Holst and Janson10, Theorem 1.C])

\begin{align*} \mathbb{E}_{\mathscr C_n}(d_{\mathrm{TV}}(\mathrm{Po}(\overline{A}),\mathrm{Po}(\overline{A}_{(n)}))) & \leq \mathbb{E}_{\mathscr C_n}\left(\frac{1}{\sqrt{\overline{A}_{(n)}\vee\overline{A}}\,} \big\vert{\overline{A}}-{\overline{A}_{(n)}}\big\vert\right) \\ & \leq \mathbb{E}_{\mathscr C_n}\left(\frac{1}{\frac{1}{2}\big(\sqrt{\overline{A}}+\sqrt{\overline{A}_{(n)}}\big)} \big\vert\sqrt{\overline{A}}+\sqrt{\overline{A}_{(n)}}\big\vert\cdot \big\vert\sqrt{\overline{A}}-\sqrt{\overline{A}_{(n)}}\big\vert\right) \\ & = 2\mathbb{E}_{\mathscr C_n}\big(\big\vert\sqrt{\overline{A}}-\sqrt{\overline{A}_{(n)}}\big\vert\big). \end{align*}

Hence, by Proposition 4.3,

$$ \mathbb{E}\big(\mathbb{E}_{\mathscr C_n}(d_{\mathrm{TV}}(\mathrm{Po}(\overline{A}),\mathrm{Po}(\overline{A}_{(n)})));\,D_n\big) = \begin{cases} \mathcal{O}\big(n^{-({q-3/2})/{q}}\big) & \text{if } \frac32 < q < 3, \\ \mathcal{O}\big(n^{-{1}/{2}}\log(n)\big) & \text{if } q = 3, \\ \mathcal{O}\big(n^{-{1}/{2}}\big) & \text{if } q > 3, \end{cases} $$

where $D_n$ is the event $\max_{i\in[n]}(A_i)>0$ . The assertion of the proposition now follows from Markov’s inequality and $\mathbb{P}(D_n^{\mathrm{c}})=p(\{0\})^n$ , $p(\{0\})<1$ .

By the degree distributions of $G_n^{\text{aux}}$ given in (2.2) and (2.3), in order to arrive at Theorem 3.2 we will also need a bound on the coupling error of MP $(\overline{B}_{(n)}{\mu_A^{(n)}}/{\mu_A})$ and MP $(\overline{B}_{(n)})$ , and of MP $(\overline{A}_{(n)}{\mu_B^{(n)}\lfloor n\mu_A/\mu_B\rfloor}/{n\mu_A})$ and MP $(\overline{A}_{(n)})$ .

Proposition 4.3. (cf. [Reference Fournier and Guillin16, Theorem 1].) If $\mathbb{E}(A^{q})<\infty$ for $q > \frac32$ then

$$ \mathbb{E}\big(\mathbb{E}_{\mathscr C_n}\big(\big\vert\sqrt{\overline{A}}-\sqrt{\overline{A}_{(n)}}\big\vert\big); \, D_n\big) = \begin{cases} \mathcal{O}\big(n^{-({q-3/2})/{q}}\big) & \text{if } \frac32 < q < 3, \\ \mathcal{O}\big(n^{-{1}/{2}}\log(n)\big) & \text{if } q = 3, \\ \mathcal{O}(n^{-{1}/{2}}) & \text{if } q > 3, \\ \end{cases} $$

where ${\mathscr C_n}$ is the coupling in (4.8) and $D_n$ is the event that $\max_{i\in[n]}(A_i)>0$ .

Proof. This proof is, in part, analogous to the proof of [Reference Fournier and Guillin16, Theorem 1] and is presented here in full for completeness. The differences arise mainly due to the size-biasing of the weights.

Throughout, $C_1,C_2,\ldots$ are positive constants that depend only on q and the distribution of A, and $U\subset[0,\infty)$ is a generic Borel set.

Note that $\mathbb{E}_{\mathscr C_n}\big\vert\sqrt{\overline{A}}-\sqrt{\overline{A}_{(n)}}\big\vert$ is the 1-Wasserstein distance between the distributions of $\sqrt{\overline{A}}$ and $\sqrt{\overline{A}_{(n)}}$ . Hence, with the notation $2^mF=\{2^mu\colon u\in F\}$ for any event F, by [Reference Fournier and Guillin16, Lemmas 5 and 6],

(4.9) \begin{equation} \mathbb{E}_{\mathscr C_n}\big\vert\sqrt{\overline{A}}-\sqrt{\overline{A}_{(n)}}\big\vert \leq C_1\sum_{m\geq 0}2^m\sum_{\ell\geq 0}2^{-\ell}\sum_{F\in\mathcal P_\ell}\vert\eta_n(2^m F\cap U_m)-\eta(2^m F\cap U_m)\vert, \end{equation}

where $\mathcal P_\ell$ is the partition of [0, 1] that consists of $\{0\}$ and $2^{-\ell+1}k+(0,2^{-\ell+1}]$ for $k\in\{0,1,\ldots,2^{\ell-1}-1\}$ , $U_0=[0,1]$ and $U_m=[0,2^m]\setminus [0, 2^{m-1}]$ for $m\geq 1$ , and $\eta$ and $\eta_n$ are the probability distributions of $\sqrt{\overline{A}}$ and $\sqrt{\overline{A}_{(n)}}$ , respectively. Now, with the notation $U^2=\{u^2\colon u\in U\}$ , by the triangle inequality,

(4.10) \begin{align} \vert\eta_n(U)-\eta(U)\vert & = \vert\overline{p}_n(U^2)-\overline{p}(U^2)\vert \nonumber \\ & = \bigg\vert\frac{\sum_{i=1}^nA_i\mathbf{1}(A_i\in U^2)}{n\mu_A^{(n)}} - \frac{\mathbb{E}(A\mathbf{1}(A\in U^2))}{\mu_A}\bigg\vert \nonumber \\ & \leq \frac{1}{n\mu_A^{(n)}}\sum_{i=1}^n{A_i\mathbf{1}(A_i\in U^2}) \bigg\vert1-\frac{\mu_A^{(n)}}{\mu_A}\bigg\vert \nonumber \\ & \quad + \frac{1}{\mu_A}\Bigg\vert{\frac{1}{n}\sum_{i=1}^nA_i\mathbf{1}(A_i\in U^2)} - {\mathbb{E}(A\mathbf{1}(A\in U^2))}\Bigg\vert. \end{align}

The second term on the right-hand side in (4.10) satisfies (with the first inequality following from Jensen’s inequality)

(4.11) \begin{align} \mathbb{E}\Bigg(\Bigg\vert{\frac{1}{n}\sum_{i=1}^nA_i\mathbf{1}(A_i\in U^2)} - {\mathbb{E}(A\mathbf{1}(A\in U^2))}\Bigg\vert\Bigg) & \leq \sqrt{\mathrm{Var}\Bigg({\frac{1}{n}\sum_{i=1}^nA_i\mathbf{1}(A_i\in U^2)}\Bigg)} \nonumber \\ & \leq (\sup U)^2\sqrt{\frac{1}{n}p(U^2)} \end{align}

and, again by the triangle inequality,

(4.12) \begin{equation} \mathbb{E}\Bigg(\Bigg\vert\frac{1}{n}\sum_{i=1}^nA_i\mathbf{1}(A_i\in U^2) - \mathbb{E}(A\mathbf{1}(A\in U^2))\Bigg\vert\Bigg) \leq 2(\sup U)^2p(U^2). \end{equation}

Combining (4.11) and (4.12) gives that the expectation of the second term on the right-hand side in (4.10) satisfies

(4.13) \begin{equation} \mathbb{E}\Bigg(\frac{1}{\mu_A}\Bigg\vert{\frac{1}{n}\sum_{i=1}^nA_i\mathbf{1}(A_i\in U^2)} - {\mathbb{E}(A\mathbf{1}(A\in U^2))}\Bigg\vert\Bigg) \leq C_2(\sup U)^2\bigg(\sqrt{\frac{1}{n}p(U^2)}\wedge p(U^2)\bigg). \end{equation}

In order to find a similar upper bound on the first term on the right-hand side in (4.10) we define the event $S_n$ as

(4.14) \begin{gather} S_n \,:\!=\, \big\{\mu_A^{{(n)}}\leq \mu_A-\kappa\big\}, \end{gather}

where $\kappa\in(0,\mu_A)$ is a fixed constant such that ${\mathrm{e}}^{(\mu_A-\kappa)}\mathbb{E}({\mathrm{e}}^{-A})<1$ . Then, with the second inequality following from Hölder’s inequality and the last inequality from $\sum_{i=1}^n\mathbf{1}(A_i\in U^2)\sim \mathrm{Bin}(n, p(U^2))$ ,

(4.15) \begin{align} & \mathbb{E}\Bigg(\frac{1}{n\mu_A^{(n)}}\sum_{i=1}^nA_i\mathbf{1}(A_i\in U^2) \bigg\vert1-\frac{\mu_A^{(n)}}{\mu_A}\bigg\vert;\,S_n^{\mathrm{c}}\cap D_n\Bigg) \nonumber \\ & \qquad \leq \frac{(\sup U)^2}{(\mu_A-\kappa)}\Bigg(\frac{1}{n}\mathbb{E}\Bigg(\sum_{i=1}^n \mathbf{1}(A_i\in U^2)\bigg\vert1-\frac{\mu_A^{(n)}}{\mu_A}\bigg\vert\Bigg) \wedge (2p(U^2))\Bigg) \nonumber \\ & \qquad \leq \frac{(\sup U)^2}{(\mu_A-\kappa)}\Bigg(\sqrt{\frac{1}{n^3}\mathbb{E}\Bigg(\Bigg( \sum_{i=1}^n\mathbf{1}(A_i\in U^2)\Bigg)^2\Bigg)\mathrm{Var}\bigg(\frac{A}{\mu_A}\bigg)} \wedge (2p(U^2))\Bigg) \nonumber \\ & \qquad \leq \frac{(\sup U)^2}{(\mu_A-\kappa)}\bigg(\sqrt{\frac{1}{n^3}np(U^2)(np(U^2)+1) \mathrm{Var}\bigg(\frac{A}{\mu_A}\bigg)} \wedge (2p(U^2))\bigg) \nonumber \\ & \qquad \leq C_3(\sup U)^2\bigg(\sqrt{\frac{1}{ n}p(U^2)}\wedge p(U^2)\bigg). \end{align}

By the inequalities in (4.13) and (4.15), together with the fact that, for any $G\subset [0,\infty)$ , $\sup(G\cap U_m^2)\leq \sup(U_m^2)= 2^{2m}$ , we have

(4.16) \begin{multline} \sum_{F\in\mathcal P_\ell}\mathbb{E}\big(\vert\eta_n(2^m F\cap U_m)-\eta(2^m F\cap U_m)\vert; \, D_n\cap S_n^{\mathrm{c}}\big) \\ \leq C_4\Bigg(2^{2m}\sum_{F\in\mathcal P_\ell}\bigg(\sqrt{\frac{1}{n}p(2^{2m}F^2\cap U_m^2)} \wedge p(2^{2m} F^2\cap U_m^2)\bigg)\Bigg) \end{multline}

and (with the second inequality following from the Cauchy–Schwarz inequality and the fact that, since $\mathbb{E}(A^q)$ is finite, $p(U_m^2)\leq \mathbb{E}(A^q)2^{-2qm}$ )

(4.17) \begin{align} \sum_{F\in\mathcal P_\ell}\bigg(\sqrt{\frac{1}{n}p(2^{2m}F^2\cap U_m^2)} \wedge p(2^{2m}F^2\cap U_m^2)\bigg) & \leq \Bigg(\frac{1}{\sqrt{n}\,}\sum_{F\in\mathcal P_\ell}\sqrt{p(2^{2m}F^2\cap U_m^2)}\Bigg) \wedge p(U_m^2) \nonumber \\ & \leq C_5\bigg(\bigg(\frac{1}{\sqrt{n}\,}\sqrt{\vert\mathcal P_\ell\vert p(U_m^2)}\bigg) \wedge 2^{-2qm}\bigg) \nonumber \\ & \leq C_5\bigg(\bigg(\frac{1}{\sqrt{n}\,}\sqrt{2^\ell\mathbb{E}(A^q)2^{-2qm}}\bigg) \wedge 2^{-2qm}\bigg) \nonumber \\ & \leq C_6\bigg({\frac{1}{\sqrt{n}\,}2^{\ell/2-qm} }\wedge 2^{-2qm}\bigg). \end{align}

Hence, by (4.9), (4.16), and (4.17),

\begin{align*} \mathbb{E}\big(\mathbb{E}_{\mathscr C_n}\big(\big\vert\sqrt{\overline{A}}-\sqrt{\overline{A}_{(n)}}\big\vert\big); \,S_n^{\mathrm{c}} \cap D_n\big) & \leq C_7\sum_{m\geq 0}2^m\sum_{\ell\geq 0}2^{-\ell+2m}\bigg({\frac{1}{\sqrt{n}\,}2^{\ell/2-qm}}\wedge2^{-2qm}\bigg) \\ & \leq C_7\sum_{m\geq 0}2^{3m}\sum_{\ell\geq 0}2^{-\ell/2}\bigg({\frac{1}{\sqrt{n}\,}2^{-qm}} \wedge 2^{-2qm}\bigg) \\ & \leq C_8\sum_{m\geq 0}\bigg({\frac{1}{\sqrt{n}}2^{m(3-q)}} \wedge 2^{-m(2q-3)}\bigg). \end{align*}

If $q>3$ then

$$ C_8\sum_{m\geq 0}\bigg({\frac{1}{\sqrt{n}\,}2^{m(3-q)}} \wedge 2^{-m(2q-3)}\bigg) \leq C_9\frac{1}{\sqrt{n}\,}. $$

If $q\in\big(\frac32, 3\big)$ then, with $m_n=\lfloor \log\!(n)/(2q\log\!(2))\rfloor$ ,

$$ C_8\sum_{m\geq 0}\!\bigg(\!{\frac{1}{\sqrt{n}\,}2^{m(3-q)}} \wedge 2^{-m(2q-3)}\!\bigg) \leq C_8\!\sum_{m=0}^{m_n}\!{\frac{1}{\sqrt{n}\,}2^{-m(q-3)}} + C_8\!\sum_{m>m_n}^\infty\! 2^{-m(2q-3)} = O\big(n^{-(1-{3}/{2q})}\big). $$

If $q=3$ then, with $a_n = \lfloor\log\!(n)/\!\log\!(2)\rfloor$ ,

$$ C_8\sum_{m\geq 0}\bigg({\frac{1}{\sqrt{n}\,}2^{m(3-q)}} \wedge 2^{-m(2q-3)}\bigg) \leq C_8{\frac{a_n}{\sqrt{n}\,}} + C_8\sum_{m\geq a_n}^\infty 2^{-m(2q-3)} = O\big(n^{-1/2}\log(n)\big). $$

Thus, it only remains to bound the expectation of $\mathbb{E}_{\mathscr C_n}\big(\big\vert\sqrt{\overline{A}}-\sqrt{\overline{A}_{(n)}}\big\vert\big)$ on $S_n\cap D_n$ , where $S_n$ is the event in (4.14). Now, with the third inequality following from the fact that on $S_n$ we have $\sqrt{A_i} \leq \sqrt{n(\mu_A-\kappa)}$ for $i=1,\ldots, n$ ,

\begin{align*} \mathbb{E}\big(\mathbb{E}_{\mathscr C_n}\big(\big\vert\sqrt{\overline{A}}-\sqrt{\overline{A}_{(n)}}\big\vert\big); \,S_n \cap D_n\big) & \leq \mathbb{P}(S_n\cap D_n)\mathbb{E}\big(\sqrt{\overline{A}}\big) + \mathbb{E}\big(\sqrt{\overline{A}_{(n)}};\,S_n\cap D_n\big) \\ & = \mathbb{P}(S_n\cap D_n)\mathbb{E}\big(\sqrt{\overline{A}}\big) + \mathbb{E}\Bigg(\frac{\sum_{i=1}^nA_i^{{3}/{2}}}{\sum_{i=1}^nA_i};\,S_n\cap D_n\Bigg) \\ & \leq \mathbb{P}(S_n\cap D_n)\mathbb{E}\big(\sqrt{\overline{A}}\big) + \mathbb{E}\Bigg({\sum_{i=1}^n\sqrt{A_i}};\,S_n\cap D_n\Bigg) \\ & \leq \mathbb{P}(S_n\cap D_n)\mathbb{E}\big(\sqrt{\overline{A}}\big) + \mathbb{E}(n^{3/2}\sqrt{\mu_A-\kappa};\,S_n\cap D_n) \\ & = \mathcal{O}\big(n^{3/2}\mathbb{P}(S_n)\big) = \mathcal{O}\big(n^{3/2}{\mathrm{e}}^{n(\mu_A-\kappa)}\mathbb{E}({\mathrm{e}}^{-A})^n\big), \end{align*}

where the last step follows from the Chernoff bound $\mathbb{P}(S_n)\leq{\mathrm{e}}^{n(\mu_A-\kappa)}\mathbb{E}({\mathrm{e}}^{-A})^n$ . The assertion now follows by recalling that ${\mathrm{e}}^{(\mu_A-\kappa)}\mathbb{E}({\mathrm{e}}^{-A})<1$ .
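Since, for real-valued random variables, the minimising coupling in (4.8) is the quantile coupling, the expected distance $\mathbb{E}_{\mathscr C_n}\vert\sqrt{\overline{A}}-\sqrt{\overline{A}_{(n)}}\vert$ can be estimated by matching sorted samples. A Monte Carlo sketch with the illustrative choice $A\sim\mathrm{Exp}(1)$, whose size-biased version is $\Gamma(2,1)$ (all parameter values below are arbitrary):

```python
import bisect
import random

def w1_sqrt_gap(n, k, seed=0):
    """Quantile-coupling (sorted-sample) estimate of the expected distance
    between sqrt(Abar) and sqrt(Abar_(n)) for A ~ Exp(1)."""
    rng = random.Random(seed)
    weights = [rng.expovariate(1.0) for _ in range(n)]
    cum, acc = [], 0.0
    for w in weights:
        acc += w
        cum.append(acc)
    # k draws of sqrt(Abar_(n)) from the size-biased empirical measure
    emp = sorted(weights[bisect.bisect_left(cum, rng.random() * acc)] ** 0.5
                 for _ in range(k))
    # k draws of sqrt(Abar); Gamma(2,1) is the sum of two Exp(1) variables
    lim = sorted((rng.expovariate(1.0) + rng.expovariate(1.0)) ** 0.5
                 for _ in range(k))
    # in one dimension the W1-optimal coupling matches sorted samples
    return sum(abs(x - y) for x, y in zip(emp, lim)) / k

gap_small_n = w1_sqrt_gap(n=50, k=4000)
gap_large_n = w1_sqrt_gap(n=20000, k=4000)
```

For a fixed seed the estimate typically shrinks as n grows, consistent with the rates in Proposition 4.3; note that the sorted-sample estimator itself carries Monte Carlo error of order $k^{-1/2}$.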

Lemma 4.7. Let $\varepsilon_n$ be as in Proposition 4.2, and assume that $\mathbb{E}(A^2),\mathbb{E}(B^2)<\infty$ . Then, w.h.p.,

(4.18) \begin{equation} d_{\mathrm{TV}}\big(\mathrm{Po}\big(\overline{B}_{(n)}{\mu_A^{(n)}}/{\mu_A}\big), \mathrm{Po}(\overline{B}_{(n)})\big) \leq n^{-{1}/{2}+\varepsilon_n}, \end{equation}
(4.19) \begin{equation} d_{\mathrm{TV}}\big(\mathrm{Po}\big(\overline{A}_{(n)}{\mu_B^{(n)}\lfloor n\mu_A/\mu_B\rfloor}/{n\mu_A}\big), \mathrm{Po}(\overline{A}_{(n)})\big) \leq n^{-{1}/{2}+\varepsilon_n}. \end{equation}

Proof. Define $H_n \,:\!=\, \big\{\big\vert1-{\mu_A^{(n)}}/{\mu_A}\big\vert \leq n^{-{1}/{2}+({1}/{2})\varepsilon_n}\big\}$ . By Chebyshev’s inequality, $\mathbb{P}(H_n^{\mathrm{c}}) = \mathcal{O}(n^{-\varepsilon_n})$ . Now (again by [Reference Barbour, Holst and Janson10, Theorem 1.C]),

\begin{align*} \mathbb{E}\big(d_{\mathrm{TV}}\big(\mathrm{Po}\big(\overline{B}_{(n)}{\mu_A^{(n)}}/{\mu_A}\big), \mathrm{Po}(\overline{B}_{(n)})\big); \,H_n\big) & \leq \mathbb{E}\big(\big\vert\overline{B}_{(n)}{\mu_A^{(n)}}/{\mu_A}-\overline{B}_{(n)}\big\vert; \,H_n\big) \\ & \leq \mathbb{E}\big(\overline{B}_{(n)}n^{-{1}/{2}+({1}/{2})\varepsilon_n}\big) = \mathcal{O}\big(n^{-{1}/{2}+({1}/{2})\varepsilon_n}\big). \end{align*}

This implies, using the union bound and Markov’s inequality,

$$ \mathbb{P}\big(d_{\mathrm{TV}}\big(\mathrm{Po}\big(\overline{B}_{(n)}{\mu_A^{(n)}}/{\mu_A}\big), \mathrm{Po}(\overline{B}_{(n)})\big) \geq n^{-{1}/{2}+\varepsilon_n}\big) \leq \mathbb{P}(H_n^{\mathrm{c}}) + \mathcal{O}\big(n^{-\varepsilon_n/2}\big). $$

This proves the inequality in (4.18), and the proof of the inequality in (4.19) is completely analogous.

Proof of Theorem 3.2. Let $\varepsilon_n$ , q, and $\gamma$ be as in Theorem 3.2, and let $T\in\mathbb{N}_0$ be the number of iterations in the construction algorithm on page 16 (i.e. $\mathcal N_T=\mathcal E_T$ ). By Lemma 4.6, w.h.p. the vertices of $V_n$ and $V_n'$ that are explored in the first $\lfloor n^{\gamma-\varepsilon_n}\rfloor \wedge T$ steps of the construction algorithm are distinct. Combining Lemma 4.6 with Proposition 4.2 and Lemma 4.7 then gives the assertion of Theorem 3.2 by the union bound.

Acknowledgements

The authors would like to thank Pieter Trapman for fruitful discussions and constructive criticism of the paper. We would also like to thank an anonymous referee for their many valuable comments, which contributed to a considerable improvement of the paper.

Funding information

This research is supported by the Swedish Research Council (Vetenskapsrådet) grant 2016-04566.

Competing interests

There were no competing interests to declare during the preparation or publication of this article.

References

Ball, F., Mollison, D. and Scalia-Tomba, G. (1997). Epidemics with two levels of mixing. Ann. Appl. Prob. 7, 46–89.
Ball, F. and Neal, P. (2002). A general model for stochastic SIR epidemics with two levels of mixing. Math. Biosci. 180, 73–102.
Ball, F. and Neal, P. (2003). The great circle epidemic model. Stoch. Process. Appl. 107, 233–268.
Ball, F., Pellis, L. and Trapman, P. (2016). Reproduction numbers for epidemic models with households and other social structures II: Comparisons and implications for vaccination. Math. Biosci. 274, 108–139.
Ball, F. and Shaw, L. (2015). Estimating the within-household infection rate in emerging SIR epidemics among a community of households. J. Math. Biol. 71, 1705–1735.
Ball, F. and Sirl, D. (2012). An SIR epidemic model on a population with random network and household structure, and several types of individuals. Adv. Appl. Prob. 44, 63–86.
Ball, F., Sirl, D. and Trapman, P. (2009). Threshold behaviour and final outcome of an epidemic on a random network with household structure. Adv. Appl. Prob. 41, 765–796.
Ball, F., Sirl, D. and Trapman, P. (2010). Analysis of a stochastic SIR epidemic on a random network incorporating household structure. Math. Biosci. 224, 53–73.
Ball, F., Sirl, D. and Trapman, P. (2014). Epidemics on random intersection graphs. Ann. Appl. Prob. 24, 1081–1128.
Barbour, A., Holst, L. and Janson, S. (1992). Poisson Approximation. Clarendon Press, Oxford.
Becker, N. G. and Dietz, K. (1995). The effect of household distribution on transmission and control of highly infectious diseases. Math. Biosci. 127, 207–219.
Biggins, J. D. and Kyprianou, A. E. (2004). Measure change in multitype branching. Adv. Appl. Prob. 36, 544–581.
Britton, T., Deijfen, M., Lagerås, A. N. and Lindholm, M. (2008). Epidemics on random graphs with tunable clustering. J. Appl. Prob. 45, 743–756.
Broadbent, S. R. and Hammersley, J. M. (1957). Percolation processes: I. Crystals and mazes. Math. Proc. Camb. Phil. Soc. 53, 629–641.
Dye, C. et al. (2015). Ebola virus disease in West Africa: The first 9 months. New England J. Medicine 372, 189.
Fournier, N. and Guillin, A. (2015). On the rate of convergence in Wasserstein distance of the empirical measure. Prob. Theory Relat. Fields 162, 707–738.
Frisch, H. and Hammersley, J. (1963). Percolation processes and related topics. J. SIAM 11, 894–918.
Heesterbeek, H., Britton, T. and Diekmann, O. (2013). Mathematical Tools for Understanding Infectious Disease Dynamics. Princeton University Press.
Iksanov, A. and Meiners, M. (2015). Rate of convergence in the law of large numbers for supercritical general multi-type branching processes. Stoch. Process. Appl. 125, 708–738.
Jagers, P. (1975). Branching Processes with Biological Applications. John Wiley, Chichester.
Jagers, P. (1989). General branching processes as Markov fields. Stoch. Process. Appl. 32, 183–212.
Jagers, P. (1993). Stabilities and instabilities in population dynamics. In Stability Problems for Stochastic Models, eds V. V. Kalashnikov and V. M. Zolatarev. Springer, Berlin, pp. 58–67.
Karoński, M., Scheinerman, E. and Singer-Cohen, K. (1999). On random intersection graphs: The subgraph problem. Combinatorics Prob. Comput. 8, 131–159.
Ludwig, D. (1975). Final size distribution for epidemics. Math. Biosci. 23, 33–46.
Neal, P. (2008). The SIS great circle epidemic model. J. Appl. Prob. 45, 513–530.
Nerman, O. (1981). On the convergence of supercritical general (C-M-J) branching processes. Z. Wahrscheinlichkeitsth. 57, 365–395.
Nishiura, H. and Chowell, G. (2014). Early transmission dynamics of ebola virus disease (EVD), West Africa, March to August 2014. Eurosurveillance 19, 20894.
Nummelin, E. (2004). General Irreducible Markov Chains and Non-Negative Operators. Cambridge University Press.
Olofsson, P. (1996). Branching processes with local dependencies. Ann. Appl. Prob. 6, 238–268.
Pellis, L., Ball, F. and Trapman, P. (2012). Reproduction numbers for epidemic models with households and other social structures. I. Definition and calculation of $R_0$. Math. Biosci. 235, 85–97.
Pellis, L., Ferguson, N. and Fraser, C. (2011). Epidemic growth rate and household reproduction number in communities of households, schools and workplaces. J. Math. Biol. 63, 691–734.
Sagitov, S. (2017). Regenerative multi-type Galton–Watson processes. Preprint, arXiv:1708.02636.
Watson, R. K. (1972). On an epidemic in a stratified population. J. Appl. Prob. 9, 659–666.
Watts, D. and Strogatz, S. (1998). Collective dynamics of small world networks. Nature 393, 440–442.

Figure 1. Construction of $G_n$ for $n=17$ with $\vert V_n'\vert=\vert\{v_1',\ldots,v_6'\}\vert=6$ cliques. Top: the auxiliary graph $G_n^{\text{aux}}$. Bottom: the resulting directed graph $G_n$.