
Moran models and Wright–Fisher diffusions with selection and mutation in a one-sided random environment

Published online by Cambridge University Press:  09 March 2023

Fernando Cordero*
Affiliation:
Bielefeld University
Grégoire Véchambre*
Affiliation:
Academy of Mathematics and Systems Science, Chinese Academy of Sciences
*
*Postal address: Faculty of Technology, Bielefeld University, Box 100131, 33501 Bielefeld, Germany. Email address: [email protected]
**Postal address: Academy of Mathematics and Systems Science, Chinese Academy of Sciences, No. 55, Zhongguancun East Road, Haidian District, Beijing, China. Email address: [email protected]

Abstract

Consider a two-type Moran population of size N with selection and mutation, where the selective advantage of the fit individuals is amplified at extreme environmental conditions. Assume selection and mutation are weak with respect to N, and extreme environmental conditions rarely occur. We show that, as $N\to\infty$, the type frequency process with time sped up by N converges to the solution to a Wright–Fisher-type SDE with a jump term modeling the effect of the environment. We use an extension of the ancestral selection graph (ASG) to describe the genealogical picture of the model. Next, we show that the type frequency process and the line-counting process of a pruned version of the ASG satisfy a moment duality. This relation yields a characterization of the asymptotic type distribution. We characterize the ancestral type distribution using an alternative pruning of the ASG. Most of our results are stated in annealed and quenched form.

Type
Original Article
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

The Wright–Fisher diffusion with mutation and selection describes the evolution of the type composition of an infinite two-type haploid population subject to these two forces. Fit individuals reproduce at rate $1+\sigma$, $\sigma\geq 0$, whereas unfit ones reproduce at rate 1. In addition, individuals mutate at rate $\theta$, to the fit type with probability $\nu_0\in[0,1]$ and to the unfit type with probability $\nu_1\,{:\!=}\,1-\nu_0$. The proportion of fit individuals evolves forward in time according to the stochastic differential equation (SDE)

(1.1) \begin{align} {\textrm{d}} X(t) =(\theta\nu_0(1-X(t))-\theta\nu_1 X(t) &+ \sigma X(t)(1-X(t)))\,{\textrm{d}} t \nonumber \\[3pt] &\qquad +\sqrt{2 X(t)(1-X(t))}\,{\textrm{d}} B(t),\quad t\geq 0,\end{align}

where $(B(t))_{t\geq 0}$ is a standard Brownian motion. The solution to (1.1) arises as the diffusion approximation of (properly normalized) continuous-time Moran models and discrete-time Wright–Fisher models. In the neutral case, it also appears as the limit of a large class of Cannings models (see [33]). The genealogical counterpart to X is the ancestral selection graph (ASG), which is a branching–coalescing process coding the potential ancestors of an untyped sample of the population at present. It was introduced by Krone and Neuhauser in [29, 35] and later extended to models evolving under general neutral reproduction mechanisms (see [2, 15]), and to general forms of frequency-dependent selection (see [1, 12, 19, 34]).
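For readers who wish to experiment with (1.1) numerically, the following is a minimal Euler–Maruyama sketch (not a scheme analyzed in this paper); the function name, the step size dt, and the clipping to [0,1] are ad hoc choices.

```python
import numpy as np

def simulate_wf(x0, sigma, theta, nu0, T=10.0, dt=1e-3, rng=None):
    """Euler-Maruyama sketch of the Wright-Fisher SDE (1.1).

    The exact solution stays in [0,1]; the clipping below only absorbs
    discretization error near the boundary.
    """
    rng = rng or np.random.default_rng()
    nu1 = 1.0 - nu0
    x = x0
    for _ in range(int(T / dt)):
        drift = theta * nu0 * (1.0 - x) - theta * nu1 * x + sigma * x * (1.0 - x)
        x += drift * dt + np.sqrt(max(2.0 * x * (1.0 - x), 0.0) * dt) * rng.standard_normal()
        x = min(max(x, 0.0), 1.0)
    return x
```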

For $\theta=0$, the process $(R(t))_{t\geq 0}$ that counts the lines in the ASG is moment dual with $1-X$, i.e. for every $n\in{\mathbb N}$, $x\in[0,1]$, and $t\geq 0$ we have

(1.2) \begin{equation}{\mathbb E}\left[(1-X(t))^n\mid X(0)=x\right]={\mathbb E}\left[(1-x)^{R(t)}\mid R(0)=n\right].\end{equation}

This relation yields an expression for the absorption probability of X at 0 in terms of the stationary distribution of R. For $\theta>0$, there are two variants of the ASG that dynamically resolve mutation events and encode relevant information about the model: the killed ASG and the pruned lookdown ASG. The killed ASG was introduced in [3] to determine whether a sample consists only of unfit individuals. Its line-counting process extends Equation (1.2) to the case $\theta>0$ [3, Proposition 1] (see [13, Proposition 2.2] for a generalization). This allows one to characterize the stationary distribution of X. The pruned lookdown ASG in turn was introduced in [31] (see [2, 11] for extensions) as a tool for determining the ancestral type distribution, i.e. the type distribution of the individuals that have been successful in the long run.
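The duality (1.2) lends itself to a Monte Carlo sanity check. For $\theta=0$ the line-counting process R jumps from n to $n+1$ at rate $n\sigma$ (branching) and from n to $n-1$ at rate $n(n-1)$ (coalescence at rate 2 per pair; cf. Definition 2.1 below with $\mu=0$). The sketch below, which reuses simulate_wf from above, estimates both sides of (1.2); the sample sizes are ad hoc.

```python
import numpy as np

def simulate_R(n, sigma, T, rng):
    """Gillespie sketch of the ASG line-counting process R for theta = 0:
    n -> n+1 at rate n*sigma, n -> n-1 at rate n*(n-1)."""
    t, r = 0.0, n
    while True:
        up, down = r * sigma, r * (r - 1)
        if up + down == 0.0:
            return r                 # no further transitions possible
        t += rng.exponential(1.0 / (up + down))
        if t > T:
            return r
        r += 1 if rng.random() * (up + down) < up else -1

rng = np.random.default_rng(0)
x, n, sigma, T, M = 0.3, 2, 1.0, 1.0, 10_000
lhs = np.mean([(1 - simulate_wf(x, sigma, 0.0, 0.0, T, rng=rng)) ** n for _ in range(M)])
rhs = np.mean([(1 - x) ** simulate_R(n, sigma, T, rng) for _ in range(M)])
# lhs and rhs agree up to Monte Carlo and discretization error
```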

In many biological situations the strength of selection fluctuates in time. The influence of random fluctuations in selection intensities on the growth of populations has been the object of extensive research in the past (see e.g. [8, 9, 18, 25–27, 36]), and it is currently experiencing renewed interest (see e.g. [4, 7, 10, 20–22]). In this paper, we consider the scenario where the selective advantage of fit individuals is accentuated by exceptional environmental conditions (e.g. extreme temperatures, precipitation, humidity variation, abundance of resources, etc.). As an example, consider a population consisting of fit and unfit individuals which is subject to catastrophes. Assume that only fit individuals are resistant to the catastrophes. Hence, shortly after a catastrophe the population may drop below its carrying capacity and subsequently grow quickly. Since fit individuals have a reproductive advantage, it is likely that their relative frequency will grow fast after a catastrophe. One may also think of a population consisting of individuals that are specialized to high temperatures, as well as wild-type individuals accustomed but not specialized to them. The environment is characterized by moderately high temperatures, present most of the time, and short periods of extreme heat. It is then likely that specialized individuals have a (slight) reproductive advantage under moderate temperatures, and a more prominent advantage at extreme temperatures.

To model the previously described scenario we use a two-type Moran population with mutation and selection, immersed in a varying environment. The environment is modeled via a countable collection of points $\zeta\,{:\!=}\,(t_i,p_i)_{i\in I}$ in $(\!-\!\infty,\infty)\times(0,1)$ , satisfying that $\sum_{t_i\in[s,t]} p_i<\infty$ for all $s\lt t$ . Each $t_i$ represents the time of an instantaneous environmental change; the peak $p_i$ models the strength of this event: at time $t_i$ each fit individual independently reproduces with probability $p_i$ . Each offspring replaces a different individual in the population, so that the population size remains constant. The summability of the peaks ensures that the number of reproductions in any compact time interval is almost surely finite. In this context, we show that the type frequency process is continuous with respect to the environment. The proof uses coupling techniques that uncover the effect of small environmental changes.

Next, we consider a random environment given by a Poisson point process on $(\!-\!\infty,\infty) \times (0,1)$ with intensity measure ${\textrm{d}} t \times \mu$, where ${\textrm{d}} t$ stands for the Lebesgue measure and $\mu$ is a measure on (0, 1) satisfying $\int_{(0,1)} x\, \mu({\textrm{d}} x) \lt \infty$. Then we let the population size grow to infinity, and we show that, in an appropriate parameter regime, the fit-type frequency process converges to the solution to the SDE

(1.3) \begin{align} {\textrm{d}} X(t)=\left(\theta\nu_0(1-X(t))-\theta\nu_1 X(t)\right){\textrm{d}} t &+X(t-\!)(1-X(t-\!)){\textrm{d}} S(t) \nonumber\\ &\quad +\sqrt{2 X(t)(1-X(t))}{\textrm{d}} B(t),\quad t\geq 0,\end{align}

where $S(t)\,{:\!=}\, \sigma t+J(t)$, and J is a pure-jump subordinator with Lévy measure $\mu$, independent of B, which represents the cumulative effect of the environment. We refer to X as the Wright–Fisher diffusion in random environment. We prove the convergence in an annealed setting, i.e. when the environment is random. For environments given by compound Poisson processes, we show that the convergence also holds in a quenched sense, i.e. when a realization of the environment is fixed. For $\theta=0$, Equation (1.3) is a particular case of [4, Equation 3.3], which arises as the large-population limit of a family of discrete-time Wright–Fisher models [4, Theorem 3.2].
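When $\mu$ is finite, J is a compound Poisson process and the jump term of (1.3) can be superposed on the Euler scheme above. A sketch, with the purely illustrative choice $\mu=\lambda\cdot\textrm{Unif}(0,1)$:

```python
import numpy as np

def simulate_wf_env(x0, sigma, theta, nu0, lam, T=10.0, dt=1e-3, rng=None):
    """Sketch of the SDE (1.3) for a compound Poisson environment: peaks
    arrive at rate lam with sizes p ~ Unif(0,1) (an illustrative mu)."""
    rng = rng or np.random.default_rng()
    nu1 = 1.0 - nu0
    x = x0
    for _ in range(int(T / dt)):
        if rng.random() < lam * dt:             # a jump of J falls in this step
            x += x * (1.0 - x) * rng.uniform()  # X(t) = X(t-) + X(t-)(1 - X(t-)) * peak
        drift = theta * nu0 * (1.0 - x) - theta * nu1 * x + sigma * x * (1.0 - x)
        x += drift * dt + np.sqrt(max(2.0 * x * (1.0 - x), 0.0) * dt) * rng.standard_normal()
        x = min(max(x, 0.0), 1.0)
    return x
```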

Next, we generalize the construction of the ASG, the killed ASG, and the pruned lookdown ASG to incorporate the effect of the environment. In the annealed case, we establish a relation between X, the line-counting process of the killed ASG, and the total increment of the environment; we refer to this relation as a reinforced moment duality. The latter is a central tool for characterizing the asymptotic type frequencies. We also express the ancestral type distribution in terms of the line-counting process of the pruned lookdown ASG. Analogous results are obtained in the more involved quenched setting.

As an application of our results, we compare the long-term behavior of two Wright–Fisher diffusions without mutations, the first one having parameter $\sigma=0$ and an environment with Lévy measure $\mu$ , and the second one having parameter $\sigma=\int_{(0,1)} y\mu({\textrm{d}} y)$ and no environment. We prove that the probability of fixation of the fit type is smaller under the first model than under the second one, provided that the initial frequency of fit individuals is sufficiently large; see Proposition 2.2.

The analysis of a more realistic scenario where environmental changes are not always favorable to the same type cannot be done via the methods presented in this paper. The main reason is that in such a setting the frequency process does not admit a moment dual. To circumvent this problem one has to take into account the whole combinatorics of the ASG, which is a cumbersome object. This is the object of a forthcoming study.

We would also like to mention the parallel development by González Casanova et al. in [20]. They study the accessibility of the boundaries and the fixation probabilities of a generalization of the SDE (1.3) with $\theta=0$. The paper [20] makes use only of the ASG and does not cover the case $\theta>0$, where the killed and the pruned lookdown ASG play a pivotal role. Moreover, the reinforced moment duality and all the results obtained in the quenched setting are to the best of our knowledge new.

This article is organized as follows. An outline of the paper containing our main results is given in Section 2. In Section 3 we prove the continuity of the type frequency process in the Moran model with respect to the environment, that (1.3) is well-posed, and that it arises in the large-population limit of the type frequency process of a sequence of Moran models. In Section 4 we give more detailed definitions of the ASG, the killed ASG, and the pruned lookdown ASG. Section 5 is devoted to the proofs of (i) the annealed moment duality between the process X and the line-counting process of the killed ASG, (ii) the long-term behavior of the annealed type frequency process, and (iii) the annealed ancestral type distribution. The quenched versions of these results are proved in Section 6. Section 7 provides additional (quenched) results for environments having finitely many jumps in any compact time interval.

2. Description of the model and main results

Notation. The positive integers are denoted by ${\mathbb N}$ , and we set ${\mathbb N}_0\,{:\!=}\, {\mathbb N}\cup\{0\}$ . For $m\in {\mathbb N}$ ,

\begin{align*}[m]\,{:\!=}\, \{1,\ldots,m\},\quad [m]_0\,{:\!=}\, [m]\cup\{0\},\quad \text{and} \quad ]m]\,{:\!=}\, [m]\setminus\{1\}.\end{align*}

For $x,y\in{\mathbb R}$, we define $x\wedge y\,{:\!=}\,\min\{x,y\}$, $x\vee y\,{:\!=}\,\max\{x,y\}$, and $(x)_+\,{:\!=}\, x\vee 0$. For $s<t$, we denote by ${\mathbb D}_{s,t}$ (resp. ${\mathbb D}$) the space of càdlàg functions from [s, t] (resp. ${\mathbb R}$) to ${\mathbb R}$, endowed with the Billingsley metric, which induces the $J_1$-Skorokhod topology and makes the space complete (see Appendix A.1). For any Borel set $S\subset{\mathbb R}$, denote by $\mathcal{M}_f(S)$ (resp. $\mathcal{M}_1(S)$) the set of finite (resp. probability) measures on S. We use $\xrightarrow[]{(d)}$ to denote convergence in distribution, both for random variables and for càdlàg processes.

For $n\in {\mathbb N}_0$ and $k,m\in[n]_0$ , we write $K\sim \textrm{Hyp}({n},{m},{k})$ if K is a hypergeometric random variable with parameters n, m, and k, i.e.

\begin{align*}{\mathbb P}(K=i)= {\binom{n-m}{k-i} \binom{m}{i}}/{\binom{n}{k}},\quad i\in[k\wedge m]_0.\end{align*}

For $x\in[0,1]$ and $n\in {\mathbb N}$ , we write $B\sim \textrm{Bin}({n},{x})$ if B is a binomial random variable with parameters n and x, i.e.

\begin{align*}{\mathbb P}(B=i)= \binom{n}{i} x^i (1-x)^{n-i},\quad i\in[n]_0.\end{align*}

Relevant notation introduced in the next sections is collected in Appendix B.

2.1. Moran models in deterministic pure-jump environments

Consider a population of size N with two types, 0 and 1, subject to mutation and selection influenced by a deterministic environment. The latter is modeled by an at most countable collection $\zeta\,{:\!=}\, (t_k, p_k)_{k \in I}$ of points in $(\!-\!\infty,\infty) \times (0,1)$ satisfying for any $s,t\in{\mathbb R}$ with $s\leq t$ that

(2.1) \begin{equation} \sum_{t_k \in[s, t]} p_k \lt \infty.\end{equation}

We refer to $p_k$ as the peak of the environment at time $t_k$ . The individuals in the population undergo the following dynamic. Each individual independently mutates at rate $\theta_N\geq 0$ with probability $\nu_{0}\in[0,1]$ (resp. $\nu_1\,{:\!=}\, 1-\nu_0$ ) to type 0 (resp. 1). Reproduction occurs independently from mutation. Individuals of type 1 reproduce at rate 1, whereas individuals of type 0 reproduce at rate $1+\sigma_N$ , $\sigma_N\geq0$ . (The subscript N in the parameters $\sigma_N$ and $\theta_N$ emphasizes their dependence on N. In Theorem 2.2 we will require that they are asymptotically proportional to $1/N$ .) Thus, we refer to type 0 (resp. type 1) as the fit (resp. unfit) type. In addition, at time $t_k$ each type-0 individual independently reproduces with probability $p_k$ . At reproduction times, (a) each individual produces at most one offspring, which inherits the parent’s type, and (b) if n individuals are born, n individuals are randomly sampled without replacement from the population present before the reproduction event (including the parents) to die, keeping the size of the population constant.

Graphical representation. In the absence of environmental factors (i.e. $\zeta=\emptyset$ ), it is classical to describe the evolution of the population by means of the graphical representation as an interacting particle system (IPS). This decouples the randomness of the model due to the initial type configuration from the randomness due to mutations and reproductions. We now extend the graphical representation to incorporate the effect of the environment (see Section 3.1 for a more detailed description).

In the graphical representation, individuals are represented by horizontal lines at levels $i\in [N]$ (see Figure 1). Time runs forward from left to right. Potential reproduction events are depicted by arrows, with the (potential) parent at the tail and the offspring at the tip. We distinguish between neutral and selective arrows. Neutral arrows have a filled arrowhead; they occur at rate $1/N$ per pair of lines. Selective arrows have an open arrowhead; they occur in two independent ways: first, they occur at rate $\sigma_N/N$ per pair of lines, and second, at any time $t_k$ , $k\in I$ , a random number $n_k\sim\textrm{Bin}({N},{p_k})$ of lines shoot selective arrows to $n_k$ individuals in the population. Furthermore, beneficial (deleterious) mutations, depicted as circles (crosses), occur at rate $\theta_N \nu_0$ (at rate $\theta_N \nu_1$ ) per line.

Figure 1. Left: a realization of the Moran IPS; time runs forward from left to right; the environment has peaks at times $t_0$ and $t_1$ . Right: the ASG that arises from the second and third lines (from bottom to top) in the left picture, with the potential ancestors drawn in black; time runs backward from right to left; backward time $\beta\in[0,T]$ corresponds to forward time $t=s+T-\beta$ .

Note that for any $s<t$ , the number of non-environmental graphical elements present in [s, t] is almost surely finite. Moreover, thanks to Assumption (2.1), we have

(2.2) \begin{align}{\mathbb E}\left[\sum_{t_k\in[s,t]}n_k\right]=N\sum_{t_k\in[s,t]}p_k<\infty.\end{align}

Hence, $\sum_{t_k\in[s,t]}n_k<\infty$ almost surely, i.e. the number of arrows in [s, t] due to peaks of the environment is almost surely finite.

Once the graphical elements in $[s,t]\times [N]$ are drawn, we specify the initial conditions by assigning types to the N lines at time s and propagate them forward in time according to the following rules: the type of a line right after a circle (resp. cross) is 0 (resp. 1). In particular, if the type before the circle (resp. cross) is 0 (resp. 1), the mutation is silent. Type 0 propagates through neutral arrows and selective arrows; type 1 propagates only through neutral arrows.

Reading off ancestries in the Moran model. The ancestral selection graph (ASG) was introduced by Krone and Neuhauser in [Reference Krone and Neuhauser29] (see also [Reference Neuhauser and Krone35]) to study the genealogical relations in the diffusion limit of the Moran model with mutation and selection. In what follows, we briefly explain how to adapt this construction to the Moran model in deterministic environment.

Consider a realization of the IPS associated to the Moran model in the environment $\zeta\,{:\!=}\, (t_k, p_k)_{k \in I}$ in the time interval $[s,s+T]$ . Fix an untyped sample of n individuals at time $s+T$ and trace backward in time (from right to left in Figure 1) the lines of their potential ancestors (i.e. the lines that are ancestral to the sample for some type-configuration at time s), ignoring the effect of mutations; the backward time $\beta\in[0,T]$ corresponds to the forward time $t=s+T-\beta$ . We do this as follows. When a neutral arrow joins two individuals in the current set of potential ancestors, the two lines coalesce into a single one at the tail of the arrow. When a neutral arrow hits a potential ancestor from outside the current set of potential ancestors, the hit line is replaced by the line at the tail of the arrow. When a selective arrow hits the current set of potential ancestors, the individual that is hit has two possible parents, the incoming branch at the tail and the continuing branch at the tip. The true parent depends on the type of the incoming branch, but for the moment we work without types. These unresolved reproduction events can be of two types: a branching event if the selective arrow emanates from an individual outside the current set of potential ancestors, and a collision event if the selective arrow links two current potential ancestors. Note that at the peak times, multiple lines in the ASG can be hit by selective arrows, and therefore, multiple branching and collision events can occur simultaneously. Mutations are superposed on the lines of the ASG.

The object arising under this procedure up to time $\beta=T$ is called the Moran-ASG in $[s,s+T]$ under the environment $\zeta$ . It contains all the lines that are potentially ancestral (ignoring mutation events) to the lines sampled at time $t=s+T$ ; see Figure 1. Note that, since the number of events occurring in $[s,s+T]$ is almost surely finite (see (2.2)), the Moran-ASG in $[s,s+T]$ is well-defined.

Given an assignment of types to the lines present in the ASG at time $t=s$ , we can extract the true genealogy and determine the types of the sampled individuals at time $t=s+T$ . To this end, we propagate types forward in time along the lines of the ASG, taking into account mutations and reproductions, with the rule that if a line is hit by a selective arrow, the incoming line is the ancestor if and only if it is of type 0; see Figure 2. This rule is called the pecking order. Proceeding in this way, the types in $[s,s+T]$ are determined along with the true genealogy.

Figure 2. The descendant line (D) splits into the continuing line (C) and the incoming line (I). The incoming line is ancestral if and only if it is of type 0. The true ancestral line is drawn in bold.

Evolution of the type composition. Consider the set ${\mathbb D}^\star$ of non-decreasing functions $\omega\in{\mathbb D}$ satisfying the following:

  (i) for all $s<t\in{\mathbb R}$, $\Delta \omega(t)\,{:\!=}\, \omega(t)-\omega(t-\!)\in[0,1)$ and $\sum_{u\in[s,t]}\Delta \omega(u)<\infty$;

  (ii) $\omega$ is pure-jump, i.e. for all $s<t$, $\omega(t)=\omega(s)+\sum_{u\in(s,t]}\Delta \omega(u)$.

Note that the set of environments $\zeta\,{:\!=}\, (t_k, p_k)_{k \in I}$ satisfying (2.1) can be identified with the set of functions $\omega\in {\mathbb D}^\star$ with $\omega(0)=0$ . Indeed, for any $\omega\in{\mathbb D}^\star$ , the collection of points $\{(t,\Delta \omega(t))\,{:}\, \Delta\omega(t)>0\}$ is countable and satisfies (2.1) (the same collection is obtained if we add a constant to $\omega$ ; this is why we set $\omega(0)=0$ ). Conversely, for any $\zeta\,{:\!=}\, (t_k, p_k)_{k \in I}$ , the function $\omega\,{:}\,{\mathbb R}\to{\mathbb R}$ defined via

\begin{align*}\omega(t)\,{:\!=}\,\sum_{t_k\in(0,t]}p_k\quad\textrm{for $t\geq 0$, and}\quad {\omega(t)\,{:\!=}\,-\sum_{t_k\in(t,0]}p_k}\quad\textrm{for $t<0$},\end{align*}

belongs to ${\mathbb D}^\star$ . For this reason, we often abuse notation and refer to the elements of ${\mathbb D}^\star$ as environments. In addition, an environment $\omega\in{\mathbb D}^\star$ is said to be simple if $\omega$ has only a finite number of jumps in any compact time interval. We denote by $\textbf{0}$ the environment corresponding to $\zeta=\emptyset$ and refer to it as the null environment.

We denote by $Z_N^\omega(t)$ the number of fit individuals at time t in a Moran population of size N subject to the environment $\omega\in{\mathbb D}^\star$ . We refer to $Z_N^\omega\,{:\!=}\,(Z_N^\omega(t))_{t\geq 0}$ as the quenched fit-counting process. In particular, $Z_N^\textbf{0}$ is just the continuous-time Markov chain on $[N]_0$ with infinitesimal generator

\begin{align*}\mathcal{A}_N^0 f(n)\,{:\!=}\, &\left[(1+\sigma_N)\frac{n(N-n)}{N}+\theta_N\nu_0(N-n)\right](f(n+1)-f(n))\\[3pt]&+ \left[\frac{n(N-n)}{N}+\theta_N\nu_1 n\right](f(n-1)-f(n)).\end{align*}
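For small N the generator can be written down explicitly as an $(N+1)\times(N+1)$ rate matrix. The sketch below transcribes $\mathcal{A}_N^0$ directly (the function name is ours):

```python
import numpy as np

def moran_generator(N, sigma_N, theta_N, nu0):
    """Rate matrix of Z_N^0 on {0, ..., N}, transcribed from A_N^0 above."""
    nu1 = 1.0 - nu0
    Q = np.zeros((N + 1, N + 1))
    for n in range(N + 1):
        if n < N:   # gain of a fit individual: reproduction or beneficial mutation
            Q[n, n + 1] = (1 + sigma_N) * n * (N - n) / N + theta_N * nu0 * (N - n)
        if n > 0:   # loss of a fit individual: neutral reproduction or deleterious mutation
            Q[n, n - 1] = n * (N - n) / N + theta_N * nu1 * n
        Q[n, n] = -Q[n].sum()
    return Q
```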

Note that if $\omega$ has a jump at time t, the number n(t) of individuals placing offspring is a binomial random variable with parameters $Z_N^\omega(t-\!)$ and $\Delta\omega(t)$ . Since the n(t) individuals that will be replaced are chosen uniformly at random, the additional number of fit individuals after the reproduction event is a hypergeometric random variable with parameters N, $N-Z_N^\omega(t-\!)$ , and n(t). Therefore, the dynamic of $Z_N^\omega$ is as follows. Recall that in any finite time interval, the number of environmental reproductions is almost surely finite. Thus, we can define $(S_i)_{i\in{\mathbb N}}$ as the increasing sequence of times at which environmental reproductions take place. We set $S_0\,{:\!=}\, 0$ . By construction, $(S_i)_{i\in{\mathbb N}}$ is Markovian and its transition probabilities are given by

\begin{align*}\mathbb{P}(S_{i+1} \gt t \mid S_i = s ) = \prod_{u \in (s, t]} (1- \Delta \omega(u))^N,\quad i\in{\mathbb N}_0, 0\leq s\leq t.\end{align*}

If $Z_N^\omega(0)=n\in [N]_0$ , then $Z_N^\omega$ evolves in $[0,S_1)$ as $Z_N^\textbf{0}$ started at n. For $i\in{\mathbb N}$ , if $Z_N^\omega(S_{i}-\!)=k$ , then $Z_N^\omega(S_{i})=k+H(N,N-k,\tilde B_i(k))$ , where the random variables $H(N,N-k,b)\sim \textrm{Hyp}(N,N-k,b)$ , $b\in[k]_0$ , and $\tilde B_i(k)$ are independent, and $\tilde B_i(k)$ is a binomial random variable with parameters k and $\Delta \omega(S_i)$ conditioned to be positive. Then, $Z_N^\omega$ evolves in $[S_i,S_{i+1})$ as $Z_N^\textbf{0}$ started at $Z_N^\omega(S_{i})$ .
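In a simulation of $Z_N^\omega$, the update at a peak is thus a binomial draw followed by a hypergeometric one. A sketch of this single update (we use the unconditioned binomial, a draw with $b=0$ being simply a silent jump):

```python
def env_jump(k, N, p, rng):
    """State of Z_N^omega right after a peak of size p, given Z_N^omega(t-) = k.

    b ~ Bin(k, p) fit individuals reproduce; their b offspring replace b
    uniformly chosen individuals, of which Hyp(N, N-k, b) were unfit, so the
    fit count increases by that amount.
    """
    b = rng.binomial(k, p)
    # numpy's parametrization: ngood = N - k (unfit), nbad = k (fit), nsample = b
    return k + (rng.hypergeometric(N - k, k, b) if b > 0 else 0)
```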

Let us fix $T>0$ . We end this section with our first main result, which provides the continuity in [0, T] of the fit-counting process with respect to the environment. Note that the restriction of the environment to [0, T] can be identified with an element of

(2.3) \begin{align} {\mathbb D}_T^\star\,{:\!=}\, \left\{\omega\in{\mathbb D}_{0,T}\,{:}\, \omega(0)=0,\ \Delta \omega(t)\in[0,1)\ \text{for all } t\in[0,T],\ \omega\ \text{is non-decreasing and pure-jump}\right\}.\end{align}

Moreover, we equip ${\mathbb D}_T^\star$ with the metric $d_T^\star$ defined in Appendix A.1, Equation (A.3).

Theorem 2.1. (Continuity.) Let $\omega\in{\mathbb D}_T^\star$ and $\{\omega_k\}_{k\in{\mathbb N}}\subset{\mathbb D}_T^\star$ be such that $d_T^\star(\omega_k,\omega)\to 0$ as $k\to\infty$. If $Z_N^{\omega_k}(0)=Z_N^\omega(0)$ for all $k\in{\mathbb N}$, then

\begin{align*}Z_N^{\omega_k}\xrightarrow[k\to\infty]{(d)}Z_N^{\omega}\quad\text{in }{\mathbb D}_{0,T}.\end{align*}

Theorem 2.1 is proved in Section 3.1.

2.2. Moran models in subordinator-driven environments

In contrast to Section 2.1, we consider here a random environment given by a Poisson point process $(t_i, p_i)_{i \in I}$ on $(\!-\!\infty,\infty) \times (0,1)$ with intensity measure ${\textrm{d}} t \times \mu$ , where ${\textrm{d}} t$ stands for the Lebesgue measure and $\mu$ is a measure on (0, 1) satisfying

(2.4) \begin{equation}\int_{(0,1)} x \mu({\textrm{d}} x) \lt \infty.\end{equation}

The latter implies that $(t_i, p_i)_{i \in I}$ almost surely satisfies Assumption (2.1). In particular, setting $J(t) \,{:\!=}\, \sum_{t_i\in(0,t]} p_i$ for $t\geq 0$ and $J(t)\,{:\!=}\, -\sum_{t_i\in(t,0]}p_i$ for $t<0$ , we have $J\in{\mathbb D}^\star$ almost surely. Moreover, by the Lévy–Itô decomposition, $(J(t))_{t\in{\mathbb R}}$ is a pure-jump subordinator with Lévy measure $\mu$ . If the measure $\mu$ is finite, then J is a compound Poisson process, and thus the environment J is almost surely simple.
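For finite $\mu$ the environment can be sampled directly as a Poisson point process; a sketch, again with the illustrative choice $\mu=\lambda\cdot\textrm{Unif}(0,1)$:

```python
import numpy as np

def sample_environment(lam, t0, t1, rng):
    """Points (t_i, p_i) of a Poisson point process with intensity dt x mu on
    [t0, t1] x (0,1), for mu = lam * Unif(0,1); J is then compound Poisson."""
    n = rng.poisson(lam * (t1 - t0))
    times = np.sort(rng.uniform(t0, t1, size=n))
    peaks = rng.uniform(0.0, 1.0, size=n)
    return list(zip(times, peaks))
```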

We will see in Section 3.1 that, using the graphical representation, one can simultaneously construct Moran models for any $\omega\in{\mathbb D}^\star$ . Now, consider an independent pure-jump subordinator $(J(t))_{t\geq 0}$ , with Lévy measure $\mu$ on (0, 1) satisfying (2.4). Thanks to Theorem 2.1 the process $Z_N^J\,{:\!=}\, (Z_N^J(t))_{t\geq 0}$ is well-defined. We refer to $Z_N^J$ as the annealed fit-counting process. By definition, we have

\begin{align*}{\mathbb P}(Z_N^J \in \cdot)=\int {\mathbb P}(Z_N^\omega \in \cdot) {\mathbb P}(J \in {\textrm{d}} \omega).\end{align*}

In other words, ${\mathbb P}(Z_N^\omega \in \cdot)$ is the law of $Z_N^J$ conditionally on a realization $\omega$ of the environment (i.e. ${\mathbb P}(Z_N^\omega \in \cdot) = {\mathbb P}(Z_N^J \in \cdot \mid J=\omega)$ ) and is classically referred to as the quenched measure, while ${\mathbb P}(Z_N^J \in \cdot)$ integrates the effect of the random environment and is classically referred to as the annealed measure.

The process $Z_N^J$ is a continuous-time Markov chain on $[N]_0$ with generator

\begin{align*}\mathcal{A}_N f(n)\,{:\!=}\, \mathcal{A}_N^0 f(n) + \int_{(0,1)}\left({\mathbb E}\left[f\!\left(n+\mathcal{H}(N,N-n,B_n(u))\right)\right]-f(n)\right){\mu({\textrm{d}} u)},\quad n\in [N]_0, \end{align*}

where $B_n(u)\sim \textrm{Bin}(n,u)$ , and for any $i\in[n]_0$ , $\mathcal{H}(N,N-n,i)\sim\textrm{Hyp}(N,N-n,i)$ are independent.

The dynamic of the graphical representation is as follows. For each $i,j\in [N]$ with $i\neq j$ , selective (resp. neutral) arrows from level i to level j appear at rate $\sigma_N/N$ (resp. $1/N$ ). For each $i\in [N]$ , open circles (resp. crosses) appear at level i at rate $\theta_N\nu_0$ (resp. $\theta_N\nu_1$ ). For each $k \in [N]$ , every group of k lines is subject to simultaneous potential reproductions at rate

\begin{align*}\sigma_{N,k}\,{:\!=}\, \int_{(0,1)}y^k (1-y)^{N-k}\mu({\textrm{d}} y),\end{align*}

resulting in the appearance of k selective arrows from the lines of this group (of potential parents) to the lines of a group of size k (the potential descendants) that is chosen uniformly at random among subsets of size k of the N individuals. The k selective arrows are drawn uniformly at random from the k potential parents to the k potential descendants. Recall that only type 0 propagates through selective arrows, while both types propagate through neutral arrows. The appearance of a selective arrow is therefore silent when the potential parent at its tail is of type 1.
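If $\mu$ has a density, the rates $\sigma_{N,k}$ can be evaluated by numerical quadrature; for $\mu=\lambda\cdot\textrm{Unif}(0,1)$ they reduce to the Beta function value $\lambda B(k+1,N-k+1)$, which gives a cross-check. A sketch (the density assumption and the function name are ours):

```python
from scipy.integrate import quad
from scipy.special import beta

def sigma_Nk(N, k, mu_density):
    """sigma_{N,k} = int_0^1 y^k (1-y)^(N-k) mu(dy), assuming mu(dy) = mu_density(y) dy."""
    val, _ = quad(lambda y: y**k * (1.0 - y)**(N - k) * mu_density(y), 0.0, 1.0)
    return val

lam = 2.0
assert abs(sigma_Nk(10, 3, lambda y: lam) - lam * beta(4, 8)) < 1e-8
```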

2.3. The Wright–Fisher diffusion in random environment

In this section we are interested in the Wright–Fisher diffusion in random environment described in the introduction as the solution to the SDE (1.3). In Section 3 we will see that, indeed, for any $x_0\in[0,1]$ , this SDE has a pathwise unique strong solution starting at $x_0$ (see Proposition 3.3).

Consider a pure-jump subordinator $J=(J(s))_{s\geq 0}$ with Lévy measure satisfying (2.4) and an independent standard Brownian motion $B=(B(s))_{s\geq 0}$ . For any $T>0$ , the solution to (1.3) in [0, T] is a measurable function of $(B(s),J(s))_{s\in[0,T]}$ , which we denote by F(B, J). A regular version of the conditional law ${\mathbb P}(F(B,J) \in \cdot \mid J=\omega)$ of F(B, J) given J is classically referred to as the quenched probability measure. It is defined for almost every realization $\omega$ of J. ${\mathbb P}(F(B,J) \in \cdot)$ integrates the effect of the random environment and is classically referred to as the annealed measure. As before, the quenched and annealed measures are related via

\begin{align*}{\mathbb P}(F(B,J) \in \cdot)=\int {\mathbb P}(F(B,J) \in \cdot \mid J=\omega)\, {\mathbb P}(J \in {\textrm{d}} \omega).\end{align*}

We write X and $X^\omega$ for the solutions to (1.3) under the annealed and quenched measures, respectively. For $\omega$ simple, the process $X^\omega$ starting at $x_0$ can be alternatively defined as follows. Denote by $t_1<\cdots<t_k$ the consecutive jump times of $\omega$ in [0, T], and set $t_0\,{:\!=}\, 0$ and $X^\omega(0)\,{:\!=}\, x_0$ . In the intervals $[t_i,t_{i+1})$ , $X^\omega$ evolves as the solution to (1.1) starting at $X^\omega(t_i)$ . Moreover, if $X^\omega(t_i-\!)=x$ , then $X^\omega(t_i)\,{:\!=}\, x+x(1-x)\Delta\omega(t_i)$ ; see Figure 3 for an illustration.
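For a simple environment this piecewise description translates directly into a simulation: diffuse according to (1.1) between consecutive jump times and apply the deterministic jump map at each $t_i$. A minimal sketch (Euler discretization as before):

```python
import numpy as np

def simulate_X_quenched(x0, sigma, theta, nu0, env, T, dt=1e-3, rng=None):
    """X^omega on [0, T] for a simple environment env = [(t_1, p_1), ...] with
    0 < t_1 < ... <= T: run (1.1) between jumps, then x -> x + x(1 - x) p."""
    rng = rng or np.random.default_rng()
    nu1 = 1.0 - nu0
    x, t = x0, 0.0
    for tk, pk in list(env) + [(T, 0.0)]:    # sentinel: diffuse up to time T
        while t < tk:
            drift = theta * nu0 * (1.0 - x) - theta * nu1 * x + sigma * x * (1.0 - x)
            x += drift * dt + np.sqrt(max(2.0 * x * (1.0 - x), 0.0) * dt) * rng.standard_normal()
            x = min(max(x, 0.0), 1.0)
            t += dt
        x += x * (1.0 - x) * pk              # X(t_k) = X(t_k-) + X(t_k-)(1 - X(t_k-)) p_k
    return x
```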

Figure 3. An illustration of a path of the process $X^\omega$ in the interval [0, T]. The grey vertical lines represent the peaks of the environment $\omega$ ; $t_1$ , $t_2$ , $t_3$ , and $t_4$ are the jump times of $\omega$ .

The next result states the convergence of the type frequency process in the Moran model to the Wright–Fisher diffusion in random environment, as population size grows to $\infty$ and time is suitably accelerated.

Theorem 2.2. (Convergence.) Assume that $N \sigma_N\rightarrow \sigma$ and $N\theta_N\rightarrow \theta$ as $N\to\infty$ , for some $\sigma, \theta\geq 0$ (weak selection, weak mutation).

  1. Let J be a pure-jump subordinator with Lévy measure $\mu$ on (0,1), and set $J_N(t)\,{:\!=}\, J(t/N)$, $t\geq 0$. Define the process $(X_N(t))_{t\geq 0}$ via $X_N(t)\,{:\!=}\, Z_N^{J_N}(Nt)/N$, $t\geq 0$. If $X_N(0)\xrightarrow[N\to\infty]{(d)} x_0$, then

    \begin{align*}X_N\xrightarrow[N\to\infty]{(d)} X,\end{align*}

    where X is the unique pathwise solution to (1.3) with $X(0)=x_0$.

  2. Let $\omega\in{\mathbb D}^\star$ be a simple environment and set $\omega_N(t)\,{:\!=}\,\omega(t/N)$, $t\geq 0$. Define the process $(X_N^\omega(t))_{t\geq 0}$ via $X_N^\omega(t)\,{:\!=}\, Z_N^{\omega_N}(Nt)/N$, $t\geq 0$. If $X_N^\omega(0)\xrightarrow[N\to\infty]{(d)} x_0$, then

    \begin{align*}X_N^\omega\xrightarrow[N\to\infty]{(d)} X^\omega,\end{align*}

    with $X^\omega$ starting at $x_0$.

The proof of Theorem 2.2 is given in Section 3.2. The reason for using the environment $J_N$ or $\omega_N$ is to compensate for the fact that time is sped up by a factor of N. In this way, $X_N$ and X share the same environment.

Remark 2.1. The result analogous to Theorem 2.2(1) in the context of discrete-time Wright–Fisher models without mutations is covered by the fairly general result [4, Theorem 3.2] (see also [20, Theorem 2.12]).

Remark 2.2. If J is a compound Poisson process, then almost every environment is simple. In this case, according to Theorem 2.2(2), the quenched convergence holds for almost every environment (with respect to the law of J). We conjecture that this is true for general J. In Proposition 3.4 we show that the sequence $(X_N^\omega)_{N\geq 1}$ is tight for any environment $\omega$ . Hence, it would suffice to prove the continuity of $\omega\mapsto X^\omega$ to obtain the desired convergence. Unfortunately, since the diffusion term in (1.3) is not Lipschitz, the standard techniques used to prove this type of result fail. Developing new techniques to cover non-Lipschitz diffusion coefficients is beyond the scope of this paper.

2.4. The ancestral selection graph in random/deterministic environment

The aim of this section is to associate an ASG to the Wright–Fisher diffusion in random/deterministic environment. In contrast to the Moran model setting described in Section 2.1, it is not straightforward to set up a graphical representation for the forward process. To circumvent this problem, we proceed as follows. We first consider the graphical representation of a Moran model with parameters $\sigma/N$ , $\theta/N$ , $\nu_0$ , $\nu_1$ , and environment $\omega_N(\!\cdot\!)=\omega(\!\cdot\!/N)$ , and we speed up time by N. Next, we sample n individuals at time T and we construct the ASG as in Section 2.1.

Now, replace $\omega$ by a pure-jump subordinator J with Lévy measure $\mu$ supported in (0,1). Note that the Moran-ASG in [0, T] evolves according to the time reversal of J. The latter is the subordinator $\bar{J}^T\,{:\!=}\,(\bar{J}^T(\beta))_{\beta\in[0,T]}$ with $\bar{J}^T(\beta)\,{:\!=}\, J(T)-J((T-\beta)-\!)$ , which has the same law as J (its law does not depend on T). A simple asymptotic analysis of the rates and probabilities for the possible events leads to the following definition.

Definition 2.1. (The annealed/quenched ASG.) The annealed ancestral selection graph $\mathcal{G}\,{:\!=}\,(\mathcal{G}(\beta))_{\beta\geq 0}$ with parameters $\sigma,\theta,\nu_0,\nu_1$ , and environment driven by a pure-jump subordinator with Lévy measure $\mu$ , associated to a sample of size n of the population at time T is the branching–coalescing particle system starting with n lines and with the following dynamic:

  (i) Each line independently splits at rate $\sigma$ into an incoming line and a continuing line.

  (ii) Every given pair of lines independently coalesces into a single one at rate 2.

  (iii) If m is the current number of lines in the ASG, every group of k lines independently experiences a simultaneous branching at rate

    (2.5) \begin{equation} \sigma_{m,k}\,{:\!=}\, \int_{(0,1)}y^k (1-y)^{m-k}\mu({\textrm{d}} y), \end{equation}
    i.e. each line in the group splits into an incoming line and a continuing line.

  (iv) Each line is independently decorated by a beneficial mutation at rate $\theta \nu_0$.

  (v) Each line is independently decorated by a deleterious mutation at rate $\theta \nu_1$.

Let $\omega\,{:}\,{\mathbb R}\to{\mathbb R}$ be a fixed environment. The quenched ancestral selection graph with parameters $\sigma,\theta,\nu_0,\nu_1$ , and environment $\omega$ of a sample of size n at time T is a branching–coalescing particle system $\mathcal{G}_T^\omega\,{:\!=}\,(\mathcal{G}_T^\omega(\beta))_{\beta\geq 0}$ starting at $\beta=0-$ with n lines and evolving as the annealed ASG but with (iii) replaced by the following:

  (iii) If at time $\beta$ we have $\Delta \omega(T-\beta)>0$, then any line splits into two lines, an incoming line and a continuing line, with probability $\Delta \omega(T-\beta)$, independently of the other lines.

See Figure 4 for an illustration of the type frequency process $X^\omega$ and the killed ASG $\mathcal{G}_T^\omega$ . The branching–coalescing system $\mathcal{G}_T^\omega$ is clearly well-defined for $\omega$ simple. The justification of the previous definition for general environments is more involved and will be given in Section 4.1.

Figure 4. Illustration of a path of the process $X^\omega$ (grey path) and the killed ASG $\mathcal{G}_T^\omega$ (black lines) embedded in the same picture. Forward time t runs from left to right; backward time $\beta\,{:\!=}\, T-t$ runs from right to left. The environment $\omega$ jumps at forward times $t_0$ , $t_1$ , and $t_2$ .

Remark 2.3. In the Moran model, a neutral arrow appears from line A to line B at rate $1/N$ and from line B to line A at the same rate. Two lines are thus connected by a neutral arrow at rate $2/N$ , which explains the rate of coalescence events in (ii).

2.5. Type frequency via the killed ASG

The aim of this section is to relate the type-0 frequency process X to the ASG. To this end, assume that the proportion of fit individuals at time 0 is equal to $x\in[0,1]$ . Conditionally on X(T), the probability of independently sampling n unfit individuals at time T equals $(1-X(T))^n$ . Now, consider the annealed ASG associated to the n sampled individuals in [0, T], and randomly assign types independently to each line in the ASG at time $\beta=T$ according to the initial distribution $(x,1-x)$ . In the absence of mutations, the n sampled individuals are unfit if and only if all the lines in the ASG at time $\beta=T$ are assigned the unfit type (because at any selective event a fit individual can only be replaced by another fit individual). Therefore, if R(T) denotes the number of lines present in the ASG at time $\beta=T$ , then conditionally on R(T), the probability that the n sampled individuals are unfit is $(1-x)^{R(T)}$ . We would then expect to have

\begin{align*}{\mathbb E}[(1-X(T))^n\big| X(0)=x]={\mathbb E}[(1-x)^{R(T)}\big| R(0)=n].\end{align*}

Mutations determine the types of some of the lines in the ASG even before we assign types to the lines at time $\beta=T$ . Hence, we can prune away from the ASG all the sub-ASGs arising from a mutation event. If in the pruned ASG there is a line ending in a beneficial mutation, we can infer that at least one of the sampled individuals is fit. If all the lines end up in a deleterious mutation, we can infer directly that all the sampled individuals are unfit. In the remaining case, the sampled individuals are all unfit if and only if all the lines present at time $\beta=T$ in the pruned ASG are assigned the unfit type. We use this idea in Section 4.2 to construct, for a given sample of the population at time $t=T$ , branching–coalescing systems $\bar{\mathcal{G}}\,{:\!=}\, (\bar{\mathcal{G}}(\beta))_{\beta\geq 0}$ and $\bar{\mathcal{G}}_T^\omega\,{:\!=}\, (\bar{\mathcal{G}}_T^\omega(\beta))_{\beta\geq 0}$ in the annealed and quenched settings, respectively. Both processes have a cemetery state $\dagger$ . The main feature of $\bar{\mathcal{G}}$ (resp. $\bar{\mathcal{G}}^\omega_T$ ) is that for any $\beta\geq0$ , the individuals in the sample are all unfit if and only if $\bar{\mathcal{G}}\neq \dagger$ (resp. $\bar{\mathcal{G}}^\omega_T\neq \dagger$ ) and all the lines present at time $\beta$ in $\bar{\mathcal{G}}$ (resp. $\bar{\mathcal{G}}^\omega_T$ ) are unfit. We refer to $\bar{\mathcal{G}}$ and $\bar{\mathcal{G}}^\omega_T$ as the annealed and quenched killed ASGs (k-ASGs), respectively.

Moment dualities. We will now establish duality relations between the process X and the line-counting process of the k-ASG. For each $\beta\geq 0$ , we denote by $R(\beta)$ the number of lines present in the annealed k-ASG at time $\beta$ , with the convention that $R(\beta)=\dagger$ if ${\bar{\mathcal{G}}}(\beta)=\dagger$ . The process $R\,{:\!=}\, (R(\beta))_{\beta\geq 0}$ , called the annealed line-counting process of the k-ASG, is a continuous-time Markov chain with values on ${\mathbb N}_0^\dagger\,{:\!=}\, \mathbb{N}_0\cup\{\dagger\}$ and infinitesimal generator matrix $Q^\mu_\dagger\,{:\!=}\, \left(q^\mu_\dagger(i,j)\right)_{i,j\in{\mathbb N}_0^\dagger}$ defined via

(2.6) \begin{equation} q^\mu_\dagger(i,j)\,{:\!=}\, \begin{cases} i(i-1)+i \theta\nu_1 & \text{if $j=i-1$},\\ \binom{i}{k}(\sigma_{i,k}+\sigma 1_{\{k=1\}}) & \text{if $j=i+k,\, i\geq k\geq 1$},\\ i\theta\nu_0 & \text{if $j=\dagger$},\\ -i(i-1+\theta+\sigma)-\int_{(0,1)}(1-(1-y)^i)\mu({\textrm{d}} y) & \text{if $j=i\in{\mathbb N}_0$}, \end{cases}\end{equation}

where the coefficients $\sigma_{m,k}$ are defined in Equation (2.5). All other entries are zero.

Similarly, for $T \in \mathbb{R}$ and a fixed environment $\omega\in{\mathbb D}^\star$ , we denote by $R_T^\omega \,{:\!=}\, (R_T^\omega(\beta))_{ \beta \geq 0}$ the line-counting process associated to the quenched k-ASG ${\bar{\mathcal{G}}_T^\omega}$ . The process $R_T^\omega$ , called the quenched line-counting process of the k-ASG, is a continuous-time (inhomogeneous) Markov process with values in ${\mathbb N}_0^\dagger$ . It jumps from $i\in{\mathbb N}$ to $j\in{\mathbb N}_0^\dagger \setminus \{i\}$ at rate $q^0_\dagger(i,j)$ , where $q^0_\dagger$ is the matrix defined in (2.6) with $\mu=0$ . In addition, at each time $\beta \geq 0$ with $\Delta\omega(T-\beta)>0$ , conditionally on $\{R_T^\omega(\beta-\!)=i\}$ , $i\in{\mathbb N}$ , we have $R_T^\omega(\beta) \sim i + \textrm{Bin}({i},{\Delta \omega(T-\beta)})$ . If $\theta \gt0$ and $\nu_0 \in (0,1)$ , the states 0 and $\dagger$ are absorbing for R and $R_T^\omega$ .
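For finite $\mu$, the process R can be simulated by Gillespie steps, environmental peaks arriving at rate $\mu((0,1))$ and each line then branching independently with the sampled peak probability. A sketch with the illustrative $\mu=\lambda\cdot\textrm{Unif}(0,1)$; the string 'dagger' stands for the cemetery state:

```python
import numpy as np

def simulate_kasg(n, sigma, theta, nu0, lam, T, rng):
    """Gillespie sketch of the k-ASG line-counting process R of (2.6), for
    mu = lam * Unif(0,1). Peaks at which no line branches are silent."""
    nu1 = 1.0 - nu0
    t, r = 0.0, n
    while r != 0 and r != "dagger":
        rates = np.array([r * (r - 1) + r * theta * nu1,  # coalescence + deleterious mutation
                          r * sigma,                      # single branching
                          r * theta * nu0,                # beneficial mutation: killing
                          lam])                           # environmental peak
        t += rng.exponential(1.0 / rates.sum())
        if t > T:
            break
        event = rng.choice(4, p=rates / rates.sum())
        if event == 0:
            r -= 1
        elif event == 1:
            r += 1
        elif event == 2:
            r = "dagger"
        else:
            r += rng.binomial(r, rng.uniform())           # simultaneous branching
    return r
```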

Let J be a pure-jump subordinator with Lévy measure $\mu$ supported in (0, 1). We write here $X^J$ instead of X to stress the dependency of the (strong) solution to (1.3) on the environment J. Similarly, we write $X^\omega$ for its quenched version (as introduced in Section 2.3). Since, in the annealed case, backward and forward environments have the same law, we can construct the line-counting process of the k-ASG as the strong solution to an SDE involving J and four other independent Poisson processes encoding the non-environmental events. We denote it by $(R^J(\beta))_{\beta\geq 0}$ . The next result establishes a formal relation between $X^J$ and $R^J$ : a reinforced moment duality, which allows us to derive moment dualities in the annealed and quenched settings (see Figure 4 to visualize forward and backward processes in the same picture).

Theorem 2.3. (Reinforced, annealed, and quenched moment dualities.) For all $x\in[0,1]$ , $n\in\mathbb{N}$ , and $T\geq 0$ , and any function $f\in\mathcal{C}^2([0,\infty))$ with compact support,

(2.7) \begin{equation}\mathbb{E}[(1-X^J(T))^n f(J(T))\mid X^J(0)=x]=\mathbb{E}[(1-x)^{R^J(T)} f(J(T))\mid R^J(0)=n], \end{equation}

with the convention $(1-x)^\dagger=0$ for all $x\in[0,1]$ . In particular, if $f=1$ we recover the moment duality (we will often drop the superscript J when using this relation, unless we want to emphasize the dependency on J):

(2.8) \begin{equation}\mathbb{E}[(1-X^J(T))^n \mid X^J(0)=x]=\mathbb{E}[(1-x)^{R^J(T)} \mid R^J(0)=n]. \end{equation}

For almost every (with respect to the law of J) environment $\omega\in{\mathbb D}^\star$ ,

(2.9) \begin{equation}\mathbb{E} \left [ (1-X^\omega(T))^n\mid X^\omega(0)=x \right ]= \mathbb{E} \left [ (1-x)^{R_T^\omega(T-\!)}\mid R_T^\omega(0-\!)=n \right ]. \end{equation}

We prove (2.7) and (2.8) in Section 5.1. The proof of the quenched duality (2.9) is given in Section 6.1. Moreover, Theorem 7.1 extends (2.9) to any simple environment.

Remark 2.4. For $\theta=0$ , (2.8) is a particular case of [Reference González Casanova, Spanò and Wilke-Berenguer20, Lemma 2.14].

Asymptotic type composition. Assume now that $\theta \gt0$ and $\nu_0, \nu_1 \in (0,1)$ . In particular, the processes X and $X^\omega$ are not absorbed in $\{0,1\}$ . We will describe the asymptotic behavior of these processes using Theorem 2.3. The quenched case is particularly delicate, because for a given environment $\omega$ , $X^\omega(t)$ strongly depends on the environment in the recent past, and only weakly on the environment in the distant past (see Figure 3). Hence, unless $\omega$ is constant after some fixed time $t_0$ (i.e. $\omega$ has no jumps after $t_0$ ), $X^\omega(t)$ will not converge as $t\to\infty$ (see Remark 2.6 for the case of periodic environments). In contrast, for a given environment $\omega$ in $(\!-\!\infty,0]$ , we will see that $X^\omega(0)$ , conditionally on $X^\omega(\!-\tau)=x$ , converges in distribution as $\tau\to\infty$ , and we will characterize its law; the setting is illustrated in Figure 5 (compare with Figure 3). To this end, for $n\in{\mathbb N}_0$ define

(2.10) \begin{align}\pi_n &\,{:\!=}\, \mathbb{P}(\exists \beta\geq 0\,{:}\, R(\beta)=0 \mid R(0)=n),\nonumber\\[3pt]\Pi_n (\omega)&\,{:\!=}\, \mathbb{P}(\exists \beta\geq 0 \,{:}\, R_0^\omega(\beta)=0 \mid R_{0}^\omega(0-\!)=n), \end{align}

and set $\pi_\dagger\,{:\!=}\, 0$ and $\Pi_\dagger(\omega)\,{:\!=}\, 0$ . Clearly, $\pi_0=1$ and $\Pi_0(\omega)=1$ .

Figure 5. An illustration of two paths of $X^\omega$ : the black (resp. grey) path is defined in the interval $[\!-(\tau+h),0]$ (resp. $[\!-\tau,0]$ ) starting at $X^\omega(\!-(\tau+h))=x$ (resp. $X^\omega(\!-\tau)=x$ ). The peaks of the environment $\omega$ are depicted as grey vertical lines.

Theorem 2.4. (Asymptotic type frequency.) Assume that $\theta \gt0$ and $\nu_0, \nu_1 \in (0,1)$ .

  1. The process X has a unique stationary distribution $\eta_X\in\mathcal{M}_1([0,1])$, and letting $X(\infty)$ be a random variable distributed according to $\eta_X$, we have $X(t)\xrightarrow[]{(d)}X(\infty)$ as $t\to\infty$. Moreover, for all $n\in\mathbb{N}$,

    (2.11) \begin{equation}\mathbb{E}\left[ (1-X(\infty))^n \right]=\pi_n, \end{equation}
    and the absorption probabilities $(\pi_n)_{n\geq 0}$ satisfy
    (2.12) \begin{align}(\sigma+\theta+n-1)\pi_n&=\sigma \pi_{n+1}+(\theta\nu_1+ n-1)\pi_{n-1} \nonumber \\[3pt] &+ \frac{1}{n}\sum\limits_{k=1}^n\binom{n}{k} \sigma_{n,k}(\pi_{n+k}-\pi_n),\quad n\in{\mathbb N}, \end{align}
    where the coefficients $\sigma_{n,k}$ , $k\in[n]$ , $n\in{\mathbb N}$ , are defined in Equation (2.5).
  2. For almost every (with respect to the law of J) environment $\omega$ and for any $x \in (0,1)$, the distribution of $X^\omega(0)$ conditionally on $\{ X^\omega(\!-\tau) = x \}$ has a limit distribution $\mathcal{L}^\omega$ as $\tau\to\infty$, which does not depend on x. Moreover,

    (2.13) \begin{eqnarray}\ \int_0^1 (1-y)^n \mathcal{L}^{\omega}({\textrm{d}} y) = \Pi_n(\omega),\quad n\in{\mathbb N}, \end{eqnarray}
    and the convergence of moments is exponential, i.e.
    (2.14) \begin{eqnarray} \ \left | \mathbb{E}\left [ (1-X^\omega(0))^n \mid X^\omega(\!-\tau)=x \right ] - \Pi_n(\omega) \right | \leq e^{-\theta\nu_0 \tau},\quad n\in{\mathbb N}. \end{eqnarray}

The setting of Theorem 2.4(1) is illustrated in Figure 3, and its proof is given in Section 5.1; the setting of Theorem 2.4(2) is illustrated in Figure 5, and its proof is given in Section 6.1. Moreover, Theorem 7.2 extends Theorem 2.4(2) to any simple environment. A refinement of Theorem 2.4(2) is given in Theorem 7.3 for simple environments under additional conditions.

Remark 2.5. Simpson’s index is a popular tool for describing population diversity. It represents the probability that two individuals chosen uniformly at random from the population have the same type. In our case it is given by ${\text{Sim}}(t) \,{:\!=}\, X(t)^2 + (1-X(t))^2$ . If the types represent different species, it gives a measure of bio-diversity. If the types represent two alleles of a gene for a given species, it measures homozygosity. As a consequence of Theorem 2.4, one can express the moments of ${\text{Sim}}(\infty)$ in terms of the coefficients $(\pi_n)_{n\geq 0}$ . In particular, we have

\begin{align*} \mathbb{E}[{\text{Sim}}(\infty)]=\mathbb{E}\!\left[X(\infty)^2 + (1-X(\infty))^2\right] = 1 - 2 \pi_1 + 2\pi_2. \end{align*}
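Combining (2.11) with the k-ASG sketch of Section 2.5 gives a Monte Carlo route to such quantities: $\pi_n$ is estimated as the probability that R, started from n, is absorbed at 0 before a large time horizon (the horizon, the sample size, and the parameter values below are ad hoc):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, theta, nu0, lam = 0.5, 1.0, 0.3, 2.0

def pi_hat(n, M=20_000, T=50.0):
    """Monte Carlo estimate of pi_n = P(R hits 0 | R(0) = n), cf. (2.10)."""
    return np.mean([simulate_kasg(n, sigma, theta, nu0, lam, T, rng) == 0
                    for _ in range(M)])

simpson = 1.0 - 2.0 * pi_hat(1) + 2.0 * pi_hat(2)   # estimate of E[Sim(infinity)]
```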

Remark 2.6. If $\omega$ is a periodic environment on $[0,\infty)$ with period $T_p \gt 0$ , the proof of Theorem 2.4(2) yields that, for any $x \in (0,1)$ and $r \in [0,T_p)$ , the distribution of $X^\omega(nT_p + r)$ , conditionally on $\{ X^\omega(0) = x \}$ , has a limit distribution $\mathcal{L}_r^{\omega}$ as $n\to\infty$, which is a function of $\omega$ and r and does not depend on x. Furthermore, $\mathcal{L}_r^{\omega}$ satisfies $\int_0^1 (1-y)^n \mathcal{L}_r^{\omega}({\textrm{d}} y) = \Pi_n(\omega_r)$ , where $\omega_r$ is the periodic environment in $(\!-\!\infty,0]$ defined by $\omega_r(t) \,{:\!=}\, \omega(r+t+(\lfloor -t/T_p \rfloor + 1) T_p)$ for any $t \in (\!-\!\infty,0]$ . The convergence of moments is exponential as in (2.14).

Mixed environments. We present now an application illustrating the advantage of studying both quenched and annealed settings. We consider a population evolving from the distant past in a (stationary) random environment and analyze the effect of a recent perturbation of the environment on the type composition at present. To this end, we assume we only know the distribution of the environment before the perturbation.

In the absence of perturbations, the environment is given by a pure-jump subordinator J in $(\!-\!\infty,0]$ with Lévy measure $\mu$ satisfying (2.4). The perturbation occurs in $(\!-\tau_\star,0]$ (for some $\tau_\star \gt 0$) and is given by a deterministic environment $\omega$. Let $X^{J\otimes_{\tau_\star} \omega}$ be the solution to (1.3) under the environment $J\otimes_{\tau_\star} \omega$, which coincides with J and $\omega$ in $(\!-\!\infty,-\tau_\star]$ and $(\!-\tau_\star,0]$, respectively; see Figure 6 for an illustration. Recall that $(R_0^\omega(\beta))_{\beta \in [0,\tau_\star)}$ is the line-counting process associated to the quenched k-ASG (see Section 2.5). We are interested in the distribution of $X^{J\otimes_{\tau_\star} \omega}(0)$. The next result provides the moments of this random variable.

Figure 6. An illustration of two paths (thin and thick) of $X^{J\otimes_{\tau_\star}\omega}$ in $[\!-\tau,0]$ starting at x. Both paths are subject to the same deterministic environment $\omega$ in $[\!-\tau_\star,0]$; the vertical lines in $(\!-\tau_\star,0]$ represent the peaks of $\omega$. The environment in $[\!-\tau,-\tau_\star)$ is random and driven by J; the peaks of the realization of J giving rise to the thick (resp. thin) path in $[\!-\tau,-\tau_\star)$ are depicted as solid (resp. dotted) vertical lines.

Proposition 2.1. Assume that $\theta>0$ and $\nu_0,\nu_1\in(0,1)$. For any $\tau_\star>0$, $n\in{\mathbb N}$, $x\in[0,1]$, and almost every (with respect to the law of J) $\omega\in{\mathbb D}^\star$, we have

(2.15) \begin{equation}\lim_{\tau\to\infty}{\mathbb E}\left[(1-X^{J\otimes_{\tau_\star} \omega}(0))^n\mid X^{J\otimes_{\tau_\star} \omega}(\!-\tau)=x\right]={\mathbb E}\left[\pi_{R_0^{\omega}(\tau_\star-\!)}\mid R_0^\omega(0-\!)=n\right].\end{equation}

Proposition 2.1 is proved in Section 6.1; a refinement of this result is given for simple environments under the additional condition $\sigma=0$ in Proposition 7.2.

2.6. Ancestral type via the pruned lookdown ASG

In this section we are interested in the type distribution at present of the individuals that will be successful in the long run. This distribution may differ substantially from the type composition at present and may show a bias towards the fit type.

Consider a sample of n individuals at some time T in the future and trace their ancestral lines using the ASG. We will see in Section 5.2 that the line-counting process of the ASG is positive recurrent (see Lemma 5.2). Hence, the ASG has bottlenecks, and if T is sufficiently large, the n individuals share a common ancestor at time 0. Assigning types to the lines in the ASG at time 0 and propagating them forward using the pecking order, we determine the types in the sample as well as the true genealogy. In particular, we obtain the type of the common ancestor of the sample. What it means for T to be sufficiently large depends on n and on the realization of the ASG, but this dependency vanishes as $T\to \infty$. Because we are interested in the type of the individual that is successful in the long run, we can work in this limit. In what follows we formalize this idea.

Consider a realization $\mathcal{G}_{[0,T]}^{}\,{:\!=}\, (\mathcal{G}(\beta))_{\beta\in[0,T]}$ of the annealed ASG in [0, T] started with one line, representing an individual sampled at forward time T. If t denotes forward time, we set $\beta=T-t$ to denote the backward time (see Figure 4). For $\beta\in[0,T]$ , let $V_\beta$ be the set of lines present at time $\beta$ in $\mathcal{G}_{[0,T]}^{}$ . Consider a function $c\,{:}\,V_T\to\{0,1\}$ representing an assignment of types to the lines in $V_T$ . Given $\mathcal{G}_{[0,T]}^{}$ and c, we propagate types (forward in time) along the lines of $\mathcal{G}_{[0,T]}$ , keeping track, at any time $\beta\in[0,T]$ , of the true ancestor in $V_T$ of each line in $V_\beta$ . We denote by $a_c(\mathcal{G}_{[0,T]}^{})$ the type of the ancestor in $V_T$ of the single line in $V_0$ . Assume that, under ${\mathbb P}_x$ , c assigns independently to each line type 0 with probability x and type 1 with probability $1-x$ . The annealed ancestral type distribution at time T is

\begin{align*}h_T(x)\,{:\!=}\, {\mathbb P}_x(a_c(\mathcal{G}_{[0,T]})=0),\quad x\in[0,1].\end{align*}

In the quenched setting we proceed in the same way, but using $\mathcal{G}_{[0,T]}^\omega\,{:\!=}\, (\mathcal{G}_T^\omega(\beta))_{\beta\in[0,T]}$ , the quenched ASG in [0, T] in the environment $\omega$ of an individual sampled at time T, instead of $\mathcal{G}_{[0,T]}^{}$ . The quenched ancestral type distribution at time T is

\begin{align*}h_T^\omega(x)\,{:\!=}\, {\mathbb P}_x\left(a_c(\mathcal{G}_{[0,T]}^\omega)=0\right),\quad x\in[0,1],\end{align*}

where, under ${\mathbb P}_x$ , c assigns independently to each line present in $\mathcal{G}_{[0,T]}^\omega$ at time $\beta=T$ type 0 with probability x and type 1 with probability $1-x$ .

In the absence of mutations, the ancestor of the sampled individual is fit if and only if at least one line in the ASG has type 0 at time $\beta=T$. In the presence of mutations, determining the type of the ancestor is more involved. In [31] the ancestral type distribution was obtained for the null environment using the line-counting process of a pruned version of the ASG, called the pruned lookdown ASG (pLD-ASG). In Section 4.3 we generalize this construction to incorporate the effect of the environment. The main feature of the pLD-ASG is that the type of the ancestor at time $t=0$ of the sampled individual at time $t=T$ is 0 if and only if there is at least one line in the pLD-ASG at time $\beta=T$ that has type 0 (see Lemma 4.3). Hence, $h_T(x)$ and $h_T^\omega(x)$ can be represented via the corresponding line-counting processes (which can easily be inferred from the description of the pLD-ASG given in Section 5.2).

The line-counting process of the annealed pLD-ASG, denoted by $L\,{:\!=}\, (L(\beta))_{\beta \geq 0}$ , is a continuous-time Markov chain with values on $\mathbb{N}$ and generator matrix $Q^\mu\,{:\!=}\, (q^\mu(i,j))_{i,j\in{\mathbb N}}$ given by

(2.16) \begin{equation}q^\mu(i,j)\,{:\!=}\, \left\{\begin{array}{l@{\quad}l} i(i-1)+(i-1) \theta\nu_1 + \theta\nu_0 & \text{if $j=i-1$},\\ \\[-7pt] i(\sigma + \sigma_{i,1}) &\text{if $j=i+1$},\\ \\[-7pt] \binom{i}{k}\sigma_{i,k} &\text{if $j=i+k,\ i\geq k\geq 2$},\\ \\[-7pt] \theta\nu_0 &\text{if $1 \leq j < i-1$},\\ \\[-7pt] -(i-1)(i+\theta)-i\sigma-\int_{(0,1)}(1-(1-y)^i)\mu({\textrm{d}} y) &\text{if $j=i$},\\ \end{array}\right.\end{equation}

where $\sigma_{m,k}$ is defined in (2.5); all other entries are 0.

The pLD-ASG associated to $\omega\in{\mathbb D}^\star$ is well-defined and almost surely contains finitely many lines at any time; we show this in Section 5.2. The corresponding line-counting process $(L_T^\omega(\beta))_{ \beta \geq 0}$ started at time T is a continuous-time (inhomogeneous) Markov process with values in $\mathbb{N}$ . It jumps from $i\in{\mathbb N}$ to $j\in{\mathbb N} \setminus \{i\}$ at rate $q^0(i,j)$ , where $q^0$ is the matrix defined in (2.16) with $\mu=0$ , and in addition, at each time $\beta \geq 0$ such that $\Delta\omega(T-\beta)>0$ , conditionally on $\{L_T^\omega(\beta-\!)=i\}$ , $i\in{\mathbb N}$ , we have $L_T^\omega(\beta) \sim i+\textrm{Bin}({i},{\Delta \omega(T-\beta)})$ .
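
For readers who prefer an algorithmic view, this description translates directly into a Gillespie-type simulation. The following Python sketch is purely illustrative: the function names and the encoding of the environment as a finite list of backward-time jumps are ours, so as written it applies to simple environments only.

```python
import numpy as np

def q0_rates(i, sigma, theta, nu0, nu1):
    """Off-diagonal rates q^0(i, .) from (2.16) with mu = 0; state space N = {1, 2, ...}."""
    rates = {}
    if i >= 2:
        # coalescences, deleterious mutations, and one beneficial mutation to level i - 1
        rates[i - 1] = i * (i - 1) + (i - 1) * theta * nu1 + theta * nu0
        for j in range(1, i - 1):          # beneficial mutations: collapse to j < i - 1
            rates[j] = rates.get(j, 0.0) + theta * nu0
    if sigma > 0:
        rates[i + 1] = sigma * i           # selective branching
    return rates

def simulate_L_quenched(T, env_jumps, sigma, theta, nu0, nu1, L0=1, seed=0):
    """Simulate L_T^omega on [0, T]; env_jumps is a list of pairs (beta_k, p_k),
    where beta_k = T - t_k is the backward time of a jump of omega of size p_k."""
    rng = np.random.default_rng(seed)
    env, k = sorted(env_jumps), 0
    beta, L = 0.0, L0
    while beta < T:
        rates = q0_rates(L, sigma, theta, nu0, nu1)
        total = sum(rates.values())
        step = rng.exponential(1.0 / total) if total > 0 else np.inf
        next_env = env[k][0] if k < len(env) else np.inf
        if next_env < min(beta + step, T):     # environmental jump comes first
            beta = next_env
            L += rng.binomial(L, env[k][1])    # L -> L + Bin(L, Delta omega(T - beta))
            k += 1
        elif beta + step < T:                  # basic event comes first
            beta += step
            targets = list(rates)
            L = int(rng.choice(targets, p=[rates[j] / total for j in targets]))
        else:
            break
    return L
```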

We now state the main result of this section, describing the asymptotic behavior of $h_T(x)$ and $h_T^\omega(x)$ .

Theorem 2.5. (Ancestral type distribution.) The following assertions hold:

1. The process L admits a unique stationary distribution $\eta_L$. Moreover, if $L(\infty)$ is a random variable distributed according to $\eta_L$, then $L(T)\xrightarrow[]{(d)}L(\infty)$ as $T\to\infty$. In particular, $h(x)\,{:\!=}\, \lim_{T\to\infty} h_T(x)$ is well-defined, and

    (2.17) \begin{eqnarray}h(x) = \sum_{n \geq 0} x(1-x)^{n} a_n, \end{eqnarray}
    where the coefficients $a_n\,{:\!=}\, \mathbb{P}(L(\infty) > n)$, $n\in{\mathbb N}_0$, satisfy the following recursion (which is known as Fearnhead’s recursion when $\mu=0$):
    (2.18) \begin{equation}(\sigma +\theta +n+1 )\, a_n= \sigma a_{n-1}+(\theta\nu_1+n+1)\,a_{n+1} + \frac{1}{n}\sum\limits_{j=1}^{n} \gamma_{n+1,j}\, (a_{j-1}-a_{j}),\quad n\in{\mathbb N},\end{equation}
    where $\gamma_{i,j}\,{:\!=}\, \sum_{k=i-j}^{j}\binom{j}{k}\sigma_{j,k}$ if $1 \leq j<i\leq 2j$ and $\gamma_{i,j}\,{:\!=}\, 0$ otherwise.
2. Assume that $\theta\nu_0 > 0$. For any $n \in {\mathbb N}$, the distribution of $L_T^\omega(T-\!)$ conditionally on $\{ L_T^\omega(0-\!) = n \}$ has a limit distribution $\mu^\omega\in\mathcal{M}_1({\mathbb N})$ as $T\to\infty$, which does not depend on n. In particular, $h^{\omega}(x)\,{:\!=}\, \lim_{T\to\infty}h_T^\omega(x)$ is well-defined and

    (2.19) \begin{equation}h^{\omega}(x)= 1- \sum_{n=1}^\infty \mu^\omega(\{n\})(1-x)^{n}.\end{equation}
    Moreover, for any $x \in [0,1]$ and $T > 0$,
    (2.20) \begin{eqnarray}\left | h^{\omega}(x) - h^{\omega}_T(x) \right | \leq 2e^{-\theta\nu_0 T}. \end{eqnarray}

The proof of Theorem 2.5(1) is given in Section 5.2; Theorem 2.5(2) is proved in Section 6.2. For simple environments, under additional conditions, Theorem 7.4 extends Theorem 2.5(2) to the case $\theta\nu_0=0$, and Theorem 7.5 provides a refinement of Theorem 2.5(2).
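
For orientation, recursion (2.18) is easy to solve numerically. The sketch below is illustrative only: it assumes $\mu=0$ (so the $\gamma$ terms vanish), uses $a_0=1$ (the line-counting process takes values in ${\mathbb N}$, so $L(\infty)\geq 1$), truncates the recursion at a level $n_{\max}$ with $a_{n_{\max}+1}\,{:\!=}\, 0$, and checks the output for $\theta=0$ against Kimura's formula from Remark 2.8 below; the function names are ours.

```python
import numpy as np

def fearnhead_tail_probs(sigma, theta, nu1, n_max=200):
    """Solve (2.18) with mu = 0 for a_1, ..., a_{n_max}, given a_0 = 1 and the
    truncation a_{n_max + 1} := 0.  Row n encodes
    (sigma + theta + n + 1) a_n - sigma a_{n-1} - (theta nu1 + n + 1) a_{n+1} = 0."""
    A = np.zeros((n_max, n_max))
    b = np.zeros(n_max)
    for n in range(1, n_max + 1):
        A[n - 1, n - 1] = sigma + theta + n + 1
        if n >= 2:
            A[n - 1, n - 2] = -sigma
        else:
            b[0] = sigma                      # the a_0 = 1 term, moved to the RHS
        if n <= n_max - 1:
            A[n - 1, n] = -(theta * nu1 + n + 1)
    return np.concatenate(([1.0], np.linalg.solve(A, b)))

def h_series(x, a):
    """Ancestral type distribution h(x) from the series (2.17)."""
    n = np.arange(len(a))
    return float(np.sum(x * (1 - x) ** n * a))

sigma = 1.5
a = fearnhead_tail_probs(sigma, theta=0.0, nu1=0.0)
x = 0.3   # for theta = 0, h(x) should agree with Kimura's formula (Remark 2.8)
print(h_series(x, a), (1 - np.exp(-sigma * x)) / (1 - np.exp(-sigma)))
```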

In the case $\theta=0$ , Theorem 2.5 yields the following result about the boundary behavior of X.

Corollary 2.1. (Accessibility of the boundaries.) If $\theta=0$ , then for any $T>0$ and $x\in[0,1]$ ,

\begin{align*}h_T(x)=\mathbb{E} [X(T) \mid X(0) = x].\end{align*}

Moreover, conditionally on $\{X(0)=x\}$ , X(T) converges almost surely as $T\to\infty$ to a Bernoulli random variable with parameter h(x). In particular, the absorbing states 0 and 1 are both accessible from any $x\in(0,1)$ .

Remark 2.7. Corollary 2.1 is not a direct consequence of [Reference González Casanova, Spanò and Wilke-Berenguer20, Theorem 3.2], whose statement does not cover SDEs with a diffusion term (the term $\sqrt{2X(t)(1-X(t))}{\textrm{d}} B(t)$).

We close this section with an application of our results to the comparison of the (isolated) effects of the environment and of (genic) selection. To this end, we fix a non-zero measure $\mu$ on (0, 1) satisfying (2.4) and we consider two models, both without mutations. The first model has selection parameter

(2.21) \begin{equation}\sigma=\sigma_\mu\,{:\!=}\, \int_{(0,1)}y\mu({\textrm{d}} y),\end{equation}

and no environment (i.e. in (1.3) we take $S(t)\,{:\!=}\, \sigma_\mu t$ ). The second one has selection parameter $\sigma=0$ and an environment given by a subordinator with Lévy measure $\mu$ (i.e. in (1.3) we take $S(t)\,{:\!=}\, J(t)$ ). We will use the superscript ‘sel’ (resp. ‘env’) to refer to the first (resp. second) model.

For $n\in{\mathbb N}$ , set $\rho_n^{{\text{env}}}\,{:\!=}\, {\mathbb P}(L^{{\text{env}}}(\infty)=n)$ and $\rho_n^{{\text{sel}}}\,{:\!=}\, {\mathbb P}(L^{{\text{sel}}}(\infty)=n)$ . Consider the probability generating functions

\begin{align*}p^{{\text{env}}}(z)\,{:\!=}\, \sum_{n=1}^\infty \rho_n^{{\text{env}}}z^{n}\quad\textrm{and}\quad p^{{\text{sel}}}(z)\,{:\!=}\, \sum_{n=1}^\infty \rho_n^{{\text{sel}}}z^{n},\quad z\in[0,1].\end{align*}

Note that $p^{{\text{env}}}(z)=1-h^{{\text{env}}}(1-z)$ and $p^{{\text{sel}}}(z)=1-h^{{\text{sel}}}(1-z)$ .

Proposition 2.2. For any non-zero measure $\mu$ on (0,1) satisfying (2.4) we have

\begin{align*}\rho_1^{{\text{env}}}>\rho_1^{{\text{sel}}}=\frac{\sigma_\mu}{e^{\sigma_\mu}-1}\quad\textrm{and}\quad p^{{\text{env}}}(z)\leq \frac{\rho_1^{{\text{env}}}}{\rho_1^{{\text{sel}}}}\, p^{{\text{sel}}}(z)=\rho_1^{{\text{env}}}\,\left(\frac{e^{\sigma_\mu z}-1}{\sigma_\mu}\right),\quad z\in[0,1].\end{align*}

In particular, there is $x_c\in(0,1)$ such that, for $x\in[x_c,1)$ ,

\begin{equation*}h^{{\text{env}}}(x)={\mathbb P}\left(\lim_{t\to\infty}X^{{\text{env}}}(t)=1 \mid X^{{\text{env}}}(0)=x\right)<{\mathbb P}\left(\lim_{t\to\infty}X^{{\text{sel}}}(t)=1 \mid X^{{\text{sel}}}(0)=x\right)=h^{{\text{sel}}}(x).\end{equation*}

Remark 2.8. As a consequence of Proposition 2.2 one recovers the classical result of Kimura [Reference Kimura28],

\begin{align*}h^{{\text{sel}}}(x)=\frac{1-e^{-\sigma_\mu x}}{1-e^{-\sigma_\mu}},\quad x\in[0,1].\end{align*}
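
The inequality $\rho_1^{{\text{env}}}>\rho_1^{{\text{sel}}}$ can also be probed by simulation. The following sketch is illustrative and assumes a simple environment $\mu=\lambda\delta_p$ (so $\sigma_\mu=\lambda p$): with $\sigma=\theta=0$, the chain $L^{{\text{env}}}$ coalesces from i to $i-1$ at rate $i(i-1)$ and, at the arrival times of a Poisson process of rate $\lambda$, jumps from i to $i+\textrm{Bin}({i},{p})$ (the annealed counterpart of the environmental branching described above); the long-run fraction of time spent in state 1 then approximates $\rho_1^{{\text{env}}}$ by ergodicity.

```python
import numpy as np

def rho1_env_mc(lam, p, T_sim=10_000.0, seed=1):
    """Monte Carlo estimate of rho_1^env for the simple environment mu = lam * delta_p
    (sigma = theta = 0): time-average occupation of state 1 of L^env."""
    rng = np.random.default_rng(seed)
    t, L, time_in_1 = 0.0, 1, 0.0
    while t < T_sim:
        rate = L * (L - 1) + lam            # coalescence rate + environmental event rate
        dt = min(rng.exponential(1.0 / rate), T_sim - t)
        if L == 1:
            time_in_1 += dt
        t += dt
        if t >= T_sim:
            break
        if rng.random() < lam / rate:
            L += rng.binomial(L, p)         # environmental branching (possibly a no-op)
        else:
            L -= 1                          # coalescence
    return time_in_1 / T_sim

lam, p = 2.0, 0.5
sigma_mu = lam * p                          # sigma_mu = int y mu(dy)
rho1_sel = sigma_mu / (np.exp(sigma_mu) - 1)
print(rho1_env_mc(lam, p), rho1_sel)        # the first value should exceed the second
```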

Remark 2.9. Consider a Wright–Fisher diffusion with no mutations and selection parameter $\sigma$ , evolving in an environment with Lévy measure $\mu$ . The quantity $\sigma_\mu$ in (2.21) corresponds to the quantity $\alpha_{\mathfrak{s}}$ in [Reference González Casanova, Spanò and Wilke-Berenguer20]. As shown there, $\alpha_{\mathfrak{s}}$ is not sufficient to fully describe the strength of the environment; one also needs to know the shape of rare selection, which is defined as $\alpha^*\,{:\!=}\,\int_{(0,1)}\log\!(1+y)\mu({\textrm{d}} y)/\alpha_{\mathfrak{s}}$ . The joint action of weak selection and the environment is then described by the quantity $\alpha_{{\text{eff}}}\,{:\!=}\, \sigma+\alpha_{\mathfrak{s}}\alpha^*$ , which is called the effective strength of selection. The main result in [Reference González Casanova, Spanò and Wilke-Berenguer20] establishes that both boundaries are accessible if and only if $\alpha_{{\text{eff}}}$ is smaller than a quantity $\beta^*$ coding for neutral reproductions ( $\beta^*=\infty$ in our case).

The proofs of Corollary 2.1 and Proposition 2.2 are given in Section 5.2.

3. Moran models and Wright–Fisher processes

This section is devoted to the proofs of Theorem 2.1 and Theorem 2.2 and other related results.

3.1. Results related to Section 2.1

Graphical representation. We start by making more precise the description of the graphical representation of the Moran model as an IPS. This will allow us to decouple the randomness of the model coming from the initial type configuration, the randomness coming from mutations and reproductions, and the randomness coming from the environment. Non-environmental events are as usual encoded via a family of independent Poisson processes $\Lambda\,{:\!=}\, \{\lambda_{i}^{0},\lambda_{i}^{1},\{\lambda_{i,j}^{\vartriangle},\lambda_{i,j}^{\blacktriangle}\}_{j\in{[N] \setminus \{i\}}} \}_{i\in[N]}$ , where (a) for each $i,j\in [N]$ with $i\neq j$ , $(\lambda_{i,j}^{\vartriangle}(t))_{t\in{\mathbb R}}$ and $(\lambda_{i,j}^{\blacktriangle}(t))_{t\in{\mathbb R}}$ are Poisson processes with rates $\sigma_N/N$ and $1/N$ , respectively, and (b) for each $i\in [N]$ , $(\lambda_{i}^{0}(t))_{t\in{\mathbb R}}$ and $(\lambda_{i}^{1}(t))_{t\in{\mathbb R}}$ are Poisson processes with rates $\theta_N\nu_0$ and $\theta_N\nu_1$ , respectively. We call $\Lambda$ the basic background. The environment introduces a new independent source of randomness into the model, which we describe via the collection

\begin{align*}\Sigma\,{:\!=}\, \{(U_i(t))_{i\in[N], t\in{\mathbb R}}, (\tau_A^{}(t,\cdot))_{A\subset[N], t\in{\mathbb R}}\},\end{align*}

where (c) $(U_i(t))_{i\in[N], t\in{\mathbb R}}$ is an $[N] \times {\mathbb R}$ -indexed family of independent and identically distributed (i.i.d.) random variables with $U_i(t)$ being uniformly distributed on [0, 1], and (d) $(\tau_A(t,\cdot))_{A\subset[N], t\in{\mathbb R}}$ is a family of independent random variables with $\tau_A(t,\cdot)$ being uniformly distributed on the set of injections from A to [N]. We call $\Sigma $ the environmental background. We assume that the basic and environmental backgrounds are independent, and we call $(\Lambda,\Sigma)$ the background.

Recall that in the graphical representation individuals are represented by horizontal lines at levels $i\in [N]$ (see Figure 1). The random appearance of selective and neutral arrows, circles, and crosses is prescribed by the background as follows. At the arrival times of $\lambda_{i,j}^{\vartriangle}$ (resp. $\lambda_{i,j}^{\blacktriangle}$ ), we draw selective (resp. neutral) arrows from level i to level j. At the arrival times of $\lambda_{i}^{0}$ (resp. $\lambda_{i}^{1})$ , we draw an open circle (resp. a cross) at level i. Given an environment $\zeta\,{:\!=}\, (t_k, p_k)_{k \in I}$ satisfying (2.1), we define, for each $k\in I$ ,

\begin{align*}I_{\zeta}(k)\,{:\!=}\, \{i\in[N]\,{:}\,U_i(t_k)\leq p_k\}\quad\textrm{and}\quad n_{\zeta}(k)\,{:\!=}\, |I_{\zeta}(k)|,\end{align*}

and we draw, at time $t_k$, for each $i\in I_{\zeta}(k)$ a selective arrow from level i to level $\tau_{I_{\zeta}(k)}^{}(t_k,i)$.
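
A concrete way to read this construction: at an environmental time $t_k$, each level flips an independent coin with success probability $p_k$ to decide whether it participates, and the participating levels shoot arrows to the (pairwise distinct) targets of a uniform injection. The following short sketch of a single environmental event is purely illustrative (1-based levels; the function name is ours).

```python
import numpy as np

def environmental_arrows(N, p_k, rng):
    """Resolve one environmental event: I_zeta(k) = {i : U_i(t_k) <= p_k}, and each
    i in I_zeta(k) shoots a selective arrow to tau(i), where tau is a uniformly
    chosen injection from I_zeta(k) into [N] (distinct targets)."""
    I = np.flatnonzero(rng.random(N) <= p_k) + 1                  # participating levels
    targets = rng.choice(np.arange(1, N + 1), size=len(I), replace=False)
    return dict(zip(I.tolist(), targets.tolist()))                # arrows i -> tau(i)

rng = np.random.default_rng(2)
print(environmental_arrows(N=10, p_k=0.4, rng=rng))
```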

Continuity with respect to the environment. Now we embark on the proof of Theorem 2.1, which states the continuity of the type composition in a Moran model with respect to the environment. The paths of the fit-counting process are considered as elements of ${\mathbb D}_{0,T}$ , which is endowed with the $J_1$ -Skorokhod topology, i.e. the topology induced by the Billingsley metric $d_T^0$ defined in (A.2). Recall also that the restriction of an environment to [0, T] is described by means of a function in ${\mathbb D}_T^\star$ (see (2.3)), which is endowed with the topology induced by the metric $d_T^\star$ defined in (A.3).

Let us denote by $\mu_N(\omega)$ the law of $(Z_N^\omega(t))_{t\in[0,T]}$ (recall that $Z_N^\omega(t)$ is the number of fit individuals at time t in a Moran population of size N subject to environment $\omega$ ). Theorem 2.1 states the continuity of the mapping $\omega\mapsto \mu_N(\omega)$ , where the set of probability measures on ${\mathbb D}_{0,T}$ is equipped with the topology of weak convergence of measures. We will use the fact that the topology of weak convergence of probability measures on a complete and separable metric space (E, d) is induced by the metric $\varrho_E$ defined in (A.6).

First, we get rid of the small jumps of the environment. To this end, we introduce the following notation. For $\delta>0$ and $\omega\in{\mathbb D}_T^\star$ , we define $\omega^\delta, \omega_\delta\in {\mathbb D}_T^\star$ via

(3.1) \begin{align}\omega^\delta(t)\,{:\!=}\, \sum_{u\in[0,t]:\Delta \omega(u)\geq \delta} \Delta \omega(u)\quad\textrm{and}\quad \omega_\delta(t)\,{:\!=}\, \sum_{u\in[0,t]:\Delta \omega(u)< \delta} \Delta \omega(u).\end{align}

Clearly, $\omega^\delta$ is simple and $\omega=\omega^\delta+\omega_\delta$ . Moreover, $\omega_\delta \to 0$ pointwise as $\delta\to 0$ , and hence for any $t\in[0,T]$ ,

\begin{align*}d_t^\star(\omega,\omega^\delta)\leq \sum_{u\in[0,T]}\lvert \Delta \omega(u)-\Delta \omega^\delta(u)\rvert= \omega_\delta(T) \xrightarrow[\delta\to 0]{}0.\end{align*}

In addition, for $\omega\in{\mathbb D}_T^\star$ , $n\in{\mathbb N}$ , and $\vec{r}\,{:\!=}\, (r_i)_{i\in[n]}\in[0,T]^n$ , we denote by $\mu_N^{\vec{r}}(\omega)$ the law of $(Z_N^{\omega}(r_i))_{i\in[n]}$ , where $[0,N]^n$ is equipped with the distance $d_1$ defined via

(3.2) \begin{equation}d_1( (x_i)_{i\in[n]}, (y_i)_{i\in[n]}) \,{:\!=}\, \sum_{i\in[n]}|x_i - y_i|.\end{equation}

Proposition 3.1. Let $\omega\in{\mathbb D}_T^\star$ . Assume that for any $\delta>0$ we have $Z_N^{\omega^\delta}(0)=Z_N^{\omega}(0)$ ; then

(3.3) \begin{equation}\varrho_{[0,N]^n}^{}(\mu_N^{\vec{r}}\left(\omega^\delta\right),\mu_N^{\vec{r}}(\omega))\leq nN\,\omega_\delta(r_*)e^{\sigma_N r_*+ \omega(r_*)}, \ \forall \vec{r}\in[0,T]^n, n\in{\mathbb N},\end{equation}

where $r_*\,{:\!=}\, \max_{i\in[n]}r_i$ . Moreover,

(3.4) \begin{equation}\varrho_{{\mathbb D}_{0,T}}^{}(\mu_N\left(\omega^\delta\right),\mu_N(\omega))\leq N\omega_\delta(T)e^{(1+\sigma_N) T+\omega(T)}.\end{equation}

In particular,

\begin{align*}(Z_N^{\omega^\delta}(t))_{t\in[0,T]}\xrightarrow[\delta\to 0]{(d)}(Z_N^{\omega}( t))_{t\in[0,T]}.\end{align*}

Proof. For $\delta>0$ , we couple in [0, T] a Moran model with parameters $(\sigma_N,\theta_N,\nu_0,\nu_1)$ and environment $\omega$ to a Moran model with parameters $(\sigma_N,\theta_N,\nu_0,\nu_1)$ and environment $\omega^\delta$ (both of size N) by using the same initial type configuration, the same basic background, and the same environmental background. For any $t\in[0,T]$ and $a,b\in\{0,1\}$ , we denote by $Y_N^{a,b}(t)$ the number of individuals that at time t have type a under the environment $\omega$ and type b under the environment $\omega^\delta$ . Clearly, we have

\begin{align*}\left\lvert Z_N^{\omega^{\delta}}(t)-Z_N^{\omega}(t)\right\rvert=\left\lvert Y_N^{1,0}(t) - Y_N^{0,1}(t) \right\rvert\leq Y_N^{1,0}(t) + Y_N^{0,1}(t)\,{:\!=}\, Y_N^{\neq}(t).\end{align*}

Note that $Y_N^{\neq}(t)$ is the number of individuals that have different types at time t under $\omega$ and $\omega^\delta$ . In particular, we have $Y_N^{\neq}(t) \leq N$ almost surely. Let us assume that at time t a graphical element arises in the basic background, i.e. t is an arrival time of one of the Poisson processes in the family $\Lambda$ . If the graphical element is a mutation, then $Y_N^{\neq}(t)\leq Y_N^{\neq}(t-\!)$ . If the graphical element is a neutral arrow, we have

\begin{align*}{\mathbb E}\left[Y_N^{\neq}(t)\mid Y_N^{\neq}(t-\!)\right] &=Y_N^{\neq}(t-\!)+\frac{1}{N}Y_N^{\neq}(t-\!)(N-Y_N^{\neq}(t-\!))-\frac{1}{N}(N-Y_N^{\neq}(t-\!))Y_N^{\neq}(t-\!)\\[3pt] &=Y_N^{\neq}(t-\!).\end{align*}

If the graphical element is a selective arrow, then $Y_N^{\neq}(t)$ can increase by 1 only if the individual at the tail of the arrow has a different type at time t under $\omega$ and $\omega^\delta$ . Thus

\begin{align*}{\mathbb E}\left[Y_N^{\neq}(t)\mid Y_N^{\neq}(t-\!)\right]\leq\left(1+\frac1N\right)Y_N^{\neq}(t-\!).\end{align*}

Now, let $0\leq s<t\leq T$ and assume that there are neither jumps of $\omega^\delta$ nor selective events in (s, t). In particular, in (s, t) only the population driven by $\omega$ is affected by the environment. Moreover, since neutral reproductions and mutations do not increase the expected value of $Y_N^{\neq}$ , we obtain

\begin{align*}{\mathbb E}\left[Y_N^{\neq}(t-\!)\mid Y_N^{\neq}(s)\right]\leq Y_N^{\neq}(s) +N\sum_{u\in(s,t)}\Delta \omega(u).\end{align*}

In addition, if at time t the environment $\omega^\delta$ jumps (there are only finitely many of these jumps), then

\begin{align*}{\mathbb E}\left[Y_N^{\neq}(t)\mid Y_N^{\neq}(t-\!)\right]\leq Y_N^{\neq}(t-\!)(1+\Delta \omega (t)).\end{align*}

Let $0\leq t_1<\cdots<t_m\leq T$ be the jump times of $\omega^\delta$ . The previous discussion yields

(3.5) \begin{equation} {\mathbb E}\left[Y_N^{\neq}(t_{i+1})\mid Y_N^{\neq}(t_i)\right]\leq {\mathbb E}\left[\left(1+\frac1N\right)^{K_i} \right]\left(Y_N^{\neq}(t_i)+N\epsilon_i(\delta)\right)(1+\Delta \omega(t_{i+1})), \end{equation}

where $\epsilon_i(\delta)\,{:\!=}\, \sum_{u\in(t_i,t_{i+1})}\Delta \omega(u)$ and $K_i$ is the number of selective events in $(t_i,t_{i+1})$ . Note that $K_i$ has a Poisson distribution with parameter $N\sigma_N(t_{i+1}-t_i)$ . Hence,

\begin{align*}{\mathbb E}\left[Y_N^{\neq}({t_{i+1}})\mid Y_N^{\neq}(t_i)\right]\leq e^{\sigma_N(t_{i+1}-t_i)}\left(Y_N^{\neq}(t_i)+N\epsilon_i(\delta)\right)(1+\Delta \omega(t_{i+1})).\end{align*}

Iterating this formula and using that $Y_N^{\neq}(0)=0$ yields

(3.6) \begin{equation} {\mathbb E}\left[Y_N^{\neq}(t)\right]\leq e^{\sigma_N t} N\omega_\delta(t)\prod_{t_i\leq t}(1+\Delta \omega(t_i))\leq N\,\omega_\delta(t)\,e^{\sigma_N t+\sum_{u\in[0,t]}\Delta \omega(u)}.\end{equation}

Recall the definition of the space $\textrm{BL}(E)$ in Appendix A.2. We equip $[0,N]^n$ with the distance $d_1$ defined in (3.2). For any $n \geq 1$ and $F\in \textrm{BL}([0,N]^n)$ , we have

\begin{align*} \left\lvert \int F {\textrm{d}}\mu_N^{\vec{r}}\left(\omega^\delta\right)- \int F {\textrm{d}}\mu_N^{\vec{r}}(\omega)\right\rvert = \left\lvert \mathbb{E} \left [ F((Z_N^{\omega^\delta}(r_j))_{j\in[n]}) \right ] - \mathbb{E} \left [ F((Z_N^\omega(r_j))_{j\in[n]}) \right ] \right\rvert. \end{align*}

Hence, if $\lVert F\rVert_{\textrm{BL}}\leq 1$ (see (A.5) for the definition of $\lVert \cdot\rVert_{\textrm{BL}}$ ) and we couple $Z_N^{\omega^{\delta}}(t)$ and $Z_N^\omega(t)$ as before, we get that

\begin{align*} &\left\lvert \mathbb{E} \left[ F((Z_N^{\omega^\delta}(r_j))_{j\in[n]}) \right] - \mathbb{E} \left[ F((Z_N^{\omega}(r_j))_{j\in[n]}) \right] \right\rvert \\ &\quad \leq \left\lvert \mathbb{E} \left [ d_1((Z_N^{\omega^\delta}(r_j))_{j\in[n]}, (Z_N^\omega(r_j))_{j\in[n]}) \right ] \right\rvert= \mathbb{E} \left [ \sum_{j\in[n]}|Y_N^{\neq}(r_j)| \right ]\\ &\quad \leq \sum_{j\in[n]} N\omega_\delta(r_j)e^{\sigma_N r_j+\sum_{u\in[0,r_j]}\Delta \omega(u)},\end{align*}

where the last bound comes from (3.6) applied at $r_j$ , $j\in[n]$ . Taking the supremum over all $F\in \textrm{BL}([0,N]^n)$ with $\lVert F\rVert_{\textrm{BL}}\leq 1$ and using the definition of the distance $\varrho_{[0,N]^n}$ in (A.6), we get (3.3). Now, define $Y_N^*(t)\,{:\!=}\, \sup_{u\in[0,t]}Y_N^{\neq}(u)$ . If at time t a neutral event occurs, then

\begin{align*}{\mathbb E}[Y_N^*(t)\mid Y_N^*(t-\!)]\leq\left(1+\frac1N\right)Y_N^*(t-\!).\end{align*}

Other events can be treated as before, leading to (3.5) with $K_i$ being this time the number of selective and neutral events in $(t_i,t_{i+1})$ . Hence, Equation (3.4) follows similarly to (3.3). The convergence of $Z_N^{\omega^\delta}$ towards $Z_N^\omega$ is a direct consequence of (3.4).

Proposition 3.2. Let $\omega\in{\mathbb D}_T^\star$ and $\{\omega_k\}_{k\in{\mathbb N}}\subset {\mathbb D}_T^\star$ be such that $d_T^\star(\omega_k,\omega)\to 0$ as $k\to\infty$ . If $\omega$ is simple and, for any $k\in{\mathbb N}$ , $Z_N^\omega(0)=Z_N^{\omega_k}(0)$ , then

(3.7) \begin{equation}(Z_N^{\omega_k}(t))_{t\in[0,T]}\xrightarrow[k\to \infty]{(d)}(Z_N^\omega(t))_{t\in[0,T]}.\end{equation}

Proof. The proof consists of two parts. In the first part, we construct a time deformation $\lambda_k\in\mathcal{C}_T^\uparrow$ with suitable properties. In the second part, we compare $Z_N^{\omega_k}\circ\lambda_k$ and $Z_N^\omega$ under an appropriate coupling of the underlying Moran models.

Part 1: We assume, without loss of generality, that $d_T^\star(\omega_k,\omega)>0$ for all $k\in{\mathbb N}$. Set $\epsilon_k\,{:\!=}\, 2d_T^\star(\omega_k,\omega)$, so that $d_T^\star(\omega_k,\omega)< \epsilon_k$. By definition of the metric $d_T^\star$ in (A.3), there is $\varphi_k\in \mathcal{C}_T^\uparrow$ such that

\begin{align*}\lVert \varphi_{k}\rVert_T^0\leq \epsilon_k\quad\textrm{and}\quad\sum_{u\in[0,T]}|\Delta \omega(u)-\Delta (\omega_k\circ\varphi_{k})(u)|\leq \epsilon_k,\end{align*}

where $\lVert \cdot \rVert_T^0$ is defined in (A.1). Denote by $r_1<\cdots<r_n$ the consecutive jump times of $\omega$ in [0, T]. We assume without loss of generality that $0<r_1\leq r_n<T$ . The case where $\omega$ jumps at T can be reduced to the previous case, by extending $\omega_k$ , $k\in{\mathbb N}$ , and $\omega$ to $[0,T+\varepsilon]$ as constants in $[T,T+\varepsilon]$ . Set $\gamma_k\,{:\!=}\, T\,\sqrt{e^{\epsilon_k}-1}$ . In the remainder of the proof we assume that k is sufficiently large, so that $\gamma_k\leq \min_{i\in[n]_0}(r_{i+1}-r_{i})/3$ , where $r_0\,{:\!=}\, 0$ and $r_{n+1}\,{:\!=}\, T$ . This condition ensures that the intervals $I_i^k\,{:\!=}\, [r_i-\gamma_k,r_i+\gamma_k]$ , $i\in[n]$ , are disjoint and contained in [0, T]. Now, define $\lambda_k\,{:}\,[0,T]\to[0,T]$ via the following:

  (i) For $u\notin \cup_{i=1}^nI_i^k$: $\lambda_k(u)\,{:\!=}\, u$.

  (ii) For $u\in [r_i-\gamma_k,r_i]$: $\lambda_k(u)\,{:\!=}\, \varphi_k(r_i)+m_i(u-r_i)$, where $m_i\,{:\!=}\, (\varphi_k(r_i)-r_i+\gamma_k)/\gamma_k.$

  (iii) For $u\in (r_i,r_i+\gamma_k]$: $\lambda_k(u)\,{:\!=}\, \varphi_k(r_i)+\bar{m}_i(u-r_i)$, where $\bar{m}_i\,{:\!=}\, (r_i+\gamma_k-\varphi_k(r_i))/\gamma_k.$

For k sufficiently large, so that $\epsilon_k < \log 2$, we can infer from $\lVert \varphi_{k}\rVert_T^0\leq \epsilon_k$ and from $\gamma_k=T\,\sqrt{e^{\epsilon_k}-1}$ that $m_i$ and $\bar{m}_i$ are positive. It is then straightforward to check that $\lambda_k\in\mathcal{C}_T^\uparrow$, $\lambda_k(I_i^k)=I_i^k$, $i\in[n]$, and that

\begin{align*}\sum_{u\in[0,T]}|\Delta \omega(u)-\Delta \bar{\omega}_k(u)|\leq \epsilon_k,\end{align*}

where $\bar{\omega}_k\,{:\!=}\, \omega_k\circ\lambda_k$ . Moreover, since $\lVert \varphi_{k}\rVert_T^0\leq \epsilon_k$ , we infer that $\varphi_k(r_i)\in[e^{-\epsilon_k}r_i,e^{\epsilon_k}r_i]$ . It follows that, for k sufficiently large,

\begin{align*}1-2\sqrt{e^{\epsilon_k}-1} \leq m_i \leq 1+2\sqrt{e^{\epsilon_k}-1},\quad i\in[n],\end{align*}

and the same holds for $\bar{m}_i$. Note that we can write $\lambda_k(t) = \int_0^t p_k(u) {\textrm{d}} u$, with $p_k\,{:}\,[0,T] \to \mathbb{R}$ taking only the values $(m_i)_{i\in[n]}$, $(\bar{m}_i)_{i\in[n]}$, and 1. In particular, we have $|p_k(u)-1|\leq 2\sqrt{e^{\epsilon_k}-1}$ for all $u\in[0,T]$. Thus, for any $s,t\in[0,T]$ with $s\neq t$, the slope $(\lambda_k(s)-\lambda_k(t))/(s-t)$ belongs to $[1-2\sqrt{e^{\epsilon_k}-1}, 1+2\sqrt{e^{\epsilon_k}-1}]$. Therefore, for k sufficiently large, we have

\begin{align*}\frac{\lambda_k(s)-\lambda_k(t)}{s-t}, \,\frac{s-t}{\lambda_k(s)-\lambda_k(t)}\leq 1+3\sqrt{e^{\epsilon_k}-1}.\end{align*}

Hence, using that $\log(1+x)\leq x$ for $x>-1$ , we obtain for k sufficiently large

(3.8) \begin{equation}\lVert \lambda_k\rVert_T^0\leq 3\sqrt{e^{\epsilon_k}-1}.\end{equation}

Part 2: For $\delta>0$ , we couple in [0, T] a Moran model with parameters $(\sigma_N,\theta_N,\nu_0,\nu_1)$ and environment $\omega$ to a Moran model with parameters $(\sigma_N,\theta_N,\nu_0,\nu_1)$ and environment $\omega_k$ (both of size N) by using (1) the same initial type configuration and (2) the same basic background, and (3) by using in the second population the environmental background of the first one, time-changed by $\lambda_k^{-1}$ . Under this coupling and by construction of the function $\lambda_k$ , the Moran model associated to $\omega$ and the Moran model associated to $\omega_k$ (time-changed by $\lambda_k$ ) experience the same basic events out of the time intervals $I_i^k$ . Moreover, at the times $r_i$ , the success of simultaneous environmental reproductions is decided according to the same uniform random variables.

For $t\in[0,T]$ , let $Y_N^{\neq}(t)$ be the number of individuals that have different types at time t for $\omega$ and at time $\lambda_k(t)$ for $\omega_k$ , and set $Y_N^*(t)\,{:\!=}\, \sup_{u\in[0,t]}Y_N^{\neq}(u)$ .

Consider the event $E_{k}\,{:\!=}\, \{\textrm{there are no basic events in } \cup_{i \in [n]}I_i^k\}$, and note that

(3.9) \begin{equation}{\mathbb P}(E_{k}^c)\leq n\left(1-e^{-2N(1+\sigma_N+\theta_N)\gamma_k}\right).\end{equation}

Moreover, on the event $E_{k}$ , only the population driven by $\omega_k$ can change in $(r_i,r_i+\gamma_k]$ , and this can only be due to environmental events. Hence,

(3.10) \begin{equation} {\mathbb E}[Y_N^*(r_i+\gamma_k)1_{E_k}]\leq {\mathbb E}[Y_N^*(r_i)1_{E_{k}}] +N\sum_{u\in(r_i,r_i+\gamma_k]}\Delta\bar{\omega}_k(u).\end{equation}

A similar argument yields

(3.11) \begin{equation} {\mathbb E}[Y_N^*(r_{i+1}-\!)\,1_{E_{k}}]\leq {\mathbb E}[Y_N^*(r_{i+1}-\gamma_k)1_{E_{k}}]+ N\sum_{u\in(r_{i+1}-\gamma_k,r_{i+1})}\Delta\bar{\omega}_k(u).\end{equation}

Since in the interval $J_{i}^{k}\,{:\!=}\,(r_i+\gamma_k,r_{i+1}-\gamma_k]$ there are no simultaneous jumps of the two environments, we can proceed as in the proof of Proposition 3.1 to obtain

(3.12) \begin{equation} {\mathbb E}\left[Y_N^*(r_{i+1}-\gamma_k)1_{E_{k}}\right]\leq e^{(1+\sigma_N)(r_{i+1}-r_{i})}\left({\mathbb E}[Y_N^*(r_i+\gamma_k)1_{E_{k}}]+N\sum_{u\in J_{i}^k}\Delta\bar{\omega}_k(u)\right).\end{equation}

Moreover, at time $r_{i+1}$, there are two possible contributions to take into account: (i) the contribution of selective arrows arising simultaneously in both environments, and (ii) the contribution of selective arrows arising only in the environment with the larger jump. This leads to

(3.13) \begin{align} &{\mathbb E}\left[Y_N^*(r_{i+1})\,1_{E_{k}}\right]\leq {\mathbb E}[Y_N^*(r_{i+1}-\!)\,1_{E_{k}}](1+\Delta \omega(r_{i+1})\wedge \Delta \bar{\omega}_k(r_{i+1}))\nonumber\\[3pt]&\quad + N|\Delta \omega(r_{i+1})-\Delta \bar{\omega}_k(r_{i+1})|.\end{align}

Using (3.10), (3.11), (3.12), and (3.13), we obtain

\begin{align*}{\mathbb E}[Y_N^*(r_{i+1})\,1_{E_k}]\leq e^{(1+\sigma_N)(r_{i+1}-r_{i})}&\left[{\mathbb E}[Y_N^*(r_{i})\,1_{E_k}]\vphantom{+N\sum_{u\in(r_i,r_{i+1}]}|\Delta \omega(u)-\Delta \bar{\omega}_k(u)|}\right.\\[3pt]&\left.+N\sum_{u\in(r_i,r_{i+1}]}|\Delta \omega(u)-\Delta \bar{\omega}_k(u)|\right](1+\Delta \omega(r_{i+1})).\end{align*}

Iterating this inequality, using that $Y_N^*(0)=0$ , and adding the contribution of the interval $(r_n+\gamma_k,T]$ , we obtain

(3.14) \begin{equation}{\mathbb E}\left[Y_N^*(T)\,1_{E_{k}}\right]\leq N\epsilon_k e^{(1+\sigma_N)T+\sum_{u\in(0,T]}\Delta \omega(u)}.\end{equation}

Using (3.9), (3.14), (3.8), and the definition of $d_T^0$ in (A.2), we obtain for k large enough

\begin{align*} {\mathbb E}\left[d_T^0(Z_N^\omega,Z_N^{\omega_k})\right] &\leq 2nN\left(1-e^{-2N(1+\sigma_N+\theta_N)\gamma_k}\right)\\[3pt]&\quad + 3\sqrt{e^{\epsilon_k}-1}\vee\left(N\epsilon_k\, e^{(1+\sigma_N)T+\sum_{u\in(0,T]}\Delta \omega(u)} \right).\end{align*}

The result follows from letting $k\to\infty$ and using that $\gamma_k\to 0$ and $\epsilon_k\to 0$ as $k\to\infty$ .

Proof of Theorem 2.1 (continuity). If $\omega$ has a finite number of jumps, the result follows directly from Proposition 3.2. In the general case, note that for any $\delta>0$,

(3.15) \begin{align} {\varrho_{{\mathbb D}_{0,T}^{}}^{}}({\mu_N(\omega_k)},{\mu_N(\omega)})&\leq {\varrho_{{\mathbb D}_{0,T}^{}}^{}}\left({\mu_N(\omega_k)},{\mu_N\left(\omega_{k}^{\delta}\right)}\right)\nonumber\\[3pt]&\quad + {\varrho_{{\mathbb D}_{0,T}^{}}^{}}\left({\mu_N\left(\omega_{k}^{\delta}\right)},{\mu_N\left(\omega^\delta\right)}\right)+{\varrho_{{\mathbb D}_{0,T}^{}}^{}}({\mu_N(\omega^{\delta})},{\mu_N(\omega)}), \end{align}

where $\omega^{\delta}$ is as in (3.1) and, similarly, $\omega_{k}^{\delta}(t)\,{:\!=}\, \sum_{u\in[0,t]: \Delta\omega_k(u)\geq \delta}\Delta \omega_k(u)$. Recall the definition of $d_T^\star$ in (A.3). We claim that, for any $\delta\in A_\omega\,{:\!=}\, \{d>0\,{:}\,\Delta \omega(u)\neq d \textrm{ for all } u\in[0,T]\}$, we have

(Claim 1) \begin{equation}d_T^\star(\omega_{k}^{\delta},\omega^\delta)\xrightarrow[k\to\infty]{}0.\end{equation}

Assume that Claim 1 is true and let $\delta\in A_\omega$ . Note that for any $\lambda\in\mathcal{C}^\uparrow_T$ , we have

\begin{align*} &\omega_{k,\delta}(T) \,{:\!=}\, \sum_{u\in[0,T]: \Delta\omega_k(u)\lt \delta}\Delta \omega_k(u)=\omega_k(T)-\omega_{k}^{\delta}(T)=\omega_k(\lambda(T))-\omega_{k}^{\delta}(\lambda(T))\\ &\quad \leq |\omega_k(\lambda(T))-\omega(T)|+|\omega(T)-\omega^\delta(T)|+|\omega^\delta(T)-\omega_{k}^{\delta}(\lambda(T))|\\ &\quad \leq d_T^0(\omega,\omega_k)+ \omega_\delta(T)+d_T^0(\omega_{k}^{\delta},\omega^\delta)\leq d_T^\star(\omega,\omega_k)+ \omega_\delta(T)+d_T^\star(\omega_{k}^{\delta},\omega^\delta), \end{align*}

where we used the definition of $d_T^0$ in (A.2) and then Lemma A.1. Combining this with Claim 1 and Proposition 3.1, we obtain

\begin{equation*} \limsup_{k\to\infty}{\varrho_{{\mathbb D}_{0,T}^{}}^{}}({\mu_N(\omega_k)},{\mu_N\left(\omega_{k}^{\delta}\right)})\leq N\omega_\delta(T) e^{(1+\sigma_N)T+\omega(T)}.\end{equation*}

Proposition 3.2 and Claim 1 in turn yield $\limsup_{k\to\infty}{\varrho_{{\mathbb D}_{0,T}^{}}^{}}\left({\mu_N\left(\omega_{k}^{\delta}\right)},{\mu_N(\omega^{\delta})}\right)=0$. Hence, letting $k\to\infty$ in (3.15) and using Proposition 3.1, we obtain

\begin{align*}\limsup_{k\to\infty}{\varrho_{{\mathbb D}_{0,T}^{}}^{}}({\mu_N(\omega_k)},{\mu_N(\omega)})\leq 2N\omega_\delta(T) e^{(1+\sigma_N)T+\omega(T)}.\end{align*}

The previous inequality holds for any $\delta\in A_\omega$. Since $\omega$ has at most countably many jump sizes, $\inf A_\omega=0$. Hence, letting $\delta\to 0$ with $\delta\in A_\omega$ in the previous inequality yields the result.

It remains to prove Claim 1. Let $\delta\in A_\omega$ . Since $d_T^\star(\omega_k,\omega)\to 0$ as $k\to\infty$ , we see from the definition of $d_T^\star$ in (A.3) that there exists $(\lambda_k)_{k\in{\mathbb N}}$ with $\lambda_k\in\mathcal{C}_T^\uparrow$ such that

\begin{align*}\lVert \lambda_k\rVert_T^0\xrightarrow[k\to\infty]{}0\quad\textrm{ and }\quad \epsilon_k\,{:\!=}\, \sum_{u\in[0,T]}|\Delta(\omega_k\circ\lambda_k)(u)-\Delta \omega(u)| \xrightarrow[k\to\infty]{}0.\end{align*}

Set $\bar{\omega}_k\,{:\!=}\, \omega_k\circ\lambda_k$ . Clearly, $\Delta\bar{\omega}_k(u)\leq \epsilon_k+\Delta \omega(u)$ and $\Delta\omega (u)\leq \epsilon_k+\Delta \bar{\omega}_k(u)$ , $u\in[0,T].$ Therefore,

\begin{align*}\omega_{k}^{\delta} (\lambda_k(t)) -\omega^\delta(t)=\sum_{\substack{u\in[0,t]\\ \Delta\bar{\omega}_k(u)\geq \delta}}\!\!\!\!\Delta \bar{\omega}_k(u)-\sum_{\substack{u\in[0,t]\\ \Delta\omega(u)\geq \delta}}\!\!\!\!\Delta \omega(u)\leq \sum_{\substack{u\in[0,t]\\\Delta\omega(u)\geq \delta-\epsilon_k^{}}}\!\!\!\!\!\!\!\!\!\!\Delta \bar{\omega}_k(u)-\sum_{\substack{u\in[0,t]\\ \Delta\omega(u)\geq \delta}}\!\!\!\!\!\Delta \omega(u)\\ =\sum_{\substack{u\in[0,t]\\ \Delta\omega(u)\geq \delta-\epsilon_k}}\!\!\!\!\!\!\!(\Delta \bar{\omega}_k(u)-\Delta \omega(u))+\sum_{\substack{u\in[0,t]\\ \Delta\omega(u)\in[\delta-\epsilon_k,\delta)}}\!\!\!\!\!\!\!\!\Delta \omega(u)\leq d_T^\star(\omega_k,\omega)+\sum_{\substack{u\in[0,T]\\ \Delta\omega(u)\in[\delta-\epsilon_k,\delta)}}\!\!\! \!\!\!\!\!\!\!\Delta \omega(u).\end{align*}

Similarly, we obtain

\begin{align*}& \omega^\delta (t)-\omega_{k}^{\delta}(\lambda_k(t))=\sum_{\substack{u\in[0,t]\\ \Delta\omega(u)\geq \delta}}\!\!\!\!\Delta \omega(u)-\sum_{\substack{u\in[0,t]\\ \Delta\bar{\omega}_k(u)\geq \delta}}\!\!\!\!\Delta \bar{\omega}_k(u)\\[3pt] &\quad \leq \sum_{\substack{u\in[0,t]\\ \Delta\omega(u)\in[\delta,\delta+\epsilon_k)}}\!\!\!\!\!\!\!\!\!\!\Delta \omega(u)+\sum_{\substack{u\in[0,t]\\ \Delta\bar{\omega}_k(u)\geq \delta}}\!\!\!\!(\Delta \omega(u)-\Delta \bar{\omega}_k(u))\leq \sum_{\substack{u\in[0,T]\\ \Delta\omega(u)\in[\delta,\delta+\epsilon_k)}}\!\!\!\!\!\!\!\!\Delta \omega(u)+d_T^\star(\omega_k,\omega).\end{align*}

Thus, using the definition of $d_T^0$ in (A.2), we deduce that

\begin{align*}d_T^0(\omega_k^\delta,\omega^\delta)\leq d_T^\star(\omega_k,\omega)+\sum_{\substack{u\in[0,T]\\ \Delta\omega(u)\in(\delta-\epsilon_k,\delta+\epsilon_k)}}\!\!\!\!\!\!\!\!\!\!\Delta \omega(u).\end{align*}

Since $\delta\in A_\omega$ , letting $k\to\infty$ in the previous inequality yields $\lim_{k\to\infty }d_T^0(\omega_k^\delta,\omega^\delta)=0$ . Since $\omega^\delta$ has a finite number of jumps, Claim 1 follows using Lemma A.1.

3.2. Results related to Section 2.3: the Wright–Fisher process as a large-population limit

We start this section by proving that the SDE (1.3) is well-posed.

Proposition 3.3. (Existence and uniqueness.) Let $\sigma,\theta\geq 0$ , $\nu_0,\nu_1\in[0,1]$ with $\nu_0+\nu_1=1$ . Let J be a pure-jump subordinator with Lévy measure $\mu$ supported in (0,1), and let B be a standard Brownian motion independent of J. Then, for any $x_0\in[0,1]$ , there is a pathwise unique strong solution $(X(t))_{t\geq 0}$ to the SDE (1.3) such that $X(0)=x_0$ . Moreover, $X(t)\in[0,1]$ for all $t\geq 0$ .

Remark 3.1. The Wright–Fisher diffusion defined via the SDE (1.3) with $\theta=0$ corresponds to [Reference González Casanova, Spanò and Wilke-Berenguer20, Equation 10] with $K_y$ , $y\in(0,1)$ , being a random variable that takes the value 1 with probability $1-y$ and the value 2 with probability y.

Proof. We prove the existence and pathwise uniqueness of strong solutions to (1.3) via [Reference Li and Pu32, Theorems 3.2 and 5.1]. To this end, we first extend (1.3) to an SDE on ${\mathbb R}$ and write it in the form of [Reference Li and Pu32, Equation 2.1]. Note that by the Lévy–Itô decomposition, the pure-jump subordinator J can be expressed as $J(t)=\int_{(0,1)} x\, N(t,{\textrm{d}} x),$ where $N({\textrm{d}} s,{\textrm{d}} x)$ is a Poisson random measure with intensity measure ${\textrm{d}} s\otimes\mu({\textrm{d}} x)$. We define the functions $a,b\,{:}\,{\mathbb R}\to{\mathbb R}$ and $g\,{:}\,{\mathbb R}\times(0,1)\to{\mathbb R}$ via

\begin{align*}a(x)\,{:\!=}\, \sqrt{2x(1-x)},\quad b(x)\,{:\!=}\,\sigma x(1-x)+\theta\nu_0(1-x)-\theta\nu_1 x, \quad g(x,u)\,{:\!=}\, x(1-x)u,\end{align*}

for $x\in[0,1],$ $u\in(0,1)$ ; $a(x)\,{:\!=}\, 0$ , $g(x,u)\,{:\!=}\, 0$ for $x\notin[0,1]$ ; $b(x)\,{:\!=}\, \theta\nu_0$ for $x\lt 0$ and $b(x)\,{:\!=}\, -\theta\nu_1$ for $x>1$ . Thus, Equation (1.3) can be extended to the following SDE on ${\mathbb R}$ :

(3.16) \begin{equation} X(t)=X(0)+\int_0^t a(X(s)){\textrm{d}} B(s)+\int_0^t\int_{(0,1)}g(X(s-\!),u)N({\textrm{d}} s,{\textrm{d}} u)+\int_0^t b(X(s)){\textrm{d}} s.\end{equation}

Thus, any solution $(X(t))_{t\geq 0}$ of (3.16) such that $X(t)\in[0,1]$ for any $t\geq 0$ is a solution to (1.3), and vice versa. Note that the functions a, b, g are continuous. Moreover, $b=b_1- b_2$ , where

\begin{align*}b_1(x)\,{:\!=}\, \theta\nu_0+\sigma (x\wedge 1)_+\quad\text{and}\quad b_2(x)\,{:\!=}\, \theta(x\wedge 1)_+ +\sigma (x\wedge 1)_+^2. \end{align*}

In addition, $b_2$ is non-decreasing. Thus, in order to apply [Reference Li and Pu32, Theorems 3.2 and 5.1], we only need to verify the sufficient conditions (3.a), (3.b), and (5.a) therein. Condition (3.a) in our case amounts to proving that $x\mapsto b_1(x)+\int_{(0,1)}g(x,u)\mu({\textrm{d}} u)$ is Lipschitz continuous. In fact, a straightforward calculation shows that

\begin{align*}|b_1(x)-b_1(y)|+\int_{(0,1)}|g(x,u)-g(y,u)|\mu({\textrm{d}} u)\leq \left(\sigma+\int_{(0,1)}u\mu({\textrm{d}} u)\right)|x-y|,\quad x,y\in{\mathbb R},\end{align*}

and hence (3.a) follows. Condition (3.b) amounts to proving that $x\mapsto a(x)$ is $1/2$-Hölder, which is already shown in the proof of [Reference González Casanova and Spanò19, Lemma 3.6]. Therefore, [Reference Li and Pu32, Theorem 3.2] yields the pathwise uniqueness for (3.16). Condition (5.a) follows from the fact that the functions a, b, $x\mapsto \int_{(0,1)} g(x,u)^2\mu({\textrm{d}} u)$, and $x\mapsto \int_{(0,1)} g(x,u)\mu({\textrm{d}} u)$ are bounded on ${\mathbb R}$. Hence, [Reference Li and Pu32, Theorem 5.1] ensures the existence of a strong solution to (3.16). It remains to show that any solution to (3.16) with $X(0)\in[0,1]$ is such that $X(t)\in[0,1]$ for all $t\geq 0$. Sufficient conditions implying such a result are provided in [Reference Fu and Li17, Proposition 2.1]. The conditions on the diffusion and drift coefficients are satisfied; namely, a is 0 outside [0, 1], and b(x) is positive for $x\leq 0$ and negative for $x\geq 1$. However, the condition on the jump coefficient, $x+g(x,u)\in[0,1]$ for every $x\in{\mathbb R}$, is not fulfilled. Nevertheless, the proof of [Reference Fu and Li17, Proposition 2.1] works without modifications under the alternative condition $x+g(x,u)\in[0,1]$ for $x\in[0,1]$ and $g(x,u)=0$ for $x\notin[0,1]$, which is in turn satisfied. This ends the proof.
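
Although not needed for the proof, an Euler-type discretization makes the structure of (1.3)/(3.16) tangible. The sketch below is illustrative only: it assumes a simple environment (J a compound Poisson process with Lévy measure $\lambda\delta_p$, approximated by at most one jump per time step), and it clips the discretized path to [0, 1], a property the SDE itself guarantees by the proposition above.

```python
import numpy as np

def euler_wf(x0, sigma, theta, nu0, lam, p, T=1.0, dt=1e-4, seed=3):
    """Euler scheme for (1.3) with a simple environment J ~ compound Poisson(lam, delta_p)."""
    rng = np.random.default_rng(seed)
    nu1 = 1.0 - nu0
    x = x0
    for _ in range(int(T / dt)):
        drift = theta * nu0 * (1 - x) - theta * nu1 * x + sigma * x * (1 - x)
        diff = np.sqrt(max(2 * x * (1 - x), 0.0)) * rng.normal(0.0, np.sqrt(dt))
        x += drift * dt + diff
        if rng.random() < lam * dt:         # environmental jump of size p
            x += x * (1 - x) * p
        x = min(max(x, 0.0), 1.0)           # keep the discretized path in [0, 1]
    return x

print(euler_wf(x0=0.3, sigma=1.0, theta=0.5, nu0=0.5, lam=2.0, p=0.3))
```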

Lemma 3.1. The solution to the SDE (1.3) is a Feller process with generator $\mathcal{A}$ satisfying, for all $f\in \mathcal{C}^2([0,1],{\mathbb R})$ ,

\begin{align*}\mathcal{A} f(x)=x(1-x)f''(x)+ (\sigma x(1-x)&+\theta\nu_0(1-x)-\theta\nu_1 x) f'(x)\\[3pt]&\quad +\int_{(0,1)}\left(f\left(x+ x(1-x)u\right)-f(x)\right)\mu({\textrm{d}} u).\end{align*}

Moreover, $\mathcal{C}^\infty([0,1],{\mathbb R})$ is an operator core for $\mathcal{A}$ .

Proof. Since pathwise uniqueness implies weak uniqueness (see [Reference Barczy, Li and Pap6, Theorem 1]), we infer from [Reference Kurtz30, Corollary 2.16] that the martingale problem associated to $\mathcal{A}$ in $\mathcal{C}^{2}([0,1])$ is well-posed. Moreover, an inspection of the proof shows that this is also true in $\mathcal{C}^{\infty}([0,1])$ . Using [Reference Van Casteren37, Proposition 2.2], we deduce that X is Feller. The fact that $\mathcal{C}^\infty([0,1])$ is a core then follows from [Reference Van Casteren37, Theorem 2.5].

Now we will prove the first part of the main result of Section 2.3, i.e. the annealed convergence of a sequence of Moran models towards the solution to the SDE (1.3).

Proof of Theorem 2.2(1) (annealed convergence). Let $\mathcal{A}_N^*$ and $\mathcal{A}$ be the infinitesimal generators of the processes $(X_N(t))_{t\geq 0}$ and $(X(t))_{t\geq 0}$, respectively. Note that $(X_N(t))_{t\geq 0}$ has state space

(3.17) \begin{equation} E_N\,{:\!=}\, \{k/N\,{:}\,k\in[N]_0\}.\end{equation}

We will prove that, for all $f\in\mathcal{C}^\infty([0,1],{\mathbb R})$ ,

(Claim 2) \begin{equation}\sup\limits_{x\in E_N}|\mathcal{A}_N^* f(x)-\mathcal{A} f(x)|\xrightarrow[N\to\infty]{} 0.\end{equation}

Provided Claim 2 is true, since X is Feller and $\mathcal{C}^\infty([0,1],{\mathbb R})$ is an operator core for $\mathcal{A}$ (see Lemma 3.1), the result follows from applying [Reference Kallenberg24, Theorem 17.25]. Thus, it remains to prove Claim 2. To this end, we write $\mathcal{A}$ as $\mathcal{A}^{1}+\mathcal{A}^{2}+\mathcal{A}^3+\mathcal{A}^4$ , where

\begin{align*}\mathcal{A}^1f(x) &\,{:\!=}\, x(1-x)f{''}(x),\\[3pt]\mathcal{A}^2f(x) &\,{:\!=}\, (\sigma x(1-x)+\theta\nu_0(1-x)-\theta\nu_1 x) f{'}(x),\\[3pt]\mathcal{A}^3f(x) &\,{:\!=}\, \int_{(0,\varepsilon_N)}\left(f\left(x+ x(1-x)u\right)-f(x)\right)\mu({\textrm{d}} u),\\[3pt]\mathcal{A}^4f(x) &\,{:\!=}\, \int_{(\varepsilon_N,1)}\left(f\left(x+ x(1-x)u\right)-f(x)\right)\mu({\textrm{d}} u).\end{align*}

We also write $\mathcal{A}_N^*=\mathcal{A}_N^{1}+\mathcal{A}_N^{2}+\mathcal{A}_N^3+\mathcal{A}_N^4$ , where

\begin{align*}\mathcal{A}_N^1f(x) &\,{:\!=}\, N^2x(1-x)\left[\Delta_{\frac1N}f(x)+\Delta_{-\frac1N}f(x)\right],\\[3pt]\mathcal{A}_N^2f(x) &\,{:\!=}\, N^2(\sigma_N x(1-x)+\theta_N\nu_0(1-x))\left[\Delta_{\frac1N}f(x)\right]+N^2\theta_N\nu_1x\left[\Delta_{-\frac1N}f(x)\right],\\[3pt]\mathcal{A}_N^3f(x) &\,{:\!=}\, \int_{(0,\varepsilon_N^{})}\left({\mathbb E}\left[f\left(x+\xi_N(x,u)\right)\right]-f(x)\right)\mu({\textrm{d}} u),\\[3pt]\mathcal{A}_N^4f(x) &\,{:\!=}\, \int_{(\varepsilon_N^{},1)}\left({\mathbb E}\left[f\left(x+\xi_N(x,u)\right)\right]-f(x)\right)\mu({\textrm{d}} u),\end{align*}

where $\Delta_hf(x)\,{:\!=}\, f(x+h)-f(x)$ , and $\xi_N(x,u)\,{:\!=}\, \mathcal{H}(N,N(1-x),B_{Nx}(u))/N$ , with $\mathcal{H}(N,N(1-x),k)\sim\textrm{Hyp}({N},{N(1-x)},{k})$ , and $B_{Nx}(u)\sim\textrm{Bin}({Nx},{u})$ being independent; $\varepsilon_N>0$ will be chosen later in an appropriate way.

Let $f\in\mathcal{C}^\infty([0,1],{\mathbb R})$ and note that

(3.18) \begin{equation} \sup_{x\in E_N} |\mathcal{A}_N^* f(x)-\mathcal{A} f(x)|\leq \sum_{i=1}^4 \sup_{x\in E_N} |\mathcal{A}_N^i f(x)-\mathcal{A}^i f(x)|.\end{equation}

Taylor expansions of order three around x for $f(x+1/N)$ and $f(x-1/N)$ yield

(3.19) \begin{equation} \sup\limits_{x\in E_N}|\mathcal{A}_N^1 f(x)-\mathcal{A}^1 f(x)|\leq \frac{\lVert f'''\rVert_\infty}{3N}.\end{equation}

Similarly, the triangle inequality and appropriate Taylor expansions of order two yield

(3.20) \begin{equation} \sup\limits_{x\in E_N}|\mathcal{A}_N^2 f(x)-\mathcal{A}^2 f(x)|\leq {\frac{(N\sigma_N+N\theta_N) \lVert f''\rVert_\infty}{2N}}+ (|\sigma-N\sigma_N|+|\theta-N\theta_N|)\lVert f'\rVert_{\infty}.\end{equation}

Since $N\sigma_N\to\sigma$ and $N\theta_N\to\theta$ , the right-hand side in (3.20) converges to 0 as $N\to\infty$ . In addition, since ${\mathbb E}[\xi_N(x,u)]=x(1-x)u$ , we have

\begin{align*}|\mathcal{A}^3_Nf(x)|\leq \lVert f'\rVert_\infty\int_{(0,\varepsilon_N)} u\mu({\textrm{d}} u),\quad x\in[0,1],\end{align*}

and hence,

(3.21) \begin{equation} \sup\limits_{x\in E_N}|\mathcal{A}_N^3 f(x)-\mathcal{A}^3 f(x)|\leq 2 \lVert f'\rVert_\infty\int_{(0,\varepsilon_N^{})} u\,\mu({\textrm{d}} u).\end{equation}

For the last term, note first that

\begin{align*} \left|{\mathbb E}\left[f\left(x+\xi_N(x,u)\right)-f(x+x(1-x)u)\right]\right| &\leq \lVert f'\rVert_{\infty}{\mathbb E}\left[\left|\xi_N(x,u)-x(1-x)u\right|\right]\\ &\leq \lVert f'\rVert_{\infty}\sqrt{{\mathbb E}\left[\left(\xi_N(x,u)-x(1-x)u\right)^2\right]}\leq\lVert f'\rVert_{\infty} \sqrt{\frac{u}{N}},\end{align*}

where in the last inequality we used that

(3.22) \begin{equation} {\mathbb E}\left[\left(\xi_N(x,u)-x(1-x)u\right)^2\right]=\frac{x(1-x)^2u(1-u)}{N}+\frac{Nx^2(1-x)u^2}{N^2(N-1)}\leq \frac{u}{N},\end{equation}

which is obtained from standard properties of the hypergeometric and binomial distributions. Hence,

(3.23) \begin{equation} \sup\limits_{x\in E_N}|\mathcal{A}_N^4 f(x)-\mathcal{A}^4 f(x)|\leq \lVert f'\rVert_\infty\int_{(\varepsilon_N^{},1)} \sqrt{\frac{u}{N}}\,\mu({\textrm{d}} u)\leq \frac{\lVert f'\rVert_\infty}{\sqrt{N\varepsilon_N}}\int_{(\varepsilon_N^{},1)} u\,\mu({\textrm{d}} u) .\end{equation}

Now, choose $\varepsilon_N\,{:\!=}\, 1/\sqrt{N}$ . Since $\int_{(0,1)} u\mu({\textrm{d}} u)<\infty$ , Claim 2 follows from plugging (3.19), (3.20), (3.21), and (3.23) into (3.18) and letting $N\to\infty$ .
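The two-stage sampling behind $\xi_N(x,u)$ used in the proof above can be made concrete. The following sketch (names ours) mirrors its definition and empirically confirms ${\mathbb E}[\xi_N(x,u)]=x(1-x)u$: the $Nx$ fit individuals shoot $B_{Nx}(u)$ arrows, and the hypergeometric draw counts how many arrows hit unfit targets, each such hit increasing the number of fit individuals by one.

```python
import numpy as np

def sample_xi_N(N, x, u, rng):
    """Sample xi_N(x, u) = Hyp(N, N(1-x), Bin(Nx, u)) / N."""
    b = rng.binomial(int(N * x), u)                    # arrows shot by fit individuals
    hits = rng.hypergeometric(int(N * (1 - x)), int(N * x), b) if b > 0 else 0
    return hits / N                                    # unfit targets replaced by fit ones

rng = np.random.default_rng(4)
N, x, u = 1000, 0.3, 0.2
draws = [sample_xi_N(N, x, u, rng) for _ in range(10_000)]
print(np.mean(draws), x * (1 - x) * u)                 # sample mean vs x(1-x)u
```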

Before embarking on the proof of the second part of Theorem 2.2, we prove the following estimates for the Moran model with null environment.

Lemma 3.2. For any $x\in E_N$ (see (3.17) for its definition) and $t\geq 0$ , we have

\begin{align*}{\mathbb E}\left[\left(X_N^{{\textbf{0}}}(t)-x\right)^2\mid X_N^{{\textbf{0}}}(0)=x\right]\leq \left(\frac12 + N(\sigma_N+3\theta_N)\right)t,\end{align*}

and

\begin{align*}-N\theta_N\nu_1 t\leq {\mathbb E}\left[X_N^{\textbf{0}}(t)-x\mid X_N^{{\textbf{0}}}(0)=x\right]\leq N(\sigma_N+\theta_N\nu_0) t.\end{align*}

Proof. Fix $x\in E_N$ and consider the functions $f_x,g_x\,{:}\,E_N\to[0,1]$ defined via $f_x(z)\,{:\!=}\,$ $(z-x)^2$ and $g_x(z)\,{:\!=}\, z-x$ , $z\in E_N$ . The process $X_N^{\textbf{0}}$ is a Markov chain with generator $\mathcal{A}_N^{\star,0}\,{:\!=}\, \mathcal{A}_N^1+\mathcal{A}_N^2$ , where $\mathcal{A}_N^1$ and $\mathcal{A}_N^2$ are defined in the proof of Theorem 2.2. Moreover, for every $z\in E_N$ , we have

\begin{align*} \mathcal{A}_N^{\star,0} f_x(z) & =2z(1-z)\\[3pt] &\quad +N\left[(\sigma_N z+\theta_N \nu_0)(1-z)\left(2(z-x)+\frac1N\right)+\theta_N\nu_1 z\left(2(x-z)+\frac1N\right)\right]\\[3pt] &\quad \leq \frac12+N\left[3\left(\frac{\sigma_N}{4}+\theta_N \nu_0\right)+3\theta_N\nu_1\right]\leq \frac12 + N(\sigma_N+3\theta_N), \end{align*}

and

\begin{align*} \mathcal{A}_N^{\star,0} g_x(z) =N\left[(\sigma_N z+\theta_N \nu_0)(1-z)-\theta_N\nu_1 z\right]\in[\!-N\theta_N\nu_1,N(\sigma_N+\theta_N\nu_0)]. \end{align*}

Hence, Dynkin’s formula applied to $X_N^{\textbf{0}}$ with $f_x$ leads to

\begin{align*} {\mathbb E}\left[\left(X_N^{\textbf{0}}(t)-x\right)^2\mid X_N^{{\textbf{0}}}(0)=x\right] &=\int_0^t {\mathbb E}\left[\mathcal{A}_N^{\star,0} f_x(X_N^{\textbf{0}}(s))\mid X_N^{{\textbf{0}}}(0)=x\right]{\textrm{d}} s\\ &\leq \left(\frac12 + N(\sigma_N+3\theta_N)\right)t. \end{align*}

Similarly, applying Dynkin’s formula to $X_N^{\textbf{0}}$ with $g_x$, we obtain

\begin{align*} {\mathbb E}\left[X_N^{\textbf{0}}(t)-x\mid X_N^{{\textbf{0}}}(0)=x\right] &=\int_0^t {\mathbb E}\left[\mathcal{A}_N^{\star,0} g_x(X_N^{\textbf{0}}(s))\mid X_N^{{\textbf{0}}}(0)=x\right]{\textrm{d}} s\\[3pt]&\in[\!-N\theta_N\nu_1 t,N(\sigma_N+\theta_N\nu_0)t], \end{align*}

which ends the proof.

Proposition 3.4. (Quenched tightness.) Assume that $N \sigma_N\rightarrow \sigma$ and $N\theta_N\rightarrow \theta$ as $N\to\infty$ . For any $\omega\in{\mathbb D}^\star$ , the sequence $(X_N^\omega)_{N\geq 1}$ is tight.

Proof. Let $(\mathcal{F}_s^N)_{s\geq 0}$ denote the natural filtration associated to the process $X_N^\omega$ . To prove the tightness of the sequence $(X_N^\omega)_{N\geq 1}$ , we use [Reference Bansaye, Kurtz and Simatos5, Theorem 1]. For this we need to show that the following conditions hold:

  (A1) For each $T,\varepsilon>0$, there exists a compact set $K\subset{\mathbb R}$ such that

    \begin{align*}\liminf_{N\to\infty} {\mathbb P}\left(X_N^\omega(t)\in K,\, \forall t\leq T\right)\geq 1-\varepsilon.\end{align*}
  (A2) There exist $\alpha>0$ and non-decreasing, càdlàg processes $F_N$, F such that $F_N$ is $\mathcal{F}_0$-measurable, $F_N$ converges in distribution to F, and for any $N\geq 1$ and every $0\leq s\leq t$,

    \begin{align*}{\mathbb E}\left[1\wedge |X_N^\omega(t)-X_N^\omega(s)|^{\alpha} \right]\leq F_N(t)-F_N(s).\end{align*}

Since, for all $t\geq 0$ and $N\geq 1$ , $X_N^\omega(t)\in E_N \subset [0,1]$ (see (3.17) for the definition of $E_N$ ), Condition (A1) is satisfied. Now, we claim that there are constants $c,C>0$ , independent of N, such that

(Claim 3) \begin{equation}{\mathbb E}\left[(X_N^\omega(t)-X_N^\omega(s))^2\mid \mathcal{F}_s^N\right]\leq c\sum_{u\in[s,t]}\Delta \omega(u)+ C (t-s),\quad\textrm{for all $0\leq s\leq t$}.\end{equation}

If Claim 3 is true, then Condition (A2) is satisfied with $\alpha=2$ and $F_N(t)=F(t)=c\sum_{u\in[0,t]}\Delta \omega(u)+Ct$ , and the result follows from [Reference Bansaye, Kurtz and Simatos5, Theorem 1]. The rest of the proof is devoted to proving Claim 3.

For $x\in E_N$ and $t\geq 0$ , we set $\psi_x(\omega,t)\,{:\!=}\, {\mathbb E}_x[(X_N^\omega(t)-x)^2].$ For $s\geq 0$ , we set $\omega_s(\!\cdot\!)\,{:\!=}\,$ $\omega(s+\cdot\!)$ . From the definition of $X_N^\omega$ , it follows that for any $0\leq s\lt t$ ,

(3.24) \begin{equation} {\mathbb E}\left[(X_N^\omega(t)-X_N^\omega(s))^2\Big| \mathcal{F}_s^N\right]=\psi_{X_N^\omega(s)}(\omega_s,t-s).\end{equation}

Let $0\leq s\lt t$ . We split the proof of Claim 3 into three cases.

Case 1: $\omega$ has no jumps in (s, t]. In particular, $\omega_s$ has no jumps in $(0,t-s]$ . Hence, restricted to $[0,t-s]$ , $X_N^{\omega_s}$ has the same distribution as $X_N^{\textbf{0}}$ . Using Lemma 3.2 with $x=X_N^\omega(s)$ and plugging the result into (3.24), we infer that Claim 3 holds for any $c\geq 1$ and $C\geq C_1\,{:\!=}\, 1/2+\sup_{N\in{\mathbb N}}(N(\sigma_N+3\theta_N))$ .

Case 2: $\omega$ has n jumps in (s, t]. Let $t_1,\ldots,t_n\in(s,t]$ be the jump times of $\omega$ in (s, t] in increasing order. We set $t_0\,{:\!=}\, s$ and $t_{n+1}=t$ . For any $i\in[n+1]$ and any $r\in(t_{i-1},t_{i})$ , $\omega$ has no jumps in $(t_{i-1},r]$ . In particular, $(t_{i-1},r]$ falls into Case 1. Using Claim 3 in $(t_{i-1},r]$ and letting $r\to t_i$ , we obtain

\begin{align*}{\mathbb E}\left[(X_N^\omega(t_i -\!)-X_N^\omega(t_{i-1}))^2\mid \mathcal{F}_{t_{i-1}}^N\right]\leq C_1(t_i-t_{i-1}).\end{align*}

Moreover,

\begin{align*}{\mathbb E}\left[(X_N^\omega(t_i)-X_N^\omega(t_{i}-\!))^2\mid \mathcal{F}_{t_{i}-}^N\right]\leq {\mathbb E}\left[\left(\frac{B_{N}(\Delta \omega(t_i))}{N}\right)^2\right]\leq \Delta \omega(t_i),\end{align*}

where $B_{N}(\Delta \omega(t_i)) \sim \textrm{Bin}({N},{\Delta \omega(t_i)})$; the last bound holds since, for $B\sim \textrm{Bin}({N},{p})$, ${\mathbb E}[(B/N)^2]=p(1-p)/N+p^2\leq p$. Using the two previous inequalities and the tower property of the conditional expectation, we get

(3.25) \begin{equation} {\mathbb E}\left[(X_N^\omega(t_i)-X_N^\omega(t_{i-1}))^2\mid \mathcal{F}_{s}^N\right]\leq 2C_1(t_i-t_{i-1})+2\Delta\omega(t_i).\end{equation}

Now, note that

\begin{align*} (X_N^\omega(t)-X_N^\omega(s))^2&=\left(\sum_{i=0}^n( X_N^\omega(t_{i+1})-X_N^\omega(t_i))\right)^2 \\&=\sum_{i=0}^n( X_N^\omega(t_{i+1})-X_N^\omega(t_i))^2+2\sum_{i=0}^n( X_N^\omega(t_{i+1})-X_N^\omega(t_i))(X_N^\omega(t_i)-X_N^\omega(s)).\end{align*}

Using Equation (3.25), we see that

\begin{align*}{\mathbb E}\left[\sum_{i=0}^n(X_N^\omega(t_{i+1})-X_N^\omega(t_{i}))^2\mid \mathcal{F}_{s}^N\right]\leq 2C_1(t-s)+2\sum_{i=1}^n\Delta\omega(t_i).\end{align*}

Moreover, we have

\begin{align*} {\mathbb E}\left[(X_N^\omega(t_{i+1})-X_N^\omega(t_{i}))(X_N^\omega(t_{i})-X_N^\omega(s))\mid \mathcal{F}_{t_i}^N\right]=\varphi_{X_N^\omega(s),X_N^\omega(t_i)}(\omega_{t_i},t_{i+1}-t_{i}),\end{align*}

where for $x,y\in E_N$ and $t\geq 0$ we set $\varphi_{x,y}(\omega,t)\,{:\!=}\, (y-x){{\mathbb E}[X_N^\omega(t)-y \mid X_N^\omega(0)=y]}$ . Since, for any $r\in(t_i,t_{i+1})$ , $\omega_{t_i}$ has no jumps in $(0,r-t_i]$ , we can use Lemma 3.2 to infer that for any $x,y\in E_N$ ,

\begin{align*}\varphi_{x,y}(\omega_{t_i}, r-t_i)\leq N((\sigma_N+\theta_N\nu_0)\vee\theta_N\nu_1)(r-t_i).\end{align*}

Note that $(y-x){\mathbb E}_y[X_N^{\omega_{t_i}}(t_{i+1}-t_i)-X_N^{\omega_{t_i}}((t_{i+1}-t_i)-\!)]\leq \Delta \omega(t_{i+1})$ . Hence, letting $r\to t_{i+1}$ , we get

\begin{align*}\varphi_{x,y}(\omega_{t_i}, t_{i+1}-t_i)\leq N((\sigma_N+\theta_N\nu_0)\vee\theta_N\nu_1)(t_{i+1}-t_i)+\Delta\omega(t_{i+1}),\end{align*}

for any $x,y\in E_N$ , hence in particular for $x=X_N^\omega(s)$ and $y=X_N^\omega(t_i)$ . Altogether, we obtain

\begin{align*}{\mathbb E}\left[(X_N^\omega(t)-X_N^\omega(s))^2\mid \mathcal{F}_{s}^N\right]\leq C_2(t-s)+3\sum_{i=1}^n\Delta\omega(t_i),\end{align*}

where $C_2\,{:\!=}\, 2C_1+\sup_{N\in{\mathbb N}}N((\sigma_N+\theta_N\nu_0)\vee\theta_N\nu_1)$ . Hence, Claim 3 holds for any $C\geq C_2$ and $c\geq 3$ .

Case 3: $\omega$ has infinitely many jumps in (s, t]. For any $\delta>0$, we consider $\omega^\delta$ as in (3.1) and couple the processes $X_N^\omega$ and $X_N^{\omega^\delta}$ as in the proof of Proposition 3.1. Note that $\omega^\delta$ has only a finite number of jumps in any compact interval; thus $\omega^\delta$ falls into Case 2. Moreover, we have

\begin{align*} \psi_x(\omega,t) &\leq 2{\mathbb E}_x[(X_N^\omega(t)-X_N^{\omega^\delta}(t))^2]+2{\mathbb E}_x\left[(X_N^{\omega^\delta}(t)-x)^2\right]\\[3pt] & \leq 2{\mathbb E}_x[|X_N^\omega(t)-X_N^{\omega^\delta}(t)|]+2{\mathbb E}_x\left[(X_N^{\omega^\delta}(t)-x)^2\right]\\[3pt] & \leq 2e^{N\sigma_N t+\omega(t)} \sum_{\substack{u\in[0,t]\\ \Delta \omega(u)<\delta}}\!\!\Delta \omega(u)+2{\mathbb E}_x\left[(X_N^{\omega^\delta}(t)-x)^2\right],\end{align*}

where in the last inequality we used Proposition 3.1. Now, using Claim 3 for $X_N^{\omega^\delta}$ and the previous inequality, we obtain

\begin{align*}{\mathbb E}\left[(X_N^\omega(t)-X_N^\omega(s))^2\mid \mathcal{F}_{s}^N\right]\leq e^{N\sigma_N (t-s)+\omega(t-s)}\!\!\!\!\sum_{\substack{u\in[s,t]\\ \Delta \omega(u)<\delta}}\!\!\!\!\!\Delta \omega(u)+2C_2(t-s)+6\!\!\sum_{u\in[s,t]}\Delta\omega(u).\end{align*}

We let $\delta\to 0$ and conclude that Claim 3 holds for any $C\geq 2C_2$ and $c\geq 6$ .

Now we proceed to prove the quenched convergence of the sequence of Moran models to the Wright–Fisher diffusion, under the assumption that the environment is simple.

Proof of Theorem 2.2(2) (quenched convergence). Let $B\,{:\!=}\, (B(t))_{t\geq 0}$ be a standard Brownian motion. We denote by $X^{\textbf{0}}$ the unique strong solution to (1.3) associated to B and the null environment. Theorem 2.2(1) implies in particular that $X_N^{\textbf{0}}$ converges in distribution to $X^{\textbf{0}}$ as $N\to\infty$.

Now, assume that $\omega\neq{\textbf{0}}$ is simple. We denote by $T_\omega$ the (discrete but possibly infinite) set of jump times of $\omega$ in $(0,\infty)$. Moreover, for $0<i<|T_\omega|+1$, we denote by $t_i\,{:\!=}\, t_i(\omega)\in T_\omega$ the time of the ith jump of $\omega$. We set $t_0 \,{:\!=}\, 0$ and $t_{|T_\omega|+1} \,{:\!=}\, \infty$. We need to prove that

\begin{align*}(X_N^\omega(t))_{t\geq 0}\xrightarrow[N\to\infty]{(d)} (X^\omega(t))_{t\geq 0}.\end{align*}

Recall that the process $X^\omega$ starting at $x_0$ can be constructed as follows:

  (i) $X^\omega(0)=x_0$.

  (ii) For $i\in{\mathbb N}$ with $i\leq |T_\omega|+1$, the restriction of $X^{\omega}$ to the interval $(t_{i-1},t_i)$ is given by a version of $X^{\textbf{0}}$ started at $X^\omega(t_{i-1})$.

  (iii) For $0<i<|T_\omega|+1$, conditionally on $X^\omega(t_i-\!)$,

    \begin{align*}X^\omega(t_i)=X^\omega(t_i-\!)+X^\omega(t_i-\!)(1-X^\omega(t_i-\!))\Delta\omega(t_i).\end{align*}

Since the sequence $(X_N^\omega)_{N\in{\mathbb N}}$ is tight (see Proposition 3.4), it is enough to prove the convergence at the level of the finite-dimensional distributions. More precisely, we will prove by induction on $i\in{\mathbb N}$ with $i\leq |T_\omega|+1$ that for any finite set $I\subset[0,t_i)$ , we have

\begin{align*}((X_N^\omega(t))_{t\in I},X_N^\omega(t_i-\!))\xrightarrow[N\to\infty]{(d)} ((X^\omega(t))_{t\in I},X^\omega(t_i-\!)).\end{align*}

For $i=|T_\omega|+1<\infty$ we remove the components $X_N^\omega(t_i-\!)$ and $X^\omega(t_i-\!)$ since they do not make sense. Since $X_N^\omega(t_1-\!)=X_N^{\textbf{0}}(t_1)$ and $X^\omega(t_1-\!)=X^{\textbf{0}}(t_1)$ almost surely, the result for $i=1$ follows from Theorem 2.2(1). Now, assume that the result is true for some $i<|T_\omega|+1$ and let $I\subset(0,t_{i+1})$ . Without loss of generality we assume that $I=\{s_1,\ldots,s_k, t_i,r_1,\ldots, r_m\}$ , with $s_1<\cdots<s_k<t_i<r_1<\cdots<r_m$ . We also assume that $i<|T_\omega|$ ; the other case, i.e. $i=|T_\omega|<\infty$ , follows by an analogous argument.

Let $F\,{:}\,[0,1]^{k+1}\to{\mathbb R}$ be a Lipschitz function with $\lVert F\rVert_{\textrm{BL}}\leq 1$ (see (A.5) for the definition of $\lVert \cdot\rVert_{\textrm{BL}}$ ). Note that

\begin{align*} {\mathbb E}\left[F\left(\left(X_N^\omega\left(s_j\right)\right)_{j=1}^k,X_N^\omega(t_i)\right)\right] ={\mathbb E}\left[F(\left(X_N^\omega\left(s_j\right)\right)_{j=1}^k,X_N^\omega(t_i-\!)+\xi_N(X_N^\omega(t_i-\!),\Delta\omega(t_i)))\right],\end{align*}

where for $x\in E_N$ (see (3.17) for the definition of $E_N$ ) and $u\in(0,1)$ , we let $\xi_N(x,u)\,{:\!=}\, \mathcal{H}(N,N(1-x),B_{Nx}(u))/N$ with $\mathcal{H}(N,N(1-x),k)\sim\textrm{Hyp}({N},{N(1-x)},{k})$ , $k\in[N]_0$ , and $B_{Nx}(u)\sim\textrm{Bin}({Nx},{u})$ being independent of each other and independent of $X_N^\omega$ . Now, set

\begin{align*} D_N &\,{:\!=}\, {\mathbb E}\left[F(\left(X_N^\omega\left(s_j\right)\right)_{j=1}^k,X_N^\omega(t_i-\!)+\xi_N(X_N^\omega(t_i-\!),\Delta\omega(t_i)))\right]\\ &\quad -{\mathbb E}\left[F(\left(X_N^\omega\left(s_j\right)\right)_{j=1}^k,X_N^\omega(t_i-\!)+X_N^\omega(t_i-\!)(1-X_N^\omega(t_i-\!))\Delta\omega(t_i))\right].\end{align*}

Using that $\lVert F\rVert_{\textrm{BL}}\leq 1$ and (3.22), we see that $|D_N| \leq\sqrt{\Delta\omega(t_i)/N}\to 0$ as $N\to\infty$ . Therefore, the induction hypothesis yields

\begin{align*} &{\mathbb E} \left[F\left(\left(X_N^\omega\left(s_j\right)\right)_{j=1}^k,X_N^\omega(t_i)\right)\right]\\ &\quad=D_N+{\mathbb E}\left[F(\left(X_N^\omega\left(s_j\right)\right)_{j=1}^k,X_N^\omega(t_i-\!)+X_N^\omega(t_i-\!)(1-X_N^\omega(t_i-\!))\Delta\omega(t_i))\right]\\ &\quad\xrightarrow[N\to\infty]{}\,{\mathbb E}\left[F((X^\omega\left(s_j\right))_{j=1}^k,X^\omega(t_i-\!)+X^\omega(t_i-\!)(1-X^\omega(t_i-\!))\Delta\omega(t_i))\right]\\ &\quad={\mathbb E}\left[F\left(\left(X^\omega\left(s_j\right)\right)_{j=1}^k,X^\omega(t_i)\right)\right].\end{align*}

Therefore,

(3.26) \begin{equation}\left(\left(X_N^\omega\left(s_j\right)\right)_{j=1}^k,X_N^\omega(t_i)\right)\xrightarrow[N\to\infty]{(d)} \left(\left(X^\omega\left(s_j\right)\right)_{j=1}^k,X^\omega(t_i)\right).\end{equation}

Let $G\,{:}\,[0,1]^{k+m+2}\to{\mathbb R}$ be a Lipschitz function with $\lVert G\rVert_{\textrm{BL}}\leq 1$ . For $x\in E_N$ , define

\begin{align*}H_N(z,x)\,{:\!=}\, {\mathbb E}_x[G(z,x,(X_N^{\textbf{0}}(r_j-t_i))_{j=1}^m,X_N^{\textbf{0}}(t_{i+1}-t_i))], \ \forall z \in{\mathbb R}^k.\end{align*}

Note that

(3.27) \begin{equation} {\mathbb E}[G(\left(X_N^\omega\left(s_j\right)\right)_{j=1}^k,X_N^\omega(t_i),(X_N^\omega(r_j))_{j=1}^m,X_N^\omega(t_{i+1}-\!))]={\mathbb E}\left[H_N\left(\left(X_N^\omega\left(s_j\right)\right)_{j=1}^k,X_N^\omega(t_i)\right)\right].\end{equation}

Similarly, for $x\in[0,1]$ , define

\begin{align*}H(z,x)\,{:\!=}\, {\mathbb E}_x[G(z,x,(X^{\textbf{0}}(r_j-t_i))_{j=1}^m,X^{\textbf{0}}(t_{i+1}-t_i))],\quad z\in{\mathbb R}^k,\end{align*}

and note that

(3.28) \begin{equation} {\mathbb E}[G((X^\omega\left(s_j\right))_{j=1}^k,X^\omega(t_i),(X^\omega(r_j))_{j=1}^m,X^\omega(t_{i+1}-\!))]={\mathbb E}\left[H\left(\left(X^\omega\left(s_j\right)\right)_{j=1}^k,X^\omega(t_i)\right)\right].\end{equation}

Using (3.26) and the Skorokhod representation theorem, we can assume that the random variables $\left(\left(X_N^\omega\left(s_j\right)\right)_{j=1}^k,X_N^\omega(t_i)\right)_{N\geq1}$ and $\left(\left(X^\omega\left(s_j\right)\right)_{j=1}^k,X^\omega(t_i)\right)$ are defined on the same probability space and that the convergence holds almost surely. In particular, we can write

(3.29) \begin{equation} \left|{\mathbb E}\left[H_N\left(\left(X_N^\omega\left(s_j\right)\right)_{j=1}^k,X_N^\omega(t_i)\right)\right]-{\mathbb E}\left[H\left(\left(X^\omega\left(s_j\right)\right)_{j=1}^k,X^\omega(t_i)\right)\right]\right|\leq R_N^1+R_N^2,\end{equation}

where

\begin{align*}R_N^1 &\,{:\!=}\, \left|{\mathbb E}\left[H_N\left(\left(X_N^\omega\left(s_j\right)\right)_{j=1}^k,X_N^\omega(t_i)\right)\right]-{\mathbb E}\left[H_N((X^\omega\left(s_j\right))_{j=1}^k,X_N^\omega(t_i))\right]\right|,\\R_N^2 &\,{:\!=}\, \left|{\mathbb E}\left[H_N((X^\omega\left(s_j\right))_{j=1}^k,X_N^\omega(t_i))\right]-{\mathbb E}\left[H\left(\left(X^\omega\left(s_j\right)\right)_{j=1}^k,X^\omega(t_i)\right)\right]\right|.\end{align*}

Using that $\lVert G\rVert_{\textrm{BL}}\leq 1$ , we obtain

(3.30) \begin{equation} R_N^1\leq \sum_{j=1}^k{\mathbb E}[|X_N^\omega\left(s_j\right)-X^\omega\left(s_j\right)|]\xrightarrow[N\to\infty]{}0.\end{equation}

Moreover, since $X_N^\omega(t_i)$ converges to $X^\omega(t_i)$ almost surely, we conclude using Theorem 2.2 that, for any $z\in[0,1]^k$ , $H_N(z,X_N^\omega(t_i))$ converges to $H(z,X^\omega(t_i))$ almost surely. Therefore, using the dominated convergence theorem, we conclude that

(3.31) \begin{equation} R_N^2\xrightarrow[N\to\infty]{}0.\end{equation}

Plugging (3.30) and (3.31) into (3.29) and using (3.27) and (3.28) yields the result.

4. The ASG and its relatives

In this section we formalize the definition of the quenched ASG, and we provide definitions for the k-ASG and the pLD-ASG.

4.1. Results related to Section 2.4: the quenched ASG

The aim of this section is to prove the following result.

Proposition 4.1. (Existence of the quenched ASG.) Let $\omega\in{\mathbb D}^\star$ . For any $n \in{\mathbb N}$ and $T \gt 0$ , there is a branching–coalescing particle system $(\mathcal{G}_T^\omega(\beta))_{\beta\geq 0}$ starting at $\beta=0-$ with n lines, that almost surely consists of finitely many lines at each time $\beta \in [0,T]$ , and that satisfies the requirements (i), (ii), (iii $^{\prime}$ ), (iv), and (v) of Definition 2.1.

Proof. We will explicitly construct a branching–coalescing particle system $(\mathcal{G}_T^\omega(\beta))_{\beta\geq 0}$ with the desired properties. The main difficulty is that the environment $\omega$ may have infinitely many jumps on each compact interval. Fix $T \gt 0$ and $n \in {\mathbb N}$ (sampling size) and define

\begin{align*}\Lambda_{\textrm{mut}}\,{:\!=}\,\left\{\lambda_{i}^{0},\lambda_{i}^{1} \right\}_{i \geq 1}, \qquad \Lambda_{\textrm{sel}}\,{:\!=}\,\left\{\lambda_{i}^{\vartriangle}\right\}_{i \geq 1}, \qquad \Lambda_{\textrm{coal}}\,{:\!=}\,\left\{\lambda_{i,j}^{\blacktriangle} \right\}_{i,j \geq 1, i \neq j},\end{align*}

where $\lambda_{i}^{0}$ , $\lambda_{i}^{1}$ , $\lambda_{i}^{\vartriangle}$ , and $\lambda_{i,j}^{\blacktriangle}$ are independent Poisson processes on [0, T] with parameters $\theta \nu_0$ , $\theta \nu_1$ , $\sigma$ , and 1, respectively. For $\beta \in [0,T]$ , let $\tilde \omega (\beta) \,{:\!=}\, \omega (T) - \omega ((T-\beta)-\!)$ and $I_{\tilde \omega} \,{:\!=}\, \{ \beta \in [0,T]\,{:}\, \Delta \tilde \omega (\beta) \gt 0 \}$ ; $I_{\tilde \omega}$ is the countable set of jump times of $\tilde \omega$ . Let $\mathcal{U}_{\tilde \omega} \,{:\!=}\, \{ U_i(\beta) \}_{i \geq 1, \beta \in I_{\tilde \omega}}$ be an i.i.d. family of uniform random variables on (0, 1). Assume, without loss of generality, that the arrival times of $\lambda_i^0$ , $\lambda_i^1$ , $\lambda_i^{\vartriangle}$ , and $\lambda_{i,j}^{\blacktriangle}$ , $i,j\in{\mathbb N}$ , $i\neq j$ , are countable, distinct from each other, and distinct from the jump times of $\tilde \omega$ . Let $I_{\textrm{coal}}$ (resp. $I_{\textrm{sel}}$ ) be the set of arrival times of $\Lambda_{\textrm{coal}}$ (resp. $\Lambda_{\textrm{sel}}$ ).

We first construct a set $\mathcal{V}^\omega\subset {\mathbb N} \times [0,T]$ of virtual lines, representing the lines that would be part of the ASG if there were no coalescences. In particular, once a line enters this set, it will remain there. The set $\mathcal{V}^\omega$ is constructed on the basis of the set of potential branching times $I_{\textrm{bran}}\,{:\!=}\, I_{\tilde{\omega}}\cup I_{\textrm{sel}}$ as follows. Consider the (countable) set

\begin{align*}S_{\textrm{bran}}\,{:\!=}\,\{(\beta_1,\ldots,\beta_k)\,{:}\,k\in{\mathbb N},\, 0\leq \beta_1<\cdots<\beta_k, \beta_i\in I_{\textrm{bran}}, i\in[k]\},\end{align*}

and fix an injective function $i_\star\,{:}\,[n]\times S_{\textrm{bran}}\to{\mathbb N}\setminus[n]$ . The set $\mathcal{V}^\omega$ is determined as follows:

  1. For any $i\in[n]$ (i.e. in the initial sample) and $\beta\in[0,T]$: $(i,\beta)\in\mathcal{V}^\omega$.

  2. For any $(\beta_1,\ldots,\beta_k)\in S_{\textrm{bran}}$, $j\in[n]$, and $\beta\in[\beta_k,T]$: $(i_\star(j,\beta_1,\ldots,\beta_k),\beta)\in\mathcal{V}^\omega$ if and only if

    • for any $\ell\in[k]$ with $\beta_\ell\in I_{\tilde{\omega}}$ , $U_{i_\star(j,\beta_1,\ldots,\beta_{\ell-1})}(\beta_\ell)\leq \Delta \tilde \omega(\beta_\ell)$ (or $U_{j}(\beta_1) \leq \Delta \tilde \omega(\beta_1)$ if $\ell=1$ ),

    • for any $\ell\in[k]$ such that $\beta_\ell\in I_{\textrm{sel}}$ , $\beta_\ell$ is a jump time of $\lambda_{i_\star(j,\beta_1,\ldots,\beta_{\ell-1})}^{\vartriangle}$ (or of $\lambda_{j}^{\vartriangle}$ if $\ell=1$ ),

and these are all possible virtual lines; see Figure 7.

Let $V^\omega(\beta)\,{:\!=}\, \{i\in{\mathbb N}\,{:}\, (i,\beta)\in\mathcal{V}^\omega\}$ . According to Lemma 4.1 below, $V^\omega(\beta)$ is almost surely finite for all $\beta \in [0,T]$ . Now, for $\beta \in I_{\textrm{coal}}$ , let $(a_{\beta},b_{\beta})$ be the pair (i, j) such that $\beta$ is an arrival time of $\lambda_{i,j}^{\blacktriangle}$ . Since the Poisson processes $\lambda_{i,j}^{\blacktriangle}$ , $i\neq j$ , have distinct jump times, $(a_{\beta},b_{\beta})$ is uniquely defined. Let

\begin{align*}\tilde I_{\textrm{coal}}\,{:\!=}\, \{ \beta \in I_{\textrm{coal}} \,{:}\, a_{\beta},b_{\beta} \in V^\omega(\beta) \}\quad\textrm{and}\quad\tilde I_{\textrm{bran}}\,{:\!=}\, \{ \beta \in I_{\textrm{bran}} \,{:}\, V^\omega(\beta-\!)\subsetneq V^\omega(\beta) \}.\end{align*}

Since $V^\omega(T)$ is independent of $\Lambda_{\textrm{coal}}$ and almost surely finite, it follows that $\tilde I_{\textrm{coal}}$ and $\tilde I_{\textrm{bran}}$ are almost surely finite. Let $\beta_1 \lt \cdots \lt \beta_m$ be the elements of $\tilde I_{\textrm{coal}} \cup \tilde I_{\textrm{bran}}$ (set $\beta_{0} \,{:\!=}\, 0$ and $\beta_{m+1} \,{:\!=}\, T$ for convenience). We define $V_{\text{on}}^\omega(\beta)\subset V^\omega(\beta)$, the set of active lines at time $\beta$, as follows (see also Figure 7). For $\beta=0$ we set $V_{\text{on}}^\omega(0)\,{:\!=}\, V^\omega(0)$, and for $\beta\in(\beta_\ell,\beta_{\ell+1})$ we set $V_{\text{on}}^\omega(\beta)\,{:\!=}\, V_{\text{on}}^\omega(\beta_\ell)$. For $\beta=\beta_\ell\in \tilde I_{\textrm{coal}}$, we set $V_{\text{on}}^\omega(\beta_\ell)\,{:\!=}\, V_{\text{on}}^\omega(\beta_\ell-\!)\setminus \{ a_{\beta_\ell} \}$ if $\{a_{\beta_\ell},b_{\beta_\ell}\}\subset V_{\text{on}}^\omega(\beta_\ell-\!)$, and $V_{\text{on}}^\omega(\beta_\ell)\,{:\!=}\, V_{\text{on}}^\omega(\beta_\ell-\!)$ otherwise. Finally, for $\beta=\beta_\ell\in \tilde I_{\textrm{bran}}$, we set $V_{\text{on}}^\omega(\beta_\ell)\,{:\!=}\, V_{\text{on}}^\omega(\beta_\ell-\!) \cup J_\ell$, where the set $J_\ell$ consists of the integers $i \in V^\omega(\beta_\ell) \setminus V^\omega(\beta_\ell-\!)$ such that $i=i_\star(j,\beta_\ell)$ for some $j\in [n]\cap V_{\text{on}}^\omega(\beta_\ell-\!)$, or $i=i_\star(j,\hat \beta_1,\ldots,\hat \beta_k,\beta_\ell)$ for some $(j,\hat \beta_{1},\ldots,\hat \beta_{k})\in i_\star^{-1}(V_{\text{on}}^\omega(\beta_\ell-\!)\setminus [n])$ with $\hat{\beta}_k<\beta_\ell$.

Figure 7. Illustration of the construction of the quenched ASG. The environment $\omega$ has jumps at forward times $t_0$ , $t_1$ , $t_2$ ; backward times $\beta_1,\ldots,\beta_5$ belong to the set of potential branching times $\tilde I_{{\text{bran}}}$ . Virtual lines are depicted in grey or black; active lines are black. The ASG in [0, T] consists of the set of active lines together with their connections and mutation marks.

The ASG on [0, T] is then the branching–coalescing system starting with n lines at levels in [n], consisting at any time $\beta\in[0,T]$ of the lines in $V_{\text{on}}^\omega(\beta)$ , and where the following hold:

  (i) For $\beta \in I_{\textrm{bran}}$ such that $V_{\text{on}}^\omega(\beta-\!)\subsetneq V_{\text{on}}^\omega(\beta)$ and $i\in V_{\text{on}}^\omega(\beta)\setminus V_{\text{on}}^\omega(\beta-\!)$, either there is $(j,\hat \beta_{1},\ldots,\hat \beta_{k})\in[n]\times S_{\textrm{bran}}$ with $\hat \beta_{k}<\beta$ such that $i=i_\star(j,\hat \beta_1,\ldots,\hat \beta_k,\beta)$, or there is $j\in[n]$ such that $i=i_\star(j,\beta)$. In the first case, line $i_\star(j,\hat \beta_1,\ldots,\hat \beta_k)$ branches at time $\beta$ into $i_\star(j,\hat \beta_1,\ldots,\hat \beta_k)$ (continuing line) and i (incoming line). In the second case, line j branches at time $\beta$ into j (continuing line) and i (incoming line).

  (ii) For $\beta \in I_{\textrm{coal}}$ such that $V_{\text{on}}^\omega(\beta)\subsetneq V_{\text{on}}^\omega(\beta-\!)$ and $i\in V_{\text{on}}^\omega(\beta-\!)\setminus V_{\text{on}}^\omega(\beta)$, $i=a_\beta$ and $b_{\beta}\in V_{\text{on}}^\omega(\beta)$. Thus, lines i and $b_\beta$ merge into $b_\beta$ at time $\beta$.

  (iii) At each $\beta \in [0,T]$ that is an arrival time of $\lambda_{i}^{0}$ (resp. $\lambda_{i}^{1}\big)$ for some $i \in V_{\text{on}}^\omega(\beta)$, we mark line i with a beneficial (resp. deleterious) mutation at time $\beta$.

Clearly the branching–coalescing particle system thus constructed satisfies the requirements (i), (ii), (iii $^{\prime}$ ), (iv), and (v) of Definition 2.1. This ends the proof.

It remains to prove the following lemma.

Lemma 4.1. The set $V^\omega(\beta)$ is almost surely finite for any $\beta \in [0,T]$ .

Proof. We keep using the notation introduced in the proof of Proposition 4.1. For $\delta>0$ , we consider the environment $\omega^\delta$ defined via (3.1). We couple the sets of virtual lines $\mathcal{V}^\omega$ and $\mathcal{V}^{\omega^\delta}$ associated to $\omega$ and $\omega^\delta$ , respectively, by using the same random sets $\Lambda_{{\text{sel}}}$ and $\mathcal{U}_{\tilde \omega}$ (note that for $\beta \in I_{\tilde \omega}$ with $\Delta \tilde \omega(\beta)<\delta$ , $\Delta \tilde \omega^\delta(\beta)=0\lt U_i(\beta)$ ). Let $N_T^\omega(\beta)\,{:\!=}\, |V^\omega(\beta)|$ and $N_T^{\omega^\delta}(\beta)\,{:\!=}\, |V^{\omega^\delta}(\beta)|$ , $\beta\in[0,T]$ . Since $\beta\mapsto N_T^\omega(\beta)$ is non-decreasing, it is enough to prove that $N_T^\omega(T)<\infty$ almost surely. From the construction of the set of virtual lines, it follows that $N_T^{\omega^{\delta}}(\beta)$ increases almost surely to $N_T^{\omega}(\beta)$ as $\delta\to 0$ . By the monotone convergence theorem, for all $\beta \in [0,T]$ we get

(4.1) \begin{eqnarray}\lim_{\delta \rightarrow 0} \mathbb{E}\left[N_T^{\omega^{\delta}}(\beta)\mid N_T^{\omega^{\delta}}(0)=n\right] = \mathbb{E}\left[N_T^{\omega}(\beta)\mid N_T^{\omega}(0)=n\right]. \end{eqnarray}

Recall that $\tilde{\omega}^\delta(\beta)\,{:\!=}\, \omega^\delta(T)-\omega^\delta((T-\beta)-\!)$ , $\beta\in[0,T]$ , and that $\tilde \omega^{\delta}$ has finitely many jumps in [0, T]. Let $T_1 <\cdots \lt T_N$ be the jump times of $\tilde \omega^{\delta}$ . The process $\left(N_T^{\omega^{\delta}}(\beta)\right)_{\beta \in [0,T]}$ has the following transitions:

  1. On $(T_i,T_{i+1})$: $N_T^{\omega^{\delta}}$ jumps from k to $k+1$ at rate $k\sigma$.

  2. At time $T_i$: $N_T^{\omega^{\delta}}$ jumps from k to $k+ B_{k}(\Delta \tilde \omega^{\delta}(T_i))$, where $B_{k}(\Delta \tilde \omega^{\delta}(T_i)) \sim \textrm{Bin}({k},{\Delta \tilde \omega^{\delta}(T_i)})$.

Note that for each $T_i$ we have $\Delta \tilde \omega^{\delta}(T_i)=\Delta \tilde \omega(T_i)$ . This yields in particular

(4.2) \begin{eqnarray}\mathbb{E}[N_T^{\omega^{\delta}}(T_i)\mid N_T^{\omega^{\delta}}(T_i-\!)]=(1+\Delta \tilde \omega(T_i))N_T^{\omega^{\delta}}(T_i-\!). \end{eqnarray}

Successively using Lemma 4.2 (see below) and (4.2), we get

\begin{align*} \mathbb{E}\left[N_T^{\omega^{\delta}}(T)\mid N_T^{\omega^{\delta}}(0)=n\right]\leq ne^{\sigma T} \prod_{\beta \in [0,T]: \Delta \tilde \omega (\beta) \geq \delta}(1+\Delta \tilde \omega(\beta)). \end{align*}

In particular, we have

(4.3) \begin{eqnarray}\mathbb{E}[N_T^{\omega^{\delta}}(T)\mid N_T^{\omega^{\delta}}(0)=n]\leq ne^{\sigma T} \prod_{\beta \in I_{\tilde \omega} \cap [0,T]}(1+\Delta \tilde \omega(\beta)) \lt \infty. \end{eqnarray}

Letting $\delta$ go to 0 in (4.3) and using (4.1), we get

\begin{align*} \mathbb{E}[N_T^{\omega}(T)\mid N_T^{\omega}(0)=n]\leq ne^{\sigma T} \prod_{\beta \in I_{\tilde \omega} \cap [0,T]}(1+\Delta \tilde \omega(\beta)) \lt \infty. \end{align*}

This concludes the proof.
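To make the transitions of $N_T^{\omega^\delta}$ listed above concrete, here is a minimal Monte Carlo sketch (Python; all parameters hypothetical): a Yule-type pure-birth mechanism between the jump times of $\tilde\omega^\delta$, plus a binomial batch of new lines at each jump. For an environment with finitely many jumps the expectation factorizes as $ne^{\sigma T}\prod(1+\Delta\tilde\omega)$, so the empirical mean should sit close to the right-hand side of (4.3).

```python
import numpy as np

rng = np.random.default_rng(2)

def birth_stretch(k, t0, t1, sigma):
    """Pure-birth stretch: each of the k lines branches at rate sigma on (t0, t1]."""
    t = t0
    while k > 0 and sigma > 0:
        t += rng.exponential(1.0 / (k * sigma))
        if t > t1:
            break
        k += 1
    return k

def simulate_N(n, T, sigma, env_jumps):
    """One path of the virtual-line count, following transitions 1 and 2 above."""
    k, t = n, 0.0
    for (ti, du) in sorted(env_jumps):
        k = birth_stretch(k, t, ti, sigma)
        k += rng.binomial(k, du)            # each line duplicates with prob. du
        t = ti
    return birth_stretch(k, t, T, sigma)

n, T, sigma = 2, 2.0, 0.7
jumps = [(0.4, 0.3), (1.1, 0.5)]            # hypothetical jump times and sizes
est = np.mean([simulate_N(n, T, sigma, jumps) for _ in range(20000)])
bound = n * np.exp(sigma * T) * np.prod([1 + du for _, du in jumps])
print(est, "vs", bound)                     # empirical mean vs the bound in (4.3)
```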

Lemma 4.2. Let $0 \leq \beta_1 <\beta_2 \leq T$ be such that $\tilde \omega^{\delta}$ has no jump times on $(\beta_1,\beta_2]$ . Then we have

\begin{align*}\mathbb{E}[N_T^{\omega^\delta}(\beta_2)\mid N_T^{\omega^\delta}(\beta_1)] \leq e^{\sigma (\beta_2-\beta_1)} N_T^{\omega^\delta}(\beta_1).\end{align*}

Proof. Since $\tilde \omega^{\delta}$ has no jump times on $(\beta_1,\beta_2]$ , on this interval $N_T^{\omega^\delta}$ is the Markov chain on $\mathbb{N}$ with generator $\mathcal{G}_N f(n) = \sigma n (f(n+1)-f(n)).$ Let $f_M(n)\,{:\!=}\, n \wedge M$ . Note that, for any $M,n \geq 1$ , we have $\mathcal{G}_N f_M(n) \leq \sigma f_M(n)$ . Applying Dynkin’s formula to $N_T^{\omega^\delta}$ on $(\beta_1,\beta_2]$ with the function $f_M$ , we obtain

\begin{align*}\mathbb{E}\left[f_M(N_T^{\omega^\delta}(\beta_2)) \mid N_T^{\omega^\delta}(\beta_1)\right] &=f_M(N_T^{\omega^\delta}(\beta_1))+\mathbb{E}\left[\int_{\beta_1}^{\beta_2} \mathcal{G}_N f_M(N_T^{\omega^\delta}(\beta))\,{\textrm{d}} \beta\mid N_T^{\omega^\delta}(\beta_1)\right] \\ &\leq f_M(N_T^{\omega^\delta}(\beta_1))+\sigma \mathbb{E}\left[\int_{\beta_1}^{\beta_2} f_M(N_T^{\omega^\delta}(\beta))\,{\textrm{d}} \beta\mid N_T^{\omega^\delta}(\beta_1)\right] \\ &= f_M(N_T^{\omega^\delta}(\beta_1))+\sigma \int_{\beta_1}^{\beta_2} \mathbb{E}\left[f_M(N_T^{\omega^\delta}(\beta))\mid N_T^{\omega^\delta}(\beta_1)\right] {\textrm{d}} \beta.\end{align*}

Thus, Gronwall’s lemma yields $\mathbb{E}[f_M(N_T^{\omega^\delta}(\beta_2)) \mid N_T^{\omega^\delta}(\beta_1)]\leq e^{\sigma (\beta_2-\beta_1)} f_M(N_T^{\omega^\delta}(\beta_1))$ . The result follows from letting $M\to\infty$ and using the monotone convergence theorem.

4.2. Definitions related to Section 2.5: the killed ASG

The k-ASG as a branching–coalescing system of particles is defined as follows (see Figure 4).

Definition 4.1. (The annealed/quenched k-ASG.) The annealed k-ASG with parameters $\sigma,\theta,\nu_0,\nu_1$, and environment driven by a pure-jump subordinator with Lévy measure $\mu$, of a sample of size n is the branching–coalescing particle system $\bar{\mathcal{G}}\,{:\!=}\,(\bar{\mathcal{G}}(\beta))_{\beta\geq 0}$ starting with n lines and with the following dynamics:

  (i) Each line splits into two lines, an incoming line and a continuing line, at rate $\sigma$.

  (ii) Every given pair of lines coalesces into a single line at rate 2.

  (iii) Every group of k lines is subject to a simultaneous branching at rate $\sigma_{m,k}$ (defined in Equation (2.5)), where m denotes the total number of lines in the ASG before the simultaneous branching event. At the simultaneous branching event, each line in the group involved splits into two lines, an incoming line and a continuing line.

  (iv) Each line is killed at rate $\theta \nu_1$.

  (v) Each line sends the process to the cemetery state $\dagger$ at rate $\theta \nu_0$.

Let $\omega\in{\mathbb D}^\star$ . The quenched k-ASG with parameters $\sigma,\theta,\nu_0,\nu_1$ , and environment $\omega$ , of a sample of size n at time T is the branching–coalescing particle system $\bar{\mathcal{G}}_{T}^\omega\,{:\!=}\,(\bar{\mathcal{G}}_{T}^{\omega}(\beta))_{\beta\geq 0}$ starting at $\beta=0-$ with n lines and evolving according to (i), (ii), (iv), and (v) of the previous definition, with (iii) replaced by the following:

  (iii) If at time $\beta$ we have $\Delta \omega(T-\beta)>0$, then each line splits into two lines, an incoming line and a continuing line, with probability $\Delta \omega(T-\beta)$, independently of the other lines.

Remark 4.1. The branching–coalescing system underlying the quenched k-ASG is well-defined, because it can be constructed on the basis of the quenched ASG.
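To fix ideas, the following minimal Python sketch simulates the line-counting process of the quenched k-ASG for an environment with finitely many jumps, implementing rules (i), (ii), (iv), (v) and the modified rule (iii) above. The encoding of the cemetery state by `'+'`, the function names, and all numerical parameters are ours.

```python
import numpy as np

rng = np.random.default_rng(3)

def simulate_R(n, T, sigma, theta, nu0, env_jumps):
    """Terminal value of the line-counting process of the quenched k-ASG on
    [0, T].  Backward time beta runs over [0, T]; a jump of omega at forward
    time t acts at beta = T - t.  Returns an integer or '+' (the cemetery)."""
    nu1 = 1.0 - nu0
    events = sorted((T - t, du) for (t, du) in env_jumps if 0 < t < T)
    beta, k = 0.0, n
    for (be, du) in events + [(T, 0.0)]:
        while k >= 1:
            rate = k * sigma + k * (k - 1) + k * theta   # total jump rate from k
            beta_next = beta + rng.exponential(1.0 / rate) if rate > 0 else np.inf
            if beta_next >= be:
                break
            beta = beta_next
            u = rng.uniform(0.0, rate)
            if u < k * sigma:
                k += 1                     # (i) a line branches
            elif u < k * sigma + k * (k - 1):
                k -= 1                     # (ii) a pair coalesces (rate 2 per pair)
            elif u < k * sigma + k * (k - 1) + k * theta * nu1:
                k -= 1                     # (iv) a line is killed
            else:
                return '+'                 # (v) jump to the cemetery
        beta = be
        if k >= 1 and du > 0:
            k += rng.binomial(k, du)       # (iii) simultaneous branching at a jump
    return k

print(simulate_R(3, 1.5, sigma=1.0, theta=0.8, nu0=0.4, env_jumps=[(0.6, 0.25)]))
```

This sketch is reused after the proof of Theorem 7.1 below to illustrate the quenched moment duality numerically.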

4.3. Definitions related to Section 2.6: the pruned lookdown ASG

In this section, we give a detailed construction of the pLD-ASG, which incorporates the effect of the environment.

First, we construct the (annealed/quenched) lookdown ASG (LD-ASG). The latter is the ASG equipped with a numbering of its lines encoding the hierarchy given by the pecking order. This is done as follows. Consider a realization of the (annealed/quenched) ASG in [0, T] starting with one line, which is assigned level 1. When the line at level i coalesces with the line at level $j>i$ , the resulting line is assigned level i; the level of each line having level $k \gt j$ before the coalescence is decreased by 1. When a group of lines with levels $i_1\lt i_2<\ldots<i_N$ experiences a simultaneous branching, the incoming (resp. continuing) line of the descendant line with level $i_k$ gets level $i_k+k-1$ (resp. $i_k+k$ ); a line having level j before the branching, with $i_k<j<i_{k+1}$ , gets level $j+k$ ; a line having level $j>i_N$ before the branching gets level $j+N$ . Mutations do not affect the levels. See Figure 8 (left panel) for an illustration. The pLD-ASG is obtained via an appropriate pruning of the lines of the LD-ASG. Before describing the pruning procedure, we identify a special line in the LD-ASG: the immune line. The immune line at time $\beta$ is the line in the ASG present at time $\beta$ that is the ancestor of the starting line if all the lines at time $\beta$ are assigned the unfit type. In the absence of mutations, the immune line changes only if it is involved in a coalescence or branching event. If it is involved in a coalescence event, the merged line is the new immune line. If it is involved in a branching event, the continuing line is the new immune line.

Figure 8. LD-ASG (left) and its pLD-ASG (right). Backward time $\beta \in [0,T]$ runs from right to left. In the LD-ASG, levels remain constant between the dashed lines; in particular, they are not affected by mutation events. In the pLD-ASG, lines are pruned at mutation events, where an additional updating of the levels takes place. The bold line in the pLD-ASG represents the immune line.

In the presence of mutations, the pLD-ASG is constructed simultaneously with the immune line as follows. Let $\beta_1<\cdots<\beta_m$ be the times at which mutations occur in the LD-ASG in [0, T]. In the time interval $[0,\beta_1)$ , the pLD-ASG coincides with the LD-ASG and the immune line evolves as before. Now, assume that we have constructed the pLD-ASG together with its immune line up to time $\beta_i-$ , where the pLD-ASG contains n lines and the immune line has level $k_0\in[n]$ . The pLD-ASG is extended up to time $\beta_i$ according to the following rules:

  (i) If, at time $\beta_i$, a line with level $k\neq k_0$ at $\beta_i-$ is hit by a deleterious mutation, we stop tracing back this line; all the other lines are extended up to time $\beta_i$; all the lines with level $j>k$ at time $\beta_i-$ decrease their level by 1, and the others keep their levels unchanged; the immune line continues on the same line (possibly with a different level).

  (ii) If, at time $\beta_i$, the line with level $k_0$ at $\beta_i-$ is hit by a deleterious mutation, we extend all the lines up to time $\beta_i$; the immune line gets level n, but remains on the same line; all lines having a level $j>k_0$ at time $\beta_i-$ decrease their level by 1, and the others keep their levels unchanged.

  (iii) If, at time $\beta_i$, a line with level k is hit by a beneficial mutation, we stop tracing back all the lines with level $j>k$; the remaining lines are extended up to time $\beta_i$, keeping their levels; the line hit by the mutation becomes the immune line.

In $[\beta_i,\beta_{i+1})$, $i\in[m-1]$, and $[\beta_m,T]$, the pLD-ASG evolves as the LD-ASG, and the immune line as in the case without mutations. In terms of the number of lines and the level of the immune line, the rules (i)–(iii) reduce to a simple update, sketched below.
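Writing the summary state as the pair $(n,k_0)$, where n is the current number of lines and $k_0$ the level of the immune line, the three rules act as follows; the function below is a minimal sketch (the name and the encoding are ours).

```python
def mutation_update(n, k0, k, beneficial):
    """Effect on (n, k0) = (number of lines, level of the immune line) of a
    mutation hitting the line at level k; implements rules (i)-(iii) above."""
    if beneficial:          # (iii): all levels above k are pruned, hit line immune
        return k, k
    if k == k0:             # (ii): all lines kept, immune line moved to level n
        return n, n
    # (i): the hit line is pruned and levels above k shift down by one
    return n - 1, k0 - 1 if k0 > k else k0

print(mutation_update(5, 3, 3, beneficial=False))   # rule (ii):  (5, 5)
print(mutation_update(5, 3, 1, beneficial=False))   # rule (i):   (4, 2)
print(mutation_update(5, 3, 2, beneficial=True))    # rule (iii): (2, 2)
```

The next result states the main feature of the pLD-ASG.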

Lemma 4.3. If we assign types at (backward) time T in the pLD-ASG, the true ancestor of the single line at (backward) time 0 is the line of type 0 with smallest level, or, if all lines have type 1, it is the immune line.

Proof. The proof is analogous to the proof of [Reference Lenz, Kluth, Baake and Wakolbinger31, Theorem 4], which covers the null environment case.

5. Annealed results

5.1. Annealed results related to Section 2.5

We start this section by proving the first part of Theorem 2.3, i.e. Equations (2.7) and (2.8).

Proof of Theorem 2.3 (Part I: reinforced and annealed moment duality). Define the function $H\,{:}\,[0,1]\times \mathbb{N}_0^\dagger\times [0,\infty)\to{\mathbb R}$ via $H(x,n,j)\,{:\!=}\,(1-x)^nf(j)$. Let $(P_t)_{t\geq0}$ and $(Q_t)_{t\geq 0}$ denote the semigroups of (X, J) and (R, J), respectively, i.e.

\begin{align*}P_t g(x,j) ={\mathbb E}[g(X(t),J(t)+j)\mid X(0)=x],\\[3pt] Q_th(n,j) ={\mathbb E}[h(R(t),J(t)+j)\mid R(0)=n].\end{align*}

Let $(\hat{R},\hat{J})$ be a copy of (R, J), which is independent of (X, J). A straightforward calculation shows that

(5.1) \begin{align}P_t(Q_s H)(x,n,j)&={\mathbb{E}}[(1-X(t))^{\hat{R}(s)}f(J(t)+\hat{J}(s)+j)\mid X(0)=x,\, \hat{R}(0)=n]\nonumber\\[3pt] &=Q_s(P_t H)(x,n,j).\end{align}

Let G and $G_{\star}$ be the infinitesimal generators of (X, J) and (R, J), respectively. Clearly, for any $x\in[0,1]$ , the function $(n,j)\mapsto P_t H(\cdot,n,\cdot)(x,j)$ belongs to the domain of $G_{\star}$ . Hence, Equation (5.1) yields

(5.2) \begin{equation} P_t G_{\star} H(x,n,j)= G_{\star} P_t H(x,n,j).\end{equation}

We claim that

(Claim 4) \begin{equation}G H(\cdot,n,\cdot)(x,j)=G_{\star} H(x,\cdot,\cdot)(n,j).\end{equation}

Assume that Claim 4 holds. Define the functions ${u(t,x,n,j)\,{:\!=}\,} P_tH(\cdot,n,\cdot)(x,j)$ and ${v(t,x,n,j)\,{:\!=}\,} Q_tH(x,\cdot,\cdot)(n,j)$ . The Kolmogorov forward equation for Q yields

(5.3) \begin{equation}\frac{{\textrm{d}}}{{\textrm{d}} t} v(t,x,n,j)=G_{\star} v(t,x,\cdot,\cdot)(n,j).\end{equation}

Moreover, using the Kolmogorov forward equation for P, Claim 4, and (5.2), we get

\begin{align*}\frac{{\textrm{d}}}{{\textrm{d}} t} u(t,x,n,j)=P_t G H(\cdot,n,\cdot)(x,j)= P_t G_{\star} H(x,\cdot,\cdot)(n,j)= G_{\star} u(t,x,\cdot,\cdot)(n,j).\end{align*}

Hence, u and v satisfy Equation (5.3). Since $u(0,x,n,j)=(1-x)^nf(j)=v(0,x,n,j)$ , Equation (2.7) follows from the uniqueness of the initial value problem associated with $G_{\star}$ (see [Reference Dynkin14, Theorem 1.3]). Equation (2.8) is obtained using $f\equiv 1$ in Equation (2.7). It remains to prove Claim 4. Note first that

(5.4) \begin{align} G H(\cdot,n,\cdot)(x,j)&=(1-x)^n\int_{(0,1]}\left[(1-xz)^nf(j+z)-f(j)\right]\mu({\textrm{d}} z)\nonumber\\ &\quad +\left[\!n(n-1)x(1-x)^{n-1}-\left(\sigma x(1-x)+ \theta\nu_0(1-x)\!-\theta\nu_1 x \right)n(1-x)^{n-1}\!\right]f(j).\end{align}

In addition,

(5.5) \begin{align}G_\star H(x,\cdot,\cdot)(n,j)&=n((n-1)+\theta\nu_1)[(1-x)^{n-1}\!-(1-x)^n]f(j)\nonumber\\[3pt] &\quad +\sigma n[(1-x)^{n+1}\!-(1-x)^n]f(j)- n\theta\nu_0 (1-x)^nf(j)\nonumber\\[3pt] &\quad +\sum\limits_{k=0}^n\binom{n}{k}\int_{(0,1)}y^k (1-y)^{n-k}[(1-x)^{n+k}f(j+y)-(1-x)^nf(j)]\mu({\textrm{d}} y)\nonumber\\[3pt] & =\left[n(n-1)x(1-x)^{n-1}-\left(\sigma x(1-x)+ \theta\nu_0(1-x)-\theta\nu_1 x \right)n(1-x)^{n-1}\right]f(j)\nonumber\\[3pt] &\quad +(1-x)^n\sum\limits_{k=0}^n\binom{n}{k}\int_{(0,1)}y^k (1-y)^{n-k}[(1-x)^kf(j+y)-f(j)]\mu({\textrm{d}} y).\end{align}

Moreover, using Fubini’s theorem, we obtain

\begin{align*}&\sum\limits_{k=0}^n\binom{n}{k}\int_{(0,1)}y^k (1-y)^{n-k}[(1-x)^kf(j+y)-f(j)]\mu({\textrm{d}} y)\\&\quad =\int_{(0,1)}\left[(1-xy)^nf(j+y)-f(j)\right]\mu({\textrm{d}} y).\end{align*}

Hence, Claim 4 follows after comparing (5.5) with (5.4).
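The binomial identity behind this Fubini step, namely $\sum_{k=0}^n\binom{n}{k}y^k(1-y)^{n-k}(1-x)^k=(1-xy)^n$, admits a one-line numerical sanity check (arbitrary values):

```python
from math import comb

x, y, n = 0.37, 0.52, 9
lhs = sum(comb(n, k) * y**k * (1 - y)**(n - k) * (1 - x)**k for k in range(n + 1))
assert abs(lhs - (1 - x * y)**n) < 1e-12
```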

We now prove Theorem 2.4(1), which characterizes the asymptotic type frequency in the annealed setting.

Proof of Theorem 2.4(1) (asymptotic type frequency). We first show that X(t) has a limit in distribution as $t\to \infty$ . Since $\theta \gt0$ and $\nu_0\in (0,1)$ , Equation (2.8) in Theorem 2.3 implies that, for any $x\in[0,1]$ , the limit of $\mathbb{E}[(1-X(t))^n\mid X(0)=x]$ as $t\to\infty$ exists and satisfies

(5.6) \begin{equation} \lim_{t \rightarrow \infty} \mathbb{E}[(1-X(t))^n|X(0)=x]=\pi_n,\quad n\in{\mathbb N}_0,\end{equation}

where $\pi_n$ is defined in (2.10). Recall that probability measures on [0, 1] are completely determined by their positive integer moments and that convergence of positive integer moments implies convergence in distribution. Therefore, Equation (5.6) implies that there is $\eta_X\in\mathcal{M}_1([0,1])$ such that, for any $x\in[0,1]$ , conditionally on $\{X(0)=x\}$ , the law of X(t) converges in distribution to $\eta_X$ as $t\to\infty$ and

\begin{align*}\pi_n=\int_{[0,1]} (1-z)^n\eta_X(dz), \quad n\in{\mathbb N}.\end{align*}

Using dominated convergence, the convergence of the law of X(t) towards $\eta_X$ as $t\to\infty$ extends to any initial distribution. As a consequence of this and the Markov property of X, it follows that X admits a unique stationary distribution, which is given by $\eta_X$ .

Finally, a first-step decomposition for the probability of absorption at 0 of R yields

\begin{align*}{\left[n(\sigma+\theta+ n-1)+\sum_{k=1}^n\!\binom{n}{k}\sigma_{n,k}\right]} \pi_n\!= n\sigma \pi_{n+1}+ n(\theta\nu_1+ n-1)\pi_{n-1}+{\sum\limits_{k=1}^n\! \binom{n}{k}\sigma_{n,k} \pi_{n+k}}.\end{align*}

Dividing both sides of the previous identity by n and rearranging terms yields Equation (2.12).
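The probabilities $\pi_n$ can also be estimated by simulating the embedded jump chain of R. The sketch below assumes a point-mass Lévy measure $\mu=\lambda\delta_p$, under which each environmental event lets every line duplicate independently with probability p (events with zero duplications are invisible and harmless); all parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)

def pi_hat(n, sigma, theta, nu0, lam, p, n_paths=20000):
    """Monte Carlo estimate of pi_n = P(R is absorbed at 0 | R(0) = n) for the
    annealed model with mu = lam * delta_p."""
    nu1 = 1.0 - nu0
    hits = 0
    for _ in range(n_paths):
        k = n
        while k > 0:
            total = k * sigma + k * (k - 1) + k * theta + lam
            u = rng.uniform(0.0, total)
            if u < k * sigma:
                k += 1                      # single branching
            elif u < k * sigma + k * (k - 1):
                k -= 1                      # coalescence
            elif u < k * sigma + k * (k - 1) + k * theta * nu1:
                k -= 1                      # killing: one line lost
            elif u < k * sigma + k * (k - 1) + k * theta:
                k = -1                      # cemetery: absorbed away from 0
            else:
                k += rng.binomial(k, p)     # environment: simultaneous branching
        if k == 0:
            hits += 1
    return hits / n_paths

print([pi_hat(n, sigma=0.5, theta=1.0, nu0=0.5, lam=1.2, p=0.3) for n in (1, 2, 3)])
```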

5.2. Annealed results related to Section 2.6

In this section we prove Theorem 2.5(1) and Corollary 2.1. Before that we prove the following lemma relating the ancestral type distribution at time T to the number L(T) of lines in the pLD-ASG at time T.

Lemma 5.1. For all $T \geq 0$ and $x\in[0,1]$ , we have

(5.7) \begin{equation} h_T(x)=1-\mathbb{E}[(1-x)^{L(T)}\mid L(0)=1]. \end{equation}

Proof. Since types are assigned to the L(T) lines present in the pLD-ASG at (backward) time T according to independent Bernoulli random variables with parameter x, the result follows from Lemma 4.3.

The next result is crucial for describing the asymptotic behavior of $h_T(x)$ as $T\to\infty$ .

Lemma 5.2. (Positive recurrence.) The process L is positive recurrent.

Proof. Since L is irreducible, it is enough to prove that the state 1 is positive recurrent. This holds if $\theta\nu_0>0$, because in this case the hitting time of 1 is stochastically bounded by an exponential random variable with parameter $\theta\nu_0$. Now, assume that $\theta=0$ (the case $\theta\nu_0=0$ and $\theta\nu_1>0$ can easily be reduced to this case). We proceed in a way similar to the proof of [Reference Foucart16, Lemma 2.3]. Define the function $f\,{:}\,{\mathbb N}\to{\mathbb R}_+$ via

\begin{align*}f(n)\,{:\!=}\, \sum_{i=1}^{n-1} \frac1{i}\ln\left(1+\frac{1}{i}\right),\end{align*}

with the convention that an empty sum equals 0. Note that f is bounded. Note also that, for $n>1$ ,

\begin{align*}n(n-1)(f(n-1)-f(n))=-n\ln\left(1+\frac{1}{n-1}\right)\leq -1.\end{align*}

This follows from using $x=1/n$ in the inequality $e^{x} \lt 1/(1-x)$ , which holds for $x<1$ . For any $\varepsilon>0$ , set ${n_0(\varepsilon)\,{:\!=}\,} \lfloor 1/\varepsilon\rfloor +1.$ Note that for $n>n_0(\varepsilon)$ ,

\begin{align*}n(f(n+i)-f(n))={n\sum_{j=n}^{n+i-1}\frac1{j}\ln\left(1+\frac{1}{j}\right)}\leq n\ln\left(1+\frac1n\right)i\varepsilon\leq i\varepsilon.\end{align*}

Hence, for $n> n_0(\varepsilon)$ ,

\begin{align*}G_L f(n) &\leq -1 +\frac{\varepsilon}{n}\sum_{i=1}^n \binom{n}{i}\sigma_{n,i} \,i+\sigma\varepsilon=-1+\varepsilon\!\!\!\int\limits_{(0,1)}\!\!\sum_{i=1}^n\binom{n-1}{i-1}y^i(1-y)^{n-i}\mu(dy)+\sigma \varepsilon\\& =-1+\varepsilon\int\limits_{(0,1)}y\mu(dy)+\sigma\varepsilon,\end{align*}

where $\sigma_{n,i}$ is defined in (2.5). Set $m_0\,{:\!=}\, n_0(\varepsilon_\star)$ , where $\varepsilon_\star\,{:\!=}\, 1/\big(2\int_{(0,1)}y\mu(dy)+2\sigma\big)$ (and we set $m_0\,{:\!=}\, 1$ in the particular case $\mu=0$ and $\sigma=0$ ). In particular, for $n> m_0$ , we have $G_Lf(n)\leq -1/2.$

Define $T_{m_0}\,{:\!=}\, \inf\{\beta>0\,{:}\, L(\beta){\leq m_0}\}$ . Applying Dynkin’s formula to L with the function f and the stopping time $T_{m_0}\wedge k$ , $k\in{\mathbb N}$ , we obtain

\begin{align*}{\mathbb E}\left[f(L(T_{m_0}\wedge k))\mid L(0)=n\right]=f(n)+{\mathbb E}\left[\int_0^{T_{m_0}\wedge k}G_Lf(L(\beta))d\beta\mid L(0)=n\right].\end{align*}

Therefore, for $n> m_0$ , we have

\begin{align*}0\leq {\mathbb E}\left[f(L(T_{m_0}\wedge k))\mid L(0)=n\right]\leq f(n)-\frac1{2}{\mathbb E}[T_{m_0}\wedge k\mid L(0)=n].\end{align*}

In particular, we have ${\mathbb E}[T_{m_0}\wedge k\mid L(0)=n]\leq 2f(n).$ Thus, letting $k\to\infty$ in this inequality yields ${\mathbb E}[T_{m_0}\mid L(0)=n]\leq 2f(n)<\infty.$ Since L is irreducible, the result follows by standard arguments.
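For a concrete check of the drift bound in the case $\theta=0$, take a point-mass measure $\mu=\lambda\delta_p$, so that $\sigma_{n,i}=\lambda p^i(1-p)^{n-i}$ in the computation above. The following sketch (hypothetical parameters) verifies numerically that $G_Lf(n)\leq -1/2$ for $n>m_0$ in a finite range:

```python
from math import comb, log, floor

sigma, lam, p = 0.9, 1.3, 0.35          # hypothetical parameters; theta = 0

def f(n):
    return sum(log(1 + 1 / i) / i for i in range(1, n))

def GL_f(n):
    """Generator of L (theta = 0) applied to f: coalescence, selective
    branching, and simultaneous branching with sigma_{n,i} = lam p^i (1-p)^(n-i)."""
    branch = sum(comb(n, i) * lam * p**i * (1 - p)**(n - i) * (f(n + i) - f(n))
                 for i in range(1, n + 1))
    return n * (n - 1) * (f(n - 1) - f(n)) + sigma * n * (f(n + 1) - f(n)) + branch

eps_star = 1.0 / (2 * lam * p + 2 * sigma)   # lam * p = integral of y mu(dy)
m0 = floor(1 / eps_star) + 1
assert all(GL_f(n) <= -0.5 for n in range(m0 + 1, 60))
print("G_L f(n) <= -1/2 for all tested n >", m0)
```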

The first part of the proof of Theorem 2.5(1) builds on the previous two lemmas. The system of equations (2.18) characterizing the tail probabilities $\mathbb{P}(L(\infty) \gt n)$ is obtained via Siegmund duality. More precisely, consider the continuous-time Markov chain $D\,{:\!=}\, (D(\beta))_{\beta \geq 0}$ with values in ${\mathbb N}^\dagger\,{:\!=}\, {\mathbb N}\cup\{\dagger\}$ with rates

\begin{align*}q_D(i,j)\,{:\!=}\, \left\{\begin{array}{l@{\quad}l} (i-1)(\sigma + \sigma_{i-1,1}) &\textrm{if $j=i-1$, $i>1$},\\ \\[-7pt] (i-1)\theta\nu_1 +i(i-1) &\textrm{if $j=i+1$, $i>1$},\\ \\[-7pt] \gamma_{i,j}-\gamma_{i,j-1} &\textrm{if $1\leq j<i$, $i>2$},\\ \\[-7pt] (i-1)\theta\nu_0 &\textrm{if $j=\dagger$, $i>1$,} \end{array}\right.\end{align*}

where $\dagger$ is a cemetery point, and where $\gamma_{i,j}\,{:\!=}\, \sum_{k=i-j}^{j}\binom{j}{k}\sigma_{j,k}$ if $1 \leq j<i\leq 2j$ and $\gamma_{i,j}\,{:\!=}\, 0$ otherwise (see Theorem 2.5). The states 1 and $\dagger$ are absorbing for D. The next result relates L and D via duality.

Lemma 5.3. (Siegmund duality.) The processes L and D are Siegmund dual; i.e.

\begin{align*}\mathbb{P}\left(L(\beta) \geq d\mid L(0)=\ell\right)=\mathbb{P}\left(\ell\geq D(\beta) \mid D(0)=d\right) \qquad {for\ all}\ \ell, d\in \mathbb{N},\ \beta\geq0.\end{align*}

Proof. Define $H\,{:}\,\mathbb{N}\times(\mathbb{N}\cup\{\dagger\})\rightarrow \{0,1\}$ via $H(\ell,d)\,{:\!=}\, 1_{\{\ell\geq d\}}$ and $H(\ell,\dagger)\,{:\!=}\, 0$, $\ell,d\in\mathbb{N}$. Let $G_L$ and $G_D$ be the infinitesimal generators of L and D, respectively. By [Reference Jansen and Kurt23, Proposition 1.2] we only have to show that $G_L H(\cdot,d)(\ell)=G_D H(\ell,\cdot)(d)$ for all $\ell,d\in\mathbb{N}$. From (2.16), we have

(5.8) \begin{align} & G_L H(\cdot,d)(\ell)\nonumber\\[3pt] &\quad =\sigma \ell \,1_{\{\ell+1=d\}} - (\ell-1)(\ell+\theta\nu_1)\,1_{\{\ell=d\}} -\theta\nu_0 \sum\limits_{j=1}^{\ell-1} 1_{\{j<d\leq \ell\}} + \sum\limits_{k=1}^\ell\binom{\ell}{k}\sigma_{\ell,k} \, 1_{\{\ell\lt d\leq \ell+k\}}\nonumber\\[3pt] &\quad =\sigma \ell \,1_{\{\ell+1=d\}} - (\ell-1)(\ell+\theta\nu_1)\,1_{\{\ell=d\}} -\theta\nu_0 (d-1) 1_{\{d\leq \ell\}}+ \gamma_{d,\ell} \, 1_{\{\ell\lt d\}}. \end{align}

Similarly, we have

(5.9) \begin{align} G_D H(\ell,\cdot)(d) &={\sigma (d-1) \,1_{\{d-1=\ell\}}} - (d-1)(d+\theta\nu_1)\,1_{\{\ell=d\}} -\theta\nu_0 (d-1) 1_{\{d\leq \ell\}}\nonumber\\&\quad + \sum\limits_{j=1}^{d-1}\left(\gamma_{d,j}-\gamma_{d,j-1}\right) \, 1_{\{j\leq \ell <d\}}. \end{align}

Summation by parts yields $\sum_{j=1}^{d-1}\left(\gamma_{d,j}-\gamma_{d,j-1}\right) \, 1_{\{j\leq \ell<d\}}{= \gamma_{d,\ell}1_{\{\ell\lt d\}}}$ . Thus, the result follows from comparing (5.9) with (5.8).
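The generator identity $G_LH(\cdot,d)(\ell)=G_DH(\ell,\cdot)(d)$ can also be verified numerically. The sketch below evaluates the final expressions in (5.8) and (5.9) for a point-mass Lévy measure $\mu=\lambda\delta_p$, for which $\sigma_{j,k}=\lambda p^k(1-p)^{j-k}$; all parameters are hypothetical.

```python
from math import comb

sigma, theta, nu0, lam, p = 0.9, 0.7, 0.4, 1.3, 0.35   # hypothetical
nu1 = 1.0 - nu0

def sig_jk(j, k):
    return lam * p**k * (1 - p)**(j - k)

def gamma(i, j):
    if 1 <= j < i <= 2 * j:
        return sum(comb(j, k) * sig_jk(j, k) for k in range(i - j, j + 1))
    return 0.0

def GL_H(l, d):   # final expression in (5.8)
    return (sigma * l * (l + 1 == d) - (l - 1) * (l + theta * nu1) * (l == d)
            - theta * nu0 * (d - 1) * (d <= l) + gamma(d, l) * (l < d))

def GD_H(l, d):   # expression in (5.9), before summation by parts
    return (sigma * (d - 1) * (d - 1 == l) - (d - 1) * (d + theta * nu1) * (l == d)
            - theta * nu0 * (d - 1) * (d <= l)
            + sum((gamma(d, j) - gamma(d, j - 1)) * (j <= l < d) for j in range(1, d)))

assert all(abs(GL_H(l, d) - GD_H(l, d)) < 1e-12
           for l in range(1, 16) for d in range(1, 16))
print("G_L H = G_D H on the tested grid")
```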

Now we have all the ingredients to prove Theorem 2.5(1).

Proof of Theorem 2.5(1) (ancestral type distribution). Since L is positive recurrent, L(T) converges in distribution as $T\to\infty$ towards the stationary distribution $\eta_L$ . In particular, we infer from Equation (5.7) that $h(x)\,{:\!=}\,\lim_{T\to\infty}h_T(x)$ exists and satisfies

\begin{align*}h(x) &=1-{\mathbb E}[(1-x)^{L(\infty)}]=1-\sum_{\ell=1}^\infty {\mathbb P}(L(\infty)=\ell)(1-x)^\ell\\& =\sum_{\ell=0}^\infty{\mathbb P}(L(\infty)>\ell)(1-x)^\ell-(1-x)\sum_{\ell=1}^\infty{\mathbb P}(L(\infty)>\ell-1)(1-x)^{\ell-1},\end{align*}

and Equation (2.17) follows. It remains to prove (2.18). From Lemma 5.3 we infer that $a_n=d_{n+1}$ , where

\begin{align*}d_n\,{:\!=}\, \mathbb{P}(\exists \beta>0\,{:}\, D(\beta)=1\mid D(0)=n), \quad n\geq 1.\end{align*}

Applying a first-step decomposition to the process D, we obtain, for $n>1$,

(5.10) \begin{align}&\left((n-1)(\sigma +\theta +n ) +{\gamma_{n,n-1}}\right)d_n \nonumber\\&\quad = (n-1)\sigma d_{n-1}+(n-1)(\theta\nu_1+n)d_{n+1} + \sum\limits_{j=1}^{n-1} (\gamma_{n,j}-\gamma_{n,j-1})d_j.\end{align}

Using summation by parts and rearranging terms in (5.10) yields

(5.11) \begin{equation}(\sigma +\theta +n ) d_n= \sigma d_{n-1}+(\theta\nu_1+n)d_{n+1} +\frac{1}{n-1} \sum\limits_{j=1}^{n-1} \gamma_{n,j}(d_j-d_{j+1}),\quad n>1.\end{equation}

The result follows.
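The summation-by-parts step from (5.10) to (5.11) is purely algebraic, so it can be checked against arbitrary values of $(d_j)$; a minimal sketch with the same toy coefficients $\gamma_{i,j}$ as in the previous snippet (hypothetical parameters):

```python
from math import comb
import random

sigma, theta, nu1, lam, p = 0.9, 0.7, 0.6, 1.3, 0.35   # hypothetical

def gamma(i, j):
    if 1 <= j < i <= 2 * j:
        return sum(comb(j, k) * lam * p**k * (1 - p)**(j - k)
                   for k in range(i - j, j + 1))
    return 0.0

random.seed(0)
d = [None] + [random.random() for _ in range(40)]   # arbitrary values d[1..40]
for n in range(2, 30):
    # (5.10), rewritten in the form lhs = 0
    lhs = (((n - 1) * (sigma + theta + n) + gamma(n, n - 1)) * d[n]
           - (n - 1) * sigma * d[n - 1] - (n - 1) * (theta * nu1 + n) * d[n + 1]
           - sum((gamma(n, j) - gamma(n, j - 1)) * d[j] for j in range(1, n)))
    # (n - 1) times (5.11), rewritten in the same form
    rhs = (n - 1) * ((sigma + theta + n) * d[n] - sigma * d[n - 1]
                     - (theta * nu1 + n) * d[n + 1]
                     - sum(gamma(n, j) * (d[j] - d[j + 1])
                           for j in range(1, n)) / (n - 1))
    assert abs(lhs - rhs) < 1e-10
print("(5.10) and (5.11) agree for arbitrary (d_j)")
```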

Proof of Corollary 2.1. Since $\theta = 0$ , the line-counting processes R and L have the same distribution. Hence, combining Lemma 5.1 and (2.8) (from Theorem 2.3) applied to $n=1$ , we obtain

(5.12) \begin{equation} h_T(x)={\mathbb E}[X(T)\mid X(0)=x],\end{equation}

which proves the first part of the statement. Moreover, for $\theta=0$ , X is a bounded submartingale, and hence X(T) almost surely has a limit as $T\to\infty$ , which we denote by $X(\infty)$ . Letting $T\to \infty$ in the identity (5.12) yields

(5.13) \begin{equation} h(x)={\mathbb E}[X(\infty)\mid X(0)=x].\end{equation}

Moreover, using (2.8) (from Theorem 2.3) with $n=2$ , we get

\begin{align*}{\mathbb E}[(1-X(T))^2\mid X(0)=x]={\mathbb E}[(1-x)^{L(T)}\mid L(0)=2].\end{align*}

Letting $T\to \infty$ and using that L is positive recurrent, we obtain

\begin{align*}{\mathbb E}[(1-X(\infty))^2\mid X(0)=x]=1-h(x).\end{align*}

Plugging (5.13) into the previous identity yields the desired result.

Proof of Proposition 2.2. Using Equation (2.18) in Theorem 2.5 for the two models, we obtain, for $n\in{\mathbb N}$,

\begin{align*}(n+1)\rho_{n+1}^{{\text{sel}}}=\sigma_\mu \rho_{n}^{{\text{sel}}},\quad\textrm{and}\quad(n+1)\rho_{n+1}^{{\text{env}}}= \frac{1}{n}\sum_{j=1}^n \gamma_{n+1,j}\,\rho_j^{{\text{env}}}.\end{align*}

Separately multiplying these equations by $z^n$ , $z\in[0,1]$ , and summing over $n\in{\mathbb N}$ , one obtains

\begin{align*}(p^{{\text{sel}}})'(z)=\rho_1^{{\text{sel}}}+\sigma_\mu\, p^{{\text{sel}}}(z),\quad\textrm{and}\quad (p^{{\text{env}}})'(z)=\rho_1^{{\text{env}}}+\sum_{j=1}^\infty \rho_j^{{\text{env}}} g_j(z),\end{align*}

with $g_j(z)\,{:\!=}\, \sum_{n=j}^{2j-1}\gamma_{n+1,j}\frac{z^n}{n}.$ Solving the ordinary differential equation for $p^{{\text{sel}}}$ via variation of constants, and using that $p^{{\text{sel}}}(0)=0$ and $p^{{\text{sel}}}(1)=1$, we obtain the desired formulas for $\rho_1^{{\text{sel}}}$ and $p^{{\text{sel}}}$ (see also [Reference Cordero and Möhle13, Theorem 6.1]). Now, using the definitions of the coefficients $\gamma_{n+1,j}$ (given below Equation (2.18)) and $\sigma_{m,k}$ (see (2.5)), followed by a straightforward calculation, one obtains

\begin{align*}g_j(z) =\sum_{k=1}^{j}\binom{j}{k}\sigma_{j,k} \int\limits_0^z \frac{u^{j-1}-u^{k+j-1}}{1-u} {\textrm{d}} u=\int\limits_0^z{\textrm{d}} u\, \frac{u^{j-1}}{1-u}\int\limits_{(0,1)}\mu({\textrm{d}} y)\,(1-(1-y(1-u))^j).\end{align*}

Since $(1-h)^j\geq 1-jh$ for $h\in(0,1)$ , we infer that $g_j(z)\leq \sigma_\mu z^{j}$ , with equality only if $z=0$ or $j=1$ . We conclude that

\begin{align*}(p^{{\text{env}}})'(z)<\rho_1^{{\text{env}}}+\sigma_\mu \, p^{{\text{env}}}(z), \quad z\in(0,1].\end{align*}

Letting $f(z)\,{:\!=}\, \rho_1^{{\text{env}}}+\sigma_\mu \, p^{{\text{env}}}(z)$ , we then have $f'(z)/f(z) \leq \sigma_\mu$ , so, after integration, $\log(f(z)/f(0)) \leq \sigma_\mu z$ . Since $p^{{\text{env}}}(0)=0$ , this yields

\begin{align*}p^{{\text{env}}}(z)\leq \rho_1^{{\text{env}}}\left(\frac{e^{\sigma_\mu z}-1}{\sigma_\mu}\right)=\frac{\rho_1^{{\text{env}}}}{\rho_1^{{\text{sel}}}}\,p^{{\text{sel}}}(z).\end{align*}

Moreover, since $p^{{\text{env}}}(1)=p^{{\text{sel}}}(1)=1$ , we conclude that $\rho_1^{{\text{env}}}\geq\rho_1^{{\text{sel}}}$ . Assume now that $\rho_1^{{\text{env}}}=\rho_1^{{\text{sel}}}$ . It follows that $p^{{\text{env}}}(z)\leq p^{{\text{sel}}}(z)$ for $z\in[0,1]$ . Hence,

\begin{align*}1=\!\!\int\limits_{0}^1 (p^{{\text{env}}})'(z){\textrm{d}} z<\!\!\int\limits_{0}^1 (\rho_1^{{\text{env}}}+\sigma_\mu \, p^{{\text{env}}}(z)){\textrm{d}} z\leq\!\!\int\limits_{0}^1 (\rho_1^{{\text{sel}}}+\sigma_\mu \, p^{{\text{sel}}}(z)){\textrm{d}} z=\!\! \int\limits_{0}^1 (p^{{\text{sel}}})'(z){\textrm{d}} z=1,\end{align*}

which is a contradiction. Thus, $\rho_1^{{\text{env}}}>\rho_1^{{\text{sel}}}$ . In particular, for $z\neq 0$ sufficiently small, $p^{{\text{env}}}(z)>p^{{\text{sel}}}(z)$ . Hence, the last statement follows from the fact that $h^{{\text{env}}}(z)=1-p^{{\text{env}}}(1-z)$ and $h^{{\text{sel}}}(z)=1-p^{{\text{sel}}}(1-z)$ .
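The estimate $g_j(z)\leq\sigma_\mu z^j$ used above can be checked by quadrature for a point-mass measure $\mu=\lambda\delta_p$, for which $\sigma_\mu=\int_{(0,1)}y\,\mu({\textrm{d}} y)=\lambda p$; a minimal sketch (hypothetical parameters; note the equality at $j=1$):

```python
import numpy as np

lam, p, z = 1.3, 0.35, 0.9          # hypothetical; mu = lam * delta_p
sigma_mu = lam * p
u = np.linspace(0.0, z, 200001)
for j in range(1, 8):
    integrand = u**(j - 1) / (1 - u) * lam * (1 - (1 - p * (1 - u))**j)
    g_j = float(np.sum((integrand[1:] + integrand[:-1]) * np.diff(u)) / 2)
    print(j, round(g_j, 6), "<=", round(sigma_mu * z**j, 6))
```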

6. Quenched results

6.1. Quenched results related to Section 2.5

In this section we prove the quenched parts of the results stated in Section 2.5. We start with the proof of the second part of Theorem 2.3, which establishes the quenched moment duality (2.9) for almost every environment $\omega$ .

Proof of Theorem 2.3 (Part II: quenched moment duality). Since both sides of (2.9) are right-continuous in T, it is sufficient to prove that, for any bounded measurable function $g\,{:}\,{\mathbb D}^\star_T\to{\mathbb R}$ ,

(6.1) \begin{equation} {\mathbb E}[(1-X^J(T))^n g((J_s)_{s\in[0,T]})\!\mid\! {X^J(0)=x}]\!={\mathbb E}[(1-x)^{{R_T^J(T-\!)}} g((J_s)_{s\in[0,T]})\!\mid \!{R_T^J(0-\!)}=n].\end{equation}

Let $\mathcal{H}$ denote the set of bounded measurable functions $g\,{:}\,{\mathbb D}_T^\star \to{\mathbb R}$ for which (6.1) holds. Thanks to the annealed moment duality, Equation (2.8), every constant function belongs to $\mathcal{H}$. Moreover, $\mathcal{H}$ is closed under increasing limits of non-negative bounded functions in $\mathcal{H}$. We claim that (6.1) holds for functions of the form $g(\omega)=g_1(\omega(t_1))\cdots g_k(\omega(t_k))$, with $0<t_1<\cdots<t_k<T$ and $g_i\in\mathcal{C}^2([0,\infty))$ with compact support. If the claim is true, then thanks to the monotone class theorem, $\mathcal{H}$ will contain every bounded measurable function g, which will then complete the proof.

We prove the claim by induction on k. For $k=1$ , we need to prove that for $t_1\in(0,T)$ ,

(6.2) \begin{equation} {\mathbb E}[(1-X^J(T))^n g_1(J(t_1))\!\mid\! X^J(0)=x]\!={\mathbb E}[(1-x)^{{R_T^J(T-\!)}} g_1(J({t_1}))\!\mid\! {R_T^J(0-\!)}=n].\end{equation}

Note first that, using the Markov property for $X^J$ in $[0,t_1]$ , Equation (2.8), and the fact that $t_1$ and T are almost surely continuity times for J, we obtain

\begin{align*} & {\mathbb E}\left[(1-X^J(T))^n g_1(J(t_1))\mid X^J(0)=x\right]\\ &\quad = {\mathbb E}\left[g_1(J(t_1))\hat{{\mathbb E}}\left[(1-\hat{X}^{\hat{J}}(T-t_1))^n\mid \hat{X}^{\hat{J}}(0)=X^J(t_1)\right]\mid X^J(0)=x\right]\\ &\quad = {\mathbb E}\left[g_1(J(t_1))\hat{{\mathbb E}}\left[(1-X^J(t_1))^{{{\hat{R}}_{T-t_1}^{\hat{J}}((T-t_1)-\!)}}\mid {\hat{R}_{T-t_1}^{\hat{J}}(0-\!)}=n\right]\mid X^J(0)=x\right],\end{align*}

where the subordinator $\hat{J}$ is defined via $\hat{J}(h)\,{:\!=}\, J(t_1+h)-J(t_1)$ . The processes $\hat{X}^{\hat{J}}$ and $\hat{R}_{T-t_1}^{\hat{J}}$ are independent copies of $X^J$ and $R_{T-t_1}^J$ , which are driven by $\hat{J}$ (which is in turn independent of $(J(u))_{u\in[0,t_1]}$ ). Using first Fubini’s theorem, then Equation (2.7) in Theorem 2.3 and the fact that 0 and $t_1$ are almost surely continuity times for J, we find that the last expression equals

\begin{align*} &\hat{{\mathbb E}}\left[{\mathbb E}\left[g_1(J(t_1))(1-X^J(t_1))^{{{\hat{R}}_{T-t_1}^{\hat{J}}((T-t_1)-\!)}}\mid X^J(0)=x\right]\mid {{\hat{R}}_{T-t_1}^{\hat{J}}(0-\!)}=n\right]\\ &\quad= \hat{{\mathbb E}}\left[{\mathbb E}\left[g_1(J(t_1))(1-x)^{{R_{t_1}^J(t_1-\!)}}\mid {R_{t_1}^J(0-\!)}={{\hat{R}}_{T-t_1}^{\hat{J}}((T-t_1)-\!)}\right]\mid {{\hat{R}}_{T-t_1}^{\hat{J}}(0-\!)}=n\right]. \end{align*}

The proof of the claim for $k=1$ is achieved using the Markov property for $R_T^J$ in the (backward) interval $[0,T-t_1]$ . Let us now assume that the claim is true up to $k-1$ . We proceed as before to prove that the claim holds for k. For $j\in[k]$ , define $G^{\omega}_{j,k}(z)\,{:\!=}\,\prod_{i=j}^k g_i(z+\omega(t_i-t_1))$ . Using the Markov property for $X^J$ in $[0,t_1]$ followed by the inductive step, we obtain

\begin{align*} & {\mathbb E}\Bigg[(1-X^J(T))^n \prod_{i=1}^k g_i(J(t_i))\!\mid\! X^J(0)=x\Bigg]\\&\quad = {\mathbb E}\left[g_1(J(t_1))\hat{{\mathbb E}}\!\left[(1-\!{\hat{X}}^{\hat{J}}(T-t_1))^n\, G^{\hat{J}}_{2,k}(J(t_1))\!\mid\! {\hat{X}}^{\hat{J}}(0)=X^J(t_1)\right]\mid X^J(0)=x\right]\\&\quad = {\mathbb E}\left[g_1(J(t_1))\hat{{\mathbb E}}\!\left[(1-\!X^J(t_1))^{{{\hat{R}}_{T-t_1}^{\hat{J}}((T-t_1)-\!)}}\,G^{\hat{J}}_{2,k}(J(t_1))\!\mid\! {{\hat{R}}_{T-t_1}^{\hat{J}}(0-\!)}=n\right]\!\mid\! X^J(0)=x\right]\!. \end{align*}

By Fubini’s theorem, the reinforced duality Equation (2.7), and the fact that 0 and $t_1$ are almost surely continuity times for J, the last expression equals

\begin{align*} &\hat{{\mathbb E}}\!\left[{{\mathbb E}}\!\left[(1-X^J(t_1))^{{{\hat{R}}_{T-t_1}^{\hat{J}}((T-t_1)-\!)}} G^{\hat{J}}_{1,k}(J(t_1))\!\mid\! X^J(0)=x\right]\! \mid\! {{\hat{R}}_{T-t_1}^{\hat{J}}(0-\!)}=n\right]\\ &\quad =\hat{{\mathbb E}}\!\left[{{\mathbb E}}\!\left[(1-x)^{{R_{t_1}^J\!(t_1-\!)}} G^{\hat{J}}_{1,k}(J(t_1))\!\mid\! {R_{t_1}^J\!(0-\!)}\,{=}\,{{\hat{R}}_{T-t_1}^{\hat{J}}\!((T-t_1)-\!)}\right]\! \mid\! {{\hat{R}}_{T-t_1}^{\hat{J}}(0-\!)}= n\right].\end{align*}

The result follows from applying the Markov property for $R^J_T$ in the (backward) interval $[0,T-t_1]$ .

Proof of Theorem 2.4(2) (asymptotic type frequency). Let $\omega$ be such that Equation (2.9) holds between $-\tau$ and 0. In particular,

(6.3) \begin{equation} \mathbb{E} \left [ (1-X^\omega(0))^n|X^\omega(\!-\tau)=x \right ]= \mathbb{E} \left [ (1-x)^{R_0^\omega(\tau-\!)}|R_0^\omega(0-\!)=n \right ].\end{equation}

Since we assume that $\theta \gt0$ and $\nu_0, \nu_1 \in (0,1)$ , the right-hand side converges to $\Pi_n(\omega)$ (defined in (2.10)), which proves that the moment of order n of $1-X^\omega(0)$ conditionally on $\{ X^\omega(\!-\tau) = x \}$ converges to $\Pi_n(\omega)$ . Since we are dealing with random variables supported on [0, 1], the convergence of the positive integer moments proves the convergence in distribution and the fact that the limit distribution $\mathcal{L}^\omega$ satisfies (2.13).

It remains to prove (2.14). For $\upsilon\in\mathcal{M}_1(\mathbb{N}_0^\dagger)$ with finite support, let $\upsilon^\omega_s$ denote the distribution of $R_0^\omega(s-\!)$ given that $R_0^\omega(0-\!)\sim\upsilon$ . Let $T_{0, \dagger}^\omega$ be the absorption time of $R_0^\omega$ at $\{ 0, \dagger \}$ . Note that $T_{0, \dagger}^\omega$ is stochastically bounded by an exponential random variable with parameter $\theta\nu_0$ . Therefore,

\begin{align*}\upsilon^\omega_\tau(\mathbb{N}) = \mathbb{P}_{\upsilon} \left ( R_0^\omega(\tau-\!) \in \mathbb{N} \right ) = \mathbb{P}_{\upsilon} \left ( T_{0, \dagger}^\omega \gt \tau \right ) \leq e^{- \theta\nu_0 \tau}. \end{align*}

Hence, we have

\begin{align*} \mathbb{P}_{\upsilon} \left ( \exists s\geq 0 \ \text{s.t.} \ R_0^\omega(s)=0 \right ) &= \upsilon_{\tau}^\omega(\{0\}) + \sum_{k \geq 1} \mathbb{P}_{\upsilon} \left ( R_0^\omega(\tau-\!) = k \ \& \ \exists s\geq \tau \ \text{s.t.} \ R_0^\omega(s)=0 \right ) \\ &\quad \leq \upsilon_{\tau}^\omega(\{0\}) + \upsilon_\tau^\omega(\mathbb{N}) \leq \upsilon_\tau^\omega(\{0\}) + e^{-\theta \nu_0 \tau}.\end{align*}

Thus, we obtain

(6.4) \begin{eqnarray} \upsilon_{\tau}^\omega(\{0\}) \leq \mathbb{P}_{\upsilon} \!\left ( \exists s\geq 0 \ \text{s.t.} \ R_0^\omega(s)=0 \right ) \leq \upsilon_{\tau}^\omega(\{0\}) + e^{-\theta\nu_0 \tau}. \end{eqnarray}

Similarly, we have

\begin{align*}\mathbb{E}_{\upsilon} \left [ (1-x)^{R_0^\omega(\tau-\!)} \right ] = \sum_{k \geq 0} (1-x)^k \upsilon_{\tau}^\omega(\{k\}) \leq \upsilon_{\tau}^\omega(\{0\}) + \upsilon_\tau^\omega(\mathbb{N}) \leq \upsilon_{\tau}^\omega(\{0\}) + e^{-\theta\nu_0 \tau}.\end{align*}

Hence,

(6.5) \begin{eqnarray}\upsilon_{\tau}^\omega(\{0\})\leq \mathbb{E}_{\upsilon} [ (1-x)^{R_0^\omega(\tau-\!)} ] \leq \upsilon_{\tau}^\omega(\{0\}) + e^{-\theta\nu_0 \tau}. \end{eqnarray}

Recall from Section 2.5 that $\Pi_n(\omega)\,{:\!=}\, \mathbb{P}(\exists s\geq 0 \ \text{s.t.} \ R_0^\omega(s)=0 \mid R_0^\omega(0-\!)=n)$ . Choosing $\upsilon = \delta_n$ in (6.4) and in (6.5) and subtracting both inequalities, we get

\begin{align*} \left | \mathbb{E} \left [ (1-x)^{R_0^\omega(\tau-\!)}\mid R_0^\omega(0-\!)=n \right ] - \Pi_n(\omega) \right | \leq e^{-\theta\nu_0 \tau}. \end{align*}

This inequality together with (2.9) (i.e. the quenched moment duality) yields the desired result.

Proof of Proposition 2.1. Let $\omega\in{\mathbb D}^\star$ be such that (2.9) holds. Let $J_\omega\,{:\!=}\, J\otimes_{\tau_\star} \omega$ . Consider the process $X^{J_\omega}$ in $[\!-\tau,0]$ with $\tau>{\tau_\star^{}}$ . Using the Markov property, we obtain

\begin{align*}&{\mathbb E}\!\left[(1-X^{J_\omega}(0))^n\mid X^{J_\omega}(\!-\tau)=x\right]\\&\quad =\int_0^1\!\!{\mathbb E}\!\left[(1-X^{\omega}(0))^n\mid X^{\omega}(\!-{\tau_\star^{}})=y\right]{\mathbb P}(X(\!-{\tau_\star^{}})\in {\textrm{d}} y\mid X(\!-\tau)=x),\end{align*}

where X is the solution to (1.3) with subordinator J. Combining the previous identity with (2.9) for $X^\omega$ in $(\!-\tau_\star,0)$ , and using the translation invariance of X, we obtain

\begin{align*}&{\mathbb E}\!\left[(1-X^{J_\omega}(0))^n\mid X^{J_\omega}(\!-\tau)=x\right]=\\&\quad \!\int_0^1\!\!{\mathbb E}\!\left[(1-y)^{R_0^\omega(\tau_\star^{} -\!)}\mid R_0^{\omega}(0-\!)=n\right]{\mathbb P}(X(\tau-\tau_\star)\in {\textrm{d}} y\mid X(0)=x).\end{align*}

Hence, letting $\tau\to\infty$ and using Theorem 2.4(1), we get

\begin{align*}&\lim_{\tau\to\infty}{\mathbb E}\left[(1-X^{J_\omega}(0))^n\mid X^{J_\omega}(\!-\tau)=x\right]\\&\quad =\int_0^1\!\!{\mathbb E}\left[(1-y)^{R_0^\omega(\tau_\star^{} -\!)}\mid R_0^{\omega}(0-\!)=n\right]{\mathbb P}(X(\infty)\in {\textrm{d}} y),\end{align*}

and the result follows from Equation (2.11) in Theorem 2.4(1).

6.2. Quenched results related to Section 2.6

This section is devoted to the proof of Theorem 2.5(2), which describes the asymptotic behavior of the ancestral type distribution.

Lemma 6.1. For all $T \geq 0$ , $x\in[0,1]$ , and $\omega\in{\mathbb D}^\star$ , we have

(6.6) \begin{align} h^{\omega}_T(x) =1-\mathbb{E}[(1-x)^{L_T^\omega(T-\!)}\mid L_T^\omega(0-\!)=1]. \end{align}

Proof. The proof is analogous to the proof of Lemma 5.1.

Now we proceed to prove Theorem 2.5(2).

Proof of Theorem 2.5(2) (ancestral type distribution). Recall that $\theta\nu_0>0$ by assumption. For $\mu\in\mathcal{M}_1(\mathbb{N})$ , we denote by $\mu^\omega_T(\beta)$ the distribution of $L_T^\omega(\beta-\!)$ given that $L_T^\omega(0-\!)\sim\mu$ . Let $t \gt s \gt 0$ . Note that we have $\mu^{\omega}_{t}(t) = (\mu^{\omega}_{t}(t - s))^{\omega}_{s}(s)$ , so

(6.7) \begin{eqnarray}d_{TV} (\mu^{\omega}_{t}(t),\mu^{\omega}_{s}(s)) = d_{TV} ( (\mu^{\omega}_{t}(t - s))^{\omega}_{s}(s), \mu^{\omega}_{s}(s)), \end{eqnarray}

where $d_{TV}(\mu_1,\mu_2)$ stands for the total variation distance between $\mu_1$ and $\mu_2$ .

Assume now that $L_T^\omega(0-\!)\sim\mu$. By construction, $L_T^\omega$ jumps from any state i to the state 1 with rate $q^0(i,1)\geq \theta\nu_0 \gt 0$ (see (2.16)). Let $\hat{L}_T^\omega$ be a process with initial distribution $\mu$, evolving as $L_T^\omega$, but jumping from i to 1 at rate $q^0(i,1) - \theta\nu_0 \geq 0$. We decompose the dynamics of $L_T^\omega$ as follows: (1) $L_T^\omega$ evolves as $\hat{L}_T^\omega$ on $[0, \xi]$, where $\xi$ is an independent exponential random variable with parameter $\theta\nu_0$; (2) at time $\xi$, $L_T^\omega$ jumps to the state 1 regardless of its current position; and (3) conditionally on $\xi$, $L_T^\omega$ has the same law on $[\xi, \infty)$ as an independent copy of $L_{T-\xi}^\omega$ started with one line. This decomposition allows us to couple $L_T^\omega$ to a copy of it, $\tilde{L}_T^\omega$, with starting law $\tilde \mu$, so that the two processes are equal on $[\xi, \infty)$. Since $L_T^\omega(T-\!) \sim \mu^{\omega}_T(T)$ and $\tilde{L}_T^\omega(T-\!) \sim \tilde\mu^{\omega}_T(T)$, we have

(6.8) \begin{eqnarray}d_{TV} (\mu^{\omega}_T(T), \tilde \mu^{\omega}_T(T)) \leq \mathbb{P} \left ( \tilde{L}_T^\omega(T-\!) \neq {L}_T^\omega(T-\!) \right ) \leq \mathbb{P} \left ( \xi \gt T \right ) = e^{-\theta\nu_0 T}. \end{eqnarray}

This, together with (6.7), implies that for any $\mu\in\mathcal{M}_1({\mathbb N})$ and any $t \gt s \gt 0$ ,

(6.9) \begin{eqnarray}\ d_{TV} (\mu^{\omega}_{t}(t), \mu^{\omega}_{s}(s)) \leq e^{-\theta\nu_0 s}. \end{eqnarray}

In particular, $(\mu^{\omega}_{t}(t))_{t \gt 0}$ is Cauchy as $t \rightarrow \infty$ for the total variation distance. Therefore, $(\mu^{\omega}_{t}(t))_{t \gt 0}$ has a limit $\mu^{\omega}\in\mathcal{M}_1({\mathbb N})$ . Moreover, (6.8) implies that $\mu^\omega$ does not depend on $\mu$ , and the first part of Theorem 2.5(2) is proved. The identity (2.19) then follows by Lemma 6.1.

Setting $s = T$ and letting $t\to\infty$ in (6.9) yields $d_{TV} (\mu^\omega, \mu^{\omega}_{T}(T)) \leq e^{-\theta\nu_0 T}$ . Since $h^{\omega}(x) = 1 - \mathbb{E} \left[(1-x)^{Z_{\infty}^\omega} \right]$ and $h^{\omega}_T(x) = 1 - \mathbb{E} \left[(1-x)^{Z_{T}^\omega} \right]$ , where $Z_{\infty}^\omega \sim (\delta_1)^{\omega}$ and $Z_{T}^\omega \sim (\delta_1)^{\omega}_{T}(T)$ , we get

\begin{align*}\lvert h_T^\omega(x)-h^\omega(x)\rvert\leq d_{TV}((\delta_1)^{\omega}, (\delta_1)^{\omega}_{T}(T))\leq e^{-\theta\nu_0 T},\end{align*}

completing the proof.

7. Further quenched results for simple environments

In this section we provide, for simple environments, extensions and refinements of the results obtained in Sections 2.5 and 2.6 in the quenched setting. Recall the quenched diffusion $X^\omega$ defined in Section 2.3.

7.1. Extensions of quenched results in Section 2.5

First we extend the main quenched results in Section 2.5, which hold for almost every environment, to any simple environment.

Theorem 7.1. (Quenched moment duality for simple environments.) The quenched moment duality (2.9) holds for any simple environment.

The proof of Theorem 7.1 has two main ingredients: a moment duality between the jumps of the environment, and a moment duality at the jumps. These results are covered by the next two lemmas.

Lemma 7.1. (Quenched moment duality between the jumps.) Let $0\leq s<t\leq T$ , and assume that $\omega$ has no jumps in (s,t). For all $x\in[0,1]$ and $n\in \mathbb{N}$ , we have

\begin{align*} \mathbb{E} \left [ (1-X^\omega(t-\!))^n \mid X^\omega(s)=x \right ]= \mathbb{E} \left [ (1-x)^{R_T^\omega((T-s)-\!)}\mid R_{T}^\omega(T-t)=n \right ]. \end{align*}

Proof. On (s, t), the process $X^\omega$ evolves as in the annealed case with $\mu=0$, and so does $R_T^\omega$ on the corresponding backward interval $(T-t,T-s)$. Therefore, the result follows from applying Theorem 2.3 with $\mu = 0$.

Lemma 7.2. (Quenched moment duality at jumps.) Assume that $\omega\in{\mathbb D}^\star$ is simple and has a jump at time $t<T$ . Then, for all $x\in[0,1]$ and $n\in \mathbb{N}$ , we have

\begin{align*} \mathbb{E} \left [ (1-X^\omega(t))^n \mid X^\omega(t-\!)=x \right ]= \mathbb{E}\left [ (1-x)^{R_{T}^\omega(T-t)}\mid R_{T}^\omega((T-t)-\!)=n \right ]. \end{align*}

Proof. On the one hand, since $X^\omega(t) = X^\omega(t-\!) + X^\omega(t-\!)(1-X^\omega(t-\!))\Delta \omega(t)$ almost surely, we have

(7.1) \begin{equation}\mathbb{E}\left [ (1-X^\omega(t))^n\mid X^\omega(t-\!)=x \right ] = \left [ 1 - x (1 + (1-x) \Delta \omega(t)) \right ]^n = \left [(1-x)(1-x\Delta\omega(t))\right]^n.\end{equation}

On the other hand, conditionally on $\{R_{T}^\omega((T-t)-\!)=n\}$ , we have $R_{T}^\omega(T-t) \sim n + Y$ where $Y \sim \textrm{Bin}({n},{\Delta \omega(t)})$ . Therefore,

(7.2) \begin{align}\mathbb{E} \left [ (1-x)^{R_{T}^\omega(T-t)}\mid R_{T}^\omega((T-t)-\!)=n \right ] &= \mathbb{E}\left [ (1-x)^{n+Y} \right ] \nonumber \\ &= (1-x)^n \left [ 1-\Delta \omega(t) + \Delta \omega(t)(1-x) \right ]^n =\left [(1-x)(1-x\Delta\omega(t))\right]^n. \end{align}

The combination of (7.1) and (7.2) yields the result.

Proof of Theorem 7.1. Let $\omega$ be a simple environment. Let $(t_i)_{i=1}^m$ be the increasing sequence of jump times of $\omega$ in [0, T]. Without loss of generality we assume that 0 and T are both jump times of $\omega$ . In particular, $t_1 = 0$ and $t_m=T$ . Let $(X^\omega( s))_{ s \in[0,T]}$ and $(R_T^\omega(\beta))_{\beta\in[0,T]}$ be independent realizations of the Wright–Fisher process and the line-counting process of the k-ASG, respectively. For $s\in[0,T]$ , denote by $\mu_s^\omega(x,\cdot)$ and $\bar{\mu}_s^\omega(x,\cdot)$ the laws of $X^\omega(s)$ and $X^\omega(s-\!)$ , respectively, given that $X^\omega(0)=x$ . Partitioning with respect to the values of $X^\omega(t_m-\!)$ and using Lemma 7.2 at $t=T$ , we get

\begin{align*}&\mathbb{E} \left [ (1-X^\omega(T))^n\mid X^\omega(0)=x \right ]= \int_0^1 \mathbb{E} \left [ (1-X^\omega(t_m))^{n}\mid X^\omega(t_m-\!)=y \right ] \bar{\mu}_{t_m}^\omega(x,{\textrm{d}} y)\\ &\quad= \int_0^1 \mathbb{E} \left [ (1-y)^{R_T^\omega(0)}\mid R_{T}^\omega(0-\!) = n \right ] \bar{\mu}_{t_m}^\omega(x,{\textrm{d}} y) \\ &\quad = \mathbb{E} \left [ (1-X^\omega(t_m-\!))^{R_T^\omega(0)}\mid R_T^\omega(0-\!)=n, X^\omega(0)=x \right ]=: \,I_T^\omega(x,n).\end{align*}

For $t<T$ , set $q_{n,k}^\omega(t)\,{:\!=}\,\mathbb{P} \left ( R_T^\omega(t) = k \mid R_T^\omega(0-\!)=n \right ) $ . Partitioning with respect to the values of $X^\omega(t_{m-1})$ and $R_T^\omega(0)$ , and using Lemma 7.1, we get

\begin{align*}I_T^\omega(x,n) &= \sum_{k \in \mathbb{N}_0^\dagger} \mathbb{E} \left [ (1-X^\omega(t_m-\!))^{k}\mid X^\omega(0)=x \right ] q_{n,k}^\omega(0) \\&\quad = \sum_{k \in \mathbb{N}_0^\dagger}q_{n,k}^\omega(0) \int_0^1 \mathbb{E}\left [ (1-X^\omega(t_m-\!))^k \mid X^\omega(t_{m-1})=y \right ] {\mu}_{t_{m-1}}^\omega(x,{\textrm{d}} y) \\&\quad = \sum_{k \in \mathbb{N}_0^\dagger}q_{n,k}^\omega(0) \int_0^1 \mathbb{E} \left [ (1-y)^{R_T^\omega((T-t_{m-1})-\!)}\mid R_T^\omega(0)=k \right ] \mu_{t_{m-1}}^\omega(x,{\textrm{d}} y) \\&\quad = \mathbb{E} \left [ (1-X^\omega(t_{m-1}))^{R_T^\omega((T-t_{m-1}^{})-\!)}\mid R_T^\omega(0-\!)=n, X^\omega(0)=x \right ].\end{align*}

If $m=2$ , the proof of (2.9) is already complete. If $m > 2$ , we continue as follows. Partitioning with respect to the values of $R_T^\omega((T-t_{m-1})-\!)$ and of $X^\omega(t_{m-1}^{}-\!)$ , and using Lemma 7.2, we obtain

\begin{align*} &I_T^\omega(x,n)= \sum_{k \in \mathbb{N}_0^\dagger} \mathbb{E} \left [ (1-X^\omega(t_{m-1}))^{k}\mid X^\omega(0)=x \right ] q_{n,k}^\omega((T-t_{m-1})-\!) \\& \quad = \sum_{k \in {\mathbb N}_0^\dagger} q_{n,k}^\omega((T-t_{m-1})-\!)\int_0^1 \mathbb{E} \left [ (1-X^\omega(t_{m-1}))^{k}\mid X^\omega(t_{m-1}-\!)=y \right] \bar{\mu}_{t_{m-1}}^\omega(x,{\textrm{d}} y) \\&\quad = \sum_{k \in \mathbb{N}_0^\dagger}\!\!q_{n,k}^\omega((T-t_{m-1})-\!)\! \int_0^1\! \!\!\mathbb{E}\!\left [(1-y)^{R_T^\omega(T-t_{m-1})}\!\mid\! R_{T}^\omega((T-t_{m-1})-\!) \!= \! k \right ] \bar{\mu}_{t_{m-1}}^\omega\!(x,{\textrm{d}} y) \\&\quad = \mathbb{E}\left [ (1-X^\omega(t_{m-1}-\!))^{R_T^\omega(T-t_{m-1})}\mid R_T^\omega(0-\!)=n, X^\omega(0)=x \right ].\end{align*}

Iterating this procedure, using successively Lemma 7.1 and Lemma 7.2 (the first one is applied on the intervals $(t_{i-1}, t_i)$ , while the second one is applied at the times $t_{i}$ ), we finally obtain

\begin{align*}\mathbb{E}[ (1-X^\omega(T))^n\mid X^\omega(0)=x ]= \mathbb{E} \left [ (1-x)^{R_T^\omega(T-\!)}\mid R_T^\omega(0-\!)=n \right ], \end{align*}

which ends the proof.

Theorem 7.2. (Quenched asymptotic type frequency for simple environments.) The statement of Theorem 2.4(2) holds for any simple environment.

Proof. The proof is analogous to the proof of Theorem 2.4(2), but using Theorem 7.1 instead of Theorem 2.3.

Refinements for $\sigma=0$ . Under this additional assumption, we provide a more explicit expression for $\Pi_n(\omega)$ (defined in (2.10)). This is possible thanks to the following explicit diagonalization of $Q_\dagger^0$ (the transition matrix of R under the null environment).

Lemma 7.3. Assume that $\sigma=0$ , and set $\lambda_k^\dagger\,{:\!=}\, -q_\dagger^0(k,k)$ for $k\in{\mathbb N}_0^\dagger$ and $\gamma_k^\dagger\,{:\!=}\, q_\dagger^0(k,k-1)$ for $k\in{\mathbb N}$ , where $q_\dagger^\mu(\cdot,\cdot)$ is defined in (2.6). In addition, let $D_\dagger$ be the diagonal matrix with diagonal entries $(\!-\lambda_i^\dagger)_{i\in{\mathbb N}_0^\dagger}$ , and let $U_\dagger\,{:\!=}\, (u_{i,j}^\dagger)_{i,j\in{\mathbb N}_0^\dagger}$ and $V_\dagger\,{:\!=}\, (v_{i,j}^\dagger)_{i,j\in{\mathbb N}_0^\dagger}$ be defined via

(7.3) \begin{equation}u_{i,j}^\dagger \,{:\!=}\, \left\{\begin{array}{l@{\quad}l} \prod\limits_{\ell=j+1}^{i} \left ( \frac{\gamma_{\ell}^\dagger}{\lambda_\ell^\dagger - \lambda_j^\dagger} \right ) & for\ \textrm{$i\in{\mathbb N}_0\ \&\ j\in[i]_0$,}\\ \\[-7pt] 0 &for\ \textrm{$i\in{\mathbb N}_0\ \&\ j>i$, or $i=\dagger\ \&\ j\in{\mathbb N}_0$,}\\ \\[-7pt] \theta \nu_0 \sum\limits_{k = 1}^{i} \frac{k}{\lambda_{k}^\dagger} \prod\limits_{\ell=k+1}^{i} \frac{\gamma_{\ell}^\dagger}{\lambda_\ell^\dagger} &for\ \textrm{$i\in{\mathbb N}_0\ \&\ j=\dagger$},\\ \\[-7pt] 1 &for\ \textrm{$i=j=\dagger$,} \end{array}\right.\end{equation}
(7.4) \begin{equation}v_{i,j}^\dagger \,{:\!=}\, \left\{\begin{array}{l@{\quad}l} \prod\limits_{\ell=j}^{i-1} \left ( \frac{-\gamma_{\ell+1}^\dagger}{\lambda_i^\dagger - \lambda_\ell^\dagger} \right ) & for\ \textrm{$i\in{\mathbb N}_0\ \&\ j\in[i]_0$,}\\ \\[-7pt] 0 &for\ \textrm{$i\in{\mathbb N}_0\ \&\ j>i$, or $i=\dagger\ \&\ j\in{\mathbb N}_0$,}\\ \\[-7pt] \frac{- \theta \nu_0}{ \lambda_i^\dagger} \sum\limits_{k = 1}^{i} k \prod\limits_{\ell=k}^{i-1} \left ( \frac{- \gamma_{\ell+1}^\dagger}{\lambda_i^\dagger - \lambda_\ell^\dagger} \right ) & for\ \textrm{$i\in{\mathbb N}_0\ \&\ j=\dagger$},\\ \\[-7pt] 1 &for\ \textrm{$i=j=\dagger$,} \end{array}\right.\end{equation}

with the convention that an empty sum equals 0 and an empty product equals 1. Then we have $Q_\dagger^0=U_\dagger D_\dagger V_\dagger$ and $U_\dagger V_\dagger=V_\dagger U_\dagger={\text{I}}$ , where ${\text{I}}$ denotes the identity matrix.

Proof of Lemma 7.3. For any $i\in{\mathbb N}_0^\dagger$ , let $e_i\,{:\!=}\, (e_{i,j})_{j\in{\mathbb N}_0^\dagger}$ be the vector defined via $e_{i,i}\,{:\!=}\,1$ and $e_{i,j}\,{:\!=}\,0$ for $j\neq i$ . Order ${\mathbb N}_0^\dagger$ as $\{ \dagger, 0, 1, 2,\ldots \}$ , so that the matrix $(Q_\dagger^0)^{\top}$ is upper triangular with diagonal elements $(\!-\lambda_{\dagger}, -\lambda_{0}, -\lambda_{1}, -\lambda_{2}, \ldots)$ . For $n\in{\mathbb N}_0^\dagger$ , let $v_n \in \textrm{Span} \{ e_i\,{:}\, i \in[n]_0\cup\{\dagger\} \}$ be the eigenvector of $(Q_\dagger^0)^{\top}$ associated with the eigenvalue $-\lambda_n$ , normalized so that its coordinate with respect to $e_n$ is 1. It is not difficult to see that these eigenvectors exist and that we have $v_{\dagger} = e_{\dagger}$ and $v_{0} = e_{0}$ . For $n \geq 1$ , writing $v_n = c_\dagger e_\dagger + c_{0} e_{0} + \ldots + c_{n-1} e_{n-1} + e_{n}$ and multiplying by $\frac1{-\lambda_n} (Q_\dagger^0)^{\top}$ on both sides, we obtain another expression for $v_n$ as a linear combination of $e_\dagger, e_{0},\ldots, e_{n-1}, e_{n}$ . Identifying the two expressions, we obtain that $c_k=v_{n,k}^\dagger$ , for $k\leq n-1$ . In particular, we have

\begin{align*} v_n = v_{n,\dagger}^\dagger e_{\dagger}+ v_{n,0}^\dagger e_0 +\cdots+ v_{n,n-1}^\dagger e_{n-1} + v_{n,n}^\dagger e_{n}.\end{align*}

Proceeding in a similar way, one obtains that

\begin{align*} e_n = u_{n,\dagger}^\dagger v_\dagger + u_{n,0}^\dagger v_{0} + \cdots + u_{n,n-1}^\dagger v_{n-1} + u_{n,n}^\dagger v_{n}. \end{align*}

We thus get that $V_\dagger^{\top} U_\dagger^{\top} = U_\dagger^{\top} V_\dagger^{\top}={\text{I}}$ and $(Q_\dagger^0)^{\top} = V_\dagger^{\top} D_\dagger U_\dagger^{\top}$ (the matrix products are well-defined, because they involve sums of finitely many non-zero terms). This ends the proof.
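The construction of $U_\dagger$ , $V_\dagger$ , and $D_\dagger$ is straightforward to verify numerically on a finite truncation. The sketch below is ours: since (2.6) is not reproduced here, it assumes, consistently with the moment duality for (1.1) when $\sigma=0$ , that the only transitions of R are $k\to k-1$ at rate $\gamma_k^\dagger=k(k-1)+k\theta\nu_1$ and $k\to\dagger$ at rate $k\theta\nu_0$ , so that $\lambda_k^\dagger=k(k-1+\theta)$ .

```python
import numpy as np

# Finite-truncation check of Lemma 7.3 (ours). Placeholder rates, to be read
# against the authoritative definition (2.6): for sigma = 0 we assume R jumps
# k -> k-1 at rate gam(k) = k(k-1) + k*theta*nu1 and k -> dagger at rate
# k*theta*nu0, so that lam(k) = k(k-1+theta).
theta, nu0 = 1.5, 0.3
nu1 = 1.0 - nu0
K = 8                    # truncation level; states ordered (dagger, 0, 1, ..., K)
n = K + 2                # matrix size: dagger plus states 0..K

lam = lambda k: k * (k - 1 + theta)
gam = lambda k: k * (k - 1) + k * theta * nu1

Q = np.zeros((n, n))     # index 0 <-> dagger, index k+1 <-> state k
for k in range(1, K + 1):
    Q[k + 1, k + 1] = -lam(k)
    Q[k + 1, k] = gam(k)             # coalescence + deleterious mutation
    Q[k + 1, 0] = k * theta * nu0    # killing (transition to dagger)

U, V = np.eye(n), np.eye(n)
for i in range(1, K + 1):
    for j in range(i):               # entries (7.3) and (7.4) for j in [i]_0
        U[i + 1, j + 1] = np.prod([gam(l) / (lam(l) - lam(j)) for l in range(j + 1, i + 1)])
        V[i + 1, j + 1] = np.prod([-gam(l + 1) / (lam(i) - lam(l)) for l in range(j, i)])
    U[i + 1, 0] = theta * nu0 * sum(
        (k / lam(k)) * np.prod([gam(l) / lam(l) for l in range(k + 1, i + 1)])
        for k in range(1, i + 1))
    V[i + 1, 0] = (-theta * nu0 / lam(i)) * sum(
        k * np.prod([-gam(l + 1) / (lam(i) - lam(l)) for l in range(k, i)])
        for k in range(1, i + 1))

D = np.diag([0.0] + [-lam(k) for k in range(K + 1)])
print(np.allclose(U @ V, np.eye(n)), np.allclose(U @ D @ V, Q))  # True True
```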

Now, consider the polynomials $S_k^\dagger$ , $k \in{\mathbb N}_0$ , defined via

(7.5) \begin{eqnarray}S_k^\dagger(x) \,{:\!=}\, \sum_{i=0}^{k} v_{k,i}^\dagger\, x^i,\quad x\in[0,1]. \end{eqnarray}

For $z\in(0,1)$ , define the matrices $\mathcal{B}(z)\,{:\!=}\, (\mathcal{B}_{i,j}(z))_{i,j\in{\mathbb N}_0^\dagger}$ and $\Phi^\dagger(z)\,{:\!=}\, (\Phi^\dagger_{i,j}(z))_{i,j\in{\mathbb N}_0^\dagger}$ via

(7.6) \begin{equation} \mathcal{B}_{i,j}(z)\,{:\!=}\, \left\{\begin{array}{l@{\quad}l} {\mathbb P}(i+B_i(z)=j) &\textrm{for $i,j\in{\mathbb N}$,}\\ 1 &\textrm{for $i=j\in\{0,\dagger\}$,}\\ 0& \textrm{otherwise}, \end{array}\right.\quad\textrm{and}\quad \Phi^\dagger(z)\,{:\!=}\, U_\dagger^\top \mathcal{B}(z)^\top V_\dagger^\top,\end{equation}

where $B_i(z)\sim\textrm{Bin}({i},{z})$ . We will see in the proof of Theorem 7.3 that $\Phi^\dagger(z)$ is well-defined.

Theorem 7.3. Assume that $\sigma=0$ , $\theta>0$ , and $\nu_0, \nu_1 \in (0,1)$ . Let $\omega$ be a simple environment. Denote by $N\,{:\!=}\, N(\tau)$ the number of jumps of $\omega$ in $(\!-\tau,0)$ , and let $(T_i)_{i=1}^N$ be the sequence of the jump times in decreasing order; set $T_0 \,{:\!=}\, 0$ . For any $m\in[N]$ , define the matrix $A_m^{\dagger}(\omega)\,{:\!=}\, (A_{i,j}^{\dagger,m}(\omega))_{i,j\in{\mathbb N}_0^\dagger}$ via

(7.7) \begin{eqnarray}A^{\dagger}_m(\omega) \,{:\!=}\, \Phi^\dagger(\Delta \omega(T_m)) \exp \left ( (T_{m-1} - T_m) D_\dagger \right ). \end{eqnarray}

Then, for all $x \in (0,1)$ and $n \in \mathbb{N}$ , we have

(7.8) \begin{eqnarray}\mathbb{E}\left [ (1-X^\omega(0))^n\mid X^\omega(\!-\tau)=x \right ] = \sum_{k = 0}^{n 2^N} C^\dagger_{n,k}(\omega,\tau) S_k^\dagger(1-x), \end{eqnarray}

where the matrix $C^\dagger(\omega,\tau)\,{:\!=}\, (C^\dagger_{n,k}(\omega,\tau))_{n,k\in{\mathbb N}_0^\dagger}$ is given by

(7.9) \begin{equation}C^\dagger(\omega,\tau)\,{:\!=}\, U_\dagger \left[A_N^\dagger(\omega)A_{N-1}^\dagger(\omega)\cdots A_1^\dagger(\omega)\right]^\top \exp \left ((T_N+\tau)D_\dagger \right ), \end{equation}

with the convention that an empty product of matrices is the identity matrix. Moreover, for all $n\in{\mathbb N}$ ,

(7.10) \begin{equation}\ \Pi_n(\omega) = C_{n,0}^\dagger(\omega,\infty)\,{:\!=}\,\! \lim_{\tau\to\infty} C_{n,0}^\dagger(\omega,\tau)\!=\!\lim_{\tau \to \infty}\! \left(\!U_\dagger \left[A_{N(\tau)}^\dagger(\omega)\cdots A_1^\dagger(\omega)\right]^\top\right)_{n,0}^{}\!\!, \end{equation}

where the previous limits are well-defined.

Proof. Let us first show that the matrix products in (7.6), (7.7), and (7.9) are well-defined and that $C^\dagger_{n,k}(\omega,\tau) = 0$ for all $k > n2^N$ . To this end, order ${\mathbb N}_0^\dagger$ as $\{ \dagger, 0, 1, 2, \ldots \}$ , so that the matrices $U_\dagger^\top$ and $V_\dagger^\top$ are upper triangular. Note also that $\mathcal{B}_{j,i}(z)=0$ for $i>2j$ . Therefore, for any $n \in{\mathbb N}$ and any $v= (v_i)_{i \in {\mathbb N}_0^\dagger}$ such that $v_i = 0$ for all $i > n$ , the vector $\tilde v \,{:\!=}\, U_\dagger^\top (\mathcal{B}(z)^\top (V_\dagger^\top v))$ is well-defined and satisfies $\tilde v_i = 0$ for all $i > 2n$ . It follows that the matrix $\Phi^\dagger(z)$ in (7.6) is well-defined. Moreover, since $\exp ( (T_{m-1} - T_m) D_\dagger )$ is diagonal, the product defining the matrix $A^{\dagger}_m(\omega)$ in (7.7) is also well-defined. Furthermore, for any $n \in{\mathbb N}$ and any vector $v = (v_i)_{i \in {\mathbb N}_0^\dagger}$ such that $v_i = 0$ for all $i > n$ , the vector $\tilde v \,{:\!=}\, A^{\dagger}_m(\omega) v$ satisfies $\tilde v_i = 0$ for all $i > 2n$ . In particular, for any $m \geq 1$ , the product $\exp (\!-(T_N+\tau)D_\dagger) A_m^\dagger(\omega)A_{m-1}^\dagger(\omega)\cdots A_1^\dagger(\omega) U_\dagger^\top$ is well-defined. Additionally, for $n \geq 1$ and a vector $v = (v_i)_{i \in {\mathbb N}_0^\dagger}$ such that $v_i = 0$ for all $i > n$ , the vector $\tilde v \,{:\!=}\, \exp (\!-(T_N+\tau)D_\dagger) A_m^\dagger(\omega)A_{m-1}^\dagger(\omega)\cdots A_1^\dagger(\omega) U_\dagger^\top v$ satisfies $\tilde v_i = 0$ for all $i > 2^m n$ . Transposing, we see that the matrix $C^\dagger(\omega,\tau)$ in (7.9) is well-defined and satisfies $C^\dagger_{n,k}(\omega,\tau) = 0$ for all $k > n2^N$ .

For $s>0$ , define the stochastic matrix $\mathcal{P}^\dagger_s(\omega)\,{:\!=}\, (p_{i,j}^\dagger(\omega,s))_{i,j\in{\mathbb N}_0^\dagger}$ via

\begin{align*}p_{i,j}^\dagger(\omega,s)\,{:\!=}\, {\mathbb P}(R_0^\omega(s-\!)=j\mid R_0^\omega(0-\!)=i).\end{align*}

Hence, defining $\rho(y)\,{:\!=}\, (y^i)_{i\in{\mathbb N}_0^\dagger}$ , $y\in[0,1]$ (with the convention $y^\dagger\,{:\!=}\, 0$ ), we obtain

(7.11) \begin{equation} \mathbb{E}[y^{R_0^\omega(\tau-\!)} \mid R_0^\omega(0-\!)=n] = (\mathcal{P}_\tau^\dagger(\omega) \rho(y))_n=(\mathcal{P}_\tau^\dagger(\omega)U_\dagger {S_\dagger(y)})_n, \end{equation}

where we used that $\rho(y)=U_\dagger {S_\dagger(y)}$ with ${S_\dagger(y)\,{:\!=}\,}(S_k^\dagger(y))_{k\in{\mathbb N}_0^\dagger}$ . Thus, Theorem 7.1 and Equation (7.11) yield

(7.12) \begin{equation}\mathbb{E} [ (1-X^\omega(0))^n\mid X^\omega(\!-\tau)=x ] = \sum_{k = 0}^{\infty} \left(\mathcal{P}_\tau^\dagger(\omega)U_\dagger\right)_{n,k} S_k^\dagger(1-x).\end{equation}

Now, consider the semigroup $M_\dagger\,{:\!=}\, (M_\dagger(s))_{s\geq 0}$ of the line-counting process of the k-ASG in the null environment, which is defined via $M_{\dagger}(s)\,{:\!=}\, \exp \!(sQ_\dagger^0 )$ . Thanks to Lemma 7.3, $M_\dagger(\beta)=U_\dagger E_\dagger(\beta)V_\dagger$ , where $E_\dagger(\beta)$ is the diagonal matrix with diagonal entries $(e^{-\lambda^{\dagger}_j \beta})_{j\in{\mathbb N}_0^\dagger}$ .

Assume first that $N(\tau)=0$ (i.e. $\omega$ has no jumps in $[\!-\tau,0]$ ). In this case, we have

\begin{align*}\mathcal{P}_\tau^\dagger(\omega)U_\dagger=M_\dagger(\tau)U_\dagger=U_\dagger E_\dagger(\tau)V_\dagger U_\dagger=U_\dagger E_\dagger(\tau)=C^\dagger(\omega,\tau),\end{align*}

where we used that $V_\dagger U_\dagger={\text{I}}$ . Hence, (7.8) follows from (7.12).

Assume now that $N(\tau)\geq 1$ (i.e. $\omega$ has at least one jump in $[\!-\tau,0]$ ). Disintegrating with respect to the values of $R_0^\omega((\!-T_i)-\!)$ and $R_0^\omega(\!-T_i)$ , $i\in [N]$ , we get

(7.13) \begin{equation} \mathcal{P}_\tau^\dagger(\omega)=M_\dagger(\!-T_1)\mathcal{B}(\Delta\omega (T_1))M_\dagger(T_1-T_2)\mathcal{B}(\Delta\omega(T_2))\cdots \mathcal{B}(\Delta\omega(T_N))M_\dagger(T_N+\tau). \end{equation}

Using this, the relation $M_\dagger(\beta)=U_\dagger E_\dagger(\beta)V_\dagger$ , the definition of the matrices $\Phi^\dagger$ and $A_i^\dagger$ (see (7.6) and (7.7)), and the fact that $V_\dagger U_\dagger={\text{I}}$ , we obtain

(7.14) \begin{align} &\mathcal{P}_\tau^\dagger(\omega)U_\dagger\nonumber\\ &\quad =U_\dagger E_\dagger(\!-T_1) \Phi^\dagger(\Delta \omega(T_1))^\top E_\dagger(T_1-T_2)\Phi^\dagger(\Delta \omega(T_2))^\top\cdots\Phi^\dagger(\Delta \omega(T_N))^\top E_\dagger(T_N+\tau) \nonumber \\ &\quad = U_\dagger A_1^\dagger(\omega)^\top A_2^\dagger(\omega)^\top\cdots A_N^\dagger(\omega)^\top E_\dagger(T_N+\tau) \nonumber \\ &\quad = U_\dagger \left[A_N^\dagger(\omega)A_{N-1}^\dagger(\omega)\cdots A_1^\dagger(\omega)\right]^\top E_\dagger(T_N+\tau)=C^\dagger(\omega,\tau), \end{align}

which proves (7.8) also in this case.

It remains to prove that $C^\dagger_{n,0}(\omega,\tau)$ converges to $\Pi_n(\omega)$ as $\tau\to\infty$ . For $\omega={\textbf{0}}$ (i.e. the null environment), on the one hand (7.9) yields $C_{n,0}^\dagger(\omega,\tau)=e^{-\lambda_0^\dagger \tau} u_{n,0}^\dagger$ , and on the other hand (7.11) together with $M_\dagger(\beta)=U_\dagger E_\dagger(\beta)V_\dagger$ and $V_\dagger U_\dagger={\text{I}}$ yields

\begin{align*}\mathbb{E}[y^{R_0^{\textbf{0}}(\tau-\!)} \mid R_0^{\textbf{0}}(0-\!)=n]=\sum_{k=0}^n e^{-\lambda_k^\dagger \tau} u_{n,k}^\dagger S_k^\dagger(y).\end{align*}

Since $\lambda_k^\dagger>0$ for $k\in{\mathbb N}$ and $\lambda_0^\dagger=0$ , the desired convergence follows by letting $\tau \to\infty$ in the previous identity. For later use, note that we have $\Pi_n({\textbf{0}})=u_{n,0}^\dagger$ . The general case is a direct consequence of the following proposition.

Proposition 7.1. Assume that $\sigma=0$ , $\theta>0$ , $\nu_0, \nu_1 \in (0,1)$ , and $\omega$ is a simple environment. We have

\begin{align*} \left | C^\dagger_{n,0}(\omega,\tau) - \Pi_n(\omega) \right | \leq e^{-\theta\nu_0 \tau}. \end{align*}

Proof. Let $\omega_\tau$ be the environment that coincides with $\omega$ in $(\!-\tau,\infty)$ and that is constant and equal to $\omega(\!-\tau)$ in $(\!-\!\infty,-\tau]$ (which means that $\omega_\tau$ has no jumps in $(\!-\!\infty,-\tau]$ ). Since $\mathcal{P}_\tau^\dagger(\omega_\tau)=\mathcal{P}_\tau^\dagger(\omega)$ and $\omega_\tau$ has no jumps in $(\!-\!\infty,-\tau]$ , we obtain

(7.15) \begin{align} \Pi_n(\omega_\tau) &=\sum_{k\geq 0}p_{n,k}^\dagger(\omega_\tau,\tau)\,{\mathbb P}(\exists \beta\geq \tau \ \text{s.t.} \ R_0^{\omega_\tau}(\beta)=0 \mid R_0^{\omega_\tau}(\tau-\!)=k)\nonumber\\ &=\sum_{k\geq 0}p_{n,k}^\dagger(\omega,\tau)\,\Pi_k({\textbf{0}})=\sum_{k\geq 0}p_{n,k}^\dagger(\omega,\tau)\,u_{k,0}^\dagger=(\mathcal{P}_\tau^\dagger(\omega)U_\dagger)_{n,0}=C_{n,0}^\dagger(\omega,\tau),\end{align}

where in the last line we used $\Pi_k({\textbf{0}})=u_{k,0}^\dagger$ (from the end of the previous proof) and (7.14). Now combining (7.15) with (6.4) applied to $\omega_\tau$ with $\upsilon=\delta_n$ yields

(7.16) \begin{equation} p_{n,0}^\dagger(\omega,\tau)=p_{n,0}^\dagger(\omega_\tau,\tau)\leq C_{n,0}^\dagger(\omega,\tau) \leq p_{n,0}^\dagger(\omega_\tau,\tau)+e^{-\theta\nu_0 \tau}=p_{n,0}^\dagger(\omega,\tau)+e^{-\theta\nu_0 \tau}.\end{equation}

Then, combining (6.4) applied to $\omega$ with $\upsilon=\delta_n$ and (7.16), we get

\begin{align*}C_{n,0}^\dagger(\omega,\tau)-e^{-\theta\nu_0 \tau}\leq \Pi_n(\omega)\leq C_{n,0}^\dagger(\omega,\tau)+e^{-\theta\nu_0 \tau},\end{align*}

and the result follows.
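The identity $\mathcal{P}_\tau^\dagger(\omega)U_\dagger=C^\dagger(\omega,\tau)$ from (7.14) can also be tested numerically for a one-jump environment. The sketch below is ours; it reuses `Q`, `U`, `V`, `D`, and `K` (placeholder rates included) from the sketch following Lemma 7.3, and the comparison is exact in the displayed rows because, for $\sigma=0$ , a state $n\leq K/2$ is never mapped above $K$ .

```python
import numpy as np
from scipy.linalg import expm
from scipy.stats import binom

# Check of P_tau(omega) U = C(omega, tau) (see (7.14)) for an environment
# with a single jump of size dz at forward time -s and horizon tau (ours).
# Q, U, V, D, K come from the sketch after Lemma 7.3.
tau, s, dz = 2.0, 0.7, 0.35
n = K + 2

Bm = np.zeros((n, n))
Bm[0, 0] = Bm[1, 1] = 1.0                # dagger and state 0 are unaffected
for i in range(1, K + 1):
    for j in range(i, min(2 * i, K) + 1):
        Bm[i + 1, j + 1] = binom.pmf(j - i, i, dz)   # i -> i + Bin(i, dz)

Phi = U.T @ Bm.T @ V.T                   # (7.6)
A1 = Phi @ expm(s * D)                   # (7.7) with T_0 = 0, T_1 = -s
C = U @ A1.T @ expm((tau - s) * D)       # (7.9) with N = 1

P = expm(s * Q) @ Bm @ expm((tau - s) * Q)    # (7.13)
rows = K // 2 + 2                        # rows where the truncation is exact
print(np.allclose((P @ U)[:rows], C[:rows]))  # expect True
```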

Remark 7.1. If $\omega$ has no jumps in $(\!-\tau,0)$ , then $C^\dagger(\omega,\tau)=U_\dagger\exp(\tau D_\dagger)$ . In particular, $\Pi_n({\textbf{0}})=u_{n,0}^\dagger$ .

Remark 7.2. Under the assumptions of Theorem 7.3, the Simpson index (see Remark 2.5) is given by

\begin{align*} \mathbb{E}[{\text{Sim}}^\omega(\infty)]=\mathbb{E}[X^\omega(\infty)^2 + (1-X^\omega(\infty))^2] = 1 - 2 C_{1,0}^\dagger(\omega,\infty) + 2C_{2,0}^\dagger(\omega,\infty). \end{align*}

Remark 7.3. If $\omega$ is a simple periodic environment with period $T_p > 0$ , then (7.10) can be written as $\Pi_n(\omega) =\lim_{m \to \infty} (U_\dagger B(\omega)^m )_{n,0}$ , where $B(\omega) \,{:\!=}\, [A_{N(T_p)}^\dagger(\omega)\cdots A_1^\dagger(\omega)]^\top$ .

As an application of Theorem 7.3, we obtain the following refinement of Proposition 2.1 for mixed environments composed of a pure-jump subordinator J and a simple environment $\omega$ (see Figure 6).

Proposition 7.2. Assume that $\sigma=0$ , $\theta>0$ , and $\nu_0,\nu_1\in(0,1)$ . For any $\tau_\star>0$ , $n\in{\mathbb N}$ , $x\in[0,1]$ , and any simple environment $\omega$ , we have

(7.17) \begin{equation}\lim_{\tau\to\infty}\!{\mathbb E}\!\left[(1-X^{J\otimes_{\tau_\star} \omega}(0))^n\!\mid\! X^{J\otimes_{\tau_\star} \omega}(\!-\tau)=x\right]= \sum_{j = 0}^{n 2^N} \left ( \sum_{k = j}^{n 2^N} C^\dagger_{n,k}(\omega,\tau_\star) v_{k,j}^\dagger \right ) \pi_j,\end{equation}

where N denotes the number of jumps of $\omega$ in $[\!-\!\tau_\star,0]$ .

Proof. Let $\omega$ be a simple environment. Proceeding as in the proof of Proposition 2.1, but using Theorem 7.1 instead of Theorem 2.3, we obtain

\begin{equation*}\lim_{\tau\to\infty}\!{\mathbb E}\left[(1-X^{J\otimes_{\tau_\star} \omega}(0))^n\!\mid\! X^{J\otimes_{\tau_\star} \omega}(\!-\tau)=x\right]={\mathbb E}\left[\pi_{R_0^{\omega}(\tau_\star -\!)}\mid R_0^\omega(0-\!)=n\right].\end{equation*}

Since $U_\dagger V_\dagger= {\text{I}}$ and the stochastic matrix $\mathcal{P}^\dagger_{\tau_\star}(\omega)\,{:\!=}\, (p_{i,j}^\dagger(\omega,{\tau_\star}))_{i,j\in{\mathbb N}_0^\dagger}$ defined via

\begin{align*}p_{i,j}^\dagger(\omega,{\tau_\star})\,{:\!=}\, {\mathbb P}(R_0^\omega({\tau_\star}-\!)=j\mid R_0^\omega(0-\!)=i)\end{align*}

satisfies $C^\dagger(\omega,{\tau_\star})=\mathcal{P}_{\tau_\star}^\dagger(\omega)U_\dagger$ (see (7.14)), the result follows.

7.2. Extensions of quenched results in Section 2.6

In this section we assume that $\sigma=0$ and extend some of the quenched results stated in Section 2.6 concerning the ancestral type distribution for simple environments. The next result allows us to get rid of the condition $\theta\nu_0>0$ in Theorem 2.5(2).

Theorem 7.4. (Ancestral type distribution for simple environments.) Assume that $\sigma=0$ , and let $\omega\in{\mathbb D}^\star$ be a simple environment with infinitely many jumps in $[0,\infty)$ and such that the distance between the successive jumps does not converge to 0. Then the statement of Theorem 2.5(2), except for Equation (2.20), remains true.

Proof. The case $\theta\nu_0>0$ is already covered by Theorem 2.5(2). Assume now that $\theta \nu_0=0$ , $\sigma=0$ , and that $\omega$ is as in the statement. For $\mu\in\mathcal{M}_1(\mathbb{N})$ , we denote by $\mu^\omega_T(\beta)$ the distribution of $L_T^\omega(\beta-\!)$ given that $L_T^\omega(0-\!)\sim\mu$ . We claim that, for all $\mu,\tilde\mu\in\mathcal{M}_1({\mathbb N})$ ,

(Claim 5) \begin{equation}d_{TV}(\mu_T^\omega(T),\tilde\mu_T^\omega(T))\xrightarrow[T\to\infty]{}0,\end{equation}

where $d_{TV}(\mu_1,\mu_2)$ stands for the total variation distance between $\mu_1$ and $\mu_2$ . If Claim 5 is true, the rest of the proof follows as in the proof of Theorem 2.5(2). In what follows we prove Claim 5.

Let $0 < T_1 < T_2 <\cdots$ be the sequence of the jump times of $\omega$ , and set $T_0 \,{:\!=}\, 0$ for convenience. On $(T_i, T_{i+1})$ , $L_T^\omega$ has transition rates given by $(q^0(k,j))_{k,j\in{\mathbb N}}$ (see (2.16)). For any $k > l$ , let H(k, l) denote the hitting time of l by a Markov chain starting at k and with transition rates given by $(q^0(i,j))_{i,j\in{\mathbb N}}$ . Let $(S_l)_{l \geq 2}$ be a sequence of independent exponential random variables with parameter $(l-1) \theta\nu_1 + l(l-1)$ , and note that, since $\theta\nu_0=0$ , we have $S_l \sim H(l,l-1)$ for $l\geq 2$ . Using the Markov property, one can easily see that H(k,1) is equal in distribution to $\sum_{l=2}^k S_l$ . Therefore, for any i such that $T_{i+1} < T$ and any $k \geq 1$ , we have

(7.18) \begin{align}\mathbb{P} \left (L_T^\omega((T-T_i)-\!) = 1 \mid L_T^\omega(T-T_{i+1}) = k \right ) &= \mathbb{P} \left ( \sum_{l=2}^k S_l \leq T_{i+1} - T_{i} \right )\nonumber\\ &\geq \mathbb{P} \left ( \sum_{l=2}^{\infty} S_l \leq T_{i+1} - T_{i} \right ).\end{align}

Since $\mathbb{E}[S_l]=1/((l-1)\theta\nu_1+l(l-1))$ , we clearly have $\sum_{l=2}^{\infty} \mathbb{E}[S_l] < \infty$ ; thus $S^\infty\,{:\!=}\, \sum_{l=2}^{\infty} S_l<\infty$ almost surely. Moreover, since for $l \geq 2$ the support of $S_l$ contains 0, the support of $S^\infty$ contains 0 as well. In particular $q(s) \,{:\!=}\, \mathbb{P} ( \sum_{l=2}^{\infty} S_l \leq s)$ is positive for all $s > 0$ . Thus, we get from (7.18) that

(7.19) \begin{eqnarray} \mathbb{P} \left (L_T^\omega((T-T_i)-\!) = 1 \mid L_T^\omega(T-T_{i+1}) = k \right ) \geq q(T_{i+1} - T_{i}),\quad k\geq 1. \end{eqnarray}

Let $L_T^\omega$ , $\tilde L_T^\omega$ , and $L_{T_i}^\omega$ , $i\geq 0$ , be independent copies of the line-counting process of the pLD-ASG with environment $\omega$ (the subscript indicates the sampling time) and $L_T^\omega(0-\!)\sim \mu$ , $\tilde L_T^\omega(0-\!)\sim\tilde \mu$ , and $L_{T_i}^\omega(0-\!)=1$ . Let

\begin{align*}i(T)\,{:\!=}\, \max\{i\in{\mathbb N}_0\,{:}\, T_i<T,\, L_T^\omega((T-T_i)-\!)=\tilde L_T^\omega((T-T_i)-\!)=1\},\end{align*}

with the convention that the maximum of an empty set is $-\infty$ . Set $T_{-\infty}\,{:\!=}\,-\infty$ for convenience. We define $(U_T^\omega(\beta))_{\beta \geq 0}$ and $(\tilde U_T^\omega(\beta))_{\beta \geq 0}$ by setting $U_T^\omega(\beta)\,{:\!=}\, L_T^\omega(\beta)$ and $\tilde U_T^\omega(\beta)\,{:\!=}\, \tilde L_T^\omega(\beta)$ for $\beta<T-T_{i(T)}$ and $U_T^\omega(\beta)\,{:\!=}\,\tilde U_T^\omega(\beta)=L_{T_{i(T)}}^\omega(\beta-(T-T_{i(T)}))$ for $\beta \geq T-T_{i(T)}$ . Note that $U_T^\omega$ and $\tilde U_T^\omega$ have the same distributions as $L_T^\omega$ and $\tilde L_T^\omega$ , respectively. In particular, we have $U_T^\omega(T-\!) \sim \mu_{T}^\omega(T)$ and $\tilde U^\omega_T(T-\!)\sim \tilde \mu^{\omega}_T(T)$ . Moreover, we have $U_T^\omega(\beta)=\tilde U_T^\omega(\beta)$ for all $\beta \geq T-T_{i(T)}$ . Therefore,

(7.20) \begin{eqnarray}d_{TV}(\mu_T^\omega(T),\tilde\mu_T^\omega(T)) \leq \mathbb{P} \left ( U_T^\omega(T-\!) \neq \tilde U_T^\omega(T-\!) \right ) \leq \mathbb{P} \left ( i(T) = -\infty \right ). \end{eqnarray}

Let N(T) be the number of jumps of $\omega$ in [0, T]. According to (7.19), for $k_1, k_2 \geq 1$ with $k_1 \neq k_2$ we have

\begin{align*} \mathbb{P}\left (L_T^\omega((T-T_i)-\!)= 1,\, \tilde L_T^\omega((T-T_i)-\!) = 1 \mid L_T^\omega(T-T_{i+1}) = k_1, \tilde L_T^\omega(T-T_{i+1}) = k_2 \right )\\ \geq q(T_{i+1} - T_{i})^2.\end{align*}

Therefore, using (7.20), we obtain

\begin{align*} d_{TV}(\mu_T^\omega(T),\tilde\mu_T^\omega(T)) \leq \mathbb{P} \left ( i(T) = -\infty \right ) \leq \prod_{i=1}^{N(T)} \left ( 1-q(T_{i} - T_{i-1})^2 \right ) \,{=\!:}\, \varphi_{\omega}(T). \end{align*}

Note that $\varphi_{\omega}$ does not depend on $\mu$ and $\tilde \mu$ . Recall that by assumption the sequence of jump times $T_1, T_2,\ldots$ is infinite and the distance between the successive jumps does not converge to 0. Therefore, there is $\epsilon > 0$ such that, for infinitely many indices i, we have $T_{i+1}-T_i > \epsilon$ . Thus, the number of factors smaller than $1-q(\epsilon)^2 < 1$ in the product defining $\varphi_{\omega}(T)$ tends to infinity as $T\to\infty$ . We deduce that $\varphi_{\omega}(T)\to 0$ as $T\to\infty$ , which proves Claim 5, concluding the proof.

The following diagonalization of $Q^0$ (the transition matrix of the process L under the null environment) will allow us to obtain a more explicit expression for $h^{\omega}_T(x)$ .

Lemma 7.4. Assume that $\sigma=0$ , and for $k\in{\mathbb N}$ set $\lambda_k\,{:\!=}\, -q^0(k,k)$ and $\gamma_k\,{:\!=}\, q^0(k,k-1)$ , where $q^\mu(\cdot,\cdot)$ is defined in (2.16). In addition, make the following definitions:

  (i) Let D be the diagonal matrix with diagonal entries $(\!-\lambda_i)_{i\in{\mathbb N}}$ .

  (ii) Let $U\,{:\!=}\, (u_{i,j})_{i,j\in{\mathbb N}}$ , where, for all $i\in{\mathbb N}$ , $u_{i,i} \,{:\!=}\, 1$ ; $u_{i,j} \,{:\!=}\, 0$ for $j > i$ ; when $i \geq 2$ , $u_{i,i-1} \,{:\!=}\, \gamma_{i}/(\lambda_{i} - \lambda_{i-1})$ ; and the coefficients $(u_{i,j})_{j \in [i-2]}$ are defined via the recurrence relation

    (7.21) \begin{eqnarray}u_{i,j} \,{:\!=}\, \frac{1}{\lambda_i - \lambda_j} \left ( \gamma_i {u_{i-1,j}} + \theta\nu_0 \left ( \sum_{l=j}^{i-2} {u_{l,j}} \right ) \right ). \end{eqnarray}
  (iii) Let $V\,{:\!=}\, (v_{i,j})_{i,j\in{\mathbb N}}$ , where, for all $i\in{\mathbb N}$ , $v_{i,i} \,{:\!=}\, 1$ ; $v_{i,j} \,{:\!=}\, 0$ for $j > i$ ; and when $i \geq 2$ , the coefficients $(v_{i,j})_{j \in [i-1]}$ are defined via the recurrence relation

    (7.22) \begin{eqnarray}v_{i,j} \,{:\!=}\, \frac{-1}{(\lambda_i - \lambda_j)} \left [ \left ( \sum_{l=j+2}^i v_{i,l} \right ) \theta\nu_0 + v_{i,j+1} \gamma_{j+1} \right ]. \end{eqnarray}

(We adopt the convention that an empty sum equals 0.) Then we have $Q^0=U D V$ and $U V=V U={\text{I}}$ , where ${\text{I}}$ denotes the identity matrix.

Proof. The proof is analogous to the proof of Lemma 7.3.
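As with Lemma 7.3, the recurrences (7.21) and (7.22) are easy to check numerically on a truncation. The sketch below is ours; since (2.16) is not reproduced here, it uses placeholder rates consistent with the identities $\lambda_k=(k-1)(k+\theta)$ and $\gamma_{j+1}=(j+1)j+j\theta\nu_1+\theta\nu_0$ appearing in the proofs of Lemmas 7.5–7.7 below, namely $q^0(i,i-1)=\gamma_i$ and $q^0(i,j)=\theta\nu_0$ for $j\leq i-2$ .

```python
import numpy as np

# Finite-truncation check of Lemma 7.4 (ours). Placeholder rates, to be read
# against the authoritative definition (2.16): for sigma = 0 we assume
#   q0(i, i-1) = i(i-1) + (i-1)*theta*nu1 + theta*nu0,
#   q0(i, j)   = theta*nu0 for 1 <= j <= i-2,
# so that lam(i) = (i-1)(i+theta).
theta, nu0 = 1.2, 0.4
nu1 = 1.0 - nu0
K = 10  # states 1..K

lam = lambda i: (i - 1) * (i + theta)
gam = lambda i: i * (i - 1) + (i - 1) * theta * nu1 + theta * nu0

Q = np.zeros((K, K))  # index a <-> state a+1
for i in range(2, K + 1):
    Q[i - 1, i - 1] = -lam(i)
    Q[i - 1, i - 2] = gam(i)
    for j in range(1, i - 1):
        Q[i - 1, j - 1] = theta * nu0

U, V = np.eye(K), np.eye(K)
for i in range(2, K + 1):
    U[i - 1, i - 2] = gam(i) / (lam(i) - lam(i - 1))
    for j in range(i - 2, 0, -1):      # recurrence (7.21)
        U[i - 1, j - 1] = (gam(i) * U[i - 2, j - 1]
                           + theta * nu0 * sum(U[l - 1, j - 1] for l in range(j, i - 1))
                           ) / (lam(i) - lam(j))
    for j in range(i - 1, 0, -1):      # recurrence (7.22)
        V[i - 1, j - 1] = -(theta * nu0 * sum(V[i - 1, l - 1] for l in range(j + 2, i + 1))
                            + V[i - 1, j] * gam(j + 1)) / (lam(i) - lam(j))

D = np.diag([-lam(i) for i in range(1, K + 1)])
print(np.allclose(U @ V, np.eye(K)), np.allclose(U @ D @ V, Q))  # True True
```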

Now we consider the polynomials $S_k$ , $k \in{\mathbb N}$ , defined via

(7.23) \begin{eqnarray}S_k(x) \,{:\!=}\, {\sum_{i=1}^{k} v_{k,i} x^i}. \end{eqnarray}

In addition, for $z\in(0,1)$ , we define the matrices $\mathcal{B}(z)\,{:\!=}\, (\mathcal{B}_{i,j}(z))_{i,j\in{\mathbb N}}$ and $\Phi(z)\,{:\!=}\, (\Phi_{i,j}(z))_{i,j\in{\mathbb N}}$ via

(7.24) \begin{equation} \mathcal{B}_{i,j}(z)\,{:\!=}\, {\mathbb P}(i+B_i(z)=j),\quad \textrm{$i,j\in{\mathbb N}$,}\quad\textrm{and}\quad \Phi(z)\,{:\!=}\, U^\top \mathcal{B}(z)^\top V^\top,\end{equation}

where $B_i(z)\sim\textrm{Bin}({i},{z})$ . The fact that the matrix product defining $\Phi(z)$ is well-defined can be justified as in the proof of Theorem 7.3. The same is true for the matrix products in (7.25) and (7.27).

Theorem 7.5. Assume that $\sigma=0$ , and let $\omega$ be a simple environment with infinitely many jumps on $[0,\infty)$ and such that the distance between the successive jumps does not converge to 0. Let N be the number of jumps of $\omega$ in (0,T), and let $(T_i)_{i=1}^N$ be the sequence of the jump times in increasing order. We set $T_0 \,{:\!=}\, 0$ for convenience. For any $m\in[N]$ , we define the matrix $A_m(\omega)\,{:\!=}\, (A_{i,j}^{m}(\omega))_{i,j\in{\mathbb N}}$ by

(7.25) \begin{eqnarray}A_m(\omega) \,{:\!=}\, \exp \left ( (T_m - T_{m-1}) D \right ) \Phi(\Delta \omega(T_m)). \end{eqnarray}

Then, for all $x \in (0,1)$ , we have

(7.26) \begin{eqnarray}h^{\omega}_T(x) = 1 - \sum_{k = 1}^{2^N} C_{1,k}(\omega,T) S_k(1-x), \end{eqnarray}

where the matrix $C(\omega,T)\,{:\!=}\, (C_{n,k}(\omega,T))_{n,k\in{\mathbb N}}$ is given by

(7.27) \begin{align}C(\omega,T)\,{:\!=}\, U \exp \!\left ( (T-T_N)D \right ) \left[A_1(\omega)A_{2}(\omega)\cdots A_N(\omega)\right]^\top. \end{align}

Moreover, for any $x\in(0,1)$ ,

(7.28) \begin{eqnarray} \ h^{\omega}(x) = 1 - \sum_{k = 1}^{\infty} C_{1,k}(\omega, \infty) S_k(1-x), \end{eqnarray}

where the series in (7.28) is convergent and where

(7.29) \begin{align}C_{1,k}(\omega, \infty) \,{:\!=}\, {\lim_{m \rightarrow \infty}} \left ( U \left[A_1(\omega)A_{2}(\omega)\cdots A_m(\omega)\right]^\top \right )_{1,k}, \end{align}

and the above limit is well-defined.

Proof. We are interested in the generating function of $L_T^\omega(T-\!)$ . For $s>0$ , we define the stochastic matrix $\mathcal{P}^T_s(\omega)\,{:\!=}\, (p^T_{i,j}(\omega,s))_{i,j\in{\mathbb N}}$ via

\begin{align*}p^T_{i,j}(\omega,s)\,{:\!=}\, {\mathbb P}(L_T^\omega(s-\!)=j\mid L_T^\omega(0-\!)=i).\end{align*}

We also define $(M(s))_{s\geq 0}$ via $M(s)\,{:\!=}\, \exp (sQ^0 )$ ; i.e. M is the semigroup of $L^{{\textbf{0}}}$ . Let $T_1<T_{2}<\cdots<T_N$ be the sequence of jump times of $\omega$ in [0, T]. Disintegrating with respect to the values of $L_T^\omega( (T-T_i)-\!)$ and $L_T^\omega( T - T_i)$ , $i\in [N]$ , we obtain

(7.30) \begin{equation} {\mathcal{P}^T_T(\omega)}=M(T-T_N)\mathcal{B}(\Delta\omega (T_N))M(T_N-T_{N-1})\mathcal{B}(\Delta\omega(T_{N-1}))\cdots \mathcal{B}(\Delta\omega(T_1))M(T_1). \end{equation}

In addition,

(7.31) \begin{equation} \mathbb{E}[y^{L_T^\omega(T-\!)} \mid L_T^\omega(0-\!)=n] = ({\mathcal{P}^T_T(\omega)} \rho(y))_n,\quad \textrm{where}\quad \rho(y)\,{:\!=}\, (y^i)_{i\in{\mathbb N}}. \end{equation}

Thanks to Lemma 7.4, we have $M(\beta)=U E(\beta)V$ , where $E(\beta)$ is the diagonal matrix with diagonal entries $(e^{-\lambda_j \beta})_{j\in{\mathbb N}}$ . Moreover, $\rho(y)=U {S(y)}$ , where $S(y)\,{:\!=}\,(S_k(y))_{k\in{\mathbb N}}$ . Combining these relations with Equation (7.30) and the identity $V U={\text{I}}$ , we obtain

(7.32) \begin{align} & {\mathcal{P}^T_T(\omega)} \rho(y)\nonumber\\ &\quad =U E(T-T_N) \Phi(\Delta \omega(T_N))^\top E(T_N-T_{N-1})\Phi(\Delta \omega(T_{N-1}))^\top\cdots\Phi(\Delta \omega(T_1))^\top E(T_1){S(y)}. \end{align}

Thus, using the definition of the matrices $A_i(\omega)$ , we get

\begin{align} {\mathcal{P}^T_T(\omega)} \rho(y) &= U E(T-T_N) A_N(\omega)^\top A_{N-1}(\omega)^\top\cdots A_1(\omega)^\top {S(y)} \nonumber \\ &= U E(T-T_N) \left[A_1(\omega)A_{2}(\omega)\cdots A_N(\omega)\right]^\top {S(y)}=C(\omega,T){S(y)}. \nonumber\end{align}

Now, using the previous identity, Lemma 6.1, and Equation (7.31), we obtain

\begin{align*}h^{\omega}_T(x)=1-\mathbb{E}\left[(1-x)^{L_T^\omega(T-\!)} \mid L_T^\omega(0-\!)=1\right] = 1 - {\sum_{k = 1}^{\infty}} C_{1,k}(\omega,T) S_k(1-x).\end{align*}

Proceeding as in the proof of Theorem 7.3, one shows that $C_{1,k}(\omega,T)=0$ for $k > 2^N$ , and (7.26) follows.

Let us now analyze $C_{1,k}(\omega,T)$ as $T\to\infty$ . First note that, on the one hand, from (7.26) we have

\begin{align*}{\mathbb E}[y^{L_T^\omega(T-\!)}\mid L_T^\omega(0-\!)=1]=\sum_{k=1}^{\infty} C_{1,k}(\omega,T) S_k(y).\end{align*}

On the other hand, we have

\begin{align*}{\mathbb E}[y^{L_T^\omega(T-\!)}\mid L_T^\omega(0-\!)=1]=\sum_{k=1}^{\infty} \mathbb{P}({L_T^\omega(T-\!)=k \mid L_T^\omega(0-\!)=1}) y^k.\end{align*}

Since $U^\top$ is the change-of-basis matrix from $(y^k)_{k \in {\mathbb N}}$ to $(S_k(y))_{k \in {\mathbb N}}$ , we deduce that

(7.33) \begin{equation}C_{1,k}(\omega,T) = \sum_{i \in {\mathbb N}} u_{i,k} \mathbb{P}(L_T^\omega(T-\!)=i \mid L_T^\omega(0-\!)=1) = \mathbb{E} [ u_{L_T^\omega(T-\!),k} \mid L_T^\omega(0-\!)=1 ].\end{equation}

From Theorem 7.4, we know that the distribution of $L_T^\omega(T-\!)$ converges when $T\to\infty$ . In addition, according to Lemma 7.5 the function $i \mapsto u_{i,k}$ is bounded, and hence $C_{1,k}(\omega,T)$ converges to a real number. Recall that $T_1 < T_2 < \cdots$ is the increasing sequence of the jump times of $\omega$ and that this sequence tends to infinity. Therefore

\begin{align*} \lim_{T \rightarrow \infty} C_{1,k}(\omega,T) = \lim_{m \rightarrow \infty} C_{1,k}(\omega,T_m) = \lim_{m \rightarrow \infty} {\left ( U \left[A_1(\omega)A_{2}(\omega)\cdots A_m(\omega)\right]^\top \right )_{1,k}}, \end{align*}

where we used (7.27) in the last step. This shows that the limit on the right-hand side of (7.29) exists and equals $\lim_{T \rightarrow \infty} C_{1,k}(\omega,T)$ .

It remains to prove (7.28) together with the convergence of the corresponding series. We already know from Theorem 7.4 that $h^{\omega}_T(x)$ converges to $h^{\omega}(x)$ when $T\to\infty$ , and we have just proved (7.26) and that for any $k \geq 1$ , $C_{1,k}(\omega,T)$ converges to $C_{1,k}(\omega,\infty)$ , defined in (7.29), when $T\to\infty$ . Now we claim that, for all $y\in[0,1]$ and $T>T_1$ ,

(Claim 6) \begin{equation}| C_{1,k}(\omega,T) S_k(y) | \leq 4^k \times (2ek)^{(k+\theta)/2} e^{- \lambda_k T_1}. \end{equation}

Assume that Claim 6 is true. Then (7.28) and the convergence of the series follow using the dominated convergence theorem. It only remains to prove Claim 6. As in the proof of (7.33), one shows that

\begin{align*}({\mathcal{P}^T_{(T-T_1)+}(\omega)} \rho(y))_1=\mathbb{E}[y^{L_T^\omega(T-T_1)}\mid L_T^\omega(0-\!)=1]=\sum_{k=1}^{\infty} \tilde{C}_{1,k}(\omega,T) S_k(y),\end{align*}

where $\tilde{C}_{1,k}(\omega,T) = \mathbb{E} [ u_{L_T^\omega(T-T_1),k} \mid L_T^\omega(0-\!)=1 ]$ . Proceeding as in the proof of (7.32), we can prove that

(7.34) \begin{align} & {\mathcal{P}^T_{(T-T_1)+}(\omega)} \rho(y)\nonumber\\ &\quad =U E(T-T_N) \Phi(\Delta \omega(T_N))^\top E(T_N-T_{N-1})\Phi(\Delta \omega(T_{N-1}))^\top\cdots\Phi(\Delta \omega(T_1))^\top {S(y)}. \end{align}

Since $E(T_1)$ is diagonal with entries $(e^{-\lambda_j T_1})_{j\in{\mathbb N}}$ , we conclude from (7.32) and (7.34) that $C_{1,k}(\omega,T) = e^{-\lambda_k T_1} \tilde C_{1,k}(\omega,T)$ . Therefore

\begin{align*} C_{1,k}(\omega,T) = e^{-\lambda_k T_1} \mathbb{E} [ u_{L_T^\omega(T-T_1),k} \mid L_T^\omega(0-\!)=1 ]. \end{align*}

This together with Lemma 7.5 (see below) implies that, for all $k\geq 1$ and $T>T_1$ ,

\begin{align*} \ |C_{1,k}(\omega,T)| \leq (2ek)^{(k+\theta)/2} e^{- \lambda_k T_1}. \end{align*}

Combining this with Lemma 7.7, we obtain Claim 6, which concludes the proof.

Lemma 7.5. For all $k\geq 1$ ,

\begin{align*} \ \sup_{j \geq 1} u_{j,k} \leq (2ek)^{(k+\theta)/2}. \end{align*}

Proof. Let $k \geq 1$ . By the definition of the matrix U in Lemma 7.4, the sequence $(u_{j,k})_{j \geq 1}$ satisfies

\begin{align*}u_{j,k} = 0 \ \text{for} \ j \lt k, \ u_{k,k} = 1, \ u_{k+1,k} = \frac{\gamma_{k+1}}{\lambda_{k+1} - \lambda_{k}},\end{align*}
\begin{align*}u_{k+l,k} = \frac{1}{\lambda_{k+l} - \lambda_k} \left ( \gamma_{k+l} u_{k+l-1,k} + \theta\nu_0 \sum_{j=0}^{l-2} u_{k+j,k} \right ) \ \text{for} \ l \geq 2.\end{align*}

Let $M_k^{j} \,{:\!=}\, \sup_{i \leq j} u_{i,k}$ . By the definitions of $\gamma_{k+1}$ , $\lambda_{k+1}$ , and $\lambda_k$ (see Lemma 7.4), we have

\begin{align*}\gamma_{k+1} = \lambda_{k+1}- (k-1) \theta\nu_0 > \lambda_{k+1} - \lambda_{k}.\end{align*}

This together with the above expressions yields that

\begin{align*}M_k^k = 1, \qquad M^{k+1}_k = \frac{\lambda_{k+1}- (k-1) \theta\nu_0}{\lambda_{k+1} - \lambda_{k}} \leq \frac{\lambda_{k+1}}{\lambda_{k+1} - \lambda_{k}} = 1 + \frac{\lambda_{k}}{\lambda_{k+1} - \lambda_k}.\end{align*}

Moreover, for $l\geq 2$ , we have

\begin{align*}u_{k+l,k}\leq M^{k+l-1}_k \frac{\gamma_{k+l} + (l-1) \theta\nu_0}{\lambda_{k+l}-\lambda_k}=M^{k+l-1}_k \frac{\lambda_{k+l} - (k-1) \theta\nu_0}{\lambda_{k+l}-\lambda_k}\leq M^{k+l-1}_k \frac{\lambda_{k+l}}{\lambda_{k+l}-\lambda_k}.\end{align*}

Hence, we have

\begin{align*}M^{k+l}_k =M^{k+l-1}_k\vee u_{k+l,k} \leq M^{k+l-1}_k \times \frac{\lambda_{k+l}}{\lambda_{k+l} - \lambda_k} =M^{k+l-1}_k \times \left ( 1 + \frac{\lambda_{k}}{\lambda_{k+l} - \lambda_k} \right ).\end{align*}

As a consequence, we have

(7.35) \begin{eqnarray}\sup_{j \geq 1} u_{j,k} \leq \prod_{l=1}^{\infty} \left ( 1 + \frac{\lambda_{k}}{\lambda_{k+l} - \lambda_k} \right ) \,{=\!:}\, M^{\infty}_k. \end{eqnarray}

Since $\lambda_{k+l} \sim l^2$ as $l\to\infty$ , it is easy to see that the infinite product $M^{\infty}_k$ is finite. For $k=1$ we have $\lambda_1=0$ , so $M_1^\infty=1$ and the claimed bound trivially holds. For $k\geq 2$ ,

\begin{align*} M^{\infty}_k = \exp \left [ \sum_{l=1}^{\infty} \log \left ( 1 + \frac{\lambda_{k}}{\lambda_{k+l} - \lambda_k} \right ) \right ] \leq \exp \left [ \sum_{l=1}^{\infty} \frac{\lambda_{k}}{\lambda_{k+l} - \lambda_k} \right ] \leq \exp \left [ \frac{\lambda_{k} \log(2ek)}{2(k-1)} \right ], \end{align*}

where we used Lemma 7.6 (see below) in the last step. Since $\lambda_{k} = (k-1)(k+\theta)$ (see Lemma 7.4), the desired result follows.
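On a truncation, this bound is easy to eyeball numerically; a sketch (ours), reusing `U`, `theta` from the sketch after Lemma 7.4:

```python
import numpy as np

# Reusing U, theta from the sketch after Lemma 7.4: the truncated column
# maxima of U are lower bounds for sup_j u_{j,k}, and stay below the bound.
for k in range(1, 7):
    print(k, U[:, k - 1].max() <= (2 * np.e * k) ** ((k + theta) / 2))  # True
```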

Lemma 7.6. For all $k\geq 2$ ,

\begin{align*} {\sum_{l=1}^{\infty}} \frac{1}{\lambda_{k+l} - \lambda_k} \leq \frac{\log\!(2ek)}{2(k-1)}. \end{align*}

Proof. Using the definition of $\lambda_k$ in Lemma 7.4, we have

\begin{align*}{\sum_{l=1}^{\infty}} \frac{1}{\lambda_{k+l} - \lambda_k} &\leq {\sum_{l=1}^{\infty}} \frac{1}{(k+l)(k+l-1) - k(k-1)} \\[3pt] &\leq \frac{1}{(k+1)k - k(k-1)} + \int_k^{\infty} \frac{1}{x(x+1) - k(k-1)} \,{\textrm{d}} x \\[3pt] &= \frac{1}{2k} + \int_1^{\infty} \frac{1}{u^2 + (2k-1)u} \,{\textrm{d}} u = \frac{1}{2k} + \lim_{a \rightarrow \infty} \int_1^{a} \frac{1}{u(u + 2k-1)} \,{\textrm{d}} u \\[3pt] &\leq \frac{1}{2k-1}\left[1 + \lim_{a \rightarrow \infty} \left ( \int_1^{a} \frac{1}{u} \,{\textrm{d}} u - \int_1^{a} \frac{1}{u + 2k-1} \,{\textrm{d}} u \right ) \right]\\[3pt] &= \frac{1}{2k-1}\left[1 + \lim_{a \rightarrow \infty} \log\left(\frac{2ka}{a+2k-1}\right ) \right]\leq \frac{\log(2ek)}{2(k-1)}.\end{align*}
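Numerically, the partial sums of the series can be compared with the stated bound; a sketch (ours), reusing `lam` from the sketch after Lemma 7.4:

```python
import numpy as np

# Reusing lam from the sketch after Lemma 7.4: partial sums of the series in
# Lemma 7.6 (k >= 2) are dominated by the stated bound.
for k in range(2, 8):
    partial = sum(1.0 / (lam(k + l) - lam(k)) for l in range(1, 100_000))
    print(k, partial <= np.log(2 * np.e * k) / (2 * (k - 1)))  # expect True
```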

Lemma 7.7. For all $k\in{\mathbb N}$ , we have

\begin{align*} \ \sup_{y \in [0,1]} \left | S_k(y) \right | \leq 4^{k}. \end{align*}

Proof. By definition of the polynomials $S_k$ in (7.23), we have for $k\geq 1$

(7.36) \begin{eqnarray}\ \sup_{y \in [0,1]} \left | S_k(y) \right | \leq \sum_{i=1}^{k} | v_{k,i} |. \end{eqnarray}

Let us fix $k \geq 1$ and define $S^k_j \,{:\!=}\, \sum_{i = j}^{k} | v_{k,i} |$ . Note that $S^k_j = 0$ for $j > k$ and that $S^k_k = 1$ by the definition of the matrix $(v_{i,j})_{i,j \in {\mathbb N}}$ in Lemma 7.4. In particular, the result is true for $k=1$ . Thus, we assume that $k > 1$ from now on. Using (7.22), we see that for any $j \in [k-1]$ ,

\begin{align*}S^k_j = S^k_{j+1} + | v_{k,j} | &= S^k_{j+1} + \left | \frac{-1}{(\lambda_k - \lambda_j)} \left [ \left ( \sum_{l=j+2}^k v_{k,l} \right ) \theta\nu_0 + v_{k,j+1} \gamma_{j+1} \right ] \right | \\ &\leq S^k_{j+1} + \frac{1}{(\lambda_k - \lambda_j)} \left [ S^k_{j+2} \theta\nu_0 + (S^k_{j+1}-S^k_{j+2}) \gamma_{j+1} \right ] \\ &\leq \left ( 1 + \frac{\gamma_{j+1}}{\lambda_k - \lambda_j} \right ) S^k_{j+1} + \frac{\theta\nu_0 - \gamma_{j+1}}{\lambda_k - \lambda_j} S^k_{j+2}.\end{align*}

Note that $\frac{\theta\nu_0 - \gamma_{j+1}}{\lambda_k - \lambda_j}\leq 0$ , because of the definition of the coefficients $\gamma_i$ in Lemma 7.4. Thus, for $j\in[k-1]$ ,

(7.37) \begin{eqnarray}\ S^k_j \leq \left ( 1 + \frac{\gamma_{j+1}}{\lambda_k - \lambda_j} \right ) S^k_{j+1}. \end{eqnarray}

By the definitions of $\gamma_{j+1}$ , $\lambda_k$ , $\lambda_j$ in Lemma 7.4, and using that $j< k$ , we have

\begin{align*}\frac{\gamma_{j+1}}{\lambda_k - \lambda_j} &= \frac{(j+1)j + j {\theta \nu_1} + \theta\nu_0}{k(k-1) - j(j-1) + (k-j) {\theta \nu_1} + (k-j) \theta\nu_0} \\ &\leq \frac{(j+1)j}{j(k-1) - j(j-1)} + \frac{j {\theta \nu_1}}{(k-j) {\theta \nu_1}} + \frac{\theta\nu_0}{(k-j) \theta\nu_0} \leq 2\, \frac{j+1}{k-j}.\end{align*}

In particular,

\begin{align*} 1 + \frac{\gamma_{j+1}}{\lambda_k - \lambda_j} \leq \frac{k + j + 2}{k-j}. \end{align*}

Plugging this into (7.37) yields, for all $j\in[k-1]$ ,

\begin{align*} \ S^k_j \leq \frac{k + j + 2}{k-j} S^k_{j+1}. \end{align*}

Then, applying the previous inequality recursively and combining with $S^k_k = 1$ , we get

\begin{align*} \sum_{i=1}^{k} | v_{k,i} | = S^k_1 \leq \prod_{j=1}^{k-1} \frac{k + j + 2}{k-j} =\binom{2k+1}{k-1}= \frac{\binom{2k+1}{k-1} + \binom{2k+1}{k+2}}{2} \leq 2^{2k+1}/2 = 4^k. \end{align*}

Combining with (7.36), we obtain the desired result.
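This bound, too, is easy to inspect on a truncation; a sketch (ours), reusing `V`, `K` from the sketch after Lemma 7.4:

```python
import numpy as np

# Reusing V, K from the sketch after Lemma 7.4: by (7.36), the l1-norm of
# the k-th row of V dominates sup_y |S_k(y)|; it stays below 4^k (Lemma 7.7).
for k in range(1, K + 1):
    print(k, np.abs(V[k - 1, :k]).sum() <= 4.0 ** k)  # expect True
```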

Acknowledgements

We are grateful to E. Baake for many interesting discussions. We also thank Sebastian Hummel and two anonymous referees for their valuable suggestions to improve the manuscript.

Funding information

F. Cordero received financial support from the Deutsche Forschungsgemeinschaft (CRC 1283, ‘Taming Uncertainty’, Project C1). This paper is supported by NSFC grant No. 11688101. G. Véchambre acknowledges the support of the Deutsche Forschungsgemeinschaft (CRC 1283, ‘Taming Uncertainty’, Project C1) for his visit to Bielefeld University in January 2018. F. Cordero acknowledges the support of the NYU-ECNU Institute of Mathematical Sciences at NYU Shanghai for his visit to NYU Shanghai in July 2018.

Competing interests

The authors have no competing interests to declare that arose during the preparation or publication of this article.

Appendix A. $J_1$ -Skorokhod topology and weak convergence

A.1. Definitions and remarks on the $J_1$ -Skorokhod topology

For $T>0$ , as in the beginning of Section 2 we denote by ${\mathbb D}_{0,T}$ the space of càdlàg functions in [0, T] with values on ${\mathbb R}$ . Let $\mathcal{C}_T^\uparrow$ denote the set of increasing, continuous functions from [0, T] onto itself. For $\lambda\in\mathcal{C}_T^\uparrow$ , we set

(A.1) \begin{eqnarray}\lVert \lambda\rVert_T^0\,{:\!=}\, \sup_{0\leq u< s\leq T }\left \lvert \log\left(\frac{\lambda(s)-\lambda(u)}{s-u}\right)\right\rvert. \end{eqnarray}

We define the Billingsley metric $d_T^0$ in ${\mathbb D}_{0,T}$ via

(A.2) \begin{eqnarray}d_T^0(f,g)\,{:\!=}\, \inf_{\lambda\in\mathcal{C}_T^\uparrow}\{\lVert \lambda\rVert_T^0\vee \lVert f-g\circ\lambda\rVert_{T,\infty} \}, \ \text{where} \ \lVert f\rVert_{T,\infty}\,{:\!=}\, \sup_{s\in[0,T]}|f(s)|. \end{eqnarray}

The metric $d_T^0$ induces the $J_1$ -Skorokhod topology in ${\mathbb D}_{0,T}$ . An important feature is that the space $({\mathbb D}_{0,T},d_T^0)$ is separable and complete. The role of the time-change $\lambda$ in the definition of $d_T^0$ is to capture the fact that two càdlàg functions can be close in spite of a small difference between their jumping times.
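As a simple illustration (ours, not from the source): for $0<a<b<T$ , consider the unit-jump functions $f\,{:\!=}\,{\mathbb 1}_{[a,T]}$ and $g\,{:\!=}\,{\mathbb 1}_{[b,T]}$ . Taking $\lambda$ piecewise linear with $\lambda(0)=0$ , $\lambda(a)=b$ , and $\lambda(T)=T$ gives $f=g\circ\lambda$ , so that

\begin{align*} d_T^0(f,g)\leq \lVert\lambda\rVert_T^0=\left\lvert\log\frac{b}{a}\right\rvert\vee\left\lvert\log\frac{T-b}{T-a}\right\rvert, \end{align*}

which is small when $a$ and $b$ are close, even though $\lVert f-g\rVert_{T,\infty}=1$ .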

For $T>0$ , a function $\omega\in{\mathbb D}_{0,T}$ is said to be pure-jump if $\sum_{u\in(0,T]}|\Delta \omega(u)|<\infty$ and for all $t\in (0,T]$ ,

\begin{align*}\omega(t)-\omega(0)=\sum_{u\in(0,t]}\Delta \omega(u),\end{align*}

where $\Delta \omega(u)\,{:\!=}\, \omega(u)-\omega(u-\!)$ , $u\in[0,T]$ . In the set of pure-jump functions, we consider the following metric:

(A.3) \begin{eqnarray}d_T^\star (\omega_1,\omega_2)\,{:\!=}\, \inf_{\lambda\in\mathcal{C}_T^\uparrow}\left\{\lVert \lambda\rVert_T^0\vee \sum_{u\in[0,T]} \left\lvert \Delta \omega_1(u)-\Delta (\omega_2\circ \lambda)(u)\right\rvert \right\}.\end{eqnarray}

The next result provides comparison inequalities between the metrics $d_T^0$ and $d_T^\star$ .

Lemma A.1. Let $\omega_1$ and $\omega_2$ be two pure-jump functions with $\omega_1(0)=\omega_2(0)=0$ ; then

\begin{align*}d_T^0(\omega_1,\omega_2)\leq d_T^\star(\omega_1,\omega_2).\end{align*}

If $\omega_1$ and $\omega_2$ are non-decreasing, and $\omega_1$ jumps exactly n times in [0, T], then

\begin{align*}d_T^\star(\omega_1,\omega_2)\leq (4n+3) d_T^0(\omega_1,\omega_2).\end{align*}

Proof. Let $\lambda\in \mathcal{C}_T^\uparrow$ and set $f\,{:\!=}\, \omega_1$ and $g\,{:\!=}\, \omega_2\circ\lambda$ . Since f and g are pure-jump functions with the same value at 0, we have, for any $t\in[0,T]$ ,

\begin{align*}\lvert f(t)-g(t)\rvert=\left\lvert\sum_{u\in[0,t]}(\Delta f(u)-\Delta g(u))\right\rvert\leq \sum_{u\in[0,t]}\left\lvert\Delta f(u)-\Delta g(u)\right\rvert.\end{align*}

The first inequality follows. Now, assume that $\omega_1$ and $\omega_2$ are non-decreasing and that $\omega_1$ has n jumps in [0, T]. Let $t_1< \cdots< t_n$ be the consecutive jump times of $\omega_1$ . We first prove that, for any $k\in[n]$ ,

(A.4) \begin{equation}\sum_{u\in[0,t_k]}\lvert\Delta f(u)-\Delta g(u) \rvert \leq (4k+1) \lVert f-g\rVert_{t_k,\infty},\end{equation}

where $\lVert \cdot \rVert_{t,\infty}$ , $t>0$ , is defined in (A.2). We proceed by induction on k. Note that

\begin{align*}\sum_{u\in[0,t_1]}\lvert\Delta f(u)-\Delta g(u) \rvert &=\sum_{u\in[0,t_1)}\Delta g(u) +\lvert\Delta f(t_1)-\Delta g(t_1) \rvert\leq g(t_1-\!)+2\lVert f-g\rVert_{t_1,\infty}\\ &\leq 3\lVert f-g\rVert_{t_1,\infty},\end{align*}

which proves (A.4) for $k=1$ . Now, assuming that (A.4) is true for $k\in[n-1]$ , we obtain

\begin{align*} &\sum_{u\in[0,t_{k+1}]}\!\!\!\!\lvert\Delta f(u)-\Delta g(u) \rvert =\sum_{u\in[0,t_k]}\lvert\Delta f(u)-\Delta g(u) \rvert +\sum_{u\in(t_{k},t_{k+1})}\!\!\!\Delta g(u)\\&\quad +\lvert\Delta f(t_{k+1})-\Delta g(t_{k+1}) \rvert\\ &\quad \leq (4k+1)\lVert f-g\rVert_{t_k,\infty}+g(t_{k+1}-\!)-g(t_k)+2\lVert f-g\rVert_{t_{k+1},\infty}\\ &\quad = (4k+1)\lVert f-g\rVert_{t_k,\infty}+(g(t_{k+1}-\!)-f(t_{k+1}-\!))-(g(t_k)-f(t_{k}))+2\lVert f-g\rVert_{t_{k+1},\infty}\\ &\quad \leq (4k+1)\lVert f-g\rVert_{t_k,\infty}+4\lVert f-g\rVert_{t_{k+1},\infty}\leq (4(k+1)+1)\lVert f-g\rVert_{t_{k+1},\infty}.\end{align*}

Hence, (A.4) also holds for $k+1$ . This ends the proof of (A.4) by induction. Finally, using (A.4), we get

\begin{align*} \sum_{u\in[0,T]}\lvert\Delta f(u)-\Delta g(u) \rvert &=\sum_{u\in[0,t_{n}]}\lvert\Delta f(u)-\Delta g(u) \rvert+\sum_{u\in(t_{n},T]}\Delta g(u)\\& \leq (4n+1)\lVert f-g\rVert_{t_{n},\infty}+ g(T)-g(t_n)\leq (4n+3)\lVert f-g\rVert_{T,\infty},\end{align*}

ending the proof.

A.2. Bounded Lipschitz metric and weak convergence

Let (E, d) denote a complete and separable metric space. It is well known that the topology of weak convergence of probability measures on E is induced by the Prokhorov metric. An alternative metric inducing this topology is given by the bounded Lipschitz metric, whose definition is recalled in this section.

Definition 1. (Lipschitz function.) A real-valued function F on (E, d) is said to be Lipschitz if there is $K>0$ such that

\begin{align*}|F(x)-F(y)|\leq K d(x,y),\quad\textrm{for all $x,y\in E$}.\end{align*}

We denote by $\textrm{BL}(E)$ the vector space of bounded Lipschitz functions on E. The space $\textrm{BL}(E)$ is equipped with the norm

(A.5) \begin{equation}\lVert F\rVert_{\textrm{BL}}\,{:\!=}\, \sup_{x\in E}|F(x)| \vee \sup_{x,y\in E:\, x\neq y}\left\{\frac{|F(x)-F(y)|}{d(x,y)}\right\},\quad F\in \textrm{BL}(E). \end{equation}

Definition 2. (Bounded Lipschitz metric.) Let $\mu,\nu$ be two probability measures on E. The bounded Lipschitz distance between $\mu$ and $\nu$ is defined by

(A.6) \begin{equation}\varrho_{E}(\mu,\nu)\,{:\!=}\, \sup\left\{\left\lvert \int F d\mu- \int F d\nu\right\rvert\,{:}\, F\in \textrm{BL}(E),\, \lVert F\rVert_{\textrm{BL}}\leq 1 \right\}. \end{equation}

The bounded Lipschitz distance defines a metric on the space of probability measures on E. Moreover, the bounded Lipschitz distance metrizes the weak convergence of probability measures on E; i.e.

\begin{align*}\varrho_E(\mu_n,\mu)\xrightarrow[n\to\infty ]{}0\quad \Longleftrightarrow\quad \mu_n \xrightarrow[n\to\infty]{(d)}\mu.\end{align*}
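For a quick illustration (ours, not from the source): if $\mu=\delta_x$ and $\nu=\delta_y$ are Dirac measures, then for any $F\in \textrm{BL}(E)$ with $\lVert F\rVert_{\textrm{BL}}\leq 1$ we have $|F(x)-F(y)|\leq d(x,y)\wedge 2$ , whence

\begin{align*} \varrho_E(\delta_x,\delta_y)\leq d(x,y)\wedge 2, \end{align*}

so $\delta_{x_n}\to\delta_x$ weakly whenever $d(x_n,x)\to 0$ .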

Appendix B. Table of notation

References

Baake, E., Cordero, F. and Hummel, S. (2022). Lines of descent in the deterministic mutation-selection model with pairwise interaction. Ann. Appl. Prob. 32, 2400–2447.
Baake, E., Lenz, U. and Wakolbinger, A. (2016). The common ancestor type distribution of a $\Lambda$ -Wright–Fisher process with selection and mutation. Electron. Commun. Prob. 21, 16 pp.
Baake, E. and Wakolbinger, A. (2018). Lines of descent under selection. J. Statist. Phys. 172, 156–174.
Bansaye, V., Caballero, M.-E. and Méléard, S. (2019). Scaling limits of population and evolution processes in random environment. Electron. J. Prob. 24, 38 pp.
Bansaye, V., Kurtz, T. G. and Simatos, F. (2016). Tightness for processes with fixed points of discontinuities and applications in varying environment. Electron. Commun. Prob. 21, 9 pp.
Barczy, M., Li, Z. and Pap, G. (2015). Yamada–Watanabe results for stochastic differential equations with jumps. Internat. J. Stoch. Anal. 2015, article no. 460472, 23 pp.
Biswas, N., Etheridge, A. and Klimek, A. (2021). The spatial Lambda-Fleming–Viot process with fluctuating selection. Electron. J. Prob. 26, 51 pp.
Bull, J. J. (1987). Evolution of phenotypic variance. Evolution 41, 303–315.
Bürger, R. and Gimelfarb, A. (2002). Fluctuating environments and the role of mutation in maintaining quantitative genetic variation. Genet. Res. 80, 31–46.
Chetwynd-Diggle, J. and Klimek, A. (2019). Rare mutations in the spatial Lambda-Fleming–Viot model in a fluctuating environment and SuperBrownian motion. Preprint. Available at https://arxiv.org/abs/1901.04374.
Cordero, F. (2017). Common ancestor type distribution: a Moran model and its deterministic limit. Stoch. Process. Appl. 127, 590–621.
Cordero, F., Hummel, S. and Schertzer, E. (2022). General selection models: Bernstein duality and minimal ancestral structures. Ann. Appl. Prob. 32, 1499–1556.
Cordero, F. and Möhle, M. (2019). On the stationary distribution of the block counting process for population models with mutation and selection. J. Math. Anal. Appl. 474, 1049–1081.
Dynkin, E. B. (1965). Markov Processes, Vol. 1. Springer, Berlin, Heidelberg.
Etheridge, A. M., Griffiths, R. C. and Taylor, J. E. (2010). A coalescent dual process in a Moran model with genic selection, and the lambda coalescent limit. Theoret. Pop. Biol. 78, 77–92.
Foucart, C. (2013). The impact of selection in the $\Lambda$ -Wright–Fisher model. Electron. Commun. Prob. 18, 10 pp.
Fu, Z. and Li, Z. (2010). Stochastic equations of non-negative processes with jumps. Stoch. Process. Appl. 120, 306–330.
Gillespie, J. H. (1972). The effects of stochastic environments on allele frequencies in natural populations. Theoret. Pop. Biol. 3, 241–248.
González Casanova, A. and Spanò, D. (2018). Duality and fixation in $\Xi$ -Wright–Fisher processes with frequency-dependent selection. Ann. Appl. Prob. 28, 250–284.
González Casanova, A., Spanò, D. and Wilke-Berenguer, M. (2019). The effective strength of selection in random environment. Preprint. Available at https://arxiv.org/abs/1903.12121.
Guillin, A., Jabot, F. and Personne, A. (2020). On the Simpson index for the Moran process with random selection and immigration. Internat. J. Biomath. 13, article no. 2050046.
Guillin, A., Personne, A. and Strickler, E. (2019). Persistence in the Moran model with random switching. Preprint. Available at https://arxiv.org/abs/1911.01108.
Jansen, S. and Kurt, N. (2014). On the notion(s) of duality for Markov processes. Prob. Surveys 11, 59–120.
Kallenberg, O. (2021). Foundations of Modern Probability, 3rd edn. Springer, Cham.
Karlin, S. and Levikson, B. (1974). Temporal fluctuations in selection intensities: case of small population size. Theoret. Pop. Biol. 6, 383–412.
Karlin, S. and Liberman, U. (1975). Random temporal variation in selection intensities: one-locus two-allele model. J. Math. Biol. 2, 1–17.
Karlin, S. and Liberman, U. (1974). Random temporal variation in selection intensities: case of large population size. Theoret. Pop. Biol. 6, 355–382.
Kimura, M. (1962). On the probability of fixation of mutant genes in a population. Genetics 47, 713–719.
Krone, S. M. and Neuhauser, C. (1997). Ancestral processes with selection. Theoret. Pop. Biol. 51, 210–237.
Kurtz, T. G. (2011). Equivalence of stochastic equations and martingale problems. In Stochastic Analysis 2010, ed. D. Crisan, Springer, Berlin, pp. 113–130.
Lenz, U., Kluth, S., Baake, E. and Wakolbinger, A. (2015). Looking down in the ancestral selection graph: a probabilistic approach to the common ancestor type distribution. Theoret. Pop. Biol. 103, 27–37.
Li, Z. and Pu, F. (2012). Strong solutions of jump-type stochastic equations. Electron. Commun. Prob. 17, 13 pp.
Möhle, M. (2001). Forward and backward diffusion approximations for haploid exchangeable population models. Stoch. Process. Appl. 95, 133–149.
Neuhauser, C. (1999). The ancestral graph and gene genealogy under frequency-dependent selection. Theoret. Pop. Biol. 56, 203–214.
Neuhauser, C. and Krone, S. M. (1997). The genealogy of samples in models with selection. Genetics 145, 519–534.
Sagitov, S., Jagers, P. and Vatutin, V. (2010). Coalescent approximation for structured populations in a stationary random environment. Theoret. Pop. Biol. 78, 192–199.
Van Casteren, J. A. (1992). On martingales and Feller semigroups. Results Math. 21, 274–288.

Figure 1. Left: a realization of the Moran IPS; time runs forward from left to right; the environment has peaks at times $t_0$ and $t_1$. Right: the ASG that arises from the second and third lines (from bottom to top) in the left picture, with the potential ancestors drawn in black; time runs backward from right to left; backward time $\beta\in[0,T]$ corresponds to forward time $t=s+T-\beta$.


Figure 2. The descendant line (D) splits into the continuing line (C) and the incoming line (I). The incoming line is ancestral if and only if it is of type 0. The true ancestral line is drawn in bold.


Figure 3. An illustration of a path of the process $X^\omega$ in the interval [0, T]. The grey vertical lines represent the peaks of the environment $\omega$; $t_1$, $t_2$, $t_3$, and $t_4$ are the jump times of $\omega$.


Figure 4. Illustration of a path of the process $X^\omega$ (grey path) and the killed ASG $\mathcal{G}_T^\omega$ (black lines) embedded in the same picture. Forward time t runs from left to right; backward time $\beta\,{:\!=}\, T-t$ runs from right to left. The environment $\omega$ jumps at forward times $t_0$, $t_1$, and $t_2$.


Figure 5. An illustration of two paths of $X^\omega$: the black (resp. grey) path is defined in the interval $[\!-(\tau+h),0]$ (resp. $[\!-\tau,0]$) starting at $X^\omega(\!-(\tau+h))=x$ (resp. $X^\omega(\!-\tau)=x$). The peaks of the environment $\omega$ are depicted as grey vertical lines.


Figure 6. An illustration of two paths (thin and thick) of $X^{J\otimes_{\tau_\star^{}}^{}\omega}$ in $[\!-\tau,0]$ starting at x. Both paths are subject to the same deterministic environment $\omega$ in $[\!-{\tau_\star^{}},0]$; the vertical lines in $(\!-{\tau_\star^{}},0]$ represent the peaks of $\omega$. The environment in $[\!-\tau,-\tau_\star^{})$ is random and driven by J; the peaks of the realization of J giving rise to the thick (resp. thin) path in $[\!-\tau,-{\tau_\star^{}})$ are depicted as solid (resp. dotted) vertical lines.


Figure 7. Illustration of the construction of the quenched ASG. The environment $\omega$ has jumps at forward times $t_0$, $t_1$, $t_2$; backward times $\beta_1,\ldots,\beta_5$ belong to the set of potential branching times $\tilde I_{{\text{bran}}}$. Virtual lines are depicted in grey or black; active lines are black. The ASG in [0, T] consists of the set of active lines together with their connections and mutation marks.


Figure 8. LD-ASG (left) and its pLD-ASG (right). Backward time $\beta \in [0,T]$ runs from right to left. In the LD-ASG, levels remain constant between the dashed lines; in particular, they are not affected by mutation events. In the pLD-ASG, lines are pruned at mutation events, where an additional updating of the levels takes place. The bold line in the pLD-ASG represents the immune line.