
Concentration of measure for graphon particle system

Published online by Cambridge University Press:  19 January 2024

Erhan Bayraktar*
Affiliation:
University of Michigan
Donghan Kim*
Affiliation:
University of Michigan
*Postal address: Department of Mathematics, 530 Church Street, Ann Arbor, MI 48109.
**Email address: [email protected]

Abstract

We study heterogeneously interacting diffusive particle systems with mean-field-type interaction characterized by an underlying graphon and their finite particle approximations. Under suitable conditions, we obtain exponential concentration estimates over a finite time horizon for both 1- and 2-Wasserstein distances between the empirical measures of the finite particle systems and the averaged law of the graphon system.

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

In this article, we study concentration of measure for the graphon particle system and its finite particle approximations. This work is a continuation of the earlier papers [Reference Bayraktar, Chakraborty and Wu2–Reference Bayraktar and Wu4]. A graphon particle system consists of uncountably many heterogeneous particles $X_u$ for $u \in [0, 1]$ whose interactions are characterized by a graphon. More precisely, for a fixed $T > 0$ and $d \in \mathbb{N}$ , we consider the following system:

(1.1) \begin{align} X_u(t) & = X_u(0) \nonumber + \int_0^t \bigg[ \int_0^1 \int_{\mathbb{R}^d} \phi \big(X_u(s), y \big) G(u, v) \, \mu_{v, s} (dy) \, dv + \psi \big(X_u(s)\big) \bigg] ds + \sigma B_u(t), \\[5pt] & \mu_{u, t} \text{ is the probability distribution of } X_u(t), \text{ for every } u \in [0, 1] \text{ and } t \in [0, T], \end{align}

where $\{B_u\}_{u \in [0, 1]}$ is a family of independent and identically distributed d-dimensional Brownian motions, and $\{X_u(0)\}_{u \in [0, 1]}$ is a collection of independent (but not necessarily identically distributed) $\mathbb{R}^d$ -valued random variables, with $X_u(0)$ having law $\mu_u(0)$ for each $u \in [0, 1]$ , independent of $\{B_u\}_{u \in [0, 1]}$ ; all of these are defined on a filtered probability space $(\Omega, \mathscr{F}, \{\mathscr{F}_t\}, \mathbb{P})$ . The two functions $\phi \;:\; \mathbb{R}^d \times \mathbb{R}^d \rightarrow \mathbb{R}^d$ and $\psi \;:\; \mathbb{R}^d \rightarrow \mathbb{R}^d$ represent pairwise interactions between the particles and single-particle drift, respectively. The quantity $\sigma \in \mathbb{R}^{d \times d}$ is a constant matrix, and $G \;:\; [0, 1] \times [0, 1] \rightarrow [0, 1]$ is a graphon, that is, a symmetric measurable function.

Along with the graphon particle system, we introduce two finite particle systems with heterogeneous interactions, which approximate (1.1). For a fixed, arbitrary $n \in \mathbb{N}$ and each $i \in [n] \;:\!=\; \{ 1, \cdots, n\}$ , we first consider the ‘not-so-dense’ analogue of (1.1) introduced in Section 4 of [Reference Bayraktar, Chakraborty and Wu2]:

(1.2) \begin{equation} X^n_i(t) = X_{\frac{i}{n}}(0) + \int_0^t \Bigg[ \frac{1}{n p(n)} \sum_{j=1}^n \xi^n_{ij} \phi \big(X^n_i(s), X^n_j(s) \big) + \psi \big( X^n_i(s) \big) \Bigg] ds + \sigma B_{\frac{i}{n}}(t),\end{equation}

where $\{p(n)\}_{n \in \mathbb{N}} \subset (0, 1]$ is a sequence of numbers and $\{\xi^n_{ij}\}_{1 \le i, j \le n}$ are independent Bernoulli random variables satisfying

\begin{equation*} \xi^n_{ij} = \xi^n_{ji}, \qquad \mathbb{P}(\xi^n_{ij} = 1) = p(n) G\bigg(\frac{i}{n}, \, \frac{j}{n}\bigg), \qquad \text{for every } i, j \in [n],\end{equation*}

independent of $\{B_{i/n}, X_{i/n}(0) \;:\; i \in [n]\}$ . Here, $p(n)$ represents the global sparsity parameter, and the strength of interaction between the particles in (1.2) is scaled by $np(n)$ , the order of the number of neighbors, as in mean-field systems on Erdős–Rényi random graphs [Reference Bhamidi, Budhiraja and Wu7, Reference Delarue15, Reference Oliveira and Reis27]. If $p(n) \rightarrow 0$ as $n \rightarrow \infty$ , the graph is sparse; we shall nevertheless consider the case $np(n) \rightarrow \infty$ , i.e., the random graph is ‘not so dense’, meaning that the average degree of the graph diverges.

The other finite particle approximation system is given by

(1.3) \begin{equation} \bar{X}^n_i(t) = X_{\frac{i}{n}}(0) + \int_0^t \Bigg[ \frac{1}{n} \sum_{j=1}^n \phi \big(\bar{X}^n_i(s), \bar{X}^n_j(s) \big) G\bigg(\frac{i}{n}, \frac{j}{n}\bigg) + \psi \big( \bar{X}^n_i(s) \big) \Bigg] ds + \sigma B_{\frac{i}{n}}(t).\end{equation}

Since this system has a nonrandom coefficient for the interaction term (but still models heterogeneous interaction via the graphon), it is easier to analyze than the other finite particle system (1.2). We note that the three systems (1.1)–(1.3) are coupled in the sense that they share initial particle locations $X_{i/n}(0)$ and Brownian motions $B_{i/n}$ for $i \in [n]$ .
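To make the coupling of (1.2) and (1.3) concrete, the following is a minimal simulation sketch and not part of the original analysis. It assumes $d = 1$ , a scalar $\sigma$ , an Euler–Maruyama time discretization, standard normal initial positions, and illustrative choices $G(u, v) = 1 - \max(u, v)$ , $\phi(x, y) = \sin(y - x)$ , $\psi(x) = -x$ ; the function name `simulate_coupled` and all of its parameters are hypothetical.

```python
import math
import random


def simulate_coupled(n, T=1.0, steps=100, p_n=1.0, sigma=1.0, seed=0,
                     G=lambda u, v: 1.0 - max(u, v),
                     phi=lambda x, y: math.sin(y - x),
                     psi=lambda x: -x):
    """Euler-Maruyama sketch of systems (1.2) and (1.3) for d = 1.

    Both systems share the initial positions and the Brownian increments.
    All default choices (graphon, phi, psi, initial law) are assumptions.
    """
    rng = random.Random(seed)
    dt = T / steps

    # Symmetric Bernoulli edges with P(xi_ij = 1) = p(n) * G(i/n, j/n).
    xi = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            if rng.random() < p_n * G((i + 1) / n, (j + 1) / n):
                xi[i][j] = xi[j][i] = 1

    # Shared initial positions X_{i/n}(0); the N(0, 1) law is an assumption.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    xbar = list(x)

    for _ in range(steps):
        # Shared Brownian increments B_{i/n}(t + dt) - B_{i/n}(t).
        dB = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n)]
        new_x, new_xbar = [], []
        for i in range(n):
            # Random-graph interaction of (1.2), scaled by n * p(n).
            drift = sum(xi[i][j] * phi(x[i], x[j])
                        for j in range(n)) / (n * p_n)
            # Deterministic graphon-weighted interaction of (1.3).
            drift_bar = sum(G((i + 1) / n, (j + 1) / n) * phi(xbar[i], xbar[j])
                            for j in range(n)) / n
            new_x.append(x[i] + (drift + psi(x[i])) * dt + sigma * dB[i])
            new_xbar.append(xbar[i] + (drift_bar + psi(xbar[i])) * dt
                            + sigma * dB[i])
        x, xbar = new_x, new_xbar
    return x, xbar
```

With $G \equiv 1$ and $p(n) = 1$ every edge is present, so the two discretized systems coincide; this gives a quick sanity check of the coupling.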

Law-of-large-numbers-type results on the convergence of the systems (1.2) and (1.3) to the graphon particle system (1.1) under suitable conditions are studied in [Reference Bayraktar, Chakraborty and Wu2]. Results on the exponential ergodicity of the two systems (1.1) and (1.2), as well as the uniform-in-time convergence of (1.2) to (1.1) under a certain dissipativity condition, are presented in [Reference Bayraktar and Wu3]. There are numerous recent studies of graphon particle systems [Reference Bayraktar and Wu4, Reference Coppini14] and works on associated heterogeneously interacting finite particle models [Reference Bet, Coppini and Nardi6, Reference Coppini13, Reference Delattre, Giacomin and Luçon17, Reference Lacker and Soret25–Reference Oliveira and Reis27]. These studies have been undertaken because graphons have been widely applied in mean-field game theory for both the static and dynamic cases; see e.g. [Reference Aurell, Carmona and Laurière1, Reference Bayraktar, Wu and Zhang5, Reference Caines and Huang10–Reference Carmona, Cooney, Graves and Laurière12, Reference Gao, Caines and Huang20, Reference Gao, Tchuendom and Caines21, Reference Parise and Ozdaglar28, Reference Tchuendom, Caines and Huang30, Reference Tchuendom, Caines and Huang31, Reference Vasal, Mishra and Vishwanath33] and references therein.

Among these studies, our work is particularly linked to [Reference Bayraktar and Wu4]. With $W_1$ denoting the 1-Wasserstein distance, with the empirical measures of the three particle systems at time $t \in [0, T]$ defined by

(1.4) \begin{equation} L_{n, t} \;:\!=\; \frac{1}{n} \sum_{i=1}^n \delta_{(X^n_i(t))}, \qquad \bar{L}_{n, t} \;:\!=\; \frac{1}{n} \sum_{i=1}^n \delta_{(\bar{X}^n_i(t))}, \qquad \widetilde{L}_{n, t} \;:\!=\; \frac{1}{n} \sum_{i=1}^n \delta_{(X_{i/n}(t))},\end{equation}

and with the averaged law $\widetilde{\mu}_t \;:\!=\; \int_0^1 \mu_{u, t} \, du$ of the graphon system (1.1), [Reference Bayraktar and Wu4] computes concentration bounds of the types $\mathbb{P} \big[ \sup_{0 \le t \le T} W_1(\bar{L}_{n, t}, \widetilde{\mu}_t) > \epsilon \big]$ , $\sup_{t \ge 0} \mathbb{P} \big[ W_1(\bar{L}_{n, t}, \widetilde{\mu}_t) > \epsilon \big]$ for $\epsilon > 0$ under certain conditions. In particular, uniform-in-time concentration bounds of the latter type are studied in an infinite-time-horizon setting under an extra dissipativity condition on $\psi$ . These results are established by computing certain sub-Gaussian estimates rather directly with the moment generating function of the standard normal random vector (Lemmas 3.7–3.10 of [Reference Bayraktar and Wu4]).

In contrast, the present work focuses on the case of finite time horizon and deals with a more general sparsity sequence $\{p(n)\}_{n \in \mathbb{N}} \subset (0, 1]$ for (1.2), whereas the results of [Reference Bayraktar and Wu4] cover only the dense graphs, i.e., $p(n) \equiv 1$ . For our argument we adopt the method of [Reference Delarue, Lacker and Ramanan16], as follows. We first compute the bound on the probability that Lipschitz functions of the finite particles $\bar{X}^n = (\bar{X}^n_1, \cdots, \bar{X}^n_n)$ from the system (1.3) on the space $\big(C([0, T] \;:\; \mathbb{R}^d)\big)^n$ deviate from their means (Theorem 3.1). The derivation of this ‘concentration bound around their means’ relies on transportation inequalities from [Reference Djellout, Guillin and Wu18]. Combining this bound with the fact, presented in Section 3.1, that the expectations of the $W_2$ -distances between the empirical measures in (1.4) converge to zero as the number of particles goes to infinity, we show exponential concentration bounds on the probabilities

(1.5) \begin{equation} \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_p(\bar{L}_{n, t}, \widetilde{\mu}_t) > \epsilon \bigg]\end{equation}

for $p = 1$ (Theorem 3.3).

The advantage of using transportation inequalities to find ‘concentration around means’, compared to the approach of [Reference Bayraktar and Wu4], is that we can even derive the same exponential bound (1.5) in the case of $p = 2$ (Theorem 3.5), at the cost of assuming independence and (3.3) on the initial particles. More specifically, we can obtain a dimension-free concentration bound around means,

\begin{equation*} \mathbb{P} \Big[ F(\bar{X}^n) - \mathbb{E}\big(F(\bar{X}^n)\big) > a \Big] \le 2 \exp\!(\!-\!\delta a^2),\end{equation*}

for every Lipschitz function F and $a > 0$ (see (3.4))—i.e., the right-hand side does not depend on n—from the quadratic transportation inequality (3.3) on the initial particles. This bound can be derived thanks to the remarkable result on the dimension-free $W_2$ -tensorization of transportation inequalities from [Reference Gozlan and Léonard22] (Lemma 2.7(ii)). Then, again by combining the dimension-free concentration bound around means with the results in Section 3.1, we ultimately arrive at the exponential bound (1.5) for $p = 2$ . The authors of [Reference Delarue, Lacker and Ramanan16] apply this method to compute similar concentration bounds when $\bar{X}^n$ represents the state of the so-called Nash equilibrium of a symmetric n-player stochastic differential game and $\widetilde{\mu}$ is the measure flow of the unique equilibrium of the corresponding mean-field game (we refer to Section 2 of [Reference Delarue, Lacker and Ramanan16] for a more detailed description of these terms).

Moreover, inspired by the argument using Bernstein’s inequality in [Reference Oliveira and Reis27], we compare the particles $X^n_i$ and $\bar{X}^n_i$ to improve the exponential bound in [Reference Bayraktar and Wu4] for

\begin{equation*} \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_p(L_{n, t}, \widetilde{\mu}_t) > \epsilon \bigg]\end{equation*}

when $p = 1$ , at the cost of an assumption on the interaction function $\phi$ , namely, that it is a member of the $L^1$ -Fourier class. When $p=2$ , we also present a similar exponential bound for the system (1.2) on the dense graphs ( $p(n) \equiv 1$ ). This requires a refinement of the Bernstein-type inequality for the cut norm in [Reference Oliveira and Reis27]; see Lemma 3.1. Without such a condition on $\phi$ , we have a similar exponential bound, but in terms of the bounded Lipschitz metric ( $d_{BL}$ metric), a weaker metric than $W_1$ .

This paper is organized as follows. In Section 2 we introduce the notation, state the assumptions, and recall some of the relevant existing results concerning the particle systems (1.1)–(1.3), as well as other preliminary results. Section 3 provides our main results, and Section 4 gives the proofs of these results.

2. Preliminaries

In this section, we first introduce the notation which will be used throughout this paper. We then state several assumptions and some of the basic results on the particle systems (1.1)–(1.3) from [Reference Bayraktar, Chakraborty and Wu2, Reference Bayraktar and Wu4], and provide several well-known results regarding transportation cost inequalities without proof. Finally, we introduce Bernstein’s inequality and the concept of the $L^1$ -Fourier class.

2.1. Notation

Given a metric space (S, d) and a function $f \;:\; S \rightarrow \mathbb{R}$ , we define

\begin{align*} \vert\vert f \vert\vert_{\infty} &\;:\!=\; \sup \{ \vert f(x) \vert \;:\; x \in S\}, \\[5pt] \vert\vert f \vert\vert_{Lip} &\;:\!=\; \sup \bigg\{ \frac{\vert f(x) - f(y) \vert}{d(x, y)} \;:\; x, y \in S, \, x \neq y \bigg\}, \\[5pt] \vert\vert f \vert\vert_{BL} &\;:\!=\; \vert\vert f \vert\vert_{\infty} + \vert\vert f \vert\vert_{Lip},\end{align*}

and we say that f is Lipschitz (respectively, bounded Lipschitz) if $\vert\vert f \vert\vert_{Lip} < \infty$ (respectively, $\vert\vert f \vert\vert_{BL} < \infty$ ). In particular, f is called a-Lipschitz if $\Vert f \Vert_{Lip} \le a$ .

Denote by $\mathcal{P}(S)$ the space of Borel probability measures on S. We shall use the standard notation $\langle \mu, \varphi \rangle \;:\!=\; \int_S \varphi \, d\mu$ for integrable functions $\varphi$ and measures $\mu$ on S. When $(S, \vert\vert \cdot \vert\vert)$ is a normed space, we write $\mathcal{P}^p(S, \vert\vert\cdot\vert\vert)$ for the set of $\mu \in \mathcal{P}(S)$ satisfying $\langle \mu, \vert\vert\cdot\vert\vert^p \rangle < \infty$ for a given $p \in [1, \infty)$ . We denote by $Lip(S, \vert\vert\cdot\vert\vert)$ the set of 1-Lipschitz functions, i.e., $f \;:\; S \rightarrow \mathbb{R}$ satisfying $\vert f(x) - f(y) \vert \le \vert\vert x-y \vert\vert$ for every $x, y \in S$ .

For a separable Banach space $(S, \vert\vert\cdot\vert\vert)$ , we endow $\mathcal{P}^p(S, \vert\vert\cdot\vert\vert)$ with the p-Wasserstein metric

(2.1) \begin{equation} W_{p, (S, \vert\vert\cdot\vert\vert)} (\mu, \nu) \;:\!=\; \inf_{\pi} \Bigg(\int_{S \times S} \vert\vert x-y \vert\vert^p \pi(dx, \, dy) \Bigg)^{1/p}, \qquad p \ge 1,\end{equation}

where the infimum is taken over all probability measures $\pi$ on $S \times S$ with first and second marginals $\mu$ and $\nu$ . We also consider the product space $S^n \;:\!=\; S \times \cdots \times S$ , equipped with the $\ell^p$ norm (for any $p \ge 1$ )

(2.2) \begin{equation} \vert\vert x \vert\vert_{n, p} = \Bigg( \sum_{i=1}^n \vert\vert x_i \vert\vert^p \Bigg)^{1/p},\end{equation}

for $x = (x_1, \cdots, x_n) \in S^n$ . When the space S or the norm $\vert\vert\cdot\vert\vert$ is understood, we sometimes omit it from the above notation.

Denote by $C([0, T] \;:\; S)$ the space of continuous functions from [0, T] to S. For $x \in C([0, T] \;:\; \mathbb{R}^d)$ and $t \in [0, T]$ , we write $\vert\vert x \vert \vert_{\star, t} \;:\!=\; \sup_{0 \le s \le t}\vert x_s \vert$ , where $\vert \cdot \vert$ is the usual Euclidean norm on $\mathbb{R}^d$ . We write $\mathcal{L}(X)$ for the probability law of a random variable X and $[n] \;:\!=\; \{1, \cdots, n\}$ for any $n \in \mathbb{N}$ . We use K to denote various positive constants throughout the paper; its value may change from line to line.

For a Polish space (S, d) with Borel $\sigma$ -field $\mathcal{S}$ , we also consider the space of probability measures over $(S, \mathcal{S})$ endowed with the topology of weak convergence, which is metrized by the BL metric, defined for $\mu, \nu \in \mathcal{P}(S)$ by

(2.3) \begin{equation} d_{BL}(\mu, \nu) \;:\!=\; \sup \bigg\{ \Big\vert \int_S \, f \, d(\mu-\nu) \Big\vert \;:\; f\;:\;S \rightarrow \mathbb{R} \text{ with } \vert\vert f \vert\vert_{BL} \le 1 \bigg\}.\end{equation}

Note the dual representation of the 1-Wasserstein metric

(2.4) \begin{equation} W_1(\mu, \nu) = \sup \bigg\{ \Big\vert \int_S \, f \, d(\mu-\nu) \Big\vert \;:\; f\;:\;S \rightarrow \mathbb{R} \text{ with } \vert\vert f \vert\vert_{Lip} \le 1 \bigg\},\end{equation}

along with the relationship $d_{BL} \le W_1$ . We shall also use the following notation: for given $\mu, \nu \in \mathcal{P}(C([0, T] \;:\; \mathbb{R}^d))$ ,

\begin{equation*} W_{p, t}(\mu, \nu) \;:\!=\; \inf_{\pi} \bigg( \int \vert \vert x- y \vert \vert^p_{\star, t} \, \pi(dx, \, dy) \bigg)^{1/p}, \qquad t \in [0, T], \quad p \ge 1,\end{equation*}

where the infimum is taken over all probability measures $\pi$ with marginals $\mu$ and $\nu$ .
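For intuition about the Wasserstein metric (2.1), the following sketch computes $W_p$ between two empirical measures with the same number of atoms on the real line. It relies on the standard fact that, for $d = 1$ and $p \ge 1$ , an optimal coupling pairs the sorted samples; it does not cover $d > 1$ or the path-space metric $W_{p, t}$ . The function name `wasserstein_1d` is hypothetical.

```python
def wasserstein_1d(xs, ys, p=1):
    """W_p between the empirical measures of xs and ys on the real line.

    Assumes equally many atoms on both sides and p >= 1; in dimension one
    the monotone (sorted) pairing is an optimal coupling.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("need two non-empty samples of equal size")
    if p < 1:
        raise ValueError("p must be at least 1")
    n = len(xs)
    cost = sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys)))
    return (cost / n) ** (1.0 / p)
```

For example, `wasserstein_1d([0.0, 0.0], [0.0, 2.0], p=1)` equals 1 while the same call with `p=2` equals $\sqrt{2}$ , in line with $W_1 \le W_2$ .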

Let us define three $n \times n$ random matrices $P^{(n)}$ , $\bar{P}^{(n)}$ , and $D^{(n)}$ , related to the systems (1.2), (1.3), for every $n \in \mathbb{N}$ , with entries

(2.5) \begin{align} P^{(n)}_{i, j} &\;:\!=\; \frac{\xi^n_{ij}}{np(n)}, \qquad \qquad i, j \in [n], \nonumber \\[5pt] \bar{P}^{(n)}_{i, j} &\;:\!=\; \frac{1}{n}G\bigg(\frac{i}{n}, \frac{j}{n}\bigg), \;\qquad i, j \in [n], \\[5pt] D^{(n)} &\;:\!=\; P^{(n)}-\bar{P}^{(n)}. \nonumber\end{align}

For these matrices, we define the $\ell_{\infty} \rightarrow \ell_1$ norm of an $n \times n$ matrix A by

(2.6) \begin{equation} \vert\vert A \vert\vert_{\infty \rightarrow 1} \;:\!=\; \sup \Big\{ \big\langle \mathbf{x}, \, A \mathbf{y} \big\rangle \, : \, \mathbf{x}, \, \mathbf{y} \in [-1, 1]^n \Big\}.\end{equation}

This norm is known to be equivalent to the so-called cut norm (see (3.3) of [Reference Guédon and Vershynin23]).
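For very small matrices the norm (2.6) can be evaluated by brute force, which may help when checking examples by hand. The sketch below uses two elementary observations: the bilinear form $\langle \mathbf{x}, A\mathbf{y} \rangle$ attains its supremum over the cube at sign vectors, and for a fixed $\mathbf{y}$ the best $\mathbf{x}$ is the sign of $A\mathbf{y}$ . The cost grows like $2^n$ , so this is only an illustration; the name `norm_inf_to_1` is hypothetical.

```python
from itertools import product


def norm_inf_to_1(A):
    """Brute-force evaluation of the l_infinity -> l_1 norm in (2.6).

    For each sign vector y, the optimal x is sign(A y), so the value is
    the maximum over y of sum_i |(A y)_i|. Exponential in the dimension.
    """
    n_rows, n_cols = len(A), len(A[0])
    best = 0.0
    for y in product((-1.0, 1.0), repeat=n_cols):
        value = sum(abs(sum(A[i][j] * y[j] for j in range(n_cols)))
                    for i in range(n_rows))
        best = max(best, value)
    return best
```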

We denote the empirical measures of the approximation systems for each $n \in \mathbb{N}$ by

(2.7) \begin{equation} L_n \;:\!=\; \frac{1}{n} \sum_{i=1}^n \delta_{(X^n_i)}, \qquad \bar{L}_n \;:\!=\; \frac{1}{n} \sum_{i=1}^n \delta_{(\bar{X}^n_i)},\end{equation}

both of which are random elements of $\mathcal{P}(C([0, T] \;:\; \mathbb{R}^d))$ .

We conclude this subsection by recalling the relative entropy of two probability measures $\mu, \nu$ over the same measurable space:

(2.8) \begin{equation} H(\mu \vert \nu) \;:\!=\; \begin{cases} \int \log \big(\frac{d\mu}{d\nu} \big) d\mu \qquad \text{if } \mu \ll \nu, \\[5pt] \qquad \infty \qquad \qquad \; \text{otherwise}. \end{cases}\end{equation}
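As a small illustration of (2.8), the following sketch evaluates the relative entropy for two probability vectors on a common finite set, where $d\mu/d\nu$ reduces to a ratio of weights. The function name `relative_entropy` is hypothetical, and the finite setting is an assumption made only for this example.

```python
import math


def relative_entropy(mu, nu):
    """H(mu | nu) for probability vectors on the same finite set.

    Returns infinity when mu is not absolutely continuous with respect
    to nu, matching the second case of (2.8).
    """
    if len(mu) != len(nu):
        raise ValueError("mu and nu must have the same length")
    total = 0.0
    for m, v in zip(mu, nu):
        if m == 0:
            continue  # the convention 0 * log 0 = 0
        if v == 0:
            return math.inf  # mu charges a nu-null point
        total += m * math.log(m / v)
    return total
```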

2.2. Existence and uniqueness of the solutions

We state the existence and uniqueness of strong solutions to the systems (1.1)–(1.3).

Assumption 2.1

  (i) The function $\phi$ is bounded; furthermore, $\phi$ and $\psi$ are Lipschitz, i.e., there exists a constant $K > 0$ such that

    \begin{equation*} \big\vert \phi(x_1, y_1) - \phi(x_2, y_2) \big\vert + \big\vert \psi(x_1) - \psi(x_2) \big\vert \le K \big( \vert x_1-x_2 \vert + \vert y_1 - y_2 \vert \big) \end{equation*}
    holds. Moreover, the initial particles have finite second moments, i.e.,
    (2.9) \begin{equation} \sup_{u \in [0, 1]} \mathbb{E} \big\vert X_u(0) \big\vert^2 < \infty. \end{equation}
  (ii) The map $[0, 1] \ni u \mapsto \mu_u(0) = \mathcal{L}(X_u(0)) \in \mathcal{P}(\mathbb{R}^d)$ is measurable.

Lemma 2.1. (Existence and uniqueness of the particle systems)

  (i) Under Assumption 2.1(i), the two systems (1.2), (1.3) have unique strong solutions.

  (ii) Under Assumption 2.1(i)–(ii), the graphon system (1.1) has a unique strong solution, and the map $[0, 1] \ni u \mapsto \mu_u \in \mathcal{P}\big(C([0, T] \;:\; \mathbb{R}^d)\big)$ is measurable.

The proof of Lemma 2.1(i) is classical (see e.g. Theorem 5.2.9 of [Reference Karatzas and Shreve24]). Part (ii) follows from Proposition 2.1 of [Reference Bayraktar, Chakraborty and Wu2]. As pointed out in Remark 2.2 of [Reference Bayraktar, Chakraborty and Wu2], the boundedness condition on $\phi$ in Assumption 2.1(i) can be removed throughout this paper, at the cost of replacing (2.9) with the stronger condition $\sup_{u \in [0, 1]} \mathbb{E} \vert X_u(0) \vert^{2 + \epsilon} < \infty$ for some $\epsilon > 0$ . We occasionally need an even stronger condition on the initial particles, as in the following.

Assumption 2.2. The initial particles $\{X_{u}(0)\}_{u \in [0, 1]}$ are independent, with law $\mu_{u, 0} \in \mathcal{P}(\mathbb{R}^d)$ satisfying

(2.10) \begin{equation} \sup_{u \in [0, 1]} \int_{\mathbb{R}^d} e^{\kappa \vert x \vert^2} \mu_{u, 0} (dx) < \infty, \qquad \textit{for some } \kappa > 0. \end{equation}

Under this stronger assumption, we have, in particular, the finite fourth moment of the solution to (1.1). The proof is standard and hence is omitted (see e.g. [Reference Sznitman29] or Proposition 2.1 of [Reference Bayraktar and Wu4]).

Lemma 2.2. Under Assumptions 2.1, 2.2, the solution to (1.1) satisfies

\begin{equation*} \sup_{u \in [0, 1]} \sup_{t \in [0, T]} \mathbb{E} \big[ \big\vert X_{u}(t) \big\vert^4 \big] < \infty. \end{equation*}

2.3. Continuity of the graphon system

The following result, which states the continuity of the graphon system (1.1), is from Theorem 2.1 of [Reference Bayraktar, Chakraborty and Wu2].

Assumption 2.3. There exists a finite collection of subintervals $\{I_i \;:\; i \in [N]\}$ , for some $N \in \mathbb{N}$ , satisfying $\cup_{i=1}^N I_i = [0, 1]$ . For each $i, j \in [N]$ , the following hold:

  (i) The map $I_i \ni u \mapsto \mu_u(0) \in \mathcal{P}(\mathbb{R}^d)$ is continuous with respect to the $W_2$ metric.

  (ii) For each $u \in I_i$ , there exists a Lebesgue-null set $N_u \subset [0, 1]$ such that G(u, v) is continuous at $(u, v) \in [0, 1] \times [0, 1]$ for each $v \in [0, 1] \setminus N_u$ .

  (iii) There exists $K > 0$ such that

    \begin{align*} W_2(\mu_{u_1}(0), \mu_{u_2}(0)) &\le K \vert u_1 - u_2 \vert, \qquad \qquad \qquad \qquad u_1, u_2 \in [0, 1], \\[5pt] \big\vert G(u_1, v_1) - G(u_2, v_2) \big\vert &\le K \big( \vert u_1 - u_2 \vert + \vert v_1 - v_2 \vert \big), \qquad (u_1, v_1), \, (u_2, v_2) \in I_i \times I_j. \end{align*}

Lemma 2.3. Suppose that Assumption 2.1 holds.

  1. (i) (Continuity.) Under Assumption 2.3(i)–(ii), the map $I_i \ni u \mapsto \mu_u \in \mathcal{P}\big(C([0, T] \;:\; \mathbb{R}^d)\big)$ is continuous with respect to the $W_{2, T}$ metric for every $i \in [N]$ .

  2. (ii) (Lipschitz continuity.) Under Assumption 2.3(iii), there exists $\kappa > 0$ , which depends on T, such that $W_{2, T}(\mu_u, \mu_v) \le \kappa \vert u-v \vert$ whenever $u, v \in I_i$ for some $i \in [N]$ .

In Lemma 2.3(ii), note that we have, in particular,

\begin{equation*} \sup_{\vert\vert f \vert\vert_{Lip} \le 1} \bigg\vert \int_{\mathbb{R}^d} f(x) \mu_{u, t} (dx) - \int_{\mathbb{R}^d} f(x) \mu_{v, t}(dx) \bigg\vert \le W_{2, t}(\mu_u, \mu_v) \le \kappa \vert u-v \vert, \qquad \forall \, t \in [0, T].\end{equation*}
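The first inequality in this display can be checked directly from the definitions; the following short argument is a sketch added for convenience. Let $\pi$ be any coupling of $\mu_u$ and $\mu_v$ on path space, and let f be 1-Lipschitz on $\mathbb{R}^d$ . Since $\mu_{u, t}$ and $\mu_{v, t}$ are the time-t marginals of $\mu_u$ and $\mu_v$ ,

\begin{equation*} \bigg\vert \int_{\mathbb{R}^d} f \, d\mu_{u, t} - \int_{\mathbb{R}^d} f \, d\mu_{v, t} \bigg\vert = \bigg\vert \int \big( f(x_t) - f(y_t) \big) \, \pi(dx, \, dy) \bigg\vert \le \int \vert\vert x - y \vert\vert_{\star, t} \, \pi(dx, \, dy) \le \bigg( \int \vert\vert x - y \vert\vert^2_{\star, t} \, \pi(dx, \, dy) \bigg)^{1/2},\end{equation*}

where the last step is the Cauchy–Schwarz inequality. Taking the infimum over $\pi$ gives the bound by $W_{2, t}(\mu_u, \mu_v)$ , and the second inequality follows from Lemma 2.3(ii) together with $W_{2, t} \le W_{2, T}$ .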

2.4. A law of large numbers for the mean-field particle system

Besides the assumptions already introduced in this section, we will need the following assumption on the sparsity parameter for the system (1.2), as briefly mentioned in Section 1.

Assumption 2.4. The sequence $\{p(n)\}_{n \in \mathbb{N}}$ in (1.2) satisfies $np(n) \rightarrow \infty$ as $n \rightarrow \infty$ .

We introduce the following law-of-large-numbers result for the mean-field particle system (1.2), which is Theorem 4.1 of [Reference Bayraktar, Chakraborty and Wu2]. We write $\mu_u$ for the law of $X_u$ in the graphon particle system (1.1) for each $u \in [0, 1]$ , and define

(2.11) \begin{equation} \widetilde{\mu} \;:\!=\; \int_0^1 \mu_{u} \, du.\end{equation}

Lemma 2.4. Under Assumptions 2.1, 2.3, and 2.4,

\begin{equation*} L_n \rightarrow \widetilde{\mu} \quad \textit{in } \mathcal{P}\big(C([0, T] \;:\; \mathbb{R}^d)\big) \textit{ in probability, as } n \rightarrow \infty. \end{equation*}

Moreover, we have

\begin{equation*} \frac{1}{n} \sum_{i=1}^n \mathbb{E}\vert\vert X^n_i - X_{\frac{i}{n}} \vert\vert^2_{\star, T} \rightarrow 0, \qquad \textit{as } n \rightarrow \infty. \end{equation*}

2.5. Transportation inequalities

In this subsection, we present some preliminary results regarding transportation inequalities. The first result, from Theorem 9.1 of [Reference Üstünel32], illustrates the transportation inequality with the uniform norm for the laws of diffusion processes.

Lemma 2.5. For a fixed $T>0$ and $k \in \mathbb{N}$ , suppose that $X^x = \{X^x_t\}_{t \in [0, T]}$ is the unique strong solution of the stochastic differential equation (SDE)

(2.12) \begin{equation} dX^x_t = b(t, X^x)dt + \Sigma \, dW_t, \quad \forall \, t \in [0, T], \qquad X^x_0 = x \in \mathbb{R}^k, \end{equation}

on a probability space $C([0, T] \;:\; \mathbb{R}^k)$ supporting a k-dimensional Brownian motion W. Here, $b \;:\; [0, T] \times C([0, T] \;:\; \mathbb{R}^k) \rightarrow \mathbb{R}^k$ satisfies, for any $\xi, \eta \in C([0, T] \;:\; \mathbb{R}^k)$ ,

(2.13) \begin{equation} \big\vert b(t, \xi) - b(t, \eta) \big\vert \le L \sup_{0 \le s \le t} \big\vert \xi(s) - \eta(s) \big\vert = L \Vert \xi - \eta \Vert_{\star, t}, \qquad \forall \, t \in [0, T], \end{equation}

for some constants $L > 0$ and $\Sigma \in \mathbb{R}^{k \times k}$ . Let $P^x \in \mathcal{P}(C([0, T] \;:\; \mathbb{R}^k))$ be the law of $X^x$ for any $x \in \mathbb{R}^k$ . Then for any $Q \in \mathcal{P}(C([0, T] \;:\; \mathbb{R}^k))$ , there exist positive constants $\kappa_1, \kappa_2$ , depending only on T, such that the inequalities

\begin{equation*} W^2_{1, (C([0, T] : \mathbb{R}^k), \, \vert\vert\cdot\vert\vert_{k, 2})}(P^x, Q) \le W^2_{2, (C([0, T] : \mathbb{R}^k), \, \vert\vert\cdot\vert\vert_{k, 2})}(P^x, Q) \le \kappa_1 e^{\kappa_2 L^2} H(Q \vert P^x) \end{equation*}

hold, where $H(Q\vert P)$ is the relative entropy of Q with respect to P, defined in (2.8).

The following result (Theorem 5.1 of [Reference Delarue, Lacker and Ramanan16]) characterizes the concentration of a probability measure with a transportation cost inequality and Gaussian integrability property. The equivalence between (2.14) and (2.15) is originally from Theorem 3.1 of [Reference Bobkov and Götze8], and the equivalence between (2.14) and (2.17) is due to Theorem 2.3 of [Reference Djellout, Guillin and Wu18].

Lemma 2.6. For a probability measure $\mu \in \mathcal{P}^1(S)$ on a separable Banach space $(S, \vert\vert\cdot\vert\vert)$ , the following statements are equivalent up to a universal change in the positive constant c:

  (i) The transportation cost inequality

    (2.14) \begin{equation} W_{1, S}(\mu, \nu) \le \sqrt{2c H(\nu \vert \mu)} \end{equation}
    holds for every $\nu \in \mathcal{P}(S)$ .
  (ii) For every 1-Lipschitz function f on S and $\lambda \in \mathbb{R}$ ,

    (2.15) \begin{equation} \int_S e^{\lambda(f - \langle \mu, f \rangle )}d\mu \le \exp\!\Big(\frac{c\lambda^2}{2}\Big) \end{equation}
    holds.
  (iii) For every 1-Lipschitz function f on S and $a > 0$ ,

    (2.16) \begin{equation} \mu \big(f - \langle \mu, f \rangle > a\big) \le \exp\!\Big(\!-\! \frac{a^2}{2c}\Big). \end{equation}
  (iv) The probability measure $\mu$ is sub-Gaussian, i.e.,

    (2.17) \begin{equation} \int_S e^{c\vert\vert x \vert\vert^2} \mu(dx) < \infty. \end{equation}
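The implication from (2.15) to (2.16) is the usual Chernoff argument; the following derivation is a standard sketch included for convenience, not a reproduction of the cited proofs. For a 1-Lipschitz function f, $a > 0$ , and $\lambda > 0$ , Markov's inequality and (2.15) give

\begin{equation*} \mu \big(f - \langle \mu, f \rangle > a\big) \le e^{-\lambda a} \int_S e^{\lambda(f - \langle \mu, f \rangle )}d\mu \le \exp\!\Big(\!-\! \lambda a + \frac{c\lambda^2}{2}\Big),\end{equation*}

and the choice $\lambda = a/c$ , which minimizes the exponent, yields the right-hand side $\exp\!(\!-\! a^2/(2c))$ of (2.16).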

The next result is a well-known tensorization of transportation cost inequalities from Corollary 5 of [Reference Gozlan and Léonard22]. The major difference between (i) and (ii) is that the inequality (2.18) is dimension-free, i.e., the right-hand side does not depend on n.

Lemma 2.7. For each $n \in \mathbb{N}$ , consider a set of probability measures $\{\mu_i\}_{i \in [n]} \subset \mathcal{P}(S)$ on a separable Banach space $(S, \Vert\cdot\Vert)$ .

  (i) If the inequality $W_{1, S}(\mu_i, \nu) \le \sqrt{2c H(\nu \vert \mu_i)}$ holds for every $i \in [n]$ and $\nu \in \mathcal{P}^1(S)$ , then

    \begin{equation*} W_{1, (S^n, \, \vert\vert\cdot\vert\vert_{n, 1})}(\mu_1 \otimes \cdots \otimes \mu_n, \, \rho) \le \sqrt{2nc H(\rho \vert \mu_1 \otimes \cdots \otimes \mu_n)} \end{equation*}
    holds for every $\rho \in \mathcal{P}^1(S^n)$ .
  (ii) If the inequality $W_{2, S}(\mu_i, \nu) \le \sqrt{2c H(\nu \vert \mu_i)}$ holds for every $i \in [n]$ and $\nu \in \mathcal{P}^2(S)$ , then

    (2.18) \begin{equation} W_{2, (S^n, \, \vert\vert\cdot\vert\vert_{n, 2})}(\mu_1 \otimes \cdots \otimes \mu_n, \, \rho) \le \sqrt{2c H(\rho \vert \mu_1 \otimes \cdots \otimes \mu_n)} \end{equation}
    holds for every $\rho \in \mathcal{P}^2(S^n)$ .

We finally mention the following result on the Wasserstein distance of the empirical measures of independent but not necessarily identically distributed random variables. This is Lemma A.1 of [Reference Bayraktar and Wu4], a generalization of Theorem 1 of [Reference Fournier and Guillin19], where independent and identically distributed random variables are considered. This result will be used in proving Proposition 3.2.

Lemma 2.8. Let $\{Y_i\}_{i \in \mathbb{N}}$ be independent $\mathbb{R}^d$ -valued random variables, and define

\begin{equation*} \nu_n \;:\!=\; \frac{1}{n} \sum_{i=1}^n \delta_{Y_i}, \qquad \bar{\nu}_n \;:\!=\; \frac{1}{n} \sum_{i=1}^n \mathcal{L}(Y_i). \end{equation*}

For a fixed $p > 0$ , assume that $\sup_{i \in \mathbb{N}} \mathbb{E} \vert Y_i \vert^q < \infty$ holds for some $q > p$ . Then there exists a constant $K > 0$ depending only on p, q, and d such that for every $n \ge 1$ ,

\begin{equation*} \mathbb{E} \big[ W^p_p(\nu_n, \bar{\nu}_n) \big] \le K \alpha_{p, q}(n) \bigg( \int_{\mathbb{R}^d} \vert x \vert^q \, \bar{\nu}_n (dx) \bigg)^{p/q}, \end{equation*}

where

\begin{equation*} \alpha_{p, q}(n) \;:\!=\; \begin{cases} n^{-1/2} + n^{-(q-p)/q} \qquad &\textit{if } p > d/2 \textit { and } q \neq 2p, \\[5pt] n^{-1/2} \log(1+n) + n^{-(q-p)/q} \qquad &\textit{if } p = d/2 \textit { and } q \neq 2p, \\[5pt] n^{-p/d} + n^{-(q-p)/q} \qquad &\textit{if } p < d/2 \textit { and } q \neq d/(d-p). \end{cases} \end{equation*}
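To read off the rate in Lemma 2.8 for concrete parameters, the case distinction for $\alpha_{p, q}(n)$ can be transcribed as follows. This is a direct transcription of the display above and nothing more; the function name `alpha` and the choice to raise an error on the excluded boundary values of q are assumptions of this sketch.

```python
import math


def alpha(p, q, d, n):
    """Rate alpha_{p,q}(n) from Lemma 2.8, transcribed case by case.

    Raises ValueError for parameter combinations that the lemma excludes.
    """
    if not (q > p > 0):
        raise ValueError("need q > p > 0")
    tail = n ** (-(q - p) / q)
    if 2 * p > d:
        if q == 2 * p:
            raise ValueError("q = 2p is excluded when p > d/2")
        return n ** -0.5 + tail
    if 2 * p == d:
        if q == 2 * p:
            raise ValueError("q = 2p is excluded when p = d/2")
        return n ** -0.5 * math.log(1 + n) + tail
    if q == d / (d - p):
        raise ValueError("q = d/(d-p) is excluded when p < d/2")
    return n ** (-p / d) + tail
```

For instance, with $p = 2$ , $q = 4$ , and $d = 8$ the third case applies and `alpha(2, 4, 8, 16)` evaluates $16^{-1/4} + 16^{-1/2}$ .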

2.6. Bernstein’s inequality and the $\boldsymbol{L}^1$ -Fourier class

When comparing the two approximation systems (1.2) and (1.3), it is essential to control the matrix $D^{(n)}$ of (2.5). Thus, we introduce the following concentration of $D^{(n)}$ in terms of the $\Vert \cdot \Vert_{\infty \rightarrow 1}$ norm, which is from Lemma 2 of [Reference Oliveira and Reis27]. Its proof is a straightforward application of Bernstein’s inequality (Lemma 2.10, or Bennett’s inequality) with the distribution of the $n^2$ independent entries of the matrix $D^{(n)}$ . We will use Bernstein’s inequality again in Section 3.3 to prove Lemma 3.1, an elaboration of Lemma 2.9.

Lemma 2.9. For any $0 < \eta \le n$ , we have

\begin{equation*} \mathbb{P} \bigg[ \frac{\vert\vert D^{(n)} \vert\vert_{\infty \rightarrow 1}}{n} > \eta \bigg] \le \exp\!\bigg(\!-\! \frac{\eta^2 n^2p(n)}{2+\frac{\eta}{3}} \bigg). \end{equation*}

In particular, under Assumption 2.4, we have for every $\eta > 0$

\begin{equation*} \frac{1}{n} \log \mathbb{P} \bigg[ \frac{\vert\vert D^{(n)} \vert\vert_{\infty \rightarrow 1}}{n} > \eta \bigg] \longrightarrow -\infty, \qquad \textit { as } n \rightarrow \infty. \end{equation*}
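The second display follows from the first by a short computation, sketched here under Assumption 2.4. Fix $\eta > 0$ ; for every $n \ge \eta$ the first bound applies, and taking logarithms and dividing by n gives

\begin{equation*} \frac{1}{n} \log \mathbb{P} \bigg[ \frac{\vert\vert D^{(n)} \vert\vert_{\infty \rightarrow 1}}{n} > \eta \bigg] \le - \frac{\eta^2 \, n p(n)}{2+\frac{\eta}{3}},\end{equation*}

whose right-hand side tends to $-\infty$ because $np(n) \rightarrow \infty$ .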

Lemma 2.10. (Bernstein’s inequality, Theorem 2.9 of [Reference Boucheron, Lugosi and Massart9].) Let $X_1, \cdots, X_k$ be independent random variables with finite variance such that $X_i \le b$ for some $b > 0$ almost surely for each $i \in [k]$ . Let $v = \sum_{i=1}^k \mathbb{E}[X_i^2]$ ; then we have

\begin{equation*} \mathbb{P} \Bigg[ \sum_{i=1}^k \big( X_i - \mathbb{E}(X_i) \big) \ge u \Bigg] \le \exp\!\bigg(\!-\! \frac{u^2}{2(v + \frac{bu}{3})} \bigg). \end{equation*}
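As a numerical illustration of Lemma 2.10, the sketch below evaluates the right-hand side of Bernstein's inequality and compares it with a Monte Carlo estimate of the tail for a sum of independent Bernoulli(1/2) variables. The function names `bernstein_bound` and `empirical_tail`, the Bernoulli example, and the sample sizes are all assumptions made for illustration.

```python
import math
import random


def bernstein_bound(u, v, b):
    """Right-hand side of Bernstein's inequality in Lemma 2.10."""
    return math.exp(-u ** 2 / (2.0 * (v + b * u / 3.0)))


def empirical_tail(k, u, trials=2000, seed=0):
    """Monte Carlo estimate of P[sum_i (X_i - E X_i) >= u].

    Here X_1, ..., X_k are independent Bernoulli(1/2), an illustrative
    choice satisfying X_i <= b with b = 1.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = sum(1 if rng.random() < 0.5 else 0 for _ in range(k))
        if s - 0.5 * k >= u:
            hits += 1
    return hits / trials
```

For $k = 100$ Bernoulli(1/2) variables one has $v = 50$ and $b = 1$ , so `bernstein_bound(10, 50, 1)` can be compared with `empirical_tail(100, 10)`; the bound is valid but, as expected for an exponential inequality, not tight.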

When the interaction function $\phi$ belongs to a special class of functions, we shall see in the proof of Theorem 3.4 that the distance $W_1(L_n, \bar{L}_n)$ can easily be expressed in terms of the quantity $\vert\vert D^{(n)} \vert\vert_{\infty \rightarrow 1}$ . This observation is inspired by the work of [Reference Oliveira and Reis27]. To state it more precisely, we introduce the notion of the $L^1$ -Fourier class of functions.

Definition 2.1. Identifying $\mathbb{R}^{2d}$ with $\mathbb{R}^d \times \mathbb{R}^d$ , we say that a function $f \;:\; \mathbb{R}^d \times \mathbb{R}^d \rightarrow \mathbb{R}$ belongs to the $L^1$ -Fourier class if there exists a finite complex measure $m_{f}$ over $\mathbb{R}^{2d}$ such that for every $(x, y) \in \mathbb{R}^d \times \mathbb{R}^d$ ,

\begin{equation*} f(x, y) = \int_{\mathbb{R}^{2d}} \exp\!\Big(2\pi \sqrt{-1} \big\langle (x, y), \, z \big\rangle \Big) m_f(dz). \end{equation*}

We recall that a finite complex measure m over $\mathbb{R}^{2d}$ is a set function $m \;:\; \mathcal{B}(\mathbb{R}^{2d}) \rightarrow \mathbb{C}$ of the form $m = m_r^+ - m_r^- + \sqrt{-1}(m_i^+ - m_i^-)$ , where each of $m_r^+, m_r^-, m_i^+, m_i^-$ is a finite, $\sigma$ -additive (nonnegative) measure over $\mathbb{R}^{2d}$ . We define the total mass of m by

\begin{equation*} \vert\vert m \vert\vert_{TM} \;:\!=\; m_r^+(\mathbb{R}^{2d}) + m_r^-(\mathbb{R}^{2d}) + m_i^+(\mathbb{R}^{2d}) + m_i^-(\mathbb{R}^{2d}).\end{equation*}

If a function f is the inverse Fourier transform of a function in $L^1(\mathbb{R}^{2d})$ , then f belongs to the $L^1$ -Fourier class. In particular, any Schwartz function belongs to the $L^1$ -Fourier class. Another example of a function in the $L^1$ -Fourier class is the Kuramoto interaction: if $d=1$ and $\phi(x, y) = K \sin(y-x)$ for some constant K, then the corresponding complex measure is equal to

\begin{equation*} m_{\phi} = \frac{K}{2\sqrt{-1}} \big( \delta_{(\!-\! \frac{1}{2\pi}, \frac{1}{2\pi})} - \delta_{(\frac{1}{2\pi}, -\frac{1}{2\pi})} \big).\end{equation*}

The finite system (1.2) of ‘oscillators’ with the Kuramoto interaction function is studied in [Reference Coppini13].
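The Kuramoto example can be checked numerically against Definition 2.1. The sketch below evaluates the Fourier representation for a purely atomic complex measure and uses the identity $\sin\theta = (e^{\sqrt{-1}\theta} - e^{-\sqrt{-1}\theta})/(2\sqrt{-1})$ . With the factor $2\pi$ in the exponent of Definition 2.1, this places the two atoms at $\pm(\!-\!1, 1)/(2\pi)$ with weights of opposite sign. The helper names are hypothetical, and the total mass is computed under the assumption that the minimal (Jordan) decomposition of the real and imaginary parts is used.

```python
import cmath
import math

def fourier_class_value(x, y, atoms):
    """Evaluate f(x, y) = sum_z weight(z) * exp(2 pi i <(x, y), z>) for a
    purely atomic complex measure given as {(z1, z2): weight}."""
    return sum(
        weight * cmath.exp(2j * math.pi * (x * z1 + y * z2))
        for (z1, z2), weight in atoms.items()
    )

def kuramoto_atoms(coupling):
    """Two-atom measure whose transform is coupling * sin(y - x) under the
    2 pi normalization above (atoms at +/- (-1, 1) / (2 pi))."""
    c = 1.0 / (2.0 * math.pi)
    return {(-c, c): coupling / 2j, (c, -c): -coupling / 2j}

def total_mass(atoms):
    """Sum of |Re w| + |Im w| over distinct atoms, i.e. the four-term total
    mass when the minimal decomposition of each part is assumed."""
    return sum(abs(w.real) + abs(w.imag) for w in atoms.values())
```

For instance, `fourier_class_value(0.3, 1.2, kuramoto_atoms(1.7))` agrees with `1.7 * math.sin(0.9)` up to rounding, and the total mass of the measure equals the coupling constant when it is positive.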

3. Main results

This section consists of three subsections. The first shows that expectations of the $W_2$ -distances between two empirical measures on $\mathbb{R}^d$ related to the systems (1.1)–(1.3) converge to zero as the number of particles goes to infinity. The second subsection gives exponential bounds on the probabilities that Lipschitz function values of the particles $\bar{X}^n$ of the system (1.3) on $\big(C([0, T] \;:\; \mathbb{R}^d)\big)^n$ deviate from their means; the stronger the norm we use for the space $\big(C([0, T] \;:\; \mathbb{R}^d)\big)^n$ , the stronger the assumption we need on the initial distribution of the particles. The results in the first two subsections will be used in proving the results in the last subsection. In the latter, we derive several results on the concentration of the finite particle systems (1.2), (1.3) toward the graphon particle system (1.1), under different metrics.

3.1. Concentration in mean of the $\boldsymbol{W}_2$ -distance

Let us recall the law $\mu_{u, t}$ of (1.1), the empirical measures (1.4) of the three systems, and the averaged law $\widetilde{\mu}_t \;:\!=\; \int_0^1 \mu_{u, t} \, du$ for every $t \in [0, T]$ . We give two expectations converging to zero as $n \rightarrow \infty$ in the following. The proofs are provided in Section 4.1.

Proposition 3.1. Under Assumptions 2.1 and 2.4,

\begin{equation*} \mathbb{E}\bigg[\sup_{0 \le t \le T} W_{2} (L_{n, t}, \bar{L}_{n, t}) \bigg] \longrightarrow 0, \end{equation*}

as $n\rightarrow \infty$ .

Proposition 3.2. Under Assumptions 2.1, 2.2, and 2.3(iii),

\begin{equation*} \mathbb{E}\bigg[\sup_{0 \le t \le T} W_{2} (\widetilde{L}_{n, t}, \, \widetilde{\mu}_t) \bigg] \longrightarrow 0, \end{equation*}

as $n\rightarrow \infty$ .

By virtue of Lemma 2.4, we have

\begin{align*} \mathbb{E} \bigg[\sup_{0 \le t \le T} W^2_{2} (L_{n, t}, \widetilde{L}_{n, t}) \bigg] &\le \mathbb{E} \Bigg[\sup_{0 \le t \le T} \frac{1}{n} \sum_{i=1}^n \big\vert X_{\frac{i}{n}}(t) - X^n_i(t) \big\vert^2 \Bigg] \\[5pt] &\le \mathbb{E} \Bigg[ \frac{1}{n} \sum_{i=1}^n \big\Vert X_{\frac{i}{n}} - X^n_i \big\Vert_{\star, T}^2 \Bigg] \longrightarrow 0, \qquad \text{as } n \rightarrow \infty.\end{align*}

Combining the last convergence with Propositions 3.1 and 3.2, we immediately have other convergences of the expectations.

Corollary 3.1. Under the assumptions of Propositions 3.1 and 3.2,

\begin{equation*} \mathbb{E}\bigg[\sup_{0 \le t \le T} W_{2} (L_{n, t}, \widetilde{\mu}_t) \bigg] \longrightarrow 0, \qquad \mathbb{E}\bigg[\sup_{0 \le t \le T} W_{2} (\bar{L}_{n, t}, \widetilde{\mu}_t) \bigg] \longrightarrow 0, \end{equation*}

as $n\rightarrow \infty$ .
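The display preceding Corollary 3.1 starts from the fact that $W_2^2$ between two n-atom empirical measures is at most the average squared distance under the coupling that pairs the i-th atoms. A minimal one-dimensional sketch of this fact follows; it relies on the monotone (sorted) coupling being optimal on the real line, and the Gaussian samples are hypothetical stand-ins for the particle positions at a fixed time.

```python
import random

def w2_squared_empirical_1d(xs, ys):
    """Squared 2-Wasserstein distance between two empirical measures on
    the real line with the same number of atoms (monotone coupling)."""
    assert len(xs) == len(ys)
    return sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def index_coupling_cost(xs, ys):
    """Average squared distance under the coupling that pairs x_i with y_i."""
    return sum((a - b) ** 2 for a, b in zip(xs, ys)) / len(xs)

rng = random.Random(0)  # fixed seed, purely for reproducibility
xs = [rng.gauss(0.0, 1.0) for _ in range(500)]
ys = [x + rng.gauss(0.0, 0.3) for x in xs]  # hypothetical perturbed particles
lhs, rhs = w2_squared_empirical_1d(xs, ys), index_coupling_cost(xs, ys)
```

Here `lhs` never exceeds `rhs`, since the index pairing is one admissible coupling among many.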

3.2. Concentration around the mean

We present in this subsection the concentration of a 1-Lipschitz function of the particles $\bar{X}^n$ around its mean, under two different norms $\ell^1$ and $\ell^2$ . The proofs of the results rely on the transportation inequalities presented in Section 2.5, and they will be given in Section 4.2.

From Lemma 2.6, we note that the condition (2.10) of Assumption 2.2 in Theorem 3.1 below is equivalent to the condition

(3.1) \begin{equation} W_{1} (\mu_{u, 0}, \nu) \le \sqrt{2 \kappa H(\nu \vert \mu_{u, 0})}, \qquad \text{for every } u \in [0, 1], \;\;\; \nu \in \mathcal{P}^1(S).\end{equation}

Theorem 3.1. Under Assumptions 2.1 and 2.2, there exists a constant $\delta > 0$ , independent of n, such that for every $F \in Lip\Big(\big(C([0, T] \;:\; \mathbb{R}^d)\big)^n, \vert\vert\cdot\vert\vert_{n, 1}\Big)$ and every $a > 0$ ,

(3.2) \begin{equation} \mathbb{P} \Big[ F(\bar{X}^n) - \mathbb{E}\big(F(\bar{X}^n)\big) > a \Big] \le 2 \exp\!\bigg(\!-\!\frac{\delta a^2}{n} \bigg) \end{equation}

holds.

We have the following result analogous to Theorem 3.1 when the condition (3.1) is replaced by (3.3). For example, if the initial law for any $u \in [0, 1]$ takes the same form $\mu_{u, 0}(dx) = e^{-U(x)} dx$ for some $U \in C^2(\mathbb{R}^d)$ with Hessian bounded below in semidefinite order by cI for some $c>0$ , then $\mu_{u, 0}$ satisfies the condition (3.3) with $\kappa = 1/c$ . In particular, if $\mu_{u, 0}$ has the standard normal distribution on $\mathbb{R}^d$ , then (3.3) holds with $\kappa = 1$ (however, the initial law may not necessarily be the same for every $u \in [0, 1]$ in the statement of Theorem 3.2).

We note here that the positive constant $\delta$ which appears in Theorems 3.1–3.6 of this section depends only on the constants c of Lemmas 2.6 and 2.7, $\kappa$ of (3.1) and (3.3), the functions $\phi, \psi$ , and T, but not on n, the number of particles. We also emphasize that the concentration inequality (3.4) is dimension-free; the bound on the right-hand side does not depend on n. This property will play an essential role in deriving the exponential concentration of the empirical measures in terms of $W_2$ -distance.

Theorem 3.2. Suppose that the initial particles $\{X_u(0)\}_{u \in [0, 1]}$ are independent, with law $\mu_{u, 0} \in \mathcal{P}(\mathbb{R}^d)$ satisfying, for some $\kappa > 0$ ,

(3.3) \begin{equation} W_{2} (\mu_{u, 0}, \nu) \le \sqrt{2 \kappa H(\nu \vert \mu_{u, 0})}, \qquad \textit{for every } u \in [0, 1], \;\;\; \nu \in \mathcal{P}^2(S). \end{equation}

Under Assumption 2.1, there exists a constant $\delta > 0$ , independent of n, such that for every $F \in Lip\Big(\big(C([0, T] \;:\; \mathbb{R}^d)\big)^n, \vert\vert\cdot\vert\vert_{n, 2}\Big)$ and every $a > 0$ ,

(3.4) \begin{equation} \mathbb{P} \Big[ F(\bar{X}^n) - \mathbb{E}\big(F(\bar{X}^n)\big) > a \Big] \le 2 \exp\!(\!-\!\delta a^2) \end{equation}

holds.

3.3. Concentration toward the graphon system

Recalling the notation in (1.4), we now provide the concentration, in terms of the (1- and 2-) Wasserstein distance, of the empirical measures of the finite particle systems toward the averaged measure $\widetilde{\mu}_t$ of the graphon system. Proofs will be given in Section 4.3.

First, we have the following result on the concentration of $\bar{L}_{n, t}$ toward $\widetilde{\mu}_t$ in terms of the $W_1$ -distance, due to Theorem 3.1.

Theorem 3.3. Under Assumptions 2.1, 2.2, and 2.3(iii), there exist constants $\delta > 0$ , which is independent of n, and $N \in \mathbb{N}$ such that

(3.5) \begin{equation} \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_{1} (\bar{L}_{n, t}, \widetilde{\mu}_t) > a \bigg] \le 2 \exp\!\bigg(\!-\! \frac{\delta a^2n}{4} \bigg) \end{equation}

holds for every $a > 0$ and every $n \ge N$ .

Remark 3.1. Theorem 3.3 gives the same exponential bound as in Theorem 2.1 of [Reference Bayraktar and Wu4]. The proof in [Reference Bayraktar and Wu4] mainly focuses on computing certain sub-Gaussian estimates, whereas our argument relies on the concentration property (3.2) of the system (1.3). Applying the same argument, we can even deduce the exponential bound in terms of the $W_2$ metric (in Theorem 3.5 below).

In Section 2.6, we introduced the concept of the $L^1$ -Fourier class, along with Bernstein’s inequality, to express $W_1(L_n, \bar{L}_n)$ in terms of $\Vert D^{(n)} \Vert_{\infty \rightarrow 1}$ . This gives rise to the following result on the concentration of the particle system (1.2) toward the graphon system.

Theorem 3.4. Suppose that the components of the interaction function $\phi$ belong to the $L^1$ -Fourier class (Definition 2.1). Under Assumptions 2.1, 2.2, 2.3(iii), and 2.4, there exist constants $\delta > 0$ , which is independent of n, and $N \in \mathbb{N}$ such that

(3.6) \begin{equation} \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_{1} (L_{n, t}, \widetilde{\mu}_t) > a \bigg] \le 3 \exp\!\bigg(\!-\! \frac{\delta a^2n}{16} \bigg) \end{equation}

holds for every $a > 0$ and every $n \ge N$ . For general interaction functions $\phi$ (which do not necessarily belong to the $L^1$ -Fourier class), we have instead

(3.7) \begin{equation} \mathbb{P} \bigg[ \sup_{0 \le t \le T} d_{BL} (L_{n, t}, \widetilde{\mu}_t) > a \bigg] \le 3 \exp\!\bigg(\!-\! \frac{\delta a^2n}{16} \bigg). \end{equation}

The following result gives the concentration of $\bar{L}_{n, t}$ toward $\widetilde{\mu}_t$ as in Theorem 3.3, but in terms of the $W_2$ metric. Its proof is similar to that of Theorem 3.3, but Theorem 3.2 is used in place of Theorem 3.1.

Theorem 3.5. Under Assumptions 2.1, 2.2, and 2.3, together with the condition (3.3), there exist constants $\delta > 0$ , independent of n, and $N \in \mathbb{N}$ such that

(3.8) \begin{equation} \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_2 (\bar{L}_{n, t}, \widetilde{\mu}_t) > a \bigg] \le 2 \exp\!\bigg(\!-\! \frac{\delta a^2n}{4} \bigg) \end{equation}

holds for every $a > 0$ and every $n \ge N$ .

Since we have the exponential bound in (3.8) in the $W_2$ metric, one naturally expects to obtain a bound similar to (3.6) in the $W_2$ metric as well. In order to do this, we need to find the exponential bound for the probability $\mathbb{P} \big[ \sup_{0 \le t \le T} W_2 (L_{n, t}, \bar{L}_{n, t}) > a \big]$ , which requires us to handle the quantity $\Vert (D^{(n)})^\top D^{(n)} \Vert_{\infty \rightarrow 1}$ , instead of $\Vert D^{(n)} \Vert_{\infty \rightarrow 1}$ as in the proof of Theorem 3.4. The control of this quantity is achieved in Lemma 3.1 under an extra condition on the sparsity parameter p(n), a more restrictive condition than the one in Assumption 2.4.

Assumption 3.1. The sparsity parameter sequence $\{p(n)\}_{n \in \mathbb{N}} \subset (0, 1]$ of the system (1.2) satisfies one of the following:

  (i) $p(n) \rightarrow 0$ and $np(n)^2 \rightarrow \infty$ as $n \rightarrow \infty$ , or

  (ii) $p(n) \equiv 1$ for every $n \in \mathbb{N}$ .

Recalling the notation of (2.5), we state the following lemma, which is needed in proving Theorem 3.6. Its proof, given in Section 4.3, is similar to that of Lemma 2.9, but requires more involved applications of Bernstein’s inequality.

Lemma 3.1. Under Assumption 3.1, there exists $N \in \mathbb{N}$ such that

\begin{equation*} \mathbb{P} \bigg[ \frac{\Vert (D^{(n)})^\top D^{(n)} \Vert_{\infty \rightarrow 1}}{n} > \eta \bigg] \le 3n^2 \exp\!\bigg(\!-\! \frac{2\eta^2 np(n)^4}{9+4\eta} \bigg) \end{equation*}

holds for every $n \ge N$ and $\eta > 0$ .

Theorem 3.6. Suppose that the components of the interaction function $\phi$ belong to the $L^1$ -Fourier class. Under Assumptions 2.1, 2.2, 2.3, and 3.1(i), together with the condition (3.3), there exist constants $K > 0$ , which is independent of n, and $N \in \mathbb{N}$ such that

(3.9) \begin{equation} \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_{2} (L_{n, t}, \widetilde{\mu}_t) > a \bigg] \le 4n^2 \exp\!\bigg(\!-\! \frac{a^4 np(n)^4}{72K^2+8a^2K} \bigg) \end{equation}

holds for every $a > 0$ and every $n \ge N$ .

Furthermore, if Assumption 3.1(i) is replaced by Assumption 3.1(ii), we have the exponential bound in n: there exist constants $\delta > 0$ , which is independent of n, and $N \in \mathbb{N}$ such that

(3.10) \begin{equation} \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_{2} (L_{n, t}, \widetilde{\mu}_t) > a \bigg] \le \exp\!\bigg(\!-\! \frac{\delta a^4 n}{a^2+ \delta} \bigg) \end{equation}

holds for every $a > 0$ and every $n \ge N$ .
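The proof of Theorem 3.6 is not reproduced at this point, so the following is only an algebraic observation: the exponent in (3.9) coincides with the exponent of Lemma 3.1 evaluated at $\eta = a^2/(4K)$ , which suggests how the lemma enters the bound. The sketch checks this identity in exact rational arithmetic; the numerical values of a, K, n, and p are arbitrary.

```python
from fractions import Fraction

def lemma_3_1_exponent(eta, n, p):
    """Exponent 2 eta^2 n p^4 / (9 + 4 eta) in Lemma 3.1."""
    return 2 * eta ** 2 * n * p ** 4 / (9 + 4 * eta)

def theorem_3_6_exponent(a, n, p, big_k):
    """Exponent a^4 n p^4 / (72 K^2 + 8 a^2 K) in (3.9)."""
    return a ** 4 * n * p ** 4 / (72 * big_k ** 2 + 8 * a ** 2 * big_k)

# Exact rational arithmetic; the values of a, K, n, p are arbitrary.
a, big_k, n, p = Fraction(3, 7), Fraction(5, 2), 1000, Fraction(1, 3)
same = lemma_3_1_exponent(a ** 2 / (4 * big_k), n, p) == theorem_3_6_exponent(a, n, p, big_k)
```

The prefactor $4n^2$ in (3.9) is then at least the prefactor $3n^2$ of the lemma, leaving room for the additional terms of the proof.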

4. Proofs

In this section we provide the proofs of the results stated in Section 3.

4.1. Proofs of results in Section 3.1

4.1.1. Proof of Proposition 3.1

Let us recall the identity (4.7), along with the notation in (2.5). By Hölder’s inequality there exists $K > 0$ , depending on $\phi$ and $\psi$ , such that for every $t \in [0, T]$ ,

\begin{align*} \sup_{0 \le t \le T} \big\vert X^n_i(t) - \bar{X}^n_i(t) \big\vert^2 &\le KT \int_0^T \Big\vert \sum_{j=1}^n D^{(n)}_{i, j} \phi\big( X^n_i(s), X^n_j(s) \big) \Big\vert^2 \, ds \\[5pt] & + KT \int_0^T \Big\vert \sum_{j=1}^n \bar{P}^{(n)}_{i, j} \Big ( \phi\big(X^n_i(s), X^n_j(s) \big) - \phi\big(\bar{X}^n_i(s), \bar{X}^n_j(s) \big) \Big) \Big\vert^2 \, ds \\[5pt] &+ KT \int_0^T \big\vert X^n_i(s) - \bar{X}^n_i(s) \big\vert^2 \, ds.\end{align*}

Taking the expectation of the first term, and using the independence of $\{D^{(n)}_{i, j}\}_{j \in [n]}$ and the boundedness of $\phi$ , we have

\begin{align*} KT \, \mathbb{E} \int_0^T \Big\vert \sum_{j=1}^n D^{(n)}_{i, j} \phi\big( X^n_i(s), X^n_j(s) \big) \Big\vert^2 \, ds \le KT \int_0^T \sum_{j=1}^n \mathbb{E}\Big[(D^{(n)}_{i, j})^2\Big] \, ds \le \frac{KT^2}{np(n)}.\end{align*}

For the second term, Hölder’s inequality and the Lipschitz continuity of $\phi$ give

\begin{align*} KT \, \mathbb{E} \int_0^T \Big\vert \sum_{j=1}^n \bar{P}^{(n)}_{i, j} &\Big ( \phi\big(X^n_i(s), X^n_j(s) \big) - \phi\big(\bar{X}^n_i(s), \bar{X}^n_j(s) \big) \Big) \Big\vert^2 \, ds \\[5pt] &\le KT \, \mathbb{E} \int_0^T \frac{1}{n}\sum_{j=1}^n \big( \vert X^n_i(s) - \bar{X}^n_i(s) \vert^2 + \vert X^n_j(s) - \bar{X}^n_j(s) \vert^2 \big) \, ds.\end{align*}

Combining the above inequalities and averaging over $i \in [n]$ , we obtain

\begin{align*} \frac{1}{n} \sum_{i=1}^n \mathbb{E} \bigg[ \sup_{0 \le t \le T} \vert X^n_i(t) - &\bar{X}^n_i(t) \vert^2 \bigg] \\[5pt] &\le \frac{KT^2}{np(n)} + KT\int_0^T \frac{1}{n} \sum_{i=1}^n \mathbb{E} \bigg[ \sup_{0 \le u \le s}\vert X^n_i(u) - \bar{X}^n_i(u) \vert^2 \bigg] \, ds.\end{align*}

Grönwall’s inequality yields

\begin{equation*} \frac{1}{n} \sum_{i=1}^n \mathbb{E} \bigg[ \sup_{0 \le t \le T} \vert X^n_i(t) - \bar{X}^n_i(t) \vert^2 \bigg] \le \frac{KT^2 \exp(KT^2)}{np(n)},\end{equation*}

and thus

\begin{align*} \mathbb{E}\bigg[\sup_{0 \le t \le T} & W^2_{2} (L_{n, t}, \bar{L}_{n, t}) \bigg] \le \mathbb{E} \bigg[ \sup_{0 \le t \le T} \frac{1}{n}\sum_{i=1}^n \vert X^n_i(t) - \bar{X}^n_i(t) \vert^2 \bigg] \\[5pt] &\le \frac{1}{n} \sum_{i=1}^n \mathbb{E} \bigg[ \sup_{0 \le t \le T} \vert X^n_i(t) - \bar{X}^n_i(t) \vert^2 \bigg] \le \frac{KT^2 \exp\!(KT^2)}{np(n)} \longrightarrow 0, \text{ as } n \rightarrow \infty.\end{align*}
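The Grönwall step above can be illustrated numerically. The sketch integrates the equality case $f(t) = A + KT \int_0^t f(s) \, ds$ with $A = KT^2/(np(n))$ by an explicit Euler scheme and compares it with the closed-form bound $KT^2 \exp(KT^2)/(np(n))$ . The values of K, T, n, and p(n) are hypothetical.

```python
import math

def gronwall_bound(t_horizon, big_k, n, p):
    """Right-hand side K T^2 exp(K T^2) / (n p(n)) from the proof."""
    return big_k * t_horizon ** 2 * math.exp(big_k * t_horizon ** 2) / (n * p)

def extremal_solution(t_horizon, big_k, n, p, steps=10_000):
    """Explicit Euler for f(t) = A + K T * int_0^t f(s) ds, the equality
    case of the integral inequality, with A = K T^2 / (n p(n))."""
    a0 = big_k * t_horizon ** 2 / (n * p)
    rate, h = big_k * t_horizon, t_horizon / steps
    f = a0
    for _ in range(steps):
        f += h * rate * f
    return f
```

Since $(1 + h c)^m \le e^{c h m}$ , the Euler iterate stays below the Grönwall bound, and both tend to zero as $np(n) \rightarrow \infty$ .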

4.1.2. Proof of Proposition 3.2

We divide the interval [0, T] into $M \;:\!=\; \lceil \frac{T}{\Delta} \rceil$ subintervals of length $\Delta > 0$ :

\begin{equation*} [0, T] = [0, \Delta] \cup [\Delta, 2\Delta] \cup \cdots \cup [(M-1)\Delta, T] \;=\!:\; \cup_{h=1}^{M} \Delta_h,\end{equation*}

where $\Delta_h \;:\!=\; [(h-1)\Delta, \, h\Delta]$ for $h = 1, \cdots, M-1$ and $\Delta_M = [(M-1)\Delta, \, T]$ . (We choose the value of $\Delta$ later.) With the notation

\begin{equation*} \widetilde{\mu}_{n, t} \;:\!=\; \frac{1}{n} \sum_{i=1}^n \mu_{\frac{i}{n}, t} = \frac{1}{n} \sum_{i=1}^n \mathcal{L}\big(X_{\frac{i}{n}}(t)\big),\end{equation*}

the triangle inequality gives

\begin{align*} &\;\;\;\;\mathbb{E} \bigg[ \sup_{0 \le t \le T} W_{2} (\widetilde{L}_{n, t}, \, \widetilde{\mu}_t) \bigg] = \mathbb{E} \bigg[ \max_{h \in [M]} \, \sup_{t \in \Delta_h} W_{2} (\widetilde{L}_{n, t}, \, \widetilde{\mu}_t) \bigg] \\[5pt] & \le \mathbb{E} \bigg[ \max_{h \in [M]} \, \sup_{t \in \Delta_h} W_{2} (\widetilde{L}_{n, t}, \, \widetilde{L}_{n, (h-1)\Delta}) \bigg] + \mathbb{E} \bigg[ \sup_{h \in [M]} W_{2} (\widetilde{L}_{n, (h-1)\Delta}, \, \widetilde{\mu}_{n, (h-1)\Delta}) \bigg] \\[5pt] & \qquad + \mathbb{E} \bigg[ \sup_{h \in [M]} W_{2} (\widetilde{\mu}_{n, (h-1)\Delta}, \, \widetilde{\mu}_{(h-1)\Delta}) \bigg] + \mathbb{E} \bigg[ \sup_{h \in [M]} \, \sup_{t \in \Delta_h} W_{2} (\widetilde{\mu}_{(h-1)\Delta}, \, \widetilde{\mu}_{t}) \bigg] \\[5pt] & \;=\!:\; E_1 + E_2 + E_3 + E_4.\end{align*}

For the first term, $E_1$ , we note that there exists $K>0$ , depending on the bounds of $\phi$ , $\psi$ , and $\sigma$ , such that

\begin{equation*} \big\vert X_{\frac{i}{n}}(s) - X_{\frac{i}{n}}(u) \big\vert^2 \le K \vert s-u \vert^2 + K \big\vert B_{\frac{i}{n}}(s) - B_{\frac{i}{n}}(u) \big\vert^2\end{equation*}

holds for every $0 \le u \le s \le T$ , and thus we have

\begin{align*} (E_1)^2 &\le \mathbb{E} \bigg[ \max_{h \in [M]} \, \sup_{t \in \Delta_h} W^2_{2} (\widetilde{L}_{n, t}, \, \widetilde{L}_{n, (h-1)\Delta}) \bigg] \\[5pt] &\le \mathbb{E} \bigg[ \max_{h \in [M]} \, \sup_{t \in \Delta_h} \frac{1}{n}\sum_{i=1}^n \big\vert X_{\frac{i}{n}}(t) - X_{\frac{i}{n}}\big((h-1)\Delta\big) \big\vert^2 \bigg] \\[5pt] & \le \mathbb{E} \Bigg[ \max_{h \in [M]} \, \sup_{t \in \Delta_h} \frac{1}{n}\sum_{i=1}^n \bigg( K \Delta^2 + K \bigg\vert B_{\frac{i}{n}}(t) - B_{\frac{i}{n}}\big((h-1)\Delta\big) \bigg\vert^2 \bigg) \Bigg] \\[5pt] & \le K \Delta^2 + K\mathbb{E} \Bigg[ \max_{h \in [M]} \, \sup_{t \in \Delta_h} \frac{1}{n}\sum_{i=1}^n \big\vert B_{\frac{i}{n}}(t) - B_{\frac{i}{n}}\big((h-1)\Delta\big) \big\vert^2 \Bigg].\end{align*}

Applying Hölder’s inequality twice, we find that the last expectation is bounded above by

\begin{align*} & \quad \mathbb{E} \Bigg[ \max_{h \in [M]} \, \sup_{t \in \Delta_h} \Bigg( \frac{1}{n} \sum_{i=1}^n \big\vert B_{\frac{i}{n}}(t) - B_{\frac{i}{n}}\big((h-1)\Delta\big) \big\vert^4 \Bigg)^{\frac{1}{2}} \Bigg] \\[5pt] &\le \sqrt{\mathbb{E} \Bigg[ \max_{h \in [M]} \, \sup_{t \in \Delta_h} \frac{1}{n}\sum_{i=1}^n \big\vert B_{\frac{i}{n}}(t) - B_{\frac{i}{n}}\big((h-1)\Delta\big) \big\vert^4 \Bigg]} \\[5pt] &\le \sqrt{ \sum_{h \in [M]} \frac{1}{n}\sum_{i=1}^n \mathbb{E} \bigg[ \sup_{t \in \Delta_h} \big\vert B_{\frac{i}{n}}(t) - B_{\frac{i}{n}}\big((h-1)\Delta\big) \big\vert^4 \bigg]} \le \sqrt{M C_4 \mathbb{E}[\Delta^2]} \le \sqrt{(T+1) C_4 \Delta}.\end{align*}

The second-to-last inequality uses the properties of the increments of Brownian motion and the Burkholder–Davis–Gundy inequality with the positive constant $C_4$ . Therefore, we have the bound

\begin{equation*} (E_1)^2 \le K \Delta^2 + K \sqrt{(T+1)C_4 \Delta}.\end{equation*}
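The last step of the preceding estimate uses $M\Delta^2 \le (T+1)\Delta$ , which follows from $M = \lceil T/\Delta \rceil < T/\Delta + 1$ once the mesh size satisfies $\Delta \le 1$ (an assumption made explicit here). A small exact check in rational arithmetic, with hypothetical values of T and $\Delta$ :

```python
from fractions import Fraction
import math

def number_of_subintervals(t_horizon, delta):
    """M = ceil(T / Delta), as in the partition of [0, T]."""
    return math.ceil(Fraction(t_horizon) / Fraction(delta))

def check_mesh_step(t_horizon, delta):
    """Verify M * Delta^2 <= (T + 1) * Delta for a mesh size Delta <= 1."""
    t_horizon, delta = Fraction(t_horizon), Fraction(delta)
    m = number_of_subintervals(t_horizon, delta)
    return m * delta ** 2 <= (t_horizon + 1) * delta
```

For example, with $T = 5$ and $\Delta = 3/10$ one gets $M = 17$ and $M\Delta^2 = 1.53 \le 1.8 = (T+1)\Delta$ .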

For the second expectation, $E_2$ , a series of applications of Hölder’s inequality and Lemma 2.8 give

\begin{align*} E_2 &\le \sum_{h \in [M]} \mathbb{E} \big[ W_{2} (\widetilde{L}_{n, (h-1)\Delta}, \, \widetilde{\mu}_{n, (h-1)\Delta}) \big] \le \sum_{h \in [M]} \mathbb{E} \big[ W_{3} (\widetilde{L}_{n, (h-1)\Delta}, \, \widetilde{\mu}_{n, (h-1)\Delta}) \big] \\[5pt] &\le \sum_{h \in [M]} \Big( \mathbb{E} \big[ W^3_{3} (\widetilde{L}_{n, (h-1)\Delta}, \, \widetilde{\mu}_{n, (h-1)\Delta}) \big] \Big)^{1/3} \\[5pt] &\le K \sum_{h \in [M]} \bigg( \int_{\mathbb{R}^d} \vert x \vert^4 \widetilde{\mu}_{n, (h-1)\Delta}(dx) \bigg)^{1/4} \alpha_{3, 4}^{1/3}(n) \le KM \alpha_{3, 4}^{1/3}(n) \longrightarrow 0,\end{align*}

as $n \rightarrow \infty$ , where the last inequality follows from Lemma 2.2.

On the other hand, by the convexity of $W^2_2(\, \cdot \, , \, \cdot \, )$ and Lemma 2.3(ii), there exists $K > 0$ satisfying

\begin{align*} E_3^2 &\le \mathbb{E} \bigg[ \sup_{h \in [M]} W^2_{2} (\widetilde{\mu}_{n, (h-1)\Delta}, \, \widetilde{\mu}_{(h-1)\Delta}) \bigg] \\[5pt] &\le \mathbb{E} \bigg[ \sup_{h \in [M]} \int_0^1 W^2_{2} (\mu_{\frac{\lceil nu \rceil}{n}, (h-1)\Delta}, \, \mu_{u, (h-1)\Delta}) \, du \bigg] \le \frac{K}{n^2}.\end{align*}

Finally, for the last term, $E_4$ , we note from a straightforward computation that there exists $K > 0$ satisfying

\begin{equation*} \mathbb{E} \big\vert X_u(t) - X_u(s) \big\vert^2 \le K \vert t-s \vert^2 + K\mathbb{E} \big\vert B_u(t) - B_u(s) \big\vert^2 \le K \vert t-s \vert\end{equation*}

for every $u \in [0, 1]$ and $s, t \in [0, T]$ satisfying $\vert t-s \vert \le 1$ . Thus, we have

\begin{align*} E^2_4 &\le \sup_{h \in [M]} \, \sup_{t \in \Delta_h} \int_0^1 W^2_{2} (\mu_{u, (h-1)\Delta}, \, \mu_{u, t}) \, du \\[5pt] &\le \sup_{h \in [M]} \, \sup_{t \in \Delta_h} \int_0^1 \mathbb{E} \big\vert X_u \big( (h-1)\Delta \big) - X_u(t) \big\vert^2 \, du \le K \Delta.\end{align*}

Let us combine all the bounds from $E_1$ to $E_4$ . For any given $\epsilon > 0$ , we can choose $\Delta$ small enough so that $E_1 + E_4 < \epsilon/2$ . Then we can choose $N \in \mathbb{N}$ large enough so that $E_2 + E_3 < \epsilon/2$ for every $n \ge N$ , which implies $\mathbb{E} \big[ \sup_{0 \le t \le T} W_{2} (\widetilde{L}_{n, t}, \, \widetilde{\mu}_t) \big] < \epsilon$ for every $n \ge N$ .

4.2. Proofs of results in Section 3.2

4.2.1. Proof of Theorem 3.1

Let us fix an arbitrary $n \in \mathbb{N}$ . We shall naturally identify elements of $(\mathbb{R}^d)^n$ with those of $\mathbb{R}^{dn}$ , and elements of $\big( C([0, T] \;:\; \mathbb{R}^d)\big)^n$ with those of $C\big([0, T] \;:\; (\mathbb{R}^d)^n\big)$ ; we shall specify which norm we use for each space. We can express the SDE (1.3) in the form of (2.12) with $k = dn$ , by making the following definitions:

  (i) $(\mathbb{R}^d)^n \ni x = (x_i)_{i \in [n]}$ , where $x_i = X_{i/n}(0)$ ;

  (ii) $C\big([0, T] \;:\; (\mathbb{R}^d)^n\big) \ni X^x = \left(\bar{X}^n_i\right)_{i \in [n]}$ , where $\bar{X}^n_i = \left(\bar{X}^n_{i, k}\right)_{k \in [d]}$ ;

  (iii) $b \;:\; [0, T] \times C\big([0, T] \;:\; (\mathbb{R}^d)^n\big) \rightarrow (\mathbb{R}^d)^n$ is such that $b = (b_i)_{i \in [n]}$ , $b_i = (b_{i, k})_{k \in [d]}$ , where

\begin{equation*} b_{i, k}(t, X^x) = \frac{1}{n} \sum_{j=1}^n \phi_k\big(\bar{X}^n_i(t), \bar{X}^n_j(t)\big)G\bigg(\frac{i}{n}, \frac{j}{n}\bigg) + \psi_k\big(\bar{X}^n_i(t)\big); \end{equation*}

  (iv) $W = (W_i)_{i \in [n]}$ is a (dn)-dimensional Brownian motion, where $W_i \equiv B_{i/n}$ ;

  (v) $\Sigma$ is a block-diagonal $(dn) \times (dn)$ matrix with block diagonal entries $\sigma$ .

In order to apply Lemma 2.5, it suffices to check the condition (2.13): for any $X, Y \in C([0, T] \;:\; \mathbb{R}^{dn})$ , Hölder’s inequality and the Lipschitz continuity of $\phi$ , $\psi$ indeed yield, for every $t \in [0, T]$ ,

\begin{align*} & \qquad \big\vert b(t, X) - b(t, Y) \big\vert^2 \\[5pt] &= \sum_{i=1}^n \sum_{k=1}^d \bigg \vert \frac{1}{n} \sum_{j=1}^n \Big(\phi_k \big(X_i(t), X_j(t)\big) - \phi_k \big(Y_i(t), Y_j(t)\big) \Big) G\bigg(\frac{i}{n}, \frac{j}{n}\bigg) + \psi_k\big(X_i(t) \big) - \psi_k\big(Y_i(t)\big) \bigg\vert^2 \\[5pt] & \le 2 \sum_{i=1}^n \sum_{k=1}^d \Bigg[ \Big \vert \frac{1}{n} \sum_{j=1}^n \Big(\phi_k \big(X_i(t), X_j(t)\big) - \phi_k \big(Y_i(t), Y_j(t)\big) \Big) G\bigg(\frac{i}{n}, \frac{j}{n}\bigg) \Big\vert^2 \\[5pt] &\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad + \Big\vert \psi_k\big(X_i(t) \big) - \psi_k\big(Y_i(t)\big) \Big \vert^2 \Bigg] \\[5pt] &\le 2 \sum_{i=1}^n \Bigg[ K^2 \big\vert X_i(t) - Y_i(t) \big\vert^2 + \sum_{k=1}^d \frac{1}{n} \sum_{j=1}^n \Big\vert \phi_k \big(X_i(t), X_j(t)\big) - \phi_k \big(Y_i(t), Y_j(t)\big) \Big\vert^2 \Bigg] \\[5pt] &\le (4K^2+1) \sum_{i=1}^n \big\vert X_i(t) - Y_i(t) \big\vert^2 \le (4K^2+1) \vert\vert X-Y \vert\vert^2_{\star, t}.\end{align*}

Let $P^x \in \mathcal{P}\big(C([0, T] \;:\; \mathbb{R}^{dn})\big)$ be the law of the solution of (1.3) in the notation of (i)–(v) above; then, from Lemma 2.5, for any $Q \in \mathcal{P}\big(C([0, T] \;:\; \mathbb{R}^{dn})\big)$ we have

\begin{equation*} W_{1, \big(C([0, T] \;:\; \mathbb{R}^{dn}), \, \vert\vert\cdot\vert\vert_{dn, 2}\big)} (P^x, Q) \le \sqrt{2c_1 H(Q \vert P^x)},\end{equation*}

for some $c_1 > 0$ .

For an arbitrary $F \in Lip\Big(\big(C([0, T] \;:\; \mathbb{R}^d)\big)^n, \vert\vert\cdot\vert\vert_{n, 1}\Big)$ , Hölder’s inequality shows that F is a $\sqrt{n}$ -Lipschitz function on the space $\big(C([0, T] \;:\; \mathbb{R}^{dn}), \, \Vert \cdot \Vert_{dn, 2}\big)$ ; indeed, for $X, Y \in \Big(\big(C([0, T] : \mathbb{R}^d)\big)^n, \Vert \cdot \Vert_{n, 1}\Big)$ we obtain

\begin{align*} \big\vert F(X) - F(Y) \big\vert &\le \Vert X-Y \Vert_{n, 1} = \sum_{i=1}^n \Vert X_i - Y_i \Vert_{\star, T} \le \sqrt{n \sum_{i=1}^n \Vert X_i - Y_i\Vert_{\star, T}^2} \\[5pt] & \le \sqrt{n \sum_{i=1}^n \sum_{k=1}^d \Vert X_{i, k} - Y_{i, k}\Vert_{\star, T}^2} = \sqrt{n} \, \Vert X-Y \Vert_{dn, 2}.\end{align*}

Thus, Lemma 2.6 implies

(4.1) \begin{equation} P^x \big(F - \langle P^x, F \rangle > a \big) \le \exp\!\bigg(\!-\! \frac{a^2}{2c_1 n}\bigg),\end{equation}

for any $a > 0$ .
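The factor n in the exponent of (4.1) comes from the $\sqrt{n}$ -Lipschitz estimate obtained by Hölder's inequality. The sketch below checks the time-frozen analogue of that estimate for vectors in $(\mathbb{R}^d)^n$ , namely that the sum of the block norms is at most $\sqrt{n}$ times the Euclidean norm of the concatenated vector. It ignores the supremum over time, so it illustrates the norm comparison only; the function names are hypothetical.

```python
import math
import random

def norm_n1(blocks):
    """Sum over particles of the Euclidean norm of each d-dimensional block."""
    return sum(math.sqrt(sum(c * c for c in block)) for block in blocks)

def norm_dn2(blocks):
    """Euclidean norm of the concatenated dn-dimensional vector."""
    return math.sqrt(sum(c * c for block in blocks for c in block))

rng = random.Random(0)  # fixed seed, purely for reproducibility
blocks = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(20)]
ratio = norm_n1(blocks) / norm_dn2(blocks)  # at most sqrt(20)
```

Equality in the comparison holds exactly when all block norms coincide, which is why the constant $\sqrt{n}$ cannot be improved in general.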

We now claim that there exists a positive constant $c_2$ , which does not depend on n, such that the map $x \mapsto \langle P^x, F \rangle$ is $c_2$ -Lipschitz on $(\mathbb{R}^d)^n$ with respect to the Euclidean $\ell^p$ norm for any $F \in Lip\Big(\big(C([0, T] \;:\; \mathbb{R}^d)\big)^n, \vert\vert\cdot\vert\vert_{n, p}\Big)$ and for any $p = 1, 2$ . Given any $x, y \in (\mathbb{R}^d)^n$ , we couple $P^x$ and $P^y$ by solving the system (1.3) from the two initial states x, y with the same Brownian motion, and denote the coupling by $\pi_{x, y}$ . We deduce that for $\mathcal{L}(X) = P^x$ and $\mathcal{L}(Y) = P^y$ ,

(4.2) \begin{align} \big\vert \langle P^x, F \rangle - \langle P^y, F \rangle \big\vert^p \le \int \big\vert F(X) - F(Y) \big\vert^p \pi_{x, y}(dX, dY) \le \int \Vert X - Y \Vert^p_{n, p} \, \pi_{x, y}(dX, dY).\end{align}

When $p = 2$ , we use a standard argument (the trivial inequality $(a+b)^2 \le 2(a^2+b^2)$ , the Lipschitz continuity from Assumption 2.1(i), and a series of applications of Hölder’s inequality) to derive

\begin{align*} &\qquad \sum_{i=1}^n \vert X_i(t) - Y_i(t) \vert^2 \\[5pt] &\le 2 \vert\vert x-y\vert\vert^2_{n, 2} + 2Kt \sum_{i=1}^n \int_0^t \Bigg( \vert X_i(s) - Y_i(s) \vert + \frac{1}{n}\sum_{j=1}^n \vert X_j(s) - Y_j(s) \vert \Bigg)^2 \, ds \\[5pt] &\le 2 \vert\vert x-y\vert\vert^2_{n, 2} + 8Kt \int_0^t \sum_{i=1}^n \vert X_i(s) - Y_i(s) \vert^2 \, ds, \qquad \forall \, t \in [0, T].\end{align*}

Grönwall’s inequality yields that the last integrand in (4.2) for $p=2$ is bounded by

\begin{equation*} \vert\vert X-Y \vert\vert^2_{n, 2} \le c_2^2 \vert\vert x-y \vert\vert^2_{n, 2}\end{equation*}

for some constant $c_2 > 0$ , which depends on $\phi$ , $\psi$ , and T, but not on n. When $p=1$ , proving $\vert\vert X-Y \vert\vert_{n, 1} \le c_2 \vert\vert x-y \vert\vert_{n, 1}$ is easier, and the claim follows.

On the other hand, we apply Lemmas 2.7 and 2.6 to the assumption (3.1) to obtain, for every $f \in Lip\big( (\mathbb{R}^d)^n, \, \vert\vert\cdot\vert\vert_{n, 1} \big)$ and for any $a > 0$ ,

(4.3) \begin{equation} \mu^n_0 \big(f - \langle \mu^n_0, f \rangle > a \big) \le \exp\!\bigg(\!-\! \frac{a^2}{2\kappa n}\bigg).\end{equation}

We conclude from (4.1), the above claim, and (4.3) that

\begin{align*} \mathbb{P} \Big[ F(\bar{X}^n) - \mathbb{E}\big(F(\bar{X}^n)\big) > a \Big] & \le \mathbb{E} \Big[ \mathbb{P} \big( F(\bar{X}^n) - \langle P^{x}, F \rangle > \frac{a}{2} \, \big \vert \, \bar{X}^n(0) = x \big) \Big] \\[5pt] & \qquad \qquad \qquad + \mathbb{P} \bigg( \langle P^{\bar{X}^n(0)}, F \rangle - \mathbb{E} \big[\langle P^{\bar{X}^n(0)}, F \rangle \big] > \frac{a}{2} \bigg) \\[5pt] & \le \exp\!\bigg(\!-\! \frac{a^2}{8c_1 n}\bigg) + \exp\!\bigg(\!-\! \frac{a^2}{8\kappa c_2^2n}\bigg).\end{align*}

The assertion (3.2) follows from choosing $1 / \delta = 8 \max\!(c_1, \kappa c_2^2)$ .
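The final choice of $\delta$ can be checked numerically: with $1/\delta = 8\max(c_1, \kappa c_2^2)$ , each of the two exponentials in the last display is at most $\exp(-\delta a^2/n)$ , so their sum is at most the bound of Theorem 3.1. In the sketch below the constants $c_1$ , $\kappa$ , and $c_2$ are arbitrary placeholders.

```python
import math

def delta_from_constants(c1, kappa, c2):
    """delta with 1 / delta = 8 * max(c1, kappa * c2^2)."""
    return 1.0 / (8.0 * max(c1, kappa * c2 ** 2))

def two_term_bound(a, n, c1, kappa, c2):
    """Sum of the two exponentials in the last display of the proof."""
    first = math.exp(-a * a / (8.0 * c1 * n))
    second = math.exp(-a * a / (8.0 * kappa * c2 ** 2 * n))
    return first + second

def single_term_bound(a, n, c1, kappa, c2):
    """Right-hand side 2 exp(-delta a^2 / n) of Theorem 3.1."""
    return 2.0 * math.exp(-delta_from_constants(c1, kappa, c2) * a * a / n)
```

The same computation, with the factor n removed from both exponents, covers the dimension-free bound of Theorem 3.2.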

4.2.2. Proof of Theorem 3.2

We follow the proof of Theorem 3.1. Identifying the elements of $\big(C([0, T] \;:\; \mathbb{R}^d)\big)^n$ with those of $C([0, T] \;:\; \mathbb{R}^{dn})$ , expressing the SDE (1.3) in the form of (2.12), and applying Lemma 2.5, we have that there exists a positive constant $c_1 > 0$ such that

\begin{equation*} W_{1, (C([0, T] \;:\; \mathbb{R}^{dn}), \, \vert\vert\cdot\vert\vert_{dn, 2})}(P^x, Q) \le \sqrt{2c_1 H(Q \vert P^x)}\end{equation*}

holds for any $Q \in \mathcal{P}\big(C([0, T] \;:\; \mathbb{R}^{dn})\big)$ . Here, $P^x$ is the law of the solution of (1.3). Moreover, Lemma 2.6 implies

(4.4) \begin{equation} P^x \big(F - \langle P^x, F \rangle > a \big) \le \exp\!\bigg(\!-\! \frac{a^2}{2c_1}\bigg),\end{equation}

for any $a > 0$ and every $F \in Lip\big(C([0, T] \;:\; \mathbb{R}^{dn}), \Vert\cdot\Vert_{dn, 2}\big)$ . Since $\Vert X \Vert_{n, 2} \le \Vert X \Vert_{dn, 2}$ , it is easy to check that every function in $Lip\Big(\big(C([0, T] \;:\; \mathbb{R}^d)\big)^n, \Vert\cdot\Vert_{n, 2}\Big)$ also belongs to $Lip\big(C([0, T] \;:\; \mathbb{R}^{dn}), \Vert\cdot\Vert_{dn, 2}\big)$ ; thus the inequality (4.4) also holds for every $F \in Lip\Big(\big(C([0, T] \;:\; \mathbb{R}^d)\big)^n, \Vert\cdot\Vert_{n, 2}\Big)$ .

We now apply Lemmas 2.7(ii) and 2.6 to the assumption (3.3) to deduce

(4.5) \begin{equation} \mu^n_0 \big(f - \langle \mu^n_0, f \rangle > a \big) \le \exp\!\bigg(\!-\! \frac{a^2}{2\kappa}\bigg),\end{equation}

for every $f \in Lip\big( (\mathbb{R}^d)^n, \, \vert\vert\cdot\vert\vert_{n, 2} \big)$ and for any $a > 0$ .
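A Monte Carlo sketch can illustrate why (4.5) is called dimension-free. It takes the initial law to be standard normal, for which (3.3) holds with $\kappa = 1$ as noted in Section 3.2, and uses the 1-Lipschitz function $f(z) = \vert z \vert$ on $\mathbb{R}^{dn}$ . The sample size, the seed, and the use of the sample mean in place of the true mean are assumptions of the example.

```python
import math
import random

def tail_frequency(dim, a, samples, rng):
    """Monte Carlo estimate of P[f(Z) - mean(f) > a] for f(z) = |z|_2 and
    Z standard normal on R^dim; the mean is replaced by the sample mean."""
    values = [
        math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim)))
        for _ in range(samples)
    ]
    mean = sum(values) / samples
    return sum(1 for v in values if v - mean > a) / samples

def gaussian_tail_bound(a, kappa=1.0):
    """Right-hand side exp(-a^2 / (2 kappa)) of (4.5)."""
    return math.exp(-a * a / (2.0 * kappa))

rng = random.Random(1)  # fixed seed, for reproducibility only
estimates = {dim: tail_frequency(dim, 1.0, 2000, rng) for dim in (5, 50)}
```

The estimated tail frequencies stay below the same bound for both dimensions, in contrast with the n-dependent exponent of (4.3).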

From (4.4), (4.5), and the claim in the proof of Theorem 3.1, we conclude that

\begin{align*} \mathbb{P} \Big[ F(\bar{X}^n) - \mathbb{E}\big(F(\bar{X}^n)\big) > a \Big] & \le \mathbb{E} \Big[ \mathbb{P} \big( F(\bar{X}^n) - \langle P^{x}, F \rangle > \frac{a}{2} \, \big \vert \, \bar{X}^n(0) = x \big) \Big] \\[5pt] & \qquad + \mathbb{P} \Big[ \langle P^{\bar{X}^n(0)}, F \rangle - \mathbb{E} \big[\langle P^{\bar{X}^n(0)}, F \rangle \big] > \frac{a}{2} \Big] \\[5pt] & \le \exp\!\bigg(\!-\! \frac{a^2}{8c_1}\bigg) + \exp\!\bigg(\!-\! \frac{a^2}{8\kappa c_2^2}\bigg).\end{align*}

The result (3.4) follows from choosing $1 / \delta = 8 \max\!(c_1, \kappa c_2^2)$ .

4.3. Proofs of results in Section 3.3

4.3.1. Proof of Theorem 3.3

First, we claim that

\begin{equation*} Y \mapsto \sup_{0 \le t \le T} W_{1} \bigg(\frac{1}{n}\sum_{i=1}^n \delta_{Y_i(t)}, \, \widetilde{\mu}_t \bigg)\end{equation*}

is $(1/n)$ -Lipschitz from $\Big(\big( C ([0, T] \;:\; \mathbb{R}^d) \big)^n, \, \Vert\cdot\Vert_{n, 1} \Big)$ to $\mathbb{R}$ . For $Y, Z \in \big( C ([0, T] \;:\; \mathbb{R}^d) \big)^n$ , applying the triangle inequality for the Wasserstein metric and taking the supremum over $t \in [0, T]$ gives the inequality

\begin{align*}& \Bigg\vert \sup_{0 \le t \le T} W_{1} \Bigg(\frac{1}{n}\sum_{i=1}^n \delta_{Y_i(t)}, \, \widetilde{\mu}_t \Bigg) - \sup_{0 \le t \le T} W_{1} \Bigg(\frac{1}{n}\sum_{i=1}^n \delta_{Z_i(t)}, \, \widetilde{\mu}_t \Bigg) \Bigg\vert \\[5pt] &\quad \le \sup_{0 \le t \le T} W_{1} \Bigg(\frac{1}{n}\sum_{i=1}^n \delta_{Y_i(t)}, \, \frac{1}{n}\sum_{i=1}^n \delta_{Z_i(t)} \Bigg).\end{align*}

From the definitions (2.1) and (2.2), the last expression is less than or equal to $\frac{1}{n} \sum_{i=1}^n \Vert Y_i - Z_i \Vert_{\star, T} = \frac{1}{n}\Vert Y-Z \Vert_{n, 1}$ , and the claim follows.
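The Lipschitz claim can be tested in a simplified setting. The sketch freezes time, works on the real line, and replaces $\widetilde{\mu}_t$ by an n-atom empirical measure so that $W_1$ can be computed by sorting; all three simplifications are assumptions of the example and not part of the proof.

```python
import random

def w1_empirical_1d(xs, ys):
    """1-Wasserstein distance between two empirical measures on the real
    line with equally many atoms (sorted, i.e. monotone, coupling)."""
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def lipschitz_gap(ys, zs, reference):
    """Return (|F(Y) - F(Z)|, (1/n) * sum_i |Y_i - Z_i|) for the frozen-time
    map F(Y) = W_1(empirical measure of Y, empirical measure of reference)."""
    n = len(ys)
    lhs = abs(w1_empirical_1d(ys, reference) - w1_empirical_1d(zs, reference))
    rhs = sum(abs(y - z) for y, z in zip(ys, zs)) / n
    return lhs, rhs

rng = random.Random(2)  # fixed seed, purely for reproducibility
reference = [rng.gauss(0.0, 1.0) for _ in range(200)]  # stand-in for the limit law
ys = [rng.gauss(0.5, 1.0) for _ in range(200)]
zs = [y + rng.uniform(-0.2, 0.2) for y in ys]
lhs, rhs = lipschitz_gap(ys, zs, reference)
```

The first returned value is bounded by the second, which is the $(1/n)$ -Lipschitz property in this reduced setting.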

Then, for any $a > 0$ ,

\begin{align*} \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_{1} (\bar{L}_{n, t}, \widetilde{\mu}_t) > a \bigg] &\le \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_{1} (\bar{L}_{n, t}, \widetilde{\mu}_t) - \mathbb{E}\bigg[\sup_{0 \le t \le T} W_{1} (\bar{L}_{n, t}, \widetilde{\mu}_t) \bigg] > \frac{a}{2} \bigg] \\[5pt] & \qquad + \mathbb{P} \bigg[ \mathbb{E}\bigg[\sup_{0 \le t \le T} W_{1} (\bar{L}_{n, t}, \widetilde{\mu}_t) \bigg] > \frac{a}{2} \bigg].\end{align*}

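By the claim above, n times the map in question is 1-Lipschitz with respect to $\Vert\cdot\Vert_{n, 1}$ , so Theorem 3.1, applied with $na/2$ in place of a, bounds the first term by $2\exp\!\big(\!-\!\delta (na/2)^2 / n\big) = 2\exp\!\big(\!-\!\delta a^2 n/4\big)$ , which is the right-hand side of (3.5).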

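Let us consider the auxiliary particle system (1.2) satisfying Assumption 2.4. Since $W_1 \le W_2$ , Corollary 3.1 shows that the expectation inside the last probability converges to zero as $n \rightarrow \infty$ ; the event in question is deterministic, so this probability vanishes for all but finitely many n, and the result follows.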

4.3.2. Proof of Theorem 3.4

We first prove (3.6). From the triangle inequality, we obtain

(4.6) \begin{align} \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_{1} (L_{n, t}, \widetilde{\mu}_t) > a \bigg] \le \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_{1} (\bar{L}_{n, t}, \widetilde{\mu}_t) > \frac{a}{2} \bigg] + \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_{1} (L_{n, t}, \bar{L}_{n, t}) > \frac{a}{2} \bigg].\end{align}

In what follows, we compute the bound for the last probability on the right-hand side. For fixed $t \in [0, T]$ and $i \in [n]$, we use the notation (2.5) to obtain

(4.7) \begin{align} & X^n_i(t) - \bar{X}^n_i(t) = \int_0^t \sum_{j=1}^n D^{(n)}_{i, j} \phi\big( X^n_i(s), X^n_j(s) \big) \, ds \\[5pt] &+\int_0^t \sum_{j=1}^n \bar{P}^{(n)}_{i, j} \Big ( \phi\big(X^n_i(s), X^n_j(s) \big) - \phi\big(\bar{X}^n_i(s), \bar{X}^n_j(s) \big) \Big) + \psi\big(X^n_i(s)\big) - \psi\big(\bar{X}^n_i(s) \big) \, ds. \nonumber\end{align}

We define $\triangle(t) \;:\!=\; \frac{1}{n} \sum_{i=1}^n \Vert X^n_i - \bar{X}^n_i \Vert_{\star, t}$ and deduce from the continuity of $X^n_i(\!\cdot\!)-\bar{X}^n_i(\!\cdot\!)$ that, for each $i \in [n]$, there exists $t_i \in [0, t]$ satisfying

(4.8) \begin{align} \triangle(t) = \frac{1}{n} \sum_{i=1}^n \vert X^n_i(t_i) - \bar{X}^n_i(t_i) \vert \le \int_0^{t} \frac{1}{n} \sum_{i,j=1}^n \Big\vert D^{(n)}_{i, j} \mathbb{1}_{[0, t_i]}(s) \phi\big( X^n_i(s), X^n_j(s) \big) \Big\vert \, ds \end{align}
(4.9) \begin{align} +\frac{1}{n}\int_0^t \sum_{i, j=1}^n \bar{P}^{(n)}_{i, j} \mathbb{1}_{[0, t_i]}(s) \Big\vert \phi\big(X^n_i(s), X^n_j(s) \big) - \phi\big(\bar{X}^n_i(s), \bar{X}^n_j(s) \big) \Big\vert \, ds \end{align}
(4.10) \begin{align} + \frac{1}{n}\int_0^t \sum_{i=1}^n \mathbb{1}_{[0, t_i]}(s) \Big\vert \psi\big(X^n_i(s)\big) - \psi\big(\bar{X}^n_i(s) \big) \Big\vert \, ds. \end{align}

Since each component $\phi_k$ of $\phi$ belongs to the $L^1$-Fourier class, there exists a finite complex measure $m_{\phi_k}$ such that we can write, for every $k \in [d]$,

(4.11) \begin{equation} \phi_k \big(X^n_i(s), X^n_j(s) \big) = \int_{\mathbb{R}^{2d}} a^k_i(z, s) b^k_j(z, s) m_{\phi_k}(dz), \qquad z = (z_1, z_2),\end{equation}

for some complex functions $a^k_i, b^k_j$ of the form

\begin{align*} a^k_i(z, s) \;:\!=\; \exp\!\big(2\pi \sqrt{-1} \langle X^n_i(s), z_1 \rangle \big), \qquad b^k_j(z, s) \;:\!=\; \exp\!\big(2\pi \sqrt{-1} \langle X^n_j(s), z_2 \rangle \big).\end{align*}
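To make the representation (4.11) concrete, here is a minimal sketch for $d = 1$ with a hypothetical discrete measure: two atoms of mass $1/2$ at $(1, -1)$ and $(-1, 1)$ reproduce $\phi(x, y) = \cos(2\pi(x - y))$. The measure, the choice of $\phi$, and the function names are illustrative assumptions and do not come from the paper.

```python
import cmath
import math

# Hypothetical finite measure m on R^2: atoms ((z1, z2), mass).
ATOMS = [((1.0, -1.0), 0.5), ((-1.0, 1.0), 0.5)]

def phi_via_fourier(x, y):
    """Evaluate the integral of a(z) * b(z) against m, as in (4.11)."""
    total = 0j
    for (z1, z2), mass in ATOMS:
        a = cmath.exp(2j * math.pi * x * z1)   # factor depending on x only
        b = cmath.exp(2j * math.pi * y * z2)   # factor depending on y only
        total += a * b * mass
    return total

def phi_direct(x, y):
    """Closed form of the same interaction function."""
    return math.cos(2 * math.pi * (x - y))

x, y = 0.3, -1.7
print(abs(phi_via_fourier(x, y) - phi_direct(x, y)) < 1e-12)
```

The point of the factorisation is that each term in the integrand is a product of a function of particle $i$ and a function of particle $j$, which is what allows the bilinear form in $D^{(n)}$ to appear in (4.12).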

Using the representation (4.11) with the elementary inequality

\begin{equation*} \big\vert \phi \big(X^n_i(s), X^n_j(s) \big) \big\vert \le \sqrt{d}\max_{1 \le k \le d} \big\vert \phi_k \big(X^n_i(s), X^n_j(s) \big) \big\vert, \end{equation*}

we find that the integral of (4.8) is bounded above by

(4.12) \begin{align} \max_{1 \le k \le d} \frac{\sqrt{d}}{n} \int_0^t \int_{\mathbb{R}^{2d}} & \sum_{i,j=1}^n \Big\vert D^{(n)}_{i, j} \mathbb{1}_{[0, t_i]}(s) a^k_i(z, s) b^k_j(z, s) \Big\vert m_{\phi_k}(dz) \, ds \nonumber \\[5pt] &= \max_{1 \le k \le d} \frac{\sqrt{d}}{n} \int_0^t \int_{\mathbb{R}^{2d}} \Big\vert \Big\langle \mathbf{a}^k(z, s), \, D^{(n)} \mathbf{b}^k(z, s) \Big\rangle \Big\vert m_{\phi_k}(dz) \, ds, \end{align}

where we define the complex vectors

(4.13) \begin{equation} \mathbf{a}^k(z, s) \;:\!=\; \Big(\mathbb{1}_{[0, t_i]}(s) a^k_i(z, s) \Big)_{i \in [n]}, \qquad \mathbf{b}^k(z, s) \;:\!=\; \big( b^k_j(z, s) \big)_{j \in [n]}, \qquad \text{for each } k \in [d].\end{equation}

Since the $\ell^{\infty}$ norms of these vectors are bounded by 1, decomposing them into real and imaginary parts gives, for each $k \in [d]$,

\begin{equation*} \Big\vert \Big\langle \mathbf{a}^k(z, s), \, D^{(n)} \mathbf{b}^k(z, s) \Big\rangle \Big\vert \le 4 \sup \Big\{ \Big\langle \mathbf{x}, D^{(n)}\mathbf{y} \Big\rangle \;:\; \mathbf{x}, \mathbf{y} \in [\!-\!1, 1]^n \Big\} = 4\vert\vert D^{(n)} \vert\vert_{\infty \rightarrow 1}.\end{equation*}

Thus, the right-hand side of (4.12) is bounded above by

\begin{equation*} \frac{4t\sqrt{d}}{n} \Big( \max_{1 \le k \le d} \vert\vert m_{\phi_k} \vert\vert_{TM} \Big) \vert\vert D^{(n)} \vert\vert_{\infty \rightarrow 1}.\end{equation*}
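The factor 4 obtained from splitting into real and imaginary parts can be checked numerically on a small matrix. The sketch below computes $\Vert D \Vert_{\infty \rightarrow 1}$ by brute force over sign vectors, which is feasible only for small $n$, and compares it with the bilinear form evaluated at random complex vectors whose entries have modulus at most 1. The matrix is an arbitrary random example rather than the $D^{(n)}$ of (2.5), and the function names are hypothetical.

```python
import cmath
import itertools
import math
import random

def norm_inf_to_1(D):
    """Brute-force ||D||_{inf->1} = sup{<x, D y> : x, y in [-1, 1]^n}.

    The form is bilinear, so the supremum is attained at sign vectors; for a
    fixed y the best x is sign(D y), which leaves a search over y only.
    """
    n = len(D)
    best = 0.0
    for y in itertools.product((-1.0, 1.0), repeat=n):
        Dy = [sum(D[i][j] * y[j] for j in range(n)) for i in range(n)]
        best = max(best, sum(abs(v) for v in Dy))
    return best

def bilinear(a, D, b):
    """<a, D b> = sum_{i,j} a_i D_ij b_j, without complex conjugation."""
    n = len(D)
    return sum(a[i] * D[i][j] * b[j] for i in range(n) for j in range(n))

random.seed(1)
n = 5
D = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
norm = norm_inf_to_1(D)
worst = 0.0
for _ in range(200):
    # Entries of a are 0 or unimodular, mimicking the indicator in (4.13).
    a = [random.choice((0.0, 1.0)) * cmath.exp(2j * math.pi * random.random()) for _ in range(n)]
    b = [cmath.exp(2j * math.pi * random.random()) for _ in range(n)]
    worst = max(worst, abs(bilinear(a, D, b)))
print(worst <= 4 * norm)
```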

For the integrals of (4.9) and (4.10), we use the Lipschitz continuity of $\phi$ and $\psi$; thus, there exists a constant $K > 0$ such that

(4.14) \begin{equation} \triangle(t) \le K \int_0^t \triangle (s) ds + \frac{\vert\vert D^{(n)} \vert\vert_{\infty \rightarrow 1}}{n}Kt, \qquad \forall \, t \in [0, T].\end{equation}

Grönwall’s inequality yields

(4.15) \begin{equation} \triangle(T) \le \frac{K \Vert D^{(n)}\Vert_{\infty \rightarrow 1}}{n},\end{equation}

where $K$ is now a positive constant depending on the time horizon $T$. Recalling the notation $\triangle(t)$, we obtain

\begin{equation*} \sup_{0 \le t \le T} W_{1} \!\left(L_{n, t}, \bar{L}_{n, t}\right) \le \triangle(T) \le \frac{K \vert\vert D^{(n)} \vert\vert_{\infty \rightarrow 1}}{n},\end{equation*}

and finally Lemma 2.9 gives the bound for the last probability of (4.6),

(4.16) \begin{equation} \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_{1} \left(L_{n, t}, \bar{L}_{n, t}\right) > \frac{a}{2} \bigg] \le \exp\!\bigg(\!-\! \frac{a^2 n^2p(n)}{8K^2+2aK/3}\bigg).\end{equation}
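For completeness, the Grönwall step from (4.14) to (4.15) can be spelled out. Writing $c_n \;:\!=\; \Vert D^{(n)} \Vert_{\infty \rightarrow 1} / n$ (a shorthand introduced only for this remark) and applying the integral form of Grönwall's inequality with the nondecreasing term $c_n K t$, one obtains, as a sketch,

\begin{equation*} \triangle(t) \le c_n K t + K \int_0^t \triangle(s) \, ds \quad \forall \, t \in [0, T] \qquad \Longrightarrow \qquad \triangle(T) \le c_n K T e^{KT},\end{equation*}

so that the constant in (4.15) may be taken to be $KTe^{KT}$, which depends on $T$ as stated.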

For the first probability on the right-hand side of (4.6), Theorem 3.3 yields, for every $n \ge N$,

\begin{equation*} \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_{1} \left(\bar{L}_{n, t}, \widetilde{\mu}_t\right) > \frac{a}{2} \bigg] \le 2 \exp\!\bigg(\!-\! \frac{\delta a^2n}{16} \bigg).\end{equation*}

Thanks to Assumption 2.4, by choosing a larger value for $N \in \mathbb{N}$ than the one in Theorem 3.3, we can ensure that

$$\exp\!\left(\!-\! \frac{a^2 n^2p(n)}{8K^2+2aK/3}\right) \le \exp\!\left(\!-\! \frac{\delta a^2n}{16}\right)$$

for every $n \ge N$, and the assertion (3.6) follows.

For the result (3.7), we can approximate a general $\phi$ by functions in the $L^1$-Fourier class, using the approximation method in Section 5.1.3 of [27], to find an exponential bound for the probability $\mathbb{P} [\!\sup_{0 \le t \le T} d_{BL} (L_{n, t}, \bar{L}_{n, t}) > a/2 ]$ (similar to (4.16)). Recalling that $d_{BL} \le W_1$ and replacing all the $W_1$ metrics with the $d_{BL}$ metrics in (4.6), we arrive at (3.7).

4.3.3. Proof of Theorem 3.5

As in the proof of Theorem 3.3, for $Y, Z \in \big( C ([0, T] \;:\; \mathbb{R}^d) \big)^n$ we derive

\begin{align*} &\bigg\vert \sup_{0 \le t \le T} W_{2} \Bigg(\frac{1}{n}\sum_{i=1}^n \delta_{Y_i(t)}, \, \widetilde{\mu}_t \!\Bigg) - \sup_{0 \le t \le T} W_{2} \Bigg(\frac{1}{n}\sum_{i=1}^n \delta_{Z_i(t)}, \, \widetilde{\mu}_t \Bigg) \bigg\vert \le \sup_{0 \le t \le T} W_{2} \Bigg(\frac{1}{n}\sum_{i=1}^n \delta_{Y_i(t)}, \, \frac{1}{n}\sum_{i=1}^n \delta_{Z_i(t)}\! \Bigg) \\[5pt] & \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \le \Bigg( \frac{1}{n} \sum_{i=1}^n \Vert Y_i - Z_i \Vert^2 \Bigg)^{\frac{1}{2}} = \frac{1}{\sqrt{n}} \Vert Y-Z \Vert_{n, 2},\end{align*}

again from (2.1) and (2.2). This verifies the $\Big(\frac{1}{\sqrt{n}}\Big)$-Lipschitz continuity of the map

\begin{equation*} Y \mapsto \sup_{0 \le t \le T} W_{2} \Bigg(\frac{1}{n}\sum_{i=1}^n \delta_{Y_i(t)}, \, \widetilde{\mu}_t \Bigg)\end{equation*}

from $\Big(\big( C ([0, T] \;:\; \mathbb{R}^d) \big)^n, \, \Vert\cdot\Vert_{n, 2} \Big)$ to $\mathbb{R}$. Then, for any $a > 0$,

\begin{align*} \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_2 (\bar{L}_{n, t}, \widetilde{\mu}_t) > a \bigg] &\le \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_2 (\bar{L}_{n, t}, \widetilde{\mu}_t) - \mathbb{E}\bigg[\sup_{0 \le t \le T} W_2 (\bar{L}_{n, t}, \widetilde{\mu}_t) \bigg] > \frac{a}{2} \bigg] \\[5pt] & \qquad + \mathbb{P} \bigg[ \mathbb{E}\bigg[\sup_{0 \le t \le T} W_2 (\bar{L}_{n, t}, \widetilde{\mu}_t) \bigg] > \frac{a}{2} \bigg].\end{align*}

The first term is bounded by the right-hand side of (3.4) from Theorem 3.2. The last probability vanishes for all but finitely many $n$ by Corollary 3.1.
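The $(1/\sqrt{n})$-Lipschitz estimate used in this proof can be illustrated in the same spirit as for $W_1$. The sketch below again assumes $d = 1$, a finite time grid, and the sup norm on paths, and reads $\Vert Y - Z \Vert_{n, 2}$ as the square root of the sum of squared path distances, consistent with the display above. Function names are hypothetical.

```python
import math
import random

def w2_empirical_1d(ys, zs):
    """W_2 between equal-size empirical measures on R via the sorted coupling."""
    n = len(ys)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sorted(ys), sorted(zs))) / n)

def sup_w2_between_paths(Y, Z):
    """Sup over grid times of W_2 between the empirical measures of Y(t), Z(t)."""
    m = len(Y[0])
    return max(w2_empirical_1d([y[k] for y in Y], [z[k] for z in Z]) for k in range(m))

def norm_n2(Y, Z):
    """Discrete analogue of ||Y - Z||_{n,2}: l^2 norm of the sup-norm path distances."""
    return math.sqrt(sum(max(abs(a - b) for a, b in zip(y, z)) ** 2 for y, z in zip(Y, Z)))

random.seed(2)
n, m = 8, 25
Y = [[random.gauss(0, 1) for _ in range(m)] for _ in range(n)]
Z = [[random.gauss(0, 1) for _ in range(m)] for _ in range(n)]
lhs_w2 = sup_w2_between_paths(Y, Z)
rhs_w2 = norm_n2(Y, Z) / math.sqrt(n)
print(lhs_w2 <= rhs_w2 + 1e-12)
```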

4.3.4. Proof of Lemma 3.1

We note from (2.5) that $\Big\{D^{(n)}_{i, j}\Big\}_{1 \le i, j \le n}$ are independent zero-mean random variables, and for every $i, j \in [n]$,

(4.17) \begin{equation} \mathbb{E}\big[(D^{(n)}_{i, j})^2\big] = \frac{p(n)G(\frac{i}{n}, \frac{j}{n}) \big(1 - p(n)G(\frac{i}{n}, \frac{j}{n}) \big)}{\big(np(n)\big)^2}.\end{equation}

In particular, since $p(n) \le 1$, we have $0 \le p(n)G(\frac{i}{n}, \frac{j}{n}) \le 1$, and thus $\mathbb{E}\big[(D^{(n)}_{i, j})^2\big] \le 1/(4n^2p(n)^2)$, because $x(1-x) \le 1/4$ for every $x \in [0, 1]$.
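The moment formula (4.17) can be verified in exact arithmetic. The sketch below assumes, consistently with the two-point distribution used later in this proof, that $D^{(n)}_{i, j} = (\xi - p(n)G)/(np(n))$ with $\xi$ a Bernoulli variable of parameter $p(n)G$. The numerical values of $n$, $p$ and $G$ are arbitrary, and the function names are hypothetical.

```python
from fractions import Fraction

def d_entry_law(n, p, g):
    """Two-point law of D_ij as a list of (value, probability) pairs.

    Assumes D_ij = (xi - p*g) / (n*p) with xi ~ Bernoulli(p*g).
    """
    q = p * g
    return [((1 - q) / (n * p), q), (-q / (n * p), 1 - q)]

def moment(law, k):
    """k-th moment of a finitely supported law."""
    return sum(prob * value ** k for value, prob in law)

n, p, g = 10, Fraction(1, 3), Fraction(3, 4)
law = d_entry_law(n, p, g)
mean = moment(law, 1)
second = moment(law, 2)
formula = p * g * (1 - p * g) / (n * p) ** 2   # right-hand side of (4.17)
bound = Fraction(1, 4 * n ** 2) / p ** 2       # 1 / (4 n^2 p^2)
print(mean == 0, second == formula, second <= bound)
```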

Let us fix any $n \in \mathbb{N}$. For arbitrary $n$-dimensional vectors $\mathbf{x}, \, \mathbf{y} \in [\!-\!1, 1]^n$, we have

\begin{align*} & \qquad \qquad \langle \mathbf{x}, (D^{(n)})^\top D^{(n)}\mathbf{y} \rangle = \sum_{i=1}^n \big(D^{(n)} \mathbf{x}\big)_i \big(D^{(n)} \mathbf{y} \big)_i \\[5pt] &= \sum_{i=1}^n \sum_{j=1}^n \sum_{\substack{k=1 \\ k \neq j}}^n D^{(n)}_{i, j} D^{(n)}_{i, k} x_j y_k + \sum_{i=1}^n \sum_{j=1}^n \big( D^{(n)}_{i, j} \big)^2 x_j y_j \\[5pt] &\le \sum_{i=1}^n \sum_{j=1}^n \sum_{\substack{k=1 \\ k \neq j}}^n D^{(n)}_{i, j} D^{(n)}_{i, k} x_j y_k + \sum_{i=1}^n \sum_{j=1}^n \Big( \big( D^{(n)}_{i, j} \big)^2 x_j y_j - \mathbb{E}\big[ \big(D^{(n)}_{i, j}\big)^2 x_j y_j \big] \Big) \\[5pt] & \qquad \qquad \qquad \qquad \qquad \qquad + \sum_{i=1}^n \sum_{j=1}^n \frac{\vert x_j y_j \vert}{4n^2p(n)^2}.\end{align*}

Thus, for fixed arbitrary $\eta > 0$ we have

(4.18) \begin{align} \mathbb{P} \bigg[ \frac{\Vert (D^{(n)})^\top D^{(n)} \Vert_{\infty \rightarrow 1}}{n} > \eta \bigg] &\le \mathbb{P} \Bigg[ \frac{1}{n} \sum_{i=1}^n \sum_{j=1}^n \sum_{\substack{k=1 \\ k \neq j}}^n D^{(n)}_{i, j} D^{(n)}_{i, k} x_j y_k > \frac{\eta}{3} \Bigg] \nonumber \\[5pt] &+ \mathbb{P} \Bigg[ \frac{1}{n}\sum_{i=1}^n \sum_{j=1}^n \Big( \big( D^{(n)}_{i, j} \big)^2 x_j y_j - \mathbb{E}\big[ \big(D^{(n)}_{i, j}\big)^2 x_j y_j \big] \Big) > \frac{\eta}{3} \Bigg] \nonumber \\[5pt] &+ \mathbb{1}_{\{\frac{1}{4np(n)^2} > \frac{\eta}{3}\}} \;=\!:\; P_1 + P_2 + P_3. \end{align}

By Assumption 3.1, there exists $N \in \mathbb{N}$ such that $P_3$ vanishes for every $n \ge N$. We now bound $P_1$ and $P_2$. Using the distribution

\begin{equation*} D^{(n)}_{i, j} = \begin{cases} \frac{1-p(n) G(i/n, j/n)}{np(n)} \le \frac{1}{np(n)} &\text{ with probability } p(n)G(i/n, j/n), \\[5pt] -\frac{1}{n}G(i/n, j/n) \ge - \frac{1}{n} &\text{ with probability } 1-p(n)G(i/n, j/n), \end{cases}\end{equation*}

for each $i, j \in [n]$, we derive for $P_1$

\begin{align*} P_1 &\le \sum_{i=1}^n \mathbb{P} \!\left[ \sum_{j=1}^n \sum_{\substack{k=1 \\ k \neq j}}^n D^{(n)}_{i, j} D^{(n)}_{i, k} x_j y_k > \frac{\eta}{3} \right] \le \sum_{i=1}^n \sum_{j=1}^n \mathbb{P} \!\left[ \sum_{\substack{k=1 \\ k \neq j}}^n D^{(n)}_{i, j} D^{(n)}_{i, k} x_j y_k > \frac{\eta}{3n} \right] \\[5pt] & = \sum_{i=1}^n \sum_{j=1}^n \mathbb{E} \!\left[ \mathbb{P} \!\left( \sum_{\substack{k=1 \\ k \neq j}}^n D^{(n)}_{i, j} D^{(n)}_{i, k} x_j y_k > \frac{\eta}{3n} \Big\vert D^{(n)}_{i, j} \right) \right] \\[5pt] & \le \sum_{i=1}^n \sum_{j=1}^n \!\left( \mathbb{P} \!\left[ \sum_{\substack{k=1 \\ k \neq j}}^n D^{(n)}_{i, k} x_j y_k > \frac{\eta p(n)}{3} \right] + \mathbb{P} \!\left[ \sum_{\substack{k=1 \\ k \neq j}}^n D^{(n)}_{i, k} x_j y_k< -\frac{\eta}{3} \right] \right).\end{align*}

The summands $D^{(n)}_{i, k} x_j y_k$ in the last two probabilities are independent zero-mean random variables bounded above by $1/(np(n))$, bounded below by $-1/n$, and satisfying

\begin{equation*} \sum_{\substack{k=1 \\ k \neq j}}^n \mathbb{E} \big[ (D^{(n)}_{i, k} x_j y_k)^2 \big] \le \frac{n-1}{4n^2p(n)^2} \le \frac{1}{4np(n)^2},\end{equation*}

from (4.17). From Bernstein’s inequality (Lemma 2.10), we have

\begin{align*} \mathbb{P} \!\left[ \sum_{\substack{k=1 \\ k \neq j}}^n D^{(n)}_{i, k} x_j y_k > \frac{\eta p(n)}{3} \right] &\le \exp\!\bigg(\!-\! \frac{2np(n)^4 \eta^2}{9+ 4\eta p(n)^2} \bigg), \\[5pt] \mathbb{P} \!\left[ -\sum_{\substack{k=1 \\ k \neq j}}^n D^{(n)}_{i, k} x_j y_k > \frac{\eta}{3} \right] &\le \exp\!\bigg(\!-\! \frac{2np(n)^2 \eta^2}{9+ 4\eta p(n)^2} \bigg);\end{align*}

thus

(4.19) \begin{equation} P_1 \le 2n^2 \exp\!\bigg(\!-\! \frac{2np(n)^4 \eta^2}{9+ 4\eta p(n)^2} \bigg).\end{equation}
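The two exponents above can be reproduced in exact arithmetic. The sketch assumes that Lemma 2.10 is the classical Bernstein inequality $\mathbb{P}[S > t] \le \exp\big(\!-\!(t^2/2)/(v + Mt/3)\big)$, with $v$ a bound on the sum of variances and $M$ an upper bound on the summands; it takes $M = 1/(np(n))$ for the upper tail and $M = 1/n$ for the lower tail, the choice that matches the displayed formulas. Parameter values are arbitrary and the function name is hypothetical.

```python
from fractions import Fraction

def bernstein_exponent(t, var_sum, m):
    """Exponent E in the classical Bernstein bound P[S > t] <= exp(-E)."""
    return (t * t / 2) / (var_sum + m * t / 3)

n, p, eta = 50, Fraction(1, 2), Fraction(3, 10)
var_sum = Fraction(1, 4 * n) / p ** 2            # bound 1 / (4 n p^2) on the variances

# Upper tail: threshold eta * p / 3, summands bounded above by 1 / (n p).
upper = bernstein_exponent(eta * p / 3, var_sum, Fraction(1, n) / p)
# Lower tail: threshold eta / 3, negated summands bounded above by 1 / n.
lower = bernstein_exponent(eta / 3, var_sum, Fraction(1, n))

denom = 9 + 4 * eta * p ** 2
print(upper == 2 * n * p ** 4 * eta ** 2 / denom)
print(lower == 2 * n * p ** 2 * eta ** 2 / denom)
```

Since $p(n) \le 1$, the first exponent is the smaller of the two, which is why it alone appears in (4.19).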

We now compute the bound for $P_2$. We have

(4.20) \begin{equation} P_2 \le \sum_{i=1}^n \mathbb{P} \Bigg[ \sum_{j=1}^n \Big( \big( D^{(n)}_{i, j} \big)^2 x_j y_j - \mathbb{E}\big[ \big(D^{(n)}_{i, j}\big)^2 x_j y_j \big] \Big) > \frac{\eta}{3} \Bigg],\end{equation}

and the summands $\big( D^{(n)}_{i, j} \big)^2 x_j y_j - \mathbb{E}\big[ \big(D^{(n)}_{i, j}\big)^2 x_j y_j \big]$ in the probability are independent zero-mean random variables bounded above by $5/(2np(n))^2$. Moreover, we easily obtain the bound $\mathbb{E}\big[ (D^{(n)}_{i, j})^4\big] \le 1/(np(n))^4$, and thus the sum of the variances of the summands satisfies

\begin{align*} \sum_{j=1}^n \mathbb{E} \bigg[ \Big( \big( D^{(n)}_{i, j} \big)^2 x_j y_j &- \mathbb{E}\big[ \big(D^{(n)}_{i, j}\big)^2 x_j y_j \big] \Big)^2 \bigg] \le \sum_{j=1}^n \mathbb{E} \bigg[ \Big( \big( D^{(n)}_{i, j} \big)^2 - \mathbb{E}\big[ \big(D^{(n)}_{i, j}\big)^2 \big] \Big)^2 \bigg] \\[5pt] & \le \sum_{j=1}^n \bigg( \mathbb{E} \big[ (D^{(n)}_{i, j})^4\big] + \Big(\mathbb{E} \big[ (D^{(n)}_{i, j})^2\big] \Big)^2 \bigg) \le \frac{17}{16n^3p(n)^4}.\end{align*}

Applying Bernstein’s inequality (Lemma 2.10) to each probability in (4.20) yields

(4.21) \begin{equation} P_2 \le n \exp\!\bigg(\!-\! \frac{8n^3p(n)^4 \eta^2}{153+ 20 np(n)^2\eta} \bigg).\end{equation}
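The constants in (4.21) follow from the same form of Bernstein's inequality, again assumed here to be the classical one with denominator $v + Mt/3$. The sketch below recomputes the variance bound $17/(16n^3p(n)^4)$ and the exponent in exact arithmetic for arbitrary parameter values. The function name is hypothetical.

```python
from fractions import Fraction

def bernstein_exponent(t, var_sum, m):
    """Exponent E in the classical Bernstein bound P[S > t] <= exp(-E)."""
    return (t * t / 2) / (var_sum + m * t / 3)

n, p, eta = 50, Fraction(1, 2), Fraction(3, 10)

fourth_moment = 1 / (n * p) ** 4                       # bound on E[D^4]
second_moment_sq = (Fraction(1, 4 * n ** 2) / p ** 2) ** 2   # bound on (E[D^2])^2
var_sum = n * (fourth_moment + second_moment_sq)       # sum over j = 1..n
m_bound = Fraction(5, 4 * n ** 2) / p ** 2             # 5 / (2 n p)^2

exponent = bernstein_exponent(eta / 3, var_sum, m_bound)
print(var_sum == Fraction(17, 16 * n ** 3) / p ** 4)
print(exponent == 8 * n ** 3 * p ** 4 * eta ** 2 / (153 + 20 * n * p ** 2 * eta))
```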

Comparing the bounds of (4.19) and (4.21), modifying the value of N if necessary, and plugging these into (4.18), we obtain the result.

4.3.5. Proof of Theorem 3.6

The argument is similar to the proof of Theorem 3.4. The triangle inequality gives

(4.22) \begin{align} \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_{2} (L_{n, t}, \widetilde{\mu}_t) > a \bigg] \le \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_{2} (\bar{L}_{n, t}, \widetilde{\mu}_t) > \frac{a}{2} \bigg] + \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_{2} (L_{n, t}, \bar{L}_{n, t}) > \frac{a}{2} \bigg].\end{align}

Recalling the identity (4.7), applying Hölder’s inequality several times, and using the Lipschitz property, we obtain

\begin{align*} & \qquad \qquad \big\vert X^n_i(t) - \bar{X}^n_i(t) \big\vert^2 - 2t \int_0^t \Big\vert \sum_{j=1}^n D^{(n)}_{i, j} \phi\big( X^n_i(s), X^n_j(s) \big) \Big\vert^2 \, ds \\[5pt] & \le 2K^2t \int_0^t 2 \Bigg( \sum_{j=1}^n \bar{P}^{(n)}_{i, j} \Big( \big\vert X^n_i(s)-\bar{X}^n_i(s) \big\vert + \big\vert X^n_j(s)-\bar{X}^n_j(s) \big\vert \Big)^2 \Bigg) + 2\big\vert X^n_i(s)-\bar{X}^n_i(s) \big\vert^2 \, ds \\[5pt] & \le 2K^2t \int_0^t 6 \big\vert X^n_i(s)-\bar{X}^n_i(s) \big\vert^2 + \frac{2}{n} \sum_{j=1}^n \big\vert X^n_j(s)-\bar{X}^n_j(s) \big\vert^2 \, ds, \qquad \forall \, t \in [0, T].\end{align*}

For a fixed $t \in [0, T]$, by the continuity of $X^n_i(\!\cdot\!)-\bar{X}^n_i(\!\cdot\!)$, there exists $t_i \in [0, t]$ for each $i \in [n]$ satisfying $\Box(t) \;:\!=\; \frac{1}{n} \sum_{i=1}^n \Vert X^n_i - \bar{X}^n_i \Vert^2_{\star, t} = \frac{1}{n} \sum_{i=1}^n \big\vert X^n_i(t_i) - \bar{X}^n_i(t_i) \big\vert^2$. Combining this with the last inequality, we have

(4.23) \begin{equation} \Box(t) \le 2t \int_0^{t} \frac{1}{n} \sum_{i=1}^n \Big\vert \sum_{j=1}^n D^{(n)}_{i, j} \mathbb{1}_{[0, t_i]}(s) \phi\big( X^n_i(s), X^n_j(s) \big) \Big\vert^2 \, ds +16K^2t \int_0^t \Box(s) \, ds.\end{equation}

We recall the representations (4.11) and (4.13) and use Hölder’s inequality to derive for the first integral on the right-hand side

\begin{align*} \int_0^{t} \frac{1}{n} \sum_{i=1}^n &\Big\vert \sum_{j=1}^n D^{(n)}_{i, j} \mathbb{1}_{[0, t_i]}(s) \phi\big( X^n_i(s), X^n_j(s) \big) \Big\vert^2 \, ds \\[5pt] &\le d \int_0^{t} \frac{1}{n} \sum_{i=1}^n \sum_{k=1}^d \bigg\vert \sum_{j=1}^n \int_{\mathbb{R}^{2d}} \big\vert D^{(n)}_{i, j} \mathbb{1}_{[0, t_i]}(s) a^k_i(z, s) b^k_j(z, s) \big\vert \, m_{\phi_k}(dz) \bigg\vert^2 \, ds \\[5pt] & \le d \int_0^{t} \frac{1}{n} \sum_{i=1}^n \sum_{k=1}^d \Vert m_{\phi_k} \Vert_{TM} \int_{\mathbb{R}^{2d}} \big\vert \big(\mathbf{a}^k(z, s)\big)_i \, \big( D^{(n)} \mathbf{b}^k(z, s)\big)_i \big\vert^2 \, m_{\phi_k}(dz) \, ds \\[5pt] & \le d \int_0^{t} \sum_{k=1}^d \Vert m_{\phi_k} \Vert_{TM} \int_{\mathbb{R}^{2d}} \frac{1}{n} \sum_{i=1}^n \big\vert \big( D^{(n)} \mathbf{b}^k(z, s)\big)_i \big\vert^2 \, m_{\phi_k}(dz) \, ds \\[5pt] & \le \max_{1 \le k \le d} \Vert m_{\phi_k} \Vert_{TM}^2 d^2 t \frac{\Vert (D^{(n)})^\top D^{(n)} \Vert_{\infty \rightarrow 1}}{n}.\end{align*}

In the last two inequalities, we used the fact that the $\ell^{\infty}$ norms of the two vectors $\mathbf{a}^k(z, s)$ and $\mathbf{b}^k(z, s)$ are bounded by 1. Thus, from (4.23), there exists a constant $K>0$ such that

\begin{equation*} \Box(t) \le Kt^2 \frac{\Vert (D^{(n)})^\top D^{(n)} \Vert_{\infty \rightarrow 1}}{n} + Kt \int_0^t \Box(s) ds\end{equation*}

holds for every $t \in [0, T]$, and applying Grönwall’s inequality gives

\begin{equation*} \Box(T) \le \frac{K\Vert (D^{(n)})^\top D^{(n)} \Vert_{\infty \rightarrow 1}}{n},\end{equation*}

where the constant $K$ now depends on $T$.

Since we have

\begin{equation*} \sup_{0 \le t \le T} W_2(L_{n, t}, \bar{L}_{n, t}) \le \sup_{0 \le t \le T} \sqrt{\frac{1}{n} \sum_{i=1}^n \big\vert X^n_i(t) - \bar{X}^n_i(t) \big\vert^2} \le \sqrt{\Box(T)},\end{equation*}

Lemma 3.1 shows that there exists $N \in \mathbb{N}$ such that the last probability in (4.22) has the bound

\begin{align*} \mathbb{P} \bigg[ \sup_{0 \le t \le T} W_{2} (L_{n, t}, \bar{L}_{n, t}) > \frac{a}{2} \bigg] &\le \mathbb{P} \Bigg[ \Box(T) > \frac{a^2}{4} \Bigg] \le \mathbb{P} \Bigg[ \frac{\Vert (D^{(n)})^\top D^{(n)} \Vert_{\infty \rightarrow 1}}{n} > \frac{a^2}{4K} \Bigg] \\[5pt] &\le 3n^2 \exp\!\bigg(\!-\! \frac{a^4 np(n)^4}{72K^2+8a^2K} \bigg),\end{align*}

for every $n \ge N$.
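The constants in the last exponent can be traced under the assumption that the exponent supplied by Lemma 3.1 has the form appearing in (4.19); the statement of the lemma is not reproduced in this section, so this is only a sketch. Substituting $\eta = a^2/(4K)$ and using $p(n) \le 1$ gives

\begin{equation*} \frac{2np(n)^4 \eta^2}{9 + 4\eta p(n)^2} \bigg\vert_{\eta = a^2/(4K)} = \frac{a^4 np(n)^4}{72K^2 + 8a^2Kp(n)^2} \ge \frac{a^4 np(n)^4}{72K^2 + 8a^2K}.\end{equation*}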

On the other hand, Theorem 3.5 gives the bound for the other probability in (4.22). By comparing the two bounds under Assumption 3.1(i), we obtain the assertion (3.9). The result (3.10) follows in the same way under Assumption 3.1(ii), by setting $p(n) \equiv 1$ and redefining the constants $\delta > 0$ and $N \in \mathbb{N}$ appropriately.

Funding information

E. Bayraktar is supported in part by the National Science Foundation under the grant DMS-2106556 and by the Susan M. Smith Professorship.

Competing interests

There were no competing interests to declare which arose during the preparation or publication process of this article.

References

[1] Aurell, A., Carmona, R. and Laurière, M. (2022). Stochastic graphon games: II. The linear-quadratic case. Appl. Math. Optimization 85, article no. 39.
[2] Bayraktar, E., Chakraborty, S. and Wu, R. (2023). Graphon mean field systems. Ann. Appl. Prob. 33, 3587–3619.
[3] Bayraktar, E. and Wu, R. (2022). Stationarity and uniform in time convergence for the graphon particle system. Stoch. Process. Appl. 150, 532–568.
[4] Bayraktar, E. and Wu, R. (2023). Graphon particle system: uniform-in-time concentration bounds. Stoch. Process. Appl. 156, 196–225.
[5] Bayraktar, E., Wu, R. and Zhang, X. (2023). Propagation of chaos of forward–backward stochastic differential equations with graphon interactions. Appl. Math. Optimization 88, article no. 25.
[6] Bet, G., Coppini, F. and Nardi, F. R. (2023). Weakly interacting oscillators on dense random graphs. J. Appl. Prob. (published online).
[7] Bhamidi, S., Budhiraja, A. and Wu, R. (2019). Weakly interacting particle systems on inhomogeneous random graphs. Stoch. Process. Appl. 129, 2174–2206.
[8] Bobkov, S. and Götze, F. (1999). Exponential integrability and transportation cost related to logarithmic Sobolev inequalities. J. Funct. Anal. 163, 1–28.
[9] Boucheron, S., Lugosi, G. and Massart, P. (2013). Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press.
[10] Caines, P. E. and Huang, M. (2018). Graphon mean field games and the GMFG equations. In 2018 IEEE Conference on Decision and Control (CDC), Institute of Electrical and Electronics Engineers, Piscataway, NJ, pp. 4129–4134.
[11] Caines, P. E. and Huang, M. (2021). Graphon mean field games and their equations. SIAM J. Control Optimization 59, 4373–4399.
[12] Carmona, R. A., Cooney, D. B., Graves, C. V. and Laurière, M. (2021). Stochastic graphon games: I. The static case. Math. Operat. Res. 47, 750–778.
[13] Coppini, F. (2022). Long time dynamics for interacting oscillators on graphs. Ann. Appl. Prob. 32, 360–391.
[14] Coppini, F. (2022). A note on Fokker–Planck equations and graphons. J. Statist. Phys. 187, article no. 15.
[15] Delarue, F. (2017). Mean field games: a toy model on an Erdös–Renyi graph. ESAIM Proc. Surveys 60, 1–26.
[16] Delarue, F., Lacker, D. and Ramanan, K. (2020). From the master equation to mean field game limit theory: large deviations and concentration of measure. Ann. Prob. 48, 211–263.
[17] Delattre, S., Giacomin, G. and Luçon, E. (2016). A note on dynamical models on random graphs and Fokker–Planck equations. J. Statist. Phys. 165, 785–798.
[18] Djellout, H., Guillin, A. and Wu, L. (2004). Transportation cost-information inequalities and applications to random dynamical systems and diffusions. Ann. Prob. 32, 2702–2732.
[19] Fournier, N. and Guillin, A. (2015). On the rate of convergence in Wasserstein distance of the empirical measure. Prob. Theory Relat. Fields 162, 707–738.
[20] Gao, S., Caines, P. E. and Huang, M. (2021). LQG graphon mean field games: graphon invariant subspaces. In 2021 60th IEEE Conference on Decision and Control (CDC), Institute of Electrical and Electronics Engineers, Piscataway, NJ, pp. 5253–5260.
[21] Gao, S., Tchuendom, R. F. and Caines, P. E. (2021). Linear quadratic graphon field games. Commun. Inf. Systems 21, 341–369.
[22] Gozlan, N. and Léonard, C. (2007). A large deviation approach to some transportation cost inequalities. Prob. Theory Relat. Fields 139, 235–283.
[23] Guédon, O. and Vershynin, R. (2016). Community detection in sparse networks via Grothendieck’s inequality. Prob. Theory Relat. Fields 165, 1025–1049.
[24] Karatzas, I. and Shreve, S. E. (1991). Brownian Motion and Stochastic Calculus, 2nd edn. Springer, New York.
[25] Lacker, D. and Soret, A. (2021). A case study on stochastic games on large graphs in mean field and sparse regimes. Math. Operat. Res. 47, 1530–1565.
[26] Luçon, E. (2020). Quenched asymptotics for interacting diffusions on inhomogeneous random graphs. Stoch. Process. Appl. 130, 6783–6842.
[27] Oliveira, R. and Reis, G. (2019). Interacting diffusions on random graphs with diverging average degrees: hydrodynamics and large deviations. J. Statist. Phys. 176, 1057–1087.
[28] Parise, F. and Ozdaglar, A. E. (2023). Graphon games: a statistical framework for network games and interventions. Econometrica 91, 191–225.
[29] Sznitman, A.-S. (1991). Topics in propagation of chaos. In École d’Été de Probabilités de Saint-Flour XIX—1989, Springer, Berlin, Heidelberg, pp. 165–251.
[30] Tchuendom, R. F., Caines, P. E. and Huang, M. (2020). On the master equation for linear quadratic graphon mean field games. In 2020 59th IEEE Conference on Decision and Control (CDC), Institute of Electrical and Electronics Engineers, Piscataway, NJ, pp. 1026–1031.
[31] Tchuendom, R. F., Caines, P. E. and Huang, M. (2021). Critical nodes in graphon mean field games. In 2021 60th IEEE Conference on Decision and Control (CDC), Institute of Electrical and Electronics Engineers, Piscataway, NJ, pp. 166–170.
[32] Üstünel, A. (2012). Transportation cost inequalities for diffusions under uniform distance. In Stochastic Analysis and Related Topics, Springer, Berlin, Heidelberg, pp. 203–214.
[33] Vasal, D., Mishra, R. and Vishwanath, S. (2021). Sequential decomposition of graphon mean field games. In 2021 American Control Conference (ACC), Institute of Electrical and Electronics Engineers, Piscataway, NJ, pp. 730–736.