1. Introduction
Hypergraphs are useful for modelling relationships between objects in a complex discrete system, and can offer improvements over graph models in areas such as ecology [Reference Golubski, Westlund, Vandermeer and Pascual10], quantum computing [Reference Morimae, Takeuchi and Hayashi24] and computer vision [Reference Purkait, Chin, Sadri and Suter26]. A hypergraph $H=(V,E)$ consists of a finite set $V$ of vertices and a finite set $E$ of edges, where each edge is a subset of the vertex set. Here edges do not contain repeated vertices, and there are no repeated edges. A hypergraph is $r$ -uniform if every edge contains $r$ vertices. We present an asymptotic enumeration formula for the number of $r$ -uniform hypergraphs with a specified degree sequence, where the degree of a vertex is the number of edges containing it. Our formula holds for $3\leq r\leq \frac 12 n$ and $nr^4\log n\ll d\leq \frac 12\binom{n-1}{r-1}$ , where $d$ is the average degree, under very weak restrictions on how much the degrees can vary. By symmetry, the ranges obtained by complementing the edge set and/or complementing each edge are also covered. Using this formula, we establish some results on the degree sequence of a random $r$ -uniform hypergraph with a given number of edges, verifying a conjecture of Kamčev, Liebenau and Wormald [Reference Kamčev, Liebenau and Wormald15] for our parameter range.
To be more precise, we must introduce some notation. Let $[a]$ denote the set $\{1,2,\ldots, a\}$ for any positive integer $a$ . For infinitely many natural numbers $n$ , let $r(n)$ satisfy $3\leq r(n)\leq n-3$ and let ${\boldsymbol{{d}}}(n) = \bigl (d_1(n),\ldots, d_n(n)\bigr )$ be a sequence of positive integers. We simply write $r$ for $r(n)$ , and similarly for other notation. We assume that for infinitely many $n$ ,
All asymptotics in the paper are as $n$ tends to infinity, along values for which (1.1) holds. Define $\mathcal{H}_r(\boldsymbol{{d}})$ to be the set of simple $r$ -uniform hypergraphs with vertex set $V=\{1,2,\ldots, n\}$ and degree sequence $\boldsymbol{{d}}$ . Write $e({\boldsymbol{{d}}})\,:\!=\, \frac{1}{r} \sum _{j\in [n]} d_j$ for the number of edges and $d\,:\!=\, d({\boldsymbol{{d}}})= \frac{1}{n} \sum _{j\in [n]} d_j$ for the average degree.
Our first aim is to find an asymptotic expression for ${H_r(\boldsymbol{{d}})} = |{\mathcal{H}_r(\boldsymbol{{d}})}|$ for degree sequences $\boldsymbol{{d}}$ which are neither too dense nor too sparse.
Our approach to hypergraph enumeration is based on the complex-analytical method. The answer is expressed in terms of high-dimensional integrals resulting from Fourier inversion applied to a multivariable generating function. Then, these integrals are approximated using multidimensional variants of the saddle-point method; see Section 2 for more details. In the context of combinatorial enumeration, this method was pioneered by McKay and Wormald in 1990 [Reference McKay and Wormald21]. Since then, many other applications of this method have appeared; see for example [Reference Canfield, Greenhill and McKay4, Reference Canfield, Gao, Greenhill, McKay and Robinson5, Reference McKay and McLeod20], and the many results cited in [Reference Isaev and McKay13]. In particular, Kuperberg, Lovett and Peled [Reference Kuperberg, Lovett and Peled17] prove an asymptotic formula for the number of $r$ -uniform $d$ -regular hypergraphs on $n$ vertices which holds when the number of edges in the hypergraph and its complement are each at least $n^c$ (which implies that $r\gt c$ ) for some sufficiently large constant $c$ which is not identified explicitly.
Recently, Isaev and McKay [Reference Isaev and McKay13] developed a general theory based on complex martingales for estimating the high-dimensional integrals which arise from the complex-analytical method. In this paper, we apply tools from [Reference Isaev and McKay13] in the hypergraph setting.
For a survey of enumeration results for graphs with given degrees, see Wormald [Reference Wormald, Sirakov, de Souza and Viana29]. Here we discuss only $r$ -uniform hypergraphs with $r\geq 3$ . Dudek, Frieze, Ruciński and Šileikis [Reference Dudek, Frieze, Ruciński and Šileikis7] gave an asymptotic formula for the number of $d$ -regular $r$ -uniform hypergraphs on $n$ vertices when $r\geq 3$ is constant, assuming that $d=o(n^{1/2})$ . Building on [Reference Blinovsky and Greenhill2], Blinovsky and Greenhill [Reference Blinovsky and Greenhill3, Corollary 2.3] gave an asymptotic formula for $H_r(\boldsymbol{{d}})$ that holds when the maximum degree $d_{\max }$ satisfies $r^4 d_{\max }^3 = o(nd)$ . These results were obtained using the switching method.
By adapting the “degree switching and contraction mapping” approach of [Reference Liebenau and Wormald18, Reference Liebenau and Wormald19], Kamčev, Liebenau and Wormald [Reference Kamčev, Liebenau and Wormald15, Theorem 1.2] proved that the degree sequence of a randomly chosen $r$ -uniform hypergraph with $m$ edges is closely related to a random vector with entries chosen from suitable independent binomial distributions, conditioned on the entries of the vector having sum $nd$ . More precisely, they prove that the ratio of the probabilities of a particular vector $\boldsymbol{{d}}$ in these two models is well-approximated by a simple function of $r$ and $\boldsymbol{{d}}$ . We will restate their theorem as Theorem 1.6 below. This result holds under the assumptions that the degrees do not vary too much, the edge size is not too large and the average degree is at most a sufficiently small constant times $\frac 1r\binom{n-1}{r-1}$ . Kamčev, Liebenau and Wormald also considered sparse degree sequences in [Reference Kamčev, Liebenau and Wormald15, Theorem 1.3], which subsumes the enumeration results of [Reference Blinovsky and Greenhill2, Reference Dudek, Frieze, Ruciński and Šileikis7].
Our second aim is to apply our enumeration formula to study the degree sequence of random uniform hypergraphs with a given number of edges. In particular, we prove a companion result to [Reference Kamčev, Liebenau and Wormald15, Theorem 1.2] which allows larger edge size, more edges and more variation between the degrees, when the average degree is large enough. Furthermore, we verify (for our range of parameters) a conjecture made in [Reference Kamčev, Liebenau and Wormald15], showing that vectors of independent hypergeometric random variables, conditioned on having sum $nd$ , closely match the degree sequence of a random uniform hypergraph with $nd/r$ edges almost everywhere.
1.1 Notation, assumptions and our general results
Define the density $\lambda$ as a function of $n$ , $r$ and the average degree $d$ by
Write $\mathcal{S}_r(n)$ to denote the set of all subsets of $[n]$ of size $r$ . Given a vector $\boldsymbol{\beta } = (\beta _1,\ldots, \beta _n)\in{\mathbb{R}}^n$ , for all $W\in{\mathcal{S}_r(n)}$ define
Note that $\lambda _W(\boldsymbol{\beta })$ is the probability that the edge $W$ appears in the $\beta$ -model for hypergraphs with given degrees; see, for example, [Reference Stasi, Sadeghi, Rinaldo, Petrović and Fienberg27]. Let $\lambda (\boldsymbol{\beta })$ be the average value of the $\lambda _W(\boldsymbol{\beta })$ ; that is,
Observe that $\lambda _W(\boldsymbol{\beta }),\, \lambda (\boldsymbol{\beta }) \in (0,1)$ .
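Under the logistic form of the $\beta$ -model, $\lambda _W(\boldsymbol{\beta }) = e^{\sum _{j\in W}\beta _j}\big/\bigl (1+e^{\sum _{j\in W}\beta _j}\bigr )$ , these quantities can be computed by brute force for small $n$ . (The displayed definition (1.3) is not reproduced above, so this logistic form is an assumption on our part, consistent with the $\beta$ -model reference and with the manipulations in the proof of Lemma 3.1.)

```python
from itertools import combinations
from math import exp, comb

def lam_W(beta, W):
    # Assumed logistic edge probability of the beta-model:
    # lambda_W(beta) = e^{sum_{j in W} beta_j} / (1 + e^{sum_{j in W} beta_j}).
    s = sum(beta[j] for j in W)
    return exp(s) / (1.0 + exp(s))

def lam_avg(beta, r):
    # lambda(beta): the average of lambda_W(beta) over all W in S_r(n).
    n = len(beta)
    return sum(lam_W(beta, W) for W in combinations(range(n), r)) / comb(n, r)

# With beta = 0 every edge probability is 1/2, hence so is the average,
# and both quantities lie strictly between 0 and 1 as observed above.
n, r = 6, 3
beta0 = [0.0] * n
```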
Define the positive definite symmetric $n\times n$ matrix $A(\boldsymbol{\beta })=(a_{jk})$ as follows:
We use $|M|$ to denote the determinant of a matrix $M$ .
Let $\boldsymbol{\beta }^{\ast }\in{\mathbb{R}}^n$ be a solution to the system of equations
Summing (1.5) over $j\in [n]$ gives
This shows that $\lambda (\boldsymbol{\beta }^{\ast })$ equals the density $\lambda$ defined in (1.2), so writing $\lambda$ without an argument is unambiguous. Similarly, if we write $\lambda _W$ or $A$ without an argument, we always mean that the argument is $\boldsymbol{\beta }^{\ast }$ .
Our main enumeration result is the following.
Theorem 1.1. Let ${\boldsymbol{{d}}} ={\boldsymbol{{d}}}(n)= (d_1,\ldots, d_n)$ be a degree sequence. Suppose that $r=r(n)$ satisfies $3\leq r\leq n-3$ and
Further assume that $\boldsymbol{\beta }^{\ast }=\left(\beta _1^\ast,\ldots, \beta _n^\ast \right)$ is a solution of ( 1.5 ) such that
Let $\lambda _W=\lambda _W(\boldsymbol{\beta }^{\ast })$ be defined as in ( 1.3 ), for all $W\in{\mathcal{S}_r(n)}$ , and let $A=A(\boldsymbol{\beta }^{\ast })$ be defined as in ( 1.4 ). Then
where
The implicit constant in the $O(\varepsilon )$ term depends only on the implicit constant in ( 1.8 ).
The enumeration problem has two natural symmetries: given a hypergraph, we may replace every edge by its complement, or we may take the complement of the edge set. These symmetries show that for a given degree sequence $\boldsymbol{{d}}$ ,
where
Using these symmetries, we may assume that
When these inequalities are both satisfied, we say that $(r,{\boldsymbol{{d}}})$ belongs to the first quadrant.
The conditions in Theorem 1.1 are invariant under these two symmetries. It is true, but not obvious, that the asymptotic formula in Theorem 1.1 is also invariant under these symmetries. We prove this in Lemma 1.2 below.
Lemma 1.2. Suppose that $\boldsymbol{\beta }^{\ast }$ is a solution to ( 1.5 ). Let $\boldsymbol{\beta }^{\prime}$ , $\widetilde{\boldsymbol{\beta }}$ , $\widetilde{\boldsymbol{\beta }}^{\prime}$ be vectors with entries $\beta^{\prime}_j$ , $\widetilde \beta _j$ , $\widetilde \beta^{\prime}_j$ defined in the fourth row of Table 1 for all $j\in [n]$ . Then $\boldsymbol{\beta }^{\prime}$ , $\widetilde{\boldsymbol{\beta }}$ , $\widetilde{\boldsymbol{\beta }}^{\prime}$ are solutions of ( 1.5 ) for the degree sequences ${\boldsymbol{{d}}}^{\prime}$ , $\widetilde{\boldsymbol{{d}}}$ and $\widetilde{\boldsymbol{{d}}}^{\prime}$ defined in ( 1.10 ), respectively. Furthermore, the following relationships hold:
For the reader’s convenience, in Table 1 we summarise information about our parameters under these symmetries.
It follows from (1.9) and Lemma 1.2 that it suffices to prove Theorem 1.1 when $(r,{\boldsymbol{{d}}})$ belongs to the first quadrant. In this case, using (1.6) the assumptions of Theorem 1.1 become
and the error term becomes
Here we use the fact that $\lambda (1-\lambda )\binom{n}{r}$ is a lower bound on the number of edges of any hypergraph in $\mathcal{H}_r({\boldsymbol{{d}}})$ and its complement. The following lemma provides sufficient conditions on $r$ and $\boldsymbol{{d}}$ which guarantee the existence of solutions to (1.5).
Lemma 1.3. Let $(r,{\boldsymbol{{d}}})$ belong to the first quadrant. Assume that there exists $\varDelta \geq 0$ such that for all $j \in [n]$ ,
Further, assume that one of the following two conditions holds:

- (i) $\varDelta \leq \varDelta _0$ for some sufficiently small constant $\varDelta _0\gt 0$ ;
- (ii) $ r d = o(1) \binom{n-1}{r-1}$ , $r = o(n)$ , and $\varDelta = \Theta (1)$ .
Then there exists $\boldsymbol{\beta }^{\ast }$ satisfying ( 1.5 ) such that $ \max _{j,k\in [n]} \big|\beta ^\ast _j-\beta ^\ast _k\big|= O(\varDelta/r).$
Uniqueness of the solution is a feature of similar situations [Reference Bishop, Fienberg and Holland1, Section 3.3.4], but we have not found a proof of uniqueness for our setting in the literature. For completeness, we provide a short proof.
Lemma 1.4. For a given degree sequence $\boldsymbol{{d}}$ , the solution $\boldsymbol{\beta }^{\ast }$ to (1.5) is unique if it exists.
Even though (1.5) does not have an explicit solution in general, we can evaluate the formula in Theorem 1.1 accurately if we have a sufficiently precise estimate of $\boldsymbol{\beta }^{\ast }$ . Stasi, Sadeghi, Rinaldo, Petrović and Fienberg [Reference Stasi, Sadeghi, Rinaldo, Petrović and Fienberg27] stated without proof a generalisation of an algorithm of [Reference Chatterjee, Diaconis and Sly6] that converges geometrically to $\boldsymbol{\beta }^{\ast }$ if it exists. Though we do not use the iteration from [Reference Stasi, Sadeghi, Rinaldo, Petrović and Fienberg27], we will demonstrate how the precision to which an estimate of $\boldsymbol{\beta }^{\ast }$ satisfies (1.5) can be used to validate the corresponding estimate of $H_r({\boldsymbol{{d}}})$ . Our example will be degree sequences that are not far from regular, which will allow us to investigate the degree sequences of random hypergraphs.
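As an illustration (this is not the iteration of [Reference Stasi, Sadeghi, Rinaldo, Petrović and Fienberg27], and we again assume the logistic form of $\lambda _W$ ), a naive damped fixed-point scheme for the system $d_j = \sum _{W\ni j}\lambda _W(\boldsymbol{\beta })$ can be sketched by brute force over all $r$ -subsets. For the regular sequence below the exact solution is $\boldsymbol{\beta }^{\ast }=\boldsymbol{0}$ , since then every $\lambda _W$ equals $d\binom{n-1}{r-1}^{-1}=\tfrac 12$ .

```python
from itertools import combinations
from math import exp, log

def lam_W(beta, W):
    # Assumed logistic form of the beta-model edge probability.
    s = sum(beta[j] for j in W)
    return exp(s) / (1.0 + exp(s))

def solve_beta(d, r, iters=5000, tol=1e-10):
    """Heuristic damped fixed-point iteration for d_j = sum_{W: j in W} lambda_W(beta).
    A naive illustration only, not the algorithm of Stasi et al."""
    n = len(d)
    subsets = list(combinations(range(n), r))
    beta = [0.0] * n
    for _ in range(iters):
        s = [0.0] * n
        for W in subsets:
            lw = lam_W(beta, W)
            for j in W:
                s[j] += lw
        resid = max(abs(s[j] - d[j]) for j in range(n))
        if resid < tol:
            break
        # Damped multiplicative correction; dividing by r spreads the
        # adjustment over the r coordinates appearing in each edge sum.
        beta = [beta[j] + (log(d[j]) - log(s[j])) / r for j in range(n)]
    return beta, resid

# Regular case: n = 6, r = 3, all degrees 5, so lambda = 5 / C(5,2) = 1/2.
beta, resid = solve_beta([5, 5, 5, 5, 5, 5], 3)
```

The residual `resid` measures how precisely the returned estimate satisfies (1.5), which is exactly the kind of certificate discussed above.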
For $j\in [n]$ define $\delta _j\,:\!=\,d_j-d$ . Define $\boldsymbol{\delta }\,:\!=\,(\delta _1,\ldots,\delta _n)$ and $\delta _{\mathrm{max}}\,:\!=\,\max \{\|\boldsymbol{\delta }\|_\infty,1\}$ . Also define $R_q\,:\!=\,\sum _{j=1}^n \delta _j^q$ for $q\ge 0$ and note that $R_1=0$ .
Recall the definition of $\lambda$ from (1.2). We will find it convenient to write some quantities in terms of the parameter $Q$ , which is invariant under the symmetries of (1.9):
We continue to use the error term of Theorem 1.1, which in terms of $Q$ is
Our criterion for being “near-regular” is
which in the first quadrant is equivalent to $\delta _{\mathrm{max}} = O(d^{3/5})$ .
Theorem 1.5. If $3\leq r\leq n-3$ and assumptions (1.7) and (1.13) hold, then
where $\hat \varepsilon \,:\!=\,\varepsilon + \delta _{\mathrm{max}} n^{3/5}Q^{-3/5}$ and $\varepsilon$ is defined in ( 1.12 ).
1.2 Degree sequences of random uniform hypergraphs
Assumption (1.13) is weak enough to include the degree sequences of random hypergraphs with high probability. Following the notation of Kamčev, Liebenau and Wormald [Reference Kamčev, Liebenau and Wormald15], we define three probability spaces of integer vectors. Formulas will be given in Section 7.
- $\mathcal{D}_r(n,m)$ is the probability space of degree sequences of uniformly random $r$ -uniform hypergraphs with $n$ vertices and $m$ edges.
- $\mathcal{B}_r(n,m)$ is the result of conditioning $n$ independent binomial variables $\text{Bin}\big(\binom{n-1}{r-1},p\big)$ on having sum $nd$ . (This distribution is independent of $p$ .)
- Note that each component of $\mathcal{D}_r(n,m)$ has a hypergeometric distribution. $\mathcal{T}_r(n,m)$ is the result of conditioning $n$ independent copies of that distribution on having sum $nd$ .
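For small parameters these spaces can be simulated directly. The sketch below (a simulation aid with function names of our own choosing, not used in any proof) samples a degree sequence from $\mathcal{D}_r(n,m)$ and, by naive rejection, from $\mathcal{B}_r(n,m)$ ; note that $nd = rm$ , so conditioning on sum $nd$ means conditioning on sum $rm$ .

```python
import random
from itertools import combinations
from math import comb

def sample_D(n, r, m, rng):
    # Degree sequence of a uniformly random r-uniform hypergraph with
    # m distinct edges: sample a uniform m-subset of S_r(n), then tally.
    edges = rng.sample(list(combinations(range(n), r)), m)
    deg = [0] * n
    for W in edges:
        for j in W:
            deg[j] += 1
    return deg

def sample_B(n, r, m, rng, p=0.5):
    # n independent Bin(C(n-1, r-1), p) variables conditioned on having
    # sum r*m; the conditioned law does not depend on p. Naive rejection.
    N = comb(n - 1, r - 1)
    while True:
        v = [sum(rng.random() < p for _ in range(N)) for _ in range(n)]
        if sum(v) == r * m:
            return v

rng = random.Random(0)
n, r, m = 6, 3, 10
dD = sample_D(n, r, m, rng)
dB = sample_B(n, r, m, rng)
```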
The most important previous result on the near-regular case was obtained by Kamčev, Liebenau and Wormald [Reference Kamčev, Liebenau and Wormald15]. All the overlap between [Reference Kamčev, Liebenau and Wormald15, Theorem 1.2] and Theorem 1.1 occurs in Theorem 1.5, so we restate their theorem here.
Theorem 1.6. ([Reference Kamčev, Liebenau and Wormald15, Theorem 1.2]). Fix $\varphi \in \big(\frac{4}{9},\frac 12\big)$ . For some constant $c\gt 0$ and every $C\gt 0$ , suppose that $3\le r\lt c n^{1/4}/\log n$ , $r^3 d^{1-3\varphi }\lt c$ and $\log ^C n \ll d \lt \frac cr\binom{n-1}{r-1}$ . Let $\boldsymbol{{d}}$ be a degree sequence with mean $d$ and $\delta _{\mathrm{max}}\le d^{1-\varphi }$ . Then
where
The conditions of Theorem 1.6 allow for much lower average degree than Theorem 1.1, but at the cost of stricter upper bounds on the edge size, the number of edges, and the variation between the degrees.
As can be seen, the relation between $\mathcal{D}_r(n,m)$ and $\mathcal{B}_r(n,m)$ becomes rapidly more distant as $r$ increases. Theorem 1.5 would allow a statement for all $r$ , but we prefer a statement that is more easily compared to Theorem 1.6. Note that our formula agrees with the expression given in Theorem 1.6 if $r=o(n^{1/2})$ , since then $((n-1)/(n-r))^{(n-1)/2} \sim e^{(r-1)/2}$ .
Theorem 1.7. Suppose that $3\le r\le cn$ and $0\lt \lambda \lt c$ for some fixed $c\lt 1$ . If $d\gg r^4n\log n$ and $\delta _{\mathrm{max}}=O(d^{3/5})$ then
where $\bar \varepsilon \,:\!=\,\varepsilon + \delta _{\mathrm{max}} d^{-3/5}$ and $\varepsilon$ is defined in ( 1.12 ).
As noted in [Reference Kamčev, Liebenau and Wormald15], one can expect $\mathcal{T}_r(n,m)$ to be a better match to $\mathcal{D}_r(n,m)$ , especially for large edge sizes. We prove this for the full range of our parameters.
Theorem 1.8. If $3\leq r\leq n-3$ and assumptions (1.7) and (1.13) hold, then
where $\hat \varepsilon \,:\!=\,\varepsilon + \delta _{\mathrm{max}} n^{3/5} Q^{-3/5}$ and $\varepsilon$ is defined in ( 1.12 ).
Kamčev, Liebenau and Wormald [Reference Kamčev, Liebenau and Wormald15] conjectured that $\mathcal{D}_r(n,m)$ is asymptotically equal to $\mathcal{T}_r(n,m)$ almost everywhere.
Conjecture 1.9. ([Reference Kamčev, Liebenau and Wormald15]). Let $2\le r\le n-2$ and $\min \{m,\binom nr-m\}=\omega (\!\log n)$ . Then there exists a set $\mathfrak{W}$ that has probability $1-O(n^{-\omega (1)})$ in both $\mathcal{D}_r(n,m)$ and $\mathcal{T}_r(n,m)$ , such that uniformly for all ${\boldsymbol{{d}}}\in \mathfrak{W}$ ,
We prove their conjecture for our range of parameters.
Theorem 1.10. If $3\leq r\leq n-3$ and assumption (1.7) holds, then there exists a set $\mathfrak{W}$ that has probability $1 - n^{-\Omega (\!\log n)}$ in both $\mathcal{D}_r(n,m)$ and $\mathcal{T}_r(n,m)$ , such that uniformly for all ${\boldsymbol{{d}}}\in \mathfrak{W}$ ,
1.3 Structure of the paper
Having now stated our main results, we describe the overall structure of the paper. In Section 2, we outline how $H_r({\boldsymbol{{d}}})$ can be expressed as an $n$ -dimensional integral and state the lemmas which lead to its evaluation. In Section 3 we prove some necessary bounds concerning the quantities $\lambda _W(\boldsymbol{\beta })$ and $A(\boldsymbol{\beta })$ , and then in Section 4 we apply them to evaluate the integral, completing the proof of our main enumeration result, Theorem 1.1. In Section 5 we address existence and uniqueness of solutions to (1.5), proving Lemma 1.3 and Lemma 1.4. Section 6 examines the near-regular case, proving Theorem 1.5. Then in Section 7 we prove our results about the degree sequence of random uniform hypergraphs, as stated in Section 1.2. Finally, Section 8 contains several technical proofs that have been deferred, including the proof of Lemma 1.2.
Some of the calculations in this paper are rather tedious, particularly in Sections 6 and 7. We carried out the worst of them first using the computer algebra package Maple and later checked them by hand. All infinite series are based on Taylor’s theorem and so have clear-cut truncation criteria.
2. Proof outline for Theorem 1.1
We will take advantage of Lemma 1.2 to work in the first quadrant, where the conditions of Theorem 1.1 are given by (1.11).
The number $H_r({\boldsymbol{{d}}})$ of simple $r$ -uniform hypergraphs with degree sequence ${\boldsymbol{{d}}} = (d_1,\ldots, d_n)$ can be expressed using a generating function, where the power of variable $x_j$ gives the degree of vertex $j$ for $j\in [n]$ . Each $W\in{\mathcal{S}_r(n)}$ will contribute a factor of $\prod _{j\in W} x_j$ , if $W$ is an edge in the hypergraph, or 1 if $W$ is not an edge. Using $\left[x_1^{d_1}\cdots x_n^{d_n}\right]$ to denote coefficient extraction, this gives
using Cauchy’s coefficient formula for the second line. Each integral is over a contour enclosing the origin. Recalling that $\boldsymbol{\beta }^{\ast }$ is a solution of (1.5), we choose the $j$ th contour to be a circle of radius $e^{\beta ^\ast _j}$ , for $j\in [n]$ . This choice leads to the expression
where the factor in front of the integral is given by
Let $F(\boldsymbol{\theta })$ denote the integrand, that is,
As we will see in Lemma 4.1, our choice of $\boldsymbol{\beta }^{\ast }$ ensures that the linear term in the expansion of $\log F(\boldsymbol{\theta })$ vanishes.
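The coefficient-extraction identity underlying this derivation can be sanity-checked by brute force for tiny parameters, comparing the coefficient of $x_1^{d_1}\cdots x_n^{d_n}$ in $\prod _{W\in{\mathcal{S}_r(n)}}\bigl (1+\prod _{j\in W} x_j\bigr )$ against direct enumeration of edge sets (a check of the setup, not part of the proof):

```python
from itertools import combinations

def H_by_coefficient(n, r, d):
    # Expand prod_W (1 + prod_{j in W} x_j) as a dict: degree tuple -> coeff.
    poly = {tuple([0] * n): 1}
    for W in combinations(range(n), r):
        new = dict(poly)  # the "1" branch keeps every existing monomial
        for mono, c in poly.items():
            bumped = list(mono)
            for j in W:
                bumped[j] += 1
            bumped = tuple(bumped)
            # prune monomials that already exceed the target degrees
            if all(bumped[j] <= d[j] for j in range(n)):
                new[bumped] = new.get(bumped, 0) + c
        poly = new
    return poly.get(tuple(d), 0)

def H_by_enumeration(n, r, d):
    # Count subsets of S_r(n) realising degree sequence d directly.
    subsets = list(combinations(range(n), r))
    count = 0
    for mask in range(1 << len(subsets)):
        deg = [0] * n
        for i, W in enumerate(subsets):
            if mask >> i & 1:
                for j in W:
                    deg[j] += 1
        count += deg == list(d)
    return count

n, r, d = 5, 3, (2, 2, 2, 2, 1)  # degree sum 9, so 3 edges
count_gf = H_by_coefficient(n, r, d)
count_bf = H_by_enumeration(n, r, d)
# the two counts agree
```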
The maximum value of $|F(\boldsymbol{\theta })|$ is 1, which is achieved if and only if $\sum _{j\in W}\theta _j\equiv 0\!\!\pmod{2\pi }$ for all $W\in{\mathcal{S}_r(n)}$ . If this condition holds then all $\theta _j$ must be equal modulo $2\pi$ , as can be seen by considering two $r$ -subsets $W$ , $W^{\prime}$ which differ in just one vertex and observing that such a pair of subsets exists for any pair of vertices. Hence there are precisely $r$ points where $F(\boldsymbol{\theta })$ is maximised in $({-}\pi,\pi ]^n$ , namely $\boldsymbol{\theta }^{(1)},\ldots, \boldsymbol{\theta }^{(r)}$ , where for $t\in [r]$ the point $\boldsymbol{\theta }^{(t)} = \big(\theta _1^{(t)},\ldots, \theta _n^{(t)}\big)$ is defined by
We will estimate the value of the integral first in the regions close to $\boldsymbol{\theta }^{(t)}$ , for some $t\in [r]$ , then for the remainder of the domain. Write $U_n(\rho )$ for the ball of radius $\rho$ around the origin, with respect to the infinity norm; that is,
and for $\rho \gt 0$ define the region $\mathcal{R}(\rho )$ as
Evaluation of the integral proceeds by the following sequence of lemmas, whose proofs are deferred to Section 4. The first two lemmas give an estimate of the value of the integral over $U_n\big(r^{-1}\big)$ , by providing an estimate over $\mathcal{R}\big(d^{-1/2}\log{n}\big)$ and $U_n\big(r^{-1}\big)\setminus \mathcal{R}\big(d^{-1/2}\log{n}\big)$ respectively.
Lemma 2.1. If assumptions (1.11) hold then
where $\varepsilon$ is given in ( 1.12 ).
Lemma 2.2. If assumptions (1.11) hold then
Define the regions $U^{(t)}$ for $t\in [r]$ by
Let $\mathcal{B}\,:\!=\, \cup _{t\in [r]} U^{(t)}$ . Since $F(\boldsymbol{\theta }^{(t)} + \boldsymbol{\theta }) = F(\boldsymbol{\theta })$ for all $\boldsymbol{\theta }\in U_n(\pi )$ , each of the regions $U^{(1)},\ldots, U^{(r)}$ makes an identical contribution to the integral. Lemmas 2.1 and 2.2 imply that under assumptions (1.11) we have
The integral in the region $U_n(\pi )\setminus \mathcal{B}$ is approximated in the next result.
Lemma 2.3. If assumptions (1.11) hold then
Continuing with the proof of Theorem 1.1, by combining Lemma 2.3 and (2.6) we obtain
We can express $P_r(\boldsymbol{\beta }^{\ast })$ in a more convenient form, as follows:
The proof of Theorem 1.1 in the first quadrant is completed by substituting (2.7) and (2.8) into (2.1). The full statement of Theorem 1.1 then follows from Lemma 1.2.
3. Properties of $A$ and other useful bounds
We will need to analyse the behaviour of $\lambda _W(\boldsymbol{\beta })$ , $\lambda (\boldsymbol{\beta })$ and $A(\boldsymbol{\beta })$ , not only when $\boldsymbol{\beta }$ is a solution of (1.5), but more generally. We also need
Recall that the elements of $A(\boldsymbol{\beta })$ are sums of terms of the form $\lambda _W(\boldsymbol{\beta })(1-\lambda _W(\boldsymbol{\beta }))$ . We start by establishing bounds on $\lambda _W(\boldsymbol{\beta })$ and $1-\lambda _W(\boldsymbol{\beta })$ .
Lemma 3.1. Denote by $f\,:\,{\mathbb{R}}^r\to{\mathbb{R}}$ the function
Let $\boldsymbol{{x}}$ , $\boldsymbol{{y}}$ satisfy $|x_i-y_i|\le \delta/r$ for some constant $\delta \ge 0$ , and define $p\,:\!=\,\big|\big\{j\,:\,x_j\neq y_j\big\}\big|$ . Then
Proof. First suppose that $p=1$ and without loss of generality assume $x_1\neq y_1$ . Then if $y_1\le x_1$ we have
where $X=\sum _{j=2}^r x_j=\sum _{j=2}^r y_j$ . Observe that $\frac{1+e^y}{1+e^x}\le e^{y-x}$ whenever $x\le y$ . Therefore when $y_1\gt x_1$ ,
As $\boldsymbol{{x}}$ and $\boldsymbol{{y}}$ are arbitrary vectors in ${\mathbb{R}}^r$ , by symmetry we also have
Similarly,
For arbitrary ${\boldsymbol{{x}}},{\boldsymbol{{y}}}$ , let ${\boldsymbol{{z}}}_0,\ldots,{\boldsymbol{{z}}}_p$ be a sequence of elements of ${\mathbb{R}}^r$ with ${\boldsymbol{{z}}}_0={\boldsymbol{{x}}}$ , ${\boldsymbol{{z}}}_{p}={\boldsymbol{{y}}}$ such that ${\boldsymbol{{z}}}_j$ and ${\boldsymbol{{z}}}_{j-1}$ differ in only one coordinate for $j=1,\ldots,p$ . Then
and the statement follows as there are exactly $p$ factors.
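Reading the (elided) conclusion of Lemma 3.1 as the sandwich $e^{-p\delta/r}f({\boldsymbol{{y}}})\le f({\boldsymbol{{x}}})\le e^{p\delta/r}f({\boldsymbol{{y}}})$ , and similarly for $1-f$ (an assumption on our part, chosen to match the one-factor-per-changed-coordinate structure of the proof and the $e^{\pm \delta }$ bound used in Lemma 3.2), a numerical spot check with the logistic $f({\boldsymbol{{x}}})=e^{\sum x_i}/(1+e^{\sum x_i})$ looks like this:

```python
import random
from math import exp

def f(x):
    # f(x) = e^{sum x_i} / (1 + e^{sum x_i}); this logistic form is an
    # assumption, consistent with the (1 + e^X) ratios in the proof.
    s = sum(x)
    return exp(s) / (1.0 + exp(s))

rng = random.Random(1)
r, delta = 4, 0.7
x = [rng.uniform(-1, 1) for _ in range(r)]
y = list(x)
p = 2                       # perturb exactly p coordinates by at most delta/r
for i in range(p):
    y[i] += rng.uniform(-delta / r, delta / r)

lo, hi = exp(-p * delta / r), exp(p * delta / r)
# claimed sandwich for f and, symmetrically, for 1 - f
ok_f = lo * f(y) <= f(x) <= hi * f(y)
ok_g = lo * (1 - f(y)) <= 1 - f(x) <= hi * (1 - f(y))
```

The check succeeds because both $\log f$ and $\log (1-f)$ are Lipschitz in $\sum _i x_i$ with constant at most 1, which is exactly the mechanism of the proof above.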
We will apply this lemma in two slightly different scenarios. First we compare $\lambda (\boldsymbol{\beta })$ to $\lambda (\widehat{\boldsymbol{\beta }})$ for two different vectors $\boldsymbol{\beta }$ and $\widehat{\boldsymbol{\beta }}$ .
Lemma 3.2. Let $\boldsymbol{\beta }$ and $\widehat{\boldsymbol{\beta }}$ satisfy $\max _{j\in [n]} \big|\beta _j-\widehat \beta _j\big|\leq \delta/r$ for some nonnegative constant $\delta$ . Then
Proof. By Lemma 3.1 we have for each $W\in{\mathcal{S}_r(n)}$ that $e^{-\delta }\lambda _W(\widehat{\boldsymbol{\beta }})\le \lambda _W(\boldsymbol{\beta })\le e^{\delta }\lambda _W(\widehat{\boldsymbol{\beta }})$ . The result follows from the definition of $\lambda (\boldsymbol{\beta })$ .
In the second application, we consider the ratios of $\lambda _W(\boldsymbol{\beta })$ and $\lambda _{W^{\prime}}(\boldsymbol{\beta })$ for $W,W^{\prime}\in{\mathcal{S}_r(n)}$ .
Lemma 3.3. Let $\boldsymbol{\beta }$ satisfy $\max _{j,k\in [n]} |\beta _j - \beta _k|\leq \delta/r$ for some nonnegative constant $\delta$ . Then
for all $W,W^{\prime}\in{\mathcal{S}_r(n)}$ . Hence
for all $W\in{\mathcal{S}_r(n)}$ .
Proof. Note that the $\beta _j$ terms corresponding to $j\in W\cap W^{\prime}$ appear in both $\lambda _W(\boldsymbol{\beta })$ and $\lambda _{W^{\prime}}(\boldsymbol{\beta })$ . Together with Lemma 3.1 this implies the first half of the statement. The bounds involving $\lambda (\boldsymbol{\beta })$ and $\varLambda (\boldsymbol{\beta })$ follow from the definitions of these quantities.
We use the previous result to deduce that $\lambda (\boldsymbol{\beta })$ and $\varLambda (\boldsymbol{\beta })$ have the same order of magnitude when $\lambda (\boldsymbol{\beta })$ is small enough.
Lemma 3.4. Let $\boldsymbol{\beta }$ satisfy $\max _{j,k\in [n]} |\beta _j-\beta _k|\leq \delta/r$ for a given nonnegative constant $\delta$ . If $\lambda (\boldsymbol{\beta })\le 7/8$ then
Proof. The upper bound holds as
Now consider the set $S=\left\{W \in{\mathcal{S}_r(n)}\,:\,\lambda _W(\boldsymbol{\beta })\gt \frac{15}{16}\right\}$ . First assume that $|S|\leq \frac{15}{16}\,|{\mathcal{S}_r(n)}|$ . Then
On the other hand if $|S|\gt \frac{15}{16}\, |{\mathcal{S}_r(n)}|$ , then
contradicting our assumption.
Now we turn to the matrix $A(\boldsymbol{\beta })$ and establish that the diagonal entries are relatively close to each other, and similarly for the off-diagonal entries.
Lemma 3.5. Let $\boldsymbol{\beta }$ satisfy $\max _{j,k\in [n]} |\beta _j-\beta _k|\leq \delta/r$ for some nonnegative constant $\delta$ . Then the entries of $A(\boldsymbol{\beta })=(a_{jk})$ satisfy
for any $j,k,j^{\prime},k^{\prime}\in [n]$ with $j\neq k$ and $j^{\prime}\neq k^{\prime}$ . Furthermore,
Proof. We start with the case when $j\neq k$ and $j^{\prime}\neq k^{\prime}$ . Let $S_{jk}=\{W\in{\mathcal{S}_r(n)}\,:\,W\supset \{j,k\}\}$ . Recall that
Both $S_{jk}$ and $S_{j^{\prime}k^{\prime}}$ contain exactly $\binom{n-2}{r-2}$ elements. We will show that there exists a bijection $\zeta \,:\,S_{jk}\rightarrow S_{j^{\prime}k^{\prime}}$ such that for every pair $(W,W^{\prime})$ with $W^{\prime}=\zeta (W)$ , we have
By Lemma 3.3, this follows if $|W\cap \zeta (W)|\ge r-2$ for all $W\in S_{jk}$ .
We can assume that either $\{j,k\}\cap \{j^{\prime},k^{\prime}\}=\emptyset$ or $j=j^{\prime}$ . Now consider the function $b\,:\,V\to V$ , which is the identity for every vertex in $V\setminus \{j,k,j^{\prime},k^{\prime}\}$ and switches $j$ with $j^{\prime}$ and $k$ with $k^{\prime}$ . This function can be extended to a function $\zeta \,:\,S_{jk}\to S_{j^{\prime}k^{\prime}}$ by assigning to each set $W\in S_{jk}$ the set $\{b(v) \,:\, v\in W\}$ . Clearly $b$ is a bijection and so is $\zeta$ , and $|W\cap \zeta (W)|\geq r-2$ for all $W\in S_{jk}$ , as required.
The remaining results follow as
completing the proof.
We also establish an upper bound on the determinant of $A(\boldsymbol{\beta })$ . It follows easily from the Matrix Determinant Lemma (see for example [Reference Meyer22, equation (6.2.3)]) that for any real numbers $a,b$ ,
where $I$ is the $n\times n$ identity matrix and $J$ is the $n\times n$ matrix with every entry equal to one.
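Concretely, the identity in question is $|aI+bJ| = a^{n-1}(a+nb)$ , which follows from the matrix determinant lemma because $J=\boldsymbol{1}\boldsymbol{1}^{\mathrm t}$ has rank one. A quick numerical confirmation:

```python
import numpy as np

def det_aI_bJ(a, b, n):
    # Closed form from the matrix determinant lemma:
    # det(aI + bJ) = a^(n-1) * (a + n*b), since J = 1 1^t is rank one.
    return a ** (n - 1) * (a + n * b)

n, a, b = 7, 2.5, -0.3
M = a * np.eye(n) + b * np.ones((n, n))
# numerical determinant of M matches the closed form
```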
Lemma 3.6. Let $\boldsymbol{\beta }$ satisfy $\max _{j,k\in [n]} |\beta _j-\beta _k|\leq \delta/r$ for some nonnegative constant $\delta$ . Then
Proof. Note that for any ${\boldsymbol{{x}}} \in{\mathbb{R}}^n$ we have
where $A^{\prime}=\frac{1}{2}e^{2\delta }\varLambda (\boldsymbol{\beta }) \bigl (\binom{n-2}{r-1}I+ \binom{n-2}{r-2}J\bigr )$ . (Here ${\boldsymbol{{x}}}^{\mathrm{t}}$ is the transpose of $\boldsymbol{{x}}$ .) Therefore, by the min-max theorem, the $k$ -th largest eigenvalue of $A(\boldsymbol{\beta })$ is at most the $k$ -th largest eigenvalue of $A^{\prime}$ . Since $A(\boldsymbol{\beta })$ is positive semidefinite, all its eigenvalues are non-negative, implying that $|A(\boldsymbol{\beta })|\le |A^{\prime}|$ . Using (3.1), we have
and the result follows.
3.1 Inverting $A(\boldsymbol{\beta })$
Next we bound the entries of $A(\boldsymbol{\beta })^{-1}$ and find a change of basis matrix $T$ which transforms $A(\boldsymbol{\beta })$ to the identity matrix. For $p\in \{1,2,\infty \}$ , we use the notation $\|{\cdot}\|_p$ for the standard vector norms and the corresponding induced matrix norms (see for example [Reference Horn and Johnson12, Section 5.6]). In particular, for an $n\times n$ matrix $M = (m_{ij})$ ,
The proof of this lemma is given in Section 8.2.
Lemma 3.7. Let $\delta$ be a nonnegative constant. For every $\boldsymbol{\beta }$ such that $\max _{j,k\in [n]} |\beta _j - \beta _k|\leq \delta/r$ the following holds.
Let $A(\boldsymbol{\beta })^{-1}=(\sigma _{jk})$ be the inverse of $A(\boldsymbol{\beta })$ . There exists a constant $C$ , independent of $\delta$ , such that for $n\ge 16e^{4\delta }$ we have
In addition, there exists a matrix $T=T(\boldsymbol{\beta })$ such that $T^{\mathrm{t}}\! A(\boldsymbol{\beta }) T=I$ with
Furthermore, for any $\rho \gt 0$ there exist $\rho _1,\rho _2=\Theta \Bigl (\rho \, \varLambda (\boldsymbol{\beta })^{1/2}\, \binom{n-1}{r-1}^{1/2} \Bigr )$ such that
where $\mathcal{R}(\rho )$ is defined in (2.4).
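One concrete way to obtain such a $T$ is via a Cholesky factorisation: if $A=LL^{\mathrm t}$ then $T=(L^{\mathrm t})^{-1}$ satisfies $T^{\mathrm t}AT=I$ . The sketch below assumes the natural entries $a_{jk}=\sum _{W\supseteq \{j,k\}}\lambda _W(1-\lambda _W)$ (consistent with the description at the start of Section 3, though display (1.4) is not reproduced here) together with the logistic form of $\lambda _W$ :

```python
import numpy as np
from itertools import combinations

def lam_W(beta, W):
    # Assumed logistic form of the beta-model edge probability.
    s = sum(beta[j] for j in W)
    return np.exp(s) / (1.0 + np.exp(s))

def build_A(beta, r):
    # Assumed entries: a_{jk} = sum over W containing both j and k
    # (W containing j, for the diagonal) of lambda_W (1 - lambda_W).
    n = len(beta)
    A = np.zeros((n, n))
    for W in combinations(range(n), r):
        w = lam_W(beta, W)
        c = w * (1.0 - w)
        for j in W:
            for k in W:
                A[j, k] += c
    return A

beta = np.array([0.1, -0.05, 0.0, 0.02, -0.1, 0.03])
A = build_A(beta, 3)
L = np.linalg.cholesky(A)      # A = L L^t; possible since A is positive definite
T = np.linalg.inv(L.T)         # then T^t A T = I
# induced matrix norms used in Lemma 3.7
norm1, norminf = np.linalg.norm(T, 1), np.linalg.norm(T, np.inf)
```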
4. Evaluating the integral
In this section we prove Lemmas 2.1–2.3. We have already seen that these lemmas establish Theorem 1.1.
Throughout this section we assume that (1.11) holds and thus $\lambda =\binom{n-1}{r-1}^{-1}d \le \frac{1}{2}$ . Therefore, by Lemma 3.4, for $\varLambda \,:\!=\,\varLambda (\boldsymbol{\beta }^{\ast })$ we have
4.1 Proof of Lemma 2.1
First, we will estimate the integral of $F(\boldsymbol{\theta })$ over $\mathcal{R} (d^{-1/2} \log{n})$ . For $\xi \in [0,1]$ and $x\in [{-}1,1]$ , $|\xi (e^{ix}-1)|$ is bounded away from 1, and the fifth derivative of $\log \bigl (1 + \xi \bigl (e^{ix}-1\bigr )\bigr )$ with respect to $x$ is uniformly $O(\xi )$ . Using the principal branch of the logarithm in this domain, we have by Taylor's theorem that, uniformly,
where the coefficients are
Lemma 4.1. Let $\rho \,:\!=\,d^{-1/2}\, \log n$ . Then, for $\boldsymbol{\theta }\in U_n(\rho )$ , we have
Proof. Recall that $\lambda _W\in (0,1)$ for all $W$ , and note that (1.11) implies that $r\rho =o(1)$ . Hence, recalling (2.3), we can apply (4.2) for each $W\in{\mathcal{S}_r(n)}$ , taking $\xi =\lambda _W$ and $x=\sum _{j\in W}\theta _j$ . The linear term of $\log F(\boldsymbol{\theta })$ (which includes terms from the denominator of $F(\boldsymbol{\theta })$ ) is
which equals zero by (1.5). In addition, for the quadratic term,
Now $\lambda _W = O(\lambda )$ for all $W\in{\mathcal{S}_r(n)}$ , by Lemma 3.3, so the combined error term is
Recall that for a complex variable $Z$ , the variance is defined by
while the pseudovariance is
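In the standard formulation, for a complex variable $Z$ the variance is $\mathbb{E}|Z-\mathbb{E}Z|^2$ (always real and nonnegative) while the pseudovariance is $\mathbb{E}(Z-\mathbb{E}Z)^2$ (complex in general, with modulus at most the variance); a discrete example makes the distinction concrete:

```python
# Exact computation on a three-point distribution: both moments follow
# directly from the definitions, no sampling needed.
values = [1 + 1j, -1 + 2j, 0.5 - 1j]
probs = [0.2, 0.3, 0.5]

mean = sum(p * z for p, z in zip(probs, values))
variance = sum(p * abs(z - mean) ** 2 for p, z in zip(probs, values))  # E|Z-EZ|^2, real
pseudo = sum(p * (z - mean) ** 2 for p, z in zip(probs, values))       # E(Z-EZ)^2, complex

# For real-valued Z the two notions coincide; in general only
# |pseudovariance| <= variance holds, by the triangle inequality.
```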
The following is a special case of [Reference Isaev and McKay13, Theorem 4.4] that is sufficient for our current purposes.
Theorem 4.2. Let $A$ be an $n\times n$ positive definite symmetric real matrix and let $T$ be a real matrix such that $T^{\mathrm{t}}\! AT=I$ . Let $\varOmega$ be a measurable set and let $f\,:\,{\mathbb{R}}^n\to{\mathbb{C}}$ and $h\,:\,\varOmega \to{\mathbb{C}}$ be measurable functions. Make the following assumptions for some $\rho _1,\rho _2,\phi$ :
- (a) $T(U_n(\rho _1))\subseteq \varOmega \subseteq T(U_n(\rho _2)),$ where $\rho _1,\rho _2=\Theta (\!\log n)$ .
- (b) For ${\boldsymbol{{x}}}\in T(U_n(\rho _2))$ , $2\rho _2\,\|T\|_1\,\left |\frac{\partial f}{\partial x_{j}}({\boldsymbol{{x}}})\right | \le \phi n^{-1/3}\le \frac{2}{3}$ for $1\le j\le n$ , and
\begin{equation*}4\rho _{2}^{2}\,\|T\|_{1}\,\|T\|_{\infty }\, \|H\|_\infty \le \phi n^{-1/3},\end{equation*}
where $H = (h_{jk})$ is the matrix with entries defined by
\begin{equation*} h_{jk}=\sup _{{\boldsymbol{{x}}}\in T(U_{n}(\rho _{2}))}\, \left | \frac {\partial ^2 f}{\partial x_{j}\, \partial x_{k}}({\boldsymbol{{x}}}) \right |. \end{equation*}
- (c) $| f(\boldsymbol{{x}})| \le n^{O(1)} e^{O(1/n)\,{\boldsymbol{{x}}}^{\mathrm{t}}\! A{\boldsymbol{{x}}}}$ uniformly for ${\boldsymbol{{x}}}\in{\mathbb{R}}^{n}$ .
Let $\boldsymbol{{X}}$ be a Gaussian random vector with density $\pi ^{-n/2} |A|^{1/2} \, e^{-{\boldsymbol{{x}}}^{\mathrm{t}}\!A{\boldsymbol{{x}}}}$ . Then, provided $\mathbb{V} f(\boldsymbol{{X}})$ is finite and $h$ is bounded in $\varOmega$ ,
where, for sufficiently large $n$ ,
Now we will prove Lemma 2.1.
Proof of Lemma 2.1. Let $\rho =d^{-1/2}\log n$ . Applying Lemma 4.1 gives
where
We will apply Theorem 4.2 with $\varOmega = \mathcal{R}(\rho )$ . Let $T,\rho _1,\rho _2$ be as in Lemma 3.7. Then $T(U_n(\rho _1))\subseteq \mathcal{R}(\rho )\subseteq T(U_n(\rho _2))$ . Observe that $\rho _1,\rho _2=\Theta \big(\rho d^{1/2}\big)=\Theta (\!\log{n})$ , by (4.1). Clearly $\rho _1\le \rho _2$ and thus condition (a) in Theorem 4.2 is satisfied.
Now for $j\in [n]$ ,
Thus, for all $\boldsymbol{\theta }\in T(U_n(\rho _2))$ and all $j\in [n]$ we have
by Lemmas 3.3 and 3.4 and using the fact that $r\rho = o(1)$ . Hence, by (4.4) and Lemma 3.7,
for every $\boldsymbol{\theta }\in T(U_n(\rho _2))$ and $j\in [n]$ . Also for all $j,k\in [n]$ (including $j=k$ ),
Arguing as above, if $\boldsymbol{\theta }\in T(U_n(\rho _2))$ then
Then (4.6) and Lemma 3.7 imply that
By (4.5) and (4.7) there exists
such that the left sides of (4.5) and (4.7) are both at most $\phi n^{-1/3}$ .
Recall that the 2-norm of the real symmetric positive definite matrix $A^{-1}$ equals its largest eigenvalue. Using this we obtain
so condition (c) is satisfied.
By Theorem 4.2 we have
where
In the last step we use the fact that $\phi =o(1)$ and $e^{-\rho _1^2/2}=o(1)$ . The $nr^2/d$ term inside the $O(\cdot )$ is the bound on $h$ from (4.3). To complete the estimate of $K$ , it remains to bound
We will rely heavily on Isserlis’ theorem (also called Wick’s formula) in order to establish bounds for the variance of $\Im f(\boldsymbol{{X}})$ and later for the pseudovariance of $f(\boldsymbol{{X}})$ . Isserlis’ theorem states that the expected value of a product of jointly Gaussian random variables, each with zero mean, can be obtained by summing over all partitions of the variables into pairs, where the term corresponding to a partition is just the product of the covariances of each pair. See for example [Reference Michalowicz, Nichols, Bucholtz and Olson23, Theorem 1.1].
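As a concrete instance of the pairing formula, the fourth mixed moment of four jointly Gaussian variables $Y_i$ satisfies $\mathbb{E}[Y_1Y_2Y_3Y_4] = C_{12}C_{34}+C_{13}C_{24}+C_{14}C_{23}$ , where $C_{jk}=\text{Cov}[Y_j,Y_k]$ . This can be checked exactly by writing $Y=BZ$ for a (hypothetical) mixing matrix $B$ and independent standard normals $Z_k$ , using only $\mathbb{E} Z^2=1$ and $\mathbb{E} Z^4=3$ :

```python
from itertools import product
from collections import Counter

# A 4x2 matrix B defines jointly Gaussian Y = B Z with Z iid standard normal.
B = [[1.0, 0.5], [0.3, 2.0], [-1.0, 0.7], [0.2, -0.4]]
m = len(B[0])

def cov(i, j):
    # Cov[Y_i, Y_j] = (B B^t)_{ij}
    return sum(B[i][k] * B[j][k] for k in range(m))

def z_moment(ks):
    # E[Z_{k1} Z_{k2} Z_{k3} Z_{k4}] for iid standard normals:
    # odd powers vanish, E[Z^2] = 1, E[Z^4] = 3, and indices are independent.
    c = Counter(ks)
    if any(v % 2 for v in c.values()):
        return 0.0
    out = 1.0
    for v in c.values():
        out *= {2: 1.0, 4: 3.0}[v]
    return out

# Exact fourth mixed moment by expanding Y_i = sum_k B[i][k] Z_k.
lhs = sum(B[0][a] * B[1][b] * B[2][c] * B[3][d] * z_moment((a, b, c, d))
          for a, b, c, d in product(range(m), repeat=4))

# Isserlis: sum over the three pairings of {1, 2, 3, 4}.
rhs = cov(0, 1) * cov(2, 3) + cov(0, 2) * cov(1, 3) + cov(0, 3) * cov(1, 2)
print(abs(lhs - rhs) < 1e-12)  # True
```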
In particular, for a normally distributed random vector $(Y_1,Y_2)$ with expected value $(0,0)$ , we have
Since the sum of components of a normally distributed random vector is also normally distributed, we can apply Isserlis’ theorem to sums involving the random variables $X_j$ , $j\in [n]$ . Then for any $W\in{\mathcal{S}_r(n)}$ we have
and so
For $W,W^{\prime}\in{\mathcal{S}_r(n)}$ let
Now $\text{Cov}[X_j,X_k]$ equals the corresponding entry of $(2A)^{-1}$ and hence, by Lemma 3.7 and (4.1),
Since covariance is bilinear, we have
Using this together with Isserlis’ theorem, for any pair $W,W^{\prime}$ ,
The average value of $|W\cap W^{\prime}|$ over pairs of $r$ -sets is $r^2/n$ , so we can sum over $W,W^{\prime}\in{\mathcal{S}_r(n)}$ to obtain
By (1.11) this term tends to $0$ , implying that $K=O\big(nr^2/d + \phi ^3+e^{-\rho _1^2}\big)$ .
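The value $r^2/n$ follows from linearity of expectation: a fixed vertex lies in a uniform $r$ -set with probability $r/n$ , so $\mathbb{E}|W\cap W^{\prime}| = n(r/n)^2$ for independent uniform $W,W^{\prime}$ . An exact check by enumeration over ordered pairs (including $W=W^{\prime}$ ), with small illustrative parameters:

```python
from itertools import combinations
from fractions import Fraction

n, r = 6, 3
sets = [frozenset(W) for W in combinations(range(n), r)]

# Average of |W ∩ W'| over all ordered pairs of r-subsets of [n].
total = sum(len(W & Wp) for W in sets for Wp in sets)
avg = Fraction(total, len(sets) ** 2)
print(avg == Fraction(r * r, n))  # True: the average is exactly r^2/n
```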
All that is left is to establish bounds on ${\mathbb{E}} f(\boldsymbol{{X}})$ and $\mathbb{V} f(\boldsymbol{{X}})$ . Due to (4.9), we have
Again using Isserlis’ theorem, for any $W\in{\mathcal{S}_r(n)}$ we have
Hence by (4.1),
Now $\mathbb{V} f(\boldsymbol{{X}})$ satisfies
Since we already established a bound on $\text{Var}(\Im f(\boldsymbol{{X}}))$ , we only need to consider $\text{Var}(\Re f(\boldsymbol{{X}}))$ . Note that
By Isserlis’ theorem, we have
Since $\sigma (W,W^{\prime})=O(r/d)$ from (4.10),
Therefore $|\mathbb{V}(f(\boldsymbol{{X}}))| = O\big(nr^2/d\big)$ and so
using (4.8) and the definition of $\rho _1$ .
4.2 Proof of Lemma 2.2
In this section we evaluate the integral over the region $U_n\big(r^{-1}\big)\setminus \mathcal{R}(\rho )$ . The following technical bound will be useful: for any $t \in{\mathbb{R}}$ and $\lambda \in [0,1]$ , we have
Proof of Lemma 2.2. We will show that for any $\hat{\rho }$ satisfying $(2r)^{-1} \geq \hat{\rho } \geq d^{-1/2}\log{n}$ , we have
Observe that
for $L = \lceil \log _2 \bigl ( (2r)^{-1}/\big(d^{-1/2}\log n\big)\bigr ) \rceil = O(r \log n)$ , and that
This expresses the region of integration in the lemma statement as a union of regions of the form given in (4.12), and the result follows.
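The decomposition can be sketched numerically. With illustrative parameters satisfying $d\gg nr^4\log n$ (none taken from the paper), doubling $\hat{\rho }$ from $d^{-1/2}\log n$ reaches $(2r)^{-1}$ after $L$ steps with $L=O(r\log n)$ :

```python
import math

# Illustrative parameters only (chosen so that d >> n r^4 log n holds).
n, r = 10_000, 5
d = n * r ** 4 * int(math.log(n)) * 10
rho = d ** -0.5 * math.log(n)          # the starting scale d^{-1/2} log n

# Number of doublings needed for the dyadic scales rho, 2*rho, 4*rho, ...
# to reach (2r)^{-1}.
L = math.ceil(math.log2((1 / (2 * r)) / rho))

assert rho * 2 ** L >= 1 / (2 * r) > rho * 2 ** (L - 1)
assert L <= 10 * r * math.log(n)       # consistent with L = O(r log n)
print(L)
```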
It remains to prove (4.12). Using (4.11), for any such $\hat{\rho }$
Let $T$ be as in Lemma 3.7 and note that $|T|=|A|^{-1/2}$ . Then by Lemma 3.7 and (4.1) there exists a $\hat{\rho }_1=\Theta \big(\hat{\rho } d^{1/2}\big)$ such that $T(U_n(\hat{\rho }_1))\subseteq \mathcal{R}(\hat{\rho })$ . By taking $\hat{\rho }_1^{\prime} = \big(1-r^2 \hat{\rho }^2/3\big)^{1/2}\, \hat{\rho }_1$ we find that $\big(1-r^2 \hat{\rho }^2/3\big)^{-1/2}\, U_n\big(\hat{\rho }^{\prime}_1\big) = U_n(\hat{\rho }_1)$ and hence
Therefore, substituting $\boldsymbol{\theta }=\big(1-r^2 \hat{\rho }^2/3\big)^{-1/2}\, T{\boldsymbol{{x}}}$ gives
Note that $\big(1 - r^2 \hat{\rho }^2/3 \big)^{-n/2}= \exp \big(O\big(r^2 \hat{\rho }^2 n\big)\big)$ . In addition we have $\hat{\rho }^{\prime}_1=\Theta (\hat{\rho }_1)=\Theta \big(\hat{\rho }d^{1/2}\big)$ and thus
We deduce that
as $d\gg r^2 n$ , by (1.11), and $\hat{\rho }^2 d=\Omega \big(\!\log ^2{n}\big)$ .
4.3 Proof of Lemma 2.3
In this section we complete the evaluation of the integral by examining the values in the region $U_n(\pi )\setminus \mathcal{B}$ . For $x\in{\mathbb{R}}$ , define $|x|_{2\pi } = \min _{k\in{\mathbb{Z}}}\, |x-2k\pi |$ and note that $|1+\lambda (e^{ix}-1)|$ depends only on $|x|_{2\pi }$ .
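Since $|1+\lambda (e^{ix}-1)|^2 = 1-2\lambda (1-\lambda )(1-\cos x)$ depends on $x$ only through $\cos x$ , it is indeed a function of $|x|_{2\pi }$ alone. A small sketch with illustrative values of $\lambda$ and $x$ :

```python
import cmath, math

def dist_2pi(x):
    # |x|_{2pi} = min over integers k of |x - 2*k*pi|.
    return abs((x + math.pi) % (2 * math.pi) - math.pi)

def modulus(lam, x):
    return abs(1 + lam * (cmath.exp(1j * x) - 1))

lam, x = 0.3, 2.2                      # illustrative values
# Shifting x by a multiple of 2*pi or negating it changes neither the modulus
# nor |x|_{2pi}, so the modulus depends on x only through |x|_{2pi}.
for y in (x + 2 * math.pi, -x, 4 * math.pi - x):
    assert abs(modulus(lam, y) - modulus(lam, x)) < 1e-12
    assert abs(dist_2pi(y) - dist_2pi(x)) < 1e-9
print("modulus is a function of |x|_{2pi} alone")
```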
Proof of Lemma 2.3. Let $\boldsymbol{\theta }\in U_n(\pi )\setminus \mathcal{B}$ . First suppose that $|\theta _a- \theta _b|_{2\pi }\gt (2r)^{-1}$ for some $a,b\in [n]$ . For any $W_1, W_2 \in{\mathcal{S}_r(n)}$ such that $W_1 \mathbin{\triangle } W_2 =\{a,b\}$ , we have
So $\bigl |\sum _{j\in W_1} \theta _j\bigr |_{2\pi } \gt (4r)^{-1}$ or $\bigl |\sum _{j\in W_2} \theta _j\bigr |_{2\pi }\gt (4r)^{-1}$ , or both. In any case, by Lemma 3.3 and (4.11) we have
Note that there are exactly $\binom{n-2}{r-1}=\Theta \bigl ( \binom{n-1}{r-1}\bigr )$ pairs $W_1,W_2$ such that $W_1 \mathbin{\triangle } W_2 =\{a,b\}$ . Furthermore, every $W\in{\mathcal{S}_r(n)}$ is contained in at most one such pair. Then, multiplying inequalities (4.13) for all such pairs, we obtain
By (1.11), $\frac{d}{r^2}\gg nr^2\log n$ , while by Lemma 3.6 and because $d\lt n^r$ , we have $|A| = e^{O(n\log d)} = e^{O(nr\log n)}$ . Therefore, the total contribution to the integral from this case is at most
All remaining points $\boldsymbol{\theta }\in U_n(\pi )\setminus \mathcal{B}$ satisfy $|\theta _a -\theta _b|_{2\pi } \leq (2r)^{-1}$ for all $a,b\in [n]$ and $\min _{j \in [n],\, k\in [r]}|\theta _j - \frac{2\pi k}{r}|_{2\pi } \gt (2r)^{-1}$ . These two conditions imply that for any such $\boldsymbol{\theta }$ there exists $k\in [r]$ such that for all $j\in [n]$ we have
Summing the above over any $W\in{\mathcal{S}_r(n)}$ implies that $\frac{1}{2} \leq \left |\sum _{j\in W} \theta _j\right |_{2\pi } \leq \pi$ . Hence (4.11) implies that
Again, multiplying by $(2\pi )^n$ for an upper bound, we see that the contribution of all such points $\boldsymbol{\theta }$ to the integral is at most
completing the proof.
5. Solving the beta-system
We first prove that the solution to (1.5) is unique if it exists.
Proof of Lemma 1.4. Suppose $\boldsymbol{\beta }^{\prime}\ne \boldsymbol{\beta }^{\prime\prime}$ both satisfy (1.5). For $y\in{\mathbb{R}}$ and $W\in{\mathcal{S}_r(n)}$ define $\xi _W(y)\,:\!=\,(1-y)\lambda _W(\boldsymbol{\beta }^{\prime}) + y\lambda _W(\boldsymbol{\beta }^{\prime\prime})$ . Consider the entropy function
The derivative of $S(y)$ at $y=0$ is
Similarly, the derivative of $S(y)$ at $y=1$ is $S^{\prime}(1)=0$ .
On the other hand, $\boldsymbol{\beta }^{\prime}\ne \boldsymbol{\beta }^{\prime\prime}$ implies that $\lambda _W(\boldsymbol{\beta }^{\prime})\ne \lambda _W(\boldsymbol{\beta }^{\prime\prime})$ for at least one $W\in{\mathcal{S}_r(n)}$ . The second derivative of $S(y)$ equals
and hence is strictly negative when $\boldsymbol{\beta }^{\prime}\neq \boldsymbol{\beta }^{\prime\prime}$ . Therefore, $S(y)$ is strictly concave and cannot have more than one stationary point. This completes the proof.
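The mechanism is visible one summand at a time: linear terms in $S$ vanish under two differentiations, and each $W$ contributes a binary-entropy term whose second derivative along the affine path $\xi _W(y)$ is $-\bigl (\lambda _W(\boldsymbol{\beta }^{\prime\prime})-\lambda _W(\boldsymbol{\beta }^{\prime})\bigr )^2/\bigl (\xi _W(1-\xi _W)\bigr )$ . A numerical sketch for a single summand, with illustrative values:

```python
import math

def h(x):
    # Binary entropy (natural log), the concave ingredient in S(y).
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

lam1, lam2 = 0.2, 0.7                 # stand-ins for lambda_W(beta'), lambda_W(beta'')
xi = lambda y: (1 - y) * lam1 + y * lam2

eps = 1e-5
for y in (0.1, 0.5, 0.9):
    # Numerical second derivative of y -> h(xi(y)); it matches
    # -(lam2 - lam1)^2 / (xi (1 - xi)) and is strictly negative.
    num = (h(xi(y + eps)) - 2 * h(xi(y)) + h(xi(y - eps))) / eps ** 2
    exact = -(lam2 - lam1) ** 2 / (xi(y) * (1 - xi(y)))
    assert num < 0 and abs(num - exact) < 1e-3
print("strictly concave along the segment")
```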
To prove Lemma 1.3 we will employ the following lemma from [Reference Gao, Isaev and McKay9].
Lemma 5.1. [Reference Gao, Isaev and McKay9, Lemma 7.8] Let $\Psi \,:\,{\mathbb{R}}^n \to{\mathbb{R}}^n$ , $\eta \gt 0$ and $\boldsymbol{\beta }^{(0)}\in{\mathbb{R}}^n$ , and define $U = \{\boldsymbol{\beta }\in{\mathbb{R}}^n \mathrel{:} \|\boldsymbol{\beta } - \boldsymbol{\beta }^{(0)} \| \leq \eta \|\Psi (\boldsymbol{\beta }^{(0)})\| \}$ , where $\|{\cdot}\|$ is any vector norm on ${\mathbb{R}}^n$ . Assume that
where $J$ denotes the Jacobian matrix of $\Psi$ and $\|{\cdot}\|$ stands for the induced matrix norm. Then there exists $\boldsymbol{\beta }^*\in U$ such that $\Psi (\boldsymbol{\beta }^*) = \boldsymbol{0}$ .
In connection with the system of (1.5), we consider $\Psi \,:\,{\mathbb{R}}^n\to{\mathbb{R}}^n$ defined by
Clearly, $\Psi$ is analytic in ${\mathbb{R}}^n$ . Observe that
and thus $ J(\boldsymbol{\beta }) = 2A(\boldsymbol{\beta })$ , where $J(\boldsymbol{\beta })$ is the Jacobian matrix of $\Psi (\boldsymbol{\beta })$ and $A(\boldsymbol{\beta })$ is defined by (1.4). We start by bounding $ \|J^{-1}(\boldsymbol{\beta })\|_\infty$ as required for Lemma 5.1.
Lemma 5.2. Let $\boldsymbol{\beta }^{(0)}\in{\mathbb{R}}^n$ and real numbers $\delta _1,\delta _2\ge 0$ satisfy $\max _{j,k\in [n]}\big|{\beta }^{(0)}_j-{\beta }^{(0)}_k\big|\le \delta _1/r$ and $e^{\delta _2}\lambda (\boldsymbol{\beta }^{(0)})\le 7/8$ . Suppose that $n\ge 16e^{4\delta _1+8\delta _2}$ . Then for any $\boldsymbol{\beta }\in{\mathbb{R}}^n$ such that $\|\boldsymbol{\beta }-\boldsymbol{\beta }^{(0)}\|_{\infty }\le \delta _2/r$ , we have
where $C$ is the constant from Lemma 3.7.
Proof. Let $\boldsymbol{\beta }\in{\mathbb{R}}^n$ satisfy $\|\boldsymbol{\beta }-\boldsymbol{\beta }^{(0)}\|_{\infty }\le \delta _2/r$ . Then
Applying Lemma 3.7 for $\boldsymbol{\beta }$ implies for all sufficiently large $n$ that
By Lemma 3.2 and our assumptions we have $\lambda (\boldsymbol{\beta })\le e^{\delta _2}\lambda (\boldsymbol{\beta }^{(0)})\le 7/8$ . Therefore, the conditions of Lemma 3.4 are satisfied and we have
The result follows as $\lambda (\boldsymbol{\beta })\ge e^{-\delta _2}\lambda (\boldsymbol{\beta }^{(0)})$ by Lemma 3.2.
Next, we explain how to choose $U$ and $\boldsymbol{\beta }^{(0)}$ carefully, depending on whether or not $d$ is small relative to $\binom{n-1}{r-1}$ .
5.1 Proof of Lemma 1.3(i)
Recalling (1.2), define
and note that $\|\Psi (\boldsymbol{\beta }^{(0)})\|_\infty = \max _{j\in [n]}|d-d_j|$ . Define
where $\eta =2^{10} C/d$ and $C$ is the constant from Lemma 5.2. Since $\max _{j,k\in [n]}\big|{\beta }^{(0)}_j-{\beta }^{(0)}_k\big|= 0$ , we set $\delta _1\,:\!=\, 0$ . Now assume that $\varDelta$ is sufficiently small, in particular $\varDelta \le \varDelta _0 \,:\!=\, \min \{ ( 2^{17} C)^{-1},1\}$ . Then for any $\boldsymbol{\beta }\in U$ ,
Hence we define $\delta _2\,:\!=\,1/64$ . Since
we deduce that
Therefore, the conditions of Lemma 5.2 are met for $\delta _1$ and $\delta _2$ as above, and we deduce for every $\boldsymbol{\beta }\in U$ ,
Hence all the conditions of Lemma 5.1 hold, and applying this lemma shows that there exists a solution $\boldsymbol{\beta }^{\ast }$ to (1.5). Finally note that (5.2) implies that $\max _{j,k\in [n]} \big|\beta ^\ast _j-\beta ^\ast _k\big| = O(1/r)$ , completing the proof.
5.2 Proof of Lemma 1.3(ii)
For part (ii), we define $\boldsymbol{\beta }^{(0)} = \big(\beta _1^{(0)}, \ldots,\beta _n^{(0)}\big)^{\mathrm{t}}$ by
where
Note that $\max _{j,k\in [n]}\big|\beta ^{(0)}_j - \beta ^{(0)}_k\big| = \max _{j,k\in [n]}|\log d_j - \log d_k|\leq 2 \varDelta/r$ . Define
For any $W \in{\mathcal{S}_r(n)}$ , using the assumptions of the lemma we have
Furthermore,
and so, using our assumption on $rd$ ,
It follows from Lemma 3.3 that $\lambda _W(\boldsymbol{\beta }^{(0)})=\Theta \big(\lambda \big(\boldsymbol{\beta }^{(0)}\big)\big)$ for all $W\in{\mathcal{S}_r(n)}$ , and hence, for all $j\in [n]$ ,
It follows that for all $j\in [n]$ ,
Next, we observe that the quantity $\sum _{W \ni j } \prod _{k\in W-j} d_k$ depends insignificantly on $j$ . Indeed, by our assumptions we have
for $\ell \in \{j,j^{\prime}\}$ , while
The last line uses the fact that for any $x\in{\mathbb{R}}$ we have
This shows that for any $j,j^{\prime}\in [n]$ ,
Observe also that
Combining the above and using the assumptions, we conclude that for all $j\in [n]$ ,
Taking the average of (5.4) implies that
Applying Lemma 5.2 with $\delta _1\,:\!=\,2\varDelta$ and $\delta _2\,:\!=\,\varDelta$ , we conclude that for every $\boldsymbol{\beta }\in U$ ,
By the definition of $\Psi$ and our assumptions on $d_j$ , it follows from (5.4) that $\|\Psi (\boldsymbol{\beta }^{(0)})\|_\infty = o(d/ r)$ . Hence we can apply Lemma 5.1 with $\eta \,:\!=\, \varDelta \big(r \|\Psi (\boldsymbol{\beta }^{(0)})\|_\infty \big)^{-1} = \omega (d^{-1})$ , completing the proof.
6. The near-regular case
In this section we will prove Theorem 1.5. As mentioned at the end of Section 1, we have omitted some of the calculations in this and the following section. These calculations can be verified using the identities in Section 9. It will be convenient for us to begin the analysis in the first quadrant. By assumption (1.13), Lemma 1.3(i) guarantees the existence of a solution $\boldsymbol{\beta }^{\ast }=(\beta _1^\ast,\ldots,\beta _n^\ast )$ which satisfies (1.8), and by Lemma 1.4 this solution is unique. Therefore we are justified in applying Theorem 1.1.
Next, recalling (1.2), define $\boldsymbol{\gamma }^\ast =(\gamma ^\ast _1,\ldots,\gamma ^\ast _n)$ by
In the regular case, $\boldsymbol{\beta }^{\ast }$ satisfies (1.5) with $\boldsymbol{\gamma }^\ast =\boldsymbol{0}$ . For $W\in{\mathcal{S}_r(n)}$ , define $\gamma _W^\ast \,:\!=\, \sum _{j\in W} \gamma _j^\ast$ . In addition, for $W\in{\mathcal{S}_r(n)}$ and $s\in{\mathbb{N}}$ , define $\varGamma _s=\varGamma _s(W)\,:\!=\,\sum _{j\in W} \delta _j^s$ .
Lemma 6.1. Under assumptions (1.7) and (1.13) in the first quadrant, there is a solution of (1.5) with
uniformly for $j\in [n]$ .
Proof. Equations (1.5) can be written as $\varPhi (\boldsymbol{\gamma })=\boldsymbol{\delta }$ , where $\varPhi \,:\,{\mathbb{R}}^n\to{\mathbb{R}}^n$ is defined by
for $j\in [n]$ . Consider $\bar{\boldsymbol{\gamma }}=(\bar \gamma _1,\ldots,\bar \gamma _n)$ defined by
The function $L(x)=(e^x-1)/(1+\lambda (e^x-1))$ has bounded fifth derivative for $\lambda \in [0,1]$ , $x\in [{-}1,1]$ , so by Taylor’s theorem we have in that domain that
For $W\in{\mathcal{S}_r(n)}$ , define $\bar \gamma _W\,:\!=\,\sum _{j\in W} \bar \gamma _j$ . Now
which implies that $(\bar \gamma _W)^5 = O\big(r^{-1}n^{-1}d^{-3/5}\big)$ . Therefore, from (6.1) we have
Summing (6.2) over the $\binom{n-1}{r-1}=d/\lambda$ sets $W$ that include $j$ , for each $j$ , we verify that
These calculations rely heavily on the identities given in Section 9.2.
Define $C^{\prime}\,:\!=\, 2^{10} C$ , where $C$ is the constant from Lemma 5.2, and let
Define the function $\nu \,:\,{\mathbb{R}}^n\to{\mathbb{R}}^n$ by
Let $\Psi$ be the function defined in (5.1). Then for any ${\boldsymbol{{x}}}\in{\mathbb{R}}^n$ we have $\Psi (\nu ({\boldsymbol{{x}}}))=\varPhi ({\boldsymbol{{x}}})-\boldsymbol{\delta }$ . In particular this implies that $J_\varPhi ^{-1}({\boldsymbol{{x}}})=J_{\Psi }^{-1}(\nu ({\boldsymbol{{x}}}))$ where $J_\varPhi ({\boldsymbol{{x}}})$ and $J_{\Psi }(\nu ({\boldsymbol{{x}}}))$ denote the Jacobians of $\varPhi ({\boldsymbol{{x}}})$ and $\Psi (\nu ({\boldsymbol{{x}}}))$ respectively.
We wish to apply Lemma 5.2. Observe that
Next, using (6.3), we have that
Finally, since $\lambda (\nu (\boldsymbol{0}))=\lambda \le 1/2$ and $\max _{j\in [n]} |\bar \gamma _j|=o(1/r)$ , Lemma 3.2 implies that
Hence Lemma 5.2 implies that for every ${\boldsymbol{{x}}}\in U(C^{\prime})$ , we have
Therefore, by Lemma 5.1 there exists ${\boldsymbol{{x}}}\in U(C^{\prime})$ such that $\varPhi ({\boldsymbol{{x}}})=\boldsymbol{\delta }$ . Setting $\boldsymbol{\gamma }^\ast ={\boldsymbol{{x}}}$ proves the lemma, since $\|{\boldsymbol{{x}}} - \bar{\boldsymbol{\gamma }}\|_\infty = O(r^{-1} n^{-1} d^{-3/5})$ and the last term of $\bar \gamma _j$ is $O\big(r^{-1}n^{-1}d^{-3/5}\big)$ .
Now we can calculate the values of the quantities that appear in Theorem 1.1.
Lemma 6.2. Under assumptions (1.7) and (1.13), we have in the first quadrant that
Proof. Define $z_W$ by $\lambda _W=\lambda (1+z_W)$ and
Recall that $\sum _{W\in{\mathcal{S}_r(n)}} z_W = 0$ ; therefore,
Lemma 6.1 implies that $\gamma ^\ast _W=\bar{\gamma }_W+O\big(n^{-1}d^{-3/5}\big)$ . Recalling (6.1), this implies that $L(\gamma _W^\ast )=L(\bar{\gamma }_W)+O\big(n^{-1}d^{-3/5}\big)$ , as $\gamma _W^\ast =o(1)$ . Using (6.2), we have
The coefficients of the Taylor expansion of $\eta (z)$ are uniformly $O(\lambda )$ , as shown in (6.4). Also note that $z_W=O(\delta _{\mathrm{max}} r d^{-1})=O(d^{-1/5})$ . This gives
Using the identities in Section 9.1, we can sum over all $W\in{\mathcal{S}_r(n)}$ :
Let $A_0$ be the matrix $A$ in the case that ${\boldsymbol{{d}}} = (d,d,\ldots, d)$ . That is,
Then
where the determinant follows from (3.1).
Lemma 6.3. Under assumptions (1.7) and (1.13), we have in the first quadrant that
Proof. Define the matrix $E$ by $A=A_0+E$ . Then
For $W\in{\mathcal{S}_r(n)}$ we have $\lambda _W=\lambda (1+z_W)$ , where $z_W$ is given by (6.6). This gives
Summing over $W\ni j$ and $W\ni j,k$ , using Sections 9.2 and 9.3, we have $E=(e_{jk})$ , where
This implies that $A_0^{-1}E = (e^{\prime}_{jk})$ , where
Finally, we have $M=(m_{jk})$ , where
To complete the proof, note that
and, since $\|M\|_2\le \sqrt{\|M\|_1 \|M\|_\infty }=o(1)$ ,
where $\mu _1,\ldots,\mu _n$ are the eigenvalues of $M$ and $\|M\|_F$ is the Frobenius norm. The penultimate step follows by [Reference Zhan30, equation (3.71)], which states that $\sum _{j=1}^n |\mu _j|^2 \le \|M\|_F^2$ .
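The cited bound $\sum _{j=1}^n |\mu _j|^2\le \|M\|_F^2$ is Schur's inequality, with equality exactly when $M$ is normal; since $M$ need not be symmetric here, the inequality is what is needed. A small non-symmetric example, with eigenvalues computed from the characteristic polynomial:

```python
import math

# Non-symmetric 2x2 example; eigenvalues from the characteristic polynomial.
M = [[1.0, 2.0], [3.0, 4.0]]
tr = M[0][0] + M[1][1]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
disc = tr * tr - 4 * det              # positive here, so both eigenvalues are real
mu1 = (tr + math.sqrt(disc)) / 2
mu2 = (tr - math.sqrt(disc)) / 2

sum_sq = mu1 ** 2 + mu2 ** 2          # = tr^2 - 2*det = 29
frob_sq = sum(x * x for row in M for x in row)  # = 30
print(sum_sq <= frob_sq)  # True; equality would require M to be normal
```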
Corollary 6.4. Under assumptions (1.7) and (1.13), we have in the first quadrant that
where $\bar \varepsilon =\varepsilon + \delta _{\mathrm{max}} d^{-3/5}$ and $|A_0|$ is given by (6.8).
Finally, Theorem 1.5 removes the assumption of being in the first quadrant.
Proof of Theorem 1.5. Since the formula is invariant under the symmetries and matches Corollary 6.4 within the error term in the first quadrant, it is true in all quadrants. To see this, observe that under either of our two symmetries, $R_3$ becomes $-R_3$ and $(1-2\lambda )(n-2r)$ becomes $-(1-2\lambda )(n-2r)$ .
7. Degrees of random uniform hypergraphs
We now show how to apply the results of Section 6 to analyse the degree sequence of a random uniform hypergraph with a given number of edges. Define $B(K,x)=\binom{K}{\lambda K+x}$ where $K$ , $\lambda K + x$ are integers. The following lemma is a consequence of Stirling’s expansion for the gamma function.
Lemma 7.1. Let $K,x,\lambda$ be functions of $n$ such that, as $n\to \infty$ , $\lambda \in (0,1)$ , $\lambda (1-\lambda )K\to \infty$ and $x=o\bigl (\lambda (1-\lambda )K\bigr )$ . Then
Proof. This follows from Stirling’s expansion for the factorial, which we use in the form
From this we obtain
Now write
and similarly for $((1-\lambda )K-x)^{(1-\lambda )K-x+1/2}$ . Expanding the logarithms gives the desired result.
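As a sanity check of the Stirling step (the precise form used appears in the omitted display), one standard version of the expansion can be compared with exact values of $\log K!$ :

```python
import math

# One standard form of Stirling's expansion:
#   log K! = (K + 1/2) log K - K + (1/2) log(2*pi) + 1/(12K) + O(1/K^3).
for K in (50, 500, 5000):
    stirling = (K + 0.5) * math.log(K) - K + 0.5 * math.log(2 * math.pi)
    exact = math.lgamma(K + 1)        # log K! computed exactly
    assert abs(exact - stirling - 1 / (12 * K)) < 1 / K ** 2
print("Stirling expansion verified to order 1/K")
```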
Proof of Theorem 1.7. For some $p\in (0,1)$ , let $X_1,\ldots, X_n$ be iid random variables with the binomial distribution $\text{Bin}\bigl (\binom{n-1}{r-1},p\bigr )$ . Then $\mathcal{B}_r(n,m)$ is the distribution of $(X_1,\ldots,X_n)$ conditioned on the sum being $nd$ . Since the sum has distribution $\text{Bin}\bigl (n\binom{n-1}{r-1},p\bigr )$ , we find that the conditional probability is independent of $p$ :
Consequently,
Now use Theorem 1.5 for $H_r({\boldsymbol{{d}}})$ and Lemma 7.1 for the other factors.
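The cancellation of $p$ at the start of this proof can be checked directly. In the sketch below, $K$ plays the role of $\binom{n-1}{r-1}$ and all parameters are small illustrative values:

```python
from math import comb

def cond_prob(degrees, K, p):
    # P(X = degrees | sum X = s) for iid X_i ~ Bin(K, p): joint pmf divided by
    # the pmf of the Bin(n*K, p) total.
    n, s = len(degrees), sum(degrees)
    joint = 1.0
    for deg in degrees:
        joint *= comb(K, deg) * p ** deg * (1 - p) ** (K - deg)
    return joint / (comb(n * K, s) * p ** s * (1 - p) ** (n * K - s))

degrees, K = [2, 3, 1, 2], 4
a = cond_prob(degrees, K, 0.3)
b = cond_prob(degrees, K, 0.7)

# The p-free formula obtained after cancellation.
c = 1.0
for deg in degrees:
    c *= comb(K, deg)
c /= comb(len(degrees) * K, sum(degrees))

print(abs(a - b) < 1e-12 and abs(a - c) < 1e-12)  # True
```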
Let $Z_1,\ldots, Z_n$ be iid random variables having the hypergeometric distribution with parameters $\binom nr,m,\binom{n-1}{r-1}$ , where $m=e({\boldsymbol{{d}}})$ . That is,
Note that $Z_1$ has precisely the distribution of the degree of one vertex in a uniformly random $r$ -uniform hypergraph with $n$ vertices and $m$ edges. Now let $\mathcal{T}_r(n,m)$ be the distribution of $Z_1,\ldots,Z_n$ when conditioned on having sum $nd$ . Writing $P\,:\!=\,\mathbb{P}(Z_1+\cdots + Z_n=nd)$ , for which there seems to be no closed formula, we have
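The claim about the distribution of $Z_1$ can be verified by brute force for tiny parameters, comparing the exact degree distribution of a fixed vertex over all hypergraphs with $m$ edges against the hypergeometric pmf of (7.1):

```python
from itertools import combinations
from math import comb
from fractions import Fraction

n, r, m = 5, 2, 4
edges = list(combinations(range(n), r))
N = comb(n, r)             # number of possible edges
K = comb(n - 1, r - 1)     # number of edges containing vertex 0

# Tally the degree of vertex 0 over every hypergraph with m edges.
counts = {}
for H in combinations(edges, m):
    deg = sum(0 in e for e in H)
    counts[deg] = counts.get(deg, 0) + 1

# Each tally matches the hypergeometric pmf with parameters (N, m, K).
for k, c in counts.items():
    pmf = Fraction(comb(K, k) * comb(N - K, m - k), comb(N, m))
    assert Fraction(c, comb(N, m)) == pmf
print("degree of a fixed vertex is hypergeometric")
```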
Lemma 7.2. Let $Z_1,\ldots,Z_n$ be independent hypergeometric variables with distribution given by (7.1) and let $X_1,\ldots,X_n$ be the same conditioned on $\sum _{j=1}^n Z_j=nd$ . Then
-
(a) Each $Z_j$ and $X_j$ has mean $d$ . Also, $Z_j$ has variance
(7.3) \begin{equation} \sigma ^2 = \frac{(1-\lambda )(n-r)d^2}{nd-\lambda r} = \frac{Q}{n}\biggl (1 - \binom nr^{\!\!-1}\,\biggr )^{\!-1}. \end{equation} -
(b) For $t\ge 0$ , we have for any $j$ that
\begin{equation*} \mathbb {P}(|Z_j-d| \ge t) \le 2\exp \biggl ( -\frac {t^2}{2(d+t/3)}\biggr ) \le \begin {cases} 2\exp \Bigl ({-}\frac {t^2}{4d}\Bigr ), & 0\le t\le 3d;\\[8pt] 2e^{-3t/4}, & t\ge 3d. \end {cases} \end{equation*} -
(c) If $nd+y$ is an integer in $[0,mn]$ , then
\begin{equation*} \mathbb {P}\Bigl (\, \sum \nolimits _{j=1}^n Z_j=nd+y\Bigr ) = \frac {1}{\sigma \sqrt {2\pi n}} \exp \biggl ({-}\frac {y^2}{2n\sigma ^2}\biggr ) + O\big(n^{-1}\sigma ^{-2}\big), \end{equation*}where the implicit constant in the error term is bounded absolutely. -
(d) For every nonnegative integer $y$ , $\mathbb{P}(X_1=y)=C(y)\mathbb{P}(Z_1=y)$ , where uniformly
\begin{equation*} C(y) = \frac {\mathbb {P}\bigl (\sum _{j=2}^n Z_j = nd-y\bigr )}{\mathbb {P}\bigl (\sum _{j=1}^n Z_j=nd\bigr )} = \big(1+O(n^{-1})\big) \exp \biggl ( -\frac {(y-d)^2}{2(n-1)\sigma ^2}\biggr ) + O\Big(n^{-1/2}\sigma ^{-1}\Big). \end{equation*} -
(e) If $\sigma ^2\geq 1$ then for $t\gt 0$ ,
\begin{align*}{\mathbb{E}} \min \!\big\{ (Z_1-d)^2,t^2\big\} &= \sigma ^2 + O\bigl (e^{-t^2/(4d)}d + e^{- 9d/4}d\bigr ), \\{\mathbb{E}} \min \!\big\{ (X_1-d)^2,t^2\big\} &= \big(1+O(n^{-1})\big)\, \sigma ^2 + O\bigl (e^{-t^2/(4d)}d + e^{- 9d/4}d\bigr ). \end{align*}
Proof. Part (a) is standard theory of the hypergeometric distribution. For parts (b) and (c), we note that Vatutin and Mikhailov [Reference Vatutin and Mikhailov28] proved that $Z_j$ can be expressed as the sum of $m$ independent Bernoulli random variables (generally with different means). Inequality (b) is now standard (see [Reference Janson, Łuczak and Rucinski14, Theorem 2.1]), while (c) was proved by Fountoulakis, Kang and Makai [Reference Fountoulakis, Kang and Makai8, Theorem 6.3].
For part (d), the standard formula for conditional probability implies that the expression for $\mathbb{P}(X_1=y)$ holds with $C(y) = \frac{\mathbb{P}\bigl (\sum _{j=2}^n Z_j = nd-y\bigr )}{\mathbb{P}\bigl (\sum _{j=1}^n Z_j=nd\bigr )}$ . Then by part (c) we have
and dividing the first expression by the second gives the stated approximation for $C(y)$ .
For (e), we have
where the sum is restricted to integer $d+\ell$ . We will consider the upper tail, noting that the lower tail is much the same:
Now we can use the first case of part (b) to obtain the bound $O(e^{-t^2/(4d)}d)$ and the second case to obtain the bound $O(e^{-9d/4}d)$ .
For the second part of (e), we have
Since $\sigma ^2\geq 1$ , the fourth central moment of $Z_1$ satisfies ${\mathbb{E}}\bigl ((Z_1-d)^4\bigr ) =O(\sigma ^4)$ , as follows from the exact expression given in [Reference Kendall and Stuart16, equation (5.55)]. Therefore
Then the effect of truncation at $t$ can be bounded as before, using the fact that $C(\ell )= O(1)$ .
Proof of Theorem 1.8. From the definitions of $\mathcal{D}_r(n,m)$ and (7.2), we have
Now use Theorem 1.5 for $H_r({\boldsymbol{{d}}})$ , Lemma 7.2(c) for $P$ , and Lemma 7.1 for the other factors.
For the proof of Theorem 1.10 we need a concentration lemma.
Lemma 7.3. Let $f(x_1,\ldots,x_K)\,:\,\{0,1\}^K\to{\mathbb{R}}$ be a function such that $|f({\boldsymbol{{x}}})-f({\boldsymbol{{x}}}^{\prime})|\le a$ whenever ${\boldsymbol{{x}}},{\boldsymbol{{x}}}^{\prime}$ differ in only one coordinate. Let $\boldsymbol{{Z}}=(Z_1,\ldots,Z_K)$ be independent Bernoulli variables (not necessarily identical), conditioned on having constant sum $S$ . Then, for any $t\ge 0$ ,
Proof. According to Pemantle and Peres [Reference Pemantle and Peres25, Example 5.4], the measure defined by independent Bernoulli variables conditioned on a fixed sum has the “strong Rayleigh” property. The proof is completed by applying [Reference Pemantle and Peres25, Theorem 3.1].
Proof of Theorem 1.10. Probabilities in the hypergeometric distribution are symmetric under the two operations (that is, replacing $r$ by $n-r$ , or replacing $m$ by $\binom{n}{r}-m$ ). Since the error term given in the theorem is also symmetric under these operations, it suffices to assume that $(r,{\boldsymbol{{d}}})$ belongs to the first quadrant.
Define
and
Let $Z_1,\ldots,Z_n$ be iid random variables with distribution (7.1). The distribution $\mathcal{T}_r(n,m)$ is that of $(Z_1,\ldots,Z_n)$ conditioned on $\sum _{j=1}^n Z_j=nd$ .
By the union bound, we have
Since always $C(i)=O(1)$ , Lemma 7.2(b,d) and the union bound give
Next, note that in $\mathcal{T}_r(n,m)$ we have $|n\sigma ^2-{\mathbb{E}} R^{\prime}_2({\boldsymbol{{d}}})| =O(\sigma ^2)=O(d)$ by Lemma 7.2(e); for later use note that this only relies on the condition $\delta _{\mathrm{max}}\le d^{1/2}\log n$ . Recall that each $Z_j$ is the sum of $m$ independent Bernoulli variables, so $R^{\prime}_2({\boldsymbol{{d}}})$ is a function of $mn$ independent Bernoulli variables conditioned on fixed sum $nd$ . Changing one of the Bernoulli variables changes the corresponding $d_j$ by one and changes $d$ by $1/n$ . Overall, this changes the value of $R^{\prime}_2({\boldsymbol{{d}}})$ by at most $2+4d^{1/2}\log n$ . Applying Lemma 7.3, we have
Therefore, $\mathbb{P}_{\mathcal{T}_r(n,m)}(\mathfrak{W}) = 1 - n^{-\Omega (\!\log n)}$ . Now we can apply Theorem 1.8 to obtain
for ${\boldsymbol{{d}}}\in \mathfrak{W}$ . Here, $\varepsilon$ and $n^{1/10}Q^{-1/10}\log n$ come from the error terms in Theorem 1.8, while $n^{-1/2}\log ^2 n$ comes from the term $R_2/Q$ in Theorem 1.8 since $n\sigma ^2=Q(1+O(n^{-1/2}\log ^2 n))$ in $\mathfrak{W}$ , by the definition of $\mathfrak{W}$ and (7.3).
Now consider the probability space $\mathcal{D}_r(n,m)$ . Since the distribution of each individual degree is the same as the distribution of $Z_1$ , using a union bound and applying Lemma 7.2(b) gives $\mathbb{P}_{\mathcal{D}_r(n,m)} (\delta _{\mathrm{max}} \gt d^{1/2}\log n) = n^{-\Omega (\!\log n)}$ and hence
In [Reference Kamčev, Liebenau and Wormald15], concentration of $R_2({\boldsymbol{{d}}})$ in $\mathcal{D}_r(n,m)$ is shown using a lemma on functions of random subsets. However, that approach (at least, using the same concentration lemma) apparently only works for $r=o(n/\log n)$ , so we will adopt a different approach.
By the same argument as used to prove (7.4),
for any positive integer $k$ and some constant $C\gt 0$ independent of $k$ . (Similarly to before, we have used $|n\sigma ^2-{\mathbb{E}} R^{\prime}_2({\boldsymbol{{d}}})| =O(d)$ .)
If $R_2({\boldsymbol{{d}}})\leq (k+1)\, n^{1/2}\sigma ^2\log ^2 n$ then $-\frac{1}{2} + \frac{R_2({\boldsymbol{{d}}})}{2Q} \leq \frac{(k+1)\log ^2 n}{2n^{1/2}} + o(1)$ and so applying Theorem 1.8 gives
Summing over $k\ge 1$ , we have
and therefore $\mathbb{P}_{\mathcal{D}_r(n,m)}(\mathfrak{W})=1-n^{-\Omega (\!\log n)}$ , completing the proof.
8. Deferred proofs
8.1 Proof of Lemma 1.2
We begin with the operation of replacing each edge by its complement in $V$ , which sends $d_j$ to $d^{\prime}_j=e({\boldsymbol{{d}}})-d_j$ for each $j$ . Recall that
and note that for all $j,k\in [n]$ ,
In addition, for any $W\in{\mathcal{S}_r(n)}$ we have
Therefore for any $W\in{\mathcal{S}_r(n)}$ we have
Note that when (1.5) is summed over $j$ , each edge is counted $r$ times, so $\sum _{W\in{\mathcal{S}_r(n)}} \lambda _W(\boldsymbol{\beta }^{\ast })=e({\boldsymbol{{d}}})$ . Hence
proving that $({\boldsymbol{{d}}}^{\prime},\boldsymbol{\beta }^{\prime})$ satisfies (1.5). It only remains to show that
For $W\subseteq [n]$ , define the $n\times n$ matrix $\varXi _W$ by
Then,
Now note that $(I-\frac 1r J)\varXi _W(I-\frac 1r J)=\varXi _{V\setminus W}$ for any $W\in{\mathcal{S}_r(n)}$ . (The case $W=\{1,\ldots,r\}$ is representative and easy to check.) Together with (8.1), this proves that
Finally, $|I-\frac 1r J| = -\frac{n-r}{r}$ by (3.1), which proves (8.2).
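Assuming, as the surrounding argument suggests, that $\varXi _W$ is the $0$ - $1$ matrix supported on $W\times W$ , that is $\varXi _W=\boldsymbol{1}_W\boldsymbol{1}_W^{\mathrm{t}}$ (the omitted display presumably says this), the identity holds because $(I-\frac 1r J)\boldsymbol{1}_W=-\boldsymbol{1}_{V\setminus W}$ whenever $|W|=r$ . A quick exact check with $0$ -indexed vertices:

```python
from fractions import Fraction

n, r = 7, 3

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def xi(W):
    # Assumed form of Xi_W: the 0-1 matrix 1_W 1_W^t supported on W x W.
    ind = [1 if j in W else 0 for j in range(n)]
    return [[ind[i] * ind[j] for j in range(n)] for i in range(n)]

# P = I - (1/r) J, with J the all-ones matrix; exact arithmetic via Fraction.
P = [[(1 if i == j else 0) - Fraction(1, r) for j in range(n)] for i in range(n)]

W = {0, 1, 2}
lhs = mat_mul(mat_mul(P, xi(W)), P)
print(lhs == xi(set(range(n)) - W))  # True: P Xi_W P = Xi_{V \ W}

# For the determinant: P has eigenvalue 1 - n/r on the all-ones vector and 1 on
# its orthogonal complement, so det(P) = (r - n)/r = -(n - r)/r, matching (3.1).
```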
Next, consider the operation that complements the edge set, sending $d_j$ to $\widetilde d_j=\binom{n-1}{r-1}-d_j$ without changing the edge size. Recall that $\widetilde \beta _j=-\beta _j^\ast$ for each $j$ . Then $|\widetilde \beta _j - \widetilde \beta _k| = |\beta _j^\ast - \beta _k^\ast |$ for all $j,k$ . Note that for any $W\in{\mathcal{S}_r(n)}$ we have
which implies that $A(\widetilde{\boldsymbol{\beta }})=A(\boldsymbol{\beta }^{\ast })$ . In addition,
proving that $(\widetilde{\boldsymbol{{d}}},\widetilde{\boldsymbol{\beta }})$ satisfies (1.5).
The third operation, which complements both the edges and the edge set simultaneously, is just the composition of the first two in either order. Hence the result for this operation follows immediately, completing the proof.
8.2 Proof of Lemma 3.7
The following lemmas will be useful.
Lemma 8.1. ([Reference Higham11, (1.13)]). For $p \in{\mathbb{R}}$ , define
Then, for ${\boldsymbol{{x}}}\in{\mathbb{R}}^n$ ,
Also, for $x\geq 0$ , $|\alpha _{-1/2}(x)| \leq x^{-2}$ and $|\alpha _{1/2}(x)| \leq x^{-1}$ .
For a matrix $X=(x_{jk})$ , $\|X\|_{\textrm{max}}\,:\!=\, \max _{j,k} |x_{jk}|$ is a matrix norm that is not submultiplicative. The following is a special case of a lemma in [Reference Isaev and McKay13].
Lemma 8.2. ([Reference Isaev and McKay13, Lemma 4.9]) Let $M$ be a real symmetric positive definite $n\times n$ matrix with
for some $1\geq \gamma \gt 0$ , $\kappa \gt 0$ and all ${\boldsymbol{{x}}}\in{\mathbb{R}}^n$ . Then the following are true.
-
(a)
\begin{equation*} \| M^{-1} - I\|_{\textrm {max}} \leq \frac {(\kappa +\gamma ) \kappa }{\gamma n}. \end{equation*} -
(b) There exists a real matrix $T$ such that $T^{\mathrm{t}}\! M T=I$ and
\begin{equation*} \|T\|_1, \|T\|_\infty \le \frac {\kappa +\gamma ^{1/2}}{\gamma ^{1/2}}, \ \ \ \ \|T^{-1}\|_1, \|T^{-1}\|_\infty \le \frac {(\kappa +1)(\kappa +\gamma ^{1/2})}{\gamma ^{1/2}}. \end{equation*}
The next result will be used to find a change of basis matrix to invert $A(\boldsymbol{\beta })$ .
Lemma 8.3. Let $\bar{A}=D+\boldsymbol{{s}}\boldsymbol{{s}}^{\mathrm{t}}+X$ be a symmetric positive definite real matrix of order $n$ , where $D$ is a positive diagonal matrix and $\boldsymbol{{s}}\in{\mathbb{R}}^n$ . Define these quantities:
Then there is a real $n\times n$ matrix $T$ such that $T^{\mathrm{t}}\! \bar A T=I$ and the following are true:
-
(a)
\begin{equation*} \|\bar {A}^{-1}-(D+\boldsymbol{{s}}\boldsymbol{{s}}^{\mathrm {t}})^{-1}\|_{\textrm {max}} \leq \frac {B^2\kappa (\kappa +1)} {D_{\textrm {min}}\gamma n}, \text {where} \end{equation*}\begin{equation*} (D+\boldsymbol{{s}}\boldsymbol{{s}}^{\mathrm {t}})^{-1} = D^{-1} - \frac {D^{-1}\boldsymbol{{s}}\boldsymbol{{s}}^{\mathrm {t}} D^{-1}} {1 + \|D^{-1/2}\boldsymbol{{s}}\|_2^2}; \end{equation*} -
(b)
\begin{equation*} \|T\|_1, \|T\|_\infty \leq B D_{\textrm {min}}^{-1/2}\gamma ^{-1/2}(\kappa +1); \end{equation*} -
(c) For any $\rho \gt 0$ , define
\begin{equation*} \mathcal {Q}(\rho )\,:\!=\, U_n(\rho ) \cap \biggl \{ {\boldsymbol{{x}}}\in {\mathbb {R}}^n \mathrel {:} |\boldsymbol{{s}}^{\mathrm {t}}{\boldsymbol{{x}}}| \leq \frac {D_{\textrm {max}}\|\boldsymbol{{s}}\|_1} { D_{\textrm {min}}^{1/2}\|\boldsymbol{{s}}\|_2} \rho \biggr \}. \end{equation*}Then\begin{equation*} T\bigl ( U_n(\rho _1)\bigr ) \subseteq \mathcal {Q}(\rho ) \subseteq T\bigl (U_n(\rho _2)\bigr ), \end{equation*}where\begin{align*} \rho _1\,:\!=\, \frac{1}{B} \, D_{\textrm{min}}^{1/2}\, \gamma ^{1/2}\, (\kappa + 1)^{-1}\rho, \qquad \rho _2\,:\!=\, B D_{\textrm{max}}^{\, 1/2}\, \gamma ^{-1/2}\, (\kappa + 1)^2\rho . \end{align*}
Proof. Define $\boldsymbol{{s}}_1\,:\!=\, D^{-1/2}\boldsymbol{{s}}$ , $X_1\,:\!=\,D^{-1/2}XD^{-1/2}$ , $T_1\,:\!=\,(I+\boldsymbol{{s}}_1\boldsymbol{{s}}_1^{\mathrm{t}})^{-1/2}$ and $X_2\,:\!=\,T_1^{\mathrm{t}}\,X_1\, T_1$ . By Lemma 8.1, we have
and note that $T_1$ is symmetric, that is, $T_1=T_1^{\mathrm{t}}$ . Therefore
Recall that by Lemma 8.1 we have $|\alpha _{-1/2}(\|\boldsymbol{{s}}_1\|_2)|\le \|\boldsymbol{{s}}_1\|_2^{-2}$ , so by (8.3),
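The uses of $\alpha_c$ in this proof are consistent with the rank-one identity $(I+\boldsymbol{{s}}_1\boldsymbol{{s}}_1^{\mathrm{t}})^c = I + \alpha_c(\|\boldsymbol{{s}}_1\|_2)\,\boldsymbol{{s}}_1\boldsymbol{{s}}_1^{\mathrm{t}}$ with $\alpha_c(x) = ((1+x^2)^c-1)/x^2$; the following sketch checks this reading numerically for $c=-1/2$ (the closed form for $\alpha_c$ is our inference from the surrounding steps, not quoted from Lemma 8.1):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
s1 = rng.normal(size=n)

# Inferred closed form: (I + s s^t)^c = I + alpha_c(||s||) s s^t,
# where alpha_c(x) = ((1 + x^2)^c - 1) / x^2.
def alpha(c, x):
    return ((1.0 + x * x) ** c - 1.0) / (x * x)

x = np.linalg.norm(s1)
T1 = np.eye(n) + alpha(-0.5, x) * np.outer(s1, s1)

# T1 should be the inverse square root of I + s1 s1^t ...
M = np.eye(n) + np.outer(s1, s1)
assert np.allclose(T1 @ M @ T1, np.eye(n))

# ... and it maps s1 to (1 + ||s1||^2)^{-1/2} s1, as used later in the proof.
assert np.allclose(T1 @ s1, s1 / np.sqrt(1.0 + x * x))

# The bound |alpha_{-1/2}(x)| <= x^{-2} quoted from Lemma 8.1 holds here.
assert abs(alpha(-0.5, x)) <= x ** -2
print("rank-one identity verified")
```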
Next we apply Lemma 8.2 with $M=I+X_2$ . By (8.4), ${\boldsymbol{{x}}}^{\mathrm{t}} \bar A{\boldsymbol{{x}}} \geq \gamma{\boldsymbol{{x}}}^{\mathrm{t}}(D+\boldsymbol{{s}}\boldsymbol{{s}}^{\mathrm{t}}){\boldsymbol{{x}}}$ is equivalent to
for all ${\boldsymbol{{x}}}\in{\mathbb{R}}^n$ . Also
Therefore, $M,\gamma,\kappa$ satisfy the conditions of Lemma 8.2. Consequently, there exists a transformation $T_2$ such that $T_2^{\mathrm{t}}(I+X_2)T_2=I$. This, together with (8.4), implies that $T=D^{-1/2}T_1T_2$ satisfies $T^{\mathrm{t}} \bar AT=I$. In addition, by Lemma 8.2(b), we have
Together with (8.5) and $\|D^{-1/2}\|_1,\|D^{-1/2}\|_\infty \le D_{\textrm{min}}^{-1/2}$ , this proves part (b).
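The whole construction $T=D^{-1/2}T_1T_2$ can be checked end to end on a small synthetic instance. In this sketch the data $D,\boldsymbol{{s}},X$ are hypothetical, and $T_2$ is taken to be the symmetric inverse square root of $I+X_2$ (one valid choice supplied by Lemma 8.2):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6

# Hypothetical instance Abar = D + s s^t + X with X a small symmetric
# perturbation with zero diagonal, as in Lemma 8.3.
D = np.diag(rng.uniform(1.0, 2.0, size=n))
s = rng.normal(size=n)
X = rng.normal(scale=0.01, size=(n, n))
X = (X + X.T) / 2
np.fill_diagonal(X, 0.0)
Abar = D + np.outer(s, s) + X

# Follow the proof: s1 = D^{-1/2} s, T1 = (I + s1 s1^t)^{-1/2},
# X2 = T1 X1 T1 with X1 = D^{-1/2} X D^{-1/2}.
Dm12 = np.diag(np.diag(D) ** -0.5)
s1 = Dm12 @ s
w, Q = np.linalg.eigh(np.eye(n) + np.outer(s1, s1))
T1 = Q @ np.diag(w ** -0.5) @ Q.T
X2 = T1 @ (Dm12 @ X @ Dm12) @ T1

# One valid T2 from Lemma 8.2: the inverse square root of I + X2.
w2, Q2 = np.linalg.eigh(np.eye(n) + X2)
T2 = Q2 @ np.diag(w2 ** -0.5) @ Q2.T

# Then T = D^{-1/2} T1 T2 normalises Abar, as the proof claims.
T = Dm12 @ T1 @ T2
assert np.allclose(T.T @ Abar @ T, np.eye(n))
print("T^t Abar T = I verified")
```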
Next we prove the first inclusion of part (c). Let ${\boldsymbol{{x}}}\in U_n(\rho _1)$ , that is, $\|{\boldsymbol{{x}}}\|_\infty \leq \rho _1$ . Then $\|T{\boldsymbol{{x}}}\|_\infty \leq \|T\|_\infty \, \rho _1 \leq \rho$ by part (b), so $T{\boldsymbol{{x}}}\in U_n(\rho )$ . Next
From (8.6), $\|T_2{\boldsymbol{{x}}}\|_\infty \leq \gamma ^{-1/2}( \kappa +1)\rho _1$ . Also (8.3) gives $T_1\boldsymbol{{s}}_1=(1+\|\boldsymbol{{s}}_1\|_2^2)^{-1/2}\boldsymbol{{s}}_1$ , so
Combining these bounds proves the inclusion, as $B\ge 1$ .
For the second inclusion of part (c), consider ${\boldsymbol{{x}}}\in \mathcal{Q}(\rho )$ . Lemma 8.1 implies that $T_1^{-1}=I+\alpha _{1/2}(\|\boldsymbol{{s}}_1\|_2) \boldsymbol{{s}}_1\boldsymbol{{s}}_1^{\mathrm{t}}$ , and hence
Now apply (8.6) to $\|T_2^{-1}\|_\infty$ , the first part of the definition of $\mathcal{Q}(\rho )$ to $\|D^{1/2}{\boldsymbol{{x}}}\|_\infty$ , the second part of the definition of $\mathcal{Q}(\rho )$ to $|\boldsymbol{{s}}^{\mathrm{t}}{\boldsymbol{{x}}}|$ , and recall from Lemma 8.1 that $|\alpha _{1/2}(\|\boldsymbol{{s}}_1\|_2)| \leq \|\boldsymbol{{s}}_1\|_2^{-1}$ . Then we have
Finally, we prove part (a). Define $X_3\,:\!=\,(I+X_2)^{-1}-I$ . By (8.4) and since $T_1=T_1^{\mathrm{t}}$ we have $T_1^{\mathrm{t}} D^{-1/2} \bar A D^{-1/2} T_1=I+X_2$ . Together with $T_1=(I+\boldsymbol{{s}}_1\boldsymbol{{s}}_1^{\mathrm{t}})^{-1/2}$ , this implies
By Lemma 8.2(a), $\|X_3\|_{\textrm{max}} \leq \kappa ( \kappa + 1)\gamma ^{-1}n^{-1}$ and thus using (8.5) we have
The expression for $(D+\boldsymbol{{s}}\boldsymbol{{s}}^{\mathrm{t}})^{-1}$ follows from the Sherman–Morrison theorem (see for example [Reference Meyer22, equation (3.8.2)]).
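The Sherman–Morrison expression used above is easy to confirm numerically; the following sketch checks it on hypothetical data, in exactly the form stated in part (a):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5

# Hypothetical positive diagonal D and vector s.
D = np.diag(rng.uniform(0.5, 2.0, size=n))
s = rng.normal(size=n)

# Sherman-Morrison in the form used above:
# (D + s s^t)^{-1} = D^{-1} - D^{-1} s s^t D^{-1} / (1 + ||D^{-1/2} s||_2^2).
Dinv = np.diag(1.0 / np.diag(D))
denom = 1.0 + np.linalg.norm(np.sqrt(Dinv) @ s) ** 2
rhs = Dinv - (Dinv @ np.outer(s, s) @ Dinv) / denom

assert np.allclose(rhs, np.linalg.inv(D + np.outer(s, s)))
print("Sherman-Morrison identity verified")
```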
Proof of Lemma 3.7. Define $\check \varLambda \,:\!=\,\varLambda (\boldsymbol{\beta })$ and
Then let $\boldsymbol{{s}}\,:\!=\, (c,c, \ldots, c)^{\mathrm{t}}$ and $D\,:\!=\,\text{diag}(a_{11}-c^2,\ldots, a_{nn}-c^2)$ . We write $A(\boldsymbol{\beta }) = D + \boldsymbol{{s}}\boldsymbol{{s}}^{\mathrm{t}} + X$ .
First we show that the entries of $X$ are small. Note that all the diagonal entries of $X$ are exactly 0. By Lemma 3.5, the absolute value of any off-diagonal entry in $X$ is at most
Lemma 3.5 also implies that for any $1\le j \le n$ we have
where in the last step we used $r\le n/2$ and $n\ge 16e^{4\delta }$ .
Consider the value of $\gamma$ as in Lemma 8.3. For any ${\boldsymbol{{y}}}\in{\mathbb{R}}^n$ we have
On the other hand, by Lemma 3.5 we have
where the last inequality holds as $r\le n/2$ . Therefore setting $\gamma \,:\!=\, e^{-6\delta }/4$ , we have for any ${\boldsymbol{{y}}}\in{\mathbb{R}}^n$ that ${\boldsymbol{{y}}}^{\mathrm{t}} A{\boldsymbol{{y}}} \ge \gamma \,{\boldsymbol{{y}}}^{\mathrm{t}} (D+\boldsymbol{{s}} \boldsymbol{{s}}^{\mathrm{t}}){\boldsymbol{{y}}}$ . Let $B$ be as in Lemma 8.3. Then
which follows from Lemma 3.5 and (8.8).
For $\kappa$ as in Lemma 8.3, using (8.7), (8.8) and (8.9), we have
Next we consider the matrix $(D+\boldsymbol{{s}}\boldsymbol{{s}}^{\mathrm{t}})^{-1}$ . By Lemma 8.3 we have
and we are interested in an upper bound on the absolute value of the elements of this matrix. First consider the vector $D^{-1} \boldsymbol{{s}}$ and note that
Together with (8.8) this implies that every element in the matrix $D^{-1}\boldsymbol{{s}} \boldsymbol{{s}}^{\mathrm{t}} D^{-1}$ has absolute value at most
Similarly
implying that
and hence
Therefore, by (8.11) and (8.12), every element of $\frac{D^{-1}\boldsymbol{{s}} \boldsymbol{{s}}^{\mathrm{t}} D^{-1}}{1+\|D^{-1/2}\boldsymbol{{s}}\|_2^2}$ has absolute value at most
and thus so do the off-diagonal elements of $(D+\boldsymbol{{s}}\boldsymbol{{s}}^{\mathrm{t}})^{-1}$ . As for the diagonal elements, by (8.8) and (8.13), each has absolute value at most
as $n\ge 16e^{4\delta }\ge 16$ .
Now we have all the information needed to establish a bound on the absolute value of the elements in $A(\boldsymbol{\beta })^{-1}$ using Lemma 8.3(a). In particular, using (8.8), (8.9) and (8.10), the diagonal entries of $A(\boldsymbol{\beta })^{-1}$ have absolute value at most
for some sufficiently large constant $\hat{C}$ . On the other hand, the off-diagonal entries have absolute value at most
The first statement follows by setting $C=32+\hat{C}$ and using the fact that $r\ge 3$ .
Now for the second statement. Substituting (8.8), (8.9) and (8.10) into Lemma 8.3(b) gives
as required.
Now for the last statement of the lemma. For any real $z\ge 0$ , let
Then
Note that
Therefore there exists $z_1=\Omega (1)$ such that $\hat{\rho }(z_1)\le \rho$ and $z_1\leq 1$ . Together with Lemma 8.3, this implies that
where
Similarly there must exist $z_2=O(1)$ such that $\hat{\rho }(z_2)\ge \rho$ and $z_2\geq 1$ . Then by Lemma 8.3 we have
where
This completes the proof.
9. Appendix: useful identities
In this appendix we provide summations that are needed for the calculations in Section 6. We use the notation
and recall that $R_1=0$ . We provide approximations for some expressions, assuming that $(r,{\boldsymbol{{d}}})$ belongs to the first quadrant and $\delta _{\max } = O(d^{3/5})$ . The error bounds are good enough for our applications but are not necessarily tight.
9.1 Summations over all $W\in \mathcal{S}_r(n)$
9.2 Summations over all $W\ni j$
9.3 Summations over all $W\supset \{j,k\}$
Acknowledgement
We would like to thank the anonymous referee for their helpful comments.