1. Introduction
An influential theorem of Szemerédi asserts that sets of positive integers A with positive upper density, meaning that
must contain arbitrarily long arithmetic progressions. Green and Tao [Reference Green and Tao7] famously established a version of Szemerédi’s theorem for the primes. Specifically, writing $\mathcal P:=\{2,3,5,\ldots\}$ for the set of prime numbers, Green and Tao showed that sets $A\subseteq\mathcal P$ satisfying
contain arbitrarily long arithmetic progressions. In particular, the primes themselves contain arithmetic progressions of any finite length.
One can consider configurations other than arithmetic progressions. We call a system of Diophantine equations density regular if it has non-constant solutions over all sets of positive integers with positive upper density. For example, consider a linear homogeneous equation
where $s\geqslant 3$ and $a_1,\ldots,a_s$ are non-zero integers. Roth [Reference Roth20] showed that this equation is density regular if and only if $a_1+\cdots +a_s = 0$. Green [Reference Green6] subsequently proved that Roth’s theorem holds over the primes; Eq. (1.2) has non-constant solutions over any set of primes $A\subseteq\mathcal P$ satisfying (1.1) if and only if $a_1+\cdots +a_s = 0$.
A related, weaker notion of regularity is that of partition regularity, which refers to systems of Diophantine equations, which admit monochromatic non-constant solutions with respect to any finite colouring of the positive integers. By the pigeonhole principle, density regularity implies partition regularity. A foundational result in arithmetic Ramsey theory is Rado’s criterion [Reference Rado18, Satz IV], which completely characterizes partition regularity for finite systems of linear equations. In particular, Rado’s criterion reveals that Eq. (1.2) is partition regular if and only if there exists a non-empty set $I\subseteq\{1,\ldots,s\}$ such that $\sum_{i\in I}a_i = 0$.
Lê [Reference Lê11] observed that Green and Tao’s work provides a characterization of partition regularity for systems of linear homogeneous equations in shifted primes. For single equations, Lê proved that Eq. (1.2) admits monochromatic non-constant solutions with respect to any finite colouring of $\mathcal P+1:=\{p+1:p\in\mathcal P\}$ or $\mathcal P-1$ if and only if there exists a non-empty set $I\subseteq\{1,\ldots,s\}$ such that $\sum_{i\in I}a_i = 0$. Note that there are divisibility obstructions that prevent such a result holding over the set $\mathcal P+c$ for integers $c\notin\{-1,1\}$. For example, the equation $x+y=z$ is partition regular, but if we partition $\mathcal P+c$ into residue classes modulo q for any prime q dividing c, then there are no monochromatic solutions to $x+y=z$.
The purpose of this article is to obtain a complete classification of partition and density regularity over primes for equations in sufficiently many variables of the form
where $a_1,\ldots,a_s$ are non-zero integers, b is an integer, and h is a polynomial with integer coefficients. We say that (1.3) is partition regular over the primes if every finite colouring of the prime numbers produces a monochromatic non-constant solution to (1.3). Similarly, we call (1.3) density regular over the primes if (1.3) has a non-constant solution over any set of primes $A\subseteq\mathcal P$, which satisfies (1.1). For b = 0, observe that Lê’s result [Reference Lê11] asserts that Rado’s condition characterizes partition regularity over primes for (1.3) whenever $h(x)=x\pm 1$.
In our previous work [Reference Chapman and Chow2], we established necessary and sufficient conditions for partition and density regularity (over $\mathbb{N}$) for all equations (1.3) in sufficiently many variables. We observed that it is necessary for partition regularity that h satisfies a certain ‘intersectivity condition’ in order to avoid divisibility obstructions, as alluded to above. For partition regularity over primes, we use the following definition, previously introduced in [Reference Lê12, Reference Rice19]. An integer polynomial h is intersective of the second kind if for each positive integer n, there exists an integer x which is coprime to n such that n divides h(x). Observe that any polynomial h satisfying $h(1)=0$ or $h(-1)=0$ is intersective of the second kind. However, one can construct numerous polynomials, such as $(x^2 - 13)(x^2 - 17)(x^2 - 221)$ and $(x^3 - 19)(x^2 + x + 1)$, which are intersective of the second kind but do not have rational zeros (see [Reference Lê and Spencer13, §2]).
Our main result is the following:
Theorem 1.1 Let $d\geqslant 2$ be an integer. There exists a positive integer $s_0(d)$ such that the following is true. Let h be an integer polynomial of degree d, and let $s\geqslant s_0(d)$ be an integer. Let $a_1,\ldots,a_s$ be non-zero integers, and let b be an integer.
(PR) Equation (1.3) is partition regular over the primes if and only if there exists a non-empty set $I\subseteq\{1,\ldots,s\}$ with $\sum_{i\in I}a_i =0$ and an integer m with $b=(a_{1}+\cdots +a_s)m$ such that $h(x) - m$ is an intersective polynomial of the second kind.
(DR) Equation (1.3) is density regular over the primes if and only if $b=a_1 +\cdots +a_s = 0$.
Furthermore, we have $s_0(2)=5$, $s_0(3)\leqslant 9$, and
Remark 1.2. The integer $s_0(d)$, which is defined explicitly in §2, was previously introduced in [Reference Chapman and Chow2] to study partition and density regularity of equations (1.3).
Remark 1.3. The “only if” parts of both statements are true without any assumption on the number of variables s (see lemma 2.4). The above theorem asserts that these necessary conditions are sufficient provided $s\geqslant s_0(d)$.
Remark 1.4. The conditions in the partition regularity statement are trivially satisfied when $b=a_1+\cdots +a_s = 0$. Indeed, in this case, we can take $I=\{1,\ldots,s\}$ and $m=h(1)$. It then follows that $h(x) - m$ vanishes at x = 1, whence $h(x)-m$ is trivially intersective of the second kind.
In our previous work [Reference Chapman and Chow2], we obtained lower bounds for the number of solutions to (1.3) lying in dense or monochromatic subsets of the positive integers, which are sharp up to the leading constant. Our second main result is the following analogous counting version of theorem 1.1 for subsets of ${\mathcal P}_N:=\{p\leqslant N:p\;\text{prime}\}$.
Theorem 1.5 Let $d\geqslant 2$ be an integer, and let $s_0(d)$ be as given in theorem 1.1. Let h be an integer polynomial of degree d. Let $s\geqslant s_0(d)$ be an integer. Let $a_1,\ldots,a_s$ be non-zero integers, and let b be an integer. Given a set of integers $\mathcal A$, write
(PR) Suppose there exists a non-empty set $I\subseteq\{1,\ldots,s\}$ with $\sum_{i\in I}a_i =0$ and an integer m with $b=(a_{1}+\cdots +a_s)m$ such that $h(x) - m$ is an intersective polynomial of the second kind. Then, for any positive integer r, there exists a positive real number $c_1 = c_1(h;a_1,\ldots,a_s,b;r)$ and a positive integer $N_1 = N_1(h;a_1,\ldots,a_s,b;r)$ such that the following is true for any positive integer $N\geqslant N_1$. Given any r-colouring $\mathcal P_N = \mathcal C_1 \cup\cdots\cup \mathcal C_r$, there exists $k\in\{1,\ldots,r\}$ such that $|S(\mathcal C_k)|\geqslant c_1N^{-d}(N/\log N)^{s}$.
(DR) If $a_1 +\cdots +a_s = b = 0$, then for any positive real number δ > 0, there exists a positive real number $c_2= c_2(h;a_1,\ldots,a_s;\delta)$ and a positive integer $N_2 = N_2(h;a_1,\ldots,a_s;\delta)$ such that the following is true for any positive integer $N\geqslant N_2$. Given any set $A\subseteq\mathcal P_N$ satisfying $|A|\geqslant\delta |\mathcal P_N|$, we have $|S(A)|\geqslant c_2N^{-d}(N/\log N)^{s}$.
1.1. Methods
Ourgoal is to find many monochromatic/dense solutions to
for some linear forms L 1 and L 2, where $L_1(1,\ldots,1) = 0$. Here, we have used the slight abuse of notation $h(\mathbf{x})=(h(x_1),\ldots,h(x_s))$. In Fourier space, after normalization and accounting for small prime moduli (the W-trick), the image of h can be shown to behave like $\mathbb{N}$ for our count. The upshot is that it suffices to count solutions to an equation
where hD is a related polynomial that is intersective of the second kind. This Fourier-analytic transference principle was introduced by Green [Reference Green6] to show that relatively dense sets of primes contain three-term arithmetic progressions and is based on Fourier decay and restriction (from harmonic analysis). The transference argument is sketched in further detail in the next section and formalized in the five sections afterwards. It can be regarded as a version of the Hardy–Littlewood circle method and estimates for prime Weyl sums feature prominently in our work.
To count monochromatic/dense solutions to our linearized equation, we use an arithmetic regularity lemma. This enables us to decompose the indicator functions of our colour classes into three parts, the first of which exhibits quasi-periodic structure and ultimately dominates the count. Using this quasi-periodicity to obtain a large count requires us to show that polynomials evaluated at primes are dense in Bohr sets, in a suitably uniform sense. The colour class that we choose maximizes this density.
Our main novelty is to uniformly bound from below the density of a ‘prime polynomial Bohr set’. This is accomplished by induction on the dimension, beginning with a result of Harman [Reference Harman8] from Diophantine approximation. The latter brings prime Weyl sums into play. The requisite data for these come partly from the analysis in the earlier sections, partly from Lucier’s pioneering work on intersective polynomials [Reference Lucier15], and partly from the investigations of Lê–Spencer [Reference Lê and Spencer14] and Rice [Reference Rice19] into polynomials that are intersective of the second kind.
1.2. Organization
We begin in §2 with some preliminary results. We prove the ‘only if’ parts of theorem 1.1 by establishing necessary conditions for (1.3) to be partition or density regular over the primes. We synthesize both the density and partition statements presented in theorem 1.5 into a single result, theorem 2.6, on counting solutions to certain linear form equations. We also recall the ‘auxiliary intersective polynomials’ of Lucier [Reference Lucier15] and use them to state theorem 2.8, which is a ‘linearized’ version of theorem 2.6. Finally, we provide a sketch of the subsequent transference argument we will use to deduce theorem 2.6 from theorem 2.8.
In §3, we introduce formally the ‘linearization’ procedure, which we will use to infer theorem 2.6 from theorem 2.8. We apply the W-trick and introduce the majorant ν, the latter of which is the focus of our investigations in §5 and §6. To expedite this process, we record in §4 some general results on exponential sums over primes. These will later be used in §5 and §9.
In §5, we study the Fourier transform of our majorant ν by using the Hardy–Littlewood circle method. We follow this in §6 by investigating the restriction properties of ν and a related majorant µD. The Fourier decay and restriction estimates we obtain in these sections are then applied in §7 to execute the transference principle. This completes the deduction of theorem 2.6 from theorem 2.8.
The focus of the final two sections is to prove theorem 2.8 using an arithmetic regularity lemma. In §8, we begin this argument by first modifying theorem 2.8 into a new result (theorem 8.1), which is more amenable to arithmetic regularity methods. This reduces matters to counting primes in ‘polynomial Bohr sets’. Finally, in §9, we prove theorem 8.1 by establishing density estimates for these prime polynomial Bohr sets.
1.3. Notation
Let $\mathbb{N}$ denote the set of positive integers, and write $\mathcal P$ for the set of prime numbers. For each prime p, let $\mathbb{Q}_p$ and $\mathbb{Z}_p$ denote the p-adic numbers and the p-adic integers, respectively. Given a real number X > 0, we write $[X] = \{n\in\mathbb{N}: n\leqslant X\}$ and $\mathcal P_X =\mathcal P\cap[X]$. Set $\mathbb{T} = [0,1]$. For each $d\in\mathbb{N}$ and $\boldsymbol{\alpha}=(\alpha_1,\ldots,\alpha_d)\in\mathbb{R}^d$, we define
For $q \in \mathbb{N}$ and $x \in \mathbb{R}$, we write $e(x) = e^{2 \pi i x}$ and $e_q(x) = e(x/q)$. For $h(x) \in \mathbb{Z}[x]$ and $\mathbf{x} = (x_1,\ldots,x_s)$, where $s \in \mathbb{N}$, we abbreviate $h(\mathbf{x}) = (h(x_1),\ldots,h(x_s))$. If L is a polynomial with integer coefficients, we write $\gcd(L)$ for the greatest common divisor of its coefficients. The letter ɛ denotes a small, positive constant, whose value is allowed to differ between separate occurrences.
We employ the Vinogradov and Bachmann–Landau asymptotic notations: for complex-valued functions f and g, we write $f\ll g$ or $g\gg f$ or $f=O(g)$ if there exists a constant C such that $|f(x)|\leqslant C|g(x)|$ holds for all x. We indicate the dependence of the implicit constant C on some parameters $\lambda_1,\ldots,\lambda_t$ using subscripts, for example, $f\ll_{\lambda_1,\ldots,\lambda_t}g$ or $f=O_{\lambda_1,\ldots,\lambda_t}(g)$. We write $f\asymp g$ if $f\ll g$ and $g\ll f$ both hold. In any statement in which ɛ appears, we assert that the statement holds for all sufficiently small ɛ > 0.
For a finitely supported function $f: \mathbb{Z} \to \mathbb{C}$, the Fourier transform $\hat{f}$ is defined by
Given a real-valued function G, which is bounded over a closed interval $[a,b]$, we write
If G is continuously differentiable on an open interval containing $[a,b]$, then we define
2. Preliminaries
2.1. Useful ingredients
We will make repeated use of the Siegel–Walfisz theorem [Reference Hua10, lemma 7.14], which we now state for convenience. Recall that the logarithmic integral is given by
Theorem 2.1 (Siegel–Walfisz theorem)
Let $P \geqslant 2$, and write $L = \log P$. Let A > 0. Then, there exists $c = c(A) \gt 0$ such that the following is true. Let $n,q,a \in \mathbb{Z}$ with
Then,
We will also make repeated use of [Reference Rice19, lemma 9], which we state below. Owing to an inaccuracy in the published version of Rice’s article, we cite a later arXiv version. This is also explained in a remark immediately following [Reference Rice19, lemma 9]. The fact that C only depends on the degree comes from the proof.
Lemma 2.2. For any integer $k\geqslant 2$, there exists $C = C(k) \gt 0$ such that the following holds. Let $g(x) = a_k x^k + \cdots + a_1 x + a_0 \in \mathbb{Z}[x]$, and let $W, b \in \mathbb{Z}$. Let $a \in \mathbb{Z}$ and $q \in \mathbb{N}$ be coprime. Let ${\omega}(q)$ denote the number of distinct prime factors of q. Let $q = q_1 q_2$, where q 2 is the greatest divisor of q that is coprime to W, and let ${\mathrm{cont}}(g) = \gcd(a_1,\ldots,a_k)$. Then,
We note from its proof that the general epsilon-removal lemma [Reference Salmensuu21, lemma 25] holds with $\| {\alpha} - a/q \|^{\kappa}$ in place of $\| {\alpha} - a/q \|$, and for complex-valued f, with the notation therein. For convenience, we state this version below.
Lemma 2.3. (Epsilon-removal lemma)
Let ${\kappa}, \epsilon \gt 0$, $N \in \mathbb{N}$, $K \geqslant 1$, $u \gt 2/{\kappa}$, and $v \gt u + \epsilon$. Let $\phi: [N] \to [0,\infty)$ and $f: [N] \to \mathbb{C}$ with
Let C be a large, positive constant, and let
Let $\mathfrak M$ be the union of the sets
over coprime integers $0 \leqslant a \leqslant q \leqslant Q$. Assume that
Assume, further, that
and
Then,
Here, $o(K^{-2/\epsilon}N)$ denotes any quantity X such that if c > 0 and N is sufficiently large, then
This quantity may differ between instances.
2.2. Necessary conditions
We now provide necessary conditions for Eq. (1.3) to be partition or density regular over the primes. To state our results, we recall that an integer polynomial h is called intersective (or intersective of the first kind) if, for each $n\in\mathbb{N}$, there exists $x\in\mathbb{Z}$ such that $h(x)\equiv 0\;(\operatorname{mod}{n})$. We call h intersective of the second kind if this statement holds under the additional condition such that an x can be found, which is coprime to n. The following lemma demonstrates that intersectivity is a necessary condition for partition regularity of general polynomial equations.
Lemma 2.4. Let $s\in\mathbb{N}$ and let $F\in\mathbb{Z}[x_1,\ldots,x_s]$. Consider the equation
Proof. Suppose (2.1) is partition regular. Let $n\in\mathbb{N}$ and consider the n-colouring of $\mathbb{N}$ defined by partitioning $\mathbb{N}$ into distinct residue classes modulo n. The existence of a monochromatic solution to (2.1) with respect to this colouring implies that $F(t,\ldots,t)\equiv 0 \;(\operatorname{mod}{n})$ holds for some $t\in[n]$. As n was arbitrary, it follows that $F(x,\ldots,x)$ is intersective.
Now suppose (2.1) is partition regular over the primes. Let $n\in\mathbb{N}$. As before, we partition into residue classes modulo n and infer the existence of $t\in[n]$ and primes $p_1,\ldots,p_s$, which are not all equal, with $p_1\equiv \ldots \equiv p_s\equiv t\;(\operatorname{mod}{n})$ such that $F(p_1,\ldots,p_s)=0$. If we take n to be a prime power, then, since the pi are not all equal, at least one pj is coprime to n, whence t and n are coprime. Applying the Chinese remainder theorem, we conclude that $F(x,\ldots,x)$ is intersective of the second kind.
Finally, suppose (2.1) is density regular or density regular over the primes. Let $m\in\mathbb{N}$. By the Siegel–Walfisz theorem (in the case of density regularity over the primes), for each prime $p\nmid m$, we can find an integer/prime solution $(x_1,\ldots,x_s)$ to (2.1) with $x_1\equiv \dots \equiv x_s\equiv m\;(\operatorname{mod}{p})$. By reducing (2.1) modulo p, we deduce that $F(m,\ldots,m)$ is divisible by infinitely many primes, whence $F(m,\ldots,m)=0$. As m was arbitrary, we conclude that $F(x,\ldots,x)$ is the zero polynomial.
We now apply this lemma to (1.3) to establish the ‘only if’ parts of theorem 1.1. By working modulo $|\mu| n$ for any $n \in \mathbb{N}$, we see that if $\mu \ne 0$ and $\mu h$ is intersective of the second kind, then so too is h. Note also that the following result does not impose any restriction on the number of variables:
Corollary 2.5. Let $s\in\mathbb{N}$ and let h be an integer polynomial of positive degree. Let $a_1,\ldots,a_s$ be non-zero integers, and let b be an integer.
(PR) If (1.3) is partition regular over the primes, then there exist a non-empty set $I\subseteq\{1,\ldots,s\}$ with $\sum_{i\in I}a_i =0$ and an integer m with $b=(a_{1}+\cdots +a_s)m$ such that $h(x) - m$ is an intersective polynomial of the second kind.
(DR) If (1.3) is density regular or density regular over the primes, then $b=a_1 +\cdots +a_s = 0$.
Proof. Throughout this proof, we write $\mu =a_1+\cdots+a_s$ and $H(x) =\mu h(x) - b$. First suppose (1.3) is partition regular over the primes. Applying lemma 2.4 to the polynomial $F(x_1,\ldots,x_s) = a_1x_1 + \cdots + a_sx_s - b$, we find that H(x) is intersective of the second kind. By considering solutions to $H(x)\equiv 0\;(\operatorname{mod}{d})$ for any $d\mid \mu$, we observe that $\mu\mid b$. If µ ≠ 0, then there is a unique $m\in\mathbb{Z}$ with $b=\mu m$ such that $H(x)=\mu(h(x) - m)$, whence $h(x)-m$ is intersective of the second kind. If µ = 0, then b = 0, and so, upon taking $m=h(1)$, we have $b=\mu m$, and $h(x)-m$ is trivially intersective of the second kind. In both cases, Eq. (1.3) becomes
for some $m\in\mathbb{Z}$ such that $h(x)-m$ is intersective of the second kind. As this new equation is partition regular, we infer from [Reference Chapman and Chow2, proposition 2.1] the existence of a set $I\subseteq\{1,\ldots,s\}$ with the desired properties.
Finally, suppose that (1.3) is density regular or density regular over the primes. Then, lemma 2.4 implies that $H(x) = \mu h(x) - b$ is the zero polynomial. Since h has positive degree, we conclude that $b=\mu=0$.
In view of these necessary conditions, theorem 1.1 is now an immediate consequence of theorem 1.5.
2.3. Linear form equations
Having dispensed with the necessary conditions for partition and density regularity, we focus on finding monochromatic or dense solutions to (1.3). The necessary conditions we have established, therefore, inform us that (1.3) takes the shape
where $I\subseteq [s]$ is non-empty with $\sum_{i\in I}a_i=0$, and $m\in\mathbb{Z}$ is such that $b=(a_1+\cdots+a_s)m$ and $h(x)-m$ is intersective of the second kind. Upon replacing h(x) with $h(x)-m$, we can therefore reduce to the case where b = 0 and h(x) is intersective of the second kind.
To find monochromatic or dense solutions to (1.3) with b = 0, we study equations of the form
for some linear forms L 1 and L 2. To avoid trivialities, we only consider non-degenerate linear forms, where $L(\mathbf{x})=a_1x_1 + \cdots + a_sx_s$ is non-degenerate if $a_i\neq 0$ for all $i\in[s]$. For this new equation, the necessary conditions for partition and density regularity become $L_1(1,\ldots,1)=0$. Following the recent works [Reference Chapman and Chow2, Reference Chow, Lindqvist and Prendiville5], we address both density and partition regularity for (2.2) simultaneously by seeking solutions where the xi variables are sourced in a dense subset of $\mathcal P_X$, while the remaining yj variables come from a colour class $\mathcal C_k\subseteq \mathcal P_X$.
Before proceeding to our results, we require some notation. We begin by providing an explicit description of the threshold $s_0(d)$ for the number of variables required in our main theorems. Let $T = T(d) \in \mathbb{N}$ be minimal such that if $h(x) \in \mathbb{Z}[x]$ has degree d, then
has $O_{h,\varepsilon}(X^{2T - d + \varepsilon})$ solutions $\mathbf{x} \in [X]^{2T}$. Equivalently, by orthogonality, $T = T(d)$ is the smallest positive integer such that the moment estimate
holds for any integer polynomial h of degree d. The quantity $s_0(d)$ appearing in theorem 1.1 is now defined to be $s_0(d):= 2T(d) + 1$.
It follows from Hua’s lemma [Reference Hua9, equation (1)] that $T(2) \leqslant 2$ and $T(3) \leqslant 4$. In general, the proof of [Reference Wooley23, corollary 14.7] delivers
These observations verify the bound (1.4) for $s_0(d)$. Finally, by considering solutions with $x_{i}=x_{i+T}$ for $i=1,2,\ldots,T$, we record the lower bounds
We can now state our main result on partition and density regularity over primes for linear form equations (2.2).
Theorem 2.6 Let r and $d\geqslant 2$ be positive integers, and let $0 \lt \delta\leqslant 1$. Let h be an integer polynomial of degree d, which is intersective of the second kind. Let $s \geqslant 1$ and $t \geqslant 0$ be integers such that $s + t \geqslant s_0(d)$. Let
be non-degenerate linear forms such that $L_1(1,\ldots,1) = 0$. Then, there exist
such that the following is true for all $X\geqslant X_0$. Suppose $\mathcal P_X = \mathcal C_1 \cup \cdots \cup \mathcal C_r$. Then, there exists $k \in [r]$ with $|\mathcal C_k|\geqslant\tau_0(\delta) |\mathcal P_X|$ such that if $A \subseteq \mathcal P_X$ satisfies $|A| \geqslant {\delta} |\mathcal P_X|$, then
The implied constant may depend on $h, L_1, L_2, r, \delta$.
Remark 2.7. In the case t = 0, we have a linear form L 2 in zero variables, and we are counting solutions $\mathbf{x}\in A^s$ to the equation
Note that when t = 0, all linear forms L 2 in t variables are vacuously non-degenerate.
By harnessing a combinatorial ‘cleaving’ argument of Prendiville [Reference Prendiville17], we can swiftly deduce theorem 1.5 from theorem 2.6.
Proof of theorem 1.5 given theorem 2.6
Following the argument given at the beginning of this subsection, we may reduce to the case where b = 0 and h is intersective of the second kind. Combining [Reference Chapman and Chow2, lemma 3.2] with (2.3), for N sufficiently large, the number of solutions $\mathbf{x} \in [N]^s$ to (1.3) such that $x_i = x_j$ holds for some i ≠ j is $O_\varepsilon(N^{s-d+\varepsilon-1/2})$. Therefore, by setting t = 0, we see that the density regularity statement in theorem 1.5 follows directly from theorem 2.6. Similarly, given a colouring $\mathcal P_N = \mathcal C_1\cup\cdots\cup\mathcal C_r$, provided N is sufficiently large, it remains to show that there are at least $c_1N^{-d}(N/\log N)^{s}$ monochromatic solutions to (2.2) with
For each δ > 0, let $\tau_0(\delta)\in(0,1)$ be as given in the statement of theorem 2.6. By making minor adjustments, we may assume that $\tau_0(\delta)\leqslant\delta$. Set $\delta_0=1/r$, and for each $i\in[r]$, let $\delta_i:=\tau_0(\delta_{i-1})$, whence $0 \lt \delta_r\leqslant\ldots\leqslant\delta_0 \lt 1$. Take $N\geqslant X_0(\delta_r,h,r,L_1,L_2)$ as in theorem 2.6, and suppose $\mathcal P_N =\mathcal C_1\cup\cdots\cup\mathcal C_r$. For each $0\leqslant i\leqslant r$, let $k_i\in[r]$ be the index given by applying theorem 2.6 with $\delta=\delta_i$. By the pigeonhole principle, we can find $k\in[r]$ and $0\leqslant i \lt j\leqslant r$ such that $k_i=k_j=k$. Therefore,
and
holds for any $A\subseteq\mathcal P_N$ with $|A|\geqslant\delta_j|\mathcal P_N|$. Taking $A=\mathcal C_k$ finishes the proof.
2.4. Auxiliary intersective polynomials
The next step of our argument is to use a version of Green’s Fourier-analytic transference principle [Reference Green6] to obtain solutions to (2.2) by ‘transferring’ solutions from a ‘linearized’ equation. To make this precise, we first need to introduce the auxiliary intersective polynomials of Lucier [Reference Lucier15], which emerge during the execution of this process.
Let h be an integer polynomial of positive degree d, which is intersective of the second kind. Thus, for each prime p, we can find a p-adic unit $z_p \in \mathbb{Z}_p^{\times}$ such that $h(z_p)=0$. Throughout this article, we fix a choice of zp for each prime p and let mp be the multiplicity of zp as a zero of h. For each prime p and positive integer D, let $\mathrm{ord}_p(D)$ denote the largest non-negative integer n such that pn divides D. We can then define the completely multiplicative function
By noting that $1\leqslant m_p\leqslant \mathrm{deg}(h)=d$ for all p, we have
For each prime p and non-negative integer k, reducing zp modulo pk reveals that there exists a unique residue $x\in\mathbb{Z}/p^k\mathbb{Z}$ such that $x\equiv z_p \;(\operatorname{mod}{p^k\mathbb{Z}_p})$. By the Chinese remainder theorem, we can therefore find a unique integer rD in the range $(-D,0]$, which satisfies
for all primes p. As h is intersective of the second kind, we have $(r_D, D) = 1$.
Finally, with this notation in place, we define the auxiliary intersective polynomial
These polynomials and the surrounding notation were introduced by Lucier [Reference Lucier15], who also showed that hD is indeed a polynomial with integer coefficients [Reference Lucier15, lemma 21]. The most important property of these auxiliary polynomials is that the greatest common divisor of the non-constant coefficients of hD is bounded uniformly in D. Specifically, for all $D\in\mathbb{N}$, [Reference Lucier15, lemma 28] states that
As in [Reference Chapman and Chow2, §6], this bound is crucial to our investigation of exponential sums involving intersective polynomials (see §7 and §9).
We can now state our linearized version of theorem 2.6.
Theorem 2.8 Let r and $d\geqslant 2$ be positive integers, and let $0 \lt \delta\leqslant 1$. Let h be an integer polynomial of degree d which is intersective of the second kind. Let $s \geqslant 1$ and $t \geqslant 0$ be integers such that $s + t \geqslant s_0(d)$. Let
be non-degenerate linear forms such that $L_1(1,\ldots,1) = 0$. Then, there exists
such that the following is true. Let $D, Z \in \mathbb{N}$ satisfy $Z \geqslant Z_0$, and set $N:=h_D(Z)$. Suppose
Then, there exists $k \in [r]$ such that if $\mathcal A\subseteq[N]$ satisfies $|\mathcal A|\geqslant\delta N$, then
The implied constant may depend on $h, L_1, L_2, r, \delta$.
Remark 2.9. The quantity η is introduced for technical reasons concerning certain weight functions νD we employ when applying the transference principle. For further details, see the remarks preceding lemma 8.3.
The proof of theorem 2.8 is deferred to the final two sections of this article. As in [Reference Chapman and Chow2], we prove this ‘linearized’ result by applying an arithmetic regularity lemma. To streamline this forthcoming argument, we require the following proposition, which is a minor variation of [Reference Chapman and Chow2, proposition 3.10] and is proved in the same way.
Proposition 2.10. Suppose that theorem 2.8 is true in the cases where $\gcd(L_1)=1$. Then, subject to altering the quantities $Z_0(D,r,\delta,L_1,L_2,P)$, η, and the implicit constant in the final bound, theorem 2.8 holds in general.
2.5. Sketch of the transference argument
In this subsection, we outline how the transference principle allows us to deduce theorem 2.6 from theorem 2.8. Fix an integer polynomial h that is intersective of the second kind, as well as a pair of linear forms L 1 and L 2 as in the statement of theorem 2.6. We begin by recalling that theorem 2.6 concerns the equation
while theorem 2.8 considers, for some parameter $D\in\mathbb{N}$, the ‘linearized’ equation
Suppose we have a finite colouring $\mathcal P_X= \mathcal C_1\cup\cdots\cup \mathcal C_r$ and a set $A\subseteq\mathcal P_X$ with $|A|\geqslant\delta|\mathcal P_X|$. For the convenience of this sketch, assume that $X\equiv r_D \;(\operatorname{mod}{D})$. Choosing $Z\in\mathbb{N}$ such that $X=r_D + DZ$, we can define an r-colouring
by
Let $N:=h_D(Z)$. By pigeonholing, we find a ‘dense’ set $\mathcal A\subseteq[N]$ such that
for some integer b.
Theorem 2.8 now informs us that there are many solutions $({\mathbf{n}},\mathbf{z})\in \mathcal A^s\times\tilde{\mathcal C}_k^t$ to (2.7) for some $k\in[r]$. Given such a solution, our construction of $\mathcal A$ and $\tilde{\mathcal C}_k$ furnishes a solution $(\mathbf{x},\mathbf{y})\in A^s\times\mathcal C_k^t$ to (2.6) satisfying
Since the map $({\mathbf{n}},\mathbf{z})\mapsto(\mathbf{x},\mathbf{y})$ is injective, this argument allows us to obtain many solutions to (2.6). However, observe that the number of solutions to (2.6) given in theorem 2.6 is $ X^{s-d+t+o(1)}, $ which is far fewer than the number of solutions to (2.7) provided by theorem 2.8, namely $X^{ds - d + t + o(1)}$. This shortfall is handled by instead considering weighted counts of solutions to (2.6). Our task is then to construct an appropriate weight function ν, which is supported on the set
The key utility of the transference principle is that, provided our weight function is suitably ‘pseudorandom’, we can find a ‘dense model’ $g:[N]\to[0,1]$ such that $\hat{\nu}\approx\hat{g}$. Applying theorem 2.8 to a set of the form $\mathcal A=\{x\in[N]: g(x) \gt c\}$, our argument above allows us to prove theorem 2.6.
To ensure our weight ν is sufficiently pseudorandom, we have to contend with the fact that the set $h(\mathcal P)$ is not equidistributed in residue classes. This issue prevents one from simply taking ν to be a scaled version of the indicator function of $h(\mathcal P)$. Fortunately, there is a standard technical manoeuvre, known as the W-trick, developed by Green [Reference Green6] to account for equidistribution modulo small primes. In the setting discussed above, this amounts to demanding that our weight ν is supported on a set of the form
for some $W,\kappa\in\mathbb{N}$ such that W is divisible by all primes $p\leqslant w$ for some sufficiently large $w\in\mathbb{N}$. If we choose $D,W,\kappa$ appropriately, then we can ensure that the set
equidistributes over congruence classes modulo p for any prime $p\leqslant w$. The contribution of the remaining primes is then subsumed by the error term emerging from the transference of solutions from the ‘dense model’ g to ν. The appearance of the additional parameter $\kappa\in\mathbb{N}$ here, resulting in a ‘double W-trick’, was the main innovation of our previous work [Reference Chapman and Chow2]. Its purpose is to ensure that ${\lambda}(D)$ precisely accounts for all common divisors of the values of $h(x) - h(b)$, as x ranges over the arithmetic progression b modulo $W {\kappa}$.
3. Linearization and the W-trick
In this section, we execute the ‘double W-trick’ and construct the weight function ν needed for our application of the transference principle. Throughout this section, we fix the parameters
which appear in theorem 2.6.
3.1. The W-trick
Consider a set $A \subseteq \mathcal P_X$ with $|A| \geqslant {\delta} |\mathcal P_X|$. Let $C \in \mathbb{N}$ be large with respect to the fixed parameters, and let $w\in\mathbb{N}$ be large in terms of C. Define
and
Henceforth, we take $X\in\mathbb{N}$ sufficiently large in terms of $C,w,$ and the fixed parameters. We also assume that $D\mid(X-r_D)$, whence $Z \in \mathbb{N}$.
For $R \in \mathbb{N}$ and $b \in [R]$, we write
We denote by $(H,W)_d$ the greatest $m \in \mathbb{N}$ for which $m^d \mid (H,W)$. By [Reference Chapman and Chow2, lemma A.5] and the Siegel–Walfisz theorem, we have
Since w is large relative to C and d, we have $M \lt 2^{50w}$. Hence, if $(h'(b),W)_d \leqslant M$, then there cannot exist a prime $p\leqslant w$, which divides $(h'(b),W)$ with multiplicity greater than 50dw. It follows that if $(h'(b),W)_d \leqslant M$, then $(h'(b),W)\mid V$. By incorporating the crude estimate
we find that
Thus, there exists $b_0 \in [W]$ such that
Define ${\kappa} \in \mathbb{N}$ by
Note that (2.4) implies that κ is w-smooth, whence $\varphi(W)\kappa = \varphi(W\kappa)$. By pigeonholing, we can then find $b \in [W {\kappa}]$ such that
Since $(h'(b),W) = (h'(b_0), W) \mid V$, we also have
3.2. The weight function
Our next task is to construct an appropriately ‘pseudorandom’ weight function. Let $w,W$, and κ be as defined in the previous subsection. Fix some $b\in[W\kappa]$, which satisfies
We then define
Observe that ν is supported on the set
Recall from the previous subsection that we are considering a fixed set $A\subseteq\mathcal P_X$ with $|A|\geqslant\delta|\mathcal P_X|$, and that we made a judicious choice of $b\in[W\kappa]$ so that $|A_{b,W\kappa}|$ is suitably dense. For this specific choice of b, let
Lemma 3.1. Let $\mathcal A,\nu$ be as defined above. If X is sufficiently large in terms of $h,\delta,w$, then
Proof. Let c be a small, positive constant, and let
For X sufficiently large, the Siegel–Walfisz theorem implies that $|A_{b,W{\kappa}}| \geqslant |\Omega_{b,W\kappa,c}(X)|$. It follows that there exists an injective map $\psi:\Omega_{b,W\kappa,c}(X)\to A_{b,W{\kappa}}$ such that $p\leqslant\psi(p)$ for all $p\in\Omega_{b,W\kappa,c}(X)$. This implies that
Invoking the bound $h'(x)\ll x^{d-1}$, we deduce that
By the Siegel–Walfisz theorem again, we thus have
Therefore,
Thus, for our choice of κ and b, the desired bound now follows from the equalities
4. Exponential sums
In this section, we record some results on exponential sums of the form
where F is a real polynomial, and $G:(1,\infty)\to \mathbb{R}$ is a continuously differentiable function. We apply these results in §5 to study the Fourier transform $\hat{\nu}$ of our weight function ν. The results of this section are also used in §9 to establish density bounds for ‘prime polynomial Bohr sets’.
A standard observation in analytic number theory, going back over a century to Hardy and Littlewood, is that such exponential sums can only be large if their phases exhibit ‘major arc’ behaviour. In the case of (4.1), this means that the leading coefficient of the polynomial F must be very close to a rational number with small denominator. To elucidate this further, we record the following lemma from [Reference Hua10], which considers the situation where the leading coefficient of F is rational. In what follows, and throughout this section, for all $k \in \mathbb{N}$, let σk be large in terms of k and put $C_k = 2^{8k} {\sigma}_k$.
Lemma 4.1. Let $m \in \mathbb{N}$ and $b \in \mathbb{Z}$ be coprime. Let $F(y) \in \mathbb{R}[y]$ have degree k, and suppose $a/q$ is its leading coefficient, where $a,q\in\mathbb{Z}$ are coprime and
Assume that P is sufficiently large in terms of m. Then,
Proof. This follows immediately from [Reference Hua10, theorem 10].
Using this lemma, we can show that (4.1) is small when the leading coefficient of F is ‘minor arc’, meaning that it is not well-approximated by a rational number with denominator at most polylogarithmic in P.
Lemma 4.2. Let $m\in\mathbb{N}$ and $b\in\mathbb{Z}$ be coprime. Let $F(y)\in\mathbb{R}[y]$ have degree k, and let θ be its leading coefficient. Let $G:(1,\infty)\to \mathbb{R}$ be a continuously differentiable function. Assume that P is sufficiently large in terms of m, and that
Then,
Proof. By Dirichlet’s approximation theorem, there exist coprime $q \in \mathbb{N}$ and $a \in \mathbb{Z}$ such that
By our assumption, we also have
Thus, ${\beta} := {\theta} - a/q$ satisfies
Let $f(y) = F(y) - {\beta} y^k$. By partial summation [Reference Vaughan22, lemma 2.6], we have
where
We deduce from lemma 4.1 and the trivial bound $|A(t)| \leqslant t$ that
This implies that
It, therefore, remains to estimate
where
The bound (4.2) gives
Similarly, since $|\beta|\leqslant P^{-k}$, we see that
Using the trivial bound $|A(t)|\leqslant t$, we have
Similarly, we deduce from (4.2) that
Combining these estimates completes the proof.
The above two lemmas suffice to handle ‘minor arc’ behaviour. As is typical in applications of the circle method, we treat the major arcs by establishing asymptotic formulae for the exponential sums (4.1).
Lemma 4.3. (General major arc asymptotic)
Let $f(y)\in\mathbb{Z}[y]$ have degree k, and let $G:(1,\infty)\to \mathbb{R}$ be a continuously differentiable function. Let $b \in \mathbb{Z}$ and $m \in \mathbb{N}$ be coprime, and let $Q \in \mathbb{N}$ with
Let ${\theta} \in \mathbb{R}$ and $P \geqslant 2$, and suppose $(q,a) \in \mathbb{N} \times \mathbb{Z}$ with $(a,q) = 1$ and
Let c > 0 be constant, small in terms of Ck, and put ${\beta} = \theta - a/q$. If P is sufficiently large relative to m and Q, then
where
Remark 4.4. The condition (4.3) holds if $f(b+mx)/Q \in \mathbb{Z}[x]$.
Proof. Writing $g(x)\in\mathbb{Z}[x]$ for the integer polynomial appearing in (4.3), if $u \in \mathbb{Z}$, then
This implies that
Hence, for $n \leqslant P$,
By the Siegel–Walfisz theorem (theorem 2.1), the inner sum is
whence
Writing $\psi(t) = e_Q({\beta} f(t)) G(t)\log t$, summation by parts gives
By hypothesis, for P sufficiently large,
Hence, for all $x,y\in[2,P]$ with x < y, the mean value theorem yields
In particular, this shows that
As ${\mathrm{Li}}(t) = \sum_{n=3}^t \int_{n-1}^n \frac{{\,{\rm d}} x}{\log x}$, summation by parts now gives
Note that
Thus, using (4.4) to replace each $\psi(n)$ by $\psi(x)$, we obtain
which completes the proof.
5. Fourier decay
Returning to the study of our weight function ν, we need to show that it is suitably ‘pseudorandom’. This will then allow us to ‘transfer’ solutions from the linearized equation (2.7) to our original equation (2.6). As in [Reference Chapman and Chow2, Reference Chow4, Reference Chow, Lindqvist and Prendiville5, Reference Prendiville17], we accomplish this via a Fourier decay estimate (together with the restriction estimates from the next section).
Lemma 5.1. Let ν be as defined in (3.2), where $b\in[W\kappa]$ satisfies (3.1), and assume that X is sufficiently large in terms of w. Then, for all $\alpha\in\mathbb{T}$,
Remark 5.2. As in [Reference Chapman and Chow2, §5], the above lemma does not rely upon nor make any reference to sets $A\subseteq\mathcal P_X$ or $\mathcal A\subseteq[N]$.
We study the Fourier transform $\hat{\nu}$ using the Hardy–Littlewood circle method and the exponential sum estimates established previously. We define the set of minor arcs
The set of major arcs $\mathfrak M:=\mathbb{T}\setminus\mathfrak m$, therefore, consists of all $\alpha\in\mathbb{T}$ for which there exist $a,q\in\mathbb{Z}$ such that
For convenience, we recall that
We, therefore, observe that the results of the previous section may be applied to estimate $\hat \nu({\alpha})$ upon taking
With this choice of parameters, we compute that
and
Our proof of lemma 5.1 for $\alpha\in\mathfrak m$ proceeds by the same strategy as in [Reference Chow4, §4]: we show that $\hat{\nu}(\alpha)$ and $\hat{1}_{[N]}(\alpha)$ are both far smaller than the required upper bound. This is encapsulated in the following corollary of lemma 4.2.
Corollary 5.3. (Minor arc estimate)
If $\alpha\in\mathfrak m$, then
Proof. In view of (5.3) and (5.4), the first estimate follows immediately from lemma 4.2. For the second estimate, we deduce from the definition of $\mathfrak m$ that
as required.
We similarly establish an asymptotic formula for $\hat{\nu}(\alpha)$ on the major arcs by invoking lemma 4.3. Define
Corollary 5.4. (Major arc asymptotic)
Suppose $({\alpha},q,a) \in \mathbb{R} \times \mathbb{N} \times \mathbb{Z}$ with (5.2), and put ${\beta} = {\alpha} - a/q$. Let c > 0 be constant, small in terms of Cd. Then,
Proof. Recall that $\lambda(D)N = h(X)\asymp_h X^d$. Thus, by combining (5.3) with (5.4) and (5.5), the desired formula is provided by lemma 4.3.
To elucidate this formula further, we estimate $S(q,a)$.
Lemma 5.5. Let $a,q\in\mathbb{Z}$ with $q \geqslant 2$ and $(q,a) = 1$. Then,
Furthermore, if $(q,W) \gt 1$, then $S(q,a) = 0$.
Proof. Write $q = q_1q_2$, where $q_1 \in \mathbb{N}$ is w-smooth and q 2 is w-rough. Observe that
where
by Taylor’s theorem. As $(q_1,q_2) = 1$, a standard calculation reveals that
where
as is noted in the proof of [Reference Rice19, lemma 9]. Observe that $(A_1,q_1) = (A_2,q_2) = 1$.
As q 1 is w-smooth and $(b,W)=1$, we always have $(W{\kappa} x+b, q_1) = 1$. Let
so that $(q_1',W') = 1$. Writing $x = y + q_1'z$ and
gives
As
we must have $(h'(b),W) \mid V \mid {\kappa}$. Thus, for $2 \leqslant j \leqslant d$, we have
Now
For each prime $p \leqslant w$, we have $\mathrm{ord}_p(h'(b)) \lt \mathrm{ord}_p(W)$, and so $\mathrm{ord}_p(v_1) = 0$. Therefore, $(v_1,W) = 1$, so $(H, A_1 v_1 ) =1$, whence
This completes the proof of the assertion that $S(q,a)=0$ whenever $(q,W) \gt 1$.
In view of this result, we may henceforth assume that $q_1=1$ and $q_2 = q \geqslant 2$. In particular, we have q > w. Let us denote by $\ell_h$ the leading coefficient of h. Then
so $(v_d, q) \ll_h 1$. Now lemma 2.2 provides a constant $C=C(d) \gt 1$ such that
as required.
These two results allow us to establish our Fourier decay estimate.
Proof of lemma 5.1
Corollary 5.3 gives (5.1) for all $\alpha\in\mathfrak m$. We henceforth assume that $\alpha\in\mathfrak M$. We begin by considering the case where (5.2) holds with q = 1. As demonstrated in [Reference Chow4, § 4], Euler–Maclaurin summation delivers the bound
Applying the triangle inequality and corollary 5.4, therefore, gives
Finally, suppose that (5.2) holds with $q \geqslant 2$. We note from [Reference Chow4, equation (4.3)] that
Thus, in view of the trivial estimate $|I({\beta})| \leqslant N$, the desired result follows by combining corollary 5.4 with lemma 5.5.
Before moving on, we record the following consequence of corollary 5.4 and lemma 5.5, which we use in the next section.
Corollary 5.6. (Major arc estimate)
Let $\alpha\in\mathbb{T}$ and $q\in\mathbb{N}$. If (5.2) holds for some $a \in \mathbb{Z}$, then
Proof. Integrating by parts delivers the standard estimate
Incorporating the elementary inequality $\varphi(W\kappa)\varphi(q)\leqslant\varphi(W\kappa q)$ and lemma 5.5 delivers
The required result now follows from corollary 5.4.
6. Restriction estimates
Recall from §3 that ν is supported on the set
After linearizing, we wish to solve (2.7) with the ni drawn from a dense subset $\mathcal A$ of the above set. This leads us to the study of functions $\phi:\mathbb{Z}\to\mathbb{C}$, such as the indicator function of $1_{\mathcal A}$, which are majorized by ν, meaning that $|\phi|\leqslant\nu$.
The purpose of this section is to establish two restriction estimates. These will then be used in the next section to execute the transference argument. The first restriction estimate is for the weight ν and is needed to transfer between ‘dense variables’ xi and ni in equations (2.6) and (2.7), respectively. Our second restriction estimate concerns an auxiliary weight function νD and is required for interpolation.
Throughout this section, we define ν as in §3.2 for a fixed choice of $b\in[W\kappa]$ satisfying (3.1). We also let $T = T(d)$ be as in §2.
6.1. Restriction I
We begin with the following restriction estimate for ν.
Proposition 6.1. Let $E \gt 2T$, and let $\phi: \mathbb{Z} \to \mathbb{C}$ with $|\phi| \leqslant \nu$. Then,
This is easily bootstrapped to the following restriction estimate for $\nu + 1_{[N]}$, see the deduction of [Reference Chapman and Chow2, lemma 6.2].
Proposition 6.2. Let $E \gt 2T$, and let $\phi: \mathbb{Z} \to \mathbb{C}$ with $|\phi| \leqslant \nu + 1_{[N]}$. Then,
To prove proposition 6.1, we proceed in stages. We introduce the auxiliary function
noting that $\nu \leqslant (\log X) \mu$ pointwise. We compute that
We begin with an epsilon-slack restriction estimate for µ.
Lemma 6.3. (Epsilon-slack estimate)
Let $\psi: \mathbb{Z} \to \mathbb{C}$ with $|\psi| \leqslant \mu$. Then,
Proof. By orthogonality,
As $\| \psi \|_\infty \leqslant \| \mu \|_\infty \ll X^{d-1}$, and since ψ is supported on
we obtain
Finally, using that $T = T(d)$, we find that
Our next goal is to largely remove ɛ from the exponent in this restriction estimate for µ, obtaining a log-slack estimate by passing to a slightly higher moment. To accomplish this, we require some bounds on
The triangle inequality and partial summation yield
where
Writing $x = W{\kappa} y + b$ gives
where cd is the leading coefficient of h and $f_d(y) \in \mathbb{Z}[y]$ is monic of degree d.
Suppose $X^{1/2} \leqslant P \leqslant X$. The exponential sum
can be treated using Roger Baker’s estimates [Reference Baker1]. We apply the formulation [Reference Chow3, lemma 2.3], noting from its proof that the quantity ${\sigma}(d)$ therein can be replaced by $2^{1-d}$. This delivers the following conclusion.
Lemma 6.4. If $|g_1({\alpha}; P)| \gt (P/(W{\kappa}))^{1 - 2^{1-d} + \varepsilon}$, then there exist coprime $r \in \mathbb{N}$ and $b \in \mathbb{Z}$ such that
Lemma 6.5. Let
If ${\theta} \in \mathbb{T} \setminus \mathfrak n$, then there exist coprime $q \in \mathbb{N}$ and $a \in \mathbb{Z}$ such that
Proof. Let $P \in [X^{1/2}, X]$ maximize
By (6.1),
By lemma 6.4, there exist coprime $r \in \mathbb{N}$ and $b \in \mathbb{Z}$ such that
With
we now have
since X is large in terms of w. Finally, recall that
and
By passing to a slightly higher moment, we are now able to obtain a log-slack analogue of lemma 6.3.
Lemma 6.6. (Log-slack estimate)
Let $v \gt 2T$ be real, and let $\psi: \mathbb{Z} \to \mathbb{C}$ with $|\psi| \leqslant \mu$. Then,
Proof. Inserting lemmas 6.3 and 6.5 into lemma 2.3 gives
Proof of proposition 6.1
Let $v \in (2T, E)$. Applying lemma 6.6 to $\psi = (\log X)^{-1} \phi$ gives the log-slack restriction estimate
We can choose σd to be large in terms of E and v. Thus, in view of corollaries 5.3 and 5.6, the desired result follows from lemma 2.3.
6.2. Restriction II
Define the auxiliary weight function $\nu_D: \mathbb{Z} \to [0,\infty)$ by
Observe that νD is supported on $h_D([Z])\subseteq [N]$. By the Siegel–Walfisz theorem, we have
One can be more precise using partial summation, but we do not need to.
In this subsection, we establish the following restriction estimate for νD.
Proposition 6.7. Let $E \gt 2T$ be real, and let $\phi: \mathbb{Z} \to \mathbb{C}$ with $|\phi| \leqslant \nu_D$. Then,
Our approach is similar to that of proposition 6.1, so we will not repeat all of the details. We introduce the auxiliary function
noting that $\nu_D \ll (\log X) \mu_D$ pointwise. As a special case of [Reference Chapman and Chow2, lemma 6.3], we have the following sharp restriction estimate for µD.
Lemma 6.8. Let $E \gt 2T$, and let $\psi: \mathbb{Z} \to \mathbb{C}$ with $|\psi| \leqslant \mu_D$. Then,
The upshot is that if $v \gt 2T$ and $|\phi| \leqslant \nu_D$, then
To apply the general epsilon-removal lemma, we require major and minor arc bounds for
On the minor arcs, we infer the following analogue of corollary 5.3, by essentially the same argument:
On the major arcs, we have (5.2), for some $q,a \in \mathbb{Z}$. Define
We infer the following analogue of corollary 5.4 by essentially the same proof.
Lemma 6.9. Let c be a small, positive constant, small in terms of Cd. Suppose $({\alpha}, q, a) \in \mathbb{R} \times \mathbb{N} \times \mathbb{Z}$ with (5.2), and put ${\beta} = {\alpha} - a/q$. Then,
It follows from lemma 2.2 that
Thus, by incorporating (5.6), we arrive at the following variant of corollary 5.6:
Equipped with the bounds (6.1), (6.2), and (6.3), proposition 6.7 now follows from the general epsilon-removal lemma (lemma 2.3).
7. The transference principle
In this section, we are finally ready to use transference to deduce theorem 2.6 from theorem 2.8. We start with some notation and a preparatory lemma.
For finitely-supported $f_1, \ldots, f_s, g_1, \ldots, g_t: \mathbb{Z} \to \mathbb{R}$, define
We frequently make use of the abbreviations
Given finite sets of integers A and B, we also write $\Phi(A;g) = \Phi(1_A; g)$, and similarly for the expressions $\Phi(f;B)$ and $\Phi(A;B)$.
Lemma 7.1. (Fourier control)
Let $f_1, \ldots, f_s, g: \mathbb{Z} \to \mathbb{R}$ be finitely supported. If
then
Proof. Following the proof of [Reference Chapman and Chow2, lemma 7.1] yields
Propositions 6.2 and 6.7 now give
Proof of theorem 2.6 given theorem 2.8
Fix ${\delta}, h, r, L_1,$ and L 2. The implied constants are henceforth allowed to depend on all of these parameters. Let $\tilde {\delta}$ be small in terms of the fixed parameters, and let $w \in \mathbb{N}$ be large in terms of them. We insist that $\tilde {\delta}$ is an integer power of 2, and that $\tilde {\delta} \gg 1$, so that dependence on $\tilde {\delta}$ is subsumed by dependence on the fixed parameters.
Given sufficiently large $X\in\mathbb{N}$, we define $D,N,W,Z$ as in §3. For the purposes of proving theorem 2.6, we may assume that $Z\in\mathbb{N}$. Indeed, assuming X is sufficiently large relative to D, any $A\subseteq\mathcal P_X$ with $|A|\geqslant\delta |\mathcal P_X|$ must satisfy $|A\cap[X-D]| \gt (\delta/2)|\mathcal P_X|$. Thus, by replacing $(\delta,A,X)$ with $(\delta/2,A\cap[X-a],X-a)$ for some $a\in[D]$ such that $D\mid(X-a-r_D)$, we can assume that $D\mid(X-r_D)$, whence $Z\in\mathbb{N}$.
Set
By theorem 2.8, there exists $k \in [r]$ such that every $\tilde{\mathcal{A}} \subseteq [N]$ with $|\tilde{\mathcal{A}}| \geqslant \tilde {\delta} N$ satisfies
By a simple counting argument, the number of solutions counted here for any given value of zt is $O(|\tilde C_k| N^{s-1} (Z/\log Z)^{t-1})$, whence $|\tilde{\mathcal{C}}_k| \gg Z/\log Z$. Thus,
Define $\mathcal A$ by (3.3), and define
By lemma 5.1 and the dense model lemma [Reference Prendiville16, theorem 5.1], there exists a function f 0 such that
For $\ell \in [s]$, we write $\mathbf{u}^{(\ell)} = (u_1^{(\ell)}, \ldots, u_s^{(\ell)})$, where
The telescoping identity and lemma 7.1 now give
By lemma 3.1, we have
Since $\hat f(0) - \hat f_0(0) \ll (\log w)^{-3/2}$, and w is large, we also have
Let c be a small, positive constant that depends only on the fixed parameters. Setting
we observe that
Taking c sufficiently small, therefore, allows us to extract the lower bound $|\tilde{\mathcal{A}}| \geqslant \tilde {\delta} N$. Now theorem 2.8 gives $ \Phi(\tilde{\mathcal{A}}; g_k) \gg N^{s+t-1}. $ Since $0\leqslant c1_{\tilde{A}}\leqslant f_0$, it follows that $\Phi(f_0; g_k) \gg N^{s+t-1}$. Taking w sufficiently large, we infer from (7.1) that
Finally, since $N\leqslant h(X)\asymp X^d$, we conclude that
By specifying that $w=O_{\delta,h,r,L_1,L_2}(1)$, this completes the proof.
8. Arithmetic regularity
Our only remaining task is to prove theorem 2.8. We are, therefore, interested in counting solutions to the linearized equation (2.7). We seek a colour class $\mathcal C_k$ such that there are many solutions $({\mathbf{n}},\mathbf{z})$ to (2.7) with ${\mathbf{n}}\in\mathcal C_k^t$ and $\mathbf{z}\in\mathcal A^s$ for some arbitrary dense set $\mathcal A\subseteq [N]$.
Following [Reference Prendiville17] and [Reference Chapman and Chow2], we begin by weakening the statement of theorem 2.8. Rather than assert the existence of a colour class $\mathcal C_k$ which gives many solutions with respect to all dense sets $\mathcal A$, we instead consider a finite collection of dense sets $\mathcal A_1,\ldots,\mathcal A_r\subseteq[N]$ and seek a colour class $\mathcal C_k$ such that, for all $i\in[r]$, there are many solutions to (2.7) with $({\mathbf{n}},\mathbf{z})\in\mathcal A_i^s\times\mathcal C_k^t$. This leads to the following version of theorem 2.8.
Theorem 8.1 Let r and $d\geqslant 2$ be positive integers, and let $0 \lt \delta\leqslant 1$. Let h be an integer polynomial of degree d, which is intersective of the second kind. Let $s \geqslant 1$ and $t \geqslant 0$ be integers such that $s + t \geqslant s_0(d)$. Let
be non-degenerate linear forms such that $L_1(1,\ldots,1) = 0$ and $\gcd(L_1)=1$. Then, there exists $\eta=\eta(d,\delta,L_1,L_2)\in(0,1)$ such that the following is true. Let $D, Z \in \mathbb{N}$ satisfy $Z \geqslant Z_0(D, h, r, {\delta}, L_1, L_2)$, and set $N:=h_D(Z)$. Suppose $\mathcal A_1,\ldots,\mathcal A_r\subseteq[N]$ satisfy $|\mathcal A_i|\geqslant\delta N$ for all $i\in[r]$. If
then there exists $k \in [r]$ such that
The implied constant may depend on $h, L_1, L_2, r, \delta$.
Proof of theorem 2.8 given theorem 8.1
In view of proposition 2.10, it is enough to prove theorem 2.8 under the assumption that $\gcd(L_1)=1$. We claim that theorem 2.8 holds with the same quantities $Z_0(D,h,r,\delta,L_1,L_2)$, $\eta(d,\delta,L_1,L_2)$, and the same implicit constant $C=C(h,L_1,L_2,r,\delta)$ appearing in the final bound. Suppose for a contradiction that this is false. For each $k\in[r]$, we can then find $\mathcal A_k\subseteq[N]$ with $|\mathcal A_k|\geqslant\delta N$ such that
Applying theorem 8.1 to the collection of dense sets $\mathcal A_1,\ldots,\mathcal A_r$ delivers a contradiction.
By taking Z sufficiently large relative to η in theorem 8.1, we may assume that hD is positive and strictly increasing on the real interval $[\eta Z, Z]$. We can then define a function $\mathcal Q_{Z}=\mathcal Q_{Z;\eta,h_D}:\mathbb{Z}\to\mathbb{R}$ by
Notice that $\mathcal Q_Z$ is supported on $h_D([\eta Z, Z])\subseteq [N]$, and the restriction of $\mathcal Q_Z$ to $h_D([\eta Z, Z])$ defines a bijection from $h_D([\eta Z, Z])$ to $[\eta Z, Z]$. Hence, given functions $f_1,\ldots,f_s:\mathbb{Z}\to\mathbb{R}$ supported on $[N]$, and $g:\mathbb{Z}\to\mathbb{R}$ supported on $[\eta Z, Z]$, we have
Lemma 8.2. (Arithmetic regularity lemma)
Let $r\in\mathbb{N}$, σ > 0, and let $\mathcal F:\mathbb{R}_{\geqslant 0}\to\mathbb{R}_{\geqslant 0}$ be a monotone increasing function. Then, there exists a positive integer $K_{0}(r;{\sigma},\mathcal F)\in\mathbb{N}$ such that the following is true. Let $N\in\mathbb{N}$ and $f_{1},\ldots,f_r:[N]\to[0,1]$. Then, there is a positive integer $K\leqslant K_{0}(r;{\sigma},\mathcal F)$ and a phase $\boldsymbol{\theta}\in\mathbb{T}^{K}$ such that, for every $i\in[r]$, there is a decomposition
of fi into functions $f_{\mathrm{str}}^{(i)},f_{\mathrm{sml}}^{(i)},f_{\mathrm{unf}}^{(i)}:[N]\to[-1,1]$ with the following stipulations.
(I) The functions $f_{\mathrm{str}}^{(i)}$ and $f_{\mathrm{str}}^{(i)}+f_{\mathrm{sml}}^{(i)}$ take values in $[0,1]$.
(II) The function $f_{\mathrm{sml}}^{(i)}$ obeys the bound $\lVert f_{\mathrm{sml}}^{(i)}\rVert_{L^{2}(\mathbb{Z})}\leqslant{\sigma}\lVert 1_{[N]}\rVert_{L^{2}(\mathbb{Z})}$.
(III) The function $f_{\mathrm{unf}}^{(i)}$ obeys the bound $\lVert \hat{f}_{\mathrm{unf}}^{(i)}\rVert_{\infty}\leqslant\lVert \hat{1}_{[N]}\rVert_{\infty}/\mathcal F(K)$.
(IV) The function $f_{\mathrm{str}}^{(i)}$ satisfies $\sum_{m=1}^{N}(f_{i}-f_{\mathrm{str}}^{(i)})(m)=0$.
(V) There exists a K-Lipschitz function $F_{i}:\mathbb{T}^{K}\to[0,1]$ such that $F_{i}(x\boldsymbol{\theta} )=f_{\mathrm{str}}^{(i)}(x)$ for all $x\in[N]$.
Proof. This is [Reference Chapman and Chow2, lemma 8.3].
Applying this to a given function f allows us to write $\Phi(f; g)$ as the sum of $\Phi(f_\mathrm{str} + f_\mathrm{sml}; g)$ and $2^s -1$ terms $\Phi(f_1,\ldots,f_s;g)$, where at least one of the fi equals $f_{\mathrm{unf}}$ and the rest are equal to $f_{\mathrm{str}}+f_{\mathrm{sml}}$. As is typical in applications of the arithmetic regularity lemma, we expect the term $\Phi(f_{\mathrm{str}}+f_{\mathrm{sml}};g)$ to provide the main contribution, while the remaining terms should be asymptotically negligible. This prediction is verified by combining Property (III) of lemma 8.2 with our Fourier control result (lemma 7.1).
To carry out this strategy of removing the contribution of $f_\mathrm{unf}$, we need to perform a minor technical manoeuvre. Applying lemma 7.1 requires us to bound the function g appearing in $\Phi(f; g)$ in terms of νD. To achieve a strong asymptotic lower bound for the number of solutions, we desire a bound of the form $g\lVert \nu_D\rVert_{\infty} \ll \nu_D$. This is the method we used in [Reference Chapman and Chow2, lemma 8.4], only with µD in place of νD. However, this relied on the fact that µD is constant on its support, while νD is not. In particular, if g is the indicator function of a colour class, then $g(z)\lVert \nu_D\rVert_{\infty}$ could be asymptotically larger than $\nu_D(z)$ for small z.
To overcome this issue, we restrict attention from $[Z]$ to $[\eta Z, Z]$, for some sufficiently small η > 0. On this latter interval, the function νD does not vary too much. This is made precise by the following lemma, which is a variation of [Reference Chapman and Chow2, lemma 8.12].
Lemma 8.3. Let P be a real polynomial of degree $d\in\mathbb{N}$ with positive leading coefficient. Then, there exists a positive integer $M_0(P)$ such that the following is true. For all $\eta\in(0,1)$, if $x\in\mathbb{R}$ satisfies $x\geqslant \eta^{-1}M_{0}(P)$, then
Proof. Let $\ell_P \gt 0$ be the leading coefficient of P. We can then find $M_0(P)\in\mathbb{N}$ such that
holds for all real $x\geqslant M_0(P)$. In particular, if $x\geqslant \eta^{-1}M_{0}(P)$, then
Lemma 8.4. (Removing $f_{\mathrm{unf}}$)
Let $f:\mathbb{Z}\to[0,1]$ be supported on $[N]$. Let $\eta,{\sigma} \gt 0$, and let $\mathcal F:\mathbb{R}_{\geqslant 0}\to\mathbb{R}_{\geqslant 0}$ be a monotone increasing function. Let $f_{\mathrm{str}},f_{\mathrm{sml}},f_{\mathrm{unf}}$ be the functions obtained upon applying lemma 8.2 to f. Then, for any $g:\mathbb{Z}\to[0,1]$ supported on the set
we have
Proof. Note that $|f|\leqslant 1_{[N]}$ and $\lVert \hat{f}\rVert_{\infty} \leqslant N$. Thus, by using a telescoping identity, as in the derivation of (7.1), lemma 7.1 informs us that
holds for any $G:\mathbb{Z}\to\mathbb{R}$ such that $|G|\leqslant \nu_D$. Taking $G=\xi(g\circ\mathcal Q_Z)$ for some ξ > 0, we deduce that
To complete the proof, it remains to find ξ > 0 with $\xi|g\circ\mathcal Q_Z|\leqslant \nu_D$ such that
Set
Observe that, for all $n=h_D(z)\in B$, we have
By lemma 8.3, if Z is sufficiently large relative to h, η, and D, then
Since $g\circ\mathcal Q_Z$ is supported on B and takes values in $[0,1]$, we conclude that there exists $c \gg 1$ such that $\xi:=c(DZ)^{-1}\varphi(D)N \log Z$ has all the required properties.
9. Prime polynomial Bohr sets
The final step of the proof of theorem 8.1 is to obtain a lower bound for the main term $\Phi(f_{\mathrm{str}}+f_{\mathrm{sml}};g\circ\mathcal Q_Z)$. Using our assumption that the coefficients of L 1 are coprime, Bézout’s lemma provides us with some $\mathbf{v}\in\mathbb{Z}^s$, which depends only on L 1, such that $L_1(\mathbf{v}) =1$. We can therefore write
where we have introduced the auxiliary counting operator
Our goal is to show that there is a large supply of $\mathbf{z}\in\mathbb{Z}^t$ for which $\Psi_\mathbf{z}(f_\mathrm{str} + f_\mathrm{sml})$ is asymptotically as large as possible. This is accomplished by choosing the zi to lie in a set of ‘almost-periods’ for $f_\mathrm{str}$. These are known as polynomial Bohr sets and take the form
for some ρ > 0, $K\in\mathbb{N}$, $\boldsymbol{{\alpha}}\in\mathbb{T}^K$, and $Q(x) \in \mathbb{Z}[x]$.
Lemma 9.1. (Lower bound for $\Psi_{\mathbf{z}}(f_{\mathrm{str}}+f_{\mathrm{sml}})$)
For all δ > 0, there exist positive constants $c_{1}(\delta)=c_{1}(L_1,L_2;\delta)$ and $\eta_0 = \eta_0(d,L_1,L_2,\delta) \lt 1$ such that the following is true. Suppose $f:\mathbb{Z}\to[0,1]$ is supported on $[N]$ and satisfies $\lVert f\rVert_1\geqslant \delta N$. Given ${\sigma} \in (0,1]$ and a monotone increasing function $\mathcal F:\mathbb{R}_{\geqslant 0}\to\mathbb{R}_{\geqslant 0}$, let $f_{\mathrm{str}}$, $f_{\mathrm{sml}}$, K, and θ be as given by applying lemma 8.2 to f. If $\mathbf{z}\in[\eta_0 Z]^{t}$ satisfies
then
In particular, if σ is sufficiently small relative to $(d,L_1,L_2,\delta)$, then
Proof. This is [Reference Chapman and Chow2, lemma 8.13] with $\rho=\sigma/K$.
Recall that we seek solutions to the linearized equation (2.7) with $r_D + Dz_j$ prime for all j. In view of lemma 9.1, we are therefore interested in sets of the form
The main purpose of this section is to establish the following density bounds for these prime polynomial Bohr sets.
Theorem 9.2 Let $K,D,d\in\mathbb{N}$, $0 \lt \rho\leqslant 1$, and let h be an integer polynomial of degree d which is intersective of the second kind. Then, there exists a positive integer $P_1=P_1(D,h,K,\rho)$ and a positive real number $\Delta(\rho)=\Delta(h,K;\rho)\leqslant 1$ such that the following is true for all $P\geqslant P_1$. If $\boldsymbol{{\alpha}}\in\mathbb{T}^K$, then
where $\mathcal B=[P]\cap\mathcal B_h(\boldsymbol{\alpha},\rho)$. Moreover, we may take
We demonstrate the utility of this result by using it to complete the proof of theorem 8.1.
Proof of theorem 8.1 given theorem 9.2
As usual, we fix the parameters
appearing in the statement of theorem 8.1. Unless specified otherwise, we allow all forthcoming implicit constants to depend implicitly on these parameters. Let ${\sigma},\eta\in(0,1)$ and let $\mathcal F:\mathbb{R}_{\geqslant 0}\to\mathbb{R}_{\geqslant 0}$ be a monotone increasing function, all three of which depend only on the fixed parameters. Let $D\in\mathbb{N}$, and assume throughout this proof that $Z\in\mathbb{N}$ and $N:=h_D(Z)$ are sufficiently large with respect to all of these quantities.
For each $i\in[r]$, let $\mathcal A_i\subseteq [N]$ with $|\mathcal A_i|\geqslant\delta N$. Suppose we have an r-colouring
In the notation of the previous section, our goal is to find $k\in\mathbb{N}$ such that
Lemma 8.2 provides us with decompositions
along with a positive integer $K\ll_{\sigma,\mathcal F} 1$ and a phase $\boldsymbol{\theta}\in\mathbb{T}^K$ with the properties described therein. Let $\eta_0=\eta_0(d,L_1,L_2,\delta)$ be as defined in lemma 9.1, and let
By choosing σ sufficiently small, lemma 9.1 and (9.1) inform us that
We now claim that, for an appropriate choice of $\eta \lt \eta_0$, the set Ω satisfies
Assume for the moment that this is true. By the pigeonhole principle, we can choose $k\in[r]$ such that $r|\Omega\cap\mathcal C_k|\geqslant|\Omega|$, whence
Incorporating lemma 8.4 furnishes the bound
for some positively-valued function $c(K) \gt 0$ whose value depends only on the fixed parameters and K. Specifying $\mathcal F:\mathbb{R}_{\geqslant 0}\to\mathbb{R}_{\geqslant 0}$ to be a monotone increasing function which obeys
then finishes the proof of theorem 8.1, subject to our claim.
It remains to establish (9.3). Let
Let $\rho=\sigma/K$, and for each $\xi\in(0,1)$ put
By theorem 9.2 and the Siegel-Walfisz theorem, there exists $\xi\in(0,1)$ with $\xi\gg\Delta(\rho)$ such that
Choosing $\eta = \eta_0\xi/2$, we ensure that the injective function $y\mapsto (y-r_D)/D$ maps $[\xi P,P]$ into $[\eta Z, \eta_0 Z]\subseteq [\eta Z, Z]$ and maps $\mathcal D_\xi$ into Ω. Since $\rho=O_K(1)$, we therefore conclude that
as claimed.
9.1. Exponential sums
To study prime polynomial Bohr sets, we are interested in exponential sums over primes of the form
where $\theta\in\mathbb{T}$ and h is intersective of the second kind. Lê and Spencer [Reference Lê and Spencer14] analysed properties of sums of this form to obtain estimates for the smallest element of a prime polynomial Bohr set (9.2), showing in particular that these sets are always non-empty. Our goal is to obtain a lower bound for the densities of these Bohr sets which does not depend on the choice of phase α.
Following [Reference Lê and Spencer14], our argument begins with the observation that if the prime polynomial Bohr set (9.2) has few elements, then we can construct a corresponding exponential sum which is particularly large. This is elucidated by the following lemma, which is a consequence of a much more general result of Harman [Reference Harman8].
Lemma 9.3. Let $D,K,P\in\mathbb{N}$. Let h be an integer polynomial of degree $d\in\mathbb{N}$ which is intersective of the second kind. Define rD and $\lambda(D)$ as in §2.4. Let $\rho\in (0,1)$, $\boldsymbol{\alpha}\in\mathbb{T}^K$, and
Then, there exists $\mathbf{m}\in\mathbb{Z}^K$ with $0 \lt \lVert \mathbf{m}\rVert_{\infty} \leqslant K\rho^{-1}$ such that
Proof. If $\rho \gt 1/2$, then $\mathcal C$ is empty and both sides of the desired inequality equal zero. If $\rho \leqslant 1/2$, then the result follows from the contrapositive of [Reference Harman8, corollary to lemma 5] and the pigeonhole principle.
For the purpose of proving theorem 9.2, we may assume that the Bohr set $\mathcal B_h(\boldsymbol{\alpha},\rho)$ has too few elements, in a manner that will be clarified in due course. Then we can apply lemma 9.3 to find some $L\ll_K \rho^{-K}$ and $\mathbf{m}\in\mathbb{Z}^K$ of bounded size such that $\theta = \mathbf{m}\cdot\boldsymbol{\alpha}$ satisfies
Our next task is to investigate the consequences of (9.4). As discussed in §4, an exponential sum being large is indicative of the phase θ exhibiting ‘major arc’ behaviour, meaning that θ is well-approximated by a rational number with small denominator. This is made precise in the following lemma.
Lemma 9.4. (Low major arc)
Suppose $L,P\in\mathbb{N}$ and $\theta\in\mathbb{R}$ satisfy (9.4). Assume that P is sufficiently large relative to $D,h$ and L. Then, there exists $q \in \mathbb{N}$ such that
Proof. By (9.4) and lemma 4.2, we can find $r\in\mathbb{N}$ and $b\in\mathbb{Z}$ such that
Let $q \in \mathbb{N}$ and $a \in \mathbb{Z}$ with
Then
Put ${\beta} = \theta - a/q$. Applying lemma 4.3 with $(f,G,b,m,Q) = (h, 1, r_D, D, \lambda(D))$ gives
where
Thus, by (9.4), we have
In light of the trivial bound $|I(\beta)|\leqslant P$, we now have
By [Reference Lucier15, lemma 28], the GCD of the non-constant coefficients of
is $O_h(1)$. Now lemma 2.2 yields
and we conclude that
It remains to bound $\lVert q\theta\rVert$.
Observe that
defines a surjective group homomorphism, and that $\psi^{-1}(r_D)$ is a coset of $\ker(\psi) \leqslant (\mathbb{Z}/Dq\mathbb{Z})^\times$. Therefore,
Pairing this with (9.5) yields
A change of variables gives
and now [Reference Vaughan22, theorem 7.3] yields
Consequently,
and finally,
9.2. Proof of theorem 9.2
Let $K\in\mathbb{N}$. Writing $\mathcal B=\mathcal B_h(\boldsymbol{\alpha},\rho)$ and $\mathcal C=\mathcal C_h(\boldsymbol{\alpha},\rho)$, we deduce from the Siegel–Walfisz theorem (theorem 2.1) that
Suppose $\boldsymbol{\alpha}\in\mathbb{T}^K$ is such that
for some suitably small $c(K) \gt 0$. If no such α were to exist, then theorem 9.2 would hold with $\Delta(h,K,\rho)\gg_K \rho^K$ which, since we demand that $\Delta(h,K,\rho)\leqslant 1$, is stronger than the bound we require. For this choice of α, we infer from lemma 9.3 the existence of some $\mathbf{m}\in\mathbb{Z}^K$ with $0 \lt \lVert \mathbf{m}\rVert_{\infty}\leqslant K\rho^{-1}$ such that
We also have
Choosing c(K) sufficiently small, the triangle inequality furnishes some $L\ll_K \rho^{-K}$ such that (9.4) holds with $\theta=\mathbf{m}\cdot\boldsymbol{\alpha}$. By increasing the value of L if necessary—which only weakens the bound (9.4) that we have obtained—we may assume that $L = C\rho^{-K}$ for some suitably large constant $C=C(h,K) \gt K$ to be specified later. Applying lemma 9.4 supplies us with some $q\in\mathbb{N}$ such that
To complete the proof, beginning from the above deductions, we proceed by induction on K. First, suppose that K = 1 and write $m\in\mathbb{Z}$ in place of $\mathbf{m}\in\mathbb{Z}^K$. As h is intersective of the second kind, any integer $n \equiv r_{qmD} \;(\operatorname{mod}{qmD})$ satisfies
Further, by the Siegel–Walfisz theorem and (9.6), we have
Since $r_{qmD} \equiv r_D \;(\operatorname{mod}{D})$, every prime p appearing in the sum on the left is congruent to rD modulo D. Using (2.5), we have
for each prime p in the sum, whence
Taking C sufficiently large, we deduce that $p \in \mathcal B$. We conclude that
as required.
Now suppose $K\geqslant 2$ and assume the induction hypothesis that theorem 9.2 holds with K − 1 in place of K. Write $\mathbf{m}=(\mathbf{m}',m_K)$ and $\boldsymbol{\alpha}=(\boldsymbol{\alpha}',\alpha_K)$. The induction hypothesis, applied to
informs us that the set
satisfies
Let $a\in\mathbb{Z}$ be such that
For each $p\in\mathcal A$, we have $h(p) \ll_h (P/L^2)^d$ so, by (9.6),
As in the previous case, we have $qm_K\lambda(D) \mid h(p)$, whence
Thus, by the triangle inequality,
By taking C sufficiently large, we find that $p\in\mathcal B$. Therefore
as required.
Acknowledgements
J.C. is supported by the Heilbronn Institute for Mathematical Research. S.C. thanks the University of Bristol for their kind hospitality on various occasions when this work was being discussed. We are grateful to the anonymous referee for their thorough report and numerous constructive comments.