1. Introduction
Szemerédi's famous theorem on arithmetic progressions, which states that any subset of the integers with positive upper density contains arbitrarily long arithmetic progressions, has the following multidimensional generalization due to Furstenberg and Katznelson [Reference Furstenberg and KatznelsonFK78].
Theorem 1.1 Let $X$ be a finite, nonempty subset of $\mathbb {Z}^d$. If $S\subset [N]^d$ contains no nontrivial homothetic copy $a+bX$ of $X$, then $|S|=o(N^d)$.
Here we use the standard notation $[N]:=\{1,\ldots,N\}$. There has been great interest over the past few decades in proving a quantitative version of this theorem with reasonable bounds, i.e. with an upper bound for $|S|$ whose savings over the trivial bound of $N^d$ grows at least as quickly as a finite number of iterated logarithms. Indeed, Gowers has posed the problem of proving such a result on several occasions [Reference GowersGow98a, Reference GowersGow01a, Reference GowersGow01b], and others, such as Graham [Reference GrahamGra97], have asked for bounds for sets lacking particular multidimensional configurations. While reasonable bounds are known in Szemerédi's theorem due to work of Gowers [Reference GowersGow98b, Reference GowersGow01b], none are known in the Furstenberg–Katznelson theorem in general. Furstenberg and Katznelson's original proof, which was via ergodic theory, produces no explicit bounds, while the hypergraph regularity proofs of Nagle, Rödl, Schacht, and Skokan [Reference Nagle, Rödl and SchachtNRS06, Reference Rödl and SkokanRS04], Gowers [Reference GowersGow07], and Tao [Reference TaoTao06] each give a saving over the trivial bound of inverse Ackermann type.
Reasonable bounds in Theorem 1.1 are currently known for only one genuinely multidimensional configuration: two-dimensional corners
(and, thus, their linear images) due to the work of Shkredov [Reference ShkredovShk06a, Reference ShkredovShk06b], who proved that any subset of $[N]\times [N]$ containing no nontrivial corners has size at most $\ll N^2/(\log \log {N})^c$ for some absolute constant $c>0$. No reasonable bounds are known for any two-or-more-dimensional four-point configuration, such as three-dimensional corners,
or axis-aligned squares,
The latter of these two configurations is the topic of a conjecture of Graham [Reference GrahamGra97], which states that any subset $S\subset \mathbb {N}\times \mathbb {N}$ for which $\sum _{(x,y)\in S} {1}/({x^2+y^2})$ diverges must contain an axis-aligned square. Graham also conjectured, more generally, that if $\sum _{(x,y)\in S} {1}/({x^2+y^2})$ diverges, then $S$ must contain a homothetic copy of $[m]\times [m]$ for every positive integer $m$. This is a two-dimensional generalization of the famous and still unresolved conjecture of Erdős that every subset $T\subset \mathbb {N}$ for which $\sum _{n\in T} {1}/{n}$ diverges must contain arbitrarily long arithmetic progressions.
Proving reasonable bounds for sets lacking the four-point configurations (2) and (3) seems to be out of reach. This is because no one has managed yet to prove anything useful about a certain two-dimensional directional uniformity norm that naturally appears in the study of these configurations. Details on this difficulty can be found in the work of Austin [Reference AustinAus13a, Reference AustinAus13b], where he demonstrates how enormously complicated and difficult even 100% and 99% inverse theorems can be for directional uniformity norms.
The purpose of this paper is to identify the first two-dimensional four-point configuration for which reasonable bounds in the multidimensional Szemerédi theorem can be proven, and to prove such bounds in the finite field model setting. We will study the configuration
which, when plotted on a two-dimensional integer grid, takes the shape of the capital letter ‘L’. Because of this, we refer to (4) as an L-shaped configuration, and an L-shaped configuration with $z\neq 0$ as a nontrivial L-shaped configuration.
Theorem 1.2 There exists a natural number $m$ and a constant $C>0$ such that the following holds. Fix a prime $p\geq 11$, and set $N:=p^n$. If $n\geq C$, then all $S\subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ containing no nontrivial L-shaped configurations satisfy
The $m$ obtained in the theorem is huge, so we do not attempt to compute it. The bulk of the size of $m$ comes from our use of a recent quantitative inverse theorem for the $U^{10}$-norm on $\mathbb {F}_p^n$ due to Gowers and Milićević, who in [Reference Gowers and MilićevićGM20] give a rough upper bound for the number of iterated exponentials appearing in their result. Based on this, $m$ is likely at least 24 trillion. The use of this inverse theorem is necessary in our proof, and no amount of care to argue efficiently in the rest of the argument can reduce $m$ by much. Thus, we have not tried to optimize the proof of Theorem 1.2, choosing instead to present the simplest argument that gives a reasonable upper bound.
It is likely that the proof of Theorem 1.2 can be adapted to the integer setting to prove a reasonable bound for subsets of $[N]\times [N]$ lacking L-shaped configurations, with the bound obtained being far more reasonable than the bound in Theorem 1.2. This is because the quantitative aspects of Manners's [Reference MannersMan18] inverse theorem for the $U^{s}$-norm on cyclic groups are better than those of Gowers and Milićević's inverse theorem when $s>4$. It is also likely that Theorem 1.2 can be extended to more general L-shaped configurations with a longer vertical ‘leg’,
in both the finite field model and integer settings. We expect, however, that understanding L-shaped configurations with two longer ‘legs’,
is significantly more difficult, for some of the same reasons that proving reasonable bounds for sets lacking three-dimensional corners or axis-aligned squares seems out of reach.
While progress in proving a quantitative version of the multidimensional Szemerédi theorem has so far been extremely limited, there has been a bit more success in proving reasonable bounds for sets lacking multidimensional configurations with more degrees of freedom than those in Theorem 1.1. Prendiville [Reference PrendivillePre15] has proven reasonable bounds for subsets of $[N]^d$ lacking any sufficiently nondegenerate three- or four-term matrix progression, and one consequence of his work is that any subset of $[N]\times [N]$ containing no four vertices of any square (not necessarily axis-aligned) has size at most $\ll N^2/(\log \log {N})^{c'}$ for some absolute constant $c'>0$.
The remainder of this paper is organized as follows. In § 2, we give a detailed outline of our proof of Theorem 1.2, including statements of the three main components of the density-increment argument: control of the count of L-shaped configurations by directional uniformity norms, obtaining a density increment on a structured set, and pseudorandomizing the structured set obtained previously. After introducing additional technical preliminaries in §§ 3 and 4, we prove these three main components in §§ 5, 6, and 7, respectively. We then carry out the density increment argument in § 8, proving Theorem 1.2.
2. Outline of the proof of Theorem 1.2
We begin this section by introducing the minimum amount of notation and preliminaries needed to understand our proof outline. We will use the standard asymptotic notation $O,\Omega,$ and $o$, along with Vinogradov's notation $\ll,\gg,$ and $\asymp$. For any two quantities $A$ and $B$, the relations $A=O(B)$, $B=\Omega (A)$, $A\ll B$, and $B\gg A$ all mean that $|A|\leq C|B|$ for some absolute constant $C>0$. We will write $O(B)$ to represent a quantity that is $\ll B$ and $\Omega (A)$ to represent a quantity that is $\gg A$. When any of these asymptotic symbols appears with a subscript, the implied constant is allowed to depend on the parameters in the subscript. Since we fix a prime $p$ in Theorem 1.2, the implied constants appearing throughout the paper will sometimes depend on $p$ even though we will not alert the reader to this with a subscript. We will use $\log _m$ to denote the $m$-fold iterated logarithm, so that $\log _1:=\log$ and $\log _{i}:=\log \circ \log _{i-1}$ for all $i>1$, as well as $\exp ^m$ to denote the $m$-fold iterated exponential, so that $\exp ^1=\exp$ and $\exp ^{i}:=\exp \circ \exp ^{i-1}$ for all $i>1$.
We will frequently denote the indicator function of a set $A$ by the letter $A$ itself, so that
For any pair of finite sets $X\subset Y$ with $Y\neq \emptyset$, we denote the density of $X$ in $Y$ by
For any function $f:X\to \mathbb {C}$, we denote the average of $f$ over $X$ by
When $X=\mathbb {F}_p^n$, we will usually drop ‘$\in X$’ and just write $\mathbb {E}_x$ for $\mathbb {E}_{x\in \mathbb {F}_p^n}$. Whenever $f$ satisfies $|f(x)|\leq 1$ for all $x$ in its domain, we say that it is $1$-bounded. Note that the indicator function of any set is $1$-bounded.
For any $f:\mathbb {F}_p^n\to \mathbb {C}$ and $\xi \in \mathbb {F}_p^n$, we define the Fourier coefficient of $f$ at $\xi$ using the normalization
where $e_p(z):= e^{2\pi iz/p}$ and $\cdot$ denotes the usual dot product in $\mathbb {F}_p^n$. With this choice of normalization, the Fourier inversion formula and Parseval's identity read
and
respectively.
Let $H$ be any abelian group and $g:H\to \mathbb {C}$. For any $h\in H$, we define the function $\Delta _hg:H\to \mathbb {C}$ by
and, for any $h_1,\ldots,h_s\in H$, define the $s$-fold iterated differencing operator $\Delta _{h_1,\ldots,h_s}$ by
Note that $\Delta _{h_1,\ldots,h_s}g=\Delta _{h_{\sigma (1)},\ldots,h_{\sigma (s)}}g$ for any permutation $\sigma$ of $\{1,\ldots,s\}$.
Now we can recall the definition of the Gowers uniformity norms.
Definition 2.1 Let $s\in \mathbb {N}$, $H$ be an abelian group, and $f:H\to \mathbb {C}$. The $U^s$-norm of $f$ is defined by
The basic properties of these norms can be found in [Reference TaoTao12]. One such fact needed in the upcoming outline is the inverse theorem for the $U^2$-norm, which is a simple consequence of Fourier inversion and Parseval's identity.
Lemma 2.2 Let $H$ be an abelian group and $f:H\to \mathbb {C}$ be $1$-bounded. If $\|f\|_{U^2(H)}\geq \delta$, then there exists a $\psi \in \widehat {H}$ such that
We will also sometimes need the notion of the $U^2$-norm on an affine subspace $w+V$ of $\mathbb {F}_p^n$, which is defined by $\|f\|_{U^2(w+V)}:=\|f(\cdot -w)\|_{U^2(V)}$. The corresponding inverse theorem for these norms follows from Lemma 2.2.
2.1 A review of Shkredov's argument in the finite field model setting
Before we outline the proof of Theorem 1.2, it will be instructive to review Shkredov's argument for corners (1) in the finite field model setting. A detailed account of the argument can be found in the expositions of Green [Reference GreenGre05a, Reference GreenGre05b].
Shkredov's proof proceeds via a density-increment argument. As in all analytic approaches to Szemerédi's theorem and its generalizations, we begin by defining a multilinear average over the configuration of interest. For $g_0,g_1,g_2:\mathbb {F}_p^n\times \mathbb {F}_p^n\to \mathbb {C}$, set
Then, for any $S\subset \mathbb {F}_p^n\times \mathbb {F}_p^n$, the quantity $\Lambda _{\llcorner }(S,S,S)$ equals the normalized count,
of the number of corners in $S$. Setting $N:=p^n=|\mathbb {F}_p^n|$, we let $\sigma :=|S|/N^2$ denote the density of $S$ in $\mathbb {F}_p^n\times \mathbb {F}_p^n$ and $g_S:=S-\sigma$ denote the balanced function of $S$. It follows from the trilinearity of $\Lambda _{\llcorner }$ that
Since $\Lambda _{\llcorner }(1,S,S)\geq \sigma ^2$ by the Cauchy–Schwarz inequality, if the normalized count of corners in $S$ is far below the $\sim \sigma ^3$ expected for a random set of density $\sigma$, which is the case when $S$ has no nontrivial corners and $N$ is sufficiently large in terms of $\sigma$, then $|\Lambda _{\llcorner }(g_S,S,S)|$ must be large.
It can then be shown, by an appropriate sequence of applications of the Cauchy–Schwarz inequality, that $g_S$ must have large box norm
If $\|g_S\|_{\square }$ is large, it follows by an averaging argument that $S$ has density at least $\sigma +\Omega (\sigma ^{O(1)})$ on a product set $A\times B$ for some large $A,B\subset \mathbb {F}_p^n$.
One may then hope to continue the density-increment argument by proving the following generalization of the result just sketched: if $S$ is a subset of density $\sigma$ of a product set $T=A\times B$ and contains no nontrivial corners, then $S$ has density at least $\sigma +\Omega (\sigma ^{O(1)})$ on a product set $T'$ contained in $T$.
It turns out, however, that the Cauchy–Schwarz argument mentioned previously yields a lower bound on the box norm of large enough size only when $A$ and $B$ are sufficiently Fourier pseudorandom, meaning that their balanced functions $A-|A|/N$ and $B-|B|/N$ both have small $U^2$-norm. The components of the product set just obtained are essentially arbitrary aside from being large. They are, in particular, not guaranteed to be Fourier pseudorandom.
To overcome this difficulty, Shkredov introduced a pseudorandomizing step into his proof. He used an energy increment argument incorporating the $U^2$-inverse theorem to partition $\mathbb {F}_p^n\times \mathbb {F}_p^n$ into products of large affine subspaces of the form
for most of which the sets $(A-u)\cap V$ and $(B-w)\cap V$ are Fourier pseudorandom in $V$. By an averaging argument, there must exist such a product of affine subspaces (5) on which the restrictions of $A$ and $B$ are both sufficiently dense and Fourier pseudorandom, and such that $S$ still has increased density $\sigma +\Omega (\sigma ^{O(1)})$ on the intersection of $T$ with $(u+V)\times (w+V)$.
By passing to this product of cosets and using that corners are preserved by translation and invertible linear transformations of the form $(x,y)\mapsto (Ex,Ey)$, one can then continue the density-increment argument with $\mathbb {F}_p^n\times \mathbb {F}_p^n$ replaced by $\mathbb {F}_p^{n'}\times \mathbb {F}_p^{n'}$, where $n'=\dim {V}$. If $S\subset T$ contains no nontrivial corners and $A$ and $B$ are sufficiently Fourier pseudorandom, then $g_S$ must have large box norm localized to $T$. One must then prove that $S$ has a further density increment on a product set contained in $T$, which is, fortunately, of exactly the same difficulty whether $T=\mathbb {F}_p^n\times \mathbb {F}_p^n$ or some other large product set. By applying the pseudorandomizing procedure to the factors of the product set just produced, one can then deduce that if $S$ is a subset of density $\sigma$ of a product set $T=A\times B$, where $A$ and $B$ are large and sufficiently Fourier pseudorandom, and $S$ contains no nontrivial corners, then $S$ has density at least $\sigma +\Omega (\sigma ^{O(1)})$ on a product set $T'=A'\times B'$ contained in $T$, where $A'$ and $B'$ are also large and sufficiently Fourier pseudorandom. The density increment iteration can be carried out repeatedly to produce a good bound for subsets of $\mathbb {F}_p^n\times \mathbb {F}_p^n$ lacking corners.
2.2 An outline of our argument
The obstructions to uniformity for L-shaped configurations are not just (skew) product sets, as was the case for corners, but also very general sets of the form
where each $u_x+V_x$ is an affine subspace of $\mathbb {F}_p^n$. For example, assume that $n\geq 3$, and consider the set
This set has density
in $\mathbb {F}_p^n\times \mathbb {F}_p^n$, but
L-shaped configurations, in contrast to the $\sim N^3/p^4$ expected in a random subset of $\mathbb {F}_p^n\times \mathbb {F}_p^n$ of density $1/p$. Similarly, the number of L-shaped configurations in the sets
and
where $\phi (x)\in \mathbb {F}_p^n$ and $u(x)\in \mathbb {F}_p$ are now chosen uniformly at random, is also ${\sim }N^3/p^3$ with high probability, while the sets have density ${\sim }1/p$ with high probability. These new sorts of obstructions to uniformity are the main reason why the study of L-shaped configurations is significantly more difficult than that of corners, and must be taken into account to prove Theorem 1.2.
For any functions $g_0,g_1,g_2,g_3:\mathbb {F}_p^n\times \mathbb {F}_p^n\to \mathbb {C}$, we define
so that $\Lambda (S,S,S,S)$ equals the normalized count of L-shaped configurations in any subset $S$ of $\mathbb {F}_p^n\times \mathbb {F}_p^n$. The multilinearity of $\Lambda$ implies that
where, as before, $g_S=S-\sigma$ is the balanced function of $S$. Thus, if the normalized count of L-shaped configurations in $S$ is far from the random normalized count $\sigma ^4$, one of $|\Lambda (1,1,g_S,S)|$, $|\Lambda (1,g_S,S,S)|$, or $|\Lambda (g_S,S,S,S)|$ must be large. In particular, when $S$ contains no nontrivial L-shaped configurations and $N$ is sufficiently large in terms of $\sigma$, one of these quantities will be larger than $\sigma ^4/2$. It then follows from several applications of the Cauchy–Schwarz inequality that one of the following directional uniformity norms of $g_S$ must be larger than $\sigma ^4/2$:
or
Here $\|\cdot \|_{\star _3}$ is only a semi-norm, while $\|\cdot \|_{\star _1}$ and $\|\cdot \|_{\star _2}$ are genuine norms. Since these are all Gowers box norms, one can find a proof that they are (semi-)norms in Appendix B of [Reference Green and TaoGT10]. The norm $\|\cdot \|_{\star _1}$ had been studied previously, in the setting of cyclic groups, in work of Shkredov [Reference ShkredovShk09].
Directional uniformity norms with two differencing parameters,
for fixed nonzero $v_1,v_2\in H\times H$, are well-understood. Either $v_1$ and $v_2$ are scalar multiples of each other, in which case the norm is just the $U^2$-norm on $\langle v_1\rangle$ averaged over cosets of $\langle v_1\rangle$, or they are linearly independent, as in the definition of $\|\cdot \|_{\star _2}$, in which case the norm is, after a change of variables, equivalent to the two-dimensional box norm. Directional uniformity norms with three differencing parameters,
for fixed nonzero $v_1,v_2,v_3\in H\times H$ analogously fall into one of three cases: either $v_1,v_2,$ and $v_3$ are collinear, lie on exactly two lines, or are in general position. In the first case, the norm is just the $U^3$-norm on $\langle v_1\rangle$ averaged over cosets of $\langle v_1\rangle$. In the third case, the norm is linearly equivalent to the intractable norm that arises in the study of three-dimensional corners and axis-aligned squares. The norm $\|\cdot \|_{\star _1}$ we encounter falls into the second case, and the study and fruitful use of this norm turns out to be possible (though still complicated) due to its structure as a ‘$U^1\times U^2$-norm’.
The upshot is that if $S$ contains no nontrivial L-shaped configurations, then it must have density at least $\sigma +\Omega (\sigma ^{O(1)})$ on a set of the form
where $\Phi \subset A\times \mathbb {F}_p^n$ is of the form
for some element $u\in \mathbb {F}_p^n$ and collection of subspaces $\{V_x:x\in A\}$ of $\mathbb {F}_p^n$, where $A,B,C,D\subset \mathbb {F}_p^n$ are large and $\operatorname {codim} V_x$ is small for each $x\in A$. Note that this set $\Phi$ is not quite as general as the one appearing at the very beginning of this subsection, as the element $u$ of $\mathbb {F}_p^n$ does not vary with $x$. It takes some extra work to show that we can guarantee $\Phi$ to be of this special form, which turns out to be necessary for our density-increment iteration. We say more about this point in § 6.
We would like to continue the density-increment iteration and show that $S':= T\cap S$, which also lacks L-shaped configurations, has a further density increment of at least the same size as the first on a subset $T'$ of $T$ of the same general form (11). Analogously to Shkredov's argument for corners, we can only hope to do this if $A,B,C,D,$ and $\Phi$ are sufficiently pseudorandom, for some appropriate notions of pseudorandomness. We will need to control the count of L-shaped configurations by the norms $\|\cdot \|_{\star _1}$, $\|\cdot \|_{\star _2}$, and $\|\cdot \|_{\star _3}$ defined in (8), (9), and (10) with no loss of density factors, i.e. show that
and
and also obtain a density increment with no loss of density factors when some localized norm $\|\cdot \|_{\star _i}$ of the balanced function of a set is large, i.e. show that if
where now $g_S:=S-\sigma T$, then there exists a subset $T'\subset T$ of the same general form,
as $T$ on which $S$ has a density increment
depending only on $\delta$. Such results are needed so that the density increment obtained at each step of the iteration is independent of the step. If one is not sufficiently careful, it is easy to end up with a density increment that gets smaller as the subset $T$ of $\mathbb {F}_p^n\times \mathbb {F}_p^n$ gets sparser, which is not enough to close the density increment iteration.
To carry out these arguments, we will need $A,B,C,$ and $D$ to be pseudorandom with respect to the $U^{10}(\mathbb {F}_p^n)$-norm. The situation for $\Phi$ is more complicated, and deciding on a good measure of pseudorandomness for $\Phi$ that is amenable to a Shkredov-like pseudorandomization procedure and can also be used to analyze the various averages appearing throughout our argument is one of the challenges of the proof of Theorem 1.2. A suitable condition on $\Phi$ turns out to be that it is pseudorandom with respect to the $U^{8}(\mathbb {F}_p^n\times \mathbb {F}_p^n)$-norm. This condition is not, on its own, immediately useful in the arguments of §§ 5 and 6, since the various averages that appear are not controllable by the $U^{8}(\mathbb {F}_p^n\times \mathbb {F}_p^n)$-norm of $\Phi$. It takes a bit of work to show that it implies a roughly equivalent statement about the typical codimensions of certain affine subspaces obtained from $\Phi$. We prove this in § 4, deriving some new results on the combinatorics of approximate polynomials along the way.
The proof of the implications (13), (14), and (15) when $A,B,C,D,$ and $\Phi$ are sufficiently pseudorandom consists of many careful applications of the Cauchy–Schwarz inequality, along with appeals to standard facts about the number of linear configurations of bounded Cauchy–Schwarz complexity in products of pseudorandom sets intersected with subspaces of bounded codimension. We carry out this argument in § 5.
Obtaining a large enough density increment when $\|g_S\|_{\star _1}$ is large for $S\subset T$ requires some new ideas and a significant amount of extra work beyond the proof of the non-localized case, in contrast to the situation for the box norm localized to product sets, where the argument is the same as the non-localized case. In order to get such a density increment that only depends on $\delta$ and not on the densities of $A,B,C,D,$ or $\Phi$, one of the key ingredients is a density-preserving inverse theorem for the $U^2(\Phi (x,\cdot ))$-norms on pseudorandom sets derived from $A,B,C,$ and $D$, which we prove using a version of the transference principle. We carry out this argument in § 6.
As was the case for corners, the sets $A',B',C',D',$ and $\Phi '$ obtained in the previous paragraph are not guaranteed to be pseudorandom. We must also carry out a pseudorandomizing procedure to locate a product of large affine subspaces of the form $(u+V)\times (w+V)$ on which $A',B',C',D',$ and $\Phi '$ are sufficiently pseudorandom and $S$ still has a large density increment on $T'\cap [(u+V)\times (w+V)]$. Our pseudorandomization procedure is similar to Shkredov's, but with some new complications coming from our desire for $A',B',C',$ and $D'$ and $\Phi '$ to be pseudorandom with respect to the $U^{10}(\mathbb {F}_p^n)$- and $U^{8}(\mathbb {F}_p^n\times \mathbb {F}_p^n)$-norms, respectively, and from $\Phi '$'s particular structure as a union of affine subspaces in the second factor of $\mathbb {F}_p^n\times \mathbb {F}_p^n$. To handle the first complication, we use a recent quantitative inverse theorem of Gowers and Milićević [Reference Gowers and MilićevićGM20] for the $U^{s}$-norms on vector spaces over finite fields, combined with a result of Cohen and Tal [Reference Cohen and TalCT15] that allows us to partition $\mathbb {F}_p^n$ into large affine subspaces on which any finite collection of bounded degree polynomials are all constant. The structure of $\Phi '$ has the potential to cause issues in a Shkredov-like pseudorandomization argument, since the intersection of $\Phi '$ with a cell may no longer be the union of affine subspaces all having the same dimension. We will explain how this complication is dealt with in § 7, since it requires a bit of set up.
2.3 Key intermediate results
We finish this section by stating the key intermediate results needed to prove Theorem 1.2 that we just described in the outline. Recall that $g_S=S-\sigma$ denotes the balanced function of $S$.
Lemma 2.3 (Estimation of $\Lambda (T,T,T,S)$)
There exist absolute constants $0< c_1<1< c_2$ such that the following holds. Let $d$ be a nonnegative integer, and set $\rho :=p^{-d}$. Suppose that $A,B,C,D\subset \mathbb {F}_p^n$ have densities $\alpha,\beta,\gamma,\delta$, respectively, and that $\Phi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ takes the form
where each $V_x$ is a subspace of $\mathbb {F}_p^n$ of codimension $d$. Define $T\subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ by (11) and suppose that $S\subset T$ has density $\sigma$ in $T$. Let $\varepsilon \leq c_1(\sigma \alpha \beta \gamma \delta \rho )^{c_2}$ and assume that
Then
As a consequence, we get that if $\varepsilon$ is small enough, $n$ is large enough, and $S\subset T$ has no nontrivial L-shaped configurations, then
Lemma 2.4 (Control by $\|\cdot \|_{\star _i}$ norms)
Let $d$ be a nonnegative integer, and set $\rho :=p^{-d}$. Suppose that $A,B,C,D\subset \mathbb {F}_p^n$ have densities $\alpha,\beta,\gamma,\delta$, respectively, and that $\Phi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ takes the form
where each $V_x$ is a subspace of $\mathbb {F}_p^n$ of codimension $d$. Assume that
Define $T\subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ by (11) and suppose that $f_0,f_1,f_2,f_3:\mathbb {F}_p^n\times \mathbb {F}_p^n\to \mathbb {C}$ are $1$-bounded functions supported on $T$. Then
and
Thus, if $\varepsilon$ is small enough, $n$ is large enough, and $S\subset T$ has no nontrivial L-shaped configurations, then one of
is $\gg \sigma ^{O(1)}$.
Theorem 2.5 ($\|g_S\|_{\star _1}$, $\|g_S\|_{\star _2}$, or $\|g_S\|_{\star _3}$ large implies a density increment)
There exist absolute constants $0< c_1<1< c_2,c_3$ such that the following holds. Let $d$ be a nonnegative integer, and set $\rho :=p^{-d}$. Suppose that $A,B,C,D\subset \mathbb {F}_p^n$ have densities $\alpha,\beta,\gamma,\delta$, respectively, and that $\Phi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ takes the form
where each $V_x$ is a subspace of $\mathbb {F}_p^n$ of codimension $d$. Let $\sigma,\tau >0$ and
and assume that
Define $T\subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ by (11) and assume that $S\subset T$ has density $\sigma$ in $T$. Suppose that
or
Then, $S$ has density at least $\sigma +\Omega (\tau ^{O(1)})$ on a subset $T'\subset T$ of the form
where the densities of $A',B',C',D'\subset \mathbb {F}_p^n$ are all $\gg (\sigma \tau \alpha \beta \gamma \delta \rho )^{O(1)}$, and the set $\Phi '\subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ takes the form
where each $V_x'$ is a subspace of $\mathbb {F}_p^n$ of codimension $d+1$.
The first three lemmas combined tell us that if $S$ has density $\sigma$ and contains no nontrivial L-shaped configurations, then one can find a subset $T$ of $\mathbb {F}_p^n\times \mathbb {F}_p^n$ of the form (11) on which $S$ has density at least $\sigma +\Omega (\sigma ^{O(1)})$. The next lemma tells us that, after restricting to a product of large affine subspaces, we can get this same conclusion with $A,B,C,D,$ and $\Phi$ as pseudorandom as we need, which will allow us to continue the density-increment iteration.
Lemma 2.6 There exist absolute constants $0< c_1<1< c_2,c,c'$ such that the following holds. Let $d$ be a nonnegative integer, and set $\rho :=p^{-d}$. Let $\varepsilon '>0$, and suppose that $A,B,C,D\subset \mathbb {F}_p^n$ have densities $\alpha,\beta,\gamma,\delta$, respectively, and that $\Phi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ and takes the form
where each $V_x$ is a subspace of $\mathbb {F}_p^n$ of codimension $d$. Define $T\subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ by (11), and assume that $S\subset T$ has density $\sigma +\tau$ in $T$, as well as that
Then there exists a subspace $V\leq \mathbb {F}_p^n$ of dimension
$u,w\in \mathbb {F}_p^n$, and $0\leq i\leq d$ such that, on setting $\mathcal {C}=(u+V)\times (w+V)$:
– $B':= B\cap (w+V)$;
– $C':=C\cap (u+w+V)$;
– $D':= D\cap (2u+w+V)$;
– $\Psi ':=\Phi \cap \mathcal {C}$ and $\Phi ':=\{(x,y)\in \Psi ':\mathbb {E}_{z\in w+V}\Psi '(x,z)=p^{-i}\}$;
– $A':=\{x\in u+V:\mathbb {E}_{z\in w+V}\Phi '(x,z)\neq 0\}$;
– $\alpha '=\mu _{u+V}(A')$;
– $\beta ':=\mu _{w+V}(B')$;
– $\gamma ':=\mu _{u+w+V}(C')$;
– $\delta ':=\mu _{2u+w+V}(D')$;
– $\rho ':=p^{-i}$; and
– $T':=\{(x,y)\in \mathcal {C}:B'(y)C'(x+y)D'(2x+y)\Phi '(x,y)=1\}$;
we have
$\alpha ',\beta ',\gamma ',\delta '\gg \tau \mu _{\mathbb {F}_p^n\times \mathbb {F}_p^n}(T)/4$, and
By combining the previous four lemmas and using that L-shaped configurations are preserved by translation and invertible linear transformations of the form $(x,y)\mapsto (Ex,Ey)$, we thus deduce the following density-increment lemma, which we will iterate in § 8 to prove Theorem 1.2.
Lemma 2.7 There exist absolute constants $0< c_1<1< c_2,c_3,c_4,c,c'$ such that the following holds. Let $d$ be a nonnegative integer, and set $\rho :=p^{-d}$. Suppose that $A,B,C,D\subset \mathbb {F}_p^n$ have densities $\alpha,\beta,\gamma,\delta$, respectively, and that $\Phi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ takes the form
where each $V_x$ is a subspace of $\mathbb {F}_p^n$ of codimension $d$. Define $T$ by (11) and let $S\subset T$ have density $\sigma$ in $T$. Let $\varepsilon \leq (\sigma \alpha \beta \gamma \delta \rho )^{c_2}\exp (-(64/\sigma )^{c_3})$, and assume that
Let $\varepsilon '>0$, and suppose that $S$ has no nontrivial L-shaped configurations. Then either:
(i) $n<\exp ^2( {c_4\exp ^c(c'/\varepsilon ')}/{(\sigma \alpha \beta \gamma \delta \rho )^{c_2}})$; or
(ii) there exists natural numbers $n'$ and $d'$ satisfying
\[ n'\gg n^{c_1^{\exp^c(c'/\varepsilon')/(\sigma\alpha\beta\gamma\delta\rho)^{c_2}}} \]and $0\leq d'\leq d+1$, subsets $A',B',C',D'\subset \mathbb {F}_p^{n'}$ of densities $\alpha ',\beta ',\gamma ',\delta '$, respectively, a subset $\Phi '\subset \mathbb {F}_p^{n'}\times \mathbb {F}_p^{n'}$ of the form\[ \Phi'=\{(x,y)\in A'\times\mathbb{F}_p^{n'}:y\in u'+V_x'\}, \]where each $V_x'$ is a subspace of $\mathbb {F}_p^{n'}$ of codimension $d'$ (so that $\Phi '$ has density $\alpha '\rho '$, where $\rho ':=p^{-d'}$), and a subset $S'\subset T'$, where\[ T':=\{(x,y)\in\mathbb{F}_p^{n'}\times\mathbb{F}_p^{n'}:B'(y)C'(x+y)D'(2x+y)\Phi'(x,y)=1\}, \]of density at least $\sigma +\Omega (\sigma ^{O(1)})$ in $T'$, such that\begin{gather*} \|A'-\alpha'\|_{U^{10}(\mathbb{F}_p^{n'})},\quad \|B'-\beta'\|_{U^{10}(\mathbb{F}_p^{n'})},\quad \|C'-\gamma'\|_{U^{10}(\mathbb{F}_p^{n'})},\\ \|D'-\delta'\|_{U^{10}(\mathbb{F}_p^{n'})},\quad \|\Phi'-\alpha'\rho'\|_{U^8(\mathbb{F}_p^{n'}\times\mathbb{F}_p^{n'})}<\varepsilon', \end{gather*}$\alpha ',\beta ',\gamma ',\delta '\geq (\sigma \alpha \beta \gamma \delta \rho )^{c_2}$, and $S'$ contains no nontrivial L-shaped configurations.
3. Additional preliminaries
In this section, we present some more preliminaries that were not needed for the outline of the proof of Theorem 1.2, but will be convenient to have for the proof itself. We begin with the notion of Cauchy–Schwarz complexity, first defined by Green and Tao in [Reference Green and TaoGT10].
Definition 3.1 (Cauchy–Schwarz complexity)
Let $\psi _1,\ldots,\psi _d:(\mathbb {F}_p^n)^r\to \mathbb {F}_p^n$ be a collection of linear forms in $r$ variables. We say that $\psi _1,\ldots,\psi _d$ has Cauchy–Schwarz complexity at most $s$ if, for every $j\in [d]$, there exists a partition of $\{\psi _1,\ldots,\psi _d\}\setminus \{\psi _j\}$ into at most $s+1$ subsets such that $\psi _j$ is not contained in the linear span of any of the subsets.
The smallest $s$ such that $\{\psi _1,\ldots,\psi _d\}$ has Cauchy–Schwarz complexity at most $s$ is called the Cauchy–Schwarz complexity of $\{\psi _1,\ldots,\psi _d\}$.
For example, four term arithmetic progressions,
have Cauchy–Schwarz complexity $2$.
Any system of linear forms of complexity at most $s$ can be shown to be controlled by the $U^{s+1}$-norm using repeated applications of the Cauchy–Schwarz inequality. In particular, carrying out the proof of the generalized von Neumann theorem of Green and Tao in [Reference Green and TaoGT10] in the finite field model setting (where the technical details are much simpler) produces the following useful result.
Theorem 3.2 Let $\psi _1,\ldots,\psi _d:(\mathbb {F}_p^n)^r\to \mathbb {F}_p^n$ be a collection of linear forms in $r$ variables with Cauchy–Schwarz complexity at most $s$. For any $1$-bounded functions $f_1,\ldots,f_d:\mathbb {F}_p^n\to \mathbb {C}$, we have
Further, if all of $f_1,\ldots,f_d$ are supported on a set $A\subset \mathbb {F}_p^n$ of density $\alpha$ that satisfies $\|A-\alpha \|_{U^{s+1}(\mathbb {F}_p^n)}<\alpha \varepsilon$, then
We will use the following immediate corollary of Theorem 3.2 numerous times throughout the proof of Theorem 1.2.
Corollary 3.3 Let $\psi _1,\ldots,\psi _d:(\mathbb {F}_p^n)^r\to \mathbb {F}_p^n$ be a collection of linear forms in $r$ variables with Cauchy–Schwarz complexity at most $s$. Suppose that $f_1,\ldots,f_d:\mathbb {F}_p^n\to \mathbb {C}$ are $1$-bounded functions having average values $\alpha _1,\ldots,\alpha _d$, respectively, and that
for all $1\leq j\leq d$. Then
We also need the Gowers–Cauchy–Schwarz inequality.
Lemma 3.4 (Gowers–Cauchy–Schwarz inequality)
Let $s$ be a natural number, $H$ be an abelian group, and $f_\omega :H\to \mathbb {C}$ for every $\omega \in \{0,1\}^s$. We have
Next, we record the basic fact that a function on $\mathbb {F}_p^n$ with small $U^2$-norm has small average on affine subspaces of small codimension.
Lemma 3.5 Let $f:\mathbb {F}_p^n\to \mathbb {C}$ be a $1$-bounded function satisfying $\|f\|_{U^2(\mathbb {F}_p^n)}<\varepsilon$ and $w+V\subset \mathbb {F}_p^n$ be an affine subspace of codimension $d$. Then
Proof. The indicator function of $V$ can be written as
so we have that
by the Gowers–Cauchy–Schwarz inequality. Since $|V|=|\mathbb {F}_p^n|/p^d$, the desired result follows.
The last lemma of this section will be used to analyze the various averages appearing in the proofs of Lemmas 2.3, 2.4, and 2.5.
Lemma 3.6 Let $\psi _1,\ldots,\psi _d\in \mathbb {F}_p[x_1,\ldots,x_t,y]$ be a collection of linear forms such that the coefficient of $y$ in each of $\psi _1,\ldots,\psi _d$ is nonzero, and $F:(\mathbb {F}_p^n)^t\times \mathbb {F}_p^n\to [0,1]$ be a function of the form
for some $1$-bounded functions $f_j:\mathbb {F}_p^n\to \mathbb {C}$ with average value $\beta _j$, each satisfying $\|f_j-\beta _j\|_{U^s(\mathbb {F}_p^n)}<\varepsilon$. If the Cauchy–Schwarz complexity of the set of linear forms
in the variables $x_1,\ldots,x_t,y,h,k$, is at most $s-1$, then
Proof. Set $\beta :=\prod _{j=1}^d\beta _j$. Note that $\mathbb {E}_{\mathbf {x}\in (\mathbb {F}_p^n)^t}\|F(\mathbf {x},\cdot )-\beta \|_{U^2(\mathbb {F}_p^n)}^4$ equals
where
for each $\omega \in \{0,1\}^4$ and $1\leq i\leq 4$. Since (19) has Cauchy–Schwarz complexity at most $s-1$ by hypothesis, Corollary 3.3 implies that
for every $\omega \in \{0,1\}^4$. Thus,
Markov's inequality then gives
for every $r>0$. Taking $r=\varepsilon ^{1/8}$ gives the conclusion of the lemma.
4. Pseudorandomness of $\Phi$
The main purpose of this section is to show that if $\|\Phi -\alpha \rho \|_{U^{2s+2}(\mathbb {F}_p^n\times \mathbb {F}_p^n)}$ is small and $\Phi$ is a subset of the form
where each $V_x\leq \mathbf {F}_p^n$ is a subspace of density $\rho$ in $\mathbf {F}_p^n$ and $A\subset \mathbf {F}_p^n$ has density $\alpha$ in $\mathbf {F}_p^n$, then, whenever $\psi _1,\ldots,\psi _r\in \mathbb {F}_p[x_1,\ldots,x_m]$ is a collection of linear forms of Cauchy–Schwarz complexity at most $s$ and $w_1,\ldots,w_r\in \mathbb {F}_p^n$, the affine subspaces
typically have maximum possible codimension. This allows us to transform the condition that $\Phi$ is $U^{8}(\mathbb {F}_p^n\times \mathbb {F}_p^n)$-pseudorandom into a more useful property for evaluating the various averages that arise in the proof of Theorem 1.2.
Lemma 4.1 For each nonnegative integer $s$ and positive integer $r$, there exist constants $C_{s,r},c_{s,r}>0$ such that the following holds. Let $d$ be a nonnegative integer, and set $\rho :=p^{-d}$. Let $\delta >0$, $A\subset \mathbb {F}_p^n$ have density $\alpha$, and $\Phi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ be a set of the form
where each $V_x\leq \mathbb {F}_p^n$ is a subspace of codimension $d$. Assume that
Let $\psi _1,\ldots,\psi _r\in \mathbb {F}_p[x_1,\ldots,x_m]$ be a collection of linear forms of Cauchy–Schwarz complexity at most $s$. Then, for all but at most a $\delta$-proportion of $m$-tuples $\mathbf {x}\in (\mathbb {F}_p^{n})^m$ for which $\psi _1(\mathbf {x}),\ldots,\psi _r(\mathbf {x})\in A$, we must have that
for all $w_1,\ldots,w_r\in \mathbb {F}_p^n$.
We begin by showing that if $\Phi$ is pseudorandom with respect to the $U^{s}(\mathbb {F}_p^n\times \mathbb {F}_p^n)$-norm, then $A$ is pseudorandom with respect to the $U^{s}(\mathbb {F}_p^n)$-norm. This result will also be useful at a few other points in the proof of Theorem 1.2.
Lemma 4.2 Let $d$ be a nonnegative integer, and set $\rho :=p^{-d}$. Let $A\subset \mathbb {F}_p^n$ have density $\alpha$ and $\Phi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ be of the form
where each $V_x$ is a subspace of $\mathbb {F}_p^n$ of codimension $d$. Then, for every natural number $s$, we have
Proof. We write
and then, for each $\omega \in \{0,1\}^s$, insert the identity
to get that $\|A-\alpha \|_{U^s(\mathbb {F}_p^n)}^{2^s}$ equals
We can average the above quantity over $y,\ell _1,\ldots,\ell _s$ and make the change of variables $k_\omega \mapsto k_\omega +y+\omega \cdot (\ell _1,\ldots,\ell _s)$ to get that it equals
Applying the Gowers–Cauchy–Schwarz inequality bounds (20) above by
which gives us the conclusion of the lemma.
To prove Lemma 4.1, we will need the notion of an approximate polynomial of bounded degree, which we define using the additive discrete difference operator $\partial$. For $\phi :H\to H$ and $h\in H$, define $\partial _h\phi :H\to H$ by
and, for $h_1,\ldots,h_s\in H$, the $s$-fold additive difference operator $\partial _{h_1,\ldots,h_s}$ by
Definition 4.3 Let $H$ be an abelian group, $A\subset H$, and $\phi :A\to H$. We say that $\phi$ is an $\varepsilon$-approximate polynomial of degree at most $s-1$ on $A$ if
for at least an $\varepsilon$-proportion of $(s+1)$-tuples $(x,h_1,\ldots,h_s)\in H^{s+1}$ for which $x+\omega \cdot (h_1,\ldots,h_s)\in A$ for all $\omega \in \{0,1\}^s$.
We will also need the following result, which is the key combinatorial input into the proof of Lemma 4.1.
Lemma 4.4 For each nonnegative integer $s$, there exist constants $C_s,c_s>0$ such that the following holds. Let $A\subset \mathbb {F}_p^n$ have density $\alpha$, with
and $\phi : A\to \mathbb {F}_p^n$ be a $\delta$-approximate polynomial of degree at most $s$ on $A$. Then, for at least a $\Omega _s(\delta ^{O_s(1)})$-proportion of $(2s+2)$-dimensional parallelopipeds
in $A^{2^{2s+2}}$, the derivative of $\phi$ on the $(2s+1)$-dimensional face $(x+\omega \cdot (h_1,\ldots,h_{2s+2}))_{\substack {\omega \in \{0,1\}^{2s+2} \\ \omega _i=\epsilon }}$,
vanishes for all $1\leq i\leq 2s+2$ and $\epsilon =0,1$.
Proof. We proceed by induction on $s$, beginning with the case $s=0$. If $\phi$ is a $\delta$-approximate polynomial of degree at most $0$ on $A$, then $\mathbb {E}_{x,y\in A}1_{\phi (x)=\phi (y)}\geq \delta$, so that, by the pigeonhole principle, there exists some $z\in \mathbb {F}_p^n$ for which $\mu _{A}(\{x\in A:\phi (x)=z\})\geq \delta$. Set $X:=\{x\in A:\phi (x)=z\}$, and consider the set of quadruples
Note that if $(x,x+h,x+k,x+h+k)\in X'$, then
so certainly the derivatives
of $\phi$ on each of the $1$-dimensional faces of the parallelopiped $(x,x+h,x+k,x+h+k)$ vanish. Since
we must have $|X'|\geq (\alpha \delta )^4p^{3n}$. Taking $c_0=8$, the total number of quadruples $(x,x+h,x+k,x+h+k)$ in $A^4$ is $(\alpha ^4+O(C_0\alpha ^8))p^{3n}$ by Corollary 3.3, which means that $X'$ consists of at least a $\delta ^4/2$-proportion of parallelograms $(x,x+h,x+k,x+h+k)$ in $A^4$ if $C_0$ is chosen small enough. Thus, for at least a $\delta ^4/2$-proportion of parallelograms $(x,x+h,x+k,x+h+k)$ in $A^4$, the derivative of $\phi$ on each $1$-dimensional face vanishes, as desired.
Now suppose that the result holds for a general degree $s-1\geq 0$, and let $\phi$ be a $\delta$-approximate polynomial of degree at most $s$ on $A$. By Corollary 3.3,
so that, as long as $C_s$ is sufficiently small and $c_s$ is sufficiently large, it follows from Markov's inequality that
and, thus,
as well, for all but a $O(\delta ^2)$-proportion of $h\in \mathbb {F}_p^n$. Thus, for at least a $\Omega (\delta )$-proportion of $h_{s+1}\in \mathbb {F}_p^n$, we have $\|\Delta _{h_{s+1}}A-\alpha ^2\|_{U^{2s}(\mathbb {F}_p^n)}< C_{s-1}(\alpha \delta /2)^{2c_{s-1}}/2$, $|\mu _{\mathbb {F}_p^n}(A\cap (A-h_{s+1}))-\alpha ^2|< C_{s-1}(\alpha \delta /2)^{2c_{s-1}}/2$, and that the function $\partial _{h_{s+1}}\phi$ is a $O(\delta )$-approximate polynomial of degree at most $s-1$ on $A\cap (A-h_{s+1})$. Denoting the set of such $h_{s+1}$ by $H$, so that $\mu _{\mathbb {F}_p^n}(H)\gg \delta$, the induction hypothesis then says that, for each $h\in H$, there are at least a $\Omega _{s}(\delta ^{O_{s}(1)})$-proportion of $2s$-dimensional parallelopipeds $(x+\omega \cdot (h_1,\ldots,h_{2s}))_{\omega \in \{0,1\}^{2s}}$ in $(A\cap (A-h))^{2^{2s}}$ for which the derivative of $\partial _h\phi$ on each $(2s-1)$-dimensional face vanishes.
Summing over all $h\in H$, it follows that, for at least a $\Omega _{s}(\delta ^{O_s(1)})$-proportion of $(2s+2)$-tuples $(x,y,h_1,\ldots,h_{2s})$ for which $(x+\omega \cdot (h_1,\ldots,h_{2s}))_{\omega \in \{0,1\}^{2s}}$ and $(y+\omega \cdot (h_1,\ldots,h_{2s}))_{\omega \in \{0,1\}^{2s}}$ are both in $A^{2^{2s}}$, one has
for all $i=1,\ldots,2s$ and $\epsilon =0,1$. By Corollary 3.3,
and so, by Markov's inequality,
for all but a $O([C_s(\alpha \delta )^{c_{s}}]^{\Omega (1)})$-proportion of $(h_1,\ldots,h_{2s})$ in $(\mathbb {F}_p^n)^{2s}$. By taking $C_s$ small enough and $c_s$ large enough, there are therefore at least a $\Omega _{s}(\delta ^{O_s(1)})$-proportion of $2s$-tuples $(h_1,\ldots,h_{2s})$ in $(\mathbb {F}_p^n)^{2s}$ for which (21) holds and, for at least a $\Omega _{s}(\delta ^{O_s(1)})$-proportion of pairs $(x,y)\in A^2$ such that $(x+\omega \cdot (h_1,\ldots,h_{2s}))_{\omega \in \{0,1\}^{2s}}$ and $(y+\omega \cdot (h_1,\ldots,h_{2s}))_{\omega \in \{0,1\}^{2s}}$ are both in $A^{2^{2s}}$, one also has
for all $i=1,\ldots,2s$ and $\epsilon =0,1$. For each such $2s$-tuple $\mathbf {h}$, it follows from the pigeonhole principle that there exists a $y_{\mathbf {h}}$ in the set
such that, for at least a $\Omega _{s}(\delta ^{O_s(1)})$-proportion of $x\in A_{\mathbf {h}}$, one has
for all $i=1,\ldots,2s$ and $\epsilon =0,1$.
Now set $v_{i,\epsilon,\mathbf {h}}:=\partial _{h_1,\ldots,\widehat {h_i},\ldots,h_{2s}}\phi (y_{\mathbf {h}}+\epsilon h_i)$,
so that $\mu _{A_{\mathbf {h}}}(X_{\mathbf {h}})\gg _s\delta ^{O_s(1)}$, and
Note that $(x,x+k,x+k',x+k+k')\in X'_{\mathbf {h}}$ if and only if
for all $i=1,\ldots,2s$, $\epsilon =0,1$, and $\omega '\in \{0,1\}^2$. Thus, whenever $(x,x+k,x+k',x+k+k')\in X'_{\mathbf {h}}$, we have
and
for all $i=1,\ldots,2s$ and $\epsilon =0,1$. That is, the derivative of $\phi$ vanishes on all $(2s+1)$-dimensional faces of the $(2s+2)$-dimensional parallelopiped $(x+\omega \cdot (h_1,\ldots,h_{2s},k,k'))_{\omega \in \{0,1\}^{2s+2}}$.
As in the $s=0$ case,
so that
for at least a $\Omega _s(\delta ^{O_s(1)})$-proportion of $2s$-tuples $\mathbf {h}$. Each ordered quadruple $(x,\mathbf {h},k,k')$ for which $(x,x+k,x+k',x+k+k')\in X_{\mathbf {h}}'$ corresponds to a unique $(2s+2)$-dimensional parallelopiped
in $A^{2s+2}$. Thus,
In comparison, the number of $(2s+2)$-dimensional parallelopipeds in $A$ is
by Corollary 3.3. The conclusion of the lemma now follows as long as $C_s$ is sufficiently small and $c_s$ is sufficiently large.
With a bit more work, it is possible to prove a version of Lemma 4.4 with $(2s+2)$-dimensional parallelopipeds replaced by $(s+2)$-dimensional parallelopipeds (which is optimal), and thus a version of Lemma 4.1 with the $U^{2s+2}$-norm replaced by the $U^{s+2}$-norm, but this would make a negligible difference in Theorem 1.2.
Now we can prove Lemma 4.1.
Proof of Lemma 4.1 We proceed by induction on $r$ and $s$, beginning with the $r=1$, $s=0$ caseFootnote 1. Since $\operatorname {codim}\{y\in \mathbb {F}_p^n:\Phi (x,y)=1\}=d$ for all $x\in A$, certainly
for all $\mathbf {x}\in (\mathbb {F}_p^n)^m$ for which $\psi _1(\mathbf {x})\in A$, and this case follows trivially without even needing the assumption that $\|\Phi -\alpha \rho \|_{U^2(\mathbb {F}_p^n\times \mathbb {F}_p^n)}$ is small.
Now let $r\geq 2$ or $s\geq 1$, and assume that the result holds for all pairs of integers $(r',s')$ satisfying:
(i) $0\leq r'< r$ and $1\leq s'\leq s$; or
(ii) $1\leq s'< s$;
and let $C'$ be at most the minimum of $C_{r',s'}$ and $c'$ be at least the maximum of $c_{r',s'}$ over all such pairs with $r'<\max (r,2^{s+1})$. As long as $\|\Phi -\alpha \rho \|_{U^{2s+2}(\mathbb {F}_p^n\times \mathbb {F}_p^n)}< C'(\alpha \delta \rho ^{2r}/2)^{c'2^r}$, it follows from the induction hypothesis that for all but a $O(\delta )$-proportion of $\mathbf {x}\in (\mathbb {F}_p^n)^m$ for which $\psi _1(\mathbf {x}),\ldots,\psi _r(\mathbf {x})\in A$, we must have
for all $w_1,\ldots,w_r\in \mathbb {F}_p^n$. If the codimension of some $\{y\in \mathbb {F}_p^n:\prod _{i=1}^{r}\Phi (\psi _i(\mathbf {x}),y+w_i)=1\}$ is not $rd$ for one of these typical $\mathbf {x}$, then it is either $n$ or at most $rd-1$, which means that
in either case, since $p\geq 2$. Squaring both sides of (22), multiplying by $\prod _{i=1}^rA(\psi _i(\mathbf {x}))$, averaging over all $\mathbf {x}\in (\mathbb {F}_p^n)^m$, swapping the order of summation, and applying Lemma 4.2 (to deduce the uniformity of $A$) and Theorem 3.2 yields
By Hölder's inequality, we then have
It follows from this, the induction hypothesis, our assumption that $\|\Phi -\alpha \rho \|_{U^{2s+2}(\mathbb {F}_p^n\times \mathbb {F}_p^n)}< C'(\alpha \delta \rho ^{2r}/2)^{c'2^r}$, and Lemma 4.2 that
since the Cauchy–Schwarz complexity of any proper subset of
is at most $s-1$.
Thus, by taking $C'$ sufficiently small and $c'$ sufficiently large, we get that
for at least an $\Omega _s((\delta \rho )^{O_s(1)})$-proportion of $(s+2)$-tuples $(x,h_1,\ldots,h_{s+1})\in (\mathbb {F}_p^n)^{s+2}$ for which $(x+\omega \cdot (h_1,\ldots,h_{s+1}))_{\omega \in \{0,1\}^{s+1}}\in A^{2^{s+1}}$. The condition (23) implies that, for each such $(x,h_1,\ldots,h_{s+1})$, there exist vectors $v_{x,\mathbf {h},\omega }\in V^{\perp }_{x+\omega \cdot \mathbf {h}}$, $\omega \in \{0,1\}^{s+1}$, not all of which are zero, such that
Fix a basis $\{\gamma _{z,1},\ldots,\gamma _{z,d}\}$ of $V_z^\perp$ for each $z\in \mathbb {F}_p^n$. We can write every vector $v_{x,\mathbf {h},\omega }$ in terms of this basis, giving us that
for some $2^{s+1}d$-tuple of constants $(b_{x,\mathbf {h},\omega,j})_{\omega \in \{0,1\}^{s+1},\ 1\leq j\leq d}$, not all of which are zero.
We apply the pigeonhole principle to deduce that there is some $2^{s+1}d$-tuple of constants $(b_{\omega,j})_{\omega \in \{0,1\}^{s+1},\ 1\leq j\leq d}$, not all of which are zero, such that
for at least a $\Omega _s((\delta \rho )^{O_s(1)})$-proportion of $(s+2)$-tuples $(x,h_1,\ldots,h_{s+1})\in (\mathbb {F}_p^n)^{s+2}$ for which $(x+\omega \cdot (h_1,\ldots,h_{s+1}))_{\omega \in \{0,1\}^{s+1}}\in A^{2^{s+1}}$. Defining
for each $\omega \in \{0,1\}^{s+1}$, the above says that, for at least a $\Omega _s((\delta \rho )^{O_s(1)})$-proportion of $(s+2)$-tuples $(x,h_1,\ldots,h_{s+1})\in (\mathbb {F}_p^n)^{s+2}$ for which $(x+\omega \cdot (h_1,\ldots,h_{s+1}))_{\omega \in \{0,1\}^{s+1}}\in A^{2^{s+1}}$, we must have
where at least one of the functions $\phi _{\omega }:\mathbb {F}_p^n\to \mathbb {F}_p^n$ does not have the zero vector in its image (because $\phi _\omega (z)$ is always a nontrivial linear combination of linearly independent vectors). Let $\phi$ be any such $\phi _\omega$. It then follows by applying Corollary 3.3 to the inside average of
that $\partial _{h_1,\ldots,h_{s+1}}\phi (x)=0$ for at least a $\Omega _s((\delta \rho )^{O_s(1)})$-proportion of $(s+2)$-tuples $(x,h_1,\ldots,h_{s+1})\in (\mathbb {F}_p^n)^{s+2}$ for which $(x+\omega \cdot (h_1,\ldots,h_{s+1}))_{\omega \in \{0,1\}^{s+1}}\in A^{2^{s+1}}$, again provided that $C'$ is sufficiently small and $c'$ is sufficiently large. That is, $\phi$ is a $\Omega _s((\delta \rho )^{O_s(1)})$-approximate polynomial of degree at most $s$ on $A$.
If $C'$ is small enough and $c'$ is large enough, Lemmas 4.2 and 4.4, then imply that for at least a $\Omega _{s}((\alpha \delta \rho )^{O_s(1)})$-proportion of $(2s+2)$-dimensional parallelopipeds $(x+\omega \cdot (h_1,\ldots,h_{2s+2}))_{\omega \in \{0,1\}^{2s+2}}$ in $A^{2^{2s+2}}$, the derivative of $\phi$ on each $(2s+1)$-dimensional face vanishes, i.e.
for all $i=1,\ldots,2s+2$ and $\epsilon =0,1$. Call the set of $(2s+3)$-tuples $(x,h_1,\ldots,h_{2s+2})$ corresponding to such $(2s+2)$-dimensional parallelopipeds $X$. Consider $\|\Phi -\rho A\|_{U^{2s+2}(\mathbb {F}_p^n\times \mathbb {F}_p^n)}^{2^{2s+2}}$, which, plugging in the expression
for $\Phi -\rho A$, equals
The above has size $\gg _s(p-1)(\alpha \rho )^{2^{2s+2}}(\alpha \delta \rho )^{O_s(1)}\gg _s(\alpha \delta \rho )^{O_s(1)}$, coming from the contribution of $v_\omega =\lambda \phi (x+\omega \cdot (h_1,\ldots,h_{2s+2}))$ for each $\lambda \in \mathbb {F}_p^\times$ and $(x,h_1,\ldots,h_{2s+2})\in X$. On the other hand, we have
so that
Taking $C_{s,r}$ sufficiently small and $c_{s,r}$ sufficiently large will thus yield a contradiction if
for some $w_1,\ldots,w_r\in \mathbb {F}_p^n$ for a $\delta$-proportion of $\mathbf {x}\in (\mathbb {F}_p^n)^m$ for which $\psi _1(\mathbf {x}),\ldots,\psi _r(\mathbf {x})\in A$.
5. Control by directional uniformity norms
This section is devoted to proving Lemmas 2.3 and 2.4. Our arguments mostly consist of careful, repeated applications of the Cauchy–Schwarz inequality to ensure that there is no loss of density factors and using the results of §§ 3 and 4 to analyze the resulting averages. As a simple warm-up, we begin by showing that if $T\subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ has the form (11) and $B,C,$ and $D$ are sufficiently pseudorandom, then $T$ has density close to the product density $\alpha \beta \gamma \delta \rho$.
Lemma 5.1 Let $d$ be a nonnegative integer, and set $\rho :=p^{-d}$. Let $\varepsilon >0$ and assume that $A,B,C,D\subset \mathbb {F}_p^n$ have densities $\alpha,\beta,\gamma,$ and $\delta$, respectively, and satisfy
and that $\Phi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ takes the form
where each $V_x$ is a subspace of $\mathbb {F}_p^n$ of codimension $d$. Then the set $T\subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ defined by (11) has density
Proof. The density of $T$ in $\mathbb {F}_p^n\times \mathbb {F}_p^n$ can be written as
where $F(x,y):=B(y)C(x+y)D(2x+y)$. Set $L(x,y):=\{y,x+y,2x+y\}$. Lemma 3.6 says that
since
has Cauchy–Schwarz complexity at most $3$. Thus, Lemma 3.5 yields
for all but a $O(\sqrt {\varepsilon })$-proportion of $x\in \mathbb {F}_p^n$, so that
Now we can prove Lemma 2.3.
Proof of Lemma 2.3 The quantity of interest $\Lambda (T,T,T,S)$ is
which, after a change of variables, can be written as
where $\mu (x,y)$ equals
We will show that $\mu (x,y)$ is very close to the constant value $\alpha \beta ^2\gamma ^2\delta ^2\rho ^2$ for almost every pair $(x,y)\in \mathbb {F}_p^n\times \mathbb {F}_p^n$, from which it will then follow that $\Lambda (T,T,T,S)$ is close to $\sigma \alpha ^2\beta ^3\gamma ^3\delta ^3\rho ^3$.
The first moment $\mathbb {E}_{x,y}\mu (x,y)$ equals
Applying Lemma 3.6 yields
where
It therefore follows from Lemma 3.5 that
for all but a $O(\sqrt {\varepsilon })$-proportion of $(x,y)\in \mathbb {F}_p^n\times \mathbb {F}_p^n$. Thus,
By arguing as in the proof of Lemma 5.1, we have
and, thus, conclude that
The second moment $\mathbb {E}_{x,y}\mu (x,y)^2$ equals
Applying Lemma 3.6 again yields
where
and applying Lemma 4.1 yields
for all $y\in \mathbb {F}_p^n$. Thus, by Lemma 3.5,
for all but a $O(\varepsilon ^{\Omega (1)}/\rho ^{O(1)})$-proportion of $(x,x+h,y)\in A\times A\times \mathbb {F}_p^n$, so that
By Lemmas 3.6 and 4.1 again, we have
and
where
so that
by Lemma 3.5. Using Corollary 3.3 to estimate $\mathbb {E}_{x,h}\Delta _{-h}A(x)$, we thus conclude that
Our estimates for the first and second moments of $\mu$ imply that $\mu$ has variance $\mathbb {E}_{x,y}|\mu (x,y)-\alpha \beta ^2\gamma ^2\delta ^2\rho |^2\ll \varepsilon ^{\Omega (1)}/\rho ^{O(1)}$. It follows that
When $c_1$ is sufficiently small and $c_2$ is sufficiently large, this gives the desired lower bound for $\Lambda (T,T,T,S)$.
To finish this section, we prove Lemma 2.4.
Proof of Lemma 2.4 We prove (16), (17), and then (18), proceeding in decreasing order of the number of applications of Cauchy–Schwarz required. For (16), we make the change of variables $z\mapsto z-x-y$ to write $\Lambda (f_0,f_1,f_2,f_3)$ as
which, by applying the Cauchy–Schwarz inequality in the $x$ and $z$ variables, has modulus squared bounded by
The first factor equals $\alpha \beta \gamma \delta \rho +O(\varepsilon ^{1/8})$ by Lemma 5.1. Expanding the square and making a change of variables, the second factor equals
By another application of the Cauchy–Schwarz inequality in the $y,z,$ and $h_1$ variables, the modulus squared of this is at most
times
The first factor (24) can be estimated in the same manner as the averages appearing in the proof of Lemma 2.3, and equals $\alpha ^2\beta ^2\gamma \delta ^2\rho ^2+O(\varepsilon ^{\Omega (1)}/\rho ^{O(1)})$. Expanding the square and making a change of variables, we get that the second factor equals
A final application of the Cauchy–Schwarz inequality in the $x,y,h_1,$ and $h_2$ variables bounds the modulus squared of this by
times
By Lemmas 3.6 and 4.1, we have
for all $y\in \mathbb {F}_p^n$, and
where
and
It then follows from Lemma 3.5 that (25) equals
which equals
It remains to relate
to $\|f_0\|_{\star _1}$. Expanding the square and making a change of variables yields
which can be written as
where
We will show that, for almost every $5$-tuple $(x,y,h_1,h_2,h_3)\in (\mathbb {F}_p^n)^5$ for which $x,x+h_2\in A$, $\mu (x,y,h_1,h_2,h_3)$ is very close to the constant value $\alpha ^4\beta ^8\gamma ^6\delta ^6\rho ^{6}$. Indeed, the first moment $\mathbb {E}_{\substack {x,y,h_1,h_2,h_3\\ x,x+h_2\in A}}\mu (x,y,h_1,h_2,h_3)$ is
Lemmas 3.6 and 4.1 tell us that
and
where
so it follows from Lemma 3.5 that the first moment equals
which equals
by Corollary 3.3. The second moment $\mathbb {E}_{\substack {x,y,h_1,h_2,h_3 \\ x,x+h_2\in A}}\mu (x,y,h_1,h_2,h_3)^2$ is
which, noting that
we can write as
Lemmas 3.6 and 4.1, analogously to the case of the first moment, tell us that
and
where
so it follows from Lemma 3.5 that the second moment equals $(\alpha ^2+O(\varepsilon ))^{-1}$ times
To estimate the main term of (26), we note that
and apply Lemmas 3.6 and 4.1 again to get that
and
where
Thus, by Lemma 3.5, we have that
equals
which equals
by Corollary 3.3. It therefore follows that the second moment is
Thus,
and we conclude that
Putting everything together gives
as desired.
For (17), we make a change of variables and apply the Cauchy–Schwarz inequality to bound $|\Lambda (T,f_1,f_2,f_3)|^2$ by
Expanding the square in the second quantity and making a change of variables yields
By applying the Cauchy–Schwarz inequality again in the $x,y,$ and $h_1$ variables, the modulus squared of this is bounded above by
times
Expanding the square and making a change of variables, the second factor equals
which we can write as
where
The quantity (27) and the weight $\mu '$ can be analyzed in the same manner as the corresponding quantity and weight in the proof of (16), so that
Finally, for (18), we make a change of variables and apply the Cauchy–Schwarz inequality to bound $|\Lambda (T,T,f_2,f_3)|^2$ by $\mu _{\mathbb {F}_p^n\times \mathbb {F}_p^n}(T)$ times
Expanding the square in the second quantity and making a change of variables yields
which can be written as
where
This weight can again be analyzed in the same manner as the weights appearing in the proof of (16), giving
and completing the proof of the lemma.
6. Obtaining a density increment
As was mentioned in § 2, one ingredient in the proof of Theorem 2.5 is an inverse theorem for the $U^2(\Phi (x,\cdot ))$-norm localized to pseudorandom sets. Applying the standard, nonlocalized inverse theorem for the $U^2(\Phi (x,\cdot ))$-norm would yield a density increment that gets weaker as $T$ becomes sparser, which is inadequate to close the density-increment iteration. To get a strong enough localized version of this inverse theorem, we will need to use the transference principle. The particular instance of it required is an immediate consequence of the dense model lemma from [Reference ZhaoZha14], which appears as Lemma 3.3 in that paper.
Lemma 6.1 (Dense model lemma for the $U^s$-norm on subspaces)
For every natural number $s$, there exists a constant $c_s>0$ such that the following holds. Let $\varepsilon >0$, $V\leq \mathbb {F}_p^n$ be a subspace, and $f,\nu :V\to [0,\infty )$ be functions satisfying:
(i) $0\leq f\leq \nu$;
(ii) $\mathbb {E}_{x\in V}f(x)\leq 1$; and
(iii) $\|\nu -1\|_{U^s(V)}\leq \exp (-\varepsilon ^{-c_s})$.
Then there exists a $\tilde {f}:V\to [0,1]$ such that $\mathbb {E}_{x\in V}f(x)=\mathbb {E}_{x\in V}\tilde {f}(x)$ and $\|f-\tilde {f}\|_{U^s(V)}\leq \varepsilon$.
One can see that this lemma is a consequence of Zhao's lemma by using the Gowers–Cauchy–Schwarz inequality to translate between his $(s,\varepsilon )$-discrepancy pair condition and our $U^s$-uniformity condition.
In the course of the proof of Theorem 2.5, we will encounter various averages of linear forms that turn out to be controlled by certain degree $1$ and $2$ directional uniformity norms. Because of this, we will also need to obtain a density increment when these norms of the balanced function $g_S=S-\sigma$ are large. The first two subsections of this section are devoted to proving that this is possible.
6.1 Results on degree $1$ norms
We first show that the relevant fibers of any set of the form (11) typically have close to their average density, provided that $A,B,C,D,$ and $\Phi$ are sufficiently pseudorandom.
Lemma 6.2 Let $d$ be a nonnegative integer, and set $\rho :=p^{-d}$. Suppose that $A,B,C,D\subset \mathbb {F}_p^n$ have densities $\alpha,\beta,\gamma,\delta$, respectively, and that $\Phi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ has density $\alpha \rho$ in $\mathbb {F}_p^n\times \mathbb {F}_p^n$ and takes the form
where each $V_x$ is a subspace of $\mathbb {F}_p^n$ of codimension $d$. Let $\varepsilon >0$ and assume that
Define $T$ by (11). Then
and
for any $\varepsilon '>0$.
Proof. From Lemma 5.1, we have $\mathbb {E}_{x\in A}\mu _{\mathbb {F}_p^n}(T(x,\cdot ))=\beta \gamma \delta \rho +O(\varepsilon ^{1/8}/\alpha )$. For the second moment, we argue as in § 5 to estimate
using Lemmas 3.5 and 3.6, giving $\mathbb {E}_{x\in A}\mu _{\mathbb {F}_p^n}(T(x,\cdot ))^2=\beta ^2\gamma ^2\delta ^2\rho ^2+O(\varepsilon ^{1/8}/\alpha )$. It now follows from Markov's inequality that
for all $\varepsilon '>0$. The three other estimates are proved analogously.
Now we can obtain a density increment when the degree $1$ uniformity norms controlled by $\|\cdot \|_{\star _1}$ and $\|\cdot \|_{\star _2}$ are large. The proof is essentially an averaging argument, like the proof of the analogous Lemma 3.1 in [Reference GreenGre05a]. The most substantial new feature, which will arise many times in this section, is that we must now verify that the set on which we claim to have found a density increment actually has close to the correct density in $\mathbb {F}_p^n\times \mathbb {F}_p^n$. In the case of corners, any product set $A\times B$ trivially has density equal to the product of the densities of $A$ and $B$. This is not, in general, the case for sets of the form (11) unless further assumptions are made about $A,B,C,D,$ and $\Phi$.
Lemma 6.3 There exist absolute constants $0< c_1<1< c_2$ such that the following holds. Let $d$ be a nonnegative integer, and set $\rho :=p^{-d}$. Suppose that $A,B,C,D\subset \mathbb {F}_p^n$ have densities $\alpha,\beta,\gamma,\delta$, respectively, and that $\Phi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ takes the form
where each $V_x$ is a subspace of $\mathbb {F}_p^n$ of codimension $d$. Let $\tau >0$ and $\varepsilon \leq c_1(\tau \alpha \beta \gamma \delta \rho )^{c_2}$, and assume that
Define $T$ by (11), and let $S\subset T$ have density $\sigma$ in $T$. Suppose that
or
Then $S$ has density at least $\sigma +\Omega (\tau ^{O(1)})$ on some subset $T'$ of $T$ of the form
where the densities of $A',B',C'\subset \mathbb {F}_p^n$ are $\Omega (\tau ^{O(1)}\alpha )$, $\Omega (\tau ^{O(1)}\beta )$, and $\Omega (\tau ^{O(1)}\gamma )$, respectively.
Proof. The assumption (28) can be written as
Since $\mu _{\mathbb {F}_p^n}(T(x,\cdot ))>2\beta \gamma \delta \rho$ for at most a $O(\varepsilon ^{1/8}/\alpha ^3\beta ^2\gamma ^2\delta ^2\rho ^2)$-proportion of $x\in A$ by Lemma 6.2, it follows that, as long as $c_2$ is sufficiently large, there exists a subset $A_0\subset A$ of relative density at least $\tau /4$ for which
for all $x\in A_0$, provided that $c_1$ is small enough and $c_2$ is large enough. Note that $\mathbb {E}_{y}g_S(x,y)$ is a real number, and thus is either positive or negative. There must therefore exist a subset $A_1\subset A_0$ of density at least $1/2$ in $A_0$ such that either
for every $x\in A_1$ or
for every $x\in A_1$.
In the first case, setting $A'=A_1$ and $\alpha '=\mu _{\mathbb {F}_p^n}(A')$, we have
by Lemma 6.2 whenever $c_1$ is small enough and $c_2$ is large enough, so that
Since $\mathbb {E}_{x,y}A'(x)B(y)C(x+y)D(2x+y)\Phi (x,y)=\alpha '\beta \gamma \delta \rho +O(\varepsilon ^{1/16}/\alpha )$ by Lemma 6.2, the conclusion of the lemma follows when $c_1$ is small enough and $c_2$ is large enough by taking $B'=B$.
In the second case, setting $A'=A\setminus A_1$ and $\alpha '=\mu _{\mathbb {F}_p^n}(A')$, we use the fact that $\mathbb {E}_{x,y}g_S(x,y)=0$ to deduce that
whenever $c_1$ is small enough and $c_2$ is large enough. Note that $\mathbb {E}_{x,y}A'(x)B(y)C(x+y)D(2x+y)\Phi (x,y)=\alpha '\beta \gamma \delta \rho +O(\varepsilon ^{1/16}/\alpha )$ in this case as well by Lemma 6.2, so the conclusion of the lemma will follow as long as $\alpha '=\Omega (\tau ^{O(1)}\alpha )$. But this also follows from Lemma 6.2, for we have
The proof of the lemma starting from the assumptions (29) and (30) is essentially identical, but using the second and third probability estimates from Lemma 6.2, respectively.
6.2 Results on degree $2$ norms
To obtain a density increment when the relevant degree $2$ directional uniformity norms of $g_S$ are large, we first need to show that certain degree $2$ ‘inner products’ are controlled by the degree $1$ directional uniformity norms studied in the previous subsection.
Lemma 6.4 Let $d$ be a nonnegative integer, and set $\rho :=p^{-d}$. Suppose that $A,B,C,D\subset \mathbb {F}_p^n$ have densities $\alpha,\beta,\gamma,\delta$, respectively, and that $\Phi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ takes the form
where each $V_x$ is a subspace of $\mathbb {F}_p^n$ of codimension $d$. Let $\varepsilon >0$ and assume that
Define $T$ by (11), and let $S\subset T$ and $\tau >0$.
If
where $g_\omega$ equals $T$ or $g_S$ for all $\omega \in \{0,1\}^2$, at least one $g_\omega$ equals $g_S$, and at least one $g_\omega$ equals $T$, then
If
where $g_\omega$ equals $T$ or $g_S$ for all $\omega \in \{0,1\}^2$, at least one $g_\omega$ equals $g_S$, and at least one $g_\omega$ equals $T$, then
or
If
where $g_\omega$ equals $T$ or $g_S$ for all $\omega \in \{0,1\}^2$, at least one $g_\omega$ equals $g_S$, and at least one $g_\omega$ equals $T$, then
or
Proof. We rewrite the various assumptions that (31), (32), and (33) hold when at least two of the $g_\omega$ equal $T$ as
and
where
and
Using the definition (11) of $T$ and arguing as in § 5 using Lemmas 3.5, 3.6, and 4.1 gives that each of $\mu _1,\ldots,\mu _8$ is typically very close to its average value on its support. Precisely, we have the estimates
and
which imply that
and
Since $|\mathbb {E}_{x,y}g_S(x,y)|^2\leq \alpha \mathbb {E}_{x,y,h}g_S(x,y)g_S(x,y+h)$ by an application of the Cauchy–Schwarz inequality, the conclusion of the lemma easily follows starting from any one of the assumptions (31), (32), or (33) when at least two of the $g_\omega$ equal $T$.
To prove the lemma when only one $g_\omega$ in (31), (32), or (33) equals $T$, we will apply the Cauchy–Schwarz inequality once, and then argue analogously. By making a change of variables, we may start from the assumption that
or
We apply the Cauchy–Schwarz inequality in each of these three cases to get that
times $\mathbb {E}_{x,y,h}\Delta _{(0,h)}T(x,y)$ is at least $\tau ^2\alpha ^2\beta ^8\gamma ^8\delta ^8\rho ^6$,
times $\mathbb {E}_{x,y,h}\Delta _{(h,0)}T(x,y)$ is at least $\tau ^2\alpha ^4\beta ^4\gamma ^8\delta ^8\rho ^8$, or
times $\mathbb {E}_{x,y,h}\Delta _{(0,h)}T(x,y)$ is at least $\tau ^2\alpha ^4\beta ^8\gamma ^4\delta ^8\rho ^8$.
We have
and
by Lemmas 3.5, 3.6, and 4.1. Expanding the square and making a change of variables, (34) equals
where
(35) equals
where
and (36) equals
where
Analogously to the weights $\mu _1,\ldots,\mu _8$, Lemmas 3.5, 3.6, and 4.1 give the estimates
and
from which it follows that
and
This completes the proof of the lemma.
Now we are almost ready to prove our desired density-increment result for the localized degree $2$ directional uniformity norms controlled by $\|\cdot \|_{\star _1}$.
Lemma 6.5 There exist absolute constants $0< c_1<1< c_2,c_3$ such that the following holds. Let $d$ be a nonnegative integer, and set $\rho :=p^{-d}$. Suppose that $A,B,C,D\subset \mathbb {F}_p^n$ have densities $\alpha,\beta,\gamma,\delta$, respectively, and that $\Phi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ takes the form
where each $V_x$ is a subspace of $\mathbb {F}_p^n$ of codimension $d$. Let $\tau >0$ and
and assume that
Define $T$ by (11), and let $S\subset T$ have density $\sigma$ in $T$. Suppose that
or
Then $S$ has density at least $\sigma +\Omega (\tau ^{O(1)})$ on some subset $T'$ of $T$ of the form
where the densities of $A',B'\subset \mathbb {F}_p^n$ are both $\Omega ((\sigma \tau \alpha \beta \gamma \delta \rho )^{O(1)})$, and $\Phi '$ is of the form
where each $V_x'$ is a subspace of $\mathbb {F}_p^n$ of codimension $d+1$.
To prove this lemma starting from the assumption (37), we will apply a localized $U^2(\Phi (x,\cdot ))$-norm inverse theorem for many fixed $x$, which will produce a density increment on a set of the form
where
This is not yet what the conclusion of the lemma promises, since $u'_x$ varies with $x$. To show that we can select a fixed $u'$ (at the cost of shrinking the size of $A'$ a bit) we will need Lemma 6.6.
The reader may wonder why we cannot just run the density-increment argument on sets of the form (39) and skip having to prove Lemma 6.6. The issue with this hypothetical proof is that $U^s(\mathbb {F}_p^n\times \mathbb {F}_p^n)$-uniformity of a set of the form $\Psi$ is not strong enough to guarantee that the analogue of Lemma 4.1 is true, regardless of how large $s$ is taken to be. Thus, such sets are not as amenable to a Shkredov-like pseudorandomization procedure.
Lemma 6.6 There exist absolute constants $0< c_1<1< c_2$ such that the following holds. Let $d$ be a nonnegative integer, and set $\rho :=p^{-d}$. Let $A\subset \mathbb {F}_p^n$ have density $\alpha$, $\Psi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ take the form
where $u_{x}\in \mathbb {F}_p^n$ and each $V_{x}$ is a subspace of $\mathbb {F}_p^n$ of codimension $d$, and $K\subset A\times \mathbb {F}_p^n$ satisfy
for every $x\in A$ and every affine subspace $H$ of $\mathbb {F}_p^n$ of codimension $d$. Let $\tau >0$, assume that $\varepsilon \leq c_1(\tau \alpha \beta \gamma \delta \rho )^{c_2}$, and suppose that $S\subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ has density at least $\sigma +\tau$ in $K\cap \Psi$, where $\sigma >0$. Then there exists a $u\in \mathbb {F}_p^n$ and a subset $A'\subset A$ such that $S\cap T'$ has density at least $\sigma +\tau /4$ in $T'$, where
and $\mu _{\mathbb {F}_p^n\times \mathbb {F}_p^n}(A')\geq \alpha \rho \tau /2$.
Proof. Define
and
for every $u\in \mathbb {F}_p^n$, and set $\alpha _{u}:=\mu _{\mathbb {F}_p^n}(A_{u})$, so that $\mathbb {E}_u\alpha _u=\alpha \rho$. By our assumption on $K$, we have
and
for all $u\in \mathbb {F}_p^n$.
Note that
and set
for each $u\in \mathbb {F}_p^n$. Then we have
where
so that
when $c_1$ is small enough and $c_2$ is large enough. The contribution to $\eta$ coming from $u$ for which $\alpha _u<\tau \alpha \rho /2$ is obviously at most $\tau \alpha \rho /2$, which implies that
which is clearly positive. We thus conclude that there must exist a $u\in \mathbb {F}_p^n$ for which $\alpha _u=\mu _{\mathbb {F}_p^n}(A_{u})\geq \tau \alpha \rho /2$ and
The conclusion of the lemma now follows by taking $A'=A_{u}$ and $\Phi '=\Psi _{u}\cap (A'\times \mathbb {F}_p^n)$.
Now we can prove Lemma 6.5.
Proof of Lemma 6.5 First assume that (37) holds. By writing $S=g_S+\sigma T$, we see that $\mathbb {E}_{x,y,h,k}\Delta _{(0,h),(0,k)}S(x,y)$ equals
plus $14$ terms of the form
where at least one $g_\omega$ equals $g_S$ and at least one other equals $\sigma T$, and
If one of the terms (40) has absolute value larger than $\tau \alpha \beta ^4\gamma ^4\delta ^4\rho ^3/32$, then combining Lemmas 6.4 and 6.3 produces the desired density increment. Thus, we may proceed under the assumption that all have size at most $\tau \alpha \beta ^4\gamma ^4\delta ^4\rho ^3/32$, so that
For each $x\in \mathbb {F}_p^n$, set $\Phi _x(y):=\Phi (x,y)$, $T_x(y):=T(x,y)$, and $S_x(y)=S(x,y)$. Lemmas 3.5, 3.6, and 4.1 tell us that
and that
or else Lemma 6.3 will again give the desired density increment. It follows that there exists a subset $A_0\subset A$ of density $\gg \tau$ in $A$ such that
and
for every $x\in A_0$. Setting
for all $x\in A_0$ we then have $\|f_x\|^4_{U^2(\Phi _x)}\geq \sigma ^4+\tau /4$, $0\leq f_x\leq \nu _x$, $\mathbb {E}_{y}f_x(y)\leq 1$, and
provided that $c_1$ is small enough and $c_2$ is large enough. Thus, as long as $c_3$ is sufficiently large, Lemma 6.1 tells us that there exists a function $\tilde {f}_x:\Phi _x\to [0,1]$ such that $\mathbb {E}_{y\in \Phi _x}f_x(y)=\mathbb {E}_{y\in \Phi _x}\tilde {f}_x(y)$ and $\|f_x-\tilde {f}_x\|_{U^2(\Phi _x)}^4\leq \tau /32$. As a consequence, since $\|f_x\|^4_{U^2(\Phi _x)}\geq \sigma ^4+ {\tau }/{4}$, we must have
as well. Set $\tilde {\sigma }_x:=\mathbb {E}_{y}f_x(y)=\mathbb {E}_{y}\tilde {f}_x(y)$ and let $v_x\in \Phi _x$, so that
Since $|\tilde {\sigma }_x-\sigma |<\tau /64$, it follows that there exists a nonzero $\xi _x\in \widehat {\Phi _x-v_x}$ such that
where we have crucially used that $\tilde {f}_x$ is $1$-bounded. As $\|f_x-\tilde {f}_x\|_{U^2(\Phi _x)}^4\leq \tau /32$, it therefore follows that
for every $x\in A_0$.
Extend $x\mapsto \xi _x$ from $A_0$ to $A$ by picking a nonzero $\xi _x\in \widehat {\Phi _x-v_x}$ arbitrarily for all $x\in A\setminus A_0$. We now split the average over $y\in \Phi _x$ above into an average of averages over cosets of $\langle \xi _x\rangle ^\perp$ in $\Phi _x$ and average over all of $A$ to get that
and use the fact that
to deduce that
By applying the pigeonhole principle in the $x$ and $t$ variables, it follows that there exists a subset $A_1\subset A$ of density $\gg \tau$ in $A$ and, for each $x\in A_1$, an element $t_x\in \mathbb {F}_p$ for which
Thus, recalling the definition of $f_x$, we have
Define $\phi :\mathbb {F}_p^n\to \mathbb {F}_p^n$ by taking $\phi (x)=\xi _x$ for all $x\in A$ and $\phi (x)$ to be an arbitrary element of $(\Phi _x-v_x)^\perp \setminus \{0\}$ for all $x\in \mathbb {F}_p^n\setminus A$, and similarly extend $x\mapsto t_x$ from $A_1$ to $A$ by taking $t_x$ to be an arbitrary element of $\mathbb {F}_p$ for which
for all $x\in A\setminus A_1$. Such an element must exist by the pigeonhole principle. Set
so that $\operatorname {codim}\{y\in \mathbb {F}_p^n:\Psi (x,y)=1\}=d+1$ for every $x\in A$, and $\alpha _1:=|A_1|/p^n$. Then (41) can be rewritten as
It remains to check that the density $\mathbb {E}_{x,y}A_1(x)B(y)C(x+y)D(2x+y)\Psi (x,y)$ is close to $\alpha _1\beta \gamma \delta \rho /p$, so that we indeed have the desired density increment. But by Lemmas 3.5 and 3.6, we have
for all but a $O(\sqrt {\varepsilon })$-proportion of $x\in A$, from which it follows that
Thus, $S$ has density at least $\sigma +\Omega (\tau )$ on
provided that $c_2$ is large enough. The conclusion of the lemma now follows from Lemma 6.6
Now suppose that (38) holds. By writing $S=g_S+\sigma T$ and arguing as in the first case, we may proceed under the assumption that
We will first show that either
for almost every pair $(x',y')\in S$, or else we can deduce the desired density increment using Lemma 6.3.
Consider the average
Using that $S=g_S+\sigma T$, the above can be written as
plus seven other terms of the form
where $g_0,g_1,$ and $g_2$ all equal $g_S$ or $\sigma T$ and at least one $g_i$ equals $g_S$. By Lemmas 3.5, 3.6, and 4.1, the quantity (42) equals $\sigma ^3\alpha ^2\beta ^2\gamma ^4\delta ^4\rho ^4+O(\varepsilon ^{\Omega (1)}/\rho ^{O(1)})$. Suppose that $k$ of the functions $g_0,g_1,$ and $g_2$ in (43) equal $\sigma T$. By a similar argument to those used to prove Lemma 6.4, if any term of the form (43) has size at least $\tau ^4\sigma ^k\alpha ^2\beta ^2\gamma ^4\delta ^4\rho ^4$, then
or
so that the desired density increment follows from Lemma 6.3. Thus, we may proceed under the assumption that
Now consider the average
Using that $S=g_S+\sigma T$, the above can be written as
plus 31 other terms of the form
where $g_0,g_1,g_2,g_3,$ and $g_4$ all equal $g_S$ or $\sigma T$ and at least one $g_i$ equals $g_S$.
By Lemmas 3.5, 3.6, and 4.1, the quantity (44) equals $\sigma ^5\alpha ^3\beta ^3\gamma ^7\delta ^7\rho ^7+O(\varepsilon ^{\Omega (1)}/\rho ^{O(1)})$. Suppose that $k$ of the functions $g_0,\ldots,g_4$ in (45) equal $\sigma T$. Analogously to the situation for the first moment, if any of the terms of the form (45) has size at least $\tau ^8\sigma ^k\alpha ^3\beta ^3\gamma ^7\delta ^7\rho ^7$, then we will be able to deduce the desired density increment. The most involved case is when $g_0=\cdots =g_4=g_S$. All other cases can be handled using a simpler version of the argument we are about to carry out.
Thus, consider this most involved case, i.e. that
has size at least $\tau ^8\alpha ^3\beta ^3\gamma ^7\delta ^7\rho ^7$. Applying the Cauchy–Schwarz inequality in the variables $x',y',y,$ and $w$ gives that
times
is at least $\tau ^8\alpha ^{6}\beta ^{6}\gamma ^{14}\delta ^{14}\rho ^{14}$. By Lemmas 3.5, 3.6, and 4.1, $\mathbb {E}_{x',y',y,w}T(x',y')T(x',y)T(x',w)=\alpha \beta ^3\gamma ^3\delta ^3\rho ^3+O(\varepsilon ^{\Omega (1)}/\rho ^{O(1)})$. Expanding the square in the average above, this means that
is $\gg \tau ^8\alpha ^5\beta ^3\gamma ^{11}\delta ^{11}\rho ^{11}$, provided that $c_1$ is small enough and $c_2$ is large enough. We can write the above as
where
Yet more applications of Lemmas 3.5, 3.6, and 4.1 give the estimate
so that
It follows from one more application of the Cauchy–Schwarz inequality in the variables $y',z,u,$ and $v$ and a similar analysis to above that
which, combined with Lemma 6.3, gives the desired density increment.
Thus, we may also proceed under the assumption that
By Markov's inequality, we therefore have
for all but a $O(\tau ^2)$-proportion of $(x',y')\in S$. As a consequence, there exists a pair $(x',y')\in S$ for which both (46) holds and
which together imply that
when we take $A'(x)=S(x,y')$, $B'(y)=S(x',y)$, $C'=C$, $D'=D$, and $\Phi '=\Phi$ in the definition of $T'$.
6.3 More preliminaries for $\|\cdot \|_{\star _1}$
We will similarly need that certain $\|\cdot \|_{\star _1}$-inner products are controlled by the degree $1$ and $2$ directional uniformity norms appearing in the previous subsections. The proof of the lemma below is similar to the proof of Lemma 6.4, but with an extra application of the Cauchy–Schwarz inequality in some cases.
Lemma 6.7 Let $d$ be a nonnegative integer, and set $\rho :=p^{-d}$. Suppose that $A,B,C,D\subset \mathbb {F}_p^n$ have densities $\alpha,\beta,\gamma,\delta$, respectively, and that $\Phi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ takes the form
where each $V_x$ is a subspace of $\mathbb {F}_p^n$ of codimension $d$. Let $\varepsilon >0$, and assume that
Define $T$ by (11), and let $S\subset T$ and $\tau >0$. If
where at least one $g_\omega$ equals $T$ and another equals $g_S$, then
or
Our final preliminary lemma says that, for almost every $(x,x+h_3)\in A^2$, the function $\Delta _{(h_3,0)}S(x,\cdot )$ is supported on a Fourier uniform subset of the affine subspace $\{y\in \mathbb {F}_p^n:\Delta _{(h_3,0)}\Phi (x,y)=1\}$.
Lemma 6.8 Let $d$ be a nonnegative integer, and set $\rho :=p^{-d}$. Suppose that $A,B,C,D\subset \mathbb {F}_p^n$ have densities $\alpha,\beta,\gamma,\delta$, respectively, and that $\Phi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ takes the form
where each $V_x$ is a subspace of $\mathbb {F}_p^n$ of codimension $d$. Let $\varepsilon >0$, and assume that
Setting
and
then the probability
is $\ll \varepsilon ^{\Omega (1)}/\rho ^{O(1)}$.
Proof. By Lemmas 3.6 and 4.1, we have that $\|R_{x,y}-\beta \gamma ^2\delta ^2\|_{U^2(\mathbb {F}_p^n)}\geq \varepsilon ^{1/8}$ or $\operatorname {codim}\{y\in \mathbb {F}_p^n:\Phi _{x,h}(y)=1\}\neq 2d$ for at most a $O(\varepsilon ^{\Omega (1)}/\rho ^{O(1)})$-proportion of pairs $(x,x+h)\in A^2$. For all of these typical pairs $(x,h)$, we have
which is at most
by inserting the identity
for every $z\in \mathbb {F}_p^n$. For each fixed triple $(\xi,\eta,\nu )$, the interior average above is bounded by $\|R_{x,h}-\beta \gamma ^2\delta ^2\|_{U^2(\mathbb {F}_p^n)}$ by the Gowers–Cauchy–Schwarz inequality. Since there are $\rho ^{-6}$ possible triples to sum over, we must have
from which the conclusion of the lemma follows.
6.4 Proof of Theorem 2.5
Now we can finally finish the proof of Theorem 2.5.
Proof of Theorem 2.5 First assume that
Analogously to the proof of Lemma 6.5, by writing $S=g_S+\sigma T$, we see that $\|S\|_{\star _1}^8$ equals
plus 62 terms of the form
where at least one $g_\omega$ equals $\sigma T$ and at least one other equals $g_S$. Note that
by Lemmas 3.5, 3.6, and 4.1. If any of the terms (47) have absolute value at least $\frac {1}{128}\tau ^8\alpha ^2\beta ^4\gamma ^8\delta ^8\rho ^6$, then combining Lemma 6.7 with Lemma 6.3 or Lemma 6.5 produces the desired density increment. We may thus proceed under the assumption that these terms are all small, so that
For each pair $(x,x+h)\in A^2$, let $R_{x,h}$ and $\Phi _{x,h}$ be as in Lemma 6.8, and set $S_{x,h}(y):=\Delta _{(h,0)}S(x,y)$. By Markov's inequality, either
or else $\mathbb {E}_{x,x+h\in A}|\mathbb {E}_{y}S_{x,h}(y)-\sigma ^2\beta \gamma ^2\delta ^2\rho ^2|^2$ is $\gg \tau ^{16}\beta ^2\gamma ^4\delta ^4\rho ^4$, in which case a combination of Lemmas 6.3, 6.4, and 6.5 will produce the desired density increment. We may thus proceed under the assumption that (48) holds.
By (48) and Lemma 6.8, there exists a subset $A_0\subset \{(x,h)\in \mathbb {F}_p^n\times \mathbb {F}_p^n:x,x+h\in A\}$ of density $\gg \tau ^{O(1)}\alpha ^2$ in $\mathbb {F}_p^n\times \mathbb {F}_p^n$ such that
and $\operatorname {codim}\{y\in \mathbb {F}_p^n:\Phi _{x,h}(y)=1\}=2d$ for all $(x,h)\in A_0$. Note that $S_{x,h}$ is supported on $R_{x,h}\Phi _{x,h}$, and, setting
we have $\|f_{x,h}\|^4_{U^2(\Phi _{x,h})}\geq \sigma ^8+\tau ^8/4$, $0\leq f_{x,h}\leq \nu _{x,h}$, $\mathbb {E} f_{x,h}\leq 1$, $|\mathbb {E} f_{x,h}-\sigma ^2|<\tau ^8/64$, and $\|\nu _{x,h}-1\|_{U^2(\Phi _{x,h})}<\varepsilon ^{1/32}/\beta \gamma ^2\delta ^2\rho ^{3/2}<\exp (-(64/\tau )^{c_3})$, provided that $c_1$ is small enough and $c_2$ is large enough. Applying Lemma 6.1 on the affine subspace $\Phi _{x,h}$ yields a function $\tilde {f}_{x,h}:\Phi _{x,h}\to [0,1]$ such that
provided that $c_3$ is large enough. We must then also have
Arguing as in the proof of the first part of Lemma 6.5, it follows that, for every pair $(x,h)\in A_0$, there exists a nonzero $\xi _{x,h}\in \widehat {\Phi _{x,h}-u}$ such that
so that
as well.
Extend $(x,h)\mapsto \xi _{x,h}$ from $A_0$ to the set $\{(x,h)\in \mathbb {F}_p^n\times \mathbb {F}_p^n:x,x+h\in A\}$ by picking a nonzero $\xi _{x,h}\in \widehat {\Phi _{x,h}-u}$ arbitrarily for all pairs outside of $A_0$. We split the average over $y\in \Phi _{x,h}$ up into an average of averages over cosets of $\langle \xi _{x,h}\rangle ^\perp$ and average over all pairs $(x,h)$ such that $x,x+h\in A$ to get that
for some absolute constant $c>0$. Note that
so that, if it were the case that
then we would be able to deduce the desired density increment from Lemma 6.3. Thus, we may proceed under the assumption
which we can combine with (49), a change of variables, and an application of the pigeonhole principle in the $t$ variable to deduce that
for some fixed $t\in \mathbb {F}_p$. Set
To deduce the desired density increment by applying the pigeonhole principle to (50), we will have to show that almost every set of the form
has close to the ‘correct’ density. Most of the remainder of our argument for $\|\cdot \|_{\star _1}$ is devoted to this task.
Set $Q_{x,h}(y):=S(h,y)C(x+y)D(2x+y)$. We start by showing that either $\|Q_{x,h}-\sigma \beta \gamma ^2\delta ^2\|_{U^2(\Phi _{x,h-x})}$ is small for almost every pair $(x,h)\in A\times A$, or else we can deduce a density increment from Lemmas 6.3 and 6.5. Note first that
where $\mu (y):=\mathbb {E}_{x}A(x)C(x+y)D(2x+y)\Phi (x,y)$. By Lemmas 3.5, 3.6, and 4.1, we have
so that $|\mathbb {E}_{x,h,y}A(x)A(h)\Phi _{x,h-x}(y)Q_{x,h}(y)-\sigma \alpha ^2\beta \gamma ^2\delta ^2\rho ^2|<\varepsilon ^{\Omega (1)}/\rho ^{O(1)}$. Similarly, the average $\mathbb {E}_{x,h}A(x)A(h)|\mathbb {E}_yQ_{x,h}(y)\Phi _{x,h-x}(y)-\sigma \beta \gamma ^2\delta ^2\rho ^2|^2$ equals
and $\mathbb {E}_{x,y,h,k}A(x)A(h)\Phi _{x,h-x}(y)\Phi _{x,h-x}(y+k)Q_{x,h}(y)Q_{x,h}(y+k)$ equals
where
By Lemmas 3.5, 3.6, and 4.1, we have
so that
equals
Thus, if
for some $c'>0$ (to be chosen later), and $c_1$ is small enough and $c_2$ is large enough, then
in which case the desired density increment follows from Lemma 6.3. We may thus proceed under the assumption that
so that, by Markov's inequality, we have
It follows that $\mathbb {E}_{x,h}A(x)A(h)\|Q_{x,h}-\sigma \beta \gamma ^2\delta ^2\|_{U^2(\Phi _{x,h-x})}^4$ equals
If
then we could deduce the desired density increment from Lemma 6.5. Thus, we may proceed under the assumption that this inequality does not hold, which implies that $\mathbb {E}_{x,h}A(x)A(h)\|Q_{x,h}-\sigma \beta \gamma ^2\delta ^2\|_{U^2(\Phi _{x,h-x})}^4 \ll \tau ^{4c'}\alpha ^2\beta ^4\gamma ^8\delta ^8$. It then follows from Markov's inequality that
Next, we will show that, for typical $(x,h)\in A\times A$, the average size of $S(x,y)S(h,y)\Psi _{h}(x,y)$ is not very large. Certainly,
for every $(x,h)\in A\times A$. Setting
Lemma 3.6 says that
Thus, by Lemma 3.5, as long as $c_2$ is large enough, we have
say.
Now, setting $c'=8c$, the contribution to the left-hand side of (50) coming from pairs $(x,h)\in A\times A$ for which $\|Q_{x,h}-\sigma \beta \gamma ^2\delta ^2\|_{U^2(\Phi _{x,h-x})}\gg \tau ^{16c}\beta \gamma ^2\delta ^2$ or $|\mathbb {E}_y S(x,y)S(h,y)\Psi _h(x,y)|>100\beta \gamma ^2\delta ^2\rho ^2/p$ is $\ll \tau ^{16c}$. Thus, by the pigeonhole principle, there exists an $h\in A$ and a subset $A'\subset A$ of density $\alpha '\gg \tau ^{O(1)}\alpha$ in $\mathbb {F}_p^n$ such that
and $\|Q_{x,h}-\sigma \beta \gamma ^2\delta ^2\|_{U^2(\Phi _{x,h-x})}\ll \tau ^{16c}\beta \gamma ^2\delta ^2$ for every $x\in A'$. Recalling the definition of $f_{x,h}$, the above displayed equation says that
We now use Lemma 6.6 to find a subset $A''\subset A'$ of density $\alpha ''\gg \alpha '\rho \tau ^{O(1)}$ and a $u'\in \mathbb {F}_p^n$ such that
where
This gives the conclusion of the lemma.
The proofs of the two remaining cases are very similar to those appearing in earlier subsections, so we will be more brief in our arguments. Next, assume that
As in the second part of the proof of Lemma 6.5, we may proceed under the assumption that
and use it to show that either
for almost every pair $(x',y')$ for which $(-x',x'+y')\in S$, or else the desired density increment follows from Lemma 6.3. By Lemmas 3.5, 3.6, and 4.1 and the Cauchy–Schwarz inequality, either
and
or else we have the desired density increment from Lemma 6.3. It then follows from Markov's inequality that
for all but a $\tau /4$-proportion of pairs $(x',y')$ for which $(-x',x'+y')\in S$, which means that we can find such a pair for which we also have
so that we get the desired density increment by taking $A'=S(x,y'-x)$, $B'=B$, $C'=S(-x',x+y+x')$, $D'=D$, and $\Phi '=\Phi$ in the definition of $T'$.
Now assume that
which means
By Lemma 6.2 and the pigeonhole principle, there exists a subset $D_0\subset D$ of density at least $\tau ^2/2$ for which
whenever $z\in D_0$. As in the proof of Lemma 6.3, there is a subset $D_1\subset D_0$ of density at least $1/2$ in $D_0$ such that either $\mathbb {E}_{2x+y=z}g_S(x,y)\geq \tau \alpha \beta \gamma \rho /4$ or $\mathbb {E}_{2x+y=z}g_S(x,y)\leq -\tau \alpha \beta \gamma \rho /4$ for all $z\in D_1$. As in the proof of Lemma 6.3, we can simply take $D'=D_1$ and $D'=D\setminus D_1$ in these cases, respectively, since, by Lemma 6.2, the fibers $\{(x,y)\in T:2x+y=z\}$ typically have very close to their average size.
7. Pseudorandomization
This section begins with yet more preliminaries. To prove Lemma 2.6, we will need a result of Cohen and Tal, which says that, for any finite set of polynomials in $\mathbb {F}_p[x_1,\ldots,x_m]$, one can find a partition of $\mathbb {F}_p^m$ into affine subspaces of relatively large dimension on which all of the polynomials are constant.
Theorem 7.1 (Cohen and Tal [Theorem 3.6, Reference Cohen and TalCT15] specialized to prime fields)
Let $d,m,$ and $t$ be natural numbers. There exists a positive integer $m'$ satisfying
such that, for any polynomials $P_1,\ldots,P_t\in \mathbb {F}_p[x_1,\ldots,x_m]$ of degree at most $d$, there is a partition of $\mathbb {F}_p^m$ into affine subspaces of dimension $m'$ such that $P_1,\ldots,P_t$ are constant on each affine subspace.
To see that this formulation of Cohen and Tal's theorem is equivalent to Theorem 3.6 of [Reference Cohen and TalCT15], note that one can make all of the affine subspaces in the partitions produced by their theorem have the same dimension by simply partitioning each subspace not of the minimum possible dimension into more subspaces.
We will also need a ‘bilinear’ version of Cohen and Tal's result, which we will use to find partitions of $\mathbb {F}_p^n\times \mathbb {F}_p^n$ into product spaces of the form $(u+V)\times (w+V)$ on which polynomials in two sets of variables $x_1,\ldots,x_m,y_1,\ldots,y_m$ are constant.
Corollary 7.2 Let $d$ and $m$ be natural numbers. There exists a positive integer $m'$ satisfying
such that, for any polynomial $R\in \mathbb {F}_p[x_1,\ldots,x_m,y_1,\ldots,y_m]$ of degree at most $d$, there is a partition of $\mathbb {F}_p^m\times \mathbb {F}_p^m$ into products of affine subspaces of the form
with each $\dim V= m'$, such that $R$ is constant on each product of affine subspaces.
Proof. Write
where $P_0,\ldots,P_d,Q_0,\ldots,Q_d\in \mathbb {F}_p[z_1,\ldots,z_m]$ satisfy $\deg {P_i}\leq i$ and $\deg {Q_i}\leq d-i$ for all $0\leq i \leq d$. Applying Theorem 7.1 to $P_0,\ldots,P_d$ gives us a positive integer $m'_0\gg {m^{1/(d-1)!}}/{(d+1)^e}$ and a partition
of $\mathbb {F}_p^m$, with $\dim {V_j}=m'_0$ for each $j\in J$, such that $P_0,\ldots,P_d$ are all constant on each affine subspace $u_j+V_j$.
Now write
Since restricting a polynomial to a subspace cannot increase its degree, for each $j\in J$, we can apply Theorem 7.1 on each affine subspace $w+V_j$ to the polynomial
to get that there exists a positive integer $m'$ satisfying
and a partition
with each subspace $V_k'\leq V_j$ having $\dim V_k'=m'$, such that $Q_{j,w}$ is constant on each $w_k+V_k'$.
Thus, we can write
where $R$ is constant on each product $(u_j+V_j)\times (w_k+V_k')$. Since $V_k'\leq V_j$ we can further refine this partition to one of the desired form
on each part of which $R$ is still constant.
Finally, we will recall the recent quantitative inverse theorem of Gowers and Milićević for the $U^s$-norms on vector spaces over finite fields.
Theorem 7.3 (Gowers and Milićević [Theorem 7, Reference Gowers and MilićevićGM20])
Let $s$ be a natural number and assume that $p\geq s$. There exist constants $c_s,c_{s,p}'>0$ so that, if $f:\mathbb {F}_p^m\to \mathbb {C}$ is a $1$-bounded function satisfying
then there exists a polynomial $P\in \mathbb {F}_p[x_1,\ldots,x_m]$ of degree $\deg {P}\leq s-1$ such that
7.1 Proof of Lemma 2.6
Our proof of Lemma 2.6 is modeled after the corresponding pseudorandomization arguments in [Reference ShkredovShk06a, Reference ShkredovShk06b] and [Reference GreenGre05a]. As was said in the outline, some new features arise from our desire for $B',C',D',$ and $\Phi '$ to be uniform with respect to $U^s$-norms of degree greater than $2$ and from $\Phi$'s particular structure as a union of affine subspaces of the second factor of $\mathbb {F}_p^n\times \mathbb {F}_p^n$. The first of these two complications can be overcome by using Theorem 7.3 and then Theorem 7.1 or Corollary 7.2. We will discuss how $\Phi$'s structure influences our proof shortly.
The proof of Lemma 2.6 proceeds via an energy-increment argument. Each step of the energy-increment iteration will produce a partition $\mathscr {C}_{j}=(\mathcal {C}_{i,j})_{i\in I_j}$ of $\mathbb {F}_p^n\times \mathbb {F}_p^n$ into cells $\mathcal {C}_{i,j}$ of the form
for some subspace $V_{i,j}\leq \mathbb {F}_p^n$. For each cell $\mathcal {C}=(u+V)\times (w+V)$ in a partition $\mathscr {C}$, we set
– $B_{\mathcal {C}}:= B\cap (w+V)$,
– $C_{\mathcal {C}}:= C\cap (u+w+V)$,
– $D_{\mathcal {C}}:= D\cap (2u+w+V)$, and
– $\Phi _{\mathcal {C}}:=\Phi \cap \mathcal {C}$,
and, correspondingly,
– $\beta (\mathcal {C}):=\mu _{w+V}(B_{\mathcal {C}})$,
– $\gamma (\mathcal {C}):=\mu _{u+w+V}(C_{\mathcal {C}})$,
– $\delta (\mathcal {C}):=\mu _{2u+w+V}(D_{\mathcal {C}})$, and
– $\phi (\mathcal {C}):=\mu _{\mathcal {C}}(\Phi _{\mathcal {C}})$,
so that
Analogously to the pseudorandomization procedure for corners, we will show that if $\|B_{\mathcal {C}}-\beta (\mathcal {C})\|_{U^8(\mathbb {F}_p^n)},\|C_{\mathcal {C}}-\gamma (\mathcal {C})\|_{U^8(\mathbb {F}_p^n)},$ or $\|D_{\mathcal {C}}-\delta (\mathcal {C})\|_{U^8(\mathbb {F}_p^n)}$ is large for a substantial portion of cells $\mathcal {C}$ in a partition $\mathscr {C}$, then there exists a refinement $\mathscr {C}'$ of $\mathscr {C}$ which has substantially larger energy. But even if $\Phi _{\mathcal {C}}$ is a union of affine subspaces of the same codimension $d$ for most cells in the partition, this may not be the case for the $\Phi _{\mathcal {C}'}$ corresponding to the cells $\mathcal {C}'$ in the refinement $\mathscr {C}'$. The codimensions of the affine subspaces $\{y\in w'+V':\Phi _{\mathcal {C}'}(x,y)=1\}$ can range from $0$ to $d$, so before even considering how to obtain a pseudorandom $\Phi '$, we have already found an obstacle to even getting a set of the same general form as $\Phi '$.
To get around this issue, we will pseudorandomize each of the sets
for $0\leq i\leq d$, instead of just $\Phi _{\mathcal {C}}=\Phi ^{\leq d}_{\mathcal {C}}$ itself. The definition (54) of $\Phi ^{\leq i}_{\mathcal {C}}$ selects all of the subspaces of the second factor of $\mathcal {C}$ comprising $\Phi _{\mathcal {C}}$ that have codimension at most $i$. This will pseudorandomize each of
as well. At the end of the proof of Lemma 2.6, we will use an averaging argument to choose $\Phi '$ to be some suitable $\Phi _{\mathcal {C}}^i$.
For each $0\leq i\leq d$, set $\phi ^{\leq i}(\mathcal {C}):=\mu _{\mathcal {C}}(\Phi _{\mathcal {C}}^{\leq i})$. We define the energy $E(\mathscr {C})$ of a partition $\mathscr {C}$ of $\mathbb {F}_p^n\times \mathbb {F}_p^n$ by
Note that the energy of any partition is bounded above by $1$. We will now prove a couple of lemmas concerning this energy.
Lemma 7.4 Let $m'>m$ and $d$ be nonnegative integers, $A,B,C,D\subset \mathbb {F}_p^n$, $\Phi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ be of the form
where each $V_x$ is a subspace of $\mathbb {F}_p^n$ of codimension between $0$ and $d$, $\mathscr {C}$ be a partition of $\mathbb {F}_p^n\times \mathbb {F}_p^n$ with each cell $\mathcal {C}$ taking the form
for some subspace $V_{\mathcal {C}}\leq \mathbb {F}_p^n$ of codimension $m$, and suppose that $\mathscr {C}'$ is a refinement of $\mathscr {C}$ with each cell $\mathcal {C}'$ taking the form
for some subspace $V'_{\mathcal {C}'}$ of codimension $m'$. Then
and
for every $0\leq i\leq d$.
Proof. Note that it suffices to prove the result with the sum over $\mathcal {C}\in \mathscr {C}$ restricted to a single cell $\mathcal {C}_0$ and the sum over $\mathcal {C}'\in \mathscr {C}'$ restricted to all cells contained in $\mathcal {C}_0$, since one can then just sum over $\mathcal {C}_0$ in $\mathscr {C}$ to get the desired result. Thus, we may assume without loss of generality that $\mathscr {C}$ is the trivial partition $\{\mathbb {F}_p^n\times \mathbb {F}_p^n\}$.
Let $\beta,\gamma,$ and $\delta$ denote the densities of $B,C,$ and $D$, respectively, in $\mathbb {F}_p^n$. Since $\mathbb {F}_p^n\times B$ has density $\beta$ in $\mathbb {F}_p^n\times \mathbb {F}_p^n$, we have
so that, by the Cauchy–Schwarz inequality,
as desired, since
for all cells $\mathcal {C}'$ of $\mathscr {C}'$. Similarly, since $\{(x,y)\in \mathbb {F}_p^n\times \mathbb {F}_p^n:x+y\in C\}$ has density $\gamma$ in $\mathbb {F}_p^n\times \mathbb {F}_p^n$ and $\{(x,y)\in \mathbb {F}_p^n\times \mathbb {F}_p^n:2x+y\in D\}$ has density $\delta$ in $\mathbb {F}_p^n\times \mathbb {F}_p^n$, we have
and
it follows again from the Cauchy–Schwarz inequality that
and
The argument for the $\phi ^{\leq i}$ is also essentially identical, but with one small difference. For ease of notation, set $\Phi ^{\leq i}:=\Phi ^{\leq i}_{\mathbb {F}_p^n\times \mathbb {F}_p^n}$ and $\phi ^{\leq i}:=\phi ^{\leq i}(\mathbb {F}_p^n\times \mathbb {F}_p^n)$ for all $0\leq i\leq d$. Then we actually have
instead of equality, since $\Phi ^{\leq i}\cap \mathcal {C}'\subset \Phi ^{\leq i}_{\mathcal {C}'}$ (which is why we run the energy-increment argument with the $\Phi ^{\leq i}$, instead of the $\Phi ^i$). It therefore follows yet again from the Cauchy–Schwarz inequality that
for all $0\leq i\leq d$.
Lemma 7.5 Let $m$ and $d$ be nonnegative integers, $A,B,C,D\subset \mathbb {F}_p^n$, $\Phi \subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ be of the form
where each $V_x$ is a subspace of $\mathbb {F}_p^n$ of codimension between $0$ and $d$, and $\mathscr {C}$ be a partition of $\mathbb {F}_p^n\times \mathbb {F}_p^n$ with each cell $\mathcal {C}$ taking the form
for some subspace $V_{\mathcal {C}}\leq \mathbb {F}_p^n$ of dimension $m$. There exists a positive integer
and positive integers $c,c'_p>0$, such that the following holds.
Let $\mathcal {C}\in \mathscr {C}$.
(i) If $\|B_{\mathcal {C}}-\beta (\mathcal {C})\|_{U^{10}(w_\mathcal {C}+V_{\mathcal {C}})}\geq \varepsilon$, then there exists a partition $\mathscr {C}'_{\mathcal {C}}$ of $\mathcal {C}$ with each cell $\mathcal {C}'$ taking the form
\[ \mathcal{C}'=(u_{\mathcal{C}'}'+V_{\mathcal{C}'}')\times(w_{\mathcal{C}'}'+V_{\mathcal{C}'}'), \]with each $\dim V_{C'}'=m'$, such that\[ \sum_{\mathcal{C}'\in\mathscr{C}'_{\mathcal{C}}}\beta(\mathcal{C}')^2 \mu_{\mathcal{C}}(\mathcal{C}')\geq\beta(\mathcal{C})^2+\Omega\bigg(\frac{1}{\exp^c(c'_p/\varepsilon)^2}\bigg). \](ii) If $\|C_{\mathcal {C}}-\gamma (\mathcal {C})\|_{U^{10}(u_{\mathcal {C}}+w_\mathcal {C}+V_{\mathcal {C}})}\geq \varepsilon$, then there exists a partition $\mathscr {C}'_{\mathcal {C}}$ of $\mathcal {C}$ with each cell $\mathcal {C}'$ taking the form
\[ \mathcal{C}'=(u_{\mathcal{C}'}'+V_{\mathcal{C}'}')\times(w_{\mathcal{C}'}'+V_{\mathcal{C}'}'), \]with each $\dim V_{C'}'=m'$, such that\[ \sum_{\mathcal{C}'\in\mathscr{C}'_{\mathcal{C}}}\gamma(\mathcal{C}')^2 \mu_{\mathcal{C}}(\mathcal{C}')\geq\gamma(\mathcal{C})^2+\Omega\bigg(\frac{1}{\exp^c(c'_p/\varepsilon)^2}\bigg). \](iii) If $\|D_{\mathcal {C}}-\delta (\mathcal {C})\|_{U^{10}(2u_{\mathcal {C}}+w_\mathcal {C}+V_{\mathcal {C}})}\geq \varepsilon$, then there exists a partition $\mathscr {C}'_{\mathcal {C}}$ of $\mathcal {C}$ with each cell $\mathcal {C}'$ taking the form
\[ \mathcal{C}'=(u_{\mathcal{C}'}'+V_{\mathcal{C}'}')\times(w_{\mathcal{C}'}'+V_{\mathcal{C}'}'), \]with each $\dim V_{C'}'=m'$, such that\[ \sum_{\mathcal{C}'\in\mathscr{C}'_{\mathcal{C}}}\delta(\mathcal{C}')^2 \mu_{\mathcal{C}}(\mathcal{C}')\geq\delta(\mathcal{C})^2+\Omega\bigg(\frac{1}{\exp^c(c'_p/\varepsilon)^2}\bigg). \](iv) Let $0\leq i\leq d$. If $\|\Phi ^{\leq i}_{\mathcal {C}}-\phi ^{\leq i} (\mathcal {C})\|_{U^{8}(\mathcal {C})}\geq \varepsilon$, then there exists a partition $\mathscr {C}'_{\mathcal {C}}$ of $\mathcal {C}$ with each cell $\mathcal {C}'$ taking the form
\[ \mathcal{C}'=(u_{\mathcal{C}'}'+V_{\mathcal{C}'}')\times(w_{\mathcal{C}'}'+V_{\mathcal{C}'}'), \]with each $\dim V_{C'}'=m'$, such that\[ \sum_{\mathcal{C}'\in\mathscr{C}'_{\mathcal{C}}}\phi^{\leq i}(\mathcal{C}')^2 \mu_{\mathcal{C}}(\mathcal{C}')\geq\phi^{\leq i}(\mathcal{C})^2+\Omega\bigg(\frac{1}{\exp^c(c'_p/\varepsilon)^2}\bigg). \]
Proof. Let $c$ and $c'_p$ denote the constants $c_{10}$ and $c_{10,p}'$, respectively, from Theorem 7.3, and $m'$ denote the smaller of the two minimum values of $m'$ appearing in Theorem 7.1 when we take $m$ as in this lemma, $d=9$, and $t=1$ and $m'$ appearing in Corollary 7.2 when we take $m$ as in this lemma and $d=7$.
First assume that $\|B_{\mathcal {C}}-\beta (\mathcal {C})\|_{U^{10}(w_{\mathcal {C}}+V_{\mathcal {C}})}\geq \varepsilon$. Then applying Theorem 7.3 with $s=10$ yields a polynomial $P\in \mathbb {F}_p[x_1,\ldots,x_m]$ of degree at most $9$ such that
By Theorem 7.1, there exists a partition $(w_i+V_i)_{i\in I}$ of $w_{\mathcal {C}}+V_{\mathcal {C}}$ into affine subspaces of $w_{\mathcal {C}}+V_{\mathcal {C}}$ of dimension $m'$, on each of which $P$ is constant. Thus, by the triangle inequality,
so that, by the Cauchy–Schwarz inequality,
Expanding the square, this means that
Now we partition the whole cell of interest $\mathcal {C}$ by writing
and, for each $i\in I$, use that $V_i\leq V_{\mathcal {C}}$ to partition $u_{\mathcal {C}}+V_{\mathcal {C}}$ into cosets of $V_i$ to get
Since $\mu _{\mathcal {C}}(\mathcal {C}')=|\mathscr {C}'|^{-1}$ for each $\mathcal {C}'\in \mathscr {C}_{\mathcal {C}}'$, (55) reads
The arguments for $C_{\mathcal {C}}$ and $D_{\mathcal {C}}$ are again analogous, but we include them for the sake of completeness. Next, assume that $\|C_{\mathcal {C}}-\gamma (\mathcal {C})\|_{U^{10}(u_{\mathcal {C}}+w_{\mathcal {C}}+V_{\mathcal {C}})}\geq \varepsilon$. Applying Theorem 7.3 yields a polynomial $P\in \mathbb {F}_p[x_1,\ldots,x_m]$ of degree at most $9$ such that
By Theorem 7.1, there exists a partition $(v_i+V_i)_{i\in I}$ of $u_{\mathcal {C}}+w_{\mathcal {C}}+V_{\mathcal {C}}$ into affine subspaces of $u_{\mathcal {C}}+w_{\mathcal {C}}+V_{\mathcal {C}}$ of dimension $m'$ on each of which $P$ is constant. Thus, by the Cauchy–Schwarz inequality again,
Now we partition the whole cell $\mathcal {C}$ by writing
Since $(v_i-w_{\mathcal {C}}+v'+V_i)+(w_{\mathcal {C}}-v'+V_i)=v_i+V_i$, we conclude that
Now assume that $\|D_{\mathcal {C}}-\delta (\mathcal {C})\|_{U^{10}(2u_{\mathcal {C}}+w_{\mathcal {C}}+V_{\mathcal {C}})}\geq \varepsilon$. Applying Theorem 7.3 yields a polynomial $P\in \mathbb {F}_p[x_1,\ldots,x_m]$ of degree at most $9$ such that
By Theorem 7.1, there exists a partition $(v_i+V_i)_{i\in I}$ of $2u_{\mathcal {C}}+w_{\mathcal {C}}+V_{\mathcal {C}}$ into affine subspaces of $2u_{\mathcal {C}}+w_{\mathcal {C}}+V_{\mathcal {C}}$ of dimension $m'$ on each of which $P$ is constant. Thus, by the Cauchy–Schwarz inequality yet again,
Now we partition the whole cell $\mathcal {C}$ by writing
Since $(2u_{\mathcal {C}}+2v'+V_i)+(v_i-2u_{\mathcal {C}}-2v'+V_i)=v_i+V_i$, we conclude that
Finally, suppose that $\|\Phi ^{\leq i}_{\mathcal {C}}-\phi ^{\leq i}(\mathcal {C})\|_{U^{8}(\mathcal {C})}\geq \varepsilon$ for some $0\leq i\leq d$. Theorem 7.3 then says that there exists a polynomial $R\in \mathbb {F}_p[x_1,\ldots,x_m,y_1,\ldots,y_m]$ of degree at most $7$ such that
By Corollary 7.2, there exists a partition $\mathscr {C}'_{\mathcal {C}}$ of $\mathcal {C}$ into affine subspaces of the form $(u+V)\times (w+V)$ with $\dim V=m'$, on each of which $R$ is constant. Thus, by the Cauchy–Schwarz inequality, we have
Since
the conclusion
now follows.
Now we can prove Lemma 2.6.
Proof of Lemma 2.6 We proceed via an energy-increment argument, as described at the beginning of the subsection. A cell $\mathcal {C}=(u+V)\times (w+V)$ in a partition $\mathscr {C}_j$ is said to be expired if $\beta (\mathcal {C}),\gamma (\mathcal {C}),\delta (\mathcal {C}),$ or $\phi ^{\leq d}(\mathcal {C})$ is less than $\tau \mu _{\mathbb {F}_p^n\times \mathbb {F}_p^n}(T)/4$, and a nonexpired cell $\mathcal {C}$ is said to be uniform if
and
for all $0\leq i\leq d$. We will denote the subset of expired cells of $\mathscr {C}_j$ by $\mathscr {E}_j$, the subset of uniform cells by $\mathscr {U}_j$, and the subset of nonexpired, nonuniform cells by $\mathscr {N}_j$, so that $\mathscr {E}_j,\mathscr {U}_j$, and $\mathscr {N}_j$ partition $\mathscr {C}_j$. For any subset $K\subset I_j$, we define $\eta (K)$ to be the measure of all cells indexed by $K$:
Finally, we define a sequence of integers $(m_j)_{j=0}^{\infty }$ by setting $m_0=n$ and, for every $j>0$, $m_{j}$ to be the minimum of the value of $m'$ appearing in Theorem 7.1 when we take $m=m_{j-1}$, $d=9$, and $t=1$ and of $m'$ appearing in Corollary 7.2 when we take $m=m_{j-1}$ and $d=7$, so that
for some absolute constants $0< c_1,c_2<1$.
Set $\mathscr {C}_0$ to be the trivial partition $\{\mathbb {F}_p^n\times \mathbb {F}_p^n\}$ of $\mathbb {F}_p^n\times \mathbb {F}_p^n$. Letting $c=c_{10}$ and $c'=c'_{10,p}$ be as in Lemma 7.5, then, as long as $\eta (\mathscr {N}_j)\geq \tau \mu _{\mathbb {F}_p^n\times \mathbb {F}_p^n}(T)/2$ and $m_{j+1}\geq 1$, there exists a refinement $\mathscr {C}_{j+1}$ of $\mathscr {C}_j$ such that:
(i) $\dim {V_{i,j+1}}\geq m_{j+1}$ for every $i\in I_{j+1}$ and $\dim {V_{i,j+1}}=m_{j+1}$ whenever $\mathcal {C}_{i,j+1}\in \mathscr {N}_{j+1}$; and
(ii) $E(\mathscr {C}_{j+1})\geq E(\mathscr {C}_j)+\Omega ( {\tau \mu _{\mathbb {F}_p^n\times \mathbb {F}_p^n}(T)}/{d\exp ^c(c'_p/\varepsilon )^2})$.
Indeed, suppose that $\eta (\mathscr {N}_j)\geq \tau \mu _{\mathbb {F}_p^n\times \mathbb {F}_p^n}(T)/2$. Each cell $\mathcal {C}$ in $\mathscr {N}_j$ must be of dimension $m_j\times m_j$. By Lemmas 7.4 and 7.5, there exists a partition $\mathscr {C}_{k,j+1}=(\mathcal {C}_{k,j+1})_{k\in K_{\mathcal {C}}}$ of each $\mathcal {C}\in \mathscr {N}_j$ such that
is at least
and each $\mathcal {C}_{k,j+1}$ is of the form $(u'+V')\times (w'+V')$ with $\dim {V'}=m_{j+1}\geq 1$. Taking
we see that multiplying both sides of the above by $\mu _{\mathbb {F}_p^n\times \mathbb {F}_p^n}(\mathcal {C})$ and summing over $\mathcal {C}\in \mathscr {N}_j$ yields
Since $E(\mathscr {C})\leq 1$ for all partitions $\mathscr {C}$, this iteration must terminate for some
at which point either $\eta (\mathscr {N}_{j_0})<\tau \mu _{\mathbb {F}_p^n\times \mathbb {F}_p^n}(T)/2$ or $m_{j_0+1}<1$. Assuming that $n\geq c_1^{-c_2^{-(j_0+1)}}$ ensures that the latter case cannot occur.
Since $\eta (\mathscr {E}_j)<\tau \mu _{\mathbb {F}_p^n\times \mathbb {F}_p^n}(T)/4$, we have
This certainly implies that
so that, by the pigeonhole principle, there exists a cell $\mathcal {C}_0$ in $\mathscr {U}_{j_0}$ for which
Since $\Phi _{\mathcal {C}_0}=\coprod _{i=0}^d\Phi ^{i}_{\mathcal {C}_0}$, another application of the pigeonhole principle tells us that there exists a $0\leq i\leq d$ for which we also have the density increment
As noted at the beginning of this subsection,
so that the conclusion of the lemma now follows.
8. The density-increment argument
Now we can finally prove Theorem 1.2 by iterating Lemma 2.7.
Proof of Theorem 1.2 Suppose that $S\subset \mathbb {F}_p^n\times \mathbb {F}_p^n$ has density $\sigma$ and contains no nontrivial L-shaped configurations. Set $S_0:=S$, $n_0:=n$, $d_0=0$, $\varepsilon _0:=1$, $A_0=B_0=C_0=D_0=\mathbb {F}_p^n$, and $\Phi _0:=\mathbb {F}_p^n\times \mathbb {F}_p^n$. Applying Lemma 2.7 repeatedly produces sequences of $S_i$, $n_i$, $d_i$, $\varepsilon _i$, $A_i$, $B_i$, $C_i$, $D_i$, and $\Phi _i$, with $A_i,B_i,C_i,D_i\subset \mathbb {F}_p^{n_i}$ and $\Phi _i\subset A_i\times \mathbb {F}_p^{n_i}$ of the form
where each $V_x\leq \mathbb {F}_p^{n_i}$ is a subspace of codimension $d_i$, such that, on setting
$\alpha _i:=\mu _{\mathbb {F}_p^{n_i}}(A_i)$, $\beta _i:=\mu _{\mathbb {F}_p^{n_i}}(B_i)$, $\gamma _i:=\mu _{\mathbb {F}_p^{n_i}}(C_i)$, $\delta _i:=\mu _{\mathbb {F}_p^{n_i}}(D_i)$, and $\rho _i:=\mu _{\mathbb {F}_p^{n_i}\times \mathbb {F}_p^{n_i}}(\Phi _i)/\alpha _i=p^{-d_i}$, we have, for each $i\geq 1$, that:
(i) $S_i\subset T_i$ has density $\sigma _i$ in $T_i$, where $\sigma _i\geq \sigma _{i-1}+\Omega (\sigma ^{O(1)})$;
(ii) $n_i\gg n_{i-1}^{c_1^{O(\exp ^c(c'/\varepsilon _i)/(\sigma \alpha _{i-1} \beta _{i-1}\gamma _{i-1}\delta _{i-1}\rho _{i-1})^{O(1)})}}$;
(iii) $\varepsilon _i\leq (\sigma \alpha _{i}\beta _{i}\gamma _{i}\delta _{i}\rho _{i})^{O(1)}\exp (-(64/\sigma )^{O(1)})$;
(iv) $\alpha _i,\beta _i,\gamma _i,\delta _i\gg (\sigma \alpha _{i-1}\beta _{i-1} \gamma _{i-1}\delta _{i-1}\rho _{i-1})^{O(1)}$;
(v) $d_i\leq d_{i-1}+1$;
(vi) $\|A_i-\alpha _i\|_{U^{10}(\mathbb {F}_p^{n_i})}$, $\|B_i-\beta _i\|_{U^{10}(\mathbb {F}_p^{n_i})}$, $\|C_i-\gamma _i\|_{U^{10}(\mathbb {F}_p^{n_i})}$, $\|D_i-\delta _i\|_{U^{10}(\mathbb {F}_p^{n_i})}$, $\|\Phi _i- \alpha _i\rho _i\|_{U^8(\mathbb {F}_p^{n_i}\times \mathbb {F}_p^{n_i})}<\varepsilon _i$; and
(vii) $S_i$ contains no nontrivial L-shaped configurations;
provided that
Since no set can have density larger than $1$, the lower bound (56) must fail for some $i=i_0+1\ll \sigma ^{-O(1)}$. Thus, there exists an absolute constant $c''>1$ such that
while, on the other hand,
Comparing the upper and lower bounds for $n_{i_0}$ and taking the $c''$-fold iterated logarithm of both sides yields the bound in Theorem 1.2.
Acknowledgements
The author thanks Ben Green and Freddie Manners for helpful discussions, and Ben Green, Noah Kravitz, Terry Tao, and the anonymous referees for useful comments on earlier drafts. The author was supported by the NSF Mathematical Sciences Postdoctoral Research Fellowship Program under Grant No. DMS-1903038 and the Oswald Veblen fund, and also gratefully acknowledges the support and hospitality of the Hausdorff Institute for Mathematics, where the bulk of this paper was written.
Conflicts of Interest
None.