1 Introduction
The study of moments of families of L-functions has a long history. One strand of research concerns the estimation of moments in order to secure strong subconvexity bounds. Another direction is to consider the structural properties of the moments, especially in their connections with random matrix theory. One of the long-standing guiding principles concerns analogies between different families, particularly the consideration of the Riemann zeta function in the t-aspect on the one side and the family of Dirichlet L-functions in the q-aspect on the other side.
We begin by discussing some moment problems for the zeta function. This is a vast subject, and we only briefly touch on a few results pertinent to our narrative. In [HB2], Heath-Brown studied the twelfth moment of the Riemann zeta function, finding that

a result which easily recovers the Weyl bound while also proving in a strong quantitative form that
$|\zeta (1/2+it)|$
cannot be too large too often. Heath-Brown’s technique for proving (1.1) is based on leveraging information from short second moments of the zeta function. A simple modification of [HB2, Lemma 1] implies

for
$T \geq 1$
. See also [T2, Section 7.4] and [I, Chapter 7] for more discussion and other related results. For example, [T2, Theorem 7.4] gives that

This error term has been improved many times over the years, and it seems the first improvement is due to Titchmarsh himself [T1], attaining
$\alpha = 5/12$
. Note that (1.3) leads to an asymptotic formula for the second moment in a short interval of length at least
$T^{\alpha +\varepsilon }$
, simply by writing
$\int _T^{T+\Delta } = \int _0^{T+\Delta } - \int _0^{T}$
.
Next, we discuss some prior works on q-analogs of these results. Nunes [N] proved an analog to (1.1) for smooth square-free moduli q. In a complementary direction, Milićević and White [MW] proved a variant of (1.1) in the depth aspect (i.e., fixing a prime p and letting
$q=p^j$
with
$j \rightarrow \infty $
). Both of these works consider upper bounds on ‘short’ second moments, where in this context, ‘short’ refers to a coset. Petrow and Young [PY] obtained an upper bound on the fourth moment of Dirichlet L-functions along a coset of size
$\gg q^{2/3+\varepsilon }$
. For the full family second moment, Heath-Brown [HB3] proved

where with
$\gamma _0 = 0.577\dots $
representing Euler’s constant, we have

One remark that is in order here is that Heath-Brown considers a sum over all Dirichlet characters, and not just the primitive ones. A second comment is that there is no obvious way to extract from (1.4) a formula for the second moment restricted to a coset, in contrast to the t-integral analog for the zeta function. Finally, we remark that for composite values of q, there are a variety of main terms that may be intermediate in size between
$q^{1/2}$
and q. However, the five authors [CFKRS] have conjectured an asymptotic formula for the second moment in this family, but only after restriction to the family of solely the primitive characters modulo q. Conrey [C] verified the five-author conjecture for the second moment in this family. In the same paper, Conrey also derived a beautiful reciprocity formula for the twisted second moment with prime modulus.
To state our main results, we need some notation. Suppose
$\psi $
is a Dirichlet character modulo q and d is a positive integer with
$d|q$
and
$q|d^2$
. It is easy to see that, as a function of x,
$\psi (1+dx)$
is an additive character with period
$q/d$
. Hence, there exists an integer
$a_{\psi }\ \pmod {q/d}$
such that
$\psi (1+dx) = e(\frac {a_\psi dx}{q})$
, where
$e(x)=e^{2\pi ix}$
. The primitivity of
$\psi $
modulo q means that
$(a_{\psi }, q/d) = 1$
(for which see Lemma 2.14 below). We assume

Let
$\nu _p(\cdot )$
denote the p-adic valuation. For positive integers a and b, we will write ‘
$a\prec b$
’ to mean that a and b share all of the same prime factors and, for any prime
$p|b$
,
$\nu _p(a)<\nu _p(b)$
. Similarly, we will write ‘
$a\preceq b$
’ to mean that a and b share all of the same prime factors and, for any prime
$p|b$
,
$\nu _p(a)\leq \nu _p(b)$
.
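The two relations can be made concrete with a small script. This is a minimal sketch, assuming trial-division factorization is adequate for the moduli being tested; the function names `factorize`, `prec` and `preceq` are illustrative and do not come from the paper.

```python
def factorize(n):
    """Return {prime: exponent} for a positive integer n, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def preceq(a, b):
    """a 'preceq' b: same prime factors, and nu_p(a) <= nu_p(b) for every p | b."""
    fa, fb = factorize(a), factorize(b)
    return fa.keys() == fb.keys() and all(fa[p] <= fb[p] for p in fb)


def prec(a, b):
    """a 'prec' b: same prime factors, and nu_p(a) < nu_p(b) for every p | b."""
    fa, fb = factorize(a), factorize(b)
    return fa.keys() == fb.keys() and all(fa[p] < fb[p] for p in fb)


# Hypothetical moduli satisfying the chain d prec q preceq d^2.
d, q = 15, 225
chain_holds = prec(d, q) and preceq(q, d * d)
```

With these sample values, `chain_holds` is true, whereas the pair d = 15, q = 45 satisfies only the weak relation, because the exponent of 5 does not increase.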
Theorem 1.1. Let
$\psi $
be even of conductor q, and suppose
$d \prec q \preceq d^2$
. Then

where with
$\theta (q) = \sum _{p|q} \frac {\log {p}}{p-1}$
, and
$\sigma _\alpha (n) = \sum _{k|n} k^\alpha $
, we define

and

Remarks. The term
${\mathcal {D}}$
is the standard diagonal main term, which is analogous to the main term in (1.3). An application of the recipe of [CFKRS] would predict

where notably the term
${\mathcal {A}}$
is not present. A bold interpretation of [CFKRS] might suggest the error term in (1.9) would be of size
$O(d^{1/2} q^{\varepsilon })$
, but a more cautious value would be
$O(q^{1/2+\varepsilon })$
. If
$a_{\psi } = 1$
, say, then
${\mathcal {A}} = \frac {\varphi (d)}{d} \sqrt {q}$
, which is essentially its maximal size. This is consistent with an error term of size
$O(q^{1/2+\varepsilon })$
in (1.9), but inconsistent with
$O(d^{1/2} q^{\varepsilon })$
. However, note that
${\mathcal {A}} \ll \sqrt {q}$
which is smaller than
${\mathcal {D}}$
, since
$\sqrt {q} \leq d$
, and
${\mathcal {D}} \gg d (\log q)^{1-\varepsilon }$
.
The existence of
${\mathcal {A}}$
contrasts with the absence of a main term of size
$T^{1/2}$
in (1.3). The condition that
$d \prec q$
implies that
$\chi \cdot \psi $
has conductor q for all
$\chi\ \pmod {d}$
, and so this lower-order main term
${\mathcal {A}}$
is not due to characters of smaller modulus (as in (1.4)).
As a very rough heuristic, we can indicate how
${\mathcal {A}}$
arises. Applying an approximate functional equation and the orthogonality formula, we encounter a sum of the form

With
$m= n + d l$
, we have
$\psi (m) \overline {\psi }(n) = \psi (1 + d l \overline {n}) = e_{q/d}(a_{\psi } l \overline {n})$
. Consider the term
$n=1$
. If
$a_{\psi } l$
is a bit smaller than
$q/d$
, then the exponential is approximately
$1$
. Once
$|a_{\psi }| l$
is larger than
$q/d$
, then the exponential has cancellation and should not contribute to a main term. This thought process leads to

One can apply the same sort of reasoning for each n dividing
$a_{\psi }$
, giving an expression of the same form, in line with the presence of
$\sigma _0(|a_{\psi }|)$
in
${\mathcal {A}}$
. Although this heuristic is far from rigorous, it is true that
${\mathcal {A}}$
emerges in the analysis from the most unbalanced range of summation where n is very small and
$m=n+dl$
is very large.
For the next theorem, we will have that
$d|\frac {q}{d}$
,
$q | d^3$
, and
$(q,3) = 1$
, in which case we define
$a_{\psi }\ \pmod {q/d}$
by the condition that
$\psi (1+dx) = e_{q/d}(a_{\psi } ( x - \overline {2}dx^2))$
for all
$x \in \mathbb Z$
. For more details on this definition, see Section 2.2. Note this definition reduces to the earlier one when
$d \equiv 0\ \pmod {q/d}$
since then the quadratic term may be discarded. In addition, let
$b_\psi $
be the reduction of
$a_\psi \ (\mathrm {mod}\ d)$
such that
$0<|b_\psi |<\frac {d}{2}$
.
Theorem 1.2. Suppose
$\psi $
is even of conductor q, with
$(q,3) = 1$
and suppose
$d^2 \preceq q \preceq d^3$
. Then

where
${\mathcal {D}}$
is as defined in (1.7), and with
$a_{\psi } \overline {a_{\psi }} \equiv 1\ \pmod {q}$
, we set

Remarks. First, observe that the diagonal term
${\mathcal {D}}$
is larger than the error term provided
$d \gg q^{2/5+\varepsilon }$
, which goes below the
$\sqrt {q}$
threshold. Secondly, although
${\mathcal {A}}'$
can be negative,
$|{\mathcal {A}}'|$
is smaller than
${\mathcal {D}}$
, since
${\mathcal {D}} \gg \varphi (d) (\log q)^{1-\varepsilon }$
.
The presence of trigonometric functions at complicated arguments in
${\mathcal {A}}'$
is a new feature compared to
${\mathcal {A}}$
in Theorem 1.1, and worthy of further discussion. One simple observation is that if
$|a_{\psi }| < d/2$
, then
$b_{\psi } = a_{\psi }$
, and it simplifies as
$\cos (0) = 1$
or
$\sin (0) = 0$
. However, there exist situations with
$q \equiv 3\ \pmod {4}$
where
${\mathcal {A}}'$
cannot be discarded. For example, let
$q = p^{239}$
and
$d = p^{116}$
, and suppose
$p \rightarrow \infty $
. Here,
$d> q^{2/5}$
, so the error term is smaller than the diagonal term
${\mathcal {D}}$
. Now suppose
$\psi $
is a character with
$a_{\psi } = 1 + 2p^{116}$
, which is a valid range since we need
$|a_{\psi }| < \frac 12 p^{123}$
. Then
$b_{\psi } =1$
, and so
$a_{\psi } - b_{\psi } = 2p^{116}$
. Under these conditions, we have

Therefore,

If
$p \equiv 3\ \pmod {4}$
, then
$|{\mathcal {A}}'| \approx d p^{-7} = p^{109}$
. This is larger than the error term which has size
$d^{-1/4} q^{1/2} = p^{-116/4 + 119.5} \leq p^{91}$
. One can construct other examples exhibiting other types of behavior for
${\mathcal {A}}'$
. Compared to the discussion around (1.10), it seems harder to heuristically see the shape of
${\mathcal {A}}'$
. However, we stress that in the proof, it arises in the same way as
${\mathcal {A}}$
, with the main differences coming from requiring a quadratic approximation for
$\psi (1+d l \overline {n})$
. The relevant sums can be evaluated in closed form using quadratic Gauss sums.
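The bookkeeping in the example above can be checked mechanically. The following is a minimal sketch: the exponents 239, 116 and the factor p^{-7} are taken from the text, while the choice p = 3 and the helper names are assumptions made only for illustration.

```python
from fractions import Fraction


def symmetric_residue(a, d):
    """Representative of a modulo an odd d lying in (-d/2, d/2)."""
    r = a % d
    return r - d if r > d // 2 else r


def example_exponents(q_exp=239, d_exp=116):
    """Exponents of p for the quantities compared in the example."""
    q_exp, d_exp = Fraction(q_exp), Fraction(d_exp)
    return {
        "d": d_exp,
        "q^(2/5)": Fraction(2, 5) * q_exp,
        "error d^(-1/4) q^(1/2)": -d_exp / 4 + q_exp / 2,
        "d p^(-7)": d_exp - 7,  # size of |A'| quoted in the text
    }


p = 3  # hypothetical odd prime, used only for the exact integer check
q, d = p ** 239, p ** 116
a_psi = 1 + 2 * d
b_psi = symmetric_residue(a_psi, d)
ex = example_exponents()
```

Here `b_psi` is the reduction of `a_psi` modulo d, and the dictionary `ex` lists the exponents of p, so that the comparison between the size of the secondary term and the error term is a comparison of rational numbers.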
Theorem 1.2 can be extended with little effort to q with
$3|q$
(with a slightly more restrictive assumption
$q\preceq \frac {1}{3}d^3$
). One could also attempt to find a common/hybrid generalization of Theorems 1.1 and 1.2 by only requiring
$d \prec q$
and
$q \preceq d^3$
. To do so, one could factor
$d = d_1 d_2$
where
$d_1$
(
$d_2$
, resp.) consists of the part of d where
$\nu _p(d_1) \geq \nu _p(q)/2$
(
$\nu _p(d_2) < \nu _p(q)/2$
, resp.), and use the ideas of proof of Theorem 1.1 for
$d_1$
(Theorem 1.2 for
$d_2$
, resp.). Our separation of Theorems 1.1 and 1.2 is intended to simplify the exposition.
It is well known that for many families of L-functions, it may be easier to obtain an upper bound in place of an asymptotic formula. We have the following upper bound, which is sharp for
$d \gg q^{1/3+\varepsilon }$
.
Theorem 1.3. Let
$\psi $
have conductor q, and let
$d|q$
. Then

Special cases of this result appear in both [N] and [MW]. One nice consequence of Theorem 1.3 is that if q has a factor d with
$d = q^{1/3 + o(1)}$
, then this second moment is strong enough to recover the Weyl bound
$|L(1/2, \psi )| \ll q^{1/6+\varepsilon }$
. For moduli q with
$d|q$
, Heath-Brown [HB1, Theorem 2] proved a bound for an individual L-function which essentially matches (1.13). Indeed, our proof of Theorem 1.3 relies on Heath-Brown’s work, and in retrospect, the method of Heath-Brown is implicitly bounding a second moment along a coset. Of course, the second moment bound in (1.13) implies an individual bound, but it contains more information.
2 Preliminaries
In this section, we will lay the groundwork and develop the tools necessary to prove the main theorems.
2.1 Various bounds & evaluations
First, we define the sum studied by Heath-Brown in [HB1] and cite his associated bound.
Definition 2.1. Let
$\chi $
have conductor q, and let h and n be integers. Denote

Lemma 2.2 (Heath-Brown, [HB1, Lemma 9]).
Suppose that q is odd,
$q_0 | q$
, and
$\varepsilon> 0$
. Then

and

Remark 2.3. Technically, [HB1, Lemma 9] gives a bound of
$|S(q;\chi ,4hq_0,n)|$
, without the restriction that
$(q,2) = 1$
. However, it can be seen by reading through his proof that the result holds for
$|S(q;\chi ,hq_0,n)|$
(without the
$4$
) with the condition that q is odd. Moreover, Heath-Brown claims the bound for the sum with
$1 \leq h \leq A$
(and
$1 \leq n \leq B$
in (2.2)), but simple symmetry arguments extend the result as claimed above.
We next state the standard evaluation of a quadratic exponential sum.
Lemma 2.4. Let r be a positive odd integer. Let
$A,B$
be integers such that
$(B,r)=1$
. Then

where
$e_q(x) = e(\frac {x}{q})$
, the bar notation indicates the multiplicative inverse modulo r,
$\left (\frac {B}{r}\right )$
is the Jacobi symbol, and
$\varepsilon _r = \begin {cases} 1, & r\equiv 1\ (\mathrm {mod}\ 4)\\ i, & r\equiv 3\ (\mathrm {mod}\ 4). \end {cases}$
Proof. This follows by completing the square and applying (3.38) from [IK].
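A quick numerical sanity check of this type of evaluation is sketched below. It assumes that the identity in Lemma 2.4 is the classical one obtained by completing the square, namely that the sum of $e_r(Au + Bu^2)$ over u modulo r equals $e_r(-\overline{4B}A^2) \left(\frac{B}{r}\right) \varepsilon_r \sqrt{r}$; the displayed formula is not reproduced above, so this exact shape is an assumption, and the function names are illustrative.

```python
import cmath
from math import sqrt


def e(x):
    """The additive character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * cmath.pi * x)


def jacobi(a, n):
    """Jacobi symbol (a/n) for a positive odd integer n."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a  # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def quadratic_sum(A, B, r):
    """Direct evaluation of the sum of e_r(A u + B u^2) over u modulo r."""
    return sum(e((A * u + B * u * u) / r) for u in range(r))


def closed_form(A, B, r):
    """Assumed closed form; requires r odd and gcd(B, r) = 1."""
    inv_4B = pow((4 * B) % r, -1, r)  # inverse of 4B modulo r
    eps = 1 if r % 4 == 1 else 1j
    return e(-inv_4B * A * A / r) * jacobi(B, r) * eps * sqrt(r)


samples = [(3, 0, 1), (7, 2, 3), (9, 2, 5), (15, 4, 7), (21, 5, -2)]
max_gap = max(abs(quadratic_sum(A, B, r) - closed_form(A, B, r))
              for r, A, B in samples)
```

For the sample triples (r, A, B), `max_gap` measures the largest discrepancy between the direct sum and the assumed closed form, and should be at the level of floating-point rounding.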
Another simple lemma to get us warmed up is the following integral evaluation.
Lemma 2.5. Let k be a nonzero integer and suppose
$-\frac {1}{2}<\Re (s)<\frac {1}{2}$
. Then

Proof. This follows from [GR, 17.43.3] after changing variables.
2.2 Postnikov
We will derive a variant of the Postnikov formula that holds for composite moduli, rather than only prime powers. Recall the notation
$a|b^\infty $
, which means that
$a| b^A$
for some
$A> 0$
.
Definition 2.6. For positive odd integers d and q such that
$d|q$
and
$q|d^\infty $
, define the formal power series in the indeterminate x by
$${\mathcal {L}}_q(1+dx) = \sum _{k=1}^{\infty } \frac {(-1)^{k+1}}{k} (dx)^k.$$
The conditions
$d|q$
and
$q|d^\infty $
imply that d and q share all of the same prime factors. We first show some divisibility properties for the coefficients of this formal power series. Let

Note that
$R_q$
is a sub-ring of
${\mathbb {Q}}$
. Reduction modulo q gives rise to the ring homomorphism
${\varphi : R_q \to {\mathbb {Z}}/q{\mathbb {Z}}}$
given by

Lemma 2.7. Let
$d \preceq q$
with q odd. For any
$p|q$
and
$k \geq 1$
, we have
$$\nu _p(k) \leq \nu _p(d^{k-1}). \tag{2.7}$$
More generally, for any
$A \geq 0$
, there exists N such that
$\nu _p(k)\leq \nu _p(d^{k-A})$
for all
$k\geq N$
.
Proof. We have
$\nu _p(k) \leq \frac {\log (k)}{\log (p)} \leq k-1 \leq \nu _p(d^{k-1})$
, where the last inequality follows from the fact that d and q share all of the same prime factors, so
$\nu _p(d) \geq 1$
. Now, (2.7) follows. In the more general case,
$\nu _p(d^{k-A})\geq k-A$
and
$\nu _p(k)=O(\log (k))$
, so there exists a large enough choice of N such that
$\nu _p(k) \leq \nu _p(d^{k-A})$
for all
$k\geq N$
.
Remark 2.8. While Lemma 2.7 only guarantees the existence of a positive integer N, the method of proof shows that a constructive candidate is the minimal positive integer M such that
$k-A\geq \log (k)$
for all
$k\geq M$
. This minimal M can be easily found for any particular A, and an example of future relevance is that
$M=5$
when
$A=3$
.
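The minimal M can be found by a direct search. This is a small sketch, using the fact that k - A - log(k) is nondecreasing for k ≥ 1, so the first k satisfying the inequality works for all larger k; the function name is illustrative.

```python
import math


def minimal_M(A):
    """Smallest positive integer M with k - A >= log(k) for all k >= M."""
    k = 1
    # k - A - log(k) is nondecreasing in k >= 1, so the first success suffices.
    while k - A < math.log(k):
        k += 1
    return k


M_for_A3 = minimal_M(3)
```

For A = 3, the inequality fails at k = 4 (since log 4 > 1) and holds at k = 5, in agreement with the value stated in the remark.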
Lemma 2.9. We have
${\mathcal {L}}_q(1+dx) \in R_q[[x]]$
. In fact, all the coefficients of
${\mathcal {L}}_q(1+dx)$
are multiples of d.
Proof. With
$c_k=(-1)^{k+1}d^k/k$
, we will show
$\nu _p(c_k) \geq \nu _p(d)$
for all
$p|q$
. Using (2.7) implies that
$0 \leq \nu _p(\frac {d^{k-1}}{k})$
, so
$\nu _p(c_k) = \nu _p(d) + \nu _p(\frac {d^{k-1}}{k}) \geq \nu _p(d)$
, for any k.
Since
${\mathcal {L}}_q(1+dx)$
lives in
$R_q[[x]] \subseteq {\mathbb {Q}}[[x]]$
, given
$\varphi $
from (2.6), there is an induced ring homomorphism
$\overline \varphi : R_q[[x]] \to \left ({\mathbb {Z}}/q{\mathbb {Z}}\right )[[x]]$
which maps

By abuse of notation, we view
${\mathcal {L}}_q(1+dx)$
as an element of
$\left ({\mathbb {Z}}/q{\mathbb {Z}}\right )[[x]]$
via (2.8).
Lemma 2.10. We have
${\mathcal {L}}_q(1+dx)\in ({\mathbb {Z}}/q{\mathbb {Z}})[x]$
. Moreover,
$\mathcal {L}_q(1+dx) \in d (\mathbb Z/q\mathbb Z)[x]$
.
Proof. The content of this lemma is that for sufficiently large k,
$\nu _p(c_k) \geq \nu _p(q)$
for
$c_k = d^k/k$
. Say
$q | d^A$
. By Lemma 2.7, there exists a positive integer N such that
$\nu _{p}(k)\leq \nu _{p}(d^{k-A})$
for all
$k\geq N$
. Then
$\nu _p(c_k) = \nu _p(d^A) + \nu _p(\frac {d^{k-A}}{k}) \geq \nu _p(q)$
for
$k \geq N$
. Here, N may depend on p, but taking the maximum of these N’s over the primes p dividing q gives a uniform value.
This opens the door to discussing various properties of
${\mathcal {L}}_q(1+dx)$
modulo q, such as the following periodicity and additivity properties. These lemmas will require two indeterminates and so we will embed
$({\mathbb {Z}}/q{\mathbb {Z}})[x]$
into
$({\mathbb {Z}}/q{\mathbb {Z}})[x,y]$
in the obvious way.
Lemma 2.11. We have
${\mathcal {L}}_q(1+dx)={\mathcal {L}}_q\left (1+d(x+\frac {q}{d}y)\right )$
in
$({\mathbb {Z}}/q{\mathbb {Z}})[x,y]$
.
Proof. We have
${\mathcal {L}}_q(1 + d(x+\frac {q}{d} y)) = \sum _{k} c_k (x + \frac {q}{d} y)^k \equiv \sum _k c_k x^k\ \pmod {q}$
, using
$c_k \equiv 0\ \pmod {d}$
from Lemma 2.10.
Lemma 2.12. We have
${\mathcal {L}}_q((1+dx)(1+dy)) = {\mathcal {L}}_q(1+dx) + {\mathcal {L}}_q(1+dy)$
in
$({\mathbb {Z}}/q{\mathbb {Z}})[x,y]$
.
Proof. We give a brief sketch here, and refer to [K, pp. 79-80] for more details. For the real logarithm, we have
$\log ((1+dx)(1+dy)) = \log (1+dx) + \log (1+dy)$
, for
$dx> -1$
,
$dy> -1$
. Hence, the power series expansions of these two expressions are equal wherever they both converge, meaning that all of their corresponding coefficients are equal. Thus, since
${\mathcal {L}}_q(1+dx)$
matches the power series expansion of the real logarithm (reduced modulo q via (2.8)), this property also holds for
${\mathcal {L}}_q(1+dx)$
.
Lemma 2.13. Let q and d be positive odd integers with
$d|q$
.
1. If $q|d^\infty $, then ${\mathcal {L}}_q(1+dx) \equiv dx \ (\mathrm {mod}\ (q,d^2))$.
2. If $(q,3)=1$ and $q|d^3$, then ${\mathcal {L}}_q(1+dx) \equiv dx-\overline {2}(dx)^2 \ (\mathrm {mod}\ q)$.
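Proof. Expanding the power series, we have
$${\mathcal {L}}_q(1+dx) = dx - \frac {(dx)^2}{2} + d^3 \Big ( \frac {x^3}{3} - \frac {d x^4}{4} + \frac {d^2 x^5}{5} - \cdots \Big ).$$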
The claim that the tail of this series is still a polynomial with coefficients in
${\mathbb {Z}}/q{\mathbb {Z}}$
after factoring out
$d^3$
follows from Lemma 2.7, or more specifically the observations in Remark 2.8. Elaborating on the details, since
$k-3\geq \log (k)$
for all
$k\geq 5$
, we may choose
$N=5$
from Lemma 2.7. However, since q is odd,
$\frac {1}{4}=\overline {2}^2$
in
${\mathbb {Z}}/q{\mathbb {Z}}$
, so we could also pick
$N=4$
. Now the result follows in each case.
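The reduction of the coefficients modulo q can be computed exactly. The sketch below assumes the coefficients are $c_k = (-1)^{k+1} d^k/k$ as in the proof of Lemma 2.9, and reduces a fraction a/b with (b, q) = 1 to $a \overline{b}$ modulo q; the function names and sample moduli are illustrative.

```python
from fractions import Fraction


def reduce_mod_q(frac, q):
    """Image in Z/qZ of a fraction whose reduced denominator is coprime to q."""
    # pow(..., -1, q) raises ValueError if the denominator is not invertible.
    return frac.numerator * pow(frac.denominator, -1, q) % q


def log_coefficients(d, q, kmax):
    """Coefficients c_1, ..., c_kmax of L_q(1 + dx), reduced modulo q."""
    return [reduce_mod_q(Fraction((-1) ** (k + 1) * d ** k, k), q)
            for k in range(1, kmax + 1)]


# Case (q, 3) = 1 and q | d^3: only the linear and quadratic terms survive.
coeffs = log_coefficients(5, 125, 30)

# Case 3 | q: the cubic coefficient d^3 / 3 need not vanish modulo q.
coeffs_3 = log_coefficients(3, 27, 3)
```

With d = 5 and q = 125, every coefficient of index at least 3 vanishes, while for d = 3 and q = 27 the cubic coefficient is 9, which illustrates why the hypothesis (q, 3) = 1 appears in the second part of the lemma.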
We are now ready for our version of the Postnikov formula.
Lemma 2.14 (Postnikov formula).
Let q and d be odd with
$d|q$
and
$q|d^\infty $
. There exists a unique group homomorphism
$a: \widehat {(\mathbb Z/q \mathbb Z)^{*}} \rightarrow \mathbb Z/(q/d) \mathbb Z$
,
$\psi \mapsto a_{\psi }$
, such that a Postnikov-type formula holds: for each Dirichlet character
$\psi $
modulo q and
$x\in {\mathbb {Z}}$
, we have
$$\psi (1+dx) = e_q\left (a_{\psi } {\mathcal {L}}_q(1+dx)\right ). \tag{2.9}$$
Finally, if
$\psi $
is primitive modulo q, then
$(a_{\psi }, q/d) = 1$
.
Proof. Consider the reduction modulo d map
$({\mathbb {Z}}/q{\mathbb {Z}})^* \to ({\mathbb {Z}}/d{\mathbb {Z}})^*$
, and denote its kernel by K. Since d and q share their prime factors,
$K=\left \{1+dx : x\ (\mathrm {mod}\ \frac {q}{d})\right \}$
, so
$|K|=\frac {q}{d}$
. Consider the map
$f:K\to S^1$
defined by
$$f(1+dx) = e_q\left ({\mathcal {L}}_q(1+dx)\right ).$$
The function f is well defined by Lemmas 2.10 and 2.11. Furthermore, f is a group homomorphism by Lemma 2.12. Thus, f is a character on K, and we claim that f has order
$\frac {q}{d}$
in
$\widehat {K}$
. Indeed, if p is a prime such that
$p|\frac {q}{d}$
, then we have
$f(1+dx)^{q/dp}=e_q\left (\frac {q}{dp}{\mathcal {L}}_q(1+dx)\right )=e_{dp}\left ({\mathcal {L}}_q(1+dx)\right )=e_{dp}(dx)=e(\frac {x}{p})$
, by Lemma 2.13 since
$dp|(q,d^2)$
. Hence,
$\widehat {K}$
is cyclic and generated by f. Therefore, every element of
$\widehat {K}$
is of the form
$f^a$
for some
$a \ (\mathrm {mod}\ \frac {q}{d})$
. In particular,
$\psi $
is a character on
$({\mathbb {Z}}/q{\mathbb {Z}})^*$
, so restricting
$\psi $
to K makes it an element of
$\widehat {K}$
. Thus, there exists a unique
$a_\psi \ (\mathrm {mod}\ \frac {q}{d})$
such that
$\psi |_K = f^{a_\psi }$
, which is equivalent to the Postnikov formula.
For the final statement, suppose that p is a prime with
$p|\frac {q}{d}$
and
$p|a_{\psi }$
; we will show that
$\psi $
is periodic modulo
$q/p$
. Since d and q share all the same prime factors, then
$p^2|q$
, and hence,
$q | (q/p)^{\infty }$
. Applying (2.9) with d replaced by
$\frac {q}{p}$
, we obtain that
$\psi (1+\frac {q}{p} y ) = e_q(a_{\psi } \frac {q}{p} y)$
. Hence, if
$p|a_{\psi }$
then
$\psi $
is periodic modulo
$q/p$
.
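In the simplest case q = p^2 and d = p, Lemma 2.13(1) gives ${\mathcal {L}}_q(1+dx) \equiv dx$ modulo q, so the formula reduces to $\psi (1+dx) = e_q(a_{\psi } dx)$, as in the introduction. The following sketch verifies this numerically by building all characters modulo p^2 from a generator of the unit group; the indexing of characters by j and all function names are assumptions made for illustration.

```python
import cmath
from math import gcd


def e(x):
    """The additive character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * cmath.pi * x)


def discrete_logs(q):
    """Return (phi, ind), with ind[n] the index of n for some generator of (Z/qZ)^*."""
    units = [n for n in range(1, q) if gcd(n, q) == 1]
    phi = len(units)
    for g in units:
        ind, power = {}, 1
        for k in range(phi):
            if power in ind:
                break
            ind[power] = k
            power = power * g % q
        if len(ind) == phi:
            return phi, ind
    raise ValueError("unit group is not cyclic")


def postnikov_parameter(p, j):
    """Find a mod p with psi_j(1 + p x) = e(a x / p), where q = p^2 and d = p."""
    q, d = p * p, p
    phi, ind = discrete_logs(q)
    psi = lambda n: e(j * ind[n % q] / phi)  # psi_j(g) = e(j / phi)
    for a in range(q // d):
        if all(abs(psi(1 + d * x) - e(a * d * x / q)) < 1e-9
               for x in range(q // d)):
            return a
    return None


params = [postnikov_parameter(5, j) for j in range(20)]
```

For p = 5, each of the 20 characters has a parameter, the map j ↦ a is additive modulo 5, and the parameter vanishes exactly when the character is trivial on the kernel K, that is, when it is imprimitive.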
2.3 Miscellaneous lemmas
In the course of this paper, we encounter sums of the form

We may evaluate this explicitly in many cases. The conditions relevant for Theorem 1.2 are contained in the following.
Lemma 2.15. Let
$(q,3) =1$
,
$\psi $
have conductor q, and suppose
$d^2|q$
and
$q|d^3$
. Also, let
$k \in \mathbb Z$
. If
$k \not \equiv -a_{\psi } \ (\mathrm {mod}\ d)$
, then
${\mathcal {S}}_{q,d}(\psi ,k) = 0$
. If
$k \equiv - a_{\psi } \ (\mathrm {mod}\ d)$
, then

Proof. By Lemma 2.14, we have

By Lemma 2.13(2), this simplifies as

Since
$q/d^2$
is an integer by hypothesis, we may change variables
$u \rightarrow u + q/d^2$
to see that the sum equals itself times a constant. If
$k \not \equiv -a_{\psi } \ (\mathrm {mod}\ d)$
, then this constant is not
$1$
, and so the sum vanishes unless
$k \equiv -a_\psi \ (\mathrm {mod}\ d)$
. We continue under this assumption, which means that the summand is actually periodic modulo
$q/d^2$
, and hence,

Lemma 2.4, along with some simplification, concludes the proof.
We can also easily calculate
${\mathcal {S}}_{q,d}$
under the conditions relevant for Theorem 1.1, as follows.
Lemma 2.16. Suppose
$\psi $
has conductor q,
$d|q$
and
$q|d^2$
. Let
$k \in \mathbb Z$
. Then

The proof of Lemma 2.16 is similar to, but much easier than, Lemma 2.15, since in this case, we may discard the quadratic terms in
${\mathcal {L}}_q(1+du)$
. We omit the details.
Lemma 2.17 (Iwaniec-Kowalski, [IK, Theorem 5.3]).
Let
$\chi $
be a primitive even character modulo q. Suppose
$G(s)$
is holomorphic and bounded in any strip
$-A \leq \Re (s) \leq A$
, even, and normalized by
$G(0) = 1$
. Then

where
$V(x)$
is a smooth function defined by

and

The function
$V(x)$
has rapid decay, meaning
$V(x) \ll _A (1+x)^{-A}$
for arbitrarily large
$A> 0$
.
This next lemma encompasses the opening moves to prove Theorems 1.1 and 1.2.
Lemma 2.18. Let
$\psi $
be even and have conductor q, and suppose
$d\prec q$
. Then

Proof. Using
$\overline {L(1/2,\chi )} = L(1/2,\overline {\chi })$
, then

The character
$\chi \cdot \psi $
is primitive modulo q because
$\psi $
is primitive modulo q and
$d\prec q$
. Also,
$\chi \cdot \psi $
is even. Thus, Lemma 2.17 implies

Using
$\frac {(1 + \chi (-1))}{2} $
to detect
$\chi $
even followed by orthogonality of characters secures (2.13).
Lemma 2.19. Let q be a positive integer, and let V be as defined in (2.11). Then

Proof. This is a standard contour-shifting argument, with a tedious residue calculation. See, for instance, [IS, Lemma 4.1] for details.
3 Proof of Theorem 1.2
The focus of this section is to prove Theorem 1.2.
3.1 Diagonal term
Since
$d^2\preceq q$
implies that
$d\prec q$
, we may apply Lemma 2.18 to give (2.13). We decompose
${\mathcal {M}}$
into three terms:
$ {\mathcal {M}} = {\mathcal {M}}_{m=n} + {\mathcal {M}}_{m>n} + {\mathcal {M}}_{m<n}$
, where


and where
${\mathcal {M}}_{m<n}$
is given by a similar formula to (3.2) but with
$1\leq m<n$
.
Lemma 3.1. Let
$(q,2) = 1$
and suppose
$d\preceq q$
. Then

Proof. Observe that we cannot simultaneously have
$n \equiv -n \ (\mathrm {mod}\ d)$
and
$(n,q)=1$
since d is odd. Applying Lemma 2.19 concludes the proof.
3.2 Remaining setup
Note that by symmetry

For this reason, we largely focus on the terms with
$m>n$
. Applying a dyadic partition of unity to (3.2) results in
${\mathcal {M}}_{m>n} = \sum _\pm {\mathcal {M}}_{m>n}^{\pm }$
, with

Here,

with
$W^{(j,k)}(m,n) \ll _{j,k} M^{-j}N^{-k}$
and
$\operatorname {\mathrm {supp}}(W_{M,N}(m,n)) \subseteq [M,2M]\times [N,2N]$
. By the rapid decay of V, we may assume that

Consider
${\mathcal {M}}^{\pm }_{m>n}$
. Say
$m= \pm n+dl$
with
$l \geq 1$
, and define

so that

3.3 Off-diagonal via Poisson in l
We will eventually split the dyadic summations of
${\mathcal {M}}^{\pm }_{m>n}$
into two ranges depending on whether M and N are nearby or far apart. In this section, we develop a method that will be most useful when M and N are far apart (or ‘unbalanced’).
We will apply Poisson summation to the sum over l in (3.6). First, we observe some properties of the function

namely, that
$W^{\pm }_{n,d}(l)$
is supported on

and

Let
$\widehat {W}(x) = \int _{-\infty }^{\infty } W(y) e(-xy) dy$
. We record some properties of
$\widehat W$
for future use.
Proposition 3.2. Let
$A> 0$
, and suppose
$W(x)$
is a smooth function with support in an interval of length A that satisfies
$W^{(j)}(x) \ll _j A^{-j}$
. Then for any
$C> 0$
, we have

These properties follow directly from integration by parts, so the proof is omitted. Applying this to
$W_{n,d}^{\pm }$
, we have

Lemma 3.3. Suppose
$(q, 3)=1$
,
$d^2\preceq q \preceq d^3$
and that
$M> 2^{10} N$
. Then

where

Proof. The condition that
$M> 2^{10} N$
ensures that
$W_{n,d}^{\pm }(l)$
is supported on
$l \asymp \frac {M}{d}$
(recall (3.9)). This is convenient because, in particular, we can extend the sum over
$l \geq 1$
to all
$l \in \mathbb Z$
without altering the sum. Applying Poisson summation with respect to l in (3.6) gives

Replacing u by
$\pm nu$
and j by
$\pm j$
and recalling
$\psi $
is even gives

Hence, Lemma 2.15 yields

where the condition
$(n,q)=1$
is now accounted for by
$jn\equiv -a_\psi \ (\mathrm {mod}\ d)$
, recalling that
$(a_{\psi }, q/d) = 1$
and that d,
$q/d$
, and q all share the same prime factors. Since
$a_\psi \in {\mathbb {Z}}/(q/d){\mathbb {Z}}$
and
$d|\frac {q}{d}$
, let
$a_\psi \equiv b_\psi \ (\mathrm {mod}\ d)$
for
$b_\psi \in {\mathbb {Z}}/d{\mathbb {Z}}$
such that

The contribution from
$jn=-b_\psi $
gives
${\mathcal {A}}^{\pm }_{m>n}(M,N)$
, while we will use
$E.T.$
to denote the remaining terms. Then

Recalling (3.11) shows this sum may be truncated at
$|j| \ll \frac {q^{1+\varepsilon }}{M}$
, with a negligible error. We also have that
$n \asymp N$
. Therefore, letting
$k=jn$
, then the non-negligible contribution comes from
$|k| \ll \frac {Nq^{1+\varepsilon }}{M}$
, and for each k, the number of ways to factor
$k=jn \neq 0$
is
$O(q^{\varepsilon })$
. Hence,

In this estimation, we have used that
$|k| \geq d/2$
following from
$k \equiv - b_{\psi } \ \pmod {d}$
with
$k \neq -b_{\psi }$
, and using (3.14).
3.4 Off-diagonal via Poisson in n
Returning to (3.6), we now develop a method designed to treat the complementary range where M and N are nearby. Define the function

which is supported on
$n \in [N, 2N]$
and satisfies

Lemma 3.4. Suppose
$d \prec q$
. We have

Proof. We present the details for
${\mathcal {B}}^{+}$
; the case of
${\mathcal {B}}^{-}$
is nearly identical. Applying Poisson summation to (3.6) in the variable n gives

Using (3.17), we may restrict attention to
$|k| \ll \frac {q^{1+\varepsilon }}{N}$
with a negligible error. Hence,

since the variable l satisfies
$l\ll \frac {M}{d}$
by the support of the test functions. The innermost sum can be recognized as
$S(q;\psi ,dl,k)$
from (2.1). Applying Lemma 2.2 gives

The third term above may be dropped since
$\frac {MN}{q} \ll q^{\varepsilon } \frac {M^{1/4}N}{q^{1/4}}$
, which follows from
$\frac {M}{q} \leq \frac {MN}{q} \ll q^{\varepsilon }$
.
Out of a convenience which will become evident in Section 3.5, we would prefer to include
${\mathcal {A}}^{\pm }_{m>n}(M,N)$
in Lemma 3.4, despite the fact that it does not naturally manifest using the methods of this section.
Lemma 3.5. Suppose
$d \prec q$
and
$d \leq q^{2/3}$
. We have

Proof. We treat the
$+$
case, since the
$-$
case is very similar. By the triangle inequality and (3.11), we have

Finally, note that
$q^{-1/2} \leq d^{-3/2} q^{1/2}$
, since
$d \leq q^{2/3}$
, so this bound on
$\mathcal {A}_{m>n}^{+}$
is absorbed by the error term.
3.5 Combining
${\mathcal {M}}^+_{m>n}$
and
${\mathcal {M}}^-_{m>n}$
Recall (3.7). Define

As a first step, we have the following.
Lemma 3.6. We have

Proof. Retracing the definitions, we have

After applying the Fourier transforms and assembling the dyadic partition of unity, we have

Changing variables results in

We next wish to apply the definition of
$V(x)$
from (2.11) and reverse the order of integration. This formally gives

This interchange is a little delicate but can be justified in a few ways. One option is to first truncate the x-integral to a finite interval
$[0, X]$
and let
$X \rightarrow \infty $
after interchanging the integrals. Lemma 2.5 now implies

Applying the definition of
$\gamma (s)$
(found in (2.12)) and rearranging terms produces

Using standard gamma function identities, one can show easily that

Hence,

Using the symmetry between divisors, it is apparent that this integrand is an odd function. Therefore, the integral is half the residue at
$s=0$
, giving the claimed formula for
${\mathcal {W}}$
.
Finally, we deduce the approximation in (3.19). We have
$e_q(-b_{\psi }) = 1 + O(q^{-1} |b_{\psi }|)$
by a Taylor expansion, and recalling
$|b_{\psi }| \ll d$
. Inserting this into the formula for
${\mathcal {W}}$
and using
$\sigma _0(|b_{\psi }|) |b_{\psi }|^{1/2} \ll d^{1/2} q^{\varepsilon }$
completes the proof.
Lemma 3.7. With
${\mathcal {M}}_{m>n}(\psi )$
as defined in (3.2), we have

where

Proof. We begin by splitting the dyadic summations of
${\mathcal {M}}_{m>n}$
into two ranges depending on whether M and N are nearby or far apart. The cutoff for these two ranges is
$M=d^{1/2}N$
. Therefore, starting at (3.7), we write

For the first term, we apply Lemma 3.3, and for the second term, we apply Lemma 3.5. Hence,

Rearranging and evaluating the main term, we obtain

where

Using that we may restrict to
$N \ll M$
and
$MN \ll q^{1+\varepsilon }$
, it is easy to see that the error term simplifies to give

For the purposes of Theorem 1.2, we have
$q \geq d^2$
so that
$d q^{-1/8} \leq d^{-1/4} q^{1/2}$
, so the latter error term can be dropped. The displayed error term is consistent with Theorem 1.2.
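The exponent comparison used here can be spelled out as a routine chain of equivalences, written in the notation above:

```latex
d\,q^{-1/8} \le d^{-1/4} q^{1/2}
\iff d^{5/4} \le q^{5/8}
\iff d^{2} \le q .
```

So the hypothesis $q \geq d^2$ is exactly what is needed for the second error term to be dominated by the first.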
Now, we turn to
${\mathcal {A}}_{m>n}$
, which takes the form

recalling the definition (3.18). Applying Lemma 3.6 shows that

Note
$d^{3/2} q^{-1} \leq d q^{-1/8}$
, so this error term can be dropped.
Next, consider
${\mathcal {M}}_{m<n}$
, for which recall (3.3). Lemma 2.14 implies
$ a_{\overline \psi } \equiv -a_\psi \ (\mathrm {mod}\ q/d)$
, and recalling (1.5), we have
$a_{\overline \psi } = -a_\psi $
. Similarly,
$b_{\overline {\psi }} = - b_{\psi }$
. Hence, we deduce the following:
Lemma 3.8. We have

where

3.6 Combining
${\mathcal {M}}_{m>n}$
and
${\mathcal {M}}_{m<n}$
From Lemmas 3.7 and 3.8, we get that

where

Therefore, if
$q\equiv 1\ (\mathrm {mod}\ 4)$
, then

This is consistent with (1.12) for
$q \equiv 1 \ \pmod {4}$
. If instead
$q\equiv 3\ (\mathrm {mod}\ 4)$
, then

This derivation agrees with (1.12). Combining this with Lemma 3.1 proves Theorem 1.2.
4 Sketching the proofs of remaining theorems
The proof of Theorem 1.1 is similar to the proof of Theorem 1.2, except in some ways which make it simpler. Likewise, the proof of Theorem 1.3 will essentially use a subset of the tools used to prove Theorem 1.2. In order to avoid excessive repetition, we only give sketches of these proofs.
4.1 Sketch of the proof of Theorem 1.1
The structure of the proof of Theorem 1.1 is similar to that of Theorem 1.2. As a substitute for Lemma 3.3, we have the following:
Lemma 4.1. Suppose that
$d\prec q \preceq d^2$
, and that
$M> 2^{10} N$
. Then

where

Proof. We follow through the proof of Lemma 3.3, and note that the earliest difference will occur at (3.13) when Lemma 2.15 was used to evaluate
${\mathcal {S}}_{q,d}(\psi , jn)$
. The condition that
$q\preceq d^2$
means that we need to use Lemma 2.16 in place of Lemma 2.15. In practical terms, this means that in place of (3.13), we instead obtain

In this case, there is no need to introduce
$b_{\psi }$
, since we have
$jn \equiv - a_{\psi } \ \pmod {q/d}$
, and
$a_{\psi }$
is inherently defined modulo
$q/d$
. The term
${\mathcal {A}}$
corresponds to the term
$jn = -a_{\psi }$
, while the error terms are, similarly to (3.15) and (3.16), bounded by

Lemma 3.4 holds without changes, since the only assumption there is
$d \prec q$
. As in Lemma 3.5, we can freely insert the term
$\mathcal {A}_{m>n}^{\pm }(M,N)$
since it is bounded by
$d^{-1} M q^{\varepsilon }$
, which is in turn bounded by
$d^{-3/2} M q^{1/2+\varepsilon }$
. The new cutoff in the proof of Lemma 3.7 is
$Mq^{1/2}=Nd^{3/2}$
, so this error term is absorbed by the error in Lemma 4.1. In place of (3.20), we obtain

In total, this error term is of size
$d^{1/4} q^{1/4+\varepsilon } + d q^{-1/8+\varepsilon }$
. Under the hypotheses of Theorem 1.1, we have
$q \leq d^2$
and hence
$d^{1/4} q^{1/4} \leq d q^{-1/8}$
, so the former term can be discarded. The error term is then seen to be consistent with the statement of Theorem 1.1.
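As in the proof of Theorem 1.2, the comparison of the two error terms is a one-line computation with exponents:

```latex
d^{1/4} q^{1/4} \le d\,q^{-1/8}
\iff q^{3/8} \le d^{3/4}
\iff q \le d^{2} .
```

The roles of the two terms are therefore swapped relative to Theorem 1.2, where $q \geq d^2$ holds instead.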
The assembly of the term
${\mathcal {A}}$
is similar to that of
${\mathcal {A}}'$
, though it is simpler since there is no need to introduce
$b_{\psi }$
, and leads to

This concludes the discussion of the proof of Theorem 1.1.
4.2 A sketch of the proof of Theorem 1.3
Since Theorem 1.3 is an upper bound, we can arrange the second moment as follows:

This uses an approximate functional equation for
$L(1/2, \chi \cdot \psi )$
in place of Lemma 2.17. Applying Cauchy’s inequality to take M to the outside of the square, we obtain

The purpose of this trick is to completely avoid the ranges where M and N are far apart.
Squaring this out and applying orthogonality of characters, we obtain a diagonal term of size
$\ll d q^{\varepsilon }$
. For the off-diagonal terms, we essentially arrive at

using Lemma 3.4 for the final bound. The second error term can be dropped in comparison to the diagonal term. In all, we obtain the bound in Theorem 1.3.