1 Introduction
In this paper, we are interested in how many integers
$\leq N$
are covered by the values taken by the quadratic forms
$x^2 + dy^2$
,
$d \leq \Delta $
. Our main result is the following, which gives a fairly complete answer to this question.
Theorem 1.1 (Main theorem)
Let N be large and write, for some real number
$\alpha $
,
$$\Delta = (\log N)^{\log 2} 2^{\alpha \sqrt {\log \log N}}.$$
Then

where
$\Phi $
is the Gaussian distribution function
$\Phi (\alpha ) = \frac {1}{\sqrt {2\pi }} \int ^{\alpha }_{-\infty } e^{-x^2/2} dx$
.
The problem of covering integers by this family of binary quadratic forms seems to have been first considered in the work of Hanson and Vaughan [12]. Using the circle method, they established that almost all integers
$n\leq N$
may be covered with
$\Delta = \log N (\log \log N)^{3+\varepsilon }$
for any
$\varepsilon>0$
and that a positive proportion of the integers below N may be covered using
$\Delta = \log N \log \log N$
. Diao [7] found a much shorter proof of the latter result, and in his argument d could be restricted to prime values so that a smaller set of forms is used.
Landau established that the number of integers below N that are sums of two squares is
$\sim BN/(\log N)^{1/2}$
for a positive constant B. This was extended by Bernays to show that for any fixed primitive positive definite binary quadratic form f, the number of integers below N that are represented by f is
$\sim B_f N (\log N)^{-1/2}$
, for a positive constant
$B_f$
(which in fact depends only on the discriminant of f). More recently, Blomer [1, 2] and Blomer and Granville [3] have considered in detail the number of integers up to N that are represented by f uniformly in the form f (thus allowing the discriminant to grow with N). These results, taken with the union bound, suggest that if
$\Delta $
is smaller than
$(\log N)^{1/2-\varepsilon }$
, then almost all
$n\leq N$
cannot be covered by the forms
$x^2+dy^2$
with
$d\leq \Delta $
. However, as Theorem 1.1 reveals, the true threshold for
$\Delta $
is neither
$(\log N)^{1/2}$
nor
$\log N$
but instead
$(\log N)^{\log 2}$
.
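For small N the covering question can be explored by brute force. The following Python sketch is purely illustrative and plays no role in the proofs; the function names `covered` and `proportion_covered` are ours, and we assume that y = 0 is allowed, so that perfect squares are always represented. It marks every integer up to N of the form x^2 + dy^2 with 1 ≤ d ≤ Δ.

```python
from math import isqrt

def covered(N, Delta):
    """Return a list c with c[n] True iff n = x^2 + d*y^2 for some
    integers x, y and some 1 <= d <= Delta (for 0 <= n <= N)."""
    c = [False] * (N + 1)
    # y = 0: every perfect square is represented, whatever d is.
    for x in range(isqrt(N) + 1):
        c[x * x] = True
    for d in range(1, Delta + 1):
        y = 1
        while d * y * y <= N:
            base = d * y * y
            for x in range(isqrt(N - base) + 1):
                c[x * x + base] = True
            y += 1
    return c

def proportion_covered(N, Delta):
    """Proportion of 1 <= n <= N that are covered."""
    c = covered(N, Delta)
    return sum(c[1:]) / N
```

Since the transition takes place on the scale of log log N, experiments at accessible values of N can give only a rough impression of the limiting behaviour.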
We shall in fact prove a more precise version of Theorem 1.1, counting the number of integers below N with k prime factors that may be represented as
$x^2+ dy^2$
with
$d\leq \Delta $
. Throughout, let
$\Omega (n)$
denote the number of prime factors of n counted with multiplicity, and define
$${\mathcal A}(N,k) = \{ n \leq N : \Omega (n) = k \}.$$
Recall that most integers below N have about
$\log \log N$
prime factors, a result first established by Hardy and Ramanujan. The well-known work of Erdős and Kac established that
$\Omega (n)$
has a normal distribution with mean
$\sim \log \log N$
and variance
$\sim \log \log N$
, while Selberg’s work [16] gave still more precise results establishing an asymptotic formula for
${\mathcal A}(N,k)$
uniformly in a wide range of k. To reduce the visual complexity of expressions involving double logs later on, it is convenient to set (throughout the paper)
$$k_0 := \log \log N.$$
The following simplified version of Selberg’s result is an immediate consequence of [17, Theorem II.6.5].
Lemma 1.2. Let N be large. Uniformly for integers k in the range
$|k -k_0| \leq \tfrac 12 k_0$
, we have

For a given k in a suitable interval around
$\log \log N$
, we shall show (the ‘upper bound’, Theorem 1.3 below) that almost none of the integers in
${\mathcal A}(N,k)$
are represented by
$x^2+dy^2$
with
$d\leq \Delta $
if
$\Delta $
is a bit smaller than
$2^k$
. This changes when
$\Delta $
becomes a bit larger than
$2^k$
, when almost all the integers in
${\mathcal A}(N,k)$
may be so represented. This is the ‘lower bound’, Theorem 1.4 below. From these results, Theorem 1.1 will follow swiftly.
We turn now to the precise statements.
Theorem 1.3 (Upper bound)
Let N be large, and let k be an integer in the range
$$|k - k_0| \leq k_0^{2/3}. \tag{1.1}$$
Suppose
$\Delta \leq 2^k/k^4$
. The number of integers
$n \in {\mathcal A}(N,k)$
that may be written as
$x^2 + dy^2$
with
$1\leq d\leq \Delta $
is
$\ll N/k_0$
.
An application of Stirling’s formula (see (2.3) below) shows that for k in the range (1.1)

Thus, Theorem 1.3 is really of interest only when
$|k-k_0| \leq (k_0 \log k_0)^{1/2}$
. This range still includes most typical integers below N, and Theorem 1.3 may be used to establish the upper bound for the integers below N of the form
$x^2+dy^2$
with
$d\leq \Delta $
that is implicit in Theorem 1.1. The corresponding lower bound in Theorem 1.1 is implied by the following result.
Theorem 1.4 (Lower bound)
Let N be large, and let k be an integer in the range given in (1.1). Suppose
$\Delta \geq k^3 2^k$
. Let
${\mathcal E}(N,k)$
denote the set of integers in
${\mathcal A}(N,k)$
that cannot be represented as
$x^2+dy^2$
with
$d\leq \Delta $
. Then

Similarly to Theorem 1.3, Theorem 1.4 is really of interest only in the range

but this range includes most typical integers below N. In Section 2, we shall deduce our main result Theorem 1.1 from Theorems 1.3 and 1.4.
Since our main interest is in establishing Theorem 1.1, we have made no effort to optimize the error terms and ranges for k in Theorems 1.3 and 1.4. It would be of interest to establish analogues of these results uniformly in a wide range of k (although when k is large it may be better to work with
$\omega (n)=k$
, where
$\omega (n)$
counts the number of distinct prime factors of n). One case of particular interest may be
$k=1$
: representing primes up to N using the quadratic forms
$x^2 +dy^2$
with
$d\leq \Delta $
. Here, it would be possible to establish that a proportion
$\rho (\Delta )$
of the primes up to N may be so represented with
$\rho (1)=1/2$
(by Fermat’s result on representing primes of the form
$1 \ (\operatorname {mod}\, 4)$
as a sum of two squares),
$\rho (\Delta ) <1$
for all
$\Delta $
, and
$\rho (\Delta ) \to 1$
as
$\Delta \to \infty $
. Determining
$\rho (\Delta )$
precisely, or understanding its precise asymptotic behavior as
$\Delta $
gets large, seems like a challenging and delicate problem.
Let us indicate very briefly the ideas behind Theorems 1.3 and 1.4; here and in the rest of the introduction, we shall be a little informal and also assume that the reader is familiar with the classical theory of binary quadratic forms (which will be recalled in Section 3). Recall that a square-free integer n may be represented by some binary quadratic form of negative discriminant D if and only if
$\chi _D(p)= 1$
for all primes p dividing n (assume that n is coprime to D). If n has k prime factors, then each condition
$\chi _D(p)=1$
has a
$50\%$
chance of occurring so that n may be represented by some binary quadratic form of discriminant D with probability
$2^{-k}$
. This suggests that
$\Delta $
must be about size
$2^k$
in order to have a chance of representing many integers with k prime factors. This is the idea behind Theorem 1.3, and it can be made precise without too much difficulty (see Section 4).
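The 50% heuristic can be checked exactly in a toy model. In the Python sketch below (illustrative only; the function names are ours), a residue a modulo n = p_1 ⋯ p_k stands in for the discriminant D, the p_i are assumed to be distinct odd primes, and we count the residues coprime to n that are quadratic residues modulo every p_i. By the Chinese remainder theorem this count is φ(n)/2^k.

```python
from math import prod

def count_all_residues(primes):
    """Count a mod n = prod(primes) with Legendre symbol (a/p) = 1 for
    every p in primes (distinct odd primes), using Euler's criterion."""
    n = prod(primes)
    count = 0
    for a in range(1, n):
        # pow(a, (p-1)//2, p) is 1 for residues, p-1 for nonresidues, 0 if p | a
        if all(pow(a, (p - 1) // 2, p) == 1 for p in primes):
            count += 1
    return count

def phi_squarefree(primes):
    """Euler's phi function of the square-free number prod(primes)."""
    return prod(p - 1 for p in primes)
```

For instance, with the primes 3, 5, 7 one finds 6 = 48/2^3 admissible residues.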
The more difficult part of our argument is Theorem 1.4, which constitutes the bulk of the paper. If
$\Delta $
is substantially larger than
$2^k$
, then the heuristic that we just mentioned would suggest that for most integers
$n\leq N$
with k prime factors there would be some negative discriminant D with
$|D|\leq \Delta $
such that n is representable by some binary quadratic form of discriminant D, and indeed, there would be a total of about
$2^k$
such representations of n. The number of inequivalent classes of binary quadratic forms of discriminant D is the class number, which is of size
$|D|^{1/2+ o(1)}$
. It is therefore likely that some of the
$2^k$
(which is about
$\Delta $
) representations of n would come from the principal form
$x^2+dy^2$
(corresponding to the discriminant
$D=-4d$
) and indeed that there should be about
$2^k/|D|^{1/2+o(1)}$
representations of n by
$x^2+dy^2$
. We make this heuristic precise by using class group characters and their associated L-functions, together with a second moment method. It would be relatively straightforward to obtain a version of Theorem 1.4 where a positive proportion of the elements in
${\mathcal A}(N,k)$
are represented by the forms
$x^2+dy^2$
with
$d\leq \Delta $
. However, it is more delicate to obtain almost all integers in
${\mathcal A}(N,k)$
, and to achieve this we impose congruence conditions on d for all primes p below a slowly growing parameter W. A key fact is that when discriminants d are restricted to such progressions, the value of
$L(1,\chi _d)$
remains more or less constant. To simplify genus theory considerations, we further restrict attention to prime values of d, but this is merely a matter of convenience.
For k sufficiently close to
$k_0$
, Theorems 1.3 and 1.4 show that the number of represented elements in
${\mathcal A}(N,k)$
undergoes a rapid phase transition as one goes from
$\Delta = 2^k/k^4$
(when
$0\%$
of
${\mathcal A}(N,k)$
is covered) to
$\Delta = 2^k k^3$
(when
$100\%$
of
${\mathcal A}(N,k)$
is covered). While there is some scope to narrow the gap between
$2^k/k^4$
and
$2^k k^3$
, the restriction to prime values of d in our proof of Theorem 1.4 would prevent us from fully closing this gap. It seems likely that a more precise cutoff phenomenon occurs: When
$\Delta = \beta \sqrt {k} 2^k$
, there is a proportion
$p(\beta )$
of integers in
${\mathcal A}(N,k)$
that are represented, with
$0< p(\beta ) <1$
for all
$0 < \beta < \infty $
, and with
$p(\beta ) \to 0$
as
$\beta \to 0$
and
$p(\beta ) \to 1$
as
$\beta \to \infty $
. Possibly our arguments, together with additional ideas taking into account genus theory, could be used to establish part of this cutoff phenomenon, and we hope that an interested reader will take up the challenge.
Our discussion so far has been confined to representing almost all integers below N using the forms
$x^2+dy^2$
with
$d\leq \Delta $
. It is natural to ask what happens if all integers below N are to be represented. Taking
$x= \lfloor \sqrt {n}\rfloor $
and
$y=1$
, we see that
$\Delta = 2\sqrt {N}$
suffices, and going beyond this trivial bound already seems an interesting problem. Since integers below N have
$\ll \log N/\log \log N$
distinct prime factors, extrapolating Theorem 1.4 we may expect that
$\Delta =\exp ( C \log N/\log \log N)$
is sufficient for some constant
$C>0$
. As evidence towards this conjecture, we note that progress can be made in two weaker versions.
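The trivial bound above is easy to verify numerically. The following Python sketch (the function name is ours, and the treatment of perfect squares via y = 0 is an assumption made for the illustration) returns, for each n ≥ 1, a triple (x, y, d) with n = x^2 + dy^2 and 1 ≤ d ≤ 2√n, using x = ⌊√n⌋.

```python
from math import isqrt

def trivial_representation(n):
    """Return (x, y, d) with n = x^2 + d*y^2 and 1 <= d <= 2*sqrt(n), for n >= 1."""
    x = isqrt(n)
    d = n - x * x          # 0 <= d <= 2x because x^2 <= n < (x+1)^2
    if d == 0:
        # n is a perfect square: take y = 0, so any d works (we report d = 1).
        return x, 0, 1
    return x, 1, d
```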
By a simple application of the pigeonhole principle, one can show that every positive integer below N may be represented by some nondegenerate binary quadratic form f with
$|\text {disc}(f)| \leq \exp (C \log N/\log \log N)$
with C being any constant larger than
$\log 4$
. Here, nondegenerate means that the quadratic form does not factor into linear forms or, equivalently, that the discriminant is not a square. In fact, all elements of
${\mathcal A}(N,k)$
can be represented by some nondegenerate binary quadratic form with absolute discriminant
$\ll 4^k$
(for instance, all primes are of the form
$x^2+y^2$
,
$x^2+ 2y^2$
or
$x^2-2y^2$
). The pigeonhole argument does not allow us to restrict attention to positive definite forms (although one can restrict attention to indefinite forms), let alone the smaller family of principal positive definite forms. Assuming GRH for quadratic Dirichlet L-functions it can be shown that all integers below N may be represented by some positive definite binary quadratic form with absolute discriminant below
$\exp (C \log N/\log \log N)$
for any
$C> \log 4$
and indeed that all elements in
${\mathcal A}(N,k)$
can be represented by such forms with absolute discriminant
$\ll 4^k (\log N)^4$
.
In the other direction, we may ask how large
$\Delta $
must necessarily be if all integers
$n\leq N$
are represented as
$x^2 +dy^2$
with
$d\leq \Delta $
. Complementing our discussion above, we can establish here that
$\Delta $
must be at least
$\Delta _0= \exp (c \log N/\log \log N)$
for a positive constant c. In fact, we can establish the stronger result that there exists a square-free integer
$n\leq N$
such that for any fundamental discriminant d with
$1 < |d|\leq \Delta _0$
there exists a prime factor p of n with
$\chi _d(p)=-1$
. Such an integer n cannot be represented by any primitive nondegenerate binary quadratic form with absolute discriminant below
$\Delta _0$
. This result, which may be viewed as a variant of the least quadratic nonresidue problem, follows from an application of log-free zero density estimates; details will be supplied elsewhere.
Lastly, we draw attention to three papers from the literature where related problems concerning the integers represented by a family of binary quadratic forms are considered: Blomer’s work on sums of two squareful numbers [1], the work of Bourgain and Fuchs [4] on Apollonian circle packings and the work of Ghosh and Sarnak [10] on Markoff-type cubic surfaces.
Notation. For the most part notation will be introduced when it is needed. However, we remind the reader that
$k_0$
will always denote
$\log \log N$
. From Section 5 onwards, W will denote a quantity which tends to infinity with N sufficiently slowly; we will take
$W := \log \log \log N$
for definiteness.
Plan of the paper. Section 2 is devoted to the proof that the upper and lower bounds (Theorems 1.3 and 1.4, respectively) imply the main theorem, Theorem 1.1. Section 3 gives some standard background on binary quadratic forms which will be used throughout the rest of the paper. In Section 4, we prove the relatively straightforward upper bound, Theorem 1.3.
The remainder of the paper is devoted to the much more involved proof of the lower bound, Theorem 1.4. First, we formulate a more technical variant of this result, Theorem 5.1. This result allows us to restrict attention to representing integers not divisible by
$4$
, using only quadratic forms
$x^2 + dy^2$
with d ranging over primes in certain congruence classes. The deduction of Theorem 1.4 from Theorem 5.1 is short and is given immediately after the statement of the latter.
The proof of Theorem 5.1 is via the second moment method. We divide the computations that arise into four separate technical propositions, Propositions 5.2, 5.3, 5.4 and 5.5. The synthesis of these propositions to give a proof of Theorem 5.1 is accomplished in Section 6.
The final sections of the main part of the paper are devoted to the proofs of these four technical propositions. Proposition 5.5 is a statement about averages of certain
$L(1,\chi )$
, and we handle it first, in Section 7. The remaining three results all require some background on class group L-functions, and Section 8 provides an overview and references for the necessary material. Finally, the proofs of Propositions 5.2, 5.3 and 5.4 are given in Sections 9, 10 and 11, respectively.
Sections 8, 9 and 10 use Selberg’s techniques [16]. There is no particularly convenient reference for what we require, so we provide full details. The more standard parts of this may be found in Appendix A.
2 The upper and lower bounds imply the main theorem
In this section, we show how Theorem 1.1 follows from Theorems 1.3 and 1.4.
Suppose, as in the statement of Theorem 1.1, that
$\Delta = (\log N)^{\log 2} 2^{\alpha \sqrt {\log \log N}}$
. It is enough to prove the result for

the result for all
$\alpha $
follows from this case and the fact that

Suppose henceforth that (2.1) holds.
Let
$k^-$
be defined as the solution to
$\Delta = k^3 2^k$
and
$k^+$
as the solution to
$\Delta = 2^k/k^4$
. Then, one may check that

In particular, by the assumption (2.1), we see that
$|k^{\pm } - k_0| \leq 2 k_0^{3/5}$
.
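The quantities k^- and k^+ are easy to compute numerically. The Python sketch below is illustrative only (the function names and the bisection brackets are our choices). It uses the identity (log N)^{log 2} = 2^{log log N}, so that log_2 Δ = k_0 + α√k_0, and takes log N rather than N as input so that N itself never has to be formed. It assumes that log_2 Δ > 1.

```python
from math import log, log2, sqrt

def solve_increasing(g, target, lo, hi, iters=200):
    """Bisection for g(k) = target, assuming g is increasing on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def k_minus_plus(log_N, alpha):
    """Return (k0, k_minus, k_plus) where Delta = k_minus^3 * 2^k_minus
    and Delta = 2^k_plus / k_plus^4, with log2(Delta) = k0 + alpha*sqrt(k0)."""
    k0 = log(log_N)
    target = k0 + alpha * sqrt(k0)  # log2(Delta)
    # k + 3*log2(k) is increasing for k > 0
    k_minus = solve_increasing(lambda k: k + 3 * log2(k), target, 1.0, target + 10)
    # k - 4*log2(k) is increasing for k > 4/log(2), so we start the bracket at 6
    k_plus = solve_increasing(lambda k: k - 4 * log2(k), target, 6.0, 10 * target + 100)
    return k0, k_minus, k_plus
```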
For k in the range
$k_0-2 k_0^{3/5} \leq k \leq k^-$
, Theorem 1.4 shows that the number of integers in
${\mathcal A}(N,k)$
that may be represented as
$x^2+dy^2$
with
$d\leq \Delta $
is

Stirling’s formula and the approximation
$1 -x = \exp (-x -\frac {x^2}{2} + O(x^3))$
with
$x = 1 - \frac {k_0}{k}$
(
$ = O(k_0^{-2/5})$
) show in this range of k that

so that using Lemma 1.2, we may see that the quantity in (2.2) is

Summing over all k in this range, we conclude that the number of integers
$n\leq N$
that may be written as
$x^2+dy^2$
with
$d\leq \Delta $
is at least

upon approximating the sum by the corresponding integral. This shows the lower bound implicit in Theorem 1.1.
To obtain the corresponding upper bound, note that for k in the range
$k^+ \leq k \leq k_0+2k_0^{3/5}$
, Theorem 1.3 shows that the number of integers in
${\mathcal A}(N,k)$
that cannot be represented as
$x^2+dy^2$
with
$d\leq \Delta $
is
$|{\mathcal A}(N,k)| + O(N/k_0)$
. Using Lemma 1.2 and Stirling’s formula as above, this is

Summing over all k in this range, we conclude that the number of integers up to N that cannot be represented as
$x^2+dy^2$
with
$d\leq \Delta $
is at least

This implies the upper bound implicit in Theorem 1.1 and completes the proof.
3 Background on quadratic forms
For the theory in the rest of this section, good resources are [5], [6], [13, Chapter 22] or [19].
3.1 Fundamental discriminants and characters
A fundamental discriminant is an integer D of the following type: either (i)
$D \equiv 1 \ (\operatorname {mod}\, 4)$
and square-free or (ii)
$D = 4m$
with
$m \equiv 2,3 \ (\operatorname {mod}\, 4)$
and m square-free. Apart from
$D = 1$
, these are precisely the discriminants of quadratic fields over
$\mathbf {Q}$
, and indeed the discriminant of
$\mathbf {Q}(\sqrt {D})$
is D. Equivalently, if m is square-free, the quadratic field
$\mathbf {Q}(\sqrt {m})$
has discriminant
$4m$
if
$m \equiv 2,3 \ (\operatorname {mod}\, 4)$
and m if
$m \equiv 1\ (\operatorname {mod}\, 4)$
.
Associated to the fundamental discriminant D is the primitive quadratic Dirichlet character
$\chi _{D}(n) = (\frac {D}{n})$
, where the symbol here is the Kronecker symbol. This is defined to be completely multiplicative and specified on the primes by the following:
• If p is an odd prime, $\chi _{D}(p)= (\frac {D}{p})$ is the Legendre symbol;
• $\chi _{D}(2) = 0$ if $D \equiv 0 \ (\operatorname {mod}\, 4)$, $1$ if $D \equiv 1 \ (\operatorname {mod}\, 8)$ and $-1$ if $D \equiv 5\ (\operatorname {mod}\, 8)$;
• $\chi _{D}(-1) = \mbox {sgn}(D)$.
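The three rules above translate directly into a short routine. The following Python sketch is a minimal illustration (the function names `legendre` and `chi` are ours); it assumes that D is a fundamental discriminant and that n is nonzero, and it factors n by trial division, so it is only suitable for small arguments.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    t = pow(a % p, (p - 1) // 2, p)   # t is 0, 1 or p - 1
    return -1 if t == p - 1 else t

def chi(D, n):
    """Kronecker symbol chi_D(n) for a fundamental discriminant D and n != 0."""
    if n == 0:
        raise ValueError("n must be nonzero")
    value = 1
    if n < 0:
        value = 1 if D > 0 else -1    # chi_D(-1) = sgn(D)
        n = -n
    while n % 2 == 0:                 # contribution of the prime 2
        if D % 4 == 0:
            return 0
        value *= 1 if D % 8 == 1 else -1   # here D is 1 or 5 (mod 8)
        n //= 2
    p = 3
    while p * p <= n:                 # odd primes, found by trial division
        while n % p == 0:
            value *= legendre(D, p)
            n //= p
        p += 2
    if n > 1:                         # remaining prime factor
        value *= legendre(D, n)
    return value
```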
The Kronecker symbol
$\chi _{D}$
is a primitive character of modulus
$|D|$
. It describes the splitting type of a prime p in the quadratic field
$K=\mathbf {Q}(\sqrt {D})$
: A prime p splits in
$\mathbf {Q}(\sqrt {D})$
if
$(\frac {D}{p})=1$
, remains inert if
$(\frac {D}{p}) =-1$
and ramifies when
$(\frac {D}{p})=0$
. Thus, the Dedekind zeta-function of the field K is given by
$$\zeta _K(s) = \zeta (s) L(s,\chi _D),$$
and the number of ideals in
$\mathcal {O}_K$
of norm n is
$$(1*\chi _D)(n) = \sum _{a \mid n} \chi _D(a).$$
3.2 Positive definite forms and imaginary quadratic fields
Let
$D < 0$
be a negative fundamental discriminant, and let
$K= \mathbf {Q}(\sqrt {D})$
denote the corresponding imaginary quadratic field.
There is a well-known correspondence (going back to Gauss) between ideal classes in K and equivalence classes of positive definite binary quadratic forms of discriminant D. In particular, principal ideals in K are in correspondence with the principal binary quadratic form given by
$x^2 + \frac {|D|}4 y^2$
(in the case
$D\equiv 0 \ (\operatorname {mod}\, 4)$
) and
$x^2 + xy + \frac {1+|D|}{4} y^2$
(in the case
$D \equiv 1 \ (\operatorname {mod}\, 4)$
so that
$|D|=-D \equiv 3 \ (\operatorname {mod}\, 4)$
).
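A key object of interest for us is
$$R_D(n) := \# \{ {\mathfrak a} \subseteq {\mathcal O}_K : {\mathfrak a} \text { is principal and } N({\mathfrak a}) = n \},$$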
which counts the number of principal ideals in
${\mathcal O}_K$
of norm n. If
$D\equiv 0 \ (\operatorname {mod}\, 4)$
, then a principal ideal
${\mathfrak a}$
of norm n may be written as
$(a+b\sqrt {D/4})$
and corresponds to two representations of n by the principal form
$x^2+ \frac {|D|}{4} y^2$
, namely
$n= (\pm a)^2 + \frac {|D|}{4} (\pm b)^2$
(with the exception of
$D=-4$
, where it corresponds to
$4$
representations by the principal form
$x^2+y^2$
). Similarly, if
$D\equiv 1\ (\operatorname {mod}\, 4)$
, a principal ideal
${\mathfrak a}$
of norm n may be written as
$(a + b \frac {1+\sqrt {D}}{2})$
and corresponds to two representations of n by the principal form
$x^2 +xy + \frac {1+|D|}{4} y^2$
(with the exception of
$D=-3$
, where it corresponds to
$6$
representations by the principal form
$x^2+xy+y^2$
).
We remark that
$$R_D(n) \leq (1*\chi _D)(n), \tag{3.1}$$
since
$(1*\chi _D)(n)$
counts all ideals with norm n, and that each ideal of norm n corresponds to two (or
$4$
when
$D=-4$
, or
$6$
when
$D=-3$
) representations of n by some equivalence class of binary quadratic forms of discriminant D.
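The case D = -4 gives a convenient sanity check. Since the Gaussian integers have class number one, every ideal is principal, and the remark above predicts that the number of representations of n as x^2 + y^2 equals 4(1*χ_{-4})(n). The Python sketch below (illustrative only; the function names are ours) compares the two sides by brute force.

```python
from math import isqrt

def chi_minus4(a):
    """The character chi_{-4}: 0 on even a, +1 if a = 1 (mod 4), -1 if a = 3 (mod 4)."""
    return (0, 1, 0, -1)[a % 4]

def ideal_count(n):
    """(1 * chi_{-4})(n), the number of ideals of norm n in Z[i]."""
    return sum(chi_minus4(a) for a in range(1, n + 1) if n % a == 0)

def r2(n):
    """Number of (x, y) in Z^2 with x^2 + y^2 = n, counted by brute force."""
    count = 0
    for x in range(-isqrt(n), isqrt(n) + 1):
        rest = n - x * x
        y = isqrt(rest)
        if y * y == rest:
            count += 1 if y == 0 else 2   # y and -y, unless y = 0
    return count
```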
To isolate the principal ideals of norm n, we shall use class group characters. Let
$C_K$
denote the ideal class group of K, and denote by
$h_K$
its size which is the class number of K. A class group character is a homomorphism
$\psi : C_K \to \mathbf {C}^{\times }$
. We may think of such class group characters as maps

satisfying
$\psi ({\mathfrak a }{\mathfrak b}) = \psi ({\mathfrak a}) \psi ({\mathfrak b})$
and
$\psi ((\lambda )) = 1$
for every nonzero principal ideal
$(\lambda )$
. We denote the dual group of class group characters by
${\widehat C}_K$
.
If
$\psi \in {\widehat C}_K$
is a class group character, then we define
$$r(n,\psi ) = \sum _{N({\mathfrak a}) = n} \psi ({\mathfrak a}).$$
Notice that
${\widehat C}_K$
always includes the principal character
$\psi _0$
given by
$\psi _0(\mathfrak a) =1$
for all ideals
${\mathfrak a}$
. In this case,
$$r(n,\psi _0) = (1*\chi _D)(n).$$
The orthogonality relations for characters now allow us to express
$R_D(n)$
in terms of
$r(n,\psi )$
: namely,
$$R_D(n) = \frac {1}{h_K} \sum _{\psi \in {\widehat C}_K} r(n,\psi ).$$
With these preliminaries in place, we postpone a more detailed discussion of class group characters to Section 8.
3.3 Representation by
$x^2+dy^2$
We now relate the concepts of the previous section to our specific problem of representing integers by the quadratic forms
$x^2+dy^2$
. We will restrict attention to square-free integers d, which is sufficient for our purposes. The problem of representing integers by
$x^2+dy^2$
is naturally related to arithmetic in the field
$K=\mathbf {Q}(\sqrt {-d}) = \mathbf {Q}(\sqrt {D})$
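, where D denotes the fundamental discriminant
$$D = \begin{cases} -4d & \text{if } d \equiv 1, 2 \ (\operatorname {mod}\, 4), \\ -d & \text{if } d \equiv 3 \ (\operatorname {mod}\, 4). \end{cases} \tag{3.5}$$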
Henceforth in the paper, we will adopt the following notational conventions. Unless explicitly stated otherwise, whenever we write d we have in mind a square-free integer, and corresponding to such d will be the fundamental discriminant D given in (3.5), and the imaginary quadratic field
$K = \mathbf {Q}(\sqrt {-d}) = \mathbf {Q}(\sqrt {D})$
. Of course, K and D depend on d, but we will not indicate this explicitly. Sometimes, we will additionally have a second positive square-free number
$d'$
, and
$K', D'$
will be associated to it in the same way.
Lemma 3.1. Let
$d \ge 1$
be square-free, and let
$D, K$
be associated to d as above.
1. If $d \equiv 1$ or $2 \ (\operatorname {mod}\, 4)$, then the number of representations of n by the quadratic form $x^2 + dy^2$ equals $2R_D(n)$, with the exception of the special case $d=1$ where it equals $4 R_{-4}(n)$.
2. If $d\equiv 3\ (\operatorname {mod}\, 4)$, then the number of representations of n by the quadratic form $x^2+dy^2$ is at most $2 R_D(n)$, with the exception of the special case $d=3$ where it is at most $6R_{-3}(n)$.
3. If $d \equiv 7 \ (\operatorname {mod}\, 8)$ and n is odd, then the number of representations of n by the quadratic form $x^2+dy^2$ equals $2R_D(n)$.
Proof. If
$d\equiv 1, 2 \ (\operatorname {mod}\, 4)$
, we have
$D= -4d$
, and the quadratic form
$x^2 +dy^2$
is the principal form of discriminant D. The result (1) now follows from our discussion in Section 3.2.
If
$d\equiv 3 \ (\operatorname {mod}\, 4)$
, then
$D=-d$
, and the principal form of discriminant D is
$x^2+xy + \frac {1+d}{4} y^2$
. The identity
$$x^2 + dy^2 = (x-y)^2 + (x-y)(2y) + \frac {1+d}{4} (2y)^2$$
shows that the representations of n as
$x^2+dy^2$
are in bijective correspondence with the representations of n as
$X^2 + XY + \frac {1+d}{4} Y^2$
with Y even. Since the total number of representations of n as
$X^2+XY +\frac {1+d}{4} Y^2$
(ignoring whether Y is even or odd) equals
$2R_D(n)$
(or
$6R_{-3}(n)$
in the exceptional case
$d=3$
), the upper bound stated in (2) follows.
Finally, if
$d\equiv 7 \ (\operatorname {mod}\, 8)$
and n is odd, then
$\frac {1+d}{4}$
is even, and so any representation of n as
$X^2+XY+ \frac {1+d}{4} Y^2$
must necessarily have Y being even. Thus, in this case the representations of n by
$X^2+XY + \frac {1+d}{4} Y^2$
equal the representations of n by
$x^2+dy^2$
, and assertion (3) follows.
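The comparison between the two forms in the proof can be tested by brute force. The Python sketch below is illustrative only (the function names are ours). It counts representations by x^2 + dy^2 and by X^2 + XY + ((1+d)/4)Y^2 for d ≡ 3 (mod 4), using 4n = (2X+Y)^2 + dY^2 to bound the search, so that one can check equality for d ≡ 7 (mod 8) with n odd, and the inequality of part (2) in general.

```python
from math import isqrt

def reps_diagonal(n, d):
    """Number of (x, y) in Z^2 with x^2 + d*y^2 = n."""
    count = 0
    y = 0
    while d * y * y <= n:
        rest = n - d * y * y
        x = isqrt(rest)
        if x * x == rest:
            # the signs of x and y are free, except for a zero coordinate
            count += (1 if x == 0 else 2) * (1 if y == 0 else 2)
        y += 1
    return count

def reps_principal(n, d):
    """Number of (X, Y) in Z^2 with X^2 + X*Y + ((1+d)/4)*Y^2 = n, d = 3 (mod 4)."""
    count = 0
    bound = isqrt(4 * n // d)
    for Y in range(-bound, bound + 1):
        # 4n = (2X + Y)^2 + d*Y^2, so 2X + Y = +-m with m^2 = 4n - d*Y^2
        disc = 4 * n - d * Y * Y
        m = isqrt(disc)
        if m * m != disc:
            continue
        for s in {m, -m}:
            if (s - Y) % 2 == 0:
                count += 1
    return count
```

For d = 7 and n = 2 the two counts differ (0 against 4), which shows that the assumption that n is odd in part (3) cannot simply be dropped.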
4 Proof of the upper bound
In this section, we prove Theorem 1.3. It will follow from the following proposition.
Proposition 4.1. Let N be large and k an integer in the range (1.1). Let d be a square-free integer with
$d\leq \log N$
. Then the number of integers
$n \in {\mathcal A}(N,k)$
that are represented by
$x^2+dy^2$
is
$\ll \frac {N}{2^k} (\log \log N)^3$
.
Before proving the proposition, let us deduce Theorem 1.3. Note that if
$d=d_1 d_2^2$
with
$d_1$
square-free, then an integer represented by
$x^2 +dy^2$
is automatically represented by
$x^2 +d_1 y^2$
.
Using Proposition 4.1, it follows that the number of integers in
${\mathcal A}(N,k)$
that are represented by
$x^2+dy^2$
for some d,
$1\leq d\leq \Delta $
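is
$$\ll \sum _{d \leq \Delta } \frac {N}{2^k} (\log \log N)^3 \ll \frac {\Delta N k_0^3}{2^k} \ll \frac {N}{k_0},$$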
since
$\Delta \leq 2^k/k^4$
. This establishes Theorem 1.3.
To prove Proposition 4.1, we require the following simple lemma.
Lemma 4.2. Let D be any fundamental discriminant apart from
$D=1$
. For all
$x\geq 1$
, we have
$\sum _{n\leq x} (1*\chi _D)(n) \ll x\log |D|$
.
Proof. Suppose first that
$x\leq |D|^2$
. Since
$(1*\chi _D)(n) \leq \tau (n)$
(the number of divisors of n), the sum in question is
$\leq \sum _{n\leq x} \tau (n) \ll x \log (x+1) \ll x\log |D|$
.
Now, suppose that
$x> |D|^2$
, and note that

Therefore,

The first term on the right side of (4.1) contributes

Since
$\chi _D$
is a nonprincipal character to the modulus
$|D|$
, it sums to zero over any interval of length
$|D|$
, and therefore
$|\sum _{|D| < b\leq x/a} \chi _D(b)| \leq |D|$
. It follows that the second term on the right side of (4.1) contributes
$\ll |D| \sum _{a\leq x/|D|} 1 \ll x$
, and the lemma follows.
Proof of Proposition 4.1
Let d be square-free with
$d \leq \log N$
, and let D be the fundamental discriminant associated to it (as given in (3.5)). Write
$\mathcal {R} = \mathcal {R}(d)$
for the set of all r such that the primes dividing r either divide
$|D|$
or appear to exponent at least
$2$
in the prime factorization of r. Suppose
$n \in {\mathcal A}(N,k)$
is an integer that can be expressed as
$x^2+dy^2$
. Write n uniquely as
$rs$
, where
$(r,s)=1$
, s is square-free and composed of primes not dividing
$|D|$
and
$r \in \mathcal {R}$
. We have that
$\Omega (r) \leq k$
and note that
$\Omega (s) = k-\Omega (r)$
.
By Lemma 3.1 and (3.1), we know that if n is representable by
$x^2+dy^2$
, then
$(1*\chi _D)(n) \geq R_D(n)>0$
. Since
$(1*\chi _D)$
is a nonnegative multiplicative function, it follows that
$(1*\chi _D)(s)> 0$
or, equivalently, that every prime
$p|s$
satisfies
$\chi _D(p)=1$
and therefore
$(1*\chi _D)(s) = 2^{\Omega (s)}$
. Thus,

where in the last step we used the nonnegativity of
$1*\chi _D$
to take the sum over all
$s \leq N/r$
. Applying Lemma 4.2 to the sum over s, we obtain

Now,

where the factor k above arises from the prime
$p=2$
. Since
$k\ll \log \log N$
and
$|D|\leq \log N$
, the proposition follows.
5 Plan of the proof of the lower bound
We now turn to the proof of Theorem 1.4, which constitutes the bulk of the paper. Let N be large, recall that k is an integer in the range (1.1), and suppose in all that follows that

We wish to bound the exceptional integers
$n \in {\mathcal A}(N,k)$
that cannot be represented as
$x^2+dy^2$
with d below
$\Delta $
. In fact, we shall consider only representations by such quadratic forms when d is a prime lying in a suitable residue class and show that most integers can be represented even with this further constraint.
To state our results more precisely, we distinguish two cases according to whether the
$2$
-adic valuation
$v_2(n)$
is
$0$
or
$1$
(or in other words whether
$n\equiv 1 \ (\operatorname {mod}\, 2)$
or
$n \equiv 2 \ (\operatorname {mod}\, 4)$
). Results for integers n that are multiples of
$4$
will be deduced easily from these cases. Thus, we define, for all
$j =0$
,
$1$
,

Observe that

and

Thus, from Lemma 1.2 we may deduce that for k satisfying
$|k-k_0| \leq \frac 13 k_0$
we have

Here, we recall that
$k_0$
denotes
$\log \log N$
.
To each case
$j = 0,1$
we associate a set
$\mathcal {D}_j$
of primes. Below, we let W denote a parameter tending to infinity slowly with N; for definiteness, we set
$W=\log \log \log N$
. With this choice of W, define


Here, as usual, D denotes the fundamental discriminant associated to d as given in (3.5). Thus,
$D=-d$
for
$d\in {\mathcal D}_0$
and since
$D\equiv 1 \ (\operatorname {mod}\, 8)$
, we have
$\chi _D(2) =1$
automatically. If d is in
${\mathcal D}_1$
, then
$D= -4d$
, and here
$\chi _D(2)=0$
. The primes in
${\mathcal D}_0$
lie in
$\prod _{3\leq p \leq W} \frac {p-1}{2}$
reduced residue classes
$(\operatorname {mod}\, 8\prod _{3\leq p\leq W} p)$
, while those in
${\mathcal D}_1$
lie in
$\prod _{3\leq p \leq W} \frac {p-1}{2}$
reduced residue classes
$(\operatorname {mod}\, 4\prod _{3\leq p\leq W} p)$
. Since W is suitably small, a simple application of the prime number theorem in arithmetic progressions gives

In particular, since
$W = \log \log \log N$
, and since

we have the crude bounds

We are now ready to state our result on representing integers in
${\mathcal A}_j(N,k)$
using the binary quadratic forms
$x^2 + dy^2$
with
$d\in {\mathcal D}_j$
. From this result, we shall swiftly deduce Theorem 1.4.
Theorem 5.1. Suppose that N is large, k is an integer in the range
$$|k - k_0| \leq 2 k_0^{2/3}, \tag{5.5}$$
where
$k_0 := \log \log N$
, and that
$k^3 2^k \leq \Delta \leq \log N$
. For
$j = 0,1$
, let
${\mathcal E}_j(N,k)$
denote the exceptional set of integers in
${\mathcal A}_j(N,k)$
that cannot be expressed as
$x^2+dy^2$
for some
$d\in {\mathcal D}_j$
. Then we have

Deducing Theorem 1.4 from Theorem 5.1.
Extracting the largest power of
$4$
, we see that every
$n \in {\mathcal A}(N,k)$
may be written uniquely as
$n= 4^m r$
, where r is either in
${\mathcal A}_0(N/4^m, k-2m)$
or in
${\mathcal A}_1(N/4^m, k-2m)$
. Further, if r can be represented as
$x^2 +dy^2$
with
$d\leq \Delta $
, then plainly so can
$n= 4^m r$
. Thus,

First, let us dispense with the terms
$m \geq \log k_0$
. Bounding
$|{\mathcal E}_0(N/4^m, k-2m)| + |{\mathcal E}_1(N/4^m, k-2m)|$
trivially by
$N/4^m$
, we see that these terms contribute
$\ll \sum _{m\geq \log k_0} N/4^m \ll N/k_0$
, which is better than we need.
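The reduction to integers not divisible by 4 used in this deduction rests on two elementary facts: the decomposition n = 4^m r is unique, and a representation of r lifts to one of n with the same d. A minimal Python sketch (the function names are ours and serve only as illustration):

```python
def split_power_of_four(n):
    """Write n = 4**m * r with r not divisible by 4; return (m, r)."""
    m = 0
    while n % 4 == 0:
        n //= 4
        m += 1
    return m, n

def lift_representation(m, x, y):
    """If r = x^2 + d*y^2, then 4**m * r = (2**m * x)^2 + d*(2**m * y)^2."""
    return (2 ** m) * x, (2 ** m) * y
```

For example, 48 = 4^2 · 3 and 3 = 0^2 + 3 · 1^2 lifts to 48 = 0^2 + 3 · 4^2.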
For the terms with
$m\leq \log k_0$
, we wish to use Theorem 5.1 to bound the quantity
$|{\mathcal E}_j (N/4^m, k-2m)|$
(for
$j=0,1$
). We must check that the required conditions there hold. The condition on
$\Delta $
is automatic: Since
$\Delta $
is assumed to be
$\geq k^3 2^k$
it is clearly also
$\geq (k-2m)^3 2^{k-2m}$
. The main condition to check is the analogue of (5.5) which here reads
$|(k-2m) - \log \log (N/4^m) | \leq 2 (\log \log (N/4^m))^{2/3}$
. To verify this, note that for
$m\leq \log k_0$
, one has
$\log \log (N/4^m) = k_0 +O(1)$
, and so the left side above is
$\leq |k_0 - k| + 2m + O(1) \leq k_0^{2/3} + 2\log k_0 + O(1)$
since k is in the range (1.1). Thus, we may apply Theorem 5.1, and conclude that

Now, applying Lemma 1.2 we obtain

where the final estimate holds since
$k/k_0 = 1 +O(k_0^{-1/3})$
and
$m\leq \log k_0$
so that
$(k/k_0)^{2m} \ll 1$
. We conclude that the contribution of the terms with
$m \leq \log k_0$
may be bounded by

Combining this estimate with our bound for the larger range of m, we complete the deduction of Theorem 1.4.
Theorem 5.1 will be deduced (in the next section) from the following four propositions which form the heart of our argument. Before stating these propositions, we introduce some notation that will be in place for the rest of our work. We shall factorize n as
$n^{\flat } n^{\sharp }$
, where
$n^{\flat }$
is composed only of primes below W, and
$n^{\sharp }$
is composed only of primes above W. Further, we define

For each choice of
$j=0$
or
$1$
, we define

Note that if n cannot be represented as
$x^2 +dy^2$
with
$d\in {\mathcal D}_j$
, then
$R_D(n)=0$
for all
$d\in {\mathcal D}_j$
and therefore
$F_j(n)=0$
. The proof is based on showing that for
$n\in {\mathcal A}_j(k)$
, the quantity
$F_j(n)$
is usually close to its expected value of
$1$
, which is achieved by showing that
$(F_j(n)-1)^2$
is small on average over n. The four propositions below facilitate the calculation of this variance, which will be carried out in the next section.
Proposition 5.2. Let N be large, and let k be an integer in the range (5.5). The following statements hold for either choice of
$j=0$
or
$1$
. Let d be an element in
${\mathcal D}_j$
, and let D be the corresponding fundamental discriminant. Then

where
$\gamma _W$
is as in (5.6).
Partial summation and (5.1) easily allow us to give an asymptotic for the sum
$\sum _{n\in {\mathcal A}_j(k)} e^{-n/N}$
appearing above. Write

where we truncated the integral above using the trivial bound
$|{\mathcal A}_j(uN,k)| \leq uN$
in the range
$u \not \in [1/\log N, \log N]$
. Now, using (5.1) for
$u \in [1/\log N, \log N]$
and the estimate
$\frac {k_0^k}{k!} \ll k_0^{-1/2} \log N$
, which follows from (2.3), we obtain that for k in the range (5.5)

Note that there is a small subtlety in the application of (5.1), which is that N must be replaced by
$uN$
not only in the obvious term
$\frac {N}{\log N}$
, but also
$k_0$
must be replaced by
$\log \log (uN)$
. We leave it to the reader to check that these changes have negligible effect for u in the stated range.
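The estimate $\frac {k_0^k}{k!} \ll k_0^{-1/2} \log N$ used above can be illustrated numerically. Since $\log N = e^{k_0}$, it amounts to saying that the Poisson weight $e^{-k_0} k_0^k/k!$ is $O(k_0^{-1/2})$ uniformly in k. The sketch below evaluates the ratio of the two sides through logarithms to avoid overflow; the helper names are hypothetical, and the scan over k is wider than the range (5.5) purely for convenience.

```python
import math


def poisson_ratio(k0: float, k: int) -> float:
    """Return (k0**k / k!) divided by (k0**-0.5 * e**k0), computed via logs."""
    log_lhs = k * math.log(k0) - math.lgamma(k + 1)  # log(k0^k / k!)
    log_rhs = -0.5 * math.log(k0) + k0               # log(k0^{-1/2} log N)
    return math.exp(log_lhs - log_rhs)


def worst_ratio(k0: float) -> float:
    """Largest value of the ratio over 1 <= k <= 4*k0 + 1."""
    return max(poisson_ratio(k0, k) for k in range(1, int(4 * k0) + 2))


for k0 in (5.0, 20.0, 100.0):
    print(k0, worst_ratio(k0))
```

The printed ratios stay bounded as $k_0$ grows, in line with what Stirling's formula predicts.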
Our next proposition considers averages of
$R_D(n) R_{\tilde D}(n)$
for two different elements d,
${\widetilde d} \in {\mathcal D}_j$
. The answer will involve the character
$\chi _{d {\tilde d}}$
, which we now briefly introduce. Since d and
${\tilde d}$
are different primes that are congruent to each other
$(\operatorname {mod}\, 4)$
, it follows that
$d{\tilde d} \equiv 1 \ (\operatorname {mod}\, 4)$
is a fundamental discriminant, and so the Kronecker symbol
$\chi _{d {\tilde d}}$
is a primitive character to the modulus
$d{\tilde d}$
. This character is also closely connected to the product of characters
$\chi _D \chi _{\tilde D}$
. Indeed, in the case
$j=0$
both characters are identical, and in the case
$j=1$
the character
$\chi _D \chi _{\tilde D}$
is the imprimitive character
$(\operatorname {mod}\, 4d{\tilde d})$
induced by the primitive character
$\chi _{d {\tilde d}}$
.
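The relation between $\chi _D \chi _{\tilde D}$ and $\chi _{d {\tilde d}}$ described above can be checked on small examples. The sketch below implements the Kronecker symbol directly and compares the two sides for $n \leq 500$. The primes used are illustrative only: they satisfy the congruence conditions mentioned in the text, but not necessarily the further conditions defining ${\mathcal D}_j$. The function names are our own.

```python
def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for an odd modulus n >= 1."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:          # pull out factors of 2 using (2/n)
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def kronecker(a: int, n: int) -> int:
    """Kronecker symbol (a/n) for n >= 1."""
    result = 1
    while n % 2 == 0:              # (a/2) depends only on a mod 8
        n //= 2
        if a % 2 == 0:
            return 0
        if a % 8 in (3, 5):
            result = -result
    return result * jacobi(a, n)


def check_pair(d: int, dt: int, j: int, limit: int = 500) -> bool:
    """Compare chi_D * chi_Dt with chi_{d*dt}, where D = -d (j=0) or -4d (j=1)."""
    D, Dt = (-d, -dt) if j == 0 else (-4 * d, -4 * dt)
    for n in range(1, limit + 1):
        product = kronecker(D, n) * kronecker(Dt, n)
        primitive = kronecker(d * dt, n)
        # For j = 1 the product is imprimitive: it vanishes at even n.
        expected = primitive if (j == 0 or n % 2 == 1) else 0
        if product != expected:
            return False
    return True


print(check_pair(7, 23, 0), check_pair(5, 13, 1))
```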
Proposition 5.3. Let N be large, and let k be an integer in the range (5.5). Let j be
$0$
or
$1$
. Let d and
${\widetilde d}$
be two different elements in
${\mathcal D}_j$
, and let D and
$\widetilde {D}$
denote the corresponding fundamental discriminants. If
$d\equiv {\widetilde d} \ (\operatorname {mod}\, 8)$
(which is automatic when
$j=0$
), then

while if
$d \not \equiv {\widetilde d} \ (\operatorname {mod}\, 8)$
(which can only happen for
$j=1$
), then

The next proposition concerns the case when
$d = {\widetilde d}$
, where an upper bound suffices.
Proposition 5.4. Let N be large, and let k be an integer in the range (5.5). Let
$j=0$
or
$1$
, and let d be an element of
${\mathcal D}_j$
with D denoting the corresponding fundamental discriminant. Then we have

Finally, to complete our calculation of the average of
$(F_j(n)-1)^2$
, we shall need an asymptotic for the average of
$L(1,\chi _{d{\widetilde d}})$
appearing in Proposition 5.3.
Proposition 5.5. For each
$j=0,1$
, we have

6 Deducing Theorem 5.1 from Propositions 5.2, 5.3, 5.4 and 5.5
We now deduce Theorem 5.1 from the four propositions enunciated in the previous section. Let j be
$0$
or
$1$
, and k an integer in the range (5.5). Recall from (5.7) the definition of
$F_j(n)$
, and recall that
$F_j(n) =0$
if n cannot be represented as
$x^2+dy^2$
with
$d\in {\mathcal D}_j$
. Therefore, writing
${\mathcal E}_j(N,k)$
for the exceptional set as in Theorem 5.1,

We now invoke Propositions 5.2, 5.3, 5.4 and 5.5 to bound the right side above. To handle some error terms that arise, we require bounds for the average values of
$L(1,\chi _D)^{-m}$
with
$m=1$
and
$2$
. Although we can be more precise, it suffices to use [Reference Granville and Soundararajan11, Theorem 2] and (5.4) to obtain

for
$j = 0,1$
and
$m = 1,2$
.
A few further remarks on the application of [Reference Granville and Soundararajan11, Theorem 2] may be helpful. First, since we are dealing with moments where m is bounded (albeit negative) we can exclude the contribution of exceptional characters, as remarked in the paragraph following the statement of [Reference Granville and Soundararajan11, Theorem 2]. Second, denoting by X the random Euler product featuring in the statement of [Reference Granville and Soundararajan11, Theorem 2] then, as remarked in [Reference Granville and Soundararajan11, page 995],
$\mathbf {P}(L(1, X) \leq 1/t)$
decays doubly exponentially as
$t \rightarrow \infty $
so that the moments
$\mathbf {E} L(1, X)^{-1}$
and
$\mathbf {E} L(1,X)^{-2}$
are bounded.
From Proposition 5.2, (6.2) (with
$m = 1$
) and the assumption that
$\Delta \leq \log N$
, it follows that

It remains to evaluate the terms involving
$F_j(n)^2$
in (6.1). Expanding out the square, we have

Here, we separate the diagonal terms
$d={\widetilde d}$
from the off-diagonal terms
$d\neq {\widetilde d}$
. By Proposition 5.4, we see that the contribution of the diagonal terms is bounded by

Using (5.4) and (6.2), the Mertens bound
$\gamma _W \gg 1/\log W = (\log \Delta )^{-o(1)}$
and that
$\Delta \geq k^3 2^k$
, the above is

As for the off-diagonal terms in (6.4), using Proposition 5.3 we see that their contribution is

Now, using Proposition 5.5, (6.2) and the bound
$\gamma _W^{-1} \ll (\log \Delta )^{o(1)}$
, the above is

Combining (6.5) and (6.6), we conclude that

Taken together with (6.3), it follows that

in view of (5.8) and Lemma 1.2. Using this estimate in (6.1) and recalling that
$W = \log \log \log N = \log k_0$
, Theorem 5.1 follows.
7 Proof of Proposition 5.5
In the proof below, it is convenient to set

Suppose d and
${\widetilde d}$
are distinct elements in
${\mathcal D}_j$
with
$d \equiv {\widetilde d} \ (\operatorname {mod}\, 8)$
. Then
$d{\widetilde d}$
is a square-free integer
$\equiv 1 \ (\operatorname {mod}\, 8)$
and is thus a fundamental discriminant. Since
$d{\tilde d}\leq \Delta ^2$
, partial summation and the Pólya–Vinogradov inequality give

We first show that (when summed over d and
${\tilde d}$
) the terms with
$n> K$
contribute a negligible amount. Here, we extend the sum over
$d {\tilde d}$
to all discriminants below
$\Delta ^2$
that are
$1 \ (\operatorname {mod}\, 8)$
. Recall that a discriminant is an integer
$\ell \equiv 0$
or
$1 \ (\operatorname {mod}\, 4)$
and that every discriminant
$\ell $
may be written uniquely as
$\ell _0 r^2$
, where
$\ell _0$
is a fundamental discriminant. For every discriminant
$\ell $
, we may define the Kronecker symbol
$\chi _\ell $
exactly as in Subsection 3.1, and it defines a quadratic character
$(\operatorname {mod}\, \ell )$
, possibly imprimitive and induced from the primitive character
$\chi _{\ell _0}\kern-1.3pt $
. Thus, using Cauchy–Schwarz, we find

Expanding the square, we obtain

Write
$n_1 n_2$
as
$2^a n$
, where n is odd. Since
$d\equiv 1 \ (\operatorname {mod}\, 8)$
,
$\chi _d(2)=1$
, and therefore
$\chi _d(n_1 n_2) = \chi _d(n)$
may also be expressed as the Jacobi symbol
$(\frac {d}{n})$
. Now, the Jacobi symbol
$( \frac {\cdot }{n})$
is a quadratic character
$(\operatorname {mod}\, n)$
and is nonprincipal exactly when n is not a square, or, in other words, when
$n_1 n_2$
is neither a square nor twice a square. Thus, when
$n_1 n_2$
is neither a square nor twice a square we find by the Pólya–Vinogradov inequality

where
$\overline 8$
denotes the inverse of
$8$
modulo n. If
$n_1 n_2$
is a square or twice a square, then the inner sum over d in (7.2) is clearly
$O(\Delta ^2)$
. Thus, we obtain that the quantity in (7.2) is

The second term above is easily bounded by
$\ll M\log M$
. Now, consider the first term, where we handle the case
$n_1 n_2 =m^2$
with the case
$n_1 n_2 =2m^2$
treated in the same manner. The terms
$n_1 n_2 =m^2$
contribute, with
$\tau (\cdot )$
denoting the divisor function

We conclude that the quantity in (7.2) is

Combining the above argument with (7.1), we find that

To analyse the main term above, write
$n\leq K$
uniquely as
$n= frm^2$
, where f and r are square-free with all prime factors of f being below W and all prime factors of r being above W (in particular, r is odd, and note that r could be
$1$
). Note that for all p,
$3\leq p\leq W$
, we have
$\chi _{d{\widetilde d}}(p) = \chi _{D}(p) \chi _{\widetilde D}(p) =1$
. Since
$d \equiv {\widetilde d} \ (\operatorname {mod}\, 8)$
, we have
$d{\widetilde d} \equiv 1 \ (\operatorname {mod}\, 8)$
, and it follows also that
$\chi _{d{\widetilde d}} (2) =1$
. Finally, since d and
$\widetilde d$
are primes in the range
$[\Delta /\log \Delta , \Delta ]$
, and
$m^2 \leq n \leq K = (\log \Delta )^{20}$
we know that
$(d{\widetilde d},m^2)=1$
and therefore
$\chi _{d{\widetilde d}}(m^2) =1$
. Thus,
$\chi _{d{\widetilde d}}(n)$
equals the Jacobi symbol
$(\frac {d{\widetilde d}}{r})$
, which for given r is a quadratic character that is principal when
$r=1$
and nonprincipal for
$r>1$
. With this notation, the main term in (7.3) may be expressed as

We now show that the asymptotic in Proposition 5.5 arises from the contribution of
$r=1$
here, while the terms with
$r>1$
contribute a negligible amount. When
$r=1$
, note that
$(\frac {d \widetilde d}{r}) =1$
. Since d and
${\widetilde d}$
range over primes in
$[\Delta /\log \Delta , \Delta ]$
in suitable progressions modulo
$8\prod _{3\leq p\leq W} p$
, and this modulus is
$\leq e^{(1+o(1))W} = (\log \Delta )^{1+o(1)}$
, by the prime number theorem in arithmetic progressions it follows that

When
$j=0$
the condition
$d \equiv {\widetilde d} \ (\operatorname {mod}\, 8)$
is automatic, while when
$j=1$
we only know from the definition that
$d \equiv {\widetilde d} \ (\operatorname {mod}\, 4)$
and the extra constraint
$(\operatorname {mod}\, 8)$
accounts for the factor
$2^{-j}=\frac 12$
above. Now, the unrestricted sum over n satisfies

while the tail
$\sum _{n= fm^2>K} 1/n$
may be bounded by

We conclude that the terms with
$r=1$
in (7.4) contribute

This is
$2^{-j} |\mathcal {D}_j|^2 \gamma _W^{-1} ( 1+ O(W^{-1}))$
, matching the expression in the proposition.
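The decomposition $n = frm^2$ used to organise the main term can be made concrete as follows. This is a minimal sketch by trial division; the name `split_f_r_m` is hypothetical, and we assume the convention that primes $p \leq W$ go into f and primes $p> W$ go into r.

```python
def split_f_r_m(n: int, W: float) -> tuple[int, int, int]:
    """Write n = f * r * m**2 with f, r square-free,
    primes of f at most W and primes of r larger than W."""
    f, r, m = 1, 1, 1
    remaining = n
    p = 2
    while p * p <= remaining:
        e = 0
        while remaining % p == 0:
            remaining //= p
            e += 1
        m *= p ** (e // 2)          # even part of the exponent goes into m^2
        if e % 2 == 1:              # one leftover copy of p goes into f or r
            if p <= W:
                f *= p
            else:
                r *= p
        p += 1
    if remaining > 1:               # a final prime factor with exponent 1
        if remaining <= W:
            f *= remaining
        else:
            r *= remaining
    return f, r, m


print(split_f_r_m(6600, 4))  # 6600 = 2^3 * 3 * 5^2 * 11
```

The terms with $r=1$ are exactly those n of the form $fm^2$, which is the sum evaluated above.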
It remains to show that the contribution to (7.4) of terms with
$r>1$
is negligible. Given
$d\in {\mathcal D}_j$
, consider the sum over
${\widetilde d}$
in (7.4), which is

Now, the sum over
${\widetilde d}$
above counts primes in
$[\Delta /\log \Delta ,\Delta ]$
lying in a suitable number of arithmetic progressions modulo
$8r \prod _{3\leq p\leq W} p$
. Since the modulus is
$\ll K e^{(1+o(1))W} \leq (\log \Delta )^{22}$
, an application of the prime number theorem in arithmetic progressions shows that the above equals

upon noting that the main terms cancel (since
$(\frac {\cdot }{r})$
is a nonprincipal character) and that
$r\leq K = (\log \Delta )^{20}$
. Thus, the contribution of the terms
$r>1$
to (7.4) is

Combining this with our evaluation of the terms with
$r=1$
, we conclude that the quantity in (7.4) is
$2^{-j} \gamma _W^{-1} |{\mathcal D}_j|^2 (1+O(W^{-1}))$
, and using this in (7.3) the proof of Proposition 5.5 is complete.
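The proof above uses twice the fact that, for odd n, the Jacobi symbol $(\frac {\cdot }{n})$ is a nonprincipal character exactly when n is not a square. The following sketch checks this by brute force for small odd n; the helper names are illustrative and not taken from the paper.

```python
import math


def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for an odd modulus n >= 1."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def is_nonprincipal(n: int) -> bool:
    """True if (a/n) = -1 for some a coprime to the odd modulus n."""
    return any(jacobi(a, n) == -1
               for a in range(1, n) if math.gcd(a, n) == 1)


def is_square(n: int) -> bool:
    root = math.isqrt(n)
    return root * root == n


mismatches = [n for n in range(1, 300, 2)
              if is_nonprincipal(n) == is_square(n)]
print(mismatches)
```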
8 Class group L-functions
We begin by recalling properties of class group L-functions over general number fields. In our work, we will only need the special cases of quadratic and biquadratic extensions. Let K be a number field of degree m and discriminant
$D_K$
. Let
$\Psi $
be a character of the class group of K, and let
$L(s,\Psi )$
denote the corresponding L-function. Recall that
$L(s,\Psi )$
is defined by

where both the Dirichlet series and Euler product above converge absolutely in the half-plane
$\sigma> 1$
. In the half-plane
$\sigma>1$
, we define a holomorphic branch of
$\log L(s,\Psi )$
by setting

The Dirichlet series coefficients of
$L(s,\Psi )$
are bounded in absolute value by the corresponding coefficients of the Dedekind zeta-function
$\zeta _K(s)$
, which in turn are no more than the coefficients of
$\zeta (s)^m$
(which has coefficients given by the m-divisor function). Further, the coefficients of
$\log L(s,\Psi )$
(as defined above) are supported on prime powers and bounded in size by the coefficients of
$\log \zeta _K(s)$
(defined as above for the principal character
$\Psi _0$
) and thus are no more than
$m/j$
on the prime powers
$p^j$
. In particular, we note that in the half-plane
$\sigma>1$

with the second bound being a standard bound for
$\zeta $
(see, for instance, [Reference Montgomery and Vaughan14, Corollary 1.4]).
We now collect together some classical bounds for
$L(s,\Psi )$
, along with describing a zero-free region for
$L(s,\Psi )$
and bounds for
$|\log L(s,\Psi )|$
inside the zero-free region.
Lemma 8.1. Let K,
$\Psi $
and
$L(s,\Psi )$
be as above. Then the following statements hold.
-
1. Suppose that
$\Psi $ is not the principal character. Then
$L(s,\Psi )$ extends to an entire function and uniformly in the region
$\sigma \geq 0$ satisfies the bound
(8.4)$$ \begin{align} |L(\sigma+it,\Psi)| \ll_m \Big( (|D_K| (1+|t|)^m)^{(1-\sigma)/2} + 1 \Big) (\log (|D_K|(1+|t|)))^{m}. \end{align} $$
For every
$\varepsilon>0$ , there is a constant
$C=C(m,\varepsilon )>0$ such that the region
$$\begin{align*}{\mathcal R}_0 = {\mathcal R}_0(\varepsilon) = \{ \sigma \geq 1- C |D_K|^{-\varepsilon}, \ \ |t| \leq |D_K| \} \end{align*}$$
is free of zeros of $L(s,\Psi )$ . Thus,
$\log L(s,\Psi )$ extends analytically to the region
${\mathcal R}_0$ , and moreover in the subregion
$$\begin{align*}{\mathcal R} = {\mathcal R}(\varepsilon) = \Big \{ \sigma \geq 1- \tfrac{1}{2} C |D_K|^{-\varepsilon}, \ \ |t| \leq \tfrac{1}{2} |D_K| \Big\}, \end{align*}$$
(8.5)$$ \begin{align} |\log L(s,\Psi)| \leq 6m\varepsilon \log |D_K| + O_{m,\varepsilon}(1). \end{align} $$
-
2. Suppose that
$\Psi $ is the principal character so that
$L(s,\Psi )$ is the Dedekind zeta-function
$\zeta _K(s)$ of the field K. The Dedekind zeta-function extends to a meromorphic function, with a single simple pole at
$s=1$ . The convexity bound (8.4) holds provided
$|t| \geq 1$ , while for
$|t|\leq 1$ the same bound holds for
$|(s-1)\zeta _K(s)|$ . The region
${\mathcal R}_0$ is free of zeros of
$\zeta _K(s)$ , and the function
$\log ((s-1)\zeta _K(s))$ extends analytically to the region
${\mathcal R}_0$ . The bound (8.5) holds for
$|\log \zeta _K(s)|$ in the subregion
${\mathcal R}$ provided
$|s-1| \geq 1$ , and for points in
${\mathcal R}$ with
$|s-1| \leq 1$ the same bound holds for
$|\log ((s-1) \zeta _K(s))|$ instead.
Proof. Suppose first that
$\Psi $
is nonprincipal. The analytic continuation of
$L(s,\Psi )$
to the entire plane is due to Hecke (for a modern account, see, for example, Chapter 7 of [Reference Narkiewicz15]).
The bound in (8.4) is a standard convexity bound and, for instance, may be obtained from Lemma 4 of Fogels [Reference Fogels8]. Fogels’s paper [Reference Fogels8] established a classical zero-free region for
$L(s,\Psi )$
of the form
$\sigma \geq 1- c/\log (|D_K|(1+|t|))$
for a suitable constant
$c>0$
, when the character
$\Psi $
is complex. In the case of a real character
$\Psi $
, the same region is free of zeros of
$L(s,\Psi )$
except for the possibility of a simple zero at
$1-\delta $
for a real number
$\delta $
(analogous to the Siegel zero for Dirichlet L-functions). Analogously to the Brauer–Siegel theorem, Fogels [Reference Fogels9] shows, by reducing to Brauer’s work, that
$\delta \geq C(m, \varepsilon ) |D_K|^{-\varepsilon }$
. Thus, the region
${\mathcal R}_0$
is free of zeros of
$L(s,\Psi )$
.
The bound (8.5) on
$|\log L(s,\Psi )|$
in the narrower region
${\mathcal R}$
follows by an application of the Borel–Carathéodory lemma using the preliminary bounds (8.3) and (8.4), as we shall now see. Let
$z_0 = 1+ \frac C2 |D_K|^{-\varepsilon } +i t$
with
$|t|\leq |D_K|/2$
, and put
$r= C|D_K|^{-\varepsilon }$
and
$R = \frac 32 C |D_K|^{-\varepsilon }$
. The function
$f(z) = \log L(z,\Psi )$
is holomorphic inside the circle of radius R centered at
$z_0$
(since this is contained in the region
${\mathcal R}_0$
), and for z inside this larger circle it satisfies the bound

since by (8.4) we have
$|L(z,\Psi )| \ll _{m,\varepsilon } (\log |D_K|)^m$
. Further, by (8.3)

The Borel–Carathéodory lemma (see, for example, Section 5.5 of [Reference Titchmarsh18]) now shows that for z inside the smaller circle
$|z-z_0| \leq r$
one has

This establishes (8.5) for all
$s= \sigma +it$
with
$|t|\leq \frac {1}{2}|D_K|$
and
$1-\frac {1}{2} C |D_K|^{-\varepsilon } \leq \sigma \leq 1+ \frac {3}{2}C |D_K|^{-\varepsilon }$
. When
$\sigma> 1+\frac {3}{2}C |D_K|^{-\varepsilon }$
(and
$|t| \leq \frac {1}{2}|D_K|$
) the bound in (8.5) follows at once from (8.3), and this completes the proof in the case of nonprincipal
$\Psi $
.
The case when
$\Psi $
is principal follows in the same way. The only difference is that the Dedekind zeta-function has a pole at
$s=1$
so that near
$1$
we deal with
$(s-1) \zeta _K(s)$
instead.
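The numerology in the Borel–Carathéodory step can be made explicit. Assuming the standard form of the lemma, namely $|f(z)| \leq \frac {2r}{R-r} \max _{|w-z_0|=R} {\operatorname {Re}} f(w) + \frac {R+r}{R-r} |f(z_0)|$ for $|z-z_0|\leq r < R$, the sketch below computes the two constants for the radii chosen in the proof, working in units of $C|D_K|^{-\varepsilon }$. It also records how far to the left of the line $\sigma = 1$ each disc reaches. The function name is hypothetical.

```python
from fractions import Fraction


def borel_caratheodory_constants(r: Fraction, R: Fraction):
    """Return (A, B) with |f(z)| <= A * max Re f + B * |f(z0)| on |z - z0| <= r."""
    return 2 * r / (R - r), (R + r) / (R - r)


unit = Fraction(1)            # stands for C * |D_K|**(-eps)
center = unit / 2             # Re(z0) - 1, in the same units
r, R = unit, Fraction(3, 2) * unit

A, B = borel_caratheodory_constants(r, R)
left_large = center - R       # leftmost point of the large disc, relative to 1
left_small = center - r       # leftmost point of the small disc
right_small = center + r      # rightmost point of the small disc

# The value used later in the paper: 6*m*eps with m = 2 and eps = 1/100.
six_m_eps = 6 * 2 * Fraction(1, 100)

print(A, B, left_large, left_small, right_small, six_m_eps)
```

With constants 4 and 5, a bound of size $m \log \log |D_K| + O(1)$ for the real part on the large circle and a bound of size roughly $m\varepsilon \log |D_K|$ at $z_0$ are consistent with the constant $6m\varepsilon $ in (8.5); the size of the bound at $z_0$ is our reading of (8.3) rather than a quotation. The last printed value, $3/25$, is below $1/8$, which matches the later application with $\varepsilon = \frac {1}{100}$ and $m=2$.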
To prove Propositions 5.2, 5.3 and 5.4, we shall make use of the expression (3.4) of
$R_D(n)$
in terms of the coefficients of the class group L-functions
$r(n,\psi )$
. As consequences of Lemma 8.1, we now show that in such expressions the contribution of most class group characters
$\psi $
is negligible. The main lemmas we will prove in this section are Lemmas 8.2, 8.4 and 8.5. The analytic details are very similar across all three, so we will only give complete details in the proof of Lemma 8.2.
Recall the convention introduced in Section 5, namely that for integer n we write
$n = n^{\flat } n^{\sharp }$
, where
$n^{\flat }$
has only prime factors
$\leq W$
, and
$n^{\sharp }$
only prime factors
$>W$
.
Lemma 8.2. Let N be large and k be an integer in the range (5.5). Let j be
$0$
or
$1$
, and let d be an element of
${\mathcal D}_j$
with D denoting the corresponding fundamental discriminant. Let
$\psi $
be a nonprincipal class group character of the quadratic field
$K=\mathbf {Q}(\sqrt {D})$
. Then

Proof. The key idea here and in the proofs of Lemmas 8.4 and 8.5 is to follow Selberg [Reference Selberg16] and introduce, for any
$z \in \mathbf {C}$
with
$|z| = 1$
, the Dirichlet series

Later, we will recover the condition
$\Omega (n) = k$
(which defines the set
$\mathcal {A}_j(k)$
) by Fourier inversion.
Since (by (3.2), (3.3))
$|r(n,\psi )| \leq \tau (n)$
, we see that
${\mathcal F}(s;z,j)$
converges absolutely in the half-plane Re
$(s) =\sigma>1$
and further satisfies the bound

using [Reference Montgomery and Vaughan14, Corollary 1.4] in the last step. Further, by Mellin inversion we have, setting
$c= 1+1/\log N$
,

By Stirling’s formula
$|\Gamma (\sigma + it)| \ll (1+|t|)^{\sigma -1/2} e^{-\pi |t|/2}$
uniformly for
$\sigma $
in bounded intervals (see, for instance, (C.19) of [Reference Montgomery and Vaughan14]). Combining this with the bound (8.6), we find that the tails of the integral in (8.7) above where
$|\text {Im}(s)| \geq (\log \log N)^2$
contribute

Thus, writing
$T= (\log \log N)^2$
,

To estimate the truncated integral here, we shall extend
${\mathcal F}(s;z,j)$
analytically a little to the left of the
$1$
-line and shift contours. To extend
${\mathcal F}(s;z,j)$
analytically, we shall compare it with
$L(s,\psi )^z$
. Note that when Re
$(s)>1$
we may define
$L(s,\psi )^z$
by the Euler product
$\prod _{\mathfrak p} (1- \psi (\mathfrak p)/N(\mathfrak p)^s)^{-z}$
, and this product converges absolutely when Re
$(s)>1$
. Further, we may extend
$L(s,\psi )^z$
analytically to a wider region by writing it as
$\exp (z\log L(s,\psi ))$
and using the analytic continuation described in Lemma 8.1. Thus, define

which is, to start with, analytic in the half-plane
$\sigma>1$
. The definition of
${\mathcal F}(s;z,j)$
permits us (in this region) to write
${\mathcal G}(s;z,j)$
as an Euler product
$\prod _p {\mathcal G}_p(s;z,j)$
, whose factors we now describe. For
$p>W$
, we have

For p with
$3\leq p\leq W$
, we have

Finally, for
$p=2$
we have

and in both cases this is
$1+O(2^{-\sigma })$
. From these remarks, we see that the Euler product
$\prod _p {\mathcal G}_p(s;z,j)$
converges absolutely in the region
${\operatorname {Re}}(s)>\frac {1}{2}$
and defines a holomorphic function of s in that region. Moreover, in the region
$\sigma \geq \frac {3}{4}$
, we have the bound

For the rest of the paper, we fix the domain

Applying Lemma 8.1 with
$\varepsilon =\frac {1}{100}$
(and
$m=2$
), we see that (keeping in mind
$(\log N)^{1/2} \leq |D| \leq \log N$
) the function
$\log L(s,\psi )$
is analytic in
$\mathcal {W}$
and satisfies
$|\log L(s,\psi )| \leq \frac 18 \log |D| +O(1)$
here. Therefore,

is also analytic in
$\mathcal {W}$
and by (8.9) satisfies in this region

We now return to the integral in (8.8), and replace the line of integration from
$c-iT$
to
$c+iT$
by integrals along the following three line segments: (i) the horizontal line segment from
$c-iT$
to
$1-(\log N)^{-1/2} -iT$
, (ii) the vertical line segment from
$1-(\log N)^{-1/2} -iT$
to
$1-(\log N)^{-1/2} +iT$
and (iii) the horizontal line segment from
$1-(\log N)^{-1/2} +iT$
to
$c+iT$
. On the horizontal line segment (i), we may bound the integral by

upon using
$|\Gamma (\sigma -iT)| \ll T^{\sigma -1/2} e^{-\pi T/2} \ll e^{-T}$
. Naturally, the same estimate applies to the integral on the horizontal line segment in (iii). As for the vertical line segment (ii), the integral here is

Putting all this together and recalling (8.1), we conclude that uniformly for
$|z|=1$

By Fourier inversion,

and the proof of Lemma 8.2 is complete.
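The final Fourier inversion step, which recovers the condition $\Omega (n) = k$ from sums weighted by $z^{\Omega (n)}$ with $|z|=1$, has a simple discrete analogue that can be run directly. The sketch below replaces the integral over the unit circle by an average over M-th roots of unity, which is exact as long as M exceeds every value of $\Omega (n)$ that occurs. This is only an illustration of the inversion principle, not of the analytic argument; all names are our own.

```python
import cmath


def big_omega(n: int) -> int:
    """Number of prime factors of n counted with multiplicity."""
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    return count + (1 if n > 1 else 0)


def counts_via_roots_of_unity(X: int, M: int) -> list[int]:
    """Recover #{n <= X : Omega(n) = k} for 0 <= k < M by inverting
    S(z) = sum_{n <= X} z**Omega(n) at the M-th roots of unity."""
    omegas = [big_omega(n) for n in range(1, X + 1)]
    values = []
    for t in range(M):
        z = cmath.exp(2j * cmath.pi * t / M)
        values.append(sum(z ** w for w in omegas))
    counts = []
    for k in range(M):
        total = sum(values[t] * cmath.exp(-2j * cmath.pi * t * k / M)
                    for t in range(M))
        counts.append(round((total / M).real))
    return counts


def counts_directly(X: int, M: int) -> list[int]:
    counts = [0] * M
    for n in range(1, X + 1):
        counts[big_omega(n)] += 1
    return counts


print(counts_via_roots_of_unity(2000, 16))
```

Here $M=16$ suffices because $\Omega (n) \leq 10$ for $n\leq 2000$.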
To state the other two lemmas of this section, we first need to isolate the genus characters which play a special role. The genus characters for a quadratic field are the class group characters that take only the real values
$\pm 1$
. We will only need to know what these are in the case d prime, in which case the classification is as follows.
Proposition 8.3. Let d be an odd prime, and let D be the associated fundamental discriminant as in (3.5). Let
$K = \mathbf {Q}(\sqrt {D})$
.
-
1. If
$d \equiv 3 \ (\operatorname {mod}\, 4)$ , so
$D = -d$ , then there is only one genus character in
$\widehat {C}_K$ , namely the principal character
$\psi _0$ . The corresponding class group L-function is the Dedekind zeta-function of K, given by
$$ \begin{align*}L_K(s,\psi_0) = \zeta_K(s) = \zeta(s) L(s,\chi_{D}). \end{align*} $$
-
2. If
$d \equiv 1\ (\operatorname {mod}\, 4)$ , so
$D = -4d$ , then there are two genus characters in
$\widehat {C}_K$ : the principal character
$\psi _0$ , whose L-function is equal to the Dedekind zeta-function of K as above and a nontrivial genus character
$\psi _1$ . On prime ideals
$\mathfrak {p}$ ,
$\psi _1$ is given by
$\psi _1(\mathfrak {p}) = \chi _{-4}(N\mathfrak {p})$ if
$\mathfrak {p}$ lies above an odd prime, and
$\psi _1(\mathfrak {p}) = \chi _{d}(2)$ if
$\mathfrak {p}$ is the (unique, ramified) prime ideal above
$2$ . The corresponding L-function is given by
$L_K(s,\psi _1) = L(s,\chi _{-4}) L(s,\chi _{d})$ .
Proof. There is a bijective correspondence between genus characters of imaginary quadratic fields of discriminant D and factorizations
$D = D' \cdot D''$
into fundamental discriminants, with the decomposition
$D = 1 \cdot D$
being allowed, and with decompositions different only in the order of
$D', D''$
being considered equivalent. See [Reference Zagier19, Chapter 12, Satz 2] for a discussion of this, as well as a discussion of how to compute these factorisations in terms of the factorisation of D into prime discriminants. For us, the factorisations can easily be computed by hand: If
$d \equiv 3\ (\operatorname {mod}\, 4)$
, then there is only the trivial factorisation, whilst when
$d \equiv 1\ (\operatorname {mod}\, 4)$
, in which case
$D = -4d$
, we additionally have the factorisation in which
$D' = -4$
and
$D'' = d$
.
In [Reference Zagier19, Chapter 12, Satz 2], one may also find the description of a genus character
$\psi $
corresponding to a given factorisation
$D = D' \cdot D''$
: It is given on prime ideals by

(That this is well defined is part of the statement.) We have the Kronecker factorisation of L-functions

Specialising to the specific case
$D' = -4$
,
$D'' = d$
gives the stated result.
Lemma 8.4. Let N be large and k be an integer in the range (5.5). Let j be
$0$
or
$1$
, and let d and
${\widetilde d}$
be two distinct elements of
${\mathcal D}_j$
with D and
${\widetilde D}$
denoting the corresponding fundamental discriminants. Let
$\psi $
and
${\widetilde \psi }$
be characters of the class groups of
$K =\mathbf {Q}(\sqrt {D})$
and
${\widetilde K} = \mathbf {Q}(\sqrt {{\widetilde D}})$
respectively. Then

unless (i)
$\psi $
and
${\widetilde \psi }$
are the principal characters in their respective class groups, or (ii) both
$\psi $
and
${\widetilde \psi }$
are the nonprincipal genus character in their respective class groups (and this possibility occurs only in the case
$j=1$
).
Proof. Given two characters
$\psi \in {\widehat C}_K$
and
${\widetilde \psi } \in {\widehat C}_{\widetilde K}$
, we may find a class group character
$\Psi $
on the biquadratic field
$L=\mathbf {Q}(\sqrt {D}, \sqrt {{\widetilde D}})$
such that for all unramified primes p

The character
$\Psi $
is defined by setting

where
$N_{L/K}$
denotes the ideal norm from L to K (and similarly for
${\widetilde K}$
), and by Diao [Reference Diao7, Lemma 6], we may check that
$\Psi $
is nonprincipal except in the cases (i) and (ii) described in the lemma. (Precisely, Lemma 6 of Diao [Reference Diao7] shows that
$\Psi $
can be principal only if
$\psi $
and
${\widetilde \psi }$
are genus (or real) characters. The last remaining case when one of
$\psi $
or
${\widetilde \psi }$
is principal while the other equals a nonprincipal genus character is easily checked directly.) Therefore, we may write for any complex number z with
$|z|=1$

where
${\mathcal G}(s;z,j)$
is given by a suitable Euler product which converges absolutely in Re
$(s) \geq \frac 34$
and satisfies in that region

Since the discriminant of L is
$\ll \Delta ^4$
, using Lemma 8.1 and arguing exactly as in our proof of Lemma 8.2 we establish that

and then the bound of the lemma follows by an application of Fourier inversion.
Lemma 8.5. Let N be large and k be an integer in the range (5.5). Let j be
$0$
or
$1$
, and let d be an element of
${\mathcal D}_j$
with D denoting the corresponding fundamental discriminant. Let
$\psi $
and
${\widetilde \psi }$
be two characters of the class group of
$K =\mathbf {Q}(\sqrt {D})$
, and suppose that neither
$\psi $
nor
$\overline {\psi }$
is equal to
${\widetilde \psi }$
. Then

Proof. For an unramified prime p, we have

To see this, note that if p is inert, then
$r(p,\psi )=r(p,\widetilde {\psi })=r(p,\psi \widetilde \psi )=r(p,\overline {\psi } \widetilde \psi )=0$
, while if p splits as
${\mathfrak p}\overline {\mathfrak p}$
, then
$r(p,\psi )=\psi (\mathfrak {p}) + \overline {\psi }(\mathfrak {p})$
(and similarly for the other quantities) so that the stated relation follows with a little algebra. It follows that for any complex number z with
$|z|=1$
we may write

where
${\mathcal G}(s;z,j)$
is given by a suitable Euler product which converges absolutely in Re
$(s) \geq \frac 34$
and satisfies in that region

By hypothesis, both
$\psi \widetilde {\psi }$
and
$\overline {\psi } \widetilde {\psi }$
are nonprincipal characters of the class group of K, and therefore arguing exactly as in Lemma 8.2 and Lemma 8.4, we obtain the lemma.
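The "little algebra" in the split case of the proof of Lemma 8.5 can be checked numerically. We assume here that the relation in question is $r(p,\psi ) r(p,\widetilde \psi ) = r(p,\psi \widetilde \psi ) + r(p,\overline {\psi }\widetilde \psi )$, which is our reading based on the four quantities named in the proof. Writing $a = \psi (\mathfrak p)$ and $b = \widetilde \psi (\mathfrak p)$, both of absolute value 1, the sketch below measures the gap between the two sides. The function names are hypothetical.

```python
import cmath


def r_split(value_at_p: complex) -> complex:
    """r(p, psi) = psi(P) + conj(psi(P)) when p splits as P * P-bar."""
    return value_at_p + value_at_p.conjugate()


def identity_gap(a: complex, b: complex) -> float:
    """|r(p,psi) r(p,psi~) - r(p,psi psi~) - r(p,conj(psi) psi~)|
    with a = psi(P) and b = psi~(P)."""
    lhs = r_split(a) * r_split(b)
    rhs = r_split(a * b) + r_split(a.conjugate() * b)
    return abs(lhs - rhs)


h = 7  # illustrative order of the character values
worst = max(
    identity_gap(cmath.exp(2j * cmath.pi * s / h),
                 cmath.exp(2j * cmath.pi * t / h))
    for s in range(h) for t in range(h)
)
print(worst)
```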
9 Proof of Proposition 5.2
In this section, we prove Proposition 5.2. Using (3.4), we may write

By Lemma 8.2, the contribution of the nonprincipal characters is
$\ll N(\log N)^{-100}$
. Thus,

where
$\psi _0$
is the trivial character.
To understand the main term on the right-hand side of (9.1), we will again follow Selberg [Reference Selberg16] and introduce for any
$z \in \mathbf {C}$
with
$|z| = 1$
the Dirichlet series

which to begin with converges absolutely for
$\sigma =\text {Re}(s)>1$
and defines a holomorphic function there. As in Selberg’s work, we will find that
$\mathcal {F}$
can be understood in terms of the complex powers of
$\zeta $
and L-functions, thereby obtaining an analytic continuation of
${\mathcal F}$
to a wider region. The sum in (9.1) can be expressed in terms of a contour integral involving
${\mathcal F}(s)$
, which can then be evaluated using the analytic continuation of
${\mathcal F}$
and an argument involving a Hankel contour. Since we need to keep track of the uniformity in d, we give a self-contained account in Appendix A.
Let us turn to the details. We obtain an analytic continuation of
${\mathcal F}$
to a wider region by writing

Note that
$\zeta (s) L(s,\chi _D)$
is the Dedekind zeta-function of the quadratic field
$\mathbf {Q}(\sqrt {D})$
, and by
$(\zeta (s) L(s,\chi _D))^z$
, we mean
$\exp (z\log (\zeta (s) L(s,\chi _D)))$
, where the logarithm is initially defined in
$\sigma>1$
by an absolutely convergent Dirichlet series as in (8.2). Thus, (9.3) should be thought of as the definition of the function
${\mathcal G}(s;z,j)$
, which is holomorphic in the half-plane
$\sigma>1$
. We shall shortly see that
${\mathcal G}(s;z,j)$
is analytic in
$\sigma> \frac 12$
with suitable bounds in that region. By part (2) of Lemma 8.1, we may obtain an analytic continuation of
$\log ((s-1)\zeta (s) L(s,\chi _D))$
to the region
${\mathcal R}_0$
with corresponding bounds in the region
${\mathcal R}$
. In this way, we obtain a continuation of
${\mathcal F}(s;z,j)$
(essentially) to the region
${\mathcal R}$
, except that we must omit the real line segment to the left of
$s=1$
owing to the logarithmic singularity at
$s=1$
.
From the multiplicative nature of the definition of
$\mathcal {F}(s;z,j)$
, in the region
$\sigma>1$
we see that
${\mathcal G}(s;z,j)$
is given by an Euler product
$\prod _{p} {\mathcal G}_p(s;z,j)$
. We now describe these Euler factors. If
$p>W$
, we have

and since
$r(p,\psi _0) = 1+ \chi _D(p)$
and
$0\leq r(p^\ell ,\psi _0)\leq (\ell +1)$
it follows that

For
$3\leq p\leq W$
, from our choice (5.2), (5.3) of d, we have
$\chi _D(p)=1$
so that
$r(p^{\ell },\psi _0) = (1 \ast \chi _D)(p^{\ell }) = \ell +1$
. Thus,

As we have just seen, for the primes
$p\geq 3$
there is no dependence on j. By contrast, for
$p=2$
the behaviour is different in the two cases
$j=0$
(where
$\chi _D(2)=1$
) and
$j=1$
(where
$\chi _D(2)=0$
). Here, we find

From (9.6) and (9.7), note that for all
$p\leq W$
(and uniformly for
$|z| = 1$
)

From (9.5) and (9.8), we see that the Euler product
$\prod _{p} {\mathcal G}_p(s;z,j)$
, which was known initially to converge absolutely in
${\operatorname {Re}} s =\sigma>1$
, in fact converges absolutely for
$\sigma>\frac 12$
. Moreover, for
$\sigma \geq \frac 34$
we deduce that

Now, we apply Selberg’s method as explained in Appendix A. Specifically, by (A.7) and (A.8), we obtain

Applying the above with
$z= e^{i\theta }$
for
$-\pi \leq \theta \leq \pi $
and applying orthogonality (Fourier inversion), we deduce that

We now simplify the main term appearing in (9.11). Since
$1/\Gamma (z)$
is entire, uniformly for
$\theta \in [-\pi ,\pi ]$
we have

Now, from its definition in (9.4) we may see that for
$p>W$

Integrating this over
$\phi $
from
$0$
to
$\theta $
we obtain that, for
$\theta \in [-\pi , \pi ]$
,

Similarly, from (9.6) and (9.7) it follows that for all
$p\leq W$

Multiplying the relations in (9.13) and (9.14) over all primes, we conclude that

for some constant C (consider the cases
$|\theta | \leq (\log \log W)^{-1}$
and
$|\theta |> (\log \log W)^{-1}$
separately).
For later use, let us record the value of
${\mathcal G}(1;1,j)$
. For any prime
$p>W$
one may see using (9.4) and the identity
$\sum _{\ell = 0}^{\infty }\sum _{j = 0}^{\ell } x^j y^{\ell } = (1 - xy)^{-1}(1 - y)^{-1}$
with
$x = \chi _D(p)$
and
$y = p^{-s}$
that
${\mathcal G}_p(s;1,j) =1$
, while for
$3\leq p \leq W$
it follows from (9.6) that
${\mathcal G}_p(s;1,j) = 1-1/p^{s}$
, and lastly from (9.7) we see that
${\mathcal G}_2(1;1,j) = 2^{-j-1}(1-2^{-1})$
. Combining these observations, it follows that

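As a numerical sanity check on the geometric identity quoted above, the following minimal sketch compares a truncation of the double sum with the closed form $(1-xy)^{-1}(1-y)^{-1}$ at $y = p^{-s}$ and $x \in \{-1,0,1\}$ (the possible values of $\chi _D(p)$). The function names, the sample primes and the truncation length are illustrative assumptions and do not come from the paper.

```python
def double_sum(x, y, terms=200):
    """Truncation of sum_{l >= 0} sum_{i = 0}^{l} x^i y^l (hypothetical helper)."""
    total = 0.0
    for l in range(terms):
        # Inner sum equals (1 * chi_D)(p^l) when x = chi_D(p).
        inner = sum(x ** i for i in range(l + 1))
        total += inner * y ** l
    return total


def closed_form(x, y):
    """Right-hand side (1 - x y)^{-1} (1 - y)^{-1} of the identity."""
    return 1.0 / ((1.0 - x * y) * (1.0 - y))


def max_discrepancy(primes=(5, 7, 11, 13), s=1.0):
    """Largest gap between the two sides over sample primes and x in {-1, 0, 1}."""
    worst = 0.0
    for p in primes:
        y = p ** (-s)
        for x in (-1, 0, 1):
            worst = max(worst, abs(double_sum(x, y) - closed_form(x, y)))
    return worst
```

With $x=\chi _D(p)$ the closed form is the local factor at p of $\zeta (s)L(s,\chi _D)$, which is the cancellation behind the evaluation of the factors for $p>W$ at $z=1$.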
Using (9.12), (9.15) and (9.16) we obtain

where the main term arises upon noting that, for
$X> 0$
,

Note that
$\cos \theta \leq 1- \theta ^2/8$
for all
$|\theta |\leq \pi $
, and that (by the class number formula (9.22))
$L(1,\chi _D) \geq |D|^{-1/2} \geq (\log N)^{-1/2}$
so that
$L(1,\chi _D) \log N \geq (\log N)^{1/2}$
. Therefore (recalling that
$k_0 = \log \log N$
and that
$W = \log \log \log N$
)

so that the remainder term in (9.17) is seen to be

Using (9.17) and (9.18) in (9.11), we conclude that

where we absorbed the error term
$O_{\varepsilon }(N(\log N)^{\varepsilon - 1/2})$
in (9.11) into the error term above using
$L(1,\chi _D) \gg _{\varepsilon } |D|^{-\varepsilon } \gg (\log N)^{-\varepsilon }$
.
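The elementary inequality $\cos \theta \leq 1-\theta ^2/8$ on $[-\pi ,\pi ]$ used above is easy to confirm numerically. The sketch below evaluates the gap on a fine grid; the function name and grid size are illustrative choices, and a grid check is of course only a sanity check and not a proof.

```python
import math


def min_gap(steps=100_000):
    """Smallest value of (1 - theta^2/8) - cos(theta) on a uniform grid of [-pi, pi]."""
    smallest = float("inf")
    for i in range(steps + 1):
        theta = -math.pi + 2 * math.pi * i / steps
        smallest = min(smallest, 1 - theta * theta / 8 - math.cos(theta))
    return smallest
```

The gap vanishes only at $\theta = 0$ and is about $2-\pi ^2/8$ at the endpoints, so the inequality has room to spare away from the origin.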
We now claim that for k in the range (5.5) and all x with
$(\log N)^{-1/2} \leq x \leq k_0^4$
we have

To verify the claim, consider the following two cases: (i) when
$k_0^{-4} \leq x \leq k_0^4$
, and (ii) when
$(\log N)^{-1/2} \leq x \leq k_0^{-4}$
. In case (i) note that
$|\log x| = O(\log k_0)$
, and so the left-hand side of (9.20) is

so that the claim follows here. In case (ii), note that since
$k\geq 3k_0/4$
we have

so (9.20) holds in this case also.
Applying (9.20) with
$x=L(1,\chi _D)$
(which satisfies
$(\log N)^{-1/2} \leq L(1,\chi _D) \ll \log \log N$
), we see that

where the last estimate follows using the bound
$k_0^k/k! \ll k_0^{-1/2} \log N$
, which is a consequence of Stirling’s formula. Using this in (9.19), we conclude that

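The Stirling bound $k_0^k/k! \ll k_0^{-1/2}\log N$ quoted above can be checked numerically once one recalls that $\log N = e^{k_0}$ (since $k_0=\log \log N$). The following sketch, with a hypothetical helper name and sample values of $k_0$ chosen only for illustration, computes the maximum over k of $(k_0^k/k!)\, k_0^{1/2} e^{-k_0}$ using logarithms to avoid overflow.

```python
import math


def normalized_peak(k0):
    """max over k >= 0 of (k0^k / k!) * sqrt(k0) * exp(-k0), computed via logarithms."""
    best = 0.0
    # The maximum occurs near k = k0, so scanning up to 4*k0 + 10 is ample.
    for k in range(0, int(4 * k0) + 10):
        log_term = k * math.log(k0) - math.lgamma(k + 1) + 0.5 * math.log(k0) - k0
        best = max(best, math.exp(log_term))
    return best
```

For large $k_0$ the normalized peak approaches $1/\sqrt {2\pi }$, consistent with the implied constant in the bound being absolute.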
Using (9.21) in (9.1) and invoking the class number formula

we obtain (note also that
$\gamma _W \gg (\log W)^{-1} \gg _{\varepsilon } k_0^{-\varepsilon }$
)

10 Proof of Proposition 5.3
We turn now to the proof of Proposition 5.3. We first dispense with the case when
$d\not \equiv {\widetilde d} \ (\operatorname {mod}\, 8)$
which, recalling the definitions (5.2) and (5.3), can only happen in the case
$j = 1$
. Note that if
$d \equiv 1 \ (\operatorname {mod}\, 8)$
, then the integers n with
$n \equiv 2 \ (\operatorname {mod}\, 4)$
that are represented by
$x^2+dy^2$
satisfy
$n\equiv 2 \ (\operatorname {mod}\, 8)$
, while if
$d\equiv 5 \ (\operatorname {mod}\, 8)$
, then such integers n must be
$\equiv 6 \ (\operatorname {mod}\, 8)$
. Thus, if
$d \not \equiv {\widetilde d} \ (\operatorname {mod}\, 8)$
and if
$n \in \mathcal {A}_1(k)$
(which means that
$n \equiv 2 \ (\operatorname {mod}\, 4)$
), then n cannot be represented by both
$x^2 + dy^2$
and
$x^2 + \tilde d y^2$
. Since both
$d, \tilde d$
are
$1 \ (\operatorname {mod}\, 4)$
, it follows from Lemma 3.1 (1) that
$R_D(n)R_{\widetilde D}(n) =0$
, and so (5.10) follows.
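The congruence observation above (that $n\equiv 2 \ (\operatorname {mod}\, 4)$ represented by $x^2+dy^2$ forces $n \equiv 2$ or $6 \ (\operatorname {mod}\, 8)$ according to whether $d\equiv 1$ or $5 \ (\operatorname {mod}\, 8)$) can be confirmed by brute force. The sketch below is only an illustration; the function name, the search bound and the sample values of d are assumptions made for the example.

```python
def residues_mod_8(d, bound=40):
    """Residues mod 8 of n = x^2 + d*y^2 with n = 2 (mod 4), over 0 <= x, y < bound."""
    found = set()
    for x in range(bound):
        for y in range(bound):
            n = x * x + d * y * y
            if n % 4 == 2:
                found.add(n % 8)
    return found
```

For instance, `residues_mod_8(17)` and `residues_mod_8(13)` return disjoint sets, so no $n\equiv 2 \ (\operatorname {mod}\, 4)$ is represented by both $x^2+17y^2$ and $x^2+13y^2$.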
For the rest of the argument, we assume that
$d\equiv {\widetilde d} \ (\operatorname {mod}\, 8)$
with the goal now being to establish (5.9).
Using (3.4), we see that

We may now use Lemma 8.4 to estimate the contribution of all the characters
$\psi $
and
${\widetilde \psi }$
apart from when (i) both
$\psi $
and
${\widetilde \psi }$
are the principal characters in their respective class groups, and (ii) both
$\psi $
and
${\widetilde \psi }$
are the nontrivial genus characters in their class groups. Note that the second case only arises when
$j = 1$
, and since
$d \equiv {\widetilde d} \ (\operatorname {mod}\, 8)$
, we have here (with notation as in Proposition 8.3)

for all n. To see this, we use Proposition 8.3 and multiplicativity of
$r(n,\psi )$
to deduce that
$r(n,\psi _1) = \chi _{-4}( n) r(n,\psi _0)$
and similarly
$r(n, \widetilde {\psi }_1) = \chi _{-4}(n) r(n, \widetilde {\psi }_0)$
for odd n so that the stated relation holds for n odd. Further
$r(2^a, \psi _1) = \chi _d(2^a) r(2^a, \psi _0)$
and
$r(2^a, \widetilde {\psi }_1) = \chi _{{\widetilde d}}(2^a) r(2^a, \widetilde {\psi }_0)$
, and since
$\chi _d(2) = \chi _{\widetilde d}(2)$
(because
$d\equiv {\widetilde d} \ (\operatorname {mod}\, 8)$
) we see the stated relation for even n as well. Thus,

where

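To illustrate the relation $r(n,\psi _1) = \chi _{-4}(n) r(n,\psi _0)$ for odd n used above, consider the example $d=5$ (so $D=-20$), which is our choice and is not singled out in the paper. We assume the standard facts that the class group then has two classes, corresponding to the forms $x^2+5y^2$ and $2x^2+2xy+3y^2$, and that $\psi _1$ takes the value $+1$ on the first class and $-1$ on the second. Under these assumptions the relation amounts to the statement that odd integers represented by the first form are $1 \ (\operatorname {mod}\, 4)$ and those represented by the second are $3 \ (\operatorname {mod}\, 4)$, which the following sketch checks by brute force.

```python
def represented_odd_residues(form, limit=500, box=40):
    """Residues mod 4 of odd n in (0, limit] of the shape form(x, y) with |x|, |y| <= box."""
    residues = set()
    for x in range(-box, box + 1):
        for y in range(-box, box + 1):
            n = form(x, y)
            if 0 < n <= limit and n % 2 == 1:
                residues.add(n % 4)
    return residues


def principal(x, y):
    """Principal form of discriminant -20 (illustrative example only)."""
    return x * x + 5 * y * y


def nonprincipal(x, y):
    """The other reduced form of discriminant -20."""
    return 2 * x * x + 2 * x * y + 3 * y * y
```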
To evaluate the main term M above, we proceed analogously to the previous section using Selberg’s method. To highlight the close parallels with the earlier argument, we will use the same notation for the analogous Dirichlet series that arise here. The first step is to consider, for
$|z| = 1$
, the Dirichlet series

which converges absolutely for
$\sigma ={\operatorname {Re}}(s)>1$
and defines a holomorphic function there. As in the previous section, we shall obtain an analytic continuation of
${\mathcal F}$
to a wider region by writing

Note that
$\zeta (s)L(s,\chi _D) L(s,\chi _{\widetilde D}) L(s,\chi _{d\widetilde {d}})$
is the Dedekind zeta-function of the biquadratic field
$L={\mathbb Q}(\sqrt {D}, \sqrt {\widetilde D})$
, and

where the logarithm is initially defined in
$\sigma>1$
by an absolutely convergent Dirichlet series as in (8.2). Thus, (10.4) should be thought of as the definition of the function
${\mathcal G}(s;z,j)$
, which is holomorphic in the half plane
$\sigma>1$
. We shall see shortly that
${\mathcal G}(s;z,j)$
is analytic in
$\sigma> \frac 12$
with suitable bounds in that region. By Lemma 8.1 (2), we may obtain an analytic continuation of
$\log ((s-1) \zeta (s)L(s,\chi _D) L(s,\chi _{\widetilde D}) L(s,\chi _{d\widetilde {d}}))$
to the region
${\mathcal R}_0$
with corresponding bounds in the region
${\mathcal R}$
. In this way, we obtain a continuation of
${\mathcal F}(s;z,j)$
essentially to the region
${\mathcal R}$
, with the caveat that the real line segment to the left of
$s=1$
must be omitted owing to the logarithmic singularity at
$s=1$
.
From the multiplicative definition of
${\mathcal F}(s;z,j)$
, in the region
$\sigma> 1$
we see that
${\mathcal G}(s;z,j)$
is given by an Euler product
$\prod _{p} {\mathcal G}_p(s;z,j)$
. We continue as in Section 9 by describing these Euler factors and showing that the Euler product converges absolutely in
$\sigma>\frac 12$
(and assume below that
$\sigma>\frac 12$
).
In the case
$p>W$
, we have

Since
$0\leq r(p^\ell ,\psi _0) r(p^{\ell },\widetilde {\psi _0}) \leq (\ell +1)^2$
, and
$r(p,\psi _0) r(p,\widetilde {\psi _0}) = (1+ \chi _D(p))(1+\chi _{\widetilde D}(p)) = 1 +\chi _D(p) + \chi _{\widetilde D}(p) + \chi _{d{\widetilde d}} (p)$
(note that for odd p, we have
$\chi _D(p) \chi _{\widetilde D}(p) = (\frac {D}{p})(\frac {\widetilde D}{p}) = (\frac {D{\widetilde D}}{p}) = (\frac {d{\widetilde d}}{p}) = \chi _{d{\widetilde d}}(p)$
), we see that

Turning to the case
$3 \leq p \leq W$
, recall that
$n^{\flat }$
is the product of all the prime divisors of n which are
$\leq W$
, and recall also that
$\chi _D(p) = \chi _{\widetilde D}(p)=1$
from the definition of
$\mathcal {D}_0$
,
$\mathcal {D}_1$
(see (5.2) and (5.3)). Thus, (as in the last section) we see that

As in the last section, for primes
$p\geq 3$
there is no difference between the cases
$j=0$
and
$j=1$
, but at the prime
$p = 2$
, there is a distinction in the definition of
${\mathcal G}_2(s;z,j)$
. Here, we have (compare with (9.7))

upon noting that when
$j=0$
we have
$\chi _D(2) = \chi _{\widetilde D}(2) = \chi _{d {\widetilde d}}(2)= 1$
(since D,
${\widetilde D}$
and
$d{\widetilde d}$
are all
$1\ (\operatorname {mod}\, 8)$
), and that when
$j=1$
we have
$\chi _D(2) = \chi _{\widetilde D}(2) = 0$
and
$\chi _{d\widetilde d}(2) = 1$
(since
$d\equiv {\widetilde d} \ (\operatorname {mod}\, 8)$
so that
$d\widetilde d \equiv 1 \ (\operatorname {mod}\, 8)$
). From (10.6) and (10.7) note that for all
$p\leq W$

From (10.5) and (10.8), we see that the Euler product
$\prod _p {\mathcal G}_p(s;z,j)$
, which was known initially to converge absolutely for
$\sigma>1$
, in fact converges absolutely in
$\sigma> \frac 12$
. Moreover, for
$\sigma \geq \frac 34$
we deduce that

By Selberg’s method, specifically by (A.7) and (A.9), we obtain

where, here and below,

The main quantity of interest M (see (10.2)) may be recovered by Fourier inversion, integrating over
$z=e^{i\theta }$
with
$-\pi \leq \theta \leq \pi $
. Thus, analogously to (9.11) we find

Arguing as in (9.12), (9.13), (9.14), we find analogously to (9.15)

for a suitable constant C. Using this in our expression for M and arguing as in (9.17) and (9.18) we arrive at

Here, the error term
$O_{\varepsilon }(N (\log N)^{-1/2 + \varepsilon })$
has been absorbed into the error term above since the L-values at
$1$
are all
$\gg |D|^{-\varepsilon } \gg (\log N)^{-\varepsilon }$
.
Applying (9.20) with
$x= L_{d,\tilde d}$
(which satisfies
$(\log N)^{-\varepsilon } \ll L_{d,\tilde d} \ll k_0^3$
), we see that for k in the range (5.5)

Using this in (10.11), we conclude that

From (10.5), (10.6) and (10.7), we see that

(In fact, rather than use the crude bound (10.5) one may compute
$\mathcal {G}_p(1;1,j) = 1 - \chi _{d\tilde d}(p)p^{-2}$
for
$p> W$
, but we do not need this.) Using this together with the class number formula (9.22) and (10.12) in (10.1), (10.2), we obtain

where we have absorbed the error term
$O(N(\log N)^{-100})$
from (10.1) into the much larger error term above.
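The parenthetical evaluation of $\mathcal {G}_p(1;1,j)$ for $p>W$ can be sanity-checked numerically. The sketch below assumes (this is our reading of (10.4) at $z=1$, not a quotation) that the local factor is the series $\sum _{\ell } r(p^{\ell },\psi _0) r(p^{\ell },\widetilde {\psi _0}) p^{-\ell s}$ multiplied by the inverse of the local factor of the Dedekind zeta-function of the biquadratic field, and it treats only unramified primes, so that both character values are $\pm 1$. The function names and sample primes are hypothetical.

```python
def local_factor(a, b, y, terms=200):
    """(1-y)(1-a y)(1-b y)(1-a b y) * sum_{l >= 0} r_a(l) r_b(l) y^l, truncated.

    Here r_a(l) = sum_{i <= l} a^i stands in for r(p^l, psi_0) with a = chi_D(p),
    and similarly r_b(l) for the second discriminant.
    """
    series = 0.0
    for l in range(terms):
        r_a = sum(a ** i for i in range(l + 1))
        r_b = sum(b ** i for i in range(l + 1))
        series += r_a * r_b * y ** l
    return (1 - y) * (1 - a * y) * (1 - b * y) * (1 - a * b * y) * series


def worst_error(primes=(5, 7, 11, 13)):
    """Largest deviation from 1 - a*b*p^{-2} at s = 1 over sample primes and signs."""
    worst = 0.0
    for p in primes:
        y = 1.0 / p
        for a in (-1, 1):
            for b in (-1, 1):
                worst = max(worst, abs(local_factor(a, b, y) - (1 - a * b * y * y)))
    return worst
```

Here the product $ab$ plays the role of $\chi _{d\tilde d}(p)$, in line with the multiplicativity noted before (10.5).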
To complete the proof of Proposition 5.3, we multiply through by
$|D \tilde D|^{1/2} /\pi ^2 \gamma _W^2$
(noting that any extraneous factors of
$\gamma _W$
may be absorbed by
$k_0^{\varepsilon }$
terms) and, finally, use (5.8).
11 Proof of Proposition 5.4
In this final section of the main paper, we establish Proposition 5.4.
Using (3.4), we see that

Using Lemma 8.5, we may bound the contribution of terms with
${\widetilde \psi } \notin \{ \psi , \overline {\psi }\}$
by
$\ll N(\log N)^{-100}$
. It remains to treat the cases when
${\widetilde \psi } = \psi $
or
$\overline {\psi }$
. Note that if
${\mathfrak a}$
is an ideal of norm n, then so is
$\overline {\mathfrak a}$
and moreover
$(n)={\mathfrak a} \overline {\mathfrak a}$
. Therefore,
$\overline \psi ({\mathfrak a}) = {\psi }(\overline {\mathfrak a})$
and it follows that
$r(n,\psi ) = r(n, \overline {\psi })$
is real valued. Thus, when
$\widetilde \psi $
is
$\psi $
or
${\overline \psi }$
we have
$r(n,\psi ) r(n, {\widetilde \psi }) = r(n,\psi )^2$
, which is real and nonnegative. Collecting the observations so far, we find

The contribution of the real characters
$\psi $
(which are the genus characters, and there are at most two of them) to (11.1) is

since
$r(n,\psi _0) \leq 2^{\Omega (n)} \leq 2^k$
. Using (9.1) and Proposition 5.2, the above quantity is bounded by

By (5.8) and Stirling’s formula, this is

Now, we bound the contribution of the complex characters
$\psi $
in (11.1). For a complex character
$\psi $
, note that

where we take
$c= 1+ 1/\log N$
. By considering whether p does or does not split in
${\mathbb Q}(\sqrt {D})$
, we may check that for any unramified prime p

Since
$\psi $
is not real, note that
$\psi ^2$
is not principal. By comparing Euler products, we may therefore write

where
${\mathcal G}(s)$
is given by an Euler product which converges absolutely in the region
${\operatorname {Re}} s> \tfrac 12+ \delta $
(for any fixed $\delta > 0$) and is uniformly bounded in that region. Moving the line of integration in (11.3) to the line
${\operatorname {Re}} s=\frac {3}{4}$
, we see that this integral equals

To bound the integral above, we use the convexity bound for L-functions (see Chapter 5 of [Reference Iwaniec and Kowalski13], as well as (8.4) in the case of
$L(s,\psi ^2)$
) which gives

and

Noting further that
$|{\mathcal G}(\frac 34+it)| \ll 1$
, and
$|\Gamma (\frac 34+it)| \ll (1+|t|)^{1/4} e^{-\pi |t|/2}$
(see (C.19) of [Reference Montgomery and Vaughan14]), we may bound the integral on
${\operatorname {Re}} s =\frac {3}{4}$
in (11.4) by

With this in mind, and referring back to (11.3), (11.4), it follows that for complex
$\psi $
we have

Now,
$L(1,\psi ^2) \ll (\log |D|)^2$
by (8.4), and so we conclude that

where the last estimate follows since
$|D| \ll \log N$
and
$L(1,\chi _D) \gg |D|^{-\varepsilon }$
.
Thus, the contribution of the complex characters
$\psi $
to (11.1) is

Combining this with the contribution of the real characters given in (11.2), we conclude that

Proposition 5.4 follows upon multiplying through by
$|D|/\pi ^2 \gamma _W^2$
and using the class number formula together with the trivial bound
$\gamma _W^{-2} \ll \log |D|$
.
A Details of Selberg’s method
In this appendix, we supply proofs of the applications of Selberg’s method as used in Sections 9 and 10; namely, the asymptotic formulae (9.10) and (10.10). Let
${\mathcal F}(s;z,j)$
be defined either as in (9.2) or (10.3), and correspondingly let
${\mathcal G}(s;z,j)$
be defined as in (9.3) or (10.4). Define
$f(n)$
to be
$r(n,\psi _0)/\tau (n^{\flat })$
in the situation of Section 9, and
$f(n)$
to be
$r(n,\psi _0) r(n,\widetilde {\psi _0})/\tau (n^{\flat })^2$
in the situation of Section 10. Our goal is to obtain the stated asymptotic formulae for

where we take
$c = 1 +1/\log N$
.
We begin by truncating the integral in (A.1) to
$|{\operatorname {Im}} s| \leq (\log \log N)^2$
. Note that in both situations under consideration
$f(n)$
is nonnegative and bounded by
$\tau (n)^2$
so that

Since
$|\Gamma (c+it)| \ll (1+|t|)^{c-1/2} e^{-\pi |t|/2}$
by Stirling’s formula, we deduce that the tails of the integral in (A.1) contribute

Thus,

and note that the error term above is negligible compared to the error terms in the formulae (9.10) and (10.10) that we are seeking to establish.
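The exponential decay of $|\Gamma (c+it)|$ used above can be illustrated in the limiting case $c=1$, where the classical identity $|\Gamma (1+it)|^2 = \pi t/\sinh (\pi t)$ gives a closed form. The sketch below (helper names are ours, and $c=1$ is a simplifying assumption, since the paper takes $c=1+1/\log N$) compares this with the Stirling-type majorant $(1+|t|)^{1/2}e^{-\pi |t|/2}$.

```python
import math


def gamma_abs_on_line_one(t):
    """|Gamma(1 + it)| for t > 0, from |Gamma(1 + it)|^2 = pi*t / sinh(pi*t)."""
    return math.sqrt(math.pi * t / math.sinh(math.pi * t))


def stirling_ratio(t):
    """|Gamma(1 + it)| divided by (1 + t)^{1/2} * exp(-pi*t/2), for t > 0."""
    return gamma_abs_on_line_one(t) / (math.sqrt(1 + t) * math.exp(-math.pi * t / 2))
```

The ratio stays below $\sqrt {2\pi }$ for all $t>0$, so on this line the implied constant in the bound can be taken to be $\sqrt {2\pi }$.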
To proceed further with evaluating the truncated integral in (A.2), we will shift contours using a Hankel or a keyhole-type contour. As in (8.10), denote by
${\mathcal W}$
the region

and let
${\mathcal W}_*$
denote the domain
${\mathcal W}$
with the line segment from
$1-2 (\log N)^{-1/2}$
to
$1$
excised. In the region
${\mathcal W}_*$
, we define
$\log (s-1)$
to be the principal branch of the logarithm, taking real values for
$s\in (1,\infty )$
; if s lies just above the cut, then the argument of $s-1$ is
$\pi $
, while if s lies just below the cut, then the argument is
$-\pi $
. This gives a definition of
$(s-1)^{w} = \exp (w \log (s-1))$
(for any complex number w), which is holomorphic in
${\mathcal W}_*$
. Now, as in (9.3) or (10.4), we may write
${\mathcal F}(s;z,j) = \zeta _K(s)^z {\mathcal G}(s;z,j)$
, where K is either a quadratic (in the case of Section 9) or a biquadratic (in the case of Section 10) field. Here,
${\mathcal G}(s;z,j)$
extends to a holomorphic function in a region containing
${\mathcal W}$
, and throughout
${\mathcal W}$
it satisfies the bound
$|{\mathcal G}(s;z,j)| \ll \exp (W^{1/4})$
(see (9.9) or (10.9)). From Lemma 8.1 (2), we see that
$\zeta _K(s)^{z}$
extends to a holomorphic function on
${\mathcal W}_*$
, and for
$s \in {\mathcal W}_*$
with
$|{\operatorname {Im}}(s)| \geq 1$
we have

where we used (8.5) together with
$|D_K| \ll \Delta ^4 \leq (\log N)^4$
. Finally, again by the second part of Lemma 8.1, we see that
$((s-1) \zeta _K(s))^z$
extends to a holomorphic function in
${\mathcal W}$
, and satisfies for
$s\in {\mathcal W}$
with
$|{\operatorname {Im}}(s)|\leq 1$

Synthesizing the remarks above, we conclude that
${\mathcal F}(s;z,j)$
extends holomorphically to
${\mathcal W}_*$
, and for
$s\in {\mathcal W}_*$
with
$|{\operatorname {Im}}(s)| \geq 1$
satisfies

Moreover,
$(s-1)^z {\mathcal F}(s;z,j)$
extends holomorphically to
${\mathcal W}$
, and for
$s\in {\mathcal W}$
with
$|{\operatorname {Im}} (s)|\leq 1$
satisfies

We return now to the truncated integral in (A.2), which we will replace with an integral over the following Hankel-type contour.
This consists of
• $\Gamma _1$, the horizontal line segment from $c - i (\log \log N)^2$ to $1 - (\log N)^{-1/2} - i(\log \log N)^2$;
• $\Gamma _2$, the vertical line segment from $1 - (\log N)^{-1/2} - i(\log \log N)^2$ to $1 - (\log N)^{-1/2}$;
• $\Gamma _3$, which consists of a path $\Gamma _3^-$ going horizontally from $1-(\log N)^{-1/2}$ to $1-r$ staying just below the line ${\operatorname {Im}} s = 0$, then a circle $\Gamma _3^{\circ }$ of radius r about $s = 1$, then a horizontal path $\Gamma _3^+$ from $1-r$ back to $1 - (\log N)^{-1/2}$ but now staying just above the line ${\operatorname {Im}} s=0$ (here $r\leq 1/\log N$ is a parameter which we later allow to tend to $0$);
• $\Gamma _4$, the vertical line segment from $1 - (\log N)^{-1/2}$ to $1 - (\log N)^{-1/2} + i(\log \log N)^2$;
• $\Gamma _5$, the horizontal line segment from $1 - (\log N)^{-1/2} + i(\log \log N)^2$ to $c + i (\log \log N)^2$.
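To make the geometry of the contour concrete, the following sketch lists the endpoints of $\Gamma _1,\dots ,\Gamma _5$ for a given N and r. It is only an illustration: the function name and the representation of $\Gamma _3$ by its two endpoints on the cut are our own conventions.

```python
import math


def hankel_type_contour(N, r):
    """Endpoints (start, end) of Gamma_1, ..., Gamma_5, in the order they are traversed.

    Gamma_3 is recorded only by its endpoints on the cut; it runs out to 1 - r just
    below the real axis, around the circle of radius r about s = 1, and back.
    """
    log_n = math.log(N)
    c = 1 + 1 / log_n               # abscissa of the original vertical line
    T = math.log(log_n) ** 2        # truncation height (log log N)^2
    left = 1 - log_n ** -0.5        # abscissa 1 - (log N)^{-1/2} of the shifted line
    assert 0 < r <= 1 / log_n       # constraint on the radius of the small circle
    return [
        (complex(c, -T), complex(left, -T)),    # Gamma_1
        (complex(left, -T), complex(left, 0)),  # Gamma_2
        (complex(left, 0), complex(left, 0)),   # Gamma_3 (start and end on the cut)
        (complex(left, 0), complex(left, T)),   # Gamma_4
        (complex(left, T), complex(c, T)),      # Gamma_5
    ]
```

Each piece starts where the previous one ends, and the contour begins and ends at the endpoints $c \mp i(\log \log N)^2$ of the truncated vertical line in (A.2).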
Since the integrand
${\mathcal F}(s;z,j) \Gamma (s)N^s$
is holomorphic in
${\mathcal W}_*$
, we may replace the vertical contour from
$c - i (\log \log N)^{2}$
to
$c + i(\log \log N)^2$
by the Hankel-type contour
$\Gamma _1 \cup \Gamma _2 \cup \Gamma _3 \cup \Gamma _4 \cup \Gamma _5$
.
(Note that a limiting argument, which we suppress, is required to deal with the fact that
$\Gamma _3^{\pm }$
lie on the boundary of
$\mathcal {W}_*$
rather than within
$\mathcal {W}_*$
itself.) Denote, for
$\ell = 1,2,3,4,5$
,

To estimate the horizontal integrals
$I_1$
and
$I_5$
, we use (A.3) together with the exponential decay of
$|\Gamma (s)|$
. Thus, we obtain

The vertical integrals
$I_2$
and
$I_4$
are likewise easy to handle. If
$|t| \geq 1$
, then (A.3) gives
$|{\mathcal F}(1-(\log N)^{-1/2}+it;z,j)| \ll (\log N)^{\varepsilon }$
, while if
$|t| \leq 1$
, then from (A.4) we deduce that
$|{\mathcal F}(1-(\log N)^{-1/2}+it;z,j)| \ll (\log N)^{1/2+ \varepsilon }$
(here we take t to be either strictly positive or strictly negative, avoiding the point
$t=0$
). Combining these estimates with the bound
$|\Gamma (1-(\log N)^{-1/2} + it)| \ll e^{-|t|}$
, we obtain

It remains lastly to consider the integral
$I_3$
over the Hankel contour
$\Gamma _3$
. Set

Consider the circle centered at
$1$
with radius
$2(\log N)^{-1/2}$
. Since
$|\Gamma (s)|$
is bounded in this region, from (A.4) we see that
$|{\mathcal H}(s;z,j)| \ll (\log N)^{\varepsilon }$
. Therefore, if s is any point within a circle of radius
$(\log N)^{-1/2}$
centered at
$1$
, we see that

where we have used Cauchy’s formula and the fact that
$|w-s|$
and
$|w-1|$
are
$\gg (\log N)^{-1/2}$
. Thus,

Consider first the error term in (A.5). On the two horizontal parts of
$\Gamma _3$
, namely
$s= \sigma + 0^{\pm } i$
(depending on whether we are just above or just below the cut), we have
$|(s-1)^{-z}N^s| \ll (1-\sigma )^{-{\operatorname {Re}} z} N^{\sigma }$
so that these integrals contribute

Similarly, the (nearly) circular portion of
$\Gamma _3$
contributes

since
$r \leq 1/\log N$
. Thus, the error term in (A.5) is
$\ll N (\log N)^{{\operatorname {Re}} z- 3/2+ \varepsilon }$
.
Turning to the main term in (A.5), we claim that

To obtain (A.6), denote by
$\mathcal {H}$
(the Hankel contour) the contour obtained from
$\Gamma _3$
by extending both horizontal parts out to
$-\infty $
. The integral in (A.6) extended over
$\mathcal {H}$
is equal to
$\frac {1}{\Gamma (z)} N(\log N)^{z-1}$
, as follows from the standard Hankel integral [Reference Tenenbaum17, Theorem II.0.17] and a substitution. Now, note that

which is much smaller than
$N(\log N)^{-100}$
, and thus establishes the claim (A.6). Putting all this together gives

Combining this with our estimates for
$I_1$
,
$I_2$
,
$I_4$
and
$I_5$
, from (A.2) we conclude that

Finally, in the context of Section 9 note that (using (9.3), and since
$\lim _{s\to 1} (s-1) \zeta (s)= 1$
)

while in the context of Section 10 (using (10.4))

Acknowledgments
This work began at the 2022 Oberwolfach Analytic Number Theory meeting, and it is a pleasure to thank the Mathematisches Forschungsinstitut Oberwolfach for the stimulating working conditions. BG is supported by a Simons Investigator grant and is grateful to the Simons Foundation for their continued support. KS is supported in part by a Simons Investigator award from the Simons Foundation and a grant from the National Science Foundation.
Competing interest
None.