1. Introduction
For any integer a, let $\sigma_a(n)$ denote the sum of the ath powers of the divisors of n, that is, $\sigma_a(n) \,:\!=\, \sum_{d \mid n} d^a$ .
While the particular value of $\sigma_a(n)$ depends crucially on the divisibility properties of n, there are nevertheless many beautiful identities dating back to a 1916 paper of Ramanujan [ Reference Ramanujan18 ] relating additive convolutions of some of these functions to others. For positive integers a and b, let $S_{a,b}(n) \,:\!=\, \sum_{k=1}^{n-1} \sigma_a(k)\sigma_b(n-k)$ .
Perhaps the most well-known identity is
but Ramanujan establishes eight other exact identities of this type. He also establishes the asymptotic identity
At the top of the second page of his paper, however, Ramanujan remarks, “It seems very likely that (the main part of the asymptotic in (1·1)) is true for all positive (real) values of a and b, but this I am at present unable to prove.” This less well known conjecture of Ramanujan was established in 1927 by Ingham [ Reference Ingham9 ], and then with a power saving error term in 1957 by Halberstam [ Reference Halberstam6 ]. Halberstam later [ Reference Halberstam7 ] proved that if both parameters are small, in that they satisfy $a+b<1$ , then there is a secondary term given by a different expression in this asymptotic formula. This formula does not, however, recover the secondary term in Ramanujan’s formula (1·1), both owing to its different formulation and to the requirement that $a+b<1$ .
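The definitions of $\sigma_a(n)$ and $S_{a,b}(n)$ can be explored numerically. The following is a minimal sketch, not part of the original argument; the function names `sigma` and `S` are ours. As a sanity check it tests two classical exact identities of this type, for $(a,b)=(3,3)$ and $(a,b)=(1,1)$, cleared of denominators so that only integer arithmetic is used.

```python
def sigma(a, n):
    """Sum of the a-th powers of the positive divisors of n."""
    return sum(d ** a for d in range(1, n + 1) if n % d == 0)


def S(a, b, n):
    """Additive convolution: sum over k = 1..n-1 of sigma_a(k) * sigma_b(n - k)."""
    return sum(sigma(a, k) * sigma(b, n - k) for k in range(1, n))


for n in range(1, 31):
    # 120 * S_{3,3}(n) = sigma_7(n) - sigma_3(n)
    assert 120 * S(3, 3, n) == sigma(7, n) - sigma(3, n)
    # 12 * S_{1,1}(n) = 5 * sigma_3(n) - (6n - 1) * sigma_1(n)
    assert 12 * S(1, 1, n) == 5 * sigma(3, n) - (6 * n - 1) * sigma(1, n)
```

The brute-force evaluation costs roughly $n^2$ operations per value, which is adequate for small experiments but not for large n.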
In this paper we give another proof of the asymptotic in (1·1), improving upon the result by establishing lower-order terms in the asymptotic for many ranges of the parameters that recover Ramanujan’s secondary term. We begin with the following theorem on what is typically the largest of these lower order terms.
Theorem 1·1. If a and b are positive real numbers with $b>a \geqslant 1$ , then
Notice that when a is an odd integer $\geqslant 3$ , the secondary term in Theorem 1·1, which is $O\left(n^{b+1}\right)$ , actually vanishes, so Theorem 1·1 is consistent with (1·1) (which requires both parameters to be odd integers) but does not quite recover it. In fact, our proof shows that there are typically many lower order terms in the asymptotic formula for $S_{a,b}(n)$ , of orders $O\left(n^{b+1-m}\right)$ for non-negative integers $0 \leqslant m < ({b-a})/{2} + {7}/{4}$ . All of these terms but that of order $O\left(n^b\right)$ vanish if the smaller parameter a is an odd integer, and it is in fact this term that recovers Ramanujan’s secondary term.
Theorem 1·2. Let a and b be positive real numbers. If $b-a > 3/2$ , then
where $\mathrm{Res}({-}m)$ is given explicitly by (4·7). It satisfies $\mathrm{Res}({-}m) \ll n^{b-m}$ in general, and if a is an odd integer, then $\mathrm{Res}(0) = -({1}/{2})\zeta({-}a)\sigma_b(n)$ and $\mathrm{Res}({-}m)=0$ for each $m \geqslant 1$ .
In particular, when $a\geqslant 3$ is an odd integer and $b > a+3/2$ , Theorem 1·2 implies
recovering Ramanujan’s formula (1·1) but without requiring b to be an odd integer. Thus, Theorem 1·2 recovers and expands on the asymptotic formula for $S_{a,b}(n)$ available from the theory of modular forms. We note that when b is also an odd integer, it was conjectured by Ramanujan and proved by Deligne that the error term is of the form $O_{a,b,\epsilon}\left(n^{\frac{a+b}{2}+\frac{1}{2}+\epsilon}\right)$ . This improved error term is available only when b is an odd integer, however; we discuss possible improvements to the error term when b is not an odd integer in the final section of this paper.
The core of the paper is Section 4, where we state and prove a theorem subsuming Theorems 1·1 and 1·2. We first present in Section 3 a simple elementary proof of Ramanujan’s conjecture (with power saving error term) along similar lines as Halberstam [ Reference Halberstam6 ].
Also in this paper, in Section 2 we describe a problem in geometric topology which initially motivated our interest in this problem. In brief, the additive convolution $S_{1,2}(n)$ appears while counting primitive ramified degree n covers of the square torus (or in other words, square-tiled surfaces with n squares) with two ramification points. These surfaces can be classified according to their horizontal cylinder configurations. There are exactly four such configurations, and knowing the asymptotic for $S_{1,2}(n)$ , which already is difficult to find in the literature, enables us to compute asymptotic proportions of two of these four horizontal cylinder configurations.
2. Motivation from Geometric Topology
Our initial interest in studying additive convolutions of the kind $S_{a,b}$ arose from a counting problem in geometric topology. In order to describe succinctly where the additive convolution appears we begin with a brief exposition on translation surfaces and their moduli spaces.
2.1. Translation surfaces and their moduli spaces
A translation surface is a closed orientable surface obtained from the union of finitely many Euclidean polygons $\{\Delta_1, \dots, \Delta_n\}$ such that:
-
(i) the embedding of the polygons in $\mathbb{R}^2$ is fixed only up to translation;
-
(ii) the boundary of every polygon is oriented counterclockwise; and
-
(iii) for every $1 \leqslant j \leqslant n$ and for every oriented side $s_j$ of $\Delta_j$ , there exist $1 \leqslant \ell \leqslant n$ and an oriented side $s_\ell$ of $\Delta_\ell$ so that $s_j$ and $s_\ell$ are parallel, of equal length and of opposite orientation. The sides $s_j$ and $s_\ell$ are glued together by a parallel translation.
A few key things follow from the definition.
-
(i) The total angle around a vertex is $2 \pi (k+1)$ for some non-negative integer k. When $k > 0$ , we call the point a cone point.
-
(ii) We distinguish between two polygons when one is obtained from the other by a nontrivial rotation. However, two collections of polygons related by “cut, parallel transport, and paste” operations are considered equivalent; see Figure 1 for an example. Hence, translation surfaces come with a well-defined vertical direction.
Some basic examples of translation surfaces include an axis-parallel square with opposite sides identified to give a square torus and a regular octagon with opposite sides identified. One can also take two regular n-gons with n odd and identify opposite corresponding sides to form a translation surface. See Figure 2 for an example with $n=5$ . In general, the polygons need not be regular.
Translation surfaces also admit an alternate definition via complex analysis. Viewing the polygons as embedded in $\mathbb{C}$ , a translation surface has a complex structure with transition functions given by translations. The globally defined 1-form dz on $\mathbb{C}$ then induces a globally defined 1-form $\omega$ with zeroes exactly at the cone points. Hence, from the polygonal definition of a translation surface we obtain a pair $(X, \omega)$ where X is a Riemann surface and $\omega$ is a holomorphic 1-form. On the other hand, given such a pair $(X, \omega)$ one can also recover the polygonal definition using a geodesic triangulation of X satisfying the appropriate properties outlined in the polygonal definition. Therefore, a translation surface can also be thought of as a pair $(X, \omega)$ of a Riemann surface X equipped with a holomorphic 1-form $\omega$ . See [ Reference Masur14 ] for a more precise formulation of the equivalence of these two definitions of translation surfaces.
The genus of a translation surface is given by the classical Gauss–Bonnet theorem which relates the Euler characteristic of a surface with the total curvature. Since translation surfaces are built out of Euclidean polygons, they are flat everywhere except the cone points, and the Gauss–Bonnet theorem takes on a simpler form. Hence, a surface of genus g with m cone points of angles $2 \pi(\alpha_1+1), \dots, 2 \pi (\alpha_m +1)$ satisfies the relation $\sum_{i=1}^{m} \alpha_i = 2g-2$ .
The angle data around the cone points can be recorded in a vector $\alpha = (\alpha_1, \dots, \alpha_m)$ , where m is the number of cone points and $2\pi(\alpha_i+1)$ are the cone angles defined as above. The collection of translation surfaces sharing the same angle data is called a stratum and is denoted $\mathcal{H}(\alpha)$ .
For any $\alpha$ that is an integer partition of an even number, $\mathcal{H}(\alpha)$ can be given the structure of a complex orbifold. The main idea is that given $(X, \omega) \in \mathcal{H}(\alpha_1, \dots, \alpha_m)$ , we can fix a basis $\rho_1, \dots, \rho_{2g+m-1}$ for the first homology $H_1\left(X, \left\{P_1, \dots, P_m\right\};\,\mathbb{Z}\right)$ relative to the cone points. We can then get a map
\begin{equation} (X, \omega) \mapsto \left(\int_{\rho_1} \omega, \dots, \int_{\rho_{2g+m-1}} \omega\right) \in \mathbb{C}^{2g+m-1}. \tag{2·1}\end{equation}
These are called period coordinates for $\mathcal{H}(\alpha)$ . The period coordinates serve as local coordinates via which it can be shown, as in [ Reference Masur13 , Reference Veech22 , Reference Veech23 ], that the strata are complex orbifolds of dimension $2g+m-1$ , where g is the genus of the translation surface with cone point data $(\alpha_1, \dots, \alpha_m)$ . Kontsevich and Zorich [ Reference Kontsevich and Zorich10 ] classified the connected components of $\mathcal{H}(\alpha)$ for all $\alpha$ . In particular, any $\mathcal{H}(\alpha)$ can have at most 3 connected components. Moreover, any stratum admits an $\mathrm{SL}_2(\mathbb{R})$ action: given a translation surface built out of polygons $\{\Delta_i\}$ , its image under $A \in \mathrm{SL}_2(\mathbb{R})$ is simply the translation surface $\{A \cdot \Delta_i\}$ where A acts on the polygons linearly.
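As a small illustration of the bookkeeping above, the genus and the dimension $2g+m-1$ of a stratum can be read off from the angle data. The sketch below assumes the standard flat Gauss–Bonnet relation $\alpha_1 + \dots + \alpha_m = 2g-2$ ; the function names are ours.

```python
def genus(alpha):
    """Genus g determined by the angle data via sum(alpha_i) = 2g - 2 (assumed relation)."""
    total = sum(alpha)
    if any(a < 1 for a in alpha) or total % 2 != 0:
        raise ValueError("angle data must be positive integers with even sum")
    return total // 2 + 1


def stratum_dimension(alpha):
    """Complex dimension 2g + m - 1 of the stratum H(alpha), with m cone points."""
    return 2 * genus(alpha) + len(alpha) - 1


for alpha in [(2,), (1, 1), (4,), (2, 2)]:
    print(alpha, "genus", genus(alpha), "dimension", stratum_dimension(alpha))
```

For the two genus two strata discussed in this paper, this gives dimension 4 for $\mathcal{H}(2)$ and dimension 5 for $\mathcal{H}(1,1)$ .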
2.2. Volume in $\mathcal{H}(\alpha)$
The period coordinates can also be used to define a volume form on $\mathcal{H}(\alpha)$ . Consider the linear volume form on $\mathbb{C}^{2g+m-1}$ , normalised so that the fundamental domain of the integer lattice $(\mathbb{Z}+i\mathbb{Z})^{2g+m-1}$ has volume 1. The pullback of this volume form under the period map gives what is popularly called the Masur–Veech volume form on $\mathcal{H}(\alpha)$ . Furthermore, this induces a volume form on $\mathcal{H}_1(\alpha)$ , the set of translation surfaces in $\mathcal{H}(\alpha)$ of area 1 (i.e. collections of surfaces with total Euclidean area of the polygons 1). The measure of $\mathcal{H}_1(\alpha)$ with respect to this induced volume form has been shown to be finite for any $\alpha$ , independently by Masur [ Reference Masur13 ] and Veech [ Reference Veech22 ].
Twenty years later, Eskin and Okounkov [ Reference Eskin and Okounkov4 ] computed the volume of these strata, $\mathcal{H}_1(\alpha)$ . They counted a particular type of translation surface called square-tiled surfaces (STSs), which are exactly those translation surfaces in which the polygons are axis-parallel Euclidean unit squares. Alternatively, they are exactly those translation surfaces $(X , \omega)$ such that their image under the period map (2·1) is in $(\mathbb{Z} + i \mathbb{Z})^{2g+m-1}$ . In this manner, STSs have a lattice-like structure in the space of translation surfaces and can be thought of as “integer points” of strata. Topologically, STSs are also thought of as branched covers of the standard square-torus with branching over exactly one point.
The idea of the volume computation is motivated by the following simple case. To compute the surface area of a body in $\mathbb{R}^n$ , one can consider a large dilate of the body by $R > 1$ , and count the integer points inside. Asymptotically, the number of such integer points would be $c\cdot R^n$ since $\mathbb{R}^n$ is n-dimensional. The surface area of the body is then given by
To compute the volume of $\mathcal{H}_1(\alpha)$ , one applies the same technique. Applying a homothety to the codimension 1 subset $\mathcal{H}_1(\alpha)$ by n, we get the set of translation surfaces of area n. The integer points within this dilated region in $\mathcal{H}(\alpha)$ are STSs with at most n squares. The asymptotics of this count then yield the volume of $\mathcal{H}_1(\alpha)$ .
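The lattice-point heuristic can be seen in the simplest possible setting. The sketch below is a toy illustration only, for the unit disc in $\mathbb{R}^2$ rather than for a stratum: it counts the integer points in the dilate of radius R and recovers the constant c in the count $c \cdot R^2$ , which here is $\pi$ . The function name is ours.

```python
import math


def lattice_points_in_disc(R):
    """Count integer points (x, y) with x^2 + y^2 <= R^2, for a non-negative integer R."""
    count = 0
    for x in range(-R, R + 1):
        # For this x, the admissible y satisfy |y| <= floor(sqrt(R^2 - x^2)).
        y_max = math.isqrt(R * R - x * x)
        count += 2 * y_max + 1
    return count


for R in (10, 100, 1000):
    c_estimate = lattice_points_in_disc(R) / R ** 2
    print(R, c_estimate, abs(c_estimate - math.pi))
```

For the unit ball, differentiating $c \cdot R^n$ at $R=1$ gives $n \cdot c$ , the measure of the unit sphere; in this toy case that is the circumference $2\pi$ .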
2.3. Connections to Number Theory
Using the volume computation heuristic described above, Zorich [ Reference Zorich25 ] computed the volume of the first few strata by hands-on counting and obtained
In general, Eskin and Okounkov [ Reference Eskin and Okounkov4 ] showed that the volume of $\mathcal{H}_1(\alpha)$ is given by
where $|\alpha| = \sum \alpha_i$ , and the $\mathcal{C}_d$ are the coefficients of a certain generating function $ \mathcal{C}(\alpha) = \sum_{d =1}^\infty \mathcal{C}_d(\alpha) q^d $ which they proved to be a quasimodular form, i.e., a polynomial in the Eisenstein series $G_k(q)$ for $k = 2, 4, 6$ . Consequently, they showed that
for any stratum $\mathcal{H}(\alpha)$ of genus g translation surfaces.
Since Eskin and Okounkov’s volume computations, various counting problems have received much attention in the study of STSs, including the enumeration of primitive square-tiled surfaces, i.e. those STSs whose covering of the square torus does not factor through another STS. In some ways this problem is analogous to counting primitive vectors in $\mathbb{Z}^n$ .
In 2006, Hubert and Lelievre [ Reference Hubert and Lelievre8 ] and McMullen [ Reference McMullen15 ] proved that primitive n-square STSs in $\mathcal{H}(2)$ partition into at most two orbits under the linear action of $\mathrm{SL}_2(\mathbb{Z})$ (induced by the linear action of $\mathrm{SL}_2(\mathbb{R})$ ). Subsequently, Lelievre and Royer [ Reference Lelievre and Royer12 ] obtained orbit-wise counting of primitive n-square STSs for odd n in $\mathcal{H}(2)$ . In the computation, they obtained and used closed forms of sums of the type
Note that $S_{1,1}^1 = S_{1,1}$ as defined above, the convolution of $\sigma_1$ with itself. For $k=2, 4$ and $n\geqslant 1$ , they obtained
They were able to express these sums as linear combinations of sums of powers of divisors using the fact that the spaces of quasimodular forms on congruence subgroups such as $M_4[\Gamma_0(4)]$ and $M_2[\Gamma_0(2)]$ are finite dimensional. Notably, however, since the generating functions for $\sigma_a$ for a even are odd weight Eisenstein series, the analysis of the convolution of $S_{a,b}$ for even a resists the theory of quasimodular forms, and hence we use alternate methods to understand the asymptotics of such sums.
We now describe the specific problem in the enumeration of STSs that motivated us to study $S_{a,b}$ for even a.
Every STS can be viewed as a union of horizontal square-tiled cylinders glued together. One way to analyse an STS in a given stratum is to categorise its horizontal cylinder decomposition type, popularly termed its cylinder diagram, which describes how many horizontal cylinders make up the surface and in what ways they are glued together.
In particular, STSs in $\mathcal{H}(1,1)$ (translation surfaces of genus two with two cone points) partition into exactly 4 cylinder diagrams. Figure 3 shows prototypical examples of surfaces in the 4 cylinder diagrams named A, B, C and D in $\mathcal{H}(1,1)$ .
The counting problem in question is to enumerate, given a fixed n, the number of primitive STSs in $\mathcal{H}(1,1)$ in each of the four cylinder diagrams and find the asymptotic density of each of them. For example, let the number of primitive n-square surfaces in $\mathcal{H}(1,1)$ with diagram D be D(n). The second author proved in [ Reference Shrestha21 ] that
where $J_k(n) \,:\!=\, n^k \prod_{p | n} \left(1 - \dfrac{1}{p^k}\right)$ is the Jordan totient function of order k, $\mu$ is the Möbius function and $*$ is Dirichlet convolution. Using Theorem 3·1, the second author proved that surfaces with diagram D have asymptotic density $1- \dfrac{\zeta(2)\zeta(3)}{2\zeta(5)} \approx 0.047$ . For similar formulae and asymptotic densities concerning the other diagrams A, B and C, see [ Reference Shrestha21 , theorem 1·1].
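The Jordan totient function and the density constant quoted above are easy to evaluate. The following sketch is illustrative only: the helper names are ours, and `zeta` is a simple truncated series with an Euler–Maclaurin tail correction rather than a library routine.

```python
def prime_factors(n):
    """Return the set of distinct prime divisors of n."""
    primes, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            primes.add(p)
            n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes


def jordan_totient(n, k):
    """J_k(n) = n^k * prod over primes p | n of (1 - p^(-k)), in exact integer arithmetic."""
    result = n ** k
    for p in prime_factors(n):
        result = result // p ** k * (p ** k - 1)
    return result


def zeta(s, N=1000):
    """Approximate zeta(s) for real s > 1: partial sum plus integral and half-term tail."""
    head = sum(k ** -s for k in range(1, N))
    return head + N ** (1 - s) / (s - 1) + 0.5 * N ** -s


density_D = 1 - zeta(2) * zeta(3) / (2 * zeta(5))
print(density_D)
```

With $k=1$ the function `jordan_totient` reduces to Euler's totient, and summing $J_k(d)$ over the divisors d of n returns $n^k$ , which gives a quick consistency check.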
An analogous problem for the other genus two stratum $\mathcal{H}(2)$ was solved by Zmiaikou [ Reference Zmiaikou24 ]. Complete results for strata of genus 3 and above are not known, although the density of one-cylinder surfaces (not necessarily primitive) has been computed by Delecroix–Goujard–Zograf–Zorich [ Reference Delecroix, Goujard, Zograf and Zorich2 ].
3. Proof of Theorem 3·1
For the reader’s convenience, we begin with a short proof of Ramanujan’s conjecture, along similar lines to Halberstam [ Reference Halberstam6 ]:
Theorem 3·1. For any positive real numbers a and b, as $n\to\infty$ there holds
where
and $\kappa$ is 2 if $a = b = 1$ , 1 if $a = 1$ and $b \neq 1$ or vice versa, and zero otherwise.
The theorem also holds if a and b are complex numbers with positive real part, in which case replace a and b by their real parts everywhere in the error terms and inequalities.
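The quality of the main term can be examined numerically. The sketch below assumes, following the proof in this section, that the main term is the beta-function factor $\Gamma(a+1)\Gamma(b+1)/\Gamma(a+b+2)$ times $\zeta(a+1)\zeta(b+1)/\zeta(a+b+2)$ times $\sigma_{a+b+1}(n)$ ; the function names and the truncated-series `zeta` are ours.

```python
import math


def zeta(s, N=1000):
    """Approximate zeta(s) for real s > 1: partial sum plus integral and half-term tail."""
    head = sum(k ** -s for k in range(1, N))
    return head + N ** (1 - s) / (s - 1) + 0.5 * N ** -s


def sigma_table(a, n):
    """table[k] = sigma_a(k) for 0 <= k <= n (table[0] is unused), built by a divisor sieve."""
    table = [0] * (n + 1)
    for d in range(1, n + 1):
        for multiple in range(d, n + 1, d):
            table[multiple] += d ** a
    return table


def S(a, b, n):
    """Additive convolution: sum over k = 1..n-1 of sigma_a(k) * sigma_b(n - k)."""
    sa, sb = sigma_table(a, n), sigma_table(b, n)
    return sum(sa[k] * sb[n - k] for k in range(1, n))


def main_term(a, b, n):
    """Assumed main term: beta factor * zeta quotient * sigma_{a+b+1}(n)."""
    beta = math.gamma(a + 1) * math.gamma(b + 1) / math.gamma(a + b + 2)
    zeta_quotient = zeta(a + 1) * zeta(b + 1) / zeta(a + b + 2)
    return beta * zeta_quotient * sigma_table(a + b + 1, n)[n]


def relative_gap(a, b, n):
    """(S_{a,b}(n) - main term) / main term."""
    return S(a, b, n) / main_term(a, b, n) - 1


for n in (50, 200, 800):
    print(n, relative_gap(1, 1, n), relative_gap(1, 2, n))
```

For $a=b=1$ the gap is of relative size roughly $(\log n)/n$ , in line with the error term of order $n^{a+b}$ up to logarithms.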
We begin with two lemmas.
Lemma 3·2. For any integer n and residue class $k \,\left(\mathrm{mod}\,{m}\right)$ , we have
Proof. (Sketch) As in [ Reference Halberstam6 ], we rewrite the sum in (3·2) as $n^{a + b} \sum_{j = 0}^{r-1} f\left(\alpha_0 + j\alpha\right)$ , where $f(t) \,:\!=\, t^a (1 - t)^b$ , for some $\alpha_0$ and r satisfying $0 \leqslant \alpha_0 < \alpha$ and $\left| r - \alpha^{-1} \right| < 1$ . After a change of variables, we recognise this as a Riemann sum approximation to the integral defining the beta function, yielding the result.
Lemma 3·3. We have, as a formal identity of Dirichlet series,
Proof. This follows by rewriting the left-hand side as
Proof of Theorem 3·1. We rewrite $S_{a, b}(n)$ in the form
If $(d, e) \nmid n$ then the inner sum vanishes. Otherwise, the divisibility conditions are equivalent to demanding that $k \equiv k_0 \,\left(\mathrm{mod}\,{({de}/{(d, e)})}\right)$ for some $k_0$ , and by Lemma 3·2 the inner sum equals
so that
Assuming for now that $a, b > 1$ , the error term of $O\left(n^{-1}\right)$ above contributes an error bounded by
The sum in the main term of (3·4) is equal to
By Lemma 3·3 the sum over i and j above is $\zeta(a+1)\zeta(b+1)/\zeta(a+b+2)$ , while the sum over w may be identified as $n^{-a-b-1}\sigma_{a+b+1}(n)$ . Assembling this in (3·4), we obtain Theorem 3·1 with an error of $O\left(n^{a + b}\right)$ in the case that $a,b>1$ .
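The zeta quotient appearing here can be checked numerically. The sketch below assumes that the relevant sum is the standard one over coprime pairs, namely $\sum_{(i,j)=1} i^{-s} j^{-t} = \zeta(s)\zeta(t)/\zeta(s+t)$ with $s=a+1$ and $t=b+1$ ; this is our reading of the identity, not a transcription of Lemma 3·3. For $s=t=2$ the right-hand side equals $5/2$ .

```python
from math import gcd


def coprime_pair_sum(s, t, N):
    """Truncated sum of i^(-s) * j^(-t) over coprime pairs 1 <= i, j <= N."""
    return sum(
        i ** -s * j ** -t
        for i in range(1, N + 1)
        for j in range(1, N + 1)
        if gcd(i, j) == 1
    )


# zeta(2)^2 / zeta(4) = (pi^4 / 36) / (pi^4 / 90) = 5/2
for N in (50, 100, 300):
    print(N, coprime_pair_sum(2, 2, N), 2.5)
```

The truncated sums increase towards the limit, since all terms are positive; the omitted tail is of size $O(1/N)$ for $s=t=2$ .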
If $a \leqslant 1$ and $b \geqslant 1$ , then in (3·5) the error term is $\ll n^{a + b + 1 - a} (\log n)^{\kappa}$ , where $\kappa$ is 2 if $a = b = 1$ , 1 if $a = 1$ and $b \neq 1$ or vice versa, and zero otherwise. If $a \geqslant 1$ and $b < 1$ , then the error is similarly $\ll n^{a + b + 1 - b} (\log n)^{\kappa}$ .
If instead $a, b < 1$ , take the sum in (3·5) only through $d \leqslant D$ and $e \leqslant E$ , making an error $\ll n^{a + b} D^{1 - a} E^{1 - b}$ . Rewriting (3·3) in the form
the contribution from $d > D$ is $O\left(n^{a + b + 1 + \epsilon} D^{-a}\right)$ , and the contribution from $e > E$ is similarly $O\left(n^{a + b + 1 + \epsilon} E^{-b}\right)$ . We therefore make a total error
Equating the parameters by choosing $D = n^{\frac{b}{b+a-ab}}$ and $E = n^{\frac{a}{b+a-ab}}$ , we obtain an error term
This yields Theorem 3·1 in the remaining cases.
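The choice of D and E can be verified by comparing exponents of n. The sketch below ignores the $\epsilon$ in the exponents and uses exact rational arithmetic; the function name is ours. It returns the exponents of the three error contributions $n^{a+b}D^{1-a}E^{1-b}$ , $n^{a+b+1}D^{-a}$ and $n^{a+b+1}E^{-b}$ under the stated choice of D and E.

```python
from fractions import Fraction


def balanced_exponents(a, b):
    """Exponents of n in the three error terms when D = n^(b/(a+b-ab)), E = n^(a/(a+b-ab))."""
    a, b = Fraction(a), Fraction(b)
    denom = a + b - a * b
    log_D = b / denom  # D = n ** log_D
    log_E = a / denom  # E = n ** log_E
    truncation = a + b + (1 - a) * log_D + (1 - b) * log_E
    tail_d = a + b + 1 - a * log_D
    tail_e = a + b + 1 - b * log_E
    return truncation, tail_d, tail_e


print(balanced_exponents(Fraction(1, 2), Fraction(1, 3)))
```

All three exponents coincide and equal $a+b+1-ab/(a+b-ab)$ , which is what "equating the parameters" means here.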
4. Main theorem and proof
Again, for notational simplicity we assume that b and a are both real; if not, replace b and a with $\text{Re}(b)$ and $\text{Re}(a)$ in all inequalities and error estimates. We also assume without loss of generality that $b \geqslant a$ (i.e., that $\text{Re}(b) \geqslant \text{Re}(a)$ if these quantities are complex).
To motivate our strategy, in place of $\sum_{k = 1}^{n - 1} \sigma_a(k) \sigma_b(n - k)$ , consider the problem of estimating the simpler sum $\sum_{k = 1}^{n - 1} \sigma_a(k) (n - k)^b$ . The factor $(n - k)^b$ appears to complicate matters, but via the theory of Riesz means and Mellin transforms it may be interpreted as a smoothing factor that helps in evaluating the sum.
In particular, we have the following familiar formula.
Lemma 4·1. We have, for any Dirichlet series $\sum_k a(k) k^{-s}$ and any complex number b with $\text{Re}(b) > 0$ , the formula
where the contour is over any vertical line where the Dirichlet series converges uniformly and absolutely.
Proof. Switching the order of integration and summation, this reduces to the formula
for which see [ Reference Gradshteyn and Ryzhik5 , 17.43.22]. (It may be proved by shifting the contour infinitely far to the right or left as appropriate, and evaluating the sum of residues in the latter case.)
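The residue computation mentioned in the proof can be illustrated numerically. The sketch below assumes that the kernel in the formula is the standard Riesz-mean kernel $\Gamma(b+1)\Gamma(s)/\Gamma(b+1+s)\, y^s$ ; this is our assumption about the elided display, not a transcription of it. Under that assumption the residue at $s=-m$ is $({-}1)^m \binom{b}{m} y^{-m}$ , and for $y>1$ the residues should sum to $(1-1/y)^b$ .

```python
def residue_series(b, y, terms=80):
    """Sum over m < terms of (-1)^m * C(b, m) * y^(-m), the residues at s = -m (assumed kernel)."""
    total = 0.0
    coeff = 1.0  # generalised binomial coefficient C(b, m), starting at m = 0
    for m in range(terms):
        total += coeff * (-1.0 / y) ** m
        coeff *= (b - m) / (m + 1)  # C(b, m + 1) = C(b, m) * (b - m) / (m + 1)
    return total


for b, y in [(2.5, 2.0), (0.5, 3.0), (3.0, 1.5)]:
    print(b, y, residue_series(b, y), (1 - 1 / y) ** b)
```

When b is a non-negative integer the coefficients vanish for $m>b$ and the series is a finite binomial expansion.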
Our aim will be to first manipulate our sum into something resembling (4·1), where the Dirichlet series $\sum a(k) k^{-s}$ can be expressed in terms of zeta functions and therefore enjoys analytic continuation to $\mathbb{C}$ . As is familiar in various analytic number theory contexts, this will then allow us to shift the integral in (4·1) to the left.
Now, we have
where the integral is taken over the vertical line with $\text{Re}(s) = a + 2$ .
For any real $x>0$ , let $\zeta(s,x)$ be the Hurwitz zeta function, defined for $\text{Re}(s)>1$ by the Dirichlet series $\zeta(s, x) \,:\!=\, \sum_{n = 0}^{\infty} (n + x)^{-s}$ .
We note that
Thus, we conclude that
The main aim of this section is to prove the following theorem, essentially a restatement of Theorems 1·1 and 1·2.
Theorem 4·2. Let a and b be positive real numbers.
(i) If $b-a > 3/2$ , then
where $\mathrm{Res}({-}m)$ denotes the residue of the integrand of (4·3) at $s=-m$ , and is given explicitly by (4·7). It satisfies $\mathrm{Res}({-}m) \ll n^{b-m}$ in general, and if a is an odd integer, then $\mathrm{Res}(0) = -({1}/{2})\zeta({-}a)\sigma_b(n)$ and $\mathrm{Res}({-}m)=0$ for each $m \geqslant 1$ .
(ii) If $\max\{a,2-a\} < b \leqslant a + {3}/{2}$ , then
After recalling some analytic facts about the Hurwitz zeta function, we begin by analysing the poles and residues of the integrand. This constitutes an analysis of the main terms provided in Theorem 4·2. We then bound the error terms in Theorem 4·2 by means of the functional equation for Hurwitz zeta functions. This has the net effect of replacing the summation of Hurwitz zeta functions by a Dirichlet series whose coefficients are certain Kloosterman sums. This also implicitly gives another evaluation of the residual terms $\mathrm{Res}({-}m)$ .
Finally, we note that we can obtain the secondary term in a simpler fashion, with no Kloosterman sums, when $a > 1$ and $b > a + 2$ . We explain this in Section 4.4.
4.1. Properties of the Hurwitz zeta function
The following lemma recalls some basic properties of the Hurwitz zeta function. For proofs, see [ Reference Apostol1 ].
Lemma 4·3. For any real $x > 0$ and $\text{Re}(s) > 1$ , the Hurwitz zeta function $\zeta(s, x) \,:\!=\, \sum_{n = 0}^{\infty} (n + x)^{-s}$ satisfies the following:
-
(i) (Analytic continuation) $\zeta(s, x)$ has analytic continuation to all of $\mathbb{C}$ , with a simple pole at $s = 1$ with residue 1, and holomorphic elsewhere;
-
(ii) (Functional equation) $\zeta(s, x)$ satisfies a functional equation, which for $x = e/d$ rational and $\text{Re}(s) < 0$ can be written
\begin{equation} \zeta(1-s,e/d) = \frac{\Gamma(s)}{(2\pi)^s} \left( e^{\pi i s/2} \sum_{k \geqslant 1} \frac{e^{-2\pi i ke/d}}{k^s} + e^{-\pi i s/2} \sum_{k \geqslant 1} \frac{e^{2\pi i ke/d}}{k^s}\right); \tag{4·4}\end{equation}
-
(iii) (Evaluation at negative integers) For integer values $k \geqslant 0$ , there is the special value
\begin{equation}\zeta({-}k, x) = \frac{-1}{k+1} B_{k+1}(x), \tag{4·5}\end{equation}
where $B_{k+1}(x)$ denotes the degree $k+1$ Bernoulli polynomial.
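The special values in (4·5) can be computed exactly. The following sketch uses the standard recurrence for Bernoulli numbers (with $B_1 = -1/2$ ) and rational arithmetic; the function names are ours. It also exhibits the reflection $B_{k+1}(1-x) = ({-}1)^{k+1} B_{k+1}(x)$ that drives the cancellation for odd a in Section 4.2.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n):
    """Return [B_0, ..., B_n] with the convention B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B


def bernoulli_poly(n, x):
    """Evaluate the Bernoulli polynomial B_n(x) = sum_j C(n, j) * B_j * x^(n - j)."""
    B = bernoulli_numbers(n)
    x = Fraction(x)
    return sum(comb(n, j) * B[j] * x ** (n - j) for j in range(n + 1))


def hurwitz_zeta_at_negative_integer(k, x):
    """zeta(-k, x) = -B_{k+1}(x) / (k + 1) for an integer k >= 0, as in (4·5)."""
    return -bernoulli_poly(k + 1, x) / (k + 1)


x = Fraction(2, 7)
print(hurwitz_zeta_at_negative_integer(0, 1))  # zeta(0)
print(hurwitz_zeta_at_negative_integer(1, 1))  # zeta(-1)
print(bernoulli_poly(4, 1 - x) == bernoulli_poly(4, x))   # even degree: symmetric
print(bernoulli_poly(3, 1 - x) == -bernoulli_poly(3, x))  # odd degree: antisymmetric
```

Taking $x=1$ recovers the Riemann zeta values, for instance $\zeta(0) = -1/2$ , which is the factor appearing in $\mathrm{Res}(0)$ .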
To estimate the values of $\zeta(s, x)$ inside the critical strip, we will use the approximate functional equation, as proved in the following form by Miyagawa [ Reference Miyagawa16 ].
Lemma 4·4. Assume $s = \sigma + i t$ for some $0 < \sigma < 1 $ . Set $T = \sqrt{2 \pi (|t|+1)}$ . Then for any real $x > 0$ ,
We also note the following consequence of Stirling’s formula.
Lemma 4·5. For any b, we have
4.2. Analysis of poles and residues
We now proceed with our analysis of the integral (4·3). For each $e_1,e_2$ , the integrand has right-most pole at $s=a+1$ , coming from the factor of $\zeta(s-a,e_1/d)$ , which has a simple pole with residue 1. The sum of the residues is
We then note that
Thus, write $f \,:\!=\, (k,d)$ , and observe that we may assume $f \mid n$ . So doing, and replacing d and k by fd and fk, respectively, our expression for the residue at $s=a+1$ becomes
by Lemma 3·3.
Before turning to the residue of the pole at $s=1$ , we note one consequence of the above argument. In particular, for any fixed n and b, in the identity proved above,
both sides define analytic functions of a for $a>-b$ , $a\neq 0$ . Thus, this expression must hold for $-b<a<0$ , even though neither $\zeta(a+1,x)$ nor $\zeta(a+1)$ is defined via a convergent Dirichlet series in this region. This will be useful in evaluating the residue at $s=1$ , which we now turn to.
Using (4·3) again, the pole at $s=1$ is seen to be
Since we have assumed $a < b$ , it follows that $-a > -b$ , so by the identity (4·6), this evaluates to
Finally, we evaluate the residue at $s=-m$ , $m \geqslant 0$ , arising from the gamma function. We do so in general, but we only provide a clean simplification of the term when a is an odd integer. The residues for other values of a do not seem to have a natural multiplicative structure, for example, so we consider the case that a is odd to be the most interesting.
Using (4·3), the residue at $s=-m$ is
When a is an integer, by the special value formula (4·5) the inner summation over $e_1,e_2$ in (4·7) becomes
For fixed d, the substitution $(e_1,e_2) \mapsto (d-e_1,d-e_2)$ defines an involution on the set of pairs $(e_1,e_2)$ with $e_1,e_2 \neq d$ . Since $B_{k+1}(1-x) = ({-}1)^{k+1}B_{k+1}(x)$ , if a is odd, it follows for such $e_1,e_2$ that
Consequently, when a is odd, the sum over $e_1,e_2$ with $e_1,e_2 \neq d$ cancels, and it remains to consider only those pairs where one of $e_1$ and $e_2$ equals d. Given that $e_1$ and $e_2$ are restricted to satisfy the congruence $e_1e_2 \equiv n \,\left(\mathrm{mod}\,{d}\right)$ , such pairs arise only when $d \mid n$ . In this case, the summation over $e_1$ and $e_2$ in (4·7) collapses to
If $m \geqslant 1$ , then, since a is odd, every term above is 0, and consequently the residue (4·7) is 0 as well. On the other hand, if $m=0$ , then the above expression simplifies to $d^{-a}\zeta(0)\zeta({-}a) = -({d^{-a}}/{2})\zeta({-}a)$ . We then find for $m=0$ that (4·7) evaluates to
4.3. Error analysis via Kloosterman sums
Applying the functional equation (4·4) for both $\zeta(1-s-a,e_1/d)$ and $\zeta(1-s,e_2/d)$ , we will be led to consider exponential sums of the form
where we write $e(x) \,:\!=\, e^{2\pi i x}$ for any real x. By relating these to classical Kloosterman sums we obtain the following strong bound.
Lemma 4·6. With notation as above, we have
for any $\epsilon > 0$ .
Proof. Recall that the classical Kloosterman sums are defined by
We begin by proving the identity
For $e_1$ as in the sum defining $S_n(m,k;\,d)$ , let $f = (e_1,d)$ , and note that there are no terms with $f \nmid (d,n)$ . Write $e_1 = e_1^\prime f$ , where $(e_1^\prime, d/f) = 1$ . Let $e_2^\prime$ be such that $e_1^\prime e_2^\prime \equiv 1\,\left(\mathrm{mod}\,{d/f}\right)$ , so that the allowed values of $e_2 \,\left(\mathrm{mod}\,{d}\right)$ are given by $e_2 = e_2^\prime n/f + jd/f$ for $0 \leqslant j \leqslant f-1$ .
Thus, we find
as claimed.
Now apply the Weil bound $|K(a,b;\,q)| \leqslant \tau(q)q^{1/2}\mathrm{gcd}(a,b,q)^{1/2}$ to conclude
as desired.
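The Weil bound used above can be tested by brute force for small moduli. The sketch below assumes the standard definition of the classical Kloosterman sum, $K(a,b;\,q) = \sum_{x} e\left((ax + b\bar{x})/q\right)$ over x coprime to q with $x\bar{x} \equiv 1 \,\left(\mathrm{mod}\,{q}\right)$ ; the function names are ours, and the code is intended for $q \geqslant 2$ .

```python
import cmath
from math import gcd, pi, sqrt


def kloosterman(a, b, q):
    """Classical Kloosterman sum K(a, b; q) for q >= 2, by direct summation."""
    total = 0j
    for x in range(1, q):
        if gcd(x, q) == 1:
            x_inv = pow(x, -1, q)  # modular inverse (Python 3.8+)
            total += cmath.exp(2j * pi * ((a * x + b * x_inv) % q) / q)
    return total


def divisor_count(q):
    """tau(q), the number of positive divisors of q."""
    return sum(1 for d in range(1, q + 1) if q % d == 0)


def weil_bound(a, b, q):
    """tau(q) * q^(1/2) * gcd(a, b, q)^(1/2)."""
    return divisor_count(q) * sqrt(q) * sqrt(gcd(gcd(a, b), q))


for p in (5, 7, 11, 13):
    worst = max(abs(kloosterman(a, b, p)) for a in range(1, p) for b in range(1, p))
    print(p, worst, weil_bound(1, 1, p))
```

The sums are real, since the substitution $x \mapsto -x$ conjugates each term; the imaginary parts produced by the code are rounding noise.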
We first assume that $b > a + 3/2$ . We will shift the contour in (4·3) to the line $\text{Re}(s)=1-\delta$ for some $\delta > 1$ . Using Stirling’s formula, along the line $\text{Re}(s)=1-\delta$ for $\delta>1$ , the integrand in (4·3) is
The integral (4·3) thus converges absolutely on the line $\text{Re}(s)=1-\delta$ provided that $\delta < ({b-a+1}/{2})$ . This is compatible with the assumption that $\delta>1$ by the assumption $b>a+3/2$ .
Using Lemma 4·6, the integral in (4·3), evaluated on the line $\text{Re}(s)=1-\delta$ , is
by the assumption that $\delta > 1$ . Since $b>a+{3}/{2}$ , we take $\delta = ({b-a})/{2} + {1}/{4}-\epsilon$ and conclude the integral is
Together with the analysis of the poles, this yields the first part of Theorem 4·2.
Now, assume that $b > \max\{a,2-a\}$ . Our goal in this case is to show that the contour in (4·3) may be shifted to the line $\text{Re}(s)=\sigma$ for some $0 < \sigma < 1$ . This is equivalent to obtaining sufficient cancellation in the series
on the line $\text{Re}(s)=\sigma$ . We shall find it convenient to assume that $\sigma < a$ so that $\zeta(s-a,e_1/d)$ is related to an absolutely convergent Dirichlet series via the functional equation (4·4). For $\zeta(s,e_2/d)$ , we do not have this luxury, so we instead invoke the approximate functional equation of Lemma 4·4.
In principle, in applying the functional equation for $\zeta(s-a,e_1/d)$ and the approximate functional equation for $\zeta(s,e_2/d)$ , we are forced to consider six summations, corresponding to pairing each of the two terms in (4·4) with the three terms in Lemma 4·4. However, the two summations in (4·4) have the same shape as each other, as do the second and third summations in Lemma 4·4. Consequently, it essentially suffices to consider only two types of summation, corresponding to pairing the first term from Lemma 4·4 with a term from (4·4) or pairing one of the latter two terms from Lemma 4·4 with a term from (4·4).
In the first of these two cases, where the first term of Lemma 4·4 for $\zeta(s,e_2/d)$ is paired with one of the terms in (4·4) for $\zeta(s-a,e_1/d)$ , we are led to consider series of the form
where, as in Lemma 4·4, we have set $T = \sqrt{2\pi(1+|t|)}$ . The exponential sum in (4·9) is 0 unless $(d,k) \mid (n,d,m)$ , in which case it is of absolute value (d, k). Thus, since we have assumed $\text{Re}(s)=\sigma < a$ , (4·9) is bounded by
provided that $\sigma > 1 - ({b-a})/{2}.$ Since we have assumed $b>2-a$ , there is some $\sigma < a$ for which this holds. Using Stirling’s formula, the additional factors in (4·4) as applied to $\zeta(s - a, e_1/d)$ coming from the gamma function and exponentials may be bounded by $O\left((1+|t|)^{a-\sigma+\frac{1}{2}}\right)$ . Altogether, the contribution to (4·8) from the first term in the approximate functional equation for $\zeta(s,e_2/d)$ is seen to be $O\left((1+|t|)^{a-\frac{3\sigma}{2}+1}\right)$ .
We now consider the second type of summation, arising from the second and third terms in the approximate functional equation. In particular, we are led to estimate
We appeal to Lemma 4·6 to conclude that (4·11) is bounded by
Once again, the additional factors in (4·4) are of size $O\left((1+|t|)^{a-\sigma+\frac{1}{2}}\right)$ , while those in Lemma 4·4 are seen to be $O\left((1+|t|)^{\frac{1}{2}-\sigma}\right)$ . We thus find that terms arising from the second and third summations in Lemma 4·4 contribute an amount that is $O\left((1+|t|)^{a-\frac{3\sigma}{2}+1}\right)$ to (4·8), matching the contribution from those terms arising from the first summation in Lemma 4·4. The error terms in Lemma 4·4 contribute a smaller amount, and we conclude that on the line $\text{Re}(s)=\sigma$ ,
provided that $1 - ({b-a})/{2} < \sigma < a$ .
Thus, estimating the quotient of gamma factors by Lemma 4·5, the integrand in (4·3) is $O_{a,b,\sigma}\left(n^{b+\sigma} (1+|t|)^{a-b-\frac{3\sigma}{2}}\right)$ . The integral therefore converges absolutely on the line $\text{Re}(s) = 1 - ({b-a})/{2}+\epsilon$ for any $\epsilon>0$ . This yields the second part of the theorem when $\max\{a,2-a\} < b\leqslant a + 3/2$ .
4.4. A simpler version of the error analysis
We present an alternative treatment of the error that avoids the complications of the last section, obtaining a weaker error term of $o\left(n^{b + 1}\right)$ for some ranges of the parameters. In particular, we assume that $a > 1$ and $b>a+2$ .
Shift the contour in (4·3) to $\text{Re}(s) = 1 - \epsilon$ for small $\epsilon > 0$ . We have $\zeta(s - a, e_1/d) \ll (1 + |t|)^{a - \frac{1}{2} + \epsilon}$ by the functional equation and Stirling’s formula; we have $\zeta(s, e_2/d) \ll (1 + |t|)^{\epsilon} \cdot \left( {e_2}/{d} \right)^{-1}$ by the convexity bound, with the term $\left( {e_2}/{d}\right)^{-1}$ arising from the first term $(e_2/d)^{-s}$ of $\zeta(s, e_2/d)$ ; and we again use Lemma 4·5 to estimate the quotient of gamma functions.
We conclude that the integrand is
This yields an error term of $O(n^{b + 1 - \epsilon})$ provided that the sum over d and the integral over t converge. These conditions are satisfied for some $\epsilon>0$ if $b - a > 2$ .
5. Possible improvements
As made clear in the discussion surrounding Lemma 4·6, the error term in Theorem 1·2 is controlled by sums of Kloosterman sums $K(r,s;\,q)$ , where q denotes the modulus. The Weil bound implies that $K(r,s;\,q) \ll q^{1/2+\epsilon}$ , and this is a key ingredient in the proof. However, it is expected that much greater cancellation holds on average. We expect that if the estimate $K(r,s;\,q) \ll q^{\theta+\epsilon}$ holds on average for some $0 \leqslant \theta \leqslant 1/2$ , then the error term in Theorem 1·2 may be improved to $O\left(n^{\frac{a+b}{2}+\frac{1+\theta}{2}+\epsilon}\right)$ . Assuming a conjecture of Selberg [ Reference Selberg19 ], the value $\theta=0$ is likely admissible, and this would yield a Ramanujan–Deligne quality error term in Theorem 1·2. Using work of Deshouillers and Iwaniec [ Reference Deshouillers and Iwaniec3 ] on sums of Kloosterman sums, we speculate it may be possible to improve the error in Theorem 1·2, perhaps to the level $O\left(n^{\frac{a+b}{2}+\frac{7}{12}+\epsilon}\right)$ . Alternatively, Shparlinski suggested to us that his work with Zhang [ Reference Shparlinski and Zhang20 ] on cancellation amongst Kloosterman sums to prime moduli could be readily generalised to composite moduli, again leading to possible improvements. We leave these questions for future work.
Finally, as P. Humphries pointed out to us, these questions can also be addressed via the spectral theory of automorphic forms. We refer to Kuznetsov [ Reference Kuznetsov11 ] and Motohashi [ Reference Motohashi17 ] for some related results along these lines, including a treatment by Motohashi of the case $a = b = 0$ . Humphries suggested to us that these techniques may be able to address complex a and b in greater generality, and again we leave this question for future work.
Acknowledgements
The authors would like to thank Bruce Berndt, Michael Filaseta, Peter Humphries, Karl Mahlburg, Ken Ono, Ian Petrow, Igor Shparlinski and Matt Young for useful discussions and for pointing us to relevant related works. We would also like to thank an anonymous referee for helpful comments.
RJLO was partially supported by NSF grant DMS-1601398. FT was partially supported by grants from the Simons Foundation (Nos. 563234 and 586594).