1 Introduction
Two of the outstanding conjectures in number theory are the so-called Batyrev–Manin conjecture [Reference Franke, Manin and Tschinkel9] for the density of rational points on open subschemes of Fano varieties with respect to a Weil height, and Malle’s conjecture [Reference Malle14] on the number of number fields having bounded discriminant, fixed degree, and fixed Galois group. Both conjectures assert, roughly, that the number $N(X)$ of objects to be counted with an appropriate height at most X satisfies an asymptotic formula of the form
$$ \begin{align*} N(X) \sim C X^{\alpha} (\log X)^{\beta}, \end{align*} $$
where $C, \alpha , \beta $ are nonnegative numbers with $C, \alpha> 0$ , and that $C, \alpha , \beta $ can be computed explicitly within their respective geometric and arithmetic frameworks.
In a recent article, J. Ellenberg, M. Satriano, and D. Zureick-Brown extend the theory of heights to algebraic stacks. They formulate a bold conjecture that encompasses both the Batyrev–Manin and Malle conjectures as special cases [Reference Ellenberg, Satriano and Zureick-Brown6, Main Conjecture]. While the Manin and Malle conjectures are well studied, comparatively little is known about the behavior of rational points on algebraic stacks.
In this article, we study heights on stacky curves, analogues of smooth projective algebraic curves defined over $\mathbb {Q}$ . The height functions we develop on stacky curves are completely explicit and can be understood in an elementary manner. Further, we will see that natural questions involving stacky curves lead to an equivalent formulation of the $abc$ -conjecture.
The Stacky Batyrev–Manin–Malle Conjecture is still open for smooth one-dimensional algebraic stacks. In this article, we focus on stacky curves whose coarse moduli space is ${\mathbb {P}}^1$. A key example is ${\mathbb {P}}^1$ endowed with three stacky points of degree $\frac {1}{2}$ and the height given by the anti-canonical bundle. This algebraic stack may be thought of as the usual projective line, except that the points $0,1,\infty $ have been replaced with “stacky” points that have degree $\frac {1}{2}$ rather than $1$. Alternatively, one may consider the M-curve as described by Darmon; here the points $0,1,\infty $ have multiplicity 2. The key observation is that a point of multiplicity m has degree $\frac {1}{m}$.
In the case of ${\mathbb {P}}^1$ with three half points, rational points on this stack can be thought of as rational points on ${\mathbb {P}}^1$ . The anti-canonical height function can now be explicitly written after normalization as
Put $N(T)=\#\{(a:b)\in {\mathbb {P}}^1({\mathbb {Q}})\colon H(a:b)\leq T\}$. Then we have the following theorem.
Theorem 1.1 There are positive numbers $c_1,c_2,c_3$ such that
for all $T>c_3$ .
The above estimate proves the stacky Batyrev–Manin conjecture for the anti-canonical height on ${\mathbb {P}}^1$ with three half points.
We note that P. Le Boudec has independently proved this statement (private communication).
Let $\mathfrak {X}$ be a proper smooth stacky curve defined over ${\mathbb {Q}}$ that has coarse space ${\mathbb {P}}^1$. The rational points of this stack are the rational points of ${\mathbb {P}}^1$. As in the classical case, there is a canonical line bundle $K_{\mathfrak {X}}$ on $\mathfrak {X}$. The theory described in [Reference Ellenberg, Satriano and Zureick-Brown6] provides a height function for every line bundle on $\mathfrak {X}$. Therefore, as in the case of algebraic curves, we may consider the anti-canonical height $H_{-K_{\mathfrak {X}}}$. This anti-canonical height is analogous to the anti-canonical height on a Fano variety in the usual Batyrev–Manin conjecture. A natural question is when the anti-canonical height of $\mathfrak {X}$ satisfies the Northcott property. It turns out that the stacky Euler characteristic answers this question. The Euler characteristic of $\mathfrak {X}$ is defined as
If $\mathfrak {X}$ is ${\mathbb {P}}^1$ with stacky points $p_1,...,p_n$ with $\deg p_i=\frac {1}{m_i}$ , then one has the formula
The ESZ-B height associated with $-K_{\mathfrak {X}}$ is then given by the explicit formula
The functions $\phi _{m_i}(\ell _i(x,y))$ are generalizations of the functions $\operatorname {sqf}(\vert x\vert ) ,\operatorname {sqf}(\vert y\vert )$ , and $\operatorname {sqf}(\vert x+y\vert )$ that appear in equation (1.1) (see Section 3 for the precise definitions of the functions $\ell _i$ and $\phi _{m_i}$ ). We obtain a similar explicit description for any line bundle on $\mathfrak {X}$ in terms of the functions $\phi _{m_i}$ and $\ell _i$ in Theorem 3.16. For the anti-canonical height, we obtain the following.
Theorem 1.2 Let $\mathfrak {X}$ be a proper smooth stacky curve defined over ${\mathbb {Q}}$ that has coarse space ${\mathbb {P}}^1$ or is isomorphic to a smooth projective curve. Then the anti-canonical height $H_{-K_{\mathfrak {X}}}$ has the Northcott property if and only if $\chi (\mathfrak {X})>0$ .
The above result tells us that the ESZ-B heights, when applied to the tangent bundle, recover the behavior of the Weil heights when applied to the tangent bundle, providing additional evidence that the ESZ-B theory of heights is the correct generalization of the classical theory. In particular, Theorem 1.2 demonstrates that the stacky Batyrev–Manin conjecture of [Reference Ellenberg, Satriano and Zureick-Brown6] is a direct generalization of the classical Batyrev–Manin conjecture for Fano varieties.
Ellenberg, Satriano, and Zureick-Brown have proposed a generalized Vojta conjecture applicable to algebraic stacks [Reference Ellenberg, Satriano and Zureick-Brown6, Conjecture 4.23]. In the case of stacky curves, the stacky Vojta conjecture is directly related to the question of when $H_{-K_{\mathfrak {X}}}$ has the Northcott property.
In [Reference Ellenberg, Satriano and Zureick-Brown6, Section 4.7], it is speculated that [Reference Ellenberg, Satriano and Zureick-Brown6, Conjecture 4.23] for stacky curves should follow from some version of the $abc$ -conjecture. In the case of algebraic curves, Vojta’s conjecture is known to be equivalent to the $abc$ -conjecture. We show that, much like the case of algebraic curves, the stacky analogue of Vojta’s conjecture in the curve case is equivalent to the $abc$ -conjecture. We formulate this as follows:
Theorem 1.3 Let $\mathfrak {X}$ be a proper smooth stacky curve defined over ${\mathbb {Q}}$ that has coarse space ${\mathbb {P}}^1$ or is isomorphic to a smooth projective curve. Further, suppose that $\mathfrak {X}$ has negative Euler characteristic. Then the following statements are equivalent:
(1) The $abc$ -conjecture holds; and
(2) For all $\mathfrak {X}$ satisfying the hypotheses of the theorem and for all $\delta> 0$ , the function $\mathcal {H}_{-K_{\mathfrak {X}}} \cdot H^\delta $ has Northcott’s property, where $H([x,y]) = \max \{|x|, |y|\}$ is the usual height function on ${\mathbb {P}}^1({\mathbb {Q}})$ .
Theorem 1.3 shows that Conjecture 4.23 in [Reference Ellenberg, Satriano and Zureick-Brown6] is equivalent to the $abc$ -conjecture, answering a question of Ellenberg, Satriano, and Zureick-Brown. Their conjectures are motivated by the work of P. Vojta; see [Reference Vojta21].
In [Reference Ellenberg, Satriano and Zureick-Brown6], the authors wonder if the stacky Vojta conjecture might be more “in reach” for algebraic stacks obtained by rooting along a divisor D on a scheme X. The proof of Theorem 1.3 shows that if there is some $m\geq 4$ such that item (2) in Theorem 1.3 holds for $\mathfrak {X}_m = \mathfrak {X}({\mathbb {P}}^1 : ((0,1,\infty ) : (m,m,m)))$, then a weak variant of the $abc$ -conjecture can be derived. Specifically, there exists a positive number $c_m \geq 1$ such that for any coprime $a,b,c \in {\mathbb {Z}}$ with $a + b = c$ and any $\varepsilon> 0$ we have
In particular, any progress on the stacky Vojta conjecture for curves would lead to substantial progress on the $abc$ -conjecture.
2 A further elaboration of our ideas
In this section, we motivate and describe our main results in more detail, and describe our ground-up height construction.
2.1 An elementary height machine on stacky curves
We define our algebraic stacks in terms of a base variety along with some extra data which are enough to uniquely construct an algebraic stack. A stacky curve defined over a number field K is determined by the following data: A smooth variety X defined over K, and a finite number of stacky points $P_1,\dots ,P_r$ along with integer multiplicities $m_{P_i}=m_i>1$ attached to each point $P_i$ . We use the notation
to denote the stacky curve with multiplicities $m_{P_i}=m_i$ at the points $P_i$ . We will write $\mathfrak {X}(X:({\mathbf {a}}, {\mathbf {m}}))$ as an abbreviation. We identify the rational points of the stack $\mathfrak {X}$ with those of the coarse space, which is just the variety X. That is, we require
In [Reference Ellenberg, Satriano and Zureick-Brown6], some care is taken to work with the locus of stacky points on an algebraic stack. In particular, one must contend with the accumulation of infinitely many new stacky points. We ignore such difficulties since they are not important in our context.
To obtain an ESZ-B height on a stacky curve $\mathfrak {X}(X:({\mathbf {a}},{\mathbf {m}}))$, we must choose a vector bundle on $\mathfrak {X}(X:({\mathbf {a}},{\mathbf {m}}))$. We will be primarily interested in the stacky curves $\mathfrak {X}({\mathbb {P}}^1_{\mathbb {Q}}: (a_1,m_1),\dots ,(a_r,m_r))$ and in the case where the vector bundle is a line bundle. Unless otherwise mentioned, the stacky curve $\mathfrak {X}$ will now be of this form. Associated with each stacky point $a_i$ is a line bundle ${\mathcal {L}}_{a_i}$, and it suffices to consider line bundles of the form
where L is a divisor on ${\mathbb {P}}^1$ and $0\leq c_i\leq m_i-1$. To associate a height to such a divisor, we associate a height to each ${\mathcal {L}}_{a_i}^{\otimes c_i}$ and extend linearly. Motivated by [Reference Ellenberg, Satriano and Zureick-Brown6], we develop the following construction. For each stacky point $a_i = [\alpha _i : \beta _i]$ with $\alpha _i,\beta _i$ coprime integers, we associate to it the linear form $\ell _i(x,y) = \alpha _i y- \beta _i x$. For each $m_i$, define $\phi _{m_i}(n)$ to be the smallest positive integer such that $n \phi _{m_i}(n)$ is a perfect $m_i$ -th power. The height function associated with ${\mathcal {L}}_{a_i}^{\otimes c_i}$ is then
The linear form $\ell _i$ takes into account the point $a_i$ and $\phi _{m_i}$ accounts for the multiplicity of $a_i$ , while the power $c_i$ accounts for the multiple of ${\mathcal {L}}_{a_i}$ . The introduction of the functions $\phi _{m_i}$ is due to [Reference Ellenberg, Satriano and Zureick-Brown6] and working with these functions is a key feature of stacky curves with coarse space ${\mathbb {P}}^1$ . We define a height function for any divisor $\mathcal {L}=L\otimes \prod _{i=1}^s{\mathcal {L}}_{a_i}^{\otimes c_i}$ on $\mathfrak {X}$ as
whenever $x,y$ are coprime integers. The Euler characteristic of the stacky curve is defined to be the degree of the anti-canonical divisor. If we wish to emphasize in our situation that $\chi (\mathfrak {X})$ only depends on the vector of multiplicities ${\mathbf {m}}$ and that the anti-canonical height only depends on $({\mathbf {a}},{\mathbf {m}})$ , we write
and
2.2 Properties of the anti-canonical ESZ-B height $H_{-K_{\mathfrak {X}}}$
The Northcott property of the naive height on ${\mathbb {P}}^1$ implies that the ESZ-B anti-canonical height $H_{-K_{\mathfrak {X}}}$ has the Northcott property whenever $\chi (\mathfrak {X})=\delta ({\mathbf {m}})>0$ . On the other hand, if $\chi (\mathfrak {X})\leq 0$ , then it is not at all obvious whether $H_{-K_{\mathfrak {X}}}$ should have the Northcott property. The following question is fundamental: Let ${\mathcal {L}}$ be a line bundle on $\mathfrak {X}(X: ({\mathbf {a}},{\mathbf {m}}))$ and let $H_{{\mathcal {L}}}$ be the associated ESZ-B height. When does $H_{\mathcal {L}}$ have the Northcott property? We will tackle this question when ${\mathcal {L}}=-K_{\mathfrak {X}}$ and $X={\mathbb {P}}^1$ leaving the general case for future study.
Theorem 2.1 Let $\{a_1, \ldots , a_n\} \subset {\mathbb {P}}_{\mathbb {Q}}^1$ and ${\mathbf {m}} = (m_1, \ldots , m_n)$ be a vector of multiplicities. Then whenever
the anti-canonical height $H_{-K_{\mathfrak {X}}}$ given by (2.2) on the curve $\mathfrak {X}({\mathbb {P}}^1 : (a_1, m_1), \ldots , (a_n, m_n))$ fails to have the Northcott property.
If we assume that the ESZ-B theory should behave roughly like its classical counterpart, we can argue the converse: when $\chi (\mathfrak {X})\leq 0$, one expects $H_{-K_{\mathfrak {X}}}$ to fail to have the Northcott property. In particular, Theorem 2.1 shows that our arithmetic and geometric intuition proves to be correct when $\mathfrak {X}$ has coarse space ${\mathbb {P}}^1$. This answers a question posed by Ellenberg.
The proof of Theorem 2.1 uses the following theorem about elliptic curves.
Theorem 2.2 Let $F \in {\mathbb {Z}}[x,y]$ be a non-singular binary quartic form. Then there exists square-free $d \in {\mathbb {Z}}$ such that the curve
has a rational point and such that its Jacobian has positive rank as an elliptic curve defined over ${\mathbb {Q}}$ .
The proof of Theorem 2.2 was provided to us by Shnidman in [Reference Shnidman15], and we gratefully acknowledge his assistance.
Combining these results gives the following uniform statement.
Corollary 2.3 Let $\mathfrak {X}$ be a smooth proper stacky curve defined over ${\mathbb {Q}}$ such that $\mathfrak {X}$ has coarse space ${\mathbb {P}}^1_{\mathbb {Q}}$ or $\mathfrak {X}$ is a projective algebraic curve. Let $H_{\mathfrak {X}}$ be the height associated with the anti-canonical divisor $-K_{\mathfrak {X}}$ . Then $\chi (\mathfrak {X})>0$ if and only if $H_{\mathfrak {X}}$ has the strong Northcott property.
Proof If $\mathfrak {X}$ is an algebraic stack and not an algebraic curve, then it is of the form $\mathfrak {X}=\mathfrak {X}({\mathbb {P}}^1_{\mathbb {Q}};({\mathbf {a}},{\mathbf {m}}))$ and (2.1) gives the desired result. On the other hand, if $\mathfrak {X}$ is a smooth projective and geometrically integral curve, then $\chi (\mathfrak {X})\leq 0$ implies that $-H_{C}$ does not have the Northcott property.
In cases where we can prove that the Northcott property fails, we expect, according to [Reference Ellenberg, Satriano and Zureick-Brown6], that there should be a stacky Vojta conjecture. In particular, we would like to know whether the anti-canonical height $\mathcal {H}_{({\mathbf {a}}, {\mathbf {m}})}$ can be modified to recover the Northcott property. Difficulties arise because the ESZ-B height machine is not functorial in the usual sense. In the setting of algebraic varieties, one can work with a linear space of divisors and then apply the height machine, which by functoriality will respect the linear structure. Such methods are not immediately available to us. Instead, we will apply the height machine first, and then apply linear operations. We ask: for $\mathfrak {X}=\mathfrak {X}({\mathbb {P}}^1:({\mathbf {a}},{\mathbf {m}}))$ with $\chi (\mathfrak {X})\leq 0$, what can be said about the quantity
Clearly, if we change the exponent in the classical part of the height so that it is positive, then we will recover the Northcott property. In fact, we expect that something far less drastic suffices.
For a real number $\delta $ and the curve $\mathfrak {X}({\mathbb {P}}^1 : ({\mathbf {a}}, {\mathbf {m}}))$ , define the height
We then see that $\mathcal {H}_{({\mathbf {a}}, {\mathbf {m}})} = \mathcal {H}_{({\mathbf {a}}, {\mathbf {m}})}^{\chi (\mathfrak {X})}$ . Next, put
In fact, $\gamma (\mathfrak {X})$ depends only on ${\mathbf {m}}$ , so we may also write it as $\gamma ({\mathbf {m}})$ . We make the following conjecture.
Conjecture 2.4 (Northcott conjecture for stacky curves with coarse space ${\mathbb {P}}^1$ )
For all $\mathfrak {X}=\mathfrak {X}({\mathbb {P}}^1:({\mathbf {a}}, {\mathbf {m}}))$ , we have $\gamma (\mathfrak {X}) = \min \{\chi (\mathfrak {X}), 0\}$ .
Conjecture 2.4 is in fact a version of Vojta’s conjecture for stacky curves, and agrees with a conjecture of Ellenberg, Satriano, and Zureick-Brown in [Reference Ellenberg, Satriano and Zureick-Brown6]. Toward this conjecture, we have the following.
Theorem 2.5 We have $\gamma (\mathfrak {X}) = 0$ if $\chi (\mathfrak {X}) \geq 0$ . Moreover, the height $\mathcal {H}_{({\mathbf {a}}, {\mathbf {m}})}^0$ has the Northcott property if and only if $\chi (\mathfrak {X})< 0$ .
Combined with Theorem 2.1, the conjecture predicts that the set of $\delta \in {\mathbb {R}}$ such that $\mathcal {H}^\delta _{{\mathbf {a}},{\mathbf {m}}}$ has the Northcott property is an interval of the form $(\chi (\mathfrak {X}),\infty )$ when $\chi (\mathfrak {X})<0$ and $(0,\infty )$ when $\chi (\mathfrak {X})\geq 0$ . Therefore, while Theorem 2.1 tells us we cannot count points with $\mathcal {H}_{{\mathbf {a}},{\mathbf {m}}}$ , Conjecture 2.4 predicts that we can count points using $\mathcal {H}^{\chi (\mathfrak {X})+\varepsilon }_{{\mathbf {a}},{\mathbf {m}}}$ for any $\varepsilon>0$ .
Next, we prove that Conjecture 2.4 is a consequence of the $abc$ -conjecture. However, it seems that we are very far from being able to prove a result as strong as Conjecture 2.4 unconditionally.
Theorem 2.6 Suppose that the $abc$ -conjecture holds. Then, for any $\delta> \delta ({\mathbf {m}})$ , the function $\mathcal {H}_{({\mathbf {a}}, {\mathbf {m}})}^{\delta } (x,y)$ on $\mathfrak {X}({\mathbb {P}}^1: ({\mathbf {a}}, {\mathbf {m}}))$ has Northcott’s property.
In fact, Conjecture 2.4 is equivalent to the $abc$ -conjecture (see Theorem 1.3). The proof of the converse is quite different and so we give it in a separate subsection.
2.3 Quantitative arithmetic on stacky curves
In the positive Euler characteristic case, we consider a particular family of stacky curves, which includes an important example suggested by J. Ellenberg, and show that our theory of heights matches [Reference Ellenberg, Satriano and Zureick-Brown6] in this instance. Finally, we verify a specific instance of the main conjecture in [Reference Ellenberg, Satriano and Zureick-Brown6] given by Ellenberg, Satriano, and Zureick-Brown using analytic methods. We remark that P. Le Boudec had obtained the same result as us in independent work (private communication).
We study the expression (2.3) a bit more carefully. Since each $m_i \geq 2$, it is easy to deduce that $\delta ({\mathbf {m}}) \geq 0$ only if $n \leq 4$, and $\delta ({\mathbf {m}})> 0$ only if $n \leq 3$. We will not consider the case $n \leq 2$ in this paper.
If we assume $m_1 \leq m_2 \leq m_3$, then the only cases when we have positive Euler characteristic are when $m_1 = m_2 = 2$, or when $(m_1, m_2, m_3)$ is one of $(2,3,3)$, $(2,3,4)$, or $(2,3,5)$. In each of these cases, the Northcott property for $\mathcal {H}_{({\mathbf {a}}, {\mathbf {m}})}$ holds trivially.
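The case analysis for three stacky points can be checked mechanically. The following is a minimal sketch, assuming the standard formula $\chi = 2 - \sum _i (1 - \frac {1}{m_i})$ for a stacky curve with coarse space ${\mathbb {P}}^1$ and stacky points of degree $\frac {1}{m_i}$. The function names `euler_characteristic` and `positive_triples` and the search bound are illustrative choices and do not come from the text.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def euler_characteristic(ms):
    """Assumed formula: chi = 2 - sum(1 - 1/m_i), coarse space P^1."""
    return 2 - sum(1 - Fraction(1, m) for m in ms)


def positive_triples(bound):
    """Triples 2 <= m1 <= m2 <= m3 <= bound with chi > 0, in sorted order."""
    candidates = combinations_with_replacement(range(2, bound + 1), 3)
    return [ms for ms in candidates if euler_characteristic(ms) > 0]


if __name__ == "__main__":
    # Exact rational arithmetic avoids rounding issues at chi = 0.
    for ms in positive_triples(8):
        print(ms, euler_characteristic(ms))
```

Under this assumption, a triple has positive Euler characteristic exactly when $\frac {1}{m_1} + \frac {1}{m_2} + \frac {1}{m_3}> 1$, and the borderline triples such as $(2,3,6)$, $(2,4,4)$, and $(3,3,3)$ give $\chi = 0$.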
We now focus on the simplest cases, where $m_1 = m_2 = 2$ and $m_3 = m, m \geq 2$ . Using that $\operatorname {PGL}_2$ acts 3-transitively on ${\mathbb {P}}^1$ , we reduce to the case $\{a_1, a_2, a_3\} = \{0,-1,\infty \}$ . For $[x,y] \in {\mathbb {P}}^1$ , we may then set
with $x_1, y_1$ square-free, and
with $z_1, \ldots , z_{m-1}$ square-free. In this notation, the ESZ-B height is given by
We normalize the height so that the exponent of the “classical part” is equal to one, to obtain the normalized height
We now put
We prove the following theorem, which gives a crude upper bound for $N_m(T)$ .
Theorem 2.7 Let $\mathfrak {X} = \mathfrak {X}({\mathbb {P}}^1 : (0,2), (\infty , 2), (-1, m))$ , and let $H_m$ be the height function on $\mathfrak {X}$ defined by (2.8). Then, for any $\varepsilon> 0$ , we have
When $m = 2$, the upper bound of Theorem 2.7 is essentially the trivial bound, but it is nontrivial as soon as $m> 2$. In general, we expect the exponent in Theorem 2.7 to be equal to the lower bound. Indeed, this can be verified when $m = 2$. Even more, we can give an exact order of magnitude for $N_2(T)$.
Theorem 2.8 There exist positive numbers $c_1, c_2, c_3$ such that
for all $T> c_3$ .
In particular, we confirm the stacky Batyrev–Manin conjecture [Reference Ellenberg, Satriano and Zureick-Brown6, Main Conjecture] for $\mathfrak {X}({\mathbb {P}}^1_{\mathbb {Q}},(a,2),(b,2),(c,2))$. For this stacky curve, [Reference Ellenberg, Satriano and Zureick-Brown6, Main Conjecture] predicts that $N_2(T) = O_{\varepsilon } \left (T^{1/2 + \varepsilon }\right )$. Our theorem gives an exact order of magnitude for $N_2(T)$. We remark, once again, that P. Le Boudec had obtained the same result. Further, our counting arguments are similar to those obtained by Le Boudec in [Reference le Boudec13], which studies the equation (7.4).
The other cases with positive Euler characteristic do not yield to the simple analytic counting arguments used to prove Theorem 2.8, though in principle counting rational points by height is a well-posed problem. We plan on returning to this issue in the future.
We illustrate how the stacky curve height machine (equation (2.2)) allows one to detect integral points on stacky curves. In this case, the standard height is given by $H_s(a,b) = \max \{|a|,|b|\}$ and the stacky height is given by (2.7). They are equal precisely when
or, in the notation of (7.4), when $|x_1| = |x_2| = |x_3| = 1$. Equation (7.4) then turns into
and up to rearranging we are essentially counting points on the conic
Therefore, if we denote by ${\mathcal {N}}(T)$ the number of integral points (in the sense of Definition 3.19) on ${\mathbb {P}}_{2,2,2}^1$ , then:
Corollary 2.9 There exist positive numbers $c_1, c_2, c_3$ such that for all $T> c_3$ we have
The proof is elementary, since the curve can be explicitly parametrized by
The condition $\max \{|y_1|, |y_2|\} \leq T^{1/2}$ is subsumed by $u^2 + v^2 \leq 4T^{1/2}$, say, so the number of possible $u,v$ ’s is $\asymp T^{1/2}$ as desired.
Theorem 2.8 and Corollary 2.9 imply that asymptotically $0\%$ of the rational points on $\mathfrak {X}({\mathbb {P}}^1 :(0,2),(-1,2),(\infty ,2))({\mathbb {Q}})$ are integral, in the sense of Darmon (Definition 3.19).
To close off this subsection, we note that in [Reference Bhargava and Poonen2], Bhargava and Poonen study situations where the rational and integral points of a stacky curve satisfy the Hasse Principle. Motivated by this work, we prove that the integral points on $\mathfrak {X}({\mathbb {P}}^1 : (a_1, 2), \ldots , (a_n, 2))$ satisfy Hasse’s principle.
Theorem 2.10 Let
Then $\mathfrak {X}$ has integral points if and only if the ternary quadratic form
defines a conic with a rational point.
Notation
We denote by $d_k(n)$ the number of ways of writing n as a product of k (not necessarily distinct) positive integers, and write $d(n) = d_2(n)$ for the usual divisor function. We will also use the big-O notation as well as Landau’s notation. In particular, we will denote in the subscripts any dependencies; if there are no subscripts, then the implied constants are absolute.
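For readers who wish to experiment with $d_k(n)$, here is a small sketch. It assumes that "ways" counts ordered factorizations $n = a_1 \cdots a_k$, which is the reading under which $d_2(n)$ is the usual divisor function; with that reading $d_k$ is multiplicative with $d_k(p^a) = \binom {a+k-1}{k-1}$. The helper names `factorize` and `d_k` are illustrative.

```python
from math import comb


def factorize(n):
    """Return {prime: exponent} for a positive integer n by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # Whatever remains after trial division is a single prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


def d_k(n, k):
    """Number of ordered factorizations of n into k positive integers."""
    result = 1
    for exponent in factorize(n).values():
        # Stars and bars: distribute `exponent` copies of p among k factors.
        result *= comb(exponent + k - 1, k - 1)
    return result
```

For instance, `d_k(12, 2)` counts the six divisors of 12.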
3 (Stacky) Heights on stacky curves
In this section, we give an alternative construction of the height functions constructed in [Reference Ellenberg, Satriano and Zureick-Brown6] in a special case: We construct the ESZ-B heights associated with line bundles on stacky curves with coarse space ${\mathbb {P}}^1_{\mathbb {Q}}$ . We use [Reference Voight and Zureick-Brown20] as our main reference, though we made substantial use of [16].
Definition 3.1 (Definition 5.2.1 in [Reference Voight and Zureick-Brown20])
A stacky curve $\mathfrak {X}$ over a field k of characteristic 0 is a smooth proper geometrically connected Deligne–Mumford stack over k of dimension 1 that contains an open dense subscheme.
A stacky curve can be thought of as a smooth projective curve, along with a finite choice of points with integer multiplicities.
Theorem 3.2 (Classification of nice stacky curves: Lemma 5.3.10 in [Reference Voight and Zureick-Brown20])
Let $\mathfrak {X}$ be a stacky curve over k. Then the isomorphism class of $\mathfrak {X}$ is determined by the coarse moduli space X of $\mathfrak {X}$ and the orders of the stabilizer groups of points of $\mathfrak {X}$ .
Before continuing, let us fix some notation. We let $\mathfrak {X}=(X:(P_1,m_1), \ldots , (P_r,m_r))$ be the stacky curve with coarse space X and a $\mu _{m_i}$ stabilizer at $P_i$ . In light of Theorem 3.2, this determines a unique stacky curve.
3.1 Construction of heights
We will give an alternative construction of heights on a stacky curve associated with line bundles. Our construction only depends on the coarse space, and the multiplicities of points. The ideas exposited in this section can be viewed as a “bottom up” construction, similar to the work of Geraschenko and Satriano in [Reference Geraschenko and Satriano10]. We then show that our height construction corresponds to the heights associated with line bundles in [Reference Ellenberg, Satriano and Zureick-Brown6] when the coarse space is ${\mathbb {P}}^1_{\mathbb {Q}}$ . As in the classical setting, we will work with height functions up to some bounded function. Line bundles on a stacky curve can be described as follows.
Lemma 3.3 ([Reference Fantechi, Mann and Nironi7, Section 1.3])
Let $\mathfrak {X}=\mathfrak {X}(X:(P_1,\cdots ,P_r),(m_1,\cdots ,m_r))$ . Let ${\mathcal {O}}_{X}(P)$ be the line bundle associated with the divisor P on X. Then there are line bundles ${\mathcal {L}}_{P_i}$ on $\mathfrak {X}$ such that
where $\pi _{\mathfrak {X}}\colon \mathfrak {X}\rightarrow X$ is the coarse space map. Moreover, we have that any line bundle ${\mathcal {L}}$ on $\mathfrak {X}$ can be uniquely written as
where $0\leq d_i<m_i$ and M is a line bundle on X.
We will use the definition of the degree of a line bundle on a stacky curve.
Definition 3.4 Let $\mathfrak {X}=\mathfrak {X}(X:(P_1,\ldots ,P_r),(m_1, \ldots , m_r))$ and $\mathcal {L}=\pi _{\mathfrak {X}}^* D\otimes \prod _{i=1}^r{\mathcal {L}}_{P_i}^{\otimes d_i}$ . Then we define
In [Reference Ellenberg, Satriano and Zureick-Brown6], the height is broken down into two parts: a so-called stable part and a local part. We now define the stable part in our setting.
Definition 3.5 Let $\mathfrak {X}=\mathfrak {X}(X:(P_1,\ldots ,P_r),(m_1, \ldots , m_r))$ , and let ${\mathcal {L}}$ be a line bundle on $\mathfrak {X}$ with
where $0\leq d_i<m_i$ and M is a line bundle on X with $\pi _{\mathfrak {X}}$ being the coarse space map. We define the stable height associated with ${\mathcal {L}}$ as
Later in Proposition 4.3, we show that this definition matches the one given in [Reference Ellenberg, Satriano and Zureick-Brown6]. The stable height should be thought of as the part of the height consisting of classical height functions.
We will use the notions introduced in Section 3 to define our heights. We further choose a finite set of primes S of ${\mathcal {O}}_K$ containing all the primes of bad reduction and all infinite places of K. We further choose a smooth and proper model $\underline {X}$ of X over ${\mathcal {O}}_{K,S}$. Let $P,Q$ be distinct points in $X(K)$ and let $\nu $ be a place of K with $\nu \notin S$. Take $\mathfrak {p}_\nu \subset {\mathcal {O}}_K$ to be the prime ideal associated with $\nu $.
Definition 3.6 (Darmon [Reference Darmon4])
We define the intersection multiplicity of P and Q at $\nu $ as follows:
where the maximum over the empty set is defined to be 0.
We now package all the intersection multiplicities together while taking into account the arithmetic of the field extension $K\mid {\mathbb {Q}}$ . We shall use the following notation for the remainder of this section.
Notation 3.7 Fix a stacky curve $\mathfrak {X}=(X:(P_1,m_1), \ldots , (P_r,m_r))$ defined over a number field K. Choose a finite set of primes S of ${\mathcal {O}}_K$ containing all the primes of bad reduction and all infinite places of K. We further choose a smooth and proper model $\underline {X}$ of X over ${\mathcal {O}}_{K,S}$ . We define the following quantities.
(1) Given a prime $\mathfrak {p}_\nu \subseteq {\mathcal {O}}_{K}$ , we let $\mathfrak {f}_\nu =[{\mathcal {O}}_K/\mathfrak {p}_\nu \colon {\mathbb {Z}}/(\mathfrak {p}_\nu \cap {\mathbb {Z}})].$
(2) Fix $P\in X(K)$ and $t\in X(K)\setminus \{P\}$. For a rational prime p, put $(t\cdot P)_p=\sum _{\nu \notin S,\nu \mid p}\mathfrak {f}_\nu \cdot (t\cdot P)_\nu $ .
(3) We set
$$ \begin{align*} \lambda_{S,\underline{X},\nu}(P,t)=\lambda_{\nu}(P,t)=\mathrm{N}(\mathfrak{p}_\nu)^{(t\cdot P)_\nu} \end{align*} $$
and
$$ \begin{align*} \lambda_{S,\underline{X}}(P,t) = \lambda(P,t)=\prod_{\nu\notin S}\lambda_\nu(P,t)=\prod_pp^{(t\cdot P)_p}. \end{align*} $$
The integer $\lambda (P,t)$ is an exponential version of the familiar looking intersection product
We will also require the following basic functions.
Definition 3.8 For each integer $m\geq 1$ , we let $[0], \ldots ,[m-1]$ be a set of representatives of the equivalence classes of ${\mathbb {Z}}/m{\mathbb {Z}}$ . Define
for $0\leq r<m$ and
for any $d\in {\mathbb {Z}}$ . With this notation, $N_{m,-}=N_{m,1}$ .
These functions are used to make the following definition of the height function associated with ${\mathcal {L}}_{P_i}^{d_i}$ .
Definition 3.9 The stacky height function associated with ${\mathcal {L}}_{P_i}^{d_i}$ is a function
defined by
where $N_{m_i,d_i}\colon {\mathbb {Z}}_{\geq 0}\rightarrow {\mathbb {Z}}_{\geq 0}$ is the function defined by equation (3.4).
Putting this all together, we obtain the following.
Definition 3.10 (Definition of heights)
Let ${\mathcal {L}}$ be the line bundle ${\mathcal {L}}\cong \pi _{\mathfrak {X}}^*M\otimes \prod _{i=1}^r{\mathcal {L}}_{P_i}^{\otimes d_i}$ on $\mathfrak {X}$ . The stacky height associated with ${\mathcal {L}}$ is defined to be
Unwinding the definitions, we obtain
This decomposition allows us to define what we call the classical and stacky part of a height function.
Definition 3.11 We call
the classical part of the height $H_{\mathcal {L}}$ and
the stacky part of the height $H_{\mathcal {L}}$ .
We will primarily work explicitly with stacky heights on ${\mathbb {P}}^1$; the formulas in that case are as follows.
Corollary 3.12 Use the notation of Notation 3.7. The canonical height function may be computed as
The anti-canonical height function may be computed as
Proof This follows directly from the definition, Corollary 4.4, and the fact that $K_{\mathfrak {X}}$ corresponds to the line bundle
We now introduce two multiplicative functions $\phi _m$ and $r_m$ that depend on an integer $m\geq 1$ . The functions $\phi _m$ and $r_m$ are dual to one another in a certain sense. This duality is key to understanding the nonlinear aspects of heights on stacky curves.
Let x be a positive integer with prime factorization $x=\prod _{p}p^{\operatorname {ord}_p(x)}$ . We will work with the following functions.
(1) Using the division algorithm, we define integers $q_{p,m}(x),r_{p,m}(x)$ by the equation $\operatorname {ord}_p(x)=q_{p,m}(x)m+r_{p,m}(x)$ where $0\leq r_{p,m}(x)<m$ .
(2) $q_m(x)=\prod _{p}p^{q_{p,m}(x)}$ and $r_m(x)=\prod _{p}p^{r_{p,m}(x)}$ .
(3) Set $\phi _m(x)$ to be the least positive integer such that $x\phi _m(x)$ is an mth power.
(4) We define the m-radical of x to be the product of all prime divisors of x whose order is not divisible by m. In other words,
$$\begin{align*}\operatorname{rad}_m(x)=\prod_{p\ \mathrm{s.t.}\ \operatorname{ord}_p(x)\not\equiv 0 \ (\mathrm{mod}\ m)}p.\end{align*}$$
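The four functions above are straightforward to compute from the prime factorization of x. The following is a minimal sketch; the Python names `prime_orders`, `q_m`, `r_m`, `phi_m`, and `rad_m` are illustrative, and trial division is used only for simplicity. The exponent of p in $\phi _m(x)$ is taken to be $(-\operatorname {ord}_p(x)) \bmod m$, which is the least nonnegative exponent making the total order of p a multiple of m.

```python
def prime_orders(x):
    """Return {p: ord_p(x)} for a positive integer x by trial division."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    orders = {}
    p = 2
    while p * p <= x:
        while x % p == 0:
            orders[p] = orders.get(p, 0) + 1
            x //= p
        p += 1
    if x > 1:
        orders[x] = 1  # the remaining cofactor is prime
    return orders


def _product(x, exponent_of):
    """Multiply p ** exponent_of(ord_p(x)) over the primes p dividing x."""
    result = 1
    for p, a in prime_orders(x).items():
        result *= p ** exponent_of(a)
    return result


def q_m(x, m):
    """Quotient part: ord_p(x) = q * m + r with 0 <= r < m."""
    return _product(x, lambda a: a // m)


def r_m(x, m):
    """Remainder part, so that x == q_m(x, m) ** m * r_m(x, m)."""
    return _product(x, lambda a: a % m)


def phi_m(x, m):
    """Least positive integer such that x * phi_m(x, m) is an m-th power."""
    return _product(x, lambda a: (-a) % m)


def rad_m(x, m):
    """Product of the primes p whose order in x is not divisible by m."""
    return _product(x, lambda a: 1 if a % m else 0)
```

For instance, `phi_m(72, 3)` returns 3, since $72 \cdot 3 = 6^3$.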
The function $r_m$ is related to $N_{m,\mathrm {can}}$ and the function $\phi _m$ is related to $N_{m,-}$ because of the following.
Proposition 3.13 Let $x\in {\mathbb {Z}}_{\geq 0}$ . Then we have
In particular, we have
From the above formulas, we obtain the following.
Proposition 3.14 Fix and integer $m>1$ and let $x\in {\mathbb {Z}}$ . Then:
-
(1) Both $r_m$ and $\phi _m$ are multiplicative functions.
-
(2) $\phi _m(x)r_m(x)=\operatorname {rad}_m(x)^m$ .
-
(3) $\phi _m(x)=1\iff r_m(x)=1$ .
-
(4) If $m=2$ , then $r_m(x)=\phi _m(x)$ .
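The four items of Proposition 3.14 are easy to test numerically. The sketch below is a sanity check on small positive integers, not a proof; the helper names and the search bound are our own assumptions, and multiplicativity is tested only on coprime pairs.

```python
from math import gcd


def factorize(x):
    """Return {p: ord_p(x)} for a positive integer x (trial division)."""
    factors, p = {}, 2
    while p * p <= x:
        while x % p == 0:
            factors[p] = factors.get(p, 0) + 1
            x //= p
        p += 1
    if x > 1:
        factors[x] = factors.get(x, 0) + 1
    return factors


def prod_over_primes(x, exponent):
    """Multiply p**exponent(ord_p(x)) over the primes p dividing x."""
    out = 1
    for p, e in factorize(x).items():
        out *= p ** exponent(e)
    return out


def r_m(x, m):
    return prod_over_primes(x, lambda e: e % m)


def phi_m(x, m):
    return prod_over_primes(x, lambda e: (-e) % m)


def rad_m(x, m):
    return prod_over_primes(x, lambda e: 1 if e % m else 0)


def check_proposition(m, bound=60):
    """Check items (1)-(4) for all positive integers below the bound."""
    for x in range(1, bound):
        assert phi_m(x, m) * r_m(x, m) == rad_m(x, m) ** m   # item (2)
        assert (phi_m(x, m) == 1) == (r_m(x, m) == 1)        # item (3)
        if m == 2:
            assert r_m(x, m) == phi_m(x, m)                  # item (4)
        for y in range(1, bound):
            if gcd(x, y) == 1:                               # item (1)
                assert r_m(x * y, m) == r_m(x, m) * r_m(y, m)
                assert phi_m(x * y, m) == phi_m(x, m) * phi_m(y, m)
    return True


print(all(check_proposition(m) for m in range(2, 6)))
```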
With these definitions in hand, we express our height functions on the stacky curve $\mathfrak {X}$ in terms of the functions $\phi _{m_P}(\lambda (P,t))^{\frac {1}{m_P}}$ and the classical Weil heights on the coarse space X.
Corollary 3.15 Use the notation of Notation 3.7. Consider the line bundle
with $0\leq d_i\leq m_i-1$ . Then
In particular, when $X={\mathbb {P}}^1$ , we have
where $\chi (\mathfrak {X})=-\deg K_{\mathfrak {X}} $ is the Euler characteristic of $\mathfrak {X}$ .
One interesting feature of the heights given by (3.10) is that they differentiate between rational and integral points on stacky curves. The connection between our heights and those of [Reference Ellenberg, Satriano and Zureick-Brown6] is the following, which is proved in Section 4.1.
Theorem 3.16 Fix a stacky curve $\mathfrak {X}=({\mathbb {P}}^1_{\mathbb {Q}}:(P_1,m_1), \ldots , (P_r,m_r))$ . Choose S to be the set of all finite primes of ${\mathbb {Z}}$ , and let ${\mathbb {P}}^1_{\mathbb {Z}}$ be the canonical model of ${\mathbb {P}}^1_{\mathbb {Q}}$ over ${\mathbb {Z}}$ . Let ${\mathcal {L}}$ be a line bundle on $\mathfrak {X}$ . Let $H^{\mathrm{ESZ}\text{-}\mathrm{B}}_{\mathcal {L}}$ be the height constructed in [Reference Ellenberg, Satriano and Zureick-Brown6] associated with ${\mathcal {L}}$ . Then there is some constant $C>0$ with
That is to say, up to a constant, the stacky heights from Definition 3.10 agree with the ESZ-B heights in [Reference Ellenberg, Satriano and Zureick-Brown6] when the coarse space is ${\mathbb {P}}^1_{\mathbb {Q}}$ .
We now explain how the functions $\phi _m$ and $r_m$ can be used to understand the difference between $H_{{\mathcal {L}}}$ and $H_{{\mathcal {L}}^{\otimes n}}$ .
Proposition 3.17 Let $m\in {\mathbb {Z}}_{\geq 1}$ and choose an integer $d\geq 0$ .
Proof Since $\phi _m$ is multiplicative, it suffices to prove the statement for $x=p^a$ where p is some prime. Note that $\phi _m((p^a)^{n})=p^{m-na \ \ \mod m}$ . Therefore, $\phi _m((p^a)^{-d \ \ \mod m})=p^{m+da \ \ \mod m}$ . On the other hand, $r_m((p^a)^{d \ \ \mod m})=p^{da \ \ \mod m}=p^{m+da\ \ \mod m}$ , as needed.
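The identity behind Proposition 3.17 can be tested on small inputs. The sketch below assumes the statement is $\phi _m(x^{-d \bmod m})=r_m(x^{d \bmod m})$, which is the form in which the proposition is invoked in the proof of Theorem 3.18; the function names and ranges are our own.

```python
def factorize(x):
    """Return {p: ord_p(x)} for a positive integer x (trial division)."""
    factors, p = {}, 2
    while p * p <= x:
        while x % p == 0:
            factors[p] = factors.get(p, 0) + 1
            x //= p
        p += 1
    if x > 1:
        factors[x] = factors.get(x, 0) + 1
    return factors


def r_m(x, m):
    """Product of p^(ord_p(x) mod m)."""
    out = 1
    for p, e in factorize(x).items():
        out *= p ** (e % m)
    return out


def phi_m(x, m):
    """Least positive integer phi such that x * phi is an m-th power."""
    out = 1
    for p, e in factorize(x).items():
        out *= p ** ((-e) % m)
    return out


def duality_holds(x, m, d):
    """Compare phi_m(x^(-d mod m)) with r_m(x^(d mod m))."""
    # Python's % always returns a value in [0, m), so (-d) % m is the
    # least nonnegative residue of -d modulo m.
    return phi_m(x ** ((-d) % m), m) == r_m(x ** (d % m), m)


print(all(duality_holds(x, m, d)
          for m in range(2, 7)
          for d in range(0, 2 * m)
          for x in range(1, 41)))
```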
The theory of heights on stacky curves differs from the classical theory of heights, as $H_{{\mathcal {L}}^{-1}}\neq \frac {1}{H_{\mathcal {L}}}+O(1)$ and $H_{{\mathcal {L}}^{\otimes n}}\neq (H_{\mathcal {L}})^n+O(1)$ . The functions $\phi _m$ and $r_m$ can be used to compute these quantities.
Theorem 3.18 (Duality theorem)
Let $\mathfrak {X}=\mathfrak {X}(X:(P_1,m_1), \ldots ,(P_r,m_r))$ be a stacky curve, and let
with $0\leq d_i\leq m_i-1$ . Fix an integer $n\neq 0$ and write $n_i=nd_i\ \ \mod m_i$ .
-
(1) Then we always have
$$ \begin{align*} H_{{\mathcal{L}}^{\otimes n}}&=(H_{{\mathcal{L}}}^{\mathrm{st}})^n\cdot \prod_{i=1}^r\phi_{m_i}(\lambda(P_i,t)^{n_i})^{\frac{1}{m_i}}\\ &=(H_{{\mathcal{L}}}^{\mathrm{st}})^n\cdot \prod_{i=1}^r\phi_{m_i}(\lambda(P_i,t)^{nd_i\quad \mod m_i})^{\frac{1}{m_i}}. \end{align*} $$
-
(2) If $n>0$ , then
$$ \begin{align*} H_{{\mathcal{L}}^{\otimes -n}}=(H_{{\mathcal{L}}}^{\mathrm{st}})^{-n}\cdot \prod_{i=1}^rr_{m_i}(\lambda(P_i,t)^{nd_i \quad\mod m_i})^{\frac{1}{m_i}}. \end{align*} $$
In particular,
$$\begin{align*}H_{{\mathcal{L}}^{-1}}=(H_{{\mathcal{L}}}^{\mathrm{st}})^{-1}\cdot \prod_{i=1}^rr_{m_i}(\lambda(P_i,t)^{d_i})^{\frac{1}{m_i}}.\end{align*}$$
Proof Write $nd_i=m_iq_i+r_i$ with $0\leq r_i<m_i$ . Note that ${\mathcal {L}}_{P_i}^{\otimes m_i}=\pi ^*_{\mathfrak {X}} {\mathcal {O}}_X(P_i)$ . So we have that
Therefore, we have
Now we have that
by the definition of the $r_i$ . Now let $n>0$ . By Proposition 3.17, we have $\phi _{m_i}(\lambda (P_i,t)^{-nd_i \ \ \mod m_i})=r_{m_i}(\lambda (P_i,t)^{nd_i \ \ \mod m_i})$ , giving the desired conclusion of (2).
3.2 Integral points on stacky curves
Here, we show that the height functions defined in Definition 3.10 can be used to obtain information about integral points on $\mathfrak {X}$ . Fix a stacky curve $\mathfrak {X}=\mathfrak {X}(X:({\mathbf {p}},{\mathbf {m}}))$ .
Let $H_{\mathfrak {X},P_i}$ be the height function associated with the line bundle ${\mathcal {L}}_{P_i}$ and put ${\mathcal {D}}_{\mathfrak {X}}=\prod _{i=1}^r{\mathcal {L}}_{P_i}.$ Recall that we have $H_{\mathfrak {X},P_i}(t)=\phi _{m_i}(\lambda (P_i,t))^{1/m_i}$ . We will work with the height
This is the stacky part of the anti-canonical height. We will prove that the set of integral points is contained in the set of points where $H_{{\mathcal {D}}_{\mathfrak {X}}}(t)=1$ . When we take $K={\mathbb {Q}}$ , we see that this condition is also sufficient. In other words, the S-integral points are those where the stacky part of the height is trivial. Following Darmon [Reference Darmon4], we have the following notion of integral points on a stacky curve.
Definition 3.19 (Darmon)
Let $\mathfrak {X}=(X:(P_1,m_1), \ldots , (P_r,m_r))$ be a stacky curve over a number field K, S a finite set of places of K containing all primes of bad reduction. Let $\underline {X}$ be a smooth proper model for X over ${\mathcal {O}}_{K,S}$ . The $(\underline {X},S)$ -integral points of $\mathfrak {X}$ (usually abbreviated to S-integral points of $\mathfrak {X}$ ) are the points $t\in X(K)$ such that
for all $P\in X(K)$ and $\nu \notin S$ .
We shall prove the following theorem.
Theorem 3.20 Let $\mathfrak {X}=(X:(P_1,m_1), \ldots ,(P_r,m_r))$ be a stacky curve over K satisfying our assumptions and choose S and a model $\underline {X}$ as we have specified. Then we have the following conclusions.
-
(1)
$$\begin{align*}\mathfrak{X}({\mathcal{O}}_{K,S,\underline{X}})\subseteq \bigcap_{i=1}^r\mathfrak{X}(P_i;K),\end{align*}$$
where $\mathfrak {X}(P_i;K)=\{t\in X(K)\colon H_{\mathfrak {X},P_i}(t)=1\}$ .
-
(2) If $K={\mathbb {Q}}$ , then
$$\begin{align*}\mathfrak{X}({\mathcal{O}}_{K,S,\underline{X}})= \bigcap_{i=1}^r\mathfrak{X}(P_i;K). \end{align*}$$
In particular, the set of S-integral points of $\mathfrak {X}$ is precisely the set of points where $H_{\mathfrak {X},P_i}(t)=1$ for all $i=1,\ldots ,r$ .
Fix a prime $\nu \notin S$ and write $(t\cdot P)_\nu =m_{P}^{e_{\nu ,P}(t)}\cdot q_{\nu ,P}(t)$ where $e_{\nu ,P}(t)\geq 0$ and $q_{\nu ,P}(t)\geq 0$ is not divisible by $m_P$ . In other words, $q_{\nu ,P}(t)$ is the $m_P$ -free part of $(t\cdot P)_\nu $ . Set $\mathrm {N}(\mathfrak {p}_\nu )=p_\nu ^{f(\nu )}$ . Then
and
Using the functions $\lambda (P,t)$ , we can find subsets of the rational points that contain all integral points.
Proposition 3.21 Suppose that $m_P>1$ . Define $\mathfrak {X}(P;K)=\{t\in X(K)\colon H_{\mathfrak {X},P}(t)=1\}$ . Then
Proof Note that $H_{\mathfrak {X},P}(t)=1$ is equivalent to $\phi _{m_P}(\lambda (P,t))=1$ , which is then equivalent to $r_{m_P}(\lambda (P,t))=1$ . Suppose now that t is an S-integral point. Then $(t\cdot P)_\nu \equiv 0\ \ \mod m_{P}\Rightarrow e_{\nu ,P}(t)>0$ for all $\nu \notin S$ . Thus,
whence $H_{\mathfrak {X},P}(t)=1$ as $\lambda (P,t)$ is an $m_P$ -power.
We see that each point P with multiplicity $m_P>1$ imposes a height dropping condition on the set of integral points. Thus, to study integral points, it suffices to study
We obtain the following: the stacky part of the anti-canonical height cuts out the integral points.
Corollary 3.22 Let $\mathfrak {X}=\mathfrak {X}(X:({\mathbf {p}},{\mathbf {m}}))$ be a stacky curve and ${\mathcal {D}}_{\mathfrak {X}}=\prod _{i=1}^r{\mathcal {L}}_{P_i}$ . Then
In other words, the integral points are precisely those points where the stacky part of the anti-canonical height is trivial, that is, equal to $1$ .
Example 3.23 Let $X=C$ be an elliptic curve. Then $H_{{\mathcal {D}}_{\mathfrak {X}}}=H_{-K_{\mathfrak {X}}}$ . In other words, the integral points of the stacky elliptic curve are precisely the points where the anti-canonical height vanishes. Since $\phi _m(x)=1\iff r_m(x)=1$ , we have that the S-integral points of a stacky elliptic curve are also precisely the points where the stacky canonical height vanishes. If $\mathfrak {X}$ is a scheme and X is an elliptic curve, then this is certainly true, as the canonical height is trivial and every rational point is integral, and vice versa.
Proof of Theorem 3.20
We have already shown part $(1)$ of Theorem 3.20 in (3.21). We turn to part $(2)$ and assume that $K={\mathbb {Q}}$ . We know that $\mathfrak {X}({\mathcal {O}}_{{\mathbb {Q}},S,\underline {X}})\subseteq \bigcap _{m_P>1}\mathfrak {X}(P;{\mathbb {Q}})$ by (3.21). We now show the reverse inclusion. Let $t\in X({\mathbb {Q}})$ with $H_{\mathfrak {X},P}(t)=1$ for all P with $m_P>1$ . Since $K={\mathbb {Q}}$ , we have that $\mathrm {N}(\mathfrak {p}_\nu )=p_\nu $ and $f(\nu )=1$ for all finite places $\nu $ . Fix P with $m_P>1$ . Toward a contradiction, suppose that $(t\cdot P)_{\nu _0}\not\equiv 0 \ \ \mod m_P$ for some $\nu _0\notin S$ . Then $e_{\nu _0,P}(t)=0$ . Notice that $H_{\mathfrak {X},P}(t)=1$ means that $\lambda (P,t)$ is an $m_P$ th power. Since $p_\nu \neq p_{\nu ^{\prime }}$ whenever $\nu \neq \nu ^\prime $ , unique factorization of integers gives
for some integers $z_\nu (t)$ . In particular, for $\nu _0$ , we have
Thus, $z_{\nu _0}(t)m_P=q_{\nu _0,P}(t)$ , which contradicts $q_{\nu _0,P}(t)$ being indivisible by $m_P$ . Thus, for all $m_P>1$ and $\nu \notin S$ , we have $(t\cdot P)_\nu \equiv 0 \ \ \mod m_P$ and t is an S-integral point of $\mathfrak {X}$ by definition.
4 Stacky curves with coarse space ${\mathbb {P}}^1$
We focus on the situation when the base curve is ${\mathbb {P}}^1$ . Let $X={\mathbb {P}}^1_{\mathbb {Q}}$ and $S=\{\nu _\infty \}$ and take ${\mathcal {L}}$ to be ${\mathcal {O}}_{{\mathbb {P}}^1}(1)$ , so the ample height is the usual one. We consider the stacky curve
In this situation, the $\lambda (P,t)$ can be easily computed.
Proposition 4.1 Let $t=[x:y]$ with $x,y$ coprime integers and suppose that $P_i=[a_i:b_i]$ where $a_i,b_i$ are coprime integers. Then $\lambda (P_i,t)=|a_iy-b_ix|.$
Proof We have that $(t\cdot P_i)_p=\max _n\{[x:y]\equiv [a_i:b_i]\ \ \mod p^n\}$ . Note that this means there is some $\lambda \neq 0\ \ \mod p^n$ with $(x,y)=\lambda (a_i,b_i)\ \ \mod p^n$ . Since $a_i,b_i$ have been taken coprime, p fails to divide at least one of $a_i$ and $b_i$ . Suppose that $p\nmid a_i$ (the other case is similar). Then $\lambda =\frac {x}{a_i} \ \ \mod p^n$ and therefore $b_ix-a_iy= 0 \ \ \mod p^n$ . Thus, $(t\cdot P_i)_p=\operatorname {ord}_p(b_ix-a_iy)$ . Then we have that
as needed.
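Combining Proposition 4.1 with Theorem 3.20(2) gives a concrete integrality test over ${\mathbb {Q}}$ : since $H_{\mathfrak {X},P_i}(t)=\phi _{m_i}(\lambda (P_i,t))^{1/m_i}$ , a point $t$ is integral exactly when each $\lambda (P_i,t)$ is an $m_i$ th power. The Python sketch below illustrates this. The function names are our own, the three half-points are a hypothetical example, and points $t$ equal to a stacky point (where $\lambda =0$ ) are simply excluded by convention in this sketch.

```python
from math import gcd


def normalize(pt):
    """Scale integer coordinates (not both zero) so that they are coprime."""
    a, b = pt
    g = gcd(a, b)
    return a // g, b // g


def lam(P, t):
    """lambda(P, t) = |a*y - b*x| for P = [a:b], t = [x:y] in lowest terms."""
    a, b = normalize(P)
    x, y = normalize(t)
    return abs(a * y - b * x)


def is_mth_power(n, m):
    """Exact test that the positive integer n is a perfect m-th power."""
    lo, hi = 1, n
    while lo <= hi:
        mid = (lo + hi) // 2
        value = mid ** m
        if value == n:
            return True
        if value < n:
            lo = mid + 1
        else:
            hi = mid - 1
    return False


def is_integral(stacky_points, t):
    """stacky_points is a list of pairs (P_i, m_i) with P_i = (a_i, b_i)."""
    for P, m in stacky_points:
        n = lam(P, t)
        if n == 0:  # t coincides with a stacky point: excluded in this sketch
            return False
        if not is_mth_power(n, m):  # phi_m(n) = 1 iff n is an m-th power
            return False
    return True


# Hypothetical example: half-points [0:1], [1:-1], [1:0], so that the three
# values of lambda are |x|, |x + y|, |y|.
half_points = [((0, 1), 2), ((1, -1), 2), ((1, 0), 2)]
print(is_integral(half_points, (9, 16)), is_integral(half_points, (1, 3)))
```

In this example, $t=[9:16]$ passes the test because $9$ , $16$ , and $25$ are squares, while $t=[1:3]$ fails because $3$ is not.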
Definition 4.2 (Euler characteristic of stacky curves [Reference Darmon4])
Let $\mathfrak {X}=(X:(P_1,m_1), \ldots ,(P_r,m_r))$ be a stacky curve. The Euler characteristic of $\mathfrak {X}$ is defined by the formula
where $g(X)$ is the genus of the curve X. We define the genus of a stacky curve by the formula $\chi (\mathfrak {X})=2-2g(\mathfrak {X})$ .
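For concrete signatures, the Euler characteristic and genus can be computed with exact rational arithmetic. The sketch below assumes the formula $\chi (\mathfrak {X})=2-2g(X)-\sum _{i=1}^r(1-\frac {1}{m_i})$ , which agrees with the identity $\sum _{i=1}^r(1-1/m_i)=2-\chi (\mathfrak {X})$ used later for coarse space ${\mathbb {P}}^1$ ; the function names are our own.

```python
from fractions import Fraction


def euler_characteristic(multiplicities, coarse_genus=0):
    """chi = 2 - 2 g(X) - sum of (1 - 1/m_i), as an exact fraction."""
    defect = sum(1 - Fraction(1, m) for m in multiplicities)
    return Fraction(2 - 2 * coarse_genus) - defect


def stacky_genus(multiplicities, coarse_genus=0):
    """Solve chi = 2 - 2 g for the genus g of the stacky curve."""
    return (2 - euler_characteristic(multiplicities, coarse_genus)) / 2


for signature in [(2, 2, 2), (2, 3, 6), (2, 2, 2, 2), (2, 3, 7)]:
    print(signature, euler_characteristic(signature), stacky_genus(signature))
```

For instance, ${\mathbb {P}}^1$ with three half-points has $\chi =\frac {1}{2}$ and genus $\frac {3}{4}$ , while the signatures $(2,3,6)$ and $(2,2,2,2)$ both give $\chi =0$ .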
We now begin assembling the necessary ingredients to compare our heights constructed in Section 3 to those in [Reference Ellenberg, Satriano and Zureick-Brown6]. We first work with $H_{{\mathcal {L}}}^{\mathrm {st}}$ .
Proposition 4.3 Let $\mathfrak {X}=\mathfrak {X}(X:(P_1, \ldots ,P_r),(m_1, \ldots , m_r))$ , and let ${\mathcal {L}}$ be a line bundle on $\mathfrak {X}$ with
where $0\leq d_i<m_i$ and M is a line bundle on X with $\pi _{\mathfrak {X}}$ being the coarse space map. Let $H_{{\mathcal {L}}}^{\mathrm {st},\mathrm{ESZ}\text{-}\mathrm{B}}$ be the stable height as constructed in [Reference Ellenberg, Satriano and Zureick-Brown6]. Then $H_{{\mathcal {L}}}^{\mathrm {st}}=H_{{\mathcal {L}}}^{\mathrm {st},\mathrm{ESZ}\text{-}\mathrm{B}}$ .
Proof In [Reference Ellenberg, Satriano and Zureick-Brown6], a general definition of the stable height is given. Let $m=\prod _{i=1}^rm_i$ . By the properties of the stable height described in [Reference Ellenberg, Satriano and Zureick-Brown6], we have that
On the other hand,
It is a fact that if L is a line bundle on X, then $H^{\mathrm {st},\mathrm{ESZ}\text{-}\mathrm{B}}_{\pi _{\mathfrak {X}}^* L}=H_L\circ \pi _{\mathfrak {X}}$ . Therefore, we have
Taking $m{\mathrm {th}}$ roots gives the desired equality.
Corollary 4.4 Let $\mathfrak {X}=\mathfrak {X}({\mathbb {P}}^1:(P_1, \ldots ,P_r),(m_1, \ldots , m_r))$ , and let ${\mathcal {L}}$ be a line bundle on $\mathfrak {X}$ . Then
Proof Let ${\mathcal {L}}$ be a line bundle on $\mathfrak {X}$ . Then we may write
On ${\mathbb {P}}^1$ , we have that ${\mathcal {O}}_{{\mathbb {P}}^1}(P_i)\cong {\mathcal {O}}_{{\mathbb {P}}^1}(1)$ . So, by definition, the stable height is
as needed.
We can now precisely define the heights on a stacky curve with coarse space ${\mathbb {P}}^1_{\mathbb {Q}}$ .
Definition 4.5 Let $\mathfrak {X}=({\mathbb {P}}^1_{\mathbb {Q}},(P_1,m_1), \ldots ,(P_r,m_r))$ be a stacky curve. Set $P_i=[a_i:b_i]$ with $a_i,b_i$ coprime integers and $\ell _i(t)=a_iy-b_ix$ when $t=[x:y]$ for $x,y$ coprime integers. Let ${\mathcal {L}}=\pi _{\mathfrak {X}}^* M\otimes \prod _{i=1}^r{\mathcal {L}}_{P_i}^{\otimes d_i}$ with $0\leq d_i<m_i$ . Then define
In particular, we have that
and
When $P_1=[0:1],P_2=[1:-1],P_3=[1:0]$ , and $m_1=m_2=m_3=2$ , we obtain the square root of the height function of Question (2.7). We now have enough to prove our main comparison theorem.
4.1 Proof of Theorem 3.16
We follow the argument given in [Reference Ellenberg, Satriano and Zureick-Brown6, p. 45]. Write ${\mathcal {L}}\cong \pi _{\mathfrak {X}}^*M\otimes \prod _{i=1}^r{\mathcal {L}}_{P_i}^{\otimes d_i}$ . From [Reference Ellenberg, Satriano and Zureick-Brown6, Section 2.3], we have a decomposition
where the $\delta _{{\mathcal {L}},p}$ ’s are the local discrepancies associated with $H_{{\mathcal {L}}}^{\mathrm{ESZ}\text{-}\mathrm{B}}$ . On the other hand, we have that
By Proposition 4.3, we have that
up to some positive constant. Therefore, it suffices to show that
Let $x\colon \mathrm {spec}\ {\mathbb {Q}}\rightarrow \mathfrak {X}$ be a rational point whose image is not any of the stacky points $P_i$ . Then, in [Reference Ellenberg, Satriano and Zureick-Brown6], there is a one-dimensional stack ${\mathcal {C}}$ called the tuning stack and a diagram:
Moreover, the local discrepancies can be computed at a prime p by
These degrees can be computed locally on $\mathfrak {X}$ in terms of the stacky points $P_i$ . In other words,
The local degree of ${\mathcal {L}}$ at $P_i$ in ${\mathbb {Q}}/{\mathbb {Z}}$ is $\frac {d_i}{m_i}$ . Following [Reference Ellenberg, Satriano and Zureick-Brown6], we have the local degrees
We obtain that the contribution at $P_i$ to the local discrepancy at p can be written as
Now write $d_i\ell _i(x)=q_im_i+r_i$ with $0\leq r_i\leq m_i-1$ . First, suppose that $r_i=0$ . Then
Now suppose that $r_i\neq 0$ . Then we have that
In other words, we have shown that
Thus, the local discrepancies are given by
Combining equations 4.2–4.4 gives the desired conclusion.
Question 4.6 Consider a stacky curve $\mathfrak {X}=(X:(P_1,m_1), \ldots ,(P_r,m_r))$ , a line bundle M on X with a chosen height function $H_M$ , and integers $0\leq d_i<m_i$ . Does
agree with the height constructed in [Reference Ellenberg, Satriano and Zureick-Brown6] up to a bounded constant when $X\neq {\mathbb {P}}^1_{\mathbb {Q}}$ ? In other words, does our stacky height machine recover the heights in [Reference Ellenberg, Satriano and Zureick-Brown6] for all stacky curves over all number fields? If not, can one define different size functions $\tilde {N}_{m_i,d_i}\colon {\mathbb {Z}}_{\geq 0}\rightarrow {\mathbb {Z}}_{\geq 0}$ so that the ESZ-B height associated with ${\mathcal {L}}$ is of the form
This result will follow provided one can show that the local degree of $\bar {x}^*{\mathcal {L}}$ with respect to $P_i$ over a prime p is
In this case, the argument given in Theorem 3.16 would give the desired result. Further, one might ask if these methods could be extended to compute the height functions of line bundles on certain higher-dimensional analogues of stacky curves.
4.2 Morphisms of stacky curves
We will require some results on morphisms between stacky curves.
Definition 4.7 (Darmon [Reference Darmon4])
Let $\mathfrak {X}_1=(X_1,(P,m_P)),\mathfrak {X}_2=(X_2,(Q,m_Q))$ be M-curves defined over a number field K. A morphism of stacky curves over K is a morphism of algebraic curves $\pi \colon X_1\rightarrow X_2$ defined over K such that for all $P\in X_1(K)$ with $\pi (P)=Q$ , we have that
where $e_\pi (P)$ is the ramification index of $\pi $ at P. We also define $e_{\underline {\pi }}(P)=\frac {e_{\pi }(P)m_P}{m_{Q}}$ the ramification index of $\underline {\pi }$ at P.
Now let $\mathfrak {X}=(X:{\mathbb {Q}};(P_1,m_1), \ldots ,(P_r,m_r))$ be an M-curve. For any $s<r$ , choose positive divisors $d_i$ of $m_i$ for $i=1,\ldots ,s$ . Then there is a multiplicity lowering morphism
defined by the identity morphism on X. The usefulness of this notion can be seen by the following.
Proposition 4.8 (Darmon [Reference Darmon4])
Let $\underline {\pi }\colon \mathfrak {X}_1\rightarrow \mathfrak {X}_2$ be a morphism of stacky curves defined over K. Then
In other words, a morphism of stacky curves preserves the notion of S-integral points.
Lemma 4.9 Let K be a field and
Define a bilinear form $L({\mathbf {v}},{\mathbf {x}})={\mathbf {v}}^TU^T{\mathbf {x}}$ . Let T be a non-singular matrix with entries in K. Then
Proof We have that
Direct computation shows that $(UT)^TT=(\det T)U^T$ . We then have
Note that if $P=[a:b]$ and $t=[x:y]$ where $a,b$ and $x,y$ are coprime integers, then $\lambda (P,t)=|ay-bx|$ by (4.1). In other words, $\lambda (P,t)=|L(P,t)|$ with L as defined in (4.9).
Lemma 4.10 Let $P=[a:b],t=[x:y]\in {\mathbb {P}}^1_{\mathbb {Q}}$ with $a,b$ and $x,y$ coprime integers. Fix an integer $m>1$ . Let $\alpha \colon {\mathbb {P}}^1_{\mathbb {Q}}\rightarrow {\mathbb {P}}^1_{\mathbb {Q}}$ be an automorphism. Let $\det \alpha $ be the smallest possible nonnegative determinant of an integral representation of $\alpha $ and similarly for $\det \alpha ^{-1}$ . Then we have
Proof Let L be as in (4.1). Then we have that
for some integers $d_1,d_2$ which account for common factors of $\alpha (P)$ and $\alpha (t)$ . Let $n_1=q_1m+r_1,n_2=q_2m+r_2$ be integers with $0\leq r_i<m$ .
Since $\phi _m$ is multiplicative, we have that $\phi _m(zw)\leq \phi _m(z)\phi _m(w)\leq \operatorname {rad}_m(z)^{m-1}\phi _m(w)$ . Therefore, using (4.6),
Applying the same reasoning using $\alpha ^{-1}$ , we have that
Therefore, we have
as required.
Corollary 4.11 Let $\mathfrak {X}=\mathfrak {X}({\mathbb {P}}^1;(P_1,m_1), \ldots ,(P_r,m_r))$ . Let $\alpha \colon {\mathbb {P}}^1_{\mathbb {Q}}\rightarrow {\mathbb {P}}^1_{\mathbb {Q}}$ be an automorphism of ${\mathbb {P}}^1_{\mathbb {Q}}$ . Let $Q_i=\alpha (P_i)$ and $\mathfrak {Y}=\mathfrak {Y}({\mathbb {P}}^1;(Q_1,m_1), \ldots ,(Q_r,m_r))$ . Then $\alpha $ induces an isomorphism $\underline {\alpha }\colon \mathfrak {X}\rightarrow \mathfrak {Y}$ . Let $\det \alpha $ and $\det \alpha ^{-1}$ be as in (4.10). Let $D(\alpha ,\mathfrak {X})=\prod _{i=1}^r\operatorname {rad}_{m_i}(\det \alpha )^{1-1/m_i}$ and similarly define $D(\alpha ^{-1},\mathfrak {X})$ . Suppose that $C_\phi $ is a constant (such a constant always exists) such that
Then
Proof By assumption, for each $i=1, \ldots ,r$ , we have from (4.10) and our assumption that
Similarly, we have
We can get slightly worse, but more understandable bounds as follows. We always have $\operatorname {rad}_m(x)\leq \operatorname {rad}(x)$ . Note that we have that $\sum _{i=1}^r(1-1/m_i)=2-\chi (\mathfrak {X})=2g(\mathfrak {X})$ . Thus, we in fact have
In particular, when studying the Northcott property, we may change the height by an automorphism. Thus, the Northcott property is stable under isomorphism, as expected.
4.3 Northcott property of the canonical height on stacky curves
We now investigate the properties of the canonical height on stacky curves, given by
When $\delta ({\mathbf {m}}) = 0$ , we see that the canonical height exhibits a clear duality with the anti-canonical height, and so the same argument shows that Northcott’s property will fail. When $\delta ({\mathbf {m}}) < 0$ , we then see at once that $H_{({\mathbf {a}}, {\mathbf {m}})}$ will have Northcott’s property as a consequence of the Weil height having Northcott’s property. It remains to consider Northcott’s property when $\delta ({\mathbf {m}})> 0$ . In this case, we have ${\mathbf {m}} = (2, m_2, m_3)$ with
It suffices to show that for any such pair $(m_2,m_3)$ , there exist integers $a,b,c$ such that the curve
has infinitely many primitive integral solutions. This is the content of Beukers’ paper [Reference Beukers1], and we are done.
5 On the Northcott property of canonical and anti-canonical heights on stacky curves
In this section, we prove Theorem 1.2, starting with Theorem 2.1. We start with a reduction procedure of a curve $\mathfrak {X}({\mathbb {P}}^1 : ({\mathbf {a}}, {\mathbf {m}}))$ which we describe colloquially. By convention, we shall take our weight vectors ${\mathbf {m}}$ to have the property that
Definition 5.1 Consider a stacky curve $\mathfrak {X}(X : ({\mathbf {a}}, {\mathbf {m}}))$ with ${\mathbf {a}}=(P_1, \ldots ,P_r)$ . Let ${\mathbf {i}}=i_1, \ldots ,i_k$ be a sub-sequence of $1,2,...,r$ . Then there is a morphism $\pi _{{\mathbf {i}}}\colon \mathfrak {X}(X : ({\mathbf {a}}, {\mathbf {m}}))\rightarrow \mathfrak {X}(X : ({\mathbf {a}}^\prime , {\mathbf {m}}^\prime ))$ where ${\mathbf {a}}^\prime =(P_{i_1}, \ldots ,P_{i_k})$ and ${\mathbf {m}}^\prime =(m_{i_1},\ldots ,m_{i_k})$ . The map is defined by taking the identity morphism on the coarse space X. We call such a morphism a totally ramified canonical covering.
The above construction defines a morphism by the definition of a morphism of M-curves. It is totally ramified in the sense that if i is some index that does not appear in ${\mathbf {i}}$ , then $\pi _{\mathbf {i}}$ has ramification index $m_i$ at $P_i$ . We use the term canonical because this type of construction can be used for any stacky curve. In particular, by taking ${\mathbf {i}}$ to be the empty set, we obtain the coarse space morphism. We will show that if Theorem 2.1 holds for a totally ramified canonical covering of the shape $\mathfrak {X}({\mathbb {P}}^1 : ({\mathbf {a}}^\prime , {\mathbf {m}}^\prime ))$ where ${\mathbf {a}}^\prime , {\mathbf {m}}^\prime $ is obtained from ${\mathbf {a}}, {\mathbf {m}}$ , respectively, by removing a subset of indices, then it also holds for $\mathfrak {X}({\mathbb {P}}^1 : ({\mathbf {a}}, {\mathbf {m}}))$ (see Theorem 5.2 below).
Theorem 5.2 Let $\mathfrak {X}({\mathbb {P}}^1 : (a_1, m_1), \ldots , (a_n, m_n))$ be a stacky curve. If the Northcott property fails for the height (2.2) for some totally ramified canonical cover of $\mathfrak {X}$ , then it will also fail for $\mathfrak {X}$ .
Proof Let $\mathfrak {X}$ be given as in the statement of Theorem 5.2. We may assume, after reindexing the points if necessary, that the Northcott property for the ESZ-B height fails for the totally ramified canonical cover given by
for some $k \leq n$ . This implies that, for some positive number $C_k$ depending at most on $a_1, \ldots , a_k$ and $m_1, \ldots , m_k$ , there are infinitely many integers $x,y$ such that
Next, note that the quotient
Observe that $\phi _m(s) \leq |s|^{m-1}$ for any integer s, with equality if and only if s is square-free. It follows that
and from here we immediately see from the triangle inequality that
Thus, by replacing $C_k$ with a larger positive number if necessary, we see that the Northcott property also fails for $H_{({\mathbf {a}}, {\mathbf {m}})}$ on $\mathfrak {X}$ .
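The elementary bound $\phi _m(s)\leq |s|^{m-1}$ used in the proof above, together with its equality case, can be checked numerically. The following Python sketch does this for small positive $s$ ; the helper names and the ranges are our own choices, and the check is a sanity test rather than a proof.

```python
def factorize(x):
    """Return {p: ord_p(x)} for a positive integer x (trial division)."""
    factors, p = {}, 2
    while p * p <= x:
        while x % p == 0:
            factors[p] = factors.get(p, 0) + 1
            x //= p
        p += 1
    if x > 1:
        factors[x] = factors.get(x, 0) + 1
    return factors


def phi_m(x, m):
    """Least positive integer phi such that x * phi is an m-th power."""
    out = 1
    for p, e in factorize(x).items():
        out *= p ** ((-e) % m)
    return out


def is_squarefree(x):
    """True when no prime divides x more than once (1 counts as square-free)."""
    return all(e == 1 for e in factorize(x).values())


def bound_holds(s, m):
    """Check phi_m(s) <= s^(m-1), with equality exactly for square-free s."""
    lhs, rhs = phi_m(s, m), s ** (m - 1)
    return lhs <= rhs and ((lhs == rhs) == is_squarefree(s))


print(all(bound_holds(s, m) for m in range(2, 6) for s in range(1, 500)))
```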
Now, given Theorem 5.2, it remains to consider certain minimal choices of ${\mathbf {m}}$ . We say that $\delta ({\mathbf {m}})$ is minimally nonnegative if there is no subsequence ${\mathbf {m}}^\prime $ of ${\mathbf {m}}$ such that $\delta ({\mathbf {m}}^\prime ) \leq 0$ . We have the following lemma characterizing the minimally nonnegative tuples.
Lemma 5.3 Suppose ${\mathbf {m}} = (m_1, \ldots , m_n)$ with $2 \leq m_1 \leq \cdots \leq m_n$ is minimally nonnegative. Then $n \leq 4$ .
Proof Suppose $n \geq 5$ . If there exist $m_i, m_j, m_k \geq 3$ , then the sub-sequence $(m_i, m_j, m_k)$ satisfies $\delta ((m_i, m_j, m_k)) \leq 0$ , so ${\mathbf {m}}$ is not minimally nonnegative. If $m_3 \geq 3$ , then such a choice of $i,j,k$ exists, since $n \geq 5$ . Therefore, we may assume that $m_1 = m_2 = m_3 = 2$ . But then, $(2,2,2,m_4)$ satisfies $\delta ((2,2,2,m_4)) \leq 0$ , so ${\mathbf {m}}$ is not minimally nonnegative.
It remains to deal with minimally nonnegative tuples with $n = 3,4$ . Before we proceed, we will require Theorem 2.2, which we prove now.
Proof of Theorem 2.2
As we remarked earlier, the proof given here was provided to us by Shnidman in [Reference Shnidman15].
For a given non-singular binary quartic form $F \in {\mathbb {Z}}[x,y]$ given by
we write $C_F$ for the curve defined by
The Jacobian of the genus one curve $C_{F}$ is the elliptic curve $E_{F}$ given by
where $I,J$ are the basic invariants given by
By 2-descent, we see that $C_{F}$ corresponds to a class c in $H^1({\mathbb {Q}} , E_{F}[2])$ . Note that for any integer d, the group $H^1({\mathbb {Q}}, E_{F}[2])$ is canonically isomorphic to $H^1({\mathbb {Q}} , E_{F}^{(d)}[2])$ , and under this isomorphism c corresponds to the class of $C_{F}^{(d)}$ in $H^1({\mathbb {Q}}, E_{F}^{(d)}[2])$ .
We now consider two cases. First suppose that c does not come from 2-torsion. In this case, it is immediate that $E_{F}^{(d_0)}$ has positive rank, where
This is because in this case $C_{F}^{(d_0)}({\mathbb {Q}}) \ne \emptyset $ .
If c comes from 2-torsion, then we note that $C_{F}^{(d)}({\mathbb {Q}})$ is non-empty for all $d \in {\mathbb {Z}}$ . That is, for all $d \in {\mathbb {Z}}$ , we have $C_{F}^{(d)}$ is isomorphic to $E_{F}^{(d)}$ over ${\mathbb {Q}}$ . We can then choose a class $c^\prime $ in $H^1({\mathbb {Q}}, E_{F}[2])$ , represented by a different binary quartic form G, and choose d such that the twist of the genus one curve ${\mathcal {C}} : z^2 = G(u,v)$ given by ${\mathcal {C}}^{(d)}$ has a rational point. This implies that $E_{F}^{(d)}$ has positive rank. Then, with this choice of d, we find that $C_{F}^{(d)}({\mathbb {Q}}) \ne \emptyset $ and $E_{F}^{(d)}$ has positive rank, which completes the proof.
With Theorem 2.2, we proceed to handle minimally nonnegative tuples, starting with the case $n = 4$ .
5.1 Minimally nonnegative tuples with $n = 4$
We begin with the case ${\mathbf {m}} = (2,2,2,2)$ , and we will need Theorem 2.2. By $3$ -transitivity of the action of $\operatorname {PGL}_2$ on ${\mathbb {P}}^1$ and Lemma 4.10, we may assume that three of the points are $0, 1, \infty $ , corresponding to the linear forms $x,y, x+y$ in the variables $x,y$ . We then write $\ell (x,y) = ax + by$ for the linear form representing the fourth half-point.
We prove the following as a warm-up.
Lemma 5.4 There exist integers $a,b$ such that the stack $\mathfrak {X}({\mathbb {P}}_{\mathbb {Q}}^1: (0, 2), (1, 2), (\infty , 2), (a/b, 2))$ has infinitely many rational points of ESZ-B height equal to one.
Proof In this case, the height is given by
so this is equal to one if and only if each of $x, y, x+y, ax+by$ is a square. To wit, we set
This induces the equation
which is solvable and whose (primitive) integral solutions are parametrized by
Inserting this into $ax + by$ gives
We then fix $u = u_0, v = v_0$ so that $2u_0 v_0$ and $u_0^2 - v_0^2$ are co-prime, then solve the linear diophantine equation
Given a solution $(a,b)$ to this diophantine equation, one obtains a genus one curve defined by
which is isomorphic to an elliptic curve, since it has a rational point given by $w_0^2 = F_{a,b}(u_0, v_0)$ . In particular, it must be isomorphic to its Jacobian. A simple calculation shows that the Jacobian of this genus one curve is given by the equation
so it suffices to find $a,b$ such that $E_{a,b}$ has positive rank. We find that for $u_0 = 1, v_0 = 5$ and $a = 17, b = -118$ , the curve $E_{a,b}$ has positive rank, and therefore $w^2 = F_{a,b}(u,v)$ will have infinitely many integral solutions $(u,v,w)$ . This gives infinitely many pairs $u,v$ such that $F_{17,-118}(u,v)$ is a square. This implies our result, since
The general case will follow by applying the same ideas in tandem with Theorem 2.2. Indeed, Theorem 2.2 shows that for any $a,b \in {\mathbb {Z}}$ such that $F_{a,b}(u,v) = a(u^2 - v^2)^2 + 4bu^2 v^2$ is non-singular, there exists $d \in {\mathbb {Z}}$ such that $C_F^{(d)}({\mathbb {Q}}) \ne \emptyset $ and $E_F^{(d)}$ has positive rank. Fixing such a d, we see that there are infinitely many co-prime integers $u,v,z$ such that
Recall that in this setup we have
whence
This concludes the proof for the ${\mathbf {m}} = (2,2,2,2)$ case.
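The substitution underlying the $(2,2,2,2)$ case can be made explicit. The sketch below assumes the Pythagorean parametrization $x=(u^2-v^2)^2$ , $y=(2uv)^2$ , which is how we read the form $F_{a,b}(u,v)=a(u^2-v^2)^2+4bu^2v^2$ stated above; it checks that $x$ , $y$ , and $x+y$ are squares and that $ax+by=F_{a,b}(u,v)$ . The coefficients $a=3$ , $b=5$ are hypothetical and chosen only for illustration.

```python
from math import isqrt


def is_square(n):
    """Exact test that the integer n is a perfect square."""
    return n >= 0 and isqrt(n) ** 2 == n


def quartic_form(a, b, u, v):
    """F_{a,b}(u, v) = a (u^2 - v^2)^2 + 4 b u^2 v^2."""
    return a * (u * u - v * v) ** 2 + 4 * b * u * u * v * v


def point_from_parameters(u, v):
    """Assumed parametrization: x and y are the squared legs of the
    Pythagorean triple (u^2 - v^2, 2uv, u^2 + v^2)."""
    x = (u * u - v * v) ** 2
    y = (2 * u * v) ** 2
    return x, y


def check_parametrization(a, b, bound=10):
    for u in range(1, bound):
        for v in range(1, bound):
            x, y = point_from_parameters(u, v)
            # The first three linear forms x, y, x + y take square values.
            assert is_square(x) and is_square(y) and is_square(x + y)
            assert x + y == (u * u + v * v) ** 2
            # The fourth linear form a*x + b*y becomes the quartic F_{a,b}(u, v).
            assert a * x + b * y == quartic_form(a, b, u, v)
    return True


print(check_parametrization(3, 5))  # hypothetical coefficients a = 3, b = 5
```

Under this assumption, the fourth half-point contributes trivially to the height exactly when $F_{a,b}(u,v)$ is itself a square, which is the condition studied through the genus one curve $w^2=F_{a,b}(u,v)$ .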
Note that if ${\mathbf {m}} = (m_1, m_2, m_3, m_4)$ is minimally nonnegative, then $m_1 = m_2 = 2$ , since $\delta ((3,3,3)) = 0$ . Thus, we may write
with $x_1, y_1$ square-free. We then write
Again, $z_i, w_j$ are square-free for $1 \leq i \leq m_3 - 1$ and $1 \leq j \leq m_4 - 1$ .
We now specialize to the points where $z_i = w_j = 1$ except for $i = 2$ and $j = 1,2$ , as well as $x_1 = y_1 = 1$ . Applying Theorem 2.2 and using the same argument as in the $(2,2,2,2)$ case, we see that there is a choice of $w_1 = d$ such that there are infinitely many choices of $x_2, y_2, w_2, z_2$ satisfying
The height of such a point is given by
It follows that there are infinitely many points of bounded height, and so Northcott’s property fails.
5.2 Minimally nonnegative tuples with $n = 3$
To complete the proof of Theorem 2.1, it remains to handle the cases when $n = 3$ and $\chi (\mathfrak {X}) \leq 0$ . We shall assume that $m_1 \leq m_2 \leq m_3$ . We then note that $\delta ({\mathbf {m}}) \leq 0$ if and only if one of the following conditions is satisfied:
-
(1) $m_1 \geq 3$ .
-
(2) $m_1 = 2, m_2 = 3, m_3 \geq 6$ .
-
(3) $m_1 = 2, m_2 \geq 4$ .
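The case distinction above can be confirmed by enumeration. The Python sketch below takes $\delta ({\mathbf {m}})=2-\sum _{i}(1-\frac {1}{m_i})$ , as in the identity $\chi (\mathfrak {X})=\delta ({\mathbf {m}})$ used in Section 6, and compares the sign of $\delta $ with conditions (1)-(3) for all ordered triples up to a bound. The bound and the function names are our own choices.

```python
from fractions import Fraction


def delta(multiplicities):
    """delta(m) = 2 - sum of (1 - 1/m_i), computed exactly."""
    return 2 - sum(1 - Fraction(1, m) for m in multiplicities)


def satisfies_conditions(m1, m2, m3):
    """Conditions (1)-(3) for a triple with 2 <= m1 <= m2 <= m3."""
    return (m1 >= 3
            or (m1 == 2 and m2 == 3 and m3 >= 6)
            or (m1 == 2 and m2 >= 4))


def mismatches(bound=40):
    """Triples where the sign of delta disagrees with the stated conditions."""
    bad = []
    for m1 in range(2, bound + 1):
        for m2 in range(m1, bound + 1):
            for m3 in range(m2, bound + 1):
                triple = (m1, m2, m3)
                if (delta(triple) <= 0) != satisfies_conditions(*triple):
                    bad.append(triple)
    return bad


print(mismatches())
```

An empty list means that, within the tested range, $\delta ({\mathbf {m}})\leq 0$ holds for exactly the triples described by the three conditions.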
We deal with the first case. We then write
Now set
and
Then the value of the height $H_{({\mathbf {a}}, {\mathbf {m}})}(x,y)$ in this case is given by
Observe that
for $i = 1,2,3$ .
It remains to choose d so that the plane cubic curve
has infinitely many rational points. This is an easy consequence of the seminal work of Stewart and Top [Reference Stewart and Top18, Theorem 7], which in turn depends on the important work of Stewart in [Reference Stewart17]. In particular, they showed that the number of cube-free integers d with $|d| \leq X$ such that the equation
defines an elliptic curve with rank at least $2$ is asymptotically greater than $X^{1/3}$ . We, of course, do not need such a strong statement; indeed, we only need one such d. This completes the proof for the case $m_1 \geq 3$ .
We proceed to handle the case $m_1 = 2, m_2 \geq 4$ . Using the same notation as in (5.2), we then set
and set
This gives a curve
We need to choose square-free d so that this curve has infinitely many integral solutions, and such a d exists by Theorem 2.2. The height $H_{({\mathbf {a}}, {\mathbf {m}})}(x,y)$ is given by
Note that
so we obtain the upper bound
It follows that there are infinitely many integers $x,y$ such that $H_{({\mathbf {a}}, {\mathbf {m}})}(x,y)$ remains bounded.
Finally, we resolve the case $m_1 = 2, m_2 = 3, m_3 \geq 6$ . In this case, we use the fact that there exist integers $a,b,c$ with a square-free, b cube-free, and c sixth power-free such that the equation
has infinitely many primitive solutions (see [Reference Darmon and Granville5, Section 6.3]). Thus, by fixing such a triple $(a,b,c)$ and setting
and
we specialize a point on $\mathfrak {X}({\mathbb {P}}^1 : (0, 2), (\infty , 3), (-1, 6))$ to the curve given by (5.3). The height of such a point is then bounded in terms of $a,b,c$ only, and is thus absolutely bounded. This shows that the Northcott property fails in this case as well.
This concludes the proof of Theorem 2.1.
5.3 Proof of Theorem 2.5
We proceed to prove Theorem 2.5. The claim when $\delta ({\mathbf {m}}) = 0$ is covered in Theorem 2.1, so we will not discuss it again. When $\delta ({\mathbf {m}})> 0$ , we note that $n = 3$ , and that in each such case, there exist integers $a_{\mathbf {m}}, b_{\mathbf {m}}, c_{\mathbf {m}}$ such that the equation
has infinitely many primitive integral solutions (see, for example, [Reference Beukers1]). This shows that the Northcott property fails for $H^0$ .
We may now work with the case when $\delta ({\mathbf {m}}) < 0$ . We see that the height $H^0$ is bounded below by
so it suffices to show that this quantity necessarily goes to infinity. We then use the notation from (5.2) to obtain the equation
By convention, we have that $x_{i,j}$ is square-free for $1 \leq i \leq 3$ and $1 \leq j \leq m_i - 1$ . Thus, (5.4) is equal to
Viewing these products as coefficients in (5.5), we see that if $H^0$ is to be bounded, these coefficients must be bounded. Therefore, it suffices to check that for a fixed triple of integers $a,b,c$ , the equation
has finitely many primitive integer solutions when $1/m_1 + 1/m_2 + 1/m_3 < 1$ . But this is exactly the content of Darmon and Granville’s paper [Reference Darmon and Granville5], so we are done.
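As a sanity check on the case distinction according to the sign of $\delta ({\mathbf {m}})$ used in these proofs, the following sketch evaluates $\delta ({\mathbf {m}}) = 2-\sum _i (1-\frac {1}{m_i})$ exactly for a few signatures. Only the formula for $\delta $ is taken from the text; the function names `delta` and `classify` are ours and purely illustrative.

```python
from fractions import Fraction


def delta(m):
    """Exact value of delta(m) = 2 - sum_i (1 - 1/m_i) for a signature m."""
    return 2 - sum(1 - Fraction(1, mi) for mi in m)


def classify(m):
    """Return the sign of delta(m) as a string (illustrative helper)."""
    d = delta(m)
    if d > 0:
        return "positive"
    if d == 0:
        return "zero"
    return "negative"


if __name__ == "__main__":
    signatures = [(2, 2, 5), (2, 3, 5), (2, 3, 6), (2, 4, 4),
                  (3, 3, 3), (2, 3, 7), (2, 2, 2, 2, 2)]
    for m in signatures:
        print(m, delta(m), classify(m))
```

Exact rational arithmetic is used so that the borderline signatures such as $(2,3,6)$ are classified without floating-point error.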
6 Northcott property of perturbed anti-canonical heights and the $abc$ -conjecture
In this section, we prove Theorem 1.3, starting with Theorem 2.6. We consider the problem of recovering the Northcott property for a modified ESZ-B anti-canonical height on the stacky curve
Here, the modified height takes the shape
Since $\chi (\mathfrak {X})=\delta ({\mathbf {m}})=2-\sum _{i=1}^n(1-\frac {1}{m_i})\leq 0$ , our goal is to show that
assuming the $abc$ -conjecture. Recall that we have shown that $ H^{\chi (\mathfrak {X})}_{({\mathbf {a}},{\mathbf {m}})}$ does not have the Northcott property unconditionally. Thus, we must show that $ H^{\chi (\mathfrak {X})+\kappa }_{({\mathbf {a}},{\mathbf {m}})}$ has the Northcott property for all $\kappa>0$ . First, assume that $\chi (\mathfrak {X})=0$ . The Northcott property for the standard height implies that $H_{({\mathbf {a}}, {\mathbf {m}})}^\delta (x,y)$ has the Northcott property whenever $\delta>0$ . So $\inf \{\delta \in {\mathbb {R}}\colon H^\delta _{({\mathbf {a}},{\mathbf {m}})}\ \mathrm{has\ the\ Northcott\ property}\}=0=\chi (\mathfrak {X})$ as needed. Now suppose that $\chi (\mathfrak {X})<0$ . Assume, without loss of generality, that $m_1 \leq m_2 \leq \cdots \leq m_n$ . We then write
We have
It follows that
Suppose that the following inequality holds for any $\varepsilon>0$ :
Then, by multiplying both sides of the inequality by $ \max \{\vert x\vert ,\vert y\vert \}^{\chi (\mathfrak {X})+\kappa }$ , we obtain
Taking $\varepsilon =\frac {\kappa }{2}$ , we have that
Thus, $H_{({\mathbf {a}}, {\mathbf {m}})}^{\chi (\mathfrak {X})+\kappa } (x,y)$ must have the Northcott property as it cannot remain bounded by the usual Northcott property for ${\mathbb {P}}^1$ . Therefore, we are done if we can confirm inequality (6.1). To do so, we require the following proposition, due to Granville [Reference Granville11].
Proposition 6.1 (Granville)
Suppose that the $abc$ -conjecture holds. Then, for any binary form F with nonzero discriminant and $\varepsilon> 0$ , we have
In other words, if the $abc$ -conjecture holds, then the radical of $F(m,n)$ will be quite large compared to the variables $m,n$ (provided that the degree is at least 3).
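To get a feeling for the quantities compared in Proposition 6.1, one can tabulate the radical of $F(m,n)$ against $\max \{|m|,|n|\}$ for small coprime inputs. The sketch below does this for the hypothetical example $F(m,n) = mn(m+n)$ , which is our choice and not a form singled out in the text; it illustrates the objects involved and verifies nothing about the conjecture.

```python
from math import gcd, log


def rad(n):
    """Product of the distinct primes dividing the nonzero integer n."""
    n = abs(n)
    if n == 0:
        raise ValueError("rad(0) is undefined")
    r, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            r *= p
            while n % p == 0:
                n //= p
        p += 1
    return r * n if n > 1 else r


def form_value(m, n):
    """Hypothetical example form F(m, n) = m * n * (m + n) of degree 3."""
    return m * n * (m + n)


def log_ratio(m, n):
    """log rad(F(m, n)) / log max(|m|, |n|), for max(|m|, |n|) >= 2."""
    return log(rad(form_value(m, n))) / log(max(abs(m), abs(n)))


if __name__ == "__main__":
    for m, n in [(1, 8), (5, 27), (3, 125), (7, 11), (32, 49)]:
        assert gcd(m, n) == 1
        print(m, n, rad(form_value(m, n)), round(log_ratio(m, n), 3))
```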
We will apply Proposition 6.1 to reduce the proof of Theorem 2.6 to a linear programming problem.
6.1 A linear program bound
Observe that for each $1 \leq i \leq n$ ,
Applying Proposition 6.1 to the binary form
in conjunction with the above observation, we obtain
Similarly, for each i, we have the bound
Taking logarithms and writing $y_{i,j} = \log |z_{i,j}|$ , we then have an optimization problem:
subject to
and
where $B = \max \{|x|, |y|\}$ . Further, we have $y_{i,j} \geq 0$ for all $i,j$ .
We emphasize that, at this point, integrality no longer plays a role, and neither do the syzygies relating the $z_{i,j}$ ’s. Indeed, we only need to solve the above linear program allowing arbitrary real inputs.
Now put
for $1\leq i\leq n$ and $1\leq j\leq m_i-1$ . Write $c_{i,m_i}=0$ and let ${\mathbf {c}}=(c_{i,j})$ be the column vector with
We have that ${\mathbf {c}}\in {\mathbb {R}}^{N}$ where $N=\sum _{i=1}^n m_i$ .
Let A be the matrix with rows representing the constraints,
If we take ${\mathbf {e}}_{i,j}$ to be a basis of ${\mathbb {R}}^N$ , then the rows of A are given by
Finally, let ${\mathbf {b}}$ be the column vector with $n+1$ entries representing the constraints given by (6.7) and (6.8). In other words, we have
Our linear programming problem is then the following. Let ${\mathbf {y}}=(y_{ij})$ be ordered as above.
The dual linear program is
where ${\mathbf {x}}=[x_0,x_1,\dots ,x_n]$ . We call a vector ${\mathbf {x}}$ dual feasible if $A^T{\mathbf {x}}\leq {\mathbf {c}}$ and a vector ${\mathbf {y}}$ primal feasible if $A{\mathbf {y}}\geq {\mathbf {b}}$ . We have the following well-known weak duality statement.
Lemma 6.2 (Weak duality)
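Let A be an $m\times n$ matrix with real entries, ${\mathbf {c}}$ an $n\times 1$ real vector, and ${\mathbf {b}}$ an $m\times 1$ real vector. Consider the primal linear program: minimize ${\mathbf {c}}^T{\mathbf {y}}$ subject to $A{\mathbf {y}}\geq {\mathbf {b}}$ and ${\mathbf {y}}\geq 0$ ,
and the dual linear program: maximize ${\mathbf {b}}^T{\mathbf {x}}$ subject to $A^T{\mathbf {x}}\leq {\mathbf {c}}$ and ${\mathbf {x}}\geq 0$ .
Let ${\mathbf {y}}$ be any primal feasible vector and ${\mathbf {x}}$ a dual feasible vector. Then ${\mathbf {b}}^T{\mathbf {x}}\leq {\mathbf {c}}^T{\mathbf {y}}$ .
Proof Let $A=(a_{i,j})$ . Because ${\mathbf {y}}$ is primal feasible, we have $A{\mathbf {y}}\geq {\mathbf {b}}$ . Therefore, for all $1\leq i\leq m$ , we have $\sum _{j=1}^n a_{i,j}y_j\geq b_i$ .
Multiplying by $x_i\geq 0$ and summing over all i, we obtain inequality (6.17): $\sum _{i=1}^m\sum _{j=1}^n x_i a_{i,j}y_j\geq \sum _{i=1}^m b_i x_i$ .
On the other hand, because ${\mathbf {x}}$ is dual feasible, we have that $A^T{\mathbf {x}}\leq {\mathbf {c}}$ , so for each $1\leq j\leq n$ , we have $\sum _{i=1}^m a_{i,j}x_i\leq c_j$ .
Multiplying by $y_j\geq 0$ and summing over all j gives inequality (6.18): $\sum _{j=1}^n\sum _{i=1}^m x_i a_{i,j}y_j\leq \sum _{j=1}^n c_j y_j$ .
Combining inequality (6.17) and inequality (6.18) gives ${\mathbf {b}}^T{\mathbf {x}}\leq {\mathbf {x}}^T A{\mathbf {y}}\leq {\mathbf {c}}^T{\mathbf {y}}$ .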
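Weak duality is easy to test numerically. The sketch below uses a small hypothetical instance: the matrix `A` and the vectors `b`, `c`, `x`, `y` are our own choices and are not the linear program of this section. It assumes the sign conventions ${\mathbf {y}}\geq 0$ and ${\mathbf {x}}\geq 0$ , which the proof uses when multiplying the inequalities by $x_i$ and $y_j$ .

```python
def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def mat_vec(A, v):
    """Matrix-vector product A v for a list-of-rows matrix A."""
    return [dot(row, v) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def is_primal_feasible(A, b, y):
    """Check A y >= b and y >= 0 (assumed convention)."""
    return all(s >= t for s, t in zip(mat_vec(A, y), b)) and all(t >= 0 for t in y)


def is_dual_feasible(A, c, x):
    """Check A^T x <= c and x >= 0 (assumed convention)."""
    At = transpose(A)
    return all(s <= t for s, t in zip(mat_vec(At, x), c)) and all(t >= 0 for t in x)


# Hypothetical toy instance with 2 constraints and 3 variables.
A = [[1, 1, 0],
     [0, 1, 1]]
b = [2, 3]
c = [2, 3, 2]
y = [1, 1, 2]   # primal feasible: A y = [2, 3]
x = [1, 1]      # dual feasible:  A^T x = [1, 2, 1]

if __name__ == "__main__":
    assert is_primal_feasible(A, b, y) and is_dual_feasible(A, c, x)
    # Weak duality: b^T x <= x^T A y <= c^T y.
    print(dot(b, x), dot(x, mat_vec(A, y)), dot(c, y))
```

Any dual feasible vector therefore certifies a lower bound for the primal objective, which is how the lemma is used in this section.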
Returning to our problem, the weak duality theorem tells us that it suffices to find a dual feasible solution ${\mathbf {x}}=[x_0,\dots , x_n]$ such that ${\mathbf {b}}^{T}{\mathbf {x}}\geq -\chi (\mathfrak {X})+\varepsilon $ . In other words, we seek ${\mathbf {x}}=[x_0,\dots , x_n]$ with
Take ${\mathbf {x}}=[1,\frac {1}{m_1},\frac {1}{m_2},\dots ,\frac {1}{m_n}]$ . We first show that ${\mathbf {x}}$ is dual feasible. In this case, A is an $(n+1)\times \sum _{i=1}^n m_i$ matrix. So a row of $A^T$ is indexed by a pair $(i,j)$ with $1\leq i\leq n$ and $1\leq j\leq m_i$ . We have that the $(i,j)$ entry of $A^T{\mathbf {x}}={\mathbf {x}}^{T}A$ can be computed as
Therefore, to show that ${\mathbf {x}}$ is dual feasible for an arbitrary $\mathfrak {X}$ , we need that
In our case, the $(i,j)$ entry of $A^T{\mathbf {x}}$ is given by
so ${\mathbf {x}}$ is dual feasible. We then compute
Therefore, ${\mathbf {x}}=\left (1,\dfrac {1}{m_1},\dots ,\dfrac {1}{m_n} \right )$ is a dual feasible solution and
By the weak duality theorem, we have that
Exponentiating gives
As $B=\max \{\vert x\vert ,\vert y\vert \}$ , we have verified inequality (6.1). Consequently, conditional on the $abc$ -conjecture, we have
when $\chi (\mathfrak {X})\leq 0$ .
6.2 Proof of Theorem 1.3
One direction of the theorem is provided by Theorem 2.6, which we proved in the previous subsection. It suffices to prove the converse.
Actually, for the converse, we only need the assertion that for any $\kappa> 0$ and $m \geq 4$ , the function $H_{-K_{\mathfrak {X}_m}}({\mathbf {x}}) \cdot H({\mathbf {x}})^\kappa $ has Northcott’s property, where $\mathfrak {X}_m = \mathfrak {X}({\mathbb {P}}^1 : ((0, 1, \infty ), (m, m, m)))$ . To see this, let us fix $\varepsilon> 0$ . Choose $0 < \kappa < \varepsilon /3$ and choose $m \in {\mathbb {N}}$ sufficiently large so that
By hypothesis, we have
Trivially, we see that
for all $u \in {\mathbb {Z}}$ . Hence, (6.19) implies
Since $x,y,x+y$ are pairwise co-prime, we have $\operatorname {rad}(x)\operatorname {rad}(y)\operatorname {rad}(x+y) = \operatorname {rad}(xy(x+y))$ ; hence,
Raising both sides to the $m/(m-1)$ power, we have
by our hypotheses on $m, \kappa $ . It follows that
which is plainly equivalent to the $abc$ -conjecture, provided we adjust the implied constant.
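The coprimality identity $\operatorname {rad}(x)\operatorname {rad}(y)\operatorname {rad}(x+y) = \operatorname {rad}(xy(x+y))$ used in this argument can be checked on small inputs, together with the usual "quality" $\log (x+y)/\log \operatorname {rad}(xy(x+y))$ of a triple. The quality is the standard notion from the literature on the $abc$ -conjecture rather than a quantity defined in this paper, and the helper names below are hypothetical.

```python
from math import gcd, log


def rad(n):
    """Product of the distinct primes dividing the nonzero integer n."""
    n = abs(n)
    if n == 0:
        raise ValueError("rad(0) is undefined")
    r, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            r *= p
            while n % p == 0:
                n //= p
        p += 1
    return r * n if n > 1 else r


def radical_identity_holds(x, y):
    """Check rad(x) rad(y) rad(x+y) == rad(x y (x+y))."""
    return rad(x) * rad(y) * rad(x + y) == rad(x * y * (x + y))


def quality(x, y):
    """log(x + y) / log rad(x y (x + y)) for coprime positive x, y."""
    return log(x + y) / log(rad(x * y * (x + y)))


if __name__ == "__main__":
    ok = all(radical_identity_holds(x, y)
             for x in range(1, 60) for y in range(1, 60) if gcd(x, y) == 1)
    print("identity holds for all coprime pairs tested:", ok)
    print("quality of (1, 8, 9):", quality(1, 8))
```

The identity fails without coprimality, for example for $x = y = 2$ , which is why the pairwise coprimality of $x, y, x+y$ is needed.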
7 Quantitative arithmetic of stacky curves
7.1 Crude bound for $N_{\mathbf {m}}(T)$ , with ${\mathbf {m}} = (2,2,m)$
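Here, we deal with the case ${\mathbf {m}} = (2,2,m)$ . The Euler characteristic is equal to $\chi (\mathfrak {X}) = 2 - \frac {1}{2} - \frac {1}{2} - \left (1 - \frac {1}{m}\right ) = \frac {1}{m}$ .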
The height $H(x,y)$ is given by
where $x = x_1 x_2^2, y = y_1 y_2^2$ and
with $x_1, y_1, z_1, \ldots , z_{m-1}$ square-free. We normalize the height by raising it to the mth power, obtaining the bound
From here, we see that
whence we conclude that
This bound and $|z_m| \geq 1$ implies that
From here, we obtain a crude upper bound for $N_{\mathbf {m}}(T)$ , which proves Theorem 2.7. Indeed, having chosen $x_1, y_1, z_1, \ldots , z_{m-1}$ , there are then $O(T^{1/m}/(|x_1 y_1|^{1/2} |z_1 \cdots z_{m-1}|))$ possibilities for $z_m$ . Having chosen $z_m$ as well, there are then $O_{\varepsilon } (T^\varepsilon )$ possibilities for $x_2, y_2$ , since $x_2, y_2$ are polynomially bounded, so they will be determined by the norm-equation (7.1) up to a log factor. Thus, there are
possible solutions to (7.1) satisfying the height bound (7.2). We evaluate this as
To give a lower bound, we choose square-free integers $a,b,c$ so that the curve
has a primitive integral solution. Such a triple is guaranteed to exist (see [Reference Beukers1]). Then we can parametrize some of the solutions by a triple of integral binary forms $(F,G, h)$ where $\deg F = \deg G = m$ and $\deg h = 2$ . By
The height is
so if we treat $a,b,c$ as constants, then
Therefore, we are looking for solutions to the Thue inequality
If we restrict $u,v$ so that
then we see that the above height bound is satisfied. Thus, $N_m(T) \gg T^{1/m}$ .
7.2 Proof of Theorem 2.8
In this section, we prove Theorem 2.8. To do so, we will show that $N_2(T) = O \left (T^{1/2} (\log T)^3 \right )$ and give a separate argument to show that $N_2(T) \gg T^{1/2} (\log T)^3$ . The incompatibility of these two arguments represents the main obstacle as to why an asymptotic formula for $N_2(T)$ remains elusive.
We count rational points of bounded height on the curve $\mathfrak {X}({\mathbb {P}}^1_{\mathbb {Q}};(0,2), (-1,2),(\infty ,2))$ with the height on ${\mathbb {P}}^1$ given by (2.7). On writing
(note that this differs from the notation used elsewhere in the paper), we then have
and the max on the right-hand side is dependent only on the relative size of $|a|,|b|$ . If we write
then we further obtain the expression
We may assume without loss of generality that $|x_1 y_1^2| \geq |x_2 y_2^2|$ and $x_1> 0$ , so that
We consider the problem of counting integral points on the variety defined by (7.4), subject to the constraint
To obtain the upper bound, we must dissect (7.5) into suitable ranges. When $|x_1 x_2 x_3| \leq T^{1/2}$ , we fix $x_1, x_2, x_3$ and treat (7.4) as a diagonal ternary quadratic form, say $Q_{\mathbf {x}}$ . It is then the case that
for $i = 1,2,3$ , and by Corollary 2 of [Reference Browning and Heath-Brown3], we then have the estimate
for the number of ${\mathbf {y}} \in {\mathbb {Z}}_{\ne 0}^3$ satisfying (7.5) and (7.4) provided that the quadratic form $Q_{\mathbf {x}}$ has a rational zero. Otherwise, it is clear that there will be no contribution. Thus, we must estimate
This is similar to the work of Guo in [Reference Guo12], except he counted with respect to the height $\lVert {\mathbf {x}} \rVert _\infty $ . Nevertheless, the techniques are similar, and again this may be of independent interest.
Next, we must deal with the case when $|x_1 x_2 x_3| \geq T^{1/2}$ . For this, it suffices to observe from (7.6) that $|x_1 x_2 x_3| \geq T^{1/2}$ implies
We then treat (7.4) as a linear form $L_{\mathbf {y}}$ in ${\mathbf {x}}$ . We use this to show that the contribution for each ${\mathbf {y}}$ is $O\left (T^{1/2} |y_1 y_2 y_3|^{-1} + 1 \right )$ , which gives an acceptable contribution upon summing over ${\mathbf {y}}$ .
For the lower bound, we first restrict $y_1, y_2, y_3 \in {\mathbb {Z}}_{\ne 0}$ satisfying
for some explicit $\delta> 0$ to be specified later. We note that to obtain the correct order of magnitude, it is permissible to choose any $\delta> 0$ .
Having fixed ${\mathbf {y}} = (y_1, y_2, y_3)$ , we consider the simultaneous conditions (7.4) and (7.5). This gives rise to a binary form inequality of the shape
Because $|y_1 y_2 y_3|$ is small, we can count the number of solutions ${\mathbf {x}}$ to this inequality with reasonable precision. However, even with $|y_1 y_2 y_3|$ small, counting the number of solutions ${\mathbf {x}}$ with enough uniformity still appears to be a challenging task, because the binary form in (7.7) is singular. This difficulty is exacerbated by the fact that we will eventually need to apply a square-free sieve to produce triples ${\mathbf {x}}$ with each coordinate square-free.
To get around this issue, we simply count solutions to (7.7) with $x_1, x_2$ satisfying the inequalities
for some positive numbers $c_1, c_2$ . This has the effect that the long cusps inherent in (7.7) are removed, and reduces the problem to a more straightforward geometry of numbers question.
7.2.1 Upper bounds
To obtain upper bounds, it is crucial to view (7.4) as a plane in $x_1, x_2, x_3$ when $|y_1 y_2 y_3| \leq T^{1/2}$ and as a conic in $y_1, y_2, y_3$ when $|x_1 x_2 x_3| \leq T^{1/2}$ . We call the former the linear case and the latter the quadratic case. We proceed to deal with the linear case below.
We shall first suppose that $|y_1 y_2 y_3| \leq T^{1/2}$ is fixed, and count the triples $(x_1, x_2, x_3)$ and $(y_1, y_2, y_3)$ for which (7.4) holds.
The key is the following lemma on counting points in sublattices of ${\mathbb {Z}}^2$ .
Lemma 7.1 Let $\Lambda \subset {\mathbb {Z}}^2$ be a lattice. Then, for all positive real numbers $R_1, R_2$ , the number of primitive integral points ${\mathbf {x}} \in \Lambda $ satisfying $|x_i| \leq R_i, i = 1,2$ is at most $O \left (R_1 R_2/\det (\Lambda ) + 1 \right )$ .
Proof If the rectangle $[-R_1, R_1] \times [-R_2, R_2]$ contains at least two primitive vectors in $\Lambda $ , say ${\mathbf {x}}_1, {\mathbf {x}}_2$ , then since this rectangle is convex it contains the parallelogram with vertices $\pm {\mathbf {x}}_1, \pm {\mathbf {x}}_2$ . The area of this parallelogram is at least as large as $\det \Lambda $ , since the lattice spanned by ${\mathbf {x}}_1, {\mathbf {x}}_2$ is a sublattice of $\Lambda $ . It thus follows that
Otherwise, the rectangle $[-R_1, R_1] \times [-R_2, R_2]$ contains at most one primitive vector in $\Lambda $ . This completes the proof.
The strength of this lemma is that it gives a strong upper bound even in lopsided boxes.
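The bound of Lemma 7.1 can be compared with a brute-force count in small cases. The sketch below enumerates primitive points of the lattice $\{(x_1, x_2) \in {\mathbb {Z}}^2 : x_1 \equiv a x_2 \pmod q\}$ , which has determinant q, in a box $|x_i| \leq R_i$ . This congruence lattice and the function name `count_primitive` are our illustrative assumptions; the lemma itself applies to any sublattice of ${\mathbb {Z}}^2$ .

```python
from math import gcd


def count_primitive(a, q, R1, R2):
    """Count primitive (x1, x2) with x1 = a*x2 (mod q), |x1| <= R1, |x2| <= R2.

    The lattice {x1 = a*x2 mod q} has index q in Z^2, so its determinant is q.
    """
    count = 0
    for x2 in range(-R2, R2 + 1):
        for x1 in range(-R1, R1 + 1):
            if (x1 - a * x2) % q == 0 and gcd(x1, x2) == 1:
                count += 1
    return count


if __name__ == "__main__":
    a, q = 2, 5
    # A balanced box and two lopsided boxes of comparable area.
    for R1, R2 in [(30, 30), (300, 3), (900, 1)]:
        n = count_primitive(a, q, R1, R2)
        print((R1, R2), n, R1 * R2 / q + 1)
```

The printed pairs show the count next to $R_1 R_2/\det (\Lambda ) + 1$ ; the lemma only asserts that their ratio is bounded by an absolute constant.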
Given (7.4), it follows that there is at least one $i \in \{2,3\}$ such that
whence
Without loss of generality, we assume that this holds for $i = 2$ . Suppose that $M_1 \leq x_1 < 2M_1$ . By (7.5), we have
whence
Applying Lemma 7.1 to the lattice defined by the congruence $y_1^2 x_1 - y_3^2 x_3 \equiv 0 \pmod {y_2^2}$ which has determinant equal to $y_2^2$ , there are
possibilities for $x_1, x_3$ , which then determines $x_2 = (y_1^2 x_1 - y_3^2 x_3)/y_2^2$ . Similarly, applying Lemma 7.1 to the lattice defined by $y_1^2 x_1 + y_2^2 x_2 \equiv 0 \pmod {y_3^2}$ , with determinant equal to $y_3^2$ , gives the estimate
for the number of $x_1, x_2$ , which then also determine $x_3$ . The two bounds coincide when
and we get the bound
for the number of $x_1, x_2, x_3$ given $y_1, y_2, y_3$ . Thus, we obtain an acceptable estimate whenever $|y_1 y_2 y_3| \ll T^{1/2}$ , since
It is well known that
By partial summation, we have
It follows that
It remains to deal with the case when $|y_1 y_2 y_3| \gg T^{1/2}$ , where we instead fiber over ${\mathbf {x}}$ and consider zeros of the corresponding diagonal quadratic forms $Q_{\mathbf {x}}$ . Since
for $i = 1,2$ by assumption, it follows that
hence,
If $|x_1 x_2 x_3| \gg T^{1/2}$ , then
This implies that
which violates (7.5) if the implied constants are sufficiently large. It thus follows that we must have $|x_1 x_2 x_3| \ll T^{1/2}$ in this case.
We now fix $x_1, x_2, x_3$ and consider (7.4) as a ternary quadratic form in $y_1, y_2, y_3$ . We shall require the following version of Corollary 2 in [Reference Browning and Heath-Brown3], which is an analogue of Lemma 7.1.
Lemma 7.2 Let $x_1, x_2, x_3$ be pairwise co-prime square-free integers. Let $R_1, R_2, R_3$ be positive real numbers. Then the number of primitive solutions $y_1, y_2, y_3$ to the equation
with $|y_i| \leq R_i$ is bounded by
Since $|x_i y_i^2| \ll x_1 y_1^2$ for $i = 1,2$ , it follows that
for $i = 1,2$ , whence
This implies that
hence
Lemma 7.2 then implies that for fixed $x_1, x_2, x_3$ , the number of primitive ${\mathbf {y}} = (y_1, y_2, y_3)$ satisfying (7.4) is
We now sum over primitive ${\mathbf {x}} \in {\mathbb {Z}}^3$ satisfying $|x_1 x_2 x_3| \ll T^{1/2}$ , with the property that the quadratic form $Q_{\mathbf {x}}$ given by (7.4) has a rational zero. By the Hasse–Minkowski theorem, this is tantamount to the form $Q_{\mathbf {x}}({\mathbf {y}}) = x_1 y_1^2 + x_2 y_2^2 - x_3 y_3^2$ being everywhere locally soluble. The estimation of this is interesting in its own right and will be handled in a separate subsection.
7.2.2 Counting soluble ternary quadratic forms
In this section, we consider the set
By a well-known theorem of Legendre (see [Reference Guo12]), the indicator function for ${\mathcal {S}}$ is given by
We will now combine the ideas given in [Reference Guo12] and those in [Reference Fouvry and Kluners8].
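For concreteness, Legendre's criterion can be tested against a direct search. The sketch below uses the classical form of the theorem, stated here as we recall it: for positive, square-free, pairwise coprime $x_1, x_2, x_3$ , the form $x_1 y_1^2 + x_2 y_2^2 - x_3 y_3^2$ has a nontrivial integral zero if and only if $x_2 x_3$ is a square modulo $x_1$ , $x_1 x_3$ is a square modulo $x_2$ , and $-x_1 x_2$ is a square modulo $x_3$ . The function names and the search bound are hypothetical choices of ours.

```python
from math import isqrt


def is_square_mod(a, m):
    """True if a is congruent to a square modulo m (brute force, m >= 1)."""
    a %= m
    return any((t * t - a) % m == 0 for t in range(m))


def legendre_soluble(x1, x2, x3):
    """Legendre's criterion for x1*y1^2 + x2*y2^2 = x3*y3^2.

    Assumes x1, x2, x3 are positive, square-free and pairwise coprime.
    """
    return (is_square_mod(x2 * x3, x1)
            and is_square_mod(x1 * x3, x2)
            and is_square_mod(-x1 * x2, x3))


def brute_soluble(x1, x2, x3, bound):
    """Search for a nontrivial zero with 0 <= y1, y2 <= bound."""
    for y1 in range(bound + 1):
        for y2 in range(bound + 1):
            s = x1 * y1 * y1 + x2 * y2 * y2
            if s > 0 and s % x3 == 0:
                q = s // x3
                if isqrt(q) ** 2 == q:
                    return True
    return False


if __name__ == "__main__":
    for triple in [(1, 1, 2), (1, 1, 3), (1, 2, 3), (2, 3, 5), (3, 5, 7)]:
        print(triple, legendre_soluble(*triple), brute_soluble(*triple, 10))
```

The brute-force modular test is adequate only for small moduli; it is meant to make the criterion explicit, not to be efficient.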
Put
Since $x_1, x_2, x_3$ are pairwise coprime and square-free, it follows that
where $\omega (n)$ is the number of distinct prime factors of n. It follows that
where g expresses a product of Jacobi symbols. The sum
is expected to contribute the main term, while the sum
is expected to be negligible, due to the cancellation of characters.
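The cancellation referred to here is the usual cancellation in sums of Jacobi symbols. As a minimal illustration, the sketch below implements the Jacobi symbol by the standard binary reciprocity algorithm and sums it over a full period of a modulus that is not a perfect square, where the sum vanishes. The function names are ours, and the example is far simpler than the bilinear sums treated in this section.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for an odd positive integer n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n
    result = 1
    while a != 0:
        # Remove factors of 2 using the value of (2/n).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity for odd a and n.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def character_sum(m, lo, hi):
    """Sum of (n/m) over lo <= n < hi."""
    return sum(jacobi(n, m) for n in range(lo, hi))


if __name__ == "__main__":
    print(character_sum(15, 0, 15))    # full period of a non-square modulus
    print(character_sum(105, 0, 40))   # an incomplete sum
```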
By partial summation, we obtain
where
and
Our situation differs from that of Guo in [Reference Guo12] since we are counting over triples with $|x_1 x_2 x_3| \leq X$ rather than $\max \{|x_1|, |x_2|, |x_3|\} \leq X$ , which introduces some difficulties. However, this is exactly analogous to the situation encountered by Fouvry and Kluners in [Reference Fouvry and Kluners8].
Our key proposition will be the following.
Proposition 7.3 We have the asymptotic upper bound
In fact, we can refine Proposition 7.3 to give an asymptotic formula, but this is unnecessary for our purposes.
We proceed to prove Proposition 7.3 in the remainder of the section. We begin by showing that triples $(x_1, x_2, x_3)$ with $\mu ^2(x_1 x_2 x_3) = 1$ and $\omega (x_1 x_2 x_3)$ large contribute negligibly. To wit, put
By the triangle inequality, it is clear that
By partial summation, we have
To estimate the latter sum, we will need the following result, which is Lemma 11 in [Reference Fouvry and Kluners8].
Lemma 7.4 There exists an absolute constant $B_0 \geq 1$ such that for every $r \geq 0$ , we have
Applying the lemma, we have for $\Omega = 30 (\log \log X + B_0)$
the final sum being a convergent geometric series. Hence,
We thus conclude that
and is thus negligible.
Note that $x_1, x_2, -x_3$ cannot all be the same sign; otherwise, (7.4) will only have a trivial real solution. Hence, the signs of $(x_1, x_2, x_3)$ must be $(+, +, +)$ , or $(+, -, +)$ , since we assumed $x_1> 0$ and $x_1 y_1^2 \geq |x_2 y_2^2|$ . By rearranging, we must thus assume $x_1, x_2, x_3> 0$ .
We then expand (7.11) by writing $x_i = x_{i1}x_{i2}$ for $i = 1,2,3$ , and
We now follow the strategy outlined in [Reference Fouvry and Kluners8] and break up the set
by restricting the $x_{ij}$ ’s to intervals of the form
where
For a given $\mathbf {A} = (A_{11}, A_{12}, A_{21}, A_{22}, A_{31}, A_{32})$ , put
We then have the following lemma.
Lemma 7.5 We have the bound
Proof We have
By Taylor’s theorem, we have
The proof then follows.
To proceed, we shall require the following well-known lemma regarding character sums.
Lemma 7.6 (Double Oscillation Lemma)
Let $\{\alpha _n\}, \{\beta _m\}$ be two sequences of complex numbers with each term having absolute value bounded by $1$ . Let $M,N$ be positive real numbers. Then we have
and for every $\varepsilon> 0$ ,
We will also need the following variant of the Siegel–Walfisz theorem.
Lemma 7.7 Let $\chi _q$ be a primitive character modulo $q \geq 2$ . Then, for every $A> 1$ , we have
uniformly for $X \geq Y \geq 2$ .
We now consider, as in [Reference Fouvry and Kluners8], the quantities
We now consider those $\mathbf {A}$ with the property that at most two entries are larger than $X^\ddagger $ . We dissect the sum according to the number $r \leq 2$ of terms $A_{ij}$ greater than $X^\ddagger $ . Let n be the product of those $x_{ij}$ which are larger than $X^\ddagger $ , and m the product of the remaining ones. We sum over $\mathbf {A}$ with such properties to obtain
This is sufficiently small for our purposes.
We may now assume that $A_{ij} \geq X^\ddagger $ for at least three pairs $i,j$ with $1 \leq i \leq 3, 1 \leq j \leq 2$ . We now suppose that there exist $a \ne b$ such that
The sum over $\mathbf {A}$ satisfying these properties can be bounded by
where $\alpha , \beta $ have modulus at most one. Lemma 7.6 then applies, and since our variables $x_{a,2}, x_{b,1}$ range over intervals exceeding $X^\dagger $ in length, it follows that
which is again enough.
Next, consider the family where the two previous conditions do not hold, and in addition there exist $a \ne b$ such that $2 \leq A_{b,1} \leq X^\dagger $ and $A_{a,2}> X^\ddagger $ . Under these conditions, we see that
where $A_{ij} \leq x_{ij} \leq \Delta A_{ij}$ and $\omega (x_{ij}) \leq \Omega $ for $1 \leq i \leq 3, 1 \leq j \leq 2$ . Now put $\ell = \omega (x_{a,2})$ , writing
with $p_1 < p_2 < \cdots < p_\ell $ , we obtain
the inner sum being bounded by
and $p_1, \ldots , p_\ell $ satisfy $A_{a,2} \leq p_1 \cdots p_{\ell } \leq \Delta A_{a,2}$ . Note that
We may now apply Lemma 7.7 to obtain the bound
with A arbitrarily large. Note that $p_1 \cdots p_{\ell -1} \leq X$ , and hence
Hence,
Choosing A large shows that this contribution is negligible.
The remaining case can be summarized by the following properties:
(1) $\prod _{i,j} A_{ij} \leq \Delta ^{-6} X$ .
(2) $A_{ij} \geq X^\ddagger $ for at least three pairs of indices $(i,j)$ .
(3) If $A_{ij}, A_{k\ell } \geq X^\dagger $ , then $j = \ell $ .
(4) If $A_{ij} \leq A_{k \ell }$ with $j \ne \ell $ , then either $A_{ij} = 1$ or $2 \leq A_{ij} \leq X^\dagger $ and $A_{k \ell } < X^\ddagger $ .
We now show that the second option in (4) cannot happen. This will imply that we have accounted for all possibilities for (7.11), and hence reduced our problem to estimating ${\mathcal {S}}_1(X)$ .
Suppose, without loss of generality, that $2 \leq A_{11} \leq X^\dagger $ and $A_{22} < X^\ddagger $ . Since $A_{ij} \geq X^\ddagger $ for at least three pairs of indices $(i,j)$ , one of $A_{12}$ or $A_{32}$ must exceed $X^\ddagger $ . We then have $A_{11} \leq X^\dagger $ and $A_{32}$ , say, exceeds $X^\ddagger $ , which means that our earlier estimation covers this case.
The upshot now is that
for some $\kappa (A)> 0$ . It follows from (7.12) that
which is sufficiently small for our purposes.
Finally, we may evaluate the main term, which is given by (7.10). By the triangle inequality, we have
which is $O((\log X)^3)$ . This completes the proof of the proposition.
7.2.3 Lower bounds
For the lower bound, it suffices to give an accurate count for some subset of the points enumerated by the quantity $N_2(T)$ . The arguments used here are inspired by the work of the second author and C.L. Stewart in [Reference Stewart and Xiao19], though the situation here is slightly simpler. To wit, we shall consider the subset of points $({\mathbf {x}}, {\mathbf {y}})$ satisfying the condition
where $\delta $ is some explicit positive number which we shall specify later. Next, we suppose that $x_1, x_2$ satisfy
Note that
whence
Thus,
Therefore, every pair $(x_1, x_2)$ satisfying (7.17) with $x_1, x_2$ both square-free and $x_3 = (y_1^2 x_1 + y_2^2 x_2) y_3^{-2} \in {\mathbb {Z}}$ square-free will contribute to $N_2(T)$ .
We now count pairs $(x_1, x_2)$ such that:
(1) $(x_1, x_2)$ satisfies (7.17);
(2) $\gcd (x_1, x_2) = 1$ ;
(3) $x_1, x_2$ are square-free; and
(4) $y_1^2 x_1 + y_2^2 x_2 \equiv 0 \pmod {y_3^2}$ , $(y_1^2 x_1 + y_2^2 x_2)y_3^{-2}$ is square-free.
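For small parameters, the pairs described by conditions (2)–(4) can be counted directly. In the sketch below, the box $1 \leq x_1, x_2 \leq R$ is a hypothetical stand-in for the range (7.17) in condition (1), and the function names are ours; the code only makes the local conditions concrete and plays no role in the sieve argument.

```python
from math import gcd


def is_squarefree(n):
    """True if the positive integer n is not divisible by any square > 1."""
    if n <= 0:
        return False
    p = 2
    while p * p <= n:
        if n % (p * p) == 0:
            return False
        p += 1
    return True


def count_pairs(y, R):
    """Count (x1, x2) in [1, R]^2 satisfying conditions (2)-(4) for fixed y.

    The box [1, R]^2 is an assumed replacement for the range in condition (1).
    """
    y1, y2, y3 = y
    count = 0
    for x1 in range(1, R + 1):
        if not is_squarefree(x1):
            continue
        for x2 in range(1, R + 1):
            if gcd(x1, x2) != 1 or not is_squarefree(x2):
                continue
            s = y1 * y1 * x1 + y2 * y2 * x2
            # Condition (4): y3^2 divides s and the quotient x3 is square-free.
            if s % (y3 * y3) == 0 and is_squarefree(s // (y3 * y3)):
                count += 1
    return count


if __name__ == "__main__":
    for y in [(1, 1, 1), (1, 1, 2), (1, 2, 3)]:
        print(y, [count_pairs(y, R) for R in (10, 20, 40)])
```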
For each prime p, we interpret conditions (2)–(4) modulo $p^2$ . Condition (2) is the assertion that $p | x_1 \Rightarrow p \nmid x_2$ , condition (3) is the assertion that for all primes p we have $p^2 \nmid x_1, x_2$ , and condition (4) states that $y_3^2 | y_1^2 x_1 + y_2^2 x_2$ and that if $p^{s} || y_3$ , then $p^{2s + 2} \nmid y_1^2 x_1 + y_2^2 x_2$ . Let
It is apparent that $\rho _{\mathbf {y}}(\cdot )$ is multiplicative. Put
and
By standard arguments using the inclusion–exclusion sieve, we have
the error term being bounded by
Since $|y_1 y_2 y_3| \leq T^\delta $ , we obtain an acceptable error term provided that $\delta < 1/4$ . This shows that
Since
this confirms the lower bound.
7.3 Counting points with respect to the canonical height when $\chi (\mathfrak {X}) < 0$
In this section, we first prove that the number of quadratic points on a hyperelliptic curve given by the model
where F is an integral, non-singular binary form having degree $2g+2$ with $g \geq 2$ , is dominated by the “obvious” points given by triples $(x,y, \sqrt {F(x,y)})$ . To show that the proper quadratic points are negligible, we note that when $g = 2$ the proper quadratic points, which come in conjugate pairs, are in bijection with the rational points of the Jacobian $\operatorname {Jac}(C_F)$ via the correspondence $[P] \mapsto [P_1 + P_2] - K_{C_F}$ , where $K_{C_F}$ is the canonical divisor. Thus, in this case, the proper quadratic points of bounded height are given by the rational points of bounded height in $\operatorname {Jac}(C_F)({\mathbb {Q}})$ , of which there are $O_F((\log T)^{r_F})$ , where $r_F$ is the Mordell–Weil rank of $\operatorname {Jac}(C_F)$ . For $g \geq 3$ , there are only finitely many proper quadratic points by Faltings’ theorem. Thus, the number of quadratic points on $C_F$ is asymptotically equal to the number of rational points in ${\mathbb {P}}_{\mathbb {Q}}^1$ of bounded height.
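The benchmark count used here, the number of rational points of ${\mathbb {P}}_{\mathbb {Q}}^1$ of height at most B, is easy to compute directly. The sketch below counts coprime pairs $(x,y)$ up to sign with $\max \{|x|, |y|\} \leq B$ and compares the result with the classical main term $\frac {12}{\pi ^2} B^2$ . The function name is ours, and the comparison is a numerical illustration only.

```python
from math import gcd, pi


def count_p1_points(B):
    """Number of points [x : y] in P^1(Q) with max(|x|, |y|) <= B.

    Each point is represented once by a coprime pair with x > 0,
    together with the single point [0 : 1].
    """
    count = 1  # the point [0 : 1]
    for x in range(1, B + 1):
        for y in range(-B, B + 1):
            if gcd(x, y) == 1:
                count += 1
    return count


if __name__ == "__main__":
    for B in (10, 50, 200):
        n = count_p1_points(B)
        print(B, n, round(n / (12 / pi ** 2 * B * B), 4))
```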
To the contrary, for $\mathfrak {X} = \mathfrak {X}({\mathbb {P}}^1 : ({\mathbf {a}}, {\mathbf {m}}))$ with $\chi (\mathfrak {X}) < 0$ , we get a much less reasonable result. This is because we have little control over the set of integers $x,y$ such that $\ell _i(x,y)$ is divisible by a large square for $i = 1, \ldots , n$ . Even with the $abc$ -conjecture, there is only so much that can be shown. In the case when ${\mathbf {m}} = (2, \ldots , 2)$ , we have the following.
Theorem 7.8 Let $\mathfrak {X} = \mathfrak {X}({\mathbb {P}}^1 : ({\mathbf {a}}, {\mathbf {m}}))$ be a stacky curve with ${\mathbf {m}} = \underbrace {(2, \ldots , 2)}_n$ with $n \geq 5$ . Let $N_{{\mathbf {a}}, n}(T)$ be the number of rational points on $\mathfrak {X}$ satisfying $H_{({\mathbf {a}}, {\mathbf {m}})}(x,y) \leq T$ . Assume that the $abc$ -conjecture holds. Then, for any $\varepsilon> 0$ , we have
Proof This is similar to the proof of Theorem 2.6. We conclude from that proof that
and
by the triangle inequality. Comparing, we conclude that
and in turn
It follows that
Hence, $N_{({\mathbf {a}}, {\mathbf {m}})}(T)$ is bounded by the number of rational points in ${\mathbb {P}}_{\mathbb {Q}}^1$ having height at most $O_{\varepsilon } \left (T^{\frac {1}{2n-8 - \varepsilon }}\right )$ , which is $O_{\varepsilon } \left (T^{\frac {1}{n-4-\varepsilon }} \right )$ . By adjusting $\varepsilon $ , we see that
Remark 7.9 We do not expect the upper bound given in Theorem 7.8 to be sharp. Indeed, the bound we obtain essentially comes from the scenario in which, for almost all integers $m \ll _{\varepsilon } T^{2+\varepsilon }$ , there exist $x,y$ with $\max \{|x|, |y|\} \ll _{\varepsilon } T^{2 + \varepsilon }$ such that $Q(x,y)$ is divisible by $m^2$ . We expect that this should not be the case.
7.4 Hasse principle for integral points when ${\mathbf {m}} = (2, 2, 2)$
We now consider the question of whether the Hasse principle holds for integral points on stacky curves of the shape $\mathfrak {X} = \mathfrak {X}({\mathbb {P}}_{\mathbb {Q}}^1 : (a_1, a_2, a_3), (2,2,2))$ . By Theorem 3.20 it suffices to consider when it is possible for the stacky part of the height to be equal to one. This is tantamount to requiring the existence of co-prime integers $x,y$ and integers $y_1, y_2, y_3$ for which
Here, as we recall, $\ell _i(x,y) = \alpha _i x - \beta _i y$ , with $a_i = [\alpha _i : \beta _i]$ . For $i = 1,2$ , we obtain a system of linear equations
Inverting, we find that
It follows that
which we can write as
Therefore, the existence of the integers $y_1, y_2, y_3$ , and hence $x,y$ , depends on whether this conic has a rational point.
Acknowledgement
We thank J.S. Ellenberg for introducing us to this problem in several lectures, as well as for his continued interest and encouragement throughout the project. We would also like to thank M. Satriano for explaining the construction of heights on algebraic stacks to us, and for his continuous support and guidance during this project.
Competing interests
As far as we are aware, there are no competing interests involved with this manuscript.
Data availability statement
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.