1. Introduction
The relationship between nonnegative polynomials and sums of squares is a fundamental topic in real algebraic geometry. This subject has received renewed attention in the last twenty years due to its connection with polynomial optimization and many applications [Reference Blekherman, Parrilo and Thomas3]. In a foundational paper, Hilbert described all the cases in terms of degree and number of variables where any globally nonnegative polynomial can be written as a sum of squares of polynomials [Reference Hilbert12]. A modern approach to this question is to study nonnegative polynomials and sums of squares on a real projective variety $X\subseteq {\mathbb {P}}^n_{{\mathbb {R}}}$ . This allows one to restrict to quadrics since degree $2d$ forms on X are quadrics on the d-th Veronese embedding of X. The two main objects of interest are:
$$P_X := \{ f \in R(X)_2 \mid f(x) \ge 0 \text{ for all } x \in X({\mathbb {R}}) \}, \qquad \Sigma _X := \{ f \in R(X)_2 \mid f = l_1^2 + \dotsb + l_k^2 \text{ for some } l_1, \ldots , l_k \in R(X)_1 \}.$$
In fact, $\Sigma _X \subseteq P_X$ are convex cones in the vector space $R(X)_2$ of all quadrics on X, which facilitates their study via convex geometry (see [Reference Blekherman, Parrilo and Thomas3]). For instance, as an extension of Hilbert’s result, [Reference Blekherman, Smith and Velasco6, Theorem 1.1] showed that $\Sigma _X = P_X$ if and only if X is a variety of minimal degree, that is, $\deg X = 1 + {\operatorname {codim}}\ X$ . However, the structure of these cones is still not well understood in general.
It is sometimes more convenient to work with the dual cones $P_X^\star \subseteq \Sigma _X^\star $ . The cone $\Sigma _X^\star $ is a spectrahedron, that is, a slice of the cone of positive semidefinite (PSD) matrices with a linear subspace. We call $\Sigma _X^\star $ the Hankel spectrahedron of X. By identifying a point $\ell \in \Sigma _X^\star $ with a PSD matrix, we can talk about the rank of $\ell $ . Rank one extreme rays of $\Sigma _X^\star $ are precisely the extreme rays of $P_X^\star $ . Therefore, if $P_X^\star \subsetneq \Sigma _X^\star $ we can quantitatively measure the difference between these cones by analyzing the ranks of extreme rays of $\Sigma _X^\star $ that are greater than one. This motivates the following key definition:
Definition 1.1 (see [Reference Blekherman, Sinn and Velasco5, Definition 1]).
The Hankel index of X, denoted $\eta (X)$ , is defined to be the minimal rank of a(n extreme) ray $\ell \in \Sigma _X^\star \setminus P_X^\star $ , or $\infty $ if $\Sigma _X^\star = P_X^\star $ .
The Hankel index is a subtle invariant which is often quite hard to compute. A surprising connection between the Hankel index and homological properties of the minimal free resolution of the ideal of X was found in [Reference Blekherman, Sinn and Velasco5, Theorem 4 and Theorem 6]: Namely, there is a lower bound $\eta (X) \ge \alpha (X) + 1$ , where $\alpha (X)$ is the Green–Lazarsfeld index of X (here, X need not be irreducible). Recall that the Green–Lazarsfeld index of X is defined as follows: $\alpha (X)=0$ if the ideal of X is not generated by quadrics; otherwise it is equal to one plus the number of steps that the minimal free resolution of the coordinate ring of X is linear, that is, has only linear syzygies. In all cases where the Hankel index was known, this bound was tight. These cases include varieties of minimal degree, arithmetically Cohen–Macaulay (ACM) varieties of almost minimal degree, varieties defined by quadratic squarefree monomial ideals, some general canonical curves and Veronese embeddings of ${\mathbb {P}}^2$ (see [Reference Blekherman, Sinn and Velasco5, Theorem 28] and [Reference Blekherman and Sinn4]).
We present the first examples where the difference between Hankel index and Green–Lazarsfeld index is larger than one. The Hankel index of X is a semialgebraic invariant, while the Green–Lazarsfeld index is an algebraic invariant which makes no distinction between the real and complex points of X. Nevertheless, separating these two invariants is challenging. To accomplish this we consider non-ACM curves of almost minimal degree. This class of curves is well studied: Such curves admit a description as an outer projection of a rational normal curve and are thus determined by a single point, namely the projection center [Reference Brodmann and Schenzel8]. Since we are working with the rational normal curve, we can identify the projection center p with a binary form $F(p)$ (see Section 3.1). In this case, both Green–Lazarsfeld and Hankel indexes are intimately connected to another classical notion, the Waring decomposition of $F(p)$ , that is, the shortest decomposition of $F(p)$ as a sum of powers of linear forms. In [Reference Park16, Theorem 1.1(2)], it was shown that for such curves, the Green–Lazarsfeld index equals the complex Waring border rank of $F(p)$ minus 3: $\alpha (X) = {{\mathbb {C}}\text{-}\operatorname {b.rk}}(F(p)) - 3$ .
Our main result states that the Hankel index of X is determined by the shortest decomposition of $F(p)$ into a sum of powers of almost real forms (see Section 2.2 for precise definitions); we call the length of such a decomposition the almost real rank of $F(p)$ , denoted by ${\operatorname {ar-rk}}(F(p))$ . We use $\sigma _3(X)$ to denote the third secant variety of X, that is, the Zariski closure of the union of $2$ -planes spanned by three points in X.
Theorem 1.2. Let $X = \pi _p(C_d)$ be a projection of a rational normal curve $C_d$ of degree d away from a point $p \in {\mathbb {P}}^d \setminus \sigma _3(C_d)$ , with corresponding binary form $F(p) \in {\mathbb {R}}[x,y]_d$ . Then the Hankel index of X is given by
$$\eta (X) = {\operatorname {ar-rk}}(F(p)) - 2.$$
This theorem elucidates the semialgebraic nature of the Hankel index and demonstrates two ways in which it differs from the Green–Lazarsfeld index: the difference between rank and border rank and the difference between almost real decompositions and complex decompositions.
We note an interesting technical detail of the proof of Theorem 1.2. To prove an upper bound on Hankel index we need a construction of rays in $\Sigma _X^\star \setminus P_X^\star $ , and for this we use point evaluations at points of X in special position. Such constructions using Cayley–Bacharach relations were used in [Reference Blekherman1] and more generally in [Reference Blekherman, Smith and Velasco6] (the idea goes all the way back to Hilbert’s original proof). Until now these constructions only used reduced points of X, but in this paper we use nonreduced zero-dimensional subschemes of X. The use of such nonreduced configurations is necessary and cannot be replicated by reduced points.
Real and complex Waring decompositions of binary forms are a classical subject dating back to Sylvester [Reference Sylvester17, Reference Sylvester18] (see [Reference Comas and Seiguer9, Reference Iarrobino and Kanev14] for modern treatments). The notion of almost real rank is new, and we prove some basic results about almost real rank of binary forms. We show that the maximal almost real rank for degree d forms is $d-1$ and classify all forms of maximal almost real rank (Theorem 7.5). We also show that the range of typical almost real ranks r is precisely $\lfloor \frac {d+2}{2}\rfloor \le r \le d-2$ (Theorem 7.9).
We outline the paper as follows: Sections 2 and 3 introduce necessary background and setup, including the notion of almost real rank. Section 4 consists of a small explicit example illustrating construction techniques presented in Section 5. Sections 5 and 6 constitute the proof of Theorem 1.2 (covering the inequalities ‘ $\le $ ’ and ‘ $\ge $ ’, respectively). We conclude in Section 7 with an investigation of almost real rank for binary forms.
2. Apolarity and ranks
We begin with a brief review of apolarity and the apolar inner product, which is our preferred method of explicitly identifying primal and dual spaces.
Definition 2.1. Let k be a field of characteristic 0 and $R = k[x_1, \ldots , x_n]$ a polynomial ring over k. Consider the ‘differential’ bilinear form on R defined by
$$\langle \cdot , \cdot \rangle : R \times R \to R, \qquad \langle f, g \rangle := f(\partial ) \bullet g, \tag{1}$$
where $f(\partial )$ is the differential operator obtained from f by replacing each variable $x_i$ with $\frac {\partial }{\partial x_i}$ , and $\bullet $ denotes the action of differential operators on polynomials. For a given degree d, the bilinear form $\langle \cdot , \cdot \rangle $ restricts to an inner product on $R_d$ , the k-vector space of forms of degree d. For $F \in R$ , the apolar ideal of F is defined as the orthogonal complement of F with respect to Equation (1), that is,
$$(F)^\perp := \{ f \in R \mid \langle f, F \rangle = f(\partial ) \bullet F = 0 \}.$$
If $F \in R_d$ is homogeneous, then $(F)^\perp $ is a homogeneous ideal.
Remark 2.2. For any form F, the apolar ideal $(F)^\perp $ is an Artinian Gorenstein graded ideal. Conversely, every Artinian Gorenstein graded ideal I is of the form $(F)^\perp $ , where F generates the socle of $R/I$ .
We now specialize to the case of binary forms, that is, forms in two variables $x, y$ . Let $F \in k[x,y]_d$ be a binary form. Then $(F)^\perp $ is Gorenstein of codimension two, hence is a complete intersection. As this fact will be used repeatedly in the sequel, we introduce some notation for the generators of this complete intersection:
Definition 2.3. For $F \in k[x,y]_d$ , let $F_\perp , F^\circ \in k[x,y]$ denote forms that satisfy
$$(F)^\perp = (F_\perp , F^\circ ),$$
with $\deg F_\perp \le \deg F^\circ $ . If $d_1 := \deg F_\perp $ and $d_2 := \deg F^\circ $ , we say that the apolar ideal $(F)^\perp $ is of type $(d_1, d_2)$ . One always has the relation
$$d_1 + d_2 = d + 2.$$
Note that if $d_1 < d_2$ , then $F_\perp $ is uniquely defined by F (up to nonzero scale), while $F^\circ $ is unique modulo the principal ideal $(F_\perp )$ .
For example, if $l = ax + by \in k[x,y]_1$ is a binary linear form, then $(l)^\perp $ is of type $(1,2)$ , with $l_\perp = bx - ay$ , and $l^\circ $ is a quadric in $(l)^\perp \setminus (l_\perp )$ .
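The action of differential operators in Definition 2.1 is easy to experiment with. The following is a minimal Python sketch, not part of the formal development: the coefficient-list encoding of binary forms and the helper names `falling` and `apply_operator` are assumptions made purely for illustration. It checks that $l_\perp $ annihilates $l$ for the sample values $a = 2$ , $b = 3$ , and that $x^4$ and $y^4$ annihilate $x^3y^3$ while $x^3$ does not.

```python
from math import factorial

def falling(m, a):
    """Falling factorial m(m-1)...(m-a+1): the scalar produced by d^a/dx^a on x^m."""
    return factorial(m) // factorial(m - a)

def apply_operator(g, f):
    """Return g(∂) • f for binary forms stored as integer coefficient lists.

    Illustrative encoding (an assumption of this sketch): a form of degree n
    is [c_0, ..., c_n], meaning sum_i c_i * x^(n-i) * y^i.
    """
    k, d = len(g) - 1, len(f) - 1
    if k > d:
        return [0]                      # differentiating too often gives 0
    out = [0] * (d - k + 1)             # the result has degree d - k
    for i, gi in enumerate(g):          # term gi * x^(k-i) y^i of g
        for j, fj in enumerate(f):      # term fj * x^(d-j) y^j of f
            if j >= i and d - j >= k - i:
                out[j - i] += gi * fj * falling(d - j, k - i) * falling(j, i)
    return out

# l = 2x + 3y and l_perp = 3x - 2y, i.e. a = 2, b = 3 in the example above.
print(apply_operator([3, -2], [2, 3]))        # [0]

F = [0, 0, 0, 1, 0, 0, 0]                     # x^3 y^3
print(apply_operator([1, 0, 0, 0, 0], F))     # x^4 annihilates F: [0, 0, 0]
print(apply_operator([0, 0, 0, 0, 1], F))     # y^4 annihilates F: [0, 0, 0]
print(apply_operator([1, 0, 0, 0], F))        # x^3 does not: [0, 0, 0, 6]
```

The last three lines are consistent with $(x^3y^3)^\perp = (x^4, y^4)$ , an apolar ideal of type $(4,4)$ .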
We are now ready to state the apolarity lemma for binary forms, which characterizes membership in the apolar ideal:
Lemma 2.4 (Generalized apolarity lemma, see [Reference Iarrobino and Kanev14, Lemma 1.31]).
Let $F \in k[x, y]_d$ . For a given set $\{ l_1, \ldots , l_r \} \subseteq k[x, y]_1$ of linear forms and $d_1, \ldots , d_r \in {\mathbb {N}}$ with $\sum _{i=1}^r d_i \le d$ , one has $\prod _{i=1}^r l_i^{d_i} \in (F)^\perp $ if and only if there exist $c_{ij} \in k\ (1 \le i \le r$ , $0 \le j \le d_i-1)$ such that
$$F = \sum _{i=1}^r \sum _{j=0}^{d_i-1} c_{ij}\, (l_i)^j\, (l_i)_\perp ^{d-j}.$$
We remark that our statement is slightly different from [Reference Iarrobino and Kanev14, Lemma 1.31], but the equivalence of two statements follows by taking $\{(l_i)^j(l_i)_\perp ^{d_i-j-1}:0\le j \le d_i - 1\}$ as a basis for $k[x,y]_{d_i-1}$ for each $1\le i\le r$ and expressing each $G_i$ in this basis in [Reference Iarrobino and Kanev14, Definition 1.30]. The case $d_1 = \dotsb = d_r = 1$ is classically referred to as the apolarity lemma and characterizes squarefree forms in the apolar ideal via a Waring decomposition of F, as a sum of $d^{\text {th}}$ powers of linear forms.
Another useful criterion for determining membership in the apolar ideal is the following:
Lemma 2.5. Let $F \in k[x,y]_d$ and $G \in k[x,y]_n$ for some $n \le d$ . Then $G \in (F)^\perp $ if and only if $(G)_d \subseteq (F)^\perp $ .
Proof. If $G \in (F)^\perp $ , then certainly $(G)_d \subseteq (F)^\perp $ since $(F)^\perp $ is an ideal. Conversely, suppose $G \not \in (F)^\perp $ and set $H := \langle G, F \rangle = G(\partial ) \bullet F$ , which is a nonzero element of $k[x,y]_{d-n}$ . Since $\langle \cdot , \cdot \rangle $ is nondegenerate on $k[x,y]_{d-n}$ , there exists $K \in k[x,y]_{d-n}$ with $\langle K, H \rangle \ne 0$ . Then $0 \ne \langle K, G(\partial ) \bullet F \rangle = K(\partial ) \bullet (G(\partial ) \bullet F) = (KG)(\partial ) \bullet F$ , so $KG \in (G)_d \setminus (F)^\perp $ .
2.1. Ranks of forms
Classically, it is an important problem to decompose a given form as a linear combination of powers of linear forms. Such decompositions lead to various notions of rank of a form, which are sensitive to the underlying field of scalars.
Definition 2.6. Let $F \in {\mathbb {R}}[x,y]_d$ . The real (resp. complex) rank of F is the minimal number of real (resp. complex) linear forms $l_1, \ldots , l_r$ such that F is an ${\mathbb {R}}$ -linear (resp. ${\mathbb {C}}$ -linear) combination of $l_1^d, \ldots , l_r^d$ . The real (resp. complex) border rank of F is the minimal number r such that F is a limit of forms of real (resp. complex) rank r.
Remark 2.7. Via apolarity, we can reinterpret the various ranks in Definition 2.6. Indeed, it follows from Lemma 2.4 that for any $F \in {\mathbb {R}}[x,y]_d$ ,
$$\begin{aligned} {{\mathbb {R}}\text{-}\operatorname {rk}}(F) &= \min \{ \deg G \mid 0 \ne G \in (F)^\perp \text{ has distinct roots, all of them real} \}, \\ {{\mathbb {C}}\text{-}\operatorname {rk}}(F) &= \min \{ \deg G \mid 0 \ne G \in (F)^\perp \text{ has distinct roots} \}, \\ {{\mathbb {R}}\text{-}\operatorname {b.rk}}(F) &= \min \{ \deg G \mid 0 \ne G \in (F)^\perp \text{ has all real roots} \}, \\ {{\mathbb {C}}\text{-}\operatorname {b.rk}}(F) &= \min \{ \deg G \mid 0 \ne G \in (F)^\perp \}. \end{aligned}$$
Proof. The description of real rank and complex rank follows by the classical apolarity lemma, that is, the case $d_1 = \dotsb = d_r = 1$ of Lemma 2.4. The statement for border ranks follows by considering perturbations of forms in the apolar ideal. (See Section 6 for explicit descriptions of the approximations.)
Note that any complex rank is at most the corresponding real rank, and any complex (resp. real) border rank is at most the corresponding complex (resp. real) rank. Moreover, if $(F)^\perp $ is of type $(d_1, d_2)$ , then ${{\mathbb {C}}\text{-}\operatorname {rk}}(F) = d_1$ if $F_\perp $ has distinct factors over ${\mathbb {C}}$ , and ${{\mathbb {C}}\text{-}\operatorname {rk}}(F) = d_2$ otherwise (since $F_\perp , F^\circ $ form a complete intersection, they have no common factors).
2.2. Almost reality
We now introduce a central notion for this article, which is that of a binary form almost splitting over ${\mathbb {R}}$ , or a univariate polynomial having almost all real roots. For technical reasons we will need to include the possibility of one pair of roots being nondistinct so that the resulting rank is intermediate between a true rank and a border rank.
Definition 2.8. Let $F \in {\mathbb {R}}[x,y]_d$ . We say that F has almost real roots if F has $\ge d-2$ simple linear factors over ${\mathbb {R}}$ . Equivalently, F has a factorization over ${\mathbb {R}}$ of the form
$$F = l_1 \cdots l_{d-2} \cdot q \qquad \text{with the } l_i \text{ pairwise non-proportional and not dividing } q,$$
where $l_i$ are linear and q is quadratic. A polynomial F with almost real roots thus belongs to exactly one of three classes: (i) F has all simple real roots; (ii) F has a unique nonreal complex conjugate pair of roots; (iii) F has a unique double real root (note that in cases (ii) and (iii), all other roots are real and simple).
In analogy with Remark 2.7, we define the almost real rank of F as
$${\operatorname {ar-rk}}(F) := \min \{ \deg G \mid 0 \ne G \in (F)^\perp \text{ has almost real roots} \}.$$
Remark 2.9. One can generalize the definition above to arbitrary (i.e., not necessarily binary) forms. Given a form $F \in {\mathbb {R}}[x_1, \ldots , x_n]$ , define the almost real rank of F as the minimal length of a zero-dimensional subscheme $Z \subseteq {\mathbb {P}}^{n-1}_{{\mathbb {C}}}$ such that $I(Z) \subseteq (F)^\perp $ and Z has either (i) all reduced real points, or (ii) exactly one nonreal conjugate pair of points, or (iii) exactly one double point. In this article though, we will only use the notion of almost real rank for binary forms.
Note that for any $F \in {\mathbb {R}}[x,y]_d$ , it follows from the definitions that ${{\mathbb {C}}\text{-}\operatorname {b.rk}}(F) \le {\operatorname {ar-rk}}(F) \le {{\mathbb {R}}\text{-}\operatorname {rk}}(F)$ . For more properties of almost real rank, see Section 7.
3. From binary forms to quadrics
3.1. Associating forms to points
A crucial identification throughout this paper is that of associating points in projective space to (binary) forms, which we now explain. Let $\nu _d : {\mathbb {P}}^1 \to {\mathbb {P}}^d$ be the d-uple embedding (or $d^{\text {th}}$ Veronese map). Let $C_d := \nu _d({\mathbb {P}}^1) \subseteq {\mathbb {P}}^d$ be the image, which is the standard rational normal curve of degree d. Given a point $p \in {\mathbb {P}}^d$ , consider the vector space of linear forms on ${\mathbb {P}}^d$ vanishing at p (these generate the vanishing ideal of p). Pulling this space back to ${\mathbb {P}}^1$ via $\nu _d$ gives a d-dimensional vector space of degree d binary forms, which is a hyperplane in $k[x,y]_d$ (the space of all degree d binary forms). We set $F(p)$ to be the degree d binary form (unique up to nonzero scale) which is orthogonal to this hyperplane, with respect to the inner product (1).
An alternate way to compute $F(p)$ is: Under the d-uple embedding, a point $\nu _d([a : b])$ on the rational normal curve is associated to the $d^{\text {th}}$ -power $(ax+by)^d \in k[x,y]_d$ . Since points on the rational normal curve are in linearly general position, extending additively gives a correspondence between all points in ${\mathbb {P}}^d$ and binary forms of degree d. Explicitly, for $p \in {\mathbb {P}}^d$ , we may choose an expression of p as a linear combination of $r \le d+1$ points on $C_d$ , say $p = \sum _{i=1}^r c_i p_i$ . Setting $p_i =: \nu _d([a_i : b_i])$ , we have
$$F(p) = \sum _{i=1}^r c_i (a_i x + b_i y)^d.$$
In this way, we may consider the various ranks (defined in Sections 2.1 and 2.2) of a point $p \in {\mathbb {P}}^d$ , as the ranks of the associated binary form $F(p)$ .
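The two descriptions of $F(p)$ in this subsection are linked by the identity $\langle g, (ax+by)^d \rangle = d! \cdot g(a,b)$ for $g \in k[x,y]_d$ : a linear form vanishing at $\nu _d([a:b])$ pulls back to a binary form orthogonal to $(ax+by)^d$ . The Python sketch below illustrates this in coordinates. The helper names `power_form`, `pairing` and `evaluate`, the coefficient-list encoding, and the sample data are all assumptions made for illustration only.

```python
from math import comb, factorial

def power_form(a, b, d):
    """Coefficients of (a*x + b*y)^d, listed as [c_0, ..., c_d] for sum c_i x^(d-i) y^i."""
    return [comb(d, i) * a ** (d - i) * b ** i for i in range(d + 1)]

def pairing(g, f):
    """Apolar pairing <g, f> = g(∂) • f of two forms of the same degree d.

    Monomials are orthogonal, with <x^(d-i) y^i, x^(d-i) y^i> = (d-i)! * i!.
    """
    d = len(f) - 1
    return sum(factorial(d - i) * factorial(i) * gi * fi
               for i, (gi, fi) in enumerate(zip(g, f)))

def evaluate(g, a, b):
    """Value g(a, b) of a binary form."""
    d = len(g) - 1
    return sum(gi * a ** (d - i) * b ** i for i, gi in enumerate(g))

d = 4
g = [1, -2, 0, 5, 3]                      # an arbitrary quartic, chosen for illustration
for (a, b) in [(1, 1), (1, -1), (2, 3)]:
    # Pairing against a d-th power is evaluation, up to the factor d!.
    assert pairing(g, power_form(a, b, d)) == factorial(d) * evaluate(g, a, b)

# F(p) for p = 1 * nu_4([1:1]) + 2 * nu_4([1:-1]), extended additively.
Fp = [u + 2 * v for u, v in zip(power_form(1, 1, d), power_form(1, -1, d))]
print(Fp)                                  # [3, -4, 18, -4, 3]
```

Here the printed list encodes $F(p) = (x+y)^4 + 2(x-y)^4 = 3x^4 - 4x^3y + 18x^2y^2 - 4xy^3 + 3y^4$ .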
3.2. Quadratic forms vs. linear functionals on quadrics
For an embedded nondegenerate projective variety $X \subseteq {\mathbb {P}}^n$ , there is a correspondence between quadratic forms on X and linear functionals on quadrics on X. Let $R = R(X) = \bigoplus _{i \ge 0} R_i$ be the homogeneous coordinate ring of X. A bilinear form on $R_1$ is a bilinear map $R_1 \times R_1 \to k$ , or equivalently a linear map $R_1 \otimes _k R_1 \to k$ . The bilinear form is symmetric if and only if this descends to ${\operatorname {Sym}}^2(R_1) \to k$ . Since X is nondegenerate, $\dim R_1 = n+1$ (i.e., $R_1$ consists of all linear forms on ${\mathbb {P}}^n$ ), so there is a natural surjection ${\operatorname {Sym}}^2(R_1) \twoheadrightarrow R_2$ with kernel $I(X)_2$ , the degree $2$ part of the defining ideal of X. This yields a bijection
$$R_2^\star \; \longleftrightarrow \; \{ \text{symmetric bilinear forms on } R_1 \text{ whose kernel contains } I(X)_2 \}.$$
Finally, symmetric bilinear forms on $R_1$ whose kernel contains $I(X)_2$ correspond to quadratic forms on the variety X. Explicitly, given $\ell \in R(X)_2^\star $ , we associate to $\ell $ a quadratic form $Q_\ell $ on $R(X)_1$ given by $Q_\ell (f) := \ell (f^2)$ .
3.3. Curves of almost minimal degree
We now specialize to the main class of varieties of interest in this paper. Since $P_X$ only depends on real points of X, it is natural to restrict to totally real varieties (i.e., real varieties whose set of real points is Zariski-dense), and since $\Sigma _X$ only depends on the quadratic part of the coordinate ring of X, it is important to restrict to varieties defined by quadrics. We consider smooth projective non-ACM curves of almost minimal degree. Such curves arise as projections of the rational normal curve $C_d$ from a point (see [Reference Brodmann and Schenzel8, Theorem 1.2]). Let $\sigma _3(C_d)$ denote the $3^{\text {rd}}$ secant variety of $C_d$ , that is, the Zariski closure of the union of all secant $2$ -planes to $C_d$ in ${\mathbb {P}}^d$ , meeting $C_d$ in three distinct points. For $p \in {\mathbb {P}}^d \setminus \sigma _3(C_d)$ , let $\pi _p : {\mathbb {P}}^d \dashrightarrow {\mathbb {P}}^{d-1}$ be projection with center p (i.e., away from p). On restriction to $C_d$ , the rational map $\pi _p$ becomes a morphism, and the image $X := \pi _p(C_d) \subseteq {\mathbb {P}}^{d-1}$ is a smooth rational curve of almost minimal degree $d = \deg X = {\operatorname {codim}}\ X + 2$ . Let $R(X) := \mathbb {R}[x_0, \ldots , x_{d-1}]/I(X)$ denote the real coordinate ring of X. The assumption that $p \not \in \sigma _3(C_d)$ is equivalent to the statement that $I(X)$ is generated by quadrics; see [Reference Park16, Theorem 1.1(2)]. Since X is projective, $R(X) = \bigoplus _{i=0}^\infty R(X)_i$ is naturally $\mathbb {Z}$ -graded.
Our main object of interest is the Hankel spectrahedron
$$\Sigma _X^\star := \{ \ell \in R(X)_2^\star \mid \ell (f^2) \ge 0 \text{ for all } f \in R(X)_1 \}.$$
This is the dual cone to the sums-of-squares cone of X and is contained in $R(X)_2^\star $ , the space of linear functionals on quadrics on X. That it is a spectrahedron can be seen from an alternate description (see [Reference Blekherman1, Lemma 2.1] and Section 3.2)
$$\Sigma _X^\star = \mathbb {S}_{+} \cap (I(X)_2)^\perp ,$$
where $\mathbb {S}_{+}$ is the cone of PSD symmetric matrices (identified with nonnegative quadratic forms) on X, and $(I(X)_2)^\perp $ is the orthogonal complement of the degree 2 part of the ideal of X (which comprises linear equations in $R(X)_2^\star $ ). We next spell out a series of basic, but useful, identifications.
Remark 3.1. (i) The surjection $\pi _p : C_d \twoheadrightarrow X$ induces an injection of coordinate rings $R(X) \hookrightarrow R(C_d)$ , which is naturally graded. In this way, $R(X)_1$ is identified with a hyperplane $H \subseteq R(C_d)_1$ .
(ii) Since $p \not \in \sigma _3(C_d)$ , the quadratic part of the coordinate ring of X can be identified with the quadratic part of the coordinate ring of $C_d$ , that is, $R(X)_2 = R(C_d)_2$ . Equivalently, the Hilbert function of X in degree $2$ has value $2d+1$ .
(iii) Via the d-uple embedding $\nu _d : {\mathbb {P}}^1 \to {\mathbb {P}}^d$ , $R(C_d)_1$ can in turn be identified with $R({\mathbb {P}}^1)_{d} = {\mathbb {R}}[x,y]_{d}$ , the space of all degree d binary forms, and similarly $R(C_d)_2 \cong {\mathbb {R}}[x,y]_{2d}$ .
(iv) The apolar inner product (1) on ${\mathbb {R}}[x,y]_{d}$ , along with (iii), gives an explicit description of the hyperplane H in (i): Namely H is the orthogonal complement in ${\mathbb {R}}[x,y]_d$ of the center $F(p)$ (see Section 3.1), which is also $(F(p))^\perp _d$ , the degree d part of the apolar ideal of $F(p)$ . Moreover, with respect to the inner product on ${\mathbb {R}}[x,y]_{2d}$ , every functional $\ell \in {\mathbb {R}}[x,y]_{2d}^\star $ can be realized as $\ell (\cdot ) = \langle \cdot , L \rangle $ for some $L \in {\mathbb {R}}[x,y]_{2d}$ .
(v) Putting (i)–(iv) together with Section 3.2, we may thus associate to any $\ell \in {\mathbb {R}}[x,y]_{2d}^\star $ a binary form $L \in {\mathbb {R}}[x,y]_{2d}$ , as well as quadratic forms $Q_\ell $ acting on ${\mathbb {R}}[x,y]_d \cong R(C_d)_1$ and $q_\ell $ acting on $(F(p))^\perp _d \cong R(X)_1$ . When represented as symmetric matrices, $Q_\ell $ is $(d+1) \times (d+1)$ , whereas $q_\ell $ is $d \times d$ .
For a linear functional $\ell \in {\mathbb {R}}[x,y]_{2d}^\star $ , we will consistently use $Q_\ell $ to denote the associated quadratic form and $q_\ell $ to denote the restriction of $Q_\ell $ to the hyperplane H as in Remark 3.1(v).
We briefly review what is known about algebraic invariants of curves of almost minimal degree. First, for any nondegenerate variety $Y \subseteq {\mathbb {P}}^n_{{\mathbb {C}}}$ , there is a stratification of ${\mathbb {P}}^n$ by (higher) secant varieties of Y:
$$Y = Y^1 \subseteq Y^2 \subseteq \dotsb \subseteq Y^k = {\mathbb {P}}^n, \qquad Y^i := \sigma _i(Y).$$
This gives rise to the notion of Y-border rank: for $p \in {\mathbb {P}}^n$ , the Y-border rank of p is defined as $b.{\operatorname {rk}}_Y(p) := \min \{ i \mid p \in Y^i \}$ ([Reference Blekherman and Teitler7, Reference Landsberg and Teitler15]). For $Y = C_d$ , it follows from Section 3.1 and apolarity that the $C_d$ -border rank of a point is exactly the complex border rank of the corresponding binary form, that is, $b.{\operatorname {rk}}_{C_d}(p) = {{\mathbb {C}}\text{-}\operatorname {b.rk}}(F(p))$ .
Next, a fruitful way to study a projected curve $X = \pi _p(C_d)$ is to consider the rational normal scrolls containing X as a divisor. Recall that a rational normal scroll is a variety $S(a_1, \ldots , a_m)$ which is a join of disjoint rational normal curves of degrees $a_1, \ldots , a_m$ in ${\mathbb {P}}^{\sum _{i=1}^m (a_i+1)-1}$ ; the tuple $(a_1, \ldots , a_m)$ is called the type of the scroll. As $\dim S(a_1, \ldots , a_m) = m$ and $\deg S(a_1, \ldots , a_m) = \sum _{i=1}^m a_i$ , every scroll is a variety of minimal degree and conversely any nondegenerate variety of minimal degree is either a quadric hypersurface, the second Veronese of ${\mathbb {P}}^2$ , or a scroll ([Reference Eisenbud and Harris10]). It was shown in [Reference Park16] that the Green–Lazarsfeld index of X (and even the entire graded Betti table of X) is determined by the types of surface scrolls containing X, which in turn is determined by $b.\,{\operatorname {rk}}_{C_d}(p)$ :
Theorem 3.2 [Reference Park16, Theorem 1.1].
Let $C_d \subseteq {\mathbb {P}}^d$ be a rational normal curve of degree d, $\pi _p : {\mathbb {P}}^d \dashrightarrow {\mathbb {P}}^{d-1}$ the projection away from a point $p \in {\mathbb {P}}^d \setminus \sigma _2(C_d)$ , and $X := \pi _p(C_d) \subseteq {\mathbb {P}}^{d-1}$ . Then
1. X is contained in a surface scroll $S(a,b)$ with $1 \le a \le b$ if and only if $a = b.\,{\operatorname {rk}}_{C_d}(p) - 2$ , and
2. the Green–Lazarsfeld index of X is given by $\alpha (X) = b.\,{\operatorname {rk}}_{C_d}(p)-3$ .
This implies that
$$\eta (X) \ge \alpha (X) + 1 = b.\,{\operatorname {rk}}_{C_d}(p) - 2 = {{\mathbb {C}}\text{-}\operatorname {b.rk}}(F(p)) - 2$$
by [Reference Blekherman, Sinn and Velasco5, Theorems 4, 6]. We will strengthen this inequality in Theorem 6.1.
3.4. Kernels of rays
Definition 3.3. Let ${\mathcal {K}} \subseteq {\mathbb {R}}^n$ be a convex cone and $\ell \in {\mathcal {K}}$ . We say that $\ell $ spans an extreme ray of ${\mathcal {K}}$ if whenever $\ell = \ell _1 + \ell _2$ with $\ell _1, \ell _2 \in {\mathcal {K}}$ , one has $\ell _1 = \lambda _1 \ell $ , $\ell _2 = \lambda _2 \ell $ for some $\lambda _1, \lambda _2 \in {\mathbb {R}}$ .
If $\ell \in {\mathcal {K}}$ spans an extreme ray of ${\mathcal {K}}$ , we will simply say that $\ell $ is an extreme ray of ${\mathcal {K}}$ (i.e., we do not distinguish an extreme ray from its nonzero elements). For instance, we can say that every $\ell \in {\mathcal {K}}$ can be written as a sum of extreme rays.
Proposition 3.4 [Reference Blekherman1, Lemma 2.2].
Let ${\mathcal {K}} = \mathbb {S}_{+} \cap L$ be a spectrahedron and $\ell \in {\mathcal {K}}$ . Then $\ell $ is an extreme ray of ${\mathcal {K}}$ if and only if $\ker Q_\ell $ is maximal, that is, if $\ker Q_\ell \subseteq \ker Q_{\ell '}$ for some $\ell ' \in L$ , then $\ell ' = \lambda \ell $ for some $\lambda \in {\mathbb {R}}$ .
The simplest extreme rays in $\Sigma _X^\star $ are given by point evaluations. For a point $p \in X$ , we can pick an affine representative $\tilde {p}$ lying on the line spanned by p, and define a linear functional $\ell _{\tilde {p}}(q):= q(\tilde {p})$ for $q \in R(X)_2$ . Varying the affine representative only rescales the point evaluation functional, and so by a slight abuse of terminology we will talk about point evaluations at a point $p \in X$ and use $\ell _p$ to denote any of the linear functionals obtained by using an affine representative of p. Point evaluations are precisely the rank $1$ quadratic forms in $\Sigma _X^\star $ : if $\ell \in \Sigma _X^\star $ has ${\operatorname {rank}}\ Q_\ell = 1$ , then $\ell = \ell _p$ for some $p \in X$ [Reference Blekherman, Smith and Velasco6, Lemma 2.3].
Recall that if $V \subseteq R_d$ is a space of forms, then a point p is called a basepoint of V if all forms in V vanish at p. If V has no basepoints, we say that V is basepoint-free.
Remark 3.5. We take a moment to clarify the relationship between rays with basepoint-free kernels and sums of point evaluations.
(i) For $\ell _i \in \Sigma _X^\star $ , $\ker (\sum _i Q_{\ell _i}) = \bigcap _i \ker (Q_{\ell _i})$ . (Proof: For $v \in R(X)_1$ , one has $Q_{\ell _i}(v) \ge 0$ with equality if and only if $v \in \ker (Q_{\ell _i})$ , as $Q_{\ell _i}$ is positive semidefinite.)
(ii) If the functional $\ell _p$ is the evaluation at a point $p \in X$ , then p is a basepoint of $\ker (Q_{\ell _p})$ .
(iii) It follows from (i) and (ii) that if $\ell \in \Sigma _X^\star $ is such that $\ker (Q_\ell )$ is basepoint-free, then for any decomposition of $\ell $ as a sum of extreme rays $\ell = \sum \ell _i$ of $\Sigma _X^\star $ , each extreme ray $\ell _i$ has rank $> 1$ , that is, is not a point evaluation. Otherwise, the kernel of any point evaluation $\ell _p$ used in the decomposition of $\ell $ will contain the kernel of $\ell $ , and thus the kernel of $\ell $ will have a basepoint. (In fact the converse holds as well: If p is a basepoint of $\ker (Q_\ell )$ , then there is a decomposition of $\ell $ into extreme rays, one of which is $\ell _p$ . However, note that a sum of extreme rays of rank $> 1$ may have basepoints.)
The next lemma connects kernels of quadratic forms to apolar ideals of binary forms, which is key for our main result.
Lemma 3.6. Let $d \ge 1$ , $L \in {\mathbb {R}}[x,y]_{2d}$ and Q the quadratic form on ${\mathbb {R}}[x,y]_d$ associated to the functional $\langle \cdot , L \rangle $ (as in Remark 3.1). Then $\ker (Q) = (L)^\perp _d$ .
Proof. The matrix A of Q is constructed with respect to a basis $B = \{b_0, \ldots , b_d\}$ of ${\mathbb {R}}[x,y]_d$ as follows: The $(i,j)$ entry of A is $\langle b_ib_j, L \rangle $ . Given $f \in {\mathbb {R}}[x,y]_d$ , one has $f \in \ker (Q) \iff \langle b_i f, L \rangle = 0$ for all $0 \le i \le d \iff (f)_{2d} \subseteq (L)^\perp \iff f \in (L)^\perp $ , where the last equivalence is Lemma 2.5.
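In the monomial basis $b_i = x^{d-i}y^i$ , the matrix in the proof above has $(i,j)$ entry $\langle x^{2d-i-j}y^{i+j}, L \rangle = (2d-i-j)!\,(i+j)!\,L_{i+j}$ , where $L = \sum _k L_k x^{2d-k}y^k$ ; in particular it is a Hankel matrix. The following Python sketch builds this matrix with exact arithmetic and tests Lemma 3.6 on one sample, $L = (x+y)^{6}$ . The helper names `hankel_matrix` and `rank` and the sample data are assumptions introduced for illustration, not notation from the text.

```python
from fractions import Fraction
from math import comb, factorial

def hankel_matrix(L):
    """Matrix of Q in the basis b_i = x^(d-i) y^i, for L = sum_k L[k] x^(2d-k) y^k.

    Entry (i, j) is <b_i b_j, L> = (2d-i-j)! * (i+j)! * L[i+j].
    """
    d = (len(L) - 1) // 2
    h = [factorial(2 * d - k) * factorial(k) * L[k] for k in range(2 * d + 1)]
    return [[h[i + j] for j in range(d + 1)] for i in range(d + 1)]

def rank(rows):
    """Rank of a matrix via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue                     # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

d = 3
L = [comb(2 * d, k) for k in range(2 * d + 1)]     # L = (x + y)^6
A = hankel_matrix(L)
print(rank(A))                                      # 1

f = [1, -3, 3, -1]                                  # f = (x - y)^3 vanishes at [1:1]
print([sum(a * c for a, c in zip(row, f)) for row in A])   # [0, 0, 0, 0]
```

In this sample Q has rank one and its kernel consists of the cubics vanishing at $[1:1]$ , which is the degree $3$ part of $((x+y)^6)^\perp $ , as Lemma 3.6 predicts.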
We also note that vanishing at points on ${\mathbb {P}}^1$ with specified multiplicities imposes independent conditions on binary forms.
Proposition 3.7. Let $d \ge 0$ , $\{ p_1, \ldots , p_r \} \subseteq {\mathbb {P}}^1$ and $r_1, \ldots , r_r \in {\mathbb {N}}$ be given. Then the space of degree d binary forms vanishing to order at least $r_i$ at each $p_i$ has codimension $\sum _{i=1}^r r_i $ in $k[x,y]_d$ (we interpret the space as empty if $\sum _{i=1}^r r_i> d$ ).
Proof. Vanishing at $[a_1:b_1], \ldots , [a_r:b_r]$ to orders $r_1, \ldots , r_r$ is equivalent to being divisible by $\prod _{i=1}^r (b_i x - a_i y)^{r_i}$ .
3.5. Linear algebra
Here, we collect various results from linear algebra which will be needed in the proof of Theorem 1.2.
Lemma 3.8. Let $A = \sum _{i=1}^k \lambda _i v_i v_i^T$ be an $n \times n$ symmetric matrix, with $v_i \in {\mathbb {R}}^n$ . If $\{v_1, \ldots , v_k \}$ are linearly independent, then the signature of A is given by the sign pattern of the coefficients $\lambda _i$ .
Proof. Diagonalize A by extending $\{ v_1, \ldots , v_k \}$ to a basis of ${\mathbb {R}}^n$ .
Lemma 3.9 (Cauchy interlacing).
Let A be a real symmetric matrix. If B is any principal submatrix of A, then the eigenvalues of B interlace the eigenvalues of A.
Proof. See [Reference Horn and Johnson13, Theorem 4.3.17].
Corollary 3.10. Let Q be a quadratic form on ${\mathbb {R}}^n$ with Lorentz signature $(n-1, 1) = (+, \ldots , +, -)$ and $H \subseteq {\mathbb {R}}^n$ a hyperplane. Then the following are equivalent for the restriction $Q \big |_H$ of Q to H:
1. $\ker (Q \big |_H) \ne 0$ .
2. ${\operatorname {rank}} Q \big |_H = n - 2$ .
3. $Q \big |_H$ is positive semidefinite, but not positive definite.
Proof. Choose a basis of ${\mathbb {R}}^n$ which arises from extending a basis of H so that if A is the $n \times n$ symmetric matrix representing Q, then $Q \big |_H$ is represented by a principal $(n-1) \times (n-1)$ submatrix B of A. Now, $\ker (Q \big |_H) \ne 0$ implies that $0$ is an eigenvalue of B. If B had a negative eigenvalue, then Lemma 3.9 would imply that A has $\ge 2$ negative eigenvalues, contradiction.
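A toy instance of Corollary 3.10 may help fix ideas. In the Python sketch below, the form $Q = x_1^2 + x_2^2 - x_3^2$ on ${\mathbb {R}}^3$ and the three hyperplanes are sample data chosen for illustration, and `restrict` is a hypothetical helper that computes the Gram matrix of $Q \big |_H$ in a chosen basis of H.

```python
def restrict(A, basis):
    """Gram matrix of the quadratic form v -> v^T A v on span(basis)."""
    def bilinear(u, v):
        return sum(u[i] * A[i][j] * v[j]
                   for i in range(len(A)) for j in range(len(A)))
    return [[bilinear(u, v) for v in basis] for u in basis]

# Q = x1^2 + x2^2 - x3^2 has Lorentz signature (2, 1) on R^3, so n = 3.
A = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, -1]]

# H = {x1 = x3}: the restriction has a kernel, and is PSD of rank 1 = n - 2.
G1 = restrict(A, [(1, 0, 1), (0, 1, 0)])
print(G1)                                  # [[0, 0], [0, 1]]

# H = {x3 = 0}: the restriction is positive definite, with trivial kernel.
G2 = restrict(A, [(1, 0, 0), (0, 1, 0)])
print(G2)                                  # [[1, 0], [0, 1]]

# H = {x1 = 0}: the restriction is indefinite, again with trivial kernel.
G3 = restrict(A, [(0, 1, 0), (0, 0, 1)])
print(G3)                                  # [[1, 0], [0, -1]]
```

Only the first hyperplane satisfies the equivalent conditions of the corollary; in the other two cases the restriction is nondegenerate.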
4. A monomial example
Before proceeding to the proof of Theorem 1.2, we illustrate the major steps of the construction in Section 5 in an example. Let $C_d := \nu _d({\mathbb {P}}^1) \subseteq {\mathbb {P}}^d$ be the standard rational normal curve of degree d and $e_0, \ldots , e_d$ the torus-fixed points of ${\mathbb {P}}^d$ . Set $X_i := \pi _{e_i}(C_d) \subseteq {\mathbb {P}}^{d-1}$ , the projection of $C_d$ away from $e_i$ . As this corresponds to the case where the center of projection $F(e_i) = x^{d-i}y^i$ is a binary monomial of degree d, we refer to $X_i$ as a monomial projection of the rational normal curve.
In the following example, we will use the (normalized) monomial basis $\left \{ \dfrac {x^{6-i}y^i}{i! (6-i)!} \mid 0 \le i \le 6 \right \}$ of ${\mathbb {R}}[x,y]_6$ ordered lexicographically. By removing the normalized monomial $\frac {1}{3!3!}x^3y^3$ from this basis, we obtain an ordered basis for H which we use to write down our matrices.
Example 4.1. Let $d = 6$ , and consider the rational normal curve $C_6 \subseteq {\mathbb {P}}^6$ . Set $X = X_3 = \pi _{e_3}(C_6)$ , where the center of projection $e_3$ corresponds to the monomial $x^3y^3 \in {\mathbb {R}}[x,y]_6$ . Then $(x^3y^3)^\perp = (x^4, y^4)$ , and the form $x^4 - y^4 = (x-y)(x+y)(x^2+y^2)$ has almost real roots, which correspond to four points
Accordingly, ${\operatorname {ar-rk}}(x^3y^3) = 4$ , and there is a decomposition
where $\zeta := e^{\pi i /12}$ . Using notation as in Section 5.2, we construct a ray
where $\alpha = \frac {{\operatorname {Re}}(b_4)}{2}, \beta = \frac {{\operatorname {Im}}(b_4)}{2}$ . We thus get a $7 \times 7$ matrix representing the quadratic form $Q_\ell $ on ${\mathbb {R}}[x,y]_6 \cong R(C_6)_1$ corresponding to the ray $\ell $ , for example, with respect to the (normalized) monomial basis $\left \{ \dfrac {x^{6-i}y^i}{i! (6-i)!} \mid 0 \le i \le 6 \right \}$ of ${\mathbb {R}}[x,y]_6$ . Note that $Q_\ell $ has rank $4$ , corresponding to a sum of two real and one pair of complex-conjugate point evaluations (for a total of four point evaluations), and the matrix of $Q_\ell $ has three positive eigenvalues and one negative eigenvalue.
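That $x^3y^3$ lies in the span of the sixth powers of the four linear forms $x \pm y$, $x \pm iy$ (which are, up to scaling, the forms dual to the factors of $x^4 - y^4$) can be checked by expanding binomials. The sketch below is our own verification; the specific combination and the overall factor $80$ are computed here for illustration and presumably differ from the normalization involving $\zeta$ only by a rescaling of the linear forms.

```python
from math import comb

# Exact powers of the imaginary unit: i^0, i^1, i^2, i^3.
I_POWERS = [1, 1j, -1, -1j]

def sixth_power_coeffs(k):
    """Coefficients of (x + i^k y)^6 in the monomials x^(6-j) y^j, j = 0..6."""
    return [comb(6, j) * I_POWERS[(k * j) % 4] for j in range(7)]

# k = 0, 1, 2, 3 gives the linear forms x + y, x + iy, x - y, x - iy.
p_plus, p_i, p_minus, p_minus_i = (sixth_power_coeffs(k) for k in range(4))

# Candidate combination: (x+y)^6 - (x-y)^6 + i (x+iy)^6 - i (x-iy)^6.
combo = [a - b + 1j * c - 1j * e
         for a, b, c, e in zip(p_plus, p_minus, p_i, p_minus_i)]
```

Only the coefficient of $x^3y^3$ survives, so dividing by it expresses $x^3y^3$ as a combination of four sixth powers, two real and one complex-conjugate pair.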
Next, we restrict $Q_\ell $ to the hyperplane $(x^3y^3)^\perp _6 \subseteq {\mathbb {R}}[x,y]_6$ , to obtain a quadratic form $q_\ell $ on $R(X)_1$ . In the chosen monomial basis, this corresponds to deleting the $4^{\text {th}}$ (= middle) row and column from $Q_\ell $ and yields the following block matrix:
In particular, the block structure of $q_\ell $ implies that ${\operatorname {rank}}(q_\ell ) \le 2 \iff 0 = \det M = 64(b_1b_2\alpha + (b_1 + b_2) (\alpha ^2 + \beta ^2))$ . Thus, when $\frac {\alpha }{\alpha ^2+\beta ^2} + (\frac {1}{b_1} + \frac {1}{b_2})= 0$ , $b_1, b_2> 0$ and $\beta \ne 0$ , the submatrix M is singular and $q_\ell $ is PSD of rank $2$ . For example, if $b_1 = 1, b_2 = 1, \alpha = -1/4, \beta = 1/4$ , then $M = \begin {bmatrix} 3 & 1 & 1 \\ 1 & 1 & -1 \\ 1 & -1 & 3 \end {bmatrix}$ . Thus, the linear functional $\ell \in R(X)_2^\star $ has rank $2$ . Moreover, one can compute a basis $\{ -x_0 + 12x_1 + x_6, -x_1 + x_5, -x_0 + 15x_4, -x_0 + 12x_1 + 15x_2 \}$ of $\ker (q_\ell )$ in ${\mathbb {R}}[x_0,x_1,x_2,x_4,x_5,x_6] = R({\mathbb {P}}^5)$ , whose vanishing defines a line in ${\mathbb {P}}^5$ . It is readily verified that this line does not meet X, which shows that $\ker (q_\ell )$ is basepoint-free. Thus, $\ell \in \Sigma _X^\star \setminus P_X^\star $ , hence $\eta (X) \le 2$ . As $\eta (X) \ge 2$ by definition, this shows that $\eta (X) = 2$ .
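The numerical claims about the matrix M in Example 4.1 can be confirmed with exact arithmetic. The sketch below checks that the stated parameter values satisfy the singularity relation, that M is singular, and that M is PSD of rank 2 via its principal minors. The variable names (`w1`, `w2` for the two positive real coefficients, `kernel_vector`) are ours and are not notation from the text.

```python
from fractions import Fraction
from itertools import combinations

M = [[3, 1, 1], [1, 1, -1], [1, -1, 3]]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Parameter values from the example; w1, w2 are the two positive real coefficients.
alpha, beta = Fraction(-1, 4), Fraction(1, 4)
w1, w2 = 1, 1
relation = alpha / (alpha**2 + beta**2) + Fraction(1, w1) + Fraction(1, w2)

det_M = det3(M)
minors_2x2 = [M[i][i] * M[j][j] - M[i][j] * M[j][i]
              for i, j in combinations(range(3), 2)]
# A symmetric matrix is PSD iff all of its principal minors are nonnegative.
is_psd = (all(M[i][i] >= 0 for i in range(3))
          and all(m >= 0 for m in minors_2x2)
          and det_M >= 0)
has_rank_two = det_M == 0 and any(m != 0 for m in minors_2x2)

# A vector spanning the kernel of M, found by hand.
kernel_vector = [1, -2, -1]
image = [sum(M[i][j] * kernel_vector[j] for j in range(3)) for i in range(3)]
```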
5. Construction of rays in $\Sigma _X^\star $
We now turn to the proof of Theorem 1.2, which will span the next two sections. In this section, we give a general procedure for constructing elements in $\Sigma _X^\star $ of ranks between ${\operatorname {ar-rk}}(F(p)) - 2$ and $d - 3$ , whose kernels are basepoint-free. By Remark 3.5, this shows that if ${\operatorname {ar-rk}}(F(p))> 3$ , then $\eta (X) \le {\operatorname {ar-rk}}(F(p)) - 2$ .
Choose r with ${\operatorname {ar-rk}}(F(p)) \le r \le d-1$ , and choose a form $g \in (F(p))^\perp _r$ with almost real roots. We assume that no proper divisor of g is in $(F(p))^\perp $ (which is automatic when $r = {\operatorname {ar-rk}}(F(p))$ and can be arranged when $r \ge \deg F^\circ $ ). Then there is a factorization over ${\mathbb {C}}$
$\displaystyle g = \prod _{i=1}^{r} l_i$
of g into linear forms $l_i =: a_i x + b_i y \in {\mathbb {C}}[x,y]_1$ , where either
1. All $l_i$ ’s are distinct and real, or
2. All $l_i$ ’s are distinct, and there is exactly one conjugate pair $l_r = \overline {l_{r-1}}$ , or
3. All $l_i$ ’s are real, and there is exactly one repeated factor $l_r = l_{r-1}$ .
For the first two cases, the construction that we give below has appeared before, for example, in [Reference Blekherman1, Theorem 6.1 and Theorem 7.1] (for Veronese embeddings of projective spaces) and [Reference Blekherman, Smith and Velasco6, Proposition 3.2 and Procedure 3.3]. Case (3), however, is new, specifically dealing with a nonreduced zero-dimensional scheme.
5.1. Simple real roots
We start with case (1), that is, all roots of g are real and distinct. By apolarity (Lemma 2.4), $F(p)$ may be expressed as a linear combination of $(l_1)_\perp ^d, \ldots , (l_r)_\perp ^d$ , that is, there exist $c_1, \ldots , c_r \in {\mathbb {R}}$ such that
$\displaystyle F(p) = \sum _{i=1}^{r} c_i \, (l_i)_\perp ^d. \qquad (3)$
Note that since no proper factor of g is in $(F(p))^\perp $ , each coefficient $c_i$ in Equation (3) is nonzero.
We now construct elements in $\Sigma _X^\star $ of rank $r-2$ . Let $p_1, \ldots , p_r \in {\mathbb {P}}^{d}$ correspond to the r roots of g (explicitly, $p_i = \nu _d([a_i : b_i])$ ). Consider a linear combination
$\displaystyle \ell := \sum _{i=1}^{r} b_i \ell _{p_i}^2 \in R(C_d)_2^\star \qquad (4)$
with (as yet unspecified) coefficients $b_i \in {\mathbb {R}}$ , where $\ell _{p_i}$ denotes evaluation at $p_i$ (note that $\ell _{p_i}$ corresponds to the binary form $(l_i)_\perp ^{d} \in {\mathbb {R}}[x,y]_{d}$ ). Then as in Remark 3.1, $\ell $ gives rise to a quadratic form $Q_\ell $ on $R(C_d)_1$ , as well as its restriction $q_\ell $ to $R(X)_1$ .
Next, we claim that if the $b_i$ are chosen so that
then ${\operatorname {rank}}(q_\ell ) = r - 2$ . To show this, we choose coordinates to reduce to a computation with matrices. Let
$\displaystyle Z := \{ p_1, \ldots , p_r \} \subseteq C_d$
be the zero-dimensional variety of the points $p_i$ . The coordinate ring $R(Z)$ satisfies $\dim _{{\mathbb {R}}} R(Z)_1 = r$ , with basis $\{e_i\}_{i=1}^r$ given by indicator functions of the points, that is, $e_i(p_j) = \delta _{ij}$ . (One can of course write down explicit polynomial representatives on ${\mathbb {P}}^d$ for the $e_i$ ’s via interpolators (with a suitable padding up to degree d), although we will not need such representatives.) If $I(Z)$ is the defining ideal of Z in $C_d$ , then via the isomorphism $R(Z) \cong R(C_d)/I(Z)$ , a quadratic form on $R(C_d)_1$ whose kernel contains $I(Z)_1$ (such as $Q_\ell $ ) induces a quadratic form on $R(Z)_1$ , which is in turn represented as an $r \times r$ matrix.
The choice of basis $\{e_i\}_{i=1}^r$ then allows for a convenient expression of the matrix of the induced quadratic form $\widetilde {Q_\ell }$ on $R(Z)_1$ : namely, $\widetilde {Q_\ell }$ is represented by the diagonal matrix ${\operatorname {diag}}(b_1, \ldots , b_r)$ in this basis. Note that the conditions (5) imply that $b_r < 0$ (recall that $c_i \ne 0$ ), so by Lemma 3.8, $\widetilde {Q_\ell }$ has Lorentz signature (since $r \le d+1$ , any r points on $C_d$ are in linearly general position, so the functionals $\ell _{p_1}, \ldots , \ell _{p_r} \in R(C_d)_1^\star $ are linearly independent).
On the other hand, we may also consider the quadratic form induced by $q_\ell $ on the points $\pi (Z) := \{ \pi (p_1), \ldots , \pi (p_r) \}$ . The key difference is that the points $\pi (p_1), \ldots , \pi (p_r) \in X$ are not in linearly general position – indeed, the projection map $\pi $ can be viewed as a projectivization of the vector space quotient ${\mathbb {R}}[x,y]_d \twoheadrightarrow {\mathbb {R}}[x,y]_d/{\operatorname {span}}\{F(p)\}$ , so the linear relation (3) gives a linear dependency
$\displaystyle \ell _{\pi (p_r)} = -\frac {1}{c_r} \sum _{i=1}^{r-1} c_i \ell _{\pi (p_i)} \qquad (6)$
expressing the last point evaluation $\ell _{\pi (p_r)}$ in terms of the others. In particular, on removing the last point $\pi (p_r)$ , the coordinate ring $R(\pi (Z \setminus \{ p_r \}))$ has a basis $\{e_i\}_{i=1}^{r-1}$ for its degree 1 part (note that $\pi (Z \setminus \{ p_r \})$ is in linearly general position in ${\mathbb {P}}^{d-1}$ ). This gives a quadratic form $\widetilde {q_\ell }$ on $R(\pi (Z \setminus \{ p_r \}))_1$ induced by $q_\ell $ : Explicitly, substituting Equation (6) into Equation (4) gives the expression
$\displaystyle \sum _{i=1}^{r-1} b_i \ell _{\pi (p_i)}^2 + \frac {b_r}{c_r^2} \Big ( \sum _{i=1}^{r-1} c_i \ell _{\pi (p_i)} \Big )^2$ for (the linear functional corresponding to) $\widetilde {q_\ell }$ . Setting
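$\displaystyle D := {\operatorname {diag}}(b_1, \ldots , b_{r-1}), \qquad {\textbf {c}} := \begin {bmatrix} c_1 & \ldots & c_{r-1} \end {bmatrix}^T,$
we see that the matrix of $\widetilde {q_\ell }$ in the basis $\{e_i\}_{i=1}^{r-1}$ is given by $D + \dfrac {b_r}{c_r^2} {\textbf {c}} {\textbf {c}}^T$ . Finally, observe that the vector $D^{-1}{\textbf {c}}$ is in the kernel of $\widetilde {q_\ell }$ :
$\displaystyle \Big ( D + \frac {b_r}{c_r^2} {\textbf {c}} {\textbf {c}}^T \Big ) D^{-1} {\textbf {c}} = \Big ( 1 + \frac {b_r}{c_r^2} \, {\textbf {c}}^T D^{-1} {\textbf {c}} \Big ) {\textbf {c}} = 0$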
by Equation (5). Corollary 3.10 then implies that $q_\ell $ is PSD (which implies that $\ell \in \Sigma _X^\star $ ) of rank $r-2$ .
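The kernel computation above can be illustrated numerically. The following sketch works under an assumption about the exact form of the conditions (5), which are not reproduced here: we take $b_1, \ldots , b_{r-1} > 0$ and $\sum _{i=1}^{r} c_i^2 / b_i = 0$, which is the relation needed for $D^{-1}{\textbf {c}}$ to be annihilated. The sample values of $c_i$ and $b_i$ (with $r = 4$) are arbitrary and hypothetical.

```python
from fractions import Fraction

def rank(mat):
    """Rank of a matrix with Fraction entries, by exact Gaussian elimination."""
    m = [row[:] for row in mat]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk

# Hypothetical sample data with r = 4: c_1..c_3, c_4 and positive b_1..b_3.
c = [Fraction(1), Fraction(2), Fraction(3)]
c_r = Fraction(1)
b = [Fraction(1), Fraction(2), Fraction(3)]

s = sum(ci**2 / bi for ci, bi in zip(c, b))     # c^T D^{-1} c
b_r = -c_r**2 / s                               # assumed condition: sum_i c_i^2 / b_i = 0

n = len(c)
# Matrix D + (b_r / c_r^2) c c^T of the restricted form.
M = [[(b[i] if i == j else 0) + b_r / c_r**2 * c[i] * c[j] for j in range(n)]
     for i in range(n)]
v = [ci / bi for ci, bi in zip(c, b)]           # the vector D^{-1} c
Mv = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
```

In this instance $b_r = -1/6 < 0$, the vector $D^{-1}{\textbf {c}} = (1, 1, 1)$ has no zero entries, and the $3 \times 3$ matrix has rank $2 = r - 2$, in line with the claims of this subsection.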
It remains to show that for any ray $\ell $ constructed satisfying Equations (4) and (5), $\ker (q_\ell )$ is basepoint-free. The following reasoning will also apply to the cases in Section 5.2 and Section 5.3. First, we claim that $\ker (q_\ell )$ can have no basepoints outside of $\pi (Z)$ : If not, then $\ker (Q_\ell )$ would have a basepoint outside of Z. However, by Lemma 3.6, $\ker (Q_\ell ) = (g)_d \subseteq (F(p))^\perp _d$ is an ${\mathbb {R}}$ -vector space of dimension $d+1 - \deg g = d+1-r$ which consists of binary forms vanishing at all the points of Z (to orders specified by multiplicities of factors of g in the case of a double root in Section 5.3), thus cannot have another common zero outside of Z by Proposition 3.7. It thus suffices to eliminate the possibility of any point of $\pi (Z)$ as a basepoint, but this follows since the vector $D^{-1}{\textbf {c}}$ in $\ker (q_\ell )$ has all nonzero entries in the basis $\{e_i\}_{i=1}^{r-1}$ .
5.2. One complex pair
Next, we consider case (2), that is, g has one pair of nonreal roots $l_r = \overline {l_{r-1}}$ . The general argument will follow the outline of case (1), so we focus only on the differences (which will mainly be in the last two functionals). Essentially, rather than using two functionals arising from evaluations at complex conjugate points, we use the real and imaginary parts of one complex point evaluation. Over ${\mathbb {C}}$ , there is an expression $F(p) = \sum _{i=1}^{r-2} c_i (l_i)_\perp ^d + c_{r-1} (l_{r-1})_\perp ^d + c_r (l_r)_\perp ^d$ , and independence of the forms $\{ (l_i)_\perp ^d \}_{i=1}^r$ and conjugate-symmetry forces $c_r = \overline {c_{r-1}}$ . By rescaling $l_r \in {\mathbb {C}}[x,y]_1$ , we may assume that $c_r = 1$ (so that $c_{r-1} = 1$ as well) and thus write the analogue of Equation (3) in the form
where $c_1, \ldots , c_{r-2} \in {\mathbb {R}}$ are all nonzero since no proper factor of g is in $(F(p))^\perp $ .
We then construct the functional in $\Sigma _X^\star $ . As before, choosing $p_1, \ldots , p_r \in {\mathbb {P}}^d$ corresponding to the roots of g (with $p_r = \overline {p_{r-1}}$ a nonreal conjugate pair), we obtain a linear functional $\ell := \sum _{i=1}^{r-2} b_i \ell _{p_i}^2 + b_r \ell _{p_r}^2 + \overline {b_r} \overline {\ell _{p_r}}^2 \in R(C_d)_2^\star $ , which becomes
where $\alpha := \frac {{\operatorname {Re}}(b_r)}{2}$ , $\beta := \frac {{\operatorname {Im}}(b_r)}{2}$ . We claim that if the $b_i$ are chosen so that
then $q_\ell $ has rank $r-2$ and basepoint-free kernel. Indeed, writing $\ell _1, \ldots , \ell _r$ for the images of $(l_1)_\perp ^d, \ldots , (l_{r-2})_\perp ^d$ , $2 {\operatorname {Re}}((l_r)_\perp ^d), 2 {\operatorname {Im}}((l_r)_\perp ^d)$ in $R(C_d)_1^\star $ , and choosing forms in $R(C_d)_1$ dual to the functionals $\ell _1, \ldots , \ell _r$ , we see that the matrix of $Q_\ell \Big |_{{\operatorname {span}}\{e_i\}}$ is given by $\begin {bmatrix} D & 0 \\ 0 & A \end {bmatrix}$ , where $D := {\operatorname {diag}}(b_1, \ldots , b_{r-2})$ , $A := \begin {bmatrix} \alpha & -\beta \\ -\beta & -\alpha \end {bmatrix}$ , so that $Q_\ell $ has Lorentz signature (note that $\det (A) < 0$ ). Expressing Equation (3′) in the form
setting ${\textbf {c}} := \begin {bmatrix} c_1 & \ldots & c_{r-2} \end {bmatrix}^T$ and substituting Equation (6′) into Equation (4′) gives the matrix
Finally, observe that the vector $\begin {bmatrix} D^{-1} {\textbf {c}} \\ \frac {-\beta }{\alpha ^2+\beta ^2} \end {bmatrix}$ is in $\ker \widetilde {q_\ell }$ and has all nonzero entries:
The reasoning that $\ker (q_\ell )$ is basepoint-free was already explained at the end of Section 5.1.
5.3. One double root
Finally, we consider case (3), that is, g has a unique real double root $l_r = l_{r-1}$ (with all other roots real and simple). In this case, the two functionals we use correspond to evaluation at the double point, as well as differentiation followed by evaluation. From apolarity, there is a relation
where as before $c_1, \ldots , c_r \in {\mathbb {R}}$ are all nonzero. Let $\ell _1, \ldots , \ell _r \in R(C_d)_1^\star $ be the linear functionals corresponding to $(l_1)_\perp ^d, \ldots , (l_{r-2})_\perp ^d, (l_r)_\perp ^d, 2 l_r (l_r)_\perp ^{d-1}$ and consider the linear functional in $(R(C_d)_2)^\star $ defined by
We claim that if the $b_i$ are chosen so that
then $q_\ell $ has rank $r-2$ and basepoint-free kernel. Indeed, the matrix of $Q_\ell $ (restricted to the subspace of $R(C_d)_1$ spanned by forms dual to $\ell _1, \ldots , \ell _r$ ) is given by $\begin {bmatrix} D & 0 \\ 0 & A \end {bmatrix}$ , where $D := {\operatorname {diag}}(b_1, \ldots , b_{r-2})$ , $A := \begin {bmatrix} b_{r-1} & \frac {b_r}{2} \\ \frac {b_r}{2} & 0 \end {bmatrix}$ , hence has Lorentz signature (note that $\det (A) < 0$ ). Writing (3′′) in the form
setting ${\textbf {c}} := \begin {bmatrix} c_1 & \ldots & c_{r-2} \end {bmatrix}^T$ and substituting Equation (6′′) into Equation (4′′) gives the matrix
As before, we exhibit a kernel vector $\begin {bmatrix} D^{-1} {\textbf {c}} \\ \frac {c_r}{b_r} \end {bmatrix}$ with all nonzero entries:
We remark that the (left-hand side of the) equation in Equation (5′′) is precisely the Schur complement $\widetilde {q_\ell }/D$ (this provides another proof that $\widetilde {q_\ell }$ is PSD but not positive definite). As a quadratic in $\frac {b_r}{c_r}$ , this equation always has two real solutions (as the discriminant $(2c_{r-1})^2 + 4b_{r-1} {\textbf {c}}^T D^{-1} {\textbf {c}}$ is positive by Equation (5′′)).
As before, the reasoning that $\ker (q_\ell )$ is basepoint-free was given at the end of Section 5.1.
6. Lower bound
In this section, we prove a lower bound on the Hankel index in terms of the almost real rank of the center of projection, showing that our construction in Section 5 of rays of minimal rank is sharp. Throughout, let $X = \pi _p(C_d)$ be a projection with center p of a rational normal curve $C_d \subseteq {\mathbb {P}}^d$ . We assume that the center p is not contained in $\sigma _3(C_d)$ (which implies $d \ge 6$ ), and as in Section 3.1, we associate to p a binary form $F(p) \in {\mathbb {R}}[x,y]_d$ .
Theorem 6.1. We have the following bound on the Hankel index of a projected rational normal curve X with center of projection $F(p)$ :
$\displaystyle \eta (X) \ge {\operatorname {ar-rk}}(F(p)) - 2.$
Proof. Fix a ray $\ell \in \Sigma _X^\star $ . By Remark 3.1(v), we get $L \in {\mathbb {R}}[x,y]_{2d}$ such that $\ell (\cdot ) = \langle \cdot , L \rangle $ , and quadratic forms $Q_\ell $ on ${\mathbb {R}}[x,y]_d$ and $q_\ell := Q_\ell \Big |_H$ , where $H = (F(p))^\perp _d \cong R(X)_1$ . Note that $q_\ell $ is PSD since $\ell $ was an element of $\Sigma _X^\star $ , which implies by Lemma 3.9 that $Q_\ell $ has at most one negative eigenvalue.
We now further assume that $\ker q_\ell $ is basepoint-free. By Lemma 3.8, this implies that $Q_\ell $ is not PSD, as otherwise $q_\ell $ would be a sum of point evaluations, contradicting Remark 3.5. It follows that $Q_\ell $ has Lorentz signature $(+, \ldots , +, -)$ .
Consider the apolar ideal $(L)^\perp = (L_\perp , L^\circ )$ , and set $s := \deg L_\perp $ (so that $\deg L^\circ = 2d + 2 - s$ ). By Lemma 3.6, $\ker (Q_\ell ) = (L)^\perp _d$ , and since this space is nonzero (being basepoint-free), one must have $s \le d$ (in particular, $s < \deg L^\circ $ ). Write
where $l_i \in {\mathbb {C}}[x,y]_1$ are distinct linear forms and $\sum d_i = s$ .
We next claim that $L_\perp $ has almost real roots, which is the core of this proof. For convenience, say that a form G has a triple root if G has a real root of multiplicity $3$ , and all other roots are real and simple. We first show, via a perturbation argument, that either $L_\perp $ has almost real roots, or $L_\perp $ has a triple root. Then, we show that $L_\perp $ does not have a triple root.
Thus, suppose that $L_\perp $ does not have almost real roots, nor a triple root. The key idea for the perturbation argument is the following: We may approximate $L_\perp $ by a sequence of polynomials, all of which have at least two pairs of simple complex roots. Intuitively, each pair of simple complex roots contributes a negative eigenvalue to the signature and then continuity implies that $Q_\ell $ has $\ge 2$ negative eigenvalues, a contradiction.
To be precise, we consider the following types of replacements of certain factors of $L_\perp $ , depending on the way that $L_\perp $ fails to have almost real roots/a triple root:
(here, $\alpha \in {\mathbb {C}} \setminus {\mathbb {R}}$ , and $a, b \in {\mathbb {R}}$ are distinct). Then for all sufficiently small $\epsilon> 0$ , the polynomial $L_\epsilon $ obtained from $L_\perp $ by performing one of the above replacements has $\ge 2$ pairs of simple complex roots and satisfies $L_\epsilon \to L_\perp $ as $\epsilon \to 0$ (if $L_\perp $ already has two pairs of simple complex roots, then we may take $L_\epsilon = L_\perp $ ). Taking apolar ideals of the form $(L_\epsilon , L^\circ )$ gives a sequence of degree $2d$ forms converging to L, and with this associated quadratic forms $Q_\epsilon \to Q_\ell $ . Then each $L_\epsilon $ has $\ge 2$ pairs of simple complex roots, so $Q_\epsilon $ has $\ge 2$ negative eigenvalues. Furthermore, $\dim \ker (Q_\epsilon ) = \dim (L_\epsilon )_d = d-s+1$ is constant in $\epsilon $ . Then continuity of eigenvalues implies that $Q_\ell $ has $\ge 2$ negative eigenvalues (as no negative eigenvalue can become positive without crossing zero, and the number of zero eigenvalues stays constant), contradicting the fact that $Q_\ell $ has Lorentz signature.
To conclude that $L_\perp $ has almost real roots, it remains to eliminate the possibility that $L_\perp $ has a triple root. We will show that if $L_\perp $ has a triple root, then $\ker q_\ell $ is not basepoint-free. Suppose the roots of $L_\perp $ have multiplicities $(d_1, \ldots , d_{s-2}) = (1, \ldots , 1, 3)$ . Setting $l := l_{s-2}$ , by apolarity we may write
for some $b_1, \ldots , b_s \in {\mathbb {R}}$ . Write $\ell _1, \ldots , \ell _s$ for the functionals in $R(C_d)_1^\star $ corresponding to $(l_1)_\perp ^d, \ldots , (l_{s-3})_\perp ^d$ , $(l_\perp )^d, l(l_\perp )^{d-1}, l^2(l_\perp )^{d-2}$ . Then Equation (7) may be expressed as
(with $\ell _{s-2} \ell _s = \ell _{s-1}^2$ ). Since $L_\perp \in H = (F(p))^\perp $ (shown below), there is also a relation
with $c_i \in {\mathbb {R}}$ . Note that since a proper factor of $L_\perp $ may lie in $(F(p))^\perp $ , we cannot say a priori whether any particular $c_i$ is nonzero. We thus consider cases depending on whether $c_s$ is nonzero.
If $c_s \ne 0$ , then substituting $\ell _s = -\frac {1}{c_s} \sum _{i=1}^{s-1} c_i \ell _i$ into Equation (8) gives a matrix for $\widetilde {q_\ell }$ (with respect to $\{\ell _1, \ldots , \ell _{s-1}\}$ ) whose last diagonal entry (the coefficient of $\ell _{s-1}^2$ ) is 0. If $c_s = 0$ , then substituting $\ell _i = -\frac {1}{c_i} \sum _{j \ne i}^{s-1} c_j \ell _j$ (for some $1 \le i \le s-1$ with $c_i \ne 0$ ) into Equation (8) gives a matrix for $\widetilde {q_\ell }$ (with respect to $\{\ell _1, \ldots , \hat {\ell _i}, \ldots , \ell _s \}$ ) whose last diagonal entry (the coefficient of $\ell _s^2$ ) is 0. Thus, in any case $\widetilde {q_\ell }$ can be represented by a matrix with last diagonal entry $0$ , and since $\widetilde {q_\ell }$ is PSD, this implies that the entire last column of $\widetilde {q_\ell }$ must be $0$ . Then $\ker (\widetilde {q_\ell })$ is generated by the vector $\begin {bmatrix} 0 & \ldots & 0 & 1 \end {bmatrix}^T$ , but this implies that $\ker (q_\ell )$ is not basepoint-free (as each of the roots of $l_1, \ldots , l_{s-1}$ would be basepoints).
This shows that $L_\perp $ has almost real roots. Next, we show that $L_\perp $ is contained in the apolar ideal of the center $(F(p))^\perp $ . If $L_\perp $ has simple roots, then the points on X corresponding to these roots cannot be in linearly general position: If they were, then Lemma 3.8 implies that $\ell $ would be a sum of point evaluations, contradicting Remark 3.5(iii). This means precisely that $F(p)$ can be written as a linear combination of $d^{\text {th}}$ powers of roots of $L_\perp $ , so by apolarity $L_\perp \in (F(p))^\perp $ .
Next, suppose that $L_\perp $ does not have simple roots, and define the following ‘reduction of order’ polynomial
with the key property that $L_\perp $ divides $\widetilde {L_\perp }^2$ . We claim that
To see this, note that for $f \in H$ , one has $f \in \ker (q_\ell ) \iff q_\ell (f) = 0$ (as $q_\ell $ is PSD on H – this need not be the case if $q_\ell $ were indefinite). Together with Lemma 3.6, this gives the second equality. For the first equality, note that $L_\perp \in (\widetilde {L_\perp }) \implies (L_\perp )_d \cap H \subseteq (\widetilde {L_\perp })_d \cap H$ . Conversely, any $f \in (\widetilde {L_\perp })_d \cap H$ is of the form $f := g \widetilde {L_\perp }$ (for some $g \in {\mathbb {R}}[x,y]_{d-\deg \widetilde {L_\perp }}$ ), hence satisfies $q_\ell (f) = \langle g^2(\widetilde {L_\perp })^2, L \rangle = 0$ , since $L_\perp $ divides $\widetilde {L_\perp }^2$ , and $\langle L_\perp , L \rangle = 0$ .
In view of Equation (9): Given that $L_\perp \ne \widetilde {L_\perp }$ , one has $(L_\perp )_d \subsetneq (\widetilde {L_\perp })_d$ , but since the intersections of these subspaces with the hyperplane H coincide, it must be the case that $\dim (L_\perp )_d$ , $\dim (\widetilde {L_\perp })_d$ differ by exactly $1$ (note that dimension decreases by at most $1$ when intersecting with a hyperplane and does not change precisely when the subspace is already contained in the hyperplane). From this, we deduce that $(L_\perp )_d \subseteq H$ , hence $L_\perp \in (F(p))^\perp $ by Lemma 2.5. (Note that this argument also shows that $\deg L_\perp \le 1 + \deg \widetilde {L_\perp }$ , which gives another proof that $L_\perp $ has at most one multiple real root, which must be of multiplicity $\le 3$ ).
Putting the above results together, we see that $L_\perp \in (F(p))^\perp $ has almost real roots, so ${\operatorname {ar-rk}}(F(p)) \le \deg L_\perp = s$ . Now, $\dim \ker (Q_\ell ) = \dim (L)^\perp _d = \dim (L_\perp )_d = d-s+1$ , so ${\operatorname {rank}}(Q_\ell ) = d+1 - \dim \ker (Q_\ell ) = s$ , and by Corollary 3.10, ${\operatorname {rank}}(q_\ell ) = {\operatorname {rank}}(Q_\ell ) - 2$ . Thus, ${\operatorname {rank}}(\ell ) = {\operatorname {rank}}(q_\ell ) = s - 2 \ge {\operatorname {ar-rk}}(F(p)) - 2$ . Since this holds for any ray $\ell $ with $\ker (q_\ell )$ basepoint-free, in particular it holds for any extreme ray of $\Sigma _X^\star $ which is not a point evaluation, so $\eta (X) \ge {\operatorname {ar-rk}}(F(p)) - 2$ as desired.
7. Almost real rank
As shown by our main result Theorem 1.2, the almost real rank of a form is an interesting quantity to study. In this final section, we investigate the almost real rank of binary forms in general. To begin, the following remark characterizes some cases where the almost real rank is small.
Remark 7.1. Let $d \ge 3$ and $F \in {\mathbb {R}}[x,y]_d$ , with apolar ideal $(F)^\perp = (F_\perp , F^\circ )$ of type $(d_1, d_2)$ . Then:
1. ${\operatorname {ar-rk}}(F) = d_1 \iff F_\perp $ has almost real roots.
2. If ${\operatorname {ar-rk}}(F)> d_1$ , then ${\operatorname {ar-rk}}(F) \ge d_2$ .
3. ${\operatorname {ar-rk}}(F) = 1 \iff d_1 = 1 \iff {{\mathbb {R}}\text{-}\operatorname {rk}}(F) = 1$ .
4. ${\operatorname {ar-rk}}(F) = 2 \iff d_1 = 2 \iff {{\mathbb {C}}\text{-}\operatorname {b.rk}}(F) = 2$ .
5. ${\operatorname {ar-rk}}(F) = 3 \iff d_1 = 3$ and $F_\perp $ is not a cube (of a linear form).
(If $d_1 = d_2$ , we interpret ‘ $F_\perp $ has almost real roots’ to mean ‘there exists a form in $(F)^\perp _{d_1}$ with almost real roots’ and similarly in (5)).
Remark 7.2. One can stratify all degree d binary forms by almost real rank as follows: Write $V_i := H^0(\mathcal {O}_{{\mathbb {P}}^1}(i))$ for the vector space of (real) degree i binary forms. Let $\varphi _{1,d} : {\mathbb {P}}(V_1) \to {\mathbb {P}}(V_d)$ be the $d^{\text {th}}$ Veronese map, and for $r \ge 2$ , define the map
(here, $q_1, q_2$ are the degree d forms corresponding to the complex linear factors of the quadric q as in Section 5; for example, if $q = l^2$ , then $q_1 = l^d$ , $q_2 = l_\perp (l)^{d-1}$ ). By Lemma 2.4, the image of $\varphi _{r,d}$ is precisely the set of degree d binary forms of almost real rank $\le r$ . Restricting $\varphi _{r,d}$ to the (open) subset where $q, l_0, \ldots , l_{r-3}$ are relatively prime, and removing the image of $\varphi _{r-1, d}$ , gives the set of degree d binary forms of almost real rank exactly r.
From this description, one can deduce various structural properties of the set of forms of a given almost real rank. For instance, $\varphi _{2,d}$ is injective (for $d \ge 3$ ), so the set of forms with almost real rank $\le 2$ has dimension $3$ . Also, when $r = \lfloor \frac {d+2}{2} \rfloor = \lfloor \frac {d}{2} \rfloor + 1$ , $\varphi _{r,d}$ is dominant, corresponding to the fact that the generic type is $(r, d+2-r)$ , and among forms of degree r, those with almost real roots are typical. For dimension reasons, this is the least value of r for which $\varphi _{r,d}$ can be dominant, with general fibers of dimension $0$ (resp. $1$ ) when d is odd (resp. even).
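The dimension count behind the last two claims can be tabulated directly. The sketch below assumes that the source of $\varphi _{r,d}$ has dimension $2r - 1$ (r roots on ${\mathbb {P}}^1$ and r coefficients, modulo scaling); this parameter count is our reading of the construction, consistent with the dimension $3$ stated for $r = 2$, and is not a formula quoted from the text. The target ${\mathbb {P}}(V_d)$ has dimension d.

```python
def source_dim(r):
    """Assumed parameter count: r roots plus r coefficients, modulo scaling."""
    return 2 * r - 1

def least_dominant_r(d):
    """Smallest r for which the source is at least as large as the target P(V_d)."""
    r = 1
    while source_dim(r) < d:
        r += 1
    return r

# For each degree d: (least admissible r, expected dimension of a general fiber).
table = {d: (least_dominant_r(d), source_dim(least_dominant_r(d)) - d)
         for d in range(3, 13)}
```

Under this assumption the least admissible r agrees with $\lfloor \frac {d}{2} \rfloor + 1$ in every case tabulated, with expected fiber dimension $0$ for odd d and $1$ for even d.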
It is natural to ask what the maximal almost real rank is for binary forms of degree d. This is answered by the next theorem:
Theorem 7.3. For any $d \ge 3$ and $F \in {\mathbb {R}}[x,y]_d$ , ${\operatorname {ar-rk}}(F) \le d-1$ .
Proof. First, we reduce to the case that $1<{{\mathbb {C}}\text{-}\operatorname {rk}}(F) < d$ . If ${{\mathbb {C}}\text{-}\operatorname {rk}}(F) = d$ , then the apolar ideal $(F)^\perp $ is of type $(2, d)$ (see Remark 2.7), so ${\operatorname {ar-rk}}(F) = 2 \le d-1$ by Remark 7.1(4). Additionally, if ${{\mathbb {C}}\text{-}\operatorname {rk}}(F) = 1$ , then ${\operatorname {ar-rk}}(F) = 1$ as well. Thus, we may assume $2 \le {{\mathbb {C}}\text{-}\operatorname {rk}}(F) \le d-1$ .
We now induct on d. For the base case $d = 3$ , the apolar ideal is of type $(2,3)$ , so again ${\operatorname {ar-rk}}(F) \le 2$ . For the inductive step, choose any direction $u = (u_1, u_2) \in {\mathbb {R}}^2$ , corresponding to a linear form $l_u(x,y) := u_1x + u_2y$ . Then by induction, the apolar ideal of the directional derivative $D_u(F) = \langle l_u, F \rangle $ contains a form with almost real roots of degree $\le d-2$ (note that $D_u(F) \ne 0$ , since ${{\mathbb {C}}\text{-}\operatorname {rk}}(F)> 1$ by assumption $\implies l_u \not \in (F)^\perp $ ). By multiplying an additional factor if necessary, we may choose $G \in (D_u(F))^\perp $ of degree $= d-2$ with almost real roots. Then $G \cdot l_u \in (F)^\perp $ is of degree $d-1$ . Since ${{\mathbb {C}}\text{-}\operatorname {rk}}(F) \le d-1$ , we may also choose $H \in (F)^\perp $ of degree $= d-1$ with simple complex roots.
We claim that for sufficiently small $\epsilon \in {\mathbb {R}}$ , the form $G_\epsilon := G \cdot l_u + \epsilon H \in (F)^\perp $ has almost real roots. First, observe that there are only finitely many $\epsilon $ such that $G_\epsilon $ does not have simple roots: These are given by the roots of the discriminant of $G_\epsilon $ , viewed as a polynomial in $\epsilon $ (note that this polynomial is nonzero since H has simple roots). Thus, by avoiding these finitely many choices of $\epsilon $ , we may assume that $G_\epsilon $ has simple roots, and so it suffices to show that $G_\epsilon $ has at most $1$ pair of complex roots.
For $|\epsilon |$ sufficiently small, any simple root of $G \cdot l_u$ gives a simple root of $G_\epsilon $ (by dehomogenizing we may consider a simple root of a univariate real polynomial, which is, for example, negative to the left of the root and positive to the right, and this is stable under small perturbation). Thus, we need only consider the following cases: (i) $G \cdot l_u$ has a triple root, and (ii) $G \cdot l_u$ has two double roots. In case (i), since $G_\epsilon $ has simple roots, the triple root of $G \cdot l_u$ induces either three distinct real roots of $G_\epsilon $ , or one real root and one complex pair, and since all other roots of $G \cdot l_u$ are real and simple in this case, we get at most $1$ pair of complex roots of $G_\epsilon $ .
In case (ii), suppose $G \cdot l_u$ has two double roots, and let p be one of these. If p is a root of H, then p is also a root of $G_\epsilon $ for any $\epsilon $ , so the double root p of $G \cdot l_u$ induces two real roots of $G_\epsilon $ (one of which is p, which implies that the other root must be real). Otherwise, if p is not a root of H, then $G \cdot l_u$ will either be nonnegative or nonpositive in a neighborhood of p while $H(p)$ is nonzero, so by choosing the sign of $\epsilon $ appropriately, the double root p of $G \cdot l_u$ will again induce distinct real roots of $G_\epsilon $ . Hence, in either case the other double root of $G \cdot l_u$ gives at most one complex pair of roots of $G_\epsilon $ .
Remark 7.4. There are some instances in which the type of the apolar ideal determines the almost real rank. Some cases of this are listed in Remark 7.1. Another example of this occurs in degree 6: If a real binary sextic F has an apolar ideal of type $(4,4)$ , then ${\operatorname {ar-rk}}(F) = 4$ . To see this, note that if $F_\perp , F^\circ $ were both $4^{\text {th}}$ powers, then $F_\perp - F^\circ $ has almost real roots. Moreover, if both $F_\perp $ and $F^\circ $ have two pairs of complex roots, then $F_\perp , F^\circ $ are globally positive (up to sign), in which case a suitable ${\mathbb {R}}$ -linear combination of $F_\perp $ , $F^\circ $ has at least a pair of real roots. Thus, without loss of generality $F_\perp $ has at most one root of multiplicity 3, or two double roots, or one double root and one complex pair of roots, and by the reasoning in the proof of Theorem 7.3, there exists a form in $(F)^\perp _4$ with almost real roots.
We next characterize when the maximal almost real rank of $d-1$ is achieved, which serves as a converse of Theorem 7.3:
Theorem 7.5. Let $d \ge 5$ and $F \in {\mathbb {R}}[x,y]_d$ . Then ${\operatorname {ar-rk}}(F) = d-1 \iff F_\perp $ is a cube of a linear form $\iff (F)^\perp $ contains a cube of a linear form (but no quadratic forms).
Proof. If $F_\perp $ is a cube of a linear form, then $(F)^\perp $ is of type $(3, d-1)$ and ${\operatorname {ar-rk}}(F) \ge d-1$ by Remark 7.1(2, 5). Conversely, we show that if $d \ge 5$ and $(F)^\perp $ contains no cubes, then ${\operatorname {ar-rk}}(F) \le d-2$ , by induction on d.
We first rule out small types: Let $(d_1, d_2)$ be the type of $(F)^\perp $ . If $d_1 \le 3$ , then (with the assumptions of no cubes) ${\operatorname {ar-rk}}(F) \le d-2$ by Remark 7.1. This is enough to cover the base case $d = 5$ , and by Remark 7.4, this also covers the case $d = 6$ . Thus, we assume for the remainder of the proof that $d_1 \ge 4$ .
Now, suppose F is a form of degree $d \ge 7$ . Note that either $(D_x(F))^\perp $ or $(D_y(F))^\perp $ does not contain a cube of a linear form: If not, say $l_1^3 \in (D_x(F))^\perp $ and $l_2^3 \in (D_y(F))^\perp $ , then $(F)^\perp $ would contain two independent quartics $xl_1^3, yl_2^3$ , which can only happen if $d_1 \le 3$ (since $d_1 = 4 \implies d_2 = d-2 \ge 5$ ), which has already been covered. Without loss of generality, we may assume $(D_x(F))^\perp $ does not contain a cube of a linear form. By induction, there is a form $g \in (D_x(F))^\perp $ of degree $\le d-3$ with almost real roots. Then $xg \in (F)^\perp $ is of degree $\le d-2$ , and since $F^\circ \in (F)^\perp $ has degree $\le d-2$ as well, the reasoning in the proof of Theorem 7.3 shows that there exists a form in $(F)^\perp _{d-2}$ with almost real roots.
The characterization above yields sharp bounds on the Hankel index for the curves studied in this paper:
Corollary 7.6. Let $X = \pi _p(C_d)$ be a projection of a rational normal curve $C_d$ away from a point $p \in {\mathbb {P}}^d \setminus \sigma _3(C_d)$ . Then $2 \le \eta (X) \le d-4$ . In particular, if $d = 6$ , then $\eta (X) = 2$ .
Proof. If ${\operatorname {ar-rk}}(F(p)) = d-1$ , then $(F(p))^\perp $ contains a cube by Theorem 7.5. By apolarity, this implies that $p \in \sigma _3(C_d)$ , a contradiction. Thus, ${\operatorname {ar-rk}}(F(p)) \le d-2$ , so $\eta (X) \le d - 4$ by Theorem 1.2.
As preparation for determining the typical almost real ranks, it is useful to know explicit forms which attain a given almost real rank. We thus compute the various ranks of monomials $x^{d-i}y^i \in {\mathbb {R}}[x,y]_d$ . When $i = 0$ , $x^{d-i}y^i = x^d$ is a power of a linear form, hence has real (and complex) [border] rank 1. By symmetry, we may therefore assume $1 \le i \le \lfloor \frac {d}{2} \rfloor $ . In general, the apolar ideal is
$$(x^{d-i}y^i)^\perp = (x^{d-i+1},\, y^{i+1}).$$
From this, we see that ${{\mathbb {C}}\text{-}\operatorname {b.rk}}(x^{d-i}y^i) = i+1$ and ${{\mathbb {C}}\text{-}\operatorname {rk}}(x^{d-i}y^i) = d-i+1$ (see Remark 2.7). Since $x^{d-i}y^i$ has all real roots, we also have ${{\mathbb {R}}\text{-}\operatorname {rk}}(x^{d-i}y^i) = d$ .
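As a quick sanity check on the apolar ideal of a monomial, the following Python sketch applies monomial differential operators to $x^{d-i}y^i$ . It assumes that apolarity acts by differentiation and that the two generators are $x^{d-i+1}$ and $y^{i+1}$ (of degrees $d-i+1$ and $i+1$ , matching the ranks just quoted). The helper names `apply_monomial` and `monomial` and the sample values $d = 8$ , $i = 3$ are purely illustrative.

```python
from math import factorial

def apply_monomial(p, q, form):
    """Apply d^p/dx^p d^q/dy^q to a binary form stored as {(a, b): coeff},
    where the key (a, b) stands for the monomial x^a * y^b."""
    out = {}
    for (a, b), c in form.items():
        if a >= p and b >= q:
            # Falling factorials produced by repeated differentiation.
            coeff = c * (factorial(a) // factorial(a - p)) * (factorial(b) // factorial(b - q))
            key = (a - p, b - q)
            out[key] = out.get(key, 0) + coeff
    # Drop zero coefficients so that the zero form is the empty dict.
    return {k: v for k, v in out.items() if v != 0}

def monomial(d, i):
    """The monomial x^(d-i) * y^i as a dictionary."""
    return {(d - i, i): 1}

d, i = 8, 3  # illustrative values only
F = monomial(d, i)

# The assumed generators x^(d-i+1) and y^(i+1) annihilate F ...
kills_x = apply_monomial(d - i + 1, 0, F) == {}
kills_y = apply_monomial(0, i + 1, F) == {}

# ... while the next lower pure powers do not.
survives_x = apply_monomial(d - i, 0, F) != {}
survives_y = apply_monomial(0, i, F) != {}
```

All four flags evaluate to `True` for these sample values, consistent with generators in degrees $i+1$ and $d-i+1$ .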
Proposition 7.7. For $d \ge 1$ and $0 \le i \le \lfloor \frac {d}{2} \rfloor $ ,
Proof. The cases $i = 0, 1$ follow from Remark 7.1; the case $i = 2$ is covered by Theorem 7.5. This includes all cases with $d \le 5$ .
It thus suffices to show that if $d \ge 6$ and $3 \le i \le \lfloor \frac {d}{2} \rfloor $ , then ${\operatorname {ar-rk}}(x^{d-i}y^i)> d-3$ . The cases $d = 6$ (resp. $d = 7$ ) are covered by Remark 7.4 (resp. Remark 7.1). Now, suppose $d \ge 8$ . Every form of degree $d-3$ in $(x^{d-i}y^i)^\perp $ can be expressed as
$$a_0 x^{d-3} + a_1 x^{d-4}y + \cdots + a_{i-4} x^{d-i+1}y^{i-4} + b_{d-i-4} x^{d-i-4}y^{i+1} + \cdots + b_1 x y^{d-4} + b_0 y^{d-3}$$
with $(i-3) + (d-i-3) = d - 6$ coefficients $a_0, \ldots , a_{i-4}, b_{d-i-4}, \ldots , b_0 \in {\mathbb {R}}$ , where there are no $a_j$ terms if $i = 3$ (so that the support of this polynomial has a gap of size 4). By Descartes’ rule of signs, the number of distinct nonzero real roots of this polynomial is at most the number of sign changes between adjacent coefficients, hence is $\le d - 7$ . Thus, ${\operatorname {ar-rk}}(x^{d-i}y^i)> d-3$ , and so Theorems 7.3 and 7.5 imply that ${\operatorname {ar-rk}}(x^{d-i}y^i) = d-2$ .
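The monomial count behind this rule-of-signs estimate can be checked mechanically. The Python sketch below assumes that $(x^{d-i}y^i)^\perp $ is generated by $x^{d-i+1}$ and $y^{i+1}$ , lists the exponents of y that occur in degree $d-3$ , and confirms that there are $d-6$ of them with exactly four consecutive exponents missing. The function names are hypothetical and the values $d = 10$ , $i = 4$ are only a sample.

```python
def support_in_degree(d, i, e):
    """Exponents j such that x^(e-j) * y^j lies in the ideal (x^(d-i+1), y^(i+1))."""
    return [j for j in range(e + 1) if e - j >= d - i + 1 or j >= i + 1]

def missing_exponents(d, i, e):
    """Exponents j in 0..e whose monomial x^(e-j) * y^j is NOT in the ideal."""
    present = set(support_in_degree(d, i, e))
    return [j for j in range(e + 1) if j not in present]

def max_sign_changes(num_coefficients):
    # A sequence of n coefficients has at most n - 1 sign changes.
    return max(num_coefficients - 1, 0)

d, i = 10, 4  # sample values with d >= 8 and 3 <= i <= d // 2
support = support_in_degree(d, i, d - 3)   # exponents of y that can appear
gap = missing_exponents(d, i, d - 3)       # the run of absent exponents
bound = max_sign_changes(len(support))     # sign-change bound, equal to d - 7
```

For the sample values, `support` is `[0, 5, 6, 7]` and `gap` is `[1, 2, 3, 4]`, so the form has $d - 6 = 4$ coefficients and at most $d - 7 = 3$ sign changes.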
In particular, we see that for monomial projections, the almost real rank is essentially independent of i (and depends only on whether $X_i$ is contained in the rational normal surface scroll $S(1, d-3)$ ) and is much larger than the complex border rank (with a gap of at least $\lceil \frac {d}{2} \rceil - 3$ , hence the gap is unbounded as $d \to \infty $ ).
An amusing corollary of Proposition 7.7 is the existence, in any degree $\ge 4$ , of univariate real polynomials with almost real roots whose supports have a gap of size 3, that is, the rule of signs bound is sharp for these polynomials (although the existence of such polynomials is not sufficient to prove Proposition 7.7). For more on the sharpness of the rule of signs bound, see [Reference Grabiner11].
Finally, we consider the problem of determining which almost real ranks are typical. Our presentation follows that of [Reference Blekherman2]. Recall that a property P of degree d forms is said to be typical if, on identifying the set of degree d forms with ${\mathbb {R}}^{d+1}$ , there is a nonempty Euclidean open set of degree d forms all of which have property P. We say that an almost real rank r is typical if the property ‘has almost real rank $= r$ ’ is typical. For $F\in {\mathbb {R}}[x,y]_d$ , we say that F is a typical form of almost real rank r if F lies in an open set of ${\mathbb {R}}[x,y]_d$ which consists of forms of almost real rank r.
Note that the condition ‘ $(F)^\perp $ contains a cube’ in Theorem 7.5 is equivalent to saying that F has a real root of multiplicity $\ge d-2$ , which is not a typical property. It follows that $d-1$ is not a typical almost real rank. Moreover, Remark 7.2 implies that any $r < \lfloor \frac {d+2}{2}\rfloor $ cannot be a typical almost real rank. It turns out that these are the only obstructions for an almost real rank to be typical, as will be shown in Theorem 7.9. To this end, we first characterize the typical forms of a given almost real rank:
Lemma 7.8. Let $F\in {\mathbb {R}}[x,y]_d$ with $(F)^\perp $ of generic type, and set $r = {\operatorname {ar-rk}} F$ . Then F is a typical form of almost real rank r if and only if all forms in $(F)^\perp _{r-1}$ have at least two pairs of complex roots (counted with multiplicity).
Proof. Suppose that F is typical of almost real rank r and, for contradiction, that there exists $g \in (F)^\perp _{r-1}$ such that g has at most one pair of complex roots. In any $\epsilon $ -neighborhood of g there exists a form $g_\epsilon $ such that $g_\epsilon $ has almost real roots. For any $\epsilon> 0$ , we have $\dim (g_\epsilon )_d = \dim (g)_d=d-r+2$ , and as $\epsilon $ approaches $0$ , $(g_\epsilon )_d$ approaches $(g)_d$ . Therefore, the orthogonal complement of $(g_\epsilon )_d$ also approaches the orthogonal complement of $(g)_d$ as $\epsilon $ goes to $0$ . We conclude that in any neighborhood of F there exist forms of almost real rank at most $r-1$ , which contradicts the assumption that F is typical of almost real rank r.
Conversely, let $F\in {\mathbb {R}}[x,y]_d$ with $(F)^\perp $ of generic type and ${\operatorname {ar-rk}} F = r$ . Suppose that all forms in $(F)^\perp _{r-1}$ have at least two pairs of complex roots. For $\epsilon> 0$ sufficiently small, the $\epsilon $ -neighborhood of F contains only forms with apolar ideals of generic type (as having nongeneric type is a Zariski-closed condition). For such $\epsilon $ , fix $F_\epsilon $ in the $\epsilon $ -neighborhood of F. Within this neighborhood, the ideal $(F_\epsilon )^\perp $ (i.e., the sequence of graded components of $(F_\epsilon )^\perp $ ) depends continuously on the coefficients of $F_\epsilon $ . Now, both conditions ‘all forms in $(F)^\perp _{r-1}$ have at least two pairs of complex roots’ and ‘there exists a form in $(F)^\perp _r$ with almost real roots’ are stable under sufficiently small perturbation, which shows that F is typical of almost real rank r.
Theorem 7.9. For $d \ge 5$ , any r with $\lfloor \frac {d+2}{2}\rfloor \le r \le d-2$ is a typical almost real rank.
Proof. We first show that $d-2$ is always a typical almost real rank. By Theorem 7.3 and Theorem 7.5, it suffices to show that for each $d \ge 5$ , there exists a nonempty open set of degree d forms with almost real rank $> d-3$ . For $5 \le d \le 7$ , we may verify this directly: If $d = 5$ , then a general form (which is of type $(3,4)$ ) has almost real rank $3$ ; the case $d = 6$ is covered by Remark 7.4; and for $d = 7$ , there is a nonempty open set of forms F of type $(4, 5)$ for which $F_\perp $ has only complex roots (i.e., is a product of two strictly positive quadrics). For $d \ge 8$ , it follows from Lemma 7.8 and the proof of Proposition 7.7 that the ‘balanced’ monomial $x^{\lceil \frac {d}{2} \rceil }y^{\lfloor \frac {d}{2} \rfloor }$ (which is of generic type) is a typical form of almost real rank $d-2$ .
For the remaining ranks, we induct on the degree d. For the base cases $d = 5, 6$ , we have that $d-2 = \lfloor \frac {d+2}{2} \rfloor $ is a typical almost real rank by the above. For the inductive step, fix the following data:
1. a rank $\lceil \frac {d+2}{2} \rceil \le r \le d-2$ ,
2. a typical form $F \in {\mathbb {R}}[x,y]_d$ of almost real rank r (by perturbing F if necessary, we may assume that $(F)^\perp = (F_\perp , F^\circ )$ is of generic type),
3. a nonzero form $S := C_1 F_\perp + C_2 F^\circ \in (F)^\perp _r$ with almost real roots.
We will exhibit a form H of degree $d+1$ such that $(H)^\perp $ is of generic type, $(H)^\perp \subseteq (F)^\perp $ and $S \in (H)^\perp $ . By Lemma 7.8, this shows that H is a typical form of almost real rank r, so r is a typical almost real rank in degree $d+1$ . This is enough for the induction since we already know that $(d+1)-2$ is a typical almost real rank in degree $d+1$ (note also that $\lceil \frac {d+2}{2} \rceil = \lfloor \frac {(d+1)+2}{2} \rfloor $ ). We consider two cases depending on the parity of d, namely $d = 2k$ for $k \ge 3$ , or $d = 2k-1$ for $k \ge 4$ .
First, suppose $d = 2k-1$ is odd so that $\deg F_\perp = k$ , $\deg F^\circ = k+1$ . We claim that there exists a linear form $L \in {\mathbb {R}}[x,y]_1$ such that $C_1 - LC_2$ has a real root which is not a root of $LF_\perp + F^\circ $ . If not, then for every linear form L, we have that every root of $C_1 - LC_2$ is a root of $LF_\perp + F^\circ $ . Now, for any $(a,b) \in {\mathbb {R}}^2$ with $F_\perp (a,b) \ne 0$ and $C_2(a,b) \ne 0$ , there exists a linear form L such that $L(a,b) = \frac {C_1(a,b)}{C_2(a,b)}$ , that is, $(a,b)$ is a root of $C_1 - LC_2$ . By assumption, $(a,b)$ is also a root of $LF_\perp + F^\circ $ , so $L(a,b) = \frac {-F^\circ (a,b)}{F_\perp (a,b)}$ . Varying over such $(a,b)$ , we see that the two rational functions $C_1/C_2$ and $-F^\circ /F_\perp $ agree at infinitely many points, hence must be equal. But this implies that $S = C_1F_\perp + C_2F^\circ = 0$ , a contradiction. We conclude that such an L exists. For such L, set $G := L F_\perp + F^\circ $ , write $C_1 - LC_2 = L_1K$ , where $L_1 \in {\mathbb {R}}[x,y]_1$ does not divide G, and take H to be the unique form of degree $d+1$ with apolar ideal generated by $(L_1F_\perp , G)$ . Then $(H)^\perp \subseteq (F)^\perp $ , and $S = (C_1-L C_2)F_\perp + C_2 G = K(L_1F_\perp ) + C_2 G \in (H)^\perp $ as desired.
The reasoning in the case $d = 2k$ is similar: Here, $\deg (F_\perp ) = \deg (F^\circ ) = k+1$ . We claim that there exists $\alpha \in {\mathbb {R}}$ such that $C_1 - \alpha C_2$ has a real root which is not a root of $\alpha F_\perp + F^\circ $ . This follows from the same reasoning as in the case $d = 2k-1$ (in fact even simpler, since there is no choice involved in the scalar $\alpha $ , as opposed to a linear form). Having obtained such an $\alpha $ , we set $G := \alpha F_\perp + F^\circ $ , write $C_1 - \alpha C_2 = L_0 K$ , where $L_0 \in {\mathbb {R}}[x,y]_1$ does not divide G, and take H to be the unique form of degree $d+1$ with apolar ideal generated by $(L_0F_\perp , G)$ . Then as before, $(H)^\perp \subseteq (F)^\perp $ and $S = (C_1 - \alpha C_2)F_\perp + C_2G = K(L_0 F_\perp ) + C_2G \in (H)^\perp $ .
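The degree bookkeeping in the two parity cases can be double-checked with a few lines of Python. The sketch assumes that ‘generic type’ for a degree d form means generator degrees $(d_1, d_2)$ with $d_1 + d_2 = d + 2$ that are as balanced as possible, which matches the types $(3,4)$ for $d = 5$ and $(4,5)$ for $d = 7$ mentioned above. The names `generic_type`, `lifted_type` and `ceil_half` are hypothetical helpers, not notation from the text.

```python
def generic_type(deg):
    """Balanced generator degrees (d1, d2) with d1 <= d2 and d1 + d2 = deg + 2."""
    d1 = (deg + 2) // 2
    return (d1, deg + 2 - d1)

def lifted_type(d):
    """Degrees of the two generators of (H)^perp built from a degree d form F."""
    deg_f_perp, deg_f_circ = generic_type(d)
    # In both parity cases G has the same degree as F^circ, and the other
    # generator is a linear form times F_perp.
    return tuple(sorted((deg_f_perp + 1, deg_f_circ)))

def ceil_half(n):
    """Ceiling of n / 2 using integer arithmetic only."""
    return -(-n // 2)

# For each d, record whether (H)^perp has the generic type in degree d + 1
# and whether ceil((d+2)/2) equals floor(((d+1)+2)/2).
checks = [
    (d, lifted_type(d) == generic_type(d + 1), ceil_half(d + 2) == (d + 3) // 2)
    for d in range(5, 21)
]
```

Every entry of `checks` carries two `True` flags, so over this range of d the generator degrees of $(H)^\perp $ agree with the generic type in degree $d+1$ and the lower bounds on r line up between consecutive degrees.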
Competing interest
The authors have no competing interest to declare.
Financial support
Grigoriy Blekherman and Jaewoo Jung were partially supported by NSF grant DMS-1901950.