1 Introduction
Consider the symbolic dynamical system $(\mathcal {A}^{\mathbb {N}},\mathscr {F},\mu ,\theta )$ in which $\mathcal A$ is a finite alphabet, $\theta $ is the left-shift map and $\mu $ is a shift-invariant probability measure, that is, $\mu \circ \theta ^{-1}=\mu $. We are interested in the statistical properties of the return time $R_n(x)$, the first time the orbit of $x$ comes back to the $n$th cylinder $[x_0^{n-1}]=[x_0,\ldots ,x_{n-1}]$ (that is, the set of all $y\in \mathcal {A}^{\mathbb {N}}$ coinciding with $x$ on the first $n$ symbols).
The main contribution of this paper is the calculation of the return-time $L^q$-spectrum (or cumulant generating function) in the class of equilibrium states (a subclass of shift-invariant ergodic measures; see §2.1). More specifically, consider a potential $\varphi $ having summable variation (this includes Hölder continuous potentials, for which the variation decreases exponentially fast). Our main result, Theorem 3.1, states that if its unique equilibrium state, denoted by $\mu _\varphi $, is not of maximal entropy, then
$$\lim_{n\to\infty}\frac{1}{n}\log \int R_n^q \,d\mu_\varphi=\begin{cases} P((1-q)\varphi) & \text{if } q\geq q_\varphi^*,\\ \sup_\eta \int \varphi \,d\eta & \text{if } q<q_\varphi^*,\end{cases}$$
where $P(\cdot )$ is the topological pressure, the supremum is taken over shift-invariant probability measures, and $q_\varphi ^*\in ]-1,0[$ is the unique solution of the equation
$$P((1-q)\varphi)=\sup_\eta \int \varphi \,d\eta .$$
We also prove that when $\varphi $ is a potential corresponding to the measure of maximal entropy, then $q^*_\varphi =-1$ and the return-time $L^q$-spectrum is piecewise linear (Theorem 3.2). In this case, and only in this case, the return-time spectrum coincides with the waiting-time $L^q$-spectrum that was previously studied in [Reference Chazottes and Ugalde7] (see §2.2 for definitions). It is fair to say that the expressions of the return-time and waiting-time $L^q$-spectra are unexpected, and that it is surprising that they only coincide if $\mu _\varphi $ is the measure of maximal entropy.
Below we will list some implications of this result, and how it relates to the literature.
1.1 The ansatz $R_n(x)\longleftrightarrow 1/\mu _\varphi ([x_0^{n-1}])$
A remarkable result [Reference Ornstein and Weiss15, Reference Shields18] is that, for any ergodic measure $\mu $,
$$\lim_{n\to\infty}\frac{1}{n}\log R_n(x)=h(\mu)\quad \text{for } \mu\text{-almost every } x,$$
where $h(\mu )=-\lim _n ({1}/{n}) \sum _{a_0^{n-1}\in \mathcal {A}^n} \mu ([a_0^{n-1}]) \log \mu ([a_0^{n-1}])$ is the entropy of $\mu $. Compare this result with the Shannon–McMillan–Breiman theorem which says that
$$\lim_{n\to\infty}-\frac{1}{n}\log \mu([x_0^{n-1}])=h(\mu)\quad \text{for } \mu\text{-almost every } x.$$
Hence, using return times, we do not need to know $\mu $ to estimate the entropy, but only to assume that we observe a typical output $x=x_0,x_1,\ldots $ of the process. In particular, combining the two previous pointwise convergences, we can write $R_n(x) \asymp 1/\mu ([x_0^{n-1}])$ for $\mu $-almost every $x$. This yields the natural ansatz
$$R_n(x)\longleftrightarrow 1/\mu_\varphi([x_0^{n-1}]) \tag{1}$$
when integrating with respect to $\mu _\varphi $ . However, it is a consequence of our main result that this ansatz is not correct for the $L^q$ -spectra. Indeed, for the class of equilibrium states we consider (see §2.2),
which means that the $L^q$-spectrum of the measure and the return-time $L^q$-spectrum are different when $q<q^*_\varphi $.
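The pointwise convergence of $(1/n)\log R_n(x)$ to $h(\mu)$ recalled above suggests a plug-in entropy estimate from a single observed sequence. The following Python sketch is only an illustration under stated assumptions: the names `first_return_time` and `entropy_estimate` are hypothetical, the return time is taken to be the smallest $k\geq 1$ at which the first $n$ symbols reappear, and only a finite prefix of $x$ is available, so `None` is returned when no return is visible.

```python
import math
import random
from typing import Optional, Sequence


def first_return_time(x: Sequence[int], n: int) -> Optional[int]:
    """Smallest k >= 1 with x[k:k+n] == x[0:n], or None if no return is
    visible in the finite prefix x (an artefact of truncation)."""
    head = list(x[:n])
    for k in range(1, len(x) - n + 1):
        if list(x[k:k + n]) == head:
            return k
    return None


def entropy_estimate(x: Sequence[int], n: int) -> Optional[float]:
    """Plug-in estimate (1/n) * log R_n(x) of the entropy, in nats."""
    r = first_return_time(x, n)
    return None if r is None else math.log(r) / n


if __name__ == "__main__":
    # Illustrative run on i.i.d. symbols with P(1) = 1/3 (assumed example).
    rng = random.Random(0)
    sample = [1 if rng.random() < 1 / 3 else 0 for _ in range(200_000)]
    entropy = -(1 / 3) * math.log(1 / 3) - (2 / 3) * math.log(2 / 3)
    print("estimate:", entropy_estimate(sample, 12), "entropy:", entropy)
```

For a single value of $n$ the estimate may fluctuate noticeably around the entropy; the convergence is only asymptotic, and the fluctuations are precisely what the $L^q$-spectrum quantifies.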
1.2 Fluctuations of return times
When $\mu _\varphi $ is the equilibrium state of a potential $\varphi $ of summable variation, there is a uniform control of the measure of cylinders, in the sense that $\log \mu _\varphi ([x_0^{n-1}])=\sum _{i=0}^{n-1} \varphi (x_i^{\infty })\pm \text {Const}$, where the constant is independent of $x$ and $n$. Moreover, $h(\mu _\varphi )=-\int \varphi \,d\mu _\varphi $, so it is tempting to think that the fluctuations of $({1}/{n})\log R_n(x)$ should be the same as those of $-({1}/{n})\sum _{i=0}^{n-1} \varphi (x_i^{\infty })$, in the sense of the central limit and large deviation asymptotics. Indeed, when $\varphi $ is Hölder continuous, it was proved in [Reference Collet, Galves and Schmitt8] that $\sqrt {n}(\log R_n/n-h(\mu _\varphi ))$ converges in law to a Gaussian random variable $\mathcal {N}(0,\sigma ^2)$, where $\sigma ^2$ is the asymptotic variance of $(({1}/{n})\sum _{i=0}^{n-1} \varphi (X_i^{\infty }))$. This was extended to potentials with summable variation in [Reference Chazottes and Ugalde7]. In plain words, $(({1}/{n})\log R_n(x))$ has the same central limit asymptotics as $(({1}/{n})\sum _{i=0}^{n-1} \varphi (x_i^{\infty }))$.
In [Reference Collet, Galves and Schmitt8], large deviation asymptotics of $(({1}/{n})\log R_n(x))$ , when $\varphi $ is Hölder continuous, were also considered. It is proved therein that, on a sufficiently small (non-explicit) interval around $h(\mu _\varphi )$ , the so-called rate function coincides with the rate function of $(-({1}/{n})\sum _{i=0}^{n-1} \varphi (x_i^{\infty }))$ . The latter is known to be the Legendre transform of $P((1-q)\varphi )$ . Using the Legendre transform of the return-time $L^q$ -spectrum, a direct consequence of our main result (see Theorem 3.3) is that, when $\varphi $ has summable variation, the coincidence of the rate functions holds on a much larger (and explicit, depending on $q^*_\varphi $ ) interval around $h(\mu _\varphi )$ . In other words, we extend the large deviation result of [Reference Collet, Galves and Schmitt8] in two ways: we deal with more general potentials and we get a much larger interval for the values of large deviations.
Notice that a similar result was deduced in [Reference Chazottes and Ugalde7] for the waiting time, based on the Legendre transform of the waiting-time $L^q$ -spectrum. In any case, this strategy cannot work to compute the rate functions of $(({1}/{n})\log R_n)$ and $(({1}/{n})\log W_n)$ , because the corresponding $L^q$ -spectra fail to be differentiable. Obtaining the complete description of large deviation asymptotics for $(({1}/{n})\log R_n)$ and $(({1}/{n})\log W_n)$ is an open question to date.
1.3 Relation to the return-time dimensions
Consider a general ergodic dynamical system $(M,T,\mu )$ and replace cylinders by (Euclidean) balls in the above return-time $L^q$-spectrum, that is, consider the function $q\mapsto \int \tau _{B(x,\varepsilon )}^q(x) \,d\mu (x)$, where $\tau _{B(x,\varepsilon )}(x)$ is the first time the orbit of $x$ under $T$ comes back to the ball $B(x,\varepsilon )$ of center $x$ and radius $\varepsilon $. The idea is to introduce return-time dimensions $D_\tau (q)$ by postulating that $\int \tau _{B(x,\varepsilon )}^q(x) \,d\mu (x)\approx \varepsilon ^{D_\tau (q)}$, as $\varepsilon \downarrow 0$. This was done in [Reference Hadyn, Luevano, Mantica and Vaienti11] (with a different ‘normalization’ in $q$) and compared numerically with the classical spectrum of generalized dimensions $D_\mu (q)$ defined in a similar way, with $\mu (B(x,\varepsilon ))^{-1}$ instead of $\tau _{B(x,\varepsilon )}(x)$ (geometric counterpart of the ansatz (1)). They studied an iterated function system in dimension one and numerically observed that return-time dimensions and generalized dimensions do not coincide. This can be understood with analytical arguments. Working with (Euclidean) balls in dynamical systems with a phase space $M$ of dimension higher than one is more natural than working with cylinders, but it is much more difficult. It is an interesting open problem to obtain an analog of our main result even for uniformly hyperbolic systems. For recent progress, more references and new perspectives, see [Reference Caby, Faranda, Mantica, Vaienti and Yiou6].
1.4 Further recent literature
We now come back to large deviations for return times and comment on other results related to ours, besides [Reference Collet, Galves and Schmitt8]. In [Reference Jain and Bansal12], the authors obtain the following result. For a $\phi $ -mixing process with an exponentially decaying rate, and satisfying a property called ‘exponential rates for entropy’, there exists an implicit positive function I such that $I(0)=0$ and
where h is the entropy of the process. In the same vein, [Reference Coutinho, Rousseau and Saussol9] considered the case of (geometric) balls in smooth dynamical systems.
1.5 A few words about the proof of the main theorem
For $q>0$, an important ingredient of the proof is an approximation of the distribution of $R_n(x)\mu _\varphi ([x_0^{n-1}])$ by an exponential law, with a precise error term, recently proved in [Reference Abadi, Amorim and Gallo1]. Using this result, the computation of the return-time $L^q$-spectrum is straightforward. The range $q<0$ is much more delicate. To get upper and lower bounds for $\log \int R_n^q \,d\mu _{\varphi }$, we have to partition $\mathcal {A}^{\mathbb {N}}$ over all cylinders; in particular, we cannot only take into account cylinders which are ‘typical’ for $\mu _\varphi $. A crucial role is played by orbits which come back after fewer than $n$ iterations under the shift in cylinders of length $n$. Such orbits are closely related to periodic orbits. What happens is roughly the following. There are two terms in competition in the ‘$(1/n)\log $ limit’. The first one is
where $T_{[a_0^{n-1}]}(x)$ is the first time that the orbit of x enters $[a_0^{n-1}]$ , and $\tau ([a_0^{n-1}])$ is the smallest first return time among all $y\in [a_0^{n-1}]$ . The second term is
Depending on the value of $q<0$, when we take the logarithm and then divide by $n$, the first term (2) will beat the second one in the limit $n\to \infty $, or vice versa. Since the second term (3) behaves like ${e}^{nP((1-q)\varphi )}$, and since we prove that the first one behaves like ${e}^{n \sup _\eta \int \varphi \,d\eta }$, this indicates why the critical value $q^*_\varphi $ shows up. The asymptotic behavior of the first term (2) is rather delicate to analyze (see Proposition 4.2), and is an important ingredient of the present paper.
1.6 Organisation of the paper
The framework and the basic definitions are given in §2. In §2.1, we collect basic facts about equilibrium states and topological pressure. In §2.2, we define $L^q$ -spectra for measures, return times and waiting times. In §3, we give our main results and two simple examples in which all the involved quantities can be explicitly computed. The proofs are given in §4.
2 Setting and basic definitions
2.1 Shift space and equilibrium states
2.1.1 Notation and framework
For any sequence $(a_k)_{k\geq 0}$ where $a_k\in \mathcal {A}$ , we denote the partial sequence (‘string’) $(a_i,a_{i+1},\ldots ,a_j)$ by $a_i^j$ , for $i<j$ . (By convention, $a_i^i:=a_i$ .) In particular, $a_i^\infty $ denotes the sequence $(a_k)_{k \geq i}$ .
We consider the space $\mathcal {A}^{\mathbb {N}}$ of infinite sequences $x=(x_0,x_1,\ldots )$, where $x_i\in \mathcal {A}$, $i\in \mathbb {N}:=\{0,1,\ldots \}$. Endowed with the product topology, $\mathcal {A}^{\mathbb {N}}$ is a compact space. The cylinder sets $[a_i^j]=\{x \in \mathcal {A}^{\mathbb {N}}: x_i^j=a_i^j\}$, $i,j\in \mathbb {N}$, generate the Borel $\sigma $-algebra $\mathscr {F}$. Now define the shift $\theta :\mathcal {A}^{\mathbb {N}}\to \mathcal {A}^{\mathbb {N}}$ by $(\theta x)_i=x_{i+1}$, $i\in \mathbb {N}$. Let $\mu $ be a shift-invariant probability measure on $\mathscr {F}$, that is, $\mu (B)=\mu (\theta ^{-1}B)$ for each cylinder $B$. We then consider the stationary process $(X_k)_{k\geq 0}$ on the probability space $(\mathcal {A}^{\mathbb {N}},\mathscr {F},\mu )$, where $X_n(x)=x_n$, $n\in \mathbb {N}$. We will use the shorthand notation $X_i^j$ for $(X_i,X_{i+1},\ldots ,X_j)$, where $i<j$. As usual, $\mathscr {F}_i^j$ is the $\sigma $-algebra generated by $X_i^j$, where $0\leq i\leq j\leq \infty $. We denote by $\mathscr {M}_\theta (\mathcal {A}^{\mathbb {N}})$ the set of shift-invariant probability measures. This is a compact set in the weak topology.
2.1.2 Equilibrium states and topological pressure
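We refer to [Reference Bowen4, Reference Walters22] for details on the material of this section. We consider potentials of the form $\beta \varphi $, where $\beta \in \mathbb {R}$ and $\varphi :\mathcal {A}^{\mathbb {N}}\to \mathbb {R}$ is of summable variation, that is,
$$\sum_{n\geq 1}\operatorname{var}_n(\varphi)<\infty ,$$
where
$$\operatorname{var}_n(\varphi)=\sup\{|\varphi(x)-\varphi(y)| : x_0^{n-1}=y_0^{n-1}\}.$$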
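Obviously, $\beta \varphi $ is of summable variation for each $\beta $, and it has a unique equilibrium state denoted by $\mu _{\beta \varphi }$. This means that it is the unique shift-invariant measure such that
$$\sup_{\eta\in\mathscr{M}_\theta(\mathcal{A}^{\mathbb{N}})}\Big\{h(\eta)+\int \beta\varphi\,d\eta\Big\}=h(\mu_{\beta\varphi})+\int \beta\varphi\,d\mu_{\beta\varphi}=P(\beta\varphi), \tag{4}$$
where $P(\beta \varphi )$ is the topological pressure of $\beta \varphi $.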
For convenience, we ‘normalize’ $\varphi $ as explained in [Reference Walters22, Corollary 3.3], which implies, in particular, that
$$P(\varphi)=0.$$
This gives the same equilibrium state $\mu _\varphi $ . (Since $\sum _{a\in \mathcal {A}}{e}^{\varphi (ax)}=1$ for all $x\in \mathcal {A}^{\mathbb {N}}$ , we have $\varphi <0$ .)
The maximal entropy is $\log |\mathcal {A}|$ and, because $P(\varphi )=0$, the measure of maximal entropy is the equilibrium state of the potentials of the form $u-u\circ \theta -\log |\mathcal {A}|$ for some continuous function $u:\mathcal {A}^{\mathbb {N}}\to \mathbb {R}$.
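We will use the following property, often referred to as the ‘Gibbs property’. There exists a constant $C=C_\varphi \geq 1$ such that, for any $n\geq 1$, any cylinder $[a_0^{n-1}]$ and any $x\in [a_0^{n-1}]$,
$$C^{-1}\leq \frac{\mu_\varphi([a_0^{n-1}])}{\exp\big(\sum_{k=0}^{n-1}\varphi(x_k^\infty)\big)}\leq C. \tag{5}$$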
See [Reference Parry and Pollicott16], where one can easily adapt the proof of their Proposition 3.2 to generalize their Corollary 3.2.1 to get (5) with $C=\exp (\sum _{k\geq 1}\operatorname {var}_k(\varphi ))$ . We will also often use the following direct consequence of (5). For $g\ge 0$ , $m,n\ge 1$ and $a_0^{m-1}\in \mathcal {A}^{m},b_0^{n-1}\in \mathcal {A}^n$ , we have
For completeness, the proof is given in an appendix.
For the topological pressure of $\beta \varphi $ , we have the formula
One can easily check that $P(\psi +u-u\circ \theta +c)=P(\psi )+c$ for any continuous potential $\psi $, any continuous $u:\mathcal {A}^{\mathbb {N}}\to \mathbb {R}$ and any $c\in \mathbb {R}$. The map $\beta \mapsto P(\beta \varphi )$ is convex and continuously differentiable with
$$\frac{d}{d\beta}P(\beta\varphi)=\int \varphi \,d\mu_{\beta\varphi}.$$
It is strictly decreasing since $\varphi <0$. Moreover, it is strictly convex if and only if $\mu _\varphi $ is not the measure of maximal entropy, that is, the equilibrium state for a potential of the form $u-u\circ \theta -\log |\mathcal {A}|$, where $u$ is continuous. We refer to [Reference Takens and Verbitski21] for a proof of these facts.
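As a sanity check on these properties, consider the simplest case of a product (Bernoulli) measure with marginal $(p_a)_{a\in \mathcal {A}}$. In that case one can take $\varphi (x)=\log p_{x_0}$, so that $P(\beta \varphi )=\log \sum _a p_a^\beta $ and $\mu _{\beta \varphi }$ is the product measure with marginal proportional to $p_a^\beta $. The Python sketch below relies on these closed forms (which are specific to this illustrative case) and compares a finite-difference derivative of $\beta \mapsto P(\beta \varphi )$ with $\int \varphi \,d\mu _{\beta \varphi }$; the function names are hypothetical.

```python
import math
from typing import Sequence


def pressure_bernoulli(beta: float, probs: Sequence[float]) -> float:
    """P(beta * phi) = log(sum_a p_a**beta) for phi(x) = log p_{x_0}."""
    return math.log(sum(p ** beta for p in probs))


def mean_potential(beta: float, probs: Sequence[float]) -> float:
    """Integral of phi against the equilibrium state of beta * phi, i.e.
    against the product measure with marginal p_a**beta / Z."""
    weights = [p ** beta for p in probs]
    z = sum(weights)
    return sum((w / z) * math.log(p) for w, p in zip(weights, probs))


def pressure_derivative(beta: float, probs: Sequence[float],
                        step: float = 1e-6) -> float:
    """Central finite-difference approximation of d/dbeta P(beta * phi)."""
    upper = pressure_bernoulli(beta + step, probs)
    lower = pressure_bernoulli(beta - step, probs)
    return (upper - lower) / (2.0 * step)
```

With `probs = (1/3, 2/3)`, the two quantities agree up to discretisation error, the derivative is negative (the pressure decreases), and `pressure_bernoulli(1.0, probs)` vanishes, in line with the normalization $P(\varphi )=0$.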
2.2 Hitting times, recurrence times and related $L^q$ -spectra
2.2.1 Hitting and recurrence times
Given $x\in \mathcal {A}^{\mathbb {N}}$ and $a_0^{n-1}\in \mathcal A^n$, the (first) hitting time of $x$ to $[a_0^{n-1}]$ is
$$T_{a_0^{n-1}}(x)=\inf\{k\geq 1: x_k^{k+n-1}=a_0^{n-1}\},$$
that is, the first time that the pattern $a_0^{n-1}$ appears in $x$. The (first) return time is defined by
$$R_n(x)=\inf\{k\geq 1: x_k^{k+n-1}=x_0^{n-1}\},$$
that is, the first time that the first $n$ symbols reappear in $x$. Finally, given $x,y\in \mathcal {A}^{\mathbb {N}}$, we define the waiting time
$$W_n(x,y)=\inf\{k\geq 1: y_k^{k+n-1}=x_0^{n-1}\},$$
which is the first time that the first $n$ symbols of $x$ appear in $y$.
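To make the three notions concrete, here is a minimal Python sketch operating on finite prefixes of sequences. It assumes the convention that times are counted from $k\geq 1$ (so that the hitting time of a point lying in the cylinder is its return time), and it returns `None` when the pattern is not seen in the available prefix; the function names are hypothetical.

```python
from typing import Optional, Sequence


def hitting_time(x: Sequence, pattern: Sequence) -> Optional[int]:
    """Smallest k >= 1 such that x[k:k+n] equals pattern (n = len(pattern)),
    or None if the pattern does not occur in the finite prefix x."""
    n = len(pattern)
    target = list(pattern)
    for k in range(1, len(x) - n + 1):
        if list(x[k:k + n]) == target:
            return k
    return None


def return_time(x: Sequence, n: int) -> Optional[int]:
    """First time k >= 1 at which the first n symbols of x reappear in x."""
    return hitting_time(x, x[:n])


def waiting_time(x: Sequence, y: Sequence, n: int) -> Optional[int]:
    """First time k >= 1 at which the first n symbols of x appear in y."""
    return hitting_time(y, x[:n])
```

For example, `return_time("abcabcab", 2)` is `3`, while `waiting_time("abzz", "xxabx", 2)` is `2`.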
2.2.2 $L^q$ -spectra
Consider a sequence $(U_n)_{n\ge 1}$ of positive measurable functions on some probability space $(\mathcal {A}^{\mathbb {N}},\mathscr {F},\mu )$ , where $\mu $ is shift-invariant, and define, for each
and $n\in \mathbb {N}^*$ , the quantities
and
Definition 2.1. ( $L^q$ -spectrum of $(U_n)_{n\ge 1}$ )
When for all , this defines the $L^q$ -spectrum of $(U_n)_{n\ge 1}$ , denoted by .
We will be mainly interested in three sequences of functions, which are, for $n\ge 1$ ,
Corresponding to (8), we naturally associate the functions
where for the third one, we mean that we integrate, in (8), with $\mu \otimes \mu $ : in other words, x and y are drawn independently and according to the same law $\mu $ . Finally, according to Definition 2.1, when the limits exist, we let
be the $L^q$ -spectrum of the measure, the return-time $L^q$ -spectrum and the waiting-time $L^q$ -spectrum, respectively.
The existence of these spectra is not known in general. Trivially, , and . It is easy to see that for ergodic measures (this follows from Kač’s lemma).
In this paper, we are interested in the particular case where $\mu =\mu _\varphi $ is an equilibrium state of a potential $\varphi $ of summable variation. In this setting, it is easy to see (this follows from (5) and (7)) that $\mathcal M_\varphi $ exists and that, for all $q\in \mathbb {R}$,
$$\mathcal M_\varphi(q)=P((1-q)\varphi). \tag{9}$$
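On the other hand, as mentioned in the introduction, [Reference Chazottes and Ugalde7] proved, in the same setting, that
$$\lim_{n\to\infty}\frac{1}{n}\log \int W_n^q \,d(\mu_\varphi\otimes\mu_\varphi)=\begin{cases} P((1-q)\varphi) & \text{if } q\geq -1,\\ P(2\varphi) & \text{if } q<-1.\end{cases} \tag{10}$$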
One of the main objectives of the present paper is to compute the return-time $L^q$-spectrum (and, in particular, to show that it exists).
3 Main results
3.1 Two preparatory results
We start with two propositions about the critical value of q below which we will prove that the return-time $L^q$ -spectrum is different from the $L^q$ -spectrum of $\mu _\varphi $ .
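Proposition 3.1. Let $\varphi $ be a potential of summable variation. Then, the equation
$$\mathcal M_\varphi(q)=\sup_{\eta\in\mathscr{M}_\theta(\mathcal{A}^{\mathbb{N}})}\int \varphi\,d\eta \tag{11}$$
has a unique solution $q^*_\varphi \in [-1,0[$. Moreover, $q_\varphi ^*=-1$ if and only if $\varphi =u-u\circ \theta -\log |\mathcal {A}|$ for some continuous function $u:\mathcal {A}^{\mathbb {N}}\to \mathbb {R}$.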
See §4.1 for the proof.
The following (non-positive) quantity naturally shows up in the proof of the main theorem. Given a probability measure $\nu $, let
$$\gamma^+(\nu)=\lim_{n\to\infty}\frac{1}{n}\log \max_{a_0^{n-1}\in\mathcal{A}^n}\nu([a_0^{n-1}])$$
whenever the limit exists. In fact, we have the following variational formula for $\gamma ^+(\mu _\varphi )$.
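Proposition 3.2. Let $\varphi $ be a potential of summable variation. Then $\gamma _\varphi ^+:=\gamma ^+(\mu _\varphi )$ exists and
$$\gamma_\varphi^+=\sup_{\eta\in\mathscr{M}_\theta(\mathcal{A}^{\mathbb{N}})}\int \varphi\,d\eta . \tag{12}$$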
The proof is given in §4.2.
3.2 Main results
We can now state our main results.
Theorem 3.1. (Return-time $L^q$ -spectrum)
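Let $\varphi $ be a potential of summable variation. Assume that $\varphi $ is not of the form $u-u\circ \theta -\log |\mathcal {A}|$ for some continuous function $u:\mathcal {A}^{\mathbb {N}}\to \mathbb {R}$ (i.e., $\mu _\varphi $ is not the measure of maximal entropy). Then the return-time $L^q$-spectrum exists, and we have
$$\lim_{n\to\infty}\frac{1}{n}\log \int R_n^q \,d\mu_\varphi=\begin{cases} P((1-q)\varphi) & \text{if } q\geq q_\varphi^*,\\ \sup_{\eta\in\mathscr{M}_\theta(\mathcal{A}^{\mathbb{N}})}\int \varphi \,d\eta & \text{if } q<q_\varphi^*,\end{cases}$$
where $q_\varphi ^*$ is given in Proposition 3.1.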
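In view of (9) and (12), the previous formula can be rewritten as
$$\lim_{n\to\infty}\frac{1}{n}\log \int R_n^q \,d\mu_\varphi=\begin{cases} \mathcal M_\varphi(q) & \text{if } q\geq q_\varphi^*,\\ \gamma_\varphi^+ & \text{if } q<q_\varphi^*.\end{cases}$$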
In other words, the return-time $L^q$ -spectrum coincides with the $L^q$ -spectrum of the equilibrium state only for $q\geq q_\varphi ^*$ .
We subsequently deal with the measure of maximal entropy because, for that measure, the return-time and the waiting-time spectra coincide.
In view of the waiting-time $L^q$-spectrum given in (10), which was computed by [Reference Chazottes and Ugalde7], we see that, if $\varphi $ is not of the form $u-u\circ \theta -\log |\mathcal {A}|$, then the return-time and waiting-time $L^q$-spectra differ on the interval $]-\infty ,q_\varphi ^*[\,\supsetneq \,]-\infty , -1[$. The fact that $P(2\varphi )<\sup _{\eta \in \mathscr {M}_\theta (\mathcal {A}^{\mathbb {N}})} \int \varphi \,d\eta $ follows from the proof of Proposition 3.1, in which we prove that $q_\varphi ^*>-1$ in that case.
Figure 1 illustrates Theorem 3.1.
We now consider the case where $\mu _\varphi $ is the measure of maximal entropy.
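Theorem 3.2. (Coincidence of the return-time and waiting-time spectra)
The return-time $L^q$-spectrum coincides with the waiting-time $L^q$-spectrum if and only if $\varphi =u-u\circ \theta -\log |\mathcal {A}|$ for some continuous function $u:\mathcal {A}^{\mathbb {N}}\to \mathbb {R}$. In that case, for all $q\in \mathbb {R}$, both spectra are equal to
$$\begin{cases} q\log|\mathcal{A}| & \text{if } q\geq -1,\\ -\log|\mathcal{A}| & \text{if } q<-1.\end{cases}$$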
3.3 Consequences on large deviation asymptotics
Let $\varphi $ be a potential of summable variation and assume that it is not of the form $u-u\circ \theta -\log |\mathcal {A}|$ for some continuous function u. Let
We define the function
by
where $q(v)$ is the unique real number $q\in ]q^*_\varphi ,+\infty [$ such that
. It is easy to check that
. (This is because
is strictly convex, by the assumption we made on $\varphi $ , and is strictly increasing.) Notice that since
, we have $h(\mu _\varphi )\in ]v^*_\varphi ,v^+_\varphi [$ , and, in that interval,
is strictly convex and only vanishes at $v=h(\mu _\varphi )$ .
We have the following result.
Theorem 3.3. Let $\varphi $ be a potential of summable variation and assume that it is not of the form $u-u\circ \theta -\log |\mathcal {A}|$ for some continuous function u. Then, for all $v\in [h(\mu _\varphi ),v^+_\varphi [$ ,
For all $v\in ]v^*_\varphi ,h(\mu _\varphi )[$ ,
Proof. We apply a theorem from [Reference Plachky and Steinebach17], a variant of the classical Gärtner–Ellis theorem [Reference Dembo and Zeitouni10], which roughly says that the rate function is the Legendre transform of the cumulant generating function in the interval where it is continuously differentiable. The return-time $L^q$-spectrum is not differentiable at $q=q_\varphi ^*$, since its left derivative there is $0$ whereas its right derivative is $-\int \varphi \,d\mu _{(1-q^*_\varphi )\varphi }>0$. Hence, we apply the large deviation theorem from [Reference Plachky and Steinebach17] for $q\in ]q^*_\varphi ,+\infty [$ to prove the theorem.
Remark 3.1. Theorem 3.3 says nothing about the asymptotic behaviour of ${\mu _\varphi ((1/n)\log R_n < v)}$ when $v\leq v^*_\varphi $. Notice that the situation is similar for the large deviation rate function of waiting times; the only difference is that we take $-1$ in place of $q^*_\varphi $ and, therefore, $-\int \varphi \,d \mu _{2\varphi }$ in place of $v^*_\varphi $. We believe that there exists a non-trivial rate function describing the large deviation asymptotics for these values of $v$ for both return and waiting times, but this has to be proved using another method.
3.4 Some explicit examples
3.4.1 Independent random variables
The return-time and hitting-time spectra are non-trivial even when $\mu $ is a product measure, that is, even for a sequence of independent random variables taking values in $\mathcal {A}$. Take, for instance, $\mathcal {A}=\{0,1\}$ and let $\mu =m^{\mathbb {N}}$, where $m$ is a Bernoulli measure on $\mathcal {A}$ with parameter $p_1\neq \tfrac 12$. This corresponds to a potential $\varphi $ which is locally constant on the cylinders $[0]$ and $[1]$. We can identify it with a function from $\mathcal {A}$ to $\mathbb {R}$ such that $\varphi (1)=\log p_1$. For concreteness, let us take $p_1=\tfrac 13$. Then, it is easy to verify that
and
and hence $P(2\varphi )<\gamma ^+_\varphi $ , as expected. Solving equation (11) numerically gives
So, in this case, Theorem 3.1 reads
We refer to Figure 1 where this spectrum is plotted, together with the $L^q$-spectrum of the measure and the waiting-time $L^q$-spectrum.
Remark 3.2. One can check that, as $p_1\to \tfrac 12$ , and $\lim \nolimits _{p_1\to {1/2}} q^*_\varphi =-1$ , as expected.
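The quantities of this example can be checked numerically. The Python sketch below assumes the closed forms valid for a product measure, namely $P(\beta \varphi )=\log \sum _a p_a^\beta $ and $\gamma _\varphi ^+=\log \max _a p_a$, and solves $P((1-q)\varphi )=\gamma ^+_\varphi $ on $[-1,0]$ by bisection, using that $q\mapsto P((1-q)\varphi )$ is increasing. The function names are hypothetical.

```python
import math
from typing import Callable, Sequence


def pressure_bernoulli(beta: float, probs: Sequence[float]) -> float:
    """P(beta * phi) = log(sum_a p_a**beta) for a product measure."""
    return math.log(sum(p ** beta for p in probs))


def gamma_plus_bernoulli(probs: Sequence[float]) -> float:
    """Exponential decay rate of the largest cylinder: log(max_a p_a)."""
    return math.log(max(probs))


def critical_q(pressure: Callable[[float], float], gamma_plus: float,
               tol: float = 1e-12) -> float:
    """Solve pressure(1 - q) = gamma_plus for q in [-1, 0] by bisection."""
    def gap(q: float) -> float:
        return pressure(1.0 - q) - gamma_plus

    lo, hi = -1.0, 0.0
    if gap(lo) >= 0.0:  # degenerate case: the solution is q = -1
        return lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    probs = (2 / 3, 1 / 3)  # p_1 = 1/3, as in the example
    q_star = critical_q(lambda b: pressure_bernoulli(b, probs),
                        gamma_plus_bernoulli(probs))
    print("P(2 phi) =", pressure_bernoulli(2.0, probs))
    print("gamma+   =", gamma_plus_bernoulli(probs))
    print("q*       =", q_star)
```

For $p_1=\tfrac 13$ this gives $P(2\varphi )=\log (5/9)<\gamma _\varphi ^+=\log (2/3)$ and a critical value close to $-0.6728$; for the uniform measure ($p_1=\tfrac 12$) the routine returns $-1$.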
3.4.2 Markov chains
If a potential $\varphi $ depends only on the first two symbols, that is, $\varphi (x)=\varphi (x_0,x_1)$, then the corresponding process is a Markov chain. For Markov chains on $\mathcal {A}=\{1,\ldots ,K\}$ with matrix $(Q(a,b))_{a,b\in \mathcal {A}}$, a well-known result [Reference Szpankowski19, for instance] states that
where $\mathcal {C}_{\ell }$ is the set of cycles of distinct symbols of $\mathcal {A}$ , with the convention that $a_{i+1}=a_i$ (circuits). On the other hand, it is well known [Reference Szpankowski19] that
where $\lambda _{\ell }$ is the largest eigenvalue of the matrix $((Q(a,b))^\ell )_{a,b\in \mathcal {A}}$. This means that, in principle, everything is explicit for the Markov case. In practice, calculations are intractable even with some innocent-looking examples. Let us restrict to binary Markov chains ($\mathcal {A}=\{0,1\}$) which enjoy reversibility. In this case, (13) simplifies to
(See, for instance [Reference Kamath and Verdú14].) If we further assume symmetry, that is, $Q(1,1)=Q(0,0)$ , then we obtain
and $\gamma ^+_\varphi =\max \{\log Q(0,0),\log Q(0,1)\}$ . If we want to go beyond the symmetric case, the explicit expression of
gets cumbersome. As an illustration, consider the case $Q(0,0)=0.2$ and $Q(1,1)=0.6$ . Then
From (14) we easily obtain $\gamma _\varphi ^+=\log (0.6)$ . The solution of equation (11) can be found numerically: $q^*_\varphi \approx -0.870750$ .
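The numerical values of this binary example can be reproduced with a short computation. The Python sketch below makes two assumptions that should be read as such: the pressure of $\beta \varphi $ is taken to be the logarithm of the largest eigenvalue of the matrix of entrywise powers $(Q(a,b)^\beta )$, and $\gamma _\varphi ^+$ is taken to be the largest mean of $\log Q$ along the three elementary cycles of a two-state chain. All entries of $Q$ are assumed positive, and the function names are hypothetical.

```python
import math
from typing import Callable, Sequence

Matrix = Sequence[Sequence[float]]


def pressure_markov2(beta: float, Q: Matrix) -> float:
    """Log of the largest eigenvalue of the 2x2 matrix (Q(a,b)**beta)."""
    a, b = Q[0][0] ** beta, Q[0][1] ** beta
    c, d = Q[1][0] ** beta, Q[1][1] ** beta
    top = 0.5 * (a + d) + math.sqrt(0.25 * (a - d) ** 2 + b * c)
    return math.log(top)


def gamma_plus_markov2(Q: Matrix) -> float:
    """Largest cycle mean of log Q over the cycles (0), (1) and (0,1)."""
    return max(math.log(Q[0][0]),
               math.log(Q[1][1]),
               0.5 * (math.log(Q[0][1]) + math.log(Q[1][0])))


def critical_q(pressure: Callable[[float], float], gamma_plus: float,
               tol: float = 1e-12) -> float:
    """Solve pressure(1 - q) = gamma_plus for q in [-1, 0] by bisection."""
    def gap(q: float) -> float:
        return pressure(1.0 - q) - gamma_plus

    lo, hi = -1.0, 0.0
    if gap(lo) >= 0.0:  # degenerate case: the solution is q = -1
        return lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    Q = [[0.2, 0.8], [0.4, 0.6]]  # Q(0,0) = 0.2 and Q(1,1) = 0.6
    gamma = gamma_plus_markov2(Q)
    print("gamma+ =", gamma)
    print("q*     =", critical_q(lambda b: pressure_markov2(b, Q), gamma))
```

With $Q(0,0)=0.2$ and $Q(1,1)=0.6$, the routine returns $\gamma _\varphi ^+=\log (0.6)$ and a critical value close to $-0.8707$, in agreement with the values quoted above.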
4 Proofs
4.1 Proof of Proposition 3.1
Recall that
$$\mathcal M_\varphi(q)=P((1-q)\varphi).$$
It follows easily from the basic properties of $\beta \mapsto P(\beta \varphi )$ listed above that the map $q\mapsto \mathcal M_\varphi (q)$ is a bijection from $\mathbb {R}$ to $\mathbb {R}$ since it is a strictly increasing $C^1$ function. This implies that the equation $\mathcal M_\varphi (q)=\gamma _\varphi ^+$ has a unique solution $q_\varphi ^*$ that is necessarily strictly negative, since $\gamma _\varphi ^+<0$ (because $\varphi <0$) and $\mathcal M_\varphi (q)< 0$ if and only if $q< 0$ (since $P(\varphi )=0$).
We now prove that $q_\varphi ^*\geq -1$ . We use the variational principle (4) twice, first for $2\varphi $ and then for $\varphi $ , to get
Hence, $q_\varphi ^*\geq -1$ since $q\mapsto \mathcal M_\varphi (q)$ is increasing. Notice that $\mathcal M_\varphi $ is a bijection between $[-1,0]$ and $[P(2\varphi ),0]$ , and $\gamma _\varphi ^+\in [P(2\varphi ),0]$ .
It remains to analyze the ‘critical case’, that is, $q_\varphi ^*=-1$ .
If $\varphi =u-u\circ \theta -\log |\mathcal {A}|$, where $u$ is continuous, then the equation $\mathcal M_\varphi (q)=\gamma _\varphi ^+$ boils down to the equation $q\log |\mathcal {A}|=-\log |\mathcal {A}|$, and hence $q_\varphi ^*=-1$.
We now prove the converse. It is convenient to introduce the auxiliary function
We collect its basic properties in the following lemma, the proof of which is given at the end of this section.
Lemma 4.1. The map has a continuous extension in $0$ , where it takes the value $h(\mu _\varphi )$ . It is $C^1$ and decreasing on $(0,+\infty )$ , and . Moreover, .
The condition $q_\varphi ^*=-1$ is equivalent to $\mathcal M_\varphi (-1)=\gamma _\varphi ^+$ , which, in turn, is equivalent to . But, since decreases to $-\gamma _\varphi ^+$ , we must have for all $q\geq 1$ , and hence the right derivative of at $1$ is equal to $0$ , but, since is differentiable, this implies that the left derivative of at $1$ is also equal to $0$ . Hence, . But, by the last statement of the lemma, this means that $h(\mu _{2\varphi })+\int \varphi \,d\mu _{2\varphi }=0$ , which is possible if and only if $\mu _{2\varphi }=\mu _\varphi $ , by the variational principle (since $h(\eta )+\int \varphi \,d\eta =0$ if and only if $\eta =\mu _\varphi $ ). In turn, this equality holds if and only if there exists a continuous function and such that $2\varphi =\varphi + u-u\circ \theta +c$ , which is equivalent to
Since $P(\varphi )=0$ , one must have $c=-\log |\mathcal {A}|$ .
The proof of the proposition is complete.
Proof of Lemma 4.1
Since
we can use l’Hospital’s rule to conclude that
where we used the variational principle for $\varphi $ . Hence, we can extend
at $0$ (and denote the continuous extension by the same symbol). Then, since the pressure function is $C^1$ , we have for $q>0$ , and using the variational principle twice, that
Hence,
is $C^1$ and decreases on $(0,+\infty )$ . Taking $q=1$ gives the last statement of the lemma. Finally, we prove that
. By an obvious change of variable and a change of sign, it is equivalent to prove that
By the variational principle applied to $q\varphi $ ,
for any shift-invariant probability measure $\eta $ . Therefore, for any $q>0$ , we get
and hence
Taking $\eta $ to be a maximizing measure for $\varphi $ , we obtain
(By compactness of $\mathscr {M}_\theta (\mathcal {A}^{\mathbb {N}})$ , there exists at least one shift-invariant measure maximizing $\int \varphi \,d\eta $ .) We now use (7). For any $q>0$ , we have the trivial bound
Hence, by taking the limit $n\to \infty $ on both sides, and using (19) (see the next subsection), we have, for any $q>0$ ,
and hence
Combining this inequality with (16) gives (15). The proof of the lemma is complete.
4.2 Proof of Proposition 3.2
For each $n\geq 1$ , let
(We can put a maximum instead of a supremum in the definition of $s_n(\varphi )$ since, by compactness of $\mathcal {A}^{\mathbb {N}}$ , the supremum of the continuous function $x\mapsto \sum _{k=0}^{n-1}\varphi (x_k^\infty )$ is attained for some y.) Fix $n\geq 1$ . We have
Since $\mathcal {A}^{\mathbb {N}}$ is compact and $\varphi $ is continuous, for each n there exists a point $z^{(n)}\in \mathcal {A}^{\mathbb {N}}$ such that
Now, using (5), we get
for any choice of $x_{n}^\infty \in \mathcal {A}^{\mathbb {N}}$ , so we can take $x_{n}^\infty =(z^{(n)})_{n}^\infty $ . By using (18) and (17), we thus obtain
Now, one can check that $(s_n(\varphi ))_n$ is a subadditive sequence such that $\inf _m m^{-1} s_m(\varphi ) \geq -\|\varphi \|_\infty $ . Hence, by Fekete’s lemma (see for example, [Reference Szpankowski20]) $\lim _n n^{-1}s_n(\varphi )$ exists, so the limit of $(\gamma _{\varphi ,n}^+)_{n\ge 1}$ also exists and coincides with $\lim _n n^{-1}s_n(\varphi )$ . We now use the fact that
The proof is found in [Reference Jenkinson13, Proposition 2.1]. This finishes the proof of Proposition 3.2.
4.3 Auxiliary results concerning recurrence times
In this section, we state some auxiliary results which will be used in the proofs of the main theorems and are concerned with recurrence times.
4.3.1 Exponential approximation of return-time distribution
The following result of [Reference Abadi, Amorim and Gallo1] will be important in the proof of Theorem 3.1 for $q>0$ .
We recall that a measure $\mu $ enjoys the $\psi $ -mixing property if there exists a sequence $(\psi (\ell ))_{\ell \geq 1}$ of positive numbers decreasing to zero where
Theorem 4.1. (Exponential approximation under $\psi $ -mixing)
Let $(X_k)_{k\geq 0}$ be a process distributed according to a $\psi $ -mixing measure $\mu $ . There exist constants $C,C'>0$ such that, for any $x\in \mathcal {A}^{\mathbb {N}}$ , $n\geq 1$ and $t\ge \tau (x_0^{n-1})$ ,
where $(\epsilon _n)_n$ is a sequence of positive real numbers converging to $0$ , and where $\tau (x_0^{n-1})$ and $\zeta _\mu (x_0^{n-1})$ are defined in (22) and (23), respectively.
In [Reference Abadi, Amorim and Gallo1], this is Theorem 1, statement 2, combined with Remark 2. A consequence of $\psi $ -mixing is that there exist $c_1,c_2>0$ such that $\mu ([x_0^{n-1}])\leq c_1 {e}^{-c_2 n}$ for all x and n. This also follows from (5) since $\varphi <0$ .
Remark 4.1. Notice that a previous version of the present paper relied on an exponential approximation of the return-time distribution given in [Reference Abadi and Vergne3], but their error term turned out to be wrong for $t\le {1}/{2\mu ([x_0^{n-1}])}$ . This mistake was fixed in [Reference Abadi, Amorim and Gallo1].
Equilibrium states with potentials of summable variation are $\psi $ -mixing.
Proposition 4.1. Let $\varphi $ be a potential of summable variation. Then its equilibrium state $\mu _\varphi $ is $\psi $ -mixing.
Proof. The proof follows easily from (6), for $g=0$. First, notice that this double inequality obviously holds for any $F\in \mathscr F_0^{m-1}$ in place of $a_0^{m-1}\in \mathcal {A}^{m}$. Moreover, by the monotone class theorem, it also holds for any $G\in \mathscr F$ in place of $b_0^{n-1}\in \mathcal {A}^n$, and we obtain that, for any $n\ge 1$, $F\in \mathscr F_0^{n-1}$, $G\in \mathscr F$,
We now apply Theorem 4.1(2) in [Reference Bradley5] to conclude the proof.
Remark 4.2. Let us mention that, although the $\psi $ -mixing property, per se, is not studied in [Reference Walters22], it is a consequence of what is actually proved in the proof of Theorem 3.2 therein.
4.3.2 First possible return time and potential well
For the proof of the main theorem in the case $q<0$ , we will need to consider the short recurrence properties of the measures. The smallest possible return time in a cylinder $[a_0^{n-1}]$ , also called its period, will have a particularly important role. It is defined by
One can check that $\tau (a_0^{n-1})=\inf \{k\geq 1: [a_0^{n-1}]\cap \theta ^{-k}[a_0^{n-1}]\neq \emptyset \}$ . Observe that $\tau (a_0^{n-1})\leq n$ , for all $n\geq 1$ .
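For the full shift, where every word is admissible, the intersection $[a_0^{n-1}]\cap \theta ^{-k}[a_0^{n-1}]$ is non-empty for $k<n$ exactly when the word overlaps itself after a shift by $k$, that is, when $a_k^{n-1}=a_0^{n-1-k}$, and it is always non-empty for $k\geq n$. Under this assumption, the period can be computed directly from the word, as in the following Python sketch (the function name is hypothetical).

```python
from typing import Sequence


def period(word: Sequence) -> int:
    """Smallest k >= 1 such that the cylinder of `word` meets its k-th
    preimage under the shift, assuming all words are admissible.

    For k < n this means word[k:] == word[:n-k]; if no such k exists,
    the period equals n = len(word)."""
    n = len(word)
    symbols = list(word)
    for k in range(1, n):
        if symbols[k:] == symbols[:n - k]:
            return k
    return n
```

For instance, `period("aaaa")` is `1`, `period("abab")` is `2` and `period("abc")` is `3`, which illustrates the bound $\tau (a_0^{n-1})\leq n$.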
Let $\mu $ be a probability measure and assume that it has complete grammar, that is, it gives a positive measure to all cylinders. We denote by $\mu _{a_0^{n-1}}(\cdot ):=\mu ([a_0^{n-1}]\cap \cdot )/\mu ([a_0^{n-1}])$ the measure conditioned on $[a_0^{n-1}]$ . For any $a_0^{n-1}\in \mathcal {A}^n$ , define
This quantity was called potential well in [Reference Abadi, Amorim and Gallo1, Reference Abadi, Cardeño and Gallo2], and shows up as an additional scaling factor in exponential approximations of the distributions of hitting and return times (see, for instance, the next subsection).
Remark 4.3. For $t<\mu ([a_0^{n-1}])\,\tau (a_0^{n-1})$ ,
since, by definition, $\mu _{a_0^{n-1}}(T_{a_0^{n-1}}< \tau (a_0^{n-1}) )=0$ (and hence the rightmost equality in (23)).
As already mentioned, equilibrium states with potential of summable variation are $\psi $ -mixing (see Proposition 4.1). Since, moreover, they have complete grammar, they, therefore, satisfy the conditions of Theorem 2 of [Reference Abadi, Amorim and Gallo1]. This result states that the potential well is bounded away from $0$ , that is,
in which $\zeta _{\varphi }:=\zeta _{\mu _\varphi }$ .
We conclude this subsection with the following proposition which plays an important role in the proof of our main result. Its proof is quite long and, for this reason, it is postponed until §4.6.
Proposition 4.2. Let $\mu _\varphi $ be the equilibrium state of a potential $\varphi $ of summable variation. Then,
4.4 Proof of Theorems 3.1 and 3.2 for $q\ge 0$
Notation 4.1. We will write $\sum _{A\in \mathcal {A}^n}$ for $\sum _{a_0^{n-1}\in \mathcal {A}^n}$ and $\mu _\varphi (A)$ for $\mu _\varphi ([a_0^{n-1}])$ . We will also use the notation $\mu _{\varphi ,A}(\cdot )=\mu _\varphi (A\cap \cdot )/\mu _\varphi (A)$ .
For the case of $q\ge 0$ , we proceed as in [Reference Chazottes and Ugalde7], but we give the proof for completeness. The case $q=0$ is trivial. For any $q>0$ , by a classical formula and a trivial change of variable,
We take into account that $\mu _{\varphi ,A}(T_A\leq t)=0$ for $t<\tau (A)$ . Theorem 3.1 will be proved for $q>0$ if we prove that the above integral is of the order $C\mu _\varphi (A)^{-q}$ for any A. We use the exponential approximation (20) of Theorem 4.1, and the following facts.
• By (24), we have $\inf _A \zeta _\varphi (A)\ge \zeta _\varphi ^->0$, and, by definition, $\zeta _\varphi (A)\le 1$ for all A.
• Consequently, there exists a constant $\varrho>0$ such that, for all n large enough, $\varrho \leq \inf _A \zeta _\varphi (A)-C'\epsilon _n\leq 1/2$.
• For all n large enough, we have $\sup _A(\zeta _\varphi (A)\mu _\varphi (A)\tau (A))\le 1$ since $\zeta _\varphi (A)\leq 1$, $\tau (A)\leq n$ and $\mu _\varphi (A)$ decays exponentially fast to $0$ with a rate independent of A.
By (20), we thus have the following upper bound: there exists $n_0$ such that, for all $n\geq n_0$ and for all A,
Hence, we obtain (after an obvious change of variable)
The right-hand side increases if we replace $\tau (A)\mu _\varphi (A)$ by $0$ in the first two integrals. It follows at once that there is a constant $\tilde {C}(q)>0$ such that, for all n larger than some $\tilde {n}_0$ and for all A,
Hence,
and, therefore, using Proposition 9, we get
Now, by (20), we have the following lower bound: for all $n\geq n_0$ and for all A,
It is left to the reader to check that there exists a constant $\widehat {C}(q)>0$ such that, for n larger than some $\hat {n}_0$ ,
and, therefore, using Proposition 9,
We have thus proved that
exists for all $q\geq 0$ , and
This proves both Theorems 3.1 and 3.2 in this regime. When $\varphi =u-u\circ \theta -\log |\mathcal {A}|$ for some continuous function $u$, we have $P((1-q)\varphi )=q\log |\mathcal {A}|$, and this is the only case in which this function is not strictly convex.
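The classical formula invoked at the start of this subsection is the tail-integral representation $\mathbb {E}[T^q]=\int _0^\infty q\,t^{q-1}\,\mathbb {P}(T>t)\,dt$ for $q>0$. For an integer-valued time $T\geq 1$ the integral reduces to a sum, which the following sketch checks numerically on a toy distribution. The function names and the example distribution are ours, not taken from the paper.

```python
def moment_direct(pmf, q):
    """E[T^q] computed from the definition, pmf = {k: P(T = k)}."""
    return sum(k ** q * pk for k, pk in pmf.items())


def moment_from_tail(pmf, q):
    """E[T^q] = int_0^infty q t^(q-1) P(T > t) dt for q > 0.

    For integer-valued T, P(T > t) is constant on [k, k+1), so the integral
    equals sum_k P(T > k) * ((k+1)^q - k^q).
    """
    kmax = max(pmf)
    total = 0.0
    for k in range(kmax):
        tail_k = sum(pj for j, pj in pmf.items() if j > k)  # P(T > k)
        total += tail_k * ((k + 1) ** q - k ** q)
    return total


if __name__ == "__main__":
    toy = {1: 0.5, 2: 0.25, 4: 0.25}
    for q in (0.5, 1.0, 2.5):
        print(q, moment_direct(toy, q), moment_from_tail(toy, q))
```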
4.5 Proofs of Theorems 3.1 and 3.2 for $q<0$
We continue using Notation 4.1.
Proceeding as above, for any $q<0$ ,
where we integrate from $\mu _\varphi (A){\tau (A)}$ since (see Remark 4.3)
Therefore, we want to estimate the integral
Since $t^{-|q|-1}$ diverges close to $0$ , we see that we need a sufficiently precise control of $\mu _{\varphi ,A}(T_{A}\le {t}/{\mu _\varphi (A)})$ for ‘small’ t. This will be done ‘by hand’, using the results of §4.3.2 instead of Theorem 4.1.
4.5.1 Bounding $\mu _{\varphi ,A}(T_{A}\le {t}/{\mu _\varphi (A)})$
We first consider the case $t\in [\mu _\varphi (A)\tau (A),2[$ and then the case $t\geq 2$ to control the integral (26). (Since we will take the limit $n\to \infty $, we implicitly assume that n is large enough so that $\mu _\varphi (A)\tau (A)$ is smaller than $2$.)
For $t\in [\mu _\varphi (A)\tau (A),2[$ , we first observe that
On the other hand, for any such t, we have
We want to get the upper bound (29) (see below) for the first term on the right-hand side of (27). To get this upper bound, first suppose that $\tau (A)=n$ . Then, in this case, $\mu _{\varphi ,A}(T_{A}\le n-1)=0$ and the inequality is obvious. Thus, we now suppose that $\tau (A)\le n-1$ . Since $\mu _{\varphi ,A}(T_{A}<\tau (A))=0$ and since for any $\tau (A)\le i\le n-1$ (remember that $A=a_0^{n-1}$ ), there is a constant $D\geq 1$ such that
The second inequality is trivial since $a_{n-\tau (a_0^{n-1})}^{n-1}$ is a substring of $a_{n-i}^{n-1}$ . The other two inequalities use (6) for $g=0$ . We deduce from (28) that
We now want an upper bound for the second term on the right-hand side of (27). Using (6) for $g=0$ , we get
Therefore, for any $t\in [\mu _\varphi (A)\tau (A),2[$ ,
For $t\geq 2$ ,
where we used Markov’s inequality and then Kac’s lemma (which holds since $\mu $ is ergodic).
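The two ingredients of the bound for $t\geq 2$ can be checked exactly in a toy case. The sketch below assumes a Bernoulli (product) measure, which is our simplification and not the general setting of the paper, and computes the law of the return time $T_A$ conditioned on starting in $[A]$ by dynamic programming over a pattern-matching automaton. It then verifies Kac's lemma, $\mathbb {E}_A[T_A]=1/\mu (A)$, and the resulting Markov bound. All names are hypothetical.

```python
from collections import defaultdict


def next_state(word, s, c):
    """Length of the longest prefix of `word` that is a suffix of word[:s] + c."""
    w = word[:s] + c
    for length in range(min(len(word), len(w)), -1, -1):
        if w.endswith(word[:length]):
            return length
    return 0


def return_time_pmf(word, p, kmax):
    """P(T_A = k | x starts with A) for k = 1..kmax under the product measure p."""
    n = len(word)
    pmf = {}
    alive = {n: 1.0}  # we start right after a full occurrence of `word`
    for k in range(1, kmax + 1):
        nxt = defaultdict(float)
        hit = 0.0
        for s, mass in alive.items():
            for c, pc in p.items():
                s2 = next_state(word, s, c)
                if s2 == n:
                    hit += mass * pc      # first return happens at time k
                else:
                    nxt[s2] += mass * pc  # no return yet
        pmf[k] = hit
        alive = nxt
    return pmf


if __name__ == "__main__":
    p = {"a": 0.5, "b": 0.5}
    pmf = return_time_pmf("ab", p, 400)
    print(sum(k * v for k, v in pmf.items()))  # compare with 1 / mu([ab])
```

The truncation at `kmax` is harmless here because the tail of $T_A$ decays exponentially for a product measure.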
4.5.2 Integral estimates
Using the bounds for $\mu _{\varphi ,A}(T_{A}\le {t}/{\mu _\varphi (A)})$ that we obtained in the preceding subsection, we can now bound the integral $I(q,[\tau (A)\mu _\varphi (A),\infty ])$ from above and from below.
Lower bound for any $q<0$ . Using (30) and (32), we get
We can choose a suitable constant $c(q)>0$ , ensuring that, for any sufficiently large n, we have $(\mu _\varphi (A)\tau (A))^{-|q|}-2^{-|q|}\ge c(q)(\mu _\varphi (A)\tau (A))^{-|q|}$ , which is itself bounded below by $c(q)(\mu _\varphi (A)n)^{-|q|}$ since $\tau (A)\leq n$ . This gives, for all $q<0$ ,
Upper bounds. Using the upper bounds of (30) and (31),
We have to consider three cases according to the values of q.
• First, assume that $q<-1$. Then,
$$ \begin{align*} I(q,[\mu_\varphi(A)\tau(A),\infty])&\le \frac{1}{|q|}\bigg(nD^2\mu_{\varphi,A}( T_{A}= \tau(A))[\mu_\varphi(A)\tau(A)]^{-|q|} \\ &\quad+ \frac{D|q|}{|q|-1}(\mu_\varphi(A)\tau(A))^{-|q|+1}+2^{-|q|}\bigg). \end{align*} $$
We can take a suitable constant $C(q)>0$, ensuring that, for any sufficiently large n,
$$ \begin{align*} \frac{D|q|}{|q|-1}(\mu_\varphi(A)\tau(A))^{-|q|+1}+2^{-|q|}\le C(q)(\mu_\varphi(A)\tau(A))^{-|q|+1}. \end{align*} $$
Now using that $1\le \tau (A)$, we get
$$ \begin{align} I(q,[\mu_\varphi(A)\tau(A),\infty]) \le\frac{1}{|q|} (nD^2\mu_{\varphi,A}( T_{A}\!=\! \tau(A))\mu_\varphi(A)^{-|q|}+C(q)\mu_\varphi(A)^{-|q|+1}). \tag{34} \end{align} $$
• For $q\in (-1,0)$, putting $C'(q):=\frac{D|q|}{1-|q|}\,2^{-|q|+1}+2^{-|q|}$, we have
$$ \begin{align} I(q,[\mu_\varphi(A)\tau(A),\infty]) \le \frac{1}{|q|}(nD^2\mu_{\varphi,A}( T_{A}= \tau(A))\mu_\varphi(A)^{-|q|}+C'(q)). \tag{35} \end{align} $$
• We conclude with the case $q=-1$. Integrating, we get
$$ \begin{align*} & I(-1,[\mu_\varphi(A)\tau(A),\infty])\\ &\quad \le nD^2\mu_{\varphi,A}( T_{A}= \tau(A))\mu_\varphi(A)^{-1}+D\log\frac{2}{\mu_\varphi(A)\tau(A)}+\frac{1}{2}\\ &\quad \le nD^2\mu_{\varphi,A}( T_{A}= \tau(A))\mu_\varphi(A)^{-1}+D\log\frac{2}{\mu_\varphi(A)}+\frac{1}{2}. \end{align*} $$
Now, since $\mu _\varphi (A)\ge C^{-1}{e}^{-\|\varphi \|_\infty n}$ by (5) (where $C\geq 1$ is independent of A and n), we get, for all n large enough,
$$ \begin{align} I(-1,[\mu_\varphi(A)\tau(A),\infty]) \le nD^2\mu_{\varphi,A}( T_{A}= \tau(A))\mu_\varphi(A)^{-1}+2Dn\|\varphi\|_\infty. \tag{36} \end{align} $$
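For the reader's convenience, here is a sketch of the elementary integrals behind the three cases. Write $a:=\mu _\varphi (A)\tau (A)<2$. We assume, as the displayed right-hand sides suggest, that the integrand is bounded by $(c_1+Dt)\,t^{-|q|-1}$ on $[a,2[$, with $c_1:=nD^2\mu _{\varphi ,A}(T_A=\tau (A))$, and by $t^{-|q|-1}$ on $[2,\infty [$; this reading is ours. Then one only needs
$$ \begin{align*} \int_a^2 t^{-|q|-1}\,dt &= \frac{a^{-|q|}-2^{-|q|}}{|q|}\le \frac{a^{-|q|}}{|q|},\\ \int_a^2 D\,t^{-|q|}\,dt &= \begin{cases} \dfrac{D}{|q|-1}\,(a^{1-|q|}-2^{1-|q|})\le \dfrac{D}{|q|-1}\,a^{1-|q|} & \text{if } q<-1,\\[2mm] D\log\dfrac{2}{a} & \text{if } q=-1,\\[2mm] \dfrac{D}{1-|q|}\,(2^{1-|q|}-a^{1-|q|})\le \dfrac{D}{1-|q|}\,2^{1-|q|} & \text{if } -1<q<0, \end{cases}\\ \int_2^\infty t^{-|q|-1}\,dt &= \frac{2^{-|q|}}{|q|}. \end{align*} $$
The middle integral explains the trichotomy: its size is governed by $a^{1-|q|}$ when $q<-1$, by $\log (1/a)$ when $q=-1$, and it stays bounded when $-1<q<0$.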
4.5.3 Conclusion of the proofs
Let $(a_n), (b_n)$ be two sequences of positive real numbers. The following notion of asymptotic equivalence is convenient in what follows.
We now list the properties that we are going to use to conclude the proofs. By (9), for all ,
By Proposition 4.2,
since $1-\zeta _\varphi (A)=\mu _{\varphi ,A}(T_A=\tau (A))$ (see (23)). By Proposition 3.1, the unique solution of the equation is $q_\varphi ^*\in [-1,0[$ . Finally, we also have to remember that is strictly increasing.
Up to prefactors that are negligible in the sense of $\asymp $ , the proofs will boil down to comparing with $\Lambda _\varphi $ , when q runs through , to see which one of the two ‘wins’ on the logarithmic scale.
We first prove that for $q\leq q_\varphi ^*$ , and for $q>q_\varphi ^*$ . By (25), (26) and (33), for all $q<0$ and for all n large enough,
If $q>q_\varphi ^*$ , , and hence, by (37) and (38), we get . If $q\leq q_\varphi ^*$ , , and hence, by (37) and (38), we get .
We now prove that for $q\leq q_\varphi ^*$ , and for $q>q_\varphi ^*$ .
We first consider the case where $\varphi $ is not of the form $u-u\circ \theta -\log |\mathcal {A}|$ for some continuous function $u$, which is equivalent to $-1<q_\varphi ^*<0$, by Proposition 3.1.
Suppose that $q<-1$ . By (34), for all n large enough, we get
Since
, we obtain
For $-1<q<0$ , for all n large enough, by (35),
Since
when $q\leq q_\varphi ^*$ , we conclude that
. When $q>q_\varphi ^*$ ,
, and hence
. When $q=-1$ , by (36),
so we conclude that
since $-1<q_\varphi ^*$. Therefore, Theorem 3.1 is proved.
To conclude the proof of Theorem 3.2, we now suppose that $\varphi $ is of the form $u-u\circ \theta -\log |\mathcal {A}|$ for some continuous function $u$, which is equivalent to $q_\varphi ^*=-1$, by Proposition 3.1. When $\varphi $ is of that form,
By (10),
coincides with
since $P(2\varphi )=P(0-2\log |\mathcal {A}|)=-\log |\mathcal {A}|$ (since, for any continuous potential $\psi $, any continuous function $v$ and any constant $c\in \mathbb {R}$, one has $P(\psi +v-v\circ \theta +c)=P(\psi )+c$).
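The invariance $P(\psi +v-v\circ \theta +c)=P(\psi )+c$ can be checked numerically in a simple case. The sketch below assumes potentials that depend only on the first two coordinates, for which the pressure on the full shift is the logarithm of the Perron eigenvalue of the matrix $(e^{\psi (a,b)})_{a,b\in \mathcal {A}}$; this restriction and the power-iteration routine are ours, chosen for illustration only.

```python
from math import exp, log


def pressure_two_coordinates(phi, alphabet, iters=2000):
    """log of the Perron eigenvalue of M[a][b] = exp(phi(a, b)), by power iteration."""
    v = {a: 1.0 for a in alphabet}
    lam = 1.0
    for _ in range(iters):
        w = {a: sum(exp(phi(a, b)) * v[b] for b in alphabet) for a in alphabet}
        lam = sum(w.values()) / sum(v.values())  # eigenvalue estimate
        total = sum(w.values())
        v = {a: w[a] / total for a in alphabet}  # renormalize the eigenvector
    return log(lam)


if __name__ == "__main__":
    alphabet = "ab"
    u = {"a": 0.4, "b": -1.1}

    def phi(a, b):
        # Potential cohomologous to the constant -log|A| (first coordinate a, second b).
        return u[a] - u[b] - log(len(alphabet))

    for q in (-1.0, -0.5, 0.5):
        p_val = pressure_two_coordinates(lambda a, b: (1 - q) * phi(a, b), alphabet)
        print(q, p_val, q * log(len(alphabet)))
```

With $q=-1$ this reproduces $P(2\varphi )=-\log |\mathcal {A}|$, and more generally $P((1-q)\varphi )=q\log |\mathcal {A}|$ for potentials of this form.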
4.6 Proof of Proposition 4.2
Proof of Proposition 4.2
Recall that
Since $a_0^{n-1}a_{n-\tau (a_0^{n-1})}^{n-1}=a_0^{\tau (a_0^{n-1})-1}a_0^{n-1}$ ,
Let
We now prove that $\overline {\Lambda }_\varphi \le \gamma _\varphi ^+$ . By (6) (with $g=0$ ),
Partitioning according to the values of $\tau (a_0^{n-1})$ gives
Now, observe that
This implies, in particular, that $\sum _{a_0^{n-1}}\mu _\varphi ([a_0^{\tau (a_0^{n-1})-1}])\le n$ . Coming back to (39), we conclude by Proposition 3.2 that
We now prove that $\underline {\Lambda }_\varphi \ge \gamma _\varphi ^+$ . We need the following lemma, the proof of which is given below.
Lemma 4.2. Let $\varphi $ be a potential of summable variation. Then there exists a sequence of strings $(A_n)_{n\geq 1}$ with $A_n\in \mathcal {A}^n$ such that
For any $n\ge 1$ and any string $a_0^{n-1}$ , we introduce the notation $p_\tau (a_0^{n-1})=a_0^{\tau (a_0^{n-1})-1}$ , which is the prefix of $a_0^{n-1}$ of size $\tau (a_0^{n-1})$ . Now, using (6) (with $g=0$ ) gives
Therefore,
We now use (5) and (6). For any point $x\in A_n$, and using the fact that $\varphi (x)\geq \inf \varphi>-\infty $ (since $\varphi $ is continuous and $\mathcal {A}^{\mathbb {N}}$ is compact), we obtain
Therefore, by Lemma 4.2, we get
which concludes the proof of the proposition.
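The counting bound $\sum _{a_0^{n-1}}\mu _\varphi ([a_0^{\tau (a_0^{n-1})-1}])\le n$ used in the first half of the proof rests on the fact that a string with period $k$ is determined by its first $k$ symbols, so each value of $\tau $ contributes at most total mass one. The following brute-force sketch checks the bound for a Bernoulli measure on two symbols; the choice of measure and the function names are assumptions made for illustration.

```python
from itertools import product
from math import prod


def period(word):
    """Smallest k >= 1 such that word[k:] equals the prefix of the same length."""
    n = len(word)
    return next(k for k in range(1, n + 1) if word[k:] == word[: n - k])


def sum_prefix_measures(n, p):
    """Sum over all strings of length n of mu([prefix of length tau]) for product measure p."""
    total = 0.0
    for letters in product(p, repeat=n):
        word = "".join(letters)
        tau = period(word)
        total += prod(p[c] for c in word[:tau])
    return total


if __name__ == "__main__":
    p = {"a": 0.3, "b": 0.7}
    for n in range(1, 9):
        print(n, sum_prefix_measures(n, p))
```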
Proof of Lemma 4.2
We know that $\gamma _\varphi ^+$ exists by Proposition 3.2. This means that there exists a sequence of strings $(B_i)_{i\geq 1}$ with $B_i\in \mathcal {A}^i$ , such that
Now, let $(k_i)_{i\geq 1}$ be a diverging sequence of positive integers. Then, for each $i\ge 1$ , consider the string $B_{i}^{k_i}$ obtained by concatenating $k_i$ times the string $B_{i}$ , that is,
Using (6) (with $g=0$ ) gives
For any $n\ge 1$ , take the unique integer $i_n$ such that $n\in [ik_i,(i+1)k_{i}-1]$ (we omit the subscript n of $i_n$ to simplify the notation). We write $r=r(i,n):=n-ik_i$ and let $A_n=B_{i}^{k_i} B_{r(i)}$ , where $B_{r(i)}$ is the beginning (or prefix) of size $r(i)$ of $B_i$ , that is,
Therefore,
since i (and, therefore, $k_i$ ) diverges as $n\rightarrow \infty $ . Now, observe that
which gives, using (40),
The right-hand side is equal to
and $({1}/{i})\log \mu ([B_{i}])\rightarrow \gamma _\varphi ^+$ , whereas $k_i (k_i+{r}/{i})^{-1}\rightarrow 1$ . The limit of the left-hand side is also $\gamma _\varphi ^+$ . This concludes the proof of the lemma.
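The construction of $A_n$ in the proof of the lemma can be sketched as follows: repeat a block $B_i$ a number $k_i$ of times and append a prefix of $B_i$ of length $r$. The resulting string overlaps itself after a shift by the block length, so its period is at most $i$; we read this as the point of the construction, although the display stating it is not reproduced here. The function names are hypothetical.

```python
def period(word):
    """Smallest k >= 1 such that word[k:] equals the prefix of the same length."""
    n = len(word)
    return next(k for k in range(1, n + 1) if word[k:] == word[: n - k])


def build_string(block, k, r):
    """Concatenate `block` k times, then append its prefix of length r."""
    if not 0 <= r <= len(block):
        raise ValueError("r must lie between 0 and len(block)")
    return block * k + block[:r]


if __name__ == "__main__":
    for block, k, r in (("ab", 3, 1), ("aab", 4, 2), ("abc", 2, 0)):
        a_n = build_string(block, k, r)
        print(a_n, len(a_n), period(a_n))
```

The period can be strictly smaller than the block length when the block is itself a repetition, for instance for the block `abab`.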
Acknowledgements
VA acknowledges IFSP for financial support. SG acknowledges École Polytechnique for financial support and hospitality during a two-month stay, as well as for other short visits. SG was supported by FAPESP (BPE: 2017/07084-6) and CNPq (PQ 312315/2015-5 and Universal 462064/2014-0). MA and SG acknowledge the FAPESP-FCT joint project between SP-Brazil and Portugal (19805/2014).
A Appendix. Proof of inequalities (6)
To simplify the notation, we write $\mu $ instead of $\mu _\varphi $ . Recall that $\mu _{a_0^{m-1}}$ is the conditional measure $\mu (\,\cdot \, \cap [a_0^{m-1}])/\mu ([a_0^{m-1}])$ (which is well defined). Given $g\geq 0$ , $m,n\geq 1$ and $a_0^{m-1}, b_0^{n-1}$ we first observe that
To prove (6), it is enough to prove that
To prove (A1), it suffices to prove that
for all $p,q\geq 1$ and $a_0^{p-1}, b_0^{q-1}$ . By (5), we have that, for any $x\in [a_0^{p-1}]\cap \theta ^{-p}[b_0^{q-1}]$ ,
and, for any $y\in [a_0^{p-1}]$ , $z\in [b_0^{q-1}]$ , we also have
Taking $y=x$ and $z=\theta ^p x$ and combining (A3) and (A4), we obtain (A2). The proof of (6) is complete.