
Bounds for a nonlinear ergodic theorem for Banach spaces

Published online by Cambridge University Press:  18 February 2022

ANTON FREUND*
Affiliation: Department of Mathematics, Technical University of Darmstadt, Schlossgartenstr. 7, 64289 Darmstadt, Germany (e-mail: [email protected])
ULRICH KOHLENBACH
Affiliation: Department of Mathematics, Technical University of Darmstadt, Schlossgartenstr. 7, 64289 Darmstadt, Germany (e-mail: [email protected])

Abstract

We extract quantitative information (specifically, a rate of metastability in the sense of Terence Tao) from a proof due to Kazuo Kobayasi and Isao Miyadera, which shows strong convergence for Cesàro means of non-expansive maps on Banach spaces.

Type: Original Article
Copyright: © The Author(s), 2022. Published by Cambridge University Press

1 Introduction

Throughout this paper, we assume that X is a uniformly convex real Banach space, and that $C\subseteq X$ is non-empty, closed and convex. Also, we assume that the map $T:C\to C$ has a fixed point and is non-expansive (i.e., that we have $\lVert Tx-Ty \rVert \leq \lVert x-y \rVert $ for all $x,y\in C$ ). By

$$ \begin{align*} S_nx:=\frac1{n}\sum_{i=0}^{n-1}T^ix \end{align*} $$

we denote the Cesàro means with respect to T. Our aim is a quantitative version of the following result due to Kobayasi and Miyadera.

Theorem 1.1. [Reference Kobayasi and Miyadera13]

Consider $T:C\to C$ as above. Given $x\in C$ , assume that the sequences $(\lVert T^nx-T^{n+i}x \rVert )_n$ converge uniformly in i. We then have

$$ \begin{align*} \lim_{n\to\infty}\lVert y-S_nT^kx \rVert=0\quad\text{uniformly in}~k, \end{align*} $$

for some fixed point y of T.

Let us discuss what a quantitative version of the conclusion should look like. The proof by Kobayasi and Miyadera shows that the fixed point y from the theorem is the limit of the sequence $(S_mT^mx)$ . Hence, we obtain

(1) $$ \begin{align} \lim_{m,n\to\infty}\lVert S_mT^mx-S_nT^kx \rVert&=0\quad\text{uniformly in}~k, \end{align} $$
(2) $$ \begin{align} \lim_{n\to\infty}\lVert T^lS_nT^nx-S_nT^nx \rVert&=0\quad\text{uniformly in}~l. \end{align} $$

To see that we get (2), note that $Ty=y$ and the fact that T is non-expansive yield

$$ \begin{align*} \lVert T^lS_nT^nx-S_nT^nx \rVert\leq\lVert T^lS_nT^nx-T^ly \rVert+\lVert y-S_nT^nx \rVert\leq 2\cdot\lVert y-S_nT^nx \rVert. \end{align*} $$

Indeed, the conjunction of (1) and (2) is equivalent to the conclusion of Theorem 1.1 together with the result that $(S_nT^nx)$ converges to a fixed point of T. With respect to (2), we note that uniformity in l is somewhat trivial in the presence of an actual fixed point y. It is less trivial when only approximate fixed points are available, and this will play a role in our quantitative analysis (cf. the formulation of (5) below). Let us also point out that statements (1) and (2) entail

(3) $$ \begin{align} \lim_{n\to\infty}\lVert TS_nT^kx-S_nT^kx \rVert=0\quad\text{uniformly in}~k. \end{align} $$

This asymptotic regularity result is due to Bruck [Reference Bruck5] (in the Banach space case).

What, then, should a quantitative version of (1) assert? As a first idea, we might look for a rate of convergence, i.e., for a function $\varepsilon \mapsto N(\varepsilon )$ with

$$ \begin{align*} \lVert S_mT^mx-S_nT^kx \rVert<\varepsilon\quad\text{for all }m,n\geq N=N(\varepsilon)\text{ and all }k\in\mathbb N. \end{align*} $$

In particular, $\varepsilon \mapsto N(\varepsilon /2)$ would be a Cauchy rate for the sequence $(S_nx)$ . It is known that such a rate cannot be computed (or given by a ‘simple closed expression’) in general (see [Reference Avigad, Gerhardy and Towsner1, Theorem 5.1]). However, we will be able to construct a rate of metastability, i.e., a map $(\varepsilon ,g,h)\mapsto \Phi (\varepsilon ,g,h)$ that takes an $\varepsilon>0$ and functions $g,h:\mathbb N\to \mathbb N$ as input and ensures that we have

(4) $$ \begin{align} \lVert S_mT^mx-S_nT^kx \rVert<\varepsilon\quad&\text{for some } N\leq\Phi(\varepsilon,g,h) \text{ and}\nonumber\\ &\text{all } m,n\in[N,N+g(N)] \text{ and } k\leq h(N). \end{align} $$

Note that this is still as strong as (1) from above: if the latter fails, there is an $\varepsilon>0$ such that any N admits $m,n\geq N$ and $k\in \mathbb N$ with $\lVert S_mT^mx-S_nT^kx \rVert \geq \varepsilon $ . If we now set ${g(N):=\max \{m,n\}-N}$ and $h(N):=k$ for such numbers, (4) must fail. Similarly, our quantitative analysis of (2) will yield a map $(\varepsilon ,g,h)\mapsto \Psi (\varepsilon ,g,h)$ with

(5) $$ \begin{align} \lVert T^lS_nT^nx-S_nT^nx \rVert<\varepsilon\quad&\text{for some } N\leq\Psi(\varepsilon,g,h) \text{ and}\nonumber\\ &\text{all } n\in[N,N+g(N)] \text{ and } l\leq h(N). \end{align} $$

Following the structure of Kobayasi and Miyadera’s proof, we will first construct $\Psi $ and use it to define $\Phi $ . However, it will then turn out that one can switch to $\Psi :=\Phi $ , and that all desired properties can be satisfied simultaneously (see Theorem 1.2).
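To make the finitary content of (4) concrete, the following Python sketch checks a purported rate of metastability by exhaustive search. The callables dist and Phi are hypothetical stand-ins for $(m,n,k)\mapsto \lVert S_mT^mx-S_nT^kx \rVert $ and for the bound $\Phi $ ; the sketch only illustrates the quantifier structure and is not part of the mathematical development.

def check_metastability(dist, Phi, eps, g, h):
    """Search for an N <= Phi(eps, g, h) as in (4): dist(m, n, k) < eps must
    hold for all m, n in [N, N + g(N)] and all k <= h(N).  Returns such an N,
    or None if the purported rate fails for these inputs."""
    for N in range(Phi(eps, g, h) + 1):
        if all(dist(m, n, k) < eps
               for m in range(N, N + g(N) + 1)
               for n in range(N, N + g(N) + 1)
               for k in range(h(N) + 1)):
            return N
    return None

# If (1) failed for some eps, then every N would admit m, n >= N and some k
# with dist(m, n, k) >= eps; taking g(N) := max(m, n) - N and h(N) := k for
# such witnesses would make the search above fail, so (4) is indeed at least
# as strong as (1).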

As mentioned, one cannot expect a computable rate of convergence (rather than metastability) for (1). It is not clear whether rates of convergence are available in the case of (2) or (3). The proof by Kobayasi and Miyadera [Reference Kobayasi and Miyadera13] does not seem to yield such rates, while Bruck’s [Reference Bruck5] proof of (3) remains to be analyzed. In the case of Hilbert space, a rate of convergence for (3) is known (see [Reference Kohlenbach20, Lemma 3.4]).

The term ‘metastability’ for statements such as (4) and (5) has been coined by Tao [Reference Tao31]. Even before Tao had introduced this terminology, the notion had been studied in mathematical logic, in particular, in the proof mining program (see the textbook [Reference Kohlenbach17]), with foundational work reaching back to Gödel. One interesting aspect of metastability is its connection with the number of $\varepsilon $ -fluctuations (see the results by Avigad and Rute [Reference Avigad and Rute2, §5]). Both experience and general metatheorems from logic (cf. the end of this section) show that rates of metastability can be extracted from a wide range of mathematical proofs. Specifically, the present paper complements quantitative work on nonlinear non-expansive operators that satisfy a condition due to Wittmann [Reference Wittmann32] (cf. §5), notably, by Safarik [Reference Safarik30] (convergence of Cesàro means in Hilbert spaces) and the second author [Reference Kohlenbach19] (convergence of iterates of asymptotically regular maps in Banach spaces). Concerning complexity, it will be straightforward to see that our bounds are primitive recursive in the sense of Kleene (see, e.g., [Reference Kohlenbach17, §3.4]), as in the cited results. More generally, there is a wealth of quantitative results on the convergence of various iteration schemes in Hilbert, Banach and more general spaces (see, e.g., [Reference Avigad, Gerhardy and Towsner1, Reference Kohlenbach and Leuştean23] for the linear and [Reference Kohlenbach15, Reference Kohlenbach18, Reference Kohlenbach20, Reference Kohlenbach22, Reference Kohlenbach and Leuştean24, Reference Kohlenbach and Sipoş26, Reference Powell29] for the nonlinear case).

As explained above, our quantitative analysis of Theorem 1.1 (due to Kobayasi and Miyadera [Reference Kobayasi and Miyadera13]) consists in the construction of maps $\Phi $ and $\Psi $ as in (4) and (5). In the following, we specify the quantitative data that we consider as given. First, we assume that we have a bound $b>0$ with

(6) $$ \begin{align} C\subseteq B_{b/2}:=\bigg\{x\in X\,\bigg|\,\lVert x \rVert\leq\frac{b}{2}\bigg\}. \end{align} $$

Note that such a bound exists if, and only if, C has bounded diameter (specifically, (6) yields $\lVert x-y \rVert \leq b$ for $x,y\in C$ ). The version of Theorem 1.1 for bounded C is not actually weaker. To obtain the full theorem, recall the assumption that T has a fixed point f. For $x\in C$ and $r:=\lVert x-f \rVert $ , the set $D:=\{z\in X\mid \lVert z-f \rVert \leq r\}\cap C$ is closed and convex with $T(D)\subseteq D\supseteq \{x,f\}$ . In view of $D\subseteq B_{r+\lVert f \rVert }$ , we can conclude by applying the seemingly weaker version of Theorem 1.1. In the context of Lemma 2.6, we will need a bound as in (6), not just a bound on the diameter.

Secondly, to witness our standing assumption on the Banach space X, we assume as given a modulus $\eta :(0,2]\to (0,1]$ of uniform convexity, so that we have

(7) $$ \begin{align} \bigg\lVert \frac{x+y}{2}\bigg\rVert\leq 1-\eta(\varepsilon)\quad\text{for } \lVert x \rVert,\lVert y \rVert\leq 1 \text{ with } \lVert x-y \rVert\geq\varepsilon. \end{align} $$

Sometimes, it is convenient to have $\eta $ defined on $[0,\infty )$ with values in $[0,\infty )$ as a continuous function satisfying

(8) $$ \begin{align} 0<s<t\quad\Rightarrow\quad 0=\eta(0)<\eta(s)<\eta(t)\ \text{and}\ \frac{\eta(s)}{s}\leq\frac{\eta(t)}{t}. \end{align} $$

This form is used in §2, which exhibits the quantitative content of an intermediate result by Bruck [Reference Bruck6] and where it is shown how an arbitrary modulus as in (7) can be converted into a new modulus $\eta '$ with the additional properties. Conversely, given $\eta ':[0,\infty )\to [0,\infty )$ , one can simply take $\eta :(0,2]\to (0,1]$ with $\eta (\varepsilon ):=\min \{ 1,\eta '(\varepsilon )\}$ (note, in particular, that (7) is trivial when $\varepsilon>2$ and false or void when $\eta (\varepsilon )>1$ ). The property (8) can equivalently be expressed as $\eta (s)=s\cdot \tilde \eta (s)$ for a non-decreasing and continuous function $\tilde \eta :[0,\infty )\to [0,\infty )$ with $\tilde \eta (s)>0$ for $s>0$ . Let us also recall that the $L^p$ -spaces admit the natural modulus

$$ \begin{align*} \eta(\varepsilon)=\begin{cases} \dfrac{p-1}{8}\varepsilon^2 & \text{if } 1<p<2,\\[10pt] \dfrac{\varepsilon^p}{p\cdot 2^p} & \text{if } 2\le p<\infty, \end{cases} \end{align*} $$

for which (7) and (8) are satisfied (see [Reference Hanner12] and also [Reference Kohlenbach, Blanck, Brattka and Hertling14, §3]).
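For illustration, the displayed modulus for the $L^p$ -spaces can be evaluated by the following Python sketch; the function name eta_Lp and the argument names are ours, and the case distinction is the one from the formula above.

def eta_Lp(p, eps):
    """The modulus of uniform convexity for L^p displayed above, for
    exponents 1 < p < infinity and arguments 0 < eps <= 2."""
    if not (p > 1 and 0 < eps <= 2):
        raise ValueError("need p > 1 and 0 < eps <= 2")
    if p < 2:
        return (p - 1) / 8 * eps ** 2
    return eps ** p / (p * 2 ** p)

# for instance, eta_Lp(2, 1.0) evaluates to 1/8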

Thirdly, given that X is a uniformly convex Banach space, so is $X^2$ with the norm defined by $\lVert (x,y) \rVert _2=(\lVert x \rVert ^2+\lVert y \rVert ^2)^{1/2}$ . Indeed, assume that (7) holds for $(X,\lVert \cdot \rVert )$ and a modulus $\eta $ that satisfies $\eta (\varepsilon )\leq \varepsilon /4$ and $\eta (\varepsilon )=\varepsilon \cdot \tilde \eta (\varepsilon )$ with non-decreasing $\tilde \eta $ (which, e.g., follows if $\eta $ satisfies (8)). An analysis of the proof in [Reference Clarkson7] shows that (7) remains valid when $(X,\lVert \cdot \rVert )$ and $\eta $ are replaced by the space $(X^2,\lVert \cdot \rVert _2)$ and the modulus $\eta _2:(0,2]\to (0,1]$ given by

$$ \begin{align*} \eta_2(\varepsilon):=\delta\bigg(\frac{\varepsilon}{8\sqrt{2}}\cdot\eta\bigg(\frac{\varepsilon}{\sqrt{2}}\bigg)\bigg)\quad\text{with }\delta(\varepsilon):=\frac{\varepsilon^2}{8}. \end{align*} $$
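As a companion to the previous sketch, the displayed modulus $\eta _2$ for the product space can be computed as follows; again the names are ours, and eta is assumed to be a modulus for X with the properties stated above.

import math

def eta_product(eta, eps):
    """A modulus of uniform convexity for X^2 with the norm
    ||(x, y)||_2 = (||x||^2 + ||y||^2)^(1/2), obtained from a modulus eta
    for X via the displayed formula, where delta(e) = e^2 / 8."""
    delta = lambda e: e ** 2 / 8
    return delta(eps / (8 * math.sqrt(2)) * eta(eps / math.sqrt(2)))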

Note that $\eta _2$ satisfies (8) if $\eta $ does. Now the uniformly convex space $(X^2,\lVert \cdot \rVert _2)$ is, in particular, B-convex. By a characterization due to Pisier [Reference Pisier28] (cf. the proof of [Reference Bruck6, Theorem 1.1]), this means that there are $c>0$ and $q>1$ with the following property: for all independent mean zero random variables $Z_1,\ldots ,Z_n$ with values in $X^2$ , the expected values satisfy

(9) $$ \begin{align} \mathbb E\bigg(\bigg\lVert \sum_{i=1}^n Z_i\bigg\rVert_2^q\bigg)\leq c^q\cdot\sum_{i=1}^n\mathbb E(\lVert Z_i \rVert_2^q). \end{align} $$

For our quantitative analysis, we assume that we are given such c and q. This additional data could be avoided, i.e., expressed in terms of our modulus $\eta $ of uniform convexity (as guaranteed by the metatheorem cited below). Indeed, one can explicitly construct suitable $c,q$ in terms of $\eta _2$ via an analysis of the proof in [Reference Pisier28]. In fact, since that proof only uses that X is uniformly non-square in the sense of James, it suffices to use one non-trivial value of $\eta _2$ , e.g. $\eta _2(1)$ . In the case of $L^p$ -spaces, one can take $q:=\min \{ 2,p\},$ where the optimal constant c has been computed in [Reference Haagerup11] (see [Reference Ledoux and Talagrand27, §9.2]). Nevertheless, we include $c,q$ among our input data, because this simplifies matters and is computationally harmless. Note that c and q depend on the space only. Also, the complexity class of our bounds does not depend on c and q, because the latter are numbers rather than functions.

Finally, for given $x\in C$ , we abbreviate

$$ \begin{align*} \alpha^i_n:=\lVert T^nx-T^{n+i}x \rVert. \end{align*} $$

Since T is non-expansive, we always have $\alpha ^i_n\geq \alpha ^i_{n+1}\geq 0$ , so that each of the sequences $(\alpha ^i_n)$ converges. A central assumption of Theorem 1.1 demands that the rate of convergence (but not necessarily the limit) is independent of the number i. In terminology due to Bruck [Reference Bruck4], this means that T is asymptotically isometric on the set $\{x\}$ . As a quantitative version of this assumption, we suppose that we are given a rate of metastability, i.e., a map $(\varepsilon ,g,h)\mapsto A(\varepsilon ,g,h)$ that guarantees

(10) $$ \begin{align} |\alpha^i_m-\alpha^i_n|<\varepsilon\quad&\text{for some } N\leq A(\varepsilon,g,h)\nonumber\\ &\text{and all } m,n\in[N,N+g(N)] \text{ and } i\leq h(N). \end{align} $$

In order to apply our quantitative result, one will have to provide a map A with this property. We mention three situations where this is possible. First, assume that T satisfies Wittmann’s [Reference Wittmann32] condition $\lVert Tx+Ty \rVert \leq \lVert x+y \rVert $ (which is, e.g., the case when $C=-C$ and T is odd in addition to being non-expansive) and is asymptotically regular with given rate ( $\lVert T^{n+1}x-T^nx \rVert \to 0$ for $n\to \infty $ , which holds, e.g., for averaged maps). As shown by the second author (see [Reference Kohlenbach19] and the generalization in [Reference Kohlenbach21, §3]), one can then construct a rate of metastability that witnesses $\lVert T^ix-T^jx \rVert \to 0$ for $i,j\to \infty $ . Such a rate is readily transformed into a map A that validates (10). Secondly, the assumption that T is asymptotically regular can be dropped in the Hilbert space case. Finally, one can satisfy (10) when $(T^nx)$ has a convergent subsequence (even in Banach space). A quantitative analysis of the second and third situation (which are mentioned by Kobayasi and Miyadera [Reference Kobayasi and Miyadera13]) is given in §5 of our paper. We point out that all three constructions of A yield a rate of metastability rather than convergence. In this respect, it is also interesting to consider the beginning of §4, where a ‘limsup $\,\leq \,$ liminf’-argument from the proof of [Reference Kobayasi and Miyadera13, Lemma 2] forces us to settle for metastability.

Based on the preceding discussion, we now state the main result of our paper, which will be proved at the end of §4. Note that we simultaneously satisfy statements (4) and (5) as well as the metastable version of (3). As explained above, the original result of Kobayasi and Miyadera [Reference Kobayasi and Miyadera13] (stated as Theorem 1.1) can be recovered as an immediate consequence of the following quantitative version.

Theorem 1.2. Consider a uniformly convex real Banach space X, a subset $C\subseteq X$ that is non-empty, closed and convex, and a non-expansive map $T:C\to C$ that has a fixed point. Assume that we have maps $A,\eta $ and numbers $b,c>0$ and $q>1$ that satisfy (6)–(9) and (10), the latter for given $x\in C$ . Depending on this data only, we get a map $(\varepsilon ,g,h)\mapsto \Phi (\varepsilon ,g,h)$ , explicitly given in Definition 4.11, such that the following holds: for any $\varepsilon>0$ and $g,h:\mathbb N\to \mathbb N$ , there is an $N\leq \Phi (2\varepsilon /5,g,h)$ with

$$ \begin{align*} \max\{\lVert S_mT^mx-S_nT^kx \rVert,\lVert T^lS_nT^nx-S_nT^nx \rVert,\lVert T^lS_nT^kx-S_nT^kx \rVert\}<\varepsilon \end{align*} $$

for all $m,n\in [N,N+g(N)]$ and $k,l\leq h(N)$ .

To conclude the introduction, we return to the topic of logical metatheorems. In order to determine the precise bounds $\Phi $ and $\Psi $ , it is, of course, necessary to consider the proof of Theorem 1.1 in detail. However, the fact that one can extract suitable $\Phi $ and $\Psi $ is guaranteed in advance, by the second author’s general result [Reference Kohlenbach16, Theorem 3.30] on uniformly convex normed linear spaces. We only sketch why the latter applies (cf. [Reference Safarik30, §3] for a detailed discussion in a related case). The cited metatheorem covers, roughly speaking, results of the form ‘for all–exists’. A convergence statement such as (1) does not have this form, as the existential claim (‘there is an N’) is followed by a universal quantification (‘for all $m,n\geq N$ ’). On the other hand, the metastable version ‘for all $\varepsilon ,g,h$ there is a $\Phi (\varepsilon ,g,h)$ as in (4)’ does have the required form: here it is crucial that the quantification over $k,m,n$ and N inside (4) ‘does not count’, because it only refers to numbers below a given bound. Similarly, the assumption associated with (10) contains essentially no existential quantification when we treat A as a given function (which we may assume to be a majorant, namely, of the function $A^-$ providing the least N satisfying (10)). In this situation, the cited metatheorem predicts that there are computable maps $\Phi $ and $\Psi $ that only depend on our bound b with $C\subseteq B_{b/2}$ , on the given function A, and on the modulus $\eta $ of uniform convexity. Furthermore, the proof of the metatheorem suggests a general strategy for the extraction of $\Phi $ and $\Psi $ . Hence, the logical background is useful in practice and interesting as a uniform explanation. At the same time, each concrete application can be presented without any reference to logic, as the following sections testify.

Remark 1.3. For logicians

Officially, the aforementioned metatheorem requires A to be a strong majorant for some $A^-$ satisfying (10). However, strong majorization is only needed when dealing with proofs whose quantitative analysis requires so-called bar recursion, which is not the case here, while otherwise ordinary majorization can be used in the monotone functional interpretation proving the metatheorem (see [Reference Kohlenbach17, Remark 17.37]).

2 Nonlinearity and convex combinations

In this section, we discuss quantitative aspects of a result due to Bruck [Reference Bruck6]. Specifically, we construct an increasing function $\gamma :[0,\infty )\to [0,\infty )$ such that

(11) $$ \begin{align} \gamma\bigg(\bigg\lVert T\bigg(\sum_{i=1}^n\lambda_ix_i\bigg)-\sum_{i=1}^n\lambda_iTx_i\bigg\rVert\bigg)\leq\max_{1\leq i,j\leq n}(\lVert x_i-x_j \rVert-\lVert Tx_i-Tx_j \rVert) \end{align} $$

holds for any convex combination $\sum \lambda_ix_i$ (i.e., we require $\lambda_i\geq 0$ and $\sum \lambda_i=1$ ). We note that most quantitative information in this section is already quite explicit in Bruck’s original presentation. Nevertheless, it will be important to streamline some constructions for our purpose (cf. the paragraph after Definition 2.4).

Bruck first constructs functions $\gamma =\gamma _n$ that satisfy (11) for fixed n. In a second step, he achieves independence of n by diagonalizing over these functions. A more common way to assert (11) for $n=2$ is to say that T is of type ( $\gamma $ ). This reveals that the case $n=2$ coincides with [Reference Bruck5, Lemma 1.1]. The following makes the computational information explicit. By a standing assumption from the introduction, the function $\eta :[0,\infty )\to [0,\infty )$ is a modulus of uniform convexity for X, while $b>0$ bounds the diameter of $C\subseteq X$ , the domain of our map $T:C\to C$ .

Definition 2.1. Let $\gamma _2:[0,\infty )\to [0,\infty )$ be given by $\gamma _2(t)=\min \{t,{b}/{2}\cdot \eta ({4t}/{b})\}$ .

The next lemma shows that (11) holds for $\gamma =\gamma _2$ and fixed $n=2$ . The additional properties ensure that we have a strictly increasing inverse $\gamma _2^{-1}:[0,\infty )\to [0,\infty )$ (with $\gamma _2^{-1}(t)\geq t$ due to the minimum above), as used in the proof of Lemma 2.5.

Lemma 2.2. The function $\gamma _2$ is strictly increasing, unbounded and continuous with minimal value $\gamma _2(0)=0$ . For all $x_1,x_2\in C$ and $\lambda \in [0,1]$ , we have

$$ \begin{align*} \gamma_2(\lVert T(\lambda x_1+(1-\lambda)x_2)-(\lambda Tx_1+(1-\lambda)Tx_2) \rVert)\leq\lVert x_1-x_2 \rVert-\lVert Tx_1-Tx_2 \rVert. \end{align*} $$

Proof. The first sentence of the lemma is immediate by the corresponding properties of $\eta $ , which hold by a standing assumption from the introduction (note that (8) yields $\eta (t)\geq \eta (1)\cdot t$ for $t\geq 1$ ). For the remaining claim, we follow the proof of [Reference Bruck5, Lemma 1.1]. As in the latter, the value of

(12) $$ \begin{align} 2\lambda(1-\lambda)\cdot\lVert x_1-x_2 \rVert\cdot\eta\bigg(\frac{\lVert \lambda Tx_1+(1-\lambda)Tx_2-T(\lambda x_1+(1-\lambda)x_2) \rVert}{\lambda(1-\lambda)\cdot\lVert x_1-x_2 \rVert}\bigg) \end{align} $$

is smaller than or equal to $\lVert x_1-x_2 \rVert -\lVert Tx_1-Tx_2 \rVert $ (unless $\lambda \in \{0,1\}$ or $x_1=x_2$ , in which case the claim is trivial). Now recall the standing assumption that $\eta $ is convex. More specifically, from (8) we readily get that $t\cdot \eta (r/t)\leq s\cdot \eta (r/s)$ for $r\geq 0$ and $0<s\leq t$ . With $s=\lambda (1-\lambda )\cdot \lVert x_1-x_2 \rVert \leq b/4=t$ , we see that (12) is larger than or equal to

$$ \begin{align*} \frac{b}{2}\cdot\eta\bigg(\frac{4\cdot\lVert \lambda Tx_1+(1-\lambda)Tx_2-T(\lambda x_1+(1-\lambda)x_2) \rVert}{b}\bigg). \end{align*} $$

Hence, the definition of $\gamma _2$ (even without the minimum) is as required.

We have reproduced part of the proof from [Reference Bruck5] in order to show how the convexity of $\eta $ is used. As promised in the introduction, we now recall how a convex modulus can be constructed.

Remark 2.3. Assume that $\eta _0:(0,2]\to (0,1]$ is any modulus of uniform convexity for our Banach space X, which means that (7) holds with $\eta _0$ in place of $\eta $ . Define $\eta _1:(0,2]\to (0,1]$ by setting

$$ \begin{align*} \eta_1(\varepsilon)=\sup\{\eta_0(\varepsilon')\mid\varepsilon'\in(0,\varepsilon]\}. \end{align*} $$

Then (7) does still hold with $\eta _1$ in place of $\eta $ (since $\eta _1(\varepsilon )>1-\lVert (x+y)/2 \rVert $ entails that $\eta _0(\varepsilon ')>1-\lVert (x+y)/2 \rVert $ for some $\varepsilon '\leq \varepsilon $ ). The point is that $\eta _1$ is increasing (not necessarily strictly). We also write $\eta _1:[0,\infty )\to [0,1]$ for the extension of this function by the values $\eta _1(0)=0$ and $\eta _1(\varepsilon )=\eta _1(2)$ for $\varepsilon>2$ . Then $\eta _1$ is still increasing and hence Riemann integrable. As in the proof of [Reference Bruck5, Lemma 1.1], we now define $\eta :[0,\infty )\to [0,\infty )$ by

$$ \begin{align*} \eta(\varepsilon)=\frac12\cdot\int_0^{\varepsilon}\eta_1(t)\,{d}t. \end{align*} $$

For $\varepsilon \in (0,2]$ , we have $\eta (\varepsilon )\leq \varepsilon \cdot \eta _1(\varepsilon )/2\leq \eta _1(\varepsilon )$ , so that (7) holds for $\eta $ . Note that this extends to all $\varepsilon \in [0,\infty )$ , by $\eta (0)=0$ and the trivial reason mentioned in the introduction. Furthermore, $\eta $ is strictly increasing and continuous. For $0<s\leq t$ , we can split the integral to get $2\cdot \eta (t)\geq 2\cdot \eta (s)+(t-s)\cdot \eta _1(s)$ and then

$$ \begin{align*} 2\cdot(s\cdot\eta(t)-t\cdot\eta(s))\geq (t-s)\cdot(s\cdot\eta_1(s)-2\cdot\eta(s))\geq 0. \end{align*} $$

Thus, we have $\eta (s)/s\leq \eta (t)/t$ , as required by the convexity property from (8). In conclusion, $\eta $ satisfies all standing assumptions from the introduction. To estimate the cost of these assumptions, we observe that $\eta _1(\varepsilon )\leq 2\cdot \eta (2\varepsilon )/\varepsilon $ holds for $\varepsilon \in (0,2]$ . As noted in the introduction, the given construction is not required in the case of $L^p$ -spaces with $1<p<\infty $ , where the specific $\eta $ given does already satisfy (7) and (8).

We will see that the following functions validate (11) for arbitrary but fixed n.

Definition 2.4. We construct functions $\gamma _n:[0,\infty )\to [0,\infty )$ by recursion on $n\geq 2$ . The base case is provided by Definition 2.1, while the step is given by

$$ \begin{align*} \gamma_{n+1}(t)=\min\bigg\{\gamma_n(t),\gamma_2\bigg(\frac{\gamma_n(t/2)}{3}\bigg)\bigg\}. \end{align*} $$
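Since the functions $\gamma _n$ are given by an explicit recursion, they can be transcribed directly into code. The following Python sketch does so; the names make_gamma_n, eta and b are ours, and the direct recursion (which is exponential in n) is meant as a specification rather than an efficient implementation.

def make_gamma_n(eta, b):
    """The functions gamma_n from Definitions 2.1 and 2.4, returned as a
    single callable (n, t) -> gamma_n(t); eta is a convex modulus of uniform
    convexity on [0, infinity) and b is the bound from (6)."""
    def gamma(n, t):
        assert n >= 2
        if n == 2:
            # Definition 2.1: gamma_2(t) = min{t, b/2 * eta(4t/b)}
            return min(t, b / 2 * eta(4 * t / b))
        # Definition 2.4: gamma_{n+1}(t) = min{gamma_n(t), gamma_2(gamma_n(t/2)/3)}
        return min(gamma(n - 1, t), gamma(2, gamma(n - 1, t / 2) / 3))
    return gamma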

Let us point out that we do not make the functions $\gamma _n$ convex, as this property will not be needed beyond the proof of Lemma 2.2. This allows us to give a somewhat simpler definition than Bruck. One other point is important: the proof of [Reference Bruck6, Lemma 2.1] suggests a recursive construction of $\gamma _n^{-1}$ rather than $\gamma _n$ . We will consider inverses in the verification below, but we have avoided them in the construction itself, because this gives more control on the complexity of our bounds. Indeed, a function and its inverse do not generally belong to the same complexity class. As an example for logicians, we mention that the function $F_{\varepsilon _0}^{-1}$ in [Reference Freund9] is primitive recursive while $F_{\varepsilon _0}$ is not.

Lemma 2.5. The functions $\gamma _n$ are strictly increasing, unbounded and continuous with $\gamma _n(0)=0$ . For each fixed $n\geq 2$ , inequality (11) is valid for $\gamma =\gamma _n$ .

Proof. A straightforward induction over n yields the first sentence of the lemma. The point is that we can now consider the inverses $\gamma _n^{-1}:[0,\infty )\to [0,\infty )$ , which are strictly increasing as well. By the proof of [Reference Bruck6, Lemma 2.1], the claim that (11) holds for any convex combination (in the domain of $T:C\to C$ ) reduces to

$$ \begin{align*} \gamma_{n+1}^{-1}(s)\geq\gamma_2^{-1}(s)+\gamma_n^{-1}(s+2\cdot\gamma_2^{-1}(s)). \end{align*} $$

Given that Definition 2.1 forces $\gamma _2(r)\leq r$ , we inductively get $\gamma _{n+1}(r)\leq \gamma _n(r)\leq r$ and hence $r\leq \gamma _n^{-1}(r)$ . This ensures that we have

$$ \begin{align*} \gamma_2^{-1}(s)+\gamma_n^{-1}(s+2\cdot\gamma_2^{-1}(s))\leq 2\cdot\gamma_n^{-1}(3\cdot\gamma_2^{-1}(s))=:t. \end{align*} $$

We can thus conclude that

$$ \begin{align*} \gamma_{n+1}(\gamma_2^{-1}(s)+\gamma_n^{-1}(s+2\cdot\gamma_2^{-1}(s)))\leq\gamma_{n+1}(t)\leq\gamma_2\bigg(\frac{\gamma_n(t/2)}{3}\bigg)=s. \end{align*} $$

Since $\gamma _{n+1}^{-1}$ is increasing, this yields the remaining claim.

In order to obtain (11) for a function $\gamma $ that is independent of n, Bruck argues that any convex combination can be approximated by one with a bounded number of summands. More specifically, this observation is applied to convex combinations of elements $(x,Tx)\in C\times C\subseteq X^2$ . Here, $X^2$ is a uniformly convex Banach space with norm given by $\lVert (x,y) \rVert _2=(\lVert x \rVert ^2+\lVert y \rVert ^2)^{1/2}$ , as discussed in the introduction. For a number $p\geq 1$ and a subset $M\subseteq X^2$ , we put

$$ \begin{align*} \operatorname{co}_p(M)=\bigg\{\sum_{i=1}^n\lambda_iz_i\,\bigg|\,z_i\in M\text{ and } \lambda_i\geq 0\text{ with }\sum_{i=1}^n\lambda_i=1\text{ for }n\leq p\bigg\}. \end{align*} $$

Let us note that we can always arrange $n=p$ by repeating some of the $z_i$ . Also write $\operatorname {co}(M)$ for the full convex hull (i.e., the union over all $\operatorname {co}_p(M)$ with $p\geq 1$ ), and recall that $B_r=\{z\in X^2\mid \lVert z \rVert _2\leq r\}$ is the closed ball with radius r. By a standing assumption, we have $c>0$ and $q>1$ that validate (9) from the introduction. The proof of [Reference Bruck6, Theorem 1.1] contains the following computational information.

Lemma 2.6. We have $\operatorname {co}(M)\subseteq \operatorname {co}_p(M)+B_{r\cdot \varepsilon }$ when $M\subseteq B_r$ and $2cp^{(1-q)/q}\leq \varepsilon $ .

Proof. In the proof of [Reference Bruck6, Theorem 1.1], Bruck seems to claim that the result holds for $2cp^{1/q}<\varepsilon $ . This appears counterintuitive, since $p\mapsto 2cp^{1/q}$ is strictly increasing while $\operatorname {co}_p(M)$ grows with p. We recall Bruck’s proof to show that it actually yields our bound (cf. the proof of Theorem 6.2 in [Reference Dilworth, Howard and Roberts8] for similar reasoning). By a straightforward rescaling, we may assume that $r=1$ . Consider any convex combination

$$ \begin{align*} z=\sum_{i=1}^n\lambda_iz_i\in\operatorname{co}(M)\subseteq B_1. \end{align*} $$

For p as in the lemma, consider independent $X^2$ -valued random variables $Z_1,\ldots ,Z_p$ with identical distribution given by

$$ \begin{align*} Z_1,\ldots,Z_p\,\stackrel{\text{iid}}{\sim}\,\frac{Z-z}{p}\quad\text{for } Z=z_i\text{ with probability }\lambda_i. \end{align*} $$

Now $(2/p)^q$ bounds all possible values of $\lVert Z_i \rVert _2^q$ and hence its expectation. By (9) we can conclude that

$$ \begin{align*} \mathbb E\bigg(\bigg\lVert \sum_{j=1}^pZ_j\bigg\rVert_2^q\bigg)\leq(2c)^q\cdot p^{1-q}. \end{align*} $$

In particular, the right-hand side must bound at least one possible value of $\lVert \sum Z_j \rVert _2^q$ . More explicitly, there must be some event $\omega :\{1,\ldots ,p\}\to \{1,\ldots ,n\}$ (under which $Z_j$ assumes value $(z_{\omega (j)}-z)/p$ ) such that

$$ \begin{align*} \bigg\lVert z-\sum_{j=1}^p\frac{1}{p}\cdot z_{\omega(j)}\bigg\rVert_2=\bigg\lVert \sum_{j=1}^p\frac{z_{\omega(j)}-z}{p}\bigg\rVert_2\leq 2cp^{({1-q})/{q}}\leq\varepsilon. \end{align*} $$

This shows that our given $z\in \operatorname {co}(M)$ lies in $\operatorname {co}_p(M)+B_{\varepsilon }$ , as desired.

Following the construction by Bruck, we now diagonalize over the functions $\gamma _n$ , in order to achieve independence of n.

Definition 2.7. Let $\gamma :[0,\infty )\to [0,\infty )$ be given by $\gamma (0)=0$ and, for $t>0$ ,

$$ \begin{align*} \gamma(t)=\gamma_{p(t)}\bigg(\frac{t}{3}\bigg)\quad\text{with } p(t)=\max\bigg\{2,\bigg\lceil \bigg(\frac{6bc}{\sqrt{2}t}\bigg)^{q/(q-1)} \bigg\rceil\bigg\}. \end{align*} $$
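The diagonalization can likewise be written out as a short Python sketch; here gamma_n is assumed to be the family $(n,t)\mapsto \gamma _n(t)$ from Definition 2.4 (for instance, the sketch given there), while b, c and q are the constants from (6) and (9).

import math

def make_gamma(gamma_n, b, c, q):
    """The diagonalized function gamma from Definition 2.7."""
    def p(t):
        return max(2, math.ceil((6 * b * c / (math.sqrt(2) * t)) ** (q / (q - 1))))
    def gamma(t):
        return 0.0 if t == 0 else gamma_n(p(t), t / 3)
    return gamma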

The next proof is very close to the one of [Reference Bruck6, Theorem 2.1]. We provide details in order to show how the previous constructions come together.

Proposition 2.8. The function $\gamma $ is strictly increasing and validates (11) for any convex combination of elements $x_1,\ldots ,x_n\in C$ with arbitrary $n\geq 1$ .

Proof. The function $t\mapsto p(t)$ is decreasing, as $q>1$ holds by a standing assumption from the introduction. Since Definition 2.4 ensures that $\gamma _{n+1}(t)\leq \gamma _n(t)$ , this yields

$$ \begin{align*} s<t\quad\Rightarrow\quad \gamma(s)=\gamma_{p(s)}\bigg(\frac{s}{3}\bigg)\leq\gamma_{p(t)}\bigg(\frac{s}{3}\bigg)<\gamma_{p(t)}\bigg(\frac{t}{3}\bigg)=\gamma(t). \end{align*} $$

To establish (11), consider a convex combination $\sum \lambda_ix_i$ of $x_1,\ldots ,x_n\in C$ . We set

$$ \begin{align*} t:=\bigg\lVert T\bigg(\sum_{i=1}^n\lambda_ix_i\bigg)-\sum_{i=1}^n\lambda_iTx_i\bigg\rVert \end{align*} $$

and assume that $t>0$ , as the remaining case is trivial. The crucial idea (from the proof by Bruck [Reference Bruck6]) is to apply Lemma 2.6 to $M=\{z_1,\ldots ,z_n\}$ with $z_i=(x_i,Tx_i)$ . By standing assumption (6), we have $C\subseteq B_{b/2}$ in X and hence $M\subseteq C\times C\subseteq B_{b/\sqrt {2}}$ in $X^2$ (recall that $\lVert (x,y) \rVert _2=(\lVert x \rVert ^2+\lVert y \rVert ^2)^{1/2}$ ). One can readily check that

$$ \begin{align*} \varepsilon:=\frac{\sqrt{2}t}{3b}\geq 2c\cdot p(t)^{(1-q)/q}. \end{align*} $$

For this $\varepsilon $ and with $p=p(t)$ , Lemma 2.6 yields a subset $\{z_{i_1},\ldots ,z_{i_p}\}\subseteq M$ and coefficients $\mu _1,\ldots ,\mu _p\geq 0$ with $\sum \mu _j=1$ such that

$$ \begin{align*} \bigg\lVert \sum_{i=1}^n\lambda_iz_i-\sum_{j=1}^p\mu_j z_{i_j}\bigg\rVert_2\leq\frac{t}{3}. \end{align*} $$

In view of $\lVert x \rVert ,\lVert y \rVert \leq \lVert (x,y) \rVert _2$ and since T is non-expansive, this yields

$$ \begin{align*} \bigg\lVert T\bigg(\sum_{i=1}^n\lambda_ix_i\bigg)-T\bigg(\sum_{j=1}^p\mu_j x_{i_j}\bigg)\bigg\rVert\leq\frac{t}{3}\quad\text{and}\quad\bigg\lVert \sum_{i=1}^n\lambda_iTx_i-\sum_{j=1}^p\mu_j Tx_{i_j}\bigg\rVert\leq\frac{t}{3}. \end{align*} $$

Using the triangle inequality, we can conclude that

$$ \begin{align*} t\leq\frac{t}{3}+\bigg\lVert T\bigg(\sum_{j=1}^p\mu_jx_{i_j}\bigg)-\sum_{j=1}^p\mu_jTx_{i_j}\bigg\rVert+\frac{t}{3}. \end{align*} $$

Hence, the remaining norm on the right-hand side must be at least $t/3$ . By Lemma 2.5, we get

$$ \begin{align*} \gamma(t)=\gamma_p\bigg(\frac{t}{3}\bigg)\leq\max_{1\leq k,l\leq p}(\lVert x_{i_k}-x_{i_l} \rVert-\lVert Tx_{i_k}-Tx_{i_l} \rVert). \end{align*} $$

This suffices to conclude the proof, as the maximum on the right-hand side becomes larger when we admit all $i,j\in \{1,\ldots ,n\}$ , rather than those of the form $i=i_k$ and $j=i_l$ only.

3 Nonlinearity and Cesàro means

Recall that $S_nx$ denotes the Cesàro mean $(T^0x+\cdots +T^{n-1}x)/n$ . If T is nonlinear, then it may fail to commute with $S_n$ . This failure can be measured by

$$ \begin{align*} \beta^l_n:=\lVert S_nT^{l+n}x-T^lS_nT^nx \rVert. \end{align*} $$

More generally, we can recover these values as $\beta _n^l=\beta _{n,n}^l$ for

$$ \begin{align*} \beta_{m,n}^l:=\bigg\lVert \frac{S_mT^{l+m}x+S_nT^{l+n}x}{2}-T^l\bigg(\frac{S_mT^mx+S_nT^nx}{2}\bigg)\bigg\rVert. \end{align*} $$

Kobayasi and Miyadera [Reference Kobayasi and Miyadera13, Lemma 1] show that, under the assumptions made in our Theorem 1.1 (which is their Theorem 1),

$$ \begin{align*} \lim_{n\to\infty}\beta_n^l=\lim_{m,n\to\infty}\beta_{m,n}^l=0. \end{align*} $$

In the present section, we establish a rate of metastability for this result. Let us recall that we have only assumed a rate of metastability rather than convergence in (10). If we had assumed a rate of convergence there, then we would get one for the $\beta ^l_{m,n}$ here (simply read $g(N)=\infty =h(N)$ below). However, we would then be forced to downgrade from convergence to metastability later, as explained in the introduction. In view of this fact, it is reasonable to work with a rate of metastability from the outset, as this makes for a computationally weaker assumption. Concerning the next definition, recall that our standing assumptions provide a map $(\varepsilon ,g,h)\mapsto A(\varepsilon ,g,h)$ that validates (10). The function $\gamma $ is given by Definition 2.7.

Definition 3.1. For $\varepsilon>0$ and $g,h:\mathbb N\to \mathbb N$ , we put $B(\varepsilon ,g,h):=A(\gamma (\varepsilon ),g',h')$ with $g'(N):=N+2\cdot g(N)+h(N)$ and $h'(N):=2\cdot (N+g(N))$ .
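Since B is obtained from A and $\gamma $ by a simple substitution, it admits the following direct transcription into Python; A and gamma are passed in as callables, and all names are ours.

def make_B(A, gamma):
    """The map B from Definition 3.1, for a rate of metastability A as in
    (10) and the function gamma from Definition 2.7."""
    def B(eps, g, h):
        g_prime = lambda N: N + 2 * g(N) + h(N)
        h_prime = lambda N: 2 * (N + g(N))
        return A(gamma(eps), g_prime, h_prime)
    return B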

By analyzing the proof of [Reference Kobayasi and Miyadera13, Lemma 1], we get the following lemma.

Lemma 3.2. For any $\varepsilon>0$ and $g,h:\mathbb N\to \mathbb N$ , there is an $N\leq B(\varepsilon ,g,h)$ with

$$ \begin{align*} \beta_{m,n}^l\leq\varepsilon\quad\text{for all }m,n\in[N,N+g(N)]\text{ and all }l\leq h(N). \end{align*} $$

In particular, we get $\beta _n^l\leq \varepsilon $ under the same conditions on n and l.

To be precise, we point out that $\beta _{m,n}^l$ is only defined for $m,n>0$ . In order to avoid this condition, one can simply declare that $\beta _{0,n}^l=0=\beta _{m,0}^l$ .

Proof. We first observe that Proposition 2.8 remains valid with $T^l:C\to C$ in place of $T:C\to C$ , with the same $\gamma :[0,\infty )\to [0,\infty )$ for all $l\in \mathbb N$ . Indeed, a glance at Definitions 2.1, 2.4 and 2.7 reveals that the relevant quantitative information depends on our Banach space X and the bounded subset $C\subseteq X$ only (cf. the standing assumptions (6)–(10) from the introduction). The qualitative assumption that T is non-expansive holds for $T^l$ as well. As in the proof of [Reference Kobayasi and Miyadera13, Lemma 1], we now apply (11) with $T^l$ and $m+n$ in place of T and n, respectively, and with

$$ \begin{align*} \begin{cases} \begin{alignedat}{5} x_i&{}=T^{m+i-1}x&\text{ and } &\lambda_i&&{}=1/(2m)\quad && \text{for } 1\leq i\leq m,\\ x_i&{}=T^{n+i-(m+1)}x&\text{ and } &\lambda_i&&{}=1/(2n)\quad && \text{for } m+1\leq i\leq m+n. \end{alignedat} \end{cases} \end{align*} $$

One can readily check that this yields

$$ \begin{align*} \sum_{i=1}^{m+n}\lambda_ix_i=\frac{S_mT^mx+S_nT^nx}{2}\quad\text{and}\quad\sum_{i=1}^{m+n}\lambda_iT^lx_i=\frac{S_mT^{l+m}x+S_nT^{l+n}x}{2}. \end{align*} $$

Hence, inequality (11) amounts to

$$ \begin{align*} \gamma(\beta_{m,n}^l)\leq\max\{\lVert x_i-x_j \rVert-\lVert T^lx_i-T^lx_j \rVert\,:\,1\leq i,j\leq m+n\}. \end{align*} $$

Writing $\alpha _n^i=\lVert T^nx-T^{n+i}x \rVert $ as in the introduction, we get

(13) $$ \begin{align} \gamma(\beta_{m,n}^l)\leq\max\{\alpha_k^p-\alpha_{l+k}^p\mid\min\{m,n\}\leq k<2\cdot\max\{m,n\}\text{ and }p<2\cdot\max\{m,n\}\}. \end{align} $$

Let $g'$ and $h'$ be as given in Definition 3.1. By a standing assumption from the introduction, pick an $N\leq A(\gamma (\varepsilon ),g',h')$ such that (10) holds with $\gamma (\varepsilon ),g'$ and $h'$ in place of $\varepsilon ,g$ and h, respectively. To establish the present lemma, consider arbitrary $m,n\in [N,N+g(N)]$ and $l\leq h(N)$ . For all $k,l,p$ as in (13), we have

$$ \begin{align*} N\leq k\leq k+l<2\cdot(N+g(N))+h(N)=N+g'(N) \end{align*} $$

and $p<h'(N)$ . Hence, the current version of (10) yields $\alpha _k^p-\alpha _{l+k}^p<\gamma (\varepsilon )$ in all cases that are relevant for (13). The latter thus entails that $\gamma (\beta _{m,n}^l)<\gamma (\varepsilon )$ . Proposition 2.8 tells us that $\gamma $ is strictly increasing, so that we get $\beta _{m,n}^l<\varepsilon $ , as desired.

4 Cesàro means and fixed points

This section completes the quantitative analysis of Kobayasi and Miyadera’s results in [Reference Kobayasi and Miyadera13], as sketched in the introduction.

We first analyze the preliminary result from Lemma 2 of [Reference Kobayasi and Miyadera13], which asserts that

$$ \begin{align*} \theta_n^f:=\lVert S_nT^nx-f \rVert \end{align*} $$

converges whenever $f=Tf$ is a fixed point. From a methodological standpoint, this can be seen as the most interesting part of our quantitative analysis: it is here that we are forced to extract a rate of metastability rather than convergence. In order to make this transparent, we recall the original proof of [Reference Kobayasi and Miyadera13, Lemma 2]. For a fixed point $f=Tf$ and arbitrary $m\geq 1$ and $n\in \mathbb N$ , the cited proof shows that

(14) $$ \begin{align} \theta_{m+n}^f\leq\theta_m^f+\frac{m-1}{m+n}\cdot\lVert x-f \rVert+\frac{1}{m+n}\cdot\sum_{i=0}^{m+n-1}\beta_m^{n+i}, \end{align} $$

with $\beta ^l_n=\lVert S_nT^{l+n}x-T^lS_nT^nx \rVert $ as in the previous section. The proof then concludes with a ‘limsup $\,\leq \,$ liminf’-argument, which can be spelled out as follows. Write $\theta ^f$ for the limit inferior of the sequence $(\theta ^f_n)$ . According to [Reference Kobayasi and Miyadera13, Lemma 1], we have $\lim _{n\to \infty }\beta ^l_n=0$ uniformly in l (cf. our Lemma 3.2). Given $\varepsilon>0$ , there must thus be a number $m\in \mathbb N$ such that

$$ \begin{align*} \text{(i) } \theta^f_k\geq\theta^f-\varepsilon/4\quad \text{ for } k\geq m,\quad \text{(ii) } \beta^l_m\leq\varepsilon/4 \quad \text{ for } l\in\mathbb N, \quad\text{(iii) } \theta^f_m\leq\theta^f+\varepsilon/4. \end{align*} $$

Put $N:=\max \{m,\lceil 4(m-1)\cdot \lVert x-f \rVert /\varepsilon \rceil \}$ . For $j\geq N$ (think $j=m+n$ ), we can combine (14) and (ii) to get $\theta ^f_j\leq \theta ^f_m+\varepsilon /2$ , which, by (iii), yields $\theta ^f_j\leq \theta ^f+3\varepsilon /4$ . Together with (i), we get $|\theta ^f_j-\theta ^f_k|\leq \varepsilon $ for $j,k\geq N$ , as needed to show that $(\theta ^f_n)$ is a Cauchy sequence. Our focus on the Cauchy property (rather than on convergence to the limit $\theta ^f$ ) aligns the argument with the quantitative analysis below.

From a computational viewpoint, the previous argument appears problematic because we do not know how fast the limit inferior is approximated. More explicitly, there is no obvious bound on a number m that would satisfy (i) or (iii) above. Nevertheless, a modified argument does reveal quantitative information: if (ii) holds for sufficiently many l (which can be ensured by Lemma 3.2), then we obtain $\theta ^f_j\leq \theta ^f_m+\varepsilon /2$ as above. Without reference to $\theta ^f$ , we can conclude that $|\theta ^f_j-\theta ^f_k|\leq \varepsilon $ holds unless we have $\theta ^f_k<\theta ^f_m-\varepsilon /2$ . In the latter case, we repeat the argument with k in place of m. Crucially, there can be only finitely many repetitions of this type, as $\theta _n^f$ cannot become negative. For an explicit bound, use $f=Tf$ to get

(15) $$ \begin{align} \theta_n^f=\bigg\lVert f-\frac{1}{n}\cdot\sum_{i=0}^{n-1}T^ix\bigg\rVert\leq\frac{1}{n}\cdot\sum_{i=0}^{n-1}\lVert T^ix-T^if \rVert\leq\lVert x-f \rVert\leq b. \end{align} $$

Here, b is a bound on the diameter of $C\subseteq B_{b/2}$ , the domain of our non-expansive map $T:C\to C$ (cf. standing assumption (6) from the introduction). The preceding discussion is somewhat informal, but it helps to motivate the following definition.

Definition 4.1. Consider a function $g:\mathbb N\to \mathbb N$ . As usual, we write $g^M$ for the monotone bound given by $g^M(n):=\max _{m\leq n}g(m)$ . For $\varepsilon>0$ , we put

$$ \begin{align*} g_\varepsilon(n):=n'+g^M(n')\quad\text{with } n'=\max\bigg\{n,\bigg\lceil \frac{4nb}{\varepsilon}\bigg \rceil\bigg\}. \end{align*} $$

Now consider the iterates $g_\varepsilon ^{(i)}:\mathbb N\to \mathbb N$ (with corrected start value) given by

$$ \begin{align*} g_\varepsilon^{(0)}(n)=\max\{1,n\}\quad\text{and}\quad g_\varepsilon^{(i+1)}(n)=g_\varepsilon(g_\varepsilon^{(i)}(n)). \end{align*} $$

Finally, define a map $(\varepsilon ,g)\mapsto \Theta (\varepsilon ,g)$ by setting

$$ \begin{align*} \Theta(\varepsilon,g):=g_\varepsilon^{(K+1)}\bigg(B\bigg(\frac{\varepsilon}{4},g_\varepsilon^{(K+1)},2\cdot g_\varepsilon^{(K+1)}\bigg)\bigg)\quad\text{with } K=\bigg\lceil \frac{2b}{\varepsilon} \bigg\rceil. \end{align*} $$

Here, B is the map from Definition 3.1.
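As a sketch of how the pieces of Definition 4.1 fit together, the following Python code computes $\Theta $ from B and b; it is a direct transcription of the definition, with names of our choosing, and B is assumed to return a natural number.

import math

def make_Theta(B, b):
    """The map Theta from Definition 4.1, with B as in Definition 3.1 and b
    the bound from (6)."""
    def Theta(eps, g):
        def g_M(n):                                   # the monotone bound g^M
            return max(g(m) for m in range(n + 1))
        def g_eps(n):                                 # the function g_eps
            n_prime = max(n, math.ceil(4 * n * b / eps))
            return n_prime + g_M(n_prime)
        def g_eps_iter(i, n):                         # the iterate g_eps^(i)(n)
            value = max(1, n)
            for _ in range(i):
                value = g_eps(value)
            return value
        K = math.ceil(2 * b / eps)
        top = lambda n: g_eps_iter(K + 1, n)          # g_eps^(K+1) as a function
        return top(B(eps / 4, top, lambda n: 2 * top(n)))
    return Theta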

As a systematic explanation for logicians, we point out that $\Theta $ can be seen as the combination of two rates of metastability (cf. Theorem 5.8 in the preprint version arXiv:1412.5563 of [Reference Kohlenbach, Leuştean and Nicolae25]). It is well known that a non-increasing sequence in $[0,b]$ admits a rate of metastability that depends on b only (see [Reference Kohlenbach17, Proposition 2.27]). We have a similar rate here, given that the $\theta ^f_n$ are ‘almost’ non-increasing by (14). This rate is combined with the rate B, which ensures that the last summand in (14) is indeed small (see Lemma 3.2). The fact that $\Theta $ combines two rates is made explicit in Corollary 4.4 below (note that $\Theta ^B$ is a minor variant of $\Theta $ ). Let us now present our metastable version of [Reference Kobayasi and Miyadera13, Lemma 2].

Proposition 4.2. Let $f=Tf$ be a fixed point. For any $\varepsilon>0$ and $g:\mathbb N\to \mathbb N$ , there is an $N\leq \Theta (\varepsilon ,g)$ such that $|\theta _m^f-\theta _n^f|<\varepsilon $ holds for all $m,n\in [N,N+g(N)]$ .

Proof. For $K=\lceil 2b/\varepsilon \rceil $ , Lemma 3.2 yields an $N_0\leq B(\varepsilon /4,g_\varepsilon ^{(K+1)},2\cdot g_\varepsilon ^{(K+1)})$ with

(16) $$ \begin{align} \beta_m^l\leq\frac{\varepsilon}{4}\quad\text{for all }m\in[N_0,N_0+g_\varepsilon^{(K+1)}(N_0)]\ \text{and}\ l\leq 2\cdot g_\varepsilon^{(K+1)}(N_0). \end{align} $$

We will show that some $N\in [N_0,g_\varepsilon ^{(K+1)}(N_0)]$ validates the proposition. To see that we get $N\leq \Theta (\varepsilon ,g)$ , it suffices to observe that each iterate $g_\varepsilon ^{(i)}$ is increasing. Inductively, this reduces to the same statement about $g_\varepsilon $ , which holds by construction. Aiming at a contradiction, we now assume that, for any number $N\in [N_0,g_\varepsilon ^{(K+1)}(N_0)]$ , there are $m,n\in [N,N+g(N)]$ with $|\theta _m^f-\theta _n^f|\geq \varepsilon $ . We shall construct a sequence of numbers $n(i)\leq g_\varepsilon ^{(i)}(N_0)$ with $\theta ^f_{n(i)}\leq b-i\varepsilon /2$ by recursion on $i\leq K+1$ , which contradicts $\theta ^f_{n(K+1)}\geq 0$ . In the base case, set $n(0)=g_\varepsilon ^{(0)}(N_0)$ and note that $\theta ^f_{n(0)}\leq b$ holds due to (15). In the recursion step from $i\leq K$ to $i+1$ , we consider

$$ \begin{align*} N:=\max\bigg\{n(i),\bigg\lceil \frac{4b\cdot n(i)}{\varepsilon} \bigg\rceil\bigg\}. \end{align*} $$

By (14) for $m=n(i)$ and $n=N-n(i)+k$ (with arbitrary k), we get

(17) $$ \begin{align} \theta^f_{N+k}\leq\theta^f_{n(i)}+\frac{\varepsilon}{4}+\frac{1}{N+k}\cdot\sum_{j=0}^{N+k-1}\beta_{n(i)}^{N-n(i)+k+j}. \end{align} $$

In order to apply (16), we verify that the indices of $\beta $ lie in the appropriate interval of metastability. We anticipate that we will choose an $n(i+1)$ above $N\geq n(i)$ , so that we may inductively assume that $n(i)\geq n(0)\geq N_0$ . As the function $g_\varepsilon $ is increasing with $g_\varepsilon (j)\geq j$ , we also have

$$ \begin{align*} n(i)\leq N\leq N+g^M(N)=g_\varepsilon(n(i))\leq g_\varepsilon(g_\varepsilon^{(i)}(N_0))\leq g_\varepsilon^{(K+1)}(N_0). \end{align*} $$

Furthermore, for $k\leq g(N)$ and $j<N+k$ , we obtain

$$ \begin{align*} N-n(i)+k+j<2\cdot(N+g^M(N))\leq 2\cdot g_\varepsilon^{(K+1)}(N_0). \end{align*} $$

In view of these bounds, we can combine (16) and (17) to get

$$ \begin{align*} \theta^f_{N+k}\leq\theta^f_{n(i)}+\frac{\varepsilon}{2}\quad\text{for } k\leq g(N). \end{align*} $$

On the other hand, we have $|\theta ^f_m-\theta ^f_n|\geq \varepsilon $ for some $m,n\in [N,N+g(N)]$ , by the contradictory assumption for the present $N\in [N_0,g_\varepsilon ^{(K+1)}(N_0)]$ . Thus, there must be a $k\leq g(N)$ with $\theta ^f_{N+k}\leq \theta ^f_{n(i)}-\varepsilon /2$ . In order to complete the recursion step, we set $n(i+1):=N+k$ for some such k. Note that we indeed get

$$ \begin{gather*} n(i+1)\leq N+g^M(N)\leq g_\varepsilon^{(i+1)}(N_0),\\[3pt] \theta^f_{n(i+1)}\leq\theta^f_{n(i)}-\frac{\varepsilon}{2}\leq b-\frac{i\cdot\varepsilon}{2}-\frac{\varepsilon}{2}=b-\frac{(i+1)\cdot\varepsilon}{2}, \end{gather*} $$

so that $n(i+1)$ retains the properties that were promised above.

We will want to bound $|\theta _m^f-\theta _n^f|$ and $\beta ^l_m$ simultaneously, i.e., on the same interval of metastability. In the present case, it suffices to tweak our previous construction.

Definition 4.3. Extending Definition 4.1, we set

$$ \begin{align*} \Theta^B(\varepsilon,g,h):=g_\varepsilon^{(K+1)}\bigg(B\bigg(\frac{\varepsilon}{4},g_\varepsilon^{(K+2)},\max\{2\cdot g_\varepsilon^{(K+1)},h^M\circ g_\varepsilon^{(K+1)}\}\bigg)\bigg), \end{align*} $$

still with $K=\lceil 2b/\varepsilon \rceil $ and $h^M(n)=\max _{m\leq n}h(m)$ .
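The tweak can be transcribed in the same way; in the following sketch we assume a helper g_eps_iter(eps, g, i, n) computing the iterate $g_\varepsilon ^{(i)}(n)$ from Definition 4.1 (such as the loop in the previous sketch), and the remaining names are again ours.

import math

def make_Theta_B(B, b, g_eps_iter):
    """The variant Theta^B from Definition 4.3."""
    def Theta_B(eps, g, h):
        def h_M(n):                                   # the monotone bound h^M
            return max(h(m) for m in range(n + 1))
        K = math.ceil(2 * b / eps)
        top = lambda n: g_eps_iter(eps, g, K + 1, n)  # g_eps^(K+1)
        top2 = lambda n: g_eps_iter(eps, g, K + 2, n) # g_eps^(K+2)
        return top(B(eps / 4, top2, lambda n: max(2 * top(n), h_M(top(n)))))
    return Theta_B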

As promised, this yields a simultaneous bound.

Corollary 4.4. For a fixed point $f=Tf$ and arbitrary $\varepsilon>0$ and $g,h:\mathbb N\to \mathbb N$ , there is an $N\leq \Theta ^B(\varepsilon ,g,h)$ such that we have both $\beta _{m,n}^l\leq \varepsilon /4$ and $|\theta ^f_m-\theta _n^f|<\varepsilon $ for all $m,n\in [N,N+g(N)]$ and $l\leq h(N)$ .

Proof. From Lemma 3.2, we obtain a number

$$ \begin{align*} N_0\leq B\bigg(\frac{\varepsilon}{4},g_\varepsilon^{(K+2)},\max\{2\cdot g_\varepsilon^{(K+1)},h^M\circ g_\varepsilon^{(K+1)}\}\bigg) \end{align*} $$

such that $\beta _{m,n}^l\leq \varepsilon /4$ holds whenever we have $m,n\in [N_0,N_0+g_\varepsilon ^{(K+2)}(N_0)]$ as well as $l\leq 2\cdot g_\varepsilon ^{(K+1)}(N_0)$ or $l\leq h^M\circ g_\varepsilon ^{(K+1)}(N_0)$ . The proof of Proposition 4.2 yields an $N\in [N_0,g_\varepsilon ^{(K+1)}(N_0)]$ with $|\theta _m^f-\theta _n^f|<\varepsilon $ for all $m,n\in [N,N+g(N)]$ . As before, we see that $N\leq \Theta ^B(\varepsilon ,g,h)$ . In view of $h(N)\leq h^M\circ g_\varepsilon ^{(K+1)}(N_0)$ and

$$ \begin{gather*} N+g(N)\leq g_\varepsilon(N)\leq g_\varepsilon^{(K+2)}(N_0), \end{gather*} $$

all the desired inequalities $\beta _{m,n}^l\leq \varepsilon /4$ are available as well.

In their proof of [Reference Kobayasi and Miyadera13, Lemma 3], Kobayasi and Miyadera show that $(S_nT^nx)$ is a Cauchy sequence. We now provide a rate of metastability.

Definition 4.5. For $\varepsilon>0$ and $g:\mathbb N\to \mathbb N$ , we set

$$ \begin{align*} g_\varepsilon'(n):=\bigg\lceil \frac{6b\cdot(n+g(n))}{\delta(\varepsilon)}\bigg \rceil\quad\text{with }\delta(\varepsilon):=\min\bigg\{b,\frac{\varepsilon}{4},\frac{\varepsilon}{8}\cdot\eta\bigg(\frac{\varepsilon}{2b}\bigg)\bigg\}. \end{align*} $$

Then, we define

$$ \begin{align*} \Delta(\varepsilon,g):=\Theta^B(\delta(\varepsilon),g+g_\varepsilon',\operatorname{Id}+g+2\cdot g_\varepsilon'), \end{align*} $$

with $\operatorname {Id}(n)=n$ and for $\delta (\varepsilon )$ as above.
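As before, Definition 4.5 can be read as an executable recipe; in the Python sketch below, Theta_B, eta and b are passed in as parameters, and the names are ours.

import math

def make_Delta(Theta_B, eta, b):
    """The map Delta from Definition 4.5, given Theta^B from Definition 4.3,
    the modulus eta from (7)/(8) and the bound b from (6)."""
    delta = lambda eps: min(b, eps / 4, eps / 8 * eta(eps / (2 * b)))
    def Delta(eps, g):
        d = delta(eps)
        g_prime = lambda n: math.ceil(6 * b * (n + g(n)) / d)    # g'_eps
        return Theta_B(d,
                       lambda n: g(n) + g_prime(n),              # g + g'_eps
                       lambda n: n + g(n) + 2 * g_prime(n))      # Id + g + 2*g'_eps
    return Delta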

As promised, we have the following proposition.

Proposition 4.6. For any $\varepsilon>0$ and $g:\mathbb N\to \mathbb N$ , there is an $N\leq \Delta (\varepsilon ,g)$ such that we have $\lVert S_mT^mx-S_nT^nx \rVert <\varepsilon $ for all $m,n\in [N,N+g(N)]$ .

Proof. Pick a fixed point $f=Tf\in C\subseteq B_{b/2}$ , as justified by the standing assumptions from the introduction. In the conclusion of the proposition, we can replace both expressions $S_kT^kx$ by $y_k:=S_kT^kx-f$ . Note that we get $\lVert y_k \rVert =\theta _k^f$ in the notation from above. Corollary 4.4 yields an $N\leq \Delta (\varepsilon ,g)$ with

(18) $$ \begin{align}\def\arraystretch{1.5} \left. \begin{array}{r@{}} |\theta_m^f-\theta_n^f|<\delta(\varepsilon)\\ \text{and }\beta_{m,n}^l\leq\frac{\delta(\varepsilon)}{4} \end{array}\right\} \quad\text{when } \begin{cases} m,n\in[N,N+g(N)+g^{\prime}_\varepsilon(N)]\\[2pt] \text{and } l\leq N+g(N)+2\cdot g^{\prime}_\varepsilon(N). \end{cases} \end{align} $$

We must show that $\lVert y_m-y_n \rVert <\varepsilon $ for arbitrary $m,n\in [N,N+g(N)]$ , say, with $m\leq n$ . For $\delta (\varepsilon )$ as in Definition 4.5, we put

$$ \begin{align*} k:=\bigg\lceil \frac{6n\cdot b}{\delta(\varepsilon)} \bigg\rceil\leq g_\varepsilon'(N)\quad\text{and}\quad K:=m+k\leq N+g(N)+g_\varepsilon'(N). \end{align*} $$

First, assume that we have $\theta _K^f\leq \varepsilon /4$ . Using (18), we then get

$$ \begin{align*} \lVert y_m-y_n \rVert\leq\lVert y_m \rVert+\lVert y_n \rVert=\theta^f_m+\theta^f_n<2\cdot(\theta^f_K+\delta(\varepsilon))\leq\varepsilon. \end{align*} $$

From now on, we assume that $\theta ^f_K>\varepsilon /4$ , which entails that

(19) $$ \begin{align} (\theta^f_K+\delta(\varepsilon))\cdot\bigg(1-\eta\bigg(\frac{\varepsilon}{2b}\bigg)\bigg)<\theta_K^f-\delta(\varepsilon). \end{align} $$

Normalize $y_m$ and $y_n$ by dividing through by $\theta ^f_K+\delta (\varepsilon )$ , and then apply the contrapositive of (7) with $\varepsilon /(2b)$ in place of $\varepsilon $ . Due to $\theta _K^f,\delta (\varepsilon )\leq b$ this yields

$$ \begin{align*} \bigg\lVert \frac{y_m+y_n}{2}\bigg\rVert>(\theta^f_K+\delta(\varepsilon))\cdot\bigg(1-\eta\bigg(\frac{\varepsilon}{2b}\bigg)\bigg)\,\,\Rightarrow\,\,\lVert y_m-y_n \rVert<(\theta^f_K+\delta(\varepsilon))\cdot\frac{\varepsilon}{2b}\leq\varepsilon. \end{align*} $$

In view of (19), it remains to show that we have $\lVert (y_m+y_n)/2 \rVert \geq \theta _K^f-\delta (\varepsilon )$ . As in the proof of [Reference Kobayasi and Miyadera13, Lemma 3], we first observe that $\lVert y_{l+1}-y_l \rVert $ tends to zero. Since our standing assumption $C\subseteq B_{b/2}$ ensures $\lVert T^lx \rVert \leq b/2$ , we have

$$ \begin{align*} \lVert y_{l+1}-y_l \rVert&=\bigg\lVert \bigg(\frac{1}{l}-\frac{1}{l(l+1)}\bigg)\cdot\sum_{i=1}^{l+1} T^{l+i}x-\frac{1}{l}\cdot\sum_{i=0}^{l-1} T^{l+i}x\bigg\rVert\leq{}\\ {}&\leq\frac{1}{l}\cdot\lVert T^{2l+1}x+T^{2l}x-T^lx \rVert+\frac{1}{l(l+1)}\cdot\sum_{i=1}^{l+1}\lVert T^{l+i}x \rVert\leq\frac{2b}{l}. \end{align*} $$

Adding up yields $\lVert y_{l+i}-y_l \rVert \leq 2b\cdot i/l$ . Applied to $l=m+k=K$ and $i=n-m$ , we obtain

$$ \begin{align*} \lVert y_{n+k}-y_K \rVert\leq 2b\cdot\frac{n-m}{m+k}\leq 2b\cdot\frac{n}{k}\leq\frac{\delta(\varepsilon)}{3}\leq\delta(\varepsilon). \end{align*} $$

We can now use the reverse triangle inequality to infer

(20) $$ \begin{align} \lVert y_{m+k}+y_{n+k} \rVert\geq 2\cdot\lVert y_K \rVert-\lVert y_{n+k}-y_K \rVert\geq 2\cdot\theta_K^f-\delta(\varepsilon). \end{align} $$

On the other hand, the proof of [Reference Kobayasi and Miyadera13, Lemma 3] shows that

$$ \begin{align*} \lVert y_{m+k}+y_{n+k} \rVert\leq\lVert y_m+y_n \rVert+\frac{3n}{m+k}\cdot b+\frac{2}{m+k}\cdot\sum_{i=0}^{m+k-1}\beta_{m,n}^{k+i}. \end{align*} $$

For $i\leq m+k-1$ , we have $k+i\leq m+2k\leq N+g(N)+2\cdot g^{\prime }_\varepsilon (N)$ . By (18) and the choice of k, we can thus infer that $\lVert y_{m+k}+y_{n+k} \rVert \leq \lVert y_m+y_n \rVert +\delta (\varepsilon )$ . Together with inequality (20), we can conclude that

$$ \begin{align*} \lVert y_m+y_n \rVert\geq\lVert y_{m+k}+y_{n+k} \rVert-\delta(\varepsilon)\geq 2\cdot(\theta^f_K-\delta(\varepsilon)), \end{align*} $$

as needed to establish the remaining claim.

We have just analyzed the proof that $(S_nT^nx)$ is a Cauchy sequence. The limit y of this sequence is a fixed point of T, as shown by Kobayasi and Miyadera (still in the proof of [Reference Kobayasi and Miyadera13, Lemma 3]). One might expect that a quantitative version of this result consists in (metastable) bounds on the norms $\lVert TS_nT^nx-S_nT^nx \rVert $ . However, in the proof of [Reference Kobayasi and Miyadera13, Theorem 1], Kobayasi and Miyadera use $T^ly=y$ for arbitrary $l\in \mathbb N$ (rather than just for $l=1$ ). In order to reflect this fact, we will bound $\lVert T^lS_nT^nx-S_nT^nx \rVert $ with a metastable dependency on l.

Definition 4.7. First, extend Definition 4.5 by setting

$$ \begin{align*} \Delta^B(\varepsilon,g,h):=\Theta^B(\delta(\varepsilon),g+g_\varepsilon',\max\{\operatorname{Id}+g+2\cdot g_\varepsilon',h\}). \end{align*} $$

For $\varepsilon>0$ and $g,h:\mathbb N\to \mathbb N$ , we now put

$$ \begin{align*} \Psi(\varepsilon,g,h):=\Delta^B\bigg(\frac{\varepsilon}{4},g^{\prime\prime}_{\varepsilon,h},h\bigg)\quad\text{with } g^{\prime\prime}_{\varepsilon,h}(N):=\max\bigg\{g(N),\bigg\lceil \frac{4b\cdot h(N)}{\varepsilon} \bigg\rceil\bigg\}. \end{align*} $$
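For completeness, here is the corresponding Python sketch for Definition 4.7; Theta_B, eta and b are passed in as before, $\delta (\varepsilon )$ is the quantity from Definition 4.5, and the names are ours.

import math

def make_Delta_B(Theta_B, eta, b):
    """The map Delta^B from Definition 4.7 (first display)."""
    delta = lambda eps: min(b, eps / 4, eps / 8 * eta(eps / (2 * b)))
    def Delta_B(eps, g, h):
        d = delta(eps)
        g_prime = lambda n: math.ceil(6 * b * (n + g(n)) / d)
        return Theta_B(d,
                       lambda n: g(n) + g_prime(n),
                       lambda n: max(n + g(n) + 2 * g_prime(n), h(n)))
    return Delta_B

def make_Psi(Delta_B, b):
    """The map Psi from Definition 4.7 (second display)."""
    def Psi(eps, g, h):
        g_second = lambda n: max(g(n), math.ceil(4 * b * h(n) / eps))  # g''_{eps,h}
        return Delta_B(eps / 4, g_second, h)
    return Psi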

It is straightforward to see that $\Delta ^B$ combines Proposition 4.6 with Corollary 4.4 (by the proof of the proposition and $\delta (\varepsilon )\leq \varepsilon /4$ ; cf. the proof of the corollary).

Corollary 4.8. For any $\varepsilon>0$ and $g,h:\mathbb N\to \mathbb N$ , there is an $N\leq \Delta ^B(\varepsilon ,g,h)$ with $\beta _{m,n}^l\leq \varepsilon /16$ and $\lVert S_mT^mx-S_nT^nx \rVert <\varepsilon $ for $m,n\in [N,N+g(N)]$ and $l\leq h(N)$ .

We are ready to show that $\Psi $ validates statement (5) from the introduction.

Theorem 4.9. For any $\varepsilon>0$ and $g,h:\mathbb N\to \mathbb N$ , there is an $N\leq \Psi (\varepsilon ,g,h)$ such that we have $\lVert T^lS_nT^nx-S_nT^nx \rVert <\varepsilon $ for all $n\in [N,N+g(N)]$ and $l\leq h(N)$ .

Proof. For $g_{\varepsilon ,h}''$ as in Definition 4.7, the corollary yields an $N\leq \Psi (\varepsilon ,g,h)$ with

(21) $$ \begin{align}\def\arraystretch{1.5} \left. \begin{array}{r@{}} \lVert S_mT^mx-S_nT^nx \rVert<\dfrac{\varepsilon}{4}\\[6pt] \text{and }\beta_n^l<\dfrac{\varepsilon}{4} \end{array}\right\} \quad\text{when } \begin{cases} m,n\in[N,N+g_{\varepsilon,h}''(N)]\\[3pt] \text{and } l\leq h(N). \end{cases} \end{align} $$

To establish the conclusion of the theorem, we consider arbitrary $n\in [N,N+g(N)]$ and $l\leq h(N)$ . Set $m:=N+g_{\varepsilon ,h}''(N)$ and observe that

$$ \begin{align*} \lVert T^lS_mT^mx-T^lS_nT^nx \rVert\leq\lVert S_mT^mx-S_nT^nx \rVert<\frac{\varepsilon}{4}. \end{align*} $$

Using the triangle inequality, we can conclude that

$$ \begin{align*} \lVert T^lS_nT^nx-S_nT^nx \rVert\leq\lVert T^lS_mT^mx-S_mT^mx \rVert+\frac{\varepsilon}{2}. \end{align*} $$

In other words, we have reduced the claim for n to the claim for the given m (at the cost of a smaller $\varepsilon $ ). The point is that m is large, so that $S_mT^mx$ is a better approximation of the fixed point $y=Ty$ that is considered in Kobayasi and Miyadera’s proof of [Reference Kobayasi and Miyadera13, Lemma 3]. Also by the triangle inequality, we get

$$ \begin{align*} \lVert T^lS_mT^mx-S_mT^mx \rVert\leq\lVert S_mT^{l+m}x-S_mT^mx \rVert+\beta_m^l. \end{align*} $$

Due to cancellations between the Cesàro sums (for $l\leq m$ , the summands $T^{m+i}x$ with $l\leq i\leq m-1$ occur in both $m\cdot S_mT^{l+m}x$ and $m\cdot S_mT^mx$ ), we have

$$ \begin{align*} \lVert S_mT^{l+m}x-S_mT^mx \rVert\leq\frac{1}{m}\cdot\bigg(\sum_{i=m}^{m+l-1}\lVert T^{m+i}x \rVert+\sum_{i=0}^{l-1}\lVert T^{m+i}x \rVert\bigg). \end{align*} $$

By the standing assumption that $T:C\to C$ has domain $C\subseteq B_{b/2}$ , this entails that

$$ \begin{align*} \lVert S_mT^{l+m}x-S_mT^mx \rVert\leq\frac{l\cdot b}{m}\leq\frac{h(N)\cdot b}{g_{\varepsilon,h}''(N)}\leq\frac{\varepsilon}{4}. \end{align*} $$

Given that (21) provides $\beta _m^l<\varepsilon /4$ , we can combine the previous inequalities to obtain $\lVert T^lS_nT^nx-S_nT^nx \rVert <\varepsilon $ , as desired.
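
In summary, for $n\in [N,N+g(N)]$ , $l\leq h(N)$ and $m=N+g^{\prime \prime }_{\varepsilon ,h}(N)$ , the estimates above combine into

$$ \begin{align*} \lVert T^lS_nT^nx-S_nT^nx \rVert\leq\lVert S_mT^{l+m}x-S_mT^mx \rVert+\beta_m^l+\frac{\varepsilon}{2}<\frac{\varepsilon}{4}+\frac{\varepsilon}{4}+\frac{\varepsilon}{2}=\varepsilon. \end{align*} $$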

In order to obtain the simultaneous bounds from Corollaries 4.4 and 4.8, we have modified $\Theta $ and $\Delta $ into $\Theta ^B$ and $\Delta ^B$ , respectively. In the case of $\Psi $ , we get a simultaneous bound without additional modifications, by the proof of Theorem 4.9.

Corollary 4.10. For any $\varepsilon>0$ and $g,h:\mathbb N\to \mathbb N$ , there is an $N\leq \Psi (\varepsilon ,g,h)$ such that $\lVert T^lS_nT^nx-S_nT^nx \rVert <\varepsilon $ and $\lVert S_mT^mx-S_nT^nx \rVert <\varepsilon /4$ and $\beta _n^l<\varepsilon /4$ hold for all $m,n\in [N,N+g(N)]$ and $l\leq h(N)$ .

Finally, we can specify the map $\Phi $ that validates (4) from the introduction.

Definition 4.11. First, let the map $(\varepsilon ,n)\mapsto N_\varepsilon (n)$ be given by

$$ \begin{align*} N_\varepsilon(n):=\max\bigg\{n,\bigg\lceil \frac{6n\cdot b}{\varepsilon} \bigg\rceil\bigg\}. \end{align*} $$

For $\varepsilon>0$ and $g,h:\mathbb N\to \mathbb N$ , we now set

$$ \begin{align*} \Phi(\varepsilon,g,h):=N_\varepsilon\bigg(\Psi\bigg(\frac{\varepsilon}{2},g',h'\bigg)\bigg)\quad\text{with } \left\{ \begin{aligned} g'(n)&:=N_\varepsilon(n)+g(N_\varepsilon(n)),\\[1ex] h'(n)&:=g'(n)+h(N_\varepsilon(n)). \end{aligned}\right. \end{align*} $$
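
In analogy with the observation after Definition 4.7, the ceiling in the definition of $N_\varepsilon $ ensures that

$$ \begin{align*} \frac{3M}{2\cdot N_\varepsilon(M)}\cdot b\leq\frac{\varepsilon}{4}\quad\text{for all }M\geq 1, \end{align*} $$

which is the estimate used at the end of the proof of Theorem 4.12 below.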

As promised, we get the following theorem.

Theorem 4.12. For any $\varepsilon>0$ and $g,h:\mathbb N\to \mathbb N$ , there is an $N\leq \Phi (\varepsilon ,g,h)$ such that $\lVert S_mT^mx-S_nT^kx \rVert <\varepsilon $ holds for all $m,n\in [N,N+g(N)]$ and $k\leq h(N)$ .

Proof. Use Corollary 4.10 to find an $M\leq \Psi ({\varepsilon }/{2},g',h')$ with

(22) $$ \begin{align}\def\arraystretch{1.5} \left. \begin{array}{r@{}} \lVert T^lS_nT^nx-S_nT^nx \rVert<\dfrac{\varepsilon}{2}\\[6pt] \text{and }\lVert S_mT^mx-S_nT^nx \rVert<\dfrac{\varepsilon}{8}\\[6pt] \text{and }\beta_n^l<\dfrac{\varepsilon}{8} \end{array}\right\} \quad\text{when } \begin{cases} m,n\in[M,M+g'(M)]\\[2pt] \text{and } l\leq h'(M), \end{cases} \end{align} $$

for $g'$ and $h'$ as in Definition 4.11. To establish the conclusion of the theorem for $N:=N_\varepsilon (M)\leq \Phi (\varepsilon ,g,h)$ , consider arbitrary $m,n\in [N,N+g(N)]$ and $k\leq h(N)$ . In view of $n\geq N\geq M$ , the proof of [Reference Kobayasi and Miyadera13, Theorem 1] yields

(23) $$ \begin{align} \lVert S_nT^kx-S_mT^mx \rVert\leq\frac{3M}{2n}\cdot b+\frac{1}{n}\cdot\sum_{i=M}^{n-1}\lVert S_MT^{k+i}x-S_mT^mx \rVert. \end{align} $$

For each $i\in [M,n-1]$ , we can write $k+i=M+l$ with $l\leq h'(M)$ . Using the triangle inequality and (22), it follows that $\lVert S_MT^{k+i}x-S_mT^mx \rVert $ is bounded by

$$ \begin{align*} \beta_M^l+\lVert T^lS_MT^Mx-S_MT^Mx \rVert+\lVert S_MT^Mx-S_mT^mx \rVert<\tfrac34\cdot\varepsilon. \end{align*} $$

By (23) and the definition of $N=N_\varepsilon (M)$ , we can conclude that

$$ \begin{align*} \lVert S_nT^kx-S_mT^mx \rVert<\frac{3M}{2\cdot N_\varepsilon(M)}\cdot b+\frac34\cdot\varepsilon\leq\varepsilon, \end{align*} $$

just as the theorem claims.

Finally, we show that all relevant quantities can be bounded simultaneously. This yields the main result of our paper, which was stated in the introduction.

Proof of Theorem 1.2. The proof of Theorem 4.12 yields an $N\leq \Phi (2\varepsilon /5,g,h)$ such that

$$ \begin{align*} \lVert S_mT^mx-S_nT^kx \rVert<2\varepsilon/5<\varepsilon\quad\text{and}\quad\lVert T^lS_nT^nx-S_nT^nx \rVert<\varepsilon/5<\varepsilon \end{align*} $$

for all $m,n\in [N,N+g(N)]$ and $k,l\leq h(N)$ . Using the triangle inequality and the fact that T is non-expansive, we also get

$$ \begin{align*} \lVert T^lS_nT^kx-S_nT^kx \rVert\leq 2\cdot\lVert S_nT^kx-S_NT^Nx \rVert+\lVert T^lS_NT^Nx-S_NT^Nx \rVert<2\cdot\frac{2\varepsilon}{5}+\frac{\varepsilon}{5}=\varepsilon, \end{align*} $$

still for $n\in [N,N+g(N)]$ and $k,l\leq h(N)$ .

5 Bounds on asymptotic isometry

Theorem 1.1 (due to Kobayasi and Miyadera) involves the condition that T is asymptotically isometric on $\{x\}$ , or, explicitly, that the values $\alpha ^i_n=\lVert T^nx-T^{n+i}x \rVert $ converge for $n\to \infty $ uniformly in $i\in \mathbb N$ . On the quantitative side, this corresponds to one of our standing assumptions: throughout the previous sections, we have assumed that we are given a map $(\varepsilon ,g,h)\mapsto A(\varepsilon ,g,h)$ that validates statement (10). In the introduction, we have mentioned three cases in which a suitable A can be constructed. The first case is covered in a previous paper by the second author [Reference Kohlenbach19]: for an asymptotically regular map T satisfying Wittmann’s condition on a uniformly convex Banach space, Theorem 2.2 of the cited paper (with $g'(n):=g(n)+h(n)$ and $\varepsilon ':=\varepsilon /2$ ) yields a bound on an N such that

$$ \begin{align*} |\alpha^i_m-\alpha^i_n|\leq|\alpha^i_m|+|\alpha^i_n|<\varepsilon\quad\text{for all } m,n\in[N,N+g(N)] \text{ and } i\leq h(N). \end{align*} $$

Two further constructions in different cases are provided in the present section. Note that each construction yields a version of Theorems 4.9 and 4.12 in which $\Phi $ and $\Psi $ no longer depend on A (but possibly on new data such as $\Gamma $ in (26) below).

For the first part of this section, we consider a non-expansive map $T:C\to C$ on a subset C of a Hilbert space. If T is odd on $C=-C\ni 0$ , then we clearly have

(24) $$ \begin{align} T0=0\in C\quad\text{and}\quad\lVert Tx+Ty \rVert\leq\lVert x+y \rVert\quad\text{for all }x,y\in C. \end{align} $$

In the following, we do not assume that T is odd but we do require that it satisfies (24). This condition has been studied by Wittmann [Reference Wittmann32] and plays an important role in several applications of proof mining (cf. [Reference Kohlenbach19, Reference Kohlenbach21, Reference Safarik30]). Most standing assumptions from the introduction are not needed for the following, but we still assume that $C\subseteq B_{b/2}$ (actually, it suffices here to assume that $\| x\|\le {b}/{2}$ ).
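
For later use (in the proof of Lemma 5.1 below), we note that (24) can be iterated: as T maps C into C, a straightforward induction yields

$$ \begin{align*} \lVert T^kx+T^ky \rVert\leq\lVert x+y \rVert\quad\text{for all }x,y\in C\text{ and }k\in\mathbb N. \end{align*} $$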

Our quantitative analysis is based on the following result, which is implicit in the proof of [Reference Brézis and Browder3, Theorem 2] (cf. also [Reference Bruck4, Theorem 2.3]). Let us point out that the sequence of norms $\lVert T^nx \rVert $ is non-increasing, since T is non-expansive with $T0=0$ . We have already seen that the sequence $(\alpha ^i_n)$ is non-increasing for each $i\in \mathbb N$ .

Lemma 5.1. We have $(\alpha ^i_m)^2-(\alpha ^i_{m+k})^2\leq 4\cdot (\lVert T^mx \rVert ^2-\lVert T^{m+k+i}x \rVert ^2)$ .

Proof. Using binomial expansion and the fact that $(\lVert T^nx \rVert )$ is non-increasing, we learn that $(\alpha ^i_m)^2-(\alpha ^i_{m+k})^2$ is bounded by

$$ \begin{align*} 2\cdot(\lVert T^mx \rVert^2-\lVert T^{m+k+i}x \rVert^2+\langle T^{m+k}x,T^{m+k+i}x\rangle-\langle T^mx,T^{m+i}x\rangle). \end{align*} $$
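
In detail, the expansions in question are

$$ \begin{align*} (\alpha^i_n)^2=\lVert T^nx \rVert^2-2\langle T^nx,T^{n+i}x\rangle+\lVert T^{n+i}x \rVert^2\quad\text{for }n\in\{m,m+k\}, \end{align*} $$

and one uses the estimates $\lVert T^{m+i}x \rVert \leq \lVert T^mx \rVert $ and $\lVert T^{m+k+i}x \rVert \leq \lVert T^{m+k}x \rVert $ .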

Hence, the claim reduces to

$$ \begin{align*} \langle T^{m+k}x,T^{m+k+i}x\rangle-\langle T^mx,T^{m+i}x\rangle\leq\lVert T^mx \rVert^2-\lVert T^{m+k+i}x \rVert^2. \end{align*} $$

To obtain the latter, iterate (24) to get $\lVert T^{m+k}x+T^{m+k+i}x \rVert ^2\leq \lVert T^mx+T^{m+i}x \rVert ^2$ . Now consider binomial expansions, and use the fact that $(\lVert T^nx \rVert )$ is non-increasing.

In view of the standing assumption $C\subseteq B_{b/2}$ , the values $\lVert T^nx \rVert ^2$ form a non-increasing sequence of reals in $[0,b^2/4]$ . Let us recall the known rate of metastability for such sequences. Given $g:\mathbb N\to \mathbb N$ , we define $\widetilde g:\mathbb N\to \mathbb N$ by $\widetilde g(n):=n+g(n)$ . The iterates $\widetilde g^i:\mathbb N\to \mathbb N$ are given by the recursive clauses

$$ \begin{align*} \widetilde g^0(n):=n\quad\text{and}\quad \widetilde g^{i+1}(n):=\widetilde g(\widetilde g^i(n)). \end{align*} $$

By [Reference Kohlenbach17, Proposition 2.27] (cf. also the beginning of §4 above),

(25) $$ \begin{align} &|\lVert T^mx \rVert^2-\lVert T^nx \rVert^2|<\varepsilon\quad\text{for some }N\leq\widetilde g^{\lceil b^2/(4\varepsilon) \rceil}(0)\nonumber\\[2pt] &\qquad\qquad\qquad\qquad\qquad\quad\text{ and all }m,n\in[N,N+g(N)], \end{align} $$

for any $\varepsilon>0$ and $g:\mathbb N\to \mathbb N$ . We will see that statement (10) from the introduction holds with the following map $A_1$ in place of A.

Definition 5.2. For $\varepsilon>0$ and $g,h:\mathbb N\to \mathbb N$ , we set

$$ \begin{align*} A_1(\varepsilon,g,h):=\widetilde{g+h}^{\lceil b^2/\varepsilon^2 \rceil}(0). \end{align*} $$
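
To illustrate the size of this bound, consider constant functions $g(n)=c$ and $h(n)=d$ (chosen only for concreteness): then $\widetilde{g+h}(n)=n+c+d$ , so that

$$ \begin{align*} A_1(\varepsilon,g,h)=(c+d)\cdot\bigg\lceil \frac{b^2}{\varepsilon^2} \bigg\rceil. \end{align*} $$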

As promised, we obtain the following result.

Proposition 5.3. Consider a non-expansive map $T:C\to C$ on a set $C\subseteq B_{b/2}$ in Hilbert space. If T satisfies (24), then any $\varepsilon>0$ and $g,h:\mathbb N\to \mathbb N$ admit a number $N\leq A_1(\varepsilon ,g,h)$ with $|\alpha ^i_m-\alpha ^i_n|<\varepsilon $ for all $m,n\in [N,N+g(N)]$ and $i\leq h(N)$ .

Proof. By (25) with $\varepsilon ^2/4$ and $g+h$ in place of $\varepsilon $ and g, respectively, we find a number $N\leq A_1(\varepsilon ,g,h)$ with

$$ \begin{align*} |\lVert T^mx \rVert^2-\lVert T^lx \rVert^2|<\dfrac{\varepsilon^2}{4}\quad\text{for all }m,l\in[N,N+g(N)+h(N)]. \end{align*} $$

Given $m\leq n$ in $[N,N+g(N)]$ and $i\leq h(N)$ , combine this inequality for $l=n+i$ with Lemma 5.1 for $k=n-m$ . This yields

$$ \begin{align*} (\alpha^i_m-\alpha^i_n)^2\leq(\alpha^i_m)^2-(\alpha^i_n)^2\leq 4\cdot(\lVert T^mx \rVert^2-\lVert T^{n+i}x \rVert^2)<\varepsilon^2, \end{align*} $$

and hence $0\leq \alpha ^i_m-\alpha ^i_n<\varepsilon $ , as desired.

For the second part of this section, we return to the case of a Banach space. Our aim is to satisfy (10) when $(T^nx)$ has a convergent subsequence. The latter entails that there are $M,N\in \mathbb N$ as in the following lemma. We point out that the lemma is a quantitative version of a step in Bruck’s proof of [Reference Bruck4, Theorem 2.4]. Also note that the lemma establishes (10) whenever we have $N\leq A(\varepsilon ,g,h)=A(\varepsilon ,g)$ , independently of the function $h:\mathbb N\to \mathbb N$ that provides the bound $i\leq h(N)$ in (10).

Lemma 5.4. For $\varepsilon>0$ and $g:\mathbb N\to \mathbb N$ , assume that we are given $M,N\in \mathbb N$ with $M\geq N+g(N)$ and $\lVert T^Mx-T^Nx \rVert <\varepsilon /2$ . We then have $|\alpha _m^i-\alpha ^i_n|<\varepsilon $ for all numbers $m,n\in [N,N+g(N)]$ and $i\in \mathbb N$ .

Proof. From $\lVert T^Mx-T^Nx \rVert <\varepsilon /2$ , we get $\lVert T^{M+i}x-T^{N+i}x \rVert <\varepsilon /2$ , as T is non-expansive by a standing assumption. We recall that $\alpha ^i_N=\lVert T^Nx-T^{N+i}x \rVert $ to conclude that

$$ \begin{align*} \alpha^i_N\leq\lVert T^Nx-T^Mx \rVert+\lVert T^Mx-T^{M+i}x \rVert+\lVert T^{M+i}x-T^{N+i}x \rVert<\varepsilon+\alpha^i_M. \end{align*} $$

The claim follows since $\alpha ^i_n$ is non-increasing in n (again, because T is non-expansive): for $m,n\in [N,N+g(N)]$ we get $\alpha ^i_M\leq \alpha ^i_m,\alpha ^i_n\leq \alpha ^i_N<\varepsilon +\alpha ^i_M$ and hence $|\alpha ^i_m-\alpha ^i_n|<\varepsilon $ .

The assumption that $(T^nx)$ has a convergent subsequence is satisfied when the domain C of our map $T:C\to C$ (or just the set $\{T^nx\mid n\in \mathbb N\}$ ) is compact or, equivalently, (closed and) totally bounded. On the quantitative side, this last property can be witnessed by a modulus of total boundedness (in the sense of Gerhardy [Reference Gerhardy10]), i.e., by a function $\gamma :(0,\infty )\to \mathbb N$ that satisfies the following: for any $\varepsilon>0$ and any sequence $(x_i)$ in C, we have $\lVert x_j-x_i \rVert \leq \varepsilon $ for some $i<j\leq \gamma (\varepsilon )$ (see [Reference Kohlenbach, Leuştean and Nicolae25] for a metatheorem on uniform bound extractions for spaces given with such a modulus). We make an (apparently) weaker assumption, where $\gamma $ may depend on the sequence and only the smaller index is controlled: for the following, we assume that we are given a map $(\varepsilon ,g)\mapsto \Gamma (\varepsilon ,g)$ that guarantees

(26) $$ \begin{align} \lVert T^{g(j)}x-T^{g(i)}x \rVert<\varepsilon\quad\text{for some } i\leq\Gamma(\varepsilon,g) \text{ and some } j>i. \end{align} $$

Recall the notation $\widetilde g$ and $\widetilde g^{i}$ from the paragraph before Definition 5.2.

Definition 5.5. For $\varepsilon>0$ and $g:\mathbb N\to \mathbb N$ , we put

$$ \begin{align*} A_2(\varepsilon,g):=\widetilde g^K(0)\quad\text{with } K:=\Gamma\bigg(\frac{\varepsilon}{2},l\mapsto\widetilde g^l(0)\bigg). \end{align*} $$

Here, $(\varepsilon ,g)\mapsto \Gamma (\varepsilon ,g)$ is a given map that satisfies (26).
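
As an illustration, consider a constant function $g(n)=c$ : then $\widetilde g^l(0)=l\cdot c$ , so that Definition 5.5 evaluates to

$$ \begin{align*} A_2(\varepsilon,g)=c\cdot\Gamma\bigg(\frac{\varepsilon}{2},l\mapsto l\cdot c\bigg). \end{align*} $$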

To conclude, we show that (10) holds with $(\varepsilon ,g,h)\mapsto A_2(\varepsilon ,g,h):=A_2(\varepsilon ,g)$ in place of A. Given that $A_2$ is independent of $h:\mathbb N\to \mathbb N$ , the condition $i\leq h(N)$ can be dropped in the present situation.

Proposition 5.6. Assume that $T:C\to C$ is a non-expansive map on a totally bounded subset of a Banach space, where the total boundedness for the sequences $(T^{g(n)}x)$ is witnessed by a given map $(\varepsilon ,g)\mapsto \Gamma (\varepsilon ,g)$ as in (26). For any $\varepsilon>0$ and $g:\mathbb N\to \mathbb N$ , we can then find an $N\leq A_2(\varepsilon ,g)$ such that $|\alpha _m^i-\alpha _n^i|<\varepsilon $ holds for all $m,n\in [N,N+g(N)]$ and $i\in \mathbb N$ .

Proof. The given map $\Gamma $ from (26), applied with $\varepsilon /2$ and $l\mapsto \widetilde g^l(0)$ in place of $\varepsilon $ and g, provides $i\leq \Gamma (\varepsilon /2,l\mapsto \widetilde g^l(0))$ and $j>i$ such that $M:=\widetilde g^j(0)$ and $N:=\widetilde g^i(0)$ satisfy $\lVert T^Mx-T^Nx \rVert <\varepsilon /2$ . Note that the function $l\mapsto \widetilde g^l(0)$ is non-decreasing because of $\widetilde g(n)=n+g(n)\geq n$ . This allows us, first, to infer $N\leq A_2(\varepsilon ,g)$ from the definition of $A_2$ . Secondly, we learn that

$$ \begin{align*} M=\widetilde g^j(0)\geq\widetilde g^{i+1}(0)=\widetilde g(\widetilde g^i(0))=\widetilde g(N)=N+g(N), \end{align*} $$

so that we can conclude the proof using Lemma 5.4.

Remark 5.7. In the finite dimensional case $X:=\mathbb R^N$ endowed with the Euclidean norm, a modulus of total boundedness for $B_{b/2}$ can be taken as

$$\begin{align*}\gamma(\varepsilon):=\bigg\lceil \frac{\sqrt{N}b}{\varepsilon}\bigg\rceil^{N} \end{align*}$$

(see [Reference Kohlenbach, Leuştean and Nicolae25, Example 2.8]), so we may take $\Gamma (\varepsilon ,g):=\gamma (\varepsilon /2)$ independently of $g,T,x$ .
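
In this setting, the bound from Definition 5.5 specializes to

$$ \begin{align*} A_2(\varepsilon,g)=\widetilde g^{K}(0)\quad\text{with } K=\gamma\bigg(\frac{\varepsilon}{4}\bigg), \end{align*} $$

so that the number K of iterations no longer depends on $g$ , T or x.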

Final Comment. An inspection of the proofs above shows that the quantitative results do not depend on the assumptions that X is complete or that C is closed.

Acknowledgment

Both authors were supported by the ‘Deutsche Forschungsgemeinschaft’ (DFG, German Research Foundation) – Projects 460597863, DFG KO 1737/6-1 and DFG KO 1737/6-2.

References

[1] Avigad, J., Gerhardy, P. and Towsner, H. Local stability of ergodic averages. Trans. Amer. Math. Soc. 362 (2010), 261–288.
[2] Avigad, J. and Rute, J. Oscillation and the mean ergodic theorem for uniformly convex Banach spaces. Ergod. Th. & Dynam. Sys. 35 (2015), 1009–1027.
[3] Brézis, H. and Browder, F. Nonlinear ergodic theorems. Bull. Amer. Math. Soc. (N.S.) 82(6) (1976), 959–961.
[4] Bruck, R. E. On the almost-convergence of iterates of a nonexpansive mapping in Hilbert space and the structure of the weak $\omega$-limit set. Israel J. Math. 29 (1978), 1–16.
[5] Bruck, R. E. A simple proof of the mean ergodic theorem for nonlinear contractions in Banach spaces. Israel J. Math. 32 (1979), 107–116.
[6] Bruck, R. E. On the convex approximation property and the asymptotic behavior of nonlinear contractions in Banach spaces. Israel J. Math. 38 (1981), 304–314.
[7] Clarkson, J. A. Uniformly convex spaces. Trans. Amer. Math. Soc. 40(3) (1936), 396–414.
[8] Dilworth, S. J., Howard, R. and Roberts, J. W. On the size of approximately convex sets in normed spaces. Studia Math. 140 (2000), 213–241.
[9] Freund, A. Proof lengths for instances of the Paris–Harrington principle. Ann. Pure Appl. Logic 168 (2017), 1361–1382.
[10] Gerhardy, P. Proof mining in topological dynamics. Notre Dame J. Form. Log. 49(4) (2008), 431–446.
[11] Haagerup, U. The best constants in the Khintchine inequality. Studia Math. 70 (1982), 231–283.
[12] Hanner, O. On the uniform convexity of ${L}^p$ and ${l}^p$. Ark. Mat. 3(3) (1956), 239–244.
[13] Kobayasi, K. and Miyadera, I. On the strong convergence of the Cesàro means of contractions in Banach spaces. Proc. Japan Acad. 56 (1980), 245–249.
[14] Kohlenbach, U. On the computational content of the Krasnoselski and Ishikawa fixed point theorems. Computability and Complexity in Analysis (Lecture Notes in Computer Science, 2064). Eds. Blanck, J., Brattka, V. and Hertling, P. Springer, Berlin, 2001, pp. 119–145.
[15] Kohlenbach, U. Uniform asymptotic regularity for Mann iterates. J. Math. Anal. Appl. 279(2) (2003), 531–544.
[16] Kohlenbach, U. Some logical metatheorems with applications in functional analysis. Trans. Amer. Math. Soc. 357(1) (2005), 89–128.
[17] Kohlenbach, U. Applied Proof Theory: Proof Interpretations and their Use in Mathematics (Springer Monographs in Mathematics). Springer, Berlin, 2008.
[18] Kohlenbach, U. On quantitative versions of theorems due to F. E. Browder and R. Wittmann. Adv. Math. 226(3) (2011), 2764–2795.
[19] Kohlenbach, U. On the asymptotic behavior of odd operators. J. Math. Anal. Appl. 382 (2011), 615–620.
[20] Kohlenbach, U. A uniform quantitative form of sequential weak compactness and Baillon's nonlinear ergodic theorem. Commun. Contemp. Math. 14(1) (2012), 1250006.
[21] Kohlenbach, U. On the quantitative asymptotic behavior of strongly nonexpansive mappings in Banach and geodesic spaces. Israel J. Math. 216 (2016), 215–246.
[22] Kohlenbach, U. Quantitative results on the proximal point algorithm in uniformly convex Banach spaces. J. Convex Anal. 28(1) (2021), 11–18.
[23] Kohlenbach, U. and Leuştean, L. A quantitative mean ergodic theorem for uniformly convex Banach spaces. Ergod. Th. & Dynam. Sys. 29 (2009), 1907–1915.
[24] Kohlenbach, U. and Leuştean, L. Asymptotically nonexpansive mappings in uniformly convex hyperbolic spaces. J. Eur. Math. Soc. (JEMS) 12 (2010), 71–92.
[25] Kohlenbach, U., Leuştean, L. and Nicolae, A. Quantitative results on Fejér monotone sequences. Commun. Contemp. Math. 20(2) (2018), 1750015.
[26] Kohlenbach, U. and Sipoş, A. The finitary content of sunny nonexpansive retractions. Commun. Contemp. Math. 23(1) (2021), 1950093.
[27] Ledoux, M. and Talagrand, M. Probability in Banach Spaces (Ergebnisse der Mathematik und ihrer Grenzgebiete, 23). Springer, Berlin, 1991.
[28] Pisier, G. Sur les espaces qui ne contiennent pas de ${l}_n^1$ uniformément. Séminaire Maurey–Schwartz (1973–1974), Espaces ${L}^p$, Applications Radonifiantes et Géométrie des Espaces de Banach. Centre de Mathématiques, École Polytechnique, Paris, 1974, Exp. No. 7.
[29] Powell, T. A new metastable convergence criterion and an application in the theory of uniformly convex Banach spaces. J. Math. Anal. Appl. 478(2) (2019), 790–805.
[30] Safarik, P. A quantitative nonlinear strong ergodic theorem for Hilbert spaces. J. Math. Anal. Appl. 391(1) (2012), 26–37.
[31] Tao, T. Structure and Randomness: Pages from Year One of a Mathematical Blog. American Mathematical Society, Providence, RI, 2008.
[32] Wittmann, R. Mean ergodic theorems for nonlinear operators. Proc. Amer. Math. Soc. 108(3) (1990), 781–788.