
Lifting generic points

Published online by Cambridge University Press:  05 February 2024

TOMASZ DOWNAROWICZ*
Affiliation:
Faculty of Pure and Applied Mathematics, Wrocław University of Technology, Wrocław, Poland
BENJAMIN WEISS
Affiliation:
Einstein Institute of Mathematics, The Hebrew University of Jerusalem, Jerusalem, Israel

Abstract

Let $(X,T)$ and $(Y,S)$ be two topological dynamical systems, where $(X,T)$ has the weak specification property. Let $\xi $ be an invariant measure on the product system $(X\times Y, T\times S)$ with marginals $\mu $ on X and $\nu $ on Y, with $\mu $ ergodic. Let $y\in Y$ be quasi-generic for $\nu $. Then there exists a point $x\in X$ generic for $\mu $ such that the pair $(x,y)$ is quasi-generic for $\xi $. This is a generalization of a similar theorem by T. Kamae, in which $(X,T)$ and $(Y,S)$ are full shifts on finite alphabets.

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

1. Introduction

All terminology used freely in this introduction is explained in the preliminaries (§2).

Let $\pi :(X,T) \rightarrow (Y,S)$ be an extension of a compact dynamical system $(Y,S)$ and suppose that $\nu $ is an ergodic measure for S. This measure can always be lifted to an invariant measure on X (by the Hahn–Banach theorem). It then follows that there exists an ergodic measure $\mu $ that projects to $\nu $ . Clearly, any generic point for $\mu $ will project to a generic point for $\nu $ . It is natural to ask whether all $\nu $ -generic points can be obtained in this way. In other words: does every $\nu $ -generic point have a $\mu $ -generic lift? It is not difficult to show that if $\mu $ is a unique lift of $\nu $ then the answer is yes. In fact, in this case if $y \in Y$ is generic for $\nu $ then any $x\in \pi ^{-1}(y)$ will be generic for $\mu $ . However, if the extension of $\nu $ is not unique then the answer might be negative. Such examples can be obtained as follows.

Consider a minimal almost one-to-one extension $\pi :X \rightarrow Y$ where Y is strictly ergodic but X is not (cf. Furstenberg and Weiss [FW] for examples of such systems). Then all invariant measures on X project to the unique invariant measure $\nu$ on Y. In this situation all points in Y are generic for $\nu$. Now, let $y\in Y$ be a point with a unique preimage $x\in X$ (by assumption such a point exists). Then either x is not generic for any invariant measure on X or it is generic for exactly one such measure (say, $\mu _0$). In either case, since X carries more than one invariant measure, there exists an invariant measure $\mu _1$ on X such that x is not generic for $\mu _1$. Thus, the point y, although generic for $\nu$, does not lift to a point generic for the lift $\mu _1$ of $\nu$.

However, this example does not provide an answer to a more subtle question: does every $\nu $ -generic point have a generic lift (without specifying for which measure extending $\nu $ )? In general, the answer to such a relaxed question is also negative. We will show this using the same example as before and the following theorem.

Theorem 1.1. Let $(X,T)$ be a topological dynamical system with (at least) two different ergodic measures $\mu $ and $\nu $ , both having full topological support. Then there exist a dense $G_{\delta }$ -set $B\subset X$ and a continuous function f on X such that for any $x\in B$ the ergodic averages

$$ \begin{align*} A_n(f,x)=\frac1n\sum_{i=0}^{n-1}f(T^ix) \end{align*} $$

oscillate.

Proof. Since the measures differ, there exists a continuous function f on X whose integral with respect to $\mu $ is greater than one while its integral with respect to $\nu $ is less than zero. Now, for a natural number N we define

$$ \begin{align*} E_N = \bigg\{ x \in X : \text{there exists } n> N \text{ such that } \frac{1}n\sum_{i=0}^{n-1} f(T^ix) > 1\bigg\}. \end{align*} $$

This set is clearly open, and it is dense since the generic points for $\mu $ are dense. Define a similar set $F_N$ replacing ‘ $>1$ ’ by ‘ $<0$ ’. Then the desired set B is the countable intersection

$$ \begin{align*} B=\bigcap_{N\ge1}(E_N\cap F_N). \end{align*} $$

By the Baire theorem, this set is a dense $G_\delta $ , and clearly no $x\in B$ is generic for any measure.

Now let us go back to the example. Since the system $(X,T)$ is minimal, all its invariant measures have full topological support. By Theorem 1.1, there is a dense $G_\delta $ -set B of points which are not generic for any measure. As in any minimal almost one-to-one extension, the ‘singleton fibers’ (that is, points which are unique preimages of their images) also form a dense $G_\delta $ -set (call it A) in X. Then the intersection $A\cap B$ is non-empty and any point in its image is generic (for $\nu $ ) but has no generic lift.

An even more extreme situation occurs in yet another example. In [F], Furstenberg constructs a non-uniquely ergodic minimal skew product on the $2$-torus over an irrational rotation, where the fiber maps are also rotations. Since rotation by a fixed angle on the fibers commutes with the skew product, in every fiber either each point is generic for some invariant measure or none of them is. Theorem 1.1 implies that there are points not generic for any measure (actually, this property is explicit in [F]: this is how Furstenberg showed non-unique ergodicity of the skew product). It follows that there are entire fibers (circles) with no generic points even though in the base all points are generic.

Before we discuss a positive result we need to mention two important issues. The first one is the phenomenon of quasi-generating invariant measures, that is, generating them along a subsequence of averages. Replacing the term ‘generic’ by ‘quasi-generic’ may lead to either stronger or weaker results, depending on where the replacement is done (in the assumption or in the thesis). The second issue is a specific way of extending a system by joining it with another system. Any extension can be viewed practically as a joining of the system with its extension (the joining is then supported by the graph of the factor map), but it is often essential to know that the extension can be obtained as a joining with a system having some specific properties (ergodicity, specification property, etc.) which the entire extension does not necessarily enjoy.

In the early 1970s, Teturo Kamae studied normal sequences and the phenomenon of normality-preserving subsequences. In symbolic dynamics a sequence over a finite alphabet is normal if it is generic for the uniform Bernoulli measure. An increasing subsequence of natural numbers $y=(n_k)_{k\ge 1}$ preserves normality if $x|_y=(x_{n_k})_{k\ge 1}$ is normal for any normal sequence $(x_n)_{n\ge 1}$. A few years earlier, Weiss [W] proved that subsequences of positive lower density which are completely deterministic preserve normality. A subsequence y is completely deterministic if its indicator function, viewed as an element of the shift on two symbols, quasi-generates only measures of entropy zero. Kamae [K] proved the opposite implication: only completely deterministic subsequences preserve normality. Given a non-deterministic subsequence y (that is, one that quasi-generates some measure $\nu$ of positive entropy), he needed to find a normal sequence x such that $x|_y$ is not normal. Skipping the details, let us just say that he needed to ‘pair’ the subsequence y with a normal (that is, generic for the uniform Bernoulli measure $\lambda$) sequence x, such that the pair $(x,y)$ is generic for a specific joining $\xi$ of $\lambda$ and $\nu$. In order to do so, he proved a more general theorem, which motivates our current work. We take the liberty of rephrasing the statement in the language that we use throughout this paper.

Theorem 1.2. [K]

Let $\xi $ be a joining of two invariant measures, $\mu $ and $\nu $ , supported on symbolic systems $\Lambda _1^{\mathbb {N}}$ and $\Lambda _2^{\mathbb {N}}$ , respectively ( $\Lambda _1$ and $\Lambda _2$ are finite alphabets). Let $y\in \Lambda _2^{\mathbb {N}}$ be quasi-generic for $\nu $ , that is, it generates $\nu $ along a subsequence of averages indexed by $\mathcal J=(n_k)_{k\ge 1}$ . Then there exists $x\in \Lambda _1^{\mathbb {N}}$ such that the pair $(x,y)$ generates $\xi $ along $\mathcal J$ . If $\mu $ is ergodic then x can be chosen generic for $\mu $ .

This theorem found another application in the work of Rauzy [R], who studied normality preservation in a different sense. Let us identify all real numbers with their expansions in some fixed base $b\ge 2$. A real number is called normal (in base b) if its expansion is a normal sequence. A real number y preserves normality if $x+y$ is normal for any normal number x. Rauzy proved that a number y preserves normality if and only if the expansion of y is completely deterministic.

Notice that Theorem 1.2 is actually very strong. First of all, it applies to any situation when a ‘symbolic’ measure $\nu $ is lifted to a ‘symbolic’ measure $\xi $ . Also note that $\nu $ is not assumed ergodic, it suffices that it admits a quasi-generic point y (which is always true within a full shift). If $\mathcal J={\mathbb {N}}$ then y is simply generic for $\nu $ and the theorem allows it to be lifted to a pair $(x,y)$ generic for $\xi $ . Even when $\mathcal J$ is an essential subsequence (and there is no hope of making the lift $(x,y)$ generic), as soon as $\mu $ is ergodic, the point x ‘paired’ with y still can be generic rather than just quasi-generic. The only weakness of the theorem is that x is found within the full shift $\Lambda _1^{\mathbb {N}}$ , even when $\mu $ is supported by a proper subshift. In other words, the theorem does not allow y to be lifted within an a priori given topological (symbolic) extension.

Our paper focuses exactly on this problem. Our goal is to find conditions under which the ‘paired’ point x (generic for $\mu $ ) can be found within the a priori given topological system $(X,T)$ being joined with $(Y,S)$ . The conditions turn out to be ergodicity of $\mu $ (like in the original theorem), and the weak specification property of $(X,T)$ . We prove the following theorem.

Theorem 1.3. Let $(X,T)$ and $(Y,S)$ be topological dynamical systems and let $\xi $ be an invariant measure on the product system $(X\times Y,T\times S)$ with marginals $\mu $ and $\nu $ on X and Y, respectively. Assume that the system $(X,T)$ has the weak specification property and that $\mu $ is ergodic under T. Suppose also that $y\in Y$ is quasi-generic for the measure $\nu $ . Then there exists a point $x\in X$ , generic for $\mu $ , such that the pair $(x,y)$ is quasi-generic for $\xi $ .

Let us mention that the weak specification property is satisfied by many systems such as ergodic mixing Markov shifts, ergodic toral automorphisms, and in fact any endomorphisms of compact Abelian groups for which the Haar measure is ergodic (see [D] and use natural extension). An advantage of our result is that it is not restricted to symbolic systems and that x is found within the space X. A disadvantage is that the pair $(x,y)$ is only quasi-generic for $\xi$, even when $\mathcal J={\mathbb {N}}$.

The strength of Theorem 1.3 lies in the possibility of lifting any generic point (not just almost any) y to a pair $(x,y)$ quasi-generic for $\xi $ . If $\xi $ is ergodic, such a possibility for $\nu $ -almost all y is a trivial fact. Thus, the theorem can be useful when topological, rather than measure-theoretic, precision is required.

A concrete application of Theorem 1.3 occurs in the forthcoming paper [BD], where Rauzy’s equivalence between normality preservation and determinism is generalized to a wider context. That is to say, the following problem is addressed.

Question 1.4. Let $T:X\to X$ be a surjective endomorphism of a compact metrizable Abelian group, such that the Haar measure $\lambda$ on X is T-ergodic and has finite entropy. Let us call a point $x\in X$ normal if it is generic for $\lambda$. Is it true that y preserves normality (that is, $x+y$ is normal for any normal $x\in X$) if and only if y is completely deterministic?

In [BD] we prove sufficiency relatively easily, but the harder direction (necessity) is shown only for selected groups X (tori, solenoids, and countable direct products $\bigoplus _{n\ge 1}\mathbb Z_d$ ($d\ge 2$); the necessity in full generality remains open). In all these cases Theorem 1.3 plays a crucial role in the proofs.

Our paper is organized as follows. Section 2 contains all necessary definitions and notational conventions. In §3 we provide three key lemmas together with auxiliary propositions needed in their proofs. The propositions are quite standard while the lemmas may be considered of independent interest. Finally, in §4 we present the proof of Theorem 1.3.

2. Preliminaries

Let $(X,T)$ be a topological dynamical system, where X is a compact metric space and T is a continuous surjection. By $\mathcal M(X)$ we will denote the set of all Borel probability measures on X. Since no other measures will be considered, the elements of $\mathcal M(X)$ will be called measures for short. By $\mathcal M_T(X)$ we will denote the subset of $\mathcal M(X)$ containing all measures that are T-invariant, that is, such that $\mu (T^{-1}A)=\mu (A)$ for all Borel sets $A\subset X$ . When the transformation T is fixed, the elements of $\mathcal M_T(X)$ will be called invariant measures. The sets $\mathcal M(X)$ and $\mathcal M_T(X)$ are equipped with the weak* topology, which makes both these sets compact convex and metrizable with a convex metric. (By definition, a sequence $(\mu _n)_{n\ge 1}$ of measures converges in the weak* topology to a measure $\mu $ if, for any continuous (real or complex) function f on X, the integrals $\int f\,d\mu _n$ converge to $\int f\,d\mu $ . One of the standard convex metrics compatible with this topology is given by

$$ \begin{align*} d(\mu,\nu)=\sum_{n=1}^\infty 2^{-n}\bigg|\int f_n\,d\mu - \int f_n\,d\nu\bigg|, \end{align*} $$

where $(f_n)_{n\ge 1}$ is a sequence of continuous functions on X with values in the interval $[0,1]$ , linearly dense in the space $C(X)$ of all continuous real functions on X.) It is well known that the extreme points of $\mathcal M_T(X)$ are precisely the ergodic measures, that is, invariant measures $\mu $ such that $\mu (A\,\triangle \, T^{-1}A)=0\! \implies \! \mu (A)\in \{0,1\}$ , for any Borel set $A\subset X$ .

We will be using the following notation. For two integers $a\le b$ , we will denote by $[a,b]$ the interval of integers $\{a,a+1,a+2,\ldots ,b\}$ . Given a point $x\in X$ and $0\le a\le b$ , we denote by $x[a,b]$ the ordered finite segment of the orbit of x,

$$ \begin{align*} x[a,b]=(T^ax,T^{a+1}x,\ldots,T^bx), \end{align*} $$

while by $\mu _{x[a,b]}$ we will understand the normalized counting measure on $x[a,b]$ ,

$$ \begin{align*} \mu_{x[a,b]}=\frac1{b-a+1}\sum_{n=a}^b\delta_{T^nx} \end{align*} $$

(here $\delta _x$ denotes the Dirac measure concentrated at x). We will call this measure the empirical measure associated with the orbit segment.
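To make the notation $x[a,b]$ and $\mu _{x[a,b]}$ concrete, here is a small sketch under stated assumptions: the system is taken to be an irrational circle rotation (our choice, not the paper's), and the names `rotate`, `orbit_segment` and `empirical_integral` are hypothetical. Integrating a function against the empirical measure is simply averaging it over the orbit segment.

```python
import math

ALPHA = (math.sqrt(5.0) - 1.0) / 2.0      # assumed irrational rotation angle

def rotate(x):
    """One step of the circle rotation T(x) = x + ALPHA (mod 1)."""
    return (x + ALPHA) % 1.0

def orbit_segment(x, a, b):
    """The ordered segment x[a, b] = (T^a x, ..., T^b x), for 0 <= a <= b."""
    for _ in range(a):
        x = rotate(x)
    segment = []
    for _ in range(a, b + 1):
        segment.append(x)
        x = rotate(x)
    return segment

def empirical_integral(f, x, a, b):
    """Integral of f against the empirical measure mu_{x[a, b]}."""
    segment = orbit_segment(x, a, b)
    return sum(f(point) for point in segment) / (b - a + 1)
```

For the rotation, every point is generic for Lebesgue measure, so for `f = lambda t: math.cos(2 * math.pi * t)` the values `empirical_integral(f, x, 0, n)` approach the Lebesgue integral 0 as n grows.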

A point x is said to quasi-generate (or be quasi-generic for) a measure $\mu $ if $\mu $ is an accumulation point of the sequence of measures $(\mu _{x[0,n]})_{n\ge 1}$ (any such measure $\mu $ is invariant). In this case there exists an increasing sequence of natural numbers $\mathcal J=(n_k)_{k\ge 1}$ such that $\lim _{k\to \infty }\mu _{x[0,n_k]}=\mu $ . We will say that x generates $\mu $ along $\mathcal J$ . If the sequence $(\mu _{x[0,n]})_{n\ge 1}$ converges to $\mu $ then we say that x generates (or is generic for) $\mu $ (in other words, ‘generic’ equates to ‘generic along ${\mathbb {N}}$ ’). It follows from the pointwise ergodic theorem that every ergodic measure $\mu $ possesses generic points (in fact $\mu $ -almost all points are such). The following obvious fact will be used several times.

Remark 2.1. Two increasing sequences of natural numbers (say, $(n_k)_{k\ge 1}$ and $(m_k)_{k\ge 1}$ ) will be called equivalent if $\lim _{k\to \infty }({n_k}/{m_k})=1$ . It is obvious that the upper (and lower) densities of any subset of ${\mathbb {N}}$ evaluated along equivalent sequences are the same. If a point x generates a measure $\mu $ along a sequence $(n_k)_{k\ge 1}$ then it generates $\mu $ along any sequence $(m_k)_{k\ge 1}$ equivalent to $(n_k)_{k\ge 1}$ .
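The difference between generating a measure and generating it only along a subsequence can be seen in a toy symbolic example, sketched below under our own assumptions (the sequence and the function names are illustrative and do not come from the paper). The sequence consists of alternating blocks of 0s and 1s of lengths $1,2,4,8,\ldots$ ; the frequency of the symbol 1 among the first n terms equals exactly $2/3$ whenever $n=4^m-1$, while along $n=2\cdot 4^m-1$ it tends to $1/3$. Hence the averages do not converge, and such a point can only be quasi-generic.

```python
from fractions import Fraction

def block_sequence(length):
    """First `length` terms of 0 11 0000 11111111 ...: block k has length 2**k and symbol k % 2."""
    terms, k = [], 0
    while len(terms) < length:
        terms.extend([k % 2] * (2 ** k))
        k += 1
    return terms[:length]

def frequency_of_ones(terms, n):
    """Frequency of the symbol 1 among the first n terms, as an exact fraction."""
    return Fraction(sum(terms[:n]), n)
```

Exact rational arithmetic is used so that the two subsequential limits can be compared without rounding issues.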

Other key notions in this paper are those of a specification and the specification property.

Definition 2.2

  (1) Consider a (finite or infinite) sequence of non-negative integers:

    $$ \begin{align*} a_1\le &\,b_1<a_2\le b_2<\cdots<a_{N_1}\le b_{N_1} \quad \text{where } N_1\in{\mathbb{N}}, \text{ or }\\ a_1\le &\,b_1<a_2\le b_2<a_3\le b_3<\cdots\,. \end{align*} $$
    Let $D=\bigcup _N[a_N,b_N]$ (where N ranges over either $[1,N_1]$ or ${\mathbb {N}}$). By a specification with domain D we will mean any function
    $$ \begin{align*} {\mathcal S}:D\to X \end{align*} $$
    such that for each N there exists a point $x_N$ such that for each $n\in [a_N,b_N]$ we have
    $$ \begin{align*} {\mathcal S}(n) = T^n(x_N). \end{align*} $$
    Since T is surjective, we can equivalently demand that ${\mathcal S}(n)=T^{n-a_N}(x_N)$.
  (2) By ${\mathcal S}[a_N,b_N]$ we mean the ordered tuple $({\mathcal S}(a_N),{\mathcal S}(a_N+1),\ldots ,{\mathcal S}(b_N))$, which equals $x_N[a_N,b_N]$ (or $x_N[0,b_N-a_N]$).

  (3) The numbers $l_N=b_N-a_N+1$ and $g_N=a_{N+1}-b_N-1$ will be called the orbit segment lengths and gaps of the specification, respectively.

  (4) If D is finite then the empirical measure associated with ${\mathcal S}$ is defined as

    $$ \begin{align*} \mu_{\mathcal S}=\frac1{|D|}\sum_{n\in D}\delta_{{\mathcal S}(n)}. \end{align*} $$
  (5) An infinite specification ${\mathcal S}$ generates (or is generic for) a measure $\mu$ along a sequence $\mathcal J=(n_k)_{k\ge 1}$ if the measures associated to the specification ${\mathcal S}$ restricted to $D\cap [0,n_k]$ converge to $\mu$ (in general, $\mu$ need not be invariant).

  (6) We say that (the orbit of) a point $x\in X$ $\varepsilon$-shadows the specification ${\mathcal S}$ if

    $$ \begin{align*} \text{for all }{n\in D},\quad d({\mathcal S}(n),T^n(x))<\varepsilon. \end{align*} $$
  (7) A system $(X,T)$ has the weak specification property if for every $\varepsilon>0$ there exists a function $M_\varepsilon :{\mathbb {N}}\to {\mathbb {N}}$ satisfying $\lim _{l\to \infty }({M_\varepsilon (l)}/l)=0$, such that any finite specification (with any finite number $N_1$ of orbit segments) satisfying, for each $N\in [1,N_1-1]$, the inequality $g_N\ge M_\varepsilon (l_{N+1})$ is $\varepsilon$-shadowed by an orbit.

The last condition asserts, roughly speaking, that any appropriately spaced finite sequence of orbit segments (where each gap is adjusted to the length of the following segment, according to the function $M_\varepsilon $ ) can be $\varepsilon $ -shadowed by a single orbit.

3. Preparatory statements

The proof of Theorem 1.3 relies on three key lemmas. The first one is concerned with increasingly good shadowing of certain infinite specifications.

Lemma 3.1. Let $(X,T)$ be a topological dynamical system with the weak specification property with a family of functions $\{M_\varepsilon :\varepsilon>0\}$ . Let $(\varepsilon _k)_{k\ge 0}$ be a summable sequence of positive numbers. Let $D=\bigcup _{N=1}^\infty [a_N,b_N]$ and let ${\mathcal S}:D\to X$ be an infinite specification satisfying, for some increasing sequence $(N_k)_{k\ge 0}$ of non-negative integers starting with $N_0=0$ , the following condition: for each $k\ge 1$ and all $N\in [N_{k-1}+1,N_k]$ we have

(3.1) $$ \begin{align} g_N\ge M_{\varepsilon_k}(l_{N+1}). \end{align} $$

Then there exists a point $x_0$ such that

(3.2) $$ \begin{align} \lim_{n\in D}\ d({\mathcal S}(n),T^nx_0)=0. \end{align} $$

Proof. Let ${\mathcal S}_1$ denote the specification ${\mathcal S}$ restricted to the initial $N_1$ orbit segments. This finite specification satisfies the inequality $g_N\ge M_{\varepsilon _1}(l_{N+1})$ , hence it can be $\varepsilon _1$ -shadowed by the orbit of some point $x_1\in X$ .

We continue by induction. Suppose that we have found a point $x_k\in X$ which satisfies

(3.3) $$ \begin{align} \text{for all }{i\in[1,k]},\ \text{for all }{N\in[N_{i-1}+1,N_i]},\ \text{for all } {n\in[a_N,b_N]},\quad d({\mathcal S}(n),T^nx_k)\le\sum_{j=i}^k\varepsilon_j. \end{align} $$

We define a new specification ${\mathcal S}_{k+1}$ on

$$ \begin{align*} [0,b_{N_k}]\cup\bigcup_{N=N_k+1}^{N_{k+1}}[a_N,b_N] \end{align*} $$

as follows. We let ${\mathcal S}_{k+1}[0,b_{N_k}]=x_k[0,b_{N_k}]$ , while for $N\in [N_k+1,N_{k+1}]$ we let ${\mathcal S}_{k+1}[a_N,b_N]={\mathcal S}[a_N,b_N]$ . It will be convenient not to change the enumeration of the orbit segments (except for the first one, which is new), and of the gaps (the first gap of ${\mathcal S}_{k+1}$ coincides with the $N_k$ th gap of ${\mathcal S}$ ). Then ${\mathcal S}_{k+1}$ satisfies $g_N\ge M_{\varepsilon _k}(l_{N+1})$ for all $N\in [N_k,N_{k+1}-1]$ (that is, for all gaps of ${\mathcal S}_{k+1}$ ), hence it can be $\varepsilon _k$ -shadowed by the orbit of some point $x_{k+1}\in X$ . It is clear that $x_{k+1}$ satisfies (3.3) with the parameter $k+1$ in place of k. This concludes the induction. We let $x_0$ be any accumulation point of the sequence $(x_k)_{k\ge 1}$ . As easily seen, this point satisfies, for all $n\in D$ , the inequality

$$ \begin{align*} d({\mathcal S}(n),T^n(x_0))\le\sum_{j=k_n+1}^\infty\varepsilon_j, \end{align*} $$

where $k_n$ is the unique integer $k\ge 0$ such that $n\in [a_N,b_N]$ with $N\in [N_k+1,N_{k+1}]$ . Since the sums on the right-hand side are tails of a convergent series, these distances tend to zero, as claimed.

Remark 3.2. It is easily seen that if the domain D of ${\mathcal S}$ in the above lemma has density one then the point $x_0$ from that lemma quasi-generates the same invariant measures as ${\mathcal S}$ .

The second key lemma requires two rather standard propositions from convex analysis. Although they are well known to specialists, it is hard to find them in the exact formulation. Thus, we provide them with proofs.

Let $(\mathcal M,d)$ be a compact convex set in a locally convex metric space (the reader may think of $(\mathcal M(X),d)$ , where d is some standard metric compatible with the weak* topology). The elements of $\mathcal M$ will be denoted by the letters $\mu ,\nu $ .

Proposition 3.3. Let $T:\mathcal M\to \mathcal M$ be a continuous affine transformation. Then the set $\mathcal M_T\subset \mathcal M$, consisting of T-invariant elements, is non-empty and for any $\varepsilon>0$ there exists $n_\varepsilon \ge 1$ such that, for any $\nu \in \mathcal M$ and any $n\ge n_\varepsilon$, we have $d((1/n)\sum _{i=0}^{n-1}T^i(\nu ), \mathcal M_T)<\varepsilon$.

Proof. We can assume that $\mathsf {diam}(\mathcal M)=1$. Denote $A_n(\nu )=(1/n)\sum _{i=0}^{n-1}T^i(\nu )$. Since T is affine, $T(A_n(\nu ))$ and $A_n(\nu )$ are averages of n points which share all terms but one (namely $T^n(\nu )$ and $\nu$). Then, by convexity of the metric and diameter 1 of $\mathcal M$, we easily see that

$$ \begin{align*} d(T(A_n(\nu)),A_n(\nu))\le\frac1n. \end{align*} $$

This in turn implies that any limit point of any sequence of the form $A_n(\nu _n)$ (with $\nu _n\in \mathcal M$ ) is T-invariant. Such limit points exist by compactness, hence we get that $\mathcal M_T\neq \emptyset $ . Suppose that the second part of the proposition does not hold. This means that there exist $\varepsilon>0$ and an increasing sequence $(n_k)_{k\ge 1}$ of natural numbers, and a sequence $(\nu _k)_{k\ge 1}$ of points of $\mathcal M$ , such that $d(A_{n_k}(\nu _k),\mathcal M_T)\ge \varepsilon $ for all $k\ge 1$ . But we have just proved that all accumulation points of the sequence $(A_{n_k}(\nu _k))_{k\ge 1}$ belong to $\mathcal M_T$ , so we have a contradiction.
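The estimate in the proof can be checked numerically in a finite-dimensional toy case. The sketch below assumes (our choice) that $\mathcal M$ is the simplex of probability vectors on three states with the $\ell ^1$ distance, and that T is the affine map $\nu \mapsto \nu P$ for a stochastic matrix P. In this metric the diameter of the simplex is 2, so the bound reads $d(T(A_n(\nu )),A_n(\nu ))\le 2/n$.

```python
def push_forward(matrix, nu):
    """The affine map nu -> nu P on the simplex of probability vectors."""
    size = len(nu)
    return [sum(nu[i] * matrix[i][j] for i in range(size)) for j in range(size)]

def cesaro_average(matrix, nu, n):
    """A_n(nu) = (1/n) * sum_{i=0}^{n-1} T^i(nu)."""
    total = [0.0] * len(nu)
    current = list(nu)
    for _ in range(n):
        total = [t + c for t, c in zip(total, current)]
        current = push_forward(matrix, current)
    return [t / n for t in total]

def l1_distance(p, q):
    """The l^1 distance between two vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q))
```

For a cyclic permutation matrix the iterates $T^i(\nu )$ of a point mass never converge, yet the averages $A_n(\nu )$ approach the unique invariant vector, which is the behaviour the proposition describes.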

Recall that if $\xi $ is a probability measure on $\mathcal M$ then there exists a unique point $\mu \in \mathcal M$ , called the barycenter of $\xi $ , such that for every affine continuous function f one has

$$ \begin{align*} f(\mu)= \int f(\nu)\,d\xi(\nu). \end{align*} $$

The barycenter map is denoted by either $\xi \mapsto \mathsf {bar}(\xi )$ or by $\xi \mapsto \int \nu \,d\xi (\nu )$ (the integral in the sense of Pettis). It is well known that if the set of all probability measures on $\mathcal M$ is endowed with the weak* topology then the barycenter map $\xi \mapsto \mathsf {bar}(\xi )$ is continuous. In the next proposition, the reader may think of $\mathcal M$ representing $\mathcal M_T(X)$ in a dynamical system $(X,T)$ , and $\mu $ representing an ergodic measure.

Proposition 3.4. Let $\mu $ be an extreme point of $\mathcal M$ . Then for any $\varepsilon>0$ there exists $\delta>0$ such that, whenever a probability measure $\xi $ on $\mathcal M$ satisfies $d(\mathsf {bar}(\xi ),\mu )<\delta $ , we have

$$ \begin{align*} \xi\{\nu\in\mathcal M:d(\mu,\nu)<\varepsilon\}>1-\varepsilon. \end{align*} $$

Proof. If the statement is false then there exists a sequence of measures $(\xi _k)_{k\ge 1}$ on $\mathcal M$ such that $\lim _{k\to \infty }\mathsf {bar}(\xi _k)=\mu $ and $\xi _k\{\nu \in \mathcal M:d(\mu ,\nu )<\varepsilon \}\le 1-\varepsilon $ for each $k\ge 1$ . Since the function which associates to a measure $\xi $ the value $\xi (U)$ , where U is an open set, is lower semicontinuous in the weak* topology, we get that if $\xi $ is an accumulation point of the sequence $(\xi _k)_{k\ge 1}$ then $\xi \{\nu \in \mathcal M:d(\mu ,\nu )<\varepsilon \}\le 1-\varepsilon $ . On the other hand, by continuity of the barycenter map, we have $\mathsf {bar}(\xi )=\mu $ . Since $\mu $ is an extreme point of $\mathcal M$ , the only measure on $\mathcal M$ with barycenter at $\mu $ is the Dirac measure $\delta _{\mu }$ . We conclude that $\xi =\delta _{\mu }$ . This is a contradiction, since $\delta _{\mu }\{\nu \in \mathcal M:d(\mu ,\nu )<\varepsilon \}=1$ .
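In the simplest possible case the number $\delta$ can be written down explicitly. The following sketch assumes (purely for illustration) that $\mathcal M=[0,1]$ with the usual distance, that the extreme point is $\mu =0$, and that $\xi$ is finitely supported. By Markov's inequality, $\xi \{\nu :\nu \ge \varepsilon \}\le \mathsf {bar}(\xi )/\varepsilon$, so $\delta =\varepsilon ^2$ works in this toy setting; in general no such explicit formula is claimed.

```python
def barycenter(xi):
    """Barycenter of a finitely supported probability measure xi on [0, 1].

    xi is a list of (point, weight) pairs with non-negative weights summing to 1.
    """
    return sum(point * weight for point, weight in xi)

def mass_within(xi, eps):
    """xi-measure of the set of points at distance less than eps from the extreme point 0."""
    return sum(weight for point, weight in xi if point < eps)

def delta_for(eps):
    """A choice of delta that works in this toy setting, by Markov's inequality."""
    return eps * eps
```

For example, with `xi = [(0.0, 0.9), (0.005, 0.05), (0.15, 0.05)]` and `eps = 0.1` the barycenter is below `delta_for(eps)`, and accordingly more than `1 - eps` of the mass lies within `eps` of 0.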

We proceed with the second key lemma needed in the proof of Theorem 1.3.

Lemma 3.5. Let $\mu $ be an ergodic measure on a topological dynamical system $(X,T)$ which has the weak specification property. Let $x_0\in X$ be quasi-generic for $\mu $ and let $\mathcal {J}=(n_k)_{k\ge 1}$ be a sequence along which $x_0$ generates $\mu $ . Then there exist a point $\bar x_0\in X$ generic for $\mu $ and a set $\mathbb M\subset {\mathbb {N}}$ of upper density one achieved along a subsequence of $\mathcal {J}$ , such that

$$ \begin{align*} \lim_{n\in\mathbb M}\ d(T^n\bar x_0,T^nx_0)=0. \end{align*} $$

Proof. We start by fixing a summable sequence of positive numbers $(\varepsilon _k)_{k\ge 1}$ . In view of Lemma 3.1 and Remark 3.2, it suffices to construct a specification ${\mathcal S}$ on a domain D satisfying the following four conditions:

  (1) the assumptions of Lemma 3.1;

  (2) the domain D of ${\mathcal S}$ has density one;

  (3) $\lim _{n\in \mathbb M} d({\mathcal S}(n),T^n(x_0))=0$, where $\mathbb M\subset D$ has upper density one achieved along a subsequence of $\mathcal {J}$;

  (4) ${\mathcal S}$ is generic for $\mu$.

We choose a sequence of positive integers $(l_k)_{k\ge 0}$. The sequence should grow so fast that the ratios ${M_{\varepsilon _k}(l_k)}/{l_k}$ are all smaller than one and tend to zero. For each $k\ge 1$ we let $L_k=l_k+M_{\varepsilon _k}(l_k)$. Next, we replace $\mathcal J$ by a fast-growing subsequence and from now on $\mathcal J=(n_k)_{k\ge 1}$ will denote that subsequence. Initially we require that the ratios ${l_k}/{n_k}$ and ${n_k}/{n_{k+1}}$ tend to zero as k grows. More conditions on the speed of growth of the sequences $(l_k)_{k\ge 1}$ and $(n_k)_{k\ge 1}$ will be specified later.

The specification ${\mathcal S}$ is created in three steps. The first auxiliary specification ${\mathcal S}'$ is just a partition of the orbit of $x_0$ without gaps. We begin by partitioning it into segments of length $L_1$ until we cover the coordinate $n_1$ . Then we continue by partitioning the remaining part of the orbit of $x_0$ into segments of length $L_2$ until we cover the coordinate $n_2$ and so on. To be precise, we create segments ${\mathcal S}'[a_N,b_N]=x_0[a_N,b_N]$ (where $N\ge 1$ ) satisfying:

  (i) $a_1=0$;

  (ii) for $N\ge 2$, $a_N=b_{N-1}+1$;

  (iii) $b_N=a_N+L_k-1$, for $N\in [N_{k-1}+1,N_k]$, where

  (iv) for each $k\ge 1$, $N_k$ is such that $n_k\in [a_{N_k},b_{N_k}]$

(for consistency of notation, we have let $N_0=0$ ). It is elementary to see that, since the ratios ${M_{\varepsilon _k}(l_k)}/{l_k}$ and ${l_k}/{n_k}$ tend to zero, the ratios ${b_{N_k}}/{n_k}$ tend to one. Thus, by Remark 2.1, the point $x_0$ generates $\mu $ along the sequence $(b_{N_k})_{k\ge 1}$ . From now on, we redefine the sequence $\mathcal J$ to be $(b_{N_k})_{k\ge 1}$ (and let $n_k=b_{N_k}$ ; we also let $n_0=0$ ). This new sequence still satisfies ${n_k}/{n_{k+1}}\to 0$ .
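The partition of the orbit described by conditions (i)–(iv) is easy to generate mechanically. The sketch below is only an illustration with hypothetical names: `block_lengths[k-1]` plays the role of $L_k$ and `targets[k-1]` the role of $n_k$, and the targets are assumed to grow fast enough that every stage contributes at least one segment.

```python
def build_partition(block_lengths, targets):
    """Segments [a_N, b_N] of length L_k, laid end to end until n_k is covered.

    Returns the list of segments and the list of indices N_k (1-based).
    Assumes each target exceeds the right end reached in the previous stage.
    """
    segments, stage_ends = [], []
    next_start = 0                                    # condition (i): a_1 = 0
    for block_length, target in zip(block_lengths, targets):
        while not segments or segments[-1][1] < target:
            # condition (iii): b_N = a_N + L_k - 1
            segments.append((next_start, next_start + block_length - 1))
            next_start += block_length                # condition (ii): a_N = b_{N-1} + 1
        stage_ends.append(len(segments))              # condition (iv): n_k in [a_{N_k}, b_{N_k}]
    return segments, stage_ends
```

Since the last segment of stage k is the first one reaching $n_k$, its right end satisfies $n_k\le b_{N_k}<n_k+L_k$, which is why ${b_{N_k}}/{n_k}\to 1$ once ${L_k}/{n_k}\to 0$.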

The empirical measures $\mu _{x_0[0,n_k]}$ tend to $\mu $ . Since $n_{k-1}$ is eventually negligible in comparison with $n_k$ , the following statement holds:

(3.4) $$ \begin{align} \text{the empirical measures}\ \mu_{x_0[n_{k-1}+1,n_k]}\ \text{tend to}\ \mu\ \text{as}\ k\ \text{grows}. \end{align} $$

The second auxiliary specification ${\mathcal S}''$ is obtained from ${\mathcal S}'$ by truncating all orbit segments (except the first) on the left, to allow for future shadowing. More precisely, we let $a^{\prime }_1=a_1=0$ and for any $k\ge 1$ and any $N\in [N_{k-1}+1,N_k]$ (except for $N=1$) we let $a^{\prime }_N=a_N+M_{\varepsilon _k}(l_k)$ (since $M_{\varepsilon _k}(l_k)<l_k$, we have $a^{\prime }_N<b_N$). Then on the new domain

$$ \begin{align*} D=\bigcup_{N\ge1}[a^{\prime}_N,b_N], \end{align*} $$

we define the specification ${\mathcal S}''$ by ${\mathcal S}''[a^{\prime }_N,b_N]=x_0[a^{\prime }_N,b_N]$. This new specification has, for $N\in [N_{k-1}+1,N_k]$ (except for $N=1$), orbit segments of length $l_k$ preceded by gaps of size $M_{\varepsilon _k}(l_k)$ (the first orbit segment has length $L_1$ and no preceding gap).

It should be quite obvious that the lower density of D is achieved along the sequence $b_{N_k}+M_{\varepsilon _{k+1}}(l_{k+1})$ (this is the end of the first gap, larger than all preceding gaps). Because the ratios ${M_{\varepsilon _k}(l_k)}/{l_k}$ tend to zero, by choosing the numbers $n_k$ (and hence $b_{N_k}$) sufficiently large in comparison with $M_{\varepsilon _{k+1}}(l_{k+1})$, we can arrange that the density of D equals one, as required in (2).

We now have to go back to the choice of the sequences $(l_k)$ and $(n_k)$ and impose more conditions on the speed of their growth. We select numbers $\delta _k\le \varepsilon _k$ according to Proposition 3.4 with respect to the numbers $\varepsilon _k$ and the ergodic measure $\mu $ in the role of the extreme point of the compact convex set $\mathcal M_T(X)$ . If the numbers $l_k$ are (a priori) chosen large enough, using Proposition 3.3, we can arrange that

  • the empirical measures $\mu _{x_0[a^{\prime }_N,b_N]}$ with $N\in [N_{k-1}+1,N_k]$ are ${\delta _k}/3$ -close to some invariant measures henceforth denoted by $\mu _N$ .

Also, by imposing fast enough growth of the numbers $n_k$ , we may achieve that:

  • the empirical measure $\mu _{x_0[n_{k-1}+1,n_k]}$ is $({\delta _k}/3)$ -close to $\mu $ (see (3.4));

  • the empirical measures $\mu _{x_0[a_N,b_N]}$ with $N\in [N_{k-1}+1,N_k]$ are $({\delta _k}/3)$ -close to the respective empirical measures $\mu _{x_0[a^{\prime }_N,b_N]}$ (and hence $\frac 23\delta _k$ -close to $\mu _N$ ).

Clearly, the empirical measure $\mu _{x_0[n_{k-1}+1,n_k]}=\mu _{x_0[a_{N_{k-1}+1},b_{N_k}]}$ equals the arithmetic average of the measures $\mu _{x_0[a_N,b_N]}$ with $N\in [N_{k-1}+1,N_k]$ . By convexity of the metric, $\mu $ is $\delta _k$ -close to the arithmetic average of the invariant measures $\mu _N$ with $N\in [N_{k-1}+1,N_k]$ . By Proposition 3.4, the vast majority of the invariant measures $\mu _N$ are $\varepsilon _k$ -close to $\mu $ , and hence the corresponding empirical measures $\mu _{x_0[a^{\prime }_N,b_N]}$ are $2\varepsilon _k$ -close to $\mu $ (we are using $\delta _k\le \varepsilon _k$ ). More precisely, there are fewer than $\varepsilon _k(N_k-N_{k-1})$ parameters $N\in [N_{k-1}+1,N_k]$ (we will call them ‘bad’), for which the measure $\mu _{x_0[a^{\prime }_N,b_N]}$ is not $2\varepsilon _k$ -close to $\mu $ .

We can now perform the third step in creating the specification ${\mathcal S}$ . This is done by replacing in ${\mathcal S}"$ the segments $x_0[a^{\prime }_N,b_N]$ corresponding to ‘bad’ parameters $N\in [N_{k-1}+1,N_k]$ by orbit segments (of the same length) whose associated measures are $2\varepsilon _k$ -close to $\mu $ . For example, we can choose one ‘good’ parameter N (there exists such an N) and use the corresponding segment $x_0[a^{\prime }_N,b_N]$ everywhere we need to make a replacement. This concludes the construction of ${\mathcal S}$ .

We need to verify that ${\mathcal S}$ satisfies the desired four properties.

(1) It is clear that the specification ${\mathcal S}$ was created in accordance with the assumptions of Lemma 3.1.

(2) As we have already remarked, the domain D has density one.

(3) Note that ${\mathcal S}"$ agrees with the orbit of $x_0$ on D (which has density one). Then ${\mathcal S}$ differs from ${\mathcal S}"$ on a set whose frequency in the interval $[n_{k-1}+1,n_k]$ is at most $\varepsilon _k$ . Thus the set of the integers n for which ${\mathcal S}(n)\neq {\mathcal S}"(n)$ (or ${\mathcal S}(n)$ is not defined) has lower density zero achieved along the sequence $\mathcal J$ . The complementary set $\mathbb M$ has upper density one achieved along $\mathcal J$ , and on this set we have ${\mathcal S}(n)=T^n(x_0)$ (which trivially implies the required condition $\lim _{n\in \mathbb M}\ d({\mathcal S}(n),T^nx_0)=0$ ).

(4) Consider a long initial segment of ${\mathcal S}$ (say, ${\mathcal S}|_{[1,n]\cap D}$ ), and let k be such that $N\in [N_{k-1}+1,N_k]$ , where N is determined by the inclusion $n\in [a_N,b_N]$ . Then ${\mathcal S}|_{[1,n]\cap D}$ consists essentially of segments of two lengths: $l_{k-1}$ , whose associated empirical measures are $2\varepsilon _{k-1}$ -close to $\mu $ ; and $l_k$ , whose associated empirical measures are $2\varepsilon _k$ -close to $\mu $ (in either case we have $2\varepsilon _{k-1}$ -closeness). This closeness need not apply to the initial part left of the coordinate $n_{k-1}$ , and to the terminal, perhaps incomplete, orbit segment whose length does not exceed $L_k$ . Since both $n_{k-1}$ and $L_k$ are negligible in comparison with $n_k$ (and hence with n), the two extreme pieces can be ignored and we get that the empirical measure associated with ${\mathcal S}|_{[1,n]\cap D}$ is (nearly) $2\varepsilon _{k-1}$ -close to $\mu $ . Since k tends to infinity as n grows, ${\mathcal S}$ is generic for $\mu $ .

Lemma 3.6. Let $(X,T)$ be a topological dynamical system and let $\mu $ be an invariant measure on X. For each $\varepsilon>0$ there exists $\delta>0$ which satisfies the following assertion.

Let $\mathcal P$ be a partition of X all of whose atoms have diameter not exceeding $\delta $ . Let ${\mathcal S}$ be a finite specification consisting of $N_1$ orbit segments of length l separated by some gaps ( $N_1$ and l are arbitrary natural numbers, and the gaps are also arbitrary):

$$ \begin{align*} {\mathcal S}[a_N,a_N+l-1]=x_N[0,l-1], \end{align*} $$

where $a_1\ge 0$ and, for each $N\in [1,N_1]$ , we have $x_N\in X$ and $a_{N+1}-a_N\ge l$ . Assume that for each $B\in \mathcal P^l=\bigvee _{n=0}^{l-1}T^{-n}(\mathcal P)$ the frequency relative to $(a_N)_{N\in [1,N_1]}$ ,

$$ \begin{align*} \frac{|\{N\in[1,N_1]: x_N\in B\}|}{N_1}, \end{align*} $$

is $\delta /{|\mathcal P^l|}$ -close to $\mu (B)$ (this imposes that $N_1$ must in fact be huge). Then the empirical measure associated with ${\mathcal S}$ ,

$$ \begin{align*} \mu_{\mathcal S}=\frac1{|D|}\sum_{n\in D}\delta_{{\mathcal S}(n)}, \end{align*} $$

where $D=\bigcup _{N=1}^{N_1}[a_N,a_N+l-1]$ is the domain of ${\mathcal S}$ , is $\varepsilon $ -close to $\mu $ .

Proof. Regardless of what metric d compatible with the weak* topology on $\mathcal M(X)$ we are using, there exist a finite family of continuous $[0,1]$ -valued functions (say, $f_1,\ldots ,f_K$ ) and a small positive number $\gamma $ such that if

$$ \begin{align*} \bigg|\!\int f_k\,d\mu_1 - \int f_k\,d\mu_2\bigg|<3\gamma \end{align*} $$

for each $k\in [1,K]$ , then $d(\mu _1,\mu _2)<\varepsilon $ . Further, there exists $\beta $ such that each of the finitely many functions $f_k$ varies on each $\beta $ -ball in X by less than $\gamma $ . We let $\delta =\min \{\beta ,\gamma \}$ . Let $\mathcal P$ be a partition of X as in the formulation of the lemma. Observe that if we replace each of the functions $f_k$ by a function $\bar f_k$ constant on the atoms of $\mathcal P$ (say, assuming on each atom the supremum of $f_k$ over that atom), then the integral of $\bar f_k$ with respect to any probability measure differs from the integral of $f_k$ by at most $\gamma $ . So, in order to show that $d(\mu _{\mathcal S},\mu )<\varepsilon $ , it suffices to show that

$$ \begin{align*} \bigg|\!\int f\,d\mu_{\mathcal S} - \int f\,d\mu\bigg|<\gamma, \end{align*} $$

for any $[0,1]$ -valued (not necessarily continuous) function f constant on the atoms of $\mathcal P$ . For such a function f we have

$$ \begin{align*} \int f\,d\mu_{\mathcal S} = \frac1{|D|}\sum_{N=1}^{N_1} \sum_{n=0}^{l-1}f(T^nx_N)= \frac1{N_1}\sum_{N=1}^{N_1}\frac1l\sum_{n=0}^{l-1}f(T^nx_N) \end{align*} $$

(we are using the obvious fact that $|D|=N_1l$ ). Note that if, for some $N,N'\in [1,N_1]$ , the points $x_N$ and $x_{N'}$ belong to the same atom B of $\mathcal P^l$ , then the averages $(1/l)\sum _{n=0}^{l-1}f(T^nx_N)$ and $(1/l)\sum _{n=0}^{l-1}f(T^nx_{N'})$ are equal, so we can replace them by $(1/l)\sum _{n=0}^{l-1}f(T^nx_B)$ , where $x_B$ is a point in B not depending on N. Then our integral becomes

$$ \begin{align*} \int f\,d\mu_{\mathcal S} = \sum_{B\in\mathcal P^l}\frac{|\{N\in[1,N_1]:x_N\in B\}|}{N_1}\frac1l\sum_{n=0}^{l-1}f(T^nx_B). \end{align*} $$

By assumption, the coefficient ${|\{N\in [1,N_1]:x_N\in B\}|}/{N_1}$ equals $\mu (B)$ up to $\delta /{|\mathcal P^l|}$ , all the more so up to $\gamma /{|\mathcal P^l|}$ . Since the averages $(1/l)\sum _{n=0}^{l-1}f(T^nx_B)$ do not exceed one, we obtain that $\int f\,d\mu _{\mathcal S}$ equals

$$ \begin{align*} \sum_{B\in\mathcal P^l}\mu(B)\frac1l\sum_{n=0}^{l-1}f(T^nx_B) \end{align*} $$

up to $\gamma $ . Finally, observe that the latter sum equals $\int (1/l)\sum _{n=0}^{l-1}f\circ T^n\,d\mu $ , which, by invariance of $\mu $ , equals $\int f\,d\mu $ . We have shown that $|\int f\,d\mu _{\mathcal S} - \int f\,d\mu |<\gamma $ , as required.
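The regrouping step in the proof can be illustrated in a symbolic toy model. The Python sketch below is a minimal example under stated assumptions, not the lemma itself: X is taken to be the full shift on the symbols 0 and 1, the partition $\mathcal P$ is the partition by the zero coordinate (so atoms of $\mathcal P^l$ are words of length l), the function f is given by a table `g` of its values on the two atoms, and the names `integral_direct`, `integral_by_blocks`, `sample` and `balanced` are hypothetical. It checks that the direct average over the specification coincides with the sum over blocks weighted by their frequencies.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def integral_direct(words, g):
    """Integral of f against the empirical measure of the specification:
    the plain average of g over all symbols of all orbit segments."""
    length = len(words[0])
    total = sum(g[symbol] for word in words for symbol in word)
    return Fraction(total) / (len(words) * length)


def integral_by_blocks(words, g):
    """The same integral, regrouped by blocks (words of length l):
    sum over blocks of (block frequency) * (average of g along the block)."""
    length = len(words[0])
    result = Fraction(0)
    for block, count in Counter(words).items():
        block_average = Fraction(sum(g[symbol] for symbol in block)) / length
        result += Fraction(count, len(words)) * block_average
    return result


g = {0: Fraction(0), 1: Fraction(1)}          # f is the indicator of symbol 1

# An arbitrary specification with four segments of length 3.
sample = [(0, 0, 1), (0, 0, 1), (1, 1, 1), (0, 0, 0)]

# Every word of length 3 appears with frequency 1/8, which is its measure
# under the (1/2, 1/2) Bernoulli measure.
balanced = list(product((0, 1), repeat=3)) * 5
```

For `balanced`, the block frequencies agree exactly with the Bernoulli measure of the blocks, so the regrouped sum equals the integral of f with respect to that measure, namely one half. In the lemma the agreement of frequencies is only approximate, which is where the error term $\gamma $ enters.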

The next proposition is our last preparatory fact before the proof of Theorem 1.3. It is also a standard fact (this time from ergodic theory), whose exact formulation is hard to find. Thus, we provide it with a proof.

Proposition 3.7. Let $(X,T)$ be a topological dynamical system. Let x be a point quasi-generic for an ergodic measure $\mu $ and let $\mathcal J=(n_k)_{k\ge 1}$ be a sequence along which x generates $\mu $ . Fix a positive integer L. Then there exist two increasing sequences of positive integers, $(a_N)_{N\ge 1}$ and $(N_k)_{k\ge 1}$ , satisfying the following conditions.

(1) For each $N\ge 1$ the difference $a_{N+1}-a_N$ equals either L or $L+1$ .

(2) $\lim _{k\to \infty }({a_{N_k}}/{n_k})=1$ (that is, the sequences $(n_k)_{k\ge 1}$ and $(a_{N_k})_{k\ge 1}$ are equivalent).

(3) x generates $\mu $ with respect to the sequence $(a_N)_{N\ge 1}$ , along $(N_k)_{k\ge 1}$ , that is,

$$ \begin{align*} \lim_{k\to\infty}\frac1{N_k}\sum_{N=1}^{N_k}\delta_{T^{a_N}(x)}=\mu. \end{align*} $$

Proof. There exists an ergodic measure-preserving system $(Y,\nu ,S)$ disjoint from $(X,\mu ,T)$ (in the sense of Furstenberg). (Two measure-preserving systems are disjoint if their only joining is their product. An example of a system disjoint from $(X,\mu ,T)$ is an irrational rotation by $e^{2\pi it}$ , where t is rationally independent from all numbers s such that $e^{2\pi is}$ is an eigenvalue of $(X,\mu ,T)$ (there are at most countably many values to be avoided).) By a standard application of Rokhlin towers, there exists a set A visited by $\nu $ -almost every orbit in Y infinitely many times with only two gap sizes between consecutive visits, L and $L+1$ . There exists a topological model of $(Y,\nu ,S)$ in which the set A is clopen, that is, its indicator function, denoted by F, is continuous. Let $y\in Y$ be a point generic for $\nu $ and let $(a_N)_{N\ge 1}$ denote the sequence of times of visits of the orbit of y in A (this sequence has only two gap sizes, L and $L+1$ , as required in (1)). The pair $(x,y)$ quasi-generates, along the sequence $\mathcal J$ , some joinings of $\mu $ and $\nu $ . By disjointness, all such joinings are equal to the product measure $\mu \times \nu $ on $X\times Y$ , that is, along $\mathcal J$ the pair $(x,y)$ generates $\mu \times \nu $ . This implies that, for any continuous function f on X,

$$ \begin{align*} \lim_{k\to\infty}\frac1{n_k}\sum_{n=1}^{n_k}f(T^nx)F(S^ny)&=\int f\,d\mu\cdot\nu(A),\\ \lim_{k\to\infty}\frac1{n_k}\sum_{n=1}^{n_k}F(S^ny)&=\nu(A). \end{align*} $$

Given $k\ge 1$ , let $N_k$ denote the largest N such that $a_N\le n_k$ . Observe that since $(a_N)_{N\ge 1}$ has bounded gaps, while $(n_k)_{k\ge 1}$ tends to infinity, the ratios ${a_{N_k}}/{n_k}$ tend to one, as required in (2). Since $F(S^ny)=1$ if and only if $n=a_N$ for some N (otherwise $F(S^ny)=0$ ), we can rewrite the above limits as

(3.5) $$ \begin{align} \lim_{k\to\infty}\frac1{n_k}\sum_{N=1}^{N_k}f(T^{a_N}x)&=\int f\,d\mu\cdot\nu(A), \end{align} $$
(3.6) $$ \begin{align}\lim_{k\to\infty}\frac{N_k}{n_k}&=\nu(A). \end{align} $$

Dividing the left/right hand side of (3.5) by the left/right hand side of (3.6), we get

$$ \begin{align*} \lim_{k\to\infty}\frac1{N_k}\sum_{N=1}^{N_k}f(T^{a_N}x)=\int f\,d\mu. \end{align*} $$

Since this is true for any continuous function f on X, we have proved (3).
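The mechanism of the proof can be observed numerically in a simple case. The Python sketch below is an illustration under explicit assumptions, not a substitute for the argument: both systems are taken to be circle rotations with rationally independent angles (hence disjoint), the set A is the arc $[0,\alpha )$ for a rotation by $\alpha $ with $1/(L+1)<\alpha <1/L$ (an arc is not clopen on the circle, which is why the proof passes to a suitable topological model), the test function is $f(x)=\cos (2\pi x)$ with integral zero, and the names `visit_times` and `average_along` as well as the starting points are hypothetical. Floating-point arithmetic is used, so the output is only approximate.

```python
import math


def visit_times(alpha, y0, horizon):
    """Times n in [1, horizon] at which y0 + n*alpha (mod 1) lies in [0, alpha)."""
    return [n for n in range(1, horizon + 1) if (y0 + n * alpha) % 1.0 < alpha]


def average_along(times, f, theta, x0):
    """Average of f over the points x0 + a*theta (mod 1), a ranging over times."""
    return sum(f((x0 + a * theta) % 1.0) for a in times) / len(times)


L = 5
# An irrational angle with 1/(L+1) < alpha < 1/L, so that returns of the
# rotation by alpha to the arc [0, alpha) take L or L+1 steps.
alpha = 1.0 / (L + (math.sqrt(5.0) - 1.0) / 2.0)
horizon = 100_000

times = visit_times(alpha, 0.1, horizon)             # the sequence (a_N)
gaps = {b - a for a, b in zip(times, times[1:])}     # condition (1)
visit_frequency = len(times) / horizon               # compare with (3.6)

# Sampling the orbit of x under the rotation by sqrt(2) along (a_N):
# the average of f should be close to its integral, which is zero.
sampled_mean = average_along(
    times, lambda x: math.cos(2.0 * math.pi * x), math.sqrt(2.0), 0.3
)
```

Here `gaps` collects the differences $a_{N+1}-a_N$ , `visit_frequency` plays the role of $N_k/n_k$ and should be close to the length $\alpha $ of the arc, and `sampled_mean` is the average appearing in condition (3).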

4. The main proof

Proof of Theorem 1.3

Let $\mathcal J=(n_k)_{k\ge 1}$ be a sequence along which y generates $\nu $ . It suffices to construct a point $x_0$ such that the pair $(x_0,y)$ generates $\xi $ along a subsequence $\mathcal J'$ of $\mathcal {J}$ . Clearly, such an $x_0$ generates $\mu $ along $\mathcal J'$ and, by Lemma 3.5, there will then exist a point x generic for $\mu $ and such that

$$ \begin{align*} \lim_{n\in\mathbb M}\ d(T^nx,T^nx_0)=0, \end{align*} $$

where $\mathbb M$ is a set of upper density one achieved along a subsequence $\mathcal J"$ of $\mathcal J'$ . Note that then the pair $(x,y)$ still generates $\xi $ along $\mathcal J"$ , so the proof will be completed.

We fix a summable sequence of positive numbers $(\varepsilon _k)_{k\ge 1}$ and an increasing sequence of natural numbers $l_k$ such that

(4.1) $$ \begin{align} \lim_{k\to\infty}\frac{M_{\varepsilon_k}(l_k)}{l_k}=0. \end{align} $$

Next, we let $(\mathcal P_k)_{k\ge 1}$ be a sequence of measurable partitions of X such that, for each $k\ge 1$ , the diameters of the atoms of $\mathcal P_k$ do not exceed the number $\delta _k$ obtained from Lemma 3.6 for the measure $\mu $ and $\varepsilon _k$ in the role of $\varepsilon $ .

The atoms of the partitions $\mathcal P_k^l=\bigvee _{i=0}^{l-1}T^{-i}(\mathcal P_k)$ , where $k\ge 1$ and $l\ge 1$ , will be called blocks of X, while the atoms of $\mathcal P_k^{l_k}$ will be called blocks of order k of X.

Likewise, we let $(\mathcal Q_k)_{k\ge 1}$ be a sequence of partitions of Y with diameters bounded by $\delta _k$ . We can easily arrange the partitions $\mathcal Q_k$ so that the orbit of y avoids the boundaries of the atoms of $\mathcal Q_k$ for each $k\ge 1$ . (The partition $\mathcal Q_k$ can be constructed as follows. First we choose a finite open cover by $\delta _k$ -balls $B(c_i,\delta _k)$ centered at some points $c_i\in Y$ , $i=1,2,\ldots ,N$ , $N\in {\mathbb {N}}$ . There exists $\delta _k'<\delta _k$ such that for any $\delta \in [\delta _k',\delta _k]$ , the balls $B(c_i,\delta )$ still cover Y. Note that, for each i, the $\delta $ -spheres $S(c_i,\delta )$ are disjoint for different values of $\delta $ . Since there are uncountably many values of $\delta $ while the orbit of y is countable, there exists a $\delta \in [\delta _k',\delta _k]$ such that all the spheres $S(c_i,\delta )$ , $i=1,2,\ldots ,N$ , avoid the orbit of y. The partition $\mathcal Q_k$ is obtained by ‘disjointification’ of the cover by the balls $B(c_i,\delta )$ . Then the boundaries of the atoms of $\mathcal Q_k$ are contained in the union of the $\delta $ -spheres $S(c_i,\delta )$ , and hence the partition has the desired property.) The atoms of $\mathcal Q_k^l=\bigvee _{i=0}^{l-1}S^{-i}(\mathcal Q_k)$ , where $k\ge 1$ and $l\ge 1$ , will be called blocks of Y and the atoms of $\mathcal Q_k^{l_k}$ will be called blocks of order k of Y.

Note that if we apply the maximum metric in $X\times Y$ then the rectangular atoms of the partitions $\mathcal P_k\otimes {\mathcal Q}_k$ have diameters bounded by $\delta _k$ as well. We now choose some very small positive numbers $\gamma _k$ so that, for each $k\ge 1$ , we have

$$ \begin{align*} 2\gamma_k+\gamma_k^2<\frac{\delta_k}{|\mathcal P_k^l\otimes\mathcal Q_k^l|}. \end{align*} $$

Because y generates $\nu $ along $\mathcal J$ , and its orbit avoids the boundaries of the blocks, the orbit of y visits each block C of Y with frequency evaluated at times $n_k$ converging to $\nu (C)$ .

Successively using Proposition 3.7 with the parameters $L_k=l_k+M_{\varepsilon _k}(l_k)$ in the role of L, and replacing, if necessary, the sequence $\mathcal J$ by a rapidly growing subsequence $\mathcal J'$ (from now on $(n_k)_{k\ge 1}$ will denote $\mathcal J'$ ), we can arrange two increasing sequences of positive integers, $(a_N)_{N\ge 1}$ and $(N_k)_{k\ge 0}$ starting with $N_0=0$ , satisfying the following conditions.

(1) ${\lim _{k\to \infty }({a_{N_k}}/{n_k})=1}$ .

(2) For each $k\ge 1$ and each $N\in [N_{k-1},N_k-1]$ , the difference $a_{N+1}-a_N$ equals either $L_k$ or $L_k+1$ . (The proposition, as it is stated, does not allow us to ensure that the gap $a_{N_k}-a_{N_k-1}$ (the first gap following the series of gaps of sizes $L_k$ or $L_k+1$ ) equals either $L_{k+1}$ or $L_{k+1}+1$ . A priori, this gap may come out smaller (say, of size $j<L_{k+1}$ ). However, in this case, replacing the set A in the proof of the proposition (for $L=L_{k+1}$ ) by $S^{-(L_{k+1}-j)}A$ , we can adjust the gap to size $L_{k+1}$ .)

(3) If we denote by $C_N$ the unique block of order k of Y containing $S^{a_N}y$ , then, for any block C of order k of Y, we have

$$ \begin{align*} \bigg|\frac1{N_k-N_{k-1}}|\{N\in[N_{k-1},N_k-1]:C_N=C\}|-\nu(C)\bigg|<\gamma_k. \end{align*} $$

(4) If, in addition, C satisfies $\nu (C)>0$ , then also

$$ \begin{align*} |\{N\in[N_{k-1},N_k-1]:C_N=C\}|>\frac1{\gamma_k}. \end{align*} $$

Condition (1) says that $\mathcal J'$ and $\tilde {\mathcal J}'=(a_{N_k})_{k\ge 1}$ are equivalent. In particular, y generates $\nu $ along the sequence $(a_{N_k})_{k\ge 1}$ and if we find $x_0$ using $\tilde {\mathcal J}'$ , the same $x_0$ will serve for $\mathcal J'$ .

(*) Fix some $k\ge 1$ . Let $\xi (B|C)={\xi (B\times C)}/{\nu (C)}$ , where B and C are blocks of order k of X and Y, respectively, with $\nu (C)>0$ . For every such C, the numbers $\xi (B|C)$ , with B ranging over all blocks of order k of X, form a probability vector. By (4), this vector can be approximated up to $\gamma _k$ (at each coordinate) by a rational probability vector with entries

$$ \begin{align*} \frac{r(B,C)}{|\{N\in[N_{k-1},N_k-1]:C_N=C\}|}, \end{align*} $$

where each $r(B,C)$ is a non-negative integer. We can thus create a finite sequence $(B_N)_{N\in [N_{k-1},N_k-1]}$ of blocks of order k of X, so that, for every pair of blocks $B,C$ of order k in X and Y respectively, we have

$$ \begin{align*} |\{N\in[N_{k-1},N_k-1]:C_N=C\text{ and }B_N=B\}|=r(B,C). \end{align*} $$

Then, for each pair $B,C$ as above, with $\nu (C)>0$ , we have

$$ \begin{align*} \frac{r(B,C)}{N_k-N_{k-1}}=\frac{r(B,C)}{|\{N\in[N_{k-1},N_k-1]:C_N=C\}|}\cdot \frac{|\{N\in[N_{k-1},N_k-1]:C_N=C\}|}{N_k-N_{k-1}}, \end{align*} $$

where (by the choice of the integers $r(B,C)$ ) the first fraction equals $\xi (B|C)$ up to $\gamma _k$ , and, by (3), the second fraction equals $\nu (C)$ , also up to $\gamma _k$ . So, ${r(B,C)}/({N_k-N_{k-1}})$ equals $\xi (B\times C)$ up to $2\gamma _k+\gamma _k^2$ , which is less than ${\delta _k}/{|\mathcal P_k^l\otimes \mathcal Q_k^l|}$ .
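The choice of the integers $r(B,C)$ and of the sequence $(B_N)$ can be made explicit. The Python sketch below shows one possible way to do it, using largest-remainder rounding; this particular rounding rule is our assumption (the text only requires some rational approximation), and the names `round_to_counts`, `assign_blocks` and the toy labels `"B0"`, `"B1"`, `"C0"`, `"C1"` are hypothetical. Each rounded entry differs from the target probability by less than one over the number of available positions, which is smaller than $\gamma _k$ whenever that number exceeds $1/\gamma _k$ , as guaranteed by (4).

```python
import math
from collections import Counter
from fractions import Fraction


def round_to_counts(probabilities, total):
    """Non-negative integers r[i] summing to `total` with
    |r[i]/total - probabilities[i]| < 1/total (largest-remainder rounding).
    `probabilities` must be Fractions summing to one."""
    scaled = [p * total for p in probabilities]
    counts = [math.floor(s) for s in scaled]
    shortfall = total - sum(counts)
    # Hand the missing units to the entries with the largest fractional parts.
    order = sorted(range(len(scaled)),
                   key=lambda i: scaled[i] - counts[i], reverse=True)
    for i in order[:shortfall]:
        counts[i] += 1
    return counts


def assign_blocks(c_sequence, conditional):
    """Given the labels C_N and the conditional vectors xi(B|C), return labels
    B_N such that each pair (B, C) occurs exactly r(B, C) times."""
    positions = {}
    for index, c in enumerate(c_sequence):
        positions.setdefault(c, []).append(index)
    b_sequence = [None] * len(c_sequence)
    for c, indices in positions.items():
        labels = list(conditional[c])
        counts = round_to_counts([conditional[c][b] for b in labels], len(indices))
        fill = [b for b, r in zip(labels, counts) for _ in range(r)]
        for index, b in zip(indices, fill):
            b_sequence[index] = b
    return b_sequence


# Toy data: "C0" occurs 7 times and "C1" occurs 3 times.
c_sequence = ["C0", "C1", "C0", "C0", "C1", "C0", "C0", "C1", "C0", "C0"]
conditional = {
    "C0": {"B0": Fraction(1, 3), "B1": Fraction(2, 3)},
    "C1": {"B0": Fraction(1, 2), "B1": Fraction(1, 2)},
}
b_sequence = assign_blocks(c_sequence, conditional)
joint_counts = Counter(zip(b_sequence, c_sequence))   # the numbers r(B, C)
```

The order in which the labels B are placed among the positions with a given C is irrelevant for the frequency estimate; the sketch simply fills those positions from left to right.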

Now, we create a finite specification $\bar {\mathcal S}_k$ in $X\times Y$ , as follows. For each $N\in [N_{k-1}, N_k-1]$ we choose a point $x_N\in B_N$ and we let

$$ \begin{align*} \bar{\mathcal S}_k[a_N,a_N+L_k-1]=(x_N,S^{a_N}y)[0,L_k-1] \end{align*} $$

(the starting point of the Nth orbit segment falls in $(B_N,C_N)$ , and the second coordinate agrees, along the entire specification, with the orbit of y). Note that by (2), the gaps in the domain of $\bar {\mathcal S}_k$ have only two sizes, zero or one. Lemma 3.6 now guarantees that the empirical measure $\mu _{\bar {\mathcal S}_k}$ is $\varepsilon _k$ -close to $\xi $ .

Let $\bar {\mathcal S}$ be the infinite specification in $X\times Y$ defined as follows: for each $k\ge 1$ and each $N\in [N_{k-1},N_k-1]$ we let

$$ \begin{align*} \bar{\mathcal S}[a_N+M_{\varepsilon_k}(l_k),a_N+L_k-1]=\bar{\mathcal S}_k[a_N+M_{\varepsilon_k}(l_k),a_N+L_k-1]. \end{align*} $$

It is fairly obvious that $\bar {\mathcal S}$ generates $\xi $ along the sequence $\tilde {\mathcal J}'$ (by (4.1), the fact that the intervals of the domain are slightly trimmed on the left does not affect the convergence).

Let us denote by ${\mathcal S}$ the projection of $\bar {\mathcal S}$ to the first coordinate. This infinite specification in X satisfies all requirements of Lemma 3.1. That lemma allows us to find a point $x_0$ whose orbit shadows the specification ${\mathcal S}$ with an increasing accuracy. Clearly, the pair $(x_0,y)$ shadows $\bar {\mathcal S}$ equally well. It is also clear that the domain of $\bar {\mathcal S}$ has density one, which (by Remark 3.2) implies that the pair $(x_0,y)$ generates $\xi $ along $\tilde {\mathcal J}'$ , and hence also along $\mathcal J'$ . We have achieved all that was necessary to complete the proof.

Remark 4.1. It is possible to modify the proof and avoid the use of Lemma 3.5 (see below). Although the main proof itself becomes slightly longer, one can skip that lemma and the two auxiliary propositions altogether. There are two reasons why we have decided to present the longer argument.

(1) Lemma 3.5 is a generalization of Kamae’s Theorem 1 in [K] and has some independent value of its own. It may turn out useful in further studies of systems with weak specification.

(2) By following the framework of the original proof, we show that T. Kamae has insightfully laid ground for further generalizations.

Sketch of the modified proof.

Go to the paragraph marked by (*). Divide the blocks B (of order k of X) into two families: $\mathcal B$ , consisting of those whose associated empirical measures are close to $\mu $ , and the rest. By the mean ergodic theorem, for large enough k, the joint measure of the blocks in $\mathcal B$ is very close to one. Thus, by an insignificant renormalization, we can turn the vector of conditional probabilities $\xi (B|C)$ , with B ranging over $\mathcal B$ (and C fixed), into a probability vector. From here we proceed as described above, except that each time we refer to B we use only the blocks from $\mathcal B$ . The specification $\bar {\mathcal S}_k$ will then have its X-coordinate consisting exclusively of blocks $B\in \mathcal B$ . The specification ${\mathcal S}$ (the X-projection of $\bar {\mathcal S}$ ) will consist of blocks whose empirical measures are getting closer and closer to $\mu $ . If the numbers $a_{N_k}$ grow sufficiently fast in comparison to the lengths $l_k$ then, by an argument identical to that in the proof of Lemma 3.5, the specification ${\mathcal S}$ will be generic for $\mu $ , and so will the point $x_0$ shadowing ${\mathcal S}$ . Lemma 3.5 is then no longer needed.

Question 4.2. If y is generic for $\nu $ , can x be chosen so that the pair $(x,y)$ is generic for $\xi $ (as in the original theorem of Kamae)? Is a stronger specification property of X necessary for that?

Added in proof. While checking the proofs of our paper we became aware of an old paper of T. Kamae [K1] that contains results partially overlapping with ours. He introduces the vague separation property (v.s.p.) under which he proves a lifting theorem similar to ours. The relation between the v.s.p. and weak specification remains to be clarified.

References

Bergelson, V. and Downarowicz, T. On preservation of normality and determinism under arithmetic operations. In preparation.
Dateyama, M. The almost weak specification property for ergodic group automorphisms of abelian groups. J. Math. Soc. Japan 42 (1990), 341–351. doi:10.2969/jmsj/04220341
Furstenberg, H. Strict ergodicity and transformation of the torus. Amer. J. Math. 83 (1961), 573–601. doi:10.2307/2372899
Furstenberg, H. and Weiss, B. On almost $1$ - $1$ extensions. Israel J. Math. 65 (1989), 311–322. doi:10.1007/BF02764869
Kamae, T. Subsequences of normal sequences. Israel J. Math. 16 (1973), 121–149. doi:10.1007/BF02757864
Kamae, T. Normal numbers and ergodic theory. Proc. Third Japan–USSR Symp. on Probability Theory (Lecture Notes in Mathematics, 550). Ed. G. Maruyama and J. V. Prokhorov. Springer, Berlin, 1976.
Rauzy, G. Nombres normaux et processus déterministes. Acta Arith. 29 (1976), 211–225. doi:10.4064/aa-29-3-211-225
Weiss, B. Normal sequences as collectives. Proc. Symp. on Topological Dynamics and Ergodic Theory. University of Kentucky, Lexington, KY, 1971, pp. 79–80.