An IDEAL Group, CLC, Project

SOME RESULTS ON RECURRENCE AND ENTROPY

DISSERTATION

Presented in Partial Fulfillment of the Requirements for

the Degree Doctor of Philosophy in the Graduate

School of the Ohio State University

Ronald Lee Pavlov Jr., B.S., M.S.

* * * * *

The Ohio State University

2007

Dissertation Committee:

Professor Vitaly Bergelson, Advisor

Professor Alexander Leibman

Professor Manfred Einsiedler

Approved by

Advisor

Graduate Program in Mathematics

ABSTRACT

This thesis is comprised primarily of two separate portions. In the first portion, we

exhibit, for any sparse enough increasing sequence $\{p_{n}\}$ of integers, a totally minimal,

totally uniquely ergodic, and topologically mixing system $(X, T)$ and $f \in C (X)$ for

which the averages $\frac{1}{N} \sum_{n = 0}^{N - 1} f (T^{p_{n}} x)$ fail to converge on a residual set in $X$ , answering

negatively an open question of Bergelson. We also construct here a totally minimal,

totally uniquely ergodic, and topologically mixing system $(X^{'}, T^{'})$ and $x^{'} \in X^{'}$ for

which $x^{'} \notin \overline{\{T^{p_{n}} x^{'}\}}$.

In the second portion, we study perturbations of multidimensional shifts of finite

type. Given any $Z^{d}$ shift of finite type $X$ for $d > 1$ and any word $w$ in the language of

$X$ , denote by $X_{w}$ the set of elements of $X$ in which $w$ does not appear. If $X$ satisfies

a uniform mixing condition called strong irreducibility, we obtain exponential upper

and lower bounds on $h^{top} (X) - h^{top} (X_{w})$ dependent only on the size of $w$. This result

generalizes a result of Lind about $Z$ shifts of finite type.

To Dilip

ACKNOWLEDGMENTS

First and foremost, I would like to thank my advisor Vitaly Bergelson. This thesis

could never have been completed without his help. His infectious enthusiasm

for mathematics, along with a willingness to share his knowledge, was a constant

inspiration.

I would also like to thank my other committee members, Professors Alexander Leib-

man and Manfred Einsiedler, for agreeing to serve on my committee and for many

fruitful mathematical discussions.

Finally, I thank my friends and loved ones, without whom I would never have been

able to get as far as I have. In no particular order:

Thank you to Mom and Dad, who have always been willing to do anything to

support me and have always believed in me.

Thank you to Greg, without whose friendship and sense of humor graduate school

would have been much less bearable.

Thank you to Jasmine, who was always there for me. I can never thank you

enough for your constant love and support.

VITA

2000-Present . . . . . . . . . . . . . . . . . . . . . . . . . . Graduate Teaching Associate,

The Ohio State University

1998-1999 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Student Instructional Associate,

The Ohio State University

2003 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . M.S. in Mathematics,

The Ohio State University

2000 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B.S. in Mathematics,

The Ohio State University

PUBLICATIONS

Research Papers

• Some Counterexamples In Topological Dynamics (Ergodic Theory and Dynamical

Systems, to appear)

• Perturbations of Multidimensional Shifts of Finite Type (submitted)

FIELDS OF STUDY

Major Field: Mathematics

Specialization: Ergodic Theory and Dynamical Systems

TABLE OF CONTENTS

Abstract ii

Dedication iii

Acknowledgments iv

Vita v

List of Figures ix

CHAPTER PAGE

1 Introduction 1

2 Recurrence and convergence of ergodic averages along sparse sequences 8

2.1 Introduction 8

2.2 Some general symbolic constructions 16

2.3 Some symbolic counterexamples 31

2.4 Some general constructions on connected manifolds 42

2.5 Some counterexamples on connected manifolds 55

2.6 A counterexample about simultaneous recurrence 61

2.7 Questions 65

3 Perturbations of multidimensional shifts of finite type 67

3.1 Introduction 67

3.2 Some measure-theoretic preliminaries 84

3.3 A replacement theorem 93

3.4 The proof of the main result 114

3.5 A closer look at the main result 146

3.6 An application to an undecidability question 155

3.7 Questions 159

Bibliography 161

LIST OF FIGURES

FIGURE PAGE

3.1 $f_{2}$ 's action on a sample element of $Y$ 73

3.2 A portion of a sample element of $Z$ 76

3.3 $a_{4}$ 77

3.4 $b_{4}$ 77

3.5 $R_{k (j_{1} + R), k (j_{2} + R), ..., k (j_{d} + R)}$ 85

3.6 A standard replacement of $u$ 95

3.7 A sample element $S$ of $R_{j}$ 97

3.8 An element $S$ of $R_{j}$ associated to two standard replacements 98

3.9 The suboctants of $B_{j}$ 99

3.10 Elements $S$, $S^{'}$ of $R_{j}$ whose difference is a multiple of $e_{i}$ 100

3.11 An element $S$ of $R_{j}$ which contains $O$ 106

3.12 107

3.13 $B_{j}^{'}$ 108

3.14 Intersecting occurrences of $w$ 116

3.15 $S_{i} ∖ S_{i}^{(R)}$ 120

3.16 Disallowed and allowed pairs of overlapping $w_{j, d}$ 127

3.17 A point $x \in X_{y}$ 138

3.18 The correspondence between copies of $Γ_{W}$ and points in $Γ_{j}^{(2)}$ 139

3.19 How a copy of $Γ_{W}$ is filled if $b_{j, d} (p) = 1$ 140

3.20 $f^{'}$ 141

CHAPTER 1

INTRODUCTION

This introduction will serve as a brief mathematical and historical overview of both

of the main problems that we will examine in this thesis. Due to their somewhat

disparate natures, the two portions of this thesis will each contain a more in-depth

introduction as well. For this reason, we will for the most part relegate formal defi-

nitions to their pertinent introduction.

Ergodic theory is the study of average or long-term behavior of systems which evolve

with time. For example, one can consider a particle bouncing in a box at fixed speed.

Its position and velocity at any time can be represented by a vector in $R^{6}$ . One can

then model this behavior by taking $X$ to be the space of all possible positions and

velocities for the particle, and $T$ a self-map of $X$ which, given a position and velocity,

gives the position and velocity one second later. This pairing of a space $X$ with a

self-map $T$ is called a dynamical system.

The more general setup is that of a group $G$ acting on a space $X$ with some sort

of structure by maps $\{T_{g}\}_{g \in G}$ preserving this structure. In this thesis, $G$ is always

$Z^{d}$ for some $d \in N$. Measure-theoretic ergodic theory occurs when $X$ is a measure

space with a probability measure $μ$ invariant under each $T_{g}$, and so in this case we call

$(X, B, μ, \{T_{g}\}_{g \in G})$ a measure-preserving dynamical system. Topological dynamics

occurs when $X$ is a compact topological space and each $T_{g}$ is a homeomorphism,

and so in this case we call $(X, \{T_{g}\}_{g \in G})$ a topological dynamical system. In either

type of system, if $G = Z$, then $T_{n} = (T_{1})^{n}$ for any integer $n$, and so we shorten

$(X, B, μ, \{T_{n}\}_{n \in Z})$ to $(X, B, μ, T)$ and $(X, \{T_{n}\}_{n \in Z})$ to $(X, T)$. (Here $T = T_{1}$.)

These cases are not as disjoint as they may first appear: the famed Bogoliouboff-Krylov

theorem states that any topological dynamical system $(X, \{T_{g}\}_{g \in G})$ has at

least one invariant probability measure $μ$ as long as $G$ is an amenable group. (An

examination of amenability is beyond the scope of this thesis, but we mention that

amenable groups form a class which contains all abelian groups. We only consider

$G = Z^{d}$ in this thesis, so all topological dynamical systems examined here will possess

invariant measures.) This sometimes allows one to examine topological properties of

a system via properties of invariant measures. For instance, in Chapter 3, we will

prove some purely combinatorial properties of a $Z^{d}$ -action by homeomorphisms of a

Cantor set by means of studying the invariant measures of this action.

Chapter 2 deals primarily with ergodic averaging. One of the fundamental results of

ergodic theory is Birkhoff's ergodic theorem:

Theorem 1.0.1. ([Bi]) For any measure-preserving dynamical system $(X, B, μ, T)$

and any $f \in L^{1} (X)$ , $\lim_{n \to \infty} \frac{1}{n} \sum_{i = 0}^{n - 1} f (T^{i} x)$ exists for $μ$ -almost every $x \in X$ .
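
To make the averages in Theorem 1.0.1 concrete, the following minimal sketch computes them numerically for one simple system. The choice of system (rotation of the unit interval by $\sqrt{2}$ mod 1), the observable $\cos (2 π x)$, and the function names are our own illustrative assumptions and do not come from the constructions of this thesis. For this example the averages should approach $\int f d μ = 0$.

```python
import math

def birkhoff_average(f, T, x, n):
    """Return (1/n) * sum_{i=0}^{n-1} f(T^i x)."""
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = T(x)
    return total / n

# Assumed example: an irrational rotation angle (sqrt(2) mod 1).
ALPHA = math.sqrt(2) % 1.0

def rotation(x):
    """Rotation x -> x + ALPHA on the circle, modelled as [0, 1)."""
    return (x + ALPHA) % 1.0

def observable(x):
    """A continuous function on the circle with integral zero."""
    return math.cos(2 * math.pi * x)

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, birkhoff_average(observable, rotation, 0.25, n))
```

This is only a floating-point illustration of convergence at a single starting point; it says nothing about the almost-everywhere statement in general.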

Several results have been proven about the convergence of such averages when one

averages not along all powers of $T$ , but only along some distinguished subset of the

integers. ([Bou], [Bou2], [Wi]) In particular, when one averages along $\{p (n)\}$ for a

polynomial $p (n)$ with integer coefficients, there is the following result of Bourgain:

Theorem 2.1.12. ([Bou], p. 7, Theorem 1) For any measure-preserving dynamical

system $(X, B, μ, T)$ , for any polynomial $q (t) \in Z [t]$ , and for any $f \in L^{p} (X, B, μ)$ with

$p > 1$ , $\lim_{N \to \infty} \frac{1}{N} \sum_{n = 1}^{N} f (T^{q (n)} x)$ exists for $μ$ -almost every $x \in X$ .

Theorem 2.1.12 can be interpreted as follows: for any polynomial $q (t) \in Z [t]$ , any

measure-preserving system $(X, B, μ, T)$ , and any measure-theoretically "nice" func-

tion $f$ , the set of points $x$ where $\lim_{N \to \infty} \frac{1}{N} \sum_{n = 1}^{N} f (T^{q (n)} x)$ fails to converge is of mea-

sure zero, or negligible measure-theoretically. It is then natural to wonder whether or

not there is a topological parallel to this result using topological notions of "niceness"

(continuity) and negligibility (first category), and in fact such a question was posed

by Bergelson:

Question 2.1.13. ([Be], p. 51, Question 5) Assume that a topological dynamical

system $(X, T)$ is uniquely ergodic$^{1}$, and let $p \in Z [t]$ and $f \in C (X)$. Is it true that for

all but a first category set of points $\lim_{n \to \infty} \frac{1}{n} \sum_{i = 0}^{n - 1} f (T^{p (i)} x)$ exists?

One of our results is a (quite negative) answer to Question 2.1.13:

Theorem 2.1.15. For any increasing sequence $\{p_{n}\}$ of integers with upper Banach

density zero$^{2}$, there exists a totally minimal, totally uniquely ergodic, and topologically

mixing topological dynamical system $(X, T)$ and a continuous function $f$ on $X$ with

the property that $\frac{1}{N} \sum_{n = 0}^{N - 1} f (T^{p_{n}} x)$ fails to converge for a residual set of $x$.

$^{1}$ A topological dynamical system $(X, T)$ is uniquely ergodic if there exists exactly one $T$-invariant

Borel probability measure $μ$.

$^{2}$ A set $A \subseteq N$ has upper Banach density zero if $\limsup_{n \to \infty} \sup_{m \in N} \frac{| \{m, m + 1, ..., m + n - 1\} \cap A |}{n} = 0$.

We also examine recurrence along distinguished subsets of the integers, motivated

primarily by the following result:

Theorem 2.1.17. ([BeL], p. 14, Corollary 1.8) For any $d \in N$, any minimal$^{3}$ topological

dynamical system $(X, \{T_{v}\}_{v \in Z^{d}})$ and any polynomials $q_{1} (t)$, $q_{2} (t)$, $...$, $q_{d} (t) \in Z [t]$

with $q_{i} (0) = 0$ for $1 \leq i \leq d$, for a residual set of $x \in X$ there exists a sequence

$\{n_{j}\}$ of positive integers such that $T_{q_{i} (n_{j}) e_{i}} x \to x$ for $1 \leq i \leq d$, where $\{e_{i}\}_{i = 1}^{d}$ is the

standard orthonormal basis of $R^{d}$.

In particular, this implies that for any minimal topological dynamical system $(X, T)$

and any polynomial $q (t) \in Z [t]$ with $q (0) = 0$, the set of $x \in X$ with $x \in \overline{\{T^{q (n)} x\}_{n \in N}}$

is residual. The following result shows that for $q$ with degree at least two, this residual

set is not necessarily all of $X$ .

Theorem 2.1.18. For any increasing sequence $\{p_{n}\}$ of integers with upper Banach

density zero, there exists a totally minimal, totally uniquely ergodic, and topologically

mixing topological dynamical system $(X, T)$ and an uncountable set $A \subset X$ such that

for every $x \in A$, the sequence $\{T^{p_{n}} x\}$ does not have $x$ as a limit point, i.e. there is

no sequence of positive integers $\{n_{i}\}$ such that $T^{p_{n_{i}}} x$ converges to $x$.

Another consequence of Theorem 2.1.17 is that for any commuting minimal homeomorphisms

$T$ and $S$ of a compact space $X$, there exists a residual set of $x$ for which

there exists a sequence of positive integers $\{n_{i}\}$ such that $T^{n_{i}} x \to x$ and $S^{n_{i}} x \to x$.

The following theorem shows that for some systems, it is the case that this residual

set of $x$ is not all of $X$.

$^{3}$ A topological dynamical system $(X, T)$ is minimal if there exist no nonempty proper closed

$T$-invariant subsets of $X$.

Theorem 2.1.21. There exists a totally minimal topological dynamical system $(X, T)$

and a point $x \in X$ such that for any positive integers $r \neq s$, any sequence $\{n_{i}\}$ of

integers satisfying $T^{r n_{i}} x \to x$ and $T^{s n_{i}} x \to x$ is eventually zero.

The systems which we construct to prove Theorems 2.1.15 and 2.1.18 are symbolic

dynamical systems. A symbolic dynamical system is defined by first choosing a

finite set $A$, called the alphabet. The space $Ω = A^{G}$ endowed with the product topology is a

compact space, and for any $g \in G$, we may define the shift homeomorphism $σ_{g}$ by

$(σ_{g} ω) (h) = ω (h g)$ for $ω \in Ω$. Any closed set $X \subseteq Ω$ has a topology induced by $Ω$, and

if $X$ is invariant under each $σ_{g}$, then $(X, \{σ_{g}\}_{g \in G})$ is a topological dynamical system

which we call a symbolic dynamical system.

In Chapter 3, we examine symbolic dynamical systems exclusively. There, we consider

a specific type of symbolic dynamical system called a shift of finite type. A $Z^{d}$ -shift

of finite type is a symbolic dynamical system $(X, \{σ_{v}\}_{v \in Z^{d}})$ where $X$ is defined by

specifying a finite set $F$ of finite words (a word is a function from a finite subset of

$Z^{d}$ to $A$ ) and taking $X$ to be the set of all elements of $Ω$ in which none of the words

in $F$ appear. The shift of finite type $X$ specified by a finite set $F$ of words in this

way is denoted by $Ω_{F}$ . For example, the set of all biinfinite sequences of zeroes and

ones in which no two ones occur consecutively is a shift of finite type, as is the set of

all $Z^{2}$ arrays of zeroes and ones in which no three-by-four blocks of all zeroes or all

ones appear.
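
As an illustration of how a finite list of forbidden words determines a shift of finite type, the following sketch enumerates the finite binary words in which no two ones occur consecutively. The helper name `allowed_words` and the brute-force enumeration are our own assumptions, chosen for brevity. For this particular example every word avoiding `11` extends to a biinfinite sequence (pad with zeroes), so the lists below are exactly the words of each length in the language.

```python
from itertools import product

def allowed_words(n, forbidden=("11",)):
    """List all binary words of length n containing no forbidden word.

    `forbidden` is a hypothetical stand-in for the finite set F of
    forbidden words; the default gives the "no two consecutive ones" shift.
    """
    words = []
    for letters in product("01", repeat=n):
        w = "".join(letters)
        if not any(f in w for f in forbidden):
            words.append(w)
    return words

if __name__ == "__main__":
    for n in range(1, 7):
        print(n, len(allowed_words(n)))
```

The counts satisfy the Fibonacci recurrence, and the exponential growth rate of such counts is what the topological entropy discussed below measures.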

We are particularly interested in the effects of forbidding a particular word from a

shift of finite type $X$ . Given any word $w$ , define $X_{w}$ to be the set of elements of $X$ in

which $w$ does not appear. Then clearly $X_{w}$ is a subset of $X$ , and so the topological

entropy $h^{top} (X_{w})$ of $X_{w}$ is not greater than the topological entropy $h^{top} (X)$ of $X$. It

is natural to wonder how much the topological entropy drops by when $w$ is removed,

as it is a sort of measure of how important $w$ is to the information-retaining capacity

of $X$ . In [L], Lind proved the following: (the condition $w \in L_{Γ_{n}} (X)$ means that

$w \in A^{[1, ..., n]^{d}}$ and that $w$ appears in some element of $X$.)

Theorem 3.1.16. ([L], p. 360, Theorem 3) For any topologically transitive $Z$-shift

of finite type $X = Ω_{F}$ with positive topological entropy $h^{top} (X)$, there exist constants

$C_{X}$, $D_{X}$, and $N_{X}$ such that for any $n > N_{X}$ and any word $w \in L_{Γ_{n}} (X)$, if we denote

by $X_{w}$ the shift of finite type $Ω_{F \cup \{w\}}$, then

$\frac{C_{X}}{e^{h^{top} (X) n}} < h^{top} (X) - h^{top} (X_{w}) < \frac{D_{X}}{e^{h^{top} (X) n}}$.
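
The exponential scale in Theorem 3.1.16 can be observed numerically in one simple case. The sketch below is our own illustration, not part of Lind's argument: we assume $X$ is the full shift on two symbols, so that $h^{top} (X) = \log 2$, and we forbid the word $w = 1^{n}$. It uses the standard fact that the number of binary words of length $m$ with no run of $n$ ones grows like $λ^{m}$, where $λ$ is the largest real root of $x^{n} = x^{n - 1} + ... + x + 1$. All function names are hypothetical.

```python
import math

def growth_rate_avoiding_ones(n, iterations=200):
    """Largest real root in (1, 2) of x^n = x^(n-1) + ... + x + 1 (n >= 2).

    Found by bisection: the difference f is negative at 1 and positive at 2.
    """
    def f(x):
        return x**n - sum(x**k for k in range(n))

    lo, hi = 1.0, 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def scaled_entropy_drop(n):
    """(h(X) - h(X_w)) * e^{h(X) n} for the full 2-shift and w = 1^n."""
    h_full = math.log(2)
    h_restricted = math.log(growth_rate_avoiding_ones(n))
    return (h_full - h_restricted) * 2**n

if __name__ == "__main__":
    for n in range(2, 13):
        print(n, scaled_entropy_drop(n))
```

In this example the rescaled drop stays between two positive constants as $n$ grows, which is the behavior the theorem describes; the sketch checks only this one family of words.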

Our main result is a generalization of Theorem 3.1.16 for $Z^{d}$ -shifts of finite type

which satisfy a mixing condition called strong irreducibility. (We formally define

strong irreducibility in Definition 3.1.19.)

Theorem 3.1.22. For any $d > 1$ and any strongly irreducible $Z^{d}$-shift $X = Ω_{F}$

of finite type with uniform filling length $R$ and positive topological entropy $h^{top} (X)$,

there exist constants $N_{X} \in N$ and $D_{X} \in R$ such that for any $n > N_{X}$ and any word

$w \in L_{Γ_{n}} (X)$, if we denote by $X_{w}$ the shift of finite type $Ω_{F \cup \{w\}}$, then

$\frac{1}{e^{h^{top} (X) (n + 44 R + 70)^{d}}} < h^{top} (X) - h^{top} (X_{w}) < \frac{D_{X}}{e^{h^{top} (X) (n - 2 R)^{d}}}$.

One way in which $Z^{d}$ -shifts of finite type are more difficult to deal with for $d > 1$ is

that given a finite collection $F$ of patterns, it is undecidable whether or not the shift

of finite type $X$ induced by $F$ is even nonempty. Our methods yield a situation in

which this question can be answered:

Theorem 3.6.1. For any alphabet $A$, there exist $F, G \in N$ such that for any $m > 0$

and any finite set of words $F_{m} = \{w_{k} \in L_{Γ_{n_{k}}} (X) : 1 \leq k \leq m\}$ satisfying $n_{1} > G$

and $n_{k} \geq F (n_{k - 1})^{4 d^{2}}$ for $1 < k \leq m$, $Ω_{F_{m}} \neq \emptyset$.

CHAPTER 2

RECURRENCE AND CONVERGENCE OF ERGODIC

AVERAGES ALONG SPARSE SEQUENCES

2.1 Introduction

In this chapter, we are concerned with the convergence of averages of the form

$\frac{1}{N} \sum_{n = 0}^{N - 1} f (T^{p_{n}} x)$ for an increasing sequence of integers $\{p_{n}\}$. We begin with some

definitions.

Definition 2.1.1. A measure-preserving dynamical system $(X, B, μ, \{T_{g}\}_{g \in G})$

consists of a measure space $X$, a probability measure $μ$ with $σ$-algebra $B$ of measurable

sets, and a group action $\{T_{g}\}_{g \in G}$ of transformations $T_{g} : X \to X$ with $μ (T_{g}^{- 1} A) = μ (A)$

for all $A \in B$.

Definition 2.1.2. A measure-preserving dynamical system $(X, B, μ, \{T_{g}\}_{g \in G})$ is ergodic

if any set $A$ satisfying $T_{g} A \subseteq A$ for all $g \in G$ has $μ (A) \in \{0, 1\}$.

Definition 2.1.3. A topological dynamical system $(X, \{T_{g}\}_{g \in G})$ consists of a

compact topological space $X$ and a group action $\{T_{g}\}_{g \in G}$ of homeomorphisms $T_{g} : X \to X$.

Definition 2.1.4. Given a topological dynamical system $(X, \{T_{g}\}_{g \in G})$, a Borel probability

measure $μ$ on $X$ is called ergodic if $(X, B (X), μ, \{T_{g}\}_{g \in G})$ is an ergodic

measure-preserving dynamical system, where $B (X)$ is the Borel $σ$-algebra of $X$.

In this chapter, all dynamical systems have $G = Z$ , so as already described we

will use the notations $(X, B, μ, T)$ and $(X, T)$ for measure-preserving and topological

dynamical systems respectively.

Definition 2.1.5. A topological dynamical system $(X, T)$ is minimal if for any

closed set $K$ with $T^{- 1} K \subseteq K$, $K = \emptyset$ or $K = X$. $(X, T)$ is totally minimal if

$(X, T^{n})$ is minimal for every $n \in N$.

Definition 2.1.6. A topological dynamical system $(X, T)$ is uniquely ergodic if

there is only one Borel probability measure $μ$ on $X$ such that $μ (A) = μ (T^{- 1} A)$ for every Borel

set $A \subseteq X$. $(X, T)$ is totally uniquely ergodic if $(X, T^{n})$ is uniquely ergodic for

every $n \in N$.

Definition 2.1.7. A topological dynamical system $(X, T)$ is topologically mixing

if for any nonempty open sets $U, V \subseteq X$, there exists $N \in N$ such that for any $n > N$,

$U \cap T^{n} V \neq \emptyset$.

Definition 2.1.8. For any set $A \subseteq N$, the upper Banach density of $A$ is defined by

$d^{*} (A) = \limsup_{n \to \infty} \sup_{m \in N} \frac{| \{m, m + 1, ..., m + n - 1\} \cap A |}{n}$.

Definition 2.1.9. For any set $A \subseteq N$, the upper density of $A$ is defined by

$\bar{d} (A) = \limsup_{n \to \infty} \frac{| \{1, ..., n\} \cap A |}{n}$.

Definition 2.1.10. For a set $A \subseteq N$, the density of $A$ is defined by

$d (A) = \lim_{n \to \infty} \frac{| \{1, ..., n\} \cap A |}{n}$

if this limit exists.
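
The three densities above differ only in which windows of integers are inspected. The sketch below computes finite approximations of the quantities inside the limits for the set of perfect squares; the function names, the cutoff `m_max` on the window positions, and the choice of example set are our own assumptions. A finite computation can only suggest, not prove, that a density is zero.

```python
from bisect import bisect_left

def max_window_density(sorted_a, n, m_max):
    """max over 1 <= m <= m_max of |{m, ..., m+n-1} ∩ A| / n.

    `sorted_a` must be a sorted list of positive integers that covers
    every window inspected, i.e. contains all elements of A below m_max + n.
    """
    best = 0
    for m in range(1, m_max + 1):
        count = bisect_left(sorted_a, m + n) - bisect_left(sorted_a, m)
        best = max(best, count)
    return best / n

def initial_density(sorted_a, n):
    """|{1, ..., n} ∩ A| / n, the quantity inside the (upper) density."""
    return bisect_left(sorted_a, n + 1) / n

if __name__ == "__main__":
    squares = [k * k for k in range(1, 200)]
    for n in (100, 400, 1600):
        print(n, max_window_density(squares, n, 1000), initial_density(squares, n))
```

For the squares the densest window of each length is the initial one, so both approximations decay like $n^{- 1 / 2}$, consistent with upper Banach density zero.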

Definition 2.1.11. Given a topological dynamical system $(X, T)$ and a $T$-invariant

Borel probability measure $μ$, a point $x \in X$ is $(T, μ)$-generic if for every $f \in C (X)$,

$\lim_{n \to \infty} \frac{1}{n} \sum_{i = 0}^{n - 1} f (T^{i} x) = \int f d μ$.

In the measure-preserving setup, there are several positive results about convergence

of averages of the form $\frac{1}{N} \sum_{n = 0}^{N - 1} f (T^{p_{n}} x)$ , including this theorem of Bourgain:

Theorem 2.1.12. ([Bou], p. 7, Theorem 1) For any measure-preserving dynamical

system $(X, B, μ, T)$ , for any polynomial $q (t) \in Z [t]$ , and for any $f \in L^{p} (X, B, μ)$ with

$p > 1$ , $\lim_{N \to \infty} \frac{1}{N} \sum_{n = 1}^{N} f (T^{q (n)} x)$ exists for $μ$ -almost every $x \in X$ .

The following question regarding a possible topological version of Theorem 2.1.12 was

posed by Bergelson:

Question 2.1.13. ([Be], p. 51, Question 5) Assume that a topological dynamical

system $(X, T)$ is uniquely ergodic, and let $p \in Z [t]$ and $f \in C (X)$ . Is it true that for

all but a first category set of points $\lim_{n \to \infty} \frac{1}{n} \sum_{i = 0}^{n - 1} f (T^{p (i)} x)$ exists?

Bergelson added the hypothesis of unique ergodicity because it is a classical result

that a system $(X, T)$ is uniquely ergodic with unique $T$ -invariant measure $μ$ if and

only if for every $x \in X$ and $f \in C (X)$ , $\lim_{n \to \infty} \frac{1}{n} \sum_{i = 0}^{n - 1} f (T^{i} x) = \int_{X} f d μ$ , and so in

the topological setup this is a natural assumption to make about $(X, T)$ to achieve

the desired result.

However, Bergelson was particularly interested in the convergence of these averages

to the "correct limit," i.e. $\int_{X} f d μ$ where $μ$ is the unique $T$ -invariant measure on

$X$ . To have any hope for such a result, it also becomes necessary to assume ergod-

icity for all powers of $T$ in order to avoid some natural counterexamples related to

distribution (mod $k$) of $p (n)$ for positive integers $k$. For example, if $p (n) = n^{2}$, $T$ is

the permutation on $X = \{0, 1, 2\}$ defined by $T x = x + 1$ (mod 3), $μ$ is normalized

counting measure on $X$, and $f = χ_{\{0\}}$, then $(X, T)$ is obviously uniquely ergodic with

unique invariant measure $μ = \frac{δ_{0} + δ_{1} + δ_{2}}{3}$, but

$\lim_{n \to \infty} \frac{1}{n} \sum_{i = 0}^{n - 1} f (T^{p (i)} x) = \begin{cases} \frac{1}{3} & \text{if } x = 0, \\ 0 & \text{if } x = 1, \text{ and} \\ \frac{2}{3} & \text{if } x = 2. \end{cases}$
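
The three-point example can be checked by direct computation. In the sketch below (the function name is ours), $T^{i^{2}} x = x + i^{2}$ (mod 3), and we count how often this equals 0. Since $i^{2}$ (mod 3) is 0 when $i$ is divisible by 3 and 1 otherwise, the averages over any number of terms divisible by 3 are exactly $\frac{1}{3}$, $0$, and $\frac{2}{3}$ for $x = 0, 1, 2$ respectively, so the limit depends on $x$.

```python
from fractions import Fraction

def average_along_squares(x, n):
    """(1/n) * sum_{i=0}^{n-1} f(T^{i^2} x) for T x = x + 1 (mod 3), f = indicator of {0}."""
    hits = sum(1 for i in range(n) if (x + i * i) % 3 == 0)
    return Fraction(hits, n)

if __name__ == "__main__":
    for x in (0, 1, 2):
        print(x, average_along_squares(x, 3000))
```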

To avoid such examples, we would need $T$ to be totally ergodic as well as uniquely

ergodic, and so it makes sense to assume total unique ergodicity to encompass both

properties. Bergelson's revised question then looks like this:

Question 2.1.14. Assume that a topological dynamical system $(X, T)$ is totally uniquely

ergodic with unique $T$-invariant measure $μ$, and let $p \in Z [t]$ and $f \in C (X)$. Is it true

that for all but a first category set of points $\lim_{n \to \infty} \frac{1}{n} \sum_{i = 0}^{n - 1} f (T^{p (i)} x) = \int_{X} f d μ$?

We answer Questions 2.1.13 and 2.1.14 negatively in the case where the degree of

$p$ is at least two, and in fact prove some slightly more general results. The level of

generality depends on what hypotheses we place on the space $X$ . In particular, we

can exhibit more counterexamples in the case where $X$ is a totally disconnected space

than we can in the case where $X$ is a connected space such as $T^{k}$ .

Theorem 2.1.15. For any increasing sequence $\{p_{n}\}$ of integers with upper Banach

density zero, there exists a totally minimal, totally uniquely ergodic, and topologically

mixing topological dynamical system $(X, T)$ and a continuous function $f$ on $X$ with

the property that $\frac{1}{N} \sum_{n = 0}^{N - 1} f (T^{p_{n}} x)$ fails to converge for a residual set of $x$ .

Theorem 2.1.16. For any increasing sequence $\{p_{n}\}$ of integers with the property

that for some integer $d$, $p_{n + 1} < (p_{n + 1} - p_{n})^{d}$ for all sufficiently large $n$, there exists

a totally minimal, totally uniquely ergodic, and topologically mixing topological

dynamical system $(X, T)$ and a continuous function $f$ on $X$ with the property that

$\frac{1}{N} \sum_{n = 0}^{N - 1} f (T^{p_{n}} x)$ fails to converge for a residual set of $x$. In addition, the space $X$ is

a connected $(2 d + 9)$-manifold.

We note that Theorems 2.1.15 and 2.1.16 answer Question 2.1.13 negatively for $p$

with degree at least two, since the sequence $p_{n} = p (n)$ for any nonlinear $p (t) \in Z [t]$

satisfies the hypotheses of both theorems.
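
The gap hypothesis $p_{n + 1} < (p_{n + 1} - p_{n})^{d}$ of Theorem 2.1.16 is easy to test on a finite range. The helper below is a hypothetical illustration of our own; a finite check is evidence rather than proof. For $p (n) = n^{2}$ the condition with $d = 2$ reads $(n + 1)^{2} < (2 n + 1)^{2}$, which holds for every $n \geq 1$, while for the linear sequence $p (n) = n$ the gaps all equal 1 and the condition fails for every $d$.

```python
def satisfies_gap_condition(p, d, n_start, n_end):
    """Check p(n+1) < (p(n+1) - p(n))**d for all n_start <= n < n_end."""
    return all(
        p(n + 1) < (p(n + 1) - p(n)) ** d
        for n in range(n_start, n_end)
    )

if __name__ == "__main__":
    print(satisfies_gap_condition(lambda n: n * n, 2, 1, 1000))   # squares, d = 2
    print(satisfies_gap_condition(lambda n: n ** 3, 2, 1, 1000))  # cubes, d = 2
    print(satisfies_gap_condition(lambda n: n, 5, 1, 1000))       # linear, d = 5
```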

Theorems 2.1.15 and 2.1.16 are about nonconvergence of ergodic averages along certain

sequences of powers of $T$. We also prove two similar results about nonrecurrence

of points. As motivation, we note that a minimal system has the property that every

point is recurrent. In other words, if $(X, T)$ is minimal, then for all $x \in X$ , it is the

case that $x \in \overline{\{T^{n} x\}_{n \in N}}$. If $(X, T)$ is totally minimal, then all points are recurrent

even along infinite arithmetic progressions: for any positive integer $a$, any nonnegative integer $b$, and for

all $x \in X$, $x \in \overline{\{T^{a n + b} x\}_{n \in N}}$. It is then natural to wonder if the same is true for other

sequences of powers of $T$ , and in this vein there is the following result of Bergelson

and Leibman, which is a corollary to their Polynomial van der Waerden theorem:

Theorem 2.1.17. ([BeL], p. 14, Corollary 1.8) For any $d \in N$, any minimal topological

dynamical system $(X, \{T_{v}\}_{v \in Z^{d}})$ and any polynomials $q_{1} (t)$, $q_{2} (t)$, $...$, $q_{d} (t) \in Z [t]$

with $q_{i} (0) = 0$ for $1 \leq i \leq d$, for a residual set of $x \in X$ there exists a sequence

$\{n_{j}\}$ of positive integers such that $T_{q_{i} (n_{j}) e_{i}} x \to x$ for $1 \leq i \leq d$, where $\{e_{i}\}_{i = 1}^{d}$ is the

standard orthonormal basis of $R^{d}$.

In particular, Theorem 2.1.17 implies that for a minimal system $(X, T)$ and any

polynomial $p (t)$ with $p (0) = 0$, the set of $x$ such that $x \in \overline{\{T^{p (n)} x\}_{n \in N}}$ is residual. If

$(X, T)$ is assumed to be totally minimal, then by definition if the degree of $p$ is one,

then every point $x \in X$ is in $\overline{\{T^{p (n)} x\}_{n \in N}}$. The following two results imply that if the

degree of $p$ is greater than one, total minimality of $(X, T)$ does not necessarily imply

that every point $x \in X$ is in $\overline{\{T^{p (n)} x\}_{n \in N}}$.

Theorem 2.1.18. For any increasing sequence $\{p_{n}\}$ of integers with upper Banach

density zero, there exists a totally minimal, totally uniquely ergodic, and topologically

mixing topological dynamical system $(X, T)$ and an uncountable set $A \subset X$ such that

for every $x \in A$, the sequence $\{T^{p_{n}} x\}$ does not have $x$ as a limit point, i.e. there is

no sequence of positive integers $\{n_{i}\}$ such that $T^{p_{n_{i}}} x$ converges to $x$.

Theorem 2.1.19. For any increasing sequence $\{p_{n}\}$ of integers with the property

that for some integer $d$, $p_{n + 1} < (p_{n + 1} - p_{n})^{d}$ for all sufficiently large $n$, there exists a

totally minimal, totally uniquely ergodic, and topologically mixing topological dynamical

system $(X, T)$ and a point $x \in X$ such that the sequence $\{T^{p_{n}} x\}$ does not have

$x$ as a limit point, i.e. there is no sequence of positive integers $\{n_{i}\}$ such that $T^{p_{n_{i}}} x$

converges to $x$. In addition, the space $X$ is a connected $(2 d + 7)$-manifold.

The following simple lemma shows that if $(X, T)$ is topologically mixing, then Theorems

2.1.18 and 2.1.19 cannot be improved too much, i.e. there is no increasing

sequence $\{p_{n}\}$ for which we can exhibit topologically mixing examples with a second

category set of such nonrecurrent points.

Lemma 2.1.20. If a topological dynamical system $(X, T)$ is topologically mixing, then

for any increasing sequence $\{p_{n}\}$, the set of $x \in X$ for which $x \notin \overline{\{T^{p_{n}} x\}}$ is of first

category.

Proof: Define $C_{ε} = \{x : d (x, T^{p_{n}} x) \geq ε \ \forall n \in N\}$. It is clear that $C_{ε}$ is closed. We

claim that $C_{ε}$ contains no nonempty open set, which shows that it is nowhere dense,

implying that $C = \bigcup_{n = 1}^{\infty} C_{\frac{1}{n}}$, the set of points $x$ for which $x$ is not a limit point of

$\{T^{p_{n}} x\}$, is of first category. Suppose, for a contradiction, that there is a nonempty

open set $U$ with $U \subseteq C_{ε}$ for some $ε$. Then, there exists a nonempty open set $V$ with $\mathrm{diam} (V) < ε$ such that

$V \subseteq U \subseteq C_{ε}$. By topological mixing, there exists $n$ such that $V \cap T^{- p_{n}} V \neq \emptyset$. This

implies that there exists $x \in V$ so that $T^{p_{n}} x \in V$. Since $\mathrm{diam} (V) < ε$, $d (x, T^{p_{n}} x) < ε$.

However, $x \in V \subseteq C_{ε}$, so we have a contradiction.

$□$

Theorem 2.1.18 shows that it is possible for this set of points nonrecurrent along $p_{n}$

to be uncountable though.

We mention that some mixing condition is necessary for a statement like Lemma 2.1.20;

as a simple example, consider an irrational circle rotation $T : x \mapsto x + α$ on the circle

$T$. There is clearly some increasing sequence of integers $\{p_{n}\}$ such that $p_{n} α$

(mod $1$) $\to \frac{1}{2}$. Then, for any $x \in T$, $T^{p_{n}} x \to x + \frac{1}{2}$, and so for every $x \in X$, $\{T^{p_{n}} x\}$

does not have $x$ as a limit point.
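
The existence of such a sequence for the rotation can be seen by a greedy search. The sketch below is our own illustration, with a hypothetical function name and an arbitrary choice of shrinking tolerances $\frac{1}{k + 2}$: for each $k$ it looks for the next integer $p$ beyond the previous one whose multiple $p α$ (mod 1) lies within the tolerance of $\frac{1}{2}$. Density of the multiples of an irrational $α$ modulo 1 guarantees each search terminates; floating-point arithmetic makes this a numerical illustration only.

```python
import math

def half_turn_times(alpha, count):
    """Greedily pick increasing p_1 < p_2 < ... with p_k * alpha (mod 1)
    within 1 / (k + 2) of 1/2."""
    times = []
    p = 0
    for k in range(1, count + 1):
        tolerance = 1.0 / (k + 2)
        p += 1
        while abs((p * alpha) % 1.0 - 0.5) >= tolerance:
            p += 1
        times.append(p)
    return times

if __name__ == "__main__":
    alpha = math.sqrt(2)  # assumed irrational rotation angle
    for p in half_turn_times(alpha, 10):
        print(p, (p * alpha) % 1.0)
```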

We also prove a result about simultaneous recurrence. Theorem 2.1.17 implies that

for any commuting minimal homeomorphisms $T$ and $S$ of a compact space $X$ , there

exists a residual set of $x$ for which there exists a sequence of positive integers $\{n_{i}\}$

such that $T^{n_{i}} x \to x$ and $S^{n_{i}} x \to x$. The following theorem shows that for some

systems, it is the case that this residual set of $x$ is not all of $X$. (In this particular

example, $T$ and $S$ are powers of the same homeomorphism.)

Theorem 2.1.21. There exists a totally minimal topological dynamical system $(X, T)$

and a point $x \in X$ such that for any positive integers $r \neq s$, any sequence $\{n_{i}\}$ of

integers satisfying $T^{r n_{i}} x \to x$ and $T^{s n_{i}} x \to x$ is eventually zero.

Before proceeding with the proofs, we now give a brief description of the content of

this chapter. In Section 2.2, we will describe some general symbolic constructions of

topological dynamical systems with particular mixing properties. At the end of this

section, we will arrive at a construction of a system which is totally minimal, totally

uniquely ergodic, and topologically mixing, and which has as a parameter a sequence

of integers $\{n_{k}\}$.

In Section 2.3, by taking this sequence $\{n_{k}\}$ to grow very quickly, we will show

that the examples constructed in Section 2.2 are sufficient to prove Theorems 2.1.15

and 2.1.18. Some interesting questions also arise and are answered in Section 2.3

pertaining to the upper Banach density of countable unions of sets of upper Banach

density zero.

In Section 2.4, we create a flow under a function with base transformation a skew

product, which acts on a connected manifold, and which is totally minimal, totally

uniquely ergodic, and topologically mixing. This transformation has as a parameter

a function $f \in C (T)$ . We use conditions of Fayad ([Fa]) on flows under functions

to achieve topological mixing, and some conditions of Furstenberg ([Fu]) on skew

products to prove total unique ergodicity.

In Section 2.5, by a judicious choice of $f$ , we use the examples of Section 2.4 to prove

Theorems 2.1.16 and 2.1.19.

In Section 2.6, we prove Theorem 2.1.21.

Finally, in Section 2.7 we give some open questions about strengthening our results.

2.2 Some general symbolic constructions

Definition 2.2.1. An alphabet is any finite set, whose elements are called letters.

Definition 2.2.2. A symbolic dynamical system with alphabet $A$ is a topological

dynamical system $(X, \{σ_{g}\}_{g \in G})$ where for any $g \in G$, $σ_{g}$ is the shift homeomorphism

of $Ω = A^{G}$ defined by $(σ_{g} ω) (h) = ω (h g)$ for $ω \in Ω$ , and $X$ is any closed subset of $Ω$

invariant under each $σ_{g}$ .

Definition 2.2.3. A word on the alphabet $A$ is any element $w$ of $A^{F}$ for some finite

set $F \subset G$. $F$ is called the shape of $w$.

For any words $v$ of length $m$ and $w$ of length $n$, we denote by $vw$ their concatenation,

i.e. the word $v[1] v[2] \ldots v[m] w[1] w[2] \ldots w[n]$ of length $m + n$. We denote by $w^{k}$ the

word $w w \ldots w$ given by the concatenation of $k$ copies of $w$.

Definition 2.2.4. Given any symbolic dynamical system $(X, \{σ_{g}\}_{g \in G})$, the language

of $X$, denoted by $L(X)$, is the set of all words which appear as subwords of elements of $X$.

All symbolic dynamical systems appearing in this chapter have alphabet $\{0, 1\}$ and $G = Z$.

We also restrict ourselves in this chapter to words with shape $\{1, \ldots, n\}$ for some

$n$. In other words, a word can be thought of for now as a finite string of letters

$w = w(1) w(2) \ldots w(n)$. The number of letters in a word $w$ is called its length.

Definition 2.2.5. A word $w$ of length $n$ is a subword of a sequence $u \in Ω$ if there

exists $k \in Z$ such that $u(i + k) = w(i)$ for $1 \leq i \leq n$. Analogously, $w$ is a subword

of a word $v$ of length $m$ if there exists $0 \leq k \leq m - n$ such that $v(i + k) = w(i)$ for

$1 \leq i \leq n$.
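For finite words, the subword relation of Definition 2.2.5 can be tested directly. The following is a minimal sketch; the function name `is_subword` is introduced here for illustration and does not come from the text. It represents words as Python strings and lets the shift $k$ range over $0, \ldots, m - n$, so that a word counts as a subword of itself.

```python
def is_subword(w: str, v: str) -> bool:
    """Return True if w occurs in v at some shift k with 0 <= k <= len(v) - len(w)."""
    n, m = len(w), len(v)
    # v[k:k + n] is the block v(k+1) ... v(k+n) in the 1-based notation of the text.
    return any(v[k:k + n] == w for k in range(0, m - n + 1))


print(is_subword("01", "011010110"))
print(is_subword("00", "011010110"))
```

The case $k = 0$ is what makes `is_subword("abc", "abc")` hold.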

We will outline three constructions which algorithmically create $X$ such that $(X, σ)$ has certain properties. (Here the "certain properties" in question depend on which construction is used.) It should also be mentioned that many ideas from these constructions are taken from work of Hahn and Katznelson ([HaK]), where they also algorithmically constructed symbolic topological dynamical systems with certain ergodicity and mixing properties.

Construction 1: (Minimal) We define inductively $n_{k}$ and $A_{k}$, which are, respectively, sequences of positive integers and sets of words on the alphabet $\{0, 1\}$. Each word in $A_{k}$ is of length $n_{k}$. (We will use the term "$A_{k}$-word" to refer to a member of $A_{k}$ from now on.) We define these as follows: always define $n_{1} = 1$ and $A_{1} = \{0, 1\}$. Then, for any $k \geq 1$, $n_{k + 1}$ is defined to be any integer greater than or equal to $n_{k} |A_{k}|$ which is also a multiple of $n_{k}$, and then $A_{k + 1}$ is chosen to be the set of words of length $n_{k + 1}$ which are concatenations of $A_{k}$-words, containing each $A_{k}$-word in the concatenation at least once.
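The inductive step of Construction 1 can be carried out by brute force for the first few levels. The sketch below is only an illustration: the helper name `next_level` and the choices $n_{2} = 2$, $n_{3} = 4$ are assumptions made here, not values fixed by the text, and the enumeration is exponential in the number of blocks, so it is only usable for tiny parameters.

```python
from itertools import product


def next_level(words, n_next):
    """Build A_{k+1} from A_k: all concatenations of A_k-words of total
    length n_next in which every A_k-word is used at least once."""
    words = set(words)
    n = len(next(iter(words)))  # common length n_k of the A_k-words
    if n_next % n != 0 or n_next < n * len(words):
        raise ValueError("n_next must be a multiple of n_k and at least n_k * |A_k|")
    blocks = n_next // n
    result = set()
    for combo in product(sorted(words), repeat=blocks):
        if set(combo) == words:  # each A_k-word appears in the concatenation
            result.add("".join(combo))
    return result


A1 = {"0", "1"}
A2 = next_level(A1, 2)  # smallest admissible n_2
A3 = next_level(A2, 4)  # smallest admissible n_3
print(sorted(A2), sorted(A3))
```

With these minimal choices every $A_{3}$-word contains both $A_{2}$-words, which is the property the minimality argument relies on.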

We then define $X$ to be the set of all $x \in Ω$ which are limits of shifted $A_{k}$-words. In other words, $x \in X$ if there exist $w_{k} \in A_{k}$ and $m_{k} \in Z$ such that for every $i \in Z$, $x(i) = w_{k}(i - m_{k})$ for all large enough $k$. Since $Ω$ is compact, such $x$ exist by a standard diagonalization argument. It is easily checked that $X$ is closed and $σ$-invariant. The claim is that regardless of the choice of the integers $n_{k}$, as long as $n_{k}$ divides $n_{k + 1}$ and $n_{k + 1} \geq n_{k} |A_{k}|$, $(X, σ)$ is minimal. It suffices to show that for any $y \in X$, $\overline{\{σ^{n} y\}_{n \in Z}} = X$.

Choose any $y \in X$ and $w \in L(X)$. By the definition of $X$, there exist $k$ and some $A_{k}$-word $w_{k}$ such that $w$ is a subword of $w_{k}$. Since $w_{k}$ is an $A_{k}$-word, by definition every $A_{k + 1}$-word contains $w_{k}$, and therefore $w$, as a subword. Finally, note that again by the definition of Construction 1, $y$ is a biinfinite concatenation of $A_{k + 1}$-words. This implies that $y(1) \ldots y(2 n_{k + 1})$ contains some $A_{k + 1}$-word, and therefore $w$, as a subword, and so there exists $n \in Z$ so that $σ^{n} y$ begins with $w$. Since $w$ was an arbitrary word in $L(X)$, this implies that $\overline{\{σ^{n} y\}_{n \in Z}} = X$, and so $(X, σ)$ is minimal.

$□$

So, we have now demonstrated a way of constructing minimal $(X, σ)$ . We will now

make this construction a bit more complex in order to make $(X, σ)$ totally minimal.

Construction 2: (Totally minimal) We define inductively $n_{k}$ and $A_{k}$, which are, respectively, sequences of positive integers and sets of words on the alphabet $\{0, 1\}$. Each word in $A_{k}$ is of length $n_{k}$. We define these as follows: always define $n_{1} = 1$ and $A_{1} = \{0, 1\}$. Then, for any $k \geq 1$, $n_{k + 1}$ is defined to be any integer greater than or equal to $(k!)^{2} n_{k} |A_{k}| + k! + n_{k}^{2}$, and then $A_{k + 1}$ is chosen to be the set of words $w$ of length $n_{k + 1}$ which are concatenations of $A_{k}$-words and the word 1 with the following properties: the word 1 does not appear at the beginning or end of $w$, only a single 1 can be concatenated between two $A_{k}$-words, and for every $w^{'} \in A_{k}$, and for every $0 \leq i < k!$, $w^{'}$ appears in $w$ at an $i$ (mod $k!$)-indexed place. That is, there exists $m \equiv i$ (mod $k!$) with $w(m) w(m + 1) \ldots w(m + n_{k} - 1) = w^{'}$. From now on, to refer to this last condition, we say that every $w^{'} \in A_{k}$ occurs in $w$ at places indexed by all residue classes modulo $k!$.

Since this construction is a bit complicated, a few quick examples may be in order. Suppose that $n_{2} = 6$, $|A_{2}| = 4$, and we choose $n_{3} = 134$. Say that $A_{2} = \{a, b, c, d\}$. Then $w = abcd1abcd1dabcdabcabcdab$ is an $A_{3}$-word: each $A_{2}$-word appears at least once beginning with a letter of $w$ with an odd index, and at least once beginning with a letter of $w$ with an even index. Examples of words which

would not be $A_{3}$-words include $abcd11abcddabcdababcabcd$ (1 is concatenated twice between $d$ and $a$), $abcda1bcd1dbcdaabcbbbbdc$ (occurrences of the word $a$ begin only with even-indexed letters), or $abcd1dcbabcdabcdabcdadb$ (wrong number of letters).
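The conditions of Construction 2 can be checked mechanically on examples of this kind. The sketch below is an illustration under stated assumptions: the function name `violations` and its labels are introduced here, each letter $a, b, c, d$ is treated as an opaque $A_{2}$-word of length 6, and only occurrences as concatenated blocks are counted (a sufficient, not necessary, way to meet the residue condition). Indices are 1-based, as in the text.

```python
def violations(tokens, word_len, names, modulus, target_len):
    """Return the list of Construction 2 conditions violated by a token string.

    tokens: string over `names` plus the spacer "1"; each name stands for one
    A_k-word of length word_len, and "1" stands for the one-letter word 1.
    """
    found = []
    length = sum(1 if t == "1" else word_len for t in tokens)
    if length != target_len:
        found.append("length")
    if tokens.startswith("1") or tokens.endswith("1"):
        found.append("boundary 1")
    if "11" in tokens:
        found.append("double 1")
    residues = {name: set() for name in names}
    pos = 1  # 1-based index of the first letter of the current block
    for t in tokens:
        if t == "1":
            pos += 1
        else:
            residues[t].add(pos % modulus)
            pos += word_len
    if any(len(r) < modulus for r in residues.values()):
        found.append("residues")
    return found


# k = 2, so the modulus is k! = 2; n_2 = 6 and n_3 = 134 as in the example.
print(violations("abcd1abcd1dabcdabcabcdab", 6, "abcd", 2, 134))
print(violations("abcd11abcddabcdababcabcd", 6, "abcd", 2, 134))
print(violations("abcd1dcbabcdabcdabcdadb", 6, "abcd", 2, 134))
```

The first call reports no violations, while the other two flag the doubled 1 and the wrong length respectively.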

We can then define $X$ to again be the set of all $x \in Ω$ which are limits of shifted

$A_{k}$ -words. For this definition to make sense, it suffices to show that $A_{k}$ is nonempty

for all $k$ , since if this is true then $X \neq \emptyset$ by compactness of $Ω$ and a diagonalization

argument just as in Construction 1. $A_{1}$ is nonempty, so it suffices to show that $A_{k} \neq \emptyset$

implies $A_{k + 1} \neq \emptyset$ for all $k$ . For any $k$ , assume that $w_{k} \in A_{k}$ . Then, enumerate the

elements of $A_{k}$ by $w_{k} = a_{1}, a_{2}, \ldots, a_{|A_{k}|}$, and define the words $u_{k + 1} = a_{1}^{k!} a_{2}^{k!} \ldots a_{|A_{k}|}^{k!}$

and $w^{'} = (u_{k + 1} 1)^{k!} (a_{1} 1)^{i} a_{1}^{(n_{k + 1} - k! (k! n_{k} |A_{k}| + 1) - i (n_{k} + 1)) / n_{k}}$, where $0 \leq i < n_{k}$ and $i \equiv n_{k + 1} - k!$ (mod $n_{k}$). Since

$n_{k + 1} \geq (k!)^{2} n_{k} |A_{k}| + k! + n_{k}^{2}$, $w^{'}$ exists, and is a concatenation of $A_{k}$-words and the

word 1 with length $n_{k + 1}$ . In $w^{'}$ , at most a single 1 is concatenated between any two

$A_{k}$ -words and 1 does not appear at the beginning or end of $w^{'}$ . Also, since the length

of $u_{k + 1}$ is divisible by $k!$ , and since all $A_{k}$ -words are subwords of $u_{k + 1}$ , all $A_{k}$ words

appear in $(u_{k + 1} 1)^{k!}$ at places indexed by all residue classes modulo $k!$ , and so all

$A_{k}$ words appear in $w^{'}$ at places indexed by all residue classes modulo $k!$ as well.

Therefore, $w^{'} \in A_{k + 1}$ , and so $A_{k + 1}$ is nonempty.
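As a check on the exponents, the length of the witness word $w^{'}$ can be tallied directly. In the sketch below, $e$ is a label introduced here (not in the text) for the exponent of the final power of $a_{1}$, and the numerical values are those of the example given earlier in this section ($n_{2} = 6$, $|A_{2}| = 4$, $n_{3} = 134$).

```latex
\begin{align*}
|u_{k+1}| &= k!\, n_k |A_k|, \\
|w'| &= k!\,\bigl(k!\, n_k |A_k| + 1\bigr) + i\,(n_k + 1) + e\, n_k, \\
|w'| = n_{k+1} &\iff e = \frac{n_{k+1} - k!\,\bigl(k!\, n_k |A_k| + 1\bigr) - i\,(n_k + 1)}{n_k}.
\end{align*}
% Example with k = 2, n_2 = 6, |A_2| = 4, n_3 = 134:
%   |u_3| = 2 * 6 * 4 = 48, and (u_3 1)^2 has length 2 * 49 = 98;
%   i = (134 - 2) mod 6 = 0, and e = (134 - 98 - 0) / 6 = 6;
%   so w' = (u_3 1)^2 a_1^6 has length 98 + 36 = 134.
```

The exponent $e$ is an integer because $n_{k + 1} - k! - i \equiv 0$ (mod $n_{k}$), and it is at least 1 because $i (n_{k} + 1) \leq n_{k}^{2} - 1$ when $0 \leq i < n_{k}$. In this example 134 is also the smallest admissible value of $n_{3}$, since $(2!)^{2} \cdot 6 \cdot 4 + 2! + 6^{2} = 134$.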

We will show that $(X, σ)$ is totally minimal. Fix any $m > 0$ . We wish to show that

for any $y \in X$, $\overline{\{σ^{m n} y\}_{n \in Z}} = X$. Choose any such $y$, and fix any word $w \in L(X)$.

By definition of $X$ , there exists $k$ and an $A_{k}$ word $w_{k}$ such that $w$ is a subword of

$w_{k}$ . Without loss of generality, we assume that $k > m$ . By the construction, $w_{k}$

occurs in every $A_{k + 1}$ -word, and it occurs at places indexed by every residue class

modulo $k!$ . Since $k > m$ , in particular this implies that $w_{k}$ , and therefore $w$ , occurs

in every $A_{k + 1}$ -word at places indexed by every residue class modulo $m$ . By definition

of Construction 2, $y$ is a biinfinite concatenation of $A_{k + 1}$ -words and the word 1.

Therefore, $y (1)$ . . . $y (2 n_{k + 1} + 2)$ contains some $A_{k + 1}$ -word as a subword, and so $w$

occurs in $y (1)$ . . . $y (2 n_{k + 1} + 2)$ at places indexed by every residue class modulo $m$ .

There then exists $n$ so that $σ^{m n} y$ begins with $w$ . Since $w$ was an arbitrary word of

$L(X)$, this shows that $\overline{\{σ^{m n} y\}_{n \in Z}} = X$, and since $m$ was arbitrary, that $(X, σ)$ is

totally minimal.

$□$

We now define one more general type of construction, again more complex than the

last, so that the system created is always totally uniquely ergodic and topologically

mixing, in addition to being totally minimal. For this last construction, we first need

a couple of definitions.

Definition 2.2.6. For any integers $0 \leq i < m$ and $k$, and $w \in A_{k - 1}$ and $w^{'} \in A_{k}$, we define $fr_{i, m}^{*}(w, w^{'})$ to be the ratio of the number of occurrences of $w$ as a concatenated $A_{k - 1}$-word at $i$ (mod $m$)-indexed places in $w^{'}$ to the total number of $A_{k - 1}$-words concatenated in $w^{'}$.

We consider any positive integer to be equal to 0 (mod 1) for the purposes of this definition. An example is clearly in order: if $A_{1} = \{01, 10\}$, $w = 01$ (in Constructions 2 and 3, $A_{1}$ is always taken to be $\{0, 1\}$, but here we deviate from this for illustrative purposes), and $w^{'}$ is the $A_{2}$-word $01|10|1|01|10$ (here vertical bars show where breaks in the concatenation occur), then $w$ occurs twice out of four $A_{1}$-words, so $fr_{0, 1}^{*}(w, w^{'}) = \frac{1}{2}$. Since one of these occurrences begins at $w^{'}(1)$ and one begins at $w^{'}(6)$, $fr_{0, 2}^{*}(w, w^{'}) = fr_{1, 2}^{*}(w, w^{'}) = \frac{1}{4}$. We make a quick note here that there could be some ambiguity if an $A_{k + 1}$-word could be decomposed as a concatenation of $A_{k}$-words and ones in more than one way. For this reason, we just assume that when computing $fr_{i, j}^{*}(w, w^{'})$, the definition of the $A_{k + 1}$-word $w^{'}$ includes its representation as a concatenation of $A_{k}$-words and ones. (I.e., in the example given, $w^{'}$ is defined as the concatenation $01|10|1|01|10$ of $A_{1}$-words and ones, rather than the nine-letter word 011010110.)

Definition 2.2.7. Given any words $w^{'}$ of length $n^{'}$ and $w$ of length $n \leq n^{'}$, and any integers $0 \leq i < m$, define $fr_{i, m}(w, w^{'})$ to be the number of occurrences of $w$ at $i$ (mod $m$)-indexed places in $w^{'}$, divided by $n^{'} - n + 1$.

Taking the previous example again, $f r_{0, 1} (w, w^{'}) = \frac{3}{8}$ , since 01 occurs three times as

a subword of 011010110. Since two of these occurrences begin at letters of $w^{'}$ with

even indices and one begins at a letter of $w^{'}$ with odd index, $f r_{0, 2} (w, w^{'}) = \frac{2}{8}$ and

$f r_{1, 2} (w, w^{'}) = \frac{1}{8}$ .
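Both frequency notions can be computed for the running example. The sketch below is an illustration: the names `fr`, `fr_star`, `blocks` and `level_words` are introduced here, indices are 1-based as in the text, and the decomposition is passed explicitly as a list of blocks, with any block not in `level_words` treated as a spacer 1 (so the sketch assumes the spacer is not itself one of the level words, as in this example).

```python
from fractions import Fraction


def fr(w, w_prime, i, m):
    """fr_{i,m}(w, w'): occurrences of w at i (mod m)-indexed places in w',
    divided by n' - n + 1."""
    n, n_prime = len(w), len(w_prime)
    starts = [s for s in range(1, n_prime - n + 2) if w_prime[s - 1:s - 1 + n] == w]
    return Fraction(sum(1 for s in starts if s % m == i), n_prime - n + 1)


def fr_star(w, blocks, level_words, i, m):
    """fr*_{i,m}(w, w'): occurrences of w as a concatenated block at
    i (mod m)-indexed places, divided by the number of concatenated level words."""
    hits, total, pos = 0, 0, 1
    for block in blocks:
        if block in level_words:
            total += 1
            if block == w and pos % m == i:
                hits += 1
        pos += len(block)
    return Fraction(hits, total)


blocks = ["01", "10", "1", "01", "10"]  # the decomposition 01|10|1|01|10
level_words = {"01", "10"}
flat = "".join(blocks)                  # the nine-letter word 011010110

print(fr_star("01", blocks, level_words, 0, 1), fr_star("01", blocks, level_words, 0, 2))
print(fr("01", flat, 0, 1), fr("01", flat, 0, 2), fr("01", flat, 1, 2))
```

Exact rational arithmetic via `Fraction` keeps the values comparable with the fractions quoted in the text; note that `Fraction(2, 8)` is stored in lowest terms as 1/4.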

Construction 3: (Totally minimal, totally uniquely ergodic, and topologically mixing) We define inductively $n_{k}$ and $A_{k}$, which are, respectively, sequences of positive integers and sets of words on the alphabet $\{0, 1\}$. Each word in $A_{k}$ is of length $n_{k}$. We define these as follows: always define $n_{1} = 1$ and $A_{1} = \{0, 1\}$. Then, we fix any sequence $\{d_{k}\}$ of positive reals such that $\sum_{k = 1}^{\infty} d_{k} < \infty$ and define, for each $k \geq 1$, some $n_{k + 1} = C_{k} (k + 1)! |A_{k}| n_{k} + p$ for any integer $C_{k} > n_{k} > \frac{1}{d_{k}}$ and prime $n_{k} < p \leq 2 n_{k}$. (We may choose such a $p$ by Bertrand's postulate [HaW].) Note that this implies that $(n_{k}, k!) = 1$ for all $k \in N$. We then define $A_{k + 1}$ to be the set of words $w^{'}$ of length $n_{k + 1}$ with all of the same properties as in Construction 2, along with the property that, for any $w \in A_{k}$, and for any $0 \leq i < k!$, $fr_{i, k!}^{*}(w, w^{'}) \in [\frac{1 - d_{k}}{k! |A_{k}|}, \frac{1 + d_{k}}{k! |A_{k}|}]$.
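The coprimality remark in Construction 3 can be unpacked as follows. This is a sketch of one way to see it, not a quotation from the text, and it relies on the extra assumption $p > k + 1$, which is automatic once $n_{k} \geq k + 1$ because $p > n_{k}$.

```latex
\begin{align*}
n_{k+1} &= C_k\,(k+1)!\,|A_k|\,n_k + p \equiv p \pmod{(k+1)!}, \\
\gcd\bigl(n_{k+1}, (k+1)!\bigr) &= \gcd\bigl(p, (k+1)!\bigr) = 1 \quad \text{if } p > k + 1,
\end{align*}
% Every prime factor of (k+1)! is at most k+1, and p is a prime larger than that,
% so p shares no prime factor with (k+1)!.
```

For $k = 1$ the constraint $n_{1} < p \leq 2 n_{1}$ with $n_{1} = 1$ forces $p = 2$, so this smallest case is not covered by the sketch and would need to be handled separately.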

$X$ is again taken to be the set of $x \in Ω$ which are limits of shifted $A_{k}$-words. For

this definition to make sense, it suffices to show that $A_{k}$ is nonempty for all $k$ , since

if this is true then $X \neq \emptyset$ by compactness of $Ω$ and a diagonalization argument

just as in Construction 1. $A_{1}$ is nonempty, so it suffices to show that $A_{k} \neq \emptyset$

implies $A_{k + 1} \neq \emptyset$ for all $k$ . For any $k$ , assume that $w_{k} \in A_{k}$ . Then, enumerate the

elements of $A_{k}$ by $w_{k} = a_{1}, a_{2}, \ldots, a_{|A_{k}|}$, and define the words $u_{k + 1} = a_{1}^{k!} a_{2}^{k!} \ldots a_{|A_{k}|}^{k!}$

and $w^{'} = (u_{k + 1})^{C_{k} (k + 1) - p} (u_{k + 1} 1)^{p}$. Clearly for large $k$, $w^{'}$ exists, and is a concatenation

of $A_{k}$ -words and the word 1 with length $n_{k + 1}$ . In $w^{'}$ , at most a single 1 is concatenated

between any two $A_{k}$ -words and 1 does not appear at the beginning or end of $w^{'}$ . Since

$(n_{k}, k!) = 1$ , for every $0 \underline{<} i < k!$ and $x \in A_{k}$ , $x$ appears in $u_{k + 1}$ exactly once as a

concatenated $A_{k}$ -word at an $i$ (mod $k!$ )-indexed place. Therefore, $x$ appears in $w^{'}$

exactly $C_{k} (k + 1)$ times as a concatenated $A_{k}$ -word at $i$ (mod $k!$ )-indexed places, and

so $fr_{i, k!}^{*}(x, w^{'}) = \frac{1}{k! |A_{k}|}$, as $w^{'}$ contains $C_{k} (k + 1) k! |A_{k}|$ concatenated $A_{k}$-words in total. Since $i$ and $x$ were arbitrary, $w^{'} \in A_{k + 1}$, and so $A_{k + 1} \neq \emptyset$.

$(X, σ)$ is totally minimal by the same proof used for Construction 2. We claim that

$(X, σ)$ is also totally uniquely ergodic. Take any word $w \in L(X)$, and any fixed integer $j$. We define two sequences $\{m_{k}^{(j)}\}$ and $\{M_{k}^{(j)}\}$ as follows: $m_{k}^{(j)}$ is the minimum value of $fr_{i, j}(w, w^{'})$, where $0 \leq i < j$ and $w^{'}$ ranges over all $A_{k}$-words, and $M_{k}^{(j)}$ is the maximum value of $fr_{i, j}(w, w^{'})$, where $0 \leq i < j$ and $w^{'}$ ranges over all $A_{k}$-words.

Suppose that $m_{k}^{(j)}$ and $M_{k}^{(j)}$ are known, and that $k > j$. We wish to show that $m_{k + 1}^{(j)}$

and $M_{k + 1}^{(j)}$ are very close to each other. Let us consider any element $w^{'}$ of $A_{k + 1}$ and, for any fixed $0 \leq i < j$, see how few occurrences of $w$ there could possibly be at $i$ (mod $j$)-indexed places in $w^{'}$. By the definition of Construction 3, for every $w^{''} \in A_{k}$ and $0 \leq i^{'} < k!$, the ratio of the number of times $w^{''}$ occurs as a concatenated $A_{k}$-word in $w^{'}$ whose first letter is a letter of $w^{'}$ whose index is equal to $i^{'}$ (mod $k!$) to the total number of $A_{k}$-words concatenated in $w^{'}$ is at least $\frac{1 - d_{k}}{k! |A_{k}|}$. Since $j$ divides $k!$, for any $0 \leq i^{'} < j$, the ratio of the number of times that $w^{''}$ occurs as a concatenated $A_{k}$-word at $i^{'}$ (mod $j$)-indexed places in $w^{'}$ to the total number of $A_{k}$-words concatenated in $w^{'}$ is at least $\frac{1 - d_{k}}{j |A_{k}|}$. Since the total number of $A_{k}$-words concatenated in $w^{'}$ is at least $\frac{n_{k + 1}}{n_{k} + 1}$, this implies that the number of such occurrences of $w^{''}$ in $w^{'}$ is at least $\frac{1 - d_{k}}{j |A_{k}|} \frac{n_{k + 1}}{n_{k} + 1}$ for any $i^{'}$ and $w^{''}$. For any $w^{''}$ and $i^{'}$, the number of times that $w$ occurs at $i$ (mod $j$)-indexed places in $w^{'}$ as a subword of an occurrence of $w^{''}$ that occurs at an $i^{'}$ (mod $j$)-indexed place in $w^{'}$ is then at least $\frac{1 - d_{k}}{j |A_{k}|} \frac{n_{k + 1}}{n_{k} + 1} (n_{k} - |w| + 1) fr_{(i - i^{'}) \bmod j, \, j}(w, w^{''})$. Summing over all $w^{''} \in A_{k}$ and $0 \leq i^{'} < j$, the number of occurrences of $w$ in $w^{'}$ at $i$ (mod $j$)-indexed places is at least

$(1 - d_{k}) n_{k + 1} \frac{n_{k} - |w| + 1}{n_{k} + 1} \frac{\sum_{w^{''} \in A_{k}} \sum_{m = 0}^{j - 1} fr_{m, j}(w, w^{''})}{j |A_{k}|}$.

Since $w^{'}$ was arbitrary in $A_{k + 1}$ and $0 \leq i < j$ was arbitrary,

$m_{k + 1}^{(j)} \geq (1 - d_{k}) \frac{n_{k + 1}}{n_{k + 1} - |w| + 1} \frac{n_{k} - |w| + 1}{n_{k} + 1} \frac{\sum_{w^{''} \in A_{k}} \sum_{m = 0}^{j - 1} fr_{m, j}(w, w^{''})}{j |A_{k}|}$.

Let us now bound from above the number of occurrences of $w$ in $w^{'}$ at $i$ (mod $j$)-indexed places. By precisely the same reasoning as above, for any $0 \leq i < j$, the number of occurrences of $w$ at $i$ (mod $j$)-indexed places which lie entirely within a concatenated $A_{k}$-word in $w^{'}$ is not more than

$(1 + d_{k}) n_{k + 1} \frac{n_{k} - |w| + 1}{n_{k}} \frac{\sum_{w^{''} \in A_{k}} \sum_{m = 0}^{j - 1} fr_{m, j}(w, w^{''})}{j |A_{k}|}$.

(The denominator of the first fraction changed because there are at most $\frac{n_{k + 1}}{n_{k}}$ $A_{k}$-words concatenated in $w^{'}$.) However, it is possible that there are occurrences of $w$ in $w^{'}$ which do not lie entirely within a concatenated $A_{k}$-word in $w^{'}$. The number of such occurrences of $w$ is not more than $|w| + 1$ times the number of concatenated $A_{k}$-words in $w^{'}$, which in turn is less than or equal to $(|w| + 1) \frac{n_{k + 1}}{n_{k}}$. This means that the number of occurrences of $w$ at $i$ (mod $j$)-indexed places in $w^{'}$ is bounded from above by

$(1 + d_{k}) n_{k + 1} \frac{n_{k} - |w| + 1}{n_{k}} \frac{\sum_{w^{''} \in A_{k}} \sum_{m = 0}^{j - 1} fr_{m, j}(w, w^{''})}{j |A_{k}|} + (|w| + 1) \frac{n_{k + 1}}{n_{k}}$,

and since $0 \leq i < j$ was arbitrary, this implies that

$M_{k + 1}^{(j)} \leq (1 + d_{k}) \frac{n_{k + 1}}{n_{k + 1} - |w| + 1} \frac{n_{k} - |w| + 1}{n_{k}} \frac{\sum_{w^{''} \in A_{k}} \sum_{m = 0}^{j - 1} fr_{m, j}(w, w^{''})}{j |A_{k}|} + \frac{n_{k + 1}}{n_{k + 1} - |w| + 1} \frac{|w| + 1}{n_{k}}$.

This implies that

$M_{k + 1}^{(j)} - m_{k + 1}^{(j)} \leq 2 d_{k} \frac{n_{k + 1}}{n_{k + 1} - |w| + 1} \frac{n_{k} - |w| + 1}{n_{k}} \frac{\sum_{w^{''} \in A_{k}} \sum_{m = 0}^{j - 1} fr_{m, j}(w, w^{''})}{j |A_{k}|} + \frac{n_{k + 1}}{n_{k + 1} - |w| + 1} \frac{|w| + 1}{n_{k}}$.

Since $fr_{m, j}(w, w^{''}) \leq 1$ for every $0 \leq m < j$ and $w^{''} \in A_{k}$, for large $k$ this shows that

$M_{k + 1}^{(j)} - m_{k + 1}^{(j)} \leq 2 d_{k} + \frac{2 (|w| + 1)}{n_{k}}$, which clearly approaches zero as $k \to \infty$.

We now note that, by the lower bound on $m_{k + 1}^{(j)}$ obtained above, and since by definition $fr_{m, j}(w, w^{''}) \geq m_{k}^{(j)}$ for all $0 \leq m < j$ and $w^{''} \in A_{k}$, we see that $m_{k + 1}^{(j)} \geq (1 - d_{k}) \frac{n_{k + 1}}{n_{k + 1} - |w| + 1} \frac{n_{k} - |w| + 1}{n_{k} + 1} m_{k}^{(j)}$, and so

$m_{k + 1}^{(j)} - m_{k}^{(j)} \geq m_{k}^{(j)} [(1 - d_{k}) (1 + \frac{|w| - 1}{n_{k + 1} - |w| + 1}) (1 - \frac{|w|}{n_{k} + 1}) - 1]$

$\geq - m_{k}^{(j)} (d_{k} + \frac{|w|}{n_{k} + 1}) \geq - (d_{k} + \frac{|w|}{n_{k}})$.

By almost completely analogous reasoning, for large $k$,

$M_{k + 1}^{(j)} - M_{k}^{(j)} \leq M_{k}^{(j)} [(1 + d_{k}) (1 + \frac{|w| - 1}{n_{k + 1} - |w| + 1}) (1 - \frac{|w| - 1}{n_{k}}) - 1]$

$+ \frac{n_{k + 1}}{n_{k + 1} - |w| + 1} \frac{|w| + 1}{n_{k}} \leq M_{k}^{(j)} d_{k} + \frac{2 (|w| + 1)}{n_{k}} \leq d_{k} + \frac{2 |w| + 1}{n_{k}}$.

Therefore,

$m_{k + 1}^{(j)} \leq M_{k + 1}^{(j)} \leq M_{k}^{(j)} + d_{k} + \frac{2 |w| + 1}{n_{k}} \leq m_{k}^{(j)} + 2 d_{k - 1} + \frac{2 (|w| + 1)}{n_{k - 1}} + d_{k} + \frac{2 |w| + 1}{n_{k}}$

$\leq m_{k}^{(j)} + 2 d_{k - 1} + d_{k} + \frac{4 |w| + 3}{n_{k - 1}}$,

so $|m_{k + 1}^{(j)} - m_{k}^{(j)}| \leq d_{k} + 2 d_{k - 1} + \frac{4 |w| + 3}{n_{k - 1}}$. In a completely analogous fashion, $|M_{k + 1}^{(j)} - M_{k}^{(j)}| \leq d_{k} + 2 d_{k - 1} + \frac{3 |w| + 2}{n_{k - 1}}$. We know that $\sum_{k = 1}^{\infty} d_{k}$ converges, and since $n_{k} \geq 2^{k}$ for all $k$, $\sum_{k = 1}^{\infty} \frac{1}{n_{k}}$ converges as well. Therefore, we see that the sequences $\{m_{k}^{(j)}\}$ and $\{M_{k}^{(j)}\}$ are Cauchy, and converge. Since we also showed that $M_{k}^{(j)} - m_{k}^{(j)} \to 0$, we know that they have the same limit, call it $α$.

This implies that for very large $k$, $|fr_{i, j}(w, w^{'}) - α|$ is very small for every $0 \leq i < j$ and $w^{'} \in A_{k}$. We claim that this, in turn, implies that for very large $N$, $|fr_{i, j}(w, w^{'}) - α|$ is very small for every word $w^{'} \in L(X)$ of length $N$: fix any $ε > 0$, and take $k$ such that $|fr_{i, j}(w, w^{'}) - α| < \frac{ε}{2}$ for every $0 \leq i < j$ and $w^{'} \in A_{k}$, and such that $\frac{1 + |w|}{n_{k}} < \frac{ε}{4}$. Then for any word $w^{'} \in L(X)$ of length at least $\frac{8 n_{k}}{ε}$, $w^{'}$ is a subword of a concatenation of $A_{k}$-words and copies of the word 1. The number of full $A_{k}$-words appearing in the concatenation forming $w^{'}$ is at least $\frac{|w^{'}|}{n_{k} + 1} - 2$, and at most $\frac{|w^{'}|}{n_{k}}$. So, the number of occurrences of $w$ at $i$ (mod $j$)-indexed places in $w^{'}$ which are contained entirely within a concatenated $A_{k}$-word is at least $(\frac{(n_{k} - |w| + 1) |w^{'}|}{n_{k} + 1} - 2 (n_{k} - |w| + 1)) (α - \frac{ε}{2}) \geq |w^{'}| ((1 - \frac{ε}{4}) - \frac{ε}{4}) (α - \frac{ε}{2}) \geq |w^{'}| (α - ε)$, and at most $|w^{'}| \frac{n_{k} - |w| + 1}{n_{k}} (α + \frac{ε}{2}) \leq |w^{'}| (α + \frac{ε}{2})$. Since there are at most $\frac{(|w| + 1) |w^{'}|}{n_{k}} < |w^{'}| \frac{ε}{4}$ occurrences of $w$ not contained entirely within a concatenated $A_{k}$-word, this implies that $fr_{i, j}(w, w^{'})$ is at least $α - ε$, and at most $α + ε$. Since for any $ε > 0$, this statement is true for any long enough word $w^{'} \in L(X)$ and $0 \leq i < j$, we see that $\frac{1}{n} \sum_{i = 0}^{n - 1} χ_{[w]}(σ^{i j} y) \to α$ uniformly for $y \in X$.

Since $w$ was arbitrary, and since characteristic functions of cylinder sets are dense in

$C (X)$ , $\frac{1}{n} \sum_{i = 0}^{n - 1} f (σ^{i j} y)$ approaches a uniform limit for all $f \in C (X)$ , and so $(X, σ^{j})$

is uniquely ergodic for every $j \in$ N. Since an invariant measure for $(X, σ)$ would be

invariant for any $(X, σ^{j})$ as well, the unique invariant measure is the same for every

$j$ .

Finally, we claim that $(X, σ)$ is also topologically mixing. Consider any words $w$ , $w^{'} \in$

$L (X)$ . By construction, there exists $k$ so that there are $A_{k}$ words $y$ , $y^{'}$ with $w$ a

subword of $y$ and $w^{'}$ a subword of $y^{'}$ . We also claim that for any $\frac{n_{k + 1}}{6} < i < \frac{5 n_{k + 1}}{6}$ , there

exists an $A_{k + 1}$ word $b_{i}$ where $b_{i} (i + 1) b_{i} (i + 2)$ . . . $b_{i} (i + n_{k}) = y$ , and similarly $b_{i}^{'} \in A_{k + 1}$

with $b_{i}^{'} (i + 1) b_{i}^{'} (i + 2)$ . . . $b_{i}^{'} (i + n_{k}) = y^{'}$ . We show only the existence of $b_{i}$ , as the proof for

$b_{i}^{'}$ is trivially similar. Consider any $\frac{n_{k + 1}}{6} < i < \frac{5 n_{k + 1}}{6}$ , and take $j = i$ (mod $n_{k}$ ). Then,

if we enumerate the elements of $A_{k}$ by $a_{1}, a_{2}, \ldots, a_{|A_{k}|}$, first define the word $u_{k + 1} =$

$a_{1}^{k!} a_{2}^{k!} \ldots a_{|A_{k}|}^{k!}$, and then define the word $y_{i} = (u_{k + 1} 1)^{j} (u_{k + 1})^{C_{k} (k + 1) - p} (u_{k + 1} 1)^{p - j}$. $y_{i}$

has the property that $y_{i} (i + 1) y_{i} (i + 2)$ . . . $y_{i} (i + n_{k})$ is a concatenated $A_{k}$ -word in $y_{i}$ as

long as $y_{i} (i + 1) y_{i} (i + 2)$ . . . $y_{i} (i + n_{k})$ lies in the subword $(u_{k + 1})^{C_{k} (k + 1) - p}$ of $y_{i}$ , which is

true for large $k$ and $i \in (\frac{n_{k + 1}}{6}, \frac{5 n_{k + 1}}{6})$ . This means that if we reorder $a_{1}$ , $...$ , $a_{| A_{k} |}$ in the

definition of $u_{k + 1}$ , we may create a word $b_{i}$ where $b_{i} (i + 1) b_{i} (i + 2)$ . . . $b_{i} (i + n_{k}) = y$ .

$b_{i} \in A_{k + 1}$, since for every $0 \leq r < k!$ and $x \in A_{k}$, $x$ appears exactly $C_{k} (k + 1)$

times in $b_{i}$ as a concatenated $A_{k}$-word at $r$ (mod $k!$)-indexed places, implying that

$fr_{r, k!}^{*}(x, b_{i}) = \frac{1}{k! |A_{k}|}$. (This uses the fact that $(n_{k}, k!) = 1$, which was already shown.)

We create $b_{i}^{'}$ in the same way for each $i$ . Since $w$ is a subword of $y$ and $w^{'}$ is a subword

of $y^{'}$ , for every $\frac{n_{k + 1}}{6} + n_{k} \underline{<} i \underline{<} \frac{5 n_{k + 1}}{6} - n_{k}$ , it is easy to choose a word $z_{i}$ to be $b_{j}$

for properly chosen $j$ so that $z_{i} (i + 1)$ . . . $z_{i} (i + | w |) = w$ , and similarly $z_{i}^{'}$ so that

$z_{i}^{'} (i + 1)$ . . . $z_{i}^{'} (i + | w^{'} |) = w^{'}$ . For large $k$ , this means that we can construct such $z_{i}$

and $z_{i}^{'}$ for any $i \in [\frac{n_{k + 1}}{5}, \frac{4 n_{k + 1}}{5}]$ .

We will now use these $z_{i}$ and $z_{i}^{'}$ to prove that for any $n > |w| + n_{k + 1}$, there exists a word $x \in L(X)$ of length $n$ such that $w x w^{'} \in L(X)$. We do this by proving a lemma:

Lemma 2.2.8. For any $t > k + 1$, and for any $0 \leq i, j < n_{t}$ such that there exists an $A_{t}$-word $x$ where $x(i + 1) x(i + 2) \ldots x(i + n_{k + 1})$ and $x(j + 1) x(j + 2) \ldots x(j + n_{k + 1})$ are concatenated $A_{k + 1}$-words in $x$, and for any two $A_{k + 1}$-words $z$ and $z^{'}$, there exists an $A_{t}$-word $x^{'}$ where $x^{'}(i + 1) x^{'}(i + 2) \ldots x^{'}(i + n_{k + 1}) = z$ and $x^{'}(j + 1) x^{'}(j + 2) \ldots x^{'}(j + n_{k + 1}) = z^{'}$.

Proof: We prove this by induction. First we prove the base case $t = k + 2$; take an $A_{k + 2}$-word $x$ where $x(i + 1) x(i + 2) \ldots x(i + n_{k + 1})$ and $x(j + 1) x(j + 2) \ldots x(j + n_{k + 1})$ are concatenated $A_{k + 1}$-words in $x$, call them $a$ and $b$ respectively. Since $x$ is an $A_{k + 2}$-word, there exists an occurrence of $z$ at an $i$ (mod $(k + 1)!$)-indexed place, i.e. there exists $i^{'} \equiv i$ (mod $(k + 1)!$) such that $x(i^{'} + 1) x(i^{'} + 2) \ldots x(i^{'} + n_{k + 1}) = z$. Similarly, there exists $j^{'} \equiv j$ (mod $(k + 1)!$) such that $x(j^{'} + 1) x(j^{'} + 2) \ldots x(j^{'} + n_{k + 1}) = z^{'}$. We now create $x^{'}$ by leaving almost all of $x$ alone, but defining $x^{'}(i + 1) \ldots x^{'}(i + n_{k + 1}) = z$, $x^{'}(j + 1) \ldots x^{'}(j + n_{k + 1}) = z^{'}$, $x^{'}(i^{'} + 1) \ldots x^{'}(i^{'} + n_{k + 1}) = a$, and $x^{'}(j^{'} + 1) \ldots x^{'}(j^{'} + n_{k + 1}) = b$. This new word $x^{'}$ is still a concatenation of $A_{k + 1}$-words and ones, and since we switched two pairs of $A_{k + 1}$-words which occurred at indices with the same residue class modulo $(k + 1)!$, $fr_{i, (k + 1)!}^{*}(w, x) = fr_{i, (k + 1)!}^{*}(w, x^{'})$ for all $0 \leq i < (k + 1)!$ and $w \in A_{k + 1}$. Therefore, $x^{'}$ is an $A_{k + 2}$-word, with $z$ and $z^{'}$ occurring at the proper places, completing our proof of the base case.

Now, let us assume that the inductive hypothesis is true for a certain value of $t$, and prove it for $t + 1$. Consider an $A_{t + 1}$-word $x$ where $x(i + 1) \ldots x(i + n_{k + 1})$ and $x(j + 1) \ldots x(j + n_{k + 1})$ are concatenated $A_{k + 1}$-words in $x$, call them $a$ and $b$. Call the concatenated $A_{t}$-word that $x(i + 1) \ldots x(i + n_{k + 1})$ is a subword of $a^{'}$, and denote by $b^{'}$ the corresponding $A_{t}$-word for $x(j + 1) \ldots x(j + n_{k + 1})$. From now on, when we speak of these words $a$, $b$, $a^{'}$, $b^{'}$, we are talking about the pertinent occurrences at the places within $x$ already described. There are two cases: either $a^{'}$ and $b^{'}$ are the same, i.e. the same $A_{t}$-word in $x$, occurring at the same place, or they are not. If $a^{'}$ and $b^{'}$ do occur at the same place, then by the inductive hypothesis, there exists an $A_{t}$-word $c$ with an occurrence of $z$ at the same place as $a$ occurs in $a^{'} = b^{'}$, and an occurrence of $z^{'}$ at the same place as $b$ occurs in $a^{'} = b^{'}$. If we can replace $a^{'} = b^{'}$ by $c$ in $x$, then we will be done. If $a^{'}$ and $b^{'}$ do not occur at the same place, then since $a$ is a concatenated $A_{k + 1}$-word in $a^{'}$, by the inductive hypothesis there exists an $A_{t}$-word $a^{''}$ such that $a^{''}$ has $z$ occurring at the same place where $a$ occurs in $a^{'}$. Similarly, there exists an $A_{t}$-word $b^{''}$ such that $b^{''}$ has an occurrence of $z^{'}$ at the same place where $b$ occurs in $b^{'}$. If we replace $a^{'}$ by $a^{''}$ and $b^{'}$ by $b^{''}$ in $x$, then we will be done. So regardless of which case we are in, our goal is to replace one or two chosen $A_{t}$-words within $x$ with one or two other $A_{t}$-words. We will show how to replace two, which clearly implies that replacing one is possible. We wish to replace $a^{'}$ by $a^{''}$ and $b^{'}$ by $b^{''}$. We do this in exactly the same way as in the base case; say that $a^{'} = x(i^{'} + 1) \ldots x(i^{'} + n_{t})$ and $b^{'} = x(j^{'} + 1) \ldots x(j^{'} + n_{t})$. Since $a^{''}$ and $b^{''}$ are $A_{t}$-words and $x$ is an $A_{t + 1}$-word, there exist $i^{''} \equiv i^{'}$ (mod $t!$) and $j^{''} \equiv j^{'}$ (mod $t!$) such that $x(i^{''} + 1) \ldots x(i^{''} + n_{t}) = a^{''}$ and $x(j^{''} + 1) \ldots x(j^{''} + n_{t}) = b^{''}$. As in the base case, we create $x^{'}$ by making $x^{'}(i^{'} + 1) \ldots x^{'}(i^{'} + n_{t}) = a^{''}$ and $x^{'}(i^{''} + 1) \ldots x^{'}(i^{''} + n_{t}) = a^{'}$, $x^{'}(j^{'} + 1) \ldots x^{'}(j^{'} + n_{t}) = b^{''}$, and $x^{'}(j^{''} + 1) \ldots x^{'}(j^{''} + n_{t}) = b^{'}$. Then $x^{'}$ is an $A_{t + 1}$-word, and by construction $x^{'}(i + 1) \ldots x^{'}(i + n_{k + 1}) = z$ and $x^{'}(j + 1) \ldots x^{'}(j + n_{k + 1}) = z^{'}$.

$□$

Choose any sequence ${v_{m}}$ of $A_{m}$ -words for all $m > k + 1$ . For any such $m$ , take

$P_{m} =$ { $n$ : $v_{m} (n + 1)$ $...$ $v_{m} (n + n_{k + 1})$ is a concatenated $A_{k + 1}$ word in $v_{m}$ }. Since

$v_{m}$ is a concatenation of $A_{k + 1}$ -words and ones, if we write the elements of $P_{m}$ as

$p_{1}^{(m)} < p_{2}^{(m)} < ... < p_{t}^{(m)}$ , then for any $1 \underline{<} l$ $< t$ , $p_{l + 1}^{(m)} - p_{l}^{(m)} \underline{<}$ a $k + 1 + 1$ . For any

$1 < l$ $< t$ , and $i$ , $j \in [\frac{n_{k + 1}}{5}, \frac{4 n_{k + 1}}{5}]$ , by Lemma 2.2.8, there exists an $A_{m}$ word $v$ with the

property that $v (p_{1}^{(m)} + 1)$ $...$ $v (p_{1}^{(m)} + n_{k + 1}) = z_{i}$ and $v (p_{l}^{(m)} + 1)$ $...$ $v (p_{l}^{(m)} + n_{k + 1}) = z_{j}^{'}$ .

This implies that there is a subword of $v$ of the form rvxrv' where the length of $x$ is

$p_{l}^{(m)} - p_{1}^{(m)} + (j - i) - | w |$ . We note that $j - i$ can take any integer value between $- \frac{3 n_{k + 1}}{5}$

and $\frac{3 n_{k + 1}}{5}$ inclusive. Therefore, the set of possible lengths of $x$ for which rvxrv' $\in L (X)$

contains

$l = 2 \cup^{t} {(| w | + p_{l}^{(m)} - p_{1}^{(m)} + [- \frac{3 n_{k + 1}}{5}, \frac{3 n_{k + 1}}{5}])}$ .

When $l$ is increased by one, $p_{l}^{(m)}$ is increased by at most $n_{k + 1} + 1$ . This, along with

the fact that the intervals $[- \frac{3 n_{k + 1}}{5}, \frac{3 n_{k + 1}}{5}]$ have length $\frac{6 n_{k + 1}}{5}$ , which for large $k$ exceeds

$n_{k + 1} + 1$, implies that this set of possible lengths of $x$ contains

$[ | w | + p_{2}^{(m)} - p_{1}^{(m)} - \frac{3 n_{k + 1}}{5}, | w | + p_{t}^{(m)} - p_{1}^{(m)} + \frac{3 n_{k + 1}}{5} ] \supseteq [ | w | + n_{k + 1}, | w | + n_{m} - 2 n_{k + 1} ]$. Since this entire

argument could be made for any $m$ , we see that for any $n > | w | + n_{k + 1}$ , there exists

$x \in L (X)$ of length $n$ so that $w x w^{'} \in L (X)$. Then, for any nonempty open sets

$U$ , $V \underline{\subset} X$ , there exist $w$ and $w^{'}$ such that $[w] \underline{\subset} U$ and $[w^{'}] \underline{\subset} V$ . By the above

arguments, there exists $N$ so that for any $n > N$ , $[w] \cap σ^{n} [w^{'}] \neq \emptyset$ , implying that

$U \cap σ^{n} V \neq \emptyset$ . This shows that $(X, σ)$ is topologically mixing.

$□$
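The covering step in the argument above can be checked on a small case. The following Python sketch is an illustration only: the function name `covered_lengths`, the sample positions, and the value n = 10 are hypothetical, the constant offset $| w |$ is dropped, and $\frac{3 n}{5}$ is rounded down to an integer.

```python
def covered_lengths(positions, n):
    """Union over l >= 2 of the integer intervals (p_l - p_1) + [-h, h],
    where h = floor(3n/5).

    positions: increasing list p_1 < p_2 < ... < p_t (assumed sample data).
    """
    h = (3 * n) // 5
    base = positions[0]
    covered = set()
    for p in positions[1:]:
        center = p - base
        covered.update(range(center - h, center + h + 1))
    return covered


# Consecutive positions differ by at most n + 1 = 11, as in the proof.
n = 10
positions = [0, 10, 21, 31, 42, 53]
lengths = covered_lengths(positions, n)

# True when no integer is skipped between the two extreme endpoints.
is_full_interval = lengths == set(range(min(lengths), max(lengths) + 1))
```

Because each interval has $2 h + 1$ integer points and consecutive centers are at most $n + 1$ apart, neighboring intervals overlap or abut whenever $2 h \geq n$, which is the integer analogue of the inequality $\frac{6 n_{k + 1}}{5} > n_{k + 1} + 1$ used in the proof.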

2.3 Some symbolic counterexamples

Proof of Theorem 2.1.15: We take the continuous function $f (y) = y (0)$ for all

$y \in X$ , and first note that

$\{ y \in X : \frac{1}{N} \sum_{n = 0}^{N - 1} f (σ^{p_{n}} y) \text{ does not converge} \}$

$\supseteq ( \bigcap_{n > 0} \bigcup_{k > n} \{ y \in X : fr_{0, 1} (0, y (p_{1}) y (p_{2}) \ldots y (p_{k})) < \frac{1}{4} \} ) \cap$

$( \bigcap_{n > 0} \bigcup_{k > n} \{ y \in X : fr_{0, 1} (0, y (p_{1}) y (p_{2}) \ldots y (p_{k})) > \frac{3}{4} \} )$,

and that the latter set, call it $B$ , is clearly a $G_{δ}$ . We will choose $n_{k}$ so that $B$ is dense

in $X$ . This will imply that $B$ is a dense $G_{δ}$ , and since $X$ is a complete metric space,

by the Baire category theorem, that $B$ is residual, which will prove Theorem 2.1.15.

Now let us describe the construction of ${n_{k}}$ .

Recall that we have assumed that the sequence ${p_{n}}$ has upper Banach density zero.

We define $A : = {p_{n} : n \in N}$ . We also define the intervals of integers $B_{j} =$

$[2 j!, (j + 1)!] \cap N$ for every $j \in N$ , and take any partition of $N$ into infinitely many

disjoint infinite sets in $N$ , call them $C_{1}$ , $C_{2}$ , . . .. Define the set $D_{1} = \cup_{j \in C_{1}} B_{j}$ , and

then define the set $A_{1} = {p_{n} : n \in D_{1}} + 1$ . Next, choose some $r_{2}$ large enough so that

$(\min C_{r_{2}})! > 2 \cdot 2$, and define $D_{2} = \cup_{j \in C_{r_{2}}} B_{j}$, and then define $A_{2} = \{ p_{n} : n \in D_{2} \} + 2$.

Continuing in this way, we may inductively define $A_{k}$ , $D_{k}$ for all $k \in N$ so that for

all $k$, $D_{k} = \cup_{j \in C_{r_{k}}} B_{j}$ for some $r_{k}$ with the property that $(\min C_{r_{k}})! > 2 k$, and

$A_{k} = {p_{n} : n \in D_{k}} + k$ . We will verify some properties of these sets. Most

importantly, we denote by $H$ the union $\cup_{n = 1}^{\infty} A_{n}$ , and claim that $d^{*} (H) = 0$ . We

show this by noting that $H$ has a certain structure; $H$ consists of shifted subintervals

of $A$ , separated by gaps which approach infinity. More rigorously:

Lemma 2.3.1. There exist intervals $I_{k} = [a_{k}, b_{k}] \cap N$ and integers $j_{k}$ such that

$H = \cup_{k = 1}^{\infty} ((A \cap I_{k}) + j_{k})$ and such that $\lim_{k \to \infty} (\min ((A \cap I_{k + 1}) + j_{k + 1}) - \max ((A \cap$

$I_{k}) + j_{k})) = \infty$ .

Proof: Take the set $Q = \cup_{k = 1}^{\infty} C_{r_{k}}$ , and denote its members by $q_{1} < q_{2} <$ . . ..

Then, for any $k$, $B_{q_{k}}$ is a subset of some $D_{s}$. The interval $I_{k}$ is then defined to be

$[p_{2 (q_{k})!}, p_{(q_{k} + 1)!}]$ (which means $a_{k} = p_{2 (q_{k})!}$ and $b_{k} = p_{(q_{k} + 1)!}$), and $j_{k}$ is defined to be $s$.

It is just a rewriting of the definition of the $A_{k}$ that $H = \cup_{k = 1}^{\infty} ((A \cap I_{k}) + j_{k})$ with

these notations. All that must be checked is that $\lim_{k \to \infty} (\min ((A \cap I_{k + 1}) + j_{k + 1}) -$

$\max ((A \cap I_{k}) + j_{k})) = \infty$ . We will show that $a_{k + 1} + j_{k + 1} - b_{k} - j_{k} \to \infty$ , which

implies the desired result. Since $q_{k + 1} > q_{k}$ and $\{p_{n}\}$ is an increasing sequence of integers, $a_{k + 1} - b_{k} = p_{2 (q_{k + 1})!} - p_{(q_{k} + 1)!} \geq 2 (q_{k + 1})! - (q_{k} + 1)! \geq (q_{k + 1})! > (q_{k})!$. So, we must

simply show that $(q_{k})! - j_{k} \to \infty$. Suppose that $B_{q_{k}}$ is a subset of $D_{s}$. Then $j_{k} = s$.

We also note that by construction, $(\min C_{r_{s}})! > 2 s$. But, since $B_{q_{k}} \subset D_{s}$, $q_{k} \in C_{r_{s}}$,

and so $\min C_{r_{s}} \underline{<} q_{k}$ . Therefore, $(q_{k})! > 2 s$ , and so $(q_{k})! - j_{k} = (q_{k})! - s > \frac{(q_{k})!}{2}$ , which

clearly shows that this quantity approaches $\infty$ , since ${q_{k}}$ is an increasing sequence

of integers.

$□$

We now prove a general lemma that implies, in particular, that $d^{*} (H) = 0$ .

Lemma 2.3.2. If $d^{*} (A) = 0$ , and if there exist intervals $I_{k} = [a_{k}, b_{k}] \cap N$ and integers

$j_{k}$ such that $\lim_{k \to \infty}$ ( $\min$ $((A \cap I_{k + 1}) + j_{k + 1}) - \max ((A \cap I_{k}) + j_{k})) = \infty$ , then the

set $B = \cup_{k = 1}^{\infty} ((A \cap I_{k}) + j_{k})$ has upper Banach density zero.

Proof: Fix $ε > 0$ . By the fact that $d^{*} (A) = 0$ , there exists $N$ such that for any interval

$J$ of integers of length at least $N$ , $\frac{| A \cap J |}{| J |} < ε$ . Take $J$ to be any interval of integers of

length exactly $N$ . Since $\lim_{k \to \infty} (\min ((A \cap I_{k + 1}) + j_{k + 1}) - \max ((A \cap I_{k}) + j_{k})) = \infty$ ,

there is some $K$ such that if $J$ has nonempty intersection with $(A \cap I_{k}) + j_{k}$ for some

$k > K$ , it is disjoint from $(A \cap I_{k^{'}}) + j_{k^{'}}$ for every $k^{'} \neq k$ . Therefore, for intervals

$J$ of integers of length $N$ with large enough minimum element, $J \cap B$ consists of a

subset of a shifted copy of $J \cap A$ , and so $\frac{| B \cap J |}{| J |} \underline{<} \frac{| A \cap J^{'} |}{| J |}$ , for some interval $J^{'}$ of integers

whose length is also $N$ . This means that in this case, $\frac{| B \cap J |}{| J |} < ε$ . We have then shown

that for every $ε$ , there exist $N$ , $M$ such that for any interval of integers $J$ of length

$N$ with $\min J > M$ , $\frac{| B \cap J |}{| J |} < ε$ . We will show that this slightly modified definition

still implies that $d^{*} (B) = 0$ . Again fix $ε > 0$ , and define $M$ and $N$ as was just done.

Now consider any interval of integers $I$ with length at least $\frac{N + M}{ε}$. Then, partition $I$

into subintervals: define $I_{0} = I$ $\cap {1, ..., M}$ , and then break $I ∖ I_{0}$ into consecutive

subintervals of length $N$ , called $I_{1}$ , $I_{2}$ , $...$ , $I_{k}$ . There may be one last subinterval left

over of length less than $N$ ; call it $I_{k + 1}$ (which may be empty.) Note that $| I | \underline{>} N k$ ,

or $\frac{N}{| I |} \underline{<} \frac{1}{k}$ . We see that

$\frac{| B \cap I |}{| I |} = \frac{| B \cap I_{0} |}{| I |} + \sum_{i = 1}^{k} \frac{| B \cap I_{i} |}{| I |} + \frac{| B \cap I_{k + 1} |}{| I |} \underline{<} \frac{M}{| I |} + \frac{N}{| I |} (\sum_{i = 1}^{k} \frac{| B \cap I_{i} |}{| I_{i} |}) + \frac{N}{| I |}$

$\underline{<} \frac{M + N}{| I |} + \frac{1}{k} (k ε) < 2 ε$ .

Since $ε$ was arbitrary, $d^{*} (B) = 0$ .

$□$
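For intuition about the quantity controlled in Lemma 2.3.2, one can compute the largest proportion of a finite set inside any window of a given length. The Python sketch below is a finite proxy only, under the assumption that $d^{*}$ is the usual upper Banach density (the limit, as the window length grows, of the maximal window proportion). The function name `max_window_density` and the choice of the perfect squares as a test set are hypothetical.

```python
from bisect import bisect_right


def max_window_density(elements, N):
    """Maximum of |S ∩ J| / N over integer windows J of length N.

    For a nonempty finite set the maximum is attained by a window whose
    left endpoint is an element of the set, so only those are tried.
    """
    s = sorted(set(elements))
    best = 0
    for i, start in enumerate(s):
        # Number of elements of s lying in [start, start + N - 1].
        count = bisect_right(s, start + N - 1) - i
        best = max(best, count)
    return best / N


# Sample data (an assumption for illustration): the squares 1, 4, ..., 100^2.
squares = [m * m for m in range(1, 101)]
d_100 = max_window_density(squares, 100)
d_1000 = max_window_density(squares, 1000)
```

The maximal proportion drops as the window length grows, which is the behavior that the condition $d^{*} (A) = 0$ demands in the limit.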

By combining Lemmas 2.3.1 and 2.3.2, $d^{*} (H) = 0$ . We will now create $X$ . We note

that this part of our construction uses only the fact that $d^{*} (H) = 0$ , and no other

properties. We take $n_{1} = 1$ and $A_{1} = {0, 1}$ . We recall that ${n_{k}}$ must be a sequence

of integers with the following properties: for all $k > 1$ , $n_{k + 1} = C_{k} (k + 1)! | A_{k} | n_{k} + p$

for some positive integer $C_{k} > n_{k} > \frac{1}{d_{k}}$ and prime $n_{k} < p \underline{<} 2 n_{k}$ . We also require $n_{k}$

to grow quickly enough so that for all $k$, and for any interval of integers $I$ of length

at least $n_{k + 1}$ , $\frac{| I \cap H |}{| I |} < \frac{d_{k}}{2 k! | A_{k} | n_{k}}$ . That we may choose such $n_{k}$ is a consequence of the

fact that $d^{*} (H) = 0$ . Using these $n_{k}$ , we define $A_{k}$ as in Construction 3. We now

prove a lemma:

Lemma 2.3.3. For any $k \in N$, $m \in Z$, and for any $u \in \{0, 1\}^{N}$, there exists an

$A_{k}$-word $v_{u, k, m}$ such that $v_{u, k, m} (i - m) = u (i)$ for all $i \in H \cap [m + 1, m + n_{k}]$.

Proof: This is proved by induction on $k$ . Clearly the hypothesis is true for $k = 1$

and for any $u$ , $m$ . Now suppose it to be true for a particular $k$ . We will show

that it is true for $k + 1$ and every $u$ , $m$ . We again construct an auxiliary word

$u_{k + 1}$ : enumerate the elements of $A_{k}$ by $a_{1}$ , $a_{2}$ , $...$ , $a_{| A_{k} |}$ . Then, we again define the

word $u_{k + 1} = a_{1}^{k!} a_{2}^{k!} \ldots a_{| A_{k} |}^{k!}$. Define $v_{k + 1}^{'} = (u_{k + 1} 1)^{p} (u_{k + 1})^{C_{k} (k + 1) - p}$, where $n_{k + 1} =$

$C_{k} (k + 1)! | A_{k} | n_{k} + p$ as above. We note that for any $0 \underline{<} i < k!$ and $w \in A_{k}$ ,

$w$ occurs exactly $C_{k} (k + 1)$ times as a concatenated $A_{k}$ -word at $i$ (mod $k!$ )-indexed

places in $v_{k + 1}^{'}$ . (This uses the fact that $(n_{k}, k!) = 1$ for all $k$ , which has already

been shown.) Now, fix $m \in Z$ . We wish to construct an $A_{k + 1}$ word $v_{u, k + 1, m}$ such

that $v_{u, k + 1, m} (i - m) = u (i)$ for all $i \in H \cap [m + 1, m + n_{k + 1}]$. We begin with the

$A_{k + 1}$ word $v_{k + 1}^{'}$ . Clearly it is not necessarily true that $v_{k + 1}^{'} (i - m)$ $= u (i)$ for all

$i \in H \cap [m + 1, m + n_{k + 1}]$ . We force this condition to be true by changing some of the

$A_{k}$ -words concatenated in $v_{k + 1}^{'}$ . We show that this is possible; for any concatenated

$A_{k}$-word in $v_{k + 1}^{'}$, say $v_{k + 1}^{'} (j) v_{k + 1}^{'} (j + 1) \ldots v_{k + 1}^{'} (j + n_{k} - 1)$, the necessary condition is

that ones or zeroes (depending on $u$) be introduced at digits whose indices are of the

form $i - m$ for all $i \in H \cap [m + j + 1, \ldots, m + j + n_{k} - 1]$. To do this, we replace this

$A_{k}$-word by $v_{u, k, m + j}$, which by the inductive hypothesis has the correct digits of $u$ at the

desired places. So, we may change $v_{k + 1}^{'}$ into a concatenation of $A_{k}$ -words and ones,

call it $v_{u, k + 1, m}$ , which has the proper digits of $u$ in all desired places. This may be done

by changing at most $| H \cap [m + 1, \ldots, m + n_{k + 1}] | < \frac{d_{k} n_{k + 1}}{2 k! | A_{k} | n_{k}} \leq C_{k} (k + 1) d_{k}$ of the $A_{k}$-words.

Therefore, since for any $0 \underline{<} i < k!$ and $w \in A_{k}$ , $w$ occurred in $v_{k + 1}^{'}$ as a concatenated

$A_{k}$ -word at $i$ (mod $k!$ )-indexed places exactly $C_{k} (k + 1)$ times, $w$ occurs in $v_{u, k + 1, m}$ as

a concatenated $A_{k}$ -word at $i$ (mod $k!$ )-indexed places between $C_{k} (k + 1) (1 - d_{k})$ and

$C_{k} (k + 1) (1 + d_{k})$ times. This implies that $fr_{i, k!}^{*} (w, v_{u, k + 1, m}) \in [\frac{1 - d_{k}}{k! | A_{k} |}, \frac{1 + d_{k}}{k! | A_{k} |}]$, and since

$i$ and $w$ were arbitrary, that $v_{u, k + 1, m}$ is an $A_{k + 1}$-word. By induction, Lemma 2.3.3 is

proved.

$□$
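The counting claim for $v_{k + 1}^{'}$ used in this proof can be sanity-checked by brute force on small parameters. The Python sketch below is illustrative only: `occurrence_counts` and the sample values (k = 3, n_k = 5, two words, C_k = 2, p = 7) are hypothetical choices, and each concatenated $A_{k}$-word is tracked only by its index and starting position, not by its symbols.

```python
from collections import Counter
from math import factorial, gcd


def occurrence_counts(k, n_k, num_words, C, p):
    """Walk through v' = (u 1)^p u^(C(k+1) - p), where u lists each of the
    num_words words k! times in a row and every word has length n_k.

    Returns a Counter keyed by (word index, start position mod k!) together
    with the total length of v'.
    """
    kf = factorial(k)
    copies = C * (k + 1)
    assert gcd(n_k, kf) == 1, "the argument needs (n_k, k!) = 1"
    assert 0 <= p <= copies
    counts = Counter()
    pos = 0
    for copy in range(copies):
        for word in range(num_words):
            for _ in range(kf):
                counts[(word, pos % kf)] += 1
                pos += n_k
        if copy < p:
            pos += 1  # the single symbol 1 after each of the first p copies of u
    return counts, pos


counts, length = occurrence_counts(k=3, n_k=5, num_words=2, C=2, p=7)
```

In this sample every pair (word, residue mod $k!$) is hit exactly $C_{k} (k + 1) = 8$ times, and the total length equals $C_{k} (k + 1)! | A_{k} | n_{k} + p$, matching the two facts the proof relies on.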

This implies in particular that for every $u$ , $k$ there exists an $A_{k}$ -word $v_{u, k, - n_{k - 1}}$ with

$v_{u, k, - n_{k - 1}} (i + n_{k - 1}) = u (i)$ for all $i \in H \cap (- n_{k - 1}, n_{k} - n_{k - 1}]$. By a standard

diagonalization argument, there exists a sequence ${k_{j}}$ and $x \in Ω$ such that for

all $i \in Z$ , $x (i) = v_{u, k_{j}, - n_{k_{j} - 1}} (i + n_{k_{j} - 1})$ for all large enough $j$ . Since for every $j$ ,

$v_{u, k_{j}, - n_{k_{j} - 1}} (i + n_{k_{j} - 1}) = u (i)$ for all $i \in H \cap (- n_{k_{j} - 1}, n_{k_{j}} - n_{k_{j} - 1}]$ , clearly $x (i) = u (i)$

for all $i \in H$ . Since $x$ is a limit of shifted $A_{k}$ -words, $x \in X$ . As mentioned above,

this entire construction could be done with any set of zero upper Banach density in

place of $H$ , which lets us state the following corollary:

Corollary 2.3.4. For any $C \subset N$ with $d^{*} (C) = 0$, there exists $(X, σ)$ totally minimal,

totally uniquely ergodic, topologically mixing, and with the property that for any

sequence $u \in \{0, 1\}^{N}$, there exists $x_{u} \in X$ such that $x_{u} (i) = u (i)$ for all $i \in C$.

We use Corollary 2.3.4 to create $(X, σ)$ which proves Theorem 2.1.15. Recall that

$H = \cup_{k = 1}^{\infty} A_{k}$ , where $A_{k} = {p_{n} + k : n \in D_{k}}$ for all $k$ , $D_{k} = \cup_{j \in C_{r_{k}}} B_{j}$ for some

$r_{k}$ , and $B_{j} = [2 j!, (j + 1)!] \cap Z$ for all $j$ . For each $k$ , we write the elements of $C_{r_{k}}$

in increasing order as $c_{r_{k}}^{(1)}$ , $c_{r_{k}}^{(2)}$ , $...$ . We now decompose $H$ into two disjoint subsets;

define $H_{o} =$ { $m \in H$ : $m = p_{n} + k$ for some $n \in B_{j}$ where $j = c_{r_{k}}^{(i)}$ for odd $i$ },

and $H_{e} =$ { $m \in H$ : $m = p_{n} + k$ for some $n \in B_{j}$ where $j = c_{r_{k}}^{(i)}$ for even $i$ }.

Since $d^{*} (H) = 0$ , we use Corollary 2.3.4 to create a totally minimal, totally uniquely

ergodic, and topologically mixing $(X, σ)$ and $x \in X$ with $x (n) = 0$ for $n \in H_{o}$ and

$x (n) = 1$ for $n \in H_{e}$ . Recall that we wish to show that for the continuous function

$f$ : $y \mapsto y (0)$ from $X$ to {0, 1}, the set of points $y$ such that $\frac{1}{N} \sum_{n = 0}^{N - 1} f (σ^{p_{n}} y)$ does

not converge is residual. We showed earlier that it is sufficient to show that the set

$B = ( \bigcap_{n > 0} \bigcup_{k > n} \{ y \in X : fr_{0, 1} (0, y (p_{1}) y (p_{2}) \ldots y (p_{k})) < \frac{1}{4} \} ) \cap$

$( \bigcap_{n > 0} \bigcup_{k > n} \{ y \in X : fr_{0, 1} (0, y (p_{1}) y (p_{2}) \ldots y (p_{k})) > \frac{3}{4} \} )$

is dense in $X$ . Fix any $j \in Z$ . By the construction of $H$ , ${n : j + p_{n} \in H} = D_{k}$

for some $k$ , $x (i) = 0$ for all $i \in H_{o}$ , and $x (i) = 1$ for all $i \in H_{e}$ . In particular,

$x (j + p_{n}) = 0$ for all $n$ in $B_{c_{r_{k}}^{(i)}}$ for odd $i$ and $x (j + p_{n}) = 1$ for all $n$ in $B_{c_{r_{k}}^{(i)}}$ for even $i$.

But then for any odd integer $i$ , $(σ^{j} x) (p_{n}) = 0$ for all integers $n \in [2 (c_{r_{k}}^{(i)})!, (c_{r_{k}}^{(i)} + 1)!]$ ,

and so

$fr_{0, 1} (0, (σ^{j} x) (p_{1}) (σ^{j} x) (p_{2}) \ldots (σ^{j} x) (p_{(c_{r_{k}}^{(i)} + 1)!})) \geq \frac{c_{r_{k}}^{(i)} - 1}{c_{r_{k}}^{(i)} + 1}$,

which is clearly larger than $\frac{3}{4}$ for sufficiently large $i$. Similarly, for any even integer

$i$ , $(σ^{j} x) (p_{n}) = 1$ for all integers $n \in [2 (c_{r_{k}}^{(i)})!, (c_{r_{k}}^{(i)} + 1)!]$ , and so

$f r_{0, 1} (0, (σ^{j} x) (p_{1}) (σ^{j} x) (p_{2}) ... (σ^{j} x) (p_{(c_{r_{k}}^{(i)} + 1)!})) \underline{<} \frac{2}{c_{r_{k}}^{(i)} + 1}$ ,

which is clearly less than $\frac{1}{4}$ for sufficiently large $i$. Therefore, $σ^{j} x \in B$. Since $j \in Z$

was arbitrary, the orbit of $x$ is a subset of $B$ . Since $X$ is minimal, $B$ is dense, which

completes the proof of Theorem 2.1.15.

$□$
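The two frequency bounds at the heart of this proof reduce to exact arithmetic with factorials. The Python sketch below is an illustration under stated assumptions: it takes $fr_{0, 1} (0, w)$ to mean the proportion of symbols of $w$ equal to 0, writes `c` for a block index $c_{r_{k}}^{(i)}$, and considers the worst case in which every symbol outside the forced block $[2 c!, (c + 1)!]$ is adversarial. The function name `zero_frequency_bounds` is hypothetical.

```python
from fractions import Fraction
from math import factorial


def zero_frequency_bounds(c):
    """Among the indices n = 1, ..., (c+1)!, the block [2*c!, (c+1)!] is forced.

    Returns (low, high):
      low  = least possible frequency of 0 when the block is all zeros,
      high = greatest possible frequency of 0 when the block is all ones.
    """
    total = factorial(c + 1)
    block = total - 2 * factorial(c) + 1  # integers in [2 c!, (c+1)!]
    outside = total - block               # the 2 c! - 1 unconstrained indices
    low = Fraction(block, total)          # all outside symbols equal 1
    high = Fraction(outside, total)       # all outside symbols equal 0
    return low, high


low_8, high_8 = zero_frequency_bounds(8)
```

Both bounds cross the thresholds $\frac{3}{4}$ and $\frac{1}{4}$ as soon as $c > 7$, so the averages oscillate along the times $(c + 1)!$.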

We note that this proof in fact shows that the set

$( \bigcap_{n > 0} \bigcup_{k > n} \{ y \in X : fr_{0, 1} (0, y (p_{1}) y (p_{2}) \ldots y (p_{k!})) < \frac{2}{k} \} ) \cap$

$( \bigcap_{n > 0} \bigcup_{k > n} \{ y \in X : fr_{0, 1} (0, y (p_{1}) y (p_{2}) \ldots y (p_{k!})) > \frac{k - 2}{k} \} )$

is a residual set in $X$ , and so we can also say that for a residual set of $y \in X$ ,

$\liminf_{N \to \infty} \frac{1}{N} \sum_{n = 0}^{N - 1} f (σ^{p_{n}} y) = 0 = \inf_{x \in X} f (x)$ and $\limsup_{N \to \infty} \frac{1}{N} \sum_{n = 0}^{N - 1} f (σ^{p_{n}} y) =$

$1 = \sup_{x \in X} f (x)$.

Proof of Theorem 2.1.18: We use Corollary 2.3.4. Consider any set $C = {p_{n}}_{n \in N}$

with $d^{*} (C) = 0$ . Choose any set $C^{'} \subset N$ with $(C + 1)$ $\subset C^{'}$ , $1 \in C^{'}$ , $d^{*} (C^{'}) = 0$ ,

and $| C^{'} ∖ (C + 1)$ $| = \infty$ . Denote the elements of $C^{'} ∖ (C + 1)$ by $b_{1} < b_{2} <$ . . ..

By Corollary 2.3.4, we may construct a totally minimal, totally uniquely ergodic,

topologically mixing $(X, σ)$ with the property that for every $u \in {0, 1}^{N}$ , there exists

$x_{u} \in X$ with $x_{u} (i) = u (i)$ for all $i \in C^{'}$ . For every $v \in {0, 1}^{N}$ , define some $u_{v} \in$

$\{0, 1\}^{N}$ by $u_{v} (1) = 0$, $u_{v} (a) = 1$ for all $a \in C + 1$, and $u_{v} (b_{k}) = v (k)$ for all $k \in N$.

Then, for any such $v$, $x_{u_{v}} (i) = 1$ for all $i \in C + 1$, $x_{u_{v}} (1) = 0$, and $x_{u_{v}} (b_{k}) = v (k)$

for all $k \in N$. For all $n \in N$, $p_{n} + 1 \in C + 1$, so $(σ^{p_{n}} x_{u_{v}}) (1) = x_{u_{v}} (p_{n} + 1) = 1$,

whereas $x_{u_{v}} (1) = 0$. It is then clear that $x_{u_{v}}$ is not a limit point of $\{σ^{p_{n}} x_{u_{v}}\}$. Since

$x_{u_{v}} (b_{k}) = v (k)$ for all $k \in N$, $x_{u_{v}} \neq x_{u_{v^{'}}}$ for $v \neq v^{'}$, and so the set $\{x_{u_{v}}\}_{v \in \{0, 1\}^{N}}$ is

uncountable.

$□$

It is natural to wonder about one aspect of the proof: why is it that in our construction,

we only force certain digits to occur along shifted subsets of $A$, rather than

along entire shifted copies of $A$ ? The reason comes from a combinatorial fact which

is somewhat interesting in its own right:

Example 2.3.5. There exists a set $D \subseteq N$ with $d^{*} (D) = 0$ with the property that for

any infinite set $G$ of integers, the set $D + G = \{ d + g : d \in D, g \in G \}$ has upper

Banach density one.

Proof: We begin with the sequence $d_{n} = 3^{| n |_{2}}$ , where $| n |_{2}$ is the maximal integer

$k$ so that $2^{k} | n$ . So, ${d_{n}}$ begins 1, 3, 1, 9, 1, 3, 1, 27, 1, 3, 1, 9, 1, 3, 1, . . . Then, define

$c_{n} = \sum_{i = 1}^{n} d_{i}$. Then $\{c_{n}\}$ is an increasing sequence of integers, with the property that

the $n$th gap $c_{n + 1} - c_{n}$ is $d_{n + 1}$ for all $n$. We first claim that $d^{*} (\{c_{n}\}) = 0$. Choose any

positive integer $k$ , and any $2^{k} + 1$ consecutive elements $c_{i}$ , $...$ , $c_{i + 2^{k}}$ of the sequence

$\{c_{n}\}$. There must be some integer $j \in [1, 2^{k}]$ such that $2^{k} | i + j$. This means

that $3^{k} | d_{i + j}$, and so that $c_{i + 2^{k}} - c_{i} = \sum_{m = i + 1}^{i + 2^{k}} d_{m} > d_{i + j} \geq 3^{k}$. This means that

any interval of integers of length less than $3^{k}$ can contain at most $2^{k}$ elements of

${c_{n}}$ for any $k$ , which implies that $d^{*} ({c_{n}}) = 0$ since $\lim_{k \to \infty} \frac{2^{k + 1}}{3^{k}} = 0$ . Therefore,

if we construct a new sequence by increasing the gaps ${d_{n}}$ , it will still have upper

Banach density zero. We will change ${d_{n}}$ countably many times, never decreasing

any element. In other words, we inductively construct, for every $k \in N$ , a sequence

${d_{n}^{(k)}}$ , so that these sequences are nondecreasing in $k$ , i.e. $d_{n}^{(k)} \underline{>} d_{n}^{(k - 1)}$ for all $n$ , $k$ .

We add the additional hypothesis that for every $k$ , $d ({n : d_{n}^{(k)} > d_{n}}) = 0$ , in other

words that only a density zero subset of the elements of ${d_{n}}$ have been changed after

any step.
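The base sequence $d_{n} = 3^{| n |_{2}}$ and the spacing estimate behind $d^{*} (\{c_{n}\}) = 0$ are easy to check numerically. The Python sketch below is illustrative only; the helper names `two_adic_valuation`, `base_gaps`, and `min_span`, and the cutoff of 1024 terms, are hypothetical choices.

```python
from itertools import accumulate


def two_adic_valuation(n):
    """Largest k with 2^k dividing the positive integer n."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k


def base_gaps(count):
    """The gaps d_1, ..., d_count with d_n = 3^(2-adic valuation of n)."""
    return [3 ** two_adic_valuation(n) for n in range(1, count + 1)]


def min_span(c, k):
    """Smallest value of c_{i + 2^k} - c_i over the available indices i."""
    step = 2 ** k
    return min(c[i + step] - c[i] for i in range(len(c) - step))


gaps = base_gaps(1024)
c = list(accumulate(gaps))  # c[n - 1] holds the partial sum c_n
first_gaps = gaps[:15]
```

Any $2^{k} + 1$ consecutive terms of $\{c_{n}\}$ span more than $3^{k}$, so a window of length less than $3^{k}$ holds at most $2^{k}$ of them.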

Step 1: We change $d_{n}$ for some infinite, but density zero, set of $n$ , so that for every

positive integer $m$ , there exists $n$ such that $d_{n} = m$ . This is clearly possible; for

example, by increasing $d_{n}$ for a density zero set of odd $n$ . Call the resulting sequence

${d_{n}^{(1)}}$ .

Step $k$ : $(k > 1)$ Assume that we have already defined ${d_{n}^{(k - 1)}}$ , a sequence of integers

with the property that $d_{n}^{(k - 1)} \underline{>} d_{n}$ for all $n$ , and that $d ({n : d_{n}^{(k - 1)} > d_{n}}) = 0$ .

Define the set $R_{k} \subset N^{k}$ by $R_{k} =$ ${(a_{1}, ..., a_{k}) : a_{i} \in N, a_{i} > 3^{k}}$ . Since { $n$ :

$d_{n}^{(k - 1)} > d_{n}}$ has density zero, there exist infinitely many intervals of integers $I_{j}$ such

that $| I_{j} | > 2^{k + 1}$ for all $j$ , and so that $d_{n}^{(k - 1)} = d_{n}$ for all $n \in I_{j}$ for any $j$ . Therefore,

by the construction of the sequence ${d_{n}}$ , each $I_{j}$ contains a subinterval $I_{j}^{'}$ of integers

of length $2^{k}$ so that for every $j$ , and for all $n \in I_{j}^{'}$ , $d_{n}^{(k - 1)} \underline{<} 3^{k}$ . We may also assume,

by passing to a subset if necessary, that the union of all $I_{j}^{'}$ has density zero. We now

take any bijection $φ$ from $N$ to $R_{2^{k}}$ , and for every $j$ , if $I_{j}^{'} = {s, s + 1, ..., s + 2^{k} - 1}$ ,

and if $φ (j) = (a_{1}, ..., a_{2^{k}}) \in R_{2^{k}}$ , define $d_{s}^{(k)} = a_{1}$ , $d_{s + 1}^{(k)} = a_{2}$ , $...$ , $d_{s + 2^{k} - 1}^{(k)} = a_{2^{k}}$ . After

making these changes on each $I_{j}^{'}$ , for any $m$ not in any $I_{j}^{'}$ , define $d_{m}^{(k)} = d_{m}^{(k - 1)}$ . This

defines $d_{n}^{(k)}$ for all $n$ . Since for every $n$ , $d_{n}^{(k)} \underline{>} d_{n}^{(k - 1)}$ , by the inductive hypothesis we

see that $d_{n}^{(k)} \underline{>} d_{n}$ for all $n$ . Also, since $d ({n : d_{n}^{(k)} > d_{n}^{(k - 1)}}) = 0$ , and since by the

inductive hypothesis, $d ({n : d_{n}^{(k - 1)} > d_{n}}) = 0$ , we see that $d ({n : d_{n}^{(k)} > d_{n}}) = 0$ ,

completing the inductive step.

It is a consequence of this construction that if we define $H_{k} = {n : d_{n}^{(k)} > d_{n}^{(k - 1)}}$

for every $k$ , the sets $H_{k}$ are pairwise disjoint. Therefore, the sequences ${d_{n}^{(k)}}$ have

a pointwise limit, call it ${e_{n}}$ . By the construction, for any $k$ , and for any k-tuple

$(a_{1}, ..., a_{k})$ of integers all greater than $3^{k}$ , there exists $m$ such that $d_{m + i - 1}^{(k)} = a_{i}$ for

$1 \underline{<} i \underline{<} k$ . And, since the $H_{k}$ are disjoint, this means that $e_{m + i - 1} = a_{i}$ for $1 \underline{<} i \underline{<} k$ .

Now, define the sequence $f_{n} = \sum_{k = 1}^{n} e_{k}$ for all $n$. As already noted, since $e_{n} \geq d_{n}$ for

all $n$ , $D = {f_{n}}$ has upper Banach density zero. We claim that for any infinite set

$G$ of integers, $D + G$ contains arbitrarily long intervals of integers. Fix any infinite

$G = {g_{n}}$ , with $g_{1} < g_{2} < ...$ . For any $k$ , there exist $m_{1}$ , $m_{2}$ , . . . , $m_{2^{k}}$ , $m_{2^{k} + 1}$ so

that $g_{m_{j + 1}} - g_{m_{j}} > 3^{k}$ for every $1 \underline{<} j \underline{<} 2^{k}$ . Then, by construction, there exists $m$

so that $e_{m + j} = g_{m_{2^{k} - j + 2}} - g_{m_{2^{k} - j + 1}} + 1$ for $1 \underline{<} j \underline{<} 2^{k}$ . This means that $f_{m + i} =$

$f_{m} + \sum_{j = 1}^{i} e_{m + j} = f_{m} + \sum_{j = 1}^{i} (g_{m_{2^{k} - j + 2}} - g_{m_{2^{k} - j + 1}} + 1)$ $= (f_{m} + g_{m_{2^{k} + 1}}) - g_{m_{2^{k} + 1 - i}} + i$

for $1 \leq i \leq 2^{k}$. But then for each such $i$, $f_{m + i} + g_{m_{2^{k} + 1 - i}} = f_{m} + g_{m_{2^{k} + 1}} + i \in D + G$.

This implies that ${f_{m} + g_{m_{2^{k} + 1}} + 1, ..., f_{m} + g_{m_{2^{k} + 1}} + 2^{k}}$ is an interval of integers of

length $2^{k}$ which is a subset of $D + G$ . Since $k$ was arbitrary, $D + G$ contains arbitrarily

long intervals and so has upper Banach density one.

$□$
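The telescoping computation that produces an interval inside $D + G$ can be replayed on concrete numbers. In the Python sketch below, the function name `sumset_run`, the sample values of $g_{m_{1}} < \ldots < g_{m_{2^{k} + 1}}$, and the normalization $f_{m} = 0$ are hypothetical; the gap condition $g_{m_{j + 1}} - g_{m_{j}} > 3^{k}$ is satisfied by the sample but plays no role in the identity itself.

```python
def sumset_run(g, k, f_m=0):
    """g lists g_{m_1} < ... < g_{m_{2^k + 1}} (sample data, 0-indexed here).

    Builds the gaps e_{m+j} = g_{m_{2^k - j + 2}} - g_{m_{2^k - j + 1}} + 1,
    the partial sums f_{m+i}, and returns the elements
    f_{m+i} + g_{m_{2^k + 1 - i}} of D + G for i = 1, ..., 2^k.
    """
    K = 2 ** k
    assert len(g) == K + 1

    def G(j):
        # 1-indexed access, so that G(j) plays the role of g_{m_j}.
        return g[j - 1]

    run = []
    f = f_m
    for i in range(1, K + 1):
        f += G(K - i + 2) - G(K - i + 1) + 1  # add e_{m+i} to obtain f_{m+i}
        run.append(f + G(K + 1 - i))
    return run


sample_g = [0, 100, 250, 1000, 5000]
run = sumset_run(sample_g, k=2)
```

The returned values are the consecutive integers $f_{m} + g_{m_{2^{k} + 1}} + 1, \ldots, f_{m} + g_{m_{2^{k} + 1}} + 2^{k}$, exactly as in the proof.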

This answers our question: if we had, in the proof of Theorem 2.1.15, tried to force

ones to occur along infinitely many shifted copies of our set $A$ of upper Banach

density zero, it's possible that no matter what set $G$ of shifts we used, we would

be attempting to force $x (i) = 1$ for arbitrarily long intervals of integers $i$ and some

$x \in X$, which would imply that $1^{\infty} \in X$, yielding the proper closed invariant set $\{1^{\infty}\} \subsetneq X$

and contradicting the minimality of $X$.

2.4 Some general constructions on connected manifolds

We will now construct some totally minimal, totally uniquely ergodic, and topologically

mixing topological dynamical systems $(X, T)$ for which $X$ is a connected

manifold, and in Section 2.5 we will use such examples to prove Theorems 2.1.16 and

2.1.19. The constructions in question will use both skew products and flows under

functions. We will be repeatedly using the topological space $T$ , which can most easily

be considered as the half-open interval $[0, 1)$ with 0 and 1 identified and the operation

of addition (mod 1). (Whenever we refer to the addition of elements of $T$ , it should

be understood to be addition (mod 1).) We also note that $T^{n}$ is a metric space for

any $n$ with metric $d$ defined by $d (x, y) = \min_{u, v \in Z^{n}} \bar{d} (x + u, y + v)$ , where $\bar{d}$ is the

Euclidean metric in $R^{n}$ . We denote by $λ_{k}$ Lebesgue measure on $T^{k}$ for any $k > 0$ .

For any $k > 1$, irrational $α \in T$ and continuous self-map $f$ of $T$, we define the

homeomorphism $S = S_{k, α, f}$ on $T^{k}$ as follows: for any $v =$ $(v_{1}, ..., v_{k}) \in T^{k}$ , $(S v)_{1} =$

$v_{1} + α$ , $(S v)_{2} = f (v_{1}) + v_{2}$ , and for every $j > 2$ , $(S v)_{j} = v_{j - 1} + v_{j}$ .
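To make the definition concrete, the map $S_{k, α, f}$ can be iterated numerically. The Python sketch below is an illustration under stated assumptions: the particular $f$ (with $f^{'} (x) = 1 + \frac{1}{4} \cos (2 π x)$), the value $α = \sqrt{2} - 1$, the starting point, and all function names are hypothetical choices, and floating-point arithmetic only approximates the dynamics on $T^{k}$.

```python
import math


def f(x):
    # Hypothetical choice of f with derivative in [3/4, 5/4];
    # it satisfies f(x + 1) = f(x) + 1, so it descends to the circle.
    return x + math.sin(2 * math.pi * x) / (8 * math.pi)


def skew_step(v, alpha):
    """One application of S: all coordinates are updated from the old v."""
    out = [(v[0] + alpha) % 1.0, (f(v[0]) + v[1]) % 1.0]
    for j in range(2, len(v)):
        out.append((v[j - 1] + v[j]) % 1.0)
    return out


def iterate(v, alpha, steps):
    for _ in range(steps):
        v = skew_step(v, alpha)
    return v


def circle_distance(a, b):
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


alpha = math.sqrt(2) - 1
start = [0.1, 0.2, 0.3]
steps = 50
end = iterate(start, alpha, steps)

# The second coordinate after `steps` iterations is the starting value
# plus a sum of f along the rotation orbit of the first coordinate.
expected_second = (start[1] + sum(f((start[0] + i * alpha) % 1.0)
                                  for i in range(steps))) % 1.0
```

The comparison between `end[1]` and `expected_second` illustrates that the second coordinate of an iterate is an orbit sum of $f$ over the rotation by $α$.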

Theorem 2.4.1. For any f differentiable with $\frac{1}{2} < f^{'} (x) < \frac{3}{2}$ for all x $\in T$ , $S_{k, α, f}$ is

totally minimal and totally uniquely ergodic with respect to $λ_{k}$ .

Proof: During the proof, since $k$, $α$, and $f$ are taken to be fixed, we suppress

notational dependence and refer to $S_{k, α, f}$ simply as $S$. Fix any rectangles $R =$

$\prod_{i = 1}^{k} R_{i}$ and $R^{'} = \prod_{i = 1}^{k} R_{i}^{'}$ in $T^{k}$, where $R_{i}$ and $R_{i}^{'}$ are intervals of length $C$ for every

$1 \leq i \leq k$. Now, fix any positive integer $ℓ$ such that $λ_{1} ((R_{1} + ℓ α) \cap R_{1}^{'}) > \frac{c}{2}$. Fix any

$r_{2} \in R_{2}$ , $...$ , $r_{k} \in R_{k}$ , and define the set

$E_{1} = {x_{1} \in R_{1} : π_{1} (S^{ℓ} (x_{1}, r_{2}, ..., r_{k})) \in R_{1}^{'}}$ .

(Here and elsewhere, $π_{i}$ : $T^{k} \to T$ is the usual projection map $v \mapsto v_{i}$ for $1 \underline{<} i \underline{<} k .$ )

Since $π_{1} (S^{ℓ} (x_{1}, r_{2}, \ldots, r_{k})) = x_{1} + ℓ α$, by choice of $ℓ$ we know that $E_{1}$ is an interval

with $λ_{1} (E_{1}) > \frac{c}{2}$ . Now, define the set

$E_{2} = {x_{1} \in R_{1} : π_{i} (S^{ℓ} (x_{1}, r_{2}, ..., r_{k})) \in R_{i}^{'}, 1 \underline{<} i \underline{<} 2}$ .

We will examine the structure of $E_{2}$ by bounding $\frac{\partial π_{2} (S^{ℓ} (x_{1}, r_{2}, ..., r_{k}))}{\partial x_{1}}$ from above and

below. For this, we note that $π_{2}$ $(S^{ℓ} (x_{1}, r_{2}, ..., r_{k})) = r_{2} + \sum_{i = 0}^{ℓ - 1} f (x_{1} + i α)$ , and

make the observation that for any $i$ , $f (x_{1} + i α)$ is equal modulo one to an increasing

function in $x_{1}$ whose slope is between $\frac{1}{2}$ and $\frac{3}{2}$ . Therefore, considered as a function

of $x_{1}$ , $π_{2}$ $(S^{ℓ} (x_{1}, r_{2}, ..., r_{k}))$ is equal modulo one to an increasing function from $[0, 1)$

to $R$ whose derivative is in $(\frac{ℓ}{2}, \frac{3 ℓ}{2})$ for all $x_{1}$ . This implies that $E_{2}$ is a union of many

intervals separated by gaps of length less than $\frac{1}{\frac{ℓ}{2}}$ , where the length of all intervals but

the first and last is greater than $\frac{c}{\frac{3 ℓ}{2}}$. For large $ℓ$, this implies that $E_{2}$ contains some

set $F_{2}$, a union of intervals of length $\frac{D_{2} C}{ℓ}$ for some constant $D_{2}$, where $λ_{1} (F_{2}) > B_{2} C^{2}$

for some constant $B_{2}$ .

We proceed inductively: for any $2 \leq i < k$, assume that we are given $F_{i}$, a union of

intervals whose lengths are $D_{i} C ℓ^{- (i - 1)}$ for some constant $D_{i}$, where $λ_{1} (F_{i}) > B_{i} C^{i}$

for some constant $B_{i}$ , and such that for any $x_{1} \in F_{i}$ , $π_{j}$ $(S^{ℓ} (x_{1}, r_{2}, ..., r_{k})) \in R_{j}^{'}$ for

$1 \underline{<} j \underline{<} i$ . We now wish to define $F_{i + 1}$ . Define the set

$E_{i + 1} = {x_{1} \in F_{i} : π_{i + 1} (S^{ℓ} (x_{1}, r_{2}, ..., r_{k})) \in R_{i + 1}^{'}}$ .

For any interval $I$ of length $D_{i} C ℓ^{- (i - 1)}$ in $F_{i}$, let us examine $I \cap E_{i + 1}$. We note that

$π_{i + 1}$ $(S^{ℓ} (x_{1}, r_{2}, ..., r_{k})) = η_{i + 1, ℓ} (r_{2}, ..., r_{i + 1}) + \sum_{j = 0}^{ℓ - i}$ $f (x_{1} + j α)$ ,

where $η_{i + 1, ℓ}$ is some function of $r_{2}, \ldots, r_{i + 1}$ which does not depend on $x_{1}$. So, as a

function of $x_{1}$ , $π_{i + 1}$ $(S^{ℓ} (x_{1}, r_{2}, ..., r_{k}))$ is equal modulo one to an increasing function

from $[0, 1)$ to $R$ whose derivative is between $A ℓ^{i}$ and $B ℓ^{i}$ for some constants $A$, $B > 0$.

This implies that for large $ℓ$, $I \cap E_{i + 1}$ is a union of many intervals separated by gaps

of length less than $\frac{1}{B} ℓ^{- i}$, where the length of all but the first and last is greater than

$\frac{c}{A} ℓ^{- i}$. For large $ℓ$, this implies that $I \cap E_{i + 1}$ contains $F_{I, i + 1}$, a union of intervals of

length $D C ℓ^{- i}$ where $λ_{1} (F_{I, i + 1}) > E C λ_{1} (I)$ for some constants $D$, $E$. By taking $F_{i + 1}$

to be the union of all $F_{I, i + 1}$, we see that $F_{i + 1}$ is a union of intervals of length $D_{i + 1} C ℓ^{- i}$,

where $λ_{1} (F_{i + 1}) > B_{i + 1} C^{i + 1}$ for some constants $D_{i + 1}$ and $B_{i + 1}$ . Since $F_{i + 1} \subset E_{i + 1}$ , for

any $x_{1} \in F_{i + 1}$ , $π_{j}$ $(S^{ℓ} (x_{1}, r_{2}, ..., r_{k})) \in R_{j}^{'}$ for $1 \underline{<} j \underline{<} i + 1$ .

By inductively proceeding in this way, we eventually arrive at a set $F_{k}$ where $λ_{1} (F_{k}) >$

$B_{k} C^{k}$ for some constant $B_{k}$ , and where $S^{ℓ} (x_{1}, r_{2}, ..., r_{k}) \in R^{'}$ for every $x_{1} \in F_{k}$ . By

integrating over all possible $r_{2}$ , $...$ , $r_{k}$ , we see that $λ_{k} (S^{ℓ} R \cap R^{'}) > B_{k} C^{2 k - 1}$ . We have

then shown that for any $ℓ$ with $λ_{1} ((R_{1} + ℓ α) \cap R_{1}^{'}) > \frac{c}{2}$, $λ_{k} (S^{ℓ} R \cap R^{'}) > B_{k} C^{2 k - 1}$.

Denote by $L$ the set of such $ℓ$. Then if we define $R_{α}$ to be the transformation $x \mapsto x + α$

on $T$, then $L = \{ n : R_{α}^{n} (0) \in J \}$, where $J = \{ x \in T : λ_{1} ((R_{1} + x) \cap R_{1}^{'}) > \frac{c}{2} \}$. It is

easily checked that $λ_{1} (J) = C$. This means that for any $M$, $N \in N$,

$\frac{1}{N} | L \cap \{0, M, \ldots, M (N - 1)\} | = \frac{1}{N} \sum_{i = 0}^{N - 1} χ_{J} ((R_{α}^{M})^{i} 0)$,

which approaches $λ_{1} (J) = C$ as $N \to \infty$ by total unique ergodicity of $R_{α}$ with respect

to $λ_{1}$ . Then, for large $N$ ,

$\frac{1}{N} \sum_{ℓ = 0}^{N - 1} λ_{k} (S^{M ℓ} R \cap R^{'}) \geq \frac{1}{N} | L \cap \{0, M, \ldots, M (N - 1)\} | B_{k} C^{2 k - 1} \to B_{k} C^{2 k}$.

Therefore,

$\liminf_{N \to \infty} \frac{1}{N} \sum_{ℓ = 0}^{N - 1} λ_{k} (S^{M ℓ} R \cap R^{'}) \geq B_{k} λ_{k} (R) λ_{k} (R^{'})$. (2.1)

We have shown that Equation 2.1 holds for $R$ and $R^{'}$ arbitrary congruent cubes in $T^{k}$ .

It is clear that it also holds for $R$ and $R^{'}$ disjoint unions of congruent cubes. Suppose

that for some $M$ $\in N$ , there exists a Lebesgue measurable $S^{M}$ -invariant set $A \underline{\subset} T^{k}$

with $λ_{k} (A) \in (0, 1)$ . By taking complements if necessary, without loss of generality we

may assume that $λ_{k} (A) \leq \frac{1}{2}$. By regularity of Lebesgue measure, there exist $ε > 0$ and

$A^{'}$ a union of cubes with side length $ε$ such that $(A^{'})^{c}$ is also a union of cubes of side

length $ε$, $λ_{k} (A) \leq λ_{k} (A^{'}) \leq \frac{1}{2}$, and $λ_{k} (A \triangle A^{'}) < \frac{B_{k}}{2} λ_{k} (A)^{2}$. Then, for any $ℓ \in N$, since

$S^{M ℓ} A = A$ and $λ_{k} (A \triangle A^{'}) < \frac{B_{k}}{2} λ_{k} (A)^{2}$, $λ_{k} (A \triangle S^{M ℓ} A^{'}) < \frac{B_{k}}{2} λ_{k} (A)^{2}$ as well. Moreover,

$S^{M ℓ} A^{'} \cap (A^{'})^{c} \subseteq (A \triangle S^{M ℓ} A^{'}) \cup (A \triangle A^{'})$. Therefore, $λ_{k} (S^{M ℓ} A^{'} \cap (A^{'})^{c}) < B_{k} λ_{k} (A)^{2}$ for all

$ℓ \in N$. Since $λ_{k} (A) \leq λ_{k} (A^{'}) \leq λ_{k} ((A^{'})^{c})$, this contradicts Equation 2.1.

So, $S^{M}$ is ergodic with respect to $λ_{k}$ for every $M$ $> 0$ . We claim that this also

implies unique ergodicity of $S^{M}$ for every $M$ $> 0$ , which follows from an argument of

Furstenberg, and rests on the fact that $S^{M}$ is a skew product over an irrational circle

rotation. The following fact is shown in the proof of Lemma 2.1 in [Fu] on p. 578:

For any minimal system $(X_{0}, T_{0})$ which is uniquely ergodic with respect to a measure

$μ_{0}$ , and any skew product $T$ which acts on $X = X_{0} \times T$ by $T (x_{0}, y) = (T_{0} x_{0}, y + h (x_{0}))$

where $h$ : $X_{0} \to T$ is a continuous function, if $(X, T)$ is ergodic with respect to $μ_{0} \times m$ ,

then $(X, T)$ is minimal and uniquely ergodic with respect to $μ_{0} \times m$ .

Denote by $(S^{M})^{(i)}$ the action of $S^{M}$ on its first $i$ coordinates for any $1 \leq i \leq k$ . Since

they are factors of $S^{M}$ , each $(S^{M})^{(i)}$ is ergodic with respect to $λ_{i}$ . Also, for each

$1 \leq i < k$ , $(S^{M})^{(i + 1)}$ is a skew product as described above with $T_{0} = (S^{M})^{(i)}$ . We

may then use Furstenberg's result and the fact that $(S^{M})^{(1)}$ is minimal and uniquely

ergodic with respect to $λ_{1}$ (since it is an irrational circle rotation) to see that $(S^{M})^{(2)}$

is minimal and uniquely ergodic with respect to $λ_{2}$ . We can continue inductively in

this fashion to arrive at the fact that $S^{M}$ is minimal and uniquely ergodic with respect

to $λ_{k}$ . Since $M$ was arbitrary, $S$ is totally minimal and totally uniquely ergodic with

respect to $λ_{k}$ .

$□$

We will now use these skew products to define flows under functions which have all

of the previous properties and are also topologically mixing. Define the continuous

function $g : T^{2} \to R$ by

$g (x, y) = 2 + \mathrm{Re} (\sum_{k = 2}^{\infty} \frac{e^{2 π i k x}}{e^{k}} + \sum_{l = 2}^{\infty} \frac{e^{2 π i l y}}{e^{l}})$ .
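As a quick numerical sanity check on the range of $g$ , the following sketch evaluates a truncated version of the series on a grid and compares it with the geometric-series bound $\sum_{k \geq 2} e^{- k} = \frac{e^{- 2}}{1 - e^{- 1}}$ on each of the two sums. The truncation length and the grid size are arbitrary choices made for illustration only.

```python
import cmath
import math

def g(x, y, terms=60):
    """Truncated g(x, y) = 2 + Re(sum_{k>=2} e^{2 pi i k x}/e^k + sum_{l>=2} e^{2 pi i l y}/e^l)."""
    total = 0j
    for k in range(2, terms):
        total += cmath.exp(2j * math.pi * k * x) / math.exp(k)
        total += cmath.exp(2j * math.pi * k * y) / math.exp(k)
    return 2.0 + total.real

# Geometric-series bound on the absolute value of each sum: e^{-2} / (1 - e^{-1}).
tail_bound = math.exp(-2) / (1 - math.exp(-1))

# Evaluate g on a 50 x 50 grid of the torus (an illustrative resolution).
grid = [i / 50 for i in range(50)]
values = [g(x, y) for x in grid for y in grid]
g_min, g_max = min(values), max(values)
```

Since twice the bound is below $1$ , the series part of $g$ has absolute value less than $1$ , which is all that the inequality on the range of $g$ requires.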

Note that $1 < g (x, y) < 3$ for all $x$ , $y$ . We then define the space $X = \{(v, x, y, t)$ :

$v \in T^{k}$ , $x$ , $y \in T$ , $0 \leq t \leq g (x, y)\}$ where $(v, x, y, g (x, y))$ and $(S_{k, α, f} v, x + γ, y + γ^{'}, 0)$

are identified for all $v$ , $x$ , $y$ . $X$ is then homeomorphic to the mapping torus of $T^{k + 2}$

and a continuous map, and so is a connected $(k + 3)$ -manifold. For any irrational

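$γ$ , $γ^{'} \in T$ , we then define the continuous map $T_{k + 3, α, γ, γ^{'}, f} : X \to X$ by

$T_{k + 3, α, γ, γ^{'}, f} (v, x, y, t) = \begin{cases} (v, x, y, t + 1) & \text{if } t + 1 < g (x, y), \\ (S_{k, α, f} v, x + γ, y + γ^{'}, t + 1 - g (x, y)) & \text{if } t + 1 \geq g (x, y). \end{cases}$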

Finally, we define $μ = (\int_{T^{k + 2}} g d λ_{k + 2})^{- 1} λ_{k + 3} = \frac{1}{2} λ_{k + 3}$ , a $T_{k + 3, α, γ, γ^{'}, f}$ -invariant Borel

probability measure on $X$ . We will prove the following:

Theorem 2.4.2. For any $f$ differentiable with $\frac{1}{2} < f^{'} (x) < \frac{3}{2}$ for all $x \in T$ and any

irrational $α$ , $γ$ , $γ^{'} \in T$ which are linearly independent and which satisfy $q_{n}^{'} > e^{3 q_{n}}$ and

$q_{n + 1} > e^{3 q_{n}^{'}}$ , where $\{q_{n}\}$ and $\{q_{n}^{'}\}$ are the digits in the continued fraction expansions

of $γ$ and $γ^{'}$ respectively, $(X, T_{k + 3, α, γ, γ^{'}, f})$ is totally minimal, totally uniquely ergodic,

and topologically mixing.

Again, since $α$ , $γ$ , $γ^{'}$ , and $f$ are fixed, for now we suppress the dependence on these

quantities in notation and denote the transformations $T_{k + 3, α, γ, γ^{'}, f}$ and $S_{k, α, f}$ by $T$

and $S$ respectively. We also introduce the notation, for any integer $ℓ > 0$ , $g_{ℓ} (x, y) =$

$\sum_{i = 0}^{ℓ - 1} g (x + i γ, y + i γ^{'})$ , and define $g_{0} (x, y) = 0$ . The proof of Theorem 2.4.2 rests

mostly on the following lemma, which is essentially taken from [Fa] .

Lemma 2.4.3. For any sufficiently large integer $n > 0$ , $y \in T$ , $ℓ \in [\frac{1}{2} e^{2 q_{n}}, 2 e^{2 q_{n}^{'}}]$ , and

any $x_{0} \in T$ with $q_{n} x_{0} \in [\frac{1}{6}, \frac{1}{3}]$ ,

$\frac{ℓ q_{n}}{e^{q_{n}}} < | \frac{\partial g_{ℓ} (x, y)}{\partial x} (x_{0}) | < \frac{7 ℓ q_{n}}{e^{q_{n}}}$ .

Proof: $g_{ℓ} (x, y) = 2 ℓ + \mathrm{Re} (\sum_{l = 2}^{\infty} \frac{X (ℓ, l)}{e^{l}} e^{2 π i l x} + \sum_{m = 2}^{\infty} \frac{Y (ℓ, m)}{e^{m}} e^{2 π i m y})$ , where

$X (ℓ, l) = \frac{1 - e^{2 π i ℓ l γ}}{1 - e^{2 π i l γ}}$ ,

$Y (ℓ, m) = \frac{1 - e^{2 π i ℓ m γ^{'}}}{1 - e^{2 π i m γ^{'}}}$ .

The following facts are proved in [Fa], p. 454:

For all $l \in N ∖ \{0\}$ , $ℓ \in N$ , $| X (ℓ, l) | \leq ℓ$ . (2.2)

For all $n \in N$ , $l < q_{n}$ , $ℓ \in N$ , $| X (ℓ, l) | \leq q_{n}$ . (2.3)

For all $n \in N$ , $l \in (q_{n}, 2 q_{n})$ , $ℓ \in N$ , $| X (ℓ, l) | \leq 2 q_{n}$ . (2.4)

For any $ℓ \leq \frac{q_{n + 1}}{2}$ , $| X (ℓ, q_{n}) | \geq \frac{2 ℓ}{π}$ . (2.5)

For any $ℓ \leq \frac{q_{n + 1}}{2}$ , $| \arg (X (ℓ, q_{n})) | \leq π \frac{ℓ - 1}{q_{n + 1}}$ . (2.6)
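The quantity $X$ in these facts is a finite geometric sum: its first argument counts the terms $e^{2 π i j l γ}$ , $j = 0, 1, \ldots$ , being added, and fact (2.2) is then the triangle inequality. The sketch below compares the closed form with the direct sum and checks (2.2) numerically. The value of `gamma` is a hypothetical irrational chosen for illustration; it is not one of the rotation numbers constructed in the text, and the tested ranges are arbitrary.

```python
import cmath
import math

def X_closed(ell, l, gamma):
    """Closed form (1 - e^{2 pi i ell l gamma}) / (1 - e^{2 pi i l gamma})."""
    z = cmath.exp(2j * math.pi * l * gamma)
    return (1 - z ** ell) / (1 - z)

def X_sum(ell, l, gamma):
    """Direct sum of e^{2 pi i j l gamma} over j = 0, ..., ell - 1."""
    return sum(cmath.exp(2j * math.pi * j * l * gamma) for j in range(ell))

gamma = math.sqrt(2) - 1  # hypothetical irrational, for illustration only
pairs = [(ell, l) for ell in range(1, 40) for l in range(1, 40)]

# Largest discrepancy between the closed form and the direct sum.
max_gap = max(abs(X_closed(ell, l, gamma) - X_sum(ell, l, gamma)) for ell, l in pairs)

# Fact (2.2): a sum of ell unit-modulus terms has modulus at most ell.
bound_holds = all(abs(X_sum(ell, l, gamma)) <= ell + 1e-9 for ell, l in pairs)
```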

We will use these to prove Lemma 2.4.3. It is easy to check that

$\frac{\partial g_{ℓ} (x, y)}{\partial x} (x_{0}) = \mathrm{Re} (\sum_{l = 2}^{\infty} 2 π i l \frac{X (ℓ, l)}{e^{l}} e^{2 π i l x_{0}})$

$= \mathrm{Re} (2 π i q_{n} \frac{| X (ℓ, q_{n}) |}{e^{q_{n}}} e^{2 π i q_{n} x_{0}}) + \mathrm{Re} (\sum_{l = 2}^{q_{n} - 1} 2 π i l \frac{X (ℓ, l)}{e^{l}} e^{2 π i l x_{0}})$

$+ \mathrm{Re} (\sum_{l = q_{n} + 1}^{2 q_{n} - 1} 2 π i l \frac{X (ℓ, l)}{e^{l}} e^{2 π i l x_{0}}) + \mathrm{Re} (\sum_{l = 2 q_{n}}^{\infty} 2 π i l \frac{X (ℓ, l)}{e^{l}} e^{2 π i l x_{0}})$

$+ \mathrm{Re} (2 π i q_{n} \frac{X (ℓ, q_{n}) - | X (ℓ, q_{n}) |}{e^{q_{n}}} e^{2 π i q_{n} x_{0}})$ .

We bound the first term from above and below and the rest from above.

$| \mathrm{Re} (2 π i q_{n} \frac{| X (ℓ, q_{n}) |}{e^{q_{n}}} e^{2 π i q_{n} x_{0}}) | = \frac{| X (ℓ, q_{n}) |}{e^{q_{n}}} 2 π q_{n} | \sin (2 π q_{n} x_{0}) |$ , and since $ℓ \in [\frac{1}{2} e^{2 q_{n}}, 2 e^{2 q_{n}^{'}}]$ ,

$ℓ \leq \frac{q_{n + 1}}{2}$ . By (2.2) and (2.5), $\frac{2}{π} ℓ \leq | X (ℓ, q_{n}) | \leq ℓ$ . Since $q_{n} x_{0} \in [\frac{1}{6}, \frac{1}{3}]$ , $\frac{1}{2} \leq$

$| \sin (2 π q_{n} x_{0}) | \leq 1$ . Therefore, $\frac{2 ℓ q_{n}}{e^{q_{n}}} \leq | \mathrm{Re} (2 π i q_{n} \frac{| X (ℓ, q_{n}) |}{e^{q_{n}}} e^{2 π i q_{n} x_{0}}) | \leq \frac{2 π ℓ q_{n}}{e^{q_{n}}}$ .

Next, by (2.3),

$| \mathrm{Re} (\sum_{l = 2}^{q_{n} - 1} 2 π i l \frac{X (ℓ, l)}{e^{l}} e^{2 π i l x_{0}}) | \leq 2 π \sum_{l = 2}^{q_{n} - 1} l \frac{| X (ℓ, l) |}{e^{l}} \leq 2 π \sum_{l = 2}^{q_{n} - 1} \frac{q_{n} l}{e^{l}} \leq 2 π q_{n}^{2}$ .

We similarly have

$| \mathrm{Re} (\sum_{l = q_{n} + 1}^{2 q_{n} - 1} 2 π i l \frac{X (ℓ, l)}{e^{l}} e^{2 π i l x_{0}}) | \leq 4 π q_{n}^{2}$ .

Also, from (2.2), we can conclude that for large $n$

$| \mathrm{Re} (\sum_{l = 2 q_{n}}^{\infty} 2 π i l \frac{X (ℓ, l)}{e^{l}} e^{2 π i l x_{0}}) | \leq 2 π \sum_{l = 2 q_{n}}^{\infty} l \frac{| X (ℓ, l) |}{e^{l}} \leq 2 π ℓ \sum_{l = 2 q_{n}}^{\infty} \frac{l}{e^{l}} \leq \frac{2 π ℓ}{e^{1.5 q_{n}}}$ .

Finally, from (2.6) and (2.2), we see that

$| X (ℓ, q_{n}) - | X (ℓ, q_{n}) | | \leq 2 | X (ℓ, q_{n}) | | \arg (X (ℓ, q_{n})) | \leq 2 π \frac{ℓ - 1}{q_{n + 1}} ℓ$ ,

and so since $ℓ \leq 2 e^{2 q_{n}^{'}}$ and $q_{n + 1} \geq e^{3 q_{n}^{'}}$ , this implies that

$| \mathrm{Re} (2 π i q_{n} \frac{X (ℓ, q_{n}) - | X (ℓ, q_{n}) |}{e^{q_{n}}} e^{2 π i q_{n} x_{0}}) | \leq 8 π^{2} e^{- q_{n}^{'}} \frac{ℓ q_{n}}{e^{q_{n}}}$ .

By combining all of these bounds,

$\frac{2 ℓ q_{n}}{e^{q_{n}}} - 2 π q_{n}^{2} - 4 π q_{n}^{2} - \frac{2 π ℓ}{e^{1.5 q_{n}}} - 8 π^{2} e^{- q_{n}^{'}} \frac{ℓ q_{n}}{e^{q_{n}}} \leq | \frac{\partial g_{ℓ} (x, y)}{\partial x} (x_{0}) |$

$\leq \frac{2 π ℓ q_{n}}{e^{q_{n}}} + 2 π q_{n}^{2} + 4 π q_{n}^{2} + \frac{2 π ℓ}{e^{1.5 q_{n}}} + 8 π^{2} e^{- q_{n}^{'}} \frac{ℓ q_{n}}{e^{q_{n}}}$ ,

and since $ℓ \geq \frac{1}{2} e^{2 q_{n}}$ , for large $n$ we have

$\frac{ℓ q_{n}}{e^{q_{n}}} \leq | \frac{\partial g_{ℓ} (x, y)}{\partial x} (x_{0}) | \leq \frac{7 ℓ q_{n}}{e^{q_{n}}}$ .

$□$

The proof of the following fact is entirely similar:

Lemma 2.4.4. For any sufficiently large integer $n > 0$ , $x \in T$ , $ℓ \in [\frac{1}{2} e^{2 q_{n}^{'}}, 2 e^{2 q_{n + 1}}]$ ,

and any $y_{0} \in T$ with $q_{n}^{'} y_{0} \in [\frac{1}{6}, \frac{1}{3}]$ ,

$\frac{ℓ q_{n}^{'}}{e^{q_{n}^{'}}} < | \frac{\partial g_{ℓ} (x, y)}{\partial y} (y_{0}) | < \frac{7 ℓ q_{n}^{'}}{e^{q_{n}^{'}}}$ .

Proof of Theorem 2.4.2: Consider any $m \in [\frac{5}{3} e^{2 q_{n}}, 2 e^{2 q_{n}^{'}}]$ , and any cubes $R =$

$\prod_{i = 1}^{k + 3} R_{i}$ and $R^{'} = \prod_{i = 1}^{k + 3} R_{i}^{'}$ where $R_{i}$ and $R_{i}^{'}$ are intervals of length $C$ for $1 \leq i \leq k + 3$ .

Take intervals $Q_{k + 1}$ and $Q_{k + 2}$ of length $\frac{C}{2}$ central in $R_{k + 1}$ and $R_{k + 2}$ respectively.

Define $\bar{R} = \prod_{i = 1}^{k} R_{i}$ . We also make the following definition: $ℓ (m, x, y, t)$ is the integer

$ℓ$ such that $g_{ℓ} (x, y) \leq m + t < g_{ℓ + 1} (x, y)$ . Alternately, for any $v \in T^{k}$ ,

$T^{m} (v, x, y, t) = (S^{ℓ (m, x, y, t)} v, x + ℓ (m, x, y, t) γ, y + ℓ (m, x, y, t) γ^{'}, m + t - g_{ℓ (m, x, y, t)} (x, y))$ .

Now, fix any $y \in Q_{k + 2}$ and $t \in Q_{k + 3}$ . We first define the set

$E_{1} = \{x_{1} \in Q_{k + 1} : q_{n} x_{1} \in [\frac{1}{6}, \frac{1}{3}]\}$ .

For large $m$ , $λ_{1} (E_{1}) > D_{1} C$ for some constant $D_{1} > 0$ , and $E_{1}$ is a union of many

intervals where all but the first and last have length $\frac{1}{6 q_{n}}$ . By removing those, we can

define $E_{1}^{'} \subseteq E_{1}$ a union of intervals of length $\frac{1}{6 q_{n}}$ where $λ_{1} (E_{1}^{'}) > D_{2} C$ for some constant

$D_{2} > 0$ . We then define the set

$E_{2} = \{x_{1} \in E_{1}^{'}$ : $λ_{1} ((R_{1} + ℓ (m, x_{1}, y, t) α) \cap R_{1}^{'}) > \frac{C}{2}$ , $Q_{k + 1} + ℓ (m, x_{1}, y, t) γ \subset R_{k + 1}^{'}$ ,

$Q_{k + 2} + ℓ (m, x_{1}, y, t) γ^{'} \subset R_{k + 2}^{'}\}$ .

We wish to use Lemma 2.4.3 to analyze the structure of $E_{2}$ . First we note that since

$1 < g (x, y) < 3$ for all $x$ , $y$ , by definition of $ℓ (m, x, y, t)$ , $ℓ (m, x, y, t) \leq m + t <$

$3 (ℓ (m, x, y, t) + 1)$ , and so for large enough $n$ , $ℓ (m, x, y, t) \leq m \leq \frac{10}{3} ℓ (m, x, y, t)$ , or

$ℓ (m, x, y, t) \in [.3 m, m]$ . By our assumption on $m$ , this means that for any $x$ , $y$ , $t$ ,

$ℓ (m, x, y, t) \in [\frac{1}{2} e^{2 q_{n}}, 2 e^{2 q_{n}^{'}}]$ . Now, fix any interval $I$ of length $\frac{1}{6 q_{n}}$ in $E_{1}^{'}$ , and let

us examine $E_{2} \cap I$ . By the preceding remarks and Lemma 2.4.3, $\frac{ℓ (m, x_{1}, y, t) q_{n}}{e^{q_{n}}} <$

$| \frac{\partial π_{k + 3} T^{m} (v, x, y, t)}{\partial x} (x_{1}) | < \frac{7 ℓ (m, x_{1}, y, t) q_{n}}{e^{q_{n}}}$ for every $x_{1} \in I$ . Since $ℓ (m, x, y, t) \in [.3 m, m]$ ,

$\frac{.3 m q_{n}}{e^{q_{n}}} < | \frac{\partial π_{k + 3} T^{m} (v, x, y, t)}{\partial x} (x_{1}) | < \frac{7 m q_{n}}{e^{q_{n}}}$ (2.7)

for every $x_{1} \in I$ . This means that $\frac{\partial π_{k + 3} T^{m} (v, x, y, t)}{\partial x}$ has the same sign for all $x_{1} \in I$ ,

and without loss of generality we assume it to be positive. Let us define $L$ to be the

set of possible values for $ℓ (m, x_{1}, y, t)$ for $x_{1} \in I$ . (Since $g_{ℓ}$ is continuous for every

$ℓ$ , $L$ is an interval of integers.) Then for any fixed $ℓ \in L$ , define $I_{ℓ} = \{x_{1} \in I$ :

$ℓ (m, x_{1}, y, t) = ℓ\} = \{x_{1} \in I : g_{ℓ} (x_{1}, y) \leq m + t < g_{ℓ + 1} (x_{1}, y)\} = \{x_{1} \in I$ :

$g_{ℓ} (x_{1}, y) \in (m + t - g (x_{1} + ℓ γ, y + ℓ γ^{'}), m + t]\}$ . Since $1 < g (x_{1} + ℓ γ, y + ℓ γ^{'}) < 3$ ,

by (2.7) $\frac{1}{7 m q_{n} / e^{q_{n}}} < λ_{1} (I_{ℓ}) < \frac{3}{.3 m q_{n} / e^{q_{n}}}$ , or $λ_{1} (I_{ℓ}) \in (\frac{1}{7} \frac{e^{q_{n}}}{m q_{n}}, 10 \frac{e^{q_{n}}}{m q_{n}})$ for all $ℓ \in L$ except the

smallest and largest, for which $λ_{1} (I_{ℓ})$ could be smaller.

Since $m > e^{2 q_{n}}$ , this means that the number of elements in $L$ approaches infinity

as $m$ does. Now, we note that $E_{2} \cap I = \cup_{ℓ \in L^{'}} I_{ℓ}$ , where $L^{'}$ is the set of $ℓ \in L$

where $λ_{1} ((R_{1} + ℓ α) \cap R_{1}^{'}) > \frac{C}{2}$ , $Q_{k + 1} + ℓ γ \subset R_{k + 1}^{'}$ , and $Q_{k + 2} + ℓ γ^{'} \subseteq R_{k + 2}^{'}$ . Then

$\frac{| L^{'} |}{| L |} = \frac{1}{| L |} \sum_{ℓ \in L} χ_{J} (R_{α, γ, γ^{'}}^{ℓ} (0))$ , where $R_{α, γ, γ^{'}}$ is the rotation on $T^{3}$ given by $(a, b, c) \mapsto$

$(a + α, b + γ, c + γ^{'})$ and $J = \{(a, b, c) \in T^{3}$ : $λ_{1} ((R_{1} + a) \cap R_{1}^{'}) > \frac{C}{2}$ , $Q_{k + 1} + b \subset$

$R_{k + 1}^{'}$ , $Q_{k + 2} + c \subset R_{k + 2}^{'}\}$ . Since $α$ , $γ$ , and $γ^{'}$ are rationally independent, $R_{α, γ, γ^{'}}$ is

uniquely ergodic, and so as $m \to \infty$ , $\frac{| L^{'} |}{| L |} \to λ_{3} (J) = \frac{C^{3}}{4}$ .

Due to the already established bounds on $λ_{1} (I_{ℓ})$ for $ℓ \in L$ , this implies that there is

a constant $D_{3} > 0$ so that $λ_{1} (E_{2} \cap I) > D_{3} C^{3} λ_{1} (I)$ for every interval $I$ in $E_{1}^{'}$ . By

removing the possibly shorter first and last subintervals of $E_{2} \cap I$ , we have $F_{I} \subseteq E_{2} \cap I$

a union of intervals of length greater than $\frac{1}{7} \frac{e^{q_{n}}}{m q_{n}}$ with $λ_{1} (F_{I}) > D_{4} C^{3} λ_{1} (I)$ for some

constant $D_{4} > 0$ . We take $F_{2}$ to be the union of all $F_{I}$ , and then $λ_{1} (F_{2}) > D_{5} C^{4}$ for

some constant $D_{5} > 0$ . Finally, we define

$E_{3} = {x_{1} \in F_{2} : π_{k + 3} (T^{m} (v, x_{1}, y, t)) \in R_{k + 3}^{'} \forall v \in T^{k}}$ .

Note that $F_{2}$ is a union of intervals $I_{ℓ}$ , and so fix any such interval $I_{ℓ}$ of length greater

than $\frac{1}{7} \frac{e^{q_{n}}}{m q_{n}}$ . By definition, $π_{k + 3} (T^{m} (v, x_{1}, y, t)) = m + t - g_{ℓ} (x_{1}, y)$ for any $x_{1} \in I_{ℓ}$ , and

$π_{k + 3} (T^{m} (v, x_{1}, y, t))$ ranges monotonically from 0 to $g (x_{1} + ℓ γ, y + ℓ γ^{'})$ as $x_{1}$ increases

over $I_{ℓ}$ . Since, by Lemma 2.4.3, $\frac{ℓ q_{n}}{e^{q_{n}}} < | \frac{\partial g_{ℓ} (x, y)}{\partial x} (x_{1}) | < \frac{7 ℓ q_{n}}{e^{q_{n}}}$ , $λ_{1} (E_{3} \cap I_{ℓ}) > D_{6} C λ_{1} (I_{ℓ})$

for some constant $D_{6} > 0$ . By taking the union over all $I_{ℓ}$ , $λ_{1} (E_{3}) > D_{7} C^{5}$ for some

$D_{7} > 0$ .

Consider any $x_{1} \in E_{3}$ . We know that $λ_{1} ((R_{1} + ℓ (m, x_{1}, y, t) α) \cap R_{1}^{'}) > \frac{C}{2}$ , and by

the proof of Theorem 2.4.1, this implies that if we define the set $A_{x_{1}} = \{v \in \bar{R}$ :

$π_{i} (T^{m} (v, x_{1}, y, t)) \in R_{i}^{'}$ , $1 \leq i \leq k\}$ , then $λ_{k} (A_{x_{1}}) > B_{k} C^{2 k - 1}$ for some constant

$B_{k} > 0$ . But this means that for any $v \in A_{x_{1}}$ ,

$T^{m} (v, x_{1}, y, t)$

$= (S^{ℓ (m, x_{1}, y, t)} v, x_{1} + ℓ (m, x_{1}, y, t) γ, y + ℓ (m, x_{1}, y, t) γ^{'}, m + t - g_{ℓ (m, x_{1}, y, t)} (x_{1}, y))$

is in $R^{'}$ by definitions of $E_{3}$ and $A_{x_{1}}$ . So, there exists a constant $D_{8} > 0$ such

that $λ_{k + 1} (\{(v, x_{1}) : T^{m} (v, x_{1}, y, t) \in R^{'}\}) > D_{8} C^{2 k + 4}$ . By integrating over all

possible $y \in Q_{k + 2}$ , $t \in Q_{k + 3}$ , we see that there is a constant $D_{9} > 0$ such that

$μ ((T^{m} R) \cap R^{'}) > \frac{D_{9}}{4} C^{2 k + 6} = D_{9} μ (R) μ (R^{'})$ .

This argument works for any large enough $m \in [\frac{5}{3} e^{2 q_{n}}, 2 e^{2 q_{n}^{'}}]$ for some $n$ . An analogous

argument using Lemma 2.4.4 and which involves varying $y$ instead of $x$ shows that

the same is true for any large enough $m \in [\frac{5}{3} e^{2 q_{n}^{'}}, 2 e^{2 q_{n + 1}}]$ . Together, this implies that

for all sufficiently large $m$ and any congruent cubes $R$ and $R^{'}$ , $μ ((T^{m} R) \cap R^{'}) >$

$D_{9} μ (R) μ (R^{'})$ , which implies that $(X, T)$ is totally ergodic with respect to $μ$ and

topologically mixing. It remains to show that $(X, T)$ is in fact totally uniquely ergodic.

Our proof is similar to the argument of Furstenberg used earlier; however, since this is

about a flow under a function and his argument was about skew products, we present

the proof in its entirety here. Fix any $M \in N$ . Since $T^{M}$ is ergodic with respect to $μ$ ,

$μ$-a.e. point of $X$ is $(T^{M}, μ)$ -generic. Since $μ$ is shift-invariant, and since shifts

in the last coordinate commute with $T^{M}$ , if a point $(v, x, y, t) \in X$ is $(T^{M}, μ)$ -generic,

$(v, x, y, t^{'})$ is as well for all $0 \leq t^{'} < g (x, y)$ . This implies that for $λ_{k + 2}$-a.e. $(v, x, y)$ ,

the fiber ${(v, x, y, t)}_{0 \leq t < g (x, y)}$ consists of $(T^{M}, μ)$ -generic points. Denote by $G$ this set

of $(v, x, y)$ which give rise to $(T^{M}, μ)$ -generic fibers. Choose any $(v, x, y, t) \in X$ . We

know that for large $n$ , $ℓ (n M, x, y, t) \in [.3 n M, n M]$ , and that $ℓ (n M, x, y, t)$ is increasing

in $n$ . Therefore, the set $\{ℓ (n M, x, y, t) : n \in N\}$ has positive density. The skew

product $U$ on $T^{k + 2}$ defined by $U (v, x, y) = (S_{k, α, f} v, x + γ, y + γ^{'})$ is totally uniquely

ergodic with respect to $λ_{k + 2}$ for the same reason as in the proof of Theorem 2.4.1. (The

only difference is that here the base case is the rotation on $T^{3}$ given by $(x, y, z) \mapsto$

$(x + α, y + γ, z + γ^{'})$ instead of an irrational rotation on $T$ . However, since $α$ , $γ$ ,

and $γ^{'}$ are rationally independent, this rotation is also totally uniquely ergodic and

totally minimal.) This implies that the set $\{ℓ : U^{ℓ} (v, x, y) \in G\}$ has density one.

Together, these facts imply that there exists $n$ such that $U^{ℓ (n M, x, y, t)} (v, x, y) \in G$ , or

that $T^{n M} (v, x, y, t) = (g, s)$ for some $g \in G$ and $s \in R$ . This means that $T^{n M} (v, x, y, t)$

is $(T^{M}, μ)$ -generic, and so $(v, x, y, t)$ is as well. Since $(v, x, y, t) \in X$ was arbitrary,

every point in $X$ is $(T^{M}, μ)$ -generic, and so $T^{M}$ is uniquely ergodic with respect to

$μ$ . Since $μ (U) > 0$ for every nonempty open set $U$ , $T^{M}$ is minimal as well. Since $M$

was arbitrary, $(X, T)$ is totally uniquely ergodic, totally minimal, and topologically

mixing.

$□$

2.5 Some counterexamples on connected manifolds

Proof of Theorem 2.1.19: Our transformation is $T_{2 d + 7, α, γ, γ^{'}, f}$ for properly chosen

$α$ , $γ$ , $γ^{'}$ , and $f$ . We always take $α$ to be the golden ratio $\frac{\sqrt{5} - 1}{2}$ because of a classical

fact from the theory of continued fractions:

Lemma 2.5.1. For any $n \in N$ , the distance from $n α$ to the nearest integer is greater

than $\frac{1}{3 n}$ .
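The lemma can be checked numerically over a finite range. The sketch below computes the smallest value of $n$ times the distance from $n α$ to the nearest integer for $1 \leq n \leq 10000$ ; the upper limit is an arbitrary choice for illustration, and a finite check is of course no substitute for the proof.

```python
import math

alpha = (math.sqrt(5) - 1) / 2  # the value of alpha fixed in the text

def dist_to_nearest_integer(x):
    """Distance from the real number x to the nearest integer."""
    return abs(x - round(x))

# Smallest value of n * ||n alpha|| over the tested range.
# The lemma asserts that this quantity exceeds 1/3 for every n.
worst = min(n * dist_to_nearest_integer(n * alpha) for n in range(1, 10001))
```

Over this range the minimum is attained at $n = 1$ , where it equals $1 - α \approx 0.382$ .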

$γ$ and $γ^{'}$ can be any irrational elements of $T$ satisfying the hypotheses of

Theorem 2.4.2. What remains is to define $f$ . Before doing so, we use our sequence $\{p_{n}\}$ to

construct another increasing sequence of integers. For any $n$ , take $w_{n} = ℓ (p_{n}, 0, 0, 0)$ .

In other words, for any $v \in T^{2 d + 4}$ , $T^{p_{n}} (v, 0, 0, 0) = (S^{w_{n}} (v), w_{n} γ, w_{n} γ^{'}, p_{n} - g_{w_{n}} (0, 0))$ .

(Here $S = S_{2 d + 4, α, f}$ and $T = T_{2 d + 7, α, γ, γ^{'}, f}$ .) As before, for large $n$ , $w_{n} \in [.3 p_{n}, p_{n}]$ .

We claim that $w_{n + 1} < (w_{n + 1} - w_{n})^{d + 1}$ for all large enough $n$ . This is because

$w_{n + 1} - w_{n} = ℓ (p_{n + 1} - p_{n}, w_{n} γ, w_{n} γ^{'}, p_{n} - g_{w_{n}} (0, 0))$ :

$(S^{w_{n + 1}} v, w_{n + 1} γ, w_{n + 1} γ^{'}, p_{n + 1} - g_{w_{n + 1}} (0, 0)) = T^{p_{n + 1}} (v, 0, 0, 0)$

$= T^{p_{n + 1} - p_{n}} (T^{p_{n}} (v, 0, 0, 0)) = T^{p_{n + 1} - p_{n}} (S^{w_{n}} v, w_{n} γ, w_{n} γ^{'}, p_{n} - g_{w_{n}} (0, 0))$

$= (S^{ℓ (p_{n + 1} - p_{n}, w_{n} γ, w_{n} γ^{'}, p_{n} - g_{w_{n}} (0, 0))} (S^{w_{n}} (v))$ ,

$(w_{n} + ℓ (p_{n + 1} - p_{n}, w_{n} γ, w_{n} γ^{'}, p_{n} - g_{w_{n}} (0, 0))) γ$ ,

$(w_{n} + ℓ (p_{n + 1} - p_{n}, w_{n} γ, w_{n} γ^{'}, p_{n} - g_{w_{n}} (0, 0))) γ^{'}, p_{n + 1} - g_{w_{n + 1}} (0, 0))$ .

Therefore, since $p_{n + 1} - p_{n} \to \infty$ , $w_{n + 1} - w_{n} \in [.3 (p_{n + 1} - p_{n}), p_{n + 1} - p_{n}]$ for large $n$ for

the same reasons as before. This implies for large $n$ that $w_{n + 1} \leq p_{n + 1} < (p_{n + 1} - p_{n})^{d} \leq$

$(\frac{10}{3} (w_{n + 1} - w_{n}))^{d} = (\frac{10}{3})^{d} (w_{n + 1} - w_{n})^{d} < (w_{n + 1} - w_{n})^{d + 1}$ since $w_{n + 1} - w_{n} \to \infty$ . We

wish to choose $f$ so that $S^{w_{n}} (0)$ is bounded away from 0, where $0 \in T^{2 d + 4}$ is the zero

vector.

We define $f$ via an infinite sum: $F (v_{1}) = v_{1} + \sum_{i = 1}^{\infty} c_{i} s_{x_{i}, ε_{i}} (v_{1})$ is a function from $T$

to $R$ , and $f (v_{1}) = F (v_{1}) \pmod{1}$ is a self-map of $T$ . In this sum, $c_{i} \in R^{+}$ with

$\sum_{i = 1}^{\infty} c_{i} < 1$ , $x_{i} \in T$ and $ε_{i} > 0$ will be chosen later, and the function $s_{x, ε} (y)$ for any

$x \in T$ and $ε > 0$ is a function defined by

$s_{x, ε} (y) =$

The pertinent properties of $s_{x, ε}$ are that it is nonzero only on the interval $[x - ε, x + ε]$ ,

it attains a maximum of $\frac{ε}{π}$ at $y = x$ , and that its derivative is bounded from

above in absolute value by $\frac{1}{2}$ . Since each term $c_{i} s_{x_{i}, ε_{i}}$ in the definition of $F$ is a

differentiable function with derivative bounded from above in absolute value by $\frac{c_{i}}{2}$ ,

and since $\sum_{i = 1}^{\infty} c_{i} < 1$ and the identity function has derivative one everywhere, $F$ is

a differentiable function with $F^{'} (v_{1}) \in (\frac{1}{2}, \frac{3}{2})$ for all $v_{1} \in T$ . This shows that for any

choice of $c_{i}$ with $\sum_{i = 1}^{\infty} c_{i} < 1$ , and for any choice of $x_{i}$ , $ε_{i}$ , by Theorem 2.4.2, $(X, T)$ is

totally minimal, totally uniquely ergodic, and topologically mixing.
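To make the three stated properties of $s_{x, ε}$ concrete, the sketch below checks them for one candidate bump, $\frac{ε}{2 π} (1 + \cos (π (y - x) / ε))$ on $[x - ε, x + ε]$ and zero elsewhere. This formula is an assumption introduced here for illustration: it has the listed properties, but it is not claimed to be the definition used in the construction. The values of `x0` and `eps0` are likewise arbitrary.

```python
import math

def s(y, x, eps):
    """Hypothetical bump: (eps / (2 pi)) * (1 + cos(pi (y - x) / eps)) on [x - eps, x + eps], else 0."""
    u = y - x
    if abs(u) > eps:
        return 0.0
    return eps / (2 * math.pi) * (1 + math.cos(math.pi * u / eps))

def s_prime(y, x, eps):
    """Derivative of the bump above; it vanishes at both endpoints of the support."""
    u = y - x
    if abs(u) > eps:
        return 0.0
    return -0.5 * math.sin(math.pi * u / eps)

x0, eps0 = 0.3, 0.05  # illustrative centre and half-width
# Sample points covering [x0 - 1.5 eps0, x0 + 1.5 eps0], including y = x0 itself.
samples = [x0 + eps0 * (i / 500 - 1.5) for i in range(1501)]
peak = max(s(y, x0, eps0) for y in samples)
max_slope = max(abs(s_prime(y, x0, eps0)) for y in samples)
```

With any such bump, each term $c_{i} s_{x_{i}, ε_{i}}$ changes the derivative of $F$ by at most $\frac{c_{i}}{2}$ , which is how the condition $\sum c_{i} < 1$ keeps the derivative strictly between $\frac{1}{2}$ and $\frac{3}{2}$ .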

We wish to choose $f$ so that $S^{w_{n}} (0)$ is bounded away from 0. The only quantities still

to be chosen are $c_{i}$ , $x_{i}$ and $ε_{i}$ . We note that for any $1 < k \leq 2 d + 4$ and any $j \geq k - 1$ ,

$π_{k} (S^{j} (0)) = \sum_{i = 0}^{j - k + 1} \binom{j - i - 1}{k - 2} f (i α)$ . This can be proved by a quick induction, and is

left to the reader. In particular, if we make the notation $y_{n} = π_{2 d + 4} (S^{w_{n}} (0))$ for any

$w_{n} \geq 2 d + 3$ , then $y_{n} = \sum_{i = 0}^{w_{n} - 2 d - 3} \binom{w_{n} - i - 1}{2 d + 2} f (i α)$ . Our goal is to choose $c_{i}$ , $x_{i}$ , and $ε_{i}$

so that $y_{n} = \frac{1}{3}$ for all sufficiently large $n$ . To do this, we choose $x_{n} = w_{n} α$ for all

$n$ , and $ε_{n} = \inf_{0 \leq i < w_{n + 1}, i \neq w_{n}} | x_{n} - i α | > \frac{1}{3 w_{n + 1}}$ by Lemma 2.5.1. This guarantees that

$s_{x_{n}, ε_{n}} (i α) = 0$ for any $0 \leq i < w_{n + 1}$ , $i \neq w_{n}$ . This means that each choice of $c_{i}$ that

we make changes the values of $y_{n}$ for only $n > i$ , and allows us to finally inductively

define $c_{i}$ .

Recall that our goal is to ensure that $y_{n} = \sum_{i = 0}^{w_{n} - 2 d - 3} \binom{w_{n} - i - 1}{2 d + 2} f (i α) = \frac{1}{3}$ for all

sufficiently large $n$ . Note that since $\{w_{n}\}$ is an increasing sequence of integers, $w_{n} \geq n$

for all $n$ . We have already shown that $w_{n} < (w_{n} - w_{n - 1})^{d + 1}$ for all large $n$ , and so

$w_{n} - w_{n - 1} > n^{\frac{1}{d + 1}}$ , and so $w_{n} = w_{1} + \sum_{i = 2}^{n} (w_{i} - w_{i - 1}) > \sum_{i = 2}^{n} i^{\frac{1}{d + 1}} > \frac{n}{2} (\frac{n}{2})^{\frac{1}{d + 1}} >$

$\frac{1}{4} n^{1 + \frac{1}{d + 1}}$ for all large $n$ , and so $\sum_{n = 1}^{\infty} w_{n}^{- 1}$ converges, a fact which will be important

momentarily. We choose $N$ so that $\sum_{n = N + 1}^{\infty} w_{n}^{- 1} < \frac{1}{6 π (2 d + 2)!}$ . The procedure for

defining the sequence $c_{n}$ is then as follows: $c_{i} = 0$ for $1 \leq i \leq N$ . For any $n > N$ ,

assume that $y_{i} = \frac{1}{3}$ for $N + 1 < i \leq n$ . Then, we choose $c_{n}$ so that $y_{n + 1} = \frac{1}{3}$ . Note that

for any $n > 1$ , $y_{n} = h_{n} (c_{1}, ..., c_{n - 2}) + c_{n - 1} \frac{1}{π} ε_{n - 1} \binom{w_{n} - w_{n - 1} - 1}{2 d + 2}$ for some $h_{n}$ : $T^{n - 2} \to T$ .

This means that taking $c_{n} = π \frac{(\frac{1}{3} - h_{n + 1} (c_{1}, ..., c_{n - 1})) \pmod{1}}{ε_{n} \binom{w_{n + 1} - w_{n} - 1}{2 d + 2}}$ gives $y_{n + 1} = \frac{1}{3}$ . Note that

$\binom{w_{n + 1} - w_{n} - 1}{2 d + 2} > \frac{1}{2 (2 d + 2)!} (w_{n + 1} - w_{n})^{2 d + 2}$ for large $n$ , and recall that $ε_{n} > \frac{1}{3 w_{n + 1}}$ for all

$n$ by Lemma 2.5.1. This means that $c_{n} < 6 π (2 d + 2)! w_{n + 1} (w_{n + 1} - w_{n})^{- (2 d + 2)}$ , which,

by the hypothesis on the sequence $\{w_{n}\}$ , is less than $6 π (2 d + 2)! (w_{n + 1})^{- 1}$ , again for

sufficiently large $n$ . This means that $\sum_{n = 1}^{\infty} c_{n} < 6 π (2 d + 2)! \sum_{n = N + 1}^{\infty} w_{n + 1}^{- 1} < 1$ by

definition of $N$ . We have then chosen $c_{n}$ so that $d (S^{w_{n}} (0), 0) \geq d (y_{n}, 0) = \frac{1}{3}$ for all

$n > N$ . For $n \leq N$ , note that $π_{1} (S^{w_{n}} (0)) = w_{n} α \neq 0$ , since $α \notin Q$ . This means that

for all $n$ , $d (S^{w_{n}} (0), 0) \geq \min (\frac{1}{3}, \min_{1 \leq n \leq N} d (w_{n} α, 0)) > 0$ .
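The estimate on $c_{n}$ rests on a lower bound for the binomial coefficient with upper entry $w_{n + 1} - w_{n} - 1$ and lower entry $2 d + 2$ , namely that it exceeds $\frac{1}{2 (2 d + 2)!} (w_{n + 1} - w_{n})^{2 d + 2}$ once the gap $w_{n + 1} - w_{n}$ is large. The sketch below checks this in exact integer arithmetic and locates the first gap from which it holds; the function names and the search limit are illustrative assumptions.

```python
from math import comb, factorial

def bound_holds(gap, d):
    """Exact check of C(gap - 1, 2d + 2) > gap^(2d + 2) / (2 (2d + 2)!)."""
    m = 2 * d + 2
    return 2 * factorial(m) * comb(gap - 1, m) > gap ** m

def threshold(d, search_limit=10_000):
    """Smallest gap from which the bound holds for every gap up to search_limit."""
    failures = [gap for gap in range(1, search_limit + 1) if not bound_holds(gap, d)]
    return max(failures, default=0) + 1
```

For example, with $d = 1$ the bound fails at a gap of $16$ , where $48 \cdot \binom{15}{4} = 65520 < 16^{4} = 65536$ , and holds from $17$ onward within the searched range.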

However, by definition, $π_{2 d + 4} (T^{p_{n}} (0, 0, 0, 0)) = π_{2 d + 4} (S^{w_{n}} (0))$ for all $n$ . Therefore,

$T^{p_{n}} (0, 0, 0, 0)$ is bounded away from (0, 0, 0, 0), and so since $(X, T)$ is totally uniquely

ergodic, totally minimal, and topologically mixing, we are done.

$□$

We note that there was nothing special about the number $\frac{1}{3}$ in this proof, and so the

proof of the following corollary is trivially similar:

Corollary 2.5.2. For any increasing sequence $\{w_{n}\}$ of integers with the property that

for some integer $d$ , $w_{n + 1} < (w_{n + 1} - w_{n})^{d + 1}$ for all sufficiently large $n$ , and for any

sequence $\{z_{n}\} \subseteq T$ , there exists $f$ satisfying the hypotheses of Theorem 2.4.1 so that

for all sufficiently large $n$ , $π_{2 d + 4} ((S_{2 d + 4, α, f})^{w_{n}} (0)) = z_{n}$ .

Proof of Theorem 2.1.16: Our transformation is $T_{2 d + 9, α, γ, γ^{'}, f}$ for the same $α$ , $γ$ ,

and $γ^{'}$ as before. We use the same strategy as we did to prove Theorem 2.1.15; in

other words, we will be forcing certain types of nonrecurrence behavior along a set

comprised of a union of infinitely many shifted subsequences of ${p_{n}}$ .

We now proceed roughly as we did in proving Theorem 2.1.15. We define a sequence

${t_{n}}$ by shifting different $p_{n}$ by different amounts. First, define the intervals of integers

$B_{j} = [j! + 1, (j + 1)!] \cap N$ for every $j \in N$ , and take any partition of $N$ into infinitely

many disjoint infinite sets $C_{1}$ , $C_{2}$ , $\ldots$ . We denote the elements of $C_{i}$ , written in

increasing order, by $c_{i}^{(1)}$ , $c_{i}^{(2)}$ , $\ldots$ . Choose some $s_{1}$ large enough so that $p_{n + 1} - p_{n} > 2 \cdot 1$

for $n \geq (\inf C_{s_{1}})!$ , define the set $D_{1} = \cup_{j \in C_{s_{1}}} B_{j}$ , and then define $t_{n} = p_{n} + 1$ for all

$n \in D_{1}$ . Next, choose some $s_{2}$ large enough so that $p_{n + 1} - p_{n} > 2 \cdot 2$ for $n \geq (\inf C_{s_{2}})!$ ,

and define $D_{2} = \cup_{j \in C_{s_{2}}} B_{j}$ , and then define $t_{n} = p_{n} + 2$ for all $n \in D_{2}$ . Continuing in

this way, we may inductively define $D_{k}$ for all $k \in N$ so that $D_{k} = \cup_{j \in C_{s_{k}}} B_{j}$ for some

$s_{k}$ with the property that $p_{n + 1} - p_{n} > 2 k$ for $n \geq (\inf C_{s_{k}})!$ , and then define $t_{n} = p_{n} + k$

for all $n \in D_{k}$ . For any $n \notin \cup_{k = 1}^{\infty} D_{k}$ , $t_{n} = p_{n}$ . Note that by the construction, for

any $n$ , $k$ where $t_{n} = p_{n} + k$ , it must be the case that $n \in D_{k}$ , and therefore that

$n > (\inf C_{s_{k}})!$ , and so that $p_{n + 1} - p_{n} > 2 k$ . Therefore, since $t_{n + 1} \geq p_{n + 1}$ for all $n$ ,

$t_{n + 1} - t_{n} \geq p_{n + 1} - p_{n} - k > \frac{p_{n + 1} - p_{n}}{2}$ , and so $\{t_{n}\}$ is increasing. Since $n - 1 \geq (\inf C_{s_{k}})!$ ,

$p_{n} - p_{n - 1} > 2 k$ , and so in particular $p_{n} > 2 k$ . This implies that $t_{n} = p_{n} + k < 2 p_{n}$

for all $n$ . Finally, we see that $t_{n + 1} < 2 p_{n + 1} < 2 (p_{n + 1} - p_{n})^{d} < 2^{d + 1} (t_{n + 1} - t_{n})^{d}$ for

all large enough $n$ . Since $t_{n + 1} - t_{n} > \frac{p_{n + 1} - p_{n}}{2} \to \infty$ as $n \to \infty$ , this means that

$t_{n + 1} < (t_{n + 1} - t_{n})^{d + 1}$ for large enough $n$ .

We again define a sequence $\{w_{n}\}$ : for any $n$ , take $w_{n} = ℓ (t_{n}, 0, 0, 0)$ . In other words,

for any $v \in T^{2 d + 6}$ , $T^{t_{n}} (v, 0, 0, 0) = (S^{w_{n}} (v), w_{n} γ, w_{n} γ^{'}, t_{n} - g_{w_{n}} (0, 0))$ . (Here $S =$

$S_{2 d + 6, α, f}$ and $T = T_{2 d + 9, α, γ, γ^{'}, f}$ .) For exactly the same reasons as in the proof of

Theorem 2.1.19, $w_{n + 1} < (w_{n + 1} - w_{n})^{d + 2}$ for large $n$ .

Therefore, by using Corollary 2.5.2, for any sequence $\{z_{n}\} \subseteq T$ , we may choose $f$

such that $π_{2 d + 6} (T^{t_{n}} (0, 0, 0, 0)) = π_{2 d + 6} (S^{w_{n}} (0)) = z_{n}$ for all $n$ . We define $z_{n} = \frac{1}{3}$ for

any $n \in D_{k}$ where $n \in B_{j}$ for $j = c_{s_{k}}^{(i)}$ with odd $i$ , and $z_{n} = \frac{2}{3}$ for any $n \in D_{k}$ where

$n \in B_{j}$ for $j = c_{s_{k}}^{(i)}$ with even $i$ . For any $z_{n}$ not defined by these conditions, $z_{n}$ may

be anything. (We can take $z_{n} = 0$ for such $n$ if it is convenient.) Take $h \in C (X)$

such that $h (v, x, y, t) = 0$ if $v_{2 d + 6} = \frac{2}{3}$ , $h (v, x, y, t) = 1$ if $v_{2 d + 6} = \frac{1}{3}$ , $\inf_{x \in X} h (x) = 0$ ,

and $\sup_{x \in X} h (x) = 1$ . Now, we note that

$\{z \in X : \frac{1}{N} \sum_{n = 0}^{N - 1} h (T^{p_{n}} z) \text{ does not converge}\}$

$\supseteq (\cap_{n > 0} \cup_{k > n} \{z \in X : \frac{| \{i : 1 \leq i \leq k, π_{2 d + 6} (T^{p_{i}} (z)) = \frac{1}{3}\} |}{k} > \frac{3}{4}\}) \cap$

$(\cap_{n > 0} \cup_{k > n} \{z \in X : \frac{| \{i : 1 \leq i \leq k, π_{2 d + 6} (T^{p_{i}} (z)) = \frac{2}{3}\} |}{k} > \frac{3}{4}\})$ ,

and that the latter set, call it $B$ , is clearly a $G_{δ}$ . We will show that $B$ is dense in $X$ .

Choose any nonempty open set $U \subset X$ . By minimality of $T$ , there is some $k$ so that

$T^{k} (0, 0, 0, 0) \in U$ . By construction, $t_{n} = p_{n} + k$ for all $n \in D_{k}$ . Also by construction,

$D_{k} = \cup_{j \in C_{s_{k}}} B_{j}$ , and $π_{2 d + 6} (T^{t_{n}} (0, 0, 0, 0)) = π_{2 d + 6} (T^{p_{n}} (T^{k} (0, 0, 0, 0))) = \frac{1}{3}$ for $n \in B_{j}$ with $j = c_{s_{k}}^{(i)}$

and $i$ odd. But then for any odd integer $i$ , $π_{2 d + 6} (T^{p_{n}} (T^{k} (0, 0, 0, 0))) = \frac{1}{3}$ for any

$n \in [(c_{s_{k}}^{(i)})! + 1, (c_{s_{k}}^{(i)} + 1)!]$ , and so

$\frac{| \{a : 1 \leq a \leq (c_{s_{k}}^{(i)} + 1)!, π_{2 d + 6} (T^{p_{a}} (T^{k} (0, 0, 0, 0))) = \frac{1}{3}\} |}{(c_{s_{k}}^{(i)} + 1)!} \geq \frac{c_{s_{k}}^{(i)}}{c_{s_{k}}^{(i)} + 1}$ ,

which is clearly larger than $\frac{3}{4}$ for sufficiently large $i$ . Similarly, for any even integer

$i$ , $π_{2 d + 6} (T^{p_{n}} (T^{k} (0, 0, 0, 0))) = \frac{2}{3}$ for any $n \in [(c_{s_{k}}^{(i)})! + 1, (c_{s_{k}}^{(i)} + 1)!]$ , and so

$\frac{| \{a : 1 \leq a \leq (c_{s_{k}}^{(i)} + 1)!, π_{2 d + 6} (T^{p_{a}} (T^{k} (0, 0, 0, 0))) = \frac{2}{3}\} |}{(c_{s_{k}}^{(i)} + 1)!} \geq \frac{c_{s_{k}}^{(i)}}{c_{s_{k}}^{(i)} + 1}$ ,

which is also greater than $\frac{3}{4}$ for sufficiently large $i$ . This implies that $T^{k} (0, 0, 0, 0) \in$

$B$ , and so that $B \cap U$ is nonempty. Since $U$ was arbitrary, this shows that $B$ is dense,

and so a dense $G_{δ}$ . Therefore, $B$ is a residual set by the Baire category theorem, and

for every $z \in B$ , $\frac{1}{N} \sum_{n = 0}^{N - 1} h (T^{p_{n}} z)$ does not converge.

$□$

We note that exactly as in the proof of Theorem 2.1.15, this in fact shows that

for a residual set of $z \in X$ , $\lim \inf_{N \to \infty} \frac{1}{N} \sum_{n = 0}^{N - 1} h (T^{p_{n}} z) = 0 = \inf_{x \in X} h (x)$ and

$\lim \sup_{N \to \infty} \frac{1}{N} \sum_{n = 0}^{N - 1} h (T^{p_{n}} z) = 1$ $= \sup_{x \in X} h (x)$ .

2.6 A counterexample about simultaneous recurrence

Given a totally minimal system $(X, T)$ , any point $x \in X$ , and any two positive integers

$r$ and $s$ , since $(X, T^{r})$ and $(X, T^{s})$ are minimal it must be true that $x \in \overline{\{T^{r n} x\}_{n \in N}}$

and $x \in \overline{\{T^{s n} x\}_{n \in N}}$ . It is then the case that there exist sequences of positive integers

${n_{i}}$ and ${m_{i}}$ such that $T^{r n_{i}} x \to x$ and $T^{s m_{i}} x \to x$ . By Theorem 2.1.17, for a

residual set of $x$ , it is possible to choose $n_{i} = m_{i}$ . To prove Theorem 2.1.21, we must

exhibit an example of a totally minimal system for which this residual set is not all

of $X$ .

Proof of Theorem 2.1.21: For any $n$ , we define a symbolic dynamical system

$(X_{n}, σ)$ as follows: define $x_{n} \in Ω = \{0, 1\}^{Z}$ by $x_{n} (i) := χ_{A_{n}} (i α_{n}$ (mod 1) $)$ , where $\{α_{n}\}_{n \in N}$ is any sequence of rationally independent irrational reals, and $A_{n} = \{0\} \cup \bigcup_{i = 1}^{\infty} [\frac{n}{n^{2 i}}, \frac{n + 1}{n^{2 i}})$ . Then, define $σ$ to be the left shift on $Ω$ , and $T$ to be the countable product of $σ$ on $Ω^{N}$ : $T (ω_{1}, ω_{2}, ...) := (σ ω_{1}, σ ω_{2}, ...)$ . $X$ is then defined to be the orbit closure of $x = (x_{1}, x_{2}, ...)$ under $T$ : $X = \overline{\{T^{n} x\}_{n \in Z}}$ .

We claim that $(X, T)$ and $x$ prove Theorem 2.1.21. Consider any $r > s > 0$ , and

choose $n > r$ . Consider a sequence ${n_{i}}$ of integers where $T^{r n_{i}} x \to x$ and $T^{s n_{i}} x \to$

$x$ . Then clearly $σ^{r n_{i}} x_{n} \to x_{n}$ and $σ^{s n_{i}} x_{n} \to x_{n}$ . Since $x_{n} (0) = 1$ , this implies

that $x_{n} (r n_{i}) = x_{n} (s n_{i}) = 1$ for all large enough $i$ . Therefore, $r n_{i} α_{n}$ (mod 1) and $s n_{i} α_{n}$ (mod 1) both lie in $A_{n}$ for all large $i$ . This implies that $r s n_{i} α_{n}$ (mod 1) $\in r A_{n}$ (mod 1) $\cap s A_{n}$ (mod 1) for all large $i$ . However, $\sup r A_{n} = \frac{r (n + 1)}{n^{2}} < 1$ , so $r A_{n}$ (mod 1) $= r A_{n}$ and $s A_{n}$ (mod 1) $= s A_{n}$ . We then see that $r s n_{i} α_{n}$ (mod 1) $\in r A_{n} \cap s A_{n}$ . If $r s n_{i} α_{n}$ (mod 1) $\neq 0$ , then $r s n_{i} α_{n}$ (mod 1) $\in [\frac{r n}{n^{2 j}}, \frac{r (n + 1)}{n^{2 j}}) \cap [\frac{s n}{n^{2 k}}, \frac{s (n + 1)}{n^{2 k}})$ for some positive integers $j$ , $k$ . Since $n > r, s$ , it must be the case that $j = k$ , and then we have a contradiction since $r n = (r - 1) n + n \geq s n + n > s (n + 1)$ . Therefore, $r s n_{i} α_{n}$ (mod 1) $= 0$ , and since $α_{n} \notin Q$ , $n_{i} = 0$ for all large enough $i$ .
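The interval bookkeeping in this step can be checked mechanically. The following is a minimal sketch, not part of the proof: it uses exact rational arithmetic to confirm, for small parameters, that no interval of $r A_{n}$ meets an interval of $s A_{n}$ when $n > r > s > 0$. The function names and the truncation parameter `depth` are illustrative assumptions.

```python
from fractions import Fraction

def interval(mult, n, j):
    """The half-open interval [mult*n/n^(2j), mult*(n+1)/n^(2j)) as exact fractions."""
    scale = Fraction(1, n ** (2 * j))
    return (mult * n * scale, mult * (n + 1) * scale)

def overlaps(a, b):
    """True if the half-open intervals [a0, a1) and [b0, b1) intersect."""
    return max(a[0], b[0]) < min(a[1], b[1])

def scaled_sets_meet(r, s, n, depth=6):
    """Does some interval of r*A_n meet some interval of s*A_n (first `depth` levels only)?"""
    return any(
        overlaps(interval(r, n, j), interval(s, n, k))
        for j in range(1, depth + 1)
        for k in range(1, depth + 1)
    )

# Collect every triple n > r > s > 0 (small range) for which an overlap is found.
found = [
    (r, s, n)
    for n in range(3, 9)
    for r in range(2, n)
    for s in range(1, r)
    if scaled_sets_meet(r, s, n)
]
print(found)
```

Only finitely many levels are examined, so this illustrates the disjointness rather than proving it; the argument in the text covers all $j$ and $k$.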

It remains to be shown that $(X, T)$ is totally minimal. Note that for any $m \in Z$ ,

$(X, T^{m})$ is a system defined similarly to $(X, T)$ , with the sequence of irrationals $\{α_{n}^{'}\}_{n \in N}$ given by $α_{n}^{'} = m α_{n}$ for $n \in N$ . Therefore, it suffices to show that $(X, T)$ is minimal, since the definition of $(X, T)$ requires only that the set of irrationals $\{α_{n}\}_{n \in N}$ is rationally independent, and if this is true, it will be true of $\{α_{n}^{'}\}_{n \in N}$ as well. To show that $(X, T)$ is minimal, we must show that the orbit of any $y = (y_{1}, y_{2}, ...) \in X$ under $T$ is dense in $X$ . It suffices to show that for any finite set of words $\{w_{i}\}_{1 \leq i \leq k}$ with $w_{i}$ a subword of $x_{i}$ for $1 \leq i \leq k$ , there exists $m > 0$ such that $σ^{m} (y_{i}) \in [w_{i}]$ for $1 \leq i \leq k$ .

Choose any $1 \leq i \leq k$ . Since $w_{i}$ is a subword of $x_{i}$ , there exists $p$ such that $σ^{p} x_{i} \in [w_{i}]$ . This implies that if we denote by $ℓ_{i}$ the length of $w_{i}$ , then the set $\bigcap_{j = 0}^{ℓ_{i} - 1} B_{j}^{(i)}$ contains $p α_{i}$ (mod 1), where $B_{j}^{(i)} = (A_{i} - j α_{i})$ (mod 1) if $w_{i} (j) = 1$ and $B_{j}^{(i)} = ((A_{i} - j α_{i}) \bmod 1)^{c}$ if $w_{i} (j) = 0$ . We now claim that the fact that $\bigcap_{j = 0}^{ℓ_{i} - 1} B_{j}^{(i)}$ is nonempty implies that it contains an open interval. Note that for each $j$ , $B_{j}^{(i)}$ is either a union of half-open intervals or a union of half-open intervals with a singleton, and that due to the irrationality of $α_{i}$ and the rationality of all endpoints of intervals in $A_{i}$ , the set of endpoints for intervals in $B_{j}^{(i)}$ is disjoint from the set of endpoints of intervals in $B_{j^{'}}^{(i)}$ , for any $0 \leq j \neq j^{'} < ℓ_{i}$ . Since $\bigcap_{j = 0}^{ℓ_{i} - 1} B_{j}^{(i)} \neq \emptyset$ , there exist $I_{j} \subset B_{j}^{(i)}$ such that $\bigcap_{j = 0}^{ℓ_{i} - 1} I_{j} \neq \emptyset$ , each $I_{j}$ is either a half-open interval or a singleton, at most one $I_{j}$ is a singleton, and the $I_{j}$ have no endpoints in common. This implies that either $\bigcap_{j = 0}^{ℓ_{i} - 1} I_{j}$ contains an interval, implying that $\bigcap_{j = 0}^{ℓ_{i} - 1} B_{j}^{(i)}$ does as well, or that for some $0 \leq j^{'} < ℓ_{i}$ , $I_{j^{'}}$ is the singleton $\{- j^{'} α_{i}$ (mod 1) $\}$ . In this case, $I_{j}$ is not a singleton for $j \neq j^{'}$ , and so $\bigcap_{0 \leq j < ℓ_{i}, j \neq j^{'}} I_{j}$ contains an interval $I$ , and $I_{j^{'}} = \{- j^{'} α_{i}$ (mod 1) $\}$ lies in the interior. Choose $ε$ such that $I_{j^{'}} + ε \subseteq I$ . Since $I_{j^{'}}$ is a singleton, $B_{j^{'}}^{(i)} = (A_{i} - j^{'} α_{i})$ (mod 1). Choose $k$ such that $\frac{n + 1}{n^{2 k}} < ε$ . Then $[\frac{n}{n^{2 k}}, \frac{n + 1}{n^{2 k}}) - j^{'} α_{i}$ (mod 1) $\subset B_{j^{'}}^{(i)}$ , and by our choice of $ε$ , $[\frac{n}{n^{2 k}}, \frac{n + 1}{n^{2 k}}) - j^{'} α_{i}$ (mod 1) $\subseteq I$ as well. Therefore, $([\frac{n}{n^{2 k}}, \frac{n + 1}{n^{2 k}}) - j^{'} α_{i}$ (mod 1) $) \cap (\bigcap_{0 \leq j < ℓ_{i}, j \neq j^{'}} I_{j})$ contains an interval, and since $I_{j} \subset B_{j}^{(i)}$ for $j \neq j^{'}$ and $[\frac{n}{n^{2 k}}, \frac{n + 1}{n^{2 k}}) - j^{'} α_{i}$ (mod 1) $\subset B_{j^{'}}^{(i)}$ , we see that $\bigcap_{j = 0}^{ℓ_{i} - 1} B_{j}^{(i)}$ contains an interval as well.

For each $1 \leq i \leq k$ , denote by $J_{i}$ some interval contained in $\bigcap_{j = 0}^{ℓ_{i} - 1} B_{j}^{(i)}$ . Choose some $δ < \min_{1 \leq i \leq k} | J_{i} |$ . We now note that since the sequence $\{α_{n}\}_{n \in N}$ is rationally independent, the set $\{(n α_{1} \bmod 1, n α_{2} \bmod 1, ..., n α_{k} \bmod 1)\}_{n \in N}$ is dense in $[0, 1)^{k}$ . Therefore, there exists $r$ such that the set $B = \{(n α_{1} \bmod 1, n α_{2} \bmod 1, ..., n α_{k} \bmod 1)\}_{0 \leq n < r}$ is $δ$ -dense, meaning that every point in $[0, 1)^{k}$ is within a distance of less than $δ$ from some point in $B$ . This means that $\{(n α_{1} \bmod 1, n α_{2} \bmod 1, ..., n α_{k} \bmod 1)\}_{s \leq n < s + r} = B + (s α_{1}, s α_{2}, ..., s α_{k})$ (mod 1) is also $δ$ -dense for any $s \in Z$ . We note that then for any $1 \leq i \leq k$ and any $m \in Z$ , $m α_{i}$ (mod 1) $\in J_{i}$ implies that $m α_{i}$ (mod 1) $\in \bigcap_{j = 0}^{ℓ_{i} - 1} B_{j}^{(i)}$ , and so that $x_{i} (m) x_{i} (m + 1) ... x_{i} (m + ℓ_{i} - 1) = w_{i}$ , or that $σ^{m} (x_{i}) \in [w_{i}]$ . Define $ℓ = \max_{1 \leq i \leq k} ℓ_{i}$ .

Now, choose any $y = (y_{1}, y_{2}, ...) \in X$ . Since $y \in \overline{\{T^{n} x\}_{n \in Z}}$ , there exists $p$ such that $y_{i} (0) y_{i} (1) ... y_{i} (r + ℓ) = x_{i} (p) x_{i} (p + 1) ... x_{i} (p + r + ℓ)$ for $1 \leq i \leq k$ . Then, by the above comments, there exists $m \in [p, p + r] \cap Z$ such that $x_{i} (m) x_{i} (m + 1) ... x_{i} (m + ℓ_{i} - 1) = w_{i}$ for $1 \leq i \leq k$ . Therefore, $y_{i} (m - p) y_{i} (m - p + 1) ... y_{i} (m - p + ℓ) = x_{i} (m) x_{i} (m + 1) ... x_{i} (m + ℓ)$ for $1 \leq i \leq k$ , and since $ℓ \geq ℓ_{i}$ for $1 \leq i \leq k$ , $σ^{m - p} (y_{i}) \in [w_{i}]$ for $1 \leq i \leq k$ . Since $\{w_{i}\}_{1 \leq i \leq k}$ were arbitrary, we have shown that the orbit of $y$ under $T$ is dense in $X$ , and since $y \in X$ was arbitrary, that $(X, T)$ is minimal. We already showed that this in fact implies that $(X, T)$ is totally minimal, and so are done.

$□$

2.7 Questions

There are some natural questions motivated by these results. For any totally minimal

and totally uniquely ergodic system $(X, T)$ , any nonempty open set $U \subseteq X$ with

$μ (U) \in (0, 1)$ , and any $x \notin U$ , take the set $A = \{n \in N : T^{n} x \in U\}$ . Since $(X, T)$ is

uniquely ergodic, the density $d (A) : = \lim_{n \to \infty} \frac{| {1, 2, ..., n} \cap A |}{n}$ of $A$ equals $μ (U) > 0$ , and

so the sequence ${a_{n}}$ of the elements of $A$ written in increasing order does not satisfy

the hypotheses of any of our theorems. However, since $x \notin U$ , and since $T^{a_{n}} x \in U$

for all $n$ , $T^{a_{n}} x$ is bounded away from $x$ .

For a similar example, take $T$ to be a totally minimal and totally uniquely ergodic

isometry of a complete metric space $(X, d)$ with diameter greater than 2. Take $x$ , $y \in$

$X$ with $d (x, y) > 2$ , and define $f \in C (X)$ where $f (z) = 0$ for all $z \in B_{1} (x)$ and

$f (z) = 1$ for all $z \in B_{1} (y)$ . Then, take the sets $A = {n \in N : T^{n} x \in B_{\frac{1}{2}} (y)}$ and $B =$

${n \in N : T^{n} x \in B_{\frac{1}{2}} (x)}$ . Since $(X, T)$ is uniquely ergodic, $d (A) = μ (B_{\frac{1}{2}} (y)) > 0$ and

$d (B) = μ (B_{\frac{1}{2}} (x)) > 0$ . For any $z \in B_{\frac{1}{2}} (x)$ , $d (T^{a_{n}} z, y) \underline{<} d (T^{a_{n}} z, T^{a_{n}} x) + d (T^{a_{n}} x, y) =$

$d (z, x) + d (T^{a_{n}} x, y) < \frac{1}{2} + \frac{1}{2} = 1$ , and so $f (T^{a_{n}} z) = 1$ . Also for any $z \in B_{\frac{1}{2}} (x)$ ,

$d (T^{b_{n}} z, x) \underline{<} d (T^{b_{n}} z, T^{b_{n}} x) + d (T^{b_{n}} x, x) = d (z, x) + d (T^{b_{n}} x, x) < \frac{1}{2} + \frac{1}{2} = 1$ , and so

$f (T^{b_{n}} z) = 0$ . This means that by making a sequence ${c_{n}}$ by alternately choosing

longer and longer subsequences of ${a_{n}}$ and ${b_{n}}$ , we could create such ${c_{n}}$ with

$d ({c_{n}}) > 0$ (so ${c_{n}}$ does not satisfy the hypotheses of any of our theorems) where

$\lim_{N \to \infty} \frac{1}{N} \sum_{n = 1}^{N} f (T^{c_{n}} z)$ fails to converge for any $z$ in the second category set $B_{\frac{1}{2}} (x)$ .

The point of these examples is to show that our hypotheses are certainly not the only

ones under which examples of the types constructed in this paper exist. This brings

up the following questions:

Question 2.7.1. For what increasing sequences ${p_{n}}$ of integers does there exist $a$

totally minimal and totally uniquely ergodic system $(X, T)$ and a point $x \in X$ such

that $T^{p_{n}} x$ is bounded away from $x$ ?

Question 2.7.2. For what increasing sequences ${p_{n}}$ of integers does there exist $a$

totally minimal and totally uniquely ergodic system $(X, T)$ and a function $f \in C (X)$

such that $\lim_{N \to \infty} \frac{1}{N} \sum_{n = 1}^{N} f (T^{p_{n}} x)$ fails to converge for a set of $x$ of second category?

Also, it is interesting that we could create examples for a wider class of sequences

${p_{n}}$ when $X$ was not connected. We would like to know whether or not this is

necessary, i.e.

Question 2.7.3. Given an increasing sequence ${p_{n}}$ of integers and a totally minimal

totally uniquely ergodic topological dynamical system (X,T) and x $\in X$ such that $T^{p_{n}} x$

is bounded away from x, must there exist a system with the same properties where $X$

is a connected space?

Question 2.7.4. Given an increasing sequence ${p_{n}}$ of integers and a totally minimal

totally uniquely ergodic topological dynamical system $(X, T)$ and $f \in C (X)$ such that

$\lim_{N \to \infty} \frac{1}{N} \sum_{n = 1}^{N} f (T^{p_{n}} x)$ fails to converge for a set of $x$ of second category, must

there exist a system with the same properties where $X$ is a connected space?

CHAPTER 3

PERTURBATIONS OF MULTIDIMENSIONAL SHIFTS

OF FINITE TYPE

3.1 Introduction

Consider a symbolic dynamical system with positive topological entropy $h^{t o p} (X)$ . (A

rigorous definition of topological entropy appears below, but informally it is the expo-

nential growth rate of the number of different $n$ -letter words appearing as subwords

in elements of $X$ .) Fix any word $w \in A^{[1, ..., n]}$ for $n \in N$ which appears as a subword

of some element of $X$ . Then, define $X_{w} \underline{\subset} Ω$ to be the set of all elements of $X$ in

which $w$ is not a subword. Clearly $X_{w} \subset X$ , and so the topological entropy of $X_{w}$

is not more than that of $X$ . We wish to estimate the size of the drop in topological

entropy $h^{t o p} (X) - h^{t o p} (X_{w})$ , and how this quantity behaves as $n \to \infty$ . To more

rigorously state and approach this problem, we begin with some definitions. Some of

these terms have already been defined in Chapter 2, but we repeat the definitions for

the more general setup of Chapter 3.

We denote by $R_{j_{1}, j_{2}, ..., j_{d}}$ the $j_{1} \times j_{2} \times \cdots \times j_{d}$ parallelepiped $\prod_{i = 1}^{d} \{1, 2, ..., j_{i}\}$ in $Z^{d}$ , and use the shorthand notation $Γ_{j}$ for the cube $R_{j, j, ..., j}$ . The numbers $j_{1}, ..., j_{d}$ are the sizes of $R_{j_{1}, ..., j_{d}}$ , and $j$ is called the size of $Γ_{j}$ . We use $d$ to denote the $\ell_{\infty}$ metric on $R^{d}$ : for $p, q \in R^{d}$ , $d (p, q) := | p - q |_{\infty} = \max_{1 \leq i \leq d} | p_{i} - q_{i} |$ . For each $1 \leq i \leq d$ , we denote by $e_{i}$ the $i$th element of the standard basis for $Z^{d}$ , and for $p, q \in R^{d}$ , the distance in the $e_{i}$ -direction between $p$ and $q$ is defined to be $| p_{i} - q_{i} |$ .

Definition 3.1.1. An alphabet is a finite set. For any alphabet $A$ , we define the $Z^{d} -$

shift action ${σ_{v}}_{v \in Z^{d}}$ on $Ω = A^{Z^{d}}$ as follows: for any $v \in Z^{d}$ and $x \in Ω$ , $(σ_{v} (x)) (u) =$

$x (v + u)$ for all $u \in Z^{d}$ .

Definition 3.1.2. A $Z^{d}$ -subshift on an alphabet $A$ is a set $X \subseteq Ω$ with the following two properties:

(i) $X$ is shift-invariant, meaning that for any $x \in X$ and $p \in Z^{d}$ , $σ_{p} (x) \in X$ .

(ii) $X$ is closed in the product topology on $Ω$ .

For any $Z^{d}$ -subshift $X$ , $(X, {σ_{v}}_{v \in Z^{d}})$ is a symbolic dynamical system.

Definition 3.1.3. A Borel probability measure $μ$ on $Ω$ supported on a subshift $X$ is ergodic if the measure-preserving dynamical system $(X, B (X), μ, \{T_{v}\}_{v \in Z^{d}})$ is ergodic,

where $B (X)$ is the Borel $σ$ -algebra of $X$ .

Definition 3.1.4. A word $u$ on the alphabet $A$ is any mapping from a non-empty

subset $S$ of $Z^{d}$ to $A$ , where $S$ is called the shape of $u$ . For words $u$ and $u^{'}$ , where

$u$ has shape $S$ , we say that $u$ is a subword of $u^{'}$ if there exists $p \in Z^{d}$ such that

$u (q) = u^{'} (q + p)$ for all $q \in S$ . In this case, we say that $u$ occurs in $u^{'}$ at $S + p$ , or

that $u^{'} (S + p) = u$ .

Definition 3.1.5. A word $w$ with shape $S$ is periodic with respect to $v \in Z^{d}$ if

$S \cap (S - v) \neq \emptyset$ and $w (u) = w (u + v)$ for all $u \in S \cap (S - v)$ . We also say that $v$ is

a period of $w$ . A word is aperiodic if it has no periods.
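As an illustration of Definition 3.1.5, here is a minimal sketch of a period test. It assumes a word is stored as a dictionary from points of its shape to letters (a representation chosen here for convenience, not taken from the text), and it treats $v = 0$ as excluded, which the notion of an aperiodic word implicitly requires.

```python
def is_period(word, v):
    """Check whether v is a period of `word` (a dict: point of the shape S -> letter).

    u lies in S ∩ (S - v) exactly when both u and u + v lie in S.
    """
    if all(c == 0 for c in v):
        return False  # assumption: the zero vector is not counted as a period
    shift = lambda u: tuple(a + b for a, b in zip(u, v))
    overlap = [u for u in word if shift(u) in word]
    if not overlap:
        return False  # the definition requires S ∩ (S - v) to be nonempty
    return all(word[u] == word[shift(u)] for u in overlap)

# A 4 x 4 checkerboard word on the shape {0,...,3}^2.
checker = {(i, j): (i + j) % 2 for i in range(4) for j in range(4)}
print(is_period(checker, (1, 1)), is_period(checker, (1, 0)))
```

In this example the diagonal shift preserves the checkerboard pattern while the horizontal unit shift does not.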

Definition 3.1.6. For any $T \subseteq S \subset Z^{d}$ , we say that two words $u$ and $u^{'}$ with shape $S$ agree on $T$ if $u (T) = u^{'} (T)$ , i.e. $u$ and $u^{'}$ have the same letters on $T$ .

Definition 3.1.7. The boundary of thickness k of a subset S of $Z^{d}$ , which is

denoted by $S^{(k)}$ , is the set of points p of S for which there exists a point q $\in Z^{d} ∖ S$

with $d (p, q) \underline{<} k$ . Whenever we refer to the boundary of a shape S, we mean the

boundary of thickness one.
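Definition 3.1.7 can be made concrete with a short computation. The sketch below assumes a finite nonempty shape stored as a set of integer tuples; the helper names `cube` and `boundary` are illustrative.

```python
from itertools import product

def cube(j, d):
    """The cube Gamma_j = {1, ..., j}^d as a set of integer points."""
    return set(product(range(1, j + 1), repeat=d))

def boundary(shape, k):
    """Points p of `shape` having some q outside `shape` with l-infinity distance at most k."""
    d = len(next(iter(shape)))  # dimension, read off any point (shape must be nonempty)
    offsets = list(product(range(-k, k + 1), repeat=d))
    return {
        p for p in shape
        if any(tuple(a + b for a, b in zip(p, off)) not in shape for off in offsets)
    }

# Boundary of thickness one of the 3 x 3 square: everything except the centre.
print(sorted(boundary(cube(3, 2), 1)))
```

For a cube $Γ_{j}$ the boundary of thickness $k$ is the cube with its central copy of $Γ_{j - 2 k}$ removed, which is the shape of the frames used when words are glued together.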

Definition 3.1.8. $T \subseteq Z^{d}$ is called a copy of a shape $S$ if $T = S + v$ for some $v \in Z^{d}$ . This $v$ is then called the difference of $S$ and $T$ , denoted by $T - S$ . We say that a point $p \in S$ corresponds to a point $q \in T$ if $p = q + (S - T)$ .

Definition 3.1.9. The language of $a Z^{d}$ -subshift X, denoted by $L (X)$ , is the set of

words which appear as subwords of elements of X. The set of words with a particular

shape S which are in the language of X is denoted by $L_{S} (X)$ .

Definition 3.1.10. For $a Z^{d}$ -subshift X, and for any finite shape $S \underline{\subset} Z^{d}$ , the number

of words in $L_{S} (X)$ is denoted by $H_{S} (X)$ , and the natural logarithm of this quantity

is denoted by $h_{S} (X)$ .
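To make $H_{S} (X)$ and $h_{S} (X)$ concrete, the following sketch counts words on small squares for one illustrative subshift: the set of 0/1 configurations on $Z^{2}$ with no two horizontally or vertically adjacent 1s. This example is an assumption of the sketch and is not taken from the text. For this subshift every locally valid square pattern extends to a full configuration by filling with 0s, so a brute-force count of valid patterns equals $H_{Γ_{n}} (X)$.

```python
from itertools import product
from math import log

def count_hard_square(n):
    """Count n x n arrays of 0/1 with no two horizontally or vertically adjacent 1s."""
    total = 0
    for cells in product((0, 1), repeat=n * n):
        grid = [cells[r * n:(r + 1) * n] for r in range(n)]
        valid = all(
            not (grid[r][c] and c + 1 < n and grid[r][c + 1])
            and not (grid[r][c] and r + 1 < n and grid[r + 1][c])
            for r in range(n) for c in range(n)
        )
        total += valid
    return total

for n in range(1, 4):
    H = count_hard_square(n)
    # H is H_{Gamma_n}(X); log(H) is h_{Gamma_n}(X); dividing by n^2 normalizes by the size of the shape.
    print(n, H, log(H) / n ** 2)
```

The brute-force enumeration is exponential in $n^{2}$ and is only meant for very small squares.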

Definition 3.1.11. A $Z^{d}$ -shift of finite type is a $Z^{d}$ -subshift defined by specifying a finite collection of words on $A$ , call this collection $F$ , and then taking $X = Ω_{F}$ to be the set of elements of $Ω$ which do not contain any member of $F$ as subwords. Different collections $F$ could induce the same shift of finite type; for any fixed $X$ , the type of $X$ is the minimum nonnegative integer $t$ such that for some $F$ consisting entirely of words with shape $Γ_{t}$ , $X = Ω_{F}$ . A shift of type two is called a Markov shift.

Definition 3.1.12. A word v with shape S forces a word w with shape T in a shift

of finite type X if any x $\in X$ with $x (S) = v$ also has $x (T) = w$ .

Definition 3.1.13. A subshift $X$ is topologically transitive if there exists $x \in X$

for which ${σ_{p} (x)}_{p \in Z^{d}}$ is dense in $X$ , $i . e$ . if there is $x \in X$ which contains every word

in $L (X)$ as a subword.

Definition 3.1.14. The topological entropy of a subshift X, denoted by $h^{t o p} (X)$ ,

is defined by

$h^{t o p} (X) = \lim_{j_{1}, j_{2}, ..., j_{d} \to \infty} \frac{h_{R_{j_{1}, j_{2}, ..., j_{d}}} (X)}{j_{1} j_{2} \cdots j_{d}}$ .

Definition 3.1.15. The measure-theoretic entropy of a subshift $X$ with respect

to a shift-invariant probability measure $μ$ on $X$ , which is denoted by $h_{μ} (X)$ , is defined by

$h_{μ} (X) = \lim_{j_{1}, j_{2}, ..., j_{d} \to \infty} \frac{1}{j_{1} j_{2} \cdots j_{d}} \sum_{w \in L_{R_{j_{1}, ..., j_{d}}} (X)} f (μ ([w]))$ ,

where $f (x) = - x \ln x$ for $x > 0$ and $f (0) = 0$ .

Both of these limits exist by subadditivity.
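For a quick sanity check of Definition 3.1.15, consider the simplest case: $d = 1$ , $X$ the full shift, and $μ$ a product (Bernoulli) measure. These choices are assumptions of the sketch below, made only because the cylinder measures are then explicit. In that case the normalized sum is the same for every $n$ and equals $\sum_{a} f (μ ([a]))$ .

```python
from itertools import product
from math import log, prod

def f(x):
    """f(x) = -x ln x for x > 0, and f(0) = 0."""
    return -x * log(x) if x > 0 else 0.0

def block_entropy(weights, n):
    """(1/n) * sum of f(mu([w])) over all length-n words w, for the product measure
    whose single-letter probabilities are `weights`."""
    letters = range(len(weights))
    total = sum(f(prod(weights[a] for a in w)) for w in product(letters, repeat=n))
    return total / n

weights = [0.25, 0.75]
for n in range(1, 7):
    print(n, block_entropy(weights, n))
```

For measures that are not products the normalized sums genuinely depend on $n$ , and the limit in the definition is needed.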

We now restate the question from the beginning of Chapter 3. Suppose that a $Z -$

subshift $X$ on an alphabet $A$ is given, and consider any $n$ -letter word $w \in L_{Γ_{n}} (X)$ .

One can then define the subshift $X_{w}$ as consisting of all elements of $X$ which do not

contain $w$ as a subword. Since $X_{w} \underline{\subset} X$ , $h^{t o p} (X_{w}) \underline{<} h^{t o p} (X)$ , but it is natural to

wonder how much the entropy decreases by. For example, is it necessarily the case

that $h^{t o p} (X) - h^{t o p} (X_{w}) \to 0$ as $n \to \infty$ ? It is not hard to check that this is not always

the case. For example, if $X$ is minimal, then every $x \in X$ contains every $w \in L (X)$ as

a subword. This means that for any word $w \in L (X)$ , $X_{w} = \emptyset$ , and so $h^{t o p} (X_{w}) = 0$ .

This means that minimality is possibly too globally defined for our purposes. It is

then natural to move to the completely local definition of a shift of finite type. In

this setup, as was already mentioned in Chapter 1, there is the following result of

Douglas Lind:

Theorem 3.1.16. ([L], p. 360, Theorem 3) For any topologically transitive Z-shift

of finite type $X = Ω_{F}$ with positive topological entropy $h^{t o p} (X)$ , there exist constants

$C_{X}$ , $D_{X}$ , and $N_{X}$ such that for any $n >$ $N_{X}$ and any word $w \in L_{Γ_{n}} (X)$ , if we denote

by $X_{w}$ the shift of finite type $Ω_{F \cup \{w\}}$ , then

$\frac{C_{X}}{e^{h^{t o p} (X) n}} < h^{t o p} (X) - h^{t o p} (X_{w}) < \frac{D_{X}}{e^{h^{t o p} (X) n}}$ .
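The exponential rate in Theorem 3.1.16 can be observed numerically in a simple case. The sketch below assumes $X$ is the full shift on $\{0, 1\}$ (so $h^{t o p} (X) = \ln 2$ and $e^{h^{t o p} (X) n} = 2^{n}$ ) and $w = 1^{n}$ ; both choices are illustrative. It estimates $h^{t o p} (X_{w})$ from the growth rate of the number of binary words with no run of $n$ consecutive 1s.

```python
from math import log

def growth_rate(n, m=400):
    """Approximate exponential growth rate of binary words avoiding n consecutive 1s."""
    counts = [1] + [0] * (n - 1)  # counts[k]: words ending in exactly k trailing 1s
    prev_total = 1
    for _ in range(m):
        prev_total = sum(counts)
        # Appending 0 resets the run; appending 1 extends a run of length < n - 1.
        counts = [prev_total] + counts[:-1]
    return sum(counts) / prev_total  # ratio of counts for lengths m and m - 1

def entropy_drop(n):
    """h_top(X) - h_top(X_w) for X the full 2-shift and w = 1^n."""
    return log(2) - log(growth_rate(n))

for n in range(2, 11):
    drop = entropy_drop(n)
    print(n, drop, drop * 2 ** n)  # the theorem predicts drop * 2^n stays in a bounded positive range
```

The ratio of consecutive counts is only an approximation to the growth rate, though for this family it converges very quickly in `m`.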

Lind was apparently examining this question in order to prove some results about

the automorphism group of a shift of finite type; he needed to know that $h^{t o p} (X) -$

$h^{t o p} (X_{w}) \to 0$ as the size $n$ of the word $w$ approaches infinity ([BoyLR]). Some work on bounds of this type in the case where $X = A^{Z}$ was also done by Guibas and Odlyzko ([GO1], [GO2]).

We wish to prove a version of Theorem 3.1.16 for multidimensional shifts of finite

type, i.e. shifts of finite type on $Z^{d}$ with $d > 1$ . In the remainder of this section, we

will discuss two relatively significant changes that we make in the statement of the

multidimensional extension. This discussion will have the dual purposes of motivating

our result and illustrating some of the issues that arise in higher-dimensional symbolic

dynamics. Our methods are similar in places to those used in [QT], where the effects

on entropy of removing words from shifts of finite type are also examined.

The first problem in giving a multidimensional version of Theorem 3.1.16 comes in changing the exponents in the denominators. The most reasonable guess is that instead of using $e^{h^{t o p} (X) n}$ , in the extension we should have $e^{h^{t o p} (X) n^{d}}$ . However, we will show that this is a bit too much to hope for. Consider the following examples:

take $Y = Ω$ . $Y$ is clearly a shift of finite type, with the set of forbidden words $\emptyset$ .

For every $j \in N$ , we define a shift of finite type as follows: define the alphabet $A^{(j)} =$

$L_{Γ_{j}} (Y) = {0, 1}^{Γ_{j}}$ , and define $f_{j}$ : $Y \to (A^{(j)})^{Z^{d}}$ by $(f_{j} (x)) (p) = x (p + Γ_{j} - (1, ..., 1))$

for every $x \in Y$ and $p \in Z^{d}$ . This map is usually called the higher block presentation

of Y. (see [LM]) In other words, for every $p \in Z^{d}$ , $(f_{j} (x)) (p)$ is defined to be the

subword of $x$ which lies in $Γ_{j} + p -$ (1, $...$ , 1), or a copy of $Γ_{j}$ whose least vertex in

the usual lexicographic order on $Z^{d}$ is $p$ . Figure 3.1 illustrates the action of $f_{2}$ on a

typical element of $Y$ for $d = 2$ : (the lower-leftmost entry of each array is at the point

$(1, ..., 1)$ $\in Z^{d})$

In Figure 3.1, the horizontal and vertical lines separate letters of the alphabet. So,

the left half shows a word with shape $Γ_{4}$ whose letters are elements of $A$ , namely

zeroes and ones, and the right half shows a word with shape $Γ_{3}$ on the alphabet $A^{(2)}$ ,

which are themselves words with shape $Γ_{2}$ whose letters are zeroes and ones.
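The action of $f_{j}$ on finite words is easy to compute directly. The sketch below applies $f_{2}$ to the $4 \times 4$ array of zeroes and ones that appears in Figure 3.1, with rows listed from top to bottom as printed. The function name `higher_block` and the list-of-rows representation are assumptions of this sketch.

```python
def higher_block(grid, j):
    """Apply f_j to a square word: each output letter is the j x j block of `grid`
    whose corner sits at that position (rows are listed top to bottom)."""
    n = len(grid)
    m = n - j + 1  # a word of shape Gamma_n is sent to a word of shape Gamma_{n-j+1}
    return [
        [
            tuple(tuple(grid[r + dr][c + dc] for dc in range(j)) for dr in range(j))
            for c in range(m)
        ]
        for r in range(m)
    ]

word = [
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [1, 0, 0, 0],
]
image = higher_block(word, 2)
for row in image:
    print(row)
```

Each letter of the image overlaps its neighbours in a $2 \times 1$ or $1 \times 2$ strip, which is the compatibility condition that makes $f_{j} (Y)$ a Markov shift.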

Figure 3.1: $f_{2}$ 's action on a sample element of $Y$

With $f_{j}$ so defined, $f_{j} (Y)$ is a subshift with alphabet $A^{(j)}$ . In fact it is easily checkable that $f_{j} (Y)$ is a Markov shift; $x \in (A^{(j)})^{Z^{d}}$ is an element of $f_{j} (Y)$ if and only if for every $p \in Z^{d}$ and $1 \leq i \leq d$ , $x (p) (R_{j, ..., j, j - 1, j, ..., j} + e_{i}) = x (p + e_{i}) (R_{j, ..., j, j - 1, j, ..., j})$ , where the $j - 1$ is in the $i$th subindex in both cases. Since $f_{j}$ is a shift-commuting bicontinuous bijection between $Y$ and $f_{j} (Y)$ , $f_{j} (Y)$ is topologically conjugate to $Y$ for every $j \in N$ .

Since $Y$ and $f_{j} (Y)$ are such canonical examples of shifts of finite type (the full shift $Y$

is in some sense the simplest shift of finite type, and every $f_{j} (Y)$ is conjugate to it),

any meaningful extension of Lind's result should certainly apply to them. $f_{j}$ can be

considered as a map between words just as easily as a function between subshifts: in

the diagram above, if one disregards the dots surrounding the words, then $f_{2}$ can be

thought of as sending $u \in L_{Γ_{4}} (Y)$ to $f_{2} (u) \in L_{Γ_{3}} (f_{2} (Y))$ . For this reason, and since $f_{j}$

is a topological conjugacy for all $j$ , if we consider $Y_{w}$ the shift of finite type resulting

from removing some word $w$ with shape $Γ_{n}$ from $L (Y)$ , and $(f_{j} (Y))_{f_{j} (w)}$ the shift of

finite type resulting from removing $f_{j} (w)$ from $L (f_{j} (Y))$ , then $Y_{w}$ and $(f_{j} (Y))_{f_{j} (w)}$

are topologically conjugate as well, and therefore have the same topological entropy.

In other words, the removal of a $Γ_{n}$ word from $L (Y)$ results in the same drop in

topological entropy as the removal of a $Γ_{n - j + 1}$ word from $L (f_{j} (Y))$ . Now, let us

suppose that Lind's result can be extended by simply changing the $n$ in the exponent

to an $n^{d}$ . This would result in something that looks like this:

Possible Theorem 3.1.17. For any d $>$ 1 and any topologically transitive shift

X $= Ω_{F}$ of finite type with positive topological entropy $h^{t o p} (X)$ , there exist constants

$C_{X}$ , $D_{X}$ , and $N_{X}$ such that for any n $>$ $N_{X}$ and any word w $\in L_{Γ_{n}} (X)$ , if we denote

by $X_{w}$ the shift of finite type $Ω_{F \cup \{w\}}$ , then

$\frac{C_{X}}{e^{h^{t o p} (X) n^{d}}} < h^{t o p} (X) - h^{t o p} (X_{w}) < \frac{D_{X}}{e^{h^{t o p} (X) n^{d}}}$ .

We can now show that this possible extension cannot be true; suppose that there

exist such constants $C_{Y}$ , $D_{Y}$ , and $N_{Y}$ , which satisfy Possible Theorem 3.1.17 for $Y$ ,

and for any $j > 1$ , $C_{f_{j} (Y)}$ , $D_{f_{j} (Y)}$ , and $N_{f_{j} (Y)}$ , which satisfy Possible Theorem 3.1.17

for $f_{j} (Y)$ . Then, for any word $w \in L_{Γ_{n}} (Y)$ , where $n >$ $N_{Y}$ ,

$\frac{C_{Y}}{e^{h^{t o p} (Y) n^{d}}} < h^{t o p} (Y) - h^{t o p} (Y_{w}) < \frac{D_{Y}}{e^{h^{t o p} (Y) n^{d}}}$ .

By the previous argument, this means that for any $j$ ,

$\frac{C_{Y}}{e^{h^{t o p} (Y) n^{d}}} < h^{t o p} (f_{j} (Y)) - h^{t o p} ((f_{j} (Y))_{f_{j} (w)}) < \frac{D_{Y}}{e^{h^{t o p} (Y) n^{d}}}$ . (3.1)

And by the definitions of $C_{f_{j} (Y)}$ , $D_{f_{j} (Y)}$ , and $N_{f_{j} (Y)}$ , and since $f_{j} (w) \in L_{Γ_{n - j + 1}} (f_{j} (Y))$ ,

as long as $n - j + 1$ $> N_{f_{j} (Y)}$ , we know that

$\frac{C_{f_{j} (Y)}}{e^{h^{t o p} (f_{j} (Y)) (n - j + 1)^{d}}} < h^{t o p} (f_{j} (Y)) - h^{t o p} ((f_{j} (Y))_{f_{j} (w)}) < \frac{D_{f_{j} (Y)}}{e^{h^{t o p} (f_{j} (Y)) (n - j + 1)^{d}}}$ . (3.2)

However, since $d > 1$ and $h^{t o p} (f_{j} (Y)) = h^{t o p} (Y)$ , as $n$ grows $e^{h^{t o p} (f_{j} (Y)) n^{d}}$ grows much more quickly than $e^{h^{t o p} (Y) (n - j + 1)^{d}}$ . Therefore (3.1) and (3.2) contradict each other for large enough $n$ . The problem that must be addressed is that due to examples such as

these, there must be some sort of a "tolerance range" around $n$ in any meaningful

multidimensional version of Theorem 3.1.16. In other words, we should amend our

possible extension to the following:

Possible Theorem 3.1.18. For any d $>$ 1 and any topologically transitive shift

X $= Ω_{F}$ of finite type with positive topological entropy $h^{t o p} (X)$ , there exist constants

$C_{X}$ , $D_{X} > 0$ , $A_{X}$ , $B_{X}$ , and $N_{X} \in N$ such that for any n $>$ $N_{X}$ and any word w $\in$

$L_{Γ_{n}} (X)$ , if we denote by $X_{w}$ the shift of finite type $Ω_{F \cup \{w\}}$ , then

$\frac{C_{X}}{e^{h^{t o p} (X) (n + A_{X})^{d}}} < h^{t o p} (X) - h^{t o p} (X_{w}) < \frac{D_{X}}{e^{h^{t o p} (X) (n + B_{X})^{d}}}$ .
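The clash between (3.1) and (3.2), which motivates the additive shifts $A_{X}$ and $B_{X}$ , can be spelled out as follows. This is a sketch using only quantities already introduced, with $h$ abbreviating $h^{top}(Y) = h^{top}(f_j(Y))$ and $j > 1$ fixed.

```latex
% Lower bound of (3.2) combined with the upper bound of (3.1):
\[
\frac{C_{f_j(Y)}}{e^{h (n-j+1)^d}} < \frac{D_Y}{e^{h n^d}}
\quad\Longrightarrow\quad
e^{h \left( n^d - (n-j+1)^d \right)} < \frac{D_Y}{C_{f_j(Y)}} .
\]
% The exponent on the left is unbounded in n when d > 1 and j > 1:
\[
n^d - (n-j+1)^d = (j-1) \sum_{i=0}^{d-1} n^{i} (n-j+1)^{d-1-i} \geq (j-1)\, n^{d-1} \to \infty .
\]
```

With $n$ replaced by $n + A$ in the exponent, the choice $A_{f_j(Y)} = A_Y + j - 1$ makes the two lower bounds agree, so for this family of examples an additive tolerance depending on the shift removes the obstruction.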

Since $j$ could be arbitrarily large in the definition of $f_{j} (Y)$ , it must be the case that $A_{X}$

and $B_{X}$ depend on $X$ rather than being absolute constants. In fact, such constants

$A_{X}$ and $B_{X}$ could be thought of as being present in Theorem 3.1.16 as well, but

hidden inside the constants $C_{X}$ and $D_{X}$ . This new version is still not true, however.

Figure 3.2: A portion of a sample element of $Z$

This time, the problem is not with the conclusion, but the hypotheses. Topological

transitivity is not nearly a strong enough hypothesis in several dimensions to get

the type of result that we are after, and in fact we will see that topological mixing

is not nearly good enough either. To show this, we consider a subshift called the

checkerboard island shift, which we call $Z$ , defined by Quas and Şahin in [QS]. $Z$ is

a $Z^{2}$ -Markov shift, on an alphabet $A$ with thirteen letters. $Z$ is also topologically

mixing; for a proof see [QS].

All letters of $A$ and elements of $L_{Γ_{2}} (Z)$ are displayed in Figure 3.2. In other words,

a two-by-two word of symbols is in $L (Z)$ if and only if it appears as a subword of the

word in Figure 3.2. Consider the word with shape $Γ_{8}$ in Figure 3.3, which we call $a_{4}$ .

It will not be shown here (details are in [QS]), but an occurrence of $a_{4}$ in any element

of $Z$ forces the arrows and diagonal lines surrounding $a_{4}$ in Figure 3.2, along with

a boundary of thickness one filled with empty spaces. In other words, wherever $a_{4}$ ,

which has shape $Γ_{8}$ , appears, it forces a word, which we call $b_{4}$ , of shape $Γ_{14}$ . $b_{4}$

appears in Figure 3.4.

Figure 3.3: $a_{4}$

Figure 3.4: $b_{4}$

In fact, for every $n > 4$ , there is an analogous word $a_{n}$ (a checkerboard of size $Γ_{2 n - 2}$

with a boundary filled with arrows and diagonal lines in an analogous way to that

of $a_{4}$ ) which forces a word $b_{n}$ of size $Γ_{4 n - 2}$ around it. We claim that for any $n$ , the

shift $Z_{a_{n}}$ obtained by removing $a_{n}$ from $L (Z)$ and the shift $Z_{b_{n}}$ obtained by removing

$b_{n}$ from $L (Z)$ are equal. Since $a_{n}$ is a subword of $b_{n}$ , it is obvious that $Z_{a_{n}} \underline{\subset} Z_{b_{n}}$ .

Suppose that $Z_{b_{n}} \not\subseteq Z_{a_{n}}$ . Then there exists $x \in Z$ which has an occurrence of

$a_{n}$ , but no occurrence of $b_{n}$ , which, as has already been stated, is not possible. So,

$Z_{a_{n}} = Z_{b_{n}}$ . This implies that $h^{t o p} (Z) - h^{t o p} (Z_{a_{n}}) = h^{t o p} (Z) - h^{t o p} (Z_{b_{n}})$ . However, since

$a_{n} \in L_{Γ_{2 n}} (Z)$ and $b_{n} \in L_{Γ_{4 n - 2}} (Z)$ , we have a contradiction to Possible Theorem 3.1.18

for the same reason as we did before. Since this is true for all $n$ , if we wish to adjust

the conditions of the conclusion again, we will have to change our "tolerance range"

from $[n + A_{X}, n + B_{X}]$ to something like $[n A_{X}, n B_{X}]$ , which would weaken the result

considerably. To obtain a conclusion of the strength we want, we need a stronger

mixing property than topological mixing. The notion which we use is that of strong

irreducibility.
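The contradiction with Possible Theorem 3.1.18 can be made explicit. The following is a sketch with $d = 2$ , $h = h^{top}(Z)$ , and $C, D, A, B$ standing for the constants $C_Z, D_Z, A_Z, B_Z$ that the possible theorem would supply.

```latex
% Lower bound applied to a_n (shape \Gamma_{2n}), upper bound applied to b_n (shape \Gamma_{4n-2}),
% using h^{top}(Z_{a_n}) = h^{top}(Z_{b_n}):
\[
\frac{C}{e^{h (2n + A)^2}} < h^{top}(Z) - h^{top}(Z_{a_n}) < \frac{D}{e^{h (4n - 2 + B)^2}}
\quad\Longrightarrow\quad
e^{h \left[ (4n-2+B)^2 - (2n+A)^2 \right]} < \frac{D}{C} .
\]
% The exponent factors as a difference of squares and is unbounded in n:
\[
(4n-2+B)^2 - (2n+A)^2 = (2n - 2 + B - A)(6n - 2 + A + B) \to \infty .
\]
```

Since the size of $b_{n}$ is roughly twice that of $a_{n}$ , no additive tolerance can absorb the gap, which is why only a multiplicative range of the form $[n A_{X}, n B_{X}]$ would survive this example.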

Definition 3.1.19. A $Z^{d}$ -subshift $X$ is strongly irreducible with uniform filling length $R$ if for any two subsets $S$ , $T$ of $Z^{d}$ , and for all $p \in Z^{d}$ such that $d (S, T + p) > R$ (that is, for all $q \in S$ and $r \in T + p$ , $d (q, r) > R$ ), and for any elements $x, x^{'} \in X$ , there exists $y \in X$ such that $y (S) = x (S)$ and $y (T + p) = x^{'} (T + p)$ .

For the sake of comparing, we also include a definition of topological mixing whose

formulation looks similar to that of strong irreducibility:

Definition 3.1.20. A $Z^{d}$ -subshift $X$ is topologically mixing if for any two subsets $S$ and $T$ of $Z^{d}$ , there exists $R_{S, T}$ such that for all $p \in Z^{d}$ with the property that $d (S, T + p) > R_{S, T}$ , and for any elements $x, x^{'} \in X$ , there exists $y \in X$ such that $y (S) = x (S)$ and $y (T + p) = x^{'} (T + p)$ .

The difference in the two definitions is subtle but important: for strong irreducibility,

the distance $R$ that must be present between two shapes in order to make interpolation

possible is independent of the shapes involved, whereas in topological mixing, this

distance can depend on the shapes. This distinction is irrelevant in one dimension,

since strong irreducibility and topological mixing are the same concept for $Z$ -subshifts.

(To see this, note that for a $Z$ -shift of finite type, a distance sufficient for interpolating

between two letters of the alphabet is sufficient to interpolate between any two words.)

However, strong irreducibility is a strictly stronger notion in more than one dimension.

Pertinent for our purposes is the fact that if $x (S) = w$ for some $x$ in a strongly

irreducible shift of finite type $X$ , then the occurrence of $w$ at $S$ can only "force"

letters to appear in the region of $Z^{d}$ within distance $R$ of $S$ . This is because for any

$v \in Z^{d}$ with $d (v, S) > R$ and any $a \in A$ , by the definition of strong irreducibility

there is some $y \in X$ with $y (S) = w$ and $y (v) = a$ . For this reason, the checkerboard

island system $Z$ is topologically mixing but not strongly irreducible. Armed with this

stronger hypothesis, we can now refine Possible Theorem 3.1.18 a bit more:

Possible Theorem 3.1.21. For any d $> 1$ and any strongly irreducible shift X $= Ω_{F}$

of finite type with uniform filling length R and positive topological entropy $h^{t o p} (X)$ ,

there exist constants $C_{X}$ , $D_{X} > 0$ , $A_{X}$ , $B_{X}$ , and $N_{X} \in N$ such that for any n $>$ $N_{X}$

and any word w $\in L_{Γ_{n}} (X)$ , if we denote by $X_{w}$ the shift of finite type $Ω_{F \cup \{w\}}$ , then

$\frac{C_{X}}{e^{h^{t o p} (X) (n + A_{X})^{d}}} < h^{t o p} (X) - h^{t o p} (X_{w}) < \frac{D_{X}}{e^{h^{t o p} (X) (n + B_{X})^{d}}}$ .

In fact the above is now provable with correct choices for the pertinent constants,

and so we state our main result:

Theorem 3.1.22. For any d $> 1$ and any strongly irreducible shift X $= Ω_{F}$ of finite

type with uniform filling length R and positive topological entropy $h^{t o p} (X)$ , there exist

constants $N_{X} \in N$ and $D_{X} \in R$ such that for any n $>$ $N_{X}$ and any word w $\in L_{Γ_{n}} (X)$ ,

if we denote by $X_{w}$ the shift of finite type $Ω_{F \cup \{w\}}$ , then

$\frac{1}{e^{h^{t o p} (X) (n + 44 R + 70)^{d}}} < h^{t o p} (X) - h^{t o p} (X_{w}) < \frac{D_{X}}{e^{h^{t o p} (X) (n - 2 R)^{d}}}$ .

Before beginning with the preliminaries of a proof, we make an observation: our

example shifts of finite type $f_{j} (Y)$ used above to show the necessity of the constants

$A_{X}$ and $B_{X}$ in this statement did not actually show that $A_{X}$ and $B_{X}$ are not the

same; it is possible that $A_{f_{j} (Y)} = B_{f_{j} (Y)} = j$ for all $j$ . However, we may now introduce

another example to show that in fact $A_{X}$ and $B_{X}$ must be distinct for certain $X$ .

Consider, again for any $j > 1$ , and any dimension $d$ , the shift of finite type $Z_{j}$ on

the alphabet $\{0, 1\}$ defined by the set of forbidden words $F_{j} = \{w \in \{0, 1\}^{Γ_{j}} : w \text{ has at least two ones}\}$ .

In other words, $Z_{j}$ consists of all infinite words of zeroes

and ones on $Z^{d}$ such that any two ones are a $d$ -distance of at least $j$ from each other.

By definition, $Z_{j}$ is clearly a shift of finite type. We also claim that it is strongly

irreducible with $R = j - 1$ : consider any words $x \in L_{S} (Z_{j})$ and $x^{'} \in L_{T} (Z_{j})$ such

that $d (S, T) > j - 1$ . Take $y \in \{0, 1\}^{Z^{d}}$ defined by $y (S) = x$ , $y (T) = x^{'}$ , and $y (p) = 0$

for any $p \notin S \cup T$ . We claim that $y \in Z_{j}$ : suppose, on the contrary, that $y \notin Z_{j}$ .

Then there exist distinct $a$ , $b \in Z^{d}$ such that $d (a, b) < j$ , and $y (a) = y (b) = 1$ . If $a$ , $b \in S$ ,

then $x$ contains two ones a $d$ -distance less than $j$ apart and cannot be in $L (Z_{j})$ , a

contradiction. We arrive at a similar contradiction if $a$ , $b \in T$ . Clearly we cannot

have $a \in S$ and $b \in T$ since $d (S, T) > j - 1$ , and similarly it cannot be the case that

$a \in T$ and $b \in S$ . But since the only ones in $y$ are in $S$ or $T$ , we have exhausted

all possible cases. Therefore, $y \in Z_{j}$ . By definition, $Z_{j}$ is strongly irreducible with

$R = j - 1$ . We claim that in the language of Possible Theorem 3.1.21, it must be the

case that $A_{Z_{j}} \neq B_{Z_{j}}$ , and in fact that $A_{Z_{j}} - B_{Z_{j}} \geq 2 j - 2 = 2 R$ .
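The gluing argument above can be tried out on small finite patterns. The following is only a sketch under stated assumptions: it takes $d$ to be the sup-norm distance on $Z^{d}$, represents a pattern as a dictionary from sites to letters, and uses the helper names `sup_dist`, `in_language_Zj`, `shape_distance` and `glue`, which are illustrative and not part of the text.

```python
from itertools import combinations


def sup_dist(p, q):
    """Sup-norm distance between two sites of Z^d given as tuples (assumed metric)."""
    return max(abs(a - b) for a, b in zip(p, q))


def in_language_Zj(pattern, j):
    """A finite pattern is legal for Z_j iff any two distinct ones are at distance >= j.

    Sites not listed in the pattern are treated as zeros, which is how a legal
    finite pattern extends to a point of Z_j.
    """
    ones = [site for site, letter in pattern.items() if letter == 1]
    return all(sup_dist(p, q) >= j for p, q in combinations(ones, 2))


def shape_distance(S, T):
    """Distance between two finite shapes: the minimum over pairs of sites."""
    return min(sup_dist(p, q) for p in S for q in T)


def glue(x, x_prime):
    """The word y of the argument above: x on S, x' on T, zeros everywhere else."""
    y = dict(x)
    y.update(x_prime)
    return y


j = 3
x = {(0, 0): 1, (0, 1): 0, (3, 0): 1}          # a legal pattern on S
x_prime = {(6, 0): 1, (6, 3): 1, (7, 1): 0}    # a legal pattern on T, d(S, T) = 3 > j - 1
too_close = {(5, 0): 1}                        # at distance 2 = j - 1 from S

print(in_language_Zj(glue(x, x_prime), j))     # gluing succeeds
print(in_language_Zj(glue(x, too_close), j))   # gluing fails at distance j - 1
```

The second call illustrates why the filling length cannot be taken smaller than $j - 1$ in this example.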

To prove this, we first make a definition: for any $n > 0$ , we define the word $a_{n}$ with

shape $Γ_{n j + 1}$ by $a_{n} (p) = 1$ if and only if all coordinates of $p$ are equal to one modulo

$j$ . $a_{n} \in L (Z_{j})$ since it contains no two ones a $d$ -distance of less than $j$ apart. Let

us also for any $n > 0$ define the word $b_{n}$ with shape $Γ_{(n + 2) j - 1}$ where $b_{n}$ has $a_{n}$ as

a subword occupying the central $Γ_{n j + 1}$ , and has zeroes on all of $Γ_{(n + 2) j - 1}^{(j - 1)}$ . By the

definition of $Z_{j}$ , any occurrence of $a_{n}$ in an element of $Z_{j}$ forces an occurrence of $b_{n}$

containing it. Therefore, as argued for $f_{j} (Y)$ , $(Z_{j})_{a_{n}} = (Z_{j})_{b_{n}}$ for any $n$ . However,

the size of the shape of $b_{n}$ is $2 j - 2$ bigger than that of $a_{n}$ for any $n$ , and so for any

$A_{Z_{j}}$ , $B_{Z_{j}}$ , $C_{Z_{j}}$ , and $D_{Z_{j}}$ which satisfy Possible Theorem 3.1.21, it must be the case

that $A_{Z_{j}} - B_{Z_{j}} \geq 2 j - 2 = 2 R$ for any $j$ . Although this does not show that the

bounds in Theorem 3.1.22 are optimal, it does show that $A_{X} - B_{X}$ must be at least

linear in $R$ , and so the value of $46 R + 70$ for $A_{X} - B_{X}$ attained in Theorem 3.1.22

can be improved by at most additive and multiplicative constants.
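The claim that an occurrence of $a_{n}$ forces the surrounding border of zeros can be checked by brute force for small parameters. This is a minimal sketch, assuming the sup-norm distance and placing $a_{n}$ on the cube $\{1, ..., n j + 1\}^{d}$; the function names are hypothetical.

```python
from itertools import product


def sup_dist(p, q):
    """Sup-norm distance between two sites of Z^d (assumed metric)."""
    return max(abs(a - b) for a, b in zip(p, q))


def ones_of_a_n(n, j, d):
    """Sites of {1, ..., nj+1}^d all of whose coordinates equal 1 modulo j."""
    residues = range(1, n * j + 2, j)
    return list(product(residues, repeat=d))


def forced_zero(site, ones, j):
    """A site must carry a zero if some one of a_n lies at distance less than j."""
    return any(sup_dist(site, q) < j for q in ones)


n, j, d = 2, 3, 2
ones = ones_of_a_n(n, j, d)

# b_n occupies {lo, ..., hi}^d, with a_n sitting in its centre.
lo, hi = 1 - (j - 1), n * j + 1 + (j - 1)
border = [p for p in product(range(lo, hi + 1), repeat=d)
          if not all(1 <= c <= n * j + 1 for c in p)]

print(all(forced_zero(p, ones, j) for p in border))   # every border site is forced
print((hi - lo + 1) - (n * j + 1), 2 * (j - 1))        # size gap versus 2R
```

For these parameters the size of $b_{n}$ exceeds that of $a_{n}$ by exactly $2 j - 2 = 2 R$, matching the computation in the text.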

Before beginning to prove our main result, we now briefly outline the content of this

paper. The main idea in the proof of Theorem 3.1.22 is to use strong irreducibility

to find a word $w^{'} \neq w$ where $w$ and $w^{'}$ agree on $Γ_{n}^{(t)}$ , and then define a map which

replaces all occurrences of $w$ by $w^{'}$ in large words $v \in L_{Γ_{k}} (X)$ . In order to know how

many-to-one this map is, it is important to know how many times $w$ and $w^{'}$ appear in

a "typical" $v \in L_{Γ_{k}} (X)$ . In Section 3.2, we use ergodic measures of maximal entropy

to show that for fixed $n$ , any $w \in L_{Γ_{n}} (X)$ , and large $k$ , there are "many" words $v$

in $L_{Γ_{k}} (X)$ for which we have good bounds from above and below on the number of

occurrences of $w$ in $v$ . Namely, for a large set of $v$ , the number of occurrences of $w$ is

roughly $k^{d} \bar{μ} ([w])$ , where $\bar{μ}$ is an ergodic measure of maximal entropy on $X$ . We then

use a result of Burton and Steif ([BuS1], [BuS2]) to give upper and lower bounds on

$\bar{μ} ([w])$ of the type $e^{- h^{t o p} (X) (n + A)^{d}}$ .

There is a slight problem though; it is entirely possible that a replacement of $w$

by $w^{'}$ could create a new occurrence of $w$ , which could make this replacement map

extremely unwieldy and hard to deal with. What we would like is a word $w^{'}$ such

that replacing $w$ by $w^{'}$ can never create a new occurrence of $w$ . Unfortunately, this

is impossible for some choices of $w$ . In Section 3.3, we prove a slightly weaker fact

that is still sufficient for our purposes: for any $w \in L_{Γ_{n}} (X)$ , there exists a subword

$w^{'} \in L_{Γ_{m}} (X)$ with $m$ not much smaller than $n$ for which there exists $w^{''} \neq w^{'}$ where

$w^{'}$ and $w^{''}$ agree on $Γ_{m}^{(t)}$ and where a replacement of $w^{'}$ by $w^{''}$ can never create a new

occurrence of $w^{'}$ .

Section 3.4 contains the proof of Theorem 3.1.22. The proof is done by considering

maps between $L (X)$ and $L (X_{w})$ which either introduce or destroy occurrences of $w$

(which is done by using the result of Section 3.3 to perform various replacements) , and

estimating the sizes of images or preimages under these maps. We can make certain

helpful assumptions about the words in which these replacements take place (such as

the approximate number of occurrences of $w$ ) by using the results of Section 3.2.

In Section 3.5, we deal with a simplified version of our main result where the word $w$ in

question is already chosen to have a nice property (namely, that there exists $w^{'}$ such

that replacing $w$ by $w^{'}$ can neither accidentally create new $w$ or $w^{'}$ nor destroy existing

$w$ or $w^{'}$ ). In this case, we prove that $(\ln 2) {\bar{μ}}_{w} ([w^{'}]) \leq h^{t o p} (X) - h^{t o p} (X_{w}) \leq 2 (\ln 2) \bar{μ} ([w])$

for any $\bar{μ}$ an ergodic measure of maximal entropy on $X$ and ${\bar{μ}}_{w}$ an ergodic measure

of maximal entropy on $X_{w}$ . We then show that it is not possible to prove a lower

bound on $h^{t o p} (X) - h^{t o p} (X_{w})$ of the form $C \bar{μ} ([w])$ for constant $C$ .

In Section 3.6, we prove a corollary using our methods: for any alphabet $A$ , and

given any set of words $w_{1}$ , . . . , $w_{m}$ where the size of $w_{i + 1}$ is much greater than that

of $w_{i}$ for $1 \underline{<} i < m$ , $Y_{w_{1}, ..., w_{m}} \neq \emptyset$ , where $Y = Ω$ is the full shift. This can be seen

to be related to some of the so-called undecidability questions in $Z^{d}$ shifts of finite

type; given a set $F$ of words, it is algorithmically undecidable whether or not $Ω_{F}$

is nonempty. The corollary described gives a condition under which this question is

easily answered.

Finally, in Section 3.7, we give some open questions that arise in the course of proving

our results.

3.2 Some measure-theoretic preliminaries

We will now move towards proving Theorem 3.1.22. We first need a lemma about

strongly irreducible systems:

Lemma 3.2.1. For any $Z^{d}$ -subshift $X$ which is strongly irreducible with uniform filling

length $R$ and for any rectangular prism $R_{j_{1}, j_{2}, ..., j_{d}}$ ,

$e^{h^{t o p} (X) j_{1} j_{2} ... j_{d}} \leq H_{R_{j_{1}, j_{2}, ..., j_{d}}} (X) \leq e^{h^{t o p} (X) (j_{1} + R) (j_{2} + R) ... (j_{d} + R)}$ .

Proof: We first prove the first inequality: take any rectangular prism $R_{j_{1}, ..., j_{d}}$ . For

any $k \in N$ , since $R_{k j_{1}, ..., k j_{d}}$ can be partitioned into $k^{d}$ disjoint copies of $R_{j_{1}, ..., j_{d}}$ ,

$H_{R_{k j_{1}, ..., k j_{d}}} (X) \leq H_{R_{j_{1}, ..., j_{d}}} (X)^{k^{d}}$ . Take natural logarithms of both sides and divide by

$k^{d} j_{1} ... j_{d}$ :

$\frac{h_{R_{k j_{1}, ..., k j_{d}}} (X)}{k^{d} j_{1} ... j_{d}} \leq \frac{h_{R_{j_{1}, ..., j_{d}}} (X)}{j_{1} ... j_{d}}$ ,

and by letting $k$ approach infinity, we see that $h^{t o p} (X) \leq \frac{h_{R_{j_{1}, ..., j_{d}}} (X)}{j_{1} ... j_{d}}$ , which implies

that $e^{h^{t o p} (X) j_{1} ... j_{d}} \leq H_{R_{j_{1}, ..., j_{d}}} (X)$ .

We prove the second inequality in almost the same way: again consider any rectangular

prism $R_{j_{1}, ..., j_{d}}$ . For any positive integer $k$ , $R_{k (j_{1} + R), k (j_{2} + R), ..., k (j_{d} + R)}$ can be

broken into $k^{d}$ copies of $R_{j_{1} + R, ..., j_{d} + R}$ , and we can choose copies of $R_{j_{1}, ..., j_{d}}$ inside each

$R_{j_{1} + R, ..., j_{d} + R}$ so that the $d$ -distance between any pair of these $R_{j_{1}, ..., j_{d}}$ is greater than

$R$ . This is illustrated in Figure 3.5.

Figure 3.5: $R_{k (j_{1} + R), k (j_{2} + R), ..., k (j_{d} + R)}$

(A note on figures: all figures in this paper are drawn with $d = 2$ , but this is in some

sense without loss of generality; all of the ideas that we use can be illustrated via

two-dimensional pictures. All constructions and descriptions are carried out in full

generality.)

By strong irreducibility, for any way of filling these $k^{d}$ copies of $R_{j_{1}, ..., j_{d}}$ with words in

$L (X)$ , the remainder can be filled to make a word in $L_{R_{k (j_{1} + R), ..., k (j_{d} + R)}} (X)$ . This

implies that $H_{R_{j_{1}, ..., j_{d}}} (X)^{k^{d}} \leq H_{R_{k (j_{1} + R), ..., k (j_{d} + R)}} (X)$ . Take natural logarithms of

both sides and divide by $k^{d} (j_{1} + R) (j_{2} + R) ... (j_{d} + R)$ :

$\frac{h_{R_{j_{1}, ..., j_{d}}} (X)}{(j_{1} + R) ... (j_{d} + R)} \leq \frac{h_{R_{k (j_{1} + R), ..., k (j_{d} + R)}} (X)}{k^{d} (j_{1} + R) ... (j_{d} + R)}$ ,

and by letting $k$ approach infinity, we see that $\frac{h_{R_{j_{1}, ..., j_{d}}} (X)}{(j_{1} + R) ... (j_{d} + R)} \leq h^{t o p} (X)$ , or that

$H_{R_{j_{1}, ..., j_{d}}} (X) \leq e^{h^{t o p} (X) (j_{1} + R) (j_{2} + R) ... (j_{d} + R)}$ .

$□$
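As a sanity check on Lemma 3.2.1, one can compare both bounds with exact word counts in a case where everything is known in closed form. The sketch below uses the one-dimensional golden mean shift (binary sequences with no two adjacent ones), which is the shift $Z_{2}$ above with $d = 1$; its entropy is the logarithm of the golden ratio, and $R = 1$ is assumed to be its uniform filling length. The function names are illustrative only.

```python
import math

PHI = (1 + math.sqrt(5)) / 2
H_TOP = math.log(PHI)   # topological entropy of the golden mean shift
R = 1                   # assumed uniform filling length for this example


def count_words(j):
    """Number of 0/1 words of length j >= 1 with no two adjacent ones."""
    end0, end1 = 1, 1   # words of length 1 ending in 0 and in 1
    for _ in range(j - 1):
        end0, end1 = end0 + end1, end0
    return end0 + end1


def lemma_bounds(j):
    """Return (lower bound, exact count, upper bound) for an interval of length j."""
    return math.exp(H_TOP * j), count_words(j), math.exp(H_TOP * (j + R))


for j in (1, 2, 5, 10):
    lower, exact, upper = lemma_bounds(j)
    print(j, round(lower, 3), exact, round(upper, 3))
```

In this example the exact counts are Fibonacci numbers, and they sit strictly between the two exponential bounds for every length tested.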

To prove Theorem 3.1.22, we will be repeatedly using the concept of a measure of

maximal entropy.

Definition 3.2.2. A shift-invariant probability measure $μ$ on a subshift X is called

a measure of maximal entropy if $h_{μ} (X) = h^{t o p} (X)$ .

Such measures are said to have maximal entropy because of the following Variational

Principle:

Theorem 3.2.3. For any $Z^{d}$ -subshift X, $h^{t o p} (X) = \sup h_{μ} (X)$ , where $μ$ ranges over

all shift-invariant Borel probability measures on X. This supremum is achieved for

some $μ$ .

See [M] for a proof. A useful observation is that since the extreme points of the

set of measures of maximal entropy are ergodic, and since the set of measures of

maximal entropy is nonempty for subshifts, any subshift also has an ergodic measure

of maximal entropy. (See [Wal] for details.) We need the following proposition:

Proposition 3.2.4. ([BuS2], p. 281, Proposition 1.20) Let $μ$ be a measure of maximal

entropy for a $Z^{d}$ -shift $X$ of finite type $t$ . Then for any shape $U \subseteq Z^{d}$ , the conditional

distribution of $μ$ on $U$ given a fixed word $w \in L_{(U^{c})^{(t)}} (X)$ is uniform over all words

$x \in L_{U} (X)$ such that the word $y$ with shape $U \cup (U^{c})^{(t)}$ defined by $y (U) = x$ and

$y ((U^{c})^{(t)}) = w$ is in $L (X)$ .

We have the following convenient restatement:

Proposition 3.2.5. For any $Z^{d}$ -shift X of finite type t, and for any measure $μ$ of

maximal entropy on X, and for any word w $\in L_{S} (X)$ , and for any shape $S^{(t)} \subseteq T \subseteq S$ ,

$μ ([w]) = \frac{μ ([w (T)])}{| L_{S} (X) \cap [w (T)] |}$ . In other words, any two words with the same shape $S$ which

agree on $T$ have the same measure under $μ$ .

Proof: Consider any two words $w$ , $w^{'} \in L_{S} (X)$ with $w (T) = w^{'} (T)$ . We claim that

$((S ∖ S^{(t)})^{c})^{(t)} \underline{\subset} S^{(t)}$ . Consider $p \in ((S ∖ S^{(t)})^{c})^{(t)}$ By definition, $p \in (S ∖ S^{(t)})^{c}$

and there exists $q \in S ∖ S^{(t)}$ such that $d (p, q) \underline{<} t$ . Suppose that $p \in S^{c}$ . Then,

since $d (p, q) \underline{<} t$ and since $q \in S$ , by definition, $q \in S^{(t)}$ , a contradiction. Therefore,

$p \in S$ . Since $p \in (S ∖ S^{(t)})^{c}$ , this implies that $p \in S^{(t)}$ . Therefore, $((S ∖ S^{(t)})^{c})^{(t)} \underline{\subset}$

$S^{(t)}$ . We define $x = w (S ∖ S^{(t)})$ and $x^{'} = w^{'} (S ∖ S^{(t)})$ . Since $S^{(t)} \underline{\subset} T$ , and since

$((S ∖ S^{(t)})^{c})^{(t)} \subseteq S^{(t)}$ , $w (((S ∖ S^{(t)})^{c})^{(t)}) = w^{'} (((S ∖ S^{(t)})^{c})^{(t)})$ . Therefore, if we define

$z = w (((S ∖ S^{(t)})^{c})^{(t)})$ , then since either $x$ or $x^{'}$ could fill $S ∖ S^{(t)}$ while $z$ fills

$((S ∖ S^{(t)})^{c})^{(t)}$ , we may take $U = S ∖ S^{(t)}$ and use Proposition 3.2.4 to see that

$μ ([w]) = μ ([w^{'}])$ .

$□$

Lemma 3.2.6. For any strongly irreducible $Z^{d}$ -shift X of finite type t with uniform

filling length R, and for any measure $μ$ of maximal entropy on X, and for any word

u with shape S a subset of $Γ_{M}$ ,

$\frac{1}{H_{(S \cup (S^{c})^{(t + R)}) ∖ (S \cup (S^{c})^{(t + R)})^{(t)}} (X)} \leq μ ([u]) \leq \frac{1}{H_{S ∖ S^{(R)}} (X)}$ .

Proof: Before we begin the proof, we recall that for any positive integer $j$ , $S^{(j)}$ is

the boundary of thickness $j$ of $S$ , and that $(S^{c})^{(j)}$ is the boundary of thickness $j$ of

the complement of $S$ , i.e. the set of all points $p \notin S$ for which there exists $q \in S$ with

$d (p, q) \underline{<} j$ .

Fix $u \in L_{S} (X)$ . We begin by bounding $μ ([u])$ from below. Choose any word

$v \in L_{(S \cup (S^{c})^{(t + R)})^{(t)}} (X)$ . We claim that $d (S, (S \cup (S^{c})^{(t + R)})^{(t)}) > R$ . Consider any $p \in S$

and $q \in (S \cup (S^{c})^{(t + R)})^{(t)}$ . By definition, there exists $r \notin S \cup (S^{c})^{(t + R)}$ such that

$d (q, r) \leq t$ . Also by definition, $d (p, r) > R + t$ . Therefore, by the triangle inequality,

$d (p, q) > R$ , as claimed. This means that by strong irreducibility, there exists a

word $w_{v} \in L_{S \cup (S^{c})^{(t + R)}} (X)$ such that $w_{v} (S) = u$ and $w_{v} ((S \cup (S^{c})^{(t + R)})^{(t)}) = v$ .

By Proposition 3.2.5, $μ ([w_{v}]) = \frac{μ ([v])}{| L_{S \cup (S^{c})^{(t + R)}} (X) \cap [v] |} \geq \frac{μ ([v])}{H_{(S \cup (S^{c})^{(t + R)}) ∖ (S \cup (S^{c})^{(t + R)})^{(t)}} (X)}$ .

The inequality comes from the fact that the number of words in $L (X)$ with shape

$S \cup (S^{c})^{(t + R)}$ with the fixed word $v$ on $(S \cup (S^{c})^{(t + R)})^{(t)}$ is clearly less than or equal

to the number of words in $L (X)$ with shape $(S \cup (S^{c})^{(t + R)}) ∖ (S \cup (S^{c})^{(t + R)})^{(t)}$ . For

clarity, we rewrite:

$μ ([w_{v}]) \geq \frac{μ ([v])}{H_{(S \cup (S^{c})^{(t + R)}) ∖ (S \cup (S^{c})^{(t + R)})^{(t)}} (X)}$ . (3.3)

If we sum (3.3) over all possible choices for $v$ , then we get

$\sum_{v \in L_{(S \cup (S^{c})^{(t + R)})^{(t)}} (X)} μ ([w_{v}]) \geq \frac{\sum_{v \in L_{(S \cup (S^{c})^{(t + R)})^{(t)}} (X)} μ ([v])}{H_{(S \cup (S^{c})^{(t + R)}) ∖ (S \cup (S^{c})^{(t + R)})^{(t)}} (X)}$ . (3.4)

Note that all $w_{v}$ are distinct, and for all $w_{v}$ , $w_{v} (S) = u$ . Therefore,

$\bigcup_{v \in L_{(S \cup (S^{c})^{(t + R)})^{(t)}} (X)} [w_{v}] \subseteq [u]$ ,

and so $\sum_{v \in L_{(S \cup (S^{c})^{(t + R)})^{(t)}} (X)} μ ([w_{v}]) \leq μ ([u])$ . Since $\bigcup_{v \in L_{(S \cup (S^{c})^{(t + R)})^{(t)}} (X)} [v] = X$ ,

$\sum_{v \in L_{(S \cup (S^{c})^{(t + R)})^{(t)}} (X)} μ ([v]) = 1$ . By combining these facts with (3.4), we see that

$μ ([u]) \geq \frac{1}{H_{(S \cup (S^{c})^{(t + R)}) ∖ (S \cup (S^{c})^{(t + R)})^{(t)}} (X)}$ .

We now bound $μ ([u])$ from above. Choose any word $v \in L_{S \cup (S^{c})^{(t)}} (X)$ with $v (S) = u$ .

We wish to use Proposition 3.2.5 with $(S^{c})^{(t)}$ as our $T$ . To do so, we need to know that

$(S \cup (S^{c})^{(t)})^{(t)} \subseteq (S^{c})^{(t)} \subseteq S \cup (S^{c})^{(t)}$ . The second containment is trivial. To prove the

first, suppose that $p \in (S \cup (S^{c})^{(t)})^{(t)}$ . By definition, there then exists $q \notin S \cup (S^{c})^{(t)}$

such that $d (p, q) \leq t$ . Since $q \notin S \cup (S^{c})^{(t)}$ , every point of $S$ is at distance greater than $t$ from $q$ .

Therefore, since $d (p, q) \leq t$ , $p \notin S$ , and so $p \in (S^{c})^{(t)}$ . We now apply Proposition 3.2.5:

$μ ([v]) = \frac{μ ([v ((S^{c})^{(t)})])}{| L_{S \cup (S^{c})^{(t)}} (X) \cap [v ((S^{c})^{(t)})] |}$ . (3.5)

We claim that $d (S ∖ S^{(R)}, (S^{c})^{(t)}) > R$ ; choose any $p \in S ∖ S^{(R)}$ and $q \in (S^{c})^{(t)}$ .

Clearly $q \in S^{c}$ . If $d (p, q) \leq R$ , then $p \in S^{(R)}$ , a contradiction. Thus, $d (p, q) > R$ . This

implies that for any $v$ , and for any $v^{'} \in L_{S ∖ S^{(R)}} (X)$ , there exists $y \in L_{S \cup (S^{c})^{(t)}} (X)$ with

$y ((S^{c})^{(t)}) = v ((S^{c})^{(t)})$ and $y (S ∖ S^{(R)}) = v^{'}$ . Therefore, $| L_{S \cup (S^{c})^{(t)}} (X) \cap [v ((S^{c})^{(t)})] | \geq H_{S ∖ S^{(R)}} (X)$ .

By using this fact and summing (3.5) over all possible $v$ , we get

$\sum_{v \in L_{S \cup (S^{c})^{(t)}} (X) \cap [u]} μ ([v]) \leq \frac{\sum_{v \in L_{S \cup (S^{c})^{(t)}} (X) \cap [u]} μ ([v ((S^{c})^{(t)})])}{H_{S ∖ S^{(R)}} (X)}$ . (3.6)

Note that since all $v$ are distinct, $\bigcup_{v \in L_{S \cup (S^{c})^{(t)}} (X) \cap [u]} [v] = [u]$ , and so

$\sum_{v \in L_{S \cup (S^{c})^{(t)}} (X) \cap [u]} μ ([v]) = μ ([u])$ . Also note that since all $v$ are distinct, but all have

$v (S) = u$ , it must be the case that all $v ((S^{c})^{(t)})$ are distinct, and so

$\sum_{v \in L_{S \cup (S^{c})^{(t)}} (X) \cap [u]} μ ([v ((S^{c})^{(t)})]) \leq 1$ .

Combining these facts with (3.6), we see that

$μ ([u]) \underline{<} \frac{1}{H_{S ∖ S^{(R)}} (X)}$ ,

which, along with the lower bound on $μ ([u])$ already achieved, completes the proof.

$□$
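The two bounds of Lemma 3.2.6 can be compared with an explicit measure of maximal entropy in a one-dimensional example. The sketch below is an illustration under several assumptions that are not taken from the text: it uses the golden mean shift with $t = R = 1$, takes $S$ to be an interval of length $m$ (so that the shape in the lower bound is an interval of length $m + 2 R$ and $S ∖ S^{(R)}$ is an interval of length $m - 2 R$), and uses the standard Parry measure as the measure of maximal entropy. All function names are hypothetical.

```python
import math
from itertools import product

PHI = (1 + math.sqrt(5)) / 2   # e^{h_top} for the golden mean shift


def legal(word):
    """A 0/1 word is legal if it contains no two adjacent ones."""
    return all(a + b < 2 for a, b in zip(word, word[1:]))


def H(length):
    """Number of legal words of the given length (1 for an empty shape)."""
    if length <= 0:
        return 1
    return sum(1 for w in product((0, 1), repeat=length) if legal(w))


def parry_measure(word):
    """Measure of the cylinder [word] under the Parry measure.

    Uses the Perron eigenvector (PHI, 1) of the adjacency matrix [[1, 1], [1, 0]].
    """
    weight = {0: PHI, 1: 1.0}
    norm = PHI ** 2 + 1
    return weight[word[0]] * weight[word[-1]] / (norm * PHI ** (len(word) - 1))


def check_bounds(m, R=1):
    """Check 1/H(m + 2R) <= mu([u]) <= 1/H(m - 2R) for every legal u of length m."""
    lower, upper = 1 / H(m + 2 * R), 1 / H(m - 2 * R)
    return all(lower <= parry_measure(u) <= upper
               for u in product((0, 1), repeat=m) if legal(u))


print([check_bounds(m) for m in range(1, 9)])
```

Both bounds hold for every word tested, although neither is tight, which is consistent with the lemma giving only the order of magnitude of $μ ([u])$.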

To avoid confusion, from now on we use $\bar{μ}$ to denote an ergodic measure of maximal

entropy on a subshift $X$ , and $μ$ to denote a shift-invariant probability measure which

may or may not be of maximal entropy.

Lemma 3.2.7. For any $Z^{d}$ -subshift $X$ , and for any shift-invariant ergodic probability

measure $μ$ on $X$ , and for any finite set of words $u_{1} \in L_{S_{1}} (X)$ , $u_{2} \in L_{S_{2}} (X)$ , $...$ , $u_{j} \in$

$L_{S_{j}} (X)$ and $ε > 0$ , if we denote by $A_{k, ε, μ, u_{1}, ..., u_{j}} (X)$ the set of words in $L_{Γ_{k}} (X)$ which

have between $k^{d} (μ ([u_{i}]) - ε)$ and $k^{d} (μ ([u_{i}]) + ε)$ occurrences of $u_{i}$ for all $1 \leq i \leq j$ ,

then $\liminf_{k \to \infty} \frac{\ln | A_{k, ε, μ, u_{1}, ..., u_{j}} (X) |}{k^{d}} \geq h_{μ} (X)$ .

Proof: (For brevity, we use the shorthand notation $μ (A_{k, ε, μ, u_{1}, ..., u_{j}} (X))$ for the measure

of the union of all cylinder sets corresponding to words in $A_{k, ε, μ, u_{1}, ..., u_{j}} (X)$ .) Fix

any $ε > 0$ . Firstly, we notice that it is a simple consequence of Birkhoff's ergodic

theorem that $\lim_{k \to \infty} μ (A_{k, ε, μ, u_{1}, ..., u_{j}} (X)) = 1$ : for each $1 \underline{<} i \underline{<} j$ ,

$\lim_{k \to \infty} \frac{1}{k^{d}} \sum_{p \in Γ_{k}} χ_{[u_{i}]} (σ^{p} (x)) = μ ([u_{i}])$ for almost every $x \in X$ ,

where $σ$ represents the $Z^{d}$ -action by shifts on $X$ . Convergence almost everywhere

implies convergence in measure, meaning that

$\lim_{k \to \infty} μ \left( \left\{ x : \frac{1}{k^{d}} \sum_{p \in Γ_{k}} χ_{[u_{i}]} (σ^{p} (x)) \in (μ ([u_{i}]) - ε, μ ([u_{i}]) + ε) \text{ for } 1 \leq i \leq j \right\} \right) = 1$ . (3.7)

Denote by $A_{k, ε, μ, u_{1}, ..., u_{j}}^{'}$ $(X)$ the set whose measure is taken on the lefthand side of

(3.7). Then whether or not $x \in X$ lies in $A_{k, ε, μ, u_{1}, ..., u_{j}}^{'} (X)$ depends only on its values

on $Γ_{k + M}$ , where $M$ is the minimum integer such that all $S_{i}$ are subsets of some copy

of $Γ_{M}$ . Therefore, $A_{k, ε, μ, u_{1}, ..., u_{j}}^{'} (X)$ is a union of cylinder sets of words in $L_{Γ_{k + M}} (X)$ .

It is not hard to see that for large $k$ , any word whose cylinder set is a subset of

$A_{k, \frac{ε}{2}, μ, u_{1}, ..., u_{j}}^{'}$ $(X)$ is an element of $A_{k + M, ε, μ, u_{1}, ..., u_{j}} (X)$ . Combining this with the fact

that (3.7) is true for any $ε > 0$ , we see that $\lim_{k \to \infty} μ (A_{k, ε, μ, u_{1}, ..., u_{j}} (X)) = 1$ .

We now write out the formula for the measure-theoretic entropy of $X$ :

$h_{μ} (X, ξ) = \lim_{k \to \infty} \frac{1}{k^{d}} \sum_{u \in L_{Γ_{k}} (X)} f (μ ([u]))$ ,

where $f (x) = - x \ln x$ for $x > 0$ and $f (0) = 0$ . Since $ξ$ is a measure-theoretic

generator for the $σ$ -algebra of Borel sets in $X$ , we know that $h_{μ} (X, ξ) = h_{μ} (X)$ . (For

more information on measure-theoretic and topological generators, see [Wal].) We now

partition $L_{Γ_{k}} (X)$ into the two pieces $A_{k, ε, μ, u_{1}, ..., u_{j}} (X)$ and $A_{k, ε, μ, u_{1}, ..., u_{j}} (X)^{c}$ :

$h_{μ} (X) = \lim_{k \to \infty} \left( \frac{1}{k^{d}} \sum_{u \in A_{k, ε, μ, u_{1}, ..., u_{j}} (X)} f (μ ([u])) + \frac{1}{k^{d}} \sum_{u^{'} \in A_{k, ε, μ, u_{1}, ..., u_{j}} (X)^{c}} f (μ ([u^{'}])) \right)$ .

To estimate each summand, we use the easily checkable fact that for any set of

nonnegative reals $α_{1}$ , $α_{2}$ , $...$ , $α_{j^{'}}$ whose sum is $β$ , the maximum value of $\sum_{i = 1}^{j^{'}} f (α_{i})$

occurs when all terms are equal, and is $β \ln \frac{j^{'}}{β}$ . Using this, we see that the above sum

is bounded from above by

$\frac{1}{k^{d}} (μ (A_{k, ε, μ, u_{1}, ..., u_{j}} (X)) \ln \frac{| A_{k, ε, μ, u_{1}, ..., u_{j}} (X) |}{μ (A_{k, ε, μ, u_{1}, ..., u_{j}} (X))})$

$+ \frac{1}{k^{d}} ((1 - μ (A_{k, ε, μ, u_{1}, ..., u_{j}} (X))) \ln \frac{H_{Γ_{k}} (X)}{1 - μ (A_{k, ε, μ, u_{1}, ..., u_{j}} (X))})$ .
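The elementary fact about $f (x) = - x \ln x$ used in this estimate is easy to test numerically. The following sketch compares the value $β \ln \frac{j^{'}}{β}$ attained by the equal split with randomly generated splits of the same total; the helper names and the random sampling scheme are illustrative choices.

```python
import math
import random


def f(x):
    """The function f(x) = -x ln x, with f(0) = 0."""
    return -x * math.log(x) if x > 0 else 0.0


def entropy_sum(alphas):
    """Sum of f over a list of nonnegative reals."""
    return sum(f(a) for a in alphas)


def uniform_value(beta, count):
    """Value of the sum when all `count` terms equal beta / count."""
    return beta * math.log(count / beta)


def random_split(beta, count, rng):
    """A random list of `count` positive reals summing to beta."""
    cuts = [rng.random() + 1e-9 for _ in range(count)]
    total = sum(cuts)
    return [beta * c / total for c in cuts]


rng = random.Random(0)
beta, count = 0.7, 12
best_random = max(entropy_sum(random_split(beta, count, rng)) for _ in range(1000))
print(best_random <= uniform_value(beta, count))
```

The inequality is an instance of Jensen's inequality for the concave function $f$, so no random split can exceed the equal one.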

Since, as $k \to \infty$ , $μ (A_{k, ε, μ, u_{1}, ..., u_{j}} (X))$ approaches 1 and $\frac{\ln | L_{Γ_{k}} (X) |}{k^{d}}$ approaches $h^{t o p} (X) < \infty$ ,

the second term approaches zero as $k \to \infty$ . By replacing $μ (A_{k, ε, μ, u_{1}, ..., u_{j}} (X))$ by

1 in the limit in the first term, we see that

$h_{μ} (X) \leq \liminf_{k \to \infty} \frac{1}{k^{d}} \ln | A_{k, ε, μ, u_{1}, ..., u_{j}} (X) |$ ,

which was exactly what needed to be shown.

$□$
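Lemma 3.2.7 can be illustrated in the simplest possible setting. The sketch below assumes the full shift on $\{0, 1\}$ with $d = 1$, the uniform Bernoulli measure (whose entropy is $\ln 2$), and the single one-letter word $u = 1$ with $μ ([u]) = \frac{1}{2}$; the set of words of length $k$ with between $k (\frac{1}{2} - ε)$ and $k (\frac{1}{2} + ε)$ ones is then counted exactly with binomial coefficients. The function names are hypothetical.

```python
import math


def typical_count(k, eps, p=0.5):
    """Number of 0/1 words of length k whose number of ones lies in [k(p-eps), k(p+eps)]."""
    lo, hi = k * (p - eps), k * (p + eps)
    return sum(math.comb(k, ones) for ones in range(k + 1) if lo <= ones <= hi)


def growth_rate(k, eps):
    """The quantity ln|A_k| / k from the lemma, for this example."""
    return math.log(typical_count(k, eps)) / k


for k in (20, 50, 200, 1000):
    print(k, round(growth_rate(k, 0.1), 4))
print("ln 2 =", round(math.log(2), 4))
```

As $k$ grows the growth rate approaches $\ln 2$ from below, in line with the conclusion of the lemma for this measure.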

Corollary 3.2.8. For any strongly irreducible $Z^{d}$ -shift X of finite type t with uniform

filling length R, for any $ε > 0$ , and for any positive integers k and M, denote by

$A_{k, ε, M} (X)$ the set of words u in $L_{Γ_{k}} (X)$ with between

$k^{d} \left( \frac{1}{H_{(S_{i} \cup (S_{i}^{c})^{(t + R)}) ∖ (S_{i} \cup (S_{i}^{c})^{(t + R)})^{(t)}} (X)} - ε \right)$ and $k^{d} \left( \frac{1}{H_{S_{i} ∖ S_{i}^{(R)}} (X)} + ε \right)$ occurrences of

$u_{i} \in L_{S_{i}} (X)$ for every word $u_{i}$ with shape $S_{i} \subseteq Γ_{M}$ . Then,

$\lim_{k \to \infty} \frac{\ln | A_{k, ε, M} (X) |}{k^{d}} = h^{t o p} (X)$ .

Proof: Using a combination of Lemmas 3.2.6 and 3.2.7 for any fixed $\bar{μ}$ an ergodic

measure of maximal entropy on $X$ shows that

$\liminf_{k \to \infty} \frac{\ln | A_{k, ε, M} (X) |}{k^{d}} \geq h_{\bar{μ}} (X) = h^{t o p} (X)$ .

And by the definition of topological entropy,

$\limsup_{k \to \infty} \frac{\ln | A_{k, ε, M} (X) |}{k^{d}} \leq \limsup_{k \to \infty} \frac{h_{Γ_{k}} (X)}{k^{d}} = h^{t o p} (X)$ .

$□$

3.3 A replacement theorem

We must prove an auxiliary theorem about replacements which is mostly combinato-

rial in nature, and which will be integral in the proof of the upper bound portion of

Theorem 3.1.22.

Theorem 3.3.1. For any strongly irreducible $Z^{d}$ -Markov shift $X$ with uniform filling

length $R$ , there exists $N$ depending on $X$ such that for any choice of $w \in L_{Γ_{n}} (X)$

with $n > N$ , there exists $w^{'} \in L_{Γ_{m}} (X)$ a subword of $w$ , where $m > n - n^{1 - \frac{1}{3 d}}$ , and

$w^{''} \in L_{Γ_{m}} (X)$ so that $w^{'}$ and $w^{''}$ agree on the boundary, and with the property that

replacing an occurrence of $w^{'}$ by $w^{''}$ in any element of $X$ cannot possibly create a new

occurrence of $w^{'}$ .

Proof: First let us make sure that the statement of Theorem 3.3.1 is totally clear.

What this means is that we find $w^{'}$ , $w^{''} \in L_{Γ_{m}} (X)$ such that $w^{'}$ is a subword of $w$ , $w^{'}$

and $w^{''}$ agree on the boundary, and such that the following is true: for any $x \in X$

with $x (S) = w^{'}$ for some $S$ a copy of $Γ_{m}$ , if one defines $x^{'} \in X$ by replacing $x (S)$

by $w^{''}$ , then for any $T$ a copy of $Γ_{m}$ such that $x^{'} (T) = w^{'}$ , it must be the case that

$x (T) = w^{'}$ . We define $l = ⌈ (\frac{d \ln n}{h^{t o p} (X)})^{\frac{1}{d}} ⌉ + 1$ or $l = ⌈ (\frac{d \ln n}{h^{t o p} (X)})^{\frac{1}{d}} ⌉ + 2$ , whichever is

odd. Then, $H_{Γ_{l}} (X)$ is, by Lemma 3.2.1, at least $e^{h^{t o p} (X) l^{d}} > e^{d \ln n} = n^{d}$ . This is

greater than the number of words with shape $Γ_{l}$ which occur as subwords of $w$ , and

so therefore there exists a word $a \in L_{Γ_{l}} (X)$ which is not a subword of $w$ . This allows

us to make a definition:

Definition 3.3.2. Given a subword u of w, choose a subword of u of shape $Γ_{2 R + l}$

which does not contain any portion of the boundary of u, and use strong irreducibil-

ity to replace this subword by any word of shape $Γ_{2 R + l}$ which has a as the subword

occupying its central $Γ_{l}$ . The word that u has become after this replacement is called

a standard replacement of u, and so is any word created in this way.
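The choice of $l$ made before Definition 3.3.2 can be checked numerically: $l$ is the odd one of two consecutive integers exceeding $(\frac{d \ln n}{h^{t o p} (X)})^{\frac{1}{d}}$, which guarantees $e^{h^{t o p} (X) l^{d}} > n^{d}$. The following sketch uses hypothetical function names and, as a sample value only, the entropy $\ln 2$ of a full shift on two symbols.

```python
import math


def choose_l(n, d, h_top):
    """Return the odd integer among ceil(x) + 1 and ceil(x) + 2, x = (d ln n / h_top)^(1/d)."""
    base = math.ceil((d * math.log(n) / h_top) ** (1.0 / d))
    return base + 1 if (base + 1) % 2 == 1 else base + 2


def has_missing_word(n, d, h_top):
    """Check e^{h_top * l^d} > n^d, so some word of shape Gamma_l is not a subword of w."""
    l = choose_l(n, d, h_top)
    return math.exp(h_top * l ** d) > n ** d


h_top = math.log(2)   # sample entropy value, an assumption for illustration
print(choose_l(100, 2, h_top), has_missing_word(100, 2, h_top))
```

Since a word of shape $Γ_{n}$ has at most $n^{d}$ subwords of shape $Γ_{l}$, the inequality is exactly what the pigeonhole argument in the proof requires.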

Choosing the $Γ_{2 R + l}$ to be replaced to be disjoint from the boundary of $u$ ensures that

any standard replacement of $u$ agrees on the boundary with $u$ . This means that since

$X$ is a Markov shift, any occurrence of $u$ in a point $x \in X$ can be replaced by a

standard replacement $u^{'}$ , and the resulting element of $Ω$ is still in $X$ .

Figure 3.6: A standard replacement of $u$

Definition 3.3.3. A subword u $\in L_{Γ_{k}} (X)$ of w has Property A if for every standard

replacement $u^{'}$ of $u$ , replacing $u$ by $u^{'}$ in some $x \in X$ could possibly create a new

occurrence of $u$ . In other words, for every standard replacement $u^{'}$ of $u$ , there exists

$x \in X$ and $S$ , $T$ copies of $Γ_{k}$ such that $x (S) = u$ , $x (T) \neq u$ , and if we define $x^{'} \in X$

by replacing $x (S)$ by $u^{'}$ , then $x^{'} (T) = u$ .

Clearly, if we can find $u$ a subword of $w$ which is large enough and does not have

Property $A$ , we will be done.

The organization of this induction is a bit odd, so we give a quick overview. We will

construct a sequence $w_{j}$ of subwords of $w$ , where $w_{1} = w$ and each is a subword of

the previous. Each $w_{j}$ has shape a cube. We will be able to show that one of these $w_{j}$

does not have Property $A$ , and that it is large enough in the sense of Theorem 3.3.1.

However, the construction of these $w_{j}$ depends on a fact about periodicity of subwords

of $w$ which have Property $A$ , which we must prove first. We will set up and prove

this periodicity fact for any $w_{j}$ even though $w_{j}$ has not yet been defined for $j > 1$ .

This is not a problem though; this proof depends only on $w_{j}$ being a subword of $w$ ,

which will be clearly true once the definitions of $w_{j}$ are able to be made.

Let's assume that $w_{j}$ has Property A for some $j$ . For the purpose of the following

constructions, it is helpful to picture $w_{j}$ as fixed in space, so say that $w_{j}$ has shape

$Γ_{n_{j}}$ , i.e. the cube with size $n_{j}$ which lies entirely within the positive octant of $Z^{d}$

and has a vertex at (1, 1, $...$ , 1). To make sure to distinguish it from other copies of

$Γ_{n_{j}}$ , we call the shape that this fixed occurrence of $w_{j}$ occupies $B_{j}$ . We take a copy

of $Γ_{3 (2 R + l) ⌈ n_{j}^{1 - \frac{1}{2 d}} ⌉}$ or $Γ_{3 (2 R + l) (⌈ n_{j}^{1 - \frac{1}{2 d}} ⌉ - 1)}$ (call it $A_{j}$ , and its size $k_{j}$ ), where the size

is chosen to have the same parity as $n_{j}$ , and which is central in $B_{j}$ , and partition

it into $(3 (⌈ n_{j}^{(1 - \frac{1}{2 d})} ⌉))^{d}$ or $(3 (⌈ n_{j}^{(1 - \frac{1}{2 d})} ⌉) - 1)^{d}$ disjoint copies of $Γ_{2 R + l}$ . Consider only

the interior copies of $Γ_{2 R + l}$ in this partition, i.e. the ones which are disjoint from the

boundary of $A_{j}$ . For large $n_{j}$ , there are more than $2^{d} n_{j}^{d - \frac{1}{2}}$ of these, and to each one

we may associate a standard replacement of $w_{j}$ , and since $w_{j}$ has Property $A$ , also

a copy of $Γ_{n_{j}}$ which overlaps $B_{j}$ which could somehow be filled with $w_{j}$ at the same

time that $B_{j}$ is filled with the standard replacement in question. We use $R_{j}$ to denote

the set (possibly with repetitions) of these copies of $Γ_{n_{j}}$ .

Recall that each of these represents a possible new occurrence of $w_{j}$ created by performing

a standard replacement. Since any of the standard replacements that we

would perform is done by changing only letters in a particular $Γ_{2 R + l}$ in $B_{j}$ , it must

be the case that every element of $R_{j}$ contains some portion of its associated $Γ_{2 R + l}$ .

(Otherwise, the supposed "new" occurrence of $w_{j}$ would have already existed before

the replacement.) Also, since every standard replacement results in some occurrence

of $a$ in $A_{j}$ , and since $a$ is not a subword of $w_{j}$ , none of the elements in $R_{j}$ contains

all of its associated $Γ_{2 R + l}$ . In Figure 3.7, the element of $R_{j}$ shown, which we call

$S$ , is associated to the standard replacement given by changing the letters in the

highlighted $Γ_{2 R + l}$ .

Figure 3.7: A sample element $S$ of $R_{j}$

We first make the claim that $R_{j}$ contains no repeated elements, i.e. that two distinct

copies of $Γ_{2 R + l}$ must be associated to distinct elements of $R_{j}$ .

In Figure 3.8, suppose that $S$ is associated to the standard replacements correspond-

ing to each of $U$ and $V$ , the two highlighted copies of $Γ_{2 R + l}$ . This means that there

Figure 3.8: An element $S$ of $R_{j}$ associated to two standard replacements

exists a word $u \in L_{B_{j} \cup S} (X)$ with $u (S) = w_{j}$ and $u (B_{j}) = w_{j}^{'}$ , where $w_{j}^{'}$ is a standard

replacement of $w_{j}$ made by changing only the letters in $U$ . The letters in $U \cap S$ must

be changed from their values in $w_{j}$ in order for the occurrence of $w_{j}$ in $S$ to be "new."

Therefore, $u (B_{j} ∖ U) = w_{j} (B_{j} ∖ U)$ , and $u (U \cap S) \neq w_{j} (U \cap S)$ . Similarly, there exists

a word $u^{'} \in L_{B_{j} \cup S} (X)$ with $u^{'} (S) = w_{j}$ , $u^{'} (B_{j} ∖ V) = w_{j} (B_{j} ∖ V)$ , and $u^{'} (V \cap S)$ $\neq$

$w_{j} (V \cap S)$ . Since $U$ and $V$ are disjoint, this implies that $u (U \cap S) \neq w_{j} (U \cap S)$ and

$u^{'} (U \cap S) = w_{j} (U \cap S)$ , and so that $u (U \cap S)$ $\neq u^{'} (U \cap S)$ . But $u (S) = u^{'} (S) = w_{j}$ ,

and since $U \cap S \subseteq S$ we have a contradiction. Now we know that $R_{j}$ consists of more

than $2^{d} n_{j}^{d - \frac{1}{2}}$ distinct copies of $Γ_{n_{j}}$ .

Definition 3.3.4. A suboctant of $B_{j}$ is a subcube of $B_{j}$ which shares vertices with

both $B_{j}$ and $A_{j}$ , is contained in $B_{j}$ , and whose intersection with $A_{j}$ contains only the

shared vertex. (See Figure 3.9.)

Figure 3.9: The suboctants of $B_{j}$

It is fairly clear upon observation that each element of $R_{j}$ must contain some suboc-

tant of $B_{j}$ , since each element of $R_{j}$ contains some portion of $A_{j}$ . Since $R_{j}$ has more

than $2^{d} n_{j}^{d - \frac{1}{2}}$ distinct elements, there is some suboctant $O_{j}$ which is contained in more

than $n_{j}^{d - \frac{1}{2}}$ of the copies of $Γ_{n_{j}}$ in $R_{j}$ .

We now prove a fact about the periodicity of a set which is contained in two elements

of $R_{j}$ . Suppose that two distinct elements $S$ and $S^{'}$ of $R_{j}$ are given such that $S - S^{'}$ ,

which we denote by $v$ , is a multiple of $e_{i}$ for some $1 \leq i \leq d$ . We make the additional

assumption that both $S$ and $S^{'}$ contain some suboctant $O$ of $B_{j}$ , and assume without

loss of generality that $O$ is the least suboctant lexicographically of $B_{j}$ , i.e. $O$ has the

point (1, $...$ , 1) as a vertex, and that $v$ is a negative multiple of $e_{i}$ . (This is without

loss of generality because one can make these assumptions simply by renaming $S$ and

$S^{'}$ and/or reflecting $B_{j}$ several times, which does not affect our proof.) We will show

that certain portions of $w_{j}$ must be periodic based just on these hypotheses.

Figure 3.10: Elements $S$ , $S^{'}$ of $R_{j}$ whose difference is a multiple of $e_{i}$

In Figure 3.10, $t$ is any point of $B_{j}$ such that $t + v$ is also in $B_{j}$ , $O^{'}$ is the suboctant

of $B_{j}$ opposite $O$ , $q$ is the vertex shared by $O$ and $B_{j}$ , $r$ is the vertex of $B_{j}$ opposite

$q$ , $p$ is the vertex of $S$ corresponding to $r$ in $B_{j}$ , and $v^{'} = t - (p + v)$ . $P$ and $P^{'}$ are

subcubes of $O$ and $O^{'}$ , respectively, which share vertices with $A_{j}$ and whose size is

$\frac{n_{j} - k_{j}}{2} - k_{j}$ , or the size of $O$ minus the size of $A_{j}$ . (The dotted line portions are copies

of $A_{j}$ to remind us of the definition of $P$ and $P^{'}$ , but are themselves irrelevant and

are not named.) We define $t^{'}$ to be $r + v^{'}$ , giving us two new points $t^{'}$ and $t^{'} + v$ .

Since $S$ , $S^{'} \in R_{j}$ , they may each be filled with $w_{j}$ when $B_{j}$ is filled with the correct

standard replacement of $w_{j}$ . Call $U$ the copy of $Γ_{2 R + l}$ on which $w_{j}$ is altered to make

the standard replacement which allows $S$ to be filled with $w_{j}$ , and $V$ the copy of

$Γ_{2 R + l}$ on which $w_{j}$ is altered to make the standard replacement which allows $S^{'}$ to be

filled with $w_{j}$ . (A quick note: in several figures in this paper, we represent a point

$p \in Z^{d}$ by the vector which points from 0 to $p .$ )

We first claim that if $t$ , $t + v \in P$ , then $t^{'}$ , $t^{'} + v \in B_{j} ∖ (U \cup V)$ . Suppose that

$t$ , $t + v \in P$ . Since $S^{'}$ contains some portion of $V$ , but does not contain all of $V$ , the

same can be said of $A_{j}$ ; $S^{'}$ contains some portion of $A_{j}$ , but not all of $A_{j}$ . Therefore,

some coordinate of $p + v$ is between $\frac{n_{j} - k_{j}}{2} + 1$ and $\frac{n_{j} + k_{j}}{2}$ . (We always use the word

"between" in the inclusive sense.) The fact that $S^{'}$ has nonempty intersection with

$A_{j}$ also implies that all coordinates of $p + v$ are between $\frac{n_{j} - k_{j}}{2} + 1$ and $n_{j}$ , since all

coordinates of all elements of $A_{j}$ are at least $\frac{n_{j} - k_{j}}{2} + 1$ . Since $t \in P$ , every coordinate

of $t$ is between $k_{j} + 1$ and $\frac{n_{j} - k_{j}}{2}$ . This implies that every coordinate of $v^{'}$ is nonpositive

and at least $- n_{j} + 1$ , and also that one coordinate of $v^{'}$ is at least $- \frac{n_{j} - k_{j}}{2} + 1$ . Then,

$t^{'} = r + v^{'}$ is in $B_{j}$ , and has one coordinate at least $\frac{n_{j} + k_{j}}{2} + 1$ , and therefore does not

lie in $A_{j}$ , and so is not in $U$ or $V$ . Since $t + v \in P$ , the same argument shows that

$t^{'} + v$ is in $B_{j}$ , but does not lie in $U$ or $V$ . Clearly, since $U$ and $V$ are disjoint from

the boundary of $A_{j}$ by their construction, $t$ , $t + v$ are not in $U$ or $V$ either.

Suppose that $t$ , $t + v$ , $t^{'}$ , $t^{'} + v \in B_{j} ∖ (U \cup V)$ . Then by noting that $S$ can be filled with

$w_{j}$ when $B_{j}$ is filled with a standard replacement of $w_{j}$ which agrees with $w_{j}$ outside

of $U \cup V$ , we can infer that $w_{j} (t) = w_{j} (t^{'} + v)$ (this is because $t \in S$ corresponds

to $t^{'} + v \in B_{j}$ ), and by noting that $S^{'}$ can be filled with $w_{j}$ when $B_{j}$ is filled with

a standard replacement of $w_{j}$ which agrees with $w_{j}$ outside of $U \cup V$ , we infer that

$w_{j} (t) = w_{j} (t^{'})$ and $w_{j} (t + v) = w_{j} (t^{'} + v)$ . (This is because $t \in S^{'}$ corresponds to

$t^{'} \in B_{j}$ , and $t + v \in S^{'}$ corresponds to $t^{'} + v \in B_{j}$ .) This implies that $w_{j} (t) = w_{j} (t + v)$ .

This conclusion was dependent only on the fact that $t$ , $t + v$ , $t^{'}$ , $t^{'} + v \in B_{j} ∖ (U \cup V)$ .

We have already shown that this is true for any $t$ , $t + v \in P$ , and so $w_{j} (P)$ is periodic

with respect to $v$ .
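The notion of periodicity used here is easy to test mechanically. The sketch below is ours: it stores a word as a Python dictionary from the points of its shape to letters (an assumed representation, not one used in the text) and checks that $w (t) = w (t + v)$ whenever $t$ and $t + v$ both lie in the shape.

```python
from itertools import product


def is_periodic(word, v):
    """word: dict mapping points (integer tuples) to letters.
    True if word[t] == word[t + v] whenever t and t + v are both in the shape."""
    for t, letter in word.items():
        shifted = tuple(a + b for a, b in zip(t, v))
        if shifted in word and word[shifted] != letter:
            return False
    return True


# Illustrative word on a 4 x 4 square: the letter depends only on the
# parity of the first coordinate, so (2, 0) is a period and (1, 0) is not.
stripes = {(x, y): x % 2 for x, y in product(range(1, 5), repeat=2)}
```

With this example, `is_periodic(stripes, (2, 0))` and `is_periodic(stripes, (0, 1))` hold, while `is_periodic(stripes, (1, 0))` fails.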

Note that in the course of this argument, we have also shown that $w_{j} (t^{'}) = w_{j} (t^{'} + v)$ .

We claim that any pair of points in $P^{'}$ which are separated by $v$ are $t^{'}$ and $t^{'} + v$

for some choice of $t$ , $t + v \in B_{j} ∖ (U \cup V)$ . To do this, we just determine $t$ from $t^{'}$ :

if $t^{'} \in P^{'}$ , then every coordinate of $v^{'}$ is between $- \frac{n_{j} - k_{j}}{2}$ and $- k_{j} + 1$ . As already

mentioned, all coordinates of $p + v$ are between $\frac{n_{j} - k_{j}}{2} + 1$ and $n_{j}$ , and one coordinate

is between $\frac{n_{j} - k_{j}}{2} + 1$ and $\frac{n_{j} + k_{j}}{2}$ . This implies that $p + v + v^{'} = t$ has all coordinates

between 1 and $n_{j}$ , and so is in $B_{j}$ , and that one coordinate is between 1 and $\frac{n_{j} - k_{j}}{2} + 1$ ,

which implies that $t$ either does not lie in $A_{j}$ or lies in the boundary of $A_{j}$ , and, in

either case does not lie in $U$ or $V$ . Since $t^{'} + v \in P^{'}$ , the same argument shows that

$t + v$ is in $B_{j}$ , but does not lie in $U$ or $V$ . We have then shown that $t^{'}$ and $t^{'} + v$ could

be any pair of points in $P^{'}$ separated by $v$ , and so the fact that $w_{j} (t^{'}) = w_{j} (t^{'} + v)$ in

the above argument shows that $w_{j} (P^{'})$ is also periodic with respect to $v$ .

We will use this fact momentarily; for now let's recall that we earlier showed that

some suboctant $O_{j}$ of $B_{j}$ is contained in more than $n_{j}^{d - \frac{1}{2}}$ distinct elements of $R_{j}$ . As

before, denote by $q$ the vertex of $B_{j}$ shared by $B_{j}$ and $O_{j}$ , and by $r$ the vertex of $B_{j}$

opposite $q$ . It is clear upon observation that for any element $S$ of $R_{j}$ which contains

$O_{j}$ , the vertex of $S$ which corresponds to $r$ in $B_{j}$ must be contained in $B_{j}$ . Fix any $e_{i}$

in $Z^{d}$ . $B_{j}$ can be partitioned into $n_{j}^{d - 1}$ sets $\{w + m e_{i}\}_{0 \leq m < n_{j}}$ where $w$ ranges over all

points $w$ in $B_{j}$ with $w_{i} = 1$ , and by the above comments, there are more than $n_{j}^{d - \frac{1}{2}}$

distinct points in $B_{j}$ which are vertices of elements of $R_{j}$ which contain $O_{j}$ . By the

pigeonhole principle, this implies that one of the sets $\{w + m e_{i}\}_{0 \leq m < n_{j}}$ contains more

than $\sqrt{n_{j}}$ of these vertices. Again by the pigeonhole principle, this implies that there

are two elements of $R_{j}$ , call them $S$ and $S^{'}$ , which contain $O_{j}$ and where $S^{'} - S$ is a

multiple of $e_{i}$ whose length is less than $\sqrt{n_{j}}$ . We make the notation $v_{i}^{(j)} : = S^{'} - S$ .

This in turn implies that $w_{j}$ is periodic with respect to $v_{i}^{(j)}$ on the regions above

described as $P$ and $P^{'}$ , and since these regions are dependent on $O_{j}$ and $O_{j}^{'}$ , we call

them $P_{j}$ and $P_{j}^{'}$ . Since this can be done for all $1 \leq i \leq d$ , $w_{j} (P_{j})$ and $w_{j} (P_{j}^{'})$ are

periodic with respect to $v_{1}^{(j)}$ , $v_{2}^{(j)}$ , . . . , $v_{d}^{(j)}$ where for each $1 \leq i \leq d$ , $v_{i}^{(j)}$ is a multiple

of $e_{i}$ with length less than $\sqrt{n_{j}}$ . This implies that their lengths are less than $\sqrt{n}$ as

well, since $n_{j} \leq n$ .
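The two pigeonhole steps above can be illustrated on a concrete point set. The following Python sketch is ours and makes no claim about the constants in the text: it groups a set of points into lines parallel to $e_{i}$ , picks a line holding the most points, and reports the smallest gap between two points on that line. If the line holds $k$ points inside a cube of size $n$ , that gap is at most $(n - 1) / (k - 1)$ .

```python
from collections import defaultdict


def closest_pair_on_fullest_line(points, i):
    """points: iterable of distinct integer tuples; i: 0-based coordinate index.
    Groups the points into lines parallel to e_i, takes a line holding the
    most points, and returns (points_on_line, smallest_gap_along_e_i)."""
    lines = defaultdict(list)
    for p in points:
        key = p[:i] + p[i + 1:]        # every coordinate except the i-th
        lines[key].append(p[i])
    fullest = max(lines.values(), key=len)
    coords = sorted(fullest)
    gaps = [b - a for a, b in zip(coords, coords[1:])]
    return len(coords), (min(gaps) if gaps else None)
```

As a usage example, take $n = 16$ , $d = 2$ and the $86 > 16^{3/2}$ points $(x, y)$ of $Γ_{16}$ with $x + 2 y$ divisible by 3. The fullest horizontal line holds 6 points, more than $\sqrt{16}$ , and two of them differ by 3, less than $\sqrt{16}$ .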

Definition 3.3.5. A word $w$ is purely periodic if for some positive integers $n_{1}$ , ..., $n_{d}$ ,

it is periodic with respect to $n_{i} e_{i}$ for $1 \leq i \leq d$ .

We have then shown that if $w_{j}$ has Property $A$ , then there are regions $P_{j}$ and $P_{j}^{'}$

as described above so that $w_{j} (P_{j})$ and $w_{j} (P_{j}^{'})$ are purely periodic with all periods

less than $\sqrt{n}$ . In particular, the preceding is true for $j = 1$ and $w_{1} = w$ . For

each $1 \leq i \leq d$ , we can then choose $v_{i}^{(1)}$ a multiple of $e_{i}$ which is a period for

$w_{1} (P_{1})$ and $w_{1} (P_{1}^{'})$ and which has length less than $\sqrt{n}$ . We now choose $w_{2}$ a subword

of $w_{1}$ . For sufficiently large $n_{1}$ , we take $w_{2}$ to be the subword of $w_{1}$ with shape

$B_{2} = Γ_{n_{1} - (10 \cdot 3^{2^{d}}) (2 R + l) ⌈ n_{1}^{1 - \frac{1}{2 d}} ⌉}$ and which still has $q$ , the vertex shared by $O_{1}$ and

$B_{1}$ , as a vertex. The purpose of this is to cause the forced purely periodic portion $P_{1}$

of $w_{1}$ to contain a purely periodic central cube within $w_{2}$ . We will choose subsequent

$w_{j}$ to be purely periodic on a central cube as well, which we will denote by $C_{j}$ , and

so $w_{j} (C_{j})$ will always be a purely periodic subword of $w_{j}$ . In fact, for each $j$ , we will

choose $w_{j + 1}$ so that $w_{j + 1} (C_{j + 1})$ is a subword of $w_{j} (C_{j})$ , and so each $w_{j} (C_{j})$ will be

periodic with respect to $v_{i}^{(1)}$ for $1 \leq i \leq d$ . Due to the construction of $w_{2}$ , we can

take $C_{2}$ to have shape $Γ_{10 \cdot 3^{2^{d}} (2 R + l) ⌈ n_{1}^{1 - \frac{1}{2 d}} ⌉}$ in $w_{2}$ . We also move $B_{2}$ in space to lie

entirely within the positive octant of $Z^{d}$ with one vertex at (1, $...$ , 1) (in other words,

$B_{2} = Γ_{n_{2}}$ ), so that the phrasing of the earlier arguments still works.

This choice of $w_{2}$ was done to create $C_{2}$ : from now on we will follow a different

inductive procedure. We need one more definition:

Definition 3.3.6. A superoctant of any $B_{j}$ ( $j > 1$ ) is a subcube which shares

vertices with both $B_{j}$ and $C_{j}$ , is contained in $B_{j}$ , and contains $C_{j}$ .

There is then a natural one-to-one correspondence between the superoctants and

suboctants of $B_{j}$ ; for every superoctant, there is exactly one suboctant which is a

subset of it. For each $j \geq 2$ where $w_{j}$ has Property $A$ , we now describe how to

construct $w_{j + 1}$ . For each such $j$ , we define $T_{j}$ to be the set of superoctants $\bar{O}$ of $B_{j}$

such that $w_{j} (\bar{O})$ is periodic with respect to $v_{i}^{(1)}$ for $1 \leq i \leq d$ . We denote the size of $C_{j}$

by $m_{j}$ . Since $w_{1} (P_{1})$ is periodic with respect to each $v_{i}^{(1)}$ , and since $w_{1} (P_{1}) = w_{2} (\bar{O})$

for a superoctant $\bar{O}$ of $B_{2}$ , we have then already shown that $T_{2}$ contains $\bar{O}$ , and

$m_{2} = 10 \cdot 3^{2^{d}} (2 R + l) ⌈ n_{1}^{1 - \frac{1}{2 d}} ⌉$ . We assume that $m_{j} \geq 10 (2 R + l) n_{1}^{1 - \frac{1}{2 d}}$ for all $j$ , and

in fact this will be clear once the construction is finished.

Let's now suppose that for some $j \geq 2$ , $w_{j}$ has Property $A$ . The idea is that we take

$w_{j + 1}$ to be a suitable subword of $w_{j}$ where $C_{j + 1}$ is smaller than $C_{j}$ , but $T_{j + 1}$ is strictly

larger than $T_{j}$ . To show this, we have to prove a couple of claims. Firstly, we claim

that any superoctant of $B_{j}$ on which $w_{j}$ is purely periodic with periods less than $\sqrt{n}$

and which contains $C_{j}$ must be periodic with respect to each $v_{i}^{(1)}$ , i.e. must be in $T_{j}$ .

Since $m_{j} \geq 10 (2 R + l) n^{1 - \frac{1}{2 d}}$ , clearly $m_{j} \geq 4 \sqrt{n}$ for all $n$ .

We now claim that if $T_{j}$ contains any pair of opposite superoctants $\bar{O}$ and $\bar{O^{'}}$ of $B_{j}$ ,

then neither of the suboctants $O$ and $O^{'}$ associated to them can possibly be contained

in any element of $R_{j}$ .

We again assume without loss of generality that $O$ is the least suboctant lexicograph-

ically of $B_{j}$ . In Figure 3.11, $w_{j} (\bar{O} \cup \bar{O^{'}})$ (the portion of $w_{j}$ shaded) is purely periodic

with periods $v_{i}^{(1)}$ , and $U$ is the $Γ_{2 R + l}$ which is changed to make the standard

replacement corresponding to $S \in R_{j}$ . We assume that $O \subset S$ and derive a contradiction.

We have denoted by $C$ the region $(U \cap S) + (B_{j} - S)$ , i.e. $C$ in $B_{j}$ corresponds to

$U \cap S$ in $S$ . $C \subset \bar{O^{'}}$ because $O \subset S$ , and for some $1 \leq i \leq d$ , it is the case that

Figure 3.11: An element $S$ of $R_{j}$ which contains $O$

$C$ consists of points whose $i$th coordinate is between $n_{j} - k_{j} + 1$ and $n_{j}$ . We then

choose $u_{i}$ which is a multiple of $v_{i}^{(1)}$ such that $C + u_{i}$ is a subset of $\bar{O^{'}}$ and $U + u_{i}$

is a subset of $\bar{O}$ which is disjoint from $U$ (this can be done since $v_{i}^{(1)}$ has length less

than $\sqrt{n}$ ; some multiple of it is greater than $k_{j}$ , but still less than $\frac{n_{j} - k_{j}}{2}$ ) and take

$D = C + u_{i}$ and $E = (U \cap S) + u_{i}$ . It must be the case that filling $B_{j}$ with a standard

replacement for $w_{j}$ in which no letters of $B_{j} ∖ U$ are changed and at least some letters

in $U \cap S$ are changed can be done simultaneously with filling $S$ with $w_{j}$ . Call the

word with shape $B_{j} \cup S$ with this property $u$ . We can restate our assumptions then

by saying $u (B_{j} ∖ U) = w_{j} (B_{j} ∖ U)$ , $u (U \cap S) \neq w_{j} (U \cap S)$ , and $u (S) = w_{j}$ . Since

$U \cap S$ in $S$ corresponds to $C$ in $B_{j}$ , and since $u (S) = w_{j}$ , it must be the case that

$u (U \cap S) = w_{j} (C)$ . By periodicity of $w_{j} (\bar{O^{'}})$ with respect to $u_{i}$ , $w_{j} (C) = w_{j} (D)$ .

Since $D$ in $B_{j}$ corresponds to $E$ in $S$ , and since $u (S) = w_{j}$ , $w_{j} (D) = u (E)$ . Since

$u (B_{j} ∖ U) = w_{j} (B_{j} ∖ U)$ and $E$ and $U$ are disjoint, $u (E) = w_{j} (E)$ . Finally, by period-

icity of $w_{j} (\bar{O})$ with respect to $u_{i}$ , $w_{j} (E) = w_{j} (U \cap S)$ . But we have then shown that

$u (U \cap S) = w_{j} (U \cap S)$ , a contradiction. Therefore, if two opposite superoctants are

contained in $T_{j}$ , then neither of the suboctants associated to them may be contained

in any element of $R_{j}$ .

Figure 3.12: $B_{j}$

Finally, we may make our inductive step. Consider any $j \geq 2$ and $w_{j}$ with Property

A. By the earlier argument, we may then conclude that there are subcubes $P_{j}$ and $P_{j}^{'}$

as defined above so that $w_{j} (P_{j})$ and $w_{j} (P_{j}^{'})$ are purely periodic with periods at most

$\sqrt{n}$ . However, in the proof of this, we made use of the fact that the suboctant $O_{j}$ of

$B_{j}$ corresponding to $P_{j}$ is contained in many elements of $R_{j}$ . Therefore, by the fact

just shown, one of $O_{j}$ or $O_{j}^{'}$ corresponds to a superoctant of $B_{j}$ which is not in $T_{j}$ .

First, we take a subword of $w_{j}$ , call it $w_{j}^{'}$ , obtained by removing the boundary of

thickness $k_{j}$ from $B_{j}$ . In other words, we take $w_{j}^{'}$ to be the subword whose shape is

the central $Γ_{n_{j} - 2 k_{j}}$ of $B_{j}$ , which we call $B_{j}^{'}$ .

Figure 3.13: $B_{j}^{'}$

$w_{j}^{'} (P_{j})$ and $w_{j}^{'} (P_{j}^{'})$ are now subwords of $w_{j}^{'}$ occupying suboctants of $B_{j}^{'}$ . By construc-

tion, $w_{j}^{'} (P_{j})$ and $w_{j}^{'} (P_{j}^{'})$ are also purely periodic with periods at most $\sqrt{n}$ . However,

one of them is associated with a superoctant which is not in $T_{j}$ , and we may as-

sume that without loss of generality it is $P_{j}$ . Also, since the size $m_{j}$ of $C_{j}$ is at

least $10 (2 R + l) n^{1 - \frac{1}{2 d}}$ , and since the size of $A_{j}$ is at most $3 (2 R + l) n^{1 - \frac{1}{2 d}}$ , the overlap

between $P_{j}$ and $C_{j}$ is a cube whose size is at least $\frac{m_{j}}{3}$ . Denote this cube by $C_{j + 1}$ .

$B_{j + 1}$ is defined to be a subcube of $B_{j}^{'}$ which shares the same vertex with $B_{j}^{'}$ that

$B_{j}^{'}$ shares with $P_{j}$ , with the correct size so that the overlap between $P_{j}$ and $C_{j}$ is

central. In doing this, we reduce the size of $w_{j}^{'}$ by at most $m_{j}$ . $w_{j + 1}$ is defined to be

$w_{j}^{'} (B_{j + 1})$ . Then, since $w_{j + 1} (C_{j + 1})$ is a subword of $w_{j} (C_{j})$ , it is also purely periodic

with respect to each $v_{i}^{(1)}$ , and by construction, $C_{j + 1}$ is central in $B_{j + 1}$ , so indeed $C_{j + 1}$

has the necessary properties. Each superoctant in $T_{j}$ corresponds to a superoctant

in $T_{j + 1}$ , and $w_{j + 1} (P_{j})$ now occupies a superoctant of $B_{j + 1}$ , call it $\bar{O}$ , on which $w_{j + 1}$

is purely periodic with periods at most $\sqrt{n}$ . We claim that this means that $\bar{O} \in T_{j + 1}$

as well. For each $1 \leq i \leq d$ , choose a period $v_{i}^{(j)}$ of $w_{j + 1} (\bar{O})$ which is a multiple

of $e_{i}$ whose length is less than $\sqrt{n}$ . For every $j$ , as already mentioned, $w_{j + 1} (C_{j + 1})$

is periodic with respect to $v_{i}^{(1)}$ for $1 \leq i \leq d$ . Now, consider any $p \in \bar{O}$ such that

$p + v_{k}^{(1)} \in \bar{O}$ for some fixed $1 \leq k \leq d$ . Since the size $m_{j + 1}$ of $C_{j + 1}$ is much greater

than $\sqrt{n}$ , there exists some linear combination $\sum_{i = 1}^{d} a_{i} v_{i}^{(j)}$ with $a_{i} \in Z$ , call it $v$ , such

that $p + v \in C_{j + 1}$ and $(p + v) + v_{k}^{(1)} \in C_{j + 1}$ . Then, $w_{j + 1} (p) = w_{j + 1} (p + v)$ since $v$ is

a sum of periods for $w_{j + 1} (\bar{O})$ . Next, $w_{j + 1} (p + v) = w_{j + 1} (p + v + v_{k}^{(1)})$ since $w_{j + 1} (C_{j + 1})$ is

periodic with respect to $v_{k}^{(1)}$ . Finally, $w_{j + 1} (p + v + v_{k}^{(1)}) = w_{j + 1} (p + v_{k}^{(1)})$ since $v$ is

a sum of periods for $w_{j + 1} (\bar{O})$ , and therefore is itself a period of $w_{j + 1} (\bar{O})$ . Then, by

definition, $w_{j + 1} (\bar{O})$ is periodic with respect to $v_{k}^{(1)}$ as well, and since $k$ was arbitrary,

with respect to every $v_{i}^{(1)}$ for $1 \leq i \leq d$ . By definition, this shows that $\bar{O} \in T_{j + 1}$ .
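The period-transfer argument just given can be tried out in a one-dimensional analogue. In the Python sketch below (ours, with hypothetical names), a word on an interval has period $a$ everywhere and period $b$ on a central window of length at least $a + b$ . For each $p$ the function finds a multiple $v$ of $a$ that moves both $p$ and $p + b$ into the window, which is the role played by the linear combination $v$ in the text.

```python
def transfer_shift(p, b, a, window):
    """Return a multiple v of a such that p + v and p + v + b both lie in
    window (a range of consecutive integers), or None if the smallest
    admissible shift does not fit. Assumes a > 0 and b > 0."""
    lo, hi = window.start, window.stop
    v = -((p - lo) // a) * a       # least multiple of a with p + v >= lo
    return v if p + v + b < hi else None


def has_period(word, b, positions):
    """True if word[p] == word[p + b] for every p with p and p + b in positions."""
    return all(word[p] == word[p + b] for p in positions if p + b in positions)
```

The chain of equalities is then `word[p] == word[p + v] == word[p + v + b] == word[p + b]`: the outer two use the period $a$ on the whole interval, and the middle one uses the period $b$ inside the window.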

Since $\bar{O}$ corresponds to a superoctant of $B_{j}$ that is not in $T_{j}$ , $T_{j + 1}$ contains at least

one more element than $T_{j}$ , and $m_{j + 1} \geq \frac{m_{j}}{3}$ . We again move $B_{j + 1}$ in space so that it

lies entirely within the positive octant of $Z^{d}$ and has one vertex at (1, $...$ , 1). Since

$m_{2} \geq 10 \cdot 3^{2^{d}} (2 R + l) n^{1 - \frac{1}{2 d}}$ , this implies that $m_{j} \geq 10 (2 R + l) n^{1 - \frac{1}{2 d}}$ for $1 \leq j \leq 2^{d}$ .

We will show that this induction never need proceed beyond $j = 2^{d}$ , and so in the

process will justify the claim already used that $m_{j} \geq 10 (2 R + l) n^{1 - \frac{1}{2 d}}$ for all $j$ that

we consider.

If each $w_{j}$ for $1 \leq j \leq 2^{d}$ has Property $A$ , this means that $T_{2^{d} + 1}$ must have more

than $2^{d}$ elements. But, there are only $2^{d}$ superoctants of $B_{2^{d} + 1}$ , and so we have a

contradiction. This implies that some $w_{j}$ must not have Property $A$ , which we call

$w^{'}$ . $w^{'}$ is created by at most $2^{d}$ truncations of $w$ , each of which reduces the size by at

most $k_{2} + m_{2}$ , and so since $k_{2} < m_{2}$ for large enough $n$ , the size $m$ of $w^{'}$ is at least

$n - 10 \cdot 2^{d + 1} 3^{2^{d}} (2 R + l) ⌈ n^{1 - \frac{1}{2 d}} ⌉$ . Since for large $n$ , $2 R + l < c \ln n$ for some constant

$c$ , we can then say that there exists $N^{'}$ for which $m > n - K \ln n \cdot n^{1 - \frac{1}{2 d}}$ for $n > N^{'}$

and some constant $K$ . Then, take $N > N^{'}$ so that $K \ln n < n^{\frac{1}{6 d}}$ for $n > N$ . This

implies that for $n > N$ , $m > n - n^{1 - \frac{1}{3 d}}$ . Since $w^{'}$ does not have Property $A$ , there

exists a standard replacement of $w^{'}$ , call it $w^{''}$ , which agrees on the boundary with $w^{'}$

by definition of standard replacement, and with the property that replacing $w^{'}$ by $w^{''}$

in some $x \in X$ cannot possibly create a new occurrence of $w^{'}$ . Thus, Theorem 3.3.1

is proved.

$□$

We can prove as a corollary a version of Theorem 3.3.1 for any strongly irreducible

shift of finite type:

Corollary 3.3.7. For any strongly irreducible $Z^{d}$ -shift $X$ of finite type $t$ with uniform

filling length $R$ , there exists $N$ depending on $X$ such that for any choice of $w \in L_{Γ_{n}} (X)$

with $n > N$ , there exists a subword $w^{'}$ of $w$ , with shape $Γ_{m}$ , where $m > n - n^{1 - \frac{1}{3 d}}$ , and

$w^{''} \in L_{Γ_{m}} (X)$ so that $w^{'}$ and $w^{''}$ agree on the boundary of thickness $t$ , and with the

property that replacing an occurrence of $w^{'}$ by $w^{''}$ in an element of $X$ cannot possibly

create a new occurrence of $w^{'}$ .

Proof: Fix $X$ a strongly irreducible $Z^{d}$ -shift of finite type $t$ and uniform filling length

$R$ . Take the alphabet $A^{(t)}$ whose letters are the elements of $L_{Γ_{t}} (X)$ , i.e. the words

in the language of $X$ whose shape is $Γ_{t}$ , and then define a map $f$ from $X$ to $(A^{(t)})^{Z^{d}}$

where for any $p \in Z^{d}$ , $(f (x)) (p)$ is the subword of $x$ which occupies $Γ_{t} + p$ , i.e. the

copy of $Γ_{t}$ whose least vertex lexicographically is $p$ . Then, $f (X)$ is a Markov shift

with alphabet $A^{(t)}$ . The reader can check that $f (X)$ is still strongly irreducible, but

with a uniform filling length of $R + t - 1$ . In short, this is because two shapes which

are a distance of $j$ apart in an element of $x$ correspond to shapes a distance of $j + t - 1$

apart in $f (x)$ for $j \in N$ .
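To make the recoding $f$ concrete, here is a minimal Python sketch of its action on finite words in dimension $d = 2$ (the restriction to $d = 2$ , the dictionary representation of words, and the function names are our assumptions). A word on a square of size $k + t - 1$ is sent to a word on a square of size $k$ whose letter at $p$ is the $t \times t$ subword with least vertex $p$ , and the original word can be read back from the blocks.

```python
def higher_block(word, k, t):
    """word: dict on {1..k+t-1}^2 -> letter. Returns a dict on {1..k}^2 whose
    letter at p is the t x t subword of `word` with least corner p, stored
    as a tuple of rows."""
    out = {}
    for x in range(1, k + 1):
        for y in range(1, k + 1):
            out[(x, y)] = tuple(
                tuple(word[(x + dx, y + dy)] for dy in range(t))
                for dx in range(t)
            )
    return out


def lower_block(blocks, t):
    """Inverse of higher_block on its image: recover the original word."""
    word = {}
    for (x, y), block in blocks.items():
        for dx in range(t):
            for dy in range(t):
                word[(x + dx, y + dy)] = block[dx][dy]
    return word
```

For example, a word on a $4 \times 4$ square with $t = 2$ is recoded to a word on a $3 \times 3$ square, and `lower_block(higher_block(word, 3, 2), 2)` returns the original word.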

Since $f (X)$ is a strongly irreducible Markov shift, by Theorem 3.3.1 there exists $N^{'}$

such that for any $w \in L_{Γ_{n}} (f (X))$ with $n > N^{'}$ , there exist $w^{'} \in L_{Γ_{m}} (f (X))$ a subword of

$w$ , with $m > n - n^{1 - \frac{1}{3 d}}$ , and $w^{''}$ with the same shape as $w^{'}$ and which agrees on the

boundary with $w^{'}$ such that replacing $w^{'}$ with $w^{''}$ in $x^{'} \in f (X)$ cannot result in the

creation of a new occurrence of $w^{'}$ .

Now, take any word $w \in L_{Γ_{n}} (X)$ with $n > N^{'} + t - 1$ . Although $f$ is defined

as a map from one subshift to another, it should be fairly clear that as defined,

it can also function as a map from $L_{Γ_{k + t - 1}} (X)$ to $L_{Γ_{k}} (f (X))$ . So we can define

$f (w) \in L_{Γ_{n - t + 1}} (f (X))$ . For ease of notation, we define $\bar{n} = n - t + 1$ , so $\bar{n} > N^{'}$ .

Then, by Theorem 3.3.1, we can find $v^{'} \in L_{Γ_{m}} (f (X))$ a subword of $f (w)$ , with

$m > \bar{n} - {\bar{n}}^{1 - \frac{1}{3 d}}$ , and $v^{''}$ which agrees with $v^{'}$ on the boundary so that replacing $v^{'}$

with $v^{''}$ in any element of $f (X)$ cannot create a new occurrence of $v^{'}$ . $f^{- 1} (v^{'})$ and

$f^{- 1} (v^{''})$ are then words with shape $Γ_{m + t - 1}$ which agree on the boundary

of thickness $t$ . We wish to show that replacing $f^{- 1} (v^{'})$ by $f^{- 1} (v^{''})$ in some $x \in X$

cannot possibly result in a new occurrence of $f^{- 1} (v^{'})$ . Suppose that this is not the

case. Then there is $x \in X$ and two copies of $Γ_{m + t - 1}$ , call them $S$ and $S^{'}$ , such that

$x (S) = f^{- 1} (v^{'})$ , $x (S^{'}) \neq f^{- 1} (v^{'})$ , and if we define $x^{'} \in X$ by replacing $x (S)$ by $f^{- 1} (v^{''})$ ,

then $x^{'} (S^{'}) = f^{- 1} (v^{'})$ . If we apply $f$ to all of the objects in the above description,

we arrive at a contradiction to the definition of $v^{'}$ and $v^{''}$ . It is therefore the case

that replacing $f^{- 1} (v^{'})$ with $f^{- 1} (v^{''})$ in any $x \in X$ cannot create a new occurrence of

$f^{- 1} (v^{'})$ . Since $v^{'}$ is a subword of $f (w)$ , $f^{- 1} (v^{'})$ is a subword of $w$ . Moreover, $m > \bar{n} - {\bar{n}}^{1 - \frac{1}{3 d}}$ ,

and the size of the shape of $f^{- 1} (v^{'})$ is $m + t - 1 > n - {\bar{n}}^{1 - \frac{1}{3 d}} \geq n - n^{1 - \frac{1}{3 d}}$ , and so we

are done.

$□$

Corollary 3.3.7 is the main tool in the proof of Theorem 3.1.22; very roughly, the

idea is that for each occurrence of $w$ in some large word in $X$ , we destroy this

occurrence of $w$ by replacing the occurrence of $w^{'}$ which is a subword of it by $w^{''}$ .

The requirement that new occurrences of the destroyed words not appear during this

process is to ensure that the process of getting rid of these occurrences of $w$ takes

as few steps as possible. It is natural to wonder why we go to the trouble of dealing

with subwords in the statement of Corollary 3.3.7; i.e., why is it not possible, instead

of dealing with the intermediate step of taking a subword of $w$ , to just find a word

$w^{'} \neq w$ such that replacing $w$ by $w^{'}$ can never result in a new occurrence of $w$ being

created? The answer is that there are examples of strongly irreducible shifts of finite

type $X$ and words $w \in L (X)$ for which this is impossible! Here is a quick example.

Consider, for any dimension $d$ and any $n > 1$ , the full shift $Y = Ω$ , and the word

$w \in L_{Γ_{n}} (Y)$ defined by $w (v) = 1$ if $v_{1} = v_{2} = ... = v_{d} = 1$ , and $w (v) = 0$ if $1 \leq v_{i} \leq n$

for $1 \leq i \leq d$ and $v_{i} > 1$ for some $1 \leq i \leq d$ . $w$ then has zeroes at every point of

$Γ_{n}$ except the least lexicographically. We claim that for any $w^{'} \neq w$ , there is some

$x \in Y$ such that $x (Γ_{n}) = w$ , and replacing $x (Γ_{n})$ by $w^{'}$ creates a new occurrence of

$w$ . Consider any $w^{'} \neq w$ . Suppose that $w^{'}$ contains a one, i.e. there is some $v \in Γ_{n}$

such that $w^{'} (v) = 1$ . Since $w^{'} \neq w$ , we can assume that $v \neq (1, 1, ..., 1)$ . Consider

$x \in Y$ defined by taking $x (1, 1, ..., 1) = 1$ and $x (v) = 0$ for all other $v \in Z^{d}$ . Then,

$x (Γ_{n}) = w$ . Create a new $x^{'} \in Y$ by replacing $x (Γ_{n})$ by $w^{'}$ . Then, it is not hard to

check that $x^{'} (Γ_{n} + (v - (1, 1, ..., 1))) = w$ , and that $x (Γ_{n} + (v - (1, 1, ..., 1)))$ is the

word consisting of all zeroes, and so not equal to $w$ . This means that the replacement

involved in changing $x$ to $x^{'}$ created a new occurrence of $w$ . The only other possibility

for $w^{'}$ is that $w^{'}$ is the word consisting of all zeroes, i.e. $w^{'} (v) = 0$ for all $v \in Γ_{n}$ . If

this is the case, then define $x \in Y$ by taking $x (0, 1, ..., 1) = x (1, 1, ..., 1) = 1$ and

$x (v) = 0$ for all other $v \in Z^{d}$ . Again, $x (Γ_{n}) = w$ . Create $x^{'} \in Y$ by replacing $x (Γ_{n})$

by $w^{'}$ . Then, it is not hard to check that $x^{'} (Γ_{n} - e_{1}) = w$ , and that $x (Γ_{n} - e_{1})$ had two

ones, and is thus not equal to $w$ . This means that in this case also, the replacement

involved in changing $x$ to $x^{'}$ created a new occurrence of $w$ . We have then shown

that for any $w^{'} \neq w$ , it is possible that a replacement of $w$ by $w^{'}$ could create a new

occurrence of $w$ , and therefore the extra step of taking subwords in Theorem 3.3.1

and Corollary 3.3.7 is in fact necessary.
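The example can be verified exhaustively in a small case. The Python sketch below is ours and fixes $d = 2$ and $n = 2$ with alphabet $\{0, 1\}$ (all assumptions made for illustration). Configurations are stored as dictionaries listing the letters that have been set, with every other point read as 0. When $w^{'}$ contains a one, the sketch picks the one $v$ with the largest coordinate sum, so that no other one of $w^{'}$ lies in $Γ_{n} + (v - (1, 1))$ ; this particular choice of $v$ is our reading of the argument.

```python
from itertools import product

N = 2                                              # side length n, any n > 1
CUBE = list(product(range(1, N + 1), repeat=2))    # Gamma_n for d = 2
W = {p: int(p == (1, 1)) for p in CUBE}            # single 1 at the least corner


def read(config, offset):
    """Word seen on Gamma_n + offset; points missing from config read as 0."""
    return {p: config.get((p[0] + offset[0], p[1] + offset[1]), 0) for p in CUBE}


def witness(w_new):
    """For w_new != W, return (x, x_new, offset): x carries W on Gamma_n,
    x_new is x with Gamma_n overwritten by w_new, and W appears in x_new
    on Gamma_n + offset although it did not appear there in x."""
    ones = [p for p in CUBE if w_new[p] == 1]
    if ones:
        v = max(ones, key=sum)          # assumed choice: one of largest coordinate sum
        x = {(1, 1): 1}
        offset = (v[0] - 1, v[1] - 1)
    else:
        x = {(0, 1): 1, (1, 1): 1}      # second case of the argument
        offset = (-1, 0)                # the shift by -e_1
    x_new = {p: val for p, val in x.items() if p not in W}   # letters outside Gamma_n
    x_new.update(w_new)                                      # overwrite Gamma_n
    return x, x_new, offset
```

Looping over all fifteen words $w^{'} \neq w$ on $Γ_{2}$ and checking `read(x_new, offset) == W` together with `read(x, offset) != W` confirms the claim in this case.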

3.4 The proof of the main result

Proof of Theorem 3.1.22: We take $X$ to be a strongly irreducible shift of finite

type $t$ . For any word $w$ in $L_{Γ_{n}} (X)$ , we call $X_{w}$ the shift of finite type resulting

from removing $w$ from the language of $X$ . We begin by proving an upper bound

on $h^{t o p} (X) - h^{t o p} (X_{w})$ for sufficiently large $n$ . By Corollary 3.3.7, as long as $n$ is

sufficiently large, there exists $w^{'} \in L_{Γ_{m}} (X)$ a subword of $w$ with $m > n - n^{1 - \frac{1}{3 d}}$ and

$w^{''}$ with the same shape as $w^{'}$ and which agrees with $w^{'}$ on $Γ_{m}^{(t)}$ with the property

that replacing $w^{'}$ by $w^{''}$ cannot create new occurrences of $w^{'}$ .

For any $k > m$ , we will define a mapping $φ_{k}$ : $L_{Γ_{k + 4 m}} (X) \to L_{Γ_{k}} (X_{w})$ . $φ_{k}$ will

actually be defined as a composition of three maps: $α_{k}$ : $L_{Γ_{k + 4 m}} (X) \to L_{Γ_{k + 4 m}} (X)$ ,

$β_{k}$ : $α_{k} (L_{Γ_{k + 4 m}} (X)) \to L_{Γ_{k + 4 m}} (X_{w})$ , and $γ_{k}$ : $(β_{k} \circ α_{k}) (L_{Γ_{k + 4 m}} (X)) \to L_{Γ_{k}} (X_{w})$ . To

define these we need a definition:

Definition 3.4.1. Two overlapping occurrences of the word $w^{'}$ which occur in copies

$S$ and $S^{'}$ of $Γ_{m}$ are said to have overlap of type B if $| S - S^{'} |_{\infty} \geq \frac{m}{2} - 2 n^{1 - \frac{1}{3 d}}$ .

We point out an important property of this definition: given a fixed occurrence of $w^{'}$ ,

there are at most $3^{d} m^{d}$ copies of $Γ_{m}$ which could be filled with an overlapping $w^{'}$ , and

so there are at most $3^{d} m^{d}$ words which are made up of a pair of $w^{'}$ whose overlap is of

type B. Call these words $u_{1}$ , $u_{2}$ , . . . , $u_{b}$ , where $b \leq 3^{d} m^{d}$ , and call $S_{i}$ the shape of $u_{i}$ for

any $1 \leq i \leq b$ . Given any $u \in L_{Γ_{k + 4 m}} (X)$ , we define $α_{k} (u)$ by finding each occurrence

of any of the $u_{i}$ within $u$ , and in each one, replacing the first lexicographically of

the two occurrences of $w^{'}$ which make up $u_{i}$ by $w^{'}$ . In order to make this operation

well-defined, we must specify an order for these replacements to be done in; let us

just choose the usual lexicographic order, and search for and replace each $u_{i}$ in turn.

(This means that we first replace the lexicographically first occurrence of $u_{1}$ , then

find the new lexicographically first occurrence of $u_{1}$ and replace it, and continue until

no $u_{1}$ remain. We then perform the same procedure for $u_{2}$ , $u_{3}$ , etc.) Since replacing

$w^{'}$ by $w^{'}$ can never create a new occurrence of $w^{'}$ , the resulting $Γ_{k + 4 m}$ -word, which

we call ( $y_{k} (u)$ , contains no $u_{i}$ , i.e. contains no pair of occurrences of $w^{'}$ with overlap

of type B.
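The sequential replacement procedure just described can be illustrated in one dimension. The following is only a minimal sketch under stated assumptions: words are Python strings, "lexicographically first" is taken to mean leftmost, and the names `eliminate_patterns`, `old`, and `new` are hypothetical stand-ins for the procedure, $w^{'}$, and its replacement word. The sketch also assumes, as in the text, that a replacement never creates a new occurrence of `old`; without that assumption the loop need not terminate.

```python
def eliminate_patterns(word, patterns, old, new):
    """Remove every occurrence of each pattern, one pattern at a time.

    For each pattern (a stand-in for some u_i), repeatedly locate its
    leftmost occurrence and overwrite the first copy of `old` inside it
    with `new`. Assumes len(old) == len(new) and that the replacement
    never creates a new occurrence of `old`.
    """
    if len(old) != len(new):
        raise ValueError("old and new must have the same length")
    for pattern in patterns:
        # Position, inside the pattern, of the first of its copies of `old`.
        offset = pattern.index(old)
        while True:
            pos = word.find(pattern)  # leftmost remaining occurrence
            if pos == -1:
                break
            start = pos + offset
            word = word[:start] + new + word[start + len(old):]
    return word


# Toy example: "ababa" is a pair of overlapping copies of "aba".
cleaned = eliminate_patterns("xababababax", ["ababa"], "aba", "aca")
```

In the toy example the letter `c` does not occur in `old`, so no replacement can create a new copy of `aba`, which mirrors the property supplied by Corollary 3.3.7.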

For any $α_{k} (u) \in α_{k} (L_{Γ_{k + 4 m}} (X))$, we define $(β_{k} \circ α_{k}) (u)$ as follows: find the first occurrence of $w$ lexicographically in $α_{k} (u)$, and replace the $w^{'}$ which is a subword of this occurrence of $w$ by $w^{''}$. Denote by $\bar{w}$ the word that $w$ becomes when its subword $w^{'}$ is replaced by $w^{''}$. Repeat this procedure until there are no occurrences of $w$ left, and call the resulting $Γ_{k + 4 m}$-word $(β_{k} \circ α_{k}) (u)$. There is one fact that we must check, namely that when performing any of these replacements, we do not accidentally create a new occurrence of $w$. For a contradiction, assume that a particular replacement of $w^{'}$ by $w^{''}$ could create a new occurrence of $w$.

Figure 3.14: Intersecting occurrences of $w$

In Figure 3.14, $S$ is a copy of $Γ_{n}$ which is filled with $w$, $T$ is a copy of $Γ_{m}$ which is filled with $w^{'}$, $U$ is the $Γ_{2 R + l}$ on which letters are changed to change the word on $T$ from $w^{'}$ to $w^{''}$, $S^{'}$ is a copy of $Γ_{n}$ which will be filled with $w$ once the replacement is done, $T^{'}$ is the copy of $Γ_{m}$ in $S^{'}$ corresponding to $T$ in $S$, which will therefore be filled with $w^{'}$ once the replacement is done, and $A_{j}$ is the copy of $Γ_{3 (2 R + l) ⌈ n^{1 - \frac{1}{2 d}} ⌉}$ or $Γ_{3 (2 R + l) (⌈ n^{1 - \frac{1}{2 d}} ⌉ - 1)}$ central in $S$ in which $U$ must lie, which comes from the proof of Theorem 3.3.1. For the same reasons as in the proof of Theorem 3.3.1, $S^{'}$ cannot contain all of $U$, but must contain some nonempty subset of $U$. Since the replacement cannot possibly create a new occurrence of $w^{'}$, $T^{'}$ must have already been filled with $w^{'}$ before the replacement occurred. This implies that before the replacement, $T$ and $T^{'}$ were filled with overlapping occurrences of $w^{'}$. We claim that these two occurrences had overlap of type B. Since $U \subseteq A_{j}$, and since the size of $A_{j}$ is at most $3 (2 R + l) ⌈ n^{1 - \frac{1}{2 d}} ⌉$, the distance in the $e_{i}$-direction between the center of $U$ and the center of $A_{j}$ is less than $\frac{3}{2} (2 R + l) n^{1 - \frac{1}{2 d}}$ for each $1 \le i \le d$. $A_{j}$ is central in $S$, so the center of $A_{j}$ is the same as the center of $S$. Since $T$ is a subcube of $S$ whose size is at most $n^{1 - \frac{1}{3 d}}$ shorter, the distance in the $e_{i}$-direction between the center of $S$ and the center of $T$ is less than $\frac{1}{2} n^{1 - \frac{1}{3 d}}$ for each $1 \le i \le d$. For the same reason, the distance in the $e_{i}$-direction between the center of $S^{'}$ and the center of $T^{'}$ is less than $\frac{1}{2} n^{1 - \frac{1}{3 d}}$ for each $1 \le i \le d$. Finally, since $U$ intersects the boundary of $S^{'}$, and since $U$ has size $2 R + l$, there exists $1 \le i \le d$ for which the distance in the $e_{i}$-direction between the center of $S^{'}$ and the center of $U$ is between $\frac{m}{2} - (R + \frac{l}{2})$ and $\frac{m}{2} + (R + \frac{l}{2})$. Putting all of these facts together, we see that there exists $1 \le i \le d$ so that the distance in the $e_{i}$-direction between the center of $T$ and the center of $T^{'}$ is between $\frac{m}{2} - \frac{3}{2} (2 R + l) n^{1 - \frac{1}{2 d}} - n^{1 - \frac{1}{3 d}} - (R + \frac{l}{2})$ and $\frac{m}{2} + \frac{3}{2} (2 R + l) n^{1 - \frac{1}{2 d}} + n^{1 - \frac{1}{3 d}} + (R + \frac{l}{2})$, which implies that for large enough $n$, it is between $\frac{m}{2} - 2 n^{1 - \frac{1}{3 d}}$ and $\frac{m}{2} + 2 n^{1 - \frac{1}{3 d}}$, which implies that before the replacement, $T$ and $T^{'}$ were indeed filled with occurrences of $w^{'}$ with overlap of type B. However, $α_{k} (u)$ contained no pair of occurrences of $w^{'}$ with overlap of type B, and so since all replacements in the definition of $β_{k}$ involve replacing some $w^{'}$ with $w^{''}$, no new occurrences of $w^{'}$ can be created during these replacements. Therefore, there can never be a pair of occurrences of $w^{'}$ with overlap of type B during the application of $β_{k}$ to a member of $α_{k} (L_{Γ_{k + 4 m}} (X))$. We therefore have a contradiction. Our original assumption was wrong, and none of these replacements can create a new occurrence of $w$. So, when these replacements are finished, we have a word with shape $Γ_{k + 4 m}$, which we called $(β_{k} \circ α_{k}) (u)$, which has no occurrences of $w$.

The definition of $γ_{k}$ is much simpler: for any $u \in L_{Γ_{k + 4 m}} (X)$, $γ_{k} (u)$ is the subword of $u$ occupying its central $Γ_{k}$. For any $u \in L_{Γ_{k + 4 m}} (X)$, we define $φ_{k} (u) = (γ_{k} \circ β_{k} \circ α_{k}) (u)$.

We now claim that $φ_{k} (u)$ is in $L (X_{w})$, i.e. can be extended to a configuration on $Z^{d}$ which is in $X$ and contains no occurrences of $w$. To do this, we inductively define a sequence of words $d_{j}$, where $d_{j} \in L_{Γ_{k + 4 j m}} (X)$ and each $d_{j}$ has no occurrences of $w$. Take $d_{0} = φ_{k} (u)$ and $d_{1} = (β_{k} \circ α_{k}) (u)$. For any $d_{j}$ for $j > 0$, define $d_{j + 1}$ as follows: since $d_{j} \in L (X)$, it can be extended to an infinite configuration in $X$. Use this fact to create a word $d_{j}^{'} \in L_{Γ_{k + 4 (j + 1) m}} (X)$ which has $d_{j}$ as the word occupying its central $Γ_{k + 4 j m}$. We then create $d_{j + 1}$ by performing the same replacements as in the definitions of $α_{k}$ and then $β_{k}$, in other words first finding and replacing any occurrences of $u_{i}$ for any $1 \le i \le b$, and then finding and replacing any occurrences of $w$. For the same reasons as above, $d_{j + 1}$ has no occurrences of $w$. Since $d_{j + 1}$ agrees on the boundary of thickness $t$ with $d_{j}^{'}$, $d_{j + 1} \in L (X)$, completing the inductive step and allowing us to define $d_{j}$ for all $j$. We also note that for any $j$, since $d_{j}$ contained no occurrences of $w$ or $u_{i}$ for any $1 \le i \le b$, any occurrences of these words in $d_{j}^{'}$ must have had nonempty overlap with the newly created portion, i.e. must have been contained in the boundary of thickness $4 m$ of $Γ_{k + 4 (j + 1) m}$. As a result, $d_{j + 1}$ and $d_{j}$ must agree on their respective central $Γ_{k + 4 (j - 1) m}$, since no replacements could have affected that portion. This means that we may define the "limit" of the $d_{j}$: for each $j$, since $d_{j} \in L (X)$, there exists $z_{j} \in X$ with $z_{j} (Γ_{k + 4 j m}) = d_{j}$. Since $d_{j + 1}$ and $d_{j}$ agree on their respective central $Γ_{k + 4 (j - 1) m}$, the limit of the $z_{j}$ exists, call it $x$. Since $X$ is a subshift, it is closed and so $x \in X$. Since the central $Γ_{k}$ of each $d_{j}$ is equal to $φ_{k} (u)$, $φ_{k} (u)$ is a subword of $x$. Finally, since each $d_{j}$ has no occurrences of $w$ and $x$ is a limit of the $d_{j}$, $x$ has no occurrences of $w$. This means that $φ_{k} (u) \in L_{Γ_{k}} (X_{w})$ as claimed.

We have now defined $φ_{k}$ on all of $L_{Γ_{k + 4 m}} (X)$. For any $ε > 0$, we can then deal with the restriction of $φ_{k}$ to $A_{k + 4 m, ε, 3 n} (X)$. Since the shape of $w$ is $Γ_{n} \subseteq Γ_{3 n}$, any element of $A_{k + 4 m, ε, 3 n} (X)$ has fewer than $(k + 4 m)^{d} (\frac{1}{H_{Γ_{n - 2 R}} (X)} + ε)$ occurrences of $w$, and since each $u_{i}$ has shape a subset of $Γ_{3 n}$, any element of $A_{k + 4 m, ε, 3 n} (X)$ has fewer than $(k + 4 m)^{d} (\frac{1}{H_{S_{i} ∖ S_{i}^{(R)}} (X)} + ε)$ occurrences of $u_{i}$ for each $1 \le i \le b$. By Lemma 3.2.1, $H_{Γ_{n - 2 R}} (X) \ge e^{h^{top} (X) (n - 2 R)^{d}}$. We now bound $H_{S_{i} ∖ S_{i}^{(R)}} (X)$ in a similar fashion. First let's look at a picture of $S_{i} ∖ S_{i}^{(R)}$ for any $1 \le i \le b$. Recall that $S_{i}$ is the union of two overlapping copies of $Γ_{m}$, so $S_{i} ∖ S_{i}^{(R)}$ is the union of two overlapping copies of $Γ_{m - 2 R}$ concentric with the original copies of $Γ_{m}$. By the definition of type B overlap, there exists $1 \le i \le d$ for which the distance in the $e_{i}$-direction between the centers of these two $Γ_{m - 2 R}$ is between $\frac{m}{2} - 2 n^{1 - \frac{1}{3 d}}$ and $\frac{m}{2} + 2 n^{1 - \frac{1}{3 d}}$.

In Figure 3.15, we denote the direction in question by $e_{i}$, the two copies of $Γ_{m - 2 R}$ by $S$ and $S^{'}$, and the distance in the $e_{i}$-direction between the centers of $S$ and $S^{'}$ by $c$. We then define $T$ to be a rectangular prism which is a subset of $S$ a distance of $R + 1$ away from $S^{'}$. The sizes of $T$ are $m - 2 R$ in every direction but $e_{i}$, and $c - R \ge \frac{m}{2} - 2 n^{1 - \frac{1}{3 d}} - R$ in the $e_{i}$ direction.

Figure 3.15: $S_{i} ∖ S_{i}^{(R)}$

Clearly by just taking very large $m$ we can assume that this dimension is greater than $\frac{m}{2} - 3 n^{1 - \frac{1}{3 d}}$. Since $d (T, S^{'}) = R + 1$ and $T, S^{'} \subseteq S_{i} ∖ S_{i}^{(R)}$, by strong irreducibility $H_{S_{i} ∖ S_{i}^{(R)}} (X) \ge H_{T} (X) H_{S^{'}} (X) \ge e^{h^{top} (X) [(m - 2 R)^{d - 1} (\frac{m}{2} - 3 n^{1 - \frac{1}{3 d}}) + (m - 2 R)^{d}]}$. $R$, $t$, and $n^{1 - \frac{1}{3 d}}$ are all small in relation to $m$ for large enough $m$, and $m$ asymptotically approaches $n$ as $n \to \infty$, so for large $n$, this bound is greater than $e^{1.4 h^{top} (X) n^{d}}$. Using these bounds, we can rewrite the above statement about $A_{k + 4 m, ε, 3 n} (X)$ a little more easily: every element of $A_{k + 4 m, ε, 3 n} (X)$ has fewer than $(k + 4 m)^{d} (e^{- h^{top} (X) (n - 2 R)^{d}} + ε)$ occurrences of $w$ and fewer than $(k + 4 m)^{d} (e^{- 1.4 h^{top} (X) n^{d}} + ε)$ occurrences of any $u_{i}$ for $1 \le i \le b$.

We now prove the upper bound on $h^{top} (X) - h^{top} (X_{w})$. We do this by proving upper bounds on $| α_{k}^{- 1} (v) \cap A_{k + 4 m, ε, 3 n} (X) |$ for any $v \in α_{k} (L_{Γ_{k + 4 m}} (X))$, on $| β_{k}^{- 1} (v^{'}) \cap α_{k} (A_{k + 4 m, ε, 3 n} (X)) |$ for any $v^{'} \in L_{Γ_{k}} (X)$, and on $| γ_{k}^{- 1} (v^{'}) |$ for any $v^{'} \in L_{Γ_{k}} (X_{w})$. The first upper bound is fairly straightforward: consider any $u \in A_{k + 4 m, ε, 3 n} (X)$. Since $b \le 3^{d} m^{d}$, the total number of letters in $u$ which are part of an occurrence of any $u_{i}$ is at most $(2 m^{d}) (3^{d} m^{d}) (k + 4 m)^{d} (e^{- 1.4 h^{top} (X) n^{d}} + ε)$, which for large $m$ is less than $(k + 4 m)^{d} (e^{- 1.3 h^{top} (X) n^{d}} + 3^{d + 1} m^{2 d} ε)$. Now, consider any $v \in α_{k} (L_{Γ_{k + 4 m}} (X))$. By the previous reasoning, any $u \in α_{k}^{- 1} (v) \cap A_{k + 4 m, ε, 3 n} (X)$ differs from $v$ on less than $(k + 4 m)^{d} (e^{- 1.3 h^{top} (X) n^{d}} + 3^{d + 1} m^{2 d} ε)$ letters, and so the number of possible $u$ is at most

$| A |^{(k + 4 m)^{d} (e^{- 1.3 h^{top} (X) n^{d}} + 3^{d + 1} m^{2 d} ε)} \sum_{i = 0}^{⌊ (k + 4 m)^{d} (e^{- 1.3 h^{top} (X) n^{d}} + 3^{d + 1} m^{2 d} ε) ⌋} \binom{(k + 4 m)^{d}}{i}$

$\le | A |^{(k + 4 m)^{d} (e^{- 1.3 h^{top} (X) n^{d}} + 3^{d + 1} m^{2 d} ε)} (k + 4 m)^{d} \binom{(k + 4 m)^{d}}{⌊ (k + 4 m)^{d} (e^{- 1.3 h^{top} (X) n^{d}} + 3^{d + 1} m^{2 d} ε) ⌋}$

$\le e^{(k + 4 m)^{d} (e^{- h^{top} (X) (n - 2 R)^{d}} + 3^{d + 1} m^{2 d} ε \ln | A |)}$

for large values of $n$.

The second upper bound, the one on $| β_{k}^{- 1} (v^{'}) \cap α_{k} (A_{k + 4 m, ε, 3 n} (X)) |$ for any $v^{'} \in L_{Γ_{k}} (X)$, is slightly more difficult. Consider any $u \in α_{k} (A_{k + 4 m, ε, 3 n} (X))$. Since $u$ is in the image of $α_{k}$, $u$ has no occurrences of any $u_{i}$. Consider any pair of overlapping occurrences of $w$ in $u$, say at $S$ and $S^{'}$ copies of $Γ_{n}$. Then these occurrences of $w$ contain occurrences of $w^{'}$ as subwords, say at $T$ and $T^{'}$ copies of $Γ_{m}$ with $T \subset S$, $T^{'} \subset S^{'}$, and $T - T^{'} = S - S^{'}$. Also define $U$ to be the copy of $Γ_{2 R + l}$ in $T$ which would be changed to change $u (T)$ to $w^{''}$, and $U^{'}$ to be the corresponding copy of $Γ_{2 R + l}$ in $T^{'}$. Since there are no occurrences of $u_{i}$, it must be the case that $| T - T^{'} |_{\infty} < \frac{m}{2} - 2 n^{1 - \frac{1}{3 d}}$. This implies that $T$ contains the central $Γ_{3 (2 R + l) ⌈ n^{1 - \frac{1}{2 d}} ⌉}$ of $S^{'}$ for large $n$, and so that $T$ contains $U^{'}$. Similarly, $T^{'}$ contains $U$. Therefore, if $u (T)$ is replaced by $w^{''}$ sometime during the replacements defining $β_{k}$, then $u (S^{'})$ will be changed to something other than $w$, and if $u (T^{'})$ is replaced by $w^{''}$ at some point in these replacements, then $u (S)$ will be changed to something other than $w$. In other words, in any replacement involved in changing $u$ to $β_{k} (u)$, if $u (S)$ is changed from $w$ to $\bar{w}$, then all occurrences of $w$ with nonempty overlap with $S$ are also changed. Since new occurrences of $w$ cannot be created during these replacements, this implies that $(β_{k} (u)) (S) = \bar{w}$ for any such $S$. Note that we have also shown that the copies of $Γ_{n}$ where replacements occur in the application of $β_{k}$ must be disjoint. We now wish to bound from above the number of occurrences of $\bar{w}$ in $β_{k} (u)$ for any $u \in α_{k} (A_{k + 4 m, ε, 3 n} (X))$. By the definition of $A_{k + 4 m, ε, 3 n} (X)$, for any $u^{'} \in A_{k + 4 m, ε, 3 n} (X)$, $u^{'}$ has at most $(k + 4 m)^{d} (e^{- h^{top} (X) (n - 2 R)^{d}} + ε)$ occurrences of $\bar{w}$. We now claim that each of the replacements involved in $α_{k}$ and $β_{k}$ could create at most $9^{d}$ new occurrences of $\bar{w}$, which rests on an aperiodicity property of $\bar{w}$: namely, for large $n$, and for any $x \in X$ with $x (S) = x (S^{'}) = \bar{w}$ for $S$ and $S^{'}$ distinct copies of $Γ_{n}$, $| S - S^{'} |_{\infty} > \frac{n}{3}$.

To show this, recall that in the definition of $w^{''}$, we ensure that $w^{''}$ contains a subword $a \in L_{Γ_{2 R + l}} (X)$ which is not a subword of $w$. In addition, $a$ lies in the central $Γ_{3 (2 R + l) ⌈ n^{1 - \frac{1}{2 d}} ⌉}$ in $w^{''}$. Also recall that $w^{'}$ is a subword of $w$ whose shape has size at least $n - n^{1 - \frac{1}{3 d}}$. Together, these imply that for large $n$, $\bar{w}$ contains $a$ as a subword of its central $Γ_{\frac{n}{4}}$. Since $\bar{w}$ and $w$ agree outside this central $Γ_{\frac{n}{4}}$, and since $a$ is not a subword of $w$, this implies that any occurrences of $a$ in $\bar{w}$ have nonempty intersection with its central $Γ_{\frac{n}{4}}$, and that there is at least one such occurrence of $a$. Now consider $x \in X$ such that $x (S) = x (S^{'}) = \bar{w}$, where $S \neq S^{'}$. Also consider $T$ a copy of $Γ_{2 R + l}$ such that $x (T) = a$. There are two possibilities: either $T$ lies inside $S^{'}$ or $T$ does not lie inside $S^{'}$. If $T$ lies inside $S^{'}$, then since $x (T) = a$, $T$ must have nonempty intersection with the central $Γ_{\frac{n}{4}}$ of $S^{'}$. Again since $x (T) = a$, $T$ must have nonempty intersection with the central $Γ_{\frac{n}{4}}$ of $S$ as well. This implies that $| S - S^{'} |_{\infty} \le \frac{n}{4} + (2 R + l)$. However, then $S - S^{'}$ is a period of $\bar{w}$. Since $0 \neq | S - S^{'} |_{\infty} \le \frac{n}{4} + (2 R + l) < \frac{n}{3}$ for large $n$, there is some positive integer $k$ so that $T + k (S - S^{'})$ lies in $S$, but has empty intersection with the central $Γ_{\frac{n}{4}}$ of $S$. But by periodicity of $\bar{w}$ with respect to $S - S^{'}$, $x (T + k (S - S^{'})) = x (T) = a$, which gives a contradiction. Therefore, it must be the case that $T$ does not lie inside $S^{'}$. Since $T$ has nonempty intersection with the central $Γ_{\frac{n}{4}}$ of $S$, this implies that $| S - S^{'} |_{\infty} \ge \frac{3 n}{8} - (2 R + l) > \frac{n}{3}$ for large $n$.

Now consider any replacement of $w^{'}$ by $w^{''}$ in the replacements defining $α_{k}$ and $β_{k}$, and suppose that this replacement occurs at $S$ a copy of $Γ_{m}$. Specifically, suppose that $v, v^{'} \in L_{Γ_{k + 4 m}} (X)$ where $v (S) = w^{'}$, $v^{'} (S) = w^{''}$, and $v$ and $v^{'}$ agree outside of $S$. Also assume that $T$ a copy of $Γ_{n}$ is a location for an occurrence of $\bar{w}$ newly created in this replacement, i.e. $v^{'} (T) = \bar{w}$, $v (T) \neq \bar{w}$. Since $v$ and $v^{'}$ agree outside $S$, $S$ and $T$ must have nonempty intersection, meaning that any possible location for $T$ given a fixed $S$ lies in a copy of $Γ_{m + 2 n}$ concentric with $S$. However, by the aperiodicity fact about $\bar{w}$ above, any $T$, $T^{'}$ both of which contain newly created occurrences of $\bar{w}$ must satisfy $| T - T^{'} |_{\infty} > \frac{n}{3}$. This means that at each replacement, there are no more than $9^{d}$ newly created occurrences of $\bar{w}$. We now bound from above the total number of replacements performed in applying $α_{k}$ to $u \in A_{k + 4 m, ε, 3 n} (X)$. As already shown, for large $n$ the total number of occurrences of any $u_{i}$ in $u \in A_{k + 4 m, ε, 3 n} (X)$ is smaller than $3^{d} m^{d} (k + 4 m)^{d} (e^{- 1.4 h^{top} (X) n^{d}} + ε)$, and so the total number of occurrences of $\bar{w}$ created during these replacements is less than $(27 m)^{d} (k + 4 m)^{d} (e^{- 1.4 h^{top} (X) n^{d}} + ε)$, which is less than $(k + 4 m)^{d} (e^{- h^{top} (X) (n - 2 R)^{d}} + (27 m)^{d} ε)$ for large $n$. We now bound from above the total number of replacements performed in applying $β_{k}$ to $α_{k} (u)$. For this, we need to bound from above the total number of occurrences of $w$ in $α_{k} (u)$. $u$ itself had at most $(k + 4 m)^{d} (e^{- h^{top} (X) (n - 2 R)^{d}} + ε)$ occurrences of $w$ by definition of $A_{k + 4 m, ε, 3 n} (X)$. While changing $u$ to $α_{k} (u)$, less than $3^{d} m^{d} (k + 4 m)^{d} (e^{- 1.4 h^{top} (X) n^{d}} + ε)$ replacements are made. Each one of these could create at most $(m + 2 n)^{d}$ new occurrences of $w$ by the same reasoning we used for $\bar{w}$ (we just don't have any aperiodicity facts about $w$ to use). Therefore, $α_{k} (u)$ has less than $(k + 4 m)^{d} (e^{- h^{top} (X) (n - 2 R)^{d}} + ε) + (m + 2 n)^{d} 3^{d} m^{d} (k + 4 m)^{d} (e^{- 1.4 h^{top} (X) n^{d}} + ε)$ occurrences of $w$, which is less than $(k + 4 m)^{d} (2 e^{- h^{top} (X) (n - 2 R)^{d}} + (12 m)^{d} ε)$ for large $n$. This means that the number of replacements involved in changing $α_{k} (u)$ to $(β_{k} \circ α_{k}) (u)$ is less than $(k + 4 m)^{d} (2 e^{- h^{top} (X) (n - 2 R)^{d}} + (12 m)^{d} ε)$ as well. Since each of these can create at most $9^{d}$ new occurrences of $\bar{w}$, the number of occurrences of $\bar{w}$ created during these replacements is less than $(k + 4 m)^{d} (2 \cdot 9^{d} e^{- h^{top} (X) (n - 2 R)^{d}} + (108 m)^{d} ε)$. So, the total number of occurrences of $\bar{w}$ created in the process of changing $u$ to $(β_{k} \circ α_{k}) (u)$ is less than $(k + 4 m)^{d} ((2 \cdot 9^{d} + 1) e^{- h^{top} (X) (n - 2 R)^{d}} + ((108 m)^{d} + (27 m)^{d}) ε)$. Since $u$ itself contained at most $(k + 4 m)^{d} (e^{- h^{top} (X) (n - 2 R)^{d}} + ε)$ occurrences of $\bar{w}$ by definition of $A_{k + 4 m, ε, 3 n} (X)$, this means that $(β_{k} \circ α_{k}) (u)$ contains less than $(k + 4 m)^{d} (20^{d} e^{- h^{top} (X) (n - 2 R)^{d}} + (216 m)^{d} ε)$ occurrences of $\bar{w}$.

We now collect the three facts we need to bound $| β_{k}^{- 1} (v^{'}) \cap α_{k} (A_{k + 4 m, ε, 3 n} (X)) |$ from above for any $v^{'} \in L_{Γ_{k}} (X)$. Firstly, for any $u \in A_{k + 4 m, ε, 3 n} (X)$, we know that for any $S$ a copy of $Γ_{n}$ which is the location of a replacement during the process of changing $α_{k} (u)$ to $(β_{k} \circ α_{k}) (u)$, $((β_{k} \circ α_{k}) (u)) (S) = \bar{w}$. In other words, once $w$ is changed to $\bar{w}$ during the application of $β_{k}$ to $α_{k} (u)$, that occurrence of $\bar{w}$ will not be changed during future replacements. Secondly, all replacements performed in the application of $β_{k}$ to $α_{k} (u)$ are disjoint. This means that knowing $(β_{k} \circ α_{k}) (u)$, along with knowing the locations of all copies of $Γ_{n}$ where replacements occur in changing $α_{k} (u)$ to $(β_{k} \circ α_{k}) (u)$, is enough to uniquely determine $α_{k} (u)$. Finally, $(β_{k} \circ α_{k}) (u)$ contains fewer than $(k + 4 m)^{d} (20^{d} e^{- h^{top} (X) (n - 2 R)^{d}} + (216 m)^{d} ε)$ occurrences of $\bar{w}$. This implies that for any $v^{'} \in L_{Γ_{k}} (X)$, $| β_{k}^{- 1} (v^{'}) \cap α_{k} (A_{k + 4 m, ε, 3 n} (X)) | \le 2^{(k + 4 m)^{d} (20^{d} e^{- h^{top} (X) (n - 2 R)^{d}} + (216 m)^{d} ε)}$, the total number of subsets of the locations of occurrences of $\bar{w}$ in $v^{'}$ which could have been the locations of replacements in the application of $β_{k}$.

Finally, we give an upper bound on $| γ_{k}^{- 1} (v^{'}) |$ for any $v^{'} \in L_{Γ_{k}} (X)$, which is straightforward: any $u \in γ_{k}^{- 1} (v^{'})$ must have its central $Γ_{k}$ filled with $v^{'}$. This means that $| γ_{k}^{- 1} (v^{'}) | \le | A |^{(k + 4 m)^{d} - k^{d}} < | A |^{5 d m k^{d - 1}}$ for large $k$.
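The qualifier "for large $k$" in the last inequality matters: $(k + 4 m)^{d} - k^{d} = 4 d m k^{d - 1} + O (k^{d - 2})$, so the comparison with $5 d m k^{d - 1}$ fails for small $k$ and holds once the lower-order terms are absorbed. The following short sketch checks the exponents numerically; the function names are hypothetical and the parameter values are arbitrary illustrations, not values used in the proof.

```python
def boundary_sites(k, m, d):
    """Number of sites of a cube of side k + 4m outside its central cube of side k."""
    return (k + 4 * m) ** d - k ** d


def exponent_bound_holds(k, m, d):
    """Check the strict inequality (k + 4m)^d - k^d < 5 d m k^(d-1)."""
    return boundary_sites(k, m, d) < 5 * d * m * k ** (d - 1)


# For d = 2 the inequality reads 8mk + 16m^2 < 10mk, i.e. k > 8m.
small_k = exponent_bound_holds(10, 3, 2)   # k = 10 is not yet large enough
large_k = exponent_bound_holds(25, 3, 2)   # k = 25 > 8 * 3 suffices
```

In dimension two the threshold is explicit, as the comment notes; in higher dimensions the same comparison holds for all sufficiently large $k$ because the leading term of the difference is $4 d m k^{d - 1}$.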

Putting all of these upper bounds together, we see that

$| φ_{k}^{- 1} (v) \cap A_{k + 4 m, ε, 3 n} (X) | < e^{(k + 4 m)^{d} (e^{- h^{top} (X) (n - 2 R)^{d}} + 3^{d + 1} m^{2 d} ε \ln | A |)} \cdot 2^{(k + 4 m)^{d} (20^{d} e^{- h^{top} (X) (n - 2 R)^{d}} + (216 m)^{d} ε)} \cdot | A |^{5 d m k^{d - 1}}$

$\le e^{(k + 4 m)^{d} (C e^{- h^{top} (X) (n - 2 R)^{d}} + D ε)}$

for constants $C$ independent of $n$ and $k$ and $D$ independent of $k$. This implies that

$H_{Γ_{k}} (X_{w}) \ge \frac{| A_{k + 4 m, ε, 3 n} (X) |}{e^{(k + 4 m)^{d} (C e^{- h^{top} (X) (n - 2 R)^{d}} + D ε)}}$.

We then take natural logarithms of both sides, divide by $(k + 4 m)^{d}$ , and let $k \to \infty$

to get

$\lim_{k \to \infty} \frac{k^{d}}{(k + 4 m)^{d}} \frac{h_{Γ_{k}} (X_{w})}{k^{d}} \ge \lim_{k \to \infty} \frac{\ln | A_{k + 4 m, ε, 3 n} (X) |}{(k + 4 m)^{d}} - C e^{- h^{top} (X) (n - 2 R)^{d}} - D ε$ ,

and by using the definition of entropy and Corollary 3.2.8, we are left with

$h^{top} (X_{w}) \ge h^{top} (X) - C e^{- h^{top} (X) (n - 2 R)^{d}} - D ε$ .

$ε$ was arbitrary, so we may let it approach zero, leaving

$h^{t o p} (X) - h^{t o p} (X_{w}) < \frac{C}{e^{h^{t o p} (X) (n - 2 R)^{d}}}$ .

A different tactic must be used to prove a lower bound for $h^{t o p} (X) - h^{t o p} (X_{w})$ . We

in a sense proceed in the opposite way from our upper bound: we will define a

map $ψ_{k}$ which sends any word in $L_{Γ_{k}} (X_{w})$ to a subset of $L_{Γ_{k}} (X)$ such that for any

$u \neq u^{'} \in L_{Γ_{k}} (X_{w})$ , $ψ_{k} (u)$ and $ψ_{k} (u^{'})$ are disjoint. This map will create occurrences of

$w$ instead of eliminating them.

We first need another auxiliary combinatorial theorem, again about a sort of lack of

periodicity.

Theorem 3.4.2. For any $d > 1$ and any strongly irreducible $Z^{d}$-shift $X$ of finite type $t$ with uniform filling length $R$, there exists a constant $W$ dependent on $X$ such that for all sufficiently large $j$, there is a word $w_{j, d} \in L_{Γ_{j W}^{(2 W)}} (X)$ with the following aperiodicity condition: there cannot exist two overlapping occurrences of $w_{j, d}$ such that one has nonempty intersection with the empty central $Γ_{(j - 4) W}$ of the other.

Figure 3.16: Disallowed and allowed pairs of overlapping $w_{j, d}$

Proof: To make the statement of Theorem 3.4.2 clearer, we refer to Figure 3.16, which gives an example of a pair of overlapping occurrences of $w_{j, d}$ which is ruled out by the given aperiodicity condition (left) and a pair which is not (right).

We begin by giving a concrete example, for any fixed $d$ and $j \ge 3 d + 5$, of an aperiodic word $a_{j, d}$ in $\{0, 1\}^{Γ_{j}}$. For any $p \in Γ_{j}$, we define $a_{j, d} (p) = 1$ if any $p_{i}$ for $1 \le i \le d - 1$ is not equal to 1 or $j$. This leaves only $a_{j, d} (p)$ where each of $p_{1}, p_{2}, \ldots, p_{d - 1}$ is 1 or $j$ to be defined. We think of this undefined portion as a set of $2^{d - 1}$ one-dimensional $j$-letter words: for every $(p_{1}, p_{2}, \ldots, p_{d - 1}) \in \{1, j\}^{d - 1}$, we think of the yet-to-be-defined $a_{j, d} (p_{1}, p_{2}, \ldots, p_{d - 1}, 1)$, $a_{j, d} (p_{1}, p_{2}, \ldots, p_{d - 1}, 2)$, $\ldots$, $a_{j, d} (p_{1}, p_{2}, \ldots, p_{d - 1}, j)$ as the letters of a $j$-letter word. Now, to fill these portions in, we define $2^{d - 1}$ $j$-letter words, which we will call $h_{0}, h_{1}, \ldots, h_{2^{d - 1} - 1}$, such that no two may overlap each other, i.e. if the final $k$ letters of $h_{i}$ are equal to the initial $k$ letters of $h_{i^{'}}$ for some $k > 0$, then $k = j$ and $i = i^{'}$. We do this in the following way: the first four letters of any $h_{i}$ are defined to be 0001, and the final four letters are defined to be 0111. The $3 d - 3$ letters following the initial 0001 in any $h_{i}$ are defined as follows: concatenate $d - 1$ three-letter words, determined by the $(d - 1)$-digit binary expansion of $i$: each 0 in the binary expansion corresponds to the word 001, and each 1 in the binary expansion corresponds to 011. The remaining $j - (3 d + 5)$ letters of $h_{i}$, which precede the final 0111, are alternating zeroes and ones, beginning with a zero.

An example should be helpful: suppose that $d = 3$ and $j = 18$. Then we create four 18-letter words $h_{0}$, $h_{1}$, $h_{2}$, and $h_{3}$. The initial four letters of $h_{0}$ are 0001. The next $3 d - 3 = 6$ letters are dependent on the two-digit binary expansion of the subscript 0: since it is 00, the next six letters are 001001. The next $j - (3 d + 5) = 4$ letters are alternating zeroes and ones beginning with a zero, i.e. 0101, and the final four letters are 0111. This gives $h_{0} = 000100100101010111$. Using similar reasoning, we see that $h_{1} = 000100101101010111$, $h_{2} = 000101100101010111$, and $h_{3} = 000101101101010111$.
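The construction of the words $h_{i}$ is mechanical enough to be checked by machine. The following is a small sketch, with words represented as Python strings of the characters 0 and 1; the function names `build_h` and `overlap_violations` are hypothetical and are introduced only for this illustration. The first function follows the recipe above (prefix 0001, one three-letter block per binary digit of $i$, alternating filler, suffix 0111), and the second searches by brute force for any pair violating the nonoverlapping property.

```python
def build_h(i, d, j):
    """Return the j-letter word h_i for dimension d, following the recipe in the text."""
    if j < 3 * d + 5:
        raise ValueError("need j >= 3d + 5")
    if not 0 <= i < 2 ** (d - 1):
        raise ValueError("need 0 <= i < 2^(d-1)")
    # (d-1)-digit binary expansion of i (empty when d = 1).
    bits = format(i, "0{}b".format(d - 1)) if d > 1 else ""
    # Each binary digit 0 contributes 001 and each digit 1 contributes 011.
    middle = "".join("011" if bit == "1" else "001" for bit in bits)
    # Alternating zeroes and ones, beginning with a zero.
    filler = "".join("01"[t % 2] for t in range(j - (3 * d + 5)))
    return "0001" + middle + filler + "0111"


def overlap_violations(words):
    """List all (i, i2, k) where the final k letters of words[i] equal the
    initial k letters of words[i2], other than the trivial full self-overlap."""
    bad = []
    for i, a in enumerate(words):
        for i2, b in enumerate(words):
            for k in range(1, len(a) + 1):
                if a[-k:] == b[:k] and not (k == len(a) and i == i2):
                    bad.append((i, i2, k))
    return bad


# The example from the text: d = 3, j = 18.
example_words = [build_h(i, 3, 18) for i in range(4)]
example_violations = overlap_violations(example_words)
```

For the example parameters, `example_words` reproduces the four words $h_{0}, \ldots, h_{3}$ listed above, and `example_violations` is empty, in agreement with the nonoverlapping property.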

We claim that this set of $2^{d - 1}$ words has the nonoverlapping property described earlier. Suppose that for some $k > 0$ and $0 \le i, i^{'} < 2^{d - 1}$, the initial $k$ letters of $h_{i}$ are the same as the final $k$ letters of $h_{i^{'}}$. $k$ cannot equal 1 or 2, because the first $k$ letters of $h_{i}$ would then be 0 or 00, and $h_{i^{'}}$ does not end with either of those words. So, $k \ge 3$. Then the initial $k$ letters of $h_{i}$ begin with 000, and since the only place that 000 occurs in $h_{i^{'}}$ is at the beginning, this implies that $k = j$, and since $\{h_{i}\}_{i = 0}^{2^{d - 1} - 1}$ contains $2^{d - 1}$ distinct words, that $i = i^{'}$.

We now, for every choice of $(p_{1}, p_{2}, \ldots, p_{d - 1}) \in \{1, j\}^{d - 1}$, choose an $h_{i}$, and define $a_{j, d} (p_{1}, p_{2}, \ldots, p_{d - 1}, i^{'}) = h_{i} (i^{'})$ for $1 \le i^{'} \le j$. Assigning the $2^{d - 1}$ words $h_{i}$ to the $2^{d - 1}$ choices for $(p_{1}, p_{2}, \ldots, p_{d - 1})$ in a one-to-one way completes the definition of $a_{j, d}$. We now make the claim that $a_{j, d}$ is aperiodic. Suppose not; then there exists $v = (v_{1}, v_{2}, \ldots, v_{d}) \neq 0$ with $| v_{i} | < j$ for $1 \le i \le d$ such that $a_{j, d} (r) = a_{j, d} (r + v)$ for all $r \in Γ_{j}$ such that $r + v \in Γ_{j}$. Since $- v$ could serve as a period as well, we may assume without loss of generality that $v_{d} \ge 0$. We choose a vertex $q$ of $Γ_{j}$ based on $v$: for each $1 \le i \le d$, if $v_{i} \ge 0$, $q_{i} = 1$. If $v_{i} < 0$, $q_{i} = j$. In this way, we ensure that $q + v \in Γ_{j}$, and therefore that $a_{j, d} (q) = a_{j, d} (q + v)$. Since $q_{i} \in \{1, j\}$ for $1 \le i \le d - 1$ and $q_{d} = 1$, by construction $a_{j, d} (q) = 0$. Therefore, $a_{j, d} (q + v) = 0$. However, again due to construction, the only zeroes in $a_{j, d}$ lie at points whose first $d - 1$ coordinates are either 1 or $j$. This implies that the first $d - 1$ coordinates of $q + v$ are either 1 or $j$. Let's denote by $h_{i}$ the $j$-letter word where $h_{i} (k) = a_{j, d} (q + (k - 1) e_{d})$ for $1 \le k \le j$, and by $h_{i^{'}}$ the $j$-letter word where $h_{i^{'}} (k) = a_{j, d} (q + v + (k - 1 - v_{d}) e_{d})$. Due to the supposed periodicity of $a_{j, d}$, the first $j - v_{d}$ letters of $h_{i}$ are the same as the final $j - v_{d}$ letters of $h_{i^{'}}$. Since no two distinct words in $\{h_{n}\}_{n = 0}^{2^{d - 1} - 1}$ may overlap, $v_{d} = 0$ and $h_{i} = h_{i^{'}}$. This implies that the first $d - 1$ coordinates of $q$ are the same as the first $d - 1$ coordinates of $q + v$, and therefore that the first $d - 1$ coordinates of $v$ are zero, implying that $v = 0$, a contradiction. Thus, $a_{j, d}$ is aperiodic.
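As a sanity check of this aperiodicity claim, one can build $a_{j, d}$ for small parameters and search exhaustively for a nonzero period. The sketch below is only an illustration under stated assumptions: the names `build_h`, `build_a`, `is_period`, and `nontrivial_periods` are hypothetical, points of $Γ_{j}$ are represented as tuples with coordinates from 1 to $j$, and the words $h_{i}$ are assigned to the choices of $(p_{1}, \ldots, p_{d - 1})$ in enumeration order, which is one of the one-to-one assignments the text allows.

```python
from itertools import product


def build_h(i, d, j):
    """The j-letter word h_i: 0001, binary-coded blocks, alternating filler, 0111."""
    bits = format(i, "0{}b".format(d - 1)) if d > 1 else ""
    middle = "".join("011" if bit == "1" else "001" for bit in bits)
    filler = "".join("01"[t % 2] for t in range(j - (3 * d + 5)))
    return "0001" + middle + filler + "0111"


def build_a(d, j):
    """Return a_{j,d} as a dict from points of {1..j}^d to 0 or 1."""
    corners = list(product((1, j), repeat=d - 1))
    # One word h_i per choice of the first d-1 coordinates in {1, j}.
    line = {corner: build_h(i, d, j) for i, corner in enumerate(corners)}
    word = {}
    for p in product(range(1, j + 1), repeat=d):
        head, last = p[:-1], p[-1]
        word[p] = int(line[head][last - 1]) if head in line else 1
    return word


def is_period(word, v):
    """True if word(r) == word(r + v) whenever both r and r + v lie in the domain."""
    for r, value in word.items():
        shifted = tuple(x + y for x, y in zip(r, v))
        if shifted in word and word[shifted] != value:
            return False
    return True


def nontrivial_periods(word, d, j):
    """All nonzero v with |v_i| < j that are periods of the word."""
    candidates = product(range(-(j - 1), j), repeat=d)
    return [v for v in candidates if any(v) and is_period(word, v)]


# Smallest admissible size in dimension two: j = 3d + 5 = 11.
periods_found = nontrivial_periods(build_a(2, 11), 2, 11)
```

The search returns an empty list for these parameters, as the argument above predicts. The brute-force cost grows like $j^{2 d}$, so the check is only practical for very small $d$ and $j$.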

We now use $a_{j, d - 1}$ as a tool to create, for any $j \underline{>} 3 d + 2$ , a word $b_{j, d} \in {0, 1}^{Γ_{j}^{(2)}}$ with

the aperiodicity property outlined in the conclusion of Theorem 3.4.2: there cannot

exist two overlapping occurrences of $b_{j, d}$ such that one has nonempty intersection

with the empty central $Γ_{j - 4}$ of the other. $b_{j, d} (p)$ is defined to be 0 if $p_{d} = 1$ or 2,

130

and defined to be 1 if $3 \underline{<} p_{d} \underline{<} j - 1$ . If $p_{d} = j$ , then $b_{j, d} (v) = a_{j, d - 1} (p_{1}, ..., p_{d - 1})$ .

Suppose that $b_{j, d}$ is periodic with respect to $v \neq 0$ . Since our supposed overlap is of

the form described it must be the case that $| v_{i} | \underline{<} j - 2$ for $1 \underline{<} i \underline{<} d$ . We can again

assume without loss of generality that $v_{d} \underline{>} 0$ . Choose a point $q$ of $Γ_{j}^{(2)}$ as follows:

for each $1 \underline{<} i \underline{<} d - 2$ , take $q_{i} = 1$ if $v_{i} \underline{>} 0$ , and $q_{i} = j$ - 1 if $v_{i} < 0$ . Choose

$q_{d - 1} = j - 1$ $- v_{d - 1}$ if $v_{d - 1} \underline{>} 0$ , and $q_{d - 1} = 1$ $- v_{d - 1}$ if $v_{d - 1} < 0$ . Finally, take $q_{d} = 1$ .

For any $r \in Z^{d}$ all of whose coordinates are zero or one the first Z-t coordinates of

$q + r$ are between 1 and $j$ , and the dth coordinate is 1 or 2, implying that $q + r \in Γ_{j}^{(2)}$ .

This means that a copy of $Γ_{2}$ with its least vertex lexicographically at $q$ is a subset of

$Γ_{j}^{(2)}$ , call it $S$ . Since $S$ is composed entirely of points whose dth coordinate is 1 or 2,

$b_{j, d} (S)$ is filled with zeroes. Again, for any point $r$ all of whose coordinates are zero or

one, the first $d - 2$ coordinates of $q + v + r$ are between 1 and $j$, the $(d - 1)$th coordinate

is 1, 2, $j - 1$, or $j$, and the $d$th coordinate is between 1 and $j$. Therefore, any such

$q + v + r$ is in $Γ_{j}^{(2)}$, implying that a copy of $Γ_{2}$ with its least vertex lexicographically at

$q + v$ is a subset of $Γ_{j}^{(2)}$ as well, which is then $S + v$. By periodicity, $b_{j, d} (S + v)$ must

be filled with zeroes as well. However, by construction, for any copy of $Γ_{2}$ which is

filled with zeroes in $b_{j, d}$ , the dth coordinate of its least vertex lexicographically is 1.

Since $q_{d} = 1$ , it must be the case that $v_{d} = 0$ . This implies that $a_{j, d - 1}$ is periodic

with period $(v_{1}, v_{2}, ..., v_{d - 1})$, which implies that $v_{i} = 0$ for $1 \leq i \leq d - 1$ as well, and

so $v = 0$, a contradiction. Therefore, $b_{j, d}$ has the desired aperiodicity property. Note,

however, that this does not prove Theorem 3.4.2, since $b_{j, d}$ is a word in the full shift,

and there is no reason for it to be in $L (X)$ .

We now turn to our shift of finite type $X$ . First, we will be constructing a word

$y \in L_{Γ_{R + 4}} (X)$ for which $X_{y}$ is nonempty. Obviously, finding a word $y$ such that $X_{y}$

is nonempty could be done by using the already proven upper bound for $h^{t o p} (X) -$

$h^{t o p} (X_{y})$ , but this may require $y$ to have shape with very large size. We need the

shape of $y$ to have a small size to prove as tight a lower bound as possible, and for

our proof, we need a lower bound on $H_{Γ_{k}} (X)$ for relatively small $k$ . For large $k$ ,

Lemma 3.2.1 gives an exponential lower bound for $H_{Γ_{k}} (X)$ , but doesn't really tell us

anything about small $k$ . For this reason, we prove the following lemma:

Lemma 3.4.3. For any $Z^{d}$-subshift $X$ with $h^{t o p} (X) > 0$ and any finite $S \subset Z^{d}$, $H_{S} (X) > | S |$.

Proof: We write $S = \{s_{1}, s_{2}, ..., s_{| S |}\}$ where $s_{i}$ comes before $s_{i + 1}$ in the usual

lexicographical order for $1 \leq i < | S |$, and make the notation $S_{i} = \{s_{1}, ..., s_{i}\}$ for all

$1 \leq i \leq | S |$. Suppose that $H_{S} (X) \leq | S |$. Then, since $H_{S_{1}} (X) = | A | > 1$, it must

be the case that for some $1 \leq j < | S |$, $H_{S_{j + 1}} (X) \leq H_{S_{j}} (X)$. Since $S_{j} \subset S_{j + 1}$, this

means that for every word $w \in L_{S_{j}} (X)$ , there is a unique way to extend it to a word

in $L_{S_{j + 1}} (X)$ . In other words, for any $x \in X$ , given $x (S_{j})$ , $x (s_{j + 1})$ is forced. By shift-

invariance of $X$ , for any $x \in X$ , given $x (S_{j} - s_{j + 1})$ , $x (0)$ is forced. Since $S$ is finite,

take $N > \mathrm{diam} (S)$. Then, $S_{j} - s_{j + 1}$ consists of elements of $Z^{d}$ within a d-distance

of less than $N$ from 0 and lexicographically less than 0. For this reason, we make

the notation $H_{n, d} = \{v \in (Γ_{2 n - 1} - n) : v \text{ is less than } 0 \text{ lexicographically}\}$, where, by a slight

abuse of notation, $n$ also denotes the vector $(n, n, ..., n)$. It is then clear that $S_{j} - s_{j + 1} \subseteq H_{N, d}$, and so we note that $x (H_{N, d})$

forces $x (0)$. We need a few more pieces of notation: $I_{n, d} = \{v \in (Γ_{2 n - 1} - n) : v_{d} < 0\}$,

$G_{n, k, d} = \{v \in Z^{d} : v_{i} > 0 \text{ for } 1 \leq i \leq d, \ \sum_{i = 1}^{d} n^{i - 1} v_{i} \leq k\}$, and $L_{n, k, d} = (Γ_{k + n} - n) \setminus Γ_{k}$.

We quickly note a few useful facts about these sets which can be checked by

the reader: $H_{n, d + 1} = I_{n, d + 1} \cup (H_{n, d} \times \{0\})$, and $G_{n, k, d + 1} = \bigcup_{i = 1}^{\lfloor k / n^{d} \rfloor} (G_{n, k - i n^{d}, d} \times \{i\})$. We

now claim that if $x (H_{n, d})$ forces $x (0)$ for all $x \in X$, then $x (L_{n, k, d})$ forces $x (G_{n, k, d})$ for

any $k$. We prove this claim by induction on $d$.

For $d = 1$ , the claim is fairly easy to check (and is in fact a classical theorem due

to Hedlund and Morse ([HeM])); it amounts to showing that if any $n$ consecutive

letters of any $x$ force the next, then any $n$ consecutive letters of $x$ force the next $k$ for

any $k$ . But this is clear; if $x (1)$ , $x (2)$ , . . . , $x (n)$ force $x (n + 1)$ , then $x (2)$ , . . . , $x (n + 1)$

force $x (n + 2)$ , and we can proceed like this indefinitely. Thus, the claim is proven

for $d = 1$ . Now suppose that it is true for a fixed $d$ , and we will prove it for $d + 1$ .

This proof will also be by induction; since we wish to show that $x (L_{n, k, d + 1})$ forces

$x (G_{n, k, d + 1})$, and since $G_{n, k, d + 1} = \bigcup_{i = 1}^{\lfloor k / n^{d} \rfloor} (G_{n, k - i n^{d}, d} \times \{i\})$, it suffices to show that

$x (L_{n, k, d + 1} \cup \bigcup_{m = 1}^{j} (G_{n, k - m n^{d}, d} \times \{m\}))$ forces $x (G_{n, k - (j + 1) n^{d}, d} \times \{j + 1\})$ for every

$0 \leq j < \lfloor k / n^{d} \rfloor$. Fix any such $j$, and suppose that $x (L_{n, k, d + 1} \cup \bigcup_{m = 1}^{j} (G_{n, k - m n^{d}, d} \times \{m\}))$

is given. We will show that $x (G_{n, k - (j + 1) n^{d}, d} \times \{j + 1\})$ is forced. Consider any

$v = (v^{'}, j + 1) \in G_{n, k - (j + 1) n^{d}, d} \times \{j + 1\}$. We claim that $v + I_{n, d + 1} \subseteq L_{n, k, d + 1} \cup \bigcup_{m = 1}^{j} (G_{n, k - m n^{d}, d} \times \{m\})$.

Since $I_{n, d + 1} = (Γ_{2 n - 1} - n) \times \{- n + 1, ..., 0\}$, it suffices to

show that $v^{'} + (Γ_{2 n - 1} - n) \subseteq (L_{n, k, d + 1} \cup \bigcup_{m = 1}^{j} (G_{n, k - m n^{d}, d} \times \{m\})) \cap (Z^{d} \times \{i\})$ for

$j - n < i \leq j$. Since $(L_{n, k, d + 1} \cup \bigcup_{m = 1}^{j} (G_{n, k - m n^{d}, d} \times \{m\})) \cap (Z^{d} \times \{i\}) = (L_{n, k, d} \cup Γ_{k}) \times \{i\}$

for $- n + 1 \leq i \leq 0$, and $(L_{n, k, d + 1} \cup \bigcup_{m = 1}^{j} (G_{n, k - m n^{d}, d} \times \{m\})) \cap (Z^{d} \times \{i\}) = (L_{n, k, d} \cup G_{n, k - i n^{d}, d}) \times \{i\}$

for $i > 0$, and since $G_{n, k - (i + 1) n^{d}, d} \subseteq G_{n, k - i n^{d}, d} \subseteq Γ_{k}$ for

$i > 0$, it suffices to show that $G_{n, k - (i + 1) n^{d}, d} + (Γ_{2 n - 1} - n) \subseteq L_{n, k, d} \cup G_{n, k - i n^{d}, d}$ for

$i > 0$ . Consider any $v \in G_{n, k - (i + 1) n^{d}, d}$ and any $v^{'} \in (Γ_{2 n - 1} - n)$ . By definition of

$G_{n, k - (i + 1) n^{d}, d}$, $\sum_{l = 1}^{d} v_{l} n^{l - 1} \leq k - (i + 1) n^{d}$, and $v_{l} > 0$ for $1 \leq l \leq d$. By choice of $v^{'}$,

$- n + 1 \leq v_{l}^{'} \leq n - 1$ for $1 \leq l \leq d$. Therefore, $- n + 1 < (v + v^{'})_{l}$ for $1 \leq l \leq d$, and

$\sum_{l = 1}^{d} (v + v^{'})_{l} n^{l - 1} \leq (k - (i + 1) n^{d}) + (n - 1) (\sum_{l = 0}^{d - 1} n^{l}) < k - i n^{d}$. There are then two cases; if

any coordinate of $v + v^{'}$ is nonpositive, then $v + v^{'} \in L_{n, k, d}$, and if all coordinates are

positive, then by definition, $v + v^{'} \in G_{n, k - i n^{d}, d}$. So, indeed $G_{n, k - (i + 1) n^{d}, d} + (Γ_{2 n - 1} - n) \subseteq L_{n, k, d} \cup G_{n, k - i n^{d}, d}$,

and so for any $v \in G_{n, k - (j + 1) n^{d}, d} \times \{j + 1\}$, $v + I_{n, d + 1} \subseteq L_{n, k, d + 1} \cup \bigcup_{m = 1}^{j} (G_{n, k - m n^{d}, d} \times \{m\})$.

This means that since $H_{n, d + 1} = I_{n, d + 1} \cup (H_{n, d} \times \{0\})$,

for any $v \in (G_{n, k - (j + 1) n^{d}, d} \times \{j + 1\})$, $x (v + (H_{n, d} \times \{0\}))$ forces $x (v)$. But now since

$L_{n, k - (j + 1) n^{d}, d} \times \{j + 1\} \subset L_{n, k, d + 1}$, by the inductive hypothesis $x (G_{n, k - (j + 1) n^{d}, d} \times \{j + 1\})$ is

in fact forced by $x (L_{n, k, d + 1} \cup \bigcup_{m = 1}^{j} (G_{n, k - m n^{d}, d} \times \{m\}))$. Since $j$ was arbitrary here, as

described above this shows that $x (L_{n, k, d + 1})$ forces $x (G_{n, k, d + 1})$ as claimed, and so by

induction we see that for any $d$ , if $x (H_{n, d})$ forces $x (0)$ for all $x \in X$ , then $x (L_{n, k, d})$

forces $x (G_{n, k, d})$ for any $k$ .
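The $d = 1$ base case of this claim can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `extend`, the dictionary `rule`, and the sample rule (read off from the periodic sequence 0, 1, 1, 0, 1, 1, ...) are hypothetical and do not come from the text.

```python
def extend(window, rule, k):
    """Return the next k letters forced by an initial window of n letters.

    `rule` maps every n-letter window to the single letter that must follow it,
    which is exactly the hypothesis "any n consecutive letters force the next".
    """
    n = len(window)
    letters = list(window)
    for _ in range(k):
        # Slide the window one step to the right and apply the forcing rule.
        letters.append(rule[tuple(letters[-n:])])
    return letters[n:]


# Hypothetical rule with n = 2, taken from the periodic sequence 0,1,1,0,1,1,...
rule = {(0, 1): 1, (1, 1): 0, (1, 0): 1}
next_letters = extend((0, 1), rule, 4)
```

Each pass through the loop is one application of the hypothesis, so $n$ letters determine any number of later letters, as in the argument above.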

We finally return to our original example of $X$ for which $H_{S} (X) \leq | S |$. Recall that

we showed that there is then some $N$ such that $x (H_{N, d})$ forces $x (0)$ for all $x$. By

the claim just shown, $x (L_{N, k, d})$ then forces $x (G_{N, k, d})$ for all $k$. This then shows that

$H_{G_{N, k, d}} (X) \leq H_{L_{N, k, d}} (X)$ for any $k$. Note that $Γ_{\lfloor k / (d N^{d}) \rfloor} \subseteq G_{N, k, d}$ for all $k$. Therefore,

$H_{Γ_{n}} (X) \leq H_{G_{N, n d N^{d}, d}} (X) \leq H_{L_{N, n d N^{d}, d}} (X)$ for all $n$. Since

$| L_{N, n d N^{d}, d} | = (n d N^{d} + N)^{d} - (n d N^{d})^{d} \leq 2 N d (n d N^{d})^{d - 1} = C n^{d - 1}$ for large $n$ and a constant $C$ independent

of $n$, we see that $H_{Γ_{n}} (X) \leq | A |^{C n^{d - 1}}$ for all $n$, and so by definition of entropy,

$h^{t o p} (X) = \lim_{n \to \infty} \frac{\ln H_{Γ_{n}} (X)}{n^{d}} \leq \limsup_{n \to \infty} \frac{C n^{d - 1} \ln | A |}{n^{d}} = 0$. Therefore, $h^{t o p} (X) = 0$, a

contradiction to the hypotheses. Our initial assumption was therefore wrong, and so

$H_{S} (X) > | S |$ for all finite shapes $S$ .

$□$
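As a sanity check of Lemma 3.4.3 in the simplest setting, the following Python sketch counts patterns by brute force for $d = 1$. The example system is an assumption made here and is not taken from the text: it is the golden mean shift, the set of binary sequences with no two adjacent 1s, which has positive entropy. The function names are likewise hypothetical.

```python
from itertools import combinations, product


def golden_mean_words(length):
    """All binary words of the given length with no two adjacent 1s."""
    return [w for w in product((0, 1), repeat=length)
            if all(not (a == 1 and b == 1) for a, b in zip(w, w[1:]))]


def pattern_count(shape):
    """H_S(X) for the golden mean shift X and a finite shape S in Z."""
    lo, hi = min(shape), max(shape)
    # Every legal word on the interval [lo, hi] extends to a point of X
    # (pad with 0s), so restricting legal words to S gives all patterns on S.
    words = golden_mean_words(hi - lo + 1)
    return len({tuple(w[s - lo] for s in shape) for w in words})


def check_lemma(max_coord):
    """Check H_S(X) > |S| for every nonempty S inside {0, ..., max_coord}."""
    sites = range(max_coord + 1)
    return all(pattern_count(S) > len(S)
               for size in range(1, max_coord + 2)
               for S in combinations(sites, size))
```

For intervals the counts are Fibonacci numbers (2, 3, 5, ...), comfortably above $| S |$; the check also covers shapes with gaps such as $\{0, 2\}$.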

Before moving on with our proof, we must comment on Lemma 3.4.3. In the course

of the proof, we show that if for some total order $\leq$ on $Z^{d}$ which is preserved under

addition, some finite set $S \subset Z^{d}$, and some $t \notin S$ with $t \geq s$ for all $s \in S$, $x (S)$ forces

$x (t)$ , then the topological entropy of $X$ is zero. It is natural to wonder whether the

assumption about ordering is necessary. In fact it is:

Lemma 3.4.4. For any finite $S \subset Z^{d}$ and $t \notin S$ such that there is no addition-

preserved total order on $Z^{d}$ with respect to which $t$ is greater than all elements of $S$ ,

there exists a subshift $X$ such that $x (S)$ forces $x (t)$ for all $x \in X$ , but $h^{t o p} (X) > 0$ .

Proof: Suppose that for some $S \subset Z^{d}$ finite and $t \notin S$ , it is the case that for no

addition-preserved total order on $Z^{d}$ is $t$ greater than all elements of $S$ . We claim

that this implies that $t$ lies in the convex hull of $S$ . To see this, suppose that $t$ does

not lie in the convex hull of $S$. Then clearly there is a $(d - 1)$-dimensional hyperplane

$L$ of $R^{d}$ such that $t$ and $S$ lie on opposite sides of $L$. Define the subspace $L^{'}$ of $R^{d}$

which equals $L - p$ for any $p \in L$. Without loss of generality, we may assume that

$L \cap Z^{d} = \emptyset$, since such an assumption requires only a small rotation of $L$. Then,

define an addition-preserved total order $\leq$ on $Z^{d}$ by arbitrarily fixing one of the open

half-spaces $H$ that $L^{'}$ splits $R^{d}$ into and defining $v \leq v^{'}$ if $v^{'} - v \in H \cup \{0\}$. Then,

since $t$ and $S$ are on opposite sides of $L$, for all $s \in S$, $t - s$ is on the same side of $L^{'}$,

and so either $t \leq s$ for all $s \in S$ or $t \geq s$ for all $s \in S$. By reversing $\leq$ if necessary, we

may assume without loss of generality that $t$ is greater than all elements of $S$ under

$\leq$. This is a contradiction, and so $t$ lies in the convex hull of $S$.

To make things easier, by shift-invariance, we can without loss of generality assume

that $t = 0 : = (0, ..., 0)$ . We will, for any $S$ such that 0 lies in the convex hull of $S$ ,

construct a subshift $X$ with positive topological entropy such that $x (S)$ forces $x (0)$

for any $x \in X$. By decomposing the convex hull of $S$ into simplices, we can, for some

$1 \leq d^{'} \leq d$, find $S^{'}$ a linearly independent subset of $S$ of cardinality $d^{'} + 1$ such that

the convex hull of $S^{'}$ contains 0 and is contained in a $d^{'}$-dimensional subspace of $R^{d}$,

call it $L$. Denote the elements of $S^{'}$ by $s_{1}, s_{2}, ..., s_{d^{'} + 1}$. Then there exist positive

reals $n_{1}, ..., n_{d^{'} + 1}$ such that $0 = \sum_{i = 1}^{d^{'} + 1} n_{i} s_{i}$. Since all elements of $S$ are in $Z^{d}$, we can

take all $n_{i}$ to be rational, and so without loss of generality integers. This implies that

$s_{d^{'} + 1} = \sum_{i = 1}^{d^{'}} - \frac{n_{i}}{n_{d^{'} + 1}} s_{i}$. Take $N > \frac{n_{i}}{n_{d^{'} + 1}} + 1$ for all $1 \leq i \leq d^{'}$. Take the $d^{'}$-dimensional

parallelepiped $R = \{\sum_{i = 1}^{d^{'}} r_{i} s_{i} : 0 \leq r_{i} < N\}$ contained in $L$. One may then tessellate

$L$ with copies of $R$: $L = \bigcup_{v \in Z^{d}} (R + \sum_{i = 1}^{d^{'}} N v_{i} s_{i})$. (Intersecting with $Z^{d}$ on both sides

leads to a similar tessellation of $L \cap Z^{d}$ by copies of $R \cap Z^{d}$.) If $L \subsetneq R^{d}$ (that is, $d^{'} < d$), then choose

linearly independent $u_{1}, ..., u_{d - d^{'}}$ in $Z^{d}$ such that $(L \cap Z^{d}) + \sum_{j = 1}^{d - d^{'}} (Z u_{j}) = Z^{d}$. We

will define $X$, a subshift with alphabet $A = \{0, 1\} \times (R \cap Z^{d})$, such that $x (S^{'})$ forces

$x (0)$ for any $x \in X$, which clearly implies that $x (S)$ forces $x (0)$ for any $x \in X$ as well.

Choose any $y \in \{0, 1\}^{Z^{d}}$. We define $x \in X$ as follows: for any $v \in Z^{d}$ and $r \in (R \cap Z^{d})$,

define $x (r + \sum_{i = 1}^{d^{'}} N v_{i} s_{i} + \sum_{j = 1}^{d - d^{'}} v_{j + d^{'}} u_{j}) = (y (v), r)$ . Due to the already mentioned

tessellation of $L \cap Z^{d}$ by copies of $R \cap Z^{d}$ and tessellation of $Z^{d}$ by copies of $L \cap Z^{d}$ ,

this defines $x$ on all of $Z^{d}$ . Take $X^{'}$ to be the set of all elements of $Ω$ constructed in

this way. $X^{'}$ is not necessarily shift-invariant, but by the construction, it is invariant

under shifts by $N s_{i}$ for $1 \leq i \leq d^{'}$ and $u_{j}$ for $1 \leq j \leq d - d^{'}$. Since the collection

$\{N s_{1}, N s_{2}, ..., N s_{d^{'}}, u_{1}, ..., u_{d - d^{'}}\}$ is linearly independent, this means that a finite

union of shifts of $X^{'}$ , call it X, is a shift-invariant set. It is also clear that X is closed,

and thus it is a subshift.

We first show that $x (S^{'})$ forces $x (0)$ for any $x \in X$ . By the definition of $X$ , it is

sufficient to show that $x^{'} (S^{'} + p)$ forces $x^{'} (p)$ for any $x^{'} \in X^{'}$ and $p \in Z^{d}$ . Since $p \in Z^{d}$ ,

there exists some $r \in (R \cap Z^{d})$ and $v \in Z^{d}$ such that $p = r + \sum_{i = 1}^{d^{'}} N v_{i} s_{i} + \sum_{j = 1}^{d - d^{'}} v_{j + d^{'}} u_{j}$ .

Since $r \in R$, $r = \sum_{i = 1}^{d^{'}} r_{i} s_{i}$ for some reals $r_{i} \in [0, N)$. For any $1 \leq i \leq d^{'}$, $r + s_{i}$ is

in $R$ unless the $s_{i}$ coefficient of $r + s_{i}$ is greater than or equal to $N$. Therefore,

if $r + s_{i} \notin R$ for all $1 \leq i \leq d^{'}$, then it must be the case that $r_{i} \geq N - 1$ for

all $1 \leq i \leq d^{'}$. But then since $s_{d^{'} + 1} = \sum_{i = 1}^{d^{'}} - \frac{n_{i}}{n_{d^{'} + 1}} s_{i}$ as noted earlier, and since

$\frac{n_{i}}{n_{d^{'} + 1}} < N - 1$ for all $1 \leq i \leq d^{'}$, $r + s_{d^{'} + 1} = \sum_{i = 1}^{d^{'}} (r_{i} - \frac{n_{i}}{n_{d^{'} + 1}}) s_{i}$, with all coefficients

in $[0, N)$. This implies that $r + s_{d^{'} + 1} \in R$. We have thus shown that if $r + s_{i} \notin R$ for all

$1 \leq i \leq d^{'}$, then $r + s_{d^{'} + 1} \in R$. In other words, there is some $1 \leq j \leq d^{'} + 1$ such

that $r + s_{j} \in R$. Fix this $j$. Then, $p + s_{j} = (r + s_{j}) + \sum_{i = 1}^{d^{'}} N v_{i} s_{i} + \sum_{l = 1}^{d - d^{'}} v_{l + d^{'}} u_{l}$.

Say that $x^{'} (p + s_{j}) = (m, r + s_{j})$ for some $m \in \{0, 1\}$. Then by definition of $X^{'}$, since

$p$ and $p + s_{j}$ lie in the same copy of $R$, it must be the case that $x^{'} (p) = (m, r)$. This

means that $x^{'} (p + s_{j})$ forces $x^{'} (p)$. Since $1 \leq j \leq d^{'} + 1$, this shows that $x^{'} (S^{'} + p)$

forces $x^{'} (p)$ for any $p \in Z^{d}$ , and so that $x (S^{'})$ forces $x (0)$ for any $x \in X$ .

It remains to show that $h^{t o p} (X) > 0$. Choose $M$ greater than $| N s_{i} |_{\infty}$ and $| u_{j} |_{\infty}$ for $1 \leq i \leq d^{'}$,

$1 \leq j \leq d - d^{'}$. Then, $R + \{\sum_{i = 1}^{d^{'}} N k_{i} s_{i} + \sum_{j = 1}^{d - d^{'}} k_{j + d^{'}} u_{j} : | k_{i} | \leq K\} \subseteq [- (K + 1) M d, (K + 1) M d]^{d}$

for any $K$. By the definition of $X$, there are at least $2^{| [- K, K]^{d} |}$

ways to fill letters of $x$ on $R + \{\sum_{i = 1}^{d^{'}} N k_{i} s_{i} + \sum_{j = 1}^{d - d^{'}} k_{j + d^{'}} u_{j} : | k_{i} | \leq K\}$, and so we

see that $H_{[- (K + 1) M d, (K + 1) M d]^{d}} (X) = H_{Γ_{2 (K + 1) M d + 1}} (X) \geq 2^{(2 K + 1)^{d}}$. This means that

$\frac{\ln H_{Γ_{2 (K + 1) M d + 1}} (X)}{(2 (K + 1) M d + 1)^{d}} \geq \frac{(\ln 2) (2 K + 1)^{d}}{(2 (K + 1) M d + 1)^{d}} \geq \frac{1}{(4 M d)^{d}}$

for all $K$, and therefore, by letting $K$ approach infinity, we see that $h^{t o p} (X) \geq \frac{1}{(4 M d)^{d}} > 0$.

$□$
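The construction in this proof can be made concrete in the smallest case. The sketch below assumes $d = 1$, $S = \{-1, 1\}$ and $t = 0$, so that $s_{1} = 1$, $s_{2} = -1$, $n_{1} = n_{2} = 1$, and $N = 3$ satisfies $N > n_{1} / n_{2} + 1$. These specific values and the function names are choices made for this illustration only. A point $x$ repeats each bit of $y$ three times and tags each copy with its position $r \in \{0, 1, 2\}$ inside the block.

```python
import random

N = 3  # block length; assumed value satisfying N > n_1 / n_2 + 1 = 2


def build_point(y, lo, hi):
    """x(r + N*v) = (y(v), r) for 0 <= r < N, restricted to lo <= p < hi."""
    return {p: (y[p // N], p % N) for p in range(lo, hi)}


def forced_value(left, right):
    """Recover x(p) from x(p - 1) = left and x(p + 1) = right."""
    r = (right[1] - 1) % N      # position of p inside its block
    if r + 1 < N:               # p + 1 lies in the same block as p
        return (right[0], r)
    return (left[0], r)         # otherwise p - 1 lies in the same block as p


def check_forcing(num_blocks, seed=0):
    """Verify on a random y that x(p - 1), x(p + 1) determine x(p)."""
    rng = random.Random(seed)
    y = {v: rng.randint(0, 1) for v in range(-num_blocks, num_blocks)}
    x = build_point(y, -N * num_blocks, N * num_blocks)
    inner = range(-N * num_blocks + 1, N * num_blocks - 1)
    return all(forced_value(x[p - 1], x[p + 1]) == x[p] for p in inner)
```

Because `forced_value` reads only the two neighboring letters, the same recovery works for every shift of such a point. Each block of three sites carries one free bit, so a window of length $3 K$ admits at least $2^{K}$ fillings, which is the source of positive entropy in this toy case.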

We claim that for a strongly irreducible shift of finite type $X$ , $H_{Γ_{R + 4}} (X) > (2 R + 4)^{d}$ .

We define $S_{1} = \{1, 2\} \times \{1, 2, ..., R + 4\}^{d - 1} \subset Γ_{R + 4}$ and $S_{2} = \{R + 3, R + 4\} \times \{1, 2, ..., R + 4\}^{d - 1} \subset Γ_{R + 4}$.

Then $d (S_{1}, S_{2}) > R$, and so by strong irreducibility

$H_{Γ_{R + 4}} (X) \geq H_{S_{1}} (X) H_{S_{2}} (X)$, which is greater than $4 (R + 4)^{2 d - 2}$ by Lemma 3.4.3.

Since $R + 4 \geq 4$,

$4 (R + 4)^{2 d - 2} \geq 4^{d - 1} (R + 4)^{d} \geq 2^{d} (R + 4)^{d} = (2 R + 8)^{d} > (2 R + 4)^{d}$.

Therefore, $H_{Γ_{R + 4}} (X) > (2 R + 4)^{d}$ . Consider any $y \in L_{Γ_{3 R + 7}} (X)$ . Then, since there

are $(2 R + 4)^{d}$ copies of $Γ_{R + 4}$ contained in $Γ_{3 R + 7}$ , there are at most $(2 R + 4)^{d}$ different

words in $L_{Γ_{R + 4}} (X)$ which are subwords of $y$ . Since $H_{Γ_{R + 4}} (X) > (2 R + 4)^{d}$ , this means

that there is $z \in L_{Γ_{R + 4}} (X)$ such that $z$ is not a subword of $y$ . Then, again using strong

irreducibility, we create $x \in X$ such that for any $v \in Z^{d}$ , $x (Γ_{R + 4} + (2 R + 4) v) = z$ .

(See Figure 3.17.) Then, for any $S$ a copy of $Γ_{3 R + 7}$ , $x (S)$ contains a $z$ , and therefore

$x (S) \neq y$. Thus, $y$ is not a subword of $x$, and so $X_{y}$ contains at least one point and

is nonempty.
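The two pieces of arithmetic used in this argument, the number of copies of $Γ_{R + 4}$ inside $Γ_{3 R + 7}$ and the chain of inequalities for $d \geq 2$, can be checked numerically. The Python sketch below is only a sanity check for small parameters; the function names are hypothetical, and $Γ_{k}$ is assumed to be the cube $\{1, ..., k\}^{d}$.

```python
from itertools import product


def count_subcubes(big, small, d):
    """Count translates of {1..small}^d inside {1..big}^d by enumeration."""
    offsets = range(0, big - small + 1)
    return sum(1 for _ in product(offsets, repeat=d))


def chain_holds(R, d):
    """Check 4(R+4)^(2d-2) >= 4^(d-1)(R+4)^d >= 2^d (R+4)^d = (2R+8)^d > (2R+4)^d."""
    a = 4 * (R + 4) ** (2 * d - 2)
    b = 4 ** (d - 1) * (R + 4) ** d
    c = 2 ** d * (R + 4) ** d
    return a >= b and b >= c and c == (2 * R + 8) ** d and c > (2 * R + 4) ** d
```

The count $(2 R + 4)^{d}$ is what bounds the number of distinct $Γ_{R + 4}$-subwords of $y$, and the chain shows that $H_{Γ_{R + 4}} (X)$ exceeds that bound.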

Figure 3.17: A point $x \in X_{y}$

We will use $y$ and $b_{j, d}$ to create, for $W = 8 R + 14$, and for any $j \geq 3 d + 2$, a word

$w_{j, d} \in L_{Γ_{j W}^{(2 W)}} (X)$ such that there cannot exist two overlapping occurrences of $w_{j, d}$

where one has nonempty intersection with the empty central $Γ_{(j - 4) W}$ of the other.

To do this, we first partition $Γ_{j W}^{(2 W)}$ into disjoint copies of $Γ_{W}$. The disjoint copies of

$Γ_{W}$ then have an obvious bijective correspondence to the points of $Γ_{j}^{(2)}$, illustrated in

Figure 3.18.

We then use each entry of $b_{j, d}$ to assign entries of $w_{j, d}$ to the corresponding $Γ_{W}$ in

$Γ_{j W}^{(2 W)}$. For $p \in Γ_{j}^{(2)}$, if $b_{j, d} (p) = 0$, then the least copy lexicographically of $Γ_{W - R}$ in

the $Γ_{W}$ corresponding to $p$ is filled with any word in $L_{Γ_{W - R}} (X_{y})$ , i.e. a word without

any occurrences of $y$ . If $b_{j, d} (p) = 1$ , then the least copy lexicographically of $Γ_{W - R}$ in

the $Γ_{W}$ corresponding to $p$ has $2^{d}$ occurrences of $y$ placed inside it, each one sharing

a vertex with it.

The remainder of $Γ_{j W}^{(2 W)}$ is then filled to make a word in $L (X)$ by using strong

irreducibility, since all of the filled portions are a distance of at least $R + 1$ from each

other. We claim that this word $w_{j, d} \in L_{Γ_{j W}^{(2 W)}} (X)$ has the desired aperiodicity

condition. Suppose not; then there exist two overlapping occurrences of $w_{j, d}$ such that

one has nonempty overlap with the empty central $Γ_{(j - 4) W}$ of the other, which implies

that $w_{j, d}$ is periodic with respect to some $v \neq 0$ with $| v_{i} | < (j - 2) W$ for $1 \leq i \leq d$.

We then define $v^{'}$ by defining $v_{i}^{'}$ to be the closest multiple of $W$ to $v_{i}$ for $1 \leq i \leq d$.

If two are equally close, choose either. In this way, we ensure that $| v_{i} - v_{i}^{'} | \leq \frac{W}{2}$ for

each $1 \leq i \leq d$. We make the notation $v^{''} := v - v^{'}$. Since each coordinate of $v^{'}$ is

divisible by $W$, $\frac{v^{'}}{W}$ has integer coordinates.
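The rounding step just described can be sketched as follows. This is an illustrative Python fragment with hypothetical function names; it breaks ties by rounding up, which the text permits since either choice is allowed.

```python
def nearest_multiple(v, W):
    """Round each coordinate of v to a closest multiple of W (ties round up)."""
    return tuple(W * ((2 * c + W) // (2 * W)) for c in v)


def split_period(v, W):
    """Split v into a vector of multiples of W plus a small remainder."""
    v_prime = nearest_multiple(v, W)
    remainder = tuple(c - cp for c, cp in zip(v, v_prime))
    return v_prime, remainder


# Sample values chosen for illustration only.
v_prime, remainder = split_period((13, -9, 4, 0), 8)
```

Every coordinate of the remainder has absolute value at most $W / 2$, and dividing the rounded vector by $W$ gives an integer vector, which is what allows the comparison with $b_{j, d}$.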

Figure 3.18: The correspondence between copies of $Γ_{W}$ and points in $Γ_{j}^{(2)}$

Figure 3.19: How a copy of $Γ_{W}$ is filled if $b_{j, d} (p) = 1$

Since $| v_{i}^{'} | \leq (j - 2) W$ for all $i$, every coordinate of $\frac{v^{'}}{W}$ has absolute value at most

$j - 2$. This implies that either $v^{'} = 0$ or $\frac{v^{'}}{W}$ is the difference between two overlapping

occurrences of $Γ_{j}^{(2)}$, one of which has nonempty intersection with the empty central

$Γ_{j - 4}$ of the other. Assume that the latter is the case. Then, by the already proven

aperiodicity condition on $b_{j, d}$, this implies that there exist $q$, $q + \frac{v^{'}}{W} \in Γ_{j}^{(2)}$ such that

$b_{j, d} (q) \neq b_{j, d} (q + \frac{v^{'}}{W})$. Without loss of generality, we assume that $b_{j, d} (q) = 1$ and

$b_{j, d} (q + \frac{v^{'}}{W}) = 0$.

Let's call $S$ the central $Γ_{W - 2 R}$ in the $Γ_{W}$ in $Γ_{j W}^{(2 W)}$ corresponding to $q$ and $T$ the

central $Γ_{W - 2 R}$ in the $Γ_{W}$ in $Γ_{j W}^{(2 W)}$ corresponding to $q + \frac{v^{'}}{W}$. Then $T - S = v^{'}$, and since

$w_{j, d}$ is periodic with respect to $v$, $w_{j, d} (S \cap (T - v)) = w_{j, d} ((S + v) \cap T)$. Writing $v^{''} = v - v^{'}$, we have

$T - v = (T - v^{'}) - v^{''} = S - v^{''}$. Since every coordinate of $v^{''}$ has absolute value at most $\frac{W}{2}$, $S \cap (S - v^{''})$,

and hence $S \cap (T - v)$, is nonempty. In fact, it is a rectangular prism $R_{s_{1}, ..., s_{d}}$ where

$s_{i} \geq (W - 2 R) - \frac{W}{2} = 3 R + 7$, $1 \leq i \leq d$. Therefore, $S \cap (T - v)$ contains one of

the $2^{d}$ copies of $Γ_{3 R + 7}$ in $S$ which share a vertex with $S$ . Since $b_{j, d} (q) = 1$ , and since

$S$ is the central $Γ_{W - 2 R}$ in the $Γ_{W}$ corresponding to $q$ , every one of these subcubes

of $S$ contains an occurrence of $y$ in $w_{j, d}$ . Therefore, $w_{j, d} (S \cap (T - v))$ has $y$ as a

subword. However, since $(S + v) \cap T \underline{\subset} T$ , $w_{j, d} ((S + v) \cap T)$ is a subword of $w_{j, d} (T)$ .

Since $b_{j, d} (q + \frac{v^{'}}{W}) = 0$, and since $T$ is the central $Γ_{W - 2 R}$ of the $Γ_{W}$ corresponding to

$q + \frac{v^{'}}{W}$, $w_{j, d} (T)$ contains no occurrences of $y$. Therefore, $w_{j, d} ((S + v) \cap T)$ contains

no occurrences of $y$ either. Since $w_{j, d} (S \cap (T - v)) = w_{j, d} ((S + v) \cap T)$ , we have a

contradiction.

The only remaining case is when $v^{'} = 0$, i.e. $| v_{i} | \leq \frac{W}{2}$ for $1 \leq i \leq d$. We then simply

take an integer multiple $n v$ of $v$ such that some coordinate of $n v$ has absolute value greater than $\frac{W}{2}$,

but at most $W$. Then, $n v$ is also a period of $w_{j, d}$, and we can repeat the argument

just made for the same contradiction.

$□$

Figure 3.20: $f^{'}$

We now return to our proof of the lower bound on $h^{t o p} (X) - h^{t o p} (X_{w})$ . We suppose

that our word $w \in L_{Γ_{n}} (X)$ which we are removing from $L (X)$ has $n > (3 d + 5) W$ .

Take $o$ to be the smallest integer such that $(o - 4) W > n + 2 R$. Then, $(o - 4) W > (3 d + 5) W$,

and so $o \geq 3 d + 2$. We also define $n^{'} = o W + 2 R$. By the definition of $o$,

$n^{'} < n + 4 R + 5 W \leq n + 44 R + 70$. Take ${\bar{μ}}_{w}$ to be any ergodic measure of maximal

entropy on $X_{w}$ . Since

$\sum_{w^{'} \in L_{Γ_{n^{'}}} (X_{w})} {\bar{μ}}_{w} ([w^{'}]) = {\bar{μ}}_{w} (\cup_{w^{'} \in L_{Γ_{n^{'}}} (X_{w})} [w^{'}]) = {\bar{μ}}_{w} (X_{w}) = 1$ ,

there must exist some word $f \in L_{Γ_{n^{'}}} (X_{w})$ such that ${\bar{μ}}_{w} ([f]) \geq \frac{1}{H_{Γ_{n^{'}}} (X_{w})}$. We note that

since $X_{w} \subset X$, $H_{Γ_{n^{'}}} (X_{w}) \leq H_{Γ_{n^{'}}} (X)$. By Lemma 3.2.1, $H_{Γ_{n^{'}}} (X) \leq e^{h^{t o p} (X) (n^{'} + R)^{d}}$.

Therefore, ${\bar{μ}}_{w} ([f]) \geq e^{- h^{t o p} (X) (n^{'} + R)^{d}}$.

This means that by using Lemma 3.2.7, we see that if, for any $ε > 0$ , we denote

by $B_{k, ε} (X_{w})$ the set of words in $L_{Γ_{k}} (X_{w})$ which have at least $k^{d} (e^{- h^{t o p} (X) (n^{'} + R)^{d}} - ε)$

occurrences of $f$ , then $\lim_{k \to \infty} \frac{\ln | B_{k, ε} (X_{w}) |}{k^{d}} = h^{t o p} (X_{w})$ .

In Figure 3.20, we show a word $f^{'} \in L_{Γ_{n^{'} - 2 R}} (X)$ constructed as follows: the copy of

$Γ_{o W}^{(2 W)}$ is filled with $w_{o, d}$ as constructed in Theorem 3.4.2, and the central $Γ_{n}$ is filled

with $w$ . The remaining shaded portion is filled using strong irreducibility to create a

word $f^{'} \in L (X)$ .

We may now finally describe our map $ψ_{k}$ . Consider any $u \in B_{k, ε} (X_{w})$ . By definition,

$u$ has at least $k^{d} (e^{- h^{t o p} (X) (n^{'} + R)^{d}} - ε)$ occurrences of $f$ . We choose a set of disjoint

occurrences of $f$ using a simple algorithm: choose any occurrence of $f$ to begin, call

it $f^{(1)}$. At most $3^{d} (n^{'})^{d}$ occurrences of $f$ overlap $f^{(1)}$, and so we choose any occurrence

of $f$ which does not overlap $f^{(1)}$, and call it $f^{(2)}$. $f^{(3)}$ is any occurrence of $f$ which

does not overlap $f^{(1)}$ or $f^{(2)}$, and we continue in this fashion. We are able to choose

at least $k^{d} 3^{- d} (n^{'})^{- d} (e^{- h^{t o p} (X) (n^{'} + R)^{d}} - ε)$ disjoint occurrences of $f$ in $u$ in this way. We

then choose any nonempty subset of this chosen set of occurrences of $f$ , and for each

chosen occurrence of $f$ , if its shape is $U$ a copy of $Γ_{n^{'}}$ , we use strong irreducibility to

replace the central $Γ_{n^{'} - 2 R}$ of $U$ by $f^{'}$ . $ψ_{k} (u)$ is defined to be the set of words which

could be created by performing such replacements on $u$ . The cardinality of $ψ_{k} (u)$ is

then at least $2^{k^{d} 3^{- d} (n^{'})^{- d} (e^{- h^{t o p} (X) (n^{'} + R)^{d}} - ε)} - 1$ for any $u \in B_{k, ε} (X_{w})$.
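The greedy selection of disjoint occurrences used in this paragraph can be sketched in Python. The representation is an assumption made for the sketch: each occurrence is recorded by the least corner of its cube of side `n`, and the function names are hypothetical.

```python
from itertools import product


def overlaps(p, q, n):
    """True if the cubes of side n with least corners p and q intersect."""
    return all(abs(a - b) < n for a, b in zip(p, q))


def greedy_disjoint(corners, n):
    """Greedily keep corners whose cubes miss every cube kept so far."""
    chosen = []
    for p in corners:
        if not any(overlaps(p, q, n) for q in chosen):
            chosen.append(p)
    return chosen


# Illustration: every position in a 6 x 6 grid is an occurrence, with n = 2.
corners = list(product(range(6), repeat=2))
chosen = greedy_disjoint(corners, 2)
```

Each kept cube rules out at most $(2 n - 1)^{d} < 3^{d} n^{d}$ corners, itself included, so the procedure keeps at least a $3^{- d} n^{- d}$ fraction of the occurrences, matching the count in the text.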

We claim that $ψ_{k} (u) \cap ψ_{k} (u^{'}) = \emptyset$ for any $u \neq u^{'} \in B_{k, ε} (X_{w})$ . To show this, we

first show an auxiliary claim: that no occurrence of $f^{'}$ is ever incidentally created

in the replacement process of $ψ_{k}$ . In other words, when some occurrences of $f$ in

$u \in B_{k, ε} (X_{w})$ are replaced by words containing $f^{'}$ to create an element of $ψ_{k} (u)$ , the

only occurrences of $f^{'}$ in the result are the ones which are subwords of each replaced

occurrence of $f$ . Suppose that this is not the case; that for some $v \in B_{k, ε} (X_{w})$ and

$v^{'} \in ψ_{k} (v)$, $v^{'}$ contains an occurrence of $f^{'}$ which occupies a copy of $Γ_{n^{'} - 2 R}$, which we

call $B^{'}$, which was not the central $Γ_{n^{'} - 2 R}$ of a copy of $Γ_{n^{'}}$ which was occupied by one

of the replaced occurrences of $f$ in $v$. Denote by $B^{''}$ the central $Γ_{n}$ of $B^{'}$ and by $B$ the

copy of $Γ_{n^{'}}$ in which $B^{'}$ is central. Then $v^{'} (B^{''}) = w$. Since $v \in L (X_{w})$, $v (B^{''}) \neq w$.

This implies that one of the replacements made had nonempty intersection with $B^{''}$,

otherwise $v (B^{''}) = v^{'} (B^{''})$, a contradiction. Call the $Γ_{n^{'}}$ where this replacement

was made $C$, and call its central $Γ_{n^{'} - 2 R}$ $C^{'}$. By our hypothesis, $B$ was not one

of the replaced copies of $Γ_{n^{'}}$, and so $B \neq C$. Since all of the replacements made

were disjoint, $v^{'} (C^{'}) = f^{'}$. We know that $v^{'} (B^{'}) = f^{'}$ as well. Since $C \cap B^{''} \neq \emptyset$,

$| B - C |_{\infty} \leq n + \frac{n^{'} - n}{2} = \frac{n^{'} + n}{2}$. But, $\frac{n^{'} + n}{2} = \frac{n^{'} - o W}{2} + \frac{o W + n}{2}$. Since $n^{'} = o W + 2 R + 2 t$,

$\frac{n^{'} - o W}{2} + \frac{o W + n}{2} = R + t + \frac{o W + n}{2}$. It was part of the definition of $o$ that $(o - 4) W > n + 2 R + 2 t$,

so $R + t < \frac{(o - 4) W - n}{2}$. Therefore, $R + t + \frac{o W + n}{2} < \frac{(o - 4) W - n}{2} + \frac{o W + n}{2} = (o - 2) W$.

We have then shown that $| B - C |_{\infty} < (o - 2) W$. Since $B^{'}$ is central in $B$ and $C^{'}$

is central in $C$, $B^{'} - C^{'} = B - C$ and so $| B^{'} - C^{'} |_{\infty} < (o - 2) W$ as well. We make

one more notation: call $E^{'}$ the boundary of thickness $2 W$ of $B^{'}$ and $F^{'}$ the boundary

of thickness $2 W$ of $C^{'}$. Since $B \neq C$, $E^{'} \neq F^{'}$. Then, since $v^{'} (B^{'}) = v^{'} (C^{'}) = f^{'}$,

$v^{'} (E^{'}) = v^{'} (F^{'}) = w_{o, d}$. Also, $E^{'} - F^{'} = B^{'} - C^{'}$, so $| E^{'} - F^{'} |_{\infty} < (o - 2) W$,

which implies that $E^{'}$ has nonempty intersection with the central $Γ_{(o - 4) W}$ of $F^{'}$, a

contradiction to the aperiodicity property of $w_{o, d}$ .

We have then shown that the only occurrences of $f^{'}$ in any $v^{'} \in ψ_{k} (v)$ for some

$v \in B_{k, ε} (X_{w})$ are those which occupy the central $Γ_{n^{'} - 2 R}$ of replaced occurrences of $f$ .

We also have a sort of converse result: for any such $v$, $v^{'}$, and for every $U$ a copy of $Γ_{n^{'}}$

which is filled with an occurrence of $f$ in $v$ which is replaced, the central $Γ_{n^{'} - 2 R}$ of $U$

is filled with $f^{'}$ in $v^{'}$ . This fact rests on the disjointness of the replacements made in

the definition of $ψ_{k}$ ; since these replacements are disjoint, each letter is changed at

most once.

Now, consider any $v^{'} \in ψ_{k} (v)$ for $v \in B_{k, ε} (X_{w})$ . Since $v^{'} \in ψ_{k} (v)$ , $v^{'}$ has at least one

occurrence of $f^{'}$ . Since the only occurrences of $f^{'}$ in $v^{'}$ come from replacements during

the application of $ψ_{k}$ , we know that for every $U$ a copy of $Γ_{n^{'} - 2 R}$ with $v^{'} (U) = f^{'}$ , if

we call $U^{'}$ the copy of $Γ_{n^{'}}$ in which $U$ is central, it must be the case that $v (U^{'}) = f$ .

We also know that these are the only replacements performed in turning $v$ into $v^{'}$ ,

since any others would have resulted in more occurrences of $f^{'}$ . So, $v^{'}$ determines the

set of replacements which were made in $v$ . However, trivially $v = v^{'}$ outside of the

regions where replacements occurred, and so $v^{'}$ determines the letters of $v$ outside

regions where replacements occurred, as well. Therefore, $v$ is uniquely determined by

$v^{'}$ , and so we know that for $v \neq v^{'} \in B_{k, ε} (X_{w})$ , $ψ_{k} (v) \cap ψ_{k} (v^{'}) = \emptyset$ .

Since $| ψ_{k} (v) | \geq 2^{k^{d} 3^{- d} (n^{'})^{- d} (e^{- h^{t o p} (X) (n^{'} + R)^{d}} - ε)} - 1 > 2^{k^{d} 3^{- d + 1} n^{- d} (e^{- h^{t o p} (X) (n^{'} + R)^{d}} - ε)}$ for $v \in$

$B_{k, ε} (X_{w})$ , we have shown that

$H_{Γ_{k}} (X) > 2^{k^{d} 3^{- d + 1} n^{- d} (e^{- h^{t o p} (X) (n^{'} + R)^{d}} - ε)} | B_{k, ε} (X_{w}) |$

for large $k$ . Take natural logarithms of both sides, divide by $k^{d}$ , and let $k \to \infty$ to

see that

$h^{t o p} (X) \geq (\ln 2) (3^{- d + 1} n^{- d} (e^{- h^{t o p} (X) (n^{'} + R)^{d}} - ε)) + h^{t o p} (X_{w})$.

Since $ε$ was arbitrary, we allow it to approach zero, and so

$h^{t o p} (X) - h^{t o p} (X_{w}) \geq (\ln 2) \cdot 3^{- d + 1} n^{- d} e^{- h^{t o p} (X) (n^{'} + R)^{d}}$,

which for large enough $n$ , gives

$h^{t o p} (X) - h^{t o p} (X_{w}) > e^{- h^{t o p} (X) (n^{'} + R + 1)^{d}}$ ,

or, by replacing $n^{'}$ by its maximum possible value $n + 44 R + 69$ ,

$h^{t o p} (X) - h^{t o p} (X_{w}) > \frac{1}{e^{h^{t o p} (X) (n + 44 R + 70)^{d}}}$ .

Combining this with the earlier upper bound on $h^{t o p} (X) - h^{t o p} (X_{w})$ , we have

$\frac{1}{e^{h^{t o p} (X) (n + 44 R + 70)^{d}}} < h^{t o p} (X) - h^{t o p} (X_{w}) < \frac{D_{X}}{e^{h^{t o p} (X) (n - 2 R)^{d}}}$ .

$□$
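
To get a feel for how far apart the two bounds in the final display are, the following Python sketch evaluates their natural logarithms for sample parameters. The function name `log_entropy_gap_bounds` and the sample values are illustrative assumptions. In particular, $D_{X}$ is a constant from the statement of Theorem 3.1.22 whose value is not given in this section, so it is passed in as a hypothetical argument.

```python
import math

def log_entropy_gap_bounds(h, n, R, d, D_X):
    """Return (log_lower, log_upper) for the bounds
        e^{-h (n + 44R + 70)^d} < h_top(X) - h_top(X_w) < D_X e^{-h (n - 2R)^d}.
    Logarithms are returned because the bounds underflow quickly as floats."""
    log_lower = -h * (n + 44 * R + 70) ** d
    log_upper = math.log(D_X) - h * (n - 2 * R) ** d
    return log_lower, log_upper

# Hypothetical sample values: entropy ln 2, words on 10 x 10 squares, R = 1, D_X = 1.
lo, hi = log_entropy_gap_bounds(h=math.log(2), n=10, R=1, d=2, D_X=1.0)
```

Comparing `lo` and `hi` for growing `R` makes visible that the exponents differ by the additive shift $46 R + 70$ in the side length, which is the gap discussed in Section 3.7.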

3.5 A closer look at the main result

We here wish to briefly review the proof of Theorem 3.1.22, and point out that most

of the complications encountered in the proof were due to possible problems with

periodicity of $w$ . If some assumptions about the aperiodicity of $w$ are made, the proof can

be simplified significantly. We will skip some of the details of these proofs, as they

are just simplified versions of arguments already made in the main text.

Theorem 3.5.1. For any strongly irreducible shift $X = Ω_{F}$ of finite type with uniform

filling length $R$ and positive topological entropy $h^{t o p} (X)$ , and for any words $w$ , $w^{'} \in$

$L_{Γ_{n}} (X)$ such that $w$ and $w^{'}$ agree on the boundary of thickness $t$ and any replacement

of $w$ by $w^{'}$ or vice versa cannot create or destroy any other occurrences of $w$ and $w^{'}$ ,

if we denote by $X_{w}$ the shift of finite type $Ω_{F \cup \{w\}}$ , then for any ${\bar{μ}}_{w}$ an ergodic measure

of maximal entropy on $X_{w}$ and any $\bar{μ}$ an ergodic measure of maximal entropy on $X$ ,

$(\ln 2) {\bar{μ}}_{w} ([w^{'}]) \leq h^{t o p} (X) - h^{t o p} (X_{w}) \leq 2 (\ln 2) \bar{μ} ([w])$ .

Proof: To prove an upper bound on $h^{t o p} (X) - h^{t o p} (X_{w})$ , define a mapping $θ_{k}$ (similar

to $φ_{k}$ from the upper bound portion of the proof of Theorem 3.1.22) on $L_{Γ_{k + 2 n}} (X)$ :

given $v \in L_{Γ_{k + 2 n}} (X)$ , one replaces the occurrences of $w$ by $w^{'}$ , going in order lexico-

graphically. Since none of these replacements can make new occurrences of $w$ , this

process terminates. The word occupying the central $Γ_{k}$ of the resulting word is then in

$L_{Γ_{k}} (X_{w})$ for the same reasons as before, and we call this word $θ_{k} (v)$ . We then wish to

bound from above the size of $θ_{k}^{- 1} (v) \cap A_{k + 2 n, ε, \bar{μ}, w, w^{'}} (X)$ for any $v \in L_{Γ_{k + 2 n}} (X)$ and any

ergodic measure of maximal entropy $\bar{μ}$ on $X$ . Recall that $A_{k + 2 n, ε, \bar{μ}, w, w^{'}} (X)$ is the set of

words in $L_{Γ_{k + 2 n}} (X)$ which have between $(k + 2 n)^{d} (\bar{μ} ([x]) - ε)$ and $(k + 2 n)^{d} (\bar{μ} ([x]) + ε)$

occurrences of $x$ for $x = w$ , $w^{'}$ .
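
The replacement procedure behind $θ_{k}$ can be sketched in a one-dimensional analogue, with Python strings standing in for words on $Γ_{k + 2 n}$ and left-to-right order standing in for lexicographic order. This is only an illustration under stated assumptions: the name `replace_in_order`, the step cap `max_steps`, and the sample words are hypothetical, and the final restriction to the central $Γ_{k}$ is omitted.

```python
def replace_in_order(v, w, w_prime, max_steps=10_000):
    """Replace the leftmost occurrence of w by w_prime until none remain.

    Termination relies on the hypothesis that a replacement never creates
    a new occurrence of w; max_steps is a guard for inputs violating it."""
    if len(w) != len(w_prime):
        raise ValueError("w and w_prime must have the same length")
    for _ in range(max_steps):
        i = v.find(w)
        if i == -1:
            return v  # no occurrence of w is left
        v = v[:i] + w_prime + v[i + len(w):]
    raise RuntimeError("did not terminate; the no-new-occurrence hypothesis may fail")

# "abc" and "adc" agree on their boundary letters, and swapping the middle
# letter cannot create or destroy any other occurrence of either word.
result = replace_in_order("abcabcx", "abc", "adc")
```

With these sample words each occurrence of `"abc"` is rewritten exactly once, mirroring the fact used in the counting argument that every replaced position ends up carrying $w^{'}$ .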

Consider any $x \in θ_{k}^{- 1} (v) \cap A_{k + 2 n, ε, \bar{μ}, w, w^{'}} (X)$ . Because of the fact that occurrences of

$w$ and $w^{'}$ are not incidentally created nor destroyed during the replacements made in

the process of changing $x$ to $v$ , the word occupying a shape $S$ a copy of $Γ_{n}$ will be

changed from $w$ to $w^{'}$ at some point during these replacements if and only if $x (S) = w$ .

If $x (S)$ $= w$ , then $v (S) = w^{'}$ , since once an occurrence of $w^{'}$ is created, it cannot

be eliminated in any of the other replacements made in the process of changing $x$

to $θ_{k} (x) = v$ . Therefore, any replacements made are at copies $S$ of $Γ_{n}$ such that

$v (S) = w^{'}$ . We have then shown that if we denote by $S_{1}$ , . . . , $S_{p}$ the set of copies $S$

of $Γ_{n}$ such that $v (S) = w^{'}$ , then $x$ agrees with $v$ on $Γ_{k} ∖ (\cup_{i = 1}^{p} S_{i})$ , and for each $i$ ,

$x (S_{i}) = w$ or $x (S_{i}) = w^{'}$ . This shows that there are at most $2^{p}$ choices for $x (Γ_{k})$ , and

so since there are at most $| A |^{(k + 2 n)^{d} - k^{d}}$ ways to fill $Γ_{k + 2 n}^{(n)}$ with letters, that $| θ_{k}^{- 1} (v) | \leq$

$2^{p} | A |^{(k + 2 n)^{d} - k^{d}}$ . But, $v = θ_{k} (x)$ , and $x \in A_{k + 2 n, ε, \bar{μ}, w, w^{'}} (X)$ . Therefore, $p$ , the number

of occurrences of $w^{'}$ in $v$ , is equal to $q + r$ , where $q$ is the number of occurrences of

$w$ in $x$ and $r$ is the number of occurrences of $w^{'}$ in $x$ . By definition of $A_{k + 2 n, ε, \bar{μ}, w, w^{'}} (X)$ ,

$q \leq (k + 2 n)^{d} (\bar{μ} ([w]) + ε)$ and $r \leq (k + 2 n)^{d} (\bar{μ} ([w^{'}]) + ε)$ . By Proposition 3.2.5,

since $w$ and $w^{'}$ agree on their boundary of thickness $t$ , $\bar{μ} ([w]) = \bar{μ} ([w^{'}])$ . Therefore,

$p \leq 2 (k + 2 n)^{d} (\bar{μ} ([w]) + ε)$ , and so we see that if $θ_{k}^{- 1} (v) \cap A_{k + 2 n, ε, \bar{μ}, w, w^{'}} (X)$ is nonempty,

then it has cardinality at most $2^{2 (k + 2 n)^{d} (\bar{μ} ([w]) + ε)} | A |^{(k + 2 n)^{d} - k^{d}}$ . This implies that

$H_{Γ_{k}} (X_{w}) \geq \frac{| A_{k + 2 n, ε, \bar{μ}, w, w^{'}} (X) |}{2^{2 (k + 2 n)^{d} (\bar{μ} ([w]) + ε)} | A |^{(k + 2 n)^{d} - k^{d}}}$ .

Take natural logarithms of both sides, divide by $(k + 2 n)^{d}$ , and let $k \to \infty$ to get

$\lim_{k \to \infty} \frac{k^{d}}{(k + 2 n)^{d}} \frac{\ln H_{Γ_{k}} (X_{w})}{k^{d}} \geq \lim_{k \to \infty} \frac{\ln | A_{k + 2 n, ε, \bar{μ}, w, w^{'}} (X) |}{(k + 2 n)^{d}} - 2 (\ln 2) (\bar{μ} ([w]) + ε) - \frac{\ln | A | ((k + 2 n)^{d} - k^{d})}{(k + 2 n)^{d}}$ ,

and by using the definition of entropy and Lemma 3.2.7, we see that

$h^{t o p} (X_{w}) \geq h_{\bar{μ}} (X) - 2 (\ln 2) (\bar{μ} ([w]) + ε)$ .

Since $\bar{μ}$ is a measure of maximal entropy, and since $ε$ can be arbitrarily small, we may

rearrange to arrive at

$h^{t o p} (X) - h^{t o p} (X_{w}) \leq 2 (\ln 2) \bar{μ} ([w])$ .

We could then use Lemma 3.2.6 to further bound this from above, but we leave the

bound as it is for now, to emphasize the fact that our upper bound comes from $\bar{μ} ([w])$ .

We can also give a similarly shortened proof of a lower bound on $h^{t o p} (X) - h^{t o p} (X_{w})$ .

Define a map $δ_{k}$ (similar to $ψ_{k}$ from the proof of the lower bound portion of Theorem

3.1.22) from $L_{Γ_{k}} (X_{w})$ to $\mathcal{P} (L_{Γ_{k}} (X))$ as follows: for any $v \in L_{Γ_{k}} (X_{w})$ which has

at least one occurrence of $w^{'}$ , take any nonempty subset of the occurrences of $w^{'}$ in $v$

and replace them all by $w$ . It is possible to perform all of these replacements simulta-

neously because the hypotheses of Theorem 3.5.1 imply that the portions of any two

occurrences of $w^{'}$ which would need to be changed to turn them into $w$ are disjoint.

(If not, then a replacement of one of them by $w$ would destroy the other.) Therefore,

for any ergodic measure of maximal entropy ${\bar{μ}}_{w}$ on $X_{w}$ and for any $v \in A_{k, ε, {\bar{μ}}_{w}, w^{'}} (X_{w})$ ,

$| δ_{k} (v) | \geq 2^{k^{d} ({\bar{μ}}_{w} ([w^{'}]) - ε)} - 1$ . For exactly the same reasons as in the earlier lower bound

proof, $δ_{k} (v) \cap δ_{k} (v^{'}) = \emptyset$ for $v \neq v^{'} \in A_{k, ε, {\bar{μ}}_{w}, w^{'}} (X_{w})$ . Therefore,

$H_{Γ_{k}} (X) > (2^{k^{d} ({\bar{μ}}_{w} ([w^{'}]) - ε)} - 1) | A_{k, ε, {\bar{μ}}_{w}, w^{'}} (X_{w}) |$ .

Take natural logarithms of both sides, divide by $k^{d}$ , and let $k \to \infty$ to see that

$h^{t o p} (X) \geq (\ln 2) ({\bar{μ}}_{w} ([w^{'}]) - ε) + \limsup_{k \to \infty} \frac{\ln | A_{k, ε, {\bar{μ}}_{w}, w^{'}} (X_{w}) |}{k^{d}}$ ,

and by using Lemma 3.2.7 and letting $ε$ approach zero since it is arbitrary,

$h^{t o p} (X) \geq (\ln 2) {\bar{μ}}_{w} ([w^{'}]) + h_{{\bar{μ}}_{w}} (X_{w})$ .

Finally, we use the fact that ${\bar{μ}}_{w}$ is a measure of maximal entropy on $X_{w}$ to rewrite as

$h^{t o p} (X) - h^{t o p} (X_{w}) \geq (\ln 2) {\bar{μ}}_{w} ([w^{'}])$ .

$□$
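
The quantity $h^{t o p} (X) - h^{t o p} (X_{w})$ is easy to compute numerically in the simplest one-dimensional setting, which may help in reading the bounds above. The sketch below is not part of the proof: it assumes $X$ is the full shift on a finite alphabet in dimension one, counts words avoiding $w$ with a standard Knuth-Morris-Pratt automaton, and estimates the entropy of $X_{w}$ crudely from the ratio of two consecutive counts. The names `avoid_counts` and `entropy_gap` and the default word length are hypothetical choices.

```python
import math

def avoid_counts(w, alphabet, length):
    """counts[N] = number of words of length N over `alphabet` with no occurrence of w."""
    m = len(w)
    # fail[i] = length of the longest proper border of w[:i] (KMP failure function).
    fail = [0] * (m + 1)
    k = 0
    for i in range(1, m):
        while k > 0 and w[i] != w[k]:
            k = fail[k]
        if w[i] == w[k]:
            k += 1
        fail[i + 1] = k

    def step(s, a):
        # Longest prefix of w that is a suffix of w[:s] + a, for a state s < m.
        while s > 0 and w[s] != a:
            s = fail[s]
        return s + 1 if w[s] == a else 0

    state = [1] + [0] * (m - 1)  # state[s] = words whose longest matched prefix has length s
    counts = [1]
    for _ in range(length):
        new = [0] * m
        for s, c in enumerate(state):
            if c:
                for a in alphabet:
                    t = step(s, a)
                    if t < m:  # t == m would complete an occurrence of w
                        new[t] += c
        state = new
        counts.append(sum(state))
    return counts

def entropy_gap(w, alphabet="01", length=200):
    """Estimate h_top(X) - h_top(X_w) for the full shift X on `alphabet`."""
    counts = avoid_counts(w, alphabet, length)
    h_w = math.log(counts[-1] / counts[-2])  # growth-rate estimate of h_top(X_w)
    return math.log(len(alphabet)) - h_w
```

For the golden mean shift (the word `"11"` forbidden in the full 2-shift) the counts are Fibonacci numbers, so the estimate can be compared against $\ln 2 - \ln \frac{1 + \sqrt{5}}{2}$ . Longer forbidden words give smaller gaps, in line with the exponential decay in the size of $w$ .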

We recall that the upper and lower bounds in Theorem 3.1.16 (the one-dimensional

case) have a ratio bounded away from zero and infinity, but that we showed that it is

impossible to have such close bounds in the multidimensional case which depend only

on the size of $w$ . It is then natural to wonder whether or not it is possible, at least

under the hypotheses on $w$ from Theorem 3.5.1 and for large $n$ , to have a lower bound

of the form $C \bar{μ} ([w])$ for $h^{t o p} (X) - h^{t o p} (X_{w})$ . It turns out that this cannot be true if

there are multiple measures of maximal entropy. In one dimension, the uniqueness

of a measure of maximal entropy of any mixing shift of finite type is well-known,

but Burton and Steif ([BuS1], [BuS2]) have shown that there are multidimensional

strongly irreducible shifts of finite type which have more than one measure of maximal

entropy. In order to show that this type of lower bound is too good to be true, we

need a lemma:

Lemma 3.5.2. If $X$ is a strongly irreducible $Z^{d}$ -shift of finite type $t$ and if we denote

by $E_{n} (X)$ the set of words $w \in L_{Γ_{n}} (X)$ for which there exists $w^{'} \in L_{Γ_{n}} (X)$ where $w$

and $w^{'}$ agree on the boundary of thickness $t$ and such that replacing $w$ by $w^{'}$ or vice

versa can never incidentally create nor destroy an occurrence of $w$ or $w^{'}$ , then for

any ergodic measure $\bar{μ}$ of maximal entropy on $X$ , $\lim_{n \to \infty} \bar{μ} (E_{n} (X)) = 1$ .

Proof: We claim that as long as $n$ is sufficiently large, and $w$ has the property

that given any two disjoint copies $S$ and $T$ of $Γ_{\lfloor \sqrt{n} \rfloor}$ contained in $Γ_{n}$ , $w (S) \neq w (T)$ ,

then $w \in E_{n} (X)$ . Consider any $w$ with the property just described. Then create

a standard replacement of $w$ in the same way as in the proof of Theorem 3.3.1, by

changing $w$ only on a central copy of $Γ_{2 R + l}$ . (Recall that $l = \lceil (\frac{d \ln n}{h^{t o p} (X)})^{\frac{1}{d}} \rceil + 1$ or

$l = \lceil (\frac{d \ln n}{h^{t o p} (X)})^{\frac{1}{d}} \rceil + 2$ and that $l$ is chosen so that there exists $a \in L_{Γ_{l}} (X)$ so that $a$

is not a subword of $w$ . We previously chose $l$ to be odd, but here we take it to have the

same parity as $n$ so that there is a central $Γ_{l}$ within $Γ_{n} .$ ) We denote by $K$ the central

copy of $Γ_{2 R + l}$ on which $w$ is changed to create $w^{'}$ , and so $w (Γ_{n} ∖ K)$ $= w^{'} (Γ_{n} ∖ K)$ .
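
The non-repetition hypothesis on $w$ stated at the start of this proof (no two disjoint copies of $Γ_{\lfloor \sqrt{n} \rfloor}$ inside $Γ_{n}$ carry the same pattern) can be checked mechanically for a given word. The following Python sketch does this for $d = 2$ only, representing $w$ as a list of $n$ rows of length $n$ . The function name and the representation are assumptions made for illustration.

```python
import math
from itertools import combinations

def has_repeated_disjoint_blocks(w):
    """w: n x n list of rows. Return True if two disjoint s x s sub-blocks,
    with s = floor(sqrt(n)), carry the same pattern."""
    n = len(w)
    s = math.isqrt(n)
    positions = {}  # pattern -> list of top-left corners where it occurs
    for i in range(n - s + 1):
        for j in range(n - s + 1):
            block = tuple(tuple(w[i + a][j:j + s]) for a in range(s))
            positions.setdefault(block, []).append((i, j))
    for corners in positions.values():
        for (i1, j1), (i2, j2) in combinations(corners, 2):
            # Two s x s squares are disjoint iff they are separated by
            # at least s in some coordinate.
            if abs(i1 - i2) >= s or abs(j1 - j2) >= s:
                return True
    return False

constant_word = [[0] * 4 for _ in range(4)]
distinct_word = [[4 * i + j for j in range(4)] for i in range(4)]
```

A word for which this function returns `False` satisfies the hypothesis, and so by the claim lies in $E_{n} (X)$ once $n$ is sufficiently large. Highly repetitive words such as `constant_word` are exactly the ones the lemma has to discard.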

We have four cases that we would like to show are impossible:

Case 1: Suppose that a replacement of $w$ by $w^{'}$ could possibly create a new occur-

rence of $w$ . More rigorously, that there exists $x \in X$ and copies $T$ and $T^{'}$ of $Γ_{n}$ such

that $x (T) = w$ , and if we define by $x^{'}$ the element of $X$ created by replacing $x (T)$ by

$w^{'}$ , then $x^{'} (T^{'}) = w$ , but $x (T^{'}) \neq w$ . This means that $x^{'} (T) = w^{'}$ and $x^{'} (T^{'}) = w$ .

Since the central $Γ_{l}$ in $w^{'}$ is occupied by $a$ , this implies that $| T^{'} - T |_{\infty} > \frac{n - l}{2}$ , or else

$x^{'} (T^{'}) = w$ would have to contain this occurrence of $a$ . However, in order for $x^{'} (T^{'})$

to be a newly created occurrence of $w$ , it must be the case that $T^{'}$ has nonempty

overlap with $K + (T - Γ_{n})$ , and so $| T^{'} - T |_{\infty} < \frac{n + (2 R + l)}{2}$ . By this upper bound on

$| T^{'} - T |_{\infty}$ , $T \cap T^{'}$ contains a cube of size at least $\frac{n - (2 R + l)}{2}$ , which must contain a copy

of $Γ_{\lfloor \sqrt{n} \rfloor}$ disjoint from the very small sets $K + (T - Γ_{n})$ and $K + (T^{'} - Γ_{n})$ , call it $U$ .

Since $x^{'} (T) = w^{'}$ , $x^{'} (U) = w^{'} (U - (T - Γ_{n}))$ . Since $U$ is disjoint from $K + (T - Γ_{n})$ ,

$w^{'} (U - (T - Γ_{n})) = w (U - (T - Γ_{n}))$ . Since $x^{'} (T^{'}) = w$ , $x^{'} (U) = w (U - (T^{'} - Γ_{n}))$ .

Therefore, $w (U - (T - Γ_{n})) = w (U - (T^{'} - Γ_{n}))$ . But, since $| T^{'} - T |_{\infty} > \frac{n - l}{2}$ , which

is greater than $\sqrt{n}$ for large $n$ , $U - (T - Γ_{n})$ and $U - (T^{'} - Γ_{n})$ are disjoint copies of

$Γ_{\lfloor \sqrt{n} \rfloor}$ , and so we have a contradiction.

Case 2: Suppose that a replacement of $w$ by $w^{'}$ could possibly destroy an existing

occurrence of $w^{'}$ . More rigorously, that there exists $x \in X$ and copies $T$ and $T^{'}$ of

$Γ_{n}$ such that $x (T) = w$ , and if we define by $x^{'}$ the element of $X$ created by replacing

$x (T)$ by $w^{'}$ , then $x^{'} (T^{'}) \neq w^{'}$ , but $x (T^{'}) = w^{'}$ . This means that $x (T^{'}) = w^{'}$ and

$x (T) = w$ . Everything from here proceeds to a contradiction exactly as in Case 1.

Case 3: Suppose that a replacement of $w$ by $w^{'}$ could possibly destroy an existing

occurrence of $w$ . More rigorously, that there exists $x \in X$ and copies $T$ and $T^{'}$ of $Γ_{n}$

such that $x (T) = w$ , and if we define by $x^{'}$ the element of $X$ created by replacing $x (T)$

by $w^{'}$ , then $x^{'} (T^{'}) \neq w$ , but $x (T^{'}) = w$ . This means that $x (T) = w$ and $x (T^{'}) = w$ .

However, since $x^{'} (T^{'}) \neq w$ , it must be the case that $K + (T - Γ_{n})$ has nonempty

intersection with $T^{'}$ . This again implies that $| T^{'} - T |_{\infty} < \frac{n + (2 R + l)}{2}$ . It is then possible

to choose a positive integer $k$ so that $\sqrt{n} < | k (T^{'} - T) |_{\infty} < n - \sqrt{n}$ . Then choose

any $U$ a copy of $Γ_{\lfloor \sqrt{n} \rfloor}$ so that $U$ , $U + k (T - T^{'}) \subseteq T$ . Since $x (T) = w$ , $x (U) = w (U)$ .

Since $x (T^{'}) = w$ , $x (U) = w (U + (T - T^{'}))$ . So, $w (U) = w (U + (T - T^{'}))$ . In the same

way, we can see that $w (U) = w (U + k (T - T^{'}))$ , and since $| k (T - T^{'}) |_{\infty} > \sqrt{n}$ , $U$ and

$U + k (T - T^{'})$ are disjoint and so we have a contradiction.

Case 4: Suppose that a replacement of $w$ by $w^{'}$ could possibly create a new occur-

rence of $w^{'}$ . More rigorously, that there exists $x \in X$ and copies $T$ and $T^{'}$ of $Γ_{n}$ such

that $x (T) = w$ , and if we define by $x^{'}$ the element of $X$ created by replacing $x (T)$ by

$w^{'}$ , then $x^{'} (T^{'}) = w^{'}$ , but $x (T^{'}) \neq w^{'}$ . This means that $x^{'} (T) = w^{'}$ and $x^{'} (T^{'}) = w^{'}$ .

However, since $x (T^{'}) \neq w^{'}$ , it must be the case that $K + (T - Γ_{n})$ has nonempty

intersection with $T^{'}$ . This again implies that $| T^{'} - T |_{\infty} < \frac{n + (2 R + l)}{2}$ . It is then pos-

sible to choose a positive integer $k$ so that $\sqrt{n} < | k (T^{'} - T) |_{\infty} < n - \sqrt{n}$ . Then,

choose any $U$ a copy of $Γ_{\lfloor \sqrt{n} \rfloor}$ so that $U$ , $U + k (T - T^{'}) \subseteq T$ , and both are disjoint

from $K + (T - Γ_{n})$ . Then, since $x^{'} (T) = w^{'}$ , $x^{'} (U) = w^{'} (U)$ . Since $x^{'} (T^{'}) = w^{'}$ ,

$x^{'} (U) = w^{'} (U + (T - T^{'}))$ . So, $w^{'} (U) = w^{'} (U + (T - T^{'}))$ . In the same way, we can

see that $w^{'} (U) = w^{'} (U + k (T - T^{'}))$ . Since $U$ and $U + k (T - T^{'})$ are disjoint from

$K + (T - Γ_{n})$ , $w (U) = w^{'} (U)$ and $w (U + k (T - T^{'})) = w^{'} (U + k (T - T^{'}))$ . Therefore,

$w (U) = w (U + k (T - T^{'}))$ . Since $| k (T - T^{'}) |_{\infty} > \sqrt{n}$ , $U$ and $U + k (T - T^{'})$ are disjoint

and so we have a contradiction.

Therefore, as long as $w$ has $w (S) \neq w (T)$ for $S$ and $T$ disjoint copies of $Γ_{\lfloor \sqrt{n} \rfloor}$ in $Γ_{n}$ ,

$w \in E_{n} (X)$ . This implies that

$L_{Γ_{n}} (X) \setminus E_{n} (X) \subseteq \bigcup_{v \in L_{Γ_{\lfloor \sqrt{n} \rfloor}} (X)} \left( \bigcup_{u^{'} \in [- n, n]^{d}, | u^{'} |_{\infty} > \sqrt{n}} \left( \bigcup_{u \in [- n, n]^{d}} ([v] \cap ([v] + u^{'})) + u \right) \right)$ . (3.8)

We can now use Lemma 3.2.6 to see that for any $v \in L_{Γ_{\lfloor \sqrt{n} \rfloor}} (X)$ , any $u^{'}$ with $| u^{'} |_{\infty} >$

$\sqrt{n}$ , and any ergodic measure of maximal entropy $\bar{μ}$ on $X$ ,

$\bar{μ} ([v] \cap ([v] + u^{'})) \leq \frac{1}{H_{(Γ_{\lfloor \sqrt{n} \rfloor} \cup (Γ_{\lfloor \sqrt{n} \rfloor} + u^{'})) \setminus (Γ_{\lfloor \sqrt{n} \rfloor} \cup (Γ_{\lfloor \sqrt{n} \rfloor} + u^{'}))^{(R)}} (X)}$ .

Since $| u^{'} |_{\infty} > \sqrt{n}$ ,

$(Γ_{\lfloor \sqrt{n} \rfloor} \cup (Γ_{\lfloor \sqrt{n} \rfloor} + u^{'})) \setminus (Γ_{\lfloor \sqrt{n} \rfloor} \cup (Γ_{\lfloor \sqrt{n} \rfloor} + u^{'}))^{(R)}$

$= (Γ_{\lfloor \sqrt{n} \rfloor} \setminus (Γ_{\lfloor \sqrt{n} \rfloor})^{(R)}) \cup ((Γ_{\lfloor \sqrt{n} \rfloor} \setminus (Γ_{\lfloor \sqrt{n} \rfloor})^{(R)}) + u^{'})$ .

Since $| u^{'} |_{\infty} > \sqrt{n}$ , $d (Γ_{\lfloor \sqrt{n} \rfloor} \setminus (Γ_{\lfloor \sqrt{n} \rfloor})^{(R)}, (Γ_{\lfloor \sqrt{n} \rfloor} \setminus (Γ_{\lfloor \sqrt{n} \rfloor})^{(R)}) + u^{'}) > \sqrt{n} > R$ for

large $n$ , and so by strong irreducibility,

$H_{(Γ_{\lfloor \sqrt{n} \rfloor} \setminus (Γ_{\lfloor \sqrt{n} \rfloor})^{(R)}) \cup ((Γ_{\lfloor \sqrt{n} \rfloor} \setminus (Γ_{\lfloor \sqrt{n} \rfloor})^{(R)}) + u^{'})} (X) = H_{Γ_{\lfloor \sqrt{n} \rfloor} \setminus (Γ_{\lfloor \sqrt{n} \rfloor})^{(R)}} (X)^{2}$ ,

which is at least $e^{h^{t o p} (X) 2 (\sqrt{n} - 2 R)^{d}}$ by Lemma 3.2.1. Combining this with shift-invariance

of $\bar{μ}$ and (3.8), we see that

$\bar{μ} (L_{Γ_{n}} (X) \setminus E_{n} (X)) \leq H_{Γ_{\lfloor \sqrt{n} \rfloor}} (X) (2 n)^{2 d} e^{- h^{t o p} (X) 2 (\sqrt{n} - 2 R)^{d}}$ .

Again by Lemma 3.2.1, $H_{Γ_{\lfloor \sqrt{n} \rfloor}} (X) \leq e^{h^{t o p} (X) (\sqrt{n} + R)^{d}}$ . Therefore,

$\bar{μ} (L_{Γ_{n}} (X) \setminus E_{n} (X)) \leq (2 n)^{2 d} e^{- h^{t o p} (X) (2 (\sqrt{n} - 2 R)^{d} - (\sqrt{n} + R)^{d})}$ ,

which clearly approaches zero as $n \to \infty$ .

$□$
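
The final bound in this proof tends to zero only asymptotically, and it can exceed 1 (and so be vacuous) for moderate $n$ . The short Python sketch below evaluates the natural logarithm of the bound so that this can be inspected for sample parameters. The function name and the sample values ( $d = 2$ , $R = 1$ , entropy $\ln 2$ ) are illustrative assumptions, not values taken from the text.

```python
import math

def log_exceptional_bound(n, d, R, h):
    """Natural log of (2n)^{2d} * exp(-h * (2 (sqrt(n) - 2R)^d - (sqrt(n) + R)^d))."""
    r = math.sqrt(n)
    return 2 * d * math.log(2 * n) - h * (2 * (r - 2 * R) ** d - (r + R) ** d)

# Hypothetical sample parameters: d = 2, R = 1, h_top(X) = ln 2.
small = log_exceptional_bound(100, 2, 1, math.log(2))
medium = log_exceptional_bound(10 ** 4, 2, 1, math.log(2))
large = log_exceptional_bound(10 ** 6, 2, 1, math.log(2))
```

For these sample values the logarithm is positive at $n = 100$ and negative from $n = 10^{4}$ on, which is consistent with the statement that the bound approaches zero as $n \to \infty$ because the term $2 (\sqrt{n} - 2 R)^{d}$ eventually dominates $(\sqrt{n} + R)^{d}$ .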

Now, suppose that for any strongly irreducible $Z^{d}$ -shift of finite type $X$ , there exist

constants $C$ , $N > 0$ and an ergodic measure of maximal entropy $\bar{μ}$ on $X$ such that

for any $w \in E_{n} (X)$ with $n > N$ , $C \bar{μ} ([w]) < h^{t o p} (X) - h^{t o p} (X_{w})$ . For a contradiction,

choose any $X$ with more than one measure of maximal entropy. Since the extreme

points of the set of measures of maximal entropy are the ergodic measures of maximal

entropy, there must exist some ergodic measure of maximal entropy ${\bar{μ}}^{'} \neq \bar{μ}$ on $X$ .

Then we have

$C \bar{μ} ([w]) < h^{t o p} (X) - h^{t o p} (X_{w}) < (2 \ln 2) {\bar{μ}}^{'} ([w])$ (3.9)

for all $w \in E_{n}$ with $n > N$ . However, since $\bar{μ}$ and ${\bar{μ}}^{'}$ are ergodic, they must be

mutually singular. Therefore, for any $ε > 0$ , there exists some open set $U_{ε}$ such that

${\bar{μ}}^{'} (U_{ε}) < ε$ and $\bar{μ} (U_{ε}) > 1 - ε$ . By Lemma 3.5.2, for large enough $n$ , $\bar{μ} (U_{ε} \cap E_{n}) > 1 - 2 ε$ ,

and obviously ${\bar{μ}}^{'} (U_{ε} \cap E_{n}) < ε$ . Since $U_{ε}$ is an open set in $X$ , it is a union of cylinder

sets. This means that there exists $N^{'} > N$ so that for any $n > N^{'}$ , there exists some

$V_{ε} \subseteq U_{ε}$ a union of cylinder sets of words in $L_{Γ_{n}} (X)$ with $\bar{μ} (V_{ε} \cap E_{n}) > 1 - 3 ε$ , and

again ${\bar{μ}}^{'} (V_{ε} \cap E_{n}) < ε$ . However, we assumed that $C \bar{μ} ([w]) < h^{t o p} (X) - h^{t o p} (X_{w}) <$

$(2 \ln 2) {\bar{μ}}^{'} ([w])$ for all $w \in E_{n}$ and $n > N$ , and so since $V_{ε} \cap E_{n}$ is a union of cylinder

sets associated to $E_{n}$ -words, for any $n > N^{'}$ we may sum (3.9) over $w \in V_{ε} \cap E_{n}$ to see

that $C \bar{μ} (V_{ε} \cap E_{n}) < (2 \ln 2) {\bar{μ}}^{'} (V_{ε} \cap E_{n})$ , and so $C (1 - 3 ε) < (2 \ln 2) ε$ , a contradiction

for small enough $ε$ . This implies that if $d > 1$ , then there does not exist a lower

bound on $h^{t o p} (X) - h^{t o p} (X_{w})$ of the form $C \bar{μ} ([w])$ which holds for all words $w \in E_{n}$

for sufficiently large $n$ .

3.6 An application to an undecidability question

One of the complexities of multidimensional symbolic dynamics is that it is algorith-

mically undecidable, given just the set $F$ of forbidden words, whether or not a shift

of finite type in $Z^{d}$ is nonempty. (See [B], [R], [Wan] for details.) One application of

Corollary 3.3.7 is a condition under which $Ω_{F}$ is nonempty:

Theorem 3.6.1. For any alphabet $A$ , there exist $F$ , $G \in \mathbb{N}$ such that for any $m > 0$

and any finite set of words $F_{m} = \{w_{k} \in L_{Γ_{n_{k}}} (X) : 1 \leq k \leq m\}$ satisfying $n_{1} > G$

and $n_{k} \geq F (n_{k - 1})^{4 d^{2}}$ for $1 < k \leq m$ , $Ω_{F_{m}} \neq \emptyset$ .
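
The hypothesis of Theorem 3.6.1 is a purely arithmetic condition on the sizes $n_{1}$ , ..., $n_{m}$ of the forbidden words, so it can be tested directly once values for $F$ and $G$ are fixed. The theorem only asserts that such constants exist; the Python sketch below therefore takes them as hypothetical inputs, and the function name `satisfies_growth` is likewise an assumption.

```python
def satisfies_growth(sizes, F, G, d):
    """sizes = [n_1, ..., n_m]. Check n_1 > G and n_k >= F * n_{k-1}^(4 d^2) for k > 1.

    F and G stand for the constants whose existence Theorem 3.6.1 asserts;
    their actual values are not given here, so the caller must supply them."""
    if not sizes or sizes[0] <= G:
        return False
    exponent = 4 * d * d
    return all(b >= F * a ** exponent for a, b in zip(sizes, sizes[1:]))

# Hypothetical constants F = 2, G = 2 in dimension d = 2 (exponent 4 d^2 = 16).
ok = satisfies_growth([3, 2 * 3 ** 16], F=2, G=2, d=2)
too_slow = satisfies_growth([3, 100], F=2, G=2, d=2)
```

The exponent $4 d^{2}$ makes the required growth very fast: already in dimension two each word must be larger than the sixteenth power of the previous size.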

To prove this, we need the following lemma:

Lemma 3.6.2. For any strongly irreducible $Z^{d}$ -shift $X = Ω_{F}$ of finite type $t$ with

uniform filling length $R$ , there exist $C$ , $E \in \mathbb{N}$ dependent only on $d$ such that for

any $w \in L_{Γ_{n}} (X)$ with $n > \max (C (R + 1)^{4 d^{2}}, t^{2}, E)$ , if we denote by $X_{w}$ the shift of

finite type $Ω_{F \cup \{w\}}$ , then there is some $w^{'} \in L_{Γ_{m}} (X)$ a subword of $w$ such that $X_{w^{'}}$ is

nonempty and strongly irreducible with uniform filling length at most $2 n + R$ .

Proof: Suppose that such a shift $X$ is given. By Corollary 3.3.7, there exists $N$ such

that for any $w \in L_{Γ_{n}} (X)$ with $n > N$ , there exist $w^{'}$ , $w^{''} \in L_{Γ_{m}} (X)$ such that $w^{'}$ is a

subword of $w$ , $w^{'}$ and $w^{''}$ agree on $Γ_{n}^{(t)}$ , and replacing an occurrence of $w^{'}$ by $w^{''}$ in an

element of $X$ cannot possibly create a new occurrence of $w^{'}$ . (We are not using the

full force of Corollary 3.3.7 here, in that we do not care about the size $m$ of $w^{'}$ , only

that $w^{'}$ exists.) By reviewing the proof of Corollary 3.3.7, we see that a sufficient

condition for the existence of $w^{'}$ is that

$(n - t + 1) - C^{'} (\frac{\ln (n - t + 1)}{h^{t o p} (X)} + R) (n - t + 1)^{1 - \frac{1}{2 d}} > 0$

for some constant $C^{'}$ depending only on $d$ . Some algebraic manipulation shows that

the above is implied if $n > (C^{'})^{2 d} (\frac{\ln (n - t + 1)}{h^{t o p} (X)} + R)^{2 d} + t$ . By Lemma 3.2.1, $e^{h^{t o p} (X) (R + 1)^{d}} \geq H_{Γ_{1}} (X) \geq 2$ ,

so $h^{t o p} (X) \geq \frac{\ln 2}{(R + 1)^{d}}$ . Therefore, it is sufficient to have $n > (C^{'})^{2 d} (\frac{1}{\ln 2} (R + 1)^{d} \ln n + R)^{2 d} + t$ .

Since $(R + 1)^{d} > R$ and $n > t^{2}$ , for any $n > 1$ a sufficient

condition is $n > 2 (C^{'})^{2 d} (2 \ln n (R + 1)^{d})^{2 d} + \sqrt{n}$ . There is a constant $E$ dependent

only on $d$ so that if $n > E$ , $(\ln n)^{2 d} < \sqrt{n}$ . For such $n$ , it would be enough to have

$n > 2 (C^{'})^{2 d} \sqrt{n} (2 (R + 1)^{d})^{2 d} + \sqrt{n}$ , or by dividing both sides by $\sqrt{n}$ and then squaring

both sides, $n > C (R + 1)^{4 d^{2}}$ for some constant $C$ dependent only on $d$ . So, as long as

$n > \max (C (R + 1)^{4 d^{2}}, t^{2}, E)$ , such a $w^{'}$ exists.
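
The step in this argument that introduces $E$ only needs $(\ln n)^{2 d} < \sqrt{n}$ for all $n > E$ . The Python predicate below lets one probe numerically where this inequality starts to hold for a given $d$ . It is a sanity check under the assumption that sampling a few values of $n$ is informative, not a computation of the constant $E$ itself, and the function name is hypothetical.

```python
import math

def log_power_below_sqrt(n, d):
    """Return True when (ln n)^(2d) < sqrt(n)."""
    return math.log(n) ** (2 * d) < math.sqrt(n)

# Sample probes: the inequality fails for moderate n and holds for larger n,
# and the point where it starts to hold moves out quickly as d grows.
probes = {
    (10 ** 3, 1): log_power_below_sqrt(10 ** 3, 1),
    (10 ** 4, 1): log_power_below_sqrt(10 ** 4, 1),
    (10 ** 10, 2): log_power_below_sqrt(10 ** 10, 2),
    (10 ** 12, 2): log_power_below_sqrt(10 ** 12, 2),
}
```

Since $\sqrt{n}$ eventually outgrows any fixed power of $\ln n$ , such a threshold exists for every $d$ , which is all the proof uses.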

Consider any two shapes $S$ , $T \subseteq Z^{d}$ such that $d (S, T) > 2 n + R$ , and any words $y \in$

$L_{S} (X_{w^{'}})$ and $z \in L_{T} (X_{w^{'}})$ . Since $y \in L_{S} (X_{w^{'}})$ , there exists $y^{'} \in L_{S \cup (S^{c})^{(n)}} (X_{w^{'}})$ such

that $y^{'} (S) = y$ . Similarly, there exists $z^{'} \in L_{T \cup (T^{c})^{(n)}} (X_{w^{'}})$ such that $z^{'} (T) = z$ . For

any $p \in S \cup (S^{c})^{(n)}$ , by definition there exists $p^{'} \in S$ such that $d (p, p^{'}) \leq n$ . Similarly,

for any $q \in T \cup (T^{c})^{(n)}$ , there exists $q^{'} \in T$ such that $d (q, q^{'}) \leq n$ . But $d (S, T) > 2 n + R$ ,

so $d (p^{'}, q^{'}) > 2 n + R$ . By the triangle inequality, $d (p, q) > R$ . Therefore, since $p$ , $q$ were

arbitrary, $d (S \cup (S^{c})^{(n)}, T \cup (T^{c})^{(n)}) > R$ , and so by strong irreducibility of $X$ there

exists $x \in X$ such that $x (S \cup (S^{c})^{(n)}) = y^{'}$ and $x (T \cup (T^{c})^{(n)}) = z^{'}$ . We now fix any

ordering of the elements of $Z^{d}$ with a least element (say lexicographically with respect

to polar coordinates) and replace each occurrence of $w^{'}$ by $w^{''}$ in turn with respect to

this order. In this way, we eventually arrive at $x^{'} \in X$ which has no occurrences of $w^{'}$

and is thus an element of $X_{w^{'}}$ . Note that since $x (S \cup (S^{c})^{(n)}) = y^{'}$ and $x (T \cup (T^{c})^{(n)}) = z^{'}$

are words in $L (X_{w^{'}})$ , they had no occurrences of $w^{'}$ . Therefore, any of the replaced

occurrences of $w^{'}$ had nonempty intersection with $((S \cup (S^{c})^{(n)}) \cup (T \cup (T^{c})^{(n)}))^{c}$ .

Consider such a replaced occurrence which occupies $U$ a copy of $Γ_{m}$ . From the fact

just noted, there exists $p \in U$ such that $p \notin S \cup$ $(S^{c})^{(n)} \cup$ $T \cup$ $(T^{c})^{(n)}$ . This implies that

$d (p, S) > n$ and $d (p, T) > n$ . Therefore, since the size of $U$ is $m \leq n$ , $U$ is disjoint

from both $S$ and $T$ . Since $U$ is an arbitrary replaced occurrence of $w^{'}$ , this implies

that $x$ remained unchanged on $S$ and $T$ throughout the process of changing it to $x^{'}$ ,

and so $x^{'} (S) = x (S) = y$ and $x^{'} (T) = x (T) = z$ . By definition, we have shown that

$X_{w^{'}}$ is nonempty and strongly irreducible with uniform filling length at most $2 n + R$ .

$□$

Proof of Theorem 3.6.1: We prove Theorem 3.6.1 by induction. Our inductive

hypothesis is that for any $F_{m}$ as described in the hypotheses, $Ω_{F_{m}}$ contains a strongly

irreducible shift of finite type with uniform filling length less than $4 n_{m}$ . First note

that by Theorem 3.1.22 and Lemma 3.6.2, there certainly exists $G$ such that for

any word $w_{1} \in A^{Γ_{n_{1}}}$ with $n_{1} > G$ , there is a subword $w_{1}^{'}$ of $w_{1}$ such that $Ω_{\{w_{1}^{'}\}}$ is

nonempty and strongly irreducible with uniform filling length $R_{1} < 4 n_{1}$ . We take

this $G$ to be greater than the $E$ from Lemma 3.6.2, and take our $F$ to be $5^{4 d^{2}} C$ ,

where $C$ is from Lemma 3.6.2. Now suppose the hypothesis to be true for $m$ , and

consider any $F_{m + 1} = \{w_{1} , w_{2} , \ldots , w_{m + 1}\}$ satisfying the hypotheses of Theorem 3.6.1.

By the inductive hypothesis, $Ω_{\{w_{1}, w_{2}, \ldots, w_{m}\}}$ contains a strongly irreducible shift $X_{m}$

of finite type $t_{m}$ with uniform filling length $R_{m} < 4 n_{m}$ . There are two cases: either

$w_{m + 1} \in L (X_{m})$ or not. If $w_{m + 1} \notin L (X_{m})$ , then $(X_{m})_{w_{m + 1}} = X_{m}$ , and so since

$X_{m} = (X_{m})_{w_{m + 1}} \subseteq Ω_{F_{m + 1}}$ , in this case $Ω_{F_{m + 1}}$ contains a strongly irreducible shift

of finite type with uniform filling length $R_{m} < 4 n_{m} < 4 n_{m + 1}$ and the inductive

hypothesis is verified for $m + 1$ . So, we suppose that $w_{m + 1} \in L (X_{m})$ . We will show

that $n_{m + 1} > \max (C (R_{m} + 1)^{4 d^{2}}, t_{m}^{2}, E)$ . By the hypotheses of Theorem 3.6.1, we see

that $n_{m + 1} \geq F (n_{m})^{4 d^{2}} = 5^{4 d^{2}} C (n_{m})^{4 d^{2}} = C (5 n_{m})^{4 d^{2}} > C (R_{m} + 1)^{4 d^{2}}$ . Since the largest

size of a word in $F_{m}$ is $n_{m}$ , $t_{m} \leq n_{m}$ , and so $n_{m + 1} \geq F (n_{m})^{4 d^{2}} > t_{m}^{2}$ . Finally, $n_{m + 1} >$

$n_{1} > G > E$ . Therefore, $w_{m + 1}$ satisfies the hypotheses of Lemma 3.6.2 and so there

is $w_{m + 1}^{'}$ a subword of $w_{m + 1}$ such that $(X_{m})_{w_{m + 1}^{'}}$ is nonempty and strongly irreducible

with uniform filling length less than $4 n_{m + 1}$ . Also, $(X_{m})_{w_{m + 1}^{'}} \subseteq (X_{m})_{w_{m + 1}} \subseteq Ω_{F_{m + 1}}$ ,

and so the induction is complete. We have then shown that for any $m$ , $Ω_{F_{m}}$ contains

a nonempty shift and is therefore nonempty itself.

$□$

3.7 Questions

A few questions suggest themselves based on our results. The first, and perhaps most

obvious, is simply how the bounds in Theorem 3.1.22 could be improved. (We suspect

that at least the lower bound is far from optimal.)

Question 3.7.1. Can the constants $A_{X}$ and $B_{X}$ in the statement of Possible Theorem

3.1.21 be improved from their values in Theorem 3.1.22?

We know at least that the gap between $A_{X}$ and $B_{X}$ can be improved only up to

possible additive and multiplicative constants; we showed earlier that there are examples

which force $A_{X} - B_{X} \geq 2 R$ for strongly irreducible shifts of finite type with

arbitrarily large $R$ , and our Theorem 3.1.22 has $A_{X} - B_{X} = 46 R + 70$ . However, it

would be nice to achieve optimal values.

Next, recall that we showed in the introduction that no result nearly as strong as

Theorem 3.1.22 can be true for shifts of finite type which are only assumed to be

topologically mixing. However, it is probable that something can be said. So, we can

ask

Question 3.7.2. If $X$ is a topologically mixing $Z^{d}$ -shift $X = Ω_{F}$ of finite type with

positive topological entropy $h^{t o p} (X)$ , and if we denote by $X_{w}$ the shift of finite type

$Ω_{F \cup \{w\}}$ , then is it true that for every $ε > 0$ , there exists $N$ such that for all $w \in$

$L_{Γ_{n}} (X)$ with $n > N$ , $h^{t o p} (X) - h^{t o p} (X_{w}) < ε$ ? If so, can anything be said about

the rate at which $h^{t o p} (X) - h^{t o p} (X_{w})$ approaches zero as the size $n$ of $w \in L_{Γ_{n}} (X)$

approaches infinity?

If it is the case that $h^{t o p} (X) - h^{t o p} (X_{w})$ approaches zero as $n \to \infty$ , then it seems

likely that some new techniques would be necessary for a proof, since there exist

non-strongly irreducible shifts of finite type whose languages contain words $w$ for

which there exists no word $w^{'} \neq w$ agreeing with $w$ on its border. For example,

in the checkerboard island shift $Z$ , any square word which is a finite portion of a

checkerboard is forced by its boundary.

BIBLIOGRAPHY

[B] ROBERT BERGER, The undecidability of the domino problem, Mem. Amer.

Math. Soc. 66 (1966)

[Be] VITALY BERGELSON, Ergodic Ramsey Theory- an Update, Ergodic Theory

of $Z^{d}$ -actions, M. Pollicott and K. Schmidt, editors, London Math. Soc.

Lecture Notes Series 228, Cambridge University Press (1996), 1-61.

[BeL] VITALY BERGELSON AND ALEXANDER LEIBMAN, Polynomial extensions

of van der Waerden's and Szemerédi's theorems, Journal of AMS 9 (1996),

no. 3, 725-753.

[Bi] G.D. BIRKHOFF, Proof of the ergodic theorem, Proc. Nat. Acad. Sci. USA,

17 (1931) 656-660.

[Bou] JEAN BOURGAIN, Pointwise ergodic theorems for arithmetic sets, Publ.

Math. IHES 69 (1989), no. 3, 5-45.

[Bou2] JEAN BOURGAIN, On pointwise ergodic theorems for arithmetic sets, C. R.

Acad. Sci. Paris Sér. I Math. 305 (1987), no. 10, 397-402.

[BoyLR] MICHAEL BOYLE, DOUGLAS LIND, AND DANIEL RUDOLPH, The auto-

morphism group of a shift of finite type, Trans. Amer. Math. Soc., 306

(1988), no. 1, 71-114.

[BuS1] ROBERT BURTON AND JEFFREY STEIF, Non-uniqueness of measures of

maximal entropy, Ergodic Theory Dynam. Systems, 14 (1994), no. 2, 213-235.

[BuS2] ROBERT BURTON AND JEFFREY STEIF, New results on measures of maximal

entropy, Israel J. Math. 89 (1995), nos. 1-3, 275-300.

[Fa] BASSAM FAYAD, Analytic mixing reparametrizations of irrational flows, Ergodic

Theory Dynam. Systems 22 (2002), no. 2, 437-468.

[Fu] HILLEL FURSTENBERG, Strict ergodicity and transformation of the torus,

Amer. J. Math. 83 (1961), 573-601.

[GO1] LEO GUIBAS AND ANDREW ODLYZKO, Maximal prefix-synchronized codes,

SIAM J. Appl. Math. 35 (1978), 401-418.

[GO2] LEO GUIBAS AND ANDREW ODLYZKO, Periods in strings, J. Combin. The-

ory Ser. A, 30 (1981), 19-42.

[HaK] FRANK HAHN AND YITZHAK KATZNELSON, On the entropy of uniquely

ergodic transformations, Trans. Amer. Math. Soc. 126 (1967), 335-360.

[HaW] G.H. HARDY AND E.M. WRIGHT, An introduction to the theory of numbers,

Clarendon Press, 1979.

[HeM] GUSTAV HEDLUND AND MARSTON MORSE, Symbolic dynamics, Amer. J.

Math., 60 (1938), no. 4, 815-866.

[L] DOUGLAS LIND, Perturbations of shifts of finite type, SIAM J. Discrete

Math. 2 (1989), no. 3, 350-365.

[LM] DOUGLAS LIND AND BRIAN MARCUS, An Introduction to Symbolic Dynamics

and Coding, Cambridge University Press, 1995.

[M] MICHAEL MISIUREWICZ, A short proof of the variational principle for a

$Z_{+}^{n}$ -action on a compact space, Astérisque 40 (1975), 147-157.

[QS] ANTHONY QUAS AND A.A. ŞAHIN, Entropy gaps and locally maximal entropy

in $Z^{d}$ subshifts, Ergodic Theory Dynam. Systems 23 (2003), 1227-1245.

[QT] ANTHONY QUAS AND PAUL TROW, Subshifts of multi-dimensional shifts

of finite type, Ergodic Theory Dynam. Systems 20 (2000), no. 3, 859-874.

[R] RAPHAEL M. ROBINSON, Undecidability and nonperiodicity for tilings of

the plane, Invent. Math 12 (1971), 177-209.

[Wal] PETER WALTERS, An Introduction to Ergodic Theory, Springer-Verlag,

1982.

[Wan] HAO WANG, Proving theorems by pattern recognition II, AT&T Bell Labs.

Tech. J. 40 (1961), 1-41.

[Wi] MÁTÉ WIERDL, Pointwise ergodic theorem along the prime numbers, Israel

J. Math. 64 (1988), no. 3, 315-336.
