A Regeneration Proof of the Central Limit Theorem for Uniformly Ergodic Markov Chains

Witold Bednorz

Krzysztof Latuszynski

Institute of Mathematics, Warsaw University, 00-927 Warsaw, Poland.

Department of Mathematical Statistics, Institute of Econometrics, Warsaw School of Economics, 02-554 Warsaw, Poland.

Keywords: Markov Chain, CLT, Uniform Ergodicity, Regeneration.

AMS: 60J05

Abstract

Central limit theorems for functionals of general state space Markov chains are of crucial importance in the sensible implementation of Markov chain Monte Carlo algorithms, as well as of vital theoretical interest. Different approaches to proving this type of results under diverse assumptions have led to a large variety of CLT versions. However, due to the recent development of the regeneration theory of Markov chains, many classical CLTs can be reproved using this intuitive probabilistic approach, avoiding the technicalities of the original proofs. In this paper we provide a regeneration proof of a CLT for functionals of uniformly ergodic Markov chains, thus solving the open problem posed in [8]. Moreover we discuss the difference between the one-step and multiple-step small set conditions.

1. Introduction

Let $(X_n)_{n\ge 0}$ be a time homogeneous Markov chain on a measurable space $(X, \mathcal{B}(X))$ with initial distribution $\pi_0$, transition kernel $P$ and a unique stationary distribution $\pi$. Let $g$ be a real valued Borel function on $X$ and define $\bar{g}_n = \frac{1}{n}\sum_{i=0}^{n-1} g(X_i)$ and $E_\pi g = \int_X g(x)\,\pi(dx)$. We say that a $\sqrt{n}$-CLT holds for $(X_n)_{n\ge 0}$ and $g$, if

$$\sqrt{n}\,(\bar{g}_n - E_\pi g) \xrightarrow{d} N(0, \sigma_g^2), \quad \text{as } n \to \infty, \qquad (1)$$

where $\sigma_g^2 := \mathrm{var}_\pi\, g(X_0) + 2\sum_{n=1}^{\infty} \mathrm{cov}_\pi\{g(X_0), g(X_n)\} < \infty$. Central limit theorems of this type are crucial for assessing the quality of Markov chain Monte Carlo estimation (see e.g. [5]) and are also of independent theoretical interest.
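As a quick numerical illustration (not part of the paper's argument), the $\sqrt{n}$-CLT (1) can be observed on a two-state chain, which is uniformly ergodic; for $g(x) = x$ the autocovariances decay geometrically and $\sigma_g^2$ has a closed form. All parameter values below are our own illustrative choices; a minimal Python sketch:

```python
import numpy as np

# Empirical check of the sqrt(n)-CLT (1) on a two-state chain with
# g(x) = x; all parameter values are illustrative choices.
p, q = 0.3, 0.2                   # P(0 -> 1) = p, P(1 -> 0) = q
pi1 = p / (p + q)                 # stationary probability of state 1
lam = 1.0 - p - q                 # second eigenvalue of the kernel
# var_pi g = pi0*pi1 and cov_pi{g(X_0), g(X_n)} = pi0*pi1*lam^n, so the
# series defining sigma_g^2 sums in closed form:
sigma2 = pi1 * (1 - pi1) * (1 + lam) / (1 - lam)

rng = np.random.default_rng(0)
reps, n = 4000, 2000              # replicates, chain length
x = (rng.random(reps) < pi1).astype(int)   # start in stationarity
sums = np.zeros(reps)
for _ in range(n):
    sums += x                     # accumulate g(X_i) = X_i
    u = rng.random(reps)
    # from 0 move to 1 w.p. p; from 1 stay at 1 w.p. 1 - q
    x = np.where(x == 0, (u < p).astype(int), (u >= q).astype(int))
gbar = sums / n
clt = np.sqrt(n) * (gbar - pi1)
print(round(sigma2, 3), round(clt.var(), 3))
```

With these parameters $\sigma_g^2 = 0.72$ exactly, and the empirical variance of $\sqrt{n}(\bar g_n - E_\pi g)$ across replicates should be close to it.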

Thus a large body of work on CLTs for functionals of Markov chains exists and a variety of results have been established under different assumptions and with different approaches to proofs (see [4] for a review). We state two classical CLT versions for geometrically ergodic and uniformly ergodic Markov chains.

Let $\|\mu_1(\cdot) - \mu_2(\cdot)\|_{tv} := 2\sup_{A \in \mathcal{B}(X)} |\mu_1(A) - \mu_2(A)|$ be the well known total variation distance between probability measures $\mu_1$ and $\mu_2$. We say that a Markov chain $(X_n)_{n\ge 0}$ with transition kernel $P$ and stationary distribution $\pi$ is geometrically ergodic, if

$$\|P^n(x,\cdot) - \pi(\cdot)\|_{tv} \le M(x)\rho^n, \quad \text{for some } \rho < 1$$

and $M(x) < \infty$ $\pi$-almost everywhere. We say it is uniformly ergodic, if

$$\|P^n(x,\cdot) - \pi(\cdot)\|_{tv} \le M\rho^n, \quad \text{for some } \rho < 1 \text{ and } M < \infty.$$

Theorem 1.1. If a Markov chain $(X_n)_{n\ge 0}$ with stationary distribution $\pi$ is geometrically ergodic and $\pi(|g|^{2+\delta}) < \infty$ for some $\delta > 0$, then a $\sqrt{n}$-CLT holds for $(X_n)_{n\ge 0}$ and $g$.

Theorem 1.2. If a Markov chain $(X_n)_{n\ge 0}$ with stationary distribution $\pi$ is uniformly ergodic and $\pi(g^2) < \infty$, then a $\sqrt{n}$-CLT holds for $(X_n)_{n\ge 0}$ and $g$.

Theorem 1.1, due to [3], has been reproved in [8] using the intuitive regeneration approach, avoiding the technicalities of the original proof (however, see our Section 4). Roberts and Rosenthal posed an open problem of whether Theorem 1.2, due to [2], can also be reproved using direct regeneration arguments.

The aim of this paper is to provide a regeneration proof of Theorem 1.2.

The outline of the paper is as follows. In Section 2 we describe the regeneration construction. In Section 3 we prove Theorem 1.2 and we discuss some of the difficulties of the regeneration approach in Section 4.

2. Small Sets and the Split Chain

The regeneration construction, discovered independently by [7] and [1], is now a well established technique. A systematic development of the theory can be found e.g. in [6], which we exploit in this section.

Definition 2.1 (Small Set). A set $C \in \mathcal{B}(X)$ is $\nu_m$-small, if there exist $m > 0$, $\varepsilon > 0$, and a nontrivial probability measure $\nu_m$ on $\mathcal{B}(X)$, such that for all $x \in C$,

$$P^m(x,\cdot) \ge \varepsilon\,\nu_m(\cdot). \qquad (2)$$

Since ergodic Markov chains are $\pi$-irreducible, Theorem 5.2.2 of [6] implies that for an ergodic chain a small set $C$ with $\pi(C) > 0$ always exists.

A small set $C$ with $\pi(C) > 0$ allows for constructing the split chain for $(X_n)_{n\ge 0}$, which is the central object of the approach (see Section 17.3 of [6] for a detailed description). Let $(X_{nm})_{n\ge 0}$ be the $m$-skeleton of $(X_n)_{n\ge 0}$, i.e. a Markov chain evolving according to the $m$-step transition kernel $P^m$. The small set condition allows us to write $P^m$ as a mixture of two distributions:

$$P^m(x,\cdot) = \varepsilon I_C(x)\nu_m(\cdot) + [1 - \varepsilon I_C(x)]R(x,\cdot), \qquad (3)$$

where $R(x,\cdot) = [1 - \varepsilon I_C(x)]^{-1}[P^m(x,\cdot) - \varepsilon I_C(x)\nu_m(\cdot)]$. Now let $(X_{nm}, Y_n)_{n\ge 0}$ be the split chain of the $m$-skeleton, i.e. let the random variable $Y_n \in \{0,1\}$ be the level of the split $m$-skeleton at time $nm$. The split chain $(X_{nm}, Y_n)_{n\ge 0}$ is a Markov chain that obeys the following transition rule $\check{P}$:

$$\check{P}(Y_n = 1,\, X_{(n+1)m} \in dy \mid Y_{n-1}, X_{nm} = x) = \varepsilon I_C(x)\nu_m(dy), \qquad (4)$$
$$\check{P}(Y_n = 0,\, X_{(n+1)m} \in dy \mid Y_{n-1}, X_{nm} = x) = (1 - \varepsilon I_C(x))R(x, dy), \qquad (5)$$

(3)

and $Y_n$ can be interpreted as a coin toss indicating whether $X_{(n+1)m}$ given $X_{nm} = x$ should be drawn from $\nu_m(\cdot)$ (with probability $\varepsilon I_C(x)$) or from $R(x,\cdot)$ (with probability $1 - \varepsilon I_C(x)$).
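The coin-toss mechanism of (4)-(5) is easy to simulate when $m = 1$ and $C = X$ (the Doeblin case relevant for uniform ergodicity). The sketch below, with an arbitrary illustrative $3 \times 3$ kernel of our own choosing, performs the split move and checks that the $X$-component still moves marginally according to $P$:

```python
import numpy as np

# Sketch (illustrative kernel, not from the paper) of the one-step
# (m = 1) split-chain move (4)-(5) in the Doeblin case C = X:
# with probability eps draw X_{n+1} from nu and set Y_n = 1,
# otherwise draw X_{n+1} from the residual kernel R and set Y_n = 0.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
minor = P.min(axis=0)                 # elementwise lower bound over rows
eps = minor.sum()                     # Doeblin constant, here 0.7
nu = minor / eps                      # minorizing probability measure
R = (P - eps * nu) / (1 - eps)        # residual kernel
R = np.clip(R, 0, None)
R /= R.sum(axis=1, keepdims=True)     # guard tiny floating-point negatives

rng = np.random.default_rng(1)

def split_step(x):
    """One split-chain move from state x: returns (Y_n, X_{n+1})."""
    if rng.random() < eps:            # regeneration level: Y_n = 1
        return 1, rng.choice(3, p=nu)
    return 0, rng.choice(3, p=R[x])   # no regeneration: Y_n = 0

# Marginally, the X-component must still move according to P.
N = 50_000
hits = np.zeros(3)
for _ in range(N):
    _, nxt = split_step(0)
    hits[nxt] += 1
print(np.round(hits / N, 3), P[0])
```

The coin $Y_n$ is Bernoulli($\varepsilon$) independently of the current state (as used in Section 3), and the mixture $\varepsilon\nu + (1-\varepsilon)R(x,\cdot)$ reproduces $P(x,\cdot)$ exactly.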

One obtains the split chain $(X_k, Y_n)_{k\ge 0, n\ge 0}$ of the initial Markov chain $(X_n)_{n\ge 0}$ by defining appropriate conditional probabilities. To this end let $X_0^{nm} = \{X_0, \ldots, X_{nm-1}\}$ and $Y_0^n = \{Y_0, \ldots, Y_{n-1}\}$. Then

$$\check{P}(Y_n = 1,\, X_{nm+1} \in dx_1, \ldots, X_{(n+1)m-1} \in dx_{m-1},\, X_{(n+1)m} \in dy \mid Y_0^n, X_0^{nm}; X_{nm} = x) = \frac{\varepsilon I_C(x)\nu_m(dy)}{P^m(x, dy)}\, P(x, dx_1)\cdots P(x_{m-1}, dy), \qquad (6)$$

$$\check{P}(Y_n = 0,\, X_{nm+1} \in dx_1, \ldots, X_{(n+1)m-1} \in dx_{m-1},\, X_{(n+1)m} \in dy \mid Y_0^n, X_0^{nm}; X_{nm} = x) = \frac{(1 - \varepsilon I_C(x))R(x, dy)}{P^m(x, dy)}\, P(x, dx_1)\cdots P(x_{m-1}, dy). \qquad (7)$$

Note that the marginal distribution of $(X_k)_{k\ge 0}$ in the split chain is that of the underlying Markov chain with transition kernel $P$.

For a measure $\lambda$ on $(X, \mathcal{B}(X))$ let $\check{\lambda}$ denote the measure on $X \times \{0,1\}$ (with the product $\sigma$-algebra) defined by $\check{\lambda}(B \times \{1\}) = \varepsilon\lambda(B)$ and $\check{\lambda}(B \times \{0\}) = (1-\varepsilon)\lambda(B)$. Now the crucial observation is that on the set $\{Y_n = 1\}$, the pre-$nm$ process $\{X_k, Y_i : k \le nm,\, i \le n\}$ and the post-$(n+1)m$ process $\{X_k, Y_i : k \ge (n+1)m,\, i \ge n+1\}$ are independent, and the post-$(n+1)m$ process has the same distribution as $\{X_k, Y_i : k \ge 0,\, i \ge 0\}$ with $\check{\nu}_m$ for the initial distribution of $(X_0, Y_0)$. This leads to Theorem 2.2, but first we need some more notation. Let $\sigma(n)$ denote the consecutive entrance times of the split chain to the set $C \times \{1\}$, i.e. $\sigma(0) = \min\{k \ge 0 : Y_k = 1\}$ and $\sigma(n) = \min\{k > \sigma(n-1) : Y_k = 1\}$ for $n \ge 1$. Also define $Z_n(g) = \sum_{k=0}^{m-1} g(X_{nm+k})$ and $g_c = g - \pi g$.

Theorem 2.2 (Theorem 17.3.6 of [6]). Suppose that $(X_n)_{n\ge 0}$ is ergodic and let $\nu_m$ be the measure satisfying (2). If the following conditions hold

$$\text{(i)}\ \ \check{E}_{\nu_m}\Big[\Big(\sum_{n=0}^{\sigma(0)} Z_n(|g|)\Big)^2\Big] < \infty, \qquad \text{(ii)}\ \ \check{E}_{\nu_m}\big[\sigma(0)^2\big] < \infty, \qquad (8)$$

then the $\sqrt{n}$-CLT holds for $(X_n)_{n\ge 0}$ and $g$, with

$$\sigma_g^2 = \frac{\varepsilon\,\pi(C)}{m}\bigg\{\check{E}_{\nu_m}\Big[\Big(\sum_{n=0}^{\sigma(0)} Z_n(g_c)\Big)^2\Big] + 2\,\check{E}_{\nu_m}\Big[\Big(\sum_{n=0}^{\sigma(0)} Z_n(g_c)\Big)\Big(\sum_{n=\sigma(0)+1}^{\sigma(1)} Z_n(g_c)\Big)\Big]\bigg\}.$$

3. A Proof

In view of Theorem 2.2 providing a regeneration proof of Theorem 1.2 amounts to establishing conditions (i) and (ii) of (8). To this end we need some additional facts about small sets for uniformly ergodic Markov chains.

(4)

Theorem 3.1. If $(X_n)_{n\ge 0}$, a Markov chain on $(X, \mathcal{B}(X))$ with stationary distribution $\pi$, is uniformly ergodic, then $X$ is $\nu_m$-small for some $\nu_m$.

Hence for uniformly ergodic chains (2) holds for all $x \in X$. Theorem 3.1 is well known in the literature; in particular it results from Theorems 5.2.1 and 5.2.4 in [6] with their $\psi = \pi$.

We start by proving (ii) of (8), which is now straightforward. Integrating (6), together with the fact that $X$ is small, yields $\check{P}(Y_n = 1 \mid X_0^{nm}, Y_0^{n-1}; X_{nm} = x) = \varepsilon$; thus $Y_0, Y_1, \ldots$ are independent Bernoulli trials and the distribution of $\sigma(0)$ is geometric.

Establishing (i) of (8) is the essential part of the proof. Theorem 3.1 implies that for uniformly ergodic Markov chains (3) can be rewritten in operator notation as

$$P^m = \varepsilon\nu_m + (1 - \varepsilon)R. \qquad (9)$$

The following mixture representation of $\pi$ will turn out to be very useful.

Lemma 3.2. If $(X_n)_{n\ge 0}$ is an ergodic Markov chain with transition kernel $P$ and (9) holds, then

$$\pi = \varepsilon\mu := \varepsilon\sum_{n=0}^{\infty} \nu_m(1-\varepsilon)^n R^n. \qquad (10)$$

Proof. Since $\big(\varepsilon\sum_{n=0}^{\infty}\nu_m(1-\varepsilon)^nR^n\big)(X) = \varepsilon\sum_{n=0}^{\infty}(1-\varepsilon)^n(\nu_m R^n)(X) = 1$, the measure in question is a probability measure. It is also invariant for $P^m$: by (9) we obtain

$$\Big(\sum_{n=0}^{\infty}\nu_m(1-\varepsilon)^nR^n\Big)P^m = \varepsilon\mu(X)\,\nu_m + \sum_{n=1}^{\infty}\nu_m(1-\varepsilon)^nR^n = \sum_{n=0}^{\infty}\nu_m(1-\varepsilon)^nR^n,$$

since $\varepsilon\mu(X) = 1$. Hence by ergodicity $\varepsilon\mu = \varepsilon\mu P^{nm} \to \pi$, as $n \to \infty$. Thus $\varepsilon\mu = \pi$.
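For a finite state space with $m = 1$, Lemma 3.2 can be sanity-checked numerically: the series (10) sums in closed form to $\varepsilon\,\nu(I - (1-\varepsilon)R)^{-1}$, which should coincide with the stationary vector. The kernel below is an arbitrary illustrative choice, not from the paper:

```python
import numpy as np

# Numerical check of Lemma 3.2 for a finite chain with m = 1: with
# P = eps*nu + (1 - eps)*R as in (9), the geometric mixture (10)
# equals eps * nu (I - (1 - eps) R)^{-1} and should match pi.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
minor = P.min(axis=0)
eps = minor.sum()
nu = minor / eps
R = (P - eps * nu) / (1 - eps)

# stationary distribution: left eigenvector of P at eigenvalue 1
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(abs(w - 1))])
pi = pi / pi.sum()

# eps * sum_{n>=0} nu (1 - eps)^n R^n, summed in closed form
mix = eps * nu @ np.linalg.inv(np.eye(3) - (1 - eps) * R)
print(np.round(pi, 6))
print(np.round(mix, 6))
```

The closed form is valid because $R$ is stochastic, so the spectral radius of $(1-\varepsilon)R$ is $1-\varepsilon < 1$ and the series converges.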

Corollary 3.3. The decomposition in Lemma 3.2 implies that

$$\text{(i)}\ \ \check{E}_{\nu_m}\Big[\sum_{n=0}^{\sigma(0)} I_{\{X_{nm}\in A\}}\Big] = \check{E}_{\nu_m}\Big[\sum_{n=0}^{\infty} I_{\{X_{nm}\in A\}} I_{\{Y_0=0,\ldots,Y_{n-1}=0\}}\Big] = \varepsilon^{-1}\pi(A),$$

$$\text{(ii)}\ \ \check{E}_{\nu_m}\Big[\sum_{n=0}^{\infty} f(X_{nm}, X_{nm+1}, \ldots; Y_n, Y_{n+1}, \ldots)\, I_{\{Y_0=0,\ldots,Y_{n-1}=0\}}\Big] = \varepsilon^{-1}\check{E}_{\check{\pi}} f(X_0, X_1, \ldots; Y_0, Y_1, \ldots).$$

Proof. (i) is a direct consequence of (10). To see (ii), note that $Y_n$ is a coin toss independent of $\{Y_0, \ldots, Y_{n-1}\}$ and $X_{nm}$; this allows for $\check{\pi}$ instead of $\pi$ on the RHS of (ii). Moreover the evolution of $\{X_{nm+1}, X_{nm+2}, \ldots; Y_{n+1}, Y_{n+2}, \ldots\}$ depends only (and explicitly, by (6) and (7)) on $X_{nm}$ and $Y_n$. Now use (i).

(5)

Our object of interest is

$$I = \check{E}_{\nu_m}\Big[\Big(\sum_{n=0}^{\sigma(0)} Z_n(|g|)\Big)^2\Big] = \check{E}_{\nu_m}\Big[\Big(\sum_{n=0}^{\infty} Z_n(|g|)I_{\{\sigma(0)\ge n\}}\Big)^2\Big]$$

$$= \check{E}_{\nu_m}\Big[\sum_{n=0}^{\infty} Z_n(|g|)^2 I_{\{Y_0=0,\ldots,Y_{n-1}=0\}}\Big] + 2\,\check{E}_{\nu_m}\Big[\sum_{n=0}^{\infty}\sum_{k=n+1}^{\infty} Z_n(|g|)I_{\{\sigma(0)\ge n\}} Z_k(|g|)I_{\{\sigma(0)\ge k\}}\Big] = A + B. \qquad (11)$$

Now we can use Corollary 3.3 and then the inequality $2ab \le a^2 + b^2$ to bound the term $A$ in (11):

$$A = \frac{1}{\varepsilon}\,\check{E}_{\check{\pi}}\, Z_0(|g|)^2 = \frac{1}{\varepsilon}\, E_\pi\Big[\Big(\sum_{k=0}^{m-1}|g(X_k)|\Big)^2\Big] \le \frac{m}{\varepsilon}\, E_\pi\Big[\sum_{k=0}^{m-1} g^2(X_k)\Big] = \frac{m^2}{\varepsilon}\,\pi g^2 < \infty.$$

We can proceed similarly with the term $B$:

$$B = 2\,\check{E}_{\nu_m}\Big[\sum_{n=0}^{\infty} Z_n(|g|)I_{\{\sigma(0)\ge n\}}\sum_{k=1}^{\infty} Z_{n+k}(|g|)I_{\{\sigma(0)\ge n+k\}}\Big] = \frac{2}{\varepsilon}\,\check{E}_{\check{\pi}}\Big[Z_0(|g|)\sum_{k=1}^{\infty} Z_k(|g|)I_{\{\sigma(0)\ge k\}}\Big] = \frac{2}{\varepsilon}\sum_{k=1}^{\infty}\check{E}_{\check{\pi}}\big[I_{\{\sigma(0)\ge k\}}Z_0(|g|)Z_k(|g|)\big]. \qquad (12)$$

Let $C_k := \check{E}_{\check{\pi}}\big[I_{\{\sigma(0)\ge k\}}Z_0(|g|)Z_k(|g|)\big]$. By Cauchy-Schwarz,

$$C_k \le \sqrt{\check{E}_{\check{\pi}}\big[I_{\{\sigma(0)\ge k\}}Z_0(|g|)^2\big]}\,\sqrt{\check{E}_{\check{\pi}}\big[Z_k(|g|)^2\big]} = \sqrt{\check{E}_{\check{\pi}}\big[I_{\{Y_0=0\}}I_{\{Y_1=0,\ldots,Y_{k-1}=0\}}Z_0(|g|)^2\big]}\,\sqrt{E_\pi\big[Z_0(|g|)^2\big]}.$$

Now observe that $\{Y_1, \ldots, Y_{k-1}\}$ and $\{X_0, \ldots, X_{m-1}\}$ are independent. Moreover we drop $I_{\{Y_0=0\}}$ to obtain

$$C_k \le (1-\varepsilon)^{\frac{k-1}{2}}\, E_\pi\big[Z_0(|g|)^2\big] \le (1-\varepsilon)^{\frac{k-1}{2}}\, m^2\pi g^2. \qquad (13)$$

Combining (12) and (13) yields $B < \infty$. This completes the proof.

4. The difference between m = 1 and m ≠ 1

Assume the small set condition (2) holds and consider the split chain defined by (6) and (7). The following tours

$$\{X_{(\sigma(n)+1)m},\, X_{(\sigma(n)+1)m+1},\, \ldots,\, X_{(\sigma(n+1)+1)m-1}\}, \quad n = 0, 1, \ldots$$

(6)

that start whenever $X_k \sim \nu_m$ are of crucial importance in regeneration theory and are eagerly analyzed by researchers. In virtually every paper on the subject there is a claim that these tours are independent identically distributed random variables. This claim is usually considered obvious and no proof is provided. However, it is not true if $m > 1$.

In fact formulas (6) and (7) should be convincing enough, as $X_{nm+1}, \ldots, X_{(n+1)m}$ given $Y_n = 1$ and $X_{nm} = x$ are linked in the way described by $P(x, dx_1)\cdots P(x_{m-1}, dy)$.

In particular, consider a Markov chain on $X = \{a, b, c, d, e\}$ with transition probabilities $P(a,b) = P(a,c) = P(b,b) = P(b,d) = P(c,c) = P(c,e) = 1/2$ and $P(d,a) = P(e,a) = 1$. Let $\nu_4(d) = \nu_4(e) = 1/2$ and $\varepsilon = 1/8$. Clearly $P^4(x,\cdot) \ge \varepsilon\nu_4(\cdot)$ for every $x \in X$, hence we have established (2) with $C = X$. Note that in this simplistic example each tour can start with $d$ or $e$. However, if it starts with $d$ or $e$, the previous tour must have ended with $b$ or $c$ respectively.

This makes them dependent!
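The example is small enough to verify by direct computation (a sketch; the states $a, \ldots, e$ are coded as indices $0, \ldots, 4$): the minorization (2) with $C = X$, $m = 4$, $\varepsilon = 1/8$ indeed holds, and inspecting the columns of $P$ confirms that $d$ can only be entered from $b$, and $e$ only from $c$, which is exactly the source of the dependence between consecutive tours:

```python
import numpy as np

# States (a, b, c, d, e) -> indices (0, 1, 2, 3, 4).
P = np.zeros((5, 5))
P[0, 1] = P[0, 2] = 0.5   # a -> b, a -> c
P[1, 1] = P[1, 3] = 0.5   # b -> b, b -> d
P[2, 2] = P[2, 4] = 0.5   # c -> c, c -> e
P[3, 0] = P[4, 0] = 1.0   # d -> a, e -> a

nu4 = np.array([0, 0, 0, 0.5, 0.5])
eps = 1 / 8
P4 = np.linalg.matrix_power(P, 4)
print((P4 >= eps * nu4 - 1e-12).all())   # minorization (2): prints True

# Dependence mechanism: which states can enter d, resp. e, in one step?
print(np.flatnonzero(P[:, 3] > 0), np.flatnonzero(P[:, 4] > 0))  # only b; only c
```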

Similar examples with a general state space $X$ and $C \ne X$ can easily be provided. Hence Theorem 2.2 is critical to providing regeneration proofs of CLTs, and standard arguments that involve iid random variables are not valid.

5. Bibliography

[1] Athreya K. B., Ney P., 1978. A new approach to the limit theory of recurrent Markov chains. Trans. Amer. Math. Soc., 245: 493-501.

[2] Cogburn R., 1972. The Central Limit Theorem for Markov Processes. In Le Cam, L. E., Neyman, J. & Scott, E. L. (Eds.), Proc. Sixth Ann. Berkeley Symp. Math. Statist. and Prob., 2, 458-512.

[3] Ibragimov I. A., Linnik Y. V., 1971. Independent and Stationary Sequences of Random Variables. Wolters-Noordhoff, Groningen.

[4] Jones G. L., 2005. On the Markov chain central limit theorem. Probability Surveys 1: 299-320.

[5] Jones G. L., Haran M., Caffo B. S., Neath R., 2006. Fixed-Width Output Analysis for Markov Chain Monte Carlo. Journal of the American Statistical Association, 101, 1537-1547.

[6] Meyn S. P., Tweedie R. L., 1993. Markov Chains and Stochastic Stability. Springer-Verlag.

[7] Nummelin E., 1978. A splitting technique for Harris recurrent chains. Z. Wahrscheinlichkeitstheorie und Verw. Geb., 43: 309-318.

[8] Roberts G. O., Rosenthal J. S., 2005. General state space Markov chains and MCMC algorithms. Probability Surveys 1: 20-71.
