vol.37(2008) No. 1
Convergen e diagnosis to stationary distribution in
MCMC methods via atoms and renewal sets
by
Ma iej Romaniuk
SystemsResear hInstitutePolishA ademyofS ien es, ul. Newelska6,
01447Warszawa,Poland,e-mail: mromanibspan.waw.pl
Abstra t: MCMCsetupsareoneofthebestknownmethodsfor
ondu ting omputersimulations usefulin su h areasas statisti s,
physi s,biology,et . However,to obtainappropriatesolutions,the
additional onvergen ediagnosismustbeappliedforMarkovChain
traje torygenerated bythealgorithm. Wepresentthe methodfor
dealingwiththisproblembasedonfeaturesofso alledse ondary
hain(the hainwithspe iallysele tedstatespa e). These ondary
hainis reatedfromtheinitial hainbypi kingonlysomeobserva-
tions onne tedwith atomsorrenewalsets. Inthispaperwefo us
on nding the moment when the simulated hain is lose enough
to thestationary distributionof the Markov hain. Thedis ussed
methodhassomeappealingproperties,likehighdegreeofdiagnosis
automation. Apart from theoreti al lemmas and a more heuristi
approa h,theexamplesofappli ationarealsoprovided.
Keywords: onvergen ediagnosis,MarkovChainMonteCarlo,
MarkovProperty,atom,renewalset, renewaltheory,automateddi-
agnosisofsimulations
1. Introdu tion
Theendoftheprevious enturybroughta olossalimprovementinspeedof
al ulations. Be ause of omputer development, the resear hers ould build
more omplex, more real-life models. The same applies for mathemati s,
statisti s,physi sand biology,where omputersimulationsarewidelyused.
Oneofthebestknownmethodsin omputersimulationsareMCMC(Markov
ChainMonteCarlo)algorithms,su essorsofMC(MonteCarlo)approa h(see
Metropolisetal.,1953;MetropolisandUlam, 1949). Theyare ommonlyused
in many pra ti al areas (see, e.g., Boos, Zhang, 2000; Booth, Sarkar, 1998;
Bremaud, 1999, Dou et et al., 2000; Gelfand et al., 1990; Gilks et al., 1997;
Kassetal.,1998;Korona kietal.,2005;Lasota,Niemiro,2003;Lietal.,2000;
Mehta etal.,2000;Robert,Casella,2004;Romaniuk,2003).
TheMCMCmethodisbasedonasimplebutbrilliantidea. Inordertond
theexpe tedvalueEπ
Xh(X) forsomefun tion h(.) : X → Rp andprobability distribution πX(.), we ould generate Markov Chain X0, X1, X2, . . . with πX
asthestationarydistribution. The onvergen eof theestimator,derivedfrom
thesimulatedsampleisguaranteedby theergodi theorems(see,e.g.,Robert,
Casella, 2004 for additional details). Therefore, we do not have to generate
valuesdire tlyfromπX(.)asin theMC method,but wemayusemoregeneral
algorithmslikeGibbs samplerortheMetropolisHastingsalgorithm.
Yet,duringthe ondu tofsimulationstwoquestionsariseallthetime. The
rstoneis onne tedwith hoosing appropriatenumberofstepsnstat forsim-
ulated traje tory, when the sampled transition probability Prnx0stat(.) is lose
enoughtotheassumedstationaryprobabilityπX(.)regardlessofstartingpoint x0. The se ond oneis related to nding the numberof steps nVar, when the
estimator of EπXh(X), derived from the sample Xnstat+1, Xnstat+2, . . . , XnVar
haserror smallenough,asmeasurede.g. byvarian e. Thesetwoquestionsare
overedby onvergen ediagnosis andareoneofthemainissuesinMCMCsim-
ulations. However,in thispaperwefo usonlyontherstproblem,i.e. nding
the value nstat. Some answersfor the se ond problem may be found e.g. in
Romaniuk(2007b).
There isa lot of various onvergen e diagnosis methods (see, e.g.,Robert,
Casella, 2004; El Adlouni et al., 2006, for omparative review). Butwehave
to say that itis notso easyto ompare them andnd thebest one oreven
thebestones. Firstly,veryoftenthesemethodsmakeuseofdierentfeatures
oftheunderlyingMarkovChains,e.g.spe i probabilitystru tureofthestate
spa e. Se ondly, the two questions mentioned before are used to be written
in mathemati al formulas not orresponding to one another, i.e. not dire tly
omparable. Thirdly, it is not even possible to draw a omparison between
heuristi and theoreti al (i.e. basedonmathemati alproofs)methods. There-
fore,ea hnew onvergen ediagnosismethod maybeseenasanadditionaltool
for experimenters, whi h gives them a new possibility to he k the obtained
simulations.
In this paper we dis uss the methods based on the on ept of se ondary
hain. These ondary hainisderivedfromtheoriginaltraje torybyobserving
the samples only in moments determined by spe ial probability rules. These
rulesare onne tedwiththenotionsofatomsandrenewalsets,whi harespe i
examplesofmoregeneralrenewal moments andareapartofrenewal theory.
Themethods des ribed overboththeoreti aland heuristi approa hes.
Thepresentedtheoreti al method hasthreemain advantages. Firstly,it is
supportedbystrongmathemati alreasoning. Therefore,itisfarlessinuen ed
by observer's intuition and his experien e than heuristi methods. Se ondly,
the obtained solutions are stri t, i.e. they are not asymptoti . Hen e, this
method is notbiasedbyadditionalerrorprovided bylimittheorems. Thirdly,
thedis ussedlemmasmaybeusedinahighlyautomatedmanner. Thisgivesthe
possibilityforpreparinggeneraldiagnosisalgorithmsforawide lassofMCMC
problems.
Theheuristi approa h isalso basedonmathemati al lemma, but involves
subje tivegraph he king.
The paper is organizedas follows. In Se tion 2 we present the ne essary
basi denitionsandtheorems. Then,inSe tion3.1weintrodu ethenotionof
se ondary hainandsomefundamentalfa tsaboutit. InSe tion3.2weformu-
latetwoinequalitieswhi h aredire tly onne tedto the onvergen ediagnosis
questions,mentioned before. Next, in Se tion 3.3wepresent sometheoreti al
lemmaswhi h onstitutethefoundationoftheintrodu edmethod andprovide
the answers for the question about nstat. In Se tion 3.4 we dis uss a more
heuristi approa h. In Se tion 4 we present how the derived results may be
applied intwoexamples. The on ludingremarksare ontainedin Se tion5.
Someof thesolutionspresentedin this paper arebased onideasfrom Ro-
maniuk(2007aand 2007b). Asit wasmentionedbefore, in Romaniuk(2007b)
themethodsforndingbothvaluesnstatandnVarwerepresented. However,for
nstatappropriatelemmasonlyinatom asewereproved. Inthispaperwefo us onlyon theproblemof nstat value, butthe generalizedlemmasforthe aseof renewalsetsareadded. Additionally,anewheuristi approa hforbothatom
andrenewal asesispresented.
2. Basi denitions and theorems
Inthis se tion we introdu e fundamental denitions and theorems. Addi-
tional ne essary denitions may be found in, e.g., Bremaud (1999), Fishman
(1996),RobertandCasella(2004).
Let(Xi)i=0 = (X0= x0, X1, . . .)denote aMarkovChain (abbreviatedfur- therMC),andB(X ) istheσeldofBorelsetsforspa eX.
The hain(Xi)i=0 hasitsvaluesinaspa eX, whereX ⊂ NorX ∈ B(Rk).
Intherst ase su h MCis alledasdis reteMC, andin these ond asMC
on ontinuousstatespa e.
Suppose that the hain (Xi)i=0 is ergodi and has anadequate stationary
probabilitydistributionπX(.). Inthis paperthe termergodi ity meansthat the hain is re urrent(or Harris re urrent in ase of MC on ontinuous state
spa eX ),aperiodi andirredu ible.
If(Xi)i=0 isadis reteMarkovChain,wedeneitstransitionmatrixPX as
PX= (Pr (Xk+1= j|Xk = i))si,j=1X , (1)
where sX ispowerof X. In aseof ontinuousstatespa eX,letus denoteby KX(., .)thetransitionkernelofthis hain
Pr(Xk+1∈ B|Xk = x) = Z
B
KX(x, y) dy . (2)
Definition 1. The set Ais alledan atom if thereexists aprobability distri- butionν(.)su hthat
Pr(Xk+1∈ B|Xk = x) = ν(B) (3)
for every x ∈ AandeveryB ∈ B(X ).
Definition 2. The set Ais alled renewalset if thereexists areal 0 < ǫ < 1
andaprobability measureν(.)su hthat
Pr(Xk+1∈ B|Xk = x) ≥ ǫν(B) (4)
for every x ∈ AandeveryB ∈ B(X ).
Thesetwodenitions maybefound in,e.g.,Asmussen(1979), Robert and
Casella(2004).
If A is arenewal set, it is onvenient to slightly hange the used MCMC
algorithm,whi hgeneratesthevaluesof(Xi)i=0. Itiseasilyseenthat Pr(Xk+1|Xk) = ǫν(Xk+1) + (1 − ǫ)Pr(Xk+1|Xk) − ǫν(Xk+1)
1 − ǫ (5)
in aseofdis reteMC,or
K(xk, xk+1) = ǫν(xk+1) + (1 − ǫ)K(xk, xk+1) − ǫν(xk+1)
1 − ǫ (6)
forMCon ontinuousstatespa eX. Hen e,wehavethefollowingmodi ation ofthealgorithm: whenXk ∈ A,generateXk+1 a ordingto
Xk+1=
(Xk+1∼ ν(.) ifUk+1≤ ǫ
Xk+1∼ K(xk1−ǫ,.)−ǫν(.) ifUk+1> ǫ , (7)
where Ui are iid random variablesfrom a uniform distribution on[0, 1], inde-
pendenton(Xi)i=0. Inviewof (5)and(6),themodi ation(7)oftheMCMC algorithm doesnot hangethepropertiesofthe hain. Alsoitsstationarydis-
tribution isstillthesame,i.e. πX(.). Thismodi ationforMCMCalgorithms wasintrodu edin Athreyaand Ney(1978),Nummelin (1978). Thegeneration
a ordingto (7) may bedi ult be auseof the omplex stru ture of there-
mainder kernel. A way aroundthis problem wasshown in Mykland,Tierney
andYu (1995).
Definition 3. The atom (or renewal set) A is alled geometri ally ergodi atom (orrenewal set)ifthere existr > 1andM > 0su hthat
|Prnx(y) − πX(y)| ≤ M r−n , (8)
for any x, y ∈ A,wherePrnx(.)denotes Pr(Xn = . |X0= x).
Letus denote by Eπ
Xh(X) the expe ted valueof the fun tion h : X → R
al ulateda ordingto thestationarydistributionπX. Appropriatesymbols
CovπX(g, h)andVarπX(h)areusedfor ovarian eand varian e.
Lemma 1. Let(Xi)i=0 beHarrisre urrentMarkov Chain and
Eπ
X|f (X)| = Z
X
|f (x)|dπX(x) < ∞ (9)
for somefun tion f (.) : X → Rand
EπX|l(X)| = Z
X
|l(x)|dπX(x) < ∞,EπXl(X) 6= 0 (10)
for somefun tion l(.) : X → R. Then wehave
1 n+1
Pn
k=0f (Xk)
1 n+1
Pn
k=0l(Xk)
−−−−→p.n.
n→∞
R
Xf (x) dπX(x) R
Xl(x) dπX(x) . (11)
Forproofofthis lemmaseeRobertandCasella(2004).
3. Proposal of a onvergen e diagnosis method
Inthisse tionwepresenta onvergen ediagnosismethodforMCMCoutput.
Thisproposalusesnotionsofatomsandrenewal sets (seeSe tion2).
3.1. Introdu ing se ondary hain
Suppose that we are interested in diagnosing onvergen e of some ergodi
MarkovChain (Xi)i=0 = (X0= x0, X1, . . .). We denoteastationary distribu-
tion for this hain by πX(.), its transition matrixby PX (or transition kernel byKX(., .)in aseofMCon ontinuousstatespa e)andthespa eofitsvalues
byX. Supposealso that weknowtwoatoms (orrenewalsets) A1, A2 for this
hain.
Therefore, we an reate the se ondary hain (Yi)i=1 based on our initial
hain (Xi)i=0. IfA1, A2areatoms,then we andene
ζ1:= min{i = 1, . . . : Xi∈ A1∪ A2}, (12) ζk+1:= min{i > ζk: Xi∈ A1∪ A2}, (13)
Yk = Xζk . (14)
It is seenthat the hain (Yi)i=1 hasMarkovPropertyfor thetrun ated spa e Y′ := {A1, A2}seeLemma2.
Ifthesetwosetsarerenewalsets, weshould introdu ethe modi ation(7)
and hange thedenition ofthe hain(Yi)i=1 to
ζ1:= min{i = 1, . . . : (Xi∈ A1∧ Ui≤ ǫA1) ∨ (Xi∈ A2∧ Ui≤ ǫA2)},
(15)
ζk+1:= min{i > ζk: (Xi∈ A1∧ Ui≤ ǫA1) ∨ (Xi∈ A2∧ Ui≤ ǫA2)},
(16)
Yk = Xζk , (17)
where ǫAj denotes the parameter ǫ for appropriate renewal set Aj in ondi-
tion(7). Alsointhis asethese ondary hain(Yi)i=1 hasMarkovPropertyfor
the spa eY′. Asit wasmentionedbefore, aspe ial method to simulate from
theremainderkernelmaybene essary(seeMykland,TierneyandYu,1995).
Wemaysummarisepreviousobservationsin asimplelemma:
Lemma 2. If A1, A2 are atoms(or renewal sets), the hain (Yi)i=1 dened by
onditions (12) (14) (or (15) (17), respe tively) is a Markov Chain for
the spa eY′ := {A1, A2}. This hainisergodi .
TheproofmaybefoundinRomaniuk(2007b).
Forsimpli ity ofnotation, we ontinue to allatoms orrenewalsets Aj as
spe ialsets,keepinginminddierentdenitionsofthese ondary hain(Yi)i=1
fortheseboth ases.
Themomentsζidenedpreviously,maybeadditionallypartitionedbetween the orrespondingspe ialsets. Hen e,weadopt thefollowingdenitionofζi(j)
forthexedatomAj:
ζ1(j):= min{i = 1, . . . : Xi∈ Aj} , (18) ζk+1(j) := min{i > ζk(j): Xi∈ Aj}. (19)
FortherenewalsetAj thedenitionofζi(j)isanequivalentmodi ationofthe aboveformulas,i.e.:
ζ1(j):= min{i = 1, . . . : Xi∈ Aj∧ Ui≤ ǫAj}, (20) ζk+1(j) := min{i > ζk(j): Xi∈ Aj∧ Ui≤ ǫAj} . (21)
Therefore,ζ1(j)maybe onsideredasthemomentofrstvisit inthesetAj.
Nextlemmaisusedasjusti ationforaheuristi methoddes ribedfurther.
Lemma 3. If sets A1, A2 are atoms, then stationary distribution of πY(.) is
given by
πY(Aj) =
R
x∈AjdπX(x) R
x∈A1dπX(x) +R
x∈A2dπX(x) , (22)
for j = 1, 2.
IfsetsA1, A2arerenewalsets,thenstationary distributionofπY(.)isgiven
by
πY(Aj) = ǫAj
R
x∈AjdπX(x) ǫA1
R
x∈A1dπX(x) + ǫA2
R
x∈A2dπX(x) , (23)
for j = 1, 2.
Proof.Be ause(Yi)i=1isMC,thenfromthestrongergodi theoremforMarkov
hains wehave
Pm
i=111(Yi∈ Aj) m
−−−−→p.n.
m→∞ πY(Aj), (24)
forj = 1, 2,where
m = #{i ≤ n : Xi ∈ A1∪ A2} . (25)
IfA1, A2areatoms, thenlet
m(n) = #{i ≤ n : Xi∈ A1∪ A2} , (26)
i.e. m(n)is therandom numberof visitsinto A1 and A2. Be ause theinitial
hainisHarrisre urrent,thenforn → ∞,wehavem(n) → ∞(seeNummelin,
2001).
From(12)(14)andLemma 1wehave
Pm(n)
i=1 11(Yi∈ Aj)
m(n) =
Pm(n)
i=1 11(Yi∈ Aj) Pm(n)
i=1 11(Yi∈ A1∪ A2) =
=
1 n+1
Pn
i=011(Xi∈ Aj)
1 n+1
Pn
i=011(Xi∈ A1∪ A2)
−−−−→p.n.
n→∞
R
x∈AjdπX(x) R
x∈A1dπX(x) +R
x∈A2dπX(x) .
(27)
Comparing (24) with (27), we obtain (22) (see also Nummelin, 2001 for
similarinferen e).
IfA1, A2arerenewalsets,thenlet
m(n) = #{i ≤ n : Xi∈ (A1, Ui≤ ǫA1) ∪ (A2, Ui≤ ǫA2)}. (28)
From(15)(17)andLemma1wehave
Pm(n)
i=1 11(Yi∈ Aj)
m(n) =
Pm(n)
i=1 11(Yi∈ Aj) Pm(n)
i=1 11(Yi∈ A1∪ A2) =
=
1 n+1
Pn
i=011(Xi∈ Aj, Ui≤ ǫAj)
1 n+1
Pn
i=011(Xi∈ (A1, Ui ≤ ǫA1) ∪ (A2, Ui≤ ǫA2))→
−−−−→p.n.
n→∞
ǫAj
R
x∈AjdπX(x) ǫA1
R
x∈A1dπX(x) + ǫA2
R
x∈A2dπX(x) . (29)
Informula(29)weusedtheindependen epropertyforUi andXi(see(7)). As
previously, omparing (24)with(29),weprove(23).
3.2. Diagnosisof the initial hain
Aswehavenotedin Se tion 3.1, for hain (Xi)i=0 with twoknownspe ial
setsAj (j = 1, 2)wemayintrodu eadditional hain(Yi)i=1. The hain(Yi)i=1
isadis reteMCwithonlytwostates,regardlessof ardinalityandpowerofthe
spa eX.
Duringdiagnosisoftheinitial hain,weareinterestedintwovaluesnstat
andnVar. Therstvaluenstatisthetimemomentwhenweare loseenough
tostationarydistributionπX,i.e.
Pxn0stat− πX
≤ ε1 , (30)
wherek.kindi atessomedeterminednormforspa eX,e.g. totalvariationnorm
whi h is used in the rest of this paper, Prnx0stat(.) = Pr(Xnstat = . |X0 = x0).
When the number of simulations nstat in the MCMC algorithm is a hieved,
in the lightof (30)wemay treat(Xi)i≥nstat asbeing almost distributed from stationary distribution πX.
Supposethatweareinterestedinobtainingestimatoroftheexpe tedvalue
Eπ
Xh(X) based on the average of the initial hain. Naturally, wewould like
to a hievesmall enoughvarian eof this estimatorand ndthe quantity nVar
fullling the ondition
Var 1 s
nVar
X
k=nstat+1
h(Xk) − Eπxh(X)
!
≤ ε2, (31)
wheres = nVar− nstat.
Inthe followingwefo us onlyon problem (30). We deal with these ond
problem inRomaniuk (2007b). Furthermore,forsimpli ityof formulationand
notation,welimitourselvestothe asewhenX isaniteset. However,appro-
priateproofsmaybeeasilygeneralizedforthe aseof ontinuousstatespa eX.
Itisworthnotingthatfromthe omputationalandnumeri alpointofview,the
problemof ardinalityofX israthera ademi in omputersallthenumbers
arerepresentedbytheniteset ofpossibilities.