• Nie Znaleziono Wyników

Wepresentthe methodfor dealingwiththisproblembasedonfeaturesofso alledse ondary hain(the hainwithspe iallysele tedstatespa e)

N/A
N/A
Protected

Academic year: 2021

Share "Wepresentthe methodfor dealingwiththisproblembasedonfeaturesofso alledse ondary hain(the hainwithspe iallysele tedstatespa e)"

Copied!
26
0
0

Pełen tekst

(1)

vol.37(2008) No. 1

Convergen e diagnosis to stationary distribution in

MCMC methods via atoms and renewal sets

by

Ma iej Romaniuk

SystemsResear hInstitutePolishA ademyofS ien es, ul. Newelska6,

01447Warszawa,Poland,e-mail: mromanibspan.waw.pl

Abstra t: MCMCsetupsareoneofthebestknownmethodsfor

ondu ting omputersimulations usefulin su h areasas statisti s,

physi s,biology,et . However,to obtainappropriatesolutions,the

additional onvergen ediagnosismustbeappliedforMarkovChain

traje torygenerated bythealgorithm. Wepresentthe methodfor

dealingwiththisproblembasedonfeaturesofso alledse ondary

hain(the hainwithspe iallysele tedstatespa e). These ondary

hainis reatedfromtheinitial hainbypi kingonlysomeobserva-

tions onne tedwith atomsorrenewalsets. Inthispaperwefo us

on nding the moment when the simulated hain is lose enough

to thestationary distributionof the Markov hain. Thedis ussed

methodhassomeappealingproperties,likehighdegreeofdiagnosis

automation. Apart from theoreti al lemmas and a more heuristi

approa h,theexamplesofappli ationarealsoprovided.

Keywords: onvergen ediagnosis,MarkovChainMonteCarlo,

MarkovProperty,atom,renewalset, renewaltheory,automateddi-

agnosisofsimulations

1. Introdu tion

Theendoftheprevious enturybroughta olossalimprovementinspeedof

al ulations. Be ause of omputer development, the resear hers ould build

more omplex, more real-life models. The same applies for mathemati s,

statisti s,physi sand biology,where omputersimulationsarewidelyused.

Oneofthebestknownmethodsin omputersimulationsareMCMC(Markov

ChainMonteCarlo)algorithms,su essorsofMC(MonteCarlo)approa h(see

Metropolisetal.,1953;MetropolisandUlam, 1949). Theyare ommonlyused

in many pra ti al areas (see, e.g., Boos, Zhang, 2000; Booth, Sarkar, 1998;

Bremaud, 1999, Dou et et al., 2000; Gelfand et al., 1990; Gilks et al., 1997;

Kassetal.,1998;Korona kietal.,2005;Lasota,Niemiro,2003;Lietal.,2000;

Mehta etal.,2000;Robert,Casella,2004;Romaniuk,2003).

(2)

TheMCMCmethodisbasedonasimplebutbrilliantidea. Inordertond

theexpe tedvalueEπ

Xh(X) forsomefun tion h(.) : X → Rp andprobability distribution πX(.), we ould generate Markov Chain X0, X1, X2, . . . with πX

asthestationarydistribution. The onvergen eof theestimator,derivedfrom

thesimulatedsampleisguaranteedby theergodi theorems(see,e.g.,Robert,

Casella, 2004 for additional details). Therefore, we do not have to generate

valuesdire tlyfromπX(.)asin theMC method,but wemayusemoregeneral

algorithmslikeGibbs samplerortheMetropolisHastingsalgorithm.

Yet,duringthe ondu tofsimulationstwoquestionsariseallthetime. The

rstoneis onne tedwith hoosing appropriatenumberofstepsnstat forsim-

ulated traje tory, when the sampled transition probability Prnx0stat(.) is lose

enoughtotheassumedstationaryprobabilityπX(.)regardlessofstartingpoint x0. The se ond oneis related to nding the numberof steps nVar, when the

estimator of EπXh(X), derived from the sample Xnstat+1, Xnstat+2, . . . , XnVar

haserror smallenough,asmeasurede.g. byvarian e. Thesetwoquestionsare

overedby onvergen ediagnosis andareoneofthemainissuesinMCMCsim-

ulations. However,in thispaperwefo usonlyontherstproblem,i.e. nding

the value nstat. Some answersfor the se ond problem may be found e.g. in

Romaniuk(2007b).

There isa lot of various onvergen e diagnosis methods (see, e.g.,Robert,

Casella, 2004; El Adlouni et al., 2006, for omparative review). Butwehave

to say that itis notso easyto ompare them andnd thebest one oreven

thebestones. Firstly,veryoftenthesemethodsmakeuseofdierentfeatures

oftheunderlyingMarkovChains,e.g.spe i probabilitystru tureofthestate

spa e. Se ondly, the two questions mentioned before are used to be written

in mathemati al formulas not orresponding to one another, i.e. not dire tly

omparable. Thirdly, it is not even possible to draw a omparison between

heuristi and theoreti al (i.e. basedonmathemati alproofs)methods. There-

fore,ea hnew onvergen ediagnosismethod maybeseenasanadditionaltool

for experimenters, whi h gives them a new possibility to he k the obtained

simulations.

In this paper we dis uss the methods based on the on ept of se ondary

hain. These ondary hainisderivedfromtheoriginaltraje torybyobserving

the samples only in moments determined by spe ial probability rules. These

rulesare onne tedwiththenotionsofatomsandrenewalsets,whi harespe i

examplesofmoregeneralrenewal moments andareapartofrenewal theory.

Themethods des ribed overboththeoreti aland heuristi approa hes.

Thepresentedtheoreti al method hasthreemain advantages. Firstly,it is

supportedbystrongmathemati alreasoning. Therefore,itisfarlessinuen ed

by observer's intuition and his experien e than heuristi methods. Se ondly,

the obtained solutions are stri t, i.e. they are not asymptoti . Hen e, this

method is notbiasedbyadditionalerrorprovided bylimittheorems. Thirdly,

thedis ussedlemmasmaybeusedinahighlyautomatedmanner. Thisgivesthe

possibilityforpreparinggeneraldiagnosisalgorithmsforawide lassofMCMC

(3)

problems.

Theheuristi approa h isalso basedonmathemati al lemma, but involves

subje tivegraph he king.

The paper is organizedas follows. In Se tion 2 we present the ne essary

basi denitionsandtheorems. Then,inSe tion3.1weintrodu ethenotionof

se ondary hainandsomefundamentalfa tsaboutit. InSe tion3.2weformu-

latetwoinequalitieswhi h aredire tly onne tedto the onvergen ediagnosis

questions,mentioned before. Next, in Se tion 3.3wepresent sometheoreti al

lemmaswhi h onstitutethefoundationoftheintrodu edmethod andprovide

the answers for the question about nstat. In Se tion 3.4 we dis uss a more

heuristi approa h. In Se tion 4 we present how the derived results may be

applied intwoexamples. The on ludingremarksare ontainedin Se tion5.

Someof thesolutionspresentedin this paper arebased onideasfrom Ro-

maniuk(2007aand 2007b). Asit wasmentionedbefore, in Romaniuk(2007b)

themethodsforndingbothvaluesnstatandnVarwerepresented. However,for

nstatappropriatelemmasonlyinatom asewereproved. Inthispaperwefo us onlyon theproblemof nstat value, butthe generalizedlemmasforthe aseof renewalsetsareadded. Additionally,anewheuristi approa hforbothatom

andrenewal asesispresented.

2. Basi denitions and theorems

Inthis se tion we introdu e fundamental denitions and theorems. Addi-

tional ne essary denitions may be found in, e.g., Bremaud (1999), Fishman

(1996),RobertandCasella(2004).

Let(Xi)i=0 = (X0= x0, X1, . . .)denote aMarkovChain (abbreviatedfur- therMC),andB(X ) istheσeldofBorelsetsforspa eX.

The hain(Xi)i=0 hasitsvaluesinaspa eX, whereX ⊂ NorX ∈ B(Rk).

Intherst ase su h MCis alledasdis reteMC, andin these ond asMC

on ontinuousstatespa e.

Suppose that the hain (Xi)i=0 is ergodi and has anadequate stationary

probabilitydistributionπX(.). Inthis paperthe termergodi ity meansthat the hain is re urrent(or Harris re urrent in ase of MC on ontinuous state

spa eX ),aperiodi andirredu ible.

If(Xi)i=0 isadis reteMarkovChain,wedeneitstransitionmatrixPX as

PX= (Pr (Xk+1= j|Xk = i))si,j=1X , (1)

where sX ispowerof X. In aseof ontinuousstatespa eX,letus denoteby KX(., .)thetransitionkernelofthis hain

Pr(Xk+1∈ B|Xk = x) = Z

B

KX(x, y) dy . (2)

(4)

Definition 1. The set Ais alledan atom if thereexists aprobability distri- butionν(.)su hthat

Pr(Xk+1∈ B|Xk = x) = ν(B) (3)

for every x ∈ AandeveryB ∈ B(X ).

Definition 2. The set Ais alled renewalset if thereexists areal 0 < ǫ < 1

andaprobability measureν(.)su hthat

Pr(Xk+1∈ B|Xk = x) ≥ ǫν(B) (4)

for every x ∈ AandeveryB ∈ B(X ).

Thesetwodenitions maybefound in,e.g.,Asmussen(1979), Robert and

Casella(2004).

If A is arenewal set, it is onvenient to slightly hange the used MCMC

algorithm,whi hgeneratesthevaluesof(Xi)i=0. Itiseasilyseenthat Pr(Xk+1|Xk) = ǫν(Xk+1) + (1 − ǫ)Pr(Xk+1|Xk) − ǫν(Xk+1)

1 − ǫ (5)

in aseofdis reteMC,or

K(xk, xk+1) = ǫν(xk+1) + (1 − ǫ)K(xk, xk+1) − ǫν(xk+1)

1 − ǫ (6)

forMCon ontinuousstatespa eX. Hen e,wehavethefollowingmodi ation ofthealgorithm: whenXk ∈ A,generateXk+1 a ordingto

Xk+1=

(Xk+1∼ ν(.) ifUk+1≤ ǫ

Xk+1 K(xk1−ǫ,.)−ǫν(.) ifUk+1> ǫ , (7)

where Ui are iid random variablesfrom a uniform distribution on[0, 1], inde-

pendenton(Xi)i=0. Inviewof (5)and(6),themodi ation(7)oftheMCMC algorithm doesnot hangethepropertiesofthe hain. Alsoitsstationarydis-

tribution isstillthesame,i.e. πX(.). Thismodi ationforMCMCalgorithms wasintrodu edin Athreyaand Ney(1978),Nummelin (1978). Thegeneration

a ordingto (7) may bedi ult be auseof the omplex stru ture of there-

mainder kernel. A way aroundthis problem wasshown in Mykland,Tierney

andYu (1995).

Definition 3. The atom (or renewal set) A is alled geometri ally ergodi atom (orrenewal set)ifthere existr > 1andM > 0su hthat

|Prnx(y) − πX(y)| ≤ M r−n , (8)

for any x, y ∈ A,wherePrnx(.)denotes Pr(Xn = . |X0= x).

(5)

Letus denote by Eπ

Xh(X) the expe ted valueof the fun tion h : X → R

al ulateda ordingto thestationarydistributionπX. Appropriatesymbols

CovπX(g, h)andVarπX(h)areusedfor ovarian eand varian e.

Lemma 1. Let(Xi)i=0 beHarrisre urrentMarkov Chain and

Eπ

X|f (X)| = Z

X

|f (x)|dπX(x) < ∞ (9)

for somefun tion f (.) : X → Rand

EπX|l(X)| = Z

X

|l(x)|dπX(x) < ∞,EπXl(X) 6= 0 (10)

for somefun tion l(.) : X → R. Then wehave

1 n+1

Pn

k=0f (Xk)

1 n+1

Pn

k=0l(Xk)

−−−−→p.n.

n→∞

R

Xf (x) dπX(x) R

Xl(x) dπX(x) . (11)

Forproofofthis lemmaseeRobertandCasella(2004).

3. Proposal of a onvergen e diagnosis method

Inthisse tionwepresenta onvergen ediagnosismethodforMCMCoutput.

Thisproposalusesnotionsofatomsandrenewal sets (seeSe tion2).

3.1. Introdu ing se ondary hain

Suppose that we are interested in diagnosing onvergen e of some ergodi

MarkovChain (Xi)i=0 = (X0= x0, X1, . . .). We denoteastationary distribu-

tion for this hain by πX(.), its transition matrixby PX (or transition kernel byKX(., .)in aseofMCon ontinuousstatespa e)andthespa eofitsvalues

byX. Supposealso that weknowtwoatoms (orrenewalsets) A1, A2 for this

hain.

Therefore, we an reate the se ondary hain (Yi)i=1 based on our initial

hain (Xi)i=0. IfA1, A2areatoms,then we andene

ζ1:= min{i = 1, . . . : Xi∈ A1∪ A2}, (12) ζk+1:= min{i > ζk: Xi∈ A1∪ A2}, (13)

Yk = Xζk . (14)

It is seenthat the hain (Yi)i=1 hasMarkovPropertyfor thetrun ated spa e Y := {A1, A2}seeLemma2.

(6)

Ifthesetwosetsarerenewalsets, weshould introdu ethe modi ation(7)

and hange thedenition ofthe hain(Yi)i=1 to

ζ1:= min{i = 1, . . . : (Xi∈ A1∧ Ui≤ ǫA1) ∨ (Xi∈ A2∧ Ui≤ ǫA2)},

(15)

ζk+1:= min{i > ζk: (Xi∈ A1∧ Ui≤ ǫA1) ∨ (Xi∈ A2∧ Ui≤ ǫA2)},

(16)

Yk = Xζk , (17)

where ǫAj denotes the parameter ǫ for appropriate renewal set Aj in ondi-

tion(7). Alsointhis asethese ondary hain(Yi)i=1 hasMarkovPropertyfor

the spa eY. Asit wasmentionedbefore, aspe ial method to simulate from

theremainderkernelmaybene essary(seeMykland,TierneyandYu,1995).

Wemaysummarisepreviousobservationsin asimplelemma:

Lemma 2. If A1, A2 are atoms(or renewal sets), the hain (Yi)i=1 dened by

onditions (12)  (14) (or (15)  (17), respe tively) is a Markov Chain for

the spa eY := {A1, A2}. This hainisergodi .

TheproofmaybefoundinRomaniuk(2007b).

Forsimpli ity ofnotation, we ontinue to allatoms orrenewalsets Aj as

spe ialsets,keepinginminddierentdenitionsofthese ondary hain(Yi)i=1

fortheseboth ases.

Themomentsζidenedpreviously,maybeadditionallypartitionedbetween the orrespondingspe ialsets. Hen e,weadopt thefollowingdenitionofζi(j)

forthexedatomAj:

ζ1(j):= min{i = 1, . . . : Xi∈ Aj} , (18) ζk+1(j) := min{i > ζk(j): Xi∈ Aj}. (19)

FortherenewalsetAj thedenitionofζi(j)isanequivalentmodi ationofthe aboveformulas,i.e.:

ζ1(j):= min{i = 1, . . . : Xi∈ Aj∧ Ui≤ ǫAj}, (20) ζk+1(j) := min{i > ζk(j): Xi∈ Aj∧ Ui≤ ǫAj} . (21)

Therefore,ζ1(j)maybe onsideredasthemomentofrstvisit inthesetAj.

Nextlemmaisusedasjusti ationforaheuristi methoddes ribedfurther.

Lemma 3. If sets A1, A2 are atoms, then stationary distribution of πY(.) is

given by

πY(Aj) =

R

x∈AjX(x) R

x∈A1X(x) +R

x∈A2X(x) , (22)

(7)

for j = 1, 2.

IfsetsA1, A2arerenewalsets,thenstationary distributionofπY(.)isgiven

by

πY(Aj) = ǫAj

R

x∈AjX(x) ǫA1

R

x∈A1X(x) + ǫA2

R

x∈A2X(x) , (23)

for j = 1, 2.

Proof.Be ause(Yi)i=1isMC,thenfromthestrongergodi theoremforMarkov

hains wehave

Pm

i=111(Yi∈ Aj) m

−−−−→p.n.

m→∞ πY(Aj), (24)

forj = 1, 2,where

m = #{i ≤ n : Xi ∈ A1∪ A2} . (25)

IfA1, A2areatoms, thenlet

m(n) = #{i ≤ n : Xi∈ A1∪ A2} , (26)

i.e. m(n)is therandom numberof visitsinto A1 and A2. Be ause theinitial

hainisHarrisre urrent,thenforn → ∞,wehavem(n) → ∞(seeNummelin,

2001).

From(12)(14)andLemma 1wehave

Pm(n)

i=1 11(Yi∈ Aj)

m(n) =

Pm(n)

i=1 11(Yi∈ Aj) Pm(n)

i=1 11(Yi∈ A1∪ A2) =

=

1 n+1

Pn

i=011(Xi∈ Aj)

1 n+1

Pn

i=011(Xi∈ A1∪ A2)

−−−−→p.n.

n→∞

R

x∈AjX(x) R

x∈A1X(x) +R

x∈A2X(x) .

(27)

Comparing (24) with (27), we obtain (22) (see also Nummelin, 2001 for

similarinferen e).

IfA1, A2arerenewalsets,thenlet

m(n) = #{i ≤ n : Xi∈ (A1, Ui≤ ǫA1) ∪ (A2, Ui≤ ǫA2)}. (28)

From(15)(17)andLemma1wehave

Pm(n)

i=1 11(Yi∈ Aj)

m(n) =

Pm(n)

i=1 11(Yi∈ Aj) Pm(n)

i=1 11(Yi∈ A1∪ A2) =

=

1 n+1

Pn

i=011(Xi∈ Aj, Ui≤ ǫAj)

1 n+1

Pn

i=011(Xi∈ (A1, Ui ≤ ǫA1) ∪ (A2, Ui≤ ǫA2))

−−−−→p.n.

n→∞

ǫAj

R

x∈AjX(x) ǫA1

R

x∈A1X(x) + ǫA2

R

x∈A2X(x) . (29)

(8)

Informula(29)weusedtheindependen epropertyforUi andXi(see(7)). As

previously, omparing (24)with(29),weprove(23).

3.2. Diagnosisof the initial hain

Aswehavenotedin Se tion 3.1, for hain (Xi)i=0 with twoknownspe ial

setsAj (j = 1, 2)wemayintrodu eadditional hain(Yi)i=1. The hain(Yi)i=1

isadis reteMCwithonlytwostates,regardlessof ardinalityandpowerofthe

spa eX.

Duringdiagnosisoftheinitial hain,weareinterestedintwovaluesnstat

andnVar. Therstvaluenstatisthetimemomentwhenweare loseenough

tostationarydistributionπX,i.e.

Pxn0stat− πX

≤ ε1 , (30)

wherek.kindi atessomedeterminednormforspa eX,e.g. totalvariationnorm

whi h is used in the rest of this paper, Prnx0stat(.) = Pr(Xnstat = . |X0 = x0).

When the number of simulations nstat in the MCMC algorithm is a hieved,

in the lightof (30)wemay treat(Xi)i≥nstat asbeing almost distributed from stationary distribution πX.

Supposethatweareinterestedinobtainingestimatoroftheexpe tedvalue

Eπ

Xh(X) based on the average of the initial hain. Naturally, wewould like

to a hievesmall enoughvarian eof this estimatorand ndthe quantity nVar

fullling the ondition

Var 1 s

nVar

X

k=nstat+1

h(Xk) − Eπxh(X)

!

≤ ε2, (31)

wheres = nVar− nstat.

Inthe followingwefo us onlyon problem (30). We deal with these ond

problem inRomaniuk (2007b). Furthermore,forsimpli ityof formulationand

notation,welimitourselvestothe asewhenX isaniteset. However,appro-

priateproofsmaybeeasilygeneralizedforthe aseof ontinuousstatespa eX.

Itisworthnotingthatfromthe omputationalandnumeri alpointofview,the

problemof ardinalityofX israthera ademi in omputersallthenumbers

arerepresentedbytheniteset ofpossibilities.

Cytaty

Powiązane dokumenty

z ograniczoną odpowiedzialnością i spółek akcyjnych. Wypada jednak wątpić, czy mimo identyczności terminologii oraz opierania się i tutaj na umowie spółki, mamy do czynienia

Implementation of the algorithm for the synthesis of driving cycles and analysis of research results An algorithm was proposed to generate naturalistic driving cycles

Trudno zrozumieć w peł­ ni całkowicie rolę warstwy czy jednostek, abstrahując od klasowych nastawień czy społecznych lęków, stanowiących przecież poważny

Краковяк: Инвектива как лирический жанр: семантическая структура (на материале русской и польской поэзии XIХ–XX вв.)

metody MCMC (ang. Markov Chain Monte Carlo), czyli metody Monte Carlo z wykorzystaniem łańcuchów Markowa.. Ogólnie metoda Monte Carlo MC jest zaliczana do klasy metod

b) Calculate the mean recurrence time for a stone and compare with the stationary distri- bution.. We toss a coin until we obtain three heads in

To obtain the moments of the time to absorption one can use the method based on algebraic properties of fundamental matrix for an absorbing Markov chain (see [7] Theorem 3.2,

brzeskim, w biednej, wielodzietnej rodzinie chłopskiej, w czasie kiedy Polski jeszcze nie było, ale był już w Galicji zorganizowany ruch ludo­ wy i stale