
(1)

Advanced Econometrics Topic 6: Bayesian inference

Michał Rubaszek

SGH Warsaw School of Economics

(2)

Themes

A. Introduction to Bayesian inference
B. Bayesian regression
C. Bayesian model averaging

(3)

Theme A.

Introduction to Bayesian inference

(4)

Bayes theorem

For events $A$ and $B$ the Bayes theorem is:

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$

Explanation: the theorem turns the conditional probability of $B$ given $A$ into the conditional probability of $A$ given $B$, reweighting by the marginal probabilities $P(A)$ and $P(B)$.

(5)

Bayes theorem in econometrics

For parameters $\theta$ and data $y$ the Bayes theorem implies:

$$p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)}$$

- $p(\theta)$ – prior pdf of the parameters
- $p(y \mid \theta)$ – probability of the data given $\theta$
- $p(\theta \mid y)$ – posterior pdf
- $p(y)$ – marginal likelihood of the data (does not depend on $\theta$)

To derive the posterior of $\theta$ we substitute $p(y \mid \theta)$ by the likelihood $L(\theta \mid y)$:

$$p(\theta \mid y) \propto L(\theta \mid y)\, p(\theta)$$

(6)

Bayes theorem in econometrics

In the Bayesian framework the parameters $\theta$ are treated as random variables, whereas in the frequentist approach $\theta$ is assumed to be a constant.

The distribution of $\theta$ before observing the data is called the prior distribution and is denoted by $p(\theta)$.

The distribution of $\theta$ after observing the data is called the posterior distribution and is denoted by $p(\theta \mid y)$.

(7)

Bayes rule in econometrics: illustration

(8)

Conjugate prior

In some classes of models the posterior distribution is in the same family as the prior distribution. In this case the prior is called a conjugate prior.

Example:

Beta prior + Binomial likelihood → Beta posterior

(9)

Conjugate prior: Beta + Binomial

∼ "#$ %, &

' ()*(

+$, ()* -(*()*).

Γ % 0 &

Γ % Γ & (1. 1 3 *1.

0 5 5 1

Linkto beta distribution in Wikipedia

(10)

Conjugate prior: example

Example:

Two students (A and B) like to play chess. They have already played $N$ times and student A won $k$ times (and lost $N - k$). Let $\theta$ be the parameter that describes the probability of student A's success.

Prior:

$$p(\theta) = \frac{\Gamma(\alpha_0 + \beta_0)}{\Gamma(\alpha_0)\,\Gamma(\beta_0)}\, \theta^{\alpha_0 - 1}(1 - \theta)^{\beta_0 - 1}, \qquad \theta \sim Beta(\alpha_0, \beta_0)$$

Likelihood:

$$p(y \mid \theta) = \binom{N}{k}\, \theta^{k}(1 - \theta)^{N - k}, \qquad y \mid \theta \sim B(N, \theta)$$

Posterior:

$$p(\theta \mid y) \propto \theta^{\alpha_0 + k - 1}(1 - \theta)^{\beta_0 + N - k - 1}, \qquad \theta \mid y \sim Beta(\alpha_1, \beta_1)$$

$$\alpha_1 = \alpha_0 + k, \qquad \beta_1 = \beta_0 + N - k$$

$$p(\theta \mid y) = \frac{\Gamma(\alpha_1 + \beta_1)}{\Gamma(\alpha_1)\,\Gamma(\beta_1)}\, \theta^{\alpha_1 - 1}(1 - \theta)^{\beta_1 - 1}$$

Notice: in the formula for $p(\theta \mid y)$ we omitted $\frac{\Gamma(\alpha_0 + \beta_0)}{\Gamma(\alpha_0)\Gamma(\beta_0)}$ and $\binom{N}{k}$. Why?
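A minimal R sketch of this conjugate update (the counts and prior parameters below are illustrative, not taken from the slides):

```r
a0 <- 2; b0 <- 2   # prior: theta ~ Beta(a0, b0)           (illustrative values)
N  <- 10; k <- 7   # data: k wins for student A in N games (illustrative values)

a1 <- a0 + k       # posterior: alpha1 = alpha0 + k
b1 <- b0 + N - k   # posterior: beta1  = beta0 + N - k

# Analytical posterior mean and standard deviation of Beta(a1, b1)
c(mean = a1 / (a1 + b1),
  sd   = sqrt(a1 * b1 / ((a1 + b1)^2 * (a1 + b1 + 1))))

# Posterior density (solid) against the prior (dashed)
curve(dbeta(x, a1, b1), from = 0, to = 1, ylab = "density")
curve(dbeta(x, a0, b0), lty = 2, add = TRUE)
```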

(11)

Metropolis-Hastings

Markov chain Monte Carlo (MCMC) algorithm

- In most cases we can calculate $p(\theta)$ and $L(\theta \mid y)$ but we don't know the analytical formula for $p(\theta \mid y)$

- We resort to numerical methods, e.g. the Metropolis-Hastings MCMC (sketched in R below):

1. Set the initial value of the parameter $\theta^{(s)}$ for $s = 0$
2. Draw a candidate $\theta^{*} = \theta^{(s)} + c\varepsilon$, where $\varepsilon \sim N(0, \Sigma)$ and $c$ is a step length
3. Draw $u \sim U(0, 1)$
4. Calculate $a = p(\theta^{*} \mid y) \,/\, p(\theta^{(s)} \mid y)$ and compare it to $u$:
   - if $a \ge u$ then $\theta^{(s+1)} = \theta^{*}$
   - if $a < u$ then $\theta^{(s+1)} = \theta^{(s)}$
5. Repeat steps 2-4 $N_{sim}$ times
6. Using the sample $\theta^{(s)}$ for $s = N_{burn}+1, \ldots, N_{sim}$ calculate descriptive statistics for $p(\theta \mid y)$
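A compact R sketch of the algorithm for the scalar Beta-Binomial posterior from the previous slide (the step length and chain sizes are illustrative choices):

```r
set.seed(1)
a0 <- 2; b0 <- 2; N <- 10; k <- 7              # prior and data (illustrative)
log_post <- function(th) {                     # log kernel of Beta(a0+k, b0+N-k)
  if (th <= 0 || th >= 1) return(-Inf)
  (a0 + k - 1) * log(th) + (b0 + N - k - 1) * log(1 - th)
}

Nsim <- 20000; Nburn <- 5000; c_step <- 0.2    # illustrative settings
theta <- numeric(Nsim)
theta[1] <- 0.5                                # step 1: initial value
for (s in 2:Nsim) {
  cand <- theta[s - 1] + c_step * rnorm(1)     # step 2: candidate draw
  u    <- runif(1)                             # step 3: u ~ U(0,1)
  a    <- exp(log_post(cand) - log_post(theta[s - 1]))   # step 4: acceptance ratio
  theta[s] <- if (a >= u) cand else theta[s - 1]
}
keep <- theta[(Nburn + 1):Nsim]                # step 6: drop the burn-in
c(mean = mean(keep), sd = sd(keep))
```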

(12)

Exercise 1

I. Sample $y \mid \theta \sim B(N, \theta)$ from the binomial distribution. Set $N = 25$ and $\theta = 0.5$

II. Assume the prior $\theta \sim Beta(\alpha_0, \beta_0)$. Set $\alpha_0 = 4$ and $\beta_0 = 6$

III. Calculate the posterior distribution parameters $\alpha_1$ and $\beta_1$. Calculate the posterior mean and standard deviation with the analytical method

IV. Perform MH MCMC simulations and calculate the posterior mean and standard deviation. Compare the values to the results from point III

V. Plot the posterior density for $\theta$ using two methods:

- analytical
- numerical (MCMC)

(13)

Exercise 2

I. Sample $y \mid \mu \sim N(\mu, 1^2)$ from the normal distribution. Set $N = 10$ and $\mu = 2$

Notice: our model is $y_t = \mu + \varepsilon_t$, $\varepsilon_t \sim N(0, 1^2)$

II. Assume the prior $\mu \sim N(\mu_0, \tau_0^2)$. Set $\mu_0 = 1$ and $\tau_0^2 = 0.5^2$

III. Perform MH MCMC simulations and calculate the posterior mean and standard deviation for $\mu$ (see the sketch below)

Notice: the likelihood for the normal model is:

$$L(\mu \mid y) = (2\pi)^{-N/2} \exp\!\left(-\tfrac{1}{2}(y - \mu\iota_N)'(y - \mu\iota_N)\right)$$

Here $y = [y_1\; y_2\; \ldots\; y_N]'$ and $\iota_N = [1\; 1\; \ldots\; 1]'$
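A sketch in R of the building blocks for this exercise (the prior follows point II; the MH step itself can reuse the sampler sketched in Theme A):

```r
set.seed(1)
N <- 10; mu_true <- 2
y <- rnorm(N, mean = mu_true, sd = 1)    # step I: y_t = mu + e_t, e_t ~ N(0, 1)

loglik <- function(mu) {                 # log L(mu | y) with known unit variance
  -0.5 * N * log(2 * pi) - 0.5 * sum((y - mu)^2)
}
log_prior <- function(mu) dnorm(mu, mean = 1, sd = 0.5, log = TRUE)   # step II
log_post  <- function(mu) loglik(mu) + log_prior(mu)   # unnormalised MH target
```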

(14)

Theme B. Linear Bayesian model

(15)

Bayesian regression

Consider a linear model:

$$y_t = x_t'\beta + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma^2)$$

Likelihood:

$$L(\beta, \sigma \mid y, X) = (2\pi\sigma^2)^{-N/2} \exp\!\left(-\frac{(y - X\beta)'(y - X\beta)}{2\sigma^2}\right)$$

where $X = [x_1\; x_2\; \ldots\; x_N]'$ and $y = (y_1, y_2, \ldots, y_N)'$

ML estimates:

$$\hat\beta = (X'X)^{-1}X'y, \qquad \hat\sigma^2 = \frac{(y - X\hat\beta)'(y - X\hat\beta)}{v}, \quad \text{where } v = N - k$$

Vector of parameters: $\theta = (\beta, \sigma)$

(16)

Bayesian regression: known variance

Linear model:

$$y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I)$$

Likelihood:

$$L(\beta) = (2\pi\sigma^2)^{-N/2} \exp\!\left(-\frac{(y - X\beta)'(y - X\beta)}{2\sigma^2}\right)$$

Prior:

$$p(\beta) \propto \exp\!\left(-0.5\,(\beta - \beta_0)'\,\Omega_0^{-1}(\beta - \beta_0)\right), \qquad \beta \sim N(\beta_0, \Omega_0)$$

Posterior:

$$p(\beta \mid y) \propto \exp\!\left(-\tfrac{1}{2}(\beta - \bar\beta)'\,\bar\Omega^{-1}(\beta - \bar\beta)\right), \qquad \beta \mid y \sim N(\bar\beta, \bar\Omega)$$

$$\bar\Omega = \left(\Omega_0^{-1} + \sigma^{-2} X'X\right)^{-1} = \left(\Omega_0^{-1} + \hat\Omega^{-1}\right)^{-1}, \qquad \hat\Omega = \sigma^2 (X'X)^{-1}$$

$$\bar\beta = \bar\Omega\left(\Omega_0^{-1}\beta_0 + \sigma^{-2} X'y\right) = \bar\Omega\left(\Omega_0^{-1}\beta_0 + \hat\Omega^{-1}\hat\beta\right)$$
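A sketch of these posterior formulas in R (data simulated, prior values illustrative):

```r
set.seed(1)
N <- 100; sigma <- 1
X <- cbind(1, rnorm(N))                        # regressors including a constant
y <- X %*% c(1, 0.5) + rnorm(N, sd = sigma)    # simulated data

b0     <- c(0, 0)                              # prior mean (illustrative)
Omega0 <- diag(10^2, 2)                        # prior covariance (illustrative)

b_ml      <- solve(crossprod(X), crossprod(X, y))           # ML: (X'X)^{-1} X'y
Omega_bar <- solve(solve(Omega0) + crossprod(X) / sigma^2)  # posterior covariance
b_bar     <- Omega_bar %*% (solve(Omega0) %*% b0 + crossprod(X, y) / sigma^2)
cbind(ML = c(b_ml), Posterior = c(b_bar))      # posterior mean shrinks towards b0
```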

(17)

Example

Let us consider a model:

$$i_t = [1,\; \pi_t]\,\beta + \varepsilon_t$$

Prior:

$$\beta \sim N\!\left(\begin{bmatrix} 0 \\ 1.5 \end{bmatrix},\; \begin{bmatrix} 10^2 & 0 \\ 0 & 0.1^2 \end{bmatrix}\right)$$

Posterior mean:

            ML      Prior   Posterior
const       2.58    0.00    1.50
infEA       0.71    1.50    0.99

(18)

Bayesian regression: random variance

Linear model:

$$y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I)$$

Likelihood:

$$L(\beta, \sigma \mid y) = (2\pi\sigma^2)^{-N/2} \exp\!\left(-\frac{(y - X\beta)'(y - X\beta)}{2\sigma^2}\right)$$

Prior:

$$\sigma^2 \sim IG(v_0, s_0^2), \qquad \beta \mid \sigma^2 \sim N(\beta_0, \sigma^2\Omega_0)$$

$$(\beta, \sigma^2) \sim NIG(\beta_0, \Omega_0, v_0, s_0^2) \quad \text{– Normal Inverse Gamma distribution}$$

Posterior:

$$\sigma^2 \mid y \sim IG(\bar v, \bar s^2), \qquad \beta \mid \sigma^2, y \sim N(\bar\beta, \sigma^2\bar\Omega)$$

$$(\beta, \sigma^2) \mid y \sim NIG(\bar\beta, \bar\Omega, \bar v, \bar s^2)$$

$$\bar\Omega = \left(\Omega_0^{-1} + X'X\right)^{-1}, \qquad \bar\beta = \bar\Omega\left(\Omega_0^{-1}\beta_0 + X'y\right)$$

$$\bar v = v_0 + N, \qquad \bar v\,\bar s^2 = v_0 s_0^2 + v\hat\sigma^2 + (\beta_0 - \hat\beta)'\left[\Omega_0 + (X'X)^{-1}\right]^{-1}(\beta_0 - \hat\beta)$$

(19)

Conjugate prior: Normal Gamma

Inverse gamma:

$$x \sim IG(\alpha, \beta)$$

$$E(x) = \frac{\beta}{\alpha - 1}, \qquad Var(x) = \frac{\beta^2}{(\alpha - 1)^2(\alpha - 2)}$$

$$p(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{-\alpha - 1} \exp\!\left(-\frac{\beta}{x}\right), \qquad x > 0$$

Link to the inverse gamma distribution in Wikipedia

(20)

Gibbs sampling

In the above example we know that $\sigma^2 \mid y \sim IG(\bar v, \bar s^2)$ and $\beta \mid \sigma^2, y \sim N(\bar\beta, \sigma^2\bar\Omega)$, but we don't know the analytical formula for the marginal distribution $\beta \mid y$.*

To derive it we can use the Gibbs sampler (sketched in R below):

1. Draw $\sigma^{2(s)}$ from $IG(\bar v, \bar s^2)$
2. Draw $\beta^{(s)}$ from $N(\bar\beta, \sigma^{2(s)}\bar\Omega)$
3. Repeat steps 1-2 $N_{sim}$ times
4. Using the sample $\beta^{(s)}$ for $s = 1, \ldots, N_{sim}$ calculate descriptive statistics for $\beta \mid y$

* In this case the marginal distribution can in fact be derived analytically: it is a Student's t distribution.
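A sketch of the sampler in R; it reuses X, y, b_ml, b0 and Omega0 from the known-variance sketch in Theme B, and the IG draws are taken via rgamma on the precision (v0 and s2_0 are illustrative):

```r
v0 <- 4; s2_0 <- 1                                  # prior IG parameters (illustrative)
Omega_bar <- solve(solve(Omega0) + crossprod(X))    # NIG posterior quantities
b_bar <- Omega_bar %*% (solve(Omega0) %*% b0 + crossprod(X, y))
e     <- y - X %*% b_ml
v_bar <- v0 + N
vs2_bar <- drop(v0 * s2_0 + sum(e^2) +
  t(b0 - b_ml) %*% solve(Omega0 + solve(crossprod(X))) %*% (b0 - b_ml))

Nsim <- 10000
beta_draws <- matrix(NA_real_, Nsim, ncol(X))
for (s in 1:Nsim) {
  sigma2 <- 1 / rgamma(1, shape = v_bar / 2, rate = vs2_bar / 2)  # step 1: sigma2 | y
  beta_draws[s, ] <- b_bar +
    t(chol(sigma2 * Omega_bar)) %*% rnorm(ncol(X))                # step 2: beta | sigma2, y
}
colMeans(beta_draws)                                # step 4: posterior means of beta | y
```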

(21)

Exercises

I. Estimate a model in which the interest rate in a given country depends on:

- inflation
- GDP growth rate
- exchange rate depreciation

II. Set the prior centered at $[0\; 1.5\; 0.5\; 0]'$ with standard deviations $[10\; 0.1\; 0.1\; 0.1]$

III. Derive the posterior distribution

IV. Make a plot (prior / likelihood / posterior) for each parameter

V. Repeat the above using the Normal-Gamma prior and compare the posterior mean with the values from point III

(22)

Theme C. Bayesian model averaging

(23)

Bayes rule in econometrics

- Let us consider a model for $y$ with $K$ potential regressors from the set:

$$X = \{x_1, x_2, \ldots, x_K\}$$

- There are $2^K$ different subsets $X_i \subseteq X$, hence $2^K$ potential models $M_i$:

$$y = \alpha + X_i\beta_i + \varepsilon$$

- Which specification $M_i$ should be selected? This problem is especially difficult for large $K$!

- Bayesian inference helps to tackle this problem

BMA – Bayesian model averaging / BMS – Bayesian model selection

(24)

Bayes theorem in model selection

The Bayes theorem implies:

$$P(M_i \mid y) = \frac{p(y \mid M_i)\, P(M_i)}{\sum_{j=1}^{2^K} p(y \mid M_j)\, P(M_j)}$$

where

- $P(M_i \mid y)$ – posterior probability of the model
- $P(M_i)$ – prior probability of the model
- $p(y \mid M_i)$ – marginal likelihood of the model

We need a method to calculate $p(y \mid M_i)$ and a way to choose $P(M_i)$

(25)

Marginal likelihood

For model $M_i$:

$$y = \alpha + X_i\beta_i + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I)$$

the marginal likelihood is

$$p(y \mid M_i) = \int p(y \mid \theta_i, M_i)\, p(\theta_i \mid M_i)\, d\theta_i$$

where $\theta_i = (\alpha, \beta_i, \sigma)$

(26)

Marginal likelihood, Zellner g-prior

Zellner g-prior:

$$p(\alpha) \propto 1, \qquad p(\sigma) \propto \sigma^{-1}, \qquad \beta_i \mid g \sim N\!\left(0,\; g\,\sigma^2 (X_i'X_i)^{-1}\right)$$

Posterior distribution:

$$E(\beta_i \mid y, g) = \frac{g}{1+g}\, \hat\beta_i, \quad \text{where } \hat\beta_i \text{ is the ML estimate of the parameters}$$

- For $g \to 0$ the posterior is equal to the prior
- For $g = 1$ the posterior puts equal weight on the prior and the likelihood
- For $g = T$ the prior has the weight equivalent to 1 observation
- For $g \to \infty$ the prior is uniform (non-informative), as illustrated in the sketch below
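A two-line R illustration of the shrinkage implied by this posterior mean ($\hat\beta$ set to 1 for illustration):

```r
g <- c(0.01, 1, 100, 1e6)   # small g: prior dominates; large g: likelihood dominates
beta_hat <- 1               # ML estimate (illustrative)
g / (1 + g) * beta_hat      # posterior mean moves from 0 (the prior mean) to beta_hat
```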

(27)

Marginal likelihood: Zellner-g prior

Marginal likelihood of the data:

$$p(y \mid M_i, g) = \frac{\Gamma\!\left(\frac{N-1}{2}\right)}{\pi^{\frac{N-1}{2}}\, N^{\frac{1}{2}}} \left[(y - \bar y)'(y - \bar y)\right]^{-\frac{N-1}{2}} \left(1 + g(1 - R_i^2)\right)^{-\frac{N-1}{2}} (1+g)^{\frac{N-1-k_i}{2}}$$

$$p(y \mid M_i, g) = const \times \left(1 + g(1 - R_i^2)\right)^{-\frac{N-1}{2}} (1+g)^{\frac{N-1-k_i}{2}}$$

Bayes factor (relative marginal likelihood) for $M_i$ and $M_j$:

$$B(M_i, M_j) = \frac{p(y \mid M_i, g)}{p(y \mid M_j, g)} = \left(\frac{1 + g(1 - R_i^2)}{1 + g(1 - R_j^2)}\right)^{-\frac{N-1}{2}} (1+g)^{\frac{k_j - k_i}{2}}$$
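The Bayes factor depends on the data only through the two R-squared values, the model sizes and $N$, which makes it cheap to evaluate over many models; a direct R transcription of the formula above (example values illustrative):

```r
# Bayes factor B(M_i, M_j) under the Zellner g-prior
bayes_factor <- function(R2_i, k_i, R2_j, k_j, N, g) {
  ((1 + g * (1 - R2_i)) / (1 + g * (1 - R2_j)))^(-(N - 1) / 2) *
    (1 + g)^((k_j - k_i) / 2)
}
bayes_factor(R2_i = 0.50, k_i = 3, R2_j = 0.40, k_j = 2, N = 100, g = 100)
```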

(28)

Prior of the model: methods of choosing the prior

- Uniform prior: $P(M_i) = 2^{-K}$
  [each model is equally probable, expected model size $K/2$]

- Binomial prior: $P(M_i) = \pi^{k_i}(1 - \pi)^{K - k_i}$
  [$\pi$ – fixed probability of including each regressor, expected model size $K\pi$]

- Custom prior inclusion probabilities: $\pi_j$ is individual for each variable

- Beta-binomial prior: $\pi \sim Beta(a, b)$ is a random variable

(29)

BMS: example

(30)

BMS: example

(31)

Exercises

Exercise 1.

Select a country and a variable from the dataJIE.csv file. Evaluate the factors that were most important for this variable using the BMA/BMS methodology.

Exercise 2.

Download the data on GDP growth over 1960-1992 and 41 other variables in 72 countries with the commands:

data(datafls)
help(datafls)

Evaluate the factors that were most important for economic growth using the BMA/BMS methodology (a sketch follows below).
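A sketch of Exercise 2 with the BMS package (the bms() settings below, chain lengths included, are illustrative; see help(bms) for the options):

```r
# install.packages("BMS")    # Bayesian model sampling / averaging
library(BMS)
data(datafls)                # FLS growth data: 72 countries, 41 candidate regressors
res <- bms(datafls, burn = 10000, iter = 50000,
           g = "UIP", mprior = "uniform")   # unit-information g, uniform model prior
coef(res)                    # posterior inclusion probabilities and posterior means
```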
