
Academic year: 2021


Spatial Econometrics

Lecture 11: Spatial models of binary variables

Andrzej Torój

Institute of Econometrics – Department of Applied Econometrics


1 Binary variable models: what's special about spatial
  Nonspatial models of binary variable
  Binary variable models: spatial version

2 Estimation of parameters for Probit-SAR model
  Numerical evaluation of the likelihood function
  RIS

3 Interpretation of coefficients


1 Binary variable models: what’s special about spatial


Observable dependent variable with binary outcomes: $y_i \in \{0; 1\}$, $i = 1, ..., N$.

We assume that the occurrence of 0s and 1s is determined by an unobservable continuous variable $y_i^*$: the propensity of the $i$-th observation to take the value of 1, which materialises when the propensity exceeds the threshold of 0 (this threshold comes without any loss of generality, as the propensity depends on an estimated constant). This propensity depends on a systematic component ($x_i \beta$, factors increasing the probability of 1s) and a random component ($\varepsilon_i$):

$$y_i^* = x_i \beta + \varepsilon_i, \qquad y_i = \begin{cases} 1 & \text{for } y_i^* > 0 \\ 0 & \text{for } y_i^* \le 0 \end{cases}$$

In general (for both the spatial and the nonspatial case), the likelihood function is an $N$-dimensional integral of the $N$-dimensional joint density of $y^*$:

$$L(\beta) = P(y_1, y_2, ..., y_N) = \underbrace{\int_{-\infty}^{0} \int_{-\infty}^{0} \cdots}_{y_i = 0} \; \underbrace{\int_{0}^{\infty} \int_{0}^{\infty} \cdots}_{y_i = 1} f_N(y^* | X\beta) \, dy_N^* \cdots dy_1^*$$


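Under independent observations and Bernoulli distribution we can factorize the integral into a product of one-dimensional integrals and write the likelihood function as:

$$L(\beta) = \prod_{i : y_i = 0} \left[ P(y_i^* \le 0) \right] \prod_{i : y_i = 1} \left[ 1 - P(y_i^* \le 0) \right] =$$

$$= \prod_{i : y_i = 0} \left[ P(\varepsilon_i \le -x_i \beta) \right] \cdot \prod_{i : y_i = 1} \left[ 1 - P(\varepsilon_i \le -x_i \beta) \right] =$$

$$= \prod_{i=1}^{N} \left[ F(-x_i \beta) \right]^{1 - y_i} \left[ 1 - F(-x_i \beta) \right]^{y_i} =$$

$$= \prod_{i=1}^{N} \left\{ \left[ \int_{-\infty}^{-x_i \beta} f(\varepsilon_i) \, d\varepsilon_i \right]^{1 - y_i} \left[ 1 - \int_{-\infty}^{-x_i \beta} f(\varepsilon_i) \, d\varepsilon_i \right]^{y_i} \right\}$$

Here $F$ is the distribution function of $\varepsilon_i$, so an observation with $y_i = 0$ contributes $F(-x_i \beta)$ and an observation with $y_i = 1$ contributes $1 - F(-x_i \beta)$.

Further derivation depends on the choice of density function $f(\varepsilon_i)$ and the integration technique. Two common options (both easily subject to an analytical treatment):

- Logistic density: logit model.
- Normal density: probit model.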
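The factorized likelihood can be illustrated with a minimal Python sketch. It assumes NumPy is available; the function names (`norm_cdf`, `logistic_cdf`, `binary_loglik`) and the toy data are hypothetical and not part of the lecture. The only difference between probit and logit here is the distribution function passed in.

```python
import math
import numpy as np

def norm_cdf(z):
    """Standard normal CDF, element-wise, via the error function."""
    return np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))
                     for v in np.atleast_1d(z)])

def logistic_cdf(z):
    """Logistic CDF, element-wise."""
    return 1.0 / (1.0 + np.exp(-np.asarray(z, dtype=float)))

def binary_loglik(beta, X, y, cdf):
    """Log-likelihood of a non-spatial binary model with independent errors.

    P(y_i = 0) = F(-x_i beta) and P(y_i = 1) = 1 - F(-x_i beta),
    where F (the argument `cdf`) is the distribution function of the error.
    """
    xb = X @ beta
    p0 = cdf(-xb)        # P(eps_i <= -x_i beta)
    p1 = 1.0 - p0
    return float(np.sum((1 - y) * np.log(p0) + y * np.log(p1)))

# Hypothetical toy data: a constant and one regressor.
X = np.array([[1.0, -1.0], [1.0, 0.0], [1.0, 0.5], [1.0, 2.0]])
y = np.array([0, 0, 1, 1])
beta = np.array([0.0, 1.0])

print(binary_loglik(beta, X, y, norm_cdf))      # probit
print(binary_loglik(beta, X, y, logistic_cdf))  # logit
```

Working with the logarithm turns the product over observations into a sum, which is numerically safer for large $N$.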


Firstly: to be decided on the specification level: does the interdependence within the neighbourhood concern the binary observable or the unobservable propensity?

$$y_1^* = \beta_1 y_2 + \beta_2 x_1 + \varepsilon_1 \quad \text{or} \quad y_1^* = \beta_1 y_2^* + \beta_2 x_1 + \varepsilon_1$$

Logical problem with the first variant:

$$y_2 = 1 \xrightarrow{\beta_1} y_1^* \uparrow \; \rightarrow \; P(y_1 = 1) \uparrow \; \xrightarrow{\beta_1} y_2^* \uparrow$$

That is: from the fact that event 2 occurred, it can be inferred that its probability should increase (?!).

For this reason, we normally assume a spatial interdependence on the level of the unobservable variable $y_j^*$.


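Secondly: spatial interdependence implies heteroskedasticity and spatial autocorrelation, e.g. for SAR:

$$y^* = \rho W y^* + X\beta + \varepsilon, \qquad y_i = \begin{cases} 1 & \text{for } y_i^* > 0 \\ 0 & \text{for } y_i^* \le 0 \end{cases}$$

$$y^* = (I - \rho W)^{-1} X\beta + \underbrace{(I - \rho W)^{-1} \varepsilon}_{\upsilon}$$

$$Var(\upsilon) = (I - \rho W)^{-1} E(\varepsilon \varepsilon') \left[ (I - \rho W)^{-1} \right]' = \sigma_\varepsilon^2 \left[ (I - \rho W)' (I - \rho W) \right]^{-1}$$

Heteroskedasticity is due to a varying degree of network connectivity from individual to individual.

Likewise for SEM.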
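A small numerical illustration of the variance formula, as a sketch assuming NumPy. The function name `sar_error_covariance`, the four-unit weight matrix and the value of $\rho$ are hypothetical choices made only for this example.

```python
import numpy as np

def sar_error_covariance(W, rho, sigma2=1.0):
    """Var(v) = sigma2 * [(I - rho W)' (I - rho W)]^{-1} for v = (I - rho W)^{-1} eps."""
    n = W.shape[0]
    A = np.eye(n) - rho * W
    return sigma2 * np.linalg.inv(A.T @ A)

# Hypothetical example: 4 units on a line (1-2-3-4), row-standardised contiguity weights.
W = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])
Sigma = sar_error_covariance(W, rho=0.5)

print(np.round(np.diag(Sigma), 3))  # diagonal: variances differ between end and middle units
print(np.round(Sigma[0, 1], 3))     # off-diagonal: non-zero covariance between neighbours
```

The end units have one neighbour and the middle units have two, which is why their variances differ even though $\varepsilon$ itself is homoskedastic.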


Thirdly: likelihood function as a multidimensional integral:

$$L(\beta, \rho, \sigma_\varepsilon^2) = P(y_1, y_2, ..., y_N | \beta, \rho, \sigma_\varepsilon^2) = \underbrace{\int_{-\infty}^{0} \int_{-\infty}^{0} \cdots}_{y_i = 0} \; \underbrace{\int_{0}^{\infty} \int_{0}^{\infty} \cdots}_{y_i = 1} f_N(y^* | X\beta) \, dy_N^* \cdots dy_1^*$$

In the absence of independent observations, further analytical simplifications are impossible.

Numerical difficulties consist in:

- multidimensionality
- sometimes unknown function $f_N$ (but for SAR-probit it is known: MVN)
- truncated MVN distribution (individual dimensions: below or above zero)


2 Estimation of parameters for Probit-SAR model


The above difficulties boil down to a single evaluation of the likelihood function value for given parameter values ($\beta$, $\rho$ and $\sigma_\varepsilon^2$) and data ($y$, $X$, $W$).

Besides, the standard scheme applies:

1. Parameter starting values: $\beta^{(0)}$, $\rho^{(0)}$ and $\sigma_\varepsilon^{2(0)}$.
2. Evaluation of $L(\beta^{(0)}, \rho^{(0)}, \sigma_\varepsilon^{2(0)})$.
3. Iterative update of parameters $\beta^{(i)}$, $\rho^{(i)}$ and $\sigma_\varepsilon^{2(i)}$ within the selected algorithm maximizing $L$...
4. ...until convergence of $L$ to the maximum.

Setting the direction of parameter change (point 3) also requires the evaluation of $L$ (e.g. for numerical gradient evaluation).
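The four-step scheme can be sketched generically in Python. This is one possible maximizer (gradient ascent with step halving and a central-difference gradient), chosen here only for illustration; the lecture does not prescribe a particular algorithm. The names `numerical_gradient`, `maximize` and `toy_loglik` are hypothetical, and `toy_loglik` is a simple concave stand-in for a real log-likelihood.

```python
import numpy as np

def numerical_gradient(f, theta, h=1e-6):
    """Central-difference gradient; each component costs two evaluations of f."""
    grad = np.zeros_like(theta)
    for k in range(theta.size):
        step = np.zeros_like(theta)
        step[k] = h
        grad[k] = (f(theta + step) - f(theta - step)) / (2.0 * h)
    return grad

def maximize(f, theta0, tol=1e-6, max_iter=1000):
    """Gradient ascent with step halving, following the four-step scheme."""
    theta = np.asarray(theta0, dtype=float)   # step 1: starting values
    value = f(theta)                          # step 2: evaluate the objective
    for _ in range(max_iter):
        grad = numerical_gradient(f, theta)   # the direction also needs evaluations of f
        if np.linalg.norm(grad) < tol:        # step 4: stop at convergence
            break
        step = 1.0
        # step 3: halve the step until the objective increases sufficiently
        while f(theta + step * grad) < value + 1e-4 * step * (grad @ grad):
            step /= 2.0
            if step < 1e-12:
                return theta, value
        theta = theta + step * grad
        value = f(theta)
    return theta, value

def toy_loglik(theta):
    """Hypothetical concave objective with its maximum at (1, -0.5)."""
    return -(theta[0] - 1.0) ** 2 - 2.0 * (theta[1] + 0.5) ** 2

theta_hat, value_hat = maximize(toy_loglik, [0.0, 0.0])
print(theta_hat, value_hat)
```

Every gradient costs $2K$ evaluations of the objective for $K$ parameters, which is why a cheap and accurate evaluation of $L$ is the crux of the estimation problem.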


$$L(\beta, \rho, \sigma_\varepsilon^2) = P(y_1, y_2, ..., y_N | \beta, \rho, \sigma_\varepsilon^2) = \underbrace{\int_{-\infty}^{0} \int_{-\infty}^{0} \cdots}_{y_i = 0} \; \underbrace{\int_{0}^{\infty} \int_{0}^{\infty} \cdots}_{y_i = 1} f_N(y^* | X\beta) \, dy_N^* \cdots dy_1^*$$

Since we cannot integrate analytically, we shall use numerical methods.

The proposed estimation method is Maximum Simulated Likelihood (MSL).

If the number of simulation draws $R$ grows faster than $\sqrt{N}$, the estimation is consistent and efficient (Train, 2009: a free e-book about MSL).


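Since we integrate the density only over a truncated part of its domain, let's transform the problem:

$$L(\beta, \rho, \sigma_\varepsilon^2) = \underbrace{\int_{-\infty}^{0} \int_{-\infty}^{0} \cdots}_{y_i = 0} \; \underbrace{\int_{0}^{\infty} \int_{0}^{\infty} \cdots}_{y_i = 1} f_N(y^* | X\beta) \, dy_N^* \cdots dy_1^* =$$

$$= \underbrace{\int_{-\infty}^{\infty} I_0(y_i^*) \int_{-\infty}^{\infty} I_0(y_i^*) \cdots}_{y_i = 0} \; \underbrace{\int_{-\infty}^{\infty} I_1(y_i^*) \int_{-\infty}^{\infty} I_1(y_i^*) \cdots}_{y_i = 1} f_N(y^* | X\beta) \, dy_N^* \cdots dy_1^* =$$

$$= \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} I_{01}(y^*) \, f_N(y^* | X\beta) \, dy_N^* \cdots dy_1^*$$

where $I_0(y_i^*)$ equals 1 for $y_i^* \le 0$ and 0 otherwise, $I_1(y_i^*)$ equals 1 for $y_i^* > 0$ and 0 otherwise, and

$$I_{01}(y^*) = \prod_{i : y_i = 0} I_0(y_i^*) \cdot \prod_{i : y_i = 1} I_1(y_i^*)$$

(i.e. 1 when the multivariate draw $y^*$ exactly reflects the set of 0s and 1s in the sample and 0 otherwise).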


The indicator function $I_{01}(y^*)$ will be named the importance function, and the method: importance sampling.

The method is frequently applied in Bayesian econometrics, when we cannot draw from a given distribution, but we can draw from a different, approximate one.

Typical application: drawing from the truncated N / MVN / t / MVt distribution using its non-truncated counterpart.

In practice, the method boils down to the rejection of the draws located in the truncated parts of the domain.

For students/graduates of Bayesian Econometrics:

- Importance sampling e.g. with prior distributions only indicating the sign of the parameters.

- See the lecture (wykład, in Polish).


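The density $f_N$ is $N$-dimensional (in the case of probit: MVN):

$$L(\beta, \rho, \sigma_\varepsilon^2) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} I_{01}(y^*) \, f_N(y^* | X\beta) \, dy_N^* \cdots dy_1^* =$$

$$= \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} I_{01}(y^*) \, f_N\left[ (I - \rho W)^{-1} X\beta + \upsilon \right] d\upsilon_N \cdots d\upsilon_1$$

$$\upsilon \sim MVN\left( 0, \sigma_\varepsilon^2 \left[ (I - \rho W)' (I - \rho W) \right]^{-1} \right) \equiv MVN[0, \Sigma_\varepsilon]$$

$$y^* \sim MVN\left[ (I - \rho W)^{-1} X\beta, \Sigma_\varepsilon \right]$$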


Solution (for the $r$-th draw, $r = 1, ..., R$):

1. Draw independently $\tilde{\upsilon}_i^{(r)} \sim N(0, \sigma_i^2)$ for $i = 1, ..., N$, where $\sigma_i^2$ is the $i$-th diagonal element of the matrix $\Sigma_\varepsilon$.
2. Cholesky decomposition: $\Sigma_\varepsilon = VV'$ allows us to write $\upsilon^{(r)} = V \cdot \tilde{\upsilon}^{(r)}$. Matrix $V$ is upper triangular, which means:
   1. independent draw of the $N$-th (last) element of $\upsilon^{(r)}$,
   2. draw of the element $N - 1$ for a given draw of $N$ and given (by the matrix $\Sigma_\varepsilon$) correlation of the last one with the last-but-one,
   3. draw of the last-but-two conditional upon the two last ones, etc.
3. Shift of the mean: $y^{*(r)} = \upsilon^{(r)} + (I - \rho W)^{-1} X\beta$.
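The drawing steps can be sketched as follows, assuming NumPy. Two deviations from the text are assumptions of this sketch: the primitive draws are standard normal (the scaling under which $V\tilde{\upsilon}$ has covariance exactly $VV'$), and `np.linalg.cholesky` returns a lower-triangular factor, so the recursion runs from the first element rather than the last. The function name `draw_latent` and the example inputs are hypothetical.

```python
import numpy as np

def draw_latent(X, beta, W, rho, sigma2, R, rng):
    """Return an (R, N) array of draws of y* ~ MVN((I - rho W)^{-1} X beta, Sigma)."""
    n = W.shape[0]
    A = np.eye(n) - rho * W
    mean = np.linalg.solve(A, X @ beta)       # (I - rho W)^{-1} X beta
    Sigma = sigma2 * np.linalg.inv(A.T @ A)   # covariance of the reduced-form error
    V = np.linalg.cholesky(Sigma)             # lower triangular, Sigma = V V'
    eta = rng.standard_normal((R, n))         # independent N(0, 1) primitives
    return mean + eta @ V.T                   # each row: mean + V eta (mean shift)

# Hypothetical example: two mutually neighbouring units, a constant only.
W = np.array([[0.0, 1.0], [1.0, 0.0]])
X = np.ones((2, 1))
beta = np.array([0.5])
draws = draw_latent(X, beta, W, rho=0.3, sigma2=1.0, R=5, rng=np.random.default_rng(0))
print(draws.shape)
```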


For each $r$-th sample, $r = 1, ..., R$, we have $y^{*(r)}$. The evaluation of:

$$L(\beta, \rho, \sigma_\varepsilon^2) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} I_{01}(y^*) \, f_N(y^* | X\beta) \, dy_N^* \cdots dy_1^*$$

resembles the computation of the expected value of $I_{01}(y^*)$ over the density $f_N(y^* | X\beta)$, that is:

1. for each draw $y^{*(r)}$ evaluate $I_{01}(y^{*(r)})$ as 0 or 1
2. compute the mean of the obtained sequence of 0s and 1s
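The two steps above amount to a frequency simulator. A minimal sketch, assuming NumPy and taking the mean vector and covariance matrix of $y^*$ as given inputs; the function name `frequency_simulator` and the two-unit check are hypothetical.

```python
import numpy as np

def frequency_simulator(y, mean, Sigma, R, rng):
    """Share of draws of y* whose sign pattern matches the observed y exactly."""
    draws = rng.multivariate_normal(mean, Sigma, size=R)  # (R, N) draws of y*
    implied = (draws > 0).astype(int)                     # 0/1 outcomes implied by each draw
    hits = np.all(implied == y, axis=1)                   # I_01 evaluated for each draw
    return hits.mean()                                    # mean of the 0s and 1s

rng = np.random.default_rng(1)
# Hypothetical check: N = 2 independent units with zero mean,
# so each of the four outcome patterns has probability 1/4.
y = np.array([1, 0])
print(frequency_simulator(y, np.zeros(2), np.eye(2), R=100_000, rng=rng))
```

With $N = 20$ independent zero-mean units the same probability is $2^{-20}$, roughly one in a million, so nearly all draws would be rejected.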


The above procedure is correct for a large number of draws ($R \gg 0$).

$I_{01}(y^*)$, a multivariate indicator function, rarely takes the value of 1 (the drawn vector $y^*$ would have to imply EXACTLY the same sequence of 0s and 1s as in the sample).

Extremely inefficient numerically, since we do not know even approximate values of $\beta$.

Sometimes referred to as the brute force method (cf. Lerman and Manski, 1981).

Solution: Recursive Importance Sampling (RIS).


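Transform the initial problem:

$$L(\beta, \rho, \sigma_\varepsilon^2) = \underbrace{\int_{-\infty}^{0} \int_{-\infty}^{0} \cdots}_{y_i = 0} \; \underbrace{\int_{0}^{\infty} \int_{0}^{\infty} \cdots}_{y_i = 1} f_N\left[ (I - \rho W)^{-1} X\beta + \upsilon \right] d\upsilon_N \cdots d\upsilon_1 =$$

$$= \int_{-\infty}^{0} f_N\left[ Q (I - \rho W)^{-1} X\beta + \upsilon \right] d\upsilon =$$

$$= P\left[ Q (I - \rho W)^{-1} X\beta + \upsilon \le 0 \right] =$$

$$= P\Big[ \upsilon \le \underbrace{-Q (I - \rho W)^{-1} X\beta}_{\equiv \mu} \Big]$$

where:

$$Q(y) = \begin{bmatrix} 1 - 2y_1 & & \\ & \ddots & \\ & & 1 - 2y_N \end{bmatrix}$$

(this notation serves the purpose of setting the diagonal elements to values in $\{-1; 1\}$ and transforming all the inequalities to the form $\le$ by using the symmetry of the normal distribution).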


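$$L(\beta, \rho, \sigma_\varepsilon^2) = P[\upsilon \le \mu]$$

$$\upsilon \sim MVN(0, \Sigma_\varepsilon) = MVN(0, VV')$$

Since $\upsilon^{(r)} = V \cdot \tilde{\upsilon}^{(r)}$, then:

$$P\left[ \upsilon^{(r)} \le \mu \right] = P\left[ V \cdot \tilde{\upsilon}^{(r)} \le \mu \right] = P\left[ \tilde{\upsilon}^{(r)} \le V^{-1} \cdot \mu \right] \equiv P\left[ \tilde{\upsilon}^{(r)} \le \tilde{\mu} \right]$$

The method is called recursive because of the triangularity of the matrix $V$.

We can exploit the independence between the individual dimensions of $\tilde{\upsilon}^{(r)}$ to write:

$$L^{(r)}(\beta, \rho, \sigma_\varepsilon^2) = P\left[ \tilde{\upsilon}^{(r)} \le \tilde{\mu} \right] = \prod_{i=1}^{N} P\left[ \tilde{\upsilon}_i^{(r)} \le \tilde{\mu}_i \right] = \prod_{i=1}^{N} \Phi\left( \frac{\tilde{\mu}_i}{\sigma_i} \right)$$

where $\Phi(.)$ is the standard normal distribution function.

Typically for importance sampling:

$$L(\beta, \rho, \sigma_\varepsilon^2) = \frac{1}{R} \sum_{r=1}^{R} L^{(r)}(\beta, \rho, \sigma_\varepsilon^2)$$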
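A recursive simulator in this spirit can be sketched in Python. The version below is a GHK-type recursion (sequential truncated-normal draws along a triangular Cholesky factor), used here as a stand-in for RIS; it is not claimed to be what any particular package implements. Assumptions of this sketch: NumPy is available, the primitives are standard normal, the Cholesky factor is lower triangular (so the recursion starts from the first element), and the covariance of $Q\upsilon$ is taken as $Q \Sigma_\varepsilon Q'$. The name `ris_likelihood` is hypothetical.

```python
from statistics import NormalDist
import numpy as np

_STD_NORMAL = NormalDist()

def ris_likelihood(y, mean, Sigma, R, rng):
    """Simulate P(sign pattern of y* matches y) for y* ~ MVN(mean, Sigma)."""
    n = y.size
    q = 1.0 - 2.0 * y                         # +1 where y_i = 0, -1 where y_i = 1
    mu = -q * mean                            # upper bounds: (Q v)_i <= mu_i
    cov = (q[:, None] * Sigma) * q[None, :]   # covariance of Q v
    V = np.linalg.cholesky(cov)               # lower triangular, cov = V V'
    total = 0.0
    for _ in range(R):
        eta = np.zeros(n)
        weight = 1.0
        for i in range(n):
            # bound on eta_i given the already drawn eta_0, ..., eta_{i-1}
            bound = (mu[i] - V[i, :i] @ eta[:i]) / V[i, i]
            p = _STD_NORMAL.cdf(bound)
            weight *= p                       # conditional probability of dimension i
            if p <= 0.0:
                break
            # draw eta_i from N(0, 1) truncated to (-inf, bound] by CDF inversion
            u = rng.uniform(0.0, 1.0) * p
            u = min(max(u, 1e-300), 1.0 - 1e-16)
            eta[i] = _STD_NORMAL.inv_cdf(u)
        total += weight
    return total / R                          # average of L^(r) over the R draws

rng = np.random.default_rng(0)
Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
print(ris_likelihood(np.array([0, 0]), np.zeros(2), Sigma, R=2000, rng=rng))
```

Unlike the frequency simulator, every draw contributes a strictly positive weight, so the estimate is smooth in the parameters and usable with far fewer draws.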


MSL with RIS: McMillen (1992), spprobitml {McSpatial}

GMM variant: Klier, McMillen (2008), gmmprobit {McSpatial}

Due to a high level of complication, some authors propose using Bayesian methods: LeSage and Pace (2009), sarprobit {spatialprobit}

For students/graduates of Bayesian econometrics:

- Prior distribution: non-informative normal-gamma-uniform (normal-gamma for $\beta$ and $\frac{1}{\sigma_\varepsilon^2}$, and uniform for $\rho$).
- Posterior sampling method: Metropolis-within-Gibbs. Conditional posterior distributions:

1) $P(\sigma_\varepsilon^2 | \rho, \beta) = P(\sigma_\varepsilon^2) \sim InvGamma$
2) $P(\beta | \rho, \sigma_\varepsilon^2) \sim N$ (known parameters)
3) $P(\rho | \beta, \sigma_\varepsilon^2) \sim \; ?$ (evaluation by the Metropolis-Hastings algorithm)


3 Interpretation of coefficients


Recall that in non-spatial probit models:

$$\frac{\partial P(y_i = 1)}{\partial x_{i,k}} = f(\beta, x_i)$$

Marginal effects depend not only on the coefficients, but also on the level of the independent variable (for which we compute the effects) and the levels of all the other independent variables for a given unit.

In spatial probit models, it holds additionally that

$$\frac{\partial P(y_i = 1)}{\partial x_{i,k}} = f(\beta, \rho, W, X)$$

Apart from all the abovementioned factors, as well as spatial parameters and weights, the effects for a given unit depend on the levels of all explanatory variables for all the units.


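$$M_{i,j}^{k} = \frac{\partial P(y_i = 1)}{\partial x_{j,k}} = \frac{\partial P(y_i = 1)}{\partial y_i^*} \cdot \underbrace{\frac{\partial y_i^*}{\partial x_{j,k}}}_{\text{as before}} = \frac{\partial P(y_i^* > 0)}{\partial y_i^*} \cdot \left[ (I - \rho W)^{-1} \right]_{i,j} \beta_k =$$

$$= \frac{\partial P\left( \frac{y_i^* - \left[ (I - \rho W)^{-1} X\beta \right]_i}{\sqrt{[\Sigma_\varepsilon]_{i,i}}} > \frac{0 - \left[ (I - \rho W)^{-1} X\beta \right]_i}{\sqrt{[\Sigma_\varepsilon]_{i,i}}} \right)}{\partial y_i^*} \cdot \left[ (I - \rho W)^{-1} \right]_{i,j} \beta_k =$$

$$= \frac{\partial P\left( \frac{y_i^* - \left[ (I - \rho W)^{-1} X\beta \right]_i}{\sqrt{[\Sigma_\varepsilon]_{i,i}}} < \frac{\left[ (I - \rho W)^{-1} X\beta \right]_i}{\sqrt{[\Sigma_\varepsilon]_{i,i}}} \right)}{\partial y_i^*} \cdot \left[ (I - \rho W)^{-1} \right]_{i,j} \beta_k =$$

$$= \Phi'\left( \frac{\left[ (I - \rho W)^{-1} X\beta \right]_i}{\sqrt{[\Sigma_\varepsilon]_{i,i}}} \right) \cdot \frac{1}{\sqrt{[\Sigma_\varepsilon]_{i,i}}} \cdot \left[ (I - \rho W)^{-1} \right]_{i,j} \beta_k$$

Here $\sqrt{[\Sigma_\varepsilon]_{i,i}}$ is the standard deviation $\sigma_i$ of the $i$-th element of $\upsilon$, and $\Phi'$ is the standard normal density.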
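The resulting $N \times N$ matrix of effects for a given variable $k$ can be computed directly. A minimal sketch assuming NumPy, with standardisation by the standard deviation $\sqrt{[\Sigma_\varepsilon]_{i,i}}$; the function name `sar_probit_marginal_effects` and the example inputs are hypothetical.

```python
import math
import numpy as np

def sar_probit_marginal_effects(X, beta, W, rho, sigma2, k):
    """N x N matrix M with M[i, j] = dP(y_i = 1) / dx_{j,k} in a SAR probit."""
    n = W.shape[0]
    A_inv = np.linalg.inv(np.eye(n) - rho * W)   # (I - rho W)^{-1}
    Sigma = sigma2 * A_inv @ A_inv.T             # covariance of the reduced-form error
    sd = np.sqrt(np.diag(Sigma))                 # unit-specific standard deviations
    z = (A_inv @ X @ beta) / sd                  # standardised mean of y*_i
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)  # standard normal density
    # row i is scaled by pdf_i / sd_i, element (i, j) by the spatial multiplier times beta_k
    return (pdf / sd)[:, None] * A_inv * beta[k]

# Hypothetical example: three units on a line, a constant and one regressor.
W = np.array([[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]])
X = np.array([[1.0, 0.2], [1.0, -0.4], [1.0, 1.0]])
beta = np.array([0.3, 0.8])
M = sar_probit_marginal_effects(X, beta, W, rho=0.4, sigma2=1.0, k=1)
print(np.round(M, 3))
```

The diagonal of `M` holds the effects of a unit's own regressor, while the off-diagonal elements show how a change in unit $j$ shifts the probability for unit $i$; with $\rho = 0$ the matrix collapses to the diagonal of non-spatial probit effects.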
