Spatial Econometrics
Lecture 5: Single-source model of spatial regression. Combining GIS and regional analysis
Andrzej Torój
Institute of Econometrics – Department of Applied Econometrics
Outline
1 Linear model vs SAR/SLM (Spatial Lag)
   Linear model
   SAR (Spatial Lag, SLM)
2 SEM model (Spatial Error)
   SEM model with global error dependence
   SEM model with local error dependence
3 SLX model
4 Combining point GIS data with regional statistics
   Example: location of Biedronka markets
   Homework
1 Linear model vs SAR/SLM (Spatial Lag)
Linear regression model – specification
The well-known linear regression model:
y = Xβ + ε
Under the classical assumptions, its parameters can be estimated in an unbiased, consistent and efficient way with the Ordinary Least Squares (OLS) method.
The model is appropriate when the spatial links in y are fully (implicitly) captured by the spatial autocorrelation of the regressors included in X (spatial clustering of X).
[Figure: flow of impacts in the linear model]
[Figure: flow of impacts in the SAR model]
[Figure: SAR model – relation to other models]
SAR model – specification
Spatial autoregression with additional regressors:
y = ρWy + Xβ + ε
Without the explanatory variables X, the model reduces to the pure SAR model.
In this model we do not assume any spatial clustering of the causes, but spatial interactions between outcomes (global spatial spillovers).
Problem with OLS estimation: endogeneity of Wy (as in pure SAR).
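To make the specification concrete, data can be drawn from the reduced form y = (I − ρW)^{-1}(Xβ + ε). The sketch below uses base R only. The ring-shaped weight matrix, the sample size, the parameter values and the helper names `ring_weights` and `simulate_sar` are assumptions chosen for illustration, not part of the lecture.

```r
# Hypothetical row-standardised weights: units on a ring, two neighbours each.
ring_weights <- function(n) {
  W <- matrix(0, n, n)
  idx <- seq_len(n)
  W[cbind(idx, c(n, idx[-n]))] <- 0.5   # left neighbour
  W[cbind(idx, c(idx[-1], 1))] <- 0.5   # right neighbour
  W
}

# Draw y from the SAR reduced form y = (I - rho W)^{-1} (X beta + eps).
simulate_sar <- function(X, W, rho, beta, eps) {
  n <- nrow(W)
  as.vector(solve(diag(n) - rho * W, X %*% beta + eps))
}

set.seed(1)
n   <- 50
W   <- ring_weights(n)
X   <- cbind(1, rnorm(n))               # intercept and one regressor
eps <- rnorm(n)
y   <- simulate_sar(X, W, rho = 0.5, beta = c(1, 2), eps = eps)

# The simulated y satisfies the structural equation y = rho W y + X beta + eps
# up to numerical precision.
max(abs(y - as.vector(0.5 * W %*% y + X %*% c(1, 2) + eps)))
```

Solving the linear system with `solve(A, b)` avoids forming the inverse (I − ρW)^{-1} explicitly, which matters for larger N.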
Consequences of omitting spatial structure SAR (1)
True data generating process:
$$y = \rho W y + X\beta + \varepsilon$$
Estimated linear model omitting Wy (method: OLS):
$$y = X\beta_{OLS} + \varepsilon$$
According to the general principles of econometrics, omitting a variable biases the estimate of β; the bias converges to the product of:
- the (true) parameter of the omitted variable,
- the slope of the regression of the omitted variable on the included variables.
In our case:
$$\operatorname{plim}\hat\beta_{OLS} = \beta + \rho\,\frac{Cov(Wy, X)}{Var(X)}$$
Consequences of omitting spatial structure SAR (2)
Can Cov(Wy, X) possibly be 0? If the true data generating process is SAR, then:
$$y = (I - \rho W)^{-1}X\beta + (I - \rho W)^{-1}\varepsilon$$
$$y = X\beta + \rho WX\beta + \rho^{2}W^{2}X\beta + \dots + \varepsilon + \rho W\varepsilon + \rho^{2}W^{2}\varepsilon + \dots$$
$$Wy = WX\beta + \rho W^{2}X\beta + \rho^{2}W^{3}X\beta + \dots + W\varepsilon + \rho W^{2}\varepsilon + \rho^{2}W^{3}\varepsilon + \dots$$
Thus (skipping the components related to ε, which, as we know, are uncorrelated with X):
$$\operatorname{plim}\hat\beta_{OLS} - \beta = \rho\,\frac{Cov(WX\beta, X)}{Var(X)} + \rho\,\frac{Cov(\rho W^{2}X\beta, X)}{Var(X)} + \rho\,\frac{Cov(\rho^{2}W^{3}X\beta, X)}{Var(X)} + \dots =$$
$$= \frac{\rho}{Var(X)}\left[Cov(WX, X) + \rho\cdot Cov(W^{2}X, X) + \rho^{2}\cdot Cov(W^{3}X, X) + \dots\right]\beta$$
Even if X is not spatially autocorrelated and Cov(WX, X) = 0, the further components cannot all be equal to zero.
W² and, in general, further powers of W are no longer matrices with zero diagonal elements.
Interpretation: W² is the matrix of connections to the neighbours of the neighbours. But the neighbours of your neighbour include you! (And you are always correlated with yourself.)
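The claim about the diagonal of W² is easy to check numerically. The sketch below uses a hypothetical four-unit chain (1 - 2 - 3 - 4) with row-standardised weights; the matrix is an assumption made up for illustration.

```r
# Hypothetical 4-unit chain 1 - 2 - 3 - 4 with row-standardised weights.
W <- matrix(c(0,   1,   0,   0,
              0.5, 0,   0.5, 0,
              0,   0.5, 0,   0.5,
              0,   0,   1,   0), nrow = 4, byrow = TRUE)

diag(W)            # all zeros: no unit is its own neighbour
W2 <- W %*% W      # two-step connections (neighbours of neighbours)
diag(W2)           # strictly positive: every two-step path can return home
```

Each diagonal element of W² sums the products w_ij · w_ji over all neighbours j, so it is positive as soon as a unit has at least one mutual neighbour.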
Spatial OLS (1)
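If omitting the spatial lag makes the OLS estimator biased, we should include it.
Potentially easy to do: if W is predetermined, one can construct the spatial lag variable Wy upfront and estimate the SAR model y = ρWy + Xβ + ε with OLS (this method is referred to as Spatial OLS):
$$y = \begin{bmatrix} Wy & X \end{bmatrix}\begin{bmatrix}\rho \\ \beta\end{bmatrix} + \varepsilon$$
From the OLS properties, we know that:
$$E\begin{bmatrix}\hat\rho \\ \hat\beta\end{bmatrix} = \begin{bmatrix}\rho \\ \beta\end{bmatrix} + \left(\begin{bmatrix} Wy & X\end{bmatrix}^{T}\begin{bmatrix} Wy & X\end{bmatrix}\right)^{-1} E\left(\begin{bmatrix} Wy & X\end{bmatrix}^{T}\varepsilon\right)$$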
Spatial OLS (2)
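In the linear regression model, we assume that the error terms are independent of the regressors, i.e. $E\left(\begin{bmatrix} Wy & X\end{bmatrix}^{T}\varepsilon\right) = 0$, and this proves the unbiasedness of the OLS estimator in such a model. It holds that $E(X^{T}\varepsilon) = 0$, but:
$$\begin{aligned}
E\left[(Wy)^{T}\varepsilon\right] &= E\left\{\left[W(I-\rho W)^{-1}X\beta + W(I-\rho W)^{-1}\varepsilon\right]^{T}\varepsilon\right\} = \\
&= E\left\{\left[W(I-\rho W)^{-1}X\beta\right]^{T}\varepsilon + \left[W(I-\rho W)^{-1}\varepsilon\right]^{T}\varepsilon\right\} = \\
&= E\left\{\varepsilon^{T}\left[W(I-\rho W)^{-1}\right]^{T}\varepsilon\right\} \neq 0
\end{aligned}$$
Our model is not the classical regression model, because the observations depend on one another ($y_i$ depends on the neighbouring $y_j$ and vice versa).
The situation is similar to simultaneous equations models.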
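The last expectation can be evaluated explicitly under the additional assumption (not stated on the slide) that the errors are spherical, E(εεᵀ) = σ²I. A short sketch using the identity E(εᵀAε) = σ² tr(A), the fact that tr(Aᵀ) = tr(A), and tr(W) = 0:

```latex
\begin{aligned}
E\left\{\varepsilon^{T}\left[W(I-\rho W)^{-1}\right]^{T}\varepsilon\right\}
  &= \sigma^{2}\,\mathrm{tr}\left[W(I-\rho W)^{-1}\right] \\
  &= \sigma^{2}\,\mathrm{tr}\left(W + \rho W^{2} + \rho^{2}W^{3} + \dots\right) \\
  &= \sigma^{2}\left[\rho\,\mathrm{tr}(W^{2}) + \rho^{2}\,\mathrm{tr}(W^{3}) + \dots\right]
\end{aligned}
```

For ρ > 0 and non-negative weights, every term in the last line is non-negative, and the first one is strictly positive as soon as at least one pair of units are mutual neighbours, so the expectation cannot vanish.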
Spatial OLS (3)
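For simplification, consider the SAR model with 1 explanatory variable x:
$$E\begin{bmatrix}\hat\rho \\ \hat\beta\end{bmatrix} = \begin{bmatrix}\rho \\ \beta\end{bmatrix} + \underbrace{\frac{1}{\det\left(\begin{bmatrix} Wy & x\end{bmatrix}^{T}\begin{bmatrix} Wy & x\end{bmatrix}\right)}}_{\equiv\gamma>0}\begin{bmatrix} x^{T}x & -(Wy)^{T}x \\ -x^{T}(Wy) & (Wy)^{T}(Wy)\end{bmatrix}\begin{bmatrix}(Wy)^{T}\varepsilon \\ \underbrace{x^{T}\varepsilon}_{=0}\end{bmatrix} =$$
$$= \begin{bmatrix}\rho \\ \beta\end{bmatrix} + \begin{bmatrix}\overbrace{\gamma\, x^{T}x\,(Wy)^{T}\varepsilon}^{\text{usually }>0} \\ \underbrace{-\gamma\, x^{T}(Wy)\,(Wy)^{T}\varepsilon}_{\text{usually }<0}\end{bmatrix}$$
So Spatial OLS delivers biased estimates (ρ is usually biased upward, β downward).
In the multivariate case, the bias is concentrated on the parameters of those variables in X whose spatial pattern most resembles the spatial pattern of y.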
Spatial 2SLS (1)
The simultaneous equation bias in y = ρWy + Xβ + ε can be treated analogously to the case of endogenous regressors, i.e. with the instrumental variables method. The resulting estimator is consistent (although, like any instrumental variables estimator, not unbiased in finite samples) and is referred to as spatial two-stage least squares (S2SLS).
A valid instrumental variable is correlated with the problematic regressor (Wy), but uncorrelated with the error term (ε). Recall that for the SAR model:
$$Wy = \underbrace{WX\beta + \rho W^{2}X\beta + \rho^{2}W^{3}X\beta + \dots}_{\text{ideal instruments!}} + W\varepsilon + \rho W^{2}\varepsilon + \rho^{2}W^{3}\varepsilon + \dots$$
Step 1:
Linear regression (OLS) of Wy on the matrix containing the exogenous variables and a certain number of instruments:
$$\Pi = \begin{bmatrix} X & WX & W^{2}X & \dots \end{bmatrix}$$
Fitted values:
$$\widehat{Wy} = \underbrace{\Pi\left(\Pi^{T}\Pi\right)^{-1}\Pi^{T}}_{P}\,Wy$$
Spatial 2SLS (2)
Step 2:
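OLS estimation of the SAR model parameters after replacing Wy with $\widehat{Wy}$:
$$\begin{bmatrix}\hat\rho \\ \hat\beta\end{bmatrix} = \left(\begin{bmatrix}\widehat{Wy} & X\end{bmatrix}^{T}\begin{bmatrix}\widehat{Wy} & X\end{bmatrix}\right)^{-1}\begin{bmatrix}\widehat{Wy} & X\end{bmatrix}^{T}y$$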
Spatial 2SLS (S2SLS)
model <- stsls(y ~ x, listw = W)
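The two steps can also be written out by hand in base R, which may help to see what happens inside the estimator. This is a minimal sketch under stated assumptions, not the implementation behind `stsls`: the helper names `ring_weights` and `s2sls_manual`, the ring-shaped W, the instrument set (Wx and W²x only) and the simulated data are all hypothetical.

```r
# Hypothetical row-standardised weights: units on a ring, two neighbours each.
ring_weights <- function(n) {
  W <- matrix(0, n, n)
  idx <- seq_len(n)
  W[cbind(idx, c(n, idx[-n]))] <- 0.5
  W[cbind(idx, c(idx[-1], 1))] <- 0.5
  W
}

# Manual S2SLS for y = rho W y + b0 + b1 x + eps with instruments Wx and W^2 x.
s2sls_manual <- function(y, x, W) {
  X  <- cbind(1, x)                         # exogenous regressors incl. intercept
  Wy <- as.vector(W %*% y)                  # endogenous spatial lag
  Pi <- cbind(X, W %*% x, W %*% W %*% x)    # exogenous variables + instruments
  # Step 1: fitted values of Wy from the regression on Pi (P Wy).
  Wy_hat <- as.vector(Pi %*% solve(crossprod(Pi), crossprod(Pi, Wy)))
  # Step 2: OLS of y on [Wy_hat, X].
  Z_hat <- cbind(Wy_hat, X)
  est <- as.vector(solve(crossprod(Z_hat), crossprod(Z_hat, y)))
  names(est) <- c("rho", "intercept", "beta")
  est
}

set.seed(1)
n <- 100
W <- ring_weights(n)
x <- rnorm(n)
# Simulated SAR data with rho = 0.4, intercept = 1, beta = 2 (assumed values).
y <- as.vector(solve(diag(n) - 0.4 * W, 1 + 2 * x + rnorm(n)))
s2sls_manual(y, x, W)
```

The constant is not spatially lagged in the instrument set because W·1 = 1 for a row-standardised W, which would make Π collinear.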
Spatial ML (1)
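Variant 2: maximum likelihood method
$$y = \underbrace{(I-\rho W)^{-1}}_{M}X\beta + \underbrace{(I-\rho W)^{-1}}_{M}u, \qquad u \sim N(0, \sigma^{2}I)$$
$$L(u) = \left(\frac{1}{2\pi\sigma^{2}}\right)^{N/2}\exp\left(-\frac{u^{T}u}{2\sigma^{2}}\right)$$
By the change of variables theorem (multivariate case):
$$L(y) = \det\underbrace{\left(\frac{\partial u}{\partial y}\right)}_{M^{-1}}\,L[u(y)]$$
$$L(y) = \det\left(M^{-1}\right)\left(\frac{1}{2\pi\sigma^{2}}\right)^{N/2}\exp\left(-\frac{(y-MX\beta)^{T}\left(M^{-1}\right)^{T}\left(M^{-1}\right)(y-MX\beta)}{2\sigma^{2}}\right)$$
$$\hat\beta = \arg\max_{\beta} L(y)$$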
Spatial ML (2)
Standard errors evaluated on the basis of Hessian matrix at the maximum point of the likelihood function (typical for ML).
If M = I, the likelihood function is identical to that of the linear model.
ML for the SAR model in R
model <- lagsarlm(y ~ x, listw = W)
The same model is estimated when the formula argument in the function spautolm (pure SAR) is supplied with additional regressors.
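The SAR likelihood derived above can be coded directly, which also makes it possible to verify the remark about M = I. The sketch below is an illustration in base R, not the routine used by `lagsarlm`. The function name `sar_loglik` and the small data set are hypothetical, and the log-determinant uses the absolute value of det(I − ρW).

```r
# Log-likelihood of the SAR model; A = M^{-1} = I - rho W.
sar_loglik <- function(rho, beta, sigma2, y, X, W) {
  n <- length(y)
  A <- diag(n) - rho * W
  u <- as.vector(A %*% y - X %*% beta)      # u = M^{-1} y - X beta
  logdet <- as.numeric(determinant(A, logarithm = TRUE)$modulus)
  logdet - n / 2 * log(2 * pi * sigma2) - sum(u^2) / (2 * sigma2)
}

# Hypothetical 4-unit chain with row-standardised weights and made-up data.
W <- matrix(c(0,   1,   0,   0,
              0.5, 0,   0.5, 0,
              0,   0.5, 0,   0.5,
              0,   0,   1,   0), nrow = 4, byrow = TRUE)
X <- cbind(1, c(0.5, -1.0, 2.0, 0.3))
y <- c(1.2, 0.4, 3.1, 0.9)
beta <- c(1, 0.5)

# With rho = 0 (M = I) the value coincides with the Gaussian linear model.
ll_sar <- sar_loglik(0, beta, 1, y, X, W)
ll_lin <- sum(dnorm(y, mean = as.vector(X %*% beta), sd = 1, log = TRUE))
all.equal(ll_sar, ll_lin)
```

In the exponent, (y − MXβ)ᵀ(M⁻¹)ᵀ(M⁻¹)(y − MXβ) equals the squared norm of M⁻¹y − Xβ, which is why the code never forms M itself.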
Demand for computing power
The most burdensome operations:
- matrix determinant: det M^{-1}
- matrix inversion: M^{-1}
I recommend using Microsoft R Open (instead of standard R): it contains mathematical libraries for multi-threading.
Test the following script on your computer with standard R and MS R Open:
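The script itself is not reproduced in this text. A minimal stand-in that times the two operations named above could look as follows. The matrix size and the randomly generated matrix are assumptions, used only as a placeholder for I − ρW.

```r
set.seed(1)
n <- 500                                    # hypothetical problem size
# Stand-in for I - rho W: strictly diagonally dominant, hence invertible.
A <- diag(n) - 0.5 * matrix(runif(n * n), n, n) / n

t_det <- system.time(d <- determinant(A, logarithm = TRUE))[["elapsed"]]
t_inv <- system.time(A_inv <- solve(A))[["elapsed"]]

c(determinant = t_det, inversion = t_inv)   # elapsed seconds
```

Increasing `n` makes the difference between a single-threaded and a multi-threaded linear algebra library easier to see.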
Tests: linear model vs SAR (1)
This illustration demonstrates the univariate case (θ – scalar).
Tests: linear model vs SAR (2)
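$$LM_{\rho} = \frac{\left(\hat\varepsilon^{T}Wy \,/\, \hat\sigma^{2}\right)^{2}}{\mathrm{tr}\left[\left(W^{T}+W\right)W\right] + \frac{1}{\hat\sigma^{2}}\left(WX\hat\beta\right)^{T}\left[I - X\left(X^{T}X\right)^{-1}X^{T}\right]\left(WX\hat\beta\right)} \sim \chi^{2}(1), \qquad \hat\sigma^{2} = \frac{\hat\varepsilon^{T}\hat\varepsilon}{N}$$
where $\hat\varepsilon$ and $\hat\beta$ are the OLS residuals and estimates from the linear model.
H0: linear model (ρ = 0)
H1: SAR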
2 SEM model (Spatial Error)
[Figure: flow of impacts in the SEM model]
[Figure: SEM model – relation to other models]
SEM model – specification
It is not the dependent variable but the error term that is subject to spatial autocorrelation; the difference is analogous to the difference between AR and MA models.
y = Xβ + ε
ε = λWε + u
In the absence of regressors X, the model would be equivalent to the (pure) SAR model.
Interpretation: spatial clustering in unobservables (shocks).
SEM model – estimation (1)
The OLS estimator is inefficient (and the standard errors are biased), because:
$$y = X\beta + \varepsilon, \qquad \varepsilon = \lambda W\varepsilon + u, \quad\text{i.e.}\quad \varepsilon = (I-\lambda W)^{-1}u$$
$$Var(\varepsilon) = E\left(\varepsilon\varepsilon^{T}\right) = (I-\lambda W)^{-1}E\left(uu^{T}\right)\left[(I-\lambda W)^{-1}\right]^{T} = \sigma^{2}(I-\lambda W)^{-1}\left[(I-\lambda W)^{-1}\right]^{T} \neq \sigma^{2}I$$
Variant 1: as usual with non-spherical errors, the solution is Generalised Least Squares estimation:
$$\hat\beta = \left(X^{T}\Omega^{-1}X\right)^{-1}X^{T}\Omega^{-1}y \quad\text{with}\quad \Omega = (I-\lambda W)^{-1}\left[(I-\lambda W)^{-1}\right]^{T}$$
W is known; λ is estimated from the residuals of the consistent OLS estimation (details of the procedure: Kelejian and Prucha, 1998; Arbia, 2014).
$$Var\left(\hat\beta\right) = \hat\sigma^{2}\left(X^{T}\Omega^{-1}X\right)^{-1}$$
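For a given value of λ, the GLS formulas above take only a few lines of base R. The sketch below treats λ as known, so it leaves out the Kelejian and Prucha step that estimates it. The helper names, the ring-shaped W, the simulated data and the degrees-of-freedom correction in the variance estimate are assumptions for illustration.

```r
# Hypothetical row-standardised weights: units on a ring, two neighbours each.
ring_weights <- function(n) {
  W <- matrix(0, n, n)
  idx <- seq_len(n)
  W[cbind(idx, c(n, idx[-n]))] <- 0.5
  W[cbind(idx, c(idx[-1], 1))] <- 0.5
  W
}

# GLS for the SEM model with lambda treated as known.
sem_gls <- function(y, X, W, lambda) {
  n <- length(y)
  A <- diag(n) - lambda * W                 # A = I - lambda W
  Omega_inv <- crossprod(A)                 # Omega^{-1} = A^T A
  XtOi <- crossprod(X, Omega_inv)           # X^T Omega^{-1}
  beta <- solve(XtOi %*% X, XtOi %*% y)
  u_hat <- A %*% (y - X %*% beta)           # filtered residuals
  sigma2 <- sum(u_hat^2) / (n - ncol(X))    # assumed d.o.f. correction
  list(beta = as.vector(beta), vcov = sigma2 * solve(XtOi %*% X))
}

set.seed(3)
n <- 40
W <- ring_weights(n)
X <- cbind(1, rnorm(n))
# Simulated SEM data with beta = (1, 2) and lambda = 0.6 (assumed values).
y <- as.vector(X %*% c(1, 2) + solve(diag(n) - 0.6 * W, rnorm(n)))

fit <- sem_gls(y, X, W, lambda = 0.6)
fit$beta
sqrt(diag(fit$vcov))                        # standard errors
```

Because Ω⁻¹ = (I − λW)ᵀ(I − λW), the same estimates result from OLS on the spatially filtered data (I − λW)y and (I − λW)X.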
SEM model – estimation (2)
Spatial GLS in R
model4 <- GMerrorsar(y ~ x, listw = W)
SEM model – estimation (3)
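Variant 2: maximum likelihood method
$$y = X\beta + \underbrace{(I-\lambda W)^{-1}}_{M}u, \qquad u \sim N(0, \sigma^{2}I)$$
$$L(u) = \left(\frac{1}{2\pi\sigma^{2}}\right)^{N/2}\exp\left(-\frac{u^{T}u}{2\sigma^{2}}\right)$$
By the change of variables theorem (multivariate case):
$$L(y) = \det\underbrace{\left(\frac{\partial u}{\partial y}\right)}_{M^{-1}}\,L[u(y)]$$
$$L(y) = \det\left(M^{-1}\right)\left(\frac{1}{2\pi\sigma^{2}}\right)^{N/2}\exp\left(-\frac{(y-X\beta)^{T}\left(M^{-1}\right)^{T}\left(M^{-1}\right)(y-X\beta)}{2\sigma^{2}}\right)$$
$$\hat\beta = \arg\max_{\beta} L(y)$$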
SEM model – estimation (4)
Standard errors evaluated on the basis of Hessian matrix at the maximum point of the likelihood function (typical for ML).
If M = I, the likelihood function is identical to that of the linear model.
ML for SEM model in R
model <- errorsarlm(y ~ x, listw = W)
The same model is also estimated if the formula in the function spautolm (pure SAR) is supplied with regressors.
Both SAR and SEM collapse to pure SAR without regressors X
SAR                              SEM
y = ρWy + Xβ + ε                 y = Xβ + (I − λW)^{-1} u
                 set β = 0
y = ρWy + ε                      y = (I − λW)^{-1} u
y − ρWy = ε
(I − ρW) y = ε
y = (I − ρW)^{-1} ε
LM tests: linear model vs SEM
$$LM_{\lambda} = \frac{N^{2}}{\mathrm{tr}\left[\left(W^{T}+W\right)W\right]}\left(\frac{\hat u^{T}W\hat u}{\hat u^{T}\hat u}\right)^{2} \sim \chi^{2}(1)$$
H0: linear model (λ = 0)
H1: SEM
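Both LM statistics need only the OLS residuals of the linear model, so they can be computed by hand. The sketch below uses the usual textbook form of the two tests with the variance estimate σ̂² = ε̂ᵀε̂/N. The function name `lm_tests`, the ring-shaped W and the simulated data are assumptions for illustration.

```r
# Hypothetical row-standardised weights: units on a ring, two neighbours each.
ring_weights <- function(n) {
  W <- matrix(0, n, n)
  idx <- seq_len(n)
  W[cbind(idx, c(n, idx[-n]))] <- 0.5
  W[cbind(idx, c(idx[-1], 1))] <- 0.5
  W
}

# LM tests of the linear model against SAR (lag) and SEM (error).
lm_tests <- function(y, X, W) {
  n <- length(y)
  XtX_inv <- solve(crossprod(X))
  b  <- XtX_inv %*% crossprod(X, y)         # OLS estimates, linear model
  e  <- as.vector(y - X %*% b)              # OLS residuals
  s2 <- sum(e^2) / n                        # sigma^2 estimate: e'e / N
  Tr <- sum(diag((t(W) + W) %*% W))         # tr[(W' + W) W]
  WXb  <- W %*% X %*% b
  MWXb <- WXb - X %*% (XtX_inv %*% crossprod(X, WXb))  # [I - X(X'X)^{-1}X'] WXb
  lm_lag <- (sum(e * (W %*% y)) / s2)^2 / (Tr + sum(WXb * MWXb) / s2)
  lm_err <- (sum(e * (W %*% e)) / s2)^2 / Tr
  c(LM_lag = lm_lag, LM_err = lm_err,
    p_lag = pchisq(lm_lag, df = 1, lower.tail = FALSE),
    p_err = pchisq(lm_err, df = 1, lower.tail = FALSE))
}

set.seed(5)
n <- 60
W <- ring_weights(n)
X <- cbind(1, rnorm(n))
# Simulated SAR data with rho = 0.5 (assumed value).
y <- as.vector(solve(diag(n) - 0.5 * W, X %*% c(1, 2) + rnorm(n)))
out <- lm_tests(y, X, W)
round(out, 3)
```

The p-values come from the χ²(1) distribution, as stated on the slides.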
Robust LM tests (1)
In the LM tests for the SAR and SEM specifications (respectively):
1. H0: linear model (ρ = 0), H1: SAR
2. H0: linear model (λ = 0), H1: SEM
Problem: each pair of hypotheses ignores the alternative hypothesis of the other test.
Consequence: test 1 may reject H0 even when its own H1 is false (but H1 from test 2 is true), and vice versa.
RLMlag and RLMerr
Anselin et al. (1996) propose robust test statistics $LM_{\rho}^{*}$ and $LM_{\lambda}^{*}$ which, by construction, exclude the possibility that an incorrect process is captured by the alternative hypothesis (see Arbia, 2014).
$$LM_{\rho}^{*} = LM_{\rho\lambda} - LM_{\lambda}$$
$$LM_{\lambda}^{*} = LM_{\rho\lambda} - LM_{\rho}$$
Global vs local SEM model (1)
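The previously presented SEM model stipulated a global dependence between unobservables:
$$y = X\beta + \varepsilon, \qquad \varepsilon = \lambda W\varepsilon + u$$
The local SEM version:
$$y = X\beta + \varepsilon, \qquad \varepsilon = \lambda Wu + u$$
What is the difference? Consider the spatial multiplier matrices of y with respect to u in both cases:
local SEM: $y = X\beta + \varepsilon$, $\varepsilon = (I + \lambda W)u$, $M = \frac{\partial y}{\partial u} = I + \lambda W$
global SEM: $y = X\beta + \varepsilon$, $\varepsilon = (I - \lambda W)^{-1}u$, $M = \frac{\partial y}{\partial u} = (I - \lambda W)^{-1}$
Algebraically, note that:
$$\underbrace{(I - \lambda W)^{-1}}_{\text{multiplier SEM glob}} = \underbrace{I + \lambda W}_{\text{multiplier SEM loc}} + \lambda^{2}W^{2} + \lambda^{3}W^{3} + \dots$$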
Global vs local SEM model (2)
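Example: Canada, USA, Mexico (rows and columns ordered US, CA, MX):
$$W = \begin{bmatrix} 0 & 0.5 & 0.5 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}; \qquad \lambda = 0.4;$$
a shock u = 1 occurs in Mexico.
Spatial multipliers for local SEM:
$$\left(I + 0.4\begin{bmatrix} 0 & 0.5 & 0.5 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}\right)\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0.2 & 0.2 \\ 0.4 & 1 & 0 \\ 0.4 & 0 & 1 \end{bmatrix}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0.2 \\ 0 \\ 1 \end{bmatrix}$$
$\Delta y_{MX} = 1$, $\Delta y_{US} = 0.2$, no effect for Canada.
The shock in u affected y only in the directly linked units.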
Global vs local SEM model (3)
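Spatial multipliers for global SEM:
$$\left(I - 0.4\begin{bmatrix} 0 & 0.5 & 0.5 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}\right)^{-1}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & -0.2 & -0.2 \\ -0.4 & 1 & 0 \\ -0.4 & 0 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \approx \begin{bmatrix} 1.19 & 0.24 & 0.24 \\ 0.48 & 1.10 & 0.10 \\ 0.48 & 0.10 & 1.10 \end{bmatrix}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \approx \begin{bmatrix} 0.24 \\ 0.10 \\ 1.10 \end{bmatrix}$$
$\Delta y_{MX} > 1$, $\Delta y_{US} > 0.2$, and there is a (weak but positive) effect for Canada.
The impulse spills over to the related units, and then to their own related units, etc. (including the feedback into the impulse region).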
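The two multiplier calculations for the Canada, USA, Mexico example can be reproduced in a few lines of base R. The object names are chosen for this sketch; the numbers are those of the example.

```r
# Weights from the example, rows and columns ordered US, CA, MX.
W <- matrix(c(0, 0.5, 0.5,
              1, 0,   0,
              1, 0,   0), nrow = 3, byrow = TRUE,
            dimnames = list(c("US", "CA", "MX"), c("US", "CA", "MX")))
lambda <- 0.4
u <- c(US = 0, CA = 0, MX = 1)              # unit shock in Mexico

M_local  <- diag(3) + lambda * W            # I + lambda W
M_global <- solve(diag(3) - lambda * W)     # (I - lambda W)^{-1}

dy_local  <- as.vector(M_local %*% u)
dy_global <- as.vector(M_global %*% u)

# Rows: model variant; columns: US, CA, MX.
# local:  0.2, 0, 1;  global: about 0.24, 0.10, 1.10
round(rbind(local = dy_local, global = dy_global), 2)
```

The Canadian response in the global variant comes entirely from the higher-order terms λ²W², λ³W³, ... that the local variant leaves out.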
3 SLX model
[Figure: flow of impacts in the SLX model]
[Figure: SLX model – relation to other models]
SLX model – specification
Direct impact of causes in the neighbourhood on the outcome in the observed region (spatial spillovers):
y = Xβ + WXθ + ε
Consistent, efficient and unbiased estimation with OLS.
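Since the SLX model only adds the spatially lagged regressors WX to the linear model, it can be estimated with `lm` once WX has been constructed. A minimal sketch in base R follows; the ring-shaped W, the parameter values and the simulated data are assumptions for illustration.

```r
# Hypothetical row-standardised weights: units on a ring, two neighbours each.
ring_weights <- function(n) {
  W <- matrix(0, n, n)
  idx <- seq_len(n)
  W[cbind(idx, c(n, idx[-n]))] <- 0.5
  W[cbind(idx, c(idx[-1], 1))] <- 0.5
  W
}

set.seed(4)
n  <- 50
W  <- ring_weights(n)
x  <- rnorm(n)
Wx <- as.vector(W %*% x)                    # spatial lag of the regressor

# Simulated SLX data with beta = 2 and theta = 0.8 (assumed values).
y <- 1 + 2 * x + 0.8 * Wx + rnorm(n, sd = 0.5)

slx_fit <- lm(y ~ x + Wx)                   # plain OLS on X and WX
coef(slx_fit)                               # intercept, beta, theta
```

The coefficient on `Wx` is the estimate of θ, the effect of the neighbours' x on the own outcome.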
4 Combining point GIS data with regional statistics
GIS data on Biedronka markets
Source: poiplaza.com
POI: points of interest (usually published for users of car GPS navigation devices)
Point data on the location of individual markets in Poland.
Longitude and latitude.
Question: what regional criteria do the managers / owners of Biedronka use when locating their markets?
In other words, is there a relationship between the number of Biedronka markets (per capita) and local socio-economic characteristics from the Local Data Bank, e.g. at the level of poviats?
Aggregation of points on the predefined map
Function classIntervals: watch out for the style parameter. So far, we have been using quantile, i.e. a division into classes of equal count, for the purpose of presentation.
This may not make much sense for a count variable (with many ties around the class limits, colours will be allocated arbitrarily).
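The problem with ties can be seen on a toy count variable. The sketch below uses base R `quantile()` rather than `classIntervals` itself, and the vector of counts is made up for illustration.

```r
# Hypothetical number of stores in 10 regions (a count variable with many ties).
counts <- c(0, 0, 0, 0, 1, 1, 1, 2, 2, 5)

# Quartile-based class limits, as a quantile classification would use.
q_breaks <- quantile(counts, probs = seq(0, 1, by = 0.25))
q_breaks

any(duplicated(q_breaks))   # TRUE: two class limits coincide
table(counts)               # identical counts sit on both sides of a limit
```

With coinciding limits, one class is empty or degenerate, and regions with the same count can end up in different colour classes depending on how boundaries are closed.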
Estimates of 3 models
Fit the following regression models of the number of Biedronka markets per 10 thousand residents on labour market statistics (unemployment, wages):
- linear
- SLM
- SEM
- SLX
Compare the models with respect to the significance of variables, the AIC criterion, the log-likelihood value at the maximum and the presence of remaining spatial autocorrelation. Which model is the best?
Exercise
Derive the likelihood function for the SEM model with local error dependence.
Write R code for the estimation of such a model.
Homework 5
Estimate the SAR, SLX and SEM models for the previously considered X (from HW 3 and 4), y (from HW 1) and W (from HW 2).
Illustrate the residuals from these 3 models on maps and test them for spatial autocorrelation.
Also, in the PDF file, report and comment on the results obtained:
- signs and significance of the causal parameters (β), compared with the linear model considered in HW 3;
- significance of ρ and λ: is ρ or λ negative? Is it interpretable in this case?
- how does θ relate to β in SLX?