GWR and the Spatial Heterogeneity of Hedonic Prices:
Most Promising a Method, Not a Panacea
François Des Rosiers and Marius Thériault, Laval University, Canada
2007 Seminar on Automated Methods of Mass Appraisal and Market Analysis
Delft, The Netherlands, November 1-2, 2007
Purpose and Context of Study
Chapter 6 of «Advances in Mass Appraisal Methods» focuses on the measurement of urban externalities using hedonic modelling
The primary purpose of this paper is to handle the spatial non-stationarity of hedonic house prices, using access to primary school facilities as a case study
While access to primary schools has been shown to influence family household preferences, willingness-to-pay may vary over space
Literature Review – Proximity to Schools
Influence of school quality on house prices (Hendon 1973, Jud and Watts 1981, Jud 1985, Haurin and Brasington 1996, Bogart and Cromwell 1997, Black 1999, Downes and Zabel 2002, Reback 2005, Hayes and Taylor 1996, Kane et al. 2005)
Impact of proximity to schools and of school size on property values (Guntermann and Colwell 1983, Des Rosiers et al. 2001, Thériault et al. 2005)
Optimal school size (365 pupils); optimal distance to nearest school (400 metres) (Des Rosiers et al. 2001)
Impact of urban environment (type of road, urban form
Literature Review – Regression methods
The OLS method addresses neither the non-stationarity of shadow prices nor the presence of spatial autocorrelation among residuals (Getis and Ord 1992, Dubin 1998)
Improvements to OLS: Casetti’s (1972) expansion method (Can 1990, Thériault et al. 2003); Spatial Autoregressive Regression – SAR (Anselin 1995); Geographically Weighted Regression – GWR (Brunsdon et al. 1996)
GWR outperforms most other global and local approaches (Farber & Yates 2006)
Database Organization
Study based on 8,221 single-family detached house sales in Quebec City (former QUC territory) between January 1993 and December 1996
Median price: CAN$89,000 (50,000-250,000)
Information on major property attributes, local services, time trend, socioeconomic and household structure profiles (1996), regional accessibility index, land use density
Four school-related control variables: school board dummy; nearest primary school is English speaking; nearest primary school is also a high school; school size
Analytical Approach
The GWR procedure consists in performing a series of local regressions, thereby calibrating a specific hedonic model for every property in the database. Each local regression is calibrated using all data within a given kernel around this point, although with a weight that decreases with distance. The spatial kernel is either “fixed” (constant radius) or “adaptive” to local land use densities (constant number of neighbours).

$$Y_i = \beta_0(x_i, y_i) + \beta_1(x_i, y_i)\,X_{i1} + \beta_2(x_i, y_i)\,X_{i2} + \dots + \beta_p(x_i, y_i)\,X_{ip} + \varepsilon_i$$

$(x_i, y_i)$ = coordinates of the ith point in space, with i = 1 to N
$\beta_p(x_i, y_i)$ = estimated parameter for variable p at the regression point
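The local calibration described above can be illustrated with a minimal sketch. It assumes a Gaussian distance-decay weight and an adaptive bandwidth set to the distance of the k-th closest observation; the slides do not state which kernel function was used, so both choices, as well as the names `gwr`, `solve` and `n_neighbours`, are illustrative only. Each regression point gets its own weighted least-squares fit.

```python
import math

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix [A | b]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def gwr(coords, X, y, n_neighbours):
    """Return one coefficient vector [b0, b1, ..., bp] per observation."""
    n = len(y)
    rows = [[1.0] + list(x) for x in X]  # prepend the intercept column
    p = len(rows[0])
    betas = []
    for (u, v) in coords:
        # Distances from the regression point (u, v) to every observation.
        d = [math.hypot(a - u, b - v) for (a, b) in coords]
        # Adaptive bandwidth (assumption): k-th smallest distance, self included.
        h = max(sorted(d)[n_neighbours - 1], 1e-12)
        # Gaussian weights (assumption): decrease smoothly with distance.
        w = [math.exp(-0.5 * (dj / h) ** 2) for dj in d]
        # Weighted normal equations (X'WX) beta = X'Wy.
        xtwx = [[sum(w[j] * rows[j][a] * rows[j][c] for j in range(n))
                 for c in range(p)] for a in range(p)]
        xtwy = [sum(w[j] * rows[j][a] * y[j] for j in range(n)) for a in range(p)]
        betas.append(solve(xtwx, xtwy))
    return betas

# Synthetic data (not from the study): a 5 x 5 grid with a spatially constant
# relationship y = 2 + 3*x1 - x2, so every local fit should recover the same values.
coords = [(i % 5, i // 5) for i in range(25)]
X = [[(7 * i) % 11, (3 * i) % 5] for i in range(25)]
y = [2.0 + 3.0 * x1 - 1.0 * x2 for x1, x2 in X]
betas = gwr(coords, X, y, n_neighbours=10)
```

With real sales data the local vectors in `betas` would differ from point to point, and it is their spread that the stationarity tests later in the deck examine.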
Research Hypotheses
Hypothesis H1: Access to, and proximity of, a primary school is a positive externality which should enhance the value of a nearby residence (negatively signed continuous variable)
Hypothesis H2: Distance dummy parameters are expected to be statistically significant and positively signed, save for the nearest buffer (negative proximity effects)
Hypothesis H3: The location rent generated by access to, and proximity of, a primary school is expected to decrease with distance
Hypothesis H4: Space-related attributes are expected to display significant spatial drifts
Hypothesis H5: GWR substantially reduces spatial
Main Regression Findings - OLS
Model 1 (Base): N = 8,221, K = 29
Adj. R-sq = 0.784, F = 1,064.044 (Sig. = 0.000), SEE = 0.1380
Coefficients (dependent variable: LN_SALEPRICE). B and Std. Error: unstandardized coefficients; Beta: standardized coefficient; VIF: collinearity statistic

Variable                     B  Std. Error    Beta         t    Sig.    VIF
(Constant)              10.642       0.042           253.030   0.000
LivArea_m2               0.004       0.000   0.462    52.665   0.000   2.93
lnLotSize                0.063       0.006   0.071    11.346   0.000   1.47
AppAge                  -0.010       0.000  -0.418   -43.643   0.000   3.48
Cottage                 -0.039       0.005  -0.059    -7.238   0.000   2.50
Quality                  0.073       0.007   0.058    10.446   0.000   1.19
Finbasment               0.057       0.003   0.095    16.945   0.000   1.20
SoftFacing51            -0.032       0.004  -0.052    -9.177   0.000   1.20
Fireplace                0.044       0.004   0.069    11.887   0.000   1.30
HiQualFloor              0.032       0.003   0.053     9.395   0.000   1.22
HrdWoodStair             0.036       0.005   0.047     7.569   0.000   1.49
SplAttGar                0.101       0.008   0.072    13.316   0.000   1.13
DblAttGar                0.057       0.010   0.030     5.663   0.000   1.09
SplDetGar                0.057       0.006   0.056    10.359   0.000   1.10
DblDetGar                0.071       0.007   0.052     9.660   0.000   1.08
WaterSewer               0.154       0.015   0.055    10.210   0.000   1.09
NbMonthJan_9396         -0.001       0.000  -0.023    -4.424   0.000   1.05
AvgIncome_1K             0.004       0.000   0.166    21.503   0.000   2.26
PercChild0_9years       -0.004       0.001  -0.071    -8.216   0.000   2.84
PercChildFam            -0.002       0.000  -0.058    -7.433   0.000   2.33
RegAcc                   0.076       0.003   0.233    23.175   0.000   3.83
OldSuburbs              -0.034       0.006  -0.056    -5.543   0.000   3.91
NewSuburbs              -0.034       0.008  -0.036    -4.217   0.000   2.79
ScatrdSettlmt           -0.007       0.008  -0.012    -0.941   0.347   6.17
Decouvreurs              0.083       0.006   0.126    14.542   0.000   2.85
PremSeigneuries          0.020       0.004   0.032     5.096   0.000   1.52
EnglishSchool            0.054       0.010   0.030     5.121   0.000   1.27
HighSchool               0.019       0.007   0.017     2.972   0.003   1.22
SchoolSize              -0.004       0.001  -0.022    -3.389   0.001   1.67
• School board descriptors act as proxies for housing submarkets
• Negative proximity effects tend to increase with school size
• An English-speaking nearest primary school raises house values by some 5.4%
• Where the nearest primary school also provides high school education, the market premium is about 1.9%
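The percentages quoted in the bullets above read the dummy coefficients of the log-price model directly as percentage effects. A short worked check, assuming the usual exact transformation for a dummy variable in a semi-log model, 100 × (exp(B) − 1), shows how close that reading is. The function name `pct_effect` is illustrative.

```python
import math

def pct_effect(coef):
    """Exact percentage price change implied by a dummy coefficient in a log-price model."""
    return (math.exp(coef) - 1.0) * 100.0

english = pct_effect(0.054)  # EnglishSchool coefficient from the OLS table
high = pct_effect(0.019)     # HighSchool coefficient from the OLS table
print(f"EnglishSchool: {english:.2f}%  HighSchool: {high:.2f}%")
```

For coefficients this small the exact effect is only marginally above the coefficient itself, so the approximation used on the slide is harmless.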
Main Regression Findings - OLS
• The driving time coefficient (-0.017) is much larger in magnitude than the walking time coefficient (-0.001), owing to the greater distance covered by car
Model 2 (WT): N = 8,221, K = 30
Adj. R-sq = 0.786, F = 1,040.094 (Sig. = 0.000), SEE = 0.1373
Coefficients (dependent variable: LN_SALEPRICE)

Variable                     B  Std. Error    Beta         t    Sig.    VIF
(Constant)              10.669       0.042           254.238   0.000
OldSuburbs              -0.026       0.006  -0.044    -4.280   0.000   3.98
NewSuburbs              -0.028       0.008  -0.030    -3.449   0.001   2.81
ScatrdSettlmt            0.002       0.008   0.003     0.259   0.796   6.28
Decouvreurs              0.079       0.006   0.120    13.863   0.000   2.87
PremSeigneuries          0.021       0.004   0.034     5.399   0.000   1.52
EnglishSchool            0.055       0.010   0.030     5.299   0.000   1.27
HighSchool               0.019       0.006   0.016     2.900   0.004   1.22
SchoolSize              -0.002       0.001  -0.015    -2.256   0.024   1.70
Walking_Time            -0.001       0.000  -0.051    -8.971   0.000   1.26
Model 3 (DT): N = 8,221, K = 30
Adj. R-sq = 0.785, F = 1,037.280 (Sig. = 0.000), SEE = 0.1375
Coefficients (dependent variable: LN_SALEPRICE)

Variable                     B  Std. Error    Beta         t    Sig.    VIF
(Constant)              10.667       0.042            253.84   0.000
OldSuburbs              -0.028       0.006  -0.046     -4.49   0.000   3.97
NewSuburbs              -0.029       0.008  -0.030     -3.54   0.000   2.81
ScatrdSettlmt            0.001       0.008   0.001      0.07   0.942   6.27
Decouvreurs              0.081       0.006   0.123     14.18   0.000   2.86
PremSeigneuries          0.022       0.004   0.036      5.67   0.000   1.53
EnglishSchool            0.056       0.010   0.031      5.36   0.000   1.27
HighSchool               0.017       0.006   0.015      2.66   0.008   1.22
SchoolSize              -0.003       0.001  -0.017     -2.48   0.013   1.69
Driving_Time            -0.017       0.002  -0.045     -7.93   0.000   1.21
Main Regression Findings - OLS
• Interactive time-distance coefficients yield results identical to model 3.
Model 4 (ITD) - Includes Interactive Time and Distance Dummies
N = 8,221, K = 31
Adj. R-sq = 0.785, F = 1,003.448 (Sig. = 0.000), SEE = 0.1375
Coefficients (dependent variable: LN_SALEPRICE)

Variable                     B  Std. Error    Beta         t    Sig.    VIF
(Constant)              10.663       0.042           253.974   0.000
OldSuburbs              -0.027       0.006  -0.045    -4.394   0.000  3.980
NewSuburbs              -0.029       0.008  -0.031    -3.554   0.000  2.820
ScatrdSettlmt            0.002       0.008   0.003     0.216   0.829  6.292
Decouvreurs              0.079       0.006   0.120    13.831   0.000  2.888
PremSeigneuries          0.021       0.004   0.035     5.475   0.000  1.525
EnglishSchool            0.055       0.010   0.030     5.266   0.000  1.272
HighSchool               0.019       0.006   0.016     2.861   0.004  1.221
SchoolSize              -0.002       0.001  -0.016    -2.366   0.018  1.698
WalkTime_Less1600m      -0.001       0.000  -0.024    -3.565   0.000  1.804
DrivTime_Beyond1600m    -0.017       0.002  -0.059    -8.015   0.000  2.069

Model 5 (BUF) - Includes Distance Buffers
N = 8,221, K = 35
Adj. R-sq = 0.785, F = 886.338 (Sig. = 0.000), SEE = 0.1374
Coefficients (dependent variable: LN_SALEPRICE)

Variable                     B  Std. Error    Beta         t    Sig.    VIF
(Constant)              10.582       0.043           247.296   0.000  0.000
OldSuburbs              -0.027       0.006  -0.045    -4.405   0.000  3.985
NewSuburbs              -0.029       0.008  -0.030    -3.510   0.000  2.817
ScatrdSettlmt            0.002       0.008   0.002     0.190   0.849  6.279
Decouvreurs              0.083       0.006   0.127    14.684   0.000  2.868
PremSeigneuries          0.022       0.004   0.036     5.678   0.000  1.531
EnglishSchool            0.059       0.010   0.033     5.630   0.000  1.277
HighSchool               0.017       0.006   0.015     2.696   0.007  1.222
SchoolSize              -0.002       0.001  -0.014    -2.095   0.036  1.717
Buf_0_250                0.068       0.010   0.066     6.896   0.000  3.467
Buf_251_500              0.060       0.009   0.088     6.681   0.000  6.596
Buf_501_750              0.053       0.009   0.079     6.048   0.000  6.490
Buf_751_1000             0.038       0.009   0.049     4.236   0.000  5.140
Buf_1001_1300            0.040       0.009   0.043     4.252   0.000  3.954
Buf_1301_1600            0.027       0.010   0.022     2.663   0.008  2.698
• As expected, the magnitude of the buffer coefficients
decreases with distance from 0.068 to 0.027 (with an
Spatial Autocorrelation Diagnostics for OLS Models (GEODA software)
• Spatial dependence is still present in the models’ residuals (Moran's I between 0.0613 and 0.0628) and is significant at the 0.001 level.

Variable        Global Moran's I*   Sig.
lnSaleprice                0.3106   < 0.001
Pre_Base (1)               0.3558   < 0.001
Res_Base (1)               0.0628   < 0.001
Pre_WT (2)                 0.3567   < 0.001
Res_WT (2)                 0.0621   < 0.001
Pre_DT (3)                 0.3558   < 0.001
Res_DT (3)                 0.0628   < 0.001
Pre_ITD (4)                0.3563   < 0.001
Res_ITD (4)                0.0625   < 0.001
Pre_BUF (5)                0.3565   < 0.001
Res_BUF (5)                0.0613   < 0.001
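For readers unfamiliar with the statistic reported in this table, the following is a minimal sketch of a global Moran's I. It assumes binary distance-band weights (1 for pairs within `radius`, 0 otherwise); the weighting scheme GEODA applied in the study is not described on the slide, and the function name and toy data are illustrative.

```python
import math

def morans_i(coords, values, radius):
    """Global Moran's I with binary distance-band weights (assumed scheme)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]  # deviations from the mean
    cross = 0.0   # sum of w_ij * z_i * z_j
    w_sum = 0.0   # sum of all weights
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
            if dist <= radius:
                cross += z[i] * z[j]
                w_sum += 1.0
    return (n / w_sum) * cross / sum(zi * zi for zi in z)

# Toy data (not from the study): six points on a line, one unit apart.
line = [(float(i), 0.0) for i in range(6)]
clustered = morans_i(line, [1, 1, 1, 5, 5, 5], radius=1.5)    # similar values side by side
alternating = morans_i(line, [1, 5, 1, 5, 1, 5], radius=1.5)  # neighbours always differ
```

Values near zero, such as the residual indices in the table, indicate weak autocorrelation; with 8,221 observations even a weak index can still be statistically significant.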
Main Regression Findings - GWR
• GWR outperforms the OLS method, both in terms of adjusted R-square (0.810) and of AIC (Akaike Information Criterion) value (-10,191)
• Main differences in the parameter estimates are observed with space-related descriptors
• Interactive walking and driving time median estimate values (-0.001 and -0.013) corroborate OLS results
• Findings suggest that, as hypothesized, the valuation of access to primary schools is not stationary over space
Overall Model Performances
Number of nearest neighbours = 4,668; Adj. R-sq = 0.810
Akaike Information Criterion (AIC) = -10,190.89; AIC for OLS = -9,259.29

ANOVA
Source                 SS        DF       MS         F
OLS Residuals       154.8     31
GWR Residuals       135.4   8104.76   0.0167   13.6826
GWR Improvement      19.5     85.24   0.2285
(Crit.Val.: 1.4526)
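The F statistic in the ANOVA block can be re-derived from the other figures: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the GWR improvement mean square to the GWR residual mean square. The sketch below uses the rounded sums of squares printed on the slide, so the result matches the reported 13.6826 only approximately. The function name is illustrative.

```python
def gwr_anova_f(ss_improvement, df_improvement, ss_residual, df_residual):
    """Return (MS improvement, MS residual, F) for the GWR-versus-OLS ANOVA."""
    ms_improvement = ss_improvement / df_improvement
    ms_residual = ss_residual / df_residual
    return ms_improvement, ms_residual, ms_improvement / ms_residual

# Figures from the GWR-ITD ANOVA block (sums of squares are rounded on the slide).
ms_imp, ms_res, f_stat = gwr_anova_f(19.5, 85.24, 135.4, 8104.76)
critical_value = 1.4526
print(f"MS improvement = {ms_imp:.4f}, MS residual = {ms_res:.4f}, F = {f_stat:.2f}")
print("GWR improvement significant:", f_stat > critical_value)
```

The same arithmetic applies to the GWR-BUF model, where the slide reports F = 12.1852 against a critical value of 1.4244.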
Variable                     Minimum  Lwr Quartile        Median  Upr Quartile       Maximum
Intercept                    10.4170       10.5370       10.5865       10.6339       10.7623
LivArea_m2                    0.0036        0.0038        0.0040        0.0040        0.0042
lnLotSize                     0.0492        0.0695        0.0785        0.0897        0.1151
AppAge                       -0.0121       -0.0115       -0.0113       -0.0100       -0.0086
Cottage                      -0.0530       -0.0341       -0.0294       -0.0267        0.0061
Quality                       0.0555        0.0616        0.0763        0.0808        0.0971
Finbasment                    0.0436        0.0475        0.0546        0.0602        0.0732
SoftFacing51                 -0.0540       -0.0378       -0.0326       -0.0300       -0.0276
Fireplace                     0.0317        0.0377        0.0412        0.0446        0.0605
HiQualFloor                   0.0200        0.0254        0.0293        0.0322        0.0362
HrdWoodStair                  0.0299        0.0318        0.0401        0.0440        0.0506
SplAttGar                     0.0462        0.0879        0.0931        0.0989        0.1180
DblAttGar                     0.0237        0.0505        0.0626        0.0825        0.0981
SplDetGar                     0.0298        0.0491        0.0528        0.0549        0.0649
DblDetGar                     0.0366        0.0670        0.0803        0.0868        0.0983
WaterSewer                    0.0379        0.1228        0.1554        0.1750        0.2352
NbMonthJan_9396              -0.0011       -0.0007       -0.0005       -0.0004       -0.0002
AvgIncome_1K                  0.0027        0.0033        0.0038        0.0043        0.0047
PercChild0_9years            -0.0082       -0.0061       -0.0051       -0.0030        0.0000
PercChildFam                 -0.0047       -0.0038       -0.0024       -0.0015        0.0005
RegAcc                        0.0343        0.0487        0.0554        0.0785        0.1480
OldSuburbs                   -0.0344       -0.0212       -0.0100        0.0008        0.0064
NewSuburbs                   -0.0790       -0.0364       -0.0177        0.0128        0.0437
ScatrdSettlmt                -0.0362       -0.0041        0.0103        0.0466        0.1059
Decouvreurs                  -0.0672        0.0488        0.0975        0.1855        0.3535
PremSeigneuries              -0.1970       -0.0226       -0.0039        0.0066        0.0429
EnglishSchool                -0.0488        0.0000        0.0119        0.2648        0.3583
HighSchool                   -0.0147        0.0254        0.0376        0.0531        0.1177
SchoolSize                   -0.0114       -0.0030        0.0023        0.0047        0.0087
WalkTime_Less1600m           -0.0028       -0.0012       -0.0009       -0.0007       -0.0004
DrivTime_Beyond1600m         -0.0440       -0.0209       -0.0131       -0.0088        0.0009
Main Regression Findings - GWR
• Median parameter estimates are positively signed and uniformly decrease with distance
• However, they display substantially lower magnitudes than the OLS-derived coefficients (Model 5)
• In summary, H1, H2 and H3 (access to, and proximity of, a primary school positively affect residential values) are validated using GWR
• The GWR method strongly suggests that most non-building attributes, including school-related and access-to-school variables, are not homogeneously valued by households over space (H4)
Model 7 – GWR-BUF Overall Performance & Regression Coefficients

Overall Model Performances
Number of nearest neighbours = 4,668; Adj. R-sq = 0.810
Akaike Information Criterion (AIC) = -10,191.95; AIC for OLS = -9,261.26

ANOVA
Source                 SS        DF       MS         F
OLS Residuals       154.7     35
GWR Residuals       134.8   8088.15   0.0167   12.1852
GWR Improvement      19.9     97.85   0.2031
(Crit.Val.: 1.4244)
GWR local R-Squares
Map 1 : ITD Model (computed for each case)
Monte-Carlo Spatial Stationarity Test for GWR Coefficients (non-stationary where significant)

                        GWR-ITD        GWR-BUF
Parameter               P-value  Sig.  P-value  Sig.
Intercept                 0.630  n/s     0.610  n/s
LivArea_m2                0.450  n/s     0.400  n/s
lnLotSize                 0.100  n/s     0.090  n/s
AppAge                    0.000  ***     0.000  ***
Cottage                   0.390  n/s     0.430  n/s
Quality                   0.570  n/s     0.470  n/s
Finbasment                0.070  n/s     0.090  n/s
SoftFacing51              0.060  n/s     0.070  n/s
Fireplace                 0.420  n/s     0.480  n/s
HiQualFloor               0.530  n/s     0.470  n/s
HrdWoodStair              0.360  n/s     0.380  n/s
SplAttGar                 0.560  n/s     0.540  n/s
DblAttGar                 0.290  n/s     0.280  n/s
SplDetGar                 0.880  n/s     0.840  n/s
DblDetGar                 0.230  n/s     0.340  n/s
WaterSewer                0.240  n/s     0.130  n/s
NbMonthJan_9396           0.060  n/s     0.060  n/s
AvgIncome_1K              0.010  **      0.000  ***
PercChild0_9years         0.000  ***     0.000  ***
PercChildFam              0.000  ***     0.000  ***
RegAcc                    0.000  ***     0.000  ***
OldSuburbs                0.240  n/s     0.200  n/s
NewSuburbs                0.000  ***     0.000  ***
ScatrdSettlmt             0.000  ***     0.000  ***
Decouvreurs               0.000  ***     0.000  ***
PremSeigneuries           0.000  ***     0.000  ***
EnglishSchool             0.000  ***     0.000  ***
HighSchool                0.000  ***     0.000  ***
SchoolSize                0.000  ***     0.000  ***
WalkTime_Less1600m        0.020  *
DrivTime_Beyond1600m      0.000  ***
Buf_0_250                                0.020  *
Buf_251_500                              0.060  n/s
Buf_501_750                              0.050  *
Buf_751_1000                             0.010  **
Buf_1001_1300                            0.000  ***
Buf_1301_1600                            0.060  n/s

*** significant at 0.1% level; ** significant at 1% level; * significant at 5% level; n/s not significant
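The slides report only the outcome of the Monte-Carlo test. As a rough illustration of how such a test is commonly built (an assumption here, not a description of the software used in the study), the spread of the local estimates is compared with the spread obtained after randomly reassigning observations to locations. The sketch is simplified to a single regressor with a Gaussian adaptive kernel; all names and the synthetic data are hypothetical.

```python
import math
import random

def local_slopes(coords, x, y, k):
    """Locally weighted slope of y on x at every location (single-regressor GWR)."""
    slopes = []
    for (u, v) in coords:
        d = [math.hypot(a - u, b - v) for (a, b) in coords]
        h = max(sorted(d)[k - 1], 1e-12)              # adaptive bandwidth (assumption)
        w = [math.exp(-0.5 * (dj / h) ** 2) for dj in d]
        sw = sum(w)
        mx = sum(wj * xj for wj, xj in zip(w, x)) / sw
        my = sum(wj * yj for wj, yj in zip(w, y)) / sw
        sxy = sum(wj * (xj - mx) * (yj - my) for wj, xj, yj in zip(w, x, y))
        sxx = sum(wj * (xj - mx) ** 2 for wj, xj in zip(w, x))
        slopes.append(sxy / sxx)
    return slopes

def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def stationarity_p_value(coords, x, y, k, n_perm=99, seed=0):
    """Share of permutations whose local-slope variance reaches the observed one."""
    rng = random.Random(seed)
    observed = variance(local_slopes(coords, x, y, k))
    shuffled = list(coords)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between observations and locations
        if variance(local_slopes(shuffled, x, y, k)) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# Synthetic data (not from the study): the slope of y on x grows from west to east.
coords = [(j % 6, j // 6) for j in range(36)]
x = [((7 * j) % 11) + 1.0 for j in range(36)]
y = [coords[j][0] * x[j] for j in range(36)]
p_value = stationarity_p_value(coords, x, y, k=12)
```

A small p-value means the observed spatial variation of a coefficient is larger than what random placement would typically produce, which is how the starred entries in the table above are read.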
Spatial Distribution of Walking and Driving Time Coefficients
Map 3 : WalkTime_Less1600m Significant Coefficients
GWR & Spatial Dependence
Variable        Global Moran's I*   Sig.
Pre_ITD (6)                0.3628   < 0.001
Res_ITD (6)                0.0101   < 0.001
Pre_BUF (7)                0.3634   < 0.001
Res_BUF (7)                0.0097   < 0.001
* Computed for a 3,000 metres kernel
Figure 1 : Local Spatial Autocorrelation for GWR-ITD Model (MapStat software)
Figure 2 : Local Spatial Autocorrelation for GWR-BUF Model (MapStat software)
• Global Moran’s I for GWR residuals (about 0.01) is much lower than the index obtained for OLS models (0.06), although it remains statistically significant at the 0.001 level
• LISA graphs suggest that local spatial autocorrelation is no longer significant beyond roughly 750 metres from any property
• Research hypothesis
Discussion on GWR Limitations
Space-related dummy variables, and continuous variables that entail some spatial discontinuity, often cause the distribution of GWR regression coefficients to depart from the Gaussian distribution required to generate reliable spatial stationarity tests
Moreover, local regressions are calibrated with dummy descriptors (such as school boards) that, over large areas, have zero variance
Discussion on GWR Limitations
Map 5 : Significant GWR-ITD Coefficients for the EnglishSchool Variable (Green-shaded areas indicate English school concentration in the region)
Figure 3 : Distribution of GWR-ITD Regression Coefficients (frequency histogram; Mean = 0.1026, Std. Dev. = 0.1277, N = 8,221)
• Significant regression coefficients for the EnglishSchool descriptor are found in the north-eastern portion of the territory whereas the only English schools in Quebec City are concentrated in the green-shaded areas,
Discussion on GWR Limitations
Map 6 : Walking Times for Homes Located Within 1,600 m of Nearest School
Figure 4 : Distribution of GWR-ITD Regression Coefficients
• Positive and significant coefficients are found in the western part of the agglomeration (north of Saint-Augustin-de-Desmaures) although no effective walking time is computed for that area (Map 3)
Conclusion and Suggestions for Further Research
While GWR is a most promising regression method, it should not be applied indiscriminately
The results it yields are heavily dependent upon the specification of the hedonic model, which must therefore be adapted accordingly
For instance, accessibility to primary schools could be measured using the number of schools available from any property, based on some density-adaptive kernel
Alternatively, other approaches may be better suited to deal with the above-mentioned issues, such as Casetti’s expansion method or some multi-level modelling procedure.