Panel Data Econometrics

Katarzyna Bech

18.05.2017

(2)

What is ’Panel Data’?

Traffic fatality rate for the 48 contiguous U.S. states for each of the seven years from 1982 to 1988:

year   Alabama    Arkansas   ...   Wyoming
1982   0.000213   0.00025    ...   0.000394
1983   0.000235   0.000227   ...   0.000335
...    ...        ...        ...   ...
1988   0.000249   0.000271   ...   0.000324

Two-dimensional: observations on different objects at different points in time.

Both time-series and cross sectional data can be treated as special cases of panels.

Interesting: time does not have to be the second dimension!

(3)

A couple of definitions

N: the number of (cross-sectional) objects (individuals, organizations, countries) in the sample.

T: the number of time periods (years, quarters, months, days) in the sample, i.e. the number of waves.

If N = 1 and T is large: time series.

If T = 1 and N is large: cross section.

Panel data are those with N > 1 and T > 1.

Balanced panel: every unit i = 1, ..., N is observed in the same T time periods.

Short panel: N > T. Long panel: T > N.

Micro panel: N >> T. Macro panel: N ≈ T.
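The definitions above can be checked mechanically. The following is a minimal sketch, assuming the panel is stored as a list of (unit, period) keys with no duplicates; the function name `describe_panel` and the label "square" for N = T are illustrative choices, not part of the lecture.

```python
def describe_panel(obs):
    """Classify a panel from a list of (unit, period) observation keys.

    Assumes each (unit, period) pair appears at most once.
    """
    units = sorted({u for u, _ in obs})
    periods = sorted({t for _, t in obs})
    n, t = len(units), len(periods)

    # Count how many periods each unit is observed in.
    counts = {u: 0 for u in units}
    for u, _ in obs:
        counts[u] += 1

    # Balanced: every unit is observed in all T periods.
    balanced = all(c == t for c in counts.values())

    if n > t:
        shape = "short"
    elif t > n:
        shape = "long"
    else:
        shape = "square"  # hypothetical label for N == T
    return {"N": n, "T": t, "balanced": balanced, "shape": shape}


# The traffic fatality example: 48 states observed in 1982-1988.
traffic = [(state, year) for state in range(48) for year in range(1982, 1989)]
print(describe_panel(traffic))
```

Dropping a single state-year from `traffic` leaves N and T unchanged but makes the panel unbalanced.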

(4)

Why ’Panel data’?

Consider an empirical application: what are the effects of alcohol taxes and drunk driving laws on traffic fatalities?

A panel data set lets us control for unobserved variables that differ from one state to the next but do not change over time.

It also allows us to control for (unobserved) variables that vary through time but do not vary across states.

Another advantage is increased precision in estimation, obtained by pooling several time periods of data for each individual.

Panel data also give "more variability, less collinearity among variables, more degrees of freedom and more efficiency".

Better suited to study the dynamics of change.

(5)

Some well known examples of panel data sets:

The Panel Study of Income Dynamics (PSID): constructed by the Institute for Social Research (University of Michigan), collected from 1968 (each year), about 5,000 families, socioeconomic and demographic variables

Survey of Income and Program Participation (SIPP): conducted by the Bureau of the Census of the US Department of Commerce, four times a year, individual level, economic conditions

The German Socio-Economic Panel (GESOEP): every year from 1984 to 2014, individual level

National Longitudinal Survey of Youth (NLSY): collected by the US Department of Labour, individual level, labour market activities

Other: LFS, BHPS, CFPS

(6)

Problems solved by panels:

Labour supply: Ben-Porath (1973) observes that at a certain point in time, in a cohort of women, 50% may appear to be working. It is ambiguous whether this implies that, in this cohort one-half of women on average will be working or that the same one-half will be working in every period.

Production function: inability to separate economies of scale and technological change. Cross-sectional data only provide information about the former; time series muddle the two effects, with no prospect of separation (e.g. it is common to assume CRS in order to reveal the technical change). Greene (1983) uses a panel of a large number of firms over several years and provides estimates of both technological change and economies of scale.

(7)

Illustrative example: charitable giving

Data: 47 individuals over the period 1979-1988. From Frees (2004), Longitudinal and Panel Data: Analysis and Applications in the Social Sciences, Cambridge University Press.

Variables:

Charity: sum of cash contributions

Income: gross income

Price: 1 - marginal income tax rate

Age: dummy, 1 for individuals over 64

MS: dummy, 1 for married

DEPS: number of dependents

From the Panel of Individual Tax Returns.

Goal: study the effect (if any!) of the marginal tax rate on charitable giving. Prior expectations?

(8)

How to estimate the parameters of the charity function? 5 options

Individual time series of charity functions.

Cross-sectional charity functions.

Pooled OLS (constant coefficient model).

Fixed effects model.

Random effects model.

(9)

Option 1: Time series

Model

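\[ C_t = \beta_0 + \beta_1 \mathrm{Age}_t + \beta_2 \mathrm{Income}_t + \beta_3 \mathrm{Price}_t + \beta_4 \mathrm{DEPS}_t + \beta_5 \mathrm{MS}_t + \varepsilon_t \]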

Estimates

i β0 β1 β2 β3 β4 β5

1 35.14 2.70 2.32

2 7.95 0.18 0.05 1.16 0.57

...

47 13.75 1.75 0.40 0.13

(11)

Option 2: Cross section

Model

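\[ C_i = \beta_0 + \beta_1 \mathrm{Age}_i + \beta_2 \mathrm{Income}_i + \beta_3 \mathrm{Price}_i + \beta_4 \mathrm{DEPS}_i + \beta_5 \mathrm{MS}_i + \varepsilon_i \]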

Estimates

t β0 β1 β2 β3 β4 β5

1 7.93 1.17 1.33 0.02 0.11 0.12

2 13.16 1.15 1.08 7.07 0.24 1.05

...

10 9.46 1.77 7.39 0.35 1.92

(13)

Option 3: Pooled OLS

Model

\[ C_{it} = \beta_0 + \beta_1 \mathrm{Age}_{it} + \beta_2 \mathrm{Income}_{it} + \beta_3 \mathrm{Price}_{it} + \beta_4 \mathrm{DEPS}_{it} + \beta_5 \mathrm{MS}_{it} + \varepsilon_{it} \]

Estimate by OLS.

The requirements for unbiasedness and consistency are the same as for standard linear regression on a large cross-section sample (standard Gauss-Markov assumptions).

Important: consistency holds for $N \to \infty$ with $T$ finite ("fixed $T$ asymptotics"). If also $T \to \infty$, treat the problem as a multivariate time series.

Time-series characteristics are irrelevant (the series may be nonstationary).

Because we ignore the information on the structure of the sample, OLS is not efficient.

(14)

Option 3: Pooled OLS

Estimates

β0     β1     β2     β3     β4     β5
4.67   1.55   1.04   0.48   0.18   0.008

Endogeneity?

(15)

Pooled OLS

Heterogeneity bias
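A small numerical illustration may help here. The sketch below uses a made-up two-unit panel (not the charity data) in which both units share the true slope 1 but have very different intercepts, and the unit with the lower intercept has the larger x values. The names `ols_slope` and `panel` are hypothetical.

```python
def ols_slope(x, y):
    """Slope from a simple regression of y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical toy panel: unit -> (x values, y values).
panel = {
    "A": ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),     # y = 0 + 1 * x
    "B": ([4.0, 5.0, 6.0], [-6.0, -5.0, -4.0]),  # y = -10 + 1 * x
}

# Pooled OLS stacks all observations and ignores who they belong to.
pooled_x = [v for x, _ in panel.values() for v in x]
pooled_y = [v for _, y in panel.values() for v in y]
pooled_slope = ols_slope(pooled_x, pooled_y)

# Separate regressions per unit recover the common slope.
unit_slopes = {unit: ols_slope(x, y) for unit, (x, y) in panel.items()}

print("pooled slope:", pooled_slope)
print("unit slopes:", unit_slopes)
```

In this example the pooled slope is negative (about -1.57) although every unit has slope 1: the omitted intercept differences are correlated with x, which is what heterogeneity bias means.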

(16)

Option 4: Fixed effect model

The basic framework for the discussion is the regression model of the form

\[ y_{it} = \alpha_i + \beta' x_{it} + \varepsilon_{it}. \]

The individual effect $\alpha_i$ is constant over $t$ and specific to the individual cross-section unit $i$. The $\alpha_i$ are unknown parameters to be estimated.

It is also possible to allow the slopes to vary across $i$, but this introduces methodological issues and complexity in the calculations. We can go over it next week if you wish.

How to make this model operational?

(17)

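Option 4: Fixed effect model

For each individual we have:

\[ y_i = l \alpha_i + X_i \beta + \varepsilon_i \]

where $l$ is a $T \times 1$ vector of ones.

Collecting all individuals we have:

\[
\begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix}
=
\begin{bmatrix} l & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & l \end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_N \end{bmatrix}
+
\begin{bmatrix} X_1 \\ \vdots \\ X_N \end{bmatrix} \beta
+
\begin{bmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_N \end{bmatrix}
\]

or

\[ y = \begin{bmatrix} d_1 & \cdots & d_N & X \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \end{bmatrix} + \varepsilon \]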

(18)

Option 4: Fixed effect model

Let $D$ be the $NT \times N$ matrix

\[ D = [d_1 \ \cdots \ d_N]. \]

Assembling all $NT$ rows together gives:

\[ y = D\alpha + X\beta + \varepsilon \]

referred to as the Least Squares Dummy Variable Model (LSDV).
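To make the structure of the dummy matrix concrete, here is a minimal sketch that builds D for a small balanced panel. It assumes the observations are stacked unit by unit (all T rows of unit 1, then unit 2, and so on); `dummy_matrix` is a hypothetical helper name.

```python
def dummy_matrix(n_units, n_periods):
    """Return D as a list of N*T rows with N entries each.

    Rows are stacked unit by unit; column i is the dummy d_i,
    equal to 1 on the T rows that belong to unit i.
    """
    return [
        [1 if j == i else 0 for j in range(n_units)]
        for i in range(n_units)
        for _ in range(n_periods)
    ]


# N = 3 units, T = 2 periods gives a 6 x 3 matrix.
D = dummy_matrix(3, 2)
for row in D:
    print(row)
```

Each column sums to T, and any two distinct columns have a zero inner product, so the columns of D are orthogonal.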

(19)

Option 4: Fixed effect model

Why fixed? The intercepts, although different across individuals, do not vary over time (they are time-invariant).

Model:

\[ C_{it} = \beta_{0i} + \beta_1 \mathrm{Age}_{it} + \beta_2 \mathrm{Income}_{it} + \beta_3 \mathrm{Price}_{it} + \beta_4 \mathrm{DEPS}_{it} + \beta_5 \mathrm{MS}_{it} + \varepsilon_{it}. \]

$\beta_{0i}$ controls for unobserved heterogeneity. If this heterogeneity is correlated with the other regressors, pooled OLS is biased (because the heterogeneity is omitted from the pooled model).

Practical issue: if the individual characteristics do not vary enough over time, the FE model might not work.

(20)

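Fixed effect model: differential intercept dummies

Define a dummy variable $D_{1i}$, which takes the value 1 if $i = 1$ and 0 otherwise. Similarly $D_{2i}, D_{3i}, \dots, D_{47i}$.

Then our model might be written as:

\[
\begin{aligned}
C_{it} = {} & \alpha_1 + \alpha_2 D_{2i} + \alpha_3 D_{3i} + \dots + \alpha_{47} D_{47i} \\
& + \beta_1 \mathrm{Age}_{it} + \beta_2 \mathrm{Income}_{it} + \beta_3 \mathrm{Price}_{it} + \beta_4 \mathrm{DEPS}_{it} + \beta_5 \mathrm{MS}_{it} + \varepsilon_{it}.
\end{aligned}
\]

Each individual intercept is then $\beta_{01} = \alpha_1$ and $\beta_{0i} = \alpha_1 + \alpha_i$ for $i \geq 2$. We have a classical model with $K + (N - 1)$ variables.

Remember the dummy variable trap!

If we estimate by OLS, we call the results Least Squares Dummy Variable estimators.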

(21)

Fixed effect model: differential intercept dummies

Problems

The choice of the reference group is typically arbitrary, so the $\alpha$ coefficients have no interesting interpretation unless you estimate the model without an intercept.

Every additional dummy costs you a degree of freedom.

Remember the assumptions on the error term: $\varepsilon_{it} \sim (0, \sigma^2)$. These may have to be modified. For example, you might assume constant variance, or allow for heteroskedasticity and correct the standard errors; you might assume no serial correlation, or allow for some AR structure in the error (and correct the standard errors); you might assume that at any time the error term of one individual is uncorrelated with the errors of the others, or allow for such correlation (and treat the system as a SURE model).

(22)

Fixed effect model: time effect

Similarly, we can have models in which we allow for a specific effect not for objects, but for waves (time):

\[ Y_{it} = \beta_{0t} + \beta_1 X_{it} + \varepsilon_{it}, \]

or even on both dimensions (two-way fixed effect model):

\[ Y_{it} = \beta_{0it} + \beta_1 X_{it} + \varepsilon_{it}. \]

The parameters might be estimated by LSDV in:

\[ Y_{it} = \alpha_1 + \alpha_2 D_{2i} + \dots + \alpha_N D_{Ni} + \gamma_2 B_{2t} + \dots + \gamma_T B_{Tt} + \beta_1 X_{it} + \varepsilon_{it}, \]

where the $D_{ji}$ are individual dummies and the $B_{st}$ are time dummies.

We may also introduce differential slope coefficients by interacting the regressors with the individual dummies; in the charity example this consumes 230 degrees of freedom (46 dummies times 5 regressors). If we additionally interact the time dummies with the five regressors (50 degrees of freedom), we have almost no observations left for meaningful conclusions.

(23)

Fixed effect model: within transformation

We go back to the model:

\[ y = D\alpha + X\beta + \varepsilon. \]

There is an easier way to estimate its parameters than to go through LSDV.

Use the results for partitioned regression and write the OLS estimator of $\beta$ as

\[ \hat{\beta} = [X' M_D X]^{-1} [X' M_D y], \qquad M_D = I - D(D'D)^{-1}D'. \]

This amounts to a least squares regression using the transformed data:

\[ X_* = M_D X \quad \text{and} \quad y_* = M_D y. \]

(24)

Fixed effect model: within transformation

The structure of $D$ is particularly convenient (its columns are orthogonal), so

\[
M_D = \begin{bmatrix} M^0 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & M^0 \end{bmatrix}
\]

where $M^0 = I_T - \frac{1}{T} l l'$.

Premultiplying any $T \times 1$ vector $z_i$ by $M^0$ gives

\[ M^0 z_i = z_i - \bar{z}_i l \]

where the mean $\bar{z}_i$ is taken over the $T$ observations for unit $i$.

This implies that the regression of $M_D y$ on $M_D X$ is equivalent to the regression of $y_{it} - \bar{y}_i$ on $x_{it} - \bar{x}_i$.

(25)

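Fixed effect model: within transformation

We centre the observations around their means (calculated over time):

\[
\begin{aligned}
C_{it} - \bar{C}_i = {} & \beta_1 (\mathrm{Age}_{it} - \overline{\mathrm{Age}}_i) + \beta_2 (\mathrm{Income}_{it} - \overline{\mathrm{Income}}_i) \\
& + \beta_3 (\mathrm{Price}_{it} - \overline{\mathrm{Price}}_i) + \beta_4 (\mathrm{DEPS}_{it} - \overline{\mathrm{DEPS}}_i) \\
& + \beta_5 (\mathrm{MS}_{it} - \overline{\mathrm{MS}}_i) + u_{it}
\end{aligned}
\]

It is called "within" because we consider changes in individual characteristics over time, not across individuals (each period enters as a deviation from the individual's mean).

The estimates show the impact of changes in an individual's characteristics over time.

The LSDV and within-group (WG) estimators are identical, as mathematically both models are exactly the same.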
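The within transformation is easy to carry out by hand for a single regressor. The sketch below is a simplified illustration on made-up data (not the charity data set). It demeans x and y unit by unit, runs OLS without an intercept on the deviations, and then recovers each unit's intercept as the unit mean of y minus the slope times the unit mean of x, which is the standard textbook formula rather than something stated on the slides. The name `within_estimator` is hypothetical.

```python
def within_estimator(panel):
    """Within (fixed effects) estimator for one regressor.

    panel: dict mapping unit -> (list of x values, list of y values).
    Returns (slope, dict of estimated unit intercepts).
    """
    sxy = sxx = 0.0
    means = {}
    for unit, (x, y) in panel.items():
        mx, my = sum(x) / len(x), sum(y) / len(y)
        means[unit] = (mx, my)
        # Accumulate cross-products of deviations from the unit means.
        sxy += sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx += sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    # Recover the individual effects from the unit means.
    intercepts = {u: my - slope * mx for u, (mx, my) in means.items()}
    return slope, intercepts


# Hypothetical toy panel with common slope 0.5 and intercepts 2 and -1.
toy = {
    "A": ([1.0, 2.0, 3.0], [2.5, 3.0, 3.5]),
    "B": ([2.0, 4.0, 6.0], [0.0, 1.0, 2.0]),
}
slope, intercepts = within_estimator(toy)
print(slope, intercepts)
```

Because the toy data contain no noise, the estimator returns the slope and the two intercepts used to generate them.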

(26)

Fixed effect model: within transformation

We pay a price for the simplicity of this approach:

the model has to be estimated without an intercept;

all regressors that do not vary over time for each individual have to be removed (or they will be eliminated automatically);

losing the intercept is fine, but losing important explanatory variables may lead to endogeneity and omitted variable bias.

For obvious reasons, we cannot use the within method when we want to study, for example, wage discrimination.

Within-group estimators are consistent but not efficient: because the variables are expressed as deviations from their means, the variability of these deviations is smaller than in the original data, so the variability of the error term is relatively higher, which leads to higher variance estimates.

(27)

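Aside: between transformation

As an alternative to pooled OLS, we may write the model in terms of averages (across time) for the different individuals, i.e. we estimate

\[ \bar{C}_i = \beta_0 + \beta_1 \overline{\mathrm{Age}}_i + \beta_2 \overline{\mathrm{Income}}_i + \beta_3 \overline{\mathrm{Price}}_i + \beta_4 \overline{\mathrm{DEPS}}_i + \beta_5 \overline{\mathrm{MS}}_i + u_i \]

It is called "between" because we consider differences in 'average' characteristics between individuals.

Use it if you want to measure, e.g., the gender wage gap (GWG) or the impact of any characteristic that is time-invariant.

Problem: we lose observations (only N are left).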

(28)

Useful variance analysis

Total variation measures how far each observation lies from the overall mean, calculated over both dimensions, T and N.

We can decompose it into a part coming from changes over time (within) and a part coming from differences between individuals (between):

\[ \sum_i \sum_t (x_{it} - \bar{x})^2 = \sum_i \sum_t (x_{it} - \bar{x}_i)^2 + \sum_i T_i (\bar{x}_i - \bar{x})^2 \]

Total variation = within variation + between variation
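The decomposition can be verified numerically. The following sketch uses made-up numbers and an unbalanced panel to show why the between term is weighted by the number of observations per unit; `decompose` is a hypothetical helper name.

```python
def decompose(panel):
    """Split total variation of x into within and between parts.

    panel: dict mapping unit -> list of x values (lengths may differ).
    Returns (total, within, between).
    """
    all_x = [v for xs in panel.values() for v in xs]
    grand_mean = sum(all_x) / len(all_x)
    total = sum((v - grand_mean) ** 2 for v in all_x)

    within = between = 0.0
    for xs in panel.values():
        unit_mean = sum(xs) / len(xs)
        within += sum((v - unit_mean) ** 2 for v in xs)
        between += len(xs) * (unit_mean - grand_mean) ** 2
    return total, within, between


# Hypothetical unbalanced panel: unit A has 3 periods, unit B has 2.
total, within, between = decompose({"A": [1.0, 2.0, 3.0], "B": [6.0, 8.0]})
print(total, within, between)
```

For these numbers the total variation is 34, of which 4 is within and 30 is between, so most of the variation in x comes from differences across units.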

(29)

Fixed effect model: better than pooled OLS?

$H_0: \alpha_1 = \alpha_j$ for all $j \geq 2$ ($N - 1$ restrictions)

Unrestricted model: FE; restricted model: pooled OLS

\[ F = \frac{(RRSS - URSS)/m}{URSS/[n - (k+1) - m]} \sim F_{m,\; n-(k+1)-m} \]

where $m = N - 1$ and $n - (k+1) - m = N(T-1) - k$.
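The test statistic is simple arithmetic once the two residual sums of squares are available. A minimal sketch, assuming a balanced panel so that n = NT; the function name and the RSS values in the usage lines are made up for illustration, while the panel dimensions are those of the charity example (N = 47, T = 10, k = 5).

```python
def fe_f_statistic(rrss, urss, n_units, n_periods, k):
    """F statistic for H0: all individual intercepts are equal.

    rrss: residual sum of squares of the restricted model (pooled OLS).
    urss: residual sum of squares of the unrestricted model (FE).
    k:    number of slope coefficients.
    Returns (F, numerator df, denominator df).
    """
    m = n_units - 1                    # number of restrictions
    df2 = n_units * (n_periods - 1) - k  # equals NT - (k + 1) - m
    f = ((rrss - urss) / m) / (urss / df2)
    return f, m, df2


# Degrees of freedom for the charity panel; RSS values are hypothetical.
f, m, df2 = fe_f_statistic(rrss=500.0, urss=300.0, n_units=47, n_periods=10, k=5)
print(f, m, df2)
```

For the charity panel the statistic is compared with an F distribution with 46 and 418 degrees of freedom; a large value favours the FE model over pooled OLS.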

(30)

Fixed effect model: diff in diff (FD, First Difference)

LSDV and WG estimators are not the only ways of dealing with FE models.

If the model is true in period $t$,

\[ Y_{it} = \beta_{0i} + \beta_1 X_{it} + \varepsilon_{it}, \]

it is also true in period $t - 1$:

\[ Y_{i,t-1} = \beta_{0i} + \beta_1 X_{i,t-1} + \varepsilon_{i,t-1}. \]

Subtracting one from the other we get:

\[ \Delta Y_{it} = \beta_1 \Delta X_{it} + u_{it}, \]

where $u_{it} = \Delta \varepsilon_{it}$. This transformation eliminates all variables that are time-invariant, including the individual $\beta_{0i}$ (first "diff"). Additionally, if the model has a linear trend ($t$ as a regressor), it also gets eliminated (second "diff").

Unfortunately, for $T > 2$ the error term will be serially correlated! OLS provides consistent but inefficient estimates.
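The serial correlation of the differenced error can be derived in two lines. The derivation below is a sketch under an added assumption that is not stated on the slide: the original errors $\varepsilon_{it}$ are homoskedastic with variance $\sigma_\varepsilon^2$ and serially uncorrelated.

```latex
\begin{align*}
\operatorname{Var}(u_{it})
  &= \operatorname{Var}(\varepsilon_{it} - \varepsilon_{i,t-1})
   = 2\sigma_\varepsilon^2, \\
\operatorname{Cov}(u_{it}, u_{i,t-1})
  &= E\big[(\varepsilon_{it} - \varepsilon_{i,t-1})
           (\varepsilon_{i,t-1} - \varepsilon_{i,t-2})\big]
   = -E\big[\varepsilon_{i,t-1}^2\big]
   = -\sigma_\varepsilon^2, \\
\operatorname{Corr}(u_{it}, u_{i,t-1})
  &= \frac{-\sigma_\varepsilon^2}{2\sigma_\varepsilon^2}
   = -\tfrac{1}{2}, \qquad
\operatorname{Cov}(u_{it}, u_{i,t-s}) = 0 \ \text{for } s \geq 2 .
\end{align*}
```

Adjacent differenced errors share the term $\varepsilon_{i,t-1}$ with opposite signs, which is the source of the negative first-order correlation. With $T = 2$ there is only one difference per unit, so the problem does not arise.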

(31)

FE or FD?

If N is large and T is small: FE is more efficient if the errors are not autocorrelated; FD is better for nonstationary series.

If T is large and N is small:

we prefer FD for processes with strong positive correlation in the error term (AR(1) parameter close to 1), as FD eliminates the nonstationarity problem;

FE is more sensitive to non-normality, heteroskedasticity and autocorrelation in the error;

but FE is less sensitive to endogeneity of the regressors.

If T = 2, then all three estimators (LSDV, WG and FD) are exactly the same.
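The equality for T = 2 can be checked on a small example. This is a sketch with one regressor and made-up data; `fd_slope` and `within_slope` are hypothetical names, and the FD regression is run without an intercept to match the differenced model.

```python
def fd_slope(panel):
    """First-difference estimator: OLS of dy on dx without intercept."""
    num = den = 0.0
    for x, y in panel.values():
        dx = [b - a for a, b in zip(x, x[1:])]
        dy = [b - a for a, b in zip(y, y[1:])]
        num += sum(p * q for p, q in zip(dx, dy))
        den += sum(p * p for p in dx)
    return num / den


def within_slope(panel):
    """Within estimator: OLS on deviations from unit means."""
    num = den = 0.0
    for x, y in panel.values():
        mx, my = sum(x) / len(x), sum(y) / len(y)
        num += sum((a - mx) * (b - my) for a, b in zip(x, y))
        den += sum((a - mx) ** 2 for a in x)
    return num / den


# Hypothetical panel with T = 2 periods and N = 3 units.
two_period = {
    "A": ([1.0, 3.0], [2.0, 5.0]),
    "B": ([2.0, 3.0], [1.0, 4.0]),
    "C": ([0.0, 4.0], [3.0, 4.0]),
}
print(fd_slope(two_period), within_slope(two_period))
```

With two periods, each deviation from the unit mean is plus or minus half the first difference, so the factor of one half cancels between numerator and denominator and both functions return the same slope (13/21 here).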

(32)

Option 5: Random effect model

In the FE models we assume that the "individual specific" parameters $\beta_{0i}$ are constant (time-invariant) for each individual $i$. This is fine if we believe that differences between units can be viewed as parametric shifts of a regression function.

In the Random Effect Model we assume that $\beta_{0i}$ is a random variable with mean $\beta_0$ (no index $i$), which means that the intercept for each individual is

\[ \beta_{0i} = \beta_0 + u_i, \quad \text{where } u_i \sim (0, \sigma_u^2). \]

In our example this means that the 47 individuals were randomly drawn from a large population of individuals with a constant expected value of the intercept. Differences across individuals are therefore expressed by the error component $u_i$.

(33)

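Random effect model: General formulation

Model:

\[ y_{it} = \alpha + \beta' x_{it} + u_i + \varepsilon_{it} \]

where $u_i$ is constant through time.

Standard assumptions are:

\[ u_i \sim (0, \sigma_u^2), \qquad \varepsilon_{it} \sim (0, \sigma_\varepsilon^2) \]

\[ E[u_i \varepsilon_{it}] = 0, \qquad E[u_i u_j] = 0 \quad (i \neq j) \]

\[ E[\varepsilon_{it} \varepsilon_{is}] = E[\varepsilon_{it} \varepsilon_{jt}] = E[\varepsilon_{it} \varepsilon_{js}] = 0 \quad (i \neq j,\ t \neq s). \]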

(34)

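Random effect model: General formulation

Define

\[ w_{it} = u_i + \varepsilon_{it} \]

and

\[ w_i = [w_{i1}, w_{i2}, \dots, w_{iT}]'. \]

Given the assumptions listed above:

\[ E[w_{it}] = 0, \qquad E[w_{it}^2] = \sigma_u^2 + \sigma_\varepsilon^2, \qquad E[w_{it} w_{is}] = \sigma_u^2 \quad (t \neq s). \]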

(35)

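Random effect model: General formulation

Define

\[
\Omega = E[w_i w_i'] =
\begin{bmatrix}
\sigma_u^2 + \sigma_\varepsilon^2 & \sigma_u^2 & \cdots & \sigma_u^2 \\
\sigma_u^2 & \sigma_u^2 + \sigma_\varepsilon^2 & \cdots & \sigma_u^2 \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_u^2 & \sigma_u^2 & \cdots & \sigma_u^2 + \sigma_\varepsilon^2
\end{bmatrix}
= \sigma_\varepsilon^2 I_T + \sigma_u^2 l l'
\]

Since observations on units $i$ and $j$ are independent, the disturbance covariance matrix for the full $NT$ observations is

\[
V = \begin{bmatrix} \Omega & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \Omega \end{bmatrix} = I_N \otimes \Omega.
\]

Apply GLS.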
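The following sketch builds the per-unit covariance block for given variance components and computes the quasi-demeaning weight that GLS implies. The weight formula, theta = 1 - sqrt(sigma_e^2 / (sigma_e^2 + T sigma_u^2)), is the standard textbook result for this error structure and is not stated on the slides; the function names and the variance values are made up for illustration.

```python
import math


def re_covariance(n_periods, sigma2_u, sigma2_e):
    """Omega = sigma2_e * I_T + sigma2_u * l l' as a list of lists."""
    return [
        [sigma2_u + (sigma2_e if s == t else 0.0) for s in range(n_periods)]
        for t in range(n_periods)
    ]


def gls_theta(n_periods, sigma2_u, sigma2_e):
    """Quasi-demeaning weight: GLS is OLS on y_it - theta * ybar_i."""
    return 1.0 - math.sqrt(sigma2_e / (sigma2_e + n_periods * sigma2_u))


# Hypothetical variance components.
for row in re_covariance(3, sigma2_u=2.0, sigma2_e=1.0):
    print(row)
print(gls_theta(3, sigma2_u=1.0, sigma2_e=1.0))
```

When the individual variance component is zero, theta is 0 and GLS reduces to pooled OLS; as T times the individual variance grows relative to the idiosyncratic variance, theta approaches 1 and GLS approaches the within estimator.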

(36)

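Random effect model

Our charity function might be expressed as:

\[ C_{it} = \beta_0 + \beta_1 \mathrm{Age}_{it} + \beta_2 \mathrm{Income}_{it} + \beta_3 \mathrm{Price}_{it} + \beta_4 \mathrm{DEPS}_{it} + \beta_5 \mathrm{MS}_{it} + w_{it}, \]

where $w_{it} = u_i + \varepsilon_{it}$.

Composite error: the individual-specific component $u_i$ and the "idiosyncratic term" $\varepsilon_{it}$, which varies across time and individuals.

RE is also known as the Error Components Model (ECM).

Super important: $w_{it}$ should not be correlated with the regressors.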

(37)

FE vs. RE

The practical choice of model depends on the assumption about the correlation between $u_i$ and the set of regressors $X$. If $corr(u_i, X) = 0$, the correct model is RE; if $corr(u_i, X) \neq 0$, the correct model is FE.

How to decide?

Hausman test: the null hypothesis says that there is no systematic difference between the FE and RE estimates. If we reject $H_0$, we conclude that RE is inappropriate because $u_i$ is probably correlated with one or more regressors, so we should choose the FE model instead.

Alternatively, we can stick with the RE model but estimate its parameters by IV (panel IV, e.g. the Hausman-Taylor estimator or Arellano-Bond).
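For a single coefficient, the Hausman statistic reduces to a one-line formula: the squared difference between the FE and RE estimates divided by the difference of their estimated variances, compared with a chi-squared distribution with one degree of freedom. The sketch below covers only this scalar case (the general version uses the full coefficient vectors and covariance matrices). The function name and all input numbers are hypothetical.

```python
CHI2_1_CRIT_5PCT = 3.841  # 5% critical value of chi-squared with 1 df


def hausman_scalar(b_fe, b_re, var_fe, var_re):
    """Hausman statistic for one coefficient.

    b_fe, b_re:     FE and RE estimates of the same slope.
    var_fe, var_re: their estimated variances (RE is efficient under H0,
                    so var_fe should exceed var_re).
    """
    diff_var = var_fe - var_re
    if diff_var <= 0:
        raise ValueError("var_fe must exceed var_re for the statistic to be defined")
    return (b_fe - b_re) ** 2 / diff_var


# Hypothetical estimates, for illustration only.
h = hausman_scalar(b_fe=1.0, b_re=0.8, var_fe=0.02, var_re=0.01)
print(h, "reject H0" if h > CHI2_1_CRIT_5PCT else "do not reject H0")
```

With these made-up numbers the statistic is about 4, which exceeds the 5% critical value, so RE would be rejected in favour of FE.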

(38)

FE vs. RE practicalities

If T is large and N is small (and the standard assumptions are satisfied), both models should deliver similar results (the choice then depends on computational complexity).

If N is large and T is small and all assumptions of the RE model are satisfied, RE is more efficient than FE.

RE can estimate the impact of time-invariant variables that 'disappear' in FE.

If the true model is pooled: all estimators (FE, RE, pooled OLS) are consistent.

If the true model is FE: pooled OLS and RE are inconsistent.

If the true model is RE: FE is consistent (FE is always consistent!).

(39)

Panel data econometrics: further topics

Hypothesis verification

Heteroskedasticity and autocorrelation

Unbalanced panels

Dynamic panel data models

Multivariate panels

Limited dependent variable panels

Nonstationarity in panel data
