(1)

Mathematical Statistics

Anna Janicka

Lecture VII, 1.04.2019

ESTIMATOR PROPERTIES, PART III

CONFIDENCE INTERVALS – INTRO

(2)

Plan for Today

1. Asymptotic properties of estimators – cont.

asymptotic normality

asymptotic efficiency

2. Consistency, asymptotic normality and asymptotic efficiency of MLE estimators

3. Interval estimation – confidence intervals

(3)

Asymptotic normality

ĝ(X₁, X₂, ..., Xₙ) is an asymptotically normal estimator of g(θ), if for any θ ∈ Θ there exists σ²(θ) such that, when n → ∞,

√n ( ĝ(X₁, X₂, ..., Xₙ) − g(θ) ) →^D N(0, σ²(θ)).

Here →^D denotes convergence in distribution, i.e. for any a:

lim_{n→∞} P_θ( √n ( ĝ(X₁, X₂, ..., Xₙ) − g(θ) ) ≤ a ) = Φ( a / σ(θ) ).

In other words, for large n the distribution of ĝ(X₁, X₂, ..., Xₙ) is approximately N( g(θ), σ²(θ)/n ).

(4)

Asymptotic normality – properties

An asymptotically normal estimator is consistent (not necessarily strongly).

A similar condition to unbiasedness – the expected value of the asymptotic

distribution equals g( θ ) (but the estimator does not need to be unbiased).

The asymptotic variance is defined as σ²(θ)/n, or as σ²(θ) – the variance of the asymptotic distribution.

(5)

Asymptotic normality – what it is not

For an asymptotically normal estimator we usually have

E_θ ĝ(X₁, X₂, ..., Xₙ) → g(θ)  and  n · Var_θ ĝ(X₁, X₂, ..., Xₙ) → σ²(θ)  as n → ∞

(i.e. Var_θ ĝ ≈ σ²(θ)/n), but these properties need not hold, because convergence in distribution does not imply convergence of moments.

(6)

Asymptotic normality – example

Let X₁, X₂, ..., Xₙ, ... be an IID sample from a distribution with mean µ and variance σ². On the basis of the CLT, for the sample mean we have

√n ( X̄ − µ ) →^D N(0, σ²).

In this case the asymptotic variance, σ²/n, is equal to the estimator variance.
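The CLT statement above can be illustrated by simulation. The sketch below is an assumption-laden illustration, not part of the lecture: it uses a Uniform(0,1) population (so µ = 1/2 and σ² = 1/12), with arbitrary sample size and repetition count, and only the Python standard library.

```python
import math
import random
import statistics

def scaled_mean_deviations(n, reps, seed=0):
    """Draw `reps` Uniform(0,1) samples of size n and return the scaled
    deviations sqrt(n) * (sample mean - mu), with mu = 1/2."""
    rng = random.Random(seed)
    mu = 0.5
    return [
        math.sqrt(n) * (statistics.fmean(rng.random() for _ in range(n)) - mu)
        for _ in range(reps)
    ]

# By the CLT these values are approximately N(0, sigma^2) with
# sigma^2 = 1/12 (the variance of Uniform(0,1)), so sigma ~ 0.289:
vals = scaled_mean_deviations(n=500, reps=2000)
print(statistics.fmean(vals), statistics.stdev(vals))
```

The empirical mean of the scaled deviations should be near 0 and their spread near √(1/12), matching the asymptotic distribution.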

(7)

Asymptotic normality – how to prove it

In many cases, the following is useful:

Delta Method. Let Tₙ be a sequence of random variables such that for n → ∞ we have

√n ( Tₙ − µ ) →^D N(0, σ²),

and let h: R → R be a function differentiable at the point µ such that h'(µ) ≠ 0. Then

√n ( h(Tₙ) − h(µ) ) →^D N(0, σ² (h'(µ))²).

Here µ and σ² are functions of θ. The method is usually used when estimators are functions of statistics Tₙ, which can easily be shown to converge on the basis of the CLT.
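As a numerical sanity check of the Delta Method, one can take Tₙ to be a Bernoulli(p) sample mean (so √n(Tₙ − p) →^D N(0, p(1−p))) and h(t) = t². Both choices, and all parameters below, are illustrative assumptions, not from the lecture.

```python
import math
import random
import statistics

def delta_method_check(p=0.3, n=1000, reps=2000, seed=1):
    """Compare the simulated spread of sqrt(n) * (h(T_n) - h(p)), where
    T_n is a Bernoulli(p) sample mean and h(t) = t**2, against the
    Delta-Method prediction |h'(p)| * sigma = 2p * sqrt(p(1-p))."""
    rng = random.Random(seed)
    scaled = []
    for _ in range(reps):
        t_n = sum(rng.random() < p for _ in range(n)) / n  # Bernoulli(p) mean
        scaled.append(math.sqrt(n) * (t_n ** 2 - p ** 2))
    predicted_sd = 2 * p * math.sqrt(p * (1 - p))
    return statistics.stdev(scaled), predicted_sd

observed_sd, predicted_sd = delta_method_check()
print(observed_sd, predicted_sd)  # both close to 0.275
```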

(8)

Asymptotic normality – examples cont.

In an exponential model: MLE(λ) = 1/X̄.

From the CLT, we get

√n ( X̄ − 1/λ ) →^D N(0, 1/λ²),

so from the Delta Method for h(t) = 1/t (with h'(1/λ) = −λ²):

√n ( 1/X̄ − λ ) →^D N(0, (1/λ²) · (λ²)²) = N(0, λ²),

so 1/X̄ is an asymptotically normal (and consistent) estimator of λ.
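This exponential example can also be checked by simulation; the value of λ, the sample size, and the repetition count below are illustrative choices.

```python
import math
import random
import statistics

def exponential_mle_demo(lam=2.0, n=1000, reps=2000, seed=2):
    """Simulate the MLE 1/X-bar in the exponential model and return the
    mean and spread of sqrt(n) * (1/X-bar - lambda); by the Delta Method
    the spread should be close to lambda."""
    rng = random.Random(seed)
    scaled = []
    for _ in range(reps):
        xbar = statistics.fmean(rng.expovariate(lam) for _ in range(n))
        scaled.append(math.sqrt(n) * (1 / xbar - lam))
    return statistics.fmean(scaled), statistics.stdev(scaled)

mean, sd = exponential_mle_demo()
print(mean, sd)  # mean near 0, sd near lambda = 2
```

Note that 1/X̄ is slightly biased in finite samples, so the empirical mean is close to, but not exactly, 0.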

(9)

Asymptotic efficiency

For an asymptotically normal estimator ĝ(X₁, X₂, ..., Xₙ) of g(θ), i.e. such that for n → ∞

√n ( ĝ(X₁, X₂, ..., Xₙ) − g(θ) ) →^D N(0, σ²(θ)),

we define the asymptotic efficiency as

as.ef(ĝ) = (g'(θ))² / ( n I₁(θ) · σ²(θ)/n ) = (g'(θ))² / ( I₁(θ) σ²(θ) ),

where σ²(θ)/n is the asymptotic variance.

This is a modification of the definition of efficiency to the limit case, with the asymptotic variance in place of the exact variance.

(10)

Relative asymptotic efficiency

Relative asymptotic efficiency for asymptotically normal estimators ĝ₁(X) and ĝ₂(X):

as.ef(ĝ₁, ĝ₂) = σ₂²(θ) / σ₁²(θ) = as.ef(ĝ₁) / as.ef(ĝ₂)

Note. A less (asymptotically) efficient estimator may have other properties, which will make it preferable to a more efficient one.

(11)

Relative asymptotic efficiency – examples.

Is the mean better than the median? It depends on the distribution!

a) normal model N(µ, σ²):

√n ( X̄ − µ ) →^D N(0, σ²),   √n ( m̂ed − µ ) →^D N(0, πσ²/2),

as.ef(m̂ed, X̄) = 2/π < 1;

b) Laplace model Lapl(µ, λ):

√n ( X̄ − µ ) →^D N(0, 2/λ²),   √n ( m̂ed − µ ) →^D N(0, 1/λ²),

as.ef(m̂ed, X̄) = 2 > 1;

c) some distributions do not have a mean...

Theorem: For a sample from a continuous distribution with density f(x), the sample median is an asymptotically normal estimator of the median m (provided the density is continuous and ≠ 0 at the point m):

√n ( m̂ed − m ) →^D N(0, 1/(4 (f(m))²)).

(12)

Consistency of ML estimators

Let X₁, X₂, ..., Xₙ, ... be a sample from a distribution with density f_θ(x). If Θ ⊆ R is an open set, and:

all densities f_θ have the same support;

the equation (d/dθ) ln L(θ) = 0 has exactly one solution, θ̂,

then θ̂ is the MLE(θ) and it is consistent.

Note. MLE estimators do not have to be unbiased!

(13)

Asymptotic normality of ML estimators

Let X₁, X₂, ..., Xₙ, ... be a sample with density f_θ(x), such that Θ ⊆ R is open, and let θ̂ be a consistent MLE (for example, one fulfilling the assumptions of the previous theorem). If additionally:

(d²/dθ²) ln L(θ) exists;

the Fisher information can be calculated, with 0 < I₁(θ) < ∞;

the order of integration with respect to x and differentiation with respect to θ may be exchanged;

then θ̂ is asymptotically normal and

√n ( θ̂ − θ ) →^D N(0, 1/I₁(θ)).

(14)

Asymptotic normality of ML estimators

Additionally, if g: R → R is a function differentiable at the point θ, such that g'(θ) ≠ 0, and ĝ(X₁, X₂, ..., Xₙ) is MLE(g(θ)), then

√n ( ĝ(X₁, X₂, ..., Xₙ) − g(θ) ) →^D N(0, (g'(θ))² / I₁(θ)).

(15)

Asymptotic efficiency of ML estimators

If the assumptions of the previous theorems

are fulfilled, then the ML estimator (of θ or

g( θ )) is asymptotically efficient.

(16)

Asymptotic normality and efficiency of ML estimators – examples

In the normal model: the mean is an asymptotically efficient estimator of µ

In the Laplace model: the median is an asymptotically efficient estimator of µ


(17)

Summary: basic (point) estimator properties

bias

variance

MSE

efficiency

consistency

asymptotic normality

asymptotic efficiency

(18)

Interval estimation – confidence intervals

We do not provide a single value estimate, but rather a lower and an upper bound for the estimate (the true value will fall within these bounds with a given probability).

We estimate with a given precision.

(19)

Confidence interval

Let g(θ) be a function of the unknown parameter θ, and let g̲ = g̲(X₁, X₂, ..., Xₙ) and ḡ = ḡ(X₁, X₂, ..., Xₙ) be statistics.

Then [g̲, ḡ] is a confidence interval for g(θ) with confidence level 1 − α, if for any θ:

P_θ( g̲(X₁, X₂, ..., Xₙ) ≤ g(θ) ≤ ḡ(X₁, X₂, ..., Xₙ) ) ≥ 1 − α.

(20)

Confidence intervals – use and interpretation

Typically, α is a small number, for example 1 − α = 0.95 or 1 − α = 0.99.

The condition from the definition means: the random interval [g̲, ḡ] includes the unknown value g(θ) with a given (high) probability.

If we calculate a realization of the confidence interval (e.g. g̲ = 1, ḡ = 3), then we CAN'T say that the unknown parameter is included in the range with probability 1 − α anymore!

The parameter is either in the interval or not – the event is not random, it is just something we don't know.

(21)

Confidence intervals – construction

The confidence interval depends on the underlying probability distribution

Usually, normal samples are considered (the distribution most frequently observed in nature).

(22)

Confidence intervals – construction cont.

A convenient method is to look for random variables which depend on the sample data and parameter values, but whose distributions do not depend on unknown parameters (the pivotal method). If U = U(X₁, X₂, ..., Xₙ, θ) is such a function, then we look for confidence intervals [a, b] such that

P_θ( a ≤ U ≤ b ) ≥ 1 − α.

Usually we look for "symmetric" CIs, i.e.

P_θ( U < a ) ≤ α/2,   P_θ( U > b ) ≤ α/2.
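As a sketch of the pivotal method, consider a normal sample with known σ, where U = √n(X̄ − µ)/σ has a N(0, 1) distribution regardless of µ; inverting P(−z ≤ U ≤ z) = 1 − α gives the interval X̄ ± z·σ/√n. The parameters and simulation sizes below are illustrative assumptions.

```python
import math
import random
import statistics

Z_975 = 1.959963985  # standard normal quantile for alpha = 0.05

def confidence_interval(sample, sigma, z=Z_975):
    """CI for mu from the pivot U = sqrt(n) * (X-bar - mu) / sigma ~ N(0,1)."""
    xbar = statistics.fmean(sample)
    half = z * sigma / math.sqrt(len(sample))
    return xbar - half, xbar + half

def coverage(mu=1.0, sigma=2.0, n=50, reps=4000, seed=4):
    """Fraction of simulated intervals that contain the true mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lo, hi = confidence_interval(sample, sigma)
        hits += lo <= mu <= hi
    return hits / reps

print(coverage())  # close to the nominal level 0.95
```

The simulated coverage frequency being close to 0.95 illustrates the definition: the interval is random, and it is the procedure, not any single realized interval, that has probability 1 − α of covering g(θ).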
