
(1)

Mathematical Statistics

Anna Janicka

Lecture X, 29.04.2019

HYPOTHESIS TESTING II:

COMPARING TESTS

(2)

Plan for Today

0. Definitions – reminder and supplement

1. Comparing tests

2. Uniformly Most Powerful Test

3. Likelihood ratio test: Neyman-Pearson Lemma

4. Examples of tests for simple hypotheses and generalizations

(3)

Definitions – reminder

We are testing H0: θ ∈ Θ0 against H1: θ ∈ Θ1.

K – critical region of the test, the set of outcomes for which we reject H0: K = {x ∈ X : δ(x) = 1}

The test has significance level α if, for any θ ∈ Θ0, we have Pθ(K) ≤ α.

                      In reality we have
decision              H0 true          H0 false
reject H0             Type I error     OK
do not reject H0      OK               Type II error

(4)

Statistical test example (is the coin symmetric?) – reminder: finding the critical region

We want: significance level α = 0.01.

We look for c such that (assuming p = ½, n = 400)

P(|X – 200| > c) = 0.01

From the de Moivre-Laplace theorem (for large n!),

P(|X – 200| > c) ≈ 2Φ(–c/10), so to get 0.01 we need c ≈ 25.8.

For a significance level of approximately 0.01, we reject H0 when the number of tails is lower than 175 or higher than 225:

K = {0, 1, ..., 174} ∪ {226, 227, ..., 400}
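The critical value above can be reproduced numerically. The sketch below assumes the normal approximation from the slide and uses Python's standard `statistics.NormalDist`; the variable names are illustrative and not part of the lecture.

```python
from math import ceil, floor, sqrt
from statistics import NormalDist

n, p0, alpha = 400, 0.5, 0.01

mean = n * p0                      # 200 tails expected under H0
sd = sqrt(n * p0 * (1 - p0))       # 10

# Two-sided test: alpha is split equally between the two tails.
c = sd * NormalDist().inv_cdf(1 - alpha / 2)

largest_low = ceil(mean - c) - 1     # largest count in the lower part of K
smallest_high = floor(mean + c) + 1  # smallest count in the upper part of K

print(f"c = {c:.1f}")
print(f"reject H0 if X <= {largest_low} or X >= {smallest_high}")
```

Because the counts are integers, the condition |X – 200| > c translates into the two integer ranges that make up K.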

(5)

Statistical test – example cont.

The choice of the alternative hypothesis

For a different alternative...

For example, we lose if tails appear too often.

H0: p = ½, H1: p > ½

Which results would lead to rejecting H0?

X – 200 ≤ c – do not reject H0.

X – 200 > c – reject H0 in favor of H1,

i.e. T(x) = x – 200.

We could also have H0: p ≤ ½.

(6)

Statistical test – example cont.

The choice of the alternative hypothesis

Again, from the de Moivre-Laplace theorem:

P½(X – 200 > c) ≈ 0.01 for c ≈ 23.3,

so for a significance level of 0.01 we reject H0: p = ½ in favor of H1: p > ½ if the number of tails is at least 224.

What if we got 220 tails?

The p-value is ≈ 0.025; do not reject H0.
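As a rough check of the one-sided critical value and the p-value for 220 tails, here is a minimal sketch under the same normal approximation. Whether the slide's ≈ 0.025 was obtained with or without a continuity correction is not stated, so both variants are computed; the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p0, alpha = 400, 0.5, 0.01
mean, sd = n * p0, sqrt(n * p0 * (1 - p0))   # 200 and 10
std_normal = NormalDist()

# One-sided test: all of alpha sits in the upper tail.
c = sd * std_normal.inv_cdf(1 - alpha)

# Approximate p-value for 220 observed tails: P(X >= 220 | p = 1/2).
observed = 220
p_value = 1 - std_normal.cdf((observed - mean) / sd)
# The same probability with a continuity correction (an optional refinement).
p_value_cc = 1 - std_normal.cdf((observed - 0.5 - mean) / sd)

print(f"c = {c:.1f}, p-value = {p_value:.3f} (plain), {p_value_cc:.3f} (corrected)")
```

Both p-values exceed α = 0.01, which agrees with the decision not to reject H0.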

(7)

Power of the test (for an alternative hypothesis)

Pθ(K) for θ ∈ Θ1 – power of the test (for an alternative hypothesis)

Power function of a test:

1 – β : Θ1 → [0,1] such that (1 – β)(θ) = Pθ(K)

Usually: we look for tests with a given level of significance and the highest power possible.

(8)

Statistical test – example cont.

Power of the test

We test H0: p = ½ against H1: p = ¾ with T(x) = x – 200, K = {T(x) > 23.3} (i.e. for a significance level α = 0.01).

Power of the test:

1 – β(¾) = P(T(X) > 23.3 | p = ¾) = P¾(X > 223.3) ≈ 1 – Φ((223.3 – 300)/(5√3)) ≈ Φ(8.85) ≈ 1

But if H1: p = 0.55:

1 – β(0.55) = P(T(X) > 23.3 | p = 0.55) ≈ 1 – Φ(0.33) ≈ 1 – 0.63 ≈ 0.37

And if H1: p = ¼, for the same T we would get

1 – β(¼) = P(T(X) > 23.3 | p = ¼) ≈ 1 – Φ(14.23) ≈ 0
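The three power values can be recomputed with one small function. This is a sketch assuming the normal approximation used on the slide; the function name `power` and its default arguments are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def power(p, n=400, threshold=223.3):
    """Approximate P_p(X > threshold) for X ~ Binomial(n, p) via the normal approximation."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return 1 - NormalDist().cdf((threshold - mean) / sd)

for p in (0.75, 0.55, 0.25):
    print(f"p = {p}: power is about {power(p):.2f}")
```

The same critical region is nearly certain to detect p = ¾, detects p = 0.55 only about a third of the time, and is useless against p = ¼, because that alternative lies on the wrong side of H0 for this one-sided test.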

(9)

Power of the test: graphical interpretation (1)

[Figure: distributions of the test statistic T assuming that the null (θ = θ0) and the alternative (θ = θ1) hypotheses are true, with the critical value c; marked areas: type I error, type II error, power of the test.]

(10)

Power of the test: graphical interpretation (2) – a very bad test

[Figure: distributions of the test statistic T assuming that the null (θ = θ0) and the alternative (θ = θ1) hypotheses are true, with the critical value c; marked areas: type I error, type II error, power of the test.]

(11)

Sensitivity and specificity

Specificity – true negative rate (when in reality H0 is not true)

Sensitivity – true positive rate (when in reality H0 is true)

Terms commonly used in diagnostic tests (H0 is having a medical condition).

(12)

Size of a test

Sometimes we also look at the size of a test:

\[
\sup_{\theta \in \Theta_0} P_\theta(K)
\]

Then we have: the test has significance level α if the size of the test does not exceed α.

(13)

Comparing tests

How do we choose the best test?

For given null and alternative hypotheses and a given significance level

→ the test which is more powerful is better

(14)

Comparing the power of tests

X ~ Pθ, {Pθ : θ ∈ Θ} – family of distributions.

We test H0: θ ∈ Θ0 against H1: θ ∈ Θ1, such that Θ0 ∩ Θ1 = ∅, with two tests with critical regions K1 and K2; both at significance level α.

The test with critical region K1 is more powerful than the test with critical region K2 if

\[
\forall\, \theta \in \Theta_1:\ P_\theta(K_1) \ge P_\theta(K_2)
\quad\text{and}\quad
\exists\, \theta_1 \in \Theta_1:\ P_{\theta_1}(K_1) > P_{\theta_1}(K_2)
\]
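To illustrate the definition, the sketch below compares the two coin tests from the earlier slides: the one-sided region {X – 200 > 23.3} and the two-sided region {|X – 200| > 25.8}. It assumes the normal approximation and that both tests have level only approximately 0.01; the helper names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

N = 400
PHI = NormalDist().cdf

def z(x, p):
    """Standardize the count x under Binomial(N, p)."""
    return (x - N * p) / sqrt(N * p * (1 - p))

def power_one_sided(p, c=23.3):
    # K1 = {X - 200 > c}
    return 1 - PHI(z(200 + c, p))

def power_two_sided(p, c=25.8):
    # K2 = {|X - 200| > c}
    return 1 - PHI(z(200 + c, p)) + PHI(z(200 - c, p))

for p in (0.55, 0.60, 0.75):
    print(p, round(power_one_sided(p), 3), round(power_two_sided(p), 3))
```

For alternatives p > ½ the one-sided test is at least as powerful as the two-sided one, and strictly more powerful for moderate p. For p < ½ the ordering reverses, which is why the comparison always depends on the set Θ1.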

(15)

Uniformly most powerful test

For given H0: θ ∈ Θ0 and H1: θ ∈ Θ1:

δ* is a uniformly most powerful test (UMPT) at significance level α if

1) δ* is a test at significance level α,

2) for any test δ at significance level α, we have, for any θ ∈ Θ1:

Pθ(δ*(X) = 1) ≥ Pθ(δ(X) = 1),

i.e. the power of the test δ* is not smaller than the power of any other test of the same hypotheses, for any θ ∈ Θ1.

If Θ1 has one element, the word "uniformly" is redundant.

(16)

Uniformly most powerful test – alternative form

For given H0: θ ∈ Θ0 and H1: θ ∈ Θ1:

A test with critical region K* is a uniformly most powerful test (UMPT) at significance level α if

1) the test with critical region K* is a test at significance level α, i.e. for any θ ∈ Θ0: Pθ(K*) ≤ α,

2) for any test with critical region K at significance level α, we have, for any θ ∈ Θ1:

Pθ(K*) ≥ Pθ(K)

(17)

Testing simple hypotheses

We observe X. We want to test H0: θ = θ0 against H1: θ = θ1 (two simple hypotheses).

We can write it as:

H0: X ~ f0 against H1: X ~ f1,

where f0 and f1 are densities of the distributions defined by θ0 and θ1 (i.e. P0 and P1).

(18)

Likelihood ratio test for simple hypotheses.

Neyman-Pearson Lemma

Let

\[
K^* = \left\{ x \in \mathcal{X} : \frac{f_1(x)}{f_0(x)} > c \right\}
\]

such that

\[
P_0(K^*) = \alpha \quad\text{and}\quad P_1(K^*) = 1 - \beta .
\]

Then, for any K ⊆ X:

if P0(K) ≤ α, then P1(K) ≤ 1 – β

(i.e. the test with critical region K* is the most powerful test for testing H0 against H1).

In many cases, it is easier to write the test as

K* = {x : ln f1(x) – ln f0(x) > c1}

Likelihood ratio test: we compare the likelihood ratio to a constant; if it is large, we reject H0.
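The statement of the lemma can be checked by brute force on a small discrete model. The sketch below is a toy illustration, not an example from the lecture: it assumes X ~ Binomial(5, p) with H0: p = ½ and H1: p = ¾, treats the probability mass functions as the densities f0 and f1, and enumerates every possible critical region. Exact fractions avoid rounding issues.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

n = 5
p0, p1 = Fraction(1, 2), Fraction(3, 4)   # toy values chosen for illustration

def pmf(x, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def prob(region, p):
    return sum(pmf(x, p) for x in region)

outcomes = range(n + 1)
# Likelihood-ratio region K* = {x : f1(x)/f0(x) > c}; c is placed
# between the ratios at x = 3 and x = 4 (an arbitrary choice).
ratio = {x: pmf(x, p1) / pmf(x, p0) for x in outcomes}
c = (ratio[3] + ratio[4]) / 2
k_star = {x for x in outcomes if ratio[x] > c}

alpha = prob(k_star, p0)         # P0(K*)
power_star = prob(k_star, p1)    # P1(K*) = 1 - beta

# Best power among all regions K with P0(K) <= alpha.
best_power = max(
    prob(k, p1)
    for size in range(n + 2)
    for k in combinations(outcomes, size)
    if prob(k, p0) <= alpha
)

print(sorted(k_star), alpha, power_star, best_power)
```

No region with P0(K) ≤ α achieves more power than K*, as the lemma states.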

(19)

Neyman-Pearson Lemma – Example 1

Normal model: X1, X2, ..., Xn are an IID sample from N(µ, σ²), σ² is known.

The most powerful test for H0: µ = 0 against H1: µ = 1 (the case µ0 < µ1), at significance level α:

\[
K^* = \left\{ (x_1, x_2, \ldots, x_n) : \bar{X} > \frac{u_{1-\alpha}\,\sigma}{\sqrt{n}} \right\}
\]

where u_{1–α} is the quantile of order 1 – α of the N(0, 1) distribution.

For obs. 1.37; 0.21; 0.33; -0.45; 1.33; 0.85; 1.78; 1.21; 0.72 from N(µ, 1) we have, for α = 0.05:

\[
\bar{X} \approx 0.82 > 1.645 \cdot \frac{1}{\sqrt{9}} \approx 0.55
\]

→ we reject H0
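The decision for the nine observations can be reproduced in a few lines. This is a minimal sketch using the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, fmean

obs = [1.37, 0.21, 0.33, -0.45, 1.33, 0.85, 1.78, 1.21, 0.72]
sigma, alpha = 1.0, 0.05

x_bar = fmean(obs)
u = NormalDist().inv_cdf(1 - alpha)       # quantile u_{1-alpha}, about 1.645
threshold = u * sigma / sqrt(len(obs))    # critical value for the sample mean
reject_h0 = x_bar > threshold

print(f"mean = {x_bar:.2f}, threshold = {threshold:.2f}, reject H0: {reject_h0}")
```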

(20)

Neyman-Pearson Lemma – Example 1 cont.

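Power of the test

\[
P_{\mu=1}(K^*) = P_{\mu=1}\!\left( \bar{X} > 1.645 \cdot \frac{\sigma}{\sqrt{n}} \right) = \ldots = \Phi\!\left( \frac{1 \cdot \sqrt{n}}{\sigma} - 1.645 \right) \approx 0.91
\]

(for σ = 1, n = 9 and α = 0.05, as in the example).

If we change α, µ1, n – the power of the test....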
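How the power reacts to α, µ1 and n can be explored numerically. The sketch below assumes the closed form Φ(µ1·√n/σ – u_{1–α}) for the power of the one-sided test of H0: µ = 0; the function name and defaults are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def power(mu1=1.0, sigma=1.0, n=9, alpha=0.05):
    """Power of the test {mean > u_{1-alpha} * sigma / sqrt(n)} when the true mean is mu1."""
    std = NormalDist()
    u = std.inv_cdf(1 - alpha)
    return std.cdf(mu1 * sqrt(n) / sigma - u)

print(f"baseline (mu1=1, n=9, alpha=0.05): {power():.2f}")
print(f"larger sample, n=16:               {power(n=16):.2f}")
print(f"stricter level, alpha=0.01:        {power(alpha=0.01):.2f}")
print(f"closer alternative, mu1=0.5:       {power(mu1=0.5):.2f}")
```

In this sketch the power grows with n and with the distance between µ1 and 0, and falls when α is made smaller.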

(21)

Neyman-Pearson Lemma:

Generalization of example 1

The same test is UMP for H0: µ = 0 against H1: µ > 0, and for H0: µ ≤ 0 against H1: µ > 0.

More generally: under additional assumptions about the family of distributions, the same test is UMP for testing

H0: µ ≤ µ0 against H1: µ > µ0

Note the change of direction in the inequality when testing

H0: µ ≥ µ0 against H1: µ < µ0

(22)

Neyman-Pearson Lemma – Example 2

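Exponential model: X1, X2, ..., Xn are an IID sample from the distribution exp(λ), n = 10.

MP test for H0: λ = ½ against H1: λ = ¼, at significance level α = 0.05:

\[
K^* = \left\{ (x_1, x_2, \ldots, x_{10}) : \sum x_i > 31.41 \right\}
\]

The critical value follows from the relations (for independent variables)

\[
\exp(\lambda) = \Gamma(1, \lambda), \qquad
\Gamma(a, \lambda) + \Gamma(b, \lambda) = \Gamma(a + b, \lambda), \qquad
\Gamma\!\left(\tfrac{n}{2}, \tfrac{1}{2}\right) = \chi^2(n)
\]

Under H0, Σ Xi ~ Γ(10, ½) = χ²(20), and 31.41 is the quantile of order 0.95 of χ²(20).

E.g. for a sample: 2; 0.9; 1.7; 3.5; 1.9; 2.1; 3.7; 2.5; 3.4; 2.8:

Σ = 24.5 → no grounds for rejecting H0.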
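The critical value 31.41 can be verified without statistical tables. The sketch below assumes the closed-form tail of a chi-square distribution with an even number of degrees of freedom (the Erlang sum), which keeps the example within the standard library; the function name is illustrative.

```python
from math import exp

def chi2_even_sf(x, df):
    """P(chi-square with even df degrees of freedom > x), via the closed-form Erlang sum."""
    half = x / 2
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return exp(-half) * total

sample = [2, 0.9, 1.7, 3.5, 1.9, 2.1, 3.7, 2.5, 3.4, 2.8]
lam0 = 0.5
df = 2 * len(sample)             # 2 * lam0 * sum(X_i) ~ chi-square(2n) under H0

stat = 2 * lam0 * sum(sample)    # equals sum(sample) because lam0 = 1/2
tail_at_critical = chi2_even_sf(31.41, df)   # should be close to alpha = 0.05
p_value = chi2_even_sf(stat, df)
reject_h0 = stat > 31.41

print(f"tail at 31.41 = {tail_at_critical:.4f}, statistic = {stat:.1f}, reject H0: {reject_h0}")
```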

(23)

Neyman-Pearson Lemma – Example 2’

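Exponential model: X1, X2, ..., Xn are an IID sample from the distribution exp(λ), n = 10.

MP test for H0: λ = ½ against H1: λ = ¾, at significance level α = 0.05:

\[
K^* = \left\{ (x_1, x_2, \ldots, x_{10}) : \sum x_i < 10.85 \right\}
\]

The critical value follows from the relations (for independent variables)

\[
\exp(\lambda) = \Gamma(1, \lambda), \qquad
\Gamma(a, \lambda) + \Gamma(b, \lambda) = \Gamma(a + b, \lambda), \qquad
\Gamma\!\left(\tfrac{n}{2}, \tfrac{1}{2}\right) = \chi^2(n)
\]

Under H0, Σ Xi ~ Γ(10, ½) = χ²(20), and 10.85 is the quantile of order 0.05 of χ²(20).

E.g. for a sample: 2; 0.9; 1.7; 3.5; 1.9; 2.1; 3.7; 2.5; 3.4; 2.8:

Σ = 24.5 → no grounds for rejecting H0.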
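When λ1 > λ0 the likelihood ratio decreases as Σ xi grows, so the critical region sits in the lower tail. The sketch below recovers the lower 5% point of χ²(20) by bisection on the closed-form CDF for even degrees of freedom; the helper names, the search interval and the iteration count are arbitrary illustrative choices.

```python
from math import exp

def chi2_even_cdf(x, df):
    """P(chi-square with even df degrees of freedom <= x), via the closed-form Erlang sum."""
    half = x / 2
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return 1 - exp(-half) * total

def quantile(prob, df, lo=0.0, hi=200.0, iterations=100):
    """Invert the CDF by bisection (the CDF is increasing in x)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if chi2_even_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

critical = quantile(0.05, df=20)   # lower 5% point of chi-square(20)
total = sum([2, 0.9, 1.7, 3.5, 1.9, 2.1, 3.7, 2.5, 3.4, 2.8])
reject_h0 = total < critical

print(f"critical value = {critical:.2f}, sum = {total:.1f}, reject H0: {reject_h0}")
```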

(24)

Example 2 cont.

The test

\[
K^* = \left\{ (x_1, x_2, \ldots, x_{10}) : \sum x_i > 31.41 \right\}
\]

is UMP for H0: λ ≥ ½ against H1: λ < ½.

The test

\[
K^* = \left\{ (x_1, x_2, \ldots, x_{10}) : \sum x_i < 10.85 \right\}
\]

is UMP for H0: λ ≤ ½ against H1: λ > ½.

(25)
