Mathematical Statistics
Anna Janicka
Lecture XI, 6.05.2019
HYPOTHESIS TESTING III:
LR TEST FOR COMPOSITE HYPOTHESES; EXAMPLES OF ONE-SAMPLE TESTS
Plan for today
1. LR test for composite hypotheses
2. Examples of LR tests:
   Model I: one- and two-sided tests for the mean in the normal model, σ² known
   Model II: one- and two-sided tests for the mean in the normal model, σ² unknown; one- and two-sided tests for the variance
   Model III: tests for the mean, large samples
   Model IV: tests for the fraction, large samples
3. Asymptotic properties of the LR test
4. Test randomization
Testing simple hypotheses – reminder
We observe X. We want to test H0: θ = θ0 against H1: θ = θ1 (two simple hypotheses).
We can write this as:
H0: X ~ f0 against H1: X ~ f1,
where f0 and f1 are the densities of the distributions defined by θ0 and θ1 (i.e. P0 and P1).
Likelihood ratio test for simple hypotheses:
Neyman-Pearson Lemma – reminder
H0: X ~ f0 against H1: X ~ f1. Let
K* = {x ∈ X : f1(x) > c f0(x)}
be such that
P0(K*) = α and P1(K*) = 1 − β.
Then, for any K ⊆ X:
if P0(K) ≤ α, then P1(K) ≤ 1 − β.
(i.e. the test with critical region K* is the most powerful test for testing H0 against H1)
Likelihood ratio test for composite hypotheses
X ~ Pθ, {Pθ : θ ∈ Θ} – family of distributions. We are testing H0: θ ∈ Θ0 against H1: θ ∈ Θ1,
such that Θ0 ∩ Θ1 = ∅, Θ0 ∪ Θ1 = Θ. Let
H0: X ~ f0(θ0, ⋅) for some θ0 ∈ Θ0,
H1: X ~ f1(θ1, ⋅) for some θ1 ∈ Θ1,
where f0 and f1 are densities (for θ ∈ Θ0 and θ ∈ Θ1, respectively).
Just like in the N-P Lemma, but the hypotheses are composite.
Neyman-Pearson Lemma – Example 1 – reminder
Normal model: X1, X2, ..., Xn are an IID sample from N(µ, σ²), σ² is known.
The most powerful test for H0: µ = 0 against H1: µ = 1 (µ0 < µ1), at significance level α:
U = X̄ √n / σ,  K* = {(x1, ..., xn) : x̄ > u_{1-α} · σ/√n}.
For obs. 1.37; 0.21; 0.33; -0.45; 1.33; 0.85; 1.78; 1.21; 0.72 from N(µ, 1) we have, for α = 0.05:
X̄ ≈ 0.82 > 1.645 · 1/√9 ≈ 0.55
→ we reject H0.
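As a quick numerical check, the decision above can be reproduced directly (a minimal sketch assuming scipy is available; the data and the test are those of Example 1):

```python
from math import sqrt
from scipy.stats import norm

# Example 1 data: IID sample from N(mu, 1), sigma known
x = [1.37, 0.21, 0.33, -0.45, 1.33, 0.85, 1.78, 1.21, 0.72]
n, sigma, alpha = len(x), 1.0, 0.05

xbar = sum(x) / n
crit = norm.ppf(1 - alpha) * sigma / sqrt(n)   # u_{0.95} * sigma / sqrt(n) ~ 0.55

reject = xbar > crit                           # MP test of H0: mu = 0 vs H1: mu = 1
print(round(xbar, 2), round(crit, 2), reject)  # 0.82 0.55 True
```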
Neyman-Pearson Lemma – Example 1 cont. – reminder
Power of the test:
P1(K*) = P1(X̄ √n / σ > 1.645) = 1 − Φ(1.645 − µ1 √n / σ) ≈ 0.91.
If we change α, µ1, n – the power of the test changes accordingly.
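The power computation above is a one-liner (a sketch assuming scipy, with the values α = 0.05, µ1 = 1, n = 9, σ = 1 from the example):

```python
from math import sqrt
from scipy.stats import norm

n, sigma, alpha, mu1 = 9, 1.0, 0.05, 1.0

# Power = P_1(K*) = 1 - Phi(u_{0.95} - mu1 * sqrt(n) / sigma)
power = 1 - norm.cdf(norm.ppf(1 - alpha) - mu1 * sqrt(n) / sigma)
print(round(power, 2))   # 0.91
```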
Neyman-Pearson Lemma:
Generalization of example 1
The same test is UMP for H0: µ = 0 against H1: µ > 0, and for H0: µ ≤ 0 against H1: µ > 0.
More generally: under additional assumptions about the family of distributions, the same test is UMP for testing H0: µ ≤ µ0 against H1: µ > µ0.
Note the change of direction in the inequality when testing H0: µ ≥ µ0 against H1: µ < µ0.
Neyman-Pearson Lemma – Example 2
Exponential model: X1, X2, ..., Xn are an IID sample from distr. exp(λ), n = 10.
MP test for H0: λ = ½ against H1: λ = ¼, at significance level α = 0.05:
K* = {(x1, x2, ..., x10) : Σ xi > 31.41}
E.g. for a sample: 2; 0.9; 1.7; 3.5; 1.9; 2.1; 3.7; 2.5; 3.4; 2.8:
Σ = 24.5 → no grounds for rejecting H0.
(Here we use: exp(λ) = Γ(1, λ); Γ(a, λ) + Γ(b, λ) = Γ(a + b, λ) for independent summands; Γ(n, ½) = χ²(2n). So under H0, Σ Xi ~ χ²(20), and 31.41 = χ²_{0.95}(20).)
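The critical value 31.41 is simply the 0.95 quantile of χ²(20); a sketch (assuming scipy) that recovers it and applies the test to the sample above:

```python
from scipy.stats import chi2

x = [2, 0.9, 1.7, 3.5, 1.9, 2.1, 3.7, 2.5, 3.4, 2.8]
n = len(x)

# Under H0: lambda = 1/2, sum X_i ~ Gamma(10, 1/2) = chi^2 with 2n = 20 df
crit = chi2.ppf(0.95, 2 * n)                          # ~ 31.41
total = sum(x)                                        # 24.5
print(round(total, 1), round(crit, 2), total > crit)  # 24.5 31.41 False
```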
Neyman-Pearson Lemma – Example 2’
Exponential model: X1, X2, ..., Xn are an IID sample from distr. exp(λ), n = 10.
MP test for H0: λ = ½ against H1: λ = ¾, at significance level α = 0.05:
K* = {(x1, x2, ..., x10) : Σ xi < 10.85}
E.g. for a sample: 2; 0.9; 1.7; 3.5; 1.9; 2.1; 3.7; 2.5; 3.4; 2.8:
Σ = 24.5 → no grounds for rejecting H0.
(Again, under H0, Σ Xi ~ Γ(10, ½) = χ²(20), and 10.85 = χ²_{0.05}(20).)
Example 2 cont.
The test with K* = {(x1, x2, ..., x10) : Σ xi > 31.41} is UMP for H0: λ ≥ ½ against H1: λ < ½.
The test with K* = {(x1, x2, ..., x10) : Σ xi < 10.85} is UMP for H0: λ ≤ ½ against H1: λ > ½.
Likelihood ratio test for composite hypotheses – cont.
Test statistic:
λ = sup_{θ1 ∈ Θ1} f1(X, θ1) / sup_{θ0 ∈ Θ0} f0(X, θ0),
or equivalently
λ = f1(X, θ̂1) / f0(X, θ̂0),
where θ̂0, θ̂1 are the MLEs of the null and alternative hypothesis models.
We reject H0 if λ > c for a constant c (determined according to the significance level).
Likelihood ratio test for composite hypotheses – justification
Just like in the Neyman-Pearson Lemma, we compare the “highest chance of obtaining observation X when the alternative is true” to the “highest chance of obtaining observation X when the null is true”; we reject the null hypothesis in favor of the alternative if this ratio is very unfavorable for the null.
Likelihood ratio test for composite hypotheses – alternative version
Test statistic:
λ̃ = sup_{θ ∈ Θ} f(X, θ) / sup_{θ ∈ Θ0} f(X, θ),
or equivalently
λ̃ = f(X, θ̂) / f(X, θ̂0),
where θ̂, θ̂0 are the ML estimators for the model without restrictions and for the null model.
We reject H0 if λ̃ > c̃ for a constant c̃.
This version is more convenient if the null is simple or if the models are nested.
Likelihood ratio test for composite hypotheses – properties
For some models with composite hypotheses the UMPT does not exist (so the LR test will not be UMP, because there is no such test),
e.g. testing H0: θ = θ0 against H1: θ ≠ θ0 if the family of distributions has the monotone likelihood ratio property, i.e. f1(x)/f0(x) is an increasing function of a statistic T(x) for any f0 and f1 corresponding to parameters θ0 < θ1.
In order to have a UMPT for H0: θ = θ0 against H1: θ > θ0 we would need a critical region of the type T(x) > c, and to have a UMPT for H0: θ = θ0 against H1: θ < θ0 we would need a critical region of the type T(x) < c, so it is impossible to find a single UMPT for H1: θ ≠ θ0.
Likelihood ratio test: special cases
The exact form of the test depends on the distribution. In many cases, finding the distribution is hard/complicated (in many such cases, we use the asymptotic properties of the LR test instead of precise formulae).
Notation
A symbol of the form x_q always denotes the quantile of rank q.
Model I: comparing the mean
Normal model: X1, X2, ..., Xn are an IID sample from N(µ, σ²), where σ² is known.
H0: µ = µ0. Test statistic (cf. Example 1):
U = (X̄ − µ0) √n / σ ~ N(0, 1) under H0.
H0: µ = µ0 against H1: µ > µ0, critical region K* = {x : U(x) > u_{1-α}}
H0: µ = µ0 against H1: µ < µ0, critical region K* = {x : U(x) < u_α = −u_{1-α}}
H0: µ = µ0 against H1: µ ≠ µ0, critical region K* = {x : |U(x)| > u_{1-α/2}}
Model I: example
Let X1, X2, ..., X10 be an IID sample from N(µ, 1²):
-1.21 -1.37 0.51 0.37 -0.75 0.44 1.20 -0.96 -1.14 -1.40
Is µ = 0? (for α = 0.05)
In the sample: mean = -0.43, variance = 0.92. Test statistic:
U = (−0.43 − 0) √10 / 1 ≈ −1.36
H0: µ = 0 against H1: µ ≠ 0, u_{0.975} ≈ 1.96 (p-value ≈ 0.172)
H0: µ = 0 against H1: µ < 0, u_{0.05} ≈ −1.64 (p-value ≈ 0.086)
H0: µ = 0 against H1: µ > 0, u_{0.95} ≈ 1.64 (p-value ≈ 0.914)
→ in none of these cases are there grounds to reject H0 for α = 0.05
→ but we would reject H0: µ = 0 in favor of H1: µ < 0 for α = 0.1
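The p-values quoted above can be recovered from the normal CDF (a sketch assuming scipy; tiny differences come from rounding X̄ to -0.43 on the slide):

```python
from math import sqrt
from scipy.stats import norm

x = [-1.21, -1.37, 0.51, 0.37, -0.75, 0.44, 1.20, -0.96, -1.14, -1.40]
n, sigma = len(x), 1.0

u = (sum(x) / n - 0) * sqrt(n) / sigma    # test statistic, ~ -1.36

p_two_sided = 2 * (1 - norm.cdf(abs(u)))  # H1: mu != 0
p_left      = norm.cdf(u)                 # H1: mu < 0
p_right     = 1 - norm.cdf(u)             # H1: mu > 0
print(round(u, 2), round(p_two_sided, 3), round(p_left, 3), round(p_right, 3))
```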
Model II: comparing the mean
Normal model: X1, X2, ..., Xn are an IID sample from N(µ, σ²), where σ² is unknown.
H0: µ = µ0. Test statistic:
T = (X̄ − µ0) √n / S ~ t(n − 1) under H0,
where t_α(n − 1) = −t_{1-α}(n − 1).
H0: µ = µ0 against H1: µ > µ0, critical region K* = {x : T(x) > t_{1-α}(n − 1)}
H0: µ = µ0 against H1: µ < µ0, critical region K* = {x : T(x) < t_α(n − 1)}
H0: µ = µ0 against H1: µ ≠ µ0, critical region K* = {x : |T(x)| > t_{1-α/2}(n − 1)}
Model II: example (mean)
Let X1, X2, ..., X10 be an IID sample from N(µ, σ²):
-1.21 -1.37 0.51 0.37 -0.75 0.44 1.20 -0.96 -1.14 -1.40
Is µ = 0? (for α = 0.05)
In the sample: mean = -0.43, variance = 0.92. Test statistic:
T = (−0.43 − 0) √10 / √0.92 ≈ −1.42
H0: µ = 0 vs H1: µ ≠ 0, t_{0.975}(9) ≈ 2.26 (p-value ≈ 0.188)
H0: µ = 0 vs H1: µ < 0, t_{0.05}(9) ≈ −1.83 (p-value ≈ 0.094)
H0: µ = 0 vs H1: µ > 0, t_{0.95}(9) ≈ 1.83 (p-value ≈ 0.906)
→ in none of these cases are there grounds to reject H0 for α = 0.05
→ but we would reject H0: µ = 0 in favor of H1: µ < 0 for α = 0.1
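The same numbers drop out of a standard one-sample t-test (a sketch assuming scipy; `ttest_1samp` reports the two-sided p-value, and the one-sided ones follow from the t(n − 1) CDF):

```python
from scipy import stats

x = [-1.21, -1.37, 0.51, 0.37, -0.75, 0.44, 1.20, -0.96, -1.14, -1.40]

res = stats.ttest_1samp(x, popmean=0)       # H0: mu = 0, two-sided
t, p_two_sided = res.statistic, res.pvalue  # t ~ -1.42, p ~ 0.188

p_left  = stats.t.cdf(t, df=len(x) - 1)     # H1: mu < 0
p_right = 1 - p_left                        # H1: mu > 0
print(round(t, 2), round(p_two_sided, 3), round(p_left, 3))
```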
Model II: comparing the variance
Normal model: X1, X2, ..., Xn are an IID sample from N(µ, σ²), where σ² is unknown.
H0: σ = σ0. Test statistic:
χ² = (n − 1) S² / σ0² ~ χ²(n − 1) under H0.
H0: σ = σ0 against H1: σ > σ0, critical region K* = {x : χ²(x) > χ²_{1-α}(n − 1)}
H0: σ = σ0 against H1: σ < σ0, critical region K* = {x : χ²(x) < χ²_α(n − 1)}
H0: σ = σ0 against H1: σ ≠ σ0, critical region K* = {x : χ²(x) < χ²_{α/2}(n − 1) ∨ χ²(x) > χ²_{1-α/2}(n − 1)}
Model II: example (variance)
Let X1, X2, ..., X10 be an IID sample from N(µ, σ²):
-1.21 -1.37 0.51 0.37 -0.75 0.44 1.20 -0.96 -1.14 -1.40
Is σ = 1? (for α = 0.05)
In the sample: variance = 0.92. Test statistic:
χ² = 9 · 0.92 / 1 ≈ 8.28
H0: σ = 1 against H1: σ > 1: χ²_{0.95}(9) ≈ 16.92
H0: σ = 1 against H1: σ < 1: χ²_{0.05}(9) ≈ 3.33
H0: σ = 1 against H1: σ ≠ 1: χ²_{0.025}(9) ≈ 2.70, χ²_{0.975}(9) ≈ 19.02
→ in none of these cases are there grounds to reject H0 (for α = 0.05)
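A sketch (assuming scipy) of the two-sided variance test from this slide; the one-sided variants just compare against χ²_{0.95}(9) or χ²_{0.05}(9). The slide's 8.28 uses the variance rounded to 0.92, so the unrounded statistic is ≈ 8.26:

```python
from scipy.stats import chi2

x = [-1.21, -1.37, 0.51, 0.37, -0.75, 0.44, 1.20, -0.96, -1.14, -1.40]
n = len(x)
xbar = sum(x) / n
s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)   # unbiased sample variance, ~0.92

sigma0 = 1.0
stat = (n - 1) * s2 / sigma0 ** 2                  # ~8.26, chi^2(n-1) under H0

lo = chi2.ppf(0.025, n - 1)                        # ~2.70
hi = chi2.ppf(0.975, n - 1)                        # ~19.02
reject = stat < lo or stat > hi
print(round(stat, 2), round(lo, 2), round(hi, 2), reject)
```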
Model III: comparing the mean
Asymptotic model: X1, X2, ..., Xn are an IID sample from a distribution with mean µ and variance σ² (unknown), n – large.
H0: µ = µ0. Test statistic:
T = (X̄ − µ0) √n / S
has, for large n, an approximate distribution N(0, 1).
H0: µ = µ0 against H1: µ > µ0, critical region K* = {x : T(x) > u_{1-α}}
H0: µ = µ0 against H1: µ < µ0, critical region K* = {x : T(x) < u_α = −u_{1-α}}
H0: µ = µ0 against H1: µ ≠ µ0, critical region K* = {x : |T(x)| > u_{1-α/2}}
Model IV: comparing the fraction
Asymptotic model: X1, X2, ..., Xn are an IID sample from a two-point distribution P(X = 1) = p = 1 − P(X = 0), n – large.
H0: p = p0. Test statistic:
U = (X̄ − p0) √n / √(p0(1 − p0)) = (p̂ − p0) √n / √(p0(1 − p0)), where p̂ = X̄,
has an approximate distribution N(0, 1) for large n.
H0: p = p0 against H1: p > p0, critical region K* = {x : U(x) > u_{1-α}}
H0: p = p0 against H1: p < p0, critical region K* = {x : U(x) < u_α = −u_{1-α}}
H0: p = p0 against H1: p ≠ p0, critical region K* = {x : |U(x)| > u_{1-α/2}}
Model IV: example
We toss a coin 400 times. We get 180 heads. Is the coin symmetric?
H0: p = ½. Test statistic:
U = (180/400 − 1/2) √400 / √(1/2 · (1 − 1/2)) = −2
for α = 0.05 and H1: p ≠ ½ we have u_{0.975} = 1.96 → we reject H0
for α = 0.05 and H1: p < ½ we have u_{0.05} = −u_{0.95} = −1.64 → we reject H0
for α = 0.01 and H1: p ≠ ½ we have u_{0.995} = 2.58 → we do not reject H0
for α = 0.01 and H1: p < ½ we have u_{0.01} = −u_{0.99} = −2.33 → we do not reject H0
p-value for H1: p ≠ ½: ≈ 0.046; p-value for H1: p < ½: ≈ 0.023
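A sketch (assuming scipy) of the same large-sample proportion test; the statistic comes out exactly −2, and the p-values are 2(1 − Φ(2)) ≈ 0.046 and Φ(−2) ≈ 0.023:

```python
from math import sqrt
from scipy.stats import norm

n, heads, p0 = 400, 180, 0.5
p_hat = heads / n

u = (p_hat - p0) * sqrt(n) / sqrt(p0 * (1 - p0))   # = -2.0

p_two_sided = 2 * (1 - norm.cdf(abs(u)))           # H1: p != 1/2
p_left      = norm.cdf(u)                          # H1: p < 1/2
print(round(u, 2), round(p_two_sided, 3), round(p_left, 3))
```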
Asymptotic properties of the LR test
We consider two nested models; we test H0: h(θ) = 0 against H1: h(θ) ≠ 0.
Under the assumption that h is a nice function:
Θ is a d-dimensional set,
Θ0 = {θ : h(θ) = 0} is a (d − p)-dimensional set.
Theorem: If H0 is true, then for n → ∞ the distribution of the statistic 2 ln λ̃ converges to a chi-squared distribution with p degrees of freedom.
Asymptotic properties of the LR test – example
Exponential model: X1, X2, ..., Xn are an IID sample from Exp(θ).
We test H0: θ = 1 against H1: θ ≠ 1.
MLE(θ) = θ̂ = 1/X̄, so:
λ̃ = Π f(xi, θ̂) / Π f(xi, 1) = (1/X̄)ⁿ exp(−Σ xi / X̄) / exp(−Σ xi) = exp(n((X̄ − 1) − ln X̄))
From the Theorem:
2 ln λ̃ = 2n((X̄ − 1) − ln X̄) →D χ²(1)
For a sign. level α = 0.05 we have χ²_{0.95}(1) ≈ 3.84, so we reject H0 in favor of H1 if
2 ln λ̃ > 3.84, i.e. λ̃ > e^{3.84/2}.
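A sketch (assuming scipy) applying this asymptotic test; the sample from Example 2 is reused here purely for illustration, and for it the test rejects H0: θ = 1:

```python
from math import log
from scipy.stats import chi2

x = [2, 0.9, 1.7, 3.5, 1.9, 2.1, 3.7, 2.5, 3.4, 2.8]
n = len(x)
xbar = sum(x) / n                          # MLE of theta is 1/xbar

stat = 2 * n * ((xbar - 1) - log(xbar))    # 2 ln lambda-tilde
crit = chi2.ppf(0.95, df=1)                # ~3.84
print(round(stat, 2), round(crit, 2), stat > crit)
```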
Test randomization
Sometimes, a test with a significance level exactly equal to α does not exist (e.g. for discrete random variables). In such cases, we need randomization. (The UMPT, if it exists, needs to be randomized.)
E.g. number of heads in 8 tosses, H0: p = ½, H1: p < ½, α = 0.05:
for X ≤ 1 we reject, for X ≥ 3 we do not reject, and for X = 2 we reject with probability γ ≈ 0.136, chosen so that P(X ≤ 1) + γ · P(X = 2) = 0.05.
xi      0     1     2     3     4     5     6     7     8
pi      0.004 0.03  0.11  0.22  0.27  0.22  0.11  0.03  0.004
cum pi  0.004 0.04  0.15  0.36  0.64  0.86  0.97  0.996 1.000
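The randomization probability can be computed directly from the exact Bin(8, ½) probabilities (a minimal sketch using only the standard library):

```python
from math import comb

n, p, alpha = 8, 0.5, 0.05
pmf = [comb(n, k) * p ** n for k in range(n + 1)]  # Bin(8, 1/2) probabilities

p_reject = pmf[0] + pmf[1]            # P(X <= 1) = 9/256 ~ 0.035: reject outright
gamma = (alpha - p_reject) / pmf[2]   # extra rejection probability at X = 2
print(round(p_reject, 4), round(gamma, 3))  # 0.0352 0.136
```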