Mathematical Statistics Anna Janicka

Academic year: 2021


(1)

Mathematical Statistics

Anna Janicka

Lecture XII, 18.05.2020

HYPOTHESIS TESTING IV:

PARAMETRIC TESTS: COMPARING TWO OR MORE POPULATIONS

(2)

Plan for today

1. Parametric LR tests for one population – cont.
2. Asymptotic properties of the LR test
3. Parametric LR tests for two populations
4. Comparing more than two populations – ANOVA

(3)

Notation

x_something always means a quantile of rank something (e.g. u_{1−α} denotes the quantile of rank 1 − α of the N(0,1) distribution).

(4)

Model IV: comparing the fraction – reminder

Asymptotic model: X_1, X_2, ..., X_n are an IID sample from a two-point distribution, P_p(X = 1) = p = 1 − P_p(X = 0), n – large.

H_0: p = p_0

Test statistic:

U = (X̄ − p_0) / √(p_0(1 − p_0)/n) = (p̂ − p_0) / √(p_0(1 − p_0)/n)

has an approximate N(0,1) distribution for large n.

H_0: p = p_0 against H_1: p > p_0 – critical region C = {x : U(x) > u_{1−α}}
H_0: p = p_0 against H_1: p < p_0 – critical region C = {x : U(x) < u_α = −u_{1−α}}
H_0: p = p_0 against H_1: p ≠ p_0 – critical region C = {x : |U(x)| > u_{1−α/2}}

(5)

Model IV: example

We toss a coin 400 times and get 180 heads. Is the coin symmetric?

H_0: p = ½. Test statistic:

U = (180/400 − 1/2) / √((1/2)(1 − 1/2)/400) = −2

for α = 0.05 and H_1: p ≠ ½ we have u_0.975 = 1.96 → we reject H_0
for α = 0.05 and H_1: p < ½ we have u_0.05 = −u_0.95 = −1.64 → we reject H_0
for α = 0.01 and H_1: p ≠ ½ we have u_0.995 = 2.58 → we do not reject H_0
for α = 0.01 and H_1: p < ½ we have u_0.01 = −u_0.99 = −2.33 → we do not reject H_0

p-value for H_1: p ≠ ½: 2Φ(−2) ≈ 0.046; p-value for H_1: p < ½: Φ(−2) ≈ 0.023
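The coin example can be checked numerically. A minimal sketch in Python, using only the standard library (the normal CDF is built from math.erf; the function names are mine, not from the lecture):

```python
import math

def norm_cdf(x):
    # CDF of N(0,1), expressed via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def fraction_test(successes, n, p0):
    """One-sample asymptotic test for H0: p = p0; returns U and p-values."""
    p_hat = successes / n
    u = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    return {
        "U": u,
        "p_value_two_sided": 2 * (1 - norm_cdf(abs(u))),
        "p_value_less": norm_cdf(u),
    }

res = fraction_test(180, 400, 0.5)
print(round(res["U"], 4))                  # -2.0
print(round(res["p_value_two_sided"], 3))  # 0.046
print(round(res["p_value_less"], 3))       # 0.023
```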

(6)

Likelihood ratio test for composite hypotheses – reminder

X ~ P_θ, {P_θ : θ ∈ Θ} – family of distributions. We are testing H_0: θ ∈ Θ_0 against H_1: θ ∈ Θ_1, such that Θ_0 ∩ Θ_1 = ∅, Θ_0 ∪ Θ_1 = Θ. Let

H_0: X ~ f_0(θ_0, ⋅) for some θ_0 ∈ Θ_0,
H_1: X ~ f_1(θ_1, ⋅) for some θ_1 ∈ Θ_1,

where f_0 and f_1 are densities (for θ ∈ Θ_0 and θ ∈ Θ_1, respectively).

(7)

Likelihood ratio test for composite hypotheses – reminder (cont.)

Test statistic:

λ̃ = sup_{θ ∈ Θ} f(θ, X) / sup_{θ_0 ∈ Θ_0} f_0(θ_0, X)

or (more convenient if the null is simple or if the models are nested)

λ̃ = f(θ̂, X) / f_0(θ̂_0, X),

where θ̂, θ̂_0 are the ML estimators for the model without restrictions and for the null model, respectively.

We reject H_0 if λ̃ > c̃ for a constant c̃.

(8)

Asymptotic properties of the LR test

We consider two nested models; we test H_0: h(θ) = 0 against H_1: h(θ) ≠ 0, under the assumptions that:

 h is a nice (smooth) function,
 Θ is a d-dimensional set,
 Θ_0 = {θ : h(θ) = 0} is a (d − p)-dimensional set.

Theorem: If H_0 is true, then for n → ∞ the distribution of the statistic 2 ln λ̃ converges to a chi-squared distribution with p degrees of freedom (degrees of freedom = number of restrictions).

(9)

Asymptotic properties of the LR test – example

Exponential model: X_1, X_2, ..., X_n are an IID sample from Exp(θ). We test H_0: θ = 1 against H_1: θ ≠ 1. Then:

MLE(θ) = θ̂ = 1/X̄

λ̃ = Π f_θ̂(x_i) / Π f_1(x_i) = (1/X̄)^n exp(−(1/X̄) Σx_i) / exp(−Σx_i) = (1/X̄^n) exp(n(X̄ − 1))

2 ln λ̃ = 2n((X̄ − 1) − ln X̄) →^D χ²(1)

From the Theorem: λ̃ > c̃ ⇔ 2 ln λ̃ > 2 ln c̃; for a significance level α = 0.05 we have χ²_{0.95}(1) ≈ 3.84 ≈ 2 ln c̃, so we reject H_0 in favor of H_1 if λ̃ > e^{3.84/2}.
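A sketch of this test in Python (function name and simulated data are mine): compute 2 ln λ̃ = 2n((X̄ − 1) − ln X̄) and compare it with χ²_{0.95}(1) ≈ 3.84:

```python
import math
import random

def lr_statistic_exp(xs):
    """2 ln(lambda~) for H0: theta = 1 in the Exp(theta) model."""
    n = len(xs)
    x_bar = sum(xs) / n
    return 2 * n * ((x_bar - 1) - math.log(x_bar))

random.seed(0)
# the sample is actually drawn from Exp(2), so H0: theta = 1 is false
sample = [random.expovariate(2.0) for _ in range(200)]
stat = lr_statistic_exp(sample)
print(stat > 3.84)  # True: we reject H0 at alpha = 0.05
```

With 200 observations from Exp(2), X̄ is close to 1/2 and the statistic is far above 3.84, so the test rejects as expected.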

(10)

Comparing two or more populations

We want to know whether the populations studied are "the same" in certain aspects:

 parametric tests: we check the equality of certain distribution parameters
 nonparametric tests: we check whether the distributions are the same

(11)

Model I: comparison of means, variances known, significance level α

X_1, X_2, ..., X_{n_X} are an IID sample from N(µ_X, σ_X²), Y_1, Y_2, ..., Y_{n_Y} are an IID sample from N(µ_Y, σ_Y²); σ_X², σ_Y² are known, the samples are independent.

H_0: µ_X = µ_Y

Test statistic (assuming H_0 is true):

U = (X̄ − Ȳ) / √(σ_X²/n_X + σ_Y²/n_Y) ~ N(0,1)

H_0: µ_X = µ_Y against H_1: µ_X > µ_Y – critical region C = {x : U(x) > u_{1−α}}
H_0: µ_X = µ_Y against H_1: µ_X ≠ µ_Y – critical region C = {x : |U(x)| > u_{1−α/2}}

(12)

Model I – comparison of means. Example

X_1, X_2, ..., X_10 are an IID sample from N(µ_X, 11²), Y_1, Y_2, ..., Y_10 are an IID sample from N(µ_Y, 13²). Based on the sample: X̄ = 501, Ȳ = 498. Are the means equal, for significance level 0.05?

H_0: µ_X = µ_Y against H_1: µ_X ≠ µ_Y; we have u_0.975 ≈ 1.96.

U = (501 − 498) / √(13²/10 + 11²/10) ≈ 0.557

|0.557| < 1.96 → no grounds to reject H_0
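The same calculation as a small Python sketch (the function name is mine):

```python
import math

def two_sample_z(x_bar, y_bar, var_x, var_y, n_x, n_y):
    """Model I test statistic: known variances, independent normal samples."""
    return (x_bar - y_bar) / math.sqrt(var_x / n_x + var_y / n_y)

u = two_sample_z(501, 498, 11**2, 13**2, 10, 10)
print(round(u, 3))    # 0.557
print(abs(u) > 1.96)  # False -> no grounds to reject H0 at alpha = 0.05
```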

(13)

Model II: comparison of means, variance unknown but assumed equal, significance level α

X_1, X_2, ..., X_{n_X} are an IID sample from N(µ_X, σ²), Y_1, Y_2, ..., Y_{n_Y} are an IID sample from N(µ_Y, σ²), with σ² unknown; the samples are independent.

H_0: µ_X = µ_Y

Test statistic (assuming H_0 is true):

T = [(X̄ − Ȳ) / √((n_X − 1)S_X² + (n_Y − 1)S_Y²)] ⋅ √(n_X n_Y (n_X + n_Y − 2) / (n_X + n_Y)) ~ t(n_X + n_Y − 2)

where S_X² = 1/(n_X − 1) Σ_{i=1}^{n_X} (X_i − X̄)², S_Y² = 1/(n_Y − 1) Σ_{i=1}^{n_Y} (Y_i − Ȳ)².

H_0: µ_X = µ_Y against H_1: µ_X > µ_Y – critical region C = {x : T(x) > t_{1−α}(n_X + n_Y − 2)}
H_0: µ_X = µ_Y against H_1: µ_X ≠ µ_Y – critical region C = {x : |T(x)| > t_{1−α/2}(n_X + n_Y − 2)}

(14)

Model II: comparison of means, variance unknown but assumed equal, cont.

The statistic

T = [(X̄ − Ȳ) / √((n_X − 1)S_X² + (n_Y − 1)S_Y²)] ⋅ √(n_X n_Y (n_X + n_Y − 2) / (n_X + n_Y)) ~ t(n_X + n_Y − 2)

can be rewritten as

T = (X̄ − Ȳ) / (S √(1/n_X + 1/n_Y)) ~ t(n_X + n_Y − 2),

where

S² = ((n_X − 1)S_X² + (n_Y − 1)S_Y²) / (n_X + n_Y − 2)

is an estimator of the variance σ² based on the two samples (the pooled variance).
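A sketch of the pooled statistic in Python (function names and data are mine; the critical value t_{1−α/2}(n_X + n_Y − 2) would still be looked up in tables or a stats library):

```python
import math

def sample_variance(xs):
    """Unbiased sample variance S^2, with divisor n - 1."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / (n - 1)

def pooled_t(xs, ys):
    """Model II statistic T ~ t(n_x + n_y - 2) under H0: equal means."""
    n_x, n_y = len(xs), len(ys)
    # pooled estimator of the common variance sigma^2
    s2 = ((n_x - 1) * sample_variance(xs) + (n_y - 1) * sample_variance(ys)) / (n_x + n_y - 2)
    x_bar = sum(xs) / n_x
    y_bar = sum(ys) / n_y
    return (x_bar - y_bar) / math.sqrt(s2 * (1 / n_x + 1 / n_y))

t = pooled_t([5.1, 4.9, 5.3, 5.0], [4.6, 4.8, 4.5, 4.9])
print(round(t, 2))  # 3.0
```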

(15)

Model II: comparison of variances, significance level α

X_1, X_2, ..., X_{n_X} are an IID sample from N(µ_X, σ_X²), Y_1, Y_2, ..., Y_{n_Y} are an IID sample from N(µ_Y, σ_Y²); σ_X², σ_Y² are unknown, the samples are independent.

H_0: σ_X = σ_Y

Test statistic (assuming H_0 is true):

F = S_X² / S_Y² ~ F(n_X − 1, n_Y − 1)

where S_X² = 1/(n_X − 1) Σ_{i=1}^{n_X} (X_i − X̄)², S_Y² = 1/(n_Y − 1) Σ_{i=1}^{n_Y} (Y_i − Ȳ)².

H_0: σ_X = σ_Y against H_1: σ_X > σ_Y – critical region C = {x : F(x) > F_{1−α}(n_X − 1, n_Y − 1)}
H_0: σ_X = σ_Y against H_1: σ_X ≠ σ_Y – critical region C = {x : F(x) < F_{α/2}(n_X − 1, n_Y − 1) ∨ F(x) > F_{1−α/2}(n_X − 1, n_Y − 1)}
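The variance-ratio statistic is easy to sketch in Python (function names and data are mine; the F quantiles would come from tables or a stats library):

```python
def sample_variance(xs):
    """Unbiased sample variance S^2, with divisor n - 1."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / (n - 1)

def variance_ratio(xs, ys):
    """F = S_X^2 / S_Y^2 ~ F(n_x - 1, n_y - 1) under H0: equal variances."""
    return sample_variance(xs) / sample_variance(ys)

f = variance_ratio([5.1, 4.9, 5.3, 5.0], [4.6, 4.8, 4.5, 4.9])
print(round(f, 3))  # 0.875
# compare f with F_{alpha/2}(3, 3) and F_{1-alpha/2}(3, 3) for the two-sided test
```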

(16)

Model II: comparison of means, variances unknown and no equality assumption

X_1, X_2, ..., X_{n_X} are an IID sample from N(µ_X, σ_X²), Y_1, Y_2, ..., Y_{n_Y} are an IID sample from N(µ_Y, σ_Y²); σ_X², σ_Y² are unknown, the samples are independent.

H_0: µ_X = µ_Y

The test statistic would be very simple:

(X̄ − Ȳ) / √(S_X²/n_X + S_Y²/n_Y) ~ ?

where S_X² = 1/(n_X − 1) Σ_{i=1}^{n_X} (X_i − X̄)², S_Y² = 1/(n_Y − 1) Σ_{i=1}^{n_Y} (Y_i − Ȳ)², but it is not possible to design a test statistic whose distribution does not depend on the (unknown) values of σ_X² and σ_Y²...

(17)

Model III: comparison of means for large samples, significance level α

X_1, X_2, ..., X_{n_X} are an IID sample from a distribution with mean µ_X, Y_1, Y_2, ..., Y_{n_Y} are an IID sample from a distribution with mean µ_Y; both distributions have unknown variances, the samples are independent, n_X, n_Y – large.

H_0: µ_X = µ_Y

Test statistic (assuming H_0 is true; approximately, for large samples):

U = (X̄ − Ȳ) / √(S_X²/n_X + S_Y²/n_Y) ~ N(0,1)

where S_X² = 1/(n_X − 1) Σ_{i=1}^{n_X} (X_i − X̄)², S_Y² = 1/(n_Y − 1) Σ_{i=1}^{n_Y} (Y_i − Ȳ)².

H_0: µ_X = µ_Y against H_1: µ_X > µ_Y – critical region C = {x : U(x) > u_{1−α}}
H_0: µ_X = µ_Y against H_1: µ_X ≠ µ_Y – critical region C = {x : |U(x)| > u_{1−α/2}}

(18)

Model III – example (equality of means?)
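The example itself is not reproduced in these notes; a minimal large-sample sketch in Python with made-up summary data (all numbers and names below are mine, not from the lecture):

```python
import math

def large_sample_u(x_bar, y_bar, s2_x, s2_y, n_x, n_y):
    """Model III statistic: approximately N(0,1) under H0 for large n_x, n_y."""
    return (x_bar - y_bar) / math.sqrt(s2_x / n_x + s2_y / n_y)

# hypothetical summaries: samples of sizes 100 and 120
u = large_sample_u(x_bar=20.1, y_bar=19.4, s2_x=4.0, s2_y=5.0, n_x=100, n_y=120)
print(round(u, 2))    # 2.45
print(abs(u) > 1.96)  # True -> reject H0 at alpha = 0.05, two-sided
```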

(19)

Model IV: comparison of fractions for large samples, significance level α

Two independent IID samples from two-point distributions. X – number of successes in n_X trials with probability of success p_X; Y – number of successes in n_Y trials with probability of success p_Y. p_X and p_Y are unknown, n_X and n_Y are large.

H_0: p_X = p_Y

Test statistic (assuming H_0 is true; approximately, for large samples):

U = (X/n_X − Y/n_Y) / √(p̄(1 − p̄)(1/n_X + 1/n_Y)) ~ N(0,1)

where p̄ = (X + Y)/(n_X + n_Y).

H_0: p_X = p_Y against H_1: p_X > p_Y – critical region C = {x : U(x) > u_{1−α}}
H_0: p_X = p_Y against H_1: p_X ≠ p_Y – critical region C = {x : |U(x)| > u_{1−α/2}}

(20)

Model IV – example (equality of probabilities?)
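The example itself is not reproduced in these notes; a sketch of the two-fraction statistic with made-up counts (all numbers and names below are mine, not from the lecture):

```python
import math

def two_fraction_u(x, n_x, y, n_y):
    """Model IV statistic: approximately N(0,1) under H0: p_x = p_y."""
    # pooled estimate of the common success probability under H0
    p_bar = (x + y) / (n_x + n_y)
    return (x / n_x - y / n_y) / math.sqrt(p_bar * (1 - p_bar) * (1 / n_x + 1 / n_y))

# hypothetical data: 90/300 successes vs 60/300 successes
u = two_fraction_u(90, 300, 60, 300)
print(round(u, 3))  # 2.828
```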

(21)

Tests for more than two populations

A naive approach: pairwise tests for all pairs.

But: in this case, the overall type I error is higher than the significance level assumed for each simple test...

(22)

More populations

Assume we have k samples:

X_{1,1}, X_{1,2}, ..., X_{1,n_1},
X_{2,1}, X_{2,2}, ..., X_{2,n_2},
...,
X_{k,1}, X_{k,2}, ..., X_{k,n_k},

and:

 all X_{i,j} are independent (i = 1, ..., k, j = 1, ..., n_i)
 X_{i,j} ~ N(m_i, σ²)
 we do not know m_1, m_2, ..., m_k, nor σ²
 let n = n_1 + n_2 + ... + n_k

(23)

Test of the Analysis of Variance (ANOVA) for significance level α

H_0: m_1 = m_2 = ... = m_k
H_1: ¬H_0 (i.e. not all m_i are equal)

An LR test; we get the test statistic

F = [Σ_{i=1}^k n_i (X̄_i − X̄)² / (k − 1)] / [Σ_{i=1}^k Σ_{j=1}^{n_i} (X_{i,j} − X̄_i)² / (n − k)] ~ F(k − 1, n − k)

where X̄_i = 1/n_i Σ_{j=1}^{n_i} X_{i,j}, X̄ = 1/n Σ_{i=1}^k Σ_{j=1}^{n_i} X_{i,j} = 1/n Σ_{i=1}^k n_i X̄_i,

with critical region

C = {x : F(x) > F_{1−α}(k − 1, n − k)}

For k = 2, the ANOVA test is equivalent to the two-sample t-test.

(24)

ANOVA – interpretation

We have the decomposition Sum of Squares (SS) = Sum of Squares Between (SSB) + Sum of Squares Within (SSW):

Σ_{i=1}^k Σ_{j=1}^{n_i} (X_{i,j} − X̄)² = Σ_{i=1}^k n_i (X̄_i − X̄)² + Σ_{i=1}^k Σ_{j=1}^{n_i} (X_{i,j} − X̄_i)²

1/(k − 1) Σ_{i=1}^k n_i (X̄_i − X̄)² – between-group variance estimator
1/(n − k) Σ_{i=1}^k Σ_{j=1}^{n_i} (X_{i,j} − X̄_i)² – within-group variance estimator

(25)

ANOVA test – table

source of variability   sum of squares   degrees of freedom   value of the test statistic F
between groups          SSB              k − 1                –
within groups           SSW              n − k                –
total                   SS               n − 1                F

(26)

ANOVA test – example

Yearly chocolate consumption in three cities: A, B, C, based on random samples of n_A = 8, n_B = 10, n_C = 9 consumers. Does consumption depend on the city? α = 0.01.

                  A     B     C
sample mean       11    10    7
sample variance   3.5   2.8   3

X̄ = 1/27 (11 ⋅ 8 + 10 ⋅ 10 + 7 ⋅ 9) ≈ 9.3
SSB = (11 − 9.3)² ⋅ 8 + (10 − 9.3)² ⋅ 10 + (7 − 9.3)² ⋅ 9 = 75.63
SSW = 3.5 ⋅ 7 + 2.8 ⋅ 9 + 3 ⋅ 8 = 73.7
F = (75.63/2) / (73.7/24) ≈ 12.31 and F_0.99(2, 24) ≈ 5.61

12.31 > 5.61 → reject H_0 (equality of means); consumption depends on the city
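The arithmetic above can be checked numerically; a sketch working directly from the group summaries (the function name is mine):

```python
def anova_from_summaries(ns, means, variances):
    """One-way ANOVA F statistic from per-group sizes, means, sample variances."""
    k = len(ns)
    n = sum(ns)
    grand_mean = sum(ni * mi for ni, mi in zip(ns, means)) / n
    ssb = sum(ni * (mi - grand_mean) ** 2 for ni, mi in zip(ns, means))
    # within-group sum of squares: (n_i - 1) * S_i^2 summed over groups
    ssw = sum((ni - 1) * vi for ni, vi in zip(ns, variances))
    f = (ssb / (k - 1)) / (ssw / (n - k))
    return ssb, ssw, f

ssb, ssw, f = anova_from_summaries([8, 10, 9], [11, 10, 7], [3.5, 2.8, 3])
print(round(ssb, 2))  # 75.63
print(round(ssw, 1))  # 73.7
print(round(f, 2))    # 12.31
```

The tiny difference from the slide's hand calculation comes from using the exact grand mean 251/27 instead of the rounded 9.3.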

(27)

ANOVA test – table – example

source of variability   sum of squares   degrees of freedom   value of the test statistic F
between groups          75.63            2                    –
within groups           73.7             24                   –
total                   149.33           26                   12.31

(28)
