Mathematical Statistics
Anna Janicka
Lecture XII, 18.05.2020
HYPOTHESIS TESTING IV:
PARAMETRIC TESTS: COMPARING TWO OR MORE
POPULATIONS
Plan for today
1. Parametric LR tests for one population – cont.
2. Asymptotic properties of the LR test
3. Parametric LR tests for two populations
4. Comparing more than two populations: ANOVA
Notation
$x_q$ always means a quantile of rank $q$ (of the relevant distribution).
Model IV: comparing the fraction – reminder
Asymptotic model: $X_1, X_2, \ldots, X_n$ are an IID sample from a two-point distribution, $n$ – large:
$$P_p(X = 1) = p = 1 - P_p(X = 0).$$
$H_0: p = p_0$. Test statistic:
$$U^* = \frac{\bar X - p_0}{\sqrt{p_0(1-p_0)/n}} = \frac{\hat p - p_0}{\sqrt{p_0(1-p_0)/n}}$$
has an approximate $N(0,1)$ distribution for large $n$ (assuming $H_0$ is true).
$H_0: p = p_0$ against $H_1: p > p_0$: critical region $C^* = \{x : U^*(x) > u_{1-\alpha}\}$
$H_0: p = p_0$ against $H_1: p < p_0$: critical region $C^* = \{x : U^*(x) < u_{\alpha} = -u_{1-\alpha}\}$
$H_0: p = p_0$ against $H_1: p \neq p_0$: critical region $C^* = \{x : |U^*(x)| > u_{1-\alpha/2}\}$
Model IV: example
We toss a coin 400 times and get 180 heads. Is the coin symmetric?
$H_0: p = 1/2$. The value of the test statistic is
$$U^* = \frac{180/400 - 1/2}{\sqrt{\tfrac{1}{2}(1-\tfrac{1}{2})/400}} = -2.$$
For $\alpha = 0.05$ and $H_1: p \neq 1/2$ we have $u_{0.975} = 1.96$; $|-2| > 1.96$ → we reject $H_0$.
For $\alpha = 0.05$ and $H_1: p < 1/2$ we have $u_{0.05} = -u_{0.95} = -1.64$; $-2 < -1.64$ → we reject $H_0$.
For $\alpha = 0.01$ and $H_1: p \neq 1/2$ we have $u_{0.995} = 2.58$ → we do not reject $H_0$.
For $\alpha = 0.01$ and $H_1: p < 1/2$ we have $u_{0.01} = -u_{0.99} = -2.33$ → we do not reject $H_0$.
p-value for $H_1: p \neq 1/2$: $\approx 0.0455$; p-value for $H_1: p < 1/2$: $\approx 0.0228$.
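As a quick numerical check (not part of the lecture), the coin example can be reproduced with scipy's standard normal distribution; the numbers are the ones from the slide:

```python
from math import sqrt
from scipy.stats import norm

n, successes, p0 = 400, 180, 0.5
p_hat = successes / n

# Test statistic U* = (p_hat - p0) / sqrt(p0 (1 - p0) / n)
u = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# p-values from the standard normal approximation
p_two_sided = 2 * norm.cdf(-abs(u))   # H1: p != 1/2
p_left = norm.cdf(u)                  # H1: p < 1/2

print(u)                              # -2.0
print(round(p_two_sided, 4))          # 0.0455
print(round(p_left, 4))               # 0.0228
```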
Likelihood ratio test for composite hypotheses – reminder
$X \sim P_\theta$, $\{P_\theta : \theta \in \Theta\}$ – family of distributions. We are testing $H_0: \theta \in \Theta_0$ against $H_1: \theta \in \Theta_1$, such that $\Theta_0 \cap \Theta_1 = \emptyset$, $\Theta_0 \cup \Theta_1 = \Theta$. Let
$H_0: X \sim f_0(\theta_0, \cdot)$ for some $\theta_0 \in \Theta_0$,
$H_1: X \sim f_1(\theta_1, \cdot)$ for some $\theta_1 \in \Theta_1$,
where $f_0$ and $f_1$ are densities (for $\theta \in \Theta_0$ and $\theta \in \Theta_1$, respectively).
Likelihood ratio test for composite hypotheses – reminder (cont.)
Test statistic:
$$\tilde\lambda = \frac{\sup_{\theta \in \Theta} f(\theta, X)}{\sup_{\theta_0 \in \Theta_0} f_0(\theta_0, X)} \qquad \text{or} \qquad \tilde\lambda = \frac{f(\hat\theta, X)}{f_0(\hat\theta_0, X)},$$
where $\hat\theta$, $\hat\theta_0$ are the ML estimators for the model without restrictions and for the null model, respectively.
We reject $H_0$ if $\tilde\lambda > \tilde c$ for a constant $\tilde c$.
The second form is more convenient if the null is simple or if the models are nested.
Asymptotic properties of the LR test
We consider two nested models; we test $H_0: h(\theta) = 0$ against $H_1: h(\theta) \neq 0$, under the assumptions that:
$h$ is a nice function,
$\Theta$ is a $d$-dimensional set,
$\Theta_0 = \{\theta : h(\theta) = 0\}$ is a $(d-p)$-dimensional set.
Theorem: If $H_0$ is true, then for $n \to \infty$ the distribution of the statistic $2\ln\tilde\lambda$ converges to a chi-squared distribution with $p$ degrees of freedom.
(The number of degrees of freedom equals the number of restrictions.)
Asymptotic properties of the LR test – example
Exponential model: $X_1, X_2, \ldots, X_n$ are an IID sample from $\mathrm{Exp}(\theta)$. We test $H_0: \theta = 1$ against $H_1: \theta \neq 1$. Then:
$$\mathrm{MLE}(\theta) = \hat\theta = 1/\bar X,$$
$$\tilde\lambda = \frac{\prod f_{\hat\theta}(x_i)}{\prod f_1(x_i)} = \frac{\frac{1}{\bar X^n}\exp\!\left(-\frac{1}{\bar X}\sum x_i\right)}{\exp\!\left(-\sum x_i\right)} = \frac{1}{\bar X^n}\exp\!\big(n(\bar X - 1)\big).$$
From the Theorem:
$$2\ln\tilde\lambda = 2n\big((\bar X - 1) - \ln\bar X\big) \xrightarrow{D} \chi^2(1).$$
We reject $H_0$ in favor of $H_1$ if $\tilde\lambda > \tilde c \Leftrightarrow 2\ln\tilde\lambda > 2\ln\tilde c$. For significance level $\alpha = 0.05$ we have $\chi^2_{0.95}(1) \approx 3.84 \approx 2\ln\tilde c$, so we reject $H_0$ if $\tilde\lambda > e^{3.84/2}$.
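The asymptotic result can be checked by simulation (a sketch, not part of the lecture): under $H_0$ the statistic $2\ln\tilde\lambda$ should exceed $\chi^2_{0.95}(1) \approx 3.84$ about 5% of the time.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
n, reps, alpha = 100, 20000, 0.05
crit = chi2.ppf(1 - alpha, df=1)   # ~3.84

# Simulate under H0: theta = 1, i.e. Exp(1) samples
x = rng.exponential(scale=1.0, size=(reps, n))
xbar = x.mean(axis=1)

# 2 ln(lambda~) = 2n((Xbar - 1) - ln Xbar), as derived above
stat = 2 * n * ((xbar - 1) - np.log(xbar))

rejection_rate = np.mean(stat > crit)
print(round(rejection_rate, 3))    # should be close to 0.05
```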
Comparing two or more populations
We want to know whether the populations studied are "the same" in certain aspects:
parametric tests: we check the equality of certain distribution parameters;
nonparametric tests: we check whether the distributions are the same.
Model I: comparison of means, variances known, significance level α
$X_1, X_2, \ldots, X_{n_X}$ are an IID sample from $N(\mu_X, \sigma_X^2)$; $Y_1, Y_2, \ldots, Y_{n_Y}$ are an IID sample from $N(\mu_Y, \sigma_Y^2)$; $\sigma_X^2, \sigma_Y^2$ are known; the samples are independent.
$H_0: \mu_X = \mu_Y$. Test statistic (assuming $H_0$ is true):
$$U = \frac{\bar X - \bar Y}{\sqrt{\sigma_X^2/n_X + \sigma_Y^2/n_Y}} \sim N(0,1).$$
$H_0: \mu_X = \mu_Y$ against $H_1: \mu_X > \mu_Y$: critical region $C^* = \{x : U(x) > u_{1-\alpha}\}$
$H_0: \mu_X = \mu_Y$ against $H_1: \mu_X \neq \mu_Y$: critical region $C^* = \{x : |U(x)| > u_{1-\alpha/2}\}$
Model I – comparison of means. Example
$X_1, X_2, \ldots, X_{10}$ are an IID sample from $N(\mu_X, 11^2)$; $Y_1, Y_2, \ldots, Y_{10}$ are an IID sample from $N(\mu_Y, 13^2)$. Based on the sample: $\bar X = 501$, $\bar Y = 498$. Are the means equal, for significance level 0.05?
$H_0: \mu_X = \mu_Y$ against $H_1: \mu_X \neq \mu_Y$; we have $u_{0.975} \approx 1.96$ and
$$U = \frac{501 - 498}{\sqrt{11^2/10 + 13^2/10}} \approx 0.557.$$
$|0.557| < 1.96$ → no grounds to reject $H_0$.
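The arithmetic of this example is easy to verify; a minimal sketch using scipy for the normal quantile:

```python
from math import sqrt
from scipy.stats import norm

# Sample means, known standard deviations, and sample sizes from the example
xbar, ybar = 501, 498
sigma_x, sigma_y = 11, 13
nx, ny = 10, 10

u = (xbar - ybar) / sqrt(sigma_x**2 / nx + sigma_y**2 / ny)
u_crit = norm.ppf(0.975)      # ~1.96

print(round(u, 3))            # 0.557
print(abs(u) > u_crit)        # False -> no grounds to reject H0
```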
Model II: comparison of means, variance unknown but assumed equal, significance level α
$X_1, X_2, \ldots, X_{n_X}$ are an IID sample from $N(\mu_X, \sigma^2)$; $Y_1, Y_2, \ldots, Y_{n_Y}$ are an IID sample from $N(\mu_Y, \sigma^2)$ with $\sigma^2$ unknown; the samples are independent. Let
$$S_X^2 = \frac{1}{n_X - 1}\sum_{i=1}^{n_X}(X_i - \bar X)^2, \qquad S_Y^2 = \frac{1}{n_Y - 1}\sum_{i=1}^{n_Y}(Y_i - \bar Y)^2.$$
$H_0: \mu_X = \mu_Y$. Test statistic (assuming $H_0$ is true):
$$T = \frac{\bar X - \bar Y}{\sqrt{(n_X - 1)S_X^2 + (n_Y - 1)S_Y^2}}\sqrt{\frac{n_X n_Y}{n_X + n_Y}(n_X + n_Y - 2)} \sim t(n_X + n_Y - 2).$$
$H_0: \mu_X = \mu_Y$ against $H_1: \mu_X > \mu_Y$: critical region $C^* = \{x : T(x) > t_{1-\alpha}(n_X + n_Y - 2)\}$
$H_0: \mu_X = \mu_Y$ against $H_1: \mu_X \neq \mu_Y$: critical region $C^* = \{x : |T(x)| > t_{1-\alpha/2}(n_X + n_Y - 2)\}$
Model II: comparison of means, variance unknown but assumed equal, cont.
The test statistic
$$T = \frac{\bar X - \bar Y}{\sqrt{(n_X - 1)S_X^2 + (n_Y - 1)S_Y^2}}\sqrt{\frac{n_X n_Y}{n_X + n_Y}(n_X + n_Y - 2)} \sim t(n_X + n_Y - 2)$$
can be rewritten as
$$T = \frac{\bar X - \bar Y}{S^* \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}} \sim t(n_X + n_Y - 2),$$
where
$$S^{*2} = \frac{(n_X - 1)S_X^2 + (n_Y - 1)S_Y^2}{n_X + n_Y - 2}$$
is an estimator of the variance $\sigma^2$ based on the two samples.
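As a sanity check, the pooled-variance form of $T$ agrees with scipy's equal-variance two-sample t-test. A sketch on synthetic data (the samples below are assumptions, not lecture data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(5.0, 2.0, size=12)
y = rng.normal(5.5, 2.0, size=15)
nx, ny = len(x), len(y)

# Pooled variance estimator S*^2 and the T statistic from the slide
sx2 = x.var(ddof=1)
sy2 = y.var(ddof=1)
s_star2 = ((nx - 1) * sx2 + (ny - 1) * sy2) / (nx + ny - 2)
t_manual = (x.mean() - y.mean()) / np.sqrt(s_star2 * (1 / nx + 1 / ny))

# scipy's equal-variance two-sample t-test should give the same statistic
t_scipy, p_value = stats.ttest_ind(x, y, equal_var=True)
print(np.isclose(t_manual, t_scipy))   # True
```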
Model II: comparison of variances, significance level α
$X_1, X_2, \ldots, X_{n_X}$ are an IID sample from $N(\mu_X, \sigma_X^2)$; $Y_1, Y_2, \ldots, Y_{n_Y}$ are an IID sample from $N(\mu_Y, \sigma_Y^2)$; $\sigma_X^2, \sigma_Y^2$ are unknown; the samples are independent.
$H_0: \sigma_X = \sigma_Y$. Test statistic (assuming $H_0$ is true):
$$F = \frac{S_X^2}{S_Y^2} \sim F(n_X - 1, n_Y - 1),$$
where $S_X^2$, $S_Y^2$ are the unbiased sample variances as before.
$H_0: \sigma_X = \sigma_Y$ against $H_1: \sigma_X > \sigma_Y$: critical region $C^* = \{x : F(x) > F_{1-\alpha}(n_X - 1, n_Y - 1)\}$
$H_0: \sigma_X = \sigma_Y$ against $H_1: \sigma_X \neq \sigma_Y$: critical region
$$C^* = \{x : F(x) < F_{\alpha/2}(n_X - 1, n_Y - 1) \;\vee\; F(x) > F_{1-\alpha/2}(n_X - 1, n_Y - 1)\}$$
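As far as I know scipy does not ship a ready-made two-sample variance-ratio test (it offers Bartlett's and Levene's tests instead), so a direct sketch of the F-test on assumed synthetic data:

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, size=20)
y = rng.normal(0.0, 1.5, size=25)
nx, ny = len(x), len(y)
alpha = 0.05

# F = S_X^2 / S_Y^2 ~ F(n_X - 1, n_Y - 1) under H0: sigma_X = sigma_Y
F_stat = x.var(ddof=1) / y.var(ddof=1)

# Two-sided critical values F_{alpha/2} and F_{1-alpha/2}
lo = f.ppf(alpha / 2, nx - 1, ny - 1)
hi = f.ppf(1 - alpha / 2, nx - 1, ny - 1)
reject = (F_stat < lo) or (F_stat > hi)
print(F_stat, reject)
```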
Model II: comparison of means, variances unknown and no equality assumption
$X_1, X_2, \ldots, X_{n_X}$ are an IID sample from $N(\mu_X, \sigma_X^2)$; $Y_1, Y_2, \ldots, Y_{n_Y}$ are an IID sample from $N(\mu_Y, \sigma_Y^2)$; $\sigma_X^2, \sigma_Y^2$ are unknown; the samples are independent. $H_0: \mu_X = \mu_Y$.
A natural test statistic would be very simple:
$$\frac{\bar X - \bar Y}{\sqrt{S_X^2/n_X + S_Y^2/n_Y}} \sim\ ?$$
But it is not possible to design a test statistic whose distribution does not depend on the (unknown) values of $\sigma_X^2$ and $\sigma_Y^2$...
Model III: comparison of means for large samples, significance level α
$X_1, X_2, \ldots, X_{n_X}$ are an IID sample from a distribution with mean $\mu_X$; $Y_1, Y_2, \ldots, Y_{n_Y}$ are an IID sample from a distribution with mean $\mu_Y$; both distributions have unknown variances; the samples are independent; $n_X, n_Y$ – large.
$H_0: \mu_X = \mu_Y$. Test statistic (assuming $H_0$ is true, approximately for large samples):
$$U = \frac{\bar X - \bar Y}{\sqrt{S_X^2/n_X + S_Y^2/n_Y}} \sim N(0,1),$$
where $S_X^2$, $S_Y^2$ are the unbiased sample variances as before.
$H_0: \mu_X = \mu_Y$ against $H_1: \mu_X > \mu_Y$: critical region $C^* = \{x : U(x) > u_{1-\alpha}\}$
$H_0: \mu_X = \mu_Y$ against $H_1: \mu_X \neq \mu_Y$: critical region $C^* = \{x : |U(x)| > u_{1-\alpha/2}\}$
Model III – example (equality of means?)
Model IV: comparison of fractions for large samples, significance level α
Two independent IID samples from two-point distributions: $X$ – number of successes in $n_X$ trials with probability of success $p_X$; $Y$ – number of successes in $n_Y$ trials with probability of success $p_Y$; $p_X$ and $p_Y$ unknown; $n_X$ and $n_Y$ large.
$H_0: p_X = p_Y$. Test statistic (assuming $H_0$ is true, approximately for large samples):
$$U^* = \frac{\frac{X}{n_X} - \frac{Y}{n_Y}}{\sqrt{p^*(1-p^*)\left(\frac{1}{n_X} + \frac{1}{n_Y}\right)}} \sim N(0,1), \qquad \text{where } p^* = \frac{X + Y}{n_X + n_Y}.$$
$H_0: p_X = p_Y$ against $H_1: p_X > p_Y$: critical region $C^* = \{x : U^*(x) > u_{1-\alpha}\}$
$H_0: p_X = p_Y$ against $H_1: p_X \neq p_Y$: critical region $C^* = \{x : |U^*(x)| > u_{1-\alpha/2}\}$
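A sketch of this pooled two-fraction test in Python; the counts below are hypothetical, chosen only to illustrate the computation:

```python
from math import sqrt
from scipy.stats import norm

# Hypothetical counts (not from the lecture): x, y successes in n_x, n_y trials
x, nx = 90, 200
y, ny = 60, 180

# Pooled fraction p* under H0: p_X = p_Y
p_star = (x + y) / (nx + ny)
u = (x / nx - y / ny) / sqrt(p_star * (1 - p_star) * (1 / nx + 1 / ny))

p_two_sided = 2 * norm.cdf(-abs(u))
print(round(u, 3), round(p_two_sided, 3))
```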
Model IV – example (equality of probabilities?)
Tests for more than two populations
A naive approach: pairwise tests for all pairs. But in this case the overall probability of a type I error is higher than the significance level assumed for each individual test...
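The inflation can be quantified: if the $\binom{k}{2}$ pairwise tests were independent, each at level $\alpha$, the probability of at least one false rejection would be $1 - (1-\alpha)^m$. A small sketch (the independence assumption is a simplification, since pairwise tests on shared samples are correlated):

```python
from math import comb

alpha = 0.05

# Familywise error rate if all m = C(k, 2) pairwise tests were independent,
# each performed at level alpha: P(at least one false rejection) = 1 - (1 - alpha)^m
fwer = {k: 1 - (1 - alpha) ** comb(k, 2) for k in (3, 5, 10)}

for k, rate in fwer.items():
    print(f"k={k}: m={comb(k, 2)} pairwise tests, FWER = {rate:.3f}")
# k=3:  m=3   -> 0.143
# k=5:  m=10  -> 0.401
# k=10: m=45  -> 0.901
```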
More populations
Assume we have k samples:
$$X_{1,1}, X_{1,2}, \ldots, X_{1,n_1},$$
$$X_{2,1}, X_{2,2}, \ldots, X_{2,n_2}, \ldots$$
$$X_{k,1}, X_{k,2}, \ldots, X_{k,n_k},$$
and:
all $X_{i,j}$ are independent ($i = 1, \ldots, k$, $j = 1, \ldots, n_i$),
$X_{i,j} \sim N(m_i, \sigma^2)$,
we do not know $m_1, m_2, \ldots, m_k$, nor $\sigma^2$;
let $n = n_1 + n_2 + \ldots + n_k$.
Test of the Analysis of Variance (ANOVA) for significance level α
$H_0: m_1 = m_2 = \ldots = m_k$ against $H_1: \neg H_0$ (i.e. not all $m_i$ are equal).
This is an LR test; we get the test statistic
$$F = \frac{\sum_{i=1}^{k} n_i(\bar X_i - \bar X)^2/(k-1)}{\sum_{i=1}^{k}\sum_{j=1}^{n_i}(X_{i,j} - \bar X_i)^2/(n-k)} \sim F(k-1, n-k),$$
where
$$\bar X_i = \frac{1}{n_i}\sum_{j=1}^{n_i} X_{i,j}, \qquad \bar X = \frac{1}{n}\sum_{i=1}^{k}\sum_{j=1}^{n_i} X_{i,j} = \frac{1}{n}\sum_{i=1}^{k} n_i \bar X_i,$$
with critical region
$$C^* = \{x : F(x) > F_{1-\alpha}(k-1, n-k)\}.$$
For $k = 2$, the ANOVA is equivalent to the two-sample t-test.
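The F statistic above can be cross-checked against `scipy.stats.f_oneway` on synthetic data (the group data are assumptions; only the group sizes are borrowed from the example that follows):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
groups = [rng.normal(10, 2, size=n_i) for n_i in (8, 10, 9)]
k = len(groups)
n = sum(len(g) for g in groups)

# Between-group and within-group sums of squares, as in the lecture formula
grand_mean = np.concatenate(groups).mean()
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)

F_manual = (ssb / (k - 1)) / (ssw / (n - k))
F_scipy, p_value = stats.f_oneway(*groups)
print(np.isclose(F_manual, F_scipy))   # True
```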
ANOVA – interpretation
We have the decomposition
$$\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i}(X_{i,j} - \bar X)^2}_{\text{Sum of Squares (SS)}} = \underbrace{\sum_{i=1}^{k} n_i(\bar X_i - \bar X)^2}_{\text{Sum of Squares Between (SSB)}} + \underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i}(X_{i,j} - \bar X_i)^2}_{\text{Sum of Squares Within (SSW)}}.$$
$\frac{1}{k-1}\sum_{i=1}^{k} n_i(\bar X_i - \bar X)^2$ – between-group variance estimator,
$\frac{1}{n-k}\sum_{i=1}^{k}\sum_{j=1}^{n_i}(X_{i,j} - \bar X_i)^2$ – within-group variance estimator.
ANOVA test – table

source of variability | sum of squares | degrees of freedom | value of the test statistic F
----------------------|----------------|--------------------|------------------------------
between groups        | SSB            | k − 1              | –
within groups         | SSW            | n − k              | –
total                 | SS             | n − 1              | F
ANOVA test – example
Yearly chocolate consumption in three cities A, B, C, based on random samples of $n_A = 8$, $n_B = 10$, $n_C = 9$ consumers. Does consumption depend on the city? ($\alpha = 0.01$)

                | A   | B   | C
sample mean     | 11  | 10  | 7
sample variance | 3.5 | 2.8 | 3

$$\bar X = \frac{1}{27}(11 \cdot 8 + 10 \cdot 10 + 7 \cdot 9) \approx 9.3,$$
$$SSB = (11 - 9.3)^2 \cdot 8 + (10 - 9.3)^2 \cdot 10 + (7 - 9.3)^2 \cdot 9 = 75.63, \qquad SSW = 3.5 \cdot 7 + 2.8 \cdot 9 + 3 \cdot 8 = 73.7,$$
$$F = \frac{75.63/2}{73.7/24} \approx 12.31 \quad \text{and} \quad F_{0.99}(2,24) \approx 5.61.$$
$12.31 > 5.61$ → we reject $H_0$ (equality of means): consumption depends on the city.
ANOVA test – table – example

source of variability | sum of squares | degrees of freedom | value of the test statistic F
----------------------|----------------|--------------------|------------------------------
between groups        | 75.63          | 2                  | –
within groups         | 73.7           | 24                 | –
total                 | 149.33         | 26                 | 12.31
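The chocolate example can be reproduced from the summary statistics alone, since SSB needs only group means and sizes, and SSW only the sample variances. A sketch:

```python
from scipy.stats import f

# Summary statistics from the example
ns = [8, 10, 9]
means = [11, 10, 7]
variances = [3.5, 2.8, 3]
k, n = len(ns), sum(ns)

grand_mean = sum(n_i * m for n_i, m in zip(ns, means)) / n
ssb = sum(n_i * (m - grand_mean) ** 2 for n_i, m in zip(ns, means))
ssw = sum((n_i - 1) * v for n_i, v in zip(ns, variances))

F_stat = (ssb / (k - 1)) / (ssw / (n - k))
crit = f.ppf(0.99, k - 1, n - k)   # F_{0.99}(2, 24)

print(round(F_stat, 2))            # 12.31
print(F_stat > crit)               # True -> reject H0
```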