doi:10.7151/dmps.1152
MULTIVARIATE MULTIPLE COMPARISONS WITH A CONTROL IN ELLIPTICAL POPULATIONS
Naoya Okamoto
Department of Food Sciences, Tokyo Seiei College 1-4-6, Nishishinkoiwa, Katsushika-ku, Tokyo 124-8530 Japan
e-mail: n okamoto@ayoan.jp and
Takashi Seo
Department of Mathematical Information Science, Tokyo University of Science 1-3 Kagurazaka, Shinjuku-ku, Tokyo 162-8601, Japan
e-mail: seo@rs.tus.ac.jp
Abstract
The approximate upper percentile of Hotelling’s T
2-type statistic is de- rived in order to construct simultaneous confidence intervals for comparisons with a control under elliptical populations with unequal sample sizes. Ac- curacy and conservativeness of Bonferroni approximations are evaluated via a Monte Carlo simulation study. Finally, we explain the real data analysis using procedures derived in this paper.
Keywords: comparisons with a control, Bonferroni approximation and Monte Carlo simulation.
2010 Mathematics Subject Classification: 62H10, 60E05 and 65C05.
1. Introduction
Simultaneous confidence intervals for comparisons with a control among mean
vectors are considered under k independent elliptical populations with unequal
sample sizes. In order to construct them, it is necessary to obtain the upper
percentile of T
max ·c2which is Hotelling’s T
2-type statistic. However, it is difficult
to obtain upper percentiles exactly even when populations have the multivariate
normal distribution. In order to obtain conservative approximate simultaneous confidence intervals, Bonferroni’s inequality is applied to T
2-type statistic. Under elliptical populations with equal sample sizes, the first and the modified second order Bonferroni approximations for pairwise multiple comparisons are discussed by Seo [6]. Under elliptical populations with unequal sample sizes, these are discussed by Okamoto and Seo [5] and Okamoto [4]. This paper gives them for comparisons with a control, and their accuracy and conservativeness are evaluated via a Monte Carlo simulation study. Finally, an actual procedure is explained using the school-record data of the second-year student in a junior high school in Tokyo. Also, for graphical approaches using weighted Bonferroni, see e.g. Bretz et al. [1].
For the j-th population, a p ×1 random vector x
(j)is said to have an elliptical distribution with parameters µ
(j)(p × 1) and Λ
(j)(p × p) if its density function is of the form
f (x
(j)) = c
(j)p|Λ
(j)|
−12g
jn
(x
(j)− µ
(j))
′Λ
(j)−1(x
(j)− µ
(j)) o
for some non-negative function g
j, where c
(j)pis a normalizing constant and Λ
(j)is a positive definite. The characteristic function of the vector x
(j)is φ
j(t) = exp(it
′µ
(j))ψ
j(t
′Λ
(j)t) for some function ψ
j, and E[x
(j)] = µ
(j)and Σ
(j)= Cov[x
(j)] = −2ψ
j′(0)Λ
(j), if they exist. Throughout this paper, we assume Σ = Σ
(1)= · · · = Σ
(k). We define the kurtosis parameter as κ
j= {ψ
′′j(0)/(ψ
′j(0))
2} − 1.
2. A first order Bonferroni approximation
Consider simultaneous confidence intervals for comparisons with a control among k independent p-dimensional mean vectors under elliptical populations. Let x
(j)1, . . . , x
(j)Nj
(j = 1, . . . , k) be N
jindependent observations on x
(j)that has an elliptical distribution with mean vector µ
(j)and common covariance matrix Σ. Let the j-th sample mean vector, the j-th sample covariance matrix and the pooled sample covariance matrix be
x
(j)= 1 N
jNj
X
i=1
x
(j)i,
S
(j)= 1 N
j− 1
Nj
X
i=1
(x
(j)i− x
(j))(x
(j)i− x
(j))
′,
S = 1 ν
k
X
j=1
(N
j− 1)S
(j),
respectively, where ν = P
kj=1
N
j− k.
Letting the first population be a control, the simultaneous confidence intervals with the given confidence level 1 − α for comparisons with a control among mean vectors are given by
a
′(µ
(1)− µ
(m)) ∈ h
a
′(x
(1)− x
(m)) ± t
αpd
1ma
′Sa i ,
∀a ∈ R
p− {0} , 2 ≤ m ≤ k, (1)
where d
1m= 1/N
1+ 1/N
m, R
p− {0} is the set of any nonnull real p-dimensional vectors and the value t
α(≡ t > 0) satisfies as follows:
Pr T
max ·c2> t
2= α, where
T
max ·c2= max
2≤m≤k
T
1m2, T
1m2= d
−11my
(1)− y
(m)′S
−1y
(1)− y
(m), y
(j)= x
(j)− µ
(j), j = 1, . . . , k.
By using the first term of Bonferroni’s inequality for Pr T
max ·c2> t
2:
Pr T
max ·c2> t
2<
k
X
m=2
Pr T
1m2> t
2,
the approximate upper percentile t
21cof T
max ·c2is given by
k
X
m=2
Pr T
1m2> t
21c= α.
Without loss of generality, we assume Σ = I
pand N = max{N
1, N
2, . . . , N
k}.
Put r
j= N
j/N for j = 1, . . . , k, s = 1/( P
kj=1
r
j) and w
lm= pr
m/(r
l+ r
m).
Letting
x
(j)= µ
(j)+ 1 pN
jz
(j),
W
(j)= 1 N
jNj
X
i=1
(x
(j)i− µ
(j))(x
(j)i− µ
(j))
′= I
p+ 1 pN
jZ
(j),
we have
T
1m2= τ
′1mS
−1τ
1m, where
τ
1m= w
1mz
(1)− w
m1z
(m), S
−1= I
p− 1
√ N s
k
X
j=1
√ r
jZ
(j)+ 1 N
"
s
k
X
j=1
z
(j)z
(j)′+ s
2k
X
j=1
r
jZ
(j)2+ s
2
k−1
X
i=1 k
X
j=i+1
√ r
ir
jZ
(i)Z
(j)+ Z
(j)Z
(i)
− skI
p#
+ o
p(N
−1).
Using the joint density function of z
(j)and Z
(j)which is derived by Iwashita [2], the asymptotic expansion of the characteristic function of T
1m2can be written as
E[exp(itT
1m2)] = u
−p21 + 1
4N
c
(0)1m+ c
(1)1mu
−1+ c
(2)1mu
−2+ o(N
−1), where u = 1 − 2it, i = √
−1 and c
(0)1m= −sp
2+ 1
2 p(p + 2) 1
r
1w
41m− 2sw
1m2κ
1+
1 r
mw
m14− 2sw
m12κ
m− sκ
r, c
(1)1m= −2sp − p(p + 2) 1
r
1w
1m4− 4sw
1m2κ
1+
1
r
mw
4m1− 4sw
2m1κ
m+ sκ
r, c
(2)1m= sp(p + 2)
+ 1
2 p(p + 2) 1 r
1w
1m4− 6sw
1m2κ
1+
1 r
mw
4m1− 6sw
2m1κ
m+ 3sκ
r,
κ
r= s
k
X
j=1
r
jκ
j.
Using above result, the distribution of T
1m2can be expanded as
Pr(T
1m2> t
2) = Pr(χ
2p> t
2) + 1 4N
2
X
j=0
c
(j)1mPr(χ
2p+2j> t
2) + o(N
−1),
and its upper 100α percentile can be expanded as t
21m·χ2(α) = χ
2p(α) − 1
2N χ
2p(α) 1
p c
(0)1m− 1
p(p + 2) c
(2)1mχ
2p(α)
+ o(N
−1),
where χ
2p(α) is the upper 100α percentile of the χ
2distribution with p degrees of freedom. Therefore, we have the first order Bonferroni approximate upper 100α percentile of T
max ·c2as follows:
t
21·χ2·c(α) = χ
2pα
k − 1
− 1
2N (k − 1) χ
2pα k − 1
×
k
X
m=2
1
p c
(0)1m− 1
p(p + 2) c
(2)1mχ
2pα k − 1
. (2)
Also, since Hotelling’s T
2-statistic under normality is an F -statistic, we obtain another approximate upper 100α percentile of T
max ·c2as follows:
t
21·F ·c(α) = νp
ν − p + 1 F
p, ν−p+1α k − 1
− 1
2N (k − 1) χ
2pα
k − 1
×
k
X
m=2
1
p c
(0)1m+ sp
−
1
p(p + 2) c
(2)1m− s
χ
2pα k − 1
, (3)
where F
p, ν−p+1(α/(k − 1)) is the upper 100(α/(k − 1)) percentile of the F - distribution with p and ν − p + 1 degrees of freedom.
3. A modified second order Bonferroni approximation
The first order Bonferroni approximation becomes conservative too much when the number of populations or the kurtosis parameter is large. In this section, a modified second order Bonferroni procedure, which uses the first and the second terms of Bonferroni’s inequality, is described to improve conservativeness of the first order Bonferroni approximation.
Let y
1= w
12z
(1)− w
21z
(2), y
2= w
13z
(1)− w
31z
(3), . . . , y
k−1= w
1kz
(1)− w
k1z
(k). Bonferroni’s inequality for Pr{T
max ·c2> t
2} is given by
k−1
X
i=1
Pr y
′iS
−1y
i> t
2− β
c(t
2) < Pr{T
max ·c2> t
2} <
k−1
X
i=1
Pr y
′iS
−1y
i> t
2,
where
β
c(t
2) =
k−2
X
i=1 k−1
X
j=i+1
Pr y
′iS
−1y
i> t
2, y
′jS
−1y
j> t
2.
The first order Bonferroni approximation t
21cis defined as a critical value that satisfies the equality
k−1
X
i=1
Pr y
′iS
−1y
i> t
21c= α.
The second order Bonferroni approximation t
22cis defined as a critical value that satisfies the equality
k−1
X
i=1
Pr y
′iS
−1y
i> t
22c− β
c(t
22c) = α.
The modified second order Bonferroni approximation t
2M cis defined as a critical value that satisfies the equality
k−1
X
i=1
Pr y
′iS
−1y
i> t
2M c= α + β
c(t
21c), where
β
c(t
21c) =
k−1
X
j=2 k
X
h=j+1
Pr{T
1j2> t
21c, T
1h2> t
21c}.
(4)
In order to obtain the modified second order Bonferroni approximation t
2M c, it is necessary to evaluate Pr{T
1j2> t
21c, T
1h2> t
21c}. For convenience, we discuss the joint characteristic function of T
122and T
132: E[exp(it
1T
122+ it
2T
132)] as follows.
E[exp(it
1T
122+ it
2T
132)]
= E
exp(it
1T
12(1)+ it
2T
13(1))
1 + 1
√ N D
1+ 1 N D
2+ o(N
−1), where
D
1= it
1T
12(2)+ it
2T
13(2), D
2= it
1T
12(3)+ (it
1)
22 (T
12(2))
2+ it
2T
13(3)+ (it
2)
22 (T
13(2))
2+ (it
1)(it
2)T
12(2)T
13(2), and
T
12(1)= τ
′12τ
12, T
13(1)= τ
′13τ
13, T
12(2)= −τ
′12
s
k
X
j=1
√ r
jZ
(j)
τ
12, T
13(2)= −τ
′13
s
k
X
j=1
√ r
jZ
(j)
τ
13,
T
12(3)= τ
′12
s
k
X
j=1
z
(j)z
(j)′+ s
2k
X
i=1 k
X
j=1
√ r
ir
jZ
(i)Z
(j)− skI
p
τ
12,
T
13(3)= τ
′13
s
k
X
j=1
z
(j)z
(j)′+ s
2k
X
i=1 k
X
j=1
√ r
ir
jZ
(i)Z
(j)− skI
p
τ
13,
and
τ
12= w
1z
(1)− w
2z
(2), w
1≡ w
12=
r r
2r
1+ r
2, w
2≡ w
21=
r r
1r
1+ r
2, τ
13= w
3z
(1)− w
4z
(3), w
3≡ w
13=
r r
3r
1+ r
3, w
4≡ w
31=
r r
1r
1+ r
3.
Using the joint density function of z
(j)and Z
(j), we obtain an asymptotic ex- pansion for the expectation of exp(it
1T
12(1)+ it
2T
13(1)) in elliptical distributions as follows.
E[exp(it
1T
12(1)+ it
2T
13(1))]
= U
−p2+ 1
8N p(p + 2)U
−p2−2× 1
r
1{(u
1− 1)u
2w
21+ (u
2− 1)u
1w
23− 2(u
1− 1)(u
2− 1)v
0}
2κ
1+ 1
r
2(u
1− 1)
2u
22w
42κ
2+ 1
r
3u
21(u
2− 1)
2w
44κ
3+ o(N
−1),
where U = u
1u
2− (u
1− 1)(u
2− 1)v
0, u
1= 1 − 2it
1, u
2= 1 − 2it
2, v
0= w
21w
32. Let λ
1= 1 − 2(1 − v
0)it
1, λ
2= 1 − 2(1 − v
0)it
2, then u
1= (λ
1− v
0)/(1 − v
0), u
2= (λ
2− v
0)/(1 − v
0) and
U
−p2= λ
1λ
2− v
01 − v
0 −p2
= (1 − v
0)
p2∞
X
m=0 1 2
p
m
m! v
m0λ
−p 2−m
1
λ
−p 2−m
2
,
where
1 2 p
m
= Γ
p2+ m Γ
p2= p
2
p 2 + 1
· · · p
2 + m − 1
.
Repeating such calculations about expectation of z
(j)and Z
(j), an asymptotic expansion for the joint probability Pr n
T
1j2> t
21c, T
1h2> t
21co
is given by Pr T
1j2> t
21c, T
1h2> t
21c= (1 − v
0)
p2∞
X
m=0 1 2
p
m
m! v
m0×
G
2p2+m
(η
2) + 1 N
n d
1g
p2+m
(η
2)G
p2+m
(η
2) + d
2g
2p2+m
(η
2) o
+ o(N
−1), where
η
2= 1
2(1 − v
0) t
21c, G
p2+m
(η
2) = Z
∞η2
g
p2+m
(t)dt, g
p2+m
(t) = 1 Γ
p2+ m t
p
2+m−1
e
−t, and
d
1= η
232v
2132sv
21(p − 2m + 2η
2) + 8sv
1d
11+ d
12, d
2= η
2216qv
21(p + 2m) 32sqv
21(2m + 1) + 8sv
1d
21+ d
22, d
11= 2 [3(m − η
2v
0) + v
1v
2{2η
2(2v
1− 1) + q}] κ
1+ 2v
1w
22(4v
1η
2+ q) + 9m + η
2v
1(4w
21− 13) − 9 κ
j+ 2v
1w
24(4v
1η
2+ q) + 9m + η
2v
1(4w
23− 13) − 9 κ
h+ [2v
1{p + 6m − 6η
2(2v
1+ 1) + 2}] κ
r,
d
12= 8 1
r
1(2η
2− q)v
12(v
22− 2v
0) + m − η
2(v
1+ 1)
κ
1+ 8
r
j(2η
2− q)v
12w
42+ 5m − 5η
2(v
1+ 1)
κ
j+ 8
r
h(2η
2− q)v
12w
44+ 5m − 5η
2(v
1+ 1)
κ
h,
d
21= 4v
0η
22{4v
0(v
2− 4) + 4v
2− 1} + {−8v
0+ 2(v
0+ 1)v
2+ 1} q
2− {p − 2v
0η
2(4v
2(v
0− 4) + 21) + 2} q] κ
1+ 2v
0η
22−8(v
0+ 1)w
21+ 8v
0+ 3
+ v
0η
2−8(v
0− 4)w
12+ 8v
0− 41 q + 5m
2+ 2(p + 2)
2(v
0+ 1)w
22+ (p + m + 2)m −8(v
0+ 1)w
21+ 8v
0+ 13 κ
j+ 2v
0η
22−8(v
0+ 1)w
23+ 8v
0+ 3
+ v
0η
2−8(v
0− 4)w
32+ 8v
0− 41 q + 5m
2+ 2(p + 2)
2(v
0+ 1)w
42+ (p + m + 2)m −8(v
0+ 1)w
23+ 8v
0+ 13 κ
h+ [2v
1(p + 6m − 12v
0η
2+ 2)q] κ
r, d
22= h
4 (m − 2v
0η
2)q − 2v
0η
22+ 8v
0r
1(v
2− 2)
2+ v
12v
1− v
22+ 4 q
2+ 4η
2(2v
1− v
2+ 2)(v
2− 2)q + 4v
0η
22(v
2− 2)
2] i κ
1+ 8v
0w
42r
j(2η
2− q) {2v
0η
2+ (v
1− 1)q} + (m − 5v
0η
2)q − 2v
0η
22κ
j+ 8v
0w
44r
h(2η
2− q) {2v
0η
2+ (v
1− 1)q} + (m − 5v
0η
2)q − 2v
0η
22κ
h,
q = p + 2m + 2, w
1≡ w
1j, w
2≡ w
j1, w
3≡ w
1h, w
4≡ w
h1, v
1= v
0− 1, v
2= w
21+ w
23.
Therefore, the modified second order Bonferroni approximate upper 100α per- centiles of T
max ·c2are obtained as follows:
t
2M ·χ2·c(α) = χ
2p(γ
c) − 1
2N (k − 1) χ
2p(γ
c)
×
k
X
m=2
1
p c
(0)1m− 1
p(p + 2) c
(2)1mχ
2p(γ
c)
, (5)
t
2M ·F ·c(α) = νp
ν − p + 1 F
p, ν−p+1(γ
c) − 1
2N (k − 1) χ
2p(γ
c)
×
k
X
m=2