GENERALIZED F TESTS AND SELECTIVE GENERALIZED F TESTS FOR ORTHOGONAL
AND ASSOCIATED MIXED MODELS
Célia Nunes
Mathematics Department, University of Beira Interior, Covilhã, Portugal
e-mail: celia@mat.ubi.pt
Iola Pinto
Superior Institute of Engineering of Lisbon, Scientific Area of Mathematics, Lisboa, Portugal
and
João Tiago Mexia
Mathematics Department, Faculty of Science and Technology, New University of Lisbon, Monte da Caparica, Portugal
e-mail: jtm@fct.unl.pt
Abstract
The statistics of generalized F tests are quotients of linear combinations of independent chi-squares. Given a parameter, $\theta$, for which we have a quadratic unbiased estimator, $\tilde{\theta}$, the test statistic, for the hypothesis of nullity of that parameter, is the quotient of the positive part by the negative part of such an estimator. Using generalized polar coordinates it is possible to obtain selective generalized F tests which are especially powerful for selected families of alternatives.
We build both classes of tests for orthogonal and associated mixed models. The associated models are obtained by adding terms to the orthogonal models.
Keywords: selective generalized F tests, generalized polar coordinates, associated models.
2000 Mathematics Subject Classification: 62J12, 62H15, 62H10.
1. Introduction
Generalized F tests were introduced by Michalski and Zmyślony (1996, 1999), first for variance components and later for linear combinations of parameters in mixed linear models. The statistics of these tests are the quotients of the positive by the negative parts of quadratic unbiased estimators.
To obtain selective generalized F tests for the fixed effects part of mixed models, generalized polar coordinates are used (see Nunes and Mexia, 2004).
The statistic of these tests is the statistic of the generalized F tests for the same hypothesis coupled with a vector of central angles. In this way it is possible to increase the test power for the selected family of alternatives.
This possibility had already been considered for the usual F tests (see Dias, 1994). Moreover both F and selective F tests have been considered for balanced cross-nesting models (see Fonseca et al., 2003, and Nunes et al., 2006).
The distributions of the test statistics of generalized and selective generalized F tests have been studied (see Fonseca et al., 2002, and Nunes and Mexia, 2006).
In what follows we consider generalized and selective generalized F tests for orthogonal mixed models. In this way we extend the results of Nunes et al. (2006) for balanced cross-nesting models. We will obtain interesting monotonicity properties that enable us to consider the extension of our results to associated models. These models are obtained by adding terms to the orthogonal mixed models. Such an extension has already been considered for balanced cross-nesting (see Nunes and Mexia, 2006).
The next section is divided into two subsections, on distributions and
algebraic model structure. The results presented in this section will be used
in the study, first of generalized and then of selective generalized F tests,
for orthogonal mixed models.
2. Preliminary results

2.1. Distributions
The vectors in this section will have $k$ components, those of $1$ [$p_i$] being all equal to 1 [0 except the $i$-th, which is 1], and $q_i = 1 - p_i$, $i = 1, \ldots, k$. Moreover $u \circ v$ will be the vector with components $u_i v_i$, $i = 1, \ldots, k$, and $\chi^2_{g,\delta}$ will be a chi-square with $g$ degrees of freedom and non-centrality parameter $\delta$. We will only consider independent chi-squares.
With $h < k$, let $F_h(\cdot\,|\,a, g, \delta)$ be the distribution of

(2.1)
$$\Im_h(a, g, \delta) = \frac{\sum_{i=1}^{h} a_i \chi^2_{g_i,\delta_i}}{\sum_{i=h+1}^{k} a_i \chi^2_{g_i,\delta_i}}.$$
In Nunes and Mexia (2006) it was shown that

(2.2)
$$F_h(z\,|\,a, g, \delta) = e^{-\frac{1}{2}\sum_{i=1}^{k}\delta_i} \sum_{j_1=0}^{+\infty} \cdots \sum_{j_k=0}^{+\infty} \frac{\prod_{i=1}^{k}\left(\frac{\delta_i}{2}\right)^{j_i}}{\prod_{i=1}^{k} j_i!}\, F_h(z\,|\,a, g + 2j, 0).$$
Considering the $\ell$-th component, $\delta_\ell$, of the non-centrality parameter vector $\delta$, we can rewrite the previous expression as

(2.3)
$$F_h(z\,|\,a, g, \delta) = e^{-\frac{\delta_\ell}{2}} \sum_{j=0}^{+\infty} \frac{\left(\frac{\delta_\ell}{2}\right)^{j}}{j!}\, F_h(z\,|\,a, g + 2j p_\ell, q_\ell \circ \delta).$$
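The expansions (2.2) and (2.3) rest on the classical representation of a non-central chi-square as a Poisson mixture of central chi-squares. A quick numpy-only check of that representation (parameters and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
g, delta = 4.0, 3.0
n = 200_000

# Direct draws of a non-central chi-square with g d.f. and
# non-centrality delta.
direct = rng.noncentral_chisquare(g, delta, size=n)

# Poisson mixture: J ~ Poisson(delta / 2), then a *central* chi-square
# with g + 2J degrees of freedom -- the mechanism behind the series
# expansions (2.2)-(2.3).
J = rng.poisson(delta / 2, size=n)
mixed = rng.chisquare(g + 2 * J)

# Both samples target mean g + delta and variance 2g + 4*delta.
```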
Besides this we have

(2.4)
$$\begin{cases}
\Pr\left(\dfrac{\sum_{i=1}^{h} a_i \chi^2_{g_i,\delta_i}}{\sum_{i=h+1}^{k} a_i \chi^2_{g_i,\delta_i}} < \dfrac{\sum_{i=1}^{h} a_i \chi^2_{g_i,\delta_i} + a_{i_0}\chi^2_2}{\sum_{i=h+1}^{k} a_i \chi^2_{g_i,\delta_i}}\right) = 1, & i_0 = 1, \ldots, h \\[2ex]
\Pr\left(\dfrac{\sum_{i=1}^{h} a_i \chi^2_{g_i,\delta_i}}{\sum_{i=h+1}^{k} a_i \chi^2_{g_i,\delta_i}} > \dfrac{\sum_{i=1}^{h} a_i \chi^2_{g_i,\delta_i}}{\sum_{i=h+1}^{k} a_i \chi^2_{g_i,\delta_i} + a_{i_0}\chi^2_2}\right) = 1, & i_0 = h+1, \ldots, k,
\end{cases}$$
and since the second fractions will have distribution $F_h(\cdot\,|\,a, g + 2p_{i_0}, \delta)$, $i_0 = 1, \ldots, k$, we have

(2.5)
$$\begin{cases}
F_h(z\,|\,a, g + 2(j+1)p_{i_0}, \delta) < F_h(z\,|\,a, g + 2j p_{i_0}, \delta), & j = 0, 1, \ldots;\; i_0 = 1, \ldots, h \\
F_h(z\,|\,a, g + 2j p_{i_0}, \delta) < F_h(z\,|\,a, g + 2(j+1)p_{i_0}, \delta), & j = 0, 1, \ldots;\; i_0 = h+1, \ldots, k.
\end{cases}$$
Now

(2.6)
$$\frac{\partial F_h(z\,|\,a, g, \delta)}{\partial \delta_{i_0}} = \frac{1}{2}\, e^{-\frac{\delta_{i_0}}{2}} \sum_{j=0}^{+\infty} \frac{\left(\frac{\delta_{i_0}}{2}\right)^{j}}{j!} \left[ F_h(z\,|\,a, g + 2(j+1)p_{i_0}, q_{i_0} \circ \delta) - F_h(z\,|\,a, g + 2j p_{i_0}, q_{i_0} \circ \delta) \right],$$
so

(2.7)
$$\begin{cases}
\dfrac{\partial F_h(z\,|\,a, g, \delta)}{\partial \delta_{i_0}} < 0, & i_0 = 1, \ldots, h \\[1.5ex]
\dfrac{\partial F_h(z\,|\,a, g, \delta)}{\partial \delta_{i_0}} > 0, & i_0 = h+1, \ldots, k.
\end{cases}$$
Let us now assume that the $\delta_i$, $i = 1, \ldots, k$, are realizations of the non-negative random variables $V_i$, $i = 1, \ldots, k$, components of $V$, with distribution $G_V$ and moment generating function $\lambda_V$. We put

(2.8)
$$\lambda_V^{\langle j \rangle}(u) = \frac{\partial^{\,j_1 + \cdots + j_k} \lambda_V(u)}{\prod_{i=1}^{k} \partial u_i^{j_i}},$$

and point out that $\lambda_V(u)$ is defined whenever $u \leq 0$.
The distribution of $\Im_h(a, g, V)$ will be

(2.9)
$$\begin{aligned}
F_h(z\,|\,a, g, \lambda_V) &= \int_0^{+\infty} \cdots \int_0^{+\infty} e^{-\frac{1}{2}\sum_{i=1}^{k} v_i} \sum_{j_1=0}^{+\infty} \cdots \sum_{j_k=0}^{+\infty} \frac{\prod_{i=1}^{k}\left(\frac{v_i}{2}\right)^{j_i}}{\prod_{i=1}^{k} j_i!}\, F_h(z\,|\,a, g + 2j, 0)\, dG_V(v) \\
&= \sum_{j_1=0}^{+\infty} \cdots \sum_{j_k=0}^{+\infty} \frac{\lambda_V^{\langle j \rangle}\!\left(-\tfrac{1}{2}\, 1\right)}{\prod_{i=1}^{k} \left(2^{j_i} j_i!\right)}\, F_h(z\,|\,a, g + 2j, 0).
\end{aligned}$$
It is also easy to see that, if $\Pr(V > 0) = 1$,

(2.10)
$$\begin{cases}
F_h(z\,|\,a, g, \lambda_V) < F_h(z\,|\,a, g, \lambda_{q_i \circ V}), & i = 1, \ldots, h \\
F_h(z\,|\,a, g, \lambda_V) > F_h(z\,|\,a, g, \lambda_{q_i \circ V}), & i = h+1, \ldots, k,
\end{cases}$$

since the $i$-th component of $q_i \circ V$ will be null while the corresponding component of $V$ will be positive with probability one.
2.2. Model structure
In this section we will use commutative Jordan algebras (CJA). These are linear spaces constituted by symmetric matrices that commute, containing the squares of their matrices. Seely (1971) showed that for any CJA $A$ there exists one and only one basis, the principal basis $pb(A)$ of $A$, constituted by pairwise orthogonal projection matrices.
If $\mathbf{Q} = pb(A) = \{Q_1, \ldots, Q_\ell\}$, given an orthogonal projection matrix $Q \in A$, we will have $Q = \sum_{j=1}^{\ell} a_j Q_j$ but, since $Q$ is idempotent and the $Q_1, \ldots, Q_\ell$ are pairwise orthogonal, $a_j = 0$ or $a_j = 1$, $j = 1, \ldots, \ell$. Thus any orthogonal projection matrix belonging to a CJA will be the sum of all or part of the matrices in the principal basis.
Let us now consider symmetric matrices $M_1, \ldots, M_w$ belonging to a CJA $A_1$ contained in another CJA $A_2$. With $pb(A_u) = \{Q_{u,1}, \ldots, Q_{u,\ell_u}\}$, $u = 1, 2$, we will have

(2.11)
$$M_i = \sum_{j=1}^{\ell_u} b_{u,i,j}\, Q_{u,j}, \quad i = 1, \ldots, w,\; u = 1, 2,$$
as well as

(2.12)
$$Q_{1,j} = \sum_{j' \in \varphi_j} Q_{2,j'}, \quad j = 1, \ldots, \ell_1,$$

where the $\varphi_1, \ldots, \varphi_{\ell_1}$ are pairwise disjoint sets. If we put $B_u = [b_{u,i,j}]$, $u = 1, 2$, we see that the columns of $B_2$ with indexes in a set $\varphi_j$, $j = 1, \ldots, \ell_1$, are equal. Thus $\mathrm{rank}(B_1) = \mathrm{rank}(B_2)$. Moreover, if
(2.13)
$$B_u = \begin{bmatrix} B_{u,1,1} & 0 \\ B_{u,2,1} & B_{u,2,2} \end{bmatrix}, \quad u = 1, 2,$$

where $B_{u,1,1}$ has $m$ rows and $t_u$ columns, so that $B_{u,2,1}$ will have $w - m$ rows and also $t_u$ columns, and $B_{u,2,2}$ $w - m$ rows and $\ell_u - t_u$ columns, $u = 1, 2$.
We will also have $\mathrm{rank}(B_{1,1,1}) = \mathrm{rank}(B_{2,1,1})$, $\mathrm{rank}(B_{1,2,1}) = \mathrm{rank}(B_{2,2,1})$ and $\mathrm{rank}(B_{1,2,2}) = \mathrm{rank}(B_{2,2,2})$. Thus the row vectors of $B_{1,2,2}$ are linearly independent if and only if the row vectors of $B_{2,2,2}$ are linearly independent. As we shall see, this observation will be important.
Let us consider a normal mixed model

(2.14)
$$Y = \sum_{i=1}^{m} X_i \beta_i + \sum_{i=m+1}^{w} X_i \tilde{\beta}_i,$$

where $\beta_1, \ldots, \beta_m$ are fixed and the $\tilde{\beta}_{m+1}, \ldots, \tilde{\beta}_w$ are normal, independent, with null mean vectors and variance-covariance matrices $\sigma_i^2 I_{c_i}$, $i = m + 1, \ldots, w$.
Often $X_w = I_n$ and $\tilde{\beta}_w = e$, an error vector. Then $Y$ will be normal with mean vector
(2.15)
$$\mu = \sum_{i=1}^{m} X_i \beta_i$$

and variance-covariance matrix

(2.16)
$$\Sigma(Y) = \sum_{i=m+1}^{w} \sigma_i^2 M_i,$$

where $M_i = X_i X_i^\top$, $i = 1, \ldots, w$. This model is orthogonal when the matrices $M_i$ commute.
Now (see Schott, 1997, p. 157) the matrices $M_i$, $i = 1, \ldots, w$, commute if and only if they are diagonalized by the same orthogonal matrix $P$. Thus if the model is orthogonal, $M_1, \ldots, M_w \in \mathcal{V}(P)$, with $\mathcal{V}(P)$ the family of matrices diagonalized by $P$, which is a CJA. So the model is orthogonal if and only if the matrices $M_1, \ldots, M_w$ belong to a CJA. Since the intersection of CJA is a CJA, there will be a minimal CJA $\dot{A} = A(M)$ containing $M = \{M_1, \ldots, M_w\}$: the CJA generated by $M$. With $\dot{\mathbf{Q}} = \{\dot{Q}_1, \ldots, \dot{Q}_{\dot{\ell}}\} = pb(\dot{A})$ we have
(2.17)
$$M_i = \sum_{j=1}^{\dot{\ell}} \dot{b}_{i,j}\, \dot{Q}_j, \quad i = 1, \ldots, w.$$
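A concrete sketch of this structure, for a hypothetical balanced one-way random model (layout and sizes chosen for illustration only): the matrices $M_i = X_i X_i^\top$ commute, and the principal basis of the CJA they generate consists of the familiar ANOVA projections.

```python
import numpy as np

# Balanced one-way random model with a = 3 groups of b = 4 observations:
# X_1 = 1_n (fixed mean), X_2 = I_a (x) 1_b (random effects), X_3 = I_n.
a, b = 3, 4
n = a * b
Jn = np.ones((n, n))
M1 = Jn                                    # X_1 X_1^T
M2 = np.kron(np.eye(a), np.ones((b, b)))   # X_2 X_2^T
M3 = np.eye(n)                             # X_3 X_3^T

# The M_i commute, so the model is orthogonal. Principal basis of the
# generated CJA: grand-mean, between-group, and within-group projections.
Q1 = Jn / n
Q2 = np.kron(np.eye(a), np.ones((b, b)) / b) - Q1
Q3 = np.eye(n) - Q1 - Q2

# As in (2.17): M1 = n*Q1, M2 = b*(Q1 + Q2), M3 = Q1 + Q2 + Q3.
```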
Now the space $\Omega$ spanned by $\mu$ is the range space of

(2.18)
$$\sum_{i=1}^{m} M_i = \sum_{j \in D} \left( \sum_{i=1}^{m} \dot{b}_{i,j} \right) \dot{Q}_j,$$

with $D = \left\{ j : \sum_{i=1}^{m} \dot{b}_{i,j} \neq 0 \right\}$. Thus the orthogonal projection matrix $T$ on $\Omega$ will be

(2.19)
$$T = \sum_{j \in D} \dot{Q}_j.$$
We can always reorder the matrices in $pb(\dot{A})$ to get $D = \{1, \ldots, \dot{d}\}$. Then, since the matrices $M_i$, $i = 1, \ldots, w$, are positive semi-definite,

$$\dot{b}_{i,j} = 0, \quad j = \dot{d} + 1, \ldots, \dot{\ell},\; i = 1, \ldots, m,$$

and so

(2.20)
$$\dot{B} = \begin{bmatrix} \dot{B}_{1,1} & 0 \\ \dot{B}_{2,1} & \dot{B}_{2,2} \end{bmatrix}.$$
As we saw, if for another CJA containing $M$ we have

(2.21)
$$M_i = \sum_{j=1}^{k} b_{i,j} Q_j, \quad i = 1, \ldots, w,$$

we will have

(2.22)
$$B = \begin{bmatrix} B_{1,1} & 0 \\ B_{2,1} & B_{2,2} \end{bmatrix},$$

and the row vectors of $B_{2,2}$ are linearly independent if and only if the row vectors of $\dot{B}_{2,2}$ are linearly independent.
We then have $b_{i,j} = 0$, $j = d + 1, \ldots, k$; $i = 1, \ldots, m$, and

(2.23)
$$\Sigma(Y) = \sum_{i=m+1}^{w} \sigma_i^2 \sum_{j=1}^{k} b_{i,j} Q_j = \sum_{j=1}^{k} \gamma_j Q_j,$$

with

(2.24)
$$\gamma_j = \sum_{i=m+1}^{w} b_{i,j}\, \sigma_i^2.$$
Putting $\gamma(1) = (\gamma_1, \ldots, \gamma_d)$, $\gamma(2) = (\gamma_{d+1}, \ldots, \gamma_k)$ and $\sigma^2 = (\sigma_{m+1}^2, \ldots, \sigma_w^2)$ we have

(2.25)
$$\begin{cases} \gamma(1) = B_{2,1}^\top \sigma^2 \\ \gamma(2) = B_{2,2}^\top \sigma^2, \end{cases}$$

and, since the column vectors of $B_{2,2}^\top$ are linearly independent, we will have

(2.26)
$$\begin{cases} \sigma^2 = \left(B_{2,2}^\top\right)^+ \gamma(2) \\ \gamma(1) = B_{2,1}^\top \left(B_{2,2}^\top\right)^+ \gamma(2), \end{cases}$$

where $\left(B_{2,2}^\top\right)^+$ is the Moore-Penrose inverse of $B_{2,2}^\top$.
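The inversion in (2.26) is routine to carry out numerically; a minimal sketch with a hypothetical $2 \times 3$ block $B_{2,2}$ (its entries and the variance components are made up for illustration):

```python
import numpy as np

# Hypothetical B_{2,2} with linearly independent rows, linking the
# variance components sigma^2 to gamma(2) via gamma(2) = B_{2,2}^T sigma^2.
B22 = np.array([[4.0, 1.0, 1.0],
                [0.0, 2.0, 1.0]])
sigma2 = np.array([0.5, 1.5])
gamma2 = B22.T @ sigma2

# Since the columns of B_{2,2}^T are linearly independent, the
# Moore-Penrose inverse recovers the variance components exactly (2.26).
sigma2_hat = np.linalg.pinv(B22.T) @ gamma2
```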
Let now the row vectors of $A_j$ constitute an orthonormal basis for the range space of $Q_j$; then $Q_j = A_j^\top A_j$ and $A_j A_j^\top = I_{g_j}$, with $g_j = \mathrm{rank}(Q_j)$, $j = 1, \ldots, k$. We may assume that the observations vector spans $\mathbb{R}^n$, so that $\sum_{j=1}^{k} A_j^\top A_j = \sum_{j=1}^{k} Q_j = I_n$. Then, with

(2.27)
$$\begin{cases} \eta_j = A_j \mu, & j = 1, \ldots, k \\ \tilde{\eta}_j = A_j Y, & j = 1, \ldots, k, \end{cases}$$
we will have $\eta_j = 0$, $j = d + 1, \ldots, k$, and

(2.28)
$$\begin{cases} \mu = \sum_{j=1}^{d} A_j^\top \eta_j \\ Y = \sum_{j=1}^{k} A_j^\top \tilde{\eta}_j. \end{cases}$$

These expressions show the central part that the vectors $\eta_1, \ldots, \eta_d$ [$\tilde{\eta}_1, \ldots, \tilde{\eta}_k$] play in our model.
3. Generalized F tests
We start by obtaining sufficient and complete statistics. Since the $Q_j$, $j = 1, \ldots, k$, are pairwise orthogonal projection matrices, we will have

(3.1)
$$\Sigma(Y)^{-1} = \sum_{j=1}^{k} \frac{1}{\gamma_j} Q_j = \sum_{j=1}^{k} \frac{1}{\gamma_j} A_j^\top A_j,$$
so that, with $S_j = \|\tilde{\eta}_j\|^2$, $j = d + 1, \ldots, k$,

(3.2)
$$(Y - \mu)^\top \Sigma(Y)^{-1} (Y - \mu) = \sum_{j=1}^{k} \frac{\|\tilde{\eta}_j - \eta_j\|^2}{\gamma_j} = \sum_{j=1}^{d} \frac{\|\tilde{\eta}_j - \eta_j\|^2}{\gamma_j} + \sum_{j=d+1}^{k} \frac{S_j}{\gamma_j}.$$
Using the factorization theorem and the fact that the normal distribution belongs to the exponential family with, for these models, a parametric space containing open sets, we establish the first part of the thesis of

Theorem 1. The $\tilde{\eta}_1, \ldots, \tilde{\eta}_d$ and $S_{d+1}, \ldots, S_k$ constitute a sufficient and complete statistic. Moreover the $\tilde{\eta}_1, \ldots, \tilde{\eta}_d$, $\tilde{\gamma}(2)$ with components $\tilde{\gamma}_j = \frac{S_j}{g_j}$, $j = d + 1, \ldots, k$, $\tilde{\sigma}^2 = \left(B_{2,2}^\top\right)^+ \tilde{\gamma}(2)$ and $\tilde{\gamma}(1) = B_{2,1}^\top \left(B_{2,2}^\top\right)^+ \tilde{\gamma}(2)$ will be UMVUE.
Proof. The second part of the thesis follows from the first part and from the Blackwell-Lehmann-Scheffé theorem.
Now we can put

(3.3)
$$\sigma_i^2 = \sum_{j \in \varphi_i^+} b_{i,j} \gamma_j - \sum_{j \in \varphi_i^-} b_{i,j} \gamma_j, \quad i = m + 1, \ldots, w,$$

with $\varphi_i^+ \cup \varphi_i^- \subseteq \{d + 1, \ldots, k\}$. Thus the positive and the negative parts of an unbiased estimator for $\sigma_i^2$ will be $\sum_{j \in \varphi_i^+} b_{i,j} \frac{S_j}{g_j}$ and $\sum_{j \in \varphi_i^-} b_{i,j} \frac{S_j}{g_j}$, and the statistic for testing

(3.4)
$$H_0 : \sigma_i^2 = 0, \quad i = m + 1, \ldots, w,$$

will be
(3.5)
$$\Im = \frac{\sum_{j \in \varphi_i^+} b_{i,j} \frac{S_j}{g_j}}{\sum_{j \in \varphi_i^-} b_{i,j} \frac{S_j}{g_j}} = \frac{\sum_{j \in \varphi_i^+} \frac{b_{i,j} \gamma_j}{g_j} \chi^2_{g_j}}{\sum_{j \in \varphi_i^-} \frac{b_{i,j} \gamma_j}{g_j} \chi^2_{g_j}}.$$
The orthogonal model

(3.6)
$$Y = \sum_{j=1}^{k} A_j^\top \tilde{\eta}_j$$

has associated models given by

(3.7)
$$Y_a = Y + Y_p,$$

where

(3.8)
$$Y_p = \sum_{j=1}^{k} A_j^\top Z_j,$$
the $Z_1, \ldots, Z_k$ being independent of the $\tilde{\eta}_1, \ldots, \tilde{\eta}_k$. We take

(3.9)
$$V_j = \frac{1}{\gamma_j} \|Z_j\|^2, \quad j = 1, \ldots, k,$$
and represent by $G_i$ the joint distribution of the $V_j$ with $j \in \varphi_i^+ \cup \varphi_i^-$, $i = m + 1, \ldots, w$. With $h_i = \#(\varphi_i^+)$, $k_i - h_i = \#(\varphi_i^-)$, $a_i$ the vector of the coefficients for the positive and the negative parts of the estimator, and $g_i$ the vector of numbers of degrees of freedom, the distribution of $\Im$ for the orthogonal model will be $F_{h_i}(\cdot\,|\,a_i, g_i)$, $i = m + 1, \ldots, w$. When we go over to the associated models we get the distribution $F_{h_i}(\cdot\,|\,a_i, g_i, G_i)$, $i = m + 1, \ldots, w$. Our results of Section 2.1 show that the effect of the $Z_j$, $j = 1, \ldots, h_i$, is to "increase" the test statistic, possibly leading to pseudo-significant results. Moreover, the effect of the $Z_j$, $j = h_i + 1, \ldots, k_i$, will be to "decrease" the statistic, leading to loss of power.
Likewise, if we go over to the fixed effects part, given

(3.10)
$$\psi = W \eta_j, \quad j = 1, \ldots, d,$$

we have the UMVUE $\tilde{\psi} = W \tilde{\eta}_j$, $j = 1, \ldots, d$.
Moreover we want to test

(3.11)
$$H_0 : \psi = \psi_0.$$

Since $\tilde{\psi}$ will be normal with mean vector $\psi$ and variance-covariance matrix

$$W A_j \Sigma(Y) A_j^\top W^\top = \gamma_j W W^\top,$$
the quadratic form

$$U = \left(\tilde{\psi} - \psi_0\right)^\top \left(W W^\top\right)^+ \left(\tilde{\psi} - \psi_0\right)$$

will be (see Mexia, 1990) the product by $\gamma_j$ of a $\chi^2_{g,\delta_0}$ with $g = \mathrm{rank}(W)$ and

(3.12)
$$\delta_0 = \frac{1}{\gamma_j} \left(\psi - \psi_0\right)^\top \left(W W^\top\right)^+ \left(\psi - \psi_0\right).$$
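A Monte Carlo sketch of this distributional fact (the matrix $W$, $\gamma_j$, $\psi$, and seed are all hypothetical choices, not from the paper): draws of $\tilde{\psi} \sim N(\psi, \gamma_j W W^\top)$ give values of $U$ whose mean, after division by $\gamma_j$, should be $g + \delta_0$.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical W with linearly independent rows, so (W W^T)^+ = (W W^T)^{-1}.
W = np.array([[1.0, 0.0, 1.0],
              [0.0, 2.0, 1.0]])
gamma_j = 0.8
psi = np.array([0.5, -0.3])
psi0 = np.zeros(2)
g = np.linalg.matrix_rank(W)                  # g = rank(W) = 2
WWp = np.linalg.inv(W @ W.T)

# Draws of psi~ ~ N(psi, gamma_j * W W^T).
L = np.linalg.cholesky(gamma_j * W @ W.T)
psi_t = psi + (L @ rng.standard_normal((2, 100_000))).T
U = np.einsum('ni,ij,nj->n', psi_t - psi0, WWp, psi_t - psi0)

# Non-centrality parameter of (3.12); U / gamma_j targets mean g + delta_0.
delta0 = (psi - psi0) @ WWp @ (psi - psi0) / gamma_j
```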
When $H_0$ holds, $\delta_0 = 0$, and if the row vectors of $W$ are linearly independent, $W W^\top$ will be positive definite, so $\left(W W^\top\right)^+ = \left(W W^\top\right)^{-1}$ and the hypothesis may be rewritten as

(3.13)
$$H_0 : \lambda = 0,$$

with $\lambda = \delta_0 \gamma_j$. In what follows we will restrict ourselves to this case.
Now

(3.14)
$$E(U) = g \gamma_j + \lambda,$$

and for $\gamma_j$ we have the UMVUE

(3.15)
$$\tilde{\gamma}_j = \sum_{v \in \varphi_j^+} c_{j,v}\, \tilde{\gamma}_v - \sum_{v \in \varphi_j^-} c_{j,v}\, \tilde{\gamma}_v,$$

where $\varphi_j^+ \cup \varphi_j^- \subseteq \{d + 1, \ldots, k\}$ and the $c_{j,v}$ are elements of $B_{2,1}^\top \left(B_{2,2}^\top\right)^+$. Thus for $\lambda$ we have the quadratic unbiased estimator
(3.16)
$$\tilde{\lambda} = \left( U + g \sum_{v \in \varphi_j^-} c_{j,v} \frac{S_v}{g_v} \right) - g \sum_{v \in \varphi_j^+} c_{j,v} \frac{S_v}{g_v}.$$
So we will have a test statistic with distribution $F_{h_j}(\cdot\,|\,a, g, \delta_0 p_1)$, where $a$ has components $\gamma_j$, $\frac{g\, c_{j,v}\, \gamma_v}{g_v}$, $v \in \varphi_j^-$, and $\frac{g\, c_{j,v}\, \gamma_v}{g_v}$, $v \in \varphi_j^+$, while the components of $g$ will be $g$, $g_v$, $v \in \varphi_j^-$, and $g_v$, $v \in \varphi_j^+$, and $h_j = \#(\varphi_j^-) + 1$.
Since in the test statistic

(3.17)
$$F = \frac{U + g \sum_{v \in \varphi_j^-} c_{j,v} \frac{S_v}{g_v}}{g \sum_{v \in \varphi_j^+} c_{j,v} \frac{S_v}{g_v}}$$

only the term $U$ may have a non-null non-centrality parameter, it follows from our results in Section 2.1 that this test will be strictly unbiased.
If we go over to associated models we can reason as above to show that:

• the $Z_{j_0}$, with $j_0 \in \varphi_j^-$, "increase" the statistic, possibly leading to situations of pseudo-significance;

• the $Z_{j_0}$, with $j_0 \in \varphi_j^+$, "decrease" the statistic, leading to a loss of test power.
Moreover, if we replace $\tilde{\eta}_j$ by $\tilde{\eta}_j + Z_j$ we will have, with $\psi_0 = W \eta_0$,

$$U = \left(\tilde{\eta}_j + Z_j - \eta_0\right)^\top W^\top \left(W W^\top\right)^{-1} W \left(\tilde{\eta}_j + Z_j - \eta_0\right),$$

so when $H_0$ holds and $\delta_0 = 0$, the perturbations $Z_j$ may lead to pseudo-significant results.
4. Selective generalized F tests
To obtain selective F tests we use generalized polar coordinates. Let $\psi$ have $s$ components. Given a point in $\mathbb{R}^s$ with cartesian coordinates $(x_1, \ldots, x_s)$ and generalized polar coordinates $(r, \theta_1, \ldots, \theta_{s-1})$, we will have $r = \|x\|$ and

$$x_j = r\, \ell_j(\theta),$$

where $\theta = (\theta_1, \ldots, \theta_{s-1})$ and

(4.1)
$$\begin{cases}
\ell_1(\theta) = \cos\theta_1 \cdots \cos\theta_{s-1} \\
\ell_j(\theta) = \cos\theta_1 \cdots \cos\theta_{s-j} \sin\theta_{s-j+1}, \quad j = 2, \ldots, s-1 \\
\ell_s(\theta) = \sin\theta_1.
\end{cases}$$

For the central angles we have the bounds

(4.2)
$$\begin{cases}
-\dfrac{\pi}{2} \leq \theta_j \leq \dfrac{\pi}{2}, & j = 1, \ldots, s-2 \\[1ex]
0 \leq \theta_{s-1} < 2\pi,
\end{cases}$$

which define the domain $D$ of variation of the central angles.
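The direction cosines (4.1) can be sketched directly in code; since $\ell(\theta)$ is a unit vector, the reconstructed point satisfies $\|x\| = r$. The radius and angles below are arbitrary test values.

```python
import numpy as np

def gen_polar_to_cartesian(r, theta):
    """Map (r, theta_1, ..., theta_{s-1}) to x = r * l(theta) using the
    direction cosines of (4.1)."""
    s = len(theta) + 1
    l = np.empty(s)
    l[0] = np.prod(np.cos(theta))                          # l_1
    for j in range(2, s):                                  # l_j, j = 2, ..., s-1
        l[j - 1] = np.prod(np.cos(theta[: s - j])) * np.sin(theta[s - j])
    l[s - 1] = np.sin(theta[0])                            # l_s
    return r * l

# s = 4: three central angles within the bounds (4.2).
x = gen_polar_to_cartesian(2.0, np.array([0.3, -0.7, 1.1]))
```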
Given $x$, the corresponding vector of central angles will be $\theta(x)$. The use of generalized polar coordinates enables us to obtain tests for alternatives

(4.3)
$$H_1 : \psi = \psi_1$$

to

(4.4)
$$H_0 : \psi = \psi_0$$

such that $\theta(\psi_1 - \psi_0) \in D_1 \subset D$.
In the previous section we presented a statistic $F$ for the (non-selective) generalized F test of $H_0$. Now, when $H_0$ holds, $F$ is independent of

$$\Theta = \theta\left(\tilde{\psi} - \psi_0\right)$$

(see Nunes and Mexia, 2004), thus we now use as test statistic the pair $(F, \Theta)$, rejecting $H_0$ when $F > f$ and $\Theta \in D_1$. The test level will be the product of $1 - F_{h_j}(f\,|\,a, g)$ (see Nunes and Mexia, 2004) and

$$\Pr\left(\Theta \in D_1\,|\,H_0\right) = \frac{\Gamma\left(\frac{s}{2}\right)}{2 \pi^{s/2}} \int \cdots \int_{D_1} \cos^{s-2}\theta_1 \cdots \cos\theta_{s-2} \prod_{j=1}^{s-1} d\theta_j.$$
Many times, when $\psi = \eta_j$, $\theta(x) \in D_1$ if and only if the $g_j$ components satisfy $\ell$ order relations. Since $\tilde{\eta}_j - \eta_j$
− η
j