F AND SELECTIVE F TESTS WITH BALANCED CROSS-NESTING AND ASSOCIATED MODELS
Célia Nunes
Departamento de Matemática, Universidade da Beira Interior, 6200 Covilhã, Portugal
e-mail: celia@mat.ubi.pt

Iola Pinto
Instituto Superior de Engenharia de Lisboa

and

João Tiago Mexia
Mathematics Department, Faculty of Science and Technology, New University of Lisbon, Monte da Caparica, 2829–516 Caparica, Portugal
Abstract
F tests and selective F tests for the fixed effects part of balanced models with cross-nesting are derived. The effects of perturbations in the numerator and denominator of the F statistics are considered.
Keywords: selective F tests, associated models, cross-nesting.
2000 Mathematics Subject Classification: 62J10, 62J12, 62J99.
1. Introduction
Balanced models with cross-nesting enable us to study the
action of a large number of factors. Whenever possible, F tests are
highly recommended due to their robustness and power. In what follows
such tests are derived for the fixed effects part of the models.
Besides the usual F tests, we will consider selective F tests, which have high power for chosen alternatives. We also consider the effects of perturbations on the numerator and denominator of the statistics. These perturbations arise when additional terms are added to the models, thus originating associated models.
2. Models and hypotheses
Throughout the text, superscripts will indicate vector dimensions; $1^r$ [$0^r$] will have all $r$ components equal to 1 [0]. Moreover, $I_r$ will be the $r \times r$ identity matrix, $R(A)$ the range space of matrix $A$, and $A \otimes B$ the Kronecker product of matrices $A$ and $B$. The transpose of matrix $A$ will be denoted by $A^\top$.
Let us assume there are $L$ groups with $u_1, \dots, u_L$ factors. The number of levels for the first factors in the different groups will be $a_\ell(1)$, $\ell = 1, \dots, L$; we also put $a_\ell(0) = 1$, $\ell = 1, \dots, L$. If $u_\ell > 1$, we will have balanced nesting in the $\ell$-th group of factors. For each level of the first factor there will be $a_\ell(2)$ levels of the second factor, and so on. With $h_\ell \le u_\ell$, we will have $c_\ell(h_\ell) = \prod_{t_\ell=0}^{h_\ell} a_\ell(t_\ell)$ level combinations for the first $h_\ell$ factors in the $\ell$-th group. The number of level combinations for the factors in the $\ell$-th group will be $c_\ell = c_\ell(u_\ell)$, $\ell = 1, \dots, L$. Each level combination of the first $h_\ell$ factors of the $\ell$-th group will nest $b_\ell(h_\ell) = c_\ell / c_\ell(h_\ell)$ level combinations of the remaining factors in the group. The number of level combinations for all the factors, that is, the number of treatments, will be $c = \prod_{\ell=1}^{L} c_\ell$. If, for each treatment, we have $r$ observations, the total number of observations will be $n = cr$.
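This counting can be sketched numerically; in this hedged illustration the encoding of the groups as lists of level numbers $a_\ell(1), \dots, a_\ell(u_\ell)$ and the function name `counts` are assumptions, not from the paper.

```python
# Small sketch of the counting above; the list-of-lists encoding of the
# groups and the helper name are illustrative assumptions.
from math import prod

def counts(a, r):
    """Return the c_l, the number of treatments c, and n = c * r."""
    c_l = [prod(levels) for levels in a]   # c_l = c_l(u_l)
    c = prod(c_l)                          # number of treatments
    return c_l, c, c * r

# Example: L = 2 groups, a 3-then-2 nesting in the first group and a
# single 4-level factor in the second, with r = 5 observations per treatment.
c_l, c, n = counts([[3, 2], [4]], 5)
```

Here $c_1 = 6$, $c_2 = 4$, $c = 24$ treatments and $n = 120$ observations; $b_\ell(h_\ell)$ then follows as $c_\ell / c_\ell(h_\ell)$.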
Let $\Delta$ be the family of vectors $h^L$ with components $h_\ell = 0, \dots, u_\ell$, $\ell = 1, \dots, L$, and let us assume that the first $L_0 < L$ groups are of fixed effects factors and that the remaining groups are of random effects factors.
The vectors associated to $\mu$ and the effects and interactions between fixed effects factors constitute the sub-family $\Delta_0$ of $\Delta$. Given $h^L \in \Delta_0$, $c(h^L) = \prod_{\ell=1}^{L} c_\ell(h_\ell)$ will be the number of effects or interactions associated to $h^L$. These effects and interactions are components of a fixed vector $\beta^{c(h^L)}(h^L)$, if $h^L \ne 0^L$. Besides this, the vectors $h^L \in \Delta_{0c} = \Delta - \Delta_0$ will correspond to effects or to interactions involving one or more random effects factors. Associated to them there will be random vectors $\tilde\beta^{c(h^L)}(h^L)$ with null mean vectors and variance–covariance matrices $\sigma^2(h^L) I_{c(h^L)}$, $h^L \in \Delta_{0c}$. These vectors are independent between themselves and of the error vector $e^n$, which has null mean vector and variance–covariance matrix $\sigma^2 I_n$. Let us put

(2.1)
$\Delta_0(t^L) = \{h^L : h^L \in \Delta_0;\ t^L \le h^L\}$
$\Delta_{0c}(t^L) = \{h^L : h^L \in \Delta_{0c};\ t^L \le h^L\}.$
Taking

$X(h^L) = \bigotimes_{\ell=1}^{L} X_\ell(h_\ell) \otimes 1^r,$

where $X_\ell(h_\ell) = I_{c_\ell(h_\ell)} \otimes 1^{b_\ell(h_\ell)}$, $h_\ell = 0, \dots, u_\ell$, $\ell = 1, \dots, L$, we will use a model formulation introduced by Fonseca et al. (2003),

(2.2) $Y^n = \sum_{h^L \in \Delta_0} X(h^L)\,\beta^{c(h^L)}(h^L) + \sum_{h^L \in \Delta_{0c}} X(h^L)\,\tilde\beta^{c(h^L)}(h^L) + e^n.$
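As an illustration of how the incidence matrices of model (2.2) are assembled from Kronecker products, here is a minimal sketch; the helper names and the list-of-lists encoding of the groups are assumptions for this example.

```python
# Sketch of X_l(h_l) = I_{c_l(h_l)} (x) 1^{b_l(h_l)} and of
# X(h^L) = ((x)_l X_l(h_l)) (x) 1^r; names and encodings are assumed.
import numpy as np
from functools import reduce
from math import prod

def X_group(a_l, h_l):
    """X_l(h_l) for a group with level numbers a_l = [a_l(1), ..., a_l(u_l)]."""
    c_h = prod(a_l[:h_l])                  # c_l(h_l), with c_l(0) = 1
    b_h = prod(a_l) // c_h                 # b_l(h_l) = c_l / c_l(h_l)
    return np.kron(np.eye(c_h), np.ones((b_h, 1)))

def X(a, h, r):
    """X(h^L) for groups a = [a_1, ..., a_L] and h^L = (h_1, ..., h_L)."""
    blocks = [X_group(a_l, h_l) for a_l, h_l in zip(a, h)]
    return np.kron(reduce(np.kron, blocks), np.ones((r, 1)))
```

Each row of $X(h^L)$ has a single 1, marking the level combination of the first $h_\ell$ factors that the corresponding observation belongs to.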
With $T_s$ a matrix such that $[T_s;\ s^{-1/2} 1^s]$ is an orthogonal matrix of order $s$, we put

(2.3) $K_\ell(t_\ell) = I_{c_\ell(t_\ell - 1)} \otimes T_{a_\ell(t_\ell)} \otimes \frac{1^{b_\ell(t_\ell)}}{\sqrt{b_\ell(t_\ell)}};\quad t_\ell = 0, \dots, u_\ell;\ \ell = 1, \dots, L,$

as well as $P(t^L) = K(t^L)K(t^L)^\top$, with

$K(t^L) = \bigotimes_{\ell=1}^{L} K_\ell(t_\ell) \otimes \frac{1}{\sqrt{r}}\, 1^r$
and $\mathrm{rank}(P(t^L)) = \mathrm{rank}(K(t^L)) = g(t^L)$. We also put $\bar P = I_n - \sum_{t^L \in \Delta} P(t^L)$, with $\mathrm{rank}(\bar P) = g$, so we can write

(2.4) $Y^n = \sum_{t^L \in \Delta} P(t^L) Y^n + \bar P Y^n.$
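One possible way to obtain a matrix $T_s$ whose rows, stacked on $s^{-1/2}(1^s)^\top$, form an orthogonal matrix is a QR-based Helmert-type construction. This is an assumed numerical recipe, not a construction prescribed by the paper.

```python
# Sketch: build a (s-1) x s matrix T_s of orthonormal contrasts, each
# orthogonal to 1^s, via QR; this recipe is an assumption.
import numpy as np

def T_matrix(s):
    """Rows are orthonormal and orthogonal to the vector of ones."""
    M = np.column_stack([np.ones(s), np.eye(s)[:, :-1]])
    Q_full = np.linalg.qr(M)[0]            # first column is +/- 1^s/sqrt(s)
    return Q_full[:, 1:].T

Ts = T_matrix(4)
# Stacking with the normalized row of ones gives an orthogonal matrix:
Q = np.vstack([Ts, np.ones(4) / 2.0])
```

Any orthonormal basis of the orthogonal complement of $1^s$ works equally well; only the orthogonality property in the text is needed.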
Let us point out that

$M(h^L) = X(h^L)X(h^L)^\top = \bigotimes_{\ell=1}^{L} M_\ell(h_\ell) \otimes J_r,$

where $J_r = 1^r (1^r)^\top$. These matrices commute and, taking $b(h^L) = r \prod_{\ell=1}^{L} b_\ell(h_\ell)$, the matrices

$Q(h^L) = b(h^L)^{-1} M(h^L) = \sum_{t^L : t^L \le h^L} P(t^L)$

are orthogonal projection matrices, while the $P(t^L)$, $t^L \in \Delta$, and $\bar P$ are mutually orthogonal projection matrices.
Let us take

(2.5) $\tilde\eta^{g(t^L)}(t^L) = K(t^L)^\top Y^n = \eta^{g(t^L)}(t^L) + \sum_{h^L \in \Delta_{0c}(t^L)} K(t^L)^\top X(h^L)\,\tilde\beta^{c(h^L)}(h^L) + K(t^L)^\top e^n;\quad t^L \in \Delta,$

where

(2.6) $\eta^{g(t^L)}(t^L) = \sum_{h^L \in \Delta_0(t^L)} K(t^L)^\top X(h^L)\,\beta^{c(h^L)}(h^L);\quad t^L \in \Delta$

is the mean vector of $\tilde\eta^{g(t^L)}(t^L)$, $t^L \in \Delta$. From the independence of the $\tilde\beta^{c(h^L)}(h^L)$, $h^L \in \Delta_{0c}$, and of $e^n$, we also get the variance–covariance matrix of $\tilde\eta^{g(t^L)}(t^L)$,

(2.7) $\Sigma(\tilde\eta^{g(t^L)}(t^L)) = \gamma(t^L)\, I_{g(t^L)};\quad t^L \in \Delta,$

with
(2.8) $\gamma(t^L) = \sum_{h^L \in \Delta_{0c}(t^L)} b(h^L)\,\sigma^2(h^L) + \sigma^2,$

since $\Sigma(\tilde\beta^{c(h^L)}(h^L)) = \sigma^2(h^L) I_{c(h^L)}$, $h^L \in \Delta_{0c}$, $\Sigma(e^n) = \sigma^2 I_n$, and

(2.9)
$K(t^L)^\top M(h^L) K(t^L) = b(h^L)\, I_{g(t^L)};\quad t^L \le h^L$
$K(t^L)^\top M(h^L) K(t^L) = 0_{g(t^L), g(t^L)};\quad t^L \not\le h^L,$

since $K(t^L)^\top K(t^L) = I_{g(t^L)}$, $t^L \in \Delta$. If $t^L \in \Delta_{0c}$, $\Delta_0(t^L) = \emptyset$, and

(2.10) $\eta^{g(t^L)}(t^L) = 0^{g(t^L)}.$
Let us consider the family of estimable vectors

(2.11) $\Lambda(t^L) = \{\psi^s(t^L) = B(t^L)\,\eta^{g(t^L)}(t^L) : R(B(t^L)^\top) \subseteq R(K(t^L)^\top)\};\quad t^L \in \Delta_0.$

In what follows we will use the $\psi^s(t^L) \in \Lambda_h(t^L)$, the family of homoscedastic estimable vectors, $t^L \in \Delta_0$, to define the hypotheses

(2.12) $H_0(t^L; d^s): \psi^s(t^L) = d^s;\quad t^L \in \Delta_0,$

which hold if and only if $\|\psi^s(t^L) - d^s\|^2 = 0$. The vectors in $\Lambda_h(t^L)$ are characterized by $B(t^L)B(t^L)^\top = k I_s$.
3. F and selective F tests

When $\Delta_{0c}(t^L) = \emptyset$ we have, according to (2.8),

(3.1) $\gamma(t^L) = \sigma^2.$

To single out these cases we put $t^L \in \Delta_0^\emptyset$ when this happens.
If the vectors $\tilde\beta^{c(h^L)}(h^L)$, $h^L \in \Delta_{0c}$, and $e^n$ are normal and independent, we obtain, see Mexia (1995, p. 35–42):

a) $\tilde\eta^{g(t^L)}(t^L) \sim N(\eta^{g(t^L)}(t^L),\, \gamma(t^L) I_{g(t^L)})$, $t^L \in \Delta$;

b) $\bar P e^n \sim N(0^n, \sigma^2 \bar P)$;

c) the vectors $\tilde\eta^{g(t^L)}(t^L)$, $t^L \in \Delta$, and $\bar P e^n$ are independent.

Then, with $t^L \in \Delta_0^\emptyset$, we will have $\tilde\psi^s(t^L) \sim N(\psi^s(t^L), \sigma^2 I_s)$ independent from $S = \|\bar P e^n\|^2 \sim \sigma^2 \chi^2_{g,0}$.
Thus it is straightforward to obtain, for the hypotheses $H_0(t^L, d^s)$, F tests with statistics

(3.2) $\mathcal{F}(t^L) = \frac{g}{s}\, \frac{\|\tilde\psi^s(t^L) - d^s\|^2}{S},$

where $\|\tilde\psi^s(t^L) - d^s\|^2 \sim \sigma^2 \chi^2_{s, \delta(t^L, d^s)}$, with

(3.3) $\delta(t^L, d^s) = \frac{1}{\sigma^2}\,\|\psi^s(t^L) - d^s\|^2.$

We will replace these test statistics by

(3.4) $T(t^L) = \frac{\|\tilde\psi^s(t^L) - d^s\|^2}{S},$

because these statistics have "more manageable" distributions than the previous ones.
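A minimal numerical sketch of statistics (3.2) and (3.4): under $H_0(t^L, d^s)$ the statistic in (3.2) follows an $F_{s,g}$ distribution, and the statistic in (3.4) is just $s/g$ times it, so both tests use the same $F$ tail. The function and argument names here are illustrative assumptions.

```python
# Sketch of statistics (3.2) and (3.4); names are illustrative.
import numpy as np
from scipy import stats

def f_and_t_statistics(psi_hat, d, S, g):
    """psi_hat: estimate of psi^s(t^L); d: tested value d^s;
    S: residual sum of squares with g degrees of freedom."""
    s = len(psi_hat)
    num = float(np.sum((np.asarray(psi_hat) - np.asarray(d)) ** 2))
    F = (g / s) * num / S                  # statistic (3.2)
    T = num / S                            # statistic (3.4): T = (s/g) F
    # Under H0, F ~ F_{s,g}; the p-value for T uses the same distribution.
    p_value = float(stats.f.sf(T * g / s, s, g))
    return F, T, p_value
```

The rejection thresholds for $T(t^L)$ are therefore $F$-distribution quantiles rescaled by $s/g$.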
Let us consider associated models. The models obtained by taking

(3.5) $\tilde\beta^{c(h^L)}(h^L) = \beta^{c(h^L)}(h^L) + U^{c(h^L)}(h^L),\quad h^L \in \Delta_0,$

that is, by adding random vectors $Z^{g(t^L)}(t^L)$ to the vectors $\tilde\eta^{g(t^L)}(t^L)$, $t^L \in \Delta_0$, will be the associated models of the first type. Now, with $t^L \in \Delta_0^\emptyset$, we will have $\|\tilde\psi^s(t^L) - d^s\|^2 \sim \sigma^2 \chi^2_{s, V(t^L, d^s)}$, with

(3.6) $V(t^L, d^s) = \frac{1}{\sigma^2}\,\|\psi^s(t^L) - d^s + B(t^L)Z^{g(t^L)}(t^L)\|^2;\quad t^L \in \Delta_0.$

If $pr(Z^{g(t^L)}(t^L) = 0^{g(t^L)}) = 1$, $V(t^L, d^s)$ is constant (with probability one) and is null if and only if $H_0(t^L, d^s)$ holds.
We also have associated models of the second type, obtained by adding to $\bar P e^n$ a random vector $W^g$ independent of all the other random vectors considered in the model. Thus only $S$ will be disturbed, and we will have $S \sim \sigma^2 \chi^2_{g, V}$, with

(3.7) $V = \frac{1}{\sigma^2}\,\|W^g\|^2.$

We point out that $\chi^2_{g, V}$ results from randomizing the non-centrality parameter of a chi-square distribution.
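This randomization of the non-centrality parameter can be sketched directly; the helper name and interface below are assumptions for illustration.

```python
# Sketch of a draw of S ~ sigma^2 * chi^2_{g,V} with randomized
# non-centrality V = ||W^g||^2 / sigma^2, as in (3.7); names assumed.
import numpy as np

def randomized_ncx2(g, sigma2, W, rng):
    """One draw of sigma^2 * chi^2_{g,V} given the perturbation vector W."""
    V = float(np.dot(W, W)) / sigma2       # non-centrality (3.7)
    return sigma2 * rng.noncentral_chisquare(g, V)
```

With $W^g = 0^g$ almost surely this reduces to the undisturbed $\sigma^2 \chi^2_{g,0}$.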
Lastly, we have the associated models of the third type. These models are obtained by adding random vectors $Z^{g(t^L)}(t^L)$ [$W^g$] to the vectors $\tilde\eta^{g(t^L)}(t^L)$, $t^L \in \Delta_0^\emptyset$ [$\bar P e^n$]. We now have $\|\tilde\psi^s(t^L) - d^s\|^2 \sim \sigma^2 \chi^2_{s, V(t^L, d^s)}$ and $S \sim \sigma^2 \chi^2_{g, V}$. Once more, with $t^L \in \Delta_0^\emptyset$, whenever $pr(Z^{g(t^L)}(t^L) = 0^{g(t^L)}) = 1$, $V(t^L, d^s)$ will be fixed, and null if and only if $H_0(t^L, d^s)$ holds.
We now intend to derive selective F tests for certain alternatives.
To define selected alternatives to $H_0(t^L, d^s)$, Dias (1994, p. 21–24) used polar coordinates $(r, \theta_1, \dots, \theta_{s-1})$. For the $\theta_1, \dots, \theta_{s-1}$ we have the bounds

(3.8) $-\frac{\pi}{2} \le \theta_j \le \frac{\pi}{2};\quad j = 1, \dots, s-2,\qquad 0 \le \theta_{s-1} < 2\pi,$

which define the domain $D$. With $\theta^{s-1}(\psi^s - d^s)$ the vector of central angles $\theta_1, \dots, \theta_{s-1}$ associated to $\psi^s(t^L) - d^s$, when one of those alternatives holds we will have $\theta^{s-1}(\psi^s - d^s) \in C$ and $\|\psi^s(t^L) - d^s\|^2 > a$. We will represent by $\nu$ the set of vectors $\psi^s(t^L)$ associated to these alternatives.
The joint density of $T(t^L)$ and $\Theta^{s-1} = \theta^{s-1}(\psi^s - d^s)$ for normal models, see Nunes and Mexia (2003), will be

(3.9) $f(z, \theta^{s-1} \mid \rho^s, g) = \frac{e^{-\delta/2}\, h(\theta^{s-1})}{2\pi^{s/2}\,\Gamma\!\left(\frac{g}{2}\right)} \sum_{j=0}^{+\infty} 2^{j/2}\,\Gamma\!\left(\frac{g+s+j}{2}\right) \frac{a^j(\theta^{s-1})\, z^{\frac{j+s}{2}-1}}{j!\,(1+z)^{\frac{g+s+j}{2}}},\quad z > 0,\ \theta^{s-1} \in D,$

where

(3.10)
$\rho^s = \frac{1}{\sigma}\,(\psi^s(t^L) - d^s)$
$\|\ell^s(\theta^{s-1})\|^2 = 1$
$a(\theta^{s-1}) = \frac{1}{\sqrt{\sigma^2}}\,(\psi^s(t^L) - d^s)^\top \ell^s(\theta^{s-1})$
$h(\theta^{s-1}) = \cos^{s-2}\theta_1 \cdots \cos\theta_{s-2}.$
The components of $\ell^s(\theta^{s-1})$ are

(3.11)
$\ell_1(\theta^{s-1}) = \cos\theta_1 \cdots \cos\theta_{s-1}$
$\ell_2(\theta^{s-1}) = \cos\theta_1 \cdots \cos\theta_{s-2}\,\sin\theta_{s-1}$
$\vdots$
$\ell_j(\theta^{s-1}) = \cos\theta_1 \cdots \cos\theta_{s-j}\,\sin\theta_{s-j+1}$
$\vdots$
$\ell_s(\theta^{s-1}) = \sin\theta_1.$
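A small sketch evaluating the components in (3.11); the function name and the 0-based indexing of the angles are implementation choices, not from the paper.

```python
# Sketch of the components l_j(theta^{s-1}) of (3.11).
import math

def l_vec(theta):
    """theta = (theta_1, ..., theta_{s-1}); returns (l_1, ..., l_s)."""
    s = len(theta) + 1
    comps = []
    for j in range(1, s + 1):
        v = 1.0
        for i in range(s - j):             # cos(theta_1)...cos(theta_{s-j})
            v *= math.cos(theta[i])
        if j > 1:                          # sin(theta_{s-j+1}) when j >= 2
            v *= math.sin(theta[s - j])
        comps.append(v)
    return comps
```

Since these are spherical coordinates of a direction, $\sum_{j=1}^{s} \ell_j^2(\theta^{s-1}) = 1$ for every admissible $\theta^{s-1}$, matching the norm condition in (3.10).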
When $H_0(t^L, d^s)$ holds, $\rho^s = 0^s$ and the joint density will now be

(3.12) $f(z, \theta^{s-1} \mid 0^s, g) = f(z \mid s, g)\, f(\theta^{s-1}),$

where $f(z \mid s, g)$ is the density of $\frac{\chi^2_s}{\chi^2_g}$ and

$f(\theta^{s-1}) = \frac{\Gamma\!\left(\frac{s}{2}\right)}{2\pi^{s/2}}\, h(\theta^{s-1});\quad \theta^{s-1} \in D,$

so $T(t^L)$ and $\Theta^{s-1}$ will then be independent. Let us assume that we have a critical region $(k, C)$, rejecting the tested hypothesis when $T(t^L) > k$ and $\Theta^{s-1} \in C$; the test level will be
(3.13) $\mathrm{level}(k, C) = (1 - F(k \mid s, g)) \int_C \cdots \int f(\theta^{s-1}) \prod_{j=1}^{s-1} d\theta_j.$
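Expression (3.13) factors the level into an $F$-tail term and an angular probability. A hedged numerical sketch: since $f(\theta^{s-1})$ is the angular density of a uniform direction on the unit sphere, the angular integral over $C$ can be approximated by Monte Carlo; encoding $C$ as a predicate on the direction vector is an assumed interface, not the paper's.

```python
# Sketch of level(k, C) in (3.13). The first factor is P(T > k) under H0,
# with T = (s/g) * F_{s,g}; the integral of f(theta) over C is estimated
# by sampling uniform directions on the sphere. Names are assumed.
import numpy as np
from scipy import stats

def level(k, in_C, s, g, n_draws=100_000, seed=0):
    rng = np.random.default_rng(seed)
    tail = stats.f.sf(k * g / s, s, g)          # 1 - F(k|s, g)
    u = rng.standard_normal((n_draws, s))       # uniform directions
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    return tail * np.mean([in_C(x) for x in u])
```

With $C$ the whole sphere the level reduces to the usual $F$-test level, which is how different pairs $(k, C)$ can trade angular selectivity against the threshold $k$.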
From this expression we can conclude that, for a given test level, there exist several pairs $(k, C)$. To choose a convenient pair, Dias (1994, p. 53) introduced the max-min tests. These tests maximize the minimum power for selected alternatives. The selected alternatives for which this minimum is attained will be the critical alternatives. It is possible to have more than one of these alternatives; Dias (1994, p. 53) describes a case with $s = 2$ in which there are two critical alternatives.
Now we go over to associated models of the first type. The selected alternatives will now be those such that $pr((\psi^s(t^L) + B(t^L)Z^{g(t^L)}(t^L)) \in \nu) = 1$.
Now we establish

Proposition 1. If the test is max-min for the normal model, it is also max-min for the associated models of the first type, with the same minimum power for selected alternatives.
Proof. With $pot(\psi^s)$ the power function for the normal model and $G(v^s)$ the distribution of $\tilde\psi^s(t^L) + B(t^L)Z^{g(t^L)}(t^L)$, the power for the associated model will be

(3.14) $Pot(G) = \int_\nu \cdots \int pot(v^s)\, dG(v^s).$

Let us now point out that, if $H_1: \psi^s(t^L) = \psi^s_c(t^L)$, $t^L \in \Delta_0^\emptyset$, is a critical alternative for the normal model, we will have $pot(\psi^s_c) = \min\{pot(\psi^s) : \psi^s(t^L) \in \nu\}$. Since $pr((\tilde\psi^s(t^L) + B(t^L)Z^{g(t^L)}(t^L)) \in \nu) = 1$, we have $Pot(G) \ge pot(\psi^s_c)$. This minimum is attained for the "degenerate" selected alternatives $\tilde\psi^s(t^L) + B(t^L)Z^{g(t^L)}(t^L) = \psi^s_c(t^L)$, so the proof is complete.
Thus we can extend directly the max-min tests constructed for normal models to the corresponding associated models of the first type.
When the infimum of the power for selected alternatives exceeds the test level, the test will be selectively unbiased. It suffices that the power for the critical alternatives, if they exist, exceeds the test level for the test to be selectively unbiased. It is easy to establish the