
GENERALIZED F TESTS IN MODELS WITH RANDOM PERTURBATIONS: THE GAMMA CASE

Célia Maria Pinto Nunes
Department of Mathematics, University of Beira Interior
6200 Covilhã, Portugal
e-mail: celia@mat.ubi.pt

Sandra Maria Bargão Saraiva Ferreira
Department of Mathematics, University of Beira Interior
6200 Covilhã, Portugal

and

Dário Jorge da Conceição Ferreira
Department of Mathematics, University of Beira Interior
6200 Covilhã, Portugal

Abstract

Generalized F tests were introduced for linear models by Michalski and Zmyślony (1996, 1999). When the observations are taken under not perfectly standardized conditions, the F tests have generalized F distributions with random non-centrality parameters, see Nunes and Mexia (2006). We now study the case of nearly normal perturbations, which leads to Gamma distributed non-centrality parameters.

Keywords: generalized F distributions; random non-centrality parameters; Gamma distribution.

2000 Mathematics Subject Classification: 62J12, 62H10, 62J99.


1. Introduction

The statistics of the generalized F tests are quotients of linear combinations of independent chi-squares. These tests were introduced by Michalski and Zmyślony (1996, 1999), first for variance components and later for linear combinations of parameters in mixed linear models.

These tests are derived when we have a quadratic unbiased estimator $\tilde{\theta}$ for a parameter $\theta$ and we want to test
$$H_0 : \theta = 0 \quad \text{against} \quad H_1 : \theta > 0.$$
If $\tilde{\theta}^+$ and $\tilde{\theta}^-$ are, respectively, the positive and the negative parts of $\tilde{\theta}$, when $H_0$ [$H_1$] holds we have $E(\tilde{\theta}^+) = E(\tilde{\theta}^-)$ [$E(\tilde{\theta}^+) > E(\tilde{\theta}^-)$]. Thus, we are led to use the test statistic
$$\mathfrak{F} = \frac{\tilde{\theta}^+}{\tilde{\theta}^-}.$$

The following example shows the importance of these tests. In a balanced variance components model in which a first factor crosses with a second that nests a third, the estimator of the variance component associated with the second factor is not the difference between two ANOVA mean squares, see Khuri et al. (1998). Thus, a usual F test cannot be derived for the nullity of this variance component. This problem is solved using generalized F tests. A solution for this case, with a practical application of interest, may be found in Fonseca et al. (2003b).

An exact expression for the distribution of quotients of linear combinations of independent central chi-squares was obtained in Fonseca et al. (2002), when the chi-squares, in the numerator or in the denominator, have even degrees of freedom and all coefficients are non-negative. This result was extended to the non-central case in Nunes and Mexia (2006). In carrying out this extension, the mixtures method of Robbins (1948) and Robbins and Pitman (1949) was used, with fixed non-centrality parameters.

When the vector of observations is the sum of a vector corresponding to the theoretical model plus an independent perturbation vector, the distribution of the generalized F statistics has random non-centrality parameters, see Nunes and Mexia (2006). This kind of model perturbation is worth studying since it covers situations in which the observations were collected under non-standardized conditions. If we assume that the fluctuations in the observation conditions are approximately normal, the non-centrality parameters will tend to be Gamma distributed. Hence we decided to study this case.

Our aim is essentially theoretical. If practical applications are the main goal, an alternative to our treatment is given by Imhof (1961); the algorithm presented by Davies (1980) may also be used. In this way, previous approaches, such as the ones given by Satterthwaite (1946) and Gaylor and Hopper (1969), may be improved.

This article is organized in the following way. In Section 2 the central generalized F distributions and some particular cases are presented. Section 3 presents the non-central case of these distributions and is divided into three subsections. Subsection 3.1 is devoted to the case of random non-centrality parameters. The expressions of the distributions whose non-centrality parameters have Gamma distribution are obtained, for the non-generalized case, in Subsection 3.2. Finally, Subsection 3.3 gives the results for the generalized case.

2. Generalized F and related distributions

Let $a_1^r$ and $a_2^s$ be vectors with non-negative components, at least one of them non-null. Consider also the independent random variables $U_i \sim \chi^2_{g_{1,i}}$, $i = 1, \ldots, r$, and $V_j \sim \chi^2_{g_{2,j}}$, $j = 1, \ldots, s$. The distribution of
$$\frac{\displaystyle\sum_{i=1}^{r} a_{1,i} U_i}{\displaystyle\sum_{j=1}^{s} a_{2,j} V_j}$$
will be denoted $F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s)$.
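As a purely numerical illustration (not part of the original development), the quotient defining $F^+$ is straightforward to simulate; the function name, parameter values and sample size below are arbitrary choices made only for this sketch.

```python
import numpy as np

def gen_F_plus_cdf(z, a1, a2, g1, g2, n_samples=200_000, seed=1):
    """Monte Carlo estimate of F+(z | a1, a2, g1, g2): the CDF of
    sum_i a1[i]*chi2(g1[i]) / sum_j a2[j]*chi2(g2[j])."""
    rng = np.random.default_rng(seed)
    num = sum(a * rng.chisquare(g, n_samples) for a, g in zip(a1, g1))
    den = sum(a * rng.chisquare(g, n_samples) for a, g in zip(a2, g2))
    return np.mean(num / den <= z)

# For r = s = 1 with unit coefficients, the quotient of two independent
# chi-squares with 2 degrees of freedom has exact CDF z / (1 + z).
est = gen_F_plus_cdf(1.0, [1.0], [1.0], [2], [2])
```

With $2 \times 10^5$ samples the estimate at $z = 1$ should be close to the exact value $1/2$.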

Let us consider some particular cases of these distributions. With $(v^m)^{-1}$ the vector whose components are the inverses of the components of $v^m$, the central generalized F distribution will be
$$F(z \mid g_1^r, g_2^s) = F^+\big(z \mid (g_1^r)^{-1}, (g_2^s)^{-1}, g_1^r, g_2^s\big).$$


Another interesting case of $F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s)$ will be
$$\bar{F}(z \mid g_1^r, g_2^s) = F^+(z \mid 1^r, 1^s, g_1^r, g_2^s).$$

If $r = s = 1$, in the first case one will have the usual central F distribution with $g_1$ and $g_2$ degrees of freedom, $F(z \mid g_1, g_2)$, while in the second case one will have the $\bar{F}$ distribution, defined for the quotient of independent central chi-squares with $g_1$ and $g_2$ degrees of freedom, $\bar{F}(z \mid g_1, g_2)$.

In Fonseca et al. (2002) the exact expressions of $F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s)$ are given when the degrees of freedom in the numerator or in the denominator are even. Moreover, the second case reduces to the first one, since
$$F^+(z \mid a_1^r, a_2^s, g_1^r, 2m^s) = 1 - F^+(z^{-1} \mid a_2^s, a_1^r, 2m^s, g_1^r).$$

An example showing how these expressions may be used to check the precision of Monte Carlo methods in tabling such distributions may be seen in Fonseca et al. (2002).

3. Non-central generalized F distributions

The exact expression of
$$F^+(z \mid 1, a_2^s, g_1, g_2^s, \delta) = e^{-\delta/2} \sum_{\ell=0}^{+\infty} \frac{(\delta/2)^{\ell}}{\ell!}\, F^+(z \mid 1, a_2^s, g_1 + 2\ell, g_2^s),$$
which is the distribution of
$$\frac{\chi^2_{g_1,\delta}}{\displaystyle\sum_{i=2}^{s+1} a_i \chi^2_{g_i}},$$
was obtained in Nunes and Mexia (2006) when $g_1$ is even.

The distributions $\chi^2_{g,\delta}$ are mixtures of the distributions $\chi^2_{g+2j}$, $j = 0, 1, \ldots$ The coefficients in this mixture are the probabilities of the non-negative integers under the Poisson distribution with parameter $\delta/2$, $P_{\delta/2}$. Thus, if $U \sim \chi^2_{g,\delta}$, it can be assumed that there is an indicator variable $J \sim P_{\delta/2}$ such that $U \sim \chi^2_{g+2\ell}$ when $J = \ell$, $\ell = 0, 1, \ldots$
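This mixture representation can be checked by a short simulation, comparing the direct construction of a non-central chi-square with the Poisson-mixture construction; the values of $g$, $\delta$ and the sample size below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
g, delta, n = 4, 3.0, 200_000

# Direct construction: ||N(mu, I_g)||^2 with ||mu||^2 = delta.
mu = np.zeros(g)
mu[0] = np.sqrt(delta)
direct = ((rng.standard_normal((n, g)) + mu) ** 2).sum(axis=1)

# Mixture construction: draw J ~ Poisson(delta/2), then a central
# chi-square with g + 2J degrees of freedom.
J = rng.poisson(delta / 2, n)
mixture = rng.chisquare(g + 2 * J)

# Both samples should share the mean g + delta = 7.
m1, m2 = direct.mean(), mixture.mean()
```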


If the $U_i \sim \chi^2_{g_{1,i},\delta_{1,i}}$, $i = 1, \ldots, r$, and $V_j \sim \chi^2_{g_{2,j},\delta_{2,j}}$, $j = 1, \ldots, s$, are independent, their joint distribution
$$\chi^2_{g_1^r, g_2^s, \delta_1^r, \delta_2^s} = \prod_{i=1}^{r} \chi^2_{g_{1,i},\delta_{1,i}} \prod_{j=1}^{s} \chi^2_{g_{2,j},\delta_{2,j}}$$
will be a mixture, with coefficients
$$(3.1)\qquad c(\ell_1^r, \ell_2^s, \delta_1^r, \delta_2^s) = \prod_{i=1}^{r} e^{-\delta_{1,i}/2}\, \frac{(\delta_{1,i}/2)^{\ell_{1,i}}}{\ell_{1,i}!} \prod_{j=1}^{s} e^{-\delta_{2,j}/2}\, \frac{(\delta_{2,j}/2)^{\ell_{2,j}}}{\ell_{2,j}!},$$
of the
$$\chi^2_{g_1^r + 2\ell_1^r,\, g_2^s + 2\ell_2^s} = \prod_{i=1}^{r} \chi^2_{g_{1,i}+2\ell_{1,i}} \prod_{j=1}^{s} \chi^2_{g_{2,j}+2\ell_{2,j}}.$$

Using the mixtures method, see Robbins (1948) and Robbins and Pitman (1949), the distribution of
$$Z = \frac{\displaystyle\sum_{i=1}^{r} a_{1,i} U_i}{\displaystyle\sum_{j=1}^{s} a_{2,j} V_j}$$
will be
$$(3.2)\qquad F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s, \delta_1^r, \delta_2^s) = \sum_{\ell_{1,1}=0}^{+\infty} \cdots \sum_{\ell_{1,r}=0}^{+\infty} \sum_{\ell_{2,1}=0}^{+\infty} \cdots \sum_{\ell_{2,s}=0}^{+\infty} c(\ell_1^r, \ell_2^s, \delta_1^r, \delta_2^s)\, F^+(z \mid a_1^r, a_2^s, g_1^r + 2\ell_1^r, g_2^s + 2\ell_2^s).$$

Likewise, if indicator variables are considered, the conditional distribution of $Z$ when $J_{1,i} = \ell_{1,i}$, $i = 1, \ldots, r$, and $J_{2,j} = \ell_{2,j}$, $j = 1, \ldots, s$, will be $F^+(z \mid a_1^r, a_2^s, g_1^r + 2\ell_1^r, g_2^s + 2\ell_2^s)$. Thus, the expression of $F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s, \delta_1^r, \delta_2^s)$ can be obtained by deconditioning with respect to the indicator variables.


Let us now consider monotonicity properties of these distributions. With $\delta_{1,p}$ the $p$-th component of $\delta_1^r$, there will be
$$(3.3)\qquad \frac{\partial F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s, \delta_1^r, \delta_2^s)}{\partial \delta_{1,p}} = \frac{F^+(z \mid a_1^r, a_2^s, g_1^r + 2q_p^r, g_2^s, \delta_1^r, \delta_2^s) - F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s, \delta_1^r, \delta_2^s)}{2} < 0,$$

as well as
$$(3.4)\qquad \frac{\partial F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s, \delta_1^r, \delta_2^s)}{\partial \delta_{2,h}} = \frac{F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s + 2q_h^s, \delta_1^r, \delta_2^s) - F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s, \delta_1^r, \delta_2^s)}{2} > 0,$$
where $q_p^r$ has all components null, except the $p$-th, which is equal to 1.

The non-generalized case will be used to justify (3.3) and (3.4). With the independent chi-squares $\chi^2_2$, $\chi^2_{m,\delta}$ and $\chi^2_{n,\delta'}$, there will be
$$(3.5)\qquad pr\left( \frac{\chi^2_{m,\delta}}{\chi^2_{n,\delta'} + \chi^2_2} < \frac{\chi^2_{m,\delta}}{\chi^2_{n,\delta'}} < \frac{\chi^2_{m,\delta} + \chi^2_2}{\chi^2_{n,\delta'}} \right) = 1,$$
so
$$(3.6)\qquad \bar{F}(z \mid m+2, n, \delta, \delta') < \bar{F}(z \mid m, n, \delta, \delta') < \bar{F}(z \mid m, n+2, \delta, \delta'),$$

with
$$\begin{cases} \dfrac{\chi^2_{m,\delta} + \chi^2_2}{\chi^2_{n,\delta'}} \sim \bar{F}(z \mid m+2, n, \delta, \delta') \\[2ex] \dfrac{\chi^2_{m,\delta}}{\chi^2_{n,\delta'}} \sim \bar{F}(z \mid m, n, \delta, \delta') \\[2ex] \dfrac{\chi^2_{m,\delta}}{\chi^2_{n,\delta'} + \chi^2_2} \sim \bar{F}(z \mid m, n+2, \delta, \delta'). \end{cases}$$
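The almost sure ordering (3.5), and hence the CDF ordering (3.6), can be illustrated numerically by reusing the same chi-square draws in the three quotients, so that the sample inequalities hold path by path; all parameter values below are arbitrary choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n_df, delta, delta0, z, n = 4, 4, 1.0, 1.0, 1.0, 100_000

chi_m = rng.noncentral_chisquare(m, delta, n)
chi_n = rng.noncentral_chisquare(n_df, delta0, n)
chi_2 = rng.chisquare(2, n)

# Empirical CDFs at z of the three quotients in (3.5); since the same
# draws are shared, the empirical CDFs inherit the ordering of (3.6).
cdf_m2_n = np.mean((chi_m + chi_2) / chi_n <= z)   # estimates F(z|m+2, n, ...)
cdf_m_n  = np.mean(chi_m / chi_n <= z)             # estimates F(z|m, n, ...)
cdf_m_n2 = np.mean(chi_m / (chi_n + chi_2) <= z)   # estimates F(z|m, n+2, ...)
```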


3.1. Random non-centrality parameters

So far we have considered the indicator variables $J_{1,i}$, $i = 1, \ldots, r$, and $J_{2,j}$, $j = 1, \ldots, s$, to have Poisson distributions with fixed parameters. Let us now assume these parameters to be random variables.

Remark. To understand the "appearance" of randomized non-centrality parameters, we point out that if the error vector $e^n$ has normal distribution with null mean vector and variance-covariance matrix $\sigma^2 I_n$, $e^n \sim N(0^n, \sigma^2 I_n)$, with $I_n$ the $n \times n$ identity matrix, one will have $\|e^n\|^2 \sim \sigma^2 \chi^2_n$. With $\mu^n$ the mean vector of the observations vector, $\|e^n + \mu^n\|^2 \sim \sigma^2 \chi^2_{n,\delta}$, with non-centrality parameter $\delta = \frac{1}{\sigma^2}\|\mu^n\|^2$. Let us consider a random perturbation vector of the model, $W^n$, independent of $e^n$. The conditional distribution of $\|e^n + W^n\|^2$, given $W^n = w^n$, will be $\sigma^2 \chi^2_{n,\delta(w)}$, with $\delta(w) = \frac{1}{\sigma^2}\|w^n\|^2$. Then, deconditioning with respect to $W^n$, we obtain a chi-square with $n$ degrees of freedom and a random non-centrality parameter. In mixed models, see for example Khuri et al. (1998), Fonseca et al. (2003a) and Nunes et al. (2006), the F and generalized F test statistics are quotients of squares of norms of vectors, or of linear combinations of such squares. These squares may have random non-centrality parameters when a random perturbation vector $W^n$ occurs in their expression.
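A small simulation illustrates the deconditioning described in the remark; the values of $\sigma$, $\tau$ and the dimension are arbitrary. When the perturbation $W^n$ is itself normal, $\delta(W) = \|W^n\|^2/\sigma^2$ is $(\tau^2/\sigma^2)$ times a chi-square, hence Gamma distributed, and the deconditioned variable reduces to $(\sigma^2+\tau^2)\chi^2_n$, which gives a direct construction to compare against.

```python
import numpy as np

rng = np.random.default_rng(3)
n_dim, sigma, tau, n = 5, 1.0, 0.5, 200_000

# Two-stage construction: draw the perturbation W, then, conditionally,
# sigma^2 times a non-central chi-square with delta(w) = ||w||^2 / sigma^2.
W = tau * rng.standard_normal((n, n_dim))
delta_w = (W ** 2).sum(axis=1) / sigma**2          # Gamma distributed
two_stage = sigma**2 * rng.noncentral_chisquare(n_dim, delta_w)

# Direct construction: e + W ~ N(0, (sigma^2 + tau^2) I), so
# ||e + W||^2 ~ (sigma^2 + tau^2) * chi2(n_dim).
direct = (sigma**2 + tau**2) * rng.chisquare(n_dim, n)

# Both samples should share the mean n_dim * (sigma^2 + tau^2) = 6.25.
m1, m2 = two_stage.mean(), direct.mean()
```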

Let us now consider the random variables $L_{1,i}$, $i = 1, \ldots, r$, and $L_{2,j}$, $j = 1, \ldots, s$, with $\lambda_{L_1^r, L_2^s}(t_1^r, t_2^s)$ the joint moment generating function of these variables, and
$$(3.7)\qquad \lambda^{<\ell_1^r, \ell_2^s>}_{L_1^r, L_2^s}(t_1^r, t_2^s) = \frac{\partial^{\sum_{i=1}^{r}\ell_{1,i} + \sum_{j=1}^{s}\ell_{2,j}}\, \lambda_{L_1^r, L_2^s}(t_1^r, t_2^s)}{\prod_{i=1}^{r} \partial t_{1,i}^{\ell_{1,i}} \prod_{j=1}^{s} \partial t_{2,j}^{\ell_{2,j}}}.$$

Deconditioning
$$(3.8)\qquad F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s, l_1^r, l_2^s) = \sum_{\ell_{1,1}=0}^{+\infty} \cdots \sum_{\ell_{1,r}=0}^{+\infty} \sum_{\ell_{2,1}=0}^{+\infty} \cdots \sum_{\ell_{2,s}=0}^{+\infty} c(\ell_1^r, \ell_2^s, l_1^r, l_2^s)\, F^+(z \mid a_1^r, a_2^s, g_1^r + 2\ell_1^r, g_2^s + 2\ell_2^s)$$
with respect to the random parameter vectors $L_1^r$ and $L_2^s$, we will have

$$(3.9)\qquad F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s, \lambda_{L_1^r, L_2^s}) = \sum_{\ell_{1,1}=0}^{+\infty} \cdots \sum_{\ell_{1,r}=0}^{+\infty} \sum_{\ell_{2,1}=0}^{+\infty} \cdots \sum_{\ell_{2,s}=0}^{+\infty} \frac{\lambda^{<\ell_1^r, \ell_2^s>}_{L_1^r, L_2^s}\left(-\frac{1}{2}1^r, -\frac{1}{2}1^s\right)}{\prod_{i=1}^{r} \ell_{1,i}!\, 2^{\ell_{1,i}} \prod_{j=1}^{s} \ell_{2,j}!\, 2^{\ell_{2,j}}}\, F^+(z \mid a_1^r, a_2^s, g_1^r + 2\ell_1^r, g_2^s + 2\ell_2^s).$$

With $q_i^r$ [$q_j^s$] the vector with all $r$ [$s$] components null, except the $i$-th [$j$-th], which is 1, all components of $L_i^r = (1^r - q_i^r) \circ L_1^r$ [$L_j^s = (1^s - q_j^s) \circ L_2^s$], where $\circ$ denotes the component-wise product, will be equal to the ones of $L_1^r$ [$L_2^s$], with the exception of the $i$-th [$j$-th], which is null.

From (3.3) and (3.4) it is easy to obtain
$$(3.10)\qquad \begin{cases} F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s, \lambda_{L_i^r, L_2^s}) > F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s, \lambda_{L_1^r, L_2^s}), & i = 1, \ldots, r \\ F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s, \lambda_{L_1^r, L_2^s}) > F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s, \lambda_{L_1^r, L_j^s}), & j = 1, \ldots, s. \end{cases}$$

So, when one of the components of $L_2^s$ [$L_1^r$] is null, with probability 1, the values of $F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s, \lambda_{L_1^r, L_2^s})$ decrease [increase].

3.2. F distribution with Gamma distributed non-centrality parameters

As was previously seen, if $a_1^r = 1^r$ and $a_2^s = 1^s$, with $r = s = 1$ one will have the $\bar{F}$ distribution, defined for the quotient of independent chi-squares with $g_1$ and $g_2$ degrees of freedom. So, (3.9) can be rewritten as

$$(3.11)\qquad \bar{F}(z \mid g_1, g_2, \lambda_{L_1, L_2}) = \sum_{i=0}^{+\infty} \sum_{j=0}^{+\infty} \frac{\lambda^{<i,j>}_{L_1, L_2}\left(-\frac{1}{2}, -\frac{1}{2}\right)}{2^{i+j}\, i!\, j!}\, \bar{F}(z \mid g_1 + 2i, g_2 + 2j).$$


Consider now $L_1$ with Gamma distribution with parameters $n_1$ and $\alpha_1$, $L_1 \sim G(n_1, \alpha_1)$,
$$\lambda_{L_1}(t_1) = \left(\frac{\alpha_1}{\alpha_1 - t_1}\right)^{n_1}, \qquad t_1 < \alpha_1,$$
and consequently
$$(3.12)\qquad \lambda^{<i>}_{L_1}(t_1) = \frac{(n_1 + i - 1)!\, \alpha_1^{n_1}}{(n_1 - 1)!\, (\alpha_1 - t_1)^{n_1 + i}}.$$

If $L_1$ is independent of $L_2$, with $L_2 \sim G(n_2, \alpha_2)$, one will have
$$(3.13)\qquad \lambda^{<i,j>}_{L_1, L_2}(t_1, t_2) = \lambda^{<i>}_{L_1}(t_1)\, \lambda^{<j>}_{L_2}(t_2) = \frac{(n_1 + i - 1)!\, \alpha_1^{n_1}}{(n_1 - 1)!\, (\alpha_1 - t_1)^{n_1 + i}} \cdot \frac{(n_2 + j - 1)!\, \alpha_2^{n_2}}{(n_2 - 1)!\, (\alpha_2 - t_2)^{n_2 + j}},$$

and (3.11) will be
$$(3.14)\qquad \bar{F}(z \mid g_1, g_2, \lambda_{L_1, L_2}) = \sum_{i=0}^{+\infty} \sum_{j=0}^{+\infty} \binom{n_1 + i - 1}{i} \binom{n_2 + j - 1}{j} \frac{\alpha_1^{n_1}\, \alpha_2^{n_2}}{2^{i+j}\, \left(\alpha_1 + \frac{1}{2}\right)^{n_1 + i} \left(\alpha_2 + \frac{1}{2}\right)^{n_2 + j}}\, \bar{F}(z \mid g_1 + 2i, g_2 + 2j).$$
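The coefficients in (3.14) are negative binomial probabilities, with success probability $p = \alpha_1/(\alpha_1 + 1/2)$ for the first index, so, as a mixture requires, they sum to one. A quick numerical check of this fact (the helper `weight` and the integer shape parameters $n_1$, $n_2$ and scales $\alpha_1$, $\alpha_2$ are arbitrary choices for this sketch):

```python
from math import comb

def weight(i, j, n1, n2, a1, a2):
    """Coefficient of F(z | g1 + 2i, g2 + 2j) in the double series (3.14)."""
    return (comb(n1 + i - 1, i) * comb(n2 + j - 1, j)
            * a1**n1 * a2**n2
            / (2**(i + j) * (a1 + 0.5)**(n1 + i) * (a2 + 0.5)**(n2 + j)))

n1, n2, a1, a2 = 3, 2, 1.0, 2.0

# Truncating each index at 200 leaves a negligible tail, since each
# marginal is a negative binomial probability mass function.
total = sum(weight(i, j, n1, n2, a1, a2)
            for i in range(200) for j in range(200))
```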


Let us consider a particular case of the Gamma distribution. If $L_1 \sim \chi^2_{n_1}$ and $L_2 \sim \chi^2_{n_2}$, then $L_1 \sim G\left(\frac{n_1}{2}, \frac{1}{2}\right)$ and $L_2 \sim G\left(\frac{n_2}{2}, \frac{1}{2}\right)$, so

$$(3.15)\qquad \lambda^{<i,j>}_{L_1, L_2}(t_1, t_2) = \frac{\left(\frac{n_1}{2} + i - 1\right)! \left(\frac{n_2}{2} + j - 1\right)! \left(\frac{1}{2}\right)^{\frac{n_1+n_2}{2}}}{\left(\frac{n_1}{2} - 1\right)! \left(\frac{n_2}{2} - 1\right)! \left(\frac{1}{2} - t_1\right)^{\frac{n_1}{2} + i} \left(\frac{1}{2} - t_2\right)^{\frac{n_2}{2} + j}}$$

and
$$(3.16)\qquad \bar{F}(z \mid g_1, g_2, \lambda_{L_1, L_2}) = \sum_{i=0}^{+\infty} \sum_{j=0}^{+\infty} \frac{\binom{\frac{n_1}{2} + i - 1}{i} \binom{\frac{n_2}{2} + j - 1}{j}}{2^{\frac{n_1}{2} + i + \frac{n_2}{2} + j}}\, \bar{F}(z \mid g_1 + 2i, g_2 + 2j),$$
if $L_1$ and $L_2$ are independent.
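The chi-square case can also be checked by simulation: if $L \sim \chi^2_{n}$ and, given $L$, the indicator $J \sim P_{L/2}$, the weights of (3.16) say that, marginally over one index, $P(J = i) = \binom{n/2+i-1}{i}\, 2^{-(n/2+i)}$; in particular $P(J = 0) = 2^{-n/2}$. The value of $n$ and the sample size below are arbitrary.

```python
import numpy as np
from math import comb

rng = np.random.default_rng(4)
n_df, n_samples = 4, 200_000

# Two-stage draw: L ~ chi2(n_df), then J ~ Poisson(L / 2).
L = rng.chisquare(n_df, n_samples)
J = rng.poisson(L / 2)

# Predicted weights from (3.16): a negative binomial with n_df/2
# successes and success probability 1/2.
half = n_df // 2
pred = [comb(half + i - 1, i) / 2 ** (half + i) for i in range(4)]
emp = [np.mean(J == i) for i in range(4)]
```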

3.3. Generalized F distribution with Gamma distributed non-centrality parameters

Consider the generalized case and the independent random variables $L_1^r \sim G(n_1^r, \alpha_1^r)$, with $n_{1,1}, \ldots, n_{1,r}$ [$\alpha_{1,1}, \ldots, \alpha_{1,r}$] the components of $n_1^r$ [$\alpha_1^r$], and $L_2^s \sim G(n_2^s, \alpha_2^s)$, with $n_{2,1}, \ldots, n_{2,s}$ [$\alpha_{2,1}, \ldots, \alpha_{2,s}$] the components of $n_2^s$ [$\alpha_2^s$]. Then
$$\lambda_{L_1^r}(t_1^r) = \prod_{i=1}^{r} \lambda_{L_{1,i}}(t_i) = \prod_{i=1}^{r} \left(\frac{\alpha_{1,i}}{\alpha_{1,i} - t_i}\right)^{n_{1,i}}, \qquad t_i < \alpha_{1,i}, \quad i = 1, \ldots, r.$$

Consequently,
$$(3.17)\qquad \lambda^{<\ell_1^r>}_{L_1^r}(t_1^r) = \prod_{i=1}^{r} \frac{\alpha_{1,i}^{n_{1,i}}\, (n_{1,i} + \ell_{1,i} - 1)!}{(n_{1,i} - 1)!\, (\alpha_{1,i} - t_i)^{n_{1,i} + \ell_{1,i}}}$$


and
$$(3.18)\qquad \lambda^{<\ell_1^r, \ell_2^s>}_{L_1^r, L_2^s}(t_1^r, t_2^s) = \lambda^{<\ell_1^r>}_{L_1^r}(t_1^r)\, \lambda^{<\ell_2^s>}_{L_2^s}(t_2^s) = \prod_{i=1}^{r} \frac{\alpha_{1,i}^{n_{1,i}}\, (n_{1,i} + \ell_{1,i} - 1)!}{(n_{1,i} - 1)!\, (\alpha_{1,i} - t_i)^{n_{1,i} + \ell_{1,i}}} \prod_{j=1}^{s} \frac{\alpha_{2,j}^{n_{2,j}}\, (n_{2,j} + \ell_{2,j} - 1)!}{(n_{2,j} - 1)!\, (\alpha_{2,j} - t_j)^{n_{2,j} + \ell_{2,j}}}.$$

In this way, (3.9) can be rewritten as
$$(3.19)\qquad F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s, \lambda_{L_1^r, L_2^s}) = \sum_{\ell_{1,1}=0}^{+\infty} \cdots \sum_{\ell_{1,r}=0}^{+\infty} \sum_{\ell_{2,1}=0}^{+\infty} \cdots \sum_{\ell_{2,s}=0}^{+\infty} \frac{\prod_{i=1}^{r} \binom{n_{1,i} + \ell_{1,i} - 1}{\ell_{1,i}} \alpha_{1,i}^{n_{1,i}} \prod_{j=1}^{s} \binom{n_{2,j} + \ell_{2,j} - 1}{\ell_{2,j}} \alpha_{2,j}^{n_{2,j}}}{\prod_{i=1}^{r} 2^{\ell_{1,i}} \left(\alpha_{1,i} + \frac{1}{2}\right)^{n_{1,i} + \ell_{1,i}} \prod_{j=1}^{s} 2^{\ell_{2,j}} \left(\alpha_{2,j} + \frac{1}{2}\right)^{n_{2,j} + \ell_{2,j}}}\, F^+(z \mid a_1^r, a_2^s, g_1^r + 2\ell_1^r, g_2^s + 2\ell_2^s).$$

Let us now consider the particular case of the chi-square distribution. If the independent variables $L_1^r \sim \chi^2_{n_1^r}$ and $L_2^s \sim \chi^2_{n_2^s}$, then $L_1^r \sim G\left(\frac{1}{2}n_1^r, \frac{1}{2}1^r\right)$ and $L_2^s \sim G\left(\frac{1}{2}n_2^s, \frac{1}{2}1^s\right)$, and there will be


$$(3.20)\qquad \lambda^{<\ell_1^r, \ell_2^s>}_{L_1^r, L_2^s}(t_1^r, t_2^s) = \prod_{i=1}^{r} \frac{\left(\frac{1}{2}\right)^{\frac{n_{1,i}}{2}} \left(\frac{n_{1,i}}{2} + \ell_{1,i} - 1\right)!}{\left(\frac{n_{1,i}}{2} - 1\right)! \left(\frac{1}{2} - t_i\right)^{\frac{n_{1,i}}{2} + \ell_{1,i}}} \prod_{j=1}^{s} \frac{\left(\frac{1}{2}\right)^{\frac{n_{2,j}}{2}} \left(\frac{n_{2,j}}{2} + \ell_{2,j} - 1\right)!}{\left(\frac{n_{2,j}}{2} - 1\right)! \left(\frac{1}{2} - t_j\right)^{\frac{n_{2,j}}{2} + \ell_{2,j}}},$$

and
$$(3.21)\qquad F^+(z \mid a_1^r, a_2^s, g_1^r, g_2^s, \lambda_{L_1^r, L_2^s}) = \sum_{\ell_{1,1}=0}^{+\infty} \cdots \sum_{\ell_{1,r}=0}^{+\infty} \sum_{\ell_{2,1}=0}^{+\infty} \cdots \sum_{\ell_{2,s}=0}^{+\infty} \frac{\prod_{i=1}^{r} \binom{\frac{n_{1,i}}{2} + \ell_{1,i} - 1}{\ell_{1,i}} \prod_{j=1}^{s} \binom{\frac{n_{2,j}}{2} + \ell_{2,j} - 1}{\ell_{2,j}}}{\prod_{i=1}^{r} 2^{\frac{n_{1,i}}{2} + \ell_{1,i}} \prod_{j=1}^{s} 2^{\frac{n_{2,j}}{2} + \ell_{2,j}}}\, F^+(z \mid a_1^r, a_2^s, g_1^r + 2\ell_1^r, g_2^s + 2\ell_2^s).$$
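As an end-to-end numerical illustration of this chi-square case (all parameter values below are arbitrary, and Monte Carlo stands in for an exact evaluation), the distribution can be simulated in two equivalent ways: by first drawing the random non-centrality parameters and then the non-central chi-squares, or via the mixture representation behind (3.21), drawing negative binomial indicators and then central chi-squares. Agreement of the two estimates is an empirical check of the mixture representation.

```python
import numpy as np

rng = np.random.default_rng(5)
n_samples, z = 200_000, 1.0
a1, a2 = np.array([1.0, 0.5]), np.array([1.0])   # r = 2, s = 1
g1, g2 = np.array([2, 4]), np.array([6])
nu1, nu2 = np.array([2, 4]), np.array([2])       # L ~ chi2(nu)

def ratio(num_parts, den_parts):
    return sum(num_parts) / sum(den_parts)

# Way 1: draw the random non-centrality parameters, then the
# non-central chi-squares (the setting of Section 3.3).
U = [a * rng.noncentral_chisquare(g, rng.chisquare(nu, n_samples))
     for a, g, nu in zip(a1, g1, nu1)]
V = [a * rng.noncentral_chisquare(g, rng.chisquare(nu, n_samples))
     for a, g, nu in zip(a2, g2, nu2)]
est1 = np.mean(ratio(U, V) <= z)

# Way 2: each non-central chi-square with chi-square non-centrality is
# a negative binomial mixture of central chi-squares (nu/2 successes,
# success probability 1/2), as in (3.21).
def mixed_chi2(g, nu):
    J = rng.negative_binomial(nu // 2, 0.5, n_samples)
    return rng.chisquare(g + 2 * J)

U2 = [a * mixed_chi2(g, nu) for a, g, nu in zip(a1, g1, nu1)]
V2 = [a * mixed_chi2(g, nu) for a, g, nu in zip(a2, g2, nu2)]
est2 = np.mean(ratio(U2, V2) <= z)
```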

Acknowledgements

The authors are grateful to Professor João Tiago Mexia for his permanent availability and his continuous support and encouragement.

References

[1] R.B. Davies, Algorithm AS 155: The distribution of a linear combination of χ² random variables, Applied Statistics 29 (1980), 323–333.

[2] J.P. Imhof, Computing the distribution of quadratic forms in normal variables, Biometrika 48 (1961), 419–426.

[3] M. Fonseca, J.T. Mexia and R. Zmyślony, Exact distribution for the generalized F tests, Discuss. Math. Probab. Stat. 22 (2002), 37–51.

[4] M. Fonseca, J.T. Mexia and R. Zmyślony, Estimators and tests for variance components in cross nested orthogonal designs, Discuss. Math. Probab. Stat. 23 (2) (2003a), 175–201.

[5] M. Fonseca, J.T. Mexia and R. Zmyślony, Estimating and testing of variance components: an application to a grapevine experiment, Biometrical Letters 40 (1) (2003b), 1–7.

[6] D.W. Gaylor and F.N. Hopper, Estimating the degrees of freedom for linear combinations of mean squares by Satterthwaite's formula, Technometrics 11 (1969), 691–706.

[7] A.I. Khuri, T. Mathew and B.K. Sinha, Statistical Tests for Mixed Linear Models, Wiley-Interscience, John Wiley & Sons, Inc., New York 1998.

[8] A. Michalski and R. Zmyślony, Testing hypothesis for variance components in mixed linear models, Statistics 27 (1996), 297–310.

[9] A. Michalski and R. Zmyślony, Testing hypothesis for linear functions of parameters in mixed linear models, Tatra Mountain Mathematical Publications 17 (1999), 103–110.

[10] C. Nunes and J.T. Mexia, Non-central generalized F distributions, Discuss. Math. Probab. Stat. 26 (1) (2006), 47–61.

[11] C. Nunes, I. Pinto and J.T. Mexia, F and selective F tests with balanced cross-nesting and associated models, Discuss. Math. Probab. Stat. 26 (2) (2006), 193–205.

[12] H. Robbins, Mixture of distributions, Ann. Math. Statistics 19 (1948), 360–369.

[13] H. Robbins and E.J.G. Pitman, Application of the method of mixtures to quadratic forms in normal variates, Ann. Math. Statistics 20 (1949), 552–560.

[14] F.E. Satterthwaite, An approximate distribution of estimates of variance components, Biometrics Bulletin 2 (1946), 110–114.

Received 10 November 2009
