INFERENCE ON THE LOCATION PARAMETER OF EXPONENTIAL POPULATIONS ∗

Maria de Fátima Brilhante
Universidade dos Açores, DM, and CEAUL

Sandra Mendonça
Universidade da Madeira, CCEE, and CEAUL

Dinis Duarte Pestana
Universidade de Lisboa, DEIO and CEAUL

and

Maria Luísa Rocha
Universidade dos Açores, DEG, CEEAplA and CEAUL

A token of friendship to Professor J. T. Mexia on his 70th birthday

Abstract

Studentization and analysis of variance are simple in Gaussian families because $\overline{X}$ and $S^2$ are independent random variables. We exploit the independence of the spacings in exponential populations with location $\lambda$ and scale $\delta$ to develop simple ways of dealing with inference on the location parameter, namely by developing an analysis of scale in the homoscedastic independent k-sample problem.

Keywords: studentization, analysis of scale, characterizations, independence of exponential spacings, location-scale families, F-ratio.

2000 Mathematics Subject Classification: 60E05.

∗ Research partially supported by FCT/OET.

1. Introduction

In what follows, $\mathbf{X} = (X_1, X_2, \ldots, X_n)$ denotes a random sample of size $n_X = n$ from the population $X$;

$$\overline{X} = \frac{1}{n}\sum_{k=1}^{n} X_k \quad\text{and}\quad S_X^2 = \frac{1}{n-1}\sum_{k=1}^{n} (X_k - \overline{X})^2 .$$

$\mathbf{X}_1, \ldots, \mathbf{X}_k$, $\overline{X}_1, \ldots, \overline{X}_k$, $S_{X_1}^2, \ldots, S_{X_k}^2$ (which for simplicity we shall denote $S_1^2, \ldots, S_k^2$) will have similar meaning. Also for simplicity, we shall write $n_{X_j} = n_j$, $j = 1, \ldots, k$, and $N = n_1 + \cdots + n_k$ for the size of the combined sample. A pooled variance estimator

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + \cdots + (n_k - 1)S_k^2}{N - k}$$

will also be used.

The exact sampling distribution of functions of low empirical moments of Gaussian populations $X_k \sim \text{Gaussian}(\mu_k, \sigma_k)$ is easily derived, and hence we have a wealth of results to estimate, or to test, the location parameter, or to compare location parameters of several Gaussian populations, even for small samples, at least assuming homoscedasticity.

For the one-sample and the homoscedastic independent two-sample cases, Student (1908) devised how to get rid of the nuisance scale parameter $\sigma$ in

$$\frac{\overline{X} - \mu}{\sigma\sqrt{\frac{1}{n}}} \quad\text{and in}\quad \frac{\overline{X}_1 - \overline{X}_2 - (\mu_1 - \mu_2)}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},$$

respectively, dividing those standard Gaussian random variables by appropriate functions of estimators of $\sigma$.

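For the heteroscedastic case, Smith (1936), Welch (1938), Satterthwaite (1946) and Aspin (1948) established that

$$\frac{\overline{X}_1 - \overline{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

(with the unknown variances $\sigma_1^2$ and $\sigma_2^2$ replaced by their sample estimators) has an approximate $t_\nu$ distribution, where the fractional number of degrees of freedom $\nu$ can be estimated by

$$\tilde{\nu} = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{s_1^4}{n_1^2 (n_1 - 1)} + \dfrac{s_2^4}{n_2^2 (n_2 - 1)}} .$$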
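The approximation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `welch_statistic` and its arguments are hypothetical (they do not come from the paper), and the statistic is computed with the sample variances in the denominator.

```python
from math import sqrt
from statistics import mean, variance


def welch_statistic(x1, x2, mu_diff=0.0):
    """Return (t, nu): the Welch statistic and the estimated degrees of freedom.

    Names are illustrative only; mu_diff is the hypothesised mu_1 - mu_2.
    """
    n1, n2 = len(x1), len(x2)
    # s_j^2 / n_j, with s_j^2 the unbiased sample variance of sample j
    v1 = variance(x1) / n1
    v2 = variance(x2) / n2
    t = (mean(x1) - mean(x2) - mu_diff) / sqrt(v1 + v2)
    # (s1^2/n1 + s2^2/n2)^2 / (s1^4/(n1^2 (n1-1)) + s2^4/(n2^2 (n2-1)))
    nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, nu
```

When both samples have the same size and the same sample variance, the estimate reduces to $n_1 + n_2 - 2$, the degrees of freedom of the homoscedastic two-sample t statistic.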

Aspin (1949) tabulated high quantiles of this 'fractional' t distribution. Alternatively, inference on $\mu_1 - \mu_2$ can use random pairing of $X_{1,k}$ from $\mathbf{X}_1$ with $Y_{2,\nu_k}$ chosen using simple random sampling without replacement from $\mathbf{X}_2$, cf. Scheffé (1943, 1944), so that using the one-sample t-test for the 'random' differences $D_k = X_{1,k} - Y_{2,\nu_k}$, $k = 1, \ldots, \min(n_1, n_2)$, is in order.

Following David and Nagaraja (2003) we shall speak of studentization, in general populations, whenever we divide a random variable $T = T(\Theta(\mathbf{X}), \theta, \xi)$, which is a function of an estimator $\Theta(\mathbf{X})$ of the parameter of interest $\theta$ and depends both on this parameter and on a nuisance parameter $\xi$, by some appropriate function $h(\Xi(\mathbf{X}))$ of an estimator $\tilde{\xi} = \Xi(\mathbf{X})$ of the nuisance parameter, so that the distribution of the ratio no longer depends on $\xi$. This ratio can therefore be used as a pivotal variable to construct interval estimators of $\theta$; it is also an adequate test statistic under the null hypothesis $H_0 : \theta = \theta_0$. We shall call it external studentization when $T$ and $h(\Xi)$ are independent, and internal studentization otherwise.

The homoscedastic independent k-sample problem, in Gaussian populations, was solved by Fisher (1925). Instead of getting rid of the nuisance parameter, Fisher's ANalysis Of VAriance (in the sense of decomposing an estimator of the overall dispersion into components) uses the influence of the location parameter estimates on the scale parameter estimator.

More precisely, as the variance $\sigma^2$ is the smallest second-order moment, i.e., $\sigma^2 < E[(X - A)^2]$ for all $A \neq E[X]$ (and, similarly, $s_x^2 = \frac{1}{n-1}\sum_{k=1}^{n} (x_k - \overline{x})^2 < \frac{1}{n-1}\sum_{k=1}^{n} (x_k - a)^2$ for all $a \neq \overline{x} = \frac{1}{n}\sum_{k=1}^{n} x_k$), splitting the Total Sum of Squares $TSS = \sum_{i=1}^{k}\sum_{j=1}^{n_i} (X_{ij} - \overline{X})^2$, where $\overline{X}$ stands for the overall mean $\overline{X} = \frac{1}{N}\sum_{i=1}^{k}\sum_{j=1}^{n_i} X_{ij}$ of the combined sample $\widetilde{\mathbf{X}} = (\mathbf{X}_1, \ldots, \mathbf{X}_k)$, into the Between Sum of Squares $BSS = \sum_{i=1}^{k} n_i (\overline{X}_i - \overline{X})^2$ and the Within Sum of Squares $WSS = \sum_{i=1}^{k}\sum_{j=1}^{n_i} (X_{ij} - \overline{X}_i)^2$, we obtain two independent estimators of the variance, namely the mean squares $\frac{BSS}{k-1}$ and $S_p^2 = \frac{WSS}{N-k}$. Moreover, while $S_p^2$ is an unbiased estimator of $\sigma^2$, $\frac{BSS}{k-1}$ is unbiased under the null hypothesis $\mu_1 = \mu_2 = \cdots = \mu_k = \mu$, but when the alternative is true $\frac{BSS}{k-1}$ is biased and must overestimate $\sigma^2$, so that the F-ratio

$$\left.\frac{BSS/(k-1)}{WSS/(N-k)}\,\right|_{H_0} \sim F_{k-1,\,N-k}$$

should detect gross departures from the null hypothesis.
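As a small numerical illustration of this decomposition, the following Python sketch computes $TSS$, $BSS$, $WSS$ and the F-ratio for a list of samples. The function name `anova_decomposition` is an assumption made here for illustration; the arithmetic follows the definitions above.

```python
from statistics import mean


def anova_decomposition(samples):
    """Return (TSS, BSS, WSS, F-ratio) for a list of samples (one list per group)."""
    k = len(samples)
    N = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / N
    group_means = [mean(s) for s in samples]
    # Between Sum of Squares: group means around the overall mean
    bss = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, group_means))
    # Within Sum of Squares: observations around their own group mean
    wss = sum((x - m) ** 2 for s, m in zip(samples, group_means) for x in s)
    # Total Sum of Squares: observations around the overall mean
    tss = sum((x - grand_mean) ** 2 for s in samples for x in s)
    f_ratio = (bss / (k - 1)) / (wss / (N - k))
    return tss, bss, wss, f_ratio
```

For the three samples (1, 2, 3), (2, 3, 4) and (6, 7, 8), this gives $BSS = 42$, $WSS = 6$, $TSS = 48$ and an F-ratio of 21, to be compared with the quantiles of $F_{2,6}$.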

Once again, the pathbreaking Welch–Satterthwaite techniques are useful in constructing approximate solutions without assuming homoscedasticity (Satterthwaite, 1946; Welch, 1951; Oehlert, 2000).

Inference on location and scale in Gaussian populations is thus simple for two main reasons:

1. If $\mathbf{X}$ is a random sample from $X \sim \text{Gaussian}(\mu, \sigma)$, $\overline{X}$ and $S^2$ are independent random variables, and thus in the two-sample setting, $\overline{X}_1 - \overline{X}_2$ and $S_p^2$ are independent random variables. Thus deducing the exact distribution of the studentized variables

$$\frac{\overline{X} - \mu}{S\sqrt{\frac{1}{n}}} \quad\text{and}\quad \frac{\overline{X}_1 - \overline{X}_2 - (\mu_1 - \mu_2)}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

(Student, 1908) has been a simple task.

Unfortunately, independence of $\overline{X}$ and $S^2$ is a characterization of the Gaussian populations, and studentization in other populations, e.g. uniform populations, is a hard task, even for samples of size 3 (Perlo, 1933).

2. Assume that we want to estimate the location parameter $\theta$ of an absolutely continuous population $X$ using the maximum likelihood estimator $\hat{\theta}$. Under what conditions is $\hat{\theta} = \overline{X}$?

Given a sample $\mathbf{x} = (x_1, \ldots, x_n)$, $\hat{\theta}$ must satisfy $\sum_{k=1}^{n} g'(\hat{\theta} - x_k) = 0$, where $g(\theta - x_k) = \ln[f(x_k \mid \theta)]$.

The special case $x_1 = \cdots = x_{n-1} = 0$ and $x_n = nu$, for which $\overline{x} = u$, implies that $(n-1)\,g'(u) + g'((1-n)u) = 0$, $n = 2, 3, \ldots$; the case $n = 2$ shows at once that $g'$ must be an odd function. Thus $g'(nu) = n\,g'(u)$, $u \in \mathbb{R}$, $n = 2, 3, \ldots$, and hence $g'$ is a linear function. From this, $g'(u) = Cu \Longrightarrow g(u) = \frac{1}{2} C u^2 + b$, satisfying the condition $\int_{\mathbb{R}} f(x \mid \theta)\,dx = 1$. Then

$$f(x \mid \theta) = \sqrt{\frac{\alpha}{2\pi}}\; e^{-\frac{1}{2}\alpha (x - \theta)^2}\, I_{\mathbb{R}}(x), \quad \alpha > 0.$$

This observation (Gauss, 1809) shows that the Gaussian family is unique, in the realm of absolutely continuous distributions, in having $\overline{X}$ as the maximum likelihood estimator of its location parameter $\theta = E[X]$.

In the sequel we shall investigate how a characterization of the exponential populations (namely that $S_1 := X_{1:n}$ and the spacings $S_k := X_{k:n} - X_{k-1:n}$, $k = 2, \ldots, n$, are independent) can be used to establish simple results concerning inference on the location parameter $\lambda$ of $Y = \delta X + \lambda \sim \text{Exponential}(\lambda, \delta)$, where $X \sim \text{Exponential}(1)$ is the standard exponential, with probability density function $f_X(x) = e^{-x}\, I_{(0,\infty)}(x)$.

2. Inference on the location of exponential random variables

2.1. The one-sample case

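Let $\mathbf{Y}$ be a random sample of size $n$ from $Y \sim \text{Exponential}(\lambda, \delta)$, $\lambda \in \mathbb{R}$, $\delta > 0$, i.e., with distribution function

$$F_Y(y) = \left[1 - \exp\left(-\frac{y - \lambda}{\delta}\right)\right] I_{[\lambda,\infty)}(y).$$

The maximum likelihood estimators of the parameters are $\hat{\lambda} = Y_{1:n}$ and $\hat{\delta} = \overline{Y} - Y_{1:n}$. From those we can easily construct unbiased estimators of $\lambda$ and of $\delta$, $\tilde{\lambda} = Y_{1:n} - \frac{\tilde{\delta}}{n}$ and $\tilde{\delta} = \frac{1}{n-1}\sum_{k=1}^{n} (Y_k - Y_{1:n})$. Moreover,

$$\hat{\delta} = \overline{Y} - Y_{1:n} = \frac{1}{n}\sum_{k=2}^{n} (Y_{k:n} - Y_{1:n}) = \frac{1}{n}\sum_{k=2}^{n} (n + 1 - k)\, S_k,$$

where the spacings $S_k = Y_{k:n} - Y_{k-1:n} \sim \text{Exponential}\left(0, \frac{\delta}{n+1-k}\right)$, $k = 2, \ldots, n$, are independent and independent of $S_1 = Y_{1:n} - \lambda \sim \text{Exponential}\left(0, \frac{\delta}{n}\right)$, cf. David and Nagaraja (2003).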
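The estimators of this subsection, and the identity expressing $\hat{\delta}$ through the weighted spacings, can be checked numerically with the short Python sketch below. The function names and the dictionary keys are hypothetical choices made for this illustration.

```python
def exponential_estimates(y):
    """Maximum likelihood and unbiased estimates of (lambda, delta) for one sample."""
    n = len(y)
    y_min = min(y)
    delta_mle = sum(y) / n - y_min             # mean minus minimum
    delta_unbiased = n * delta_mle / (n - 1)   # (1/(n-1)) * sum(Y_k - Y_{1:n})
    lambda_unbiased = y_min - delta_unbiased / n
    return {
        "lambda_mle": y_min,
        "delta_mle": delta_mle,
        "lambda_unbiased": lambda_unbiased,
        "delta_unbiased": delta_unbiased,
    }


def delta_mle_from_spacings(y):
    """Compute (1/n) * sum_{k=2}^{n} (n + 1 - k) * S_k from the ordered sample."""
    ys = sorted(y)
    n = len(ys)
    # ys[k - 1] is Y_{k:n} (Python lists are 0-based), so S_k = ys[k-1] - ys[k-2]
    return sum((n + 1 - k) * (ys[k - 1] - ys[k - 2]) for k in range(2, n + 1)) / n
```

For the sample (3, 1, 2, 6), both routes give $\hat{\delta} = 2$, while the unbiased estimates are $\tilde{\delta} = 8/3$ and $\tilde{\lambda} = 1/3$.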

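Thus the studentized random variable

$$\frac{Y_{1:n} - \lambda}{\hat{\delta}} = \frac{\dfrac{Y_{1:n} - \lambda}{\delta}}{\dfrac{\hat{\delta}}{\delta}}$$

is the quotient of the independent random variables $\frac{Y_{1:n} - \lambda}{\delta} \sim \text{Exponential}\left(0, \frac{1}{n}\right)$ and $\frac{\hat{\delta}}{\delta} \sim \text{Gamma}\left(n - 1, 0, \frac{1}{n}\right)$.

Therefore, the probability density function of $\frac{Y_{1:n} - \lambda}{\hat{\delta}}$ is

$$f_{\frac{Y_{1:n} - \lambda}{\hat{\delta}}}(x) = \frac{n - 1}{(1 + x)^n}\, I_{(0,\infty)}(x),$$

i.e., $\frac{Y_{1:n} - \lambda}{\hat{\delta}} \sim \text{Pareto}(0, n - 1)$, which can be used either to construct confidence intervals for the location parameter $\lambda$, or to test some specific hypothesis about where its value lies. Observe that the quantiles of this distribution are $x_{n,1-\alpha} = \alpha^{-\frac{1}{n-1}} - 1$.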
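A possible use of these quantiles is sketched below in Python. Since the studentized minimum is positive, the event $0 < (Y_{1:n} - \lambda)/\hat{\delta} < x_{n,1-\alpha}$ has probability $1 - \alpha$ and can be inverted into an interval for $\lambda$. The function names are illustrative assumptions, and the interval shown is one natural choice rather than a construction spelled out in the text.

```python
def pareto_quantile(n, alpha):
    """Quantile x_{n,1-alpha} = alpha^(-1/(n-1)) - 1 of the studentized minimum."""
    return alpha ** (-1.0 / (n - 1)) - 1.0


def location_confidence_interval(y, alpha=0.05):
    """Interval (Y_{1:n} - x * delta_hat, Y_{1:n}) with coverage 1 - alpha for lambda."""
    n = len(y)
    y_min = min(y)
    delta_hat = sum(y) / n - y_min  # maximum likelihood estimate of delta
    return y_min - pareto_quantile(n, alpha) * delta_hat, y_min
```

For instance, with the sample (3, 1, 2, 6) and $\alpha = 0.125$ the quantile is $0.125^{-1/3} - 1 = 1$, so the interval is $(-1, 1)$.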

2.2. The 'homoscedastic' independent two samples case

Although Johnson et al. (1995, p. 193) and Brilhante and Kotz (2008) consider more general asymmetric Laplace random variables, herein we shall consider only the simpler situation of equal sample sizes, leading to the usual Laplace, obtained as $X_1 - X_2$, where $X_1$ and $X_2$ are iid exponential random variables.

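Let $\mathbf{Y}_1$ be a random sample of size $\frac{n}{2}$ ($n > 2$ even) from $Y_1 \sim \text{Exponential}(\lambda_1, \delta)$, and $\mathbf{Y}_2$ a random sample of size $\frac{n}{2}$ from $Y_2 \sim \text{Exponential}(\lambda_2, \delta)$ (hence we are assuming a 'homoscedastic' situation $\delta_1 = \delta_2 = \delta$), with $\mathbf{Y}_1$ and $\mathbf{Y}_2$ independent. We shall denote by $Y^{(1)}_{k:\frac{n}{2}}$ the k-th order statistic of $\mathbf{Y}_1$, and by $Y^{(2)}_{j:\frac{n}{2}}$ the j-th order statistic of $\mathbf{Y}_2$.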

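Using arguments analogous to those stated in the one-sample case, it is easy to study the studentized random variable

$$W = \frac{Y^{(1)}_{1:\frac{n}{2}} - Y^{(2)}_{1:\frac{n}{2}} - (\lambda_1 - \lambda_2)}{\hat{\delta}} = \frac{\dfrac{\left(Y^{(1)}_{1:\frac{n}{2}} - \lambda_1\right) - \left(Y^{(2)}_{1:\frac{n}{2}} - \lambda_2\right)}{\delta}}{\dfrac{\hat{\delta}}{\delta}},$$

where the random variables

$$\frac{Y^{(1)}_{1:\frac{n}{2}} - Y^{(2)}_{1:\frac{n}{2}} - (\lambda_1 - \lambda_2)}{\delta} \sim \text{Laplace}\left(\frac{2}{n}\right)$$

and

$$\frac{\hat{\delta}}{\delta} = \frac{1}{\delta}\cdot\frac{\left(\overline{Y}_1 - Y^{(1)}_{1:\frac{n}{2}}\right) + \left(\overline{Y}_2 - Y^{(2)}_{1:\frac{n}{2}}\right)}{2} \sim \text{Gamma}\left(n - 2, \frac{1}{n}\right)$$

are independent.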

In the general case, with possibly unequal sample sizes $n_1$ and $n_2$, the adequate pooled estimator of the dispersion parameter $\delta$ is

$$\hat{\delta} = \frac{n_1\left(\overline{Y}_1 - Y^{(1)}_{1:n_1}\right) + n_2\left(\overline{Y}_2 - Y^{(2)}_{1:n_2}\right)}{n_1 + n_2}.$$

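From this we readily obtain the probability density function of $W$ in the equal sample sizes setting,

$$f_W(x) = \frac{\frac{n}{2} - 1}{2\left(1 + \frac{|x|}{2}\right)^{n-1}}\, I_{\mathbb{R}}(x),$$

and the upper quantiles

$$w_{n,1-\alpha} = 2\left[(2\alpha)^{\frac{1}{2-n}} - 1\right], \quad \alpha < \frac{1}{2}$$

(where $n$ is the combined sample size).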
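The two-sample quantiles can be used in the same way as in the one-sample case. The Python sketch below is a hedged illustration for equal sample sizes only: the function names are hypothetical, and the symmetric interval for $\lambda_1 - \lambda_2$ relies on the symmetry of the density of $W$ about zero.

```python
def two_sample_quantile(n, alpha):
    """Upper quantile w_{n,1-alpha} = 2 * ((2 alpha)^(1/(2-n)) - 1), with alpha < 1/2."""
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie in (0, 1/2)")
    return 2.0 * ((2.0 * alpha) ** (1.0 / (2 - n)) - 1.0)


def location_difference_interval(y1, y2, alpha=0.05):
    """Symmetric interval with coverage 1 - alpha for lambda_1 - lambda_2."""
    m = len(y1)
    if len(y2) != m:
        raise ValueError("this sketch only covers equal sample sizes")
    n = 2 * m  # combined sample size
    # Pooled estimate: average of (mean - minimum) over the two samples
    delta_hat = ((sum(y1) / m - min(y1)) + (sum(y2) / m - min(y2))) / 2.0
    diff = min(y1) - min(y2)
    w = two_sample_quantile(n, alpha / 2.0)  # alpha/2 in each tail
    return diff - w * delta_hat, diff + w * delta_hat
```

As a check on the quantile formula, $w_{4,\,0.875} = 2\left[(0.25)^{-1/2} - 1\right] = 2$.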

2.3. The 'homoscedastic' independent k samples case

Let

$$
\begin{array}{l}
\mathbf{Y}_1 = (Y_{11}, \ldots, Y_{1n}) \\
\quad\vdots \\
\mathbf{Y}_k = (Y_{k1}, \ldots, Y_{kn})
\end{array}
$$

be independent random samples from $k$ populations, with $Y_{ij} \sim \text{Exponential}(\lambda_i, \delta)$, $i = 1, \ldots, k$; $j = 1, \ldots, n$. Once again, herein we shall only deal with the simplified problem of a balanced design, with all sample sizes equal.

For simplicity, we denote by $\overline{\lambda}$ the average $\frac{1}{k}\sum_{i=1}^{k} \lambda_i$, and we assume that the $\mathbf{Y}_i$, $i = 1, \ldots, k$, have been ordered so that $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_k$; hence $Y_1 \preceq Y_2 \preceq \cdots \preceq Y_k$, where $W \preceq Z$ means that $W$ is stochastically not greater than $Z$.

We shall use the notations $Y_{1:N} = \min_{i,j} Y_{ij}$, $i = 1, \ldots, k$, $j = 1, \ldots, n$, $Y^{(i)}_{1:n} = \min_{j} Y_{ij}$, $i = 1, \ldots, k$, and more generally $Y^{(i)}_{j:n}$ for the j-th order statistic from the i-th random sample.

We now split the Total Sum of Spacings $TSSp$ into a Between Sum of Spacings $BSSp$ and a Within Sum of Spacings $WSSp$:

$$TSSp = \sum_{i=1}^{k}\sum_{j=1}^{n} (Y_{ij} - Y_{1:N}) = BSSp + WSSp = \sum_{i=1}^{k} n\left(Y^{(i)}_{1:n} - Y_{1:N}\right) + \sum_{i=1}^{k}\sum_{j=1}^{n} \left(Y_{ij} - Y^{(i)}_{1:n}\right).$$

The simple observation that

$$WSSp = \sum_{i=1}^{k}\sum_{j=2}^{n} \left(Y^{(i)}_{j:n} - Y^{(i)}_{1:n}\right) = \underbrace{\sum_{i=1}^{k}\, \underbrace{\sum_{j=2}^{n} (n + 1 - j)\left[Y^{(i)}_{j:n} - Y^{(i)}_{j-1:n}\right]}_{\text{Gamma}(n-1,\,\delta)}}_{\text{Gamma}(N-k,\,\delta)}$$

shows that $\frac{WSSp}{N-k}$ is an unbiased estimator of $\delta$, either under $H_0 : \lambda_1 = \cdots = \lambda_k$ or under the alternative $H_A$.

Under $H_0$, $BSSp \sim \text{Gamma}(k - 1, \delta)$, and therefore $\frac{BSSp}{k-1}$ is an unbiased estimator of $\delta$. However, under $H_A$, $\frac{BSSp}{k-1}$ is a biased estimator of $\delta$.

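As a consequence, an ANOSp (ANalysis Of Spacings) table similar to the one-way ANOVA table

$$
\begin{array}{llll}
\hline
\text{Sum of Spacings} & \text{d.f.} & \delta \text{ estimator} & F\text{-ratio} \\
\hline
BSSp = \displaystyle\sum_{i=1}^{k} n\left[Y^{(i)}_{1:n} - Y_{1:N}\right] & k - 1 & \dfrac{BSSp}{k-1} & \left.\dfrac{BSSp/(k-1)}{WSSp/(N-k)}\,\right|_{H_0} \sim F_{(2(k-1),\,2(N-k))} \\
WSSp = \displaystyle\sum_{i=1}^{k}\sum_{j=1}^{n} \left[Y_{ij} - Y^{(i)}_{1:n}\right] & N - k & \dfrac{WSSp}{N-k} & \\
\hline
\end{array}
$$

should be able to detect gross departures from the null hypothesis.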
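The ANOSp quantities are straightforward to compute. The following Python sketch, for a balanced design with $N = kn$, returns the sums of spacings, the F-ratio and the pair of degrees of freedom $(2(k-1), 2(N-k))$ to be used when looking up quantiles; the function name `anosp` and the returned dictionary keys are assumptions made for this illustration.

```python
def anosp(samples):
    """Analysis of spacings for k samples of common size n (balanced design)."""
    k = len(samples)
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("this sketch only covers a balanced design")
    N = k * n
    sample_minima = [min(s) for s in samples]
    overall_min = min(sample_minima)
    # Between Sum of Spacings: n * (sample minimum - overall minimum)
    bssp = sum(n * (m - overall_min) for m in sample_minima)
    # Within Sum of Spacings: observations minus their own sample minimum
    wssp = sum(y - m for s, m in zip(samples, sample_minima) for y in s)
    # Total Sum of Spacings: observations minus the overall minimum
    tssp = sum(y - overall_min for s in samples for y in s)
    f_ratio = (bssp / (k - 1)) / (wssp / (N - k))
    return {
        "TSSp": tssp,
        "BSSp": bssp,
        "WSSp": wssp,
        "F": f_ratio,
        "df": (2 * (k - 1), 2 * (N - k)),
    }
```

For the samples (1, 2, 3), (2, 3, 4) and (6, 7, 8), this gives $BSSp = 18$, $WSSp = 9$, $TSSp = 27$ and an F-ratio of 6 with $(4, 12)$ degrees of freedom.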


3. Alternative results on external and internal studentization in exponential populations

The broad concept of studentization as defined in David and Nagaraja (2003) inspires alternative approaches to inference on the location parameter of an exponential population $Y \sim \text{Exponential}(\lambda, \delta)$. Brilhante et al. (2001) summarize previous work by the authors, namely, in the one-sample case, the study of the studentized minimum using the range as the estimator of a simple function of the nuisance scale parameter,

$$\frac{Y_{1:n} - \lambda}{Y_{n:n} - Y_{1:n}} \overset{d}{=} \frac{X_{1:n}}{X_{n:n} - X_{1:n}},$$

where $X = \frac{Y - \lambda}{\delta}$ is the standard exponential.

As $Y_{1:n}$ and $Y_{n:n} - Y_{1:n}$ are independent, it is once again external studentization. Due to the memoryless property of the exponential, $X_{n:n} - X_{1:n} \overset{d}{=} X_{n-1:n-1}$, with probability density function $f_{X_{n:n} - X_{1:n}}(x) = (n - 1)\, e^{-x}\, (1 - e^{-x})^{n-2}\, I_{(0,\infty)}(x)$.

Using standard methods, the probability density function of $W = \frac{Y_{1:n} - \lambda}{Y_{n:n} - Y_{1:n}}$ is

$$f_W(t) = -n\left[B(n, nt) + nt\,\frac{\partial B(n, nt)}{\partial (nt)}\right] I_{(0,\infty)}(t) = n\, B(n, nt) \sum_{k=1}^{n-1} \frac{nt}{(n - k) + nt}\; I_{(0,\infty)}(t),$$

where $B(p, q) = \int_0^1 x^{p-1} (1 - x)^{q-1}\, dx$, $p, q > 0$, is Euler's integral of the first kind.
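The second form of the density is easy to evaluate numerically. The Python sketch below computes it through log-gamma functions and compares it with a numerical derivative; it uses the fact that the first form is minus the derivative of $t \mapsto nt\,B(n, nt)$. The function names are illustrative assumptions and are not taken from the paper.

```python
from math import exp, lgamma


def beta_function(p, q):
    """Euler's integral of the first kind, evaluated through log-gamma functions."""
    return exp(lgamma(p) + lgamma(q) - lgamma(p + q))


def survival_w(t, n):
    """t -> n t B(n, n t); minus its derivative is the first form of the density."""
    return n * t * beta_function(n, n * t)


def density_w(t, n):
    """Second (finite sum) form of the density of the studentized minimum."""
    if t <= 0:
        return 0.0
    q = n * t
    return n * beta_function(n, q) * sum(q / ((n - k) + q) for k in range(1, n))
```

For $n = 2$ the sum has a single term and the density reduces to $2/(1 + 2t)^2$, so that, for example, `density_w(0.5, 2)` equals 0.5 up to rounding.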

Explicit formulas for the probability density function for $n \leq 30$, and a table of high quantiles, can be requested from the authors. For larger values, observe that $n \ln(n)\, \frac{Y_{1:n} - \lambda}{Y_{n:n} - Y_{1:n}}$ converges in distribution to a standard exponential random variable.

Internal studentization may be an interesting alternative; for instance,

$$\tau^{*}_{(n-1;i,k)} = \frac{\overline{Y}_n - \lambda}{Y_{k:n} - Y_{i:n}} \quad (1 \leq i < k \leq n)$$

for appropriate choices of $i$ and $k$ has a smaller breaking point than any of the studentized variables considered so far. At first sight the problem looks analytically intractable, since $\overline{Y}_n - \lambda$ and $Y_{k:n} - Y_{i:n}$ have a very intricate dependence structure. But rewriting the above expression as

$$Y_{k:n} - Y_{i:n} = \frac{\overline{Y}_n - \lambda}{\tau^{*}_{(n-1;i,k)}},$$

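in view of Basu's theorem, and with the standard notations for the Laplace transform $\mathcal{L}$ and the inverse Laplace transform $\mathcal{L}^{-1}$ at given points, we get that

$$\mathcal{L}\left[y^{n} f_{\tau^{*}_{(n-1;i,k)}}(y);\, nx\right] = \frac{\Gamma(n)\,\Gamma(n + 1 - i)}{n^{n}\,\Gamma(k - i)\,\Gamma(n + 1 - k)}\; \frac{e^{-x(n+1-k)}\,(1 - e^{-x})^{k-i-1}}{x^{n-1}}$$

or

$$\mathcal{L}\left[y^{n} f_{\tau^{*}_{(n-1;i,k)}}(y);\, x\right] = \frac{\Gamma(n)\,\Gamma(n + 1 - i)}{n\,\Gamma(k - i)\,\Gamma(n + 1 - k)}\; \frac{e^{-\frac{x}{n}(n+1-k)}\left(1 - e^{-\frac{x}{n}}\right)^{k-i-1}}{x^{n-1}}$$

and hence

$$f_{\tau^{*}_{(n-1;i,k)}}(y) = \frac{\Gamma(n)\,\Gamma(n + 1 - i)}{n\,\Gamma(k - i)\,\Gamma(n + 1 - k)}\; \frac{1}{y^{n}}\; \mathcal{L}^{-1}\left[\frac{e^{-\frac{x}{n}(n+1-k)}\left(1 - e^{-\frac{x}{n}}\right)^{k-i-1}}{x^{n-1}};\, y\right].$$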

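Choosing $i$ and $k$ so that

$$
\begin{array}{ccc}
n & i & k \\
\hline
3j - 1 & j - 1 & 2j \\
3j & j & 2j + 1 \\
3j + 1 & j + 1 & 2j + 2
\end{array}
$$

(in all three cases $k - i - 1 = j$, $n + 1 - k = j$ and $n + 1 - i = 2j + 1$) we have the specially simple expression

$$f_{\tau^{*}_{(n-1)}}(y) = \frac{\Gamma(n)\,\Gamma(2j + 1)}{n\,\Gamma(j)\,\Gamma(j + 1)}\; \frac{1}{y^{n}}\; \mathcal{L}^{-1}\left[\frac{1}{x^{k-2}} \left(\frac{e^{-\frac{x}{n}}\left(1 - e^{-\frac{x}{n}}\right)}{x}\right)^{j};\, y\right].$$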

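Since

$$\mathcal{L}^{-1}\left[\frac{e^{-\frac{x}{n}}\left(1 - e^{-\frac{x}{n}}\right)}{x};\, y\right] = I_{\left(\frac{1}{n},\, \frac{2}{n}\right)}(y)$$

and

$$\mathcal{L}^{-1}\left[\frac{1}{s^{n}};\, y\right] = \frac{y^{n-1}}{\Gamma(n)},$$

remembering how convolution of densities and products of Laplace transforms are related we get explicit solutions such as, for $n = 3$, $i = 1$, $k = 3$,

$$
f_{\tau^{*}_{(2)}}(x) =
\begin{cases}
0 & x < \frac{1}{3} \\[4pt]
\dfrac{4\left(x - \frac{1}{3}\right)}{3x^{3}} & \frac{1}{3} \leq x < \frac{2}{3} \\[8pt]
\dfrac{4}{9x^{3}} & x \geq \frac{2}{3}
\end{cases}
$$

or, for $n = 4$, $i = 2$, $k = 4$,

$$
f_{\tau^{*}_{(3)}}(x) =
\begin{cases}
0 & x < \frac{1}{4} \\[4pt]
\dfrac{3}{32x^{4}}\,(4x - 1)^{2} & \frac{1}{4} \leq x < \frac{1}{2} \\[8pt]
\dfrac{3\,(8x - 3)}{32x^{4}} & x \geq \frac{1}{2}.
\end{cases}
$$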
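A quick sanity check of these two explicit densities is that each integrates to one. The Python sketch below does this with a midpoint rule after the change of variable $u = 1/x$, which maps the infinite range onto a bounded interval. The function names and the number of integration steps are arbitrary choices made for this illustration.

```python
def density_tau_n3(x):
    """Density of tau* for n = 3, i = 1, k = 3."""
    if x < 1 / 3:
        return 0.0
    if x < 2 / 3:
        return 4 * (x - 1 / 3) / (3 * x ** 3)
    return 4 / (9 * x ** 3)


def density_tau_n4(x):
    """Density of tau* for n = 4, i = 2, k = 4."""
    if x < 1 / 4:
        return 0.0
    if x < 1 / 2:
        return 3 * (4 * x - 1) ** 2 / (32 * x ** 4)
    return 3 * (8 * x - 3) / (32 * x ** 4)


def integrate_to_infinity(f, a, steps=20000):
    """Midpoint rule for the integral of f over [a, infinity), using u = 1/x."""
    h = (1 / a) / steps
    total = 0.0
    for m in range(steps):
        u = (m + 0.5) * h          # midpoint of the m-th cell in (0, 1/a)
        total += f(1 / u) / u ** 2  # dx = du / u^2 in absolute value
    return total * h
```

Integrating `density_tau_n3` from $1/3$ and `density_tau_n4` from $1/4$ should return values numerically equal to one, and the mass of the first density beyond $2/3$ is one half.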

References

[1] A.A. Aspin, An examination and further developments of a formula arising in the problem of comparing two mean values, Biometrika 35 (1948), 88–97.

[2] A.A. Aspin, Tables for use in comparisons whose accuracy involves two variances, separately estimated, Biometrika 36 (1949), 290–293.

[3] M.F. Brilhante and S. Kotz, Infinite divisibility of the spacings of a Kotz–Kozubowski–Podgórski generalized Laplace model, Statistics & Probability Letters 78 (2008), 2433–2436.

[4] M.F. Brilhante, D. Pestana, J. Rocha and S. Velosa, Inferência Estatística Sobre Localização e Escala, Sociedade Portuguesa de Estatística, Ponta Delgada 2001.

[5] H.A. David and H.N. Nagaraja, Order Statistics, 3rd ed., Wiley, New York 2003.

[6] R.A. Fisher, Statistical Methods for Research Workers, Oliver and Boyd, Edinburgh 1925.

[7] N.L. Johnson, S. Kotz and N. Balakrishnan, Continuous Univariate Distributions, vol. 2, 2nd ed., Wiley, New York 1995.

[8] G.W. Oehlert, A First Course in Design and Analysis of Experiments, Freeman, New York 2000.

[9] V. Perlo, On the distribution of ‘Student’s’ ratio for samples of three drawn from the rectangular distribution, Biometrika 25 (1933), 203–204.

[10] D. Pestana, F. Brilhante and J. Rocha, The analysis of variance revisited, in Extreme Values and Additive Laws, Lisboa (1999), 73–77.

[11] D. Pestana and J. Rocha, Análise de escala – modelo exponencial, in A Estatística e o Futuro e o Futuro da Estatística, Salamandra, Lisboa (1993), 295–303.

[12] J. Rocha, Localização e Escala em Situações não Clássicas, Dissertação de Doutoramento, Universidade dos Açores, Ponta Delgada 1995.

[13] J. Rocha, Inference on location parameters – internally studentized statistics, Rev. Estat. (2001), 355–356.

[14] F.E. Satterthwaite, An approximate distribution of estimates of variance components, Biometrics Bulletin 2 (1946), 110–114.

[15] H. Scheffé, On solutions of the Behrens-Fisher problem based on the t-distribution, Ann. Math. Stat. 14 (1943), 35–44.

[16] H. Scheffé, A note on the Behrens-Fisher problem, Ann. Math. Stat. 15 (1944), 430–432.

[17] H. Smith, The problem of comparing the results of two experiments with unequal means, J. Council Sci. Industr. Res. 9 (1936), 211–212.

[18] Student, The probable error of the mean, Biometrika 6 (1908), 1–25. (Reprinted in E.S. Pearson and J. Wishart (1958), "Student's" Collected Papers, Cambridge Univ. Press, Cambridge.)

[19] B.L. Welch, The significance of the difference between two means when the population variances are unequal, Biometrika 29 (1938), 350–361.

[20] B.L. Welch, On the comparison of several mean values: an alternative approach, Biometrika 38 (1951), 330–336.

Received 8 September 2009
