INFERENCE ON THE LOCATION PARAMETER OF EXPONENTIAL POPULATIONS
Maria de Fátima Brilhante ∗
Universidade dos Açores, DM, and CEAUL
Sandra Mendonça
Universidade da Madeira, CCEE, and CEAUL
Dinis Duarte Pestana
Universidade de Lisboa, DEIO and CEAUL
and
Maria Luísa Rocha
Universidade dos Açores, DEG, CEEAplA and CEAUL
A token of friendship to Professor J. T. Mexia on his 70th birthday
Abstract
Studentization and analysis of variance are simple in Gaussian families because $\bar X$ and $S^2$ are independent random variables. We exploit the independence of the spacings in exponential populations with location $\lambda$ and scale $\delta$ to develop simple ways of dealing with inference on the location parameter, namely by developing an analysis of scale in the homoscedastic independent $k$-sample problem.
Keywords: studentization, analysis of scale, characterizations, independence of exponential spacings, location-scale families, $F$-ratio.
2000 Mathematics Subject Classification: 60E05.
∗ Research partially supported by FCT/OET.
1. Introduction
In what follows, $\mathbf{X} = (X_1, X_2, \dots, X_n)$ denotes a random sample of size $n_X = n$ from the population $X$;
\[
\bar X = \frac{1}{n}\sum_{k=1}^{n} X_k \quad\text{and}\quad S_X^2 = \frac{1}{n-1}\sum_{k=1}^{n} (X_k - \bar X)^2 .
\]
$\mathbf{X}_1, \dots, \mathbf{X}_k$, $\bar X_1, \dots, \bar X_k$, $S_{X_1}^2, \dots, S_{X_k}^2$ (which for simplicity we shall denote $S_1^2, \dots, S_k^2$) will have similar meaning. Also for simplicity, we shall write $n_{X_j} = n_j$, $j = 1, \dots, k$, and $N = n_1 + \cdots + n_k$ for the dimension of the combined sample. A pooled variance estimator
\[
S_p^2 = \frac{(n_1 - 1)S_1^2 + \cdots + (n_k - 1)S_k^2}{N - k}
\]
will also be used.
The exact sampling distribution of functions of low empirical moments of Gaussian populations $X_k \frown \text{Gaussian}(\mu_k, \sigma_k)$ is easily derived, and hence we have a wealth of results to estimate, or to test, the location parameter, or to compare location parameters of several Gaussian populations, even for small samples, at least assuming homoscedasticity.
For the one-sample and the homoscedastic independent two-sample cases, Student (1908) devised how to get rid of the nuisance scale parameter $\sigma$ in
\[
\frac{\bar X - \mu}{\sigma\sqrt{\frac{1}{n}}}
\quad\text{and in}\quad
\frac{\bar X_1 - \bar X_2 - (\mu_1 - \mu_2)}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\]
respectively, dividing those standard Gaussian random variables by appropriate functions of estimators of $\sigma$.
For the heteroscedastic case, Smith (1936), Welch (1938), Satterthwaite (1946) and Aspin (1948) established that
\[
\frac{\bar X_1 - \bar X_2 - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}
\]
has an approximate $t_\nu$ distribution, where the fractional number of degrees of freedom $\nu$ can be estimated by
\[
\tilde\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{s_1^4}{n_1^2 (n_1 - 1)} + \frac{s_2^4}{n_2^2 (n_2 - 1)}} .
\]
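As an illustration only, the estimate $\tilde\nu$ can be sketched in a few lines of Python. The function name `welch_satterthwaite_df` is ours (it does not come from any of the cited papers), and the sample variances are assumed to be the usual unbiased ones with divisor $n_j - 1$.

```python
from statistics import variance


def welch_satterthwaite_df(x1, x2):
    """Estimated fractional degrees of freedom (nu tilde) for two samples.

    Hypothetical helper: implements the formula for nu tilde given above,
    using the unbiased sample variances s_1^2 and s_2^2.
    """
    n1, n2 = len(x1), len(x2)
    v1 = variance(x1) / n1  # s_1^2 / n_1
    v2 = variance(x2) / n2  # s_2^2 / n_2
    # (v1 + v2)^2 / ( s_1^4 / (n_1^2 (n_1 - 1)) + s_2^4 / (n_2^2 (n_2 - 1)) )
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
```

With equal sample sizes and equal sample variances the estimate reduces to $n_1 + n_2 - 2$, the degrees of freedom of the homoscedastic two-sample case.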
Aspin (1949) tabulated high quantiles of this ‘fractional’ $t$ distribution. Alternatively, inference on $\mu_1 - \mu_2$ can use random pairing of $X_{1,k}$ from $\mathbf{X}_1$ with $Y_{2,\nu_k}$ chosen using simple random sampling without replacement from $\mathbf{X}_2$, cf. Scheffé (1943, 1944), so that the one-sample $t$-test can be applied to the ‘random’ differences $D_k = X_{1,k} - Y_{2,\nu_k}$, $k = 1, \dots, \min(n_1, n_2)$.
Following David and Nagaraja (2003) we shall speak of studentization, in general populations, whenever we divide a random variable $T = T(\Theta(\mathbf{X}), \theta, \xi)$, which is a function of an estimator $\Theta(\mathbf{X})$ of the parameter of interest $\theta$, depending both on this parameter and on a nuisance parameter $\xi$, by some appropriate function $h(\Xi(\mathbf{X}))$ of an estimator $\tilde\xi = \Xi(\mathbf{X})$ of the nuisance parameter, so that the ratio no longer depends on the nuisance parameter $\xi$. This ratio can therefore be used as a pivotal variable to construct interval estimators of $\theta$; it is also an adequate test statistic under the null hypothesis $H_0: \theta = \theta_0$. We shall call it external studentization when $T$ and $h(\Xi)$ are independent, and internal studentization otherwise.
The homoscedastic independent $k$-sample problem, in Gaussian populations, has been solved by Fisher (1925). Instead of getting rid of the nuisance parameter, Fisher’s ANalysis Of VAriance (in the sense of decomposing an estimator of the overall dispersion into components) uses the influence of the location parameter estimates on the scale parameter estimator.
More precisely, as the variance $\sigma^2$ is the smallest second order moment, i.e., $\sigma^2 < E[(X - A)^2]$ for all $A \neq E[X]$ (and, similarly, $s_x^2 = \frac{1}{n-1}\sum_{k=1}^{n} (x_k - \bar x)^2 < \frac{1}{n-1}\sum_{k=1}^{n} (x_k - a)^2$ for all $a \neq \bar x = \frac{1}{n}\sum_{k=1}^{n} x_k$), splitting the Total Sum of Squares $TSS = \sum_{i=1}^{k}\sum_{j=1}^{n_i} (X_{ij} - \bar X)^2$, where $\bar X$ stands for the overall mean $\bar X = \frac{1}{N}\sum_{i=1}^{k}\sum_{j=1}^{n_i} X_{ij}$ of the combined sample $\underset{\sim}{\mathbf{X}} = (\mathbf{X}_1, \dots, \mathbf{X}_k)$, into the Between Sum of Squares $BSS = \sum_{i=1}^{k} n_i (\bar X_i - \bar X)^2$ and the Within Sum of Squares $WSS = \sum_{i=1}^{k}\sum_{j=1}^{n_i} (X_{ij} - \bar X_i)^2$, we obtain two independent estimators of the variance, namely the mean squares $\frac{BSS}{k-1}$ and $S_p^2 = \frac{WSS}{N-k}$. Moreover, while $S_p^2$ is an unbiased estimator of $\sigma^2$, $\frac{BSS}{k-1}$ is unbiased only under the null hypothesis $\mu_1 = \mu_2 = \cdots = \mu_k = \mu$; when the alternative is true, $\frac{BSS}{k-1}$ is biased and overestimates $\sigma^2$, so that the $F$-ratio
\[
\left.\frac{\frac{BSS}{k-1}}{\frac{WSS}{N-k}}\,\right|_{H_0} \frown F_{k-1,\,N-k}
\]
should detect gross departures from the null hypothesis.
Once again, the pathbreaking Welch–Satterthwaite techniques are useful in constructing approximate solutions without assuming homoscedasticity (Satterthwaite, 1946; Welch, 1951; Oehlert, 2000).
Inference on location and scale in Gaussian populations is thus simple for two main reasons:
1. If $\mathbf{X}$ is a random sample from $X \frown \text{Gaussian}(\mu, \sigma)$, $\bar X$ and $S^2$ are independent random variables, and thus in the two-sample setting, $\bar X_1 - \bar X_2$ and $S_p^2$ are independent random variables. Thus deducing the exact distribution of the studentized variables
\[
\frac{\bar X - \mu}{S\sqrt{\frac{1}{n}}}
\quad\text{and}\quad
\frac{\bar X_1 - \bar X_2 - (\mu_1 - \mu_2)}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}
\]
(Student, 1908) has been a simple task.
Unfortunately, independence of $\bar X$ and $S^2$ is a characterization of the Gaussian populations, and studentization in other populations, e.g. uniform populations, is a hard task, even for samples of size 3 (Perlo, 1933).
2. Assume that we want to estimate the location parameter $\theta$ of an absolutely continuous population $X$ using the maximum likelihood estimator $\hat\theta$. Under what conditions is $\hat\theta = \bar X$?
Given a sample $\mathbf{x} = (x_1, \dots, x_n)$, $\hat\theta$ must satisfy $\sum_{k=1}^{n} g'(\hat\theta - x_k) = 0$, where $g(\theta - x_k) = \ln[f(x_k \mid \theta)]$.
The special case $x_1 = \cdots = x_{n-1} = 0$ and $x_n = nu$, for which $\bar x = u$, implies that $(n-1)\,g'(u) + g'((1-n)u) = 0$, $n = 2, 3, \dots$; the case $n = 2$ shows at once that $g'$ must be an odd function. Oddness then gives $g'((n-1)u) = (n-1)\,g'(u)$, that is, $g'(nu) = n\,g'(u)$, $u \in \mathbb{R}$, $n = 2, 3, \dots$, and hence $g'$ is a linear function. From this, $g'(u) = Cu \Longrightarrow g(u) = \frac{1}{2} C u^2 + b$, subject to the condition $\int_{\mathbb{R}} f(x \mid \theta)\, dx = 1$.
Then, writing $C = -\alpha$,
\[
f(x \mid \theta) = \sqrt{\frac{\alpha}{2\pi}}\; e^{-\frac{1}{2}\alpha (x - \theta)^2}\, I_{\mathbb{R}}(x), \quad \alpha > 0 .
\]
This observation (Gauss, 1809) shows that the Gaussian family is unique, in the realm of absolutely continuous distributions, in having $\bar X$ as the maximum likelihood estimator of its location parameter $\theta = E[X]$.
In the sequel we shall investigate how a characterization of the exponential populations (namely that $S_1 := X_{1:n}$ and the spacings $S_k := X_{k:n} - X_{k-1:n}$, $k = 2, \dots, n$, are independent) can be used to establish simple results concerning inference on the location parameter $\lambda$ of $Y = \delta X + \lambda \frown \text{Exponential}(\lambda, \delta)$, where $X \frown \text{Exponential}(1)$ is the standard exponential, with probability density function $f_X(x) = e^{-x}\, I_{(0,\infty)}(x)$.
2. Inference on the location of exponential random variables
2.1. The one-sample case
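Let $\mathbf{Y} = (Y_1, \dots, Y_n)$ be a random sample of size $n$ from $Y \frown \text{Exponential}(\lambda, \delta)$, $\lambda \in \mathbb{R}$, $\delta > 0$, i.e., with distribution function
\[
F_Y(y) = \left[1 - \exp\left(-\frac{y - \lambda}{\delta}\right)\right] I_{[\lambda,\infty)}(y) .
\]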
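The maximum likelihood estimators of the parameters are $\hat\lambda = Y_{1:n}$ and $\hat\delta = \bar Y - Y_{1:n}$. From those we can easily construct unbiased estimators of $\lambda$ and of $\delta$, $\tilde\lambda = Y_{1:n} - \frac{\tilde\delta}{n}$ and $\tilde\delta = \frac{1}{n-1}\sum_{k=1}^{n} (Y_k - Y_{1:n})$. Moreover,
\[
\hat\delta = \bar Y - Y_{1:n} = \frac{1}{n}\sum_{k=2}^{n} (Y_{k:n} - Y_{1:n}) = \frac{1}{n}\sum_{k=2}^{n} (n + 1 - k)\, S_k ,
\]
where the spacings $S_k = Y_{k:n} - Y_{k-1:n} \frown \text{Exponential}\left(0, \frac{\delta}{n+1-k}\right)$, $k = 2, \dots, n$, are independent and independent of $S_1 = Y_{1:n} - \lambda \frown \text{Exponential}\left(0, \frac{\delta}{n}\right)$, cf. David and Nagaraja (2003).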
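The spacings representation of $\hat\delta$ can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper name `normalized_spacings`, the seed and the parameter values are arbitrary choices of ours, and the sample is simulated with Python's standard `random.expovariate`, whose argument is the rate $1/\delta$.

```python
import random


def normalized_spacings(sample, lam):
    """Return n*S_1, (n-1)*S_2, ..., 1*S_n for the ordered sample.

    S_1 = Y_{1:n} - lam and S_k = Y_{k:n} - Y_{k-1:n} for k >= 2, so each
    returned value is (n + 1 - k) * S_k (hypothetical helper for illustration).
    """
    y = sorted(sample)
    n = len(y)
    previous = lam
    out = []
    for k, yk in enumerate(y, start=1):
        out.append((n + 1 - k) * (yk - previous))
        previous = yk
    return out


random.seed(0)                      # arbitrary seed, for reproducibility only
lam, delta, n = 2.0, 3.0, 8         # illustrative parameter values
sample = [lam + random.expovariate(1.0 / delta) for _ in range(n)]
z = normalized_spacings(sample, lam)

# Identity used in the text: sum_{k>=2} (n+1-k) S_k = sum_k (Y_k - Y_{1:n}) = n * delta_hat
lhs = sum(z[1:])
rhs = sum(sample) - n * min(sample)
```

Under the model, the values in `z` are independent exponential variables with scale $\delta$, which is what makes the sum of the last $n - 1$ of them a Gamma$(n-1, \delta)$ variable.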
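Thus the studentized random variable
\[
\frac{Y_{1:n} - \lambda}{\hat\delta} = \frac{\dfrac{Y_{1:n} - \lambda}{\delta}}{\dfrac{\hat\delta}{\delta}}
\]
is the quotient of the independent random variables
\[
\frac{Y_{1:n} - \lambda}{\delta} \frown \text{Exponential}\left(0, \frac{1}{n}\right)
\quad\text{and}\quad
\frac{\hat\delta}{\delta} \frown \text{Gamma}\left(n - 1, 0, \frac{1}{n}\right).
\]
Therefore, the probability density function of $\frac{Y_{1:n} - \lambda}{\hat\delta}$ is
\[
f_{\frac{Y_{1:n} - \lambda}{\hat\delta}}(x) = \frac{n - 1}{(1 + x)^{n}}\, I_{(0,\infty)}(x),
\]
i.e.,
\[
\frac{Y_{1:n} - \lambda}{\hat\delta} \frown \text{Pareto}(0, n - 1),
\]
which can be used either to construct confidence intervals for the location parameter $\lambda$, or to test some specific hypothesis about where its value lies. Observe that the quantiles of this distribution are $x_{n,1-\alpha} = \alpha^{-\frac{1}{n-1}} - 1$.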
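To make the use of this pivot concrete, here is a minimal sketch of a one-sided confidence interval for $\lambda$. It assumes the event $0 < (Y_{1:n} - \lambda)/\hat\delta < x_{n,1-\alpha}$ is inverted for $\lambda$; the function name `location_confidence_interval` and the choice of a one-sided interval are ours, not the authors'.

```python
def location_confidence_interval(sample, alpha=0.05):
    """(1 - alpha) confidence interval for the location lambda.

    Sketch based on the Pareto pivot (Y_{1:n} - lambda) / delta_hat, whose
    upper quantile is alpha**(-1/(n-1)) - 1. Hypothetical helper.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are needed")
    y_min = min(sample)                      # lambda_hat = Y_{1:n}
    delta_hat = sum(sample) / n - y_min      # delta_hat = mean - Y_{1:n}
    q = alpha ** (-1.0 / (n - 1)) - 1.0      # quantile x_{n, 1 - alpha}
    # 0 < (y_min - lambda) / delta_hat < q  <=>  y_min - q * delta_hat < lambda < y_min
    return y_min - q * delta_hat, y_min
```

For instance, with the sample (1, 2, 3, 6) and $\alpha = 0.125$ one has $\hat\delta = 2$ and $x_{4,0.875} = 1$, so the interval is $(-1, 1)$.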
2.2. The ‘homoscedastic’ independent two samples case
Although Johnson et al. (1995, p. 193) and Brilhante and Kotz (2008) consider more general asymmetric Laplace random variables, herein we shall consider only the simpler situation of equal sample sizes, leading to the usual Laplace, obtained as $X_1 - X_2$, where $X_1$ and $X_2$ are iid exponential random variables.
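Let $\mathbf{Y}_1$ be a random sample of size $\frac{n}{2}$ ($n > 2$ even) from $Y_1 \frown \text{Exponential}(\lambda_1, \delta)$, and $\mathbf{Y}_2$ a random sample of size $\frac{n}{2}$ from $Y_2 \frown \text{Exponential}(\lambda_2, \delta)$, so that we are assuming a ‘homoscedastic’ situation $\delta_1 = \delta_2 = \delta$, with $\mathbf{Y}_1$ and $\mathbf{Y}_2$ independent. We shall denote by $Y^{(1)}_{k:\frac{n}{2}}$ the $k$-th order statistic of $\mathbf{Y}_1$, and by $Y^{(2)}_{j:\frac{n}{2}}$ the $j$-th order statistic of $\mathbf{Y}_2$.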
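Using arguments analogous to those stated in the one-sample case, it is easy to establish that the studentized random variable
\[
W = \frac{Y^{(1)}_{1:\frac{n}{2}} - Y^{(2)}_{1:\frac{n}{2}} - (\lambda_1 - \lambda_2)}{\hat\delta}
= \frac{\dfrac{Y^{(1)}_{1:\frac{n}{2}} - \lambda_1 - \left(Y^{(2)}_{1:\frac{n}{2}} - \lambda_2\right)}{\delta}}{\dfrac{\hat\delta}{\delta}}
\]
is a quotient of independent random variables, namely
\[
\frac{Y^{(1)}_{1:\frac{n}{2}} - Y^{(2)}_{1:\frac{n}{2}} - (\lambda_1 - \lambda_2)}{\delta} \frown \text{Laplace}\left(\frac{2}{n}\right)
\]
and
\[
\frac{\hat\delta}{\delta} = \frac{\left(\bar Y_1 - Y^{(1)}_{1:\frac{n}{2}}\right) + \left(\bar Y_2 - Y^{(2)}_{1:\frac{n}{2}}\right)}{2\,\delta} \frown \text{Gamma}\left(n - 2, \frac{1}{n}\right).
\]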
In the general case, with possibly unequal sample sizes $n_1$ and $n_2$, the adequate pooled estimator of the dispersion parameter $\delta$ is
\[
\hat\delta = \frac{n_1\left(\bar Y_1 - Y^{(1)}_{1:n_1}\right) + n_2\left(\bar Y_2 - Y^{(2)}_{1:n_2}\right)}{n_1 + n_2} .
\]
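For equal sample sizes, the independence of the numerator and the denominator of $W$ readily yields its probability density function,
\[
f_W(x) = \frac{\frac{n}{2} - 1}{2\left(1 + \frac{|x|}{2}\right)^{n-1}}\, I_{\mathbb{R}}(x),
\]
and the upper quantiles
\[
w_{n,1-\alpha} = 2\left[(2\alpha)^{\frac{1}{2-n}} - 1\right], \quad \alpha < \frac{1}{2}
\]
(where $n$ is the combined sample size).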
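The two-sample statistic and its upper quantile can be sketched as follows, for the balanced case only. The function names `two_sample_statistic` and `laplace_ratio_quantile` are hypothetical, and the argument `lam_diff` stands for the hypothesized value of $\lambda_1 - \lambda_2$.

```python
def laplace_ratio_quantile(n, alpha):
    """Upper quantile w_{n,1-alpha} = 2[(2 alpha)^(1/(2-n)) - 1], alpha < 1/2.

    Here n is the combined sample size (n > 2, even). Hypothetical helper.
    """
    if n <= 2 or not 0.0 < alpha < 0.5:
        raise ValueError("need n > 2 and 0 < alpha < 1/2")
    return 2.0 * ((2.0 * alpha) ** (1.0 / (2 - n)) - 1.0)


def two_sample_statistic(y1, y2, lam_diff=0.0):
    """Studentized difference of minima W for two samples of equal size."""
    if len(y1) != len(y2):
        raise ValueError("this sketch covers equal sample sizes only")
    m = len(y1)                                   # m = n / 2
    d1 = sum(y1) / m - min(y1)                    # mean minus minimum, sample 1
    d2 = sum(y2) / m - min(y2)                    # mean minus minimum, sample 2
    delta_hat = (d1 + d2) / 2.0                   # pooled scale estimator
    return (min(y1) - min(y2) - lam_diff) / delta_hat
```

Since the density of $W$ is symmetric about zero, a two-sided test at level $2\alpha$ would reject when $|W|$ exceeds `laplace_ratio_quantile(n, alpha)`; this reading of the quantile is our assumption about its intended use.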
2.3. The ‘homoscedastic’ independent k samples case
Let
\[
\begin{array}{l}
\mathbf{Y}_1 = (Y_{11}, \dots, Y_{1n}) \\
\quad\vdots \\
\mathbf{Y}_k = (Y_{k1}, \dots, Y_{kn})
\end{array}
\]
be independent random samples from $k$ populations, with $Y_{ij} \frown \text{Exponential}(\lambda_i, \delta)$, $i = 1, \dots, k$; $j = 1, \dots, n$. Once again, herein we shall only deal with the simplified problem of a balanced design, with all sample sizes equal.
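For simplicity, we denote by $\bar\lambda$ the average $\frac{1}{k}\sum_{i=1}^{k} \lambda_i$, and we assume that the $\mathbf{Y}_i$, $i = 1, \dots, k$, have been ordered so that $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_k$; hence $Y_1 \preceq Y_2 \preceq \cdots \preceq Y_k$, where $W \preceq Z$ means that $W$ is stochastically not greater than $Z$.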
We shall use the notations $Y_{1:N} = \min_{i,j} Y_{ij}$, $i = 1, \dots, k$, $j = 1, \dots, n$, $Y^{(i)}_{1:n} = \min_{j} Y_{ij}$, $i = 1, \dots, k$, and more generally $Y^{(i)}_{j:n}$ for the $j$-th order statistic from the $i$-th random sample.
We now split the Total Sum of Spacings $TSSp$ into a Between Sum of Spacings $BSSp$ and a Within Sum of Spacings $WSSp$:
\[
TSSp = \sum_{i=1}^{k}\sum_{j=1}^{n} (Y_{ij} - Y_{1:N}) = BSSp + WSSp
= \sum_{i=1}^{k} n\left(Y^{(i)}_{1:n} - Y_{1:N}\right) + \sum_{i=1}^{k}\sum_{j=1}^{n} \left(Y_{ij} - Y^{(i)}_{1:n}\right).
\]
The simple observation that
\[
WSSp = \sum_{i=1}^{k}\sum_{j=2}^{n} \left(Y^{(i)}_{j:n} - Y^{(i)}_{1:n}\right)
= \underbrace{\sum_{i=1}^{k}\ \underbrace{\sum_{j=2}^{n} (n + 1 - j)\left[Y^{(i)}_{j:n} - Y^{(i)}_{j-1:n}\right]}_{\text{Gamma}(n-1,\ \delta)}}_{\text{Gamma}(N-k,\ \delta)}
\]
shows that $\frac{WSSp}{N-k}$ is an unbiased estimator of $\delta$, either under $H_0$ or under the alternative $H_A$.
Under $H_0: \lambda_1 = \lambda_2 = \cdots = \lambda_k$, $BSSp \frown \text{Gamma}(k - 1, \delta)$, and therefore $\frac{BSSp}{k-1}$ is an unbiased estimator of $\delta$. However, under $H_A$, $\frac{BSSp}{k-1}$ is a biased estimator of $\delta$.
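As a consequence, an ANOSp (ANalysis Of Spacings) table similar to the one-way ANOVA table
\[
\begin{array}{llll}
\hline
\text{Sum of Spacings} & \text{d.f.} & \delta\ \text{estimator} & F\text{-ratio} \\
\hline
BSSp = \displaystyle\sum_{i=1}^{k} n\left[Y^{(i)}_{1:n} - Y_{1:N}\right] & k - 1 & \dfrac{BSSp}{k-1} & \left.\dfrac{\frac{BSSp}{k-1}}{\frac{WSSp}{N-k}}\,\right|_{H_0} \frown F_{(2(k-1),\,2(N-k))} \\[2ex]
WSSp = \displaystyle\sum_{i=1}^{k}\sum_{j=1}^{n} \left[Y_{ij} - Y^{(i)}_{1:n}\right] & N - k & \dfrac{WSSp}{N-k} & \\
\hline
\end{array}
\]
should be able to detect gross departures from the null hypothesis.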
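The entries of the ANOSp table are straightforward to compute. The sketch below is illustrative only: the function name `anosp` and the dictionary layout of its result are our own, it covers the balanced design exclusively, and it returns the $F$-ratio together with its degrees of freedom rather than a p-value.

```python
def anosp(samples):
    """Analysis of spacings for k samples of common size n (balanced design).

    Returns the sums of spacings, the two estimators of delta, the F-ratio
    and its degrees of freedom (2(k-1), 2(N-k)). Hypothetical helper.
    """
    k = len(samples)
    n = len(samples[0])
    if k < 2 or any(len(s) != n for s in samples):
        raise ValueError("need at least two samples of equal size")
    N = k * n
    minima = [min(s) for s in samples]            # Y^{(i)}_{1:n}
    overall_min = min(minima)                     # Y_{1:N}
    bssp = sum(n * (m - overall_min) for m in minima)
    wssp = sum(y - m for s, m in zip(samples, minima) for y in s)
    tssp = sum(y - overall_min for s in samples for y in s)
    between = bssp / (k - 1)
    within = wssp / (N - k)
    return {
        "TSSp": tssp,
        "BSSp": bssp,
        "WSSp": wssp,
        "between": between,
        "within": within,
        "F": between / within,
        "df": (2 * (k - 1), 2 * (N - k)),
    }
```

For the three samples (1, 2, 3), (2, 4, 6) and (5, 5, 8) this gives $BSSp = 15$, $WSSp = 12$, $TSSp = 27$ and an $F$-ratio of 3.75 on (4, 12) degrees of freedom.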
3. Alternative results on external and internal studentization in exponential populations
The broad concept of studentization as defined in David and Nagaraja (2003) inspires alternative approaches to inference on the location parameter of an exponential population $Y \frown \text{Exponential}(\lambda, \delta)$. Brilhante et al. (2001) summarize previous work by the authors, namely, in the one-sample case, the study of the studentized minimum using the range as the estimator of a simple function of the nuisance scale parameter,
\[
\frac{Y_{1:n} - \lambda}{Y_{n:n} - Y_{1:n}} \stackrel{d}{=} \frac{X_{1:n}}{X_{n:n} - X_{1:n}},
\]
where $X = \frac{Y - \lambda}{\delta}$ is the standard exponential.
As $Y_{1:n}$ and $Y_{n:n} - Y_{1:n}$ are independent, it is once again external studentization. Due to the memoryless property of the exponential, $X_{n:n} - X_{1:n} \stackrel{d}{=} X_{n-1:n-1}$, with probability density function $f_{X_{n:n} - X_{1:n}}(x) = (n - 1)\, e^{-x} \left(1 - e^{-x}\right)^{n-2} I_{(0,\infty)}(x)$.
Using standard methods, the probability density function of $W = \frac{Y_{1:n} - \lambda}{Y_{n:n} - Y_{1:n}}$ is
\[
f_W(x) = -n\left[B(n, nx) + nx\, \frac{\partial B(n, nx)}{\partial (nx)}\right] I_{(0,\infty)}(x)
= n\, B(n, nx) \sum_{k=1}^{n-1} \frac{nx}{(n - k) + nx}\; I_{(0,\infty)}(x),
\]
where $B(p, q) = \int_0^1 u^{p-1} (1 - u)^{q-1}\, du$, $p, q > 0$, is Euler’s integral of the first kind.
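The closed form with the finite sum is easy to evaluate numerically. A minimal sketch, assuming the Beta function is computed through log-gamma values from Python's standard `math` module; the names `beta_fn` and `range_studentized_min_pdf` are hypothetical.

```python
from math import exp, lgamma


def beta_fn(p, q):
    """Euler's integral of the first kind, B(p, q), via log-gamma."""
    return exp(lgamma(p) + lgamma(q) - lgamma(p + q))


def range_studentized_min_pdf(x, n):
    """Density of (Y_{1:n} - lambda) / (Y_{n:n} - Y_{1:n}) at x, for n >= 2.

    Implements n * B(n, n x) * sum_{k=1}^{n-1} n x / ((n - k) + n x).
    """
    if n < 2:
        raise ValueError("need n >= 2")
    if x <= 0:
        return 0.0
    q = n * x
    return n * beta_fn(n, q) * sum(q / ((n - k) + q) for k in range(1, n))
```

As a sanity check, for $n = 2$ the formula reduces to $2/(1 + 2x)^2$, the density of the ratio of an exponential with scale $1/2$ to an independent standard exponential.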
Explicit formulas for the probability density function for $n \le 30$, and a table of high quantiles, can be requested from the authors. For larger values, observe that $n \ln(n)\, \frac{Y_{1:n} - \lambda}{Y_{n:n} - Y_{1:n}}$ converges in distribution to a standard exponential random variable.
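Internal studentization may be an interesting alternative; for instance,
\[
\tau^*_{(n-1;\,i,k)} = \frac{\bar Y_n - \lambda}{Y_{k:n} - Y_{i:n}} \quad (1 \le i < k \le n)
\]
for appropriate choices of $i$ and $k$ has a smaller breaking point than any of the studentized variables considered so far. At first sight the problem looks analytically intractable, since $\bar Y_n - \lambda$ and $Y_{k:n} - Y_{i:n}$ have a very intricate dependence structure. But rewriting the above expression as
\[
Y_{k:n} - Y_{i:n} = \frac{\bar Y_n - \lambda}{\tau^*_{(n-1;\,i,k)}},
\]
in view of Basu’s theorem (which makes the scale-free ratio $\tau^*_{(n-1;\,i,k)}$ independent of $\bar Y_n - \lambda$), and with the standard notations for the Laplace transform $\mathcal{L}$ and the inverse Laplace transform $\mathcal{L}^{-1}$ at given points, we get that
\[
\mathcal{L}\left[y^{n} f_{\tau^*_{(n-1;\,i,k)}}(y);\ nx\right]
= \frac{\Gamma(n)\,\Gamma(n + 1 - i)}{n^{n}\,\Gamma(k - i)\,\Gamma(n + 1 - k)}\;
\frac{e^{-x(n+1-k)} \left(1 - e^{-x}\right)^{k-i-1}}{x^{n-1}}
\]
or
\[
\mathcal{L}\left[y^{n} f_{\tau^*_{(n-1;\,i,k)}}(y);\ x\right]
= \frac{\Gamma(n)\,\Gamma(n + 1 - i)}{n\,\Gamma(k - i)\,\Gamma(n + 1 - k)}\;
\frac{e^{-\frac{x}{n}(n+1-k)} \left(1 - e^{-\frac{x}{n}}\right)^{k-i-1}}{x^{n-1}}
\]
and hence
\[
f_{\tau^*_{(n-1;\,i,k)}}(y)
= \frac{\Gamma(n)\,\Gamma(n + 1 - i)}{n\,\Gamma(k - i)\,\Gamma(n + 1 - k)}\;
\frac{1}{y^{n}}\;
\mathcal{L}^{-1}\left[\frac{e^{-\frac{x}{n}(n+1-k)} \left(1 - e^{-\frac{x}{n}}\right)^{k-i-1}}{x^{n-1}};\ y\right].
\]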
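Choosing $i$ and $k$ so that
\[
\begin{array}{ccc}
n & i & k \\
\hline
3j - 1 & j - 1 & 2j \\
3j & j & 2j + 1 \\
3j + 1 & j + 1 & 2j + 2
\end{array}
\]
(in all three cases $n + 1 - k = j$ and $k - i - 1 = j$) we have the specially simple expression
\[
f_{\tau^*_{(n-1)}}(y)
= \frac{\Gamma(n)\,\Gamma(2j + 1)}{n\,\Gamma(j)\,\Gamma(j + 1)}\;
\frac{1}{y^{n}}\;
\mathcal{L}^{-1}\left[\frac{1}{x^{k-2}} \left(\frac{e^{-\frac{x}{n}} \left(1 - e^{-\frac{x}{n}}\right)}{x}\right)^{j};\ y\right].
\]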
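Since
\[
\mathcal{L}^{-1}\left[\frac{e^{-\frac{x}{n}} \left(1 - e^{-\frac{x}{n}}\right)}{x};\ y\right] = I_{\left(\frac{1}{n},\,\frac{2}{n}\right)}(y)
\]
and
\[
\mathcal{L}^{-1}\left(\frac{1}{s^{n}};\ y\right) = \frac{y^{n-1}}{\Gamma(n)},
\]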
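remembering how convolution of densities and products of Laplace transforms are related we get explicit solutions such as, for $n = 3$, $i = 1$, $k = 3$,
\[
f_{\tau^*_{(2)}}(x) =
\begin{cases}
0 & x < \frac{1}{3} \\[1ex]
\dfrac{4\left(x - \frac{1}{3}\right)}{3\,x^{3}} & \frac{1}{3} \le x < \frac{2}{3} \\[2ex]
\dfrac{4}{9\,x^{3}} & x \ge \frac{2}{3}
\end{cases}
\]
or, for $n = 4$, $i = 2$, $k = 4$,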
f
τ∗(3)