ON GEOMETRY OF THE SET OF ADMISSIBLE QUADRATIC ESTIMATORS OF QUADRATIC
FUNCTIONS OF NORMAL PARAMETERS
KONRAD NEUMANN and STEFAN ZONTEK
Faculty of Mathematics, Computer Science and Econometrics
University of Zielona Góra
prof. Z. Szafrana 4a, 65–516 Zielona Góra, Poland
e-mail: k.neumann@wmie.uz.zgora.pl
e-mail: s.zontek@wmie.uz.zgora.pl
Abstract
We consider the problem of admissible quadratic estimation of a linear function of $\mu^2$ and $\sigma^2$ in the $n$-dimensional normal model $N(K\mu, \sigma^2 I_n)$ under a quadratic risk function. After reducing this problem to admissible estimation of a linear function of two quadratic forms, the set of admissible estimators is characterized by giving formulae for the boundary of the set $D \subset R^2$ of coefficients of the two quadratic forms constituting the admissible estimators.
Different shapes and topological properties of the set D are studied.
Keywords: linear estimator, quadratic estimator, Bayesian quadratic estimator, quadratic loss function, admissibility, quadratic subspace.
2000 Mathematics Subject Classification: 62F10, 62J10.
1. Introduction and notation
Let $Y \in R^n$ be a normal random vector such that
$$E(Y) = K\mu \quad \text{and} \quad \operatorname{cov}(Y) = \sigma^2 I_n,$$
where $\mu \in R$ and $\sigma^2 > 0$ are unknown parameters, $K \in R^n$ is a known vector and $I_n$ is the $n \times n$ identity matrix. Such a model can be considered for measurements of the intensity $\mu$ of a source of signals by a number of sensors. For a single observation we assume the following structure of the intensity of the source measured by a sensor: $y = k\mu + e$, where $k$ is known, while $e$ denotes an unobservable random error (see Gnot et al., 2001 and references therein).
In the paper we consider the problem of characterizing admissible quadratic estimators of the function
$$f'\theta = f_1 \mu^2 + f_2 \sigma^2,$$
where $f_1$, $f_2$ are given scalars, under the quadratic loss function.
The natural class of estimators of $f'\theta$ is the class of quadratic forms in $Y$, that is $\{Y'AY : A \in S_n\}$, where $S_n$ denotes the space of $n \times n$ symmetric matrices.
It is known that $K'Y$ and $Y'\left(I_n - \frac{1}{K'K} KK'\right)Y$ are sufficient statistics for $\mu$ and $\sigma^2$. So each quadratic estimator of $f'\theta$ is a linear combination of the following quadratic forms (see also Zmyślony, 1976)
$$Z_1 = \frac{1}{K'K}\, Y'B_1Y \quad \text{and} \quad Z_2 = \frac{1}{n-1}\, Y'B_2Y,$$
where $B_1 = \frac{1}{K'K} KK'$ and $B_2 = I_n - B_1$.
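The two quadratic forms can be evaluated without building the matrices $B_1$ and $B_2$, because $Y'B_1Y = (K'Y)^2/K'K$ and $Y'B_2Y = Y'Y - Y'B_1Y$. The following is a minimal sketch under that observation; the function name `quadratic_forms` is hypothetical and exact rational arithmetic is used only for illustration.

```python
from fractions import Fraction

def quadratic_forms(y, k):
    """Return (Z1, Z2) for an observation vector y and a known vector k."""
    y = [Fraction(v) for v in y]
    k = [Fraction(v) for v in k]
    n = len(y)
    kk = sum(ki * ki for ki in k)               # K'K
    ky = sum(ki * yi for ki, yi in zip(k, y))   # K'Y
    yy = sum(yi * yi for yi in y)               # Y'Y
    y_b1_y = ky * ky / kk                       # Y'B1Y = (K'Y)^2 / K'K
    z1 = y_b1_y / kk                            # Z1 = Y'B1Y / K'K
    z2 = (yy - y_b1_y) / (n - 1)                # Z2 = Y'B2Y / (n - 1)
    return z1, z2

z1, z2 = quadratic_forms([1, 2, 3], [1, 1, 1])
```

For $K$ equal to the vector of ones, $Z_1$ is the squared sample mean and $Z_2$ is the usual sample variance.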
For $K$ being the vector of ones (the so-called one-way classification random model) Rukhin (1990) has shown that an estimator $g_1Z_1 + g_2Z_2$ is admissible for $f'\theta$ (see Remark 1 in Section 3 of this paper) iff
$$\left[ n f_2 - \frac{n(n+1)}{n-1}\, g_2 - g_1 \right] \left[ 5 g_1 - \frac{2n}{n-1}\, g_2 - f_1 \right] \le (g_1 - f_1)(n f_2 - n g_2 - 3 g_1) \tag{1}$$
and
$$\operatorname{sgn}(g_1 - f_1) = \operatorname{sgn}\left( n f_2 - \frac{n(n+1)}{n-1}\, g_2 - g_1 \right). \tag{2}$$
For a slightly more general case (arbitrary vector $K$) we will give formulae for the boundary of the set
$$D_f = \{(g_1, g_2)' : g_1Z_1 + g_2Z_2 \text{ is admissible for } f'\theta\}.$$
The description of the set $D_f$ makes it possible to decide whether or not a given estimator is admissible. Different shapes of the set $D_f$ are presented in figures. We also give examples comparing the risk of an admissible estimator with the risks of the best unbiased estimator and of the maximum likelihood estimator.
An analogous approach to the characterization of admissible quadratic estimators in the balanced two variance components model was initiated by LaMotte (1985). It was analyzed in more detail by Neumann and Zontek (2004). The sets $D_f$ considered in this paper possess different properties than the corresponding sets resulting from the balanced two variance components model.
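2. Characterization of the admissible estimators for $E(Z)$
First we characterize the class of admissible estimators of $E(Z)$, where $Z = (Z_1, Z_2)'$. It is easy to calculate that
$$E_\theta(Z) = \begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \mu^2 \\ \sigma^2 \end{bmatrix}$$
and that
$$\operatorname{cov}_\theta(Z) = 2 \begin{bmatrix} 2a\mu^2\sigma^2 + a^2\sigma^4 & 0 \\ 0 & b\sigma^4 \end{bmatrix},$$
where $a = \frac{1}{K'K}$ and $b = \frac{1}{n-1}$.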
The total mean squared error of a linear estimator $L'Z$ of $E(Z)$, where $L$ is a $2 \times 2$ matrix, can be written as
$$TMSE_L(\mu^2, \sigma^2) = \operatorname{tr}[L' \operatorname{cov}(Z) L + (L - I_2)' E(Z) E(Z') (L - I_2)],$$
where for a square matrix $A$ the symbol $\operatorname{tr}(A)$ denotes the trace of $A$.
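The trace formula is the sum, over the two components of $L'Z$, of variance plus squared bias. A minimal sketch of that computation follows; the function name `tmse` and the nested-list representation of $L$ are assumptions made for illustration, and the moments of $Z$ are those stated above.

```python
from fractions import Fraction

def tmse(L, mu2, sigma2, a, b):
    """Total MSE of L'Z as an estimator of E(Z); L is a 2x2 nested list."""
    ez = [mu2 + a * sigma2, sigma2]                         # E(Z)
    var = [2 * (2 * a * mu2 * sigma2 + a**2 * sigma2**2),   # var(Z1)
           2 * b * sigma2**2]                               # var(Z2); cov(Z1, Z2) = 0
    total = 0
    for j in range(2):                                      # j-th component of L'Z
        variance = sum(L[i][j] ** 2 * var[i] for i in range(2))
        bias = sum(L[i][j] * ez[i] for i in range(2)) - ez[j]
        total += variance + bias ** 2
    return total

a, b = Fraction(1, 4), Fraction(1, 10)
identity_risk = tmse([[1, 0], [0, 1]], 2, 3, a, b)   # pure variance, no bias
zero_risk = tmse([[0, 0], [0, 0]], 2, 3, a, b)       # pure squared bias
```

For $L = I_2$ the result is the trace of $\operatorname{cov}(Z)$, and for $L = 0$ it is $E(Z)'E(Z)$.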
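In order to construct an admissible estimator of $E(Z)$ we will use the Bayesian approach. Let $\tau$ be a prior distribution on the parameter space such that $\Delta_\tau = \{\delta_{ij}\} = E_\tau[(\mu^2, \sigma^2)'(\mu^2, \sigma^2)]$ exists. Then the Bayesian risk of $L'Z$ has the form
$$r_\tau(L'Z) = \operatorname{tr}[L' V_\tau L + (L - I_2)' \Phi_\tau (L - I_2)],$$
where
$$V_\tau = 2 \begin{bmatrix} a(2\delta_{12} + a\delta_{22}) & 0 \\ 0 & b\,\delta_{22} \end{bmatrix}$$
and
$$\Phi_\tau = \begin{bmatrix} \delta_{11} + a(2\delta_{12} + a\delta_{22}) & \delta_{12} + a\delta_{22} \\ \delta_{12} + a\delta_{22} & \delta_{22} \end{bmatrix}.$$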
An estimator $L'Z$ of $E(Z)$ is said to be a Bayesian estimator with respect to $\tau$ if $r_\tau(L'Z) \le r_\tau(M'Z)$ for any $2 \times 2$ matrix $M$. It is known that an estimator $L'Z$ is Bayesian with respect to $\tau$ iff
$$(V_\tau + \Phi_\tau) L = \Phi_\tau$$
(see for example LaMotte, 1982). If $|V_\tau + \Phi_\tau| \ne 0$, then the above equation has a unique solution and
$$[(V_\tau + \Phi_\tau)^{-1} \Phi_\tau]' Z$$
is admissible for $E(Z)$. Note that the class of uniquely determined Bayesian estimators and their limits is equal to the class of all admissible estimators.
The Bayesian estimator with respect to $\tau$ is uniquely determined iff $\delta_{22} > 0$. Under this assumption, without loss of generality, we can assume that $\delta_{22} = 1$. Following Gnot and Kleffe (1983) we present $\Delta_\tau$ in the following way
$$\Delta_\tau = \begin{bmatrix} u^2 + v^2 & u \\ u & 1 \end{bmatrix},$$
where $u, v \ge 0$. Then
$$V_\tau = 2 \begin{bmatrix} a(2u + a) & 0 \\ 0 & b \end{bmatrix}$$
and
$$\Phi_\tau = \begin{bmatrix} u^2 + a(2u + a) + v^2 & u + a \\ u + a & 1 \end{bmatrix}.$$
Simple algebra shows that for fixed $u \ge 0$ and $v \ge 0$ the estimator $L'Z$ is Bayesian iff
$$L = \frac{1}{w} \begin{bmatrix} 2b(a+u)^2 + (2b+1)v^2 & 2b(a+u) \\ 2a(a+u)(a+2u) & 2a(a+2u) + v^2 \end{bmatrix}, \tag{3}$$
where
$$w = 2bu^2 + 2a(3b+1)(a+2u) + (2b+1)v^2.$$
Limits of the estimators given in (3) also belong to the class of admissible estimators. This implies (for arbitrary $u \ge 0$ and $v \to +\infty$) that the estimator of the following form is also admissible:
$$L = \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{2b+1} \end{bmatrix}. \tag{4}$$
All estimators determined by (3) and (4) constitute the class of all admissible estimators of $E(Z)$.
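The "simple algebra" behind (3) can be checked by solving $(V_\tau + \Phi_\tau)L = \Phi_\tau$ directly with a $2 \times 2$ inverse and comparing with the closed form. The sketch below does this in exact arithmetic; the function names are hypothetical, and the argument `v2` stands for $v^2$.

```python
from fractions import Fraction

def bayes_L_direct(a, b, u, v2):
    """Solve (V + Phi) L = Phi for the 2x2 matrix L, with delta_22 = 1."""
    phi = [[u**2 + a * (2 * u + a) + v2, u + a], [u + a, 1]]
    m = [[phi[0][0] + 2 * a * (2 * u + a), phi[0][1]],
         [phi[1][0], phi[1][1] + 2 * b]]                    # V + Phi
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    inv = [[m[1][1] / det, -m[0][1] / det],
           [-m[1][0] / det, m[0][0] / det]]
    return [[sum(inv[i][k] * phi[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def bayes_L_closed(a, b, u, v2):
    """Closed form (3)."""
    w = 2 * b * u**2 + 2 * a * (3 * b + 1) * (a + 2 * u) + (2 * b + 1) * v2
    return [[(2 * b * (a + u) ** 2 + (2 * b + 1) * v2) / w, 2 * b * (a + u) / w],
            [2 * a * (a + u) * (a + 2 * u) / w, (2 * a * (a + 2 * u) + v2) / w]]

a, b = Fraction(1, 4), Fraction(1, 10)
same = bayes_L_direct(a, b, 1, 7) == bayes_L_closed(a, b, 1, 7)
```

Letting `v2` grow in `bayes_L_closed` drives the matrix towards the limit (4).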
3. Admissible estimation of $f'\theta$
Admissible estimators of a function $f'\theta = f_1\mu^2 + f_2\sigma^2$ can be described in the following way. An estimator $g'Z = g_1Z_1 + g_2Z_2$ is admissible for $f'\theta$ iff
$$g \in D_f = \{L H^{-1} f : L'Z \text{ is admissible for } E(Z)\}, \tag{5}$$
where $H = \begin{bmatrix} 1 & 0 \\ a & 1 \end{bmatrix}$. Note that $f'(H')^{-1}E(Z) = f'\theta$.
In order to describe the boundary of $D_f$ we introduce a new parameter $l$ (instead of $v$), defined in the proof of the following lemma.
Lemma 1. An estimator $L'Z$ is admissible for $E(Z)$ iff
$$L = \begin{bmatrix} 1 - \dfrac{a(2b+1)(a+2u)\,l}{a(3b+1)(a+2u) + bu^2} & \dfrac{b(a+u)\,l}{a(3b+1)(a+2u) + bu^2} \\[2ex] \dfrac{a(a+u)(a+2u)\,l}{a(3b+1)(a+2u) + bu^2} & \dfrac{1}{1+2b}\left(1 - \dfrac{b(a+u)^2\,l}{a(3b+1)(a+2u) + bu^2}\right) \end{bmatrix}, \tag{6}$$
where $u \ge 0$, $0 \le l \le 1$.
Proof. Define $l$ as
$$l = \frac{2[bu^2 + a(3b+1)(a+2u)]}{2[bu^2 + a(3b+1)(a+2u)] + (2b+1)v^2}. \tag{7}$$
Note that $l \in (0, 1]$ for any fixed $u \ge 0$, and that $l \to 0$ when $v \to +\infty$.
Simple calculations show that substituting $l$ in formula (6) we get (3). For $l = 0$ we get the matrix of the form (4). This ends the proof.
Introducing the new parametrization we obtain a simpler and more compact formula for the matrices constituting the class of admissible estimators of $E(Z)$. Note that the entries of (6) depend linearly on the new parameter $l$ and that $l$ belongs to a closed interval. This remark plays an important role in the proof of the main result of the paper.
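The substitution used in the proof of Lemma 1 can be verified numerically: computing $l$ from (7) and inserting it into (6) should reproduce (3), and $l = 0$ should give (4). The following sketch checks this in exact arithmetic; all function names are hypothetical and `v2` stands for $v^2$.

```python
from fractions import Fraction

def l_from_v2(a, b, u, v2):
    """Parameter l defined in (7)."""
    d = b * u**2 + a * (3 * b + 1) * (a + 2 * u)
    return 2 * d / (2 * d + (2 * b + 1) * v2)

def L_in_l(a, b, u, l):
    """Matrix (6), parametrized by u >= 0 and 0 <= l <= 1."""
    d = a * (3 * b + 1) * (a + 2 * u) + b * u**2
    return [[1 - a * (2 * b + 1) * (a + 2 * u) * l / d, b * (a + u) * l / d],
            [a * (a + u) * (a + 2 * u) * l / d,
             (1 - b * (a + u) ** 2 * l / d) / (1 + 2 * b)]]

def L_in_v2(a, b, u, v2):
    """Matrix (3), parametrized by u >= 0 and v^2 >= 0."""
    w = 2 * b * u**2 + 2 * a * (3 * b + 1) * (a + 2 * u) + (2 * b + 1) * v2
    return [[(2 * b * (a + u) ** 2 + (2 * b + 1) * v2) / w, 2 * b * (a + u) / w],
            [2 * a * (a + u) * (a + 2 * u) / w, (2 * a * (a + 2 * u) + v2) / w]]

a, b = Fraction(1, 4), Fraction(1, 10)
l = l_from_v2(a, b, 2, 5)
agrees = L_in_l(a, b, 2, l) == L_in_v2(a, b, 2, 5)
```

Because every entry of `L_in_l` is affine in `l`, each fixed `u` traces a straight segment in the space of matrices.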
According to (5) we get the following corollary.
Corollary 1. An estimator $g'Z = g_1Z_1 + g_2Z_2$ is admissible for $f'\theta = f_1\mu^2 + f_2\sigma^2$ if and only if
$$g_1 = \frac{a^2(3b+1)(1-l) + a[2(3b+1) - (5b+2)l]\,u + bu^2}{a(3b+1)(a+2u) + bu^2}\, f_1 + \frac{b(a+u)\,l}{a(3b+1)(a+2u) + bu^2}\, f_2, \tag{8}$$
$$g_2 = \frac{a^2(3b+1)(l-1) + a[(8b+3)l - 2(3b+1)]\,u + [(5b+2)l - b]\,u^2}{(2b+1)[a(3b+1)(a+2u) + bu^2]}\, a f_1 + \frac{b(1-l)u^2 + a[b(3-l)+1](a+2u)}{(2b+1)[a(3b+1)(a+2u) + bu^2]}\, f_2, \tag{9}$$
where $u \ge 0$, $0 \le l \le 1$.
Since $g_1$ and $g_2$ given by (8) and (9), respectively, are linear combinations of the elements of (6), they also depend linearly on $l$.
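Formulae (8) and (9) are the product $g = L H^{-1} f$ from (5) written out, with $L$ given by (6) and $H^{-1}f = (f_1, f_2 - a f_1)'$. A minimal sketch comparing the two routes in exact arithmetic is given below; the function names `g_explicit` and `g_matrix` are hypothetical.

```python
from fractions import Fraction

def g_explicit(a, b, u, l, f1, f2):
    """Coefficients (g1, g2) from formulae (8) and (9)."""
    d = a * (3 * b + 1) * (a + 2 * u) + b * u**2
    g1 = ((a**2 * (3 * b + 1) * (1 - l)
           + a * (2 * (3 * b + 1) - (5 * b + 2) * l) * u + b * u**2) * f1
          + b * (a + u) * l * f2) / d
    g2 = ((a**2 * (3 * b + 1) * (l - 1)
           + a * ((8 * b + 3) * l - 2 * (3 * b + 1)) * u
           + ((5 * b + 2) * l - b) * u**2) * a * f1
          + (b * (1 - l) * u**2 + a * (b * (3 - l) + 1) * (a + 2 * u)) * f2
          ) / ((2 * b + 1) * d)
    return g1, g2

def g_matrix(a, b, u, l, f1, f2):
    """Coefficients (g1, g2) computed as L H^{-1} f with L from (6)."""
    d = a * (3 * b + 1) * (a + 2 * u) + b * u**2
    L = [[1 - a * (2 * b + 1) * (a + 2 * u) * l / d, b * (a + u) * l / d],
         [a * (a + u) * (a + 2 * u) * l / d,
          (1 - b * (a + u) ** 2 * l / d) / (1 + 2 * b)]]
    h = (f1, f2 - a * f1)                                   # H^{-1} f
    return (L[0][0] * h[0] + L[0][1] * h[1], L[1][0] * h[0] + L[1][1] * h[1])

a, b = Fraction(1, 4), Fraction(1, 10)
match = g_explicit(a, b, 1, Fraction(1, 2), 1, 2) == g_matrix(a, b, 1, Fraction(1, 2), 1, 2)
```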
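Theorem 1. An estimator $g_1Z_1 + g_2Z_2$ is admissible for $f'\theta = f_1\mu^2 + f_2\sigma^2$ if and only if $(g_1, g_2)' \in D_f$, where $D_f$ is the region with the boundary $\mathcal{A} \cup \mathcal{B} \cup \mathcal{C}$, where
$$\mathcal{A} = \left\{ \left( \frac{b(a+u)(uf_1 + f_2)}{a(3b+1)(a+2u) + bu^2},\ \frac{a(a+2u)(uf_1 + f_2)}{a(3b+1)(a+2u) + bu^2} \right)' : u \ge 0 \right\},$$
$\mathcal{B}$ is the interval connecting the point $B = \left(f_1, \frac{f_2 - a f_1}{2b+1}\right)'$ with the point $A_\infty = \left(f_1, \frac{2 a f_1}{b}\right)'$, and $\mathcal{C}$ is the interval connecting the point $B$ with the point $A_0 = \left(\frac{b f_2}{a(3b+1)}, \frac{f_2}{3b+1}\right)'$; moreover $\mathcal{A} \subset D_f$, $\mathcal{C} \subset D_f$ and $D_f \cap \mathcal{B} = \{B\}$.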
Proof. First note that for any fixed $u \ge 0$ the components of $g$ given by (8) and (9) depend linearly on $l$ and that $l$ belongs to the closed interval $[0, 1]$. So the set $D_f$ consists of intervals connecting a point $A$ from the line $\mathcal{A}$, obtained for $l = 1$, that is
$$A = \left( \frac{b(a+u)(uf_1 + f_2)}{a(3b+1)(a+2u) + bu^2},\ \frac{a(a+2u)(uf_1 + f_2)}{a(3b+1)(a+2u) + bu^2} \right)', \quad u \ge 0,$$
with the point $B$ obtained for $l = 0$. Now it is enough to notice that the interval $\mathcal{C}$ is obtained for $u = 0$ and that the interval $\mathcal{B}$ is the set of limits of the points $(g_1, g_2)'$ when $u \to +\infty$. This ends the proof.
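The construction in the proof can be sketched directly: every point of $D_f$ lies on a segment from $B$ (for $l = 0$) to a point of the curve $\mathcal{A}$ (for $l = 1$). In the sketch below the function names are hypothetical, and `point_B` uses the value of $(g_1, g_2)'$ obtained from (8) and (9) at $l = 0$.

```python
from fractions import Fraction

def curve_A(a, b, u, f1, f2):
    """Point of the curve A (the case l = 1) for a given u >= 0."""
    d = a * (3 * b + 1) * (a + 2 * u) + b * u**2
    s = u * f1 + f2
    return (b * (a + u) * s / d, a * (a + 2 * u) * s / d)

def point_B(a, b, f1, f2):
    """Common endpoint B of all segments (the case l = 0)."""
    return (f1, (f2 - a * f1) / (2 * b + 1))

def point_A0(a, b, f1, f2):
    """Endpoint of the curve A for u = 0."""
    return (b * f2 / (a * (3 * b + 1)), f2 / (3 * b + 1))

def point_A_inf(a, b, f1, f2):
    """Limit of the curve A as u tends to infinity."""
    return (f1, 2 * a * f1 / b)

def point_of_Df(a, b, u, l, f1, f2):
    """Point on the segment from B (l = 0) to A(u) (l = 1)."""
    p, q = curve_A(a, b, u, f1, f2), point_B(a, b, f1, f2)
    return tuple(l * pi + (1 - l) * qi for pi, qi in zip(p, q))

a, b = Fraction(1, 4), Fraction(1, 10)
start = curve_A(a, b, 0, 1, 2)                 # coincides with A_0
far = curve_A(a, b, Fraction(10**9), 1, 2)     # close to A_infinity
```

Sampling `point_of_Df` over a grid of `u` and `l` gives a simple way to sketch the region numerically.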
Notice that for estimation of $f'\theta$ the set $D_f$ is bounded, and sometimes closed.
Remark 1. The description of the set of admissible estimators of a linear combination of $\mu^2$ and $\sigma^2$ given in Theorem 1 is different from the one given by Rukhin (1990; formulae (1) and (2) in this paper). For $K = 1_n$ the line $\mathcal{A}$ is a part of the curve obtained by putting equality in inequality (1). Intervals $\mathcal{B}$ and $\mathcal{C}$ are contained in the straight lines $\operatorname{sgn}(g_1 - f_1) = 0$ and $\operatorname{sgn}\left(n f_2 - \frac{n(n+1)}{n-1}\, g_2 - g_1\right) = 0$, respectively (compare condition (2)).
However, condition (2) excludes the admissible estimators corresponding to the interval $\mathcal{C}$ except the point $B$. So condition (2) is too restrictive.
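The first claim of the remark can be checked numerically: for $K = 1_n$ (so $a = 1/n$, $b = 1/(n-1)$) points of the curve $\mathcal{A}$ should turn inequality (1) into an equality. The sketch below evaluates both sides in exact arithmetic; the function names `rukhin_sides` and `curve_A` are hypothetical.

```python
from fractions import Fraction

def rukhin_sides(n, g1, g2, f1, f2):
    """Left and right sides of inequality (1)."""
    n = Fraction(n)
    lhs = ((n * f2 - n * (n + 1) / (n - 1) * g2 - g1)
           * (5 * g1 - 2 * n / (n - 1) * g2 - f1))
    rhs = (g1 - f1) * (n * f2 - n * g2 - 3 * g1)
    return lhs, rhs

def curve_A(n, u, f1, f2):
    """Point of the curve A for K equal to the vector of n ones."""
    a, b = Fraction(1, n), Fraction(1, n - 1)
    d = a * (3 * b + 1) * (a + 2 * u) + b * u**2
    s = u * f1 + f2
    return (b * (a + u) * s / d, a * (a + 2 * u) * s / d)

g1, g2 = curve_A(3, 2, 1, 1)
lhs, rhs = rukhin_sides(3, g1, g2, 1, 1)
```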
For a fixed model (i.e., for fixed $a$ and $b$) the shape of the set $D_f$ depends on the estimated function $f_1\mu^2 + f_2\sigma^2$. Different shapes of the set $D_f$ correspond to three regions determined by the following lines:
(a): $f_2 = 0$,
(b): $f_2 = \frac{a(3b+1)}{b}\, f_1$,
(c): $f_2 = \frac{a(5b+2)}{b}\, f_1$.
We also consider three cases corresponding to the boundary lines (a), (b) and (c).
Different forms of the set $D_f$ are presented in figures. Because the shape of the region $D_f$ does not depend on the parameters $a$ and $b$, in the calculations for all figures we took $a = 0.25$ and $b = 0.1$.
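For the values $a = 0.25$ and $b = 0.1$ the slopes of lines (b) and (c) can be computed directly, and one can check that on these lines an interval of the boundary collapses to a point. The sketch below is only an illustration; the function names are hypothetical.

```python
from fractions import Fraction

def boundary_slopes(a, b):
    """Slopes f2/f1 of lines (b) and (c); line (a) is f2 = 0."""
    return a * (3 * b + 1) / b, a * (5 * b + 2) / b

def point_B(a, b, f1, f2):
    return (f1, (f2 - a * f1) / (2 * b + 1))

def point_A0(a, b, f1, f2):
    return (b * f2 / (a * (3 * b + 1)), f2 / (3 * b + 1))

def point_A_inf(a, b, f1, f2):
    return (f1, 2 * a * f1 / b)

a, b = Fraction(1, 4), Fraction(1, 10)
slope_b, slope_c = boundary_slopes(a, b)
# On line (b) the points B and A_0 coincide; on line (c) B and A_infinity do.
on_b = point_B(a, b, 1, slope_b) == point_A0(a, b, 1, slope_b)
on_c = point_B(a, b, 1, slope_c) == point_A_inf(a, b, 1, slope_c)
```

The two slopes equal 3.25 and 6.25, which are the values of $f_2$ used with $f_1 = 1$ in Figures 4 and 6.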
Figure 1. Regions of vectors $f$ under consideration.
In Figures 2–5, which present the set $D_f$, the positions of the points corresponding to the best unbiased estimator (point P) and the maximum likelihood estimator (point S, derived under normality) are marked. These estimators are given by $g_1 = f_1$, $g_2 = f_2 - a f_1$ and $g_1 = f_1$, $g_2 = \frac{1}{b+1}\, f_2$, respectively.
Figure 2 presents the shape of $D_f$ for $f = (f_1, 0)'$, $f_1 > 0$. The set $D_f$ is the union of the intervals connecting points from $\mathcal{A}$ with the point $B$. So the set $D_f$ is bounded, but not closed and not convex.
Figure 2. Admissible estimators for $f'\theta$, where $f = (1, 0)'$.
Figure 3. Admissible estimators for $f'\theta$, where $f = (1, 1.625)'$.
The shape of $D_f$ in Figure 3 is obtained for $f = (f_1, f_2)'$ with $0 < f_2 < \frac{a(3b+1) f_1}{b}$, $f_1 > 0$. This set is bounded and convex, but not closed.
Figure 4. Admissible estimators for $f'\theta$, where $f = (1, 3.25)'$.
Figure 5. Admissible estimators for $f'\theta$, where $f = (1, 5.4)'$.
Figure 4 presents the shape of $D_f$ for $f = (f_1, f_2)'$, where $f_2 = \frac{a(3b+1) f_1}{b}$, $f_1 > 0$. Then the interval $\mathcal{C}$ degenerates to one point and the convex set $D_f$ lies between the line $\mathcal{A}$ and the interval $\mathcal{B}$. It is bounded and convex but not closed.
The shape of $D_f$ in Figure 5 is obtained for $f = (f_1, f_2)'$, where $\frac{a(3b+1) f_1}{b} < f_2 < \frac{a(5b+2) f_1}{b}$, $f_1 > 0$. This set consists of two parts and is bounded by the line $\mathcal{A}$, the interval $\mathcal{B}$ and the interval $\mathcal{C}$. The set $D_f$ is bounded but not closed and not convex.
Figure 6. Admissible estimators for $f'\theta$, where $f = (1, 6.25)'$.
For $f = (f_1, f_2)'$, where $f_2 = \frac{a(5b+2) f_1}{b}$, $f_1 > 0$, we obtain the shape of $D_f$ presented in Figure 6. Then the interval $\mathcal{B}$ degenerates to one point and the convex set $D_f$ lies between the line $\mathcal{A}$ and the interval $\mathcal{C}$. It is bounded, convex and closed. In Figures 6 and 7 we do not mark the points S and P because their positions go beyond the scale of the picture.
In the case when $f_2 > \frac{a(5b+2) f_1}{b}$, $f_1 > 0$, the set $D_f$ is not convex.
Figure 7. Admissible estimators for $f'\theta$, where $f = (1, 7.25)'$.
Figures 2–7 present different shapes of $D_f$ for $f_1 \ge 0$. For completeness notice that the set $D_{-f}$ is symmetric to the set $D_f$ with respect to the point $(0, 0)'$. Also notice that for $f = (0, 0)'$ the set $D_f$ reduces to $\{(0, 0)'\}$.
4. Conclusions
In the paper we describe explicitly the boundary of the set of admissible quadratic estimators for a linear combination of parameters in the $n$-dimensional normal model $N(K\mu, \sigma^2 I_n)$. The shape of the resulting set of admissible quadratic estimators strongly depends on the estimated function $f'\theta = f_1\mu^2 + f_2\sigma^2$. Some examples are presented in a series of figures. We obtain a bounded set $D_f$ (convex or not), which usually is not closed. Only in one case, when we estimate $f'\theta = f_1\mu^2 + f_2\sigma^2$ with $f_2 = \frac{a(5b+2) f_1}{b}$, do we obtain a closed set $D_f$ of admissible quadratic estimators.
It is easy to prove that the best unbiased estimator (UE) is admissible only in the case when we estimate $f'\theta$ with $f_2 = a f_1$; similarly, the maximum likelihood estimator (MLE) is admissible only in the case when we estimate $f'\theta$ with $f_2 = \frac{-a(b+1) f_1}{b}$. In other cases we can find an admissible estimator which is better than UE and MLE. Of course, we can expect that an admissible estimator can be much better than UE and MLE only for small values of $n$.
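Both conditions correspond to the points P and S coinciding with the point $B$, the only point of $D_f$ stated in Theorem 1 to lie on the interval $\mathcal{B}$ (all three points have $g_1 = f_1$). The sketch below checks only this coincidence, not the full admissibility argument; the function names are hypothetical.

```python
from fractions import Fraction

def point_B(a, b, f1, f2):
    """Point B, obtained for l = 0."""
    return (f1, (f2 - a * f1) / (2 * b + 1))

def point_P(a, f1, f2):
    """Coefficients of the best unbiased estimator."""
    return (f1, f2 - a * f1)

def point_S(b, f1, f2):
    """Coefficients of the maximum likelihood estimator."""
    return (f1, f2 / (b + 1))

a, b, f1 = Fraction(1, 4), Fraction(1, 10), 1
f2_ue = a * f1                      # condition for UE
f2_mle = -a * (b + 1) * f1 / b      # condition for MLE
ue_is_B = point_P(a, f1, f2_ue) == point_B(a, b, f1, f2_ue)
mle_is_B = point_S(b, f1, f2_mle) == point_B(a, b, f1, f2_mle)
```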
For the estimation of a linear combination of $\mu^2$ and $\sigma^2$ we should find estimators which have small risk. The minimal risk is attained by estimators corresponding to $l_{\max} = 1$. For $u = 0$ the smallest risk is obtained for $\mu^2 \ll \sigma^2$. When we choose a somewhat larger $u$, the corresponding estimator has minimal risk for $\mu^2 \approx \sigma^2$. In the case when $\mu^2 \gg \sigma^2$ the gains are relatively small when $u \to +\infty$.
In detail we will consider two main cases: estimation of $\mu^2$ and estimation of $\sigma^2$. For both cases we choose the estimator corresponding to the point $B$ on the line $\mathcal{B}$, obtained for $l = 0$. This point is given by $B = \left(f_1, \frac{-a f_1 + f_2}{2b+1}\right)'$. Simple calculations show that the estimator corresponding to this point is in both cases better than UE and MLE.
In the calculations we consider the $n$-dimensional model with $K = 1_n$. For such a model $a = \frac{1}{n}$ and $b = \frac{1}{n-1}$.
4.1. Estimation of $\mu^2$
Since in this case ($f_2 = 0$)
$$MSE_P(\mu^2, \sigma^2) - MSE_S(\mu^2, \sigma^2) = a^2(2b - 1)\,\sigma^4 f_1^2,$$
the maximum likelihood estimator is worse than the unbiased one for $n > 3$. So we compare the risk of the unbiased estimator with the risk of the admissible quadratic estimator corresponding to the point $B$. The ratio of risks has the following form
$$\frac{MSE_P(\mu^2, \sigma^2)}{MSE_B(\mu^2, \sigma^2)} = \frac{(2b+1)\,[2\mu^2 + a(b+1)\sigma^2]}{(4b+2)\mu^2 + a(3b+1)\sigma^2}.$$
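Both formulae of this subsection follow from the mean squared error of a general estimator $g_1Z_1 + g_2Z_2$ of $f_1\mu^2 + f_2\sigma^2$, computed as variance plus squared bias from the moments of $Z$ given in Section 2. A minimal sketch of this check, with hypothetical function names and arbitrary test values, is:

```python
from fractions import Fraction

def mse(g1, g2, f1, f2, mu2, sigma2, a, b):
    """MSE of g1*Z1 + g2*Z2 as an estimator of f1*mu2 + f2*sigma2."""
    var = (g1**2 * 2 * (2 * a * mu2 * sigma2 + a**2 * sigma2**2)
           + g2**2 * 2 * b * sigma2**2)
    bias = g1 * (mu2 + a * sigma2) + g2 * sigma2 - (f1 * mu2 + f2 * sigma2)
    return var + bias**2

def ratio_closed(mu2, sigma2, a, b):
    """Closed form of MSE_P / MSE_B for f = (f1, 0)'."""
    return ((2 * b + 1) * (2 * mu2 + a * (b + 1) * sigma2)
            / ((4 * b + 2) * mu2 + a * (3 * b + 1) * sigma2))

a, b = Fraction(1, 4), Fraction(1, 3)
mu2, s2 = Fraction(3, 10), Fraction(7, 10)
mse_P = mse(1, -a, 1, 0, mu2, s2, a, b)                 # unbiased estimator
mse_S = mse(1, 0, 1, 0, mu2, s2, a, b)                  # maximum likelihood
mse_B = mse(1, -a / (2 * b + 1), 1, 0, mu2, s2, a, b)   # point B
```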
Table 1. Ratio of risks of estimators of $\mu^2$.
$\sigma^2$ \ $n$ 2 4 6 8 10
0.0 1.000 1.000 1.000 1.000 1.000
0.1 1.018 1.002 1.001 1.000 1.000
0.2 1.038 1.004 1.001 1.000 1.000
0.3 1.062 1.007 1.002 1.001 1.000
0.4 1.091 1.010 1.003 1.001 1.001
0.5 1.125 1.014 1.004 1.002 1.001
0.6 1.167 1.020 1.006 1.003 1.001
0.7 1.219 1.029 1.009 1.004 1.002
0.8 1.286 1.042 1.014 1.006 1.003
0.9 1.375 1.064 1.023 1.011 1.006
1.0 1.500 1.111 1.050 1.029 1.019
MSE of UE relative to the MSE of the estimator corresponding to $B$ for the $n$-dimensional normal model, calculated for $(1 - \sigma^2, \sigma^2)'$ with $\sigma^2 = 0(0.1)1$.
Since the ratio $MSE_P(t\mu^2, t\sigma^2) / MSE_B(t\mu^2, t\sigma^2)$, $t > 0$, does not depend on $t$, we can compare risks on the interval $\{(\mu^2, \sigma^2)' : \mu^2 = 1 - \sigma^2,\ 0 \le \sigma^2 \le 1\}$. Table 1 gives the ratios of risks calculated for $(1 - \sigma^2, \sigma^2)'$ with $\sigma^2$ from the set $\{0, 0.1, \ldots, 0.9, 1\}$. The results confirm that even for moderate values of $n$ the estimator corresponding to $B$ is only a little better than UE. For example, when $n = 10$ the admissible estimator is only 0–2% better than UE.
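The entries of Table 1 can be recomputed from the closed-form ratio with $a = 1/n$, $b = 1/(n-1)$ and $\mu^2 = 1 - \sigma^2$. The following sketch prints the rows to three decimals; the function name `ratio_mu2` is hypothetical.

```python
from fractions import Fraction

def ratio_mu2(n, sigma2):
    """MSE_P / MSE_B for estimating mu^2 at the point (1 - sigma2, sigma2)."""
    a, b = Fraction(1, n), Fraction(1, n - 1)
    mu2 = 1 - sigma2
    return ((2 * b + 1) * (2 * mu2 + a * (b + 1) * sigma2)
            / ((4 * b + 2) * mu2 + a * (3 * b + 1) * sigma2))

sizes = (2, 4, 6, 8, 10)
for step in range(11):
    sigma2 = Fraction(step, 10)
    row = " ".join(f"{float(ratio_mu2(n, sigma2)):.3f}" for n in sizes)
    print(f"{float(sigma2):.1f} {row}")
```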
4.2. Estimation of $\sigma^2$
Since in this case ($f_1 = 0$)
$$MSE_P(\mu^2, \sigma^2) - MSE_S(\mu^2, \sigma^2) = \frac{b^2(3 + 2b)\,\sigma^4 f_2^2}{(1 + b)^2} \ge 0,$$
the maximum likelihood estimator is at least as good as the unbiased one. So we compare the risk of the maximum likelihood estimator with the risk of the admissible quadratic estimator corresponding to the point $B$. The ratio of risks has the following form
$$\frac{MSE_S(\mu^2, \sigma^2)}{MSE_B(\mu^2, \sigma^2)} = \frac{(2 + b)(1 + 2b)}{2(1 + b)^2}.$$
Note that this ratio does not depend on the unknown parameters $\mu^2$ and $\sigma^2$; it depends on the model only through $b$.
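Since the ratio depends only on $b = 1/(n-1)$, it is straightforward to evaluate for any sample size. The sketch below does so for a few values of $n$; the function name `ratio_sigma2` and the chosen values of $n$ are illustrative assumptions, and the output is computed from the formula rather than quoted from the paper.

```python
from fractions import Fraction

def ratio_sigma2(n):
    """MSE_S / MSE_B for estimating sigma^2, with b = 1 / (n - 1)."""
    b = Fraction(1, n - 1)
    return (2 + b) * (1 + 2 * b) / (2 * (1 + b) ** 2)

for n in (2, 5, 10, 15, 20, 25, 30, 35):
    print(n, f"{float(ratio_sigma2(n)):.3f}")
```

The ratio equals $1 + \frac{b}{2(1+b)^2}$, so it always exceeds 1 and tends to 1 as $n$ grows.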
Calculated values of the ratio of $MSE_S(\mu^2, \sigma^2)$ to $MSE_B(\mu^2, \sigma^2)$ for different $n$ are given in Table 2.
Table 2. Ratio of risks of estimators of $\sigma^2$.
$n$ 2 5 10 15 20 25 30 35
$MSE_S(\mu^2, \sigma^2) / MSE_B(\mu^2, \sigma^2)$