A. BORATYŃSKA and M. DROZDOWICZ (Warszawa)
ROBUST BAYESIAN ESTIMATION IN A NORMAL MODEL WITH ASYMMETRIC LOSS FUNCTION
Abstract. The problem of robust Bayesian estimation in a normal model with an asymmetric loss function (LINEX) is considered. Some uncertainty about the prior is assumed by introducing two classes of priors. The most robust and conditional Γ-minimax estimators are constructed. The situations when those estimators coincide are presented.

1991 Mathematics Subject Classification: Primary 62C10; Secondary 62F15, 62F35.
Key words and phrases: Bayes estimators, classes of priors, robust Bayesian estimation, asymmetric loss function.
1. Introduction and notation. In Bayesian statistical inference the goal is to find optimal decisions under a specified loss function and a prior distribution over the parameter space. However, the arbitrariness of a unique prior distribution is a permanent problem. Robust Bayesian inference deals with the problem of expressing uncertainty in the prior information by means of a class Γ of priors, and of measuring the range of a posterior quantity as the prior distribution Π runs over the class Γ. Interest lies not only in calculating the range but also in constructing optimal procedures.
In the problem of estimation of an unknown parameter, two concepts of optimality are considered: the idea of conditional Γ-minimax estimators (see DasGupta and Studden [4], Betrò and Ruggeri [1]) and the idea of stable estimators developed in Męczarski and Zieliński [6] and Boratyńska and Męczarski [3]. The first concept is connected with the efficiency of the estimator with respect to the posterior risk when the priors run over Γ. The second one is connected with finding an estimator with the smallest oscillation of the posterior risk when the priors run over Γ. Sometimes those two estimators coincide (see Męczarski [5] and Boratyńska [2]).
In all the papers mentioned above the quadratic loss function was considered. However, in many situations a quadratic loss function seems inappropriate in that it assigns the same loss to overestimates as to equal underestimates.
In this paper we estimate an unknown parameter θ and consider the asymmetric loss function (LINEX)
L(θ, d) = exp(a(θ − d)) − a(θ − d) − 1,
where a is a known parameter and a ≠ 0. Exhaustive motivations for using the LINEX loss are presented in Varian [7] and Zellner [8]. We find the conditional Γ-minimax estimators and the stable estimators, and present conditions under which those estimators coincide, in a normal model with the two classes of conjugate priors given below.
Let X_1, . . . , X_n be i.i.d. random variables with the normal N(θ, b²) distribution, where θ is unknown and b² is known. Set X = (X_1, . . . , X_n). Let Π_{μ_0,σ_0} = N(μ_0, σ_0²) be a fixed prior distribution of θ. Define

X̄ = (1/n) ∑_{i=1}^n X_i,   v_n = n(X̄ − μ_0)/b²,   λ = λ(σ) = (1/σ² + n/b²)^{−1},

m = m(μ) = μ(1 − (n/b²)(1/σ_0² + n/b²)^{−1}),   w_n = (a/2 + nX̄/b²)(1/σ_0² + n/b²)^{−1}.
If X = x then the posterior distribution is the normal distribution

N(μ_0 + v_nλ_0, λ_0) = N(m_0 + w_n − aλ_0/2, λ_0),

where λ_0 = λ(σ_0) and m_0 = m(μ_0). The posterior risk of an estimator θ̂ under the LINEX loss function is equal to

Ee^{a(θ−θ̂)} − aEθ + aθ̂ − 1,

where Ey(θ) denotes the expected value of a function y(θ) when θ has the posterior distribution. Thus under the prior Π_{μ_0,σ_0},

Ee^{aθ} = exp(aμ_0 + (a²/2 + av_n)λ_0) = exp(am_0 + aw_n)

and

Eθ = μ_0 + v_nλ_0 = m_0 + w_n − aλ_0/2.

The minimum of the posterior risk as a function of θ̂ is attained at θ̂ = (1/a) ln Ee^{aθ}. Thus the Bayes estimator under the LINEX loss function is given by the formula

θ̂^{Bay}_{μ_0,σ_0} = (1/a) ln Ee^{aθ} = μ_0 + (a/2 + v_n)λ_0 = m_0 + w_n.
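For illustration (a sketch, not part of the original paper; the data and the values of a, b, μ_0, σ_0, n below are our own assumptions), the closed form of the Bayes estimator can be compared with a direct grid minimization of the posterior risk, e.g. in Python:

import numpy as np

# Illustrative setup; all numerical values are assumptions, not taken from the paper.
rng = np.random.default_rng(0)
a, b, mu0, sigma0, n = 1.5, 2.0, 0.0, 1.0, 25
x = rng.normal(0.7, b, size=n)             # sample from N(theta, b^2) with theta = 0.7

xbar = x.mean()
vn = n * (xbar - mu0) / b**2               # v_n = n(xbar - mu_0)/b^2
lam0 = 1.0 / (1.0 / sigma0**2 + n / b**2)  # lambda_0 = (1/sigma_0^2 + n/b^2)^(-1)
post_mean = mu0 + vn * lam0                # posterior mean; posterior variance is lam0

# Closed form: Bayes estimator under LINEX loss, (1/a) ln E exp(a*theta).
theta_bayes = mu0 + (a / 2 + vn) * lam0

# Direct check: for a normal posterior, E exp(a(theta - d)) = exp(a(post_mean - d) + a^2*lam0/2),
# so the posterior risk is available in closed form and can be minimized on a grid.
d = np.linspace(post_mean - 3.0, post_mean + 3.0, 200001)
risk = np.exp(a * (post_mean - d) + a**2 * lam0 / 2) - a * (post_mean - d) - 1
print(theta_bayes, d[np.argmin(risk)])     # the two values agree to grid accuracy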
Now suppose that the prior distribution is not exactly specified, and consider two classes of prior distributions of θ:

Γ_{μ_0} = {Π_{μ_0,σ} : Π_{μ_0,σ} = N(μ_0, σ²), σ ∈ (σ_1, σ_2)},

where σ_1 < σ_2 are fixed and σ_0 ∈ (σ_1, σ_2), and

Γ*_{σ_0} = {Π_{μ,σ_0} : Π_{μ,σ_0} = N(μ, σ_0²), μ ∈ (μ_1, μ_2)},

where μ_1 < μ_2 are fixed and μ_0 ∈ (μ_1, μ_2). The classes Γ_{μ_0} and Γ*_{σ_0} express two types of uncertainty about the elicited prior.
Let R_x(μ, σ, θ̂) denote the posterior risk of the estimator θ̂ when the prior is normal N(μ, σ²). The posterior risk can be expressed by two formulas, as a function of λ and as a function of m:

R_x(μ_0, σ, θ̂) = ρ_{μ_0}(λ, θ̂) = exp(−aθ̂ + aμ_0 + (a²/2 + av_n)λ) − a(μ_0 + λv_n) + aθ̂ − 1

and

R_x(μ, σ_0, θ̂) = ρ*_{σ_0}(m, θ̂) = exp(−aθ̂ + am + aw_n) − a(m + w_n) + a²λ_0/2 + aθ̂ − 1.
Observe that λ is an increasing function of σ, and therefore if σ ∈ (σ_1, σ_2) then λ ∈ (λ_1, λ_2), where λ_i = λ(σ_i), i = 1, 2. Similarly, m is an increasing function of μ, and therefore if μ ∈ (μ_1, μ_2) then m ∈ (m_1, m_2), where m_i = m(μ_i), i = 1, 2. The ranges of the posterior risk of the estimator θ̂ when the prior runs over Γ_{μ_0} and Γ*_{σ_0} are

r_{μ_0}(θ̂) = sup_{λ∈(λ_1,λ_2)} ρ_{μ_0}(λ, θ̂) − inf_{λ∈(λ_1,λ_2)} ρ_{μ_0}(λ, θ̂)

and

r*_{σ_0}(θ̂) = sup_{m∈(m_1,m_2)} ρ*_{σ_0}(m, θ̂) − inf_{m∈(m_1,m_2)} ρ*_{σ_0}(m, θ̂),

respectively.
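Both parametrizations of the posterior risk, and hence both ranges, are straightforward to evaluate numerically. A brute-force sketch (the inputs below are hypothetical placeholders for quantities computed from the data; in particular w_n, λ_0 and the bounds are not recomputed here):

import numpy as np

# Hypothetical inputs standing in for a, lambda_0, mu_0, v_n, w_n and the bounds.
a, lam0, mu0, vn, wn = 1.5, 0.04, 0.0, 3.0, 0.2
lam1, lam2 = 0.02, 0.06                    # lambda(sigma_1), lambda(sigma_2)
m1, m2 = -0.3, 0.3                         # m(mu_1), m(mu_2)

def rho_mu0(lam, d):                       # posterior risk as a function of lambda
    return np.exp(-a*d + a*mu0 + (a**2/2 + a*vn)*lam) - a*(mu0 + lam*vn) + a*d - 1

def rho_star(m, d):                        # posterior risk as a function of m
    return np.exp(-a*d + a*m + a*wn) - a*(m + wn) + a**2*lam0/2 + a*d - 1

def r_mu0(d, k=100001):                    # oscillation over lambda in (lam1, lam2)
    vals = rho_mu0(np.linspace(lam1, lam2, k), d)
    return vals.max() - vals.min()

def r_star(d, k=100001):                   # oscillation over m in (m1, m2)
    vals = rho_star(np.linspace(m1, m2, k), d)
    return vals.max() - vals.min()

print(r_mu0(0.1), r_star(0.1))             # ranges at a trial estimate d = 0.1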
2. The range of the posterior risk for the Bayes estimator.
Consider the prior Π_{μ_0,σ_0}, note that Π_{μ_0,σ_0} ∈ Γ_{μ_0} and Π_{μ_0,σ_0} ∈ Γ*_{σ_0}, and consider the Bayes estimator

θ̂^{Bay}_{μ_0,σ_0} = μ_0 + (a/2 + v_n)λ_0 = m_0 + w_n.

The posterior risk of this estimator under an arbitrary prior Π_{μ_0,σ} ∈ Γ_{μ_0} is

ρ_{μ_0}(λ, θ̂^{Bay}_{μ_0,σ_0}) = exp((a²/2 + av_n)(λ − λ_0)) − av_n(λ − λ_0) + a²λ_0/2 − 1.
Denote it by f(λ). Now computations lead to the following form of the oscillation of ρ_{μ_0} for θ̂^{Bay}_{μ_0,σ_0} while λ runs over (λ_1, λ_2):

r_{μ_0}(θ̂^{Bay}_{μ_0,σ_0}) =
  f(λ_2) − f(λ_1)  if −a/2 ≤ v_n < 0 and a > 0, or 0 < v_n ≤ −a/2 and a < 0, or λ̂ < λ_1,
  f(λ_2) − f(λ̂)   otherwise,

where

λ̂ = λ_0 + (a²/2 + av_n)^{−1} ln[v_n/(a/2 + v_n)].

Thus

r_{μ_0}(θ̂^{Bay}_{μ_0,σ_0}) =
  e^{z(λ_1−λ_0)}[e^{zδ} − 1] − av_nδ     if −a/2 < v_n < 0 and a > 0, or 0 < v_n ≤ −a/2 and a < 0, or λ̂ < λ_1,
  a²δ/2                                  if v_n = −a/2,
  e^{z(λ_2−λ_0)} + av_n(λ̂ − λ_2 − 1/z)  otherwise,

where z = a²/2 + av_n and δ = λ_2 − λ_1.
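This piecewise closed form can be checked against a brute-force evaluation of the oscillation (a sketch with illustrative parameter values of our own choosing; these particular values fall into the first branch, with λ̂ < λ_1):

import numpy as np

# Illustrative parameters (assumptions, not from the paper); here v_n > 0 and a > 0.
a, lam0, lam1, lam2, vn = 1.5, 0.04, 0.02, 0.06, 3.0
z, delta = a**2/2 + a*vn, lam2 - lam1

def f(lam):                                # rho_mu0(lambda, Bayes estimator)
    return np.exp(z*(lam - lam0)) - a*vn*(lam - lam0) + a**2*lam0/2 - 1

lam_hat = lam0 + np.log(vn / (a/2 + vn)) / z     # interior minimum point of f

# The piecewise closed form for the oscillation r_mu0 of the Bayes estimator.
if vn == -a/2:
    r = a**2 * delta / 2
elif (a > 0 and -a/2 < vn < 0) or (a < 0 and 0 < vn <= -a/2) or lam_hat < lam1:
    r = np.exp(z*(lam1 - lam0)) * (np.exp(z*delta) - 1) - a*vn*delta
else:
    r = np.exp(z*(lam2 - lam0)) + a*vn*(lam_hat - lam2 - 1/z)

lam = np.linspace(lam1, lam2, 200001)            # brute-force sup - inf on a fine grid
print(r, f(lam).max() - f(lam).min())            # the two values agree to grid accuracy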
Consider the class Γ*_{σ_0}. The posterior risk of this estimator under an arbitrary prior Π_{μ,σ_0} ∈ Γ*_{σ_0} is

ρ*_{σ_0}(m, θ̂^{Bay}_{μ_0,σ_0}) = e^{−a(m_0−m)} + a(m_0 − m) + a²λ_0/2 − 1

and the oscillation of ρ*_{σ_0} is equal to

r*_{σ_0}(θ̂^{Bay}_{μ_0,σ_0}) =
  e^{−a(m_0−m_2)} + a(m_0 − m_2) − 1  for m_0 ≤ m̂,
  e^{−a(m_0−m_1)} + a(m_0 − m_1) − 1  for m_0 > m̂,

where

m̂ = m_1 + (1/a) ln[(exp(a(m_2 − m_1)) − 1)/(a(m_2 − m_1))].
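Again the closed form is easy to confirm numerically: the infimum of ρ*_{σ_0}(·, θ̂^{Bay}_{μ_0,σ_0}) is attained at m = m_0, the supremum at the endpoint selected by the threshold m̂. A sketch (parameter values are illustrative assumptions):

import numpy as np

# Illustrative parameters (assumptions): bounds m1 < m0 < m2 for the prior mean part.
a, lam0 = 1.5, 0.04
m1, m0, m2 = -0.3, 0.05, 0.3

def g(m):                                  # rho*_{sigma_0}(m, Bayes estimator)
    return np.exp(-a*(m0 - m)) + a*(m0 - m) + a**2*lam0/2 - 1

m_hat = m1 + np.log((np.exp(a*(m2 - m1)) - 1) / (a*(m2 - m1))) / a

if m0 <= m_hat:                            # closed form for the oscillation
    r = np.exp(-a*(m0 - m2)) + a*(m0 - m2) - 1
else:
    r = np.exp(-a*(m0 - m1)) + a*(m0 - m1) - 1

m = np.linspace(m1, m2, 200001)            # brute-force sup - inf on a fine grid
print(r, g(m).max() - g(m).min())          # the two values agree to grid accuracy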
3. Most stable and conditional Γ-minimax estimators. Now the problem is to find the most stable estimators θ̂_{μ_0} and θ̂*_{σ_0}, i.e. those satisfying

inf_{θ̂} r_{μ_0}(θ̂) = r_{μ_0}(θ̂_{μ_0})   and   inf_{θ̂} r*_{σ_0}(θ̂) = r*_{σ_0}(θ̂*_{σ_0}),

and to find the conditional Γ-minimax estimators θ̃_{μ_0} and θ̃*_{σ_0}, i.e. those satisfying

inf_{θ̂} sup_{σ∈[σ_1,σ_2]} R_x(μ_0, σ, θ̂) = sup_{σ∈[σ_1,σ_2]} R_x(μ_0, σ, θ̃_{μ_0})

and

inf_{θ̂} sup_{μ∈[μ_1,μ_2]} R_x(μ, σ_0, θ̂) = sup_{μ∈[μ_1,μ_2]} R_x(μ, σ_0, θ̃*_{σ_0}).
We use the following theorem proved by Męczarski [5].

Theorem 1 (Męczarski [5]). Let Γ = {Π_α : α ∈ [α_1, α_2]} be a set of prior distributions, where α is a real parameter. Let ρ(α, d) be the posterior risk of a decision d based on an observation x when the prior is Π_α. Assume that the function ρ(α, d) satisfies the following conditions:

1. ρ(α, ·) is a strictly convex function for any α;
2. for any d the minimum point α_min(d) of ρ(·, d) is unique and α_min is a strictly monotone function of d;
3. for any α and d such that α_min(d) = α we have

∀ d_1 < d_2 ≤ d:  [ρ(α, d_2) − ρ(α, d_1)]/(d_2 − d_1) < [ρ(α_min(d_2), d_2) − ρ(α_min(d_1), d_1)]/(d_2 − d_1)

and

∀ d_2 > d_1 ≥ d:  [ρ(α, d_2) − ρ(α, d_1)]/(d_2 − d_1) > [ρ(α_min(d_2), d_2) − ρ(α_min(d_1), d_1)]/(d_2 − d_1);

4. the function ρ(α_1, d) − ρ(α_2, d) is a monotone function of d.

Then:

(i) if there exists d̂ such that sup_{α∈[α_1,α_2]} ρ(α, d̂) = ρ(α_1, d̂) = ρ(α_2, d̂), then d̂ is most stable;
(ii) if d̂ satisfying (i) belongs to L_Γ = {d : ∀x ∈ X ∃α ∈ [α_1, α_2] d(x) = d^{Bay}_α(x)}, then d̂ is conditional Γ-minimax.
We now prove our results.
Theorem 2. If the class of priors is Γ*_{σ_0}, then

θ̂*_{σ_0} = θ̂^{Bay}_{μ_1,σ_0} + (1/a) ln[(exp(a(m_2 − m_1)) − 1)/(a(m_2 − m_1))]

and θ̃*_{σ_0} = θ̂*_{σ_0} for all values x of the random variable X.
P r o o f. Let us check the conditions of Theorem 1 for

ρ*_{σ_0}(m, θ̂) = exp(−aθ̂ + am + aw_n) − a(m + w_n) + a²λ_0/2 + aθ̂ − 1.

The function ρ*_{σ_0}(m, ·) is convex and

∂ρ*_{σ_0}(m, θ̂)/∂m = a exp(−aθ̂ + am + aw_n) − a,

thus the minimum point is m_min(θ̂) = θ̂ − w_n, and m_min is an increasing function of θ̂.

To check condition 3 it is enough to show the inequalities

∀ θ_1 < θ_2 ≤ θ̂:  e^{aθ̂}(e^{−aθ_2} − e^{−aθ_1})/(θ_2 − θ_1) < −a

and

∀ θ_2 > θ_1 ≥ θ̂:  e^{aθ̂}(e^{−aθ_2} − e^{−aθ_1})/(θ_2 − θ_1) > −a.

These hold by the Lagrange mean value formula. The last condition of Theorem 1 is also satisfied; thus θ̂*_{σ_0} is a solution of the equation

ρ*_{σ_0}(m_1, θ̂) = ρ*_{σ_0}(m_2, θ̂).

To obtain the conditional Γ-minimax estimator, note that for all values x of the random variable X we have θ̂*_{σ_0}(x) ∈ [θ̂^{Bay}_{μ_1,σ_0}(x), θ̂^{Bay}_{μ_2,σ_0}(x)].
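A numerical sketch of Theorem 2 (illustrative parameters; w_n and λ_0 are placeholders rather than values recomputed from data) exhibits both the balance equation from the proof and the membership in L_Γ:

import numpy as np

# Illustrative parameters (assumptions): w_n, lambda_0 and the bounds m1 < m2.
a, lam0, wn = 1.5, 0.04, 0.2
m1, m2 = -0.3, 0.3

def rho_star(m, d):                        # posterior risk rho*_{sigma_0}(m, d)
    return np.exp(-a*d + a*m + a*wn) - a*(m + wn) + a**2*lam0/2 + a*d - 1

bayes_m1 = m1 + wn                         # Bayes estimator for the prior with m(mu) = m1
bayes_m2 = m2 + wn                         # Bayes estimator for the prior with m(mu) = m2

# Theorem 2: the most stable (= conditional Gamma-minimax) estimator.
theta_star = bayes_m1 + np.log((np.exp(a*(m2 - m1)) - 1) / (a*(m2 - m1))) / a

print(rho_star(m1, theta_star), rho_star(m2, theta_star))  # equal endpoint risks
print(bayes_m1 <= theta_star <= bayes_m2)                  # lies between the Bayes estimators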
Theorem 3. Let the class of priors be Γ_{μ_0}. Then the most stable estimator θ̂_{μ_0} of θ in the class of all estimators of θ exists only for the values of X satisfying

v_n(v_n + a/2) > 0 or v_n = −a/2.

For v_n(v_n + a/2) > 0,

θ̂_{μ_0} = θ̂^{Bay}_{μ_0,σ_1} + (1/a) ln[(e^{(λ_2−λ_1)(a²/2+av_n)} − 1)/(av_n(λ_2 − λ_1))].

For v_n = −a/2 the range of the posterior risk does not depend on the value of θ̂.

The conditional Γ-minimax estimator is

θ̃_{μ_0} =
  θ̂_{μ_0}            if v_n(v_n + a/2) > 0 and exp[(λ_1 − λ_2)(a²/2 + av_n)] + av_n(λ_2 − λ_1) ≥ 1,
  θ̂^{Bay}_{μ_0,σ_2}  otherwise.

The most stable estimator in the class

L = {θ̂ : ∀x ∃σ ∈ [σ_1, σ_2] θ̂(x) = θ̂^{Bay}_{μ_0,σ}(x)}

is equal to the conditional Γ-minimax estimator in the class of all estimators.
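Before the proof, a numerical sketch of Theorem 3 (parameter values are illustrative assumptions with v_n(v_n + a/2) > 0; for these particular values the condition (∗) below fails, so the conditional Γ-minimax estimator is θ̂^{Bay}_{μ_0,σ_2}):

import numpy as np

# Illustrative parameters (assumptions); here v_n(v_n + a/2) > 0.
a, mu0, vn = 1.5, 0.0, 3.0
lam1, lam2 = 0.02, 0.06
z, delta = a**2/2 + a*vn, lam2 - lam1

def rho_mu0(lam, d):                       # posterior risk rho_{mu_0}(lambda, d)
    return np.exp(-a*d + a*mu0 + z*lam) - a*(mu0 + lam*vn) + a*d - 1

bayes_s1 = mu0 + (a/2 + vn) * lam1         # Bayes estimator at lambda_1 (sigma_1)
bayes_s2 = mu0 + (a/2 + vn) * lam2         # Bayes estimator at lambda_2 (sigma_2)

# Most stable estimator for v_n(v_n + a/2) > 0; it balances the endpoint risks.
theta_mu0 = bayes_s1 + np.log((np.exp(delta*z) - 1) / (a*vn*delta)) / a

cond = np.exp((lam1 - lam2)*z) + a*vn*delta >= 1   # condition (*)
theta_tilde = theta_mu0 if cond else bayes_s2      # conditional Gamma-minimax estimator

print(rho_mu0(lam1, theta_mu0), rho_mu0(lam2, theta_mu0))  # equal endpoint risks
print(cond, theta_tilde)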
P r o o f. Let us check the conditions of Theorem 1 for

ρ_{μ_0}(λ, θ̂) = exp(−aθ̂ + aμ_0 + (a²/2 + av_n)λ) − a(μ_0 + λv_n) + aθ̂ − 1.

The function ρ_{μ_0}(λ, ·) is convex and

∂ρ_{μ_0}(λ, θ̂)/∂λ = (a²/2 + av_n) exp(−aθ̂ + aμ_0 + λ(a²/2 + av_n)) − av_n.

Thus the minimum point is

λ_min(θ̂) = [aθ̂ − aμ_0 + ln(v_n/(a/2 + v_n))]/(a²/2 + av_n),

and λ_min exists iff v_n(v_n + a/2) > 0.
For v_n satisfying v_n(v_n + a/2) ≤ 0 the function ρ_{μ_0}(·, θ̂) is an increasing function of λ and the oscillation of the posterior risk

r_{μ_0}(θ̂) = −av_n(λ_2 − λ_1) + exp(−aθ̂ + aμ_0 + (a²/2 + av_n)λ_1)[exp((a²/2 + av_n)(λ_2 − λ_1)) − 1]

is a monotone function of θ̂ (decreasing for a > 0 and −a/2 < v_n ≤ 0, constant for v_n = −a/2, and increasing for a < 0 and 0 ≤ v_n < −a/2).

Thus the most stable estimator does not exist for v_n(v_n + a/2) ≤ 0 and v_n ≠ −a/2. For v_n = −a/2 the oscillation r_{μ_0}(θ̂) = a²(λ_2 − λ_1)/2 does not depend on the value of θ̂. In these cases the conditional Γ-minimax estimator θ̃_{μ_0} is equal to θ̂^{Bay}_{μ_0,σ_2}.
Let us consider the situation when v_n(v_n + a/2) > 0. The minimum point λ_min and the function ρ_{μ_0}(λ_2, ·) − ρ_{μ_0}(λ_1, ·) are monotone functions of θ̂. The verification of condition 3 of Theorem 1 is similar to that in Theorem 2, so we obtain the most stable estimator as a solution of the equation

ρ_{μ_0}(λ_1, θ̂_{μ_0}) = ρ_{μ_0}(λ_2, θ̂_{μ_0}).
To find the conditional Γ-minimax estimator we check when θ̂_{μ_0} ∈ L. For v_n + a/2 > 0 we have θ̂^{Bay}_{μ_0,σ_1} < θ̂^{Bay}_{μ_0,σ_2}. Solving the inequalities θ̂^{Bay}_{μ_0,σ_1} ≤ θ̂_{μ_0} ≤ θ̂^{Bay}_{μ_0,σ_2} we obtain the condition

(∗)  exp[(λ_1 − λ_2)(a²/2 + av_n)] + av_n(λ_2 − λ_1) ≥ 1.

For v_n + a/2 < 0 we have θ̂^{Bay}_{μ_0,σ_1} > θ̂^{Bay}_{μ_0,σ_2}. Solving the inequalities θ̂^{Bay}_{μ_0,σ_1} ≥ θ̂_{μ_0} ≥ θ̂^{Bay}_{μ_0,σ_2} we also obtain (∗). Thus if v_n(v_n + a/2) > 0 and (∗) is true then θ̃_{μ_0} = θ̂_{μ_0}. If v_n + a/2 > 0 and v_n > 0 and (∗) is not true then

θ̂^{Bay}_{μ_0,σ_1} < θ̂^{Bay}_{μ_0,σ_2} < θ̂_{μ_0}

and

sup_{λ∈[λ_1,λ_2]} ρ_{μ_0}(λ, θ̂) =
  ρ_{μ_0}(λ_2, θ̂)  if θ̂ ≤ θ̂_{μ_0},
  ρ_{μ_0}(λ_1, θ̂)  if θ̂ ≥ θ̂_{μ_0},

and the oscillation r_{μ_0}(θ̂) is a decreasing function for θ̂ < θ̂_{μ_0}.
If v_n + a/2 < 0 and v_n < 0 and (∗) is not true then

θ̂^{Bay}_{μ_0,σ_1} > θ̂^{Bay}_{μ_0,σ_2} > θ̂_{μ_0}

and

sup_{λ∈[λ_1,λ_2]} ρ_{μ_0}(λ, θ̂) =
  ρ_{μ_0}(λ_1, θ̂)  if θ̂ ≤ θ̂_{μ_0},
  ρ_{μ_0}(λ_2, θ̂)  if θ̂ ≥ θ̂_{μ_0},

and the oscillation r_{μ_0}(θ̂) is an increasing function for θ̂ > θ̂_{μ_0}.

Thus if v_n(v_n + a/2) > 0 and (∗) is not true then θ̃_{μ_0} = θ̂^{Bay}_{μ_0,σ_2}, and θ̂^{Bay}_{μ_0,σ_2} is the most stable estimator in the class L. The monotonicity of the function r_{μ_0} shows that θ̂^{Bay}_{μ_0,σ_2} is also the most stable estimator in the class L for v_n(v_n + a/2) ≤ 0.
References
[1] B. Betrò and F. Ruggeri, Conditional Γ-minimax actions under convex losses, Comm. Statist. Theory Methods 21 (1992), 1051–1066.
[2] A. Boratyńska, Stability of Bayesian inference in exponential families, Statist. Probab. Lett. 36 (1997), 173–178.
[3] A. Boratyńska and M. Męczarski, Robust Bayesian estimation in the one-dimensional normal model, Statist. Decisions 12 (1994), 221–230.
[4] A. DasGupta and W. J. Studden, Frequentist behavior of robust Bayes estimates of normal means, Statist. Decisions 7 (1989), 333–361.
[5] M. Męczarski, Stability and conditional Γ-minimaxity in Bayesian inference, Appl. Math. (Warsaw) 22 (1993), 117–122.
[6] M. Męczarski and R. Zieliński, Stability of the Bayesian estimator of the Poisson mean under the inexactly specified gamma prior, Statist. Probab. Lett. 12 (1991), 329–333.
[7] H. R. Varian, A Bayesian approach to real estate assessment, in: Studies in Bayesian Econometrics and Statistics, North-Holland, 1974, 195–208.
[8] A. Zellner, Bayesian estimation and prediction using asymmetric loss functions, J. Amer. Statist. Assoc. 81 (1986), 446–451.
Agata Boratyńska
Institute of Applied Mathematics
University of Warsaw
Banacha 2
02-097 Warszawa, Poland
E-mail: agatab@mimuw.edu.pl

Monika Drozdowicz
Wojciechowskiego 22
02-495 Warszawa, Poland
Received on 2.9.1998;
revised version on 3.12.1998