A. BORATYŃSKA and M. DROZDOWICZ (Warszawa)
ROBUST BAYESIAN ESTIMATION IN A NORMAL MODEL WITH ASYMMETRIC LOSS FUNCTION
Abstract. The problem of robust Bayesian estimation in a normal model with an asymmetric loss function (LINEX) is considered. Some uncertainty about the prior is assumed by introducing two classes of priors. The most robust and conditional Γ-minimax estimators are constructed. The situations when those estimators coincide are presented.

1991 Mathematics Subject Classification: Primary 62C10; Secondary 62F15, 62F35.
Key words and phrases: Bayes estimators, classes of priors, robust Bayesian estimation, asymmetric loss function.
1. Introduction and notation. In Bayesian statistical inference the goal is to find optimal decisions under a specified loss function and a prior distribution over the parameter space. However, the arbitrariness of a unique prior distribution is a permanent problem. Robust Bayesian inference deals with the problem of expressing uncertainty in the prior information by means of a class Γ of priors, and of measuring the range of a posterior quantity as the prior distribution Π runs over the class Γ. Interest lies not only in calculating the range but also in constructing optimal procedures.
In the problem of estimation of an unknown parameter, two concepts of optimality are considered: the idea of conditional Γ-minimax estimators (see DasGupta and Studden [4], Betrò and Ruggeri [1]) and the idea of stable estimators developed in Męczarski and Zieliński [6] and Boratyńska and Męczarski [3]. The first concept is connected with the efficiency of the estimator with respect to the posterior risk when the priors run over Γ. The second one is connected with finding an estimator with the smallest oscillation of the posterior risk when the priors run over Γ. Sometimes those two estimators coincide (see Męczarski [5] and Boratyńska [2]).
In all the papers mentioned above the quadratic loss function was considered. However, in many situations a quadratic loss function seems inappropriate in that it assigns the same loss to overestimates as to equal underestimates.
In this paper we estimate an unknown parameter θ and consider the asymmetric loss function (LINEX)
L(θ, d) = exp(a(θ − d)) − a(θ − d) − 1,
where a is a known parameter and a ≠ 0. Exhaustive motivations for using the LINEX loss are presented in Varian [7] and Zellner [8]. We find the conditional Γ-minimax estimators and the stable estimators, and present conditions under which those estimators coincide, in a normal model with the two classes of conjugate priors given below.
Let X_1, . . . , X_n be i.i.d. random variables with the normal N(θ, b²) distribution, where θ is unknown and b² is known. Set X = (X_1, . . . , X_n). Let Π_{μ_0,σ_0} = N(μ_0, σ_0²) be a fixed prior distribution of θ. Define

X̄ = (1/n) ∑_{i=1}^n X_i,   v_n = n(X̄ − μ_0)/b²,   λ = λ(σ) = (1/σ² + n/b²)^{−1},

m = m(μ) = μ(1 − (n/b²)(1/σ_0² + n/b²)^{−1}),   w_n = (a/2 + nX̄/b²)(1/σ_0² + n/b²)^{−1}.
If X = x then the posterior distribution is the normal distribution

N(μ_0 + v_nλ_0, λ_0) = N(m_0 + w_n − aλ_0/2, λ_0),

where λ_0 = λ(σ_0) and m_0 = m(μ_0). The posterior risk of an estimator θ̂ under the LINEX loss function is equal to

Ee^{a(θ−θ̂)} − aEθ + aθ̂ − 1,

where Ey(θ) denotes the expected value of a function y(θ) when θ has the posterior distribution. Thus under the prior Π_{μ_0,σ_0},

Ee^{aθ} = exp(aμ_0 + (a²/2 + av_n)λ_0) = exp(am_0 + aw_n)

and

Eθ = μ_0 + v_nλ_0 = m_0 + w_n − aλ_0/2.

The minimum of the posterior risk as a function of θ̂ is attained at θ̂ = (1/a) ln Ee^{aθ}. Thus the Bayes estimator under the LINEX loss function is given by the formula

θ̂^{Bay}_{μ_0,σ_0} = (1/a) ln Ee^{aθ} = μ_0 + (a/2 + v_n)λ_0 = m_0 + w_n.
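For illustration (a sketch, not part of the original paper; the data and the values of a, b, μ_0, σ_0, n below are our own assumptions), the closed form of the Bayes estimator can be compared with a direct grid minimization of the posterior risk, e.g. in Python:

import numpy as np

# Illustrative setup; all numerical values are assumptions, not taken from the paper.
rng = np.random.default_rng(0)
a, b, mu0, sigma0, n = 1.5, 2.0, 0.0, 1.0, 25
x = rng.normal(0.7, b, size=n)             # sample from N(theta, b^2) with theta = 0.7

xbar = x.mean()
vn = n * (xbar - mu0) / b**2               # v_n = n(xbar - mu_0)/b^2
lam0 = 1.0 / (1.0 / sigma0**2 + n / b**2)  # lambda_0 = (1/sigma_0^2 + n/b^2)^(-1)
post_mean = mu0 + vn * lam0                # posterior mean; posterior variance is lam0

# Closed form: Bayes estimator under LINEX loss, (1/a) ln E exp(a*theta).
theta_bayes = mu0 + (a / 2 + vn) * lam0

# Direct check: for a normal posterior, E exp(a(theta - d)) = exp(a(post_mean - d) + a^2*lam0/2),
# so the posterior risk is available in closed form and can be minimized on a grid.
d = np.linspace(post_mean - 3.0, post_mean + 3.0, 200001)
risk = np.exp(a * (post_mean - d) + a**2 * lam0 / 2) - a * (post_mean - d) - 1
print(theta_bayes, d[np.argmin(risk)])     # the two values agree to grid accuracy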
Now suppose that the prior distribution is not exactly specified, and consider two classes of prior distributions of θ:

Γ_{μ_0} = {Π_{μ_0,σ} : Π_{μ_0,σ} = N(μ_0, σ²), σ ∈ (σ_1, σ_2)},

where σ_1 < σ_2 are fixed and σ_0 ∈ (σ_1, σ_2), and

Γ*_{σ_0} = {Π_{μ,σ_0} : Π_{μ,σ_0} = N(μ, σ_0²), μ ∈ (μ_1, μ_2)},

where μ_1 < μ_2 are fixed and μ_0 ∈ (μ_1, μ_2). The classes Γ_{μ_0} and Γ*_{σ_0} express two types of uncertainty about the elicited prior.
Let R_x(μ, σ, θ̂) denote the posterior risk of the estimator θ̂ when the prior is normal N(μ, σ²). The posterior risk can be expressed by two formulas, as a function of λ and as a function of m:

R_x(μ_0, σ, θ̂) = ρ_{μ_0}(λ, θ̂) = exp(−aθ̂ + aμ_0 + (a²/2 + av_n)λ) − a(μ_0 + λv_n) + aθ̂ − 1

and

R_x(μ, σ_0, θ̂) = ρ*_{σ_0}(m, θ̂) = exp(−aθ̂ + am + aw_n) − a(m + w_n) + a²λ_0/2 + aθ̂ − 1.
Observe that λ is an increasing function of σ, and therefore if σ ∈ (σ_1, σ_2) then λ ∈ (λ_1, λ_2), where λ_i = λ(σ_i), i = 1, 2. Similarly, m is an increasing function of μ, and therefore if μ ∈ (μ_1, μ_2) then m ∈ (m_1, m_2), where m_i = m(μ_i), i = 1, 2. The ranges of the posterior risk of the estimator θ̂ when the prior runs over Γ_{μ_0} and Γ*_{σ_0} are

r_{μ_0}(θ̂) = sup_{λ∈(λ_1,λ_2)} ρ_{μ_0}(λ, θ̂) − inf_{λ∈(λ_1,λ_2)} ρ_{μ_0}(λ, θ̂)

and

r*_{σ_0}(θ̂) = sup_{m∈(m_1,m_2)} ρ*_{σ_0}(m, θ̂) − inf_{m∈(m_1,m_2)} ρ*_{σ_0}(m, θ̂),

respectively.
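Both parametrizations of the posterior risk, and hence both ranges, are straightforward to evaluate numerically. A brute-force sketch (the inputs below are hypothetical placeholders for quantities computed from the data; in particular w_n, λ_0 and the bounds are not recomputed here):

import numpy as np

# Hypothetical inputs standing in for a, lambda_0, mu_0, v_n, w_n and the bounds.
a, lam0, mu0, vn, wn = 1.5, 0.04, 0.0, 3.0, 0.2
lam1, lam2 = 0.02, 0.06                    # lambda(sigma_1), lambda(sigma_2)
m1, m2 = -0.3, 0.3                         # m(mu_1), m(mu_2)

def rho_mu0(lam, d):                       # posterior risk as a function of lambda
    return np.exp(-a*d + a*mu0 + (a**2/2 + a*vn)*lam) - a*(mu0 + lam*vn) + a*d - 1

def rho_star(m, d):                        # posterior risk as a function of m
    return np.exp(-a*d + a*m + a*wn) - a*(m + wn) + a**2*lam0/2 + a*d - 1

def r_mu0(d, k=100001):                    # oscillation over lambda in (lam1, lam2)
    vals = rho_mu0(np.linspace(lam1, lam2, k), d)
    return vals.max() - vals.min()

def r_star(d, k=100001):                   # oscillation over m in (m1, m2)
    vals = rho_star(np.linspace(m1, m2, k), d)
    return vals.max() - vals.min()

print(r_mu0(0.1), r_star(0.1))             # ranges at a trial estimate d = 0.1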
2. The range of the posterior risk for the Bayes estimator.
Consider the prior Π_{μ_0,σ_0}, note that Π_{μ_0,σ_0} ∈ Γ_{μ_0} and Π_{μ_0,σ_0} ∈ Γ*_{σ_0}, and consider the Bayes estimator

θ̂^{Bay}_{μ_0,σ_0} = μ_0 + (a/2 + v_n)λ_0 = m_0 + w_n.

The posterior risk of this estimator under an arbitrary prior Π_{μ_0,σ} ∈ Γ_{μ_0} is

ρ_{μ_0}(λ, θ̂^{Bay}_{μ_0,σ_0}) = exp((a²/2 + av_n)(λ − λ_0)) − av_n(λ − λ_0) + a²λ_0/2 − 1.
Denote it by f(λ). Now computations lead to the following form of the oscillation of ρ_{μ_0} for θ̂^{Bay}_{μ_0,σ_0} while λ runs over (λ_1, λ_2):

r_{μ_0}(θ̂^{Bay}_{μ_0,σ_0}) =
  f(λ_2) − f(λ_1)  if −a/2 ≤ v_n < 0 and a > 0, or 0 < v_n ≤ −a/2 and a < 0, or λ̂ < λ_1,
  f(λ_2) − f(λ̂)   otherwise,

where

λ̂ = λ_0 + (a²/2 + av_n)^{−1} ln[v_n/(a/2 + v_n)].

Thus

r_{μ_0}(θ̂^{Bay}_{μ_0,σ_0}) =
  e^{z(λ_1−λ_0)}[e^{zδ} − 1] − av_nδ     if −a/2 < v_n < 0 and a > 0, or 0 < v_n ≤ −a/2 and a < 0, or λ̂ < λ_1,
  a²δ/2                                  if v_n = −a/2,
  e^{z(λ_2−λ_0)} + av_n(λ̂ − λ_2 − 1/z)  otherwise,

where z = a²/2 + av_n and δ = λ_2 − λ_1.
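This piecewise closed form can be checked against a brute-force evaluation of the oscillation (a sketch with illustrative parameter values of our own choosing; these particular values fall into the first branch, with λ̂ < λ_1):

import numpy as np

# Illustrative parameters (assumptions, not from the paper); here v_n > 0 and a > 0.
a, lam0, lam1, lam2, vn = 1.5, 0.04, 0.02, 0.06, 3.0
z, delta = a**2/2 + a*vn, lam2 - lam1

def f(lam):                                # rho_mu0(lambda, Bayes estimator)
    return np.exp(z*(lam - lam0)) - a*vn*(lam - lam0) + a**2*lam0/2 - 1

lam_hat = lam0 + np.log(vn / (a/2 + vn)) / z     # interior minimum point of f

# The piecewise closed form for the oscillation r_mu0 of the Bayes estimator.
if vn == -a/2:
    r = a**2 * delta / 2
elif (a > 0 and -a/2 < vn < 0) or (a < 0 and 0 < vn <= -a/2) or lam_hat < lam1:
    r = np.exp(z*(lam1 - lam0)) * (np.exp(z*delta) - 1) - a*vn*delta
else:
    r = np.exp(z*(lam2 - lam0)) + a*vn*(lam_hat - lam2 - 1/z)

lam = np.linspace(lam1, lam2, 200001)            # brute-force sup - inf on a fine grid
print(r, f(lam).max() - f(lam).min())            # the two values agree to grid accuracy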
Consider the class Γ*_{σ_0}. The posterior risk of this estimator under an arbitrary prior Π_{μ,σ_0} ∈ Γ*_{σ_0} is

ρ*_{σ_0}(m, θ̂^{Bay}_{μ_0,σ_0}) = e^{−a(m_0−m)} + a(m_0 − m) + a²λ_0/2 − 1

and the oscillation of ρ*_{σ_0} is equal to

r*_{σ_0}(θ̂^{Bay}_{μ_0,σ_0}) =
  e^{−a(m_0−m_2)} + a(m_0 − m_2) − 1  for m_0 ≤ m̂,
  e^{−a(m_0−m_1)} + a(m_0 − m_1) − 1  for m_0 > m̂,

where

m̂ = m_1 + (1/a) ln[(exp(a(m_2 − m_1)) − 1)/(a(m_2 − m_1))].
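Again the closed form is easy to confirm numerically: the infimum of ρ*_{σ_0}(·, θ̂^{Bay}_{μ_0,σ_0}) is attained at m = m_0, the supremum at the endpoint selected by the threshold m̂. A sketch (parameter values are illustrative assumptions):

import numpy as np

# Illustrative parameters (assumptions): bounds m1 < m0 < m2 for the prior mean part.
a, lam0 = 1.5, 0.04
m1, m0, m2 = -0.3, 0.05, 0.3

def g(m):                                  # rho*_{sigma_0}(m, Bayes estimator)
    return np.exp(-a*(m0 - m)) + a*(m0 - m) + a**2*lam0/2 - 1

m_hat = m1 + np.log((np.exp(a*(m2 - m1)) - 1) / (a*(m2 - m1))) / a

if m0 <= m_hat:                            # closed form for the oscillation
    r = np.exp(-a*(m0 - m2)) + a*(m0 - m2) - 1
else:
    r = np.exp(-a*(m0 - m1)) + a*(m0 - m1) - 1

m = np.linspace(m1, m2, 200001)            # brute-force sup - inf on a fine grid
print(r, g(m).max() - g(m).min())          # the two values agree to grid accuracy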
3. Most stable and conditional Γ-minimax estimators. Now the problem is to find the most stable estimators θ̂_{μ_0} and θ̂*_{σ_0}, i.e. those satisfying

inf_{θ̂} r_{μ_0}(θ̂) = r_{μ_0}(θ̂_{μ_0})   and   inf_{θ̂} r*_{σ_0}(θ̂) = r*_{σ_0}(θ̂*_{σ_0}),

and to find the conditional Γ-minimax estimators θ̃_{μ_0} and θ̃*_{σ_0}, i.e. those satisfying

inf_{θ̂} sup_{σ∈[σ_1,σ_2]} R_x(μ_0, σ, θ̂) = sup_{σ∈[σ_1,σ_2]} R_x(μ_0, σ, θ̃_{μ_0})

and

inf_{θ̂} sup_{μ∈[μ_1,μ_2]} R_x(μ, σ_0, θ̂) = sup_{μ∈[μ_1,μ_2]} R_x(μ, σ_0, θ̃*_{σ_0}).
We use the following theorem proved by Męczarski [5].

Theorem 1 (Męczarski [5]). Let Γ = {Π_α : α ∈ [α_1, α_2]} be a set of prior distributions, where α is a real parameter. Let ρ(α, d) be the posterior risk of a decision d based on an observation x when the prior is Π_α. Assume that the function ρ(α, d) satisfies the following conditions:

1. ρ(α, ·) is a strictly convex function for any α;
2. for any d the minimum point α_min(d) of ρ(·, d) is unique and α_min is a strictly monotone function of d;
3. for any α and d such that α_min(d) = α we have

∀ d_1 < d_2 ≤ d:  [ρ(α, d_2) − ρ(α, d_1)]/(d_2 − d_1) < [ρ(α_min(d_2), d_2) − ρ(α_min(d_1), d_1)]/(d_2 − d_1)

and

∀ d_2 > d_1 ≥ d:  [ρ(α, d_2) − ρ(α, d_1)]/(d_2 − d_1) > [ρ(α_min(d_2), d_2) − ρ(α_min(d_1), d_1)]/(d_2 − d_1);

4. the function ρ(α_1, d) − ρ(α_2, d) is a monotone function of d.

Then:

(i) if there exists d̂ such that sup_{α∈[α_1,α_2]} ρ(α, d̂) = ρ(α_1, d̂) = ρ(α_2, d̂), then d̂ is most stable;
(ii) if d̂ satisfying (i) belongs to L_Γ = {d : ∀x ∈ X ∃α ∈ [α_1, α_2] d(x) = d^{Bay}_α(x)}, then d̂ is conditional Γ-minimax.
We now prove our results.
Theorem 2. If the class of priors is Γ*_{σ_0}, then

θ̂*_{σ_0} = θ̂^{Bay}_{μ_1,σ_0} + (1/a) ln[(exp(a(m_2 − m_1)) − 1)/(a(m_2 − m_1))]

and θ̃*_{σ_0} = θ̂*_{σ_0} for all values x of the random variable X.
P r o o f. Let us check the conditions of Theorem 1 for

ρ*_{σ_0}(m, θ̂) = exp(−aθ̂ + am + aw_n) − a(m + w_n) + a²λ_0/2 + aθ̂ − 1.

The function ρ*_{σ_0}(m, ·) is convex and

∂ρ*_{σ_0}(m, θ̂)/∂m = a exp(−aθ̂ + am + aw_n) − a,

thus the minimum point is m_min(θ̂) = θ̂ − w_n, and m_min is an increasing function of θ̂.

To check condition 3 it is enough to show the inequalities

∀ θ_1 < θ_2 ≤ θ̂:  e^{aθ̂}(e^{−aθ_2} − e^{−aθ_1})/(θ_2 − θ_1) < −a

and

∀ θ_2 > θ_1 ≥ θ̂:  e^{aθ̂}(e^{−aθ_2} − e^{−aθ_1})/(θ_2 − θ_1) > −a.

These hold by the Lagrange mean value formula. The last condition of Theorem 1 is also satisfied; thus θ̂*_{σ_0} is a solution of the equation

ρ*_{σ_0}(m_1, θ̂) = ρ*_{σ_0}(m_2, θ̂).

To obtain the conditional Γ-minimax estimator, note that for all values x of the random variable X we have θ̂*_{σ_0}(x) ∈ [θ̂^{Bay}_{μ_1,σ_0}(x), θ̂^{Bay}_{μ_2,σ_0}(x)].
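A numerical sketch of Theorem 2 (illustrative parameters; w_n and λ_0 are placeholders rather than values recomputed from data) exhibits both the balance equation from the proof and the membership in L_Γ:

import numpy as np

# Illustrative parameters (assumptions): w_n, lambda_0 and the bounds m1 < m2.
a, lam0, wn = 1.5, 0.04, 0.2
m1, m2 = -0.3, 0.3

def rho_star(m, d):                        # posterior risk rho*_{sigma_0}(m, d)
    return np.exp(-a*d + a*m + a*wn) - a*(m + wn) + a**2*lam0/2 + a*d - 1

bayes_m1 = m1 + wn                         # Bayes estimator for the prior with m(mu) = m1
bayes_m2 = m2 + wn                         # Bayes estimator for the prior with m(mu) = m2

# Theorem 2: the most stable (= conditional Gamma-minimax) estimator.
theta_star = bayes_m1 + np.log((np.exp(a*(m2 - m1)) - 1) / (a*(m2 - m1))) / a

print(rho_star(m1, theta_star), rho_star(m2, theta_star))  # equal endpoint risks
print(bayes_m1 <= theta_star <= bayes_m2)                  # lies between the Bayes estimators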
Theorem 3. Let the class of priors be Γ_{μ_0}. Then the most stable estimator θ̂_{μ_0} of θ in the class of all estimators of θ exists only for the values of X satisfying

v_n(v_n + a/2) > 0 or v_n = −a/2.

For v_n(v_n + a/2) > 0,

θ̂_{μ_0} = θ̂^{Bay}_{μ_0,σ_1} + (1/a) ln[(e^{(λ_2−λ_1)(a²/2+av_n)} − 1)/(av_n(λ_2 − λ_1))].

For v_n = −a/2 the range of the posterior risk does not depend on the value of θ̂.

The conditional Γ-minimax estimator is

θ̃_{μ_0} =
  θ̂_{μ_0}            if v_n(v_n + a/2) > 0 and exp[(λ_1 − λ_2)(a²/2 + av_n)] + av_n(λ_2 − λ_1) ≥ 1,
  θ̂^{Bay}_{μ_0,σ_2}  otherwise.

The most stable estimator in the class

L = {θ̂ : ∀x ∃σ ∈ [σ_1, σ_2] θ̂(x) = θ̂^{Bay}_{μ_0,σ}(x)}

is equal to the conditional Γ-minimax estimator in the class of all estimators.
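Before the proof, a numerical sketch of Theorem 3 (parameter values are illustrative assumptions with v_n(v_n + a/2) > 0; for these particular values the condition (∗) below fails, so the conditional Γ-minimax estimator is θ̂^{Bay}_{μ_0,σ_2}):

import numpy as np

# Illustrative parameters (assumptions); here v_n(v_n + a/2) > 0.
a, mu0, vn = 1.5, 0.0, 3.0
lam1, lam2 = 0.02, 0.06
z, delta = a**2/2 + a*vn, lam2 - lam1

def rho_mu0(lam, d):                       # posterior risk rho_{mu_0}(lambda, d)
    return np.exp(-a*d + a*mu0 + z*lam) - a*(mu0 + lam*vn) + a*d - 1

bayes_s1 = mu0 + (a/2 + vn) * lam1         # Bayes estimator at lambda_1 (sigma_1)
bayes_s2 = mu0 + (a/2 + vn) * lam2         # Bayes estimator at lambda_2 (sigma_2)

# Most stable estimator for v_n(v_n + a/2) > 0; it balances the endpoint risks.
theta_mu0 = bayes_s1 + np.log((np.exp(delta*z) - 1) / (a*vn*delta)) / a

cond = np.exp((lam1 - lam2)*z) + a*vn*delta >= 1   # condition (*)
theta_tilde = theta_mu0 if cond else bayes_s2      # conditional Gamma-minimax estimator

print(rho_mu0(lam1, theta_mu0), rho_mu0(lam2, theta_mu0))  # equal endpoint risks
print(cond, theta_tilde)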
P r o o f. Let us check the conditions of Theorem 1 for

ρ_{μ_0}(λ, θ̂) = exp(−aθ̂ + aμ_0 + (a²/2 + av_n)λ) − a(μ_0 + λv_n) + aθ̂ − 1.

The function ρ_{μ_0}(λ, ·) is convex and

∂ρ_{μ_0}(λ, θ̂)/∂λ = (a²/2 + av_n) exp(−aθ̂ + aμ_0 + λ(a²/2 + av_n)) − av_n.

Thus the minimum point is

λ_min(θ̂) = [aθ̂ − aμ_0 + ln(v_n/(a/2 + v_n))]/(a²/2 + av_n),

and λ_min exists iff v_n(v_n + a/2) > 0.
For v_n satisfying v_n(v_n + a/2) ≤ 0 the function ρ_{μ_0}(·, θ̂) is an increasing function of λ and the oscillation of the posterior risk

r_{μ_0}(θ̂) = −av_n(λ_2 − λ_1) + exp(−aθ̂ + aμ_0 + (a²/2 + av_n)λ_1)[exp((a²/2 + av_n)(λ_2 − λ_1)) − 1]

is a monotone function of θ̂ (decreasing for a > 0 and −a/2 < v_n ≤ 0, constant for v_n = −a/2, and increasing for a < 0 and 0 ≤ v_n < −a/2).

Thus the most stable estimator does not exist for v_n(v_n + a/2) ≤ 0 and v_n ≠ −a/2. For v_n = −a/2 the oscillation r_{μ_0}(θ̂) = a²(λ_2 − λ_1)/2 does not depend on the value of θ̂. In these cases the conditional Γ-minimax estimator θ̃_{μ_0} is equal to θ̂^{Bay}_{μ_0,σ_2}.
Let us consider the situation when v_n(v_n + a/2) > 0. The minimum point λ_min and the function ρ_{μ_0}(λ_2, ·) − ρ_{μ_0}(λ_1, ·) are monotone functions of θ̂. The verification of condition 3 of Theorem 1 is similar to that in Theorem 2, so we obtain the most stable estimator as a solution of the equation

ρ_{μ_0}(λ_1, θ̂_{μ_0}) = ρ_{μ_0}(λ_2, θ̂_{μ_0}).
To find the conditional Γ-minimax estimator we check when θ̂_{μ_0} ∈ L. For v_n + a/2 > 0 we have θ̂^{Bay}_{μ_0,σ_1} < θ̂^{Bay}_{μ_0,σ_2}. Solving the inequalities θ̂^{Bay}_{μ_0,σ_1} ≤ θ̂_{μ_0} ≤ θ̂^{Bay}_{μ_0,σ_2} we obtain the condition

(∗)  exp[(λ_1 − λ_2)(a²/2 + av_n)] + av_n(λ_2 − λ_1) ≥ 1.

For v_n + a/2 < 0 we have θ̂^{Bay}_{μ_0,σ_1} > θ̂^{Bay}_{μ_0,σ_2}. Solving the inequalities θ̂^{Bay}_{μ_0,σ_1} ≥ θ̂_{μ_0} ≥ θ̂^{Bay}_{μ_0,σ_2} we also obtain (∗). Thus if v_n(v_n + a/2) > 0 and (∗) is true then θ̃_{μ_0} = θ̂_{μ_0}. If v_n + a/2 > 0 and v_n > 0 and (∗) is not true then

θ̂^{Bay}_{μ_0,σ_1} < θ̂^{Bay}_{μ_0,σ_2} < θ̂_{μ_0}

and

sup_{λ∈[λ_1,λ_2]} ρ_{μ_0}(λ, θ̂) =
  ρ_{μ_0}(λ_2, θ̂)  if θ̂ ≤ θ̂_{μ_0},
  ρ_{μ_0}(λ_1, θ̂)  if θ̂ ≥ θ̂_{μ_0},

and the oscillation r_{μ_0}(θ̂) is a decreasing function for θ̂ < θ̂_{μ_0}.
If v_n + a/2 < 0 and v_n < 0 and (∗) is not true then

θ̂^{Bay}_{μ_0,σ_1} > θ̂^{Bay}_{μ_0,σ_2} > θ̂_{μ_0}

and

sup_{λ∈[λ_1,λ_2]} ρ_{μ_0}(λ, θ̂) =
  ρ_{μ_0}(λ_1, θ̂)  if θ̂ ≤ θ̂_{μ_0},
  ρ_{μ_0}(λ_2, θ̂)  if θ̂ ≥ θ̂_{μ_0},

and the oscillation r_{μ_0}(θ̂) is an increasing function for θ̂ > θ̂_{μ_0}.

Thus if v_n(v_n + a/2) > 0 and (∗) is not true then θ̃_{μ_0} = θ̂^{Bay}_{μ_0,σ_2}, and θ̂^{Bay}_{μ_0,σ_2} is the most stable estimator in the class L. The monotonicity of the function r_{μ_0} shows that θ̂^{Bay}_{μ_0,σ_2} is also the most stable estimator in the class L for v_n(v_n + a/2) ≤ 0.
References
[1] B. Betrò and F. Ruggeri, Conditional Γ-minimax actions under convex losses, Comm. Statist. Theory Methods 21 (1992), 1051–1066.
[2] A. Boratyńska, Stability of Bayesian inference in exponential families, Statist. Probab. Lett. 36 (1997), 173–178.
[3] A. Boratyńska and M. Męczarski, Robust Bayesian estimation in the one-dimensional normal model, Statist. Decisions 12 (1994), 221–230.
[4] A. DasGupta and W. J. Studden, Frequentist behavior of robust Bayes estimates of normal means, Statist. Decisions 7 (1989), 333–361.
[5] M. Męczarski, Stability and conditional Γ-minimaxity in Bayesian inference, Appl. Math. (Warsaw) 22 (1993), 117–122.
[6] M. Męczarski and R. Zieliński, Stability of the Bayesian estimator of the Poisson mean under the inexactly specified gamma prior, Statist. Probab. Lett. 12 (1991), 329–333.
[7] H. R. Varian, A Bayesian approach to real estate assessment, in: Studies in Bayesian Econometrics and Statistics, North-Holland, 1974, 195–208.
[8] A. Zellner, Bayesian estimation and prediction using asymmetric loss functions, J. Amer. Statist. Assoc. 81 (1986), 446–451.
Agata Boratyńska
Institute of Applied Mathematics
University of Warsaw
Banacha 2
02-097 Warszawa, Poland
E-mail: agatab@mimuw.edu.pl

Monika Drozdowicz
Wojciechowskiego 22
02-495 Warszawa, Poland
Received on 2.9.1998;
revised version on 3.12.1998