
ACTA UNIVERSITATIS LODZIENSIS
FOLIA OECONOMICA 194, 2005

Tomasz Żądło*

ON SYNTHETIC RATIO ESTIMATOR BASED ON SUPERPOPULATION APPROACH

Abstract

In the paper, properties of a predictor of the form of the synthetic ratio estimator of a domain total, known from the randomisation approach, are considered. The proof of its $\xi$-unbiasedness for the simple regression superpopulation model in strata is shown. For this model the BLU predictor is also presented. Equations of prediction variances of both predictors are derived. For the considered predictors the problem of model misspecification is studied and equations of prediction mean square errors are derived. The comparison of accuracy is supported by a simulation study.

Key words: small area statistics, superpopulation approach, model misspecification, $\xi$-bias.

I. INTRODUCTION

Let a population $\Omega$ of size $N$ be divided into $C$ strata denoted by $\Omega_c$, each of size $N_c$ (where $c = 1, \ldots, C$), and $D$ domains $\Omega_d$, each of size $N_d$ (where $d = 1, \ldots, D$). One domain can be a part of more than one stratum. Sets $\Omega_c \cap \Omega_d$ will be denoted by $\Omega_{cd}$ and their sizes by $N_{cd}$. From each stratum a sample $s_c$ of size $n_c$ is drawn. Let sets $s_c \cap \Omega_d$ be denoted by $s_{cd}$ and their sizes by $n_{cd}$. Let us introduce additional symbols: $\bigcup_{c=1}^{C} s_c = s$, $\sum_{c=1}^{C} n_c = n$, $\Omega_{rc} = \Omega_c - s_c$, $N_{rc} = N_c - n_c$, $\Omega_{rd} = \Omega_d - s_d$, $N_{rd} = N_d - n_d$, $\Omega_{rcd} = \Omega_{cd} - s_{cd}$, $N_{rcd} = N_{cd} - n_{cd}$. Let us stress that the subscript $d*$ will denote the domain of interest, whose total value $T_{d*} = \sum_{i \in \Omega_{d*}} Y_i$ is estimated.
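The cross-classification of strata and domains introduced above can be illustrated by a minimal Python sketch. The function name `cell_sizes` and the toy labels are hypothetical and serve only to show how $N_{cd}$, $n_{cd}$ and $N_{rcd} = N_{cd} - n_{cd}$ can be obtained from unit-level stratum, domain and sample-membership indicators.

```python
from collections import Counter

def cell_sizes(stratum, domain, sampled):
    """Return (N_cd, n_cd, N_rcd) keyed by (stratum, domain) cells."""
    # N_cd: number of population elements in stratum c and domain d
    N_cd = Counter(zip(stratum, domain))
    # n_cd: number of sampled elements in stratum c and domain d
    n_cd = Counter((c, d) for c, d, s in zip(stratum, domain, sampled) if s)
    # N_rcd: number of non-sampled elements in the cell
    N_rcd = {cell: N_cd[cell] - n_cd[cell] for cell in N_cd}
    return N_cd, n_cd, N_rcd

# Toy population (hypothetical): 8 elements, 2 strata, 2 domains
stratum = [1, 1, 1, 1, 2, 2, 2, 2]
domain = [1, 1, 2, 2, 1, 2, 2, 2]
sampled = [True, False, True, False, False, True, True, False]

N_cd, n_cd, N_rcd = cell_sizes(stratum, domain, sampled)
```

In this toy example the cell of stratum 2 and domain 1 contains one element and none of them is sampled, which is the situation in which a synthetic predictor is most useful.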

II. SIMPLE REGRESSION SUPERPOPULATION MODEL IN STRATA

Let us consider the simple regression superpopulation model in strata with the assumption:

$$E_\xi(Y_{ci}) = \beta_c x_{ci}. \tag{1}$$

Let us add that $\beta_c$ is unknown and $x_1, \ldots, x_N$ are known. What is more, for the considered superpopulation model and for the other superpopulation models assumed for strata, which will be discussed in the following parts of the paper, it is assumed that the random variables $Y_1, \ldots, Y_N$ are independent and:

$$D^2_\xi(Y_{ci}) = D^2_\xi(\varepsilon_{ci}) = \sigma_c^2 v(x_{ci}), \tag{2}$$

where $\varepsilon_{ci} = Y_{ci} - \beta_c x_{ci}$, $E_\xi(\varepsilon_{ci}) = 0$ and $v(\cdot)$ denotes values of a known function of the auxiliary variable.

Let us introduce the predictor of the domain total of the form of the synthetic ratio estimator known from the randomization approach. For the considered stratified random sampling it is as follows (e.g. Bracha, 1994; Bracha, 1996; Getka-Wilczyńska, 2000; Wywiał, Żądło, 2003):

$$\hat{T}^{SYN-rat}_{d*} = \sum_{c=1}^{C} \frac{\sum_{i \in s_c} Y_i / \pi_i}{\sum_{i \in s_c} x_i / \pi_i} X_{cd*}, \tag{3}$$

where $\pi_i$ denotes the first order inclusion probability of the $i$-th element and $X_{cd*} = \sum_{i \in \Omega_{cd*}} x_i$.

Let us notice that for the assumed superpopulation model:

$$E_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \left( \beta_c X_{cd*} - \beta_c X_{cd*} \right) = 0.$$

It was thus proved that the predictor of the form of the synthetic ratio estimator is $\xi$-unbiased for the simple regression superpopulation model assumed for strata.

What should be stressed is that the predictor of the form of the synthetic ratio estimator (3) does not have minimal prediction variance among all linear $\xi$-unbiased predictors for the simple regression superpopulation model assumed for strata. From Royall's theorem (1976) it is known that the BLU predictor for the considered superpopulation model with assumptions (1) and (2) is as follows:

$$\hat{T}^{BLU-rat}_{d*} = \sum_{c=1}^{C} \left( Y_{scd*} + \hat{\beta}_c X_{rcd*} \right), \tag{4}$$

where

$$\hat{\beta}_c = \frac{\sum_{i \in s_c} x_i Y_i / v(x_i)}{\sum_{i \in s_c} x_i^2 / v(x_i)}, \qquad Y_{scd*} = \sum_{i \in s_{cd*}} Y_i, \qquad X_{rcd*} = \sum_{i \in \Omega_{rcd*}} x_i.$$

Let inclusion probabilities in strata be constant (e.g. a simple random sample without replacement is drawn from each stratum) and $\forall_i\, v(x_i) = x_i$. Hence:

$$\hat{T}^{SYN-rat}_{d*} = \sum_{c=1}^{C} \frac{Y_{sc}}{X_{sc}} X_{cd*}, \tag{5}$$

where $Y_{sc} = \sum_{i \in s_c} Y_i$, $X_{sc} = \sum_{i \in s_c} x_i$, and

$$\hat{T}^{BLU-rat}_{d*} = \sum_{c=1}^{C} \left( Y_{scd*} + \frac{Y_{sc}}{X_{sc}} X_{rcd*} \right). \tag{6}$$
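Under the assumptions just stated, i.e. constant inclusion probabilities in strata and $v(x_i) = x_i$, predictors (5) and (6) reduce to simple sums. The following Python sketch, with hypothetical function and variable names, computes both of them for a chosen domain. Values of the variable of interest are needed only for sampled elements, so they are set to `None` elsewhere.

```python
def ratio_predictors(x, y, stratum, domain, sampled, d_star):
    """Return (T_syn, T_blu) for domain d_star, following eqs (5) and (6)."""
    t_syn = 0.0
    t_blu = 0.0
    for c in sorted(set(stratum)):
        idx = [i for i in range(len(x)) if stratum[i] == c]
        s_c = [i for i in idx if sampled[i]]
        X_sc = sum(x[i] for i in s_c)   # sample total of x in stratum c
        Y_sc = sum(y[i] for i in s_c)   # sample total of y in stratum c
        ratio = Y_sc / X_sc
        # totals of x over the whole cell and over its non-sampled part
        X_cd = sum(x[i] for i in idx if domain[i] == d_star)
        X_rcd = sum(x[i] for i in idx
                    if domain[i] == d_star and not sampled[i])
        # observed part of the domain total in stratum c
        Y_scd = sum(y[i] for i in s_c if domain[i] == d_star)
        t_syn += ratio * X_cd
        t_blu += Y_scd + ratio * X_rcd
    return t_syn, t_blu

# Toy data (hypothetical): one stratum, three domains
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, None, 9.0, None]
stratum = [1, 1, 1, 1]
domain = [1, 1, 2, 3]
sampled = [True, False, True, False]

syn_1, blu_1 = ratio_predictors(x, y, stratum, domain, sampled, d_star=1)
syn_3, blu_3 = ratio_predictors(x, y, stratum, domain, sampled, d_star=3)
```

For domain 3, from which no element was drawn, both predictors coincide, because the BLU predictor then has no observed part to add and predicts the whole cell total by the stratum ratio.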

It is easy to notice that if the above-mentioned assumptions hold and at least one of the following conditions is fulfilled:

- none of the elements of domain $d*$ is drawn to the sample,

- for each stratum from which elements of the $d*$-th domain were drawn the following equation holds: $\frac{Y_{sc}}{X_{sc}} = \frac{Y_{scd*}}{X_{scd*}}$,

- for each stratum from which elements of the $d*$-th domain were drawn the following equation holds: $s_c = s_{cd*}$,

then

$$\hat{T}^{BLU-rat}_{d*} = \hat{T}^{SYN-rat}_{d*} = \sum_{c=1}^{C} \frac{Y_{sc}}{X_{sc}} X_{cd*}. \tag{7}$$

Let us derive equations of prediction variances of predictors (3) and (4) assuming that condition (2) is fulfilled. It should be stressed that they are correct even when condition (1), which defines the simple regression superpopulation model, is not fulfilled.

After some algebra the prediction variance of the predictor of the form of the synthetic ratio estimator is as follows:

$$Var_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \sigma_c^2 \left[ \frac{X_{cd*}^2}{\hat{X}_c^2} \sum_{i \in s_c} \frac{v(x_i)}{\pi_i^2} - 2 \frac{X_{cd*}}{\hat{X}_c} \sum_{i \in s_{cd*}} \frac{v(x_i)}{\pi_i} + \sum_{i \in \Omega_{cd*}} v(x_i) \right], \tag{8}$$

where $\hat{X}_c = \sum_{i \in s_c} x_i / \pi_i$ and $\pi_i$ denotes the first order inclusion probability of the $i$-th element. If first order inclusion probabilities are constant in strata and if $\forall_i\, v(x_i) = x_i$, then the prediction variance of the predictor of the form of the synthetic ratio estimator will be given by the following equation:

$$Var_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \sigma_c^2 \left[ \frac{X_{cd*}^2}{X_{sc}} - 2 \frac{X_{cd*} X_{scd*}}{X_{sc}} + X_{cd*} \right], \tag{9}$$

where $X_{scd*} = \sum_{i \in s_{cd*}} x_i$.

The prediction variance of predictor (4) for the superpopulation model with assumption (2) can be derived using Royall's theorem (1976). Let us stress that it is correct even when condition (1), which defines the simple regression superpopulation model, is not fulfilled. The prediction variance of predictor (4) is as follows:

$$Var_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \sigma_c^2 \left[ \frac{X_{rcd*}^2}{\sum_{i \in s_c} x_i^2 / v(x_i)} + \sum_{i \in \Omega_{rcd*}} v(x_i) \right]. \tag{10}$$

If $\forall_i\, v(x_i) = x_i$, then the prediction variance simplifies to the following form:

$$Var_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \sigma_c^2 \left[ \frac{X_{rcd*}^2}{X_{sc}} + X_{rcd*} \right]. \tag{11}$$

Let us compare prediction variances of both predictors when $\forall_i\, v(x_i) = x_i$ and for constant first order inclusion probabilities in strata:

$$Var_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) - Var_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) = -\sum_{c=1}^{C} \sigma_c^2 X_{scd*} \left( 1 - \frac{X_{scd*}}{X_{sc}} \right). \tag{12}$$

Let us notice that the closer the value of $X_{scd*}$ is to zero (which holds when $n_{cd*}$ decreases), the smaller the difference in precision of both predictors is. In the discussed case, the maximum absolute value of the difference (12) is attained for $\frac{X_{scd*}}{X_{sc}} = 0.5$. The difference (12) equals 0 for $\frac{X_{scd*}}{X_{sc}} = 0$ and for $\frac{X_{scd*}}{X_{sc}} = 1$. For small area statistics purposes considerations can be limited to $0 < \frac{X_{scd*}}{X_{sc}} < 0.5$. In this case, the lower the value of $\frac{X_{scd*}}{X_{sc}}$ is, the lower the value of the precision difference (12) is observed. Prediction variances of the considered predictors are equal when equation (7) holds.
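The comparison of prediction variances for $v(x_i) = x_i$ and constant inclusion probabilities can be checked numerically. The sketch below is only an illustration for a single stratum with hypothetical totals: it evaluates the variance of the synthetic ratio predictor, $\sigma_c^2 (X_{cd*}^2 / X_{sc} - 2 X_{cd*} X_{scd*} / X_{sc} + X_{cd*})$, the variance of the BLU predictor, $\sigma_c^2 (X_{rcd*}^2 / X_{sc} + X_{rcd*})$, and the closed form of their difference, $-\sigma_c^2 X_{scd*} (1 - X_{scd*} / X_{sc})$, using $X_{cd*} = X_{scd*} + X_{rcd*}$.

```python
def var_syn(sigma2, X_cd, X_scd, X_sc):
    """Prediction variance of the synthetic ratio predictor (one stratum)."""
    return sigma2 * (X_cd ** 2 / X_sc - 2.0 * X_cd * X_scd / X_sc + X_cd)

def var_blu(sigma2, X_rcd, X_sc):
    """Prediction variance of the BLU predictor (one stratum)."""
    return sigma2 * (X_rcd ** 2 / X_sc + X_rcd)

def var_difference(sigma2, X_scd, X_sc):
    """Closed form of var_blu - var_syn (one stratum)."""
    return -sigma2 * X_scd * (1.0 - X_scd / X_sc)

# Hypothetical totals of the auxiliary variable in one stratum
sigma2 = 2.0
X_sc = 100.0      # total of x over the stratum sample
X_scd = 25.0      # total of x over sampled elements of the domain
X_rcd = 75.0      # total of x over non-sampled elements of the domain
X_cd = X_scd + X_rcd

v_syn = var_syn(sigma2, X_cd, X_scd, X_sc)
v_blu = var_blu(sigma2, X_rcd, X_sc)
diff = var_difference(sigma2, X_scd, X_sc)
```

With these totals the difference depends only on $X_{scd*}$ and $X_{sc}$, which is why it vanishes when no element of the domain of interest is sampled in the stratum.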

III. SIMPLE REGRESSION SUPERPOPULATION MODEL IN DOMAINS

Synthetic estimators use the assumption that some relationships which occur in the population (or in strata) hold in domains (or in intersections of domains and strata) too. In the previous part of the paper two $\xi$-unbiased predictors for the simple regression superpopulation model in strata were presented. Let us add that predictor (4) has minimal prediction variance among all linear $\xi$-unbiased predictors (hence it is more precise than predictor (3)). The assumption that the simple regression superpopulation model in strata is true can be incorrect. For example, the simple regression superpopulation model in domains can be true. In the following part of the paper the accuracy of predictors (3) and (4) for the simple regression superpopulation model in domains will be considered. It will be proved that both predictors are $\xi$-biased, and equations of their $\xi$-biases and prediction MSEs will be derived.

Let us assume that the simple regression superpopulation model in domains is true. The assumption is as follows:

$$E_\xi(Y_{di}) = \beta_d x_{di}. \tag{13}$$

Let us consider two additional alternative assumptions. It is assumed that the random variables $Y_1, \ldots, Y_N$ are independent and:

$$D^2_\xi(Y_{ci}) = D^2_\xi(\varepsilon_{ci}) = \sigma_c^2 v(x_{ci}) \tag{14}$$

as in equation (2), or

$$D^2_\xi(Y_{di}) = D^2_\xi(\varepsilon_{di}) = \sigma_d^2 v(x_{di}). \tag{15}$$

In the previous part it was stressed that if the assumption given by equation (2) (the same is presented by equation (14)) is true, then $Var_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) \le Var_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*})$. Let us consider prediction variances of both predictors when equation (15) is true.

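The prediction variance of the predictor of the form of the synthetic ratio estimator for assumption (15) is, after some algebra, as follows:

$$Var_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \left[ \frac{X_{cd*}^2}{\hat{X}_c^2} \sum_{d=1}^{D} \sigma_d^2 \sum_{i \in s_{cd}} \frac{v(x_i)}{\pi_i^2} + \sigma_{d*}^2 \sum_{i \in \Omega_{cd*}} v(x_i) - 2 \sigma_{d*}^2 \frac{X_{cd*}}{\hat{X}_c} \sum_{i \in s_{cd*}} \frac{v(x_i)}{\pi_i} \right], \tag{16}$$

where $\hat{X}_c = \sum_{i \in s_c} x_i / \pi_i$ and $\pi_i$ denotes the first order inclusion probability of the $i$-th element.

If $\forall_i\, v(x_i) = x_i$ and first order inclusion probabilities are constant in strata, then the above equation simplifies to the following form:

$$Var_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \left( \frac{X_{cd*}^2}{X_{sc}^2} \sum_{d=1}^{D} \sigma_d^2 X_{scd} + \sigma_{d*}^2 X_{cd*} - 2 \sigma_{d*}^2 \frac{X_{cd*} X_{scd*}}{X_{sc}} \right), \tag{17}$$

where $X_{scd} = \sum_{i \in s_{cd}} x_i$.

Let us derive the prediction variance of predictor (4) for assumption (15). The following result is obtained:

$$Var_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \left[ \frac{X_{rcd*}^2}{\left( \sum_{i \in s_c} x_i^2 / v(x_i) \right)^2} \sum_{d=1}^{D} \sigma_d^2 \sum_{i \in s_{cd}} \frac{x_i^2}{v(x_i)} + \sigma_{d*}^2 \sum_{i \in \Omega_{rcd*}} v(x_i) \right]. \tag{18}$$

If $\forall_i\, v(x_i) = x_i$, then the above equation simplifies to the following form:

$$Var_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \left( \frac{X_{rcd*}^2}{X_{sc}^2} \sum_{d=1}^{D} \sigma_d^2 X_{scd} + \sigma_{d*}^2 X_{rcd*} \right). \tag{19}$$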

If $\forall_i\, v(x_i) = x_i$ and first order inclusion probabilities are constant in strata, then for assumption (15):

$$Var_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) - Var_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \left[ \sigma_{d*}^2 \frac{X_{scd*}}{X_{sc}^2} (X_{sc} - X_{scd*})(X_{cd*} + X_{rcd*} - X_{sc}) - \frac{X_{scd*}}{X_{sc}^2} (X_{cd*} + X_{rcd*}) \sum_{d \ne d*} \sigma_d^2 X_{scd} \right]. \tag{20}$$

Let us notice that the closer the value of $X_{scd*}$ is to zero (which holds when $n_{cd*}$ decreases), the smaller the difference in precision of both predictors is. The above equation is a sum over strata of sums of two elements. Let us assume that $\forall_i\, x_i > 0$.

For each stratum the second element is negative. The first element is negative for every stratum if and only if $X_{cd*} + X_{rcd*} < X_{sc}$. Hence,

$$\forall_c\; X_{cd*} + X_{rcd*} < X_{sc} \;\Rightarrow\; Var_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) < Var_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}).$$

Based on equation (20) it can also be proved that

$$\forall_c\; \sigma_{d*}^2 \le \frac{1}{X_{sc}} \sum_{d=1}^{D} \sigma_d^2 X_{scd} \;\Rightarrow\; Var_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) < Var_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}).$$

It was shown that predictor (4) can be more precise than predictor (3) for assumption (15).
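The case of domain-specific variances (assumption (15)) with $v(x_i) = x_i$ and constant inclusion probabilities can also be illustrated numerically. The sketch below treats a single stratum with hypothetical inputs. It evaluates the variance of the synthetic ratio predictor as $(X_{cd*} / X_{sc})^2 \sum_d \sigma_d^2 X_{scd} + \sigma_{d*}^2 X_{cd*} - 2 \sigma_{d*}^2 X_{cd*} X_{scd*} / X_{sc}$ and the variance of the BLU predictor as $(X_{rcd*} / X_{sc})^2 \sum_d \sigma_d^2 X_{scd} + \sigma_{d*}^2 X_{rcd*}$; these expressions follow from the independence of the $Y_i$ and are stated here as working assumptions of the example.

```python
def mixed_variance(sig2, X_scd):
    """Sum over domains of sigma_d^2 * X_scd within one stratum."""
    return sum(sig2[d] * X_scd[d] for d in X_scd)

def var_syn_domains(sig2, X_scd, X_rcd_star, d_star):
    """Variance of the synthetic ratio predictor, one stratum."""
    X_sc = sum(X_scd.values())
    X_cd_star = X_scd[d_star] + X_rcd_star
    s2 = sig2[d_star]
    return ((X_cd_star / X_sc) ** 2 * mixed_variance(sig2, X_scd)
            + s2 * X_cd_star
            - 2.0 * s2 * X_cd_star * X_scd[d_star] / X_sc)

def var_blu_domains(sig2, X_scd, X_rcd_star, d_star):
    """Variance of the BLU predictor, one stratum."""
    X_sc = sum(X_scd.values())
    return ((X_rcd_star / X_sc) ** 2 * mixed_variance(sig2, X_scd)
            + sig2[d_star] * X_rcd_star)

# Hypothetical inputs: two domains in one stratum, domain 1 is of interest
sig2 = {1: 1.0, 2: 4.0}       # sigma_d^2
X_scd = {1: 10.0, 2: 30.0}    # sample totals of x by domain
X_rcd_star = 20.0             # non-sampled total of x in the domain of interest

v_syn = var_syn_domains(sig2, X_scd, X_rcd_star, d_star=1)
v_blu = var_blu_domains(sig2, X_scd, X_rcd_star, d_star=1)
```

Here $X_{cd*} + X_{rcd*} = 50$ exceeds $X_{sc} = 40$, so the first sufficient condition is not met, yet the BLU predictor is still more precise because $\sigma_{d*}^2 = 1$ is below the weighted mean of variances, $130 / 40 = 3.25$.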

Let us derive the equation of the $\xi$-bias of the predictor of the form of the synthetic ratio estimator (3) for the superpopulation model with assumption (13). After some algebra it is obtained that:

$$E_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \frac{X_{cd*}}{\hat{X}_c} \sum_{d=1}^{D} (\beta_d - \beta_{d*}) \hat{X}_{cd}, \tag{21}$$

where $\hat{X}_{cd} = \sum_{i \in s_{cd}} x_i / \pi_i$ and $\hat{X}_c = \sum_{i \in s_c} x_i / \pi_i$.

As expected, the predictor of the form of the synthetic ratio estimator is $\xi$-unbiased when the simple regression superpopulation model is true in strata to which the domain of interest belongs (superpopulation model with assumption (1)).

Let us derive the equation of the $\xi$-bias of predictor (4) for the superpopulation model assumed in this part of the paper:

$$E_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} X_{rcd*} \left( \sum_{i \in s_c} \frac{x_i^2}{v(x_i)} \right)^{-1} \sum_{d=1}^{D} (\beta_d - \beta_{d*}) \sum_{i \in s_{cd}} \frac{x_i^2}{v(x_i)}. \tag{22}$$

Similarly to the predictor of the form of the synthetic ratio estimator, predictor (4) is $\xi$-unbiased if the simple regression superpopulation model in domains becomes the simple regression superpopulation model in strata (i.e. the simple regression superpopulation model in strata with assumption (2) is true).

Let us assume that $\forall_i\, v(x_i) = x_i$ and that first order inclusion probabilities are constant in strata. Then equations (21) and (22) of the $\xi$-bias of predictors (3) and (4) simplify to the following forms:

$$E_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \frac{X_{cd*}}{X_{sc}} \sum_{d=1}^{D} (\beta_d - \beta_{d*}) X_{scd}, \tag{23}$$

$$E_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \frac{X_{rcd*}}{X_{sc}} \sum_{d=1}^{D} (\beta_d - \beta_{d*}) X_{scd}. \tag{24}$$

Hence,

$$E_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) - E_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) = -\sum_{c=1}^{C} \frac{X_{scd*}}{X_{sc}} \sum_{d=1}^{D} (\beta_d - \beta_{d*}) X_{scd}. \tag{25}$$

First, let us recall that if both predictors are $\xi$-unbiased (i.e. the simple regression superpopulation model in strata is true) or if equality (7) holds, then the difference given by equation (25) equals zero. Let us notice that the closer the value of $X_{scd*}$ is to zero (which holds when $n_{cd*}$ decreases), the smaller the difference of $\xi$-biases of both predictors is.

Let us consider two cases with the additional assumptions that $\forall_i\, x_i > 0$ and $\hat{T}^{BLU-rat}_{d*} \ne \hat{T}^{SYN-rat}_{d*}$. In the first case, for each stratum to which elements of the $d*$ domain belong, the following inequality holds: $\frac{1}{X_{sc}} \sum_{d=1}^{D} (\beta_d - \beta_{d*}) X_{scd} > 0$, which can hold when $\forall_{d \ne d*}\, \beta_d > \beta_{d*}$. Hence, $E_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) > 0$ and $E_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) > 0$, and finally $E_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) - E_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) < 0$.

In the second case, for each stratum to which elements of the $d*$ domain belong, the following inequality holds: $\frac{1}{X_{sc}} \sum_{d=1}^{D} (\beta_d - \beta_{d*}) X_{scd} < 0$, which can hold when $\forall_{d \ne d*}\, \beta_d < \beta_{d*}$. Hence, $E_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) < 0$ and $E_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) < 0$, and finally $E_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) - E_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) > 0$.

In both cases the absolute value of the $\xi$-bias of the $\hat{T}^{BLU-rat}_{d*}$ predictor is lower than the absolute value of the $\xi$-bias of $\hat{T}^{SYN-rat}_{d*}$. Let us stress that when elements of the $d*$ domain were drawn to the sample from only one stratum, only one of these two situations can hold.
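The sign argument for the two cases can be traced on a small numerical example. The sketch below, with hypothetical names and inputs, evaluates for a single stratum the simplified $\xi$-biases under the domain model with $v(x_i) = x_i$ and constant inclusion probabilities: $\frac{X_{cd*}}{X_{sc}} \sum_d (\beta_d - \beta_{d*}) X_{scd}$ for the synthetic ratio predictor and the same expression with $X_{rcd*}$ in place of $X_{cd*}$ for the BLU predictor.

```python
def slope_contrast(beta, X_scd, d_star):
    """Sum over domains of (beta_d - beta_d*) * X_scd within one stratum."""
    return sum((beta[d] - beta[d_star]) * X_scd[d] for d in X_scd)

def bias_syn(beta, X_scd, X_rcd_star, d_star):
    """Model bias of the synthetic ratio predictor (one stratum)."""
    X_sc = sum(X_scd.values())
    X_cd_star = X_scd[d_star] + X_rcd_star
    return X_cd_star / X_sc * slope_contrast(beta, X_scd, d_star)

def bias_blu(beta, X_scd, X_rcd_star, d_star):
    """Model bias of the BLU predictor (one stratum)."""
    X_sc = sum(X_scd.values())
    return X_rcd_star / X_sc * slope_contrast(beta, X_scd, d_star)

# Hypothetical inputs: the other domain has a larger slope (first case)
beta = {1: 1.0, 2: 3.0}
X_scd = {1: 10.0, 2: 30.0}
X_rcd_star = 40.0

b_syn = bias_syn(beta, X_scd, X_rcd_star, d_star=1)
b_blu = bias_blu(beta, X_scd, X_rcd_star, d_star=1)
```

Both biases are positive here and the BLU bias is smaller by $\frac{X_{scd*}}{X_{sc}} \sum_d (\beta_d - \beta_{d*}) X_{scd}$, which illustrates the first of the two cases.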

The prediction MSE of the predictor of the form of the synthetic ratio estimator for the simple regression superpopulation model in domains is obtained by summation of the prediction variance (8) for assumption (14), or the prediction variance (16) for assumption (15), and the squared $\xi$-bias (21). The prediction MSE of predictor (4) for the simple regression superpopulation model in domains is obtained by summation of the prediction variance (10) for assumption (14), or the prediction variance (18) for assumption (15), and the squared $\xi$-bias (22).

Because analytical results of the MSE comparison are quite modest, a simulation study will additionally be conducted in part V.

IV. POLYNOMIAL SUPERPOPULATION MODEL IN STRATA

In the previous section the misspecification of the superpopulation model was considered in the case when the simple regression superpopulation model in domains is true. In the following section the polynomial superpopulation model in strata is assumed.

It is assumed that

$$E_\xi(Y_{ci}) = \sum_{j=0}^{q} \beta_c^{(j)} x_{ci}^j, \tag{26}$$

where $q$ denotes the degree of the polynomial.

A particular form of the polynomial superpopulation model with assumption (26) is the regression superpopulation model with the following assumption:

$$E_\xi(Y_{ci}) = \beta_c^{(1)} x_{ci} + \beta_c^{(0)}. \tag{27}$$

It should be recalled that for models assumed for strata equation (2) holds. It implies that prediction variances of both predictors are given by equations (8) and (10) and

$$Var_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) \le Var_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}).$$

Let us derive the equation of the $\xi$-bias of the predictor of the form of the synthetic ratio estimator for the polynomial superpopulation model in strata (superpopulation model with assumption (26)). After some algebra it is obtained that

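$$E_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \left[ X_{cd*} \frac{\sum_{i \in s_c} \sum_{j} \beta_c^{(j)} x_i^j / \pi_i}{\sum_{i \in s_c} x_i / \pi_i} - \sum_{i \in \Omega_{cd*}} \sum_{j} \beta_c^{(j)} x_i^j \right], \tag{28}$$

where $\pi_i$ denotes the first order inclusion probability of the $i$-th element and the inner sums run over all powers $j$ of the polynomial in assumption (26).

If the regression superpopulation model is assumed for strata (superpopulation model with assumption (27)) and if first order inclusion probabilities are constant in strata, the equation simplifies to the following form:

$$E_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \beta_c^{(0)} N_{cd*} \left( \frac{X_{cd*} / N_{cd*}}{X_{sc} / n_c} - 1 \right). \tag{29}$$

In the considered case, if for each stratum the mean value of the auxiliary variable for the intersection of domain $d*$ and the stratum equals the mean value of the auxiliary variable for sampled elements from the stratum, the predictor of the form of the synthetic ratio estimator will be $\xi$-unbiased.

Let us derive the equation of the $\xi$-bias of predictor (4) for the polynomial superpopulation model in strata (superpopulation model with assumption (26)). The result is as follows:

$$E_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \left[ X_{rcd*} \frac{\sum_{i \in s_c} \sum_{j} \beta_c^{(j)} x_i^{j+1} / v(x_i)}{\sum_{i \in s_c} x_i^2 / v(x_i)} - \sum_{i \in \Omega_{rcd*}} \sum_{j} \beta_c^{(j)} x_i^j \right]. \tag{30}$$

If the regression superpopulation model in strata is true (superpopulation model with assumption (27)) and if $\forall_i\, v(x_i) = x_i$, then the equation simplifies to the following form:

$$E_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) = \sum_{c=1}^{C} \beta_c^{(0)} N_{rcd*} \left( \frac{X_{rcd*} / N_{rcd*}}{X_{sc} / n_c} - 1 \right). \tag{31}$$

In the considered case, if for each stratum the mean value of the auxiliary variable for non-sampled elements of the intersection of domain $d*$ and the stratum equals the mean value of the auxiliary variable for sampled elements from the stratum, predictor (4) will be $\xi$-unbiased.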

Let us compare $\xi$-biases of both predictors for the regression superpopulation model in strata when $\forall_i\, v(x_i) = x_i$ and first order inclusion probabilities are constant in strata. Let us assume that equality (7) does not hold. Hence,

$$E_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) - E_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) = -\sum_{c=1}^{C} \beta_c^{(0)} n_{cd*} \left( \frac{X_{scd*} / n_{cd*}}{X_{sc} / n_c} - 1 \right). \tag{32}$$

Let us notice that the closer the value of $n_{cd*}$ is to zero, the smaller the difference of $\xi$-biases of both predictors is. If for each stratum the mean value of the auxiliary variable for sampled elements of the intersection of domain $d*$ and the stratum equals the mean value of the auxiliary variable for sampled elements from the stratum, the values of the $\xi$-bias of both predictors will be equal.

Let us consider two cases assuming that $\forall_i\, x_i > 0$ and $\forall_c\, \beta_c^{(0)} > 0$. In the first case, for each stratum from which elements of the $d*$-th domain were drawn, the following inequalities hold:

$$\frac{X_{cd*}}{N_{cd*}} > \frac{X_{sc}}{n_c}, \qquad \frac{X_{rcd*}}{N_{rcd*}} > \frac{X_{sc}}{n_c}, \qquad \frac{X_{scd*}}{n_{cd*}} > \frac{X_{sc}}{n_c}.$$

This can hold, for example, when the domain of interest consists of elements with the highest values of the auxiliary variable. Hence, $E_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) > 0$ and $E_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) > 0$, and finally $E_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) - E_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) < 0$.

In the second case, for each stratum from which elements of the $d*$-th domain were drawn, the following inequalities hold:

$$\frac{X_{cd*}}{N_{cd*}} < \frac{X_{sc}}{n_c}, \qquad \frac{X_{rcd*}}{N_{rcd*}} < \frac{X_{sc}}{n_c}, \qquad \frac{X_{scd*}}{n_{cd*}} < \frac{X_{sc}}{n_c}.$$

This can hold, for example, when the domain of interest consists of elements with the lowest values of the auxiliary variable. Hence, $E_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) < 0$ and $E_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) < 0$, and finally $E_\xi(\hat{T}^{BLU-rat}_{d*} - T_{d*}) - E_\xi(\hat{T}^{SYN-rat}_{d*} - T_{d*}) > 0$.

In both cases the absolute value of the $\xi$-bias of the $\hat{T}^{BLU-rat}_{d*}$ predictor is lower than the absolute value of the $\xi$-bias of $\hat{T}^{SYN-rat}_{d*}$, which implies a lower value of the prediction MSE of the $\hat{T}^{BLU-rat}_{d*}$ predictor (because the prediction variance of $\hat{T}^{BLU-rat}_{d*}$ is lower as well). Let us add that the same conclusions are obtained in both cases for the assumptions $\forall_i\, x_i > 0$ and $\forall_c\, \beta_c^{(0)} < 0$.
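The effect of an omitted intercept can be verified directly from model expectations, without relying on any closed-form bias formula. The sketch below takes a single stratum, constant inclusion probabilities and $v(x_i) = x_i$, assumes $E_\xi(Y_i) = \beta^{(1)} x_i + \beta^{(0)}$ as in (27), and computes the expected value of each predictor minus the expected domain total. All names and numbers are hypothetical.

```python
def expected_biases(x, sampled, in_domain, b1, b0):
    """Return (bias_syn, bias_blu) for one stratum under E(Y) = b1*x + b0."""
    mu = [b1 * xi + b0 for xi in x]            # model expectations of Y_i
    X_sc = sum(xi for xi, s in zip(x, sampled) if s)
    # expectation of the sample ratio Y_sc / X_sc (x and the sample are fixed)
    ratio_mean = sum(m for m, s in zip(mu, sampled) if s) / X_sc
    X_cd = sum(xi for xi, d in zip(x, in_domain) if d)
    X_rcd = sum(xi for xi, s, d in zip(x, sampled, in_domain) if d and not s)
    mu_cd = sum(m for m, d in zip(mu, in_domain) if d)
    mu_scd = sum(m for m, s, d in zip(mu, sampled, in_domain) if d and s)
    bias_syn = ratio_mean * X_cd - mu_cd
    bias_blu = mu_scd + ratio_mean * X_rcd - mu_cd
    return bias_syn, bias_blu

# The domain of interest holds the two largest values of x (first case)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sampled = [True, False, True, False, True, False]
in_domain = [False, False, False, False, True, True]

b_syn, b_blu = expected_biases(x, sampled, in_domain, b1=2.0, b0=10.0)
```

In this example the biases equal $\beta^{(0)} (n_c X_{cd*} / X_{sc} - N_{cd*}) = 50/3$ and $\beta^{(0)} (n_c X_{rcd*} / X_{sc} - N_{rcd*}) = 10$, so both are positive and the BLU predictor has the smaller one.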

Prediction MSEs of predictors (3) and (4) for the polynomial superpopulation model in strata are obtained by summation of prediction variances (8) and (10) and squared $\xi$-biases given by equations (28) and (30), respectively.

V. SIMULATION STUDY

The simulation study is based on an artificial population which consists of 200 elements divided into 3 strata and 6 domains. The first stratum, which consists of 80 elements, includes 20 elements from the first domain, 20 elements from the second domain and 40 elements from the third domain. The second stratum, which consists of 70 elements, includes 30 elements from the first domain, 30 elements from the fourth domain and 10 elements from the fifth domain. The third stratum, which consists of 50 elements, includes 20 elements from the second domain, 10 elements from the fifth domain and 20 elements from the sixth domain. Values of the auxiliary variable were generated using normal distributions with the following arbitrarily set parameters: N(100, 20) in the first stratum, N(120, 30) in the second stratum and N(150, 40) in the third stratum. Elements in strata are assigned to domains at random.
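A population with this structure can be generated, for illustration, by the following Python sketch. It is not the code used in the study. Two points are assumptions of the sketch: the second parameter of each normal distribution is treated as a standard deviation, and non-positive draws are redrawn so that square roots of the auxiliary variable exist. The names `STRATA` and `make_population` are hypothetical.

```python
import random

# stratum: (mean of x, second N(.,.) parameter, {domain: number of elements})
STRATA = {
    1: (100.0, 20.0, {1: 20, 2: 20, 3: 40}),
    2: (120.0, 30.0, {1: 30, 4: 30, 5: 10}),
    3: (150.0, 40.0, {2: 20, 5: 10, 6: 20}),
}

def make_population(seed=0):
    """Return a list of (stratum, domain, x) tuples for 200 elements."""
    rng = random.Random(seed)
    population = []
    for c, (mean, sd, counts) in STRATA.items():
        # domain labels with the prescribed counts, shuffled so that
        # elements are assigned to domains at random within the stratum
        labels = [d for d, k in counts.items() for _ in range(k)]
        rng.shuffle(labels)
        for d in labels:
            x = rng.gauss(mean, sd)  # assumption: sd is a standard deviation
            while x <= 0.0:          # assumption: redraw non-positive values
                x = rng.gauss(mean, sd)
            population.append((c, d, x))
    return population

population = make_population()
```

Summing the counts by domain gives domain sizes of 50, 40, 40, 30, 20 and 20 elements, so domain three is the only one contained in a single stratum.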

Three predictors are considered: the predictor given by equation (3) (in tables denoted by synt), the predictor given by equation (4) with $v(x_i) = \sqrt{x_i}$ for every $i = 1, \ldots, N$ (in tables denoted by BLU 1) and the predictor given by equation (4) with $v(x_i) = 1$ for every $i = 1, \ldots, N$ (in tables denoted by BLU 2). Accuracy of the three predictors is considered for four superpopulation models with the following arbitrarily set parameters. Let us add that for all of the following superpopulation models random components $\varepsilon$ are generated from the N(0, 1) distribution.

The first model is the simple regression superpopulation model in strata: $Y_{ci} = \beta_c x_{ci} + \varepsilon_{ci} \sqrt{x_{ci}}$, where $\beta_1 = 1$, $\beta_2 = 2$, $\beta_3 = 3$.

The second model is the regression superpopulation model in strata: $Y_{ci} = \beta_c^{(1)} x_{ci} + \beta_c^{(0)} + \varepsilon_{ci} \sqrt{x_{ci}}$, where $\beta_1^{(1)} = 1$, $\beta_2^{(1)} = 2$, $\beta_3^{(1)} = 3$, $\beta_1^{(0)} = 200$, $\beta_2^{(0)} = 250$, $\beta_3^{(0)} = 300$.

The third model is the polynomial superpopulation model in strata: $Y_{ci} = \sum_{j=0}^{2} \beta_c^{(j)} x_{ci}^j + \varepsilon_{ci} \sqrt{x_{ci}}$, where $\beta_1^{(2)} = 1.5$, $\beta_2^{(2)} = 1$, $\beta_3^{(2)} = 0.5$, $\beta_1^{(1)} = 1$, $\beta_2^{(1)} = 2$, $\beta_3^{(1)} = 3$, $\beta_1^{(0)} = 200$, $\beta_2^{(0)} = 250$, $\beta_3^{(0)} = 300$.

The fourth model is the simple regression superpopulation model in domains: $Y_{di} = \beta_d x_{di} + \varepsilon_{di} \sqrt{x_{di}}$, where $\beta_1 = 1$, $\beta_2 = 3$, $\beta_3 = 5$, $\beta_4 = 7$, $\beta_5 = 9$, $\beta_6 = 11$.

It should be underlined that although the model approach is a conditional approach, results in the simulation study are averaged by taking the sampling design distribution into consideration. The symbol $E_p$ denotes the expected value with respect to the sampling design distribution. In the following tables, bias (in %) denotes the value, approximated in the simulation study, of

$$\frac{E_p E_\xi(\hat{T}_{d*} - T_{d*})}{E_\xi(T_{d*})} \times 100,$$

root variance (in %) denotes the approximated value of

$$\frac{\sqrt{E_p \left[ E_\xi(\hat{T}_{d*} - T_{d*})^2 - E_\xi^2(\hat{T}_{d*} - T_{d*}) \right]}}{E_\xi(T_{d*})} \times 100,$$

and root MSE (in %) denotes the approximated value of

$$\frac{\sqrt{E_p E_\xi(\hat{T}_{d*} - T_{d*})^2}}{E_\xi(T_{d*})} \times 100.$$

It is worth stressing that the $\xi$ p-bias, the p-expected value of the prediction variance and the p-expected value of the prediction MSE are computed instead of the p $\xi$-bias, the $\xi$-expected value of the p-variance and the $\xi$-expected value of the p-MSE. Values of the above-mentioned statistics are equal because the sampling design is noninformative.
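The three accuracy measures can be approximated from simulated prediction errors as sketched below. The layout of the input, a list with one row per drawn sample and one entry per realization of the superpopulation model, and the function name `accuracy_measures` are assumptions of this illustration and are not taken from the study itself.

```python
import math

def accuracy_measures(errors, expected_total):
    """Return (bias, root variance, root MSE), each in % of expected_total.

    errors[k][m] is the predictor minus the domain total for sample k
    and realization m of the superpopulation model.
    """
    # model (xi) moments computed separately for every sample
    bias_by_sample = [sum(row) / len(row) for row in errors]
    mse_by_sample = [sum(e * e for e in row) / len(row) for row in errors]
    var_by_sample = [m - b * b for m, b in zip(mse_by_sample, bias_by_sample)]
    # averaging over samples approximates the design expectation E_p
    k = len(errors)
    bias = sum(bias_by_sample) / k
    variance = sum(var_by_sample) / k
    mse = sum(mse_by_sample) / k
    scale = 100.0 / expected_total
    return bias * scale, math.sqrt(variance) * scale, math.sqrt(mse) * scale

# Toy input: two samples, two model realizations each
errors = [[1.0, 3.0], [-1.0, 1.0]]
bias_pct, root_var_pct, root_mse_pct = accuracy_measures(errors, 100.0)
```

Note that the averaged MSE equals the averaged variance plus the average of squared per-sample biases, so the root MSE need not equal the root of variance plus squared overall bias.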

Stratified random sampling with proportional allocation is considered. Results obtained in the simulation are based on 500 random samples and are additionally averaged with respect to 1000 realizations of the superpopulation model. In this way 500 000 values of each predictor are generated for simulation purposes. Three sample sizes are considered: 40, 60 and 80 elements, which amount to 20%, 30% and 40% of the population size. High sampling fractions are considered because it was proved, for the cases discussed in previous parts of the paper, that for small sample sizes the difference in precision of both predictors is small.
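Proportional allocation and the drawing of one stratified sample can be sketched as follows. Stratum sizes are those of the artificial population described earlier; the labelling of units by consecutive integers within strata, the seed and the function names are hypothetical.

```python
import random

STRATUM_SIZES = {1: 80, 2: 70, 3: 50}  # N_c in the artificial population

def proportional_allocation(n, sizes=STRATUM_SIZES):
    """Return n_c = n * N_c / N (exact for n = 40, 60 and 80)."""
    total = sum(sizes.values())
    # integer division; for other n the result would be rounded down
    return {c: n * size // total for c, size in sizes.items()}

def draw_stratified_sample(n, rng, sizes=STRATUM_SIZES):
    """Draw a simple random sample without replacement in every stratum."""
    allocation = proportional_allocation(n, sizes)
    return {c: sorted(rng.sample(range(sizes[c]), allocation[c]))
            for c in sizes}

rng = random.Random(2005)
sample = draw_stratified_sample(40, rng)
```

For the three sample sizes considered the allocations are (16, 14, 10), (24, 21, 15) and (32, 28, 20) elements, so the sampling fraction is the same in every stratum.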

Let us compare the accuracy of the analysed predictors when the simple regression superpopulation model in strata is true.

Results presented in Table 1 show that root $\xi$-expected values of p-MSEs for all predictors in all domains except domain three are less than 1% of the $\xi$-expected domain total. In domain three they do not exceed 3%. It is worth stressing that although accuracies of the considered predictors are similar, the root $\xi$-expected value of the p-MSE of the predictor of the form of the synthetic ratio estimator is higher compared to predictor (4) with misspecification of the variance structure (in the table denoted by BLU 2). If a statistician specifies the correct form of the $\xi$-expected value of random variables (i.e. decides that the simple regression superpopulation model in strata is true) and an incorrect form of their $\xi$-variance (i.e. decides that the model is homoscedastic), the choice of the BLU predictor with wrong specification of the variance structure will be better than the choice of the predictor of the form of the synthetic ratio estimator. It is interesting that in the simulation study the decrease of root $\xi$-expected p-MSEs of the synthetic estimator due to the increase of sample size is slower compared with the other predictors. Let us add that the highest values of root $\xi$-expected p-MSEs are observed in domain three, because it is the only domain which belongs only to the first stratum, the stratum with the lowest $\beta_c$ coefficient. Because distributions of the auxiliary variable in strata are similar, a higher dispersion of the variable of interest with respect to the $\xi$ distribution is observed in the first stratum. Notice that the smaller the sample size is, the smaller the difference in accuracy of the synthetic estimator and the BLU predictor (denoted by BLU 1) is, which was proved under different assumptions in part II of the paper.

Table 1. Accuracy of predictors for simple regression superpopulation model in strata

Domain  Predictor   Bias (in %)               Root variance and root MSE (in %)
                    Sample size               Sample size
                    40       60       80      40       60       80
1       synt        0.00     0.00     0.00    0.86     0.72     0.65
1       BLU 1       0.00     0.00     0.00    0.84     0.68     0.57
1       BLU 2       0.00     0.00     0.00    0.85     0.68     0.58
2       synt        0.00     0.00     0.00    0.63     0.54     0.48
2       BLU 1       0.00     0.00     0.00    0.61     0.50     0.43
2       BLU 2       0.00     0.00     0.00    0.61     0.50     0.43
3       synt        0.00     0.00    -0.01    2.52     2.04     1.77
3       BLU 1       0.00     0.00     0.00    2.48     1.95     1.63
3       BLU 2       0.00     0.00     0.00    2.50     1.97     1.64
4       synt        0.00     0.00     0.00    0.85     0.70     0.62
4       BLU 1       0.00     0.00     0.00    0.83     0.67     0.56
4       BLU 2       0.00     0.00     0.00    0.84     0.68     0.57
5       synt        0.00     0.00     0.00    0.58     0.52     0.49
5       BLU 1       0.00     0.00     0.00    0.56     0.47     0.41
5       BLU 2       0.00     0.00     0.00    0.56     0.47     0.41
6       synt        0.00     0.00     0.00    0.55     0.46     0.40
6       BLU 1       0.00     0.00     0.00    0.54     0.44     0.37
6       BLU 2       0.00     0.00     0.00    0.55     0.44     0.37

Let us consider results for the regression superpopulation model in strata, which are presented in Table 2. Accuracy of the considered predictors will be discussed in the case of model misspecification. Let us notice that values of root $\xi$-expected p-MSEs do not exceed 3.5% of $\xi$-expected domain totals and that they are determined by values of the $\xi$ p-bias. It should be underlined that in this case none of the predictors is more accurate than the others in all domains. For the polynomial model in strata (results are not presented) values of root $\xi$-expected p-MSEs exceed 6% of $\xi$-expected domain totals only in a few cases for the sample size of 40 elements. These results are determined by the $\xi$ p-bias; values of root $\xi$-expected p-variances do not exceed 0.04% of $\xi$-expected domain totals. It should be stressed that in some cases $\xi$-expected p-MSEs of the synthetic ratio estimator increase with the increase of sample size, which for p-MSEs was discussed earlier by Wywiał, Żądło (2003). The same property can be observed for $\xi$-expected p-MSEs, because the sampling design is noninformative.

Table 2. Accuracy of predictors for regression superpopulation model in strata

Domain  Predictor   Bias (in %)               Root variance (in %)      Root MSE (in %)
                    Sample size               Sample size               Sample size
                    40       60       80      40      60      80       40      60      80
1       synt        -1.75    -1.87    -1.91   0.44    0.37    0.33     1.81    1.90    1.94
1       BLU 1       -2.42    -2.30    -1.90   0.43    0.35    0.30     2.46    2.33    1.92
1       BLU 2       -3.33    -3.10    -2.60   0.43    0.35    0.30     3.36    3.12    2.62
2       synt        -1.34    -1.43    -1.50   0.38    0.32    0.29     1.39    1.47    1.53
2       BLU 1       -1.93    -1.84    -1.61   0.37    0.30    0.26     1.97    1.86    1.63
2       BLU 2       -2.76    -2.73    -2.26   0.37    0.30    0.26     2.78    2.74    2.27
3       synt         1.73     1.52     1.50   0.84    0.68    0.59     1.93    1.67    1.62
3       BLU 1        0.38     0.12     0.10   0.83    0.66    0.55     0.91    0.67    0.55
3       BLU 2       -1.65    -0.77    -0.62   0.83    0.66    0.55     1.84    1.01    0.83
4       synt         0.42     0.30     0.27   0.50    0.41    0.36     0.66    0.51    0.45
4       BLU 1       -0.87    -0.67    -0.50   0.49    0.39    0.33     1.00    0.78    0.60
4       BLU 2       -1.57    -1.49    -1.21   0.50    0.40    0.33     1.65    1.55    1.25
5       synt         2.14     2.06     2.03   0.39    0.35    0.33     2.17    2.09    2.06
5       BLU 1        0.74     0.58     0.46   0.37    0.32    0.28     0.83    0.66    0.53
5       BLU 2       -0.12    -0.12    -0.12   0.38    0.32    0.28     0.39    0.34    0.31
6       synt         1.71     1.57     1.56   0.40    0.33    0.29     1.76    1.60    1.59
6       BLU 1        0.50     0.33     0.31   0.39    0.31    0.26     0.63    0.46    0.40
6       BLU 2       -0.40    -0.40    -0.34   0.40    0.31    0.26     0.56    0.51    0.43

Finally, Table 3 presents results of the simulation study for the simple regression superpopulation model in domains. At the beginning it must be stressed that prediction accuracy is not sufficient, mainly because of high values of the bias. It should be noticed that predictor (4) (both in the case of correct and of incorrect specification of the variance structure) has better accuracy compared to the predictor of the form of the synthetic ratio estimator. The highest values of the $\xi$ p-bias and the $\xi$-expected p-MSE are observed in the first and second domain. It results from the fact that elements of these domains belong to strata in which most of the elements are from domains with higher $\beta_d$ than in the first and second domain. It should be stressed that, as in Table 2, in some cases $\xi$-expected p-MSEs of the predictor of the form of the synthetic ratio estimator increase with the increase of sample size.

Table 3. Accuracy of predictors for simple regression superpopulation model in domains

Domain  Predictor   Bias (in %)                  Root variance (in %)     Root MSE (in %)
                    Sample size                  Sample size              Sample size
                    40        60        80       40      60      80      40       60       80
1       synt        336.89    336.39    336.06   1.97    1.66    1.49    336.89   336.40   336.06
1       BLU 1       227.68    241.19    206.71   1.92    1.56    1.32    276.69   241.20   206.71
1       BLU 2       280.01    244.17    209.23   1.94    1.57    1.33    280.02   244.17   209.23
2       synt         93.09     95.69     95.86   0.68    0.59    0.53     93.09    95.69    95.86
2       BLU 1        76.90     70.01     59.05   0.66    0.55    0.46     76.90    70.01    59.05
2       BLU 2        78.37     71.16     60.05   0.67    0.55    0.47     78.38    71.17    60.05
3       synt        -28.55    -28.89    -28.99   0.50    0.41    0.35     28.55    28.89    28.99
3       BLU 1       -23.58    -20.16    -17.36   0.50    0.39    0.33     23.58    20.16    17.37
3       BLU 2       -23.23    -19.89    -17.11   0.50    0.39    0.33     23.24    19.89    17.12
4       synt        -31.06    -31.37    -31.41   0.36    0.30    0.27     31.06    31.37    31.41
4       BLU 1       -24.66    -21.86    -18.90   0.36    0.29    0.24     24.66    21.86    18.90
4       BLU 2       -24.07    -21.32    -18.45   0.36    0.29    0.24     24.07    21.32    18.45
5       synt        -30.41    -29.82    -29.67   0.26    0.24    0.22     30.41    29.82    29.67
5       BLU 1       -24.06    -20.56    -17.74   0.25    0.21    0.19     24.06    20.57    17.74
5       BLU 2       -23.48    -20.10    -17.33   0.25    0.21    0.19     23.48    20.10    17.33
6       synt        -31.79    -30.72    -30.54   0.25    0.21    0.18     31.79    30.72    30.54
6       BLU 1       -25.42    -21.60    -18.77   0.25    0.20    0.17     25.43    21.60    18.77
6       BLU 2       -24.86    -21.18    -18.40   0.25    0.20    0.17     24.86    21.18    18.40

VI. CONCLUSION

In the paper, properties of the predictor of the form of the synthetic ratio estimator based on the superpopulation approach were studied. It was proved that it is $\xi$-unbiased for the simple regression superpopulation model in strata. For this model the BLU predictor was presented and situations in which both predictors are equal were shown. Properties of both predictors were additionally studied in the case of superpopulation model misspecification. Analytical considerations were supported by a simulation study. It was shown that for the discussed data both predictors give similar results both for correct and incorrect model specification. For correct model specification and for the simple regression model assumed in domains, the accuracy of the BLU predictor is higher compared to the accuracy of the predictor of the form of the synthetic ratio estimator in the simulation study. When the problem of model misspecification for the analysed artificial population is discussed, both predictors give better results for incorrect models assumed for strata than for incorrect models assumed for domains.

REFERENCES

Bolfarine H., Zacks S. (1992), Prediction Theory for Finite Populations, Springer-Verlag, New York.

Bracha Cz. (1994), Metodologiczne aspekty badania małych obszarów, Studia i Materiały, Z Prac Zakładu Badań Statystyczno-Ekonomicznych, 43, GUS, Warszawa.

Bracha Cz. (1996), Teoretyczne podstawy metody reprezentacyjnej, PWN, Warszawa.

Domański Cz., Pruska K. (2001), Metody statystyki małych obszarów, Wyd. Uniwersytetu Łódzkiego, Łódź.

Getka-Wilczyńska E. (2000), Estimation of total domain in finite population, Statistics in Transition, 4, 4, 711-728.

Royall R.M. (1976), The linear least squares prediction approach to two-stage sampling, Journal of the American Statistical Association, 71, 473-657.

Valliant R., Dorfman A.H., Royall R.M. (2000), Finite Population Sampling and Inference. A Prediction Approach, John Wiley & Sons, New York.

Wywiał J., Żądło T. (2003), On Mean Square Error of Synthetic Ratio Estimator, Studia Ekonomiczne, AE Katowice, 2003.

Tomasz Żądło

ON THE SYNTHETIC RATIO ESTIMATOR FROM THE POINT OF VIEW OF THE MODEL APPROACH

Summary

The paper considers, from the point of view of the model approach, properties of the predictor of the form of the synthetic ratio estimator of a domain total, known from the randomization approach. A proof of its $\xi$-unbiasedness for the simple regression superpopulation model in strata is presented. For this model the BLU predictor is also presented. Formulas describing prediction variances of both predictors for the mentioned superpopulation model are derived. For both predictors the problem of superpopulation model misspecification is also considered, and prediction mean square errors are derived for this case. The comparison of accuracy of both predictors is supported by a simulation analysis.
