
ACTA UNIVERSITATIS LODZIENSIS
FOLIA OECONOMICA 216, 2008

Grzegorz Kończak*

ON TESTING THE HYPOTHESIS OF STABILITY OF THE RATIO OF TWO RANDOM VARIABLES

ABSTRACT. The problem of testing the hypothesis of the stability of the expected value of the ratio of random variables is considered in the article. The ratio of random variables X and Y under the assumptions of autoregression models is analyzed. A test which makes it possible to verify this hypothesis is proposed. The problem considered in the article appears in statistical quality control when the proportions of the dimensions of a product or the proportions of components in mixtures are required to be stable.

Key words: ratio, testing stability, autoregression.

I. INTRODUCTION

Problems connected with the estimation of the parameters of the ratio of random variables can be found in various practical statistical analyses. This article considers the problem of testing the hypothesis of the stability of the expected value of the ratio of two random variables X and Y. It is assumed that measurements of X and Y are made in n periods. The expected value of the ratio is assumed to be stable in periods t = 1, 2, ..., n, and the hypothesis of the stability of the ratio of X and Y is verified for period t = n+1. A test which makes it possible to verify this hypothesis is proposed and its properties are presented.

The problem considered in the article appears in statistical quality control when the proportions of the dimensions of a product or the proportions of components in mixtures are to be stable. The expected values of the random variables can change, but the ratio of these variables should be stable over time. The advantages of using the proposed test instead of simultaneously testing the expected values of the random variables X and Y are presented.


II. BASIC DEFINITIONS

Let $\{Y_t,\ t = 1, \ldots, n\}$ and $\{X_t,\ t = 1, \ldots, n\}$ be two normally distributed time series with constant means $\mu_Y$ and $\mu_X$ respectively. Let us assume that the random variables $Y_t$ and $X_t$ are independent and that their variances are stable over time and equal to $\sigma_Y^2$ and $\sigma_X^2$. These assumptions can be written as follows:

$$
\begin{aligned}
E(Y_1) &= E(Y_2) = \ldots = E(Y_n) = \mu_Y, \\
E(X_1) &= E(X_2) = \ldots = E(X_n) = \mu_X, \\
D^2(Y_1) &= D^2(Y_2) = \ldots = D^2(Y_n) = \sigma_Y^2, \\
D^2(X_1) &= D^2(X_2) = \ldots = D^2(X_n) = \sigma_X^2, \\
\operatorname{Cov}(X_i, Y_j) &= 0 \quad \text{for } i, j = 1, 2, \ldots, n.
\end{aligned}
\tag{1}
$$

Under these assumptions the ratio of the random variables $Y_t$ and $X_t$,

$$Z_t = \frac{Y_t}{X_t} \quad \text{for } t = n+1,$$

will be considered.

III. RATIO OF RANDOM VARIABLES $Y_t$ AND $X_t$

Let $x_1, x_2, \ldots, x_{n+1}$ and $y_1, y_2, \ldots, y_{n+1}$ be two random samples. Let us assume that the expected value of the ratio $Z$ is stable for periods $t = 1, 2, \ldots, n$. It can be written as follows:

$$E\left(\frac{Y_t}{X_t}\right) = r = \text{const}. \tag{2}$$

There are various methods of determining the expected value of the ratio of two random variables. If $f_Y(y)$ and $f_X(x)$ are the probability densities of the random variables $Y$ and $X$, then the probability density of the random variable $Z$ (M. G. Kendall, 1945) can be written as follows:

$$f(z) = \int_{-\infty}^{\infty} |x| \, f_X(x) \, f_Y(zx) \, dx. \tag{3}$$

The probability density (3) can be used for calculating the expected value of the ratio (2).

The ratio of normal variables was considered by R. C. Geary (1930), G. Marsaglia (1965), D. V. Hinkley (1969) and A. Cedilnik et al. (2004, 2006). In the case of the ratio of two normal random variables with zero expected values, the ratio has a Cauchy distribution and has no expected value, no variance and no moments of higher orders.

If $X \sim N(0, 1)$ and $Y \sim N(0, 1)$, then the ratio (2) has a Cauchy distribution ($Z \sim C(0, 1)$). In the case when $X \sim N(0, \sigma_X)$ and $Y \sim N(0, \sigma_Y)$, the ratio has the Cauchy distribution $Z \sim C\left(0, \frac{\sigma_Y}{\sigma_X}\right)$. The ratio of two arbitrary normal random variables $X \sim N(\mu_X, \sigma_X)$ and $Y \sim N(\mu_Y, \sigma_Y)$ for $\mu_Y \neq 0$ and $\mu_X \neq 0$ is discussed in G. Marsaglia (1965) and leads to a Cauchy-like distribution, but if $\frac{\mu_X}{\sigma_X} \gg 0$ then it leads to a normal distribution (D. V. Hinkley, 1969).
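The Cauchy result for zero-mean normal variables can be checked numerically. The sketch below is an illustration only and is not part of the original study: it uses the Python standard library, and the function name `ratio_quartiles`, the number of draws and the seed are assumptions. It relies on the fact that a Cauchy distribution $C(0, \gamma)$ has its quartiles at $-\gamma$ and $+\gamma$, so the sample quartiles of $Y/X$ should lie close to $\pm\sigma_Y/\sigma_X$.

```python
import random
import statistics


def ratio_quartiles(sigma_y, sigma_x, n_draws=200_000, seed=12345):
    """Simulate Z = Y / X for independent zero-mean normal X and Y.

    Returns the sample quartiles (Q1, median, Q3) of Z. Quartiles are used
    instead of the mean because a Cauchy variable has no expected value.
    """
    rng = random.Random(seed)
    z = []
    for _ in range(n_draws):
        x = rng.gauss(0.0, sigma_x)
        y = rng.gauss(0.0, sigma_y)
        if x != 0.0:  # guard against an (extremely unlikely) exact zero
            z.append(y / x)
    q1, median, q3 = statistics.quantiles(z, n=4)
    return q1, median, q3


if __name__ == "__main__":
    # For sigma_y = 2 and sigma_x = 1 the quartiles should be near -2, 0, 2.
    print(ratio_quartiles(2.0, 1.0))
```

The sample mean of the simulated ratios is deliberately not reported: it does not stabilise as the number of draws grows, which is the practical consequence of the missing moments.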

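The Taylor series expansion can be used for estimating the parameters (if they exist) of the ratio (2). The expected value and variance of the ratio can be written approximately as follows:

$$E\left(\frac{Y}{X}\right) \approx \frac{\mu_Y}{\mu_X} - \frac{\operatorname{Cov}(X, Y)}{\mu_X^2} + \frac{\mu_Y \sigma_X^2}{\mu_X^3}, \tag{4}$$

$$D^2\left(\frac{Y}{X}\right) \approx \frac{\mu_Y^2}{\mu_X^2} \left( \frac{\sigma_Y^2}{\mu_Y^2} - \frac{2 \operatorname{Cov}(X, Y)}{\mu_X \mu_Y} + \frac{\sigma_X^2}{\mu_X^2} \right). \tag{5}$$

Under assumptions (1) the covariance terms vanish.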
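As an illustration, the second-order Taylor (delta-method) approximation of the moments of a ratio can be evaluated numerically and compared with a simulation. The sketch below assumes the standard textbook form of the expansion with an optional covariance term; the function names are hypothetical, and the parameter values $\mu_Y = 11$, $\mu_X = 10$, $\sigma_Y = \sigma_X = 1$ are the ones used later in the Monte Carlo study.

```python
import random


def ratio_moments_taylor(mu_y, mu_x, sigma_y, sigma_x, cov_xy=0.0):
    """Second-order Taylor approximation of E(Y/X) and D^2(Y/X).

    Assumption: standard delta-method formulas; cov_xy = 0 corresponds to
    independent X and Y.
    """
    if mu_x == 0:
        raise ValueError("mu_x must be non-zero")
    mean = mu_y / mu_x - cov_xy / mu_x**2 + mu_y * sigma_x**2 / mu_x**3
    var = (sigma_y**2 / mu_x**2
           - 2.0 * mu_y * cov_xy / mu_x**3
           + mu_y**2 * sigma_x**2 / mu_x**4)
    return mean, var


def ratio_moments_simulated(mu_y, mu_x, sigma_y, sigma_x,
                            n_draws=100_000, seed=1):
    """Sample mean and variance of Y/X for independent normal X and Y.

    Only meaningful when mu_x is many standard deviations away from zero,
    so that draws of X close to zero practically never occur.
    """
    rng = random.Random(seed)
    z = [rng.gauss(mu_y, sigma_y) / rng.gauss(mu_x, sigma_x)
         for _ in range(n_draws)]
    mean = sum(z) / n_draws
    var = sum((v - mean) ** 2 for v in z) / (n_draws - 1)
    return mean, var


if __name__ == "__main__":
    # Taylor approximation: mean is about 1.111, variance about 0.0221.
    print(ratio_moments_taylor(11.0, 10.0, 1.0, 1.0))
    print(ratio_moments_simulated(11.0, 10.0, 1.0, 1.0))
```

With $\mu_X$ ten standard deviations from zero the simulated values stay close to the approximation; for $\mu_X$ near zero the comparison breaks down, in line with the remark that the moments need not exist.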

Resampling methods are often used for estimating the parameters of complex estimators. The resampling methods used most often for estimating the parameters of functions of random variables are the jackknife and the bootstrap.


IV. THE CASE OF AUTOCORRELATED DATA

The standard assumption in statistical quality control analyses is that the data generated by an in-control process are normally and independently distributed. This assumption is made in Shewhart's control chart model, which can be written as follows:

$$X_t = \mu + \varepsilon_t, \quad t = 1, 2, \ldots,$$

where $\varepsilon_t$ ($t = 1, 2, \ldots$) are normally and independently distributed with mean 0 and standard deviation $\sigma$. The assumption of independent observations is not fulfilled in many real processes: the data are often correlated over time (D. C. Montgomery, C. M. Mastrangelo, 2000), and numerous industrial processes produce data that change over time (R. L. Mason, J. C. Young, 2000).

Let us assume the first-order autoregressive model AR(1) for the time series $Y_t$ and $X_t$. We can then write (G. E. P. Box, G. M. Jenkins, 1983)

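$$\tilde{X}_t = \varphi_X \tilde{X}_{t-1} + \varepsilon_{Xt} \tag{6}$$

and

$$\tilde{Y}_t = \varphi_Y \tilde{Y}_{t-1} + \varepsilon_{Yt}, \tag{7}$$

where $\tilde{X}_t = X_t - \mu_X$, $\tilde{Y}_t = Y_t - \mu_Y$, and $\varepsilon_{Xt}$, $\varepsilon_{Yt}$ are normally and independently distributed with mean 0 and standard deviations $\sigma_X$ and $\sigma_Y$ for $t = 1, 2, 3, \ldots, n$. These processes are stationary when $-1 < \varphi_X < 1$ and $-1 < \varphi_Y < 1$. The autoregressive models (6) and (7) can be written as follows:

$$X_t = \varphi_X X_{t-1} + \mu_X (1 - \varphi_X) + \varepsilon_{Xt} \tag{8}$$

and

$$Y_t = \varphi_Y Y_{t-1} + \mu_Y (1 - \varphi_Y) + \varepsilon_{Yt}. \tag{9}$$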
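A series following the AR(1) form above, $X_t = \varphi X_{t-1} + \mu(1 - \varphi) + \varepsilon_t$, might be generated as in the following minimal sketch. It uses only the Python standard library; the function name `simulate_ar1` and the choice to draw the starting value from the stationary distribution are assumptions, since the paper does not say how the series are initialised.

```python
import random


def simulate_ar1(n, phi, mu, sigma, seed=None):
    """Generate x_1, ..., x_n from x_t = phi * x_{t-1} + mu * (1 - phi) + eps_t.

    eps_t are independent N(0, sigma) innovations. The stationarity
    condition -1 < phi < 1 is enforced.
    """
    if not -1.0 < phi < 1.0:
        raise ValueError("stationarity requires -1 < phi < 1")
    rng = random.Random(seed)
    # Assumption: the unobserved x_0 is drawn from the stationary
    # distribution N(mu, sigma / sqrt(1 - phi^2)).
    x_prev = rng.gauss(mu, sigma / (1.0 - phi**2) ** 0.5)
    series = []
    for _ in range(n):
        x_t = phi * x_prev + mu * (1.0 - phi) + rng.gauss(0.0, sigma)
        series.append(x_t)
        x_prev = x_t
    return series


if __name__ == "__main__":
    # Ten observations with mean 10, as in the simulation settings of the paper;
    # phi = 0.5 is an arbitrary illustrative value.
    print(simulate_ar1(10, phi=0.5, mu=10.0, sigma=1.0, seed=42))
```

The term $\mu(1 - \varphi)$ keeps the long-run mean of the series at $\mu$ for any admissible $\varphi$, which is why the centred and uncentred forms of the model describe the same process.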

V. TEST OF STABILITY OF THE RATIO OF RANDOM VARIABLES

Let $x_1, x_2, \ldots, x_{n+1}$ and $y_1, y_2, \ldots, y_{n+1}$ be the observed time series ($t = 1, 2, \ldots, n+1$). Under the assumption (2) the hypothesis of the stability of the expected value of the ratio will be tested. It can be written as follows:

$$H_0: E\left(\frac{Y_{n+1}}{X_{n+1}}\right) = E\left(\frac{\sum_{i=1}^{n} Y_i}{\sum_{i=1}^{n} X_i}\right) \tag{10}$$

against the alternative

$$H_1: E\left(\frac{Y_{n+1}}{X_{n+1}}\right) \neq E\left(\frac{\sum_{i=1}^{n} Y_i}{\sum_{i=1}^{n} X_i}\right). \tag{11}$$

On the basis of the observed values, the parameters of the regression models (8) and (9) can be estimated and the theoretical values $\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_{n+1}$ and $\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_{n+1}$ of the series can be calculated. The residuals $e_{yt} = y_t - \hat{y}_t$ and $e_{xt} = x_t - \hat{x}_t$ ($t = 1, 2, \ldots, n+1$) can then also be calculated. Under the $H_0$ hypothesis the residuals have normal distributions with means 0 and variances $\sigma_y^2$ and $\sigma_x^2$ respectively.
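The estimation step can be sketched as follows. The paper does not state which estimator is used for models (8) and (9); ordinary least squares of each observation on its predecessor is assumed here, and the function name `fit_ar1` is hypothetical. Because the first observation has no predecessor, this sketch returns residuals only from the second observation onward.

```python
def fit_ar1(series):
    """Fit x_t = a + phi * x_{t-1} + e_t by ordinary least squares.

    Returns (phi_hat, a_hat, residuals), where a_hat estimates mu * (1 - phi)
    and residuals[i] is the residual of series[i + 1].
    """
    prev = series[:-1]
    curr = series[1:]
    m = len(prev)
    if m < 2:
        raise ValueError("at least three observations are needed")
    mean_prev = sum(prev) / m
    mean_curr = sum(curr) / m
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    if sxx == 0.0:
        raise ValueError("the lagged series is constant; phi is not identified")
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    phi_hat = sxy / sxx
    a_hat = mean_curr - phi_hat * mean_prev
    # Residual = observed value minus fitted (theoretical) value.
    residuals = [c - (a_hat + phi_hat * p) for p, c in zip(prev, curr)]
    return phi_hat, a_hat, residuals


if __name__ == "__main__":
    # Noise-free series x_t = 2 + 0.5 * x_{t-1}: the fit recovers both
    # coefficients and the residuals are numerically zero.
    x = [0.0]
    for _ in range(10):
        x.append(2.0 + 0.5 * x[-1])
    phi_hat, a_hat, res = fit_ar1(x)
    print(phi_hat, a_hat, a_hat / (1.0 - phi_hat))  # last value estimates mu
```

The estimate of the process mean follows from the intercept as $\hat{\mu} = \hat{a} / (1 - \hat{\varphi})$.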

From the above we have that the standardized residuals $e'_{yt} = \frac{e_{yt}}{\sigma_y}$ and $e'_{xt} = \frac{e_{xt}}{\sigma_x}$ have normal distributions with mean 0 and standard deviation 1. To verify the hypothesis (10) the following statistic will be used:

$$T = \frac{U}{V / (2n)}, \tag{12}$$

where the statistics $U$ and $V$ are defined as

(13)

(14)

Under the $H_0$ hypothesis the statistic $U$ has a chi-square distribution with 1 degree of freedom and the statistic $V$ has a chi-square distribution with $2n$ degrees of freedom. The ratio statistic $T$ given by (12) has an $F$ distribution with 1 and $2n$ degrees of freedom.
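The distributional claim can be illustrated with a short simulation. The sketch below does not use the paper's own definitions of $U$ and $V$; it simply draws generic independent chi-square variables with 1 and $2n$ degrees of freedom, forms the ratio $U / (V / 2n)$, and estimates how often it exceeds a given critical value. For $n = 10$ and the critical value 4.35 quoted in the Monte Carlo study, the estimated tail probability should be close to 0.05. The function name, number of replications and seed are assumptions.

```python
import random


def simulate_f_tail(n, critical_value, reps=50_000, seed=2008):
    """Estimate P(T > critical_value) for T = U / (V / (2n)).

    U is chi-square with 1 degree of freedom and V is chi-square with 2n
    degrees of freedom, both built from independent standard normal draws.
    """
    rng = random.Random(seed)
    df2 = 2 * n
    exceed = 0
    for _ in range(reps):
        u = rng.gauss(0.0, 1.0) ** 2
        v = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df2))
        t = u / (v / df2)
        if t > critical_value:
            exceed += 1
    return exceed / reps


if __name__ == "__main__":
    # n = 10 gives an F distribution with 1 and 20 degrees of freedom.
    print(simulate_f_tail(10, 4.35))
```

The same function can be reused with other values of $n$ to see how the critical value would have to change when more or fewer periods are observed.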

VI. MONTE CARLO STUDY

The approximate values of the probabilities of rejecting the hypothesis (10) for various discordance levels in models (8) and (9) were obtained in a Monte Carlo analysis. In the simulation, observations $x_1, x_2, \ldots, x_{10}$ and $y_1, y_2, \ldots, y_{10}$ were generated using models (8) and (9). The 11th observations of both time series were generated using

$$Y_t = \varphi_Y Y_{t-1} + \mu_Y (1 - \varphi_Y) + k_Y \sigma_Y + \varepsilon_{Yt} \tag{15}$$

and

$$X_t = \varphi_X X_{t-1} + \mu_X (1 - \varphi_X) + k_X \sigma_X + \varepsilon_{Xt}, \tag{16}$$

where $k_Y, k_X = 0, 1, 2, 3$.

Then the statistic (12) was calculated and its value was compared to the critical value $F_{0.05;\,1,\,20}$, which is equal to 4.35. The procedure was repeated 1000 times for each combination of $k_Y$ and $k_X$ ($k_Y, k_X = 0, 1, 2, 3$) with $\mu_Y = 11$, $\mu_X = 10$, $\sigma_Y = \sigma_X = 1$. The results of the Monte Carlo study are presented in Table 1 and in Figure 1.

Table 1. The estimated values of probabilities of rejection of the null hypothesis

k_X \ k_Y      0         1         2         3
0            0.050*    0.067     0.176     0.371
1            0.093     0.035     0.071     0.191
2            0.166     0.063     0.037     0.048
3            0.304     0.094     0.110     0.080

*) the significance level (not simulated)
Source: Monte Carlo study.

Fig. 1. The estimated values of probabilities of rejection of the null hypothesis

The probabilities of rejecting the hypothesis (10) are greatest for $k_Y = 3$, $k_X = 0$ or $k_Y = 0$, $k_X = 3$ (these cases give the greatest changes in the ratio). If $k_Y = k_X$ then the probability of rejecting the null hypothesis is close to $\alpha = 0.05$ (changes in the ratio are small).

The proposed test can be used for testing the stability of the expected value of the ratio of random variables for which autoregression models are assumed. The probability of rejecting the hypothesis of stability increases when the expected value of one of the analyzed variables changes, and it is close to the significance level $\alpha$ for consistent, proportional changes of the expected values of both random variables.


REFERENCES

Box G. E. P., Jenkins G. M. (1983), Analiza szeregów czasowych. Prognozowanie i sterowanie, Państwowe Wydawnictwo Naukowe, Warszawa.

Cedilnik A., Kosmelj K., Blejec A. (2004), The Distribution of the Ratio of Jointly Normal Variables, Metodoloski zvezki, vol. 1, no. 1, pp. 99-108.

Cedilnik A., Kosmelj K., Blejec A. (2006), Ratio of Two Random Variables: A Note on the Existence of its Moments, Metodoloski zvezki, vol. 3, no. 1, pp. 1-7.

Geary R. C. (1930), The frequency distribution of the quotient of two normal variates, Journal of the Royal Statistical Society, 93, pp. 442-446.

Hinkley D. V. (1969), On the ratio of two correlated normal random variables, Biometrika, 56, pp. 635-639.

Kendall M. G. (1945), The Advanced Theory of Statistics, Charles Griffin & Company Limited, London.

Marsaglia G. (1965), Ratios of normal variables and ratios of sums of uniform variables, JASA, 60, pp. 163-204.

Mason R. L., Young J. C. (2000), Autocorrelation in Multivariate Processes, in: Statistical process monitoring and optimization, pp. 387-394, Marcel Dekker, New York - Basel.

Montgomery D. C., Mastrangelo C. M. (2000), Process monitoring with autocorrelated data, in: Statistical process monitoring and optimization, pp. 139-160, Marcel Dekker, New York - Basel.

Grzegorz Kończak

ON TESTING THE HYPOTHESIS OF STABILITY OF THE EXPECTED VALUE OF THE RATIO OF TWO RANDOM VARIABLES

The article presents a proposal for a test that makes it possible to verify the hypothesis of the stability of the expected value of the ratio of random variables. It is assumed that measurements of the random variables Y and X are made in n consecutive time periods. The subject of the analysis is the ratio of random variables Z = Y/X under the assumption of an autoregression model. The problem presented in the article is encountered in statistical quality control when it is necessary to maintain an appropriate proportion, for example of the dimensions of a product or of the components in mixtures.
