
No. 1, 2006

Grzegorz KOŃCZAK*, Janusz WYWIAŁ*

TESTING HYPOTHESIS ON STABILITY OF EXPECTED VALUE AND VARIANCE

Simple samples are taken independently from a normal distribution. Two functions of the sample means and sample variances are considered, and the density functions of these two statistics are derived. These statistics can be applied to verify the hypothesis on the stability of the expected value and the variance of a normal distribution, considered, e.g., in statistical process control. The critical values for these statistics have been found using numerical integration. Tables with approximate critical values of these statistics are presented.

Keywords: density function, sample variance, test statistic, numerical integration, statistical process control

1. Introduction

One of the problems of statistical process control is considered, namely a procedure (so-called control charts) for monitoring the stability of the expected value and the variance of diagnostic variables. We assume that during the first $k \geq 2$ periods the mean values of a diagnostic variable are the same but unknown; the same holds for the variance of the variable. The unbiased estimators of the expected value $\mu$ and the variance $\sigma^2$, evaluated on the basis of the data observed in the first $k$ periods and in the $(k+1)$-th period, are $\bar{X}$, $\hat{S}^2$ and $\bar{X}_{k+1}$, $\hat{S}^2_{k+1}$, respectively. Our problem is the following: is the process (characterized by the diagnostic variable) stable in all of the periods? If yes, the distances $|\bar{X} - \bar{X}_{k+1}|$ and $|\hat{S}^2 - \hat{S}^2_{k+1}|$ should not be significant. Such a problem is considered, e.g., in [4]. More formally, we have the problem of testing the hypothesis

* Department of Statistics, Katowice University of Economics, ul. Bogucicka 14, 40-226 Katowice. koncz@ae.katowice.pl, wywial@ae.katowice.pl


$$H_0: \ E(\bar{X}) = E(\bar{X}_{k+1}) \quad \text{and} \quad E(\hat{S}^2) = E(\hat{S}^2_{k+1}).$$

We are going to construct a test statistic for this hypothesis in the next paragraphs.

2. Basic definitions and properties

Let $\mathbf{J}_a$ be a column vector consisting of $a$ elements equal to one and let $\mathbf{I}_a$ be the identity matrix of degree $a$. Moreover, let $\mathbf{X} = [\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_i, \ldots, \mathbf{X}_{k+1}]$, where

$$\mathbf{X}_i = [X_{i1}, X_{i2}, \ldots, X_{ij}, \ldots, X_{in_i}] \quad (i = 1, 2, \ldots, k+1,\ k \geq 2,\ j = 1, 2, \ldots, n_i).$$

We will consider the following statistics:

$$\bar{X}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} X_{ij} = \frac{1}{n_i}\,\mathbf{X}_i\mathbf{J}_{n_i}, \quad \text{for } i = 1, 2, \ldots, k+1,$$

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{k}\sum_{j=1}^{n_i} X_{ij} = \frac{1}{n}\,\mathbf{X}\mathbf{J}_n, \quad \text{where } n = \sum_{i=1}^{k} n_i,$$

$$\hat{S}_i^2 = \frac{1}{n_i-1}\sum_{j=1}^{n_i}(X_{ij} - \bar{X}_i)^2 = \frac{1}{n_i-1}\,\mathbf{X}_i\mathbf{M}_i\mathbf{X}_i^T, \qquad \mathbf{M}_i = \mathbf{I}_{n_i} - \frac{1}{n_i}\mathbf{J}_{n_i}\mathbf{J}_{n_i}^T,$$

$$\hat{S}^2 = \frac{1}{n-k}\sum_{i=1}^{k}(n_i - 1)\hat{S}_i^2 = \frac{1}{n-k}\,\mathbf{X}\mathbf{M}\mathbf{X}^T, \qquad \mathbf{M} = \mathrm{diag}[\mathbf{M}_i],$$

where $\mathbf{M}$ is the block-diagonal matrix of degree $n$ with blocks $\mathbf{M}_i$, and

$$\tilde{S}^2 = \frac{1}{k-1}\sum_{i=1}^{k} n_i(\bar{X}_i - \bar{X})^2 = \frac{1}{k-1}\,\mathbf{X}\mathbf{N}\mathbf{X}^T, \qquad \mathbf{N} = \mathbf{I}_n - \frac{1}{n}\mathbf{J}_n\mathbf{J}_n^T - \mathbf{M},$$

where $\hat{S}_i^2$ is the sample variance within the $i$-th group, $\hat{S}^2$ is the mean of the group variances and $\tilde{S}^2$ is the variance between groups. Moreover, let us note that $\mathbf{M}_i^2 = \mathbf{M}_i$, $\mathbf{M}^2 = \mathbf{M}$ and $\mathbf{N}^2 = \mathbf{N}$.
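The statistics defined above are straightforward to compute from grouped data. The following sketch is an illustration added here, not part of the original paper; the function name `group_statistics` and the simulated data are our own assumptions.

```python
import numpy as np

def group_statistics(groups):
    """groups: list of 1-D arrays X_1, ..., X_k (the samples from the first k periods)."""
    n_i = np.array([len(g) for g in groups])                  # group sizes n_i
    n, k = n_i.sum(), len(groups)                             # n = sum of the n_i
    xbar_i = np.array([g.mean() for g in groups])             # group means
    xbar = np.concatenate(groups).mean()                      # overall mean of the first k samples
    s2_i = np.array([g.var(ddof=1) for g in groups])          # within-group variances
    s2_hat = ((n_i - 1) * s2_i).sum() / (n - k)               # mean of group variances
    s2_tilde = (n_i * (xbar_i - xbar) ** 2).sum() / (k - 1)   # between-group variance
    return xbar_i, xbar, s2_i, s2_hat, s2_tilde

# Example: k = 4 groups of size 5 drawn from N(0, 1)
rng = np.random.default_rng(0)
print(group_statistics([rng.normal(size=5) for _ in range(4)]))
```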

The particular case of the theorem on the independence of quadratic or linear forms of normal vectors corresponding to our problem is as follows (see [3], p. 224 ff.).

(3)

Theorem 1. Let $\mathbf{X}$ have a non-singular normal distribution $N(\mu\mathbf{J}_n, \sigma^2\mathbf{I}_n)$ and let $Q_A = \mathbf{X}\mathbf{A}\mathbf{X}^T$, $Q_B = \mathbf{X}\mathbf{B}\mathbf{X}^T$, $L = \mathbf{X}\mathbf{a}$, where $\mathbf{a}$ is a non-random column vector and $\mathbf{A}$, $\mathbf{B}$ are symmetric, non-random matrices of degree $n$ each. Then the set of necessary and sufficient conditions for $Q_A$ and $Q_B$ to be independently distributed is a) $\mathbf{A}\mathbf{B} = \mathbf{O}$ or b) $\mathbf{A}\mathbf{B}\mathbf{J}_n = \mathbf{O}$. The quadratic form $Q_A$ and the linear form $L$ are independently distributed if and only if $\mathbf{A}\mathbf{a} = \mathbf{O}$.

On the basis of this theorem we can show that the statistics in the following pairs are independently distributed: $(\bar{X}, \tilde{S}^2)$, $(\bar{X}, \hat{S}^2)$, $(\hat{S}^2, \tilde{S}^2)$. Moreover, the statistics $\bar{X}_{k+1}$ and $\hat{S}^2_{k+1}$ are independent, and they are independent of each of the statistics $\bar{X}$, $\hat{S}^2$ and $\tilde{S}^2$.

The obtained result and the well-known definitions let us derive the following distributions:

$$Z = \frac{\bar{X} - \bar{X}_{k+1}}{\sigma\sqrt{\dfrac{1}{n} + \dfrac{1}{n_{k+1}}}} : N(0, 1). \qquad (1)$$

Moreover,

$$U_1 = \frac{(k-1)\tilde{S}^2}{\sigma^2} : \chi^2_{k-1}, \qquad (2)$$

$$U_2 = \frac{(n-k)\hat{S}^2}{\sigma^2} : \chi^2_{n-k} \qquad (3)$$

and

$$U_{0,i} = \frac{(n_i - 1)\hat{S}_i^2}{\sigma^2} : \chi^2_{n_i - 1}, \quad i = 1, 2, \ldots, k+1. \qquad (4)$$

On the basis of these expressions we have

$$F_1 = \frac{Z^2\sigma^2}{\tilde{S}^2} : F(1, k-1), \qquad (5)$$

$$F_2 = \frac{Z^2\sigma^2}{\hat{S}^2} : F(1, n-k), \qquad (6)$$

$$F_3 = \frac{\hat{S}^2_{k+1}}{\hat{S}^2} : F(n_{k+1}-1, n-k), \qquad (7)$$

$$F_4 = \frac{\hat{S}^2_{k+1}}{\tilde{S}^2} : F(n_{k+1}-1, k-1), \qquad (8)$$


where $F(r, m)$ denotes the well-known $F$ distribution with $r$ and $m$ degrees of freedom and the following density function:

$$f(g) = \frac{\Gamma\left(\frac{r+m}{2}\right)}{\Gamma\left(\frac{r}{2}\right)\Gamma\left(\frac{m}{2}\right)}\left(\frac{r}{m}\right)^{\frac{r}{2}} g^{\frac{r}{2}-1}\left(1 + \frac{r}{m}\,g\right)^{-\frac{r+m}{2}} I_{(0,\infty)}(g). \qquad (9)$$
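As a quick plausibility check of distribution (7), the following simulation sketch (not taken from the paper; the balanced design $k = 4$, $n_i = 5$ is an assumption) draws repeated samples under $H_0$ and compares the empirical distribution of $\hat{S}^2_{k+1}/\hat{S}^2$ with $F(n_{k+1}-1, n-k)$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
k, n_i = 4, 5                        # assumed design: k groups of size n_i, plus group k + 1
n = k * n_i
reps = 20_000
f3 = np.empty(reps)
for t in range(reps):
    first_k = rng.normal(size=(k, n_i))          # samples from the first k periods
    extra = rng.normal(size=n_i)                 # sample from period k + 1
    s2_hat = first_k.var(axis=1, ddof=1).mean()  # pooled variance; equal n_i, so a plain mean
    f3[t] = extra.var(ddof=1) / s2_hat           # F3 from (7)
# Kolmogorov-Smirnov test against F(n_{k+1} - 1, n - k); a large p-value is expected under H0
print(stats.kstest(f3, stats.f(dfn=n_i - 1, dfd=n - k).cdf))
```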

3. The statistics $Q_1$ and $Q_2$

Let us consider two statistics $Q_1$ and $Q_2$ given as follows:

$$Q_1 = F_1 + (F_3 - 1)^2 = \frac{Z^2\sigma^2}{\tilde{S}^2} + \left(\frac{\hat{S}^2_{k+1}}{\hat{S}^2} - 1\right)^2, \qquad (10)$$

$$Q_2 = F_2 + (F_4 - 1)^2 = \frac{Z^2\sigma^2}{\hat{S}^2} + \left(\frac{\hat{S}^2_{k+1}}{\tilde{S}^2} - 1\right)^2, \qquad (11)$$

where $F_1$–$F_4$ are defined by expressions (5)–(8), respectively.
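In practice $Q_1$ and $Q_2$ can be computed directly from the observed samples; note that $\sigma^2$ cancels, since $Z^2\sigma^2 = (\bar{X} - \bar{X}_{k+1})^2/(1/n + 1/n_{k+1})$. A minimal sketch follows (the function name and data layout are our own assumptions, not the authors' code).

```python
import numpy as np

def q_statistics(groups, new_group):
    """Q1 and Q2 from the first k samples (`groups`) and the (k+1)-th sample (`new_group`)."""
    n_i = np.array([len(g) for g in groups])
    n, k = n_i.sum(), len(groups)
    xbar_i = np.array([g.mean() for g in groups])
    xbar = np.concatenate(groups).mean()
    s2_i = np.array([g.var(ddof=1) for g in groups])
    s2_hat = ((n_i - 1) * s2_i).sum() / (n - k)                 # pooled variance
    s2_tilde = (n_i * (xbar_i - xbar) ** 2).sum() / (k - 1)     # between-group variance
    xbar_new, s2_new = new_group.mean(), new_group.var(ddof=1)  # statistics of sample k + 1
    z2_sigma2 = (xbar - xbar_new) ** 2 / (1.0 / n + 1.0 / len(new_group))
    q1 = z2_sigma2 / s2_tilde + (s2_new / s2_hat - 1.0) ** 2    # expression (10)
    q2 = z2_sigma2 / s2_hat + (s2_new / s2_tilde - 1.0) ** 2    # expression (11)
    return q1, q2

# Example: compare the observed values with the critical values in Tables 1 and 2
rng = np.random.default_rng(2)
groups = [rng.normal(size=5) for _ in range(4)]
print(q_statistics(groups, rng.normal(size=5)))
```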

On the basis of the previous results, the random variables $Z$, $U_1$, $U_2$ and $U_{0,k+1}$ are independent. Finally, this and the fact that the samples are independent lead to the conclusion that the statistics $F_1$ and $F_3$, as well as the statistics $F_2$ and $F_4$, are independently distributed.

The density function of random variable F1 is as follows:

$$f_1(g) = \frac{\Gamma\left(\frac{k}{2}\right)}{\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{k-1}{2}\right)}\left(\frac{1}{k-1}\right)^{\frac{1}{2}} g^{-\frac{1}{2}}\left(1 + \frac{g}{k-1}\right)^{-\frac{k}{2}} I_{(0,\infty)}(g), \qquad (12)$$

where

$$I_A(x) = \begin{cases} 1, & x \in A, \\ 0, & x \notin A. \end{cases}$$


The density function of random variable $F_2$ is:

$$f_2(g) = \frac{\Gamma\left(\frac{n-k+1}{2}\right)}{\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{n-k}{2}\right)}\left(\frac{1}{n-k}\right)^{\frac{1}{2}} g^{-\frac{1}{2}}\left(1 + \frac{g}{n-k}\right)^{-\frac{n-k+1}{2}} I_{(0,\infty)}(g). \qquad (13)$$

The density functions of $F_3$ and $F_4$ are as follows:

$$f_3(g) = c_3\, g^{\frac{r}{2}-1}\left(1 + \frac{r}{n-k}\,g\right)^{-\frac{r+n-k}{2}} I_{(0,\infty)}(g), \qquad (14)$$

$$f_4(g) = c_4\, g^{\frac{r}{2}-1}\left(1 + \frac{r}{k-1}\,g\right)^{-\frac{r+k-1}{2}} I_{(0,\infty)}(g), \qquad (15)$$

where $r = n_{k+1} - 1$ and

$$c_3 = \frac{\Gamma\left(\frac{r+n-k}{2}\right)}{\Gamma\left(\frac{r}{2}\right)\Gamma\left(\frac{n-k}{2}\right)}\left(\frac{r}{n-k}\right)^{\frac{r}{2}}, \qquad c_4 = \frac{\Gamma\left(\frac{r+k-1}{2}\right)}{\Gamma\left(\frac{r}{2}\right)\Gamma\left(\frac{k-1}{2}\right)}\left(\frac{r}{k-1}\right)^{\frac{r}{2}}.$$

Let us derive the distribution of the random variable $(F_3 - 1)^2$. If $b = g - 1$, then we have $g = b + 1$, $dg = db$ and

$$f_{3a}(b) = c_3\,(b+1)^{\frac{r}{2}-1}\left(\frac{n-k+r+rb}{n-k}\right)^{-\frac{r+n-k}{2}} I_{(-1,\infty)}(b). \qquad (16)$$

When $v = b^2$, then we have $b = \pm\sqrt{v}$ and $|db| = \frac{1}{2\sqrt{v}}\,dv$. If $b \in (-1, 0]$, then $b = -\sqrt{v}$. If $v \in [0, 1)$, then

$$f_{3b}(v) = \frac{c_3}{2\sqrt{v}}\left[(1-\sqrt{v})^{\frac{r}{2}-1}\left(\frac{n-k+r-r\sqrt{v}}{n-k}\right)^{-\frac{r+n-k}{2}} + (1+\sqrt{v})^{\frac{r}{2}-1}\left(\frac{n-k+r+r\sqrt{v}}{n-k}\right)^{-\frac{r+n-k}{2}}\right]. \qquad (17)$$

If $v \in [1, \infty)$, then

$$f_{3b}(v) = \frac{c_3}{2\sqrt{v}}\,(1+\sqrt{v})^{\frac{r}{2}-1}\left(\frac{n-k+r+r\sqrt{v}}{n-k}\right)^{-\frac{r+n-k}{2}}. \qquad (18)$$

We can write the density function of $(F_3 - 1)^2$ in the following way:

$$f_{3b}(v) = \frac{c_3}{2\sqrt{v}}\left[(1-\sqrt{v})^{\frac{r}{2}-1}\left(\frac{n-k+r-r\sqrt{v}}{n-k}\right)^{-\frac{r+n-k}{2}} I_{(0,1)}(v) + (1+\sqrt{v})^{\frac{r}{2}-1}\left(\frac{n-k+r+r\sqrt{v}}{n-k}\right)^{-\frac{r+n-k}{2}} I_{(0,\infty)}(v)\right]. \qquad (19)$$

If $W : F(1, k-1)$, then the density function of $W$ is given by expression (12). Let $V = (F_3 - 1)^2$, where $F_3$ has the $F$ distribution with $r$ and $n-k$ degrees of freedom; the density function of $V$ is given by (19). Now we are going to evaluate the density function of the random variable $Q_1 = W + V$. The density function of the statistic $Q_1$ is as follows:

$$h_1(q) = \int_0^{\infty} f_{3b}(v)\, f_1(q - v)\, dv \qquad (20)$$

$$h_1(q) = c_1 \int_0^{\infty} \frac{1}{2\sqrt{v}}\left[(1-\sqrt{v})^{\frac{r}{2}-1}\left(\frac{n-k+r-r\sqrt{v}}{n-k}\right)^{-\frac{r+n-k}{2}} I_{(0,1)}(v) + (1+\sqrt{v})^{\frac{r}{2}-1}\left(\frac{n-k+r+r\sqrt{v}}{n-k}\right)^{-\frac{r+n-k}{2}} I_{(0,\infty)}(v)\right](q-v)^{-\frac{1}{2}}\left(1 + \frac{q-v}{k-1}\right)^{-\frac{k}{2}} I_{(0,\infty)}(q-v)\, dv, \qquad (21)$$

where

$$c_1 = \frac{\Gamma\left(\frac{k}{2}\right)}{\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{k-1}{2}\right)}\left(\frac{1}{k-1}\right)^{\frac{1}{2}} \cdot \frac{\Gamma\left(\frac{r+n-k}{2}\right)}{\Gamma\left(\frac{r}{2}\right)\Gamma\left(\frac{n-k}{2}\right)}\left(\frac{r}{n-k}\right)^{\frac{r}{2}}.$$
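The convolution (20)–(21) can be evaluated numerically. The sketch below is illustrative only; it relies on scipy.stats for the $F$ densities and scipy.integrate.quad for the integral, neither of which the paper prescribes.

```python
import numpy as np
from scipy import stats, integrate

def h1(q, k, n, r):
    """Density (20)-(21) of Q1 = F1 + (F3 - 1)^2, evaluated by numerical integration."""
    f1_pdf = stats.f(dfn=1, dfd=k - 1).pdf      # density (12), i.e. F(1, k - 1)
    f3_pdf = stats.f(dfn=r, dfd=n - k).pdf      # density (14), i.e. F(r, n - k)

    def f3b(v):                                  # density (19) of (F3 - 1)^2
        s = np.sqrt(v)
        dens = f3_pdf(1.0 + s) / (2.0 * s)
        if v < 1.0:                              # both branches b = +-sqrt(v) contribute
            dens += f3_pdf(1.0 - s) / (2.0 * s)
        return dens

    # f1(q - v) vanishes for v >= q, so integrate over (0, q); the kink at v = 1
    # is passed to quad as a break point when it lies inside the interval.
    breaks = [1.0] if q > 1.0 else None
    val, _ = integrate.quad(lambda v: f3b(v) * f1_pdf(q - v), 0.0, q,
                            points=breaks, limit=200)
    return val

# Example under the tabulated design: k = 4 groups of size 5, so n = 20 and r = 4
print(h1(5.85, k=4, n=20, r=4))
```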

Similarly, the density function of $Q_2$ is derived in the following way:

$$h_2(q) = \int_0^{\infty} f_{4b}(v)\, f_2(q - v)\, dv, \qquad (22)$$

where $f_{4b}$ is the density function of $(F_4 - 1)^2$:

$$f_{4b}(v) = \frac{c_4}{2\sqrt{v}}\left[(1-\sqrt{v})^{\frac{r}{2}-1}\left(\frac{k-1+r-r\sqrt{v}}{k-1}\right)^{-\frac{r+k-1}{2}} I_{[0,1)}(v) + (1+\sqrt{v})^{\frac{r}{2}-1}\left(\frac{k-1+r+r\sqrt{v}}{k-1}\right)^{-\frac{r+k-1}{2}} I_{(0,\infty)}(v)\right]. \qquad (23)$$

Finally, the density function $h_2(q)$ of the statistic $Q_2$ is as follows:

$$h_2(q) = c_2 \int_0^{\infty} \frac{1}{2\sqrt{v}}\left[(1-\sqrt{v})^{\frac{r}{2}-1}\left(\frac{k-1+r-r\sqrt{v}}{k-1}\right)^{-\frac{r+k-1}{2}} I_{[0,1)}(v) + (1+\sqrt{v})^{\frac{r}{2}-1}\left(\frac{k-1+r+r\sqrt{v}}{k-1}\right)^{-\frac{r+k-1}{2}} I_{(0,\infty)}(v)\right](q-v)^{-\frac{1}{2}}\left(1 + \frac{q-v}{n-k}\right)^{-\frac{n-k+1}{2}} I_{(0,\infty)}(q-v)\, dv, \qquad (24)$$

where

$$c_2 = \frac{\Gamma\left(\frac{n-k+1}{2}\right)}{\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{n-k}{2}\right)}\left(\frac{1}{n-k}\right)^{\frac{1}{2}} \cdot \frac{\Gamma\left(\frac{r+k-1}{2}\right)}{\Gamma\left(\frac{r}{2}\right)\Gamma\left(\frac{k-1}{2}\right)}\left(\frac{r}{k-1}\right)^{\frac{r}{2}}.$$

The distribution functions of $Q_1$ and $Q_2$ are evaluated by means of the following integral:

$$H_i(q) = P(Q_i < q) = \int_0^{q} h_i(s)\, ds, \quad i = 1, 2. \qquad (25)$$

For a given significance level $\alpha$ the quantile $q_{1-\alpha}$ is determined on the basis of the integral:

$$\int_{q_{1-\alpha}}^{\infty} h_i(s)\, ds = \alpha. \qquad (26)$$
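Since $Q_1 = W + V$ with independent components, the distribution function can also be written as $H_1(q) = \int_0^q f_{3b}(v)\, P(W \leq q - v)\, dv$, so a single numerical integration combined with root-finding yields $q_{1-\alpha}$. The sketch below is a hedged illustration: the bracketing value `upper` and the use of Brent's method are our assumptions, not the paper's procedure.

```python
import numpy as np
from scipy import stats, integrate, optimize

def H1(q, k, n, r):
    """CDF (25) of Q1: P(W + V < q) = int_0^q f_3b(v) * P(W <= q - v) dv."""
    w_cdf = stats.f(dfn=1, dfd=k - 1).cdf       # W : F(1, k - 1)
    f3_pdf = stats.f(dfn=r, dfd=n - k).pdf      # F3 : F(r, n - k)

    def f3b(v):                                  # density (19) of V = (F3 - 1)^2
        s = np.sqrt(v)
        dens = f3_pdf(1.0 + s) / (2.0 * s)
        if v < 1.0:
            dens += f3_pdf(1.0 - s) / (2.0 * s)
        return dens

    breaks = [1.0] if q > 1.0 else None
    val, _ = integrate.quad(lambda v: f3b(v) * w_cdf(q - v), 0.0, q,
                            points=breaks, limit=200)
    return val

def quantile_q1(alpha, k, n, r, upper=200.0):
    # `upper` is an assumed bracketing value; enlarge it for very small alpha or k.
    return optimize.brentq(lambda q: H1(q, k, n, r) - (1.0 - alpha), 1e-3, upper)

# With k = 4, n = 20, r = 4 this should land near the Table 1 value 10.38 for
# alpha = 0.05, provided the reconstruction of the densities above is faithful.
print(quantile_q1(0.05, k=4, n=20, r=4))
```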

4. Numerical computations

An exact solution of equation (26) is very difficult to obtain. In this situation, the quantiles of $Q_1$ and $Q_2$ can be found using numerical integration (see, e.g., [1], [2]). The quantiles were found for three significance levels ($\alpha$ = 0.01, 0.05, 0.1). Table 1 presents quantiles of the statistic $Q_1$ and Table 2 presents quantiles of the statistic $Q_2$. These quantiles were evaluated for the case $n_1 = n_2 = \ldots = n_k = n_{k+1} = 5$ and for the numbers of groups $k$ from 4 to 10 and for $k$ = 15, 20, 25 and 30.
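The tabulated values can also be cross-checked by Monte Carlo simulation of $Q_1 = F_1 + (F_3 - 1)^2$ from independent $F$ variates. This is a rough check added here for illustration; the paper itself uses numerical integration.

```python
import numpy as np
from scipy import stats

def mc_quantiles_q1(k, n_i=5, reps=500_000, alphas=(0.10, 0.05, 0.01), seed=0):
    """Empirical (1 - alpha)-quantiles of Q1 = F1 + (F3 - 1)^2 under H0."""
    rng = np.random.default_rng(seed)
    n, r = k * n_i, n_i - 1
    f1 = stats.f(dfn=1, dfd=k - 1).rvs(reps, random_state=rng)   # F1 : F(1, k - 1)
    f3 = stats.f(dfn=r, dfd=n - k).rvs(reps, random_state=rng)   # F3 : F(r, n - k)
    q1 = f1 + (f3 - 1.0) ** 2
    return {a: float(np.quantile(q1, 1.0 - a)) for a in alphas}

# For k = 4 the estimates should be close to the Table 1 row (5.85, 10.38, 31.19).
print(mc_quantiles_q1(k=4))
```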

Table 1. Quantiles of the statistic Q1

Number of             Significance level α
groups k          0.10      0.05      0.01
    4             5.85     10.38     31.19
    5             4.86      8.08     21.43
    6             4.36      6.96     16.75
    7             4.06      6.31     14.21
    8             3.86      5.90     12.67
    9             3.72      5.61     11.64
   10             3.62      5.40     10.92
   15             3.34      4.85      9.15
   20             3.23      4.62      8.45
   25             3.16      4.50      8.08
   30             3.12      4.42      7.85

Fig. 1. The graphic display of quantiles of the statistic Q1 for significance levels α = 0.1, 0.05 and 0.01

Figure 1 presents quantiles of the statistic Q1 for the significance levels 0.1, 0.05 and 0.01. These quantiles are presented for the same cases as in Table 1. Figure 2 presents the same results as Figure 1, but for the statistic Q2.


Table 2. Quantiles of the statistic Q2

Number of             Significance level α
groups k          0.10      0.05      0.01
    4             3.42      5.00     10.71
    5             3.30      4.76      9.41
    6             3.23      4.62      9.21
    7             3.18      4.53      8.34
    8             3.15      4.46      8.07
    9             3.12      4.41      7.88
   10             3.10      4.37      7.74
   15             3.04      4.26      7.39
   20             3.01      4.20      7.23
   25             2.99      4.17      7.21
   30             2.98      4.15      7.10

Fig. 2. The graphic display of quantiles of the statistic Q2 for significance levels α = 0.1, 0.05 and 0.01

References

[1] BRANDT S., Statistical and Computational Methods in Data Analysis, Springer Verlag, New York 1997.

[2] DAHLQUIST G., BJORCK A., Numerical Methods (in Polish), PWN, Warszawa 1992.

[3] MATHAI A.M., PROVOST S.B., Quadratic Forms in Random Variables, Marcel Dekker, Inc., New York, Basel, Hong Kong 1992.

[4] THOMPSON J.R., KORONACKI J., Statistical Process Control: The Deming Paradigm and Beyond, Chapman and Hall/CRC, New York, London 2001.


Testing the hypothesis on the stability of the expected value and variance (summary)

The paper considers the problem of the simultaneous stability of the expected value and the variance. Simple samples are taken independently from a normally distributed population. Two functions of the sample mean and the sample variance are considered, and their density functions are derived. The proposed statistics can be used to verify the hypothesis on the stability of the expected value and the variance of a normal distribution. Such a hypothesis may be considered, e.g., in statistical process control when constructing control charts. The exact determination of the quantiles of the considered statistics is very difficult. Therefore, the critical values of these statistics have been determined by numerical integration for three commonly used significance levels (α = 0.01, 0.05 and 0.1) and for numbers of samples from 4 to 30. Tables of the critical values for these statistics are presented. The proposed statistics and the determined critical values may also be useful for detecting changes in production processes.

Keywords: density function, sample variance, test, numerical integration, statistical process control
