
Tests for relation type - equivalence or tolerance - based on medians from multiple pairwise comparisons


LESZEK KLUKOWSKI
Systems Research Institute, Polish Academy of Sciences

Summary

A statistical procedure for determining the relation type (equivalence or tolerance) in a finite set, on the basis of multiple pairwise comparisons with random errors, is presented in the paper. The procedure consists of two tests; it is an extension of the approach presented in [3]. The test statistic is based on a mixture of random variables and rests on the estimated forms of both relations and on some probabilistic inequalities. The tests require only weak assumptions about the distributions of comparison errors.

Keywords: tests for equivalence and tolerance relations, medians from multiple pairwise comparisons.

1. Introduction

The methods of estimation of the equivalence relation and of the tolerance relation in a finite set, presented in [2], [5], are based on the assumption that the type of relation is known. In practice this may not be true; therefore a decision rule for determining the type of relation (the true model of the data) is necessary. An example of such a problem is a set of descendants, which can include siblings only, i.e. brothers or sisters (equivalence relation), or also stepbrothers and stepsisters (tolerance relation). Another example, determining the relation type in a set of functions expressing the profitability of treasury securities, has been presented in Klukowski (2006) [3].

In the paper statistical tests are proposed for this purpose. They are based on estimates of both relations and on well-known probabilistic inequalities for moments of random variables. The estimators of the relations exploit the idea of the nearest adjoining order (see Slater 1961 [6], David 1988 [1]). The test statistic rests on a mixture of random variables; two parameters of one component of the mixture are determined: the expected value and an evaluation (upper bound) of the variance.

The paper consists of four sections.
The second section presents definitions and notation; the main results, i.e. the form of the tests, are presented in section 3. The last section summarizes the results.

2. Basic definitions and notations

The equivalence relation (reflexive, symmetric, transitive) divides the set $X = \{x_1, \dots, x_m\}$ ($m \ge 3$) into $n_1$ ($n_1 \ge 2$) subsets $\chi_r^{*(1)}$ ($r = 1, \dots, n_1$) with empty intersections, i.e.:

$$\bigcup_{r=1}^{n_1} \chi_r^{*(1)} = X, \qquad \chi_r^{*(1)} \cap \chi_s^{*(1)} = \{0\} \ \text{ for } r \ne s, \tag{1}$$

where $\{0\}$ denotes the empty set.

The tolerance relation divides the set $X$ into $n_2$ ($n_2 \ge 2$) subsets $\chi_r^{*(2)}$ ($r = 1, \dots, n_2$) with at least one non-empty intersection, i.e. the relation is not transitive. It satisfies the conditions:

$$\bigcup_{r=1}^{n_2} \chi_r^{*(2)} = X, \quad n_2 \ge 2, \quad \text{and there exists at least one pair } (\chi_r^{*(2)}, \chi_s^{*(2)}),\ r \ne s, \text{ such that } \chi_r^{*(2)} \cap \chi_s^{*(2)} \ne \{0\}. \tag{2}$$

The equivalence relation can be described by the function $T_1 : X \times X \to D$, $D = \{0, 1\}$, defined as follows:

$$T_1(x_i, x_j) = \begin{cases} 0 & \text{if there exists } \chi_q^{*(1)} \text{ such that } (x_i, x_j) \in \chi_q^{*(1)},\ i \ne j; \\ 1 & \text{otherwise.} \end{cases} \tag{3a}$$

The tolerance relation is described by the function $T_2 : X \times X \to D$, $D = \{0, 1\}$, defined as follows:

$$T_2(x_i, x_j) = \begin{cases} 0 & \text{if there exist } q \text{ and } s \ (q = s \text{ not excluded}) \text{ such that } (x_i, x_j) \in \chi_q^{*(2)} \cap \chi_s^{*(2)},\ i \ne j; \\ 1 & \text{otherwise.} \end{cases} \tag{3b}$$

It is assumed that the function $T_2(x_i, x_j)$ characterizes the tolerance relation completely, i.e. there exists a one-to-one relationship between the form of the relation and the function $T_2(x_i, x_j)$. The requirement is satisfied if each subset $\chi_q^{*(2)}$ includes an element $x_i$ which is not included in any other subset $\chi_s^{*(2)}$ ($s \ne q$), i.e. $x_i \in \chi_q^{*(2)}$ and $x_i \notin \chi_s^{*(2)}$. Of course, $T_f(x_i, x_i) = 0$ ($f = 1, 2$; $i = 1, \dots, m$).

In the paper it is assumed that the type of the relation (equivalence or tolerance) in the set $X$ is not known and has to be determined on the basis of pairwise comparisons $g_k(x_i, x_j)$ ($k = 1, \dots, N$; $(x_i, x_j) \in X \times X$) with random errors; each comparison evaluates homogeneity or non-homogeneity of the elements in the pair. Homogeneity means inclusion in the same subset (also in an intersection); non-homogeneity means inclusion in different subsets. Hence, the comparisons $g_k(x_i, x_j)$ do not determine the type of the relation; they are only the basis for inference. The result of a comparison is the function:

$$g_k : X \times X \to D, \quad D = \{0, 1\}, \tag{4}$$

where $g_k(x_i, x_j)$ is an evaluation of homogeneity of the pair $(x_i, x_j)$ in the $k$-th comparison, with a random error.

It is assumed that the probability of correctness of each comparison satisfies the conditions:

$$P\big(g_k(x_i, x_j) = T_f(x_i, x_j)\big) \ge 1 - \delta, \quad \delta \in (0, \tfrac{1}{2}), \tag{5a}$$

$$P\big((g_k(x_i, x_j) = T_f(x_i, x_j)) \cap (g_l(x_r, x_s) = T_f(x_r, x_s))\big) = P\big(g_k(x_i, x_j) = T_f(x_i, x_j)\big)\, P\big(g_l(x_r, x_s) = T_f(x_r, x_s)\big) \quad (k \ne l), \tag{5b}$$

where $f$ equals 1 or 2, according to the actual relation type in the set $X$.
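The comparison model of (3a), (3b), (5a) and (5b) can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the function names `relation_function` and `simulate_comparisons` are hypothetical, every comparison is wrong with probability exactly `delta` (the boundary case of (5a)), and errors are drawn independently for all pairs, which is stronger than (5b) requires.

```python
import random

def relation_function(subsets, m):
    """Build T(x_i, x_j) as in (3a)/(3b): 0 if some subset contains both
    elements, 1 otherwise; T(x_i, x_i) = 0. Elements are indexed 0..m-1."""
    T = [[1] * m for _ in range(m)]
    for i in range(m):
        T[i][i] = 0
    for subset in subsets:
        for i in subset:
            for j in subset:
                T[i][j] = 0
    return T

def simulate_comparisons(T, N, delta, rng):
    """Draw N comparisons g_1, ..., g_N for every pair.
    Assumption: each comparison differs from T with probability exactly delta,
    independently of all other comparisons."""
    m = len(T)
    comparisons = []
    for _ in range(N):
        g_k = [[0] * m for _ in range(m)]
        for i in range(m):
            for j in range(i + 1, m):
                wrong = rng.random() < delta
                value = 1 - T[i][j] if wrong else T[i][j]
                g_k[i][j] = g_k[j][i] = value
        comparisons.append(g_k)
    return comparisons

# Equivalence relation: disjoint subsets; tolerance relation: overlapping subsets.
T_equivalence = relation_function([{0, 1}, {2, 3}], 4)
T_tolerance = relation_function([{0, 1, 2}, {2, 3}], 4)
sample = simulate_comparisons(T_tolerance, N=5, delta=0.2, rng=random.Random(1))
```

The same helper serves both relation types, because in (3b) the case q = s is not excluded, so a pair is homogeneous whenever some subset contains both elements.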

POLISH ASSOCIATION FOR KNOWLEDGE MANAGEMENT, Series: Studies & Proceedings No. 31, 2010.

The conditions (5a), (5b) mean that the probability of a correct comparison is greater than the probability of an incorrect one, and that the random variables $g_k(x_i, x_j)$, $g_l(x_r, x_s)$ are independent for $k \ne l$.

Let us notice that any comparison $g_k(x_i, x_j)$ satisfying the conditions (5a), (5b) may or may not be equal to $T_f(x_i, x_j)$ ($f$ = 1 or 2), as a result of a random error. In particular, the comparisons obtained for the equivalence relation may be not transitive (e.g. $g_k(x_i, x_j) = 0$, $g_k(x_j, x_r) = 0$ and $g_k(x_i, x_r) = 1$), while comparisons for the tolerance relation may be transitive. Therefore, the type of the actual relation is not (directly) indicated by the results of comparisons.

The methods of estimation of both relations have been proposed in [2], [5] (the first paper presents the case $N = 1$). In the case of multiple comparisons ($N > 1$) the realizations $g_k(x_i, x_j)$ ($k = 1, \dots, N$) are usually aggregated in some way, e.g. with the use of the mean or the median. The second case is examined in the paper; the median $g_N^{(me)}(x_i, x_j)$ is the middle value in the sequence $g_{(1)}(x_i, x_j), \dots, g_{(N)}(x_i, x_j)$ ($N$ odd), created from the comparisons $g_1(x_i, x_j), \dots, g_N(x_i, x_j)$ ordered in a non-decreasing manner.

The estimated form of the equivalence relation is obtained (assuming a known type of the relation) on the basis of the discrete optimization task:

$$\min_{F_X^{(1)}} \Big\{ \sum_{\langle i,j \rangle \in I_1(\chi_1^{(1)}, \dots, \chi_v^{(1)})} g_N^{(me)}(x_i, x_j) + \sum_{\langle i,j \rangle \in J_1(\chi_1^{(1)}, \dots, \chi_v^{(1)})} \big(1 - g_N^{(me)}(x_i, x_j)\big) \Big\}, \tag{6}$$

where:

$F_X^{(1)}$ - the feasible set (the family of all equivalence relations in the set $X$);

$\chi_1^{(1)}, \dots, \chi_v^{(1)}$ - an element of the feasible set (any form of the equivalence relation in the set $X$);

$I_1(\chi_1^{(1)}, \dots, \chi_v^{(1)})$ - the set of all indices $\langle i, j \rangle$ satisfying the conditions: $i, j \in \{1, \dots, m\}$; $j > i$; $\langle i, j \rangle \in I_1(\chi_1^{(1)}, \dots, \chi_v^{(1)}) \Leftrightarrow \exists q$ such that $(x_i, x_j) \in \chi_q^{(1)}$;

$g_N^{(me)}(x_i, x_j)$ - the median of the comparisons $g_1(x_i, x_j), \dots, g_N(x_i, x_j)$;

$J_1(\chi_1^{(1)}, \dots, \chi_v^{(1)})$ - the set of all indices $\langle i, j \rangle$ satisfying the conditions: $i, j \in \{1, \dots, m\}$; $j > i$; $\langle i, j \rangle \in J_1(\chi_1^{(1)}, \dots, \chi_v^{(1)}) \Leftrightarrow$ there does not exist $q$ such that $(x_i, x_j) \in \chi_q^{(1)}$.

The optimal solution of the task with the criterion function (6), i.e. the estimated form of the equivalence relation, will be denoted $\hat\chi_1^{(1,me)}, \dots, \hat\chi_{\hat n_1}^{(1,me)}$. The solution is described by the function:

$$\hat t_1^{(me)}(x_i, x_j) = \begin{cases} 0 & \text{if there exists } \hat\chi_q^{(1,me)} \text{ in the relation } \hat\chi_1^{(1,me)}, \dots, \hat\chi_{\hat n_1}^{(1,me)} \text{ such that } (x_i, x_j) \in \hat\chi_q^{(1,me)},\ i \ne j; \\ 1 & \text{otherwise.} \end{cases} \tag{7}$$

The lowest possible value of the criterion function (6) equals zero; it corresponds to an estimate which is exactly consistent with the comparisons. The estimated form of the relation may be not unique, because the number of optimal solutions of the discrete problem (6) can exceed one. A unique estimate can be selected randomly or with the use of an additional criterion, e.g.

$$\min_{F_X^{(1)}} \Big\{ \sum_{\langle i,j \rangle \in I_1(\chi_1^{(1)}, \dots, \chi_v^{(1)})} g_N^{(me)}(x_i, x_j) \Big\}.$$

In the case of the tolerance relation the optimization task assumes the form:

$$\min_{F_X^{(2)}} \Big\{ \sum_{\langle i,j \rangle \in I_2(\chi_1^{(2)}, \dots, \chi_v^{(2)})} g_N^{(me)}(x_i, x_j) + \sum_{\langle i,j \rangle \in J_2(\chi_1^{(2)}, \dots, \chi_v^{(2)})} \big(1 - g_N^{(me)}(x_i, x_j)\big) \Big\}, \tag{8}$$

where:

$F_X^{(2)}$ - the feasible set (the family of all tolerance relations in the set $X$);

$\chi_1^{(2)}, \dots, \chi_v^{(2)}$ - an element of the feasible set (any form of the tolerance relation in the set $X$);

$I_2(\chi_1^{(2)}, \dots, \chi_v^{(2)})$ - the set of all indices $\langle i, j \rangle$ satisfying the conditions: $i, j \in \{1, \dots, m\}$; $j > i$; $\langle i, j \rangle \in I_2(\chi_1^{(2)}, \dots, \chi_v^{(2)}) \Leftrightarrow \exists q, s$ ($q = s$ not excluded) such that $(x_i, x_j) \in \chi_q^{(2)} \cap \chi_s^{(2)}$;

$J_2(\chi_1^{(2)}, \dots, \chi_v^{(2)})$ - the set of all indices $\langle i, j \rangle$ satisfying the conditions: $i, j \in \{1, \dots, m\}$; $j > i$; $\langle i, j \rangle \in J_2(\chi_1^{(2)}, \dots, \chi_v^{(2)}) \Leftrightarrow$ there does not exist $q$ such that $(x_i, x_j) \in \chi_q^{(2)}$.

The properties of the task (8) are similar to the properties of the task (6). The estimate obtained as the optimal solution of the task (8), corresponding to the tolerance relation, will be denoted $\hat\chi_1^{(2,me)}, \dots, \hat\chi_{\hat n_2}^{(2,me)}$. The solution is described by the function $\hat t_2^{(me)}(x_i, x_j)$, defined as follows:

$$\hat t_2^{(me)}(x_i, x_j) = \begin{cases} 0 & \text{if there exist subsets } \hat\chi_q^{(2,me)} \text{ and } \hat\chi_s^{(2,me)} \ (q = s \text{ not excluded}) \text{ in the relation } \hat\chi_1^{(2,me)}, \dots, \hat\chi_{\hat n_2}^{(2,me)} \\ & \text{such that } x_i, x_j \in \hat\chi_q^{(2,me)} \cap \hat\chi_s^{(2,me)},\ i \ne j; \\ 1 & \text{otherwise.} \end{cases} \tag{9}$$

3. Construction of tests

The tests proposed are based on the differences:

$$S_{ij,me} = \big|\hat t_1^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)\big| - \big|\hat t_2^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)\big| \quad (\langle i, j \rangle \in I_w^{(me)}), \tag{10}$$

where $I_w^{(me)}$ is the set of all pairs of indices $\langle i, j \rangle$ which satisfy the condition:

$$\hat t_1^{(me)}(x_i, x_j) \ne \hat t_2^{(me)}(x_i, x_j)$$

($\hat t_1^{(me)}(x_i, x_j)$ and $\hat t_2^{(me)}(x_i, x_j)$ are defined in (7) and (9), respectively). The condition which defines the set $I_w^{(me)}$ means that:

• in the estimated form of the tolerance relation the elements $x_i$ and $x_j$ are included in an intersection of two subsets $\hat\chi_q^{(2,me)} \cap \hat\chi_s^{(2,me)}$, while in the estimated equivalence relation they are included in different subsets, or

• in the estimated form of the tolerance relation the elements $x_i$ and $x_j$ are not included in any intersection of subsets (nor in the same subset), while in the estimated equivalence relation they are included in the same subset.

The test statistic suggested assumes the form:

$$S^{(me,N)} = \frac{1}{\#(I_w^{(me)})} \sum_{\langle i,j \rangle \in I_w^{(me)}} S_{ij,me}, \tag{11}$$

where $\#(I_w^{(me)})$ is the number of elements of the set $I_w^{(me)}$.

The properties of the statistic $S^{(me,N)}$ depend on the actual relation type in the set $X$. Let us consider first the case of the tolerance relation. For simplification it is assumed that the probability of an error in each comparison $g_k(x_i, x_j)$ ($k = 1, \dots, N$; $j \ne i$) equals $\delta$ (see (5a)). In the case when some probabilities are lower than $\delta$, the properties of the tests (the probabilities of errors in the tests) are not worse.

In the case of the tolerance relation existing in the set $X$, the estimated form of the relation is equivalent to the actual one (errorless result of estimation), i.e. $\hat\chi_1^{(2,me)}, \dots, \hat\chi_{\hat n_2}^{(2,me)} \equiv \chi_1^{*(2)}, \dots, \chi_{n_2}^{*(2)}$, with some probability $P\big(\hat\chi_1^{(2,me)}, \dots, \hat\chi_{\hat n_2}^{(2,me)} \equiv \chi_1^{*(2)}, \dots, \chi_{n_2}^{*(2)} \mid R^{(2)}\big)$, and then the equalities $\hat t_2^{(me)}(x_i, x_j) = T_2(x_i, x_j)$ ($\langle i, j \rangle \in I_w^{(me)}$) hold. Moreover, the expressions $\big|\hat t_2^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)\big|$ and $\big|\hat t_1^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)\big|$ ($\langle i, j \rangle \in I_w^{(me)}$) are zero-one random variables; their distributions can be determined on the basis of the properties of the comparisons $g_k(x_i, x_j)$. The probability function of each random variable $\big|\hat t_2^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)\big|$ ($\langle i, j \rangle \in I_w^{(me)}$) is determined as follows:

$$P\big(|\hat t_2^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)| = 0;\ \hat t_2^{(me)}(\bullet) = T_2(\bullet)\big) = P\big(g_N^{(me)}(x_i, x_j) = \hat t_2^{(me)}(x_i, x_j);\ \hat t_2^{(me)}(\bullet) = T_2(\bullet)\big) = 1 - \eta_N, \tag{12a}$$

$$P\big(|\hat t_2^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)| = 1;\ \hat t_2^{(me)}(\bullet) = T_2(\bullet)\big) = P\big(g_N^{(me)}(x_i, x_j) \ne \hat t_2^{(me)}(x_i, x_j);\ \hat t_2^{(me)}(\bullet) = T_2(\bullet)\big) = \eta_N, \tag{12b}$$

where the relationship $\hat t_2^{(me)}(\bullet) = T_2(\bullet)$ means equality for all pairs $(x_i, x_j) \in X \times X$, and $\eta_N$ is the probability of the form:

$$\eta_N = P\big(g_N^{(me)}(x_i, x_j) \ne T_2(x_i, x_j)\big) = \begin{cases} P\big(\sum_{k=1}^{N} g_k(x_i, x_j) > \tfrac{N}{2};\ T_2(x_i, x_j) = 0\big), \\ P\big(\sum_{k=1}^{N} g_k(x_i, x_j) < \tfrac{N}{2};\ T_2(x_i, x_j) = 1\big). \end{cases} \tag{13}$$

Under the assumption $\hat t_1^{(me)}(x_i, x_j) \ne \hat t_2^{(me)}(x_i, x_j)$, the probability function of the random variable $\big|\hat t_1^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)\big|$ assumes the form:

$$P\big(|\hat t_1^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)| = 0;\ \hat t_2^{(me)}(\bullet) = T_2(\bullet)\big) = \eta_N, \tag{14a}$$

$$P\big(|\hat t_1^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)| = 1;\ \hat t_2^{(me)}(\bullet) = T_2(\bullet)\big) = 1 - \eta_N. \tag{14b}$$

The probabilities (12a), (12b) and (14a), (14b) result from the fact that for $\langle i, j \rangle \in I_w^{(me)}$ the inequalities $\hat t_1^{(me)}(x_i, x_j) \ne \hat t_2^{(me)}(x_i, x_j)$ hold and the following implications are true:

$$g_N^{(me)}(x_i, x_j) = \hat t_1^{(me)}(x_i, x_j) \Rightarrow g_N^{(me)}(x_i, x_j) \ne \hat t_2^{(me)}(x_i, x_j),$$

$$g_N^{(me)}(x_i, x_j) \ne \hat t_1^{(me)}(x_i, x_j) \Rightarrow g_N^{(me)}(x_i, x_j) = \hat t_2^{(me)}(x_i, x_j).$$

The equalities (12a), (12b) and (14a), (14b) indicate:

$$P\big(S_{ij,me} = -1;\ \hat t_2^{(me)}(\bullet) = T_2(\bullet)\big) = \eta_N, \tag{15}$$

$$P\big(S_{ij,me} = 1;\ \hat t_2^{(me)}(\bullet) = T_2(\bullet)\big) = P\big[(|\hat t_1^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)| = 1) \cap (|\hat t_2^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)| = 0);\ \hat t_2^{(me)}(\bullet) = T_2(\bullet)\big] = 1 - \eta_N. \tag{16}$$

It follows from (15) and (16) that in the case when the tolerance relation exists in the set $X$, the expected value $E(S_{ij,me})$ and the variance $Var(S_{ij,me})$ of each random variable $S_{ij,me}$ ($\langle i, j \rangle \in I_w^{(me)}$) assume the form, respectively:

$$E\big(S_{ij,me};\ \hat t_2^{(me)}(\bullet) = T_2(\bullet)\big) = -\eta_N + 1 - \eta_N = 1 - 2\eta_N, \tag{17}$$

$$Var\big(S_{ij,me};\ \hat t_2^{(me)}(\bullet) = T_2(\bullet)\big) = \big(-1 - (1 - 2\eta_N)\big)^2 \eta_N + \big(1 - (1 - 2\eta_N)\big)^2 (1 - \eta_N) = 4\eta_N(1 - \eta_N). \tag{18}$$

The expected value and the variance of the variable $S^{(me,N)}$ fulfill the relationships:

$$E\big(S^{(me,N)};\ \hat t_2^{(me)}(\bullet) = T_2(\bullet)\big) = 1 - 2\eta_N, \tag{19}$$

$$Var\big(S^{(me,N)};\ \hat t_2^{(me)}(\bullet) = T_2(\bullet)\big) \le 4\Big(1 - \frac{2L(I_w^{(me)})}{(\#(I_w^{(me)}))^2}\Big)\eta_N(1 - \eta_N). \tag{20}$$

The variance $Var\big(S^{(me,N)};\ \hat t_2^{(me)}(\bullet) = T_2(\bullet)\big)$ is evaluated under the assumption that any random variables $S_{ij,me}$ and $S_{rs,me}$ which satisfy the conditions $i \ne r, s$ and $j \ne r, s$ are independent (i.e. their covariance equals zero), while the remaining variables may be dependent (correlated). The number of covariances which are equal to zero is denoted $L(I_w^{(me)})$; if the assumption does not hold, then $L(I_w^{(me)}) = 0$. The evaluation of the variance of the variable $S^{(me,N)}$ is based on the following facts: each variance of $S_{ij,me}$ is equal to $4\eta_N(1 - \eta_N)$ and each non-zero covariance $Cov(S_{ij,me}, S_{rs,me})$ is not greater than $4\eta_N(1 - \eta_N)$. Moreover, the number of variances $Var(S_{ij,me})$ ($\langle i, j \rangle \in I_w^{(me)}$) is equal to $\#(I_w^{(me)})$ and the number of non-zero covariances (in the set $I_w^{(me)}$) is equal to $\#(I_w^{(me)}) \times (\#(I_w^{(me)}) - 1) - 2L(I_w^{(me)})$. Summing these terms and dividing by $(\#(I_w^{(me)}))^2$ shows that $Var(S^{(me,N)})$ satisfies the inequality (20).

The right-hand side of the inequality (20) can significantly exceed the actual variance $Var(S^{(me,N)})$, because each covariance $Cov(S_{ij,me}, S_{rs,me})$ may be lower than the variance, in particular negative. A more precise evaluation of the variance (20) requires some additional knowledge about the covariance structure. Sometimes the values of covariances can be evaluated, e.g. when the comparisons $g_k(x_i, x_j)$ are obtained from a statistical test and the covariance of the test results is known. It is clear that the right-hand side of the inequality (20) converges to zero for $N \to \infty$, because (see (13)) $\lim_{N \to \infty} \eta_N = 0$; the speed of the convergence can be determined exactly or evaluated.

The properties (19) and (20) are valid in the case of an errorless estimation result of the tolerance relation, i.e. when $\hat\chi_1^{(2,me)}, \dots, \hat\chi_{\hat n_2}^{(2,me)} \equiv \chi_1^{*(2)}, \dots, \chi_{n_2}^{*(2)}$. If this is not true, then the properties mentioned do not hold. Therefore, the value of the variable $S^{(me,N)}$ obtained for any estimation result (errorless or not) can be treated as a realization of some mixture of random variables. However, the properties (the expected value and the evaluation of the variance) of only one random variable from the mixture, the one corresponding to the errorless estimation result, can be determined without difficulty.
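The quantities in (13), (17)-(20) are easy to evaluate numerically. The following sketch assumes, as in the simplification adopted above, that every comparison is wrong with probability exactly delta and that the N comparisons of a pair are independent; the median is then wrong when more than N/2 comparisons are wrong, which gives a binomial tail. The function names `eta` and `moments_given_errorless_estimate` are hypothetical.

```python
from math import comb

def eta(N, delta):
    """eta_N from (13), assuming each of the N independent comparisons
    is wrong with probability exactly delta (N must be odd)."""
    if N % 2 == 0:
        raise ValueError("N must be odd so that the median is well defined")
    return sum(comb(N, k) * delta**k * (1 - delta)**(N - k)
               for k in range((N + 1) // 2, N + 1))

def moments_given_errorless_estimate(N, delta, n_pairs, L=0):
    """Expected value (19) and variance bound (20) of S^(me,N) when the
    tolerance relation holds and its estimate is errorless.
    n_pairs is #(I_w); L is the number of zero covariances, L(I_w)."""
    e = eta(N, delta)
    mean = 1 - 2 * e
    variance_bound = 4 * (1 - 2 * L / n_pairs**2) * e * (1 - e)
    return mean, variance_bound

# eta_N decreases quickly with N for delta < 1/2.
for N in (1, 3, 5, 9):
    print(N, eta(N, 0.2))
```

With `L=0` the bound reduces to 4 η_N (1 − η_N), i.e. the variance of a single difference in (18); this is the most conservative choice when nothing is known about the covariance structure.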
Let us notice that the probability of an errorless estimation result, denoted $P\big(\hat\chi_1^{(2,me)}, \dots, \hat\chi_{\hat n_2}^{(2,me)} \equiv \chi_1^{*(2)}, \dots, \chi_{n_2}^{*(2)} \mid R^{(2)}\big)$, is also not easy to determine; it can be evaluated on the basis of a simulation approach.

In the case when the equivalence relation exists in the set $X$ and the result of estimation is errorless, the distribution of the random variable $S^{(me,N)}$ results from the following facts:

$$P\big(|\hat t_1^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)| = 0;\ \hat t_1^{(me)}(\bullet) = T_1(\bullet)\big) = P\big(g_N^{(me)}(x_i, x_j) = \hat t_1^{(me)}(x_i, x_j);\ \hat t_1^{(me)}(\bullet) = T_1(\bullet)\big) = 1 - \eta_N, \tag{21a}$$

$$P\big(|\hat t_1^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)| = 1;\ \hat t_1^{(me)}(\bullet) = T_1(\bullet)\big) = P\big(g_N^{(me)}(x_i, x_j) \ne \hat t_1^{(me)}(x_i, x_j);\ \hat t_1^{(me)}(\bullet) = T_1(\bullet)\big) = \eta_N, \tag{21b}$$

$$P\big(|\hat t_2^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)| = 0;\ \hat t_1^{(me)}(\bullet) = T_1(\bullet)\big) = P\big(g_N^{(me)}(x_i, x_j) = \hat t_2^{(me)}(x_i, x_j);\ \hat t_1^{(me)}(\bullet) = T_1(\bullet)\big) = \eta_N, \tag{22a}$$

$$P\big(|\hat t_2^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)| = 1;\ \hat t_1^{(me)}(\bullet) = T_1(\bullet)\big) = P\big(g_N^{(me)}(x_i, x_j) \ne \hat t_2^{(me)}(x_i, x_j);\ \hat t_1^{(me)}(\bullet) = T_1(\bullet)\big) = 1 - \eta_N \tag{22b}$$

(the expression $\hat t_1^{(me)}(\bullet) = T_1(\bullet)$ means equality for each $(x_i, x_j) \in X \times X$). From (21a)-(22b) it follows that:

$$P\big(S_{ij,me} = -1;\ \hat t_1^{(me)}(\bullet) = T_1(\bullet)\big) = P\big[(|\hat t_1^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)| = 0) \cap (|\hat t_2^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)| = 1);\ \hat t_1^{(me)}(\bullet) = T_1(\bullet)\big] = 1 - \eta_N, \tag{23a}$$

$$P\big(S_{ij,me} = 1;\ \hat t_1^{(me)}(\bullet) = T_1(\bullet)\big) = P\big[(|\hat t_1^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)| = 1) \cap (|\hat t_2^{(me)}(x_i, x_j) - g_N^{(me)}(x_i, x_j)| = 0);\ \hat t_1^{(me)}(\bullet) = T_1(\bullet)\big] = \eta_N. \tag{23b}$$

The formulas (23a) and (23b) are the basis for determining the expected value and the variance of each random variable $S_{ij,me}$:

$$E\big(S_{ij,me};\ \hat t_1^{(me)}(\bullet) = T_1(\bullet)\big) = -1 + \eta_N + \eta_N = 2\eta_N - 1, \tag{24}$$

$$Var\big(S_{ij,me};\ \hat t_1^{(me)}(\bullet) = T_1(\bullet)\big) = 4\eta_N(1 - \eta_N). \tag{25}$$

The relationships (24), (25) allow determining the expected value and the evaluation of the variance of the random variable $S^{(me,N)}$ when the equivalence relation exists in the set $X$. The expected value assumes the form:

$$E\big(S^{(me,N)};\ \hat t_1^{(me)}(\bullet) = T_1(\bullet)\big) = 2\eta_N - 1, \tag{26}$$

and the variance satisfies the condition:

$$Var\big(S^{(me,N)};\ \hat t_1^{(me)}(\bullet) = T_1(\bullet)\big) \le 4\Big(1 - \frac{2L(I_w^{(me)})}{(\#(I_w^{(me)}))^2}\Big)\eta_N(1 - \eta_N). \tag{27}$$

The properties (26) and (27) hold for the equivalence relation when an errorless estimation result occurs. However, with the probability $1 - P\big(\hat\chi_1^{(1,me)}, \dots, \hat\chi_{\hat n_1}^{(1,me)} \equiv \chi_1^{*(1)}, \dots, \chi_{n_1}^{*(1)} \mid R^{(1)}\big)$ the result of estimation differs from the errorless one. Therefore, the value of the random variable $S^{(me,N)}$ is a realization of a mixture of random variables, with properties similar to those in the case of the tolerance relation.

The test for verification of the tolerance relation in the set $X$ rests on the expected value (equal to $1 - 2\eta_N$) and on the evaluation of the variance of the random variable $S^{(me,N)}$ (the bound (20), which has the same form as (27)). The null and alternative hypotheses of the test can be formulated in the following way:

$$H_t: E(S^{(me,N)}) = 1 - 2\eta_N, \qquad H_e: E(S^{(me,N)}) = 2\eta_N - 1,$$

with the critical region:

$$\Lambda_2^{(me,N)} = \big\{ S^{(me,N)} \mid S^{(me,N)} < 1 - 2\eta_N - \lambda \sigma_S^{(me,N)} \big\}, \tag{28}$$

where:

$\sigma_S^{(me,N)}$ - the square root of the evaluation of the variance $Var(S^{(me,N)})$, i.e.:

$$\sigma_S^{(me,N)} = \big(Var(S^{(me,N)})\big)^{1/2} = \Big(4\Big(1 - \frac{2L(I_w^{(me)})}{(\#(I_w^{(me)}))^2}\Big)\eta_N(1 - \eta_N)\Big)^{1/2};$$

$\lambda$ - a positive constant obtained on the basis of the Chebyshev inequality $P(|X - E(X)| > \lambda \sigma_X) < \frac{1}{\lambda^2}$ for a random variable $X$ with known expected value $E(X)$ and standard deviation $\sigma_X$.
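The decision rule (28) can be sketched in a few lines. The code below is a minimal illustration, not the author's implementation: it assumes that the estimates of both relations and the matrix of medians are already available as 0/1 matrices (hypothetical arguments `t1`, `t2`, `g_med`), that every comparison error has probability exactly `delta`, and that `lam` is chosen by the user (for a Chebyshev bound of 1/λ² on the first type error one could take λ = 1/√α).

```python
from math import comb, sqrt

def eta(N, delta):
    """eta_N under the assumption of independent errors with probability delta."""
    return sum(comb(N, k) * delta**k * (1 - delta)**(N - k)
               for k in range((N + 1) // 2, N + 1))

def statistic(t1, t2, g_med):
    """S^(me,N) from (10)-(11): mean of |t1 - g| - |t2 - g| over the pairs
    <i, j>, j > i, on which the two estimates differ (the set I_w)."""
    m = len(g_med)
    diffs = [abs(t1[i][j] - g_med[i][j]) - abs(t2[i][j] - g_med[i][j])
             for i in range(m) for j in range(i + 1, m)
             if t1[i][j] != t2[i][j]]
    if not diffs:
        raise ValueError("the two estimates coincide; I_w is empty")
    return sum(diffs) / len(diffs), len(diffs)

def reject_tolerance(S, n_pairs, N, delta, lam, L=0):
    """True if S falls into the critical region (28), i.e. H_t is rejected."""
    e = eta(N, delta)
    sigma = sqrt(4 * (1 - 2 * L / n_pairs**2) * e * (1 - e))
    return S < 1 - 2 * e - lam * sigma

# Hypothetical estimates for m = 3 elements: the pair (x_2, x_3) is separated
# by the equivalence estimate t1 and joined by the tolerance estimate t2.
t1 = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]
t2 = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
g_med = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]   # medians agree with t2
S, n_pairs = statistic(t1, t2, g_med)
decision = reject_tolerance(S, n_pairs, N=3, delta=0.2, lam=2.0)
```

Values of S close to 1 support the tolerance relation and values close to -1 support the equivalence relation, in line with the expected values 1 − 2η_N and 2η_N − 1.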

The form of the test for the equivalence relation is "symmetric":

$$H_e: E(S^{(me,N)}) = 2\eta_N - 1, \qquad H_t: E(S^{(me,N)}) = 1 - 2\eta_N,$$

with the critical region:

$$\Lambda_1^{(me,N)} = \big\{ S^{(me,N)} \mid S^{(me,N)} > 2\eta_N - 1 + \lambda \sigma_S^{(me,N)} \big\} \tag{29}$$

($\sigma_S^{(me,N)}$ is the same as in the formula (28)).

In the case of the test for the tolerance relation, the second type error, with probability $\beta_2^{(me,N)}$, occurs if $H_t$ is accepted (i.e. $S^{(me,N)} \ge 1 - 2\eta_N - \lambda \sigma_S^{(me,N)}$) while the equivalence relation is true (i.e. $E(S^{(me,N)} \mid H_e) = 2\eta_N - 1$). The probability of the event can be evaluated in the following way:

$$\begin{aligned} \beta_2^{(me,N)} &= P\big(S^{(me,N)} \ge 1 - 2\eta_N - \lambda \sigma_S^{(me,N)} \mid H_e\big) \\ &= P\big(S^{(me,N)} - (2\eta_N - 1) \ge 1 - 2\eta_N - (2\eta_N - 1) - \lambda \sigma_S^{(me,N)} \mid H_e\big) \\ &= P\big(S^{(me,N)} - (2\eta_N - 1) \ge 2(1 - 2\eta_N) - \lambda \sigma_S^{(me,N)} \mid H_e\big) \\ &= P\big(S^{(me,N)} - (2\eta_N - 1) \ge \lambda_{1,0}^{(me)} \sigma_S^{(me,N)} \mid H_e\big) \\ &\le P\big(|S^{(me,N)} - (2\eta_N - 1)| \ge \lambda_{1,0}^{(me)} \sigma_S^{(me,N)} \mid H_e\big) \le \frac{1}{(\lambda_{1,0}^{(me)})^2}, \end{aligned} \tag{30}$$

where the value of $\lambda_{1,0}^{(me)}$ is determined in the following way (the expression $\lambda_{1,0}^{(me)} \sigma_S^{(me,N)}$ is positive under the assumptions made):

$$2(1 - 2\eta_N) - \lambda \sigma_S^{(me,N)} = \lambda_{1,0}^{(me)} \sigma_S^{(me,N)} \ \Rightarrow\ \lambda_{1,0}^{(me)} = \frac{2(1 - 2\eta_N) - \lambda \sigma_S^{(me,N)}}{\sigma_S^{(me,N)}}. \tag{31}$$

The probability of the second type error $\beta_1^{(me,N)}$ in the case of the test for the equivalence relation ($E(S^{(me,N)} \mid H_t) = 1 - 2\eta_N$) is obtained in a similar way:

$$\begin{aligned} \beta_1^{(me,N)} &= P\big(S^{(me,N)} \le 2\eta_N - 1 + \lambda \sigma_S^{(me,N)} \mid H_t\big) \\ &= P\big(S^{(me,N)} - (1 - 2\eta_N) \le 2\eta_N - 1 - (1 - 2\eta_N) + \lambda \sigma_S^{(me,N)} \mid H_t\big) \\ &= P\big(S^{(me,N)} - (1 - 2\eta_N) \le 2(2\eta_N - 1) + \lambda \sigma_S^{(me,N)} \mid H_t\big) \\ &= P\big(S^{(me,N)} - (1 - 2\eta_N) \le \lambda_{2,0}^{(me)} \sigma_S^{(me,N)} \mid H_t\big) \\ &\le P\big(|S^{(me,N)} - (1 - 2\eta_N)| \ge |\lambda_{2,0}^{(me)}| \sigma_S^{(me,N)} \mid H_t\big) \le \frac{1}{(\lambda_{2,0}^{(me)})^2}, \end{aligned} \tag{32}$$

where

$$\lambda_{2,0}^{(me)} = \frac{2(2\eta_N - 1) + \lambda \sigma_S^{(me,N)}}{\sigma_S^{(me,N)}} = \frac{\lambda \sigma_S^{(me,N)} - 2(1 - 2\eta_N)}{\sigma_S^{(me,N)}}. \tag{33}$$

The probabilities of the first and second type errors in the tests are valid for errorless estimation results. Therefore, they have to be corrected with the factor $P\big(\hat\chi_1^{(f,me)}, \dots, \hat\chi_{\hat n_f}^{(f,me)} \equiv \chi_1^{*(f)}, \dots, \chi_{n_f}^{*(f)} \mid R^{(f)}\big)$ ($f = 1, 2$). In the case of the equivalence relation the corrected significance level assumes the form:

$$1 - (1 - \alpha_1^{(me,N)})\, P\big(\hat\chi_1^{(1,me)}, \dots, \hat\chi_{\hat n_1}^{(1,me)} \equiv \chi_1^{*(1)}, \dots, \chi_{n_1}^{*(1)} \mid R^{(1)}\big), \tag{34}$$

where:

$$\alpha_1^{(me,N)} = P\big(S^{(me,N)} > 2\eta_N - 1 + \lambda \sigma_S^{(me,N)} \mid H_e\big);$$

the corrected probability of the second type error for the equivalence relation assumes the form:

$$1 - (1 - \beta_1^{(me,N)})\, P\big(\hat\chi_1^{(2,me)}, \dots, \hat\chi_{\hat n_2}^{(2,me)} \equiv \chi_1^{*(2)}, \dots, \chi_{n_2}^{*(2)} \mid R^{(2)}\big). \tag{35}$$

The corrected probabilities of errors for the tolerance relation assume the same form as in the case of the equivalence relation. The probabilities of an errorless estimation result appearing in (34), (35) converge to one for $N \to \infty$, because $\lim_{N \to \infty} P\big(g_N^{(me)}(x_i, x_j) = T_f(x_i, x_j)\big) = 1$ ($f$ = 1 or 2), and therefore the tests (28), (29) are consistent.

4. Conclusions

The tests for distinguishing the equivalence and tolerance relations in a finite set, presented in the paper, provide a formal basis for decisions about the actual model of the data. They are based on weak assumptions about the distribution functions of comparison errors and exploit known probabilistic inequalities (evaluations). The probabilities of errors in the tests are determined in an approximate way: they include terms which require a simulation approach. The approximation is the "cost" of unrestrictive assumptions about comparison errors. An important feature of the tests is their consistency as the number of comparisons $N \to \infty$.

5. Literature

[1] David H. A. (1988) The Method of Paired Comparisons, 2nd ed. Ch. Griffin, London.
[2] Klukowski L. (1990) Algorithm for classification of samples in the case of unknown number of variables generating them (in Polish). Przegląd Statystyczny, XXXVII, 167.
[3] Klukowski L. (2006) Tests for relation type - equivalence or tolerance - in finite set of elements. Control and Cybernetics, 35, pp. 369–384.
[4] Klukowski L. (2007a) Completion and clarification to the paper "Tests for relation type - equivalence or tolerance - in finite set of elements". Control and Cybernetics, 36, pp. 467–468.
[5] Klukowski L. (2007b) Estimation of tolerance relation on the basis of multiple pairwise comparisons with random errors. Control and Cybernetics, 36, pp. 443–466.
[6] Slater P. (1961) Inconsistencies in a schedule of paired comparisons. Biometrika, 48, pp. 303–312.

The work has been supported by the grant No N N111434937 of the Polish Ministry of Science and Higher Education.

TESTS FOR RELATION TYPE - EQUIVALENCE OR TOLERANCE - BASED ON MEDIANS FROM MULTIPLE PAIRWISE COMPARISONS

Summary (translated from the Polish)

The paper derives tests, based on medians from multiple pairwise comparisons, which lead to an assessment of the type of relation between the compared objects: equivalence or tolerance. Random errors are assumed to occur in the results of comparisons, while the assumptions about the distributions of these errors are relatively weak.

Keywords: equivalence relation, tolerance relation, pairwise comparisons, statistical tests, median.

Leszek Klukowski
Instytut Badań Systemowych PAN
Newelska 6, 01-447 Warszawa
e-mail: Leszek.Klukowski@ibspan.waw.pl
