Borowczyk Henryk: The combinatorial diagnostic entropy and symptoms’ informativity. Kombinatoryczna entropia diagnostyczna i wartość informacyjna symptomów.


THE COMBINATORIAL DIAGNOSTIC ENTROPY

AND SYMPTOMS’ INFORMATIVITY

KOMBINATORYCZNA ENTROPIA DIAGNOSTYCZNA

I WARTOŚĆ INFORMACYJNA SYMPTOMÓW

Henryk Borowczyk
Air Force Institute of Technology (Instytut Techniczny Wojsk Lotniczych)
ul. Księcia Bolesława 6, 01-494 Warszawa

e-mail: borowczyk@post.pl

Abstract. This paper presents combinatorial measures of the system condition uncertainty (diagnostic entropy) and of the informativity of diagnostic symptoms. A multi-valued diagnostic model is assumed. The proposed measures can be used, in a manner similar to the Shannon entropy, for the analysis of the diagnostic model and for planning the diagnostic algorithm.

Keywords: fault diagnosis, qualitative modelling, inference, information, uncertainty

Summary: The paper presents a combinatorial measure of the uncertainty of the state of a diagnosed object (the diagnostic entropy) and a combinatorial measure of the information value of symptoms. The considerations assume that the object is described by a multi-valued diagnostic model. The proposed measures can be used, in a way analogous to the entropy and information introduced by Shannon, to analyse the diagnostic model and to determine the diagnostic algorithm.

Keywords: fault diagnosis, qualitative modelling,


1. Introduction

Determining the optimal set of diagnostic symptoms is one of the most important problems in technical diagnostics. The optimization method applied depends on the form of the diagnostic model. Recently, increasing attention has been paid to qualitative (approximate, multi-valued) models [2, 5, 6, 7]. One way of determining the diagnostic algorithm is information-based analysis [2, 8], i.e. a description of the system condition uncertainty and of the amount of information delivered by individual symptoms and by sets of symptoms. This can be done with the quantities introduced by Shannon: the entropy and the amount of information [9]. Other kinds of entropy can also be considered: Rényi's entropy [3], the structural α-entropy [4], and the entropy functions studied in [1].

In this paper the combinatorial diagnostic entropy, denoted $H_{Bc}(\cdot)$, is introduced. The set of desirable properties of the proposed measure is determined from the point of view of diagnostics.

2. Assumptions

1. A finite set of faults is given:

$$E = \{e_i\},\quad i = 1,\dots,n$$

2. The probabilities $P(e_i)$ of the faults $e_i \in E$ are non-zero:

$$\forall_{i \in \{1,\dots,n\}}\; P(e_i) > 0, \qquad P(E) = 1$$

3. A finite set of symptoms is given,

$$D = \{d_r\},\quad r = 1,\dots,t$$

and a finite set of values taken by the symptoms,

$$A = \{0, \dots, \kappa-1\}$$

The function $R(\cdot)$ mapping $D$ into $A$ is $\kappa$-valued:

$$R(d_r/e_i) = \alpha_{ir},\quad \alpha_{ir} \in A$$

4. For all the symptoms the following holds:

$$\forall_{d_r \in D}\;\forall_{e_i \in E}\;\exists_{\alpha_{ir} \in A}\; P[R(d_r/e_i) = \alpha_{ir}] = 1$$

5. The multi-valued diagnostic model is presented in the form of a diagnostic matrix $G$:

$$G = [g_{ir}]_{n \times t},\quad \text{where } g_{ir} = R(d_r/e_i) = \alpha_{ir},\ \alpha_{ir} \in A$$

The above assumptions establish a multi-valued diagnostic model of a wide class of technical objects.
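As an illustration only, the assumptions above can be encoded directly in a few lines of Python. The sketch stores a diagnostic matrix as a list of rows (one row per fault, one column per symptom) and checks that every entry lies in the value set A. The matrix `G_EXAMPLE`, the function name `validate_diagnostic_matrix` and the parameter name `kappa` (the number of symptom values) are hypothetical and do not come from the paper.

```python
from typing import List

# Hypothetical example: n = 4 faults (rows), t = 3 symptoms (columns),
# symptom values taken from A = {0, 1, 2}.
G_EXAMPLE: List[List[int]] = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 2, 1],
    [2, 2, 1],
]


def validate_diagnostic_matrix(G: List[List[int]], kappa: int) -> bool:
    """Return True if G is a non-empty rectangular matrix with entries in {0, ..., kappa-1}."""
    if not G or not G[0]:
        return False
    t = len(G[0])
    return all(
        len(row) == t and all(isinstance(g, int) and 0 <= g < kappa for g in row)
        for row in G
    )


print(validate_diagnostic_matrix(G_EXAMPLE, kappa=3))
```

Each entry is a single fixed value, which mirrors assumption 4: the value of a symptom for a given fault is deterministic.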

3. Postulated properties of the combinatorial diagnostic entropy

A set of postulated properties of $H_{Bc}(E)$ can be derived from its conceptual similarity to the Shannon entropy. It is therefore reasonable to postulate that $H_{Bc}(E)$ be a function of $n = \operatorname{card}(E)$:

$$H_{Bc}(E) = f(n)$$

It should also increase monotonically as $n$ grows:

$$f(n) > f(n') \iff n > n'$$

Two further properties result from the conditions under which the uncertainty vanishes. If it is known a priori that the set of faults has only one element ($n = 1$), the system condition is fully determined and $H_{Bc}(E)$ should be equal to zero:

$$H_{Bc}(E)\big|_{n=1} = 0$$

The other extreme case occurs when the selected symptoms generate a partition of the fault set into one-element subsets $\{\{e_i\}\},\ i = 1,\dots,n$. This means that all pairs of faults have been distinguished by the selected symptoms; hence, the uncertainty equals zero:

$$H_{Bc}(E/\{\{e_i\}\}) = 0$$

4. The combinatorial diagnostic entropy

The form of the function $f(n)$ can be determined in two ways: a) by formal deduction from the set of postulated properties, b) by arbitrarily adopting a certain form of the function and proving that it has the postulated properties.

Further considerations are based on the following theorem, proved in [2]:

Theorem 1

If a finite set of faults $E = \{e_i\},\ i = 1,\dots,n$, is given, then the function

$$H_{Bc}(E) = \binom{n}{2} = 0.5\,n(n-1)$$

which gives the number of all unordered pairs of faults, has all the postulated properties stated above.
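The function of Theorem 1 is easy to check numerically. The following Python sketch (the name `h_bc` is an illustrative choice, not the author's) evaluates it and spot-checks the postulated properties for small n; this is a sanity check, not a substitute for the proof in [2].

```python
from math import comb


def h_bc(n: int) -> int:
    """Combinatorial diagnostic entropy of a fault set with n elements."""
    if n < 1:
        raise ValueError("the fault set must contain at least one fault")
    return n * (n - 1) // 2  # n(n-1) is always even, so integer division is exact


# Numerical spot-check of the postulated properties for small n.
assert h_bc(1) == 0                                        # zero for a one-element set
assert all(h_bc(n + 1) > h_bc(n) for n in range(1, 100))   # strictly increasing in n
assert all(h_bc(n) == comb(n, 2) for n in range(1, 100))   # number of unordered pairs
```

Unlike the Shannon entropy, this measure depends only on the number of faults and not on their probabilities.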

5. The combinatorial informativity of diagnostic symptoms

The initial system condition uncertainty (prior to the selection of any symptom) equals:

$$H_{Bc}(E) = 0.5\,n(n-1)$$

If a symptom $d_r \in D$ is selected as the first one in the sequence, it generates a partition of the fault set of the form

$$\{E_j(d_r)\} = \{E_0(d_r), \dots, E_{\kappa-1}(d_r)\}$$

which satisfies the relationships

$$\forall_{j,l \in \{0,\dots,\kappa-1\},\ j \neq l}\; E_j(d_r) \cap E_l(d_r) = \emptyset, \qquad \bigcup_{j=0}^{\kappa-1} E_j(d_r) = E, \qquad \sum_{j=0}^{\kappa-1} n_j = n$$

where $n_j = \operatorname{card} E_j(d_r)$ and $\kappa = \operatorname{card} A$ is the number of symptom values.

After the selection of a symptom $d_r$ that generates the partition $\{E_j\}$, the condition uncertainty equals:

$$H_{Bc}(E/\{E_j\}) = 0.5\sum_{j=0}^{\kappa-1} n_j(n_j-1)$$

Since the partition is uniquely defined by the symptom $d_r$ that generates it, this formula can be written as:

$$H_{Bc}(E/d_r) = 0.5\sum_{j=0}^{\kappa-1} n_j(n_j-1)$$
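The conditional uncertainty can be computed straight from a column of the diagnostic matrix. The sketch below is a minimal Python illustration under the assumption that the matrix is stored as a list of rows (one per fault); `G_EXAMPLE`, `block_sizes` and `h_bc_given_symptom` are hypothetical names introduced here.

```python
from collections import Counter
from typing import List

# Hypothetical diagnostic matrix: 4 faults (rows), 3 symptoms (columns).
G_EXAMPLE: List[List[int]] = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 2, 1],
    [2, 2, 1],
]


def block_sizes(G: List[List[int]], r: int) -> List[int]:
    """Sizes n_j of the non-empty subsets E_j(d_r) generated by symptom column r."""
    return list(Counter(row[r] for row in G).values())


def h_bc_given_symptom(G: List[List[int]], r: int) -> int:
    """Uncertainty remaining after symptom r: the sum of n_j(n_j-1)/2 over the subsets."""
    return sum(n_j * (n_j - 1) // 2 for n_j in block_sizes(G, r))


for r in range(3):
    print(r, sorted(block_sizes(G_EXAMPLE, r)), h_bc_given_symptom(G_EXAMPLE, r))
```

Empty subsets contribute nothing to the sum, so iterating only over the values that actually occur in the column gives the same result as summing over all of A.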





It is easy to see that the condition uncertainty after the selection of the symptom $d_r$ is not greater than the initial uncertainty, i.e.:

$$H_{Bc}(E) \geq H_{Bc}(E/d_r)$$

Equality holds in the following case:

$$H_{Bc}(E) = H_{Bc}(E/d_r) \iff \exists_{j \in A}\; n_j = n$$

This means that the value of the symptom $d_r$ does not depend on the system condition, and such a symptom should be removed from the set $D$. On the basis of these two relationships, the notion of the combinatorial informativity of a symptom can be defined.

Definition 1

The combinatorial informativity of the symptom $d_r$ is equal to the difference between the condition uncertainty before this symptom has been selected and the uncertainty remaining after its selection.

If the symptom $d_r$ is selected as the first one in the sequence, then, according to Definition 1:

$$J_{Bc}(d_r) = H_{Bc}(E) - H_{Bc}(E/d_r)$$

where $J_{Bc}(d_r)$ is the combinatorial informativity of the symptom $d_r \in D$. Substituting the expressions for $H_{Bc}(E)$ and $H_{Bc}(E/d_r)$ and using $\sum_j n_j = n$, the informativity can be written in the form:

$$J_{Bc}(d_r) = 0.5\sum_{j=0}^{\kappa-1} n_j(n-n_j)$$
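The substitution step can be spelled out as follows. This is a worked derivation added for clarity; it uses only the definitions of $H_{Bc}(E)$ and $H_{Bc}(E/d_r)$ and the relationship $\sum_j n_j = n$, with $\kappa$ denoting the number of symptom values.

```latex
\begin{aligned}
J_{Bc}(d_r) &= 0.5\,n(n-1) - 0.5\sum_{j=0}^{\kappa-1} n_j(n_j-1) \\
            &= 0.5\Big(n^2 - n - \sum_{j=0}^{\kappa-1} n_j^2 + \sum_{j=0}^{\kappa-1} n_j\Big) \\
            &= 0.5\Big(n^2 - \sum_{j=0}^{\kappa-1} n_j^2\Big)
               \qquad \text{since } \sum_{j=0}^{\kappa-1} n_j = n \\
            &= 0.5\Big(n\sum_{j=0}^{\kappa-1} n_j - \sum_{j=0}^{\kappa-1} n_j^2\Big)
             = 0.5\sum_{j=0}^{\kappa-1} n_j(n-n_j)
\end{aligned}
```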





Since $n_j \leq n$ for every $j$, the informativity takes only non-negative values, $J_{Bc}(d_r) \geq 0$. The formula for $J_{Bc}(d_r)$ and the earlier considerations lead to the conclusion that the informativity $J_{Bc}(d_r)$ equals the number of all unordered pairs of faults distinguished by the symptom $d_r$:

$$J_{Bc}(d_r) = \sum_{j=0}^{\kappa-2}\ \sum_{k=j+1}^{\kappa-1} n_j n_k$$

This confirms the coherence of the introduced measures of the system condition uncertainty and of the symptoms' informativity.
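The pair-counting interpretation can be verified by brute force on a small example. The Python sketch below computes the informativity once from the subset sizes and once by enumerating all unordered pairs of faults; the matrix `G_EXAMPLE` and both function names are hypothetical and serve only as an illustration.

```python
from collections import Counter
from itertools import combinations
from typing import List

# Hypothetical diagnostic matrix: 4 faults (rows), 3 symptoms (columns).
G_EXAMPLE: List[List[int]] = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 2, 1],
    [2, 2, 1],
]


def informativity(G: List[List[int]], r: int) -> int:
    """J_Bc(d_r) = 0.5 * sum_j n_j (n - n_j), computed from the subset sizes."""
    n = len(G)
    sizes = Counter(row[r] for row in G).values()
    return sum(n_j * (n - n_j) for n_j in sizes) // 2  # the sum is always even


def distinguished_pairs(G: List[List[int]], r: int) -> int:
    """Number of unordered fault pairs for which symptom r takes different values."""
    return sum(1 for a, b in combinations(G, 2) if a[r] != b[r])


for r in range(3):
    assert informativity(G_EXAMPLE, r) == distinguished_pairs(G_EXAMPLE, r)
    print(r, informativity(G_EXAMPLE, r))
```

The third column of the example is constant, so it distinguishes no pair of faults and its informativity is zero, which is the equality case discussed earlier.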

If the symptom ds has been selected as the second one in the sequence,D then in each of the subsets E dj( )r of the set partition it generates the following set partition:

0 1

0,..., 1{ j ( , ),...,r s j ( , )}r s

(6)

where:

( , ) { : ( / ) ( / ) , 1,..., ,}

jl jl jl

jl r s i r i s i jl jl

E d d  e R d e  j R d e l i  n The following relationships are satisfied:

0,..., 1 , 0,..., 1 1 1 0,..., 1 0,..., 1 ₀ 0 ) ( , ) ( , ) ) ( , ) ( ) ) jl r s jk r s j l k l k jl r s j r jl j j j _l l a E d d E d d b E d d E d c n n                _   _      

_

 





The uncertainty after both symptoms $d_r, d_s \in D$ have been selected can be written as:

$$H_{Bc}(E/d_r,d_s) = 0.5\sum_{j=0}^{\kappa-1}\sum_{l=0}^{\kappa-1} n_{jl}(n_{jl}-1)$$

The conditional informativity of the symptom $d_s$ follows from the general Definition 1:

$$J_{Bc}(d_s/d_r) = H_{Bc}(E/d_r) - H_{Bc}(E/d_r,d_s)$$

Substituting the expressions for both uncertainties gives:

$$J_{Bc}(d_s/d_r) = 0.5\sum_{j=0}^{\kappa-1}\sum_{l=0}^{\kappa-1} n_{jl}(n_j-n_{jl})$$
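The two expressions for the conditional informativity can be compared numerically. The Python sketch below is an illustration under the assumption that the diagnostic matrix is a list of rows; `G_EXAMPLE` and all function names are hypothetical. It evaluates the conditional informativity both as a difference of uncertainties and from the double-sum formula.

```python
from collections import Counter
from typing import List, Sequence

# Hypothetical diagnostic matrix: 4 faults (rows), 3 symptoms (columns).
G_EXAMPLE: List[List[int]] = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 2, 1],
    [2, 2, 1],
]


def h_bc_given(G: List[List[int]], symptoms: Sequence[int]) -> int:
    """Uncertainty left by the partition generated jointly by the listed symptom columns."""
    blocks = Counter(tuple(row[r] for r in symptoms) for row in G)
    return sum(m * (m - 1) // 2 for m in blocks.values())


def conditional_informativity(G: List[List[int]], s: int, r: int) -> int:
    """J_Bc(d_s/d_r) as a difference of uncertainties (Definition 1)."""
    return h_bc_given(G, [r]) - h_bc_given(G, [r, s])


def conditional_informativity_closed_form(G: List[List[int]], s: int, r: int) -> int:
    """J_Bc(d_s/d_r) = 0.5 * sum_j sum_l n_jl (n_j - n_jl)."""
    n_j = Counter(row[r] for row in G)
    n_jl = Counter((row[r], row[s]) for row in G)
    return sum(m * (n_j[j] - m) for (j, _l), m in n_jl.items()) // 2


for r, s in [(0, 1), (0, 2), (1, 0)]:
    assert conditional_informativity(G_EXAMPLE, s, r) == \
        conditional_informativity_closed_form(G_EXAMPLE, s, r)
    print(r, s, conditional_informativity(G_EXAMPLE, s, r))
```

In the example, the first symptom leaves one pair of faults undistinguished; the second symptom separates that pair, so its conditional informativity is 1 although its unconditional informativity is 5.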





These considerations can be generalised to the informativity of a symptom $d_s \in D$ under the condition that a set $D_k \subset D$ of $k$ symptoms has been selected earlier,

$$D_k = \{d_{(1)}, d_{(2)}, \dots, d_{(k)}\}$$

and the partition of the fault set has the form

$$\{E_j(D_k)\} = \{E_0(D_k), E_1(D_k), \dots, E_{m_k-1}(D_k)\}$$

where $m_k$ is the cardinality of the family of subsets.

Using the general Definition 1, the conditional informativity of the symptom $d_s$ can be written as:

$$J_{Bc}(d_s/D_k) = H_{Bc}(E/D_k) - H_{Bc}(E/D_k,d_s)$$

and finally:

$$J_{Bc}(d_s/D_k) = 0.5\sum_{j=0}^{m_k-1}\sum_{l=0}^{\kappa-1} n_{jl}(n_j-n_{jl})$$

It is easily seen that these formulas generalise those for the two-symptom case: they become identical when $k = 1$ and $m_k = \kappa$.

Another question is the informativity of a set of $k$ symptoms $D_k \subset D$:

$$J_{Bc}(D_k) = H_{Bc}(E) - H_{Bc}(E/D_k)$$

After simple transformations the following is arrived at:

$$J_{Bc}(D_k) = 0.5\sum_{j=0}^{m_k-1} n_j(n-n_j)$$

A comparison with the formula for $J_{Bc}(d_r)$ shows that both formulas take an identical form if $D_k$ is a one-element set.

The total informativity of the symptom set $D_k$ and a symptom $d_s \notin D_k$ can be presented in the following form:

$$J_{Bc}(D_k,d_s) = H_{Bc}(E) - H_{Bc}(E/D_k,d_s)$$

After transformations, the following is arrived at:

$$J_{Bc}(D_k,d_s) = 0.5\sum_{j=0}^{m_k-1}\sum_{l=0}^{\kappa-1} n_{jl}(n-n_{jl})$$

These results can be used to prove the Lemma and Theorem 2.

Lemma

The total informativity of the symptom set $D_k \subset D$ and a symptom $d_s \notin D_k$ is equal to the sum of the informativity of the set $D_k$ and the conditional informativity of the symptom $d_s$:

$$J_{Bc}(D_k,d_s) = J_{Bc}(D_k) + J_{Bc}(d_s/D_k)$$

Theorem 2

The informativity of the symptom set $D_K = \{d_{(k)}\},\ k = 1,\dots,K,\ D_K \subseteq D$, equals the sum of the conditional informativities of the individual symptoms:

$$J_{Bc}(D_K) = \sum_{k=1}^{K} J_{Bc}(d_{(k)}/D_{k-1})$$

where $D_0 = \emptyset$. The combinatorial informativity of symptoms thus has the property of additivity, as does information in Shannon's sense.
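The additivity property can be checked on a small example. The Python sketch below is illustrative only: `G_EXAMPLE` and the function names are hypothetical, and the check covers one matrix rather than proving the theorem. It sums the conditional informativities along every ordering of the symptoms and compares the sum with the informativity of the whole set.

```python
from collections import Counter
from itertools import permutations
from typing import List, Sequence

# Hypothetical diagnostic matrix: 4 faults (rows), 3 symptoms (columns).
G_EXAMPLE: List[List[int]] = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 2, 1],
    [2, 2, 1],
]


def h_bc_given(G: List[List[int]], symptoms: Sequence[int]) -> int:
    """Uncertainty left after the listed symptoms; an empty list gives H_Bc(E)."""
    blocks = Counter(tuple(row[r] for r in symptoms) for row in G)
    return sum(m * (m - 1) // 2 for m in blocks.values())


def set_informativity(G: List[List[int]], symptoms: Sequence[int]) -> int:
    """J_Bc(D_K) = H_Bc(E) - H_Bc(E/D_K)."""
    return h_bc_given(G, []) - h_bc_given(G, symptoms)


def conditional_chain(G: List[List[int]], order: Sequence[int]) -> List[int]:
    """Conditional informativities J_Bc(d_(k)/D_(k-1)) along the given selection order."""
    return [
        h_bc_given(G, order[:k]) - h_bc_given(G, order[: k + 1])
        for k in range(len(order))
    ]


for order in permutations(range(3)):
    assert sum(conditional_chain(G_EXAMPLE, order)) == set_informativity(G_EXAMPLE, order)

print(conditional_chain(G_EXAMPLE, [0, 1, 2]))
```

The individual terms depend on the selection order, but their sum does not, which is what makes the measure usable for step-by-step planning of a diagnostic sequence.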

6. Conclusion

It has been shown that the system condition uncertainty can be described with functions other than those of the logarithmic form. A new, non-logarithmic combinatorial diagnostic entropy has been introduced. It gives the number of fault pairs that have to be distinguished during the diagnosing process. Treating the combinatorial diagnostic entropy as a primary notion, the informativity of symptoms has been defined. In this way relationships have been derived that allow an explicit, quantitative assessment of the informativity of a single symptom as well as of a set of symptoms. It has been proved that the informativity $J_{Bc}(d_r)$ has the property of additivity.

References

1. Behara M., Nath P.: Additive and non additive entropies of finite measurable partitions. Lecture Notes in Mathematics, vol. 296, Springer-Verlag, 1973.

2. Borowczyk H.: Quasi-informacyjna metoda wyznaczania programu diagnozowania złożonych obiektów technicznych. Rozprawa doktorska, WAT, Warszawa, 1984.

3. Csiszár I.: Information measures: a critical survey. Proc. 7th Prague Conf. on Inf. Theory, Stat. Dec. Functions and Random Processes, 1974.

4. Havrda M.E., Charvát F.: Quantification method of classification processes: concept of structural alpha-entropy. Kybernetika, (3), 1967.

5. Isermann R.: Model-based fault-detection and diagnosis – status and applications. Ann. Rev. in Control, 2004.

6. Korbicz J., Kościelny J. M., Kowalczuk Z., Cholewa W.: Fault diagnosis. Models, Artificial intelligence, Applications. Springer-Verlag, 2004.

7. Lunze J.: Qualitative modelling of dynamical systems. Motivation, methods, and prospective applications. Math. Comp. in Simulation, 1998.

8. Rosenhaus M. B.: Construction of a fault location algorithm. Automatica, Vol. 32, 3, 1996.

9. Shannon C. E.: A mathematical theory of communication. The Bell System Technical Journal, Vol. 27, 1948.


COMBINATORIAL DIAGNOSTIC ENTROPY AND THE INFORMATION VALUE OF SYMPTOMS

1. Introduction

One of the more important problems in technical diagnostics is determining the optimal set of symptoms that forms the diagnostic algorithm. The optimization method adopted depends on the form of the diagnostic model. Contemporary work increasingly uses qualitative (multi-valued) models [2, 5, 6, 7].

The diagnostic algorithm can be determined with an information-based approach [2, 8], which uses measures of the uncertainty of the object's state and of the information value of symptoms. The measures most commonly used are those introduced by Shannon for the needs of information theory [9]: the entropy and the amount of information. Other measures can also be adopted, e.g. Rényi's entropy [3], the structural α-entropy [4], or the entropy functions studied in [1].

This paper proposes combinatorial measures describing the uncertainty of the state of a diagnosed object and the information value of symptoms. The set of postulated properties of the combinatorial diagnostic entropy (denoted $H_{Bc}(\cdot)$) is determined with regard to the needs of diagnostics.

2. Assumptions

1. A finite set of states is given:

$$E = \{e_i\},\quad i = 1,\dots,n$$

2. The probabilities $P(e_i)$ of the states $e_i \in E$ are non-zero:

$$\forall_{i \in \{1,\dots,n\}}\; P(e_i) > 0, \qquad P(E) = 1$$

3. A finite set of symptoms is given,

$$D = \{d_r\},\quad r = 1,\dots,t$$

and a finite set of logical values taken by a symptom,

$$A = \{0, \dots, \kappa-1\}$$

The function $R(\cdot)$ mapping $D$ into $A$ is $\kappa$-valued:

$$R(d_r/e_i) = \alpha_{ir},\quad \alpha_{ir} \in A$$

4. For all symptoms the following condition is satisfied:

$$\forall_{d_r \in D}\;\forall_{e_i \in E}\;\exists_{\alpha_{ir} \in A}\; P[R(d_r/e_i) = \alpha_{ir}] = 1$$

5. The multi-valued diagnostic model has the form of a diagnostic matrix $G$:

$$G = [g_{ir}]_{n \times t},\quad \text{where } g_{ir} = R(d_r/e_i) = \alpha_{ir},\ \alpha_{ir} \in A$$

The adopted assumptions define a multi-valued diagnostic model of a wide class of technical objects.

3. Postulated properties of the combinatorial diagnostic entropy

The set of postulated properties of the combinatorial entropy $H_{Bc}(E)$ can be determined from premises that follow from the needs of diagnostics and from its conceptual similarity to the Shannon entropy. It is therefore justified to postulate that $H_{Bc}(E)$ be a function of $n = \operatorname{card}(E)$:

$$H_{Bc}(E) = f(n)$$

It should also increase monotonically as $n$ grows:

$$f(n) > f(n') \iff n > n'$$

Two further properties follow from the conditions under which the entropy vanishes. If it is known a priori that the set of states has one element ($n = 1$), the state of the object is determined and $H_{Bc}(E)$ should take the value zero:

$$H_{Bc}(E)\big|_{n=1} = 0$$

The second extreme case occurs when the selected symptoms generate a partition of the set of states into one-element subsets $\{\{e_i\}\},\ i = 1,\dots,n$. This means that all states have been distinguished and the uncertainty equals zero:

$$H_{Bc}(E/\{\{e_i\}\}) = 0$$

4. The combinatorial diagnostic entropy

The form of the function $f(n)$ can be determined in two ways: a) by formal deduction from the set of postulated properties, b) by arbitrarily adopting a form of the function and showing that it has the required properties. Further considerations use a theorem whose proof is given in [2].

Theorem 1

If a finite set of states $E = \{e_i\},\ i = 1,\dots,n$, is given, then the function

$$H_{Bc}(E) = \binom{n}{2} = 0.5\,n(n-1)$$

which gives the number of unordered pairs of states, has all the postulated properties stated above.

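5. The combinatorial measure of the information value of diagnostic symptoms

The initial state uncertainty (before any symptom is selected) equals:

$$H_{Bc}(E) = 0.5\,n(n-1)$$

If a symptom $d_r \in D$ is selected first, it generates a partition of the set of states of the form

$$\{E_j(d_r)\} = \{E_0(d_r), \dots, E_{\kappa-1}(d_r)\}$$

where:

$$\forall_{j,l \in \{0,\dots,\kappa-1\},\ j \neq l}\; E_j(d_r) \cap E_l(d_r) = \emptyset, \qquad \bigcup_{j=0}^{\kappa-1} E_j(d_r) = E, \qquad \sum_{j=0}^{\kappa-1} n_j = n$$

with $n_j = \operatorname{card} E_j(d_r)$ and $\kappa = \operatorname{card} A$.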

The state uncertainty after the selection of the symptom $d_r$ equals:

$$H_{Bc}(E/\{E_j\}) = 0.5\sum_{j=0}^{\kappa-1} n_j(n_j-1)$$

Since the partition of the set of states is uniquely determined by the symptom $d_r$ that generates it, one can write:

$$H_{Bc}(E/d_r) = 0.5\sum_{j=0}^{\kappa-1} n_j(n_j-1)$$

The state uncertainty after the selection of the symptom $d_r$ is not greater than the initial one:

$$H_{Bc}(E) \geq H_{Bc}(E/d_r)$$

Equality holds in the following case:

$$H_{Bc}(E) = H_{Bc}(E/d_r) \iff \exists_{j \in A}\; n_j = n$$

This means that the value of the symptom $d_r$ does not depend on the state of the object, and such a symptom can be removed from the set $D$. On the basis of these two relationships, a combinatorial measure of the information value of symptoms can be defined.

Definition 1

The combinatorial measure of the information value of the symptom $d_r$ is equal to the difference between the uncertainty of the object's state before the symptom $d_r$ is selected and the uncertainty remaining after its selection.

If the symptom $d_r$ is selected first, then:

$$J_{Bc}(d_r) = H_{Bc}(E) - H_{Bc}(E/d_r)$$

Substituting the expressions for $H_{Bc}(E)$ and $H_{Bc}(E/d_r)$ and using $\sum_j n_j = n$ gives:

$$J_{Bc}(d_r) = 0.5\sum_{j=0}^{\kappa-1} n_j(n-n_j)$$

Since $n_j \leq n$, the combinatorial information is non-negative, $J_{Bc}(d_r) \geq 0$. This relationship and the earlier considerations lead to the conclusion that $J_{Bc}(d_r)$ equals the number of unordered pairs of states distinguished by the symptom $d_r$:

$$J_{Bc}(d_r) = \sum_{j=0}^{\kappa-2}\ \sum_{k=j+1}^{\kappa-1} n_j n_k$$

This confirms the coherence of the introduced measures of the state uncertainty and of the information value of symptoms.

Jeżeli symptom ds zostanie wybrany jako drugi w kolejności toD w każdym z podzbiorów E dj( )r rozkładu generuje następujący rozkład:

0 1 0,..., 1{ j ( , ),...,r s j ( , )}r s j E d d E d d gdzie: ( , ) { : ( / ) ( / ) , 1,..., ,} jl jl jl jl r s i r i s i jl jl E d d  e R d e  j R d e l i  n Spełnione są przy tym następujące zależności:

(14)

0,..., 1 , 0,..., 1 1 1 0,..., 1 0,..., 1 ₀ 0 ) ( , ) ( , ) ) ( , ) ( ) ) jl r s jk r s j l k l k jl r s j r jl j j j _l l a E d d E d d b E d d E d c n n                _   _      

_

 





Nieokreśloność stanu po wybraniu obu symptomów ,d dr s , możnaD zapisać w postaci: 1 1 0 0 ( / , ) 0,5 ( 1) Bc r s jl jl j l H E d d n n     



 Stąd na podstawie Definicji 1: ( / ) ( / ) ( / , ) Bc s r Bc r Bc r s J d d H E d H E d d Po podstawieniu i do , otrzymuje się:

1 1 0 0 ( / ) 0,5 ( ) Bc s r jl j jl j l J d d n n n     





These considerations can be generalised to the information of a symptom $d_s$ under the condition that a set $D_k \subset D$ of $k$ symptoms has been selected earlier,

$$D_k = \{d_{(1)}, d_{(2)}, \dots, d_{(k)}\}$$

and the partition of the set of states has taken the form:

$$\{E_j(D_k)\} = \{E_0(D_k), E_1(D_k), \dots, E_{m_k-1}(D_k)\}$$

where $m_k$ is the cardinality of the family of subsets. On the basis of the generalised Definition 1:

$$J_{Bc}(d_s/D_k) = H_{Bc}(E/D_k) - H_{Bc}(E/D_k,d_s)$$

and finally:

$$J_{Bc}(d_s/D_k) = 0.5\sum_{j=0}^{m_k-1}\sum_{l=0}^{\kappa-1} n_{jl}(n_j-n_{jl})$$

This relationship generalises the one for the two-symptom case: they become identical when $k = 1$ and $m_k = \kappa$.

Another question is the information of a set of $k$ symptoms $D_k \subset D$:

$$J_{Bc}(D_k) = H_{Bc}(E) - H_{Bc}(E/D_k)$$

After transformations:

$$J_{Bc}(D_k) = 0.5\sum_{j=0}^{m_k-1} n_j(n-n_j)$$

A comparison with the formula for $J_{Bc}(d_r)$ shows that both relationships become identical when $D_k$ is a one-element set.

The total information of the set $D_k$ and a symptom $d_s \notin D_k$ can be written as follows:

$$J_{Bc}(D_k,d_s) = H_{Bc}(E) - H_{Bc}(E/D_k,d_s)$$

After transformation:

$$J_{Bc}(D_k,d_s) = 0.5\sum_{j=0}^{m_k-1}\sum_{l=0}^{\kappa-1} n_{jl}(n-n_{jl})$$

These considerations allow the following lemma and theorem to be formulated (proof in [2]).

Lemma

The total combinatorial information of the set $D_k \subset D$ and a symptom $d_s \notin D_k$ is equal to:

$$J_{Bc}(D_k,d_s) = J_{Bc}(D_k) + J_{Bc}(d_s/D_k)$$

Theorem 2

The information of the symptom set $D_K = \{d_{(k)}\},\ k = 1,\dots,K,\ D_K \subseteq D$, is equal to the sum of the conditional informations of the sequentially selected symptoms:

$$J_{Bc}(D_K) = \sum_{k=1}^{K} J_{Bc}(d_{(k)}/D_{k-1})$$

where $D_0 = \emptyset$. It follows that the combinatorial information has the property of additivity, as does information in Shannon's sense.

6. Summary

The paper shows that the uncertainty of the state of a diagnosed object can be described with a function of a non-logarithmic form. A combinatorial diagnostic entropy has been introduced that gives the number of pairs of states which should be distinguished in the diagnosing process. Combinatorial measures of the information value of single symptoms and of sets of symptoms have then been defined. It has been shown that the combinatorial information $J_{Bc}(d_r)$ has the property of additivity, as does information in Shannon's sense.

PhD. Eng. BOROWCZYK Henryk, Air Force Institute of Technology, Bialystok University of Technology, specialization: complex diagnostics of turbine engines and control systems, system identification, methods of artificial intelligence.