
(1)

Probabilistic Intervals of Confidence Interpretation of Adaptive Models

Norbert Jankowski

Department of Computer Methods, Nicholas Copernicus University

ul. Grudziądzka 5, 87–100 Toruń, Poland, phone: +48 56 6113307, fax: +48 56 621543

e-mail: Norbert.Jankowski@phys.uni.torun.pl, http://www.phys.uni.torun.pl/~norbert

(2)

What is the goal?

• High accuracy should not be the only goal of classification

• Also important are: alternative diagnoses and their probabilities, and an evaluation of confidence

• Neural models report just the winner class: they work as black boxes.

Probabilistic Confidence Intervals help to:

• evaluate the certainty of the winning class and the importance of alternative classes

• compare the influence of each feature on the classification of a given case, showing changes in the probability of all important classes

• visualize the class memberships of a given case and its neighborhood

(3)

Disadvantages of (crisp) logical rules

• Rules assign a given case to a class without any gradation that could give information on the uncertainty of such a classification

• Rule conditions use hyper-rectangular membership functions, so the shapes of their decision borders are very limited

• Because of their rectangular shapes, rules may not cover the whole input space, leaving subspaces in which no classification is done

• Rules may also overlap, producing ambiguous classification

• Logical rules are not reliable near decision borders

(4)

Incremental Network

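[Diagram: the training data (x, y) is split into (x, y_1), ..., (x, y_K); each pair (x, y_i) feeds its own sub-network IncNet i; the outputs C_1(x), ..., C_K(x) go to the Decision Module, which returns C(x).]

Winning class:

$$C(\mathbf{x}) = \arg\max_{i} C_i(\mathbf{x})$$

Probability:

$$p(C_i \mid \mathbf{x}) = \frac{\sigma\left(C_i(\mathbf{x}) - \tfrac{1}{2}\right)}{\sum_{j=1}^{K} \sigma\left(C_j(\mathbf{x}) - \tfrac{1}{2}\right)}$$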

The IncNet network was used because of its good performance: the network structure is controlled by growing and pruning criteria that keep the complexity of the network similar to the complexity of the data.
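The decision module above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: σ is taken to be the logistic function with an assumed slope parameter `gamma`, the formula is read as σ(C_i(x) − 1/2), and the three output values are hypothetical, not results from the IncNet network.

```python
import math

def sigmoid(t, gamma=1.0):
    """Logistic function; the slope gamma is an assumed free parameter."""
    return 1.0 / (1.0 + math.exp(-gamma * t))

def class_probabilities(outputs, gamma=1.0):
    """Normalize sub-network outputs C_i(x) into probabilities p(C_i | x)."""
    scores = [sigmoid(c - 0.5, gamma) for c in outputs]
    total = sum(scores)
    return [s / total for s in scores]

def winning_class(outputs):
    """Index of the largest output, i.e. C(x) = arg max_i C_i(x)."""
    return max(range(len(outputs)), key=lambda i: outputs[i])

# Hypothetical outputs of K = 3 sub-networks for one input vector.
outputs = [0.9, 0.2, 0.1]
probs = class_probabilities(outputs, gamma=10.0)
print(winning_class(outputs), [round(p, 3) for p in probs])
```

Because the sigmoid is monotonic, the class with the largest output also receives the largest probability, so the normalization adds a graded measure of certainty without changing the winner.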

(5)

Confidence Intervals (CI)

• Confidence intervals are calculated individually for a given input vector, while logical rules are extracted for the whole training set.

• In general, such probabilities may be estimated by any trustworthy model M.

Suppose that for a given vector $\mathbf{x} = [x_1, x_2, \ldots, x_N]$ the highest probability $p(C_k \mid \mathbf{x}; M)$ is found for class k.

The confidence interval $[x_r^{\min}, x_r^{\max}]$ for the feature r is defined by

$$x_r^{\min} = \min_{\bar{x}} \left\{ C(\bar{\mathbf{x}}) = k \;\wedge\; \forall_{x_r > \hat{x} > \bar{x}}\; C(\hat{\mathbf{x}}) = k \right\} \qquad (1)$$

$$x_r^{\max} = \max_{\bar{x}} \left\{ C(\bar{\mathbf{x}}) = k \;\wedge\; \forall_{x_r < \hat{x} < \bar{x}}\; C(\hat{\mathbf{x}}) = k \right\} \qquad (2)$$

where

$$\bar{\mathbf{x}} = [x_1, \ldots, x_{r-1}, \bar{x}, x_{r+1}, \ldots, x_N], \qquad \hat{\mathbf{x}} = [x_1, \ldots, x_{r-1}, \hat{x}, x_{r+1}, \ldots, x_N] \qquad (3)$$
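A minimal sketch of how such an interval could be approximated numerically: feature r is moved away from its measured value on a regular grid, in each direction, until the predicted class changes. The grid step, the 0 to 120 range, the function names and the toy classifier are all assumptions made for illustration; they are not part of the original method description.

```python
def confidence_interval(classify, x, r, lo=0.0, hi=120.0, step=1.0):
    """Grid approximation of [x_r_min, x_r_max] for feature r of vector x.

    classify(x) must return a class label; only feature r is varied.
    """
    k = classify(x)  # winning class at the measured point

    def scan(direction, bound):
        best = x[r]
        z = x[r] + direction * step
        # Walk towards the bound while the class stays equal to k.
        while (z - bound) * direction <= 0:
            probe = list(x)
            probe[r] = z
            if classify(probe) != k:
                break
            best = z
            z += direction * step
        return best

    return scan(-1, lo), scan(+1, hi)

def toy_classify(x):
    """Hypothetical classifier: class 0 only when feature 0 is in [30, 70]."""
    return 0 if 30.0 <= x[0] <= 70.0 else 1

case = [50.0, 10.0]
print(confidence_interval(toy_classify, case, r=0))
print(confidence_interval(toy_classify, case, r=1))
```

In this toy setting the interval for feature 1 covers the whole range, which is the situation where the classification does not depend on that feature.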

(6)

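Confidence intervals for a given vector x measure the maximal deviation from the value $x_r$, with all other feature values unchanged, that does not change the classification of the vector.

Intervals with confidence level β should guarantee that the winning class k is considerably more probable than the most probable alternative class:

$$x_{r,\beta}^{\min} = \min_{\bar{x}} \left\{ C(\bar{\mathbf{x}}) = k \;\wedge\; \forall_{x_r > \hat{x} > \bar{x}}\; C(\hat{\mathbf{x}}) = k \;\wedge\; \frac{p(C_k \mid \bar{\mathbf{x}})}{\max_{i \neq k} p(C_i \mid \bar{\mathbf{x}})} > \beta \right\} \qquad (4)$$

$$x_{r,\beta}^{\max} = \max_{\bar{x}} \left\{ C(\bar{\mathbf{x}}) = k \;\wedge\; \forall_{x_r < \hat{x} < \bar{x}}\; C(\hat{\mathbf{x}}) = k \;\wedge\; \frac{p(C_k \mid \bar{\mathbf{x}})}{\max_{i \neq k} p(C_i \mid \bar{\mathbf{x}})} > \beta \right\} \qquad (5)$$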
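The effect of the confidence level β can be illustrated with a small grid search. The sketch below assumes a hypothetical `predict_proba(x)` that returns a list of class probabilities, a 0 to 120 feature range and a unit grid step; the two-class toy model is invented for the example. The scan stays inside the contiguous region where class k wins and keeps the outermost grid point at which the probability ratio exceeds β.

```python
import math

def beta_interval(predict_proba, x, r, beta, lo=0.0, hi=120.0, step=1.0):
    """Grid approximation of the interval with confidence level beta."""
    probs = predict_proba(x)
    k = max(range(len(probs)), key=probs.__getitem__)  # winning class at x

    def ratio_ok(p):
        rival = max(p[i] for i in range(len(p)) if i != k)
        return rival == 0.0 or p[k] / rival > beta

    def scan(direction, bound):
        best = x[r] if ratio_ok(probs) else None
        z = x[r] + direction * step
        while (z - bound) * direction <= 0:
            probe = list(x)
            probe[r] = z
            p = predict_proba(probe)
            if max(range(len(p)), key=p.__getitem__) != k:
                break  # left the region where class k wins
            if ratio_ok(p):
                best = z
            z += direction * step
        return best

    return scan(-1, lo), scan(+1, hi)

def toy_proba(x):
    """Hypothetical two-class model: class 0 is likely near x[0] = 50."""
    p0 = 1.0 / (1.0 + math.exp((abs(x[0] - 50.0) - 20.0) / 5.0))
    return [p0, 1.0 - p0]

print(beta_interval(toy_proba, [50.0], r=0, beta=1.0))
print(beta_interval(toy_proba, [50.0], r=0, beta=3.0))
```

A larger β yields a narrower interval, since points close to the decision border no longer qualify.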

(7)

[Figure: 14 plots of probability (0 to 1) against feature value (0 to 120), one per scale: 1. "It is hard for me to answer that"; 2. Assessment of degree of sincerity; 3. Detection of atypical and deviational answering style; 4. Detection of subtle attempts at profile falsification; 5. Hypochondria; 6. Depression; 7. Hysteria; 8. Psychopathy; 9. Masculinity; 10. Paranoia; 11. Psychasthenia; 12. Schizophrenia; 13. Mania; 14. Social introversion.]

Figure 1: Reactive Psychosis.

(8)

[Figure: 12 plots of probability (0 to 1) against feature value (0 to 120), one per scale: 1. Assessment of degree of sincerity; 2. Detection of atypical and deviational answering style; 3. Detection of subtle attempts at profile falsification; 4. Hypochondria; 5. Depression; 6. Hysteria; 7. Psychopathy; 8. Masculinity; 9. Paranoia; 10. Psychasthenia; 11. Schizophrenia; 12. Manic.]

Figure 2: Class: Paranoia (prob. 0.68); alternative class: schizophrenia (prob. 0.28).

(9)

Probabilistic Intervals of Confidence (PIC)

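For a given vector x and feature r, three probabilities are considered:

• winner: probability $p(C(\mathbf{x}) \mid \bar{\mathbf{x}}(z))$, class $C(\mathbf{x})$

• alternative I: probability $p(C_{k_2} \mid \bar{\mathbf{x}}(z))$, class $k_2 = \arg\max_i \{ p(C_i \mid \mathbf{x}),\; C_i \neq C(\mathbf{x}) \}$

• alternative II: probability $p(C_{k_M} \mid \bar{\mathbf{x}}(z))$, class $k_M = \arg\max_i \{ p(C_i \mid \bar{\mathbf{x}}(z)),\; C_i \neq C(\mathbf{x}) \}$

where

$$\bar{\mathbf{x}}(z) = [x_1, \ldots, x_{r-1}, z, x_{r+1}, \ldots, x_N]$$

[Example plot: probability (0 to 1) against feature value (0 to 120) for scale 11, Psychasthenia.]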
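The three curves can be tabulated by sweeping z over the range of feature r and re-evaluating the model each time. The following sketch assumes a hypothetical `predict_proba(x)` returning a list of class probabilities; the three-class toy model and the grid are invented for illustration only.

```python
import math

def pic_curves(predict_proba, x, r, grid):
    """Tabulate winner, alternative I and alternative II along feature r."""
    base = predict_proba(x)
    order = sorted(range(len(base)), key=lambda i: base[i], reverse=True)
    winner, k2 = order[0], order[1]  # winner and best alternative at x itself
    rows = []
    for z in grid:
        probe = list(x)
        probe[r] = z
        p = predict_proba(probe)
        # Alternative II: the best non-winner class at this particular z.
        k_m = max((i for i in range(len(p)) if i != winner), key=lambda i: p[i])
        rows.append((z, p[winner], p[k2], k_m, p[k_m]))
    return winner, k2, rows

def toy_proba(x):
    """Hypothetical 3-class model with class centers at 20, 60 and 100."""
    scores = [math.exp(-((x[0] - c) / 20.0) ** 2) for c in (20.0, 60.0, 100.0)]
    total = sum(scores)
    return [s / total for s in scores]

grid = [float(z) for z in range(0, 121, 10)]
winner, k2, rows = pic_curves(toy_proba, [55.0], 0, grid)
for z, p_win, p_alt1, k_m, p_alt2 in rows:
    print(f"z={z:5.1f}  winner={p_win:.3f}  alt I={p_alt1:.3f}  alt II (class {k_m})={p_alt2:.3f}")
```

Alternative I follows one fixed class over the whole range, while alternative II may switch class as z changes, so its curve is never below that of alternative I.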

(10)

[Figure: 12 plots of probability (0 to 1) against feature value (0 to 120), one per scale, with the class labels that appear in each plot: 1. Assessment of degree of sincerity (psychopathy, manic state, schizophrenia); 2. Detection of atypical and deviational answering style (psychopathy); 3. Detection of subtle attempts at profile falsification (psychopathy); 4. Hypochondria (psychopathy, neurosis); 5. Depression (psychopathy); 6. Hysteria (psychopathy); 7. Psychopathy (norm, psychopathy); 8. Masculinity (psychopathy); 9. Paranoia (psychopathy); 10. Psychasthenia (psychopathy, neurosis); 11. Schizophrenia (psychopathy, schizophrenia); 12. Manic (psychopathy, narcomania).]

Figure 3: Class: Psychopathy (prob. 0.97); alternative class: neurosis (prob. 0.002).

(11)

[Figure: 12 plots of probability (0 to 1) against feature value (0 to 120), one per scale, with the class labels that appear in each plot: 1. Assessment of degree of sincerity (neurosis, organic); 2. Detection of atypical and deviational answering style (schizophrenia, neurosis, organic); 3. Detection of subtle attempts at profile falsification (organic, deviational answering style 1); 4. Hypochondria (schizophrenia, organic, neurosis); 5. Depression (schizophrenia, neurosis, organic); 6. Hysteria (schizophrenia, organic, neurosis); 7. Psychopathy (organic, schizophrenia); 8. Masculinity (organic); 9. Paranoia (schizophrenia, neurosis, organic, simulation); 10. Psychasthenia (schizophrenia, organic, neurosis); 11. Schizophrenia (neurosis, criminality, organic); 12. Manic (schizophrenia, organic, neurosis).]

Figure 4: Class: Organic (prob. 0.83); alternative class: schizophrenia (prob. 0.062).

(12)

[Figure: 12 plots of probability (0 to 1) against feature value (0 to 120), one per scale, with the class labels that appear in each plot: 1. Assessment of degree of sincerity (schizophrenia, paranoia); 2. Detection of atypical and deviational answering style (schizophrenia, paranoia, criminality); 3. Detection of subtle attempts at profile falsification (schizophrenia, paranoia); 4. Hypochondria (schizophrenia, paranoia, neurosis); 5. Depression (schizophrenia, paranoia); 6. Hysteria (schizophrenia, paranoia); 7. Psychopathy (schizophrenia, paranoia); 8. Masculinity (schizophrenia, paranoia); 9. Paranoia (schizophrenia, paranoia, criminality); 10. Psychasthenia (schizophrenia, paranoia); 11. Schizophrenia (criminality, schizophrenia, paranoia); 12. Manic (schizophrenia, paranoia).]

Figure 5: Class: Paranoia (prob. 0.68); alternative class: schizophrenia (prob. 0.28).

(13)

[Figure: 12 plots of probability (0 to 1) against feature value (0 to 120), one per scale, with the class labels that appear in each plot: 1. Assessment of degree of sincerity (norm); 2. Detection of atypical and deviational answering style (norm); 3. Detection of subtle attempts at profile falsification (norm); 4. Hypochondria (norm, alcoholism); 5. Depression (norm); 6. Hysteria (norm); 7. Psychopathy (norm, psychopathy); 8. Masculinity (norm); 9. Paranoia (norm, criminality); 10. Psychasthenia (norm); 11. Schizophrenia (norm, schizophrenia, simulation); 12. Manic (norm).]

Figure 6: Class: Norm (prob. 0.97); no alternative class.

(14)

Description of previous pictures

Figures 3, 4, 5 and 6 show probabilistic intervals of confidence for quite different patients (the first and the last scales have been omitted, so only 12 features are displayed). Little squares show the probability of the winning class corresponding to the measured input values of the psychometric scales. Figure 3 presents an easy case: psychopathy has a large probability, 0.97, and the case is quite far from any alternative class. The whole range of values, 0 to 120, is shown, and an alternative class appears for features 1, 4, 7 and 12, but the confidence intervals are quite broad.

Classification does not depend on the precise values of some features r (for example features 2, 3, 5, 6, etc.), since there are no alternative classes in the whole range of values that $\bar{x}$ may take.

The second set of plots, Fig. 4, is more complex. The winner class, organic, has probability 0.83, while the alternative class, schizophrenia, has probability 0.06. The analysis of the plots shows that the values for scales 4 and 7 are close to the border, so both diagnoses are probable and scales 4 and 7 are very important for the diagnosis. Note that the classification is not so simple, although the probability is 0.83, because the considered case lies so close to the border for feature 4.

The case in Figure 5 is ambiguous too. The winner class, paranoia, has probability 0.68, while the alternative class, schizophrenia, has probability 0.28. The analysis of the plots shows that the values for scales 7 and 11 are close to the border, so both diagnoses are probable and scales 7 and 11 are crucial for the considered case.

Figure 6 shows a typical case that belongs to the "norm" class.

(15)

Psychometric data classification

• Psychometric test: Minnesota Multiphasic Personality Inventory

• The test consists of over 550 questions

• 550 questions ➠ 14 features (control and clinical scales)

hypochondria, depression, hysteria, psychopathy, masculinity, paranoia, psychasthenia, schizophrenia, manic, social introversion

• 20, 27 or 28 nosological types (classes)

norm, neurosis, psychopathy, organic, schizophrenia, delusion, reactive psychosis, paranoia, manic state, criminality, alcoholism, etc.

• The IncNet network trained with 10-fold cross-validation (CV10) reaches 93% accuracy (CV5: 95.5%).

(16)

Conclusions

• PIC are new and very useful tools to support the process of diagnosis

• Information on winner and alternative classes is continuous and very precise

• A confidence interval shows neighboring alternative classes (if they exist)

• The distance from the case considered to decision borders may be analyzed in this way

• Analysis of complex cases, which often lie near the decision border, is much more reliable using probabilistic confidence intervals than logical rules

• It is very easy to find which features are important and which may be omitted

• Artificial neural networks may be interpreted using such tools, breaking the myth that neural networks are black boxes.
