(1)

Cluster analysis

Agnieszka Nowak-Brzezinska

(2)

Outline of lecture

• What is cluster analysis?

• Clustering algorithms

• Measures of Cluster Validity

(3)

What is Cluster Analysis?

• Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from (or unrelated to) the objects in other groups

Figure: intra-cluster distances are minimized, while inter-cluster distances are maximized.

(4)

Applications of Cluster Analysis

• Understanding

– Group genes and proteins that have similar functionality, or group stocks with similar price fluctuations

• Summarization

– Reduce the size of large data sets

Figure: clustering precipitation in Australia.

(5)

Notion of a Cluster can be Ambiguous

How many clusters? The same set of points can plausibly be grouped into two, four, or six clusters.

(6)

Types of Clusterings

• A clustering is a set of clusters

• Important distinction between hierarchical and partitional sets of clusters

• Partitional Clustering

– A division of data objects into non-overlapping subsets (clusters) such that each data object is in exactly one subset

• Hierarchical clustering

– A set of nested clusters organized as a hierarchical tree

(7)

Partitional Clustering

Figure: the original points and one possible partitional clustering of them.

(8)

Hierarchical Clustering

Figure: a traditional hierarchical clustering of points p1–p4 and the corresponding traditional dendrogram.

(9)

Clustering Algorithms

• K-means

• Hierarchical clustering

• Graph based clustering

(10)

K-means Clustering

• Partitional clustering approach

• Each cluster is associated with a centroid (center point)

• Each point is assigned to the cluster with the closest centroid

• Number of clusters, K, must be specified

• The basic algorithm is very simple
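The loop below is a minimal sketch of these steps in Python with NumPy, assuming Euclidean distance and initial centroids drawn at random from the data; the function name and parameters are illustrative, not part of the lecture.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Basic K-means: X is an (n, d) array of points, k the number of clusters."""
    rng = np.random.default_rng(seed)
    # Pick k distinct data points as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to the cluster with the closest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points
        # (empty clusters are not handled in this sketch).
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # Stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```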

(11)

Illustration

Figure: six iterations of K-means on a two-dimensional data set (x–y plane); the centroids move and the cluster assignments change until convergence.


(13)

Two different K-means Clusterings

Figure: the same original points clustered by K-means from two different initializations, producing an optimal clustering and a sub-optimal clustering.

(14)

Solutions to Initial Centroids Problem

• Multiple runs

– Helps, but probability is not on your side

• Sample and use hierarchical clustering to determine initial centroids

• Select more than k initial centroids and then select among these initial centroids

– Select most widely separated

• Bisecting K-means

– Not as susceptible to initialization issues

(15)

Evaluating K-means Clusters

• Most common measure is Sum of Squared Error (SSE)

– For each point, the error is the distance to the nearest cluster centroid
– To get SSE, we square these errors and sum them:

$$SSE = \sum_{i=1}^{K} \sum_{x \in C_i} dist^{2}(m_i, x)$$

– x is a data point in cluster C_i and m_i is the representative point for cluster C_i

• one can show that m_i corresponds to the center (mean) of the cluster

– Given two clusterings, we can choose the one with the smaller error
– One easy way to reduce SSE is to increase K, the number of clusters

• A good clustering with smaller K can have a lower SSE than a poor clustering with higher K
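A short sketch of how this SSE can be computed for a given assignment, using the labels and centroids produced by a K-means run (NumPy; names are illustrative):

```python
import numpy as np

def sse(X, labels, centroids):
    """Sum of squared distances from each point to the centroid of its cluster."""
    diffs = X - centroids[labels]      # vector from each point to its own centroid
    return float((diffs ** 2).sum())
```

Comparing this value across several runs with different initial centroids is one simple way to pick the best of the "multiple runs" mentioned on the previous slide.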

(16)

Limitations of K-means

• K-means has problems when clusters are of differing

– Sizes

– Densities

– Non-globular shapes

• K-means has problems when the data contains outliers.

• The number of clusters (K) is difficult to determine.

(17)

Hierarchical Clustering

• Produces a set of nested clusters organized as a hierarchical tree

• Can be visualized as a dendrogram

– A tree like diagram that records the sequences of merges or splits

Figure: a nested clustering of six points and the corresponding dendrogram (vertical axis: merge distance from 0 to 0.2).

(18)

Strengths of Hierarchical Clustering

• Do not have to assume any particular number of clusters

– Any desired number of clusters can be obtained by ‘cutting’ the dendrogram at the proper level

• They may correspond to meaningful taxonomies

– Example in biological sciences (e.g., animal kingdom, phylogeny reconstruction, …)

(19)

Hierarchical Clustering

• Two main types of hierarchical clustering:

– Agglomerative:

• Start with the points as individual clusters

• At each step, merge the closest pair of clusters until only one cluster (or k clusters) left

– Divisive:

• Start with one, all-inclusive cluster

• At each step, split a cluster until each cluster contains a point (or there are k clusters)

• Traditional hierarchical algorithms use a similarity or distance matrix

– Merge or split one cluster at a time

(20)

Agglomerative Clustering Algorithm

• More popular hierarchical clustering technique

• Basic algorithm is straightforward

1. Compute the proximity matrix
2. Let each data point be a cluster
3. Repeat
4. Merge the two closest clusters
5. Update the proximity matrix
6. Until only a single cluster remains

(a direct sketch of these steps follows below)

• Key operation is the computation of the proximity of two clusters

– Different approaches to defining the distance between clusters distinguish the different algorithms
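A direct, unoptimized sketch of these six steps, using single link (minimum point-to-point distance) as one illustrative way to measure cluster proximity; names are illustrative:

```python
def agglomerative_single_link(D):
    """D: n x n symmetric distance matrix (list of lists). Returns the merge sequence."""
    n = len(D)
    clusters = [[i] for i in range(n)]          # step 2: every point is its own cluster
    merges = []
    while len(clusters) > 1:                    # steps 3-6: repeat until one cluster remains
        # step 4: find the closest pair of clusters
        # (single link: cluster distance = minimum point-to-point distance)
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(D[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        # step 5: the "update" of the proximity matrix is implicit here -- the merged
        # cluster replaces its two parents and distances are recomputed on demand
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges
```

Real implementations keep and update an explicit proximity matrix instead of recomputing distances on demand, which is what leads to the time and space requirements discussed later.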

(21)

Starting Situation

• Start with clusters of individual points and a proximity matrix

Figure: individual points p1, p2, … treated as singleton clusters, together with their proximity matrix.

(22)

Intermediate Situation

• After some merging steps, we have some clusters

Figure: after some merging steps we have clusters C1–C5; the proximity matrix now holds cluster-to-cluster distances.

(23)

Intermediate Situation

• We want to merge the two closest clusters (C2 and C5) and update the proximity matrix.

Figure: clusters C1–C5 with C2 and C5 about to be merged, and the current proximity matrix.

(24)

After Merging

• The question is “How do we update the proximity matrix?”

Figure: clusters C1, C3, C4 and the merged cluster C2 ∪ C5; the proximity-matrix entries involving C2 ∪ C5 are marked "?" because they must be recomputed.

(25)

How to Define Inter-Cluster Similarity

Figure: two candidate clusters among points p1–p5 and their proximity matrix. How should the similarity between them be defined?

• MIN
• MAX
• Group Average
• Distance Between Centroids


(30)

Hierarchical Clustering: Group Average

• Compromise between Single and Complete Link

• Strengths

– Less susceptible to noise and outliers

• Limitations

– Biased towards globular clusters

(31)

Hierarchical Clustering: Time and Space requirements

• O(N²) space since it uses the proximity matrix

– N is the number of points

• O(N³) time in many cases

– There are N steps, and at each step a proximity matrix of size N² must be updated and searched
– Complexity can be reduced to O(N² log N) time for some approaches

(32)

Hierarchical Clustering: Problems and Limitations

• Once a decision is made to combine two clusters, it cannot be undone

• No objective function is directly minimized

• Different schemes have problems with one or more of the following:

– Sensitivity to noise and outliers (MIN)

– Difficulty handling different sized clusters and non-convex shapes (Group average, MAX)

– Breaking large clusters (MAX)

(33)

Types of data in clustering analysis

• Interval-scaled attributes:

• Binary attributes:

• Nominal, ordinal, and ratio attributes:

• Attributes of mixed types:

(34)

Interval-scaled attributes

• Continuous measurements of a roughly linear scale

– E.g. weight, height, temperature, etc.

• Standardize data in preprocessing so that all attributes have equal weight

– Exceptions: height may be a more important attribute associated with basketball players

(35)

Similarity and Dissimilarity Between Objects

• Distances are normally used to measure the similarity or dissimilarity between two data objects (objects=records)

• Minkowski distance:

where i = (x_i1, x_i2, …, x_ip) and j = (x_j1, x_j2, …, x_jp) are two p-dimensional data objects, and q is a positive integer:

$$d(i,j) = \sqrt[q]{\,|x_{i1}-x_{j1}|^{q} + |x_{i2}-x_{j2}|^{q} + \cdots + |x_{ip}-x_{jp}|^{q}\,}$$

• If q = 1, d is Manhattan distance:

$$d(i,j) = |x_{i1}-x_{j1}| + |x_{i2}-x_{j2}| + \cdots + |x_{ip}-x_{jp}|$$

(36)

Similarity and Dissimilarity Between Objects (Cont.)

• If q = 2, d is Euclidean distance:

$$d(i,j) = \sqrt{|x_{i1}-x_{j1}|^{2} + |x_{i2}-x_{j2}|^{2} + \cdots + |x_{ip}-x_{jp}|^{2}}$$

– Properties

• d(i,j) ≥ 0
• d(i,i) = 0
• d(i,j) = d(j,i)
• d(i,j) ≤ d(i,k) + d(k,j)

• Can also use weighted distance, or other dissimilarity measures.
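For illustration, these distances can be written directly from the formulas above (plain Python; function names are illustrative):

```python
def minkowski(x, y, q):
    """Minkowski distance between two equal-length sequences of numbers (q >= 1)."""
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1 / q)

def manhattan(x, y):
    return minkowski(x, y, 1)   # q = 1

def euclidean(x, y):
    return minkowski(x, y, 2)   # q = 2

# euclidean((2, 3), (7, 8)) -> 7.07..., the worked example later in the lecture
```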

(37)

Binary Attributes

• A contingency table for binary data

              Object j
               1     0    sum
Object i  1    a     b    a+b
          0    c     d    c+d
         sum  a+c   b+d    p

• Simple matching coefficient (if the binary attribute is symmetric):

$$d(i,j) = \frac{b+c}{a+b+c+d}$$

• Jaccard coefficient (if the binary attribute is asymmetric):

$$d(i,j) = \frac{b+c}{a+b+c}$$
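A small sketch computing both coefficients from two 0/1 vectors, following the contingency-table definitions above (plain Python; the function name is illustrative):

```python
def binary_dissimilarity(x, y):
    """Simple matching and Jaccard dissimilarities for two 0/1 vectors."""
    a = sum(1 for u, v in zip(x, y) if u == 1 and v == 1)
    b = sum(1 for u, v in zip(x, y) if u == 1 and v == 0)
    c = sum(1 for u, v in zip(x, y) if u == 0 and v == 1)
    d = sum(1 for u, v in zip(x, y) if u == 0 and v == 0)
    simple_matching = (b + c) / (a + b + c + d)
    jaccard = (b + c) / (a + b + c) if (a + b + c) else 0.0
    return simple_matching, jaccard
```

On the asymmetric attributes of Jack and Mary from the next slide, encoded as (1,0,1,0,0,0) and (1,0,1,0,1,0), this returns a Jaccard dissimilarity of 0.33.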

(38)

Dissimilarity between Binary Attributes

• Example

– gender is a symmetric attribute
– the remaining attributes are asymmetric
– let the values Y and P be set to 1, and the value N be set to 0

Name  Gender  Fever  Cough  Test-1  Test-2  Test-3  Test-4
Jack    M       Y      N      P       N       N       N
Mary    F       Y      N      P       N       P       N
Jim     M       Y      P      N       N       N       N

Using the Jaccard coefficient on the asymmetric attributes:

$$d(jack, mary) = \frac{0+1}{2+0+1} = 0.33$$

$$d(jack, jim) = \frac{1+1}{1+1+1} = 0.67$$

$$d(jim, mary) = \frac{1+2}{1+1+2} = 0.75$$

(39)

Nominal Attributes

• A generalization of the binary attribute in that it can take more than 2 states, e.g., red, yellow, blue, green

• Method 1: Simple matching

– m: # of attributes with the same value for both records, p: total # of attributes

• Method 2: rewrite the database and create a new binary attribute for each of the m states

– For an object with color yellow, the yellow attribute is set to 1, while the remaining attributes are set to 0.

$$d(i,j) = \frac{p - m}{p}$$
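A minimal sketch of Method 1 (plain Python; the function name is illustrative):

```python
def nominal_dissimilarity(x, y):
    """d(i, j) = (p - m) / p, where m counts attributes with matching values."""
    p = len(x)
    m = sum(1 for u, v in zip(x, y) if u == v)
    return (p - m) / p

# nominal_dissimilarity(("red", "small"), ("red", "large")) -> 0.5
```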

(40)

Ordinal Attributes

• An ordinal attribute can be discrete or continuous

• Order is important, e.g., rank

• Can be treated like interval-scaled attributes

– replace x_if by its rank r_if ∈ {1, …, M_f}
– map the range of each variable onto [0, 1] by replacing the i-th object in the f-th attribute by

$$z_{if} = \frac{r_{if} - 1}{M_f - 1}$$

– compute the dissimilarity using methods for interval-scaled attributes
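A minimal sketch of this rank-then-scale step (plain Python; names and example levels are illustrative):

```python
def ordinal_to_interval(value, ordered_levels):
    """Map an ordinal value onto [0, 1] via z = (r - 1) / (M - 1)."""
    r = ordered_levels.index(value) + 1      # rank r_if in {1, ..., M_f}
    M = len(ordered_levels)
    return (r - 1) / (M - 1)

# ordinal_to_interval("silver", ["bronze", "silver", "gold"]) -> 0.5
```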

(41)
(42)

Euclidean distance

(43)

Other measures

Figure: other distance measures, e.g. spherical and Manhattan.

(44)

Different measures

(45)

The distance between two points

• Calculate the distance between A(2,3) and B(7,8).

$$d(A,B) = \sqrt{(7-2)^2 + (8-3)^2} = \sqrt{25 + 25} = \sqrt{50} = 7.07$$

Figure: points A and B plotted in the x–y plane.

(46)

3 points

• A(2,3), B(7,8) and C(5,1). Calculate:

$$d(A,B) = \sqrt{(7-2)^2 + (8-3)^2} = \sqrt{25 + 25} = \sqrt{50} = 7.07$$

$$d(A,C) = \sqrt{(5-2)^2 + (3-1)^2} = \sqrt{9 + 4} = \sqrt{13} = 3.60$$

$$d(B,C) = \sqrt{(7-5)^2 + (8-1)^2} = \sqrt{4 + 49} = \sqrt{53} = 7.28$$

Figure: points A, B and C plotted in the x–y plane.

(47)

Many attributes…

   V1   V2   V3   V4   V5
A  0.7  0.8  0.4  0.5  0.2
B  0.6  0.8  0.5  0.4  0.2
C  0.8  0.9  0.7  0.8  0.9

$$d(A,B) = \sqrt{(0.7-0.6)^2 + (0.8-0.8)^2 + (0.4-0.5)^2 + (0.5-0.4)^2 + (0.2-0.2)^2} = \sqrt{0.03} = 0.17$$

$$d(A,C) = \sqrt{(0.7-0.8)^2 + (0.8-0.9)^2 + (0.4-0.7)^2 + (0.5-0.8)^2 + (0.2-0.9)^2} = \sqrt{0.69} = 0.83$$

$$d(B,C) = \sqrt{(0.6-0.8)^2 + (0.8-0.9)^2 + (0.5-0.7)^2 + (0.4-0.8)^2 + (0.2-0.9)^2} = \sqrt{0.74} = 0.86$$

(48)

Hierarchical Clustering

• Use distance matrix as clustering criteria. This method does not require the number of clusters k as an input, but needs a termination condition.

Figure: agglomerative clustering (AGNES) merges points a, b, c, d, e bottom-up from step 0 to step 4, while divisive clustering (DIANA) splits the single cluster top-down from step 4 to step 0.

(49)

AGNES-Explored

• Given a set of N items to be clustered, and an NxN distance (or similarity) matrix, the basic process of Johnson's (1967) hierarchical clustering is this:

• Start by assigning each item to its own cluster, so that if you have N items, you now have N clusters, each containing just one item. Let the distances (similarities) between the clusters equal the distances (similarities) between the items they contain.

• Find the closest (most similar) pair of clusters and merge them into a single cluster, so that now you have one less cluster.

(50)

AGNES

• Compute distances (similarities) between the new cluster and each of the old clusters.

• Repeat steps 2 and 3 until all items are clustered into a single cluster of size N.

• Step 3 can be done in different ways, which is what distinguishes single-link from complete-link and average-link clustering
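In practice the whole procedure is available in libraries; the sketch below uses SciPy, where the method argument is exactly the choice made in step 3 (single, complete, or average link). The two-dimensional data set is the one used in the worked example that follows.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

X = np.array([[1, 3], [1, 8], [5, 3], [1, 1], [2, 8],
              [5, 2], [2, 3], [4, 8], [7, 2], [5, 8]])

D = pdist(X)                       # condensed matrix of pairwise Euclidean distances
Z = linkage(D, method="single")    # "complete" or "average" change only step 3
labels = fcluster(Z, t=2, criterion="maxclust")   # cut the hierarchy into 2 clusters
```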

(51)

Data

Point  VAR 1  VAR 2

1 1 3

2 1 8

3 5 3

4 1 1

5 2 8

6 5 2

7 2 3

8 4 8

9 7 2

10 5 8
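The distance matrix on the next slide can be reproduced from this table; a sketch with SciPy, rounding to two decimals as in the slides:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

X = np.array([[1, 3], [1, 8], [5, 3], [1, 1], [2, 8],
              [5, 2], [2, 3], [4, 8], [7, 2], [5, 8]])

D = np.round(squareform(pdist(X)), 2)   # 10 x 10 symmetric matrix, e.g. D[0, 6] == 1.0
print(D)
```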

(52)

Distance matrix (Euclidean, rounded to two decimals):

       P_1   P_2   P_3   P_4   P_5   P_6   P_7   P_8   P_9   P_10
P_1   0.00  5.00  4.00  2.00  5.10  4.12  1.00  5.83  6.08  6.40
P_2   5.00  0.00  6.40  7.00  1.00  7.21  5.10  3.00  8.49  4.00
P_3   4.00  6.40  0.00  4.47  5.83  1.00  3.00  5.10  2.24  5.00
P_4   2.00  7.00  4.47  0.00  7.07  4.12  2.24  7.62  6.08  8.06
P_5   5.10  1.00  5.83  7.07  0.00  6.71  5.00  2.00  7.81  3.00
P_6   4.12  7.21  1.00  4.12  6.71  0.00  3.16  6.08  2.00  6.00
P_7   1.00  5.10  3.00  2.24  5.00  3.16  0.00  5.39  5.10  5.83
P_8   5.83  3.00  5.10  7.62  2.00  6.08  5.39  0.00  6.71  1.00
P_9   6.08  8.49  2.24  6.08  7.81  2.00  5.10  6.71  0.00  6.32
P_10  6.40  4.00  5.00  8.06  3.00  6.00  5.83  1.00  6.32  0.00

(53)

Step 1

      P_1   P_2   P_3   P_4   P_5   P_6   P_7   P_8   P_9   P_10
P_1   0
P_2   5     0
P_3   4     6.4   0
P_4   2     7     4.47  0
P_5   5.1   1     5.83  7.07  0
P_6   4.12  7.21  1     4.12  6.71  0
P_7   1     5.1   3     2.24  5     3.16  0
P_8   5.83  3     5.1   7.62  2     6.08  5.39  0
P_9   6.08  8.49  2.24  6.08  7.81  2     5.1   6.71  0
P_10  6.4   4     5     8.06  3     6     5.83  1     6.32  0

Find the minimal distance: it is 1, between P_1 and P_7, so these two points are merged into the cluster P_17.


(56)

Step 2

      P_17  P_2   P_3   P_4   P_5   P_6   P_8   P_9   P_10
P_17  0
P_2   5     0
P_3   3     6.4   0
P_4   2     7     4.47  0
P_5   5     1     5.83  7.07  0
P_6   3.16  7.21  1     4.12  6.71  0
P_8   5.39  3     5.1   7.62  2     6.08  0
P_9   5.1   8.49  2.24  6.08  7.81  2     6.71  0
P_10  5.83  4     5     8.06  3     6     1     6.32  0

Find the minimal distance: it is 1, between P_2 and P_5, so they are merged into P_25.


(58)

Step 3

      P_17  P_25  P_3   P_4   P_6   P_8   P_9   P_10
P_17  0
P_25  5     0
P_3   3     5.83  0
P_4   2     7     4.47  0
P_6   3.16  6.71  1     4.12  0
P_8   5.39  2     5.1   7.62  6.08  0
P_9   5.1   7.81  2.24  6.08  2     6.71  0
P_10  5.83  3     5     8.06  6     1     6.32  0

Find the minimal distance: it is 1, between P_3 and P_6, so they are merged into P_36.


(61)

Step 4

      P_17  P_25  P_36  P_4   P_8   P_9   P_10
P_17  0
P_25  5     0
P_36  3     5.83  0
P_4   2     7     4.12  0
P_8   5.39  2     5.1   7.62  0
P_9   5.1   7.81  2     6.08  6.71  0
P_10  5.83  3     5     8.06  1     6.32  0

Find the minimal distance: it is 1, between P_8 and P_10, so they are merged into P_810.


(64)

Step 5

       P_17  P_25  P_36  P_4   P_810  P_9
P_17   0
P_25   5     0
P_36   3     5.83  0
P_4    2     7     4.12  0
P_810  5.39  2     5     7.62  0
P_9    5.1   7.81  2     6.08  6.71   0

Find the minimal distance: it is now 2, between P_17 and P_4, so they are merged into P_174.


(67)

Step 6

       P_174  P_25  P_36  P_810  P_9
P_174  0
P_25   5      0
P_36   3      5.83  0
P_810  5.39   2     5     0
P_9    5.1    7.81  2     6.71   0

Find the minimal distance: it is 2, between P_25 and P_810, so they are merged into P_25810.


(70)

Step 7

         P_174  P_25810  P_36  P_9
P_174    0
P_25810  5      0
P_36     3      5        0
P_9      5.1    6.71     2     0

Find the minimal distance: it is 2, between P_36 and P_9, so they are merged into P_369.


(73)

Step 8

         P_174  P_25810  P_369
P_174    0
P_25810  5      0
P_369    3      5        0

Find the minimal distance: it is 3, between P_174 and P_369, so they are merged into P_174369.


(76)

Step 9

Only two clusters remain:

          P_174369  P_25810
P_174369  0
P_25810   5         0

Find the minimal distance: it is 5, between P_174369 and P_25810. Merging them leaves a single group, P_17436925810, obtained after 9 steps of the algorithm.


(79)

Dendrogram

(80)

Figure: dendrograms obtained with single linkage, complete linkage, and average linkage.

(81)

The number of clusters
