MACHINE LEARNING:

Academic year: 2021

(1)

DATA SCIENCE WITH MACHINE LEARNING:

CLUSTERING

(2)

What is clustering?

19/01/2021

(3)

Clustering applications

(4)

Clustering applications

(5)

Overview of content

(6)

Clustering:

An unsupervised learning task

(7)

Motivation

(8)

Motivation

I don't just like sport!

(9)

Motivation

(10)

Clustering: a supervised learning task

(11)

Clustering: a supervised learning task

(12)

Clustering: an unsupervised learning task

An unsupervised learning task

(13)

What defines a cluster?

(14)

Hope for unsupervised learning

(15)

Other (challenging!) clusters to discover

Analysed by your eyes

(16)

Other (challenging!) clusters to discover

Analysed by clustering algorithms

(17)

k-means

clustering algorithm

(18)

k-means clustering algorithm

(19)

k-means clustering algorithm

(20)

k-means clustering algorithm

(21)

k-means clustering algorithm

(22)

k-means clustering algorithm

(23)

k-means as a coordinate descent algorithm

(24)

Convergence of k-means

Because we can cast k-means as a coordinate descent algorithm, we know that it converges to a local optimum.
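The algorithm outlined on the k-means slides alternates an assignment step and a centroid-update step, each of which can only lower the objective — the coordinate-descent view just mentioned. A minimal NumPy sketch (all names are mine, not from the slides):

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain k-means: alternate assignment and centroid-update steps."""
    rng = np.random.default_rng(seed)
    # Initialise centroids as k distinct random data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its points.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

# Two well-separated blobs: k-means should split them cleanly.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])
centroids, labels = kmeans(X, k=2)
```

Each sweep is one pass of coordinate descent: assignments are optimal given centroids, then centroids are optimal given assignments.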

(25)

Convergence of k-means to a local mode

(26)

Smart initialisation: k-means++ overview

(27)

k-means++ visualised

(28)

k-means++ visualised

(29)

k-means++ visualised

(30)

k-means++ visualised

(31)

Smart initialisation: k-means++ overview
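The k-means++ idea visualised above spreads the initial centroids out: each new centroid is sampled with probability proportional to its squared distance from the nearest centroid already chosen. A sketch of just the initialisation step (function and variable names are mine):

```python
import numpy as np

def kmeans_pp_init(X, k, rng):
    """k-means++ initialisation: first centroid uniform at random,
    each next one sampled proportionally to squared distance from
    the nearest centroid chosen so far."""
    centroids = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        # Squared distance of every point to its nearest chosen centroid.
        d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in centroids], axis=0)
        probs = d2 / d2.sum()
        centroids.append(X[rng.choice(len(X), p=probs)])
    return np.array(centroids)

# Two point clouds far apart: the second centroid is forced into the
# other cloud, because same-cloud points have (near-)zero distance weight.
rng = np.random.default_rng(0)
X = np.vstack([np.zeros((50, 2)), np.full((50, 2), 10.0)])
C = kmeans_pp_init(X, k=2, rng=rng)
```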

(32)

Assessing quality of the clustering

(33)

k-means objective

(34)

Cluster heterogeneity

(35)

What happens to heterogeneity as k increases?

(36)

How to choose k?
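Heterogeneity here is the within-cluster sum of squared distances, i.e. the k-means objective. It only decreases as k grows (reaching zero when every point is its own cluster), which is why it cannot be minimised over k directly and one instead looks for an "elbow". A toy illustration (helper name is mine):

```python
import numpy as np

def heterogeneity(X, centroids, labels):
    """Within-cluster sum of squared distances: the k-means objective."""
    return sum(np.sum((X[labels == j] - c) ** 2)
               for j, c in enumerate(centroids))

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])

# k = 1: one centroid at the overall mean.
h1 = heterogeneity(X, [X.mean(axis=0)], np.zeros(4, dtype=int))

# k = 2: one centroid per natural group -- heterogeneity drops sharply.
labels2 = np.array([0, 0, 1, 1])
cents2 = [X[:2].mean(axis=0), X[2:].mean(axis=0)]
h2 = heterogeneity(X, cents2, labels2)
```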

(37)

Probabilistic approach:

mixture model

(38)

Why a probabilistic approach?

(39)

Why a probabilistic approach?

(40)

Why a probabilistic approach?

(41)

Why a probabilistic approach?

(42)

Mixture models

(43)

Application: clustering images

(44)

Application: clustering images

Single RGB vector per image

(45)

Application: clustering images

(46)

Application: clustering images

(47)

Application: clustering images

(48)

Application: clustering images

We see that they are grouping!

But it is not easy to distinguish between the groups.

(49)

Application: clustering images

In this dimension

(50)

Model for a given image type

(51)

Model for a given image type

(52)

Application: clustering images

(53)

Application: clustering images

(54)

Application: clustering images

(55)

Mixture of Gaussians

(56)

Mixture of Gaussians

(57)

Mixture of Gaussians

(58)

Mixture of Gaussians

(59)

Mixture of Gaussians

(60)

Mixture of Gaussians

(61)

Mixture of Gaussians
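The mixture-of-Gaussians model named on the slides above is, in the standard notation (not taken from the slides):

```latex
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1,
```

where each Gaussian component plays the role of one cluster and the mixing weight $\pi_k$ is the prior probability that a point came from cluster $k$. The soft assignment (responsibility) of point $x_i$ to cluster $k$ is then

```latex
r_{ik} = \frac{\pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}
              {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}.
```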

(62)

Application: clustering documents

(63)

Application: clustering documents

(64)

Application: clustering documents

(65)

Application: clustering documents

(66)

Application: clustering documents

(67)

Application: clustering documents

(68)

Application: clustering documents

(69)

Inferring soft assignments with

expectation maximization (EM)

(70)

Inferring cluster labels

(71)

(72)

(73)

(74)

(75)

(76)

(77)

Part 1: Summary

(78)

(79)

(80)

(81)

(82)

(83)

Part 2a: Summary

(84)

(85)

(86)

(87)

(88)

(89)

(90)

(91)

Part 2b: Summary

(92)

Expectation maximization (EM)

(93)

Expectation maximization (EM)

(94)

Expectation maximization (EM)

(95)

Expectation maximization (EM)

(96)

Expectation maximization (EM)

(97)

Expectation maximization (EM)

(98)

Expectation maximization (EM)

(99)

Expectation maximization (EM)

(100)

Expectation maximization (EM)

(101)

Expectation maximization (EM)

(102)

Expectation maximization (EM)

(103)

Expectation maximization (EM)

(104)

Expectation maximization (EM)
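The EM iterations covered on the slides above alternate an E-step (compute responsibilities for each point) and an M-step (re-estimate means, variances, and mixing weights from the responsibility-weighted data). A minimal one-dimensional, two-component sketch (all names are mine, not from the slides):

```python
import numpy as np

def em_gmm_1d(x, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture."""
    # Crude initialisation from the data range.
    mu = np.array([x.min(), x.max()])
    var = np.array([x.var(), x.var()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] proportional to
        # pi_k * N(x_i | mu_k, var_k).
        dens = (pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var))
                / np.sqrt(2 * np.pi * var))
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted means, variances, and weights.
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
    return mu, var, pi

# Data drawn from two clearly separated Gaussians.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-4, 1, 200), rng.normal(4, 1, 200)])
mu, var, pi = em_gmm_1d(x)
```

Like k-means, EM only converges to a local optimum, so initialisation matters here as well.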

(105)

Mixed membership models

for documents

(106)

Clustering model

(107)

Clustering model

(108)

Clustering model

(109)

Soft assignments

(110)

Soft assignments

(111)

Soft assignments

(112)

Mixed membership models

(113)

Building an alternative model

(114)

Building an alternative model

(115)

Building an alternative model

(116)

Building an alternative model

(117)

Model for "bag-of-words"

(118)

Model for "bag-of-words"

(119)

Model for "bag-of-words"

(120)

Model for "bag-of-words"
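A bag-of-words model represents a document purely by its word counts, discarding word order. A one-function illustration (the function name is mine):

```python
from collections import Counter

def bag_of_words(doc):
    """Represent a document as unordered word counts."""
    return Counter(doc.lower().split())

bow = bag_of_words("the quick fox jumps over the lazy dog")
# Word order is discarded: only the counts survive.
```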

(121)

Hierarchical clustering

(122)

Why hierarchical clustering?

(123)

Why hierarchical clustering?

(124)

Why hierarchical clustering?

(125)

Two main types of algorithms

(126)

Divisive clustering

(127)

Divisive clustering

(128)

Divisive: Recursive k-means

(129)

Divisive: Recursive k-means

(130)

Divisive: choices to be made

(131)

Agglomerative: Single linkage

(132)

Agglomerative: Single linkage

(133)

Agglomerative: Single linkage

(134)

Agglomerative: Single linkage

(135)

Agglomerative: Single linkage

(136)

Cluster of clusters

(137)

The dendrogram

(138)

Extracting a partition

(139)

Agglomerative: choices to be made

(140)

More on cutting the dendrogram

(141)

Computational considerations
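Single-linkage agglomerative clustering, as covered above, repeatedly merges the two clusters whose closest pair of points is nearest; the naive version below rescans every cluster pair at each merge, which illustrates the computational-considerations point, since practical implementations are far more optimised. A deliberately naive sketch (all names are mine):

```python
import numpy as np

def single_linkage(X, n_clusters):
    """Naive agglomerative clustering with single linkage: repeatedly
    merge the two clusters whose closest pair of points is closest."""
    clusters = [[i] for i in range(len(X))]
    # Pairwise distances between all points, computed once up front.
    d = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                link = min(d[i, j] for i in clusters[a] for j in clusters[b])
                if best is None or link < best[0]:
                    best = (link, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)
    return clusters

# Two tight pairs of points far apart merge into two clusters.
clusters = single_linkage(
    np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]),
    n_clusters=2)
```

Cutting a dendrogram at a given height corresponds to stopping the merges once the best available linkage distance exceeds that height.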
