(1)

DATA SCIENCE WITH MACHINE LEARNING:

CLASSIFICATION

WFAiS UJ, Informatyka Stosowana, 1st-cycle studies

1

12/01/2021

This lecture is based on a course by E. Fox and C. Guestrin, University of Washington.

(2)

What is classification?

(3)

Overview of the content

(4)

Linear classifier

(5)

An intelligent restaurant review system

(6)

Classifying sentiment of a review

(7)

A (linear) classifier: scoring a sentence

Score(xi) = 1.2 + 1.7 − 2.1 = 0.8 > 0  =>  y = +1 (positive review)
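The scoring step can be sketched in Python. The coefficient values 1.2, 1.7, and −2.1 come from the slide's arithmetic; which word carries which coefficient is an assumption for illustration, not the actual learned model.

```python
# Hypothetical learned coefficients; the values come from the slide's
# example score, the word assignment is assumed.
coefficients = {"great": 1.2, "awesome": 1.7, "terrible": -2.1}

def score(sentence):
    # Linear score: sum the coefficient of every known word in the sentence.
    words = sentence.lower().replace(",", " ").split()
    return sum(coefficients.get(w, 0.0) for w in words)

def classify(sentence):
    # y = +1 (positive review) if the score is positive, else y = -1.
    return +1 if score(sentence) > 0 else -1

review = "The food was great, the sushi was awesome, but the service was terrible"
# score(review) = 1.2 + 1.7 - 2.1 = 0.8 > 0, so classify(review) = +1
```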

(8)

Training a classifier = Learning the coefficients

We will discuss later how we learn a classifier from data.

(9)

Decision boundary example

(10)

Decision boundary

(11)

Flow chart:

(12)

Coefficients of classifier

(13)

General notation

(14)

Simple hyperplane

(15)

D-dimensional hyperplane

(16)

Flow chart:

(17)

Linear classifier

Class probability

(18)

How confident is your prediction?

(19)

Conditional probability

(20)

Interpreting conditional probabilities

(21)

How confident is your prediction?

(22)

Learn conditional probabilities from data

(23)

Predicting class probabilities

(24)

Flow chart:

(25)

Why not just use regression to build a classifier?

(26)

Link function

(27)

Flow chart:

(28)

Logistic regression classifier: linear score with logistic link function

(29)

Simplest link function: sign(z)

(30)

Logistic function (sigmoid, logit)

[Figure: sigmoid curve ranging from 0.0 to 1.0, crossing 0.5 at score 0; example values 0.12 and 0.88]
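A minimal sketch of the logistic link function; the 0.12 and 0.88 on the slide's plot correspond to scores of −2 and +2.

```python
import math

def sigmoid(score):
    # Squashes any real score into (0, 1), interpreted as P(y = +1 | x, w).
    return 1.0 / (1.0 + math.exp(-score))

sigmoid(0.0)             # 0.5: a zero score means a 50/50 prediction
round(sigmoid(-2.0), 2)  # 0.12
round(sigmoid(2.0), 2)   # 0.88
```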

(31)

Logistic regression model

(32)

Effect of coefficients

(33)

Flow chart:

(34)

Learning logistic regression model

(35)

Categorical inputs

(36)

Encoding categories as numeric features
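A common encoding of categorical inputs is one-hot encoding, sketched below; the category list is a hypothetical example.

```python
def one_hot(value, categories):
    # Encode one categorical value as a 0/1 vector with one slot per category.
    return [1.0 if value == c else 0.0 for c in categories]

# Hypothetical categorical feature with three known categories:
countries = ["Argentina", "Brazil", "Zimbabwe"]
one_hot("Brazil", countries)  # [0.0, 1.0, 0.0]
```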

(37)

Multiclass classification

(38)

1 versus all

(39)

1 versus all
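The 1-versus-all idea can be sketched as follows: train one binary model per class and predict the class whose model is most confident. The toy models below are hard-coded stand-ins for trained classifiers.

```python
def one_vs_all_predict(x, classifiers):
    # classifiers maps each class label to a model returning P(y = label | x);
    # predict the class whose 1-versus-all model is most confident.
    return max(classifiers, key=lambda label: classifiers[label](x))

# Toy example with hard-coded probability functions:
models = {
    "triangle": lambda x: 0.2,
    "heart":    lambda x: 0.7,
    "donut":    lambda x: 0.1,
}
one_vs_all_predict(None, models)  # "heart"
```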

(40)

Summary: Logistic regression classifier

(41)

Linear classifier

Parameter learning

(42)

Maximizing likelihood (probability of data)

(43)

Maximum likelihood estimation (MLE)

Learn the logistic regression model with MLE

(44)

Flow chart:

(45)

Find the "best" classifier

(46)

Maximizing likelihood

(47)

Gradient ascent

Convergence criteria

(48)

Gradient ascent
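A minimal sketch of gradient ascent for the logistic regression log-likelihood, assuming the standard derivative (sum over points of the feature value times the difference between the indicator of y = +1 and the predicted probability). The toy data and step size are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient_ascent(X, y, step_size=0.1, n_iter=500):
    # Maximize the log-likelihood of logistic regression by gradient ascent.
    # X: list of feature vectors, y: labels in {+1, -1}.
    w = [0.0] * len(X[0])
    for _ in range(n_iter):
        gradient = [0.0] * len(w)
        for xi, yi in zip(X, y):
            # d/dwj = sum_i hj(xi) * (1[yi = +1] - P(y = +1 | xi, w))
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
            error = (1.0 if yi == +1 else 0.0) - p
            for j, xj in enumerate(xi):
                gradient[j] += xj * error
        w = [wj + step_size * gj for wj, gj in zip(w, gradient)]
    return w

# Toy separable data: constant feature plus one input feature.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 3.0], [1.0, 4.0]]
y = [-1, -1, +1, +1]
w = gradient_ascent(X, y)  # classifies all four points correctly
```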

(49)

The log trick, often used in ML…

(50)

Derivative for logistic regression

See the slides at the end of this lecture if you are interested in how it is derived.

(51)

Derivative for logistic regression

(52)

Choosing the step size

(53)

Choosing the step size

(54)

Choosing the step size

(55)

Choosing the step size

(56)

Choosing the step size

(57)

Flow chart: final look at it

(58)

Linear classifier

Overfitting & regularization

(59)

Training a classifier = Learning the coefficients

(60)

Classification error & accuracy

(61)

Overfitting in classification

Decision boundary example

(62)

Overfitting in classification

Learned decision boundary

(63)

Overfitting in classification

Quadratic features (in 2d)

(64)

Overfitting in classification

Degree 6 features (in 2d)

(65)

Overfitting in classification

Degree 20 features (in 2d)

(66)

Overfitting in classification

(67)

Overfitting in logistic regression

Remember this probability interpretation

(68)

Effect of coefficients on the logistic regression model

With increasing coefficients the model becomes overconfident in its predictions

(69)

Learned probabilities

(70)

Quadratic features: learned probabilities

(71)

Overfitting → overconfident predictions

(72)

Quality metric → penalizing large coefficients

(73)

Desired total cost format
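The desired total cost can be sketched as a quality metric that trades measure of fit (log-likelihood) against measure of magnitude of the coefficients; the sketch below assumes an L2 (squared-norm) penalty.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def total_quality(w, X, y, l2_penalty):
    # Measure of fit: data log-likelihood under the logistic model.
    log_likelihood = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
        log_likelihood += math.log(p if yi == +1 else 1.0 - p)
    # Measure of magnitude: squared L2 norm of the coefficients.
    magnitude = sum(wj * wj for wj in w)
    return log_likelihood - l2_penalty * magnitude

# Larger coefficients fit one positive point more confidently,
# but pay a much larger magnitude penalty:
total_quality([5.0], [[1.0]], [+1], l2_penalty=1.0)  # fit ~ -0.007, penalty 25
total_quality([0.5], [[1.0]], [+1], l2_penalty=1.0)  # fit ~ -0.474, penalty 0.25
```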

(74)

Measure of magnitude of logistic regression coefficients

(75)

Visualizing effect of regularisation

(76)

Effect of regularisation

(77)

Visualizing effect of regularisation

(78)

Sparse logistic regression

(79)

L1 regularised logistic regression

(80)

Decision trees

(81)

What makes a loan risky?

(82)

Classifier: decision trees

(83)

Quality metric: Classification error

(84)

Find the tree with the lowest classification error

(85)

How do we find the best tree?

(86)

A simple (greedy) algorithm finds a good tree

(87)

Greedy decision tree learning

(88)

How do we select the best feature to split on?

(89)

Classification error

(90)

Classification error

(91)

Choice 1 vs Choice 2

(92)

Greedy decision tree learning algorithm

(93)

Greedy decision tree algorithm
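The greedy split-selection step can be sketched as: for each candidate feature, split the data, and keep the feature with the lowest weighted classification error over the resulting groups. The loan data below is a hypothetical example.

```python
def classification_error(labels):
    # Error of predicting the majority class for this group of labels.
    if not labels:
        return 0.0
    majority = max(set(labels), key=labels.count)
    return sum(1 for label in labels if label != majority) / len(labels)

def best_split_feature(data, features):
    # Greedy step: pick the feature whose split has the lowest
    # weighted classification error over the resulting child nodes.
    n = len(data)
    def split_error(feature):
        groups = {}
        for x, y in data:
            groups.setdefault(x[feature], []).append(y)
        return sum(len(g) / n * classification_error(g) for g in groups.values())
    return min(features, key=split_error)

# Hypothetical loan data: credit history separates the labels perfectly.
loans = [
    ({"credit": "excellent", "term": "3yr"}, "safe"),
    ({"credit": "excellent", "term": "5yr"}, "safe"),
    ({"credit": "poor", "term": "3yr"}, "risky"),
    ({"credit": "poor", "term": "5yr"}, "risky"),
]
best_split_feature(loans, ["credit", "term"])  # "credit"
```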

(94)

Decision trees vs logistic regression

(95)

Decision trees vs logistic regression

(96)

Decision trees vs logistic regression

(97)

Overfitting in decision trees

(98)

Overfitting in decision trees

(99)

Overfitting in decision trees

(100)

Early stopping

(101)

Greedy decision tree learning

(102)

Strategies for handling missing data

(103)

Handling missing data

(104)

Handling missing data

(105)

Handling missing data

(106)

Idea 3: adapt the algorithm

(107)

Feature split selection with missing data

(108)

Idea 3: adapt the algorithm

(109)

Ensemble classifiers and boosting

(110)

Simple classifiers

(111)

Simple classifiers

(112)

Can they be combined?

(113)

Ensemble methods

(114)

Ensemble classifier

(115)

Boosting

(116)

Weighted data

(117)

Weighted data

(118)

Boosting = greedily learning ensembles from data

(119)

Boosting convergence & overfitting

(120)

Boosting convergence & overfitting

(121)

Example

(122)

Example

(123)

Boosting: summary

(124)

Boosting: summary

(125)

Classification: summary

(126)

Details

Derivative of likelihood for logistic regression

(127)

The log trick, often used in ML…

(128)

Log-likelihood function

(129)

Log-likelihood function

(130)

Rewriting the log-likelihood

Indicator function

(131)

Logistic regression

(132)

Logistic regression

(133)

Logistic regression

(134)

Logistic regression

(135)

Details

AdaBoost

(136)

AdaBoost: learning ensemble

10/11, 17/11, 24/11/2020

136

(137)

AdaBoost: Computing coefficients wt

(138)

Weighted classification error

(139)

AdaBoost formula

(140)

AdaBoost: learning ensemble

(141)

AdaBoost: updating weights ai

(142)

AdaBoost: updating weights ai

(143)

AdaBoost: learning ensemble

(144)

AdaBoost: normalizing weights ai

(145)

AdaBoost: learning ensemble

(146)

AdaBoost: example

(147)

AdaBoost: example

(148)

AdaBoost: example

(149)

AdaBoost: example

(150)

AdaBoost: example

(151)

AdaBoost: learning ensemble

(152)

Boosted decision stumps

(153)

Boosted decision stumps

(154)

Boosted decision stumps

(155)

Boosted decision stumps

ai ← ai e^(−0.69), if ft(xi) = yi
ai ← ai e^(0.69),  if ft(xi) ≠ yi
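The weight update above can be sketched as code. The factor 0.69 is assumed here to be the model coefficient wt = ½ ln((1 − error)/error) for a weighted error of 0.2, since ½ ln(0.8/0.2) ≈ 0.69; points the weak learner ft gets right are down-weighted, the rest are up-weighted, and the weights are then normalized.

```python
import math

def adaboost_update(alphas, correct, weighted_error):
    # Coefficient of the weak learner ft from its weighted error.
    w_t = 0.5 * math.log((1.0 - weighted_error) / weighted_error)
    # Decrease the weight of points ft got right, increase the others.
    updated = [a * math.exp(-w_t if ok else w_t)
               for a, ok in zip(alphas, correct)]
    # Normalize so the data weights sum to 1.
    total = sum(updated)
    return w_t, [a / total for a in updated]

# Five equally weighted points, one misclassified: weighted error = 0.2.
w_t, alphas = adaboost_update([0.2] * 5, [True] * 4 + [False], 0.2)
# w_t = 0.5 * ln(4) ≈ 0.69, as on the slide; the misclassified
# point's weight grows relative to the others.
```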
