Title: Machine Learning: Introduction
1. Machine Learning: Introduction
- Lecturer: Md Nor Ridzuan Daud
- "Find a bug in a program, and fix it, and the program will work today. Show the program how to find and fix a bug, and the program will work forever." (Oliver G. Selfridge)
2. Machine Learning
- Course Proforma: WAES3302
- Website: http://perdana.fsktm.um.edu.my/ridzuan/
3. Question of the Day
- What is the next symbol in this series?
4. Question of the Day
- What is the next symbol in this series?
5. Learning = Adaptation
- "Modification of a behavioral tendency by experience." (Webster 1984)
- "A learning machine, broadly defined, is any device whose actions are influenced by past experiences." (Nilsson 1965)
- "Any change in a system that allows it to perform better the second time on repetition of the same task or on another task drawn from the same population." (Simon 1983)
- "An improvement in information processing ability that results from information processing activity." (Tanimoto 1990)
6. Learning
- Learning tasks:
- Pattern association
- Pattern recognition (classification)
- Function approximation
- Control
- Filtering
7. Learning: classification
8. Learning: vision
9. Learning
- Learning problems:
- Learning with a teacher
- Learning with a critic
- Unsupervised learning
10. Learning with a Teacher
- supervised learning
- knowledge represented by a set of input-output examples (xi, yi)
- minimize the error between the actual response of the learner and the desired response
[Block diagram: the Environment supplies state x to both a Teacher and the Learning system; the error signal = desired response - actual response is fed back to the learner.]
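As a minimal sketch of this teacher/learner loop (everything here, the target function 3x, the single-weight learner, and the learning rate, is a made-up illustration, not part of the slides): the teacher supplies the desired response for each state, and the error signal drives the correction.

```python
def teacher(x):
    # Desired response; the (hidden) target the learner must match is 3*x.
    return 3.0 * x

w = 0.0    # learner: a single weight, actual response = w * x
eta = 0.1  # learning rate

for _ in range(100):
    for x in [1.0, 2.0, 3.0]:
        error = teacher(x) - w * x  # error signal = desired - actual
        w += eta * error * x        # nudge the weight to shrink the error

print(round(w, 3))  # converges to 3.0
```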
11. Learning with a Critic
- learning through interaction with the environment
- exploration of states and actions
- feedback through a delayed primary reinforcement signal (temporal credit assignment problem)
- goal: maximize accumulated future reinforcements
[Block diagram: the Environment supplies the state and a primary reinforcement signal to the Critic; the Critic sends a heuristic reinforcement signal to the Learning system, whose actions feed back into the Environment.]
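The "maximize accumulated future reinforcements" goal is commonly formalized as a discounted return, G_t = sum_k gamma^k * r_{t+k}, which spreads credit for a delayed reward back over earlier actions. A small sketch with made-up numbers (the discount factor and reward sequence are assumptions):

```python
def discounted_return(rewards, gamma):
    # Work backwards through the episode: G_t = r_t + gamma * G_{t+1}.
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

gamma = 0.9             # discount factor (assumed)
rewards = [0, 0, 0, 1]  # primary reinforcement arrives only at the end

print(discounted_return(rewards, gamma))  # approximately 0.9**3 = 0.729
```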
12. Unsupervised Learning
- self-organized learning
- no teacher or critic
- task-independent quality measure
- identify regularities in the data and discover classes automatically
- competitive learning
[Block diagram: the Environment supplies the state directly to the Learning system; no teacher or critic is present.]
13. Outline
- Why Machine Learning?
- What is a well-defined learning problem?
- An example: learning to play checkers
- What questions should we ask about Machine Learning?
14. Why Machine Learning?
- Recent progress in algorithms and theory
- Growing flood of online data
- Computational power is available
- Budding (smart) industry
- Three niches for machine learning:
- Data mining: using historical data to improve decisions
- medical records → medical knowledge
15. Cont.
- Software applications we can't program by hand
- autonomous driving
- speech recognition
- Self-customizing programs
- Newsreader that learns user interests
16. Applications of ML
- Learning to recognize spoken words
- SPHINX (Lee 1989)
- Learning to drive an autonomous vehicle
- ALVINN (Pomerleau 1989)
- Learning to classify celestial objects
- (Fayyad et al. 1995)
- Learning to play world-class backgammon
- TD-GAMMON (Tesauro 1992)
- Designing the morphology and control structure of electro-mechanical artefacts
- GOLEM (Lipson, Pollack 2000)
17. Cont.
- Autonomous driving
- ALVINN (Pomerleau, 1989) drives at 70 mph on highways
- Using neural network learning
18. Where is this Headed?
- Today:
- First-generation algorithms: neural nets, decision trees, regression, ...
- Applied to well-formatted databases
- Budding industry
- Opportunity for tomorrow:
- Learn across full mixed-media data
- Learn across multiple internal databases, plus the web and newsfeeds
- Learn by active experimentation
- Learn decisions rather than predictions
- Cumulative, lifelong learning
- Programming languages with learning embedded?
19. Relevant Disciplines
- Artificial intelligence
- Bayesian methods
- Computational complexity theory
- Control theory
- Information theory
- Philosophy
- Psychology and neurobiology
- Statistics
20. What is the Learning Problem?
- Learning = improving with experience at some task:
- improve over task T, with respect to performance measure P, based on experience E
- E.g., learn to play checkers:
- T: play checkers
- P: percent of games won in world tournament
- E: opportunity to play against self
21. Learning to Play Checkers
- T: play checkers
- P: percent of games won in world tournament
- What experience?
- What exactly should be learned?
- How shall it be represented?
- What specific algorithm to learn it?
22. Type of Training Experience
- Direct or indirect?
- Direct: board state → correct move
- Indirect: outcome of a complete game
- Credit assignment problem
- Teacher or not?
- Teacher selects board states
- Learner can select board states
- Is training experience representative of the performance goal?
- Training: playing against itself
- Performance: evaluated playing against world champion
23. Choose Target Function
- ChooseMove: B → M (board state → move)
- maps a legal board state to a legal move
- Evaluate: B → V (board state → board value)
- assigns a numerical score to any given board state, such that better board states obtain a higher score
- Select the best move by evaluating all successor states of legal moves and picking the one with the maximal score
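The selection rule above (evaluate every successor reachable by a legal move, take the arg-max) can be sketched in a few lines. The board encoding, move set, and scoring function below are toy assumptions, not part of the slides; only the selection rule is the point.

```python
def best_move(board, legal_moves, successor, V):
    """Return the move whose successor board receives the highest score V."""
    return max(legal_moves(board), key=lambda m: V(successor(board, m)))

# Toy usage: boards are integers, moves add an offset, V prefers boards near 5.
legal_moves = lambda b: [-1, 0, +2]
successor   = lambda b, m: b + m
V           = lambda b: -abs(b - 5)

print(best_move(3, legal_moves, successor, V))  # move +2 reaches board 5
```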
24. Definition of Target Function
- If b is a final board state that is won, then V(b) = 100
- If b is a final board state that is lost, then V(b) = -100
- If b is a final board state that is drawn, then V(b) = 0
- If b is not a final board state, then V(b) = V(b'), where b' is the best final board state that can be achieved starting from b and playing optimally until the end of the game
- Gives correct values but is not operational
25. State Space Search
V(b) = ?
V(b) = max_i V(b_i), over the successors b_i of b (our move: take the maximum)
26. State Space Search
V(b1) = ?
V(b1) = min_i V(b_i), over the successors b_i of b1 (opponent's move: assume the minimum)
moves: m4: b1 → b4, m5: b1 → b5, m6: b1 → b6
27. Final Board States
- Black wins: V(b) = -100
- Red wins: V(b) = 100
- Draw: V(b) = 0
28. Representation of Target Function
- table look-up
- collection of rules
- neural networks
- polynomial function of board features
- trade-off in choosing an expressive representation:
- approximation accuracy
- number of training examples required to learn the target function
29. Representation of Target Function
- V(b) = w0 + w1·bp(b) + w2·rp(b) + w3·bk(b) + w4·rk(b) + w5·bt(b) + w6·rt(b)
- bp(b): number of black pieces
- rp(b): number of red pieces
- bk(b): number of black kings
- rk(b): number of red kings
- bt(b): number of red pieces threatened by black
- rt(b): number of black pieces threatened by red
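The linear evaluation function can be written directly as code. The weight values below are made-up placeholders; only the functional form (weighted sum of the six board features plus a constant) matches the slide.

```python
def v_hat(features, w):
    """V(b) = w0 + w1*bp + w2*rp + w3*bk + w4*rk + w5*bt + w6*rt."""
    bp, rp, bk, rk, bt, rt = features
    return (w[0] + w[1] * bp + w[2] * rp + w[3] * bk
            + w[4] * rk + w[5] * bt + w[6] * rt)

# Example: 3 black pieces, 2 red pieces, no kings, no threatened pieces.
w = [0.0, 1.0, -1.0, 3.0, -3.0, 0.5, -0.5]  # assumed weights
print(v_hat((3, 2, 0, 0, 0, 0), w))         # 0 + 3 - 2 = 1.0
```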
30. Obtaining Training Examples
- V(b): true target function
- V̂(b): learned target function
- Vtrain(b): training value
- Rule for estimating training values:
- Vtrain(b) ← V̂(Successor(b))
31. Choose Weight Training Rule
- LMS weight update rule:
- Select a training example b at random
- 1. Compute error(b) = Vtrain(b) - V̂(b)
- 2. For each board feature fi, update weight: wi ← wi + η · fi · error(b)
- η: learning rate, approx. 0.1
32. Example: 4x4 checkers
- V̂(b) = w0 + w1·rp(b) + w2·bp(b)
- Initial weights: w0 = -10, w1 = 75, w2 = -60
V̂(b0) = w0 + w1·2 + w2·2 = 20
m1: b0 → b1, V̂(b1) = 20
m2: b0 → b2, V̂(b2) = 20
m3: b0 → b3, V̂(b3) = 20
33. Example: 4x4 checkers
V̂(b1) = 20
V̂(b0) = 20
1. Compute error(b0) = Vtrain(b0) - V̂(b0) = V̂(b1) - V̂(b0) = 0
2. For each board feature fi, update weight wi ← wi + η · fi · error(b0):
w0 ← w0 + 0.1 · 1 · 0
w1 ← w1 + 0.1 · 2 · 0
w2 ← w2 + 0.1 · 2 · 0
34. Example: 4x4 checkers
V̂(b0) = 20
35. Example: 4x4 checkers
V̂(b3) = 20
V̂(b4a) = 20
V̂(b4b) = -55
36. Example: 4x4 checkers
V̂(b4) = -55
V̂(b3) = 20
1. Compute error(b3) = Vtrain(b3) - V̂(b3) = V̂(b4) - V̂(b3) = -75
2. For each board feature fi, update weight wi ← wi + η · fi · error(b3), starting from w0 = -10, w1 = 75, w2 = -60:
w0 ← w0 - 0.1 · 1 · 75, so w0 = -17.5
w1 ← w1 - 0.1 · 2 · 75, so w1 = 60
w2 ← w2 - 0.1 · 2 · 75, so w2 = -75
37. Example: 4x4 checkers
w0 = -17.5, w1 = 60, w2 = -75
V̂(b5) = -107.5
V̂(b4) = -107.5
38. Example: 4x4 checkers
V̂(b6) = -167.5
V̂(b5) = -107.5
error(b5) = Vtrain(b5) - V̂(b5) = V̂(b6) - V̂(b5) = -60
With w0 = -17.5, w1 = 60, w2 = -75 and wi ← wi + η · fi · error(b5):
w0 ← w0 - 0.1 · 1 · 60, so w0 = -23.5
w1 ← w1 - 0.1 · 1 · 60, so w1 = 54
w2 ← w2 - 0.1 · 2 · 60, so w2 = -87
39. Example: 4x4 checkers
Final board state: black won, so Vf(b) = -100
V̂(b6) = -197.5
error(b6) = Vtrain(b6) - V̂(b6) = Vf(b6) - V̂(b6) = 97.5
With w0 = -23.5, w1 = 54, w2 = -87 and wi ← wi + η · fi · error(b6):
w0 ← w0 + 0.1 · 1 · 97.5, so w0 = -13.75
w1 ← w1 + 0.1 · 0 · 97.5, so w1 = 54
w2 ← w2 + 0.1 · 2 · 97.5, so w2 = -67.5
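The whole 4x4 trace can be replayed in a few lines to confirm the arithmetic. Each step supplies the (constant, rp, bp) feature vector of the board being corrected and its training value, read off the computations above; the final training value is the true outcome -100.

```python
ETA = 0.1  # learning rate

def v_hat(f, w):
    # Linear evaluation with a leading constant-1 feature for w0.
    return sum(wi * fi for wi, fi in zip(w, f))

def lms_update(w, f, v_train):
    error = v_train - v_hat(f, w)
    return [wi + ETA * fi * error for wi, fi in zip(w, f)]

w = [-10.0, 75.0, -60.0]  # initial w0, w1, w2
steps = [
    ([1, 2, 2], 20.0),    # b0: error = 0, weights unchanged
    ([1, 2, 2], -55.0),   # b3: error = -75 -> (-17.5, 60, -75)
    ([1, 1, 2], -167.5),  # b5: error = -60 -> (-23.5, 54, -87)
    ([1, 0, 2], -100.0),  # b6, final board, black won: error = 97.5
]
for f, v_train in steps:
    w = lms_update(w, f, v_train)

print(w)  # [-13.75, 54.0, -67.5]
```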
40. Evolution of Value Function
[Plot: the value function on the training data, before and after training]
41. Design Choices
[Diagram: summary of the design choices for the checkers learner]
42. Some Issues in Machine Learning
- What algorithms can approximate functions well (and when)?
- How does the number of training examples influence accuracy?
- How does the complexity of the hypothesis representation impact it?
- How does noisy data influence accuracy?
- What are the theoretical limits of learnability?
- How can prior knowledge of the learner help?
- What clues can we get from biological learning systems?
- How can systems alter their own representations?