Title: Machine Learning: Introduction
1. Machine Learning: Introduction
- Lecturer: Md Nor Ridzuan Daud
- "Find a bug in a program, and fix it, and the program will work today. Show the program how to find and fix a bug, and the program will work forever." (Oliver G. Selfridge)
2. Machine Learning
- Course Proforma: WAES3302
- Website: http://perdana.fsktm.um.edu.my/ridzuan/
3. Question of the Day
- What is the next symbol in this series?
4. Question of the Day
- What is the next symbol in this series?
5. Learning = Adaptation
- "Modification of a behavioral tendency by experience." (Webster 1984)
- "A learning machine, broadly defined, is any device whose actions are influenced by past experiences." (Nilsson 1965)
- "Any change in a system that allows it to perform better the second time on repetition of the same task or on another task drawn from the same population." (Simon 1983)
- "An improvement in information processing ability that results from information processing activity." (Tanimoto 1990)
6. Learning
- Learning tasks:
- Pattern association
- Pattern recognition (classification)
- Function approximation
- Control
- Filtering
7. Learning: classification
8. Learning: vision
9. Learning
- Learning problems:
- Learning with a teacher
- Learning with a critic
- Unsupervised learning
10. Learning with a Teacher
- supervised learning
- knowledge represented by a set of input-output examples (xi, yi)
- minimize the error between the actual response of the learner and the desired response
[Block diagram: the Environment supplies state x to both a Teacher and the Learning system; the error signal = desired response - actual response is fed back to the learner.]
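As a minimal sketch of this teacher/learner loop (everything here, the target function 3x, the single-weight learner, and the learning rate, is a made-up illustration, not part of the slides): the teacher supplies the desired response for each state, and the error signal drives the correction.

```python
def teacher(x):
    # Desired response; the (hidden) target the learner must match is 3*x.
    return 3.0 * x

w = 0.0    # learner: a single weight, actual response = w * x
eta = 0.1  # learning rate

for _ in range(100):
    for x in [1.0, 2.0, 3.0]:
        error = teacher(x) - w * x  # error signal = desired - actual
        w += eta * error * x        # nudge the weight to shrink the error

print(round(w, 3))  # converges to 3.0
```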
11. Learning with a Critic
- learning through interaction with the environment
- exploration of states and actions
- feedback through a delayed primary reinforcement signal (temporal credit assignment problem)
- goal: maximize accumulated future reinforcements
[Block diagram: the Environment supplies the state and a primary reinforcement signal to the Critic; the Critic sends a heuristic reinforcement signal to the Learning system, whose actions feed back into the Environment.]
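The "maximize accumulated future reinforcements" goal is commonly formalized as a discounted return, G_t = sum_k gamma^k * r_{t+k}, which spreads credit for a delayed reward back over earlier actions. A small sketch with made-up numbers (the discount factor and reward sequence are assumptions):

```python
def discounted_return(rewards, gamma):
    # Work backwards through the episode: G_t = r_t + gamma * G_{t+1}.
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

gamma = 0.9             # discount factor (assumed)
rewards = [0, 0, 0, 1]  # primary reinforcement arrives only at the end

print(discounted_return(rewards, gamma))  # approximately 0.9**3 = 0.729
```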
12. Unsupervised Learning
- self-organized learning
- no teacher or critic
- task-independent quality measure
- identify regularities in the data and discover classes automatically
- competitive learning
[Block diagram: the Environment supplies the state directly to the Learning system; no teacher or critic is present.]
13. Outline
- Why Machine Learning?
- What is a well-defined learning problem?
- An example: learning to play checkers
- What questions should we ask about Machine Learning?
14. Why Machine Learning?
- Recent progress in algorithms and theory
- Growing flood of online data
- Computational power is available
- Budding (smart) industry
- Three niches for machine learning:
- Data mining: using historical data to improve decisions
- medical records → medical knowledge
15. Cont.
- Software applications we can't program by hand
- autonomous driving
- speech recognition
- Self-customizing programs
- Newsreader that learns user interests
16. Applications of ML
- Learning to recognize spoken words
- SPHINX (Lee 1989)
- Learning to drive an autonomous vehicle
- ALVINN (Pomerleau 1989)
- Learning to classify celestial objects
- (Fayyad et al. 1995)
- Learning to play world-class backgammon
- TD-GAMMON (Tesauro 1992)
- Designing the morphology and control structure of electro-mechanical artefacts
- GOLEM (Lipson, Pollack 2000)
17. Cont.
- Autonomous driving
- ALVINN (Pomerleau, 1989) drives at 70 mph on highways
- Using neural network learning
18. Where is this Headed?
- Today:
- First-generation algorithms: neural nets, decision trees, regression, ...
- Applied to well-formatted databases
- Budding industry
- Opportunity for tomorrow:
- Learn across full mixed-media data
- Learn across multiple internal databases, plus the web and newsfeeds
- Learn by active experimentation
- Learn decisions rather than predictions
- Cumulative, lifelong learning
- Programming languages with learning embedded?
19. Relevant Disciplines
- Artificial intelligence
- Bayesian methods
- Computational complexity theory
- Control theory
- Information theory
- Philosophy
- Psychology and neurobiology
- Statistics
20. What is the Learning Problem?
- Learning = improving with experience at some task:
- improve over task T, with respect to performance measure P, based on experience E
- E.g., learn to play checkers:
- T: play checkers
- P: percent of games won in world tournament
- E: opportunity to play against self
21. Learning to Play Checkers
- T: play checkers
- P: percent of games won in world tournament
- What experience?
- What exactly should be learned?
- How shall it be represented?
- What specific algorithm to learn it?
22. Type of Training Experience
- Direct or indirect?
- Direct: board state → correct move
- Indirect: outcome of a complete game
- Credit assignment problem
- Teacher or not?
- Teacher selects board states
- Learner can select board states
- Is training experience representative of the performance goal?
- Training: playing against itself
- Performance: evaluated playing against world champion
23. Choose Target Function
- ChooseMove: B → M (board state → move)
- maps a legal board state to a legal move
- Evaluate: B → V (board state → board value)
- assigns a numerical score to any given board state, such that better board states obtain a higher score
- Select the best move by evaluating all successor states of legal moves and picking the one with the maximal score
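The selection rule above (evaluate every successor reachable by a legal move, take the arg-max) can be sketched in a few lines. The board encoding, move set, and scoring function below are toy assumptions, not part of the slides; only the selection rule is the point.

```python
def best_move(board, legal_moves, successor, V):
    """Return the move whose successor board receives the highest score V."""
    return max(legal_moves(board), key=lambda m: V(successor(board, m)))

# Toy usage: boards are integers, moves add an offset, V prefers boards near 5.
legal_moves = lambda b: [-1, 0, +2]
successor   = lambda b, m: b + m
V           = lambda b: -abs(b - 5)

print(best_move(3, legal_moves, successor, V))  # move +2 reaches board 5
```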
24. Definition of Target Function
- If b is a final board state that is won, then V(b) = 100
- If b is a final board state that is lost, then V(b) = -100
- If b is a final board state that is drawn, then V(b) = 0
- If b is not a final board state, then V(b) = V(b'), where b' is the best final board state that can be achieved starting from b and playing optimally until the end of the game
- Gives correct values but is not operational
25. State Space Search
V(b) = ?
V(b) = max_i V(b_i), over the successors b_i of b (our move: take the maximum)
26. State Space Search
V(b1) = ?
V(b1) = min_i V(b_i), over the successors b_i of b1 (opponent's move: assume the minimum)
moves: m4: b1 → b4, m5: b1 → b5, m6: b1 → b6
27. Final Board States
- Black wins: V(b) = -100
- Red wins: V(b) = 100
- Draw: V(b) = 0
28. Representation of Target Function
- table look-up
- collection of rules
- neural networks
- polynomial function of board features
- trade-off in choosing an expressive representation:
- approximation accuracy
- number of training examples required to learn the target function
29. Representation of Target Function
- V(b) = w0 + w1·bp(b) + w2·rp(b) + w3·bk(b) + w4·rk(b) + w5·bt(b) + w6·rt(b)
- bp(b): number of black pieces
- rp(b): number of red pieces
- bk(b): number of black kings
- rk(b): number of red kings
- bt(b): number of red pieces threatened by black
- rt(b): number of black pieces threatened by red
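The linear evaluation function can be written directly as code. The weight values below are made-up placeholders; only the functional form (weighted sum of the six board features plus a constant) matches the slide.

```python
def v_hat(features, w):
    """V(b) = w0 + w1*bp + w2*rp + w3*bk + w4*rk + w5*bt + w6*rt."""
    bp, rp, bk, rk, bt, rt = features
    return (w[0] + w[1] * bp + w[2] * rp + w[3] * bk
            + w[4] * rk + w[5] * bt + w[6] * rt)

# Example: 3 black pieces, 2 red pieces, no kings, no threatened pieces.
w = [0.0, 1.0, -1.0, 3.0, -3.0, 0.5, -0.5]  # assumed weights
print(v_hat((3, 2, 0, 0, 0, 0), w))         # 0 + 3 - 2 = 1.0
```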
30. Obtaining Training Examples
- V(b): true target function
- V̂(b): learned target function
- Vtrain(b): training value
- Rule for estimating training values:
- Vtrain(b) ← V̂(Successor(b))
31. Choose Weight Training Rule
- LMS weight update rule:
- Select a training example b at random
- 1. Compute error(b) = Vtrain(b) - V̂(b)
- 2. For each board feature fi, update weight: wi ← wi + η · fi · error(b)
- η: learning rate, approx. 0.1
32. Example: 4x4 checkers
- V̂(b) = w0 + w1·rp(b) + w2·bp(b)
- Initial weights: w0 = -10, w1 = 75, w2 = -60
V̂(b0) = w0 + w1·2 + w2·2 = 20
m1: b0 → b1, V̂(b1) = 20
m2: b0 → b2, V̂(b2) = 20
m3: b0 → b3, V̂(b3) = 20
33. Example: 4x4 checkers
V̂(b1) = 20
V̂(b0) = 20
1. Compute error(b0) = Vtrain(b0) - V̂(b0) = V̂(b1) - V̂(b0) = 0
2. For each board feature fi, update weight wi ← wi + η · fi · error(b0):
w0 ← w0 + 0.1 · 1 · 0
w1 ← w1 + 0.1 · 2 · 0
w2 ← w2 + 0.1 · 2 · 0
34. Example: 4x4 checkers
V̂(b0) = 20
35. Example: 4x4 checkers
V̂(b3) = 20
V̂(b4a) = 20
V̂(b4b) = -55
36. Example: 4x4 checkers
V̂(b4) = -55
V̂(b3) = 20
1. Compute error(b3) = Vtrain(b3) - V̂(b3) = V̂(b4) - V̂(b3) = -75
2. For each board feature fi, update weight wi ← wi + η · fi · error(b3), starting from w0 = -10, w1 = 75, w2 = -60:
w0 ← w0 - 0.1 · 1 · 75, so w0 = -17.5
w1 ← w1 - 0.1 · 2 · 75, so w1 = 60
w2 ← w2 - 0.1 · 2 · 75, so w2 = -75
37. Example: 4x4 checkers
w0 = -17.5, w1 = 60, w2 = -75
V̂(b5) = -107.5
V̂(b4) = -107.5
38. Example: 4x4 checkers
V̂(b6) = -167.5
V̂(b5) = -107.5
error(b5) = Vtrain(b5) - V̂(b5) = V̂(b6) - V̂(b5) = -60
With w0 = -17.5, w1 = 60, w2 = -75 and wi ← wi + η · fi · error(b5):
w0 ← w0 - 0.1 · 1 · 60, so w0 = -23.5
w1 ← w1 - 0.1 · 1 · 60, so w1 = 54
w2 ← w2 - 0.1 · 2 · 60, so w2 = -87
39. Example: 4x4 checkers
Final board state: black won, so Vf(b) = -100
V̂(b6) = -197.5
error(b6) = Vtrain(b6) - V̂(b6) = Vf(b6) - V̂(b6) = 97.5
With w0 = -23.5, w1 = 54, w2 = -87 and wi ← wi + η · fi · error(b6):
w0 ← w0 + 0.1 · 1 · 97.5, so w0 = -13.75
w1 ← w1 + 0.1 · 0 · 97.5, so w1 = 54
w2 ← w2 + 0.1 · 2 · 97.5, so w2 = -67.5
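The whole 4x4 trace can be replayed in a few lines to confirm the arithmetic. Each step supplies the (constant, rp, bp) feature vector of the board being corrected and its training value, read off the computations above; the final training value is the true outcome -100.

```python
ETA = 0.1  # learning rate

def v_hat(f, w):
    # Linear evaluation with a leading constant-1 feature for w0.
    return sum(wi * fi for wi, fi in zip(w, f))

def lms_update(w, f, v_train):
    error = v_train - v_hat(f, w)
    return [wi + ETA * fi * error for wi, fi in zip(w, f)]

w = [-10.0, 75.0, -60.0]  # initial w0, w1, w2
steps = [
    ([1, 2, 2], 20.0),    # b0: error = 0, weights unchanged
    ([1, 2, 2], -55.0),   # b3: error = -75 -> (-17.5, 60, -75)
    ([1, 1, 2], -167.5),  # b5: error = -60 -> (-23.5, 54, -87)
    ([1, 0, 2], -100.0),  # b6, final board, black won: error = 97.5
]
for f, v_train in steps:
    w = lms_update(w, f, v_train)

print(w)  # [-13.75, 54.0, -67.5]
```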
40. Evolution of Value Function
[Plot: the value function on the training data, before and after training]
41. Design Choices
[Diagram: summary of the design choices for the checkers learner]
42. Some Issues in Machine Learning
- What algorithms can approximate functions well (and when)?
- How does the number of training examples influence accuracy?
- How does the complexity of the hypothesis representation impact it?
- How does noisy data influence accuracy?
- What are the theoretical limits of learnability?
- How can prior knowledge of the learner help?
- What clues can we get from biological learning systems?
- How can systems alter their own representations?