K-Nearest Neighbours and Instance-based Learning - PowerPoint PPT Presentation

About This Presentation
Title: K-Nearest Neighbours and Instance-based Learning

Description: K-Nearest Neighbours and Instance-based learning. Ata Kaban, The University of Birmingham.

Number of Views: 246
Avg rating: 3.0/5.0
Slides: 33
Provided by: Ata128

Transcript and Presenter's Notes


1
K-Nearest Neighbours and Instance-based Learning
  • Ata Kaban
  • The University of Birmingham

2
  • Today we learn
  • K-Nearest Neighbours
  • Case-based reasoning
  • Lazy and eager learning

3
Instance-based learning
  • One way of solving tasks of approximating
    discrete- or real-valued target functions
  • Have training examples (xn, f(xn)), n = 1, ..., N
  • Key idea:
  • just store the training examples
  • when a test example is given, find the
    closest matches

4
  • 1-Nearest neighbour:
  • Given a query instance xq,
  • first locate the nearest training example xn
  • then f(xq) ← f(xn)
  • K-Nearest neighbour:
  • Given a query instance xq,
  • first locate the k nearest training examples
  • if the target function is discrete-valued, take a
    vote among its k nearest neighbours; else if the
    target function is real-valued, take the mean of
    the f values of the k nearest neighbours

5
The distance between examples
  • We need a measure of distance in order to know
    which are the neighbours
  • Assume that we have T attributes for the learning
    problem. Then one example point x has elements
    xt ∈ ℝ, t = 1, ..., T.
  • The distance between two points xi, xj is often
    defined as the Euclidean distance:
    d(xi, xj) = sqrt( Σt (xit − xjt)² )
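As a quick illustrative sketch (not from the slides), the Euclidean distance over T numeric attributes:

```python
import math

def euclidean(xi, xj):
    # d(xi, xj) = sqrt( sum over attributes t of (xi_t - xj_t)^2 )
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

print(euclidean((0, 0), (3, 4)))  # 5.0 (the classic 3-4-5 triangle)
```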

6
Voronoi Diagram
7
Voronoi Diagram
8
Characteristics of Instance-based Learning
  • An instance-based learner is a lazy learner and
    does all the work when the test example is
    presented. This is opposed to so-called
    eager learners, which build a parameterised
    compact model of the target.
  • It produces a local approximation to the target
    function (different for each test instance)

9
When to consider Nearest Neighbour algorithms?
  • Instances map to points in ℝ^n
  • Not more than, say, 20 attributes per instance
  • Lots of training data
  • Advantages:
  • Training is very fast
  • Can learn complex target functions
  • Don't lose information
  • Disadvantages:
  • ? (we will see them shortly)

10
(No Transcript)
11
Training data
Test instance
12
Keep data in normalised form
One way to normalise an attribute value ar(x) to a
normalised value a′r(x) is min-max scaling:
a′r(x) = (ar(x) − min ar) / (max ar − min ar)
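A minimal sketch of such a normalisation, assuming min-max scaling to [0, 1] (the slide's exact formula is an image not captured in this transcript, so this scheme is an assumption):

```python
def min_max_normalise(column):
    # Rescale one attribute's values to [0, 1]:
    # a'(x) = (a(x) - min) / (max - min)
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

print(min_max_normalise([10, 20, 40]))  # [0.0, 0.333..., 1.0]
```

After normalising each attribute this way, no single attribute with a large raw scale can dominate the Euclidean distance.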
13
Normalised training data
Test instance
14
Distances of test instance from training data
Classification: 1-NN: Yes; 3-NN: Yes; 5-NN: No; 7-NN: No
15
What if the target function is real valued?
  • The k-nearest neighbour algorithm would just
    calculate the mean of the f values of the k
    nearest neighbours

16
Variant of kNN Distance-Weighted kNN
  • We might want to weight nearer neighbours more
    heavily, e.g. in proportion to the inverse square
    of their distance from the query
  • Then it makes sense to use all training examples
    instead of just k (Shepard's method)
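A sketch of distance-weighted prediction in the spirit of Shepard's method. The inverse-square weighting is an assumption (the slide's formula is an image not captured here), and the dataset is a made-up toy.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def weighted_knn(train, query, eps=1e-9):
    # Use ALL training examples, weighting each by the inverse
    # square of its distance to the query (assumed scheme).
    num = den = 0.0
    for x, f in train:
        d = euclidean(x, query)
        if d < eps:
            return f  # query coincides with a training point
        w = 1.0 / d ** 2
        num += w * f
        den += w
    return num / den

data = [((0,), 0.0), ((1,), 1.0), ((2,), 4.0)]
print(weighted_knn(data, (1.0,)))  # exact match -> 1.0
```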


17
Difficulties with k-nearest neighbour algorithms
  • Have to calculate the distance of the test case
    from all training cases
  • There may be irrelevant attributes amongst the
    attributes (the curse of dimensionality)
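A tiny illustration (not from the slides) of why irrelevant attributes are harmful: noisy extra dimensions can dominate the Euclidean distance even when the one relevant attribute agrees closely.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two points that nearly agree on the single relevant attribute...
print(euclidean((1.0,), (1.1,)))  # ~0.1

# ...are pushed far apart once irrelevant, noisy attributes are added.
print(euclidean((1.0, 5.0, -3.0), (1.1, -2.0, 4.0)))  # ~9.9
```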

18
Case-based reasoning (CBR)
  • CBR is an advanced form of instance-based
    learning, applied to more complex instance objects
  • Objects may include complex structural
    descriptions of cases and adaptation rules

19
  • CBR cannot use Euclidean distance measures
  • Distance measures must be defined for such complex
    objects instead (e.g. semantic nets)
  • CBR tries to model human problem-solving
  • uses past experience (cases) to solve new
    problems
  • retains solutions to new problems
  • CBR is an ongoing area of machine learning
    research with many applications

20
  • We only touch upon the area of Case-Based
    Reasoning here.
  • If you are interested in finding out more, a
    good place to start is the second part of the
    chapter on Instance-Based Learning in the
    textbook of Tom Mitchell.

21
  • The remaining slides in this file are optional
    material, and not examinable.

22
Applications of CBR
  • Design
  • landscape, building, mechanical, conceptual
    design of aircraft sub-systems
  • Planning
  • repair schedules
  • Diagnosis
  • medical
  • Adversarial reasoning
  • legal

23
CBR process
New Case
24
CBR example Property pricing
Test instance
25
How rules are generated
  • There is no unique way of doing it. Here is one
    possibility:
  • Examine cases and look for ones that are almost
    identical
  • case 1 and case 2:
  • R1: If recep-rooms changes from 2 to 1 then
    reduce price by 5,000
  • case 3 and case 4:
  • R2: If Type changes from semi to terraced then
    reduce price by 7,000

26
Matching
  • Comparing the test instance (case 5) against each
    stored case:
  • matches(5,1) = 3
  • matches(5,2) = 3
  • matches(5,3) = 2
  • matches(5,4) = 1
  • Estimated price of case 5 is 25,000

27
Adapting
  • Reverse rule 2:
  • if type changes from terraced to semi then
    increase price by 7,000
  • Apply reversed rule 2:
  • new estimate of the price of property 5 is 32,000
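The matching and adaptation steps from these slides can be sketched as follows. The case attributes below are hypothetical stand-ins, since the original property table is an image not captured in this transcript; only the 25,000 estimate, the reversed rule, and the 32,000 result come from the slides.

```python
# Stored case base (attributes are illustrative assumptions).
cases = [
    {"type": "terraced", "recep": 2, "price": 25000},  # assumed best match
]
# The new property to price (assumed attributes).
query = {"type": "semi", "recep": 2}

def matches(case, query):
    # Count attributes on which the stored case agrees with the query.
    return sum(case.get(k) == v for k, v in query.items())

# Retrieve: take the price of the best-matching case as the estimate.
best = max(cases, key=lambda c: matches(c, query))
estimate = best["price"]

# Adapt: reversed rule 2 -- if type changes from terraced to semi,
# increase price by 7,000.
if best["type"] == "terraced" and query["type"] == "semi":
    estimate += 7000

print(estimate)  # 32000
```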

28
Learning
  • So far we have a new case and an estimated price
  • nothing has been added yet to the case base
  • If we later find the house sold for 35,000, the
    case would be added
  • we could also add a new rule:
  • if location changes from 8 to 7, increase price by
    3,000

29
Problems with CBR
  • How should cases be represented?
  • How should cases be indexed for fast retrieval?
  • How can good adaptation heuristics be developed?
  • When should old cases be removed?

30
Advantages
  • A local approximation is found for each test case
  • Knowledge is in a form understandable to human
    beings
  • Fast to train

31
Summary
  • K-Nearest Neighbours
  • Case-based reasoning
  • Lazy and eager learning

32
Lazy and Eager Learning
  • Lazy: wait for the query before generalising
  • k-Nearest Neighbour, Case-based reasoning
  • Eager: generalise before seeing the query
  • Radial Basis Function Networks, ID3, ...
  • Does it matter?
  • An eager learner must create a global approximation
  • A lazy learner can create many local approximations