Title: Instance Based Learning
Nearest Neighbor
- Remember all your data.
- When someone asks a question, find the nearest old data point and return the answer associated with it.
- In order to say which point is nearest, we have to define what we mean by "near".
- Typically, we use the Euclidean distance between two points.
- For nominal attributes, the distance is set to 1 if the values are different and 0 if they are equal (both cases are handled in the sketch below).
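A minimal sketch of the whole scheme in Python (the data layout and function names here are illustrative assumptions, not from the slides):

```python
import math

def distance(a, b):
    """Distance over mixed attributes: Euclidean on numeric values,
    0/1 mismatch on nominal (string) values."""
    total = 0.0
    for x, y in zip(a, b):
        if isinstance(x, str):            # nominal: 1 if different, 0 if equal
            total += 0.0 if x == y else 1.0
        else:                             # numeric: squared difference
            total += (x - y) ** 2
    return math.sqrt(total)

def nearest_neighbor(training, query):
    """'Remember all your data': training is a list of (features, label)
    pairs; return the label of the single closest stored point."""
    _, label = min(training, key=lambda fl: distance(fl[0], query))
    return label

# e.g., with hypothetical (R, L) points:
# nearest_neighbor([((0.2, 1.0), "no"), ((0.9, 4.0), "yes")], (0.3, 2.0)) -> "no"
```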
Predicting Bankruptcy
- Now, let's say we have a new person with R equal to 0.3 and L equal to 2.
- What y value should we predict?
- The nearest data point is labeled "no", so our answer would be "no".
Scaling
- The naïve Euclidean distance isn't always appropriate.
- Consider the case where we have two features describing a car:
- f1: weight in pounds
- f2: number of cylinders
- Any effect of f2 will be completely lost because of the relative scales.
- So, rescale the inputs to put all of the features on about equal footing, as in the sketch below.
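One common recipe is min-max rescaling of each feature to [0, 1]; a sketch (the helper below is an illustration, not a named method from the slides):

```python
def rescale(points):
    """Min-max rescale each feature column to [0, 1] so that, e.g.,
    weight in pounds no longer swamps number of cylinders."""
    lo = [min(col) for col in zip(*points)]
    hi = [max(col) for col in zip(*points)]
    return [
        tuple((x - l) / (h - l) if h > l else 0.0
              for x, l, h in zip(p, lo, hi))
        for p in points
    ]

# Cars as (weight_lb, cylinders): both columns now span [0, 1].
print(rescale([(3500, 8), (2200, 4), (2900, 6)]))
```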
Time and Space
- Learning is fast: we just have to remember the training data.
- Space is O(n).
- What takes longer is answering a query.
- If we do it naively, we have to compute the distance to the query point for each of the n points in our training set, and each distance takes about m computations, since there are m features to compare.
- So, overall, answering a query takes about O(mn) time.
Noise
- Someone with an apparently healthy financial record goes bankrupt.
- That single noisy point then decides every query that lands nearest to it, so 1-nearest-neighbor is sensitive to noise.
Remedy: K-Nearest Neighbors
- k-nearest neighbor algorithm: just like the old algorithm, except that when we get a query, we search for the k closest points to the query point.
- Output what the majority says.
- In this case, we've chosen k to be 3.
- The three closest points consist of two "no"s and a "yes", so our answer would be "no".
- Find the optimal k using cross-validation (the voting rule is sketched below).
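A sketch of the voting rule for numeric features, using plain Euclidean distance; k=3 mirrors the slide's choice, and in practice k would be tuned by cross-validation:

```python
import math
from collections import Counter

def k_nearest_neighbors(training, query, k=3):
    """Rank all stored points by distance to the query and let the
    k closest ones vote; the majority label wins."""
    ranked = sorted(training, key=lambda fl: math.dist(fl[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```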
Other Variants
- IB2: save memory, speed up classification
- Works incrementally.
- Only incorporates misclassified instances.
- Problem: noisy data gets incorporated.
- IB3: deal with noise
- Discard instances that don't perform well.
- Keep a record of the number of correct and incorrect classification decisions that each exemplar makes.
- Two predetermined thresholds are set on the success ratio.
- If the performance of an exemplar falls below the lower threshold, the exemplar is deleted.
- If the performance exceeds the upper threshold, the exemplar is used for prediction.
Instance-based learning: IB2
- IB2: save memory, speed up classification
- Works incrementally.
- Only incorporates misclassified instances.
- Problem: noisy data gets incorporated.
- Data: who buys gold jewelry?
- (25,60,no) (45,60,no) (50,75,no) (50,100,no)
- (50,120,no) (70,110,yes) (85,140,yes) (30,260,yes)
- (25,400,yes) (45,350,yes) (50,275,yes) (60,260,yes)
Instance-based learning: IB2
- Data
- (25,60,no)
- (85,140,yes)
- (45,60,no)
- (30,260,yes)
- (50,75,no)
- (50,120,no)
- (70,110,yes)
- (25,400,yes)
- (50,100,no)
- (45,350,yes)
- (50,275,yes)
- (60,260,yes)
This is the final answer, i.e., we memorize only the 5 points highlighted in the figure. However, let's compute the classifier gradually.
Instance-based learning: IB2
- Data
- (25,60,no)
- (85,140,yes)
So far the model has only the first instance memorized, so this second instance gets wrongly classified. We therefore memorize it as well.
Instance-based learning: IB2
- Data
- (25,60,no)
- (85,140,yes)
- (45,60,no)
So far the model has the first two instances memorized. The third instance gets properly classified, since it happens to be closer to the first. So, we don't memorize it.
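As a check: dist((45,60), (25,60)) = 20, while dist((45,60), (85,140)) = √(40² + 80²) ≈ 89.4, so the nearest memorized point is indeed the first one, a "no", and the prediction is correct.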
Instance-based learning: IB2
- Data
- (25,60,no)
- (85,140,yes)
- (45,60,no)
- (30,260,yes)
So far the model has the first two instances memorized. The fourth instance gets properly classified, since it happens to be closer to the second. So, we don't memorize it.
Instance-based learning: IB2
- Data
- (25,60,no)
- (85,140,yes)
- (45,60,no)
- (30,260,yes)
- (50,75,no)
So far the model has the first two instances memorized. The fifth instance gets properly classified, since it happens to be closer to the first. So, we don't memorize it.
Instance-based learning: IB2
- Data
- (25,60,no)
- (85,140,yes)
- (45,60,no)
- (30,260,yes)
- (50,75,no)
- (50,120,no)
So far the model has the first two instances memorized. The sixth instance gets wrongly classified, since it happens to be closer to the second. So, we memorize it.
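Checking the distances: dist((50,120), (25,60)) = √(25² + 60²) = 65, while dist((50,120), (85,140)) = √(35² + 20²) ≈ 40.3, so the nearest memorized point is the second one, a "yes", and this "no" instance is misclassified.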
Instance-based learning: IB2
- Continuing in a similar way, we finally get the figure on the right.
- The colored points are the ones that get memorized.
This is the final answer, i.e., we memorize only these 5 points.
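The whole pass can be written as one loop. A sketch under the assumption of 1-NN with plain Euclidean distance over the two numeric features; exactly which points end up memorized can depend on details like tie-breaking, so this may not match the slide's figure point for point:

```python
import math

def ib2(stream):
    """IB2: present instances one at a time and memorize an instance
    only when the current memory misclassifies it."""
    memory = []
    for features, label in stream:
        if memory:
            _, predicted = min(memory,
                               key=lambda fl: math.dist(fl[0], features))
            if predicted == label:
                continue                   # classified correctly: skip it
        memory.append((features, label))   # first instance or a mistake
    return memory

data = [((25, 60), "no"), ((85, 140), "yes"), ((45, 60), "no"),
        ((30, 260), "yes"), ((50, 75), "no"), ((50, 120), "no"),
        ((70, 110), "yes"), ((25, 400), "yes"), ((50, 100), "no"),
        ((45, 350), "yes"), ((50, 275), "yes"), ((60, 260), "yes")]
print(ib2(data))   # the memorized exemplars, in arrival order
```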
Instance-based learning: IB3
- IB3: deal with noise
- Discard instances that don't perform well.
- Keep a record of the number of correct and incorrect classification decisions that each exemplar makes.
- Two predetermined thresholds are set on the success ratio.
- An instance is used for classification:
- if its number of incorrect classifications is ≤ the first (lower) threshold, and
- if its number of correct classifications is ≥ the second (upper) threshold.
Instance-based learning: IB3
- Suppose the lower threshold is 0 and the upper threshold is 1.
- Shuffle the data first:
- (25,60,no)
- (85,140,yes)
- (45,60,no)
- (30,260,yes)
- (50,75,no)
- (50,120,no)
- (70,110,yes)
- (25,400,yes)
- (50,100,no)
- (45,350,yes)
- (50,275,yes)
- (60,260,yes)
Instance-based learning: IB3
- Suppose the lower threshold is 0 and the upper threshold is 1.
- Shuffle the data first; each point is followed by its (incorrect, correct) counts:
- (25,60,no) 1,1
- (85,140,yes) 1,1
- (45,60,no) 0,1
- (30,260,yes) 0,2
- (50,75,no) 0,1
- (50,120,no) 0,1
- (70,110,yes) 0,0
- (25,400,yes) 0,1
- (50,100,no) 0,0
- (45,350,yes) 0,0
- (50,275,yes) 0,1
- (60,260,yes) 0,0
Instance-based learning: IB3
- The points that will be used in classification are:
- (45,60,no) 0,1
- (30,260,yes) 0,2
- (50,75,no) 0,1
- (50,120,no) 0,1
- (25,400,yes) 0,1
- (50,275,yes) 0,1
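The acceptance rule is easy to state in code. A sketch that simply applies the two thresholds to the already-tallied (incorrect, correct) records above; note the full IB3 algorithm maintains these counts incrementally and, in its original form, uses confidence intervals rather than fixed counts:

```python
def accepted_exemplars(records, low=0, high=1):
    """Keep exemplars with incorrect <= low AND correct >= high."""
    return [(features, label)
            for features, label, incorrect, correct in records
            if incorrect <= low and correct >= high]

records = [((25, 60), "no", 1, 1), ((85, 140), "yes", 1, 1),
           ((45, 60), "no", 0, 1), ((30, 260), "yes", 0, 2),
           ((50, 75), "no", 0, 1), ((50, 120), "no", 0, 1),
           ((70, 110), "yes", 0, 0), ((25, 400), "yes", 0, 1),
           ((50, 100), "no", 0, 0), ((45, 350), "yes", 0, 0),
           ((50, 275), "yes", 0, 1), ((60, 260), "yes", 0, 0)]
print(accepted_exemplars(records))   # yields the six points listed above
```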
Rectangular generalizations
- When a new exemplar is classified correctly, it is generalized by simply merging it with the nearest exemplar.
- The nearest exemplar may be either a single instance or a hyper-rectangle.
Rectangular generalizations
- Data
- (25,60,no)
- (85,140,yes)
- (45,60,no)
- (30,260,yes)
- (50,75,no)
- (50,120,no)
- (70,110,yes)
- (25,400,yes)
- (50,100,no)
- (45,350,yes)
- (50,275,yes)
- (60,260,yes)
Classification
- If the new instance lies within a rectangle, output the rectangle's class.
- If the new instance lies in the overlap of several rectangles, output the class of the rectangle whose center is closest to the new data instance.
- If the new instance lies outside all of the rectangles, output the class of the rectangle that is closest to the data instance.
- The distance of a point from a rectangle is:
- if the instance lies within the rectangle, d = 0;
- if outside, d = the distance from the closest part of the rectangle, i.e., the distance from some point on the rectangle boundary (computed in the sketch below).
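This distance is easy to compute for axis-aligned rectangles by clamping each coordinate of the point to the rectangle's extent; a minimal sketch:

```python
import math

def rect_distance(point, lo, hi):
    """Distance from point to the axis-aligned rectangle with opposite
    corners lo and hi: clamping gives the nearest point of the rectangle,
    so the result is 0 whenever the point lies inside."""
    nearest = [min(max(p, l), h) for p, l, h in zip(point, lo, hi)]
    return math.dist(point, nearest)

print(rect_distance((0, 0), (1, 1), (3, 2)))    # outside: ~1.414
print(rect_distance((2, 1.5), (1, 1), (3, 2)))  # inside: 0.0
```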
(Figure: points of Class 1 and Class 2 with the separation line between them.)