Title: Instance-Based Learning
1 Instance-Based Learning
Evgueni Smirnov
2 Overview
- Instance-Based Learning
- Comparison of Eager and Instance-Based Learning
- Instance Distances for Instance-Based Learning
- Nearest Neighbor (NN) Algorithm
- Advantages and Disadvantages of the NN Algorithm
- Approaches to Overcome the Disadvantages of the NN Algorithm
- Combining Eager and Instance-Based Learning
3 Instance-Based Learning
- Learning: store all training instances.
- Classification: a new instance receives the classification of its nearest training instances.
4 Different Learning Methods
- Eager Learning
  - Learning: acquire an explicit structure of a classifier from the whole training set.
  - Classification: a new instance is classified using the explicit structure of the classifier.
- Instance-Based Learning (Lazy Learning)
  - Learning: store all training instances.
  - Classification: a new instance receives the classification of its nearest training instances.
5 Different Learning Methods
[Cartoon: the eager learner, having seen a mouse, generalizes a rule: "Any random movement -> It's a mouse!"]
6 Instance-Based Learning
[Cartoon: the lazy learner compares a new object to stored instances: "It's very similar to a desktop!"]
7 Nearest-Neighbor Algorithm (NN)
- The features of the task of the NN algorithm:
  - The instance language I is a conjunctive language with a set A of n attributes a1, a2, ..., an. The domain of each attribute ai can be discrete or continuous.
  - An instance x is represented as <a1(x), a2(x), ..., an(x)>, where ai(x) is the value of the attribute ai for the instance x.
  - The classes to be learned can be:
    - discrete: we learn a discrete function f(x), and the co-domain C of the function consists of the classes c to be learned;
    - continuous: we learn a continuous function f(x), and the co-domain C of the function consists of the continuous values to be predicted.
8 Distance Functions
- The distance functions are composed from difference metrics d_a w.r.t. attributes a, defined for each two instances x_i and x_j.
- If the attribute a is numerical, then d_a(x_i, x_j) = |a(x_i) - a(x_j)|.
- If the attribute a is discrete, then d_a(x_i, x_j) = 0 if a(x_i) = a(x_j), and 1 otherwise.
9 Distance Functions
The main distance function for determining nearest neighbors is the Euclidean distance:

d(x_i, x_j) = sqrt( sum over attributes a of d_a(x_i, x_j)^2 )
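As a concrete illustration, here is a minimal sketch of these definitions in Python. The function names are mine, not part of the slides; it assumes numerical attribute values are compared by absolute difference and discrete values by the 0/1 overlap metric, as defined above.

import math

def attribute_distance(v_i, v_j):
    # Numerical attribute: absolute difference of the two values.
    if isinstance(v_i, (int, float)) and isinstance(v_j, (int, float)):
        return abs(v_i - v_j)
    # Discrete attribute: 0 if the values match, 1 otherwise.
    return 0.0 if v_i == v_j else 1.0

def euclidean_distance(x_i, x_j):
    # An instance is a tuple <a1(x), ..., an(x)> of attribute values.
    return math.sqrt(sum(attribute_distance(v_i, v_j) ** 2
                         for v_i, v_j in zip(x_i, x_j)))

For example, euclidean_distance((1.0, 'red'), (3.0, 'blue')) evaluates to sqrt(2^2 + 1^2). In practice numerical attributes are usually normalized first so that no single attribute dominates the distance.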
10 k-Nearest-Neighbor Algorithm
- The case of a discrete set of classes:
  - Take the instance x to be classified.
  - Find the k nearest neighbors of x in the training data.
  - Determine the class c of the majority of the instances among the k nearest neighbors.
  - Return the class c as the classification of x (a sketch follows below).
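A minimal sketch of this procedure in Python (the function name is mine), reusing the euclidean_distance function from the previous slide and representing the training data as a list of (instance, class) pairs:

from collections import Counter

def knn_classify(x, training_data, k):
    # Sort the training data by distance to x and keep the k nearest.
    neighbors = sorted(training_data,
                       key=lambda pair: euclidean_distance(x, pair[0]))[:k]
    # Majority vote among the classes of the k nearest neighbors.
    votes = Counter(c for _, c in neighbors)
    return votes.most_common(1)[0][0]

Sorting the whole training set is the simplest way to find the k nearest neighbors; a serious implementation would use a heap or a spatial index instead.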
11 Classification Decision Boundaries
[Figure: a query instance q1 among positive and negative training instances. With 1-NN, q1 is classified as positive; with 5-NN, q1 is classified as negative.]
12 k-Nearest-Neighbor Algorithm
- The case of a continuous set of classes (regression):
  - Take the instance x to be classified.
  - Find the k nearest neighbors of x in the training data.
  - Return the average of the target values of the k nearest neighbors as the classification of x (a sketch follows below).
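The regression variant differs from the classification sketch above only in the final step: averaging replaces the majority vote. A minimal sketch, with training data as (instance, target value) pairs:

def knn_regress(x, training_data, k):
    # Sort the training data by distance to x and keep the k nearest.
    neighbors = sorted(training_data,
                       key=lambda pair: euclidean_distance(x, pair[0]))[:k]
    # The prediction is the average target value of the neighbors.
    return sum(v for _, v in neighbors) / len(neighbors)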
13 Distance-Weighted Nearest-Neighbor Algorithm
- The case of a discrete set of classes:
  - Take the instance x to be classified.
  - Determine for each class c the sum S_c = sum of 1 / d(x, x_i)^2, taken over the k nearest neighbors x_i of x that belong to the class c.
  - Return the class c with the greatest S_c (a sketch follows below).
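A minimal sketch of distance-weighted voting, assuming (as above) that the weight of each neighbor is the inverse squared distance; the function name is mine:

def weighted_knn_classify(x, training_data, k):
    # training_data is a list of (instance, class) pairs.
    neighbors = sorted(training_data,
                       key=lambda pair: euclidean_distance(x, pair[0]))[:k]
    scores = {}
    for instance, c in neighbors:
        d = euclidean_distance(x, instance)
        if d == 0.0:
            return c  # an exact match decides the class immediately
        # Each neighbor votes for its class with weight 1 / d^2.
        scores[c] = scores.get(c, 0.0) + 1.0 / d ** 2
    return max(scores, key=scores.get)

With this weighting, a single very close neighbor can outvote several distant ones, which makes the classification less sensitive to the exact choice of k.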
14 Advantages of the NN Algorithm
- The NN algorithm can estimate complex target classes locally and differently for each new instance to be classified.
- The NN algorithm provides good generalisation accuracy on many domains.
- The NN algorithm learns very quickly, since learning consists only of storing the training instances.
- The NN algorithm is robust to noisy training data when k > 1.
- The NN algorithm is intuitive and easy to understand, which facilitates implementation and modification.
15 Disadvantages of the NN Algorithm
- The NN algorithm has large storage requirements, because it has to store all the training data.
- The NN algorithm is slow during instance classification, because all the training instances have to be visited.
- The accuracy of the NN algorithm degrades as the noise in the training data increases.
- The accuracy of the NN algorithm degrades as the number of irrelevant attributes increases.
16 Condensed NN Algorithm
The Condensed NN (CNN) algorithm was introduced to reduce the storage requirements of the NN algorithm. The algorithm finds a subset S of the training data D such that each instance in D can be correctly classified by the NN algorithm applied on the subset S. The average storage reduction achieved by the algorithm varies between 60% and 80%.
17 Condensed NN Algorithm
[Figure: the training data D and the condensed subset S it is reduced to.]
The algorithm first randomly selects one instance for each class in D and puts it in S. Then each instance in D is classified using only the instances in S. If an instance is misclassified, it is added to S. This process is repeated until there are no instances in D that are misclassified. A sketch of this procedure follows below.
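A minimal sketch in Python (the function name is mine), reusing the knn_classify function from slide 10 with k = 1:

import random

def condensed_nn(training_data):
    # Seed S with one randomly chosen instance per class in D.
    by_class = {}
    for instance, c in training_data:
        by_class.setdefault(c, []).append((instance, c))
    S = [random.choice(group) for group in by_class.values()]
    changed = True
    while changed:
        changed = False
        for instance, c in training_data:
            # Classify each instance in D using only S (1-NN);
            # misclassified instances are added to S.
            if knn_classify(instance, S, k=1) != c:
                S.append((instance, c))
                changed = True
    return S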
18 Condensed NN Algorithm
- The CNN algorithm is especially sensitive to noise, because noisy instances will usually be misclassified by their neighbors and thus will be retained. This causes two problems:
  - Storage reduction is hindered, because noisy instances are retained, and because they are there, non-noisy instances nearby will often also need to be retained.
  - Generalization accuracy is hurt, because noisy instances are usually exceptions and thus do not represent the underlying function well.
19 Edited NN Algorithm
The Edited NN (ENN) algorithm was proposed to stabilise the accuracy of the NN algorithm as the noise in the training data increases. The algorithm starts with the set S equal to the training data D, and then removes each instance in S that does not agree with the majority of its k nearest neighbors (with k = 3, typically). The algorithm edits out noisy instances as well as close border cases, leaving smoother decision boundaries. It also retains all internal points, i.e., it does not reduce the space as much as most other reduction algorithms. A sketch of this procedure follows below.
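A minimal sketch in Python (the function name is mine), again reusing knn_classify from slide 10:

def edited_nn(training_data, k=3):
    S = list(training_data)
    kept = []
    for i, (instance, c) in enumerate(S):
        # Exclude the instance itself when finding its k nearest
        # neighbors; drop it if it disagrees with their majority class.
        others = S[:i] + S[i + 1:]
        if knn_classify(instance, others, k) == c:
            kept.append((instance, c))
    return kept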
20 Edited NN Algorithm
[Figure: a noisy negative instance surrounded by positive instances; the negative instance is removed by the ENN algorithm.]
The average storage reduction achieved by the algorithm varies between 20% and 40%.
21 Weighting Attributes
The attribute-weighting technique was proposed in order to improve the accuracy of the NN algorithm in the presence of irrelevant attributes. The key idea is to find weights for all the attributes and to use them when the distance between instances is computed. The weights of the attributes can be determined by a search algorithm, while the adequacy of the weights can be evaluated by cross-validation. In a similar way we can choose the best parameter k for the NN algorithm (a sketch follows below).
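A minimal sketch of the two ingredients in Python: a weighted distance (the search over the weights themselves is omitted), and leave-one-out cross-validation used to pick k. The function names are mine; attribute_distance and knn_classify come from the earlier slides.

import math

def weighted_euclidean_distance(x_i, x_j, weights):
    # Attribute weights scale each difference metric's contribution,
    # so irrelevant attributes can be suppressed with small weights.
    return math.sqrt(sum(w * attribute_distance(v_i, v_j) ** 2
                         for w, v_i, v_j in zip(weights, x_i, x_j)))

def loo_accuracy(training_data, k):
    # Leave-one-out cross-validation: classify each instance
    # using all the other training instances.
    correct = sum(1 for i, (instance, c) in enumerate(training_data)
                  if knn_classify(instance,
                                  training_data[:i] + training_data[i + 1:],
                                  k) == c)
    return correct / len(training_data)

def best_k(training_data, candidates=(1, 3, 5, 7, 9)):
    # Pick the k with the highest cross-validated accuracy.
    return max(candidates, key=lambda k: loo_accuracy(training_data, k))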
22 Combining Decision Trees and the NN Algorithm
[Figure: a decision tree for the weather data.]
Outlook
  sunny -> Humidity
    high -> no
    normal -> yes
  overcast -> yes
  rainy -> Windy
    false -> yes
    true -> no
23 Combining Decision Trees and the NN Algorithm
[Figure: the same decision tree, but each leaf now stores the training instances that reach it rather than a single class label.]
Classify the instance using the NN algorithm applied on the training instances associated with the classification nodes (leaves) of the tree. A sketch of this combination follows below.
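A minimal sketch of this combination using scikit-learn, assuming numeric feature matrices X_train and labels y_train as NumPy arrays (unlike the categorical weather example above); the function name and parameters are illustrative.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

def tree_nn_classify(x, X_train, y_train, k=3, max_depth=2):
    # Grow a shallow tree; its leaves partition the training data.
    tree = DecisionTreeClassifier(max_depth=max_depth).fit(X_train, y_train)
    # Find which leaf each training instance and the query fall into.
    train_leaves = tree.apply(X_train)
    query_leaf = tree.apply(np.asarray([x]))[0]
    # Run k-NN only on the training instances in the query's leaf.
    mask = train_leaves == query_leaf
    X_leaf, y_leaf = X_train[mask], y_train[mask]
    knn = KNeighborsClassifier(n_neighbors=min(k, len(X_leaf)))
    return knn.fit(X_leaf, y_leaf).predict(np.asarray([x]))[0]

The tree restricts the neighbor search to a small, locally relevant region, which speeds up classification and can reduce the influence of irrelevant attributes.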
24 Combining Decision Rules and the NN Algorithm
25 Summary Points
- Instance-based learning is a simple, efficient and accurate approach to concept learning and classification.
- Many of the problems of instance-based learning can be solved.
- Instance-based learning can be combined with eager approaches to concept learning.