Title: Fuzzy K-Nearest Neighbour Algorithm
1Fuzzy K-Nearest Neighbour Algorithm
- CP5090 Semester 1 2006
- A Presentation by Michael Fryer
2Introduction
- Classifying things is important
- K-Nearest Neighbour (K-NN) Algorithm is one way of doing this
- Fuzzy KNN is a suggested improvement to KNN
3About KNN
- Computationally Simple
- Only a bit less accurate than far more complicated algorithms
- Good results with small data sets
4How KNN Works
- Take an initial known data set of at least 2 classes.
- Get a new element that needs to be classified into one of the classes of the original data set.
- Find the K Nearest Neighbours of the new point from the original data.
- Assign the new point to the class that the majority of the nearest neighbours belong to.
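The four steps above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `knn_classify`, the toy two-class data set and the choice of Euclidean distance are assumptions made for the example, not details taken from the slides.

```python
import math
from collections import Counter

def knn_classify(train, new_point, k=3):
    """Classify new_point by majority vote among its k nearest neighbours.

    train is a list of (features, label) pairs (hypothetical layout).
    Euclidean distance is assumed; the slides do not name a metric.
    """
    # Step 3: sort the known data by distance to the new point, keep the k closest.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], new_point))
    nearest_labels = [label for _, label in by_distance[:k]]
    # Step 4: assign the class held by the majority of those neighbours.
    # (Ties are broken by whichever label was encountered first.)
    return Counter(nearest_labels).most_common(1)[0][0]

# Step 1: an initial known data set with two classes (toy values).
train = [
    ((1.0, 1.0), "blue"),
    ((1.5, 2.0), "blue"),
    ((2.0, 1.0), "blue"),
    ((3.5, 3.0), "green"),
    ((6.0, 6.0), "green"),
    ((7.0, 5.5), "green"),
]

# Step 2: a new element that needs to be classified.
new_point = (2.5, 2.0)
print(knn_classify(train, new_point, k=3))  # two blue neighbours, one green
```

With K = 3 the new point has two blue neighbours and one green neighbour, so the vote goes to blue. Note that the output is a single label and says nothing about how close the vote was.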
5How KNN Works
Initial Data Set
6How KNN Works
New Unknown Element
7How KNN Works
Find K-Nearest (let K be 3)
8How KNN Works
Assign the new point to the class that the
majority of the nearest neighbours belong to.
9Why does KNN need to be Improved
- All sample elements are weighted equally when assigning a class to the new element.
- The amount of information given out by the algorithm is very limited. The classified element is either part of a class or not part of it.
10Fuzzy KNN
- Fuzzy KNN was created to try and solve some of the problems with KNN
- Fuzzy KNN is just KNN using fuzzy sets as the output.
11Fuzzy Sets
- A fuzzy set is a set in which each element can belong to multiple classes to varying degrees.
- Typically this is represented as a membership strength between 0 and 1, where the memberships across all classes add up to 1.
- E.g. element y belongs to class A with strength 0.75 and class B with strength 0.25.
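As a small illustration, the membership of element y could be held as a mapping from class to strength. The dictionary layout and the helper names `is_valid_membership` and `crisp_class` are hypothetical, chosen only for this sketch.

```python
import math

# Membership of element y, using the values from the example above.
membership_y = {"A": 0.75, "B": 0.25}

def is_valid_membership(membership):
    """Check each strength lies in [0, 1] and the strengths sum to 1."""
    in_range = all(0.0 <= s <= 1.0 for s in membership.values())
    return in_range and math.isclose(sum(membership.values()), 1.0)

def crisp_class(membership):
    """Collapse a fuzzy membership to the single strongest class."""
    return max(membership, key=membership.get)

print(is_valid_membership(membership_y))
print(crisp_class(membership_y))
```

Collapsing to the strongest class recovers an ordinary (crisp) classification, while the full mapping keeps the extra information about how strongly y belongs to each class.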
12How Fuzzy KNN Works
Same as KNN up to the point where all the
neighbours are found. (k still equals 3 here)
13How Fuzzy KNN Works
However, when classifying the new element it is
given a fuzzy membership in all the classes of
its neighbours.
14How Fuzzy KNN Works
The fuzzy membership is calculated from the
number of neighbours in each class and their
distances from the new element.
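The slides do not give the exact formula. One widely used formulation (Keller, Gray and Givens, 1985) weights each of the K neighbours by the inverse of its distance raised to the power 2/(m-1), then normalises so the memberships sum to 1. The sketch below assumes that formulation, crisp class labels on the training data, Euclidean distance, and a fuzzifier `m = 2`; the function name `fuzzy_knn` and the toy data are hypothetical.

```python
import math

def fuzzy_knn(train, new_point, k=3, m=2.0):
    """Return a fuzzy membership {class: strength} for new_point.

    Assumes inverse-distance weighting with exponent 2/(m-1) and crisp
    training labels; this is one common variant, not necessarily the
    one used in the slides.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], new_point))
    neighbours = by_distance[:k]
    exponent = 2.0 / (m - 1.0)
    eps = 1e-12  # avoids division by zero if new_point coincides with a sample
    weights = [
        1.0 / (math.dist(features, new_point) ** exponent + eps)
        for features, _ in neighbours
    ]
    total = sum(weights)
    classes = sorted({label for _, label in train})
    # Each class receives the normalised weight of its neighbours.
    return {
        c: sum(w for w, (_, label) in zip(weights, neighbours) if label == c) / total
        for c in classes
    }

train = [
    ((1.0, 1.0), "blue"),
    ((1.5, 2.0), "blue"),
    ((2.0, 1.0), "blue"),
    ((3.5, 3.0), "green"),
    ((6.0, 6.0), "green"),
    ((7.0, 5.5), "green"),
]

membership = fuzzy_knn(train, (2.5, 2.0), k=3)
print(membership)  # roughly blue 0.78, green 0.22
```

In this toy case the two blue neighbours are closer than the single green one, so the new point is mostly blue but keeps some green membership. A closer green neighbour would raise the green share even though the neighbour count stays the same.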
15How Fuzzy KNN Works
The new node still mostly belongs to the blue
class; however, with fuzzy KNN it also has a bit of
membership in the green class.
16Testing
- Testing was done on three data sets comparing KNN to Fuzzy KNN
- The three data sets were IRIS, IRIS23 and TWOCLASS
- IRIS is a data set of 150 elements with 4 attributes for each element. It has historically been used as a basic test for many classification techniques. It has three classes with 50 elements in each.
17Testing
- IRIS23 is a subset of the IRIS data set made up of the second and third classes, which cannot be perfectly separated by a classification algorithm
- TWOCLASS is an artificial data set. It was included because data about how the Bayes classification technique performed on it was available to compare against.
18Testing
- Testing was done by removing 1 element from the set being worked on and using it as the unknown, whilst using the remaining data to classify the unknown.
- Three different types of Fuzzy KNN classifiers were used.
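The procedure in the first bullet is commonly called leave-one-out testing. A rough sketch follows, assuming a plain KNN classifier with Euclidean distance and a toy data set rather than IRIS or TWOCLASS; the names `knn_classify` and `leave_one_out_accuracy` are hypothetical.

```python
import math
from collections import Counter

def knn_classify(train, new_point, k=3):
    """Majority vote among the k nearest neighbours (Euclidean distance assumed)."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], new_point))
    labels = [label for _, label in by_distance[:k]]
    return Counter(labels).most_common(1)[0][0]

def leave_one_out_accuracy(data, classify, k=3):
    """Hold out each element in turn and classify it using the remaining data."""
    correct = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # the data set with element i removed
        if classify(rest, features, k) == label:
            correct += 1
    return correct / len(data)

# Toy data for illustration only.
data = [
    ((1.0, 1.0), "blue"),
    ((1.5, 2.0), "blue"),
    ((2.0, 1.0), "blue"),
    ((3.5, 3.0), "green"),
    ((6.0, 6.0), "green"),
    ((7.0, 5.5), "green"),
]

print(leave_one_out_accuracy(data, knn_classify, k=3))
```

Passing the classifier in as an argument means the same loop could score a fuzzy classifier as well, provided its fuzzy output is first reduced to the strongest class.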
19Results
20Results
21Results
22Results
23Conclusion
- Fuzzy KNN has accuracy comparable to, and in most cases slightly better than, KNN
- However, Fuzzy KNN's main advantage is the extra information that can be obtained from the fuzzy set output.