Title: FPGA Co-Processor Enhanced Ant Colony Systems Data Mining
1. FPGA Co-Processor Enhanced Ant Colony Systems Data Mining
- Jason Isaacs and Simon Y. Foo
- Machine Intelligence Laboratory
- FAMU-FSU College of Engineering
- Department of Electrical and Computer Engineering
2. Presentation Outline
- Introduction
- Significance of Research
- Concise Background on ACS
- Summary of Data Mining focused on Clustering
- Discussion of ACS-based Data Mining
- FPGA Co-processor Enhancement
- Conclusions
- Future Work
3. Project Goal: to design and implement an Ant Colony Systems toolbox for non-combinatorial problem solving. This toolbox will comprise both hardware- and software-based solutions.
4. Ant Colony Systems Project Overview
- This work aims at advancing fundamental research in Ant Colony Systems.
- The major objectives of this project are:
- Develop a set of behavior models
- Design ACS algorithms for solutions to non-combinatorial problems
- Analyze algorithms for hardware implementations
- Implement FPGA modules (current)
- Incorporate all modules into a cohesive toolbox
5. Introduction to Ant Colony Systems
- Ants are model organisms for bio-simulations due to both their relative individual simplicity and their complex group behaviors.
- Colonies have evolved means for collectively performing tasks that are far beyond the capacities of individual ants. They do so without direct communication or centralized control (stigmergy).
- Previous research: our use of simulated ants to generate random numbers proved a novel application for ACS.
- Prior to 1992, ACS was used exclusively to study real ant behavior.
- However, in the last decade, beginning with Marco Dorigo's 1992 PhD dissertation "Optimization, Learning and Natural Algorithms," which models the way real ants solve problems using pheromones, ant colony simulations have provided solutions to a variety of NP-hard combinatorial optimization problems.
6. ACS Application Area: Data Mining
- Ant colony real-world behaviors applicable to data mining:
- Ant Foraging
- Cemetery Organization and Brood Sorting
- Division of Labor and Task Allocation
- Self-organization and Templates
- Co-operative Transport
- Nest Building
7. Cemetery Organization and Brood Sorting
8. Ant Colony Nest Examples
9. Flowchart for the ACS Data Mining System
10. Knowledge Discovery and Data Mining
- What is Data Mining?
- Discovery of useful summaries of data
- Also, data mining refers to a collection of techniques for extracting interesting relationships and knowledge hidden in data.
- It is best described as "the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data" (Fayyad et al., 1996).
11. Knowledge Discovery in Databases
12. Typical Tasks in Data Mining
- Classification
- Prediction
- Clustering
- Association Analysis
- Summarization
13. Clustering
- What is Clustering?
- Given points in some space, often a high-dimensional space, group the points into a small number of clusters, each cluster consisting of points that are near one another in some sense.
14. The k-Means Algorithm
- k-means picks k cluster centroids and assigns points to the clusters by choosing the closest centroid to the point in question. As points are assigned to clusters, the centroid of the cluster may migrate.
- Consider a very simple example of five points in two dimensions. Suppose we assign the points 1, 2, 3, 4, and 5 in that order, with k = 2. Then points 1 and 2 are assigned to the two clusters and become their centroids for the moment.
- When we consider point 3, suppose it is closer to 1, so 3 joins the cluster of 1, whose centroid moves to the point indicated as a. Suppose that when we assign 4, we find that 4 is closer to 2 than to a, so 4 joins 2 in its cluster, whose center thus moves to b. Finally, 5 is closer to a than to b, so it joins the cluster {1, 3}, whose centroid moves to c.
15. The k-Means Algorithm
Having located the centroids of the k clusters, we can reassign all points, since some points that were assigned early may actually wind up closer to another centroid as the centroids move about. If we are not sure of k, we can try different values of k until we find the smallest k such that increasing k does not much decrease the average distance of points to their centroids (see the sketch below).
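Below is a minimal Python sketch of the incremental assignment just described together with the "smallest good k" check. The five 2-D points are hypothetical stand-ins for points 1 through 5 in the figure, not data from the presentation.

```python
# Sketch of the incremental k-means assignment from the previous two slides.
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(cluster):
    """Center of mass of a list of points."""
    n = len(cluster)
    return tuple(sum(p[i] for p in cluster) / n for i in range(len(cluster[0])))

def incremental_kmeans(points, k):
    """Assign points one at a time; the first k points seed the clusters,
    and each cluster's centroid migrates as new points join it."""
    clusters = [[p] for p in points[:k]]          # points 1..k become seeds
    centroids = [centroid(c) for c in clusters]
    for p in points[k:]:
        j = min(range(k), key=lambda i: dist(p, centroids[i]))
        clusters[j].append(p)
        centroids[j] = centroid(clusters[j])      # centroid moves (a, b, c, ...)
    return clusters, centroids

def mean_distance(clusters, centroids):
    """Average distance of every point to its own centroid."""
    total, count = 0.0, 0
    for c, m in zip(clusters, centroids):
        for p in c:
            total += dist(p, m)
            count += 1
    return total / count

if __name__ == "__main__":
    pts = [(0.0, 0.0), (4.0, 4.0), (1.0, 0.5), (5.0, 3.5), (0.5, 1.0)]  # hypothetical
    # Pick the smallest k beyond which the average distance stops improving much.
    for k in range(1, 5):
        cl, ce = incremental_kmeans(pts, k)
        print(k, round(mean_distance(cl, ce), 3))
```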
16. ACS Notation and Heuristics
E = {O1, ..., On}: the set of n data objects collected.
Oi = (v1, ..., vk): each object is a vector of k numerical attributes.
Vector similarity is measured by Euclidean distance (other measures such as Minkowski, Hamming, or Mahalanobis can be used).
Dmax = max D(Oi, Oj), where Oi, Oj ∈ E.
17. ACS Notation and Heuristics
- The m × m 2-D search area, in general, must satisfy m² ≥ n, but experiments have shown that m² ≥ 4n provides good results.
- A heap (pile) H is considered to be a collection of two or more objects. This collection is located on a single cell rather than being merely spatially connected; this limitation prevents overlaps.
(Figures: spatial pattern cluster vs. single-cell ranked cluster)
18. ACS Distance Measures
- Dmax is the maximum distance between two objects of H.
- Ocenter is the center of mass of all objects in H (not necessarily a real object).
- Odissim is the most dissimilar object in H, i.e. the object which maximizes ...
- Dmean is the mean distance between the objects of H and the center of mass Ocenter (see the sketch below).
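A small Python sketch of these four measures, under the assumption (not stated on this slide, whose formula is not reproduced here) that Odissim is the object with the largest mean distance to the other members of the heap:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def d_max(heap):
    """Maximum pairwise distance between two objects of H."""
    return max(euclidean(a, b) for i, a in enumerate(heap) for b in heap[i + 1:])

def o_center(heap):
    """Center of mass of all objects in H (not necessarily a real object)."""
    n = len(heap)
    return tuple(sum(o[i] for o in heap) / n for i in range(len(heap[0])))

def o_dissim(heap):
    """Most dissimilar object of H (assumed here: the object with the largest
    mean distance to the other members of the heap)."""
    n = len(heap)
    i = max(range(n), key=lambda i: sum(euclidean(heap[i], heap[j])
                                        for j in range(n) if j != i))
    return heap[i]

def d_mean(heap):
    """Mean distance between the objects of H and the center of mass Ocenter."""
    c = o_center(heap)
    return sum(euclidean(o, c) for o in heap) / len(heap)

# Tiny example heap of three 2-D objects.
heap = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8)]
print(d_max(heap), o_center(heap), o_dissim(heap), d_mean(heap))
```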
19. ACS Unsupervised Learning and Clustering Algorithm
- Randomly initialize the ant positions
- Repeat
- For each ant_i Do
- Move ant_i
- If ant_i does not carry any object, Then look at its 8-cell neighborhood and pick up an object according to the pick-up algorithm
- Else (ant_i is already carrying an object O), look at its 8-cell neighborhood and drop O according to the drop-off algorithm
- Until stopping criterion (a code skeleton of this loop follows below)
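A rough Python skeleton of this loop. The grid handling, the random movement rule, and the placeholder pick-up/drop-off functions are illustrative assumptions; the real rules are sketched on slides 22 and 23.

```python
import random

def acs_cluster(objects, grid_size, n_ants, max_steps=10000):
    # Scatter objects and ants on random cells of the m x m grid.
    cell = lambda: (random.randrange(grid_size), random.randrange(grid_size))
    obj_pos = {i: cell() for i in range(len(objects))}
    ants = [{"pos": cell(), "carrying": None} for _ in range(n_ants)]

    for _ in range(max_steps):                    # "Until stopping criterion"
        for ant in ants:                          # "For each ant_i Do"
            ant["pos"] = move(ant["pos"], grid_size)
            if ant["carrying"] is None:
                ant["carrying"] = try_pickup(ant["pos"], obj_pos, objects)
            else:
                if try_drop(ant["pos"], ant["carrying"], obj_pos, objects):
                    ant["carrying"] = None
    return obj_pos

def move(pos, m):
    """Random step to one of the 8 neighboring cells, wrapping at the border."""
    dx, dy = random.choice([(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)
                            if (i, j) != (0, 0)])
    return ((pos[0] + dx) % m, (pos[1] + dy) % m)

def try_pickup(pos, obj_pos, objects):
    """Placeholder for the pick-up algorithm (slide 22)."""
    return None

def try_drop(pos, carried, obj_pos, objects):
    """Placeholder for the drop-off algorithm (slide 23)."""
    return False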
20. ACS Data Mining Algorithm: Top Level
- Load Database
- Data Compression
- Object Clustering
- Clustering of Similar Groups
- Reevaluate Objects in Groups
21. ACS Data Mining Algorithm: Top Level
- Load Database
- Select Compression Method
- Wavelets
- Principal Component Analysis
- None
- Repeat for Max_Iterations1 (Object Clustering): Begin
- Ants Redistribute Objects
- K-means
- Repeat for Max_Iterations2 (Clustering of Similar Groups)
- Ants Redistribute Piles (Clusters) of Objects
- K-means
- Repeat for Max_Iterations3 (Reevaluate Objects in Groups)
- Ants Redistribute Objects in Clusters with a probability based on the least similar object's distance from the mean of the cluster
- K-means
(A driver sketch of this three-phase flow follows below.)
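A compact driver sketch of this three-phase flow. The phase functions are passed in as callables so the sketch stays self-contained; the compression step is omitted and the identity placeholders in the usage line are only there to show the call shape.

```python
# Sketch of the top-level ACS data mining flow on this slide.
def acs_data_mining(data, redistribute_objects, redistribute_piles,
                    reevaluate_objects, kmeans,
                    max_iter1=10, max_iter2=5, max_iter3=5):
    """Run the three clustering phases, each interleaved with k-means."""
    heaps = [[o] for o in data]                    # every object starts alone
    for _ in range(max_iter1):                     # Phase 1: object clustering
        heaps = kmeans(redistribute_objects(heaps))
    for _ in range(max_iter2):                     # Phase 2: cluster similar groups
        heaps = kmeans(redistribute_piles(heaps))
    for _ in range(max_iter3):                     # Phase 3: re-evaluate least
        heaps = kmeans(reevaluate_objects(heaps))  # similar objects
    return heaps

if __name__ == "__main__":
    identity = lambda heaps: heaps                 # placeholders for the phases
    print(acs_data_mining([(0.1, 0.2), (0.9, 0.8)],
                          identity, identity, identity, identity))
```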
22. ACS Object Pick-up Algorithm
- Label the 8-cell neighborhood as unexplored
- Repeat
- Consider the next unexplored cell c around ant_i, in the following order: cell 1 is NW, cell 2 is N, cell 3 is NE, and so on, where N is the direction the ant is facing.
- If c is not empty, Then do one of the following:
- If c contains a single object O, Then load O with probability Pload, Else
- If c contains a heap of two objects, Then remove one of the two with probability Pdestroy, Else
- If c contains a heap H of more than 2 objects, Then remove the most dissimilar object Odissim(H) from H provided that ...
- Label c as explored
- Until all 8 cells have been explored or one object has been loaded
(A pick-up sketch follows below.)
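A hedged Python sketch of the pick-up rule. The slide's "provided that" condition is not reproduced here, so it appears as an injected predicate; the Pload/Pdestroy values and the 1-D dissimilarity stand-in are assumptions.

```python
import random

P_LOAD = 0.5       # probability of loading an isolated object (assumed value)
P_DESTROY = 0.2    # probability of breaking up a two-object heap (assumed value)

def pick_up(neighborhood, dissim_ok=lambda heap: True):
    """neighborhood: 8 cells in exploration order, each a list of objects.
    Returns the object picked up, or None."""
    for cell in neighborhood:
        if not cell:
            continue                                  # empty cell: nothing to load
        if len(cell) == 1 and random.random() < P_LOAD:
            return cell.pop()                         # single object O
        if len(cell) == 2 and random.random() < P_DESTROY:
            return cell.pop(random.randrange(2))      # break up a 2-object heap
        if len(cell) > 2 and dissim_ok(cell):
            # Remove the most dissimilar object Odissim(H); here a stand-in
            # that assumes objects are 1-D values and dissimilarity is the
            # distance to the heap mean.
            mean = sum(cell) / len(cell)
            o = max(cell, key=lambda v: abs(v - mean))
            cell.remove(o)
            return o
    return None

# Example neighborhood: 8 cells, some empty, one holding a 3-object heap.
print(pick_up([[], [0.2], [], [0.5, 0.52, 0.9], [], [], [], []]))
```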
23. ACS Object Drop-off Algorithm
- Label the 8-cell neighborhood as unexplored
- Repeat
- Consider the next unexplored cell c around ant_i, in the following order: cell 1 is NW, cell 2 is N, cell 3 is NE, and so on, where N is the direction the ant is facing.
- If c is empty, Then drop O in the cell with probability Pdrop, Else
- If c contains a single object O', Then drop O to create a heap H provided that ..., Else
- If c contains a heap H, Then drop O on H provided that ...
- Label c as explored
- Until all 8 cells have been explored or the carried object has been dropped
(A drop-off sketch follows below.)
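A companion sketch to the pick-up rule for the drop-off rule on this slide. The two unstated "provided that" similarity conditions are passed in as predicates, and Pdrop is an assumed constant.

```python
import random

P_DROP = 0.5   # probability of dropping on an empty cell (assumed value)

def drop_off(carried, neighborhood,
             may_pair=lambda carried, other: True,
             may_join=lambda carried, heap: True):
    """neighborhood: 8 cells in exploration order, each a list of objects.
    Returns True if the carried object was dropped."""
    for cell in neighborhood:
        if not cell:
            if random.random() < P_DROP:
                cell.append(carried)          # drop O on an empty cell
                return True
        elif len(cell) == 1:
            if may_pair(carried, cell[0]):    # drop O to create a heap H
                cell.append(carried)
                return True
        else:
            if may_join(carried, cell):       # drop O onto an existing heap H
                cell.append(carried)
                return True
    return False

# Example: the ant carries 0.41 and sees one single object and one small heap.
cells = [[0.4], [], [0.6, 0.61], [], [], [], [], []]
print(drop_off(0.41, cells), cells)
```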
24. Parameter Table
25. K-means Algorithm
- Take as input the partition P of the data set found by the ants, in the form of k heaps H1, ..., Hk
- Repeat
- Compute Ocenter(H1), ..., Ocenter(Hk)
- Remove all objects from the heaps
- For each object Oi ∈ E:
- Let Hj, j ∈ [1, k], be the heap whose center is the closest to Oi
- Assign Oi to Hj
- Compute the resulting new partition P' = {H1, ..., Hk} by removing all empty clusters
- Until stopping criterion
(A runnable sketch follows below.)
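A minimal runnable Python sketch of this refinement step: the heaps found by the ants form the initial partition, and objects are reassigned to the closest heap center until the partition stops changing. The fixed iteration cap standing in for the stopping criterion is an assumption.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def center(heap):
    n = len(heap)
    return tuple(sum(o[i] for o in heap) / n for i in range(len(heap[0])))

def kmeans_refine(heaps, max_iter=100):
    """heaps: list of non-empty lists of object vectors (the ants' partition P)."""
    for _ in range(max_iter):
        centers = [center(h) for h in heaps]              # Ocenter(H1..Hk)
        new_heaps = [[] for _ in heaps]                   # remove all objects
        for o in (obj for h in heaps for obj in h):       # for each Oi in E
            j = min(range(len(centers)), key=lambda i: euclidean(o, centers[i]))
            new_heaps[j].append(o)                        # assign Oi to Hj
        new_heaps = [h for h in new_heaps if h]           # drop empty clusters
        if new_heaps == heaps:                            # stopping criterion
            break
        heaps = new_heaps
    return heaps

# Example: two ant heaps refined into two tight clusters.
print(kmeans_refine([[(0.0, 0.0), (5.0, 5.0)], [(0.2, 0.1), (5.1, 4.9)]]))
```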
26. Benchmark Databases
- The following public-domain data sets were obtained from the UCI (University of California at Irvine) Machine Learning Repository. These have been used extensively for classification tasks using different paradigms. The main characteristics of each of these domains are described on the next three slides.
27. Tested Databases
- Golf
- Very simple database, 4 attributes, 2 classes
- Balloons
- The influence of prior knowledge on concept acquisition, 4 data sets, 4 attributes, 2 classes
- Wine
- Well-behaved class structure, 178 instances, 13 attributes, 3 classes
- Hepatitis
- Poorly distributed database, 155 instances, 19 attributes, 2 classes
- Iris (plant)
- Very popular database, 150 instances, 4 attributes, 3 classes
- Wisconsin Breast Cancer
- High-dimensional database, 198 instances, 32 attributes, 2 classes
28. Golf Data Results
(Tables: the given data, its numerical equivalent, and the normalized data)
29. Golf Data Results
(Plot: number in cluster vs. objects 1-14 and position of cluster; clusters labeled Don't Play, Play, Don't Play, with one labeling error)
30. Golf Data Results
(Plot: number in cluster vs. objects 1-14 and position of cluster; clusters labeled Play, Don't Play, Don't Play, with no errors)
31. Wine Database
The data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars.
The attributes are: 1) Alcohol, 2) Malic acid, 3) Ash, 4) Alcalinity of ash, 5) Magnesium, 6) Total phenols, 7) Flavanoids, 8) Nonflavanoid phenols, 9) Proanthocyanins, 10) Color intensity, 11) Hue, 12) OD280/OD315 of diluted wines, 13) Proline
Number of instances: class 1: 59, class 2: 71, class 3: 48
- Error: 0.050562
- 5 class 1 objects mislabeled as class 2
- 3 class 2 objects mislabeled as class 3
- 1 class 3 object mislabeled as class 2
32. Iris (Plant) Database
This is perhaps the best-known database to be found in the pattern recognition literature.
- Attribute Information
- 1. sepal length in cm
- 2. sepal width in cm
- 3. petal length in cm
- 4. petal width in cm
Number of instances: 150 (50 in each of three classes: Iris Setosa, Iris Versicolour, Iris Virginica)
Errors: 0.047 (4 mislabeled as type 2, 3 mislabeled as type 3)
Errors: 0.04 (2 mislabeled as type 3, 4 mislabeled as type 2)
33. ACS DM Optimization of Parameters
- Number of Total Iterations
- Compression Method (PCA, Wavelet, None)
- Cluster Method
- Objects Only
- Objects and Groups of Objects
- Objects, Groups, then Objects again
- Number of Ants
- K-Means Iterations
- Distance Measure (Euclidean, Minkowski, Hamming, or Mahalanobis)
- Others (RNG, Ant Movement Distance, Ant Carrying Capacity)
34. ACS DM: Object Grouping Only
35. ACS DM: Object and Cluster Grouping Only
36. ACS DM: Object, Cluster, and Object
37. Why Move to Hardware?
- For such large data sets the ACS classifier performs remarkably well. However,
- The speed of classification is very limited in software.
- The computational bottleneck lies in the number of multiplies and adds that must be performed for each object. In addition, the requirement of a square root for each distance measurement adds complexity (see the sketch below).
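A back-of-the-envelope sketch of the operation counts behind this claim. The object and attribute counts come from the Wisconsin Breast Cancer set listed earlier; the assumed number of distance evaluations per object per iteration is illustrative only.

```python
# Rough operation-count sketch: each Euclidean distance between k-attribute
# vectors costs k subtractions, k multiplies, k-1 additions, and one sqrt.
def distance_op_count(k):
    """Operations for one Euclidean distance between two k-attribute objects."""
    return {"sub": k, "mul": k, "add": k - 1, "sqrt": 1}

def per_iteration_cost(n_objects, k, distances_per_object):
    """Total operations if each object triggers a given number of distance
    evaluations (neighborhood comparisons, centroid assignments, etc.)."""
    per = distance_op_count(k)
    total_dist = n_objects * distances_per_object
    return {op: cnt * total_dist for op, cnt in per.items()}

# Example: 198 objects with 32 attributes (Wisconsin Breast Cancer sized),
# assuming ~10 distance evaluations per object per iteration.
print(per_iteration_cost(198, 32, 10))
```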
38. Target Hardware: Avnet's Virtex-II Pro Board
- Uses a Virtex-II Pro XC2VP20
- Many options for I/O
- The 32-bit PCI bus has a data throughput of over 100 MB per second
39. ACS-DM System Top-Level HW
40. ACS-DM Hardware Design
41. K-Means Distance Calculator with CORDIC Square Root
42. Device Utilization Summary
- Selected Device: 2vp20ff896-6
- Number of Slices: 6600 out of 9280 (71%)
- Number of Slice Flip Flops: 8312 out of 18560 (44%)
- Number of 4-input LUTs: 7661 out of 18560 (41%)
- Number of bonded IOBs: 266 out of 556 (48%)
- Number of BRAMs: 3 out of 88 (3%)
- Number of MULT18X18s: 8 out of 88 (9%)
- Number of GCLKs: 1 out of 16 (6%)
- Timing Report
- Clock information: clock signal clk, clock buffer BUFGP, load 1419
- Timing Summary: minimum period 16.499 ns (maximum frequency 60.611 MHz)
The CORDIC square-root data path is the greatest bottleneck, causing the high period.
43. Hardware Euclidean Distance Result
V1 = [0.83812, 0.01964, 0.68128, 0.37948, 0.8318, 0.50281, 0.70947, 0.42889, 0.30462, 0.18965, 0.19343, 0.68222, 0.30276, 0.54167, 0.15087, 0.6979, 0.37837, 0.86001, 0.85366, 0.59356]
V2 = [0.49655, 0.89977, 0.82163, 0.64491, 0.81797, 0.66023, 0.34197, 0.28973, 0.34119, 0.53408, 0.72711, 0.30929, 0.8385, 0.56807, 0.37041, 0.70274, 0.54657, 0.44488, 0.69457, 0.62131]
- Result from MATLAB: 1.5058
- Result from Hardware: 1.5172
- Vectors are Fix 8_7 on input
- Then after the add: Fix 9_7
- Then after the multiply: Fix 18_14
- Then after the accumulate: Fix 20_14
- Then after the CORDIC Sqrt: Fix 42_36
- Error is present due to round-off and the CORDIC Sqrt (see the sketch below)
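A Python sketch of the fixed-point word growth listed above, showing how input quantization alone perturbs the result. The truncation mode is an assumption, and the CORDIC square root is replaced by an ideal one, so the CORDIC contribution to the error is not modeled.

```python
import math

def to_fix(x, frac_bits):
    """Quantize x to a fixed-point value with the given fractional bits
    (truncation toward zero, as a simple assumption)."""
    return math.trunc(x * (1 << frac_bits)) / (1 << frac_bits)

def fixed_point_distance(v1, v2):
    acc = 0.0
    for a, b in zip(v1, v2):
        a = to_fix(a, 7)                  # Fix 8_7 input
        b = to_fix(b, 7)
        diff = to_fix(a - b, 7)           # Fix 9_7 after the add/subtract
        sq = to_fix(diff * diff, 14)      # Fix 18_14 after the multiply
        acc = to_fix(acc + sq, 14)        # Fix 20_14 after the accumulate
    return math.sqrt(acc)                 # CORDIC sqrt not modeled here

v1 = [0.83812, 0.01964, 0.68128, 0.37948, 0.8318, 0.50281, 0.70947, 0.42889,
      0.30462, 0.18965, 0.19343, 0.68222, 0.30276, 0.54167, 0.15087, 0.6979,
      0.37837, 0.86001, 0.85366, 0.59356]
v2 = [0.49655, 0.89977, 0.82163, 0.64491, 0.81797, 0.66023, 0.34197, 0.28973,
      0.34119, 0.53408, 0.72711, 0.30929, 0.8385, 0.56807, 0.37041, 0.70274,
      0.54657, 0.44488, 0.69457, 0.62131]
print(round(fixed_point_distance(v1, v2), 4))   # differs slightly from 1.5058
```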
44. Ant Colony Actions: Movement
The CA RNG is a simple 32-bit Rule 30 generator that is user-initialized for reproducibility (a software sketch follows below).
(Block diagram: per-ant RNGs, RNG Ant(1) through RNG Ant(N), feed an Ant Move-Direction Filter; the current location, last location, and have-data status from the Ant Colony Data store drive an Ant Change Location block that produces the new location data.)
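A software sketch of a 32-bit Rule 30 cellular-automaton RNG of this kind. The output tap (the register's middle cell) and the circular boundary are assumptions about the hardware; only "32-bit Rule 30, user-initialized seed" comes from the slide.

```python
class Rule30RNG:
    WIDTH = 32

    def __init__(self, seed):
        if seed == 0:
            raise ValueError("seed must be nonzero")  # all-zero state never changes
        self.state = seed & ((1 << self.WIDTH) - 1)

    def step(self):
        """Advance the CA one generation: new = left XOR (center OR right)."""
        s = self.state
        mask = (1 << self.WIDTH) - 1
        left = ((s << 1) | (s >> (self.WIDTH - 1))) & mask
        right = ((s >> 1) | (s << (self.WIDTH - 1))) & mask
        self.state = left ^ (s | right)
        return (self.state >> (self.WIDTH // 2)) & 1   # tap one cell as output

    def bits(self, n):
        """Collect n pseudo-random bits, e.g. 3 bits for a move direction 0-7."""
        return [self.step() for _ in range(n)]

rng = Rule30RNG(seed=0xACE1)                          # user-initialized seed
direction = int("".join(map(str, rng.bits(3))), 2)    # one of 8 move directions
print(direction)
```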
45. Pheromone Trail Result from Hardware Co-simulation
A single ant is simulated for clarity; darker red marks the most recent position.
46. Ant Colony Actions: Object Load/Drop
(Block diagram: the current location, carried status, and object information feed a "Were probabilities and thresholds met?" check that enables drop/load (Y/N); an Ant Change Have-Data Status block combines the current have-data status with the Ant Colony Data store entries (current location, last location, have-data status) to produce the new have-data status.)
47. ACS DM Hardware Storage Requirements
- Preprocessed Data (Number of Objects × Vector Length, 8- to 32-bit fixed-point)
- Object Vectors
- Object Locations
- Object Status
- Parameter Values (16 × 32-bit fixed-point)
- Probabilities
- Thresholds
- Limits
- Max Distance (1 × 32-bit fixed-point)
- Groups (Number of Objects × Number of Groups × 1-bit, and 3 × Number of Groups × 8-bit)
- Members
- Means (Object Vector Length × 32-bit fixed-point)
- Locations
- Ant Locations and Have-Object Status (Number of Ants × 8-bit, plus 1-bit status)
48. (No transcript)
49. PCI Bridge
50. Block Diagram
- The Virtex-II Pro is the focal point.
- A Spartan acts as the bridge to PCI
- On Board Memory
- 32 MB SDRAM
- 2 MB SRAM
- 16 MB FLASH
- 128 MB DDR SDRAM
- 64 MB Compact Flash
- Ethernet
- RS232
- 4 AvBus Connectors
- 2 PMC Connectors
51. Conclusions / Future Work
- Continue to design the ACS Data Mining System
- Implement an improved memory manager
- Correct errors associated with round-off and the CORDIC Sqrt
- Implement the group clustering algorithm
- Optimize the PC/FPGA interfacing to create our own low-cost integrated system
- Our problems currently reside in the PCI interface design shipped with the Avnet development board. We are working hard to resolve this issue, but in the end we may have to consider another board. (Also shown in presentation P248.)
- We also need to improve the speed; 60 MHz is too slow.
- Optimize data throughput and the computational efficiency of the distance metric algorithm, i.e., consider a multi-stage pipeline or employ more look-up tables.
- The ultimate goal is to demonstrate the ability of ACS algorithms to perform as well as other well-known techniques while allowing for computational speed-up by utilizing FPGAs as co-processors.