Title: FPGA Co-Processor Enhanced Ant Colony Systems Data Mining
1. FPGA Co-Processor Enhanced Ant Colony Systems Data Mining
- Jason Isaacs and Simon Y. Foo
- Machine Intelligence Laboratory
- FAMU-FSU College of Engineering
- Department of Electrical and Computer Engineering
2. Presentation Outline
- Introduction
- Significance of Research
- Concise Background on ACS
- Summary of Data Mining focused on Clustering
- Discussion of ACS-based Data Mining
- FPGA Co-processor Enhancement
- Conclusions
- Future Work
3. Project Goal: to design and implement an Ant Colony Systems toolbox for non-combinatorial problem solving. This toolbox will comprise both hardware- and software-based solutions.
4. Ant Colony Systems Project Overview
- This work aims at advancing fundamental research in Ant Colony Systems.
- The major objectives of this project are:
- Develop a set of behavior models
- Design ACS algorithms for solutions to non-combinatorial problems
- Analyze algorithms for hardware implementations
- Implement FPGA modules (current)
- Incorporate all modules into a cohesive toolbox
5. Introduction to Ant Colony Systems
- Ants are model organisms for bio-simulations due to both their relative individual simplicity and their complex group behaviors.
- Colonies have evolved means for collectively performing tasks that are far beyond the capacities of individual ants. They do so without direct communication or centralized control (stigmergy).
- Previous research: our use of simulated ants to generate random numbers proved a novel application for ACS.
- Prior to 1992, ACS was used exclusively to study real ant behavior.
- However, in the last decade, beginning with Marco Dorigo's 1992 PhD dissertation "Optimization, Learning and Natural Algorithms," which models the way real ants solve problems using pheromones, ant colony simulations have provided solutions to a variety of NP-hard combinatorial optimization problems.
6. ACS Application Area: Data Mining
- Ant colony real-world behaviors applicable to data mining:
- Ant Foraging
- Cemetery Organization and Brood Sorting
- Division of Labor and Task Allocation
- Self-organization and Templates
- Co-operative Transport
- Nest Building
7. Cemetery Organization and Brood Sorting
8. Ant Colony Nest Examples
9. Flowchart for the ACS Data Mining System
10. Knowledge Discovery and Data Mining
- What is Data Mining?
- Discovery of useful summaries of data
- Also, data mining refers to a collection of techniques for extracting interesting relationships and knowledge hidden in data.
- It is best described as "the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data" (Fayyad et al., 1996).
11. Knowledge Discovery in Databases
12. Typical Tasks in Data Mining
- Classification
- Prediction
- Clustering
- Association Analysis
- Summarization
13. Clustering
- What is Clustering?
- Given points in some space, often a high-dimensional space, group the points into a small number of clusters, each cluster consisting of points that are near one another in some sense.
14. The k-Means Algorithm
- k-means picks k cluster centroids and assigns points to the clusters by choosing the closest centroid to the point in question. As points are assigned to clusters, the centroid of the cluster may migrate.
- Consider a very simple example of five points in two dimensions. Suppose we assign the points 1, 2, 3, 4, and 5 in that order, with k = 2. Then points 1 and 2 are assigned to the two clusters and become their centroids for the moment.
- When we consider point 3, suppose it is closer to 1, so 3 joins the cluster of 1, whose centroid moves to the point indicated as a. Suppose that when we assign 4, we find that 4 is closer to 2 than to a, so 4 joins 2 in its cluster, whose center thus moves to b. Finally, 5 is closer to a than to b, so it joins the cluster {1, 3}, whose centroid moves to c.
15. The k-Means Algorithm
Having located the centroids of the k clusters, we can reassign all points, since some points that were assigned early may actually wind up closer to another centroid as the centroids move about. If we are not sure of k, we can try different values of k until we find the smallest k such that increasing k does not much decrease the average distance of points to their centroids (see the sketch below).
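Below is a minimal Python sketch of the incremental assignment just described together with the "smallest good k" check. The five 2-D points are hypothetical stand-ins for points 1 through 5 in the figure, not data from the presentation.

```python
# Sketch of the incremental k-means assignment from the previous two slides.
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(cluster):
    """Center of mass of a list of points."""
    n = len(cluster)
    return tuple(sum(p[i] for p in cluster) / n for i in range(len(cluster[0])))

def incremental_kmeans(points, k):
    """Assign points one at a time; the first k points seed the clusters,
    and each cluster's centroid migrates as new points join it."""
    clusters = [[p] for p in points[:k]]          # points 1..k become seeds
    centroids = [centroid(c) for c in clusters]
    for p in points[k:]:
        j = min(range(k), key=lambda i: dist(p, centroids[i]))
        clusters[j].append(p)
        centroids[j] = centroid(clusters[j])      # centroid moves (a, b, c, ...)
    return clusters, centroids

def mean_distance(clusters, centroids):
    """Average distance of every point to its own centroid."""
    total, count = 0.0, 0
    for c, m in zip(clusters, centroids):
        for p in c:
            total += dist(p, m)
            count += 1
    return total / count

if __name__ == "__main__":
    pts = [(0.0, 0.0), (4.0, 4.0), (1.0, 0.5), (5.0, 3.5), (0.5, 1.0)]  # hypothetical
    # Pick the smallest k beyond which the average distance stops improving much.
    for k in range(1, 5):
        cl, ce = incremental_kmeans(pts, k)
        print(k, round(mean_distance(cl, ce), 3))
```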
16. ACS Notation and Heuristics
E = {O1, ..., On}: the set of n data objects collected.
Oi = (v1, ..., vk): each object is a vector of k numerical attributes.
Vector similarity is measured by Euclidean distance (other measures such as Minkowski, Hamming, or Mahalanobis can be used).
Dmax = max D(Oi, Oj), where Oi, Oj ∈ E.
17. ACS Notation and Heuristics
- The m × m 2-D search area, in general, must satisfy m² ≥ n, but experiments have shown that m² ≥ 4n provides good results.
- A heap (pile) H is considered to be a collection of two or more objects. This collection is located on a single cell rather than being merely spatially connected; this limitation prevents overlaps.
(Figures: spatial pattern cluster vs. single-cell ranked cluster)
18. ACS Distance Measures
- Dmax is the maximum distance between two objects of H.
- Ocenter is the center of mass of all objects in H (not necessarily a real object).
- Odissim is the most dissimilar object in H, i.e. the object which maximizes ...
- Dmean is the mean distance between the objects of H and the center of mass Ocenter (see the sketch below).
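A small Python sketch of these four measures, under the assumption (not stated on this slide, whose formula is not reproduced here) that Odissim is the object with the largest mean distance to the other members of the heap:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def d_max(heap):
    """Maximum pairwise distance between two objects of H."""
    return max(euclidean(a, b) for i, a in enumerate(heap) for b in heap[i + 1:])

def o_center(heap):
    """Center of mass of all objects in H (not necessarily a real object)."""
    n = len(heap)
    return tuple(sum(o[i] for o in heap) / n for i in range(len(heap[0])))

def o_dissim(heap):
    """Most dissimilar object of H (assumed here: the object with the largest
    mean distance to the other members of the heap)."""
    n = len(heap)
    i = max(range(n), key=lambda i: sum(euclidean(heap[i], heap[j])
                                        for j in range(n) if j != i))
    return heap[i]

def d_mean(heap):
    """Mean distance between the objects of H and the center of mass Ocenter."""
    c = o_center(heap)
    return sum(euclidean(o, c) for o in heap) / len(heap)

# Tiny example heap of three 2-D objects.
heap = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8)]
print(d_max(heap), o_center(heap), o_dissim(heap), d_mean(heap))
```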
19. ACS Unsupervised Learning and Clustering Algorithm
- Randomly initialize the ant positions
- Repeat
- For each ant_i Do
- Move ant_i
- If ant_i does not carry any object, Then look at its 8-cell neighborhood and pick up an object according to the pick-up algorithm
- Else (ant_i is already carrying an object O), look at its 8-cell neighborhood and drop O according to the drop-off algorithm
- Until stopping criterion (a code skeleton of this loop follows below)
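A rough Python skeleton of this loop. The grid handling, the random movement rule, and the placeholder pick-up/drop-off functions are illustrative assumptions; the real rules are sketched on slides 22 and 23.

```python
import random

def acs_cluster(objects, grid_size, n_ants, max_steps=10000):
    # Scatter objects and ants on random cells of the m x m grid.
    cell = lambda: (random.randrange(grid_size), random.randrange(grid_size))
    obj_pos = {i: cell() for i in range(len(objects))}
    ants = [{"pos": cell(), "carrying": None} for _ in range(n_ants)]

    for _ in range(max_steps):                    # "Until stopping criterion"
        for ant in ants:                          # "For each ant_i Do"
            ant["pos"] = move(ant["pos"], grid_size)
            if ant["carrying"] is None:
                ant["carrying"] = try_pickup(ant["pos"], obj_pos, objects)
            else:
                if try_drop(ant["pos"], ant["carrying"], obj_pos, objects):
                    ant["carrying"] = None
    return obj_pos

def move(pos, m):
    """Random step to one of the 8 neighboring cells, wrapping at the border."""
    dx, dy = random.choice([(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)
                            if (i, j) != (0, 0)])
    return ((pos[0] + dx) % m, (pos[1] + dy) % m)

def try_pickup(pos, obj_pos, objects):
    """Placeholder for the pick-up algorithm (slide 22)."""
    return None

def try_drop(pos, carried, obj_pos, objects):
    """Placeholder for the drop-off algorithm (slide 23)."""
    return False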
20. ACS Data Mining Algorithm: Top Level
- Load Database
- Data Compression
- Object Clustering
- Clustering of Similar Groups
- Reevaluate Objects in Groups
21. ACS Data Mining Algorithm: Top Level
- Load Database
- Select Compression Method
- Wavelets
- Principal Component Analysis
- None
- Repeat for Max_Iterations1 (Object Clustering): Begin
- Ants Redistribute Objects
- K-means
- Repeat for Max_Iterations2 (Clustering of Similar Groups)
- Ants Redistribute Piles (Clusters) of Objects
- K-means
- Repeat for Max_Iterations3 (Reevaluate Objects in Groups)
- Ants Redistribute Objects in Clusters with a probability based on the least similar object's distance from the mean of the cluster
- K-means
(A driver sketch of this three-phase flow follows below.)
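A compact driver sketch of this three-phase flow. The phase functions are passed in as callables so the sketch stays self-contained; the compression step is omitted and the identity placeholders in the usage line are only there to show the call shape.

```python
# Sketch of the top-level ACS data mining flow on this slide.
def acs_data_mining(data, redistribute_objects, redistribute_piles,
                    reevaluate_objects, kmeans,
                    max_iter1=10, max_iter2=5, max_iter3=5):
    """Run the three clustering phases, each interleaved with k-means."""
    heaps = [[o] for o in data]                    # every object starts alone
    for _ in range(max_iter1):                     # Phase 1: object clustering
        heaps = kmeans(redistribute_objects(heaps))
    for _ in range(max_iter2):                     # Phase 2: cluster similar groups
        heaps = kmeans(redistribute_piles(heaps))
    for _ in range(max_iter3):                     # Phase 3: re-evaluate least
        heaps = kmeans(reevaluate_objects(heaps))  # similar objects
    return heaps

if __name__ == "__main__":
    identity = lambda heaps: heaps                 # placeholders for the phases
    print(acs_data_mining([(0.1, 0.2), (0.9, 0.8)],
                          identity, identity, identity, identity))
```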
22. ACS Object Pick-up Algorithm
- Label the 8-cell neighborhood as unexplored
- Repeat
- Consider the next unexplored cell c around ant_i, in the following order: cell 1 is NW, cell 2 is N, cell 3 is NE, and so on, where N is the direction the ant is facing.
- If c is not empty, Then do one of the following:
- If c contains a single object O, Then load O with probability Pload, Else
- If c contains a heap of two objects, Then remove one of the two with probability Pdestroy, Else
- If c contains a heap H of more than 2 objects, Then remove the most dissimilar object Odissim(H) from H provided that ...
- Label c as explored
- Until all 8 cells have been explored or one object has been loaded
(A pick-up sketch follows below.)
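A hedged Python sketch of the pick-up rule. The slide's "provided that" condition is not reproduced here, so it appears as an injected predicate; the Pload/Pdestroy values and the 1-D dissimilarity stand-in are assumptions.

```python
import random

P_LOAD = 0.5       # probability of loading an isolated object (assumed value)
P_DESTROY = 0.2    # probability of breaking up a two-object heap (assumed value)

def pick_up(neighborhood, dissim_ok=lambda heap: True):
    """neighborhood: 8 cells in exploration order, each a list of objects.
    Returns the object picked up, or None."""
    for cell in neighborhood:
        if not cell:
            continue                                  # empty cell: nothing to load
        if len(cell) == 1 and random.random() < P_LOAD:
            return cell.pop()                         # single object O
        if len(cell) == 2 and random.random() < P_DESTROY:
            return cell.pop(random.randrange(2))      # break up a 2-object heap
        if len(cell) > 2 and dissim_ok(cell):
            # Remove the most dissimilar object Odissim(H); here a stand-in
            # that assumes objects are 1-D values and dissimilarity is the
            # distance to the heap mean.
            mean = sum(cell) / len(cell)
            o = max(cell, key=lambda v: abs(v - mean))
            cell.remove(o)
            return o
    return None

# Example neighborhood: 8 cells, some empty, one holding a 3-object heap.
print(pick_up([[], [0.2], [], [0.5, 0.52, 0.9], [], [], [], []]))
```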
23. ACS Object Drop-off Algorithm
- Label the 8-cell neighborhood as unexplored
- Repeat
- Consider the next unexplored cell c around ant_i, in the following order: cell 1 is NW, cell 2 is N, cell 3 is NE, and so on, where N is the direction the ant is facing.
- If c is empty, Then drop O in the cell with probability Pdrop, Else
- If c contains a single object O', Then drop O to create a heap H provided that ..., Else
- If c contains a heap H, Then drop O on H provided that ...
- Label c as explored
- Until all 8 cells have been explored or the carried object has been dropped
(A drop-off sketch follows below.)
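A companion sketch to the pick-up rule for the drop-off rule on this slide. The two unstated "provided that" similarity conditions are passed in as predicates, and Pdrop is an assumed constant.

```python
import random

P_DROP = 0.5   # probability of dropping on an empty cell (assumed value)

def drop_off(carried, neighborhood,
             may_pair=lambda carried, other: True,
             may_join=lambda carried, heap: True):
    """neighborhood: 8 cells in exploration order, each a list of objects.
    Returns True if the carried object was dropped."""
    for cell in neighborhood:
        if not cell:
            if random.random() < P_DROP:
                cell.append(carried)          # drop O on an empty cell
                return True
        elif len(cell) == 1:
            if may_pair(carried, cell[0]):    # drop O to create a heap H
                cell.append(carried)
                return True
        else:
            if may_join(carried, cell):       # drop O onto an existing heap H
                cell.append(carried)
                return True
    return False

# Example: the ant carries 0.41 and sees one single object and one small heap.
cells = [[0.4], [], [0.6, 0.61], [], [], [], [], []]
print(drop_off(0.41, cells), cells)
```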
24. Parameter Table
25. K-means Algorithm
- Take as input the partition P of the data set found by the ants, in the form of k heaps H1, ..., Hk
- Repeat
- Compute Ocenter(H1), ..., Ocenter(Hk)
- Remove all objects from the heaps
- For each object Oi ∈ E:
- Let Hj, j ∈ [1, k], be the heap whose center is the closest to Oi
- Assign Oi to Hj
- Compute the resulting new partition P' = {H1, ..., Hk} by removing all empty clusters
- Until stopping criterion
(A runnable sketch follows below.)
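A minimal runnable Python sketch of this refinement step: the heaps found by the ants form the initial partition, and objects are reassigned to the closest heap center until the partition stops changing. The fixed iteration cap standing in for the stopping criterion is an assumption.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def center(heap):
    n = len(heap)
    return tuple(sum(o[i] for o in heap) / n for i in range(len(heap[0])))

def kmeans_refine(heaps, max_iter=100):
    """heaps: list of non-empty lists of object vectors (the ants' partition P)."""
    for _ in range(max_iter):
        centers = [center(h) for h in heaps]              # Ocenter(H1..Hk)
        new_heaps = [[] for _ in heaps]                   # remove all objects
        for o in (obj for h in heaps for obj in h):       # for each Oi in E
            j = min(range(len(centers)), key=lambda i: euclidean(o, centers[i]))
            new_heaps[j].append(o)                        # assign Oi to Hj
        new_heaps = [h for h in new_heaps if h]           # drop empty clusters
        if new_heaps == heaps:                            # stopping criterion
            break
        heaps = new_heaps
    return heaps

# Example: two ant heaps refined into two tight clusters.
print(kmeans_refine([[(0.0, 0.0), (5.0, 5.0)], [(0.2, 0.1), (5.1, 4.9)]]))
```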
26. Benchmark Databases
- The following public-domain data sets were obtained from the UCI (University of California at Irvine) Machine Learning Repository. These have been used extensively for classification tasks using different paradigms. The main characteristics of each of these domains are described on the next three slides.
27. Tested Databases
- Golf
- Very simple database, 4 attributes, 2 classes
- Balloons
- The influence of prior knowledge on concept acquisition, 4 data sets, 4 attributes, 2 classes
- Wine
- Well-behaved class structure, 178 instances, 13 attributes, 3 classes
- Hepatitis
- Poorly distributed database, 155 instances, 19 attributes, 2 classes
- Iris (plant)
- Very popular database, 150 instances, 4 attributes, 3 classes
- Wisconsin Breast Cancer
- High-dimensional database, 198 instances, 32 attributes, 2 classes
28. Golf Data Results
(Tables: the given data, its numerical equivalent, and the normalized data)
29. Golf Data Results
(Plot: number in cluster vs. objects 1-14 and position of cluster; clusters labeled Don't Play, Play, Don't Play, with one labeling error)
30. Golf Data Results
(Plot: number in cluster vs. objects 1-14 and position of cluster; clusters labeled Play, Don't Play, Don't Play, with no errors)
31. Wine Database
The data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars.
The attributes are: 1) Alcohol, 2) Malic acid, 3) Ash, 4) Alcalinity of ash, 5) Magnesium, 6) Total phenols, 7) Flavanoids, 8) Nonflavanoid phenols, 9) Proanthocyanins, 10) Color intensity, 11) Hue, 12) OD280/OD315 of diluted wines, 13) Proline
Number of instances: class 1: 59, class 2: 71, class 3: 48
- Error: 0.050562
- 5 class 1 objects mislabeled as class 2
- 3 class 2 objects mislabeled as class 3
- 1 class 3 object mislabeled as class 2
32. Iris (Plant) Database
This is perhaps the best-known database to be found in the pattern recognition literature.
- Attribute Information
- 1. sepal length in cm
- 2. sepal width in cm
- 3. petal length in cm
- 4. petal width in cm
Number of instances: 150 (50 in each of three classes: Iris Setosa, Iris Versicolour, Iris Virginica)
Errors: 0.047 (4 mislabeled as type 2, 3 mislabeled as type 3)
Errors: 0.04 (2 mislabeled as type 3, 4 mislabeled as type 2)
33. ACS DM Optimization of Parameters
- Number of Total Iterations
- Compression Method (PCA, Wavelet, None)
- Cluster Method
- Objects Only
- Objects and Groups of Objects
- Objects, Groups, then Objects again
- Number of Ants
- K-Means Iterations
- Distance Measure (Euclidean, Minkowski, Hamming, or Mahalanobis)
- Others (RNG, Ant Movement Distance, Ant Carrying Capacity)
34. ACS DM: Object Grouping Only
35. ACS DM: Object and Cluster Grouping Only
36. ACS DM: Object, Cluster, and Object
37. Why Move to Hardware?
- For such large data sets the ACS classifier performs remarkably well. However,
- The speed of classification is very limited in software.
- The computational bottleneck lies in the number of multiplies and adds that must be performed for each object. In addition, the requirement of a square root for each distance measurement adds complexity (see the sketch below).
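A back-of-the-envelope sketch of the operation counts behind this claim. The object and attribute counts come from the Wisconsin Breast Cancer set listed earlier; the assumed number of distance evaluations per object per iteration is illustrative only.

```python
# Rough operation-count sketch: each Euclidean distance between k-attribute
# vectors costs k subtractions, k multiplies, k-1 additions, and one sqrt.
def distance_op_count(k):
    """Operations for one Euclidean distance between two k-attribute objects."""
    return {"sub": k, "mul": k, "add": k - 1, "sqrt": 1}

def per_iteration_cost(n_objects, k, distances_per_object):
    """Total operations if each object triggers a given number of distance
    evaluations (neighborhood comparisons, centroid assignments, etc.)."""
    per = distance_op_count(k)
    total_dist = n_objects * distances_per_object
    return {op: cnt * total_dist for op, cnt in per.items()}

# Example: 198 objects with 32 attributes (Wisconsin Breast Cancer sized),
# assuming ~10 distance evaluations per object per iteration.
print(per_iteration_cost(198, 32, 10))
```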
38. Target Hardware: Avnet's Virtex-II Pro Board
- Uses a Virtex-II Pro XC2VP20
- Many options for I/O
- The 32-bit PCI bus has a data throughput of over 100 MB per second
39. ACS-DM System Top-Level HW
40. ACS-DM Hardware Design
41. K-Means Distance Calculator with CORDIC Square Root
42. Device Utilization Summary
- Selected Device: 2vp20ff896-6
- Number of Slices: 6600 out of 9280 (71%)
- Number of Slice Flip Flops: 8312 out of 18560 (44%)
- Number of 4-input LUTs: 7661 out of 18560 (41%)
- Number of bonded IOBs: 266 out of 556 (48%)
- Number of BRAMs: 3 out of 88 (3%)
- Number of MULT18X18s: 8 out of 88 (9%)
- Number of GCLKs: 1 out of 16 (6%)
- Timing Report
- Clock information: clock signal clk, clock buffer BUFGP, load 1419
- Timing Summary: minimum period 16.499 ns (maximum frequency 60.611 MHz)
The CORDIC square-root data path is the greatest bottleneck, causing the high period.
43. Hardware Euclidean Distance Result
V1 = [0.83812, 0.01964, 0.68128, 0.37948, 0.8318, 0.50281, 0.70947, 0.42889, 0.30462, 0.18965, 0.19343, 0.68222, 0.30276, 0.54167, 0.15087, 0.6979, 0.37837, 0.86001, 0.85366, 0.59356]
V2 = [0.49655, 0.89977, 0.82163, 0.64491, 0.81797, 0.66023, 0.34197, 0.28973, 0.34119, 0.53408, 0.72711, 0.30929, 0.8385, 0.56807, 0.37041, 0.70274, 0.54657, 0.44488, 0.69457, 0.62131]
- Result from MATLAB: 1.5058
- Result from Hardware: 1.5172
- Vectors are Fix 8_7 on input
- Then after the add: Fix 9_7
- Then after the multiply: Fix 18_14
- Then after the accumulate: Fix 20_14
- Then after the CORDIC Sqrt: Fix 42_36
- Error is present due to round-off and the CORDIC Sqrt (see the sketch below)
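A Python sketch of the fixed-point word growth listed above, showing how input quantization alone perturbs the result. The truncation mode is an assumption, and the CORDIC square root is replaced by an ideal one, so the CORDIC contribution to the error is not modeled.

```python
import math

def to_fix(x, frac_bits):
    """Quantize x to a fixed-point value with the given fractional bits
    (truncation toward zero, as a simple assumption)."""
    return math.trunc(x * (1 << frac_bits)) / (1 << frac_bits)

def fixed_point_distance(v1, v2):
    acc = 0.0
    for a, b in zip(v1, v2):
        a = to_fix(a, 7)                  # Fix 8_7 input
        b = to_fix(b, 7)
        diff = to_fix(a - b, 7)           # Fix 9_7 after the add/subtract
        sq = to_fix(diff * diff, 14)      # Fix 18_14 after the multiply
        acc = to_fix(acc + sq, 14)        # Fix 20_14 after the accumulate
    return math.sqrt(acc)                 # CORDIC sqrt not modeled here

v1 = [0.83812, 0.01964, 0.68128, 0.37948, 0.8318, 0.50281, 0.70947, 0.42889,
      0.30462, 0.18965, 0.19343, 0.68222, 0.30276, 0.54167, 0.15087, 0.6979,
      0.37837, 0.86001, 0.85366, 0.59356]
v2 = [0.49655, 0.89977, 0.82163, 0.64491, 0.81797, 0.66023, 0.34197, 0.28973,
      0.34119, 0.53408, 0.72711, 0.30929, 0.8385, 0.56807, 0.37041, 0.70274,
      0.54657, 0.44488, 0.69457, 0.62131]
print(round(fixed_point_distance(v1, v2), 4))   # differs slightly from 1.5058
```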
44. Ant Colony Actions: Movement
The CA RNG is a simple 32-bit Rule 30 generator that is user-initialized for reproducibility (a software sketch follows below).
(Block diagram: per-ant RNGs, RNG Ant(1) through RNG Ant(N), feed an Ant Move-Direction Filter; the current location, last location, and have-data status from the Ant Colony Data store drive an Ant Change Location block that produces the new location data.)
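A software sketch of a 32-bit Rule 30 cellular-automaton RNG of this kind. The output tap (the register's middle cell) and the circular boundary are assumptions about the hardware; only "32-bit Rule 30, user-initialized seed" comes from the slide.

```python
class Rule30RNG:
    WIDTH = 32

    def __init__(self, seed):
        if seed == 0:
            raise ValueError("seed must be nonzero")  # all-zero state never changes
        self.state = seed & ((1 << self.WIDTH) - 1)

    def step(self):
        """Advance the CA one generation: new = left XOR (center OR right)."""
        s = self.state
        mask = (1 << self.WIDTH) - 1
        left = ((s << 1) | (s >> (self.WIDTH - 1))) & mask
        right = ((s >> 1) | (s << (self.WIDTH - 1))) & mask
        self.state = left ^ (s | right)
        return (self.state >> (self.WIDTH // 2)) & 1   # tap one cell as output

    def bits(self, n):
        """Collect n pseudo-random bits, e.g. 3 bits for a move direction 0-7."""
        return [self.step() for _ in range(n)]

rng = Rule30RNG(seed=0xACE1)                          # user-initialized seed
direction = int("".join(map(str, rng.bits(3))), 2)    # one of 8 move directions
print(direction)
```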
45. Pheromone Trail Result from Hardware Co-simulation
A single ant is simulated for clarity; darker red marks the most recent position.
46. Ant Colony Actions: Object Load/Drop
(Block diagram: the current location, carried status, and object information feed a "Were probabilities and thresholds met?" check that enables drop/load (Y/N); an Ant Change Have-Data Status block combines the current have-data status with the Ant Colony Data store entries (current location, last location, have-data status) to produce the new have-data status.)
47. ACS DM Hardware Storage Requirements
- Preprocessed Data (Number of Objects × Vector Length, 8- to 32-bit fixed-point)
- Object Vectors
- Object Locations
- Object Status
- Parameter Values (16 × 32-bit fixed-point)
- Probabilities
- Thresholds
- Limits
- Max Distance (1 × 32-bit fixed-point)
- Groups (Number of Objects × Number of Groups × 1-bit, and 3 × Number of Groups × 8-bit)
- Members
- Means (Object Vector Length × 32-bit fixed-point)
- Locations
- Ant Locations and Have-Object Status (Number of Ants × 8-bit, plus 1-bit status)
48. (No transcript)
49. PCI Bridge
50. Block Diagram
- The Virtex-II Pro is the focal point.
- A Spartan acts as the bridge to PCI
- On Board Memory
- 32 MB SDRAM
- 2 MB SRAM
- 16 MB FLASH
- 128 MB DDR SDRAM
- 64 MB Compact Flash
- Ethernet
- RS232
- 4 AvBus Connectors
- 2 PMC Connectors
51. Conclusions / Future Work
- Continue to design the ACS Data Mining System
- Implement an improved memory manager
- Correct errors associated with round-off and the CORDIC Sqrt
- Implement the group clustering algorithm
- Optimize the PC/FPGA interfacing to create our own low-cost integrated system
- Our problems currently reside in the PCI interface design shipped with the Avnet development board. We are working hard to resolve this issue, but in the end we may have to consider another board. (Also shown in presentation P248.)
- We also need to improve the speed; 60 MHz is too slow.
- Optimize data throughput and the computational efficiency of the distance metric algorithm, i.e., consider a multi-stage pipeline or employ more look-up tables.
- The ultimate goal is to demonstrate the ability of ACS algorithms to perform as well as other well-known techniques while allowing for computational speed-up by utilizing FPGAs as co-processors.