Title: FPGA Implementation of Cellular Neural Networks: An initial study
- Dimitris Amanatidis
- k986134_at_kingston.ac.uk
- School of Mathematics
- Kingston University
2 Contents
- A brief reminder of Cellular Neural Networks (CNN)
- Field Programmable Gate Arrays (FPGA)
- Demonstration of a traditional Image Processing Algorithm using FPGA
- Current work: FPGA implementation of CNN
- Future work
3 Cellular Neural Networks
- Definition (Chua and Roska, 1997):
- A CNN is any spatial arrangement of locally-coupled cells, where each cell is a dynamical system which has an input, an output, and a state, evolving according to some prescribed dynamical laws.
4 CNN structure
5 CNN equations
$u_{ij}$, $x_{ij}$, $y_{ij}$ are the input, state and output variables. $a_{ij}$ and $b_{ij}$ represent the feedback and feedforward weighting coefficients. $z_{ij}$ is the threshold or bias.
Activation function (PWL-sigmoidal)
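The equations themselves did not survive on this slide. For reference, the standard CNN state equation and piecewise-linear output function that match the symbols defined above are usually written as below. This is a reconstruction from the general CNN literature, not text recovered from the slide; the neighbourhood $N_r(i,j)$ and the neighbour indices $k,l$ on the template weights are notation assumed here.

```latex
\dot{x}_{ij} = -x_{ij}
  + \sum_{(k,l) \in N_r(i,j)} a_{kl}\, y_{kl}
  + \sum_{(k,l) \in N_r(i,j)} b_{kl}\, u_{kl}
  + z_{ij}

y_{ij} = f(x_{ij}) = \tfrac{1}{2}\left( \lvert x_{ij} + 1 \rvert - \lvert x_{ij} - 1 \rvert \right)
```

The output function is linear for $|x_{ij}| < 1$ and saturates at $\pm 1$ outside that range, which is what "PWL-sigmoidal" refers to.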
6 A generic CNN iteration
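The content of this slide was not recovered. As a rough illustration only, one CNN iteration can be sketched in software as a single forward-Euler update of every cell (Euler is the method the Future work slide says is currently used). The function names, the step size `h`, the 3x3 template size and the zero-valued boundary are all assumptions made for this sketch, not details from the slides.

```python
def pwl(x):
    """Piecewise-linear sigmoid: linear on [-1, 1], saturating at -1 and +1."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def cnn_step(x, u, A, B, z, h=0.1):
    """Return the cell states after one forward-Euler step.

    x, u : 2-D lists of equal shape (state and input)
    A, B : 3x3 feedback and feedforward templates
    z    : scalar bias
    h    : Euler step size (assumed value)
    Neighbours outside the grid contribute nothing (assumed boundary rule).
    """
    rows, cols = len(x), len(x[0])
    y = [[pwl(v) for v in row] for row in x]   # outputs from current states
    new_x = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = z
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    k, l = i + di, j + dj
                    if 0 <= k < rows and 0 <= l < cols:
                        acc += A[di + 1][dj + 1] * y[k][l]
                        acc += B[di + 1][dj + 1] * u[k][l]
            # dx/dt = -x + (feedback + feedforward + bias)
            new_x[i][j] = x[i][j] + h * (-x[i][j] + acc)
    return new_x
```

In software the two outer loops run sequentially. The appeal of an FPGA is that the per-cell updates are independent within one step and can in principle be computed in parallel.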
7 What the papers say
- An ultra-fast programming language promises radical changes to computing. (Computer Business Review, May 1999)
- A new technology which threatens to blur the line between hardware and software permanently. (ExeOnLine, February 2000)
- ESL to close a round of funding worth about 8 million at the end of the year. (EETimesOnLine, November 1999)
- Ian Page's breakthrough in software technology may revolutionise computing and telecoms. (Real Business, May 1999)
8 Reconfigurable Computers
- A major challenge for both hardware and software engineers and for scientists.
- Computing systems whose hardware architecture can be modified by software to suit the application at hand.
- Low-cost reconfigurable computer cards that can be hosted on any PC are especially beneficial to academic users.
9 Field Programmable Gate Arrays
- The core component of reconfigurable computers.
- A silicon chip enabling a significant amount of digital hardware to be created and wired together under software control in a matter of milliseconds.
- Remarkable performance gains can be achieved by placing an algorithm in an FPGA for embedded applications, compared with using a microprocessor.
10 Application Areas
- Real Time Image Processing
- Real Time Control Systems
- Custom Processors
- Network Protocols
- Ethernet Controllers
- TCP/IP
- Video Games
11 Altera FPGA
The large square chip is an Altera FPGA. Its nominal capacity is 50,000 logic gates, 3,184 latches and 2.5 kbytes of RAM. The smaller square chip connects the FPGA to the PCI bus. Two of the connectors on the right are 100 Mbit/s DS-Links (IEEE 1355).
12 Hardware Compilation
- The second key enabling technology for reconfigurable computing.
- The innovation of hardware compilation (HC) is to make smart compilation tools available which allow computer programs to be turned automatically into hardware designs.
- Enables programmers as well as electronic engineers to produce complex hardware systems.
13 Handel-C
- Clear, familiar syntax
  - small and readable code
- Explicit parallelism
  - par statement
  - simultaneous assignment
- Well-defined timing
  - fast external I/O
  - simplifies pipelines
- Easy to simulate
  - no need to use real hardware
14 Example of Handel-C code
    while (1)   // Accept and process the next word as fast as possible
    {
        par { counter = _FLUSH_words; ch_in ? in; }
        // Read packet, last word indicated by bit 32 set.
        // Pad or trim to _FLUSH_words and output the result.
        while (!in[32] || counter != 0)   // Ordinary C loop
        {
            par
            {
                if (counter != 0)
                    par   // Explicit parallelism
                    {
                        counter--;
                        send_buffered(ch_out, in <- 32);   // Buffered output macro
                    }
                if (!in[32])
                    ch_in ? in;   // Input from channel
                else
                    in = 1 << 32;
            }
        }
    }
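To make the "pad or trim" behaviour described in the comments easier to follow, here is a sequential Python model of one packet passing through the loop. It is a sketch of one reading of the slide, not a translation checked against real hardware. It assumes that the loop continues while the packet is unfinished or words remain to be sent, that padding words are zero, and that the input always contains a word with bit 32 set. The names `flush_packet`, `LAST` and `MASK32` are made up for this example.

```python
LAST = 1 << 32           # bit 32 marks the final word of a packet
MASK32 = (1 << 32) - 1   # keeps the low 32 bits of a word


def flush_packet(words, flush_words):
    """Return exactly flush_words 32-bit words, padding or trimming the packet."""
    stream = iter(words)
    word = next(stream)          # first word of the packet
    counter = flush_words
    out = []
    while not (word & LAST) or counter != 0:
        # Both branches below read the value `word` had at the start of the
        # iteration, mirroring the parallel branches in the slide.
        if counter != 0:
            out.append(word & MASK32)   # models the buffered send
            counter -= 1
        # Keep reading until the last word has been seen, then feed padding.
        word = next(stream) if not (word & LAST) else LAST
    return out
```

For example, under these assumptions `flush_packet([1, 2, 3 | LAST], 5)` pads the three-word packet with two zero words, while a four-word packet flushed to length 2 is cut after its second word.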
15 Comparison of Performance
- Conway's Game of Life, compiled in Handel-C and run on a 25 MHz FPGA device, can outperform a conventional competing program on a 300 MHz Pentium by a factor of 16.
- A logarithmic Greatest Common Divisor algorithm was run 10 million times for the same two 32-bit numbers. On the Pentium the run took 2,100 million clock cycles, whereas on a 10 MHz FPGA chip it took 320 million.
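The GCD figures are stated in clock cycles rather than elapsed time, so it can help to work the numbers through. The short calculation below uses only the values on the slide, plus one assumption that is marked in the code: that the Pentium in the GCD test is the same 300 MHz part mentioned for the Game of Life, which the slide does not say.

```python
RUNS = 10_000_000                 # GCD evaluations (from the slide)
PENTIUM_CYCLES = 2_100_000_000    # total Pentium clock cycles (from the slide)
FPGA_CYCLES = 320_000_000         # total FPGA clock cycles (from the slide)
FPGA_HZ = 10_000_000              # 10 MHz FPGA clock (from the slide)
PENTIUM_HZ = 300_000_000          # ASSUMPTION: 300 MHz, not stated for this test

cycles_per_run_pentium = PENTIUM_CYCLES / RUNS   # cycles per GCD on the Pentium
cycles_per_run_fpga = FPGA_CYCLES / RUNS         # cycles per GCD on the FPGA
cycle_ratio = PENTIUM_CYCLES / FPGA_CYCLES       # how many times fewer cycles

fpga_seconds = FPGA_CYCLES / FPGA_HZ             # elapsed time on the FPGA
pentium_seconds = PENTIUM_CYCLES / PENTIUM_HZ    # elapsed time, under the assumption

print(cycles_per_run_pentium, cycles_per_run_fpga, cycle_ratio)
print(fpga_seconds, pentium_seconds)
```

The FPGA needs 32 cycles per GCD against 210 on the Pentium, roughly 6.6 times fewer. If the 300 MHz assumption holds, the advantage in this test is in cycle count (and so in clock rate and power) rather than in wall-clock time.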
16 Traditional Edge Detection
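The slide does not say which traditional algorithm was demonstrated. Purely as an illustration of the kind of fixed-kernel method usually meant by the term, here is a minimal Sobel edge detector. The choice of Sobel, the |gx| + |gy| magnitude approximation, the threshold parameter and the untouched one-pixel border are all assumptions of this sketch.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient kernel
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient kernel


def sobel_edges(img, threshold):
    """Return a 0/1 edge map for a 2-D list of grey levels.

    Border pixels are left at 0 because the 3x3 window does not fit there.
    """
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            gx = gy = 0
            for di in range(3):
                for dj in range(3):
                    p = img[i + di - 1][j + dj - 1]
                    gx += SOBEL_X[di][dj] * p
                    gy += SOBEL_Y[di][dj] * p
            # |gx| + |gy| avoids a square root, which suits simple hardware
            out[i][j] = 1 if abs(gx) + abs(gy) > threshold else 0
    return out
```

Changing what this detector does means changing the code, in contrast to the template-driven CNN approach.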
17 CNN edge detection
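The templates used on this slide were not recovered. The sketch below shows how a CNN can extract edges from a binary image by iterating the state equation with forward Euler until the outputs settle. The template values are an assumption (an edge template of the kind found in the CNN literature), as are the coding of black as +1 and white as -1, the white boundary, the zero initial state, and the step size and iteration count.

```python
def pwl(x):
    """Piecewise-linear sigmoid output function."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


# Assumed edge template: self-feedback only, Laplacian-like feedforward.
EDGE_A = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
EDGE_B = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]
EDGE_Z = -1.0


def run_cnn(u, A, B, z, steps=100, h=0.1, boundary=-1.0):
    """Iterate a 3x3-template CNN from x = 0 and return the +1/-1 output image."""
    rows, cols = len(u), len(u[0])

    def at(grid, k, l):
        # Cells outside the grid take a fixed boundary value (white here).
        return grid[k][l] if 0 <= k < rows and 0 <= l < cols else boundary

    x = [[0.0] * cols for _ in range(rows)]
    for _ in range(steps):
        y = [[pwl(v) for v in row] for row in x]
        nxt = [[0.0] * cols for _ in range(rows)]
        for i in range(rows):
            for j in range(cols):
                acc = z
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        acc += A[di + 1][dj + 1] * at(y, i + di, j + dj)
                        acc += B[di + 1][dj + 1] * at(u, i + di, j + dj)
                nxt[i][j] = x[i][j] + h * (-x[i][j] + acc)
        x = nxt
    return [[1 if pwl(v) > 0 else -1 for v in row] for row in x]


# A 5x5 black square on a 7x7 white background.
square = [[1 if 1 <= i <= 5 and 1 <= j <= 5 else -1 for j in range(7)]
          for i in range(7)]
edges = run_cnn(square, EDGE_A, EDGE_B, EDGE_Z)
```

With these values a black pixel stays black only if at least one of its eight neighbours is white, so the square is reduced to its one-pixel outline.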
18 Erosion-Dilation CNN
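The slide's templates were not recovered, so the following is only a sketch of how binary erosion and dilation can be expressed with CNN templates. It assumes a zero feedback template, a cross-shaped feedforward template, bias values of -4 and +4, black coded as +1 and white as -1, and a white boundary. With zero feedback the state settles at the feedforward sum plus bias, so the sketch computes that settled output directly instead of iterating.

```python
def pwl(x):
    """Piecewise-linear sigmoid output function."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


# Assumed cross-shaped structuring element used as the feedforward template.
CROSS = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]


def morph(u, z, B=CROSS, boundary=-1.0):
    """Settled output of a CNN with zero feedback: y = pwl(sum(B * u) + z)."""
    rows, cols = len(u), len(u[0])
    out = [[-1] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = z
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    k, l = i + di, j + dj
                    inside = 0 <= k < rows and 0 <= l < cols
                    acc += B[di + 1][dj + 1] * (u[k][l] if inside else boundary)
            out[i][j] = 1 if pwl(acc) > 0 else -1
    return out


def erode(u):
    return morph(u, -4.0)   # black only if all five cross pixels are black


def dilate(u):
    return morph(u, 4.0)    # black if any of the five cross pixels is black


# A 3x3 black square on a 5x5 white background.
square = [[1 if 1 <= i <= 3 and 1 <= j <= 3 else -1 for j in range(5)]
          for i in range(5)]
eroded = erode(square)
dilated = dilate(square)
```

Erosion and dilation here share the same code and the same feedforward template, and differ only in the bias, which illustrates the "use different template set" point made on the Benefits slide.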
19 Benefits
- CNN vs traditional image processing
  - real-time output
  - can be tuned to produce the desired result
  - code does not need to change: use a different template set
- FPGA vs microprocessor implementation
  - parallelism and the absence of software layers lead to better performance
  - reconfigurability
  - slower clock speeds, lower power consumption, suited to mobile systems
- Handel-C vs hardware description languages
  - code is much cleaner, more easily reused, and maintainable
  - hardware design expertise not required for core logic
20 Future work
- Improve algorithm to reduce clock cycles and/or amount of hardware
- Use of a 4th order Runge-Kutta method instead of Euler
- Target hardware (100k gates PCI card less than 10)
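To illustrate the proposed move from Euler to 4th order Runge-Kutta, the sketch below applies both methods to a single isolated cell, dx/dt = -x + a f(x) + c, where a is the cell's self-feedback weight and c stands for the constant feedforward-plus-bias term. The single-cell simplification, the parameter names and the step size are assumptions chosen so the result can be compared with a known exact solution.

```python
import math


def pwl(x):
    """Piecewise-linear sigmoid output function."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def deriv(x, a, c):
    """Right-hand side of the single-cell state equation."""
    return -x + a * pwl(x) + c


def euler_step(x, h, a, c):
    """Forward Euler: one derivative evaluation per step."""
    return x + h * deriv(x, a, c)


def rk4_step(x, h, a, c):
    """Classical 4th order Runge-Kutta: four derivative evaluations per step."""
    k1 = deriv(x, a, c)
    k2 = deriv(x + 0.5 * h * k1, a, c)
    k3 = deriv(x + 0.5 * h * k2, a, c)
    k4 = deriv(x + h * k3, a, c)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


# With a = 0 the exact solution from x(0) = 0 is x(t) = c * (1 - exp(-t)).
a, c, h, steps = 0.0, 0.5, 0.1, 10
x_euler = x_rk4 = 0.0
for _ in range(steps):
    x_euler = euler_step(x_euler, h, a, c)
    x_rk4 = rk4_step(x_rk4, h, a, c)

exact = c * (1.0 - math.exp(-h * steps))
euler_error = abs(x_euler - exact)
rk4_error = abs(x_rk4 - exact)
```

RK4 is far more accurate at the same step size, or allows a larger step for the same accuracy, but each step costs four derivative evaluations instead of one. On an FPGA that trade-off interacts directly with the first goal above, reducing clock cycles and hardware.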