FPGA Implementation of Cellular Neural Networks: An initial study


1
FPGA Implementation of Cellular Neural Networks
An initial study
  • Dimitris Amanatidis
  • k986134_at_kingston.ac.uk
  • School of Mathematics
  • Kingston University

2
Contents
  • A brief reminder of Cellular Neural Networks
    (CNN)
  • Field Programmable Gate Arrays (FPGA)
  • Demonstration of a traditional Image Processing
    Algorithm using FPGA
  • Current work - FPGA implementation of CNN
  • Future work

3
Cellular Neural Networks
  • Definition (Chua & Roska, 1997)
  • A CNN is any spatial arrangement of
    locally-coupled cells, where each cell is a
    dynamical system which has an input, an output,
    and a state, evolving according to some
    prescribed dynamical laws.

4
CNN structure
5
CNN equations
$u_{ij}$, $x_{ij}$ and $y_{ij}$ are the input, state and output
variables. $a_{ij}$ and $b_{ij}$ are the feedback and
feedforward weighting coefficients. $z_{ij}$ is the
threshold or bias.
Activation function (PWL-sigmoidal)
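The equations on this slide were images and did not survive in the transcript. As a hedged reconstruction, the standard Chua-Yang CNN model can be written with the symbols named above. The neighbourhood $N_r(i,j)$ and the neighbour indices $k, l$ are notation assumed here, not taken from the slide.

```latex
% State equation of cell (i,j): leak term, feedback, feedforward, bias
\dot{x}_{ij} = -x_{ij}
  + \sum_{(k,l) \in N_r(i,j)} a_{kl}\, y_{kl}
  + \sum_{(k,l) \in N_r(i,j)} b_{kl}\, u_{kl}
  + z_{ij}

% Piecewise-linear (PWL) sigmoidal activation: linear on [-1, 1], saturating outside
y_{ij} = f(x_{ij}) = \tfrac{1}{2}\left( \lvert x_{ij} + 1 \rvert - \lvert x_{ij} - 1 \rvert \right)
```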
6
A generic CNN iteration
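The diagram for this slide is not in the transcript. As a rough sketch only, one iteration can be modelled in software as a forward-Euler update of every cell state (the Future work slide indicates Euler is the current integration method). The function names, the 3x3 template size, the zero boundary condition and the step size `h` are all assumptions made for illustration, not details from the slides.

```python
def pwl(x):
    """Piecewise-linear sigmoid: y = 0.5 * (|x + 1| - |x - 1|)."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def cnn_euler_step(x, u, A, B, z, h):
    """One forward-Euler update of all cell states.

    x, u : 2-D lists holding the state and input of each cell
    A, B : 3x3 feedback and feedforward templates (assumed size)
    z    : bias, h : Euler step size (assumed parameters)
    """
    rows, cols = len(x), len(x[0])
    # Outputs are computed once from the current states.
    y = [[pwl(v) for v in row] for row in x]
    new_x = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = z
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    k, l = i + di, j + dj
                    # Assumed boundary rule: cells outside the grid contribute zero.
                    if 0 <= k < rows and 0 <= l < cols:
                        acc += A[di + 1][dj + 1] * y[k][l]
                        acc += B[di + 1][dj + 1] * u[k][l]
            # dx/dt = -x + feedback + feedforward + bias
            new_x[i][j] = x[i][j] + h * (-x[i][j] + acc)
    return new_x
```

Because the templates are plain arguments, switching to a different operation means passing a different `A`, `B` and `z`; the update code itself stays the same.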
7
What the papers say
  • An ultra-fast programming language promises
    radical changes to computing. Computer Business
    Review, May 1999
  • A new technology which threatens to blur the line
    between hardware and software permanently.
    ExeOnLine, February 2000
  • ESL to close a round of funding worth about 8
    million at the end of the year. EETimesOnLine,
    November 1999
  • Ian Page's breakthrough in software technology
    may revolutionise computing and telecoms. Real
    Business, May 1999

8
Reconfigurable Computers
  • A major challenge for both hardware and software
    engineers and for scientists.
  • Computing systems whose hardware architecture can
    be modified by software to suit the application
    at hand.
  • Low-cost reconfigurable computer cards that can
    be hosted on any PC are especially beneficial to
    academic users.

9
Field Programmable Gate Arrays
  • The core component of reconfigurable computers.
  • A silicon chip enabling a significant amount of
    digital hardware to be created and wired together
    under software control in a matter of
    milliseconds.
  • Remarkable performance gains are achieved by
    placing an algorithm in an FPGA for embedded
    applications, compared with using a
    microprocessor.

10
Application Areas
  • Real Time Image Processing
  • Real Time Control Systems
  • Custom Processors
  • Network Protocols
  • Ethernet Controllers
  • TCP/IP
  • Video Games

11
Altera FPGA
The large square chip is an Altera FPGA. Its
nominal capacity is 50,000 logic gates, 3,184
latches and 2.5 kbytes of RAM. The smaller square
chip connects the FPGA to the PCI bus. Two of
the connectors on the right are 100 Mbit/s DS
Links (IEEE 1355).
12
Hardware Compilation
  • The second key enabling technology for
    reconfigurable computing.
  • The innovation of HC is to make smart compilation
    tools available which allow computer programs to
    be turned automatically into hardware designs.
  • Enables programmers as well as electronic
    engineers to produce complex hardware systems.

13
Handel-C
  • Clear, familiar syntax
    • small and readable code
  • Explicit parallelism
    • par statement
    • simultaneous assignment
  • Well-defined timing
    • fast external I/O
    • simplifies pipelines
  • Easy to simulate
    • no need to use real hardware

14
Example of Handel-C code
while (1)
{
    // Accept and process the next word as fast as possible
    par
    {
        counter = _FLUSH_words;
        ch_in ? in;
    }
    // Read packet, last word indicated by bit 32 set.
    // Pad or trim to _FLUSH_words and output the result.
    while (!in[32] || counter != 0)              // Ordinary C loop
    {
        par
        {
            if (counter != 0)
            {
                par                              // Explicit parallelism
                {
                    counter--;
                    send_buffered (ch_out, in<-32);  // Buffered output macro
                }
            }
            if (!in[32])
                ch_in ? in;                      // Input from channel
            else
                in = 1 << 32;
        }
    }
}
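
According to its comments, the fragment above pads or trims each incoming packet to _FLUSH_words words. A sequential Python model of that behaviour may make the control flow easier to follow. It assumes the loop runs until the last-word flag (bit 32) has been seen and the counter has reached zero. The names `flush_packet`, `LAST` and `MASK32` are hypothetical, and the parallel branches of the hardware are modelled here as consecutive statements that both use the old value of the word.

```python
LAST = 1 << 32            # bit 32 marks the final word of a packet
MASK32 = (1 << 32) - 1    # low 32 data bits of a 33-bit word


def flush_packet(words, flush_words):
    """Emit exactly flush_words data words for one packet (software model)."""
    source = iter(words)
    counter = flush_words
    word = next(source)                  # read the first word
    out = []
    while not (word & LAST) or counter != 0:
        if counter != 0:
            counter -= 1
            out.append(word & MASK32)    # send the 32 data bits
        if not (word & LAST):
            word = next(source)          # more input to read
        else:
            word = LAST                  # pad: zero data, flag still set
    return out
```

For example, a three-word packet with `flush_words` set to 5 yields the three data words followed by two zero words, while a four-word packet with `flush_words` set to 2 yields only its first two words.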

15
Comparison of Performance
  • Conway's Game of Life, compiled in Handel-C and
    run on a 25 MHz FPGA device, can outperform a
    conventional competing program on a 300 MHz
    Pentium by a factor of 16.
  • A logarithmic Greatest Common Divisor algorithm
    was run 10 million times for the same two 32-bit
    numbers. On the Pentium this took 2,100 million
    clock cycles, whereas on a 10 MHz FPGA chip it
    took 320 million.
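
The figures above can be unpacked with a little arithmetic. The sketch below derives cycles per GCD call and the implied run times. The slide does not give the Pentium clock for the GCD test, so the 300 MHz value is an assumption carried over from the Game of Life bullet and is marked as such in the code.

```python
GCD_CALLS = 10_000_000
PENTIUM_CYCLES = 2_100_000_000
FPGA_CYCLES = 320_000_000

# Cycle counts per call, taken directly from the slide's totals.
pentium_cycles_per_call = PENTIUM_CYCLES / GCD_CALLS   # 210.0
fpga_cycles_per_call = FPGA_CYCLES / GCD_CALLS         # 32.0
cycle_ratio = PENTIUM_CYCLES / FPGA_CYCLES             # 6.5625

# Wall-clock time = cycles / clock frequency.
FPGA_CLOCK_HZ = 10_000_000                             # stated on the slide
PENTIUM_CLOCK_HZ = 300_000_000                         # ASSUMPTION: not stated for this test
fpga_seconds = FPGA_CYCLES / FPGA_CLOCK_HZ             # 32.0
pentium_seconds = PENTIUM_CYCLES / PENTIUM_CLOCK_HZ    # 7.0

# Game of Life: 16x faster with a 12x slower clock implies
# 16 * 12 = 192x fewer clock cycles for the same work.
life_cycle_advantage = 16 * (300_000_000 / 25_000_000) # 192.0
```

Read this way, the GCD result is a cycle-count advantage of about 6.6 times. Under the assumed 300 MHz Pentium clock, the 10 MHz FPGA would still take longer in wall-clock terms, so the per-cycle efficiency is the point of the comparison.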

16
Traditional Edge Detection
17
CNN edge detection
18
Erosion-Dilation CNN
19
Benefits
  • CNN vs traditional image processing
    • real-time output
    • can be tuned to produce the desired result
    • code does not need to change; use a different
      template set
  • FPGA vs microprocessor implementation
    • parallelism and the absence of software layers
      lead to better performance
    • reconfigurability
    • slower clock speeds, lower power consumption,
      suited to mobile systems
  • Handel-C vs hardware description languages
    • code is much cleaner, more easily reused, and
      more maintainable
    • hardware design expertise not required for core
      logic

20
Future work
  • Improve algorithm to reduce clock cycles and/or
    amount of hardware
  • Use of a 4th order Runge-Kutta method instead of
    Euler
  • Target hardware (100k gates PCI card less than
    10)
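
To illustrate the trade-off behind the Runge-Kutta item, here is a minimal sketch comparing one forward-Euler step with one classical 4th-order Runge-Kutta step. The test problem dx/dt = -x (an isolated cell with zero templates and zero bias) and the names `euler_step`, `rk4_step` and `decay` are chosen here for illustration and do not come from the slides.

```python
import math


def euler_step(f, x, h):
    """Forward Euler: one derivative evaluation per step."""
    return x + h * f(x)


def rk4_step(f, x, h):
    """Classical 4th-order Runge-Kutta: four derivative evaluations per step."""
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def decay(x):
    """Illustrative right-hand side: dx/dt = -x."""
    return -x


exact = math.exp(-0.1)                       # true solution after t = 0.1
euler_error = abs(euler_step(decay, 1.0, 0.1) - exact)
rk4_error = abs(rk4_step(decay, 1.0, 0.1) - exact)
```

Runge-Kutta is far more accurate per step but needs four derivative evaluations instead of one, which on an FPGA would cost either more clock cycles or more hardware. That sits in tension with the first item on this slide.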