Title: Neural Network Architectures
1. Neural Network Architectures
02 December 2004
ulasmehm_at_boun.edu.tr
2. Outline of Presentation
- Introduction
- Neural Networks
- Neural Network Architectures
- Conclusions
3. Introduction
- Some numbers
- The human brain contains about 10 billion nerve cells (neurons)
- Each neuron is connected to the others through about 10,000 synapses
- The brain as a computational unit
- It can learn and reorganize from experience
- It adapts to the environment
- It is robust and fault tolerant
- Fast computation with many simple computational units
4. Introduction
- Taking nature as a model: consider the neuron as a processing element (PE)
- A neuron has
- Inputs (dendrites)
- An output (the axon)
- Information flows from the dendrites to the axon via the cell body
- The axon connects to dendrites via synapses
- The strength of synapses can change
- Synapses may be excitatory or inhibitory
5. Perceptron (Artificial Neuron)
- Definition: a nonlinear, parameterized function with a restricted output range
6. Activation Functions
Linear
Sigmoid
Hyperbolic tangent
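A minimal sketch of such a perceptron in Python, using the three activations listed above (the weights and bias in any call are illustrative, not values from the deck):

```python
import math

def linear(v):
    return v

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def tanh(v):
    return math.tanh(v)

def perceptron(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of inputs plus bias, passed through an activation."""
    v = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(v)
```

The restricted output range comes from the activation: sigmoid maps into (0, 1), tanh into (-1, 1), while the linear activation leaves the range unbounded.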
7. Neural Networks
- A mathematical model used to solve engineering problems
- A group of highly connected neurons realizing compositions of nonlinear functions
- Tasks
- Classification
- Clustering
- Regression
- According to input flow
- Feed-forward neural networks
- Recurrent neural networks
8. Feed-Forward Neural Networks
- Information is propagated from the inputs to the outputs
- Time plays no role (acyclic: no feedback from outputs to inputs)
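The acyclic input-to-output flow can be sketched as a single pass over successive layers; the layer representation below (a list of per-neuron weight/bias pairs) is an illustrative assumption:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward(x, layers):
    """Propagate inputs through successive layers; no feedback loops.
    Each layer is a list of (weights, bias) pairs, one per neuron."""
    for layer in layers:
        x = [sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
             for weights, bias in layer]
    return x
```

Because the graph is acyclic, one such pass fully determines the outputs; a recurrent network would instead have to iterate until its state settles.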
9. Recurrent Networks
- Arbitrary topologies
- Can model systems with internal states (dynamic systems)
- Delays can be modeled
- More difficult to train
- Performance can be problematic
- Stable outputs may be more difficult to evaluate
- Unexpected behavior (oscillation, chaos, ...)
10. Learning
- The procedure of estimating the parameters of the neurons (setting the weights) so that the whole network can perform a specific task
- Two types of learning
- Supervised learning
- Unsupervised learning
- The learning process (supervised)
- Present the network with a number of inputs and their corresponding outputs (training)
- See how closely the actual outputs match the desired ones
- Modify the parameters to better approximate the desired outputs
- Make several passes over the data
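The steps above (present examples, compare, adjust, repeat) can be sketched with the classic perceptron delta rule; the rule and the AND task are illustrative assumptions, not the deck's specific method:

```python
def train(samples, n_inputs, lr=0.1, epochs=50):
    """Supervised learning sketch: nudge weights toward desired outputs."""
    w = [0.0] * n_inputs
    b = 0.0
    for _ in range(epochs):                 # several passes over the data
        for x, desired in samples:          # present inputs and targets
            s = sum(wi * xi for wi, xi in zip(w, x)) + b
            actual = 1.0 if s > 0 else 0.0  # actual output
            err = desired - actual          # compare with the desired one
            # modify the parameters to better approximate the target
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b
```

For a linearly separable task such as AND, this loop settles on weights that reproduce every training example.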
11. Supervised Learning
- The desired outputs of the model for the given inputs are known in advance; the network's task is to approximate those outputs
- A supervisor provides examples and teaches the neural network how to fulfill a certain task
12. Unsupervised Learning
- Group typical input data according to some function
- Data clustering
- No need for a supervisor
- The network itself finds the correlations in the data
- Examples
- Kohonen feature maps (SOM)
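A minimal sketch of unsupervised, competitive learning in the Kohonen style: no supervisor, the unit closest to each input wins and moves toward it. The 1-D map, learning rate, and data are illustrative assumptions:

```python
import random

def train_som(data, n_units=2, lr=0.3, epochs=20, seed=0):
    """Competitive learning: units drift toward the inputs they win."""
    rng = random.Random(seed)
    units = [rng.random() for _ in range(n_units)]  # one 1-D weight per unit
    for _ in range(epochs):
        for x in data:
            # competition: the closest unit wins this input
            win = min(range(n_units), key=lambda i: abs(units[i] - x))
            # adaptation: the winner moves toward the input
            units[win] += lr * (x - units[win])
    return sorted(units)
```

With inputs grouped near 0 and near 1, the two units end up near the two cluster centers, which is exactly the "group typical input data" behavior described above. A full SOM would also pull the winner's neighbors along; that neighborhood term is omitted here for brevity.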
13. Properties of Neural Networks
- Supervised networks are universal approximators (non-recurrent networks)
- Can act as
- A linear approximator (linear perceptron)
- A nonlinear approximator (multi-layer perceptron)
14. Other Properties
- Adaptivity
- Weights adapt easily to the environment
- Ability to generalize
- Can compensate for a lack of data
- Fault tolerance
- Performance degrades little if the network is damaged: the information is distributed across the entire net
15. An Example: Regression
16. Example: Classification
- Handwritten digit recognition
- 16x16 bitmap representation
- Converted to a 1x256 bit vector
- 7500 points in the training set
- 3500 points in the test set
0000000001100000
0000000110100000
0000000100000000
0000001000000000
0000010000000000
0000100000000000
0000100000000000
0000100000000000
0000100000000000
0001000111110000
0001011000011000
0001100000001000
0001100000001000
0001000000001000
0000100000010000
0000011111110000
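The conversion from a 16x16 bitmap like the one above to the 1x256 input vector can be sketched directly (the helper name is hypothetical):

```python
def bitmap_to_vector(rows):
    """rows: 16 strings of 16 '0'/'1' characters -> flat list of 256 ints."""
    assert len(rows) == 16 and all(len(r) == 16 for r in rows)
    return [int(bit) for row in rows for bit in row]
```

Each training point is then one such 256-element vector paired with its digit label.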
17. Training
- Try to minimize an error (cost) function
- Backpropagation algorithm
- Gradient descent
- Learns the weights of the network
- Updates the weights according to the error function
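The core update is gradient descent on the cost: w <- w - lr * dE/dw. A minimal sketch for a single linear neuron with a squared-error cost follows (illustrative; full backpropagation additionally chains the gradient back through hidden layers):

```python
def gradient_step(w, samples, lr=0.1):
    """One gradient-descent update of weights w on E = sum((out - t)^2) / 2."""
    grad = [0.0] * len(w)
    for x, target in samples:
        out = sum(wi * xi for wi, xi in zip(w, x))
        err = out - target            # dE/dout for the squared-error cost
        for i, xi in enumerate(x):
            grad[i] += err * xi       # chain rule: dE/dw_i = err * x_i
    return [wi - lr * gi for wi, gi in zip(w, grad)]
```

Repeating this step shrinks the error each pass; the learning rate `lr` trades off speed against stability.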
18. Applications
- Handwritten digit recognition
- Face recognition
- Time-series prediction
- Process identification
- Process control
- Optical character recognition
- Etc.
19. Neural Networks
- Neural networks are statistical tools
- They adjust nonlinear functions to accomplish a task
- They need multiple, representative examples, but fewer than other methods
- Neural networks can model static (FF) and dynamic (RNN) tasks
- NNs are good classifiers, BUT
- Good representations of the data have to be formulated
- Training vectors must be statistically representative of the entire input space
- Using NNs requires a good comprehension of the problem
20. Implementation of Neural Networks
- Generic architectures (PCs, etc.)
- Specific neuro-hardware
- Dedicated circuits
21. Generic Architectures
- Conventional microprocessors
- Intel Pentium, PowerPC, etc.
- Advantages
- High performance (clock frequency, etc.)
- Cheap
- Software environment available (NN tools, etc.)
- Drawbacks
- Too generic; not optimized for very fast neural computations
22. Classification of Hardware
- NN hardware
- Neurochips
- Special purpose
- General purpose (Ni1000, L-Neuro)
- Neurocomputers
- Special purpose (CNAPS, Synapse)
- General purpose
23. Specific Neuro-Hardware Circuits
- Commercial chips: CNAPS, Synapse, etc.
- Advantages
- Closer to the neural applications
- High performance in terms of speed
- Drawbacks
- Not optimized for specific applications
- Availability
- Development tools
24. CNAPS
- SIMD
- One instruction sequencing and control unit
- Processor nodes (PNs)
- One-dimensional array (each node connects only to its right and left neighbors)
25. CNAPS 1064
26. CNAPS
27. Dedicated Circuits
- A system where the functionality is buried in the hardware
- For specific applications only; not changeable
- Advantages
- Optimized for a specific application
- Higher performance than the other systems
- Drawbacks
- High development costs in terms of time and money
28. What type of hardware should be used in dedicated circuits?
- Custom circuits
- ASIC (Application-Specific Integrated Circuit)
- Requires good knowledge of hardware design
- Fixed architecture, hardly changeable
- Often expensive
- Programmable logic
- Valuable for implementing real-time systems
- Flexibility
- Low development costs
- Lower performance compared to ASICs (frequency, etc.)
29. Programmable Logic
- Field Programmable Gate Arrays (FPGAs)
- Matrix of logic cells
- Programmable interconnections
- Additional features (internal memories, embedded resources like multipliers, etc.)
- Reconfigurability
- The configuration can be changed as many times as desired
30. Real-Time Systems
- Execution of applications with time constraints
- Hard real-time systems
- Example: the digital fly-by-wire control system of an aircraft. No lateness is accepted; people's lives depend on the correct working of the aircraft's control system.
- Soft real-time systems
- Example: a vending machine. Lower performance is acceptable, since missing a deadline is not catastrophic; it simply takes longer to serve one client.
31. Real-Time Systems
- Millisecond-scale real-time system
- Connectionist retina for image processing
- An artificial retina combining an image sensor with a parallel architecture
- Microsecond-scale real-time system
- Level 1 trigger in a HEP experiment
32. Connectionist Retina
- Integration of a neural network in an artificial retina
- Screen
- Matrix of active pixel sensors
- ADC
- 8-bit converter: 256 grey levels
- Processing architecture
- A parallel system on which neural networks are implemented
33. Maharadja Processing Architecture
- Micro-controller
- Generic architecture executing sequential code with low power consumption
- Memory
- 256 Kbytes shared between the processor, the PEs, and the input
- Stores the network parameters
- UNE (SIMD neural unit)
- Completely pipelined
- 16-bit internal data bus
- Processors that compute the neuron outputs
- A command bus manages all the different operators in the UNE
- Input/Output module
- Data acquisition and storage of intermediate results
(Block diagram: four memory banks M feeding UNE-0 through UNE-3, a sequencer driving the instruction bus, and an input/output unit.)
34. Level 1 Trigger in a HEP Experiment
- High Energy Physics (particle physics)
- Neural networks have provided interesting results as triggers in HEP
- Level 2, H1 experiment: 10-20 µs
- Level 1, Dirac experiment: 2 µs
- Particle recognition
- Tight timing constraints (in terms of latency and data throughput)
35. Neural Network Architecture
- Outputs: electrons, tau, hadrons, jets
- (Network diagram; layer sizes labeled 4, 64, and 128)
- Execution time: 500 ns, with data arriving every BC (25 ns)
- Weights coded in 16 bits; states coded in 8 bits
36. Very Fast Architecture
- 256 PEs
- Matrix of n x m elements
- Control unit
- I/O module
- TanH values are stored in LUTs
- One matrix row computes one neuron
- The result is fed back through the matrix to compute the output layer
(Block diagram: a 4x4 matrix of PEs; each row feeds an accumulator (ACC) and a TanH LUT; a control unit and an I/O module drive the array.)
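The TanH-in-LUT idea can be sketched in software: tabulate tanh once at a fixed input resolution, then each activation is a single table read instead of a transcendental computation. The table range and step below are illustrative assumptions, not the hardware's actual parameters:

```python
import math

STEP = 1.0 / 64                        # input resolution of the table
LIMIT = 4.0                            # |x| >= 4 saturates toward +/-1
TABLE = [math.tanh(i * STEP) for i in range(int(LIMIT / STEP) + 1)]

def lut_tanh(x):
    """Look-up-table tanh: index by magnitude, restore the sign."""
    idx = min(int(abs(x) / STEP), len(TABLE) - 1)  # clamp out-of-range inputs
    y = TABLE[idx]
    return -y if x < 0 else y
```

Exploiting the odd symmetry of tanh halves the table; accuracy is bounded by the step size, so finer steps trade memory for precision.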
37. PE Architecture
(Block diagram: 8-bit input data and 16-bit weights from a weight memory feed a multiplier and an accumulator; an address generator and a control module, driven by the command bus, sequence the data in and out.)
38. Neuro-Hardware Today
- Generic real-time applications
- Microprocessor technology (PCs, i.e. software) is sufficient to implement most neural applications in real time (ms or sometimes µs scale)
- This solution is cheap
- Very easy to manage
- Constrained real-time applications
- There remain specific applications where powerful computation is needed, e.g. particle physics
- There remain applications where other constraints have to be taken into consideration (power consumption, proximity of sensors, mixed integration, etc.)
39. Clustering
- Idea: combine the performance of different processors to perform massively parallel computations
(Diagram: computers linked by a high-speed connection.)
40. Clustering
- Advantages
- Takes advantage of the implicit parallelism of neural networks
- Uses systems already available (universities, labs, offices, etc.)
- High performance: faster training of a neural net
- Very cheap compared to dedicated hardware
41. Clustering
- Drawbacks
- Communication load: requires very fast links between computers
- Software environment for parallel processing
- Not possible for embedded applications
42. Hardware Implementations
- Most real-time applications do not need a dedicated hardware implementation
- Conventional architectures are generally appropriate
- Clustering of generic architectures can combine their performance
- Some specific applications require other solutions
- Strong timing constraints
- Technology permits the use of FPGAs
- Flexibility
- Massive parallelism is possible
- Other constraints (power consumption, etc.)
- Custom or programmable circuits
43. Questions?