1
Neural Network Architectures
  • Aydin Ulas

02 December 2004
ulasmehm@boun.edu.tr
2
Outline Of Presentation
  • Introduction
  • Neural Networks
  • Neural Network Architectures
  • Conclusions

3
Introduction
  • Some numbers
  • The human brain contains about 10 billion nerve
    cells (neurons)
  • Each neuron is connected to other neurons through
    about 10,000 synapses
  • Brain as a computational unit
  • It can learn and reorganize from experience
  • It adapts to the environment
  • It is robust and fault tolerant
  • Fast computation from a very large number of
    simple computational units

4
Introduction
  • Taking nature as a model: consider the neuron as
    a processing element (PE)
  • A neuron has
  • Inputs (dendrites)
  • An output (the axon)
  • Information flows from the dendrites to the axon
    via the cell body
  • The axon connects to dendrites via synapses
  • The strengths of synapses change
  • Synapses may be excitatory or inhibitory

5
Perceptron (Artificial Neuron)
  • Definition: a non-linear, parameterized function
    with a restricted output range
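
A minimal sketch of such a function (assuming a NumPy environment; the weights, bias, and inputs are illustrative, not from the slides):

```python
import numpy as np

def perceptron(x, w, b):
    """Non-linear parameterized function: a weighted sum of the
    inputs passed through a squashing activation, so the output
    range is restricted to (-1, 1)."""
    return np.tanh(np.dot(w, x) + b)

# Illustrative values only
x = np.array([0.5, -1.0, 2.0])   # inputs (dendrites)
w = np.array([0.2, 0.4, -0.1])   # synaptic weights (parameters)
print(perceptron(x, w, b=0.1))   # single output (axon)
```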

6
Activation Functions
  • Linear
  • Sigmoid
  • Hyperbolic tangent
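
A sketch of these three activations in their standard form (the slide's original formula images are not reproduced here):

```python
import numpy as np

def linear(a):
    return a                         # identity: unbounded output

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))  # output restricted to (0, 1)

def hyperbolic_tangent(a):
    return np.tanh(a)                # output restricted to (-1, 1)
```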
7
Neural Networks
  • A mathematical model for solving engineering
    problems
  • A group of highly connected neurons realizing
    compositions of non-linear functions
  • Tasks
  • Classification
  • Clustering
  • Regression
  • Classified according to information flow
  • Feed-forward neural networks
  • Recurrent neural networks

8
Feed Forward Neural Networks
  • Information is propagated from the inputs to the
    outputs
  • Time plays no role (acyclic: no feedback from
    outputs to inputs)
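
A minimal sketch of the idea (layer sizes and weights are illustrative): the output is a pure composition of non-linear functions of the inputs, with no feedback paths.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # input -> hidden
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)  # hidden -> output

def feed_forward(x):
    h = np.tanh(W1 @ x + b1)       # hidden layer
    return np.tanh(W2 @ h + b2)    # output layer; strictly one-way

print(feed_forward(np.array([1.0, 0.5, -0.2])))
```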

9
Recurrent Networks
  • Arbitrary topologies
  • Can model systems with internal states (dynamic
    ones)
  • Delays can be modeled
  • More difficult to train
  • Performance can be problematic
  • Stable outputs may be more difficult to obtain
  • Unexpected behavior (oscillation, chaos, etc.)

(Diagram: a recurrent network with inputs x1 and x2.)
10
Learning
  • The procedure of estimating the parameters of the
    neurons (setting up the weights) so that the whole
    network can perform a specific task
  • Two types of learning
  • Supervised learning
  • Unsupervised learning
  • The learning process (supervised), sketched in
    code below
  • Present the network with a number of inputs and
    their corresponding outputs (training)
  • See how closely the actual outputs match the
    desired ones
  • Modify the parameters to better approximate the
    desired outputs
  • Several passes over the data
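
A minimal sketch of this loop for a single sigmoid neuron on a toy problem (the data, learning rate, and epoch count are illustrative assumptions):

```python
import numpy as np

# Toy training set: inputs and their desired outputs (logical OR)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
Y = np.array([0.0, 1.0, 1.0, 1.0])

w, b, lr = np.zeros(2), 0.0, 0.5
for epoch in range(200):                # several passes over the data
    for x, y in zip(X, Y):
        out = 1.0 / (1.0 + np.exp(-(w @ x + b)))  # actual output
        err = y - out                   # compare with desired output
        w += lr * err * x               # modify the parameters to
        b += lr * err                   #   better approximate y
```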

11
Supervised Learning
  • The desired outputs of the model for the given
    inputs are known in advance; the network's task is
    to approximate them
  • A supervisor provides examples and teaches the
    neural network how to fulfill a certain task

12
Unsupervised learning
  • Group typical input data according to some
    function
  • Data clustering
  • No supervisor needed
  • The network itself finds correlations within the
    data
  • Examples
  • Kohonen feature maps (SOM), sketched below
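
A sketch of one SOM update step in its common form (grid size, learning rate, and neighbourhood radius are illustrative; this is a standard formulation rather than the presentation's own code):

```python
import numpy as np

def som_step(weights, x, lr=0.5, radius=1.0):
    """Find the best-matching unit (BMU) for input x and pull it,
    and its grid neighbours, towards x."""
    dists = np.linalg.norm(weights - x, axis=-1)
    bmu = np.unravel_index(dists.argmin(), dists.shape)
    for idx in np.ndindex(dists.shape):
        grid_d = np.linalg.norm(np.subtract(idx, bmu))
        h = np.exp(-grid_d**2 / (2 * radius**2))  # neighbourhood decay
        weights[idx] += lr * h * (x - weights[idx])
    return weights

grid = np.random.default_rng(0).random((5, 5, 3))  # 5x5 map, 3-D inputs
grid = som_step(grid, x=np.array([0.9, 0.1, 0.1]))
```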

13
Properties of Neural Networks
  • Supervised (non-recurrent) networks are universal
    approximators
  • Can act as
  • Linear approximators (linear perceptron)
  • Nonlinear approximators (multi-layer perceptron)

14
Other Properties
  • Adaptivity
  • Weights adapt easily to the environment
  • Ability to generalize
  • May compensate for a lack of data
  • Fault tolerance
  • Little degradation of performance if damaged,
    because the information is distributed within the
    entire net

15
An Example: Regression
16
Example: Classification
  • Handwritten digit recognition
  • 16x16 bitmap representation
  • Converted to a 1x256 bit vector (conversion
    sketched below)
  • 7500 points in the training set
  • 3500 points in the test set

0000000001100000
0000000110100000
0000000100000000
0000001000000000
0000010000000000
0000100000000000
0000100000000000
0000100000000000
0000100000000000
0001000111110000
0001011000011000
0001100000001000
0001100000001000
0001000000001000
0000100000010000
0000011111110000
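
A sketch of the bitmap-to-vector conversion (assuming the bitmap arrives as rows of '0'/'1' characters, as above; only a few rows are shown):

```python
import numpy as np

rows = [
    "0000000001100000",
    "0000000110100000",
    "0000000100000000",
    # ... the remaining rows of the 16x16 bitmap ...
]
bitmap = np.array([[int(c) for c in row] for row in rows])
vector = bitmap.reshape(-1)   # 16x16 bitmap -> 1x256 bit vector
print(vector.shape)
```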
17
Training
  • Try to minimize an error or cost function
  • Backpropagation algorithm
  • Gradient descent (one step sketched below)
  • Learn the weights of the network
  • Update the weights according to the error function
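
A minimal sketch of one gradient-descent step for a single tanh neuron with squared error (not the full backpropagation through hidden layers):

```python
import numpy as np

def sgd_step(w, x, y, lr=0.1):
    """One step of w <- w - lr * dE/dw for E = 1/2 * (out - y)^2."""
    out = np.tanh(w @ x)
    delta = (out - y) * (1.0 - out**2)   # error times tanh derivative
    return w - lr * delta * x

w = np.array([0.1, -0.2])
w = sgd_step(w, x=np.array([1.0, 0.5]), y=1.0)
print(w)
```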

18
Applications
  • Handwritten Digit Recognition
  • Face recognition
  • Time series prediction
  • Process identification
  • Process control
  • Optical character recognition
  • Etc.

19
Neural Networks
  • Neural networks are statistical tools
  • They adjust non-linear functions to accomplish a
    task
  • They need many representative examples, but fewer
    than some other methods
  • Neural networks can model static (FF) and dynamic
    (RNN) tasks
  • NNs are good classifiers BUT
  • Good representations of the data have to be
    formulated
  • Training vectors must be statistically
    representative of the entire input space
  • Using NNs requires a good comprehension of the
    problem

20
Implementation of Neural Networks
  • Generic architectures (PCs, etc.)
  • Specific Neuro-Hardware
  • Dedicated circuits

21
Generic architectures
  • Conventional microprocessors
  • Intel Pentium, PowerPC, etc.
  • Advantages
  • High performance (clock frequency, etc.)
  • Cheap
  • Software environment available (NN tools, etc.)
  • Drawbacks
  • Too generic; not optimized for very fast neural
    computations

22
Classification of Hardware
  • NN Hardware
    • Neurochips
      • Special purpose
      • General purpose (Ni1000, L-Neuro)
    • Neurocomputers
      • Special purpose (CNAPS, Synapse)
      • General purpose

23
Specific Neuro-hardware circuits
  • Commercial chips: CNAPS, Synapse, etc.
  • Advantages
  • Closer to neural applications
  • High performance in terms of speed
  • Drawbacks
  • Not optimized for specific applications
  • Limited availability
  • Limited development tools

24
CNAPS
  • SIMD architecture (see the sketch below)
  • One instruction sequencing and control unit
  • Processor nodes (PNs)
  • One-dimensional array (each PN connects only to
    its right and left neighbors)
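
A rough conceptual emulation of the SIMD idea on such an array (plain Python, not actual CNAPS code; the sizes are illustrative): every PN holds one neuron's weights, and all PNs execute the same multiply-accumulate on the value broadcast by the sequencer.

```python
import numpy as np

pn_weights = np.random.default_rng(0).normal(size=(8, 4))  # 8 PNs, 4 inputs

def simd_layer(x):
    acc = np.zeros(len(pn_weights))
    for j, xj in enumerate(x):        # control unit broadcasts x[j]
        acc += pn_weights[:, j] * xj  # every PN runs the same MAC
    return np.tanh(acc)               # then all apply the activation

print(simd_layer(np.array([1.0, -0.5, 0.2, 0.7])))
```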

25
CNAPS 1064
26
CNAPS
27
Dedicated circuits
  • A system where the functionality is buried in the
    hardware
  • For specific applications only; not changeable
  • Advantages
  • Optimized for a specific application
  • Higher performance than the other systems
  • Drawbacks
  • High development costs in terms of time and money

28
What type of hardware should be used in dedicated
circuits?
  • Custom circuits
  • ASICs (Application-Specific Integrated Circuits)
  • Require good knowledge of hardware design
  • Fixed architecture, hardly changeable
  • Often expensive
  • Programmable logic
  • Valuable for implementing real-time systems
  • Flexibility
  • Low development costs
  • Lower performance compared to ASICs (clock
    frequency, etc.)

29
Programmable logic
  • Field-Programmable Gate Arrays (FPGAs)
  • Matrix of logic cells
  • Programmable interconnections
  • Additional features (internal memories, embedded
    resources such as multipliers, etc.)
  • Reconfigurability
  • The configuration can be changed as many times as
    desired

30
Real Time Systems
  • Execution of applications with time constraints
  • Hard real-time systems
  • Digital fly-by-wire control system of an
    aircraft: no lateness is accepted. People's lives
    depend on the correct working of the aircraft's
    control system.
  • Soft real-time systems
  • Vending machine: lower performance due to
    lateness is acceptable; it is not catastrophic when
    deadlines are missed. It simply takes longer to
    serve one client.

31
Real Time Systems
  • ms-scale real-time system
  • Connectionist retina for image processing
  • Artificial retina combining an image sensor with
    a parallel architecture
  • µs-scale real-time system
  • Level 1 trigger in a HEP experiment

32
Connectionist Retina
  • Integration of a neural network in an artificial
    retina
  • Screen
  • Matrix of active pixel sensors
  • ADC
  • 8-bit A/D converter: 256 levels of grey
  • Processing architecture
  • Parallel system in which the neural networks are
    implemented

33
Maharadja Processing Architecture
  • Micro-controller
  • Generic architecture executing sequential code
    with low power consumption
  • Memory
  • 256 Kbytes shared between the processor, the PEs,
    and the input
  • Stores the network parameters
  • UNE (neural unit)
  • SIMD, fully pipelined, 16-bit internal data bus
  • Processors that compute the neuron outputs
  • A command bus manages all the different operators
    in the UNE
  • Input/output module
  • Data acquisition and storage of intermediate
    results

(Block diagram: four memory blocks (M) serving units UNE-0
through UNE-3, with a sequencer driving the instruction bus
and an input/output unit.)
34
Level 1 trigger in a HEP experiment
  • High Energy Physics (particle physics)
  • Neural networks have provided interesting results
    as triggers in HEP
  • Level 2, H1 experiment: 10-20 µs
  • Level 1, DIRAC experiment: 2 µs
  • Particle recognition
  • High timing constraints (in terms of latency and
    data throughput)

35
Neural Network Architecture
(Diagram: a network with 128 inputs, 64 hidden units, and 4
outputs identifying electrons, taus, hadrons, and jets.)
Execution time: 500 ns, with data arriving every bunch
crossing (BC = 25 ns). Weights coded in 16 bits; states
coded in 8 bits.
36
Very Fast Architecture
  • 256 PEs
  • Matrix of n×m matrix elements
  • Control unit
  • I/O module
  • TanH values are stored in LUTs (sketched below)
  • One matrix row computes one neuron
  • The results are fed back through the array to
    compute the output layer

(Diagram: rows of four PEs, each row feeding an accumulator
(ACC) and a TanH unit, with a control unit and an I/O module
driving the array.)
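
A sketch of the look-up-table idea (the 8-bit address width matches the slide's 8-bit states; the input scaling is an assumption): tanh is precomputed over the quantized range and read back instead of evaluated.

```python
import numpy as np

ADDR_BITS, SCALE = 8, 4.0                       # scale is illustrative
grid = np.linspace(-SCALE, SCALE, 2**ADDR_BITS)
TANH_LUT = np.tanh(grid)                        # precomputed once

def lut_tanh(acc):
    """Quantize the accumulator value to an 8-bit address and
    look the result up instead of computing tanh."""
    idx = (acc + SCALE) / (2 * SCALE) * (2**ADDR_BITS - 1)
    idx = np.clip(idx, 0, 2**ADDR_BITS - 1).astype(int)
    return TANH_LUT[idx]

print(lut_tanh(np.array([-10.0, 0.0, 0.5, 10.0])))
```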
37
PE architecture
(Block diagram: 8-bit input data and 16-bit weights from the
weight memory feed a multiplier and an accumulator; an
address generator and a control module are driven by the
command bus.)
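
A minimal emulation of one PE's multiply-accumulate step, using the bit widths shown in the diagram (8-bit data, 16-bit weights); the signed fixed-point interpretation is an assumption:

```python
def pe_mac(acc, data_in, weight):
    """One PE step: multiply the 8-bit input by the 16-bit weight
    and add the product to the accumulator."""
    assert -128 <= data_in <= 127        # 8-bit signed state
    assert -32768 <= weight <= 32767     # 16-bit signed weight
    return acc + data_in * weight

acc = 0
for d, w in [(12, 300), (-5, -1200), (100, 40)]:  # illustrative stream
    acc = pe_mac(acc, d, w)
print(acc)
```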
38
Neuro-hardware today
  • Generic real-time applications
  • Microprocessor technology (PCs, i.e. software) is
    sufficient to implement most neural applications in
    real time (ms, or sometimes µs, scale)
  • This solution is cheap
  • Very easy to manage
  • Constrained real-time applications
  • There remain specific applications where powerful
    computation is needed, e.g. particle physics
  • There remain applications where other constraints
    have to be taken into consideration (power
    consumption, proximity to sensors, mixed
    integration, etc.)

39
Clustering
  • Idea: combine the performance of several
    processors to perform massively parallel
    computations

(Diagram: machines linked by a high-speed connection.)
40
Clustering
  • Advantages
  • Takes advantage of the implicit parallelism of
    neural networks
  • Uses systems that are already available
    (universities, labs, offices, etc.)
  • High performance: faster training of a neural net
    (sketched below)
  • Very cheap compared to dedicated hardware
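
A conceptual sketch of that parallelism (plain Python standing in for a cluster; the data partitioning and the linear model are illustrative assumptions): each machine computes the gradient on its shard of the training data, and the shards' results are combined over the network.

```python
import numpy as np

def local_gradient(w, X, Y):
    """Gradient of 1/2 * mean((Xw - Y)^2) on one machine's shard."""
    return X.T @ (X @ w - Y) / len(Y)

rng = np.random.default_rng(0)
X, Y = rng.normal(size=(1000, 3)), rng.normal(size=1000)
shards = np.array_split(np.arange(1000), 4)  # one shard per machine

w = np.zeros(3)
grads = [local_gradient(w, X[s], Y[s]) for s in shards]  # in parallel
w -= 0.1 * np.mean(grads, axis=0)  # gradients combined over fast links
```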

41
Clustering
  • Drawbacks
  • Communication load: very fast links between
    computers are needed
  • A software environment for parallel processing is
    required
  • Not possible for embedded applications

42
Hardware Implementations
  • Most real-time applications do not need a
    dedicated hardware implementation
  • Conventional architectures are generally
    appropriate
  • Clustering of generic architectures can combine
    their performance
  • Some specific applications require other
    solutions
  • Strong timing constraints
  • Technology permits the use of FPGAs
  • Flexibility
  • Massive parallelism possible
  • Other constraints (consumption, etc.)
  • Custom or programmable circuits

43
Questions?