1
Artificial Neural Networks
  • An Introduction

2
Outline
  • Introduction
  • Biological and artificial neurons
  • Perceptrons (problems)
  • Backpropagation network
  • Training
  • Other ANNs (examples in HEP)

3
Introduction - What are ANNs?
  • Artificial Neural Networks are:
  • data analysis tools (or computational modelling tools)
  • that model complex real-world problems
  • structures comprised of densely interconnected, simple processing elements
  • each element is linked to its neighbours with varying strengths
  • learning is accomplished by adjusting these strengths so that the network outputs appropriate results
  • they learn from experience (rather than being explicitly programmed with rules)
  • inspired by biological neural networks (the idea is not to replicate the operation of biological systems, but to use what is known of their functionality to solve complex problems)

4
  • Information processing characteristics:
  • nonlinearity (allows a better fit to the data)
  • fault and failure tolerance (copes with uncertain data and measurement errors)
  • learning and adaptivity (allows the system to update its internal structure in response to a changing environment)
  • generalization (enables application of the model to unlearned data)
  • ANNs often outperform other computational tools on a variety of problems:
  • Pattern classification: categorizes a set of input patterns in terms of different features
  • Clustering: clusters are formed by exploring similarities between input patterns based on their inter-correlations
  • Function approximation: the ANN is trained to approximate the underlying rules relating the inputs to the outputs

5
Biological Neuron
  • 3 major functional units:
  • Dendrites
  • Cell body
  • Axon
  • (the synapse is the junction through which one neuron signals another)
  • The amount of signal passing through a neuron depends on:
  • the intensity of the signal from the feeding neurons
  • their synaptic strengths
  • the threshold of the receiving neuron
  • Hebb rule (plays a key part in learning):
  • a synapse which repeatedly triggers the activation of a postsynaptic neuron will grow in strength; others will gradually weaken
  • Neurons learn by adjusting the magnitudes of their synaptic strengths

[Diagram: artificial neuron with inputs x1, x2, ..., xn, weights w1, w2, ..., wn, summation Σ and activation g(·) producing output y]
6
Artificial Neurons (the basic computational entities of an ANN)
  • Analogy between artificial and biological neurons (connection weights represent synapses)
  • In 1958 Rosenblatt introduced the mechanics (the perceptron)
  • Input maps to output as y = g(Σi wi xi) (a sketch in code follows the diagram below)
  • Only when the weighted sum exceeds the threshold limit will the neuron fire
  • Weights can enhance or inhibit
  • The collective behaviour of neurons is what is interesting for intelligent data processing

[Diagram: perceptron with inputs x1, x2, x3 and weights w1, w2, w3 computing g(Σ w·x) to give output y]
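A minimal sketch of this forward pass in Python (the names step and perceptron_output are illustrative, not from the slides):

import numpy as np

def step(a, threshold=0.0):
    # threshold activation g: the neuron fires (1) only when the sum exceeds the threshold
    return 1.0 if a > threshold else 0.0

def perceptron_output(x, w, threshold=0.0):
    # y = g(sum_i w_i * x_i)
    return step(np.dot(w, x), threshold)

# example: one enhancing (+) and one inhibiting (-) weight
print(perceptron_output(np.array([1.0, 1.0]), np.array([0.7, -0.2])))  # 1.0, since 0.5 > 0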
7
Perceptrons
  • Can be trained on a set of examples using a special learning rule (process)
  • Weights are changed in proportion to the difference (error) between the target output and the perceptron's solution for each example.
  • Minimize the summed square error function
  • E = 1/2 Σp Σi (oi(p) - ti(p))²
  • with respect to the weights.
  • The error is a function of all the weights and forms an irregular multidimensional hypersurface with many peaks, saddle points and minima.
  • The error is minimized by finding the set of weights that corresponds to the global minimum.
  • Done with the gradient descent method (weights incrementally updated in proportion to ∂E/∂wij)
  • The update reads wij(t+1) = wij(t) + Δwij
  • The aim is to produce a true mapping for all patterns

[Diagram: perceptron unit with inputs xj, weights wij, summation Σ, threshold and activation g(·) giving output oi]
8
Summary of Learning for the Perceptron
  • Initialize wij with random values.
  • Repeat until wij(t+1) = wij(t):
  • Pick a pattern p from the training set.
  • Feed the input to the network and calculate the output.
  • Update the weights according to
  • wij(t+1) = wij(t) + Δwij
  • where Δwij = -η ∂E/∂wij.
  • When no change (within some accuracy) occurs, the weights are frozen and the network is ready to use on data it has never seen.
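A minimal sketch of this loop, assuming threshold units and the standard perceptron form of the update, Δwi = η (t - o) xi (the helper train_perceptron is illustrative):

import numpy as np

def train_perceptron(patterns, targets, eta=0.1, max_epochs=100):
    rng = np.random.default_rng(0)
    w = rng.uniform(-0.5, 0.5, size=patterns.shape[1])  # initialize with random values
    for _ in range(max_epochs):
        w_old = w.copy()
        for x, t in zip(patterns, targets):         # pick pattern p from the training set
            o = 1.0 if np.dot(w, x) > 0.0 else 0.0  # calculate the output
            w += eta * (t - o) * x                  # w(t+1) = w(t) + Delta w
        if np.allclose(w, w_old):                   # no change: the weights are frozen
            break
    return w

# example: learn OR, with a constant bias input 1 standing in for the threshold term
X = np.array([[1, 1, 1], [1, 1, 0], [1, 0, 1], [1, 0, 0]], dtype=float)
print(train_perceptron(X, np.array([1.0, 1.0, 1.0, 0.0])))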

9
Example
  • AND and OR truth tables (inputs x1, x2; target t):

      AND               OR
  x1  x2  t        x1  x2  t
   1   1  1         1   1  1
   1   0  0         1   0  1
   0   1  0         0   1  1
   0   0  0         0   0  0

  • The perceptron learns these rules easily (i.e. sets appropriate weights and threshold)
  • (namely w = (w0, w1, w2) = (-1.5, 1.0, 1.0) for AND and (-0.5, 1.0, 1.0) for OR, where w0 corresponds to the threshold term)
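A quick check of those weights (the helper fires is illustrative; w0 is treated as a bias weight on a constant input of 1):

import numpy as np

def fires(w, x1, x2):
    # w = (w0, w1, w2); w0 multiplies a constant bias input of 1
    return 1 if np.dot(w, [1.0, x1, x2]) > 0.0 else 0

w_and = np.array([-1.5, 1.0, 1.0])
w_or = np.array([-0.5, 1.0, 1.0])
for x1, x2 in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    print((x1, x2), fires(w_and, x1, x2), fires(w_or, x1, x2))
# reproduces the AND and OR truth tables above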

10
Problems
  • Perceptrons can only perform accurately with linearly separable classes (a linear hyperplane can place one class of objects on one side of the plane and the other class on the other)
  • This put ANN research on hold for some 20 years.
  • Solution: additional (hidden) layers of neurons, the MLP architecture
  • Able to solve non-linear classification problems

[Diagram: a linearly separable class boundary vs. a non-linearly separable one in the (x1, x2) plane]
11
MLPs
  • The learning procedure is an extension of the simple perceptron algorithm
  • Response function:
  • oi = g(Σj wij g(Σk wjk xk))
  • which is non-linear, so the network is able to perform non-linear mappings (a sketch in code follows the diagram below)
  • (Theory tells us that a neural network with at least 1 hidden layer can approximate any continuous function arbitrarily well)
  • A vast number of ANN types exist

[Diagram: MLP with inputs xk connected to hidden units hj by weights wjk, and hidden units connected to outputs oi by weights wij]
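A minimal sketch of this response function, assuming a sigmoid for g (the slides leave the activation unspecified):

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def mlp_forward(x, w_jk, w_ij):
    # o_i = g(sum_j w_ij * g(sum_k w_jk * x_k))
    h = sigmoid(w_jk @ x)     # hidden-layer activations h_j
    return sigmoid(w_ij @ h)  # outputs o_i

# example: 3 inputs -> 4 hidden units -> 2 outputs, random weights
rng = np.random.default_rng(0)
print(mlp_forward(np.array([0.5, -1.0, 2.0]),
                  rng.normal(size=(4, 3)), rng.normal(size=(2, 4))))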
12
Backpropagation ANNs
  • The most widely used type of network
  • Feedforward
  • Supervised (learns a mapping from one data space to another using examples)
  • The error is propagated backwards
  • Versatile: used for data modelling, classification, forecasting, data and image compression, and pattern recognition.

13
BP Learning Algorithm
  • Like the perceptron, uses gradient descent to minimize the error (generalized to the case with hidden layers)
  • Each iteration constitutes two sweeps: a forward pass to compute the outputs and a backward pass to propagate the errors
  • To minimize the error we need ∂E/∂wij but also ∂E/∂wjk (which we get using the chain rule)
  • Training an MLP using BP can be thought of as a walk in weight space along an energy surface, trying to find the global minimum and avoid local minima
  • Unlike for the perceptron, there is no guarantee that the global minimum will be reached, but in most cases the energy landscape is smooth

14
Summary of the BP learning algorithm
  • Initialize wij and wjk with random values.
  • Repeat until wij and wjk have converged or the desired performance level is reached:
  • Pick a pattern p from the training set.
  • Present the input and calculate the output.
  • Update the weights according to
  • wij(t+1) = wij(t) + Δwij
  • wjk(t+1) = wjk(t) + Δwjk
  • where Δw = -η ∂E/∂w.
  • (and so on for extra hidden layers)
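A minimal sketch of one such update for the one-hidden-layer network above, assuming sigmoid units and the squared error E from slide 7 (the delta terms come from the chain rule; names are illustrative):

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def bp_update(x, t, w_jk, w_ij, eta=0.5):
    # forward sweep
    h = sigmoid(w_jk @ x)   # hidden activations
    o = sigmoid(w_ij @ h)   # outputs
    # backward sweep: errors propagated back through the chain rule
    delta_o = (o - t) * o * (1 - o)             # dE/d(net_i) for E = 1/2 sum (o - t)^2
    delta_h = (w_ij.T @ delta_o) * h * (1 - h)  # dE/d(net_j)
    w_ij -= eta * np.outer(delta_o, h)  # Delta w_ij = -eta * dE/dw_ij
    w_jk -= eta * np.outer(delta_h, x)  # Delta w_jk = -eta * dE/dw_jk
    return w_jk, w_ij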

15
Training
  • Generalization: the network's performance on a set of test patterns it has never seen before (typically lower than on the training set).
  • The training set is used to let the ANN capture the features in the data or the mapping.
  • The initial large drop in error is due to learning, but the subsequent slow reduction is due to:
  • network memorization (too many training cycles used), or
  • overfitting (too many hidden nodes)
  • (the network learns individual training examples and loses its generalization ability; a sketch of detecting this follows the figure below)

[Plot: error (e.g. SSE) vs. number of hidden nodes or training cycles; the training curve keeps falling while the testing curve bottoms out at the optimum network and then rises]
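A minimal sketch of locating that optimum by monitoring the test error while training (early stopping; train_one_cycle and test_error are hypothetical hooks around a training loop such as the BP update above):

import numpy as np

def sse(o, t):
    # summed square error E = 1/2 sum_p sum_i (o_i(p) - t_i(p))^2, for use inside test_error
    return 0.5 * np.sum((o - t) ** 2)

def train_with_early_stopping(train_one_cycle, test_error, max_cycles=1000, patience=20):
    best_err, best_cycle = np.inf, 0
    for cycle in range(max_cycles):
        train_one_cycle()   # one pass of weight updates over the training set
        err = test_error()  # error on test patterns the network has never seen
        if err < best_err:
            best_err, best_cycle = err, cycle
        elif cycle - best_cycle >= patience:  # test error rising: memorization has set in
            break
    return best_cycle, best_err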
16
Other Popular ANNs
  • Some applications may be solved using a variety of ANN types, some only via a specific one (it depends on the logistics of the problem).
  • Hopfield networks: optimization.
  • Presented with an incomplete or noisy pattern, the network responds by retrieving the internally stored pattern it most closely resembles.
  • Kohonen networks (self-organizing):
  • trained in an unsupervised manner to form clusters in the data. Used for pattern classification and data compression.

17
HEP Applications
  • ANNs are applied everywhere from off-line data analysis to low-level experimental triggers
  • Improving signal-to-background ratios (BP networks)
  • e.g. in flavour tagging and Higgs detection
  • Feature recognition problems in track finding (feed-back networks)
  • Function approximation tasks (feed-back networks)
  • e.g. reconstructing the mass of a decayed particle from calorimeter information

18
  • http://www.doc.ic.ac.uk/~nd/surprise_96.journal/vol4/cs11/report.html
  • http://www.cs.stir.ac.uk/~lss/NNIntro/InvSlides.html
  • Carsten Peterson and Thorsteinn Rognvaldsson, "An Introduction to Artificial Neural Networks", LU TP 91-23, September 1991 (lectures given at the 1991 CERN School of Computing, Sweden)