1
ICT619 Intelligent Systems
Topic 4: Artificial Neural Networks
2
Artificial Neural Networks
  • PART A
  • Introduction
  • An overview of the biological neuron
  • The synthetic neuron
  • Structure and operation of an ANN
  • Problem solving by an ANN
  • Learning in ANNs
  • ANN models
  • Applications
  • PART B
  • Developing neural network applications
  • Design of the network
  • Training issues
  • A comparison of ANN and ES
  • Hybrid ANN systems
  • Case Studies

3
Introduction
  • Artificial Neural Networks (ANN)
  • Also known as
  • Neural networks
  • Neural computing (or neuro-computing) systems
  • Connectionist models
  • ANNs simulate the biological brain for problem
    solving
  • This represents a totally different approach to
    machine intelligence from the symbolic logic
    approach
  • The biological brain is a massively parallel
    system of interconnected processing elements
  • ANNs simulate a similar network of simple
    processing elements at a greatly reduced scale

4
Introduction
  • ANNs adapt themselves using data to learn problem
    solutions
  • ANNs can be particularly effective for problems
    that are hard to solve using conventional
    computing methods
  • First developed in the 1950s, slumped in the 70s
  • Great upsurge in interest in the mid 1980s
  • Both ANNs and expert systems are non-algorithmic
    tools for problem solving
  • ES rely on the solution being expressed as a set
    of heuristics by an expert
  • ANNs learn solely from data.

6
An overview of the biological neuron
  • Estimated 1000 billion neurons in the human
    brain, with each connected to up to 10,000 others
  • Electrical impulses produced by a neuron travel
    along the axon
  • The axon connects to dendrites through synaptic
    junctions

7
An overview of the biological neuron
8
An overview of the biological neuron
  • A neuron collects the excitation of its inputs
    and "fires" (produces a burst of activity) when
    the sum of its inputs exceeds a certain threshold
  • The strengths of a neuron's inputs are modified
    (enhanced or inhibited) by the synaptic junctions
  • Learning in our brains occurs through a
    continuous process of new interconnections
    forming between neurons, and adjustments at the
    synaptic junctions

9
The synthetic neuron
  • A simple model of the biological neuron, first
    proposed in 1943 by McCulloch and Pitts, consists
    of a summing function with an internal threshold
    and "weighted" inputs, as shown below.

10
The synthetic neuron (contd)
  • For a neuron receiving n inputs, each input x_i
    (i ranging from 1 to n) is weighted by multiplying
    it with a weight w_i
  • The sum of the products w_i x_i gives the net
    activation value of the neuron
  • The activation value is subjected to a transfer
    function to produce the neuron's output
  • The weight value of the connection carrying
    signals from a neuron i to a neuron j is termed
    w_ij (see the sketch below)
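
As a concrete illustration of this computation, here is a minimal Python sketch of such a neuron; the inputs, weights and threshold are arbitrary example values, not from the unit:

```python
# A minimal sketch of the McCulloch-Pitts style neuron described above:
# weighted inputs, a summing function and a step transfer function.

def neuron_output(inputs, weights, threshold=0.5):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs))  # net activation
    return 1 if activation >= threshold else 0

# Example: three inputs, each multiplied by its weight w_i
print(neuron_output([1, 0, 1], [0.4, 0.9, 0.2]))  # activation = 0.6 -> fires (1)
```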

11
Transfer functions
  • These compute the output of a node from its net
    activation. Among the popular transfer functions
    are
  • Step function
  • Signum (or sign) function
  • Sigmoid function
  • Hyperbolic tangent function
  • In the step function, the neuron produces an
    output only when its net activation reaches a
    minimum value known as the threshold
  • For a binary neuron i, whose output is a 0 or 1
    value, the step function can be summarised as:
    output_i = 1 if activation_i >= threshold_i,
    and output_i = 0 otherwise

12
Transfer functions (contd)
  • The sign function returns either -1 or +1:
    output_i = +1 if activation_i >= 0, and -1
    otherwise. To avoid confusion with 'sine' it is
    often called signum.

[Plot: the signum transfer function - output_i steps from -1 to +1 as activation_i crosses 0]
13
Transfer functions (contd)
  • The sigmoid
  • The sigmoid transfer function produces a
    continuous value in the range 0 to 1:
    output_i = 1 / (1 + e^(-g * activation_i))
  • The parameter g, the gain, affects the slope of
    the function around zero

14
Transfer functions (contd)
  • The hyperbolic tangent
  • A variant of the sigmoid transfer function
  • Has a shape similar to the sigmoid (like an S),
    with the difference being that the value of
    output_i ranges between -1 and +1:
    output_i = tanh(g * activation_i)
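
The four transfer functions above can be sketched in a few lines of Python; treat this as an illustration — applying the gain parameter to tanh as well is an assumption by analogy with the sigmoid:

```python
import math

def step(activation, threshold=0.0):
    """Binary step: fires (1) once activation reaches the threshold."""
    return 1 if activation >= threshold else 0

def signum(activation):
    """Sign function: -1 or +1 (here, +1 at exactly zero)."""
    return 1 if activation >= 0 else -1

def sigmoid(activation, gain=1.0):
    """Continuous output in the range (0, 1); gain sets the slope at zero."""
    return 1.0 / (1.0 + math.exp(-gain * activation))

def tanh(activation, gain=1.0):
    """S-shaped like the sigmoid, but output ranges over (-1, 1)."""
    return math.tanh(gain * activation)

print(step(0.3), signum(-2.0), round(sigmoid(1.0), 3), round(tanh(1.0), 3))
```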

15
Structure and operation of an ANN
  • The building block of an ANN is the artificial
    neuron. It is characterised by
  • weighted inputs
  • summing and transfer function
  • The most common architecture of an ANN consists
    of two or more layers of artificial neurons or
    nodes, with each node in a layer connected to
    every node in the following layer
  • Signals usually flow from the input layer, which
    is directly subjected to an input pattern, across
    one or more hidden layers towards the output
    layer.
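
A minimal sketch of this forward flow of signals, with assumed layer sizes and random weights purely for illustration:

```python
import numpy as np

# One forward pass through a fully connected layered network:
# a 4-node input layer, a 3-node hidden layer, a 2-node output layer.

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
x = rng.random(4)                  # input pattern, one value per input node
W1 = rng.standard_normal((4, 3))   # weights: input layer -> hidden layer
W2 = rng.standard_normal((3, 2))   # weights: hidden layer -> output layer

hidden = sigmoid(x @ W1)           # each hidden node sums its weighted inputs
output = sigmoid(hidden @ W2)      # signals flow on to the output layer
print(output)                      # output pattern of the network
```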

16
Structure and operation of an ANN
  • The most popular ANN architecture, known as the
    multilayer perceptron (shown in diagram above),
    follows this model.
  • In some models of the ANN, such as the
    self-organising map (SOM) or Kohonen net, nodes
    in the same layer may have interconnections among
    them
  • In recurrent networks, connections can even go
    backwards to nodes closer to input

17
Problem solving by an ANN
  • The inputs of an ANN are data values grouped
    together to form a pattern
  • Each data value (component of the pattern vector)
    is applied to one neuron in the input layer
  • The output value(s) of node(s) in the output
    layer represent some function of the input
    pattern

18
Problem solving by an ANN (contd)
  • In the example above, the ANN maps the input
    pattern to one of two classes
  • The ANN produces accurate predictions only if it
    has learned the functional relationship between
    the relevant variables, namely the components of
    the input pattern, and the corresponding output
  • Any three-layer ANN can (at least in theory)
    represent the functional relationship between an
    input pattern and its class
  • It may be difficult in practice for the ANN to
    learn a given relationship

19
Learning in ANN
  • Common human learning behaviour: repeatedly going
    through the same material, making mistakes and
    learning until able to carry out a given task
    successfully
  • Learning by most ANNs is modelled after this type
    of human learning
  • Learned knowledge to solve a given problem is
    stored in the interconnection weights of an ANN
  • The process by which an ANN arrives at the right
    values of these weights is known as learning or
    training

20
Learning in ANN (contd)
  • Learning in ANNs takes place through an iterative
    training process during which node
    interconnection weight values are adjusted
  • Initial weights, usually small random values, are
    assigned to the interconnections between the ANN
    nodes.
  • Like knowledge acquisition in ES, learning in
    ANNs can be the most time-consuming phase of
    their development

21
Learning in ANNs (contd)
  • ANN learning (or training) can be supervised or
    unsupervised
  • In supervised training,
  • data sets consisting of pairs, each an input
    pattern and its expected correct output value,
    are used
  • The weight adjustments during each iteration aim
    to reduce the error (the difference between the
    ANN's actual output and the expected correct
    output)
  • E.g., a node producing a small negative output
    when it is expected to produce a large positive
    one has its positive weight values increased and
    its negative weight values decreased

22
Learning in ANNs
  • In supervised training,
  • Pairs of sample input and corresponding output
    values are used to train the net repeatedly until
    the output becomes satisfactorily accurate
  • In unsupervised training,
  • there is no known expected output used for
    guiding the weight adjustments
  • The function to be optimised can be any function
    of the inputs and outputs, usually set by the
    application
  • the net adapts itself to align its weight values
    with training patterns
  • This results in groups of nodes responding
    strongly to specific groups of similar input
    patterns

23
The two states of an ANN
  • A neural network can be in one of two states:
    training mode or operation mode
  • Most ANNs learn off-line and do not change their
    weights once training is finished and they are in
    operation
  • In an ANN capable of on-line learning, training
    and operation continue together
  • ANN training can be time consuming, but once
    trained, the resulting network can be made to run
    very efficiently, providing fast responses

24
ANN models
  • ANNs are supposed to model the structure and
    operation of the biological brain
  • But there are different types of neural networks
    depending on the architecture, learning strategy
    and operation
  • Three of the most well known models are
  • The multilayer perceptron
  • The Kohonen network (the Self-Organising Map)
  • The Hopfield net
  • The Multilayer Perceptron (MLP) is the most
    popular ANN architecture

25
The Multilayer Perceptron
  • Nodes are arranged into an input layer, an output
    layer and one or more hidden layers
  • Also known as the backpropagation network,
    because error values from the output layer are
    propagated back to the layers before it to
    calculate weight adjustments during training.
  • Another name for the MLP is the feedforward
    network.

26
MLP learning algorithm
  • The learning rule for the multilayer perceptron
    is known as "the generalised delta rule" or the
    "backpropagation rule"
  • The generalised delta rule repeatedly calculates
    an error value for each input, which is a
    function of the squared difference between the
    expected correct output and the actual output
  • The calculated error is backpropagated from one
    layer to the previous one, and is used to adjust
    the weights between connecting layers

27
MLP learning algorithm (contd)
  • New weight = old weight + change calculated from
    the square of the error
  • Error = difference between desired output and
    actual output
  • Training stops when error becomes acceptable, or
    after a predetermined number of iterations
  • After training, the modified interconnection
    weights form a sort of internal representation
    that enables the ANN to generate desired outputs
    when given the training inputs or even new
    inputs that are similar to training inputs
  • This generalisation is a very important property
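
The rule can be illustrated with a short NumPy sketch; the task (XOR), layer sizes, gain (learning rate) and epoch count are assumptions made purely for demonstration, not values from the unit:

```python
import numpy as np

# Generalised delta rule for a one-hidden-layer MLP with sigmoid nodes.

rng = np.random.default_rng(1)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # input patterns
T = np.array([[0], [1], [1], [0]], dtype=float)              # expected outputs

W1 = rng.standard_normal((2, 4)) * 0.5   # small random initial weights
b1 = np.zeros(4)
W2 = rng.standard_normal((4, 1)) * 0.5
b2 = np.zeros(1)
lr = 0.5                                 # gain term

for epoch in range(10000):
    h = sigmoid(X @ W1 + b1)             # forward pass
    y = sigmoid(h @ W2 + b2)
    e = T - y                            # error: desired minus actual output
    # Backpropagate: deltas use the sigmoid derivative y * (1 - y)
    d_out = e * y * (1 - y)
    d_hid = (d_out @ W2.T) * h * (1 - h)
    # Weight change is proportional to the back-propagated error
    W2 += lr * h.T @ d_out;  b2 += lr * d_out.sum(axis=0)
    W1 += lr * X.T @ d_hid;  b1 += lr * d_hid.sum(axis=0)

print(np.round(y, 2))  # should approach the targets [0, 1, 1, 0]
```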

28
The error landscape in a multilayer perceptron
  • For a given pattern p, the error Ep can be
    plotted against the weights to give the so called
    error surface
  • The error surface is a landscape of hills and
    valleys, with points of minimum error
    corresponding to wells and maximum error found on
    peaks.
  • The generalised delta rule aims to minimise Ep by
    adjusting weights so that they correspond to
    points of lowest error
  • It follows the method of gradient descent where
    the changes are made in the steepest downward
    direction
  • All possible solutions are depressions in the
    error surface, known as basins of attraction

29
The error landscape in a multilayer perceptron
[Plot: the error surface - E_p plotted against two weights w_i and w_j]
30
Learning difficulties in multilayer perceptrons -
local minima
  • The MLP may fail to settle into the global
    minimum of the error surface and instead find
    itself in one of the local minima
  • This is due to the gradient descent strategy
    followed
  • A number of alternative approaches can be taken
    to reduce this possibility
  • Lowering the gain term progressively
  • The gain term influences the rate at which weight
    changes are made during training
  • Its default value is 1, but it may be gradually
    lowered to slow the rate of change as training
    progresses

31
Learning difficulties in multilayer
perceptrons(contd)
  • Addition of more nodes for better representation
    of patterns
  • Too few nodes (and consequently not enough
    weights) can cause failure of the ANN to learn a
    pattern
  • Introduction of a momentum term
  • Determines the effect of past weight changes on
    the current direction of movement in weight space
  • The momentum term is also a small numerical value
    in the range 0 to 1
  • Addition of random noise to perturb the ANN out
    of local minima
  • Usually done by adding small random values to
    weights
  • Takes the net to a different point in the error
    space, hopefully out of a local minimum (see the
    sketch below)
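
A sketch of a weight update combining the momentum and random-noise remedies above; the function and parameter names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def update_weights(w, grad, prev_change, gain=0.1, momentum=0.9, noise=0.0):
    """One gradient-descent step with a momentum term and optional noise."""
    change = -gain * grad + momentum * prev_change   # past change carries over
    w_new = w + change + noise * rng.standard_normal(w.shape)
    return w_new, change                             # keep change for next step

w = np.zeros(3)
w, change = update_weights(w, grad=np.array([1.0, -2.0, 0.5]),
                           prev_change=np.zeros(3), noise=0.01)
print(w)
```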

32
The Kohonen network (the self-organising map)
  • Biological systems display both supervised and
    unsupervised learning behaviour
  • A neural network with unsupervised learning
    capability is said to be self-organising
  • During training, the Kohonen net changes its
    weights to learn appropriate associations,
    without any right answers being provided

33
The Kohonen network (contd)
  • The Kohonen net consists of an input layer that
    distributes the inputs to every node in a second
    layer, known as the competitive layer.
  • The competitive (output) layer is usually
    organised into some 2-D or 3-D surface (feature
    map)

34
Operation of the Kohonen Net
  • Each neuron in the competitive layer is connected
    to other neurons in its neighbourhood
  • Neurons in the competitive layer have excitatory
    (positively weighted) connections to immediate
    neighbours and inhibitory (negatively weighted)
    connections to more distant neurons.
  • As an input pattern is presented, some of the
    neurons in the competitive layer are sufficiently
    activated to produce outputs, which are fed to
    other neurons in their neighbourhoods
  • The node with the set of input weights closest to
    the input pattern component values produces the
    largest output. This node is termed the best
    matching (or winning) node

35
Operation of the Kohonen Net(contd)
  • During training, input weights of the best
    matching node and its neighbours are adjusted to
    make them resemble the input pattern even more
    closely
  • At the completion of training, the best matching
    node ends up with its input weight values aligned
    with the input pattern and produces the strongest
    output whenever that particular pattern is
    presented
  • The nodes in the winning node's neighbourhood
    also have their weights modified to settle down
    to an average representation of that pattern
    class
  • As a result, the net is able to represent
    clusters of similar input patterns - a feature
    found useful for data mining applications, for
    example.
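
One training step of this winner-and-neighbourhood adjustment might be sketched as follows; the grid size, learning rate and simple square neighbourhood are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
grid = rng.random((5, 5, 3))          # 5x5 competitive layer, 3-D inputs

def train_step(grid, x, lr=0.1, radius=1):
    # Best matching node: the one whose weights are closest to the pattern
    dists = np.linalg.norm(grid - x, axis=2)
    bi, bj = np.unravel_index(np.argmin(dists), dists.shape)
    # Adjust the winner and its neighbourhood towards the input pattern
    for i in range(max(0, bi - radius), min(grid.shape[0], bi + radius + 1)):
        for j in range(max(0, bj - radius), min(grid.shape[1], bj + radius + 1)):
            grid[i, j] += lr * (x - grid[i, j])
    return grid

grid = train_step(grid, np.array([0.9, 0.1, 0.1]))  # one pattern presentation
```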

36
The Hopfield Model
  • The Hopfield net is the most widely known of all
    the autoassociative (pattern-completing) ANNs
  • In autoassociation, a noisy or partially
    incomplete input pattern causes the network to
    stabilise to a state corresponding to the
    original pattern
  • It is also useful for optimisation tasks.
  • The Hopfield net is a recurrent ANN in which the
    output produced by each neuron is fed back as
    input to all other neurons
  • Neurons compute a weighted sum with a step
    transfer function.

37
The Hopfield Model (contd)
  • The Hopfield net has no iterative learning
    algorithm as such. Patterns (or facts) are simply
    stored by adjusting the weights to lower a term
    called network energy
  • During operation, an input pattern is applied to
    all neurons simultaneously and the network is
    left to stabilise
  • Outputs from the neurons in the stable state form
    the output of the network.
  • When presented with an input pattern, the net
    outputs a stored pattern nearest to the presented
    pattern.
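
A minimal sketch of this store-then-stabilise behaviour, using bipolar (-1/+1) patterns and, for simplicity, a synchronous update (assumptions, not the unit's formulation):

```python
import numpy as np

patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])

# Store: weights set directly from the patterns (sum of outer products),
# with the diagonal zeroed so neurons have no self-connections.
W = sum(np.outer(p, p) for p in patterns)
np.fill_diagonal(W, 0)

def recall(x, steps=10):
    x = x.copy()
    for _ in range(steps):                    # iterate until the net stabilises
        x = np.where(W @ x >= 0, 1, -1)       # step transfer function
    return x

noisy = np.array([1, -1, 1, -1, 1, 1])        # corrupted copy of pattern 0
print(recall(noisy))                          # settles to [1, -1, 1, -1, 1, -1]
```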

38
When ANNs should be applied
  • Difficulties with some real-life problems
  • Solutions are difficult, if not impossible, to
    define algorithmically, due mainly to their
    unstructured nature
  • Too many variables and/or the interactions of
    relevant variables not understood well
  • Input data may be partially corrupt or missing,
    making it difficult for a logical sequence of
    solution steps to function effectively

39
When ANNs should be applied (contd)
  • The typical ANN arrives at an answer by learning
    to identify the right one through an iterative
    process of self-adaptation, or training
  • If there are many factors, with complex
    interactions among them, the usual "linear"
    statistical techniques may be inappropriate
  • If sufficient data is available, an ANN can find
    the relevant functional relationship by means of
    an adaptive learning procedure from the data

40
Current applications of ANNs
  • ANNs are good at recognition and classification
    tasks
  • Due to their ability to recognise complex
    patterns, ANNs have been widely applied in
    character, handwritten text and signature
    recognition, as well as more complex images such
    as faces
  • They have also been used successfully for speech
    recognition and synthesis
  • ANNs are being used in an increasing number of
    applications where high-speed computation of
    functions is important, e.g. in industrial robotics

41
Current applications of ANNs(contd)
  • One of the more successful applications of ANNs
    has been as a decision support tool in the area
    of finance and banking
  • Some examples of commercial applications of ANN
    are
  • Financial market analysis for investment decision
    making
  • Sales support - targeting customers for
    telemarketing
  • Bankruptcy prediction
  • Intelligent flexible manufacturing systems
  • Stock market prediction
  • Resource allocation: scheduling and management
    of personnel and equipment

42
ANN applications - broad categories
  • According to a survey (Quaddus & Khan, 2002)
    covering the period 1988 up to mid 1998, the main
    business application areas of ANNs are
  • Production (36%)
  • Information systems (20%)
  • Finance (18%)
  • Marketing & distribution (14.5%)
  • Accounting/Auditing (5%)
  • Others (6.5%)

43
ANN applications - broad categories (contd)
  • The levelling off of publications on ANN
    applications may be attributed to the ANN moving
    from the research to the commercial application
    domain
  • The emergence of other intelligent system tools
    may be another factor

44
Some advantages of ANNs
  • Able to take incomplete or corrupt data and
    provide approximate results.
  • Good at generalisation, that is, recognising
    patterns similar to those learned during training
  • Inherent parallelism makes them fault-tolerant:
    loss of a few interconnections or nodes leaves
    the system relatively unaffected
  • Parallelism also makes ANNs fast and efficient
    for handling large amounts of data.

45
ANN State-of-the-art overview
  • Currently neural network systems are available as
  • Software simulation on conventional computers -
    the prevalent form
  • Special-purpose hardware that models the
    parallelism of neurons.
  • ANN-based systems are not likely to replace
    conventional computing systems, but they are an
    established alternative to the symbolic logic
    approach to information processing
  • A new computing paradigm in the form of hybrid
    intelligent systems has emerged - often involving
    ANNs with other intelligent system tools

46
REFERENCES
  • AI Expert (special issue on ANN), June 1990.
  • BYTE (special issue on ANN), Aug. 1989.
  • Caudill, M., "The View from Now", AI Expert, June
    1992, pp. 27-31.
  • Dhar, V., and Stein, R., Seven Methods for
    Transforming Corporate Data into Business
    Intelligence, Prentice Hall, 1997.
  • Kirrmann, H., "Neural Computing: The new gold rush
    in informatics", IEEE Micro, June 1989, pp. 7-9.
  • Lippmann, R.P., "An Introduction to Computing with
    Neural Nets", IEEE ASSP Magazine, April 1987,
    pp. 4-21.
  • Lisboa, P. (Ed.), Neural Networks: Current
    Applications, Chapman & Hall, 1992.
  • Negnevitsky, M., Artificial Intelligence: A Guide
    to Intelligent Systems, Addison-Wesley, 2005.

47
REFERENCES (contd)
  • Quaddus, M.A., and Khan, M.S., "Evolution of
    Artificial Neural Networks in Business
    Applications: An Empirical Investigation Using a
    Growth Model", International Journal of
    Management and Decision Making, Vol. 3, No. 1,
    March 2002, pp. 19-34. (See also the ANN
    application publications EndNote library files,
    ICT619 ftp site.)
  • Wasserman, P.D., Neural Computing: Theory and
    Practice, Van Nostrand Reinhold, New York, 1989.
  • Wong, B.K., Bodnovich, T.A., and Selvi, Y.,
    "Neural network applications in business: A
    review and analysis of the literature (1988-95)",
    Decision Support Systems, 19, 1997, pp. 301-320.
  • Zahedi, F., Intelligent Systems for Business,
    Wadsworth Publishing, Belmont, California, 1993.
  • http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html