ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Slides: 52

Provided by: drsham

1
ICT619 Intelligent Systems Topic 4: Artificial
Neural Networks
2
Artificial Neural Networks

PART A
Introduction
An overview of the biological neuron
The synthetic neuron
Structure and operation of an ANN
Problem solving by an ANN
Learning in ANNs
ANN models
Applications

PART B
Developing neural network applications
Design of the network
Training issues
A comparison of ANN and ES
Hybrid ANN systems
Case Studies

3
Developing neural network applications

Neural Network Implementations
Three possible practical implementations of ANNs
are
A software simulation program running on a
digital computer
A hardware emulator connected to a host computer
- called a neurocomputer
True electronic circuits

4
Software Simulations of ANN

Currently the cheapest and simplest
implementation method for ANNs - at least for
general purpose use.
Simulates parallel processing on a conventional
sequential digital computer
Replicates temporal behaviour of the network by
updating the activation level and output of each
node for successive time steps
These steps are represented by iterations or
loops
Within each loop, the updates for all nodes in a
layer are performed.
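
The loop structure described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular simulator's code: the 2-2-1 network, its hand-picked weights, and the names `update_layer` and `simulate` are all assumptions made for the example.

```python
import math


def sigmoid(x):
    """Logistic transfer function, used here as an example node characteristic."""
    return 1.0 / (1.0 + math.exp(-x))


def update_layer(inputs, weights, biases):
    """Update every node in one layer from the previous layer's outputs."""
    outputs = []
    for node_weights, bias in zip(weights, biases):
        activation = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(sigmoid(activation))
    return outputs


def simulate(network, input_vector):
    """Run a feedforward pass; each loop iteration stands for one time step."""
    signal = input_vector
    for weights, biases in network:
        signal = update_layer(signal, weights, biases)
    return signal


# Hypothetical 2-2-1 network; the weights are illustrative, not trained.
network = [
    ([[0.5, -0.5], [0.3, 0.8]], [0.0, -0.1]),  # hidden layer: 2 nodes
    ([[1.0, -1.0]], [0.0]),                    # output layer: 1 node
]
output = simulate(network, [1.0, 0.0])
```

Because each layer is completed before the next one starts, the sequential loop gives the same result for a feedforward net as a truly parallel update would.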

5
Software simulations of ANN (contd)

In multilayer ANNs, processing for a layer is
completed and its output used to calculate states
of the nodes in the following layer
Typical additional features of ANN simulators
Configuring the net according to a chosen
architecture and node operational characteristic
Implementation of training phase using a chosen
training algorithm
Tools for visualising and analysing behaviour of
nets
ANN simulators are written in high-level languages
such as C, C++ and Java.

6
Advantages and possible problems with software
simulators

Main attraction of ANN simulators is the
relatively low cost and wide availability of
ready-made commercial packages
They are also compact, flexible and highly
portable.
Writing your own simulator requires programming
skills and would be time consuming (except that
you don't have to now!)
Training of ANNs using software simulators can be
slow for larger networks (greater than a few
hundred)

7
Commercially available neural net packages

Prewritten shells with convenient user interfaces
Cost a few hundred to tens of thousands of
dollars
Allow users to specify the ANN design and
training parameters
Usually provide graphic interfaces to enable
monitoring of the net's training and operation
Likely to provide interfacing with other software
systems such as spreadsheets and databases.

8
Neurocomputers

Dedicated special-purpose digital computer (aka
accelerator boards)
Optimised to perform operations common in neural
network simulation
Acts as a coprocessor to a host computer and is
controlled by a program running on the host.
Can be tens to thousands of times faster than
simulators
Systems are available with approx. 1,000 million
interconnections per second (IPS), i.e. connection
updates per second, for networks with 8,192
neurons, e.g. the ACC Neural Network Processor

9
Neurocomputers

Genobyte's CAM-Brain Machine was developed
between 1997 and 2000

10
True Networks in Hardware

Closer to biological neural networks than
simulations
Consist of synthetic neurons actually fabricated
on silicon chips
Commercially available hardwired ANNs are limited
to a few thousand neurons per chip [1].
Chips connected in parallel to achieve larger
networks.
Problems: interconnection and interference,
fixed-valued weights - work progressing on
modifiable synapses.
[1] Figures more than five years old.

11
Neural Network Development Methodology

Aims to add structure and organisation to ANN
applications development for reducing cost,
increasing accuracy, consistency, user confidence
and friendliness
Split development into the following phases
The Concept Phase
The Design Phase
The Implementation Phase
The Maintenance Phase

12
Neural Network Development Methodology - the
Concept Phase

Involves
Validating the proposed application
Selecting an appropriate neural paradigm.
Application validation
Problem characteristics suitable for neural
network application are
Data intensive
Multiple interacting parameters
Incomplete, erroneous, noisy data
Solution function unknown or expensive
Requires flexibility, generalisation,
fault-tolerance, speed

13
ANN Development Methodology - the Concept Phase
(contd)

Common examples of applications with above
attributes are
pattern recognition (eg, printed or handwritten
character, consumer behaviour, risk patterns),
forecasting (eg, stock market), signal (audio,
video, ultrasound) processing
Problems not suitable for ANN-based solutions
include those where
A mathematically accurate and precise solution is
available
A solution involving deduction and step-wise logic
is appropriate
The application involves explanation or reporting
One application area that is unsuitable for ANNs
is resource management eg, inventory, accounts,
sales data analysis

14
Selecting an ANN paradigm

Decision based on comparison of application
requirements to capabilities of different
paradigms
eg, the multilayer perceptron is well known
for its pattern recognition capabilities,
Kohonen net more suited for applications
involving data clustering
Choice of paradigm also influenced by the
training method that can be employed
eg, supervised training must have an adequate
number of (input, correct output) pairs available,
and training may take a relatively long time
Technical and economic feasibility assessments
should be carried out to complete the concept
phase

15
The Design Phase

The design phase specifies initial values and
conditions at the node, network and training
levels
Decisions to be made at the node level include
Types of input: binary (0,1), bipolar (-1,1),
trivalent (-1, 0, 1), discrete,
continuous-valued
Transfer function: step or threshold,
hyperbolic tangent, sigmoid; consider possible
use of lookup tables for speeding up calculations
Decisions to be made at the network architecture
level
The number and size of layers and their
connectivity
(fully interconnected, or sparsely
interconnected, feedforward or recurrent, other?)
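
The node-level choices above can be made concrete with a short Python sketch. It shows a step function, a sigmoid, and a lookup-table version of the sigmoid (the hyperbolic tangent is available directly as `math.tanh`). The table range of [-8, 8], the table size, and the function names are assumptions chosen for illustration only.

```python
import math


def step(x, threshold=0.0):
    """Step (threshold) transfer function with binary output."""
    return 1.0 if x >= threshold else 0.0


def sigmoid(x):
    """Logistic sigmoid, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Lookup table: precompute the sigmoid on an evenly spaced grid.
TABLE_MIN, TABLE_MAX, TABLE_SIZE = -8.0, 8.0, 1601
TABLE_STEP = (TABLE_MAX - TABLE_MIN) / (TABLE_SIZE - 1)
SIGMOID_TABLE = [sigmoid(TABLE_MIN + i * TABLE_STEP) for i in range(TABLE_SIZE)]


def sigmoid_lookup(x):
    """Approximate the sigmoid by the nearest precomputed table entry."""
    if x <= TABLE_MIN:
        return SIGMOID_TABLE[0]
    if x >= TABLE_MAX:
        return SIGMOID_TABLE[-1]
    index = round((x - TABLE_MIN) / TABLE_STEP)
    return SIGMOID_TABLE[index]


approx = sigmoid_lookup(1.234)
exact = sigmoid(1.234)
```

The table trades a small approximation error for avoiding the exponential on every call; whether that is actually faster depends on the platform, so it should be measured rather than assumed.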

16
The Design Phase (contd)

'Size' of a layer is the number of nodes in the
layer
For the input layer, size is determined by number
of data sources (input vector components) and
possibly the mathematical transformations done
The number of nodes in the output layer is
determined by the number of classes or decision
values to be output
Finding optimal size of the hidden layer needs
some experimentation
Too few nodes will produce inadequate mapping,
while too many may result in inadequate
generalisation

17
The Design Phase (contd)

Connectivity
Connectivity determines the flow of signals
between neurons in the same or different layers
Some ANN models, such as the multilayer
perceptron, have only interlayer connections -
there is no intralayer connection
The Hopfield net is an example of a model with
intralayer connections

18
The Design Phase (contd)

Feedback
There may be no feedback of output values, eg,
the multilayer perceptron
or
There may be feedback as in a recurrent network
eg, the Hopfield net
Other design questions include
Setting of parameters for the learning phase
eg, stopping criterion, learning rate.
Possible addition of noise to speed up training.
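
As a minimal sketch of a network with both intralayer connections and feedback, the following Python code stores one bipolar pattern in a small Hopfield-style net and recalls it from a corrupted copy. The Hebbian storage rule, the asynchronous update order, the six-node pattern and the function names are assumptions made for illustration.

```python
def train_hopfield(patterns):
    """Build a symmetric weight matrix from bipolar (-1, 1) patterns (Hebbian rule)."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j]
    return weights


def recall(weights, state, max_sweeps=10):
    """Feed outputs back as inputs until no node changes its state."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):  # asynchronous update, one node at a time
            total = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if total >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


stored = [1, -1, 1, -1, 1, -1]
weights = train_hopfield([stored])
noisy = [1, -1, 1, -1, 1, 1]  # last component flipped
recovered = recall(weights, noisy)
```

Every node is connected to every other node in the same layer, and the repeated sweeps are the feedback: the net's own output becomes its next input.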

19
The Implementation phase

Typical steps
Gathering the training set
Selecting the development environment
Implementing the neural network
Testing and debugging the network
Gathering the training set
Aims to get right type of data in adequate amount
and in the right format

20
Gathering training data (contd)

How much data to gather?
Increasing data amount increases training time
but may help earlier convergence
Quality more important than quantity
Collection of data
Potential sources - historical records,
instrument readings, simulation results
Preparation of data
Involves preprocessing including scaling,
normalisation, binarisation, mapping to
logarithmic scale, etc.
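
The preprocessing steps named above might look like this in Python. It is only a sketch: the function names, the default [0, 1] target range and the sample readings are assumptions for the example.

```python
import math


def min_max_scale(values, low=0.0, high=1.0):
    """Linearly rescale values into the range [low, high]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        return [low for _ in values]  # constant input carries no information
    span = v_max - v_min
    return [low + (v - v_min) * (high - low) / span for v in values]


def binarise(values, threshold):
    """Map each value to 1 if it reaches the threshold, else 0."""
    return [1 if v >= threshold else 0 for v in values]


def log_scale(values):
    """Map strictly positive values onto a base-10 logarithmic scale."""
    return [math.log10(v) for v in values]


# Hypothetical instrument readings spanning several orders of magnitude.
readings = [10.0, 100.0, 1000.0, 10000.0]
scaled = min_max_scale(log_scale(readings))
flags = binarise(scaled, 0.5)
```

Taking the logarithm first keeps the largest reading from dominating the scaled range.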

21
Gathering training data (contd)

Type of data to collect should be representative
of given problem including routine, unusual and
boundary-condition cases
Mix of good as well as imperfect data but not
ambiguous or too erroneous.
Amount of data to gather
Increasing data amount increases training time
but may help earlier convergence
Quality more important than quantity

22
Gathering training data (contd)

Collection of data
Potential sources - historical records,
instrument readings, simulation results
Preparation of data
Involves preprocessing including normalisation
and possible binarisation

23
Selecting the development environment

Hardware and software aspects
Hardware requirements based on
speed of operation
memory and storage capacity
software availability
cost
compatibility
The most popular platforms are workstations and
high-end PCs (with accelerator board option)

24
Selecting the development environment

Two options in choosing software
Custom-coded simulators, which require more
expertise on the part of the user but provide
maximum flexibility
Commercial development packages, which are
usually easy to use because of a more
sophisticated interface

25
Selecting the development environment (contd)

Selection of hardware and software environment
usually based on following considerations
ANN paradigm to be implemented
Speed in training and recall
Transportability
Vendor support
Extensibility
Price

26
Implementing the neural network

Common steps involved are
Selection of appropriate neural paradigm
Setting network size
Deciding on the learning algorithm
Creation of screen displays
Determining the halting criteria
Collecting data for training and testing
Data preparation including preprocessing
Organising data into training and test sets
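
One simple way to organise data into training and test sets is a shuffled split, sketched below in Python. The 20% test fraction, the fixed seed and the name `split_dataset` are assumptions; other schemes, such as cross-validation, are equally possible.

```python
import random


def split_dataset(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and split it into (training, test) lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical data set of ten numbered examples.
examples = list(range(10))
training_set, test_set = split_dataset(examples)
```

Shuffling before splitting avoids a test set made up only of the most recently collected cases.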

27
Implementation - Training

Training the net, which consists of
Loading the training set
Initialisation of network weights usually to
small random values
Starting the training process
Monitoring the training process until training is
completed
Saving of weight values in a file for use during
operation mode
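
The training steps listed above can be sketched end to end for the smallest possible case, a single sigmoid node trained with the delta rule on the logical OR function. Everything specific here is an assumption for illustration: the OR data, the learning rate, the halting threshold, the function name `train` and the file name `ann_weights.json`.

```python
import json
import math
import os
import random
import tempfile


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train(training_set, learning_rate=0.5, max_epochs=5000,
          target_error=0.05, seed=1):
    """Train one sigmoid node; return its weights and the per-epoch error."""
    rng = random.Random(seed)
    n_inputs = len(training_set[0][0])
    # Initialise weights (the last one is the bias) to small random values.
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n_inputs + 1)]
    history = []
    for _ in range(max_epochs):
        total_error = 0.0
        for inputs, target in training_set:
            extended = list(inputs) + [1.0]  # constant input for the bias
            output = sigmoid(sum(w * x for w, x in zip(weights, extended)))
            error = target - output
            total_error += error * error
            delta = learning_rate * error * output * (1.0 - output)
            weights = [w + delta * x for w, x in zip(weights, extended)]
        history.append(total_error)  # monitoring: squared error per epoch
        if total_error < target_error:  # halting criterion
            break
    return weights, history


# Load the training set (logical OR).
training_set = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, history = train(training_set)

# Save the weight values in a file for use during operation mode.
weights_path = os.path.join(tempfile.gettempdir(), "ann_weights.json")
with open(weights_path, "w") as handle:
    json.dump(weights, handle)
```

In operation mode the saved file would be read back with `json.load` and used for recall only, with no further weight changes.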

28
Implementation - Training (contd)

Possible problems arising during training
Failure to converge to a set of optimal weight
values
Further weight adjustments fail to reduce output
error, stuck in a local minimum
Remedied by resetting the learning parameters and
reinitialising the weights
Overtraining
Net fails to generalise, i.e., fails to classify
less than perfect patterns
Mix of good and imperfect patterns for training
helps

29
Implementation - Training (contd)

Training results may be affected by the method of
presenting data set to the network.
Adjustments may be made by varying the layer
sizes and fine-tuning the learning parameters.
To ensure optimal results, several variations of
a neural network may be trained and each tested
for accuracy

30
Implementation - Testing and Debugging

Testing can be done by
1. Observing operational behaviour of the net.
2. Analysing actual weights
3. Study of network behaviour under specific
conditions
Observing operational behaviour
Network treated as a black box and its response
to a series of test cases is evaluated
Test data
Should contain training cases as well as new
cases
Routine, unusual as well as boundary condition
cases should be tried
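
Black-box testing of this kind can be sketched as a small harness that only calls the network and compares its answers with the expected ones. In the Python below, `predict` is a hypothetical stand-in for a trained net, and the test cases are invented to show routine, boundary and unusual inputs.

```python
def evaluate(predict, test_cases):
    """Return (accuracy, failures) for a black-box predict function."""
    failures = []
    for inputs, expected in test_cases:
        actual = predict(inputs)
        if actual != expected:
            failures.append((inputs, expected, actual))
    accuracy = 1.0 - len(failures) / len(test_cases)
    return accuracy, failures


def predict(inputs):
    """Hypothetical stand-in for a trained network's recall."""
    return 1 if sum(inputs) >= 1.0 else 0


test_cases = [
    ([0.0, 0.0], 0),   # routine
    ([1.0, 1.0], 1),   # routine
    ([0.5, 0.5], 1),   # boundary condition
    ([0.49, 0.5], 0),  # just below the boundary
    ([3.0, -1.0], 0),  # unusual case that this stand-in gets wrong
]
accuracy, failures = evaluate(predict, test_cases)
```

Keeping the failed cases, not just the accuracy figure, gives the debugging step concrete examples to examine.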

31
Implementation - Testing and Debugging (contd)

Testing by weight analysis
Weights entering and exiting nodes analysed for
relatively small and large values
In case of significant errors detected in
testing, debugging would involve examining
the training cases for representativeness,
accuracy and adequacy of number
learning algorithm parameters such as the rate at
which weights are adjusted
neural network architecture, node
characteristics, and connectivity
training set-network interface, user-network
interface
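
The weight analysis mentioned at the top of this slide could start with a simple scan for relatively small and large values, as in the Python sketch below. The thresholds, the connection names and the function name `analyse_weights` are assumptions; sensible cut-offs depend on the network and its input scaling.

```python
def analyse_weights(weights, small=0.01, large=10.0):
    """Group connections whose weight magnitude is unusually small or large."""
    report = {"small": [], "large": []}
    for connection, value in sorted(weights.items()):
        magnitude = abs(value)
        if magnitude < small:
            report["small"].append(connection)
        elif magnitude > large:
            report["large"].append(connection)
    return report


# Hypothetical weights keyed by (source node, target node).
weights = {
    ("in1", "h1"): 0.8,
    ("in2", "h1"): 0.001,
    ("in1", "h2"): -14.2,
    ("in2", "h2"): 1.5,
}
report = analyse_weights(weights)
```

A near-zero weight may mean the source node contributes little, and a very large one may point to a saturated node; both are prompts for closer inspection rather than proof of a fault.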

32
The Maintenance Phase

Consists of
placing the neural network in an operational
environment with possible integration
periodic performance evaluation, and maintenance
Although often designed as stand-alone systems,
some neural network systems are integrated with
other information systems using
Loose coupling: preprocessor, postprocessor,
distributed component
Tight coupling or full integration: as an embedded
component

33
The Maintenance Phase

Possible ANN operational environments

34
System evaluation

Continual evaluation is necessary to
ensure satisfactory performance in solving
dynamic problems
check for damaged or retrained networks.
Evaluation can be carried out by reusing original
test procedures with current data.

35
ANN Maintenance

Involves modification necessitated by
Decreasing accuracy
Enhancements
System modification falls into two categories
involving either data or software.
Data modification steps
Training data is modified or replaced
Network retrained and re-evaluated.

36
ANN Maintenance (contd)

Software changes include changes in
Interfaces
cooperating programs
the structure of the network.
If the network is changed, part of the design and
most of the implementation phase may have to be
repeated.
Backup copies should be used for maintenance and
research.

37
A comparison of ANN and ES

Similarities between ES and ANN
Both aim to create intelligent computer systems
by mimicking human intelligence, although at
different levels
Design process of neither ES nor ANN is automatic
Knowledge extraction in ES is a time and labour
intensive process
ANNs are capable of learning but selection and
preprocessing of data have to be done carefully.

38
A comparison of ANN and ES (contd)

Differences between ANN and ES
Differ in aspects of design, operation and use
Logic vs. brain
ES simulate the human reasoning process based on
formal logic
ANNs are based on modelling the brain, both in
structure and operation
Sequential vs. parallel
The nature of processing in ES is sequential
ANNs are inherently parallel

39
A comparison of ANN and ES (contd)

External and static vs. internal and dynamic
Learning is performed external to the ES
ANN itself is responsible for its knowledge
acquisition during the training phase.
Learning is always off-line in ES - knowledge
remains static during operation
Learning in ANNs, although mostly off-line, can
be on-line
Deductive vs. inductive inferencing
Knowledge in an ES is always used in a deductive
reasoning process
An ANN constructs its knowledge base inductively
from examples, and uses it to produce decisions
through generalisation

40
A comparison of ANN and ES (contd)

Knowledge representation: explicit vs. implicit
ES store knowledge in explicit form - possible to
inspect and modify individual rules
An ANN's knowledge is stored implicitly in the
interconnection weight values
Design issues: simple vs. complex
Technical side of ES development relatively
simple without difficult design choices.
ANN design process often one of trial and error

41
A comparison of ANN and ES (contd)

User interface: white box vs. black box
ES have explanation capability
Difficulty in interpreting an ANN's
knowledge-base effectively makes it a black box
to the user
State of maturity and recognition:
well-established vs. early
ES already well established as a methodology in
commercial applications
ANN recognition and development tools at a
relatively early stage.

42
Hybrid systems

Neuro-symbolic computing utilises the
complementary nature of computing in neural
networks (numerical) and expert systems
(symbolic).
Neuro-fuzzy systems combine neural networks with
fuzzy logic
ANNs can also be combined with genetic algorithm
methodology
Hybrid ES-ANN systems
The strengths of the ES can be utilised to
overcome the weaknesses of an ANN based system
and vice versa.
For example, an ANN's extraction of knowledge from
data
and an ES's explanation capability

43
Hybrid ES-ANN systems

Rule extraction by inference justification in an
ANN
MACIE, an ANN based decision support system
described in (Gallant 1993)
Extracts a single rule that justifies an
inference in an ANN
Inference in an ANN is represented by output of a
single node
This output is based upon incomplete input values
fed from a number of nodes as shown in the
diagram below.

44
Hybrid ES-ANN systems (contd)

A node $u_i$ is defined to be a contributing node to
node $u_j$ if $w_{ij} u_i \geq 0$.

45
Hybrid ES-ANN systems (contd)

In this example, the contributing variables are
u2, u3, u5, u6 .
The rule produced in this example is
IF u6 is Unknown
AND u2 is TRUE
AND u3 is FALSE
AND u5 is TRUE
THEN conclude u7 is TRUE.
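
The selection of contributing nodes can be sketched in Python under some loudly stated assumptions. The original diagram is not available here, so the weights into u7 and the values of u1 and u4 are hypothetical, picked only so that the result matches the rule above. The comparison in the contributing-node test is taken to be "greater than or equal to zero", and TRUE, FALSE and Unknown are encoded as 1, -1 and 0. This is not Gallant's full MACIE procedure, only the contributing-node step.

```python
TRUE, FALSE, UNKNOWN = 1, -1, 0
LABELS = {TRUE: "TRUE", FALSE: "FALSE", UNKNOWN: "Unknown"}


def contributing_nodes(weights, values):
    """Return the inputs whose weighted value does not oppose a TRUE inference."""
    return [name for name in sorted(weights)
            if weights[name] * values[name] >= 0]


def build_rule(names, values, conclusion):
    """Format the contributing nodes as an IF ... THEN rule."""
    conditions = " AND ".join(f"{n} is {LABELS[values[n]]}" for n in names)
    return f"IF {conditions} THEN conclude {conclusion} is TRUE"


# Hypothetical weights on the connections into u7 (not from the source).
weights = {"u1": -2, "u2": 3, "u3": -2, "u4": 1, "u5": 2, "u6": 4}
values = {"u1": TRUE, "u2": TRUE, "u3": FALSE,
          "u4": FALSE, "u5": TRUE, "u6": UNKNOWN}

weighted_sum = sum(weights[n] * values[n] for n in weights)
contributors = contributing_nodes(weights, values)
rule = build_rule(contributors, values, "u7")
```

With these assumed numbers the weighted sum is positive, so u7 is inferred TRUE, and u1 and u4 are left out of the rule because their weighted values pull the sum the other way.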

46
Hybrid ES-ANN systems (contd)

One approach to hybrid systems divides a problem
into tasks suitable for either ES or ANN
These tasks are then performed by the appropriate
methodology
One example of such a system (Caudill 1991) is an
intelligent system for delivering packages
ES performs the task of producing the best
loading strategy for packages into trucks
ANN works out best route for delivering the
packages efficiently.

47
Hybrid ES-ANN systems (contd)

Hybrid ES-ANN systems with ANNs embedded within
expert systems
ANN used to determine which rule to fire, given
the current state of facts.
Another approach to hybrid ES-ANN uses an ANN as
a preprocessor
One or more ANNs produce classifications.
Numerical outputs produced by ANN are interpreted
symbolically by an ES as facts
ES applies the facts for deductive reasoning
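
The preprocessor arrangement can be sketched in Python as two stages: numerical ANN outputs are turned into symbolic facts, and a tiny rule base then reasons over them. The 0.5 threshold, the fact names and the rules are all hypothetical, and the forward-chaining loop is a generic stand-in for an expert system shell.

```python
def outputs_to_facts(outputs, threshold=0.5):
    """Interpret each numerical ANN output as a symbolic fact if it is high enough."""
    return {name for name, value in outputs.items() if value >= threshold}


def forward_chain(facts, rules):
    """Fire rules repeatedly until no new conclusion can be added."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts


# Hypothetical classifier outputs and rule base.
ann_outputs = {"surface_crack": 0.91, "corrosion": 0.12, "high_vibration": 0.77}
rules = [
    ({"surface_crack", "high_vibration"}, "structural_risk"),
    ({"structural_risk"}, "schedule_inspection"),
]

facts = outputs_to_facts(ann_outputs)
conclusions = forward_chain(facts, rules)
```

The ANN handles the noisy numerical classification, while the rules stay explicit and can be inspected or explained.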

48
Case Study

Case: Application of ANNs in bankruptcy
prediction (Coleman et al, AI Review, Summer
1991, in Zahedi 1993)
Predicts banks that were certain to fail within
a year
Predicted certainty is given to bank examiners
dealing with the bank in question.
ANN has 11 inputs, each of which is a ratio
developed by Peat Marwick.
Developed by NeuralWare's Application Development
Services and Support Group (ADSS)
Software used - the NeuralWorks Professional
neural network development system.
Uses the standard backpropagation (multilayer
perceptron) network.

49
Case Study (contd)

ANN has 11 inputs, each a ratio developed by Peat
Marwick.
Inputs connected to a single hidden layer, which
in turn is connected to a single node in the
output layer.
Network outputs a single value denoting whether
the bank would or would not fail within that
calendar year
Employed the hyperbolic-tangent transfer function
and a proprietary error function created by the
ADSS staff.
Trained on a set of 1,000 examples, 900 of which
were viable banks and 100 of which were banks
that had actually gone bankrupt
Training consisted of about 50,000 iterations of
the training set.
Predicted 50% of banks that are viable, and 99%
of banks that actually failed.
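
The architecture described on this slide (11 inputs, one hidden layer, one output node, hyperbolic-tangent transfer function) can be sketched as a forward pass in Python. The hidden layer size of 5, the random untrained weights, the sign convention for the output and all names are assumptions; the case study does not state them, and the proprietary error function is not reproduced.

```python
import math
import random

N_INPUTS = 11  # one input per financial ratio, as stated in the case study
N_HIDDEN = 5   # hypothetical: the hidden layer size is not given


def make_network(seed=0):
    """Create small random weights; each row ends with a bias weight."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.1, 0.1) for _ in range(N_INPUTS + 1)]
              for _ in range(N_HIDDEN)]
    output = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN + 1)]
    return hidden, output


def forward(network, ratios):
    """Propagate 11 ratios through the tanh hidden layer to one tanh output."""
    hidden_weights, output_weights = network
    extended = list(ratios) + [1.0]
    hidden = [math.tanh(sum(w * x for w, x in zip(row, extended)))
              for row in hidden_weights]
    return math.tanh(sum(w * x for w, x in zip(output_weights, hidden + [1.0])))


def will_fail(network, ratios):
    """Assumed convention: a positive output means 'fails within the year'."""
    return forward(network, ratios) > 0.0


network = make_network()
score = forward(network, [0.5] * N_INPUTS)  # placeholder ratios, not real data
```

Since the weights here are untrained, the score carries no meaning; the sketch only shows how a single output value arises from the 11 inputs.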

50
REFERENCES

AI Expert (special issue on ANN), June 1990.
BYTE (special issue on ANN), Aug. 1989.
Caudill, M., "The View from Now", AI Expert, June
1992, pp. 27-31.
Dhar, V., Stein, R., Seven Methods for
Transforming Corporate Data into Business
Intelligence, Prentice Hall, 1997.
Kirrmann, H., "Neural Computing: The new gold rush
in informatics", IEEE Micro, June 1989, pp. 7-9.
Lippmann, R.P., "An Introduction to Computing with
Neural Nets", IEEE ASSP Magazine, April 1987,
pp. 4-21.
Lisboa, P. (Ed.), Neural Networks: Current
Applications, Chapman & Hall, 1992.
Negnevitsky, M., Artificial Intelligence: A Guide
to Intelligent Systems, Addison-Wesley, 2005.

51
REFERENCES (contd)

Bailey, D., Thompson, D., "How to Develop Neural
Network Applications", AI Expert, June 1990, pp.
38-47.
Caudill, M., Butler, C., Naturally Intelligent
Systems, MIT Press, 1989, pp. 227-240.
Caudill, M., "Expert networks", BYTE, October
1991, pp. 109-116.
Gallant, S., Neural Network Learning and Expert
Systems, MIT Press, 1993.
Medsker, L., Hybrid Intelligent Systems, Kluwer
Academic Press, Boston, 1995.
Zahedi, F., Intelligent Systems for Business,
Wadsworth Publishing, Belmont, California, 1993.