Title: Intro. to Neural Networks
1. Intro. to Neural Networks: Using a Radial-Basis Neural Network to Classify Mammograms
Pattern Recognition, 2nd Presentation
Mohammed Jirari - Spring 2003
2. Neural Network History
- Originally hailed as a breakthrough in AI
- Biologically inspired information processing systems (the parallel architecture of animal brains vs. the processing/memory abstraction of human information processing)
- Referred to as Connectionist Networks
- Now better understood
- Hundreds of variants
- Less a model of the actual brain than a useful tool
- Numerous applications
- handwriting, face, speech recognition
- CMU van that drives itself
3. Perceptrons
- Initial proposal of connectionist networks
- Rosenblatt, 1950s and 60s
- Essentially a linear discriminant composed of nodes and weights
4. Perceptron Example
Net input = 2(0.5) + 1(0.3) + (-1) = 0.3 > 0, so O = 1
Learning Procedure
- Randomly assign weights (between 0 and 1)
- Present inputs from the training data
- Get output O, nudge the weights to give results closer to our desired output T
- Repeat; stop when there are no errors or enough epochs have been completed
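A minimal sketch of this learning procedure, assuming the standard Rosenblatt update rule (weights change by rate * (T - O) * input) with the threshold handled as an extra weight on a constant input of 1; the rate and epoch limit are illustrative, not from the slides:

```python
import random

def train_perceptron(examples, n_inputs, rate=1.0, max_epochs=100):
    # Randomly assign weights (between 0 and 1); the last weight acts as the threshold
    weights = [random.random() for _ in range(n_inputs + 1)]
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in examples:
            x = list(inputs) + [1.0]                       # append constant bias input
            net = sum(w * xi for w, xi in zip(weights, x))
            output = 1 if net > 0 else 0                   # hard threshold
            if output != target:
                errors += 1
                # Nudge the weights toward the desired output T
                for i in range(len(weights)):
                    weights[i] += rate * (target - output) * x[i]
        if errors == 0:                                    # stop when no errors
            break
    return weights

# Usage: learn the logical AND of two inputs
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
print(train_perceptron(data, n_inputs=2))
```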
5. Perceptron Training
Weights include the threshold. T = desired output, O = actual output.
Example: T = 0, O = 1, W1 = 0.5, W2 = 0.3, I1 = 2, I2 = 1, Theta = -1
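A worked step for these numbers, assuming the standard perceptron update $\Delta W_i = \eta\,(T - O)\,I_i$ with learning rate $\eta = 1$ and the threshold treated as a weight on a constant input of 1 (this rule is the usual one but is not spelled out on the slide):

$\Delta W_1 = (0 - 1)(2) = -2 \;\Rightarrow\; W_1 = 0.5 - 2 = -1.5$
$\Delta W_2 = (0 - 1)(1) = -1 \;\Rightarrow\; W_2 = 0.3 - 1 = -0.7$
$\Delta \Theta = (0 - 1)(1) = -1 \;\Rightarrow\; \Theta = -1 - 1 = -2$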
6. Perceptrons
- Can add a learning rate to speed up the learning process: just multiply it in with the delta computation
- Essentially a linear discriminant
- Perceptron convergence theorem: if a linear discriminant exists that can separate the classes without error, the training procedure is guaranteed to find that line or plane.
7. Strengths of Neural Networks
- Inherently Non-Linear
- Rely on generalized input-output mappings
- Provide confidence levels for solutions
- Efficient handling of contextual data
- Adaptable
- Great for changing environment
- Potential problem with spikes in the environment
8. Strengths of Neural Networks (continued)
- Can benefit from Neurobiological Research
- Uniform analysis and design
- Hardware implementable
- Speed
- Fault tolerance
9. Hebb's Postulate of Learning
- The effectiveness of a variable synapse between two neurons is increased by the repeated activation of one neuron by the other across that synapse
- This postulate is often viewed as the basic principle behind neural networks
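One common formalization of this postulate (an assumption here, not stated on the slide) is the Hebbian weight update, in which the connection strengthens in proportion to correlated activity on both sides of the synapse:

$\Delta w_{ij} = \eta \, x_i \, y_j$

where $x_i$ is the presynaptic activity, $y_j$ the postsynaptic activity, and $\eta$ a small learning rate.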
10. LMS Learning
LMS (Least Mean Square) learning systems are more general than the previous perceptron learning rule. The concept is to minimize the total error E, as measured over all training examples P:
E = (1/2) Σp (Tp - Op)²
where O is the raw output, as calculated by the weighted sum of the unit's inputs.
E.g., if we have two patterns with T1 = 1, O1 = 0.8 and T2 = 0, O2 = 0.5, then E = (0.5)[(1 - 0.8)² + (0 - 0.5)²] = 0.145.
We want to minimize this LMS error by adjusting the weights.
[Figure: error E plotted against a weight W, showing the step from W(old) to W(new); C = learning rate.]
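A short sketch reproducing the slide's two-pattern error calculation:

```python
# LMS error for the example above: E = (1/2) * sum of squared differences
targets = [1.0, 0.0]
outputs = [0.8, 0.5]
E = 0.5 * sum((t - o) ** 2 for t, o in zip(targets, outputs))
print(E)  # 0.145
```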
11. LMS Gradient Descent
- Using LMS, we want to minimize the error. We can do this by finding the direction on the error surface that most rapidly reduces the error; this means finding the slope of the error function by taking its derivative. The approach is called gradient descent (similar to hill climbing).
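A minimal sketch of one gradient-descent step for a single linear unit, assuming the raw output is the weighted sum of the inputs so that the error slope with respect to each weight is -(T - O) * Ii (the delta rule); the learning rate and values are placeholders:

```python
def lms_step(weights, inputs, target, rate=0.1):
    output = sum(w * x for w, x in zip(weights, inputs))   # raw (linear) output
    error = target - output
    # Move each weight down the error surface: dE/dWi = -(T - O) * Ii
    return [w + rate * error * x for w, x in zip(weights, inputs)]

w = [0.5, 0.3]
print(lms_step(w, inputs=[2.0, 1.0], target=0.0))
```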
12. Activation Function
- To apply the LMS learning rule, also known as the delta rule, we need a differentiable activation function.
[Figure: the old hard-limiting threshold activation is replaced by a new, differentiable sigmoidal activation.]
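A brief sketch of the kind of differentiable activation the slides refer to, assuming the standard logistic sigmoid (the exact function used is not shown here):

```python
import math

def sigmoid(x):
    # Smooth, differentiable replacement for the hard threshold
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)   # convenient closed form used by the delta rule
```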
13. LMS vs. Limiting Threshold
- With the new sigmoidal function that is differentiable, we can apply the delta rule toward learning.
- Perceptron Method
- Forces the output to 0 or 1, while LMS uses the net output
- Guaranteed to separate, if a zero-error solution exists and the classes are linearly separable
- Gradient Descent Method
- May oscillate and not converge
- May converge to a wrong answer
- Will converge to some minimum even if the classes are not linearly separable, unlike the earlier perceptron training method
14. Backpropagation Networks
- Attributed to Rumelhart and McClelland, mid-80s
- To get past the limitation to linearly separable problems, we can construct multilayer networks. Typically these are fully connected, feedforward networks.
[Figure: a fully connected feedforward network with input layer I1, I2, I3, hidden layer H1, H2, and output layer O1, O2; weights Wi,j connect the inputs to the hidden units and Wj,k connect the hidden units to the outputs; constant inputs of 1 supply the bias to each layer.]
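A hedged sketch of a forward pass through the 3-2-2 network drawn above, using the sigmoid from the previous slide; the weight values are placeholders, not taken from the slides:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights):
    # weights: one row per unit in this layer; the last entry of each row is the
    # bias (the weight on the constant 1 input shown in the diagram)
    return [sigmoid(sum(w * x for w, x in zip(row, inputs + [1.0]))) for row in weights]

# Placeholder weights for a 3-input, 2-hidden, 2-output network
W_ij = [[0.1, -0.2, 0.4, 0.05],   # hidden unit H1: 3 input weights + bias
        [0.3, 0.1, -0.5, -0.1]]   # hidden unit H2
W_jk = [[0.7, -0.3, 0.2],         # output unit O1: 2 hidden weights + bias
        [-0.4, 0.6, 0.1]]         # output unit O2

hidden = layer([0.5, 0.9, 0.1], W_ij)
outputs = layer(hidden, W_jk)
print(outputs)
```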
15. Backprop - Learning
Learning Procedure
- Randomly assign weights (between 0 and 1)
- Present inputs from the training data, propagate to the outputs
- Compute the outputs O, adjust the weights according to the delta rule, backpropagating the errors. The weights will be nudged closer so that the network learns to give the desired output.
- Repeat; stop when there are no errors or enough epochs have been completed
16. Backprop - Modifying Weights
We had computed the weight change ΔW from the derivative of the error (the delta rule). For the output unit k, f(sum) = O(k). For the output units, this is
δk = Ok (1 - Ok)(Tk - Ok)
For the hidden units (skipping some math), this is
δj = Hj (1 - Hj) Σk Wj,k δk
[Figure: I → H → O, with weights Wi,j between the input and hidden layers and Wj,k between the hidden and output layers.]
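A hedged sketch of one backpropagation update built from these deltas, reusing layer() and the placeholder weights W_ij, W_jk from the earlier forward-pass sketch; the learning rate and example data are illustrative:

```python
def backprop_step(inputs, targets, W_ij, W_jk, rate=0.5):
    # Forward pass
    hidden = layer(inputs, W_ij)
    outputs = layer(hidden, W_jk)

    # Output-unit deltas: delta_k = Ok (1 - Ok)(Tk - Ok)
    d_out = [o * (1 - o) * (t - o) for o, t in zip(outputs, targets)]
    # Hidden-unit deltas: delta_j = Hj (1 - Hj) * sum_k Wj,k * delta_k
    d_hid = [h * (1 - h) * sum(W_jk[k][j] * d_out[k] for k in range(len(d_out)))
             for j, h in enumerate(hidden)]

    # Nudge each weight by rate * delta * the activation feeding it (bias input is 1)
    for k, row in enumerate(W_jk):
        for j, a in enumerate(hidden + [1.0]):
            row[j] += rate * d_out[k] * a
    for j, row in enumerate(W_ij):
        for i, a in enumerate(inputs + [1.0]):
            row[i] += rate * d_hid[j] * a

backprop_step([0.5, 0.9, 0.1], [1.0, 0.0], W_ij, W_jk)
```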
17. Backprop
- Very powerful: given enough hidden units, it can approximate any function.
- Has the same problem of generalization vs. memorization. With too many units, the network will tend to memorize the input and not generalize well. Some schemes exist to prune the neural network.
- Networks require extensive training and have many parameters to fiddle with. They can be extremely slow to train and may fall into local minima.
- Inherently a parallel algorithm, ideal for multiprocessor hardware.
- Despite the cons, a very powerful algorithm that has seen widespread successful deployment.
18. Why This Project?
- Breast cancer is the most common cancer and the second leading cause of cancer deaths
- Mammographic screening reduces the mortality of breast cancer
- But mammography has a low positive predictive value, PPV (only 35% of suspicious findings are malignant)
- The goal of Computer Aided Diagnosis (CAD) is to provide a second reading, hence reducing the false positive rate
19. Data Used in my Project
- The dataset used is the Mammographic Image Analysis Society (MIAS) MiniMIAS database, containing Medio-Lateral Oblique (MLO) views of each breast for 161 patients, for a total of 322 images.
- Every image is 1024 pixels x 1024 pixels, with 256 grey levels
20. Sample of a Well-Defined/Circumscribed Masses Mammogram
21. Sample of a Normal Mammogram
22. Sample of an Ill-Defined Masses Mammogram
23. Sample of an Asymmetric Mammogram
24. Sample of an Architecturally Distorted Mammogram
25. Sample of a Spiculated Masses Mammogram
26. Sample of a Calcification Mammogram
27. Approach Followed
- Normalize all images between 0 and 1
- Normalize the features between 0 and 1
- Train the network
- Test on an image (Simulate the network)
- Denormalize the classification values
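A hedged sketch of the normalization steps, assuming 8-bit images (divide grey levels by 255) and min-max scaling for the features; the original work used different tooling, so this is only illustrative:

```python
import numpy as np

def normalize_image(img):
    # Map 8-bit grey levels (0-255) into [0, 1]
    return img.astype(np.float64) / 255.0

def normalize_features(features):
    # Min-max scale each feature column into [0, 1]
    features = np.asarray(features, dtype=np.float64)
    lo, hi = features.min(axis=0), features.max(axis=0)
    return (features - lo) / np.where(hi > lo, hi - lo, 1.0)
```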
28. Features Used to Train
- Character of background tissue
- Fatty, Fatty-Glandular, and Dense-Glandular
- Severity of abnormality
- Benign or Malignant
- Class of abnormality present
- Calcification, Well-Defined/Circumscribed Masses,
Spiculated Masses, Other/Ill-Defined Masses,
Architectural Distortion, Asymmetry, and Normal
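A possible encoding of these categorical features as numbers in [0, 1] so they can be fed to the network; the specific numeric codes below are assumptions for illustration, not values taken from the slides:

```python
# Hypothetical numeric codes for the MiniMIAS categorical features, scaled to [0, 1]
TISSUE = {"F": 0.0, "G": 0.5, "D": 1.0}          # Fatty, Fatty-Glandular, Dense-Glandular
SEVERITY = {"B": 0.0, "M": 1.0}                   # Benign, Malignant
ABNORMALITY = {"CALC": 0.0, "CIRC": 1/6, "SPIC": 2/6,
               "MISC": 3/6, "ARCH": 4/6, "ASYM": 5/6, "NORM": 1.0}

def encode(tissue, abnormality, severity=None):
    # Normal cases carry no severity label; use a neutral placeholder value
    sev = SEVERITY.get(severity, 0.5)
    return [TISSUE[tissue], ABNORMALITY[abnormality], sev]

print(encode("G", "CIRC", "B"))
```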
29. Radial Basis Network Used
- Radial basis networks may require more neurons than standard feed-forward backpropagation (FFBP) networks
- BUT, they can be designed in a fraction of the time it takes to train FFBP networks
- They work best with many training vectors
30. Radial Basis Network with R Inputs
31. The radial basis transfer function: radbas(n) = e^(-n²)
a = radbas(n)
32. The radial basis network consists of 2 layers: a hidden radial basis layer of S1 neurons and an output linear layer of S2 neurons
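A hedged sketch of this two-layer structure; the distance-based hidden layer and linear output layer follow the general radial-basis design the slides describe, but the centers, spread, and output weights below are placeholders:

```python
import numpy as np

def radbas(n):
    # Radial basis transfer function from the previous slide: a = exp(-n^2)
    return np.exp(-n ** 2)

def rbf_forward(x, centers, spread, W2, b2):
    # Hidden radial basis layer: each of the S1 neurons responds to the distance
    # between the input vector and its center, scaled by the spread
    dist = np.linalg.norm(centers - x, axis=1)
    a1 = radbas(dist * spread)
    # Output linear layer of S2 neurons
    return W2 @ a1 + b2

# Placeholder parameters: R = 4 inputs, S1 = 3 hidden neurons, S2 = 2 outputs
centers = np.random.rand(3, 4)
W2, b2 = np.random.rand(2, 3), np.random.rand(2)
print(rbf_forward(np.random.rand(4), centers, spread=1.0, W2=W2, b2=b2))
```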
33. Results and Future Work
- The network was able to correctly classify 55% of the mammograms
- I will use more pre-processing, including sub-sampling, segmentation, and statistical features extracted from the images, as well as the coordinates of the center of the abnormality and the approximate radius of a circle enclosing the abnormality.
- I will use different networks, such as the fuzzy ARTMAP network, self-organizing networks, and cellular networks, and compare their results in designing a good CAD.