Intro. to Neural Networks
1
Intro. to Neural Networks Using a Radial-Basis
Neural Network to Classify Mammograms
  • Pattern Recognition 2nd Presentation
    Mohammed Jirari
  • Spring 2003

2
Neural Network History
  • Originally hailed as a breakthrough in AI
  • Biologically inspired information processing
    systems (parallel architecture of animal brains
    vs processing/memory abstraction of human
    information processing)
  • Referred to as Connectionist Networks
  • Now, better understood
  • Hundreds of variants
  • Less a model of the actual brain than a useful
    tool
  • Numerous applications
  • handwriting, face, speech recognition
  • CMU van that drives itself

3
Perceptrons
  • Initial proposal of connectionist networks
  • Rosenblatt, 50s and 60s
  • Essentially a linear discriminant composed of
    nodes, weights

4
Perceptron Example
2(0.5) + 1(0.3) - 1 = 0.3 > 0, so O = 1
Learning Procedure
  • Randomly assign weights (between 0-1)
  • Present inputs from training data
  • Get output O; nudge the weights to give results
    closer to our desired output T
  • Repeat; stop when there are no errors, or enough
    epochs have completed (a sketch of this loop
    follows below)
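
A minimal Python sketch of this procedure, assuming a step activation and folding the threshold in as a weight on a constant input; the AND-gate data is illustrative, not from the slides:

```python
import numpy as np

def perceptron_train(inputs, targets, lr=1.0, max_epochs=100):
    """Perceptron learning: random initial weights, nudge on each error."""
    rng = np.random.default_rng(0)
    # Append a constant 1 so the threshold is learned as an ordinary weight.
    X = np.hstack([inputs, np.ones((len(inputs), 1))])
    w = rng.random(X.shape[1])            # random weights in [0, 1)
    for _ in range(max_epochs):
        errors = 0
        for x, t in zip(X, targets):
            o = 1 if x @ w > 0 else 0     # step activation
            if o != t:
                w += lr * (t - o) * x     # nudge weights toward target T
                errors += 1
        if errors == 0:                    # stop on an error-free epoch
            break
    return w

# Illustrative, linearly separable data: the AND function.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
T = np.array([0, 0, 0, 1])
print(perceptron_train(X, T))
```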

5
Perceptron Training
Weights include the threshold. T = desired output,
O = actual output.
Example: T = 0, O = 1, W1 = 0.5, W2 = 0.3, I1 = 2,
I2 = 1, Theta = -1
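Applying the standard perceptron update W_i = W_i + (T - O) * I_i to these numbers, as a quick check (a learning rate of 1 is assumed, and the threshold is treated as a weight on a constant input of 1):

```python
# One perceptron update step for the slide's numbers (learning rate 1 assumed).
T, O = 0, 1
I = [2, 1, 1]               # I1, I2, and the constant input for the threshold
W = [0.5, 0.3, -1]          # W1, W2, Theta
W = [w + (T - O) * i for w, i in zip(W, I)]
print(W)                    # [-1.5, -0.7, -2]
```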
6
Perceptrons
  • Can add a learning rate to speed up the learning
    process; just multiply it in with the delta
    computation
  • Essentially a linear discriminant
  • Perceptron convergence theorem: if a linear
    discriminant exists that can separate the classes
    without error, the training procedure is
    guaranteed to find that line or plane.

7
Strengths of Neural Networks
  • Inherently Non-Linear
  • Rely on generalized input-output mappings
  • Provide confidence levels for solutions
  • Efficient handling of contextual data
  • Adaptable
  • Great for changing environment
  • Potential problem with spikes in the environment

8
Strengths of Neural Networks
(continued)
  • Can benefit from Neurobiological Research
  • Uniform analysis and design
  • Hardware implementable
  • Speed
  • Fault tolerance

9
Hebb's Postulate of Learning
  • The effectiveness of a variable synapse between
    two neurons is increased by the repeated
    activation of one neuron by the other across the
    synapse
  • This postulate is often viewed as the basic
    principle behind neural networks

10
LMS Learning
LMS (Least Mean Square) learning is more general
than the previous perceptron learning rule. The
concept is to minimize the total error, as
measured over all P training examples:
E = (1/2) * Sum_p (T_p - O_p)^2
where O is the raw output, as calculated by
O = Sum_i W_i * I_i
E.g. if we have two patterns and T1 = 1, O1 = 0.8,
T2 = 0, O2 = 0.5, then
E = (0.5)[(1 - 0.8)^2 + (0 - 0.5)^2] = 0.145
We want to minimize the LMS error with the update
W(new) = W(old) - C * dE/dW
where C is the learning rate.
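
A short Python sketch of this error measure and one gradient step, assuming a linear unit O = Sum_i W_i * I_i; the input data and learning rate C are illustrative:

```python
import numpy as np

def lms_error(T, O):
    """E = 1/2 * sum over patterns p of (T_p - O_p)^2."""
    return 0.5 * np.sum((T - O) ** 2)

# The slide's two-pattern example: 0.5*[(1-0.8)^2 + (0-0.5)^2] = 0.145
T = np.array([1.0, 0.0])
O = np.array([0.8, 0.5])
print(lms_error(T, O))                     # 0.145

# One gradient-descent step for a linear unit O = I . W (illustrative data).
C = 0.1                                    # learning rate
I = np.array([[2.0, 1.0], [1.0, 3.0]])     # one row of inputs per pattern
W = np.array([0.5, 0.3])
grad = -(T - I @ W) @ I                    # dE/dW for the linear unit
W = W - C * grad                           # W(new) = W(old) - C * dE/dW
print(W)
```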
11
LMS Gradient Descent
  • Using LMS, we want to minimize the error. We can
    do this by finding the direction on the error
    surface that most rapidly reduces the error;
    that is, by taking the derivative to find the
    slope of the error function. The approach is
    called gradient descent (similar to hill
    climbing, but downhill); a small numeric
    illustration follows below.
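
As an illustration (a sketch, not from the slides), repeatedly stepping against the numerically estimated slope of a one-weight error function walks down to its minimum:

```python
def error(w):                  # a one-weight error surface with minimum at w = 2
    return (w - 2.0) ** 2

w, C, eps = 0.0, 0.1, 1e-6
for _ in range(100):
    slope = (error(w + eps) - error(w - eps)) / (2 * eps)  # numerical derivative
    w -= C * slope             # step downhill along the error surface
print(w)                       # approaches 2.0, the minimum
```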

12
Activation Function
  • To apply the LMS learning rule, also known as the
    delta rule, we need a differentiable activation
    function.

Old: hard-limiting threshold (step) function
New: sigmoid function, O = 1 / (1 + e^-sum)
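
A sketch of this activation and its derivative in Python, assuming the standard logistic sigmoid (consistent with the "sigmoidal function" named on the next slide):

```python
import numpy as np

def sigmoid(s):
    """Differentiable activation: O = 1 / (1 + e^-s)."""
    return 1.0 / (1.0 + np.exp(-s))

def sigmoid_deriv(s):
    """dO/ds = O * (1 - O), the factor the delta rule multiplies in."""
    o = sigmoid(s)
    return o * (1.0 - o)

print(sigmoid(0.3), sigmoid_deriv(0.3))
```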
13
LMS vs. Limiting Threshold
  • With the new sigmoidal function that is
    differentiable, we can apply the delta rule
    toward learning.
  • Perceptron Method
  • Forces the output to 0 or 1, while LMS uses the
    net output
  • Guaranteed to separate the classes, if they are
    linearly separable
  • Gradient Descent Method
  • May oscillate and not converge
  • May converge to the wrong answer
  • Will converge to some minimum even if the classes
    are not linearly separable, unlike the earlier
    perceptron training method

14
Backpropagation Networks
  • Attributed to Rumelhart and McClelland, mid-80s
  • To bypass the linear classification problem, we
    can construct multilayer networks. Typically we
    have fully connected, feedforward networks.

[Figure: a fully connected feedforward network. Input
layer I1, I2, I3; hidden layer H1, H2; output layer
O1, O2. Weights Wi,j connect inputs to hidden units,
and Wj,k connect hidden units to outputs; constant
inputs of 1 act as biases.]
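
A minimal NumPy forward pass for the pictured 3-2-2 network (the weights are random placeholders; the constant 1 bias inputs from the figure are appended to each layer's input):

```python
import numpy as np

rng = np.random.default_rng(0)
W_ij = rng.random((4, 2))         # 3 inputs + bias -> 2 hidden units
W_jk = rng.random((3, 2))         # 2 hidden units + bias -> 2 outputs

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def forward(I):
    I = np.append(I, 1.0)         # constant 1 bias input
    H = sigmoid(I @ W_ij)         # hidden layer activations
    O = sigmoid(np.append(H, 1.0) @ W_jk)  # output layer activations
    return H, O

H, O = forward(np.array([0.2, 0.7, 0.1]))
print(O)
```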
15
Backprop - Learning
Learning Procedure
  • Randomly assign weights (between 0-1)
  • Present inputs from training data, propagate to
    outputs
  • Compute outputs O, adjust weights according to
    the delta rule, backpropagating the errors. The
    weights will be nudged closer so that the network
    learns to give the desired output.
  • Repeat; stop when there are no errors, or enough
    epochs have completed

16
Backprop - Modifying Weights
We had computed the weight change
Delta W = C * delta * input
For the output unit k, f(sum) = O(k). For the
output units, the error term is
delta_k = O_k (1 - O_k) (T_k - O_k)
For the hidden units (skipping some math), this is
delta_j = H_j (1 - H_j) * Sum_k delta_k * W_j,k
[Figure: layers I -> H -> O linked by weights Wi,j
and Wj,k]
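
A sketch of one such update in Python (NumPy), continuing the 3-2-2 shapes from the earlier forward pass; the sigmoid units and learning rate C are assumptions consistent with the preceding slides:

```python
import numpy as np

def backprop_step(I, T, W_ij, W_jk, C=0.5):
    """One weight update; deltas follow the output/hidden formulas above."""
    I = np.append(I, 1.0)                          # bias input
    H = 1.0 / (1.0 + np.exp(-(I @ W_ij)))          # hidden activations
    Hb = np.append(H, 1.0)                         # bias unit for output layer
    O = 1.0 / (1.0 + np.exp(-(Hb @ W_jk)))         # outputs

    delta_k = O * (1 - O) * (T - O)                # output-unit error terms
    delta_j = H * (1 - H) * (W_jk[:-1] @ delta_k)  # hidden units (bias row dropped)

    W_jk += C * np.outer(Hb, delta_k)              # nudge hidden -> output weights
    W_ij += C * np.outer(I, delta_j)               # nudge input -> hidden weights
    return W_ij, W_jk

rng = np.random.default_rng(0)
W_ij, W_jk = backprop_step(np.array([0.2, 0.7, 0.1]), np.array([1.0, 0.0]),
                           rng.random((4, 2)), rng.random((3, 2)))
```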
17
Backprop
  • Very powerful - can learn any function, given
    enough hidden units!
  • Have the same problems of Generalization vs.
    Memorization. With too many units, we will tend
    to memorize the input and not generalize well.
    Some schemes exist to prune the neural network.
  • Networks require extensive training, many
    parameters to fiddle with. Can be extremely slow
    to train. May also fall into local minima.
  • Inherently parallel algorithm, ideal for
    multiprocessor hardware.
  • Despite the cons, a very powerful algorithm that
    has seen widespread successful deployment.

18
Why This Project?
  • Breast Cancer is the most common cancer and is
    the second leading cause of cancer deaths
  • Mammographic screening reduces the mortality of
    breast cancer
  • But, mammography has a low positive predictive
    value (PPV): only about 35% of positive readings
    are malignancies
  • The goal of Computer-Aided Diagnosis (CAD) is to
    provide a second reading, hence reducing the
    false-positive rate

19
Data Used in my Project
  • The dataset used is the Mammographic Image
    Analysis Society (MIAS) mini-MIAS database,
    containing Medio-Lateral Oblique (MLO) views of
    each breast for 161 patients, for a total of 322
    images.
  • Every image is
  • 1024 pixels x 1024 pixels, with 256 gray levels

20
Sample of Well-Defined/Circumscribed Masses
Mammogram
21
Sample of a Normal Mammogram
22
Sample of an Ill-Defined Masses Mammogram
23
Sample of an Asymmetric Mammogram
24
Sample of an Architecturally Distorted Mammogram
25
Sample of a Spiculated Masses Mammogram
26
Sample of a Calcification Mammogram
27
Approach Followed
  • Normalize all images between 0 and 1
  • Normalize the features between 0 and 1
  • Train the network
  • Test on an image (Simulate the network)
  • Denormalize the classification values (a sketch
    of these steps follows below)
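
A Python sketch of the normalization and denormalization steps; the min-max scaling and the numeric feature encoding are assumptions for illustration (the original work appears to use the MATLAB Neural Network Toolbox, given "simulate" here and radbas on later slides):

```python
import numpy as np

def normalize(x, lo=None, hi=None):
    """Min-max scale to [0, 1]; also return the scale so it can be undone."""
    lo = x.min() if lo is None else lo
    hi = x.max() if hi is None else hi
    return (x - lo) / (hi - lo), (lo, hi)

def denormalize(y, scale):
    lo, hi = scale
    return y * (hi - lo) + lo

# Images: 8-bit gray levels mapped to [0, 1].
image = np.random.randint(0, 256, (1024, 1024)).astype(float)  # stand-in image
img01, _ = normalize(image, 0, 255)

# Features (tissue type, severity, abnormality class) encoded as numbers,
# then scaled; the encoding itself is an assumption for illustration.
features = np.array([[0.0, 1.0, 3.0], [2.0, 0.0, 6.0]])
feat01, scale = normalize(features)
# ... train the radial basis network on feat01, simulate it on a test image ...
outputs = feat01                          # stand-in for network outputs
labels = denormalize(outputs, scale)      # map back to class values
```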

28
Features Used to Train
  • Character of background tissue
  • Fatty, Fatty-Glandular, and Dense-Glandular
  • Severity of abnormality
  • Benign or Malignant
  • Class of abnormality present
  • Calcification, Well-Defined/Circumscribed Masses,
    Spiculated Masses, Other/Ill-Defined Masses,
    Architectural Distortion, Asymmetry, and Normal

29
Radial Basis Network Used
  • Radial basis networks may require more neurons
    than standard feed-forward backpropagation (FFBP)
    networks
  • BUT, they can be designed in a fraction of the
    time it takes to train an FFBP network
  • They work best when many training vectors are
    available

30
Radial Basis Network with R Inputs
31
radbas(n) = e^(-n^2)
a = radbas(n)
32
A radial basis network consists of 2 layers: a
hidden radial basis layer of S1 neurons and an
output linear layer of S2 neurons
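
A sketch of this two-layer structure in Python (NumPy): each hidden neuron applies radbas to the scaled distance between the input and its weight vector (its center), and a linear layer combines the results. The centers, spread b, and linear weights here are placeholders that training would determine:

```python
import numpy as np

def radbas(n):
    """Radial basis transfer function: a = e^(-n^2)."""
    return np.exp(-n ** 2)

def rbf_forward(p, centers, b, W2, b2):
    """Hidden radial basis layer (S1 neurons) + linear output layer (S2)."""
    n = b * np.linalg.norm(centers - p, axis=1)  # scaled distances ||w - p|| * b
    a1 = radbas(n)                               # S1 hidden activations
    return a1 @ W2 + b2                          # S2 linear outputs

R, S1, S2 = 2, 4, 3                    # inputs, hidden neurons, output neurons
rng = np.random.default_rng(0)
centers = rng.random((S1, R))          # one center (weight vector) per neuron
W2, b2 = rng.random((S1, S2)), np.zeros(S2)
print(rbf_forward(rng.random(R), centers, b=1.0, W2=W2, b2=b2))
```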
33
Results and Future Work
  • The network was able to correctly classify 55% of
    the mammograms
  • I will use more pre-processing, including
    sub-sampling, segmentation, and statistical
    features extracted from the images, as well as
    the coordinates of the center of each abnormality
    and the approximate radius of a circle enclosing
    it.
  • I will try different networks, such as the fuzzy
    ARTMAP network, self-organizing networks, and
    cellular networks, and compare their results in
    designing a good CAD system.