Title: Artificial Neural Network
1 Artificial Neural Network
- 1 Brief Introduction
- 2 Backpropagation Algorithm
- 3 A Simple Illustration
2 Chapter 1 Brief Introduction
- 1.2 Review of Decision Trees
- The learning process aims to reduce the error, which can be understood as the difference between the target values and the output values of the learned structure.
- The ID3 algorithm can be applied only to discrete values.
- An Artificial Neural Network (ANN) can approximate arbitrary functions.
3 - 1.3 Basic Structure
- This example of ANN learning is Pomerleau's (1993) system ALVINN, which uses a learned ANN to steer an autonomous vehicle driving at normal speeds. The input to the ANN is a 30x32 grid of pixel intensities obtained from a forward-facing camera mounted on the vehicle. The output is the direction in which the vehicle is steered.
- As can be seen, four units receive inputs directly from all of the 30x32 pixels from the camera in the vehicle. These are called hidden units because their outputs are available only to the following units in the network, not as part of the global network output.
5 - 1.4 Ability
- Instances are represented by many attribute-value pairs. The target function to be learned is defined over instances that can be described by a vector of predefined features, such as the pixel values in the ALVINN example.
- The training examples may contain errors. As the following sections show, ANN learning methods are quite robust to noise in the training data.
- Long training times are acceptable. Compared to decision tree learning, the network training algorithm requires a longer training time, depending on factors such as the number of weights in the network.
6 Chapter 2 Backpropagation Algorithm
- 2.1 Sigmoid
- Like the perceptron, the sigmoid unit first computes a linear combination of its inputs.
- Then the sigmoid unit computes its output with the following function, reconstructed below.
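The equation images did not survive this transcript. A standard reconstruction, assuming the usual notation in which $w_i$ are the unit's weights and $x_i$ its inputs, is:

$$net = \sum_i w_i x_i \qquad (1)$$

$$o = \sigma(net) = \frac{1}{1 + e^{-net}} \qquad (2)$$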
7 - This equation (2) is often referred to as the squashing function, since it maps a very large input domain onto a small range of outputs.
- This sigmoid function has a useful property: its derivative is easily expressed in terms of its output. As the following description of backpropagation shows, the algorithm makes use of this derivative.
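The derivative itself is not shown in the transcript; for the sigmoid defined in equation (2) it is the standard identity

$$\frac{d\,\sigma(net)}{d\,net} = \sigma(net)\,\big(1 - \sigma(net)\big) = o\,(1 - o),$$

which lets the algorithm compute gradients directly from unit outputs.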
8 - 2.2 Function
- The sigmoid is only one unit in the network; now we take a look at the whole function the neural network computes, illustrated in figure 2.2. If we consider an example (x, t), where x is called the input attribute and t the target attribute, then the network maps x forward through its hidden and output units to an output o(x), which learning should drive toward t (a sketch of this forward pass follows below).
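Figure 2.2 is missing from the transcript. As a stand-in, here is a minimal Python sketch of the forward pass such a layered sigmoid network computes; the layer sizes and the weight initialization are illustrative assumptions, not values from the original slides.

    import numpy as np

    def sigmoid(net):
        """Equation (2): the squashing function."""
        return 1.0 / (1.0 + np.exp(-net))

    def forward(x, W_hidden, W_out):
        """Compute the whole network function o(x) for one example x.

        Each row of a weight matrix holds one unit's weights, so a
        layer's linear combinations (equation 1) are just W @ x.
        """
        h = sigmoid(W_hidden @ x)   # hidden-unit outputs
        return sigmoid(W_out @ h)   # network outputs

    # Illustrative shapes only: 4 inputs, 3 hidden units, 2 outputs.
    rng = np.random.default_rng(0)
    W_hidden = rng.uniform(-0.05, 0.05, size=(3, 4))
    W_out = rng.uniform(-0.05, 0.05, size=(2, 3))
    x = np.array([1.0, 0.0, 0.5, 0.2])
    print(forward(x, W_hidden, W_out))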
10 - 2.3 Squared Error
- As mentioned above, the whole learning process serves to reduce the error; but how can the error be described? Generally the squared-error function is used, reconstructed below as equation (3).
- Notice that this function sums the error over all of the network's output units and over the whole set of training examples.
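Equation (3) itself is missing from the transcript; the standard squared-error definition matching this description, summing over the training set D and the output units, is

$$E(\vec{w}) = \frac{1}{2} \sum_{d \in D} \sum_{k \in outputs} (t_{kd} - o_{kd})^2 \qquad (3)$$

where $t_{kd}$ and $o_{kd}$ are the target and actual output values of output unit k for training example d.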
12 - Then the weight vector can be updated by

$$\vec{w} \leftarrow \vec{w} - \eta \, \nabla E(\vec{w})$$

- where $\nabla E(\vec{w})$ is the gradient of E and $\eta$ the learning rate, so each individual weight $w_k$ can be updated by

$$\Delta w_k = -\eta \, \frac{\partial E}{\partial w_k}$$
13 - But in practice, because equation (3) sums the error over the whole set of training data, the algorithm needs more computation time with this function and can easily be caught in a local minimum. One therefore constructs a new function, the stochastic squared error, reconstructed below.
- As can be seen, this function computes the error for a single example only. The gradient of $E_d(\vec{w})$ is easily derived.
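The definition did not survive the transcript; the usual per-example (stochastic) squared error, written in the same notation as equation (3), is

$$E_d(\vec{w}) = \frac{1}{2} \sum_{k \in outputs} (t_k - o_k)^2$$

and the weights are then updated after every single example d by

$$\Delta w_k = -\eta \, \frac{\partial E_d}{\partial w_k}.$$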
14 - 2.4 Backpropagation Algorithm
- The learning problem faced by backpropagation is to search a large hypothesis space defined by all possible weight values for all the units in the network. An outline of the algorithm follows below.
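The algorithm diagram is missing from the transcript. Below is a minimal Python sketch of the stochastic-gradient-descent version of backpropagation for one hidden layer, consistent with the error terms described on the next slide; the learning rate and the omission of bias weights and momentum are simplifying assumptions, not details from the original slides.

    import numpy as np

    def sigmoid(net):
        return 1.0 / (1.0 + np.exp(-net))

    def backprop_step(x, t, W_hidden, W_out, eta=0.05):
        """One stochastic update: propagate x forward, then errors backward.

        W_hidden has shape (n_hidden, n_in), W_out has (n_out, n_hidden).
        Returns the updated weight matrices.
        """
        # Forward pass through both sigmoid layers.
        h = sigmoid(W_hidden @ x)              # hidden outputs o_h
        o = sigmoid(W_out @ h)                 # network outputs o_k

        # Error term of each output unit: delta_k = o_k(1-o_k)(t_k-o_k).
        delta_out = o * (1.0 - o) * (t - o)

        # Error term of each hidden unit:
        # delta_h = o_h(1-o_h) * sum_k w_kh * delta_k.
        delta_hidden = h * (1.0 - h) * (W_out.T @ delta_out)

        # Weight updates: w_ji <- w_ji + eta * delta_j * x_ji.
        W_out = W_out + eta * np.outer(delta_out, h)
        W_hidden = W_hidden + eta * np.outer(delta_hidden, x)
        return W_hidden, W_out

Iterating this step over all training examples until the error is acceptably small is the whole algorithm.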
16 - Notice that the error term for hidden unit h is calculated by summing the error terms $\delta_k$ for each output unit influenced by unit h, weighting each $\delta_k$ by $w_{kh}$, the weight from hidden unit h to output unit k. This weight characterizes the degree to which hidden unit h is responsible for the error in output unit k.
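Written out (a reconstruction, since the formula image is absent from the transcript), this is

$$\delta_h = o_h (1 - o_h) \sum_{k \in outputs} w_{kh} \, \delta_k.$$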
17 Chapter 3 A Simple Illustration
Now we work through an example to gain a more intuitive understanding. How does an ANN learn the simplest of functions, the identity function id? We construct the network shown in the figure: eight network input units are connected to three hidden units, which are in turn connected to eight output units. Because of this structure, the three hidden units are forced to represent the eight input values in some way that captures their relevant features, so that this hidden-layer representation can be used by the output units to compute the correct target values.
18 - This 8 x 3 x 8 network was trained to learn the identity function. After 5,000 training iterations, the three hidden-unit values encode the eight distinct inputs using the encoding shown in the table. Notice that if the encoded values are rounded to zero or one, the result is the standard binary encoding for eight distinct values (a reproduction sketch follows below).
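The encoding table itself did not survive the transcript. As a self-contained Python sketch of the experiment (the learning rate and weight initialization are assumptions, and bias weights are omitted for brevity, so the exact hidden values may differ from the original run):

    import numpy as np

    def sigmoid(net):
        return 1.0 / (1.0 + np.exp(-net))

    rng = np.random.default_rng(0)
    W1 = rng.uniform(-0.1, 0.1, size=(3, 8))    # input -> hidden
    W2 = rng.uniform(-0.1, 0.1, size=(8, 3))    # hidden -> output

    # Eight training examples: one-hot vectors whose target is the
    # input itself (the identity function).
    X = np.eye(8)

    eta = 0.3
    for _ in range(5000):                       # 5,000 passes, as on the slide
        for x in X:
            h = sigmoid(W1 @ x)
            o = sigmoid(W2 @ h)
            d_out = o * (1 - o) * (x - o)           # output error terms
            d_hid = h * (1 - h) * (W2.T @ d_out)    # hidden error terms
            W2 += eta * np.outer(d_out, h)
            W1 += eta * np.outer(d_hid, x)

    # One row per input: its three hidden-unit values. Rounded to 0/1,
    # they should form eight distinct binary codes.
    print(np.round(sigmoid(X @ W1.T), 2))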