Transcript and Presenter's Notes

Title: Programming = assign proper weights


1
  • Programming = assign proper weights
  • Learning

[Figure: a single unit u_i receives inputs u_1, u_2, ..., u_n through weights W_{i,1}, W_{i,2}, ..., W_{i,n}]

Activation: O_i = f( \sum_{j=1}^{n} W_{ij} u_j )

f = transfer function (usually non-linear: threshold function, sigmoid, hyperbolic tangent, etc.)
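A minimal sketch of this activation rule in Python (NumPy, the function names, and the example values are my assumptions, not from the slides):

```python
import numpy as np

def activation(u, w, f):
    """O_i = f( sum_j W_ij * u_j ) for a single unit i."""
    return f(np.dot(w, u))

# Transfer functions named on the slide.
threshold = lambda h: 1.0 if h > 0 else 0.0
sigmoid   = lambda h: 1.0 / (1.0 + np.exp(-h))

u = np.array([0.5, -1.0, 2.0])      # inputs u_1..u_n (made-up values)
w = np.array([0.8,  0.2, -0.5])     # weights W_i1..W_in (made-up values)

print(activation(u, w, threshold))  # 0.0   (net input is -0.8)
print(activation(u, w, sigmoid))    # ~0.31
print(activation(u, w, np.tanh))    # ~-0.66
```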
2
Perceptron
(Rosenblatt, 1962; Minsky, 1969)
[Figure: perceptron with inputs x_1, ..., x_n, weights W_1, ..., W_n, output O_i]

Activation: O_i = 1 if \sum_i W_i x_i > \theta; O_i = 0 if \sum_i W_i x_i < \theta

Folding the threshold into a bias weight W_0 = -\theta on a constant input x_0 = 1 gives an equivalent unit:

Activation: O_i = 1 if \sum_i W_i x_i > 0; O_i = 0 if \sum_i W_i x_i < 0
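A small sketch (Python assumed) checking that the two formulations above agree once the threshold \theta is folded into the bias weight:

```python
import numpy as np

def fire_threshold(x, w, theta):
    # O = 1 if sum_i W_i x_i > theta, else 0
    return 1 if np.dot(w, x) > theta else 0

def fire_bias(x, w, theta):
    # Same unit with W_0 = -theta attached to a constant input x_0 = 1
    return 1 if np.dot(np.r_[-theta, w], np.r_[1.0, x]) > 0 else 0

x, w, theta = np.array([1.0, 0.0]), np.array([0.7, 0.7]), 0.5  # made-up values
assert fire_threshold(x, w, theta) == fire_bias(x, w, theta) == 1
```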
3
Algorithm for Perceptron Learning
  • 1. Initialize (w_0, w_1, ..., w_n) randomly.
  • 2. Iterate through the training set, collecting the examples that are misclassified by the current weights.
  • 3. If all examples are correctly classified (or classified up to an acceptable threshold), then quit;
  • else compute the sum S of the misclassified examples x:
  • S = S + x, if the unit failed to fire when it should have;
  • S = S - x, if it fired when it shouldn't have.
  • 4. Modify the weights: w_{t+1} = w_t + k S.
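A runnable sketch of steps 1-4 (Python/NumPy assumed; the step-size name k follows the slide, everything else is illustrative):

```python
import numpy as np

def train_perceptron(X, z, k=0.1, max_epochs=1000, seed=0):
    """Batch perceptron learning as in steps 1-4 above.
    X: examples with a leading constant 1 (the bias input x_0);
    z: desired 0/1 outputs; k: step size in w_{t+1} = w_t + k*S."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-0.5, 0.5, X.shape[1])    # 1. random (w_0, ..., w_n)
    for _ in range(max_epochs):
        fired = (X @ w > 0).astype(int)       # 2. classify with current weights
        if np.array_equal(fired, z):          # 3. all correct -> quit
            break
        S = np.zeros_like(w)
        for x, o, t in zip(X, fired, z):      # sum the misclassified examples
            if t == 1 and o == 0:             # failed to fire: S = S + x
                S += x
            elif t == 0 and o == 1:           # fired when it shouldn't: S = S - x
                S -= x
        w = w + k * S                         # 4. w_{t+1} = w_t + k*S
    return w

# Usage: learn logical AND (linearly separable); column 0 is the bias input.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
z = np.array([0, 0, 0, 1])
w = train_perceptron(X, z)
print((X @ w > 0).astype(int))   # [0 0 0 1]
```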

4
Perceptron learning
Total input: in = w_0 + w_1 x_1 + w_2 x_2

[Figure: in the (x_1, x_2) plane, the line in = 0 is the decision surface]

Learning = locating the proper decision surface
5
To find the decision surface, use gradient-descent methods.

E(W_1, W_2) = error function = sum of the distances of the misclassified input vectors from the decision surface

[Figure: the error surface E(W_1, W_2) plotted over the weight plane (W_1, W_2)]
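A sketch of gradient descent on such an error function (Python assumed; I use the standard perceptron criterion as E, which matches the "sum of distances" description up to the weight norm):

```python
import numpy as np

def descend(W, X, z, eta=0.05, steps=500):
    """Gradient descent on E(W) = sum, over misclassified inputs, of their
    (unnormalized) distance from the decision surface W.x = 0."""
    s = np.where(z == 1, 1.0, -1.0)          # desired side of the surface
    for _ in range(steps):
        margins = s * (X @ W)
        mis = margins <= 0                   # misclassified (or on the surface)
        if not mis.any():                    # E has reached its minimum, 0
            break
        # dE/dW = -sum over misclassified of s*x, so step downhill:
        W = W + eta * (s[mis, None] * X[mis]).sum(axis=0)
    return W

X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
z = np.array([0, 0, 0, 1])                   # logical AND again
W = descend(np.array([0.1, -0.2, 0.3]), X, z)
print((X @ W > 0).astype(int))               # [0 0 0 1]
```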
6
A single perceptron cannot do the XOR problem. A multi-layer net can:

[Figure: two-layer XOR net. A hidden unit computes AND(x_1, x_2) with weights 1.0, 1.0 and bias -1.5 (constant input 1); the output unit receives x_1 and x_2 with weights 1.0, 1.0 and the hidden unit with weight -9.0, with bias -0.5]

Perceptron training doesn't work for such nets (Minsky-Papert).
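A check of the multi-layer solution, using the weights as I read them off the figure (the -9.0 veto link and the -1.5/-0.5 biases; this reading is an assumption):

```python
def step(h):
    return 1 if h > 0 else 0

def xor_net(x1, x2):
    hidden = step(1.0*x1 + 1.0*x2 - 1.5)             # AND unit, bias -1.5
    return step(1.0*x1 + 1.0*x2 - 9.0*hidden - 0.5)  # output unit, bias -0.5

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", xor_net(x1, x2))
# 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0: the hidden AND unit
# vetoes the (1,1) case, which no single perceptron can do.
```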
7
Back Propagation
Output = 1 / (1 + e^{ -\sum_j w_ij x_j })

[Figure: sigmoid output rising from 0 to 1 as a function of the net input]

Can get stuck in local minima.
Slow speed of learning.
(The Boltzmann machine uses simulated annealing for its search and doesn't get stuck.)
8
[Figure: two-layer network. Inputs x_k (binary or continuous) feed hidden units v_j through weights w_jk; the hidden units feed output units o_i through weights W_ij]

Output of v_j:  V_j^m = g(h_j^m) = g( \sum_k w_jk x_k^m )

Output of o_i:  O_i^m = g(h_i^m) = g( \sum_j W_ij V_j^m ) = g( \sum_j W_ij g( \sum_k w_jk x_k^m ) )

Error function:  E(w) = (1/2) \sum_{i,m} [ z_i^m - g( \sum_j W_ij g( \sum_k w_jk x_k^m ) ) ]^2

E is continuous and differentiable. (z_i^m is the desired output.)

g(h) = 1 / (1 + e^{-2h})  or  tanh(h)
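A sketch of this forward pass and error term in Python (NumPy, the layer shapes, and the random values are assumptions):

```python
import numpy as np

g = np.tanh                       # transfer function; g(h) = 1/(1+e^(-2h)) also fits

def forward(x, w, W):
    """V_j = g(sum_k w_jk x_k),  O_i = g(sum_j W_ij V_j) for one pattern."""
    V = g(w @ x)                  # hidden-layer outputs
    O = g(W @ V)                  # output-layer outputs
    return V, O

rng = np.random.default_rng(0)
x = np.array([1.0, 0.0, 1.0])            # x_k: binary or continuous inputs
w = rng.normal(0.0, 0.5, (4, 3))         # w_jk: input -> hidden weights
W = rng.normal(0.0, 0.5, (2, 4))         # W_ij: hidden -> output weights
V, O = forward(x, w, W)

z = np.array([1.0, -1.0])                # z_i: desired outputs for this pattern
print(0.5 * np.sum((z - O) ** 2))        # this pattern's term of E(w)
```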
9
To get the weights:

\Delta W_ij = -\eta \, \partial E / \partial W_ij
            = \eta \sum_m [ z_i^m - O_i^m ] g'(h_i^m) V_j^m
            = \eta \sum_m \delta_i^m V_j^m,

where \delta_i^m = g'(h_i^m) [ z_i^m - O_i^m ]
10
\Delta w_jk = -\eta \, \partial E / \partial w_jk
            = -\eta \sum_m ( \partial E / \partial V_j^m )( \partial V_j^m / \partial w_jk )
            = \eta \sum_{m,i} [ z_i^m - O_i^m ] g'(h_i^m) W_ij g'(h_j^m) x_k^m
            = \eta \sum_{m,i} \delta_i^m W_ij g'(h_j^m) x_k^m
            = \eta \sum_m \delta_j^m x_k^m,

where \delta_j^m = g'(h_j^m) \sum_i \delta_i^m W_ij

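A sketch that turns the two update rules above into code (Python assumed; g = tanh, per-pattern updates, and the demo values are illustrative):

```python
import numpy as np

g  = np.tanh
dg = lambda h: 1.0 - np.tanh(h) ** 2        # g'(h) for g = tanh

def backprop_step(x, z, w, W, eta=0.1):
    """One per-pattern step of the rules above:
    Delta W_ij = eta * delta_i * V_j,  Delta w_jk = eta * delta_j * x_k."""
    h_hid = w @ x; V = g(h_hid)             # hidden layer
    h_out = W @ V; O = g(h_out)             # output layer
    d_out = dg(h_out) * (z - O)             # delta_i = g'(h_i)(z_i - O_i)
    d_hid = dg(h_hid) * (W.T @ d_out)       # delta_j = g'(h_j) sum_i delta_i W_ij
    return w + eta * np.outer(d_hid, x), W + eta * np.outer(d_out, V)

# Demo: drive the outputs for one pattern toward the desired z.
rng = np.random.default_rng(0)
x, z = np.array([1.0, 0.0, 1.0]), np.array([0.8, -0.3])
w, W = rng.normal(0.0, 0.5, (4, 3)), rng.normal(0.0, 0.5, (2, 4))
for _ in range(200):
    w, W = backprop_step(x, z, w, W)
print(g(W @ g(w @ x)))                      # approaches z = [0.8, -0.3]
```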
11
  • Judd, 1988:
  • Teaching nets is NP-complete in the number of nodes.
  • Possible solution:
  • Use probabilistic algorithms to train them (like GAs).

12
Example back-propagation network
  • Input vector: 19 bits corresponding to a person
  • 1 0 1 0 1 0 1 ... etc.
  • Output vector: 5 bits
  • 0 1 0 0 0
  • Training and testing data vectors:
  • 24 bits each
  • 80 training vectors
  • 20 testing vectors

[Figure: input-bit labels include From Chicago, From NY, Cubs fan, Mets fan, democrat, republican, Likes lemonade, Likes tennis, NY Jets fan, Bears fan, NY Yankees fan, White Sox fan]
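A sketch of the encoding this slide describes (the attribute order and the helper are hypothetical; the slides don't fix the bit layout):

```python
import numpy as np

ATTRS = ["From Chicago", "From NY", "Cubs fan", "Mets fan", "democrat",
         "republican", "Likes lemonade", "Likes tennis", "NY Jets fan",
         "Bears fan", "NY Yankees fan", "White Sox fan"]   # first 12 of 19 bits

def encode(person, n_bits=19):
    """Person -> 19-bit input vector (1 = has the attribute)."""
    v = np.zeros(n_bits)
    for a in person:
        v[ATTRS.index(a)] = 1.0
    return v

x = encode({"From Chicago", "Cubs fan", "democrat"})
print(x)   # pairs with a 5-bit target to form one 24-bit data vector
```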
13
  • Possible rules from the data:
  • People from Chicago are never fans of NY teams.
  • 50% of Cubs fans are White Sox fans.
  • All Mets fans are Jets fans.
  • We can train a BP network to learn such patterns.
  • After training and testing, if we give as input
  • 1 0 1 0 0 0 0 ... 0 (the characteristics of a person),
  • we get output 0.44 0.88 0.23 0.03 0.02,
  • i.e. there is a small chance that the person likes the Sox, a high probability that he likes the Bears, and no chance that he likes the Mets, the Jets, or tennis.

14
  • Input: a Democrat Cubs fan
  • Output: 0.11 0.88 0.14 0.01 0.02
  • ________________________________
  • Input: a Republican Cubs fan
  • Output: 0.77 0.92 0.13 0.05 0.02
  • So Republican Cubs fans also like the Sox, while Democrat Cubs fans do not.
  • __________________________________
  • Input: a Chicagoan who doesn't like the Cubs
  • Output: 0.17 0.26 0.79 0.05 0.04
  • He likes tennis.
  • (Fuzzy rules)