Title: Programming = assign proper weights
1 - Programming = assign proper weights
- Learning
[Figure: a unit i with inputs u_1, u_2, ..., u_n feeding in through weights W_i,1, W_i,2, ..., W_i,n]

Activation: O_i = f( Σ_{j=1..n} W_ij u_j )
f = transfer function (usually non-linear: threshold function, sigmoid, hyperbolic tangent, etc.)
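A minimal sketch of such a unit in Python; the weights, inputs, and choice of f below are illustrative values, not from the slide:

```python
import math

def threshold(h):               # hard threshold transfer function
    return 1.0 if h > 0 else 0.0

def sigmoid(h):                 # smooth sigmoid transfer function
    return 1.0 / (1.0 + math.exp(-h))

def unit_output(weights, inputs, f):
    h = sum(w * u for w, u in zip(weights, inputs))   # weighted sum of inputs
    return f(h)                                       # O_i = f(sum_j W_ij u_j)

print(unit_output([0.5, -0.3, 0.8], [1.0, 2.0, 0.5], sigmoid))
print(unit_output([0.5, -0.3, 0.8], [1.0, 2.0, 0.5], math.tanh))
```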
2 - Perceptron
Rosenblatt, 1962; Minsky, 1969
[Figure: perceptron with inputs x_1, ..., x_n connected through weights W_1, ..., W_n to the output O_i]

Activation:
O_i = 1, if Σ W_i x_i > θ
O_i = 0, if Σ W_i x_i < θ

The threshold can be absorbed as a bias: add a constant input x_0 = 1 with weight W_0 = -θ.

[Figure: the same perceptron with the extra constant input 1 and weight W_0 = -θ]

Activation:
O_i = 1, if Σ W_i x_i > 0
O_i = 0, if Σ W_i x_i < 0
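A minimal sketch of this bias trick (Python; the weights, inputs, and θ are made-up illustration values), checking that the two formulations fire identically:

```python
def fires(weights, xs, theta):
    # original form: compare the weighted sum against the threshold theta
    return 1 if sum(w * x for w, x in zip(weights, xs)) > theta else 0

def fires_with_bias(weights, xs, theta):
    w = [-theta] + weights          # W_0 = -theta
    x = [1.0] + xs                  # constant input x_0 = 1
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

w, x, theta = [0.4, 0.9], [1.0, 0.2], 0.5
assert fires(w, x, theta) == fires_with_bias(w, x, theta)
```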
3 - Algorithm for Perceptron Learning
1. Initialize (w_0, w_1, ..., w_n) randomly.
2. Iterate through the training set, collecting the examples misclassified by the current weights.
3. If all examples are correctly classified (or classified up to an acceptable threshold), then quit; else compute the sum S of the misclassified examples x:
   - S ← S + x, if the unit failed to fire when it should have
   - S ← S - x, if it fired when it shouldn't have
4. Modify the weights: w_{t+1} = w_t + k·S.
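A minimal sketch of this loop in Python, assuming a unit with the bias folded in as w_0 (constant first input 1); the AND training set and step size k are illustrative choices:

```python
import random

def classify(w, x):                     # x includes the constant bias input 1
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

data = [([1, 0, 0], 0), ([1, 0, 1], 0), ([1, 1, 0], 0), ([1, 1, 1], 1)]  # AND
w = [random.uniform(-1, 1) for _ in range(3)]   # (w0, w1, w2) random
k = 0.5

for _ in range(1000):
    S = [0.0, 0.0, 0.0]
    errors = 0
    for x, target in data:
        out = classify(w, x)
        if out == 0 and target == 1:            # failed to fire: S <- S + x
            S = [s + xi for s, xi in zip(S, x)]
            errors += 1
        elif out == 1 and target == 0:          # fired wrongly: S <- S - x
            S = [s - xi for s, xi in zip(S, x)]
            errors += 1
    if errors == 0:                             # all correctly classified: quit
        break
    w = [wi + k * si for wi, si in zip(w, S)]   # w_{t+1} = w_t + k * S

print(w, [classify(w, x) for x, _ in data])
```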
4 - Perceptron learning
Total input: in = w_0 + w_1 x_1 + w_2 x_2

[Figure: in the (x_1, x_2) plane, the line in = 0 is the decision surface]

Learning = locating the proper decision surface
5 - To find the decision surface, use gradient descent methods.

E(W_1, W_2) = error function = sum of the distances of the misclassified input vectors from the decision surface

[Figure: the error surface E(W_1, W_2) plotted over the weight plane (W_1, W_2)]
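A sketch of the descent itself in Python; the quadratic E below is an illustrative stand-in for the slide's sum-of-distances error, which would be computed from the misclassified vectors:

```python
def E(w1, w2):
    # illustrative bowl-shaped error surface, minimum at (2, -1)
    return (w1 - 2.0) ** 2 + (w2 + 1.0) ** 2

w1, w2, eta, eps = 0.0, 0.0, 0.1, 1e-5
for _ in range(200):
    # numerical partial derivatives of E
    dw1 = (E(w1 + eps, w2) - E(w1 - eps, w2)) / (2 * eps)
    dw2 = (E(w1, w2 + eps) - E(w1, w2 - eps)) / (2 * eps)
    w1 -= eta * dw1                             # step downhill
    w2 -= eta * dw2

print(w1, w2)   # approaches the minimum (2, -1)
```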
6 - Cannot do the XOR problem.
Can do it with a multi-layer network:

[Figure: two-layer network for XOR: inputs x_1 and x_2 feed a hidden unit (bias -1.5, weights 1.0 from each input) and the output unit (bias -0.5, weights 1.0 from each input, weight -9.0 from the hidden unit)]

Perceptron training doesn't work for it (Minsky-Papert).
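To see that the figure's weights do compute XOR, here is a quick check in Python; how the surviving numbers map onto the wires is my reading of the figure residue:

```python
def step(h):
    return 1 if h > 0 else 0

def xor_net(x1, x2):
    hidden = step(1.0 * x1 + 1.0 * x2 - 1.5)              # fires only for (1, 1)
    return step(1.0 * x1 + 1.0 * x2 - 9.0 * hidden - 0.5) # OR, vetoed by AND

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))   # prints 0, 1, 1, 0: XOR
```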
7 - Back Propagation

Output = 1 / (1 + e^(-Σ_j w_ij x_j)), a sigmoid ranging between 0 and 1.

- Can get stuck in local minima.
- Slow speed of learning.

(The Boltzmann machine uses simulated annealing for search and doesn't get stuck.)
8 - [Figure: three-layer network: inputs z_k (binary or continuous) feed hidden units v_j through weights w_jk; the hidden units feed output units o_i through weights W_ij]
Output of v_j: V_j^m = g(h_j^m) = g( Σ_k w_jk z_k^m )

Output of o_i: O_i^m = g(h_i^m) = g( Σ_j W_ij V_j^m ) = g( Σ_j W_ij g( Σ_k w_jk z_k^m ) )
Error function: E(w) = ½ Σ_{i,m} [ ζ_i^m - g( Σ_j W_ij g( Σ_k w_jk z_k^m ) ) ]², where ζ_i^m is the target output for pattern m.

E is continuous and differentiable.

g(h) = 1 / (1 + e^(-2h)) or tanh(h)
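A runnable sketch of this forward pass and error in Python; the layer sizes, random weights, and the single pattern (z^m, ζ^m) are illustrative, and g is the sigmoid above:

```python
import math, random

def g(h):
    return 1.0 / (1.0 + math.exp(-2.0 * h))

K, J, I = 4, 3, 2                                   # input, hidden, output sizes
w = [[random.uniform(-1, 1) for _ in range(K)] for _ in range(J)]   # w_jk
W = [[random.uniform(-1, 1) for _ in range(J)] for _ in range(I)]   # W_ij

def forward(z):
    V = [g(sum(w[j][k] * z[k] for k in range(K))) for j in range(J)]  # hidden
    O = [g(sum(W[i][j] * V[j] for j in range(J))) for i in range(I)]  # output
    return V, O

z, target = [1, 0, 1, 1], [1, 0]                    # one pattern (z^m, zeta^m)
V, O = forward(z)
E = 0.5 * sum((t - o) ** 2 for t, o in zip(target, O))   # E(w) for this pattern
print(O, E)
```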
9 - To get the weights, use gradient descent:

ΔW_ij = -η ∂E/∂W_ij
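Continuing the sketch above, the standard back-propagation deltas give this update; the chain-rule derivation is not spelled out on the slide, and note that g'(h) = 2 g(h) (1 - g(h)) for g(h) = 1 / (1 + e^(-2h)):

```python
eta = 0.5
for _ in range(1000):
    V, O = forward(z)
    # output-layer deltas: delta_i = g'(h_i) * (zeta_i - O_i)
    dO = [2 * o * (1 - o) * (t - o) for o, t in zip(O, target)]
    # hidden-layer deltas, back-propagated through the W_ij weights
    dV = [2 * V[j] * (1 - V[j]) * sum(dO[i] * W[i][j] for i in range(I))
          for j in range(J)]
    for i in range(I):                  # Delta W_ij = eta * delta_i * V_j
        for j in range(J):
            W[i][j] += eta * dO[i] * V[j]
    for j in range(J):                  # Delta w_jk = eta * delta_j * z_k
        for k in range(K):
            w[j][k] += eta * dV[j] * z[k]

print(forward(z)[1])   # outputs move toward the targets [1, 0]
```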
10 - Judd, 1988
- Teaching nets is NP-complete in the number of nodes.
- Possible solution:
- Use probabilistic algorithms to train them (like GAs); a minimal sketch follows.
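The sketch below illustrates the idea with the simplest probabilistic trainer, mutation plus selection (in effect a GA with population one); the OR task, mutation scale, and iteration count are made-up choices:

```python
import random

def fitness(w, data):
    # count correctly classified examples under a threshold unit
    return sum(1 for x, t in data
               if (1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0) == t)

data = [([1, 0, 0], 0), ([1, 0, 1], 1), ([1, 1, 0], 1), ([1, 1, 1], 1)]  # OR
best = [random.uniform(-1, 1) for _ in range(3)]

for _ in range(500):
    child = [wi + random.gauss(0, 0.3) for wi in best]  # mutate the weights
    if fitness(child, data) >= fitness(best, data):     # keep if no worse
        best = child

print(best, fitness(best, data))
```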
11 - Example back propagation network
- Input vector: 19 bits correspond to a person
- 1 0 1 0 1 0 1 ... etc.
- Output vector: 5 bits
- 0 1 0 0 0
- Training and testing data vectors:
- 24 bits each
- 80 training
- 20 testing

Feature labels (from the figure): Mets fan, Likes lemonade, From Chicago, From NY, Cubs fan, Democrat, Republican, NY Jets fan, Likes tennis, Bears fan, NY Yankees fan, White Sox fan.
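A sketch of the bit-vector encoding in Python; the slide does not give the order of the 19 input bits, so the feature list and layout below are assumptions for illustration only:

```python
# Assumed feature order; the real slide's bit assignment is not recoverable.
FEATURES = ["Mets fan", "Likes lemonade", "From Chicago", "From NY",
            "Cubs fan", "Democrat", "Republican", "NY Jets fan",
            "Likes tennis", "Bears fan", "NY Yankees fan", "White Sox fan"]

def encode(person):
    # 1 if the person has the feature, 0 otherwise
    return [1 if f in person else 0 for f in FEATURES]

print(encode({"From Chicago", "Cubs fan", "Democrat"}))
```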
12 - Possible rules from data
- People from Chicago are never fans of NY teams.
- 50% of Cubs fans are White Sox fans.
- All Mets fans are Jets fans.
- Can train a BP-network to learn such patterns.
- After training, in testing, given an input
- 1 0 1 0 0 0 0 0
- we get the output 0.44 0.88 0.23 0.03 0.02,
- i.e., there is a small chance that the person likes the Sox, a high probability that he likes the Bears, and no chance that he likes the Mets, the Jets, or tennis.
13 - Input: a Democrat Cubs fan
- Output: 0.11 0.88 0.14 0.01 0.02
- Input: a Republican Cubs fan
- Output: 0.77 0.92 0.13 0.05 0.02
- So, Republican Cubs fans also like the White Sox while Democrat Cubs fans do not.
- Input: a Chicagoan who doesn't like the Cubs
- Output: 0.17 0.26 0.79 0.05 0.04
- He likes tennis.
- (Fuzzy rules)