Title: ECE 173 Assignment
ECE 173 Assignment 3: 1-out-of-m Coding
- Joshua Wortman
- Mathematics/Psychology
- Due April 29, 2004
Data Source
- Data represent four classes of inputs, appearing as overlapping clusters in 2-dimensional space.
- Data are 100,000 examples of (x1, x2) pairs, each associated with one of four classes. The clusters are not linearly separable.
- All clusters are nearly equally represented (n ≈ 25,000 each).
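A data set of the kind described above can be sketched as four overlapping 2-D clusters of 25,000 points each. The Gaussian cluster shape, the centers, and the spread below are assumptions for illustration only; the assignment does not specify how the clusters were generated.

```python
import numpy as np

# Hypothetical generator for data like the assignment's: four
# overlapping clusters of (x1, x2) pairs, ~25,000 points per class.
# Centers and the 0.6 spread are assumptions, chosen so neighboring
# clusters overlap and are not linearly separable.
rng = np.random.default_rng(0)
centers = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
X = np.vstack([rng.normal(c, 0.6, size=(25000, 2)) for c in centers])
labels = np.repeat(np.arange(4), 25000)   # class index 0..3 per example
```

This yields the 100,000-example data vector the slides refer to, with one integer class label per (x1, x2) pair.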
Objective: 1-out-of-m MLP Coding
- Input space is 2-dimensional.
- The network will be trained with each of M = 4, 8, 16, 24 hidden neurons.
- The network has 4 outputs, giving 1 (one) for the correct output and 0 for the others.
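The 1-out-of-m target coding described above can be sketched as follows; the function name `one_hot` is illustrative, not from the assignment.

```python
import numpy as np

# 1-out-of-m coding: each class label 0..3 becomes a 4-element target
# vector with 1 in the correct-class position and 0 elsewhere.
def one_hot(labels, num_classes=4):
    targets = np.zeros((len(labels), num_classes))
    targets[np.arange(len(labels)), labels] = 1.0
    return targets

targets = one_hot([0, 2, 3])   # 3 x 4 array, one row per example
```

Each row then serves as the 4-element desired-output vector for one training example.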
Weight Updates
- v_pq^new = v_pq^old + 2α(y_p − ŷ_p)z_q
- updates the weights for outputs from the hidden-neuron layer
- u_qr^new = u_qr^old + 2α(1 − z_q²)(Σ_i (y_i − ŷ_i)v_iq)x_r
- updates weights from inputs to the hidden layer; the update of the qth weight is weighted relative to its contribution to the total error
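The two update rules above can be sketched as one online training step. This assumes tanh hidden units (which is where the (1 − z_q²) derivative term comes from) and linear output units; the slides do not state the activation functions, so treat this as a plausible reconstruction rather than the author's exact code.

```python
import numpy as np

def backprop_step(x, y_target, U, V, alpha):
    """One online update following the slide's rules (sketch: tanh
    hidden units, linear outputs). U maps inputs->hidden, V hidden->outputs."""
    z = np.tanh(U @ x)                        # hidden activations z_q
    y_hat = V @ z                             # network outputs
    delta = y_target - y_hat                  # output errors (y_p - y_hat_p)
    hidden_err = (1 - z**2) * (V.T @ delta)   # error backpropagated to hidden layer
    V += 2 * alpha * np.outer(delta, z)       # v_pq += 2*alpha*(y_p - y_hat_p)*z_q
    U += 2 * alpha * np.outer(hidden_err, x)  # u_qr += 2*alpha*(1 - z_q^2)*(sum_i delta_i*v_iq)*x_r
    return U, V
```

Note that the hidden-layer error is computed before V is modified, so both updates use the old output weights, as in standard backpropagation.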
Procedure
- TrSet = 40,000, TrTstSet = 20,000, ValSet = 40,000.
- Output layer is connected to all hidden-layer neurons.
- Weights are updated using a variable learning rate α near 10⁻⁶. After each epoch, error on the training test set is measured.
- After 50 consecutive error decreases, α grows to 1.5α.
- After 2 consecutive error increases, α shrinks to 0.7α.
- (Note: 1.5 × 0.7 = 1.05.)
- The network is trained for each M to find the MSE achieved.
- The best MSE-to-accuracy-to-time-cost result is selected.
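The variable-learning-rate rule above can be sketched as a small stateful scheduler. The grow/shrink factors and the 50/2 counters come from the slides; the reset behavior of the counters is an assumption about the intended logic.

```python
class AlphaScheduler:
    """Adaptive learning rate per the slides: grow alpha by 1.5x after
    50 consecutive error decreases, shrink by 0.7x after 2 consecutive
    error increases. Counter resets are assumed, not stated."""

    def __init__(self, alpha=1e-6):
        self.alpha = alpha
        self.decreases = 0
        self.increases = 0

    def step(self, prev_err, cur_err):
        if cur_err < prev_err:
            self.decreases += 1
            self.increases = 0
            if self.decreases >= 50:
                self.alpha *= 1.5     # reward a long run of improvement
                self.decreases = 0
        else:
            self.increases += 1
            self.decreases = 0
            if self.increases >= 2:
                self.alpha *= 0.7     # back off when error keeps rising
                self.increases = 0
        return self.alpha
```

Because 1.5 × 0.7 = 1.05, one grow event followed by one shrink event leaves α slightly larger than before, giving the schedule a mild upward drift when training is mostly improving.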
M = 4
- min MSE = 0.5164 at 900 epochs
- Accuracy: TrSet 56.8%, TstSet 56.9%
- α ranged between 0.000001 and 0.00015
M = 8
- min MSE = 0.420 at 1000 epochs
- Accuracy: TrSet 73.3%, TstSet 73.1%
- α ranged between 0.000003 and 0.000008
M = 16
- min MSE = 0.324 at 1200 epochs
- Accuracy: TrSet 82.9%, TstSet 82.8%
- α ranged between 0.000002 and 0.000005
M = 24
- min MSE = 0.2652 at 1500 epochs
- Accuracy: TrSet 85.1%, TstSet 85.1%
- α ranged between 0.0000018 and 0.000005
Comparing Error-Plot Curves
- M = 24 learns quickest before flattening.
- Choose M = 24 as the optimal condition.
- Applying the validation set yields 84.9% correct classifications.
Test Set vs. Validation Set
- Accuracy using test set: 85.12%
- Accuracy using validation set: 84.87%
Example Network Output
The Y1 output values for the first 1000 examples in the original data vector are shown in blue. Those values which are actually members of class 1 are circled in red. The image shows that output Y1 gives class 1 inputs substantial preference.
Conclusions
- It is reasonable that more training time would lead to higher accuracy.
- From the learning-curve comparison graph, it seems that more neurons may also increase accuracy and decrease error. This would have been tested experimentally if not for time constraints.