Applying Statistical Machine Learning to Retinal Electrophysiology (presentation transcript)

1
Applying Statistical Machine Learning to Retinal
Electrophysiology
  • Matt Boardman
  • January 2006
  • matt.boardman@dal.ca

2
Discussions
  • Axotomy ERG Data Sets
  • Classification using Support Vector Machines
    (SVM)
  • Assessing Waveform Significance
  • Probability Density Estimation
  • Confidence Measures

3
Axotomy ERG Data Sets (from F. Tremblay,
Retinal Electrophysiology)
  • Data Set A
  • 19 axotomy subjects, 19 control subjects (38 total)
  • time between control and axotomy recordings?
  • Multifocal ERG: 145 data points (mean of all locations)
  • 1000 Hz (?) sample rate
  • Data Set B
  • 6 axotomy subjects, 8 control subjects (14 total)
  • measurements approximately six weeks after axotomy
  • Multifocal ERG: 14,935 data points (103 locations x 145 ms)
  • Corneal and optic nerve readings (control subjects only)

4
Classification using Support Vector Machines
  • SVMs use statistical machine learning
  • Constrained optimization problem
  • Objective: find a hyperplane which maximizes the margin
  • Higher-dimensional mappings provide flexibility
  • Non-separable data: a cost parameter controls the tradeoff between outlier detection and generalization performance
  • Non-linear SVMs (Polynomial, Sigmoid, Gaussian kernels); see the sketch below
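
A minimal sketch of the classification step described on this slide, assuming scikit-learn rather than the LIBSVM/SVMlight tools referenced later in the deck; the data shapes, labels and parameter values are placeholders, not the presentation's actual settings.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.standard_normal((38, 145))   # placeholder: 38 subjects x 145 waveform samples
y = np.array([1] * 19 + [0] * 19)    # placeholder labels: 1 = axotomy, 0 = control

# C controls the tradeoff between outlier tolerance and generalization;
# gamma is the width parameter of the Gaussian (RBF) kernel.
clf = SVC(kernel="rbf", C=1.0, gamma=0.01)
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))
```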

5
Data Normalization
  • Balanced training data
  • Number of positive samples = number of negative samples
  • Data set A is already balanced
  • Keep data set B balanced through combination, i.e. 8C6 = 28 balanced subsets
  • Independently and identically distributed (iid) data
  • Independence: not true
  • e.g. the value of point x17 most likely depends on x16
  • Not identically distributed either
  • e.g. x26 is always positive (P1 wave), but x40 is always negative (N2 wave)
  • Approximate iid data by subtracting the mean from each dimension, then dividing each dimension by its maximum magnitude (see the sketch below)
  • results in zero mean for all dimensions, with all values between -1 and 1
  • No zero-setting necessary!
  • e.g. subtracting the mean tail value does not affect classification accuracy!
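
A minimal sketch of this normalization, assuming NumPy; the function name and the guard against constant dimensions are my additions.

```python
import numpy as np

def normalize(X):
    """Approximate iid data: subtract each dimension's mean, then divide by that
    dimension's maximum magnitude, giving zero mean and values in [-1, 1]."""
    X = X - X.mean(axis=0)
    max_mag = np.abs(X).max(axis=0)
    max_mag[max_mag == 0] = 1.0   # guard against constant (all-equal) dimensions
    return X / max_mag
```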

6
Parameter Selection for Classification
  • Selection of the best gamma (γ) and cost (c) values obtained by exhaustive search of natural-log space (see the sketch below)
  • try all parameter values on the grid, choose the best points (red circles)
  • accuracy-weighted centre of mass gives the optimal point (green circle)
  • Training / testing splits
  • 75% / 25%
  • Leave-one-out
  • Better searches
  • 3 strikes
  • Simulated annealing (?)
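
A rough sketch of the exhaustive log-space search and accuracy-weighted centre of mass, assuming scikit-learn for cross-validated accuracy; the grid range, number of folds and the tolerance used to select the "best" points are assumptions, and the "better searches" (3 strikes, simulated annealing) are not shown.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def select_parameters(X, y, exponents=np.arange(-8.0, 9.0, 2.0), tol=0.01):
    """Exhaustive search of (gamma, C) over a natural-log grid; the near-optimal
    grid points are combined by an accuracy-weighted centre of mass."""
    scores = []
    for lg in exponents:
        for lc in exponents:
            clf = SVC(kernel="rbf", gamma=np.exp(lg), C=np.exp(lc))
            acc = cross_val_score(clf, X, y, cv=5).mean()
            scores.append((acc, lg, lc))
    scores = np.array(scores)
    best = scores[scores[:, 0] >= scores[:, 0].max() - tol]   # "red circle" points
    w = best[:, 0] / best[:, 0].sum()
    lg_opt, lc_opt = (w[:, None] * best[:, 1:]).sum(axis=0)   # "green circle" point
    return np.exp(lg_opt), np.exp(lc_opt)
```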

7
Classification Results
  • Data set A (38 samples x 145 data points)
  • 94.7% accuracy
  • Data set B (14 samples x 145 data points)
  • 99.4% accuracy
  • Data set B (14 samples x 14,935 data points)
  • 90.8% accuracy

8
Classification Benchmarks
  • How does this method perform on industry-standard
    classification benchmark data sets?
  • Wisconsin Breast Cancer Database
  • O.L. Mangasarian, W.H. Wolberg, Cancer diagnosis via linear programming, SIAM News, 23(5):1-18, 1990.
  • Iris Plants Database
  • R.A. Fisher, The use of multiple measurements in taxonomic problems, Annals of Eugenics, 7(2):179-188, 1936.

9
Classification Benchmarks
  • Wisconsin: 96.9% (s = 0.18)
  • Iris (Class 1 or not): 100.0%
  • Iris (Class 2 or not): 96.9% (s = 0.55)
  • Iris (Class 3 or not): 97.1% (s = 0.77)
10
Assessing Waveform Significance
  • Which are the most important parts of the waveform, with respect to classification accuracy? (See the sketch after this list.)
  • Fisher Ratio
  • distance between means over the sum of variances (linear)
  • Pearson Correlation Coefficients
  • strength of association between variables (linear)
  • Kolmogorov-Smirnov
  • distance between cumulative distributions (non-linear)
  • Linear SVM
  • classification on one dimension only (linear)
  • Cross-Entropy
  • mutual information measure (non-linear)
  • SVM Sensitivity
  • Monte Carlo simulation using SVM (non-linear)
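
A sketch of three of the per-dimension measures listed above (Fisher ratio, Pearson correlation, Kolmogorov-Smirnov), assuming NumPy/SciPy; the Fisher ratio here uses the common squared-difference form, and the cross-entropy and SVM sensitivity measures are not shown.

```python
import numpy as np
from scipy.stats import ks_2samp, pearsonr

def waveform_significance(X, y):
    """Score each waveform sample (column of X) for class separability."""
    pos, neg = X[y == 1], X[y == 0]
    fisher, pearson, ks = [], [], []
    for j in range(X.shape[1]):
        a, b = pos[:, j], neg[:, j]
        # Fisher ratio: squared distance between class means over summed variances.
        fisher.append((a.mean() - b.mean()) ** 2 / (a.var() + b.var()))
        # Pearson correlation between this dimension and the class label.
        r, _ = pearsonr(X[:, j], y)
        pearson.append(abs(r))
        # Kolmogorov-Smirnov distance between the two class distributions.
        ks.append(ks_2samp(a, b).statistic)
    return np.array(fisher), np.array(pearson), np.array(ks)
```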

11
Comparison of All Measures (Dataset B)
12
Probability Density Estimation
  • Goal: define a measure of how sure the classifier is about its result
  • Density estimation is known to be a hard problem
  • Generally needs a large number of samples for accuracy
  • Small deviations in sample points have a magnified effect
  • How do we estimate a probability distribution? (A sketch follows below.)
  • Best-Fit Gaussian
  • Assume a Gaussian distribution, find the sigmoid that fits best
  • Kernel Smoothing
  • Part of MATLAB's Statistics Toolbox
  • SVM Density Estimation (RSDE method)
  • Special case of SVM Regression
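
A sketch of the first two estimators, assuming SciPy as a stand-in for the MATLAB tools named above; the RSDE method has no standard library routine and is not shown, and the sample data are placeholders.

```python
import numpy as np
from scipy.stats import norm, gaussian_kde

samples = np.random.default_rng(1).standard_normal(50)   # placeholder 1-D sample set

# Best-fit Gaussian: assume a normal distribution and fit its mean and width.
mu, sigma = norm.fit(samples)

# Kernel smoothing: a small Gaussian kernel centred on every sample point
# (rough Python analogue of ksdensity in MATLAB's Statistics Toolbox).
kde = gaussian_kde(samples)

grid = np.linspace(samples.min() - 1.0, samples.max() + 1.0, 200)
p_gaussian = norm.pdf(grid, mu, sigma)
p_smoothed = kde(grid)
```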

13
Comparison of Estimation Techniques
14
Confidence Measures
  • Support is the overall distribution of the sample
  • Denote p(x)
  • Density integrates to one: ∫ p(x) dx = 1
  • Confidence is defined as the posterior probability
  • Probability that sample x is of class C
  • Denote p(C|x)
  • Can we combine these measures somehow? (Bayes' rule, recalled below, relates them.)
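
As background (the slide does not spell this out), the support p(x) and the posterior p(C|x) are linked by Bayes' rule, which is one natural starting point for combining the two measures:

```latex
p(C \mid x) = \frac{p(x \mid C)\, p(C)}{p(x)},
\qquad
p(x) = \sum_{C} p(x \mid C)\, p(C),
\qquad
\int p(x)\, dx = 1 .
```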

15
Confidence Measures
16
Confidence Measures
17
Confidence Measures
18
References
  • SVM Tutorial (mathematical but practical)
  • C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 2(2):121-167, 1998.
  • SVM Density Estimation (RSDE algorithm)
  • M. Girolami, C. He, Probability Density Estimation from Optimally Condensed Data Samples, IEEE Trans. Pattern Analysis and Machine Intelligence, 25(10):1253-1264, 2003.
  • MATLAB versions
  • LIBSVM: http://www.csie.ntu.edu.tw/~cjlin/libsvm
  • SVMlight: http://svmlight.joachims.org/
  • An excellent online SVM demo (Java applet)
  • http://www.csie.ntu.edu.tw/~cjlin/libsvm/GUI

19
Data Representation
  • We can represent the input data in many ways (a sketch of these transforms follows below)
  • Unprocessed vector (145 dimensions, as is)
  • Second-order information (first time derivative)
  • Third-order information (second time derivative)
  • Frequency information (Power Spectral Density)
  • Wavelet transforms (Daubechies, Symlet)
  • Result: only small differences in accuracy!
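
A sketch of these representations, assuming SciPy and PyWavelets; the wavelet orders (db4, sym4), the decomposition level, and the PSD segment length are illustrative assumptions, and the waveform is a placeholder.

```python
import numpy as np
from scipy.signal import welch
import pywt   # PyWavelets

waveform = np.random.default_rng(2).standard_normal(145)   # placeholder 145-point ERG trace

raw = waveform                                    # unprocessed vector, as is
d1 = np.diff(waveform, n=1)                       # first time derivative
d2 = np.diff(waveform, n=2)                       # second time derivative
_, psd = welch(waveform, fs=1000.0, nperseg=64)   # power spectral density
daub = np.concatenate(pywt.wavedec(waveform, "db4", level=3))   # Daubechies coefficients
symm = np.concatenate(pywt.wavedec(waveform, "sym4", level=3))  # Symlet coefficients
```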

20
Data Representation
  • Example: wavelet representations
  • i.e. some indications, but nothing statistically significant (5% level)

21
Cross Entropy
22
SVM Sensitivity Analysis
23
SVM Sensitivity Analysis (Windowed)
24
Comparison of Estimation Techniques
25
Comparison of Estimation Techniques
26
Comparison of Estimation Techniques