Optimization-Based Data Mining Approaches in Neuroscience Research presentation

Title: Optimization-Based Data Mining Approaches in Neuroscience Research

1
Optimization-Based Data Mining Approaches in
Neuroscience Research

Panos M. Pardalos
University of Florida

2
Introduction

Data Mining: the practice of searching through
large amounts of computerized data to find useful
patterns or trends.
Optimization: an act, process, or methodology of
making something (as a design, system, or
decision) as fully perfect, functional, or
effective as possible; specifically, the
mathematical procedures (as finding the maximum
of a function) involved in this.
(Merriam-Webster Dictionary)

3
Introduction

The combination of data mining and optimization
Find the best way to extract meaningful
patterns from data.
Not always an easy task.

4
How difficult can optimization be?

Given integers N1, N2, ..., Nk and M, find a subset of
N1, N2, ..., Nk whose sum is equal to M.
Can you find a better algorithm than O(2^k)?
Exponential complexity?
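
The subset-sum question above can be made concrete with a short sketch. The first function enumerates subsets directly, which is the O(2^k) approach the slide refers to. The second is a standard pseudo-polynomial dynamic program that assumes non-negative integers; its running time grows with the magnitude of M, so it does not remove the exponential dependence on input size. Function names are illustrative.

```python
from itertools import combinations


def subset_sum_bruteforce(numbers, target):
    """Try every subset; there are 2^k subsets of k numbers."""
    for size in range(len(numbers) + 1):
        for combo in combinations(numbers, size):
            if sum(combo) == target:
                return list(combo)
    return None


def subset_sum_dp(numbers, target):
    """Pseudo-polynomial DP, O(k * target). Assumes non-negative integers."""
    reachable = {0: []}  # achievable sum -> one subset that achieves it
    for n in numbers:
        # Iterate over a snapshot so each number is used at most once.
        for s, subset in list(reachable.items()):
            t = s + n
            if t <= target and t not in reachable:
                reachable[t] = subset + [n]
    return reachable.get(target)
```

For example, both functions find a subset of [3, 34, 4, 12, 5, 2] that sums to 9, and both return None for a target of 30.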

5
Hard drive Cost

Approximately 10 times cheaper every 5 years

6
Hard Drive Capacity

Approximately 10 times more every 5 years

7
Processing power

The number of transistors in a computer processor
doubles every two years

8
References

Handbook of Massive Data Sets, co-editors J.
Abello, P.M. Pardalos, and M. Resende, Kluwer
Academic Publishers, (2002).

9
Main problems in data mining

Data preprocessing
Dimensionality reduction
Feature selection
Regression
Clustering (Unsupervised learning)
Classification (Supervised Learning)
Semi-Supervised learning (between supervised
and unsupervised)
Biclustering
Result Validation
Data Visualization/Representation
Biomedical Informatics is a challenging area with
lots of these problems.

10
Agenda

Research Background
Epilepsy
Seizure Prediction
Sources of Data
Electroencephalogram (EEG) Time Series
Dimensionality Reduction
Chaos Theory
Feature Selection for Brain Monitoring
Time Series Classification of Neuro-Physiological
States
Brain Clustering
Brain Network Models
Concluding Remarks

11
Facts About Epilepsy

At least 2 million Americans and another 40-50
million people worldwide (about 1% of the population)
suffer from epilepsy.
Epilepsy, which causes recurrent seizures, is the
second most common brain disorder (after stroke).
Epileptic seizures occur when a massive group of
neurons in the cerebral cortex suddenly begin to
discharge in a highly organized rhythmic pattern.

12
Epileptic Seizures

Seizures usually occur spontaneously, in the
absence of external triggers.
Seizures cause temporary disturbances of brain
functions such as motor control, responsiveness
and recall which typically last from seconds to a
few minutes.
Seizures may be followed by a post-ictal period
of confusion or impaired sensorium that can
persist for several hours.

13
10-second EEGs Seizure Evolution
14
Why do we care?

Based on 1995 estimates, epilepsy imposes an
annual economic burden of $12.5 billion in the
U.S. in associated health care costs and losses
in employment, wages, and productivity.
Cost per patient ranged from $4,272 for persons
with remission after initial diagnosis and
treatment to $138,602 for persons with
intractable and frequent seizures.

15
Current Epilepsy Treatment

Pharmacological Therapy
Anti-Epileptic Drugs (AEDs)
Mainstay of epilepsy treatment
Approximately 25% to 30% of patients remain unresponsive
Epilepsy Resective Surgery
Requires long-term invasive EEG monitoring to
locate a specific, localized part of the brain
where the seizures are thought to originate
50% of pre-surgical candidates do not undergo
resective surgery
Multiple epileptogenic zones
Epileptogenic zone located in functional brain
tissue
Only 50-60% of surgery cases result in seizure
freedom

16
Current Epilepsy Treatment

Electrical Stimulation (Vagus nerve stimulator)
Parameters (amplitude and duration of
stimulation) arbitrarily adjusted
As effective as one additional AED dose
Side Effects
Seizure Prediction?
Monitoring Unit?
Forecasting Impending Seizures?
Seizure Control?
Deep Brain Stimulator?

17
Electroencephalogram (EEG)

is a traditional tool for evaluating the
physiological state of the brain.
offers excellent spatial and temporal resolution
to characterize the rapidly changing electrical
activity of the brain
captures voltage potentials produced by brain
cells while communicating.
In an EEG, electrodes are implanted deep in the brain
or placed on the scalp over multiple areas of the
brain to detect and record patterns of electrical
activity and check for abnormalities.

18
From Microscopic to Macroscopic Level
(Electroencephalogram - EEG)
19
Electrode Montage and EEGs
20
Scalp EEG Data Acquisition
21
Open Problems

Is the seizure occurrence random?
If not, can seizures be predicted?
If yes, are there seizure pre-cursors (in EEGs)
preceding seizures?
If yes, what data mining techniques can be used
to indicate these pre-cursors?
Does normal brain activity differ from
abnormal brain activity?

22
Goals of Research

Test the hypothesis that seizures are not a
random process.
Demonstrate that seizures could be predicted
Feature Selection to identify seizure pre-cursors
(Statistical Process Control)
Demonstrate that normal and abnormal EEGs can be
differentiated
Time Series Classification
Better understand the epileptogenic process: how
seizures are initiated and propagated.
Brain Clustering
Develop a closed-loop seizure control device
(Brain Pacemaker)

23
Dimensionality Reduction

Chaos Theory

24
EEGs with the Curse of Dimensionality

The brain is a non-stationary system.
EEG time series is non-stationary.
With 200 Hz sampling, 1 hour of EEGs is comprised
of
200 × 60 × 60 × 30 = 21,600,000 data points ≈ 43.2 MB
(assuming 16-bit ASCII format)
1 day ≈ 1.04 GB
1 week ≈ 7.28 GB
20 patients ≈ 0.15 TB

Kilobytes → Megabytes → Gigabytes → Terabytes
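
The storage figures on this slide can be checked with a few lines of arithmetic. This is a sketch under two assumptions that the slide does not state explicitly: the factor 30 is the number of recording channels, and each 16-bit sample takes 2 bytes. Decimal units (1 MB = 10^6 bytes) are used.

```python
SAMPLING_HZ = 200
CHANNELS = 30          # assumption: the factor 30 is the channel count
BYTES_PER_SAMPLE = 2   # assumption: one 16-bit value per sample

MB, GB, TB = 1e6, 1e9, 1e12


def eeg_bytes(hours):
    """Raw storage needed for the given number of recording hours."""
    samples = SAMPLING_HZ * 3600 * hours * CHANNELS
    return samples * BYTES_PER_SAMPLE


per_hour_mb = eeg_bytes(1) / MB             # one hour
per_day_gb = eeg_bytes(24) / GB             # one day
per_week_gb = eeg_bytes(24 * 7) / GB        # one week
cohort_tb = 20 * eeg_bytes(24 * 7) / TB     # 20 patients, one week each
```

Computed without intermediate rounding, one week comes to about 7.26 GB; the slide's 7.28 GB follows from multiplying the rounded daily figure of 1.04 GB by 7.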
25
Data Transformation Using Chaos Theory

Measure the brain dynamics from time series
Stock Market
Currency Exchanges (e.g., Swedish Kronor)
Apply dynamical measures (based on chaos theory)
to non-overlapping EEG epochs of 10.24 seconds
(2048 points at 200 Hz).
Maximum Short-Term Lyapunov Exponent
measure the average uncertainty along the local
eigenvectors and phase differences of an
attractor in the phase space
measure the stability/chaoticity of EEG signals
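
The epoching step can be sketched as follows, assuming a single-channel signal stored as a Python list sampled at 200 Hz, so that 2048 points cover 10.24 seconds. The function name is illustrative, and the estimation of STLmax within each epoch is not shown.

```python
def split_into_epochs(signal, epoch_len=2048):
    """Cut a signal into consecutive non-overlapping epochs.

    Trailing samples that do not fill a whole epoch are dropped.
    """
    n_epochs = len(signal) // epoch_len
    return [signal[k * epoch_len:(k + 1) * epoch_len] for k in range(n_epochs)]
```

A dynamical measure would then be computed once per epoch, giving one value every 10.24 seconds per electrode.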

26
Measure of Chaos
27
STLmax Profiles
28
Hidden Synchronization Patterns
29
How similar are they? Statistics to quantify the
convergence of STLmax

By paired-T statistic
Per electrode, for EEG signal epochs i and j,
suppose their STLmax values in the epochs (of
length 60 points, 10 minutes) are

The T-index between EEG signal epochs i and j is
defined as
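
Assuming the T-index takes the usual paired t-statistic form, T = |mean(d)| / (sd(d) / sqrt(N)), where d is the pointwise difference between two STLmax profiles over N points (60 in the slide), it could be computed roughly as follows. The function name and the handling of zero variance are illustrative choices, not taken from the slides.

```python
import math
from statistics import mean, stdev


def t_index(stl_i, stl_j):
    """Paired-T index between two equally long STLmax profiles (sketch)."""
    diffs = [a - b for a, b in zip(stl_i, stl_j)]
    n = len(diffs)
    avg = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        # Assumption: identical offsets give 0 if equal, else infinity.
        return 0.0 if avg == 0 else float("inf")
    return abs(avg) / (sd / math.sqrt(n))
```

A small T-index means the two profiles are statistically close over the window, which is how convergence is read in the following slides.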
30
Statistically Quantifying the Convergence
31
Convergence of STLmax
32
Why Feature Selection?

Not every electrode site shows the convergence.
Feature Selection: Select the electrodes that are
most likely to show the convergence preceding the
next seizure.

33
Feature Selection

Quadratic Integer Programming with Quadratic
Constraints

34
Optimization Problem

Optimization
We apply optimization techniques to find a group
of electrode sites such that
They are the most converged (in STLmax) electrode
sites during 10-min window before the seizure
They show the dynamical resetting (diverged in
STLmax) during 10-min window after the seizure.
Such electrode sites are defined as critical
electrode sites.
Hypothesis
The critical electrode sites should be most
likely to show the convergence in STLmax again
before the next seizure.

35
Notation and Modeling

x is an n-dimensional column vector (decision
variables), where each xi represents the
electrode site i.
xi = 1 if electrode i is selected to be one of
the critical electrode sites.
xi = 0 otherwise.
Q is an (n×n) matrix, each element qij of which
represents the T-index between electrodes i and j
during the 10-minute window before a seizure.
b is an integer constant (the number of critical
electrode sites).
D is an (n×n) matrix, each element dij of which
represents the T-index between electrodes i and j
during the 10-minute window after a seizure.
a = 2.662 b(b-1), a constant. 2.662 is
the critical value of the T-index, as previously
defined, to reject H0: two brain sites acquire
identical STLmax values within a 10-minute window
36
Multi-Quadratic Integer Programming

To select critical electrode sites, we formulated
this problem as a multi-quadratic integer (0-1)
programming (MQIP) problem with
objective function to minimize the average
T-index among electrode sites
a linear constraint to identify the number of
critical electrode sites
a quadratic constraint to ensure that the
selected electrode sites show the dynamical
resetting
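
For small n the selection problem can be illustrated by brute force. This is only a sketch of the model, not the MILP reformulation discussed in the next slides. It assumes the quadratic constraint has the form x'Dx >= a (the selected sites are diverged after the seizure) and that diagonal entries are excluded, so each of the b(b-1) ordered pairs contributes one T-index value. The function name is hypothetical.

```python
from itertools import combinations


def select_critical_sites(Q, D, b, t_critical=2.662):
    """Pick b sites minimizing the pre-seizure T-index sum (x'Qx),
    subject to the post-seizure sum (x'Dx) being at least a."""
    n = len(Q)
    a = t_critical * b * (b - 1)
    best_sites, best_value = None, float("inf")
    for sites in combinations(range(n), b):
        pre = sum(Q[i][j] for i in sites for j in sites if i != j)
        post = sum(D[i][j] for i in sites for j in sites if i != j)
        if post >= a and pre < best_value:
            best_sites, best_value = sites, pre
    return best_sites, best_value
```

Enumeration examines n-choose-b subsets, which quickly becomes impractical as the number of electrodes grows; that is the motivation for an exact integer programming formulation.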

37
Conventional Linearization Approach for
Multi-Quadratic 0-1 Problem
38
Theoretical Results: MILP formulation for MQIP
problem

Consider the MQIP problem
We proved that the MQIP program is EQUIVALENT to
a MILP problem with the SAME number of integer
variables.

Equivalent
39
Empirical Results: Performance on Larger Problems
40
Hypothesis Testing - Simulation

Hypothesis
The critical electrode sites should be most
likely to show the convergence in STLmax (drop in
T-index below the critical value) again before
the next seizure.
The critical electrode sites are electrode sites
that
are the most converged (in STLmax ) electrode
sites during 10-min window before the seizure
show the dynamical resetting (diverged in STLmax
) during 10-min window after the seizure
Simulation
Based on 3 patients with 20 seizures, we compare
the probability of showing the convergence in
STLmax (drop in T-index below the critical value)
before the next seizure between the electrode
sites, which are
Critical electrode sites
Randomly selected (5,000 times)

41
Optimal VS Non-Optimal
42
Simulation - Results
43
Statistical Process Control: How to automate the
system?
44
Automated Seizure Warning System
Select critical electrode sites after every
subsequent seizure
Continuously calculate STLmax from multichannel
EEG.
EEG Signals
Give a warning when T-index value drops below a
critical value
45
Data Characteristics
46
Performance Evaluation for ASWS

To test this algorithm, a warning was considered
to be true if a seizure occurred within 3 hours
after the warning.
Sensitivity
False Prediction Rate: average number of false
warnings per hour
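
The two evaluation measures can be sketched as below. The 3-hour horizon and the false prediction rate follow the slide; the definition of sensitivity used here (the fraction of seizures preceded by at least one warning within the horizon) is an assumption, since the slide does not spell it out. All times are in hours and the function name is illustrative.

```python
def evaluate_warnings(warning_times, seizure_times, total_hours, horizon=3.0):
    """Return (sensitivity, false predictions per hour) for a warning log."""
    # A warning is true if some seizure occurs within `horizon` hours after it.
    true_warnings = [
        w for w in warning_times
        if any(0 < s - w <= horizon for s in seizure_times)
    ]
    # A seizure counts as predicted if some warning preceded it within the horizon.
    predicted = [
        s for s in seizure_times
        if any(0 < s - w <= horizon for w in warning_times)
    ]
    false_warnings = len(warning_times) - len(true_warnings)
    sensitivity = len(predicted) / len(seizure_times) if seizure_times else 0.0
    false_rate = false_warnings / total_hours
    return sensitivity, false_rate
```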

47
Training Results
Performance characteristics of automated seizure
warning algorithm with the best
parameter-settings of training data set.
48
RECEIVER OPERATING CHARACTERISTICS (ROC)

ROC curve (receiver operating characteristic) is
used to indicate an appropriate trade-off that
one can achieve between
the false positive rate (1-Specificity, plotted
on X-axis) that needs to be minimized
the detection rate (Sensitivity, plotted on
Y-axis) that needs to be maximized.

49
Test Results
Performance characteristics of automated seizure
warning algorithm with the best parameter
settings on testing data set.
50
Validation of the ASWS algorithm

Temporal Properties
Surrogate Seizure Time Data Set
100 Surrogate Data Sets
Spatial Properties
Non-Optimized ASWS: Selecting non-optimal
electrode sites
100 Randomly Selected Electrodes

51
Prediction Scores: Surrogate Data and
Non-Optimized ASWS
52
Remarks

Optimization as feature selection for brain
monitoring
Developed an online real-time seizure prediction
system
Tested on the dataset of
10 patients suffering from temporal lobe seizures
90 days (2100 hours) of EEG data
58 seizures
Seizure Prediction
Predicting 70% of temporal lobe seizures on
average
Giving a false alarm rate of 0.16 per hour on
average
What's next? Fundamental questions on brain
physiology

53
Time Series Classification I

Support Vector Machines with Dynamic Time Warping

54
Other Dynamical Measures: Phase Profiles
55
Other Dynamical Measures: Entropy H of Attractor
56
Classification of Physiological States
57
Support Vector Machines
From 1 electrode
58
Input

Standard SVM Input
30 electrodes × 30 data points × 3 dynamical
features = 2,700 features
Time Series SVM Input
3029 data pairs, 3 dynamical features 2,700
90 features

59
Dynamic Time Warping
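
A minimal sketch of the classic dynamic time warping distance between two one-dimensional series, using the absolute difference as the local cost and no warping-window constraint. How the distance is combined with the SVM is not shown in the transcript, so this illustrates only the distance itself.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b[j-1] is reused
                                 cost[i][j - 1],      # b advances, a[i-1] is reused
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

Unlike a pointwise Euclidean distance, this lets two profiles with the same shape but slightly shifted timing come out as close.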
60
Preliminary Data Set

132 5-minute epochs of pre-seizure EEGs
300 5-minute epochs of normal EEGs
Pre-seizure: 0-30 minutes before seizure
Normal: 10 hours away from seizure

61
Metrics for Performance Evaluation
                          PREDICTED CLASS
                          Class=Yes    Class=No
ACTUAL CLASS  Class=Yes   a            b
ACTUAL CLASS  Class=No    c            d
a: TP (true positive)     b: FN (false negative)
c: FP (false positive)    d: TN (true negative)
62
Sensitivity and Specificity

Sensitivity measures the fraction of positive
cases that are classified as positive.
Specificity measures the fraction of negative
cases classified as negative.
Sensitivity = TP/(TP+FN)
Specificity = TN/(TN+FP)
Sensitivity can be considered as a detection
(prediction or classification) rate that one
wants to maximize.
Maximize the probability of correctly classifying
patient states.
False positive rate can be considered as
1-Specificity which one wants to minimize.
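
The two ratios can be computed directly from paired lists of actual and predicted labels. This is a small sketch with 1 (or True) meaning the positive class; the function name is illustrative.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) from binary label lists."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    fp = sum(1 for a, p in pairs if not a and p)
    tn = sum(1 for a, p in pairs if not a and not p)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity
```

The false positive rate is then 1 minus the returned specificity.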

63
Leave-one-out Cross Validation

Cross-validation can be seen as a way of applying
partial information about the applicability of
alternative classification strategies.
K-fold cross validation
Divide all the data into k subsets of equal size.
Train a classifier using k-1 groups of training
data.
Test a classifier on the omitted subset.
Iterate k times.
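
The k-fold procedure above can be sketched as an index generator. Leave-one-out cross validation is the special case where k equals the number of samples. When k does not divide the sample count evenly, the folds in this sketch differ in size by at most one. The function name is illustrative, and no shuffling is applied.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n_samples) if i not in held]
        yield train, held_out
```

A classifier would be trained on each `train` list and scored on the matching `held_out` list, with the k scores averaged at the end.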

64
Empirical Results
65
Automated Seizure Prediction Paradigm
[Figure: block diagram with components Multichannel Data Acquisition, Feature Extraction / Cluster Analysis, Pattern Recognition, Interface Technology, User, and initiation of a variety of therapies (e.g., electrical stimulation via VNS, drug injection)]
66
Related Patents

Multi-dimensional multi-parameter time series processing for seizure warning and prediction. Patent 7,263,467 (Issued on August 28, 2007).
Optimization of Multi-dimensional Time Series Processing for seizure warning and prediction. Patent 7,373,199 (Issued on May 13, 2008).
Optimization of spatio-temporal pattern processing for seizure warning and prediction. Patent 7,461,045 (Issued on December 2, 2008).
Multi-dimensional dynamical analysis. U.S. Utility Patent application filed on December 21, 2006, Serial No. 11/339,606.
Closed-Loop State-Dependent Seizure Prevention Systems. U.S. Utility Patent application filed on December 19, 2006, Serial No. 11/641,292.

67
Brain Network Models

Brain Connectivity Networks Based on fMRI Data

68
The Problem

Certain neurological diseases are very difficult
to diagnose at early stages
Functional Magnetic Resonance Imaging (fMRI)
provides a vast amount of information
about the structure and function of the human brain, but
there is a lack of methods to analyse these data
Computational methods and algorithms based on
mathematical models should be applied in order to
find and recognize key patterns in this "ocean"
of data

69
Network Models

Network models of human brain
Partition of the brain into regions of interest
Functional interconnections between regions in
brain

70
Connectivity Networks
71
MRI Data

Blood flow level as an indicator of neuronal
activity
Representation of values of signal in spatial
voxels as 2D and 3D images

72
fMRI Data

Measurements are taken every 2 seconds over
6 minutes for each 2 mm × 2 mm × 2 mm voxel
of the brain
The fMRI data is therefore a set of time series,
corresponding to particular elementary volumes of
the brain. In our data set each series contains
180 elements.

73
fMRI Data, Vector Representation
[Figure: voxel position vector (x, y, z) in a 3D coordinate system with origin 0 and axes X, Y, Z]
74
Small World Networks

Small world phenomenon first described by Stanley
Milgram in the 1960s.
Six degrees of separation
Erdős number

75
From Random Graph to Regular Lattice

Random graphs generally have a low mean
shortest path length and a low clustering
coefficient
Regular lattices have a high mean shortest path
length and a high clustering coefficient
Small world networks have a low mean shortest path
length while keeping a high clustering coefficient

76
Random Graph vs Regular Lattice
77
Small World Network
78
Quantitative Measures of Small World Property

Characteristic path length
Clustering coefficient
Global efficiency
Nodal efficiency
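
These four measures can be sketched for an unweighted, undirected graph stored as a dictionary mapping each node to a set of neighbours. This is a simplification: the connectivity networks later in the talk are weighted, and the exact definitions used there are not given in the transcript. Efficiency is taken here as the mean inverse shortest-path length, and all function names are illustrative.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def nodal_efficiency(adj, node):
    """Mean of 1/d(node, v) over all other nodes (unreachable counts as 0)."""
    dist = bfs_distances(adj, node)
    return sum(1.0 / d for v, d in dist.items() if v != node) / (len(adj) - 1)


def global_efficiency(adj):
    """Average nodal efficiency over all nodes."""
    return sum(nodal_efficiency(adj, u) for u in adj) / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over reachable ordered pairs."""
    total, pairs = 0, 0
    for u in adj:
        for v, d in bfs_distances(adj, u).items():
            if v != u:
                total += d
                pairs += 1
    return total / pairs


def clustering_coefficient(adj):
    """Average over nodes of the fraction of neighbour pairs that are linked."""
    coeffs = []
    for u, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for v in nbrs for w in nbrs if v < w and w in adj[v])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)
```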

79
Brain Connectivity Networks

Brain connectivity networks possess small world
properties
We predicted that network characteristics, such
as global and local efficiency values, would be
decreased for people with Parkinson's disease.

80
Nodes in Connectivity Network

How to define brain regions (nodes in the
network)?
Clustering problem
Standard MNI template

81
Signal Time Series Form Clusters
time
82
Clustering Problem

Each data set contains roughly 100,000 time
series, each consisting of 180 elements
Efficient algorithms should be developed in order
to solve this problem

83
Standard MNI Brain Atlas

Partition of the brain into 116 brain regions

84
Edges in Connectivity Network

Weighted graph with nodes corresponding to MNI
brain regions
Weights of edges defined based on correlation
between averaged neural activity over the regions
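
The edge-weight idea can be sketched as follows: average the voxel time series within each region, then take the Pearson correlation between the averaged series of every pair of regions. The sketch works on raw averaged signals and leaves out any filtering or wavelet step; the data layout (a dictionary from region name to a list of voxel time series) and the function names are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def region_average(voxel_series):
    """Average several voxel time series point by point."""
    n = len(voxel_series)
    return [sum(vals) / n for vals in zip(*voxel_series)]


def connectivity_weights(regions):
    """Map each unordered region pair to the correlation of their averages."""
    means = {name: region_average(v) for name, v in regions.items()}
    names = sorted(means)
    return {(a, b): pearson(means[a], means[b])
            for a in names for b in names if a < b}
```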

85
Signal Processing

Neural activity
Head movements during the MR session
Respiratory and heart rhythms
Noise

86
Maximal Overlap Discrete Wavelet Transform

Wavelet is a small wave
Wavelet transform is a decomposition of the initial
signal into a linear combination of wavelets

87
Time Series Decomposition by Wavelets
88
Wavelet Coefficients Correlation

Inter-regional correlations in resting state fMRI
data are particularly salient at frequencies
below 0.1 Hz
Second scale wavelet coefficients correspond to
the 0.06-0.12 Hz frequency range
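
The quoted band follows from the usual dyadic rule that level-j wavelet coefficients are nominally associated with frequencies between fs / 2^(j+1) and fs / 2^j, where fs is the sampling rate. With one measurement every 2 seconds, fs is 0.5 Hz. A small sketch (the function name is illustrative):

```python
def wavelet_band(level, sampling_interval_s):
    """Nominal (low, high) frequency band in Hz for a dyadic wavelet level."""
    fs = 1.0 / sampling_interval_s
    return fs / 2 ** (level + 1), fs / 2 ** level
```

For level 2 and a 2-second sampling interval this gives 0.0625 to 0.125 Hz, which the slide rounds to 0.06-0.12 Hz.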

89
Connectivity Strength

Averaged over the regions signal vectors
Define level 2 wavelet coefficients of averaged
signals , .
The connectivity between regions A and B is

90
Definition of Distance Between Nodes

For each time series S = (s1, s2, ..., sn) of size
n there is a corresponding point in n-dimensional
space
For normalized vectors x and y the distance between
end points is equal to
Therefore, (1 - corr(x,y)) may serve as a
measure of distance between time series
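
The relation between distance and correlation can be written out under the assumption that each series has been mean-centered and scaled to unit length, so that the inner product of x and y equals their Pearson correlation:

```latex
\|x - y\|^2 = \|x\|^2 + \|y\|^2 - 2\,x^{\top}y
            = 2\bigl(1 - \operatorname{corr}(x, y)\bigr),
\qquad
\|x - y\| = \sqrt{2\bigl(1 - \operatorname{corr}(x, y)\bigr)}
```

The distance is therefore a monotone function of 1 - corr(x, y): perfectly correlated series coincide, and perfectly anti-correlated series are at the maximum distance of 2.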

91
Geometrical Representation
[Figure: vectors x and y drawn from the origin 0, with the difference vector x - y; S = (s1, s2, ..., sn)]
92
Data Set

15 healthy controls, 14 Parkinson patients
The network for each subject consists of 116
nodes

93
Averaged Connectivity Networks
Control
Parkinson
94
Global Network Efficiency Values
Control (1.85 ± 0.57), Parkinson (1.12 ±
0.55), independent t-test p-value = 0.0017
95
Top 30 Nodal Efficiency Values
96
Nodal Efficiency Plot
Red line: control set; blue line: PD set
97
Discussion

Brain network properties in Parkinson's patients show
measurable alterations in comparison with healthy
controls
Further research, in particular with different
network models, may reveal patterns in brain
networks that could be used as diagnostic
criteria

98
Concluding Remarks

Overview of Epilepsy Research
Applications of Data Mining and Optimization
Techniques
Interplay between theory and application
Feature Selection
Time Series Classification
Brain Clustering
Brain Network Models

99
Related Patents

Sensor registration by global optimization procedures. Patent 7,653,513 (Issued January 26, 2010).
Atomic Magnetometer Sensor Array Magnetoencephalogram Systems and Method. United States Patent Application 20100219820 (Filed April 14, 2008).

100
References

Handbook of Massive Data Sets, co-editors J.
Abello, P.M. Pardalos, and M. Resende, Kluwer
Academic Publishers, (2002).

101
References
Clustering Challenges on Biological Networks S.
Butenko, W. A. Chaovalitwongse and P. M.
Pardalos, World Scientific (2009).

Feature Selection for Consistent Biclustering
via Fractional 0-1 Programming (with Stanislav
Busygin and Oleg A. Prokopyev), Journal of
Combinatorial Optimization, Volume 10, Number 1
(2005), pp. 7-21.
Biclustering in Data Mining (with S. Busygin
and O. Prokopyev), Computers & Operations
Research, Volume 35, Issue 9 (2008), pp.
2964-2987.
On Biclustering with Features Selection for
Microarray Data Set (with S. Busygin and O.
Prokopyev), In (BIOMAT 2005) Proceedings of the
International Symposium on Mathematical and
Computational Biology (Edited by R. Mondaini and R.
Dilao), World Scientific (2006), pp. 367-377.
Biclustering algorithms and applications in
data mining and forecasting (with P.
Xanthopoulos, N. Boyko and N. Fan) In
Encyclopedia of Operations Research and
Management Science (accepted to appear)
Wiley(2010).

102
References

Quantitative Neuroscience, co-editors P.M.
Pardalos, C. Sackellares, P. Carney, and L.
Iasemidis, Kluwer Academic Publishers, (2004).
Biocomputing, co-editors P.M. Pardalos and J.
Principe, Kluwer Academic Publishers, (2002).

103
References

New in 2010: Computational Neuroscience,
W.A. Chaovalitwongse, P.M. Pardalos, P.
Xanthopoulos (Eds.), Series: Springer Optimization
and Its Applications, Vol. 38.

104
References

Optimization in Medicine, Carlos Alves, Panos M.
Pardalos, Luis Vicente (Eds.), 2008

105
References

Handbook of Optimization in Medicine, Panos M.
Pardalos, Edwin H. Romeijn (Eds.), 2009

106
Reference

W. Chaovalitwongse, L.D. Iasemidis, P.M.
Pardalos, P.R. Carney, D.-S. Shiau, and J.C.
Sackellares. A Robust Method for Studying the
Dynamics of the Intracranial EEG: Application to
Epilepsy. Epilepsy Research, 64, 93-133, 2005.
W. Chaovalitwongse, P.M. Pardalos, and O.A.
Prokopyev. Electroencephalogram (EEG) time series
classification: Applications in epilepsy. Annals
of Operations Research, 148, 1 (2006), pp. 227-250.
Jicong Zhang, Petros Xanthopoulos, Chang-Chia
Liu, Panos M. Pardalos. Real-time differentiation
of nonconvulsive status epilepticus from other
encephalopathies using quantitative EEG analysis:
A pilot study. Epilepsia, 51, 2 (2010), pp.
243-250.
W. Chaovalitwongse , P.M. Pardalos, L.D.
Iasemidis, D.-S. Shiau, and J.C. Sackellares.
Dynamical Approaches and Multi-Quadratic Integer
Programming for Seizure Prediction. Optimization
Methods and Software, 20 (2-3) 383-394, 2005 .
L.D. Iasemidis, P.M. Pardalos, D.-S. Shiau, W.
Chaovalitwongse, K. Narayanan, A. Prasad, K.
Tsakalis, P.R. Carney, and J.C. Sackellares. Long
Term Prospective On-Line Real-Time Seizure
Prediction. Journal of Clinical Neurophysiology,
116 (3) 532-544, 2005.
P.M. Pardalos, W. Chaovalitwongse, L.D.
Iasemidis, J.C. Sackellares, D.-S. Shiau, P.R.
Carney, O.A. Prokopyev, and V.A. Yatsenko.
Seizure Warning Algorithm Based on Spatiotemporal
Dynamics of Intracranial EEG. Mathematical
Programming, 101(2) 365-385, 2004. (INFORMS
Pierskalla Best Paper Award 2004)
W. Chaovalitwongse , P.M. Pardalos, and O.A.
Prokopyev. A New Linearization Technique for
Multi-Quadratic 0-1 Programming Problems.
Operations Research Letters, 32(6) 517-522,
2004. (Rank 5th in Top 25 Articles in Operations
Research Letters)

107
Thank you for your attention!

Questions?

108
Conference in 2011
