K. P. Unnikrishnan - PowerPoint PPT Presentation

Slides: 24
Provided by: debprakas
Transcript and Presenter's Notes

1
Data Mining Methods for Electronic Medical Records
  • K. P. Unnikrishnan

Collaborators
  • Indian Institute of Science: P. S. Sastry (Srivatsan Laxman)
  • Univ. Michigan: Vijay Nair, Casey Diekman, Kohinoor Dasgupta
  • Virginia Tech: Naren Ramakrishnan, Debprakash Patnaik
  • UC Davis: Anne Smith
  • UCSF: Loren Frank
  • RIKEN: Kazuo Okanoya
  • Wayne State: Sorin Draghici
2
Applications
3
Discovering episodes with temporal constraints
  • In neuroscience, inter-event delays arise from synaptic and
    axonal delays.
  • Automatic discovery of inter-event intervals and
    episodes
  • Inter-event times of event occurrences carry
    valuable information
  • Goal: unearthing network connectivity patterns

4
Graph Edges Patterns in data
[Figure: graph with nodes U, X, Y, Z, G, M; edges annotated with delays tUX, tXY, tYZ, tGM]
A. Efficient level-wise mining: counting all episodes above a threshold
B. Discovering inter-event intervals: discovering the best-fit interval from a
supplied set
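
Step B can be illustrated with a minimal sketch. It assumes that "best fit" means the candidate interval that explains the most A-to-B event pairs, and that the supplied candidates are disjoint. The names `count_pairs` and `best_fit_interval`, the scoring rule, and the toy timestamps are illustrative assumptions, not taken from the slides.

```python
from bisect import bisect_left, bisect_right

def count_pairs(a_times, b_times, low, high):
    """Count B events that have at least one earlier A event
    with low < t_B - t_A < high (strict bounds, an assumption)."""
    a_sorted = sorted(a_times)
    hits = 0
    for tb in b_times:
        # A must satisfy tb - high < t_A < tb - low
        first = bisect_right(a_sorted, tb - high)
        last = bisect_left(a_sorted, tb - low)
        if last > first:
            hits += 1
    return hits

def best_fit_interval(a_times, b_times, candidates):
    """Return the candidate (low, high) with the highest pair count.
    Ties go to the candidate listed first."""
    return max(candidates, key=lambda iv: count_pairs(a_times, b_times, *iv))

# Toy data (hypothetical): B tends to follow A after about 7 time units.
a_times = [0, 10, 20, 30]
b_times = [7, 17, 27, 33]
candidates = [(0, 5), (5, 10), (10, 15)]
best = best_fit_interval(a_times, b_times, candidates)
```

With overlapping or unequal-width candidates, a raw count favors wide intervals, so a normalized score would be needed there.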
5
Tracking an Evolving Network
6
Discovering Rare Patterns
  • 10,000 spikes from 26 neurons
  • 11 spikes (0.1% of the total) are in a pattern
    G-M-B-E-A-T-S-F-O-R-D
  • A single occurrence is statistically significant

7
Mining EMR using GMiner
  • Example EMR
      Patient ID_0
      Recorded medical event "DIAG_1" on Day 0
      Recorded medical event "DIAG_3" on Day 1
      Recorded medical event "PRES_1" on Day 5
      Recorded medical event "PRES_3" on Day 6
      Recorded medical event "EVT_L" on Day 7
      Recorded medical event "TEST_4" on Day 7
      ...
  • Embedded patterns
  • TEST_1 -> TEST_2 -> DIAG_1 -> PRES_1
  • TEST_3 -> TEST_4 -> DIAG_2 -> PRES_2
  • TEST_5 -> DIAG_3 -> PRES_3
  • GMiner Results
  • No. of 3 node frequent episodes: 5
  • TEST_5 -[4-6]-> DIAG_3 -[4-6]-> PRES_3 (0.78141) 242
  • No. of 4 node frequent episodes: 2
  • TEST_1 -[4-6]-> TEST_2 -[4-6]-> DIAG_1 -[4-6]-> PRES_1
    (0.81822) 187
  • TEST_3 -[4-6]-> TEST_4 -[4-6]-> DIAG_2 -[4-6]-> PRES_2
    (0.80452) 175
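
Before mining, records like the example EMR have to be turned into per-patient event sequences. The sketch below shows one way to do that, assuming one record per line in the wording shown on the slide. The function name `parse_emr` and the line layout are assumptions. GMiner's real input format is not given in the slides.

```python
import re

# Patterns follow the wording of the example EMR on the slide.
EVENT_RE = re.compile(r'Recorded medical event "([^"]+)" on Day (\d+)')
PATIENT_RE = re.compile(r'Patient (\S+)')

def parse_emr(text):
    """Return {patient_id: [(day, event), ...]} ordered by day."""
    records = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        event = EVENT_RE.match(line)
        if event:
            if current is not None:
                records[current].append((int(event.group(2)), event.group(1)))
            continue
        patient = PATIENT_RE.match(line)
        if patient:
            current = patient.group(1)
            records.setdefault(current, [])
    for events in records.values():
        # Stable sort keeps the recorded order of same-day events.
        events.sort(key=lambda pair: pair[0])
    return records

sample = '''Patient ID_0
Recorded medical event "DIAG_1" on Day 0
Recorded medical event "DIAG_3" on Day 1
Recorded medical event "PRES_1" on Day 5
Recorded medical event "PRES_3" on Day 6
Recorded medical event "EVT_L" on Day 7
Recorded medical event "TEST_4" on Day 7'''
records = parse_emr(sample)
```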

8
Imaginary Situation 1
  • Patients arriving in Emergency Department (ED)
  • Events: diagnostic tests from the EMR (historical data),
    represented here as letters
  • Event patterns can be discovered
  • Patients can be flagged as high-priority (based
    on partial patterns)

A -[15-30 min]-> B -[15-30 min]-> C -[15-30 hours]-> Y
[Figure: event timelines along a time axis, with a flag raised at the current time]
Historical Data
Patient 1: MAZXYCQBGMQPTARYCDJBSPASWCJDGMDYZXHGDH
Patient 2: ZXHADHOTCBFAKVPCLVIRXY
Patient 3: SARYCDJBSPASWCJDGMDYKVPQLVIRX
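
Flagging on a partial pattern can be sketched as follows. The sketch assumes each event carries a timestamp in minutes and that the gap bounds are inclusive. The function names `has_partial_match` and `flag_patients` and the toy patient data are hypothetical. The letter sequences on the slide carry no timestamps, so they cannot be used directly.

```python
def has_partial_match(events, prefix, gaps):
    """events: list of (time, event_type), sorted by time.
    prefix: leading event types of a pattern, e.g. ["A", "B", "C"].
    gaps: one (low, high) bound per consecutive pair in prefix.
    True if the prefix occurs with every gap inside its bound."""
    # reach[i] holds the times at which prefix[:i + 1] was completed.
    reach = [[] for _ in prefix]
    for t, e in events:
        # Go from the last position down so one event fills one slot.
        for i in range(len(prefix) - 1, -1, -1):
            if e != prefix[i]:
                continue
            if i == 0:
                reach[0].append(t)
            else:
                low, high = gaps[i - 1]
                if any(low <= t - tp <= high for tp in reach[i - 1]):
                    reach[i].append(t)
    return bool(reach[-1])

def flag_patients(patients, prefix, gaps):
    """Return the ids of patients whose events contain the prefix."""
    return [pid for pid, ev in patients.items()
            if has_partial_match(ev, prefix, gaps)]

# Toy data (hypothetical), times in minutes.
patients = {
    "P1": [(0, "A"), (20, "B"), (45, "C")],
    "P2": [(0, "A"), (50, "B"), (70, "C")],
    "P3": [(0, "X"), (5, "A"), (25, "B")],
}
flagged = flag_patients(patients, ["A", "B", "C"], [(15, 30), (15, 30)])
```

Shortening the prefix to A followed by B trades specificity for earlier warning: more patients are flagged, sooner.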
9
GMiner Graph Visualization
10
Imaginary Situation 2
  • 1,000 patients come through the hospital
  • Most of these events occur independently of one
    another or with weak dependence
  • 2 of these patients have the same condition and
    show the sequence of events we looked at before
  • A -[4 to 6 hours]-> B -[1 to 3 hours]-> C -[5 to 7
    hours]-> Y
  • However, another 2 of the patients have the same
    condition but show a different pattern of events
    with different time delays
  • A -[9 to 11 hours]-> B -[3 to 5 hours]-> C -[11 to 13
    hours]-> N
  • Imagine that event Y represents a positive
    outcome, while event N represents a negative
    outcome.

11
GMiner Results (Simulation Example 2)
12
Backup
13
Complex dynamical system
  • What is the problem we are trying to solve?
  • Large graph with many nodes and edges
  • Activity from many of the nodes is available
  • How do we get the graph (strength, direction,
    delay) out?

14
With inter-event time constraints
  • Inter-event times in serial episodes
  • Inter-event expiry constraint (0 < Δt_i < T_X)
  • Inter-event interval constraint (T_low < Δt_i <
    T_high)
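
The two constraint types can be checked on a single candidate occurrence as in the sketch below. It assumes strict inequalities, and the function names `satisfies_expiry` and `satisfies_intervals` are illustrative.

```python
def gaps_of(times):
    """Inter-event times of one occurrence: t[i + 1] - t[i]."""
    return [later - earlier for earlier, later in zip(times, times[1:])]

def satisfies_expiry(times, t_x):
    """Expiry constraint: every gap lies strictly between 0 and t_x."""
    return all(0 < g < t_x for g in gaps_of(times))

def satisfies_intervals(times, intervals):
    """Interval constraint: gap i lies strictly inside intervals[i],
    given as (t_low, t_high) pairs, one per consecutive event pair."""
    gaps = gaps_of(times)
    if len(gaps) != len(intervals):
        return False
    return all(low < g < high for g, (low, high) in zip(gaps, intervals))

# Event times of one hypothetical 4-node occurrence: gaps are 3, 6, 4.
occurrence = [1, 4, 10, 14]
```

The expiry constraint is the special case of the interval constraint in which every pair shares the interval (0, T_X).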

15
Counting Episodes with inter-event constraints
  • Complex state-transitions required for counting
    with inter-event constraints
  • Space complexity O(mnC) and Time Complexity
    O(mnC)

[Figure: automaton with states Accept_A(), Accept_B(), Accept_C(), Accept_D(); a data read head scans the event sequence A1, A2, B4, A5, C10, B12, C13, D17; interval labels 5 and 10]
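
A simplified sketch of counting non-overlapped occurrences of a serial episode under inter-event interval constraints is given here. It is not the automaton from the slide. It keeps, for each episode position, the times at which that prefix was reached, and resets after each complete occurrence. The function name `count_non_overlapped`, the strict bounds, and the interval values in the example are assumptions. The event letters and numbers are those of the slide, read as event type and time.

```python
def count_non_overlapped(events, episode, intervals):
    """events: list of (time, event_type) in time order.
    episode: event types in order, e.g. ["A", "B", "C", "D"].
    intervals: (low, high) per consecutive pair; low < gap < high."""
    n = len(episode)
    waiting = [[] for _ in range(n)]  # waiting[i]: times prefix i was reached
    count = 0
    for t, e in events:
        for i in range(n - 1, -1, -1):
            if e != episode[i]:
                continue
            if i > 0:
                low, high = intervals[i - 1]
                if not any(low < t - tp < high for tp in waiting[i - 1]):
                    continue
            if i == n - 1:
                # Full occurrence found: count it and start afresh so
                # that counted occurrences do not overlap.
                count += 1
                waiting = [[] for _ in range(n)]
                break
            waiting[i].append(t)
    return count

events = [(1, "A"), (2, "A"), (4, "B"), (5, "A"),
          (10, "C"), (12, "B"), (13, "C"), (17, "D")]
episode = ["A", "B", "C", "D"]
# Assumed constraints: A -(0,5)-> B -(5,10)-> C -(0,5)-> D
found = count_non_overlapped(events, episode, [(0, 5), (5, 10), (0, 5)])
```

This version never prunes old timestamps from `waiting`, so its memory use grows with the data. A production counter would drop entries that can no longer satisfy any upper bound.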
16
Parallelizing counting
  • Run several parallel automata at different
    start states for the same episode (Map step)
  • Merge count and state info from each automaton
    (Concatenate step)
  • Implemented on Nvidia GTX280 GPU
    (1.3 GHz clock, 1 GB device memory)
  • 200X speed-up w.r.t. CPU
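
The map and concatenate steps can be sketched on the CPU for the simplest case, a serial episode without time constraints. Each data segment is processed once per possible start state. The per-segment results are then chained from left to right. The function names and the toy sequence are assumptions, and this is not the GPU implementation described on the slide.

```python
def run_automaton(segment, episode, start_state):
    """Scan one segment starting in start_state (number of episode
    events already matched). Return (occurrences, end_state)."""
    state, count = start_state, 0
    for e in segment:
        if e == episode[state]:
            state += 1
            if state == len(episode):
                count += 1
                state = 0
    return count, state

def map_step(segments, episode):
    """For every segment, run one automaton per start state. The runs
    are independent, so they could be dispatched to parallel workers."""
    return [[run_automaton(seg, episode, s) for s in range(len(episode))]
            for seg in segments]

def concatenate_step(tables):
    """Chain segment results: the end state of one segment selects
    which precomputed run of the next segment to use."""
    state, total = 0, 0
    for table in tables:
        count, state = table[state]
        total += count
    return total

sequence = list("ABCABXCABCAB")
episode = ["A", "B", "C"]
segments = [sequence[i:i + 4] for i in range(0, len(sequence), 4)]
parallel_count = concatenate_step(map_step(segments, episode))
serial_count, _ = run_automaton(sequence, episode, 0)
```

Because the automaton is deterministic, the chained result equals a single sequential scan. The price is one extra run per start state in each segment.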

17
Cortical cultures on micro-electrode arrays
18
[Figure: relative spike counts plotted against days]
19
(No Transcript)
20
Data mining Bayesian GLM
21
TDMiner Finds Fault Correlations
22
Finding Relevant Fault Correlations
[Figure: timeline of statistically significant correlations, with markers "Problem begins" and "Problem fixed"]
By using TDMiner, the root cause could have been
identified 2.5 weeks earlier
23
LOMA Robot Problems