MINING RELATIONSHIPS AMONG INTERVAL-BASED EVENTS FOR CLASSIFICATION

1
MINING RELATIONSHIPS AMONG INTERVAL-BASED EVENTS FOR CLASSIFICATION
  • Dhaval Patel, Wynne Hsu, Mong Li Lee
  • School of Computing
  • National University of Singapore

2
Outline
  • Introduction
  • Problem Tasks
  • Contributions
  • Proposed Solution
  • Experiments
  • Conclusion
  • On-going Work

3
Introduction
  • Event duration captures temporal relation between
    events
  • Diabetic patients (E1 Overlap E2)
  • Multimedia video anomaly detection in smart
    home environment
  • Financial time series - stock market data

4
Introduction
  • Encoding of temporal relation between events

[Figure: interval diagrams labelled (A Overlap B) Overlap C; (A Overlap B); ?; (A Overlap B) Overlap C]
5
Problem Tasks
  • Design a lossless representation to encode
    temporal relations among events (> 3 events)
  • Design an efficient algorithm to discover
    frequent interval-based temporal patterns
  • Apply the discovered patterns in classification

6
Contributions
  • Design an augmented hierarchical representation
  • Develop IEMiner, an Apriori-based algorithm for
    discovering frequent interval-based temporal
    patterns
  • Build IEClassifier on the discovered frequent
    interval-based temporal patterns

7
Augmented Hierarchical Representation
  • Incorporate additional count information
  • Counts are kept for five relations, in this order:
    Contain, Finish by, Meet, Overlap, Start
  • Representation is lossless

(A Overlap[0,0,0,1,0] B) Overlap[0,0,0,1,0] C
(A Overlap[0,0,0,1,0] B)
(A Overlap[0,0,0,1,0] B) Overlap[0,0,0,2,0] C
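
The encoding above can be illustrated with a minimal sketch. It rests on several assumptions that the slides do not state: the count vector is ordered [Contain, Finish by, Meet, Overlap, Start] as in the bullet list, each count tallies the relations between the earlier events and the newly added event, and the composite event spans from the earliest start to the latest end. The names `Event`, `relation` and `encode` are hypothetical.

```python
from typing import NamedTuple

class Event(NamedTuple):
    label: str
    start: int
    end: int

# Assumed order of the count vector, taken from the slide's bullet list.
COUNTED = ("Contain", "Finish by", "Meet", "Overlap", "Start")

def relation(a_start, a_end, b_start, b_end):
    """Relation of interval a to interval b, assuming a sorts first by (start, end)."""
    if a_end < b_start:
        return "Before"
    if a_end == b_start:
        return "Meet"
    if a_start == b_start:
        return "Equal" if a_end == b_end else "Start"
    if a_end == b_end:
        return "Finish by"
    return "Contain" if a_end > b_end else "Overlap"

def encode(events):
    """Build the augmented hierarchical string for a list of events."""
    ordered = sorted(events, key=lambda e: (e.start, e.end))
    first = ordered[0]
    text, lo, hi = first.label, first.start, first.end
    for k in range(1, len(ordered)):
        e = ordered[k]
        counts = [0] * len(COUNTED)
        # Tally the relation of every earlier event to the new event.
        for prev in ordered[:k]:
            r = relation(prev.start, prev.end, e.start, e.end)
            if r in COUNTED:
                counts[COUNTED.index(r)] += 1
        # Relation between the composite event so far and the new event.
        top = relation(lo, hi, e.start, e.end)
        left = text if k == 1 else f"({text})"
        text = f"{left} {top}[{','.join(map(str, counts))}] {e.label}"
        hi = max(hi, e.end)
    return text

one_overlap = encode([Event("A", 0, 4), Event("B", 2, 6), Event("C", 5, 8)])
two_overlaps = encode([Event("A", 0, 4), Event("B", 2, 6), Event("C", 3, 8)])
print(one_overlap)
print(two_overlaps)
```

In the first call C overlaps only B, in the second it overlaps both A and B, so the two arrangements receive different count vectors even though both have the shape (A Overlap B) Overlap C.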
8
IEMiner
Frequent k-pattern
Candidate generation
Candidate (k+1)-patterns
Support counting
Frequent (k+1)-patterns
9
IEMiner
Support(A Overlap[0,0,0,1,0] B) = 3/4 (75%)
Confidence((A Overlap[0,0,0,1,0] B) => Class A) = 2/3
Confidence((A Overlap[0,0,0,1,0] B) => Class B) = 1/3
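
The figures above can be reproduced with a small sketch. The slide's underlying table did not survive extraction, so the four windows below are hypothetical data chosen to match the stated values; `support` and `confidence` are illustrative names.

```python
from collections import Counter

PATTERN = "A Overlap[0,0,0,1,0] B"

# Hypothetical database: (patterns found in the window, class label).
windows = [
    ({PATTERN}, "A"),
    ({PATTERN}, "A"),
    ({PATTERN}, "B"),
    (set(), "B"),
]

def support(pattern, windows):
    """Fraction of windows that contain the pattern."""
    return sum(pattern in found for found, _ in windows) / len(windows)

def confidence(pattern, label, windows):
    """Among windows containing the pattern, the fraction with the given class."""
    labels = Counter(cls for found, cls in windows if pattern in found)
    total = sum(labels.values())
    return labels[label] / total if total else 0.0

print(support(PATTERN, windows))          # 3 of 4 windows
print(confidence(PATTERN, "A", windows))  # 2 of the 3 matching windows
print(confidence(PATTERN, "B", windows))  # 1 of the 3 matching windows
```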
10
IEMiner
Frequent k-pattern
Candidate generation
Candidate (k+1)-patterns
Support counting
Frequent (k+1)-patterns
11
Candidate generation
  • Straightforward Apriori-based approach
  • Generate level (k+1) candidates from two frequent
    k-patterns
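
For reference, the classic Apriori join that this slide alludes to can be sketched on plain sets of event labels. This is the textbook itemset version, not the authors' interval-pattern join: temporal relations are ignored, and `apriori_join` is a hypothetical name.

```python
from itertools import combinations

def apriori_join(frequent_k):
    """Join frequent k-itemsets (sorted tuples) into candidate (k+1)-itemsets."""
    frequent = set(frequent_k)
    candidates = set()
    for p, q in combinations(sorted(frequent), 2):
        if p[:-1] == q[:-1]:          # the two itemsets share their first k-1 items
            cand = p + (q[-1],)       # p < q, so the result stays sorted
            # Apriori prune: every k-subset of the candidate must be frequent.
            if all(sub in frequent for sub in combinations(cand, len(p))):
                candidates.add(cand)
    return candidates

frequent_2 = {("A", "B"), ("A", "D"), ("B", "D"), ("A", "F")}
candidates_3 = apriori_join(frequent_2)
print(candidates_3)
```

Here (A, B, F) and (A, D, F) are rejected because (B, F) and (D, F) are not frequent, which leaves (A, B, D) as the only candidate.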

12
Candidate generation
  • Candidate generation at level (k+1)
  • Generate candidates from frequent k-patterns and
    2-patterns
  • To constrain the size of the candidate set:
  • The size of the 2-pattern set is reduced in each
    iteration
  • Only selected k-patterns are expanded

13
Candidate generation
Generate 4-patterns from frequent 3-patterns and
2-patterns
  • Support Counting
  • X

A Overlap[0,0,0,1,0] B
A Before[0,0,0,0,0] D
B Before[0,0,0,0,0] D
A Before[0,0,0,0,0] F
A Before[0,0,0,0,0] G
F Before[0,0,0,0,0] G
C Contain[1,0,0,0,0] D
...
14
Candidate generation
15
Candidate generation
  • Theorem 2
  • At iteration (k+1), 2-patterns that are present
    in fewer than (k-1) frequent k-patterns will not
    generate any valid candidates.
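
One way to read the bound: a (k+1)-pattern that contains a given 2-pattern has k-1 sub-patterns of size k that still contain it (one for each other event that can be dropped), and all of them must be frequent. The pruning step can be sketched as below, under the simplifying assumption that a k-pattern is stored as the set of its pairwise relations rather than in the augmented hierarchical form. The data and the name `prune_two_patterns` are hypothetical.

```python
from collections import Counter

def prune_two_patterns(frequent_k, two_patterns, k):
    """Keep only 2-patterns that occur in at least k-1 frequent k-patterns."""
    counts = Counter(p for pattern in frequent_k for p in pattern)
    return [p for p in two_patterns if counts[p] >= k - 1]

# Hypothetical 2-patterns written as (event, relation, event).
AoB, AbD, BbD = ("A", "Overlap", "B"), ("A", "Before", "D"), ("B", "Before", "D")
AbF, BbF = ("A", "Before", "F"), ("B", "Before", "F")
AbG, FbG = ("A", "Before", "G"), ("F", "Before", "G")

# Hypothetical frequent 3-patterns, each reduced to its pairwise relations.
frequent_3 = [
    frozenset({AoB, AbD, BbD}),
    frozenset({AoB, AbF, BbF}),
    frozenset({AbF, AbG, FbG}),
]

kept = prune_two_patterns(frequent_3, [AoB, AbD, BbD, AbF, BbF, AbG, FbG], k=3)
print(kept)
```

With k = 3 the threshold is 2, so only the 2-patterns that occur in two of the frequent 3-patterns survive.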

16
Candidate generation
Generate 4-patterns from frequent 3-patterns and
2-patterns
A Overlap[0,0,0,1,0] B: 3
A Before[0,0,0,0,0] D: 3
B Before[0,0,0,0,0] D: 3
A Before[0,0,0,0,0] F: 3
A Before[0,0,0,0,0] G: 1
F Before[0,0,0,0,0] G: 1
C Contain[1,0,0,0,0] D: 2
...
17
IEMiner
Frequent k-pattern
Candidate generation
Candidate (k+1)-patterns
Support counting
Frequent (k+1)-patterns
18
Support counting
  • Count the number of windows in which each
    candidate (k+1)-pattern is present
  • For each window w:
  • Intelligently generate only those candidates
    that are present in the candidate (k+1)-pattern set
  • Increment the count of those generated candidate
    patterns
  • Issue: avoid processing unnecessary windows
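
A deliberately simplified sketch of the counting step follows. It assumes each window and each candidate is stored as a set of pairwise relations and tests every candidate against every window by subset check; the slides instead describe generating only the matching candidates per window, which this sketch does not reproduce. The name `count_support` and the data are hypothetical.

```python
from collections import Counter

def count_support(windows, candidates, min_support):
    """Count the windows containing each candidate; keep those meeting min_support."""
    counts = Counter()
    for relations in windows:
        for cand in candidates:
            if cand <= relations:      # all relations of the candidate occur in the window
                counts[cand] += 1
    needed = min_support * len(windows)
    return {c: n for c, n in counts.items() if n >= needed}

# Hypothetical relations written as (event, relation, event).
AoB, AbD, BbD = ("A", "Overlap", "B"), ("A", "Before", "D"), ("B", "Before", "D")
AbF, BbF = ("A", "Before", "F"), ("B", "Before", "F")

windows = [
    frozenset({AoB, AbD, BbD}),
    frozenset({AoB, AbD, BbD, AbF}),
    frozenset({AoB}),
    frozenset({AbF}),
]
cand_abd = frozenset({AoB, AbD, BbD})
cand_abf = frozenset({AoB, AbF, BbF})

frequent = count_support(windows, [cand_abd, cand_abf], min_support=0.5)
print(frequent)
```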

19
Optimization
  • Prefix Count
  • Selectively expand frequent k-patterns during
    candidate generation
  • Window Blacklist
  • Avoid unnecessary checking of windows to reduce
    the dataset size
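
The slides do not spell out the blacklist criterion. The sketch below assumes one plausible rule: a window that matches no candidate at the current level is skipped at all later levels. Under the Apriori property this is safe as long as the candidate set contains every frequent pattern of that level. Windows and candidates are again assumed to be sets of pairwise relations, and `count_with_blacklist` is a hypothetical name.

```python
def count_with_blacklist(windows, candidates, blacklist):
    """Count candidate occurrences, skipping and extending the window blacklist."""
    counts = {c: 0 for c in candidates}
    for idx, relations in enumerate(windows):
        if idx in blacklist:
            continue                   # window was ruled out at an earlier level
        matched = False
        for cand in candidates:
            if cand <= relations:
                counts[cand] += 1
                matched = True
        if not matched:
            blacklist.add(idx)         # no candidate here, so no longer pattern either
    return counts

# Hypothetical relations written as (event, relation, event).
AoB, AbD, BbD = ("A", "Overlap", "B"), ("A", "Before", "D"), ("B", "Before", "D")
AbF, BbF = ("A", "Before", "F"), ("B", "Before", "F")

windows = [
    frozenset({AoB, AbD, BbD}),
    frozenset({AoB, AbD, BbD, AbF}),
    frozenset({AoB}),
    frozenset({AbF}),
]
cand_abd = frozenset({AoB, AbD, BbD})
cand_abf = frozenset({AoB, AbF, BbF})

blacklist = set()
counts = count_with_blacklist(windows, [cand_abd, cand_abf], blacklist)
print(counts, blacklist)
```

Passing the same `blacklist` set to the next level's call means the last two windows are never scanned again.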

20
IEClassifier
[Figure: IEClassifier. Input: D; labels n1, n2, ..., nn; decision strategies: Majority Vote, Highest Confidence]
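
The two decision strategies named on this slide can be sketched as follows. The slide gives only their names, so the details are assumptions: each rule is a hypothetical (pattern, class, confidence) triple, a rule fires when all of its pairwise relations occur in the instance, and `classify` is an illustrative name.

```python
from collections import Counter

def classify(relations, rules, strategy="highest_confidence"):
    """Predict a class from the rules whose pattern occurs in the instance."""
    matching = [(cls, conf) for pattern, cls, conf in rules if pattern <= relations]
    if not matching:
        return None                    # no pattern applies to this instance
    if strategy == "highest_confidence":
        return max(matching, key=lambda m: m[1])[0]
    # Majority vote: each matching rule casts one vote for its class.
    votes = Counter(cls for cls, _ in matching)
    return votes.most_common(1)[0][0]

# Hypothetical relations and rules.
AoB, AbD, BbD = ("A", "Overlap", "B"), ("A", "Before", "D"), ("B", "Before", "D")
rules = [
    (frozenset({AoB}), "Class A", 0.9),
    (frozenset({AbD}), "Class B", 0.6),
    (frozenset({BbD}), "Class B", 0.7),
]
instance = frozenset({AoB, AbD, BbD})

by_confidence = classify(instance, rules, "highest_confidence")
by_vote = classify(instance, rules, "majority_vote")
print(by_confidence, by_vote)
```

The example is built so that the two strategies disagree: the single most confident rule predicts Class A, while two of the three matching rules vote for Class B.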
21
Experiments
  • Evaluate efficiency and scalability of IEMiner on
    both synthetic and real-world datasets
  • Evaluate accuracy of IEClassifier on Hepatitis
    Dataset

22
Effect of varying minimum support
23
Effect of varying database size
24
Effect of varying pattern length
25
Effect of varying event density
26
Effect of optimization strategies
Window Blacklist
Prefix Count
27
ASL Dataset
28
Hepatitis Dataset
29
Accuracy of IEClassifier
Experiments on Hepatitis Data
30
Some Discovered Results
Hepatitis Data
31
Conclusion
  • Mining relationships among interval-based events
    is an important problem with applications in
    diverse fields
  • Proposed the Augmented Hierarchical Representation
  • Designed an efficient IEMiner algorithm
  • Designed IEClassifier based on frequent patterns
  • Temporal abstraction applied to the Hepatitis
    dataset can be viewed as a domain-dependent
    dimensionality reduction technique

32
On-going Work
  • Integrating IEMiner and IEClassifier into a
    single-stage algorithm
  • Discover only those patterns with high
    discriminating power

33
Q & A?
34
  • Thank you

35
Related Work
  • Kam's A1-pattern discovery algorithm (DaWaK 2000)
  • Lossy representation
  • Used vertical id concept
  • H-DFS (ICDE 2005)
  • Matrix-based representation: lists n(n-1)/2
    relations for a temporal pattern of n events
  • Used vertical id concept with candidate
    generation
  • Tprefix (TKDE 2007)
  • Transforms interval data into sequences
  • Prefix-based approach

36
Candidate generation
Generate 4-patterns from frequent 3-patterns and
2-patterns
  • Support Counting
  • X

A Overlap[0,0,0,1,0] B
A Before[0,0,0,0,0] D
B Before[0,0,0,0,0] D
A Before[0,0,0,0,0] F
A Before[0,0,0,0,0] G
F Before[0,0,0,0,0] G
C Contain[1,0,0,0,0] F
...
F
F
37
Support Counting
Candidate pattern: (A Overlap[0,0,0,1,0] B) Overlap[0,0,0,2,0] C
Before: active_TP = {A, B, A Overlap[0,0,0,1,0] B}; passive_TP = {}
After: active_TP = {A, B, A Overlap[0,0,0,1,0] B, A Overlap[0,0,0,1,0] C, B Overlap[0,0,0,1,0] C, (A Overlap[0,0,0,1,0] B) Overlap[0,0,0,2,0] C}; passive_TP = {}
38
Optimization
  • Prefix Count

39
Optimization
  • Prefix Count
  • X

Generate 4-patterns from frequent 3-patterns and
2-patterns
1
1
40
Optimization
  • Window Blacklist