Temporal Database Paper Reading


1
Temporal Database Paper Reading
  • R95922007
  • Efficient Mining Strategy for Frequent Serial
    Episodes in Temporal Database, K Huang, C Chang

2
Introduction
  • Discover frequent serial episodes to find
    relationships between events.
  • - explain the problems that cause a particular
    event
  • - predict future results
  • Episode: a partially ordered collection of
    events occurring together.
  • - the user defines how close is close enough
  • - win: the width of the time window

3
Three classes of episodes
  • Introduced by Mannila et al.
  • Serial episodes
  • - patterns of a total order in the sequence
  • Parallel episodes
  • - no constraints on the relative order
  • Composite episodes
  • - serial combination of parallel episodes
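
As a rough illustration of the difference between serial and parallel episodes, the following Python sketch tests whether the events inside one window contain an episode of either kind. The function names and the list-of-events representation are assumptions made for this example, not notation from the paper.

```python
def contains_serial(window, episode):
    """True if the events of `episode` appear in `window` in the given order."""
    remaining = iter(window)
    # Each `any` consumes the iterator up to the matched event,
    # so the next event must be found strictly later.
    return all(any(ev == want for ev in remaining) for want in episode)


def contains_parallel(window, episode):
    """True if every event of `episode` appears in `window`, in any order."""
    pool = list(window)
    for want in episode:
        if want not in pool:
            return False
        pool.remove(want)  # each occurrence may be used only once
    return True


window = ["B", "A", "C"]  # events inside one time window, in time order
print(contains_serial(window, ["A", "B"]))    # False: B does not follow A
print(contains_parallel(window, ["A", "B"]))  # True: both events are present
```

A composite episode would combine the two checks: each step of the serial order is itself a parallel episode.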

4
Example episodes

5
Algorithms (old)
  • Presented by Mannila et al.
  • Finding parallel and serial episodes that are
    frequent enough.
  • WINEPI
  • - considers the support of an episode
  • MINEPI
  • - considers the number of minimal occurrences
    of an episode

6
WINEPI
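  • Consider the sequence S = A3 A4 B5 B6 (the
    number after each event is its timestamp).
  • Support: the number of sliding windows of width
    win that contain the episode.
  • Given win = 3, there are six windows:
  • W1 = A3, W2 = A3 A4, W3 = A3 A4 B5,
  • W4 = A4 B5 B6, W5 = B5 B6, W6 = B6.
  • <A,B> is supported by two windows (W3 and W4).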
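
The window count on this slide can be reproduced with a short Python sketch. It assumes a simple sequence (one event per time point) stored as (event, time) pairs; the function names are made up for illustration.

```python
def contains_serial(window, episode):
    """True if the events of `episode` appear in `window` in the given order."""
    remaining = iter(window)
    return all(any(ev == want for ev in remaining) for want in episode)


def winepi_windows(seq, win):
    """Yield the events of every sliding window of width `win` that overlaps `seq`."""
    first, last = seq[0][1], seq[-1][1]
    # The first window ends at the first event; the last one starts at the last event.
    for start in range(first - win + 1, last + 1):
        yield [ev for ev, t in seq if start <= t < start + win]


def winepi_support(seq, episode, win):
    """Number of sliding windows that contain the serial episode."""
    return sum(contains_serial(w, episode) for w in winepi_windows(seq, win))


S = [("A", 3), ("A", 4), ("B", 5), ("B", 6)]
windows = list(winepi_windows(S, 3))
print(len(windows))                       # 6
print(winepi_support(S, ["A", "B"], 3))   # 2
```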

7
MINEPI
  • Consider the sequence S = A3 A4 B5 B6.
  • Minimal occurrence: an interval that contains
    the episode, but no proper sub-interval does.
  • <A> has mo support 2.
  • - intervals [3,3] and [4,4].
  • <A,B> has mo support 1.
  • - interval [4,5].
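
A brute-force Python sketch of the minimal-occurrence definition, assuming a simple sequence of (event, time) pairs sorted by time. It enumerates every occurrence and then discards intervals that contain another occurrence, which is fine for a slide-sized example but is not how MINEPI computes them.

```python
from itertools import combinations


def occurrences(seq, episode):
    """All [start, end] intervals in which the serial episode occurs."""
    found = set()
    # combinations() keeps the original (time) order of the events.
    for combo in combinations(seq, len(episode)):
        if [ev for ev, _ in combo] == list(episode):
            found.add((combo[0][1], combo[-1][1]))
    return found


def minimal_occurrences(seq, episode):
    """Occurrences that contain no other occurrence as a proper sub-interval."""
    occ = occurrences(seq, episode)

    def has_smaller(iv):
        return any(o != iv and iv[0] <= o[0] and o[1] <= iv[1] for o in occ)

    return sorted(iv for iv in occ if not has_smaller(iv))


S = [("A", 3), ("A", 4), ("B", 5), ("B", 6)]
print(minimal_occurrences(S, ["A"]))       # [(3, 3), (4, 4)]
print(minimal_occurrences(S, ["A", "B"]))  # [(4, 5)]
```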

8
Complex sequences
  • Several events may occur at the same time.
  • Example: see the figure.
  • A temporal database is a complex sequence with
    temporal attributes.

[Figure: example complex sequence; event labels as extracted: A D B A B E C E A B F A C E B D F D]

9
Algorithms (new)
  • Extend the algorithms to deal with complex
    sequences.
  • MINEPI+
  • - depth-first enumeration that generates the
    frequent episodes by equalJoin and temporalJoin.
  • EMMA
  • - Episodes Mining using Memory Anchor
  • - uses memory anchors to accelerate the mining
    task

10
More about MINEPI
  • Breadth-first manner
  • - enumerate longer episodes from shorter ones
  • Parameters
  • - maxwin: maximum window width for an episode
  • - minsup: minimum support for an episode to be
    frequent
  • Temporal Join
  • - connects events from different time intervals

11
Example: MINEPI
  • S = A1 A2 B3 A4 B5, maxwin = 4, minsup = 2
  • Find frequent 1-episodes first
  • - mo(A) = [1,1], [2,2], [4,4]; mo(B) = [3,3], [5,5]
  • Temporal join with maxwin = 4
  • - candidates for <A,B>: [1,3], [2,3], [2,5], [4,5]
  • - mo(<A,B>) = [2,3], [4,5] (choose minimal ones)
  • - support(<A,B>) = [1,4], [2,5], [4,5]
  • - support count = 3, counting distinct start
    points
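
The numbers in this example can be checked with a small Python sketch. It joins the two minimal-occurrence lists under the maxwin constraint, keeps the minimal intervals, and counts distinct start points as the support. Reading the support as "distinct start points of the joined candidates" is one interpretation of the slide, and the function names are hypothetical.

```python
def temporal_join(left, right, maxwin):
    """Pair every left interval with every later right interval within maxwin."""
    return [(ts, te2)
            for ts, te in left
            for ts2, te2 in right
            if ts2 > te and te2 - ts < maxwin]


def keep_minimal(intervals):
    """Drop every interval that contains another interval of the list."""
    return [iv for iv in intervals
            if not any(o != iv and iv[0] <= o[0] and o[1] <= iv[1]
                       for o in intervals)]


mo_A = [(1, 1), (2, 2), (4, 4)]
mo_B = [(3, 3), (5, 5)]

candidates = temporal_join(mo_A, mo_B, maxwin=4)
mo_AB = keep_minimal(candidates)
support = len({ts for ts, _ in candidates})  # distinct start points

print(candidates)  # [(1, 3), (2, 3), (2, 5), (4, 5)]
print(mo_AB)       # [(2, 3), (4, 5)]
print(support)     # 3
```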

12
MINEPI+
  • Must deal with complex sequences.
  • Depth-first manner to save memory
  • Equal join
  • - connects events in the same time interval
  • Bound list
  • For a serial episode P = <p1,...,pk>
  • - [ts_i, te_i]: S contains P in the interval [ts_i, te_i]
  • For an event Y
  • - [t_i, t_i]: S contains Y at time t_i

13
Example: bound list
  • maxwin = 4.
  • Bound list of <A,B,C>: [1,4], [3,6].
  • Bound list of <C>: [4,4], [6,6].

[Figure: the example complex sequence over time points 1 to 8; event labels as extracted: A D B A B E C E A B F A C E B D F D]

14
Operations
  • Given P = <p1,...,pk> and an event f.
  • - P.boundlist = [ts_1,te_1], ..., [ts_n,te_n]
  • - f.boundlist = [ts'_1,ts'_1], ..., [ts'_m,ts'_m]
  • Equal join: P1 = P ∪ f = <p1,...,pk ∪ f>.
  • - P1.boundlist contains the [ts_i,te_i] such that
    te_i = ts'_j for some j (1 ≤ j ≤ m)
  • Temporal join: P2 = P.f = <p1,...,pk,f>.
  • - P2.boundlist contains the [ts_i,ts'_j] such that
    ts'_j - ts_i < maxwin and ts'_j > te_i for some j (1 ≤ j ≤ m)
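
The two operations can be sketched in Python as follows. Bound lists are lists of (ts, te) pairs and an event is given by its list of occurrence times. Two things here are assumptions rather than facts from the slides: the temporal join keeps only the earliest matching occurrence per bound (following the "take minimal" note on the temporal-join example slide), and the timelists below are hypothetical values chosen so that the result agrees with the bound lists on the "Example bound list" slide.

```python
def equal_join(p_bounds, f_times):
    """Keep the bounds of P whose end time coincides with an occurrence of f."""
    times = set(f_times)
    return [(ts, te) for ts, te in p_bounds if te in times]


def temporal_join(p_bounds, f_times, maxwin):
    """Extend each bound of P to the earliest later occurrence of f within maxwin."""
    joined = []
    for ts, te in p_bounds:
        later = [t for t in f_times if t > te and t - ts < maxwin]
        if later:
            joined.append((ts, min(later)))
    return joined


# Hypothetical timelists (not taken from the source figure).
timelist = {"A": [1, 3, 5, 6], "B": [2, 3, 5, 7], "C": [4, 6]}
bounds_A = [(t, t) for t in timelist["A"]]

ab_same_time = equal_join(bounds_A, timelist["B"])          # <AB>
a_then_b = temporal_join(bounds_A, timelist["B"], maxwin=4)  # <A,B>
a_b_c = temporal_join(a_then_b, timelist["C"], maxwin=4)     # <A,B,C>

print(ab_same_time)  # [(3, 3), (5, 5)]
print(a_then_b)      # [(1, 2), (3, 5), (5, 7), (6, 7)]
print(a_b_c)         # [(1, 4), (3, 6)]
```

Keeping only the earliest match keeps the bound lists short, but it can miss longer occurrences that a later extension would need; a full implementation would have to decide how to handle that case.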

15
Drawbacks of MINEPI+
  • Huge number of combinations
  • - consider |I| frequent 1-episodes
  • - O(|I|^2) checks for temporal joins and equal
    joins
  • Unnecessary joins
  • - temporal joins for a prefix should be skipped
    if the number of extendable matching bounds is
    less than minsup × |TDB|
  • Duplicate joins
  • - episode <ABC,ABC> needs 4+1 joins (four equal
    joins and one temporal join):
  • <A> → <AB> → <ABC> → <ABC,A> → <ABC,AB> → <ABC,ABC>

16
EMMA
  • Divided into three phases
  • (I) Mine frequent itemsets in the complex
    sequence.
  • (II) Encode each frequent itemset with a unique
    ID and construct an encoded horizontal database.
  • (III) Mine episodes in the encoded database.
  • Depth-first search
  • Memory anchor
  • - use the boundlists to access information
  • - the timelists of frequent itemsets are their
    boundlists
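
Phases (I) and (II) can be sketched in Python as below. This is a brute-force illustration under stated assumptions, not the paper's procedure: the toy database, the subset enumeration, the ID numbering, and the use of minsup as an absolute count are all choices made for this example.

```python
from collections import defaultdict
from itertools import combinations


def frequent_itemsets(db, minsup):
    """Phase I: map each frequent itemset to its timelist (times where it occurs)."""
    timelists = defaultdict(list)
    for t in sorted(db):
        items = sorted(db[t])
        for k in range(1, len(items) + 1):
            for subset in combinations(items, k):
                timelists[subset].append(t)
    return {s: tl for s, tl in timelists.items() if len(tl) >= minsup}


def encode(db, minsup):
    """Phase II: give each frequent itemset an ID and rewrite the database with IDs."""
    freq = frequent_itemsets(db, minsup)
    ordered = sorted(freq, key=lambda s: (len(s), s))
    ids = {s: i for i, s in enumerate(ordered, start=1)}
    encoded = {t: sorted(ids[s] for s in freq if set(s) <= db[t])
               for t in sorted(db)}
    return ids, freq, encoded


# Toy complex sequence: time point -> set of events at that time (hypothetical).
db = {1: {"A", "B"}, 2: {"A"}, 3: {"A", "B"}, 4: {"B", "C"}}
ids, freq, encoded = encode(db, minsup=2)

print(ids)      # {('A',): 1, ('B',): 2, ('A', 'B'): 3}
print(encoded)  # {1: [1, 2, 3], 2: [1], 3: [1, 2, 3], 4: [2]}
```

The timelist of each ID (`freq` above) doubles as its boundlist, with every time t read as the bound [t,t]; phase (III) then works on those lists only.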

17
Example database
  • minsup = 5

18
Combine episodes
  • Only combine existing episodes with a local
    frequent 1-tuple episode.
  • - avoids generating a huge number of candidates
  • Projected boundlist (PBL)
  • - episode 3 = <C> has boundlist
    [1,1], [2,2], [4,4], [8,8], [11,11], [14,14], [15,15]
  • - given maxwin = 4, the projected boundlist is
    [2,4], [3,5], [5,7], [9,11], [12,14], [15,16], [16,16]
  • - note that |TDB| = 16

19
Example: PBL
  • 3.timelist = 1, 2, 4, 8, 11, 14, 15.
  • 1 → [2,4]
  • 2 → [3,5]
  • 4 → [5,7]
  • 8 → [9,11]
  • 11 → [12,14]
  • 14 → [15,16]
  • 15 → [16,16]
  • with maxwin = 4 and |TDB| = 16.
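
The mapping above is consistent with projecting each bound [ts,te] to [te+1, min(ts+maxwin-1, |TDB|)] and dropping it when that interval is empty. The formula is inferred from the numbers on the slides rather than quoted from the paper; a minimal Python sketch under that assumption:

```python
def projected_boundlist(bounds, maxwin, tdb_len):
    """Project each bound to the interval where the next event may still occur."""
    pbl = []
    for ts, te in bounds:
        lo = te + 1                         # the next event must come later
        hi = min(ts + maxwin - 1, tdb_len)  # stay inside maxwin and the database
        if lo <= hi:
            pbl.append((lo, hi))
    return pbl


timelist_3 = [1, 2, 4, 8, 11, 14, 15]
bounds_3 = [(t, t) for t in timelist_3]

print(projected_boundlist(bounds_3, maxwin=4, tdb_len=16))
# [(2, 4), (3, 5), (5, 7), (9, 11), (12, 14), (15, 16), (16, 16)]
```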

20
Local frequent ID
  • A local frequent ID has a boundlist that can be
    matched against another episode's PBL.
  • - 3.PBL = [2,4], [3,5], [5,7], [9,11], [12,14],
    [15,16], [16,16]
  • - 4.BL = [3,3], [5,5], [6,6], [9,9], [12,12],
    [13,13], [16,16]
  • Record the boundlist of the ID while examining it.
  • - the boundlist is then available immediately at
    the temporal join
  • - <C,D> = <3,4>, so <C,D>.boundlist =
    [1,3], [2,3], [4,5], [8,9], [11,12], [14,16], [15,16]

21
Example: temporal join
  • 4.BL = [3,3], [5,5], [6,6], [9,9], [12,12], [13,13],
    [16,16].
  • Recall the construction of 3.PBL
  • 1 → [2,4]: contains [3,3]
  • 2 → [3,5]: contains [3,3] (take minimal)
  • 4 → [5,7]: contains [5,5]
  • 8 → [9,11]: contains [9,9]
  • 11 → [12,14]: contains [12,12]
  • 14 → [15,16]: contains [16,16]
  • 15 → [16,16]: contains [16,16]
  • Result: [1,3], [2,3], [4,5], [8,9], [11,12], [14,16],
    [15,16]
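
The matching step above can be sketched in Python with a binary search over the sorted timelist of the joined ID. The projection formula and the "earliest match per bound" rule are read off the slide's numbers, and the function name is hypothetical.

```python
from bisect import bisect_left


def temporal_join_via_pbl(bounds, f_times, maxwin, tdb_len):
    """Join each bound with the earliest occurrence of f inside its projected interval."""
    joined = []
    for ts, te in bounds:
        lo, hi = te + 1, min(ts + maxwin - 1, tdb_len)
        i = bisect_left(f_times, lo)  # first occurrence of f at or after lo
        if i < len(f_times) and f_times[i] <= hi:
            joined.append((ts, f_times[i]))
    return joined


bounds_3 = [(t, t) for t in [1, 2, 4, 8, 11, 14, 15]]  # episode 3 = <C>
times_4 = [3, 5, 6, 9, 12, 13, 16]                     # timelist of ID 4

print(temporal_join_via_pbl(bounds_3, times_4, maxwin=4, tdb_len=16))
# [(1, 3), (2, 3), (4, 5), (8, 9), (11, 12), (14, 16), (15, 16)]
```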

22
Procedure emmajoin
  • Recursively extend the episodes
  • - until no more serial episodes can be extended
  • Avoid the unnecessary checking of MINEPI+
  • - stop when the number of extendable bounds for
    a serial episode is less than minsup × |TDB|.
  • Example: 2 = <B>.
  • - 2.BL = [3,3], [6,6], [9,9], [12,12], [16,16]
  • - 2.PBL = [4,6], [7,9], [10,12], [13,15]
    (|TDB| = 16)
  • - no need to extend 2 if minsup = 5, since only
    four bounds are extendable
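
The stopping test for episode 2 can be written as a few lines of Python. The sketch treats minsup as an absolute count, as the example on this slide does, and uses the projection rule inferred from the PBL slides; both are assumptions.

```python
def extendable_bounds(bounds, maxwin, tdb_len):
    """Count the bounds whose projected interval is non-empty."""
    return sum(1 for ts, te in bounds
               if te + 1 <= min(ts + maxwin - 1, tdb_len))


bounds_2 = [(3, 3), (6, 6), (9, 9), (12, 12), (16, 16)]  # episode 2 = <B>
minsup = 5

n = extendable_bounds(bounds_2, maxwin=4, tdb_len=16)
should_extend = n >= minsup

print(n)              # 4: the bound [16,16] has no room left in the database
print(should_extend)  # False
```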

23
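Example: emmajoin
  • 3.BL = [1,1], [4,4], [8,8], [11,11], [14,14], [15,15].
  • 7.BL = [1,1], [4,4], [8,8], [11,11], [14,14].
  • 9.BL = [3,3], [6,6], [9,9], [12,12], [16,16].
  • Call emmajoin to extend each 1-tuple episode
  • 3.PBL = [2,4], [5,7], [9,11], [12,14], [15,16], [16,16].
  • Find local frequent IDs in 3.PBL.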
  • Find local frequent IDs in 3.PBL.

24
Example: emmajoin (cont.)
  • minsup = 5, maxwin = 4.
  • By temporal join
  • - <3,3>.BL = [1,4], [8,11], [11,14], [14,15]
  • - <3,7>.BL = [1,4], [8,11], [11,14]
  • - <3,9>.BL = [1,3], [4,6], [8,9], [11,12], [14,16]
  • - <3,9> is generated from prefix 3
  • - recursively call emmajoin to extend <3,9>
  • - <3,9>.PBL = [4,4], [7,7], [10,11], [13,14]
  • - there are no local frequent IDs, since only four
    bounds remain and minsup = 5
  • Back to calling emmajoin for episode 7.
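
The recursion walked through on these two slides can be sketched in Python. This is a simplified illustration, not the paper's emmajoin: it only considers the three IDs listed on the slide, treats minsup as an absolute count, and keeps the earliest match per bound, so individual bound lists may differ in detail from the ones shown. All names are hypothetical.

```python
from bisect import bisect_left


def project(bounds, maxwin, tdb_len):
    """Return (ts, lo, hi) for every bound that can still be extended."""
    out = []
    for ts, te in bounds:
        lo, hi = te + 1, min(ts + maxwin - 1, tdb_len)
        if lo <= hi:
            out.append((ts, lo, hi))
    return out


def emmajoin(prefix, bounds, timelists, maxwin, minsup, tdb_len, found):
    """Depth-first extension of `prefix` by every local frequent ID."""
    pbl = project(bounds, maxwin, tdb_len)
    if len(pbl) < minsup:  # too few extendable bounds: stop early
        return
    for item, times in timelists.items():
        joined = []
        for ts, lo, hi in pbl:
            i = bisect_left(times, lo)
            if i < len(times) and times[i] <= hi:
                joined.append((ts, times[i]))
        if len(joined) >= minsup:  # `item` is a local frequent ID
            episode = prefix + (item,)
            found[episode] = joined
            emmajoin(episode, joined, timelists, maxwin, minsup, tdb_len, found)


def mine(timelists, maxwin, minsup, tdb_len):
    """Start one depth-first search from every frequent 1-tuple episode."""
    found = {}
    for item, times in timelists.items():
        if len(times) >= minsup:
            bounds = [(t, t) for t in times]
            found[(item,)] = bounds
            emmajoin((item,), bounds, timelists, maxwin, minsup, tdb_len, found)
    return found


timelists = {3: [1, 4, 8, 11, 14, 15], 7: [1, 4, 8, 11, 14], 9: [3, 6, 9, 12, 16]}
episodes = mine(timelists, maxwin=4, minsup=5, tdb_len=16)
print(sorted(episodes))  # [(3,), (3, 9), (7,), (7, 9), (9,)]
```

As on the slide, <3,3> and <3,7> are rejected, <3,9> is kept, and the recursive call on <3,9> stops because only four of its bounds can still be extended.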

25
Experiments
  • On a dataset composed of 10 stocks.
  • Parameters: maxwin and minsup.
  • - running time grows as maxwin increases
  • - running time grows as minsup decreases
  • - because the number of frequent episodes
    increases
  • EMMA runs faster than MINEPI+.
  • MINEPI+ uses less space than EMMA.
  • - EMMA needs a large amount of memory as minsup
    decreases

26
Conclusion
  • Extend MINEPI to MINEPI+
  • - for mining episodes in a complex sequence
  • Propose EMMA
  • - avoids the drawbacks of MINEPI+
  • EMMA is more efficient than MINEPI+.
  • Future work
  • - only serial episodes are discussed
  • - parallel and composite episodes remain to be
    solved