FlExPat: Flexible Extraction of Sequential Patterns

1
FlExPat: Flexible Extraction of Sequential
Patterns
  • Pierre-Yves ROLLAND
  • 2001 IEEE International Conference on Data Mining
    (ICDM'01)

2
Outline
  • Introduction
  • Formalization
  • Two-Phase Algorithm
  • Conclusion

3
Introduction
  • A sequential pattern is a set of segments from
    the sequential database which share a significant
    degree of resemblance.
  • For pairs of segments in the pattern, the
    similarity between the two segments can be
    measured numerically and is above a given
    threshold.

4
Introduction
  • MVEM: multi-description valued edit model
  • MVEM's main parameter is the allowed pairing
    type set (APTS).

5
Introduction
  • The initial M ('multi-description') means that
    MVEM has been designed to be able to deal with
    representations of sequences and elements that
    use multiple simultaneous descriptions.

6
Formalization
  • A sequence $S$ is an ordered collection of
    individual data structures $S_i$, called the
    sequence's elements: $S = S_1 S_2 \dots S_L$.
  • A sequence segment is a contiguous part of a
    sequence: $S_i S_{i+1} \dots S_{i+m-1}$ is the
    segment of $S$ with length $m$ and starting at
    position $i$.
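
The segment definition above can be sketched in a few lines of Python. This is only an illustration: the function name `segment` and the example note sequence are assumptions, not part of FlExPat, and positions are 1-based to match the slide's indexing.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def segment(seq: Sequence[T], i: int, m: int) -> List[T]:
    """Return the segment of length m starting at 1-based position i.

    Hypothetical helper: mirrors the slide's definition of a segment
    as the m contiguous elements beginning at element i.
    """
    if m < 1 or i < 1 or i + m - 1 > len(seq):
        raise ValueError("segment (i, m) does not fit inside the sequence")
    # Convert the 1-based start position to Python's 0-based slicing.
    return list(seq[i - 1 : i - 1 + m])


# Made-up example sequence of five elements.
notes = ["C", "D", "E", "F", "G"]
print(segment(notes, 2, 3))  # ['D', 'E', 'F']
```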

7
Formalization
  • Two sequence segments $s_1$ and $s_2$ are said to
    be equipollent iff their similarity is greater
    than or equal to a given threshold.
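
As a rough sketch of the equipollence test, the snippet below uses `difflib.SequenceMatcher.ratio()` from the Python standard library as a stand-in similarity measure. This is an assumption made purely for illustration: FlExPat relies on the MVEM edit model, which is not reproduced here, and the names `simil` and `equipollent` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Hashable, Sequence


def simil(s1: Sequence[Hashable], s2: Sequence[Hashable]) -> float:
    """Stand-in similarity in [0, 1]; NOT the MVEM model from the talk."""
    return SequenceMatcher(None, s1, s2).ratio()


def equipollent(s1: Sequence[Hashable], s2: Sequence[Hashable],
                threshold: float) -> bool:
    """Two segments are equipollent iff their similarity >= threshold."""
    return simil(s1, s2) >= threshold


a = ["C", "D", "E", "F"]
b = ["C", "D", "E", "G"]
# Three of four elements match, so ratio() is 2 * 3 / 8 = 0.75.
print(equipollent(a, b, 0.75))  # True
print(equipollent(a, b, 0.80))  # False
```

Any similarity function returning a number could be swapped in for `simil`; only the comparison against the threshold comes from the definition above.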

8
FlExPat Parameters
  • The minimum and maximum possible length for
    candidate segments
  • Integers $m_{min} \in [1..|S|]$ and $m_{max} \in [m_{min}..|S|]$
  • The maximum possible length difference between
    segments to be compared
  • MaxLengthDiff(m)

9
FlExPat Parameters
  • The maximum length of overlap between two
    candidate segments
  • The similarity threshold defining equipollence
  • A quorum threshold
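
One way to keep the parameters listed on the two parameter slides together is a small configuration object. The sketch below is hypothetical: the class name, field names, field types, and example values are all assumptions, and the slides do not spell out how the quorum threshold is used, so it is stored without interpretation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FlexPatParams:
    """Hypothetical container for the parameters named on the slides."""
    m_min: int                             # minimum candidate segment length
    m_max: int                             # maximum candidate segment length
    max_length_diff: Callable[[int], int]  # MaxLengthDiff(m), a function of m
    max_overlap: int                       # max overlap between two candidates
    similarity_threshold: float            # threshold defining equipollence
    quorum: float                          # quorum threshold (type is assumed)

    def validate(self, seq_len: int) -> None:
        """Check the length bounds against a sequence of length seq_len."""
        if not 1 <= self.m_min <= self.m_max <= seq_len:
            raise ValueError("need 1 <= m_min <= m_max <= sequence length")
        if self.max_overlap < 0:
            raise ValueError("max_overlap must be non-negative")


# Example values are made up for illustration only.
params = FlexPatParams(
    m_min=3,
    m_max=8,
    max_length_diff=lambda m: m // 4,
    max_overlap=0,
    similarity_threshold=0.8,
    quorum=2,
)
params.validate(seq_len=100)
```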

10
Equipollence Graph Construction
  • The segments $S_{i',m'}$ that will be compared to
    $S_{i,m}$ satisfy the following property:
  • $(i,m) < (i',m')$,
  • which means ($<$ denotes lexicographic ordering)
    $i < i'$, or $i = i'$ and $m < m'$.
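
The lexicographic ordering constraint ensures that each unordered pair of candidate segments is compared only once. A minimal sketch, assuming segments are identified by (start position, length) tuples and ignoring the length-difference and overlap parameters; the function names are hypothetical. Python compares tuples lexicographically, which matches the ordering described above.

```python
from itertools import combinations
from typing import List, Tuple

Seg = Tuple[int, int]  # (1-based start position i, length m)


def candidate_segments(seq_len: int, m_min: int, m_max: int) -> List[Seg]:
    """All (i, m) with m_min <= m <= m_max that fit in the sequence."""
    return [
        (i, m)
        for i in range(1, seq_len + 1)
        for m in range(m_min, m_max + 1)
        if i + m - 1 <= seq_len
    ]


def comparison_pairs(seq_len: int, m_min: int, m_max: int) -> List[Tuple[Seg, Seg]]:
    """Pairs (a, b) of candidate segments with a < b lexicographically."""
    segs = sorted(candidate_segments(seq_len, m_min, m_max))
    # combinations() keeps list order, so every pair has a before b.
    return list(combinations(segs, 2))


pairs = comparison_pairs(seq_len=3, m_min=1, m_max=2)
print(len(pairs))  # 5 candidate segments give 10 pairs
```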

11
Equipollence Graph Construction
  • If Simil(s, s') is greater than or equal to ST
    (the similarity threshold), then:
  • 1) If no vertex corresponding to s yet exists in
    the equipollence graph, then it is created. Same
    for s'.
  • 2) An edge is created between the vertices
    corresponding to s and s' in the equipollence
    graph. It is labeled with the value Simil(s, s').
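
The construction rule on this slide can be sketched as follows. This is an illustrative approximation under several assumptions: `difflib.SequenceMatcher.ratio()` stands in for the MVEM similarity, segments are keyed by (start position, length), the overlap and length-difference parameters are ignored, and all names are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Dict, Set, Tuple

Seg = Tuple[int, int]  # (1-based start position, length)


def simil(a: str, b: str) -> float:
    """Stand-in similarity; NOT the MVEM model used by FlExPat."""
    return SequenceMatcher(None, a, b).ratio()


def build_equipollence_graph(
    seq: str, m_min: int, m_max: int, st: float
) -> Tuple[Set[Seg], Dict[Tuple[Seg, Seg], float]]:
    """Return (vertices, edges); each edge is labeled with its similarity."""
    n = len(seq)
    segs = sorted(
        (i, m)
        for i in range(1, n + 1)
        for m in range(m_min, m_max + 1)
        if i + m - 1 <= n
    )
    vertices: Set[Seg] = set()
    edges: Dict[Tuple[Seg, Seg], float] = {}
    for (i, m), (j, k) in combinations(segs, 2):
        s, s2 = seq[i - 1 : i - 1 + m], seq[j - 1 : j - 1 + k]
        score = simil(s, s2)
        if score >= st:
            # Step 1: create the two vertices if they do not exist yet.
            vertices.update({(i, m), (j, k)})
            # Step 2: add an edge labeled with the similarity value.
            edges[((i, m), (j, k))] = score
    return vertices, edges


vertices, edges = build_equipollence_graph("ABCXABC", 3, 3, st=1.0)
print(edges)  # {((1, 3), (5, 3)): 1.0}
```

With a threshold of 1.0 only the two exact occurrences of "ABC" are linked; lowering `st` admits approximate matches as well.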

12
Equipollence Graph Construction

13
Subgraph Extraction

14
Conclusion
  • FlExPat allows using a rich and flexible
    similarity model while maintaining reasonable
    time and space complexities.
  • FlExPat can be used to find approximate repeating
    patterns in a sequence.