IMPORTANT EXTREMA OF TIME SERIES: THEORY AND APPLICATIONS

Slides: 27
Provided by: syst3
Learn more at: http://www.cs.cmu.edu
1
IMPORTANT EXTREMA OF TIME SERIES: THEORY AND
APPLICATIONS
2
Introduction
  • Time series
  • Definition: sequence of values measured at equal
    intervals
  • Examples: stock prices, weather data,
    electrocardiograms, etc.
  • Compression using important extrema
  • Importance levels
  • Indexing based on importance
  • Distance computation based on indexing trees
  • Pattern retrieval based on indexing trees

3
Previous Work
  • Previous work Specific
  • Indexing and fast retrieval of time series
    [Pratt, 2001; Pratt and Fink, 2002; Fink and
    Pratt, 2003]
  • Compression
  • Use important points
  • Identify important points using the ratio between
    values of points
  • Use the parameter R to control the compression
  • No direct correlation between the compression
    rate and R
  • Similarity measures and pattern retrieval
  • Define a leg/extended leg as the segment between two
    consecutive important points
  • Index the legs by their ratio
  • Retrieve the legs that are similar to the
    prominent leg of the pattern
  • Find the similarity between the pattern and each
    series that has a leg similar to the prominent leg
  • Output the series whose similarity is within the
    given threshold

4
Basic Concepts
  • Related structures and algorithms
  • Stacks and linked lists
  • Red-black trees
  • Order statistics
  • Time series representation
  • Series structure
  • full-size: number of points in the original
    series
  • cmpr-size: number of points in the compressed
    series
  • points: array or red-black tree of all points
  • Point structure
  • index: index of the point in the original series
  • value: value of the point
  • type: end-point, minimum, maximum, or
    non-extremal
  • next: next point in the compressed series
  • prev: previous point in the compressed series
  • side: strict, left, right, or flat
  • imp: importance of the point
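
The two structures above could be sketched in Python as follows. This is only an illustration: the field names follow the slide, but the default values, the use of a plain list for the points (the slide also allows a red-black tree), and the helper make_series are assumptions made here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Point:
    index: int                      # index of the point in the original series
    value: float                    # value of the point
    type: str = "non-extremal"      # "end-point", "minimum", "maximum", or "non-extremal"
    side: Optional[str] = None      # "strict", "left", "right", or "flat"
    imp: float = 0.0                # importance of the point
    # Links within the compressed series; excluded from comparison and repr
    # so that linked points do not recurse into each other.
    next: Optional["Point"] = field(default=None, compare=False, repr=False)
    prev: Optional["Point"] = field(default=None, compare=False, repr=False)


@dataclass
class Series:
    full_size: int                  # number of points in the original series
    cmpr_size: int = 0              # number of points in the compressed series
    points: List[Point] = field(default_factory=list)  # assumption: a plain list


def make_series(full_size, kept):
    """Hypothetical helper: build a Series from (index, value) pairs of kept points."""
    pts = [Point(index=i, value=v) for i, v in kept]
    for a, b in zip(pts, pts[1:]):
        a.next, b.prev = b, a       # doubly link consecutive kept points
    return Series(full_size=full_size, cmpr_size=len(pts), points=pts)
```

The next and prev links let a compressed series be walked in order without touching the points that were dropped.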

5
Basic Concepts..
  • Distance measures
  • Distance between real values
  • Definition: a two-argument function that satisfies
    the following properties
  • For every value a, dist(a, a) = 0
  • For every two values a and b, dist(a, b) =
    dist(b, a)
  • For every three values a, b, and c, if a < b < c,
    then dist(a, b) < dist(a, c) and dist(b, c) <
    dist(a, c)
  • Examples
  • |a − b|, |a − b| / (|a| + |b|)
  • Distance composition
  • Definition: a real-valued function f(d_1, ..., d_q) on
    non-negative real arguments such that f(0, ..., 0)
    = 0 and f is monotonically increasing in each of
    its arguments
  • Example
  • If w_1, ..., w_q are weights and dist_1, ..., dist_q are
    distance functions, then w_1·dist_1 + ... + w_q·dist_q is a
    composition.
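
A minimal Python sketch of the two example distances and of a weighted-sum composition. The function names are invented for this illustration, and returning 0 when both arguments of the relative difference are zero is an assumption made here to avoid division by zero.

```python
def abs_diff(a, b):
    """Absolute difference |a - b|."""
    return abs(a - b)


def rel_diff(a, b):
    """Relative difference |a - b| / (|a| + |b|)."""
    denom = abs(a) + abs(b)
    # Assumption: treat the 0/0 case as distance 0, consistent with dist(a, a) = 0.
    return 0.0 if denom == 0 else abs(a - b) / denom


def weighted_composition(weights, dists):
    """Return the distance w_1*dist_1 + ... + w_q*dist_q as a new function."""
    def composed(a, b):
        return sum(w * d(a, b) for w, d in zip(weights, dists))
    return composed
```

For example, weighted_composition([1.0, 2.0], [abs_diff, rel_diff]) applied to 1.0 and 3.0 adds the absolute difference 2.0 to twice the relative difference 0.5.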

6
Basic Concepts..
  • Distance between time series
  • Definition: For two equal-length series, a_1, ..., a_n
    and b_1, ..., b_n, and a distance function dist for
    real values, the corresponding l_r distance
    between the series is
    l_r = ((1/n) · Σ_{i=1}^{n} dist(a_i, b_i)^r)^{1/r}
  • Example: in the picture on the slide, the l_1 distance
    between the series is 2.0 and the l_2 distance is 2.1
  • Advantage of using distance functions
  • Flexibility of choosing different distance
    functions and their compositions.
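
The l_r definition translates directly into a few lines of Python. The function name lr_distance and the default choice of |a − b| as the point distance are assumptions for this sketch; the sample series in the usage note are made up and are not the series pictured on the slide.

```python
def lr_distance(a, b, r=1, dist=lambda x, y: abs(x - y)):
    """l_r distance between two equal-length series for a given point distance."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("series must be non-empty and of equal length")
    n = len(a)
    # ((1/n) * sum of dist(a_i, b_i)^r) ^ (1/r)
    return (sum(dist(x, y) ** r for x, y in zip(a, b)) / n) ** (1.0 / r)
```

With a = [0, 0, 0, 0] and b = [1, 1, 3, 3], the l_1 distance is 2.0 and the l_2 distance is the square root of 5, which shows how larger r gives more weight to large pointwise differences.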

7
  • Questions ?

8
Important Points
  • Extrema
  • We define an extremum as a minimum or maximum in a
    series
  • Formal definition of a minimum: the point a_i of a
    time series a_1, ..., a_n is a strict minimum if
    a_i < a_{i-1} and a_i < a_{i+1}
  • Example
  • Strict, left, right, and flat extrema

9
Important Points..
  • Extrema..
  • Compression by extracting all extrema
  • Example
  • Algorithm
  • Space complexity: constant
  • Time complexity: O(n), where n is the number of points in
    the series
  • Can process live series
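
A minimal sketch of one-pass extraction, restricted to strict extrema. It is not the algorithm from the slides: plateaus (left, right, and flat extrema) are deliberately skipped here to keep the example short. The generator name is invented. It keeps only three values in memory and reads the input as a stream, which is what makes constant space and live processing possible.

```python
def strict_extrema(stream):
    """Yield (index, value, type) for the end points and strict extrema of a stream."""
    it = iter(stream)
    try:
        prev = next(it)
    except StopIteration:
        return                      # empty series: nothing to yield
    yield 0, prev, "end-point"
    try:
        cur = next(it)
    except StopIteration:
        return                      # single-point series
    i = 1                           # index of cur
    for nxt in it:
        if cur < prev and cur < nxt:
            yield i, cur, "minimum"
        elif cur > prev and cur > nxt:
            yield i, cur, "maximum"
        # Assumption: points on plateaus are not reported in this sketch.
        prev, cur = cur, nxt
        i += 1
    yield i, cur, "end-point"
```

Because the function is a generator, each extremum is reported as soon as the point after it arrives.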

10
Important Points..
  • Important Extrema
  • Higher compression by selecting only certain
    important extrema.
  • Control the compression rate using the parameter
    R
  • Formal definition of an important minimum: the point
    a_i of a time series a_1, ..., a_n is a strict important
    minimum if there are indices il and ir, where
    il < i < ir, such that
  • a_i is the minimum among a_il, ..., a_ir, and
  • dist(a_i, a_il) > R and dist(a_i, a_ir) > R
  • Examples of Strict, left, right, and flat
    important minima
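
The definition can be checked directly by brute force. The sketch below is only a literal test of the two conditions, not the linear-time extraction algorithm; the function name and the default distance |a − b| are assumptions. It walks outward from a_i on each side, stops as soon as a smaller value appears (a_i would no longer be the minimum of the segment), and succeeds on that side once it meets a value farther than R from a_i.

```python
def is_important_minimum(series, i, R, dist=lambda a, b: abs(a - b)):
    """Brute-force check of the important-minimum definition for series[i]."""
    def side_ok(indices):
        for j in indices:
            if series[j] < series[i]:
                return False        # series[i] is not the minimum of this segment
            if dist(series[i], series[j]) > R:
                return True         # found a suitable segment end on this side
        return False                # reached the end of the series without success

    left_ok = side_ok(range(i - 1, -1, -1))
    right_ok = side_ok(range(i + 1, len(series)))
    return left_ok and right_ok
```

For the made-up series [1, 3, 2, 5, 0, 4, 3, 6] and R = 2, the point with value 0 passes the test, while the local minimum with value 2 fails because its left neighbor 3 is within R and the next value to the left is smaller.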

11
Important Points..
  • Important Extrema..
  • Compression by extracting important extrema
  • Example
  • Algorithm
  • Space complexity: constant
  • Time complexity: O(n), where n is the number of points in
    the series
  • Can process live series

12
Important Points..
  • Derivative series
  • Compression based on changes in slope
  • Example
  • Algorithm
  • The algorithm for important extrema can be modified
    for derivative series

13
Important Points..
  • Importance Levels
  • Idea: assign a numerical importance to the extrema
    and use the importance for compression, indexing,
    and pattern retrieval
  • Definition: If a point is a strict (left, right,
    flat) extremum for compression with some value of
    R, then its strict (left, right, flat) importance
    is the maximal value of R for which it is a
    strict (left, right, flat) extremum.
  • Example with distance function |a − b|
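
A brute-force sketch of the importance of a minimum, assuming the distance function is monotonic as required earlier. On each side of the point it finds the largest distance reachable before a smaller value appears; the point stays an important minimum for every R below the smaller of the two values, so that value is returned as the importance. The function name is invented, and this quadratic version is not the one-pass algorithm mentioned in the slides. Maxima would be handled symmetrically.

```python
def minimum_importance(series, i, dist=lambda a, b: abs(a - b)):
    """Importance of series[i] as a minimum; 0 if it is not a minimum at all."""
    def reach(indices):
        best = 0
        for j in indices:
            if series[j] < series[i]:
                break               # segment cannot extend past a smaller value
            best = max(best, dist(series[i], series[j]))
        return best

    left = reach(range(i - 1, -1, -1))
    right = reach(range(i + 1, len(series)))
    return min(left, right)
```

For the made-up series [1, 3, 2, 5, 0, 4, 3, 6] with distance |a − b|, the point with value 0 gets importance 5, while the shallow minima with values 2 and 3 both get importance 1.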

14
Important Points..
  • Importance Levels
  • strict, left, right, and flat importance
  • Algorithm to assign importance
  • One pass through the series
  • Space complexity: O(m), where m is the number of extrema
    in the series
  • Time complexity: O(n), where n is the number of points in
    the series

15
Important Points..
  • Compression rate
  • rate: percentage of points removed during the
    compression
  • Problem: select important extrema according to a
    given rate
  • Example: the compression rate in the pictured series is
    60%, since we have selected eight of 20 points
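
The arithmetic behind the example, as a small Python sketch. Both function names are invented here, and rounding to the nearest integer in points_to_keep is an assumption.

```python
def compression_rate(full_size, cmpr_size):
    """Percentage of points removed by the compression."""
    return 100.0 * (full_size - cmpr_size) / full_size


def points_to_keep(full_size, rate):
    """Number of points to select in order to reach a given rate (in percent)."""
    # Assumption: round to the nearest whole point.
    return round(full_size * (1 - rate / 100.0))
```

Selecting eight of 20 points removes 12 of them, which gives 100 · 12 / 20 = 60 percent; conversely, a target rate of 60 percent on a 20-point series means keeping eight points.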

16
Important Points..
  • Compression rate..
  • Three-pass algorithm
  • Makes three passes through the series
  • Space complexity: O(m), where m is the number of extrema
    in the series
  • Time complexity: O(n), where n is the number of points in
    the series
  • One-pass algorithm
  • Makes one pass and uses a red-black tree to keep
    only the desired number of important extrema
  • Space complexity: O(m), where m is the number of extrema
  • Time complexity: O(n + m·lg s), where n is the number of
    points and s is the number of points in the compressed
    series
  • Dependency on the distance functions
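
A rough sketch of keeping only the s most important extrema in a single pass. Note the substitution: the slides use a red-black tree, whereas this example uses a bounded min-heap from Python's standard library, which gives the same O(lg s) cost per update. The function name and the input format, a stream of (index, importance) pairs whose importances are already final, are assumptions.

```python
import heapq


def keep_most_important(extrema, s):
    """Return the indices of the s most important extrema, in series order."""
    heap = []                       # min-heap of (importance, index), at most s entries
    for index, imp in extrema:
        if len(heap) < s:
            heapq.heappush(heap, (imp, index))
        elif heap and imp > heap[0][0]:
            # The new extremum beats the least important one kept so far.
            heapq.heapreplace(heap, (imp, index))
    return sorted(index for _, index in heap)
```

The heap root is always the least important extremum currently kept, so each incoming extremum needs one comparison and at most one O(lg s) replacement.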

17
  • Questions ?

18
Indexing Trees
  • Idea
  • Index the series based on importance for fast
    retrieval of compressed series
  • Indexing based on importance
  • Can retrieve s points in O(s) time
  • Need to sort the retrieved points, which takes
    O(s·lg s) time
  • Augmented indexing structure
  • Example series

19
Indexing Trees..
  • Augmented indexing structure ..
  • Structure
  • Left superior: nearest extremum to the left in
    the original series with strictly greater
    importance
  • Right superior: nearest extremum to the right
    with equal or greater importance
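
Both superiors can be found for every extremum with a monotonic stack in linear time once the importances are known. This is a sketch of that standard technique under the two definitions above, not necessarily the tree-building algorithm from the slides; the function name and the list-of-importances input are assumptions.

```python
def superiors(imp):
    """For each position, return indices of its left and right superior (or None)."""
    n = len(imp)
    left = [None] * n
    right = [None] * n

    stack = []                      # indices with strictly decreasing importance
    for i in range(n):
        # Left superior needs strictly greater importance: drop equal or smaller.
        while stack and imp[stack[-1]] <= imp[i]:
            stack.pop()
        left[i] = stack[-1] if stack else None
        stack.append(i)

    stack = []
    for i in range(n - 1, -1, -1):
        # Right superior may have equal importance: drop only strictly smaller.
        while stack and imp[stack[-1]] < imp[i]:
            stack.pop()
        right[i] = stack[-1] if stack else None
        stack.append(i)

    return left, right
```

The asymmetry between "strictly greater" on the left and "equal or greater" on the right breaks ties consistently, so two extrema with equal importance never claim each other as superiors in both directions.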

20
Indexing Trees..
  • Algorithms
  • Sorted retrieval
  • Space and time complexity: O(s), where s is the number of
    points to be retrieved
  • Building augmented tree
  • Space complexity: O(m), where m is the number of extrema
  • Time complexity: O(m·lg m)
  • Range tree
  • Problem: retrieve important points of a given
    segment a_il, ..., a_ir
  • Idea
  • Use a range tree
  • Index the points in the series by position as
    well as importance
  • Space complexity of building the range tree:
    O(m)
  • Time complexity of building the range tree:
    O(m·lg m)
  • Time complexity of retrieval: O(s·lg s + lg m)

21
Distance Computation
  • Distance range
  • Problem: find the distance range between two
    compressed series
  • Idea
  • Each leg in the series can be bounded by a
    rectangle
  • The bounding rectangle represents the bounds of all
    the points in that segment
  • Find the distance range between the series by
    finding the distance range between the bounding
    rectangles
  • Algorithm
  • Time complexity: O(s), where s is the number of points in
    the compressed series
  • Space complexity: constant if the compressed
    series are in a file; O(s) if we need to retrieve the
    compressed series from the tree
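
A simplified sketch of the bounding-rectangle idea for the l_1 distance with point distance |a − b|. It assumes that every dropped point lies between the values of the two kept points that enclose it, and it expands the rectangles to one value interval per original index, so it runs in O(n) rather than the O(s) of the algorithm in the slides. Both function names are invented.

```python
def bounds_from_compressed(points, n):
    """points: sorted (index, value) pairs including both end points of an
    n-point series. Returns a (low, high) value interval for every index."""
    bounds = [None] * n
    for (i1, v1), (i2, v2) in zip(points, points[1:]):
        lo, hi = min(v1, v2), max(v1, v2)   # vertical extent of the leg's rectangle
        for k in range(i1 + 1, i2):
            bounds[k] = (lo, hi)
    for i, v in points:
        bounds[i] = (v, v)                  # kept points are known exactly
    return bounds


def l1_distance_range(bounds_a, bounds_b):
    """Lower and upper bound of the l_1 distance between two bounded series."""
    lower = upper = 0.0
    for (alo, ahi), (blo, bhi) in zip(bounds_a, bounds_b):
        lower += max(0.0, alo - bhi, blo - ahi)   # gap between intervals, if any
        upper += max(ahi - blo, bhi - alo)        # farthest pair of interval ends
    n = len(bounds_a)
    return lower / n, upper / n
```

Where two intervals overlap, the lower bound contributes 0 because the true values could coincide; the upper bound always uses the two interval ends that are farthest apart.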

22
Distance Computation..
  • Approximate Distance
  • Problem: find the distance between two series
    with a given approximation accuracy
  • Algorithm
  • Uses pre-computed indexing trees
  • Generates highly compressed versions to start; if
    the distance range is not within the given
    accuracy, it increases the number of points by a
    factor of 2
  • Time and space complexity: O(s), where s is the number of
    points required for a given accuracy
  • Threshold test
  • Problem: determine whether the distance between
    two series is smaller than the given threshold
  • Algorithm
  • Same idea as in approximate distance
  • Checks whether the lower bound is greater than
    the threshold or the upper bound is less than the
    threshold
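
The doubling strategy can be sketched independently of how the compressed series are retrieved. In the code below, distance_range is a hypothetical callable supplied by the caller: given a number of points s, it returns a (lower, upper) bound on the distance computed from s-point compressed versions. The parameter names, the starting size, and returning None when the threshold test stays undecided at full size are all assumptions.

```python
def approximate_distance(distance_range, max_points, accuracy, start=2):
    """Double the number of points until the distance range is narrow enough."""
    s = start
    while True:
        s = min(s, max_points)
        lo, hi = distance_range(s)
        if hi - lo <= accuracy or s >= max_points:
            return lo, hi, s
        s *= 2                      # refine: twice as many points


def below_threshold(distance_range, max_points, threshold, start=2):
    """True if the distance is surely below the threshold, False if surely
    above, None if still undecided with max_points points."""
    s = start
    while True:
        s = min(s, max_points)
        lo, hi = distance_range(s)
        if hi < threshold:
            return True             # upper bound is less than the threshold
        if lo > threshold:
            return False            # lower bound is greater than the threshold
        if s >= max_points:
            return None
        s *= 2
```

Because the sizes form a geometric sequence, the total work is dominated by the last and largest compressed version, which is consistent with the O(s) bound stated above.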

23
Pattern Retrieval
  • Range Query
  • Fetch all the series that are within a given
    range from the pattern
  • Nearest neighbor
  • Fetch the series closest to the pattern
  • Multiple neighbors
  • Fetch a given number of series closest to the
    pattern
  • Complexity of the algorithms
  • Time complexity
  • Best case: O(N), where N is the number of series in the
    database
  • Worst case: O(n·N), where n is the number of points in each
    series
  • Space complexity
  • Best case: O(s·N), where s is the number of points required
    to perform the threshold test
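
For reference, the three query types can be written as a brute-force baseline that computes exact distances on the full series. This corresponds to the O(n·N) worst case, not to the threshold-test algorithms of the slides. The database layout (a dict from series name to list of values), the function names, and the choice of the l_1 distance with |a − b| are assumptions.

```python
import heapq


def l1(a, b):
    """l_1 distance between two equal-length series with point distance |a - b|."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def range_query(database, pattern, radius, distance=l1):
    """Names of all series within the given distance from the pattern."""
    return [name for name, series in database.items()
            if distance(series, pattern) <= radius]


def nearest_neighbors(database, pattern, k=1, distance=l1):
    """Names of the k series closest to the pattern, closest first."""
    return heapq.nsmallest(
        k, database, key=lambda name: distance(database[name], pattern))
```

Calling nearest_neighbors with k=1 gives the nearest-neighbor query, and larger k gives the multiple-neighbors query.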

24
Conclusions
  • Distance measures
  • Strict, left, right, and flat important extrema
  • Extrema in derivative series
  • Importance levels
  • Compression, indexing, and pattern retrieval
    techniques using importance levels
  • Future work
  • Experiments
  • Investigating the usefulness

25
  • Questions ?

26
Acknowledgements
  • Eugene Fink
  • Colleagues and Management at Nielsen Media
    Research
  • Colleagues at Cognizant Technology Solutions, Ltd
  • Family
  • Parents: Mohan Reddy Gandhi and Sulochana Gandhi
  • Brother: Sharat Gandhi
  • Wife: Madhuri Gandhi