Title: Indexing of Time Series by Major Minima and Maxima
1Indexing of Time Series by Major Minima and Maxima
Eugene Fink Kevin B. Pratt Harith S. Gandhi
2Time series
A time series is a sequence of real values
measured at equal intervals.
3Results
- Compression of a time series by extracting
its major minima and maxima
- Indexing of compressed time series
- Retrieval of series similar to a given pattern
- Experiments with stock and weather series
4Outline
5Compression
We select major minima and maxima, along with the
start point and end point, and discard the other
points.
We use a positive parameter R to control the
compression rate.
6Major minima
- A point a_m in a_1..n is a major minimum if there are i and j, where i < m < j, such that
  - a_m is a minimum among a_i..j, and
  - a_i - a_m ≥ R and a_j - a_m ≥ R.
7Major maxima
- A point a_m in a_1..n is a major maximum if there are i and j, where i < m < j, such that
  - a_m is a maximum among a_i..j, and
  - a_m - a_i ≥ R and a_m - a_j ≥ R.
[Figure: a major maximum a_m, with a_i and a_j each at least R below it]
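The two definitions can be checked directly by brute force. The Python sketch below is only an illustration: it assumes the comparisons in the definitions are differences of at least R, it uses 0-based indices, and the function names are hypothetical rather than taken from the slides.

```python
def is_major_minimum(a, m, R):
    """Brute-force test of the major-minimum definition for a[m]."""
    def reaches(indices):
        # Walk away from m; a[m] must stay the minimum of the window
        # until some point is at least R above it.
        for k in indices:
            if a[k] < a[m]:
                return False   # a[m] is no longer the minimum among a[k..m]
            if a[k] - a[m] >= R:
                return True    # found an endpoint at least R above a[m]
        return False

    left = reaches(range(m - 1, -1, -1))
    right = reaches(range(m + 1, len(a)))
    return left and right


def is_major_maximum(a, m, R):
    """A major maximum of a is a major minimum of the negated series."""
    return is_major_minimum([-x for x in a], m, R)
```

For the series 5, 1, 4, 2, 6 and R = 3, only the point with value 1 passes the minimum test. With R = 2, the points with values 4 and 2 also qualify as a major maximum and a major minimum.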
8Compression procedure
The procedure performs one pass through a given
series.
It takes linear time and constant memory.
It can compress a live series without storing it
in memory.
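A one-pass procedure with these properties could look like the Python sketch below. It is not necessarily the authors' procedure: it is a common threshold-based extremum tracker, written under the assumption that an extremum is major when the series moves at least R away from it on both sides. The name `compress` and the (index, value) output format are hypothetical. On ties, the sketch keeps the first of several equal extrema.

```python
def compress(series, R):
    """Yield (index, value) for the start point, major extrema, and end point.

    Works on any iterable, including a live stream, and keeps only a few
    scalars of state.
    """
    it = enumerate(series)
    try:
        i, x = next(it)
    except StopIteration:
        return
    yield i, x                      # the start point is always kept
    lo = hi = cand = last = (i, x)  # running min, running max, candidate, last point
    direction = 0                   # 0: undecided, +1: seeking a max, -1: seeking a min
    for i, x in it:
        last = (i, x)
        if direction == 0:
            if x < lo[1]:
                lo = (i, x)
            if x > hi[1]:
                hi = (i, x)
            if x - lo[1] >= R:      # rose by R: start looking for a major maximum
                direction, cand = +1, (i, x)
            elif hi[1] - x >= R:    # fell by R: start looking for a major minimum
                direction, cand = -1, (i, x)
        elif direction * (x - cand[1]) > 0:
            cand = (i, x)           # the candidate extremum moved further
        elif direction * (cand[1] - x) >= R:
            yield cand              # series moved R away: candidate is major
            direction, cand = -direction, (i, x)
    if last[0] != 0:
        yield last                  # the end point is always kept
```

For example, `list(compress([5, 1, 4, 2, 6], 3))` keeps the start point, the minimum 1, and the end point. A larger R discards more points, so R controls the compression rate.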
10Indexing of series
We index series in a database by their major
inclines, which are upward and downward segments
of the series.
11Major inclines
- A segment a_i..j is a major upward incline if
  - a_i is a major minimum,
  - a_j is a major maximum, and
  - for every m between i and j, a_i < a_m < a_j.
The definition of a major downward incline is
symmetric.
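The definition of an upward incline can be illustrated with a short Python sketch over a list of major extrema. This is a direct, quadratic-time reading of the definition, not the two-pass procedure from the talk. It assumes that the points dropped by compression lie between their neighbouring major extrema, so only the extrema need checking. The function name and the tuple layout are hypothetical.

```python
def upward_inclines(extrema):
    """Return (start, end, length, height) for each major upward incline.

    extrema: list of (index, value, kind) in time order,
    where kind is "min" or "max".
    """
    inclines = []
    for p, (i, low, kind) in enumerate(extrema):
        if kind != "min":
            continue
        highest = low
        for j, value, k in extrema[p + 1:]:
            if value <= low:
                break  # the series returns to the start level: stop extending
            if k == "max" and value > highest:
                # every extremum between i and j lies strictly between them
                inclines.append((i, j, j - i, value - low))
            highest = max(highest, value)
    return inclines
```

Downward inclines can be found in the same way by negating the values and swapping the "min" and "max" labels.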
12Identification of inclines
The procedure performs two passes through a list
of major minima and maxima.
Its time is linear in the number of inclines.
14Indexing of inclines
We index major inclines of series in a database
by their lengths and heights.
We use a range tree, which supports indexing of
points by two coordinates.
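To illustrate what such an index has to answer, here is a simplified Python stand-in. It is not a range tree: it sorts inclines by length, uses binary search on that coordinate, and filters by height, which gives the same answers with weaker time guarantees. The class name `InclineIndex` and the (length, height, payload) layout are hypothetical.

```python
import bisect


class InclineIndex:
    """Simplified two-coordinate index over inclines (not a real range tree)."""

    def __init__(self, inclines):
        # inclines: iterable of (length, height, payload)
        self._items = sorted(inclines, key=lambda t: t[0])
        self._lengths = [t[0] for t in self._items]

    def query(self, min_len, max_len, min_height, max_height):
        """Return all inclines whose length and height fall in the given ranges."""
        lo = bisect.bisect_left(self._lengths, min_len)
        hi = bisect.bisect_right(self._lengths, max_len)
        return [t for t in self._items[lo:hi]
                if min_height <= t[1] <= max_height]
```

A true range tree would answer the same rectangular query without scanning every incline in the length range.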
16Retrieval
The procedure inputs a pattern series
and searches for similar segments in a database.
[Figure: example pattern]
- Main steps
  - Find the pattern's inclines with the greatest height
  - Retrieve all segments that have similar inclines
  - Compare each of these segments with the pattern
18Highest inclines
First, the retrieval procedure identifies the
important inclines in the pattern and selects
the highest inclines.
19Candidate segments
Second, the procedure retrieves segments with
similar inclines from the database.
- An incline is considered similar if
  - its height is between height / C and height · C, and
  - its length is between length / D and length · D.
We use the range tree to retrieve similar
inclines.
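The similarity condition can be written as a small predicate. The Python sketch below assumes that "height" and "length" in the bounds refer to the pattern's incline and that C and D are at least 1. The function name and the (length, height) pair layout are hypothetical.

```python
def is_similar(incline, pattern_incline, C, D):
    """Check the height and length bounds for one candidate incline.

    Both arguments are (length, height) pairs; C and D are the
    tolerance factors for height and length, assumed to be >= 1.
    """
    length, height = incline
    p_length, p_height = pattern_incline
    return (p_height / C <= height <= p_height * C
            and p_length / D <= length <= p_length * D)
```

For a pattern incline of length 100 and height 5 with C = 2 and D = 1.5, a candidate of length 120 and height 9 passes, while a candidate of length 200 fails the length bound. Larger C and D admit more candidates, which makes the search slower but less likely to miss a match.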
20Similarity test
Third, the procedure compares the retrieved
segments with the pattern, using a given
similarity test.
22Experiments
We have tested a Visual Basic implementation on
a 2.4-GHz Pentium computer.
- Data sets
  - Stock prices: 98 series, 60,000 points
  - Air and sea temperatures: 136 series, 450,000 points
23Stock prices (60,000 points) Search for 100-point
patterns
The x-axes show the ranks of matches retrieved by
the developed procedure, and the y-axes are the
ranks assigned by a slow exhaustive search.
[Plot: fast ranking (x-axis) vs. perfect ranking (y-axis); C = D = 5, time 0.05 sec]
24Stock prices (60,000 points) Search for 500-point
patterns
The x-axes show the ranks of matches retrieved by
the developed procedure, and the y-axes are the
ranks assigned by a slow exhaustive search.
[Plots: fast ranking (x-axis) vs. perfect ranking (y-axis), three panels]
- C = D = 5: time 0.31 sec
- C = D = 2: time 0.12 sec
- C = D = 1.5: time 0.09 sec
25Temperatures (450,000 points) Search for
200-point patterns
The x-axes show the ranks of matches retrieved by
the developed procedure, and the y-axes are the
ranks assigned by a slow exhaustive search.
[Plots: fast ranking (x-axis) vs. perfect ranking (y-axis), three panels]
- C = D = 5: time 1.18 sec
- C = D = 2: time 0.27 sec
- C = D = 1.5: time 0.14 sec
26Conclusions
Main results: Compression and indexing of time
series by major minima and maxima.
Current work: Hierarchical indexing by importance
levels of minima and maxima.