Title: Time Series Anomaly Detection Experiments
- This file contains full-color, large-scale versions of the experiments shown in the paper, along with additional experiments that were omitted because of space constraints.
- Note that in every case, all the data is freely available.
Figure 1 Expanded
Here is a premature ventricular contraction (PVC).
Here the bitmaps are very different. This is the most unusual section of the time series, and it coincides with the PVC.
Here the bitmaps are almost the same.
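The comparison of bitmaps described in these annotations can be sketched in code. This is a minimal illustration under stated assumptions, not the implementation used in the paper: it assumes a four-symbol alphabet with breakpoints at -0.67, 0, and 0.67 after z-normalization, skips any dimensionality reduction, and scores each position by the squared Euclidean distance between the bitmap of the window before it and the bitmap of the window after it. The function names and the window layout are hypothetical.

```python
import math
from itertools import product

ALPHABET = "abcd"  # assumed four-symbol alphabet


def to_symbols(series):
    """Z-normalize the series and map each value to one of four symbols."""
    mean = sum(series) / len(series)
    var = sum((x - mean) ** 2 for x in series) / len(series)
    std = math.sqrt(var) or 1.0  # avoid division by zero on a constant series
    symbols = []
    for x in series:
        z = (x - mean) / std
        if z < -0.67:
            symbols.append("a")
        elif z < 0:
            symbols.append("b")
        elif z < 0.67:
            symbols.append("c")
        else:
            symbols.append("d")
    return "".join(symbols)


def bitmap(symbols, level=2):
    """Count every subword of length `level`, scaled so the largest count is 1."""
    counts = {"".join(p): 0 for p in product(ALPHABET, repeat=level)}
    for i in range(len(symbols) - level + 1):
        counts[symbols[i:i + level]] += 1
    peak = max(counts.values())
    if peak == 0:
        return {k: 0.0 for k in counts}
    return {k: v / peak for k, v in counts.items()}


def bitmap_distance(a, b):
    """Squared Euclidean distance between two bitmaps with the same keys."""
    return sum((a[k] - b[k]) ** 2 for k in a)


def anomaly_scores(series, window, level=2):
    """Score each position by how much the bitmaps before and after it differ."""
    symbols = to_symbols(series)
    scores = [0.0] * len(symbols)
    for t in range(window, len(symbols) - window + 1):
        lag = bitmap(symbols[t - window:t], level)
        lead = bitmap(symbols[t:t + window], level)
        scores[t] = bitmap_distance(lag, lead)
    return scores
```

On a regular series the two bitmaps match and the score stays near zero; where the windows straddle an unusual section, the bitmaps diverge and the score rises.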
Figure 3 Expanded
The mitochondrial DNA sequences of four animals, each used to create its own file icon with a chaos game representation. Note that Pan troglodytes is the familiar chimpanzee, and Loxodonta africana and Elephas maximus are the African and Indian elephants, respectively. The file icons show that humans and chimpanzees have similar genomes, as do the African and Indian elephants.
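As a rough illustration of how such an icon could be built, the sketch below counts every DNA subword of a given length and places each count in a grid cell chosen by recursively splitting the square into quadrants, one split per letter. The quadrant layout in `QUADRANT`, the choice to let the first letter pick the coarsest quadrant, and the function names are all assumptions made for this example; the slides do not specify them.

```python
# Hypothetical (row, column) quadrant assigned to each base.
QUADRANT = {"A": (0, 0), "C": (0, 1), "G": (1, 0), "T": (1, 1)}


def cgr_grid(sequence, level=2):
    """Return a 2**level by 2**level grid of subword counts."""
    size = 2 ** level
    grid = [[0] * size for _ in range(size)]
    seq = sequence.upper()
    for i in range(len(seq) - level + 1):
        word = seq[i:i + level]
        if any(ch not in QUADRANT for ch in word):
            continue  # skip ambiguous bases such as N
        row = col = 0
        for ch in word:
            r, c = QUADRANT[ch]
            row = row * 2 + r  # each letter refines the cell by one quadrant split
            col = col * 2 + c
        grid[row][col] += 1
    return grid


def normalize(grid):
    """Scale counts to [0, 1] so each cell can be mapped to a color."""
    peak = max(max(row) for row in grid)
    if peak == 0:
        return [[0.0] * len(row) for row in grid]
    return [[v / peak for v in row] for row in grid]
```

Two species with similar subword frequencies produce similar normalized grids, which is why their icons would look alike once the cells are colored.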
Figure 6 Expanded
Annotations by a cardiologist
Premature ventricular contraction
Premature ventricular contraction
Supraventricular escape beat
Figure 7 Expanded
A very complex and noisy ECG, but according to a
cardiologist there is only one abnormal
heartbeat. The algorithm easily finds it.
Below are some examples of classification and clustering with our bitmap approach. These examples did not make it into the paper because of space limitations.
Time Series Thumbnails
A snapshot of a folder containing cardiograms, with its files arranged by the Cluster option. Five cardiograms have been grouped into two clusters based on their similarity.
- Cluster 1 (eeg 1 to 3): BIDMC Congestive Heart Failure Database (chfdb), record chf02. Start times at 0, 82, and 150, respectively.
- Cluster 2 (eeg 6 to 7): BIDMC Congestive Heart Failure Database (chfdb), record chf15. Start times at 0 and 82, respectively.
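The slides do not say which clustering algorithm the folder view uses. As one possibility, the sketch below groups files by single-linkage agglomerative clustering over their bitmap vectors, merging the two closest clusters until the requested number remains. The function names are hypothetical, and each bitmap is assumed to be flattened into a list of numbers.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def single_linkage(vectors, k):
    """Merge the closest pair of clusters until only k clusters remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(
                    squared_distance(vectors[p], vectors[q])
                    for p in clusters[i]
                    for q in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]
```

Given five bitmap vectors where the first three and the last two resemble each other, `single_linkage(vectors, 2)` returns the two groups of indices, mirroring the two clusters in the folder snapshot.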
Clustering with Time Series Thumbnail Approach
Data Key
- Cluster 1 (datasets 1 to 5): BIDMC Congestive Heart Failure Database (chfdb), record chf02. Start times at 0, 82, 150, 200, and 250, respectively.
- Cluster 2 (datasets 6 to 10): BIDMC Congestive Heart Failure Database (chfdb), record chf15. Start times at 0, 82, 150, 200, and 250, respectively.
- Cluster 3 (datasets 11 to 15): Long Term ST Database (ltstdb), record 20021. Start times at 0, 50, 100, 150, and 200, respectively.
- Cluster 4 (datasets 16 to 20): MIT-BIH Noise Stress Test Database (nstdb), record 118e6. Start times at 0, 50, 100, 150, and 200, respectively.
Clustering Extended
Segmental Markov model 1
In Ge and Smyth 2000, this dataset was explored with segmental hidden Markov models. After carefully adjusting the parameters, they reported 98% classification accuracy. Using time series bitmaps with virtually any parameter settings, we get perfect classification and clustering. We can get perfect classification using one-nearest-neighbor classification, or we can project the data into 2-dimensional space (see next slide) and get perfect accuracy using a simple linear classifier, a decision tree, or SVD. (Dataset donated by Padhraic Smyth and Seyoung Kim.)
[Figure: clustering of the dataset; instance labels 1 to 56 omitted.]
Parameters: Level = 1, N = 60, n = 12
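One-nearest-neighbor classification over bitmaps can be sketched as follows. This is an illustration only: it assumes each time series has already been turned into a flattened bitmap vector, uses squared Euclidean distance, and estimates accuracy by leave-one-out evaluation. The function names are hypothetical, and the slides do not state how the reported accuracy was measured.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_neighbor_label(query, training):
    """Return the label of the training example closest to the query.

    `training` is a list of (vector, label) pairs.
    """
    best = min(training, key=lambda item: squared_distance(query, item[0]))
    return best[1]


def leave_one_out_accuracy(examples):
    """Classify each example using all the others and report the hit rate."""
    hits = 0
    for i, (vector, label) in enumerate(examples):
        others = examples[:i] + examples[i + 1:]
        if nearest_neighbor_label(vector, others) == label:
            hits += 1
    return hits / len(examples)
```

The classifier has no parameters of its own, so any difference in accuracy comes from the bitmap representation and the distance measure.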
Classification
The MIT ECG Arrhythmia dataset projected into 2D space using only the information from a level 2 time series bitmap. The two classes are easily separated by a simple linear classifier (gray line).
[Figure: plot of instances 1 to 56; point labels and axis ticks (0.35 to 0.55) omitted.]
Parameters: Level = 1, N = 60, n = 12
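The slide does not name the projection method or the linear classifier used. As one simple example of a linear classifier on 2D points, the sketch below trains a perceptron, which finds a separating line whenever the two classes are linearly separable. The function names are hypothetical, and labels are assumed to be +1 and -1.

```python
def train_perceptron(points, labels, epochs=1000):
    """Learn weights w and bias b so that sign(w . p + b) matches each label."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x, y), label in zip(points, labels):
            if label * (w[0] * x + w[1] * y + b) <= 0:
                # Misclassified (or on the line): nudge the line toward the point.
                w[0] += label * x
                w[1] += label * y
                b += label
                mistakes += 1
        if mistakes == 0:
            break  # every training point is on the correct side
    return w, b


def predict(w, b, point):
    """Return +1 or -1 depending on which side of the line the point falls."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else -1
```

The learned line `w[0]*x + w[1]*y + b = 0` plays the role of the gray line in the figure.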