1
Sequence Classification Using Statistical Pattern Recognition
  • José Antonio Iglesias, Agapito Ledezma, and Araceli Sanchis
  • Computer Science Department
  • Universidad Carlos III de Madrid
  • Avda. de la Universidad, 30. 28911 Leganés, Spain
  • jiglesia, ledezma, masm_at_inf.uc3m.es

2
Outline
  • Motivation and Introduction
  • Sequence classification
  • Our approach
  • Library Creation
  • Classification
  • Target Environment
  • Description
  • Experiments and Results
  • Conclusions and Future Works

4
Motivation
  • Opponent behavior modelling and classification
  • (Environment: soccer simulation domain)

5
Introduction
  • Behavior classification
  • A behavior is represented as a sequence of elements
  • Behavior classification therefore becomes sequence classification
  • Sequence: a
    set of elements ordered so that they can be
    labelled with the positive integers
    (Merriam-Webster Dictionary)

7
Sequence classification
  • Given
  • Classes C = {c1, c2, ..., cn}
  • Sequence E = e1, e2, ..., en
  • Determine
  • Which class ci ∈ C does the sequence E belong
    to?

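9
Our approach
[Diagram with two stages, Library Creation and Classification.
Library Creation: labelled sequences (Sequence 1 / Class 1,
Sequence 2 / Class 2, ..., Sequence n / Class n) feed a Pattern Library.
Classification (On-Line Sequence Classification): the sequence to classify
goes through one Compare_Patterns step per pattern in the library, which
gives the classification result.
Example command sequences shown: "pwd fs fg", "vi man ls",
"finger more ls ...", "vi more ls".]
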
11
Library Creation
  • Trie (from "retrieval") data structure
  • A special search tree used for storing elements and
    their prefixes.
  • Every node
  • represents an element
  • stores useful information (number of times it appeared, ...)
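
As an illustration of the bullets above, here is a minimal sketch of such a node and of an insert operation. The names `TrieNode`, `count`, `chi2` and `insert` are hypothetical and do not come from the slides. The sketch inserts only the sequences it is given and counts how often each node is reached.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TrieNode:
    """One trie node: an element plus the information stored with it."""
    element: Optional[str] = None       # None marks the root
    count: int = 0                      # times this element was reached through this prefix
    chi2: Optional[float] = None        # placeholder for a statistic computed later
    children: dict[str, "TrieNode"] = field(default_factory=dict)


def insert(root: TrieNode, sub_sequence: list[str]) -> None:
    """Walk down from the root, creating nodes as needed and counting visits."""
    node = root
    for element in sub_sequence:
        node = node.children.setdefault(element, TrieNode(element))
        node.count += 1


root = TrieNode()
insert(root, ["pwd", "vi", "pwd"])
insert(root, ["vi", "pwd", "ls"])
```

The prefix of a node is the path from the root down to its parent, so it does not need to be stored in the node itself.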

13
Library Creation - An example trie
  • Sequence to insert initially in the trie
  • pwd → vi → pwd → vi → pwd → ls
  • Sub-sequence length: 3
  • Sub-sequences to insert in the trie
  • pwd → vi → pwd and vi → pwd → ls

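The splitting step in this example can be sketched as follows. It assumes, as the example suggests, that the sequence is cut into consecutive, non-overlapping chunks; the function name is hypothetical, and leaving a shorter final chunk when the length does not divide evenly is an assumption the slides do not address.

```python
def split_into_sub_sequences(sequence: list[str], length: int) -> list[list[str]]:
    """Cut the sequence into consecutive, non-overlapping chunks of `length` elements."""
    return [sequence[i:i + length] for i in range(0, len(sequence), length)]


commands = ["pwd", "vi", "pwd", "vi", "pwd", "ls"]
sub_sequences = split_into_sub_sequences(commands, 3)
print(sub_sequences)  # [['pwd', 'vi', 'pwd'], ['vi', 'pwd', 'ls']]
```
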
14
Library Creation - An example trie
  • Sub-sequences to insert in the trie
  • pwd → vi → pwd and vi → pwd → ls
[Figure, repeated on slides 14 to 21: the trie below its Root node for the
sequence pwd → vi → pwd → vi → pwd → ls. The node labels were not
preserved in this transcript.]

22
Library Creation - Evaluating Dependences
  • Evaluate the relation/dependence between an
    element and its prefix
  • Two approaches
  • Frequency-based method.
  • Statistical dependence method.
  • Our approach: the statistical value used is the chi-square
    value.
  • This value is stored in every node of the trie

23
Library Creation - Evaluating Dependences
Expected value:
$$E_{ij} = \frac{(\text{Row}_i\ \text{Total}) \times (\text{Column}_j\ \text{Total})}{\text{Grand Total}}$$
Chi-square value:
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{k} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
2 x 2 Contingency Table
  • O11: How many times the current node/element is
    followed by its prefix.
  • O12: How many times the
    current node/element is followed by a different
    prefix.
  • O21: How many times a different prefix
    (of the same length) is followed by the same
    node.
  • O22: How many times a different prefix (of
    the same length) is followed by a different node.

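A short sketch of how the chi-square value could be computed from the four observed counts of a 2 x 2 contingency table, following the two formulas on this slide. The function name is hypothetical, and skipping cells whose expected value is zero is an assumption made here to avoid a division by zero.

```python
def chi_square_2x2(o11: float, o12: float, o21: float, o22: float) -> float:
    """Chi-square value of a 2 x 2 table of observed counts."""
    observed = [[o11, o12], [o21, o22]]
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        return 0.0                      # empty table: nothing to measure (assumption)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * column_totals[j] / grand_total
            if expected > 0:            # skip empty cells (assumption)
                chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


print(chi_square_2x2(10, 20, 30, 40))   # about 0.794
```

A table whose rows are proportional, such as (10, 10, 10, 10), gives 0, meaning no dependence was observed.
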
24
Library Creation - Evaluating Dependences
[Figure: a Sequence Pattern Trie below its Root node.]
  • A Sequence Pattern Trie is created for each
    class.

26
Classification
[Diagram: the approach diagram again, with the Classification stage
detailed. The sequence to classify is stored in a Testing Trie, and each
Compare_Patterns step compares it with a Class Trie from the Pattern
Library to obtain the class of the on-line sequence.]

27
Classification - Comparing Process
[Figure, repeated on slides 27 to 31: a Class Trie and a Testing Trie, each
below its own Root node. Nodes are labelled with an element and a number
(for example "pwd 3"), and some also with a chi-square value (for example
"vi 1 5.1" in the Testing Trie and "vi 1 7.1" in the Class Trie).]
  • If the node (and its prefix) are in both tries:
    if ( abs(Chi2TestingTrie - Chi2ClassTrie) <= ThresholdValue ),
    there is a similarity between both tries.
  • Result → (ElementTestingTrie, PrefixTestingTrie, Chi2TestingTrie)

28
Classification - Comparing Process
  • Example: the node vi (with prefix pwd) is in both tries and
    abs(5.1 - 7.1) <= ThresholdValue, so there is a similarity
    between both tries.
  • Result → (vi, pwd, 5.1)

29
Classification - Comparing Process
  • If the node (and its prefix) are only in the
    Testing Trie, there is a difference between both tries.
  • Result → (ElementTestingTrie, PrefixTestingTrie, Chi2TestingTrie × (-1))

30
Classification - Comparing Process
  • Example: the node who (with prefix pwd → vi) is only in the
    Testing Trie.
  • Result → (who, pwd → vi, -4.3)

31
Classification - Comparing Process
  • Example: the node who (with prefix vi) is only in the
    Testing Trie.
  • Result → (who, vi, -3.5)

32
Classification - Comparing Process
  • Result
  • Element1, Prefix1, Value1
  • Element2, Prefix2, Value2
  • Element3, Prefix3, Value3
  • Element4, Prefix4, Value4
  • ...
  • Elementn, Prefixn, Valuen

Each comparison (ClassTrie, TestingTrie) → a comparison value

33
Classification - Comparing Process
  • Result
  • vi, pwd, 5.1
  • who, pwd → vi, -4.3
  • who, pwd, -3.5

Comparison Value: 5.1 - 4.3 - 3.5 = -2.7

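The comparing process can be sketched as below, under several assumptions that the slides do not confirm. Each trie is flattened into a dictionary from (prefix, element) to chi-square value. The threshold test uses `<=`, and the threshold of 2.5 is a made-up number. Nodes found in both tries whose values differ by more than the threshold contribute nothing. The class trie is reduced to the one node whose position the example makes explicit.

```python
Pattern = tuple[tuple[str, ...], str]    # (prefix, element)


def compare_tries(testing: dict[Pattern, float],
                  class_trie: dict[Pattern, float],
                  threshold: float) -> float:
    """Sum the per-node results of comparing a testing trie with a class trie."""
    results = []
    for pattern, chi2_testing in testing.items():
        if pattern in class_trie:
            if abs(chi2_testing - class_trie[pattern]) <= threshold:
                results.append(chi2_testing)     # similarity: keep the value
        else:
            results.append(-chi2_testing)        # difference: value times -1
    return sum(results)


testing_trie = {
    (("pwd",), "vi"): 5.1,
    (("pwd", "vi"), "who"): 4.3,
    (("vi",), "who"): 3.5,
}
class_trie = {(("pwd",), "vi"): 7.1}
value = compare_tries(testing_trie, class_trie, threshold=2.5)
print(round(value, 1))  # -2.7
```
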
34
Classification
[Diagram: the approach diagram again. Each Compare_Patterns step, one per
pattern in the Pattern Library, now outputs a comparison value.]

35
Classification
[Diagram: as on slide 34. The class of the on-line sequence is the one
with the greatest comparison value.]

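The final selection step amounts to taking the maximum over the per-class comparison values. A minimal sketch, with a hypothetical function name and made-up values (only -2.7 comes from the worked example):

```python
def classify(comparison_values: dict[str, float]) -> str:
    """Return the class whose trie gave the greatest comparison value."""
    return max(comparison_values, key=comparison_values.get)


values = {"Class 1": -2.7, "Class 2": 4.0, "Class n": 0.5}   # illustrative numbers
print(classify(values))  # Class 2
```

If two classes tie, `max` returns the first one encountered; the slides do not say how ties should be handled.
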
37
Environment: UNIX command line sequences
  • Command histories of 9 UNIX computer users
    over 2 years
  • UCI Repository of ML Databases: Newman, C., Hettich,
    S., Merz, C. (1998)

Raw command history:
  Start session 1
  cd /private/docs
  ls -laF more
  cat foo.txt bar.txt zorch.txt > a.txt
  exit
  End session 1
  Start session 2
  cd /games/
  xquake
  fg

Parsed sequence:
  SOF cd <1> ls -laF more cat <3> > <1> exit EOF

In the parsed sequence, <1> stands for one "file name" argument and <3>
for three "file name" arguments.

41
Experiments: UNIX command line sequences
  • 9 files (users) containing from about 10,000 to
    60,000 commands each.
  • 1. Extracting patterns: a trie is created for
    each user → Pattern Library
  • 2. Classification algorithm: the sequence to
    classify (sequences of very different sizes) is
    classified in the class with the greatest
    value (result value).
  • 3. Evaluating the result
  • Calculate
  • if correctly classified: the difference between the
    greatest value and the second greatest value (+)
  • if not: the difference between the real classification
    value and the greatest value (-)
  • (The greater the difference, the better the
    classification)
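
The evaluation in step 3 can be sketched as follows. The function name is hypothetical, and the sketch assumes at least two classes. The comparison values are the ones shown for the test user on the backup slides at the end of this presentation.

```python
def evaluate(comparison_values: dict[str, float], real_class: str) -> float:
    """Positive lead over the runner-up if correct, negative deficit if not."""
    ranked = sorted(comparison_values.values(), reverse=True)
    greatest = ranked[0]
    if comparison_values[real_class] == greatest:
        return greatest - ranked[1]                    # correctly classified (+)
    return comparison_values[real_class] - greatest    # not correctly classified (-)


values = {"User0": 21, "User1": 49, "User2": 9, "User3": 3, "User4": 12,
          "User5": 29, "User6": -1, "User7": 0, "User8": 11}
print(evaluate(values, "User1"))  # 20
print(evaluate(values, "User2"))  # -40
```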

42
Results: UNIX command line sequences
[Plot: "Unix Commands Classification, User 6". Axes: length of the
sequence to classify and classification value (average of 25 simulation
results).]

43
Results: UNIX command line sequences
[Plot: minimum length for classifying a UNIX computer user correctly.
Axes: UNIX computer user (class) and length of the sequence to classify.]

45
Conclusions
  • A threshold value must be found
  • Creating the tries takes a long time
  • Results depend on the length of the sub-sequences
    used to create the trie

46
Conclusions
  • Effective method to classify UNIX users
  • If a behavior can be represented by sequences,
    the proposed classification method can be
    used
  • If a new class is added, only its trie must be
    created
    (the others are not modified)
  • This method could be used for other tasks:
    sequence prediction, sequence clustering
  • RoboCup Coach 2006 Competition (successful
    results)

47
Future Works
  • Pattern Library → one trie for all classes
    (users).
  • Classification method without a threshold value
  • Analysis comparing our approach to others (HMMs)

48
Thank you!

49
Questions

50
Related to Questions...

54
Experiments: UNIX command line sequences
[Diagram: a Pattern Library (USER 0 / Class0, USER 1 / Class1, ...,
USER 8 / Class8) and the Sequence Classification of a test user whose real
class is ClassUser1.]

Example sequences shown in the diagram:
  SOF ls -laF More cd <4>
  SOF cd <1> ls -laF more cat <3> >
  SOF ls <1> exit <1> ls -laF xquake fg
  SOF vi <1> vi <3> ls -la cat <2>

Comparison values:
  User On-Line vs Class User0 → 21
  User On-Line vs Class User1 → 49
  User On-Line vs Class User2 → 9
  User On-Line vs Class User3 → 3
  User On-Line vs Class User4 → 12
  User On-Line vs Class User5 → 29
  User On-Line vs Class User6 → -1
  User On-Line vs Class User7 → 0
  User On-Line vs Class User8 → 11

User On-Line → Class c (here Class User1): correctly classified.
Evaluation: 49 - 29 = 20

58
Experiments: UNIX command line sequences
[Diagram: the same Pattern Library and Sequence Classification, now for a
test user whose real class is ClassUser2.]

Example sequences shown in the diagram:
  SOF ls -laF More cd <4>
  SOF cd <1> ls -laF more cat <3> >
  SOF ls <1> exit <1> ls -laF xquake fg
  SOF vi <1> vi <3> ls -la cat <2>

Comparison values:
  User On-Line vs Class User0 → 21
  User On-Line vs Class User1 → 49
  User On-Line vs Class User2 → 9
  User On-Line vs Class User3 → 3
  User On-Line vs Class User4 → 12
  User On-Line vs Class User5 → 29
  User On-Line vs Class User6 → -1
  User On-Line vs Class User7 → 0
  User On-Line vs Class User8 → 11

User On-Line → Class c (here Class User1): not correctly classified.
Evaluation: 9 - 49 = -40
