Monitoring Extended Regular Expressions

Provided by: oslCs
1
Monitoring Extended Regular Expressions
  • Grigore Rosu
  • University of Illinois at Urbana-Champaign, USA

Joint work with Mahesh Viswanathan and Koushik Sen
2
Increasing Software Reliability
  • Current solutions
  • Human review of code and testing
  • Most used in practice
  • Usually ad-hoc, intensive human support
  • (Advanced) Static analysis
  • Often scales up
  • False positives and negatives, annotations
  • (Traditional) Formal methods
  • Model checking and theorem proving
  • General, good confidence, do not always scale up

3
Trade-offs in System Analysis
[Figure: trade-offs among automation, efficiency, generality and efficacy]
  • E.g., standard type checking is automatic,
    efficient and effective, but reveals a very
    limited set of errors

4
Runtime Verification and Monitoring
Idea: Let the system run and observe its execution trace.
If the trace violates or appears to violate the
requirements, then report an error or guide the
program to avoid or to hit the error.
5
Runtime Verification and Monitoring
  • PathExplorer developed jointly with Havelund
  • Used on 70,000 lines of C code (K9 Rover)
  • Found a deadlock in 10 seconds
  • Confirmed a datarace suspicion
  • Runtime Verification Workshop
  • '01 France (CAV), '02 Denmark (CAV), '03 USA
    (CAV), '04 Spain (ETAPS)

6
PathExplorer - Overview
[Figure: the running program sends events over a socket to the observer]
(Joint work with Klaus Havelund of NASA Ames)
7
PathExplorer the Observer
paxmodules
  module datarace java pax.Datarace
  module deadlock java pax.Deadlock
  module temporal java pax.Temporal spec
  module ERE java pax.Ere spec
end
[Figure: a dispatcher routes the event stream to the datarace, deadlock, temporal and ERE modules; each module may emit warnings]
8
Why (Extended) Regular Expressions?
  • Ordinary programmers and software engineers
    understand and use regular expressions
  • Perl, Python, etc.
  • Safety policies are often regular patterns on
    sequences of states/events
  • $(idle^* \cdot open \cdot (read + write)^* \cdot close)^*$
  • Complementation is needed to say what should not
    happen: $\neg(any^* \cdot start_1 \cdot (\neg end_1)^* \cdot start_2 \cdot any^*)$

9
Extended Regular Expressions (ERE)
  • Regular expressions with complement
  • Language of an ERE
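  • Intersection: $R_1 \cap R_2 = \neg(\neg R_1 + \neg R_2)$

$R ::= \emptyset \mid \epsilon \mid A \mid R + R \mid R \cdot R \mid R^* \mid \neg R$
$L(\emptyset) = \emptyset$
$L(\epsilon) = \{\epsilon\}$
$L(A) = \{A\}$
$L(R_1 + R_2) = L(R_1) \cup L(R_2)$
$L(R_1 \cdot R_2) = \{ w_1 w_2 \mid w_1 \in L(R_1),\ w_2 \in L(R_2) \}$
$L(R^*) = (L(R))^*$
$L(\neg R) = \Sigma^* \setminus L(R)$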
10
ERE Membership Problem
  • Given $w \in \Sigma^*$ and R, is it the case that $w \in L(R)$?
  • Patterns in strings: many applications
  • Programming languages (PERL, Python)
  • Molecular biology (Knight-Myers95)
  • Monitoring
  • Efficient solutions are of great practical
    interest
  • From now on, n is the length of the word/trace w
    and m is the size of the ERE R
  • n is typically much much larger than m

11
What is known (I)
  • If R does not contain negations, then
  • Transform R into an NFA of size O(m) (Aho90)
  • Solution in time O(nm) and space O(m)
  • Improved by Myers92 (JACM): time/space $O(nm / \log n)$
  • Transform R into a DFA of size $O(2^m)$ (Aho90)
  • Solution in time $O(nm)$ and space $O(2^m)$
  • Note: transitions in a DFA take logarithmic time
  • Negations and their nesting make the membership
    problem highly non-trivial

12
Problems with Negation (I)
  • How to complement an NFA?
  • Just complementing the set of final states is
    wrong!

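[Figure: an NFA A with $L(A) = \{ab\}$; after swapping final and non-final states, the resulting automaton accepts $\{ab, a, \epsilon\}$, which is not the complement of $L(A)$]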
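The pitfall on this slide can be reproduced with a small simulation. The automaton below is a hypothetical one chosen to match the languages shown (it is not taken from the slides): it accepts exactly "ab", but reading b in state 1 may lead to either state 2 or state 3, so flipping the final states does not yield the complement.

```python
def nfa_accepts(delta, start, finals, word):
    """Simulate an NFA by tracking the set of states reachable so far."""
    current = {start}
    for symbol in word:
        current = {q2 for q in current for q2 in delta.get((q, symbol), set())}
    return bool(current & finals)

# Hypothetical NFA A with L(A) = {"ab"}; the b-transition from state 1
# is nondeterministic (it may go to state 2 or to state 3).
states = {0, 1, 2, 3}
delta = {(0, "a"): {1}, (1, "b"): {2, 3}}
finals = {2}

# Naive "complement": swap final and non-final states.
flipped = states - finals

words = ["", "a", "b", "ab", "ba", "aa", "abb"]
accepted = [w for w in words if nfa_accepts(delta, 0, finals, w)]
accepted_flipped = [w for w in words if nfa_accepts(delta, 0, flipped, w)]

print(accepted)          # words of the sample accepted by A
print(accepted_flipped)  # words of the sample accepted after flipping
```

The flipped automaton still accepts "ab" (through state 3) and rejects "b", so it is neither disjoint from L(A) nor covering its complement.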
13
Problems with Negation (II)
  • DFAs can be complemented safely by just
    complementing the set of final states, but
  • NFA -> DFA implies exponential state blowup!
  • For k nested negations, $2^{2^{\cdot^{\cdot^{2^m}}}}$ states (a tower of k exponentials)
  • This makes the membership problem non-elementarily
    more complex in the context of (nested) negations

14
What is known (II)
  • Dynamic programming algorithm
  • (Hopcroft-Ullman 79)
  • Time $O(n^3 m)$ and space $O(n^2 m)$
  • Special synchronized alternating automata
  • (Yamamoto 02): intersection but not negation
  • (Kupferman-Zuhovitzky 02): general ERE
  • Time $O(n^2 m)$ and space $O(nm + kn^2)$, where k is the
    number of negations and intersections
  • Algorithms above store the word; this is
    unacceptable in many practical situations
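To make the cubic bound concrete, here is a minimal sketch in the spirit of the dynamic programming approach. It is not the textbook's exact formulation, and the tuple encoding of EREs is an assumption made for this example. For every subexpression and every substring `word[i:j]` it memoizes one Boolean, giving O(n^2 m) table entries, each computed in O(n) steps. Note that it needs the whole word in memory, which is exactly the drawback noted above.

```python
from functools import lru_cache

def member(ere, word):
    """Decide whether word is in L(ere) by memoizing over all substrings."""

    @lru_cache(maxsize=None)
    def match(r, i, j):
        # True iff word[i:j] belongs to L(r).
        tag = r[0]
        if tag == "empty":
            return False
        if tag == "eps":
            return i == j
        if tag == "sym":
            return j == i + 1 and word[i] == r[1]
        if tag == "or":
            return match(r[1], i, j) or match(r[2], i, j)
        if tag == "cat":
            return any(match(r[1], i, k) and match(r[2], k, j)
                       for k in range(i, j + 1))
        if tag == "star":
            # Empty substring, or a non-empty first block followed by the rest.
            return i == j or any(match(r[1], i, k) and match(r, k, j)
                                 for k in range(i + 1, j + 1))
        if tag == "not":
            return not match(r[1], i, j)
        raise ValueError(f"unknown ERE node: {tag}")

    return match(ere, 0, len(word))

# Illustrative ERE over {a, b}: "no two consecutive a's".
A, B = ("sym", "a"), ("sym", "b")
ANY = ("or", A, B)
NO_AA = ("not", ("cat", ("cat", ("cat", ("star", ANY), A), A), ("star", ANY)))

print(member(NO_AA, "abab"), member(NO_AA, "baab"))
```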

15
Desired Behavior - Monitoring
Algorithms that process and then discard each
event are desired in practice, since words or
execution traces can be extremely long.
[Figure: the running program sends events over a socket to the observer]
16
Challenges and Talk Overview
  • What is the lower space/time bound of the ERE
    monitoring problem (to process one event)?
  • $\Omega(2^{c\sqrt{m}})$ for space
  • What is a reasonable upper bound for the ERE
    monitoring problem (to process one event)?
  • Rewriting algorithm in $O(2^{2^{m^2}})$ space/time
  • How to generate optimal monitors for ERE?
  • Optimal monitor generation by coinduction

17
Lower Bound for ERE Monitoring (I)
  • Consider the language
  • (Chandra-Kozen-Stockmeyer81 in alternation)
  • (Kupferman-Vardi98 in model checking)

$L_k = \{\, u \# w \# u' \$ w \mid w \in \{0,1\}^k \text{ and } u, u' \in \{0,1,\#\}^* \,\}$
  • We show that
  • There is an ERE $R_k$ of size $\Theta(k^2)$ with $L(R_k) = L_k$
  • Any monitoring algorithm for $L_k$ needs $\Omega(2^k)$ space
  • So we can conclude that the space lower bound for
    ERE monitoring is $\Omega(2^{c\sqrt{m}})$
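The step from the two facts above to the stated bound is a short substitution. The constants $C$ and $c$ below are names introduced only for this derivation:

```latex
\begin{align*}
m = |R_k| \le C k^2
  &\;\Longrightarrow\; k \ge \sqrt{m / C} = c\sqrt{m},
  \qquad c = 1/\sqrt{C}, \\
\text{space} = \Omega(2^{k})
  &\;\Longrightarrow\; \text{space} = \Omega\bigl(2^{c\sqrt{m}}\bigr).
\end{align*}
```

So a family of EREs whose size grows only quadratically in $k$ already forces any monitor to use space exponential in $\sqrt{m}$.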

18
Lower Bound for ERE Monitoring (II)
$L_k = \{\, u \# w \# u' \$ w \mid w \in \{0,1\}^k \text{ and } u, u' \in \{0,1,\#\}^* \,\}$
[Formula for $R_k$, annotated with three constraints:]
  • There should be exactly one \$ symbol
  • Each letter of w should appear after the \$ at exactly
    the same position
  • There should be some sequence of 0, 1, #, followed
    by a \$ and then by a w
Note that the size of $R_k$ is $\Theta(k^2)$ and $L(R_k) = L_k$
19
Lower Bound for ERE Monitoring (III)
$L_k = \{\, u \# w \# u' \$ w \mid w \in \{0,1\}^k \text{ and } u, u' \in \{0,1,\#\}^* \,\}$
  • Let A be a monitor for $L_k$
  • When A reads the symbol \$, it should remember
    exactly those w that have been seen so far
  • There are $2^{2^k}$ possible distinct situations to
    remember, so A needs at least $2^k$ bits of memory to
    encode each of these situations

20
Idea of an Event-Consuming Algorithm
  • Consume each event as it arrives, generating a
    new ERE monitoring requirement
  • Use the notion of derivative
  • $R\{a\}$ is the ERE that should hold after seeing
    event a, in order for R to hold now
  • Algorithm A stores an ERE R, and when an event a
    arrives it replaces R by $R\{a\}$; at the end of the
    trace, A checks whether $\epsilon \in L(R)$
  • How can we generate $R\{a\}$ efficiently?
  • How can we store $R\{a\}$ compactly?

21
ERE Syntax
  • Sorts Ere and Event; subsort Event < Ere
  • Operations
  • ∅ : -> Ere
  • ε : -> Ere
  • _+_ : Ere Ere -> Ere [assoc comm id: empty]
  • _ _ : Ere Ere -> Ere [assoc id: nil]
  • _* : Ere -> Ere
  • ¬_ : Ere -> Ere

22
Derivatives
  • Related work
  • Antimirov and Mosses
  • Operations
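  • _{_} : Ere Event -> Ere
  • _?_:_ : Bool Ere Ere -> Ere
  • ε∈_ : Ere -> Bool
  • Equations
  • $(R_1 + R_2)\{a\} = R_1\{a\} + R_2\{a\}$
  • $(R_1 \cdot R_2)\{a\} = R_1\{a\} \cdot R_2 + (\epsilon \in R_1)\ ?\ R_2\{a\} : \emptyset$
  • $(R^*)\{a\} = R\{a\} \cdot R^*$
  • $(\neg R)\{a\} = \neg(R\{a\})$
  • $\epsilon\{a\} = \emptyset$
  • $\emptyset\{a\} = \emptyset$
  • $b\{a\} = (b == a)\ ?\ \epsilon : \emptyset$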

Obvious!
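The derivative equations translate almost line by line into an event-consuming monitor. The following is a minimal sketch, not the Maude implementation from the talk. The tuple encoding, the helper names (`plus`, `cat`, `nullable`, `derive`, `Monitor`, `run`) and the small set of simplifications done inside the constructors are assumptions made for this example.

```python
EMPTY, EPS = ("empty",), ("eps",)

def sym(a):
    return ("sym", a)

def star(r):
    return ("star", r)

def neg(r):
    return ("not", r)

def plus(r1, r2):
    # Identity and idempotence of +, applied eagerly to keep terms small.
    if r1 == EMPTY:
        return r2
    if r2 == EMPTY:
        return r1
    return r1 if r1 == r2 else ("or", r1, r2)

def cat(r1, r2):
    # The empty language annihilates concatenation; epsilon is its identity.
    if EMPTY in (r1, r2):
        return EMPTY
    if r1 == EPS:
        return r2
    return r1 if r2 == EPS else ("cat", r1, r2)

def nullable(r):
    """True iff the empty word is in L(r)."""
    tag = r[0]
    if tag in ("empty", "sym"):
        return False
    if tag in ("eps", "star"):
        return True
    if tag == "or":
        return nullable(r[1]) or nullable(r[2])
    if tag == "cat":
        return nullable(r[1]) and nullable(r[2])
    return not nullable(r[1])  # tag == "not"

def derive(r, a):
    """The ERE that must hold after event a for r to hold now."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "sym":
        return EPS if r[1] == a else EMPTY
    if tag == "or":
        return plus(derive(r[1], a), derive(r[2], a))
    if tag == "cat":
        left = cat(derive(r[1], a), r[2])
        return plus(left, derive(r[2], a)) if nullable(r[1]) else left
    if tag == "star":
        return cat(derive(r[1], a), r)
    return neg(derive(r[1], a))  # tag == "not"

class Monitor:
    """Stores one ERE and replaces it by its derivative on every event."""
    def __init__(self, ere):
        self.state = ere
    def consume(self, event):
        self.state = derive(self.state, event)
    def accepts(self):
        return nullable(self.state)

def run(ere, trace):
    monitor = Monitor(ere)
    for event in trace:
        monitor.consume(event)
    return monitor.accepts()

# Illustrative policy: (open (read + write)* close)*
policy = star(cat(sym("open"),
                  cat(star(plus(sym("read"), sym("write"))), sym("close"))))
print(run(policy, ["open", "read", "write", "close"]))
```

Each event is discarded as soon as it has been consumed; only the current ERE is kept between events.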
23
Three Important Simplifying Rules
  • Without any other rules, $R\{a_1\}\{a_2\}\cdots\{a_n\}$ can grow
    to unbounded size
  • Simplifying rules
  • $\emptyset \cdot R = \emptyset$
  • $R + R = R$
  • $R_1 \cdot R + R_2 \cdot R = (R_1 + R_2) \cdot R$
  • Let $\mathcal{R}$ be the rewriting system defined so far
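A small worked example, chosen here for illustration, shows why idempotence matters. Take $R = a^* \cdot a^*$ and feed it the event $a$ repeatedly, using the derivative equations and the identity laws for $+$ and concatenation:

```latex
\begin{align*}
(a^*)\{a\} &= a\{a\} \cdot a^* = \epsilon \cdot a^* = a^* \\
R\{a\} &= (a^*)\{a\} \cdot a^* + (a^*)\{a\} = a^* \cdot a^* + a^* \\
R\{a\}\{a\} &= (a^* \cdot a^*)\{a\} + (a^*)\{a\}
            = (a^* \cdot a^* + a^*) + a^*
            = a^* \cdot a^* + a^* \quad \text{by } R + R = R
\end{align*}
```

With the rule $R + R = R$ the term stabilizes after the first event. Without it, every further $a$ appends one more copy of $a^*$, so the stored expression grows linearly with the trace.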

24
Theorems (RTA03)
  • $\mathcal{R}$ is terminating and ground Church-Rosser
    modulo AC of _+_ and A of _ _
  • $L(nf_{AC}(R\{a\})) = \{ w \mid aw \in L(R) \}$ for all EREs R
  • $a_1 a_2 \cdots a_n \in L(R)$ iff $\epsilon \in R\{a_1\}\{a_2\}\cdots\{a_n\}$
  • $R\{a_1\}\{a_2\}\cdots\{a_n\}$ requires $O(2^{2^{m^2}})$ space and
    $O(n \cdot 2^{2^{m^2}})$ time, where $m = |R|$

25
Problems
  • Previous algorithm is not synchronous!
  • Unless we check for emptiness after processing
    each event, which is very expensive
  • How to generate a minimal monitor for ERE
    avoiding the highly exponential state explosion?
  • Solution: circular coinduction
  • Related work by Rutten: no negation

26
Hidden Logic: Behavioral Specification
  • Behavioral specification
  • Tuple $(V, H, \Gamma, \Sigma, E)$, or simply $(\Gamma, \Sigma, E)$
  • Sorts $S = V \cup H$
  • V: visible sorts (stand for data: integers,
    reals, chars, etc.)
  • H: hidden sorts (stand for states, objects,
    blackboxes, etc.)
  • Operations $\Gamma \subseteq \Sigma$
  • $\Sigma$ is an S-signature
  • $\Gamma$ is a subsignature of $\Sigma$ of behavioral operations
  • E is a set of $\Sigma$-equations

27
Contexts and Experiments
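  • A $\Gamma$-context is a $\Gamma$-term with a hidden slot
  • A $\Gamma$-experiment is a $\Gamma$-context of visible result

[Figure: a context built from operations in $\Gamma$ around a hidden slot $z : h$; its result is visible if the context is a $\Gamma$-experiment]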
28
Behavioral Equivalence
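  • Models are called hidden $\Sigma$-algebras: $A, A', \ldots$
  • Behavioral equivalence on A: $a \equiv a'$
  • Identity on visible carriers
  • $a \equiv_h a'$ iff $A_\gamma(a) = A_\gamma(a')$ for any $\Gamma$-experiment $\gamma$

[Figure: a $\Gamma$-experiment applied to $a$ and to $a'$ yields the visible values $A_\gamma(a)$ and $A_\gamma(a')$]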
29
Behavioral Satisfaction
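  • Let $(\forall X)\ t = t'$ be a $\Sigma$-equation and A a hidden
    $\Sigma$-algebra
  • A behaviorally satisfies $(\forall X)\ t = t'$, written
    $A \mid\!\equiv_{\Sigma}^{\Gamma} (\forall X)\ t = t'$,
  • iff $\theta(t) \equiv_h \theta(t')$ for any map $\theta : X \to A$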
30
Proving Behavioral Equivalence
  • Behavioral satisfaction is known to be $\Pi_2^0$-hard, so
  • No way to automatically prove any truth
  • No way to automatically disprove any falsity
  • Hidden logics are incomplete
  • Coinduction and context induction are very strong
  • Both require human support
  • Circular coinduction is an automatic procedure
  • Tuned and tested on hundreds of examples
  • Streams, protocols (ABP), Peterson's mutual
    exclusion, etc.
  • Supported by BOBJ, prototyped in Maude

31
Circular Coinduction in a Nutshell
  • Derive the original proof goal until you end up in
    circles (modulo substitutions, special contexts
    and equational reasoning)

Explanations:
(1) All possibilities to distinguish the two are exhaustively explored
(2) Any experiment can be consumed bottom-up, ending in a visible node
(3) A congruent binary relation R is built, but behavioral equivalence is the largest!
(4) Context induction: the nodes above form the induction hypothesis

Moreover, all the behavioral equalities on the
proof graph are true: lemma discovery!
32
zip(zero, one) = blink
Cobasis: {h, t}
zip(zero, one) = blink
  h: 0 = 0
  t: zip(one, zero) = t(blink)
    h: 1 = 1
    t: zip(zero, one) = blink
33
zip(zero, one) = blink
Cobasis: {h, ht, tt}
zip(zero, one) = blink
  h: 0 = 0
  ht: 1 = 1
  tt: zip(zero, one) = blink
34
zip(odd(S), even(S)) = S
Cobasis: {h, t}
zip(odd(S), even(S)) = S
  h: h(S) = h(S)
  t: zip(even(S), even(t(S))) = t(S)
    h: h(t(S)) = h(t(S))
    t: zip(even(t(S)), even(t(t(S)))) = t(t(S))
35
zip(odd(S), even(S)) = S
One can prove by {h, t}-circular coinduction that
odd(zip(S, S')) = S and even(zip(S, S')) = S'
Cobasis: {h, odd, even}
zip(odd(S), even(S)) = S
  h: h(S) = h(S)
  odd: odd(S) = odd(S)
  even: even(S) = even(S)
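The stream identities on the last four slides can be sanity-checked on finite prefixes. The sketch below assumes the usual definitions, which the transcript does not spell out: `zip` interleaves two streams, `odd` keeps positions 0, 2, 4, ... and `even` keeps positions 1, 3, 5, .... Streams are modelled as functions from an index to a value. Comparing prefixes is only a test, not a coinductive proof.

```python
def head(s):
    return s(0)

def tail(s):
    return lambda n: s(n + 1)

def zip_streams(a, b):
    # Interleave: a0, b0, a1, b1, ...
    return lambda n: a(n // 2) if n % 2 == 0 else b(n // 2)

def odd(s):
    # Elements at positions 0, 2, 4, ... (assumed convention).
    return lambda n: s(2 * n)

def even(s):
    # Elements at positions 1, 3, 5, ... (assumed convention).
    return lambda n: s(2 * n + 1)

def prefix(s, length):
    return [s(n) for n in range(length)]

zero = lambda n: 0
one = lambda n: 1
blink = lambda n: n % 2          # 0, 1, 0, 1, ...
squares = lambda n: n * n + 1    # an arbitrary test stream S

print(prefix(zip_streams(zero, one), 6))
print(prefix(zip_streams(odd(squares), even(squares)), 6))
```

The first line agrees with a prefix of `blink`, and the second with a prefix of `squares`, matching the two proof goals.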
36
Behavioral Specification of EREs
  • $B = (V, H, \Gamma, \Sigma, E)$ where
  • V contains Event and Bool
  • H contains Ere
  • $\Sigma$ contains ∅, ε, _+_, _ _, _*, ¬_
  • E contains all equations defined before
  • $\Gamma$ contains
  • ε∈_ : Ere -> Bool
  • _{_} : Ere Event -> Ere

Theorem: B behaviorally satisfies $R_1 = R_2$ iff $L(R_1) = L(R_2)$
37
$(a + b)^* = (a^* b^*)^*$
Moreover, all the equivalences in the proof graph
below are true!
Theorem: Circular coinduction is a decision
procedure for ERE language equality
$(a + b)^* = (a^* b^*)^*$
$(a + b)^* = b^* (a^* b^*)^*$
true = true
$(a + b)^* = a^* b^* (a^* b^*)^*$
$(a + b)^* = a^* b^* (a^* b^*)^*$
$(a + b)^* = b^* (a^* b^*)^*$
true = true
true = true
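The proof graph above can be reproduced by a small equivalence checker that repeatedly derives a pair of EREs until every new goal is one already seen. This is a simplified sketch of the idea, not the specialized Maude algorithm from the talk. The encoding and helper names are assumptions, and sums are normalized as sets so that the number of distinct derivatives stays finite on these examples.

```python
EMPTY, EPS = ("empty",), ("eps",)

def sym(a):
    return ("sym", a)

def star(r):
    return ("star", r)

def neg(r):
    return ("not", r)

def plus(*rs):
    """Sum normalized modulo associativity, commutativity and idempotence."""
    terms = set()
    for r in rs:
        if r[0] == "or":
            terms |= r[1]
        elif r != EMPTY:
            terms.add(r)
    if not terms:
        return EMPTY
    return next(iter(terms)) if len(terms) == 1 else ("or", frozenset(terms))

def cat(r1, r2):
    if EMPTY in (r1, r2):
        return EMPTY
    if r1 == EPS:
        return r2
    return r1 if r2 == EPS else ("cat", r1, r2)

def nullable(r):
    tag = r[0]
    if tag in ("empty", "sym"):
        return False
    if tag in ("eps", "star"):
        return True
    if tag == "or":
        return any(nullable(t) for t in r[1])
    if tag == "cat":
        return nullable(r[1]) and nullable(r[2])
    return not nullable(r[1])  # tag == "not"

def derive(r, a):
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "sym":
        return EPS if r[1] == a else EMPTY
    if tag == "or":
        return plus(*(derive(t, a) for t in r[1]))
    if tag == "cat":
        left = cat(derive(r[1], a), r[2])
        return plus(left, derive(r[2], a)) if nullable(r[1]) else left
    if tag == "star":
        return cat(derive(r[1], a), r)
    return neg(derive(r[1], a))  # tag == "not"

def equivalent(r1, r2, alphabet):
    """Explore derivative pairs; a pair seen before closes a circle."""
    seen, todo = set(), [(r1, r2)]
    while todo:
        pair = todo.pop()
        if pair in seen:
            continue
        if nullable(pair[0]) != nullable(pair[1]):
            return False  # the experiment "epsilon in _" distinguishes them
        seen.add(pair)
        todo.extend((derive(pair[0], a), derive(pair[1], a)) for a in alphabet)
    return True

a, b = sym("a"), sym("b")
lhs = star(plus(a, b))                # (a + b)*
rhs = star(cat(star(a), star(b)))     # (a* b*)*
print(equivalent(lhs, rhs, "ab"))
```

On this input the checker visits three goals, which correspond to the three distinct equalities in the proof graph: the original one and the two derived by a and by b.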
38
Generating Minimal DFAs for EREs
[Figure: DFAs whose states are R and its derivatives R{a} and R{b}, with transitions labeled a and b]
39
Implementation
  • BOBJ cannot be used because it does not return
    the set of circularities
  • Implemented a specialized circular coinduction
    algorithm in Maude
  • Web server at http://fsl.cs.uiuc.edu
  • A PERL CGI script which calls Maude
  • Generates JPEG, PS, and DOT versions of DFA

40
Conclusion and Future Work
  • Exponential complexity unavoidable when negation
    is added to regular expressions (EREs)
  • A few rewriting rules provide the best trace
    membership algorithm known for EREs
  • Generation of minimal DFAs for EREs by circular
    coinduction (CC) avoids state explosion
  • To be part of PathExplorer at NASA Ames
  • Behavioral Maude with circular coinduction
  • Inductive/Coinductive Theorem Prover (ICTP)
  • Behavioral Rewriting Logic