Discriminative Model Checking

1
Discriminative Model Checking

Peter Niebert
Doron Peled
Amir Pnueli
CAV 2008

2
Discriminative Model Checking

Warning: inside this talk hides another talk!
Automatic Generation of Programs Using Model
Checking and Genetic Programming
Gal Katz, Doron Peled
3
Which logic to use?

Linear: each execution is an alternating sequence
of states/actions.
Use LTL/Büchi automata.
Counterexample if property fails.
Branching: a tree represents all executions,
including the points where they branch.
Allows expressing possibility, e.g., of services.

4
Linear Temporal Logic

[Figure: timelines illustrating the LTL temporal operators, including next (O) and until (U)]

5
Computation Tree Logic
[Figure: computation trees illustrating EG p and AF p]
6
Our point of view

Linear time is sufficient for specifying most
properties.
A counterexample is often not enough:
It gives very little clue about the location of the
error.
It does not give information about how good and bad
executions are related to each other.
Thus, for analysis beyond finding the existence
of an error, we promote a deeper search.

7
Our suggestion

Primary or base specification ? in LTL, for the
base property.
Analysis specification, quantifies over
executions that satisfy or do not satisfy the
base specification.
Syntax: p ?\/? ?? ?? ??? ???? (and
others)
Semantics:
- ??? there exists a continuation
satisfying the property ?, where ? holds from the
beginning.
- ??? there exists a continuation not
satisfying the property ?, where ? holds from the
beginning.

8
Semantics illustration

Semantics:
- ??? there exists a continuation
satisfying the property ?, where ? holds from the
beginning.
- ???? there exists a continuation
not satisfying the property ?, where ? holds
from the beginning.

[Figure: branching continuations, each labeled with the formula that holds along it]
9
Examples for specifications

Bad executions depend on infinitely many bad
choices: ??<>??true
Before executing a, there are good and bad
executions. Once a is executed, things are
persistently bad: ?((Execa /\ ??true) W (Execa /\ ??false))
Properties such as: from some point, all
continuations are good/bad.

10
How to do model checking?

We need to remember some information about the
path so far, to verify that together with the rest
of the computation it does (or does not) satisfy ?.
Suppose we ran a Büchi automaton for ?. Because
it is nondeterministic, it may be following a
branch that cannot be completed to an accepting run.
Thus, we run a subset construction
(determinization) of the Büchi automaton.
At the point of branching, we continue with a
state consistent with one of the Büchi states in
the current subset.
Apply CTL model checking to this structure.

11
Complexity

EXPSPACE-complete, even for AG ??true.
Reduction shown for the related logic mCTL* [KV, LICS
2006] (this logic has different semantics, where
quantification always starts from the initial
state).
But EXPSPACE-complete in the size of the LTL formula,
PSPACE-complete in the size of the branching formula.

12
Application

Why do we need such an analysis?
and now we go to another lecture

13
Automatic Generation of Programs Using Model
Checking and Genetic Programming

Gal Katz
Doron Peled

TACAS 2008
14
Agenda

Introduction and motivation
Genetic Programming
Model Checking
Combined method
Application to mutual exclusion
Conclusions and future work

15
Introduction

Genetic programming
A methodology for automatic programming inspired
by Darwinian evolution [Koza 92].
Used for automatic generation of programs in
various fields.
Mostly used for optimization-related problems.
Fitness is usually calculated by checking program
performance against test cases.
Less used for problems with a strict
specification.

16
Introduction (2)

Model Checking
An automatic formal verification technique used
mainly with finite-state software and hardware
systems.
Can be used to verify communication and
concurrent protocols.
Models are checked against a strict
specification. The result is either:
A confirmation that the model satisfies the
specification, or
A counterexample showing that it does not.

17
Introduction (3)

How to construct a model from the spec.?
Synthesis
Transforms spec. directly to a model that
satisfies it.
Complicated.
Currently not practical for automatic program
generation.
Brute-force enumeration
All possible programs of a specific domain and
size are generated and model-checked.
All existing solutions will eventually be found.
Very time-intensive. Not practical for programs
with more than a few lines of code.

18
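Our Method: Combining GP and Model Checking

[Figure: the user supplies (1) a specification and (2) a configuration; the GP engine and the enhanced model checker exchange (3) the initial population, (4) verification results and (5) new programs; (6) the final model / results go back to the user]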
19
Main Steady-state GP Algorithm

1. Create the initial program population.
2. Randomly choose µ programs.
3. Create λ new programs by applying genetic
operations to the above µ programs.
4. Calculate the fitness function for the µ + λ programs,
and use it to select µ new programs.
5. Replace the old µ programs by the selected ones.
6. Repeat steps 2-5 until either
a perfect solution is found, or
the maximum allowed number of iterations is reached.
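
The loop above can be sketched in Python as follows. This is a minimal sketch and not the authors' tool: the `fitness` and `mutate` callables, the `perfect` threshold and the toy integer "programs" in the usage note are assumptions made for illustration, and plain truncation stands in for the roulette selection described later in the talk.

```python
import random

def steady_state_gp(population, fitness, mutate, mu=5, lam=150,
                    max_iters=2000, perfect=None, rng=None):
    """Steady-state loop over a fixed-size population (illustrative sketch)."""
    rng = rng or random.Random(0)
    population = list(population)          # step 1 is done by the caller
    for _ in range(max_iters):
        # Step 2: randomly choose mu population slots.
        slots = rng.sample(range(len(population)), mu)
        parents = [population[i] for i in slots]
        # Step 3: create lam new programs from the chosen parents.
        offspring = [mutate(rng.choice(parents), rng) for _ in range(lam)]
        # Step 4: score all mu + lam candidates and keep the best mu.
        # (Truncation is a simplification of fitness-proportional selection.)
        pool = sorted(parents + offspring, key=fitness, reverse=True)
        # Step 5: replace the old mu programs by the selected ones.
        for slot, program in zip(slots, pool[:mu]):
            population[slot] = program
        # Step 6: stop early once a perfect solution is found.
        if perfect is not None and fitness(pool[0]) >= perfect:
            return pool[0]
    return max(population, key=fitness)
```

As a toy usage, `steady_state_gp([0] * 20, lambda x: -abs(x - 42), lambda x, r: x + r.choice([-1, 1]), mu=5, lam=10, perfect=0)` evolves integers toward 42; in the real setting the fitness call would be a model checking run.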

20
Program Representation

Programs are represented as trees.
Internal nodes represent expressions or
instructions with parameters (assignment, while,
if, block).
Terminal nodes represent constants or expressions
without any parameter (0, 1, 2, me, other).
Strongly-typed GP is used [Montana 95].

while (A[2] != 0) A[me] = 1
21
Initial Population Creation

Population usually contains 100-1000 programs.
Programs are created recursively using the grow
method [Koza 92].
The root is randomly selected from instruction
nodes.
Offspring are randomly selected from allowed nodes
or terminals as long as typing rules are preserved.
If the maximum allowed tree depth is reached, a terminal
must be chosen.
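
A minimal sketch of typed tree creation with the grow method, assuming a small hypothetical grammar. The type tags (`stmt`, `bool`, `int`), the `skip` and `true` terminals and the 0.3 terminal probability are illustrative assumptions; only the node names `while`, `assign`, `0`, `1`, `2`, `me` and `other` come from the slides.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str
    children: list = field(default_factory=list)

# Hypothetical typed grammar: op -> (result type, argument types).
FUNCTIONS = {
    "while":  ("stmt", ["bool", "stmt"]),
    "assign": ("stmt", ["int", "int"]),   # A[index] = value
    "!=":     ("bool", ["int", "int"]),
    "A[]":    ("int",  ["int"]),          # read A[index]
}
TERMINALS = {
    "int":  ["0", "1", "2", "me", "other"],
    "bool": ["true"],                     # assumed terminal
    "stmt": ["skip"],                     # assumed terminal
}

def grow(node_type, depth, max_depth, rng):
    """Grow a random, well-typed tree of the requested type."""
    funcs = [op for op, (rtype, _) in FUNCTIONS.items() if rtype == node_type]
    # At the depth limit a terminal must be chosen; below the root a
    # terminal may also be chosen at random (the "grow" method).
    if depth >= max_depth or not funcs or (depth > 0 and rng.random() < 0.3):
        return Node(rng.choice(TERMINALS[node_type]))
    op = rng.choice(funcs)
    return Node(op, [grow(t, depth + 1, max_depth, rng)
                     for t in FUNCTIONS[op][1]])

def depth(node):
    """Number of nodes on the longest root-to-leaf path."""
    return 1 + max((depth(c) for c in node.children), default=0)

def well_typed(node, node_type):
    """Check that every node matches the type its parent expects."""
    if not node.children:
        return node.op in TERMINALS[node_type]
    rtype, arg_types = FUNCTIONS[node.op]
    return (rtype == node_type and len(node.children) == len(arg_types)
            and all(well_typed(c, t) for c, t in zip(node.children, arg_types)))
```

Calling `grow("stmt", 0, 4, random.Random(0))` yields a tree whose root is an instruction node, matching the rule that the root is selected from instruction nodes.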

22
Genetic Operations

At each iteration of the GP algorithm, the
following genetic operations are applied to the
selected programs:
Reproduction: programs are copied without any
change
Mutation
Crossover

23
Mutation Operation

The main operation we use.
Allows small modifications to an
existing program, by the following method:
Randomly choose a program node (internal or
leaf).
According to the node type, apply one of the
following operations to the chosen
node (strong typing must be kept).

24
Replacement Mutation Type (a)

Replace the sub-tree rooted at the node with a new
randomly generated sub-tree.
Can change a single node or an entire sub-tree.

[Figure: program tree of the while loop below]
Before: while (A[2] != 0) A[me] = 1
After: while (A[2] != 0) A[me] = A[0]
25
Insertion Mutation Type (b)

Add an immediate parent to the selected node.
Randomly create other offspring for the new
parent, if needed.
According to the selected parent type, can cause:
Insertion of code,
Wrapping code with a while loop,
Extending Boolean expressions.

Before: while (A[2] != 0) A[me] = 1
After: while (A[2] != 0) A[2] = other; A[me] = 1
26
Reduction Mutation Type (c)

Replace the selected node by one of its
offspring.
Delete the remaining offspring of the node.
Has the opposite effect of the previous insertion
mutation, and reduces the program size.
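
A minimal sketch of the reduction mutation on a typed tree. The `Node` class, the `kind` type tags and the `and` / `true` nodes are hypothetical; the sketch only shows how strong typing restricts which offspring may replace the node.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str
    kind: str                       # hypothetical type tag: "stmt", "bool", "int"
    children: list = field(default_factory=list)

def reduce_node(node, rng):
    """Replace node by one of its type-compatible offspring.

    Returns False (and leaves the node unchanged) when no offspring has
    the same type, since the replacement would break strong typing.
    """
    compatible = [c for c in node.children if c.kind == node.kind]
    if not compatible:
        return False
    child = rng.choice(compatible)
    # Taking over the child's content drops the remaining offspring.
    node.op, node.children = child.op, child.children
    return True

# A Boolean expression with two Boolean offspring can be reduced ...
cond = Node("and", "bool", [
    Node("!=", "bool", [Node("A[2]", "int"), Node("0", "int")]),
    Node("true", "bool"),
])
changed = reduce_node(cond, random.Random(0))

# ... but an assignment has only int offspring, so it cannot.
assign = Node("assign", "stmt", [Node("me", "int"), Node("1", "int")])
unchanged = reduce_node(assign, random.Random(0))
```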

27
Deletion Mutation Type (d)

Delete the sub-tree rooted at the node.
Update ancestors recursively.

[Figure: program tree of the while loop below]
while (A[2] != 0) A[me] = 1
28
Crossover Operation

Creates new programs by merging building blocks
of two existing programs.
Crossover steps are:
Randomly choose a node from the 1st program.
Randomly choose a node from the 2nd program that
has the same type as the 1st node.
Exchange the sub-trees rooted at the two
nodes, and use the two newly created programs.
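
The crossover steps can be sketched as a typed sub-tree swap. The `Node` class and its `kind` tag are assumptions for illustration; the example parents are small made-up programs, not the ones from the talk.

```python
import copy
import random
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str
    kind: str                       # hypothetical type tag
    children: list = field(default_factory=list)

def nodes(tree):
    """All nodes of the tree in pre-order."""
    out = [tree]
    for child in tree.children:
        out.extend(nodes(child))
    return out

def crossover(first_parent, second_parent, rng):
    """Return two offspring; the parents are left untouched."""
    a, b = copy.deepcopy(first_parent), copy.deepcopy(second_parent)
    node_a = rng.choice(nodes(a))
    # The second node must have the same type as the first one.
    candidates = [n for n in nodes(b) if n.kind == node_a.kind]
    if not candidates:
        return a, b                 # no compatible node: plain copies
    node_b = rng.choice(candidates)
    # Swapping the node contents in place exchanges the two sub-trees.
    node_a.op, node_b.op = node_b.op, node_a.op
    node_a.children, node_b.children = node_b.children, node_a.children
    return a, b

def leaf(op):
    return Node(op, "int")

p1 = Node("assign", "stmt", [leaf("me"), leaf("1")])
p2 = Node("while", "stmt", [
    Node("!=", "bool", [leaf("2"), leaf("0")]),
    Node("assign", "stmt", [leaf("0"), leaf("other")]),
])
c1, c2 = crossover(p1, p2, random.Random(1))
```

Because only same-typed nodes are exchanged, both offspring remain well-typed, and the combined multiset of nodes is preserved.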

29
Crossover Example
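[Figure: program trees of the two parents, with the exchanged sub-trees marked]
Parents:
A[2] = me; while (A[me] == other)
if (A[me] != 1) A[0] = other
Offspring:
A[2] = me; A[0] = other
if (A[me] != 1) while (A[me] == other)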
30
Crossover (cont.)

Heavily used by traditional GP [Koza].
Tries to mimic biological sexual recombination,
but
Unlike biology (and unlike GA), GP lacks the
notion of genes [Banzhaf et al. 01].
Often acts only as a macro-mutation.
Various methods were developed in order to turn
it into a more fruitful operation (Brood,
Intelligent crossover).
Still, not a significant operation for small
programs like those of Mutual Exclusion.

31
Selection

At each iteration, selection is applied to all µ
+ λ programs (over-production selection).
Programs are selected using a fitness-proportional
(roulette) method [Holland 92].
Elitism is used to ensure that the best program
is always selected.
Similar to Evolution Strategies [Rechenberg 94]
and the Brood Recombination method [Tackett 94]:
better protection from harmful operations.
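
A minimal sketch of fitness-proportional (roulette) selection with elitism. It assumes non-negative scores and draws without replacement; both are illustrative choices, not details confirmed by the slides.

```python
import random

def select(programs, scores, mu, rng):
    """Pick mu programs: the best one always, the rest by roulette wheel."""
    order = sorted(range(len(programs)), key=lambda i: scores[i], reverse=True)
    chosen = [order[0]]             # elitism: the best program always survives
    rest = order[1:]
    while len(chosen) < mu and rest:
        weights = [scores[i] for i in rest]
        if sum(weights) <= 0:
            pick = rng.choice(rest)             # all scores are zero: uniform
        else:
            pick = rng.choices(rest, weights=weights, k=1)[0]
        rest.remove(pick)
        chosen.append(pick)
    return [programs[i] for i in chosen]

picked = select(list("abcdef"), [0, 10, 97.1, 66.77, 0, 75.77], 3,
                random.Random(0))
```

In the usage line, `picked[0]` is always `"c"`, the highest-scoring program; the other two entries depend on the random draw.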

32
Model Checking
33
ω-automata

Run on infinite words, and consist of:
A finite alphabet Σ,
A finite set of states S,
A set of initial states S0 ⊆ S,
A transition relation δ ⊆ S × S,
A labeling function L: S → Σ,
An acceptance condition Ω.
In this version, the labels are on the states
instead of on the arcs.

34
Acceptance conditions

For a run ρ, inf(ρ) denotes the set of states appearing
infinitely often on ρ.
Büchi condition:
A set of states F ⊆ S,
A run ρ over A is accepted if inf(ρ) ∩ F ≠ Ø.
Streett condition:
A set of k pairs (Ei, Fi), 1 ≤ i ≤ k, Ei, Fi ⊆ S,
A run ρ over A is accepted if for all pairs
inf(ρ) ∩ Ei ≠ Ø → inf(ρ) ∩ Fi ≠ Ø.
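
The two acceptance conditions can be written directly as set checks on the states visited infinitely often. This is a small illustrative sketch; the state names `req`, `grant` and `idle` are made up.

```python
def buchi_accepts(inf_states, accepting):
    """Büchi: some accepting state is visited infinitely often."""
    return bool(set(inf_states) & set(accepting))

def streett_accepts(inf_states, pairs):
    """Streett: for every pair (E, F), visiting E infinitely often
    implies visiting F infinitely often."""
    inf_states = set(inf_states)
    return all(not (inf_states & set(e)) or bool(inf_states & set(f))
               for e, f in pairs)

# One Streett pair encoding "infinitely many requests imply
# infinitely many grants" (a strong-fairness style condition).
pairs = [({"req"}, {"grant"})]
```

A run with `inf = {"idle"}` satisfies the pair vacuously, one with `inf = {"req"}` violates it, and one with `inf = {"req", "grant"}` satisfies it.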

35
ω-automata Closure

Büchi automata can be converted into Streett
automata, and vice versa.
Both Büchi and Streett automata are closed under
intersection and complement.
Streett automata are less simple to use, but are
closed under determinization, while Büchi
automata are not.

36
Building the Program's State-graph

Each state consists of values of variables,
program counters, buffers, etc.
Edges represent atomic transitions caused by
program instructions.

Can be built by a DFS algorithm.
Can be decomposed into SCCs [Tarjan 72].

37
Converting Model to ω-automaton

We use the states, initial state and transitions
of the program's state-space.
Acceptance condition can allow all runs, or
impose fairness conditions.
Streett automata can be used in order to define
various fairness conditions (weak and strong).

38
Safety Properties

Basic properties can be checked by simply
analyzing the state graph:
Invariants: can be checked on every visited
state.
Deadlocks: states without outgoing edges.
Unreachable code: instructions that are not
represented on any transition.
Liveness properties require a more complicated
process.
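
A minimal sketch of the state-graph checks above: a DFS over the reachable states that tests an invariant on every visited state and reports states without outgoing edges. The two-process toy model (each process toggles between non-critical `0` and critical `1` with no protocol at all) is an assumption for illustration.

```python
def explore(initial, successors, invariant):
    """DFS over reachable states; returns (states, violations, deadlocks)."""
    seen, stack = {initial}, [initial]
    violations, deadlocks = [], []
    while stack:
        state = stack.pop()
        if not invariant(state):
            violations.append(state)        # invariant fails in this state
        following = list(successors(state))
        if not following:
            deadlocks.append(state)         # no outgoing edge: deadlock
        for nxt in following:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen, violations, deadlocks

def successors(state):
    """Either process may toggle its program counter (no protocol)."""
    pc0, pc1 = state
    return [(1 - pc0, pc1), (pc0, 1 - pc1)]

# Mutual exclusion as an invariant: never both in the critical section.
states, violations, deadlocks = explore(
    (0, 0), successors, lambda s: s != (1, 1))
```

Here the unprotected model reaches all four states, including the violating state `(1, 1)`, and has no deadlock.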

39
Specification

We use Linear Temporal Logic (LTL) [Pnueli 77] to
define specification properties.
LTL formulas are interpreted over infinite
sequences of states, and consist of:
Propositional variables,
Logical connectives, such as ¬, ∧, ∨, →, and
Temporal operators, such as
◇(p): p will eventually occur.
□(p): p always occurs.
A model M satisfies a formula f (M ⊨ f) if every
(fair) run of M satisfies f.
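
For reference, the standard semantics of the two temporal operators and of model satisfaction can be written as below. The notation is an assumption: σ is an infinite sequence of states and σ^i is its suffix starting at position i; the symbols `\Diamond` and `\Box` need the `amssymb` package.

```latex
\begin{align*}
\sigma \models \Diamond p &\iff \exists i \ge 0.\ \sigma^{i} \models p \\
\sigma \models \Box p     &\iff \forall i \ge 0.\ \sigma^{i} \models p \\
M \models f               &\iff \sigma \models f \text{ for every (fair) run } \sigma \text{ of } M
\end{align*}
```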

40
Converting specification to ω-automaton

Every LTL property can be converted into a Büchi
automaton with a size exponential in the LTL
formula size [Vardi & Wolper 94].
For deterministic Streett automata, a
determinization process is also required [Safra
88].
May result in a doubly exponential blowup from
the LTL property.

41
The Model Checking Process [Vardi & Wolper 86]

Both model and specification are converted to
ω-automata over the same alphabet.
The alphabet is 2^AP, where AP denotes a set of
atomic propositions that may hold on the system
states.
Every word accepted by M (a fair run) should be
accepted by the spec, therefore we have to check
whether L(M) ⊆ L(f).

42
Model Checking Results

It's easier to check whether
L(M) ∩ complement(L(f)) = Ø, or equivalently
L(M) ∩ L(¬f) = Ø.
Case 1:
Intersection is empty.
M satisfies f.
Case 2:
Intersection is not empty.
Runs contained in the intersection can be used
for generating counterexamples.
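
The reduction from language inclusion to an emptiness check can be stated in one line. The step from the complement language to L(¬f) uses the fact that the words violating an LTL formula f are exactly the words satisfying ¬f.

```latex
\[
L(M) \subseteq L(f)
\iff L(M) \cap \overline{L(f)} = \emptyset
\iff L(M) \cap L(\neg f) = \emptyset
\]
```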

43
Checking for Non-Emptiness

Easy with Büchi automata:
Decompose the intersection graph into maximal SCCs
reachable from the root.
Check if an accepting state from F occurs
infinitely often inside a reachable SCC.

More complicated with Streett automata.
The algorithm can be used for a single SCC or an entire
automaton.
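
A minimal sketch of the Büchi non-emptiness check: Tarjan's algorithm finds the SCCs reachable from the root, and the language is non-empty if some SCC that contains a cycle also contains an accepting state. The adjacency-dict encoding and the example graph are assumptions; the recursive form is only suitable for small graphs.

```python
def sccs(graph, root):
    """Tarjan's algorithm, restricted to states reachable from root."""
    index, low, on_stack, stack, out = {}, {}, set(), [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:              # v is the root of an SCC
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            out.append(comp)

    visit(root)
    return out

def buchi_nonempty(graph, root, accepting):
    """True if some reachable cycle passes through an accepting state."""
    accepting = set(accepting)
    for comp in sccs(graph, root):
        # A single state only forms a cycle if it has a self-loop.
        has_cycle = len(comp) > 1 or any(v in graph.get(v, ()) for v in comp)
        if has_cycle and comp & accepting:
            return True
    return False

graph = {"A": ["B"], "B": ["C"], "C": ["B"], "D": ["D"]}
```

In the example graph, `C` lies on the reachable cycle B-C, `A` is a trivial SCC without a cycle, and `D` is unreachable from `A`.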

44
Model Checking and GP

Can standard model checking results be used as a
GP fitness function?
Yes, but so far this was done with limited
success [Johnson 07].
A fitness function with just two values is a poor
one.
We wish to analyze the model checking graph in
order to quantify the level of satisfaction.
When using nondeterministic Büchi automata, a
single program computation may have multiple
accepting and non-accepting paths, which is difficult to
analyze.
Deterministic Streett automata are not more
expensive, but ensure symmetry between accepting
and non-accepting paths.

45
Enhanced Model Checking Algorithm

The idea:
We assume that a hostile scheduler (or
environment) chooses the execution path.
For each spec. property, we check the amount of
work the scheduler has to do in order to cause
a property violation.
The results are used for setting the fitness
level scores.

46
Fitness Level 0
[Figure: example graph of SCCs A, B, C, D, E]

All SCCs are empty (not accepting).
Property is never satisfied.
No scheduler choices are needed.
47
Fitness Level 1
[Figure: example graph of SCCs A, B, C, D, E]

At least one accepting SCC.
At least one empty bottom SCC (BSCC).
A finite number of scheduler choices can lead the
execution into the empty BSCC (D in the example).
The program will stay there forever.
A BSCC with only one node means a deadlock, which gets
a worse score.
48
Fitness Level 2
[Figure: example graph of SCCs A, B, C, D, E]

All BSCCs are accepting.
At least one empty SCC.
Infinitely many scheduler choices are needed for keeping
the program inside the empty SCC (B in the
example).
49
Fitness Level 3

All SCCs are accepting.
There still may be SCCs that are not universal,
and contain violating paths.
Therefore, the graph universality is checked.
If the graph is not universal, we are still at
level 2.
Otherwise, level 3 is assigned.
In this case, even infinitely many scheduler choices
cannot cause a violation, since the property is
always satisfied.

[Figure: example graph of SCCs A, B, C, D, E]
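
The four fitness levels can be summarized as a small decision function. This is a sketch under stated assumptions: the SCC decomposition, the accepting/empty classification and the universality check are taken as given inputs, and the `(accepting, is_bottom)` encoding is hypothetical.

```python
def fitness_level(components, universal=False):
    """components: one (accepting, is_bottom) pair per reachable SCC.

    universal: result of a separate universality check on the graph,
    only consulted when every SCC is accepting.
    """
    comps = list(components)
    if not any(accepting for accepting, _ in comps):
        return 0        # all SCCs are empty: property never satisfied
    if any(bottom and not accepting for accepting, bottom in comps):
        return 1        # finitely many choices reach an empty bottom SCC
    if any(not accepting for accepting, _ in comps):
        return 2        # infinitely many choices keep the run in an empty SCC
    return 3 if universal else 2
```

For example, `fitness_level([(True, False), (False, True)])` gives 1, because one accepting SCC exists but a bottom SCC is empty.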
50
Overall Fitness Function

Fitness level scores are calculated for each
specification property.
How to merge them into a single fitness function?
Naïve summing can bias the results, since some
properties may be trivially satisfied when more
basic properties are violated.
Thus, spec. properties are divided into levels,
starting from level 1 for the most basic properties.
As long as not all properties at level i are
satisfied, properties at higher levels get a
fitness of 0.
This algorithm also saves running time by
skipping unneeded checks.
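
A minimal sketch of the leveled merge. The `score` callable, which returns a numeric value and a satisfied flag per property, and the property names in the example are hypothetical; the point is that checking stops at the first level that is not fully satisfied.

```python
def overall_fitness(levels, score):
    """levels: lists of properties, most basic level first.
    score(prop) -> (value, satisfied) for one property."""
    total = 0.0
    for props in levels:
        results = [score(p) for p in props]
        total += sum(value for value, _ in results)
        if not all(satisfied for _, satisfied in results):
            break       # higher levels contribute 0 and are never checked
    return total

checked = []
table = {"mutex": (0.0, False), "no_deadlock": (30.0, True),
         "liveness": (30.0, True)}

def score(prop):
    checked.append(prop)            # record which checks were actually run
    return table[prop]

total = overall_fitness([["mutex"], ["no_deadlock", "liveness"]], score)
```

Since the basic `mutex` property fails in this example, the two higher-level properties are skipped and contribute nothing.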

51
Parsimony

GP programs tend to grow over time to the
maximal allowed tree size (bloating).
Large portions of the code become introns
(junk DNA).
To avoid that, we use parsimony as a secondary
fitness measure.
The number of program nodes × a small factor is
subtracted from the fitness score.
The factor should be carefully chosen:
It should encourage programs to reduce their size,
but
It should not harm the evolutionary process.
Therefore, programs cannot get a score of 100,
but only get close to it. The run can be stopped
when all properties are satisfied.
Programs can be reduced either by mutations, or
directly by detecting dead code in the model
checking process, and then removing it.
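
The parsimony term amounts to a linear size penalty. In this sketch the factor 0.1 is an arbitrary illustrative value, not one taken from the talk.

```python
def penalized_fitness(raw_score, node_count, factor=0.1):
    """Subtract a small penalty proportional to the program size."""
    return raw_score - factor * node_count
```

With equal raw scores, the program with fewer nodes ranks higher, and any non-empty program stays strictly below the maximal raw score.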

52
Vacuity
□(p → ◇q)

Special care is needed for implication
properties of the form □(p → ◇q).
Some (or all) executions may be vacuously
satisfied if p never happens.
We are usually interested only in runs where p
eventually occurs.
Other runs are neither good nor bad. They are
irrelevant.
Thus, in these cases, the program automaton is
first intersected with the property ◇p.
Some SCCs might be marked irrelevant.

[Figure: automata for □(p → ◇q) and ◇p]

If all SCCs are irrelevant, fitness level 0 is
assigned.
A similar mechanism is used for excluding unfair
runs.

53
The Mutual Exclusion Problem

Originally described by [Dijkstra 65].
Many variants and solutions exist.
Modeled using the following program parts:
Non Critical Section
Pre Protocol
Critical Section
Post Protocol
We wish to automatically generate correct code
for the pre and post protocol parts.

54
Spec. Properties

The specification includes the following LTL
properties
The properties are converted into Streett
automata.

55
Runs Configuration

3 different sets of runs
The following parameters were used:
Population size: 150
Max number of iterations: 2000
µ = 5
λ = 150

56
An Example of a Run (1st variant)
Score 0.0

Randomly created.
Does not satisfy mutual exclusion property.
Higher level properties are set to 0.

57
An Example of a Run (1st variant)
Score 66.77

Randomly created.
While loop guarantees mutual exclusion.
Only process 0 can enter the critical section.

58
An Example of a Run (1st variant)
Score 75.77

Last line changed by a mutation.
The naïve mutual exclusion algorithm.
Processes use a turn flag, but depend on each
other.
A local maximum point in the search space.

59
An Example of a Run (1st variant)
Score 70.17

An important building block common to many
algorithms.
Each process sets its own flag and waits for
the other's flag, but
The flag is not turned off correctly.
Might eventually deadlock; thus, properties 4 and
5 get a fitness level of 1.

60
An Example of a Run (1st variant)
Score 76.10

Last line is replaced by a mutation.
Now, process 0 correctly turns its flag off.
Property 5 is fully satisfied.

61
An Example of a Run (1st variant)
Score 92.77

A single node is changed by a mutation.
Both processes turn off their flag.
Properties 4 and 5 are fully satisfied.
Still, deadlock occurs if both processes enter
simultaneously.

62
An Example of a Run (1st variant)
Score 93.20

A mutation added a line to the empty while loop.
This turns the deadlock into a live lock, and
causes a slight fitness improvement.

63
An Example of a Run (1st variant)
Score 94.37

Another line is added to the while loop.
No more dead or live locks, but property can
still be violated by some infinite scheduler
choices.

64
An Example of a Run (1st variant)
Score 96.50

Created by some random mutations.
All properties are satisfied.
Still, not the shortest solution.

65
An Example of a Run (1st variant)
Score 97.10

Created by more mutations.
The shortest algorithm found.
Identical to the known one-bit protocol [Burns &
Lynch 93].

66
Fitness Graph

Best fitness is alternately improved by
Major leaps due to changes in fitness levels.
Small improvements caused by parsimony pressure.

67
More experiments

Successfully found Dekker's algorithm [Dijkstra
65].
Successfully found Peterson's algorithm
[Peterson & Fischer 77].
Found a shorter algorithm than Dekker's.

68
Performance

First variant was easiest to solve.
Other variants are much harder to find.
Still, much better than brute-force methods.
Less significant on small programs (Peterson).
Crucial on large programs (Dekker).

69
Conclusions and Future Work

GP and model checking were successfully combined.
To achieve that, a specific tool was developed.
Found solutions are guaranteed to completely
satisfy the specification.
Scoring system can be further refined.
More information can be extracted from the model
checking results, for assisting the evolutionary
process.
A similar method can be used for correcting a
given program, or at least showing where the
error is.
Next step: use discriminative model checking
properties to refine grading and to find where in
the program to make changes.