Applying First-Order Theorem Provers in Formal Software Safety Certification presentation

1
Applying First-Order Theorem Provers in Formal Software Safety Certification
Joint work with E. Denney and J. Schumann, NASA Ames Research Center
2
Disclaimer

Our view of ATPs

but sometimes

Keep in mind that
we are ATP users, not developers
we are not looking for proofs but for assurance
we consider ATPs a necessary evil :-)
3
Code Generator = DSL Compiler
Initial model (DSL program)
. . .
model landsat as 'Landsat Clustering'.
const nat N as 'number of pixels'.
const nat B as 'number of bands'.
const nat C := 5 as 'number of classes'
  where C << N.
double phi(1..N) as 'class weights'
  where 1 = sum(I := 1..C, phi(I)).
double mu(1..C), sig(1..C)
  where 0 < sig(_).
int c(1..N) as 'class assignments'.
c(_) ~ discrete(phi).
data double x(1..N, 1..B) as 'pixels'.
x(I,_) ~ gauss(mu(c(I)), sig(c(I))).
max pr(x | {phi,mu,sig}) for {phi,mu,sig}.

Ground cover map
multiple Landsat-bands
estimate pixel classes
estimate class parameters
Implementation problems
which model?
which algorithm?
efficient C/C++ code?
correctness?

Model refinements
sig(_) ~ invgamma(delta/2+1, sig0*delta/2).
Model changes
x(I,_) ~ cauchy(mu(c(I)), sig(c(I))).
x(I,_) ~ mix(c(I) cases 1 -> gauss(0, error), _ -> cauchy(mu(c(I)), sig(c(I)))).
4
Code Generator = DSL Compiler
Generated program
. . .

5
Code Generator = DSL Compiler

Generated program
1 sec. generation time
600 lines
1:30 leverage
fully documented
deeply nested loops
complex calculations
correct-by-construction

. . .

6
Generator Assurance

Should you trust a code generator?
Correctness of the generated code depends on correctness of the generator
Correctness of the generator is difficult to show
very large
very complicated
very dynamic
So what do you do?

7
Generator Assurance

So what???
Don't care whether the generator is buggy for other people, as long as it works for me now!
⇒ Certifiable Code Generation (PCC for code generators)

8
Certifiable Code Generation

Generator is extended to support post-generation verification
certify generated programs, not the generator
minimizes trusted component base
no need to re-certify generator
use standard program verification techniques
annotations (e.g., invariants) give hints only
proofs are independently verifiable evidence (certificates)
keeps certification independent from code generation
focus on specific safety properties
keeps annotation generation and certification tractable

...
// Initialization
for(v44=0;v44<=n-1;v44++)
  for(v45=0;v45<=c-1;v45++)
    q(v44,v45)=0;
for(v46=0;v46<=n-1;v46++)
  q(v46,z(v46))=1;
...
for(v12=0;v12<=n-1;v12++)
  for(v13=0;v13<=c-1;v13++)
    v68=0;
    for(v41=0;v41<=c-1;v41++)
      v68+=exp((x(v12)-mu(v41))*(x(v12)-mu(v41))/(double)(-2)/...)
...
model mog as 'Mixture of Gaussians'.
...
Class probabilities
double rho(1..c).
  where 1 = sum(I := 1..c, rho(I)).
Class parameters
double mu(1..c).
double sigma(1..c).
  where 0 < sigma(_).
Hidden variable
nat z(1..n) ~ discrete(rho).
Data
data double x(1..n).
x(I) ~ gauss(mu(z(I)), sigma(z(I))).
Goal
max pr(x | {rho,mu,sigma}) for {rho,mu,sigma}.
Proofs
Model
Code
9
Hoare-Style Certification Framework

Safety property: formal characterization of aspects of intuitively safe programs
introduce shadow variables to record safety information
extend operational semantics by effects on shadow variables
define semantic safety judgements on expressions/statements
Safety policy: proof rules designed to show that the safety property holds for a program
extend Hoare-rules by safety predicate and shadow variables
⇒ prove soundness and completeness (offline, manual)
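
The shadow-variable idea can be sketched for the initialization property with a toy interpreter. This is only an illustration, not the authors' framework: the tuple encoding of statements and the names `run`, `eval_expr`, `INIT`, `UNINIT`, and `SafetyViolation` are all assumptions made for the example.

```python
# Toy operational semantics extended with shadow variables for the
# "initialization before use" property. All names here are hypothetical.
INIT, UNINIT = "INIT", "UNINIT"


class SafetyViolation(Exception):
    """Raised when an expression reads a variable whose shadow is UNINIT."""


def eval_expr(expr, env, shadow):
    """Evaluate an expression; the safety judgement is checked on each read."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "var":
        name = expr[1]
        if shadow.get(name, UNINIT) != INIT:
            raise SafetyViolation(f"read of uninitialized variable {name!r}")
        return env[name]
    if kind == "add":
        return eval_expr(expr[1], env, shadow) + eval_expr(expr[2], env, shadow)
    raise ValueError(f"unknown expression kind {kind!r}")


def run(program, declared):
    """Execute a list of ("assign", name, expr) statements.

    Every declared variable starts with shadow value UNINIT; an assignment
    updates both the ordinary environment and the shadow environment.
    """
    env = {}
    shadow = {name: UNINIT for name in declared}
    for op, name, expr in program:
        if op != "assign":
            raise ValueError(f"unknown statement {op!r}")
        env[name] = eval_expr(expr, env, shadow)
        shadow[name] = INIT
    return env, shadow


safe = [("assign", "x", ("const", 1)),
        ("assign", "y", ("add", ("var", "x"), ("const", 2)))]
unsafe = [("assign", "y", ("add", ("var", "x"), ("const", 2)))]

env, shadow = run(safe, declared=["x", "y"])
```

Running `run(unsafe, declared=["x", "y"])` raises `SafetyViolation`, because `x` is read while its shadow variable is still `UNINIT`.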

10
Certification Framework

Safety property: formal characterization of aspects of intuitively safe programs
All automatic variables shall have been assigned a value before being used (MISRA 9.1)
Formal
introduce shadow variables to record safety information
extend operational semantics by effects on shadow variables

11
Certification Framework

define semantic safety judgements on expressions/statements

12
Certification Framework

prove safety reduction (i.e., consistency of safety property)
⇒ safe programs don't go wrong

13
Certification Framework

Safety policy: proof rules designed to show that the safety property holds for a program
responsible for
maintenance of shadow variables
construction of safety obligations
extend Hoare-rules by safety predicate and shadow variables
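
How a policy constructs obligations can be sketched with a backward, weakest-precondition-style pass for a simplified initialization policy on straight-line code. The rule used here (an assignment requires every variable it reads to be initialized and discharges the requirement on its target) is an assumed simplification, and `reads` and `init_obligations` are hypothetical names, not the rules from the talk.

```python
def reads(expr):
    """Variables read by an expression given as nested tuples."""
    if isinstance(expr, tuple):
        if expr[0] == "var":
            return {expr[1]}
        return set().union(*(reads(arg) for arg in expr[1:]))
    return set()


def init_obligations(program):
    """Backward pass over ("assign", target, expr) statements.

    Returns the variables that must already be initialized on entry for
    every read in the program to be safe. An empty set means the program
    needs no assumptions about its initial state.
    """
    required = set()
    for op, target, expr in reversed(program):
        if op != "assign":
            raise ValueError(f"unknown statement {op!r}")
        # Reads of this statement are obligations; the target is
        # initialized afterwards, so later requirements on it are met.
        required = reads(expr) | (required - {target})
    return required


safe = [("assign", "x", ("const", 1)),
        ("assign", "y", ("add", ("var", "x"), ("const", 2)))]
unsafe = [("assign", "y", ("add", ("var", "x"), ("var", "y")))]

safe_obligations = init_obligations(safe)
unsafe_obligations = init_obligations(unsafe)
```

In this encoding the remaining set plays the role of the verification condition: it is empty for `safe`, while `unsafe` leaves obligations on both `x` and `y`.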

14
Safety Properties

Language-specific properties
array indices within bounds (array): ∀ a[i] ∈ c · a_lo ≤ i ≤ a_hi
variable initialization before use (init): ∀ rvar x ∈ c · x_init = INIT
nil-pointer dereference, ...
Domain-specific properties
matrix symmetry (symm): ∀ covar m ∈ c · ∀ i,j · m[i,j] = m[j,i]
covariance matrices known by code generator
can insert annotations
vector norm, coordinate frame safety, ...

similar to PCC
15
Certification Architecture

standard PCC architecture
organically grown job control
run scripts based on SystemOnTPTP
dynamic axiom generation (sed / awk)
dynamic axiom selection (based on problem names)

16
Lessons

17
Lesson 1: Things don't go wrong in the ATP

Most errors are in the application, axiomatization, or integration
interface between application and ATP = proof task
application debugging = task proving / refuting
application errors difficult to detect
application must provide full axiomatization (axioms / lemmas)
no standards ⇒ no reuse
consistency difficult to ensure manually
integration needs generally supported job control language
better than SystemOnTPTP + shell scripts + C pre-processor
applications need more robust ATPs
better error checking: free variables, ill-sorted terms, ...
consistency checking more important than proof checking
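
The error checking asked for here can be cheap to implement. As a sketch, assume formulas are held as nested tuples and variables follow the Prolog/TPTP convention of starting with an upper-case letter; the helper `free_vars` (a hypothetical name) then reports variables that no quantifier binds.

```python
def free_vars(formula, bound=frozenset()):
    """Return the set of variables in `formula` not bound by a quantifier.

    Hypothetical encoding: ("forall", var, body), ("exists", var, body),
    (symbol, arg1, ...) for connectives, predicates and functions, and plain
    strings for variables (upper-case initial) or constants (anything else).
    """
    if isinstance(formula, str):
        is_var = formula[:1].isupper()
        return {formula} if is_var and formula not in bound else set()
    head, *args = formula
    if head in ("forall", "exists"):
        var, body = args
        return free_vars(body, bound | {var})
    result = set()
    for arg in args:
        result |= free_vars(arg, bound)
    return result


closed = ("forall", "I", ("leq", "I", ("succ", "I")))
broken = ("forall", "I", ("leq", "I", "J"))   # J was never quantified

closed_free = free_vars(closed)
broken_free = free_vars(broken)
```

A generator-side check like this flags `J` in `broken` before the task ever reaches a prover, which is where such mistakes are otherwise hard to spot.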

18
Lesson 2: TPTP ≠ Real World

Applications and benchmarks have different profiles
full (typed) FOF vs. CNF, UEQ, HNE, EPR, ...
clausification is integral part of ATP
problems in subclasses are rare (almost accidental)
ATPs need to work on full FOF (⇒ branch to specialized solvers hidden)
task stream vs. single task: one problem = many tasks
success only if all tasks proven
most tasks relatively simple, but often large
most problems contain hard tasks
background theory remains stable
ATPs need to minimize overhead (⇒ batch mode)
ATPs should be easily tunable to application domain
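
The gap between task-level and problem-level success follows from simple counting. The sketch below uses made-up outcomes purely for illustration (none of the figures come from the talk), and the function name `success_rates` is hypothetical.

```python
def success_rates(results):
    """results maps a problem name to a list of booleans, one per task (VC).

    A problem counts as solved only if every one of its tasks was proven.
    Returns (fraction of tasks proven, fraction of problems solved).
    """
    tasks = [ok for outcome in results.values() for ok in outcome]
    task_rate = sum(tasks) / len(tasks)
    problem_rate = sum(all(outcome) for outcome in results.values()) / len(results)
    return task_rate, problem_rate


# Illustrative data only: four problems with five tasks each,
# three of them with a single unproven (hard) task.
example = {
    "p1": [True, True, True, True, True],
    "p2": [True, True, True, True, False],
    "p3": [True, False, True, True, True],
    "p4": [True, True, True, False, True],
}
task_rate, problem_rate = success_rates(example)
```

With this data 17 of 20 tasks are proven (85%), yet only one of four problems (25%) is fully certified.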

19
Lesson 2: TPTP ≠ Real World

characteristics (and results) vary with policy
array: partial orders & ground arithmetics
significant fraction of tasks solved, but not problems
most tasks relatively simple, most problems contain hard tasks
problem-oriented view magnifies ATP differences

[Chart: array]
20
Lesson 2: TPTP ≠ Real World

characteristics (and results) vary with policy
array: partial orders & ground arithmetics
init: deeply nested select/update terms
completely overwhelms ATPs
response times determined by Tmax (60 secs.)

[Charts: array, init]
21
Lesson 2: TPTP ≠ Real World

characteristics (and results) vary with policy
array: partial orders & ground arithmetics
init: deeply nested select/update terms
symm: deeply nested select/update terms (but trickier context...)
completely overwhelms ATPs
ATPs only solve trivial VCs

[Charts: array, init, symm]
22
Lesson 2: TPTP ≠ Real World

characteristics (and results) vary with policy
array: partial orders & ground arithmetics
init: deeply nested select/update terms
symm: deeply nested select/update terms (but trickier context...)
array and init are inherently simple
should be solvable by any competitive ATP (low-hanging fruit)
TPTP overtuning?

[Charts: array, init, symm]
23
Lesson 3: Need controlled simplification

Applications generate large conjectures with redundancies
propositional: true ⇒ x, x ⇒ true, ...
arithmetics: 65, minus(6, 1), ...
Hoare-logic: select(update(update(update(x,0,0),1,1),2,2),0)
can frequently be evaluated / simplified before proving: rewriting beats resolution
ATPs should provide (user-controlled) simplification mode
ground-evaluation of built-in functions / predicates
orientation of equalities / equivalences
here: hand-crafted rewrite-based simplifications
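
The redundancies listed above can be removed by a small bottom-up rewriter before any prover is called. The following is a minimal sketch, not the simplifier used in the talk: the tuple encoding of terms, the rule set, and the operator names `eq` and `implies` are assumptions.

```python
def simplify(term):
    """Bottom-up rewriting with a few hand-written rules (illustrative only)."""
    if not isinstance(term, tuple):
        return term                      # constants and symbols are normal forms
    head, *args = term
    args = [simplify(a) for a in args]

    # Ground evaluation of built-in arithmetic (bools are excluded on purpose).
    if head == "minus" and all(type(a) is int for a in args):
        return args[0] - args[1]
    if head == "eq" and all(type(a) is int for a in args):
        return args[0] == args[1]

    # Propositional rules: (true => x) becomes x, (x => true) becomes true.
    if head == "implies":
        if args[0] is True:
            return args[1]
        if args[1] is True:
            return True

    # select over update, decided directly when both indices are ground.
    if head == "select" and isinstance(args[0], tuple) and args[0][0] == "update":
        _, array, index, value = args[0]
        if type(index) is int and type(args[1]) is int:
            if index == args[1]:
                return value
            return simplify(("select", array, args[1]))

    return (head, *args)


nested = ("select",
          ("update", ("update", ("update", "x", 0, 0), 1, 1), 2, 2),
          0)
vc = ("implies", ("eq", ("minus", 6, 1), 5), "p")

nested_result = simplify(nested)
vc_result = simplify(vc)
```

The nested select/update term from the slide collapses to the stored value `0`, and the second term reduces to `"p"` once `minus(6, 1)` and the equality have been evaluated.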

24
Lesson 3: Need controlled simplification

propositional simplification, min-scoping, splitting into VCs
evaluation of ground arithmetics
array: good w/ E / Vampire, neutral w/ Spass, mixed w/ Equinox
init: floods ATPs with hard VCs (other ATPs killed after hours)
symm: splits hard VCs, no benefits

[Charts: array, init, symm]
25
Lesson 3: Need controlled simplification

select(update(x,i,v), j) ⇒ (i=j) ? v : select(x, j)
rules for _?_:_
array, symm: no difference to before
init: dramatic improvement for all ATPs (⇒ eliminates all array accesses)
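
The select/update rule can also be applied when the indices are symbolic, at the price of introducing conditionals. The sketch below assumes a tuple encoding of terms and uses the hypothetical constructor name `ite` for the `_?_:_` operator; the further rules for simplifying conditionals are not shown.

```python
def expand_select(term):
    """Rewrite select(update(x, i, v), j) to ite(i = j, v, select(x, j)).

    Applied recursively, so a select over a chain of updates is replaced by
    nested conditionals and a single select on the base array.
    """
    if not isinstance(term, tuple):
        return term
    head = term[0]
    args = [expand_select(a) for a in term[1:]]
    if head == "select" and isinstance(args[0], tuple) and args[0][0] == "update":
        _, array, i, v = args[0]
        j = args[1]
        return ("ite", ("eq", i, j), v, expand_select(("select", array, j)))
    return (head, *args)


term = ("select", ("update", ("update", "a", "i", "v"), "k", "w"), "j")
expanded = expand_select(term)
```

For the example, `expanded` is `ite(k = j, w, ite(i = j, v, select(a, j)))`: no update term is left, only a read of the base array `a`.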

[Charts: array, init, symm]
26
Lesson 3: Need controlled simplification

domain-specific simplifications (mostly symmetry)
symm(update(m,k,v)) ⇒ ∀i,j · select(update(m,k,v),i,j) = select(update(m,k,v),j,i)
array, init: no substantial difference to before (but fewer VCs)
symm: dramatic improvement for all ATPs

[Charts: array, init, symm]
27
Lesson 4: Need axiom selection

Application domain theories are often large but largely disjoint
core theory formalizes underlying programming language
array and init essentially part of core theory
property-specific theories
introduce symbols only occurring in tasks for the given property
contain lemmas only relevant to tasks for the given property
intuitively not required for other properties
should have no detrimental effect on ATP performance

28
Lesson 4: Need axiom selection

Example: init with redundant symm-axioms
explicit symmetry-predicate:
∀i,j · select(m,i,j) = select(m,j,i) ⇔ symm(m)
symm(n) ∧ symm(m) ⇒ symm(madd(n, m))
implicit symmetry:
∀i,j · select(n,i,j) = select(n,j,i) ∧ select(m,i,j) = select(m,j,i) ⇒ select(madd(n,m),i,j) = select(madd(n,m),j,i)

[Charts: init + imp. symm, init, init + exp. symm]
29
Lesson 4: Need axiom selection

ATPs shouldn't flatten structure in the domain theory
similar to distinction conjecture vs. axioms
coarse selection based on file structuring can be controlled by application
detailed selection based on signature of conjecture might be beneficial (cf. Reif / Schellhorn 1998)
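
Signature-based selection can be sketched as a reachability computation over shared symbols. This is a generic heuristic written for illustration, not the method of the cited work; the function `select_axioms`, the axiom names, and the arity-qualified symbol names are assumptions.

```python
def select_axioms(conjecture_symbols, axioms):
    """Keep axioms reachable from the conjecture through shared symbols.

    `axioms` maps an axiom name to the set of symbols it mentions. An axiom is
    selected if it shares a symbol with the conjecture or with an axiom that
    was already selected; this repeats until nothing changes.
    """
    relevant = set(conjecture_symbols)
    selected = set()
    changed = True
    while changed:
        changed = False
        for name, symbols in axioms.items():
            if name not in selected and symbols & relevant:
                selected.add(name)
                relevant |= symbols
                changed = True
    return selected


# Hypothetical theory: a core part and a symmetry-specific part.
theory = {
    "select_update": {"select/2", "update/3"},
    "leq_trans": {"leq/2"},
    "symm_def": {"symm/1", "select/3"},
    "symm_madd": {"symm/1", "madd/2"},
}
chosen = select_axioms({"select/2", "leq/2", "init/0"}, theory)
```

For an init-style conjecture only the two core axioms are kept. Symbols that occur almost everywhere, such as equality, would have to be ignored in practice, otherwise they connect the whole theory.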

30
Lesson 5: Need built-in theory support

Applications use the same domain theories over and over again
It's disgraceful that we have to define integers using succ's, and make up our own array syntax
significant engineering effort
no standards ⇒ no reuse
hand-crafted axiomatizations are
typically incomplete
typically sub-optimal for ATP
typically need generators to handle occurring literals
can add 3 = succ(succ(succ(0))) but what about 1023?
shell scripts generate axioms for
Presburgerization, finite domains (small numbers only)
ground orders (gt/leq), ground arithmetic (all occurring numbers)
often inadequate (i.e., wrong)
often inconsistent
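
A generator for ground order axioms over the occurring literals might look like the sketch below. It emits TPTP FOF axioms; the predicate name `gt` and the scheme of representing the number k by a constant `n<k>` are assumptions made for this example, not the encoding used by the authors' shell scripts.

```python
from itertools import combinations


def ground_order_axioms(literals):
    """Generate TPTP FOF axioms ordering all occurring integer literals.

    Each number k is represented by a constant n<k>. Only non-negative
    literals are handled in this sketch.
    """
    numbers = sorted(set(literals))
    if any(k < 0 for k in numbers):
        raise ValueError("negative literals are not handled in this sketch")
    axioms = []
    for small, large in combinations(numbers, 2):
        axioms.append(f"fof(gt_{large}_{small}, axiom, gt(n{large}, n{small})).")
    return axioms


axioms = ground_order_axioms([0, 3, 1023, 3])
```

The number of generated facts grows quadratically with the number of distinct literals, and the facts say nothing about numbers that do not occur, which is one way such hand-made theories end up inadequate.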

31
Lesson 5: Need built-in theory support

FOL ATPs should steal from SMT-LIB and HO systems and provide libraries of standard theories
TPTP TFF (typed FOL and built-in arithmetics) is a good start
Your ATP should support that!
SMT-LIB is a good goal to aim for
theories can be implemented by
axiom generation
decision procedures
Your ATP should support that!

32
Lesson 6: Need ATP Plug&Play

No single off-the-shelf ATP is optimal for all problems
combine ATPs with different preprocessing steps
clausification
simplification
axiom selection
ATP combinators
sequential composition
parallel competition
TPTPWorld is a good ATP harness but not a gluing framework
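
The combinators named above can be sketched as higher-order functions over provers. In this illustration a prover is any callable that takes a task and returns `True` (proved), `False` (refuted), or `None` (gave up); the three stand-in provers and all function names are hypothetical, and no real ATP is invoked.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def sequential(*provers):
    """Sequential composition: try each prover in turn until one answers."""
    def combined(task):
        for prover in provers:
            result = prover(task)
            if result is not None:
                return result
        return None
    return combined


def parallel(*provers):
    """Parallel competition: run all provers, take the first definite answer."""
    def combined(task):
        with ThreadPoolExecutor(max_workers=len(provers)) as pool:
            futures = [pool.submit(prover, task) for prover in provers]
            for future in as_completed(futures):
                result = future.result()
                if result is not None:
                    return result
        return None
    return combined


def preprocess(step, prover):
    """Put a preprocessing step (e.g. simplification) in front of a prover."""
    return lambda task: prover(step(task))


# Stand-in provers that only recognize task labels.
def ground_only(task):
    return True if task == "ground" else None


def gives_up(task):
    return None


def strong(task):
    return True if task in ("ground", "nested") else None


portfolio = parallel(ground_only, sequential(gives_up, strong))
answer = portfolio("nested")
```

Note that this thread-based version still waits for the losing provers to finish; a real harness would run the ATPs as subprocesses and terminate the losers once an answer is in.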

33
Conclusions

(Software engineering) applications are tractable
no fancy logic, just POFOL and normal theories
different characteristics than TPTP (except for SWC and SWV :-)
Support theories!
theory libraries, built-in symbols
light-weight support sufficient
ATP customization
exploit regularities from application
user-controlled pre-processing
grab low-hanging fruits!
Success = Proof and Integration
need robust tools
need Plug&Play framework
