Title: Verification of Embedded Software in Industrial Microprocessors
1Verification of Embedded Software in Industrial Microprocessors
- Eli Singerman
- Design and Technology Solutions
- Intel Corp.
- FMCAD2007, Austin
2Acknowledgements
- Our Intel team
- Michael Mishaeli, Elad Elster, Ronen Kofman,
Tamarah Arons, Andreas Tiemeyer, Shlomit Ozer,
Jonathan Shalev, Pavel Mikhlin, Terry Murphi,
Sela Mador-Haim (past)
- Academic partners
- Lenore Zuck, Amir Pnueli, Moshe Vardi
3Outline
- Motivation
- Embedded Software Intro
- Characteristics
- Verification Landscape
- Application of Formal Methods
- Modeling
- Reasoning
- Verification Flows
- The way ahead
- Related work
4Motivation
- In modern Intel processors, several advanced new
technologies are provided through embedded
software
- E.g., virtualization, security
- We expect this to grow as CPUs gradually move to
SoC design paradigm
5Motivation cont.
- Verification is on the critical path
- Limits the introduction of new features
- Dominant in Time-To-Market
- Embedded software (mostly microcode) is
responsible for a significant portion of bugs
- Implementation is challenging (and error prone)
- Verification methodologies/tools/technologies are
challenged
- In this talk, I will overview some of the
directions we have pursued in addressing this
growing gap (extending what we reported at
CAV05)
6Outline
- Motivation
- Embedded Software Intro
- Characteristics
- Verification Landscape
- Application of Formal Methods
- Modeling
- Reasoning
- Verification Flows (it is not all FV)
- The way ahead
- Related work
7Characteristics: Basics
- Programs are low level (sub-assembly); let us
call them microprograms
- The basic data types are bits and bit-vectors
- Denoting state variables such as registers,
memory, microarchitectural control and
configuration bits
8
- Control flow is facilitated by jump-to-label
instructions
- Where the labels appear in the program or
indicate a call to an external procedure
- Sometimes, a label + offset is used
- It is possible to have indirect jumps (target is
known only at run time)
- Have loops (non-recursive, almost all with a fixed
bound)
9Characteristics: Atomic statements
- Atomic statements of microprograms are called
microinstructions
- These are implemented via dedicated hardware
executed in various units of the CPU (e.g., ALU)
- Can be thought of as functions that (typically)
get as input two register arguments, perform some
computation and assign the final result to a
third register
- Possibly raising various exceptions
10Example (not real)
- A simple microprogram (not actual)
- Registers are bitvectors of width 64
- Memory is an array[32][64]
- That is, an address space of 32 bits is mapped to
entries of 64 bits
- memory_read and add are microinstructions
BEGIN FLOW(example)
  reg1 := memory_read(reg2, reg3)
  reg4 := add(reg1, reg5)
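As a rough illustration of this state model, the Python sketch below treats registers as 64-bit values and memory as a sparse map from 32-bit addresses to 64-bit entries. The function bodies are assumptions made for illustration (in particular, that memory_read adds its two arguments to form the address and that unmapped entries read as 0); they are not the real microinstruction semantics.

```python
MASK64 = (1 << 64) - 1   # registers are 64-bit bit-vectors
MASK32 = (1 << 32) - 1   # memory addresses are 32 bits wide

def add(a, b):
    """Hypothetical add: 64-bit wrap-around addition."""
    return (a + b) & MASK64

def memory_read(memory, base, offset):
    """Hypothetical memory_read: read the 64-bit entry at base + offset.

    Address range checks are ignored here; the address is simply
    truncated to 32 bits and unmapped entries read as 0 (assumptions).
    """
    address = (base + offset) & MASK32
    return memory.get(address, 0) & MASK64

def run_example(regs, memory):
    """Execute the two statements of FLOW(example) on a copy of regs."""
    regs = dict(regs)
    regs["reg1"] = memory_read(memory, regs["reg2"], regs["reg3"])
    regs["reg4"] = add(regs["reg1"], regs["reg5"])
    return regs

regs = {"reg1": 0, "reg2": 0x1000, "reg3": 0x8, "reg4": 0, "reg5": 1}
memory = {0x1008: MASK64}   # sparse memory: address -> 64-bit entry
final = run_example(regs, memory)
```

Here reg1 receives the all-ones entry and the subsequent add wraps around to 0, which is the kind of bit-vector corner case a data-aware verification flow has to account for.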
11Characteristics: Hardware Interaction
- In addition to invoking microinstructions, we
have implicit interaction by reading/setting
various shared machine-state variables
- E.g., memory (both persistent and volatile),
special control bits for signaling
microarchitectural events, etc.
- The latter is used for governing the
microarchitectural state and is mostly modeled as
side-effects (not visible in the source code)
12Characteristics: Termination
- All microprogram execution paths are finite
(hopefully)
- Except for spin-loops waiting for a HW event to
occur in order to resume execution
- Two types of exits
- Normal: nothing bad happened
- Exception: some fault occurred, e.g., arithmetic
(underflow), memory (page not found)
13Verification Landscape
- Simulation based
- Try to cover all possible execution paths in each
microprogram (at least once)
[Diagram: Source, Lint Checker, Path Manager, Paths DB, Test Generator, Tests, Simulator, HW Env model]
14Major Limitations
- Path extraction is manual -- annotation based
- Very difficult to write and get correct
- Missing real paths
- Generating un-real paths that can never be
covered
- Significant maintenance burden
- Test -> Simulate -> Cover loop is too long
- Verification is control oriented
- In critical microprograms data should be taken
into account
15Outline
- Motivation
- Embedded Software Intro
- Characteristics
- Verification Landscape
- Application of Formal Methods
- Modeling
- Reasoning
- Verification Flows (it is not all FV)
- The way ahead
- Related work
16Modeling
17Modeling: IRL
- Native embedded SW dialects are extremely
complex, with many implicit side-effects and
intricate semantics
- We introduce an intermediate format, which we
call IRL (Intermediate Representation Language)
- IRL is a simple programming language
- Basic data types are bits and bit-vectors (with a
rich set of operations)
- Basic statements are conditional assignments and
GOTOs
- IRL is
- Expressive enough to describe fully and
explicitly the behavior of microprograms and
their (implicit) interaction with their hardware
environment at the right abstraction level
- Yet, its sequential semantics is simple enough to
enable formal reasoning
18Constructing an IRL model
- IRL uses a Template Mechanism
- Each microinstruction is implemented by an IRL
template
- The template body is a sequence of (plain) IRL
statements that compute the effect of the
microinstruction
- Including side effects of a microinstruction
computation, by updating the relevant auxiliary
variables
- When compiling a microprogram, templates are
instantiated to generate IRL code
- This enables
- Compositional build
- Write once, use many times
- In addition, exceptions/faulty behaviors are
modeled as executions at the end of which various
variables are made observable
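The template mechanism can be pictured as parameter substitution. The Python sketch below is only an illustration under assumptions: the template table, the statement syntax (":=", "=") and the function name instantiate are hypothetical and are not the actual compiler.

```python
import re

# Hypothetical template table: formal parameters plus a body of IRL-like
# statements. The concrete syntax is assumed for illustration only.
TEMPLATES = {
    "add": (
        ["result", "src1", "src2"],
        ["result := src1 + src2", "zeroFlag := (result = 0)"],
    ),
}

def instantiate(name, actuals):
    """Expand one microinstruction call into plain IRL-like statements."""
    formals, body = TEMPLATES[name]
    if len(formals) != len(actuals):
        raise ValueError(f"{name} expects {len(formals)} arguments")
    binding = dict(zip(formals, actuals))
    # Replace whole identifiers only, in a single pass, so that an actual
    # argument is never substituted a second time.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, formals)) + r")\b")
    return [pattern.sub(lambda m: binding[m.group(1)], line) for line in body]

# The call reg4 := add(reg1, reg5) from a microprogram becomes:
irl_code = instantiate("add", ["reg4", "reg1", "reg5"])
```

The side effect on zeroFlag is invisible at the call site but explicit in the generated code, which is the point of writing each template once and reusing it for every occurrence of the microinstruction.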
19Example
- A simple microprogram (not actual uCode)
- Registers are bitvectors of width 64
- Memory is an array[32][64]
- That is, an address space of 32 bits is mapped to
entries of 64 bits
- memory_read and add are microinstructions
BEGIN FLOW(example)
  reg1 := memory_read(reg2, reg3)
  reg4 := add(reg1, reg5)
20IRL Templates for microinstructions
- Note that a side effect setting zeroFlag is
explicit (for simplicity, we ignore the
possibility of add overflow)
- memory_read is more involved
- Includes a possible exception. The address is
calculated as tmp_address + offset.
- If this is out of the memory address range of 32
bits, then an address overflow exception is
signaled with relevant variables
template add (reg result, reg src1, reg src2)
  result := src1 + src2
  zeroFlag := (result = 0)
21IRL Templates cont.
exception address_overflow(bit[32] address)
template memory_read(reg result, reg tmp_address, reg offset)
  TMP0 := tmp_address + offset
  if (TMP0 > 0xFFFFFFFF) exit address_overflow(TMP0[63:32])
  result := memory[TMP0[31:0]]
  zeroFlag := (result = 0)
  Found_valid_address := 1
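One way to read the memory_read template is as a function with two exits. The Python sketch below follows the template statement by statement. The state layout, the 64-bit width of TMP0 and the default value 0 for unmapped memory entries are assumptions for illustration.

```python
MASK64 = (1 << 64) - 1

def memory_read(state, result, tmp_address, offset):
    """Run the memory_read template on a state dict.

    Returns (exit_name, observables). The exit is "normal" unless the
    computed address does not fit in 32 bits.
    """
    regs, memory = state["regs"], state["memory"]
    # TMP0 := tmp_address + offset (assumed to be a 64-bit register)
    tmp0 = (regs[tmp_address] + regs[offset]) & MASK64
    if tmp0 > 0xFFFFFFFF:
        # exit address_overflow(TMP0[63:32]): expose the upper 32 bits
        return "address_overflow", {"address": tmp0 >> 32}
    # result := memory[TMP0[31:0]] (unmapped entries read as 0, an assumption)
    regs[result] = memory.get(tmp0 & 0xFFFFFFFF, 0)
    state["zeroFlag"] = int(regs[result] == 0)     # explicit side effect
    state["Found_valid_address"] = 1               # auxiliary variable
    return "normal", {}

state = {"regs": {"reg1": 0, "reg2": 0xFFFFFFFF, "reg3": 1}, "memory": {}}
exit_name, observables = memory_read(state, "reg1", "reg2", "reg3")
```

In this call the sum 0x100000000 exceeds the 32-bit range, so the function leaves through the address_overflow exit and only the exception's observable is reported.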
22Formal model
- Symbolic transition system (Manna et al.)
- States defined by means of state variables
- Transitions defined by means of logical
constraints between pre and post values
- Multiple exits (doors)
- Each exit has its own observable expressions
23Reasoning
24Overview
- Reasoning is done through an IRL symbolic
simulator
- All inputs are assigned initial symbolic
values
- Memory interaction is modeled via uninterpreted
functions using a stack mechanism
- Constraints computed by the simulator are
propositional formulas involving bit-vector
expressions over initial values
- We compute necessary and sufficient conditions to
traverse any path the program can execute
- For each path, compute the final state mapped to
selected observables
- For evaluation, the conditions are submitted to
a propositional SAT solver
- We are very encouraged by initial results using
academic bit-vector solvers, more on this later
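The idea of path conditions and final states over initial values can be sketched in a few lines of Python. This is a toy under stated assumptions, not the simulator described in the talk: the instruction encoding, the expression tuples and the sample program are hypothetical, memory is omitted, loops are not handled, and exhaustive search over 2-bit values stands in for the SAT solver.

```python
import operator
from itertools import product

OPS = {"gt": operator.gt, "lt": operator.lt}

# Expressions are nested tuples over initial values:
#   ("var", name) | ("const", n) | ("not", e) | (op, lhs, rhs)

def substitute(expr, state):
    """Rewrite an expression over current variables into initial values."""
    kind = expr[0]
    if kind == "var":
        return state[expr[1]]
    if kind == "const":
        return expr
    return (kind,) + tuple(substitute(e, state) for e in expr[1:])

def evaluate(expr, env):
    """Evaluate an expression under a concrete initial assignment."""
    kind = expr[0]
    if kind == "var":
        return env[expr[1]]
    if kind == "const":
        return expr[1]
    if kind == "not":
        return not evaluate(expr[1], env)
    return OPS[kind](evaluate(expr[1], env), evaluate(expr[2], env))

def simulate(program, variables):
    """Return (path_condition, exit_name, final_state) for every path.

    Only loop-free programs are supported in this sketch.
    """
    initial = {v: ("var", v) for v in variables}
    paths, worklist = [], [(0, initial, [])]
    while worklist:
        pc, state, cond = worklist.pop()
        instr = program[pc]
        if instr[0] == "assign":        # ("assign", target, expr)
            new_state = dict(state)
            new_state[instr[1]] = substitute(instr[2], state)
            worklist.append((pc + 1, new_state, cond))
        elif instr[0] == "goto":        # ("goto", target_pc)
            worklist.append((instr[1], state, cond))
        elif instr[0] == "branch":      # ("branch", condition, target_pc)
            test = substitute(instr[1], state)
            worklist.append((instr[2], state, cond + [test]))
            worklist.append((pc + 1, state, cond + [("not", test)]))
        else:                           # ("exit", name)
            paths.append((cond, instr[1], state))
    return paths

def feasible(cond, variables, width=2):
    """Brute-force stand-in for a SAT query over small bit-vectors."""
    for values in product(range(1 << width), repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(evaluate(c, env) for c in cond):
            return True
    return False

# Hypothetical program: m := max(a, b)
MAX_PROGRAM = [
    ("branch", ("gt", ("var", "a"), ("var", "b")), 3),
    ("assign", "m", ("var", "b")),
    ("goto", 4),
    ("assign", "m", ("var", "a")),
    ("exit", "normal"),
]
VARS = ["a", "b", "m"]
paths = simulate(MAX_PROGRAM, VARS)
```

Each entry of paths pairs the condition for taking that path with the final value of every variable, both expressed over the initial values of a and b.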
25System Diagram
[Diagram: IRL, MicroFormal, Compiler, Symbolic Simulator, Verification Conditions generator, SAT Solver, Debug]
26Symbolic Simulation: Some obstacles
- Simulating industrial-strength embedded SW
requires resolving some difficulties
- First, expressions get REALLY BIG, e.g., a
microprogram can consist of thousands of
execution paths, several of which have a
sequential length of 10^4
- In addition, have to handle indirect jump
statements
- Lastly, have to account for loops
- Have to handle these on-the-fly due to the first
issue
27Symbolic Simulation: Partial answers
- Avoiding expression blow-up by dynamic
- Pruning of infeasible execution paths (evaluate
path condition on-the-fly)
- Merging of simulation paths at strategic control
locations (both automatically detected and user
provided)
- Resolution (at least reduction) of indirect jump
targets by a combination of static expression
analysis and SAT
- Caching and grouping of conditions
- Speed up current simulation using control info
computed at previous simulations
More details available in our coming DATE08 paper
28Example
BEGIN(toy_program)
start:
I1: if (CPL > 0) fault
I2: if (EAX > 7) then EBX := 8 else EBX := EAX - 2
I3: if (EBX < 5) goto skip_mask
I4: EAX := EAX & 0x000F
skip_mask:
I5: if EAX < EBX fault
29Path Analysis
Path Conditions
P0: (CPL0 > 0) -> fault
P1: (CPL0 = 0) and (EBX1 < 5) and (EAX0 < EBX1) -> fault
P2: (CPL0 = 0) and (EBX1 < 5) and (EAX0 >= EBX1) -> end
P3: (CPL0 = 0) and (EBX1 >= 5) and (EAX1 < EBX1) -> fault
P4: (CPL0 = 0) and (EBX1 >= 5) and (EAX1 >= EBX1) -> end
where EBX1 = (EAX0 > 7) ? 8 : EAX0 - 2 and EAX1 = EAX0 & 0x000F
[Diagram: control-flow graph of toy_program from start/I1 through I2, I3, I4 and skip_mask/I5 to the exits P0 to P4, with edges labeled by the branch conditions above. Two annotations mark simulator steps: "Remove infeasible" and "Merge". After merging at skip_mask/I5, the two branch conditions are ((EBX1 < 5) ? EAX0 : EAX1) < EBX1 -> fault and ((EBX1 < 5) ? EAX0 : EAX1) >= EBX1 -> end.]
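The path analysis of the toy program can be replayed by exhaustive search. The Python sketch below makes several assumptions: registers are unsigned and only 8 bits wide (to keep the search small), CPL ranges over 0 to 3, and the operators lost from the slide are read as EBX := EAX - 2 and EAX := EAX & 0x000F. A real flow would pass these conditions to a SAT solver instead.

```python
MASK = 0xFF  # assumption: 8-bit unsigned registers keep the search small

def symbols(cpl0, eax0):
    """Values of the symbolic names used in the path conditions."""
    ebx1 = 8 if eax0 > 7 else (eax0 - 2) & MASK   # effect of I2
    eax1 = eax0 & 0x000F                          # effect of I4
    return cpl0, eax0, ebx1, eax1

PATHS = {
    "P0": lambda cpl0, eax0, ebx1, eax1: cpl0 > 0,
    "P1": lambda cpl0, eax0, ebx1, eax1: cpl0 == 0 and ebx1 < 5 and eax0 < ebx1,
    "P2": lambda cpl0, eax0, ebx1, eax1: cpl0 == 0 and ebx1 < 5 and eax0 >= ebx1,
    "P3": lambda cpl0, eax0, ebx1, eax1: cpl0 == 0 and ebx1 >= 5 and eax1 < ebx1,
    "P4": lambda cpl0, eax0, ebx1, eax1: cpl0 == 0 and ebx1 >= 5 and eax1 >= ebx1,
}

def merged_fault(cpl0, eax0, ebx1, eax1):
    """Fault condition at I5 after merging the two incoming paths."""
    return cpl0 == 0 and (eax0 if ebx1 < 5 else eax1) < ebx1

def witnesses(condition):
    """All initial (CPL, EAX) pairs that satisfy a path condition."""
    return [(cpl0, eax0)
            for cpl0 in range(4)
            for eax0 in range(MASK + 1)
            if condition(*symbols(cpl0, eax0))]

feasible = {name: bool(witnesses(cond)) for name, cond in PATHS.items()}
infeasible = [name for name, ok in feasible.items() if not ok]
```

Under these assumptions only P1 has no witness: EBX1 < 5 forces EBX1 = EAX0 - 2 with EAX0 between 2 and 6, so EAX0 < EBX1 cannot hold. Any element of witnesses(PATHS["P3"]) is an input that drives the program to the fault at I5 through the masking step.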
30Verification flows
31Assist Standard Verification
- Compute paths automatically w/o manual annotations
- Guide a direct test generator
- Note: these have far more impact than formal
verification flows
[Diagram: Source, Lint Checker, Path Manager, Paths DB, Test Generator, Tests, Simulator, HW Env model]
32Test Generation
- Using the path conditions, tests can be created
to exercise paths in simulation
- State mapped between microarchitectural
representation and architectural means of setting
it
- The program simulated is only a small piece of
the whole environment simulated
- Other structures are needed to bootstrap the test,
handle faulting conditions, and reach the program
under test
- For this reason, it is not possible to simply
jam the values (example: memory layout)
33Formal Property Verification
- Goal: verify intended (partial) behavior of
microprograms
- Specifying properties
- Essentially, state predicates expressing
uArch/Arch properties
- User can specify what and where to check
- These are natural based on control flow of the
program, relating to significant program
locations
- Examples
- If EAX[3] = true at start then XYZ = true at
end
- If ECX contains an initial value of a given set,
then a General Protection Fault will occur
- Basic directives
- Assume a predicate at a specific location
- Assert (verify) a predicate at a specific
location (or at a set of locations)
- Constrain program simulation paths during
execution
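The Assume and Assert directives can be illustrated with a small checker. The Python sketch below is hypothetical throughout: the program toy, the predicate names and the function check are invented for illustration, predicates are attached only to the start and end locations, and exhaustive search over 4-bit values stands in for symbolic simulation plus SAT.

```python
from itertools import product

def check(program, variables, assume, assertion, width=4):
    """Return a counterexample (initial state) or None if the property holds.

    assume is evaluated on the initial state; assertion is evaluated on
    the initial state, the exit taken and the final state.
    """
    for values in product(range(1 << width), repeat=len(variables)):
        start = dict(zip(variables, values))
        if not assume(start):
            continue                       # Assume: skip excluded states
        exit_name, end = program(dict(start))
        if not assertion(start, exit_name, end):
            return start                   # Assert violated
    return None

def toy(state):
    """Hypothetical microprogram: fault on reserved ECX values."""
    if state["ECX"] in (1, 2):
        return "general_protection_fault", state
    state["EAX"] = (state["EAX"] + state["ECX"]) & 0xF
    return "normal", state

def reserved(start):
    return start["ECX"] in (1, 2)

def faults(start, exit_name, end):
    return exit_name == "general_protection_fault"

def never_faults(start, exit_name, end):
    return exit_name == "normal"

# Property in the style of the second example on the slide.
result_ok = check(toy, ["EAX", "ECX"], reserved, faults)
# A property that does not hold, to show a counterexample.
result_bad = check(toy, ["EAX", "ECX"], lambda start: True, never_faults)
```

The first query holds, so it returns None. The second returns an initial state with a reserved ECX value, which is the kind of counterexample a debug flow would present to the user.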
34Formal Equivalence Verification
- Goals
- Ensure backward compatibility w/ IA-32
- Verify that optimizations do not break
functionality
- Given two microprograms: new, legacy
- A set of (global) constraints
- Mapping predicates relating the two different
CPU micro-architectures
- Predicates specifying that new features are
disabled
- new is backward compatible with legacy if
both exhibit the same observable behavior (under
constraints)
- For every initial state, both exit in the same
manner
- Either both reach normal exit or both have the
same fault
- Both produce the same values on relevant
observables
- Both write the same values into the same
locations of external memory, in the same order
- Compatible means equivalent under constraints
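The compatibility criterion above can be written down directly as a comparison of exits, observables and ordered memory writes. The Python sketch below uses two hypothetical microprograms and exhaustive search over 4-bit values in place of symbolic simulation. The names legacy, new, find_difference and the feature bit are assumptions for illustration.

```python
from itertools import product

def legacy(state):
    """Hypothetical legacy microprogram: doubles x and stores it."""
    if state["addr"] > 0xB:
        return "fault", {}, []
    value = (state["x"] + state["x"]) & 0xF
    return "normal", {"result": value}, [(state["addr"], value)]

def new(state):
    """Hypothetical optimized microprogram with an extra feature bit."""
    if state["addr"] > 0xB:
        return "fault", {}, []
    value = (state["x"] << 1) & 0xF
    if state["feature"]:                  # new behavior, absent in legacy
        value ^= 1
    return "normal", {"result": value}, [(state["addr"], value)]

def find_difference(p, q, variables, constraint, width=4):
    """Return the first constrained initial state on which p and q differ.

    Each program returns (exit_name, observables, memory_writes), where
    memory_writes is an ordered list of (address, value) pairs, so tuple
    equality covers the exit, the observables and the write order.
    """
    for values in product(range(1 << width), repeat=len(variables)):
        state = dict(zip(variables, values))
        if constraint(state) and p(dict(state)) != q(dict(state)):
            return state
    return None

VARS = ["x", "addr", "feature"]
# Constraint: the new feature is disabled.
compatible = find_difference(legacy, new, VARS, lambda s: s["feature"] == 0)
# Without the constraint the two programs are not equivalent.
unconstrained = find_difference(legacy, new, VARS, lambda s: True)
```

With the feature bit constrained to 0 no difference is found, while dropping the constraint yields a distinguishing initial state. This mirrors the statement that compatible means equivalent under constraints.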
35Full Functional Verification of Instructions
- Goals
- Verify correct input/output behavior against a
full architectural specification
- Account for both software and hardware
implementation
- We developed a high-level specification in IRL,
based on the programmer's reference manual
- Since we have an IRL representation for the SW
implementation, its verification reduces to
checking equivalence of IRL programs
- With some auxiliary mapping to bridge the
abstraction gap
- The HW RTL implementation of microinstructions is
formally verified separately
- Again, using symbolic simulation (STE)
- Together, this implies full verification (for
in-order execution)
- Sometimes, it works :-)
36Outline
- Motivation
- Embedded Software Intro
- Characteristics
- Verification Landscape
- Application of Formal Methods
- Modeling
- Reasoning
- Verification Flows (it is not all FV)
- The way ahead
- Related work
37Summary and future work
- In the past couple of years, we have made
progress in the introduction of formal methods to
verification of embedded SW at Intel
- Current toolset provides automation in several key
verification activities
- Contributes to quality and productivity of
traditional methods
- Future (and on-going) work includes
- Application (and adaptation) to other types of
embedded SW
- On-going search for efficiency improvements at
all levels
- On the solver level we are very encouraged by
latest results using academic word-level solvers
38Thanks!
39Some related work
- R.E. Bryant, Symbolic simulation techniques and
applications, DAC1990
- D. Currie et al, Embedded software verification
using symbolic execution and uninterpreted
functions, Int. J. Parallel Program, 2006
- A. Koelbl and C. Pixley, Constructing efficient
formal models from high-level descriptions using
symbolic simulation, Int. J. Parallel Program,
2005
- D. Babic and A. Hu, Structural abstraction of
software verification conditions, CAV2005
- D. Currie, A. Hu and S. Rajan, Automatic formal
verification of DSP software, DAC2000
- C. Flanagan and J. Saxe, Avoiding exponential
explosion: generating compact verification
conditions, POPL2001
- E.M. Clarke, D. Kroening and K. Yorav,
Behavioral consistency of C and Verilog programs
using bounded model checking, DAC2003
- R. C. Ho et al, Architecture validation for
processors, Proc. Int. Symp. Computer
Architecture (ISCA95)
- D. Lugato et al, Automated Functional Test Case
Synthesis from THALES industrial Requirements,
RTAS04
- P. Mishra and N. Dutt, Graph-Based Functional
Test Program Generation for Pipelined
Processors, DATE2004
- S. Ur and Y. Yadin, Micro Architecture coverage
directed generation of test programs, DAC1999
40More related work
- D. Cyrluk, Microprocessor Verification in PVS: A
Methodology and Simple Example,
Technical Report SRI-CSL-93-12, 1993
- S.Y. Huang and K.T. Cheng, Formal Equivalence
Checking and Design Debugging, Kluwer, 1998
- D. Harel and A. Pnueli, On the development of
reactive systems, In Logics and Models of
Concurrent Systems, 1985
- A. Pnueli, M. Siegel, and E. Singerman,
Translation validation, In TACAS1998
- J. Sawada and W.A. Hunt, Verification of FM9801:
An out-of-order microprocessor model with
speculative execution, exceptions, and
program-modifying capability, J. on Formal
Methods in System Design, 2002
- M. Srivas and S. Miller, Applying formal
verification to the AAMP5 microprocessor: A case
study in the industrial use of formal
methods, J. on Formal Methods in System Design, 1996
- A. Aharon et al, Test Program Generation for
Functional Verification of PowerPC Processors in
IBM, DAC1995
- R. S. Boyer, B. Elspas and K. N. Levitt, SELECT:
a formal system for testing and debugging
programs by symbolic execution, 1975
- T. Ball and S. K. Rajamani, Automatically
Validating Temporal Safety Properties of
Interfaces, SPIN 2001
41Yet, some more
- S. Fine and A. Ziv, Coverage Directed Test
Generation for Functional Verification using
Bayesian Networks, DAC2003
- D. Geist et al, Coverage directed test
generation using symbolic techniques, FMCAD1996
- A. Gupta et al, Property-Specific Testbench
Generation for Guided Simulation, VLSID2002
- T. Arons et al, Formal verification of backward
compatibility of microcode, CAV2005
- T. Arons et al, Embedded Software Validation:
Applying formal techniques for coverage and test
Generation, MTV2006
- T. Arons et al, Efficient symbolic simulation of
low-level software, to appear in DATE2008