Title: Language-Based Safety Mechanisms
1 Language-Based Safety Mechanisms
- Stanford University CS 444A, Autumn 99: Software Development for Critical Applications
- Armando Fox, David Dill: {fox,dill}_at_cs.stanford.edu
2 Concepts Overview & Outline
- Static approaches
  - Safe by design (limiting the language)
  - Static analysis/type-safe languages
- Dynamic approaches
  - Interpreters and sandboxes
  - Dynamic dataflow analysis
- A few examples (and problems)
  - Java, the Exokernel, VMware, SFI, Janus, Interface Compilation
- As usual, each bullet is the subject of volumes of papers; this is just an introduction to the landscape
3 Contrast With David's Req. Spec.
- RS (requirements specification) is about verifying a program (or FSM) in the abstract
- SFI is about securing them in practice
- The two are complementary
  - Ex.: Transitions in FSM cover all possibilities
  - What is "all", really?
  - Recall dreaming up desired emergent properties
- Compare: Intel P6 bus protocol verification vs. implementation validation
4 What Is Safety in This Context?
- Primary emphasis: prevent a buggy/malicious app from doing harm to others
  - Don't interfere with other apps directly (read/write their data or files)
  - Don't interfere with other apps indirectly (hog OS resources so other apps are denied service)
  - Don't crash or corrupt the OS
    - Particularly important, since the OS usually is the trusted arbiter of limited resources
- Non-goal: stability of the isolated app
5 Techniques
- Two basic families of techniques
  - 1. Limit things at runtime
  - 2. Limit things at compile time
- Many schemes use a combination of both
- Runtime schemes typically rely on some OS and/or hardware support
6 Background: The Thin Red Line
- Separates untrusted user space(s) from trusted kernel space
  - Kernel manages hardware, shared resources, ...
  - If you can bend the kernel to your will, you can do serious damage
- Typical implementation: hardware VM (virtual memory) support
  - Each user process has its own page tables (managed by the kernel)
  - Certain addresses mapped to kernel pages
[Figure: user code and kernel code, separated by the programming model]
7 Call Gates
- Call gates (or call descriptors, or traps, or ...)
  - Controlled breach in the thin red line
  - Typically involve an address space change, which relies on VM (virtual memory), so they are slow and expensive
  - Implementation often uses the exception-handling capability of the processor
[Figure: user code and kernel code]
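The call-gate idea can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `Kernel` class, the gate numbers, and the routine names are hypothetical, and Python's leading underscore is a naming convention rather than real protection. On real hardware the boundary is enforced by the processor and the page tables.

```python
class GateError(Exception):
    """Raised when user code asks for a gate that does not exist."""


class Kernel:
    """Toy kernel: privileged state plus a fixed table of entry points."""

    def __init__(self):
        self._files = {"/etc/motd": "hello"}   # privileged state
        # The gate table is the only sanctioned way across the boundary.
        self._gates = {0: self._sys_read, 1: self._sys_getpid}

    def _sys_read(self, path):
        # The kernel validates every argument; it never trusts the caller.
        if not isinstance(path, str) or path not in self._files:
            return -1
        return self._files[path]

    def _sys_getpid(self):
        return 42

    def trap(self, gate_number, *args):
        """The call gate: user code supplies a number, not a code address."""
        handler = self._gates.get(gate_number)
        if handler is None:
            raise GateError(f"no such gate: {gate_number}")
        return handler(*args)


kernel = Kernel()
motd = kernel.trap(0, "/etc/motd")        # legal crossing
missing = kernel.trap(0, "/etc/shadow")   # validated and refused
```

The point of the design is that user code can choose only among entry points the kernel has published; it cannot jump to an arbitrary kernel address.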
8 Background: Virtual Machines
- In practice, a VM provides a combination of a language execution environment and a pseudo-OS runtime system
  - Guest VM may virtualize hardware resources differently from host OS
  - Safety is often not a primary goal of a VM
- The guest and host OSs may be the same or different with respect to:
  - Machine language/programmer-visible architecture
  - Virtualization of resources
- Common flavor to various approaches: control access to unsafe language/VM features
9 VM Examples
- Java: artificial-machine-in-a-real-machine
  - Provides a language, a runtime, and OS-like abstractions (network, filesystems, etc.)
  - Centralized Java Security Manager enforces security policies
  - For the most part, runs in user mode
- VMware: virtualize any x86 OS inside any other (well, almost)
  - Every VM sees an x86 protected-mode environment
  - Within a VM, policies enforced by guest OS
  - Across VMs, virtualized hardware is isolated
  - User must grant a certain level of trust to VMware host program
10 What Can You Do With This?
- Limit what the language can express
  - Unsafe operations are defined out of existence
  - Never put off till runtime what you can do at compile time
- Limit what can be done at runtime
  - Perhaps in combination with language limiting
- Each approach has pros and cons
11 Static Analysis, Type-Safe Languages
- Goal: To limit the damage a program can do, limit what can be expressed in the source language
  - Assumes binaries are tamper-evident
  - Assumes only trusted tools used to build binaries
  - Assumes trusted tools are working correctly!
- Language features/limitations may allow you to prove some invariants
  - Example: Backward branching disallowed => finite-length programs finish in finite time
  - Example: Pointers disallowed => dangling pointer dereferences vanish
- Contrast: SFI or inserting guard code
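The backward-branching example can be made concrete with a small sketch. The instruction set below (`add`, `jmp`, `jz` with absolute targets) is invented for illustration and is not taken from any of the systems named in these slides. The static check accepts a program only if every branch target lies strictly ahead of the branch, so the program counter can only increase and the program executes at most as many steps as it has instructions.

```python
def verify_forward_only(program):
    """Return True if every jump targets a strictly later instruction.

    `program` is a list of (opcode, operand) tuples; "jmp" and "jz"
    carry an absolute target index, other opcodes are not branches.
    """
    for pc, (op, arg) in enumerate(program):
        if op in ("jmp", "jz"):
            # A target at or before pc could form a loop; a target past
            # the end-of-program slot would leave the program entirely.
            if not (pc < arg <= len(program)):
                return False
    return True


def run(program, acc=0):
    """Interpret a verified program; returns (acc, steps_executed)."""
    assert verify_forward_only(program), "rejected by the static check"
    pc = steps = 0
    while pc < len(program):
        op, arg = program[pc]
        steps += 1
        if op == "add":
            acc += arg
            pc += 1
        elif op == "jmp":
            pc = arg
        elif op == "jz":
            pc = arg if acc == 0 else pc + 1
        else:
            raise ValueError(f"unknown opcode: {op}")
    return acc, steps


ok_program = [("jz", 2), ("add", 5), ("add", 1)]
looping_program = [("add", 1), ("jmp", 0)]   # rejected: jumps backward
```

The cost is expressiveness: no loops can be written at all, which is acceptable for something like a packet filter and too limiting for general code.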
12 Example: SPIN and Modula-3
- SPIN (Bershad et al., early 90s): a user-extensible microkernel
- Extension language: Modula-3, a type-safe, object-oriented language
  - Why type safety?
  - Why object oriented?
- The extension checker and compiler
13 Limiting the Language
- Goal: To limit the damage a program can do, limit what can be expressed in the source language
  - Assumes binaries are tamper-evident
  - Assumes only trusted tools used to build binaries
  - Assumes trusted tools are working correctly!
- Language features/limitations may allow you to prove some invariants
  - Example: Backward branching disallowed => finite-length programs finish in finite time
  - Example: Pointers disallowed => dangling pointer dereferences vanish
- Never put off till runtime what you can do at compile time
14 Pros & Cons of Static Analysis
- Con: Requires that code be written in that specific language
  - Sometimes it's actually desirable to have a simpler language! (e.g. Exokernel generalized packet filter)
  - Other times languages may be too limited or awkward
  - May also rely on integrity of tool chain
- Con: Languages with rich type systems and class hierarchies confound this approach
  - Checking virtual function calls
  - Casting between safe types (e.g. int to enum)
15 Static Analysis, cont'd.
- Con: Relies on integrity of interpreter or binaries
  - What if the Java guys forgot some of the security checks?
  - VM interpreter may need semi-privileged access to get at the real resources controlled by the host OS
  - Or at least, OS must verify signed code segments (ActiveX does this)
- Pro: May allow strong formal proofs of program safety
  - Usually done by showing that a particular high-level construct can never produce unsafe low-level code
  - Can prove from the source code, if transformations are correctness-preserving (or semantics-preserving)
16 At Runtime: Classic SFI and Janus
- SFI (software fault isolation): If a program stays in its sandbox, it can't damage other programs.
  - Dangerous operations/references surrounded by interpolated guard code
  - Dangerous references can also be pinned to the sandbox by overwriting upper address bits
  - Note, this breaks program correctness! But the focus of SFI is preventing harm to others, not to oneself
- Janus: If a program can't make system calls, it can't damage the OS and therefore other programs.
  - Some programs break because they don't check system call results
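The address-pinning variant of SFI comes down to two bit operations inserted before each dangerous store. The sketch below models memory as a Python dictionary; the segment size and the `SANDBOX_BASE` value are assumptions chosen for illustration, not values from any particular SFI implementation.

```python
SEGMENT_BITS = 16                       # sandbox size: 2**16 bytes (assumed)
OFFSET_MASK = (1 << SEGMENT_BITS) - 1   # keeps the low (offset) bits
SANDBOX_BASE = 0x00A30000               # hypothetical segment identifier


def sandbox_address(addr):
    """Overwrite the upper bits so the address lands inside the sandbox."""
    return SANDBOX_BASE | (addr & OFFSET_MASK)


def guarded_store(memory, addr, value):
    """What a rewritten store looks like: mask first, then write."""
    memory[sandbox_address(addr)] = value


memory = {}
guarded_store(memory, 0x00A30010, 7)    # in-sandbox store: unchanged
guarded_store(memory, 0xDEAD0020, 9)    # stray store: silently redirected
```

The second store illustrates the correctness caveat: the program's bug is not reported, it is redirected to an address inside the program's own segment, so only the faulty program can be harmed.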
17 Pros & Cons of Runtime Approaches
- Pro: Use high-confidence machine-level mechanisms
  - Based on hardware-level mechanisms, e.g. VM, traps
  - In practice, hardware implementation errors for these are extremely rare (why?)
- Pro: Can be used with arbitrary legacy code
- Con: No onus on programmer to make potential error conditions explicit (e.g. assertions)
  - So runtime has no idea what to do to recover
- Con: Doesn't guarantee correct behavior, only safety to others
18 Dynamic Dataflow Analysis
- Potentially unsafe operations must always be denied, to be conservative
  - If done statically, renders code impotent
- Idea: quarantine the data that may be contaminated by the user (taintperl works this way)
    print STDERR "Enter file name: "; $x = <STDIN>;   # $x is tainted (user input)
    # ...more code...
    $z = "/tmp/safe_file.txt";    # $z is clean
    $y = "$sysdir/$x";            # $y is tainted
    system("cat $y");             # disallowed!
    system("cat $z");             # OK
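The same quarantine idea can be sketched in Python by marking user-derived strings with a subclass and propagating the mark through concatenation. This is an illustrative model, not how taintperl is implemented: the names `Tainted`, `read_user_input`, and `run_command` are hypothetical, and `run_command` only pretends to execute anything.

```python
class Tainted(str):
    """A string that came, directly or indirectly, from the user."""

    def __add__(self, other):
        # tainted + anything stays tainted
        return Tainted(str.__add__(self, other))

    def __radd__(self, other):
        # anything + tainted stays tainted
        return Tainted(str.__add__(other, self))


class TaintError(Exception):
    pass


def read_user_input(text):
    """Stand-in for reading STDIN: everything from the user is tainted."""
    return Tainted(text)


def run_command(command):
    """Stand-in for system(): refuses tainted command strings."""
    if isinstance(command, Tainted):
        raise TaintError(f"refusing tainted command: {command!r}")
    return f"ran: {command}"


x = read_user_input("notes.txt; rm -rf /")
z = "/tmp/safe_file.txt"      # clean: a literal in the program
y = "/sysdir/" + x            # tainted: derived from user input
```

Only `+` propagates the mark in this sketch; other string operations such as formatting or `join` return plain strings, which is exactly the kind of gap a real taint-tracking runtime has to close inside the interpreter.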
19 Interface Compilation
- Problem: interfaces are a syntactic abstraction that usually carries no semantics
- Semantics might be useful for:
  - Special-case optimizations (e.g. file I/O, specialization by call site)
  - Safety of called proc, or error handling in case of failure
- Is the interface too narrow?
  - Semantic type info may be lost (Unix)
  - Semantic properties such as liveness are not preserved across the interface (hidden state); example to follow
20 Exploiting Semantics
- Example 1: File I/O
    fd = open(filename);
    /* do some file operations */
    close(fd);
    /* more code */
    read(fd, buf, 4096);  /* certain to fail! */
- Example 2: type impoverishment
    read(int fd, void *buf, size_t n)
  - What if buf is unaligned or not big enough?
  - No way to tell from call syntax
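Example 1 is the sort of error a tool can catch once it knows what `open`, `close`, and `read` mean. The sketch below is a minimal illustration under strong assumptions: the program is reduced to a straight-line list of (function, descriptor variable) pairs, with no branches or aliasing, and the function `check_fd_usage` is hypothetical rather than part of any tool named in these slides.

```python
def check_fd_usage(calls):
    """Flag operations on a descriptor name after it has been closed.

    `calls` is a list of (function_name, fd_variable) pairs in program
    order, a deliberately simplified stand-in for a real call trace.
    Returns a list of (index, message) warnings.
    """
    state = {}          # fd variable -> "open" or "closed"
    warnings = []
    for i, (func, fd) in enumerate(calls):
        if func == "open":
            state[fd] = "open"
        elif func == "close":
            if state.get(fd) != "open":
                warnings.append((i, f"close of {fd}, which is not open"))
            state[fd] = "closed"
        elif func in ("read", "write"):
            if state.get(fd) != "open":
                warnings.append((i, f"{func} on {fd}, which is not open"))
    return warnings


slide_example = [("open", "fd"), ("read", "fd"), ("close", "fd"), ("read", "fd")]
warnings = check_fd_usage(slide_example)
```

The liveness of `fd` is hidden state on the far side of the interface; the checker works only because that knowledge has been supplied to it separately from the call syntax.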
21 Interface Compilation With MAGIK
- Provides abstractions for dealing with interfaces
  - Iterators over the function calls
  - Accessors for the data structures manipulated by each call: what type? Compile-time constant? Access to internal fields of structure? Etc.
- Allows programmer to write C-like code extensions using these functions and accessors
- Original source and extensions are compiled together into common intermediate form
- Intermediate form can be optimized using traditional methods before machine targeting
22 IC (Interface Compilation) as an Orthogonal Mechanism
- Can retrofit existing legacy code (provided source is available)
- Admits of incremental improvements
- Safety concerns/development can be kept separate from mainline logic for maintainability
- Some cool implemented examples
  - Type-aware I/O for C
  - Safe signal handling (prevents calling non-reentrant library functions inside a signal handler)
- Common thread: uses semantic information that cannot be extracted from source alone
  - Compare with emergent properties in req. spec.
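The signal-handling example can be pictured as a reachability check over the call graph, driven by a list of functions known to be safe inside a handler. The sketch below does not use MAGIK's interface; the call graph, the handler names, and the short `ASYNC_SIGNAL_SAFE` set are assumptions made up for illustration (the real list is defined by the platform).

```python
# Functions assumed safe to call from a signal handler. The real list is
# platform-defined; this small set is only for illustration.
ASYNC_SIGNAL_SAFE = {"write", "_exit", "signal"}


def unsafe_calls_in_handler(call_graph, handler):
    """Return the set of unsafe library functions reachable from `handler`.

    `call_graph` maps each user-defined function to the names it calls.
    Names absent from the graph are treated as library functions.
    """
    unsafe, seen, stack = set(), set(), [handler]
    while stack:
        func = stack.pop()
        if func in seen:
            continue
        seen.add(func)
        for callee in call_graph.get(func, []):
            if callee in call_graph:
                stack.append(callee)          # user code: keep walking
            elif callee not in ASYNC_SIGNAL_SAFE:
                unsafe.add(callee)            # library call, not on the list
    return unsafe


call_graph = {
    "on_sigint": ["log_event", "write"],
    "log_event": ["printf", "malloc"],
    "on_sigterm": ["write", "_exit"],
}
```

The safe list is the semantic information that the source alone does not carry: nothing in the declaration of `printf` says it must not be called from a handler.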
23 Lessons? Anyone?
- Limits of virtual machines and static analysis
  - Assumes tools are trustworthy, from a security standpoint
  - But... buggy => untrustworthy
- End-to-end argument suggests falling back on runtime SFI?