Title: To Memory Safety through Proofs
1To Memory Safety through Proofs
- Doctoral Thesis Defense
- Dengping Zhu
- Advisor Hongwei Xi
- Boston University
2Outline
- Motivation
- Introduction to ATS
- Our Approach (Stateful Views)
- Programming with Stateful Views
- Conclusion
3Memory manipulation in C/C++
- Direct memory manipulation
- Useful. E.g., pointers in C/C++
- p + n // pointer arithmetic
- Dangerous. No safety guarantee.
- Dangling pointers
- Segmentation fault
- x = A[n] // potentially out-of-bounds
- Difficult to debug!
4Problems with Safe languages
- Safe programming languages such as ML and Java.
- Explicit use of pointers is forbidden
- Memory manipulation is done through systematic allocation and de-allocation
- In order to deal with direct memory manipulation, an interface with C is provided
- Workable, but
- Low efficiency. Not appropriate for system programming and embedded applications
- Probably the most difficult part of programming is done in an unsafe language such as C through the interface
5Our Goal
- Support explicit use of pointers and ensure its safety statically (via type checking).
- For instance
- x = *p: we want p not to be a dangling pointer
- If p is a dangling pointer, a type error is reported
- x = A[n]: we want n to be within the array bounds
- If n is out of bounds, a type error is reported
6More
- Enforce invariants on data structures involving sophisticated use of pointers
- Example
struct node { int item; struct node *next; struct node *prev; };
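To make the kind of invariant meant here concrete, the following C sketch checks at run time a property that the proposed type system is meant to enforce statically: in a doubly-linked list, the next and prev pointers of neighbouring nodes agree. The function name dll_well_formed is hypothetical and does not come from the slides, and the sketch assumes an acyclic list.

```c
#include <stdbool.h>
#include <stddef.h>

struct node {
    int item;
    struct node *next;
    struct node *prev;
};

/* Hypothetical helper (not from the slides): returns true when the
   acyclic list starting at head satisfies n->next->prev == n for
   every node n that has a successor, and head->prev is NULL. */
bool dll_well_formed(const struct node *head)
{
    if (head == NULL)
        return true;                 /* the empty list is trivially fine */
    if (head->prev != NULL)
        return false;                /* the first node has no predecessor */
    for (const struct node *n = head; n->next != NULL; n = n->next) {
        if (n->next->prev != n)
            return false;            /* back-pointer disagrees */
    }
    return true;
}
```

A check like this costs a traversal every time it runs; the goal stated on the slides is to have the type checker reject programs that could break the invariant, so that no such check is needed.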
7Outline
- Motivation
- Introduction to ATS
- Our Approach (Stateful Views)
- Programming with Stateful Views
- Conclusion
8The Framework ATS
- ATS is a framework to facilitate the design and formalization of (advanced) type systems in support of practical programming.
- The name applied type system refers to a type system formed in ATS, which consists of two components
- A static component (statics), where types are formed and reasoned about, and
- A dynamic component (dynamics), where programs are constructed and evaluated.
- Statics is completely separate from dynamics.
9The Language ATS
- ATS is a programming language with a type system rooted in the framework ATS. In ATS, a variety of programming paradigms are supported in a typeful manner, including
- Functional programming
- Object-oriented programming
- Modular programming
- Meta-programming
- Imperative programming with pointers (my work)
10ATS Dependent Types
- Can capture more program properties
- e.g.
- 5 : int(5), 3 : int(3)
- Add : (Int, Int) -> Int
- With dependent types
- Add : ∀m:int. ∀n:int. (int(m), int(n)) -> int(m+n)
- In 5 : int(5), the 5 on the left is a dynamic term and the 5 inside int(5) is a static term
11ATS Guarded Types
- Type guards P
- e.g. n > 0 (n is a static integer)
- Guarded types P ⊃ T
- e.g.
- factorial : ∀a:int. a ≥ 0 ⊃ (int(a) -> Int)
12ATS Asserting types
- Has the form P ∧ T
- Example: a function from non-negative integers to negative integers
- ∀a:int. a ≥ 0 ⊃ (int(a) -> ∃a':int. (a' < 0) ∧ int(a'))
13Outline
- Motivation
- Introduction to ATS
- Our Approach (Stateful Views)
- Programming with Stateful Views
- Conclusion
14Motivations review
- Use types to
- Ensure safe use of pointers
- Enforce invariants on data structures involving sophisticated use of pointers, such as arrays, linked lists, trees
- There are many challenges.
15Obstacle 1
- How to model memory layout?
- Solution: Stateful Views
- Primitive views T@L
- T is a type
- L is a memory address
- A value of type T is stored at address L
- E.g.
- int(5) @ 100: 5 is stored at address 100
[Figure: a memory cell at address 100 holding the value 5]
16Stateful Views
- Other stateful views are built on top of primitive views
- Adjacent views (T1@L, T2@L+1)
- A value of type T1 is stored at L
- A value of type T2 is stored at L+1
- May be written as (T1, T2)@L
[Figure: two adjacent memory cells, T1 at address L and T2 at address L+1]
17Viewtypes
- Viewtypes VT
- (V | T)
- V is a view and T is a type
- A value of the type has the form (pf | v)
- pf is the proof of V
- v is the value of type T
18Stateful Views
- Example
- Read from a pointer
- getPtr : ∀a:type. ∀l:addr. (a@l | ptr(l)) -> (a@l | a)
- Disallow reading from dangling pointers!
- Write to a pointer
- setPtr : ∀a:type. ∀l:addr.
- (top@l | (ptr(l), a)) -> (a@l | 1)
19Obstacle 2
- How to model recursive data structures, such as arrays and linked lists?
- Using primitive view T@L, we are only able to track a fixed number of memory locations.
- Solution: Recursive Stateful Views
20Recursive Stateful Views
arrayView (a, n, l): an array of length n with elements of type a is stored at address l
[Figure: the empty case arrayView(a,0,l) occupies no memory; the non-empty case joins a@l with arrayView(a,n,l+1) to form arrayView(a,n+1,l)]
21Array
dataview arrayView (type, int, addr) =
  | {a:type, l:addr} ArrayNone (a, 0, l)
  | {a:type, n:nat, l:addr}
    ArraySome (a, n+1, l) of (a@l, arrayView (a, n, l+1))
ArrayNone : ∀a:type. ∀l:addr. () -> arrayView (a, 0, l)
ArraySome : ∀a:type. ∀l:addr. ∀n:nat. (a@l, arrayView (a, n, l+1)) -> arrayView (a, n+1, l)
22Obstacle 3
- A data structure may have more than one view. How to switch between them?
23View Change
- View change function -- Split
∀a:type. ∀n:int. ∀i:nat. ∀l:addr. i ≤ n ⊃ (arrayView (a, n, l) -> (arrayView (a, i, l), arrayView (a, n-i, l+i)))
[Figure: arrayView(a,n,L) starting at L is split at address L+i into arrayView(a,i,L) and arrayView(a,n-i,L+i)]
24View Change
- Theorem (split): for any array of length n and a given integer i where 0 ≤ i ≤ n, this array can be split into two sub-arrays, which are of length i (left) and n-i (right), respectively.
- Proof by induction on i
- Base case i = 0. The left part is empty; the right part is the array itself.
- Induction i > 0.
- Take the head element.
- Applying the induction hypothesis to the tail of this array, which is of length n-1, gives us two sub-arrays of length i-1 (left) and n-i (right), respectively.
- Combining the head element with the left sub-array gives us a new left sub-array of length i. Done.
25Split definition
prfun split {a:type, n:int, i:nat, l:addr | n >= i} .<i>.
    (pf: arrayView (a, n, l))
    : '(arrayView (a, i, l), arrayView (a, n-i, l+i)) =
  if i = 0 then '(ArrayNone, pf)
  else
    let
      prval ArraySome (pf1, pf2) = pf
      prval '(pf21, pf22) = split {a,n-1,i-1,l+1} (pf2)
    in
      '(ArraySome (pf1, pf21), pf22)
    end
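The recursion in split can be mimicked at run time in C. The sketch below is only an analogy: the struct int_view stands in for the proof of arrayView(a, n, l), which in ATS exists only during type checking, and the guard i <= n becomes a dynamic assertion. The names int_view, int_view_pair and split_view are hypothetical, and the element type is fixed to int for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical run-time stand-in for arrayView(a, n, l):
   a base address plus a length. */
struct int_view {
    int *base;
    size_t len;
};

struct int_view_pair {
    struct int_view left;
    struct int_view right;
};

/* Follows the structure of the induction: peel off the head element,
   split the tail at i-1, then re-attach the head to the left part. */
struct int_view_pair split_view(struct int_view v, size_t i)
{
    struct int_view_pair r;

    assert(i <= v.len);              /* the guard i <= n, checked dynamically */
    if (i == 0) {
        /* Base case: empty left part, right part is the whole array. */
        r.left.base = v.base;
        r.left.len = 0;
        r.right = v;
        return r;
    }
    /* Inductive case: the tail has length n-1 and starts at l+1. */
    struct int_view tail = { v.base + 1, v.len - 1 };
    r = split_view(tail, i - 1);
    /* Combine the head element with the left sub-array. */
    r.left.base = v.base;
    r.left.len += 1;
    return r;
}
```

Splitting a five-element array at i = 2 this way yields a left view of length 2 at the original base and a right view of length 3 starting two elements later, matching the lengths i and n-i in the theorem.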
26Obstacle 4
- Views are linear. How to deal with sharing?
- Solution Dynamic Locks
- We need to accommodate multi-threaded programming
in ATS.
27Sharing
- Primitive functions
- Viewref_some creates a lock
- Viewref_get locks it
- Viewref_set unlocks it
28Results
- Typing judgment, with the following components
- A static variable context, e.g. a : int
- Static propositions, e.g. a > 0
- A proof variable context, e.g. x : V
- A dynamic variable context, which consists of two parts
- An intuitionistic context, e.g. x : T
- A linear context, e.g. x : VT
- State types, e.g. l ↦ T
29Some typing rules
30Soundness
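- (Substitution Lemma)
- 1. Assume that a judgment assigning t the viewtype VT is derivable in a static context extended with a static variable a, and that a static term s of the same sort as a is derivable. Then the judgment obtained by substituting s for a in the propositions, the contexts, the state type and VT is also derivable.
- Six more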
31Soundness
- (Subject Reduction)
- Assume that t1 : VT is derivable in empty contexts under a store type μ1 and ⊨ ST1 : μ1 holds. If (ST1, t1) ->ev/st (ST2, t2), then t2 : VT is derivable in empty contexts under some store type μ2 such that ⊨ ST2 : μ2 holds.
- (Progress)
- Assume that t : VT is derivable in empty contexts under a store type μ and ⊨ ST : μ holds. Then either t is a value, or (ST, t) ->ev/st (ST', t') for some ST' and t', or t is of the form E[cf(v1, ..., vn)] such that cf(v1, ..., vn) is undefined.
- An electronic soundness proof of the core of ATS (without views) is online.
32Outline
- Motivation
- Introduction to ATS
- Our Approach (Stateful Views)
- Programming with Stateful Views
- Conclusion
33Swap
- swap : ∀t1:type. ∀t2:type. ∀l1:addr. ∀l2:addr.
- (t1@l1, t2@l2 | ptr(l1), ptr(l2)) -> (t2@l1, t1@l2 | unit)
fun swap {t1:type, t2:type, l1:addr, l2:addr}
    (pf1: t1@l1, pf2: t2@l2 | p1: ptr(l1), p2: ptr(l2))
    : (t1@l2, t2@l1 | unit) =
  let
    val (pf1 | tmp1) = getPtr (pf1 | p1)
    val (pf2 | tmp2) = getPtr (pf2 | p2)
    val (pf1 | _) = setPtr (pf1 | p1, tmp2)
    val (pf2 | _) = setPtr (pf2 | p2, tmp1)
  in
    (pf2, pf1 | ())
  end
[Figure: the memory cells at l1 and l2 before and after the swap; their contents of types t1 and t2 are exchanged, and the proofs pf1 and pf2 track the cells at l1 and l2 respectively]
34Swap
- All types and proofs will be erased after type-checking.
fun swap (p1, p2) =
  let
    val tmp1 = getPtr (p1)
    val tmp2 = getPtr (p2)
    val _ = setPtr (p1, tmp2)
    val _ = setPtr (p2, tmp1)
  in
    ()
  end
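For comparison, the erased program corresponds closely to ordinary C. The sketch below is an assumption-laden rendering rather than output of the ATS compiler: get_ptr and set_ptr are hypothetical C counterparts of getPtr and setPtr, and both cells are fixed to type int, whereas the ATS version allows two different types t1 and t2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C counterparts of getPtr and setPtr after erasure. */
static int get_ptr(const int *p) { return *p; }
static void set_ptr(int *p, int v) { *p = v; }

void swap(int *p1, int *p2)
{
    /* In ATS the views pf1 and pf2 prove that both pointers are valid,
       so no run-time check is needed; plain C can at best rule out NULL. */
    assert(p1 != NULL && p2 != NULL);

    int tmp1 = get_ptr(p1);
    int tmp2 = get_ptr(p2);
    set_ptr(p1, tmp2);
    set_ptr(p2, tmp1);
}
```

The body has the same four memory operations as the erased ATS code. What the C version loses is the static evidence that p1 and p2 are not dangling.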
35Singly-linked lists
[Figure: a list segment whose head is at first and whose tail points to last]
slsegView (a, n, first, last)
1. each element of the list segment is of type a
2. the length of this segment is n
3. the head of this segment is first
4. the tail of this segment points to last
36Singly-linked lists
slsegView (a, n, first, last): How to define it?
[Figure: the empty segment slsegView(a,0,L,L) occupies no memory]
37Singly-linked List
dataview slsegView (type, int, addr, addr) =
  | {a:type, l:addr} SlsegNone (a, 0, l, l)
  | {a:type, n:nat, first:addr, next:addr, last:addr | first <> null}
    SlsegSome (a, n+1, first, last) of
      ((a, ptr (next)) @ first, slsegView (a, n, next, last))
SlsegNone : ∀a:type. ∀l:addr. () -> slsegView (a, 0, l, l)
SlsegSome : ∀a:type. ∀n:nat. ∀first:addr. ∀next:addr. ∀last:addr. (first <> null) ⊃ ((a, ptr(next))@first, slsegView(a, n, next, last)) -> slsegView (a, n+1, first, last)
38Singly-linked List
- viewdef sllist (a, n, l) = slsegView (a, n, l, null)
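The meaning of a list segment of length n from first to last, and of sllist as the special case where last is null, can be illustrated with a run-time check in C. This is only a sketch of what the view describes: in ATS the property is established by proofs at compile time. The function names is_slseg and is_sllist are hypothetical, and elements are fixed to int.

```c
#include <stdbool.h>
#include <stddef.h>

struct node {
    int item;
    struct node *next;
};

/* Hypothetical run-time reading of a segment of length n from first
   to last: following n next-pointers from first reaches last, and no
   node visited on the way is NULL. */
bool is_slseg(const struct node *first, const struct node *last, size_t n)
{
    while (n > 0) {
        if (first == NULL)
            return false;            /* non-empty case requires first <> null */
        first = first->next;
        n--;
    }
    return first == last;            /* empty case: segment (a, 0, l, l) */
}

/* A singly-linked list is a segment that ends at null. */
bool is_sllist(const struct node *l, size_t n)
{
    return is_slseg(l, NULL, n);
}
```

The two branches of the loop line up with the two constructors: each iteration consumes one non-empty step, and the final comparison is the empty segment whose endpoints coincide.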
39Linked-list reversal
typedef struct node { int item; struct node *next; } node;

/* Moves the nodes of list2 one by one onto the front of list1. */
node *rev (node *list1, node *list2) {
  if (list2 == NULL) return list1;
  else {
    node *next = list2->next;
    list2->next = list1;
    return rev(list2, next);
  }
}

node *reverse (node *list) { return rev(NULL, list); }
40Linked List Reversal in ATS
- rev : ∀a:type. ∀n1:nat. ∀n2:nat. ∀l1:addr. ∀l2:addr.
- (sllist(a, n1, l1), sllist(a, n2, l2) | ptr(l1), ptr(l2)) ->
- ∃l:addr. (sllist(a, n1+n2, l) | ptr(l))
reverse : ∀a:type. ∀n:nat. ∀l:addr. (sllist(a, n, l) | ptr(l)) -> ∃l:addr. (sllist(a, n, l) | ptr(l))
41Linked List Reversal in ATS
fun rev {n1:nat, n2:nat, l1:addr, l2:addr}
    (pf1: sllist (a, n1, l1), pf2: sllist (a, n2, l2) | list1: ptr(l1), list2: ptr(l2))
    : [l:addr] '(sllist (a, n1+n2, l) | ptr(l)) =
  if (isNull list2) then
    let prval SlsegNone () = pf2 in '(pf1 | list1) end
  else
    let
      prval SlsegSome (pf21, pf22) = pf2
      prval '(pf210, pf211) = pf21
      val '(pf211 | next) = getPtr (pf211 | list2 padd 1)
      val '(pf211 | _) = setPtr (pf211 | list2 padd 1, list1)
      prval pf1 = SlsegSome ('(pf210, pf211), pf1)
    in
      rev (pf1, pf22 | list2, next)
    end
The corresponding C code:
node *rev (node *list1, node *list2) {
  if (list2 == NULL) return list1;
  else {
    node *next = list2->next;
    list2->next = list1;
    return rev(list2, next);
  }
}
42Binomial Trees
[Figure: binomial trees; only the label B0 is recoverable]
43Binomial Trees
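In practice
[Figure: a binomial tree of rank 3 as laid out in memory, with each node labeled by its rank: 3, 2, 1, 0, 0, 0, 1, 0]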
44Binomial Trees
// index: element type, rank, self, parent, sibling
dataview bntree (type, int, addr, addr, addr) =
  | {a:type, r:nat, self:addr, par:addr, child:addr, sib:addr | self <> null}
    Node (a, r, self, par, sib) of
      ((ptr(par), a, int(r), ptr(child), ptr(sib)) @ self,
       btlseg (a, r, child, self, null))
// index: element type, length, self, parent, sibling
and btlseg (type, int, addr, addr, addr) =
  | {a:type, par:addr} BtlsegNone (a, 0, null, par, null)
  | {a:type, r:nat, next:addr, self:addr, par:addr, sib:addr | self <> null}
    BtlsegSome (a, r+1, self, par, sib) of
      (bntree (a, r, self, par, next),
       btlseg (a, r, next, par, sib))
viewdef binomialT (a:type, r:int, self:addr) = bntree (a, r, self, null, null)
45Binomial Trees Union
- Invariants
- Rank must be the same, say r
- Rank of the returned tree is r+1
- The returned root node is one of the roots
- Our type system can enforce these invariants
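The three invariants can be seen in a small C sketch of the union operation, where they appear as run-time assertions and comments instead of types. The struct layout follows the field order of the bntree view (parent, element, rank, child, sibling). The names bnode and bn_union are hypothetical, and choosing the root with the smaller element (min-heap order) is an assumption that the slides do not state.

```c
#include <assert.h>
#include <stddef.h>

struct bnode {
    struct bnode *parent;
    int item;
    int rank;
    struct bnode *child;     /* first (highest-rank) child */
    struct bnode *sibling;   /* next tree in the parent's child list */
};

/* Hypothetical union of two binomial trees of equal rank.
   Assumption: the root with the smaller item stays the root. */
struct bnode *bn_union(struct bnode *t1, struct bnode *t2)
{
    assert(t1 != NULL && t2 != NULL);
    assert(t1->rank == t2->rank);        /* invariant 1: same rank r */

    if (t2->item < t1->item) {           /* make t1 the surviving root */
        struct bnode *tmp = t1;
        t1 = t2;
        t2 = tmp;
    }
    /* t2 becomes the new first child of t1. */
    t2->parent = t1;
    t2->sibling = t1->child;
    t1->child = t2;
    t1->rank += 1;                       /* invariant 2: result has rank r+1 */
    return t1;                           /* invariant 3: one of the two roots */
}
```

Linking the new child at the front keeps the child list in decreasing rank order, which is the shape the btlseg view describes.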
46More Examples
- Find more on-line
- Cyclic buffer
- Doubly-linked binary trees
- Splay trees
- AVL Trees
47Outline
- Motivation
- Introduction to ATS
- Our Approach (Stateful Views)
- Programming with Stateful Views
- Conclusion
48Conclusion
- The notion of stateful views provides a general and flexible approach to safe programming with pointers
- Disallow dangling pointer access
- Describe memory layouts
- Enforce invariants for various data structures
49Conclusion
- Identify the need of view change
- The user can define view change functions
- Applications
- Interaction with C code
- By assigning ATS types to C library functions, we
can rule out some misuse of those functions
50Current Status of ATS
- The current implementation of ATS is done in OCaml, including a type-checker, an interpreter and a compiler from ATS to C.
- The library of ATS is done in ATS itself, consisting of over 20k lines of code.
- More information can be found at the ATS homepage
- http://www.cs.bu.edu/~hwxi/ATS
51Future Work
- Better proof management
- Right now the user needs to manipulate proof terms
- Some proof manipulation can be automated
- More applications
- Such as device drivers (ongoing, Rui Shi)
52The end