Title: A Tamper-Resistant Programming Language
1 A Tamper-Resistant Programming Language
- Dennis Heimbigner
- University of Colorado at Boulder
- http://www.cs.colorado.edu/users/dennis
- http://www.cs.colorado.edu/serl
- UC Davis Security Laboratory Seminar
- April 24, 2002
2 Computing in a Hostile Environment
- A number of scenarios require the ability to carry out trusted computations in untrustworthy environments
- Software-only solutions are difficult, maybe even impossible (in general)
- Barak et al. On the (Im)possibility of Obfuscating Programs. CRYPTO 2001.
- Hardware solution: secure processor attached to insecure host
- E.g., SmartCard
- Secure processor uses the resources of the host for some of its computations
3 SmartCard Scenario
- Card contains limited resources, especially memory
- Card needs access to larger memory in order to compute
- Larger memory is on untrusted host computer
- Card can read/write/malloc addressable memory
- Host can lie
- Not write the contents specified by the card
- Return false contents to a read request
4 Background
- Inspiration: Stack and Queue Integrity on Hostile Platforms (Devanbu, Stubblebine)
- Focus was on defining data structures with the following properties
- Data storable in untrusted memory of host computer
- Limited amount of memory in SmartCard
- Lies by untrusted host can eventually be detected
- Off-line: after computation is complete
- On-line: during the computation
- Interface is data structure specific
- E.g., push, pop, top, empty
5 From Data Structures to Languages
- Goal is to raise the level of abstraction from specific data structures to a complete, usable programming language
- NOT trying to build an intra-card programming system such as Java SmartCard
- Hide the complexity of tamper-resistant computing from the programmer
- Programs automatically resistant to tampering by untrusted processor
- Lies by untrusted host are detectable on-line
6 Target Language: Lisp 1.5
- Lisp 1.5 Programmer's Manual by McCarthy et al.
- Why Lisp?
- Very simple interpreter
- Usable
- Based primarily on a single data structure: lists
- Easy to make lists tamper-resistant
7 Lisp 1.5 Problem Areas
- Efficiency
- Not very efficient, and tamper-resistance makes it worse
- Data Structures
- Representing list cells, atoms, and primitive values (e.g., integers)
- Tamper Checking
- When and where is tamper checking performed?
- CAR, CDR, CONS
8 Lisp 1.5 Problem Areas (cont.)
- Garbage Collection
- Building a tamper-resistant garbage collector
- Recursion avoidance
- Avoid need for a stack in the SmartCard
- External stack or use of Lisp lists
- Bindings
- setq, lambda, function, label
- Miscellaneous
- REPLACA/REPLACD
9 Interpreter Design
- Interpreter code resides in SmartCard
- Includes common primitives built-in
- Non-recursive code: recursion pushed to data structures
- Associated set of cell registers and pointer registers that are manipulated by the interpreter
- Registers can be filled by reading from host memory
- But all reads must eventually be checked for tampering
- Note that we don't care about tampering for cells we never read
- Privacy not important for now
- Can add later using encryption
10 Interpreter State: Registers
- Pointer Registers P1...Pn
- Each contains a pointer to a cell
- Special pointer registers act as root pointers for interpreter data structures
- Patom (atom list), Pset (global bindings), Pstack (eval stack)
- Cell Registers R1...Rm
- Each contains the contents of some cell as read from untrusted memory, plus the address used to read it
[Figure: pointer register P1 holds a cell address; a cell register holds a cell address plus the contents of memory cell Ci]
11 Write Once Per Epoch (WOPE)
- Assume that the computation is divided into epochs
- Epochs are delimited by invocations of the garbage collector
- Each cell is written at most once per epoch
- Each cell contains a hash signature
- h = H(C.car, C.cdr, C.flags, C, Ti, Key)
- C is the cell address
- Ti is the current time (i.e., the current epoch number)
- Key is a secret key
- Current time and the key are kept in the secure co-processor
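The signature formula above can be sketched in a few lines. This is only an illustration: the slides do not name the hash function H, so HMAC-SHA256 is assumed here, and the 8-byte field encoding, the key value, and the names `sign`, `KEY`, and `EPOCH` are hypothetical.

```python
import hashlib
import hmac


def sign(car: int, cdr: int, flags: int, addr: int, epoch: int, key: bytes) -> bytes:
    """h = H(C.car, C.cdr, C.flags, C, Ti, Key); HMAC-SHA256 stands in for H."""
    # Fixed-width fields keep the encoding of the five values unambiguous.
    msg = b"".join(v.to_bytes(8, "big") for v in (car, cdr, flags, addr, epoch))
    return hmac.new(key, msg, hashlib.sha256).digest()


KEY = b"hypothetical card secret"      # assumed: never leaves the card
EPOCH = 7                              # Ti, also kept on the card

# Sign the cell at address 1 whose car is cell 2 and whose cdr is cell 3.
h1 = sign(car=2, cdr=3, flags=0, addr=1, epoch=EPOCH, key=KEY)
stored_cell = (2, 3, 0, h1)            # what the host is asked to store
```

Because the address and epoch number are inputs to the hash, the same car/cdr/flags values signed for a different cell or a different epoch yield a different signature.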
12 Tamper Resistance Within an Epoch
- Assumes sufficiently strong cryptographic security parameters
- Key and hash function
- Cannot synthesize new cell contents
- Guaranteed by non-invertible hash plus secret key
- No replay of cell values from free list
- Guaranteed by flag values
- No replay of cell values from previous epochs
- Guaranteed by use of epoch time Ti
- No replay of other cell values from same epoch
- Guaranteed by use of cell address Ci
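The four guarantees above can be exercised with a small card-side check. This is a sketch under assumptions: HMAC-SHA256 stands in for the unspecified hash H, the flag values `LIVE` and `FREE` are invented for illustration, and `accept_live` is a hypothetical name for the check the card applies when it expects a live cell.

```python
import hashlib
import hmac

KEY = b"hypothetical card secret"
LIVE, FREE = 0, 1                      # hypothetical flag values


def sign(car, cdr, flags, addr, epoch, key=KEY):
    msg = b"".join(v.to_bytes(8, "big") for v in (car, cdr, flags, addr, epoch))
    return hmac.new(key, msg, hashlib.sha256).digest()


def make_cell(car, cdr, flags, addr, epoch):
    # A cell as the host stores it: (car, cdr, flags, h).
    return (car, cdr, flags, sign(car, cdr, flags, addr, epoch))


def accept_live(cell, addr, epoch):
    """Card-side check applied to a cell the host returned for `addr`."""
    car, cdr, flags, h = cell
    signature_ok = hmac.compare_digest(h, sign(car, cdr, flags, addr, epoch))
    return signature_ok and flags != FREE


good = make_cell(2, 3, LIVE, addr=1, epoch=7)
results = {
    "honest read": accept_live(good, addr=1, epoch=7),
    "synthesized contents": accept_live((9, 3, LIVE, good[3]), addr=1, epoch=7),
    "free-list replay": accept_live(make_cell(0, 0, FREE, addr=1, epoch=7), addr=1, epoch=7),
    "previous-epoch replay": accept_live(make_cell(2, 3, LIVE, addr=1, epoch=6), addr=1, epoch=7),
    "other-address replay": accept_live(make_cell(2, 3, LIVE, addr=5, epoch=7), addr=1, epoch=7),
}
```

Only the honest read is accepted. The free-list replay carries a valid signature for the right address and epoch, so it is the flag value, not the hash, that rejects it.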
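13 Computing CAR Function
1. R1 <- read C1
3. P2 <- R1.car
[Figure: the SmartCard side shows register T (holding Ti), pointer registers P1 and P2, and cell registers R1 and R2; the untrusted host side shows cells C1, C2, and C3, each stored as car, cdr, flags F, and hash h]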
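A minimal sketch of a checked CAR follows. The extracted slide shows only steps 1 and 3; the sketch assumes the missing step 2 is the signature check, which is consistent with the rule that every read must be checked for tampering. HMAC-SHA256 as H, the `Card` class, and the dict standing in for host memory are all assumptions for illustration.

```python
import hashlib
import hmac


class TamperDetected(Exception):
    """Raised when a cell returned by the host fails its signature check."""


def sign(car, cdr, flags, addr, epoch, key):
    msg = b"".join(v.to_bytes(8, "big") for v in (car, cdr, flags, addr, epoch))
    return hmac.new(key, msg, hashlib.sha256).digest()


class Card:
    def __init__(self, key, epoch, host):
        self.key, self.epoch = key, epoch   # secrets held on the card
        self.host = host                    # untrusted memory: addr -> (car, cdr, flags, h)

    def car(self, p1):
        car, cdr, flags, h = self.host[p1]                 # 1. R1 <- read C1
        expected = sign(car, cdr, flags, p1, self.epoch, self.key)
        if not hmac.compare_digest(h, expected):           # 2. (assumed) check signature
            raise TamperDetected(f"cell {p1} failed verification")
        return car                                         # 3. P2 <- R1.car


KEY = b"hypothetical card secret"
host = {1: (2, 3, 0, sign(2, 3, 0, 1, 7, KEY))}
card = Card(KEY, epoch=7, host=host)
p2 = card.car(1)                       # honest host: P2 now points at cell 2

host[1] = (9, 3, 0, host[1][3])        # host lies about the car field
try:
    card.car(1)
    lie_detected = False
except TamperDetected:
    lie_detected = True
```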
14 Computing CONS Function
1. P3 <- C1 = newcell()
2. R1.addr <- P3
3. R1.car <- P1
4. R1.cdr <- P2
5. R1.F <- Flags
7. Write R1 to C1
[Figure: the SmartCard side shows pointer registers P1, P2, and P3 and cell registers R1 and R2; the untrusted host side shows cells C1, C2, and C3, each stored as car, cdr, flags F, and hash h]
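The CONS steps can be sketched the same way. Step 6 is absent from the extracted slide; the sketch assumes it computes the cell signature before the write. HMAC-SHA256 as H, the `free` list standing in for newcell(), and the dict standing in for host memory are assumptions.

```python
import hashlib
import hmac


def sign(car, cdr, flags, addr, epoch, key):
    msg = b"".join(v.to_bytes(8, "big") for v in (car, cdr, flags, addr, epoch))
    return hmac.new(key, msg, hashlib.sha256).digest()


def cons(host, free, p1, p2, flags, epoch, key):
    p3 = free.pop()                                  # 1. P3 <- newcell()
    r1 = {"addr": p3}                                # 2. R1.addr <- P3
    r1["car"] = p1                                   # 3. R1.car <- P1
    r1["cdr"] = p2                                   # 4. R1.cdr <- P2
    r1["flags"] = flags                              # 5. R1.F <- Flags
    r1["h"] = sign(p1, p2, flags, p3, epoch, key)    # 6. (assumed) sign the cell
    host[p3] = (r1["car"], r1["cdr"], r1["flags"], r1["h"])   # 7. write R1 to host
    return p3


KEY = b"hypothetical card secret"
host = {}                              # untrusted memory
free = [3]                             # addresses not yet written this epoch
new_cell = cons(host, free, p1=1, p2=2, flags=0, epoch=7, key=KEY)
```

The new cell is written exactly once, already signed for the current epoch, so a later checked CAR or CDR on it will verify.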
15 Garbage Collection
- As is usual with Lisp, we periodically need to reclaim unused cells
- Each such garbage collection activity marks the end of an epoch
- After garbage collection, all reachable cells will have been re-signed using the new epoch time
- Problem
- Write once per epoch will not hold true during garbage collection
- Potential target for replay by untrusted host
16 Reference Counting
- Superficially possible since we are avoiding cyclic structures
- Requires good upper bound on number of references to any one cell
- Requires space in each cell for the reference count
- Requires stack space while propagating zero reference counts
- Infeasible in this implementation because it violates WOPE
17 Mark and Sweep
- Two-phase process
- Mark: walk all reachable cells and mark them with a flag
- Sweep: examine all cells to locate unreachable cells and place them on a free list for use by newcell()
- Requires some flags in the cell
- Naïve implementation requires extra stack for mark phase
- Potential problems
- Violates WOPE because of cell modifications during mark phase
- Sweep phase examines all cells without following cell pointers (RAM model of memory)
18 Schorr-Waite Marking
- Clever marking algorithm that avoids a separate stack
- CACM 10(8), August 1967
- Idea is to reverse car and cdr links while walking and then restore them when going back up the list
- Depth-first walk
- Reversed path to root forms a single list serving as a stack
- Use some flag bits to track link direction
- More complicated in our context
- Must maintain tamper resistance of the reversed list
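The link-reversal idea can be sketched for ordinary, trusted memory as follows. This is the plain pointer-reversal technique only; it does not attempt the tamper-resistant variant discussed in the talk. The `Cell` class and the `in_cdr` direction flag are names chosen for illustration, and `None` stands in for atoms and nil.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Cell:
    car: Optional["Cell"] = None
    cdr: Optional["Cell"] = None
    marked: bool = False
    in_cdr: bool = False    # direction flag: True while the cdr holds the reversed link


def schorr_waite_mark(root: Optional[Cell]) -> None:
    prev, cur = None, root  # prev heads the reversed path back to the root
    while True:
        # Descend along car links, reversing each one as we go.
        while cur is not None and not cur.marked:
            cur.marked = True
            nxt, cur.car = cur.car, prev
            prev, cur = cur, nxt
        # Climb back over cells whose cdr walk is finished, restoring cdr links.
        while prev is not None and prev.in_cdr:
            parent, prev.cdr = prev.cdr, cur
            prev.in_cdr = False
            prev, cur = parent, prev
        if prev is None:
            return          # back above the root: marking is complete
        # Car walk of prev is finished: restore car, move the reversed link to cdr.
        prev.in_cdr = True
        parent, prev.car = prev.car, cur
        cur, prev.cdr = prev.cdr, parent


# A small DAG in which `shared` is reachable along two paths.
shared = Cell()
leaf = Cell()
left = Cell(car=shared)
right = Cell(car=shared, cdr=leaf)
root = Cell(car=left, cdr=right)
schorr_waite_mark(root)
```

After the call every reachable cell is marked and all car and cdr links are back to their original values; at no point was more than two pointer variables of working storage needed.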
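19 Mark Phase: Initial State
- h = cell signature, Hm = Merkle hash
- Note use of modified cell format
[Figure: cells M1 and M2, each with an Hm field, and pointer registers P1 and P2]
20 Mark Phase: CDR Walk
[Figure: cells M1 and M2, each with an Hm field, during the CDR walk]
21 Mark Phase: CAR Walk
[Figure: cells M2 and M1, each with an Hm field, and pointer registers P1 and P2 during the CAR walk]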
22 Mark Phase: Tamper Resistance
- Hash value serves two roles
- Signature of cell
- When the cell is not actively part of the reversed list
- Merkle hash
- When the cell is part of the reversed list
- Hash of the contents of the prior cell in the list
- Signs whole path to root during depth-first walk
- Requires acyclic list structures (DAG is OK)
- Once both car and cdr are traversed, hash value reverts to signature of the cell
- But with the new epoch time stamp
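The Merkle-hash role can be illustrated with a generic hash-chained stack, which is the general technique of letting each entry commit to everything beneath it. This is not the modified cell format from the slides, whose layout is not recoverable from the extracted text; the class name, SHA-256, and the list standing in for host memory are assumptions.

```python
import hashlib


class TamperDetected(Exception):
    """Raised when the host returns a stack entry that breaks the hash chain."""


class HashChainedStack:
    def __init__(self, host_store: list):
        self.host = host_store          # untrusted storage for the entries
        self.top = b"\x00" * 32         # trusted digest, the only state kept on the card

    @staticmethod
    def _digest(item: bytes, below: bytes) -> bytes:
        # `below` is always 32 bytes, so the concatenation is unambiguous.
        return hashlib.sha256(below + item).digest()

    def push(self, item: bytes) -> None:
        self.host.append((item, self.top))
        self.top = self._digest(item, self.top)

    def pop(self) -> bytes:
        item, below = self.host.pop()
        if self._digest(item, below) != self.top:
            raise TamperDetected("stack entry does not match the trusted digest")
        self.top = below
        return item


store = []
stack = HashChainedStack(store)
stack.push(b"cell-A")
stack.push(b"cell-B")
first = stack.pop()                     # honest host: returns b"cell-B"

store[-1] = (b"forged", store[-1][1])   # host substitutes a different entry
try:
    stack.pop()
    tampered = False
except TamperDetected:
    tampered = True
```

The card stores one digest regardless of stack depth, which is the property that lets the reversed path to the root act as a stack without trusting the host.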
23 Problem!
- For lists that are reachable from two or more paths
- Marking starting at the second list should see the shared list as already marked
- But host can replay original unmarked cell contents
- => Garbage collector will happily re-mark that list and everything reachable from it
- Assuming host continues to replay the unmarked contents
24 Solution: So What?
- Assume
- Acyclic list graphs at the beginning of epoch
- Bounded set of roots
- Bounded total number of cells
- We can bound the amount of extra work by counting the number of cells we mark
- Must be no more than the total number of cells available
- => Marking will eventually terminate
- So garbage collector will waste time, but will never fail
- OK, this is somewhat unsatisfactory
- But it works (I think)
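One reading of the counting argument is sketched below: the marker counts every cell it marks and stops once the count exceeds the number of cells that exist, since that can only happen if the host is replaying unmarked contents. For brevity the sketch uses an explicit stack rather than link reversal and omits signatures; `bounded_mark`, `fresh_memory`, and the tuple cell layout are hypothetical.

```python
class TamperDetected(Exception):
    """Raised when the mark count exceeds the number of cells that exist."""


def bounded_mark(read, write, root, total_cells):
    marks = 0
    pending = [root]                   # explicit stack, for brevity only
    while pending:
        addr = pending.pop()
        if addr is None:
            continue
        car, cdr, marked = read(addr)
        if marked:
            continue
        marks += 1
        if marks > total_cells:
            raise TamperDetected(f"marked {marks} cells, only {total_cells} exist")
        write(addr, (car, cdr, True))
        pending.extend((cdr, car))
    return marks


def fresh_memory(n=5):
    # Chain of n cells whose car and cdr both point at the next cell (a DAG).
    return {i: ((i + 1, i + 1, False) if i < n - 1 else (None, None, False))
            for i in range(n)}


honest = fresh_memory()
honest_marks = bounded_mark(honest.__getitem__, honest.__setitem__, 0, total_cells=5)

lying = fresh_memory()
original = dict(lying)                 # the host keeps the unmarked contents to replay
try:
    bounded_mark(original.__getitem__, lying.__setitem__, 0, total_cells=5)
    detected = False
except TamperDetected:
    detected = True
```

With an honest host the five cells are each marked once. With a host that always replays the unmarked contents, the shared tail would be re-marked along every path, and the counter cuts the walk off as soon as a sixth mark is attempted.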
25 Sweeping Free Storage
- Sweep by reading all cells in order indexed by their address
- Unmarked cells are reclaimed and linked into a free list
- Cell is written with a special free flag and using the new epoch number
- Again, replay is possible
- Host can, for example, return original unmarked contents
- Misleads SmartCard into thinking all cells can be reclaimed
- But this is detectable immediately upon resumption of computation, since the very first cell retrieval will be incorrect
26 Other Issues
- Efficiency
- Big cost is hashing at every read, plus storage overhead
- Atoms and primitive values (e.g., integers)
- No special issues
- Bindings
- Acyclic requirement causes APVAL inefficiencies
- REPLACA/REPLACD
- Under controlled circumstances?
27 Future Directions
- Try to fix problems identified during this talk
- Implement and measure performance
- Improve tamper resistance of garbage collection
- Add encryption
- Is short block size a problem?
- Extend to more usable versions of Lisp
- E.g., Common Lisp, Scheme
- Extend to more traditional RAM-like languages
- E.g., C (or BCPL?)
- Need for efficiency will require new approaches to the replay problem
28 Backup Slides
29 Storage Allocation
- Interface to untrusted host could include a newcell() operation to obtain a previously unused cell
- Requires more operation calls to untrusted host
- Alternative: sbreak() functionality
- SmartCard asks for large chunks of memory (2? bytes)
- SmartCard allocates cells from chunks
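The chunked alternative can be sketched as a small card-side allocator that calls the host only when its current chunk runs out. The chunk size, the `sbreak` callback signature, and the `FakeHost` class are assumptions for illustration; the slides do not specify them.

```python
class ChunkAllocator:
    """Card-side newcell() that requests memory from the host one chunk at a time."""

    def __init__(self, sbreak, chunk_cells):
        self.sbreak = sbreak            # host call: returns the base address of a new chunk
        self.chunk_cells = chunk_cells
        self.next = 0                   # next unused address in the current chunk
        self.limit = 0                  # one past the end of the current chunk

    def newcell(self):
        if self.next == self.limit:     # current chunk exhausted (or none yet)
            self.next = self.sbreak(self.chunk_cells)
            self.limit = self.next + self.chunk_cells
        addr = self.next
        self.next += 1
        return addr


class FakeHost:
    """Stand-in for the untrusted host; counts how often it is called."""

    def __init__(self):
        self.calls = 0
        self.brk = 100

    def sbreak(self, n_cells):
        self.calls += 1
        base, self.brk = self.brk, self.brk + n_cells
        return base


fake_host = FakeHost()
alloc = ChunkAllocator(fake_host.sbreak, chunk_cells=4)
addresses = [alloc.newcell() for _ in range(6)]
```

Six allocations cost two host calls here instead of six, which is the trade the slide describes.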
30 HEAP
31 Approach: Lisp 1.5 Interpreter
- Lisp 1.5 Programmer's Manual by McCarthy et al.
- Issues
- Lists: Car, Cdr, Cons
- Atoms and Integers
- Null
- Bindings
- Setq, Lambda, Function, Label
- FEXPRs: cond, plus...
- EXPRs: eq, lt, ...
- Replaca, Replacd
- Garbage Collection
- Storage allocation
- Input/Output
32 Why Lisp
- Very simple interpreter
- Usable, if not very efficient, programming language
- Based primarily on a single data structure: lists
- Easy to make lists tamper-resistant
- General problems
- Avoid need for a stack in the SmartCard
- => External stack or use of Lisp lists
33 Lisp Cell Format
- Assume write once per epoch
- hi = H(Ai, Di, F, Ti, Ci, Key)
[Figure: pointer register P1 in the SmartCard referring to cell C1 on the untrusted host]
34 Lisp 1.5 Issues Revisited
- Lists: Car, Cdr, Cons
- Atoms and Integers
- Null
- Bindings
- Setq, Lambda, Function, Label
- FEXPRs: cond, plus...
- EXPRs: eq, lt, ...
- Replaca, Replacd
- Garbage Collection
- Storage allocation
- Input/Output