Title: Encoding Information Flow in Haskell
1 Encoding Information Flow in Haskell
- Steve Zdancewic
- Peng Li
- University of Pennsylvania
- CSFW-19, July 2006
2 Current State of the Art
- Enforce information flow policies in programming languages!
- Jif [Myers et al.]
  - Variant of Java with the "decentralized label model"
  - Handles many features: classes, exceptions, state, dynamic policies, etc.
  - Software: Battleship game [Myers et al.], JifPoker [Sabelfeld], JPMail [Hicks et al.], Jif Web Servlet Framework [Chong et al.]
- FlowCaml [Simonet and Pottier]
  - Dialect of OCaml
  - Strong support for polymorphism and label inference
3 Benefits of language-based security
- End-to-end security guarantee: noninterference
- Enforced using type systems / static program analysis
- Cannot easily be achieved by access control / encryption
4 Costs of end-to-end
- The whole system (from one end to the other) must be built using the security-typed language!
- Adoption of a new language (like Jif and FlowCaml)
  - Wholesale only: use it for all (or nothing)
  - Re-train programmers (costly!)
  - Re-write the libraries and existing code bases
- Too big of a hammer?
  - Often in practice only a small part of the system (e.g. a few variables in 10,000 lines of code) has security requirements
  - Is it worth writing the entire system in this language?
5 Flexibility?
- Hard-wired in today's languages
  - DLM: the decentralized label model
- Problems
  - The security lattice is fixed
    - Many applications do not need the full power of DLM
    - Some applications need to use their own security lattice
  - Difficult to add new features
    - Dynamic policies, different interpretations of the security labels
    - Fancy new downgrading or declassification mechanisms [Sabelfeld and Sands, CSFW-18]
6 An alternative approach
- Use general purpose programming languages (such as C/Java/ML/Haskell)
  - Encode information-flow policies using the constructs provided by the general purpose language
  - Use private/abstract data types for information hiding
  - Support information-flow security by means of software libraries and design patterns
- Challenges
  - Encoding the information-flow policies (labels, etc.)
  - Ease of use / expressiveness
  - Dealing with implicit information flows
  - Soundness
7 This talk: Encoding Info-Flow in Haskell
- A security sublanguage implemented in Haskell
- Works with the Glasgow Haskell Compiler (GHC) right out of the box!
- Smooth integration with existing (Haskell) code
8 Encoding the Label Lattice
- Dynamically (encode the lattice as Haskell terms)
  - Simple: just use a data type
  - Flexible: easy to program new lattices, accommodate "advanced" features like dynamic policies
  - How do you tie the labels to a "static" analysis of the embedded language?
- Statically (encode the lattice as Haskell types)
  - Previous work shows how to encode the label lattice using polymorphic types [Tse and Zdancewic]
  - Perhaps gives stronger guarantees
  - Less flexible for dynamic policies
- Here, we take the dynamic approach.
9 A Haskell Typeclass for Labels
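The body of this slide is not preserved in this text. As a rough illustration of the dynamic approach (labels as ordinary Haskell terms), a label typeclass might look like the sketch below. The names `Label`, `labelLeq`, `labelJoin`, `labelMeet` and `TwoPoint` are assumptions made for this example, not the interface used in the talk.

```haskell
-- Hypothetical label interface: any type with a partial order,
-- a join (least upper bound) and a meet (greatest lower bound).
class Eq l => Label l where
  labelLeq  :: l -> l -> Bool  -- may data flow from the first label to the second?
  labelJoin :: l -> l -> l     -- least upper bound
  labelMeet :: l -> l -> l     -- greatest lower bound

-- A two-point lattice with LOW below HIGH (illustrative only).
data TwoPoint = LOW | HIGH
  deriving (Eq, Ord, Show)

instance Label TwoPoint where
  labelLeq  = (<=)  -- derived Ord follows constructor order: LOW < HIGH
  labelJoin = max
  labelMeet = min
```

A different application could supply its own lattice simply by writing another instance, which is the flexibility the dynamic encoding is meant to offer.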
10 Summary so far
- Encoding the security lattices
- Building a sub-language in Haskell
- Checking information-flow policies
- Specifying information-flow policies
- Code demo
11 A Security Sublanguage of Haskell
- Basic idea
  - Provide an abstract type Protected a
  - Protected a encapsulates a "secure computation" that creates a value of type a
  - Security types/labels are attached in the implementation of Protected a at run time
- Provide combinators to build more complex programs from simple Protected building blocks
  - Security types/labels are checked during composition
  - Dynamically checks information flow between the building blocks
  - Information-flow policy violations become run-time errors
12 How to build a sublanguage?
- In Haskell programming, Monad is an essential concept
- What a Monad is
  - A standard interface for programming with combinators
- What Haskell offers for monads
  - Syntactic sugar (the do syntax)
  - Automatic operator overloading (using typeclasses)
- What you can do with monads
  - Encapsulating side-effects in functional programming
  - Using monads as sublanguages
- What you cannot do with monads
  - Dynamic information-flow control!
13 Choosing the right interface
- Monad does not work for our purposes
  - A monadic language uses the control-flow primitives of the base language (Haskell)
  - CFG blocks are generated on-the-fly as the sublanguage evaluates
  - Cannot deal with implicit information flow
[Figure: Protected blocks connected by base language control flow (Haskell)]
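To make the point about on-the-fly CFG blocks concrete, here is a small sketch using a toy embedded monad. The type `Prog` and all function names are hypothetical and exist only for this illustration. The continuation after a secret read is an ordinary Haskell function, so an analysis that does not run the program cannot see the public write hidden inside the branch.

```haskell
-- Toy embedded language (hypothetical): read a secret Bool, write public Ints.
data Prog a
  = Done a
  | ReadSecret (Bool -> Prog a)  -- the rest of the program is an opaque function
  | WritePublic Int (Prog a)

instance Functor Prog where
  fmap f (Done a)          = Done (f a)
  fmap f (ReadSecret k)    = ReadSecret (fmap f . k)
  fmap f (WritePublic n p) = WritePublic n (fmap f p)

instance Applicative Prog where
  pure = Done
  pf <*> pa = pf >>= \f -> fmap f pa

instance Monad Prog where
  Done a          >>= f = f a
  ReadSecret k    >>= f = ReadSecret (\s -> k s >>= f)
  WritePublic n p >>= f = WritePublic n (p >>= f)

readSecret :: Prog Bool
readSecret = ReadSecret Done

writePublic :: Int -> Prog ()
writePublic n = WritePublic n (Done ())

-- The branch is taken by Haskell's own `if`, not by the sublanguage.
leaky :: Prog ()
leaky = do
  writePublic 0
  s <- readSecret
  if s then writePublic 1 else return ()

-- Writes that can be found without running the program.
visibleWrites :: Prog a -> [Int]
visibleWrites (Done _)          = []
visibleWrites (ReadSecret _)    = []  -- cannot look inside the continuation
visibleWrites (WritePublic n p) = n : visibleWrites p

-- Writes actually performed for a given secret.
performedWrites :: Bool -> Prog a -> [Int]
performedWrites _ (Done _)          = []
performedWrites s (ReadSecret k)    = performedWrites s (k s)
performedWrites s (WritePublic n p) = n : performedWrites s p
```

Here `visibleWrites leaky` is `[0]`, while `performedWrites True leaky` is `[0,1]` and `performedWrites False leaky` is `[0]`: the public output depends on the secret, yet the dependent write is invisible until the continuation is applied.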
14 Choosing the right interface
- Requirement: ability to reason about the whole control-flow graph (CFG) of the sublanguage in the base language
  - So we can use static analysis techniques to dynamically check information-flow policies at run time
[Figure: base language control flow (Haskell); sublanguage control flow connecting Protected blocks]
15 Solution: Arrows (generalizing monads) [Hughes '98]
  class Arrow a where
    -- Basic blocks
    pure  :: (b -> c) -> a b c
    -- Sequential composition
    (>>>) :: a b c -> a c d -> a b d
    -- Products
    first :: a b c -> a (b,d) (c,d)
    (***) :: a b c -> a d e -> a (b,d) (c,e)
  class Arrow a => ArrowChoice a where
    -- Conditionals
    left  :: a b c -> a (Either b d) (Either c d)
    (|||) :: a b d -> a c d -> a (Either b c) d
  class Arrow a => ArrowLoop a where
    -- Loops
    loop  :: a (b,d) (c,d) -> a b c
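As a quick illustration of these combinators, the sketch below uses the standard `Control.Arrow` module with plain functions, which are the simplest Arrow instance. Note that in current versions of GHC's base library the basic-block method is named `arr` rather than `pure`. The function names here are made up for the example.

```haskell
import Control.Arrow (arr, first, (>>>), (***), (|||))

-- Sequential composition of two basic blocks.
pipeline :: Int -> Int
pipeline = arr (+ 1) >>> arr (* 2)

-- Products: run an arrow on the first component only,
-- or run two arrows side by side.
onFirst :: (Int, Char) -> (Int, Char)
onFirst = first pipeline

both :: (Int, String) -> (Int, Int)
both = pipeline *** arr length

-- Conditionals: the choice between branches is expressed with an
-- arrow combinator instead of a Haskell `if`.
classify :: Either Int String -> Int
classify = pipeline ||| arr length
```

For example, `pipeline 3` is `8`, `both (3, "abc")` is `(8, 3)`, and `classify (Right "hello")` is `5`. Because both branches of `|||` are passed as arrow values, an implementation of the interface gets to see them before anything runs.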
17 Designing FlowArrow
- A protected computation of type FlowArrow a is internally represented as $\Phi \vdash e : l_1 \to l_2$
  - The encapsulated computation e
  - The information-flow type $l_1 \to l_2$ of e
    - e.g. LOW -> HIGH
  - A set of label constraints $\Phi$ for e
    - e.g. LOW <= HIGH, LOW <= LOW, HIGH <= UserLabel
- The control-flow primitives combine FlowArrow values by
  - Combining the encapsulated computations
  - Generating new types and constraints (type checking)
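18 Checking information flow
- Sequential composition
  $$\frac{\Phi_1 \vdash c_1 : l_1 \to l_2 \qquad \Phi_2 \vdash c_2 : l_3 \to l_4}{\Phi_1 \cup \Phi_2 \cup \{l_2 \sqsubseteq l_3\} \vdash c_1 \mathbin{\texttt{>>>}} c_2 : l_1 \to l_4}$$
- Parallel composition (for conditionals/products), where $\otimes$ stands for a parallel combinator
  $$\frac{\Phi_1 \vdash c_1 : l_1 \to l_2 \qquad \Phi_2 \vdash c_2 : l_3 \to l_4}{\Phi_1 \cup \Phi_2 \vdash c_1 \otimes c_2 : (l_1 \sqcap l_3) \to (l_2 \sqcup l_4)}$$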
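The two composition rules can be sketched in Haskell roughly as follows. This is a simplified illustration, not the library's code: it assumes a fixed two-point lattice, wraps plain functions rather than an arbitrary underlying arrow, and all names (`Flow`, `Constraint`, `seqFA`, `parFA`, `violated`) are hypothetical.

```haskell
data Label = LOW | HIGH deriving (Eq, Ord, Show)

data Flow = Flow Label Label deriving (Eq, Show)        -- l1 -> l2
data Constraint = Leq Label Label deriving (Eq, Show)   -- l must be below l'

-- A computation together with its flow type and pending constraints.
data FlowArrow b c = FlowArrow
  { computation :: b -> c
  , flow        :: Flow
  , constraints :: [Constraint]
  }

-- Sequential composition: the output label of the first arrow must
-- flow to the input label of the second, recorded as a new constraint.
seqFA :: FlowArrow b c -> FlowArrow c d -> FlowArrow b d
seqFA (FlowArrow f (Flow l1 l2) cs1) (FlowArrow g (Flow l3 l4) cs2) =
  FlowArrow (g . f) (Flow l1 l4) (Leq l2 l3 : cs1 ++ cs2)

-- Parallel composition: meet of the input labels, join of the outputs.
parFA :: FlowArrow b c -> FlowArrow d e -> FlowArrow (b, d) (c, e)
parFA (FlowArrow f (Flow l1 l2) cs1) (FlowArrow g (Flow l3 l4) cs2) =
  FlowArrow (\(x, y) -> (f x, g y)) (Flow (min l1 l3) (max l2 l4)) (cs1 ++ cs2)

-- Constraints that do not hold in the lattice.
violated :: FlowArrow b c -> [Constraint]
violated fa = [c | c@(Leq x y) <- constraints fa, not (x <= y)]

-- Illustrative building blocks.
readSecret :: FlowArrow () Int
readSecret = FlowArrow (const 42) (Flow LOW HIGH) []

publish :: FlowArrow Int String
publish = FlowArrow show (Flow LOW LOW) []

keepSecret :: FlowArrow Int String
keepSecret = FlowArrow show (Flow HIGH HIGH) []
```

With these definitions, `violated (seqFA readSecret publish)` is `[Leq HIGH LOW]`, while `violated (seqFA readSecret keepSecret)` is `[]`. The check needs only the labels and constraints, so it can happen before the encapsulated computation is run.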
20 Tagging Data
- We need a way to explicitly label the data in the embedded language
  - $\emptyset \vdash \mathit{tag}\ l : l \to l$
- A special operator tag in the sublanguage
  - let y = x >>> (tag HIGH)
  - The output of computation y has label HIGH
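A minimal sketch of how `tag` could behave under the typing rule above, assuming a two-point lattice and a simplified FlowArrow over plain functions. The record fields, the locally defined `>>>` operator, and the helpers `lowSink` and `violations` are all assumptions made for this example.

```haskell
data Label = LOW | HIGH deriving (Eq, Ord, Show)

data FlowArrow b c = FlowArrow
  { compute     :: b -> c
  , inLabel     :: Label
  , outLabel    :: Label
  , constraints :: [(Label, Label)]  -- (l, l') means l must be below l'
  }

-- Local sequential composition (not Control.Arrow's operator).
infixr 1 >>>
(>>>) :: FlowArrow b c -> FlowArrow c d -> FlowArrow b d
FlowArrow f l1 l2 cs1 >>> FlowArrow g l3 l4 cs2 =
  FlowArrow (g . f) l1 l4 ((l2, l3) : cs1 ++ cs2)

-- tag l : l -> l, with no constraints of its own.
tag :: Label -> FlowArrow a a
tag l = FlowArrow id l l []

x :: FlowArrow () Int
x = FlowArrow (const 7) LOW LOW []

-- As on the slide: the output of y carries label HIGH.
y :: FlowArrow () Int
y = x >>> tag HIGH

-- A consumer that only accepts LOW input (hypothetical).
lowSink :: FlowArrow Int String
lowSink = FlowArrow show LOW LOW []

violations :: FlowArrow b c -> [(Label, Label)]
violations fa = [(a, b) | (a, b) <- constraints fa, not (a <= b)]
```

Here `outLabel y` is `HIGH` and `violations y` is empty, whereas `violations (y >>> lowSink)` is `[(HIGH, LOW)]`: once data is tagged HIGH, sending it to a LOW consumer produces a failing constraint.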
21 Other Features
- Code privilege objects
  - An ADT representing the capability of the code at run time
  - Can only be created from the trusted computing base
- Declassification
  - Requires the code privilege object
  - Generates a label constraint during typechecking
- Code certification
  - Label constraints are checked before an embedded computation is executed
  - Unsafe declassifications and invalid implicit information flows are all ruled out
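One way these three features could fit together is sketched below. It is an assumption-laden illustration, not the library's API: `Privilege`, `UserGeq`, `declassify` and `certify` are hypothetical names, the lattice is fixed to two points, and in a real library the `Privilege` constructor would be hidden behind a module boundary so that only trusted code can create one.

```haskell
data Label = LOW | HIGH deriving (Eq, Ord, Show)

-- Capability held by the running code (constructor would be kept abstract).
newtype Privilege = Privilege Label

data Constraint
  = Leq Label Label   -- ordinary flow constraint
  | UserGeq Label     -- the code's privilege must be at least this label
  deriving (Eq, Show)

data FlowArrow b c = FlowArrow
  { compute           :: b -> c
  , inLabel, outLabel :: Label
  , constraints       :: [Constraint]
  }

infixr 1 >>>
(>>>) :: FlowArrow b c -> FlowArrow c d -> FlowArrow b d
FlowArrow f l1 l2 cs1 >>> FlowArrow g l3 l4 cs2 =
  FlowArrow (g . f) l1 l4 (Leq l2 l3 : cs1 ++ cs2)

-- Declassification relabels data and records a privilege constraint.
declassify :: Label -> Label -> FlowArrow a a
declassify from to = FlowArrow id from to [UserGeq from]

holds :: Privilege -> Constraint -> Bool
holds _             (Leq a b)   = a <= b
holds (Privilege p) (UserGeq l) = l <= p

-- Certification: check every constraint before releasing the computation.
certify :: Privilege -> FlowArrow b c -> Either [Constraint] (b -> c)
certify priv fa =
  case filter (not . holds priv) (constraints fa) of
    []  -> Right (compute fa)
    bad -> Left bad

secret :: FlowArrow () Int
secret = FlowArrow (const 42) LOW HIGH []

release :: FlowArrow () Int
release = secret >>> declassify HIGH LOW
```

Under this sketch, `certify (Privilege LOW) release` returns `Left [UserGeq HIGH]`, so the computation is never handed out, while `certify (Privilege HIGH) release` returns a runnable function.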
23 Code Demo
- Simple "bidding" server
  - Guests can place bids, but can't see the highest bid
  - Administrators can see (and reset) the highest bid
- Authentication database
  - Maps username and password to code privilege objects
  - Is itself a protected object with label HIGH
  - Uses declassification
24 Discussion
- What is the Trusted Computing Base?
  - Implementation of FlowArrow and code certification (both provided as library modules)
  - Implementation of the parts of the program that can manufacture code privilege objects
- Security guarantee (informal conjecture)
  - Let e1 and e2 have type Protected a tagged with label HIGH
  - Let H[-] be a well-typed Haskell context with no way to obtain a code privilege object capable of declassifying HIGH data
  - Then $H[e_1] \Downarrow v$ iff $H[e_2] \Downarrow v$
25 Caveats and Disclaimers
- Debugging may be more difficult
  - Checking occurs only for code that is executed
- Haskell provides unsafePerformIO (in System.IO.Unsafe)
- For more stateful programming, one needs to enrich the type system
  - Haskell is mostly pure anyway
- Arrow laws: our FlowArrow is conservative
- Our simple dynamic privilege model may not be sufficient
  - Untrusted code could duplicate or replay privileges
  - This is an instance of the standard problems with capability-based security mechanisms
  - Use "lifetimes" or "one-shot" privileges or other revocation strategies
26 Conclusions and Future Work
- Embedding security-typed languages in Haskell
  - Easily able to accommodate basic information-flow language properties
- Conjectures
  - It's easy to experiment with different label models, declassification features, etc.
- Future directions
  - Proof of soundness?
  - Can we make a more precise FlowArrow type system?
  - Static approaches? i.e. encoding labels as types?
27 Thanks!
28 What have we gained?
- Haven't we just replaced the barrier of adopting a security-oriented language with the barrier of adopting Haskell?
  - Yes, but there are already more Haskell programmers than Jif programmers
  - Haskell's features are intended to be more general purpose, so they're more likely to be adopted by mainstream languages