Title: On The Impossibility of Software Obfuscation
1. On The (Im)possibility of Software Obfuscation
Joint work with Oded Goldreich, Russell
Impagliazzo, Steven Rudich, Amit Sahai, Salil
Vadhan and Ke Yang.
2. What Is an Obfuscator?
- An obfuscator: an algorithm O such that for any program P, O(P) is a program such that
  - O(P) has the same functionality as P
  - O(P) is infeasible to analyze / reverse-engineer.
Intuition: an obfuscator should provide a "virtual black-box", in the sense that giving someone O(P) should be equivalent to giving her a black-box that computes P.
3. Why Might Obfuscators Exist?
- Practical Reasons
  - Understanding code is very difficult
  - Obfuscation used (successfully?) in practice for security purposes
- Theoretical Reasons
  - All canonical hard problems are problems of reverse engineering: SAT, HALTING
  - Rice's Theorem: You can't look at the code (Turing Machine description) of a function and find out a non-trivial property of it.
4. Applications for Obfuscators
- Distributing music on-line
- Removing Random Oracles for specific natural protocols.
- Converting a private key encryption to a public key encryption
- Giving someone the ability to sign/decrypt a restricted subset of the message space.
5. Private (Shared) Key Encryption → Public Key Encryption
Private Key Encryption Scheme
CPA (Chosen Plaintext Attack) Security
6. Public Key Encryption Scheme
Security
7. The Conversion
Instead of publishing the key k, publish $e = O(E_k)$, an obfuscation of the encryption circuit $E_k$.
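The shape of this conversion can be sketched in code. The following is a minimal Python sketch, not the construction from the talk: the toy HMAC-based stream cipher, and the names `keygen`, `encrypt`, `decrypt`, `obfuscate` and `to_public_key` are all assumptions made for illustration. In particular, `obfuscate` is a hypothetical placeholder that returns its input unchanged and therefore hides nothing.

```python
import hashlib
import hmac
import secrets


def keygen():
    """Sample a fresh 256-bit private key."""
    return secrets.token_bytes(32)


def _keystream(key, nonce, length):
    """Toy keystream: HMAC-SHA256 in counter mode (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(4, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt(key, message):
    """Randomized private-key encryption E_k(message)."""
    nonce = secrets.token_bytes(16)
    pad = _keystream(key, nonce, len(message))
    return nonce, bytes(m ^ p for m, p in zip(message, pad))


def decrypt(key, ciphertext):
    """Private-key decryption D_k(ciphertext)."""
    nonce, body = ciphertext
    pad = _keystream(key, nonce, len(body))
    return bytes(c ^ p for c, p in zip(body, pad))


def obfuscate(program):
    """HYPOTHETICAL stand-in for O: returns the program unchanged (no security)."""
    return program


def to_public_key(key):
    """Publish e = O(E_k): a program that encrypts but should not reveal k."""
    def E_k(message):
        return encrypt(key, message)
    return obfuscate(E_k)
```

Anyone holding the published program can encrypt, while only the holder of k can decrypt. Whether the published program really hides k depends entirely on the obfuscator, which is the question the rest of the talk addresses.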
8. Security of The Converted Scheme
9. Defining Obfuscators
- Definition 1: An algorithm O is an obfuscator if for any circuit C
  - (functionality) $O(C) \equiv C$ (i.e., O(C) computes the same function as C)
  - (polynomial slowdown) $|O(C)| \le p(|C|)$ for some polynomial $p(\cdot)$.
- We say that O is efficient if it runs in polynomial time.
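Definition 1 constrains only functionality and size, so even a transformation that hides nothing can satisfy it. The following minimal Python sketch checks both conditions for such a trivial transformation. The gate-list circuit format and the names `evaluate`, `size`, `trivial_O` and `same_function` are assumptions made for illustration, not notation from the talk.

```python
from itertools import product

# Two-input Boolean gates available in the toy circuit format.
OPS = {
    "and": lambda u, v: u & v,
    "or": lambda u, v: u | v,
    "xor": lambda u, v: u ^ v,
    "nand": lambda u, v: 1 - (u & v),
}


def evaluate(circuit, x):
    """Evaluate a circuit (n_inputs, gates, outputs) on the input bits x."""
    n, gates, outputs = circuit
    wires = list(x)
    for op, u, v in gates:
        wires.append(OPS[op](wires[u], wires[v]))
    return [wires[o] for o in outputs]


def size(circuit):
    """Circuit size |C|, measured here as the number of gates."""
    return len(circuit[1])


def trivial_O(circuit):
    """A transformation meeting Definition 1 that hides nothing:
    it appends a double negation to every output wire."""
    n, gates, outputs = circuit
    new_gates = list(gates)
    new_outputs = []
    for o in outputs:
        new_gates.append(("nand", o, o))        # NOT o
        neg = n + len(new_gates) - 1
        new_gates.append(("nand", neg, neg))    # NOT (NOT o)
        new_outputs.append(n + len(new_gates) - 1)
    return (n, new_gates, new_outputs)


def same_function(c1, c2):
    """Functionality check by exhaustive evaluation (small circuits only)."""
    n = c1[0]
    return all(evaluate(c1, x) == evaluate(c2, x)
               for x in product((0, 1), repeat=n))
```

For a circuit C with m output wires, `trivial_O(C)` has exactly `size(C) + 2 * m` gates, so the slowdown is linear. This is why a separate security requirement is needed on top of Definition 1.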
10. Defining Security
Anything that can be learned from the obfuscated form could have been learned by merely observing the circuit's input-output behavior (i.e., by treating the circuit as a black-box).
A Natural Formal Interpretation: For any adversary A there's a simulator S such that for any circuit C, $A(O(C)) \approx_{\text{C.I.}} S^{C}(1^{|C|})$, where $\approx_{\text{C.I.}}$ denotes computational indistinguishability.
This definition is impossible to meet!
11. Defining Security (2)
Relaxation: the simulator should only compute a specific function (even predicate) rather than generate an indistinguishable output.
Weak Obfuscators: $\forall A$ $\forall$ (poly time) predicate $p : \{0,1\}^* \to \{0,1\}$ $\exists S$ such that for all circuits C
$\left| \Pr[A(O(C)) = p(C)] - \Pr[S^{C}(1^{|C|}) = p(C)] \right| \le \mathrm{negl}(|C|)$
Note: this may be too weak for the desired applications, but we'll still prove that it is impossible to meet.
12. Inherently Unobfuscatable Functions
Definition 2: A (efficiently computable) function ensemble $\{F_t\}$ ( $F_t : \{0,1\}^{|t|} \to \{0,1\}^{|t|}$ ) is an unobfuscatable function ensemble (UF) if it satisfies:
There's a poly time predicate $p : \{0,1\}^* \to \{0,1\}$ such that
(a) (p easy to compute with a circuit) There's a p.p.t A such that for any circuit C such that $C \equiv F_t$, $A(C) = p(F_t)$
(b) (p hard to compute with black-box access) For any p.p.t S, if $t \in_R \{0,1\}^n$ then $\Pr[S^{F_t}(1^n) = p(t)] \le \frac{1}{2} + \mathrm{negl}(n)$
13. The Main Result
Theorem 1: $\exists$ unobfuscatable functions $\Rightarrow$ $\nexists$ very weak obfuscators.
Theorem 2: $\exists$ one way functions $\Rightarrow$ $\exists$ unobfuscatable functions.
Theorem 3: $\exists$ efficient weak obfuscators $\Rightarrow$ $\exists$ one way functions.
Corollary 4: Efficient weak obfuscators do not exist.
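Corollary 4 follows by chaining the three theorems. A sketch of that chain, assuming (as the naming suggests, though the slides do not say so explicitly) that every efficient weak obfuscator is in particular a very weak obfuscator:

```latex
\begin{align*}
\exists\ \text{efficient weak obfuscators}
  &\implies \exists\ \text{one way functions}         && \text{(Theorem 3)} \\
  &\implies \exists\ \text{unobfuscatable functions}  && \text{(Theorem 2)} \\
  &\implies \nexists\ \text{very weak obfuscators}    && \text{(Theorem 1)}
\end{align*}
```

The last line rules out efficient weak obfuscators as well, contradicting the starting assumption, so no efficient weak obfuscator can exist.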
14. The Combination Operator
For $f_0, f_1 : X \to Y$, define $f_0 \# f_1 : \{0,1\} \times X \to Y$ by $(f_0 \# f_1)(b,x) = f_b(x)$
- Properties
  - From a circuit C that computes $f_0 \# f_1$ one can compute circuits $C_0, C_1$ that compute $f_0$ and $f_1$ ( $C_b(x) = C(b,x)$ ).
  - Oracle access to $f_0 \# f_1$ $\Leftrightarrow$ oracle access to both $f_0$ and $f_1$
Using the combination operator, we can attempt to prove Theorem 2.
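The combination operator and its first property are easy to make concrete. A minimal Python sketch, with functions standing in for circuits; the names `combine` and `split` are illustrative assumptions, not notation from the talk:

```python
def combine(f0, f1):
    """Return the combined function: (b, x) -> f_b(x)."""
    def combined(b, x):
        return f1(x) if b else f0(x)
    return combined


def split(c):
    """Recover C_0 and C_1 from a combined program C by fixing the selector bit:
    C_b(x) = C(b, x)."""
    def c0(x):
        return c(0, x)

    def c1(x):
        return c(1, x)

    return c0, c1
```

For example, with `f = combine(lambda x: x + 1, lambda x: x * 2)`, the call `f(0, 3)` evaluates the first function and `f(1, 3)` the second, and `split(f)` returns two programs that agree with the originals on every input. Fixing the selector bit works equally well with only oracle access to the combined function, which is the second property.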
15. Solving The Input Size Problem
Lemma 5: If one-way functions exist then there exists an (efficiently constructible) ensemble $\{D_{a,b,z}\}$ such that
1. There's a p.p.t A such that for any circuit C that satisfies $C(a) = b$ and for any z, $A^{D_{a,b,z}}(C) = a_1$
(in particular there's a p.p.t A' such that $A'(C \# D_{a,b,z}) = a_1$)
2. Oracle access to $D_{a,b,z}$ does not help in learning anything about a.
Formal Interpretation (semantic security): For any p.p.t S there's a p.p.t S' such that for any (poly time) function $p : \{0,1\}^* \to \{0,1\}$
$\left| \Pr_{a,b,z}[S^{D_{a,b,z}}(1^n) = p(a)] - \Pr_{a,b,z}[S'(1^n) = p(a)] \right| \le \mathrm{negl}(n)$
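16. Lemma 5 Proves Theorem 2
- Define $F_{a,b,z} = C_{a,b} \# D_{a,b,z}$, where $C_{a,b}$ is the point function that outputs b on input a and 0 otherwise
- $p(a,b,z) = a_1$
- We claim that $\{F_{a,b,z}\}$ is an unobfuscatable function ensemble (Definition 2) w.r.t. the function p.
- Algorithm A': When given a circuit F do
  - Decompose F into circuits C, D such that $F \equiv C \# D$
  - Return $A^{D}(C)$
- Claim 1: For any circuit F such that $F \equiv F_{a,b,z}$,
  - $A'(F) = a_1$
- Claim 2: For any p.p.t S
  - $\Pr_{a,b,z}[S^{F_{a,b,z}}(1^n) = a_1] \le \frac{1}{2} + \mathrm{negl}(n)$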
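17. Proof of Lemma 5
Let $(\mathrm{ENC}_k, \mathrm{DEC}_k)$ be a private key encryption scheme. Define:
- $I_{a,k}$ = the constant function $(\mathrm{ENC}_k(a_1), \ldots, \mathrm{ENC}_k(a_n))$
- $H_k(c, d, \odot) = \mathrm{ENC}_k(\mathrm{DEC}_k(c) \odot \mathrm{DEC}_k(d))$, where $\odot$ is a binary Boolean operation
- $B_{a,b,k}(c_1, \ldots, c_n) = a_1$ if $\mathrm{DEC}_k(c_1) = b_1, \ldots, \mathrm{DEC}_k(c_n) = b_n$, and $B_{a,b,k}(c_1, \ldots, c_n) = 0$ otherwise
Let $\{h_{k'}\}$ be a pseudorandom function ensemble. We define $D_{a,b,k,k'} = I_{a,k,k'} \# H_{k,k'} \# B_{a,b,k}$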
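The way an algorithm with a circuit for C can use these three oracles can be sketched in code. It starts from the encrypted bits of a supplied by I, evaluates C gate by gate on ciphertexts through H, and hands the encrypted outputs to B, which releases a_1 exactly when they decrypt to b. The following is a minimal Python sketch under several assumptions that are not from the talk: a toy HMAC-based bit encryption, a gate-list circuit format, and the helper names `make_oracles`, `point_circuit`, `evaluate` and `recover_first_bit`. The derandomization via the pseudorandom function is omitted.

```python
import hashlib
import hmac
import secrets

OPS = {
    "and": lambda u, v: u & v,
    "nand": lambda u, v: 1 - (u & v),
    "xor": lambda u, v: u ^ v,
}


def _prf_bit(key, r):
    """One pseudorandom bit derived from r under key (toy construction)."""
    return hmac.new(key, r, hashlib.sha256).digest()[0] & 1


def enc(key, bit):
    """Toy randomized encryption of a single bit."""
    r = secrets.token_bytes(16)
    return (r, bit ^ _prf_bit(key, r))


def dec(key, c):
    r, masked = c
    return masked ^ _prf_bit(key, r)


def make_oracles(a, b, key):
    """Build the oracles I, H, B for secret bit lists a and b."""
    def I():
        return [enc(key, bit) for bit in a]

    def H(c, d, op):
        return enc(key, OPS[op](dec(key, c), dec(key, d)))

    def B(cs):
        return a[0] if [dec(key, c) for c in cs] == list(b) else 0

    return I, H, B


def point_circuit(a, b):
    """Gate list for a circuit that outputs b on input a and zeros otherwise.
    Each gate is (op, left_wire, right_wire); wires 0..n-1 are the inputs."""
    n = len(a)
    gates = []

    def add(op, u, v):
        gates.append((op, u, v))
        return n + len(gates) - 1

    # test_i is 1 exactly when x_i equals a_i.
    tests = [i if a[i] else add("nand", i, i) for i in range(n)]
    eq = tests[0]
    for t in tests[1:]:
        eq = add("and", eq, t)
    zero = add("xor", eq, eq)
    outputs = [eq if bit else zero for bit in b]
    return gates, outputs


def evaluate(gates, outputs, x):
    """Evaluate the circuit in the clear on input bits x."""
    wires = list(x)
    for op, u, v in gates:
        wires.append(OPS[op](wires[u], wires[v]))
    return [wires[o] for o in outputs]


def recover_first_bit(gates, outputs, I, H, B):
    """Evaluate the circuit on encrypted a via H, then ask B for a_1."""
    wires = list(I())
    for op, u, v in gates:
        wires.append(H(wires[u], wires[v], op))
    return B([wires[o] for o in outputs])
```

The number of oracle queries grows with the size of the circuit, but each query has a fixed length, which is how this approach sidesteps feeding the whole circuit to the oracle in one call. A party with oracle access alone has no circuit to evaluate gate by gate and sees only ciphertexts.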
18. Unobfuscatable Encryption Scheme
Definition 3: A (CPA secure) private key encryption scheme (GEN, ENC, DEC) is unobfuscatable if there is an algorithm A such that $A(C) = k$ for any circuit C s.t. $C \equiv \mathrm{ENC}_k$. That is, A can totally break the encryption scheme given any circuit that computes the encryption function.
Theorem 6: If secure private key encryption schemes exist then so do inherently unobfuscatable encryption schemes.
19. Proof of Theorem 6
Suppose that (GEN, ENC, DEC) is a (CPA) secure private key encryption scheme. It follows that one way functions exist. Let $\{F_{a,b,z}\}$ be the ensemble from the proof of Theorem 2 and change it to $\{F_{a,b,z,k}\}$ such that there's an algorithm A such that $A(F) = (a,b,z,k)$ for any circuit F such that $F \equiv F_{a,b,z,k}$.
Define (GEN', ENC', DEC') to be the following:
$\mathrm{GEN}'(1^n) = (k, a, b, z)$ where $k \leftarrow \mathrm{GEN}(1^n)$
$\mathrm{ENC}'_{k,a,b,z}(m) = \mathrm{ENC}_k(m) \,\|\, F_{a,b,z,k}(m)$
$\mathrm{DEC}'_{k,a,b,z}(c \,\|\, y) = \mathrm{DEC}_k(c)$
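The structure of the modified scheme can be sketched as a wrapper around an existing scheme: encryption appends the value of F on the message, and decryption ignores that appended part. A minimal Python sketch, in which `make_unobfuscatable_scheme`, `sample_index` and the callable `F` are hypothetical names and interfaces chosen for illustration; the actual ensemble F is the one described in the proof and is not implemented here.

```python
def make_unobfuscatable_scheme(gen, enc, dec, sample_index, F):
    """Wrap (gen, enc, dec) into (gen', enc', dec').

    sample_index(n) is assumed to return the extra key material (a, b, z);
    F(index, k, m) is assumed to evaluate the unobfuscatable function on m.
    """
    def gen_prime(n):
        k = gen(n)                       # k <- GEN(1^n)
        return (k, sample_index(n))      # new key is (k, (a, b, z))

    def enc_prime(key, m):
        k, index = key
        return (enc(k, m), F(index, k, m))   # ciphertext is ENC_k(m) paired with F(m)

    def dec_prime(key, ciphertext):
        k, _index = key
        c, _y = ciphertext               # the appended value y is ignored
        return dec(k, c)

    return gen_prime, enc_prime, dec_prime
```

Correctness of decryption is inherited from the original scheme because the appended value is discarded. The point of the construction is that any circuit computing the new encryption function also computes F, from which the key can be extracted.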
20. Similar Results
If signature schemes exist then so do
unobfuscatable signature schemes. If pseudorandom
functions exist then so do unobfuscatable
pseudorandom functions.
These results mean that any algorithm that satisfies Definition 1 cannot be used to obtain the applications described before in the way that we thought. They do not mean that these applications can't be obtained in other ways. (In particular, we believe that public key encryption schemes do exist.)
21. Other Results
- Generalization: obfuscators that only (strongly) approximate the input circuit: for any circuit C, and for any input x
  - $\Pr[C(x) \ne (O(C))(x)] \le \mathrm{negl}(|C|)$
  - (probability only over O's coin tosses)
  - Note: our proof does not directly apply here
- A promise problem version of a complexity theory analog of Rice's Theorem is false.
- Weaker obfuscation-like notions (e.g., sampling obfuscators).
22. Conclusions
- Is there any hope for obfuscation?
- Weaker / different definitions.
- Restricted classes of algorithms.