Foundations of Cryptography, Lecture 6
Lecturer: Moni Naor
Topics: pseudo-random generators, hardcore predicates, the Goldreich-Levin Theorem, next-bit unpredictability.
2. Recap of last week's lecture
- Signature scheme definition
  - Existentially unforgeable against an adaptive chosen message attack
- Construction from UOWHFs
- Other paradigms for obtaining signature schemes
  - Trapdoor permutations
- Encryption
  - Desirable properties
  - One-time pad
- Cryptographic pseudo-randomness
  - Statistical difference
- Hardcore predicates
3. Computational Indistinguishability
- Definition: two sequences of distributions {D_n} and {D'_n} on {0,1}^n are polynomially (computationally) indistinguishable if, for every polynomial p(n), for every probabilistic polynomial time adversary A, and for sufficiently large n:
  - If A receives input y ∈ {0,1}^n and tries to decide whether y was generated by D_n or D'_n, then
  - |Prob[A = 0 | D_n] - Prob[A = 0 | D'_n]| < 1/p(n)
  - This difference is called A's advantage.
- Without the restriction to probabilistic polynomial time tests, this is equivalent to the variation distance being negligible:
  - Σ_{β ∈ {0,1}^n} |Prob[D_n = β] - Prob[D'_n = β]| < 1/p(n)
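For intuition, the variation (statistical) distance is easy to compute for small explicit distributions. A minimal sketch; the two toy distributions below are made up for illustration, and the factor 1/2 is the common normalization (the slide's condition bounds the same sum without it):

```python
def variation_distance(P, Q):
    """Total variation distance: (1/2) * sum over beta of |P(beta) - Q(beta)|."""
    support = set(P) | set(Q)
    return 0.5 * sum(abs(P.get(b, 0.0) - Q.get(b, 0.0)) for b in support)

# Two toy distributions on {0,1}^2:
D  = {"00": 0.25, "01": 0.25, "10": 0.25, "11": 0.25}   # uniform
Dp = {"00": 0.30, "01": 0.20, "10": 0.25, "11": 0.25}   # slightly biased

dist = variation_distance(D, Dp)   # 0.5 * (0.05 + 0.05) = 0.05
```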
4. Pseudo-random generators
- Definition: a function g: {0,1}* → {0,1}* is said to be a (cryptographic) pseudo-random generator if
  - It is polynomial time computable
  - It stretches the input: |g(x)| > |x|
    - Let l(n) be the length of the output on inputs of length n
  - If the input (seed) is random, then the output is computationally indistinguishable from random on {0,1}^l(n):
    - For any probabilistic polynomial time adversary A that receives input y of length l(n) and tries to decide whether y = g(x) or y is a random string from {0,1}^l(n), for any polynomial p(n) and sufficiently large n
    - |Prob[A = "rand" | y = g(x)] - Prob[A = "rand" | y ∈_R {0,1}^l(n)]| < 1/p(n)
[Figure: a seed x is fed into g, producing an output of length l(n)]
5. Hardcore Predicate
- Definition: for f: {0,1}* → {0,1}* we say that h: {0,1}* → {0,1} is a hardcore predicate for f if
  - h is polynomial time computable
  - For any probabilistic polynomial time adversary A that receives input y = f(x) and tries to compute h(x), for any polynomial p(n) and sufficiently large n
    - |Prob[A(y) = h(x)] - 1/2| < 1/p(n)
    - where the probability is over the choice of x and the random coins of A
- Sources of hardcoreness:
  - not enough information about x
    - not of interest for generating pseudo-randomness
  - enough information about x, but it is hard to compute
6. Single bit expansion
- Let f: {0,1}^n → {0,1}^n be a one-way permutation
- Let h: {0,1}^n → {0,1} be a hardcore predicate for f
- Consider g: {0,1}^n → {0,1}^(n+1) where
  - g(x) = (f(x), h(x))
- Claim: g is a pseudo-random generator
- Proof: can use a distinguisher A for g to guess h(x)
[Figure: within {0,1}^(n+1), the strings (f(x), h(x)) versus the strings (f(x), 1-h(x))]
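As a concrete toy illustration of the shape of this construction, the sketch below stands in a placeholder permutation for f (multiplication by an odd constant mod 2^N is a permutation but certainly not one-way) and uses the inner-product bit, introduced later in this lecture, as h; these instantiations are assumptions for the demo only.

```python
# Toy sketch of single-bit expansion g(x) = (f(x), h(x)).

N = 16  # bit length of the seed

def f(x: int) -> int:
    """Placeholder permutation on {0,1}^N (NOT actually one-way)."""
    return (x * 0x9E37) % (1 << N)   # odd multiplier => a permutation mod 2^N

def inner_product_bit(x: int, r: int) -> int:
    """Hardcore-style predicate: <x, r> mod 2."""
    return bin(x & r).count("1") % 2

R = 0b1010110010110101  # public randomness for the predicate (arbitrary)

def g(x: int) -> tuple:
    """Single-bit expansion: N bits in, N+1 bits out."""
    return (f(x), inner_product_bit(x, R))

y, b = g(12345)
```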
7. Using the distinguisher A to guess h(x)
- Run A on (f(x), 0) and (f(x), 1)
- If the outcomes differ, guess h(x) as the bit b that caused A to output "pseudo-random"
- Otherwise flip a coin
- Advantage: α_1 - α_3
- By assumption α_1 + α_2 > α_2 + α_3 + ε
- ⇒ Advantage α_1 - α_3 > ε
[Figure: over {0,1}^(n+1), A outputs "pseudo" with probability α_1 on inputs of the form (f(x), h(x)) and with probability α_3 on inputs of the form (f(x), 1-h(x)); "pseudo" answers are more prevalent on the (f(x), h(x)) side]
8. Hardcore Predicate With Public Information
- Definition: let f: {0,1}* → {0,1}*; we say that h: {0,1}* × {0,1}* → {0,1} is a hardcore predicate with public information for f if
  - h(x, r) is polynomial time computable
  - For any probabilistic polynomial time adversary A that receives input y = f(x) and public randomness r and tries to compute h(x, r), for any polynomial p(n) and sufficiently large n
    - |Prob[A(y, r) = h(x, r)] - 1/2| < 1/p(n)
    - where the probability is over the choice of x, the choice of r, and the random coins of A
- Alternative view: can think of the public randomness as modifying the one-way function f: f'(x, r) = (f(x), r).
9. Example: a weak hardcore predicate
- Let h(x, i) = x_i
  - I.e. h selects the i-th bit of x
- For any one-way function f, no polynomial time algorithm A(y, i) can have probability of success better than 1 - 1/(2n) of computing h(x, i)
- Exercise: let c: {0,1}* → {0,1}* be a good error correcting code:
  - |c(x)| is O(|x|)
  - the distance between any two codewords c(x) and c(x') is a constant fraction of |c(x)|
  - it is possible to correct, in polynomial time, errors in a constant fraction of the positions of c(x)
- Show that for h(x, i) = c(x)_i and any one-way function f, no polynomial time algorithm A(y, i) can have probability of success better than a constant of computing h(x, i)
10. Inner Product Hardcore Bit
- The inner product bit: choose r ∈_R {0,1}^n and let
  - h(x, r) = <r, x> = Σ_i x_i r_i mod 2
- Theorem (Goldreich-Levin): for any one-way function, the inner product is a hardcore predicate
- Proof structure:
  - Goal: an algorithm A' for inverting f
  - There are many x's for which A returns a correct answer (<r, x>) on a 1/2 + ε fraction of the r's
  - Reconstruction algorithm R: take an algorithm A that guesses h(x, r) correctly with probability 1/2 + ε over the r's and output a list of candidates for x
    - R makes no use of the value y, except for feeding it to A
  - Choose from the list the/an x such that f(x) = y
  - The reconstruction is the main step!
11. There are many x's for which A is correct
- Consider the table whose rows are indexed by x ∈ {0,1}^n and whose columns by r ∈ {0,1}^n, with entry 1 if A(f(x), r) returns h(x, r) and 0 otherwise
- Altogether, a 1/2 + ε fraction of the table is 1
- ⇒ For at least an ε/2 fraction of the rows, at least a 1/2 + ε/2 fraction of the row is 1
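The last implication is an averaging (Markov-style) argument; spelled out:

```latex
p_x := \Pr_r\bigl[A(f(x),r) = \langle r, x\rangle\bigr], \qquad
\mathbb{E}_x[p_x] \ge \tfrac12 + \varepsilon .
```

If fewer than an ε/2 fraction of the rows had p_x ≥ 1/2 + ε/2, then

```latex
\mathbb{E}_x[p_x]
  < \tfrac{\varepsilon}{2}\cdot 1
    + \Bigl(1 - \tfrac{\varepsilon}{2}\Bigr)\Bigl(\tfrac12 + \tfrac{\varepsilon}{2}\Bigr)
  = \tfrac12 + \varepsilon - \tfrac{\varepsilon}{4}(1 + \varepsilon)
  < \tfrac12 + \varepsilon ,
```

a contradiction.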
12. Why a list?
- Cannot hope for a unique answer!
- Suppose A has two candidates x and x'
  - On query r it returns at random either <r, x> or <r, x'>
- Prob[A(y, r) = <r, x>] = 1/2 + 1/2 · Prob[<r, x> = <r, x'>] = 3/4
13. Introduction to probabilistic analysis: concentration
- Let X_1, X_2, ..., X_n be 0/1 random variables where
  - Pr[X_i = 1] = p
- Then, for I = Σ_{i=1}^n X_i,
  - Exp[I] = np
- How concentrated is the sum around the expectation?
- Chebyshev: Pr[|I - E(I)| ≥ k·√VAR(I)] ≤ 1/k^2
  - if the X_i's are pairwise independent, then
  - VAR(I) = E[(I - E(I))^2] = Σ_{i=1}^n VAR(X_i) = np(1-p)
- Chernoff: if the X_i's are (completely) independent, then
  - Pr[|I - E(I)| ≥ k·√VAR(I)] ≤ 2e^(-k^2/4)
14. From guessing to inverting
- A: algorithm for guessing <r, x>
- R: reconstruction algorithm that outputs a list of candidates for x
- A': algorithm for inverting f on a given y
[Figure: R, on input y, feeds (y, r_1), (y, r_2), ..., (y, r_k) to A, obtaining guesses z_1 = <r_1, x>, z_2 = <r_2, x>, ..., z_k = <r_k, x>; from z_1, ..., z_k it forms candidates x_1, ..., x_k and checks whether f(x_i) = y]
15. Warm-up (1)
- If A returns a correct answer on a 1 - 1/(2n) fraction of the r's:
- Choose r_1, r_2, ..., r_n ∈_R {0,1}^n
- Run A(y, r_1), A(y, r_2), ..., A(y, r_n)
- Denote the responses z_1, z_2, ..., z_n
- If r_1, r_2, ..., r_n are linearly independent, then
  - there is a unique x satisfying <r_i, x> = z_i for all 1 ≤ i ≤ n
- Prob[z_i = A(y, r_i) = <r_i, x>] ≥ 1 - 1/(2n)
- Therefore, by a union bound, the probability that all the z_i's are correct is at least 1/2
- Do we need complete independence of the r_i's?
  - one-wise independence is sufficient
- Can choose r ∈_R {0,1}^n and set r_i = r ⊕ e_i
  - e_i = 0^(i-1) 1 0^(n-i)
- All the r_i's are linearly independent
- Each one is uniform in {0,1}^n
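A small sketch of Warm-up (1) with the correlated queries r_i = r ⊕ e_i. For simplicity the oracle here is always correct; since <r ⊕ e_i, x> = <r, x> ⊕ x_i, we recover x up to the unknown bit b = <r, x>, try both values of b, and keep the candidate consistent with f(x) = y. The permutation f below is a toy placeholder, not one-way.

```python
import random

random.seed(0)
n = 12

def f(x: int) -> int:
    """Toy placeholder permutation (NOT one-way); stands in for the OWP."""
    return (x * 2801) % (1 << n)     # odd multiplier => permutation mod 2^n

def ip(a: int, b: int) -> int:
    """Inner product <a, b> mod 2."""
    return bin(a & b).count("1") % 2

secret_x = random.getrandbits(n)
y = f(secret_x)

def A(_y: int, r: int) -> int:
    """Oracle guessing <r, x>; always correct in this warm-up sketch."""
    return ip(r, secret_x)

r = random.getrandbits(n)
# z_i = <r ^ e_i, x> = <r, x> ^ x_i, hence x_i = z_i ^ b where b = <r, x>.
zs = [A(y, r ^ (1 << i)) for i in range(n)]

candidates = []
for b in (0, 1):                     # try both guesses for b = <r, x>
    x = 0
    for i, z in enumerate(zs):
        x |= (z ^ b) << i
    if f(x) == y:                    # keep only candidates consistent with y
        candidates.append(x)
```

Since f is a permutation, the consistency check f(x) = y leaves exactly the planted secret in the candidate list.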
16. Warm-up (2)
- If A returns a correct answer on a 3/4 + ε fraction of the r's:
- Can amplify the probability of success!
- Given any r ∈ {0,1}^n, Procedure A'(y, r):
  - Repeat for j = 1, 2, ...
    - Choose r' ∈_R {0,1}^n
    - Run A(y, r ⊕ r') and A(y, r'). Denote the XOR of the two responses by z_j
  - Output the majority of the z_j's
- Analysis:
  - Pr[z_j = <r, x>] ≥ Pr[A(y, r') = <r', x> and A(y, r ⊕ r') = <r ⊕ r', x>] ≥ 1/2 + 2ε
  - This does not work for 1/2 + ε, since success on r' and r ⊕ r' is not independent
  - Each one of the events z_j = <r, x> is independent of the others
  - ⇒ By taking sufficiently many j's, can amplify to as close to 1 as we wish
    - Need roughly 1/ε^2 samples
- Idea for improvement: fix a few of the r's used for the amplification
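A sketch of the self-correction in Warm-up (2), under the assumption that A errs on an arbitrary fixed set of slightly less than 1/4 of the r's (here 1/5, chosen so the demo parameters are comfortably within the bound): majority voting over random pairs recovers <r, x> even on the bad r's.

```python
import random

random.seed(7)
n = 10
secret_x = random.getrandbits(n)

def ip(a, b):
    return bin(a & b).count("1") % 2

# Fix an adversarial "bad" set of 1/5 < 1/4 of all r's on which A errs.
all_r = list(range(1 << n))
random.shuffle(all_r)
bad = set(all_r[: (1 << n) // 5])

def A(r):
    ans = ip(r, secret_x)
    return ans ^ 1 if r in bad else ans   # wrong exactly on the bad set

def A_amplified(r, trials=501):
    """Majority over z_j = A(r ^ r') ^ A(r'): each z_j is correct whenever
    both r ^ r' and r' avoid the bad set, i.e. with probability >= 3/5."""
    votes = 0
    for _ in range(trials):
        rp = random.getrandbits(n)
        votes += A(r ^ rp) ^ A(rp)
    return int(votes * 2 > trials)

# Even on a bad r, the amplified procedure is correct (with high probability).
r_test = next(iter(bad))
```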
17. The real thing
- Choose r_1, r_2, ..., r_k ∈_R {0,1}^n
- Guess, for j = 1, 2, ..., k, the value z_j = <r_j, x>
  - Go over all 2^k possibilities
- For all nonempty subsets S ⊆ {1, ..., k}:
  - Let r_S = ⊕_{j ∈ S} r_j
  - The implied guess is z_S = ⊕_{j ∈ S} z_j
- For each position x_i:
  - for each nonempty S ⊆ {1, ..., k}, run A(y, e_i ⊕ r_S)
  - output the majority value of z_S ⊕ A(y, e_i ⊕ r_S)
- Analysis:
  - Each one of the vectors e_i ⊕ r_S is uniformly distributed
  - A(y, e_i ⊕ r_S) is correct with probability at least 1/2 + ε
  - Claim: for every pair of distinct nonempty subsets S ≠ T ⊆ {1, ..., k}, the two vectors r_S and r_T are pairwise independent
  - Therefore the variance is as in completely independent trials
  - For I = the number of correct answers A(y, e_i ⊕ r_S): VAR(I) ≤ 2^k · (1/2 + ε)(1/2 - ε)
  - Use Chebyshev's Inequality: Pr[|I - E(I)| ≥ λ·√VAR(I)] ≤ 1/λ^2
  - Need 2^k = n/ε^2 to get the probability of error to be at most 1/n
- For the correct guess z_1, ..., z_k, one of the 2^k reconstructed candidates is right
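The procedure above can be run end to end on toy parameters. In the sketch below the oracle answers <r, x> correctly with probability 1/2 + ε using independent coin flips (a simplifying assumption; the theorem needs no such independence), and a placeholder permutation stands in for the one-way function. Note that A is invoked only once per pair (i, S); the answers are reused across all 2^k guesses.

```python
import random

random.seed(2)
n, eps = 8, 0.25
k = 7                                    # 2^k = 128 >= n / eps^2
secret_x = random.getrandbits(n)

def f(x):
    """Toy placeholder permutation (NOT one-way)."""
    return (x * 77) % (1 << n)           # odd multiplier => permutation

y = f(secret_x)

def ip(a, b):
    return bin(a & b).count("1") % 2

def A(_y, r):
    """Oracle: returns <r, x> with probability 1/2 + eps."""
    ans = ip(r, secret_x)
    return ans if random.random() < 0.5 + eps else ans ^ 1

# Choose r_1..r_k, form r_S for all nonempty S, query A once per (i, S).
rs = [random.getrandbits(n) for _ in range(k)]
subsets = list(range(1, 1 << k))         # nonempty subsets S as bitmasks
r_S = {}
for S in subsets:
    v = 0
    for j in range(k):
        if S & (1 << j):
            v ^= rs[j]
    r_S[S] = v
answers = {(i, S): A(y, (1 << i) ^ r_S[S])
           for i in range(n) for S in subsets}

candidates = []
for guess in range(1 << k):              # all 2^k guesses for z_1..z_k
    x = 0
    for i in range(n):
        votes = 0
        for S in subsets:
            z_S = bin(guess & S).count("1") % 2   # implied guess for <r_S, x>
            votes += z_S ^ answers[(i, S)]
        if votes * 2 > len(subsets):     # majority vote for bit x_i
            x |= 1 << i
    if f(x) == y:                        # keep only candidates inverting y
        candidates.append(x)
```

For the correct guess vector, each vote equals x_i exactly when the oracle answered correctly, so the majority recovers every bit with overwhelming probability.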
18. Analysis
- Number of invocations of A:
  - 2^k (guesses) · n (positions) · (2^k - 1) (subsets) = poly(n, 1/ε) ≈ n^3/ε^4
- Size of the resulting list of candidates for x:
  - for each guess of z_1, z_2, ..., z_k there is a unique x
  - 2^k = poly(n, 1/ε) ≈ n/ε^2
- Conclusion: single bit expansion of a one-way permutation is a pseudo-random generator
[Figure: g maps x ∈ {0,1}^n to (f(x), h(x, r)) ∈ {0,1}^(n+1)]
19. Reducing the size of the list of candidates
- Idea: bootstrap
- Given any r ∈ {0,1}^n, Procedure A'(y, r):
  - Choose r_1, r_2, ..., r_k ∈_R {0,1}^n
  - Guess, for j = 1, 2, ..., k, the value z_j = <r_j, x>
    - Go over all 2^k possibilities
  - For all nonempty subsets S ⊆ {1, ..., k}:
    - Let r_S = ⊕_{j ∈ S} r_j
    - The implied guess is z_S = ⊕_{j ∈ S} z_j
  - for each nonempty S ⊆ {1, ..., k}, run A(y, r ⊕ r_S)
  - output the majority value of z_S ⊕ A(y, r ⊕ r_S)
- For 2^k = 1/ε^2 the probability of error is, say, at most 1/8
- Fix the same r_1, r_2, ..., r_k for subsequent executions
  - They are good for 7/8 of the r's
- Run warm-up (2)
- The size of the resulting list of candidates for x is 1/ε^2
20. Application: Diffie-Hellman
- The Diffie-Hellman assumption:
  - Let G be a group and g an element in G.
  - Given g, a = g^x and b = g^y it is hard to find c = g^xy
    - for random x and y, the probability of a poly-time machine outputting g^xy is negligible
  - More accurately: a sequence of groups
- We don't know how to verify whether a given c is equal to g^xy
- Exercise: show that under the DH Assumption, given a = g^x, b = g^y and r ∈ {0,1}^n, no polynomial time machine can guess <r, g^xy> with advantage 1/poly
  - for random x, y and r
21. Application: if subset sum is one-way, then it is a pseudo-random generator
- Subset sum problem: given
  - n numbers 0 ≤ a_1, a_2, ..., a_n < 2^m
  - a target sum y
  - find a subset S ⊆ {1, ..., n} with Σ_{i ∈ S} a_i = y
- Subset sum one-way function f: {0,1}^(mn+n) → {0,1}^(mn+m):
  - f(a_1, a_2, ..., a_n, x_1, x_2, ..., x_n) = (a_1, a_2, ..., a_n, Σ_{i=1}^n x_i a_i mod 2^m)
- If m < n, then we get out fewer bits than we put in.
- If m > n, then we get out more bits than we put in.
- Theorem: if, for m > n, subset sum is a one-way function, then it is also a pseudo-random generator
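The subset-sum function itself is easy to state in code (toy parameters; its one-wayness is, of course, an assumption):

```python
import random

random.seed(5)
n, m = 8, 12               # m > n: the n secret bits x select an m-bit sum

def subset_sum_f(a, x_bits):
    """f(a_1..a_n, x_1..x_n) = (a_1..a_n, sum of the chosen a_i mod 2^m)."""
    s = sum(ai for ai, xi in zip(a, x_bits) if xi) % (1 << m)
    return a, s

a = [random.randrange(1 << m) for _ in range(n)]      # the numbers a_i
x_bits = [random.randrange(2) for _ in range(n)]      # the subset bits x_i
_, y = subset_sum_f(a, x_bits)
# With the a_i's included in both input and output, the fresh output is the
# m-bit value y, so for m > n the function stretches its input.
```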
22. Subset Sum Generator
- Idea of proof: use the distinguisher A to compute <r, x>
- For simplicity, do the computation mod P for a large prime P
- Given r ∈ {0,1}^n and (a_1, a_2, ..., a_n, y):
  - Generate a new problem (a'_1, a'_2, ..., a'_n, y'):
    - Choose c ∈_R Z_P
    - Let a'_i = a_i if r_i = 0, and a'_i = a_i + c mod P if r_i = 1
  - Guess k ∈_R {0, ..., n} as the value of Σ x_i r_i
    - the number of locations where both x and r are 1
  - Let y' = y + ck mod P
  - Run the distinguisher A on (a'_1, a'_2, ..., a'_n, y')
  - Output what A says, XORed with parity(k)
- Claim: if k is correct, then (a'_1, a'_2, ..., a'_n, y') is pseudo-random
- Claim: for any incorrect k, (a'_1, a'_2, ..., a'_n, y') is random:
  - y' = z + (k - h)c mod P, where z = Σ_{i=1}^n x_i a'_i mod P and h = Σ x_i r_i
- Therefore the probability to guess <r, x> is 1/n · (1/2 + ε) + (n-1)/n · 1/2 = 1/2 + ε/n
  - Prob[A = 0 | pseudo] = 1/2 + ε
  - Prob[A = 0 | random] = 1/2
23. Interpretations of the Goldreich-Levin Theorem
- A tool for constructing pseudo-random generators
  - The main part of the proof
- A mechanism for translating "general confusion" into randomness
  - The Diffie-Hellman example
- List decoding of Hadamard Codes
  - works in the other direction as well (for any code with good list decoding)
  - List decoding, as opposed to unique decoding, allows getting much closer to the distance
  - Explains the unique decoding in the warm-up, where the prediction probability was 3/4 + ε
- Finding all linear functions agreeing with a function given as a black box
  - Learning all Fourier coefficients larger than ε
  - If the Fourier coefficients are concentrated on a small set, can find them
    - True for AC0 circuits
    - Decision trees
24. Two important techniques for showing pseudo-randomness
- Hybrid argument
- Next-bit prediction and pseudo-randomness
25. Hybrid argument
- To prove that two distributions D and D' are indistinguishable:
- suggest a collection of distributions
  - D = D_0, D_1, ..., D_k = D'
- If D and D' can be distinguished, then there is a pair D_i and D_(i+1) that can be distinguished.
- Advantage ε in distinguishing between D and D' means advantage ε/k between some D_i and D_(i+1)
- Use a distinguisher for the pair D_i and D_(i+1) to derive a contradiction
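The ε/k step is just a telescoping sum plus the triangle inequality:

```latex
\varepsilon \le \bigl|\Pr[A(D_0)=1] - \Pr[A(D_k)=1]\bigr|
  = \Bigl|\sum_{i=0}^{k-1} \bigl(\Pr[A(D_i)=1] - \Pr[A(D_{i+1})=1]\bigr)\Bigr|
  \le \sum_{i=0}^{k-1} \bigl|\Pr[A(D_i)=1] - \Pr[A(D_{i+1})=1]\bigr| ,
```

so some index i satisfies |Pr[A(D_i)=1] - Pr[A(D_(i+1))=1]| ≥ ε/k.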
26. Composing PRGs
- Composition:
- Let
  - g_1 be an (l_1, l_2)-pseudo-random generator
  - g_2 be an (l_2, l_3)-pseudo-random generator
- Consider g(x) = g_2(g_1(x))
- Claim: g is an (l_1, l_3)-pseudo-random generator
- Proof: consider three distributions on {0,1}^l_3:
  - D_1: y uniform in {0,1}^l_3
  - D_2: y = g(x) for x uniform in {0,1}^l_1
  - D_3: y = g_2(z) for z uniform in {0,1}^l_2
- Suppose, towards a contradiction, that there is a distinguisher A between D_1 and D_2
- By the triangle inequality, A must either
  - distinguish between D_1 and D_3, in which case A can be used to distinguish g_2,
- or
  - distinguish between D_2 and D_3, in which case A can be used to distinguish g_1
27. Composing PRGs
- When composing
  - a generator secure against advantage ε_1
  - and
  - a generator secure against advantage ε_2
- we get security against advantage ε_1 + ε_2
- When composing the single bit expansion generator n times:
  - a distinguisher with advantage ε against the result yields a distinguisher with advantage ε/n against a single step
- Hybrid argument: to prove that two distributions D and D' are indistinguishable:
  - suggest a collection of distributions D = D_0, D_1, ..., D_k = D' such that
  - if D and D' can be distinguished, there is a pair D_i and D_(i+1) that can be distinguished.
  - Difference ε between D and D' means ε/k between some D_i and D_(i+1)
  - Use such a distinguisher to derive a contradiction
28. From single bit expansion to many bit expansion
- Iterate f and output one hardcore bit per application:
  - Input (seed): x, together with public randomness r
  - Internal configuration: f(x), f^(2)(x), f^(3)(x), ..., f^(m)(x)
  - Output: h(x, r), h(f(x), r), h(f^(2)(x), r), ..., h(f^(m-1)(x), r)
- Can make r and f^(m)(x) public
  - But not any other internal state
- Can make m as large as needed
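A runnable sketch of the iterated construction; as before, f and its constants are placeholders for a genuine one-way permutation:

```python
n = 16
R = 0b1011001110001101            # public randomness for the hardcore bit

def f(x: int) -> int:
    """Placeholder permutation on n bits (NOT one-way)."""
    return (x * 0x6B5F) % (1 << n)   # odd multiplier => permutation mod 2^n

def h(x: int, r: int) -> int:
    """Inner-product hardcore bit <x, r> mod 2."""
    return bin(x & r).count("1") % 2

def expand(seed: int, m: int):
    """Output h(x, r), h(f(x), r), ..., h(f^(m-1)(x), r) plus f^(m)(x)."""
    bits, state = [], seed
    for _ in range(m):
        bits.append(h(state, R))
        state = f(state)          # intermediate states must stay secret...
    return bits, state            # ...but the final f^(m)(x) may be public

bits, final_state = expand(seed=0x1234, m=32)
```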
29. Exercise
- Let {D_n} and {D'_n} be two distributions that are
  - computationally indistinguishable
  - polynomial time samplable
- Suppose that y_1, ..., y_m are all sampled according to D_n, or all are sampled according to D'_n
- Prove: no probabilistic polynomial time machine can tell, given y_1, ..., y_m, whether they were sampled from D_n or from D'_n
30. Existence of PRGs
- What we have proved:
  - Theorem: if pseudo-random generators stretching by a single bit exist, then pseudo-random generators stretching by any polynomial factor exist
  - Theorem: if one-way permutations exist, then pseudo-random generators exist
- A harder theorem to prove:
  - Theorem [HILL]: if one-way functions exist, then pseudo-random generators exist
- Exercise: show that if pseudo-random generators exist, then one-way functions exist
31. Two important techniques for showing pseudo-randomness
- Hybrid argument
- Next-bit prediction and pseudo-randomness
32. Next-bit Test
- Definition: a function g: {0,1}* → {0,1}* is next-bit unpredictable if
  - It is polynomial time computable
  - It stretches the input: |g(x)| > |x|
    - denote by l(n) the length of the output on inputs of length n
  - If the input (seed) is random, then the output passes the next-bit test:
    - For any prefix length 0 ≤ i < l(n), for any probabilistic polynomial time adversary A (a predictor) that receives the first i bits of y = g(x) and tries to guess the next bit, for any polynomial p(n) and sufficiently large n
    - |Prob[A(y_1, y_2, ..., y_i) = y_(i+1)] - 1/2| < 1/p(n)
- Theorem: a function g: {0,1}* → {0,1}* is next-bit unpredictable if and only if it is a pseudo-random generator
33. Proof of equivalence
- If g is a presumed pseudo-random generator and there is a predictor for the next bit, we can use it to distinguish:
- Distinguisher:
  - If the predictor is correct: guess "pseudo-random"
  - If the predictor is incorrect: guess "random"
- On outputs of g, the distinguisher is correct with probability at least 1/2 + 1/p(n)
- On uniformly random inputs, the distinguisher is correct with probability exactly 1/2
34. Proof of equivalence
- If there is a distinguisher A for the output of g from random:
- form a sequence of hybrid distributions and use the success of A to predict the next bit at some position
  - where g(x) = y_1, y_2, ..., y_l and r_1, r_2, ..., r_l ∈_R U_l:
  - D_l: y_1, y_2, ..., y_(l-1), y_l
  - D_(l-1): y_1, y_2, ..., y_(l-1), r_l
  - ...
  - D_i: y_1, y_2, ..., y_i, r_(i+1), ..., r_l
  - ...
  - D_0: r_1, r_2, ..., r_(l-1), r_l
- There is an 0 ≤ i ≤ l-1 where A can distinguish D_i from D_(i+1).
- Can use A to predict y_(i+1)!
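A sketch of extracting a predictor from a distinguisher, on purely illustrative assumptions: the toy "generator" below appends a parity bit to its seed (so it is trivially next-bit predictable), the distinguisher checks that parity, and the predictor completes the prefix with a guessed bit plus random padding and keeps the guess the distinguisher calls pseudo-random.

```python
import random

random.seed(11)
n = 8
l = n + 1    # toy "generator": output the seed followed by its parity bit

def g(x_bits):
    return x_bits + [sum(x_bits) % 2]

def distinguisher(y):
    """Says 'pseudo' (1) iff the last bit equals the parity of the rest."""
    return 1 if y[-1] == sum(y[:-1]) % 2 else 0

def predictor(prefix):
    """Predict bit i+1 from the first i bits via the distinguisher: complete
    the string with a guessed bit b plus random padding, and keep the b on
    which the distinguisher answers 'pseudo'."""
    i = len(prefix)
    for b in (0, 1):
        completion = prefix + [b] + [random.randrange(2) for _ in range(l - i - 1)]
        if distinguisher(completion):
            return b
    return random.randrange(2)

# Predict the last bit of g's output from the first n bits.
x = [random.randrange(2) for _ in range(n)]
y = g(x)
guess = predictor(y[:n])
```

For the last position the padding is empty, so the predictor succeeds with certainty here; for earlier positions its advantage would track the distinguisher's, as in the hybrid argument above.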
35. Next-block Unpredictability
- Suppose that g maps a given seed S into a sequence of blocks y_1, y_2, ...
  - let l(n) be the number of blocks given a seed of length n
- g passes the next-block unpredictability test if:
  - For any prefix length 0 ≤ i < l(n), for any probabilistic polynomial time adversary A that receives the first i blocks of y = g(x) and tries to guess the next block y_(i+1), for any polynomial p(n) and sufficiently large n
  - Prob[A(y_1, y_2, ..., y_i) = y_(i+1)] < 1/p(n)
- Homework: show how to convert a next-block unpredictable generator into a pseudo-random generator.
36. Sources
- Goldreich's Foundations of Cryptography, volumes 1 and 2
- M. Blum and S. Micali, "How to Generate Cryptographically Strong Sequences of Pseudo-Random Bits", SIAM J. on Computing, 1984.
- O. Goldreich and L. Levin, "A Hard-Core Predicate for all One-Way Functions", STOC 1989.