pebbling and proofs of work - PowerPoint PPT Presentation


1
The Complexity of Pebbling Graphs and Spam Fighting
Moni Naor, WEIZMANN INSTITUTE OF SCIENCE
2
Based on:
  • Cynthia Dwork, Andrew Goldberg, N.: On Memory-Bound Functions for Fighting Spam
  • Cynthia Dwork, N., Hoeteck Wee: Pebbling and Proofs of Work

3
Principal techniques for spam-fighting
  • FILTERING
  • text-based, trainable filters
  • MAKING SENDER PAY
  • computation [Dwork Naor 92, Back 97, Abadi Burrows Manasse Wobber 03, DGN 03, DNW 05]
  • human attention [Naor 96, Captcha]
  • micropayments
  • NOTE: the techniques are complementary and reinforce each other!

5
Talk Plan
  • The proofs-of-work approach
  • DGN's memory-bound functions
  • Generating a large random-looking table [DNW]
  • Open problems: moderately hard functions

6
Pricing via processing [Dwork-Naor, Crypto 92]
IDEA: If I don't know you, prove you spent significant computational resources (say 10 secs of CPU time), just for me, and just for this message.
  • automated for the user
  • non-interactive, single-pass
  • no need for a third party or payment infrastructure

7
Choosing the function f
  • Message m, Sender S, Receiver R, and date and time d
  • Hard to compute: f(m,S,R,d) cannot be amortized
  • lots of work for the sender
  • should have a good understanding of the best methods for computing f
  • Easy to check: z = f(m,S,R,d) means little work for the receiver
  • Parameterized to scale with Moore's Law
  • easy to exponentially increase computational cost, while barely increasing checking cost
  • Example: computing a square root mod a prime vs. verifying it
  • x² ≡ y (mod P)
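As a concrete instance of this compute/verify gap, here is a minimal sketch (the prime and the value of y are illustrative choices, not from the talk):

```python
# Sketch of the asymmetry: computing a square root mod a prime P
# (here P ≡ 3 mod 4, so x = y^((P+1)/4) mod P) costs a full modular
# exponentiation, while verifying costs a single modular squaring.
P = 1000003  # an illustrative prime with P % 4 == 3

def compute_sqrt(y):
    # hard direction: modular exponentiation, O(log P) multiplications
    return pow(y, (P + 1) // 4, P)

def verify_sqrt(x, y):
    # easy direction: one modular squaring
    return (x * x) % P == y % P

y = (1234 * 1234) % P   # a known quadratic residue
x = compute_sqrt(y)
assert verify_sqrt(x, y)
```

The checker does exponentially less work than the solver, which is exactly the scaling property asked for above.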

8
Which computational resource(s)?
  • WANT: a cost that corresponds to the same computation time across machines
  • computing cycles?
  • high variance of CPU speeds among desktops
  • factors of 10-30
  • memory-bound approach [Abadi Burrows Manasse Wobber 03]
  • low variance in memory latencies
  • factors of 1-4

GOAL: design a memory-bound proof-of-effort function which requires a large number of cache misses
10
memory-bound model
USER:
  • CACHE: small but fast
  • MAIN MEMORY: large but slow
SPAMMER:
  • CACHE: size at most ½ the user's main memory
  • MAIN MEMORY: may be very, very large
MODEL:
  • charge accesses to main memory
  • must avoid exploitation of locality
  • computation is free, except for hash function calls
  • watch out for low-space crypto attacks

11
Talk Plan
  • The proofs-of-work approach
  • DGN's memory-bound functions
  • Generating a large random-looking table [DNW]
  • Open problems: moderately hard functions

12
Path-following approach [DGN, Crypto 03]
  • PUBLIC: large random table T (2 × spammer's cache size)
  • PARAMETERS: integer L, effort parameter e
  • IDEA: a path is a sequence of L sequential accesses to T
  • sender searches a collection of paths to find a good path
  • collection depends on (m, S, R, d)
  • density of good paths: 1/2^e
  • locations in T depend on hash functions H0,…,H3

13
Path-following approach [DGN, Crypto 03]
  • PUBLIC: large random table T (2 × spammer's cache size)
  • PARAMETERS: integer L, effort parameter e
  • IDEA: a path is a sequence of L sequential accesses to T
  • sender searches a collection of paths to find a good path
  • OUTPUT: (m, S, R, d) plus a description of a good path
  • COMPLEXITY: sending O(2^e·L) memory accesses; verifying O(L) accesses

14
[Figure: collection P of paths of length L; the collection depends on (m,S,R,d)]
15
Abstracted Algorithm
  • Sender and Receiver share a large random table T
  • To send message m, Sender S, Receiver R, date/time d:
  • Repeat trial for k = 1, 2, … until success
  • current state specified by an auxiliary table A
  • thread defined by (m,S,R,d,k)
  • Initialization: A ← H0(m,S,R,d,k)
  • Main Loop: walk for L steps (L = path length)
  • c ← H1(A)
  • A ← H2(A, T[c])
  • Success if the last e bits of H3(A) are 00…0
  • Attach to (m,S,R,d) the successful trial number k and H3(A)
  • Verification: straightforward given (m, S, R, d, k, H3(A))
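A minimal runnable sketch of the abstracted algorithm, with SHA-256 standing in for the idealized hash functions H0..H3; the table size, L, and e below are toy values of our choosing, not the talk's concrete instantiation:

```python
import hashlib
import os

# Toy "large random table" and parameters (illustrative, not the paper's).
T = [os.urandom(8) for _ in range(1 << 10)]
L, e = 32, 4  # path length, effort parameter

def H(tag, *parts):
    # SHA-256 with a domain-separation tag stands in for H0..H3
    h = hashlib.sha256(tag.encode())
    for p in parts:
        h.update(p)
    return h.digest()

def walk(m, S, R, d, k):
    A = H("H0", m, S, R, d, k.to_bytes(4, "big"))  # initialization
    for _ in range(L):                             # main loop: L table accesses
        c = int.from_bytes(H("H1", A), "big") % len(T)
        A = H("H2", A, T[c])
    return A

def send(m, S, R, d):
    k = 1
    while True:  # repeat trials until success
        h3 = H("H3", walk(m, S, R, d, k))
        if int.from_bytes(h3, "big") % (1 << e) == 0:  # last e bits all zero
            return k, h3                                # attach (k, H3(A))
        k += 1

def verify(m, S, R, d, k, h3):
    # receiver redoes a single walk: O(L) accesses
    return H("H3", walk(m, S, R, d, k)) == h3

k, h3 = send(b"msg", b"alice", b"bob", b"2003-08-20")
assert verify(b"msg", b"alice", b"bob", b"2003-08-20", k, h3)
```

The sender performs about 2^e walks in expectation, while the receiver performs exactly one, matching the sender/verifier complexities stated on the previous slides.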

16
Animated Algorithm: a Single Step in the Loop
[Figure: C ← H1(A); A ← H2(A, T[C])]
17
Full Specification
  • E = (expected) factor by which computation cost exceeds verification = expected number of trials = 2^e
  • (if H3 behaves as a random function)
  • L = length of walk
  • Want, say, E·L·t ≈ 10 seconds, where
  • t = memory latency ≈ 0.2 μsec
  • Reasonable choices:
  • E = 24,000, L = 2048
  • Also need: how large is A?
  • A should not be very small
abstract algorithm
  • Initialize: A ← H0(m,S,R,d,k)
  • Main Loop: walk for L steps
  • c ← H1(A)
  • A ← H2(A, T[c])
  • Success if H3(A) = 0^(log E)
  • Trial repeated for k = 1, 2, …
  • Proof: (m,S,R,d,k,H3(A))
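The parameter choice can be sanity-checked by the arithmetic it implies:

```python
# Expected sender cost is E * L memory accesses, each costing roughly
# t = 0.2 microseconds of memory latency (values from the slide).
E, L = 24_000, 2048
t = 0.2e-6  # memory latency in seconds

expected_seconds = E * L * t
assert 9.5 < expected_seconds < 10.5  # ≈ 9.83 s, i.e. about 10 seconds
```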

18
Choosing the H's
  • A theoretical approach: idealized random functions
  • provides a formal analysis showing that the amortized number of memory accesses is high
  • A concrete approach: inspired by the RC4 stream cipher
  • very efficient: a few cycles per step
  • no time inside the inner loop to compute a complex function
  • A is not small and changes gradually
  • experimental results across different machines

19
Path-following approach [Dwork-Goldberg-Naor, Crypto 03]
  • Theorem: fix any spammer
  • whose cache size is smaller than |T|/2
  • assuming T is truly random
  • assuming H0,…,H3 are idealized hash functions
  • then the amortized number of memory accesses per successful message is Ω(2^e·L)
  • Remarks:
  • the lower bound holds for a spammer maximizing throughput across any collection of messages and recipients
  • we model idealized hash functions using random oracles
  • the proof relies on the information-theoretic unpredictability of T

20
Why Random Oracles?
  • Random Oracles 101:
  • can measure progress
  • know which oracle calls must be made
  • can see when they occur
  • the first occurrence of each such call is a "progress call"
  • example call sequence: 1 2 3 1 3 2 3 4
  • Initialize: A ← H0(m,S,R,d,k)
  • Main Loop: walk for L steps
  • c ← H1(A)
  • A ← H2(A, T[c])
  • Success if H3(A) = 0^(log E)
  • Trial repeated for k = 1, 2, …
  • Proof: (m,S,R,d,k,H3(A))

abstract algorithm
21
Proof highlights
  • Use of an idealized hash function implies:
  • at any point in time, A is incompressible
  • the average number of oracle calls per success is Ω(E·L)
  • we can follow the progress of the algorithm
  • Cast the problem as asymmetric communication complexity between memory and cache
  • only the cache has access to the functions H1 and H2
22
Talk Plan
  • The proofs-of-work approach
  • DGN's memory-bound functions
  • Generating a large random-looking table [DNW]
  • Open problems

23
Using a succinct table [DNW 05]
  • GOAL: use a table T with a succinct description
  • easy distribution of software (to new users)
  • fast updates (over slow connections)
  • PROBLEM: we lose information-theoretic unpredictability
  • the spammer can exploit the succinct description to avoid memory accesses
  • IDEA: generate T using a memory-bound process
  • use time-space trade-offs for pebbling
  • studied extensively in the 1970s

The user builds the table T once and for all.
24
Pebbling a graph
  • GIVEN: a directed acyclic graph
  • RULES:
  • inputs: a pebble can be placed on an input node at any time
  • a pebble can be placed on any non-input vertex if all its immediate parent nodes have pebbles
  • pebbles may be removed at any time
  • GOAL: find a strategy to pebble all the outputs while using few pebbles and few moves

[Figure: a DAG with input and output nodes]
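The rules above can be captured by a small checker; this is a sketch, with the move encoding and the tiny example DAG being our own choices:

```python
# A move is ("place", v) or ("remove", v); parents maps each node to the
# set of its immediate parents (empty set = input node).
def valid_strategy(parents, outputs, moves, max_pebbles):
    pebbled, peak = set(), 0
    for op, v in moves:
        if op == "remove":
            pebbled.discard(v)  # pebbles may be removed at any time
        else:
            # an input is always placeable; otherwise all immediate
            # parents must currently carry pebbles
            if parents[v] and not parents[v] <= pebbled:
                return False
            pebbled.add(v)
            peak = max(peak, len(pebbled))
    placed = {v for op, v in moves if op == "place"}
    # goal: every output was pebbled at some point, within the budget
    return outputs <= placed and peak <= max_pebbles

# tiny example: output c depends on inputs a and b
parents = {"a": set(), "b": set(), "c": {"a", "b"}}
moves = [("place", "a"), ("place", "b"), ("place", "c"),
         ("remove", "a"), ("remove", "b")]
assert valid_strategy(parents, {"c"}, moves, max_pebbles=3)
assert not valid_strategy(parents, {"c"}, [("place", "c")], max_pebbles=3)
```

The quantities the lower bounds care about are exactly `peak` (pebbles used) and the number of "place" moves.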
25
What do we know about pebbling?
  • Any graph can be pebbled using O(N/log N) pebbles [Valiant]
  • There are graphs requiring Ω(N/log N) pebbles [PTC]
  • Any graph of depth d can be pebbled using O(d) pebbles
  • (constant degree assumed throughout)
  • Tight trade-offs: some shallow graphs require many (super-polynomial) moves to pebble with few pebbles [LT]
  • Some results about pebbling the outputs hold even when it is possible to put the available pebbles in any initial configuration

26
Succinctly generating T
  • GIVEN: a directed acyclic graph with constant in-degree
  1. input node i labeled H4(i)
  2. non-input node i labeled H4(i, labels of parent nodes)
  3. entries of T = labels of output nodes

OBSERVATION: good pebbling strategy ⇒ good spammer strategy

[Figure: node i with parents j, k gets label Li = H4(i, Lj, Lk)]
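The labelling scheme can be sketched as follows; SHA-256 stands in for the idealized hash H4, and the tiny example graph is ours:

```python
import hashlib

def H4(*parts):
    # SHA-256 over the stringified arguments stands in for H4
    h = hashlib.sha256()
    for p in parts:
        h.update(repr(p).encode())
    return h.digest()

def label(parents, topo_order):
    lab = {}
    for i in topo_order:
        if not parents[i]:
            lab[i] = H4(i)  # input node i: label H4(i)
        else:
            # non-input node i: label H4(i, labels of parent nodes)
            lab[i] = H4(i, *[lab[j] for j in parents[i]])
    return lab

# tiny DAG: inputs 0, 1; internal node 2; single output 3
parents = {0: [], 1: [], 2: [0, 1], 3: [2]}
lab = label(parents, [0, 1, 2, 3])
T = [lab[3]]  # entries of T = labels of the output nodes
assert len(T[0]) == 32
```

Computing an output label honestly forces the evaluation chain the pebbling rules model: a node's label is available only once its parents' labels are.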
27
Converting a spammer strategy to a pebbling
  • EX POST FACTO PEBBLING: computed by offline inspection of the spammer's strategy
  • PLACING A PEBBLE: place a pebble on node i if
  • H4 is used to compute Li = H4(i, Lj, Lk), and
  • Lj, Lk are the correct labels
  • INITIAL PEBBLES: place an initial pebble on node j if
  • H4 is applied with Lj as an argument, and
  • Lj was not computed via H4
  • REMOVING A PEBBLE: remove a pebble as soon as it is not needed anymore
  • computing a label uses a hash function call:
  • a lower bound on moves ⇒ a lower bound on hash function calls
  • using the cache saves memory fetches:
  • a lower bound on pebbles ⇒ a lower bound on memory accesses

IDEA: limit the number of pebbles used by the spammer as a function of its cache size and of the bits it brings from memory
28
Constructing the dag
  • CONSTRUCTION: dag D composed of D1 followed by D2
  • D1 has the property that pebbling many outputs requires many pebbles
  • more than the cache and the pages brought from memory can supply
  • a stack of superconcentrators [Lengauer Tarjan 82]
  • D2 is a fault-tolerant layered graph
  • even if a constant fraction of each layer is deleted, it can still embed a superconcentrator
  • a stack of expanders [Alon Chung 88, Upfal 92]
  • SUPERCONCENTRATOR: a dag with N inputs and N outputs in which any k inputs and k outputs are connected by vertex-disjoint paths

[Figure: D1 feeding into D2, between the inputs and the outputs of D]
29
Using the dag
  • CONSTRUCTION: dag D composed of D1 followed by D2
  • D1 has the property that pebbling many outputs requires many pebbles
  • more than the cache and the pages brought from memory can supply
  • a stack of superconcentrators [Lengauer Tarjan 82]
  • D2 is a fault-tolerant layered graph
  • even if a constant fraction of each layer is deleted, it can still embed a superconcentrator
  • a stack of expanders [Alon Chung 88, Upfal 92]
  • IDEA: fix any execution
  • let S = the set of mid-level nodes pebbled
  • if S is large, use the time-space trade-offs for D1
  • if S is small, use the fault-tolerant property of D2
  • delete nodes whose labels are largely determined by S

30
The lower bound result
  • Theorem: for the dag D, fix any spammer
  • whose cache size is smaller than |T|/2
  • assuming H0,…,H4 are idealized hash functions
  • who makes polynomially many hash function calls
  • then the amortized number of memory accesses per successful message is Ω(2^e·L)
  • Remarks:
  • the lower bound holds for a spammer maximizing throughput across any collection of messages and recipients
  • we model idealized hash functions using random oracles

31
What can we conclude from the lower bound?
  • Shows that the design principles are sound
  • Gives us a plausibility argument
  • Tells us that if something goes wrong, we will know where to look
  • But:
  • it is based on idealized random functions
  • how do we implement them?
  • they might be computationally expensive
  • they are applied to all of A
  • it might be computationally expensive simply to touch all of A

32
Talk Plan
  • The proofs-of-work approach
  • DGN's memory-bound functions
  • Generating a large random-looking table [DNW]
  • Open problems: moderately hard functions

33
Alternative construction based on sorting
  • motivated by time-space trade-offs for sorting [Borodin Cook 82]
  • easier to implement
  1. input node i labeled: Ti ← H4(i, 1)
  2. at each round, sort the array
  3. then apply H4 to the current values of the array: Ti ← H4(i, Ti, round number)

OPEN PROBLEM: prove a lower bound
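The three steps above can be sketched as follows; SHA-256 stands in for H4, and the array size and round count are illustrative values of our choosing:

```python
import hashlib

N, ROUNDS = 1 << 8, 3  # toy array size and number of rounds

def H4(*parts):
    # SHA-256 over the stringified arguments stands in for H4
    h = hashlib.sha256()
    for p in parts:
        h.update(repr(p).encode())
    return h.digest()

# step 1: input labels T_i = H4(i, 1)
T = [H4(i, 1) for i in range(N)]

for r in range(2, ROUNDS + 1):
    T.sort()                                 # step 2: sort the array
    T = [H4(i, T[i], r) for i in range(N)]   # step 3: rehash, T_i = H4(i, T_i, r)

assert len(T) == N
```

The sorting pass shuffles values across the whole array between rounds, which is what makes a small-space shortcut (the open lower-bound question) look hard.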

34
More open problems
  • WEAKER ASSUMPTIONS? no recourse to random oracles
  • use lower bounds for the cell-probe model and branching programs?
  • Unlike most of cryptography, in this case there is a chance of coming up with an unconditional result
  • Use physical limitations of computation to form a reasonable lower bound on the spammer's effort

35
A theory of moderately hard functions?
  • Key idea in cryptography: use the computational infeasibility of problems to obtain security
  • For many applications, moderate hardness is needed
  • current applications: abuse prevention, fairness, few-round zero-knowledge
  • FURTHER WORK: develop a theory of moderately hard functions

36
Open problems: moderately hard functions
  • Unifying assumption
  • in the intractable world, one-way functions are necessary and sufficient for many tasks
  • is there a similar primitive when moderate hardness is needed?
  • Precise model
  • details of the computational model may matter; can we unify them?
  • Hardness amplification
  • start with a somewhat hard problem and turn it into one that is harder
  • Hardness vs. randomness
  • can we turn moderate hardness into moderate pseudorandomness?
  • the standard transformation is not necessarily applicable here
  • Evidence for non-amortization
  • is it possible to demonstrate that if a certain problem is not resilient to amortization, then a single instance can be solved much more quickly?

37
Open problems: moderately hard functions
  • Immunity to parallel attacks
  • important for timed commitments
  • the power function has been used; is there a good argument showing its immunity against parallel attacks?
  • Is it possible to reduce worst case to average case,
  • i.e., to find a random self-reduction?
  • in the intractable world it is known that there are limitations on random self-reductions from NP-complete problems
  • is it possible to randomly reduce a P-complete problem to itself?
  • is it possible to use linear programming or lattice basis reduction for such purposes?
  • New candidates for moderately hard functions

38
  • Thank you
  • Merci beaucoup
  • ???? ???

39
path-following approach [Dwork-Goldberg-Naor, Crypto 03]
  • PUBLIC: large random table T (2 × spammer's cache size)
  • PARAMETERS: integer L, effort parameter e
  • IDEA: a path is a sequence of L sequential accesses to T
  • sender searches a collection of paths to find a good path
  • collection depends on (m, S, R, d)
  • locations in T depend on hash functions H0,…,H3
  • density of good paths: 1/2^e
  • OUTPUT: (m, S, R, d) plus a description of a good path
  • COMPLEXITY: sending O(2^e·L) memory accesses; verifying O(L) accesses

40
path-following approach [Dwork-Goldberg-Naor, Crypto 03]
  • PUBLIC: large random table T (2 × spammer's cache size)
  • INPUT: message m, sender S, receiver R, date/time d
  • PARAMETERS: integer L, effort parameter e
  • IDEA: sender searches paths of length L for a good path
  • a path is determined by the table T and hash functions H0,…,H3
  • any path is good with probability 1/2^e
  • OUTPUT: (m, S, R, d) plus a description of a good path
  • COMPLEXITY: sender O(2^e·L) memory fetches; verification O(L) fetches

MAIN RESULT: Ω(2^e·L) memory fetches are necessary
41
memory-bound model
USER:
  • MAIN MEMORY: large but slow; locality
  • CACHE: small but fast; hits/misses
SPAMMER:
  • MAIN MEMORY: may be very, very large
  • CACHE: size at most ½ the user's main memory
42
path-following approach [Dwork-Goldberg-Naor, Crypto 03]
  • PUBLIC: large random table T (2 × spammer's cache size)
  • INPUT: message m, sender S, receiver R, date/time d
  • sender makes a sequence of random memory accesses into T
  • inherently sequential (hence "path-following")
  • sends a proof of having done so to the receiver
  • verification requires only a small number of accesses
  • the memory access pattern leads to many cache misses

43
path-following approach [Dwork-Goldberg-Naor, Crypto 03]
  • PUBLIC: large random table T (2 × spammer's cache size)
  • INPUT: message m, sender S, receiver R, date/time d
  • OUTPUT: attach to (m, S, R, d) the successful trial number k and H3(A)
  • COMPLEXITY: sender Ω(2^e·L) memory fetches; verification O(L) fetches
  • Repeat for k = 1, 2, …
  • Initialize: A ← H0(m,S,R,d,k)
  • Main Loop: walk for L steps
  • c ← H1(A)
  • A ← H2(A, T[c])
  • Success: the last e bits of H3(A) are 0s
  • SPAMMER:
  • needs 2^e walks
  • each walk requires L/2 fetches

44
using the dag
  • CONSTRUCTION: dag D composed of D1 followed by D2
  • D1 has the property that pebbling many outputs requires many pebbles
  • more than the cache and the pages brought from memory can supply
  • a stack of superconcentrators [Lengauer Tarjan 82]
  • D2 is a fault-tolerant layered graph
  • even if a constant fraction of each layer is deleted, it can still embed a superconcentrator
  • a stack of expanders [Alon Chung 88, Upfal 92]
  • idea: fix any execution
  • if many mid-level nodes are pebbled, use D1
  • otherwise, use the fault-tolerant property of D2