Title: Merkle Tree Traversal in Log Space
1 Merkle Tree Traversal in Log Space and Time
Michael Szydlo, RSA. Eurocrypt 2004, May 6, 2004
2 Presentation overview
- Review of Merkle Authentication Trees
- Define the Traversal Problem
- Describe classic traversal technique
- Present new, space-efficient algorithm
- Concluding comments
3 Merkle trees
- Introduced by Ralph Merkle, 1979
- Classic cryptographic construction
- Involves combining hash functions on binary tree structure
- A public-key authentication scheme
- Using only one-way hash functions as building blocks
- No number theory or trapdoor permutations
- Also public-key signatures (Lamport's one-time signatures)
- Theoretical and practical contexts
- Receive less practical attention today due to number-theoretic schemes (e.g., RSA, DSA)
- Not terribly inefficient. No number theory: an advantage?
- Our contribution
- Re-examine efficiency aspects of construction
- New algorithm - answer an old question about Merkle trees
4 Merkle tree data structure
- Binary tree, nodes are assigned (e.g. 160 bit) values
- Extra, secret values associated to each leaf.
[Figure: binary tree. Interior nodes: v = Hash(v_left || v_right). Leaves: v_i = Hash(s_i), where s_i is the leaf secret.]
5 A Public / Private key pair
- How to generate a public / private key pair
- Select a random (e.g. 160 bit) secret S
- Derive leaf secrets s_i = PRF(S, i)
- Use hash function to get leaf / interior node values
- Publish root value as P
- Key generation has a cost
- Tree of height H has N = 2^H leaves
- Nodes at height h will depend on 2^h leaf values
- Obtaining P requires calculating all N leaf values plus 2^H - 1 more hash function evaluations
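The key generation steps above can be sketched as follows. This is a minimal illustration, not the presenter's implementation: HMAC-SHA256 stands in for the PRF and SHA-256 for the hash function (both are assumptions; the slides mention 160-bit values), and the names leaf_secret and public_key are hypothetical.

```python
import hashlib
import hmac

def H(data):
    # Assumed hash function (SHA-256); the slides only say "Hash".
    return hashlib.sha256(data).digest()

def leaf_secret(seed, i):
    # s_i = PRF(S, i); HMAC-SHA256 is an assumed choice of PRF.
    return hmac.new(seed, i.to_bytes(8, "big"), hashlib.sha256).digest()

def public_key(seed, height):
    """Return (root value P, number of interior-node hash evaluations)."""
    # Leaf values v_i = Hash(s_i) for all N = 2^height leaves.
    level = [H(leaf_secret(seed, i)) for i in range(2 ** height)]
    interior_hashes = 0
    while len(level) > 1:
        # Each interior node is Hash(v_left || v_right).
        level = [H(level[j] + level[j + 1]) for j in range(0, len(level), 2)]
        interior_hashes += len(level)
    return level[0], interior_hashes
```

For a tree of height 3 this computes 8 leaf values and 7 interior nodes, which is why key generation becomes expensive for large trees.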
6 Authenticating a secret
- Prover wishes to reveal s_i to identify herself
- Prover sends i, s_i (each secret used just once)
- Additional data required: sibling node values
- Verifier checks s_i against the public key P
- Hash first s_i
- Hash result together with its sibling in tree
- Repeat, moving up tree
- Check result with root
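The verifier's procedure can be sketched as below. The tree is stored in full here purely to produce test data; SHA-256 and the function names build_levels, auth_path and verify are assumptions made for illustration.

```python
import hashlib

def H(data):
    # Assumed hash function (SHA-256).
    return hashlib.sha256(data).digest()

def build_levels(secrets):
    """All node values, level by level, from leaves (Hash(s_i)) up to the root."""
    levels = [[H(s) for s in secrets]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([H(prev[j] + prev[j + 1]) for j in range(0, len(prev), 2)])
    return levels

def auth_path(levels, i):
    """Sibling node values along the path from leaf i to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[i ^ 1])  # sibling index differs in the lowest bit
        i >>= 1
    return path

def verify(root, i, secret, path):
    """Hash the secret, then hash together with each sibling, moving up."""
    v = H(secret)
    for sibling in path:
        # Even index: current node is a left child; odd: a right child.
        v = H(v + sibling) if i % 2 == 0 else H(sibling + v)
        i >>= 1
    return v == root
```

The index i tells the verifier on which side each sibling belongs, so the prover only needs to send i, s_i and one sibling value per tree level.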
7 Sibling node values required
[Figure: tree of hash nodes (H). The root value is public; the sibling nodes along the path from the leaf secret s_0 to the root are required to authenticate the secret.]
- Verify secret value by hashing, then hashing together with sibling, etc.
- Accept if you match with the root value
8 Digital signatures, too
- Use up 1 leaf per authentication
- Digital signatures use multiple leaves
- Extends Lamport's one-time signature scheme
- Want to sign m = (m_0, m_1, ..., m_159)
- Requires 160 pairs of secrets (s_i, t_i)
- s_i included in signature if m_i = 0. Otherwise t_i is.
- Verification requires sibling nodes, as above
- Merkle construction provides signatures
- Security intuitive, how about efficiency?
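The selection rule for s_i and t_i can be illustrated with a bare Lamport-style sketch. It is a simplification under stated assumptions: the hashed secrets are published directly as the public key instead of being placed at Merkle leaves (so no sibling nodes appear), SHA-256 truncated to 160 bits supplies the message bits, and all function names are hypothetical.

```python
import hashlib
import os

BITS = 160  # number of message bits, as on the slide

def H(data):
    # Assumed hash function (SHA-256).
    return hashlib.sha256(data).digest()

def message_bits(message):
    """m = (m_0, ..., m_159): first 160 bits of the message digest."""
    digest = H(message)[:BITS // 8]
    return [(digest[k // 8] >> (7 - k % 8)) & 1 for k in range(BITS)]

def keygen():
    """160 pairs of secrets (s_i, t_i) and their hashes."""
    secret = [(os.urandom(20), os.urandom(20)) for _ in range(BITS)]
    public = [(H(s), H(t)) for s, t in secret]
    return secret, public

def sign(secret, message):
    # Reveal s_i if m_i = 0, otherwise t_i.
    return [pair[bit] for pair, bit in zip(secret, message_bits(message))]

def verify(public, message, signature):
    # Each revealed secret must hash to the matching public value.
    bits = message_bits(message)
    return all(H(sig) == pair[bit]
               for sig, pair, bit in zip(signature, public, bits))
```

In the Merkle construction the values H(s_i) and H(t_i) would be leaf values, and the verifier would additionally check each of them against the root using sibling nodes.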
9 Efficiency questions
- Tacit assumption - all node values saved.
- A useful Merkle tree has many leaves!
- E.g., N = 2^30 allows many authentications / signatures.
- Not practical for a weak prover!
- Store all node values? - too much space!
- N = 2^H leaves, N - 1 interior nodes
- Recalculate from scratch? - too much time!
- Interior node near the top requires 2^H - 1 Hash operations
10 The traversal problem
- Formulate efficient Prover algorithm.
- Must output authentication data for each leaf, in sequence (on round i, s_i with associated sibling nodes)
- Prover has limited memory
- Prover should compute few Hash values per round
- Metrics
- Space: 1 unit = 1 stored node value
- Time: 1 unit = 1 leaf calc. or 1 interior node calc.
- Note - this analysis fixes the security parameter.
11 Traversal challenge
Higher node used for 2^20 rounds, costs 2^21
Lower node used for 2^5 rounds, costs 2^6
(Note: per-round cost is < 2)
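The per-round figure can be checked with a short calculation, using the metric of 1 unit per leaf calculation or interior node calculation. A node at height h sits above 2^h leaves and 2^h - 1 interior nodes (itself included), and serves as a sibling for 2^h consecutive rounds; the slide rounds the cost up to 2^(h+1).

```latex
\[
\mathrm{cost}(h) \;=\; \underbrace{2^{h}}_{\text{leaf calcs}} \;+\; \underbrace{2^{h}-1}_{\text{interior node calcs}} \;=\; 2^{h+1}-1,
\qquad
\frac{\mathrm{cost}(h)}{2^{h}} \;=\; 2 - 2^{-h} \;<\; 2 .
\]
```

For h = 20 this gives a cost just under 2^21 spread over 2^20 rounds, and for h = 5 a cost just under 2^6 over 2^5 rounds.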
12 Merkle's amortization technique
- Used space-efficient node computation
- Costly nodes computed over many rounds
- Form of the algorithm on each round
- Output s_i with sibling values
- Discard expired sibling values
- For each height, work on preparing upcoming sibling
- Upcoming values should be ready on time
- Merkle's result for tree with N = 2^H leaves
- O(log(N)) = O(H) time per round.
- Space bounded: O(log(N)^2) = O(H^2)
13 TREEHASH
- Calculate a height h node using space h+1
- Simply erase values no longer required
- Adding leaf or internal node is 1 unit of work
- Evolving set of stored nodes: call these tail nodes
- Example with h = 3
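A stack-based sketch of TREEHASH for the h = 3 example follows. It assumes SHA-256 as the hash and a hypothetical leaf_value helper in place of Hash(s_i); the function also reports the peak number of stored values and the units of work so the h+1 space claim can be observed.

```python
import hashlib

def H(data):
    # Assumed hash function (SHA-256).
    return hashlib.sha256(data).digest()

def leaf_value(i):
    # Hypothetical leaf calculation; a real prover would compute Hash(s_i).
    return H(b"leaf" + i.to_bytes(8, "big"))

def treehash(start, h):
    """Compute the height-h node whose leftmost leaf is `start`.

    Returns (node value, peak number of stored values, units of work).
    """
    stack = []  # tail nodes as (height, value); the lowest node is on top
    next_leaf, peak, work = start, 0, 0
    while not (len(stack) == 1 and stack[0][0] == h):
        if len(stack) >= 2 and stack[-1][0] == stack[-2][0]:
            # Two tail nodes of equal height: replace them by their parent.
            (ht, right), (_, left) = stack.pop(), stack.pop()
            stack.append((ht + 1, H(left + right)))
        else:
            # Otherwise add the next leaf.
            stack.append((0, leaf_value(next_leaf)))
            next_leaf += 1
        work += 1
        peak = max(peak, len(stack))
    return stack[0][1], peak, work
```

Calling treehash(0, 3) takes 15 units of work (8 leaves plus 7 interior nodes) and never stores more than 4 values at once.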
14 Merkle's amortization (2)
- Prover's initial internal state
- Contains Current and Next sibling value for each height h < H
- Prover's internal state (later points)
- Contains Current sibling value for each height h < H
- For each height, contains Next sibling, OR a partial TREEHASH computation for Next.
- Per-round update procedure
- Output leaf secret and Current sibling nodes
- Discard expired sibling nodes, promote Next to Current
- Spend maximum 2 units of work towards the TREEHASH procedure for each height
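The per-round procedure can be sketched as a generator. This is an illustrative reading of the slides, not the presenter's code: SHA-256, the leaf_value helper, the TreeHash class and the index formula used to locate the next sibling's first leaf are all assumptions, and leaf values are output in place of leaf secrets.

```python
import hashlib

def H(data):
    return hashlib.sha256(data).digest()  # assumed hash function

def leaf_value(i):
    return H(b"leaf" + i.to_bytes(8, "big"))  # hypothetical stand-in for Hash(s_i)

class TreeHash:
    """Incremental TREEHASH for one height-h node starting at leaf `start`."""
    def __init__(self, h, start):
        self.h, self.next_leaf, self.stack = h, start, []

    def done(self):
        return len(self.stack) == 1 and self.stack[0][0] == self.h

    def update(self):
        """One unit of work: merge two equal-height tail nodes or add a leaf."""
        s = self.stack
        if len(s) >= 2 and s[-1][0] == s[-2][0]:
            (ht, right), (_, left) = s.pop(), s.pop()
            s.append((ht + 1, H(left + right)))
        else:
            s.append((0, leaf_value(self.next_leaf)))
            self.next_leaf += 1

def finished(h, start):
    t = TreeHash(h, start)
    while not t.done():
        t.update()
    return t

def root_from(leaf, value, path):
    """Recompute the root from a leaf value and its sibling nodes."""
    for sibling in path:
        value = H(value + sibling) if leaf % 2 == 0 else H(sibling + value)
        leaf //= 2
    return value

def classic_traversal(height):
    """Yield (leaf index, leaf value, Current sibling nodes) for every round."""
    n = 2 ** height
    # Initial state: Current = node 1 and Next = node 0 at each height h < H.
    auth = [finished(h, 2 ** h).stack[0][1] for h in range(height)]
    nxt = [finished(h, 0) for h in range(height)]
    for leaf in range(n):
        yield leaf, leaf_value(leaf), list(auth)
        if leaf == n - 1:
            break
        for h in range(height):
            if (leaf + 1) % 2 ** h == 0:
                auth[h] = nxt[h].stack[0][1]  # promote Next to Current
                start = (leaf + 1 + 2 ** h) ^ 2 ** h  # first leaf of new Next
                nxt[h] = TreeHash(h, start) if start < n else None
        for t in nxt:  # at most 2 units of work per height
            for _ in range(2):
                if t is not None and not t.done():
                    t.update()
```

Each height works on its own TREEHASH independently, which is what makes the classic scheme easy to analyse but lets many partial computations exist at the same time.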
15 Merkle's amortization (3)
- Nodes are ready on time
- 2 units per round is enough
- The cost of 2^(h+1) spread over 2^h rounds
- Time per round linear in tree height
- O(log(N)) = O(H) time per round.
- Total space quadratic in tree height
- At each height, a TREEHASH may be in progress.
- Space for TREEHASH < 1 + 2 + 3 + ... + H
- Space bound: O(log(N)^2) = O(H^2)
16 Recap of classic traversal
- Merkle's solution indeed satisfactory
- Medium / large Merkle trees practical
- Less efficient than number theory approaches
- Security properties transparent
- No random oracles, etc.
- Conjecture: classic traversal is optimal?
17 Related work
- Time-space trade-off. RSA03
- Jakobsson, Micali, Leighton, Szydlo
- Idea: use subtrees of height T
- Speed up Prover by a factor of T!
- Increases space by a factor of 2T
18 This work
- New traversal algorithm
- Still O(log(N)) time
- Space required reduced to O(log(N))
- This is optimal in the following sense
- Space at least O(log(N)) - easy to see
- No traversal algorithm has both
- time < O(log(N))
- space O(log(N))
- Proof in paper
19 Motivation for improvement
- Tails of concurrent TREEHASH computations
- Graphic reminder of why space is O(log(N)^2)
[Figure: tails of concurrent TREEHASH computations. The tail at height h holds up to h+1 values; the tails below it hold up to h tail pebbles, up to h-1 tail pebbles, and so on.]
Many tails contain pebbles at the same height. Can this be avoided?
20 Wasteful concurrent computation
- Example - two TREEHASH instances.
- Each must compute a node value at height 3 as a sub-goal
- Assume start at same time
- Classic traversal: 2 units of work to each
- Maximum space 4 + 4 = 8
- Re-allocate: 4 units per round
- Complete first, then do second
- Maximum space 1 + 4 = 5
- Rescheduling saves space, completes nodes on time.
- Look for scheduling algorithm to avoid such concurrent node computations.
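The two space figures in this example can be reproduced with a small simulation that tracks only node heights, not hash values. The function names and the way a schedule is encoded (a list saying which of the two instances receives each unit of work) are assumptions made for illustration.

```python
def treehash_sizes(h):
    """Number of stored values after each unit of work of one height-h TREEHASH."""
    stack, sizes = [], []  # stack holds the heights of the tail nodes
    while stack != [h]:
        if len(stack) >= 2 and stack[-1] == stack[-2]:
            stack.pop()        # merge two equal-height nodes ...
            stack[-1] += 1     # ... into their parent
        else:
            stack.append(0)    # add a leaf
        sizes.append(len(stack))
    return sizes

def peak_space(h, schedule):
    """Peak total storage of two height-h TREEHASH instances.

    `schedule` lists, unit by unit, which instance (0 or 1) gets the work.
    """
    sizes = treehash_sizes(h)
    done = [0, 0]  # units of work completed by each instance
    peak = 0
    for who in schedule:
        done[who] = min(done[who] + 1, len(sizes))
        stored = sum(0 if d == 0 else sizes[d - 1] for d in done)
        peak = max(peak, stored)
    return peak

# Classic traversal: 2 units to each instance per round.
interleaved = [0, 0, 1, 1] * 8
# Rescheduled: complete the first node (15 units), then do the second.
one_by_one = [0] * 15 + [1] * 15

print(peak_space(3, interleaved))  # 8
print(peak_space(3, one_by_one))   # 5
```

Both schedules spend 4 units per round in total; only the order in which the work is handed out changes.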
21 New algorithm: Zipping up the tails
- Apply budget to meet two kinds of requirements
- Avoid working on height h nodes from different tails
- Ensure completion of nodes with short deadline.
- Solution: this compromise algorithm satisfies both
- Focus computational attention on nodes with shortest deadline
- Delay beginning new height h node until other TREEHASH are partially completed, with no tail nodes below height h
- So we zip up the tails before diverting attention
- Essentially rigging it to have fewer tail nodes
- What is the effect of this rescheduling?
- Question 1: Are the nodes completed on time?
- Question 2: How much space do you need now?
22 Nodes completed on time
- Informal justification
- For a node at height h, the delay < 2^(h+1)
- This is only 2 per round over period of 2^h rounds
- Long time to recover from delay
- Formal proof involves computation
- Fix any period of 2h rounds
- Identify all deadlines, maximum delay
- Tabulate total required computation units
- This is less than total budget over period
- Experimental verification (via implementation)
- Algorithm works: time 2 log(N) per round
23 Less space is used
- Easy to see why space is O(log(N))
- At each height at most 4 values are stored.
- Exactly one current sibling value
- At most 1 completed next sibling value
- At most 2 tail values
- Total space required 3 log(N)
- Tail pebbles occur only while a sibling is incomplete
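The rescheduling idea can be exercised with the sketch below. It is an interpretation of the slides rather than the presenter's code: one shared budget per round goes, unit by unit, to the unfinished TREEHASH whose lowest tail node is lowest (ties to the smaller height). The budget of 2H - 1 units plus the leaf output is an assumption chosen to match the stated 2 log(N) time; SHA-256, leaf_value, the low() rule and the optional stats dictionary are likewise assumptions.

```python
import hashlib

def H(data):
    return hashlib.sha256(data).digest()  # assumed hash function

def leaf_value(i):
    return H(b"leaf" + i.to_bytes(8, "big"))  # hypothetical stand-in for Hash(s_i)

class TreeHash:
    """Incremental TREEHASH for one height-h node starting at leaf `start`."""
    def __init__(self, h, start):
        self.h, self.next_leaf, self.stack = h, start, []

    def done(self):
        return len(self.stack) == 1 and self.stack[0][0] == self.h

    def low(self):
        """Height of the lowest tail node; h if not started, infinity if done."""
        if self.done():
            return float("inf")
        return self.stack[-1][0] if self.stack else self.h

    def update(self):
        s = self.stack
        if len(s) >= 2 and s[-1][0] == s[-2][0]:
            (ht, right), (_, left) = s.pop(), s.pop()
            s.append((ht + 1, H(left + right)))
        else:
            s.append((0, leaf_value(self.next_leaf)))
            self.next_leaf += 1

def finished(h, start):
    t = TreeHash(h, start)
    while not t.done():
        t.update()
    return t

def root_from(leaf, value, path):
    for sibling in path:
        value = H(value + sibling) if leaf % 2 == 0 else H(sibling + value)
        leaf //= 2
    return value

def log_traversal(height, stats=None):
    """Yield (leaf index, leaf value, sibling nodes); record peak storage in stats."""
    n = 2 ** height
    auth = [finished(h, 2 ** h).stack[0][1] for h in range(height)]
    nxt = [finished(h, 0) for h in range(height)]
    for leaf in range(n):
        yield leaf, leaf_value(leaf), list(auth)
        if leaf == n - 1:
            break
        for h in range(height):
            if (leaf + 1) % 2 ** h == 0:
                assert nxt[h].done(), "next sibling was not ready on time"
                auth[h] = nxt[h].stack[0][1]
                start = (leaf + 1 + 2 ** h) ^ 2 ** h
                nxt[h] = TreeHash(h, start) if start < n else None
        for _ in range(2 * height - 1):  # shared budget for this round
            pending = [t for t in nxt if t is not None and not t.done()]
            if not pending:
                break
            # Zip up the lowest tail first; ties go to the smaller height.
            min(pending, key=lambda t: (t.low(), t.h)).update()
            if stats is not None:
                stored = len(auth) + sum(len(t.stack) for t in nxt if t is not None)
                stats["peak"] = max(stats.get("peak", 0), stored)
```

Passing a dictionary as stats records the largest number of node values held at once, which can be compared against the 3 log(N) figure for a chosen tree height.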
24 Result of new algorithm
- Traversal of a Merkle tree with N leaves
- Space bounded by 3 log(N)
- node storage units
- Time is 2 log(N)
- leaf calc units, hash evaluation units
- Answers classic Merkle traversal problem.
- Asymptotically optimal
25 Improved constants?
- The constants are not optimal
- Example - retain left nodes to half time
- Manuscript on webpage rsasecurity.com / szydlo.com
- Can technique be combined with JMLS03?
- The main focus was to increase speed, at space cost
- Zipping technique still always saves some space
26 Practical ramifications
- Merkle authentication signatures more feasible on space constrained devices
- Easy relationship between tree size and speed
- Speed up if smaller tree size acceptable
- Possible bonus for longer term assurance
- hedge against number theory breakthrough
27 Conclusions
- Merkle Trees - interesting after 25 years.
- Viable for practical applications?
- Need not be only a theoretical construction
- More efficient than widely believed.
- Further directions
- Use as a tool in larger crypto protocols
- Improve constants
- Good implementations, compare speed to RSA
- What else can we do without number theory based cryptography?