Dynamo: Amazon's Highly Available Key-value Store (SOSP '07)

1
Dynamo: Amazon's Highly Available Key-value Store (SOSP '07)
  • Giuseppe DeCandia, Deniz Hastorun, Madan Jampani,
    Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin,
    Swaminathan Sivasubramanian, Peter Vosshall,
    and Werner Vogels

2
Amazon eCommerce platform - Scale & Requirements
  • Scale
    • > 10M customers at peak times
    • > 10K servers (distributed around the world)
    • 3M checkout operations per day (2006)
  • Key requirements
    • Data/service availability is the key issue
    • Always-writeable data store
    • Low latency delivered to (almost all) clients
    • Incremental scalability

3
Data Access Model
  • Data stored as (key, object) pairs
  • Interface: put(key, object), get(key)
  • Identifier generated as a hash of the key
  • Objects are opaque to the store
  • Application examples: shopping carts, customer
    preferences, session management, sales rank,
    product catalog
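
The put/get interface above can be illustrated with a toy in-memory store. This is a minimal sketch under stated assumptions, not Dynamo's implementation: the class name `ToyStore` is hypothetical, everything lives in one process, and MD5 is assumed as the hash that turns a key into a 128-bit identifier. Objects are stored as uninterpreted bytes to mirror the "opaque" property.

```python
import hashlib


class ToyStore:
    """Hypothetical single-process stand-in for a (key, object) store."""

    def __init__(self):
        self._data = {}  # identifier (int) -> opaque bytes

    @staticmethod
    def identifier(key: bytes) -> int:
        # Assumption: a 128-bit identifier derived from the key with MD5.
        return int.from_bytes(hashlib.md5(key).digest(), "big")

    def put(self, key: bytes, obj: bytes) -> None:
        # The store never inspects obj; it is just a byte string.
        self._data[self.identifier(key)] = obj

    def get(self, key: bytes):
        # Returns None when the key has never been written.
        return self._data.get(self.identifier(key))


store = ToyStore()
store.put(b"cart:alice", b'{"items": ["book"]}')
print(store.get(b"cart:alice"))
```

Because the store only sees bytes, any interpretation of the object (for example, merging two shopping carts) is left to the application.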

4
Design assumptions and requirements
  • Requirements
    • High data availability: an always-writeable data
      store
  • Solution
    • Avoid synchronous replica coordination
      (used by solutions that provide strong
      consistency)
    • Tradeoff: consistency vs. availability
    • ... and use weak consistency models to improve
      availability
  • The problems this introduces: when to resolve
    possible conflicts, and who should resolve them
    • When: at read time (this allows an always
      writeable data store)
    • Who: the application or the data store
  • Incremental scalability
  • Symmetry
  • Heterogeneity

5
Key technical problems
  • Partitioning
  • High availability for writes
  • Handling temporary failures
  • Recovering from permanent failures
  • Membership and failure detection

6
(No Transcript)
7
Partition Algorithm: Consistent hashing
  • Each data item is replicated at N hosts.
  • Preference list: the list of nodes
    responsible for storing a particular key.
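
The idea of a preference list on a consistent-hashing ring can be sketched as follows. This is an illustrative sketch only: the `Ring` class and `ring_position` helper are hypothetical names, MD5 is assumed as the hash, each node gets a single position on the ring (no virtual nodes), and the preference list is taken to be the first N distinct nodes found walking clockwise from the key's position.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    # Assumption: MD5 maps both node names and keys onto the same ring.
    return int.from_bytes(hashlib.md5(name.encode()).digest(), "big")


class Ring:
    """Hypothetical consistent-hashing ring with one position per node."""

    def __init__(self, nodes, n_replicas=3):
        self.n_replicas = n_replicas
        self._points = sorted((ring_position(node), node) for node in nodes)
        self._positions = [pos for pos, _ in self._points]

    def preference_list(self, key: str):
        # First node with a position larger than the key's, wrapping around.
        start = bisect.bisect_right(self._positions, ring_position(key))
        count = min(self.n_replicas, len(self._points))
        return [
            self._points[(start + i) % len(self._points)][1]
            for i in range(count)
        ]


ring = Ring(["A", "B", "C", "D", "E"], n_replicas=3)
print(ring.preference_list("cart:alice"))
```

Adding or removing one node only changes the preference lists of keys whose clockwise walk passes through that node's position, which is what makes the scheme incrementally scalable.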

8
Data Versioning
  • A put() call may return to its caller before the
    update has been applied at all the replicas
  • A get() call may return different versions of the
    same object.
  • Challenge: different replicas of an object can
    have distinct version sub-histories, which the
    system will need to reconcile.
  • Solution: use vector clocks to capture
    causality between different versions of the same
    object.

9
Vector Clocks
  • Each version of each object has one associated
    vector clock:
    a list of (node, counter) pairs.
  • Reconciliation
    • If every counter in the first object's clock is
      less than or equal to the corresponding counter
      in the second clock, then the first is an
      ancestor of the second (and can be ignored).
    • Otherwise: application-level reconciliation
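
The ancestor test described above can be written in a few lines. This is a sketch, assuming a vector clock is represented as a dict from node name to counter and that a missing node counts as 0; the function names `descends` and `prune_ancestors` and the node names in the usage lines are made up for illustration.

```python
def descends(a: dict, b: dict) -> bool:
    """True if clock a is equal to, or a descendant of, clock b."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())


def prune_ancestors(versions):
    """Keep only versions whose clock is not a strict ancestor of another's.

    versions is a list of (value, clock) pairs. If more than one version
    survives, they are concurrent and need application-level reconciliation.
    """
    survivors = []
    for i, (value, clock) in enumerate(versions):
        superseded = any(
            j != i and other != clock and descends(other, clock)
            for j, (_, other) in enumerate(versions)
        )
        if not superseded:
            survivors.append((value, clock))
    return survivors


old = ("v1", {"n1": 1})
new = ("v2", {"n1": 2})
left = ("v3", {"n1": 2, "n2": 1})
right = ("v4", {"n1": 2, "n3": 1})

print(prune_ancestors([old, new]))          # only v2 survives
print(prune_ancestors([new, left, right]))  # v3 and v4 are concurrent
```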

10
  • Write handled by Sx: D1 (Sx, 1)
  • Write handled by Sx: D2 (Sx, 2)
  • Write handled by Sy: D3 (Sx, 2), (Sy, 1)
  • Write handled by Sz: D4 (Sx, 2), (Sz, 1)
  • Reconciled and written by Sx: D5 (Sx, 3), (Sy, 1), (Sz, 1)
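
The D1 to D5 sequence on this slide can be replayed in code. This is a sketch under assumptions: clocks are dicts from node name to counter, a write increments the handling node's counter, and reconciliation takes the element-wise maximum of the conflicting clocks before the reconciling node increments its own counter. The helper names `increment`, `merge`, and `descends` are hypothetical.

```python
def increment(clock: dict, node: str) -> dict:
    """Return a copy of clock with node's counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def merge(*clocks: dict) -> dict:
    """Element-wise maximum of several clocks."""
    merged = {}
    for clock in clocks:
        for node, counter in clock.items():
            merged[node] = max(merged.get(node, 0), counter)
    return merged


def descends(a: dict, b: dict) -> bool:
    """True if clock a is equal to, or a descendant of, clock b."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())


d1 = increment({}, "Sx")   # write handled by Sx
d2 = increment(d1, "Sx")   # write handled by Sx
d3 = increment(d2, "Sy")   # write handled by Sy, based on D2
d4 = increment(d2, "Sz")   # write handled by Sz, also based on D2

# Neither D3 nor D4 descends from the other, so they conflict.
concurrent = not descends(d3, d4) and not descends(d4, d3)

d5 = increment(merge(d3, d4), "Sx")  # reconciled and written by Sx
print(concurrent, d5)
```

D5 descends from both D3 and D4, so once it is written the two conflicting versions can be discarded.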