Efficient Replica Maintenance for Distributed Storage Systems

Transcript and Presenter's Notes

1
Efficient Replica Maintenance for Distributed
Storage Systems
  • B-G Chun, F. Dabek, A. Haeberlen, E. Sit, H.
    Weatherspoon, M. Kaashoek, J. Kubiatowicz, and R.
    Morris, In Proc. of NSDI, May 2006.
  • Presenter: Fabián E. Bustamante

2
Replication in Wide-Area Storage
  • Applications put/get objects into/from the
    wide-area storage system (a toy interface is
    sketched below)
  • Objects are replicated for
  • Availability
  • A get on an object returns promptly
  • Durability
  • Objects put by the app are not lost due to disk
    failures
  • An object may be durably stored but not
    immediately available
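
A minimal sketch of such a put/get interface (names and placement are hypothetical, not the system described in the paper):

import hashlib

class ReplicatedStore:
    """Toy wide-area store: each object is written to r_l replica nodes."""

    def __init__(self, nodes, r_l=3):
        self.nodes = nodes  # each node modeled as a dict of key -> value
        self.r_l = r_l      # target number of replicas per object

    def _replica_set(self, key):
        # Hypothetical placement: hash the key, take the next r_l nodes in order.
        start = int(hashlib.sha1(key.encode()).hexdigest(), 16) % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)] for i in range(self.r_l)]

    def put(self, key, value):
        # Durable as long as at least one replica's disk survives.
        for node in self._replica_set(key):
            node[key] = value

    def get(self, key):
        # Available as long as at least one replica answers promptly.
        for node in self._replica_set(key):
            if key in node:
                return node[key]
        raise KeyError(key)

store = ReplicatedStore([dict() for _ in range(10)])
store.put("photo-42", b"...bytes...")
print(store.get("photo-42"))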

3
Goal: durability at low bandwidth cost
  • Durability is a more practical and useful goal
    than availability
  • Threat to durability
  • Losing the last copy of an object
  • So, create copies faster than they are destroyed
  • Challenges
  • Replication can eat your bandwidth
  • Hard to distinguish between transient and permanent
    failures
  • After recovery, some replicas may be on nodes the
    lookup algorithm does not check
  • Paper presents Carbonite, an efficient wide-area
    replication technique for durability

4
System Environment
  • Use PlanetLab (PL) as a representative environment
  • >600 nodes distributed world-wide
  • Historical traces collected by the CoMon project
    (every 5 minutes)
  • Disk failures from event logs of PlanetLab
    Central
  • Synthetic traces
  • 632 nodes, as in PL
  • Failure inter-arrival times drawn from an exponential
    dist. (mean session time and downtime as in PL);
    see the sketch after the table below
  • Two years instead of one, and avg. node lifetime of
    1 year
  • Simulation
  • Trace-driven, event-based simulator
  • Assumptions
  • Network paths are independent
  • All nodes are reachable from all other nodes
  • Each node has the same link capacity

PlanetLab trace characteristics (3/1/05 - 2/28/06)

  Hosts                          632
  Transient failures             21355
  Disk failures                  219

                                 median    avg       90th pct.
  Transient host downtime (s)    1208      104647    14242
  Any failure interarrival (s)   305       1467      3306
  Disk failure interarrival (s)  544411    143476    490047
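
A minimal sketch of how a synthetic trace along these lines could be generated (the mean values are taken from the table above as rough stand-ins; the structure is an assumption, not the paper's generator):

import random

def synthetic_failure_trace(n_nodes=632, years=2,
                            mean_interarrival=1467.0,   # s, avg any-failure interarrival (table above)
                            mean_downtime=104647.0):    # s, avg transient downtime (table above)
    """Generate per-node (node, fail_time, recover_time) events with exponential interarrivals."""
    horizon = years * 365 * 24 * 3600
    events = []
    for node in range(n_nodes):
        t = 0.0
        while True:
            t += random.expovariate(1.0 / mean_interarrival)  # time until the next failure
            if t >= horizon:
                break
            down = random.expovariate(1.0 / mean_downtime)    # how long the node stays down
            events.append((node, t, min(t + down, horizon)))
            t += down
    return sorted(events, key=lambda e: e[1])

trace = synthetic_failure_trace()
print(len(trace), "failure events over two simulated years")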
5
Understanding durability
  • To handle some avg. rate of failure, create new
    replicas faster than they are destroyed
  • Creation rate is a function of the per-node access
    link, the number of nodes, and the amount of data
    stored per node (see the sketch below)
  • An infeasible system (one unable to keep pace w/ the
    avg. failure rate) will eventually adapt by
    discarding objects (which ones?)
  • If the creation rate is just above the failure rate,
    a failure burst may be a problem
  • rL: target number of replicas to maintain
  • Durability does not increase continuously with
    rL
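
The creation-vs.-failure-rate argument can be made concrete with a rough feasibility check (the variable names and the simple inequality below illustrate the reasoning; they are not a formula quoted from the paper):

def is_feasible(n_nodes, link_bytes_per_s, data_per_node_bytes, disk_failures_per_node_per_s):
    """Can the system create replica bytes at least as fast as disk failures destroy them?

    Loss rate: each disk failure destroys the replicas stored on that node.
    Creation rate: bounded by the aggregate per-node access-link bandwidth.
    """
    loss_rate = n_nodes * disk_failures_per_node_per_s * data_per_node_bytes
    creation_rate = n_nodes * link_bytes_per_s
    return creation_rate > loss_rate

# Illustrative (made-up) numbers: 632 nodes, 150 KB/s of repair bandwidth each,
# 100 GB stored per node, roughly one disk failure per node per year.
print(is_feasible(632, 150e3, 100e9, 1.0 / (365 * 24 * 3600)))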

6
Improving repair time
  • Scope: the set of other nodes that can hold copies
    of the objects a node is responsible for (see the
    sketch below)
  • Small scope
  • Easier to keep track of copies
  • Effort of creating copies falls on a small set of
    nodes
  • Addition of nodes may result in needless copying
    of objects (when combined w/ consistent hashing)
  • Large scope
  • Spreads work among more nodes
  • Network traffic sources/destinations are spread out
  • Temporary failures will be noticed by more nodes
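
A hypothetical sketch of how scope could be expressed in a consistent-hashing system: the nodes eligible to hold copies of an object are the `scope` successors of the object's position on the ring (identifiers and helper names are illustrative):

import hashlib

def ring_positions(node_ids):
    # Place nodes on a consistent-hashing ring by hashing their identifiers.
    return sorted((int(hashlib.sha1(n.encode()).hexdigest(), 16), n) for n in node_ids)

def replica_candidates(ring, obj_key, scope):
    """Return the `scope` nodes eligible to hold copies of obj_key.

    A small scope keeps bookkeeping cheap but concentrates repair work;
    a large scope spreads copies and repair traffic over more nodes.
    """
    h = int(hashlib.sha1(obj_key.encode()).hexdigest(), 16)
    idx = next((i for i, (pos, _) in enumerate(ring) if pos >= h), 0)
    return [ring[(idx + i) % len(ring)][1] for i in range(min(scope, len(ring)))]

ring = ring_positions(["node%d" % i for i in range(16)])
print(replica_candidates(ring, "some-object", scope=4))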

7
Reducing transient costs
  • Impossible to distinguish transient from permanent
    failures
  • To minimize network traffic due to transient
    failures, reintegrate replicas when they return
  • Carbonite
  • Select a suitable value for rL
  • Respond to each detected failure by creating a new
    replica
  • Reintegrate returning replicas (see the sketch below)

Figure: Bytes sent by different maintenance algorithms
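
A minimal sketch of the maintenance loop these bullets describe (function names and callbacks are illustrative; the paper gives the precise algorithm): keep creating copies while fewer than rL replicas are reachable, and count replicas that return from transient failures toward rL instead of discarding them.

def maintain(key, r_l, candidates, is_up, has_replica, create_replica):
    """One maintenance round for a single object.

    candidates        : nodes in this node's scope that may hold a copy of `key`
    is_up(n)          : True if node n currently responds
    has_replica(n)    : True if node n holds a copy of `key`
    create_replica(n) : copy `key` onto node n
    """
    # Replicas on nodes that come back from transient failures are simply
    # counted again (reintegrated) rather than thrown away.
    live = [n for n in candidates if is_up(n) and has_replica(n)]
    for n in candidates:
        if len(live) >= r_l:
            break  # enough reachable copies; no repair traffic needed
        if is_up(n) and not has_replica(n):
            create_replica(n)
            live.append(n)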
8
Reducing transient costs
Figure: Bytes sent w/ and w/o reintegration
Figure: Impact of timeouts on bandwidth and durability
9
Assumptions
  • The PlanetLab testbed can be seen as
    representative of something
  • Immutable data
  • Relatively stable system membership; data loss is
    driven by disk failures
  • Disk failures are uncorrelated
  • Simulation
  • Network paths are independent
  • All nodes reachable from all other nodes
  • Each node with same link capacity