Title: Cuckoo: Replication Policies


1
Cuckoo: Replication Policies and Load Shedding Algorithms
  • Raja Sambasivan
  • Andrew Klosterman
  • Greg Ganger

Parallel Data Laboratory, Carnegie Mellon University
2
Motivation: An Overloaded NFS Server
  • NFS Servers export filesystems
  • Clients mount exported filesystems

3
Motivation: An Overloaded NFS Server
  • Common solutions to relieve load
    • Buy a better server
    • Move responsible data to another server

4
Motivation: What's a Cuckoo to do?
  • It should dynamically identify heavily accessed data
    • Automatically copy it to other servers to allow
      load shedding
    • A replication policy is responsible for this
      operation
  • It should automatically identify the most
    work-intensive ops
    • Offload them to other servers with replicas when busy
    • A load-shedding algorithm is responsible for this
      operation

Objective: Identify simple replication policies
and load-shedding algorithms that will allow us
to offload large amounts of work from an NFS server
5
Replication in Cuckoo
  • Cuckoo must determine what to replicate from
    • The replication policy in use
    • Access traces received from NFS servers
  • Replication is performed periodically
    • Assumed replication period of one day for
      analysis
    • Short-lived objects: created between replications
    • Long-lived objects: exist at the beginning of
      replication
  • What objects should the planner replicate?
    • Replicate only the most utilized objects
    • Replicate objects whose replicas can be synced
      easily
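
The short-lived / long-lived distinction above can be sketched in a few lines. This is a minimal illustration, not the presenters' code: the `TracedObject` record, its `created_at` field, and the `split_by_lifetime` helper are hypothetical names, and it assumes each object's creation time can be recovered from the trace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TracedObject:
    """Hypothetical per-object record derived from an NFS access trace."""
    object_id: str
    created_at: float  # seconds since the epoch (assumed to be known)


def split_by_lifetime(objects, period_start):
    """Split objects into (long_lived, short_lived) lists of object ids.

    Long-lived: already existed when the replication period began.
    Short-lived: created after the period began, i.e. between replications.
    """
    long_lived, short_lived = [], []
    for obj in objects:
        if obj.created_at <= period_start:
            long_lived.append(obj.object_id)
        else:
            short_lived.append(obj.object_id)
    return long_lived, short_lived
```

Only the long-lived list is a candidate for replication at the start of a period, since short-lived objects do not yet exist when the planner runs.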

6
Replication Policies: Most Popular RO Policy
  • Simple and presents no consistency issues
  • Define popularity by number of ops seen
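
A minimal sketch of how such a policy might be computed from a trace, assuming the trace has been reduced to `(object_id, op)` pairs. The `MUTATING_OPS` set and the `most_popular_ro` function are illustrative assumptions, not taken from the presentation; which NFS operations count as mutating is a choice the real planner would have to make.

```python
from collections import Counter

# Assumption: operations treated as modifying an object. Illustrative only.
MUTATING_OPS = {"write", "setattr"}


def most_popular_ro(trace, top_n):
    """Return the ids of the top_n read-only objects, ranked by ops seen.

    `trace` is an iterable of (object_id, op) pairs. An object is treated
    as read-only if no mutating op was observed on it in the trace.
    """
    op_counts = Counter()
    modified = set()
    for object_id, op in trace:
        op_counts[object_id] += 1
        if op in MUTATING_OPS:
            modified.add(object_id)

    read_only = [(oid, n) for oid, n in op_counts.items() if oid not in modified]
    # Most ops first; ties broken by id so the result is deterministic.
    read_only.sort(key=lambda item: (-item[1], item[0]))
    return [oid for oid, _ in read_only[:top_n]]
```

Because every op counts equally, cheap metadata ops such as `getattr` weigh as much as large reads in this ranking.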

7
Replication Policies: Most Popular RO Policy
  • This is not a good replication policy
    • Can only offload up to 12% of all operations
    • Can only offload about 1% of all bytes
      transferred
  • Most operations accounted for are metadata ops
    • Offloading metadata ops is not important
    • Work-intensive operations are probably not
      metadata ops
  • Lessons learned
    • Tune replication policies for data objects
    • Operation-level granularity is not sufficient
    • Number of ops seen is not a good ranking scheme

8
Replication Policies: Data Objects
  • Many long-lived objects have file data that is
    mostly RO
  • Visible if viewing location of unique bytes
    read/written
  • Considerable data is read from these objects

[Figure: a 100% RO object (RO portion only) beside a mostly RO object (an RW portion and an RO portion). Green area sees no write ops; red area sees read/write ops.]
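
One way to derive the RO and RW portions of an object from the byte ranges touched in a trace is sketched below. It assumes reads and writes are available as half-open `(start, end)` byte ranges; the function names are hypothetical and this is not necessarily how the presenters computed them.

```python
def merge(ranges):
    """Merge half-open (start, end) byte ranges into sorted, disjoint ranges."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract(ranges, holes):
    """Return the parts of `ranges` not covered by `holes`.

    Both arguments must already be sorted and disjoint (see merge()).
    """
    result = []
    for start, end in ranges:
        cursor = start
        for hole_start, hole_end in holes:
            if hole_end <= cursor:
                continue  # hole lies entirely before the uncovered part
            if hole_start >= end:
                break     # remaining holes lie entirely after this range
            if hole_start > cursor:
                result.append((cursor, hole_start))
            cursor = max(cursor, hole_end)
            if cursor >= end:
                break
        if cursor < end:
            result.append((cursor, end))
    return result


def ro_rw_portions(reads, writes):
    """Return (ro_ranges, rw_ranges) for one object.

    RW portion: every byte that saw at least one write.
    RO portion: bytes that were read but never written.
    """
    rw_ranges = merge(writes)
    ro_ranges = subtract(merge(reads), rw_ranges)
    return ro_ranges, rw_ranges
```

For example, reads covering bytes 0 to 150 and a single write to bytes 120 to 130 give an RO portion of `[(0, 120), (130, 150)]` and an RW portion of `[(120, 130)]`, which is the "mostly RO" shape described above.
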
10
Replication Policies: ROF/RWP Policy
  • Mostly RO objects are the "sweet spot" for Cuckoo
  • They are used in the ROF/RWP replication policy
    by
    • Ranking data objects by bytes read
    • Replicating data objects that have
      • A large fraction of total bytes read from RO
        sections (ROF)
      • A high proportion of bytes read vs. bytes written
        in RW sections (RWP)
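
The selection step could look roughly like the sketch below. Several details are assumptions rather than anything stated on the slide: RWP is interpreted here as bytes read / (bytes read + bytes written) within RW sections, an object with no RW activity is treated as passing the RWP test, both thresholds must be met, and the threshold values and all names (`ObjectStats`, `select_replicas`, `min_rof`, `min_rwp`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectStats:
    """Hypothetical per-object byte counters gathered over one period."""
    object_id: str
    ro_bytes_read: int     # bytes read from sections that saw no writes
    rw_bytes_read: int     # bytes read from sections that saw writes
    rw_bytes_written: int  # bytes written (by definition, to RW sections)

    @property
    def bytes_read(self):
        return self.ro_bytes_read + self.rw_bytes_read


def rof(stats):
    """Fraction of all bytes read that came from RO sections."""
    return stats.ro_bytes_read / stats.bytes_read if stats.bytes_read else 0.0


def rwp(stats):
    """Read share of RW-section traffic (one interpretation of RWP)."""
    total = stats.rw_bytes_read + stats.rw_bytes_written
    # Assumption: an object with no RW traffic trivially passes.
    return stats.rw_bytes_read / total if total else 1.0


def select_replicas(all_stats, min_rof, min_rwp, max_objects):
    """Rank objects by bytes read, keep those meeting both thresholds."""
    ranked = sorted(all_stats, key=lambda s: (-s.bytes_read, s.object_id))
    chosen = [s.object_id for s in ranked
              if rof(s) >= min_rof and rwp(s) >= min_rwp]
    return chosen[:max_objects]
```

Ranking by bytes read rather than by op count is what distinguishes this from the Most Popular RO policy: objects that carry data traffic rise to the top instead of objects that attract many metadata ops.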

11
Replication Policies: ROF/RWP Policy

Bytes Offloadable, 1K Objects Replicated (DEAS, 02/02/03)
  • Bytes written: 37.2%
  • Offloadable bytes read: 62.2%
  • Unoffloadable bytes read: 0.6%
12
Summary
  • We have identified a replication policy that
    • Is capable of offloading a small fraction of all
      ops
    • Can offload a large amount of data read
  • We would like more traces
    • That are busier
    • That show accesses to more than one server
  • Come see our posters!

13
Related Work/Products
  • Rainfinity RainStorage. http://www.rainfinity.com
  • Scott Baker and John H. Hartman. The Mirage NFS
    Router. University of Arizona Computer Science
    Technical Report TR-02-04. November 2002.
  • Daniel Ellard, Jonathan Ledlie, and Margo Seltzer.
    Passive NFS Tracing of Email and Research
    Workloads. USENIX File and Storage Technologies
    Conference, pages 203-216, San Francisco,
    California. March 2003.
  • Daniel Ellard, Michael Mesnier, Eno Thereska,
    Gregory R. Ganger, and Margo Seltzer.
    Attribute-Based Prediction of File Properties.
    Harvard Computer Science Technical Report
    TR-16-03. December 2003.