1
Dynamic Replica Placement for Scalable Content Delivery
Yan Chen, Randy H. Katz, John D. Kubiatowicz
{yanchen, randy, kubitron}@CS.Berkeley.EDU
EECS Department, UC Berkeley
2
Motivation Scenario
[Figure: data plane (Web content server and CDN servers fed by the data source) overlaid on the network plane]
3
Goal and Challenges
Provide content distribution to clients with good Quality of Service (QoS) while retaining efficient and balanced resource consumption of the underlying infrastructure
  • Dynamically choose the number and placement of replicas while satisfying clients' QoS and servers' capacity constraints
  • Good performance for update dissemination
    • Delay and bandwidth consumption
  • Without global network topology knowledge
  • Scalability to millions of objects, clients and servers

4
Outline
  • Goal and Challenges
  • Previous Work
  • Our Solution: Dissemination Tree
    • Peer-to-peer Location Service
    • Replica Placement and Tree Construction
  • Evaluation
  • Conclusion and Future Work

5
Previous Work
  • Focused on static replica placement
    • Clients' distributions and access patterns known in advance
    • Assume global IP network topology
  • DNS-redirection based CDNs highly inefficient
    • Centralized CDN name server cannot record replica locations
  • No inter-domain IP multicast
  • Application-level multicast (ALM) unscalable
    • Root maintains state for all children or handles all join requests

6
Solution: Dissemination Tree
  • Dynamic replica placement: use a close-to-minimum number of replicas to satisfy QoS and capacity constraints with only local network topology
  • Adaptive cache coherence for efficient coherence notification
  • Use the Tapestry location service to improve scalability and locality

[Figure: data plane with root server and replica servers over the network plane, fed by the data source]
7
Peer-to-peer Routing and Location Services
  • Properties needed by d-tree (interface sketched below)
    • Distributed, scalable location with guaranteed success
    • Search with locality
  • P2P routing and location services
    • CAN, Chord, Pastry, Tapestry, etc.
    • http://www.cs.berkeley.edu/~ravenben/tapestry
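
The two properties above can be read as a small service contract. The following is an illustrative Python sketch of that contract only; the class and method names are assumptions, not Tapestry's actual API.

```python
# Illustrative interface only -- NOT Tapestry's real API. It captures the
# two properties d-tree needs: guaranteed success and search with locality.
from abc import ABC, abstractmethod

class LocationService(ABC):
    @abstractmethod
    def publish(self, object_id: str, server_id: str) -> None:
        """Advertise that server_id holds a replica of object_id."""

    @abstractmethod
    def locate(self, object_id: str, from_node: str) -> str:
        """Return a server holding object_id. Must succeed whenever a
        replica exists (guaranteed success) and should prefer replicas
        near from_node (search with locality)."""
```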

8
Integrated Replica Placement and d-tree Construction
  • Dynamic replica placement + application-level multicast
    • Naïve approach and smart approach
  • Static replica placement + IP multicast
    • Modeled as a capacitated facility location problem
    • Design a greedy algorithm with a log N approximation ratio
    • Optimal case for comparison
  • Soft-state tree maintenance

9
Dynamic Replica Placement: naïve
[Figure: client c and surrogate server s on the data plane, over the network plane]
10
Dynamic Replica Placement: naïve
[Figure: a parent candidate on the Tapestry overlay path from client c to surrogate s; the first placement choice is marked]
11
Dynamic Replica Placement: smart
[Figure: surrogate s with its parent, sibling, server child and client child around client c; data plane over network plane]
12
Dynamic Replica Placement: smart
  • Aggressive search
  • Lazy placement
  • Greedy load distribution

[Figure: parent candidates on the Tapestry overlay path from client c to surrogate s, shown with s's parent, sibling, server child and client child; the first placement choice is marked]
13
Evaluation Methodology
  • Network Topology
    • 5000-node network with GT-ITM transit-stub model
    • 500 d-tree server nodes; 4500 clients join in random order
  • Network Simulator
    • NS-like packet-level, priority-queue based event simulator
  • Dissemination Tree Server Deployment
    • Random d-tree
    • Backbone d-tree (choose backbone routers and subnet gateways first)
  • Constraints
    • 50 ms latency bound and 200 clients/server load bound

14
Performance of Dynamic Replica Placement
  • Compare Four Approaches
    • Overlay dynamic naïve placement (dynamic_naïve)
    • Overlay dynamic smart placement (dynamic_smart)
    • Static placement on overlay network (overlay_static)
    • Static placement on IP network (IP_static)
  • Metrics
    • Number of replicas deployed and load distribution
    • Dissemination (multicast) performance
    • Tree construction traffic

15
Number of Replicas Deployed and Load Distribution
  • dynamic_smart uses far fewer replicas than dynamic_naïve, and is very close to IP_static
  • dynamic_smart has better load distribution than dynamic_naïve and overlay_static, and is very close to IP_static

16
Multicast Performance
  • 85% of clients under dynamic_smart have a Relative Delay Penalty (RDP, defined below) of less than 4
  • Bandwidth consumed by dynamic_smart is very close to IP_static and much less than dynamic_naïve
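
The transcript does not define RDP; under the standard definition used for overlay multicast, it is the stretch of the tree path relative to direct IP routing:

```latex
\mathrm{RDP}(c) = \frac{d_{\text{tree}}(s, c)}{d_{\text{IP}}(s, c)}
```

where d_tree(s, c) is the latency from the source s to client c along the dissemination tree and d_IP(s, c) is the direct unicast latency, so RDP < 4 means the tree path is less than four times slower than IP.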

17
Tree Construction Traffic
  • Includes join requests, ping messages, replica placement and parent/child registration
  • dynamic_smart consumes three to four times as much traffic as dynamic_naïve, and the traffic of dynamic_naïve is quite close to IP_static
  • Tree construction is a far less frequent event than update dissemination

18
Conclusions and Future Work
  • Dissemination Tree: a dynamic Content Distribution Network on top of a peer-to-peer location service
    • Dynamic replica placement satisfies QoS and capacity constraints, and replicas self-organize into an application-level multicast tree
    • Uses Tapestry to improve scalability and locality
  • Simulation results show
    • A close-to-optimal number of replicas, good load distribution, and low multicast delay and bandwidth penalty, at the price of reasonable construction traffic
  • Future Work
    • Evaluate with more diverse topologies and workloads
    • Dynamic replica deletion/migration to adapt to shifts in users' interests

19
Routing in Detail
Example: octal digits, 2^12 namespace, routing 5712 → 7510 (sketched below)
[Figure: hop-by-hop route 5712 → 0880 → 3210 → 4510 → 7510, each hop matching one more suffix digit of 7510]
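
A minimal Python sketch of the suffix-matching idea behind this example. It is a simplification that assumes a global list of node IDs; real Tapestry resolves each hop from per-node neighbor tables and prefers nearby nodes.

```python
# Minimal sketch of Plaxton/Tapestry-style suffix routing. Simplified:
# real Tapestry picks the next hop from local neighbor tables (choosing
# a nearby node), not from a global node list as done here.

def route(src: str, dest: str, nodes: list[str]) -> list[str]:
    path, hop = [src], src
    for matched in range(1, len(dest) + 1):
        if hop == dest:
            break
        suffix = dest[-matched:]          # one more digit resolved per hop
        hop = next(n for n in nodes if n.endswith(suffix))
        path.append(hop)
    return path

# Example from the slide: with 4 octal digits (2^12 IDs),
# route("5712", "7510", ["0880", "3210", "4510", "7510", "5712"])
# yields 5712 -> 0880 -> 3210 -> 4510 -> 7510.
```
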
20
Dynamic Replica Placement
  • Client c sends a join request for object o to the statistically closest server s, through the nearest representative server rs
  • Naïve placement only checks s before placing new replicas, while the smart algorithm additionally checks the parent, free siblings and free server children of s
    • Remaining capacity info is piggybacked in soft-state messages
  • If still unsatisfied, s places a new replica on one of the Tapestry path nodes from c to s (the path is piggybacked in the request)
    • The naïve algorithm always chooses the qualified node closest to c, while the smart one places the replica on the furthest qualified node (see the sketch below)
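
To make the naïve/smart contrast concrete, here is a minimal Python sketch of the decision logic described above. The helpers are hypothetical: satisfies(n, c) stands for "node n meets c's latency bound and has spare capacity", and path is the Tapestry overlay path from c to s, ordered from the client side.

```python
# Sketch of the placement logic described above; helper names are
# hypothetical stand-ins, not the authors' code.

def place_naive(s, c, path, satisfies):
    if satisfies(s, c):
        return s                     # attach c directly to s
    for node in path:                # closest-to-c qualified node
        if satisfies(node, c):
            return node              # place a new replica here
    return None

def place_smart(s, c, path, satisfies, parent, siblings, server_children):
    # Aggressive search / lazy placement: try existing replicas around s
    # before paying for a new one.
    for node in [s, parent(s), *siblings(s), *server_children(s)]:
        if node is not None and satisfies(node, c):
            return node
    # Greedy load distribution: a new replica goes on the qualified path
    # node furthest from c.
    for node in reversed(path):
        if satisfies(node, c):
            return node
    return None
```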

21
Static Replica Placement
  • Without capacity constraints: minimal set cover
    • Most cost-effective method: greedy algorithm [Grossman & Wool 1994]
  • With capacity constraints: a variant of the capacitated facility location problem (formulated below)
    • C demand locations, S locations to build facilities; the capacity installed at each location is an integer multiple of u
    • Facility building cost f_i, service cost c_ij
    • Objective function: minimize the sum of the f_i and c_ij
    • Mapped to our problem: f_i = 1, and c_ij = 0 if location i covers demand j, infinity otherwise
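
A standard integer-programming statement consistent with the description above, writing y_i for the number of capacity units installed at location i and x_ij = 1 if demand j is served from i (the notation is ours, not from the slides):

```latex
\begin{aligned}
\min \quad & \sum_{i \in S} f_i\, y_i + \sum_{i \in S}\sum_{j \in C} c_{ij}\, x_{ij} \\
\text{s.t.} \quad & \sum_{i \in S} x_{ij} = 1 && \forall j \in C \\
& \sum_{j \in C} x_{ij} \le u\, y_i && \forall i \in S \\
& y_i \in \mathbb{Z}_{\ge 0},\quad x_{ij} \in \{0, 1\}
\end{aligned}
```

With f_i = 1 and c_ij in {0, infinity} as above, minimizing the objective amounts to placing the fewest replicas that serve every client within its QoS bound.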

22
Static Replica Placement Solutions
  • Best theoretical solution: primal-dual schema and Lagrangian relaxation [Jain & Vazirani 1999]
    • Approximation ratio 4
  • Variant of the greedy algorithm (sketched below)
    • Approximation ratio ln S
    • Choose the s with the largest value of min(cardinality |C_s|, remaining capacity RC_s)
    • If RC_s < |C_s|, choose the least-covered clients to cover first
  • With global IP topology vs. with Tapestry overlay path topology only
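
A minimal Python sketch of this greedy variant, under assumed data structures (not the authors' code): coverable[s] is C_s, the set of still-uncovered clients within s's QoS bound; capacity[s] is RC_s; and cover_count[c] counts how many candidate servers can still cover client c, so the least-covered clients can be served first.

```python
# Sketch of the greedy variant; the data structures are assumptions.

def greedy_place(coverable, capacity, cover_count):
    placed, assignment = [], {}
    uncovered = set().union(*coverable.values()) if coverable else set()
    while uncovered:
        # Choose s with the largest min(|C_s|, RC_s).
        s = max(coverable,
                key=lambda t: min(len(coverable[t] & uncovered), capacity[t]))
        cands = coverable[s] & uncovered
        if not cands or capacity[s] == 0:
            break                        # no server can serve the rest
        if capacity[s] < len(cands):
            # Capacity-limited: cover the least-covered clients first.
            cands = set(sorted(cands,
                               key=lambda c: cover_count[c])[:capacity[s]])
        if s not in placed:
            placed.append(s)
        for c in cands:
            assignment[c] = s
        capacity[s] -= len(cands)
        uncovered -= cands
    return placed, assignment
```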

23
Soft-State Tree Maintenance
  • Bi-directional messaging
    • Heartbeat messages downstream, refresh messages upstream
  • Scalability
    • Each member only maintains state for its direct children and parent
    • A join request can be handled by any member
  • Fault tolerance through rejoining (see the sketch below)
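
A minimal Python sketch of the soft-state bookkeeping; the timer values and handler names are illustrative assumptions, not parameters from the paper.

```python
# Illustrative soft-state bookkeeping; periods and names are assumed.
import time

REFRESH_PERIOD = 5.0             # child -> parent, upstream
TIMEOUT = 3 * REFRESH_PERIOD     # miss 3 refreshes => child presumed gone

class TreeNode:
    def __init__(self):
        self.parent = None
        self.children = {}       # child_id -> last refresh timestamp

    def on_refresh(self, child_id):
        # Upstream refresh: the child is alive; renew its soft state.
        self.children[child_id] = time.time()

    def expire_children(self):
        # Drop children whose refreshes stopped arriving.
        now = time.time()
        self.children = {c: t for c, t in self.children.items()
                         if now - t < TIMEOUT}

    def on_heartbeat_timeout(self):
        # Downstream heartbeats from the parent stopped: recover by
        # rejoining through any member (fault tolerance via rejoin).
        self.parent = None
        # ...send a new join request, as on slide 20
```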

24
Performance Under Various Conditions
  • With 100, 1000 or 4500 clients
  • With or without load constraints
  • dynamic_smart performs consistently better than dynamic_naïve and close to IP_static, as before
  • Load constraints can avoid hot spots

25
Performance Under Various Conditions II
  • With 2500 random d-tree servers
  • Merit of scalability: the maximal number of server children is very small compared with the total number of servers
    • Due to the randomness and search-with-locality properties of Tapestry
