Cost Aware Resource Management for Decentralized Network Services
1
Cost Aware Resource Management for Decentralized
Network Services
  • Venugopalan Ramasubramanian (Rama)
  • Microsoft Research Silicon Valley / Cornell
    University

2
Introduction
  • decentralized services have become increasingly
    important
  • e.g. name systems, CDNs, publish-subscribe
  • low latency, constant availability, and high
    scalability
  • current services often fall short of required performance
  • built with ad hoc techniques

3
Problems with Ad hoc Techniques
  • no performance guarantees
  • unable to quantify/bound performance
  • unable to tune resource utilization to meet
    performance targets
  • tailored to specific workloads
  • e.g. opportunistic caching based on the 90/10 rule
  • heavy-tailed popularity distributions
  • mutable objects

4
Principled Approach
  • fundamental cost-performance tradeoff
  • e.g. lookup latency vs. memory / bandwidth
    consumption
  • resource allocation problem
  • which node hosts which object?
  • depends on popularity, size, update rate, etc.

5
Prior Work
  • Scalability
  • high complexity even to express the problem
  • number of objects x number of nodes (M x N)
  • Decentralization
  • objects are distributed among multiple nodes
  • expensive to perform resource allocation centrally

6
Cost-Aware Resource Management Framework
  • high performance, robust, and scalable services
  • Mathematical Optimization
  • system-wide performance goals become constraints
    to optimization problems
  • Min. cost s.t. performance meets target
  • Max. performance s.t. cost limit
  • Structured Overlays
  • decentralization and self-organization
  • well-defined topology with bounded diameter and
    node degree

7
Decentralized Internet Services
  • name service for the Internet
  • Cooperative Domain Name System (CoDoNS)
  • content distribution network
  • Cooperative Beehive Web (CoBWeb)
  • on-line data monitoring
  • Cornell On-line News Aggregator (CorONA)

8
Scalable Resource Allocation
  • structured overlay
  • each object has a home node
  • DAG rooted at home node reaching all nodes
  • uniform branching-factor
  • allocate resources at well-defined levels
  • level l means all nodes l hops away from home
    node
  • low complexity resource allocation
  • Number of objects x Diameter (e.g. M x log N)
  • practical and scalable

9
Structured Overlays: Pastry
prefix-matching routing, log_b N hops
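The prefix-matching routing above can be sketched as a toy model (an illustration, not Pastry's actual routing-table machinery; the base-4 node IDs are made up for the example): each hop forwards to a node sharing at least one more ID-prefix digit with the key, so a lookup takes at most log_b N hops.

```python
def shared_prefix(a: str, b: str) -> int:
    """Number of leading digits two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def route(nodes, start, key):
    """Greedy prefix routing over a full node list (toy stand-in for
    Pastry's per-node routing tables). Returns the hop-by-hop path."""
    path, cur = [start], start
    while cur != key:
        target = shared_prefix(cur, key) + 1
        # find some node matching at least one more digit of the key
        nxt = next((n for n in nodes if shared_prefix(n, key) >= target), None)
        if nxt is None:
            break  # no node matches more digits: cur is the key's home node
        path.append(nxt)
        cur = nxt
    return path

# four base-4 digits -> at most 4 hops, one digit resolved per hop
print(route(["0021", "2000", "2010", "2012"], "0021", "2012"))
```

Each hop fixes at least one more digit of the key, which is exactly why the diameter is bounded by the number of ID digits, log_b N.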
10
Opportunistic Caching in Pastry
11
Structured Resource Allocation
  • analytically model the performance-overhead tradeoff
  • object replicated at all nodes with l matching prefix-digits
  • lookup latency: l hops
  • number of replicas: N/b^l
  • inexpensive to locate and update replicas

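The replicas-vs-latency tradeoff on this slide can be made concrete with a minimal numeric sketch (the parameter values below are illustrative assumptions): an object allocated at level l is replicated on all nodes sharing l prefix digits with its home node, so lookups take l hops and the object uses N / b**l replicas.

```python
def replicas(N: int, b: int, l: int) -> float:
    """Replica count for an object allocated at level l."""
    return N / b**l

def avg_lookup_hops(levels, query_freqs):
    """Average lookup latency in hops, given each object's level and
    its relative query frequency (frequencies sum to 1)."""
    return sum(q * l for l, q in zip(levels, query_freqs))

# 1024 nodes, branching factor 4: a level-0 object sits on every node
# (0-hop lookups), a level-3 object on only 16 nodes (3-hop lookups)
print(replicas(1024, 4, 0), replicas(1024, 4, 3))  # 1024.0 16.0
```

Putting a hot object at level 0 and a cold one at level 3 with query split 90/10 yields an average latency of 0.3 hops, showing how fractional latency targets arise.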
12
Outline
  • Introduction
  • Honeycomb Framework
  • Optimization Analysis
  • Implementation
  • Applications
  • Evaluation
  • Conclusions

13
Analytical Modeling
  • level of allocation (l)
  • object hosted at all nodes l hops from the home
    node
  • optimization problem: find optimal values of li
  • min. Σ Ci(li) s.t. Σ Pi(li) ≤ T
  • max. Σ Pi(li) s.t. Σ Ci(li) ≤ T
  • performance variables
  • lookup latency, update latency
  • cost variables
  • memory consumption, network overhead, number of
    nodes

14
Optimization Problem: Lookup Latency
  • min. Σ ci · b^li s.t. Σ qi (D − li) ≤ TL

total overhead: Σ ci · b^li
avg. lookup latency: Σ qi (D − li)
TL: target lookup latency in hops; qi: relative query frequency; ci: replication cost of object i
M objects, N nodes, branching factor b, diameter D
15
Resource Allocation for Lookup Performance
  • target avg. lookup latency in hops
  • sub-one-hop, fractional values (e.g., 0.5 hops)
  • indirectly specifies cache hit ratio
  • worst-case lookup latency
  • lower bound on l
  • optimizes multiple overhead metrics
  • number of nodes ∝ 1
  • memory ∝ size of object
  • bandwidth ∝ size × update rate

16
Analytical Optimization (Beehive)
  • Zipf popularity distribution (e.g. DNS, Web, RSS)
  • analytically tractable (one parameter α)
  • closed-form solution
  • inexpensive to compute and apply

Ramasubramanian and Sirer NSDI 04
17
Numerical Optimization
  • general-purpose approach
  • any popularity distribution (including Zipf)
  • many cost metrics (fine-grained bandwidth
    consumption)
  • many performance metrics (update latency)
  • optimization problem is NP-hard
  • reduces to the multiple-choice knapsack problem
  • discrete, convex, and separable
  • fast and accurate approximation algorithm
  • O(MD log(MD)) running time
  • within at most one object per node of the optimum

18
Numerical Optimization 2
  • Lagrange multiplier
  • min. Σ C(lm) + λ · (Σ P(lm) − T)
  • bisection-based bracketing algorithm
  • upper- and lower-bound solutions that differ in one channel yield a near-optimal solution
  • pre-computing and sorting the λs before iterating yields an O(MD log(MD)) algorithm

19
Honeycomb
  • cost-aware resource allocation framework for
    structured overlays
  • properties
  • system-wide performance goals
  • scalability and failure resilience
  • quick adaptation to workload
  • fast update propagation

20
Scalable Resource Management
  • independent decisions
  • local aggregation
  • estimate popularity
  • communication only with overlay neighbors
  • replicas managed by one-hop neighbors

22
Decentralized Optimization
  • global optimum requires global information
  • local knowledge alone leads to sub-optimal solutions
  • solution
  • approximate tradeoffs for non-local channels
  • aggregate coarse-grained information between
    neighbors

23
Decentralized Optimization 2
  • approximate parameters
  • cluster channels with similar values of P(l) /
    C(l)
  • constant number of clusters per level

24
Decentralized Optimization 3
  • Aggregating Clusters
  • Exchange clusters with one-hop neighbors
  • Hierarchical aggregation through structured
    overlay

25
Adaptation to Workload Changes
  • popularity of objects may change drastically
  • flash-crowds, denial of service attacks
  • nodes measure popularity for local objects and
    aggregate popularity estimates with neighbors

26
Adaptation to Workload Changes 2
  • orders of magnitude difference in query rates of
    popular and unpopular objects
  • solution: combine inter-arrival times and query counts
  • estimation times proportional to the query rate
    of the object
  • monitoring overhead proportional to the query
    rate of the object
  • quick detection of large increases in query rate

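One simple estimator in this spirit (an illustrative sketch, not the deck's exact scheme) keeps an exponentially weighted average of query inter-arrival times. Popular objects see many arrivals per unit time, so their rate estimate converges after proportionally less wall-clock time, and a flash crowd is detected within a few queries.

```python
class PopularityEstimator:
    """EWMA over query inter-arrival times for one object."""
    def __init__(self, alpha=0.2):
        self.alpha = alpha
        self.rate = None   # estimated queries per second
        self.last = None   # timestamp of the previous query

    def record(self, t):
        if self.last is not None:
            inst = 1.0 / (t - self.last)   # instantaneous rate
            self.rate = (inst if self.rate is None
                         else (1 - self.alpha) * self.rate
                              + self.alpha * inst)
        self.last = t

est = PopularityEstimator()
for i in range(50):                 # steady state: one query per second
    est.record(float(i))
for i in range(100):                # flash crowd: ten queries per second
    est.record(50.0 + i * 0.1)
# est.rate has climbed from ~1 toward ~10 within seconds of the burst
```

Because updates happen per query, the monitoring cost is itself proportional to the query rate, matching the bullet above.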
27
Honeycomb Fast Update Propagation
  • single integer (replication level) indicates
    locations of all objects
  • no TTL required
  • proactively propagate updates
  • use neighbors in the underlying overlay
  • monotonically increasing version numbers distinguish stale from fresh data
  • lazy updates in background

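Version-gated update application can be sketched minimally (an illustration, not CoDoNS's wire protocol): a replica accepts an update only when the version number increases, so duplicate or reordered propagation through overlay neighbors is harmless and no TTLs are needed.

```python
store = {}  # key -> (version, value) held by this replica

def apply_update(key, version, value):
    """Apply an update only if it is newer than what we hold."""
    cur = store.get(key)
    if cur is None or version > cur[0]:
        store[key] = (version, value)
        return True    # applied; re-forward to overlay neighbors
    return False       # stale or duplicate; drop silently

apply_update("example.com", 1, "10.0.0.1")
apply_update("example.com", 1, "10.0.0.1")   # duplicate: rejected
apply_update("example.com", 2, "10.0.0.2")   # newer version: applied
```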
28
Outline
  • Introduction
  • Honeycomb Framework
  • Applications
  • Name service (CoDoNS)
  • Content distribution network (CoBWeb)
  • On-line data monitoring system (CorONA)
  • Evaluation
  • Conclusions

29
CoDoNS Cooperative Domain Name System
  • legacy DNS has fundamental problems
  • poor failure resilience due to limited
    replication
  • high response times due to multi-hop lookups
  • no support for spontaneous updates
  • cooperative cache for DNS bindings

Ramasubramanian and Sirer SIGCOMM 04
30
CoDoNS Cooperative Domain Name System
  • structured, proactive caching of name-data
    mappings
  • targets avg. lookup latency of 0.5 hops
  • minimizes memory consumption
  • updates pushed proactively to all caching nodes
  • self-certifying data to preserve integrity
    (DNS-SEC)
  • incremental deployment path
  • safety-net for legacy DNS
  • deployed on PlanetLab

31
CobWeb Cooperative Beehive Web
  • Web caches
  • passive, client driven
  • Content Distribution Networks
  • active, replication driven
  • e.g. Akamai, Digital Island (commercial), CoDeeN,
    CoralCDN (academia)
  • web caching solutions based on heuristics
  • ideal cache hit rate (60-70%) [Wolman et al. 01]
  • achieved cache hit rate (20-40%) [Breslau et al. 99, Wolman et al. 01]

32
CobWeb Cooperative Beehive Web
  • CobWeb is a cooperative web cache
  • high cache hit rate through structured, proactive
    caching
  • low network overhead using object size and update
    rate
  • adaptation to flash crowds
  • CobWeb performance goals
  • min. network bandwidth s.t. cache hit rate meets
    a target
  • max. cache hit rate s.t. network bandwidth stays within budget

33
CobWeb Cooperative Beehive Web
  • user interfaces
  • append cob-web.org to URLs
  • e.g., http://slashdot.org.cob-web.org:8888
  • DNS redirection, URL rewriting
  • Meridian finds the closest node to the client
  • deployed on PlanetLab
  • more than 10 million requests per day

34
Corona Monitoring Online Data
  • continuously monitoring and detecting changes is
    crucial
  • e.g., web pages, sensors, databases
  • content servers only provide query-based
    interface
  • naïve approach: repeated, independent polling
  • poor update performance
  • high server load

35
Corona Monitoring Online Data
  • publish-subscribe interface for monitoring web URLs
  • cooperative polling
  • resource allocation decides how many nodes poll
    each channel

Ramasubramanian, Peterson, and Sirer NSDI 06
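A back-of-the-envelope model shows why cooperative polling helps (this model is my illustrative assumption, not the paper's analysis): if k nodes poll a channel with period P at independent random phases, the expected update-detection delay is P/(k+1), so allocating more pollers to a channel shrinks its detection time.

```python
import random

def expected_delay(period, k):
    """Expected detection delay with k independent pollers, each
    polling once per `period` at a uniformly random phase."""
    return period / (k + 1)

def simulate(period, k, trials=20000, seed=0):
    """Monte Carlo check: delay until the first poller fires after
    an update at a random instant."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        update = rng.uniform(0, period)
        total += min((rng.uniform(0, period) - update) % period
                     for _ in range(k))
    return total / trials

print(expected_delay(60.0, 5))  # 10.0: five cooperating pollers cut a
                                # 60 s period to ~10 s expected delay
```

The resource-allocation problem is then deciding how many pollers each channel gets under a global load budget, which is exactly the optimization framing of the next slide.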
36
Corona Performance Goals
  • Corona Lite
  • Min. update detection time s.t. network load is
    bounded
  • Corona Fast
  • Min. network load s.t. update detection time
    meets a target
  • Corona Fair
  • Min. relative update detection time s.t. network
    load is bounded
  • ratio of update detection time to update interval

37
Outline
  • Introduction
  • Honeycomb Framework
  • Applications
  • Evaluation
  • Conclusions

38
CoDoNS Lookup Latency
MIT DNS trace: 265,111 queries, 30,000 names, 65 nodes
CoDoNS gives 1.5 to 2 times better latency
39
CoBWeb Lookup Performance
NLANR workload: 1,024 nodes, 10,000 objects, 100,000 queries
40
CoBWeb vs. Opportunistic Caching
Lookup Latency
41
CoBWeb vs. Opportunistic Caching
Storage Overhead
42
CoBWeb Flash Crowd
Lookup Latency
43
CoBWeb Flash Crowd
Network Bandwidth
44
Corona Update Performance
Corona improves update detection time from 15 min
to 45 sec
Corona keeps load lower than Legacy RSS
45
Corona Update Performance
Heuristics vs. Corona
46
Conclusions
  • enables high performance, robust, and scalable
    network services
  • principled approach for achieving performance
    goals in distributed systems
  • mathematical optimization and structured overlays
  • CoDoNS, CobWeb, and Corona

47
Other Research in Wireless Networks
  • SHARP: hybrid adaptive routing protocol for mobile ad hoc networks [MobiHoc 03]
  • combines proactive and reactive routing approaches to achieve high performance efficiently
  • SRL: bidirectional abstraction to support routing protocols on asymmetric mobile ad hoc networks [INFOCOM 02]
  • Anonymous Gossip: improving multicast reliability on mobile ad hoc networks [ICDCS 01]
