Cost Aware Resource Management for Decentralized Network Services presentation

1
Cost Aware Resource Management for Decentralized
Network Services

Venugopalan Ramasubramanian (Rama)
Microsoft Research Silicon Valley / Cornell
University

2
Introduction

decentralized services have become increasingly
important
e.g. name systems, CDNs, publish-subscribe
low latency, constant availability, and high
scalability
current services often fall short of required
performance
ad hoc techniques

3
Problems with Ad hoc Techniques

no performance guarantees
unable to quantify/bound performance
unable to tune resource utilization to meet
performance targets
tailored to specific workloads
e.g. opportunistic caching on 90/10 rule
heavy-tailed popularity distributions
mutable objects

4
Principled Approach

fundamental cost-performance tradeoff
e.g. lookup latency vs. memory / bandwidth
consumption
resource allocation problem
which node hosts which object?
depends on popularity, size, update rate, etc.

5
Prior Work

Scalability
high complexity even to express the problem
number of objects x number of nodes (M x N)
Decentralization
objects are distributed among multiple nodes
expensive to perform resource allocation centrally

6
Cost-Aware Resource Management Framework

high performance, robust, and scalable services
Mathematical Optimization
system-wide performance goals become constraints
to optimization problems
Min. cost s.t. performance meets target
Max. performance s.t. cost limit
Structured Overlays
decentralization and self-organization
well-defined topology with bounded diameter and
node degree

7
Decentralized Internet Services

name service for the Internet
Cooperative Domain Name System (CoDoNS)
content distribution network
Cooperative Beehive Web (CoBWeb)
on-line data monitoring
Cornell On-line News Aggregator (CorONA)

8
Scalable Resource Allocation

structured overlay
each object has a home node
DAG rooted at home node reaching all nodes
uniform branching-factor
allocate resources at well-defined levels
level l means all nodes l hops away from home
node
low complexity resource allocation
Number of objects x Diameter (e.g. M x log N)
practical and scalable

9
Structured Overlays: Pastry
prefix-matching routing: $\log_b N$ hops
[Figure: Pastry overlay, node 2012 labeled]
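
The prefix-matching idea on this slide can be sketched in a few lines. This is a simplified model, not Pastry's routing-table logic: it assumes (hypothetically) that every identifier of the given length is occupied by a node, so a hop that fixes the next digit always exists. The names `route` and `shared_prefix_len` are illustrative.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that two identifiers have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def route(source: str, key: str) -> list[str]:
    """Path from source to the node whose ID equals key.

    Assumption: the ID space is fully populated, so the node that matches
    one more digit of the key (keeping the remaining digits of the current
    node) always exists. A real overlay would pick any routing-table entry
    with a longer matching prefix.
    """
    path = [source]
    current = source
    while current != key:
        matched = shared_prefix_len(current, key)
        # Fix at least one more digit of the key on every hop.
        current = key[: matched + 1] + current[matched + 1 :]
        path.append(current)
    return path


path = route("0021", "2012")
print(path, "hops:", len(path) - 1)
```

Each hop extends the matched prefix by at least one digit, so a path never needs more hops than there are digits in an identifier, which is where the $\log_b N$ bound comes from.
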
10
Opportunistic Caching in Pastry
[Figure: Pastry overlay, node 2012 labeled]
11
Structured Resource Allocation

analytically model performance-overhead tradeoff
object replicated at all nodes with l matching
prefix-digits
lookup latency: l hops
replicas: $N/b^l$
inexpensive to locate and update replicas
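
The tradeoff above can be tabulated directly. This is a minimal sketch using the slide's two expressions (lookup latency l, replicas N/b^l); the values N = 1024 and b = 4 are illustrative assumptions, and `allocation_table` is a hypothetical helper name.

```python
def allocation_table(n_nodes: int, b: int, max_level: int):
    """Rows of (level, replicas N / b**level, lookup latency in hops)."""
    return [(level, n_nodes / b**level, level) for level in range(max_level + 1)]


for level, replicas, hops in allocation_table(n_nodes=1024, b=4, max_level=5):
    print(f"level {level}: about {replicas:g} replicas, lookup latency {hops} hops")
```

Lowering the level by one multiplies the number of replicas by b and removes one hop from the lookup.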

[Figure: Pastry overlay with nodes 0021, 0112, 0122, and 2012]
12
Outline

Introduction
Honeycomb Framework
Optimization Analysis
Implementation
Applications
Evaluation
Conclusions

13
Analytical Modeling

level of allocation (l)
object hosted at all nodes l hops from the home
node
optimization problem: find optimal values of $l_i$
min. $\sum_i C_i(l_i)$ s.t. $\sum_i P_i(l_i) \le T$
max. $\sum_i P_i(l_i)$ s.t. $\sum_i C_i(l_i) \le T$
performance variables
lookup latency, update latency
cost variables
memory consumption, network overhead, number of
nodes

14
Optimization Problem: Lookup Latency

min. $\sum_i c_i \, b^{l_i}$ (total overhead)
s.t. $\sum_i q_i (D - l_i) \le T_L$ (avg. lookup latency)

$T_L$: target lookup latency in hops
$q_i$: relative query frequency
$c_i$: replication cost of object i
objects M, nodes N, branching factor b, diameter D
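
To make the two sums concrete, the following sketch evaluates them for one candidate allocation. It assumes the reading given above (total overhead as the sum of c_i b^{l_i}, average latency as the sum of q_i (D - l_i)); the function name `evaluate` and the numbers are illustrative only.

```python
def evaluate(levels, c, q, b, D):
    """Return (total overhead, avg. lookup latency) for one allocation.

    levels[i] is l_i, c[i] the replication cost and q[i] the relative
    query frequency of object i, following the symbols on the slide.
    """
    cost = sum(ci * b**li for ci, li in zip(c, levels))
    latency = sum(qi * (D - li) for qi, li in zip(q, levels))
    return cost, latency


# Two objects, branching factor 2, diameter 3, target T_L = 0.5 hops.
cost, latency = evaluate(levels=[3, 1], c=[1, 1], q=[0.75, 0.25], b=2, D=3)
print(cost, latency, latency <= 0.5)
```

Replicating the popular object fully and the unpopular one only one level out meets the 0.5-hop target at a cost of 10 units in this toy instance.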
15
Resource Allocation for Lookup Performance

target avg. lookup latency in hops
sub-one hop, fractional values (e.g., 0.5 hops)
indirectly specifies cache hit ratio
worst case lookup latency
lower bound on l
optimizes multiple overhead metrics
number of nodes: $c_i = 1$
memory: $c_i$ = size of object
bandwidth: $c_i$ = size x update rate

16
Analytical Optimization (Beehive)

Zipf popularity distribution (e.g. DNS, Web, RSS)
analytically tractable (one parameter $\alpha$)
closed-form solution
inexpensive to compute and apply

Ramasubramanian and Sirer NSDI 04
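
As a reminder of what the one-parameter Zipf model looks like, the sketch below generates relative query frequencies for ranked objects. It does not reproduce the closed-form solution; the exponent value 0.9 and the helper name `zipf_frequencies` are assumptions for illustration.

```python
def zipf_frequencies(n_objects: int, alpha: float) -> list[float]:
    """Relative query frequency of the rank-i object: i**(-alpha), normalized to sum to 1."""
    weights = [rank ** -alpha for rank in range(1, n_objects + 1)]
    total = sum(weights)
    return [w / total for w in weights]


q = zipf_frequencies(1000, alpha=0.9)
print(f"top 10 of 1000 objects receive {sum(q[:10]):.1%} of queries")
```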
17
Numerical Optimization

general-purpose approach
any popularity distribution (including Zipf)
many cost metrics (fine-grained bandwidth
consumption)
many performance metrics (update latency)
optimization problem is NP-Hard
Multiple choice Knapsack problem
discrete, convex, and separable
fast and accurate approximation algorithm
O(M D log(M D)) running time
at most one object per node (more or less than
optimum)

18
Numerical Optimization 2

Lagrange multiplier
min. $\sum_m C(l_m) - \lambda \left[ \sum_m P(l_m) - T \right]$
bisection-based bracketing algorithm
upper- and lower-bound solutions that differ in
one channel yield a near-optimal solution
pre-computing and sorting the $\lambda$s before
iterating yields an O(M D log(M D)) algorithm
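
The bracketing idea can be sketched for the lookup-latency problem as follows. This is a plain bisection, not the pre-sorted O(M D log(M D)) algorithm named on the slide. It writes the relaxed objective as cost plus a non-negative `weight` times latency, which plays the role of the multiplier up to sign convention. All function names are hypothetical.

```python
def best_levels(weight, c, q, b, D):
    """Each object independently minimizes c_i * b**l + weight * q_i * (D - l)."""
    return [
        min(range(D + 1), key=lambda l: ci * b**l + weight * qi * (D - l))
        for ci, qi in zip(c, q)
    ]


def avg_latency(levels, q, D):
    return sum(qi * (D - li) for qi, li in zip(q, levels))


def solve(c, q, b, D, target, iterations=60):
    """Bisect on the weight; return a feasible allocation close to the optimum."""
    if target < 0:
        raise ValueError("target latency must be non-negative")
    lo, hi = 0.0, 1.0
    levels = best_levels(hi, c, q, b, D)
    while avg_latency(levels, q, D) > target:  # grow the bracket until feasible
        lo, hi = hi, hi * 2
        levels = best_levels(hi, c, q, b, D)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        candidate = best_levels(mid, c, q, b, D)
        if avg_latency(candidate, q, D) <= target:
            hi, levels = mid, candidate  # feasible: try a smaller weight
        else:
            lo = mid  # infeasible: more replication is needed
    return levels


print(solve(c=[1, 1], q=[0.75, 0.25], b=2, D=3, target=0.5))
```

Because the problem is separable, each trial weight needs only an independent per-object minimization. The bisection keeps one infeasible (lower) and one feasible (upper) weight and returns the allocation of the feasible side.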

19
Honeycomb

cost-aware resource allocation framework for
structured overlays
properties
system-wide performance goals
scalability and failure resilience
quick adaptation to workload
fast update propagation

20
Scalable Resource Management

independent decisions
local aggregation
estimate popularity
communication only with overlay neighbors
replicas managed by one-hop neighbors

22
Decentralized Optimization

global optimum requires global information
Using local knowledge alone leads to sub-optimal
solutions
solution
approximate tradeoffs for non-local channels
aggregate coarse-grained information between
neighbors

23
Decentralized Optimization 2

approximate parameters
cluster channels with similar values of P(l) /
C(l)
constant number of clusters per level

24
Decentralized Optimization 3

Aggregating Clusters
Exchange clusters with one-hop neighbors
Hierarchical aggregation through structured
overlay
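
One way to picture the clustering and aggregation steps is the sketch below. It is an assumption-laden illustration, not Honeycomb's actual scheme: channels are bucketed by the logarithm of their P/C ratio into a fixed number of clusters, and two summaries are merged by adding counts and totals. The bucket bounds, the log spacing, and the names `cluster_by_ratio` and `merge` are all made up here.

```python
import math


def cluster_by_ratio(channels, n_clusters=8, lo=1e-3, hi=1e3):
    """Summarize (perf, cost) pairs into at most n_clusters log-spaced buckets.

    Returns {bucket: (count, total_perf, total_cost)}. Costs must be positive.
    """
    span = math.log(hi / lo)
    summary = {}
    for perf, cost in channels:
        ratio = min(max(perf / cost, lo), hi)  # clamp into [lo, hi]
        bucket = min(int(math.log(ratio / lo) / span * n_clusters), n_clusters - 1)
        count, p, c = summary.get(bucket, (0, 0.0, 0.0))
        summary[bucket] = (count + 1, p + perf, c + cost)
    return summary


def merge(a, b):
    """Combine two cluster summaries, e.g. a node's own and a neighbor's."""
    merged = dict(a)
    for bucket, (count, p, c) in b.items():
        n0, p0, c0 = merged.get(bucket, (0, 0.0, 0.0))
        merged[bucket] = (n0 + count, p0 + p, c0 + c)
    return merged


mine = cluster_by_ratio([(2.0, 1.0), (2.2, 1.0)])
theirs = cluster_by_ratio([(50.0, 1.0)])
print(merge(mine, theirs))
```

The message a node exchanges stays bounded by the number of clusters no matter how many channels it summarizes, which is the property the slides rely on.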

25
Adaptation to Workload Changes

popularity of objects may change drastically
flash-crowds, denial of service attacks
nodes measure popularity for local objects and
aggregate popularity estimates with neighbors

26
Adaptation to Workload Changes 2

orders of magnitude difference in query rates of
popular and unpopular objects
solution: combine inter-arrival times and query
counts
estimation times proportional to the query rate
of the object
monitoring overhead proportional to the query
rate of the object
quick detection of large increases in query rate
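
The slides do not give the estimator itself, so the following is only one plausible reading of "combine inter-arrival times and query counts": keep the arrival times of the last k queries (a count) and divide by the time they span (the sum of their inter-arrival times). `RateEstimator` and k = 5 are hypothetical choices.

```python
from collections import deque


class RateEstimator:
    """Estimate an object's query rate from the arrival times of its last k queries."""

    def __init__(self, k: int = 5):
        self.arrivals = deque(maxlen=k)  # older arrivals fall off automatically

    def record(self, timestamp: float) -> None:
        self.arrivals.append(timestamp)

    def rate(self):
        """Queries per unit time, or None until two distinct arrival times are seen."""
        if len(self.arrivals) < 2:
            return None
        elapsed = self.arrivals[-1] - self.arrivals[0]
        if elapsed <= 0:
            return None
        return (len(self.arrivals) - 1) / elapsed


est = RateEstimator(k=5)
for t in [0, 1, 2, 3, 4]:
    est.record(t)
print("steady rate:", est.rate())
for t in [4.1, 4.2, 4.3, 4.4]:  # sudden burst of queries
    est.record(t)
print("rate after burst:", est.rate())
```

With a fixed-size window, a popular object refreshes its estimate after only a short time, so a sharp increase in query rate shows up within a handful of queries, while state per object stays constant.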

27
Honeycomb: Fast Update Propagation

single integer (replication level) indicates
locations of all objects
no TTL required
proactively propagate updates
use neighbors in the underlying overlay
increasing version numbers differentiate versions
lazy updates in background
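
The role of version numbers in proactive propagation can be sketched as below. This is a simplified flood between neighbors and does not model replication levels; the `Node` class, the three-node topology, and the example key and values are assumptions for illustration.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []
        self.store = {}  # key -> (version, value)

    def receive(self, key, version, value):
        """Apply an update only if it is newer, then push it on to neighbors."""
        held = self.store.get(key)
        if held is not None and held[0] >= version:
            return  # stale or duplicate: drop it, which also ends the flood
        self.store[key] = (version, value)
        for neighbor in self.neighbors:
            neighbor.receive(key, version, value)


# Three fully connected nodes; names and values are made up.
a, b, c = Node("a"), Node("b"), Node("c")
a.neighbors, b.neighbors, c.neighbors = [b, c], [a, c], [a, b]
a.receive("example.org", 2, "10.0.0.2")
c.receive("example.org", 1, "10.0.0.1")  # an older version arriving late is ignored
print({n.name: n.store["example.org"] for n in (a, b, c)})
```

Because every replica compares version numbers before applying an update, out-of-order or repeated deliveries cannot roll an object back, so no TTL-based expiry is needed to converge.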

28
Outline

Introduction
Honeycomb Framework
Applications
Name service (CoDoNS)
Content distribution network (CoBWeb)
On-line data monitoring system (CorONA)
Evaluation
Conclusions

29
CoDoNS: Cooperative Domain Name System

legacy DNS has fundamental problems
poor failure resilience due to limited
replication
high response times due to multi-hop lookups
no support for spontaneous updates
cooperative cache for DNS bindings

[Figure: legacy DNS]
Ramasubramanian and Sirer, SIGCOMM 04
30
CoDoNS: Cooperative Domain Name System

structured, proactive caching of name-data
mappings
targets avg. lookup latency of 0.5 hops
minimizes memory consumption
updates pushed proactively to all caching nodes
self-certifying data to preserve integrity
(DNSSEC)
incremental deployment path
safety-net for legacy DNS
deployed on PlanetLab

31
CobWeb: Cooperative Beehive Web

Web caches
passive, client driven
Content Distribution Networks
active, replication driven
e.g. Akamai, Digital Island (commercial), CoDeeN,
CoralCDN (academia)
web caching solutions based on heuristics
ideal cache hit rate (60-70%) Wolman et al. 01
achieved cache hit rate (20-40%) Breslau et al.
99, Wolman et al. 01

32
CobWeb: Cooperative Beehive Web

CobWeb is a cooperative web cache
high cache hit rate through structured, proactive
caching
low network overhead using object size and update
rate
adaptation to flash crowds
CobWeb performance goals
min. network bandwidth s.t. cache hit rate meets
a target
max. cache hit rate s.t. network bandwidth is all
consumed

33
CobWeb: Cooperative Beehive Web

user interfaces
append cob-web.org to URLs
e.g., http://slashdot.org.cob-web.org:8888
DNS redirection, URL rewriting
Meridian finds closest node to the client
deployed on PlanetLab
more than 10 million requests per day
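
The URL convention above amounts to a small rewrite of the host part. The sketch below uses Python's standard `urllib.parse`. The helper name `cobwebify` is hypothetical, and the default port 8888 is taken from the slide's example URL rather than from any documented interface.

```python
from urllib.parse import urlsplit, urlunsplit


def cobwebify(url: str, suffix: str = "cob-web.org", port: int = 8888) -> str:
    """Append the CobWeb suffix (and port) to the host of an absolute URL."""
    parts = urlsplit(url)
    if not parts.hostname:
        raise ValueError(f"no host in URL: {url!r}")
    netloc = f"{parts.hostname}.{suffix}:{port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(cobwebify("http://slashdot.org"))
```

Since only the host changes, path and query string pass through untouched, which is what lets DNS redirection steer the request to a nearby node without the client doing anything else.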

34
Corona: Monitoring Online Data

continuously monitoring and detecting changes is
crucial
e.g., web pages, sensors, databases
content servers only provide query-based
interface
naïve approach through repeated, independent
polling
bad update performance
high server load

35
Corona: Monitoring Online Data

publish-subscribe interface for monitoring web
URLs
cooperative polling
resource allocation decides how many nodes poll
each channel

Ramasubramanian, Peterson, and Sirer NSDI 06
36
Corona Performance Goals

Corona Lite
Min. update detection time s.t. network load is
bounded
Corona Fast
Min. network load s.t. update detection time
meets a target
Corona Fair
Min. relative update detection time s.t. network
load is bounded
ratio of update detection time to update interval

37
Outline

Introduction
Honeycomb Framework
Applications
Evaluation
Conclusions

38
CoDoNS: Lookup Latency
MIT-DNS trace: 265,111 queries, 30,000 names, 65
nodes
CoDoNS gives 1.5 to 2 times lower latency
39
CoBWeb: Lookup Performance
NLANR workload: 1024 nodes, 10,000 objects,
100,000 queries
40
CoBWeb vs. Opportunistic Caching
Lookup Latency
41
CoBWeb vs. Opportunistic Caching
Storage Overhead
42
CoBWeb Flash Crowd
Lookup Latency
43
CoBWeb Flash Crowd
Network Bandwidth
44
Corona Update Performance
Corona improves update detection time from 15 min
to 45 sec
Corona keeps load lower than Legacy RSS
45
Corona Update Performance
Heuristics vs. Corona
46
Conclusions

enables high performance, robust, and scalable
network services
principled approach for achieving performance
goals in distributed systems
mathematical optimization and structured overlays
CoDoNS, CobWeb, and Corona

47
Other Research in Wireless Networks

Sharp: hybrid adaptive routing protocol for mobile
ad hoc networks (MobiHoc 03)
combines proactive and reactive approaches to
routing to achieve high performance efficiently
SRL: bidirectional abstraction to support routing
protocols on asymmetric mobile ad hoc networks
(INFOCOM 02)
Anonymous Gossip: improving multicast reliability
on mobile ad hoc networks (ICDCS 01)
