Untangling the Web from DNS - PowerPoint PPT Presentation

Provided by: rocCsBe

Learn more at: http://roc.cs.berkeley.edu

Transcript and Presenter's Notes

1
A case for resource discovery in shared
distributed platforms
David Oppenheimer
UCB ROC Retreat, 12 January 2005
2
Introduction

Application performance is a function of
resources available to the application
resources needed by the application
(or, application sensitivity to resource constraints)
At the summer retreat, described SWORD:
at app deployment time, find the best set of nodes given
resources available on a set of distributed nodes
application sensitivity to resource constraints
Assumptions:
(1) available resources vary among nodes enough to matter
(spare CPU, mem, disk space; inter-node latency, avail. bw, ...)
(2) applications are sensitive to resource constraints enough to matter
Focus of this talk: verify assumption (1)
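
The placement problem stated on this slide can be illustrated with a small sketch. This is not SWORD's algorithm, which the slides do not describe; the node names, attribute names, targets, weights, and the penalty function are all hypothetical, chosen only to show how "resources available" and "application sensitivity" might combine into a ranking.

```python
# Hypothetical node snapshot; attribute names and values are illustrative only.
nodes = {
    "node-a": {"free_mem_mb": 120.0, "load": 1.2},
    "node-b": {"free_mem_mb": 15.0, "load": 11.0},
    "node-c": {"free_mem_mb": 60.0, "load": 4.0},
}

# Assumed application model: a target value per attribute plus a weight that
# expresses how sensitive the application is to missing that target.
requirements = {
    "free_mem_mb": {"want": 100.0, "weight": 2.0, "higher_is_better": True},
    "load": {"want": 2.0, "weight": 1.0, "higher_is_better": False},
}


def penalty(attrs, reqs):
    """Weighted relative shortfall of one node against the application's targets."""
    total = 0.0
    for name, req in reqs.items():
        value = attrs[name]
        if req["higher_is_better"]:
            shortfall = max(0.0, req["want"] - value)
        else:
            shortfall = max(0.0, value - req["want"])
        total += req["weight"] * shortfall / req["want"]
    return total


def best_nodes(nodes, reqs, k):
    """Return the names of the k nodes with the lowest penalty."""
    return sorted(nodes, key=lambda name: penalty(nodes[name], reqs))[:k]


print(best_nodes(nodes, requirements, 2))
```

If all nodes looked alike (assumption 1 false) or all weights were near zero (assumption 2 false), every node would score about the same and the ranking would carry no information, which is why the talk checks assumption (1) first.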

3
Introduction (cont.)

Questions we will address
is there enough variation among nodes at any
given (deployment) time to justify service
placement?
is there enough variation over time on a single
node to justify periodic task migration?
are there correlations between attributes on a
single node, or among nodes at the same site?
All of these questions are important in designing
a system for resource discovery and service
placement (like SWORD)

4
Outline

How much does the available amount of per-node
resources vary among nodes at a fixed time?
How much does the available amount of per-node
resources vary over time? How much do inter-node
latency and available bandwidth vary over time?
On a given node, are any per-node attributes
strongly correlated? Are inter-node latency and
available bandwidth correlated?

5
Experimental environment

Per-node attributes: Ganglia, CoMon
two-week period (Oct 10-Oct 24, 2004)
each node polled every 5 minutes
free memory, free swap, free disk, load average, network bytes sent and received/sec, active slices
Inter-node latency: all-pairs pings
one-month period ending Oct 24, 2004
each pair of nodes measured every 15 minutes
Inter-node bandwidth: Iperf
one-month period ending Oct 24, 2004
each pair of nodes measured 1-2x/week
About 250 nodes in the trace each day

6
Outline

How much does the available amount of per-node
resources vary among nodes at a fixed time?
How much does the available amount of per-node
resources vary over time? How much do inter-node
latency and available bandwidth vary over time?
On a given node, are any per-node attributes
strongly correlated? Are inter-node latency and
available bandwidth correlated?

7
Resource heterogeneity: averages

How much do available resources vary over the trace?

attribute            mean    std. dev.   10th %ile   90th %ile
# of CPUs            1.0     0.0         1.0         1.0
CPU speed (MHz)      1942    572         1263        2652
Total disk (GB)      127     88.5        35.1        232
Total memory (MB)    1153    467         628         2017
Total swap (GB)      1.0     0.0         1.0         1.0
8
Resource heterogeneity: averages

How much do available resources vary over the trace?

attribute            mean      std. dev.   10th %ile   90th %ile
1 min load average   6.81      20.06       1.05        11.86
Free memory (MB)     62.359    125.234     13.668      105.432
Free swap (MB)       755.596   178.795     524.336     1000.268
Free disk (GB)       102.8     86.04       8.088       208.3
Active slices        13.3      5.96        0.0         20.0
Bytes/s in           50477     117023      5568        92877
Bytes/s out          52543     130112      5476        96214
9
Resource heterogeneity: CV vs. time
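
CV here is taken to mean the coefficient of variation, the standard deviation divided by the mean. The plot for this slide is not preserved in the transcript; presumably it shows the CV across nodes at each point in time. As a rough sense of scale only, the sketch below applies the same ratio to the trace-wide means and standard deviations from the preceding table of dynamic attributes. Treating those summary figures as a stand-in for the per-time-step CV is an assumption.

```python
# (mean, standard deviation) pairs copied from the preceding table.
stats = {
    "1 min load average": (6.81, 20.06),
    "Free memory (MB)": (62.359, 125.234),
    "Free swap (MB)": (755.596, 178.795),
    "Free disk (GB)": (102.8, 86.04),
    "Active slices": (13.3, 5.96),
    "Bytes/s in": (50477, 117023),
    "Bytes/s out": (52543, 130112),
}


def coefficient_of_variation(mean, std_dev):
    """CV = standard deviation / mean (dimensionless)."""
    return std_dev / mean


cv = {name: coefficient_of_variation(m, s) for name, (m, s) in stats.items()}

# Print attributes from least to most variable relative to their mean.
for name, value in sorted(cv.items(), key=lambda item: item[1]):
    print(f"{name:20s} CV = {value:.2f}")
```

Because CV is dimensionless, it allows attributes measured in different units (MB, GB, bytes/s, load) to be compared on one axis.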
10
Outline

How much does the available amount of per-node
resources vary among nodes at a fixed time?
How much does the available amount of per-node
resources vary over time? How much do inter-node
latency and available bandwidth vary over time?
On a given node, are any per-node attributes
strongly correlated? Are inter-node latency and
available bandwidth correlated?

11
Variability of per-node attributes over time
12
Variability of per-node attributes over time
13
Variability of per-node attributes over time
14
Variability of per-node attributes over time

Can rank degree of variability of each attribute:
disk, swap < mem, load < net bytes, slices (moderate to significant)
CDF curve shifts to the right as interval length increases:
attributes vary less over short time periods than over long ones
migration interval: find the sweet spot in the curve of variability vs. interval length
CDF slope decreases as the median variability of the attribute increases:
may be able to classify nodes as high/low variability over time for mem, load, net bytes (they have high median variability)
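
The slides do not define the variability metric behind these CDFs. The sketch below assumes one plausible definition: for a given interval length, split each node's trace into consecutive windows, take the coefficient of variation within each window, and use the median over windows as that node's variability; the CDF is then taken across nodes. The traces are hypothetical.

```python
import statistics


def window_variability(series, window):
    """Median coefficient of variation over consecutive non-overlapping windows."""
    cvs = []
    for start in range(0, len(series) - window + 1, window):
        chunk = series[start:start + window]
        mean = statistics.fmean(chunk)
        if mean != 0:
            cvs.append(statistics.pstdev(chunk) / mean)
    return statistics.median(cvs)


def empirical_cdf(values):
    """Return sorted (value, fraction of nodes <= value) points."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


# Hypothetical per-node traces, one sample per 5-minute poll.
traces = {
    "steady": [10, 10, 11, 10, 10, 11, 10, 10, 11, 10, 10, 11],
    "drifting": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    "bursty": [1, 9, 1, 9, 1, 9, 1, 9, 1, 9, 1, 9],
}

# Windows of 3, 6 and 12 samples are 15, 30 and 60 minutes at 5-minute polling.
for window in (3, 6, 12):
    per_node = [window_variability(trace, window) for trace in traces.values()]
    print(window, empirical_cdf(per_node))
```

Under this definition a slowly drifting attribute looks stable over short windows and variable over long ones, which is the rightward shift of the CDF that the slide describes.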

15
Inter-node latency and BW variation over time

Most nodes have low latency (and bw) variability
even over a month-long trace
migration may not be worthwhile

16
Outline

How much does the available amount of per-node
resources vary among nodes at a fixed time?
How much does the available amount of per-node
resources vary over time? How much do inter-node
latency and available bandwidth vary over time?
On a given node, are any per-node attributes
strongly correlated? Are inter-node latency and
available bandwidth correlated?

17
Correlation among per-node attributes
r          loadone  memfree  swapfree  diskfree  actvslice  byte_in  byte_out
loadone     .080
memfree    -.050     .627
swapfree   -.231     .274     .473
diskfree   -.035     .192     .212      .929
actvslice   .079    -.050    -.219      .049      .773
byte_in     .059    -.033    -.074      .059      .140       .209
byte_out    .058    -.033    -.059      .078      .137       .443     .188

No strong correlations between different attributes,
though some one-hour trace segments showed some
Some correlation between nodes at the same site
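
The r values above are presumably Pearson correlation coefficients; the slides do not say how they were computed or aggregated across nodes. As a minimal sketch of the per-node calculation, the function below computes Pearson's r for two attribute series sampled at the same times. The sample values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical samples for one node: load average and free memory (MB).
loadone = [1.0, 2.5, 4.0, 3.0, 6.5, 5.0]
memfree = [80.0, 75.0, 60.0, 70.0, 40.0, 55.0]

print(round(pearson_r(loadone, memfree), 3))
```

A value near 0 means that knowing one attribute says little about the other, so a resource discovery query would have to treat each attribute as a separate dimension.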

18
Correlation between latency and avail. BW
r = -.59

Moderate inverse power-law correlation
Using latency to estimate BW gives 233% error
some nodes are bandwidth-capped, some in weird ways
Some node pairs showed strong lat-BW correlation
17% within 25%, 56% within 50%
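
An inverse power law means bandwidth falls off roughly as c * latency^k with k negative. The slides do not give the fitting procedure; the sketch below assumes an ordinary least-squares fit in log-log space and then reports the share of estimates within the 25 and 50 thresholds quoted above, read here as relative error. The measurements are hypothetical.

```python
import math


def fit_power_law(latencies_ms, bandwidths):
    """Least-squares fit of log(bw) = log(c) + k * log(latency); returns (c, k)."""
    xs = [math.log(lat) for lat in latencies_ms]
    ys = [math.log(bw) for bw in bandwidths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    k = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    c = math.exp(mean_y - k * mean_x)
    return c, k


def relative_errors(latencies_ms, bandwidths, c, k):
    """Relative error |estimate - measured| / measured for each node pair."""
    return [abs(c * lat ** k - bw) / bw
            for lat, bw in zip(latencies_ms, bandwidths)]


def fraction_within(errors, tolerance):
    """Share of node pairs whose relative error is at most the tolerance."""
    return sum(e <= tolerance for e in errors) / len(errors)


# Hypothetical node-pair measurements: latency in ms, bandwidth in Mb/s.
latency = [10, 20, 40, 80, 160]
bandwidth = [90, 60, 20, 15, 5]

c, k = fit_power_law(latency, bandwidth)
errors = relative_errors(latency, bandwidth, c, k)
print(f"bw ~ {c:.1f} * latency^{k:.2f}")
print("within 25%:", fraction_within(errors, 0.25))
print("within 50%:", fraction_within(errors, 0.50))
```

Bandwidth-capped nodes break this model because their bandwidth stays flat regardless of latency, which is consistent with the large average error reported on the slide.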

19
Conclusion

How much does the available amount of per-node resources vary among nodes at a fixed time?
significantly: enough to warrant service placement
How much does the available amount of per-node resources vary over time? How much do inter-node latency and available bandwidth vary over time?
moderate variability: may warrant migration
On a given node, are any per-node attributes strongly correlated? Are inter-node latency and available bandwidth correlated?
no strong correlation between different attributes
some correlation between the same attribute at the same site
latency can predict available bandwidth

20
Future work

Ask same questions but use application model to
answer, rather than analysis of raw data
different apps have different resource
sensitivities
different apps have different migration costs
Can we predict attribute values?
give warning before migration
or just don't bother to deploy on bad nodes
How much better could we do if SWORD could
schedule jobs?