Title: Towards Energy Efficient Hadoop
1. Towards Energy Efficient Hadoop
- Wednesday, June 10, 2009
- Santa Clara Marriott
2. Why Energy?
- Cooling
- Costs
- Environment
3. Why Energy Efficient Software
- Power Usage Effectiveness (PUE):
\[ \mathrm{PUE} = \frac{\text{Total power used by a datacenter}}{\text{IT power used by a datacenter}} \]
- Total power = IT power + PDU + UPS + HVAC + lighting + other overhead
- IT power = servers, network, storage
- PUE of about 2 circa 2006 and before; close to 1 present day
- Most of the further savings are to be had in IT hardware and software
4. Energy as a Performance Metric
- Traditional view of the software system design space
[Figure: Productivity vs. Resources Used]
- Increase productivity for fixed resources of a system
5. Energy as a Performance Metric
- Maybe a better view of the design space?
[Figure: axes Productivity, Energy, Resources Used]
- Decrease energy without compromising productivity?
6. Methodology
Performance Metrics
- Basket of metrics: job duration, energy, power (i.e. time rate of energy use).
- Performance variance?
Parameters
- Static: cluster size, workload size, configuration parameters.
- Dynamic: task scheduling? Block placement? Speculative execution?
Workload
- Exercise all components: sort, HDFS read, HDFS write, shuffle.
- Representative of production workloads: Nutch, GridMix, others?
Energy measurement
- Wall plug energy measurement: 1W accuracy, 1 reading per second.
- Fine grain measurement to correlate energy consumption to hardware components?
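To illustrate how the basket of metrics relates to the wall plug readings, here is a minimal sketch that turns a series of power samples into job duration, energy, and mean power. It assumes evenly spaced readings (one per second, as in the slide) and uses trapezoidal integration; the function names and the choice of integration rule are illustrative assumptions, not the measurement code used for the talk.

```python
from typing import Dict, Sequence


def energy_joules(power_watts: Sequence[float], interval_s: float = 1.0) -> float:
    """Integrate evenly spaced power samples (watts) into energy (joules).

    Uses the trapezoidal rule; this is an assumed choice for illustration.
    """
    if len(power_watts) < 2:
        return 0.0
    total = 0.0
    for a, b in zip(power_watts, power_watts[1:]):
        total += 0.5 * (a + b) * interval_s
    return total


def job_metrics(power_watts: Sequence[float], interval_s: float = 1.0) -> Dict[str, float]:
    """Return duration (s), energy (J), and mean power (W) for one job."""
    duration = max(len(power_watts) - 1, 0) * interval_s
    energy = energy_joules(power_watts, interval_s)
    mean_power = energy / duration if duration > 0 else 0.0
    return {"duration_s": duration, "energy_j": energy, "mean_power_w": mean_power}
```

With a 1W meter and one reading per second, fluctuations shorter than a second are not captured, so the result is best read as an estimate of total energy over jobs that run for many seconds.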
7. Scaling to More Workers: Sort
- Terasort format, 100-byte records with 10-byte keys, 10GB of total data
- Out-of-box Hadoop 0.18.2 with default config.
- Reduce energy by adding more workers?
- JouleSort highly customized system vs. out-of-box Hadoop with default config.: 11k sorted records per joule vs. 87 sorted records per joule
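The records-per-joule figure can be computed from data size, record size, and measured energy. The sketch below is a hedged illustration: the helper name and the 1,000,000 J energy value are hypothetical, and 10GB is assumed to mean 10^10 bytes. Only the 11k and 87 records-per-joule figures come from the slide.

```python
def records_per_joule(total_bytes: float, record_bytes: float, energy_j: float) -> float:
    """Sorted records per joule for a sort job (hypothetical helper)."""
    if record_bytes <= 0 or energy_j <= 0:
        raise ValueError("record_bytes and energy_j must be positive")
    return (total_bytes / record_bytes) / energy_j


# Hypothetical measurement: 10 GB of 100-byte records sorted using 1,000,000 J.
example = records_per_joule(10**10, 100, 1_000_000)

# Gap between the two figures quoted on the slide (11k vs. 87 records per joule).
efficiency_gap = 11_000 / 87
```

By the slide's figures, the customized JouleSort system sorts roughly two orders of magnitude more records per joule than out-of-box Hadoop on this workload.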
8. Scaling to More Workers: Sort
- Terasort format, 100-byte records with 10-byte keys, 10GB of total data
- Out-of-box Hadoop with default config., workers' energy only
- Energy of the master amortized by additional workers
9. Scaling to More Workers: Nutch
- Nutch web crawler and indexer, with Hadoop 0.19.1.
- Index URLs anchored at www.berkeley.edu, depth 7, 2000 links per page
- Workload has some built-in bottlenecks?
10. Isolating IO Stages
- HDFS read, shuffle, HDFS write jobs, modified from the prepackaged sort example
- Read, shuffle, or write 10GB of data in terasort format; each job does nothing else
- HDFS write seems to be the scaling bottleneck
11. HDFS Replication
- HDFS read, shuffle, HDFS write, sort jobs, 10GB data, terasort format
- Modify the number of HDFS replicas, default config. for everything else
- Some workloads are affected (HDFS write), some are not (shuffle)
12. HDFS Replication
[Figure panels: Replication 3 (default); Replication 2]
- Reducing HDFS replication to 2 makes HDFS write less of a bottleneck?
13. Changing Input Size
- Sort, modified from the prepackaged sort example
- Jobs that handle less than 1GB of data per node are bottlenecked by overhead
- Here's a somewhat noteworthy result: out-of-box Hadoop competitive with JouleSort winner at 100MB?!
14. HDFS Block Size
- HDFS read, shuffle, HDFS write, sort jobs, 10GB data, terasort format
- Modify the HDFS block size, default config. for everything else
- Some workloads are affected (HDFS read), some are not (shuffle)
15. Slow Nodes
- One node on the cluster consistently received fewer blocks
- Removing the slow node leads to performance improvement
- Clever ways to use the slow node instead of taking it offline?
16. Predicting IO Energy
- Working example: predict IO energy for a particular task
- Benchmark energy in joules per byte for HDFS read, shuffle, HDFS write
\[ \text{IO energy} = \text{bytes read} \times \text{joules per byte (HDFS read)} + \text{bytes shuffled} \times \text{joules per byte (shuffle)} + \text{bytes written} \times \text{joules per byte (HDFS write)} \]
- The simple model is effective, but requires prior measurements
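The linear IO energy model on this slide can be sketched in a few lines. The class and function names are hypothetical, and the joules-per-byte coefficients must come from prior benchmark runs on the target cluster. No real coefficient values are given in the slides, so the example values below are placeholders only.

```python
from dataclasses import dataclass


@dataclass
class IOEnergyCoefficients:
    """Benchmarked joules per byte for each IO stage (values are cluster-specific)."""
    hdfs_read_j_per_byte: float
    shuffle_j_per_byte: float
    hdfs_write_j_per_byte: float


def predict_io_energy(coeffs: IOEnergyCoefficients,
                      bytes_read: float,
                      bytes_shuffled: float,
                      bytes_written: float) -> float:
    """IO energy = sum over stages of (bytes moved) x (joules per byte)."""
    return (bytes_read * coeffs.hdfs_read_j_per_byte
            + bytes_shuffled * coeffs.shuffle_j_per_byte
            + bytes_written * coeffs.hdfs_write_j_per_byte)


# Placeholder coefficients for illustration only, not measured values.
placeholder = IOEnergyCoefficients(1e-6, 2e-6, 4e-6)
estimate_j = predict_io_energy(placeholder, 10**9, 10**9, 10**9)
```

The model is additive per stage, so it says nothing about overlap between stages or about CPU-bound work; it only covers the IO portion of a task.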
17. Cluster Provision and Configuration
- Working example: find optimal cluster size for a steady job stream
- Optimize for E(N) over the range of N such that D(N) ≤ T
- In general, a multi-dimensional optimization problem to meet job constraints
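One way to read the working example is as a constrained search over measured cluster sizes. The sketch below assumes E(N) is the energy per job with N workers, D(N) is the job duration, and T is a deadline that D(N) must not exceed. These interpretations, the function name, and the sample numbers are assumptions for illustration.

```python
from typing import Dict, Optional


def optimal_cluster_size(energy_by_n: Dict[int, float],
                         duration_by_n: Dict[int, float],
                         deadline_s: float) -> Optional[int]:
    """Return the N with the lowest measured energy among sizes meeting the deadline.

    Ties are broken in favor of the smaller cluster. Returns None if no
    measured cluster size satisfies duration <= deadline.
    """
    feasible = [n for n in energy_by_n
                if n in duration_by_n and duration_by_n[n] <= deadline_s]
    if not feasible:
        return None
    return min(feasible, key=lambda n: (energy_by_n[n], n))


# Hypothetical measurements (joules and seconds), not results from the talk.
energy = {2: 900.0, 4: 700.0, 8: 650.0, 12: 800.0}
duration = {2: 600.0, 4: 320.0, 8: 180.0, 12: 150.0}
best = optimal_cluster_size(energy, duration, deadline_s=400.0)
```

This is a one-dimensional search over cluster size only; adding replication, block size, or other configuration parameters turns it into the multi-dimensional problem the slide mentions.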
18. Optimal HDFS Replication
- Working example: reduce HDFS replication from 3 to 2, i.e. off-rack replica only?
- Cost-benefit trade-off between lower energy and higher recovery costs
- Need to quantify probability of failure/recovery to set sensible replication
19. Faster = More Energy Efficient?
[Equations: Power, Work rate]
- Constant energy for fixed workload size, so run as fast as we can
20. Faster = More Energy Efficient?
[Equations: Power, Work rate]
- Reduce energy by using more resources, so run as fast as we can, again
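The two conclusions above can be reproduced with a small model. This is a sketch under loudly stated assumptions, since the slide equations are not preserved here: power is assumed to be a fixed term plus a term linear in resources r, work rate is assumed to be linear in r, and energy is power multiplied by the time needed to finish a fixed workload. All names and numbers are hypothetical.

```python
def energy_for_workload(work_units: float,
                        resources: float,
                        power_fixed_w: float,
                        power_per_resource_w: float,
                        rate_per_resource: float) -> float:
    """Energy (J) to finish a fixed workload under an assumed linear model.

    Assumed model (not taken from the slides):
      power     = power_fixed_w + power_per_resource_w * resources
      work rate = rate_per_resource * resources
      energy    = power * (work_units / work rate)
    """
    if resources <= 0 or rate_per_resource <= 0:
        raise ValueError("resources and rate_per_resource must be positive")
    power = power_fixed_w + power_per_resource_w * resources
    rate = rate_per_resource * resources
    return power * (work_units / rate)


# Case 1: no fixed power. Energy does not depend on resources used.
no_fixed = [energy_for_workload(1000, r, 0.0, 50.0, 10.0) for r in (1, 2, 4, 8)]

# Case 2: fixed power present. Energy falls as more resources are used.
with_fixed = [energy_for_workload(1000, r, 100.0, 50.0, 10.0) for r in (1, 2, 4, 8)]
```

In the first case running faster costs nothing extra in energy; in the second the fixed power is paid for less time, so finishing sooner saves energy. Both results depend on work rate scaling linearly with resources.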
21. Faster = More Energy Efficient?
[Equations: Power, Work rate]
- Caveats: What is meant by resource? What is a realistic behavior for R(r)?
22. Take Away Thoughts
- If work rate ∝ resources used, energy is another aspect of performance
- All prior performance optimization techniques don't need to be re-invented
[Figure: Performance vs. Resources Used]
- What if work rate is not proportional to resources used?
- Different hardware?
- Productivity benchmarks?
- Hadoop as terasort and JouleSort winner?