Towards Energy Efficient Hadoop

Slides: 23

Provided by: lucy154

Transcript and Presenter's Notes

1
Towards Energy Efficient Hadoop

Wednesday, June 10, 2009
Santa Clara Marriott

2
Why Energy?
Cooling
Costs
Environment
3
Why Energy Efficient Software
Power Usage Effectiveness (PUE)
PUE = (Total power used by a datacenter) / (IT power used by a datacenter)
Total power = IT power + PDU + UPS + HVAC + Lighting + other overhead
IT power: servers, network, storage
PUE of about 2 circa 2006 and before
PUE approaching 1 present day
Most of the further savings to be had in IT hardware and software
4
Energy as a Performance Metric
Traditional view of the software system design space
[Figure: Productivity vs. Resources Used]
Increase productivity for fixed resources of a system
5
Energy as a Performance Metric
Maybe a better view of the design space?
[Figure: Productivity, Energy, and Resources Used]
Decrease energy without compromising productivity?
6
Methodology
Performance Metrics
Basket of metrics: job duration, energy, power (i.e. time rate of energy use).
Performance variance?
Parameters
Static: cluster size, workload size, configuration parameters.
Dynamic: Task scheduling? Block placement? Speculative execution?
Workload
Exercise all components: sort, HDFS read, HDFS write, shuffle.
Representative of production workloads: Nutch, GridMix, others?
Energy measurement
Wall plug energy measurement: 1W accuracy, 1 reading per second.
Fine grain measurement to correlate energy consumption to hardware components?
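The wall plug measurement described above (one power reading per second) can be turned into the energy and power metrics with a short calculation. This is a minimal sketch, not the presenter's tooling: the function names and the sample readings are hypothetical, and it assumes each reading holds for one full sampling interval.

```python
def energy_joules(power_watts, interval_s=1.0):
    """Approximate energy as the sum of power readings times the sampling interval."""
    return sum(power_watts) * interval_s


def average_power_watts(power_watts):
    """Mean of the power readings, i.e. energy divided by job duration."""
    return sum(power_watts) / len(power_watts)


# Hypothetical readings in watts, one per second, for a 4-second job.
readings = [200.0, 210.0, 205.0, 195.0]
duration_s = len(readings) * 1.0

print("duration (s):", duration_s)
print("energy (J):", energy_joules(readings))
print("average power (W):", average_power_watts(readings))
```

With 1W accuracy per reading, the error in the energy total grows with job duration, so short jobs are the ones where measurement noise matters most.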
7
Scaling to More Workers: Sort
Terasort format, 100-byte records with 10-byte keys, 10GB of total data
Out-of-the-box Hadoop 0.18.2 with default config.
Reduce energy by adding more workers?
JouleSort highly customized system vs. out-of-the-box Hadoop with default config.
11k sorted records per joule vs. 87 sorted records per joule
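The records-per-joule figure is the number of sorted records divided by the energy used. The sketch below shows the arithmetic. The helper name and the energy value are hypothetical, and 1GB is taken as 10**9 bytes, which is an assumption.

```python
def records_per_joule(total_bytes, record_bytes, energy_j):
    """Sorted records per joule for a sort over fixed-size records."""
    records = total_bytes // record_bytes
    return records / energy_j


# 10GB of 100-byte records is 100 million records (assuming 1GB = 10**9 bytes).
total_bytes = 10 * 10**9
record_bytes = 100

# Hypothetical energy reading, for illustration only.
example_energy_j = 1_000_000.0
print(records_per_joule(total_bytes, record_bytes, example_energy_j))

# Gap between the two figures quoted on the slide (11k vs. 87).
gap = 11_000 / 87
print(round(gap, 1))
```

The quoted figures put the customized JouleSort system at roughly 126 times the records per joule of default Hadoop.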
8
Scaling to More Workers: Sort
Terasort format, 100-byte records with 10-byte keys, 10GB of total data
Out-of-the-box Hadoop with default config., worker energy only
Energy of the master amortized by additional workers
9
Scaling to More Workers: Nutch
Nutch web crawler and indexer, with Hadoop 0.19.1.
Index URLs anchored at www.berkeley.edu, depth 7, 2000 links per page
Workload has some built-in bottlenecks?
10
Isolating IO Stages
HDFS read, shuffle, HDFS write jobs, modified from prepackaged sort example
Read, shuffle, write 10GB of data, terasort format, does nothing else
HDFS write seems to be the scaling bottleneck
11
HDFS Replication
HDFS read, shuffle, HDFS write, sort jobs, 10GB data, terasort format
Modify the number of HDFS replicas, default config. for everything else
Some workloads are affected (HDFS write), some are not (shuffle)
12
HDFS Replication
Replication 3 (default)
Replication 2
Reducing HDFS replication to 2 makes HDFS write less of a bottleneck?
13
Changing Input Size
Sort, modified from prepackaged sort example
Jobs that handle less than 1GB of data per node are bottlenecked by overhead
Here's a somewhat noteworthy result:
Out-of-the-box Hadoop competitive with JouleSort winner at 100MB?!?
14
HDFS Block Size
HDFS read, shuffle, HDFS write, sort jobs, 10GB data, terasort format
Modify the HDFS block size, default config. for everything else
Some workloads are affected (HDFS read), some are not (shuffle)
15
Slow Nodes
One node on the cluster consistently received fewer blocks
Removing the slow node leads to performance improvement
Clever ways to use the slow node instead of taking it offline?
16
Predicting IO Energy
Working example: Predict IO energy for a particular task
Benchmark energy in joules per byte for HDFS read, shuffle, HDFS write
IO energy = bytes read × joules per byte (HDFS read)
          + bytes shuffled × joules per byte (shuffle)
          + bytes written × joules per byte (HDFS write)
The simple model is effective, but requires prior measurements
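The linear model above can be written as a small function. This is a sketch under stated assumptions: the class and function names are hypothetical, and the joules-per-byte coefficients are placeholders that would have to come from the benchmark measurements the slide mentions.

```python
from dataclasses import dataclass


@dataclass
class JoulesPerByte:
    """Benchmarked energy cost of each IO stage, in joules per byte."""
    hdfs_read: float
    shuffle: float
    hdfs_write: float


def predict_io_energy(bytes_read, bytes_shuffled, bytes_written, cost):
    """IO energy as the sum of bytes moved in each stage times that stage's cost."""
    return (bytes_read * cost.hdfs_read
            + bytes_shuffled * cost.shuffle
            + bytes_written * cost.hdfs_write)


# Placeholder coefficients, NOT measured values.
cost = JoulesPerByte(hdfs_read=1e-6, shuffle=2e-6, hdfs_write=4e-6)

# A task that reads, shuffles, and writes 10**9 bytes each.
predicted = predict_io_energy(10**9, 10**9, 10**9, cost)
print(predicted)
```

The coefficients depend on the cluster and its configuration (for example replication and block size), so they need to be re-benchmarked whenever those change.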
17
Cluster Provision and Configuration
Working example: Find optimal cluster size for a steady job stream
Optimize for E(N) over the range N such that D(N) ≤ T
In general, multi-dimensional optimization problem to meet job constraints
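For a single parameter, the constrained search can be done by brute force. The sketch below reads E(N) as job energy, D(N) as job duration, and T as a deadline, which is an interpretation of the slide. The duration and power models and all their constants are hypothetical stand-ins for measured curves.

```python
def best_cluster_size(candidates, energy, duration, deadline):
    """Return the N with the lowest energy among those meeting the deadline, or None."""
    feasible = [n for n in candidates if duration(n) <= deadline]
    if not feasible:
        return None
    return min(feasible, key=energy)


# Hypothetical models: fixed startup overhead plus perfectly parallel work,
# and one master plus N identical workers.
def duration(n):
    return 60.0 + 3600.0 / n      # seconds


def power(n):
    return 200.0 + 250.0 * n      # watts


def energy(n):
    return power(n) * duration(n)  # joules


sizes = range(1, 17)
print(best_cluster_size(sizes, energy, duration, deadline=600.0))
print(best_cluster_size(sizes, energy, duration, deadline=400.0))
print(best_cluster_size(sizes, energy, duration, deadline=50.0))
```

With these made-up curves a tighter deadline forces a larger, less energy-efficient cluster, and an unreachable deadline yields no feasible size. With more parameters (block size, replication, and so on) the same search becomes the multi-dimensional problem the slide refers to.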
18
Optimal HDFS Replication
Working example: Reduce HDFS replication from 3 to 2, i.e. off-rack replica only?
Cost-benefit trade-off between lower energy and higher recovery costs
Need to quantify probability of failure/recovery to set sensible replication
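One way to frame the trade-off is as an expected cost: the energy spent writing replicas plus the probability of data loss times the cost of recovery. The sketch below is illustrative only. It assumes write energy grows linearly with the replica count, and every number in it is hypothetical, since the slide notes that the failure probabilities still need to be quantified.

```python
def expected_cost(replicas, write_energy_per_replica_j, loss_probability, recovery_energy_j):
    """Write energy for all replicas plus expected recovery energy (assumed model)."""
    write_energy = replicas * write_energy_per_replica_j
    return write_energy + loss_probability[replicas] * recovery_energy_j


def choose_replication(options, write_energy_per_replica_j, loss_probability, recovery_energy_j):
    """Pick the replica count with the lowest expected cost."""
    return min(
        options,
        key=lambda r: expected_cost(
            r, write_energy_per_replica_j, loss_probability, recovery_energy_j
        ),
    )


# Hypothetical loss probabilities per replication level, NOT measured values.
loss_probability = {2: 0.01, 3: 0.001}

cheap_recovery = choose_replication([2, 3], 1000.0, loss_probability, 50_000.0)
costly_recovery = choose_replication([2, 3], 1000.0, loss_probability, 500_000.0)
print(cheap_recovery, costly_recovery)
```

Under these made-up numbers, replication 2 wins when recovery is cheap and replication 3 wins when recovery is expensive, which is why the failure and recovery probabilities have to be measured before choosing.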
19
Faster = More Energy Efficient?
Power Work rate
Constant energy for fixed workload size, so run as fast as we can
20
Faster = More Energy Efficient?
Power Work rate
Reduce energy by using more resources, so run as fast as we can, again
21
Faster = More Energy Efficient?
Power Work rate
Caveats: What is meant by resource? What is a realistic behavior for R(r)?
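The question of whether running faster saves energy can be explored with a toy model. It assumes energy = power × time, with time = workload / R(r), where R(r) is the work rate at resource level r. The three power and work-rate shapes below are hypothetical illustrations and are not taken from the slides.

```python
def energy(work, r, power, work_rate):
    """Energy to finish a fixed workload: power(r) times the time work / work_rate(r)."""
    return power(r) * work / work_rate(r)


work = 1000
levels = [1, 2, 4, 8]

# Case A: power and work rate both proportional to resources -> constant energy.
case_a = [energy(work, r, lambda r: 100 * r, lambda r: 10 * r) for r in levels]

# Case B: a fixed power overhead (e.g. a master node) -> energy falls as r grows.
case_b = [energy(work, r, lambda r: 50 + 100 * r, lambda r: 10 * r) for r in levels]

# Case C: work rate grows only as sqrt(r) -> energy rises as r grows.
case_c = [energy(work, r, lambda r: 100 * r, lambda r: 10 * r ** 0.5) for r in levels]

print(case_a)
print(case_b)
print(case_c)
```

In cases A and B, adding resources never costs energy, so running as fast as possible is reasonable. Case C shows why the shape of R(r) matters: once work rate stops scaling with resources, going faster costs more energy.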
22
Take Away Thoughts
If work rate ∝ resources used, energy is another aspect of performance
All prior performance optimization techniques don't need to be re-invented
[Figure: Performance vs. Resources Used]
What if work rate is not proportional to resources used?
Different hardware?
Productivity benchmarks?
Hadoop as terasort and JouleSort winner?
