Title: Energy Aware Lossless Data Compression
1 Energy Aware Lossless Data Compression
- Kenneth Barr and Krste Asanovic
- MIT Laboratory for Computer Science
2 Energy Aware Lossless Data Compression: Introduction
- Motivation
- Compression can save wireless network energy.
- Observation
- E_add < 1 nJ
- E_send > 1000 nJ
- Approach
- Can we use 1000 adds to eliminate a bit?
- Reconsider slow compressors that perform many operations to achieve the best compression ratios?
- Should we choose the fastest compressor because E = Pt?
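The break-even question above is simple arithmetic. The following is a minimal sketch that plugs in the slide's bounds (1 nJ per add, 1000 nJ per bit sent) as illustrative values; the function names are hypothetical and not taken from the talk.

```python
def breakeven_ops_per_bit(e_send_bit_nj: float, e_op_nj: float) -> float:
    """Number of operations whose energy equals the cost of sending one bit."""
    return e_send_bit_nj / e_op_nj


def net_saving_nj(bits_removed: int, ops_spent: int,
                  e_send_bit_nj: float = 1000.0, e_op_nj: float = 1.0) -> float:
    """Positive when compressing saved energy, negative when it cost energy."""
    return bits_removed * e_send_bit_nj - ops_spent * e_op_nj


# With the slide's bounds, each eliminated bit pays for about 1000 adds.
budget = breakeven_ops_per_bit(1000.0, 1.0)
```

Because the slide gives an upper bound for an add and a lower bound for a send, the real budget is at least this large. Memory traffic, which the later slides show to be far more expensive than arithmetic, is not counted here.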
3 Energy Aware Lossless Data Compression: Introduction
- Results
- There's no easy answer. For minimum energy, one must characterize hardware, software, and workload.
- Energy saved on Skiff
- Compared to default (zlib-6): 31% (web) to 57% (text) savings
- Asymmetric strategy can save 11-12% over the best symmetric pair.
4 Energy Aware Lossless Data Compression: Agenda
- Experimental Setup
- Hardware
- Benchmarks
- Observed Energy
- Compression Applications
- Compute, network, memory
- Energy Analysis
- What impacts compression energy?
- Lowering Overall Energy of Transmission
- Understanding cache behavior
- Sleep mode affects choice
- Asymmetric compression
- Conclusion and Future Work
5 Compaq Personal Server (aka Skiff)
- CPU similar to iPAQ
- Spread out and exposed to facilitate measurement
- Network: Enterasys 5 V 802.11b (CardBus)
6 Skiff enables power measurement
- Measurement
- Three power planes (after cutting traces)
- PC-Card power measured with extender card
- 2 measurements (supply voltage and current) per plane
- 5 x 6.5 sec samples at 60 Hz sample rate; multimeter controlled via RS-232
- Error
- Missed events are possible due to slow sample rate, but not a problem in practice
- Error sources analyzed in a Compaq tech report
- Total error (hardware averaging) < 1%
- Higher error with simulation-based power estimation, but simulator is useful for instruction and event counts
[Figure: Skiff power-measurement schematic. Labels: 12V DC supply and GND; regulators at 2V, 3.3V, and 5V; StrongARM SA-110 CPU (R_cpu); DRAM and memory controller (R_mem); flash; peripherals: wired Ethernet, CardBus, RS-232, clocks, GPIO, et al. (R_peri); wireless Ethernet card (R_net); voltage measurement points V1 and V2.]
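The per-plane measurement described on this slide (paired supply-voltage and current samples at a fixed rate) turns into energy by summing power over the sample intervals. This is a minimal sketch of that bookkeeping, not the authors' measurement scripts; the function names and the dictionary layout are assumptions made for illustration.

```python
def plane_energy_joules(volts, amps, sample_rate_hz=60.0):
    """Sum V * I over equally spaced samples and scale by the sample period."""
    if len(volts) != len(amps):
        raise ValueError("voltage and current sample lists must be the same length")
    dt = 1.0 / sample_rate_hz
    return sum(v * i for v, i in zip(volts, amps)) * dt


def total_energy_joules(planes, sample_rate_hz=60.0):
    """planes maps a plane name to a (volts, amps) pair of sample lists."""
    return sum(plane_energy_joules(v, a, sample_rate_hz)
               for v, a in planes.values())


# One second of a steady 2 V, 0.5 A load sampled at 60 Hz is about 1 J.
example = plane_energy_joules([2.0] * 60, [0.5] * 60)
```

A rectangle-rule sum like this misses any event shorter than the sample period, which is the limitation the slide notes.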
7 Benchmarks
- Workload
- 1MB English text from Calgary Corpus
- A novel and structured bibliography
- 1MB web data from most popular sites (according to Lycos Top 50 searches and Nielsen NetRatings)
- No pre-compressed images (gif, jpg) were used
- Mostly .html, .css, and .js
- No sites had Java class files
- Compressors
- Represent major algorithms (LZ77, LZ78, PPM, BWT)
- Chosen due to popularity, maturity, documentation, code quality, and portability
- bzip2 (BWT)
- Unix compress (LZ78)
- LZO (realtime LZ77)
- PPMd (PPM)
- zlib (LZ77)
8 Compression for portable devices
[Figure: portable client and wall-powered server]
- Goal: choose a compressor that strikes the best balance between compressed file size (network energy) and time to achieve that size (compute energy)
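The trade-off in the goal above can be written as total energy = compute energy + network energy, minimized over the candidate compressors. The sketch below uses Python's standard zlib and bz2 modules as stand-ins; the two platform constants are hypothetical placeholders rather than Skiff measurements, and timing on the host machine only approximates compute cost on a real device.

```python
import bz2
import time
import zlib

# Hypothetical platform constants, for illustration only (not measured values).
CPU_POWER_W = 1.0              # assumed average power while compressing
NET_ENERGY_J_PER_BYTE = 4e-6   # assumed radio cost per byte sent

COMPRESSORS = {
    "none": lambda data: data,
    "zlib-6": lambda data: zlib.compress(data, 6),
    "bzip2-9": lambda data: bz2.compress(data, 9),
}


def send_energy(data, compress):
    """Estimated joules to compress `data` and transmit the result."""
    start = time.perf_counter()
    payload = compress(data)
    elapsed = time.perf_counter() - start
    compute_j = CPU_POWER_W * elapsed                  # E = P * t
    network_j = NET_ENERGY_J_PER_BYTE * len(payload)   # cost scales with bytes sent
    return compute_j + network_j


def cheapest(data):
    """Return (name of lowest-energy option, all estimated costs)."""
    costs = {name: send_energy(data, fn) for name, fn in COMPRESSORS.items()}
    return min(costs, key=costs.get), costs
```

Calling `cheapest(payload)` returns the winning name plus the per-option estimates. Changing either constant can change the winner, which is why the hardware has to be characterized first.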
9 Energy required to receive 1MB text
- Receiving and uncompressing usually saves energy
(compared to receiving uncompressed data)
10 Energy required to send 1MB text
- Compressing prior to sending can actually increase total energy!
- Web data (not shown) is easier to compress and requires less energy than no compression for all compressors except bzip2
11 Large effect of varying parameters
- Parameters: size of input blocks, size of data structures, amount of effort
- Use such a chart to choose the best compressor for a platform/data combo
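A parameter sweep of the kind charted on this slide can be sketched with the standard library. In Python's zlib the level argument sets the effort, and in bz2 the compresslevel selects the block size (level x 100 kB). The settings chosen and the function name are illustrative assumptions, and the sketch records only output size, not energy.

```python
import bz2
import zlib


def sweep(data):
    """Return {setting: compressed size in bytes} for a few parameter choices."""
    sizes = {}
    for level in (1, 6, 9):      # zlib effort level
        sizes[f"zlib-{level}"] = len(zlib.compress(data, level))
    for level in (1, 9):         # bzip2 block size is level * 100 kB
        sizes[f"bzip2-{level}"] = len(bz2.compress(data, level))
    return sizes


sample = b"the quick brown fox jumps over the lazy dog " * 3000
results = sweep(sample)
```

On a target device, each setting would also be timed or measured so that size and compute energy can be plotted together.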
12 Energy per operation: Skiff
- Microbenchmarks verify that computation is cheap
13 Instructions per bit
- We don't execute an unreasonable number of instructions (though there is quite a variation between applications!)
14 Energy per operation: Skiff
- Computation is cheap, cache misses are not.
- By their nature, compressors can have many cache
misses.
15 Memory Footprints
- Requiring many memory accesses leads to high energy
- But a large memory footprint can be used wisely (e.g., PPMd)
16 Understanding cache behavior
- Skiff cache is 16KB. No L2 cache.
- iPAQ cache is only 8KB. Cache problems can be exacerbated.
- XScale cache is 32KB. May still be a problem for apps tuned for the desktop.
- Suggestions for Unix compress (which apply to other apps)
- A 1K buffer speeds I/O, but cuts into the 16KB cache
- It's not the size of the allocation, it's how you use it (e.g., a large, sparse hash table -> fewer collisions -> fewer misses due to probing)
- Merge adjacent tables into a structure to bring in code with fcode
[Figure: table layout, original vs. merged]
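The table-merging suggestion can be illustrated by counting the cache lines one lookup touches. This is a toy address model rather than a cache simulator: the 32-byte line size, the base addresses, and the 8-byte merged entry are assumptions chosen for illustration.

```python
LINE_BYTES = 32  # assumed cache line size; adjust for the target CPU


def lines_touched(accesses):
    """Count distinct cache lines covered by (address, size) accesses."""
    lines = set()
    for addr, size in accesses:
        first = addr // LINE_BYTES
        last = (addr + size - 1) // LINE_BYTES
        lines.update(range(first, last + 1))
    return len(lines)


def separate_tables(i, fcode_base=0, code_base=1 << 20):
    """fcode[i] (4 bytes) and code[i] (2 bytes) live in two distinct arrays."""
    return lines_touched([(fcode_base + 4 * i, 4), (code_base + 2 * i, 2)])


def merged_table(i, base=0, entry_bytes=8):
    """One entry holds both fields, so a probe reads neighbouring bytes."""
    entry = base + entry_bytes * i
    return lines_touched([(entry, 4), (entry + 4, 2)])
```

In this model a probe of entry 1000 touches two lines with separate tables and one with the merged table. The measured results later in the talk show that merging alone had little effect, so the model captures only the intent of the change.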
17 Understanding cache behavior
- Suggestions (continued)
- Compact structures to put more usable data in cache (less wasted space)
    struct entry {
        int fcode;
        unsigned short code;
    } table[SIZE];
- Wasted space due to types and alignment padding
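The padding cost of the merged entry can be checked from Python with ctypes, which lays out a Structure the way the platform's C compiler would. This sketch only measures the waste; it does not reproduce the compaction used in the talk. The 16 KB figure is the Skiff cache size from the earlier slide, and the variable names are illustrative.

```python
import ctypes


class Entry(ctypes.Structure):
    """Mirror of the slide's struct: an int fcode followed by an unsigned short code."""
    _fields_ = [("fcode", ctypes.c_int), ("code", ctypes.c_ushort)]


CACHE_BYTES = 16 * 1024  # Skiff data cache size quoted in the talk

payload = ctypes.sizeof(ctypes.c_int) + ctypes.sizeof(ctypes.c_ushort)
padded = ctypes.sizeof(Entry)            # includes any alignment padding
wasted_fraction = (padded - payload) / padded

# How many entries fit in the cache with and without the padding bytes.
entries_padded = CACHE_BYTES // padded
entries_compact = CACHE_BYTES // payload
```

On a typical platform with a 4-byte int and a 2-byte short, each entry occupies 8 bytes for 6 bytes of payload, so a quarter of every cached entry is padding.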
18 Understanding cache behavior: results
- Merging tables has little effect
- Sparse arrays have dramatic effect even though logical table is much larger than cache
- Compacting array removes 92% of cache misses from 11-merge
- Not much energy left to be saved
- But, program runs 1.5 times faster
19 Asymmetric Compression
- No need for the same compression method in both directions
- Client compresses its requests using its lowest-energy compressor
- Server supplies data (transcoding if necessary) so that client requires minimal energy to decompress
- Server can maintain state for a flow, as it may be hard to compress individual small blocks
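Choosing the two directions independently is a pair of separate minimizations. In the sketch below the per-megabyte energy figures are invented placeholders, not the talk's measurements, and the function names are hypothetical; only the selection logic is the point.

```python
# Hypothetical client-side energies in joules per MB (compute + radio).
# These numbers are made up for illustration and are NOT measured values.
SEND_COST_J = {"compress": 1.9, "lzo": 2.1, "zlib": 2.6, "bzip2": 5.0}
RECV_COST_J = {"compress": 1.6, "lzo": 1.5, "zlib": 1.3, "bzip2": 2.4}


def best_symmetric(send, recv):
    """Best single scheme when both directions must use the same one."""
    return min(send.keys() & recv.keys(), key=lambda n: send[n] + recv[n])


def best_asymmetric(send, recv):
    """Pick the cheapest compressor and the cheapest decompressor separately."""
    return min(send, key=send.get), min(recv, key=recv.get)


sym = best_symmetric(SEND_COST_J, RECV_COST_J)
comp, decomp = best_asymmetric(SEND_COST_J, RECV_COST_J)
sym_total = SEND_COST_J[sym] + RECV_COST_J[sym]
asym_total = SEND_COST_J[comp] + RECV_COST_J[decomp]
```

The asymmetric total can never exceed the symmetric one, because the symmetric choice is one of the pairs the asymmetric search considers.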
20 Overall results
- Energy savings over mod_gzip default (e.g., compress12 vs zlib-6)
- Text: 57%
- Web: 31%
- Asymmetric compression energy savings over best symmetric scheme (e.g., compress12 + zlib9 vs compress12 + compress12)
- Text: 11%
- Web: 12%
- Asymmetric energy savings over no compression
- Text: 45%
- Web: 73%
- Combination = compressor + decompressor
21 Exploiting low-power sleep mode
- Idle power will affect choice of compressor on unloaded processor
- Low power idle?
- Getting some work done quickly and going to sleep is the best choice
- High idle power?
- It is best to spend time doing a good job, otherwise platform wastes power while idle
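The idle-power argument can be checked with a fixed-window energy model: whatever time is not spent compressing is spent idle. The candidate numbers and power levels below are hypothetical, and the model ignores how long the radio itself stays on.

```python
def window_energy(active_s, send_j, active_w, idle_w, window_s):
    """Joules over a fixed window: compress, send, then idle until it ends."""
    if active_s > window_s:
        raise ValueError("compression does not fit in the window")
    return active_w * active_s + send_j + idle_w * (window_s - active_s)


# Hypothetical candidates: (CPU seconds to compress, joules spent on the radio).
CANDIDATES = {"fast": (0.5, 2.0), "thorough": (2.0, 1.2)}


def better_choice(idle_w, active_w=1.0, window_s=5.0):
    """Name of the candidate with the lower energy for a given idle power."""
    costs = {name: window_energy(t, e, active_w, idle_w, window_s)
             for name, (t, e) in CANDIDATES.items()}
    return min(costs, key=costs.get)
```

With these placeholder numbers, `better_choice(0.05)` selects the fast compressor and `better_choice(0.9)` selects the thorough one. When idling costs almost as much as computing, the extra CPU time is nearly free and the smaller payload wins.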
22 Changing component energy affects choice of compressor
- If CPU and memory decrease in energy while network remains constant?
- Aggressive compression becomes possible, if not better
[Figure: energy in Joules (0.00 to 6.00) for bzip2, compress, lzo, ppmd, and zlib, plotted against the ratio of network energy to average CPU/memory energy (8 to 105)]
23 Related work
- Using sophisticated error correcting codes can reduce the number of bits to send, but processing codes can outweigh the energy savings
- Energy efficiency of error correction on wireless links (Havinga 1999)
- Energy efficient lossy compression: recast the problem or trade energy for quality
- CMU Odyssey (Satyanarayanan et al. 1994-2000)
- Algorithmic transforms for efficient scalable computation (Sinha et al. 2000)
- Adaptive image compression for wireless multimedia communication (Taylor and Dey 2001)
- Recognize the importance of low-power idle mode
- Critical power slope (Miyoshi et al. 2002)
- Many other compression and optimization techniques
- Several noted in my Master's Thesis (Barr 2002)
24 Conclusion and Future Work
- Conclusions
- Compression to save transmission energy is not always a net win. Default compressor can double send energy!
- The fastest compressor is not always best; the smallest file is not always best.
- However, knowledge of component energy and input data combined with wise choice of algorithms and parameters can give large energy savings
- Up to 57% over default scheme
- Up to 12% over optimal symmetric scheme
- Future work
- Developing a hardware energy profiler for iPAQ that fits on a PC-Card to measure energy portably in an active system. Use its findings to choose the best application or to change dynamically.
- Explore further implementation tweaks for cache-friendly behavior on portable systems.
26 Backup
- Compression ratio? Text vs web?
- See paper
- Why not compress on the NIC?
- Regardless, same set of tradeoffs
- Higher bandwidth links -> less need.
- Multiple flows mean less correlation
- Better ratios at the application layer (application-specific compression can be employed, large context can be maintained).
- Applications
- Difficult for interactive or small packet traffic
- If you have the choice over what format to receive (e.g., bzip2? No!)
- Room full of conference attendees sharing an access point