Title: Integrating Online Compression to Accelerate Large-Scale Data Analytics Applications
1. Integrating Online Compression to Accelerate Large-Scale Data Analytics Applications
- Tekin Bicer, Gagan Agrawal (Ohio State University)
- David Chiu (Washington State University)
- Jian Yin, Karen Schuchardt (Pacific Northwest National Laboratory)
2. Introduction
- Scientific simulations and instruments can generate large amounts of data
  - E.g., the Global Cloud Resolving Model (GCRM)
    - 1 PB of data for a 4 km grid-cell resolution
    - Higher resolutions generate more and more data
- I/O operations become the bottleneck
- Problems
  - Storage space, I/O performance
- Potential solution: compression
3. Motivation
- Generic compression algorithms
  - Good for low-entropy sequences of bytes
  - Scientific datasets are hard to compress
    - Floating-point numbers consist of an exponent and a mantissa
    - The mantissa can be highly entropic
- Using compression in applications is challenging
  - Choosing suitable compression algorithms
  - Utilizing the available resources
  - Integrating compression algorithms into applications
4. Outline
- Introduction
- Motivation
- Compression Methodology
- Online Compression Framework
- Experimental Results
- Related Work
- Conclusion
5. Compression Methodology
- Common properties of scientific datasets
  - Multidimensional arrays
  - Consist of floating-point numbers
  - Relationship between neighboring values
- Domain-specific solutions can help
- Approach
  - Prediction-based differential compression
  - Predict the values of neighboring cells
  - Store only the difference between predicted and actual values
6. Example: GCRM Temperature Variable Compression
- E.g., a temperature record
  - The values of neighboring cells are highly related
- [Figure: X table after prediction, and the compressed X values]
  - 5 bits encode the prediction difference
- Supports both lossless and lossy compression
- Fast, with good compression ratios (a minimal sketch of the scheme follows)
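The idea can be pictured with a minimal C sketch. This is not the authors' implementation: it assumes the simplest possible predictor (the previous value, rather than a neighboring grid cell) and keeps the raw 64-bit XOR residual instead of the 5-bit-length encoding named on the slide; encode_chunk/decode_chunk are illustrative names.

    /* Minimal sketch of prediction-based differential compression.
     * Assumption: the predictor is simply the previous value; the
     * real algorithm predicts from neighboring grid cells and stores
     * the residual behind a 5-bit length field. */
    #include <stdint.h>
    #include <string.h>

    static uint64_t dbits(double d)        /* raw bits of a double */
    {
        uint64_t u;
        memcpy(&u, &d, sizeof u);
        return u;
    }

    /* XOR each value with its prediction; highly related neighbors
     * share leading bits, so the residual is mostly leading zeros. */
    void encode_chunk(const double *in, size_t n, uint64_t *out)
    {
        double pred = 0.0;
        for (size_t i = 0; i < n; i++) {
            out[i] = dbits(in[i]) ^ dbits(pred);   /* store difference */
            pred = in[i];                          /* predict next value */
        }
    }

    void decode_chunk(const uint64_t *in, size_t n, double *out)
    {
        double pred = 0.0;
        for (size_t i = 0; i < n; i++) {
            uint64_t u = in[i] ^ dbits(pred);      /* undo difference */
            memcpy(&out[i], &u, sizeof u);
            pred = out[i];
        }
    }

A real encoder would then compactly encode the leading-zero runs of the residuals, which is where the 5-bit difference field and the compression ratio come from.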
7. Compression Framework
- Improve end-to-end application performance
  - Minimize the application I/O time
    - Pipeline I/O and (de)compression operations (see the pipeline sketch below)
  - Hide computational overhead
    - Overlap application computation with the compression framework
- Easy implementation of different compression algorithms
- Easy integration with applications
  - API similar to POSIX I/O
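The pipelining idea can be illustrated with a two-stage producer-consumer sketch in C using pthreads. This is only an illustration of overlapping I/O with decompression, with stub read_chunk()/decode_chunk() functions standing in for storage I/O and the user-supplied decode function; the actual framework uses thread pools and a chunk cache (see the layer descriptions on the next slide).

    /* Two-stage pipeline: one thread fetches compressed chunks while
     * another decompresses the previously fetched one. */
    #include <pthread.h>
    #include <stdlib.h>

    #define NCHUNKS 16

    typedef struct { int id; } chunk_t;

    static chunk_t *read_chunk(int id)     /* stub: blocking storage read */
    {
        chunk_t *ch = malloc(sizeof *ch);
        ch->id = id;
        return ch;
    }
    static void decode_chunk(chunk_t *ch) { free(ch); }  /* stub decode */

    static chunk_t *slot;                  /* one-slot bounded buffer */
    static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  cnd = PTHREAD_COND_INITIALIZER;

    static void *io_stage(void *arg)       /* stage 1: fetch chunks */
    {
        (void)arg;
        for (int id = 0; id < NCHUNKS; id++) {
            chunk_t *ch = read_chunk(id);
            pthread_mutex_lock(&mtx);
            while (slot != NULL)           /* wait for an empty slot */
                pthread_cond_wait(&cnd, &mtx);
            slot = ch;
            pthread_cond_signal(&cnd);
            pthread_mutex_unlock(&mtx);
        }
        return NULL;
    }

    static void *decomp_stage(void *arg)   /* stage 2: decompress */
    {
        (void)arg;
        for (int i = 0; i < NCHUNKS; i++) {
            pthread_mutex_lock(&mtx);
            while (slot == NULL)           /* wait for a fetched chunk */
                pthread_cond_wait(&cnd, &mtx);
            chunk_t *ch = slot;
            slot = NULL;
            pthread_cond_signal(&cnd);
            pthread_mutex_unlock(&mtx);
            decode_chunk(ch);              /* overlaps the next read */
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, io_stage, NULL);
        pthread_create(&t2, NULL, decomp_stage, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }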
8. A Compression Framework for Data-Intensive Applications
- Chunk Resource Allocation (CRA) Layer
  - Initializes the system
  - Generates chunk requests and enqueues them for processing
  - Converts original offset and data-size requests into compressed ones
- Parallel Compression Engine (PCE)
  - Applies the encode()/decode() functions to chunks
  - Manages an in-memory cache with informed prefetching
  - Creates I/O requests
- Parallel I/O Layer (PIOL)
  - Creates parallel chunk requests to the storage medium
  - Each chunk request is handled by a group of threads
  - Provides an abstraction over different data transfer protocols
9. Compression Framework API
- User-defined functions
  - encode_t() (required): code for compression
  - decode_t() (required): code for decompression
  - prefetch_t() (optional): informed prefetching function
- Applications use the functions below (a usage sketch follows)
  - comp_read: applies decode_t to a compressed chunk
  - comp_write: applies encode_t to an original chunk
  - comp_seek: mimics fseek, and also utilizes prefetch_t
  - comp_init: initializes the system (thread pools, cache, etc.)
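The slides give the function names but not their prototypes, so the C sketch below is an assumption modeled on POSIX I/O, with pass-through stubs so it compiles; the real comp_* calls start thread pools, map original offsets to compressed ones, and serve decoded chunks from the cache.

    #include <stddef.h>
    #include <unistd.h>

    /* Assumed user-callback types (names from the slide, prototypes
     * invented for illustration). */
    typedef size_t (*encode_t)(const void *orig, size_t n, void *comp);
    typedef size_t (*decode_t)(const void *comp, size_t n, void *orig);

    static decode_t user_decode;           /* registered by comp_init() */

    int comp_init(encode_t enc, decode_t dec)
    {
        (void)enc;
        user_decode = dec;                 /* real version also spawns the
                                              thread pools and the cache */
        return 0;
    }

    off_t comp_seek(int fd, off_t offset, int whence)
    {
        /* real version also invokes prefetch_t() to issue prefetches */
        return lseek(fd, offset, whence);
    }

    ssize_t comp_read(int fd, void *buf, size_t count)
    {
        /* real version returns decoded chunks from the in-memory cache */
        return read(fd, buf, count);
    }

An application would integrate by calling comp_init(my_encode, my_decode) once at startup and substituting comp_read/comp_seek for its existing read/lseek calls.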
10. Prefetching and In-Memory Cache
- Overlap application-layer computation with I/O
- Reusability of already accessed data is small
  - Instead, prefetch and cache the prospective chunks
- Default replacement policy is LRU
- The user can analyze the access history and provide a prospective chunk list
  - Informed prefetching through the prefetch() callback (see the sketch below)
- The cache uses a row-based locking scheme for efficient consecutive chunk requests
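One possible shape for the informed-prefetching callback, purely as an assumption: the slides say only that the user can analyze the access history and return a prospective chunk list, so the signature and the strided-access example below are illustrative.

    #include <stddef.h>

    /* Assumed callback type: inspect the last n accessed chunk ids and
     * write up to max_next prospective ids into next[]. */
    typedef size_t (*prefetch_t)(const int *history, size_t n,
                                 int *next, size_t max_next);

    /* Example: an application scanning every k-th chunk can predict
     * its future accesses from the last observed stride. */
    static size_t stride_prefetch(const int *history, size_t n,
                                  int *next, size_t max_next)
    {
        if (n < 2)
            return 0;                      /* not enough history yet */
        int stride = history[n - 1] - history[n - 2];
        size_t out = 0;
        for (int id = history[n - 1] + stride; out < max_next; id += stride)
            next[out++] = id;              /* chunks to prefetch next */
        return out;
    }

Chunks named by the callback are fetched and decoded ahead of time into the cache, replacing the default LRU guess.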
11. Integration with a Data-Intensive Computing System
- MapReduce-style API
- Remote data processing
  - Sensitive to I/O bandwidth
- Processes data in
  - a local cluster,
  - the cloud,
  - or both (hybrid cloud)
12. Outline
- Introduction
- Motivation
- Compression Methodology
- Online Compression Framework
- Experimental Results
- Related Work
- Conclusion
13. Experimental Setup
- Two datasets
  - GCRM: 375 GB (270 GB local, 105 GB remote)
  - NPB: 237 GB (166 GB local, 71 GB remote)
- 16 nodes x 8 cores (Intel Xeon, 2.53 GHz)
- Storage of datasets
  - Lustre FS (14 storage nodes)
  - Amazon S3 (Northern Virginia region)
- Compression algorithms
  - CC, FPC, LZO, bzip, gzip, lzma
- Applications: AT, MMAT, KMeans
14. Performance of MMAT

Compression ratios:
       CC    51.68  (186 GB)
       LZO   20.40  (299 GB)

Speedups:
              Local   Remote   Hybrid
       CC      1.63    1.90     1.85
       LZO     1.04    1.24     1.14

I/O throughput with 128 processes (GB/sec):
               Orig.   CC
       Local    1.62   3.21
       Remote   0.10   0.19

- Breakdown of performance
  - Overhead (local): 15.41
  - Read speedup: 1.96
15. Lossy Compression (MMAT)
- Lossy variant: drop e low-order bits (see the sketch below)
- Error bound: 5 x 10^-5

Compression ratios:
       Lossless   51.68
       2e         56.88  (162 GB)
       4e         62.93  (139 GB)

Speedups:
                      Local   Remote   Hybrid
       2e vs. CC       1.07    1.18     1.09
       4e vs. CC       1.13    1.43     1.18
       4e vs. orig.    1.76    2.41     2.18
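A minimal sketch of the lossy variant, under two assumptions: that "e dropped bits" means zeroing low-order mantissa bits of each double before differential encoding, and an illustrative value for e (the slide does not state it); drop_mantissa_bits is a hypothetical helper.

    #include <stdint.h>
    #include <string.h>

    #define E 8                            /* assumed bits per "e" step */

    /* Zero the nbits low-order mantissa bits of a double, trading a
     * bounded relative error for longer zero runs in the residuals. */
    static double drop_mantissa_bits(double d, int nbits)
    {
        uint64_t u;
        memcpy(&u, &d, sizeof u);
        u &= ~((UINT64_C(1) << nbits) - 1);
        memcpy(&d, &u, sizeof d);
        return d;
    }

    /* The "2e" and "4e" configurations from the table above. */
    double lossy_2e(double d) { return drop_mantissa_bits(d, 2 * E); }
    double lossy_4e(double d) { return drop_mantissa_bits(d, 4 * E); }

Zeroing even 32 of the 52 mantissa bits keeps the relative error around 10^-6, within the 5 x 10^-5 bound, while making the prediction residuals far more compressible, consistent with the ratio improving from 51.68 to 56.88 and 62.93.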
16. Performance of KMeans
- NPB dataset
  - Compression ratio: 24.01 (180 GB)
- More computation per chunk
  - More opportunity to overlap fetching and decompression

Speedups:
              Local   Remote   Hybrid
       FPC     0.75    1.30     1.12

Speedups with multithreading:
                          Local   Remote   Hybrid
       2P - 4IO            1.25    1.17     1.19
       4P - 8IO            1.37    1.16     1.21
       4P - 8IO vs. orig.  1.03    1.51     1.36
17. Conclusion
- Management and analysis of scientific datasets are challenging
- Generic compression algorithms are inefficient for scientific datasets
- We proposed a compression framework and methodology
  - Domain-specific compression algorithms are fast and space efficient
    - 51.68 compression ratio
    - 53.27 improvement in execution time
  - Easy plug-and-play of compression algorithms
- Integrated the proposed framework and methodology with a data analysis middleware
18. Thanks!
19. Multithreaded Prefetching
- Varying the numbers of PCE and I/O threads
  - "2P - 4IO" means 2 PCE threads and 4 I/O threads
  - One core is assigned to the compression framework

Speedups:
                Local   Remote   Hybrid
       2P - 4IO  0.88    1.13     1.05
       4P - 8IO  0.86    1.10     1.04
20. Related Work
- (Scientific) data management
  - NetCDF, PNetCDF, HDF5
  - Nicolae et al. (BlobSeer): a distributed data management service for efficient reading, writing, and appending operations
- Compression
  - Generic: LZO, bzip, gzip, szip, LZMA, etc.
  - Scientific
    - Schendel and Jin et al. (ISOBAR): organizes highly entropic data into compressible data chunks
    - Burtscher et al. (FPC): efficient double-precision floating-point compression
    - Lakshminarasimhan et al. (ISABELA)