Optimizing Data Aggregation for Cluster-based Internet Services

Slides: 24

Provided by: lingk

Transcript and Presenter's Notes

Title: Optimizing Data Aggregation for Cluster-based Internet Services

1
Optimizing Data Aggregation for Cluster-based
Internet Services

Lingkun Chu, Hong Tang, Tao Yang
University of California, Santa Barbara
Kai Shen
University of Rochester

2
Internet Services and Data Aggregation

Large-scale service clusters: AOL, Yahoo!, MSN,
Google, Teoma/Ask Jeeves.
24x7 availability.
Scalability (large data sets, high traffic).
Efficient resource management.
Programming support for reliable and scalable
network services is very important.
This talk focuses on programming and runtime
support for data aggregation in cluster-based
services.

3
Example of Internet Services: Search Engine

[Diagram components: firewall/traffic switch; web server/query handler; query caches; local-area network; index servers (partitions 1, 2, 3); doc servers (partitions 1, 2).]

4
Neptune: Programming and Runtime Support for
Cluster-based Services

Programming support:
Component-oriented.
High-level primitives for service
aggregation/replication in clusters.
Runtime support:
Service discovery, service invocation, load
balancing, failover management, service
differentiation, and replica consistency.
Applications:
Discussion groups; online auctions; persistent
cache; BLAST-based protein sequence match;
Teoma/AskJeeves search.

5
Outline

Background on Internet Services.
Data Aggregation: Semantics and API.
Runtime System Design and Implementation.
Experimental Evaluation.

6
Data Aggregation: Introduction

Internet services often partition data into
multiple groups for data parallelism and
management simplification.
Aggregation combines partial results from
multiple data partitions.
Aggregation for high performance and availability
is hard.
Need explicit programming support and efficient
runtime system design.

7
Design Objectives for Scalable Data Aggregation

Programming primitive:
Easy-to-use.
General and flexible.
Runtime support:
Scalable to a large number of partitions.
Low response time and high throughput.
All must be achieved in a cluster environment:
Component failures.
Platform heterogeneity among partitions due to
hardware/application irregularity.

8
Data Aggregation Call (DAC): The Basic Semantics

DAC(P, op_proc, op_reduce)
Requirement on reduce(): commutative and
associative.
[Diagram labels: partition 1, partition 2, partition 3, partition 4.]
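
The basic semantics can be illustrated with a minimal sequential sketch. It assumes P is a list of data partitions, that the process operator computes a partial result on one partition, and that the reduce operator merges two partial results. The function name `dac`, the top-k search example, and the sample data are illustrative assumptions, not Neptune's actual API.

```python
import heapq
from functools import reduce

K = 2  # how many best hits to keep (illustrative value)


def dac(partitions, op_proc, op_reduce):
    """Apply op_proc to every partition, then fold the partial results."""
    partials = [op_proc(p) for p in partitions]
    return reduce(op_reduce, partials)


def top_k(hits):
    """Process operator: best K (score, doc) pairs of one partition."""
    return heapq.nlargest(K, hits)


def merge_top_k(a, b):
    """Reduce operator: best K pairs of two partial results."""
    return heapq.nlargest(K, a + b)


# Hypothetical index partitions holding (score, document id) pairs.
partitions = [
    [(0.9, "d1"), (0.2, "d2")],
    [(0.7, "d3")],
    [(0.8, "d4"), (0.1, "d5")],
    [(0.4, "d6")],
]

best = dac(partitions, top_k, merge_top_k)
print(best)
```

Because `merge_top_k` is commutative and associative, the partial results can be combined in any order or grouping, which is the property that allows a runtime to reduce them along a tree.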

9
Adding Quality Control to DAC

What if a server fails or is very slow?
Aggregation quality guarantee:
Partial aggregation results may still be useful.
Aggregation quality: percentage of partitions
that contributed to the aggregation result.
Soft deadline guarantee:
Better to return partial results promptly than
to wait too long.

DAC(P, op_proc, op_reduce, q, t)
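
A hedged sketch of the quality-controlled call follows. It assumes q is the minimum acceptable fraction of contributing partitions and t is a soft deadline in seconds; the thread pool, the function name `dac_with_quality`, and the returned triple are assumptions made for illustration, not the interface of the actual system.

```python
import concurrent.futures as cf
import time
from functools import reduce


def dac_with_quality(partitions, op_proc, op_reduce, q, t):
    """Return (result, quality, meets_q) using only partitions done by time t."""
    pool = cf.ThreadPoolExecutor(max_workers=max(1, len(partitions)))
    futures = [pool.submit(op_proc, p) for p in partitions]
    done, _late = cf.wait(futures, timeout=t)
    pool.shutdown(wait=False)  # do not block on stragglers
    # Keep partial results from partitions that finished without an error.
    partials = [f.result() for f in done if f.exception() is None]
    quality = len(partials) / len(partitions) if partitions else 0.0
    result = reduce(op_reduce, partials) if partials else None
    return result, quality, quality >= q


def slow_on_4(p):
    """Hypothetical process operator; partition 4 is very slow."""
    if p == 4:
        time.sleep(1.0)
    return p * 10


result, quality, meets_q = dac_with_quality(
    [1, 2, 3, 4], slow_on_4, lambda a, b: a + b, q=0.5, t=0.3)
print(result, quality, meets_q)
```

Here three of the four partitions answer before the deadline, so the caller receives a partial sum with an aggregation quality of 0.75 instead of waiting for the slow partition.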

10
Summary of Key Design Ideas

Load-adaptive tree reduction:
Minimizes response time.
Sustains throughput.
Tolerates faults/unresponsiveness.
A hybrid thread/event-driven node architecture.
Staged timeout that proactively prunes slow or
unresponsive servers from a reduction tree.

11
Design Choices for Aggregation

Three reduction schemes:
Base: without programming support.
Flat: random delegated roots.
Hierarchical: dynamic, load-aware.

[Diagram label: Service Providers]

12
Optimization in Tree-based Aggregation

Form a reduction tree dynamically for each
request:
Load changes from one request to the next.
Dynamic trees can help balance load.
Need to tolerate node slowness and failures.
Optimization issues in tree formation:
Optimize the tree shape:
High outgoing degree implies high aggregation
cost, causing load imbalance.
Tree depth affects latency (long path).
Machine assignment:
Assign slow machines to leaf nodes.
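
The degree-versus-depth trade-off can be made concrete with a small calculation. This is a simplified cost model, not taken from the slides: it assumes every tree node is a server, that the tree is a complete tree with a fixed fan-out, and that a node with d children performs d merges. The function names are hypothetical.

```python
def tree_depth(n, fan_out):
    """Depth (edges on the longest root-to-leaf path) of a complete
    fan_out-ary tree that holds n nodes."""
    depth, capacity, level_size = 0, 1, 1
    while capacity < n:
        level_size *= fan_out   # nodes that fit on the next level
        capacity += level_size
        depth += 1
    return depth


def max_merges_per_node(n, fan_out):
    """Most merges any single node performs (one per child)."""
    return min(fan_out, n - 1)


n = 28  # illustrative node count
for fan_out in (2, 3, 5, 27):
    print(fan_out, tree_depth(n, fan_out), max_merges_per_node(n, fan_out))
```

With 28 nodes, a fan-out of 27 gives a flat tree of depth 1 whose root performs 27 merges, while a fan-out of 2 caps each node at 2 merges but stretches the longest path to 4 hops.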

13
Load-adaptive Tree Formation (LAT)
[Figure: example of load-adaptive tree formation over servers A to H; only the node labels survive in this transcript.]

14
LAT Summary

Steps:
Collecting server load information.
Assigning operations to servers.
Constructing the reduction tree.
Adjusting the tree shape.
Time complexity: O(n log n).
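
The following is a simplified sketch of how load information might drive tree construction. It is not the LAT algorithm itself: it assumes a fixed fan-out, places servers in level order after sorting them by load, and omits the shape-adjustment step. The function name `build_tree` and the load values are hypothetical. The sort dominates the cost, which matches the O(n log n) bound.

```python
def build_tree(loads, fan_out=2):
    """Build a reduction tree from a {server: load} mapping.

    Servers are sorted by load and placed in level order, so the
    least-loaded server becomes the root and the most-loaded servers
    end up as leaves. Returns (root, children) where children maps
    each server to its list of child servers.
    """
    if not loads:
        raise ValueError("need at least one server")
    order = sorted(loads, key=loads.get)  # least-loaded first
    children = {server: [] for server in order}
    for i, server in enumerate(order[1:], start=1):
        parent = order[(i - 1) // fan_out]  # heap-style parent index
        children[parent].append(server)
    return order[0], children


# Hypothetical load readings for eight servers.
loads = {"A": 0.9, "B": 0.1, "C": 0.5, "D": 0.3,
         "E": 0.7, "F": 0.2, "G": 0.8, "H": 0.4}

root, children = build_tree(loads)
leaves = {s for s, kids in children.items() if not kids}
print(root, children, leaves)
```

In this example the four most heavily loaded servers (A, C, E, G) become leaves, so they only process their own partition and never merge results from other servers.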

15
Runtime System Architecture
[Architecture diagram; surviving labels: Service Consumer, DAC, Client Module, Request.]

16
Handling Failures and Unresponsiveness

Cases:
Server stopped: no heartbeat packets.
Server unresponsive: very long queue.
Solutions:
Exclude stopped servers from the reduction tree.
Slow nodes are already placed at the leaves.
Use staged timeout to eagerly prune unresponsive
servers.
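
One possible reading of staged timeout is that nodes deeper in the tree stop waiting for their children earlier, so a slow server is pruned near where it sits instead of delaying every ancestor until the overall deadline. The sketch below simulates that idea with made-up completion times. The equal-slice deadline formula, the function names, and the example tree are all assumptions, not the scheme used by the actual runtime.

```python
def stage_deadline(level, depth, t):
    """Time at which a node on `level` (root = 0) stops waiting for children."""
    if depth == 0:
        return t
    return t * (depth - level) / depth


def aggregate(node, tree, local_time, value, op_reduce, t, depth, level=0):
    """Simulate one reduction; returns (finish_time, result, contributors)."""
    result, count, finish = value[node], 1, local_time[node]
    cutoff = stage_deadline(level, depth, t)
    for child in tree.get(node, []):
        c_finish, c_result, c_count = aggregate(
            child, tree, local_time, value, op_reduce, t, depth, level + 1)
        if c_finish <= cutoff:
            result = op_reduce(result, c_result)
            count += c_count
            finish = max(finish, c_finish)
        else:
            # Child missed this stage's deadline: prune it and move on.
            finish = max(finish, cutoff)
    return finish, result, count


# Hypothetical tree of depth 2; X2 is unresponsive (5 s to answer).
tree = {"R": ["X", "Y"], "X": ["X1", "X2"], "Y": ["Y1"]}
local_time = {"R": 0.05, "X": 0.1, "Y": 0.1, "X1": 0.2, "X2": 5.0, "Y1": 0.3}
value = {name: 1 for name in local_time}

finish, total, contributors = aggregate(
    "R", tree, local_time, value, lambda a, b: a + b, t=1.0, depth=2)
print(finish, total, contributors)
```

In this simulation node X gives up on X2 at 0.5 s, so the root answers at 0.5 s with five of six partitions; with a single deadline of 1.0 s at every level, X would have held its partial result until 1.0 s.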

17
Evaluation

Application deployments: index search server;
NCBI's BLAST protein sequence matcher; online
facial recognizer.
Hardware: a cluster of Linux servers:
30 dual-CPU (400MHz P-II), 512MB MEM;
4 quad-CPU (500MHz P-II), 1GB MEM.
Benchmark I: search engine index server.
Dataset: 28 partitions, 1-1.2GB each.
Workload: trace-driven (one-week trace from
Ask.com).
Benchmark II: CPU-spinning microbenchmark.
Workload: synthetic.

18
Ease of Use

Applications: index server; NCBI's BLAST protein
sequence matcher; online facial recognizer.
First implemented without DAC.
A graduate student modified it with DAC.

19
Comparison of Three Aggregation Approaches

24 dual-CPU nodes, index server benchmark.

[Figure not preserved; only the values 10 and 39 survive, without labels.]

20
Scalability (simulation)
[Figure: (A) Scalability: Response Time (s) and (B) Scalability: Throughput (req/sec), each plotted against the Number of Server Partitions (up to 500), with curves for demand levels 60, 80, 90, and 95.]

21
Handling Server Failures without Replication