Cluster Resource Management: A Scalable Approach

About This Presentation

Slides: 37

Provided by: sjordan

Learn more at: https://pages.cs.wisc.edu

Transcript and Presenter's Notes

Title: Cluster Resource Management: A Scalable Approach

1
Cluster Resource Management: A Scalable Approach

Ning Li and Jordan Parker
CS 736 Class Project

2
Outline

Introduction
A Scalable Approach: Hierarchy
Results
Conclusions
Questions

3
Why Study Resource Management?

Clusters have become increasingly popular for large-scale parallel computing.
Web servers
Clusters are growing to the order of thousands of nodes.
Clusters are providing multiple services.
Resource management is hard to evaluate
Bad management is easy to identify
Good management is much harder to identify

4
Resource Management Example

[Figure: resource management example in which the 4th node services only B, contrasting poor management with the ideal allocation. Overall: A 37.5, B 62.5.]

5
Clustering Goals

Scalability
Reliability
High Performance
Affordability

6
Related Work

Proportional-Share
Cluster Reserves

7
Related Work: Approach Differences

Our goal: to provide a scalable solution for resource management.
Other work focused primarily on achieving good management.
This often meant one manager for all the nodes.
Clearly, this could become a scalability bottleneck.
Effectiveness: other solutions are probably better for smaller clusters; we hope to be better for large (>1000 nodes) clusters.

8
Outline

Introduction
A Scalable Approach: Hierarchy
Results
Conclusions
Questions

9
Hierarchy: A Scalable Approach

Hierarchical Management
Nodes service jobs
Managers facilitate resource management
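
To make the node/manager split concrete, here is a minimal sketch of a two-level hierarchy in Python. The class names, the round-robin assignment of nodes to managers, and the sizes used are all illustrative assumptions; the slides do not describe how the hierarchy was built.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A leaf that services jobs (hypothetical structure)."""
    node_id: int


@dataclass
class Manager:
    """An interior vertex that manages nodes or lower-level managers."""
    manager_id: int
    children: list = field(default_factory=list)


def build_two_level_hierarchy(num_nodes: int, num_first_level: int) -> Manager:
    """Spread nodes round-robin under first-level managers, all under one root."""
    first_level = [Manager(manager_id=i) for i in range(num_first_level)]
    for n in range(num_nodes):
        first_level[n % num_first_level].children.append(Node(node_id=n))
    # The single second-level manager gets the placeholder id -1.
    return Manager(manager_id=-1, children=first_level)


def count_nodes(vertex) -> int:
    """Count the leaf nodes beneath a manager (1 for a node itself)."""
    if isinstance(vertex, Node):
        return 1
    return sum(count_nodes(child) for child in vertex.children)


root = build_two_level_hierarchy(num_nodes=100, num_first_level=10)
```

With this layout no manager talks to more than a handful of children, which is the property a hierarchy is meant to provide over a single manager for all nodes.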

10
Banking Algorithm

Goal
Determine best allocation given previous usage
Primitives
Tickets
Bank accounts
Deposit / withdraw tickets
6 Steps

11
Banking Algorithm

Step 1: For each service class on each node
Deposit unused tickets
Step 2: For each service class on each node
Reallocate service class
Full utilization: Allocation = usage + k
Under utilization: Allocation = usage - k
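
Steps 1 and 2 might be sketched for a single node as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes tickets are integers, that "full utilization" means usage reached the allocation, that the reallocation rules read as usage + k and usage - k, and that allocations never drop below zero. The function and variable names are hypothetical.

```python
def node_step(allocation, usage, k, bank):
    """Run Steps 1 and 2 for one node.

    allocation, usage: dicts mapping service class -> tickets
    k: the adjustment constant from the slide
    bank: dict mapping service class -> deposited tickets (updated in place)
    Returns the new per-class allocation for this node.
    """
    new_allocation = {}
    for service_class, allocated in allocation.items():
        used = usage[service_class]
        # Step 1: deposit tickets that were allocated but not used.
        unused = max(allocated - used, 0)
        bank[service_class] = bank.get(service_class, 0) + unused
        # Step 2: reallocate based on how the class used its share.
        if used >= allocated:
            new_allocation[service_class] = used + k          # full utilization
        else:
            new_allocation[service_class] = max(used - k, 0)  # under utilization
    return new_allocation


bank = {}
new_allocation = node_step(
    allocation={"A": 50, "B": 50},
    usage={"A": 50, "B": 20},
    k=5,
    bank=bank,
)
```

Here class A used its whole share, so its allocation grows by k, while class B deposits its 30 unused tickets and has its allocation cut.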

12
Banking Algorithm Cont.

Step 3: For each service class
Compare total allocation to desired
Subtract from over-allocated
Add to needy under-allocated
Step 4: For each service class
Deposit / Withdraw
If still over-allocated: withdraw
If still under-allocated: deposit

13
Banking Algorithm Cont.

Step 5:
Withdraw and allocate
Reward the needy nodes
Step 6:
Done, clear the bank accounts
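
The manager-side steps can be illustrated with a rough sketch. It covers Step 3 and Steps 5 and 6 for one service class and leaves out Step 4, whose deposit and withdraw rules the slides do not spell out. The proportional shrink, the even split among needy nodes, and all names below are assumptions made for illustration.

```python
def rebalance_class(allocations, needy, desired_total):
    """Step 3 for one service class.

    allocations: dict mapping node id -> tickets allocated to this class
    needy: set of node ids assumed to want more tickets
    desired_total: cluster-wide ticket target for this class
    """
    total = sum(allocations.values())
    result = dict(allocations)
    if total > desired_total:
        # Over-allocated: shrink every node proportionally (an assumption).
        scale = desired_total / total
        result = {node: tickets * scale for node, tickets in allocations.items()}
    elif total < desired_total and needy:
        # Under-allocated: split the shortfall evenly among the needy nodes.
        share = (desired_total - total) / len(needy)
        for node in needy:
            result[node] += share
    return result


def reward_and_clear(bank, allocations, needy):
    """Steps 5 and 6 for one service class: pay out the balance, then reset it."""
    result = dict(allocations)
    if needy:
        share = bank["balance"] / len(needy)
        for node in needy:
            result[node] += share
    bank["balance"] = 0  # Step 6: clear the account, even if nothing was paid out
    return result


over = rebalance_class({"n1": 60, "n2": 40}, set(), desired_total=50)
under = rebalance_class({"n1": 20, "n2": 10}, {"n2"}, desired_total=50)
bank = {"balance": 10}
rewarded = reward_and_clear(bank, under, {"n2"})
```

In the first call the class holds 100 tickets against a target of 50, so both nodes are halved. In the second, the 20-ticket shortfall goes to the one needy node, which then also receives the 10 banked tickets.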

14
Reliability

Bottom-up Manager Replacement

[Figure: hierarchy diagram with numbered nodes and managers illustrating bottom-up manager replacement]

15
Outline

Introduction
A Scalable Approach: Hierarchy
Results
Conclusions
Questions

16
Results

Cluster Nodes  Managers 1st/2nd Level  Reporting 1st/2nd Level  Workloads  Workloads  Class 2 Constraints  Tests  Tests
4              2/1                     1/1                      Steady     Dyn        1                    1      1
                                       1/5                      Steady     Dyn
100            10/1                    1/1                      Steady     Dyn        1-30                 2      3
                                       1/5                      Steady     Dyn                             4      4
900            30/1                    1/1                      Steady     Dyn        1-300                5      5
                                       1/5                      Steady     Dyn

17
Implementation Details