Performance-Driven Processor Allocation

Slides: 29

Provided by: DAC63

Learn more at: https://research.ac.upc.edu

1
Performance-Driven Processor Allocation

Julita Corbalan, Xavier Martorell, Jesus Labarta
{juli,xavim,jesus}@ac.upc.es
DAC-UPC

2
Objective

Scheduling parallel applications in shared-memory multiprogrammed systems
Allocate processors to applications that can take advantage of them
Implemented on an SGI Origin2000 with 64 processors

3
Outline

Introduction & Related Work
NANOS Execution Environment
Performance-Driven Processor Allocation (PDPA)
Evaluation
Conclusions & Future Work

4
Introduction

Scheduling problem: allocate processors to applications
Space-Sharing / Time-Sharing
Number of processes matched to the number of processors: Process Control [Tucker89]
Space-sharing approaches
P fixed at submission time
FCFS, SJF, SCDF [Majumdar88, ...]
P defined at execution time (Adaptive / Dynamic)
Equal allocation of the resources: Equipartition [McCann93]
Processor allocation proportional to the application performance

5
Introduction (2)

Processor allocation proportional to application performance
Drawback: application performance is not known before its execution
Solution: calculate it a priori
Executing several times with different P and input data
Extrapolating the values based on a few samples
These approaches may not be valid
Application performance depends on run-time parameters: initial data placement, process migrations, distance between processors and memory, ...
It can be impracticable, e.g. infinite input data sets

6
Related Work

Dynamic performance analysis
Self-Tuning [Nguyen96]: efficiency calculated at run-time as a function of idleness, system and communication overhead
Adaptive/Dynamic processor allocation policies
Equal_efficiency [Nguyen96]: tries to achieve the same efficiency on all processors
Dynamic Allocation, based on idleness [McCann93]
Allocates the knee of the efficiency/execution time curve [Eager89]

7
Our proposal

We propose
Dynamic performance analysis
Real speedup
Calculated at run-time
Allocate processors to applications that can take advantage of them
Dynamic partitioning
Cost-conscious re-allocations (memory locality)
Really efficient use of processors
Dynamic multiprogramming level
Coordination between the medium- and long-term schedulers

8
Outline

Introduction & Related Work
NANOS Execution Environment
Performance-Driven Processor Allocation (PDPA)
Evaluation
Conclusions & Future Work

9
NANOS Execution Environment
[Figure: components of the NANOS execution environment, running on the operating system of a shared-memory multiprocessor]
Queueing System (FCFS, holds the queued applications)
Controls the application arrival
Coordinated with the CPU Manager
CPU Manager
Implements the scheduling policy
Informs the applications about its decisions
Enforces the processor allocation
OpenMP parallel applications (malleable), with the SelfAnalyzer
Request processors
Inform about their performance
Messages shown in the figure
Queueing System and CPU Manager: "New application?", "Start new application"
Applications and CPU Manager: "Proc. request, speedup", "Proc. allocated"
CPU Manager and Operating System: "Resume, bind, ..."

10
Outline

Introduction & Related Work
NANOS Execution Environment
Performance-Driven Processor Allocation (PDPA)
Dynamic Performance Analysis: SelfAnalyzer
Performance-Driven Processor Allocation policy
Dynamic Multiprogramming Level
Evaluation
Conclusions & Future Work

11
Dynamic Performance Analysis: SelfAnalyzer

Tool to estimate the application speedup and execution time

Based on iterative parallel applications
Source code available
SelfAnalyzer calls inserted by the user or the compiler
Source code not available
Dynamic Periodicity Detection
SelfAnalyzer dynamically loaded

12
Dynamic Performance Analysis: SelfAnalyzer (2)

Speedup calculated as the ratio between T(1) and T(P): S(P) = T(1) / T(P)

Serialization!!
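
As a minimal sketch of the speedup relation on this slide (not the SelfAnalyzer's actual interface: the function names and timing values are hypothetical), speedup and efficiency could be derived from the measured time of one iteration on 1 and on P processors:

```python
def speedup(t1: float, tp: float) -> float:
    """Speedup S(P) = T(1) / T(P) for one iteration of the main loop."""
    if t1 <= 0 or tp <= 0:
        raise ValueError("iteration times must be positive")
    return t1 / tp


def efficiency(t1: float, tp: float, p: int) -> float:
    """Efficiency Eff(P) = S(P) / P."""
    if p < 1:
        raise ValueError("P must be at least 1")
    return speedup(t1, tp) / p


# Hypothetical measurements: 8.0 s per iteration on 1 processor,
# 1.25 s per iteration on 8 processors.
s = speedup(8.0, 1.25)
e = efficiency(8.0, 1.25, 8)
print(f"S(8) = {s:.2f}, Eff(8) = {e:.2f}")
```

The efficiency value is what a policy working with low_eff and high_eff thresholds would compare against.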
13
Performance-Driven Processor Allocation

Space-Sharing
Allocation for acceptable efficiency (S(p)/p)
In the range [low_eff, high_eff] (50%-70%)
Run-To-Completion
Minimum allocation of one processor
Dynamic partitioning, with re-allocations when:
Applications inform about their speedups
An application arrives or ends
Remembers the application state: allocation, performance

14
Performance-Driven Processor Allocation (2)

Policy parameters: step, low_eff and high_eff

[State diagram]
New application: P = min(Free Proc., Proc. Requested)
States: NO_REF, DEC, STABLE, INC
Transition conditions: Eff(p) < high_eff, Eff(p) > low_eff
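
The exact transitions of the state diagram are not recoverable from this transcript, so the following is only a rough sketch of a threshold-driven allocation update that uses the state names and parameters from the slide. The transition rules, the `AppState` and `adjust` names, and the default `step` of 4 are assumptions made for illustration, not the authors' algorithm:

```python
from dataclasses import dataclass

# State names come from the slide; the transition rules below are assumed.
NO_REF, INC, DEC, STABLE = "NO_REF", "INC", "DEC", "STABLE"


@dataclass
class AppState:
    procs: int            # processors currently allocated
    state: str = NO_REF   # no performance reference yet


def initial_allocation(free: int, requested: int) -> AppState:
    """New application: P = min(free processors, processors requested)."""
    return AppState(min(free, requested), NO_REF)


def adjust(app: AppState, eff: float, free: int, step: int = 4,
           low_eff: float = 0.5, high_eff: float = 0.7) -> AppState:
    """Update one application's allocation from its reported efficiency."""
    if eff > high_eff and free > 0:
        # Efficient enough to benefit from more processors: grow by <= step.
        return AppState(app.procs + min(step, free), INC)
    if eff < low_eff and app.procs > 1:
        # Poor efficiency: shrink, but never below one processor.
        return AppState(max(1, app.procs - step), DEC)
    # Efficiency acceptable, or no change possible: keep the allocation.
    return AppState(app.procs, STABLE)


app = initial_allocation(free=20, requested=32)
app = adjust(app, eff=0.3, free=0)
print(app)
```

Keeping the previous allocation in the returned state mirrors the slide's point that the policy remembers each application's allocation and performance.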
15
Dynamic Multiprogramming Level

Multiprogramming level (ML)
Number of applications running concurrently
Static/Dynamic ML
Coordination between the medium- and long-term schedulers
if (new_appl_fits()) start_new_appl()
new_appl_fits(): defined by the scheduling policy
Free processors during several quanta
start_new_appl(): implemented by the queueing system
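
The coordination described on this slide might be sketched as follows. This is an illustration under stated assumptions: the `DynamicML` class, the `quanta_required` parameter and its value of 3, and the list-based queue are hypothetical, and only the names `new_appl_fits` and `start_new_appl` come from the slide:

```python
from collections import deque


class DynamicML:
    """Decides when the queueing system may start another application."""

    def __init__(self, quanta_required: int = 3):
        # Hypothetical parameter: how many consecutive quanta must show
        # free processors before a new application is admitted.
        self.quanta_required = quanta_required
        self.free_history = deque(maxlen=quanta_required)

    def end_of_quantum(self, free_procs: int) -> None:
        """Record the number of unallocated processors for this quantum."""
        self.free_history.append(free_procs)

    def new_appl_fits(self) -> bool:
        """True if processors stayed free during the last several quanta."""
        return (len(self.free_history) == self.quanta_required
                and all(f > 0 for f in self.free_history))


def schedule_step(ml: DynamicML, queue: list, running: list) -> None:
    """One coordination step between the policy and the queueing system."""
    if queue and ml.new_appl_fits():
        running.append(queue.pop(0))   # start_new_appl(), in FCFS order
        ml.free_history.clear()        # wait for fresh observations


ml = DynamicML()
queue, running = ["swim", "bt"], []
for free in (4, 4, 2):
    ml.end_of_quantum(free)
    schedule_step(ml, queue, running)
print(running, queue)
```

Clearing the history after each admission is one way to avoid starting several applications on the strength of the same observations.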

16
Outline

Introduction & Related Work
NANOS Execution Environment
Performance-Driven Processor Allocation (PDPA)
Evaluation
Processor Allocation Policies
Applications & Workloads
Execution Time & Processor Allocation
Conclusions & Future Work

17
Processor Allocation Policies

Equip: equal CPUs to each running application
PDPA + DML: our proposal
Equal_eff: equal efficiency in all the processors
SGI-MP: native IRIX scheduler
MP_BLOCKTIME=200000
OMP_DYNAMIC=TRUE
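
For comparison with the baseline policies listed above, here is a small sketch of an equal-allocation (Equip-style) split. It is a generic illustration, not the implementation evaluated in the talk; the `equipartition` function and the rule that an application never receives more than it requests are assumptions:

```python
from typing import List


def equipartition(total_procs: int, requests: List[int]) -> List[int]:
    """Split processors equally among applications, capped by each request.

    Processors left over by applications with small requests are
    redistributed among the remaining ones.
    """
    alloc = [0] * len(requests)
    remaining = total_procs
    active = [i for i, r in enumerate(requests) if r > 0]
    while remaining > 0 and active:
        share = max(1, remaining // len(active))
        for i in list(active):
            if remaining == 0:
                break
            give = min(share, requests[i] - alloc[i], remaining)
            alloc[i] += give
            remaining -= give
            if alloc[i] == requests[i]:
                active.remove(i)   # request satisfied
    return alloc


# Four applications requesting 32 processors each on a 64-processor machine.
print(equipartition(64, [32, 32, 32, 32]))
```

Unlike a performance-driven policy, this split ignores how efficiently each application uses the processors it receives.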

18
Applications & Workloads

Architecture & System
SGI Origin2000 with 64 processors, IRIX 6.5.8
Applications: OpenMP
Swim (44.2), Bt (20.85), Hydro2d (6.3), apsi (1)
Workloads
Multiprogramming level set to 4
Each application requests 32 processors

19
Execution Time & Processor Allocation
Limited processor allocation
Total execution time reduced
Application execution time slightly increased
20
Execution Time & Processor Allocation
Performance affected by the multiprogrammed execution
Total execution time improved
Allocation proportional to the performance
21
SGI vs. PDPA
4476 vs. 4 process migrations!
Processor Affinity Process Control
22
PDPA behavior (zoom)
Tuning algorithm
23
Outline

Introduction & Related Work
NANOS Execution Environment
Performance-Driven Processor Allocation (PDPA)
Evaluation
Conclusions & Future Work

24
Conclusions

It is important to provide accurate performance information
SelfAnalyzer: dynamic, accurate, easy to use
PDPA allocates processors to applications that can take advantage of them
The Dynamic Multiprogramming Level improves the system performance
Coordinating the medium- and long-term schedulers

25
Future Work

Dynamic performance analysis
Non-iterative applications
PDPA
Space-Sharing + Time-Sharing
Evaluation in an open environment
Step, low_eff and high_eff need further research
Number of reallocations limited
Coordination of the medium- and long-term schedulers
New policies

26
More contact info...

http://www.ac.upc.es/NANOS
http://www.ac.upc.es/homes/juli
juli@ac.upc.es

27
Related Work

Dynamic performance analysis
Self-Tuning [Nguyen96]: efficiency calculated at run-time as a function of idleness, system and communication overhead
Dynamic processor allocation policies
Equal_efficiency [Nguyen96]: tries to achieve the same efficiency on all processors
Dynamic Allocation, based on idleness [McCann93]
Allocates the knee of the efficiency/execution time curve [Eager89]

It does not calculate the real speedup
It does not ensure an efficient use of processors
Excessive number of reallocations
Uses a priori information
28
Performance-Driven Processor Allocation (3)