HC: Goals and Problems

Transcript and Presenter's Notes
1
  • An Introduction to Research Issues in
    Heterogeneous Parallel and Distributed Computing
  • H. J. Siegel
  • Professor of Electrical and Computer Engineering
  • and Professor of Computer Science
  • Colorado State University
  • Fort Collins, Colorado, 80523 USA
  • HJ@ColoState.edu
  • Outline
  • 1. Anecdote
  • 2. Goal of Heterogeneous Computing
  • 3. Mixed-Machine Heterogeneous Computing
    Environment
  • 4. Model of Automatic Heterogeneous Computing
  • 5. Example Resource Allocation Research
  • 6. Open Problems
  • 7. Alligators

2
Goal of Heterogeneous Computing
  • heterogeneous computing system has varied
    computational capabilities
  • numerous applications of various types are to be
    executed
  • each application consists of tasks with
    different computational requirements
  • match each task to an appropriate component of
    the heterogeneous computing system
  • goal: optimize some performance criterion
  • ex. minimize execution time of a set of
    applications

3
Goal of Heterogeneous Computing - Example
  • hypothetical example - application with four
    tasks
  • based on total time 100 units on a workstation

[figure: the example application executing on a single workstation, 100 time units total]
4
Goal of Heterogeneous Computing - Example
  • hypothetical example - application with four
    tasks
  • based on total time 100 units on a workstation

[figure: on the workstation, the four tasks take 25, 30, 10, and 35 time units; their best-matched machine types are a distr. mem. multi-proc., a distr. shared mem. machine, a small shared mem. proc., and a large cluster, respectively]
5
Goal of Heterogeneous Computing - Example
  • hypothetical example - application with four
    tasks
  • based on total time 100 units on a workstation

[figure: workstation times as above; on a large cluster alone, the same tasks take 20, 20, 0.3, and 8 time units, about 2× faster than the workstation]
6
Goal of Heterogeneous Computing - Example
  • hypothetical example - application with four
    tasks
  • based on total time 100 units on a workstation
  • heterogeneous suite time includes any needed
    inter-machine communication (not drawn to scale)
  • need workload to keep all machines reasonably busy

[figure: timelines compared — workstation (25, 30, 10, 35 units; 100 total), large cluster alone (20, 20, 0.3, 8 units; about 2× faster), and a heterogeneous suite running each task on its best-matched machine, including inter-machine communication]
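The slide's timing figures can be sanity-checked with a short calculation. The per-task times below are the hypothetical values from the figure; the heterogeneous-suite times are not given numerically, so only the workstation-versus-cluster comparison is computed:

```python
workstation = [25, 30, 10, 35]     # per-task times on the workstation, 100 units total
large_cluster = [20, 20, 0.3, 8]   # the same four tasks on the large cluster alone

speedup = sum(workstation) / sum(large_cluster)
print(sum(workstation), sum(large_cluster))   # totals: 100 and 48.3
print(round(speedup, 1))                      # about 2x, as on the slide
```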
7
Outline
  • 1. Anecdote
  • 2. Goal of Heterogeneous Computing
  • 3. Mixed-Machine Heterogeneous Computing
    Environment
  • 4. Model of Automatic Heterogeneous Computing
  • 5. Example Matching and Scheduling Research
  • 6. Open Problems
  • 7. Alligators

8
Heterogeneous Computing Systems
  • mixed-machine heterogeneous computing (HC) system
  • network of different machines
  • also known as
  • heterogeneous parallel computing
  • heterogeneous distributed computing
  • heterogeneous multicomputer
  • presentation also applies to a cluster of
    different types (or different ages) of machines
  • presentation also applies to some kinds of grid
    computing environments
  • two machines are heterogeneous if their
  • performance differs on any given task
  • differences could result from differences in
  • CPU clock speed, instruction set,
  • memory size, speed, organization,
  • operating system type, version,
  • ...

9
Using HC Systems
  • each application to be executed is composed of
    one or more tasks
  • ideally, each task computationally homogeneous
  • different tasks may have different computational
    needs
  • there may be inter-task communication
  • each task must be assigned (matched) to a machine
  • execution of tasks and inter-task communication
    must be ordered (scheduled)
  • mapping = matching + scheduling
  • also called resource allocation or resource
    management
  • mapping attempts to optimize a performance metric
  • in general, a known NP-complete problem
  • use heuristics to find near-optimal solutions
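As an illustration of the heuristic approach (not a heuristic from this talk), a minimal greedy mapper can assign each task to the machine with the earliest estimated completion time; the task names, machine names, and ETC values below are hypothetical:

```python
# Greedy "minimum completion time" mapper sketch (illustrative only).
ETC = {              # ETC[task][machine] = estimated time to compute
    "t0": {"m0": 25, "m1": 20},
    "t1": {"m0": 30, "m1": 20},
    "t2": {"m0": 10, "m1": 0.3},
    "t3": {"m0": 35, "m1": 8},
}
ready = {"m0": 0.0, "m1": 0.0}   # time at which each machine becomes free
mapping = {}
for task, times in ETC.items():
    # pick the machine that finishes this task earliest (ready time + ETC)
    best = min(times, key=lambda m: ready[m] + times[m])
    mapping[task] = best
    ready[best] += times[best]
print(mapping)   # {'t0': 'm1', 't1': 'm0', 't2': 'm1', 't3': 'm1'}
```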

10
Mapping Tasks to Machines in an HC System
  • map tasks to machines considering
  • quality of match (computational requirements to
    machine capabilities; exploit heterogeneity)
  • inter-machine communication overhead (code,
    initial and generated data)
  • concurrent use of multiple machines when
    appropriate
  • estimated machine and network availability
  • inter-task precedence constraints on scheduling
  • typically, user must
  • decompose application into tasks
  • match tasks to machines
  • schedule execution order of tasks
  • schedule inter-machine data transfers

11
Example of Use of Mixed-Machine HC System
  • simulation of turbulent convection at Minnesota
    Supercomputer Center
  • calculation of velocity and temperature fields: CM-5
  • calculation of particle traces: Cray 2
  • calculation of particle statistics: CM-200
  • visualization: SGI VGX

12
Outline
  • 1. Anecdote
  • 2. Goal of Heterogeneous Computing
  • 3. Mixed-Machine Heterogeneous Computing
    Environment
  • 4. Model of Automatic Heterogeneous Computing
  • 5. Example Matching and Scheduling Research
  • 6. Open Problems
  • 7. Alligators

13
User Specified Versus Automatic Mappings
  • some tools exist to help user map an application
  • long term goal
  • automatic decomposition, matching, and scheduling
  • encourage, facilitate, and improve performance of
    HC system use

14
Conceptual Model of Automatic HC
[diagram: applications and the machines in the suite feed "generation of parameters relevant to both applications and machines", yielding categories for computational needs and categories for machine capabilities]
15
Conceptual Model of Automatic HC
[diagram: same as the previous slide, plus "task profiling for decomposing applications", which yields the characteristics of each application task]
16
Conceptual Model of Automatic HC
[diagram: "analytical benchmarking for machines" yields machine characteristics and inter-machine communication parameters, alongside the characteristics of each application task]
17
Conceptual Model of Automatic HC
[diagram: application-task characteristics, machine characteristics, a performance measure, and the initial status of the machines and network feed "static resource allocation", which produces a matching of tasks to machines and an execution schedule that drive execution]
18
Conceptual Model of Automatic HC
[diagram: same as the previous slide, but with "static and dynamic resource allocation"; a monitor feeds the status of the machines, network, and workload back into resource allocation during execution]
19
Conceptual Model of Automatic HC
[diagram repeated from the previous slide]
20
Outline
  • 1. Anecdote
  • 2. Goal of Heterogeneous Computing
  • 3. Mixed-Machine Heterogeneous Computing
    Environment
  • 4. Model of Automatic Heterogeneous Computing
  • 5. Example Matching and Scheduling Research
  • 6. Open Problems
  • 7. Alligators

21
Static Mapping in Ad Hoc Grids
  • ad hoc grid
  • heterogeneous computing system consisting of
    different mobile devices with wireless
    communication
  • group of individuals with mobile computing
    devices
  • application consisting of numerous communicating
    subtasks
  • extensive computation and communication among ad
    hoc grid components
  • total battery energy available for each machine
    limited
  • often in difficult environments - examples
  • disaster management
  • wildfire fighting
  • defense operations

22
Simplified Wildfire Fighting Example
23
Problem Statement
  • map the S communicating subtasks of the
    application task to the M machines in the ad hoc
    grid
  • constraints
  • all subtasks of the application must be executed
  • must complete application in ≤ τ seconds
  • battery capacity constraint for each machine
  • wall clock time for mapper itself to execute ≤
    60 minutes
  • goal
  • design resource allocation (mapping) heuristics
  • assign the communicating subtasks of the
    application task to the machines in the ad hoc grid
  • performance metric: minimize the average, over
    all machines, of the percentage of the energy
    consumed

24
Energy Model for Computation
  • two classes of machines: fast machines and slow
    machines
  • initial (maximum) battery capacity on machine j:
    B(j)
  • B(j) for fast machines = 580 energy units
  • B(j) for slow machines = 58 energy units
  • estimated time to compute subtask i on machine j:
    ETC(i, j)
  • each machine has a unique ETC value for each
    subtask
  • differ among fast machines
  • differ among slow machines
  • rate at which machine j consumes energy
    for subtask execution, per ETC time unit: E(j)
  • E(j) for fast machines = 0.1 energy units per
    second
  • E(j) for slow machines = 0.001 energy units per
    second
  • energy consumed for executing subtask i on
    machine j = ETC(i, j) × E(j)

25
Energy Model for Communication
  • communication bandwidth for machine j: BW(j)
  • BW(j) for fast machines = 8 megabits per second
  • BW(j) for slow machines = 4 megabits per second
  • CMT(j, k): per-bit time to transfer a data item
    from machine j to machine k
  • CMT(j, k) = 1 / min(BW(j), BW(k))
  • rate at which machine j consumes energy for
    transmitting subtask output, per communication
    time unit: C(j)
  • C(j) for fast machines = 0.2 energy units per
    second
  • C(j) for slow machines = 0.002 energy units per
    second
  • energy consumed to send a data item of size g from
    machine j to machine k = CMT(j, k) × g × C(j)
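The two energy models above can be written out directly. The machine classes and rate constants come from these slides; the sample arguments in the calls are hypothetical:

```python
E  = {"fast": 0.1,  "slow": 0.001}   # compute energy rate (energy units / second)
C  = {"fast": 0.2,  "slow": 0.002}   # transmit energy rate (energy units / second)
BW = {"fast": 8e6,  "slow": 4e6}     # bandwidth (bits / second)

def compute_energy(etc, cls):
    """Energy to execute a subtask: ETC(i, j) x E(j)."""
    return etc * E[cls]

def comm_energy(bits, sender, receiver):
    """Energy to send g bits: CMT(j, k) x g x C(j), with CMT = 1 / min(BW)."""
    return (1.0 / min(BW[sender], BW[receiver])) * bits * C[sender]

print(compute_energy(100, "fast"))       # 100 s of compute on a fast machine
print(comm_energy(4e6, "fast", "slow"))  # 4 Mbit sent fast -> slow
```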

26
Assumptions
  • each machine can communicate while computing
  • ignore energy consumed by subtask to receive data
    item
  • ignore energy consumed by machine when idle

27
Performance Metric
  • total battery energy consumed by machine j after
    entire task completed: EC(j)
  • recall
  • B(j) is maximum battery capacity on machine j
  • M is number of machines
  • performance metric: Bpavg = (1/M) × Σj EC(j)/B(j),
    the average percentage of battery energy consumed
  • goal
  • minimize Bpavg
  • complete application must execute in ≤ τ seconds
  • obey battery capacity constraint for each machine
  • wall clock time for mapper ≤ 60 minutes
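A sketch of the metric, using the formula Bpavg = (1/M) × Σj EC(j)/B(j) reconstructed from the slide's prose (expressed as a percentage); the EC values below are hypothetical:

```python
def bpavg(EC, B):
    """Average percent of battery energy consumed across the M machines."""
    M = len(B)
    return sum(EC[j] / B[j] for j in range(M)) * 100.0 / M

B  = [580.0, 580.0, 58.0, 58.0]   # two fast machines, two slow machines
EC = [290.0, 145.0, 29.0, 5.8]    # hypothetical energy consumed per machine
print(bpavg(EC, B))               # (50 + 25 + 50 + 10) / 4 = 33.75
```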

28
Simulation Setup
  • each application task composed of S = 1,024
    subtasks
  • data dependencies among subtasks represented by
    a random directed acyclic graph (DAG)
  • 10 different DAGs generated for this study
  • sizes of transferred data items sampled from a
    Gamma distribution
  • eight machines in two classes
  • fast machines (machines 0 to 3)
  • slow machines (machines 4 to 7)
  • 10 different ETC matrices generated
  • used coefficient of variation (COV) method
  • 100 different scenarios
  • each scenario = combination of a DAG and an ETC
    matrix
  • time constraint τ for all subtasks in the
    application task to execute = 34,075 seconds

29
Heuristics Overview
  • six static mapping schemes studied in this
    research
  • Levelized Weight Tuning, Bottoms Up, Min-Min,
    Genetic Algorithm, A*, and Simplified Lagrangian
  • makespan defined as overall execution time of
    entire application task on machines in ad hoc
    grid
  • for final mapping of all six heuristics
  • energy constraint
  • B(j) not exceeded for any machine
  • time constraint
  • execution time (makespan) of application does
    not exceed τ

30
Levelized Weight Tuning (LWT) Assigning Levels
  • all subtasks assigned levels depending on
    precedence constraints
  • lowest level consists of subtasks with no
    predecessors
  • highest level consists of subtasks with no
    successors
  • each of the rest of the subtasks is at one level
    below the lowest producer of its global data items
  • example [levelized-DAG figure not reproduced in
    transcript]
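One common way to compute such levels, assuming the usual convention that a subtask sits one level above its deepest predecessor (so subtasks with no predecessors land at level 0, matching the slide); the DAG below is hypothetical:

```python
preds = {                      # hypothetical DAG: subtask -> its predecessors
    "a": [], "b": [],
    "c": ["a"], "d": ["a", "b"],
    "e": ["c", "d"],
}
levels = {}

def level(t):
    # no predecessors -> level 0; otherwise one above the deepest predecessor
    if t not in levels:
        levels[t] = 0 if not preds[t] else 1 + max(level(p) for p in preds[t])
    return levels[t]

print({t: level(t) for t in preds})   # {'a': 0, 'b': 0, 'c': 1, 'd': 1, 'e': 2}
```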

31
Levelized Weight Tuning (LWT) Procedure
  • within each level, list subtasks in descending
    order based on total size of output data items
  • let α = (current level number + 1) / (total
    number of levels)
  • for each level from lowest to highest, for
    subtask Sj in level
  • F is a weighting factor, experimentally determined
  • β = ratio of partial makespan to τ
  • if β > (α × F), assign subtask Sj to the machine
    that increases current makespan by the least amount
  • else map subtask to the machine that increases
    current Bpavg by the least amount
  • update time and energy availability across
    machines
  • repeat steps 2 to 4 until all subtasks mapped
  • F varied from 1 to 2 in steps of 0.1 for each
    complete mapping for each scenario; keep best
    value of Bpavg
  • average value of F was 1.6
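The per-subtask decision can be sketched as follows, with α taken as the fraction of levels processed and β as the fraction of the deadline τ already consumed (the symbol names are reconstructions, and the argument values are hypothetical):

```python
def lwt_choice(level_num, total_levels, partial_makespan, tau, F):
    alpha = (level_num + 1) / total_levels   # fraction of levels processed
    beta = partial_makespan / tau            # fraction of the deadline used
    # behind schedule relative to progress -> minimize makespan,
    # otherwise minimize the energy metric Bpavg
    return "min_makespan" if beta > alpha * F else "min_bpavg"

print(lwt_choice(2, 10, 5000.0, 34075.0, 1.6))    # well ahead -> "min_bpavg"
print(lwt_choice(0, 10, 10000.0, 34075.0, 1.6))   # behind -> "min_makespan"
```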

32
Bottoms Up (BU) Fitness Value
  • based on Min-Min (greedy) concept [Ibarra, 1977]
  • subtasks in DAG mapped bottom up, from child to
    parent
  • from the highest level to the lowest level
  • mappable subtasks = subtasks whose successors are
    all mapped
  • normalized time for subtask i on machine j:
    NT(i, j); normalized energy: NE(i, j)
  • fitness value = (λ × NT(i, j)) + ((1 - λ) ×
    NE(i, j))
  • weighting factor λ varied from 0 to 1 in steps
    of 0.1 for each complete mapping for each
    scenario
  • value of 0.5 gave best value of Bpavg in all
    scenarios

33
Bottoms Up (BU) Procedure
  • list all mappable subtasks (successors already
    mapped)
  • for each mappable subtask in the subtask list,
    while ignoring other subtasks in the list
  • find machine that gives the subtask its minimum
    fitness value
  • among all subtask/machine pairs found from
    (2),find pair that gives minimum fitness value
  • ties broken arbitrarily
  • assign subtask to its paired machine
  • remove that subtask from mappable subtask list
  • update time and energy availability for that
    machine
  • repeat steps 1 to 6 until all subtasks are
    assigned machines
  • subtasks scheduled to execute in the reverse of
    the order in which they were assigned to machines
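Steps 1 to 6 can be sketched as a Min-Min-style selection loop over the mappable subtasks, using the fitness value from the previous slide; the weighting factor, subtask/machine names, and NT/NE values below are hypothetical and assumed already normalized:

```python
lam = 0.5                       # weighting factor (0.5 was best on the slide)
NT = {("s0", "m0"): 1.0, ("s0", "m1"): 0.4,
      ("s1", "m0"): 0.6, ("s1", "m1"): 0.9}
NE = {("s0", "m0"): 0.2, ("s0", "m1"): 0.7,
      ("s1", "m0"): 0.5, ("s1", "m1"): 0.1}
machines = ["m0", "m1"]

def fitness(s, m):
    # fitness = lam x NT + (1 - lam) x NE
    return lam * NT[(s, m)] + (1 - lam) * NE[(s, m)]

mappable = ["s0", "s1"]         # subtasks whose successors are already mapped
order = []
while mappable:
    # best machine for each mappable subtask, then best subtask/machine pair
    best = min(((s, min(machines, key=lambda m: fitness(s, m)))
                for s in mappable), key=lambda p: fitness(*p))
    order.append(best)
    mappable.remove(best[0])
print(order)                    # subtasks later execute in reverse of this order
```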

34
Lower Bound
  • lower bound (LB) on Bpavg
  • Bpavg for optimal mapping ≥ this lower bound
  • ignores data precedence constraints,
    inter-machine communications, battery power
    constraint, and τ
  • for each subtask, in any random order
  • find minimum percentage energy consumed over all
    machines to execute the subtask
  • sum above values for all subtasks and average
    them across all machines
  • LB = (1/M) × Σi minj [ ETC(i, j) × E(j) / B(j) ]
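The lower-bound computation described above, as a sketch with hypothetical ETC, E, and B values (one fast and one slow machine, two subtasks):

```python
def lower_bound(ETC, E, B):
    """Sum over subtasks of the minimum percent-energy, averaged over M machines."""
    M = len(B)
    total = sum(min(row[j] * E[j] / B[j] for j in range(M)) for row in ETC)
    return total * 100.0 / M

E, B = [0.1, 0.001], [580.0, 58.0]          # energy rates and battery capacities
ETC = [[100.0, 2000.0], [50.0, 5000.0]]     # hypothetical ETC row per subtask
print(lower_bound(ETC, E, B))
```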

35
Simulation Results: Bpavg and Execution Times (sec)
[results table not reproduced in transcript; results averaged over 100 scenarios]
36
Summary of Static Mapping in Ad Hoc Grids
  • designed, evaluated, and compared six static
    heuristics for an ad hoc grid environment to
    minimize average energy consumed across all
    machines
  • Genetic Algorithm performed the best: only 14%
    greater than the unattainable lower bound
  • Levelized Weight Tuning and Bottoms Up performed
    comparably and did second best
  • Genetic Algorithm used Levelized Weight Tuning,
    Bottoms Up, and Min-Min as seeds
  • on average performed 3% to 4% better than seeds
  • heuristic execution time very large for Genetic
    Algorithm relative to Levelized Weight Tuning
    and Bottoms Up
  • Levelized Weight Tuning and Bottoms Up a good
    choice for the given type of problem

37
Outline
  • 1. Anecdote
  • 2. Goal of Heterogeneous Computing
  • 3. Mixed-Machine Heterogeneous Computing
    Environment
  • 4. Model of Automatic Heterogeneous Computing
  • 5. Example Matching and Scheduling Research
  • 6. Open Problems
  • 7. Alligators

38
Open Problems for HC Mappers
  • conceptual model for automatic heterogeneous
    computing
  • generation of parameters relevant to both
    application domain and machines
  • task profiling of application
  • analytical benchmarking of machines
  • mapping = matching and scheduling
  • handling uncertainty in estimated system
    parameter values
  • evaluating impact of uncertainty on performance
    of mapper
  • incorporating uncertainty robustness in mapper
  • allowing redundant computations on different
    machines
  • incorporate power consumption issues
  • deriving standard set of benchmark applications
  • minimizing dollar cost of set of machines to
    meet performance constraints

39
Open Problems for HC System Software
  • machine-independent languages with user-specified
    directives to
  • allow compilation into efficient code for any
    machine in suite
  • aid in decomposing application into homogeneous
    tasks
  • facilitate determination of task computational
    needs
  • interface with machine-dependent libraries
  • operating system interfaces to support
    heterogeneous computing and inter-task
    communications
  • local (machines) and global (network)
  • interactive applications
  • debugging and performance tuning
  • programming tools and environments
  • visualization tools

40
Open Problems for HC Network Issues
  • inter-machine data transport
  • hardware support needed
  • software protocols needed
  • network topology
  • computing minimum path between two machines
  • rerouting in case of faults or heavy loads
  • modeling the sharing of links and bandwidth among
    tasks

41
Open Problems for HC QoS Requirements
  • static and dynamic mappers for applications when
  • system is overloaded
  • applications have
  • deadlines (soft and hard)
  • priority levels with relative weightings
  • multiple versions
  • different computational needs
  • different quality results
  • different worths to users (e.g., 2nd choice worth
    only 25%)
  • security and other application dependent QoS
    requirements
  • performance measure is sum of priority weights
    of tasks that meet deadlines, degraded if
  • lesser version used
  • soft deadline not met
  • partial QoS received

42
Open Problems for HC Dynamic Issues
  • machine and network loading and status
    information(dynamic mapping)
  • how to measure non-intrusively
  • how often to take new measurements
  • how to communicate and update information
  • how to incorporate effectively into mappings
  • how to estimate task/transfer completion time
  • methods for dynamic task migration at execution
    time(dynamic mapping)
  • how to checkpoint and move an executing task
    between different types of machines
  • how and when to use task migration for load
    balancing
  • how to use task migration for fault tolerance

43
Open Problems for HC Paradigms
  • mapping different classes of applications
  • execute once (ex. compress an image)
  • execute continuously (ex. monitor inputs from
    sensors and control actuators)
  • subtasks' communication pattern not represented
    by a DAG
  • ex. co-routines
  • multi-tasking on each machine
  • how to estimate task computation time
  • how to model sharing of machine I/O
  • machines that are not under complete control of
    mapper
  • scalability
  • centralized versus distributed implementations
    of mappers
  • hierarchically structured mappers

44
Reference - Automatic HC and Open Problems
  • Heterogeneous Computing: Goals, Methods, and
    Open Problems
  • by T. D. Braun, H. J. Siegel, and A. A.
    Maciejewski
  • 8th International Conference on High Performance
    Computing (HiPC 2001), Dec. 2001
  • one of the keynote presentations

45
Reference - Static Mapping in an Ad Hoc Grid
  • Static Mapping of Subtasks in a Heterogeneous
    Ad Hoc Grid Environment
  • by S. Shivle, R. Castain, H. J. Siegel, A. A.
    Maciejewski, T. Banka, K. K. Chindam, S.
    Dusinger, P. K. Pichumani, P. M. Satyasekaran,
    W. W. Saylor, D. Sendek, J. C. Sousa, J.
    Sridharan, P. V. Sugavanam, J. A. Velazco
  • 13th Heterogeneous Computing Workshop (HCW
    2004), Apr. 2004
  • in the IEEE Computer Society proceedings of the
    18th International Parallel and Distributed
    Processing Symposium (IPDPS 2004)

46
Outline
  • 1. Anecdote
  • 2. Goal of Heterogeneous Computing
  • 3. Mixed-Machine Heterogeneous Computing
    Environment
  • 4. Model of Automatic Heterogeneous Computing
  • 5. Example Matching and Scheduling Research
  • 6. Open Problems
  • 7. Alligators

47
Concluding Remarks
  • heterogeneous parallel and distributed computing
    is an important research area, including clusters
    and grids
  • presented brief introduction to heterogeneous
    computing
  • showed model of automatic heterogeneous computing
  • gave example of heterogeneous computing mapping
    research
  • discussed some open problems in the field
  • please see our papers listed as references for
    more information and references to other relevant
    research