Parallel Objects: Virtualization

Provided by: orionl
Learn more at: https://www.ece.lsu.edu
Transcript and Presenter's Notes

1
Parallel Objects: Virtualization & In-Process
Components
  • Orion Sky Lawlor
  • Univ. of Illinois at Urbana-Champaign
  • POHLL-2002

2
Introduction
  • Parallel Programming is hard
  • Communication takes time
  • Message startup cost
  • Bandwidth contention
  • Synchronization, race conditions
  • Parallelism breaks abstractions
  • Flatten data structures
  • Hand off control between modules
  • Harder than serial programming

3
Motivation
  • Parallel Applications are either
  • Embarrassingly Parallel
  • Trivial, 1 RA-week effort
  • E.g. Monte Carlo, parameter sweep, SETI@home
  • Communication totally irrelevant to performance

4
Motivation
  • Parallel Applications are either
  • Embarrassingly Parallel
  • Excruciatingly Parallel
  • Massive, 1 RA-year effort
  • E.g. pure MPI codes, 10k lines
  • Communication, synchronization totally determine
    performance

5
Motivation
  • Parallel Applications are either
  • Embarrassingly Parallel
  • Excruciatingly Parallel
  • "We'll be done in 6 months"
  • Several parallel libraries, codes, and groups;
    dynamic and adaptive
  • E.g. Multiphysics simulation

6
Serial Solution: Abstract!
  • Build layers of software
  • High-level: libc, C++ STL, ...
  • Mid-level: OS kernel
  • Silently schedule processes
  • Keep CPU busy even when some processes block
  • Allows a process to ignore other processes
  • Low-level: assembler

7
Parallel Solution: Abstract!
  • Middle layers are missing
  • High-level: ScaLAPACK, POOMA, ...
  • Mid-level: ? (a "kernel")
  • Silently schedule components
  • Keep CPU busy even when some components block
  • Allows a component to ignore other components
  • Low-level: MPI

8
The missing middle layer
  • Provides dynamic computation and communication
    overlap, even across separate modules
  • Handles inter-module handoff
  • Pipelines communication
  • Improves cache utilization: smaller components
  • Provides nice layer for advanced features, like
    process migration

9
Examples: Multiprogramming

10
Examples: Pipelining

11
Middle Layer Implementation
  • Real OS processes/threads
  • Robust, reliable, implemented
  • High performance penalty
  • No parallel features (migration!)
  • Converse/Charm++
  • In-process components: efficient
  • Piles of advanced features
  • AMPI: MPI interface to Charm++
  • Application Framework

12
Charm++
  • Parallel library for object-oriented C++
    applications
  • Messaging via method calls
  • Communication proxy objects
  • Methods called by scheduler
  • System determines who runs next
  • Multiple objects per processor
  • Object migration fully supported
  • Even with broadcasts, reductions
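
The message-driven model on this slide can be illustrated with a toy sketch. It is not Charm++ code and does not use the Charm++ API; the names Scheduler, Proxy, Counter, and demo are hypothetical, and everything runs in one Python process. It only shows the idea that a call through a proxy becomes a queued message, and that the scheduler decides which object runs next.

```python
from collections import deque


class Scheduler:
    """Toy per-processor scheduler holding pending method invocations."""

    def __init__(self):
        self.queue = deque()

    def enqueue(self, target, method, args):
        self.queue.append((target, method, args))

    def run(self):
        # The scheduler, not the caller, determines who runs next.
        while self.queue:
            target, method, args = self.queue.popleft()
            getattr(target, method)(*args)


class Proxy:
    """Stand-in for a (possibly remote) object: calls become messages."""

    def __init__(self, scheduler, target):
        self._scheduler = scheduler
        self._target = target

    def __getattr__(self, method):
        # Only reached for names not set in __init__, i.e. method names.
        def send(*args):
            self._scheduler.enqueue(self._target, method, args)
        return send


class Counter:
    """Example object; several of these share one scheduler."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.total = 0

    def add(self, value):
        self.total += value
        self.log.append((self.name, self.total))


def demo():
    sched = Scheduler()
    log = []
    a = Proxy(sched, Counter("a", log))
    b = Proxy(sched, Counter("b", log))
    a.add(1)    # returns immediately; nothing has executed yet
    b.add(10)
    a.add(2)
    assert log == []   # methods run only when the scheduler delivers them
    sched.run()
    return log
```

Because callers never invoke the target directly, a real system is free to move the target to another processor and forward the queued messages, which is what makes migration transparent to the sender.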

13
Mapping Work to Processors
(Figure: user view vs. system implementation)

14
AMPI
  • MPI interface, implemented on Charm++
  • Multiple virtual processors per physical
    processor
  • Implemented as user-level threads
  • Very fast context switching
  • MPI_Recv only blocks virtual processor, not
    physical
  • All the benefits of Charm++
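
A rough way to picture "a blocking receive only blocks the virtual processor" is to model each virtual processor as a Python generator. This is a sketch under stated assumptions, not AMPI's implementation: run_virtual_processors, ping, and pong are hypothetical names, a bare `yield` stands in for a blocking receive, and a real system would use user-level threads rather than generators.

```python
from collections import deque


def run_virtual_processors(rank_functions):
    """Run generator-based virtual processors on one physical processor.

    Each rank function takes (rank, send, trace) and uses a bare `yield`
    as its blocking receive; it is resumed with the message on arrival.
    """
    n = len(rank_functions)
    mailboxes = [deque() for _ in range(n)]
    trace = []

    def send(dest, msg):
        mailboxes[dest].append(msg)

    gens = [fn(rank, send, trace) for rank, fn in enumerate(rank_functions)]
    blocked = set()    # ranks currently waiting in a receive
    finished = set()

    def resume(rank, value):
        try:
            gens[rank].send(value)   # runs until the next receive
            blocked.add(rank)
        except StopIteration:
            blocked.discard(rank)
            finished.add(rank)

    for rank in range(n):
        resume(rank, None)           # start every virtual processor

    progress = True
    while progress:
        progress = False
        for rank in sorted(blocked):
            if mailboxes[rank]:
                blocked.discard(rank)
                resume(rank, mailboxes[rank].popleft())
                progress = True
    return trace, finished


def ping(rank, send, trace):
    send(1, "ping")
    reply = yield                    # blocks only this virtual processor
    trace.append((rank, reply))


def pong(rank, send, trace):
    msg = yield
    trace.append((rank, msg))
    send(0, "pong")
```

While rank 0 waits for its reply, the loop keeps the physical processor busy by running rank 1; switching between the two costs one generator resume rather than an OS context switch.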

15
Application Frameworks
  • Domain-specific interfaces: unstructured grids,
    structured grids, particle-in-cell
  • Provide natural interface to application
    scientists (Fortran!)
  • Encapsulate communication
  • Built on Charm++
  • Most popular interfaces to Charm++

16
Charm++ Features: Migration
  • Automatic load balancing
  • Balance load by migrating objects
  • Application-independent
  • Built-in data collection (cpu, net)
  • Pluggable strategy modules
  • Adaptive Job Scheduler
  • Shrink/expand parallel job, by migrating objects
  • Dramatic utilization improvement
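
To make "pluggable strategy modules" concrete, here is a minimal sketch of one possible strategy. It assumes a simple greedy heuristic chosen for illustration, not necessarily any strategy the talk refers to, and the names greedy_strategy, processor_loads, and rebalance are hypothetical. The strategy sees only measured per-object loads, so it is application-independent, and changing num_processors models shrinking or expanding a job.

```python
import heapq


def greedy_strategy(object_loads, num_processors):
    """Place heaviest objects first, each on the least-loaded processor."""
    heap = [(0.0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    assignment = {}
    ordered = sorted(object_loads.items(), key=lambda kv: (-kv[1], kv[0]))
    for obj, load in ordered:
        total, proc = heapq.heappop(heap)
        assignment[obj] = proc
        heapq.heappush(heap, (total + load, proc))
    return assignment


def processor_loads(object_loads, assignment, num_processors):
    """Sum the measured object loads on each processor."""
    totals = [0.0] * num_processors
    for obj, proc in assignment.items():
        totals[proc] += object_loads[obj]
    return totals


def rebalance(object_loads, current, num_processors,
              strategy=greedy_strategy):
    """Return the new assignment and the objects that must migrate."""
    new = strategy(object_loads, num_processors)
    migrations = sorted(
        obj for obj in object_loads if current.get(obj) != new[obj]
    )
    return new, migrations
```

For example, with hypothetical measured loads {"a": 5.0, "b": 4.0, "c": 3.0, "d": 2.0, "e": 1.0, "f": 1.0} all starting on processor 0, rebalancing onto two processors migrates "b", "c", and "f" and leaves both processors with a load of 8.0. Passing a different function as `strategy` swaps the policy without touching the migration logic.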

17
Examples: Load Balancing
1. Adaptive Refinement
2. Load Balancer Invoked
3. Chunks Migrated

18
Examples: Expanding Job

19
Examples: Virtualization

20
Conclusions
  • Parallel applications need something like a
    kernel
  • Neutral party to mediate CPU use
  • Significant utilization gains
  • Easy to put good tools in kernel
  • Work migration support
  • Load balancing
  • Consider using Charm++

http://charm.cs.uiuc.edu/