1
Integrating Large-Scale Distributed and Parallel
High Performance Computing (DPHPC) Applications
Using a Component-based Architecture

Nanbor Wang¹, Fang (Cherry) Liu², Paul Hamil¹, Stephen Tramer¹, Rooparoni Pundaleeka¹, Randall Bramley²
¹ Tech-X Corporation, Boulder, CO, U.S.A.
² Indiana University, Bloomington, IN, U.S.A.

Workshop on Component-Based High-Performance Computing
October 16, 2008, Karlsruhe, Germany
Work partially funded by the US Department of
Energy, Office of Advanced Scientific Computing
Research, Grant DE-FG02-04ER84099
2
Agenda

Motivation and approach for Distributed and
Parallel High-Performance Computing (DPHPC)
Enabling distributed technologies
Applications development

3
Distributed and Parallel Component-Based Software
Engineering Addresses Modern Needs of Scientific
Computing

Motivating scenarios for Distributed and Parallel HPC (DPHPC)
  Integrate separately developed, established codes (FSP, climate modeling, space weather modeling), each component needing its own architecture
  Provide ways to better utilize hardware with high CPU counts and to combine the computing resources of multiple clusters/computing centers
  Enable parallel data streaming between a computing task and a post-processing task (no feedback to the solver)
  Integrate multiple parallel codes using heterogeneous architectures
Existing component standards and frameworks were designed with enterprise applications in mind
  No support for features that are important for HPC scientific applications: interoperability with scientific programming languages (Fortran) and parallel computing infrastructure (MPI)
CCA addresses the needs of HPC scientific applications: combustion modeling, global climate modeling, fusion and plasma simulations
Tasks
  Explore various distributed technologies and approaches for DPHPC
  Enhance tool support for DPHPC: F2003 struct support (covered later in Stefan's talk)

4
Typical Parallel CCA Frameworks
[Figure: MCMD and SCMD configurations]

Support both SPMD and MPMD scenarios
Stay out of the way of component parallelism
Components handle parallel communication

5
An Illustration of DPHPC Application
Alternative MCMD

Still support conventional CCA component-managed parallelism
Provide additional framework-mediated distributed inter-component communication capability

Cooperative Processing (LLNL), PACO (INRIA)
6
Agenda

Motivation for Distributed and Parallel
High-Performance Computing (DPHPC)
Enabling distributed technologies
Applications development

7
Babel RMI Allows Multiple Implementations

Babel generates the mapping for remote invocations and has its own transfer protocol, Simple Protocol, implemented in C
Thanks to Babel's open architecture and language interoperability, users can take advantage of various distributed technologies through third-party RMI libraries
We have developed a CORBA protocol library for Babel RMI using TAO (version 1.5.1 or later)
  The first third-party Babel RMI library
  TAO is a C++-based CORBA middleware framework
  This protocol is essentially a bridge between Babel and TAO

8
Using CORBA in Babel RMI Allows CORBA and Babel
Objects to Interoperate

Goal is to
  Allow interoperability between existing CORBA and Babel objects
  Retain the performance of the CORBA IIOP protocol
Possible approaches for serialization
  Encapsulating the Babel Simple Protocol wire format in a block of binary data and transporting it using CORBA (as an octet sequence)
  Encapsulating Babel communications in CORBA Any objects (not pursued because of the inefficiency of Any)
  Mapping Babel communications to the CORBA format directly (the adopted approach). CORBA uses the Common Data Representation (CDR) on the wire.
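
A minimal Python sketch of the difference between the first and third serialization approaches, under simplifying assumptions. The octet sequence is written as a 4-byte length followed by raw bytes, which follows the general CDR layout for sequences but ignores alignment and byte-order negotiation. The function names and the stand-in "Babel" payload are hypothetical and are not taken from Babel or TAO.

```python
import struct

def as_octet_sequence(blob: bytes) -> bytes:
    """Approach 1: carry already-serialized bytes as an opaque octet sequence.

    Simplified CDR-style layout: a 4-byte unsigned length, then the raw octets.
    """
    return struct.pack("<I", len(blob)) + blob

def write_dcomplex(z: complex) -> bytes:
    """Approach 3: write the value directly as two 8-byte doubles."""
    return struct.pack("<dd", z.real, z.imag)

def read_dcomplex(wire: bytes) -> complex:
    """A peer that knows the struct layout can decode the typed form itself."""
    real, imaginary = struct.unpack("<dd", wire)
    return complex(real, imaginary)

z = complex(1.5, -2.0)

# Hypothetical stand-in for a Babel Simple Protocol payload (not the real format).
pretend_babel_bytes = b"dcomplex:" + repr(z).encode("ascii")

opaque = as_octet_sequence(pretend_babel_bytes)  # receiver still needs a Babel decoder
typed = write_dcomplex(z)                        # receiver only needs the struct layout

print(len(opaque), len(typed))
print(read_dcomplex(typed))
```

In the opaque form, an existing CORBA object sees only bytes. In the typed form it sees two doubles, which is what allows CORBA and Babel objects to interoperate.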

9
Direct Conversions Between CORBA and Babel Types
Enable Interoperability with Little Penalty

    module taoiiop {
      module rmi {
        exception ServerException { string info; };
        struct fcomplex { float real; float imaginary; };
        struct dcomplex { double real; double imaginary; };

        /*
         * SIDL arrays are mapped to CORBA structs which keep all the
         * metadata information, and the array values are stored as a
         * CORBA sequence following the metadata.
         */
        typedef sequence<long> ArrayDims;
        struct Array_Metadata {
          short ordering;
          short dims;
          ArrayDims stride;
          ArrayDims lower;
          ArrayDims upper;
        };
      };
    };

After optimization, TaoIIOP 2.0 has performance close to that of a raw socket
Optimizations: made the CORBA-Babel mapping types native in TAO by implementing optimized, zero-copy versions of the marshaling and demarshaling support
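
The array mapping on this slide (metadata first, then the values as a flat sequence) can be sketched in Python. This is an illustration only: the field names follow the Array_Metadata struct, but the helper functions, the ordering code (0 for row-major), and the inclusive upper bounds are assumptions made for the example and are not taken from the TaoIIOP source.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

@dataclass
class ArrayMetadata:
    ordering: int      # assumed convention for this sketch: 0 = row-major
    dims: int          # number of dimensions
    stride: List[int]  # elements to skip per unit step in each dimension
    lower: List[int]   # lower index bound per dimension
    upper: List[int]   # upper index bound per dimension (assumed inclusive)

def pack_2d(rows: Sequence[Sequence[float]],
            lower: Tuple[int, int] = (0, 0)) -> Tuple[ArrayMetadata, List[float]]:
    """Split a 2-D array into metadata plus a flat row-major value sequence."""
    n_rows, n_cols = len(rows), len(rows[0])
    meta = ArrayMetadata(
        ordering=0,
        dims=2,
        stride=[n_cols, 1],
        lower=list(lower),
        upper=[lower[0] + n_rows - 1, lower[1] + n_cols - 1],
    )
    values = [v for row in rows for v in row]
    return meta, values

def element(meta: ArrayMetadata, values: Sequence[float], i: int, j: int) -> float:
    """Look up element (i, j) in the flat sequence using only the metadata."""
    offset = ((i - meta.lower[0]) * meta.stride[0]
              + (j - meta.lower[1]) * meta.stride[1])
    return values[offset]

meta, values = pack_2d([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(meta)
print(element(meta, values, 1, 2))
```

Because the metadata travels with the values, the receiving side can rebuild the array view without a second round trip.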

10
Agenda

Motivation for Distributed and Parallel
High-Performance Computing (DPHPC)
Enabling distributed technologies
Applications development

11
Leveraging Oneway and Asynchronous Calls to
Increase Application Parallelism
[Figure: two timelines, one for synchronous invocations and one for asynchronous/oneway invocations. In each, compute-bound tasks on the simulation cluster dump data and signal the data analysis running on a remote cluster.]
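
The two invocation styles contrasted on this slide can be sketched in Python. This is a minimal illustration, not the Babel RMI API: a thread pool stands in for the remote cluster, and compute_step and analyze are hypothetical placeholders for a solver time step and the remote data analysis.

```python
from concurrent.futures import ThreadPoolExecutor

def compute_step(step):
    """Stand-in for one compute-bound time step; returns the data to dump."""
    return [step * 10 + k for k in range(4)]

def analyze(data):
    """Stand-in for the data analysis done on the remote cluster."""
    return sum(data)

def run_synchronous(num_steps):
    """The solver waits for each analysis to finish before the next step."""
    results = []
    for step in range(num_steps):
        data = compute_step(step)
        results.append(analyze(data))  # blocks until the analysis returns
    return results

def run_asynchronous(num_steps):
    """The solver hands off each dump and immediately starts the next step."""
    with ThreadPoolExecutor(max_workers=2) as remote:
        pending = []
        for step in range(num_steps):
            data = compute_step(step)
            pending.append(remote.submit(analyze, data))  # returns at once
        # Collect results only after all time steps have been issued.
        return [future.result() for future in pending]

sync_results = run_synchronous(3)
async_results = run_asynchronous(3)
print(sync_results, async_results)
```

Both variants produce the same analysis results. The difference is that the asynchronous loop never waits on analyze between time steps. A oneway call would correspond to submitting the work and never collecting the result.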
12
Performance Comparison: TaoIIOP Async and Oneway Calls

Figure shows the average time for each time step
Very lightweight data analysis; emphasis on transport cost
A payload of 0 makes no actual remote invocation
The Babel team is working on a new RMI implementation

13
VORPAL is a Versatile Framework for Physics
Simulations

Highly flexible, arbitrary-dimension
Plasma and beam simulations using multiple models
Utilizes both MPI and parallel I/O
Uses a robust <init> file to configure a simulation task

    <grid Globalgrid>
      numPhysCells = [NX, NY, NZ]
      length = [LX, LY, LZ]
    </grid>
    <Decomp decomp>
      decompType = regular
    </Decomp>
    <EmField myemfield>
      kind = yeeEmField
    </EmField>
    <Species Electrons>
      kind = relBoris
      mass = ELEMACS
    </Species>
14
Componentize VORPAL to perform On-demand Data
Processing
15
DPHPC Application Speed-up for On-line Data
Analysis

We developed a prototype that performs online data analysis as a proof of concept
Ran on the same cluster as two groups of processors
A 20% speedup was observed
Greater speedup with more elaborate data processing

We modified the VORPAL source code separately for
this prototype
16
DPHPC Applications: Remote Monitoring/Steering of Simulations

We have extended the VORPAL component framework to interact with the CCA framework through Babel RMI
Configurable from VORPAL's initialization file:

    <History historyName>
      kind = historyKind
      <Sender mySender>
        kind = babelSender
        babelRmiURL = eclipse.txcorp.com:8081
      </Sender>
    </History>

Supports specification of a URL group: a list of URLs running parallel tasks

We are able to connect a running simulation to
one or multiple workstations
For online data processing/analysis
For monitoring simulation
Physicists are most interested in
Monitoring
Steering

    VpBabelSender connecting to taoiiophandle://quartic.txcorp.com:8081
    VpBabelSender endpoint URL taoiiophandle://quartic.txcorp.com:8081/1000
    VorpalClient constructor
    update 1 time=6.128014e-13
    update 2 time=1.225603e-12
17
Summary

Implemented the distributed proxy components and the TaoIIOP Babel RMI protocol for connecting distributed CCA applications into an integrated system
Conducted performance benchmarking on the preliminary prototype implementation (version 1.0) to identify the key optimizations needed
Implemented the optimizations to minimize the overhead (version 2.0)
Interoperability with CORBA can be achieved with little/no performance penalty

18
Summary and Future Directions

Interoperability with CORBA can be achieved with
little/no performance penalty
Implement more scenarios that mix distributed and high-performance components, involving several clusters and real applications
Synergy with MCMD
Support for petascale HPC applications
Remote monitoring/steering of large-scale simulations on supercomputers (e.g., Franklin)
Can take advantage of CORBA-Babel RMI interoperability for now and switch to TaoIIOP later
