Open TS: an Advanced Tool for Parallel and Distributed Computing

1
Open TS: an Advanced Tool for Parallel and
Distributed Computing
  • Program Systems Institute RAS, 2006-11-10..26
    (Tampa--Redmond, USA)

2
Presentation Outline
  • Short self-introduction
  • Open TS outline
  • MPI vs Open TS case study
  • Applications
  • Future work

3
Short Self-Introduction
4
PSI RAS, Pereslavl-Zalesski
5
SKIF Supercomputing Project
  • Joint project of the Russian Federation and the
    Republic of Belarus
  • 2000-2004
  • 10+10 organizations
  • PSI RAS is the lead organization from the Russian
    Federation
  • Hardware and Software

6
Flagship: SKIF K-1000
  • Peak performance: 2.5 Tflops
  • Linpack performance: 2.0 Tflops
  • Efficiency ratio: 80.1%
  • November 2004: the most powerful supercomputer
    in the ex-USSR, rank 98 in the Top500

7
Open TS Overview
8
T-System History
  • Mid-1980s: basic ideas of T-System
  • 1990s: first implementation of T-System
  • 2001-2002, SKIF: GRACE (Graph Reduction
    Applied to Cluster Environment)
  • 2003-current, SKIF: Open TS (Open T-System)

9
Comparison: T-System and MPI
[Diagram labels: Sequential, Parallel]
10
Related work
  • Parallel Programming Using C++ (Scientific and
    Engineering Computation), edited by Gregory V.
    Wilson and Paul Lu: ABC++, Amelia, CC++,
    CHAOS++, COOL, C++//, ICC++, Mentat, MPC++,
    MPI++, pC++, POOMA, TAU, UC++

11
T-System in Comparison
Related work               Open TS differentiator
Charm++                    FP-based approach
UPC, mpC                   Implicit parallelism
Glasgow Parallel Haskell   Allows C/C++ based low-level optimization
OMPC++                     Provides both language and C++ templates library
Cilk                       Supports SMP, MPI, PVM, and GRID platforms
12
Open TS: an Outline
  • High-performance computing
  • Automatic dynamic parallelization
  • Combining functional and imperative approaches,
    high-level parallel programming
  • T++ language: a parallel dialect of C++, an
    approach popular in the 1990s

13
T-Approach
  • Invocations of pure functions (T-functions) produce
    grains of parallelism
  • A T-program is functional at the higher level and
    imperative at the low level (optimization)
  • C-compatible execution model
  • Non-ready variables, multiple assignment
  • Seamless C-extension (or Fortran-extension)

14
T++ Keywords
  • tfun: T-function
  • tval: T-variable
  • tptr: T-pointer
  • tout: output parameter (like &)
  • tdrop: make ready
  • twait: wait for readiness
  • tct: T-context

15
Sample Program
  #include <stdio.h>
  #include <stdlib.h>  /* atoi */
  tfun int fib (int n) {
    return n < 2 ? n : fib(n-1) + fib(n-2);
  }
  tfun int main (int argc, char *argv[]) {
    if (argc != 2) { printf("Usage: fib <n>\n"); return 1; }
    int n = atoi(argv[1]);
    printf("fib(%d) = %d\n", n, (int)fib(n));
    return 0;
  }
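
The T++ sample needs the Open TS toolchain to build. As a rough analogy only, the sketch below expresses the same recursion in standard C++, with std::future standing in for a non-ready value: the call to get() is the point where the caller waits for readiness. The names fib_seq, fib_par, and spawn_depth are hypothetical and not part of Open TS. The cutoff exists only because std::async starts OS threads, whereas the deck describes Open TS as using lightweight threads. Compiling may require -pthread on some toolchains.

```cpp
#include <cassert>
#include <future>

// Plain sequential Fibonacci, used once the spawn cutoff is reached.
long fib_seq(int n) { return n < 2 ? n : fib_seq(n - 1) + fib_seq(n - 2); }

// Runs fib(n-1) as a separate task while fib(n-2) is computed in the caller.
// The future is an analogy for a non-ready value: it becomes usable at get().
// spawn_depth (hypothetical parameter) bounds how many levels start new tasks.
long fib_par(int n, int spawn_depth = 3) {
    if (n < 2) return n;
    if (spawn_depth <= 0) return fib_seq(n);
    std::future<long> left =
        std::async(std::launch::async, fib_par, n - 1, spawn_depth - 1);
    long right = fib_par(n - 2, spawn_depth - 1);
    return left.get() + right;  // wait for the spawned task, then combine
}
```

With the default cutoff of 3, one top-level call starts at most seven tasks. The result is the same as the sequential version for any n.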

16
Open TS Environment
17
Open TS Runtime
  • Three-tiered architecture (T, M, S)
  • Design: microkernel, 10 extensions currently
  • Supermemory
  • Lightweight threads
  • DMPI: Dynamic MPI
  • auto selection of MPI implementation
  • dynamic loading and linking

18
Supermemory
  • Object-Oriented Distributed shared memory (OO
    DSM)
  • Global address space
  • Cell versioning

19
Multithreading & Communications
  • Lightweight threads
  • PIXELS (1 000 000 threads)
  • Asynchronous communications
  • Thread A requests a non-ready value (or a new job)
  • An asynchronous request is sent: active messages
    and signal delivery over the network stimulate
    data transfer to thread A
  • Context switches (including a quantum for
    communications)
  • Latency hiding for node-node exchange
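
The latency-hiding idea on this slide can be illustrated in standard C++: issue the request, keep doing independent work, and block only where the value is actually needed. This is a minimal sketch under stated assumptions, not the Open TS mechanism, which the slide describes in terms of lightweight threads and active messages. The names fetch_remote_value, local_work, and overlap_request_with_work are hypothetical, and the sleep merely models network delay.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Stand-in for a value owned by another node; the sleep models network latency.
int fetch_remote_value() {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return 42;
}

// Local work that does not depend on the remote value.
int local_work() {
    int sum = 0;
    for (int i = 1; i <= 1000; ++i) sum += i;
    return sum;
}

// Issue the request, keep computing, and wait only when the value is required.
int overlap_request_with_work() {
    std::future<int> pending =
        std::async(std::launch::async, fetch_remote_value);
    int local = local_work();      // runs while the request is in flight
    return local + pending.get();  // first point that needs the remote value
}
```

If the local work takes at least as long as the request, the caller never observes the request latency.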

20
DMPI
  • Dynamic MPI
  • auto selection of MPI implementation
  • dynamic loading and linking
  • Seven implementations of MPI supported now
  • LAM
  • MPICH
  • SCALI MPI
  • MVAPICH
  • IMPI
  • MPICH-G2
  • PACX-MPI
  • even PVM can be used instead of MPI

21
Debugging: WAD, LTDB
22
Statistics Gathering
23
Message Tracing
24
Open TS: Applying to Distributed Computing
  • Meta-cluster messaging support (MPICH-G2, IMPI,
    PACX-MPI)
  • Customizable scheduling strategies (network
    topology information used)

25
NPB, Test EP Rewritten @OpenTS
  • EP: Embarrassingly Parallel
  • NAS Parallel Benchmarks suite (NASA)
  • Speedup: 96% of theoretical maximum (on 10 nodes)

[Charts: efficiency and time (% of sequential) versus Nproc]
26
Open TS vs MPI case study
27
Applications
  • Popular and widely used
  • Developed by independent teams (MPI experts)
  • PovRay: Persistence of Vision Ray-tracer,
    enabled for parallel run by a patch
  • ALCMD/MP_Lite: molecular dynamics package (Ames
    Lab)

28
T-PovRay vs MPI PovRay: code complexity
Program                                          Source code volume
MPI modules for PovRay 3.10g                     1,500 lines
MPI patch for PovRay 3.50c                       3,000 lines
T++ modules (for both versions, 3.10g & 3.50c)   200 lines
29
T-PovRay vs MPI PovRay: performance
16 dual Athlon 1800+ nodes (AMD Athlon MP 1800+, RAM
1 GB), FastEthernet, LAM 7.0.6
30
T-PovRay vs MPI PovRay: performance
2 CPUs AMD Opteron 248 2.2 GHz, RAM 4 GB, GigE, LAM
7.1.1
31
ALCMD/MPI vs ALCMD/OpenTS
  • MP_Lite component of ALCMD rewritten in T++
  • Fortran code is left intact

32
ALCMD/MPI vs ALCMD/OpenTS: code complexity
Program                          Source code volume
MP_Lite total/MPI                20,000 lines
MP_Lite, ALCMD-related/MPI       3,500 lines
MP_Lite, ALCMD-related/OpenTS    500 lines
33
ALCMD/MPI vs ALCMD/OpenTS: performance
16 dual Athlon 1800+ nodes (AMD Athlon MP 1800+, RAM
1 GB), FastEthernet, LAM 7.0.6, Lennard-Jones MD,
512000 atoms
34
ALCMD/MPI vs ALCMD/OpenTS: performance
2 CPUs AMD Opteron 248 2.2 GHz, RAM 4 GB, GigE, LAM
7.1.1, Lennard-Jones MD, 512000 atoms
35
ALCMD/MPI vs ALCMD/OpenTS: performance
2 CPUs AMD Opteron 248 2.2 GHz, RAM 4 GB,
InfiniBand, MVAPICH 0.9.4, Lennard-Jones
MD, 512000 atoms
36
Porting OpenTS to MS Windows CCS
37
2006 contract with Microsoft: "Porting OpenTS to
Windows Compute Cluster Server"
  • OpenTS@WinCCS
  • inherits all basic features of the original Linux
    version
  • is available under FreeBSD license
  • does not require any commercial compiler for
    T-program development: it is enough to
    install Visual C++ 2005 Express Edition (available
    for free on the Microsoft website) and the PSDK

38
OpenTS@WinCCS
  • AMD64 and x86 platforms are currently supported
  • Integration into Microsoft Visual Studio 2005
  • Two ways of building T-applications: command
    line and Visual Studio IDE
  • A unified installer of OpenTS for Windows
    XP/2003/WCCS
  • Installation of WCCS SDK (including MS-MPI), if
    necessary
  • OpenTS testing procedure

39
OpenTS integration into Microsoft Visual Studio
2005
40
Installer of OpenTS for Windows XP/2003/WCCS
41
Open TS applications
42
T-Applications
  • MultiGen: biological activity estimation
  • Remote sensing applications
  • Plasma modeling
  • Protein simulation
  • Aeromechanics
  • Query engine for XML
  • AI-applications
  • etc.

43
MultiGen: Chelyabinsk State University
[Diagram: multi-conformation model with Level 0, Level 1, and Level 2]
44
MultiGen Speedup
National Cancer Institute USA, Reg. No. NCI-609067 (AIDS drug lead)
National Cancer Institute USA, Reg. No. NCI-641295 (AIDS drug lead)
TOSLAB company (Russia-Belgium), Reg. No. TOSLAB A2-0261 (antiphlogistic drug lead)
Substance        Atom number  Rotations number  Conformers  Execution time (min.?)
                                                            1 node   4 nodes  16 nodes
NCI-609067       28           4                 13          933      321      122
TOSLAB A2-0261   82           18                49          11527    3923     1609
NCI-641295       126          25                74          26619    9557     3448
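
Timings like those in the table are usually summarized as speedup (one-node time divided by n-node time) and parallel efficiency (speedup divided by the node count). The helpers below are a small sketch of that arithmetic; the names speedup and efficiency are hypothetical. The unit marker in the table header is damaged in this transcript, so the example in the comments uses made-up timings rather than the table's figures. Any consistent time unit works.

```cpp
#include <cassert>
#include <cmath>

// Speedup: how many times faster the n-node run is than the one-node run.
// Both times must be in the same unit.
double speedup(double t_one_node, double t_n_nodes) {
    return t_one_node / t_n_nodes;
}

// Parallel efficiency: the fraction of ideal linear speedup achieved.
double efficiency(double t_one_node, double t_n_nodes, int nodes) {
    return speedup(t_one_node, t_n_nodes) / nodes;
}

// Hypothetical timings: 120 s on 1 node and 40 s on 4 nodes give
// speedup(120, 40) == 3.0 and efficiency(120, 40, 4) == 0.75.
```

An efficiency of 1.0 would mean perfectly linear scaling. Values below that reflect communication, scheduling, and load-imbalance costs.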
45
Aeromechanics: Institute of Mechanics, MSU
46
Belocerkovski's approach
The flow is presented as a collection of
small elementary whirlwinds (colours:
clockwise and counter-clockwise rotation)
47
Creating space-borne radar image from hologram
48
Simulating broadband radar signal
  • Graphical User Interface
  • Non-PSI RAS development team (Space research
    institute of Khrunichev corp.)

49
Landsat Image Classification
  • Computational web-service

50
Future Work
  • Multi-core CPU support
  • Distributed computing
  • Schedulers
  • Transport
  • Interface to web-services
  • Fault-tolerance
  • Optimizing for modern CPUs
  • Algorithmic skeletons, patterns and high level
    parallel libraries

51
Out of Presentation Scope
  • Other T-languages: T-Refal, T-Fortran
  • Memoization
  • Automatically choosing between call-style and
    fork-style of function invocation
  • Checkpointing
  • Heartbeat mechanism
  • Flavours of data references: normal, glue, and
    magnetic; lazy, eager, and ultra-eager
    (speculative) data transfer

52
ACKNOWLEDGEMENTS
  • SKIF supercomputing project
  • Russian Academy of Sciences grants
  • Program "High-performance computing systems on
    new principles of computational process
    organization"
  • Program of the Presidium of the Russian Academy of
    Sciences "Development of basics for implementation
    of distributed scientific informational-computational
    environment on GRID technologies"
  • Russian Foundation for Basic Research
    05-07-08005-???_?
  • Microsoft
  • 2005 contract for Open TS vs MPI case study
  • 2006 contract for Porting OpenTS to MS Windows
    CCS

53
THANKS
  • ANY QUESTIONS ???