Flexibility and Interoperability in a Parallel MD code - PowerPoint PPT Presentation

Provided by: laxmika
Learn more at: http://charm.cs.uiuc.edu

1
Flexibility and Interoperability in a Parallel MD code
  • Robert Brunner,
  • Laxmikant Kale,
  • Jim Phillips
  • University of Illinois at Urbana-Champaign

2
Contributors
  • Principal investigators
  • Laxmikant Kale, Klaus Schulten, Robert Skeel
  • Development team
  • Milind Bhandarkar, Robert Brunner, Attila Gursoy,
    Neal Krawetz, Ari Shinozaki, ...

3
Middle layers
  • Applications
  • Middle layers: languages, tools, libraries
  • Parallel machines
5
Molecular Dynamics
  • Collection of charged atoms, with bonds
  • Newtonian mechanics
  • At each time-step:
  • Calculate forces on each atom
  • Bonds
  • Non-bonded: electrostatic and van der Waals
  • Calculate velocities and advance positions
  • 1 femtosecond time-step, millions needed!
  • Thousands of atoms (1,000 - 100,000)
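
The per-time-step loop above can be sketched in a few lines. This is a minimal illustration, not the NAMD integrator: velocity Verlet is an assumed integration scheme (the slides do not name one), each atom has a single scalar coordinate for brevity, and `md_step`, `spring_force`, and `run` are hypothetical names.

```python
def md_step(pos, vel, masses, force_fn, dt):
    """Advance all atoms by one time-step (velocity Verlet, an assumed scheme)."""
    f_old = force_fn(pos)                       # forces at current positions
    new_pos = [x + v * dt + 0.5 * (f / m) * dt * dt
               for x, v, f, m in zip(pos, vel, f_old, masses)]
    f_new = force_fn(new_pos)                   # forces at advanced positions
    new_vel = [v + 0.5 * (fo + fn) / m * dt
               for v, fo, fn, m in zip(vel, f_old, f_new, masses)]
    return new_pos, new_vel


def spring_force(pos, k=1.0):
    """Toy force field: every atom is tied to the origin by a spring."""
    return [-k * x for x in pos]


def run(n_steps, dt=0.01):
    """Run a one-atom toy system for n_steps time-steps."""
    pos, vel, masses = [1.0], [0.0], [1.0]
    for _ in range(n_steps):
        pos, vel = md_step(pos, vel, masses, spring_force, dt)
    return pos, vel
```

A real simulation replaces `spring_force` with the bonded and non-bonded terms, which is where nearly all of the time goes.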

7
Further MD
  • Use of cut-off radius to reduce work
  • 8 - 14 Å
  • Faraway charges ignored!
  • 80-95% of work is non-bonded force computations
  • Some simulations need faraway contributions
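
A cut-off radius can be illustrated with a plain pairwise loop. This is a sketch under stated assumptions: only the Coulomb term is shown, units are chosen so the Coulomb constant is 1, and `coulomb_forces_with_cutoff` is a hypothetical name. The loop still tests every pair's distance; production codes avoid that with spatial binning.

```python
import math


def coulomb_forces_with_cutoff(positions, charges, cutoff):
    """Return (forces, pairs_evaluated); pairs beyond the cutoff are skipped."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    pairs_evaluated = 0
    for i in range(n):
        for j in range(i + 1, n):
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            if r2 > cutoff * cutoff:
                continue                      # faraway charge: ignored
            pairs_evaluated += 1
            r = math.sqrt(r2)
            scale = charges[i] * charges[j] / (r2 * r)   # q_i q_j / r^3
            for k in range(3):
                forces[i][k] += scale * d[k]  # like charges push i away from j
                forces[j][k] -= scale * d[k]  # equal and opposite force on j
    return forces, pairs_evaluated
```

With atoms at x = 0, 1 and 20 and a cutoff of 12, only the first pair is evaluated; the third atom feels no force at all, which is exactly the approximation the slide warns about.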

8
NAMD Design Objectives
  • Performance
  • Scalability
  • To a small and large number of processors
  • small and large molecular systems
  • Modifiable and extensible design
  • Ability to incorporate new algorithms
  • Reusing new libraries without re-implementation
  • Experimenting with alternate strategies

9
Force Decomposition
  • Distribute force matrix to processors
  • Matrix is sparse, non-uniform
  • Each processor has one block
  • Communication: N/sqrt(P)
  • Ratio: sqrt(P)
  • Better scalability (can use 100 processors)
  • Hwang, Saltz, et al.: 6 on 32 PEs, 36 on 128 processors
  • Not scalable
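
The sqrt(P) ratio can be checked with simple arithmetic. The sketch below assumes that per-processor computation is proportional to N/P (this is not stated on the slide) and takes the N/sqrt(P) communication term from the slide; `force_decomposition_costs` is a hypothetical helper and all constants are set to 1.

```python
import math


def force_decomposition_costs(n_atoms, n_procs):
    """Return (communication, computation, ratio) per processor, constants dropped."""
    communication = n_atoms / math.sqrt(n_procs)  # from the slide: N/sqrt(P)
    computation = n_atoms / n_procs               # assumption: work splits evenly
    return communication, computation, communication / computation
```

Because the ratio grows as sqrt(P), communication takes a larger share of each step as processors are added, which is why the approach stops scaling.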
10
Spatial Decomposition
11
Spatial decomposition modified
12
Implementation
  • Multiple Objects per processor
  • Different types: patches, pairwise forces, bonded
    forces, ...
  • Each may have its data ready at different times
  • Need ability to map and remap them
  • Need prioritized scheduling
  • Charm++ supports all of these
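
The point that each object "may have its data ready at different times" can be illustrated with an object that fires only once all of its inputs have arrived. This is a plain-Python sketch, not Charm++ code; `PairCompute`, `deposit`, and `do_work` are hypothetical names.

```python
class PairCompute:
    """Hypothetical pairwise-force object that depends on several patches."""

    def __init__(self, patch_ids):
        self.needed = set(patch_ids)   # patches whose coordinates we wait for
        self.coords = {}               # coordinates received so far
        self.runs = 0                  # how many times the work has fired

    def deposit(self, patch_id, coords):
        """Called whenever one patch's coordinates arrive, in any order."""
        self.coords[patch_id] = coords
        if self.needed.issubset(self.coords):
            self.do_work()

    def do_work(self):
        """Placeholder for the force calculation; then reset for the next step."""
        self.runs += 1
        self.coords.clear()
```

Nothing blocks while waiting: a processor holding many such objects simply runs whichever one becomes ready next.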

13
Charm++
  • Data Driven Objects
  • Object Groups
  • global object with a representative on each PE
  • Asynchronous method invocation
  • Prioritized scheduling
  • Mature, robust, portable
  • http://charm.cs.uiuc.edu

14
Data driven execution
[Figure: schedulers, each with its own message queue (Message Q)]
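
Data-driven execution with a scheduler and a message queue can be sketched as follows. This is an illustration in plain Python, not the Charm++ runtime: the class and method names are hypothetical, and the convention that a lower number means higher priority is an assumption.

```python
import heapq
import itertools


class Scheduler:
    """Toy per-processor scheduler: pick the best message, invoke its method."""

    def __init__(self):
        self._queue = []
        self._counter = itertools.count()   # tie-breaker: FIFO among equal priorities

    def send(self, obj, method, *args, priority=0):
        """Asynchronous invocation: enqueue a message and return immediately."""
        heapq.heappush(self._queue,
                       (priority, next(self._counter), obj, method, args))

    def run(self):
        """Deliver messages until the queue is empty."""
        while self._queue:
            _, _, obj, method, args = heapq.heappop(self._queue)
            getattr(obj, method)(*args)


class Logger:
    """Toy target object that records the order in which messages arrive."""

    def __init__(self):
        self.seen = []

    def note(self, text):
        self.seen.append(text)
```

Usage: sending `"bulk"` at priority 5 and then `"urgent"` at priority 1 delivers `"urgent"` first, even though it was sent later.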
15
Object oriented design
  • Two top level classes
  • Patches: cubes containing atoms
  • Computes: force calculation
  • Home patches and Proxy patches
  • Home patch sends coordinates to proxies, and
    receives forces from them
  • Each compute interacts with local patches only
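
The home/proxy exchange described above might look like this in outline. It is a single-process sketch with hypothetical class names; in the real code the proxies live on other processors and the calls are messages. Coordinates and forces are scalars per atom for brevity.

```python
class ProxyPatch:
    """Local stand-in for a patch whose home is elsewhere."""

    def __init__(self):
        self.coords = []
        self.forces = []

    def receive_coords(self, coords):
        self.coords = list(coords)
        self.forces = [0.0] * len(coords)   # fresh accumulator each step


class HomePatch:
    """Owns the atoms; sends coordinates out and collects forces back."""

    def __init__(self, coords):
        self.coords = list(coords)
        self.proxies = []

    def send_coords(self):
        for proxy in self.proxies:
            proxy.receive_coords(self.coords)

    def gather_forces(self):
        total = [0.0] * len(self.coords)
        for proxy in self.proxies:
            for i, f in enumerate(proxy.forces):
                total[i] += f
        return total


def toy_compute(proxy, k):
    """A compute touches only its local proxy: spring force with constant k."""
    for i, x in enumerate(proxy.coords):
        proxy.forces[i] += -k * x
```

Each compute sees only the patch copy on its own processor, so the compute code needs no knowledge of where the home patch lives.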

16
Compute hierarchy
  • Many compute subclasses
  • Allow reuse of coordination code
  • Reuse of bookkeeping tasks
  • Easy to add new types of force objects
  • Example: steered molecular dynamics
  • Implementor focuses on the new force functionality
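
The reuse argument can be shown with a small class hierarchy. This is a sketch with hypothetical names, not the NAMD class layout: the base class owns the bookkeeping, and a new force type overrides one method. The steering force here is a toy spring toward a fixed target, chosen only to keep the example short.

```python
class Compute:
    """Base class: coordination and bookkeeping shared by all force objects."""

    def __init__(self, patches):
        self.patches = patches   # each patch: {"coords": [...], "forces": [...]}

    def do_work(self):
        for patch in self.patches:
            forces = self.compute_forces(patch["coords"])
            for i, f in enumerate(forces):
                patch["forces"][i] += f    # accumulate into the patch

    def compute_forces(self, coords):
        raise NotImplementedError


class SteeringCompute(Compute):
    """New force type: only the force formula is written by the implementor."""

    def __init__(self, patches, target, k):
        super().__init__(patches)
        self.target = target
        self.k = k

    def compute_forces(self, coords):
        return [-self.k * (x - self.target) for x in coords]
```

Adding another force type means adding another subclass; the loop over patches and the accumulation step are inherited unchanged.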

17
Multi-paradigm programming
  • Long-range electrostatic interactions
  • Some simulations require this feature
  • Contributions of faraway atoms can be computed
    infrequently
  • PVM based library, DPMTA
  • Developed at Duke, by John Board, et al
  • Patch life cycle
  • better expressed as a thread
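
Computing faraway contributions infrequently amounts to caching the long-range term and refreshing it every few steps. The sketch below does not use DPMTA or PVM; `run_steps` and its parameters are hypothetical, and forces are reduced to one number per step to keep the idea visible.

```python
def run_steps(n_steps, long_range_interval, short_range_fn, long_range_fn):
    """Return (total force per step, number of long-range evaluations)."""
    cached_long = None
    long_evals = 0
    totals = []
    for step in range(n_steps):
        if step % long_range_interval == 0:
            cached_long = long_range_fn(step)   # expensive, done rarely
            long_evals += 1
        totals.append(short_range_fn(step) + cached_long)  # cheap, every step
    return totals, long_evals
```

With 10 steps and an interval of 4, the long-range function runs on steps 0, 4 and 8 only, while the short-range function runs on all 10.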

18
Converse
  • Supports multi-paradigm programming
  • Provides portability
  • Makes it easy to implement RTS for new paradigms
  • Several languages/libraries
  • Charm++, threaded MPI, PVM, Java, md-perl, pC++,
    Nexus, Path, Cid, CC++, ...

19
Namd2 with Converse
20
Separation of concerns
  • Different developers, with different interests
    and knowledge, can contribute effectively
  • Separation of communication and parallel logic
  • Threads to encapsulate life-cycle of patches
  • Adding new integrator, improving performance, new
    MD ideas, can be performed modularly and
    independently
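
Expressing a patch's life cycle as a thread lets the sequence "send coordinates, wait for forces, integrate" read as straight-line code. The sketch below uses a Python generator as a stand-in for a user-level thread; that substitution, and the names `patch_life_cycle` and `drive`, are assumptions made for illustration.

```python
def patch_life_cycle(n_steps, log):
    """One patch's whole life, written as sequential code that suspends."""
    for step in range(n_steps):
        log.append(("send_coords", step))
        forces = yield                     # suspend until forces arrive
        log.append(("integrate", step, forces))


def drive(n_steps, forces_per_step):
    """Toy runtime: resume the patch each time its forces are ready."""
    log = []
    patch = patch_life_cycle(n_steps, log)
    next(patch)                            # run up to the first suspension
    for forces in forces_per_step:
        try:
            patch.send(forces)             # wake the patch with its forces
        except StopIteration:
            break                          # life cycle finished
    return log
```

Without a thread, the same logic has to be split into separate callbacks with the step counter stored by hand between them.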

21
Load balancing
  • Collect timing data for several cycles
  • Run heuristic load balancer
  • Several alternative ones
  • Re-map and migrate objects accordingly
  • Registration mechanisms facilitate migration
  • Needs a separate talk!
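
One possible heuristic for the "run heuristic load balancer" step is a greedy assignment: take objects from heaviest to lightest and give each to the currently least-loaded processor. The slides do not say which heuristics are used, so this choice and the name `greedy_balance` are assumptions.

```python
import heapq


def greedy_balance(object_loads, n_procs):
    """Map measured object loads to processors; return (assignment, proc_loads)."""
    heap = [(0.0, p) for p in range(n_procs)]   # (current load, processor)
    heapq.heapify(heap)
    assignment = {}
    heaviest_first = sorted(object_loads.items(), key=lambda kv: -kv[1])
    for obj, load in heaviest_first:
        total, proc = heapq.heappop(heap)        # least-loaded processor
        assignment[obj] = proc
        heapq.heappush(heap, (total + load, proc))
    proc_loads = [0.0] * n_procs
    for obj, proc in assignment.items():
        proc_loads[proc] += object_loads[obj]
    return assignment, proc_loads
```

The input would be the timing data collected over several cycles; the output mapping tells the runtime which objects to migrate.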

22
Performance: size of system
23
Performance: various machines
24
Speedup
25
Conclusion
  • Multi-domain decomposition works well for
    dynamically evolving, or irregular apps
  • When supported by data driven objects (Charm++),
    user level threads, callbacks
  • Multi-paradigm programming is effective!
  • Object oriented parallel programming
  • promotes reuse,
  • good performance
  • Measurement based load balancing