Computational issues in Carbon nanotube simulation - PowerPoint PPT Presentation

About This Presentation
Title:

Computational issues in Carbon nanotube simulation

Description:

... of each atom. Force computations. Bond angles. Pair interactions ... For each bond, compute forces due to that bond, angles, and dihedrals. Finer grained ... – PowerPoint PPT presentation

Number of Views:56
Avg rating:3.0/5.0
Slides: 24
Provided by: asri9
Learn more at: http://www.cs.fsu.edu
Category:

less

Transcript and Presenter's Notes

Title: Computational issues in Carbon nanotube simulation


1
Computational issues in Carbon nanotube simulation
  • Ashok Srinivasan
  • Department of Computer Science
  • Florida State University

2
Outline
  • Background
  • Sequential computation
  • Performance analysis
  • Parallelization
  • Shared memory
  • Message passing
  • Load balancing
  • Communication reduction
  • Research issues

3
Background
  • Uses of Carbon nanotubes
  • Materials
  • NEMS
  • Transistors
  • Displays
  • Etc
  • www.ipt.arc.nasa.gov

4
Sequential computation
  • Molecular dynamics, using Brenners potential
  • Short-range interactions
  • Neighbors can change dynamically during the
    course of the simulation
  • Computational scheme
  • Find force on each particle due to interactions
    with close neighbors
  • Update position and velocity of each atom

5
Force computations
  • Pair interactions
  • Bond angles
  • Dihedral
  • Multibody

6
Performance analysis
7
Profile of execution time
  • 1 Force
  • 2 Neighbor list
  • 3 Predictor/corrector
  • 4 Thermostating
  • 5 Miscellaneous

8
Profile for force computations
9
Parallelization
  • Shared memory
  • Message passing
  • Load balancing
  • Communication reduction

10
Shared memory parallelization
  • Do each of the following loops in parallel
  • For each atom
  • Update forces due to atom i
  • If neighboring atoms are owned by other threads,
    update an auxiliary array
  • For each thread
  • Collect force terms for atoms it owns
  • Srivastava, et al, SC-97 and CSE 2001
  • Simulated 105 to 107 atoms
  • Speedup around 16 on 32 processors
  • Include long-range forces too

Lexical decomposition
11
Message passing parallelization
  • Decompose domain into cells
  • Each cell contains its atoms
  • Assign a set of adjacent cells to each processor
  • Each processor computes values for its cells,
    communicating with neighbors when their data is
    needed
  • Caglar and Griebel, World scientific, 1999
  • Simulated 108 atoms on up to 512 processors
  • Linear speedup for 160,000 atoms on 64 processors

12
Load balancing
  • Atom based decomposition
  • For each atom, compute forces due to each bond,
    angle, and dihedral
  • Load not balanced

13
Load balancing ... 2
  • Bond based decomposition
  • For each bond, compute forces due to that bond,
    angles, and dihedrals
  • Finer grained
  • Load still not
  • balanced!

14
Load balancing ... 3
  • Load imbalance was not caused by granularity
  • Symmetry is used to reduce calculations through
  • If i gt j, dont compute for bond (i,j)
  • So threads get unequal load
  • Change condition to
  • If ij is even, dont compute bond (i,j) if i gt j
  • If ij is odd, dont compute bond (i,j) if i lt j
  • Does not work, due to regular structure of
    nanotube
  • Use a different condition to balance load

15
Load balancing ... 4
  • Load is much better balanced now
  • ... at least for this simple configuration

16
Locality
  • Locality important to reduce cache misses
  • Current scheme based on lexical ordering
  • Alternate Decompose based on a breadth first
    search traversal of the atom-interaction graph

17
Locality ... 2
18
Research issues
  • Neighbor search
  • Parallelization
  • Dynamic load balancing
  • Communication reduction
  • Multi-scale simulation of nano-composites

19
Neighbor search
  • Neighbor lists
  • Crude algorithm
  • Compare each pair, and determine if they are
    close enough
  • O(N2) for N atoms
  • Cell based algorithm
  • Divide space into cells
  • Place atoms in their respective cells
  • Compare atoms only in neighboring cells
  • Problem
  • Many empty cells
  • Inefficient use of memory

20
Computational geometry techniques
  • Orthogonal search data structures
  • K-d tree
  • Tree construction time O(N log N)
  • Worst case search overhead O(N2/3)
  • Memory O(N)
  • Range tree
  • Tree construction time O(N log2N)
  • Worst case search overhead O(log2N)
  • Memory O(N log2N)

21
Desired properties of search techniques
  • Update should be efficient
  • But the number of atoms does not change
  • Position changes only slightly
  • The queries are known too
  • May be able to use knowledge of the structure of
    the nanotube
  • Parallelization

22
Parallelization
  • Load balancing and locality
  • Better graph based techniques
  • Geometric partitioning
  • Dynamic schemes
  • Use structure of the tube
  • Stochastic versions of
  • Spectral partitioning
  • Diffusive schemes

23
Multi-scale simulation of nano-composites
  • Molecular dynamics at nano-scale
  • Finite element method at larger scale
  • New models to links the scales
  • New algorithms to dynamically balance the loads,
    while minimizing communication
  • Latency tolerance and scalability to large
    numbers of processors
Write a Comment
User Comments (0)
About PowerShow.com