Preliminary Look

Slides: 10
Provided by: katherin96
Learn more at: https://www.mcs.anl.gov

Transcript and Presenter's Notes

1
Preliminary Look
  • Steven Pieper's
  • Quantum Monte Carlo Code

2
What is the code? (As far as I can tell)
  • Quantum Monte Carlo
  • Generate input file
  • Chomp (all the time)
  • Solving wave equation/minimizing potentials
  • Send minimized nuclei to master
  • Possibly redistribute work
  • Synchronize
  • Lots of repetition

3
Help from the source
  • Profiling pre-exists
  • Time, FLOPS, memory
  • Memory profiling is IBM/Mac only
  • FLOP counts are kept in the code
  • MPI
  • Tracks the amount communicated in messages
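
The slide says the code already carries its own profiling for time and FLOP counts. A minimal sketch of how such hand-counted instrumentation might look is given here. The `SectionCounter` class, its field names, and the `timed_dot` helper are hypothetical and are not taken from the actual source; memory and MPI message tracking are left out.

```python
import time
from dataclasses import dataclass


@dataclass
class SectionCounter:
    """Hypothetical per-section tally: calls, wall seconds, hand-counted flops."""
    cases: int = 0
    seconds: float = 0.0
    flops: int = 0

    def add(self, seconds, flops):
        """Record one execution of the instrumented section."""
        self.cases += 1
        self.seconds += seconds
        self.flops += flops

    def report(self):
        """Derive per-case and rate figures; all zeros if nothing was timed."""
        if self.cases == 0 or self.seconds == 0.0:
            return {"cases": self.cases, "total_min": 0.0,
                    "msec_per_case": 0.0, "flop_per_case": 0.0, "mflops": 0.0}
        return {
            "cases": self.cases,
            "total_min": self.seconds / 60.0,
            "msec_per_case": 1000.0 * self.seconds / self.cases,
            "flop_per_case": self.flops / self.cases,
            "mflops": self.flops / self.seconds / 1e6,
        }


def timed_dot(a, b, counter):
    """Dot product that charges one multiply and one add per element."""
    start = time.perf_counter()
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    counter.add(time.perf_counter() - start, 2 * len(a))
    return total
```

The flop count is supplied by the programmer rather than measured, which matches the "counts in code" wording on the slide; a hardware-counter approach would be a different technique.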

4
How does it perform?
  • Computation takes 80-95% of wall-clock time
  • MPI is very simple
  • One whole nucleus per proc
  • Memory Limitation
  • Keeps MPI costs very low
  • i.e., can run on Ethernet
  • So, it scales quite well.

5
Basic Algorithm
  • Init fbn wave function
  • Init some positions (randomly)
  • Init wave functions and probability density
  • Propagate time/synchronize procs
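
The steps above can be illustrated with a toy, single-process sketch. It is not the author's nuclear code: it assumes a one-dimensional harmonic oscillator (exact ground-state energy 0.5 in units where hbar = m = omega = 1), uses plain diffusion Monte Carlo without a trial wave function, and every name in it (`potential`, `run_toy_dmc`, the parameters) is hypothetical. The reference-energy update marks where a parallel code would need a global reduction across processes.

```python
import math
import random


def potential(x):
    """Toy potential: 1D harmonic oscillator, exact ground-state energy 0.5."""
    return 0.5 * x * x


def run_toy_dmc(n_walkers=200, n_steps=1500, dt=0.01, seed=0):
    """Return a ground-state energy estimate from a simple diffusion Monte Carlo."""
    rng = random.Random(seed)
    # Init some positions (randomly).
    walkers = [rng.uniform(-1.0, 1.0) for _ in range(n_walkers)]
    e_ref = sum(potential(x) for x in walkers) / len(walkers)
    sigma = math.sqrt(dt)
    samples = []
    for step in range(n_steps):
        next_walkers = []
        for x in walkers:
            # Propagate in imaginary time: free diffusion step ...
            x_new = x + rng.gauss(0.0, sigma)
            # ... followed by a weight from the potential relative to e_ref.
            v_avg = 0.5 * (potential(x) + potential(x_new))
            weight = math.exp(-dt * (v_avg - e_ref))
            # Branching: keep 0, 1, or more copies with mean equal to weight.
            copies = int(weight + rng.random())
            next_walkers.extend([x_new] * copies)
        walkers = next_walkers
        # Global step (a reduction in a parallel run): update the reference
        # energy so the population stays near its target size.
        mean_v = sum(potential(x) for x in walkers) / len(walkers)
        e_ref = mean_v - math.log(len(walkers) / n_walkers)
        # Discard the first half as equilibration, then accumulate.
        if step >= n_steps // 2:
            samples.append(mean_v)
    return sum(samples) / len(samples)


if __name__ == "__main__":
    print(f"estimated ground-state energy: {run_toy_dmc():.3f}")
```

With these settings the estimate should land near 0.5, up to statistical noise and a small time-step bias.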

6
                       Number     Total    msec/    Flop/     MFLOPS
                       of cases   min.     case     case
  Wavefunctions        670250     1.8      0.163    165499    1014.881
  2-body Prop.         0          0.0      0.000    0         0.000
  3-body Prop. Vijk    0          0.0      0.000    0         0.000
  Propagation step     0          0.0      0.000    0         0.000
  Other Propagation    0          0.0      0.000    0         0.000
  Other vij            0          0.0      0.000    0         0.000

  Total accounted for time: 110926. MFLOP, 1.8 Min, 1014.881 MFLOPS
  MFLOP/Wall-second: 2857.285
  96.1% of total compute time is accounted for
  ..
  Master got    53.45 Mbytes      1.3767 Mbytes/sec
  Master sent   0.00 Mbytes       0.0000 Mbytes/sec
  Master got    30005 messages    772.8822 messages/sec

7
Totals for all slaves

                          Num         Total Time   Time/case
                          Cases       Min.         Sec.
  Propagation             4059060     116.9        0.002
  Branching               379         0.0          0.003
  Energies                223050      59.2         0.016
    Kinetic                           22.2         0.006
    2-b Potential                     31.9         0.009
    3-b Potential                     4.4          0.001
    Densities                         0.7          0.000
  Other compute           223050      0.3          0.000
  All compute             223050      176.4        0.047
  Config. write (wall)    223050      6.7          0.002

Above include:

                       Number      Total    msec/    Flop/      MFLOPS
                       of cases    min.     case     case
  Wavefunctions        22612208    62.2     0.165    165031     1000.523
  2-body Prop.         4059060     39.8     0.588    324854     552.399
  3-body Prop. Vijk    4282110     60.1     0.842    805563     957.102
  Propagation step     4059060     7.9      0.117    34435      293.892
  Other Propagation    4059060     0.6      0.009    2115       237.629
  Other vij            223050      4.7      1.251    1210116    967.236
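
The msec/case and MFLOPS columns in this output appear to be derived from the other three (number of cases, total minutes, Flop/case). The short check below recomputes them for the first three rows; the formulas are inferred from the numbers rather than taken from the source code, and the function names are hypothetical.

```python
# (label, cases, total minutes, reported msec/case, Flop/case, reported MFLOPS)
ROWS = [
    ("Wavefunctions", 22612208, 62.2, 0.165, 165031, 1000.523),
    ("2-body Prop.", 4059060, 39.8, 0.588, 324854, 552.399),
    ("3-body Prop. Vijk", 4282110, 60.1, 0.842, 805563, 957.102),
]


def msec_per_case(cases, total_min):
    """Average wall time per case in milliseconds."""
    return total_min * 60_000.0 / cases


def mflops(cases, total_min, flop_per_case):
    """Millions of floating-point operations per second for one section."""
    return cases * flop_per_case / (total_min * 60.0) / 1e6


if __name__ == "__main__":
    for label, cases, total_min, rep_msec, flop, rep_mflops in ROWS:
        print(f"{label:18s} msec/case {msec_per_case(cases, total_min):.3f} "
              f"(reported {rep_msec})  MFLOPS {mflops(cases, total_min, flop):.1f} "
              f"(reported {rep_mflops})")
```

The recomputed values agree with the reported ones to well under 1%; the remaining difference is plausibly because the printed minutes are rounded to one decimal place.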

8
  Total accounted for time: 8918108. MFLOP, 175.2 Min, 848.397 MFLOPS
  MFLOP/Wall-second: 15321.884
  99.3% of total compute time is accounted for

  Master got    505.62 Mbytes      0.8687 Mbytes/sec
  Master sent   5.21 Mbytes        0.0089 Mbytes/sec
  Master got    223639 messages    384.2262 messages/sec

  Master wall min. in loop     9.7
  Master idle wall min.        9.4
  Total available wall min.    184.3
  Total compute min.           176.4
  Efficiency                   95.7%
  Speed up                     18.2
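
The efficiency and speed-up figures are consistent with simple ratios of the reported wall times. The sketch below shows that arithmetic; the formulas are inferred from the numbers, not confirmed by the source, and the function names are hypothetical.

```python
# Values copied from the run summary.
MASTER_LOOP_MIN = 9.7        # Master wall min. in loop
TOTAL_AVAILABLE_MIN = 184.3  # Total available wall min.
TOTAL_COMPUTE_MIN = 176.4    # Total compute min.
TOTAL_MFLOP = 8918108.0      # Total accounted for, in MFLOP


def efficiency_pct(compute_min, available_min):
    """Share of available worker wall time spent computing, in percent."""
    return 100.0 * compute_min / available_min


def speedup(compute_min, master_loop_min):
    """Summed compute time divided by elapsed wall time of the master loop."""
    return compute_min / master_loop_min


def mflop_per_wall_second(total_mflop, master_loop_min):
    """Aggregate rate over the elapsed wall time of the master loop."""
    return total_mflop / (master_loop_min * 60.0)


if __name__ == "__main__":
    print(f"efficiency  {efficiency_pct(TOTAL_COMPUTE_MIN, TOTAL_AVAILABLE_MIN):.1f}%")
    print(f"speed up    {speedup(TOTAL_COMPUTE_MIN, MASTER_LOOP_MIN):.1f}")
    print(f"MFLOP/wall-second {mflop_per_wall_second(TOTAL_MFLOP, MASTER_LOOP_MIN):.0f}")
```

The ratio of total available minutes to master loop minutes is 19.0, which would be consistent with 19 worker processes, although the slides do not state the process count.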

9
Data Structures
  • Pretty straightforward arrays
  • Wave function solution on the grid
  • Grows as 2^N with the number of particles N
  • Quickly moves from FLOPS bound to memory bound
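
To see why the data would become memory bound quickly, here is a rough size estimate. It rests on two loudly stated assumptions that are not in the slides: that the growth is exponential, 2^N in the number of particles N, and that each amplitude is one 16-byte complex double. The function name is hypothetical.

```python
def wavefunction_bytes(n_particles, bytes_per_amplitude=16):
    """Assumed storage for one wave function: 2**N amplitudes of fixed size."""
    return (2 ** n_particles) * bytes_per_amplitude


if __name__ == "__main__":
    for n in (4, 8, 12, 16, 20):
        kib = wavefunction_bytes(n) / 1024.0
        print(f"N = {n:2d}: {kib:12.1f} KiB per wave function")
```

Under these assumptions each added particle doubles the storage per wave function, so the working set outgrows cache long before the arithmetic per amplitude changes.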