Title: Workflow automation for processing plasma fusion simulation data
1. Workflow automation for processing plasma fusion simulation data
Norbert Podhorszki, Bertram Ludäscher (University of California, Davis)
Scott A. Klasky (Scientific Computing Group, Oak Ridge National Laboratory)
GPSC
2. Center for Plasma Edge Simulation
- Focus on the edge of the plasma in the tokamak
- Multi-scale, multi-physics simulation
- Edge turbulence in NSTX (at 100,000 frames/s)
- Diverted magnetic field
3. Images plasma physicists adore
- Electric potential
- Parallel flow and particle positions
4. Monitoring the simulation means
5. Multi-physics → many codes
6. XGC simulation output
- Desired size of the simulation (to be run on the petascale machine):
  - 100K time steps
  - 100 billion particles
  - 10 attributes (double precision) per particle
  - 8 TB of data per time step
  - Save (and process) 1K-10K time steps
  - about a 5-day run on the petascale machine
7. XGC simulation output
- Proprietary binary files (BP)
  - 3D variables, a separate file per timestep
- NetCDF files containing
  - 2D variables, all timesteps in one file
- M3D coupling data
  - to compute a new equilibrium with an external code (loose coupling)
  - to check the linear stability of XGC externally
8. What to do with those outputs?
- Proprietary binary files (BP)
  - Transfer to the end-to-end system using bbcp
  - Convert to HDF5 format (with a C program)
  - Generate images using AVS/Express (running as a service)
  - Archive HDF5 files in large chunks to HPSS
- NetCDF files
  - Transfer to the end-to-end system (updating as new timesteps are written into the files)
  - Generate images using the grace library
  - Archive NetCDF files at the end of the simulation
- M3D coupling data
  - Transfer to the end-to-end system
  - Execute M3D to compute the new equilibrium
  - Transfer the new equilibrium back to XGC
  - Execute ELITE to compute the growth rate and test linear stability
  - Execute M3D-MPP to study unstable states (ELM crash)
9-11. Schematic view of components
[Figure: data flows between ORNL (Cray XT4, HPSS, 40 GB/s), Seaborg at NERSC (pulling data), and the command and control site.]
12. Kepler workflow
- to accomplish all these tasks
- 1239 (Java) actors
- 4 levels of hierarchy
13. Workflow: Java → remote script → remote program
14. Kepler actors for CPES
- Permanent SSH connection to perform tasks on a remote machine
- Generalized actors (sub-workflows) for specified tasks:
  - Watch a remote directory for simulation timesteps
  - Execute an external command on a remote machine
  - Tar and archive data in large chunks to HPSS
  - Transfer a remote image file and display it on screen
  - Control a running SCIRun server remotely
  - Job submission and control for various resource managers
- The above actors do logging/checkpointing
  - so the final workflow can be stopped / restarted
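The "execute an external command on a remote machine" actor might be sketched as below. This is only an illustration, not the actual Java actor: the host name is made up, and the sketch spawns one `ssh` process per task where the real actor reuses a single permanent SSH session; a local `fake` runner stands in for the remote side so the sketch runs anywhere.

```python
import subprocess

def remote_exec(host, command, run=subprocess.run):
    """Run a command on a remote machine and return its stdout.
    (The real Kepler actor keeps one permanent SSH session instead of
    spawning a new ssh process per task, as done here.)"""
    result = run(["ssh", host, command], capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"remote command failed: {result.stderr}")
    return result.stdout

# Local stand-in for the remote machine, so the sketch runs anywhere:
fake = lambda argv, **kw: subprocess.run(["echo", argv[2]], **kw)
output = remote_exec("login.example.org", "ls /sim/output", run=fake)
print(output)   # the stand-in just echoes the command string back
```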
15. What Kepler features are used in CPES?
- Different models of computation
  - PN for parallelism and pipeline processing
  - DDF for sequential workflows with if-then-else and while-loop structures
  - SDF for efficient (static-schedule) sequential execution of simple sub-workflows
- Stateful actors in stream processing of files
- SSH for remote operations
  - keeps the connection alive
- Command-line execution of the workflow
  - from a script (at deployment, no GUI)
  - reading workflow parameters from a file
16. FileWatcher: a data-dependent loop
- The SSH Directory Listing Java actor returns the new files in a directory (each file reported once)
- This is a do-while loop whose termination condition is whether the listing contains a specific element (which indicates the end of the simulation)
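The do-while loop can be sketched as a generator. Here `list_files` is a hypothetical callable standing in for the SSH Directory Listing actor, and the stop-marker file name is an assumption for illustration:

```python
import time

def watch_directory(list_files, stop_marker, poll_interval=0.0):
    """Yield each new file in a (remote) directory exactly once, until
    the stop-marker file shows up in the listing. Do-while shape: the
    body runs at least once before the termination condition is tested."""
    seen = set()
    while True:
        listing = list_files()            # one directory listing per poll
        for name in sorted(listing):
            if name not in seen and name != stop_marker:
                seen.add(name)
                yield name                # a new timestep file to process
        if stop_marker in listing:        # termination condition
            return
        time.sleep(poll_interval)

# Simulated listings: two ordinary polls, then the stop file appears.
polls = iter([["t0001.bp"],
              ["t0001.bp", "t0002.bp"],
              ["t0001.bp", "t0002.bp", "STOP"]])
new_files = list(watch_directory(lambda: next(polls), "STOP"))
print(new_files)   # each timestep file is reported exactly once
```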
17. Modeling problem: stopping and finishing
- You finally create working pipelines. Fine. But:
  - How do you stop them?
  - How do you let intermediate actors know that they will not receive more tokens?
  - How do you perform something after the processing?
- We use a special token flowing through the pipelines
  - It is always the last item in the pipeline.
  - Actors are implemented (extra work) to skip this token.
- A stop file created by the simulation is used
  - to stop the task-generator actors in the workflow (the FileWatchers)
  - to notify (stateful) actors in the pipeline that they should finalize (Archiver, Stop_AVS/Express)
  - to synchronize two independent pipelines (NetCDF/HDF5 → archive images at the end)
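A minimal sketch of the special-token scheme, assuming generator-based actors (the actual Kepler actors are Java classes): each stage processes normal tokens, and on seeing the stop token it finalizes itself and forwards the token so the next stage can finalize too.

```python
STOP = object()   # the special token; always the last item in the stream

def actor(process, finalize, stream):
    """Process normal tokens; on the stop token, run the finalize step
    and forward the token so the downstream actor can finalize too."""
    for token in stream:
        if token is STOP:
            finalize()
            yield STOP
            return
        yield process(token)

finalized = []
source    = iter([1, 2, 3, STOP])
converted = actor(lambda t: t * 10, lambda: finalized.append("convert"), source)
archived  = actor(lambda t: t,      lambda: finalized.append("archive"), converted)
results   = [t for t in archived if t is not STOP]
print(results, finalized)   # [10, 20, 30] ['convert', 'archive']
```

Note that finalization happens in pipeline order: the upstream stage finalizes first, then passes the token on, which is exactly the "extra work after the end" behavior the stop file triggers.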
18. Role of stop file
19. Role of stop file: extra work after the end
20. Problem: how to restart this workflow?
- Kepler has no system-level checkpoint/restart mechanism
  - it seems to be difficult for large Java applications
  - not to mention the status of external (and remote) things
- Pipeline execution
  - each actor is processing a different step
21. Our solution: user-level logging/restart
- We record the successful operations at each (heavy) actor
- Those actors are implemented to check, before doing something, whether it has already been done
- When the workflow is restarted
  - it starts from the very beginning, but the actors simply skip operations (files, tokens) that have already been done
- We do not worry about repeating small (control-related) actions within the workflow
  - it is the external operations that matter here
22. ProcessFile core: check-perform-record
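The check-perform-record core might look like the following. The plain-text log format and the names are assumptions for illustration, not the actual actor code:

```python
import os, tempfile

def process_file(logfile, key, operation):
    """check-perform-record: skip the operation if the log already
    records it, otherwise perform it and append it to the log."""
    done = set()
    if os.path.exists(logfile):                 # check the restart log
        with open(logfile) as f:
            done = {line.strip() for line in f}
    if key in done:
        return "skip"
    operation(key)                              # perform the heavy, external step
    with open(logfile, "a") as f:               # record success for the next restart
        f.write(key + "\n")
    return "done"

log = os.path.join(tempfile.mkdtemp(), "processed.log")
transferred = []
first  = process_file(log, "timestep-0001", transferred.append)
second = process_file(log, "timestep-0001", transferred.append)  # as after a restart
print(first, second, transferred)   # done skip ['timestep-0001']
```

On restart the workflow reruns from the beginning, but the second call shows why that is cheap: the external operation is performed only once.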
23. Problem: failed operations
- What if an operation fails, e.g. one timestep cannot be transferred? Options:
  - a) trust that downstream actors fail silently on the missing data
  - b) notify everybody below in the pipeline (to skip that step)
  - c) avoid giving tasks to them for the erroneous step
- Retrying and processing that step later is important, but
- keeping up with the simulation on the next steps is even more important.
24. Our approach for failed operations
- ProcessFile, and thus the workflow, handles failures by discarding the tokens related to failed operations from the stream
- Advantage
  - actors need not care about failures
  - an incoming token is a task to be done
- Disadvantage
  - the rate of token production varies
  - this can upset Kepler's models of computation
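Discard-on-failure can be sketched with generator stages (illustrative only; the `fails` set marks which tokens' operations fail, and the stage names are made up to match the transfer/convert/archive pipeline):

```python
def stage(name, stream, fails=(), trace=None):
    """Perform one operation per incoming token; on failure, discard
    the token so downstream stages never receive it."""
    for token in stream:
        if token in fails:
            if trace is not None:
                trace.append(f"{name} {token} failed")
            continue                 # token discarded from the stream
        if trace is not None:
            trace.append(f"{name} {token}")
        yield token

trace = []
p = stage("transfer", iter([1, 2, 3]), fails={2}, trace=trace)
p = stage("convert", p, trace=trace)
p = stage("arch", p, trace=trace)
out = list(p)
print(out)     # [1, 3]: step 2 never reaches convert or arch
print(trace)
```

Because the stages are lazy, the trace interleaves per timestep (transfer 1, convert 1, arch 1, transfer 2 failed, transfer 3, ...), which is the behavior the next slide illustrates.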
25. Discarding tokens on failure
Incoming tokens 1, 2, 3: transfer 1, convert 1, arch 1; the transfer of 2 fails, so token 2 is discarded; transfer 3, convert 3, arch 3.
26. After a restart
Incoming tokens 1, 2, 3 again: skip 1 / skip 1 / skip 1 (already done); transfer 2, convert 2, arch 2 (the previously failed step); skip 3 / skip 3 / skip 3.
27. Future plans
- Provenance management
  - one main reason to use a scientific workflow system, e.g. in bioinformatics workflows
  - needed for debugging runs, interpreting results, repeating experiments, generating documentation, comparing runs, etc.
  - the CPES workflow has been selected as one use case for the ongoing Kepler provenance work
- New actors in CPES for controlling asynchronous I/O from the petascale computer towards the processing cluster
28. Thank You
29. Disadvantage of discarding (1/7)
[Figure: tokens 1-6 split by a Distributor onto two parallel actors and merged again by a Commutator.]
- The Distributor splits the stream and distributes it to the two actors evenly
- The Commutator keeps the original order of tokens
30. Disadvantage of discarding (2/7)
- T2 is waiting at the Commutator for T1 to be finished
- T4 is started by the lower actor
31. Disadvantage of discarding (3/7)
- T4 is also finished, but waits in the lower actor for T2 to go through the Commutator
- The lower actor becomes idle (this comes from the Commutator's behavior)
32. Disadvantage of discarding (4/7)
33. Disadvantage of discarding (5/7)
- T3 can finally be started
- The lower actor is still idle (this comes from discarding T1!)
34. Disadvantage of discarding (6/7)
- T3 is finished and sent out; the Commutator sends it out first
- Then T2 is sent out
35. Disadvantage of discarding (7/7)
- The lower actor can start working on T6
- T4 will be sent out only after T5
- Order of the outgoing stream: 3, 2, 5, 4
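The seven-step trace can be reproduced with a small simulation, assuming a round-robin Distributor and a Commutator that reads the branches strictly in turn (both assumptions match the slides, not actual Kepler source):

```python
from collections import deque

def distribute(tokens, n=2):
    """Round-robin Distributor: token i goes to branch i % n."""
    branches = [deque() for _ in range(n)]
    for i, t in enumerate(tokens):
        branches[i % n].append(t)
    return branches

def commutate(branches):
    """Commutator: read the branches strictly in turn to restore the
    original order; it stalls when the expected branch is empty."""
    out, i = [], 0
    while any(branches):
        branch = branches[i % len(branches)]
        if not branch:       # expected token is missing: Commutator stalls
            break
        out.append(branch.popleft())
        i += 1
    return out

branches = distribute([1, 2, 3, 4, 5, 6])   # branch 0: 1,3,5  branch 1: 2,4,6
branches[0].remove(1)                        # the actor discarded failed token 1
order = commutate(branches)
print(order)   # [3, 2, 5, 4]: the wrong order, and token 6 is stuck
```

With token 1 discarded, the strict alternation pairs the wrong tokens together, yielding exactly the out-of-order stream 3, 2, 5, 4 from the trace, with T6 never emitted.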
36. Checkpointing
Token streams: ..., f4, f3, f2, f1; ..., g4, g3, g2, g1; ..., h4, h3, h2, h1; ..., list2, list1
UNFINISHED