Parallel System for Interactive Multi-Experiment Computational Studies (pSIMECS) - PowerPoint PPT Presentation

Provided by: csN6
Learn more at: http://www.cs.nyu.edu

Transcript and Presenter's Notes

Title: Parallel System for Interactive Multi-Experiment Computational Studies (pSIMECS)


1
Parallel System for Interactive Multi-Experiment
Computational Studies (pSIMECS)
2
Simecs Problem Description
  • Multi-Experiment Computational Studies:
  • Computational studies involving multiple
    experiments, each corresponding to an individual
    execution of a simulation software
  • Example: Design Space Exploration
  • Goal: Given a set of possible parameter values (a
    parameter space) and an experiment that maps a
    parameter value to a performance metric, find a
    subset of the parameter space whose performance
    metrics fit certain criteria.

3
Simecs Problem Description
  • Model application: Pareto frontier discovery.
  • The Pareto frontier is the set of points in the
    parameter space that are not completely dominated
    by any other point in the parameter space.
  • p completely dominates q iff every component of
    p's performance metric is better than the
    corresponding component of q's.
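The dominance definition above can be sketched in a few lines of Python. This is only an illustration: it assumes that "better" means "smaller" for every metric component (as with deflection and cost), and the names `completely_dominates` and `pareto_frontier` are hypothetical, not part of Simecs.

```python
from typing import Dict, Sequence, Set, Tuple

Point = Tuple[float, ...]


def completely_dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """True if every component of metric p is strictly smaller than q's.

    Assumption: smaller is better for all components.
    """
    if len(p) != len(q):
        raise ValueError("metrics must have the same number of components")
    return all(a < b for a, b in zip(p, q))


def pareto_frontier(metrics: Dict[Point, Sequence[float]]) -> Set[Point]:
    """Parameter points whose metric is not completely dominated by another."""
    return {
        x
        for x, mx in metrics.items()
        if not any(
            completely_dominates(my, mx) for y, my in metrics.items() if y != x
        )
    }


# Toy data: parameter point -> (deflection, cost)
toy = {
    (0.0, 0.0): (3.0, 1.0),
    (0.0, 1.0): (1.0, 3.0),
    (1.0, 0.0): (2.0, 2.0),
    (1.0, 1.0): (4.0, 4.0),  # worse than (3.0, 1.0) in both components
}
frontier = pareto_frontier(toy)
```

Only the last toy point is removed, because it is the only one for which some other point is better in both components.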

4
Simecs Pareto Frontier Insights
  • Simulations are independent: embarrassingly
    parallel
  • An experiment corresponds to an execution of a
    simulation software, which can itself be parallel
    or sequential
  • The result from one simulation can be used to speed
    up simulations of nearby parameter values (e.g.,
    as the initial guess for a Newton iteration).
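The benefit of reusing a nearby result as a starting guess can be seen on a toy problem. The sketch below is not the bridge simulation: it assumes a scalar stand-in problem (solving x^3 = a for a parameter a) and a hand-written Newton loop, and only compares iteration counts for a cold start and a warm start.

```python
def newton(f, df, x0, tol=1e-10, max_iter=100):
    """Plain Newton iteration; returns (solution, iterations used)."""
    x = x0
    for iteration in range(1, max_iter + 1):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x, iteration
    raise RuntimeError("Newton iteration did not converge")


def solve_cube_root(a, guess):
    """Toy 'simulation': solve x**3 = a starting from the given guess."""
    return newton(lambda x: x**3 - a, lambda x: 3 * x**2, guess)


# Cold start: no information from other experiments.
cold_x, cold_iters = solve_cube_root(1000.0, guess=1.0)

# Warm start: reuse the solution of a nearby parameter value (a = 990).
nearby_x, _ = solve_cube_root(990.0, guess=1.0)
warm_x, warm_iters = solve_cube_root(1000.0, guess=nearby_x)
```

Both runs reach the same solution, but the warm start begins inside the region of fast convergence and needs fewer iterations.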

5
Simecs Pareto Frontier Insights
  • Decisions can be made with imprecise results:
    precision can be traded off against resources
  • If the parameter space is large, sweeps are
    inefficient.
  • Need to prune portions of the space as the study
    progresses, either automatically or
    interactively.
  • Active Sampler can automatically pick
    "interesting" simulations (e.g., close to
    boundary)

6
Simecs Example Problem
  • Bridge design computational study: a 1D bridge in
    2D space, with end points clamped. Two elastic
    supports are added to the middle of the bridge.
  • Parameter space: distances of the two supports
    from the end of the bridge.
  • Performance measures: maximum deflection of the
    bridge, and the cost of the supports
  • Bridge is clamped at all support points, with
    bending and stretching forces, and uniform load.

7
Simecs Example Problem
Test Problem. Parameter: $\langle r_0, r_1 \rangle$. Performance
metric: $\langle \max_{0 < r < L} f(r),\ c(r_0) + c(r_1) \rangle$.
Cost function: $c(r)$
8
Simecs Goal
  • Simecs: software on parallel systems that manages
    simulation processes in a Multi-Experiment
    Computational Study.
  • Frees users and application developers from
    micromanaging every simulation process
  • Goal: interactive, steerable design space
    exploration

9
Simecs User View
  • Two types of parameters:
  • technique parameters (e.g., discretisation of
    nodes, convergence tolerance)
  • model parameters (e.g., Young's modulus of a
    material, viscosity of a fluid).
  • Goal: as the Pareto frontier obtained from one
    set of parameters is forming, the user can switch
    to another setup and continue the study.
  • e.g., limit the exploration space but increase
    the resolution.

10
Simecs Developer View
  • The application developer provides 3 modules:
  • Simulation: maps a parameter space point to a
    performance space point
  • Visualisation and interaction: displays the
    relevant information to the user; collects
    information from the user and maps it
    into the Simulation module
  • Transformation: transforms the state of a
    simulation under one technique parameter into
    the state under another.
  • e.g., interpolate checkpoints from different
    resolutions
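As an illustration of what a Transformation module might do, the sketch below resamples a checkpoint from one resolution to another by linear interpolation. It assumes the checkpoint is a 1D list of nodal values on evenly spaced nodes; the function name `interpolate_checkpoint` is hypothetical and the actual Simecs interface is not shown in the slides.

```python
from typing import List, Sequence


def interpolate_checkpoint(state: Sequence[float], new_size: int) -> List[float]:
    """Linearly resample nodal values onto new_size evenly spaced nodes.

    Assumption: both the old and the new discretisation use evenly
    spaced nodes over the same interval, end points included.
    """
    if len(state) < 2 or new_size < 2:
        raise ValueError("need at least two nodes on both grids")
    scale = (len(state) - 1) / (new_size - 1)
    out = []
    for i in range(new_size):
        pos = i * scale                       # position in old node units
        lo = min(int(pos), len(state) - 2)    # left neighbour on the old grid
        frac = pos - lo
        out.append(state[lo] * (1.0 - frac) + state[lo + 1] * frac)
    return out


# A 3-node checkpoint reused as the starting state of a 5-node simulation.
coarse = [0.0, 2.0, 4.0]
fine = interpolate_checkpoint(coarse, 5)
```

The same function also coarsens a state, since it only depends on the ratio between the two node counts.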

11
Simecs System View
  • Shared object layer, Active sampler, Resource
    Allocator

12
Simecs System View
  • Shared object space layer: system-wide repository
    of shared objects (e.g., checkpoints, error
    estimations, results)
  • Sampler: based on users' specifications, issues
    sample points where simulations will be run
  • Resource Allocator / Manager: maps simulations
    onto computing elements, decides whether to use a
    checkpoint.

13
Simecs SISOL
  • Spatially-Indexed Shared Object Layer (SISOL)
  • Used for storing system-wide shared objects.
  • For the model problem: checkpoints and results
    (performance metric at each parameter point).
  • <index, object set id> names a unique object in
    the system.

14
Simecs SISOL
  • Objects are typed: SISOL requires pack() and
    unpack() implementations for each type. For
    parallel object types, it also requires a function
    to map parallel objects into different
    decompositions.
  • Supports split-phase create, delete, read and
    write to enforce read-modify-write consistency
  • Supports neighborhood query

15
Simecs SISOL Implementation
  • Ideal implementation: directory-based cache,
    where each node participates in the storing of
    objects.
  • Current implementation:
  • Single TCP Server
  • In core
  • Hash-map based lookup
  • Linear lookup for nearest neighbor
  • Supports only sequential objects
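The in-core part of the current implementation (hash-map lookup plus a linear nearest-neighbor scan) can be sketched as follows. This is a simplified single-process illustration: the class name `ObjectSet` is hypothetical, `pickle` stands in for the per-type pack() and unpack() functions, and the TCP server and split-phase operations are left out.

```python
import math
import pickle
from typing import Any, Dict, Optional, Sequence, Tuple

Index = Tuple[float, ...]


class ObjectSet:
    """One spatially indexed object set, kept entirely in memory."""

    def __init__(self) -> None:
        self._objects: Dict[Index, bytes] = {}  # index -> packed object

    def write(self, index: Sequence[float], obj: Any) -> None:
        self._objects[tuple(index)] = pickle.dumps(obj)  # stand-in for pack()

    def read(self, index: Sequence[float]) -> Any:
        # Hash-map lookup by exact index, then stand-in for unpack().
        return pickle.loads(self._objects[tuple(index)])

    def nearest(self, index: Sequence[float]) -> Optional[Index]:
        """Linear scan over all stored indices; returns coordinates only."""
        if not self._objects:
            return None
        return min(self._objects, key=lambda k: math.dist(k, index))


checkpoints = ObjectSet()
checkpoints.write((0.0, 0.0), [0.1, 0.2, 0.3])
checkpoints.write((1.0, 1.0), [0.4, 0.5, 0.6])

closest = checkpoints.nearest((0.4, 0.4))
state = checkpoints.read(closest)
```

A linear scan costs time proportional to the number of stored objects per query, which matches the slides' remark that this design is sufficient for small sets.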

16
Simecs SISOL Implementation
  • Object sets created on server
  • Nearest neighbor query retrieves coordinates only
  • Supports the sequential PETSc vector object type
    by default.
  • Sufficient for small sets, small objects

17
Simecs SISOL Use
  • Current Pareto Frontier problem uses two object
    sets
  • Result set (parameter point -> performance
    metric)
  • Checkpoint set (parameter point -> sequential
    PETSc vectors)
  • In the test problem, a parameter point is a 2D
    vector, so the result set and checkpoint set have
    2D indices.

18
Simecs FUEL
  • Frame/Update Exchange Layer: control layer
    between the manager and simulation processes
  • Codes that represent a functional aspect of a
    steerable application are grouped together
    (called a Satellite).
  • Event-based on the manager process; poll-based on
    simulation processes
  • Dynamic model: Satellites can be activated and
    decommissioned while a simulation is running

20
Simecs Active Sampler
  • Resolves the Pareto frontier progressively
  • Maintains a task queue and a result set
  • Task queue: points in parameter space of
    interest; result set: points discovered so far
    that are undominated (i.e., current Pareto set
    candidates)
  • Seeds a task queue with points from a lattice on
    the parameter space.
  • Run the task queue.

21
Simecs Active Sampler
  • For each result that comes back, decide if the
    point is undominated by all points in the result
    set. If so, remove all points in the result set
    that are dominated by it, add it to the result
    set, and insert its lattice neighbors into the
    task queue.
  • Continue until task queue is empty.
  • Refine the lattice, then repeat
  • Effect: the result set contains a set of Pareto
    point candidates that originated from a lattice.
    The lattice gets finer as more time is spent.
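The procedure on the two Active Sampler slides above can be sketched sequentially as follows. Several details are assumptions made for illustration, not taken from the slides: the lattice is a 2D integer grid, refinement halves the spacing, each refinement re-seeds the queue with the neighbors of the current candidates, "better" means "smaller", and `simulate` is any function from a lattice point to a metric tuple.

```python
from collections import deque


def dominates(p, q):
    """p completely dominates q (assumption: smaller is better)."""
    return all(a < b for a, b in zip(p, q))


def neighbors(pt, spacing, size):
    """Lattice neighbors of pt at the given spacing, clipped to the grid."""
    i, j = pt
    cand = [(i + spacing, j), (i - spacing, j), (i, j + spacing), (i, j - spacing)]
    return [(a, b) for a, b in cand if 0 <= a <= size and 0 <= b <= size]


def active_sample(simulate, size, levels):
    """Return (result set, number of simulations run).

    size is the number of finest-lattice intervals per axis and should
    be divisible by 2 ** (levels - 1).
    """
    results = {}       # undominated points -> metric (Pareto candidates)
    evaluated = set()  # points already simulated
    spacing = 2 ** (levels - 1)
    # Seed the task queue with the coarsest lattice.
    queue = deque(
        (i, j)
        for i in range(0, size + 1, spacing)
        for j in range(0, size + 1, spacing)
    )
    while True:
        while queue:  # run the task queue until it is empty
            pt = queue.popleft()
            if pt in evaluated:
                continue
            evaluated.add(pt)
            metric = simulate(pt)
            if any(dominates(r, metric) for r in results.values()):
                continue  # dominated: not a candidate
            for q in [q for q, r in results.items() if dominates(metric, r)]:
                del results[q]
            results[pt] = metric
            queue.extend(neighbors(pt, spacing, size))
        spacing //= 2  # refine the lattice, then repeat
        if spacing == 0:
            return results, len(evaluated)
        for pt in list(results):
            queue.extend(neighbors(pt, spacing, size))


# Toy metric: both components grow with distance from the origin corner.
candidates, runs = active_sample(lambda pt: (pt[0], pt[1]), size=8, levels=3)
```

On this toy metric the sampler only ever expands points along the two axes, so it runs fewer simulations than a full sweep of the 9 x 9 finest lattice.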

22
Simecs Active Sampler
Initial Grid
23
Simecs Active Sampler
1st level results
24
Simecs Active Sampler
First Level Pareto Frontier
25
Simecs Active Sampler
First Refinement
26
Simecs Active Sampler
2nd level results
27
Simecs Active Sampler
Second level Pareto Frontier
28
Simecs Active Sampler
2nd Refinement
29
Simecs Active Sampler
3rd level results
30
Simecs Active Sampler
3rd level Pareto Frontier
31
Simecs Manager
  • Spawns off simulation processes
  • When the result of a simulation comes back (via a
    FUEL callback):
  • Registers the result
  • Asks the active sampler for the next point to run
  • Looks up SISOL for a checkpoint to jump-start
    the next point
  • Sends the parameters of the next simulation,
    coordinates of the checkpoint, and error
    tolerances to the simulation process.
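The callback steps above can be sketched as a single handler. Everything here is a simplified stand-in: the class `Manager` and its fields are hypothetical, a plain queue replaces the active sampler, a set of coordinates replaces the SISOL checkpoint set, and the sketch assumes that every finished simulation leaves a checkpoint at its own parameter point.

```python
import math
from collections import deque


class Manager:
    """Toy manager: reacts to finished simulations with the next work order."""

    def __init__(self, tasks, tolerance):
        self.tasks = deque(tasks)   # stand-in for the active sampler
        self.results = {}           # registered results: point -> metric
        self.checkpoints = set()    # coordinates that have a stored checkpoint
        self.tolerance = tolerance

    def on_result(self, point, metric):
        """Called when a simulation finishes; returns a work order or None."""
        self.results[point] = metric        # register the result
        self.checkpoints.add(point)         # assumption: checkpoint saved here
        if not self.tasks:
            return None                     # nothing left to run
        next_point = self.tasks.popleft()   # ask the sampler for the next point
        # Look up the nearest checkpoint to jump-start the next simulation.
        nearest = min(self.checkpoints, key=lambda c: math.dist(c, next_point))
        return {
            "params": next_point,
            "checkpoint": nearest,
            "tolerance": self.tolerance,
        }


manager = Manager(tasks=[(0.5, 0.5)], tolerance=1e-6)
order = manager.on_result((0.4, 0.4), (1.0, 2.0))
done = manager.on_result((0.5, 0.5), (0.9, 2.1))
```

The work order carries only the checkpoint's coordinates, so the simulation process would fetch the checkpoint itself from the object store.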

32
Simecs Test System
  • Single-server implementation of SISOL to store
    the checkpoint set
  • 3 versions of samplers: Active, Random, and Sweep
  • TCP-based FUEL
  • Simulation implemented with the PETSc SNES solver.
  • Jump-start from checkpoints: use the checkpoint's
    configuration as the starting guess

33
Simecs Test System
  • Heterogeneous cluster
  • 1 × 1.5 GHz Athlon node (manager, SISOL server)
  • 22 × 1.2 GHz Duron nodes (simulation processes)
  • 10 × 3 GHz Pentium 4 nodes (simulation processes)
  • 100 Mbps switched Ethernet network between Athlon
    and Duron nodes, 10 Mbps Ethernet between Pentium
    4 nodes.

34
Simecs Test Result (Sampler)
  • Active Sampler compared against 1) a grid-based
    sampler, which performs a parameter sweep on the
    grid with increasing refinement, and 2) a random
    sampler
  • All samplers run for 1500 simulations, and the
    partial frontiers are dumped at periodic intervals.
    The Hausdorff distance is measured, using the final
    Active Sampler-based frontier with 1500
    simulations as the ground truth.
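For reference, the Hausdorff distance between two finite point sets can be computed as below. The sketch assumes Euclidean distance between points; the slides do not say which space or norm the measurement used, and the function name `hausdorff` is only illustrative.

```python
import math


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty finite point sets."""

    def directed(xs, ys):
        # Largest distance from a point in xs to its closest point in ys.
        return max(min(math.dist(x, y) for y in ys) for x in xs)

    return max(directed(a, b), directed(b, a))


# A partial frontier compared against a reference frontier.
partial = [(0.0, 0.0), (1.0, 0.0)]
reference = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
distance = hausdorff(partial, reference)
```

The distance is driven by the reference point that the partial frontier has not yet discovered, which is why the measure shrinks as a sampler fills in the frontier.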

35-51
Simecs Test Result (Sampler) [figure slides, no transcript]
52
Simecs - Test Result (Checkpoints)
  • Checkpoint reuse cuts down the number of
    iterations per simulation.

53
Simecs Test Result (Scaling)
Duron nodes added (Slower speed, faster
communication)
54
Simecs Test Result (Scaling)
55
Simecs Conclusions
  • Multiple experiments can be managed automatically
  • Interactive speed can be achieved via re-use of
    checkpoints, active sampling, and partial results:
    run time goes from 3088 seconds down to 17 seconds,
    and lower if partial frontiers can be used

56
Simecs Conclusions
  • TCP-based communication framework provides system
    with portability - can be used on heterogeneous
    clusters
  • Spatially-indexed object sets are a useful
    communication substrate

57
Simecs Future work
  • Distributed implementation of SISOL
  • Parallelise individual simulations (SISOL Support
    for Parallel Objects)
  • MPI-based communication for SISOL and FUEL
  • Interactivity