PSWEEP: A Lightweight Pattern for Distributed Computational Experiments

1
PSWEEP: A Lightweight Pattern for Distributed
Computational Experiments
  • Christopher Mueller and Andrew Lumsdaine
  • Open Systems Lab, Indiana University

2
Introduction
  • Parameter sweeps are common cluster applications
  • Approaches
      • Scripts (sh, perl + ssh, mpi)
      • Low-level applications (C, Fortran + MPI)
      • Parameter sweep applications (e.g., Nimrod)
  • Problems
      • Custom solutions become tangled quickly
      • Applications are not available on all platforms

3
How do we use our clusters?
4
Anatomy of a Parameter Sweep
Parameters and Enumeration Order
    for i in range(rank, n, size):
        if process: load_image(i)
        elif stats: query_image(i)
        for j in 1, 2, 4, 8:
            if process: time(i, j)
            for k in motion, gaussian:
                if process: process_image(i, j, k)
                elif stats: image_stats(i, j, k)
                else:
                    print('ssh n%d run %d %d' % (i, j, k))
                if process: clear_process(k)
                elif bgi: clear_temp(k)
        if process: unload_image(i)

Resource distribution is handled by the
execution environment, e.g. mpirun.
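
The stride in `range(rank, n, size)` is what spreads the outer loop across processes. A minimal sketch of that partitioning, assuming `rank` and `size` come from the launcher; here they are plain arguments, and `indices_for` is a hypothetical helper name that does not appear in the slides:

```python
def indices_for(rank, n, size):
    """Return the outer-loop indices handled by process `rank` out of `size`."""
    return list(range(rank, n, size))

# Simulate three processes sharing ten images (no MPI involved).
assignment = {rank: indices_for(rank, 10, 3) for rank in range(3)}
print(assignment)  # {0: [0, 3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]}
```

Every index is handled by exactly one process, and no process needs to talk to another to know its share.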
5
Anatomy of a Parameter Sweep
Tasks and Experiments
(Same loop nest as on slide 4.)

6
Anatomy of a Parameter Sweep
Artifacts and Errors
(Same loop nest as on slide 4.)

7
User's View
[Diagram: Parameters i, j, k with value sets (0, n), (.01, .1, 1.0), and (10, 12, 14) feed the Experiments (script gen, print). Enumerated states shown: 0, 0.01, 10; 0, 0.01, 12; 0, 0.01, 14; 0, 0.1, 10; 0, 0.1, 12.]
8
The PSWEEP Pattern
9
Abstracting the Loops
Parameter. A Parameter is an iterator or
container that supplies the values for a variable
in the experiment.

Enumerator. The Enumerator takes an ordered list
of parameters and lexicographically enumerates
all possible values.

State. The State contains the current value of
each parameter, in order.
    i = ['house.jpg', 'lena.jpg']
    j = [1, 2, 4, 8]
    k = ['motion', 'gaussian']
    params = [i, j, k]
    e = enumerator(params)
    for state in e: process_image(state)
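
One way to picture the enumerator is as a thin wrapper over a Cartesian product. The sketch below is an assumption about how such a function could look, not the PSWEEP implementation; it uses the standard library's `itertools.product`, whose last argument varies fastest, just as the innermost loop would.

```python
from itertools import product

def enumerator(params):
    """Yield every state (one value per parameter), last parameter fastest."""
    for state in product(*params):
        yield list(state)

i = ['house.jpg', 'lena.jpg']
j = [1, 2, 4, 8]
k = ['motion', 'gaussian']

states = list(enumerator([i, j, k]))
print(len(states))  # 16, i.e. 2 * 4 * 2
print(states[0])    # ['house.jpg', 1, 'motion']
print(states[1])    # ['house.jpg', 1, 'gaussian']
```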

10
Abstracting the Experiments
Task. A Task is any unit of work performed when a
parameter value changes. A Task is subdivided
into setup and cleanup operations, corresponding
to the work done at the beginning and end of a
block of code in a loop, respectively.

Experiment. An Experiment is a collection of
tasks.
    def PrepareImage(state, img):
        # Setup
        db_load(img, './current.jpg')
        yield  # suspend the function
        # Cleanup
        delete('./current.jpg')

    def ProcessImage(state, alg):
        data = load('./current.jpg')
        img = process(data, alg(value))
        save(img, str(state) + '.jpg')
        return  # no cleanup
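
The `yield` in a task splits it into a setup half and a cleanup half. The following sketch shows how a driver might run the two halves of a generator-based task. The helper names `run_setup` and `run_cleanup` and the `log` list are hypothetical, and the file operations from the slide are replaced by log entries so the example runs anywhere.

```python
log = []

def prepare_image(state, img):
    # Setup: runs when the task is started.
    log.append('load ' + img)
    yield  # suspend until the enclosing block has finished
    # Cleanup: runs when the task is resumed.
    log.append('delete ' + img)

def run_setup(task, state, value):
    """Start a generator task and run its setup half (up to the yield)."""
    gen = task(state, value)
    next(gen)
    return gen

def run_cleanup(gen):
    """Resume a suspended task so that its cleanup half runs."""
    try:
        next(gen)
    except StopIteration:
        pass  # the generator ran to completion, as expected

handle = run_setup(prepare_image, ['lena.jpg'], 'lena.jpg')
log.append('process')  # stands in for the body of the loop
run_cleanup(handle)
print(log)  # ['load lena.jpg', 'process', 'delete lena.jpg']
```

Keeping both halves in one function lets setup and cleanup share local variables, which is harder when they are separate callbacks.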

11
Binding Experiments to State
Bound Task Semantics. Tasks must execute in the
same order they would if the parameter sweep were
expanded to nested loops.
    for img in images:
        PrepareImage.setup(img)
        for alg in algs:
            ProcessImage.setup(alg)
        PrepareImage.cleanup(img)

    e = enumerator(images, algs)
    e.bind(images, PrepareImage)
    e.bind(algs, ProcessImage)
    for state in e: pass

These examples are equivalent.
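
A sketch of how an enumerator could honor bound task semantics: on each new state it finds the outermost parameter that advanced, runs cleanup for the blocks being left (innermost first), then runs setup for the blocks being entered (outermost first). This is an illustration under assumptions, not the PSWEEP source: tasks are bound by parameter position rather than by parameter object, and the class and method names are hypothetical.

```python
from itertools import product

class Enumerator:
    """Enumerate states and fire bound tasks as nested loops would."""

    def __init__(self, *params):
        self.values = [list(p) for p in params]
        self.tasks = {}  # parameter position -> task object

    def bind(self, position, task):
        self.tasks[position] = task

    def _run(self, phase, position, value):
        # Call task.setup(value) or task.cleanup(value) if it is defined.
        method = getattr(self.tasks.get(position), phase, None)
        if method is not None:
            method(value)

    def __iter__(self):
        depth = len(self.values)
        prev_index = prev_state = None
        for index in product(*(range(len(v)) for v in self.values)):
            state = tuple(v[i] for v, i in zip(self.values, index))
            first = 0
            if prev_index is not None:
                # Outermost loop level that advanced since the last state.
                first = next(p for p in range(depth) if index[p] != prev_index[p])
                # Leave the finished blocks, innermost first.
                for p in reversed(range(first, depth)):
                    self._run('cleanup', p, prev_state[p])
            # Enter the new blocks, outermost first.
            for p in range(first, depth):
                self._run('setup', p, state[p])
            yield state
            prev_index, prev_state = index, state
        if prev_state is not None:
            for p in reversed(range(depth)):
                self._run('cleanup', p, prev_state[p])

log = []

class PrepareImage:
    def setup(self, img):
        log.append('prepare ' + img)
    def cleanup(self, img):
        log.append('release ' + img)

class ProcessImage:
    def setup(self, alg):
        log.append('process ' + alg)

e = Enumerator(['house.jpg', 'lena.jpg'], ['motion', 'gaussian'])
e.bind(0, PrepareImage())
e.bind(1, ProcessImage())
for state in e:
    pass
print(log)
```

The log shows each image prepared once, both algorithms processed, and the image released before the next one is prepared, which is the order the hand-written nested loops produce.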
12
Distributing the Workload
DistributedEnumerator. DistributedEnumerator is
an Enumerator that distributes the state to
multiple instances across multiple computing
resources.
    e = RoundRobin(params)
    for state in e: pass

States:
    p1  house.jpg, 1, motion
    p2  house.jpg, 1, gaussian
        house.jpg, 2, motion
        house.jpg, 2, gaussian
        house.jpg, 4, motion
        house.jpg, 4, gaussian
        lena.jpg, 1, motion
        lena.jpg, 1, gaussian
        lena.jpg, 2, motion
        lena.jpg, 2, gaussian
        lena.jpg, 4, motion
        lena.jpg, 4, gaussian
    e = Domain(params, images)
    for state in e: pass

States:
    p1  house.jpg, 1, motion
        house.jpg, 1, gaussian
        house.jpg, 2, motion
        house.jpg, 2, gaussian
        house.jpg, 4, motion
        house.jpg, 4, gaussian
    p2  lena.jpg, 1, motion
        lena.jpg, 1, gaussian
        lena.jpg, 2, motion
        lena.jpg, 2, gaussian
        lena.jpg, 4, motion
        lena.jpg, 4, gaussian
    e = MasterWorker(params)
    for state in e: pass

States:
    p1  house.jpg, 1, motion
    p2  house.jpg, 1, gaussian
        house.jpg, 2, motion
        house.jpg, 2, gaussian
        house.jpg, 4, motion
        house.jpg, 4, gaussian
        lena.jpg, 1, motion
        lena.jpg, 1, gaussian
        lena.jpg, 2, motion
        lena.jpg, 2, gaussian
        lena.jpg, 4, motion
        lena.jpg, 4, gaussian
The DistributedEnumerators must ensure that bound
task semantics are satisfied.
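
To make the difference between two of these strategies concrete, here is a sketch that partitions the same state list by round robin and by domain. It runs in a single process with `rank` and `size` passed as arguments (ranks 0 and 1 play the roles of p1 and p2). The function names and signatures are assumptions made for illustration and are not the pyMPI-based implementation.

```python
from itertools import product

def round_robin(states, rank, size):
    """Process `rank` takes every `size`-th state, starting at its own rank."""
    return [s for n, s in enumerate(states) if n % size == rank]

def domain(states, position, rank, size):
    """Split on one parameter: states sharing its value stay on one process."""
    seen = []  # distinct values of the chosen parameter, in first-seen order
    for s in states:
        if s[position] not in seen:
            seen.append(s[position])
    mine = [v for n, v in enumerate(seen) if n % size == rank]
    return [s for s in states if s[position] in mine]

states = list(product(['house.jpg', 'lena.jpg'], [1, 2, 4], ['motion', 'gaussian']))

rr0 = round_robin(states, 0, 2)     # first process under round robin
dom0 = domain(states, 0, 0, 2)      # first process, split on the image
dom1 = domain(states, 0, 1, 2)      # second process, split on the image

print(rr0[:2])   # [('house.jpg', 1, 'motion'), ('house.jpg', 2, 'motion')]
print(dom0[0], dom1[0])
```

In this sketch round robin gives both processes states for every image, so a task bound to the image parameter would have to run on each of them, whereas the domain split keeps each image on a single process.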
13
Implementations
  • Python
      • Designed around Iterators and Generators
      • DistributedEnumerator based on pyMPI
      • Ideal for managing experiments on clusters
  • C++
      • Template metaprogramming techniques remove
        abstraction penalties
      • Ideal for applications with many nested loops

14
C++ Example
Generate HTML tables for days of the week, with
hours for the rows and minutes for the columns.
Task Classes

    struct table_task {
      void setup(State state) {
        std::cout << "<table title=\"";
        print_last_param()(state);
        std::cout << "\">\n";
      }
      void cleanup(State) {
        std::cout << "</table>\n";
      }
    };
    struct table_row_task { /* As above with <tr> */ };
    struct table_data_task { /* As above with <td> */ };

Parameter Sweep

    int main() {
      using boost::make_tuple;
      sweep(make_tuple("Sat", "Sun",
            make_tuple(range(24),
            make_tuple(range(0,60,10)))),
            empty_state().
              bind<0>(table_task()).
              bind<1>(table_row_task()).
              bind<2>(table_data_task()),
            print_last_param());
      return 0;
    }

15
Conclusions
  • PSWEEP cleanly separates concerns
      • Parameters
      • Tasks
      • Resources
  • Modern languages enable flexible and
    high-performance implementations

16
Reference
A Lightweight Pattern for Managing Distributed
Computational Experiments. Christopher Mueller,
Douglas Gregor, and Andrew Lumsdaine.
Submitted to HPDC 2006.
http://www.osl.iu.edu/chemuell/new/psweep.php
17
Questions?