Solving Irregular Problems Through Parallel Irregular Trees

1
Solving Irregular Problems Through Parallel
Irregular Trees
  • Fabrizio Baiardi
  • Paolo Mori
  • Laura Ricci
  • Dipartimento di Informatica
  • Università di Pisa
  • Istituto di Informatica e Telematica
  • CNR - Pisa

2
Outline
  • Main features of irregular problems
  • Hierarchical representation of the domain
  • Parallel Irregular Tree library
  • Experimental results
  • Future work

3
Irregular Problems
  • the domain includes a set of elements
    characterised by
    • the position in the domain
    • other problem-specific properties
  • the distribution of the elements is
    • non-homogeneous
    • dynamic and non-predictable
  • the evolution of an element
    • depends upon that of other elements (locality)
    • updates the element properties
  • Examples
    • Barnes-Hut
    • Adaptive Multigrid Methods
    • Radiosity methods

4
Hierarchical Representation
  • the domain is recursively partitioned into a set
    of spaces by applying a problem-dependent
    condition
  • the Hierarchical Tree represents the
    decomposition and each Hnode represents either a
    space or an element
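
The recursive partitioning can be sketched as a quadtree build. This is only an illustration: the splitting condition used here (more than `max_per_space` elements in a space), the square unit domain, and the names `Hnode`, `build_htree`, `collect` and `depth` are assumptions made for the example, not part of the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class Hnode:
    """A node of the hierarchical tree: an internal space or a leaf with elements."""
    x0: float
    y0: float
    size: float
    elements: list = field(default_factory=list)  # (x, y) points held by a leaf
    children: list = field(default_factory=list)  # four sub-spaces, empty for a leaf


def build_htree(points, x0=0.0, y0=0.0, size=1.0, max_per_space=1, max_depth=16):
    """Split a square space into four while it holds more than max_per_space points."""
    node = Hnode(x0, y0, size)
    if len(points) <= max_per_space or max_depth == 0:
        node.elements = list(points)
        return node
    half = size / 2.0
    quadrants = [[], [], [], []]
    for (x, y) in points:
        ix = 1 if x >= x0 + half else 0
        iy = 1 if y >= y0 + half else 0
        quadrants[2 * iy + ix].append((x, y))
    for q, sub in enumerate(quadrants):
        cx = x0 + (q % 2) * half
        cy = y0 + (q // 2) * half
        node.children.append(
            build_htree(sub, cx, cy, half, max_per_space, max_depth - 1))
    return node


def collect(node):
    """Return every element stored in the leaves below node."""
    if not node.children:
        return list(node.elements)
    return [e for c in node.children for e in collect(c)]


def depth(node):
    """Height of the tree rooted at node (a single leaf has depth 0)."""
    if not node.children:
        return 0
    return 1 + max(depth(c) for c in node.children)


# Two nearby points force a deeper decomposition in their corner of the domain.
points = [(0.1, 0.1), (0.12, 0.11), (0.8, 0.9), (0.6, 0.2)]
tree = build_htree(points)
```

The tree is deeper where elements cluster, which is how a non-homogeneous distribution shows up in the hierarchy.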

5
Distributed Hierarchical Tree
  • Htree representation distributed among the
    p-nodes
  • pt = <h0, ..., hn-1, mHt>
  • private Htree (pHt): subtree assigned to a p-node
  • mapping Htree (mHt): represents the hierarchical
    relations among the private Htrees

[Figure: private Htrees h0, h1, h2, h3]
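
One possible way to obtain a set of private parts plus a replicated mapping is sketched below. It assumes that elements are ordered along a Morton (Z-order) curve and cut into contiguous chunks, one per p-node; the slide does not state which mapping strategy the library uses, so the strategy and the names `morton_key`, `distribute` and `owner` are hypothetical. Each private part is kept as a flat sorted list rather than a tree to keep the sketch short.

```python
def morton_key(ix, iy, bits=8):
    """Interleave the bits of two integer cell coordinates (Z-order key)."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)
        key |= ((iy >> b) & 1) << (2 * b + 1)
    return key


def distribute(points, n_pnodes, bits=8):
    """Return (private parts, mapping): contiguous key ranges, one per p-node."""
    cells = 1 << bits
    keyed = sorted(
        (morton_key(min(int(x * cells), cells - 1),
                    min(int(y * cells), cells - 1), bits), (x, y))
        for x, y in points)
    chunk = -(-len(keyed) // n_pnodes)  # ceiling division
    private = [keyed[i * chunk:(i + 1) * chunk] for i in range(n_pnodes)]
    # The mapping is small, so every p-node can hold a full copy of it.
    mapping = [(part[0][0], part[-1][0]) if part else None for part in private]
    return private, mapping


def owner(mapping, key):
    """Look up which p-node owns the given key, using only the mapping."""
    for rank, key_range in enumerate(mapping):
        if key_range is not None and key_range[0] <= key <= key_range[1]:
            return rank
    return None


# 16 points in distinct cells of the unit square, spread over 4 p-nodes.
points = [(i / 8 + 0.01, j / 8 + 0.01) for i in range(4) for j in range(4)]
private, mapping = distribute(points, 4)
```

In this sketch `private` plays the role of h0, ..., hn-1 and `mapping` that of mHt: any p-node can find the owner of an element without holding the element itself.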
6
PIT Library
  • defines
    • PITree
    • PIT operations
  • key point: both the sequential and the parallel
    versions of the application are structured in
    terms of operations on Htrees
  • aims
    • be a simple, complete and effective
      parallelization tool
    • hide the details of parallel programming
      from the user
    • preserve most of the sequential code

7
PIT API
  • main operations
    • PITree creation
    • PITree completion
    • PITree update
  • alternative APIs
    • standard
    • advanced
  • composition of the adopted API
    • standard structure
    • customised for the specific problem

8
PITree Creation
  • it creates the PITree starting from the domain
    elements
    • one (or more) pHt for each p-node
    • one mHt replicated in each p-node
  • it implements a distributed strategy to make the
    best use of memory
  • it needs some user-defined functions to manage
    the elements of the target problem

9
PITree Completion (I)
  • standard API
    • fault prevention and informed fault prevention
    • a single function implements the strategy
    • invoked before each operator
  • PITree_completion(pht_root, stencil_0)
  • tp_op_0(pht_root)

Note: tp_op_0 comes from the sequential code.
10
PITree Completion (II)
  • advanced API
    • informed fault prevention only
    • two distinct functions
      • PITree_det_neighbors: invoked each time the
        neighbourhood relations among the elements
        change
      • PITree_exch_neighbors: invoked before each
        operator
  • PITree_det_neighbors(pht_root, stencil_0)
  • PITree_exch_neighbors(pht_root, stencil_0)
  • tp_op_0(pht_root)

Note: tp_op_0 comes from the sequential code.
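
The split between determining and exchanging neighbours can be illustrated with a small single-process simulation. It assumes that a "fault" means an operator touching an element held by another p-node, and that a stencil is simply a distance radius. The functions `det_neighbors`, `exch_neighbors` and `average_op` are hypothetical stand-ins written for this sketch, not the library's own functions.

```python
def det_neighbors(parts, radius):
    """For each p-node, list the remote elements within radius of a local one."""
    needed = []
    for rank, local in enumerate(parts):
        wanted = set()
        for other, remote in enumerate(parts):
            if other == rank:
                continue
            for eid, (pos, _) in remote.items():
                if any(abs(pos - lpos) <= radius for lpos, _ in local.values()):
                    wanted.add((other, eid))
        needed.append(sorted(wanted))
    return needed


def exch_neighbors(parts, needed):
    """Copy the current value of every needed remote element into a ghost table."""
    return [{eid: parts[src][eid] for src, eid in wanted} for wanted in needed]


def average_op(local, ghosts, radius):
    """Example operator: replace each value by the mean over its neighbourhood."""
    visible = {**local, **ghosts}
    out = {}
    for eid, (pos, _) in local.items():
        near = [v for p, v in visible.values() if abs(p - pos) <= radius]
        out[eid] = (pos, sum(near) / len(near))
    return out


# Two p-nodes, each holding element id -> (position, value).
parts = [{0: (0.0, 1.0), 1: (1.0, 2.0)}, {2: (2.0, 3.0), 3: (3.0, 4.0)}]
needed = det_neighbors(parts, radius=1.0)   # redo only when positions change
ghosts = exch_neighbors(parts, needed)      # redo before every operator
result0 = average_op(parts[0], ghosts[0], radius=1.0)
```

The neighbour lists depend only on positions, so they can be reused across operators until the elements move, while the values must be refreshed before each operator.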
11
PITree Update (I)
  • advanced API: two distinct functions
  • PITree correction
    • updates the mapping of the elements violating the
      mapping strategy
    • it is invoked after each operator that updates
      the distribution
    • tp_op_0(pht_root)
    • PITree_correction(pht_root)
  • PITree balance
    • updates the mapping to redistribute the workload
      among the p-nodes
    • it is invoked after each operator that modifies
      the workload
    • tp_op_0(pht_root)
    • PITree_balance(pht_root, Tresh)
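
A threshold-driven rebalancing step might look like the sketch below. It assumes that the second argument of the balance operation is a tolerance on the relative load imbalance and that elements keep a fixed global order, so rebalancing only moves the cut points between p-nodes. Both points are assumptions for illustration, and `imbalance` and `balance` are hypothetical names.

```python
def imbalance(parts):
    """Relative excess of the most loaded p-node over the mean load."""
    loads = [sum(w for _, w in p) for p in parts]
    mean = sum(loads) / len(loads)
    return max(loads) / mean - 1.0 if mean > 0 else 0.0


def balance(parts, thresh):
    """Re-cut the ordered elements into chunks of similar weight if needed."""
    if imbalance(parts) <= thresh:
        return parts  # balanced enough: no data movement
    ordered = [e for p in parts for e in p]
    target = sum(w for _, w in ordered) / len(parts)
    new_parts, current, acc = [], [], 0.0
    for elem in ordered:
        # Close the current chunk once the cumulative target has been reached.
        if (len(new_parts) < len(parts) - 1 and current
                and acc >= target * (len(new_parts) + 1)):
            new_parts.append(current)
            current = []
        current.append(elem)
        acc += elem[1]
    new_parts.append(current)
    while len(new_parts) < len(parts):
        new_parts.append([])
    return new_parts


# Elements are (name, weight) pairs; p-node 0 holds three times the work of p-node 1.
parts = [[("a", 1), ("b", 1), ("c", 1), ("d", 1), ("e", 1), ("f", 1)],
         [("g", 1), ("h", 1)]]
balanced = balance(parts, thresh=0.1)
```

A larger threshold trades some imbalance for fewer redistributions, which matters when the workload changes a little at every iteration.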

12
PITree Update (II)
  • standard API
    • a single function, PITree update, implements both
      the PITree correction and the balancing
    • PITree update is invoked after each operator
  • tp_op_0(pht_root)
  • PITree_update(pht_root, Tresh)

13
Parallelization
  • Standard
    • the functions of the sequential version are
      inserted into the standard structure
    • the development is straightforward
    • a deep knowledge of the target problem is not
      required
  • Customized
    • the PIT operations are inserted into the
      sequential code according to the semantics of the
      target problem
    • a deep knowledge of the target problem is
      required
    • both the standard and the advanced API can be
      adopted
    • it achieves a better efficiency

14
Sequential Code
  • irregular_problem(tElementList dom)
    • ...
    • root = Htree_creation(dom)
    • ...
    • while (not solution_computed)
      • tp_op_0(root)
      • ...
      • tp_op_n(root)

Note: each problem operator mainly consists of a visit of
the Htree.
15
Standard Structure
  • irregular_problem(tElementList dom)
    • ...
    • pht_root = PITree_creation(dom, dec_el,
      incl_el, rem_el)
    • ...
    • while (not solution_computed)
      • PITree_completion(pht_root, stencil_0)
      • tp_op_0(pht_root)
      • pht_root = PITree_update(pht_root, T)
      • ...
      • PITree_completion(pht_root, stencil_n)
      • tp_op_n(pht_root)
      • pht_root = PITree_update(pht_root, T)
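
The standard structure amounts to wrapping each sequential operator between a completion call and an update call. The sketch below shows only that control flow: the three `PITree_*` functions are dummy stand-ins that record the call order, their bodies are invented for the example, and the fixed iteration count replaces the real termination test.

```python
log = []  # records the order in which the stand-in functions are called


def PITree_creation(dom, dec_el, incl_el, rem_el):
    """Stand-in: a real implementation would build and distribute the tree."""
    log.append("creation")
    return {"elements": list(dom)}


def PITree_completion(pht_root, stencil):
    """Stand-in: a real implementation would fetch the remote data the stencil needs."""
    log.append(f"completion:{stencil}")


def PITree_update(pht_root, T):
    """Stand-in: a real implementation would correct the mapping and rebalance."""
    log.append("update")
    return pht_root


def irregular_problem(dom, operators, stencils, T, max_iters=1):
    """Run every sequential operator inside the completion/update wrapper."""
    pht_root = PITree_creation(dom, None, None, None)
    for _ in range(max_iters):
        for op, stencil in zip(operators, stencils):
            PITree_completion(pht_root, stencil)
            op(pht_root)  # unchanged code from the sequential version
            pht_root = PITree_update(pht_root, T)
    return pht_root


def tp_op_0(tree):
    log.append("op0")


def tp_op_1(tree):
    log.append("op1")


irregular_problem([1, 2, 3], [tp_op_0, tp_op_1], ["stencil_0", "stencil_1"], T=0.1)
```

Because the wrapper is the same for every operator, no knowledge of the problem is needed beyond the stencil each operator uses.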

16
Customised Structure
  • irregular_problem(tElementList dom)
    • pht_root = PITree_creation(dom, dec_el,
      incl_el, rem_el)
    • ...
    • while (not solution_computed)
      • PITree_det_neighbors(pht_root,
        stencil_0..stencil_i)
      • PITree_exch_neighbors(pht_root, stencil_0)
      • tp_op_0(pht_root)
      • PITree_exch_neighbors(pht_root, stencil_i)
      • tp_op_i(pht_root)
      • PITree_correction(pht_root)
      • PITree_det_neighbors(pht_root,
        stencil_i+1..stencil_n)
      • PITree_exch_neighbors(pht_root, stencil_n)
      • tp_op_n(pht_root)
      • PITree_update(pht_root)

17
Validation
  • Applications
    • Adaptive Multigrid Methods
    • Hierarchical Radiosity
  • Parallel architectures
    • PC cluster
      • Intel Pentium II 266 MHz
      • 128 MB
      • 100 Mb/s Fast Ethernet
    • IBM Beowulf (x330)
      • Intel Pentium III 1.133 GHz
      • 1 GB per p-node (2 procs)
      • Myricom LAN (264MB)

18
Adaptive Multigrid Methods
  • fast iterative methods to solve partial
    differential equations
  • discretized and multi-level domain representation
    through a grid hierarchy
  • adaptive problem
    • the discretization is finer where the equation is
      irregular
    • new grids are added during the computation
  • Poisson problem

19
Sequential Code
  • amm(tElementList initial_grid)
    • root = Htree_creation(initial_grid)
    • while (not end)
      • smoothing(root, v, f, all_levels)
      • for level from Lmax downto Lg
        • rest(root, level)
        • restriction(root, level-1)
        • smoothing(root, e, r, level-1)
      • for level from Lg+1 to Lmax
        • prolongation(root, level)
        • correction(root, e, level)
        • smoothing(root, e, r, level)
      • correction(root, v, all_levels)
      • end = norm(root)
      • if (not end) Lmax = refinement(root)
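
The smoothing, restriction, prolongation and correction steps can be seen in a plain multigrid V-cycle for the 1D Poisson problem -u'' = f with zero boundary values. This sketch is deliberately simpler than the code on the slide: it uses uniform grids stored as Python lists, with no adaptive refinement and no Htree, and all function names are chosen for the example.

```python
import math


def smooth(u, f, h, sweeps=3):
    """Gauss-Seidel sweeps for -u'' = f with fixed (zero) boundary values."""
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])


def residual(u, f, h):
    """r = f - A u for the standard 3-point Laplacian."""
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (2 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r


def restrict(r):
    """Full weighting from a grid with n intervals to one with n/2 intervals."""
    nc = (len(r) - 1) // 2
    rc = [0.0] * (nc + 1)
    for i in range(1, nc):
        rc[i] = 0.25 * r[2 * i - 1] + 0.5 * r[2 * i] + 0.25 * r[2 * i + 1]
    return rc


def prolong(e):
    """Linear interpolation from a coarse grid to the next finer one."""
    nf = 2 * (len(e) - 1)
    ef = [0.0] * (nf + 1)
    for i in range(len(e)):
        ef[2 * i] = e[i]
    for i in range(1, nf, 2):
        ef[i] = 0.5 * (ef[i - 1] + ef[i + 1])
    return ef


def v_cycle(u, f, h):
    """One V-cycle: smooth, solve the coarse error equation, correct, smooth."""
    n = len(u) - 1
    smooth(u, f, h)
    if n > 2:
        rc = restrict(residual(u, f, h))
        ec = [0.0] * len(rc)
        v_cycle(ec, rc, 2 * h)
        ef = prolong(ec)
        for i in range(1, n):
            u[i] += ef[i]  # correction step
    smooth(u, f, h)
    return u


# f = pi^2 sin(pi x) has the exact solution u = sin(pi x).
n = 64
h = 1.0 / n
f = [math.pi ** 2 * math.sin(math.pi * i * h) for i in range(n + 1)]
u = [0.0] * (n + 1)
for _ in range(10):
    v_cycle(u, f, h)
err = max(abs(u[i] - math.sin(math.pi * i * h)) for i in range(n + 1))
```

In the adaptive setting described on the slides, the grids of each level cover only part of the domain, which is what makes the data distribution irregular.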

20
Parallel Code (I)
  • amm(tElementList initial_grid)
    • pht_root = PITree_creation(initial_grid,
      dec_el, incl_el, rem_el)
    • while (not end)
      • PITree_det_neighbors(pht_root, stencil_union)
      • PITree_exch_neighbors(pht_root,
        smooth-rest_stencil, all_levels)
      • smoothing(pht_root, v, f, all_levels)
      • for level from Lmax downto Lg
        • PITree_exch_neighbors(pht_root,
          smooth-rest_stencil, level)
        • rest(pht_root, level)
        • PITree_exch_neighbors(pht_root,
          restriction_stencil, level)
        • restriction(pht_root, level-1)
        • PITree_exch_neighbors(pht_root,
          smooth-rest_stencil, level)
        • smoothing(pht_root, e, r, level-1)

21
Parallel code (II)
      • for level from Lg+1 to Lmax
        • PITree_exch_neighbors(pht_root,
          prolongation_stencil, level)
        • prolongation(pht_root, level)
        • correction(pht_root, e, level)
        • PITree_exch_neighbors(pht_root,
          smooth-rest_stencil, level)
        • smoothing(pht_root, e, r, level)
      • correction(pht_root, v, all_levels)
      • PITree_exch_neighbors(pht_root, norm_stencil,
        level)
      • end = norm(pht_root)
      • if (not end)
        • Lmax = refinement(pht_root)
        • pht_root = PITree_update(pht_root, T)

22
[Figure: domain hierarchical decomposition after 10 iterations]

23
Load Balancing
24
Efficiency
25
Hierarchical Radiosity
  • a model of the light exchanges to compute the
    illumination of a scene
  • representation of the scene
    • discretized and hierarchical
    • adaptive
  • locality: interactions among objects at distinct
    abstraction levels

26
Sequential Code
  • hierarchical_rad(segment_list scene)
    • root = Htree_creation(scene)
    • visib_list_det(root)
    • while (not end)
      • Gather_H(root)
      • for level from L_min to L_max
        • Push_H(root, level)
      • for level from L_max downto L_min
        • Pull_H(root, level)
      • end = RefineLink_H(root)
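
The push and pull phases can be sketched on a small patch hierarchy. The version below follows the usual formulation of hierarchical radiosity: radiosity gathered at a node is pushed down to all of its descendants, and the values at the leaves are pulled back up as area-weighted averages. It merges both phases into one recursive visit instead of the level-by-level loops on the slide, leaves out emission and the gather step, and the `Patch` class and `push_pull` function are names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Patch:
    area: float
    gathered: float = 0.0      # radiosity gathered over this node's own links
    radiosity: float = 0.0     # result of the push/pull pass
    children: list = field(default_factory=list)


def push_pull(node, down=0.0):
    """Push gathered radiosity to the leaves, then pull area-weighted averages up."""
    total = down + node.gathered                  # push: inherit from ancestors
    if not node.children:
        node.radiosity = total
    else:
        node.radiosity = sum(
            c.area * push_pull(c, total) for c in node.children) / node.area
    return node.radiosity


# A root patch split into two equal halves; one half gathers extra light.
left = Patch(area=1.0, gathered=0.5)
right = Patch(area=1.0, gathered=0.0)
root = Patch(area=2.0, gathered=1.0, children=[left, right])
push_pull(root)
```

Light gathered at the root reaches both leaves, while the extra light gathered by the left half stays local and only contributes to the root through the average.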

27
Parallel Code (I)
  • hierarchical_rad(segment_list scene)
    • pht_root = PITree_creation(scene, dec_el,
      incl_el, rem_el)
    • PITree_exch_neighbors(pht_root, vis_stencil,
      all_levels)
    • visib_list_det(pht_root)
    • while (not end)
      • PITree_exch_neighbors(pht_root, int_list,
        all_levels)
      • Gather_H(pht_root)
      • for level from L_min to L_max
        • PITree_exch_neighbors(pht_root,
          push_stencil, level)
        • Push_H(pht_root, level)

28
Parallel Code (II)
      • for level from L_max downto L_min
        • PITree_exch_neighbors(pht_root,
          pull_stencil, level)
        • Pull_H(pht_root, level)
      • end = RefineLink_H(pht_root)
      • pht_root = PITree_balance(pht_root)

29
  • Test scene
    • 192 polygons
    • 896 segments

30
Efficiency
31
Future Work
  • the definition of the set of problems that cannot
    be solved adopting our methodology
  • the definition of programming constructs for the
    considered class of problems