Title: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer


1
Efficient Parallel Refinement for Hierarchical
Radiosity on a DSM computer
  • François X. Sillion, Jean-Marc Hasenfratz
  • iMAGIS

2
Radiosity
3
Hierarchical Radiosity
  • Hierarchical representation (mesh)
  • Interactions computed at appropriate level
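
Computing an interaction "at the appropriate level" can be sketched as a recursive refinement over the element hierarchy. The sketch below is only an illustration: the `Element` class, the quadtree split and the area-only criterion `max_area` are assumptions made here, whereas a real refinement criterion would typically estimate transferred energy or error.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A node of the hierarchical mesh (hypothetical, area-only model)."""
    area: float
    children: list = field(default_factory=list)

    def subdivide(self):
        # Quadtree split: four children, each with a quarter of the area.
        if not self.children:
            self.children = [Element(self.area / 4) for _ in range(4)]
        return self.children


def refine(a, b, max_area, links):
    """Push the interaction (a, b) down until both ends are small enough."""
    if max(a.area, b.area) <= max_area:
        links.append((a, b))  # the interaction is established at this level
    elif a.area >= b.area:
        for child in a.subdivide():
            refine(child, b, max_area, links)
    else:
        for child in b.subdivide():
            refine(a, child, max_area, links)


wall, floor = Element(16.0), Element(4.0)
links = []
refine(wall, floor, max_area=4.0, links=links)
print(len(links))  # number of links created between wall pieces and the floor
```

Here the larger element is always the one subdivided, so the large wall ends up linked to the floor through its four children rather than directly.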

4
Strategies for Hierarchical Radiosity
  • Gathering
    • Memory consuming (store links)
    • Easier dynamic modifications
  • Shooting
    • Memory efficient
    • Requires heuristic to decide shooting level
    • Links recomputed as needed

5
Parallel Approaches
  • Two approaches
    • Data exchange via message-passing algorithms
    • Shared memory
  • Partial solutions possible if natural
    partitioning exists (e.g. inside buildings)
    [Fun96, FY97]
  • Virtual interfaces are harder to handle [RAPP97]
  • Load balancing problem [Cav99]

6
Scheduler
  • Force all link refinement operations through a
    scheduler object
  • Natural place for
    • Parallel synchronization
    • Orientation and steering of calculation
  • Advantages of using scheduler
    • Global view of all pending tasks at any given time
    • Task extraction can be made according to various
      selection criteria

7
Example (sequential) schedulers
  • Stack scheduler (depth first refinement)
  • Priority scheduler
    • Use simple structure (heap)
    • Hierarchical level (breadth first)
    • Size, energy, error
    • Interactive user control
  • Random scheduler...
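
The scheduler variants listed above can be sketched as interchangeable objects sharing a `push`/`pop` interface. This is a minimal illustration, not the authors' code: the class names, the `push(task, priority)` signature and the string tasks are assumptions, and the priority value stands in for whichever criterion (level, size, energy, error) is chosen.

```python
import heapq
import itertools
import random


class StackScheduler:
    """LIFO extraction: the newest task is refined first (depth first)."""

    def __init__(self):
        self._tasks = []

    def push(self, task, priority=0.0):
        self._tasks.append(task)  # priority is ignored by this scheduler

    def pop(self):
        return self._tasks.pop()

    def __len__(self):
        return len(self._tasks)


class PriorityScheduler:
    """Heap-based extraction: the task with the largest priority comes first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker, keeps insertion order

    def push(self, task, priority=0.0):
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._heap, (-priority, next(self._counter), task))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


class RandomScheduler:
    """Uniformly random extraction among the pending tasks."""

    def __init__(self, seed=None):
        self._tasks = []
        self._rng = random.Random(seed)

    def push(self, task, priority=0.0):
        self._tasks.append(task)

    def pop(self):
        i = self._rng.randrange(len(self._tasks))
        # Swap the chosen task to the end so removal is O(1).
        self._tasks[i], self._tasks[-1] = self._tasks[-1], self._tasks[i]
        return self._tasks.pop()

    def __len__(self):
        return len(self._tasks)


def drain(scheduler, tasks):
    """Push all (task, priority) pairs, then pop until the scheduler is empty."""
    for task, priority in tasks:
        scheduler.push(task, priority)
    return [scheduler.pop() for _ in range(len(scheduler))]


pending = [("link A", 0.2), ("link B", 0.9), ("link C", 0.5)]
print(drain(StackScheduler(), pending))
print(drain(PriorityScheduler(), pending))
print(sorted(drain(RandomScheduler(seed=1), pending)))
```

Because all three classes expose the same interface, the solver can switch refinement order by swapping the scheduler object without touching the refinement code.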

8
Architecture
[Figure: architecture diagram with Main / GUI, Solver and Scheduler components]
9
Synchronization
  • Scheduler
    • Single object talks to all refiners => Danger!
    • Use simple blocks of refinement jobs
  • Hierarchical data structure
    • Consistency of hierarchical scene structure
  • Interactions
    • Links or energy representations
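
One way to picture the "blocks of refinement jobs" idea: each refiner thread takes several jobs per access to the shared scheduler, so the single object is locked less often. The sketch below is an assumption-laden illustration (the `BlockScheduler` class, the integer "depth" tasks and the `refine` rule are all hypothetical), and it shows only the synchronization structure; Python threads will not reproduce the timings of the talk.

```python
import threading


class BlockScheduler:
    """Shared task pool; a worker takes up to `block_size` jobs per access."""

    def __init__(self, block_size):
        if block_size < 1:
            raise ValueError("block_size must be at least 1")
        self.block_size = block_size
        self.extractions = 0          # how many times a block was handed out
        self._tasks = []
        self._active = 0              # workers currently processing a block
        self._cond = threading.Condition()

    def push_many(self, tasks):
        with self._cond:
            self._tasks.extend(tasks)
            self._cond.notify_all()

    def pop_block(self):
        """Return a block of tasks, or [] once all work is finished."""
        with self._cond:
            # An empty pool is final only when no worker can still add tasks.
            while not self._tasks and self._active > 0:
                self._cond.wait()
            if not self._tasks:
                return []
            block = self._tasks[-self.block_size:]
            del self._tasks[-self.block_size:]
            self._active += 1
            self.extractions += 1
            return block

    def done_block(self, new_tasks):
        """Report a finished block together with the tasks it created."""
        with self._cond:
            self._tasks.extend(new_tasks)
            self._active -= 1
            self._cond.notify_all()


def refine(depth):
    """Hypothetical rule: a link at depth d > 0 yields four links at d - 1."""
    return [depth - 1] * 4 if depth > 0 else []


def worker(scheduler, counts, index):
    while True:
        block = scheduler.pop_block()
        if not block:
            return
        children = []
        for depth in block:
            children.extend(refine(depth))
            counts[index] += 1
        scheduler.done_block(children)


def run(num_threads, block_size, root_depth=5):
    scheduler = BlockScheduler(block_size)
    scheduler.push_many([root_depth])
    counts = [0] * num_threads
    threads = [threading.Thread(target=worker, args=(scheduler, counts, i))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counts), scheduler.extractions


print(run(num_threads=4, block_size=16))
```

Larger blocks mean fewer scheduler accesses for the same number of jobs, at the cost of coarser load balancing between refiners.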

10
Test scenes
VRLab - 51 182 polygons
Aircraft - 184 456 polygons
Office - 5 285 polygons
11
Measurements
  • Hardware architecture
    • ccNUMA SGI Origin 2000 computer with 64 microprocessors
    • Limited to 40 R10000 microprocessors at 195 MHz

12
Measurements
  • Time measurements
    • Refinement times: system call which returns clock
      ticks
    • Memory access, cache access: perfex software
      tool which uses the 31 hardware counters of the R10000

13
Results
  • CPU Refinement time

14
Results
Speed-up
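
Speed-up here is presumably the usual ratio of single-processor time to p-processor time. A minimal sketch of that arithmetic follows; the function names are illustrative and the timings are made-up values, not measurements from the talk.

```python
def speedup(t_one, t_p):
    """Ratio of single-processor time to the time on p processors."""
    return t_one / t_p


def efficiency(t_one, t_p, processors):
    """Speed-up divided by the number of processors (1.0 is ideal)."""
    return speedup(t_one, t_p) / processors


# Hypothetical timings in seconds, for illustration only.
t_one, t_p, processors = 120.0, 4.0, 40
print(speedup(t_one, t_p), efficiency(t_one, t_p, processors))
```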
15
Results
Influence of the size of link blocks on overall
CPU time
16
Results
Memory used before and during the iterations
17
Conclusions
  • Very simple atomic tasks
    • Easily managed with a single scheduler structure
  • Easily implemented on top of an existing
    radiosity simulation code
    • Thread setup
    • New link creation upon refinement decision

18
Future work
  • Understanding the peculiar behaviour observed
    for the aircraft scene
  • Dealing with graphics resources for optimized
    calculations using graphics hardware

19
Acknowledgements
  • Peter Kipfer contributed to the design and early
    implementation of this work.
  • Thanks to Centre Charles Hermite for providing
    access to its computational resources
  • Laurent Alonso provided useful advice on
    performance questions.
  • This work was supported in part by the European
    Union's ESPRIT project 24944, ARCADE (Making
    Radiosity Usable).