Title: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

1
Efficient Parallel Refinement for Hierarchical
Radiosity on a DSM computer

François X. Sillion, Jean-Marc Hasenfratz
iMAGIS

2
Radiosity
3
Hierarchical Radiosity

Hierarchical representation (mesh)
Interactions computed at appropriate level
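
The idea of computing each interaction at an appropriate level can be sketched as a recursive refinement: a link between two elements is kept if its estimated transfer is small enough, otherwise the larger element is subdivided and the children are linked instead. The sketch below is only an illustration under stated assumptions: the 1D interval `Element`, the `estimate` function, and the `eps` and `min_size` parameters are all hypothetical placeholders, not the criteria used in the presented system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Hypothetical 1D element (an interval) standing in for a mesh patch."""
    lo: float
    hi: float

    @property
    def size(self):
        return self.hi - self.lo

    @property
    def center(self):
        return 0.5 * (self.lo + self.hi)

    def split(self):
        mid = self.center
        return [Element(self.lo, mid), Element(mid, self.hi)]


def estimate(p, q):
    """Placeholder transfer estimate: grows with size, falls with distance."""
    d = abs(p.center - q.center)
    return p.size * q.size / (d * d)


def refine(p, q, eps, min_size, links):
    """Link p and q at the coarsest level where the estimate is acceptable."""
    if estimate(p, q) <= eps or max(p.size, q.size) <= min_size:
        links.append((p, q))
        return
    # Subdivide the larger of the two elements and recurse on its children.
    if p.size >= q.size:
        for child in p.split():
            refine(child, q, eps, min_size, links)
    else:
        for child in q.split():
            refine(p, child, eps, min_size, links)
```

With a loose threshold the two top-level elements are linked directly; tightening `eps` pushes the links down the hierarchy, so nearby parts get finer links than distant ones.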

4
Strategies for Hierarchical Radiosity

Gathering
Memory consuming (stores links)
Easier dynamic modifications
Shooting
Memory efficient
Requires heuristic to decide shooting level
Links recomputed as needed

5
Parallel Approaches

Two approaches
Data exchange via message-passing algorithms
Shared memory
Partial solutions possible if natural
partitioning exists (e.g. inside buildings)
[Fun96, FY97]
Virtual interfaces are harder to handle [RAPP97]
Load balancing problem [Cav99]

6
Scheduler

Force all link refinement operations through a
scheduler object
Natural place for
Parallel synchronization
Orientation and steering of calculation
Advantages of using scheduler
Global view of all pending tasks at any given time
Task extraction can be made according to various
selection criteria
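
A scheduler object of this kind might look roughly like the following sketch. It is not the presented implementation: the `LinkTask` fields, the `Scheduler` method names, and the `select` callback are assumptions made for illustration. The point is only that every pending task lives in one place, so synchronization (here a single lock) and task selection are both handled there.

```python
import threading
from dataclasses import dataclass


@dataclass
class LinkTask:
    """Hypothetical refinement task: one link between two hierarchy elements."""
    source: str
    receiver: str
    level: int = 0
    energy: float = 0.0


class Scheduler:
    """Single entry point for all pending link refinement tasks."""

    def __init__(self, select=None):
        self._pending = []
        self._lock = threading.Lock()
        # select(pending) returns the index of the next task to extract.
        # The default picks the most recently submitted task.
        self._select = select or (lambda pending: len(pending) - 1)

    def submit(self, task):
        with self._lock:
            self._pending.append(task)

    def extract(self):
        """Remove and return the next task, or None if nothing is pending."""
        with self._lock:
            if not self._pending:
                return None
            return self._pending.pop(self._select(self._pending))

    def pending_count(self):
        with self._lock:
            return len(self._pending)
```

Passing a different `select` function changes the extraction order without touching the refiners, for example `Scheduler(select=lambda p: max(range(len(p)), key=lambda i: p[i].energy))` to take the highest-energy link first.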

7
Example (sequential) schedulers

Stack scheduler (depth first refinement)
Priority scheduler
Use simple structure (heap)
Hierarchical level (breadth first)
Size, energy, error
Interactive user control
Random scheduler...
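
The stack and priority schedulers listed above could be sketched as follows, using Python's standard `heapq` module for the heap. The task representation (a dict with `level` and `energy` keys) and the class names are assumptions for illustration only.

```python
import heapq
import itertools


class StackScheduler:
    """Last in, first out: refinement proceeds depth first."""

    def __init__(self):
        self._stack = []

    def submit(self, task):
        self._stack.append(task)

    def extract(self):
        return self._stack.pop() if self._stack else None


class PriorityScheduler:
    """Extracts the task with the smallest key, using a binary heap."""

    def __init__(self, key):
        self._key = key
        self._heap = []
        # The counter breaks ties in insertion order, so tasks themselves
        # are never compared.
        self._counter = itertools.count()

    def submit(self, task):
        entry = (self._key(task), next(self._counter), task)
        heapq.heappush(self._heap, entry)

    def extract(self):
        return heapq.heappop(self._heap)[2] if self._heap else None


# Hierarchical level as the key gives breadth-first refinement.
breadth_first = PriorityScheduler(key=lambda task: task["level"])
# Negated energy as the key extracts the highest-energy link first.
by_energy = PriorityScheduler(key=lambda task: -task["energy"])
```

Size or error criteria fit the same pattern by changing only the key function.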

8
Architecture
Main / GUI
Solver
Scheduler
9
Synchronization

Scheduler
Single object talks to all refiners -> Danger!
Use simple blocks of refinement jobs
Hierarchical data structure
Consistency of hierarchical scene structure
Interactions
Links or energy representations
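
One way to limit contention on a single scheduler is to hand out refinement jobs in blocks, so that each refiner thread takes the lock once per block rather than once per link. The sketch below illustrates that idea only; the `BlockScheduler` class, the toy `refine` function (a task is just a depth that subdivides until depth 3), and the termination scheme are assumptions, not the presented code. It does not model consistency of the scene hierarchy.

```python
import threading


class BlockScheduler:
    """Hands out refinement tasks in blocks to reduce lock traffic."""

    def __init__(self, block_size):
        self.block_size = block_size      # assumed to be at least 1
        self._pending = []
        self._in_flight = 0               # blocks currently being refined
        self._cond = threading.Condition()

    def submit_block(self, tasks):
        with self._cond:
            self._pending.extend(tasks)
            self._cond.notify_all()

    def extract_block(self):
        """Return up to block_size tasks, or None once all work is done."""
        with self._cond:
            # Wait while other refiners may still produce new tasks.
            while not self._pending and self._in_flight > 0:
                self._cond.wait()
            if not self._pending:
                return None
            block = self._pending[-self.block_size:]
            del self._pending[-self.block_size:]
            self._in_flight += 1
            return block

    def finish_block(self, new_tasks):
        """Report a finished block and submit the tasks it created."""
        with self._cond:
            self._pending.extend(new_tasks)
            self._in_flight -= 1
            self._cond.notify_all()


def refine(task):
    """Toy refinement: a task is a depth; subdivide in two until depth 3."""
    return [task + 1, task + 1] if task < 3 else []


def worker(scheduler, done):
    while (block := scheduler.extract_block()) is not None:
        new_tasks = []
        for task in block:
            new_tasks.extend(refine(task))
        done.append(len(block))
        scheduler.finish_block(new_tasks)


def run(num_threads=4, block_size=8):
    """Refine from a single root task and return the number of tasks processed."""
    scheduler = BlockScheduler(block_size)
    scheduler.submit_block([0])
    done = []
    threads = [threading.Thread(target=worker, args=(scheduler, done))
               for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(done)
```

The block size is a trade-off: larger blocks mean fewer lock acquisitions, while smaller blocks leave more tasks visible to idle threads.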

10
Test scenes
VRLab - 51 182 polygons
Aircraft - 184 456 polygons
Office - 5 285 polygons
11
Measurements

Hardware architecture
ccNUMA SGI Origin 2000 computer with 64 microprocessors
Limited to 40 R10000 microprocessors at 195 MHz

12
Measurements

Time measurements
Refinement times: system call which returns clock
ticks
Memory access, cache access: perfex software
tool, which uses the 31 hardware counters of the R10000

13
Results

CPU Refinement time

14
Results
Speed-up
15
Results
Influence of the size of link blocks on overall
CPU time
16
Results
Memory used before and during the iterations
17
Conclusions

Very simple atomic tasks
Easily managed with a single scheduler structure
Easily implemented on top of an existing
radiosity simulation code
Thread setup
New link creation upon refinement decision

18
Future work

Understanding the peculiar behaviour observed
for the aircraft scene
Dealing with graphics resources for optimized
calculations using graphics hardware

19
Acknowledgements

Peter Kipfer contributed to the design and early
implementation of this work.
Thanks to Centre Charles Hermite for providing
access to its computational resources
Laurent Alonso provided useful advice on
performance questions.
This work was supported in part by the European
Union's ESPRIT project 24944, ARCADE (Making
Radiosity Usable).
