Afrigraph Tutorial B: Interactive Ray-Tracing

1
Afrigraph Tutorial B: Interactive Ray-Tracing
  • Ingo Wald
  • Philipp Slusallek
  • Saarland University
  • Computer Graphics Group
  • http://graphics.cs.uni-sb.de

2
  • For almost 20 years, researchers have argued that
    eventually, Ray-Tracing will become faster than
    rasterization

3
  • For almost 20 years, researchers have argued that
    eventually, Ray-Tracing will become faster than
    rasterization
  • And nothing happened...
  • Well, almost ...

4
UNC Powerplant (12.5 Mtris, >10 fps)
5
Four Power Plants (50 Mtris)
6
Tutorial Overview
  • Introduction
  • Introduction to Ray-Tracing
  • Discussion: Ray-Tracing versus Rasterization
  • Previous Work
  • Approximating Ray-Tracing
  • Accelerated Ray-Tracing
  • Interactive Ray-Tracing on PCs
  • Coherent Ray-Tracing Implementation
  • Comparisons (SW / HW)
  • Distributed RT of Massive Models
  • Outlook: Hardware Architectures for Ray-Tracing
  • Future Research and Conclusions

8
Introduction to Ray-Tracing
  • In principle: very simple algorithm
  • For each pixel
  • Create ray through that pixel
  • Cast ray into scene and find closest intersection
  • Shade ray at intersection point
  • Can also shoot new rays during shading
  • Determine visibility of point lights by shadow
    rays
  • Compute reflected/refracted light by recursively
    tracing reflection-/refraction-rays
  • Basically, that's all
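
The loop described on this slide can be sketched in a few lines. This is a minimal illustration, not the tutorial's implementation: it assumes a scene made only of spheres, a camera at the origin looking down -z, and "shading" that simply returns a stored colour. The names `hit_sphere`, `closest_hit`, and `render` are hypothetical, and shadow and reflection rays are left out.

```python
import math

def hit_sphere(origin, direction, center, radius):
    """Nearest positive ray parameter t for a unit-length direction, or None."""
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None                      # ray misses the sphere
    t = -b - math.sqrt(disc)             # front hit only (origin assumed outside)
    return t if t > 1e-6 else None

def closest_hit(origin, direction, spheres):
    """Exhaustive search for the closest intersection: (t, colour) or None."""
    best = None
    for center, radius, colour in spheres:
        t = hit_sphere(origin, direction, center, radius)
        if t is not None and (best is None or t < best[0]):
            best = (t, colour)
    return best

def render(width, height, spheres, background=0):
    """For each pixel: create a ray, find the closest hit, 'shade' it."""
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            px = (x + 0.5) / width * 2 - 1
            py = 1 - (y + 0.5) / height * 2
            n = math.sqrt(px * px + py * py + 1)
            direction = (px / n, py / n, -1 / n)
            hit = closest_hit((0.0, 0.0, 0.0), direction, spheres)
            row.append(hit[1] if hit else background)
        image.append(row)
    return image
```

The exhaustive loop in `closest_hit` is exactly the part that a spatial index structure is meant to replace.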

9
Ray-Tracing Algorithm
10
Introduction to Ray-Tracing
  • Only three main components
  • Generating rays
  • Finding the closest intersection of a ray
  • Ray traversal
  • Ray-object intersection
  • Shading

11
Ray-Generation
  • Generate initial ray for each pixel
  • Other camera models are trivial
  • Fisheye lens
  • Non-linear distortions/Lens effects
  • Motion blur, depth of field
  • Options
  • More samples for anti-aliasing
  • Adaptive Sampling
  • Combine with IBR
  • E.g. RenderCache: reuse samples by reprojection
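
As a rough illustration of why other camera models are cheap in a ray tracer, the sketch below generates a primary ray direction for a pinhole camera and for an equidistant fisheye lens. Only the ray-generation function changes; nothing downstream needs to know. The function names, the field-of-view parameters, and the convention that the camera looks down -z are assumptions made for this example.

```python
import math

def pinhole_ray(x, y, width, height, fov_deg=60.0):
    """Unit direction through the centre of pixel (x, y); fov is vertical."""
    half = math.tan(math.radians(fov_deg) / 2)
    aspect = width / height
    px = ((x + 0.5) / width * 2 - 1) * half * aspect
    py = (1 - (y + 0.5) / height * 2) * half
    n = math.sqrt(px * px + py * py + 1)
    return (px / n, py / n, -1 / n)

def fisheye_ray(x, y, width, height, fov_deg=180.0):
    """Equidistant fisheye: angle from the view axis grows linearly with
    distance from the image centre. Returns None outside the image circle."""
    u = (x + 0.5) / width * 2 - 1
    v = 1 - (y + 0.5) / height * 2
    r = math.hypot(u, v)
    if r > 1.0:
        return None
    theta = r * math.radians(fov_deg) / 2
    phi = math.atan2(v, u)
    s = math.sin(theta)
    return (s * math.cos(phi), s * math.sin(phi), -math.cos(theta))
```

Anti-aliasing with more samples per pixel would amount to calling the same function with jittered offsets in place of the fixed `0.5`.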

12
Ray-Traversal
Grid (2D)
  • Need to find objects quickly
  • Exhaustive search infeasible
  • Build spatial index structure
  • Grid, octree, BSP-tree, BVH, ...
  • Advantages
  • Logarithmic complexity
  • Occlusion culling
  • Early ray termination
  • Problems
  • Multiple intersection computations
  • (objects often in multiple voxels)
  • Dynamic scenes ?
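
A small sketch of the simplest index structure named above, a uniform grid in 2D, may help to show where the "objects often in multiple voxels" problem comes from. The function `build_grid` and its box format `(x0, y0, x1, y1)` are assumptions made for this example, not taken from the tutorial.

```python
def build_grid(boxes, cell):
    """Map each grid cell (cx, cy) to the indices of the boxes overlapping it.

    boxes: list of axis-aligned bounding boxes (x0, y0, x1, y1).
    cell:  edge length of a square grid cell.
    """
    grid = {}
    for index, (x0, y0, x1, y1) in enumerate(boxes):
        for cy in range(int(y0 // cell), int(y1 // cell) + 1):
            for cx in range(int(x0 // cell), int(x1 // cell) + 1):
                grid.setdefault((cx, cy), []).append(index)
    return grid
```

An object that straddles a cell border is referenced from every cell it overlaps, so a ray passing through those cells may test it more than once.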

Octree (2D)
13
Ray-Object-Intersection
  • Need to compute intersections fast
  • Requires many floating point operations
  • But typically dominated by traversal (2:1)
  • Plenty of algorithms
  • Plenty of primitives
  • Even for triangles
  • Optimizations
  • Use SIMD CPU-extensions (SSE, AltiVec, 3D-Now)
  • Data parallel execution
  • Proper caching of data
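
The slide does not say which triangle test is used. As one of the "plenty of algorithms", the widely used Möller-Trumbore test is sketched below; it is swapped in here for illustration and is not claimed to be the tutorial's method. The helper names are hypothetical.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def intersect_triangle(orig, dirn, a, b, c, eps=1e-9):
    """Return (t, u, v) for a hit in front of the origin, else None.

    t is the ray parameter; u and v are barycentric coordinates of the hit.
    """
    e1, e2 = sub(b, a), sub(c, a)
    p = cross(dirn, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None                 # ray is parallel to the triangle plane
    inv = 1.0 / det
    s = sub(orig, a)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(dirn, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return (t, u, v) if t > eps else None
```

The early exits on `u` and `v` are the kind of data-dependent branches that make a scalar version hard to map onto SIMD units directly.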

14
Shading
  • Lots of reflection models possible
  • Phong, Cook-Torrance, Ward, ...
  • Direct use of Shading Languages (RenderMan)
  • Shading after visibility has been computed
  • No overhead due to overdraw
  • Every ray is shaded exactly once
  • Can generate new rays
  • Shadow, reflection, transmission, ...
  • Need to deal with recursion
  • Rendering cost linear in rays traced

15
Introduction to Ray-Tracing
  • Only three main components
  • Generating rays
  • Finding the closest intersection of a ray
  • Ray traversal
  • Ray-object intersection
  • Shading
  • Problem
  • Finding the closest intersection is very expensive
  • And lots of rays per image

16
Rasterization Pipeline
  • In contrast: Rasterization
  • Efficient HW implementation
  • Use of object coherence
  • Many new features
  • Rendering is driven by App.
  • Application submits geometry
  • Visibility determined at end
  • Z-buffer fragment test

Pipeline stages (figure): Application → T&L, Vertex Ops → Rasterization → Texturing → Fragment Ops → Fragment Tests → Framebuffer
17
Rasterization: Drawbacks
  • Drawbacks of this approach
  • Use of object coherence
  • Only if triangle is large
  • Rendering is driven by App.
  • Application has to know what is visible
  • Efficient occlusion culling is hard
  • Visibility determined at end
  • Overdraw: discard all but one fragment
  • High depth complexity: very inefficient

18
Ray-Tracing versus Rasterization
  • Flexibility
  • Handling unstructured groups of rays
  • Image-based rendering, reflections, shadows
  • Generality
  • Ray-Tracing is the basis for many algorithms
  • Global illumination, visibility, ...
  • Used in many disciplines
  • Physics, Biology, Chemistry, Telecom, ...

19
Ray-Tracing versus Rasterization
  • Simple and Efficient Shading
  • Shading happens after visibility computation
  • Direct use of Shading Languages
  • Correctness & Image Quality
  • Rasterization inherently relies on approximations
  • Environment maps, shadow maps, ...
  • Ray-traced images are correct by default
  • True reflections and shadows
  • Use of approximations is optional

20
Ray-Tracing versus Rasterization
  • Parallel Scalability
  • Ray-Tracing is embarrassingly parallel
  • (e.g. each pixel independent of all others)
  • Scales well with the available hardware
  • Needs fast access to scene data base

21
Ray-Tracing versus Rasterization
  • Scalability with Scene Size
  • Occlusion Culling & Logarithmic Complexity
  • RT never even looks at invisible geometry
  • RT traversal allows for efficient searching
    O(log N)
  • Rasterization shows linear behavior O(N)
  • → RT wins for complex scenes
  • But rasterization is improving
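
To make the scaling argument concrete, here is a rough worked estimate using the two model sizes shown at the start of the tutorial (12.5 and 50 Mtris). It assumes, which the slide does not state, a balanced binary hierarchy with about one triangle per leaf, so that one root-to-leaf descent touches roughly log2 N nodes. Real trees are unbalanced and a ray visits several leaves, so only the growth rate is meaningful.

```latex
\begin{align*}
N_1 &= 12.5\times 10^{6}: & \log_2 N_1 &\approx 23.6 \\
N_4 &= 50\times 10^{6}:   & \log_2 N_4 &= \log_2 N_1 + 2 \approx 25.6 \\
\frac{N_4}{N_1} &= 4:     & \frac{\log_2 N_4}{\log_2 N_1} &\approx 1.08
\end{align*}
```

Under these assumptions, quadrupling the scene adds about two traversal steps per descent (roughly 8%), whereas work that is linear in N grows by a factor of four.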

22
Ray-Tracing versus Rasterization
  • Coherence
  • Key to efficient rendering
  • Rasterization: object coherence
  • Allows for efficient HW implementation
  • But only really efficient for large triangles
  • Ray-Tracing: ray coherence
  • Improved caching, reduced bandwidth
  • Allows for data parallel computation
  • RT has much more coherence than assumed
  • But harder to exploit

23
Ray-Tracing versus Rasterization
  • Conclusion of that Comparison
  • Ray Tracing has many advantages
  • These advantages become ever more pronounced
  • Not only quality, also efficiency
  • But Ray-Tracing is (still) costly
  • Have to make it faster!

25
Previous and Related Work
  • Two ways to achieve ray-tracing-like quality
    interactively
  • Trace fewer rays per frame: approximate
    ray-tracing
  • Rasterization hardware
  • Image-based techniques
  • Interpolation of ray-traced results
  • Trace more rays/sec: accelerated ray-tracing
  • Better data structures
  • Better algorithms
  • Better implementations
  • Parallel processing

27
Approximated Ray-Tracing: Rasterization Hardware
  • HW-Accelerated vista/shadow buffers
  • Compute visible geometry in HW
  • Lookup of geometry in frame buffer
  • Only works for primary rays and point lights
  • Creates artifacts (e.g. shadow buffer resolution)
  • Augmenting hardware with RT effects
  • Selective ray-tracing
  • Integrate ray-tracing with OpenGL rendering
  • Rasterization for diffuse objects
  • Textures or splatting [Stamminger/Haber 00/01]
    for ray-traced samples

28
Approximated Ray-Tracing: Corrective Textures
29
Approximated Ray-Tracing: Image-Based Techniques
  • RenderCache [Walter et al. 99]
  • Store ray samples per pixel (color, depth, ...)
  • Reproject samples for next frame
  • Detect and fill holes by sending few new rays
  • Heuristic algorithms based on neighborhood
  • Locate and correct errors (shadow, etc)
  • Pseudo-randomly sample a few other pixels
  • Adaptively sample near error regions
  • But: reprojection and heuristics are expensive
  • Pays off (only) when pixels are very expensive to
    compute directly (e.g. global illumination)
  • Scales badly with CPUs

30
Approximated Ray-Tracing: Image-Based Techniques
  • Holodeck [Ward 98]
  • Similar to RenderCache, but
  • Long term storage of ray samples on disk
  • Fast access to samples based on grid structure
  • Builds light-field-like data representation

31
Approximated Ray-Tracing: Image-Based Techniques
  • Interpolation in the image plane
  • Pixel-selected ray-tracing [Akimoto 89]
  • Coarse sampling grid
  • Adaptive refinement based on error criteria
  • Linear interpolation between samples
  • General ray interpolation [Bala 99]
  • Object-/Ray-/Image-Space
  • Time
  • Error bounded

33
Accelerated Ray-Tracing: Better Data Structures/Algorithms
  • Best data structure (Grid vs. BSP vs. ...)?
  • Always scene and implementation dependent
  • In practice, most do about equally well
  • Well-researched topic → new data structures are
    unlikely to be found
  • But: potential for better algorithms
  • Can we better exploit coherence?
  • Can we build data structures faster?
  • Can we build data structures fully automatically?
  • Also: need for dynamic data structures

34
Accelerated Ray-Tracing: Parallelization on Supercomputers
  • RT of large CSG models [Muuss 95]
  • Motivation: interactively render complex data
    sets
  • Idea: use ray-tracing
  • Flexibility: avoid tessellation of CSG models
  • Take advantage of logarithmic complexity of RT
  • Exploit parallelism
  • Implementation
  • Optimized, general RT algorithm
  • 96 CPUs, SGI PowerChallenge, shared memory
  • Results
  • 1-2 frames per second at video resolution (in
    '95!)

35
Accelerated Ray-Tracing: Parallelization on Supercomputers
  • Utah Parallel RT System [Parker 99]
  • Similar approach to Muuss
  • Parallelization on shared memory machine
  • Supports general primitives and volume data sets
  • Results
  • Has shown scalability up to 128 CPUs
  • Importance of caching analysis
  • New goal: interactive visual cues for
    visualization (same information at less cost)

37
IRT on PCs: What to keep in mind
  • PC hardware has changed dramatically
  • Processors become much faster
  • But increase in ray-tracing speed is gradual
  • Increasing gap between speed of CPU and memory
  • But ray-tracing algorithm did not change
  • SIMD extensions
  • Flops become increasingly cheap
  • But difficult to take advantage of in ray-tracing
  • Fast (and cheap) networking: network of PCs
  • But good performance on non-shared-memory is hard
  • Small clusters are around everywhere

38
IRT on PCs: What to keep in mind
  • PC hardware has changed dramatically
  • Have to adapt our algorithms!
  • Special emphasis on
  • Keeping the CPU busy
  • Memory, caching (1 cache miss can cost several
    triangle intersections)
  • SIMD
  • Not so important any more
  • Instruction count, avoiding float ops

39
General Optimizations: Cache
  • Main memory is too slow for CPU (1:10)
  • (bandwidth and latency)
  • Keep relevant data in caches
  • Design algorithms for cache reuse → coherence
  • Align data to cache lines (32 bytes)
  • Separate data according to usage
  • Separate volatile from non-volatile data
  • Store intersection data separate from shading
    data (e.g. shading normals not needed for
    intersection)
  • Prefetch data
  • Design algorithms to enable data access prediction

40
General Optimizations: Cache
  • Cache reuse example: triangle data structure
  • Variant 1: struct Triangle { Vec3f *a, *b, *c; }
  • Intersect() routine works on this structure
  • Prefetching hard (2 levels of indirection)
  • Data stored in 4 different memory regions
  • (1 struct + 3 vectors)
  • Worst case: 8 cache misses
  • (if each of the 4 data items overlaps a cache-line border)

41
General Optimizations: Cache
  • Cache reuse example: triangle data structure
  • Variant 2: with preprocessed intersection data
  • All necessary data packed into 48 aligned
    bytes (see paper)
  • Con: additional data to store (48 bytes/triangle)
  • But several advantages
  • At most 2 cache misses
  • 1 contiguous memory region → trivial to prefetch
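
The cache-miss counts quoted for the two triangle layouts can be checked with a little address arithmetic. The sketch below counts how many 32-byte cache lines a block of memory can touch in the worst case. It rests on assumptions that the slides do not spell out: Variant 1 is read as a struct of three 4-byte pointers to separately stored 12-byte `Vec3f` values, every item is 4-byte aligned, and the 48-byte record of Variant 2 is 16-byte aligned. All names are hypothetical.

```python
CACHE_LINE = 32  # bytes, the line size given on the cache slide

def lines_touched(addr, size, line=CACHE_LINE):
    """Number of cache lines covered by bytes [addr, addr + size)."""
    return (addr + size - 1) // line - addr // line + 1

def worst_case(size, alignment, line=CACHE_LINE):
    """Maximum lines touched over all start offsets allowed by the alignment."""
    return max(lines_touched(off, size, line)
               for off in range(0, line, alignment))

POINTER = 4   # assumed 32-bit pointers
VEC3F = 12    # three 4-byte floats

# Variant 1: one struct of three pointers plus three separate vertices.
variant1 = worst_case(3 * POINTER, 4) + 3 * worst_case(VEC3F, 4)

# Variant 2: one contiguous 48-byte record, assumed 16-byte aligned.
variant2 = worst_case(48, 16)
```

With these assumptions `variant1` is 8 and `variant2` is 2, matching the slides. Dropping the alignment to 4 bytes (`worst_case(48, 4)`) gives 3, which illustrates why the record is kept aligned.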

42
General Optimizations: Cache
  • This was only one example. Similarly for:
  • BSP Nodes (even more important)
  • Triangle lists
  • Materials
  • Shading Data

43
General Optimizations: Simplification
  • Today's CPUs have very long pipelines
  • Simplify the code to avoid pipeline stalls
  • Choose simple algorithms
  • KISS wins (KISS: keep it simple and stupid)
  • E.g. BSP-tree traversal simpler than grids
  • Easier to maintain and optimize (e.g.
    prefetching)
  • Write tight inner loops
  • E.g. better caching and handling of branches
  • Avoid conditionals/relative jumps in inner loops
  • E.g. support only triangles
  • Avoid memory-access stalls
  • → Caching, caching, caching!

44
Optimization: SIMD Extensions
  • Most CPUs provide SIMD extensions
  • Intel SSE (others: 3D-Now!, AltiVec, ...)
  • Use SIMD: higher speed, lower bandwidth
  • Up to four parallel floating point operations
  • → For the cost of 1!
  • Fetch data once to reduce bandwidth to cache
  • Amortize loading cost over 4 operations
  • → Factor 4 in bandwidth reduction
  • Overhead due to restricted instruction set
  • E.g. no SSE dot product
  • Con: programming in assembly language

45
Optimization: SIMD Extensions
  • How to use SIMD extensions?
  • Either: instruction-parallel
  • Combine 4 computations in normal algorithm
  • E.g. the 4 mults in a dot product
  • Or: data-parallel
  • Run algorithm on 4 different data in parallel
  • E.g. 4 independent dot products
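
The two ways of using SIMD can be contrasted with a small emulation in which a 4-element list stands in for one SIMD register. This is only a sketch of the data flow: the helpers `simd_mul` and `simd_add` are hypothetical stand-ins for 4-wide instructions, not real intrinsics.

```python
def simd_mul(a, b):
    """Emulates one 4-wide multiply."""
    return [x * y for x, y in zip(a, b)]

def simd_add(a, b):
    """Emulates one 4-wide add."""
    return [x + y for x, y in zip(a, b)]

def dot_instruction_parallel(a, b):
    """One dot product: the multiplies share one SIMD instruction, but the
    final sum runs across lanes and needs extra scalar or shuffle work."""
    prod = simd_mul(a + [0.0], b + [0.0])   # pad the 3-vectors to 4 lanes
    return prod[0] + prod[1] + prod[2] + prod[3]

def dot_data_parallel(ax, ay, az, bx, by, bz):
    """Four independent dot products at once. Inputs are stored as
    structure-of-arrays: ax holds the x components of four vectors, etc.
    Every step is a full-width operation; no cross-lane sum is needed."""
    return simd_add(simd_add(simd_mul(ax, bx), simd_mul(ay, by)),
                    simd_mul(az, bz))
```

In the data-parallel form every lane does useful work and the result for each of the four inputs ends up in its own lane.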

46
SIMD Intersection
  • SIMD best used in data-parallel fashion
  • Little instruction-level parallelism (in RT)
  • → Just doesn't work
  • Data parallel: 1 ray, 4 triangles
  • Hard to always have four triangles ready
  • Data parallel traversal for 1 ray?
  • Data parallel: 4 rays, 1 triangle
  • Must traverse rays in parallel → ray packets
  • Standard intersection code
  • Overhead for terminated rays (e.g. 1 ray hits, 3
    rays miss)

47
SIMD Intersection
  • Performance Results
  • Comparison against already optimized C code
  • Amortized cost for SSE code
  • → 20-36 million intersections/sec! (P-III, 800
    MHz)

48
SIMD BSP-Traversal
  • Recursive Traversal Algorithm

49
SIMD BSP-Traversal
  • SIMD-Traversal
  • Traverse four rays in parallel
  • Intersection with split plane → traversal
    decision
  • Combine decision flags
  • All rays must perform the same traversal
  • Make sure order is consistent
  • Easy to guarantee: same ray origin or same signs
    of direction vector
  • Avoid recursion and function calls
  • Maintain stack manually
  • Worst case: as bad as before
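
The packet traversal outlined on this slide can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes an axis-aligned BSP tree, a packet whose rays share one origin and have the same direction signs on every axis (zero counting as positive), and leaves that only carry a name. It returns, for each ray, the leaves that ray would visit in front-to-back order. The node constructors and `traverse_packet` are hypothetical names.

```python
import math

def leaf(name):
    return ("leaf", name)

def inner(axis, split, left, right):
    return ("inner", axis, split, left, right)

def plane_t(o, d, split):
    """Ray parameter at which the ray crosses the split plane."""
    if d != 0.0:
        return (split - o) / d
    # Parallel to the plane: never crosses. Zero counts as a positive sign,
    # so the front child is the left one.
    return math.inf if o < split else -math.inf

def traverse_packet(root, origin, dirs, t_max=math.inf):
    n = len(dirs)
    visited = [[] for _ in range(n)]
    near, far = [0.0] * n, [t_max] * n      # per-ray active interval
    stack = []                               # manual stack, no recursion
    node = root
    while True:
        while node[0] == "inner":
            _, axis, split, left, right = node
            positive = dirs[0][axis] >= 0.0
            assert all((d[axis] >= 0.0) == positive for d in dirs), "mixed signs"
            front, back = (left, right) if positive else (right, left)
            t = [plane_t(origin[axis], d[axis], split) for d in dirs]
            active = [near[i] < far[i] for i in range(n)]
            # Combine the per-ray decisions into one decision for the packet.
            if all(t[i] <= near[i] or not active[i] for i in range(n)):
                node = back
            elif all(t[i] >= far[i] or not active[i] for i in range(n)):
                node = front
            else:
                stack.append((back, [max(t[i], near[i]) for i in range(n)], far[:]))
                far = [min(t[i], far[i]) for i in range(n)]
                node = front
        for i in range(n):
            if near[i] < far[i]:             # only rays still active here
                visited[i].append(node[1])
        if not stack:
            return visited
        node, near, far = stack.pop()
```

A ray that would not have entered a subtree on its own simply ends up with an empty interval there. That is the overhead mentioned above: the packet still walks the nodes, but the ray does no useful work in them.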

50
SIMD BSP-Traversal
  • Overhead of SIMD-Traversal (in %)
  • Fixed resolution at 1024² (left), fixed 2x2 packet
    (right)
  • Traversal still dominates rendering cost
  • Overall speedup: factor 2 to 2.3

51
Coherent Algorithm: Tracing Ray Packets
  • Many rays are very similar
  • e.g. primary and shadow rays, but others too
  • Handle rays together in packets of 4 rays
  • Process them in lock-step (→ SIMD)
  • Reorder computations to be partly breadth-first
  • Load data once and use it for all rays
  • Reduces memory bandwidth (e.g. SSE: factor 4!)
  • Increases Cache Utilization
  • Coherence increases with image resolution
  • more rays in same view frustum

52
Coherent Algorithms: Shading
  • SIMD Phong-Shading
  • Fixed cost per image
  • Rearrange data from ray packets
  • Different depth: non-coherent shadow rays
  • Different materials: different shaders
  • Algorithm
  • Parallel shadow rays to light sources
  • SIMD shading using shadow flags
  • Constant shading & texturing cost (<10%)
  • Procedural shading is easy (noise)

53
Coherent Ray-Tracing: Summary
  • Speedup
  • Prerequisite: expose coherence in ray-tracing
    algorithm
  • Factor >5: general optimizations
  • Factor >2: SIMD computations
  • Further optimizations are possible
  • Better prefetching, more efficient shading
  • Performance
  • 200K to 1.5M primary rays/s (800 MHz, P-III)
  • Almost linear in # of reflection & shadow rays

54
Comparison: Test Scenes
55
Comparison: Software Ray-Tracers
  • Time per primary ray (1 CPU, 512², in µs)
  • Main memory: RTRT 256 MB, others up to 1 GB
  • Rayshade: best grid resolution

56
Comparison: OpenGL Hardware
  • Frame rate with SGI-Performer (512², fps)
  • HW: Octane V8, Onyx3/IR3, GeForce II GTS
  • CPUs: Onyx 8, nVidia 2, RTRT 1

57
Comparison: Scaling with Scene Size
  • Render time of subsampled terrain (seconds per frame)
  • Typical linear scaling of rasterization HW
  • Worst case for RT: no occlusion
  • Only 1 CPU!

58
  • Demo / Video

59
Distributed RT of Massive Models
60
Reference Model (12.5 Mtris)
61
Previous Work
  • Rendering of Massive Models [Aliaga 99]
  • Framerate: 5 to 15 fps for single power plant
  • Needs shared-memory supercomputer (SGI)
  • Framework of algorithms
  • Textured-depth-meshes (96% reduction in tris)
  • View-frustum culling & LOD (50% each)
  • Hierarchical occlusion maps (10%)
  • Extensive preprocessing required
  • Entire model: 3 weeks (estimated)
  • Only semi-automatic

62
Distributed RT of Massive Models
  • Ray-Tracing and massive models just match
  • Logarithmic scaling in primitives
  • Ideal for big models
  • Preprocessing
  • Simple and fast spatial sorting, fully automatic
  • Distributed computing
  • Parallel scalability to many networked computers
  • No scene replication
  • → Our approach: use coherent ray-tracing
  • Caching of scene data in network
  • Deal with network issues by reordering

63
Ray-Tracing Issues
  • Distributed Scene Management
  • Several GB of scene data
  • File size and virtual address space (32 bit)
  • Cannot use OS caching (demand paging)
  • Cache miss will stall the entire process
  • 1 ms network latency ≈ time to trace several
    hundred rays
  • Reordering would need non-blocking memory read
  • Need to handle cache manually
  • No longer limited by address space
  • Allows reordering of computations
  • Do not wait for missing data
  • Continue with other rays while data is being
    fetched
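
The idea of a manually managed cache that never blocks can be sketched with a small simulation. Rays that need a voxel which is not yet local are suspended and a fetch is issued; other rays are traced in the meantime; suspended rays are resumed when their data arrives. Everything here is an assumption made for illustration: the class `VoxelCache`, the function `trace_all`, and the "network" that delivers a voxel after a fixed number of loop steps.

```python
from collections import deque

class VoxelCache:
    def __init__(self, loaded=(), latency_steps=3):
        self.loaded = set(loaded)   # voxels already in local memory
        self.pending = {}           # voxel id -> steps until it arrives
        self.latency = latency_steps

    def request(self, voxel):
        """Non-blocking fetch: record the request and return immediately."""
        self.pending.setdefault(voxel, self.latency)

    def tick(self):
        """Advance the simulated network one step; return arrived voxels."""
        arrived = []
        for voxel in list(self.pending):
            self.pending[voxel] -= 1
            if self.pending[voxel] <= 0:
                del self.pending[voxel]
                self.loaded.add(voxel)
                arrived.append(voxel)
        return arrived

def trace_all(rays, cache):
    """rays: list of (ray_id, voxel_id). Returns ray ids in completion order."""
    ready = deque(rays)
    suspended = {}                  # voxel id -> rays waiting for it
    finished = []
    while ready or suspended:
        if ready:
            ray_id, voxel = ready.popleft()
            if voxel in cache.loaded:
                finished.append(ray_id)          # data is local: trace now
            else:
                cache.request(voxel)             # do not wait for the data
                suspended.setdefault(voxel, []).append((ray_id, voxel))
        for voxel in cache.tick():
            ready.extend(suspended.pop(voxel, []))   # resume waiting rays
    return finished
```

Rays whose data is already cached finish while the fetch is in flight, so the latency is hidden as long as enough independent rays are available.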

64
Massive Models Caching
  • 2-Level BSP-Trees
  • Caching based on voxels
  • Voxels are completely self-contained

65
Structure of the BSP-Tree
66
Distribution Issues
  • Preprocessing
  • Simple spatial sorting
  • Need out-of-core algorithm due to model size
  • Simplistic implementation: 2.5 hours
  • Estimated with optimizations: < 30 min
  • Model Server
  • Single server provides all model data
  • Potential bottleneck
  • Should be distributed as well
  • At least for more than 10 clients
  • Trivial to implement

67
Distribution Issues
  • Load Balancing
  • Tile based (32x32 pixels)
  • Demand driven
  • Avoid idle-times
  • Prefetching tiles
  • Asynchronous communication
  • Frame-to-Frame Coherence
  • Keep rays on the same client
  • Simple: keep tiles on the same client
  • Better: assign tiles based on reprojected pixels
  • Larger effective cache size
  • Increases with number of clients
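
Tile-based, demand-driven load balancing can be sketched as a simulation in which whichever client becomes idle first receives the next tile. The functions `make_tiles` and `schedule`, the per-tile `cost` callback, and the heap-based bookkeeping are assumptions made for illustration. The sketch ignores prefetching and frame-to-frame coherence.

```python
import heapq

def make_tiles(width, height, tile=32):
    """Split the image into (x, y, w, h) tiles of at most tile x tile pixels."""
    return [(x, y, min(tile, width - x), min(tile, height - y))
            for y in range(0, height, tile)
            for x in range(0, width, tile)]

def schedule(tiles, cost, num_clients):
    """Demand-driven assignment.

    cost(tile) is the simulated render time of a tile. Returns the tiles
    each client received and the time at which the last client finishes.
    """
    idle_at = [(0.0, c) for c in range(num_clients)]
    heapq.heapify(idle_at)
    assignment = {c: [] for c in range(num_clients)}
    for t in tiles:
        time, client = heapq.heappop(idle_at)    # first client to go idle
        assignment[client].append(t)
        heapq.heappush(idle_at, (time + cost(t), client))
    finish = max(time for time, _ in idle_at)
    return assignment, finish
```

With one expensive tile and several cheap ones, the client that got the expensive tile receives nothing else while the others drain the queue. A static round-robin split would leave that client with more work than the rest.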

68
Results
  • Setup
  • Seven dual Pentium-III 800-866 MHz
  • FastEthernet (100Mbit) for normal clients
  • GigabitEthernet only for display & model server
  • Performance for one Power Plant
  • 4-5 fps without SSE optimization
  • Factor 2 speedup with SSE
  • Almost perfect scaling from 1 to 14 CPUs
  • Never tried any more than that

69
Animation Framerate vs. Bandwidth
70
Speedup
71
  • Demo / Video

73
Ray-Tracing Hardware
  • Summary so far
  • RT has many technical advantages
  • Better performance for large scenes (log N vs. N)
  • Better image quality, more features
  • But: high initial cost on main CPU
  • → Hardware support would help

74
Ray-Tracing Hardware: Why today?
  • The setting has changed
  • Real scenes aren't suited for rasterization any
    more
  • High depth complexity
  • Large scenes, small triangles
  • Shading becomes more expensive
  • Demand for more features (shading,
    programmability)
  • Advantages of ray-tracing finally come into play
  • Also: Flops aren't that expensive any more
  • Number of Gigaflops per GeForce?
  • Neither is memory

75
Ray-Tracing Hardware: Previous Work
  • Over the last decade: several research systems
  • Often suffered from lack of resources
  • Memory and Flops too expensive 10 years ago
  • Offline ray-tracing: AR250 (ART)
  • Accelerated offline rendering, bandwidth limited
  • Volume-Ray-Casting systems
  • Full volume ray casting on a chip
  • Many, some already commercially successful

76
Ray-Tracing Hardware: The SHARP Architecture
  • SHARP architecture [Tim Purcell, Stanford]
  • Mixed SW/HW approach
  • Based on SmartMemories [Mai 00]
  • Multiprocessor on a Chip
  • Roughly 64 R10k, with 8 GB/s (!) memory bandwidth

77
Ray-Tracing Hardware: The SHARP Architecture
  • Conclusions from SHARP (also see Siggraph 2001,
    Course 13)
  • Simple caching works very well
  • Good ray coherence
  • Off-chip bandwidth is minimal
  • Simple memory access design
  • Performance (512x512)
  • Conference scene: 50 fps
  • Reconfigurability allows adapting to demands
  • Adapt number of shading/traversal units to scene

78
Ray-Tracing Hardware: Other Architectures
  • RAYA (MERL, Siggraph 2001, Course 13)
  • Based on Memory Coherent Ray-Tracing [Pharr]
  • CORA (Saarbrücken)
  • Hardware version of Coherent RT Algorithm
  • Custom-design chip
  • Est. performance: 30/25 fps at 1024x768
  • Cruiser: 3.5 Mtris, 2 lights
  • BunnyQuake: 110 Ktris, 2 lights, 3 reflection
    levels

80
What you should take home with you
  • Interactive Ray Tracing IS feasible
  • If attention is paid to underlying hardware
  • It's not only feasible, it's already there
  • Not only a theoretical fantasy any more
  • And even on cheap PCs
  • Not only better, it can even be faster
  • At least for certain applications

81
The Future
  • IRT enables completely new applications
  • Just think what has been done with OpenGL
  • Large-scale visualization, engineering, ...
  • Handling of huge models
  • Interactive global illumination (?)
  • Need to adapt algorithms to new situation
  • Flexible rendering
  • Gaze tracking and non-uniform sampling density
  • Image-Based or Frameless rendering
  • Question: What can IRT do for you?

82
Open Research Problems
  • Can we make it even faster ?
  • Hardware
  • What is the best HW architecture?
  • Dynamic Scenes
  • Optimized rebuild or transformation of index?
  • API
  • Better alternative to OpenGL's push model?
  • OpenGL not suited for Ray-Tracing
  • Global Illumination
  • Efficient new algorithms

83
Acknowledgements
  • AMD
  • Generous support, sponsoring and collaboration;
    soon: 24-node dual-Athlon IV, 1.5 GHz cluster
  • Presenters of the Siggraph 2001 Course 13
  • Images, material, and information
  • Tim Purcell & Pat Hanrahan (Stanford)
  • Many discussions and ideas
  • The Max-Planck-Institute at Saarbruecken
  • Collaboration and use of their Graphics Hardware
  • C. Benthin, M. Wagner & others
  • Work on the RT implementation and discussions

84
Links
  • mailto:wald@graphics.cs.uni-sb.de
  • For any questions or comments
  • http://graphics.cs.uni-sb.de/rtrt
  • Saarland University's RealTime RayTracing
    Project
  • http://graphics.cs.uni-sb.de/pub/afrigraph01
  • Tutorial Notes (Slides, Papers)
  • http://www.openrt.de
  • The OpenRT Interactive Raytracing API (not yet
    online)

85
The Future
  • Applications on compute clusters
  • Visualization of large models
  • Previewing of animations with full shading
  • Hardware support for IRT
  • At least for specialized applications
  • Convergence between RT and TR
  • Occlusion culling
  • Improved shading capabilities
  • Eventually based on the same API?

86
Open Research Problems: Global Illumination
  • New situation
  • Ray-tracing bottleneck is gone (Well, almost)
  • New challenges
  • Need for coherence
  • Efficient computations
  • Usage of view-importance
  • High-degree of parallelism
  • Small communication overhead
  • Interactivity !!!
  • Can we trade quality for speed ?