Title: Interactive Time-Dependent Tone Mapping Using Programmable Graphics Hardware


1
Interactive Time-Dependent Tone Mapping Using
Programmable Graphics Hardware
Eurographics Symposium on Rendering 2003
25-27th June - Leuven, Belgium
2
HDR and Tone Mapping
Compressed
Clamped to [0, 1]
3
Advances in graphics hardware
  • Physically-based rendering on the GPU
    (Purcell et al., 2003)
  • High dynamic range texture mapping
    (Debevec et al., 2001)

4
System Overview
  • Interactive tone mapping system for an OpenGL
    application

[Diagram labels: application, display callback, HDR image, tone mapping system, LDR image, frame buffer]
5
Interface to the application
tone mapping system
application
  • tmInitialize() // Initialize the system
  • tmEnable() // Retarget GL calls
  • Draw geometry
  • tmCompress() // Compress output
  • tmDisable() // Restore app context
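
The call order listed on this slide can be sketched as a per-frame display callback. The sketch is in Python purely for illustration: the `tm_*` functions are hypothetical stand-ins for `tmInitialize()`, `tmEnable()`, `tmCompress()`, and `tmDisable()`, whose real signatures the slides do not give, and they only record the order in which they are called.

```python
# Hypothetical stand-ins for the tm* entry points named on the slide.
# They only log their names; the real functions' arguments are unknown.
calls = []

def tm_initialize():
    calls.append("tmInitialize")   # initialize the system

def tm_enable():
    calls.append("tmEnable")       # retarget GL calls

def tm_compress():
    calls.append("tmCompress")     # compress output

def tm_disable():
    calls.append("tmDisable")      # restore app context

def draw_geometry():
    calls.append("draw")           # the application's own rendering

def display_callback():
    """One frame: draw between enable and compress, then restore."""
    tm_enable()
    draw_geometry()
    tm_compress()
    tm_disable()

tm_initialize()        # once, at startup
display_callback()     # once per frame
display_callback()
```

Under this reading, the application's drawing code stays unchanged and is only bracketed by the tone mapping calls.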

6
Choosing a tone mapping operator
  • Photographic Tone Reproduction for Digital
    Images (Reinhard et al., 2002)
  • Global operator is a simple transfer function

[Plot: transfer function over scaled luminance, with tick marks at 0 and 1]
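
As a rough illustration of the global operator, the sketch below follows the published form of the Reinhard et al. (2002) operator: luminance is scaled by a key value over the log-average luminance and then compressed with L / (1 + L). The function names, the key value 0.18, the small `delta`, and the sample luminances are assumptions for this example, not values taken from the slides.

```python
import math

def log_average(luminances, delta=1e-6):
    """Log-average (geometric mean) luminance; delta avoids log(0)."""
    total = sum(math.log(delta + lw) for lw in luminances)
    return math.exp(total / len(luminances))

def global_operator(luminances, key=0.18):
    """Scale by key / log-average, then compress with L / (1 + L)."""
    l_avg = log_average(luminances)
    scaled = [key * lw / l_avg for lw in luminances]
    return [l / (1.0 + l) for l in scaled]

hdr = [0.01, 0.5, 10.0, 2000.0]           # made-up world luminances
ldr = global_operator(hdr)                # compressed, always below 1
clamped = [min(lw, 1.0) for lw in hdr]    # naive clamp, for comparison
```

Clamping maps both bright samples to 1.0 and loses their ordering, while the compressed values stay distinct and below 1.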
7
Choosing a tone mapping operator
  • Local operator
  • Digital analog to burning and dodging

local area luminance
Center-surround
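
A minimal per-pixel sketch of the center-surround test, based on the published Reinhard et al. (2002) local operator rather than on anything shown in the slides. It assumes `blurred[i]` is the Gaussian-blurred scaled luminance at one pixel for `scales[i]`, treats `blurred[i + 1]` as the surround of level `i`, and uses commonly quoted parameter values (`phi=8`, `eps=0.05`, `key=0.18`). The stopping rule is one simplified reading of the scale selection step.

```python
def select_zone(blurred, scales, key=0.18, phi=8.0, eps=0.05):
    """Walk from the smallest scale upward and return the index of the
    largest level reached while every center-surround difference so far
    stays below eps."""
    chosen = 0
    for i in range(len(blurred) - 1):
        center, surround = blurred[i], blurred[i + 1]
        v = (center - surround) / (2.0 ** phi * key / scales[i] ** 2 + center)
        if abs(v) >= eps:
            break                  # contrast edge reached; stop growing
        chosen = i + 1
    return chosen

def local_operator(lum, blurred, scales):
    """Compress one pixel using its local area luminance."""
    zone = select_zone(blurred, scales)
    return lum / (1.0 + blurred[zone])

scales = [1.0, 1.6, 2.56, 4.096]
flat_zone = select_zone([1.0, 1.0, 1.0, 1.0], scales)   # uniform region
edge_zone = select_zone([1.0, 1.0, 5.0, 5.0], scales)   # bright neighbor
```

A pixel in a uniform region uses the largest scale, while a pixel near a bright feature stops at a smaller one. This is the digital counterpart of dodging and burning.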
8
Why use this tone mapping operator?
  • Global operator is simple and fast to compute
  • Only one global computation
  • We can dynamically choose the number of zones

9
Variable number of zones: 3
3 Zones
10
Variable number of zones: 4
3 Zones
11
Variable number of zones: 5
3 Zones
12
Variable number of zones: 6
3 Zones
13
Variable number of zones: 7
3 Zones
14
Variable number of zones: 8
3 Zones
15
System block diagram
16
Implementation
  • Target architecture
      • ATI Radeon 9800 (R350)
  • Data storage
      • Floating-point off-screen buffers (pbuffers)
      • Multiple rendering surfaces (GL_AUXi)
  • Algorithms
      • ARB fragment and vertex assembly
      • Generate fragments with image-sized quads
  • Data representation
      • Vector vs. scalar organization

17
Global operator block diagram
18
Implementation: global operator
  • Simple luminance transform
  • Store luminance and log luminance in separate
    channels

[Diagram labels: HDR image, luminance and log luminance channels, mipmap reduction, LDR image, single buffer]
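
The luminance transform can be sketched as follows. The Rec. 709 weights, the small `delta` that guards against log(0), and the tiny test image are assumptions for illustration; the slides do not state which weights the system uses.

```python
import math

# Rec. 709 luminance weights; an assumption, not taken from the slides.
WEIGHTS = (0.2126, 0.7152, 0.0722)

def luminance_channels(rgb, delta=1e-6):
    """Return (luminance, log luminance) for one RGB pixel, mirroring
    the idea of storing both values in separate channels."""
    r, g, b = rgb
    lum = WEIGHTS[0] * r + WEIGHTS[1] * g + WEIGHTS[2] * b
    return lum, math.log(delta + lum)

hdr_image = [[(1.0, 1.0, 1.0), (0.0, 0.0, 0.0)],
             [(4.0, 2.0, 1.0), (100.0, 100.0, 100.0)]]
channels = [[luminance_channels(p) for p in row] for row in hdr_image]
```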
19
Implementation: global operator
[Diagram labels: single rendering surface, HDR image, luminance and log luminance, mipmap reduction, log luminance channel, log average luminance, LDR image, single buffer]
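
A CPU-side sketch of how a mipmap-style reduction yields the log average luminance: each step averages 2 x 2 blocks until a single value remains, and exponentiating that value gives the log average. It assumes a square image whose side is a power of two; the helper names and sample data are hypothetical.

```python
import math

def reduce_once(level):
    """Average each 2 x 2 block, halving width and height."""
    h, w = len(level) // 2, len(level[0]) // 2
    return [[(level[2 * y][2 * x] + level[2 * y][2 * x + 1]
              + level[2 * y + 1][2 * x] + level[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]

def log_average_by_mipmap(log_lum):
    """Reduce a square power-of-two image to 1 x 1, then exponentiate."""
    level = log_lum
    while len(level) > 1:
        level = reduce_once(level)
    return math.exp(level[0][0])

lum = [[1.0, 10.0, 1.0, 10.0],
       [10.0, 100.0, 10.0, 100.0],
       [1.0, 10.0, 1.0, 10.0],
       [10.0, 100.0, 10.0, 100.0]]
log_lum = [[math.log(v) for v in row] for row in lum]
avg = log_average_by_mipmap(log_lum)   # geometric mean of lum
```

Because every level averages equal-sized blocks, the final 1 x 1 value equals the mean of all log luminances.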
20
Implementation: global operator
[Diagram labels: HDR image (texture 0), operator shader, luminance and log luminance, texture 1, texture 2, mipmap reduction, LDR image, single buffer]
21
Local operator block diagram
22
Implementation: GPU-based convolutions
  • Transform n-vector product into multiple 4-vector
    products

[Diagram labels: filter, luminance]
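
The decomposition of an n-vector product into 4-vector products can be sketched on the CPU as below. `dot4` stands in for a four-component dot product of the kind a fragment program provides; padding the tail with zeros is an assumption about how lengths that are not a multiple of four would be handled.

```python
def dot4(a, b):
    """Four-component dot product."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def dot_n(filter_taps, samples):
    """n-vector dot product computed as a sum of 4-vector products.
    Both inputs must have the same length."""
    n = len(filter_taps)
    pad = (-n) % 4                          # zeros needed to reach a multiple of 4
    f = list(filter_taps) + [0.0] * pad
    s = list(samples) + [0.0] * pad
    return sum(dot4(f[i:i + 4], s[i:i + 4]) for i in range(0, len(f), 4))
```

For example, `dot_n([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])` uses two 4-vector products and returns 15.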



23
Vectorizing the luminance
  • Output 4 pixels at the same time
  • Useful for expensive algorithms
  • Requires a conversion back to scalar form

Stacked domain
24-27
Vectorizing the luminance
  • A simple method for luminance vectorization

[Diagram: luminance image packed into the R, G, B, and A channels]
27
Vectorizing the luminance
  • A simple method for luminance vectorization

luminance
R
G
B
A
28
Vectorizing the luminance
  • A simple method for luminance vectorization
  • Preserves spatial locality

[Diagram: luminance image packed into the R, G, B, and A channels]
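
The slides do not spell out the packing layout. One layout consistent with the terms "stacked" and "preserves spatial locality" is to place each image quadrant in its own channel, so that neighbors within a channel are still neighbors in the original image. The sketch below assumes that layout and an image with even width and height; the function names are hypothetical.

```python
def stack_quadrants(lum):
    """Pack an H x W scalar image into an (H/2) x (W/2) image of
    4-tuples: R = top-left, G = top-right, B = bottom-left,
    A = bottom-right quadrant."""
    h, w = len(lum) // 2, len(lum[0]) // 2
    return [[(lum[y][x], lum[y][x + w], lum[y + h][x], lum[y + h][x + w])
             for x in range(w)] for y in range(h)]

def unstack_quadrants(stacked):
    """Inverse of stack_quadrants: convert back to scalar form."""
    h, w = len(stacked), len(stacked[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            r, g, b, a = stacked[y][x]
            out[y][x], out[y][x + w] = r, g
            out[y + h][x], out[y + h][x + w] = b, a
    return out

image = [[4 * y + x for x in range(4)] for y in range(4)]
stacked = stack_quadrants(image)
```

Each stacked texel then carries four pixels, so one fragment can output four results at once.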
29
GPU-based convolutions
Example: 1 x n inner product
[Diagram labels: filter, image, stacked image]
30-32
GPU-based convolutions
[Diagram sequence: filter, image, and stacked image; one pass added per slide (Pass 1, Pass 2, Pass 3)]
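
The pass-by-pass accumulation can be sketched on a single scalar image row. This simplified version ignores the stacked layout, handles four taps per pass, treats samples outside the row as zero, and applies the taps without flipping (which gives the same result for symmetric kernels such as Gaussians). All of these are assumptions for illustration.

```python
def convolve_row_multipass(row, taps):
    """1 x n filtering of one image row, accumulated in passes of
    four taps each. Returns the result and the number of passes."""
    n = len(taps)
    half = n // 2
    acc = [0.0] * len(row)             # accumulation buffer
    passes = 0
    for start in range(0, n, 4):       # one pass per group of 4 taps
        for x in range(len(row)):
            for k in range(start, min(start + 4, n)):
                src = x + k - half
                if 0 <= src < len(row):
                    acc[x] += taps[k] * row[src]
        passes += 1
    return acc, passes

impulse, impulse_passes = convolve_row_multipass(
    [0.0, 0.0, 1.0, 0.0, 0.0], [1.0, 2.0, 3.0])
box, box_passes = convolve_row_multipass([1.0] * 20, [1.0 / 9.0] * 9)
```

A 9-tap filter needs three passes here, matching the three-pass sequence in the diagrams.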
33-37
GPU-based convolutions
  • Compute multiple 4-vector products per pass
  • Less shader and texture switching

[Diagram: stacked image processed in a single render pass]
38
GPU-based convolutions
  • Advantages
      • Handles large kernels
      • Efficient memory access
      • No transform back to scalar values

512 x 512 image

Kernel size    Time
11 x 11        6 ms
21 x 21        10 ms
41 x 41        16 ms
39
System block diagram
40-43
Calculating adaptation zones on the GPU
[Diagram sequence: luminance input; FRONT and BACK surfaces; Buffer 0 and Buffer 1; zone numbers shown per slide: (0, 1), (2, 1), (2, 3), (4, 3)]
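
The zone numbers in these diagrams progress as (0, 1), (2, 1), (2, 3), (4, 3), which suggests that each new zone overwrites the older of two buffers while the previous zone stays available for comparison. The sketch below reproduces that schedule under the assumption that zone i is written to buffer i mod 2. It models only the bookkeeping, not the rendering.

```python
def zone_schedule(num_zones):
    """Return which zone each of the two buffers holds after every
    step when zone i is always written to buffer i % 2."""
    holds = [None, None]
    states = []
    for zone in range(num_zones):
        holds[zone % 2] = zone       # overwrite the older of the two
        states.append(tuple(holds))
    return states

schedule = zone_schedule(5)
```

With this alternation, two buffers suffice for any number of zones, because a center-surround comparison only needs two adjacent zones at a time.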
44
Performance: global operator
[Chart: frames per second vs. image size, for 16-bit and 32-bit floats]
45
Performance: local operator
[Chart: frames per second vs. number of zones, for 16-bit and 32-bit floats]
46
Performance comparison: CPU vs. GPU
47
Results: Accuracy
  • Comparison with CPU: 512 x 512 image

Image                    RMS error
Scaled luminance         0.022
Convolution (5 x 5)      0.026
Convolution (49 x 49)    0.032
Final image              1.051
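
For reference, an RMS error between a CPU result and a GPU result can be computed as sketched below. The slides do not state the units or whether the error is taken per channel, so this version simply assumes two equally sized images given as flat sequences of pixel values.

```python
import math

def rms_error(reference, test):
    """Root-mean-square difference between two equally sized images
    given as flat sequences of pixel values."""
    assert len(reference) == len(test) and len(reference) > 0
    total = sum((r - t) ** 2 for r, t in zip(reference, test))
    return math.sqrt(total / len(reference))
```

For example, `rms_error([0, 0, 0, 0], [1, 1, 1, 1])` returns 1.0, and identical images give 0.0.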
48
False-color zone images
CPU GPU
49-54
Images generated at 30 Hz
Compressed: 2 zones
Clamped to [0, 1]
55
Conclusion and Future Work
  • Summary
      • System for interactively compressing HDR output
        from an OpenGL application
      • Complex tone mapping operator on the GPU
  • Future Work
      • Other tone mapping operators
      • Further optimizations
      • Non-invasive implementation