Title: Interactive Time-Dependent Tone Mapping Using Programmable Graphics Hardware
1Interactive Time-Dependent Tone Mapping Using
Programmable Graphics Hardware
Eurographics Symposium on Rendering 2003
25-27th June - Leuven, Belgium
2HDR and Tone Mapping
Compressed
Clamped to [0, 1]
3Advances in graphics hardware
- Physically-based rendering on the GPU
- (Purcell et al, 2003)
- High dynamic range texture mapping
- (Debevec et al, 2001)
4System Overview
- Interactive tone mapping system for an OpenGL application
Diagram labels: application, display callback, HDR image, tone mapping system, LDR image, frame buffer
5Interface to the application
Diagram labels: application, tone mapping system
- tmInitialize() // Initialize the system
- tmEnable() // Retarget GL calls
- Draw geometry
- tmCompress() // Compress output
- tmDisable() // Restore app context
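The call order above can be sketched as follows. Only the five `tm*` names and their one-line descriptions come from the slide; the stub bodies, the `draw_geometry` helper, and the choice to call `tmInitialize` once outside the display callback are assumptions made for illustration, not the system's actual implementation.

```python
calls = []  # records the order in which the stub entry points run


def tmInitialize():
    calls.append("tmInitialize")  # stub: would initialize the system


def tmEnable():
    calls.append("tmEnable")  # stub: would retarget GL calls


def tmCompress():
    calls.append("tmCompress")  # stub: would compress the HDR output


def tmDisable():
    calls.append("tmDisable")  # stub: would restore the app context


def draw_geometry():
    calls.append("draw_geometry")  # hypothetical application drawing code


def display():
    """Hypothetical display callback wrapped by the tone mapping calls."""
    tmEnable()
    draw_geometry()
    tmCompress()
    tmDisable()


tmInitialize()  # assumed to run once at startup
display()
```

The application's own drawing code stays unchanged between `tmEnable` and `tmCompress`, which is what keeps the interface small.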
6Choosing a tone mapping operator
- Photographic Tone Reproduction for High Contrast Images (Reinhard et al, 2002)
- Global operator is a simple transfer function
Plot: output between 0 and 1 as a function of scaled luminance
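A minimal CPU sketch of the global operator, following the simple form given by Reinhard et al. (2002): luminance is scaled by a key value over the log-average luminance and then passed through L / (1 + L). The function names, the default key of 0.18, and the small `delta` guarding against log(0) are illustrative choices, not details taken from the slides.

```python
import math


def log_average(lum, delta=1e-6):
    """Log-average luminance: exp of the mean of log(delta + L)."""
    return math.exp(sum(math.log(delta + v) for v in lum) / len(lum))


def global_operator(lum, key=0.18):
    """Map world luminances to display values in [0, 1)."""
    avg = log_average(lum)
    out = []
    for v in lum:
        scaled = key * v / avg               # scaled luminance
        out.append(scaled / (1.0 + scaled))  # simple transfer function
    return out


display_values = global_operator([0.01, 1.0, 100.0])
```

Because the log-average is the only quantity that depends on the whole image, the per-pixel work is a single multiply and divide.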
7Choosing a tone mapping operator
- Local operator
- Digital analog to burning and dodging
local area luminance
Center-surround
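The center-surround test can be sketched per pixel as below, following the local operator of Reinhard et al. (2002). The sketch assumes `blurred[i]` already holds the Gaussian-blurred scaled luminance at this pixel for scale `ratio ** i`, so that the surround of one scale is the center of the next. The parameter defaults (`key`, `phi`, `eps`, `ratio`) are values commonly quoted for that operator and are assumptions here, as is the fallback to the smallest scale.

```python
def choose_zone(blurred, key=0.18, phi=8.0, eps=0.05, ratio=1.6):
    """Pick the largest scale whose center-surround activity stays below eps."""
    chosen = 0  # fall back to the smallest scale
    for i in range(len(blurred) - 1):
        scale = ratio ** i
        center, surround = blurred[i], blurred[i + 1]
        activity = (center - surround) / ((2.0 ** phi) * key / scale ** 2 + center)
        if abs(activity) >= eps:
            break  # a contrast edge falls inside this neighborhood
        chosen = i
    return chosen


def local_operator(scaled_lum, blurred, **params):
    """Compress one pixel using its local area luminance."""
    zone = choose_zone(blurred, **params)
    return scaled_lum / (1.0 + blurred[zone])
```

In a flat region every scale passes the test, so the pixel is compressed against a large neighborhood; near a strong edge the search stops early, which is the digital counterpart of dodging and burning.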
8Why use this tone mapping operator?
- Global operator is simple and fast to compute
- Only one global computation
- We can dynamically choose the number of zones
9Variable number of zones 3
3 Zones
10Variable number of zones 4
4 Zones
11Variable number of zones 5
5 Zones
12Variable number of zones 6
6 Zones
13Variable number of zones 7
7 Zones
14Variable number of zones 8
8 Zones
15System block diagram
16Implementation
- Target architecture
- ATI Radeon 9800 (R350)
- Data storage
- Floating-point off-screen buffers (pbuffers)
- Multiple rendering surfaces (GL_AUXi)
- Algorithms
- ARB fragment and vertex assembly
- Generate fragments with image-sized quads
- Data representation
- Vector vs. scalar organization
17Global operator block diagram
18Implementation: global operator
- Simple luminance transform
- Store luminance and log luminance in separate channels
Diagram labels: HDR image, luminance, log luminance, mipmap reduction, LDR image, single buffer
19Implementation: global operator
Diagram labels: single rendering surface, HDR image, luminance, log luminance, mipmap reduction, log luminance channel, log average luminance, LDR image, single buffer
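The mipmap reduction of the log luminance channel can be mimicked on the CPU as repeated 2 x 2 averaging down to a single value. This is a sketch under stated assumptions: a square, power-of-two image, Rec. 709 luminance weights, and a small `delta` to avoid log(0). None of these specifics are given on the slides, and the function names are illustrative.

```python
import math


def luminance(r, g, b):
    """Rec. 709 luminance weights (an assumption; the slides give none)."""
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def reduce_2x2(img):
    """One mipmap level: average each 2 x 2 block."""
    return [
        [(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
         for x in range(0, len(img[0]), 2)]
        for y in range(0, len(img), 2)
    ]


def log_average_by_mipmap(lum, delta=1e-6):
    """Reduce the log luminance image to 1 x 1, then exponentiate."""
    level = [[math.log(delta + v) for v in row] for row in lum]
    while len(level) > 1:
        level = reduce_2x2(level)
    return math.exp(level[0][0])
```

Averaging equal-sized blocks level by level yields the same mean as a single pass over all pixels, so the 1 x 1 level holds the mean log luminance.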
20Implementation: global operator
Diagram labels: HDR image (texture 0), luminance and log luminance (texture 1), mipmap reduction (texture 2), operator shader, LDR image, single buffer
21Local operator block diagram
22Implementation: GPU-based convolutions
- Transform n-vector product into multiple 4-vector products
Diagram labels: filter, luminance
23Vectorizing the luminance
- Output 4 pixels at the same time
- Useful for expensive algorithms
- Requires a conversion back to scalar form.
Stacked domain
24Vectorizing the luminance
- A simple method for luminance vectorization
luminance
R
G
B
A
28Vectorizing the luminance
- A simple method for luminance vectorization
- Preserves spatial locality
luminance
R
G
B
A
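The slides do not spell out the exact layout, so the following is only one plausible reading: each RGBA texel stores four horizontally consecutive luminance values, with the right edge clamped. The function name `stack_row` and the clamping rule are assumptions for illustration.

```python
def stack_row(lum_row):
    """Pack each pixel with its next three neighbors into an (R, G, B, A) tuple."""
    n = len(lum_row)

    def at(i):
        return lum_row[min(i, n - 1)]  # clamp reads past the right edge

    return [(at(x), at(x + 1), at(x + 2), at(x + 3)) for x in range(n)]


stacked = stack_row([1.0, 2.0, 3.0, 4.0, 5.0])
```

Under this layout the four channels of a texel are spatial neighbors, so a single texture fetch feeds one 4-vector product with a group of four filter taps.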
29GPU-based convolutions
filter
image
Example 1 x n inner product
stacked image
30GPU-based convolutions
filter
image
Pass 1
stacked image
31GPU-based convolutions
filter
image
Pass 1
Pass 2
stacked image
32GPU-based convolutions
filter
image
Pass 1
Pass 2
Pass 3
stacked image
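The pass-by-pass build-up shown above can be sketched as an inner product split into groups of four taps, each group standing in for one 4-vector product accumulated in its own pass. Zero padding to a multiple of four and the function names are assumptions of this sketch.

```python
def dot4(a, b):
    """One 4-vector product, the unit of work per pass."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]


def inner_product_in_passes(filter_taps, samples):
    """Return the 1 x n inner product and the number of 4-wide passes used."""
    assert len(filter_taps) == len(samples)
    pad = (-len(filter_taps)) % 4            # zero-pad to a multiple of 4
    f = list(filter_taps) + [0.0] * pad
    s = list(samples) + [0.0] * pad
    total, passes = 0.0, 0
    for i in range(0, len(f), 4):
        total += dot4(f[i:i + 4], s[i:i + 4])  # accumulate one pass
        passes += 1
    return total, passes


result, passes = inner_product_in_passes([1.0] * 11, [float(i) for i in range(11)])
```

An 11-tap row needs three such products; folding several of them into one render pass changes only how the same sum is scheduled.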
33GPU-based convolutions
- Compute multiple 4-vector products per pass
- Less shader and texture switching
Single render pass
stacked image
38GPU-based convolutions
- Advantages
- Handles large kernels
- Efficient memory access
- No transform back to scalar values
Timings for a 512 x 512 image:
Kernel     Time
11 x 11    6 ms
21 x 21    10 ms
41 x 41    16 ms
39System block diagram
40Calculating adaptation zones on the GPU
luminance
luminance
FRONT
0
1
BACK
Buffer 0
Buffer 1
41Calculating adaptation zones on the GPU
luminance
luminance
FRONT
2
1
BACK
Buffer 0
Buffer 1
42Calculating adaptation zones on the GPU
luminance
luminance
FRONT
2
3
BACK
Buffer 0
Buffer 1
43Calculating adaptation zones on the GPU
luminance
luminance
FRONT
4
3
BACK
Buffer 0
Buffer 1
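The sequence of zone numbers in the four slides above is consistent with two buffers used in alternation, each new zone overwriting the older of the two. The sketch below only tracks which zone index each buffer holds; the function name and the even/odd assignment are a reading of the slides, not a confirmed detail of the implementation.

```python
def zone_slots(num_zones):
    """Return (buffer 0, buffer 1) contents after each zone from zone 1 onward."""
    slots = [None, None]
    history = []
    for zone in range(num_zones):
        slots[zone % 2] = zone  # even zones go to buffer 0, odd zones to buffer 1
        if zone >= 1:
            history.append(tuple(slots))
    return history


states = zone_slots(5)
```

Two buffers suffice under this reading because a center-surround test only ever compares one zone with the next.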
44Performance: global operator
Plot: frames per second vs. image size, for 16-bit and 32-bit floats
45Performance: local operator
Plot: frames per second vs. number of zones, for 16-bit and 32-bit floats
46Performance comparison: CPU vs. GPU
47Results: accuracy
- Comparison with CPU (512 x 512 image)
Image                  RMS error
Scaled luminance       0.022
Convolution (5 x 5)    0.026
Convolution (49 x 49)  0.032
Final image            1.051
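For reference, a root-mean-square error between a CPU result and a GPU result can be computed as below. This is a generic sketch over flat lists of pixel values; the slides do not state the value range or units behind the figures in the table, and the function name is illustrative.

```python
import math


def rms_error(reference, result):
    """Root-mean-square difference between two equally sized pixel lists."""
    assert reference and len(reference) == len(result)
    total = sum((a - b) ** 2 for a, b in zip(reference, result))
    return math.sqrt(total / len(reference))
```

Identical images give 0, and a constant offset of one unit at every pixel gives exactly 1.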
48False-color zone images
CPU GPU
49Images generated at 30Hz
Compressed (2 zones)
Clamped to [0, 1]
55Conclusion and Future Work
- Summary
- System for interactively compressing HDR output from an OpenGL application
- Complex tone mapping operator on the GPU
- Future Work
- Other tone mapping operators
- Further optimizations
- Non-invasive implementation