Title: Tone Mapping on GPUs
1 Tone Mapping on GPUs
- Cliff Woolley, University of Virginia
- Slides courtesy of Nolan Goodnight
2 HDR and Tone Mapping
[Image pair: clamped to [0, 1] vs. compressed]
3 Advances in graphics hardware
- Physically based rendering on the GPU (Purcell et al., 2003)
- High dynamic range texture mapping (Debevec et al., 2001)
4 System Overview
- Interactive tone mapping system for an OpenGL application
[Diagram labels: application, display callback, HDR image, tone mapping system, LDR image, frame buffer]
5 Interface to the application
[Diagram labels: application, tone mapping system]
- tmInitialize() // Initialize the system
- tmEnable() // Retarget GL calls
- Draw geometry
- tmCompress() // Compress output
- tmDisable() // Restore app context
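The call order above could wrap an application's display callback roughly as follows. This is only a sketch: the Python functions are hypothetical stand-ins that record the order of calls, since the slides name the tm* entry points but do not give their signatures or behavior beyond the comments.

```python
# Hypothetical stand-ins for the tm* entry points named on the slide.
# They only record call order; the real signatures are not given.
calls = []

def tmInitialize():
    calls.append("tmInitialize")   # one-time setup of the tone mapping system

def tmEnable():
    calls.append("tmEnable")       # retarget GL calls (assumed: to an HDR buffer)

def tmCompress():
    calls.append("tmCompress")     # compress the HDR output to an LDR image

def tmDisable():
    calls.append("tmDisable")      # restore the application's own context

def draw_geometry():
    calls.append("draw")           # the application's unchanged drawing code

def display_callback():
    """One frame: retarget, draw, compress, restore."""
    tmEnable()
    draw_geometry()
    tmCompress()
    tmDisable()

tmInitialize()        # once, at startup
display_callback()    # then once per frame
display_callback()
```

The point of the bracket structure is that the application's drawing code sits between tmEnable() and tmCompress() and does not need to change.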
6 Choosing a tone mapping operator
- Photographic Tone Reproduction for High Contrast Images (Reinhard et al., 2002)
- Global operator is a simple transfer function
[Plot: transfer function; axis ticks 0 and 1]
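As a rough illustration of such a transfer function, the sketch below follows the global operator as published by Reinhard et al.: scale luminance by a key value over the log-average luminance, then map L to L / (1 + L). The key of 0.18, the small delta, and the sample luminances are assumptions for the example, not values taken from the slides.

```python
import math

def log_average(luminances, delta=1e-6):
    """Log-average luminance: exp of the mean of log(delta + L)."""
    total = sum(math.log(delta + lw) for lw in luminances)
    return math.exp(total / len(luminances))

def global_operator(lw, lw_avg, key=0.18):
    """Scale world luminance by key / log-average, then compress to [0, 1)."""
    scaled = key * lw / lw_avg
    return scaled / (1.0 + scaled)

pixels = [0.01, 0.5, 1.0, 20.0, 4000.0]   # made-up HDR luminances
avg = log_average(pixels)
ldr = [global_operator(lw, avg) for lw in pixels]
```

Every output lies in [0, 1) and the mapping is monotonic, so very bright pixels are compressed toward 1 instead of being clamped.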
7 Choosing a tone mapping operator
- Local operator
- Digital analog to burning and dodging
[Figure label: center-surround]
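A per-pixel sketch of the center-surround idea, following the local operator of Reinhard et al. as commonly described: compare blurred luminance at neighboring scales and stop growing the neighborhood when the normalized difference exceeds a threshold. The parameter values (key 0.18, phi 8.0, threshold 0.05, scale ratio 1.6) and the choice of which index to return at the stopping point are assumptions here, not details confirmed by the slides.

```python
def select_zone(center_responses, key=0.18, phi=8.0, eps=0.05,
                first_scale=1.0, ratio=1.6):
    """Pick a scale index (zone) for one pixel.

    center_responses[i] is the blurred, key-scaled luminance at scale i;
    the surround for scale i is taken to be the response at scale i + 1.
    """
    scale = first_scale
    for i in range(len(center_responses) - 1):
        center = center_responses[i]
        surround = center_responses[i + 1]
        # Normalized center-surround difference.
        v = (center - surround) / ((2.0 ** phi) * key / (scale * scale) + center)
        if abs(v) > eps:
            return i          # contrast edge found: stop growing the region
        scale *= ratio
    return len(center_responses) - 1

def local_operator(lum, center_responses, **kw):
    """Compress one pixel using the local average at the selected zone."""
    zone = select_zone(center_responses, **kw)
    return lum / (1.0 + center_responses[zone])
```

In a uniform region the largest scale is used; near a strong edge the search stops early, which is what gives the dodging-and-burning behavior.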
8 Why use this tone mapping operator?
- Global operator is simple and fast to compute
- Only one global computation
- We can dynamically choose the number of zones
9 Variable number of zones: 3
10 Variable number of zones: 4
11 Variable number of zones: 5
12 Variable number of zones: 6
13 Variable number of zones: 7
14 Variable number of zones: 8
15 System block diagram
16 Implementation
- Target architecture
  - ATI Radeon 9800 (R350)
- Data storage
  - Floating-point off-screen buffers (pbuffers)
  - Multiple rendering surfaces (GL_AUXi)
17 Implementation
- Algorithms
  - ARB fragment and vertex assembly
  - Generate fragments with image-sized quads
- Data representation
  - Vector vs. scalar organization
18 Global operator block diagram
19 Implementation: global operator
- Simple luminance transform
- Store luminance and log luminance in separate channels
[Diagram labels: HDR image, luminance, log luminance, single pbuffer, mipmap reduction, LDR image]
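The luminance transform can be sketched per pixel as below: each HDR RGB value becomes a pair (luminance, log luminance), mirroring the two channels mentioned above. The Rec. 709 weights and the small delta that avoids log(0) are assumptions; the slides do not state which weights the system uses.

```python
import math

# Assumed Rec. 709 luminance weights; the slides do not give the exact ones.
WEIGHTS = (0.2126, 0.7152, 0.0722)

def luminance_texel(rgb, delta=1e-6):
    """Return (luminance, log luminance) for one HDR RGB pixel."""
    lum = sum(w * c for w, c in zip(WEIGHTS, rgb))
    return (lum, math.log(delta + lum))

# A made-up 2 x 2 HDR image.
hdr_image = [[(1.0, 1.0, 1.0), (0.0, 0.0, 0.0)],
             [(10.0, 5.0, 2.0), (0.2, 0.4, 0.1)]]
lum_image = [[luminance_texel(p) for p in row] for row in hdr_image]
```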
20 Implementation: global operator
[Diagram labels: HDR image, luminance, log luminance, single rendering surface, single pbuffer, mipmap reduction, log luminance channel, log average luminance, LDR image]
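A CPU-side sketch of how a mipmap-style reduction can yield the log average luminance: repeatedly average 2 x 2 blocks of the log luminance channel until one value remains, then exponentiate. It assumes a square image whose side is a power of two; the sample values are made up.

```python
import math

def reduce_once(level):
    """Average each 2x2 block, halving both dimensions (one mipmap level)."""
    h, w = len(level), len(level[0])
    return [[(level[y][x] + level[y][x + 1] +
              level[y + 1][x] + level[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

def log_average_by_mipmap(log_lum):
    """Reduce a square power-of-two log-luminance image to a 1x1 level."""
    level = log_lum
    while len(level) > 1:
        level = reduce_once(level)
    return math.exp(level[0][0])   # exp(mean log L) = log-average luminance

lum = [[1.0, 2.0, 4.0, 8.0] for _ in range(4)]
log_lum = [[math.log(v) for v in row] for row in lum]
avg = log_average_by_mipmap(log_lum)
```

Because every level averages equal-sized blocks, the final 1 x 1 value equals the mean of all log luminances.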
21 Implementation: global operator
[Diagram labels: HDR image, texture 0, texture 1, texture 2, operator shader, luminance, log luminance, mipmap reduction, LDR image]
22 Local operator block diagram
23 Implementation: GPU-based convolutions
- Transform n-vector product into multiple 4-vector products
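The decomposition can be illustrated on the CPU: an n-element product between samples and kernel weights is split into groups of four, each handled by one 4-component dot product (the kind of operation a DP4 instruction performs). The helper names and the box-filter kernel are illustrative assumptions.

```python
def dot4(a, b):
    """One 4-component dot product (a DP4-style operation)."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def dot_n(values, weights):
    """n-vector product as a sum of 4-vector products, zero-padding the tail."""
    assert len(values) == len(weights)
    pad = (-len(values)) % 4
    v = list(values) + [0.0] * pad
    w = list(weights) + [0.0] * pad
    return sum(dot4(v[i:i + 4], w[i:i + 4]) for i in range(0, len(v), 4))

row = [float(i) for i in range(11)]     # 11 samples under an 11-tap kernel
kernel = [1.0 / 11.0] * 11              # box filter, an arbitrary example
result = dot_n(row, kernel)
```

An 11-tap row therefore costs three 4-vector products instead of eleven scalar multiply-adds.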
24 Vectorizing the luminance
- Output 4 pixels at the same time
- Useful for expensive algorithms
- Requires a conversion back to scalar form
[Figure label: stacked domain]
25-29 Vectorizing the luminance
- A simple method for luminance vectorization
- Preserves spatial locality
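The slides do not spell out the vectorization layout. One possible layout, shown here purely as an assumption, stacks the four quadrants of the scalar luminance image into the four channels of a quarter-size image. Neighbors within a quadrant stay neighbors within a channel, which is one way to preserve spatial locality, and one 4-component operation then touches four pixels at once.

```python
def pack_quadrants(img):
    """Stack the four quadrants of an even-sized scalar image into 4 channels."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[y][x],          # top-left     -> channel 0
              img[y][x + w],      # top-right    -> channel 1
              img[y + h][x],      # bottom-left  -> channel 2
              img[y + h][x + w])  # bottom-right -> channel 3
             for x in range(w)]
            for y in range(h)]

def unpack_quadrants(packed):
    """Inverse of pack_quadrants: convert back to scalar form."""
    h, w = len(packed), len(packed[0])
    img = [[0.0] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            a, b, c, d = packed[y][x]
            img[y][x], img[y][x + w] = a, b
            img[y + h][x], img[y + h][x + w] = c, d
    return img

image = [[float(4 * y + x) for x in range(4)] for y in range(4)]
stacked = pack_quadrants(image)
```

The unpack step corresponds to the conversion back to scalar form that a vectorized pass needs before results are displayed.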
30-33 GPU-based convolutions
[Diagram build: stacked image; Pass 1, Pass 2, Pass 3]
34-38 GPU-based convolutions
- Compute multiple 4-vector products per pass
- Less shader and texture switching
[Diagram labels: single render pass, stacked image]
39 GPU-based convolutions
- Advantages
  - Handles large kernels
  - Efficient memory access
  - No transform back to scalar values
- Timings for a 512 x 512 image
  - 11 x 11 kernel: 6 ms
  - 21 x 21 kernel: 10 ms
  - 41 x 41 kernel: 16 ms
40 System block diagram
41 Calculating adaptation zones
[Diagram: luminance; FRONT and BACK surfaces; Buffer 0 and Buffer 1; values shown: 0, 1]
42 Calculating adaptation zones
[Diagram: luminance; FRONT and BACK surfaces; Buffer 0 and Buffer 1; values shown: 2, 1]
43 Calculating adaptation zones
[Diagram: luminance; FRONT and BACK surfaces; Buffer 0 and Buffer 1; values shown: 2, 3]
44 Calculating adaptation zones
[Diagram: luminance; FRONT and BACK surfaces; Buffer 0 and Buffer 1; values shown: 4, 3]
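One reading of the four diagrams above, offered as an assumption, is a ping-pong scheme: the numbers are indices of convolution scales, and each new scale overwrites whichever buffer holds the older one, so two adjacent scales are always available for the center-surround comparison. The sketch below only tracks which scale each buffer would hold under that reading.

```python
def zone_schedule(num_scales):
    """Return the contents of (Buffer 0, Buffer 1) after each step.

    Each entry is the index of the convolution scale held by that buffer.
    """
    buffers = [0, 1]                 # scales 0 and 1 are computed first
    history = [tuple(buffers)]
    for scale in range(2, num_scales):
        target = scale % 2           # overwrite the buffer with the older scale
        buffers[target] = scale
        history.append(tuple(buffers))
    return history

steps = zone_schedule(5)
```

With five scales the schedule passes through the same four states shown in the diagrams, using only two buffers regardless of the number of zones.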
45 Performance: global operator
[Plot: frames per second vs. image size]
46 Performance: local operator
[Plot: frames per second vs. number of zones]
47 Performance comparison: CPU vs. GPU
48 Results: accuracy
- Comparison with CPU, 512 x 512 image
49 False-color zone images
50-55 Images generated at 30 Hz
[Image pairs: compressed with 2 zones vs. clamped to [0, 1]]
56 Conclusion and Future Work
- Summary
  - System for interactively compressing HDR output from an OpenGL application
  - Complex tone mapping operator on the GPU
- Future Work
  - Other tone mapping operators
  - Further optimizations
  - Non-invasive implementation