Title: Using wavelets on the XBOX360
1. Using wavelets on the XBOX360
- For current and future games
- San Francisco, GDC 2008
Mike Boulton, Senior Software Engineer, Rare/MGS, mboulton_at_microsoft.com
2. Introduction
- Fixed compression methods such as DXT are becoming antiquated
- Distribution of data density is becoming less uniform as complexity increases
- Why can't compression efforts be focused where it matters most?
- Wavelets can help!
3. What are wavelets?
- Wavelets are mathematical functions formed from scaled and translated copies of a few basis functions
- The advantage is their ability to localize functions in both frequency and space
- Fourier: just frequency
- Point basis: just space
- There are lots of common wavelet bases
4. Why 2D Haar?
- This talk will focus on 2D Haar wavelets
- Haar is the simplest basis set
- In general, this means more basis terms and/or blocky artefacts
- But also easiest to implement
- Blocky nature of bases is less of a problem for operations such as integration
- What do the wavelets look like?
5. 2D Haar continued...
- Three basis wavelet functions, plus solid scaling function
- White represents +1, black -1
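To make the three wavelets and the scaling function concrete, here is a minimal Python sketch of a single-level 2D Haar decomposition of one 2x2 block. The function names and the averaging normalisation (dividing by 4) are assumptions made for illustration; they are not taken from the talk, and other normalisations are equally common.

```python
def haar2x2(a, b, c, d):
    """Decompose the 2x2 block [[a, b], [c, d]] into one scaling term and
    three wavelet terms. Averaging normalisation is an assumption."""
    s = (a + b + c + d) / 4.0   # scaling function: solid block of +1
    h = (a - b + c - d) / 4.0   # horizontal wavelet: left +1, right -1
    v = (a + b - c - d) / 4.0   # vertical wavelet: top +1, bottom -1
    g = (a - b - c + d) / 4.0   # diagonal wavelet: checkerboard of +1/-1
    return s, h, v, g


def inverse_haar2x2(s, h, v, g):
    """Rebuild the four texels by summing each basis with its sign pattern."""
    a = s + h + v + g   # top-left
    b = s - h + v - g   # top-right
    c = s + h - v - g   # bottom-left
    d = s - h - v + g   # bottom-right
    return a, b, c, d
```

Applying `haar2x2` recursively to the scaling terms of neighbouring blocks gives the multi-level tree used in the rest of the talk.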
6. The wavelet advantage
- Local coverage allows windowed changes
- Compared with spherical harmonics, which have global cover
- This means that local changes only involve local bases
- Mip-map chains not required for image decompression
- Truly variable compression
- Can focus efforts where it counts
7. Usage examples
- Real-time shader image decompression
- Lighting
- Static shadow maps
- Displacement maps over large areas
- Easy dynamic texture packing
- Geometry representations
- ...many more!
8. Talk focus
- Real-time shader image decompression
- Real-time on XBOX360 GPU
- 500Hz for full-screen 720p monochrome decompression
- Double-product integration for relighting
- Real-time (ish!) on XBOX360 GPU
- Other applications
- E.g. geometry representation
- If time!
9. Real-time image decompression
- Image broken up into 16x16 texel blocks
- Each block compressed into wavelet sub-tree
- Texture at 1/16th resolution stores scaling coefficient and offset to start of sub-tree
- Why?
- Decompression performance should be independent of main image resolution
- Each sub-tree fits inside a texture cache tile, so no traversal cache thrashing
- Can unroll traversal loop in shader for perf. gain
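The block compression described above could be sketched offline as follows. This is a hedged illustration, not the talk's encoder: the names `region_mean`, `build_node` and `compress_block`, the nested-dictionary tree, and the interpretation of the cut-off as an absolute threshold on coefficient magnitude are all assumptions. The image is a monochrome list of rows.

```python
def region_mean(img, x, y, size):
    """Mean of the size x size square whose top-left texel is (x, y)."""
    total = sum(img[y + j][x + i] for j in range(size) for i in range(size))
    return total / (size * size)


def build_node(img, x, y, size, cutoff):
    """Build the wavelet node for one square region, or None if pruned."""
    half = size // 2
    a = region_mean(img, x, y, half)                # top-left quadrant
    b = region_mean(img, x + half, y, half)         # top-right quadrant
    c = region_mean(img, x, y + half, half)         # bottom-left quadrant
    d = region_mean(img, x + half, y + half, half)  # bottom-right quadrant
    coeffs = ((a - b + c - d) / 4.0,   # horizontal detail
              (a + b - c - d) / 4.0,   # vertical detail
              (a - b - c + d) / 4.0)   # diagonal detail
    # Assumed pruning rule: zero any coefficient at or below the cut-off.
    coeffs = tuple(k if abs(k) > cutoff else 0.0 for k in coeffs)
    children = [None] * 4
    if half > 1:
        children = [build_node(img, x + dx * half, y + dy * half, half, cutoff)
                    for dy in (0, 1) for dx in (0, 1)]
    if not any(coeffs) and not any(children):
        return None  # nothing significant at or below this node
    return {"coeffs": coeffs, "children": children}


def compress_block(img, bx, by, block=16, cutoff=0.05):
    """Return (scaling coefficient, sub-tree root) for block (bx, by)."""
    x, y = bx * block, by * block
    return region_mean(img, x, y, block), build_node(img, x, y, block, cutoff)
```

With this layout a flat block compresses to a scaling coefficient alone, while a block containing detail keeps nodes only along the branches that carry it, which is where the variable compression rate comes from.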
10. Real-time image decompression continued...
- Given a (u,v) coordinate, pixel shader traverses appropriate sub-tree for final value
- Mip-map chain not required
- Obtained as by-product of traversal
- Intermediate memory not required
- All decompression performed directly in the pixel shader
- Dynamic predication works well
- Good vector coherency if minification avoided
11. Real-time image decompression continued...
- How is the wavelet tree represented?
- Stored breadth-first in a line texture
- Multiple ways to pack the data
- Example: ARGB8
- r, g, b store windowed wavelet basis coefficients
- a stores linear offset to child of node, if one exists
- If no child exists, we jump to an empty node
- Wait for the rest of the pixel vector to finish
- Can use wider data formats
- E.g. U16, F32
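The traversal and packing described on the last two slides can be sketched on the CPU as below. This is an illustration under stated assumptions, not the shader from the talk: the offset is treated as an absolute index into the line array, the four children of a node are assumed to sit in consecutive entries (a pruned sibling is stored as a zero entry), index 0 is assumed to be the shared empty node, and a 4x4 block is used instead of 16x16 to keep the data short. The names `NODES`, `EMPTY` and `decode_texel` are hypothetical.

```python
EMPTY = 0  # assumed index of the shared empty node

# One entry per "texel" of the line texture: (r, g, b, a) =
# (horizontal, vertical, diagonal coefficient, offset to first child).
NODES = [
    (0.0, 0.0, 0.0, EMPTY),  # 0: empty node, points back at itself
    (1.0, 0.0, 0.0, 2),      # 1: root of the sub-tree, children at 2..5
    (0.5, 0.0, 0.0, EMPTY),  # 2: top-left child
    (0.0, 0.0, 0.0, EMPTY),  # 3: top-right child (flat)
    (0.0, 0.0, 0.0, EMPTY),  # 4: bottom-left child (flat)
    (0.0, 0.0, 0.0, EMPTY),  # 5: bottom-right child (flat)
]


def decode_texel(nodes, scale, root, x, y, block_size=4, levels=None):
    """Walk the sub-tree for local texel (x, y); stop early for a mip level."""
    value = scale
    index = root
    size = block_size
    steps = block_size.bit_length() - 1      # log2(block_size) tree levels
    if levels is not None:
        steps = min(steps, levels)
    for _ in range(steps):                   # fixed count, so unrollable
        h, v, d, child = nodes[index]
        half = size // 2
        qx = 1 if x >= half else 0           # which quadrant holds the texel
        qy = 1 if y >= half else 0
        sh = -1.0 if qx else 1.0             # sign of the horizontal wavelet
        sv = -1.0 if qy else 1.0             # sign of the vertical wavelet
        value += sh * h + sv * v + sh * sv * d
        # Jump to the matching child, or park on the empty node.
        index = child + 2 * qy + qx if child != EMPTY else EMPTY
        x -= qx * half
        y -= qy * half
        size = half
    return value
```

Stopping after fewer steps returns the mean of the enclosing region, which is how coarser mip levels fall out of the same traversal: `decode_texel(NODES, 4.0, 1, 0, 0)` gives the full-resolution texel, while passing `levels=1` gives the value of its 2x2 parent.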
12. Real-time image decompression continued...
Uncompressed with mipmaps, 2MB at 1578fps
13. Real-time image decompression continued...
Wavelet compressed, cut-off 0.05, 249KB at 579fps
14. Real-time image decompression continued...
Wavelet compressed, cut-off 0.08, 157KB at 622fps
15. Real-time image decompression continued...
- Notice how areas of high contrast have their detail preserved, and areas of lower contrast are smoothed out
- Can use a different notion of importance!
16. Real-time image decompression continued...
- Has advantages over fixed DXT-like compression schemes
- Can be lossless in areas you need it to be
- Particularly important for data textures, such as SH coefficients
- Will do a much better job for images with areas of low contrast
- Again, higher-order SH terms are a good example
- But most surface-parameterised data is likely to be like this
- Can use focus ability to increase the area of parameterisation
17. Real-time image decompression continued...
18. Double-product integration
- Another good application is relighting
- Work by Ng, Ramamoorthi and Hanrahan
- Diffuse BRDF
- We represent both the transfer function (with cosine) and the environment as wavelet trees in a texture
19. Double-product integration continued...
- Here we need to traverse only the intersection of both trees
- So if one is simple, should get good performance
- Parallel GPU traversal needs to know how to jump over bits of either tree not contained in the intersection
- If a node has a child, store linear offset to sibling or ancestor (could be root)
- Then we can jump over whole child branch if the other tree has no children at that point
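Because Haar wavelets are orthogonal, the integral of the product of two functions reduces to a sum of products of matching coefficients, so only nodes present in both trees contribute. The Python sketch below shows that intersection traversal on nested trees. It is a simplified CPU illustration, not the offset-based GPU layout described above: the `node` helper, the dictionary tree, the explicit stack, and the averaging normalisation (which gives each tree level a weight of one quarter of its parent's) are all assumptions.

```python
def node(coeffs, children=None):
    """Hypothetical tree node: three coefficients and four optional children."""
    return {"coeffs": coeffs, "children": children or [None] * 4}


def double_product(scale_a, tree_a, scale_b, tree_b):
    """Mean of the product of two functions stored as pruned Haar quadtrees.

    Returns (result, number of node pairs actually visited)."""
    total = scale_a * scale_b
    visited = 0
    stack = [(tree_a, tree_b, 1.0)]
    while stack:
        a, b, weight = stack.pop()
        if a is None or b is None:
            continue  # branch missing from one tree: skip all of it
        visited += 1
        total += weight * sum(p * q for p, q in zip(a["coeffs"], b["coeffs"]))
        # Each child covers a quarter of its parent's area.
        for child_a, child_b in zip(a["children"], b["children"]):
            stack.append((child_a, child_b, weight / 4.0))
    return total, visited


# A detailed transfer tree against a simple environment tree.
transfer = node((0.25, 0.0, 0.0),
                [node((0.5, 0.0, 0.0)), None, None, None])
environment = node((1.0, 0.5, 0.0))
```

Here `double_product(0.5, transfer, 2.0, environment)` visits only the root pair, since the environment has no children; the cost follows the size of the intersection rather than the size of the larger tree.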
20. Double-product integration continued...
- Performance is a function of the size of the intersection between both trees
- Plus other factors such as cache behaviour
- Loads of other interesting ideas
- Transfer function wavelet trees have a tendency to be quite similar in localised patches
- Clustering scheme would help further
- Could store representative tree for each patch, and then a (smaller) residual tree per texel of patch
21. Double-product integration continued...
22. Other applications
- Wavelets can be used to define surface deformations
- As a dynamic displacement map
- Can keep memory overhead constant by removing oldest high-frequency terms for new deformation terms
- Windowed nature of wavelets makes applying localised deformations simple
- Can easily modify e.g. moment of inertia
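One possible reading of the constant-memory idea is a fixed-capacity store of deformation terms that evicts the oldest term at the finest level whenever a new term arrives. The Python sketch below follows that reading. The class name, the tuple layout, and the exact eviction rule (finest level first, oldest within that level) are assumptions; the talk does not specify them.

```python
class DeformationTerms:
    """Fixed-capacity store of wavelet deformation terms (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.terms = []   # entries are (stamp, level, position, coeff)
        self.clock = 0    # increases with every added term

    def add(self, level, position, coeff):
        """Add a term; higher level means higher frequency (finer detail)."""
        if len(self.terms) >= self.capacity:
            finest = max(t[1] for t in self.terms)
            # Oldest term among those at the finest level is dropped.
            victim = min((t for t in self.terms if t[1] == finest),
                         key=lambda t: t[0])
            self.terms.remove(victim)
        self.terms.append((self.clock, level, position, coeff))
        self.clock += 1
```

Coarse terms, which carry the overall shape of the deformation, survive longest under this rule, while fine detail from older impacts fades first.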