Emerging Technologies for Games Optimisation 2


Slides: 17

Provided by: lauren80


1
Emerging Technologies for Games Optimisation 2

CO3303
Week 6

2
Today's Lecture

Memory Optimisation
Cache Recap
Cache Optimisation
Shader Optimisation

3
Memory Optimisation

Last week we looked primarily at optimising for
speed
Often need to minimise the memory used by an
application too
These objectives frequently conflict, e.g.
Use a look-up table of pre-calculated values to
speed up a calculation: uses more memory
Compress some data to minimise memory: the program
needs to decompress it, slowing it down
Memory optimisations are usually algorithmic

4
Memory Optimisations

Compress data
Standard algorithms: RLE, LZW, DEFLATE (zip) etc.
Store a minimum of data
Don't store data that can be calculated from
other data
E.g. Store X & Y axes of matrix, calculate Z
axis with a cross product (like targeting)
Avoid sparse data structures: arrays/lists etc.
with many empty slots
Perhaps use RLE style compression
Store data on hard-drive/DVD rather than in memory
Implement streaming: the ability to read data from
external storage whilst running other processes
Note that hard-drives etc. have a cache also
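
As an illustration of the "RLE style compression" suggested for sparse data, here is a minimal C++ sketch. The names Run, rleEncode and rleDecode are hypothetical (not from the lecture), and the run length is assumed to fit in one byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One run: 'count' consecutive copies of 'value' (count is 1..255)
struct Run
{
    std::uint8_t count;
    std::uint8_t value;
};

// Collapse repeated bytes into (count, value) runs
std::vector<Run> rleEncode(const std::vector<std::uint8_t>& data)
{
    std::vector<Run> runs;
    for (std::uint8_t value : data)
    {
        // Extend the current run if the value repeats and the count still fits in a byte
        if (!runs.empty() && runs.back().value == value && runs.back().count < 255)
            ++runs.back().count;
        else
            runs.push_back(Run{1, value});
    }
    return runs;
}

// Expand the runs back into the original bytes
std::vector<std::uint8_t> rleDecode(const std::vector<Run>& runs)
{
    std::vector<std::uint8_t> data;
    for (const Run& run : runs)
    {
        assert(run.count > 0); // a zero-length run would be meaningless
        data.insert(data.end(), static_cast<std::size_t>(run.count), run.value);
    }
    return data;
}
```

This only saves memory when the data has long runs (e.g. mostly empty slots); data with few repeats would grow to two bytes per value.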

5
Cache Recap

A cache stores data efficiently that would
otherwise be expensive to fetch/calculate
A memory cache is a small local memory store with
very fast access
Duplicates data held in the main memory, but with
much faster read/write speeds
Anywhere from 2 to 10 times faster
There may be several caches for a CPU
A small fast cache (L1), a larger, slower one
(L2) etc.
The GPU also has memory caches
Vertex cache (fast access to vertex data)
Texture cache (fast access to pixel data)

6
Cache Use

When data is read
If in cache, fetched quickly from there - cache
hit
otherwise fetched from slower memory/cache -
cache miss

In any case, any data read is placed in the cache
The entire row (cache line) containing the data is stored
Its size is typically a power of 2, e.g. a 128 byte block
This means subsequent reads of the same or
nearby data will be a cache hit
Rewards access to closely clustered data
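
To make the block-based behaviour concrete, the following C++ sketch counts how many cache-line-sized blocks a series of reads would touch. It is a simplified model, not a hardware description: the 128 byte block size is the slide's example, reads are assumed to start at a block-aligned address, and cacheLinesTouched is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

constexpr std::size_t kCacheLineBytes = 128; // assumed block size (the slide's example)

// Number of distinct cache-line-sized blocks touched when reading 'count'
// items whose start addresses are 'strideBytes' apart, beginning at offset 0
std::size_t cacheLinesTouched(std::size_t count, std::size_t strideBytes)
{
    assert(strideBytes > 0);
    std::set<std::size_t> lines;
    for (std::size_t i = 0; i < count; ++i)
        lines.insert((i * strideBytes) / kCacheLineBytes); // block holding item i
    return lines.size();
}
```

Under these assumptions, 32 tightly packed 4-byte floats share a single block (one miss, then hits), whereas 32 reads spaced 128 bytes apart touch 32 different blocks.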

7
Efficient Cache Use

To use cache efficiently
Try to read data near data we recently accessed
Accessing data sequentially is ideal
Random access to data is cache-inefficient
Particularly random access over a large area:
makes the cache redundant, a major efficiency loss
Arrays and vectors can be very cache friendly
If we mainly sequentially access them
Linked lists can be problematic
If the nodes become distributed around memory
N.B. Writing to the cache follows a similar
scheme
Although actual writes to main memory can be
delayed
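
A small C++ sketch of the sequential versus scattered access pattern, assuming a 2D grid stored row by row in one contiguous std::vector. The function names are hypothetical. Both functions return the same sum; only the order of memory accesses differs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Cache-friendly: walks memory in the order it is stored (consecutive addresses)
float sumRowMajor(const std::vector<float>& grid, std::size_t width, std::size_t height)
{
    assert(grid.size() == width * height);
    float total = 0.0f;
    for (std::size_t y = 0; y < height; ++y)
        for (std::size_t x = 0; x < width; ++x)
            total += grid[y * width + x];
    return total;
}

// Cache-unfriendly for large grids: each step jumps 'width' floats ahead,
// so consecutive reads may land in different cache blocks
float sumColumnMajor(const std::vector<float>& grid, std::size_t width, std::size_t height)
{
    assert(grid.size() == width * height);
    float total = 0.0f;
    for (std::size_t x = 0; x < width; ++x)
        for (std::size_t y = 0; y < height; ++y)
            total += grid[y * width + x];
    return total;
}
```

Any speed difference depends on the grid size and the hardware, so it is worth timing both on the target machine rather than assuming a figure.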

8
GPU Caches

Have seen the vertex cache on modern GPUs
We should attempt to order the triangles in our
geometry to revisit recently used vertices

Also have a texture cache
Recently used blocks of textures can be accessed
more quickly
Suggests we try to render any geometry with
similar textures together
Avoid using large textures on small polygons
Pixels will be widely spaced in the source
texture: effectively random access

9
Cache Optimisation

Cache performance affects all applications
Failure to consider the cache coherency of your
app can lead to very poor performance
Without anything obvious being wrong
Note that cache issues often stop promising
optimisations from performing effectively
Particularly look-up tables of pre-calculated
values:
random access into a large table may be slower
than actually performing the calculation
We shouldn't cache optimise everything, but
should be aware of the issues

10
Shader Optimisations

Shaders almost always need to be optimised
Very often the bottleneck on current games
Particularly the pixel shader for more elaborate
effects
The HLSL shader compiler is good and catches many
optimisations
But we really need to squeeze out every drop of
speed if we want to match the competition
Shader optimisation is tricky
Methods are not widely documented (trade secrets)
Nvidia / ATI websites and tools are the best
source of ideas

11
Basic Shader Optimisations

Ensure optimisation is enabled on the shader
compiler (it is by default)
Do some performance analysis
Put a timer on screen
Use time per frame, not FPS. FPS can be
misleading
NVShaderPerf
Expect your pixel shader to need optimisation if
you use fancy materials / lighting
Or the vertex shader if you perform complex
vertex blending / deformation etc
But more usually the pixel shader

12
Shader Optimisations

Use optimisations from last week, especially
Take constant calculations out of loops
In fact avoiding loops is usually better
Especially those with a variable count (i.e. 1 to
n)
Early return from functions
Break calculations into small steps
Use simpler instructions
E.g. Preventing diffuse value becoming negative
float DiffuseLevel = max(0.0f, dot(N, L));
The function saturate is faster (clamps to the 0->1
range)
float DiffuseLevel = saturate(dot(N, L));
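
The two forms can be compared with a CPU-side C++ emulation of the HLSL intrinsics. This is a sketch only: Vec3, dot and saturate here are stand-ins written for the example, not the shader functions themselves, and it says nothing about relative speed on a GPU.

```cpp
#include <algorithm>

struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// CPU stand-in for HLSL saturate: clamp to the range 0 to 1
float saturate(float v)
{
    return std::min(std::max(v, 0.0f), 1.0f);
}

float diffuseWithMax(const Vec3& n, const Vec3& l)
{
    return std::max(0.0f, dot(n, l));
}

float diffuseWithSaturate(const Vec3& n, const Vec3& l)
{
    return saturate(dot(n, l));
}
```

The results match whenever N and L are unit length, because the dot product then cannot exceed 1. With unnormalised vectors the two differ: saturate also clamps values above 1.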

13
Shader Optimisations

Calculate constant values on the CPU
Move code out of shaders.
E.g. Wiggling texture as used last year
float wiggle; // Set wiggle from main CPU code
output.UV += cos(wiggle); // Wiggle texture UVs
The term cos(wiggle) is constant for each
primitive
Instead calculate cos in the main code and pass
it in
float cosWiggle; // cos calculated in CPU code
output.UV += cosWiggle;
The CPU only executes the cos once; a pixel shader
may calculate it millions of times (once for
each pixel)
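
On the CPU side this might look like the following C++ sketch. PerFrameConstants and buildPerFrameConstants are hypothetical names; how the value is actually uploaded to the shader depends on the graphics API and is not shown.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical block of values sent to the shader once per frame
struct PerFrameConstants
{
    float cosWiggle; // replaces cos(wiggle) inside the shader
};

// Evaluate the constant term once on the CPU instead of once per vertex/pixel
PerFrameConstants buildPerFrameConstants(float wiggle)
{
    assert(std::isfinite(wiggle));
    PerFrameConstants constants;
    constants.cosWiggle = std::cos(wiggle);
    return constants;
}
```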

14
Shader Optimisations

Values output from a vertex shader are linearly
interpolated for the pixel shader
E.g. A pixel halfway between two vertices will
get values (world position, UVs, normal etc.)
exactly halfway between the vertex values
OK for positions / scalars, but vectors suffer
similar problems to rotational lerp
Need to normalise vectors in the pixel shader -
nlerp
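
The need for the extra normalise can be seen in a CPU-side C++ sketch of what the interpolator does to two unit vectors. Vec3, lerp, length and normalise are written for this example; in a shader the last step would be the normalize intrinsic.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Linear interpolation, as applied to vertex shader outputs
Vec3 lerp(const Vec3& a, const Vec3& b, float t)
{
    return { a.x + (b.x - a.x) * t,
             a.y + (b.y - a.y) * t,
             a.z + (b.z - a.z) * t };
}

float length(const Vec3& v)
{
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

Vec3 normalise(const Vec3& v)
{
    const float len = length(v);
    assert(len > 0.0f); // opposite vectors can interpolate to zero length
    return { v.x / len, v.y / len, v.z / len };
}

// nlerp: linear interpolation followed by normalisation
Vec3 nlerp(const Vec3& a, const Vec3& b, float t)
{
    return normalise(lerp(a, b, t));
}
```

Halfway between the unit vectors (1,0,0) and (0,1,0), plain lerp gives (0.5, 0.5, 0), which has length of about 0.707 rather than 1; nlerp restores unit length.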

15
Shader Optimisations

We can use this knowledge for optimisation
Interpolate values from vertex shader
Remove code from pixel shader
float3 LightVector = LightPos - i.WorldPosition;
Add a parameter from vertex to pixel shader
float3 LightVector : TEXCOORD2;
Move code to vertex shader
o.LightVector = LightPos - i.WorldPosition;
Normalise in pixel shader if using vectors
Will probably need to change setup code too
Powerful method to remove excessive calculation
But problems with large rotational changes (again)
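
Why this move is safe for the unnormalised light vector can be checked on the CPU with a C++ sketch. It models interpolation along one edge with a weight t and ignores perspective correction; Vec3 and the function names are assumptions made for the example.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

Vec3 operator-(const Vec3& a, const Vec3& b)
{
    return { a.x - b.x, a.y - b.y, a.z - b.z };
}

Vec3 lerp(const Vec3& a, const Vec3& b, float t)
{
    assert(t >= 0.0f && t <= 1.0f);
    return { a.x + (b.x - a.x) * t,
             a.y + (b.y - a.y) * t,
             a.z + (b.z - a.z) * t };
}

// Original approach: interpolate the world position, subtract per pixel
Vec3 lightVectorPerPixel(const Vec3& lightPos, const Vec3& p0, const Vec3& p1, float t)
{
    return lightPos - lerp(p0, p1, t);
}

// Optimised approach: subtract per vertex, interpolate the result
Vec3 lightVectorInterpolated(const Vec3& lightPos, const Vec3& p0, const Vec3& p1, float t)
{
    return lerp(lightPos - p0, lightPos - p1, t);
}
```

Because the subtraction is linear, both routes give the same vector (up to floating-point rounding). That would stop being true if the vector were normalised in the vertex shader, which is why the normalise stays in the pixel shader.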

16
Shader Optimisations

Use textures to store calculations
Convert
spec = saturate(dot(N,L)) +
Ks * pow((dot(N,L) > 0) ? saturate(dot(N,H)) : 0, n)
Into
spec = tex2D(dot(N,L), dot(N,H))
And load a texture that encodes the first
calculation
Tricky to prepare
Very powerful, especially for complex shaders
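
The preparation step could be sketched in C++ as baking a 2D table indexed by dot(N,L) and dot(N,H). Several things here are assumptions: the lighting function is one reading of the slide's expression (diffuse plus Ks times a specular power), SpecularTable is a hypothetical class, and lookups use nearest-texel sampling with clamped coordinates rather than a real texture sampler.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

float saturate(float v) { return std::min(std::max(v, 0.0f), 1.0f); }

// Assumed form of the calculation being replaced (not confirmed by the slides)
float lighting(float nDotL, float nDotH, float ks, float n)
{
    const float specBase = (nDotL > 0.0f) ? saturate(nDotH) : 0.0f;
    return saturate(nDotL) + ks * std::pow(specBase, n);
}

// Pre-calculated table standing in for the texture
class SpecularTable
{
public:
    SpecularTable(std::size_t size, float ks, float n)
        : size_(size), texels_(size * size)
    {
        assert(size >= 2);
        for (std::size_t j = 0; j < size; ++j)
            for (std::size_t i = 0; i < size; ++i)
            {
                const float u = static_cast<float>(i) / static_cast<float>(size - 1);
                const float v = static_cast<float>(j) / static_cast<float>(size - 1);
                texels_[j * size + i] = lighting(u, v, ks, n);
            }
    }

    // Nearest-texel lookup; coordinates are clamped to 0..1 first
    float sample(float nDotL, float nDotH) const
    {
        return texels_[toIndex(nDotH) * size_ + toIndex(nDotL)];
    }

private:
    std::size_t toIndex(float coord) const
    {
        const float scaled = saturate(coord) * static_cast<float>(size_ - 1);
        return static_cast<std::size_t>(std::lround(scaled));
    }

    std::size_t size_;
    std::vector<float> texels_;
};
```

Clamping works here because the assumed expression is zero whenever dot(N,L) is zero or negative. Note that Ks and n are baked into the table, so each material with different values would need its own texture, and the texture fetch itself is subject to the cache concerns discussed earlier.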
