Title: Introduction to Graphics Hardware and GPUs
1Introduction to Graphics Hardware and GPUs
2Overview
- Definition
- Motivation
- History of Graphics Hardware
- Graphics Pipeline
- Vertex and Fragment Shaders
- Modern Graphics Hardware
- Stream Programming
- GPU Stream Programming
- Languages
- Exercise
- More Information
3Definition
Logical Representation of Visual Information
Output Signal
5Motivation
- Real time: 15 to 60 fps
- High Resolution
6Motivation
- High CPU load
  - Physics, AI, sound, network, ...
- Graphics demand
  - Fast memory access
    - Many lookups: vertices, normals, textures, ...
  - High bandwidth usage
    - A few GB/s needed in regular cases!
  - Large number of flops
    - Flops: floating-point operations (ADD, MUL, SUB, ...)
    - Illustration: matrix-vector products
    - (16 MUL + 12 ADD) x (vertices + normals) x fps
    - = (28 flops) x (6,000,000) x 30 ≈ 5 GFlops
- Conclusion: real-time graphics needs supporting hardware!
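The flop estimate on this slide can be reproduced with a few lines of Python. This is only a sketch of the slide's arithmetic: it assumes one 4x4 matrix-vector product per vertex or normal per frame, and the function name `required_flops_per_second` is made up for illustration.

```python
# Rough flop estimate for transforming geometry every frame.
MULS_PER_PRODUCT = 16  # 4x4 matrix times 4-vector: 16 multiplications
ADDS_PER_PRODUCT = 12  # ... plus 12 additions

def required_flops_per_second(num_vectors, fps):
    """Flops per second for one matrix-vector product per vector per frame."""
    return (MULS_PER_PRODUCT + ADDS_PER_PRODUCT) * num_vectors * fps

# 6,000,000 vertices and normals at 30 frames per second (figures from the slide).
flops = required_flops_per_second(num_vectors=6_000_000, fps=30)
print(f"{flops / 1e9:.2f} GFlops")  # 5.04 GFlops
```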
7History of Graphics Hardware
- Up to the mid-90s
  - SGI mainframes and workstations
  - PC: only 2D graphics hardware
- Mid-90s
  - Consumer 3D graphics hardware (PC)
    - 3dfx, nVidia, Matrox, ATI, ...
  - Triangle rasterization (only)
  - Cheap: pushed by the game industry
- 1999
  - PC card with TnL (Transform and Lighting)
  - nVidia GeForce: Graphics Processing Unit (GPU)
  - PC card more powerful than specialized workstations
- Modern graphics hardware
  - Graphics pipeline partly programmable
  - Leaders: ATI and nVidia
    - ATI Radeon X1900 XTX and nVidia GeForce 7900 GTX
  - Game consoles similar to GPUs (Xbox)
8Graphics Pipeline
- Application
  - LOD selection, frustum culling, portal culling
- Geometry Processing
  - Modelview/projection transform, clipping, lighting, division by w, primitive assembly, viewport transform, backface culling
- Rasterization
  - Scan conversion, fragment shading, color and texture interpolation, frame buffer ops (Z-buffer, alpha blending, ...)
- Output
  - Output to device
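Two of the geometry-processing steps, division by w and the viewport transform, are easy to show in code. The sketch below assumes OpenGL-style normalized device coordinates in [-1, 1] and a depth range of [0, 1]; the function names and the 640x480 viewport are illustrative choices, not part of the slides.

```python
def perspective_divide(clip):
    """Division by w: homogeneous clip coordinates -> normalized device coordinates."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)

def viewport_transform(ndc, x0, y0, width, height):
    """Map NDC x, y in [-1, 1] to window coordinates and depth to [0, 1]."""
    x, y, z = ndc
    return (x0 + (x + 1.0) * 0.5 * width,
            y0 + (y + 1.0) * 0.5 * height,
            (z + 1.0) * 0.5)

ndc = perspective_divide((2.0, -1.0, 0.0, 4.0))
print(ndc)                                       # (0.5, -0.25, 0.0)
print(viewport_transform(ndc, 0, 0, 640, 480))   # (480.0, 180.0, 0.5)
```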
9Graphics Pipeline
- Application
  - LOD selection, frustum culling, portal culling
- Geometry Processing
  - VERTEX SHADER (programmable)
  - Clipping, division by w, primitive assembly, viewport transform, backface culling
- Rasterization
  - Scan conversion
  - FRAGMENT SHADER (programmable)
- Output
  - Output to device
10Vertex and Fragment Shaders
11Modern Graphics Hardware
- GPU: Graphics Processing Unit
- Vector processor
  - Operates on 4-tuples
    - Position (x, y, z, w)
    - Color (red, green, blue, alpha)
    - Texture coordinates (s, t, r, q)
  - 4-tuple ops in 1 clock cycle
- SIMD: Single Instruction, Multiple Data
  - ADD, MUL, SUB, DIV, MADD, ...
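As a rough illustration of a 4-tuple operation, the sketch below emulates a component-wise MADD (multiply-add) in plain Python. On the GPU all four components are processed by a single instruction; the loop here only mimics the result, and the name `madd` and the sample values are assumptions for the example.

```python
def madd(a, b, c):
    """Component-wise multiply-add on 4-tuples: a * b + c."""
    return tuple(x * y + z for x, y, z in zip(a, b, c))

color = (0.5, 0.25, 1.0, 1.0)   # (red, green, blue, alpha)
scale = (2.0, 2.0, 0.5, 1.0)
bias = (0.0, 0.25, 0.0, 0.0)
print(madd(color, scale, bias))  # (1.0, 0.75, 0.5, 1.0)
```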
12Modern Graphics Hardware
- Pipelining
  - Number of stages
- Parallelism
  - Number of parallel processes
- Parallelism + pipelining
  - Number of parallel pipelines
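The effect of pipelining and parallelism on throughput can be estimated with a simple cycle count. The model below is an idealized assumption (every stage takes one clock cycle, no stalls, work split evenly across pipelines), and the function `cycles` is hypothetical.

```python
import math

def cycles(num_items, num_stages, num_pipelines=1, pipelined=True):
    """Clock cycles needed to process num_items, assuming one cycle per stage."""
    per_pipeline = math.ceil(num_items / num_pipelines)
    if per_pipeline == 0:
        return 0
    if pipelined:
        # The first item needs num_stages cycles; each further item finishes one cycle later.
        return num_stages + per_pipeline - 1
    # Without pipelining, each item occupies the unit for all stages.
    return num_stages * per_pipeline

print(cycles(5, 3, pipelined=False))                   # 15: neither technique
print(cycles(5, 3))                                    # 7: pipelining only
print(cycles(5, 3, num_pipelines=5, pipelined=False))  # 3: parallelism only
print(cycles(5, 3, num_pipelines=2))                   # 5: two parallel pipelines
```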
13Modern Graphics Hardware
- Parallelism + pipelining: ATI Radeon 9700
  - 4 vertex pipelines
  - 8 pixel pipelines
14Modern Graphics Hardware
- Features of the ATI Radeon X1900 XTX
  - Core speed: 650 MHz
  - 48 pixel shader processors
  - 8 vertex shader processors
  - 51 GB/s memory bandwidth
  - 512 MB memory
15Modern Graphics Hardware
Graphics Card
High bandwidth: 51 GB/s
GPU: 650 MHz
Graphics memory: ½ GB
Output
AGP bus: 2 GB/s
Parallel Processes
Processor Chip
Main memory: 1 GB
AGP memory: ½ GB
High bandwidth: 77 GB/s
CPU: 3 GHz
3 GB/s
Cache: ½ MB
16Stream Programming
- Input: stream of data records
- Output: stream(s) of data records
- Kernel: operates sequentially on the data records, accessing one record at a time!
- Read-only memory: record-independent, read-only memory
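The stream model above can be sketched in a few lines of Python. The kernel sees exactly one record at a time plus a shared read-only table; the names `run_kernel`, `scale_kernel`, and the `"factor"` entry are invented for this illustration.

```python
def run_kernel(kernel, input_stream, read_only):
    """Apply the kernel to each record independently, yielding the output stream."""
    for record in input_stream:
        # The kernel gets one record plus record-independent read-only memory.
        yield kernel(record, read_only)

def scale_kernel(record, read_only):
    """Example kernel: scale a record by a constant kept in read-only memory."""
    return record * read_only["factor"]

output_stream = list(run_kernel(scale_kernel, [1.0, 2.0, 3.0], {"factor": 2.0}))
print(output_stream)  # [2.0, 4.0, 6.0]
```

Because no record depends on another, the loop iterations could run in any order or in parallel, which is what makes the model a good fit for GPU hardware.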
17GPU Stream Programming
- Vertex shader
  - Input and output streams
    - Vertices, normals, colors, texture coordinates
  - Read-only memory
    - Uniform variables
    - Uniform: constant per stream
    - Textures, floats, ints, arrays, ...
- Fragment shader
  - Input and output streams
    - Pixels
    - Z-values
  - Read-only memory
    - See above
18Languages
- Assembly
- Cg: C for Graphics
- HLSL: High Level Shading Language
- GLSL: OpenGL Shading Language
- Sh
- BrookGPU
19Assembly
- Specialized instruction set
  - DP4: 4-tuple dot product
  - RSQ: reciprocal square root
  - MAD: multiply and add
  - DPH: homogeneous dot product
  - SCS: sine and cosine
  - LRP: linear interpolation
  - TEX: texture map
  - ...
- Nowadays not used directly anymore
  - Generated by high-level language compilers

    !!ARBvp1.0
    ATTRIB pos = vertex.position;
    PARAM mat[4] = { state.matrix.mvp };
    # Transform by concatenation of the
    # MODELVIEW and PROJECTION matrices.
    DP4 result.position.x, mat[0], pos;
    DP4 result.position.y, mat[1], pos;
    DP4 result.position.z, mat[2], pos;
    DP4 result.position.w, mat[3], pos;
    # Pass the primary color through w/o lighting.
    MOV result.color, vertex.color;
    END
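To make the role of the four DP4 instructions concrete, the Python sketch below mirrors the vertex program: each output component is the dot product of one matrix row with the position, which together form a matrix-vector product. The names `dp4` and `vertex_program` and the sample translation matrix are assumptions made for the example.

```python
def dp4(a, b):
    """4-tuple dot product, like the DP4 instruction."""
    return sum(x * y for x, y in zip(a, b))

def vertex_program(pos, color, mvp):
    """Transform the position by the matrix rows and pass the color through."""
    position = tuple(dp4(row, pos) for row in mvp)  # one DP4 per output component
    return position, color                          # MOV result.color, vertex.color

# Sample matrix: a translation by (2, 3, 4), standing in for modelview-projection.
mvp = ((1.0, 0.0, 0.0, 2.0),
       (0.0, 1.0, 0.0, 3.0),
       (0.0, 0.0, 1.0, 4.0),
       (0.0, 0.0, 0.0, 1.0))
position, color = vertex_program((1.0, 1.0, 1.0, 1.0), (1.0, 0.0, 0.0, 1.0), mvp)
print(position)  # (3.0, 4.0, 5.0, 1.0)
print(color)     # (1.0, 0.0, 0.0, 1.0)
```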
20Cg / HLSL / GLSL
- High-level programming languages
- Static conditional jumps
  - if, while, for, ...
- Data-dependent conditional jumps
  - SIMD: fragment shaders are only efficient with coherent program flow!
- No pointers!
    struct appdata {
        float4 position    : POSITION;
        float3 normal      : NORMAL;
        float3 color       : DIFFUSE;
        float3 VertexColor : SPECULAR;
    };

    struct vfconn {
        float4 HPOS : POSITION;
        float4 COL0 : COLOR0;
    };

    vfconn main(
        appdata IN,
        uniform float4 Kd,
        uniform float4x4 mvp )
    {
        vfconn OUT;
        OUT.HPOS = mul( mvp, IN.position );
        OUT.COL0.xyz = Kd.xyz * IN.VertexColor.xyz;
        OUT.COL0.w = 1.0;
        return OUT;
    }
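The remark on this slide about coherent program flow can be illustrated with a toy cost model. It assumes that a group of fragments runs in lockstep and that, when the fragments disagree on a branch, the hardware executes both sides for the whole group. This is a simplification rather than a description of any particular GPU, and `simd_branch_cost` is a hypothetical helper.

```python
def simd_branch_cost(conditions, cost_if, cost_else):
    """Cycles for one lockstep group: each side runs if at least one fragment takes it."""
    takes_if = any(conditions)
    takes_else = not all(conditions)
    return (cost_if if takes_if else 0) + (cost_else if takes_else else 0)

# Coherent flow: every fragment takes the cheap branch.
print(simd_branch_cost([True, True, True, True], 10, 30))      # 10
# Divergent flow: one fragment forces the group through both branches.
print(simd_branch_cost([True, False, True, True], 10, 30))     # 40
# Coherent flow on the expensive branch.
print(simd_branch_cost([False, False, False, False], 10, 30))  # 30
```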
21Sh
- Shader code embedded in C++

    // C++ code
    vsh = SH_BEGIN_VERTEX_PROGRAM {
        ShInputNormal3f normal;
        ShInputPosition4f p;
        ShOutputPoint4f ov;
        ShOutputNormal3f on;
        ShOutputVector3f lvv;
        ShOutputPosition4f opd;
        opd = Globals::mvp | p;
        on = normalize( Globals::mv | normal );
        ov = -normalize( Globals::mv | p );
        lvv = normalize( Globals::lightPos - ( Globals::mv | p )( 0,1,2 ) );
    } SH_END_PROGRAM;
    fsh = SH_BEGIN_FRAGMENT_PROGRAM {
        ShInputVector4f v;
        ShInputNormal3f n;
        ShInputVector3f lvv;
        ...
22BrookGPU
- GPGPU language
  - General Purpose GPU language
  - Brook: streaming extension of C
  - BrookGPU: GPU port of Brook
- No computer graphics knowledge required!

    kernel void k( float s<>, float3 f, float a[10][10], out float o<> ) { ... }
    float a<100>;
    float b<100>;
    float c<10,10>;
    streamRead( a, data1 );
    streamRead( b, data2 );
    streamRead( c, data3 );
    // Call kernel "k"
    k( a, 3.2f, c, b );
    streamWrite( b, result );
23Screenshots
- nVidia Toolkit: Reflection-Bump Mapping
24Screenshots
25Screenshots
26Exercise
- Vertex Shader in Cg
- Free Form Deformation
- Framework available on website
Vertex Shader
27More Information
- nVidia
  - http://developer.nvidia.com/
- ATI
  - http://www.ati.com/developer/
- General Purpose GPU Programming
  - http://www.gpgpu.org
- GPU Programming and Architecture
  - http://www.cis.upenn.edu/~suvenkat/700/
- Hardware
  - http://www.beyond3d.com
  - http://www.tomshardware.com
28Questions?