Title: Introduction to Programmable Hardware
1 Introduction to Programmable Hardware
2 Traditional Graphics Pipeline
- Transform & lighting (per-vertex operations)
- Setup & rasterizer (per-primitive operations)
- Texture blending (per-fragment operations)
- Frame buffer & anti-aliasing
3 Programmable Features
- Vertex Programming
- Pixel Shader
- Texture shader
- Register combiner
- Based on nVIDIA architecture
4 Vertex Program (contd)
- Vertex programming offers a programmable T&L unit
- Gives the programmer total control of vertex processing
- Pipeline: user-defined vertex processing (as an alternative to fixed transform & lighting) -> setup & rasterizer -> texture blending -> frame buffer & anti-aliasing
5 Vertex Program (contd)
- Pipeline: vertex program (as an alternative to fixed transform & lighting) -> setup & rasterizer -> texture blending -> frame buffer & anti-aliasing
6 Vertex Program (contd)
- Vertex program
  - Assembly language interface to the T&L unit
  - GPU instruction set to perform all vertex math
  - Reads an untransformed, unlit vertex
  - Creates a transformed vertex
  - Optionally:
    - Lights a vertex
    - Creates texture coordinates
    - Creates fog coordinates
    - Creates point sizes
7 Create Vertex Program
- Programs (assembly) are defined inline as character strings

static const GLubyte vpgm[] = "\
!!VP1.0 \
DP4 o[HPOS].x, c[0], v[0]; \
DP4 o[HPOS].y, c[1], v[0]; \
DP4 o[HPOS].z, c[2], v[0]; \
DP4 o[HPOS].w, c[3], v[0]; \
MOV o[COL0], v[3]; \
END";
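The four DP4 instructions in the program above amount to multiplying the vertex position by a 4x4 matrix whose rows sit in c[0] to c[3], and the MOV passes the vertex color through. A minimal CPU-side sketch of that behavior follows; the names `Vec4`, `dp4`, and `run_vertex_program` are made up for illustration and are not part of any OpenGL API.

```c
/* Four-component float register, standing in for one quad-float GPU register. */
typedef struct { float x, y, z, w; } Vec4;

/* DP4: four-component dot product, the work done by one DP4 instruction. */
float dp4(Vec4 a, Vec4 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* Hypothetical CPU emulation of the program on this slide.
 * c      : constant registers c[0]..c[3], one matrix row each
 * v_pos  : stands in for v[0] (untransformed position)
 * v_col  : stands in for v[3] (vertex color)
 * o_hpos : receives o[HPOS], o_col0 receives o[COL0] */
void run_vertex_program(const Vec4 c[4], Vec4 v_pos, Vec4 v_col,
                        Vec4 *o_hpos, Vec4 *o_col0) {
    o_hpos->x = dp4(c[0], v_pos);  /* DP4 o[HPOS].x, c[0], v[0]; */
    o_hpos->y = dp4(c[1], v_pos);  /* DP4 o[HPOS].y, c[1], v[0]; */
    o_hpos->z = dp4(c[2], v_pos);  /* DP4 o[HPOS].z, c[2], v[0]; */
    o_hpos->w = dp4(c[3], v_pos);  /* DP4 o[HPOS].w, c[3], v[0]; */
    *o_col0 = v_col;               /* MOV o[COL0], v[3];         */
}
```

Loading the rows of a modelview-projection matrix into `c` would make `o_hpos` the clip-space position, which is presumably what the slide's program is meant to produce.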
8 Programming Model
- Vertex source: v[0] ... v[15] (16x4 registers)
- Program constants: c[0] ... c[95] (96x4 registers)
- Temporary registers: R0 ... R11 (12x4 registers)
- Vertex program: up to 128 instructions
- Vertex output: o[HPOS], o[COL0], o[COL1], o[FOGC], o[PSIZ], o[TEX0] ... o[TEX7] (15x4 registers)
- All registers are quad floats
9 Instruction Set: The Ops
- 17 instructions total
- MOV, MUL, ADD, MAD, DST
- DP3, DP4
- MIN, MAX, SLT, SGE
- RCP, RSQ, LOG, EXP, LIT
- ARL
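To give a feel for what a few of these opcodes do, here is a small CPU-side sketch of MAD, SLT, SGE, MIN, and MAX. It assumes the component-wise semantics of the NV_vertex_program instruction set (SLT and SGE write 1.0 or 0.0 per component); the type `Reg` and the `op_*` function names are invented for this illustration.

```c
/* One quad-float register: four float components. */
typedef struct { float v[4]; } Reg;

/* MAD: component-wise multiply-add, d = a * b + c. */
Reg op_mad(Reg a, Reg b, Reg c) {
    Reg d;
    for (int i = 0; i < 4; i++) d.v[i] = a.v[i] * b.v[i] + c.v[i];
    return d;
}

/* SLT: set on less than, d = (a < b) ? 1.0 : 0.0 per component. */
Reg op_slt(Reg a, Reg b) {
    Reg d;
    for (int i = 0; i < 4; i++) d.v[i] = (a.v[i] < b.v[i]) ? 1.0f : 0.0f;
    return d;
}

/* SGE: set on greater or equal, d = (a >= b) ? 1.0 : 0.0 per component. */
Reg op_sge(Reg a, Reg b) {
    Reg d;
    for (int i = 0; i < 4; i++) d.v[i] = (a.v[i] >= b.v[i]) ? 1.0f : 0.0f;
    return d;
}

/* MIN: component-wise minimum. */
Reg op_min(Reg a, Reg b) {
    Reg d;
    for (int i = 0; i < 4; i++) d.v[i] = (a.v[i] < b.v[i]) ? a.v[i] : b.v[i];
    return d;
}

/* MAX: component-wise maximum. */
Reg op_max(Reg a, Reg b) {
    Reg d;
    for (int i = 0; i < 4; i++) d.v[i] = (a.v[i] > b.v[i]) ? a.v[i] : b.v[i];
    return d;
}
```

The instruction set has no branch instruction, so SLT and SGE results are typically multiplied into other values to emulate conditionals, and MIN and MAX together give a clamp.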
10 Pixel Shader
User-defined per-pixel shading
11 Texture Mapping/Blending
- Traditional OpenGL texture mapping/blending
- Vertex colors -> Gouraud shading -> fragment color
- Texture coordinate -> texture unit
- Fragment color and texture unit result -> blend colors -> fragment color output
12 Multitexturing
- An optional extension of OpenGL 1.2
- Fragment color input
  -> texture unit 0, blend colors
  -> texture unit 1, blend colors
  -> texture unit 2, blend colors
  -> texture unit 3, blend colors
  -> fragment color output
13 Texture Compositing
- Texture fetching supplies Tex0 and Tex1
- Fragment color -> texture environment stages (starting with Texture Environment 0)
- Specular color sum: adds the specular color
- Fog application: applies the fog color/factor
14 Compositing Operator
- Choice of 5 set functions for RGB and alpha
- Ct = texture color, At = texture alpha
- Cf = incoming fragment color, Af = incoming fragment alpha
- Cc = color assigned to GL_TEXTURE_ENV_COLOR
- Specular color addition and fog application happen post-environment

Function   RGB                    Alpha
Replace    Ct                     At
Modulate   Cf Ct                  Af At
Decal      Cf (1 - At) + Ct At    Af
Blend      Cf (1 - Ct) + Cc Ct    Af At
Add        Cf + Ct                Af At
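The table can be read as one small function per row. The sketch below evaluates all five rows on the CPU for an RGBA texture; `Color`, `EnvFunc`, and `tex_env` are illustrative names, not OpenGL identifiers, and the clamp on the Add row is included on the assumption that the environment result is clamped to [0, 1] as in standard OpenGL.

```c
/* RGBA color with float components in [0, 1]. */
typedef struct { float r, g, b, a; } Color;

/* The five texture environment functions from the table. */
typedef enum { ENV_REPLACE, ENV_MODULATE, ENV_DECAL, ENV_BLEND, ENV_ADD } EnvFunc;

static float clamp01(float x) { return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x); }

/* cf: incoming fragment color, ct: texture color, cc: environment color. */
Color tex_env(EnvFunc func, Color cf, Color ct, Color cc) {
    Color out = cf;
    switch (func) {
    case ENV_REPLACE:   /* RGB = Ct, A = At */
        out = ct;
        break;
    case ENV_MODULATE:  /* RGB = Cf Ct, A = Af At */
        out.r = cf.r * ct.r; out.g = cf.g * ct.g; out.b = cf.b * ct.b;
        out.a = cf.a * ct.a;
        break;
    case ENV_DECAL:     /* RGB = Cf (1 - At) + Ct At, A = Af */
        out.r = cf.r * (1.0f - ct.a) + ct.r * ct.a;
        out.g = cf.g * (1.0f - ct.a) + ct.g * ct.a;
        out.b = cf.b * (1.0f - ct.a) + ct.b * ct.a;
        out.a = cf.a;
        break;
    case ENV_BLEND:     /* RGB = Cf (1 - Ct) + Cc Ct, A = Af At */
        out.r = cf.r * (1.0f - ct.r) + cc.r * ct.r;
        out.g = cf.g * (1.0f - ct.g) + cc.g * ct.g;
        out.b = cf.b * (1.0f - ct.b) + cc.b * ct.b;
        out.a = cf.a * ct.a;
        break;
    case ENV_ADD:       /* RGB = Cf + Ct (clamped), A = Af At */
        out.r = clamp01(cf.r + ct.r);
        out.g = clamp01(cf.g + ct.g);
        out.b = clamp01(cf.b + ct.b);
        out.a = cf.a * ct.a;
        break;
    }
    return out;
}
```

With multitexturing, the output of one call would become `cf` for the next texture unit.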
15 Pixel Shader (contd)
- Based on nVIDIA's GF3/4 architecture
- Texture shader
  - 4 texture units
  - 23 different texture shader operations
    - Conventional (1D, 2D, 3D, texture rectangle, cube map)
    - Special case (none, pass through, cull fragment)
    - Dependent texture fetches (result of one texture lookup affects texture coords for subsequent unit)
    - Dependent texture fetches with dot product (and optional reflection) calculations
- Register combiners
  - 8 stages (general combiners) on GeForce3/4
  - Per-stage constants
16 Pixel Shader
- Based on nVIDIA's GF3/4 architecture
- Texture shader + register combiner
- Texture shader: fragment color input, then texture units 0 to 3, each running its own texture program
- Register combiner: combines the texture shader results into the fragment color output
17 Texture Shader
- Texture program example: conventional 2D texture

Shader operation:              Texture 2D
Texture coords:                (Si, Ti, Ri, Qi)
Texture fetch:                 Tex(Si/Qi, Ti/Qi)
Bound texture target/format:   2D, any format
Output color:                  (R, G, B, A)
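The texture fetch row above divides S and T by Q before the lookup. A minimal sketch of that projective divide, with the made-up names `TexCoord2` and `project_2d`, and assuming Q is nonzero:

```c
/* 2D lookup coordinates produced by the projective divide. */
typedef struct { float s, t; } TexCoord2;

/* Conventional 2D texture shader operation: the fetch uses (S/Q, T/Q).
 * R does not take part in a 2D lookup, so it is not a parameter here.
 * Assumes q != 0. */
TexCoord2 project_2d(float s, float t, float q) {
    TexCoord2 out;
    out.s = s / q;
    out.t = t / q;
    return out;
}
```

For ordinary, non-projective texturing Q is 1 and the coordinates pass through unchanged.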
18 Texture Shader (contd)
- Texture program example: pass through
19 Texture Shader (contd)
- Texture program example: dependent texture
20 Register Combiner
- GeForce 2 (only 2 general combiner stages)
- Register set: Fragment Color, Specular Color, Fog Color/Factor, Texture 0, Texture 1 (both from texture fetching), Spare 0
- General Combiner 0: 4 RGB inputs, 4 alpha inputs, 3 RGB outputs, 3 alpha outputs
- General Combiner 1: 4 RGB inputs, 4 alpha inputs, 3 RGB outputs, 3 alpha outputs
- Final Combiner: 6 RGB inputs, 1 alpha input
21 Register Combiner (contd)
- Register-based programming
- All textures and colors available for each and every texture blending stage
- 8 stages of blending in hardware, plus specular and fog
- Note that GeForce3 has 8 combiners and 4 textures
- Signed color arithmetic
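22 Diagram of a General Combiner
- RGB portion
  - Inputs A, B, C, D are read from the input RGB and alpha registers through input mappings
  - RGB function computes A op1 B, C op2 D, and AB op3 CD
  - Results pass through RGB scale/bias into the next combiner's RGB registers
- Alpha portion
  - Inputs A, B, C, D are read from the input alpha and blue registers through input mappings
  - Alpha function computes AB, CD, and AB op4 CD
  - Results pass through alpha scale/bias into the next combiner's alpha registers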
23 General Combiner Input Registers
24 The Register Set
- Primary (diffuse) color
  - initialized to RGBA of fragment's primary color
- Secondary (specular) color
  - initialized to RGB of fragment's secondary/specular color
  - alpha not initialized
- Texture 0 and Texture 1 colors
  - initialized to fragment's filtered RGBA texel from numbered texture unit
  - not initialized if numbered texture unit is disabled or non-existent
- Spare 0 and Spare 1
  - Alpha of Spare 0 is initialized to alpha of Texture 0 color (if enabled)
  - RGB of Spare 0 and all of Spare 1 are not initialized
- Fog
  - RGB is current fog color
  - alpha is fragment's fog factor (only available in final combiner)
  - read-only
- Constant color 0 and Constant color 1
  - initialized to user-defined RGBA value
  - read-only
- Zero
25 General Combiner Input Mappings
26 General Combiner Input Mappings

Mapping             f(x)                         Range
Signed Identity     f(x) = x                     [-1, 1] -> [-1, 1]
Unsigned Identity   f(x) = max(0, x)             [0, 1] -> [0, 1]
Expand Normal       f(x) = 2 max(0, x) - 1       [0, 1] -> [-1, 1]
Half Bias Normal    f(x) = max(0, x) - ½         [0, 1] -> [-½, ½]
Signed Negate       f(x) = -x                    [-1, 1] -> [1, -1]
Unsigned Invert     f(x) = 1 - min(max(0, x), 1) [0, 1] -> [1, 0]
Expand Negate       f(x) = -2 max(0, x) + 1      [0, 1] -> [1, -1]
Half Bias Negate    f(x) = -max(0, x) + ½        [0, 1] -> [½, -½]
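Each input mapping is a scalar function applied to every component a combiner reads. The following sketch writes the eight mappings from this slide as a single C function; the enum `InputMapping` and the function `map_input` are hypothetical names chosen for illustration, not identifiers from the OpenGL extension.

```c
/* The eight input mappings listed on this slide. */
typedef enum {
    MAP_SIGNED_IDENTITY,
    MAP_UNSIGNED_IDENTITY,
    MAP_EXPAND_NORMAL,
    MAP_HALF_BIAS_NORMAL,
    MAP_SIGNED_NEGATE,
    MAP_UNSIGNED_INVERT,
    MAP_EXPAND_NEGATE,
    MAP_HALF_BIAS_NEGATE
} InputMapping;

static float max0(float x) { return x > 0.0f ? x : 0.0f; }
static float min1(float x) { return x < 1.0f ? x : 1.0f; }

/* Apply one input mapping to a single register component. */
float map_input(InputMapping m, float x) {
    switch (m) {
    case MAP_SIGNED_IDENTITY:   return x;
    case MAP_UNSIGNED_IDENTITY: return max0(x);
    case MAP_EXPAND_NORMAL:     return 2.0f * max0(x) - 1.0f;
    case MAP_HALF_BIAS_NORMAL:  return max0(x) - 0.5f;
    case MAP_SIGNED_NEGATE:     return -x;
    case MAP_UNSIGNED_INVERT:   return 1.0f - min1(max0(x));
    case MAP_EXPAND_NEGATE:     return -2.0f * max0(x) + 1.0f;
    case MAP_HALF_BIAS_NEGATE:  return -max0(x) + 0.5f;
    }
    return x; /* unreachable for valid enum values */
}
```

Expand Normal is the mapping that turns a normal vector stored as a [0, 1] color back into a signed [-1, 1] vector, which is why it shows up in per-pixel lighting setups.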
27 General Combiner RGB Function
28 General Combiner RGB Functions

Configuration          AB output   CD output   Third output
Dot / Dot / Discard    A · B       C · D       discarded
Dot / Mult / Discard   A · B       CD          discarded
Mult / Dot / Discard   AB          C · D       discarded
Mult / Mult / Sum      AB          CD          AB + CD
Mult / Mult / Mux      AB          CD          mux(AB, CD)

- mux(AB, CD) selects either AB or CD by comparing Spare0 alpha against ½
- Dot product on RGB registers: A · B = (Ared Bred + Agreen Bgreen + Ablue Bblue, Ared Bred + Agreen Bgreen + Ablue Bblue, Ared Bred + Agreen Bgreen + Ablue Bblue)
- Multiplication on RGB registers: AB = (Ared Bred, Agreen Bgreen, Ablue Bblue)
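The difference between the dot and multiply outputs is easy to see in code. This is a small sketch of the two products and the sum output, using the invented names `Rgb`, `rgb_mult`, `rgb_dot`, and `rgb_sum`; scale/bias and clamping are left out, and the mux case is omitted.

```c
/* One RGB register value; components may be signed after input mapping. */
typedef struct { float r, g, b; } Rgb;

/* Mult: component-wise product, AB = (Ar Br, Ag Bg, Ab Bb). */
Rgb rgb_mult(Rgb a, Rgb b) {
    Rgb out = { a.r * b.r, a.g * b.g, a.b * b.b };
    return out;
}

/* Dot: three-component dot product replicated to all three channels. */
Rgb rgb_dot(Rgb a, Rgb b) {
    float d = a.r * b.r + a.g * b.g + a.b * b.b;
    Rgb out = { d, d, d };
    return out;
}

/* Sum: third output of the Mult / Mult / Sum configuration, AB + CD. */
Rgb rgb_sum(Rgb ab, Rgb cd) {
    Rgb out = { ab.r + cd.r, ab.g + cd.g, ab.b + cd.b };
    return out;
}
```

For example, `rgb_sum(rgb_mult(a, b), rgb_mult(c, d))` corresponds to the Mult / Mult / Sum row of the table.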
29 Diagram of the Final Combiner (OpenGL only)
- RGB portion
  - Inputs A, B, C, D, E, F are chosen from the available RGB inputs (input RGB and alpha registers) through input mappings
  - A multiplier forms the product EF from E and F
  - A color sum unit adds Spare0 and the secondary color
  - RGB out = AB + (1 - A)C + D, clamped to [0, 1]
- Alpha portion
  - One input, read from the input alpha and blue registers through an input mapping, becomes alpha out
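The RGB output of the final combiner is a linear interpolation between B and C controlled by A, plus D, clamped to [0, 1]. A minimal sketch of that formula follows; `Rgb` and `final_combiner_rgb` are illustrative names, and the inputs are assumed to have already passed through their input mappings.

```c
/* One RGB value. */
typedef struct { float r, g, b; } Rgb;

static float clamp01(float x) { return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x); }

/* Per channel: out = A*B + (1 - A)*C + D, clamped to [0, 1]. */
static float final_channel(float a, float b, float c, float d) {
    return clamp01(a * b + (1.0f - a) * c + d);
}

/* RGB portion of the final combiner, applied to each channel independently. */
Rgb final_combiner_rgb(Rgb a, Rgb b, Rgb c, Rgb d) {
    Rgb out;
    out.r = final_channel(a.r, b.r, c.r, d.r);
    out.g = final_channel(a.g, b.g, c.g, d.g);
    out.b = final_channel(a.b, b.b, c.b, d.b);
    return out;
}
```

One plausible use is fog: with A set to the fog factor, B to the shaded color, C to the fog color, and D to zero, the formula reduces to a standard fog blend.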