DirectX High-Level Shading Language

1
DirectX High-Level Shading Language
  • Chas. Boyd
  • DirectX Graphics Architect
  • Microsoft

2
Outline
  • What drove the language design?
  • Background
  • What does it look like?
  • Syntax definition
  • How does it work?
  • API integration
  • How to use it efficiently?
  • Tips and Tricks

3
DirectX 8 Assembly
  • tex t0 // base texture
  • tex t1 // environment map
  • add r0, t0, t1 // apply reflection

4
DirectX 9 HLL Syntax
  • outColor = tex2d( baseTextureCoord, baseTexture )
    + texCube( EnvironmentMapCoord, Environment )

5
Why an HLL?
  • Scalability vs. hardware
  • Programming complexity
  • Higher Level Language solves these

6
Design Goals
  • High level enough to hide hardware specific
    details
  • Simple enough for efficient code generation
  • Familiar enough to reduce learning curve
  • With enough optimizing back-ends for portability

7
Design Baseline
  • C-like syntax
  • A standard language
  • like C or C++ or HTML
  • in the VS.NET IDE

8
Graphics Architecture
  • Application
  • D3DX: assembler, compiler, effects, utilities
  • Direct3D: semantic mapping
  • Driver: code translation
  • Hardware

9
Preprocessor
  • #define
  • #elif
  • #else
  • #endif
  • #error
  • #if
  • #include
  • #line
  • #undef

10
Types
  • Basic types
  • float
  • int
  • bool
  • double
  • half
  • Structs and arrays supported

11
Vectors and Matrices
  • typedef to create shorthand for user-defined types
  • float1, float2, float3, float4
  • float1x1, float1x2, through float4x4
  • Defined for all basic types
  • int1-4, half1-4, etc.
  • Component access and swizzles supported on
    vector/matrix types
  • FloatVector.xyz
  • FloatMatrix._11_12 or FloatMatrix[1][1]
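
The component access and swizzle forms listed above can be sketched in a few lines. This is a minimal illustration, not code from the presentation: the function name SwizzleDemo and its parameters are hypothetical, and the return expression exists only so that every value is used.

```hlsl
// Hypothetical helper; names are for illustration only.
float4 SwizzleDemo(float4 v, float4x4 m)
{
    float3 xyz  = v.xyz;         // first three components
    float2 wx   = v.wx;          // components can be reordered
    float4 xxxx = v.xxxx;        // or replicated
    float2 row  = m._11_12;      // one-based names: row 1, columns 1 and 2
    float  m00  = m[0][0];       // zero-based array access; same element as m._11
    float3x3 upper = (float3x3)m; // cast down to the upper-left 3x3

    // Combine everything so no value is unused.
    return float4(xyz * m00, wx.x) + float4(row, xxxx.zw) + float4(upper[0], 0.0f);
}
```

Note that the `_11` style names count from one while the array form counts from zero, so `m._11` and `m[0][0]` refer to the same element.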

12
Variables
  • Local / global
  • Static
  • Global variables that are not externally visible
  • Const
  • Cannot be modified by the shader program
  • Can be set external to the program
  • Can have initializers
  • Can have semantics
  • For function parameters
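
A small sketch of the storage classes above, assuming DirectX 9 era HLSL rules. The names Brightness, Tint, Gain, and TintPS are hypothetical and chosen only for illustration.

```hlsl
float Brightness;                                 // global: visible to the application
const float4 Tint = { 1.0f, 0.5f, 0.25f, 1.0f };  // read-only in the shader; the initializer acts as a default value
static float Gain = 2.0f;                         // static global: not externally visible

// Semantics on the parameter and on the return value.
float4 TintPS(float4 diffuse : COLOR0) : COLOR0
{
    float4 c = diffuse * Tint * Brightness;       // local variable
    return saturate(c * Gain);
}
```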

13
Operators
  • Pretty much all of C's operators
  • Including ?:, ++, --, +, -, etc.
  • No new language semantics
  • Despite temptation
  • Arithmetic operators are per component
  • Matrix multiply is an intrinsic function
  • Logical operators are per component
  • No bitwise operators
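
The per-component rule is easy to misread, so here is a hedged sketch. OperatorDemo is a hypothetical name, and the vector form of `?:` is assumed to behave per component as the slide describes for DirectX 9 era compilers.

```hlsl
// Hypothetical helper; names are for illustration only.
float4 OperatorDemo(float4 a, float4 b, float4x4 M)
{
    float4 product = a * b;            // per component, not a dot product
    float  d       = dot(a, b);        // the dot product is an intrinsic
    float4 xformed = mul(M, a);        // matrix multiply is an intrinsic
    float4 bigger  = (a > b) ? a : b;  // comparison and selection per component (assumed DX9 behavior)
    return product + xformed + bigger * d;
}
```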

14
Statement Syntax
  • { [statements] }
  • [expression] ;
  • return [expression] ;
  • if ( expression ) statement [else statement]
  • for ( [expression | variable_declaration] ;
    [expression] ; [expression] ) statement

15
Some Intrinsic functions
16
User Functions
  • Standard C-like functions
  • Output type and input parameters
  • Parameters can be passed by copy in/copy out
    mechanism
  • in/out declaration
  • Inlined internally (no recursion)

17
Functions (cont.)
  • Can be static (not externally accessible)
  • Parameters of non-static functions must have
    Direct3D declarator semantics
  • Parameters can be marked const
  • Parameters can have default initializers
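
The function rules from the last two slides can be sketched as follows. All names here (Unpack, SplitColor, MainPS) are hypothetical; the sketch shows a static helper, copy-out parameters, and a non-static entry point whose parameters carry semantics.

```hlsl
// static: helper that is not externally accessible
static float3 Unpack(float3 c)
{
    return c * 2.0f - 1.0f;
}

// out parameters are copied back to the caller when the function returns
void SplitColor(in float4 color, out float3 rgb, out float alpha)
{
    rgb   = color.rgb;
    alpha = color.a;
}

// Non-static entry point: parameter and return value have semantics
float4 MainPS(const float4 diffuse : COLOR0) : COLOR0
{
    float3 rgb;
    float  alpha;
    SplitColor(diffuse, rgb, alpha);
    return float4(Unpack(rgb) * 0.5f + 0.5f, alpha);
}
```

Because calls are inlined, SplitColor costs no call overhead, but neither function may call itself.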

18
Differences from C
  • No pointers
  • No recursion

19
HLSL Summary
  • Ease of Use
  • Enable software developers
  • Consistency of Implementation
  • Enable multiple vendors
  • Management of Evolution
  • Enable multiple generations
  • Result
  • Fundamental architecture of DXG software stack
    and higher level language

20
Geometry Mapping
  • DirectX 8 Vertex Shaders assume a data layout
  • Decl and shader code are tied together
  • Forces shader author to communicate with geometry
    provider
  • Standard register conventions can help some
  • Complicates combining shaders

21
Semantics
  • DirectX 9 decouples decl from VS
  • Both decl and VS refer to semantics rather than
    register names
  • Direct3D runtime connects appropriate vertex
    stream data to Vertex Shader registers
  • Key feature of DirectX 9 low-level API
  • driven by HLSL and shader requirements

22
DX8 Vertex Declaration
  • Vertex layout: Strm0, Strm1
  • Declaration: v0, skip, v1
  • Shader program: vs 1.1, mov r0, v0
  • Shader handle: covers the declaration and the shader program

23
New Vertex Declaration
  • Vertex layout: two layouts shown, one using Strm0 and Strm1,
    one using Strm0 only
  • Declaration: pos, norm, diff (for each layout)
  • Shader program (shader handle), shown once per layout:
    vs 1.1
    dcl_position v0
    dcl_diffuse v1
    mov r0, v0

24
Vertex Declaration
  • struct D3DVERTEXELEMENT9 {
      Stream;      // id from setstream()
      Offset;      // offset of vertex data into the stream
      Type;        // float vs byte, etc.
      Method;      // tessellator op
      Usage;       // default semantic (pos, etc)
      UsageIndex;  // e.g. texcoord
    }

25
VS Input Semantics
  • position[n]
  • blendweight[n]
  • blendindices[n]
  • normal[n]
  • psize[n]
  • diffuse[n]
  • specular[n]
  • texcoord[n]
  • tangent[n]
  • binormal[n]

26
VS output / PS input semantics
  • position
  • psize
  • fog
  • color[n]
  • texcoord[n]
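
To make the binding concrete, here is a minimal vertex shader sketch that names its inputs and outputs by semantic rather than by register. The struct names, MainVS, and the WorldViewProj constant are assumptions for illustration, and the application is assumed to supply the matrix in the layout that `mul(matrix, vector)` expects.

```hlsl
float4x4 WorldViewProj;   // hypothetical constant, set by the application

struct VS_INPUT
{
    float4 Pos    : POSITION;   // bound to vertex stream data by semantic
    float3 Normal : NORMAL;
    float2 Tex0   : TEXCOORD0;
};

struct VS_OUTPUT
{
    float4 Pos   : POSITION;    // VS output / PS input semantics
    float4 Color : COLOR0;
    float2 Tex0  : TEXCOORD0;
};

VS_OUTPUT MainVS(VS_INPUT In)
{
    VS_OUTPUT Out;
    Out.Pos   = mul(WorldViewProj, In.Pos);
    Out.Color = float4(In.Normal * 0.5f + 0.5f, 1.0f);  // visualize the normal as a color
    Out.Tex0  = In.Tex0;
    return Out;
}
```

Nothing in the shader mentions v0 or v1, so the same code can be paired with any vertex declaration that supplies these semantics.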

27
Uses for Semantics
  • A data binding protocol
  • Between vertex data and shaders
  • Between pixel and vertex shaders
  • Between pixel shaders and hardware
  • Between shader fragments

28
Integrating with Applications
  • Extract disassembly and use as .asm shader code,
    a la DX8
  • Use compiled shader object
  • Enables constant table access
  • Via ID3DXConstantTable Interface
  • Use in an effect object
  • Manage constants, fallbacks, etc.
  • Via ID3DXEffect Interface

29
Language API Standalone
  • Compiler returns a VS or PS and a symbol table
  • Maps extern constants to registers
  • Any expression of constants (i.e. per primitive
    expressions) still performed per vertex
  • Symbol table is a set of constants
  • ID3DXConstantTable interface

30
ID3DXConstantTable
  • Exposes constant parameter metadata
  • For convenient specification of shader input data
  • SetMatrix( curv, matrix )
  • String or handle
  • D3DXHANDLE hHandle
  • SetVector()
  • SetValue()
  • Use effect parameters
  • Per primitive expressions of parameters computed
    outside the vertex shader

31
Performance 
  • Compiler updates will be frequent
  • Microsoft has good compiler people

32
Current Back Ends
  • Vertex Shader 1.1, 2.0
  • Pixel Shader 1.1, 1.4, 2.0

33
Tips And Tricks
  • Using the int datatype
  • Using matrix datatypes
  • How do if statements work
  • Using constant specialization
  • Pixel shader 1.x optimizations

34
Int Datatype
  • Declare indexing variables as ints
  • avoids unnecessary frc's used to truncate
  • allows for other int optimizations
  • What are the frcs required for?
  • The float to int truncation must happen before
    multiplying by the size of the datatype for
    correct results

35
Int Index example
  • OutPos = mul(WorldArray[Index], Pos)

// float Index
frc r0.w, r1.w
add r2.w, -r0.w, r1.w
mul r9.w, r2.w, c61.x
mov a0.x, r9.w
m4x4 oPos, v0, c0[a0.x]

// int Index
mul r0.w, c60.x, r1.w
mov a0.x, r0.w
m4x4 oPos, v0, c0[a0.x]
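
On the HLSL side, the difference between the two listings comes down to the declared type of the index. A minimal sketch, assuming a hypothetical array size of 4 and hypothetical function names:

```hlsl
float4x4 WorldArray[4];   // hypothetical array size

// Index declared as float: the compiler must truncate it before use
float4 SkinFloatVS(float4 Pos : POSITION, float Index : BLENDINDICES) : POSITION
{
    return mul(WorldArray[Index], Pos);
}

// Index declared as int: no truncation needed before the address computation
float4 SkinIntVS(float4 Pos : POSITION, int Index : BLENDINDICES) : POSITION
{
    return mul(WorldArray[Index], Pos);
}
```

The int version is only correct if the vertex data really holds whole numbers.
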
36
Matrix Datatypes
  • Advantages over array of vectors
  • Will be stored in optimal format
  • Column major or row major depending on usage
  • Easy to cast down to 4x3, 3x3, etc
  • Allows for better performance and correct
    behavior
  • Matrix is supported by set of intrinsics
  • Column major is preferred storage
  • Recommend mul(matrix, vector) order
  • Allows the compiler to use dp4/3s
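
A short sketch of the recommended `mul(matrix, vector)` order and of casting a matrix down. The World constant and the function names are hypothetical; the instruction counts in the comments describe what the slide says the compiler can do, not a guaranteed result.

```hlsl
float4x4 World;   // hypothetical constant, set by the application

// Each output component is one row of the matrix dotted with the vector,
// which lets the compiler emit dp4 instructions.
float4 TransformPosition(float4 pos)
{
    return mul(World, pos);
}

// Casting down to 3x3 drops translation, so dp3 instructions suffice.
float3 TransformNormal(float3 n)
{
    return mul((float3x3)World, n);
}
```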

37
If statements
  • All back ends support if statements
  • If branching is not supported (e.g. vs 1.1)
  • Both sides of the if are executed and final
    result chosen
  • Depending on the conditions this can be expensive
  • If constant branching is supported (e.g. 2.0)
  • If the condition is constant, constant branch
    instructions are used
  • Otherwise it falls back to the vs 1.1 solution

38
If statement example
if (Value > 0)
    Position = Value1;
else
    Position = Value2;

// calculate lerp value based on Value > 0
mov r1.w, c2.x
slt r0.w, c3.x, r1.w
// lerp between Value1 and Value2
mov r7, -c1
add r2, r7, c0
mad oPos, r0.w, r2, c1
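
Written back in HLSL, the generated code above is equivalent to computing a 0 or 1 selector and interpolating with it. This is a sketch of that equivalence, not compiler output; the function name SelectPosition is hypothetical, and Value, Value1, and Value2 are assumed to be shader constants as in the example.

```hlsl
float  Value;
float4 Value1;
float4 Value2;

float4 SelectPosition() : POSITION
{
    // slt yields 1.0 when 0 < Value, otherwise 0.0
    float t = (Value > 0) ? 1.0f : 0.0f;

    // mad oPos, t, (Value1 - Value2), Value2
    return lerp(Value2, Value1, t);
}
```

Both Value1 and Value2 are evaluated regardless of the condition, which is why an if with expensive branches costs the sum of both sides on vs 1.1.
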
39
Constant Specialization
  • Specify constants that are to be literals
  • via ID3DXEffectCompiler Interface
  • Call CompileShader() method
  • returns pre-optimized shader or effect
  • Easily generate multiple shaders optimized for
    specific cases
  • Can help shader management by generating them on
    the fly

40
PS 1.x optimizations
  • Modifiers automatically used
  • Complement, negate, x2, sat, etc
  • Optimizes for co-issue
  • Instruction reordering done to utilize co-issue
  • Still keep 1.x shaders simple
  • Doesn't have arbitrary swizzles
  • If bad swizzle requested, compile will fail
  • Limited instruction count
  • Complex shaders are possible
  • Modifiers allow for a lot of computation in a
    small number of instructions
  • Effective co-issue use helps as well

41
PS 1.x sample shader
sampler samplerA, samplerB;
float4 ColorScale;

float4 PShader(float4 Diffuse  : COLOR0,
               float4 Specular : COLOR1,
               float2 Tex1     : TEXCOORD0,
               float2 Tex2     : TEXCOORD1) : COLOR0
{
    float4 Sample1 = tex2D(samplerA, Tex1);
    float4 Sample2 = tex2D(samplerB, Tex2);
    float4 Color = (1-Diffuse.a)*Sample1 + Diffuse.a*Sample2;
    Color = max(Color, 0);
    Color = min(Color, 1);
    Color = Color - .5f;
    Color *= ColorScale;
    return Color;
}

ps_1_4
texld r0, t0
texld r1, t1
lrp_sat r0, v0.w, r0, r1
mul r0, c0, r0_bias
42
Input Datatype Declarations
  • Important to provide good type information for
    program inputs
  • All int input should be declared as int
  • Matrix indices, lookup values, etc.
  • If the data is not integer, odd results can
    happen!
  • Take advantage of expansion to float4
  • e.g. declare Position as float4
  • If the vertex data has x,y, and z then w will be
    filled in with 1.0

43
HLSL shader sample
  • Wood Sample shader
  • Thanks to Jason Mitchell (ATI)
  • Procedural wood
  • Complex - rings, wobble, noise

44
hlsl_wood()
float4 hlsl_wood (float3 Pshade0  : TEXCOORD0,
                  float3 Pshade1  : TEXCOORD1,
                  float3 Pshade2  : TEXCOORD2,
                  float3 zWobble0 : TEXCOORD3,
                  float3 zWobble1 : TEXCOORD4,
                  float3 Peye     : TEXCOORD6,
                  float3 Neye     : TEXCOORD7) : COLOR
{
    float3 coloredNoise;
    float3 wobble;

    // Construct colored noise from three samples
    coloredNoise.x = tex3D (NoiseSampler, Pshade0);
    coloredNoise.y = tex3D (NoiseSampler, Pshade1);
    coloredNoise.z = tex3D (NoiseSampler, Pshade2);

    wobble.x = tex3D (NoiseSampler, zWobble0);
    wobble.y = tex3D (NoiseSampler, zWobble1);
    wobble.z = 0.5f;

    // Make signed
    coloredNoise = coloredNoise * 2.0f - 1.0f;
    wobble = wobble * 2.0f - 1.0f;

    // Scale noise and add to Pshade
    float3 noisyWobblyPshade = Pshade0 + coloredNoise * psConst3.w + wobble * psConst4.w;

    float scaledDistFromZAxis = sqrt(dot(noisyWobblyPshade.xy, noisyWobblyPshade.xy)) * psConst2.w;

    // Lookup blend factor from pulse train
    float4 blendFactor = tex2D (PulseTrainSampler, float2 (0.0f, scaledDistFromZAxis));

    // Blend wood colors together
    float3 albedo = psConst2 * blendFactor.x + psConst3 * (1 - blendFactor.x);

    // Compute normalized vector from vertex to light in eye space (Leye)
    float3 Leye = (psConst4 - Peye) / len(psConst4 - Peye);

    // Normalize interpolated normal
    Neye = Neye / len(Neye);

    // Compute Veye
    float3 Veye = -(Peye / len(Peye));

    // Compute half-angle
    float3 Heye = (Leye + Veye) / len(Leye + Veye);

    // Compute N.H
    float NdotH = clamp(dot(Neye, Heye), 0.0f, 1.0f);

    // Scale and bias exponent from pulse train
    float k = blendFactor.z;

    // Evaluate (N.H)^k via dependent read
    float specular = tex2D (VariablSpecularSampler, float2 (NdotH, k));

    // N.L
    float NdotL = dot(Neye, Leye);

    // "Half-Lambert" technique for more pleasing diffuse
    float diffuse = NdotL * 0.5f + 0.5f;

    // gloss the specular term
    float gloss = blendFactor.y;

    return diffuse * float4 (albedo.r, albedo.g, albedo.b, 0.0f) + specular * gloss;
}
45
Hlsl_wood() asm
  • ...
  • texld r0, t0, s0
  • add r7.x, r0.x, r0.x
  • texld r2, t1, s0
  • add r7.y, r2.x, r2.x
  • add r9.xy, r7, c3.x
  • mad r11.xy, r9, c1.w, t0
  • texld r6, t3, s0
  • add r11.z, r6.x, r6.x
  • texld r1, t4, s0
  • add r11.w, r1.x, r1.x
  • add r11.zw, r11, c3.x
  • mad r8.x, r11.z, c2.w, r11.x
  • mad r8.y, r11.w, c2.w, r11.y
  • dp2add r3.w, r8, r8, c4.x
  • rsq r2.w, r3.w
  • mul r10.w, r2.w, r3.w
  • mul r5.y, c0.w, r10.w
  • mov r5.x, c3.w

  • ...
  • texld r3, t0, s0
  • texld r4, t1, s0
  • texld r5, t2, s0
  • texld r6, t3, s0
  • texld r7, t4, s0
  • mov r3.y, r4.x
  • mov r3.z, r5.x
  • mov r6.y, r7.x
  • mad r6, r6, c0.x, c0.y
  • mad r3, r3, c0.x, c0.y
  • mad r7, c3.w, r3, t0
  • mad r7, c4.w, r6, r7
  • dp2add r0, r7, r7, c1.w
  • rsq r0, r0.x
  • rcp r0, r0.x
  • mul r0, r0, c2.w
  • texld r0, r0, s1
  • mov r1, c3
  • lrp r2, r0.x, c2, r1
  • sub r4, c4, t6
  • dp3 r5.w, r4, r4
  • rsq r5.w, r5.w
  • mul r4, r4, r5.w
  • dp3 r6.w, t7, t7
  • rsq r6.w, r6.w
  • mul r5, t7, r6.w
  • dp3 r3.w, t6, t6
  • rsq r3.w, r3.w
  • mul r3, -t6, r3.w
  • add r6, r3, r4
  • dp3 r6.w, r6, r6
  • rsq r6.w, r6.w
  • mul r6, r6, r6.w
  • dp3_sat r6, r5, r6
  • mov r6.y, r0.z
  • texld r6, r6, s2
  • dp3 r5, r4, r5
  • mad_sat r5, r5, c0.z, c0.z
  • mul r6, r6, r0.y
  • mad r2, r5, r2, r6
  • mov oC0, r2

HLSL generates 37 ALU Instructions
Handwritten asm is 35 instructions
46
Summary
  • HLSL abstraction solves
  • Continuing hardware evolution
  • Shader programming complexity
  • API semantics solve
  • Shader interoperability

47
Summary
  • HLSL is the next step in graphics API/hardware
    evolution
  • DirectX implementation provides
  • Close API integration
  • Semantic binding to low-level API
  • Shader management via D3DX effects
  • Full IDE support including debugging
  • Performant cross vendor support

48
Action Items
  • Check it out!
  • Use it in your research and development
  • Let us know what you think
  • directx_at_microsoft.com
  • http://msdn.microsoft.com/directx

49
Questions
50
Backup