Numerical-Precision-Optimized Volume Rendering

Slides: 56

Provided by: ingmarbi

Learn more at: https://www.graphicshardware.org

1
Numerical-Precision-Optimized Volume Rendering
Squeeze
Ingmar Bitter, Neophytos Neophytou, Klaus Mueller, Arie Kaufman
7
Outline

Numerical precision - a rendering resource
Fixed-point arithmetic
Reverse order precision analysis
Compositing, shading, gradients, classification,
sampling/splatting, sample/splat location
Results
Conclusions

10
Numerical Precision - A Resource

Is double-precision computation ideal for everything?
slower than all other alternatives
not possible on graphics cards (at least for now)
expensive on custom chip implementations
and, most importantly,
not needed to create the best possible images!
Reasons: predominantly 8-bit displays (per channel);
limited range intervals throughout

11
Current Status

Stable volume rendering pipeline on both CPU and GPU [LL94, Lev88, MJC02, Wes90, EKE01, RSEB00]
Interpolation before classification, even for splatting [MMC99]
Caching optimized for volume rendering [Kni00, LCCK02, PSL98]
Precision-limited rendering systems: ATI, NVidia, VolumePro [PHK99], VizardII [MKW02], UltraVis [Kni00]
Completely fixed final output image display bit precision
8 bits per RGB color channel on CRTs and LCDs
8 bits max in the DVI standard
SGI's 12-bit color displays are nearly extinct
Radiologists' requirements are not mass market, but the same analysis applies

12
OpenGL Arithmetic: 1 × 1 = 1?

Representation: $[0, 255]$ → $a = b = 255$
Computation: $(a^{[0,255]} \cdot b^{[0,255]}) \gg 8 = 254$ → wrong
→ 1 mult, 1 shift
Alternatively: $tmp = a^{[0,255]} \cdot b^{[0,255]} + 128$, $result = (tmp + (tmp \gg 8)) \gg 8 = 255$ → correct [Bli95]
→ 1 mult, 2 adds, 2 shifts
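
The two 8-bit products on this slide can be checked with a short Python sketch. The function names `mul8_shift` and `mul8_exact` are hypothetical and chosen only for illustration; the arithmetic follows the slide.

```python
def mul8_shift(a: int, b: int) -> int:
    """Product of two [0, 255] values using one multiply and one shift."""
    return (a * b) >> 8


def mul8_exact(a: int, b: int) -> int:
    """Product of two [0, 255] values using one multiply, two adds, two shifts."""
    tmp = a * b + 128
    return (tmp + (tmp >> 8)) >> 8


print(mul8_shift(255, 255))  # 254: 1.0 * 1.0 comes out below 1.0
print(mul8_exact(255, 255))  # 255: 1.0 * 1.0 stays 1.0
```

The second form costs one extra add and one extra shift per product, which is the trade-off the slide lists.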

13
OpenGL Arithmetic: 1 × 1 = 1?

Representation: fixed-point I.Fb
I.Fb: I integer bits, F fraction bits
8 bits → 1.7b fixed-point number
then $a = b = 1^{1.7b} = 128$
Computation: $(a^{1.7b} \cdot b^{1.7b}) \gg 7 = 128$ → correct
→ 1 mult, 1 shift
→ one fewer bit of resolution, but OK (we will see)
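
A minimal sketch of the 1.7b multiply, assuming values are held in plain Python integers. The names `FRAC_BITS`, `ONE`, and `mul_1_7` are hypothetical.

```python
FRAC_BITS = 7          # 1.7b: 1 integer bit, 7 fraction bits
ONE = 1 << FRAC_BITS   # 1.0 is stored as 128


def mul_1_7(a: int, b: int) -> int:
    """Multiply two 1.7b numbers with one multiply and one shift."""
    return (a * b) >> FRAC_BITS


print(mul_1_7(ONE, ONE))  # 128, so 1.0 * 1.0 == 1.0
print(mul_1_7(64, ONE))   # 64, so 0.5 * 1.0 == 0.5
```

Because 1.0 is an exact power of two in this format, the single shift is enough and no correction term is needed.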

15
Reverse Order Precision Analysis
Ray Casting
Splatting

Unified ray casting and splatting pipelines
Composite creates the final image
Precision requirements propagate backwards

(Figure: pipeline stages for ray casting / splatting: Sample Location / Splat Location, Sample / Splat, Classify, Gradient, Shade, Composite.)
17
Compositing - Math

Pre-(alpha)-multiplied colors
$\tilde{C} = \alpha C = (\alpha R, \alpha G, \alpha B)$
Alpha correction ($r$ samples per unit)
$T_{corrected} = (1 - \alpha)^{r}$
With back-to-front compositing
$\tilde{C}_{CompositingBuffer} = \tilde{C}_{CompositingBuffer} \cdot T_{corrected} + \tilde{C}_{front}$
$T_{CompositingBuffer} = T_{CompositingBuffer} \cdot T_{corrected}$
$\alpha_{CompositingBuffer} = 1 - T_{corrected}$
perform the multiplication $N$ times per pixel
→ an exact solution needs $N \cdot F \cdot r$ bits of precision

(Figure: the T/C compositing buffer is updated from $T_{corrected}$ and $C_{front}$.)
18
Compositing - Precision Theory

8-bit destination resolution
therefore all partial results can be rounded
drop all bits not contributing to the 8 most significant bits (MSB)
Adding $N = 2^p$ samples
allows $8 + p$ bits to influence the 8 MSB
Conversion from $\alpha_{CompositingBuffer} C$ to $C$ for display (division)
allows $8 + p$ more bits to influence the 8 MSB
Conversion from $\alpha_{corrected} C$ to $C$ for display
allows $r$ times as many bits to influence the 8 MSB
Sufficient resolution is $r \cdot 2 \cdot (8 + p)$ bits for $C$, $r \cdot (8 + p)$ bits for $\alpha$
32/16 bits for $C$/$\alpha_{CompositingBuffer}$ for $256^3$ volumes and no super-sampling
608 bits for $512^2 \times 2048$ volumes and 16 samples per voxel
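
The bit counts quoted on this slide can be reproduced with a small calculator. This is a sketch under one assumption: `p` is taken as the base-2 logarithm of the slice count along the ray (256 or 2048), which is the reading that yields the slide's figures of 32/16 and 608. The function name and parameters are hypothetical.

```python
def compositing_bits(slices: int, r: int = 1, display_bits: int = 8):
    """Return (color_bits, alpha_bits) as r*2*(8+p) and r*(8+p)."""
    p = (slices - 1).bit_length()        # smallest p with 2**p >= slices
    alpha_bits = r * (display_bits + p)
    color_bits = 2 * alpha_bits
    return color_bits, alpha_bits


print(compositing_bits(256))         # (32, 16): 256^3 volume, no super-sampling
print(compositing_bits(2048, r=16))  # (608, 304): 2048 slices, 16 samples per voxel
```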

23
Compositing - Precision Practice

No alpha correction ($r = 1$): $2 \cdot (8 + p)$ bits
Iso-surface rendering using old-fashioned OpenGL
store not $\alpha C$ but $C$ in the frame buffer: $8 + p$ bits
bright colors: $5 + p$ bits
at most 8 non-zero samples per ray ($p = 3$): $5 + 3 = 8$ bits
→ standard 24-bit RGBA frame buffer is adequate
Fog visualization
what matters is the ability to see objects through volumetric fog (a substance with low opacity)
visual experiments show 15 fractional bits are sufficient

29
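Compositing - Conclusion
(Figure: least-significant-bit fog at bit precisions of 8, 10, 12, 14, 15, and 16; $512^3$ dataset, $r = 2$.)

Preferred bit-aware back-to-front compositing equations (with $C_{sample}$ pre-multiplied by alpha)
$\alpha C^{1.15b} = \alpha C^{1.15b} \cdot T^{1.15b}_{sample} + C^{1.15b}_{sample}$
$T^{1.15b} = T^{1.15b} \cdot T^{1.15b}_{sample}$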
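
A minimal sketch of back-to-front compositing in 1.15b, assuming each sample arrives as a pair of pre-multiplied color and transparency. The helper names `to_fixed`, `fx_mul`, and `composite_back_to_front` are hypothetical, and rounding to nearest in `fx_mul` is an assumption, since the slides do not state a rounding mode.

```python
FRAC_BITS = 15
ONE = 1 << FRAC_BITS  # 1.0 in 1.15b


def to_fixed(x: float) -> int:
    """Convert a value in [0, 1] to 1.15b."""
    return round(x * ONE)


def fx_mul(a: int, b: int) -> int:
    """1.15b product, rounded to nearest (assumed rounding mode)."""
    return (a * b + (ONE >> 1)) >> FRAC_BITS


def composite_back_to_front(samples):
    """samples: (premultiplied_color, transparency) pairs in 1.15b, back first."""
    color, trans = 0, ONE
    for c_sample, t_sample in samples:
        color = fx_mul(color, t_sample) + c_sample  # aC = aC * T_sample + C_sample
        trans = fx_mul(trans, t_sample)             # T  = T  * T_sample
    return color, trans


# Opaque white at the back, a half-transparent sample with pre-multiplied
# color 0.25 in front.
back = (ONE, 0)
front = (to_fixed(0.25), to_fixed(0.5))
color, trans = composite_back_to_front([back, front])
print(color / ONE, trans / ONE)  # 0.75 0.0
```

Python integers do not overflow, so a hardware or C port would have to confirm that the accumulated color stays within the single integer bit.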

30
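Shading - Math

Phong: $C_{color} = k_{ambient} \, O_{objectColor} \, I_{lightIntensity} + k_{diffuse} \, O \sum_i I_i (N \cdot L_i) + k_{specular} \sum_i I_i (R \cdot L_i)^{r}$
$k \in [0, 1]$, $k_{ambient} + k_{diffuse} + k_{specular} = 1$
$O_{objectColor}$ (8 bit) and $I_{lightIntensity} \in [0, 1]$
$N \cdot L_i$ and $R \cdot L_i \in [-1, 1]$, but $\in [0, 1]$ after clamping
$C_{color} \in [0, 1]$ (possibly clamping $\sum_i$)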

31
Shading - Analysis

Phong $C_{color}$ needs to be as precise as 1.15b
Use 16.16b for all multiplications: $[0, 1] \times [0, 1] \to [0, 1]$
sufficient precision and no overflow

33
Shading - New Computation

Replace specular exponentiation with recursive multiplies
repeatedly multiply the number with itself
works for all exponents $r = 2^n$
when $r = 2^6$ (16-bit precision), the max error is $< 0.005$
better results than Knittel's parabola approximation

(Figure: Knittel's parabola compared with pow and $r = 2^n$.)
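
A sketch of the repeated-squaring idea, assuming that "16 bit precision" means 16 fractional bits and that each product is truncated. The names `fx_pow_2n` and `max_error` are hypothetical; the error scan is only an illustration over a sample grid, not the authors' measurement.

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS  # 1.0 with 16 fractional bits


def fx_pow_2n(x_fx: int, n: int) -> int:
    """Raise a fixed-point value in [0, 1] to the power 2**n by n squarings."""
    for _ in range(n):
        x_fx = (x_fx * x_fx) >> FRAC_BITS
    return x_fx


def max_error(n: int, steps: int = 1000) -> float:
    """Largest deviation from floating-point pow over a uniform grid on [0, 1]."""
    worst = 0.0
    for i in range(steps + 1):
        x = i / steps
        approx = fx_pow_2n(round(x * ONE), n) / ONE
        worst = max(worst, abs(approx - x ** (2 ** n)))
    return worst


print(fx_pow_2n(ONE, 6) == ONE)  # True: 1.0 ** 64 stays exactly 1.0
print(max_error(6) < 0.005)      # True on this grid
```

Six multiplies replace a general `pow` call for $r = 64$, at the cost of restricting the exponent to powers of two.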
34
Shading - Conclusion

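Preferred bit-aware Phong shading equation
$C^{16.16b} = k^{16.16b}_{ambient} \, O^{0.8b}_{objectColor} \, I^{16.16b}_{light} + k^{16.16b}_{diffuse} \, O^{0.8b} \sum_i I^{16.16b}_i (N^{16.16b} \cdot L^{16.16b}_i) + k^{16.16b}_{specular} \sum_i I^{16.16b}_i (R^{16.16b} \cdot L^{16.16b}_i)^{2^n}$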

35
Gradients - Math

$G_x = 0.5 \cdot sample(x+1, y, z) - 0.5 \cdot sample(x-1, y, z)$
$G_y = 0.5 \cdot sample(x, y+1, z) - 0.5 \cdot sample(x, y-1, z)$
$G_z = 0.5 \cdot sample(x, y, z+1) - 0.5 \cdot sample(x, y, z-1)$
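
The central differences above can be sketched as follows, assuming the volume is a nested list indexed as `volume[z][y][x]` and the point is not on the boundary. The function name, the indexing order, and the test volume are assumptions made for illustration.

```python
def central_gradient(volume, x: int, y: int, z: int):
    """Central-difference gradient (Gx, Gy, Gz) at an interior grid point."""
    gx = 0.5 * volume[z][y][x + 1] - 0.5 * volume[z][y][x - 1]
    gy = 0.5 * volume[z][y + 1][x] - 0.5 * volume[z][y - 1][x]
    gz = 0.5 * volume[z + 1][y][x] - 0.5 * volume[z - 1][y][x]
    return gx, gy, gz


# A 3x3x3 test volume with value x + 2y + 3z, whose gradient is (1, 2, 3).
ramp = [[[x + 2 * y + 3 * z for x in range(3)] for y in range(3)] for z in range(3)]
print(central_gradient(ramp, 1, 1, 1))  # (1.0, 2.0, 3.0)
```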

36
Gradients - Analysis

$G = G^{1.Fb}$
Discrete nearest gradient vector neighbors
$\sin\phi = 1/2^F$, $\sin\phi \approx \phi \Rightarrow \phi \approx 1/2^F$
Maximum error for specular intensity, large $r$
$r = 64$: $1^{64} = 1$, but $1^{64}$ is computed as $(1 - 1/2^F)^{64}$
error of 22%, 6.1%, 1.6%, 0.4% for $F$ of 8, 10, 12, 14
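
The quoted percentages follow directly from $(1 - 1/2^F)^{64}$. A short check, with the hypothetical helper name `specular_error_percent`:

```python
def specular_error_percent(frac_bits: int, r: int = 64) -> float:
    """Relative error of a unit highlight when the dot product is 1 - 2**-F."""
    return 100.0 * (1.0 - (1.0 - 2.0 ** -frac_bits) ** r)


for f in (8, 10, 12, 14):
    print(f, round(specular_error_percent(f), 1))
```

Each pair of extra fractional bits cuts the error by roughly a factor of four, which is why the specular artifacts fade between 10 and 12 bits.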
37
Gradients - Analysis

$512^3$-sized spheres with Phong highlights
4, 6, 8, 10, 12, 14 bit gradients
Diffuse artifacts for 4 and 6 bits
Specular artifacts up to 10 bits

(Figure: spheres rendered with 4- to 14-bit gradients.)
38
Gradients - Conclusion

Thus, 12 bits of dynamic range are needed
Now consider normalization
reduces I.Fb to 1.Fb
up to I bits will be added to the fractional part
Volume samples often have 12 bits
$G_{x,y,z}$ with 12.12b minimum representation
$G_{x,y,z}$ with 16.16b preferred representation
leaves room for interpolation bits in normalization

39
Classification - Prelims and Recaps

Use of $T$ instead of $\alpha$ is more efficient in the compositing operation
Largest visual precision/quantization error occurs at high transparencies (low opacities)
need more bits for $T$ than for $C$, just to be sure
Want the transfer function lookup table to be cache-friendly
power-of-2 RGBA-tuple alignment
Would like to use pre-integrated classification for color and opacity transfer functions [EKE01, MGS02]

44
Classification - Math

Desired lookup table entries
$R^{1.8b} G^{1.8b} B^{1.8b} T^{1.16b}$ → 5.5 bytes
Common lookup table entries
$R^{0.8b} G^{0.8b} B^{0.8b} \alpha^{0.8b}$ → 4 bytes
Better lookup table entries
$R^{0.8b} G^{0.8b} B^{0.8b} \sqrt{\alpha}^{0.8b}$ → spreads low $\alpha$
Computed after lookup: $T = 1 - (\sqrt{\alpha})^2$
$R^{0.8b} G^{0.8b} B^{0.8b} T^{1.16b}$ → squaring doubles precision
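
A sketch of the square-root opacity encoding, assuming a 0.8b code `c` stands for the value `c / 256`. That scaling is an assumption (under it, fully opaque is not exactly representable), and the function names are hypothetical.

```python
import math


def encode_sqrt_alpha(alpha: float) -> int:
    """Store sqrt(alpha) as a 0.8b code, assuming code / 256 is the value."""
    return min(255, round(math.sqrt(alpha) * 256))


def transparency_1_16(code: int) -> int:
    """After lookup: T = 1 - sqrt(alpha)^2, in 1.16b (65536 is 1.0)."""
    return (1 << 16) - code * code


print(encode_sqrt_alpha(0.25))                 # 128
print(transparency_1_16(128) / (1 << 16))      # 0.75
print(encode_sqrt_alpha(1 / 65536))            # 1: the smallest non-zero opacity
```

Squaring the 8-bit code yields a 16-bit opacity, so the smallest non-zero opacity drops from 1/256 with a plain $\alpha^{0.8b}$ entry to 1/65536 while the table entry stays at 4 bytes.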

45
Classification - Conclusion
(Figure: foot dataset with least-significant thin-tissue fog, rendered with $\alpha^{0.8b}$, $\sqrt{\alpha}^{0.8b}$, and $\alpha^{0.16b}$.)

Preferred bit-aware lookup table entries
$R^{0.8b} G^{0.8b} B^{0.8b} \sqrt{\alpha}^{0.8b}$

46
Sample Interpolation - Math

$sample = voxel_0 \cdot (1 - w) + voxel_1 \cdot w$
$sample = w \cdot (voxel_1 - voxel_0) + voxel_0$
Requirements
$G_{x,y,z}$, derived from samples, need 12 bits of dynamic range
samples need 12-bit values for transfer function lookup
cover both low and high dynamic range neighborhoods
Therefore, $sample^{12.12b}$ is a minimum requirement
integer part comes from voxels → $voxel^{12.0b}$
fractional part comes from interpolation → $w^{1.12b}$

47
Sample Interpolation - Conclusion

Preferred bit-aware sample interpolation
$sample^{12.12b} = w^{1.12b} \cdot (voxel_1^{12.0b} - voxel_0^{12.0b}) + voxel_0^{12.0b}$
Splats start on voxels, need no interpolation
$splat^{12.0b} = voxel^{12.0b}$
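
A minimal sketch of the interpolation in integer arithmetic, assuming the weight is passed as a 1.12b integer in the range 0 to 4096 and the voxels as 12-bit integers. The name `interpolate_12_12` is hypothetical.

```python
W_FRAC_BITS = 12  # w is 1.12b, so 4096 represents 1.0


def interpolate_12_12(voxel0: int, voxel1: int, w_fx: int) -> int:
    """Return w * (voxel1 - voxel0) + voxel0 as a 12.12b integer."""
    return w_fx * (voxel1 - voxel0) + (voxel0 << W_FRAC_BITS)


sample = interpolate_12_12(100, 200, 2048)  # w = 0.5
print(sample >> W_FRAC_BITS)                # 150: integer part
print(sample & ((1 << W_FRAC_BITS) - 1))    # 0: fractional part
```

The product of a 12.0b difference and a 1.12b weight already carries 12 fractional bits, so only the base voxel needs shifting before the add.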

48
Sample Location - Math

k-th sample location: $startPos + \sum_k V_{inc}$
Perspective rays need to differ enough to allow 1024 rays across 60 degrees, or 0.05°
$\sin\phi = (k \cdot 1/2^F) / k$, $\sin\phi \approx \phi \Rightarrow \phi \approx 1/2^F$
$F = 6, 12, 16$ → $\phi$ = 0.9°, 0.05°, 0.0009°
Also, need to address 2048 slices (integer positions) → 11 bits
Thus, need overall 11.12b

49
Sample Location - Conclusion

Preferred bit-aware sample location
Perspective projection:
$sampleLocation^{11.12b} = startPos^{11.12b} + \sum V_{inc}^{1.12b}$
Parallel projection: $sampleLocation^{11.6b}$ (0.9° OK)
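
Stepping along one ray axis in 11.12b can be sketched with an integer accumulator. The names `to_fixed` and `sample_locations` are hypothetical, and a real renderer would keep one such accumulator per coordinate.

```python
FRAC_BITS = 12  # perspective projection: 11.12b positions, 1.12b increments


def to_fixed(x: float) -> int:
    """Convert a coordinate to fixed point with 12 fractional bits."""
    return round(x * (1 << FRAC_BITS))


def sample_locations(start_fx: int, v_inc_fx: int, count: int):
    """Return the first `count` positions start + k * v_inc, k = 0, 1, ..."""
    positions = []
    pos = start_fx
    for _ in range(count):
        positions.append(pos)
        pos += v_inc_fx  # one integer add per sample
    return positions


locs = sample_locations(to_fixed(10.5), to_fixed(0.25), 5)
print(locs[4] >> FRAC_BITS)                  # 11: voxel index
print(locs[4] & ((1 << FRAC_BITS) - 1))      # 2048: fraction 0.5, the weight w
```

The low 12 bits of each position can serve directly as the 1.12b interpolation weight, so no conversion is needed between the two stages.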

50
Splat Scan Conversion - Math

Splats project onto the image grid → reverse rays
Allow as many as 2048 splat rays across 60 degrees, or 0.025°
Hence, twice the ray casting precision
one extra fractional bit: $F = 13$
Also address 2048 slices (11 bits)
Thus, need overall 11.13b

51
Splat Scan Conversion - Conclusion

Preferred bit-aware splat scan conversion
$splatLocation^{11.13b} = startVoxelPos^{11.13b} + \sum V_{inc}^{1.13b}$
Splats are usually pre-transformed and stored in bucket lists (one per sheet-buffer)
Preferred voxel location sheet buffer format
$x^{11.13b} \; u^{8.0b} \; y^{11.13b} \; v^{8.0b}$ (64 bits total)
x, y: location on splat plane
u: index into pre-integrated splat table
v: voxel value
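
A sketch of packing one sheet-buffer entry into a 64-bit word. The field order x, u, y, v follows the slide; placing x in the most significant bits is an assumption, and the names `pack_splat` and `unpack_splat` are hypothetical.

```python
XY_BITS = 24  # 11.13b: 11 integer bits + 13 fraction bits
UV_BITS = 8   # 8.0b


def pack_splat(x_fx: int, u: int, y_fx: int, v: int) -> int:
    """Pack x(24) | u(8) | y(24) | v(8) into one 64-bit integer."""
    for value, bits in ((x_fx, XY_BITS), (u, UV_BITS), (y_fx, XY_BITS), (v, UV_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field out of range")
    return (x_fx << 40) | (u << 32) | (y_fx << 8) | v


def unpack_splat(word: int):
    """Inverse of pack_splat: return (x_fx, u, y_fx, v)."""
    return (word >> 40) & 0xFFFFFF, (word >> 32) & 0xFF, (word >> 8) & 0xFFFFFF, word & 0xFF


x_fx = (100 << 13) | 4096  # x = 100.5 in 11.13b
y_fx = (7 << 13)           # y = 7.0
word = pack_splat(x_fx, 3, y_fx, 200)
print(unpack_splat(word) == (x_fx, 3, y_fx, 200))  # True
```

Two 24-bit coordinates plus two 8-bit fields fill exactly 64 bits, which keeps each bucket-list entry aligned to a power of two.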
52
Results

Summary of minimum precision requirements

Rendering Stage        Input    Output
Sample locations       N/A      11.12b
Sample interpolation   12.00b   12.12b
Classification         12.00b   4 × 0.8b
Gradients              12.12b   1.12b
Shading                1.12b    1.15b
Compositing            1.15b    1.15b
53
Results

Restricted iso-surface rendering
texture-map volume rendering can be done using plain OpenGL or DirectX and 8-bit frame buffers
General volume rendering, all pipeline stages
32-bit single-precision floating-point format
16.16b fixed-point format (up to 4x faster in our tests)
Pentium allows 2 simple 32-bit integer ops per clock cycle

54
Conclusions