Image Compression and Graphics: More Than a Sum of Parts presentation

1
Image Compression and Graphics: More Than a Sum of Parts?

Bernd Girod
Collaborators: Peter Eisert, Marcus Magnor, Prashant Ramanathan,
Eckehard Steinbach (all Stanford), Thomas Wiegand (HHI)

Image, Video, and Multimedia Systems Group
Information Systems Laboratory
Stanford University
2
Can 3-D Geometry Help to Compress Images?

Conjecture
3-d geometry models help compression, if a
single 3-D model captures the dependencies
between many views (or frames of a sequence).

3
Outline of this Talk

Compression of many simultaneous views (e.g.
light-fields)
Encoding view-dependent texture maps with 4-d
wavelets
Hierarchical image-domain light-field coder
Why image-domain encoding is (usually) superior
to texture-map encoding
Model-based compression of talking head sequences
Modeling and estimation of facial expressions
Avatars
Incorporate synthetic video into
motion-compensated hybrid coding

4
Multi-View Image Capture

Coding schemes suitable for
2-plane parametrization
Hemispherical image arrangement
(arbitrary recording positions)

5
Align Views by Mapping onto Object Surface

Camera views: no correlation between corresponding pixels

View-dependent texture map: strong correlation between corresponding texels
6
3-D Reconstruction from Many Views
Volumetric Reconstruction

Subdivide object's bounding box into voxels
Generation of multiple hypotheses for each voxel
Hypothesis elimination by projecting visible
voxels into light-field images
Iterate over all voxels until remaining
hypotheses are photo-consistent

processes all views simultaneously
exploits texture and silhouette information
yields solid 3-D voxel model
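
The elimination loop listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the `observe` callback, the gray-value hypotheses, and the tolerance `tol` are all assumptions made for illustration, and the toy scene replaces real camera projection with a one-dimensional line of voxels.

```python
def eliminate_hypotheses(hypotheses, observe, tol=10, max_passes=100):
    """Prune color hypotheses per voxel until all survivors are photo-consistent.

    hypotheses: dict voxel -> set of candidate gray values.
    observe(voxel, occupied): hypothetical callback returning the gray values
        seen at the voxel's projection in every view where it is visible,
        given the set of currently occupied voxels.
    """
    surviving = {v: set(h) for v, h in hypotheses.items()}
    for _ in range(max_passes):
        changed = False
        # Voxels with no remaining hypothesis are treated as empty space.
        occupied = {v for v, h in surviving.items() if h}
        for voxel in sorted(occupied):
            seen = observe(voxel, occupied)
            # Keep only hypotheses that agree with every view seeing the voxel.
            keep = {h for h in surviving[voxel]
                    if all(abs(h - s) <= tol for s in seen)}
            if keep != surviving[voxel]:
                surviving[voxel] = keep
                changed = True
        if not changed:
            break
    return {v: h for v, h in surviving.items() if h}


# Toy scene (assumed): three voxels on a line. Camera A looks from the left
# and records gray value 100; camera B looks from the right and records 200.
def observe(voxel, occupied):
    seen = []
    if voxel == min(occupied):
        seen.append(100)
    if voxel == max(occupied):
        seen.append(200)
    return seen


model = eliminate_hypotheses({0: {50}, 1: {100, 200}, 2: {100, 200}}, observe)
print(model)
```

In this toy run voxel 0 loses its only hypothesis and is carved away, which exposes voxel 1 to camera A on the next pass. That is why the loop repeats until nothing changes.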

7
Surface Representation

Initial octahedral geometry
Geometry refinement
determine vertex normals
move vertices to model surface
subdivide triangles
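
A minimal sketch of the refinement idea, under a loud assumption: the unit sphere stands in for the reconstructed model surface, so "moving a vertex to the surface" reduces to normalizing it (on a sphere the radial direction coincides with the vertex normal). The function names are hypothetical and the slides do not specify the actual projection step.

```python
import math


def normalize(p):
    """Project a point radially onto the unit sphere (stand-in for the model surface)."""
    n = math.sqrt(sum(c * c for c in p))
    return tuple(c / n for c in p)


def octahedron():
    """Initial geometry: 6 vertices, 8 triangular faces."""
    verts = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    faces = [(0, 2, 4), (2, 1, 4), (1, 3, 4), (3, 0, 4),
             (2, 0, 5), (1, 2, 5), (3, 1, 5), (0, 3, 5)]
    return verts, faces


def subdivide(verts, faces, project=normalize):
    """Split every triangle into four and move the new vertices onto the surface."""
    verts = list(verts)
    midpoint_index = {}

    def midpoint(i, j):
        key = (min(i, j), max(i, j))  # share midpoints between adjacent faces
        if key not in midpoint_index:
            m = tuple((a + b) / 2 for a, b in zip(verts[i], verts[j]))
            verts.append(project(m))
            midpoint_index[key] = len(verts) - 1
        return midpoint_index[key]

    new_faces = []
    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return verts, new_faces


v1, f1 = subdivide(*octahedron())
v2, f2 = subdivide(v1, f1)
print(len(v1), len(f1), len(v2), len(f2))
```

Each pass quadruples the triangle count, so a few iterations already give a fairly fine mesh. A real coder would replace `normalize` with a search along the vertex normal for the voxel-model surface.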

8
Texture Map Encoding with 4-d Wavelets

Arrange images into 2-d array

Embedded encoding of wavelet coefficients
(4D-SPIHT)

9
Results Wavelet Texture Map Encoder
Reconstruction quality in luminance PSNR (dB)
10
Results Wavelet Texture Map Encoder
11
Progressive Decoding
12
Align Views by Model-aided Prediction
13
Hierarchical Image Coding Order

project camera positions on hemisphere

subdivide into 4 quadrants

INTRA-encode corner images

encode center image
image prediction
residual error coding

encode mid-side images

subdivide into sub-quadrants

encode center and mid-side images

subdivide repeatedly
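
The coding order above can be sketched on a regular grid. This is a simplification: the slides project arbitrary camera positions onto a hemisphere, whereas the sketch assumes a square grid of (2^k + 1) x (2^k + 1) views. The function name and the mode labels are illustrative.

```python
def hierarchical_order(n):
    """Return [((row, col), mode), ...] for an n x n grid of views, n = 2**k + 1."""
    if n < 2 or (n - 1) & (n - 2):
        raise ValueError("n must be 2**k + 1")
    order, seen = [], set()

    def emit(pos, mode):
        # Mid-side views are shared by neighboring quadrants; code each once.
        if pos not in seen:
            seen.add(pos)
            order.append((pos, mode))

    last = n - 1
    for corner in [(0, 0), (0, last), (last, 0), (last, last)]:
        emit(corner, "INTRA")

    quadrants = [(0, 0, last, last)]
    while quadrants:
        next_level = []
        for r0, c0, r1, c1 in quadrants:
            if r1 - r0 < 2:
                continue  # no views left strictly inside this quadrant
            rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
            emit((rm, cm), "PREDICTED")  # center image
            for side in [(r0, cm), (rm, c0), (rm, c1), (r1, cm)]:
                emit(side, "PREDICTED")  # mid-side images
            next_level += [(r0, c0, rm, cm), (r0, cm, rm, c1),
                           (rm, c0, r1, cm), (rm, cm, r1, c1)]
        quadrants = next_level
    return order


order = hierarchical_order(5)
for pos, mode in order[:9]:
    print(pos, mode)
```

Every view coded as "PREDICTED" would be predicted from views that appear earlier in the list, followed by residual error coding, so a decoder can stop after any level and still hold a coarse sampling of the light field.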

14
Model-aided Image-Domain Light-Field Coder
[Block diagram: light-field image I_u,v, prediction subtracted (-), DCT coefficients, multiframe disparity compensation, disparity map generation]
15
Picture Quality
Original: Mouse light field, 257 RGB images, 384x288 pixels, 81.3 Mbytes
Compressed: 300:1, 0.077 bpp (267 KBytes), 37.9 dB PSNR
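
The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch assumes 3 bytes per RGB pixel and binary units (1 KByte = 1024 bytes, 1 Mbyte = 2^20 bytes), neither of which the slide states explicitly.

```python
num_images = 257
width, height = 384, 288
bytes_per_pixel = 3                      # assumption: 8 bits per RGB channel

pixels = num_images * width * height
raw_bytes = pixels * bytes_per_pixel
raw_mbytes = raw_bytes / 2**20           # assumption: binary megabytes

compressed_bytes = 267 * 1024            # assumption: binary kilobytes
bits_per_pixel = compressed_bytes * 8 / pixels
ratio = raw_bytes / compressed_bytes

print(f"raw size: {raw_mbytes:.1f} Mbytes")
print(f"rate:     {bits_per_pixel:.3f} bpp")
print(f"ratio:    {ratio:.0f}:1")
```

Under these assumptions the raw size and the bit rate reproduce the slide's 81.3 Mbytes and 0.077 bpp, and the ratio comes out slightly above 300:1, consistent with a rounded figure.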
16
Model-aided vs. Texture Coding
17
Natural vs. Synthetic Image Set
18
Inaccurate Geometry
19
Model-based videophone
20
Modeling of Facial Expressions

Head geometry composed of 101 triangular
B-spline patches
Facial expressions by superposition of 66 FAPs
(Facial Animation Parameters) according to MPEG-4
standard
FAPs act on control points of triangular B-spline
patches

21
Estimation of Facial Expressions
Displacement field constrained by FAPs
Linearize for small FAPs
Optical flow constraint equation

Solve overdetermined system by linear regression
Apply iteratively in analysis-synthesis loop
Incorporate spatial resolution pyramid
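
One common way to write out the steps above, given as a sketch under assumed notation. The symbols $I_x$, $I_y$, $I_t$ (spatial and temporal image derivatives), $a_k$ (FAP values), and $\mathbf{b}_k$ (the image-plane displacement caused by a unit change of FAP $k$) do not appear on the slides.

```latex
% Optical flow constraint at pixel (x, y) with displacement (d_x, d_y):
I_x d_x + I_y d_y + I_t = 0

% Displacement field constrained by the FAPs, linearized for small a_k:
\begin{pmatrix} d_x \\ d_y \end{pmatrix}
  \approx \sum_{k=1}^{K} a_k \, \mathbf{b}_k(x, y),
\qquad \mathbf{b}_k = \begin{pmatrix} b_{k,x} \\ b_{k,y} \end{pmatrix}

% Substituting gives one linear equation in a_1, ..., a_K per pixel:
\sum_{k=1}^{K} a_k \left( I_x b_{k,x} + I_y b_{k,y} \right) = -I_t

% Stacking N \gg K pixels yields an overdetermined system A a = c,
% solved in the least-squares sense:
\hat{\mathbf{a}} = (A^{T} A)^{-1} A^{T} \mathbf{c}
```

Because the linearization only holds for small changes, the estimate is refined by re-rendering the model with the updated parameters and repeating, starting at a coarse pyramid level where displacements are small in pixel units.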

22
Results Peter
Original
Synthesized

Sequence Peter, 230 frames,
CIF resolution, 25 fps

Compressed: 25,000:1, 1.2 kbps, 32.8 dB PSNR
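
The compression ratio quoted above can be reproduced approximately. The sketch assumes CIF means 352x288 luminance samples with 4:2:0 chroma subsampling (12 bits per pixel), which the slide does not state.

```python
width, height = 352, 288        # CIF luminance resolution
bits_per_pixel = 12             # assumption: 8-bit samples, 4:2:0 chroma
frame_rate = 25                 # frames per second, from the slide
coded_rate = 1200               # 1.2 kbps, from the slide

raw_rate = width * height * bits_per_pixel * frame_rate   # bits per second
ratio = raw_rate / coded_rate
bits_per_frame = coded_rate / frame_rate

print(f"raw rate:       {raw_rate / 1e6:.1f} Mbps")
print(f"ratio:          {ratio:.0f}:1")
print(f"bits per frame: {bits_per_frame:.0f}")
```

At this rate each frame gets only a few dozen bits on average, which is plausible only because the coder transmits a handful of facial animation parameters rather than pixel data.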
23
Results Eckehard
Original
Synthesized

Sequence Eckehard
CIF resolution, 25 fps

1.1 kbps, 32.6 dB PSNR
24
Results Peter as Eckehard
Original
Synthesized

Sequence Peter, 230 frames,
CIF resolution, 25 fps

25
Results Eckehard as Peter
Original
Synthesized

Sequence Eckehard
CIF resolution, 25 fps

26
Results Peter as Akiyo
Original
Synthesized

Sequence Peter, 230 frames,
CIF resolution, 25 fps

27
. . . But, What About Unknown Objects?
Original
Synthesized

Sequence Clap

1.2 kbps
28
Model-Aided Coding: Incorporating Synthetic Video into MC Hybrid Coding
[Block diagram: coder control, control data (incl. motion vectors), input video, difference signal e, intraframe DCT coder, DCT coefficients, intraframe decoder, multiframe motion compensation, decoder]
29
R-D-Optimal Mode Decision
Selection mask minimizing D + λR
Previous decoded frame
Predicted frame
Synthesized frame
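
A minimal sketch of a rate-distortion-optimal selection mask, using the standard Lagrangian cost D + λR, where D is distortion, R is rate, and λ trades one against the other. The per-block granularity, the function name, and all numbers are assumptions for illustration and are not taken from the slides.

```python
def choose_modes(candidates, lam):
    """Pick, per block, the reference frame that minimizes D + lam * R.

    candidates: list with one dict per block, mapping a reference name to a
        (distortion, rate_in_bits) pair measured for that block.
    Returns the selection mask as a list of reference names.
    """
    mask = []
    for block in candidates:
        best = min(block, key=lambda ref: block[ref][0] + lam * block[ref][1])
        mask.append(best)
    return mask


# Hypothetical measurements for two blocks.
blocks = [
    {"previous": (100.0, 10), "synthetic": (60.0, 30)},
    {"previous": (40.0, 12), "synthetic": (90.0, 8)},
]

print(choose_modes(blocks, lam=1.0))
print(choose_modes(blocks, lam=5.0))
```

With a small λ the first block takes the lower-distortion synthesized frame despite its higher rate. With a large λ the cheaper previous decoded frame wins. Because the decision is made per block, the coder can fall back to ordinary motion compensation wherever the model fails, for example on unknown objects.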
30
Results Peter
H.263 (TMN-10) at 12 kbps
Model-Aided Coder at 12 kbps

Sequence Clap, 8.33 fps, CIF resolution

31
Results Akiyo
H.263 (TMN-10) at 10 kbps
Model-Aided Coder at 10 kbps

Sequence Akiyo, 10 fps, CIF resolution

32
R-D Performance of Model-Aided Coder
Sequence Peter
Sequence Akiyo
33
Can 3-d geometry help to compress images?
Conclusion

YES . . .
. . . IF many views of the same 3-D object/scene
are to be compressed.
Applications in
Multiview image coding (light-field compression)
Compression of video sequences
Very high compression ratios (100:1 . . . 25,000:1)
Require accurate vision algorithms for 3-d
reconstruction
Image-domain compression more resilient against
inaccurate geometry and hence more practical than
texture-map encoding

34
. . . THE END
