Title: Image Compression and Graphics: More Than a Sum of Parts
1. Image Compression and Graphics: More Than a Sum of Parts?
- Bernd Girod
- Collaborators: Peter Eisert, Marcus Magnor, Prashant Ramanathan, Eckehard Steinbach (all Stanford), Thomas Wiegand (HHI)
Image, Video, and Multimedia Systems Group, Information Systems Laboratory, Stanford University
2. Can 3-D Geometry Help to Compress Images?
- Conjecture: 3-D geometry models help compression if a single 3-D model captures the dependencies between many views (or frames of a sequence).
3. Outline of this Talk
- Compression of many simultaneous views (e.g. light fields)
- Encoding view-dependent texture maps with 4-D wavelets
- Hierarchical image-domain light-field coder
- Why image-domain encoding is (usually) superior to texture-map encoding
- Model-based compression of talking head sequences
- Modeling and estimation of facial expressions
- Avatars
- Incorporate synthetic video into motion-compensated hybrid coding
4. Multi-View Image Capture
- Coding schemes suitable for:
  - 2-plane parametrization
  - Hemispherical image arrangement
  - (arbitrary recording positions)
5. Align Views by Mapping onto Object Surface
- Camera views: no correlation between corresponding pixels
- View-dependent texture map: strong correlation between corresponding texels
6. 3-D Reconstruction from Many Views
Volumetric Reconstruction
- Subdivide object's bounding box into voxels
- Generation of multiple hypotheses for each voxel
- Hypothesis elimination by projecting visible voxels into light-field images
- Iterate over all voxels until remaining hypotheses are photo-consistent
- processes all views simultaneously
- exploits texture and silhouette information
- yields solid 3-D voxel model
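The iteration above can be sketched in a much simplified form. This is not the multi-hypothesis scheme of the slide: it swaps in a plain colour-variance test per voxel, and `sample_views`, `threshold`, and the function names are hypothetical placeholders for the projection and visibility machinery.

```python
from statistics import pstdev

def photo_consistent(samples, threshold=10.0):
    """Return True if the colour samples of one voxel agree across views.

    samples: list of (r, g, b) tuples, one per view that sees the voxel.
    threshold: assumed tolerance on the per-channel standard deviation.
    """
    if len(samples) < 2:
        return True  # a single observation cannot contradict itself
    return all(pstdev(channel) <= threshold for channel in zip(*samples))

def carve(voxels, sample_views, threshold=10.0):
    """Remove voxels until every remaining voxel is photo-consistent.

    voxels: set of (x, y, z) indices inside the bounding box.
    sample_views: hypothetical callback (voxel, remaining) -> list of RGB
        samples from the views in which the voxel is currently visible.
    """
    remaining = set(voxels)
    changed = True
    while changed:  # removing a voxel can change the visibility of others
        changed = False
        for voxel in sorted(remaining):
            if not photo_consistent(sample_views(voxel, remaining), threshold):
                remaining.discard(voxel)
                changed = True
    return remaining
```

The outer loop repeats because carving one voxel may expose others to additional views, which is why the slide speaks of iterating until the remaining hypotheses are photo-consistent.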
7. Surface Representation
- Initial octahedral geometry
- Geometry refinement
- determine vertex normals
- move vertices to model surface
- subdivide triangles
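A minimal sketch of the refinement loop, under strong assumptions: the "model surface" is replaced by a unit sphere so the example stays self-contained, and new vertices are moved radially rather than along estimated vertex normals. `octahedron`, `to_surface`, and `subdivide` are illustrative names, not taken from the talk.

```python
import math

def octahedron():
    """Return (vertices, triangles) of a unit octahedron."""
    vertices = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    triangles = [(0, 2, 4), (2, 1, 4), (1, 3, 4), (3, 0, 4),
                 (2, 0, 5), (1, 2, 5), (3, 1, 5), (0, 3, 5)]
    return vertices, triangles

def to_surface(p):
    """Stand-in for 'move vertices to model surface': project onto a sphere."""
    n = math.sqrt(sum(c * c for c in p))
    return tuple(c / n for c in p)

def subdivide(vertices, triangles):
    """Split every triangle into four; new vertices are placed on the surface."""
    vertices = list(vertices)
    midpoint_index = {}

    def midpoint(a, b):
        key = (min(a, b), max(a, b))  # share the new vertex between neighbours
        if key not in midpoint_index:
            m = tuple((u + v) / 2 for u, v in zip(vertices[a], vertices[b]))
            vertices.append(to_surface(m))
            midpoint_index[key] = len(vertices) - 1
        return midpoint_index[key]

    refined = []
    for a, b, c in triangles:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        refined += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return vertices, refined
```

Calling `subdivide` repeatedly on the output of `octahedron()` quadruples the triangle count per level; with a reconstructed voxel model, `to_surface` would instead search along the vertex normal for the model boundary.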
8. Texture Map Encoding with 4-D Wavelets
- Arrange images into 2-D array
- Embedded encoding of wavelet coefficients (4D-SPIHT)
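To illustrate what a separable 4-D transform means here, the sketch below applies one level of a Haar transform along every axis of a nested list. The Haar filter is an assumption chosen for brevity (the slide does not name the wavelet), even side lengths are assumed, and the SPIHT coefficient coding is not shown.

```python
def _combine(p, q, f):
    """Apply f element-wise to two equally shaped nested lists (or scalars)."""
    if isinstance(p, list):
        return [_combine(u, v, f) for u, v in zip(p, q)]
    return f(p, q)

def haar_axis(items):
    """One Haar level along the outermost axis: averages first, then details."""
    half = len(items) // 2
    low = [_combine(items[2 * i], items[2 * i + 1], lambda u, v: (u + v) / 2)
           for i in range(half)]
    high = [_combine(items[2 * i], items[2 * i + 1], lambda u, v: (u - v) / 2)
            for i in range(half)]
    return low + high

def haar_nd(a):
    """One Haar level along every axis of a nested list of any depth."""
    if not isinstance(a[0], list):
        return haar_axis(a)
    # Transform the inner axes first, then the outermost axis.
    return haar_axis([haar_nd(sub) for sub in a])

def inverse_haar_1d(coeffs):
    """Invert haar_axis for a flat list of numbers."""
    half = len(coeffs) // 2
    signal = []
    for s, d in zip(coeffs[:half], coeffs[half:]):
        signal += [s + d, s - d]
    return signal
```

For a 4-D texture array indexed by two image-array coordinates and two texel coordinates, `haar_nd` concentrates the energy of smooth data in the low-pass corner, which is what an embedded coder then exploits.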
9. Results: Wavelet Texture Map Encoder
Reconstruction quality in luminance PSNR (dB)
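For reference, luminance PSNR can be computed as in the sketch below. The BT.601 luma weights and the 8-bit peak value of 255 are assumptions; the slide does not say how luminance was derived.

```python
import math

def luminance(r, g, b):
    """Luma from 8-bit RGB using ITU-R BT.601 weights (assumed here)."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def psnr(original, decoded, peak=255.0):
    """PSNR in dB between two equally long sequences of luminance samples."""
    if len(original) != len(decoded):
        raise ValueError("inputs must have the same number of samples")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)
```

A mean squared error of 1 on 8-bit data corresponds to roughly 48.1 dB, which gives a feel for the scale of the values reported in this talk.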
10. Results: Wavelet Texture Map Encoder
11. Progressive Decoding
12. Align Views by Model-aided Prediction
13. Hierarchical Image Coding Order
- project camera positions on hemisphere
- subdivide into 4 quadrants
- INTRA-encode corner images
- encode center image
- image prediction
- residual error coding
- subdivide into sub-quadrants
- encode center and mid-side images
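The coarse-to-fine order can be sketched for an idealised case. The assumption here is that the projected camera positions fall on a regular square grid with side length 2**k + 1; the actual hemispherical arrangement is more general, and `coding_order` is an illustrative name. Prediction and residual coding are not shown.

```python
def coding_order(n):
    """Return grid positions in a coarse-to-fine coding order.

    n: grid side length, assumed to be 2**k + 1.
    The first four entries are the corner images (INTRA-coded); every later
    position lies between positions that were already coded.
    """
    last = n - 1
    order = [(0, 0), (0, last), (last, 0), (last, last)]
    seen = set(order)
    cells = [(0, 0, last)]  # (top row, left column, side length)
    while cells and cells[0][2] >= 2:
        next_cells = []
        for r, c, s in cells:
            h = s // 2
            # Centre first, then the four mid-side positions of this cell.
            for pos in [(r + h, c + h), (r, c + h), (r + h, c),
                        (r + h, c + s), (r + s, c + h)]:
                if pos not in seen:
                    seen.add(pos)
                    order.append(pos)
            next_cells += [(r, c, h), (r, c + h, h),
                           (r + h, c, h), (r + h, c + h, h)]
        cells = next_cells
    return order
```

For a 5x5 grid this yields the four corners, then the centre (2, 2) and the mid-sides, then the centres and mid-sides of the four sub-quadrants, covering all 25 positions exactly once.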
14. Model-aided Image-Domain Light-Field Coder
[Block diagram. Labels: light-field image I_{u,v}, DCT coefficients, multiframe disparity compensation, disparity map generation]
15. Picture Quality
Original: Mouse light field, 257 RGB images, 384x288 pixels, 81.3 Mbytes
Compressed: 300:1, 0.077 bpp (267 KBytes), 37.9 dB PSNR
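The figures on this slide can be checked with a few lines of arithmetic. The sketch assumes 8 bits per colour channel and reads Mbytes and KBytes as binary units (2**20 and 2**10 bytes); under those assumptions the slide's numbers are reproduced, and the exact ratio comes out slightly above the rounded 300:1.

```python
def raw_bytes(num_images, width, height, bytes_per_pixel=3):
    """Uncompressed size of an image set with 8-bit RGB pixels."""
    return num_images * width * height * bytes_per_pixel

def compressed_bytes(num_images, width, height, bits_per_pixel):
    """Compressed size for a given average rate in bits per pixel."""
    return num_images * width * height * bits_per_pixel / 8

raw = raw_bytes(257, 384, 288)
packed = compressed_bytes(257, 384, 288, 0.077)
print(f"raw:        {raw / 2**20:.1f} Mbytes")   # 81.3
print(f"compressed: {packed / 2**10:.0f} KBytes")  # 267
print(f"ratio:      {raw / packed:.0f}:1")        # 312
```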
16. Model-aided vs. Texture Coding
17. Natural vs. Synthetic Image Set
18. Inaccurate Geometry
19. Model-based Videophone
20. Modeling of Facial Expressions
- Head geometry composed of 101 triangular B-spline patches
- Facial expressions by superposition of 66 FAPs (Facial Animation Parameters) according to MPEG-4 standard
- FAPs act on control points of triangular B-spline patches
21. Estimation of Facial Expressions
Displacement field constrained by FAPs
Linearize for small FAPs
Optical flow constraint equation
- Solve overdetermined system by linear regression
- Apply iteratively in analysis-synthesis loop
- Incorporate spatial resolution pyramid
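The equations themselves are not preserved in this transcript. A generic form of the three steps named above might read as follows; the notation (\mathbf{J}, \Delta\mathbf{f}, \mathbf{A}, \mathbf{b}) is assumed for illustration and is not taken from the slide.

```latex
% Displacement field constrained by FAPs, linearized for small FAP changes
% (J is the 2 x K Jacobian of the image-plane displacement w.r.t. K FAPs):
\mathbf{d}(x,y) \approx \mathbf{J}(x,y)\,\Delta\mathbf{f},
\qquad \Delta\mathbf{f} \in \mathbb{R}^{K}

% Optical flow constraint at pixel (x,y):
\nabla I(x,y)^{\mathsf T}\,\mathbf{d}(x,y) + I_t(x,y) = 0

% Substituting gives one linear equation per pixel:
\nabla I(x,y)^{\mathsf T}\,\mathbf{J}(x,y)\,\Delta\mathbf{f} = -I_t(x,y)

% Stacking N \gg K pixels yields an overdetermined system, solved by
% least squares:
\mathbf{A}\,\Delta\mathbf{f} = \mathbf{b},
\qquad
\Delta\hat{\mathbf{f}} = (\mathbf{A}^{\mathsf T}\mathbf{A})^{-1}\mathbf{A}^{\mathsf T}\mathbf{b}
```

Because the linearization only holds for small changes, the estimate is refined iteratively: the model is re-rendered with the updated FAPs and the residual motion is estimated again, starting at coarse resolution.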
22. Results: Peter
Original
Synthesized
- Sequence Peter, 230 frames, CIF resolution, 25 fps
Compressed 25,000:1, 1.2 kbps, 32.8 dB PSNR
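The quoted ratio is consistent with the bit rate if the uncompressed source is taken to be 8-bit 4:2:0 video (12 bits per pixel on average). That sampling format is an assumption; the slide does not state it.

```python
CIF_WIDTH, CIF_HEIGHT = 352, 288

def raw_bit_rate(width, height, fps, bits_per_pixel=12):
    """Uncompressed bit rate in bit/s; 12 bpp corresponds to 8-bit 4:2:0."""
    return width * height * bits_per_pixel * fps

def compression_ratio(raw_bps, coded_bps):
    """Ratio between uncompressed and coded bit rate."""
    return raw_bps / coded_bps

raw_bps = raw_bit_rate(CIF_WIDTH, CIF_HEIGHT, fps=25)
print(raw_bps)                                   # 30412800
print(round(compression_ratio(raw_bps, 1200)))   # 25344
```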
23. Results: Eckehard
Original
Synthesized
- Sequence Eckehard, CIF resolution, 25 fps
1.1 kbps, 32.6 dB PSNR
24. Results: Peter as Eckehard
Original
Synthesized
- Sequence Peter, 230 frames, CIF resolution, 25 fps
25. Results: Eckehard as Peter
Original
Synthesized
- Sequence Eckehard, CIF resolution, 25 fps
26. Results: Peter as Akiyo
Original
Synthesized
- Sequence Peter, 230 frames, CIF resolution, 25 fps
27. ... But, What About Unknown Objects?
Original
Synthesized
1.2 kbps
28. Model-Aided Coding: Incorporating Synthetic Video into MC Hybrid Coding
[Block diagram. Labels: coder control, control data (incl. motion vectors), input video, intraframe DCT coder, DCT coefficients, e, intraframe decoder, multiframe motion compensation, decoder]
29. R-D-Optimal Mode Decision
Selection mask minimizing D + λR
Previous decoded frame
Predicted frame
Synthesized frame
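A minimal sketch of a Lagrangian selection between the two reference frames, block by block. The block layout, the squared-error distortion measure, and the per-block rate tables are illustrative assumptions; a real coder would obtain the rates from its entropy coder.

```python
def ssd(block_a, block_b):
    """Sum of squared differences between two flat lists of samples."""
    return sum((a - b) ** 2 for a, b in zip(block_a, block_b))

def selection_mask(current, previous, synthesized, rates, lam):
    """Pick, per block, the reference with the smaller cost D + lam * R.

    current, previous, synthesized: lists of blocks (flat sample lists).
    rates: dict with assumed per-block bit costs under the keys
        "previous" and "synthesized".
    lam: Lagrange multiplier trading distortion against rate.
    """
    mask = []
    for i, block in enumerate(current):
        cost_prev = ssd(block, previous[i]) + lam * rates["previous"][i]
        cost_syn = ssd(block, synthesized[i]) + lam * rates["synthesized"][i]
        mask.append("synthesized" if cost_syn < cost_prev else "previous")
    return mask
```

With a small multiplier the choice follows the prediction error alone; as the multiplier grows, the cheaper-to-signal reference wins even where it predicts slightly worse.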
30. Results: Peter
H.263 (TMN-10) @ 12 kbps
Model-Aided Coder @ 12 kbps
- Sequence Clap, 8.33 fps, CIF resolution
31. Results: Akiyo
H.263 (TMN-10) @ 10 kbps
Model-Aided Coder @ 10 kbps
- Sequence Akiyo, 10 fps, CIF resolution
32. R-D Performance of Model-Aided Coder
Sequence Peter
Sequence Akiyo
33. Can 3-D Geometry Help to Compress Images?
Conclusion
- YES ...
- ... IF many views of the same 3-D object/scene shall be compressed.
- Applications in
  - Multiview image coding (light-field compression)
  - Compression of video sequences
- Very high compression ratios (100:1 ... 25,000:1)
- Require accurate vision algorithms for 3-D reconstruction
- Image-domain compression more resilient against inaccurate geometry and hence more practical than texture-map encoding
34. ... THE END