Image Compression and Graphics: More Than a Sum of Parts


Title: Image Compression and Graphics: More Than a Sum of Parts


1
Image Compression and Graphics: More Than a Sum
of Parts?
  • Bernd Girod
  • Collaborators: Peter Eisert, Marcus Magnor,
    Prashant Ramanathan, Eckehard Steinbach (all Stanford),
    Thomas Wiegand (HHI)

Image, Video, and Multimedia Systems Group,
Information Systems Laboratory, Stanford University
2
Can 3-D Geometry Help to Compress Images?
  • Conjecture
  • 3-d geometry models help compression, if a
    single 3-D model captures the dependencies
    between many views (or frames of a sequence).

3
Outline of this Talk
  • Compression of many simultaneous views (e.g.
    light-fields)
  • Encoding view-dependent texture maps with 4-d
    wavelets
  • Hierarchical image-domain light-field coder
  • Why image-domain encoding is (usually) superior
    to texture-map encoding
  • Model-based compression of talking head sequences
  • Modeling and estimation of facial expressions
  • Avatars
  • Incorporate synthetic video into
    motion-compensated hybrid coding

4
Multi-View Image Capture
  • Coding schemes suitable for:
  • 2-plane parametrization
  • Hemispherical image arrangement
  • (arbitrary recording positions)

5
Align Views by Mapping onto Object Surface
  • Camera views: no correlation between
    corresponding pixels
  • View-dependent texture map: strong
    correlation between corresponding texels

6
3-D Reconstruction from Many Views
Volumetric Reconstruction
  • Subdivide the object's bounding box into voxels
  • Generation of multiple hypotheses for each voxel
  • Hypothesis elimination by projecting visible
    voxels into light-field images
  • Iterate over all voxels until remaining
    hypotheses are photo-consistent
  • processes all views simultaneously
  • exploits texture and silhouette information
  • yields solid 3-D voxel model
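
The hypothesis-elimination step can be illustrated with a small sketch. It assumes each voxel carries a few candidate RGB colors and that the colors it projects to in the views where it is visible are already known. The function name, the tolerance `tol`, and the Euclidean color distance are illustrative assumptions; visibility handling and the iteration over all voxels are omitted.

```python
import numpy as np

def surviving_hypotheses(hypotheses, observed, tol=10.0):
    """Keep the color hypotheses of one voxel that agree with every view.

    hypotheses: (H, 3) candidate RGB colors for the voxel
    observed:   (V, 3) RGB colors the voxel projects to in the V views
                where it is visible (assumed to be given)
    tol:        hypothetical photo-consistency threshold
    """
    hypotheses = np.asarray(hypotheses, dtype=float)
    observed = np.asarray(observed, dtype=float)
    # Distance from every hypothesis to every observation, shape (H, V)
    dist = np.linalg.norm(hypotheses[:, None, :] - observed[None, :, :], axis=2)
    # A hypothesis survives only if it is close to the observation in all views
    keep = (dist <= tol).all(axis=1)
    return hypotheses[keep]
```

A voxel whose hypothesis set becomes empty would be treated as transparent in such a scheme; that policy is also an assumption of this sketch.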

7
Surface Representation
  • Initial octahedral geometry
  • Geometry refinement
  • determine vertex normals
  • move vertices to model surface
  • subdivide triangles
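
A minimal sketch of the subdivide-and-move loop, starting from an octahedron. As a stand-in for the reconstructed voxel model it pushes new vertices radially onto a unit sphere; the slides move vertices along their normals to the model surface, which is not reproduced here. All function names are hypothetical.

```python
import numpy as np

def octahedron():
    """Initial geometry: 6 vertices, 8 triangles."""
    verts = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                      [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float)
    faces = [(0, 2, 4), (2, 1, 4), (1, 3, 4), (3, 0, 4),
             (2, 0, 5), (1, 2, 5), (3, 1, 5), (0, 3, 5)]
    return verts, faces

def to_unit_sphere(p):
    """Stand-in for 'move vertex to model surface' (assumption)."""
    return p / np.linalg.norm(p)

def subdivide(verts, faces, project):
    """Split every triangle into four and project the new edge midpoints."""
    verts = [np.asarray(v, dtype=float) for v in verts]
    cache = {}  # shared edge -> index of its midpoint vertex

    def midpoint(i, j):
        key = (min(i, j), max(i, j))
        if key not in cache:
            verts.append(project((verts[i] + verts[j]) / 2.0))
            cache[key] = len(verts) - 1
        return cache[key]

    new_faces = []
    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return np.array(verts), new_faces
```

Calling `subdivide` repeatedly refines the mesh; each pass multiplies the triangle count by four.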

8
Texture Map Encoding with 4-d Wavelets
  • Arrange images into 2-d array
  • Embedded encoding of wavelet coefficients
    (4D-SPIHT)
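
To make the 4-d transform concrete, here is a sketch of one decomposition level applied along all four axes of a texture-map array indexed as (view row, view column, texel row, texel column). It uses the Haar filter as a simple stand-in; the filter actually used in the talk is not stated in the transcript, and the embedded SPIHT coding of the coefficients is not shown. Every axis length is assumed to be even.

```python
import numpy as np

def haar_1d(x, axis):
    """One orthonormal Haar level along a single axis (even length assumed)."""
    x = np.moveaxis(x, axis, 0)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)    # averages
    high = (even - odd) / np.sqrt(2.0)   # details
    return np.moveaxis(np.concatenate([low, high], axis=0), 0, axis)

def haar_4d(light_field):
    """Apply one Haar level separably along all four axes."""
    out = np.asarray(light_field, dtype=float)
    for axis in range(4):
        out = haar_1d(out, axis)
    return out
```

For strongly correlated texels most of the energy ends up in the low-pass corner of the array, which is what an embedded coder then exploits.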

9
Results: Wavelet Texture Map Encoder
Reconstruction quality in luminance PSNR (dB)
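
For reference, the standard PSNR definition for 8-bit luminance can be sketched as follows. The function name and the peak value of 255 are assumptions; the transcript does not say how the measurements were averaged over the image set.

```python
import numpy as np

def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    diff = np.asarray(original, dtype=float) - np.asarray(decoded, dtype=float)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```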
10
Results: Wavelet Texture Map Encoder
11
Progressive Decoding
12
Align Views by Model-aided Prediction
13
Hierarchical Image Coding Order
  • project camera positions on hemisphere
  • subdivide into 4 quadrants
  • INTRA-encode corner images
  • encode center image
  • image prediction
  • residual error coding
  • encode mid-side images
  • subdivide into sub-quadrants
  • encode center and mid-side images
  • subdivide repeatedly
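
The coding order above can be sketched on a flat square grid of camera positions. This is a simplification: the talk projects camera positions onto a hemisphere, while the sketch assumes an n x n grid with n = 2**k + 1 so that every quadrant has a well-defined center. The function name is hypothetical.

```python
def coding_order(n):
    """Return grid positions (row, col) in hierarchical coding order.

    Assumes n = 2**k + 1. Corners come first (INTRA-coded in the talk),
    then, level by level, the center and the mid-side images of every
    quadrant, each predicted from images coded earlier.
    """
    size = n - 1
    order = [(0, 0), (0, size), (size, 0), (size, size)]
    seen = set(order)
    quads = [(0, 0, size)]  # (top row, left col, side length)
    while quads and quads[0][2] >= 2:
        next_quads = []
        for r, c, s in quads:
            h = s // 2
            center = (r + h, c + h)
            mid_sides = [(r, c + h), (r + h, c), (r + h, c + s), (r + s, c + h)]
            for p in [center] + mid_sides:
                if p not in seen:  # mid-sides are shared between neighbors
                    seen.add(p)
                    order.append(p)
            next_quads += [(r, c, h), (r, c + h, h), (r + h, c, h), (r + h, c + h, h)]
        quads = next_quads
    return order
```

For a 5 x 5 grid the first positions are the four corners followed by the center (2, 2), and every position appears exactly once.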

14
Model-aided Image-Domain Light-Field Coder
[Block diagram; recoverable labels: light-field image I_{u,v}, subtraction (-), DCT coefficients, multiframe disparity compensation, disparity map generation]
15
Picture Quality
Original: Mouse light field, 257 RGB images,
384x288 pixels, 81.3 Mbytes
Compressed: 300:1, 0.077 bpp (267 KBytes), 37.9 dB
PSNR
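
The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch assumes 3 bytes per RGB pixel and binary units (1 Mbyte = 2**20 bytes, 1 KByte = 1024 bytes); neither is stated in the transcript.

```python
images, width, height = 257, 384, 288
pixels = images * width * height

raw_bytes = pixels * 3                 # assumption: 8 bits per RGB channel
compressed_bytes = 267 * 1024          # assumption: KByte = 1024 bytes

raw_mbytes = raw_bytes / 2**20         # about 81.3
bpp = compressed_bytes * 8 / pixels    # about 0.077 bits per pixel
ratio = raw_bytes / compressed_bytes   # about 312, i.e. roughly 300 to 1

print(round(raw_mbytes, 1), round(bpp, 3), round(ratio))
```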
16
Model-aided vs. Texture Coding
17
Natural vs. Synthetic Image Set
18
Inaccurate Geometry
19
Model-based videophone
20
Modeling of Facial Expressions
  • Head geometry composed of 101 triangular
    B-spline patches
  • Facial expressions by superposition of 66 FAPs
    (Facial Animation Parameters) according to MPEG-4
    standard
  • FAPs act on control points of triangular B-spline
    patches

21
Estimation of Facial Expressions
Displacement field constrained by FAPs
Linearize for small FAPs
Optical flow constraint equation
  • Solve overdetermined system by linear regression
  • Apply iteratively in analysis-synthesis loop
  • Incorporate spatial resolution pyramid
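
One regression step of this kind can be sketched as below. It assumes the optical flow constraint Ix*u + Iy*v + It = 0 at each pixel and a linearized motion model (u, v) = J f, where f is the parameter vector and J is a per-pixel 2 x K Jacobian. The names, the array layout, and the Jacobian itself are assumptions for illustration; the analysis-synthesis iteration and the resolution pyramid are not shown.

```python
import numpy as np

def estimate_params(Ix, Iy, It, J):
    """Least-squares estimate of K motion parameters from N pixels.

    Ix, Iy, It: (N,) spatial and temporal image gradients
    J:          (N, 2, K) hypothetical Jacobian of the displacement
                with respect to the parameters, one 2 x K block per pixel
    """
    grad = np.stack([Ix, Iy], axis=1)           # (N, 2)
    # One equation per pixel: (Ix, Iy) . J_n . f = -It
    A = np.einsum("ni,nik->nk", grad, J)        # (N, K)
    b = -np.asarray(It, dtype=float)
    f, *_ = np.linalg.lstsq(A, b, rcond=None)   # overdetermined when N > K
    return f
```

Because the linearization only holds for small parameter changes, such a step would be repeated after re-synthesizing the model frame with the updated estimate.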

22
Results: Peter
Original
Synthesized
  • Sequence Peter, 230 frames,
  • CIF resolution, 25 fps

Compressed: 25,000:1, 1.2 kbps, 32.8 dB PSNR
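
The ratio and the bit rate quoted for this sequence are consistent with each other under common assumptions. The sketch below assumes CIF means 352x288 pixels and that the uncompressed video uses 4:2:0 sampling at 8 bits per sample (12 bits per pixel); the transcript states neither.

```python
width, height, fps = 352, 288, 25        # CIF at 25 fps (assumed 352x288)
bits_per_pixel = 12                      # assumption: 4:2:0, 8 bits per sample
raw_bps = width * height * bits_per_pixel * fps

ratio = 25_000                           # reading the slide's ratio as 25,000 to 1
coded_kbps = raw_bps / ratio / 1000      # about 1.2 kbps

print(raw_bps, round(coded_kbps, 2))
```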
23
Results: Eckehard
Original
Synthesized
  • Sequence Eckehard
  • CIF resolution, 25 fps

1.1 kbps, 32.6 dB PSNR
24
Results: Peter as Eckehard
Original
Synthesized
  • Sequence Peter, 230 frames,
  • CIF resolution, 25 fps

25
Results: Eckehard as Peter
Original
Synthesized
  • Sequence Eckehard
  • CIF resolution, 25 fps

26
Results: Peter as Akiyo
Original
Synthesized
  • Sequence Peter, 230 frames,
  • CIF resolution, 25 fps

27
. . . But, What About Unknown Objects?
Original
Synthesized
  • Sequence Clap

1.2 kbps
28
Model-Aided Coding: Incorporating Synthetic
Video into MC Hybrid Coding
[Block diagram; recoverable labels: coder control, control data (incl. motion vectors), input video, subtraction (-), e, intraframe DCT coder, DCT coefficients, intraframe decoder, multiframe motion compensation, decoder]
29
R-D-Optimal Mode Decision
Selection mask minimizing D + λR
Previous decoded frame
Predicted frame
Synthesized frame
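
A per-block mode decision of this kind can be sketched as follows. For each block, every candidate reference (for example the previous decoded frame or the synthesized frame) has a distortion D and a rate R, and the mask stores the index of the candidate with the lowest Lagrangian cost. The function name, the array layout, and the sample numbers are illustrative assumptions.

```python
import numpy as np

def selection_mask(distortion, rate, lam):
    """Pick, per block, the candidate minimizing D + lam * R.

    distortion, rate: (B, M) arrays for B blocks and M candidate references
    lam:              Lagrange multiplier trading rate against distortion
    Returns a (B,) array of candidate indices.
    """
    cost = np.asarray(distortion, dtype=float) + lam * np.asarray(rate, dtype=float)
    return np.argmin(cost, axis=1)

# Hypothetical numbers: column 0 = previous decoded frame, column 1 = synthesized frame
D = [[100, 60], [30, 50]]
R = [[10, 40], [10, 40]]
print(selection_mask(D, R, lam=1.0))  # [1 0]
```

A larger multiplier penalizes rate more heavily, so a block can switch away from a low-distortion but expensive candidate as lam grows.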
30
Results: Peter
H.263 (TMN-10) @ 12 kbps
Model-Aided Coder @ 12 kbps
  • Sequence Clap, 8.33 fps, CIF resolution

31
Results: Akiyo
H.263 (TMN-10) @ 10 kbps
Model-Aided Coder @ 10 kbps
  • Sequence Akiyo, 10 fps, CIF resolution

32
R-D Performance of Model-Aided Coder
Sequence Peter
Sequence Akiyo
33
Can 3-d geometry help to compress images?
Conclusion
  • YES . . .
  • . . . IF many views of the same 3-D object/scene
    shall be compressed.
  • Applications in
  • Multiview image coding (light-field compression)
  • Compression of video sequences
  • Very high compression ratios (100:1 . . .
    25,000:1)
  • Require accurate vision algorithms for 3-d
    reconstruction
  • Image-domain compression more resilient against
    inaccurate geometry and hence more practical than
    texture-map encoding

34
. . . THE END