MPEG - PowerPoint PPT Presentation
Transcript and Presenter's Notes

Title: MPEG


1
MPEG
  • Howell Istance
  • School of Computing
  • De Montfort University

2
Motion Pictures Expert Group
  • Established in 1988 with remit to develop
    standards for coded representation of audio,
    video and their combination
  • operates within framework of Joint ISO/IEC
    Technical Committee (JTC1 on Information
    Technology), organised into committees and
    sub-committees
  • originally 25 experts, now approximately 350
    experts from 200 companies and academic
    institutions, meeting approx. 3 times/year
    (depending on the committee)
  • standards work takes a long time, requires
    international agreement, and is (potentially) of
    great strategic industrial importance

3
MPEG-1 standards
  • video standard for low fidelity video,
    implemented in software codecs, suitable for
    transmission over computer networks
  • audio standard has 3 layers; encoding complexity
    increases and data rates fall as the layer
    number increases:
  • Layer 1 - 192 kbps
  • Layer 2 - 128 kbps
  • Layer 3 - 64 kbps (MPEG-1 Layer 3 = MP3)
  • (these data rates are doubled for a stereo
    signal)

4
MPEG1 - Layer 3 Audio encoding
  • Encoders analyse an audio signal and compare it
    to psycho-acoustic models representing
    limitations in human auditory perception
  • Encode as much useful information as possible
    within restrictions set by bit rate and sampling
    frequency
  • Discard samples where the amplitude is below the
    minimum audition threshold for different
    frequencies
  • Auditory masking - a louder sound masks a softer
    sound when played simultaneously or close
    together, so the softer sound samples can be
    discarded
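The threshold step above can be sketched as follows. This is a minimal illustration, assuming Terhardt's approximation of the absolute threshold of hearing; the function and the sample values are illustrative, not the standard's psychoacoustic model:

```python
import numpy as np

def ath_db(f_hz):
    """Approximate absolute threshold of hearing (Terhardt's
    formula), in dB SPL, as a function of frequency in Hz."""
    f = f_hz / 1000.0
    return (3.64 * f ** -0.8
            - 6.5 * np.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

def drop_inaudible(freqs_hz, levels_db):
    """Keep only the spectral components above the threshold."""
    audible = levels_db > ath_db(freqs_hz)
    return freqs_hz[audible], levels_db[audible]

# A 10 dB tone at 1 kHz is audible; quieter components at the
# edges of the spectrum fall under the curve and are dropped.
freqs = np.array([50.0, 1000.0, 4000.0, 16000.0])
levels = np.array([20.0, 10.0, -5.0, 30.0])
kept_f, kept_l = drop_inaudible(freqs, levels)
```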

5
Psychoacoustic model
Throw away samples which will not be perceived,
i.e. those under the curve
6
MPEG1 - Layer 3 Audio encoding
  • Temporal masking - a softer tone played just
    before or just after a louder tone that is close
    to it in frequency may be inaudible, so its
    samples can be discarded
  • Reservoir of bytes - data is organised into
    frames - space left over in one frame can be
    used to store data from adjacent frames that need
    additional space
  • joint stereo - very high and very low frequencies
    can not be located in space with the same
    precision as sounds towards the centre of the
    audible spectrum. Encode these as mono
  • Huffman encoding removes redundancy in the
    encoding of repetitive bit patterns (can reduce
    file sizes by 20%)
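The Huffman step can be illustrated with a small sketch - a generic Huffman coder that gives frequent symbols shorter codes, not the fixed code tables the Layer 3 standard actually specifies:

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Build a Huffman code table for the symbols in `data`."""
    freq = Counter(data)
    # Heap entries: (frequency, tiebreak, tree); a tree is
    # either a symbol or a (left, right) pair.
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    i = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, i, (t1, t2)))
        i += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"
    walk(heap[0][2], "")
    return codes

# "a" occurs most often, so it gets the shortest code:
# 10 bits total instead of 56 bits at 8 bits per symbol.
codes = huffman_codes("aaaabbc")
encoded = "".join(codes[s] for s in "aaaabbc")
```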

7
Masking effects
  • Throw away samples in the region masked by a
    louder tone

8
Schematic of MPEG1 - Layer 3 encoding
http://www.iis.fhg.de/amm/techinf/layer3/index.htm
9
MPEG-2 standards
  • Video standard for high fidelity video
  • Levels define parameters: maximum frame size,
    data rate and chrominance subsampling
  • Profiles may be implemented at one or more
    levels
  • MP@ML (main profile at main level) uses CCIR
    601 scanning, 4:2:0 chrominance subsampling and
    supports a data rate of 15 Mbps
  • MP@ML is used for digital television broadcasting
    and DVD
  • Audio standard essentially same as MPEG-1, with
    extensions to cope with surround sound
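The 4:2:0 chrominance subsampling mentioned above can be sketched as follows; `subsample_420` is a hypothetical helper, not part of any MPEG reference code. Luma keeps full resolution, while each chroma plane is averaged over 2x2 blocks:

```python
import numpy as np

def subsample_420(y, cb, cr):
    """4:2:0 subsampling: keep the luma plane (Y) at full
    resolution, average each 2x2 block of the chroma planes."""
    def down2(plane):
        h, w = plane.shape
        return plane.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return y, down2(cb), down2(cr)

y = np.zeros((4, 4))
cb = np.arange(16.0).reshape(4, 4)
cr = cb.copy()
y2, cb2, cr2 = subsample_420(y, cb, cr)
# Luma stays 4x4; each chroma plane shrinks to 2x2, so the
# three planes together carry half the data of 4:4:4.
```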

10
MPEG-4
  • MPEG-4 standard activity aimed to define an
    audiovisual coding standard to address the needs
    of the communication, interactive (computing) and
    broadcasting service (TV/film/entertainment)
    models
  • In MPEG-1 and MPEG-2, systems referred to
    overall architecture, multiplexing and
    synchronisation.
  • In MPEG-4, systems also includes scene
    description, interactivity, content description
    and programmability
  • Initial call for proposals - July 1995, version 2
    amendments - December 2000

11
Images from Jean-Claude Dufourd, ENST, Paris
12
Images from Jean-Claude Dufourd, ENST, Paris
13
Images from Jean-Claude Dufourd, ENST, Paris
14
MPEG-4 Systems - mission
  • Develop a coded, streamable representation for
    audio-visual objects and their associated
    time-variant data along with a description of how
    they are combined
  • coded representation as opposed to textual
    representation - binary encoding for bandwidth
    efficiency
  • streamable as opposed to downloaded -
    presentations have a temporal extent rather than
    being based on files of a finite size
  • audio-visual objects and their associated
    time-variant data as opposed to individual
    audio or visual streams. MPEG-4 deals with
    combinations of streams to create an interactive
    visual scene, not with encoding of audio or
    visual data

15
MPEG-4 Principles
  • Audio-visual objects - representation of a
    natural or synthetic object which has an audio
    and/or visual manifestation (e.g. video
    sequence, 3D animated face)
  • scene description - information describing where,
    when and for how long a-v objects will appear
  • Interactivity expressed in 3 requirements
  • client side interaction with scene description as
    well as with exposed properties of a-v objects
  • behaviour attached to a-v objects, triggered by
    events (e.g. user-generated events, timeouts)
  • client-server interaction, user data sent back to
    server, server responds with modifications to
    scene (for example)
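The where/when/how-long idea of a scene description can be sketched with hypothetical `AVObject` and `Scene` classes - illustrative only, not the MPEG-4 data structures:

```python
from dataclasses import dataclass, field

@dataclass
class AVObject:
    """Hypothetical a-v object: placement, timing, behaviour."""
    object_id: int
    position: tuple        # (x, y) placement in the scene
    start: float           # seconds into the presentation
    duration: float        # how long the object stays visible
    on_click: str = ""     # behaviour triggered by a user event

@dataclass
class Scene:
    objects: list = field(default_factory=list)

    def visible_at(self, t):
        """Objects present in the scene at time t."""
        return [o for o in self.objects
                if o.start <= t < o.start + o.duration]

scene = Scene([AVObject(1, (0, 0), 0.0, 10.0, "pause_video"),
               AVObject(2, (50, 20), 5.0, 2.0)])
ids = [o.object_id for o in scene.visible_at(6.0)]
```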

16
MPEG-4 Systems Principles
Interactive scene description
Scene description stream
Object description stream
Visual object stream
Visual object stream
Visual object stream
Audio object stream
17
MPEG-4 Systems Principles
Interactive scene description
Scene description stream
Object description stream
Visual object stream
Visual object stream
Visual object stream
Audio object stream
Elementary streams
18
Object Descriptor Framework
  • Glue between scene description and streaming
    resources (elementary streams)
  • object descriptor - a container structure that
    encapsulates all setup and association
    information for a set of elementary streams, as
    a set of sub-descriptors describing the
    individual streams (e.g. configuration
    information for the stream decoder)
  • groups sets of streams that are seen as a single
    entity from perspective of scene description
  • object description framework separated from scene
    description so that elementary streams can be
    changed and re-located without changing scene
    description
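The descriptor/stream relationship can be sketched with hypothetical classes; the names and fields are illustrative, not the MPEG-4 syntax:

```python
from dataclasses import dataclass, field

@dataclass
class ESDescriptor:
    """Hypothetical elementary-stream descriptor: a stream id
    plus the decoder configuration needed to set it up."""
    es_id: int
    stream_type: str        # e.g. "visual", "audio", "scene"
    decoder_config: dict = field(default_factory=dict)

@dataclass
class ObjectDescriptor:
    """Groups the streams seen as one scene-level entity."""
    od_id: int
    es_descriptors: list = field(default_factory=list)

# The scene description refers only to od_id; the streams
# behind it can be swapped or relocated by editing the
# descriptor, without touching the scene description.
od = ObjectDescriptor(10, [
    ESDescriptor(1, "visual", {"codec": "mpeg4-video"}),
    ESDescriptor(2, "audio", {"codec": "aac"}),
])
stream_ids = [es.es_id for es in od.es_descriptors]
```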

19
BIFS - BInary Format for Scenes
  • Specifies spatial and temporal locations of
    objects in scenes, together with their attributes
    and behaviours
  • elements of scene and relationship between them
    form a scene graph that must be encoded for
    transmission
  • based heavily on VRML, supports almost all VRML
    nodes
  • does not support the use of Java in Script nodes
    (only ECMAScript)
  • does expand on functionality of VRML - allows a
    much broader range of applications to be supported

20
BIFS expansions to VRML
  • Compressed binary format
  • BIFS describes an efficient binary representation
    of the scene graph information.
  • Coding may be either lossless or lossy.
  • Coding efficiency derives from a number of
    classical compression techniques, plus some novel
    ones.
  • Knowledge of context is exploited heavily in
    BIFS.
  • Streaming
  • scene may be transmitted as an initial scene
    followed by timestamped modifications to the
    scene.
  • BIFS Command protocol allows replacement of the
    entire scene, addition/deletion/replacement of
    nodes and behavioural elements in the scene graph
    as well as modification of scene properties.
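The BIFS Command idea - an initial scene followed by timestamped updates - can be sketched as follows; the command verbs and node fields here are illustrative, not BIFS syntax:

```python
# Start from an initial (empty) scene graph.
scene = {"root": {"children": {}}}

def apply_command(scene, cmd):
    """Apply one insert/delete/replace update to the scene."""
    nodes = scene["root"]["children"]
    if cmd["op"] == "insert":
        nodes[cmd["node_id"]] = cmd["node"]
    elif cmd["op"] == "delete":
        nodes.pop(cmd["node_id"], None)
    elif cmd["op"] == "replace":
        nodes[cmd["node_id"]] = cmd["node"]

# A stream of timestamped scene modifications.
commands = [
    {"t": 0.0, "op": "insert", "node_id": "title",
     "node": {"type": "Text2D", "string": "Hello"}},
    {"t": 2.0, "op": "replace", "node_id": "title",
     "node": {"type": "Text2D", "string": "Goodbye"}},
    {"t": 4.0, "op": "delete", "node_id": "title"},
]
for cmd in commands[:2]:          # state of the scene at t = 2.0
    apply_command(scene, cmd)
```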

21
BIFS expansions to VRML
  • 2D Primitives
  • BIFS includes native support for 2D scenes.
  • facilitates content creators who wish to produce
    low complexity scenes, including the traditional
    television and multimedia industries.
  • Many applications cannot bear the cost of
    requiring decoders to have full 3D rendering and
    navigation. This is particularly true where
    hardware decoders must be of low cost, as for
    instance television set-top boxes.
  • Rather than simply partitioning the multimedia
    world into 2D and 3D, MPEG-4 BIFS allows the
    combination of 2D and 3D elements in a single
    scene.

22
BIFS expansions to VRML
  • Animation
  • A second streaming protocol, BIFS Anim, provides
    a low-overhead mechanism for the continuous
    animation of changes to numerical values of the
    components in the scene.
  • These streamed animations provide an alternative
    to the interpolator nodes supported in both BIFS
    and VRML.
  • Enhanced Audio
  • BIFS provides the notion of an "audio scene
    graph"
  • audio sources, including streaming ones, can be
    mixed.
  • audio content can even be processed and
    transformed with special procedural code to
    produce various sound effects

23
BIFS expansions to VRML
  • Facial Animation
  • BIFS provides support at the scene level for the
    MPEG-4 Facial Animation decoder.
  • A special set of BIFS nodes exposes the
    properties of the animated face at the scene
    level
  • the animated face can be integrated with all
    BIFS functionalities, like any other audio or
    visual object
