MPEG-4 Structured Audio

1
MPEG-4 Structured Audio
John Lazzaro, John Wawrzynek, June 18, 2001
Modified by Francois Thibault, January 20, 2003
CS Division, University of California at Berkeley
www.cs.berkeley.edu/johnw
2
MPEG 4 Standard
  • MPEG 4 parts: audio, system, video
  • Audio, natural coding: AAC, T/F, CELP, Parametric
  • Audio, synthetic coding: SA, TTS
  • Structured Audio: one component in the MPEG
    audio standard.
  • ISO/IEC 14496-3 sec 5
3
Audio Compression Basics
  • How well does this work?
  • Perceptually lossless: 10X-20X reduction
      • MP3, Dolby AC3
  • True lossless: 2.5X reduction
      • Shorten, T. Robinson (Cambridge University)

4
The Kolmogorov alternative
  • Write a computer program that generates the
    desired audio stream.
  • Transmit the computer program.
  • To decode, execute the program.

Similar to Postscript!
  • MPEG-4 Structured Audio (MP4-SA) uses this
    approach.
  • Eric Scheirer, Editor (MIT Media Lab).
  • http://sound.media.mit.edu/eds/mpeg4/
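
The idea can be illustrated with a rough sketch in Python rather than SAOL. The 440 Hz tone, the 8 kHz sample rate, and the 16-bit PCM format are arbitrary assumptions chosen for the example; the point is only that the program text is far smaller than the audio it generates.

```python
import math
import struct

# The "transmitted" program: a short piece of source text (illustrative only).
PROGRAM = "[math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]"

def decode(program: str) -> bytes:
    """'Decode' by executing the program, then pack the samples as 16-bit PCM.

    eval() is used purely for demonstration; it is not safe for untrusted input.
    """
    samples = eval(program, {"math": math})
    return b"".join(struct.pack("<h", int(32767 * s)) for s in samples)

pcm = decode(PROGRAM)
program_bytes = len(PROGRAM.encode("ascii"))  # size of what is transmitted
pcm_bytes = len(pcm)                          # size of the audio it produces
print(program_bytes, pcm_bytes)
```

Lengthening the note changes only one constant in the program, while the PCM stream grows in proportion to the duration.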

5
MP4-SA Encoding
  • Encoding may be a creative act: writing a program,
      • directly (emacs), or
      • indirectly (GUI, webpage).
  • In this case, MP4-SA is a lossless compressor.
  • Encoding may be automatic: given a sound, an encoder
    writes a program that generates the sound.
  • Automatic encoding is hard in the general
    case.

6
Key Application: Music Production
  • Modern music production is computer-based.
  • Musicians enter performances into computers as
    control information, not audio waveforms.
  • Digital synthesizers, effects, and mixes create
    the final audio, under engineer/producer control.

7
Key Application: Music Production
Ideal for collaborative productions, remixes, and
...
8
Key Application: Music Performance
  • Music performance requires dynamic control.
  • True interactivity requires parameterized sounds.
  • Musicians control instruments and effects with
    interactive controllers.
  • Control could be indirect and remote (ex: games).

9
MPEG 4 Structured Audio
  • A binary file format that encodes:
      • The programming language SAOL (pronounced "sail").
      • The musical score language SASL.
      • Legacy support for MIDI.
      • Audio sample data.
  • Result is normative: an MP4-SA file will sound
    identical on all compliant decoders.
  • Different from MIDI files.

10
Why SAOL and MP4-SA? Why not Java?
  • Musical performances have temporal structure that
    changes over several timescales.
  • Writing sound generation code in a conventional
    language results in code dominated by time-scale
    management.
  • Hard to maintain, hard to optimize.

11
Time management is built into SAOL.
  • A SAOL program executes by moving a simulated
    clock forward in time, performing calculations
    along the way in a synchronous fashion.
  • Work is scheduled to happen:
      • at the a-rate (the audio sample rate)
      • at the k-rate (envelope control rate)
      • at the i-rate (rate for new notes)
  • Language variables are typed as a/k/i-rate.
  • A language statement is scheduled based on the
    rate of the variables it contains.
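
The three rates can be sketched as nested loops. This is a Python approximation of the scheduling idea, not SAOL and not how any particular decoder is written; the function name and the 8000 Hz / 100 Hz rates are assumptions chosen to keep the numbers small.

```python
def run_note(duration_s: float, srate: int = 8000, krate: int = 100) -> dict:
    """Count how often statements of each rate would run for one note."""
    counts = {"i": 0, "k": 0, "a": 0}
    samples_per_kcycle = srate // krate

    counts["i"] += 1                      # i-rate: once, when the note starts
    for _ in range(round(duration_s * krate)):
        counts["k"] += 1                  # k-rate: once per control period
        for _ in range(samples_per_kcycle):
            counts["a"] += 1              # a-rate: once per audio sample
    return counts

counts = run_note(0.5)
print(counts)
```

In a rate-typed language the programmer writes only the statement bodies; the loop nesting shown here is what the language supplies.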

12
SAOL, SASL, and Scheduling
  • Sound creation in MP4-SA can be compared to a
    musician playing notes on an instrument.
  • A SAOL subprogram (called an instr or instrument)
    serves as the instrument.
  • SASL commands (called score lines) act to play
    notes on SAOL instruments.
  • Many instances of a SAOL instr can be active at
    one time, making sounds corresponding to notes
    launched by different score lines in a SASL file.
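
The relationship between score lines and instrument instances can be sketched in Python. The three-field layout used here (start time, instrument name, duration) is a simplified assumption loosely modeled on SASL score lines, and the times are treated as seconds for simplicity; real SASL has more line types and parameters.

```python
from typing import List, NamedTuple

class Note(NamedTuple):
    start: float   # when the instance is launched
    instr: str     # which instrument to instantiate
    dur: float     # how long the instance stays active

# Hypothetical score: three overlapping notes on the same instrument.
SCORE = """\
0.0 tone 2.0
0.5 tone 1.0
1.0 tone 0.25
"""

def parse_score(text: str) -> List[Note]:
    notes = []
    for line in text.splitlines():
        start, instr, dur = line.split()
        notes.append(Note(float(start), instr, float(dur)))
    return notes

def active_instances(notes: List[Note], t: float) -> int:
    """Number of instrument instances sounding at time t."""
    return sum(1 for n in notes if n.start <= t < n.start + n.dur)

notes = parse_score(SCORE)
print(active_instances(notes, 1.1))
```

Each score line creates its own instance with its own state, so overlapping notes on one instrument do not interfere with each other.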

13
An example
  • SAOL instrument tone, that plays a gated sine
    wave. (SAOL code in next slide.)
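
The SAOL listing itself is not part of this transcript. As a stand-in, here is a minimal Python sketch of what "a gated sine wave" could mean: a sine oscillator computed per sample, multiplied by a gate that is updated at a slower control rate. The function name, parameters, and rates are assumptions for illustration and are not taken from the original slide.

```python
import math

def tone(freq_hz: float, gate_s: float, total_s: float,
         srate: int = 8000, krate: int = 100) -> list:
    """Sine wave multiplied by a gate that is 1 while the note is held, else 0."""
    out = []
    samples_per_kcycle = srate // krate
    n = 0
    for k in range(round(total_s * krate)):
        gate = 1.0 if k / krate < gate_s else 0.0    # control-rate gate
        for _ in range(samples_per_kcycle):           # audio-rate oscillator
            out.append(gate * math.sin(2 * math.pi * freq_hz * n / srate))
            n += 1
    return out

samples = tone(freq_hz=440.0, gate_s=0.25, total_s=0.5)
print(len(samples))
```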

14
SAOL Features
  • Rate semantics
      • i/k/a-rate execution
  • Vector arithmetic
      • ex: A = B + C => for i = 1..n: A[i] = B[i] + C[i]
  • All floating-point arithmetic.
  • Extensive built-in audio function library
      • signal generators, table operators, pitch
        converters, filters, fft, sample rate conversion,
        effects, ...
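
Vector arithmetic means that one statement on array operands expands into an element-by-element loop. A minimal Python sketch, using addition as an assumed example operator and a hypothetical helper name:

```python
def vec_add(b, c):
    """Element-wise sum of two equal-width vectors."""
    if len(b) != len(c):
        raise ValueError("operands must have the same width")
    return [bi + ci for bi, ci in zip(b, c)]

B = [1.0, 2.0, 3.0]
C = [0.5, 0.5, 0.5]
A = vec_add(B, C)   # one statement stands in for the explicit loop
print(A)
```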

15
Spectrum of implementations
Significant development/maintenance complexity
Zoia Alverti, EPFL, ICASSP 2001
ISO/IEC 14496-3 sec 5, reference implementation
16
Sfront - a SAOL-to-C translator
  • Converts MP4-SA files to an ANSI C program that,
    when executed, produces audio.
  • Runs on UNIX, Windows, MacOS.
  • Under Linux, supports real-time MIDI input,
    real-time audio input and output, and MIDI over
    RTP.
  • www.cs.berkeley.edu/lazzaro/sa

17
Generator Techniques
  • Much of the SA standard describes a library:
      • 104 core opcodes (ex: pow(), allpass(), reverb())
      • 16 wave table generators (ex: harm, spline,
        random)
  • Sfront optimizes the code produced for each
    library element instance based on the invocation
    attributes:
      • rate, width, size, constancy, integral nature of
        the parameters, number of parameters
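
To make the idea of per-invocation specialization concrete, here is a hypothetical Python sketch of a code generator that emits a C expression for pow(). It is not sfront's actual generator; the function name and the small-exponent cutoff of 4 are assumptions. It shows how constancy and the integral nature of a parameter could change the emitted code.

```python
def emit_pow(base_expr: str, exponent, exponent_is_constant: bool) -> str:
    """Return a C expression string for pow(), specialized where possible."""
    if exponent_is_constant and float(exponent).is_integer() and 0 < exponent <= 4:
        # Constant small positive integer exponent: unroll into multiplications.
        return "(" + " * ".join([base_expr] * int(exponent)) + ")"
    # General case: fall back to the C library call.
    return f"pow({base_expr}, {exponent})"

print(emit_pow("x", 2, True))      # specialized form
print(emit_pow("x", "y", False))   # exponent only known at run time
```

The same opcode name therefore maps to different generated code depending on what is known about each call site at translation time.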

18
Interesting Issues
  • MP4-SA puts emphasis on sound synthesis methods
    that can be described in a small amount of space.
      • Physical modeling: good
      • Sampling natural instruments: bad
  • If models are chosen carefully, compression
    ratios of 100 to 10,000 are possible.
  • Physical Modeling is relatively immature, but
    holds much promise.
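
To put ratios of 100 to 10,000 in concrete terms, the arithmetic below assumes the ratios are measured against uncompressed PCM and uses three minutes of 44.1 kHz, 16-bit stereo audio as the reference. Those reference parameters are illustrative assumptions, not figures from the slides.

```python
def pcm_bytes(seconds: float, srate: int = 44100, channels: int = 2,
              bytes_per_sample: int = 2) -> int:
    """Size of uncompressed PCM audio in bytes."""
    return int(seconds * srate * channels * bytes_per_sample)

raw = pcm_bytes(180)             # three minutes of uncompressed audio
size_at_100 = raw / 100          # compressed size at a ratio of 100
size_at_10000 = raw / 10_000     # compressed size at a ratio of 10,000
print(raw, size_at_100, size_at_10000)
```

Under these assumptions, roughly 32 MB of PCM would correspond to a program of a few hundred kilobytes at the low end of the range and a few kilobytes at the high end.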

19
Interesting Issues (cont.)
  • MP4-SA specifies that a decoder produces audio
    that sounds identical to computing the program
    accurately.
  • A new role for psychophysics:
      • Instead of using psychophysics to squeeze bits
        out of a sound representation, MP4-SA decoders
        will use psychophysics to squeeze FLOPS out of
        sound computations.
      • Leverage spectral and temporal masking.
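
A very crude sketch of "squeezing FLOPS" in Python: an additive synthesizer that skips partials far below the loudest one. The fixed 60 dB floor is an assumed stand-in for a real masking model, which would depend on frequency and time; the function names and partial values are hypothetical.

```python
import math

def audible_partials(partials, floor_db: float = 60.0):
    """Keep partials within floor_db of the loudest one (crude stand-in for masking)."""
    peak = max(amp for _, amp in partials)
    threshold = peak * 10 ** (-floor_db / 20)
    return [(f, a) for f, a in partials if a >= threshold]

def render(partials, n_samples: int, srate: int = 8000):
    """Sum of sine partials, one (frequency, amplitude) pair per partial."""
    return [sum(a * math.sin(2 * math.pi * f * n / srate) for f, a in partials)
            for n in range(n_samples)]

partials = [(220.0, 1.0), (440.0, 0.5), (660.0, 0.0001), (880.0, 0.00005)]
kept = audible_partials(partials)
full = render(partials, 80)      # computes every partial
cheap = render(kept, 80)         # computes only the partials that were kept
max_error = max(abs(x - y) for x, y in zip(full, cheap))
print(len(kept), max_error)
```

Here half of the sine evaluations are skipped while the output changes by far less than the amplitude of the quietest partial that was kept.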