Title: MPEG-4 Structured Audio

1
MPEG-4 Structured Audio
John Lazzaro, John Wawrzynek. June 18, 2001.
Modified by Francois Thibault, January 20, 2003.
CS Division, University of California at Berkeley
www.cs.berkeley.edu/johnw
2
MPEG 4 Standard
MPEG 4: audio, system, video
Audio, natural coding: AAC, T/F, CELP, Parametric
Audio, synthetic coding: SA, TTS
Structured Audio: one component in the MPEG audio standard.
ISO/IEC 14496-3 sec5
3
Audio Compression Basics

How well does this work?
Perceptually lossless: 10X-20X reduction (MP3, Dolby AC3)
True lossless: 2.5X reduction (Shorten, T. Robinson, Cambridge University)
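To put the reduction factors above in concrete terms, here is a rough Python calculation. It assumes CD-quality stereo PCM (44.1 kHz, 16 bits, 2 channels) as the uncompressed baseline; the slide does not name a baseline, so treat the numbers as illustrative only.

```python
# Assumed baseline (not stated on the slide): CD-quality stereo PCM.
SAMPLE_RATE_HZ = 44100
BITS_PER_SAMPLE = 16
CHANNELS = 2
PCM_BITS_PER_SECOND = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS


def compressed_kbps(reduction):
    """Bit rate in kbit/s after dividing the PCM bit rate by a reduction factor."""
    return PCM_BITS_PER_SECOND / reduction / 1000


# Reduction factors quoted on the slide.
for label, reduction in [("perceptually lossless, 10X", 10),
                         ("perceptually lossless, 20X", 20),
                         ("true lossless, 2.5X", 2.5)]:
    print(f"{label}: {compressed_kbps(reduction):.1f} kbit/s")
```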

4
The Kolmogorov alternative

Write a computer program that generates the
desired audio stream.
Transmit the computer program.
To decode, execute the program.

Similar to PostScript!

MPEG-4 Structured Audio (MP4-SA) uses this
approach.
Eric Scheirer, Editor (MIT Media Lab).
http://sound.media.mit.edu/eds/mpeg4/
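The idea of transmitting a program instead of samples can be sketched in Python. This is only an analogy: the "program" here is a hypothetical snippet of Python source standing in for SAOL, and the tone parameters (440 Hz, 2 seconds, 44.1 kHz) are arbitrary assumptions. The sketch decodes by executing the program, then compares the size of the program text with the size of the 16-bit PCM audio it generates.

```python
import math
import struct

# Hypothetical "transmitted program": Python source text standing in for SAOL.
PROGRAM = """
def render(sample_rate=44100, seconds=2.0, freq=440.0):
    n = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]
"""


def decode(program_text):
    """Decode by executing the program and collecting the samples it generates."""
    namespace = {"math": math}
    exec(program_text, namespace)
    return namespace["render"]()


def to_pcm16(samples):
    """Pack floating-point samples in [-1, 1] as 16-bit little-endian PCM."""
    return b"".join(struct.pack("<h", int(round(s * 32767))) for s in samples)


samples = decode(PROGRAM)
pcm = to_pcm16(samples)
program_bytes = len(PROGRAM.encode("utf-8"))
ratio = len(pcm) / program_bytes
print(f"program: {program_bytes} bytes, audio: {len(pcm)} bytes, ratio: {ratio:.0f}x")
```

The ratio grows with the duration of the sound, since the program stays the same size while the audio it generates gets longer.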

5
MP4-SA Encoding

Encoding may be a creative act: writing a program,
directly (emacs), or
indirectly (GUI, webpage).
In this case, MP4-SA is a lossless compressor.
Encoding may be automatic: given a sound, an encoder
writes a program that generates the sound.
Automatic encoding is hard in the general
case.

6
Key Application: Music Production

Modern music production is computer-based.
Musicians enter performances into computers as
control information, not audio waveforms.
Digital synthesizers, effects, and mixes create
the final audio, under engineer/producer control.

7
Key Application: Music Production

Ideal for collaborative productions, remixes, and
...
8
Key Application: Music Performance

Music Performance requires dynamic control.
True interactivity requires parameterized sounds.
Musicians control instruments and effects with
interactive controllers.
Control could be indirect and remote (e.g., games).

9
MPEG 4 Structured Audio

A binary file format that encodes:
The programming language SAOL (pronounced "sail").
The musical score language SASL.
Legacy support for MIDI.
Audio sample data.
The result is normative: an MP4-SA file will sound
identical on all compliant decoders.
This is different from MIDI files.

10
Why SAOL and MP4-SA? Why not Java?

Musical performances have temporal structure that
changes over several timescales.

Writing sound generation code in a conventional
language results in code dominated by time-scale
management.
Hard to maintain, hard to optimize.

11
Time management is built into SAOL.

A SAOL program executes by moving a simulated
clock forward in time, performing calculations
along the way in a synchronous fashion.
Work is scheduled to happen:
at the a-rate (the audio sample rate),
at the k-rate (the envelope control rate),
at the i-rate (the rate for new notes).
Language variables are typed as a/k/i-rate.
A language statement is scheduled based on the
rate of the variables it contains.
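The three rates can be illustrated with a rough Python sketch of the kind of loop nest a conventional language would need. This is not SAOL and not how any particular decoder is built; the sample rate, control rate, and the linear decay envelope are assumptions picked for the example. The counters record how often the code at each rate runs for a single note.

```python
import math

SAMPLE_RATE = 8000       # a-rate: audio samples per second (assumed value)
CONTROL_RATE = 100       # k-rate: control updates per second (assumed value)
SAMPLES_PER_K = SAMPLE_RATE // CONTROL_RATE


def run_note(freq, duration):
    """Render one note and count how many times each rate's code executes."""
    # i-rate: runs once, when the note starts.
    phase_step = 2 * math.pi * freq / SAMPLE_RATE
    total_k = int(duration * CONTROL_RATE)
    counts = {"i": 1, "k": 0, "a": 0}
    out = []
    phase = 0.0
    for k in range(total_k):
        # k-rate: runs once per control period (a linear decay envelope here).
        env = 1.0 - k / total_k
        counts["k"] += 1
        for _ in range(SAMPLES_PER_K):
            # a-rate: runs once per audio sample.
            out.append(env * math.sin(phase))
            phase += phase_step
            counts["a"] += 1
    return out, counts


out, counts = run_note(freq=440.0, duration=0.5)
print(counts)
```

In SAOL the programmer declares the rate of each variable and the scheduling follows from that; in the sketch the same structure has to be spelled out by hand as nested loops.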

12
SAOL, SASL, and Scheduling

Sound creation in MP4-SA can be compared to a
musician playing notes on an instrument.
A SAOL subprogram (called an instr or instrument)
serves as the instrument.
SASL commands (called score lines) act to play
notes on SAOL instruments.
Many instances of a SAOL instr can be active at
one time, making sounds corresponding to notes
launched by different score lines in a SASL file.
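The relationship between score lines and instrument instances can be sketched in Python (not SAOL or SASL). The ScoreLine fields, the const_instr instrument, and the 10 Hz sample rate are all hypothetical, chosen so that the overlap between two notes is easy to see; real SASL syntax is not shown here.

```python
from dataclasses import dataclass


@dataclass
class ScoreLine:
    """Hypothetical score line: start an instance of `instr` at `start` seconds."""
    start: float
    instr: str
    duration: float
    amp: float


def const_instr(amp, n):
    """Hypothetical instrument: n samples of a constant level."""
    return [amp] * n


INSTRUMENTS = {"const": const_instr}


def render(score, sample_rate=10):
    """Launch one instrument instance per score line and mix the results."""
    end = max(line.start + line.duration for line in score)
    out = [0.0] * int(round(end * sample_rate))
    for line in score:
        first = int(round(line.start * sample_rate))
        n = int(round(line.duration * sample_rate))
        note = INSTRUMENTS[line.instr](line.amp, n)
        for i, sample in enumerate(note):
            out[first + i] += sample   # overlapping instances are summed
    return out


# Two notes on the same instrument, overlapping between 0.5 s and 1.0 s.
score = [ScoreLine(0.0, "const", 1.0, 0.5),
         ScoreLine(0.5, "const", 1.0, 0.25)]
mix = render(score)
print(mix)
```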

13
An example

SAOL instrument tone, that plays a gated sine
wave. (SAOL code in next slide.)
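The SAOL code itself is not part of this transcript. As a stand-in, here is a minimal Python sketch of a gated sine wave, assuming that "gated" means the sine is simply switched on for the duration of the note and off afterwards. The function name, parameters, and sample rate are hypothetical.

```python
import math


def gated_sine(freq, gate_seconds, total_seconds, sample_rate=8000):
    """Sine wave multiplied by a gate that is 1 while the note is on, else 0."""
    gate_samples = int(gate_seconds * sample_rate)
    total_samples = int(total_seconds * sample_rate)
    out = []
    for i in range(total_samples):
        gate = 1.0 if i < gate_samples else 0.0
        out.append(gate * math.sin(2 * math.pi * freq * i / sample_rate))
    return out


samples = gated_sine(freq=1000.0, gate_seconds=0.5, total_seconds=1.0)
print(len(samples), max(samples[:4000]), max(samples[4000:]))
```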

14
SAOL Features

Rate semantics:
i/k/a-rate execution.
Vector arithmetic:
ex: A = B op C is evaluated as: for i = 1..n, A[i] = B[i] op C[i]
All floating-point arithmetic.
Extensive built-in audio function library:
signal generators, table operators, pitch
converters, filters, fft, sample rate conversion,
effects, ...

15
Spectrum of implementations
Significant development maintenance complexity
Zoia Alverti, EPFL, ICASSP 2001
ISO/IEC 14496-3 sec 5, reference implementation
16
Sfront - a SAOL-to-C translator

Converts MP4-SA files to an ANSI C program that,
when executed, produces audio.

Runs on UNIX, Windows, MacOS.
Under Linux, supports real-time MIDI input,
real-time audio input and output, and MIDI over
RTP.
www.cs.berkeley.edu/lazzaro/sa

17
Generator Techniques

Much of the SA standard describes a library:
104 core opcodes (ex: pow(), allpass(), reverb())
16 wave table generators (ex: harm, spline,
random)
Sfront optimizes the code produced for each
library element instance based on the invocation
attributes:
rate, width, size, constancy, integral nature of
the parameters, number of parameters.

18
Interesting Issues

MP4-SA puts emphasis on sound synthesis methods
that can be described in a small amount of space.
Physical Modeling: good
Sampling Natural Instruments: bad
If models are chosen carefully, compression
ratios of 100 to 10,000 are possible.
Physical Modeling is relatively immature, but
holds much promise.

19
Interesting Issues (cont.)

MP4-SA specifies that a decoder produces audio
that sounds identical to computing the program
accurately.
A new role for psychophysics:
Instead of using psychophysics to squeeze bits
out of a sound representation, MP4-SA decoders
will use psychophysics to squeeze FLOPS out of
sound computations.
Leverage spectral and temporal masking.
