Information Theory and Perceptual Audio Coding

Provided by: dannym9
Transcript and Presenter's Notes

1
Information Theory and Perceptual Audio Coding
  • SC500
  • Danny Morris

2
Introduction
  • Need to compress audio data
  • CD audio: 1.41/1.52 Mbit/s
  • Lossless compression alone is not good enough
  • Audio signals are hard to predict
  • They contain few statistical redundancies
  • Psychoacoustics can help
  • Variable bit allocation for frequency bands
  • Masking properties of the human ear
  • No coding of irrelevant (inaudible) information
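
The 1.41 Mbit/s figure for CD audio follows from simple arithmetic. A minimal sketch, assuming the standard CD parameters (44.1 kHz sampling rate, 16 bits per sample, 2 channels), which the slide does not spell out; the function name is illustrative:

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Raw (uncompressed) PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

# Assumed CD parameters: 44.1 kHz, 16-bit, stereo.
cd_rate = pcm_bit_rate(44_100, 16, 2)
print(f"CD audio: {cd_rate / 1e6:.2f} Mbit/s")
```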

3
Audio Coding Overview
4
Psychoacoustic Model
  • Models the ear's perceptual properties
  • Absolute Hearing Threshold
  • Critical Bands
  • Noise and Tonal Maskers
  • Global Masking Threshold
  • Allows signals to be compressed by discarding
    inaudible information

5
Absolute Hearing Threshold
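
The slide itself only shows a figure. As a rough illustration, the threshold in quiet is often approximated by Terhardt's closed-form fit, in dB SPL as a function of frequency. The sketch below assumes that approximation; it is not taken from the presentation, and the function name is illustrative.

```python
import math

def absolute_threshold_db_spl(freq_hz):
    """Approximate threshold in quiet (dB SPL), using Terhardt's fit."""
    f_khz = freq_hz / 1000.0
    return (3.64 * f_khz ** -0.8
            - 6.5 * math.exp(-0.6 * (f_khz - 3.3) ** 2)
            + 1e-3 * f_khz ** 4)

# The ear is most sensitive around 3 to 4 kHz and much less so at low frequencies.
for f in (100, 1000, 3300, 10000):
    print(f"{f:>6} Hz: {absolute_threshold_db_spl(f):6.2f} dB SPL")
```

Any signal component that falls below this curve can be dropped without an audible effect, regardless of masking.
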
6
Critical Bands
  • The ear as a signal processing filter
  • Transform frequency into position
  • Bank of highly overlapping band pass filters
  • Asymmetric and nonlinear
  • Critical bands
  • Due to physics of the cochlea
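
The frequency-to-position mapping above is commonly expressed on the Bark scale. A minimal sketch, assuming Zwicker's analytic approximations for the Bark value and the critical bandwidth; these formulas are not given in the slides, and the function names are illustrative:

```python
import math

def hz_to_bark(freq_hz):
    """Critical-band rate (Bark) for a frequency in Hz (Zwicker's approximation)."""
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)

def critical_bandwidth_hz(freq_hz):
    """Approximate critical bandwidth in Hz around a center frequency in Hz."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (freq_hz / 1000.0) ** 2) ** 0.69

for f in (100, 1000, 5000):
    print(f"{f} Hz -> {hz_to_bark(f):.2f} Bark, bandwidth {critical_bandwidth_hz(f):.0f} Hz")
```

The bandwidth stays near 100 Hz at low frequencies and grows with frequency, which is why a coder can use coarser frequency resolution in the upper bands.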

7
Noise and Tonal Masking
  • Noise masking Tone
  • Signal-to-mask ratio: -5 dB
  • Tone masking Noise
  • Signal-to-mask ratio: 21 to 28 dB

8
Masking Spreading Function
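
The slide shows the spreading function only as a figure. One widely used analytic form is Schroeder's spreading function, expressed in dB over the distance in Bark between masker and maskee. The sketch below assumes that form and combines it with a signal-to-mask ratio offset; the function names and the example values are illustrative, not taken from the presentation.

```python
import math

def spreading_db(delta_bark):
    """Schroeder spreading function in dB.

    delta_bark is the maskee position minus the masker position, in Bark.
    """
    x = delta_bark + 0.474
    return 15.81 + 7.5 * x - 17.5 * math.sqrt(1.0 + x * x)

def masked_threshold_db(masker_level_db, delta_bark, smr_db):
    """Threshold produced by one masker: level, spread across bands, minus the SMR."""
    return masker_level_db + spreading_db(delta_bark) - smr_db

# Masking reaches further toward higher frequencies than toward lower ones.
for dz in (-2, -1, 0, 1, 2):
    print(f"{dz:+d} Bark: {spreading_db(dz):7.2f} dB")
```

With an assumed 80 dB tonal masker and a 24 dB signal-to-mask ratio, `masked_threshold_db(80, 0, 24)` gives a threshold of roughly 56 dB in the masker's own band.
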
9
Global Masking Threshold
Step 1: Identify noise and tonal maskers
(Figure legend: X = tonal masker, O = noise masker)
10
Global Masking Threshold
Step 2: Apply tonal and noise spreading functions
11
Global Masking Threshold
Step 3: Take the maximum of the spread masking thresholds and
the absolute hearing threshold as the global masking threshold
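
Step 3 can be sketched as a per-bin maximum. The function below assumes every threshold is already given in dB on the same frequency grid; the names and sample values are illustrative. Some psychoacoustic models add the contributions in the power domain instead; the maximum is used here because that is what the slide describes.

```python
def global_masking_threshold(masker_thresholds_db, absolute_threshold_db):
    """Per-bin maximum over all masker thresholds and the threshold in quiet.

    masker_thresholds_db: one list of dB values per masker, one value per bin.
    absolute_threshold_db: dB values of the threshold in quiet, one per bin.
    """
    return [max(values) for values in zip(absolute_threshold_db, *masker_thresholds_db)]

quiet = [20.0, 5.0, 0.0, 10.0]                         # made-up threshold in quiet
maskers = [[30.0, 12.0, -5.0, -20.0], [-10.0, 8.0, 15.0, 4.0]]
print(global_masking_threshold(maskers, quiet))
```
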
12
Time to Frequency Analysis
  • Converts windowed time-domain samples into frequency-domain information
  • Uses critically sampled perfect reconstruction
    filter banks
  • Polyphase Quadrature Mirror Filters (PQMF)
  • Modified Discrete Cosine Transform (MDCT)
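
To make "critically sampled perfect reconstruction" concrete, here is a direct (slow, O(N^2)) MDCT sketch. It assumes a sine window, 50% overlap, and a 2/N scaling in the inverse transform; real coders use fast algorithms and other window shapes. All names are illustrative.

```python
import math

def sine_window(half):
    """Sine window of length 2*half; it satisfies the Princen-Bradley condition."""
    return [math.sin(math.pi * (n + 0.5) / (2 * half)) for n in range(2 * half)]

def mdct(frame):
    """Map 2*half windowed samples to half coefficients (critical sampling)."""
    half = len(frame) // 2
    shift = 0.5 + half / 2
    return [
        sum(frame[n] * math.cos(math.pi / half * (n + shift) * (k + 0.5))
            for n in range(2 * half))
        for k in range(half)
    ]

def imdct(coeffs):
    """Map half coefficients back to 2*half time-aliased samples."""
    half = len(coeffs)
    shift = 0.5 + half / 2
    return [
        (2.0 / half) * sum(coeffs[k] * math.cos(math.pi / half * (n + shift) * (k + 0.5))
                           for k in range(half))
        for n in range(2 * half)
    ]

def analyze_and_resynthesize(signal, half):
    """Window, transform, inverse transform, window again, and overlap-add."""
    window = sine_window(half)
    padded = [0.0] * half + list(signal) + [0.0] * half
    padded += [0.0] * (-len(padded) % half)      # pad to a whole number of hops
    output = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * half + 1, half):
        frame = [padded[start + n] * window[n] for n in range(2 * half)]
        restored = imdct(mdct(frame))
        for n in range(2 * half):
            output[start + n] += restored[n] * window[n]
    return output[half:half + len(signal)]

signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(40)]
rebuilt = analyze_and_resynthesize(signal, half=8)
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
print(f"max reconstruction error: {max_error:.2e}")
```

Each hop of 8 new samples produces 8 coefficients, so no extra data is created, and the time-domain aliasing in each block cancels in the overlap-add.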

13
Bit Allocation
  • Minimize the error $d(x, \hat{x})$ as well as
    $\mathrm{SMR} - \mathrm{SNR}$ in each frequency band
  • Leads to a bit allocation of (equation missing from this transcript)
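
The allocation formula on this slide did not survive in the transcript. As a stand-in, the sketch below shows one common greedy strategy, which is not necessarily the author's: repeatedly give one more bit to the band with the largest noise-to-mask ratio, SMR minus SNR. It assumes the rule of thumb that each bit of a uniform quantizer adds about 6.02 dB of SNR; all names and values are illustrative.

```python
DB_PER_BIT = 6.02  # approximate SNR gain per quantizer bit (assumption)

def allocate_bits(smr_db, bit_budget, max_bits_per_band=15):
    """Greedy allocation: one bit at a time to the band with the largest SMR - SNR."""
    bits = [0] * len(smr_db)
    remaining = bit_budget
    while remaining > 0:
        nmr = [smr - DB_PER_BIT * b for smr, b in zip(smr_db, bits)]
        candidates = [i for i in range(len(bits)) if bits[i] < max_bits_per_band]
        if not candidates:
            break
        worst = max(candidates, key=lambda i: nmr[i])
        if nmr[worst] <= 0:          # quantization noise is masked in every band
            break
        bits[worst] += 1
        remaining -= 1
    return bits

print(allocate_bits([20.0, 5.0, -3.0], bit_budget=5))
```

A band whose SMR is already negative receives no bits at all, because its content lies below the masking threshold.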

14
Perceptual Entropy
  • Evaluates the rate-distortion function where the distortion $D_i$
    in each frequency band is not above the global
    masking threshold
  • Lower limit on the number of bits required to
    code an audio sample without perceptual loss
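
A simplified way to picture this bound is the Gaussian rate-distortion function, with the masking threshold as the allowed distortion in each band. The sketch below assumes independent Gaussian coefficients per band and is not the exact perceptual entropy estimator used in the literature; the names and numbers are illustrative.

```python
import math

def rate_bits_per_sample(variances, thresholds, counts):
    """Average bits per coefficient when band i may carry distortion up to thresholds[i].

    variances:  signal power per band (linear units)
    thresholds: masking threshold per band (same linear units)
    counts:     number of coefficients in each band
    """
    total_bits = 0.0
    for variance, threshold, count in zip(variances, thresholds, counts):
        # Gaussian rate-distortion function: R = max(0, 0.5 * log2(variance / D)).
        total_bits += count * max(0.0, 0.5 * math.log2(variance / threshold))
    return total_bits / sum(counts)

# Third band: the signal is below the threshold, so it costs no bits.
print(rate_bits_per_sample([16.0, 4.0, 1.0], [1.0, 1.0, 4.0], [1, 1, 1]))
```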

15
MPEG-1 Layers 1, 2, 3
  • Layers 1 and 2
  • 512-tap polyphase quadrature mirror filter (PQMF)
  • 32 frequency bands
  • Linear quantization, 12-sample block coding
  • Layer 3
  • 18-point modified discrete cosine transform
    (MDCT) after the PQMF
  • 576 frequency bands
  • Nonuniform quantization, Huffman coding
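
The band counts above translate directly into frequency resolution. A small sketch, assuming a 44.1 kHz sampling rate (MPEG-1 also supports 32 and 48 kHz) and uniformly spaced bands; the function name is illustrative:

```python
def filterbank_resolution(sample_rate_hz, subbands, lines_per_subband=1):
    """Return (number of frequency lines, width of each line in Hz)."""
    lines = subbands * lines_per_subband
    return lines, (sample_rate_hz / 2) / lines

print("Layers 1 and 2:", filterbank_resolution(44_100, 32))
print("Layer 3:       ", filterbank_resolution(44_100, 32, 18))
```

The 18-point MDCT splits each of the 32 PQMF subbands into 18 lines, which gives the 576 bands and the much finer resolution that Layer 3 uses to follow the critical bands at low frequencies.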

16
Conclusion
  • PE tells us it's possible to compress audio to 2.1
    bits per sample
  • Audio coding minimizes the mutual information
    between the source and the quantized samples
  • Subject to the constraint that the distortion cannot
    produce a signal-to-noise ratio smaller than the
    signal-to-mask ratio
  • Compression is achieved with little to no
    perceptual loss of information
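
To put the 2.1 bits per sample figure in context, the sketch below compares it with 16-bit PCM. It assumes a 44.1 kHz sampling rate and a single channel, neither of which is stated on the slide; the variable names are illustrative.

```python
PCM_BITS_PER_SAMPLE = 16      # assumed CD-style PCM word length
PE_BITS_PER_SAMPLE = 2.1      # figure quoted in the conclusion
SAMPLE_RATE_HZ = 44_100       # assumed sampling rate

compression_ratio = PCM_BITS_PER_SAMPLE / PE_BITS_PER_SAMPLE
bit_rate_per_channel = PE_BITS_PER_SAMPLE * SAMPLE_RATE_HZ

print(f"compression ratio: about {compression_ratio:.1f} to 1")
print(f"bit rate: about {bit_rate_per_channel / 1000:.1f} kbit/s per channel")
```
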