Title: W.A.V.S. Compression


1
W.A.V.S. Compression
  • Alex Chen
  • Nader Shehad
  • Aamir Virani
  • Erik Welsh

2
Overview
  • Approach
  • Psychoacoustic Modeling
  • Filter Banks
  • Quantization
  • Demonstration
  • Results
  • Further Research

3
Approach
  • Encoding: Input → Filter Banks → Quantization → Encoded Signal
    (quantization guided by the Psychoacoustic Model)
  • Decoding: Encoded Signal → Inverse Quantization → Reconstruction
    Filter Banks → Output

4
Psychoacoustic Model
  • Based on studies showing that hearing capabilities are
    affected by
    • Environment
    • Limitations of the human auditory system
  • Used to eliminate portions of the signal that the average
    human won't hear
  • Two key properties
    • Absolute threshold of hearing
    • Auditory masking

5
Absolute Threshold of Hearing
  • Experiment
    • Plot the audible threshold of a tone against frequency
  • Observations
    • Auditory system is more sensitive to some frequencies than others
    • Frequencies within a critical bandwidth are treated
      similarly
    • Basis for the Bark scale
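
The slide gives no formulas for the threshold curve or the Bark scale. As a rough sketch, two widely used closed-form approximations (Terhardt's fit to the threshold in quiet and Zwicker's Hz-to-Bark mapping) can be evaluated as follows. Both formulas and the function names are assumptions brought in for illustration; they are not taken from the presentation.

```python
import math

def threshold_in_quiet_db(f_hz):
    """Approximate absolute threshold of hearing in dB SPL (Terhardt's fit)."""
    khz = f_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)

def hz_to_bark(f_hz):
    """Approximate critical-band rate in Bark (Zwicker's formula)."""
    return (13.0 * math.atan(0.00076 * f_hz)
            + 3.5 * math.atan((f_hz / 7500.0) ** 2))

if __name__ == "__main__":
    # The threshold dips around 3 to 4 kHz, where hearing is most sensitive.
    for f in (100, 1000, 3300, 10000):
        print(f"{f:>6} Hz  threshold ~ {threshold_in_quiet_db(f):6.1f} dB SPL"
              f"  position ~ {hz_to_bark(f):4.1f} Bark")
```

Any component that falls below this curve is a candidate for removal, since it would be inaudible even in silence.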

6
Auditory Masking
  • Tones and noise drown out less powerful sounds
    • Affect neighboring frequencies
    • Affect the critical bandwidth
  • Effects add to produce an overall masking threshold
  • The masking threshold is used to hide quantization noise
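
The slide says the individual effects "add" but does not say how. One common convention, assumed here, is to sum the individual thresholds as powers rather than as decibel values. The function name and the sample values are hypothetical.

```python
import math

def combine_thresholds_db(thresholds_db):
    """Combine thresholds (in dB) by adding them in the power domain."""
    total_power = sum(10.0 ** (t / 10.0) for t in thresholds_db)
    return 10.0 * math.log10(total_power)

if __name__ == "__main__":
    # Hypothetical values at one frequency: threshold in quiet plus two maskers.
    quiet_db, tone_mask_db, noise_mask_db = 5.0, 40.0, 43.0
    overall = combine_thresholds_db([quiet_db, tone_mask_db, noise_mask_db])
    print(f"overall masking threshold ~ {overall:.1f} dB")
```

Under this convention the loudest masker dominates: two equal maskers raise the threshold by only about 3 dB, not by a factor of two in dB.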

7
Filter Banks Theory
  • Array of bandpass filters
  • Break up signal into frequency subbands
  • Allows for variable coding scheme

8
Analysis and Synthesis Banks
  • 1) Analysis filters divide up the signal
  • 2) Down-sample
  • 3) Quantize
  • 4) Up-sample
  • 5) Synthesis filters remove the distortions (aliasing and imaging)
    introduced by resampling
  • 6) Reconstruct the signal
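
The six steps can be sketched with the smallest possible case, a two-band Haar filter bank. This is a stand-in chosen for brevity, not the 32-band bank the authors used; the filter choice, the `step` parameter, and all function names are assumptions.

```python
import math

S = 1 / math.sqrt(2)
H0, H1 = [S, S], [S, -S]   # analysis low-pass / high-pass (Haar)
G0, G1 = [S, S], [-S, S]   # synthesis filters chosen so aliasing cancels

def fir(x, h):
    """Causal FIR filter: y[n] = sum_k h[k] * x[n-k], same length as x."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def downsample(x):
    return x[::2]                      # keep every second sample

def upsample(x):
    out = [0.0] * (2 * len(x))
    out[::2] = x                       # insert a zero after each sample
    return out

def quantize(x, step):
    """Uniform rounding to multiples of step; step=None means no quantization."""
    return [round(v / step) * step for v in x] if step else list(x)

def codec(x, step=None):
    # 1) analysis filters, 2) down-sample
    low = downsample(fir(x, H0))
    high = downsample(fir(x, H1))
    # 3) quantize each subband
    low, high = quantize(low, step), quantize(high, step)
    # 4) up-sample, 5) synthesis filters
    low_up = fir(upsample(low), G0)
    high_up = fir(upsample(high), G1)
    # 6) reconstruct by summing the two branches
    return [a + b for a, b in zip(low_up, high_up)]

if __name__ == "__main__":
    x = [0.3, -0.1, 0.8, 0.5, -0.7, 0.2]
    print(codec(x))             # the input delayed by one sample
    print(codec(x, step=0.25))  # same, plus quantization error
```

Without quantization the output equals the input delayed by one sample, so any remaining error comes from step 3 alone.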

9
Filter Bank Design
  • Phase
  • Tradeoff between fine and coarse frequency
    resolution
    • Piccolo vs. Castanets
  • Non-stationary signals
  • We used a non-adaptive approach
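
The tradeoff can be made concrete for a uniform, critically sampled bank: more bands give narrower subbands (good for a steady tone such as a piccolo) but a longer interval between subband samples (bad for a sharp transient such as castanets). The sketch below assumes a 44.1 kHz sample rate, which the slides do not state; the function name is hypothetical.

```python
def resolution(num_bands, sample_rate_hz):
    """Return (subband width in Hz, time between subband samples in ms)
    for a uniform bank whose subbands are down-sampled by num_bands."""
    band_hz = (sample_rate_hz / 2) / num_bands
    hop_ms = 1000.0 * num_bands / sample_rate_hz
    return band_hz, hop_ms

if __name__ == "__main__":
    for m in (8, 32, 512):
        band_hz, hop_ms = resolution(m, 44100)  # 44.1 kHz is an assumption
        print(f"{m:>4} bands: {band_hz:8.1f} Hz wide, one sample every {hop_ms:6.2f} ms")
```

A non-adaptive design fixes one point on this curve for the whole signal.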

10
Filter Bank Implementation
  • We used cosine-modulated PR (perfect reconstruction) filter banks
    with 32 filters each
  • Output is a delayed version of the input (linear
    phase)
  • Distortion arises from quantization only
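
The slides do not give the authors' filter design. As an illustration of the same idea, the sketch below uses the MDCT with a sine window, one well-known member of the cosine-modulated perfect-reconstruction family, with 32 bands. The choice of MDCT, the window, and all names are assumptions.

```python
import math

M = 32  # number of subbands, as on the slide; each frame spans 2*M samples

# Sine window: satisfies w[n]**2 + w[n+M]**2 == 1, needed for reconstruction.
WINDOW = [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]
# Cosine modulation table, COS[k][n] for band k and sample n.
COS = [[math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
        for n in range(2 * M)] for k in range(M)]

def analysis(x):
    """Return a list of frames, each holding M subband coefficients."""
    padded = [0.0] * M + list(x) + [0.0] * M
    padded += [0.0] * (-len(padded) % M)        # round length up to a multiple of M
    frames = []
    for start in range(0, len(padded) - M, M):  # 50% overlapping frames
        block = [padded[start + n] * WINDOW[n] for n in range(2 * M)]
        frames.append([sum(block[n] * COS[k][n] for n in range(2 * M))
                       for k in range(M)])
    return frames

def synthesis(frames, length):
    """Inverse transform with windowed overlap-add; returns `length` samples."""
    out = [0.0] * ((len(frames) + 1) * M)
    for i, coeffs in enumerate(frames):
        for n in range(2 * M):
            y = (2.0 / M) * sum(coeffs[k] * COS[k][n] for k in range(M))
            out[i * M + n] += y * WINDOW[n]
    return out[M:M + length]                    # drop the leading padding

if __name__ == "__main__":
    x = [math.sin(0.05 * n) + 0.3 * math.cos(0.9 * n) for n in range(200)]
    y = synthesis(analysis(x), len(x))
    print("max reconstruction error:", max(abs(a - b) for a, b in zip(x, y)))
```

With unquantized coefficients the reconstruction error is at the level of floating-point rounding, so rounding the values in `frames` before synthesis would be the only source of distortion.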

11
Quantization
  • Two types
    • Narrow-band
      • Depends on current input
      • Overhead cost
    • Full-range
      • Independent of current input
      • No overhead

[Figure: Sampled Input, Quantized Version, Reconstructed Input]

12
Quantization
  • Narrow Band
    • More accurate
    • Lower compression ratio
  • Full-Range
    • Less accurate
    • Higher compression ratio
  • Using 3-bit Quantization
    • Narrow band
        Input        -.4   -.22   .14   .4
        Levels        1     3     6     8
        Recon.       -.4   -.2    .1    .3
        Total Error   .16
    • Full range
        Input        -.4   -.22   .14   .4
        Levels        3     4     6     7
        Recon.       -.5   -.25   .25   .50
        Total Error   .34
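
Both 3-bit examples on this slide can be reproduced with one uniform quantizer applied over two different ranges. The sketch assumes that the example with total error .16 (narrow band) spreads its 8 levels from the block's minimum to its maximum, and that the example with total error .34 (full range) spreads them over -1 to 1. That reading matches the slide's numbers but is an interpretation; the function names are hypothetical.

```python
def quantize(samples, lo, hi, bits=3):
    """Uniform quantizer with 2**bits levels starting at lo, spaced (hi-lo)/2**bits.

    Returns (1-based level indices, reconstructed values)."""
    n_levels = 2 ** bits
    step = (hi - lo) / n_levels
    levels, recon = [], []
    for x in samples:
        k = round((x - lo) / step)           # nearest level, 0-based
        k = max(0, min(n_levels - 1, k))     # clip to the available levels
        levels.append(k + 1)
        recon.append(lo + k * step)
    return levels, recon

def total_error(samples, recon):
    return sum(abs(x - r) for x, r in zip(samples, recon))

samples = [-0.4, -0.22, 0.14, 0.4]

# Narrow band: range taken from the current input (must be sent as overhead).
narrow_levels, narrow_recon = quantize(samples, min(samples), max(samples))
# Full range: fixed range, independent of the input (no overhead).
full_levels, full_recon = quantize(samples, -1.0, 1.0)

if __name__ == "__main__":
    print(narrow_levels, round(total_error(samples, narrow_recon), 2))  # [1, 3, 6, 8] 0.16
    print(full_levels, round(total_error(samples, full_recon), 2))      # [3, 4, 6, 7] 0.34
```

The narrow-band quantizer halves the error here because its step is 0.1 instead of 0.25, at the cost of transmitting the range alongside the data.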

13
Demonstration
  • Sine wave
    • Full range
    • Narrow range
  • Chime
    • 8-bit
    • Full range
    • Narrow range
  • Percussion
    • Full range
    • Narrow range
  • Modern
    • 8-bit
    • Full range
    • Narrow range

14
Sine Wave (time)
  • Full-Range Quantization
  • Narrow Quantization

15
Sine Wave (freq)
  • Full-Range Quantization
  • Narrow Quantization

16
Sine Wave (freq error)
  • Full-Range Quantization
  • Narrow Quantization

17
Modern (time)
  • Full-Range Quantization
  • Narrow Quantization

18
Modern (freq)
  • Full-Range Quantization
  • Narrow Quantization

19
Modern (freq error)
  • Full-Range Quantization
  • Narrow Quantization

20
Results
  • Full Range: Smallest File, Worst Sound Quality
  • Narrow Range: Better Sound Quality, Larger File
  • MP3: Industry Standard

21
Further Research
  • Filter Banks
  • Wavelets
  • Dynamic Frequency Ranges
  • Better Psychoacoustic Model
  • Tone Designation
  • Pre- and Post-Echo
  • Bit Allocation
  • Writing a File