VoIP Seminar - PowerPoint PPT Presentation

Provided by: audemi

Transcript and Presenter's Notes

Title: VoIP Seminar


1
Speech Coding for VoIP
  • Konsta Koppinen
  • Signal Processing Laboratory
  • konsta.koppinen_at_tut.fi

2
Why Speech Coding?
  • The purpose of coding is to reduce the required
    bit rate
  • For speech, a compression ratio of 8:1 is possible
    with little degradation in quality
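As a rough check of the compression ratio quoted above, the sketch below assumes the uncompressed reference is 64 kbps telephone-band PCM (8 kHz sampling, 8 bits per sample, as in G.711) and the coded rate is 8 kbps (as in G.729). Both rates are assumptions chosen for illustration; the slide does not state them.

```python
# Illustrative arithmetic only; the rates below are assumptions, not slide content.
SAMPLE_RATE_HZ = 8000      # telephone-band sampling rate
BITS_PER_SAMPLE = 8        # G.711-style companded PCM
CODED_RATE_BPS = 8000      # e.g. a G.729-class coder


def bit_rate_bps(sample_rate_hz, bits_per_sample):
    """Bit rate of plain sample-by-sample coding."""
    return sample_rate_hz * bits_per_sample


def compression_ratio(reference_bps, coded_bps):
    """How many times smaller the coded stream is than the reference."""
    return reference_bps / coded_bps


pcm_rate = bit_rate_bps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE)
ratio = compression_ratio(pcm_rate, CODED_RATE_BPS)
print(f"{pcm_rate} bps -> {CODED_RATE_BPS} bps is {ratio:.0f}:1")
```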

3
Important Coder Attributes in VoIP
  • Delay caused by coding (typically < 50 ms)
  • Bit rate (typically 4-64 kbps)
  • Coder complexity (usually < 30 DSP MIPS)
  • Compatibility with existing codecs
    • international standards of ITU-T, e.g. G.711,
      G.723, G.729
    • Nokia recommends using mobile standards

4
Choice of Coder Structure
  • Different coder structures exist
    • waveform coding
    • vocoding
    • hybrid coding
  • Currently linear prediction-based analysis by
    synthesis (LPAS) coders offer best performance

5
Basic Idea of LPAS Coders
  • We have a coder with various parameters
  • Try synthesizing signals with different parameter
    values
  • Pick the best one
  • Hence the name analysis-by-synthesis
  • Also known as closed-loop optimization
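The closed-loop idea can be sketched as a brute-force search. The generator here is a toy (a fixed shape vector scaled by a gain); the names `synthesize`, `shapes` and `gains` are hypothetical and only stand in for a real coder's parameters.

```python
def synthesize(gain, shape):
    """Hypothetical signal generator: a fixed shape vector scaled by a gain."""
    return [gain * s for s in shape]


def squared_error(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def analysis_by_synthesis(target, shapes, gains):
    """Try every (shape, gain) pair, synthesize it, keep the lowest-error one."""
    best = None
    for index, shape in enumerate(shapes):
        for gain in gains:
            err = squared_error(target, synthesize(gain, shape))
            if best is None or err < best[0]:
                best = (err, index, gain)
    return best


shapes = [[1, 0, 0, 0], [1, -1, 1, -1], [1, 1, 1, 1]]
gains = [0.5, 1.0, 2.0]
target = [1.9, -2.1, 2.0, -1.8]

err, index, gain = analysis_by_synthesis(target, shapes, gains)
print(f"best shape index {index}, gain {gain}, squared error {err:.3f}")
```

Only the chosen index and gain would be transmitted; the decoder runs the same `synthesize` step to rebuild the signal.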

6
DPCM Example
  • DPCM (differential pulse code modulation)

[Figure: DPCM block diagram. Labels: quantizer Q, subtraction, predictor P(z) (appears twice)]

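A minimal DPCM sketch, assuming a first-order predictor P(z) = a z^-1 and a uniform quantizer Q. Both choices, and the values of `a` and `step`, are illustrative; the slide does not specify them. The encoder predicts from the reconstructed signal, so encoder and decoder stay in step.

```python
def quantize(value, step):
    """Uniform quantizer: returns an integer index."""
    return int(round(value / step))


def dpcm_encode(signal, a=0.9, step=0.1):
    """Quantize the prediction error; predict from the reconstructed past."""
    indices, prev_recon = [], 0.0
    for x in signal:
        pred = a * prev_recon             # P(z): first-order predictor (assumption)
        idx = quantize(x - pred, step)    # Q: quantize the difference
        indices.append(idx)
        prev_recon = pred + idx * step    # what the decoder will rebuild
    return indices


def dpcm_decode(indices, a=0.9, step=0.1):
    """Mirror of the encoder's reconstruction loop."""
    out, prev_recon = [], 0.0
    for idx in indices:
        prev_recon = a * prev_recon + idx * step
        out.append(prev_recon)
    return out


signal = [0.0, 0.3, 0.55, 0.7, 0.72, 0.6, 0.4]
decoded = dpcm_decode(dpcm_encode(signal))
max_err = max(abs(x - y) for x, y in zip(signal, decoded))
print(f"max reconstruction error: {max_err:.3f}")
```

Because the prediction uses reconstructed samples, the reconstruction error never exceeds half a quantizer step.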
7
Analysis-by-Synthesis, Ver. 0.0
[Figure: block diagram. Labels: Input, Signal generator, Output, squared error (·)²]

8
Signal generator for speech?
  • Excitation
  • Vocal cords
  • Vocal tract

[Figure: signal generator for speech. Labels: noise, 1/A(z), z^-T]

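One way to read the three stages is noise excitation, then a pitch loop with delay T, then an all-pole filter 1/A(z). The sketch below follows that reading. The delay, gain and filter coefficient are made-up values, and the function names are hypothetical.

```python
import random


def pitch_filter(excitation, delay, gain):
    """Long-term (pitch) synthesis: y[n] = x[n] + gain * y[n - delay]."""
    y = []
    for n, x in enumerate(excitation):
        past = y[n - delay] if n >= delay else 0.0
        y.append(x + gain * past)
    return y


def all_pole_filter(x, a):
    """Synthesis filter 1/A(z), with A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ..."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc -= ak * y[n - k]
        y.append(acc)
    return y


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(160)]   # excitation
voiced = pitch_filter(noise, delay=40, gain=0.8)        # "vocal cords" (periodicity)
speech_like = all_pole_filter(voiced, a=[-0.9])         # "vocal tract" (spectral shape)
```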
9
Analysis-by-Synthesis, Ver. 1.0
[Figure: block diagram. Labels: Input, decoder (noise, z^-T, 1/A(z)), subtraction, Output]

10
Comments on ver. 1.0
  • All parameters (synthesis filter, pitch delay,
    gains) must be time-dependent
    • e.g. updated every 5 ms
  • All parameters must be quantized before sending
  • Squared error in the output is minimized
  • Some preprocessing is useful
    • normalize bandwidth
    • speech enhancement

11
Perceptual Weighting
  • Noise is not perceived uniformly
  • Rule of thumb: frequency bands with more energy
    tolerate more error (masking)
  • This can be represented as a time-dependent
    frequency weighting (filtering)

[Examples: original; SNR 15 dB; SNR 15 dB, weighted]

12
Analysis-by-Synthesis, Ver. 2.0
[Figure: block diagram. Labels: Input frame, preprocessing, Z0 (twice), noise, 1/A(z), W(z), z^-T, Output]

13
Comments
  • Synthesis and weighting filters (both IIR) should
    have memory, i.e. operate continuously
    • improves performance
  • Frames divided into subframes
    • pitch filter has to be updated more frequently
    • reduces complexity
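The point about filter memory can be illustrated with a small sketch: an IIR filter run frame by frame matches continuous filtering only if its state is carried from one frame to the next. The coefficients and the frame length of 4 are arbitrary illustration values.

```python
def iir_frame(x, a, state):
    """Filter one frame through 1/A(z); `state` holds past outputs, newest first."""
    y = []
    state = list(state)
    for xn in x:
        yn = xn - sum(ak * s for ak, s in zip(a, state))
        state = [yn] + state[:-1]
        y.append(yn)
    return y, state


a = [-1.2, 0.5]    # hypothetical stable second-order A(z) = 1 - 1.2 z^-1 + 0.5 z^-2
signal = [1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0]

# Reference: the whole signal filtered in one pass.
whole, _ = iir_frame(signal, a, [0.0, 0.0])

# Frame-wise with memory carried over between frames.
state = [0.0, 0.0]
framed = []
for start in range(0, len(signal), 4):
    out, state = iir_frame(signal[start:start + 4], a, state)
    framed.extend(out)

# Frame-wise with memory reset at every frame boundary.
reset = []
for start in range(0, len(signal), 4):
    out, _ = iir_frame(signal[start:start + 4], a, [0.0, 0.0])
    reset.extend(out)

print("with memory matches continuous filtering:", framed == whole)
print("reset memory matches continuous filtering:", reset == whole)
```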

14
Synthesis Filter
  • Obtained from LP analysis of the input
    • smooth windowing to reduce artifacts
    • bandwidth expansion
  • Update rate 10 ms
  • Can be interpolated e.g. every 5 ms
  • Quantized and interpolated using LSFs
  • Has memory
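A sketch of the LP analysis steps named above, assuming the autocorrelation method with a Hamming window and the Levinson-Durbin recursion. The window type, predictor order 4 and the expansion factor 0.994 are assumptions for illustration. LSF quantization and interpolation are not shown.

```python
import math


def hamming(n):
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def autocorr(x, order):
    return [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(order + 1)]


def levinson_durbin(r, order):
    """Return (a[1..order], error) for A(z) = 1 + sum_k a_k z^-k."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # remaining prediction error energy
    return a[1:], err


def bandwidth_expand(a, gamma=0.994):
    """Replace a_k by a_k * gamma^k, moving the poles slightly toward the origin."""
    return [ak * gamma ** k for k, ak in enumerate(a, start=1)]


# Toy 10 ms frame at 8 kHz (80 samples) made of two sinusoids.
frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(80)]
windowed = [w * s for w, s in zip(hamming(len(frame)), frame)]
lpc, residual_energy = levinson_durbin(autocorr(windowed, 4), 4)
lpc_bw = bandwidth_expand(lpc)
```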

15
Long-Term Predictor (LTP)
  • Can be implemented using an adaptive codebook
    (AC)
    • delay and gain into past excitation
  • Fractional delay also possible by interpolating
    past excitation
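A simplified adaptive codebook search, assuming integer lags no shorter than the subframe and an unquantized gain. Fractional delays and the filtering of candidates through the synthesis and weighting filters are left out. The function and variable names are hypothetical.

```python
import random


def adaptive_codebook_search(target, past_excitation, min_lag, max_lag):
    """Return (lag, gain) minimizing ||target - gain * past[-lag : -lag + N]||^2."""
    n = len(target)
    best_lag, best_gain, best_score = None, 0.0, -1.0
    # Sketch restriction: only lags >= subframe length, so candidates lie in the past.
    for lag in range(max(min_lag, n), max_lag + 1):
        start = len(past_excitation) - lag
        cand = past_excitation[start:start + n]
        corr = sum(t * c for t, c in zip(target, cand))
        energy = sum(c * c for c in cand)
        if energy == 0.0:
            continue
        score = corr * corr / energy      # error reduction at the optimal gain
        if score > best_score:
            best_lag, best_gain, best_score = lag, corr / energy, score
    return best_lag, best_gain


random.seed(1)
past = [random.gauss(0.0, 1.0) for _ in range(120)]
true_lag = 40
target = [0.5 * v for v in past[len(past) - true_lag:len(past) - true_lag + 10]]

lag, gain = adaptive_codebook_search(target, past, min_lag=20, max_lag=100)
print(f"lag {lag}, gain {gain:.2f}")
```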

16
Weighting Filter
  • Also needs memory
  • A good choice is
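
The formula on this slide is missing from the transcript. A form widely used in CELP-type coders, offered here as an assumption rather than as the presenter's choice, is W(z) = A(z/γ1) / A(z/γ2) with 0 < γ2 < γ1 ≤ 1, where A(z) is the LP analysis filter. The sketch below applies such a filter; the default values of `gamma1` and `gamma2` are illustrative, and it starts from zero memory for brevity.

```python
def scale_coeffs(a, gamma):
    """Coefficients of A(z/gamma): a_k becomes a_k * gamma^k (a excludes the leading 1)."""
    return [ak * gamma ** k for k, ak in enumerate(a, start=1)]


def weighting_filter(x, a, gamma1=0.9, gamma2=0.6):
    """Apply W(z) = A(z/gamma1) / A(z/gamma2) to x, starting from zero memory."""
    num = scale_coeffs(a, gamma1)
    den = scale_coeffs(a, gamma2)
    y = []
    for n in range(len(x)):
        acc = x[n]
        for k in range(1, len(a) + 1):
            if n - k >= 0:
                acc += num[k - 1] * x[n - k] - den[k - 1] * y[n - k]
        y.append(acc)
    return y


a = [-0.9]                                   # hypothetical first-order A(z)
error = [1.0, 0.0, 0.0, 0.0]
weighted_error = weighting_filter(error, a)
```

In a running coder the filter state would be kept between frames, as the slide notes.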

17
Fixed Codebook (FC)
  • Represents residual error or system input
  • Should be reasonably white (no correlation)
  • Difficult to determine without a structured
    codebook
    • stochastic codebook
    • multi-pulse, regular pulse
    • algebraic codebook
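To illustrate why a structured codebook helps, an algebraic codevector can be built directly from a few pulse positions and signs instead of being stored in a table. The 40-sample subframe and the interleaved track layout below are hypothetical and do not reproduce any particular standard.

```python
def algebraic_codevector(positions, signs, length=40):
    """Build a sparse codevector from pulse positions and +/-1 signs."""
    v = [0.0] * length
    for pos, sign in zip(positions, signs):
        v[pos] += float(sign)
    return v


def track_positions(track, n_tracks=4, length=40):
    """Hypothetical interleaved layout: track t owns positions t, t+4, t+8, ..."""
    return list(range(track, length, n_tracks))


# One pulse per track; only positions and signs need to be transmitted.
positions = [track_positions(t)[i] for t, i in enumerate([0, 3, 5, 9])]
code = algebraic_codevector(positions, [1, -1, 1, -1])
print("pulse positions:", positions)
```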

18
Choice of Parameters
  • Ideally, all parameters should be chosen jointly
    to minimize the weighted error
  • Practically, this is impossible (e.g. 2^120
    possibilities)
  • The parameters are chosen sequentially
  • LPC → AC lag and gain → FC index and gain
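The gap between joint and sequential selection can be made concrete by counting candidates. The per-parameter bit allocation below is hypothetical. Real coders also compute some parameters in open loop and use fast search methods, so the sequential count is only an upper-level illustration.

```python
def joint_candidates(bits):
    """Candidates to test if all parameters are searched jointly."""
    return 2 ** sum(bits)


def sequential_candidates(bits):
    """Candidates to test if each parameter is searched in turn, earlier ones fixed."""
    return sum(2 ** b for b in bits)


# Hypothetical per-subframe allocation: AC lag, AC gain, FC index, FC gain.
subframe_bits = [7, 4, 9, 4]

print("joint:     ", joint_candidates(subframe_bits))
print("sequential:", sequential_candidates(subframe_bits))
print("120-bit frame, joint search:", joint_candidates([120]))
```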

19
Analysis-by-Synthesis, Ver. 3.0
[Figure: block diagram. Labels: preprocessing, LPC, 1/A(z), W(z), two subtractions, output]