Title: A STUDY OF DESIGN COMPROMISES
A STUDY OF DESIGN COMPROMISES FOR SPEECH CODERS
IN PACKET NETWORKS
Roch Lefebvre, Philippe Gournay
University of Sherbrooke, Sherbrooke, Quebec, Canada
Redwan Salami
VoiceAge Corp., Montreal, Quebec, Canada
6. LISTENING TEST RESULTS
3. PROPOSED APPROACHES FOR ADDING REDUNDANCY
G.729 frame
packet
- In voice over packet networks, the coding gain
achieved by prediction-based speech coders is
offset by packet losses. Concealment must be
applied to the missing packets, which reduces
quality for two main reasons:
  - not all missing packets can be concealed,
    especially when concealment uses only the past
    signal (onsets, transients)
  - the concealment error can propagate over
    several frames, even frames received correctly
    (culprit: desynchronisation of the excitation
    content, LTP)
Consider only G.729 at 8 kbps (baseline
predictive coder) and add redundancy to obtain
bit rates similar to iLBC at 15.2 kbps.
R (kbps) D (ms)
G.729-2 and G.729-3 differ at the decoder
Content of each 20-ms packet
11.8 15
G.729-2: Decode packet P(k) when it arrives (do
not wait for packet P(k+1)). If packet P(k) is
missing, apply concealment, then resynchronise
the filter memories using the redundant frames
F'(2k) and F'(2k+1), which are received when
packet P(k+1) arrives. Then start decoding packet
P(k+1).
G.729-3: Decode packet P(k) only after packet
P(k+1) has arrived (additional delay of 20 ms).
If packet P(k) is missing, just use the redundant
frames F'(2k) and F'(2k+1) carried in packet
P(k+1). No concealment is applied in this case.
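The difference between the two decoders can be sketched as a per-packet decision. This is only an illustration: the function name, the action labels and the fallback to plain concealment when the following packet is also missing are assumptions, not details given on the poster.

```python
# Sketch of the G.729-2 / G.729-3 decoder decisions (illustrative only).
# Assumption: when the following packet is also lost, no redundancy is
# available and both schemes fall back to plain concealment.

EXTRA_DELAY_MS = {"G.729-2": 0, "G.729-3": 20}  # G.729-3 waits one packet


def decoder_actions(received, scheme):
    """Return one action label per 20-ms packet.

    received: list of bools, True if the packet arrived.
    scheme:   "G.729-2" or "G.729-3".
    """
    if scheme not in EXTRA_DELAY_MS:
        raise ValueError(f"unknown scheme: {scheme}")
    actions = []
    for k, ok in enumerate(received):
        next_ok = k + 1 < len(received) and received[k + 1]
        if ok:
            actions.append("decode")
        elif not next_ok:
            actions.append("conceal")  # assumed fallback, no redundancy
        elif scheme == "G.729-2":
            # Output was already concealed; the redundant frames in the
            # next packet are used only to resynchronise filter memories.
            actions.append("conceal+resync")
        else:
            # G.729-3 waited for the next packet, so the redundant frames
            # are decoded in place of the missing ones.
            actions.append("decode-redundant")
    return actions


if __name__ == "__main__":
    pattern = [True, False, True, True]
    for scheme in ("G.729-2", "G.729-3"):
        print(scheme, decoder_actions(pattern, scheme))
```

G.729-3 trades 20 ms of extra delay for the ability to replace a lost packet instead of concealing it.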
G.729-0
G.729-1
Bit rate and algorithmic delay
16 45
G.729-2 / G.729-3
- We propose to compare two approaches for
alleviating this problem:
  - Adding redundancy to increase the robustness
    of a baseline predictive encoder (G.729)
  - Using a speech coding model which does not
    have interframe dependencies (iLBC)
- To be compared, solutions should have
comparable bit rates
14.1 45
15.2 25
12 35
G.729-4
14.1 25
G.729-4: At the decoder, wait for packet P(k+1)
before decoding packet P(k).
8 25
In G.729-2 and G.729-3, F'(k) denotes F(k) but
without the 18 LSF bits and the pitch parity bit
(hence, frame F'(k) has 19 bits fewer than frame
F(k)). The missing LSFs have to be extrapolated at
the decoder when a frame is missing.
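The bit rates of the redundancy schemes follow from the packet contents. The short calculation below is a sketch under stated assumptions: a G.729 frame has 80 bits (8 kbps over 10 ms), the reduced redundant frame drops the 19 bits mentioned above, and the packet layouts (two full frames plus two reduced frames for G.729-2 / G.729-3, four full frames for G.729-4) are a reading of the text rather than something spelled out in it.

```python
# Bit-rate arithmetic for 20-ms packets built from 10-ms G.729 frames.
FULL_FRAME_BITS = 80                 # 8 kbps x 10 ms
REMOVED_BITS = 18 + 1                # 18 LSF bits + 1 pitch parity bit
REDUCED_FRAME_BITS = FULL_FRAME_BITS - REMOVED_BITS  # 61 bits
PACKET_MS = 20


def packet_rate_kbps(full_frames, reduced_frames=0):
    """Bit rate of a packet stream; bits per millisecond equals kbps."""
    bits = full_frames * FULL_FRAME_BITS + reduced_frames * REDUCED_FRAME_BITS
    return bits / PACKET_MS


# Assumed packet layouts (hypothetical reading, see lead-in).
rates = {
    "G.729-0": packet_rate_kbps(2),               # two frames, no redundancy
    "G.729-2 / G.729-3": packet_rate_kbps(2, 2),  # plus two reduced frames
    "G.729-4": packet_rate_kbps(4),               # plus two full frames
}

if __name__ == "__main__":
    for name, rate in rates.items():
        print(f"{name}: {rate:.1f} kbps")
```

Under these assumptions the results are 8, 14.1 and 16 kbps, which agree with the bit rates quoted for these schemes.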
(Point size proportional to quality at 10% FER)
2. ADDED REDUNDANCY versus FRAME INDEPENDENCE
4. EFFECT ON ERROR PROPAGATION
5. SUBJECTIVE EXPERIMENT
- A formal listening test was conducted to compare
the different solutions for increasing the
robustness in case of missing packets. The main
features of this test are:
  - clean speech, narrowband, IRS filtered
  - 4 male, 4 female speakers
  - 32 naive listeners
  - listening using binaural headphones
  - following guidelines of ITU-T Rec. P.800
  - 36 conditions in total, including MNRU and
    other reference conditions
  - 0 to 20% random packet losses, synchronized
    between iLBC and G.729
G.729-0: Every missing 20-ms packet implies
that two consecutive 10-ms frames of G.729 are
lost. Concealment and propagation introduce large
artefacts.
G.729-1: Every missing 20-ms packet
reduces to a single 10-ms frame loss in G.729.
Concealment is more effective, and propagation is
reduced.
G.729-2: Concealment followed by
approximate resynchronisation of filter
memories.
G.729-3: Limited concealment (there
would be no concealment if F' was equal to
F).
G.729-4: No effective loss for any single
(isolated) packet loss.
iLBC: Concealment, but limited
error propagation (only due to post-filtering at
decoder to smooth frame transitions).
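The contrast between G.729-0 and G.729-4 can be illustrated by counting the 10-ms frames that are effectively lost for a given packet-loss pattern. This is a sketch, not the authors' simulation: it assumes that in G.729-4 each packet carries a full copy of the two frames of the previous packet, and the function names are hypothetical.

```python
import random

FRAMES_PER_PACKET = 2  # two 10-ms G.729 frames per 20-ms packet


def lost_frames(received, redundancy):
    """Count 10-ms frames that cannot be decoded.

    received:   list of bools, True if the 20-ms packet arrived.
    redundancy: False models G.729-0; True models G.729-4 under the
                assumption that packet k+1 repeats the frames of packet k.
    """
    lost = 0
    for k, ok in enumerate(received):
        if ok:
            continue
        next_ok = k + 1 < len(received) and received[k + 1]
        if redundancy and next_ok:
            continue  # frames recovered from the next packet
        lost += FRAMES_PER_PACKET
    return lost


def random_loss_pattern(n_packets, loss_rate, seed=0):
    """Independent random packet losses (True means received)."""
    rng = random.Random(seed)
    return [rng.random() >= loss_rate for _ in range(n_packets)]


if __name__ == "__main__":
    pattern = random_loss_pattern(10_000, 0.10)
    total = FRAMES_PER_PACKET * len(pattern)
    for label, red in (("G.729-0", False), ("G.729-4", True)):
        print(label, lost_frames(pattern, red) / total)
```

In this model a frame is lost with redundancy only when two consecutive packets are missing, so for independent losses at rate p the effective frame loss rate drops from about p to about p squared.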
Codec_P G.729 (CELP-based)
Approach 1: Use a lower bit rate, predictive
(CELP) coder, and add channel redundancy to
improve robustness to missing frames.
Approach 2: Use a higher bit rate, non-predictive or
frame-independent codec, to improve robustness
to missing frames in the core codec itself.
7. CONCLUSIONS
- From the test results, we can make the following
conclusions:
  - In clean channel conditions, iLBC at 15.2
    kbps has equivalent quality to G.729 at 8 kbps
    (i.e. a much higher bit rate is necessary in a
    frame-independent coder to increase both the
    quality in clean channel and frame loss
    conditions). Extreme example: G.711 at 64
    kbps
  - The best quality in frame loss conditions was
    achieved by using a low-rate CELP coder with
    added redundancy and delay (G.729-4), with a
    total bit rate close to iLBC (16 kbps compared
    to 15.2 kbps)
  - The approaches studied to increase robustness
    represent only a subset of all possible
    combinations. Only solutions based on a standard
    CELP coder (G.729) were considered, with some of
    them not optimal (e.g. G.729-2). Improved
    results could be expected by designing a solution
    without the constraint of using standard core
    codecs.
  - The G.729 RTP payload can already support
    solutions G.729-1 and G.729-4.
20 ms packet
3rd Packet lost
G.729 synthesis
G.729-0 error at decoder
Codec_FI iLBC (Frame-independent)
G.729-1 error at decoder
Anticipated gains in quality
G.729-2 error at decoder
G.729-3 error at decoder
G.729-4 error at decoder
iLBC error at decoder (compared to iLBC
synthesis without frame loss)