Some Aspects of Wideband Speech in Enterprise Telephony

2nd Workshop on Wideband Speech Quality, June 2005


1
Some Aspects of Wideband Speech in Enterprise
Telephony
  • Eric J. Diethorn (ejd_at_avaya.com)
  • with
  • Gary W. Elko (gwe_at_avaya.com) and
  • Joseph L. Hall (jhall01_at_avaya.com)
  • Avaya, Inc.
  • Avaya Labs, Research
  • 233 Mt. Airy Road,
  • Basking Ridge, New Jersey 07920 USA

2
Outline
  • Physical acoustics
  • Echo
  • Voice coders
  • Conferencing
  • Wideband speech intelligibility
  • Hallway demonstration: Avaya SIP Softphone

3
Some introductory thoughts
  • Wideband speech telephony will instantaneously
    raise the bar of end-user expectation, at least
    for some applications.
  • Skype
  • We have standards for the reproduction of
    wideband speech, but is wider-band good enough?
  • Maybe [150, 5000] Hz is good enough?
  • With greater bandwidth comes a greater range of
    potential artifacts that the
    acoustical-signal-processing engineer must address.
  • Low-frequency acoustic echo, earpiece hiss,
    speech-coder distortion, arbitration of multiple
    sampling rates.
  • The preferences of end users are uncertain.
  • Speech-bandwidth policies (buddy lists,
    profiles)?
  • Suppose I have a physiological speech impediment.
    Do I want it emphasized?

4
Physical acoustics
  • The physical design of terminal acoustics must
    change to render wideband speech.
  • Acoustical signal processing changes, too.

5
Loudspeaker enclosures
  • Frequency response,
  • traditional narrowband speakerphone,
  • 80 dB SPL at 50 cm

[Figure: sound pressure level (dB) versus frequency (Hz)]

6
Loudspeaker enclosures
  • Total harmonic distortion,
  • traditional narrowband speakerphone,
  • 80 dB SPL at 50 cm
  • High distortion at low-frequency end of
    wideband-speech spectrum
  • Acoustic echo control difficult, if not impossible,
    without acoustical modifications.

[Figure: THD at harmonics versus frequency (Hz)]
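
The distortion figure on this slide is not recoverable from the transcript, so the following is only a sketch of how a THD number is conventionally computed: the root-sum-square of the harmonic amplitudes divided by the amplitude of the fundamental. The function name and the amplitude readings are hypothetical and are not measurements from the talk. The connection to echo control is that a linear adaptive filter cannot model harmonics generated in the loudspeaker, so high THD limits the achievable echo cancellation.

```python
import math

def thd_percent(fundamental_rms, harmonic_rms):
    """Total harmonic distortion as a percentage of the fundamental.

    fundamental_rms: RMS amplitude of the fundamental tone.
    harmonic_rms: RMS amplitudes of the 2nd, 3rd, ... harmonics.
    """
    if fundamental_rms <= 0:
        raise ValueError("fundamental amplitude must be positive")
    harmonic_power = sum(h * h for h in harmonic_rms)
    return 100.0 * math.sqrt(harmonic_power) / fundamental_rms

# Hypothetical readings (illustration only, not data from the slides):
# a small driver pushed hard at low frequency versus the same driver mid-band.
low_band = thd_percent(1.0, [0.30, 0.40])
mid_band = thd_percent(1.0, [0.003, 0.004])
print(f"low band THD: {low_band:.1f} %, mid band THD: {mid_band:.1f} %")
```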
7
Earpieces
  • Frequency response, wideband handset

[Figure: sound pressure level (dB) versus frequency (Hz)]
  • In order to satisfy wideband standards,
    acoustical modifications are necessary to extend
    the low-frequency response of most earpiece
    designs.
  • This is particularly challenging for physical
    arrangements in which the earpiece is held to the
    ear with little pressure.

8
Microphones
  • Most low-cost electret microphones used today
    have a frequency response that is practically
    flat beyond the range of wideband speech; they
    are "wideband ready."
  • Multiple-microphone arrangements (arrays) can
    be exploited to reduce the level of ambient noise
    at frequencies not present in traditional
    narrowband telephony.
  • Low-frequency rumble.
  • High-frequency hiss.
  • Short-time spectral modification methods of noise
    reduction can help, but the perception of
    artifacts from such processing is enhanced by the
    wider speech band.

9
Microphones
[Two diagrams, each labeled "Front of phone"]
  • Omnidirectional microphone (traditional)
  • Good pick-up of talkers in all directions
  • But picks up ambient noise from all directions
  • Directional microphone
  • Reduces off-axis noise
  • Reduces reverberation of talker's voice
  • Reduces coupling from speakerphone (helping AEC)
  • But talkers off axis can't be heard well.
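
The trade-off listed above can be illustrated with the textbook first-order directional pattern, alpha + (1 - alpha) cos(theta). This is an assumption for illustration: the slides do not say which directional element or pattern the phone uses. With alpha = 1 the pattern is omnidirectional and with alpha = 0.5 it is a cardioid. The function names and the -120 dB floor are hypothetical.

```python
import math

def first_order_response(theta_deg, alpha):
    """Magnitude of the first-order pattern alpha + (1 - alpha) * cos(theta).

    theta_deg is the arrival angle relative to the microphone axis
    (0 degrees = front of phone in this sketch).
    """
    theta = math.radians(theta_deg)
    return abs(alpha + (1.0 - alpha) * math.cos(theta))

def to_db(gain, floor_db=-120.0):
    """Convert a linear gain to dB, clamping nulls to a finite floor."""
    if gain <= 0.0:
        return floor_db
    return max(floor_db, 20.0 * math.log10(gain))

OMNI, CARDIOID = 1.0, 0.5
for angle in (0, 90, 180):
    omni_db = to_db(first_order_response(angle, OMNI))
    card_db = to_db(first_order_response(angle, CARDIOID))
    print(f"{angle:3d} deg  omni {omni_db:7.1f} dB  cardioid {card_db:7.1f} dB")
```

Under this model a talker or noise source at 90 degrees is attenuated by about 6 dB relative to on-axis, and a source directly behind falls in the null, which is the same mechanism that makes off-axis talkers harder to hear.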

10
Echo
  • Requirements on echo control may change.
  • The art of echo control must evolve to meet the
    challenge of wideband speech.

11
Requirements on Talker Echo
  • Roundtrip, mouth-to-ear, echo loss requirements
    were measured on populations for narrowband
    speech. How well do these data apply to wideband
    speech echo paths?

[Figure: Echo annoyance as a function of roundtrip,
mouth-to-ear loss and delay, for narrowband speech.
Axes: percent good-or-better versus
acoustic-to-acoustic echo-path loss (dB).]
  • Source: Transmission Systems for Communications,
    Bell Telephone Laboratories, Inc., 5th Edition,
    1982.

12
Talker Echo, Continued
  • Being strictly digital, wideband-speech network
    paths do not suffer from analog circuit noises.
    However, analog and environmental noises enter
    calls at the endpoint. Should requirements on
    talker echo incorporate such (wideband) noise
    phenomena?

[Figure: Echo annoyance as a function of roundtrip,
mouth-to-ear echo-and-noise loss. Long-haul
(1000 mi.) PSTN connection, circa 1980.]
  • Source: Transmission Systems for Communications,
    Bell Telephone Laboratories, Inc., 5th Edition,
    1982.

13
Wideband speech coding
  • G.722, G.722.1 and G.722.2
  • G.722 is cheap.
  • G.722.1 often comes with video-on-the-enterprise
    (Polycom).
  • Proprietary codecs
  • Silicon solution providers have their favorites.
    Some are pretty good.
  • Linear 16-bit encoding?
  • Speech-transmission bandwidth (bits-per-second)
    is becoming a non-issue in the enterprise, at
    least for wired LANs.
  • Architecturally appealing within the enterprise.
    Let boundary gateways worry about transcoding.
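
To put rough numbers on the linear 16-bit option, the sketch below compares its bit rate with G.722 at 64 kbit/s. It assumes 16 kHz sampling (as in the hallway demonstration), 20 ms packets, and IPv4/UDP/RTP headers of 20 + 8 + 12 bytes with no extensions. The packet interval and function names are assumptions, and link-layer framing is ignored.

```python
HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP headers, no options or extensions

def payload_kbps(sample_rate_hz, bits_per_sample):
    """Raw bit rate of linear PCM in kbit/s."""
    return sample_rate_hz * bits_per_sample / 1000.0

def on_wire_kbps(codec_kbps, packet_ms=20):
    """Bit rate including IP/UDP/RTP headers for a given packet interval."""
    packets_per_second = 1000.0 / packet_ms
    payload_bytes = codec_kbps * 1000.0 / 8.0 / packets_per_second
    return (payload_bytes + HEADER_BYTES) * 8.0 * packets_per_second / 1000.0

linear16_wideband = payload_kbps(16000, 16)  # 256 kbit/s of payload
g722 = 64.0                                  # G.722 at its 64 kbit/s mode

print(f"linear 16-bit: {linear16_wideband:.0f} kbit/s payload, "
      f"{on_wire_kbps(linear16_wideband):.0f} kbit/s with headers")
print(f"G.722:         {g722:.0f} kbit/s payload, "
      f"{on_wire_kbps(g722):.0f} kbit/s with headers")
```

Under these assumptions, uncompressed wideband speech costs roughly 3.4 times the on-wire rate of G.722, which is small relative to wired LAN capacity but significant on a leased WAN link.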

14
Multirate audio conferencing
  • Rate arbitration
  • Transcoding
  • Multirate mixing
  • (Artificial) bandwidth extension
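
As a rough illustration of multirate mixing, the sketch below brings an 8 kHz narrowband stream up to 16 kHz and sums it with a wideband stream. It is not the bridge design from the talk. The linear interpolator is a deliberately crude stand-in for the proper interpolation filter a real bridge would use, and all names and sample values are hypothetical. Upsampling does not add any content above the narrowband range; that is what (artificial) bandwidth extension would attempt.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def upsample_2x(samples):
    """Double the sampling rate by linear interpolation (crude; see lead-in)."""
    out = []
    for i, s in enumerate(samples):
        # Hold the last sample when there is no successor to interpolate toward.
        nxt = samples[i + 1] if i + 1 < len(samples) else s
        out.append(s)
        out.append((s + nxt) // 2)
    return out

def mix(*streams):
    """Sum equal-rate streams sample by sample, saturating to 16 bits."""
    length = min(len(s) for s in streams)
    return [max(INT16_MIN, min(INT16_MAX, sum(s[i] for s in streams)))
            for i in range(length)]

narrowband_8k = [0, 2000, 4000]                 # hypothetical 8 kHz samples
wideband_16k = [100, 100, 100, 100, 100, 100]   # hypothetical 16 kHz samples
mixed_16k = mix(upsample_2x(narrowband_8k), wideband_16k)
print(mixed_16k)
```

Mixing at the highest rate present and downsampling separately for narrowband listeners is one possible policy; the slides leave the choice of rate arbitration open.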

[Diagram labels: conference bridge server; wide- and
narrowband speech; IP-1; IP-2; PSTN; narrowband speech;
leased WAN (compressed speech, e.g., G.729, G.726)]

15
Stereo audio conferencing
Hands-free, wideband-speech communications with
stereo echo cancellation
[Diagram labels: Room 1, Room 2, talker, echo, g1, h1]

16
Stereo Conferencing
(Placeholder, video demonstration)

17
Wideband speech intelligibility
  • Siemens: "wideband transmissions can reduce
    speech ambiguities by as much as 90 percent,
    increasing conversational intelligibility and
    reducing listener fatigue." (2003 press release)
  • Polycom: "For single syllables, 3.3 kHz
    bandwidth yields an accuracy of only 75 percent,
    as opposed to over 95 percent with 7 kHz
    bandwidth." (2003 white paper)
  • Marketing vs. science: both required

18
Experimental study
  • Similar to the Diagnostic Rhyme Test and Diagnostic
    Alliteration Test, except we generated our own
    word pairs
  • e.g., tie/pie (hot/hop)
  • Subject hears one of the two, is shown both, and is
    asked "Which of these two did you hear?"
  • Clean anechoic speech filtered to 3 bandwidths
  • [50, 3300], [50, 5000] and [50, 7000] Hz.
  • Investigate all nine combinations of three
    bandwidths and three additive-noise levels (0 dB,
    12 dB, 24 dB SNR).
  • Reference: G. A. Miller and P. E. Nicely, "An
    analysis of perceptual confusions among some
    English consonants," Lincoln Laboratory, MIT, 1955
    (J. Acoust. Soc. Amer., vol. 27, pp. 338-352)
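
The nine-condition design described above can be sketched as follows. The slides do not say how SNR was defined or what noise was used, so this sketch assumes SNR is the ratio of mean speech power to mean noise power over the whole token. The function names and toy signals are hypothetical, and the band-limiting filter is omitted.

```python
import itertools
import math

BANDWIDTHS_HZ = [(50, 3300), (50, 5000), (50, 7000)]
SNRS_DB = [0, 12, 24]

# All nine bandwidth / noise-level combinations from the slide.
conditions = list(itertools.product(BANDWIDTHS_HZ, SNRS_DB))

def mean_power(x):
    return sum(v * v for v in x) / len(x)

def add_noise_at_snr(speech, noise, snr_db):
    """Scale noise so that speech power / noise power equals snr_db, then add."""
    target_noise_power = mean_power(speech) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / mean_power(noise))
    return [s + gain * n for s, n in zip(speech, noise)]

# Toy signals standing in for a filtered word and a noise segment.
speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]
for (low_hz, high_hz), snr_db in conditions:
    noisy = add_noise_at_snr(speech, noise, snr_db)
    print(f"[{low_hz}, {high_hz}] Hz at {snr_db} dB SNR: {len(noisy)} samples")
```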

For questions concerning aspects of this study,
contact Joseph L. Hall, Avaya Research,
jhall01_at_avaya.com

19
What do they sound like?
  • Seed, feed, seed at different bandwidths and
    additive noise levels.

20
Representative results

21
Summary of results

22
Hallway Demonstration: Avaya wideband SIP softphone
  • Wideband speech (16 kHz sampling, bandwidth
    limited by PC sound architecture).
  • Voice codecs
  • G.711, G.729, G.726
  • G.722