1
Front-end Audio Processing: Reflections on
Issues, Requirements, and Solutions
Tomas Gaensler mh acoustics www.mhacoustics.com
Summit NJ/Burlington VT USA
2
Front-end Audio Processing
  • Processing to enhance perceived and/or measured
    sound quality in communication and recording
    devices

Then
Now
3
Not So Famous Quotes (Acoustic Jewelry/Bluetooth
Headset)
  • Gary Elko (mh/Bell Labs colleague)
  • At IWAENC 1995: "Acoustic echo cancellation will
    not be needed in the future when people wear
    acoustic jewelry"
  • Arno Penzias (1978 Nobel Prize laureate)
  • "No one would want acoustic jewelry because
    people would think the users talking to
    themselves are crazy"
  • I'm glad the success of Bluetooth headsets shows
    that both were completely wrong!

4
Classical Front-end Architectures - POTS
5
Classical Front-end Architectures - Cellphone 1995
6
Classical Front-end Architectures - Cellphone
2005 - 2010
7
Cellphones and Handsfree
  • Common problems
  • Far-end listener does not hear near-end talker
  • Near-end listener does not understand far-end
    talker
  • Why?
  • Form factor (size)
  • Limited understanding of physics and acoustics(?)

8
RX/TX Levels, Coupling and Doubletalk
  • Echo louder than near-end
  • Linear AEC
  • ERLE ≈ 20-30 dB
  • After cancellation: Residual Echo to Near-end
    Ratio (RENR)
  • RENR ≈ 90 - 20 - 70 = 0 dB

Far-end ≈ 95-100 dBSPL at loudspeaker
  • 85-90 dBSPL at mic
  • >20 dB of residual echo suppression required
  • Duplexness suffers

Near-end talker ≈ 55-70 dBSPL at mic
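The level budget on this slide can be checked with a few lines of arithmetic. This is only a sketch: the function names are hypothetical, and the -20 dB target RENR is an assumption chosen to match the slide's ">20 dB" figure.

```python
def renr_db(echo_at_mic_dbspl, erle_db, near_end_dbspl):
    """Residual-echo-to-near-end ratio after linear cancellation, in dB."""
    return echo_at_mic_dbspl - erle_db - near_end_dbspl


def extra_suppression_db(renr, target_renr_db=-20.0):
    """Residual echo suppression still needed to reach the target RENR.

    target_renr_db is an assumed design goal, not a value from the slides.
    """
    return max(0.0, renr - target_renr_db)


# Slide values: 90 dBSPL echo at the mic, 20 dB ERLE, 70 dBSPL near-end.
renr = renr_db(90, 20, 70)
print(renr)                        # 0
print(extra_suppression_db(renr))  # 20.0
```

With a quieter near-end talker (55 dBSPL) the residual echo is louder than the speech, which is why duplexness suffers.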
9
TX Dynamic Range and Noise
  • Echo 90 dBSPL ⇒ peak echo ≈ 105-110 dB
  • No saturation of echo in TX path

Echo Level 90 dBSPL
Near-end speech Level 70 dBSPL
10
TX Fixed-point Processing and Quantization Noise
Q-noise increases by 6 log2(N) dB!
  • N = 64 ⇒ Q-noise increases by 36 dB
  • Double-precision required
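The rule on this slide is easy to evaluate for other values of N. The sketch below uses hypothetical function names. N is used as on the slide (the transcript does not define it), and the guard-bit estimate relies on the usual figure of roughly 6 dB per bit.

```python
import math


def q_noise_increase_db(n):
    """Quantization noise growth according to the slide: 6*log2(N) dB."""
    return 6.0 * math.log2(n)


def guard_bits(n):
    """Extra word length needed to offset the growth (about 6 dB per bit)."""
    return math.ceil(math.log2(n))


print(q_noise_increase_db(64))  # 36.0
print(guard_bits(64))           # 6
```

Six extra bits on top of a 16-bit word do not fit in single precision, which is consistent with the slide's call for double precision.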

11
RX Dynamic Range and Distortion
Digital gain
Analog gain
To AEC
  • Small loudspeakers have rather high cut-off
    frequency (high-pass)
  • EQ often required to get acceptable sound
    (frequency response). However, EQ means:
  • Loss of signal loudness and dynamic range
  • Increased (analog) distortion
  • Many manufacturers compensate the loss of signal
    level by excessive digital gain and therefore get
    (digital) saturation

12
What Can or Should be Done?
  • Minimize acoustical coupling by good physical
    design
  • TX
  • Use noise suppression but not excessively
  • Double-precision, block scaling, or
    floating-point
  • RX
  • Compression instead of fixed gain
  • 10% or less loudspeaker/driver THD is desired
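To illustrate "compression instead of fixed gain" in the RX path, here is a minimal sketch, assuming a digital full scale of [-1, 1]. The function names, threshold, ratio and release constant are all hypothetical choices, not values from the presentation.

```python
def fixed_gain(x, gain):
    """Fixed digital gain followed by hard clipping at full scale."""
    return [max(-1.0, min(1.0, gain * s)) for s in x]


def compress(x, gain, threshold=0.5, ratio=4.0, release=0.999):
    """Same gain at low levels, reduced gain once the envelope exceeds threshold."""
    env = 0.0
    out = []
    for s in x:
        level = abs(gain * s)
        # Peak envelope follower: instant attack, exponential release.
        env = max(level, release * env)
        if env > threshold:
            target = threshold + (env - threshold) / ratio
            g = gain * target / env
        else:
            g = gain
        out.append(g * s)
    return out


loud = [1.0, -1.0, 0.8, -0.8]
print(fixed_gain(loud, 2.0))  # every sample clipped to +/-1.0
print(max(abs(v) for v in compress(loud, 2.0)) < 1.0)  # True
```

Quiet passages get the full gain in both cases; only the compressor avoids digital saturation on loud passages.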

13
What about Non-linear AEC Algorithms?
  • Interesting problem proposed and worked on for
    many years
  • Not practical in most AEC applications since
  • Complicated model
  • Gain and therefore saturation possibly in both TX
    and RX paths
  • Added complexity and system cost
  • Often slow convergence
  • Difficult to fine-tune in field
  • Even when non-linear cancellation works
    perfectly, the user still perceives a distorted
    loudspeaker signal!

14
Classical Front-end Architectures - Cellphone
2005 - 2010
15
Single Channel Noise Suppression
  • Basic single channel noise suppressor
  • An extremely successful signal processing
    invention by Manfred Schroeder in the 1960s
  • Musical tones: is it a (solved) problem?
  • How do we evaluate and improve quality?
  • How about convergence rate?

16
Background to Single Channel Noise Suppressors
[Block diagram: speech + noise → NS → enhanced speech]
  • Block processing
  • Frequency domain model
  • Linear Time-varying filter
  • Wiener filter
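The frequency-domain Wiener model listed above can be sketched as a per-bin gain computed for each block. This is a generic textbook form, not necessarily the exact variant on the slides. The function name and the 12 dB gain floor are assumptions (the floor mirrors the "12 dB noise suppression" figure used later in the deck).

```python
def wiener_gains(noisy_psd, noise_psd, max_suppression_db=12.0):
    """Per-bin Wiener-style gain G = 1 - Pnn/Pyy, limited by a gain floor."""
    floor = 10.0 ** (-max_suppression_db / 20.0)
    gains = []
    for pyy, pnn in zip(noisy_psd, noise_psd):
        g = 1.0 - pnn / pyy if pyy > 0.0 else floor
        gains.append(max(g, floor))
    return gains


# Bin 0: strong speech (10x the noise power). Bin 1: noise only.
print(wiener_gains([10.0, 1.0], [1.0, 1.0]))
```

Each block's spectrum is multiplied by these gains, which makes the suppressor a linear time-varying filter.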

17
Background to Single Channel Noise Suppressors
  • Estimation of spectra is often done recursively
  • Frequency smoothing

[Equations not preserved in the transcript; one update applies only when speech is not present]
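Since the slide's equations did not survive, the following is only a generic sketch of recursive spectral estimation with frequency smoothing. The smoothing constant `alpha`, the externally supplied `speech_present` flag and the 3-point frequency window are all assumptions.

```python
def update_psd(prev_psd, frame_power, alpha=0.9):
    """First-order recursive average of the per-bin power spectrum."""
    return [alpha * p + (1.0 - alpha) * y for p, y in zip(prev_psd, frame_power)]


def update_noise_psd(prev_noise, frame_power, speech_present, alpha=0.9):
    """Update the noise estimate only when speech is not present."""
    if speech_present:
        return list(prev_noise)
    return update_psd(prev_noise, frame_power, alpha)


def smooth_over_frequency(psd):
    """3-point moving average across neighbouring bins (shorter at the edges)."""
    out = []
    for k in range(len(psd)):
        window = psd[max(0, k - 1):k + 2]
        out.append(sum(window) / len(window))
    return out


noise = [1.0, 1.0, 1.0]
noise = update_noise_psd(noise, [2.0, 2.0, 2.0], speech_present=False)
print(noise)
print(smooth_over_frequency([0.0, 3.0, 0.0]))  # [1.5, 1.0, 1.5]
```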
18
Musical Tones: Is it a (Solved) Problem?
  • Examples
  • Original (Sally Sievers reel, June-Sept. 1964
    by Manfred Schroeder and Mohan Sondhi at Bell
    Labs)
  • Original noise (iSNR 6 dB)
  • Schroeder 1960s
  • Generic spectral subtraction Boll 1979
  • IS-127 1995
  • A problem of the last century; now only a
    constraint in design
  • Controlling variance of suppression gains
  • Any NS algorithm should be constrained not to
    have musical tones
  • Must only have a small impact on voice quality
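One common way to control the variance of the suppression gains is to smooth them over time and bound them from below. The sketch below assumes that approach. The function name, `floor` and `beta` are hypothetical, and the presentation does not say which method it has in mind.

```python
def constrain_gains(raw_gains, prev_gains, floor=0.25, beta=0.7):
    """Smooth per-bin gains across frames and keep them within [floor, 1]."""
    out = []
    for g, g_prev in zip(raw_gains, prev_gains):
        g_smooth = beta * g_prev + (1.0 - beta) * g
        out.append(min(1.0, max(floor, g_smooth)))
    return out


# A gain that jumps from 1.0 to 0.0 in one frame is pulled back towards 1.0.
print(constrain_gains([0.0], [1.0]))
# A gain that stays at 0.0 is held at the floor instead.
print(constrain_gains([0.0], [0.0]))  # [0.25]
```

Heavier smoothing trades musical tones against slower tracking, so the constants need to be checked against voice quality by listening.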

19
Quality Metrics
  • Most importantly: Listen!
  • SNR
  • Total
  • Segmental
  • During speech
  • Distortion metrics
  • ISD (Itakura-Saito distance)
  • ITU-T P.862 PESQ/MOS-LQO
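As an illustration of the SNR-type metrics in this list, the sketch below computes total and segmental SNR between a clean reference and a processed signal. The frame length and the clamping limits (-10 to 35 dB) are conventional choices assumed here, not values from the slides.

```python
import math


def total_snr_db(clean, processed):
    """SNR over the whole signal: clean energy over error energy."""
    sig = sum(c * c for c in clean)
    err = sum((c - p) ** 2 for c, p in zip(clean, processed))
    return 10.0 * math.log10(sig / err)


def segmental_snr_db(clean, processed, frame_len=160, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs, each clamped to [lo, hi] dB."""
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        p = processed[start:start + frame_len]
        sig = sum(v * v for v in c)
        err = sum((a - b) ** 2 for a, b in zip(c, p))
        if sig == 0.0:
            continue  # skip silent frames
        snr = hi if err == 0.0 else 10.0 * math.log10(sig / err)
        values.append(min(hi, max(lo, snr)))
    return sum(values) / len(values)


clean = [1.0] * 320
processed = [0.9] * 320
print(round(total_snr_db(clean, processed), 3))      # 20.0
print(round(segmental_snr_db(clean, processed), 3))  # 20.0
```

An "SNR during speech" variant would keep only the frames flagged as speech before averaging.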

20
Quality Metric: P.862 (PESQ/MOS-LQO)
  • MOS-LQO (MOS Listening Quality Objective)
  • Alg-1/2: Wiener methods with 12 dB noise
    suppression
  • What can the best noise suppressor achieve?

21
Quality Metric: My Rule of Thumb
  • Ideal MOS (PESQ) performance bound is given by
    shifting the unprocessed PESQ-curve to the left
  • Example for 12 dB suppression
  • 12 dB shift to the left

12 dB
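The rule of thumb amounts to reading the unprocessed MOS-versus-SNR curve at an SNR that is higher by the suppression amount. A minimal sketch follows. The curve points are made-up placeholders that only show the mechanics, not measurements, and the function names are hypothetical.

```python
def interpolate(curve, x):
    """Linear interpolation on (snr_db, mos) points sorted by SNR."""
    if x <= curve[0][0]:
        return curve[0][1]
    if x >= curve[-1][0]:
        return curve[-1][1]
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def ideal_bound(unprocessed_curve, input_snr_db, suppression_db=12.0):
    """Shifting the curve left by D dB equals reading it at SNR + D."""
    return interpolate(unprocessed_curve, input_snr_db + suppression_db)


# Placeholder curve (illustrative values only).
curve = [(0.0, 1.5), (12.0, 2.5), (24.0, 3.5)]
print(ideal_bound(curve, 0.0))  # 2.5
print(ideal_bound(curve, 6.0))  # 3.0
```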
22
Convergence Rate
  • Important performance criterion
  • Non-stationary noise conditions
  • Frame loss
  • Main objective
  • Maximize convergence rate while maintaining
    speech quality

23
Convergence Rate: A Useful Test
  1. Input sequence
  2. IS-127
  3. Wiener Based
  4. A spectral subtraction m-script retrieved from
    the internet

24
Convergence Rate and MOS-LQO
  1. Normal
  2. Fast
  3. MOS-LQO

25
Current Applications and Drivers of NS Technology
  • Where is NS going in industry now?
  • Beyond 12 dB of suppression
  • Multi-microphone solutions
  • Two- or more channel suppressors
  • Linear beamforming
  • Applications
  • Mobile phones (a few two-microphone models have
    reached the market)
  • Bluetooth headsets: great "new" application for
    signal processing (Ericsson BT headset 2000)

26
Background to Linear Beamforming
  • N: number of microphones
  • Broadside linear beamforming (e.g. delay-sum)
  • Directional gain: 10 log10(N)
  • White Noise Gain (WNG) > 0
  • Practical size: large (30 cm)
  • Endfire differential beamforming
  • Directional gain: 20 log10(N)
  • WNG < 0
  • Practical size: small (1.5-5 cm)

⇒ Differential beamformers more suitable for
small form factors
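The two directional-gain rules on this slide can be compared directly for a given number of microphones. The function names below are hypothetical, and the results are in dB.

```python
import math


def broadside_gain_db(n_mics):
    """Directional gain of a broadside (e.g. delay-sum) array: 10*log10(N)."""
    return 10.0 * math.log10(n_mics)


def endfire_differential_gain_db(n_mics):
    """Directional gain of an endfire differential array: 20*log10(N)."""
    return 20.0 * math.log10(n_mics)


for n in (2, 3, 4):
    print(n, round(broadside_gain_db(n), 1), round(endfire_differential_gain_db(n), 1))
# 2 3.0 6.0
# 3 4.8 9.5
# 4 6.0 12.0
```

The differential array doubles the gain in dB for the same microphone count, at the cost of negative white noise gain (amplified sensor noise).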
27
Background to Linear Beamforming
  • What do we gain?
  • Less reverberation (increased intelligibility)
  • Less (environmental) noise
  • No (or low) distortion on axis
  • Possible interference rejection by spatial
    zero(s)
  • Some Issues
  • Performance is given by critical distance!
  • Increase in sensor noise (WNG, differential
    beamforming)

28
Beamforming Critical Distance
  • Critical distance (reverberation radius):
    distance at which the reverberant-to-direct path
    energy ratio is 0 dB
  • DI (Directivity Index): gain of direct to
    reverberant energy over an omni-directional
    microphone
  • Order: order of finite differences used (1st ⇒ 2
    mics, 2nd ⇒ 3 mics, etc.)

Order   DI (dB)   [column header lost]
0       0
1       6         2.0
2       9.5       3.0
3       12        4.0
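A directivity index of DI dB reduces the picked-up reverberant energy by that amount, so the distance at which direct and reverberant energy are equal grows by a factor of 10^(DI/20). The unlabeled last column of the table on this slide is numerically consistent with that factor, although the transcript does not confirm what the column represents. A short check, with a hypothetical function name:

```python
def critical_distance_factor(di_db):
    """Increase in critical distance relative to an omni microphone."""
    return 10.0 ** (di_db / 20.0)


# (order, DI in dB) pairs as listed on the slide.
for order, di_db in [(0, 0.0), (1, 6.0), (2, 9.5), (3, 12.0)]:
    print(order, di_db, round(critical_distance_factor(di_db), 1))
# 0 0.0 1.0
# 1 6.0 2.0
# 2 9.5 3.0
# 3 12.0 4.0
```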
29
First-Order Differential Beamforming
30
Classical First-Order Beamformer Responses
Cardioid
Hypercardioid
Dipole
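The three patterns named on this slide are commonly written as E(θ) = α + (1 - α) cos θ. The sketch below uses that standard first-order parameterization with the usual α values. It is an assumption that the slide's plots follow the same convention.

```python
import math

# alpha values for the classical first-order patterns
PATTERNS = {"cardioid": 0.5, "hypercardioid": 0.25, "dipole": 0.0}


def response(alpha, theta_deg):
    """First-order directional response at angle theta (0 deg = endfire)."""
    return alpha + (1.0 - alpha) * math.cos(math.radians(theta_deg))


def directivity_index_db(alpha):
    """DI for an axisymmetric first-order pattern in a diffuse field."""
    return 10.0 * math.log10(1.0 / (alpha ** 2 + (1.0 - alpha) ** 2 / 3.0))


for name, alpha in PATTERNS.items():
    print(name, round(directivity_index_db(alpha), 2))
# cardioid 4.77
# hypercardioid 6.02
# dipole 4.77
```

The cardioid has its null at 180°, the dipole at 90°, and the hypercardioid reaches the 6 dB first-order maximum.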
31
Beamforming Demo: DEWIND processing