mh presentation template - PowerPoint PPT Presentation

About This Presentation

Title:

mh presentation template

Description:

www.mhacoustics.com ... Reflections on Issues, Requirements, and Solutions Tomas Gaensler mh acoustics – PowerPoint PPT presentation

Slides: 32

Provided by: TomasGa

Transcript and Presenter's Notes

1
Front-end Audio Processing: Reflections on
Issues, Requirements, and Solutions
Tomas Gaensler mh acoustics www.mhacoustics.com
Summit NJ/Burlington VT USA
2
Front-end Audio Processing

Processing to enhance perceived and/or measured
sound quality in communication and recording
devices

Then
Now
3
Not So Famous Quotes (Acoustic Jewelry/Bluetooth
Headset)

Gary Elko (mh/Bell Labs colleague)
At IWAENC 1995: "Acoustic echo cancellation will
not be needed in the future when people wear
acoustic jewelry"
Arno Penzias (1978 Nobel Prize laureate)
"No one would want acoustic jewelry because
people would think the users talking to
themselves are crazy"
I'm glad the success of Bluetooth headsets shows
that both were completely wrong!

4
Classical Front-end Architectures - POTS
5
Classical Front-end Architectures - Cellphone 1995
6
Classical Front-end Architectures - Cellphone
2005 - 2010
7
Cellphones and Handsfree

Common problems
Far-end listener does not hear near-end talker
Near-end listener does not understand far-end
talker
Why?
Form factor, size
Limited understanding of physics and acoustics(?)

8
RX/TX Levels, Coupling and Doubletalk

Echo louder than near-end
Linear AEC
ERLE ≈ 20-30 dB
After cancellation: Residual Echo to Near-end
Ratio (RENR)
RENR ≈ 90 - 20 - 70 = 0 dB

Far-end ≈ 95-100 dBSPL at loudspeaker,
85-90 dBSPL at mic

>20 dB of residual echo suppression required
Duplexness suffers

Near-end talker ≈ 55-70 dBSPL at mic
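
The level budget on this slide can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming all levels are in dB and simply subtract; the function name `renr_db` is hypothetical and not from the presentation.

```python
def renr_db(echo_at_mic_db_spl, erle_db, near_end_db_spl):
    """Residual-echo-to-near-end ratio (RENR) in dB.

    The residual echo is the echo level at the mic minus the
    attenuation (ERLE) achieved by the linear AEC.
    """
    residual_echo_db_spl = echo_at_mic_db_spl - erle_db
    return residual_echo_db_spl - near_end_db_spl


# Slide values: 90 dBSPL echo at the mic, 20 dB ERLE, 70 dBSPL near-end talker.
print(renr_db(90.0, 20.0, 70.0))
```

With these numbers the residual echo is as loud as the near-end talker, which is why more than 20 dB of additional residual echo suppression is asked for.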
9
TX Dynamic Range and Noise

Echo 90 dBSPL → Peak echo ≈ 105-110 dB
No saturation of echo in TX path

Echo Level 90 dBSPL
Near-end speech Level 70 dBSPL
10
TX Fixed-point Processing and Quantization Noise
Q-noise increases by 6 log2(N) dB!

N = 64 → Q-noise increases by 36 dB
Double-precision required
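
The 36 dB figure follows directly from the slide's rule of 6 dB per doubling of N. A minimal sketch of that arithmetic, where the function name is hypothetical and N is used as on the slide:

```python
import math


def q_noise_increase_db(n, db_per_doubling=6.0):
    """Quantization-noise growth in dB using the 6*log2(N) rule from the slide."""
    return db_per_doubling * math.log2(n)


print(q_noise_increase_db(64))
```

At roughly 6 dB per bit, 36 dB corresponds to about 6 bits of lost precision, which is consistent with the call for double precision.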

11
RX Dynamic Range and Distortion
Digital gain
Analog gain
To AEC

Small loudspeakers have rather high cut-off
frequency (high-pass)
EQ often required to get acceptable sound
(frequency response). However, EQ means:
Loss of signal loudness and dynamic range
Increased (analog) distortion
Many manufacturers compensate the loss of signal
level by excessive digital gain and therefore get
(digital) saturation

12
What Can or Should be Done?

Minimize acoustical coupling by good physical
design
TX
Use noise suppression but not excessively
Double-precision, block scaling, or
floating-point
RX
Compression instead of fixed gain
10% or less loudspeaker/driver THD is desired
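
To illustrate "compression instead of fixed gain" in the RX path, here is a minimal sketch of a static level curve. The 12 dB gain, -24 dBFS threshold, and 2:1 ratio are illustrative assumptions, not values from the presentation, and a real compressor would also need attack and release smoothing.

```python
def fixed_gain_level_db(level_dbfs, gain_db):
    """Output level with a fixed digital gain; anything above 0 dBFS would clip."""
    return level_dbfs + gain_db


def compressed_level_db(level_dbfs, gain_db=12.0, threshold_dbfs=-24.0, ratio=2.0):
    """Static compressor curve: full gain below the threshold, reduced slope above it."""
    if level_dbfs <= threshold_dbfs:
        return level_dbfs + gain_db
    return threshold_dbfs + gain_db + (level_dbfs - threshold_dbfs) / ratio


for level in (-40.0, -24.0, -6.0, 0.0):
    print(level, fixed_gain_level_db(level, 12.0), compressed_level_db(level))
```

With these assumed settings, quiet signals get the same 12 dB boost in both cases, but a -6 dBFS input lands at +6 dBFS (digital saturation) with fixed gain and at -3 dBFS with compression.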

13
What about Non-linear AEC Algorithms?

Interesting problem proposed and worked on for
many years
Not practical in most AEC applications since
Complicated model
Gain and therefore saturation possibly in both TX
and RX paths
Added complexity and system cost
Often slow convergence
Difficult to fine-tune in field
Even when non-linear cancellation works
perfectly, the user still perceives a distorted
loudspeaker signal!

14
Classical Front-end Architectures - Cellphone
2005 - 2010
15
Single Channel Noise Suppression

Basic single channel noise suppressor
An extremely successful signal processing
invention by Manfred Schroeder in the 1960s
Musical tones: is it a (solved) problem?
How do we evaluate and improve quality?
How about convergence rate?

16
Background to Single Channel Noise Suppressors
[Block diagram: speech + noise → NS → enhanced speech]

Block processing
Frequency domain model
Linear Time-varying filter
Wiener filter
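
The slide's equations did not survive in this transcript. As a stand-in, the following is a minimal sketch of one common form of per-bin Wiener gain for a single frame, assuming estimates of the noisy-signal and noise power spectra are already available. The function name, the 12 dB suppression limit, and the gain-floor approach are illustrative assumptions, not the presenter's formulation.

```python
def wiener_gains(noisy_psd, noise_psd, max_suppression_db=12.0):
    """Per-bin Wiener-style gains G = 1 - N/Y, limited by a gain floor.

    noisy_psd: power of speech + noise per frequency bin (Y)
    noise_psd: estimated noise power per frequency bin (N)
    """
    floor = 10.0 ** (-max_suppression_db / 20.0)
    gains = []
    for y, n in zip(noisy_psd, noise_psd):
        g = 1.0 - n / y if y > 0.0 else floor
        gains.append(max(g, floor))  # never suppress more than the floor allows
    return gains


# Three bins: strong speech, noise only, and noise overestimated.
print(wiener_gains([4.0, 1.0, 1.0], [1.0, 1.0, 2.0]))
```

The gains form a linear, time-varying filter: they are recomputed for each block and multiplied onto the noisy spectrum before the inverse transform.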

17
Background to Single Channel Noise Suppressors

Estimation of spectra is often done recursively
Frequency smoothing

Noise spectrum updated when speech is not present
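
Recursive estimation of the noise spectrum can be sketched as a first-order average that is frozen while speech is present. This is a minimal sketch under assumptions: the smoothing constant `alpha`, the function name, and the externally supplied `speech_present` flag are all hypothetical, and frequency smoothing is left out.

```python
def update_noise_psd(noise_psd, frame_power, speech_present, alpha=0.9):
    """One recursive update of the per-bin noise power estimate.

    noise_psd:      previous noise estimate per bin
    frame_power:    |Y(k)|^2 of the current frame per bin
    speech_present: if True, the estimate is held unchanged
    """
    if speech_present:
        return list(noise_psd)
    return [alpha * p + (1.0 - alpha) * y for p, y in zip(noise_psd, frame_power)]


estimate = [1.0, 1.0]
estimate = update_noise_psd(estimate, [2.0, 0.0], speech_present=False, alpha=0.5)
print(estimate)
```

A larger `alpha` gives a smoother estimate but slower tracking, which is the trade-off behind the convergence-rate discussion later in the deck.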
18
Musical Tones: Is it a (Solved) Problem?

Examples
Original (Sally Sievers reel, June-Sept. 1964
by Manfred Schroeder and Mohan Sondhi at Bell
Labs)
Original + noise (iSNR 6 dB)
Schroeder 1960s
Generic spectral subtraction (Boll, 1979)
IS-127 (1995)
A problem of the last century; now only a
design constraint
Controlling variance of suppression gains
Any NS algorithm should be constrained not to
have musical tones
Must only have a small impact on voice quality

19
Quality Metrics

Most importantly: Listen!
SNR
Total
Segmental
During speech
Distortion metrics
ISD (Itakura-Saito distance)
ITU-T P.862 PESQ/MOS-LQO
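
The difference between total and segmental SNR can be sketched as follows, assuming a time-aligned clean reference is available. The frame length of 160 samples and the clamping range of -10 to 35 dB are common conventions used here as assumptions, not values from the presentation.

```python
import math


def total_snr_db(clean, processed):
    """Total SNR over the whole signal, in dB (assumes processed != clean)."""
    signal = sum(s * s for s in clean)
    error = sum((s - p) ** 2 for s, p in zip(clean, processed))
    return 10.0 * math.log10(signal / error)


def segmental_snr_db(clean, processed, frame_len=160, lo_db=-10.0, hi_db=35.0):
    """Mean of per-frame SNRs, each clamped to [lo_db, hi_db]."""
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        p = processed[start:start + frame_len]
        signal = sum(s * s for s in c)
        error = sum((s - q) ** 2 for s, q in zip(c, p))
        if signal == 0.0:
            continue  # skip silent frames of the reference
        snr = hi_db if error == 0.0 else 10.0 * math.log10(signal / error)
        values.append(min(max(snr, lo_db), hi_db))
    if not values:
        raise ValueError("no frames with signal energy")
    return sum(values) / len(values)
```

Total SNR is dominated by the loudest passages, while the segmental variant weights every frame equally. Restricting the frames to those where the talker is active gives the "during speech" variant.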

20
Quality Metric: P.862 (PESQ/MOS-LQO)

MOS-LQO (MOS Listening Quality Objective)
Alg-1/2 Wiener methods with 12 dB noise
suppression

What can the best noise suppressor achieve?

21
Quality Metric: My Rule of Thumb

Ideal MOS (PESQ) performance bound is given by
shifting the unprocessed PESQ-curve to the left
Example for 12 dB suppression
12 dB shift to the left
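
The rule of thumb amounts to evaluating the unprocessed MOS-versus-SNR curve at an SNR that is higher by the suppression amount. A minimal sketch, where `ideal_bound` is a hypothetical helper and `placeholder_curve` is a made-up straight line standing in for a measured PESQ curve (it is not real data):

```python
def ideal_bound(unprocessed_curve, suppression_db):
    """Return the unprocessed curve shifted left by suppression_db.

    unprocessed_curve: function mapping input SNR (dB) to MOS-LQO
    """
    def bound(snr_db):
        return unprocessed_curve(snr_db + suppression_db)
    return bound


def placeholder_curve(snr_db):
    """Illustrative stand-in only; replace with measured unprocessed PESQ scores."""
    return 1.0 + 0.1 * snr_db


bound_12db = ideal_bound(placeholder_curve, 12.0)
print(bound_12db(6.0) == placeholder_curve(18.0))
```

In other words, a suppressor with 12 dB of noise reduction can at best score what the unprocessed signal would score at a 12 dB higher input SNR.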

22
Convergence Rate

Important performance criterion
Non-stationary noise conditions
Frame loss
Main objective
Maximize convergence rate while maintaining
speech quality

23
Convergence Rate: A Useful Test

Input sequence
IS-127
Wiener Based
A spectral subtraction m-script retrieved from
the internet

24
Convergence Rate and MOS-LQO

Normal
Fast
MOS-LQO

25
Current Applications and Drivers of NS Technology

Where is NS going in industry now?
Beyond 12 dB of suppression
Multi-microphone solutions
Two- or more channel suppressors
Linear beamforming
Applications
Mobile phones (a few two-microphone models have
reached the market)
Bluetooth headsets: a great "new" application for
signal processing (Ericsson BT headset 2000)

26
Background to Linear Beamforming

N: number of microphones
Broadside linear beamforming (e.g. delay-sum)
Directional gain 10 log10(N) dB
White Noise Gain (WNG) > 0
Practical size large (30 cm)
Endfire differential beamforming
Directional gain 20 log10(N) dB
WNG < 0
Practical size small (1.5-5 cm)

→ Differential beamformers more suitable for
small form-factors
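
The two directional-gain rules can be compared numerically. A minimal sketch, assuming the slide's "log" is base 10 and the result is in dB; the function names are hypothetical.

```python
import math


def delay_sum_gain_db(n_mics):
    """Directional gain of a broadside delay-sum array: 10*log10(N) dB."""
    return 10.0 * math.log10(n_mics)


def differential_gain_db(n_mics):
    """Directional gain of an endfire differential array: 20*log10(N) dB."""
    return 20.0 * math.log10(n_mics)


for n in (2, 3, 4):
    print(n, round(delay_sum_gain_db(n), 1), round(differential_gain_db(n), 1))
```

For the same number of microphones the differential array offers twice the gain in dB from a much smaller aperture, at the cost of amplified sensor noise (WNG below 0).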
27
Background to Linear Beamforming

What do we gain?
Less reverberation (increased intelligibility)
Less (environmental) noise
No (or low) distortion on axis
Possible interference rejection by spatial
zero(s)
Some Issues
Performance is given by critical distance!
Increase in sensor noise (WNG, differential
beamforming)

28
Beamforming Critical Distance

Critical distance (reverberation radius): the
distance at which the reverberant-to-direct path
energy ratio is 0 dB
DI (Directivity Index): gain of direct to
reverberant energy over an omnidirectional
microphone
Order: order of finite differences used (1st → 2
mics, 2nd → 3 mics, etc.)

Order   DI (dB)   Critical distance factor
0       0
1       6         2.0
2       9.5       3.0
3       12        4.0
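
The values 2.0, 3.0, and 4.0 listed next to the DI figures are consistent with the critical distance growing as the square root of the directivity factor, i.e. by 10^(DI/20). The sketch below works under that assumption; the transcript does not state the formula, and the function name is hypothetical.

```python
def critical_distance_factor(di_db):
    """Increase in critical distance relative to an omnidirectional mic.

    Assumes the distance scales with the square root of the directivity
    factor, which equals 10**(DI/20) for DI given in dB.
    """
    return 10.0 ** (di_db / 20.0)


for order, di_db in ((0, 0.0), (1, 6.0), (2, 9.5), (3, 12.0)):
    print(order, di_db, round(critical_distance_factor(di_db), 1))
```

Under this reading, a first-order differential microphone can be used at about twice the distance of an omnidirectional one for the same direct-to-reverberant ratio.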
29
First-Order Differential Beamforming
30
Classical First-Order Beamformer Responses
Cardioid
Hypercardioid
Dipole
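
The three named responses are usually written as members of one first-order family, E(θ) = α + (1 - α) cos θ. The sketch below uses the textbook parameter values for that family, which are not given in the transcript, and computes the directivity index in a spherically isotropic noise field from the closed form 1 / (α² + (1 - α)²/3).

```python
import math

# Textbook first-order parameters (assumed, not taken from the slides).
PATTERNS = {"dipole": 0.0, "hypercardioid": 0.25, "cardioid": 0.5}


def response(alpha, theta_rad):
    """First-order directional response, normalized to 1 on axis."""
    return alpha + (1.0 - alpha) * math.cos(theta_rad)


def directivity_index_db(alpha):
    """DI in dB for a spherically isotropic (diffuse) noise field."""
    q = 1.0 / (alpha ** 2 + (1.0 - alpha) ** 2 / 3.0)
    return 10.0 * math.log10(q)


for name, alpha in PATTERNS.items():
    rear = response(alpha, math.pi)
    print(name, round(rear, 2), round(directivity_index_db(alpha), 2))
```

The hypercardioid reaches the first-order maximum of about 6 dB, matching the order-1 entry of the DI table, while the cardioid trades some directivity for a null directly behind the microphone.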
31
Beamforming Demo: DEWIND processing
