Understand what is Noise presentation

1
Understand what is Noise

Presenter: Shih-Hsiang

Spoken Language Processing, Chapter
Advanced Digital Signal Processing and Noise Reduction, Chapter 2
Robustness Techniques for Speech Recognition, Berlin Chen, 2004
2
Introduction - Noise

What is Noise?
Unwanted signal that interferes with the
communication or measurement or processing of an
information-bearing signal
Noise is present in various degrees in almost all
environments
Noise can cause transmission errors and may even
disrupt a communication process
What kinds of noise occur in the real world?

3
Different kinds of noise
Depending on its source

Acoustic noise
Emanated from moving, vibrating, colliding
sources
moving cars, air conditioners, computer fans,
traffic, people talking in the background, wind,
rain
Electromagnetic noise
Present at all frequencies (from electric
devices)
radio, television transmitters and receivers
Electrostatic noise
Generated by the presence of a voltage with or
without current flow
fluorescent lighting
Channel distortions, echo, and fading
Non-ideal characteristics of communication
channel
radio channel
Processing noise
Results from the digital / analog processing of
signals

4
Different kinds of noise (cont.)
Depending on its frequency or time
characteristics

Narrow band noise
a noise process with a narrow bandwidth
Band-limited white noise
a noise with a flat spectrum and a limited
bandwidth that usually covers the limited
spectrum of the device
White Noise (theoretical concept)
has the same power at all frequencies
Coloured noise
non-white noise or any wideband noise whose
spectrum has a non-flat shape
Pink noise, brown noise
Impulsive noise
consists of short-duration pulses of random
amplitude and random duration
Transient noise
consists of relatively long duration noise pulses

5
Color of Noise

White Noise
a signal with a flat frequency spectrum in linear
space
Pink Noise
frequency spectrum of pink noise is flat in
logarithmic space
power density decreases 3 dB per octave with
increasing frequency
Brown Noise
power density decrease of 6 dB per octave with
increasing frequency
Blue Noise (Azure Noise)
power density increases 3 dB per octave with
increasing frequency
Purple Noise (Violet Noise)
power density increases 6 dB per octave with
increasing frequency
Gray Noise
noise subjected to a psychoacoustic equal
loudness curve over a given range of frequencies
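
The slopes listed on this slide can be reproduced by shaping white noise in the frequency domain. The following is a minimal sketch, assuming NumPy; the function name `colored_noise` and its parameters are illustrative and not from the slides. It uses the relation that a power slope of d dB per octave corresponds to power proportional to f^alpha with alpha = d / (10 log10 2), so -3 dB per octave is approximately 1/f.

```python
import numpy as np

def colored_noise(n_samples, db_per_octave, seed=0):
    """Shape white Gaussian noise so its power density changes by
    `db_per_octave` dB per octave (0: white, -3: pink, -6: brown,
    +3: blue, +6: purple/violet)."""
    rng = np.random.default_rng(seed)
    white = rng.standard_normal(n_samples)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n_samples)          # normalized frequency, 0 .. 0.5

    # Power ~ f**alpha; the amplitude gain is the square root of the power gain.
    alpha = db_per_octave / (10.0 * np.log10(2.0))
    gain = np.zeros_like(freqs)                 # gain[0] = 0 removes the DC term
    gain[1:] = freqs[1:] ** (alpha / 2.0)

    shaped = np.fft.irfft(spectrum * gain, n=n_samples)
    return shaped / np.std(shaped)              # normalize to unit variance

pink = colored_noise(2 ** 16, -3.0)
brown = colored_noise(2 ** 16, -6.0)
```

Gray noise is not covered by this sketch, since it needs an equal-loudness curve rather than a constant slope.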

6
Spectrogram

The spectrogram shows the energy in a signal at
each frequency and at each time

Dark areas of the spectrogram show high intensity
7
White Noise
8
Pink Noise
9
Brown Noise
10
Blue Noise
11
Purple Noise
12
Gray Noise
13
Noises in Aurora 2
Babble
Airport
Exhibition
Car
14
Noises in Aurora 2 (cont.)
Street
Restaurant
Train
Subway
15
Noise in Speech Recognition
Time-Domain
Frequency-Domain
16
Additive Noises / Convolutional Noises

Additive noises can be stationary or
non-stationary
Stationary noises
the power spectral density does not change over
time
such noises are often also narrow-band
such as computer fan, air conditioning, car noise
Non-stationary noises
the statistical properties change over time
wide band noise
machine gun, door slams, keyboard clicks,
radio/TV, and other speakers voices (babble
noise)
Convolutional noises (channel noises) result mainly
from channel distortion and are
stationary in most cases
Reverberation, the frequency response of the
microphone, transmission lines, etc.

17
Reconstruction of incomplete spectrograms for
robust speech recognition
Bhiksha Raj Ramakrishnan, Ph.D. dissertation, ECE
Dept., CMU, Apr. 2000. Advisor: Richard Stern

Presenter: Shih-Hsiang

18
Introduction

The performance of ASR systems degrades greatly
when the speech has been corrupted by noise
Train with the same level of noise?
Two approaches to reduce the mismatch
Data-compensation methods
Classifier-compensation methods (Model Adaptation)

Training Data Distribution
Testing Data Distribution
no longer similar
19
Introduction (cont.)

The drawbacks of the above approaches
Most of them assume the noise is stationary
They assume the effect of the noise can be represented by a
linear transform of the parameters
They are effective only in the context of their intended
purposes
The human auditory system preferentially processes
the high-energy components of the speech signal
while suppressing the weaker components
Humans are able to comprehend speech that has
undergone considerable spectral excision

20
Introduction (cont.)

Two new approaches have been developed
Multi-band approaches (Hermansky)
Different frequency bands of speech signals may
be corrupted at different SNRs.
Using divide-and-conquer
Deweighting noisy bands
Missing-feature approaches (Cooke)
Low-SNR regions are selectively erased or labeled
as unreliable
Recognition is performed on the basis of incomplete data

21
Introduction (cont.)

The advantages of missing-feature approaches
Make no assumptions about the corrupting noise
Do not need any knowledge about the noise
Remarkably robust to high levels of noise
corruption
Missing-feature methods
Classifier modification methods
Model the effect of the incompleteness of the
data
Spectrogram reconstruction methods
Estimate the missing components of incomplete
spectrograms and reconstruct them

22
Introduction (cont.)

Classifier modification methods
Spectrogram reconstruction methods (today's
topic)

23
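Background Information: Multivariate Gaussian
Distribution

When X = (X1, ..., XL) is an L-dimensional random
vector with mean vector $\mu$ and covariance matrix $\Sigma$,
the multivariate Gaussian pdf has the form
$$f(x) = (2\pi)^{-L/2} |\Sigma|^{-1/2} \exp\left( -\tfrac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right)$$
Conditional distributions
If X is partitioned as (X1, X2), then X1 conditional on X2 = a is multivariate
normal with
mean: $\mu_1 + \Sigma_{12} \Sigma_{22}^{-1} (a - \mu_2)$
covariance: $\Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}$

mean shift: $\Sigma_{12} \Sigma_{22}^{-1} (a - \mu_2)$
regression coefficients: $\Sigma_{12} \Sigma_{22}^{-1}$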
24
Background Information: Multivariate Gaussian
Distribution (cont.)
observed data
missing data
25
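Background Information: Maximum A-Posteriori (MAP)
Estimation
In MAP estimation the missing data $x_m$ are estimated
to maximize their likelihood, conditioned on the
value of the observed data $x_o$
$$\hat{x}_m = \arg\max_{x_m} P(x_m \mid x_o)$$
When $P(x_o, x_m)$ is Gaussian, we get
$$\hat{x}_m = \mu_m + \Sigma_{mo} \Sigma_{oo}^{-1} (x_o - \mu_o)$$
where $\Sigma_{mo}$ is the cross-covariance between the missing and observed
components and $\Sigma_{oo}$ is the covariance of the observed components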
26
Background Information: Maximum A-Posteriori
Estimation (cont.)
Figure. The same Gaussian sliced at X = 2. The
flat surface in the figure represents the
distribution of all vectors whose X component is
2. This distribution peaks at Y = Y1. Thus Y1 is
the MAP estimate of Y when X is 2
Figure. The solid horizontal line shows the
observed value of X. The circle on the
intersection of the solid diagonal line and the
dotted line shows where the distribution of
vectors with X = 2 peaks. This is the MAP estimate
of Y when X = 2. The solid diagonal line shows how
the position of this peak varies at each value
of X.
Figure. Gaussian distribution of a 2-dimensional
random vector. The mean of the Gaussian is at
(1, 1). The X and Y components each have variance
1.0, and the covariance between X and Y is 0.5
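
The numbers in these captions make a convenient worked example of the Gaussian MAP estimate. The sketch below assumes NumPy and uses the standard conditional-mean formula for a jointly Gaussian vector; the function name `gaussian_map_estimate` and the index convention (X is component 0, Y is component 1) are illustrative choices, not taken from the slides.

```python
import numpy as np

def gaussian_map_estimate(mean, cov, observed_idx, observed_values):
    """MAP estimate of the unobserved components of a Gaussian vector:
    mu_m + C_mo C_oo^{-1} (x_o - mu_o)."""
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    observed_idx = np.asarray(observed_idx, dtype=int)
    missing_idx = np.setdiff1d(np.arange(mean.size), observed_idx)

    cov_oo = cov[np.ix_(observed_idx, observed_idx)]   # observed autocovariance
    cov_mo = cov[np.ix_(missing_idx, observed_idx)]    # missing/observed cross-covariance
    deviation = np.asarray(observed_values, dtype=float) - mean[observed_idx]
    return mean[missing_idx] + cov_mo @ np.linalg.solve(cov_oo, deviation)

# Mean (1, 1), unit variances, covariance 0.5, observed X = 2.
y1 = gaussian_map_estimate([1.0, 1.0], [[1.0, 0.5], [0.5, 1.0]], [0], [2.0])
```

With these values the estimate is Y1 = 1 + 0.5 * (2 - 1) = 1.5, which is the point where the diagonal line in the second figure crosses X = 2.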
27
Background Information: Spectrogram

It is a pictorial representation of the
short-time periodogram

short-time Fourier transform
where
Px(l,ω) represents the power in frequency ω at
time instant l in the signal
S(l,k) represents the kth component of the lth
log-spectral vector
28
Background Information: Spectrogram (cont.)

Wide-band spectrograms: shorter windows (< 10 ms)
have good time resolution
Narrow-band spectrograms: longer windows (> 20 ms)
the harmonics can be clearly seen
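
The trade-off between window length and resolution can be seen by computing a log-power spectrogram with two window sizes. This is a minimal sketch assuming NumPy, a Hann window, and a hop of half a window by default; `log_spectrogram` and its parameters are hypothetical names, and the signal is assumed to be at least one window long.

```python
import numpy as np

def log_spectrogram(signal, sample_rate, window_ms, hop_ms=None, floor=1e-10):
    """Return an array of shape (frames, frequency bins) of log power values."""
    signal = np.asarray(signal, dtype=float)
    win = int(round(sample_rate * window_ms / 1000.0))
    hop = win // 2 if hop_ms is None else int(round(sample_rate * hop_ms / 1000.0))
    n_frames = 1 + (len(signal) - win) // hop
    window = np.hanning(win)

    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([signal[i * hop:i * hop + win] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # short-time periodogram
    return np.log(np.maximum(power, floor))            # floor avoids log(0)

sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
wide_band = log_spectrogram(tone, sample_rate, window_ms=5)     # many frames, few bins
narrow_band = log_spectrogram(tone, sample_rate, window_ms=25)  # few frames, many bins
```

The 5 ms window gives more frames (finer time steps) but only 21 frequency bins, while the 25 ms window gives 101 bins, which is what lets individual harmonics be resolved.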

29
Background Information: Mel Spectrogram

The mel spectrogram consists of a sequence of log
mel-spectral vectors

Px(l,k) is the kth component of the mel spectrum
in the lth analysis window
mk(j) is the jth DFT coefficient of the impulse
response of the kth mel filter
K is the total number of mel filters
30
Background Information: Mel Spectrogram (cont.)
31
Background Information: Effect of noise on the
spectrogram

When the speech signal is corrupted by additive
noise
If we assume that the noise is uncorrelated with the
speech signal

time domain
frequency domain
spectrogram
mel-spectrogram
32
Background Information: Effect of noise on the
spectrogram (cont.)
Regions with a local SNR of less
than 0 dB have been deleted
Speech corrupted to 15 dB by additive
white noise
Speech corrupted to 10 dB by additive
white noise
Regions with a local SNR of less
than 0 dB have been deleted
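
The two operations behind these figures, corrupting speech to a target SNR and deleting regions whose local SNR falls below 0 dB, can be sketched as follows. This assumes NumPy, that the clean speech and the noise are available separately (an oracle setting), and that the power spectrograms are computed elsewhere; `mix_at_snr` and `oracle_mask` are illustrative names.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`,
    then add it to `speech`. Returns (noisy signal, scaled noise)."""
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    scaled_noise = scale * noise
    return speech + scaled_noise, scaled_noise

def oracle_mask(speech_power, noise_power, threshold_db=0.0, floor=1e-12):
    """True where the local SNR of a time-frequency element is at or above
    the threshold (element kept), False where it would be deleted."""
    speech_power = np.maximum(np.asarray(speech_power, dtype=float), floor)
    noise_power = np.maximum(np.asarray(noise_power, dtype=float), floor)
    local_snr_db = 10.0 * np.log10(speech_power / noise_power)
    return local_snr_db >= threshold_db
```

In practice the clean speech is not available, which is why estimating the mask is treated as a separate problem later in the talk.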
33
Recognizing speech with incomplete spectrograms
Modify the manner in which the classifier, or
recognizer

A speech recognition system is a statistical
pattern classifier
There are two possible approaches to handling incomplete spectrograms
Data imputation approach
Marginalization approach

language model
acoustic model
Decompose S into its observed and missing
components as S = [So, Sm]
Sm is not known and therefore its likelihood
cannot be computed
34
Spectrogram reconstruction methods for missing
data
Data-compensation approach

Estimating missing regions of incomplete
spectrograms to reconstruct complete spectrogram
Geometrical reconstruction methods
Linear interpolation
Nonlinear interpolation with polynomial function
Cluster-based reconstruction methods
Single cluster based reconstruction
Multiple cluster based reconstruction
Covariance-based reconstruction methods

35
Spectrogram reconstruction methods for missing
data (cont.): Geometrical reconstruction methods

Interpolating between adjacent observed elements
in the spectrogram to reconstruct a missing
element
adjacent along frequency axis
adjacent along time axis
The interpolation used could be
simple linear interpolation
higher-order functional forms such as
polynomials, rational functions, or splines

36
Spectrogram reconstruction methods for missing
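data (cont.): Geometrical reconstruction methods

Linear interpolation
Given any sequence of numbers s1, s2, ..., sM,
where the samples between l1 and l2 are
unknown or missing, each missing sample is estimated as
$$\hat{s}_l = s_{l_1} + \frac{l - l_1}{l_2 - l_1} \left( s_{l_2} - s_{l_1} \right), \quad l_1 < l < l_2$$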
37
Spectrogram reconstruction methods for missing
data (cont.): Geometrical reconstruction methods

Linear interpolation (cont.)
Linear along frequency
Linear along time

s(l,k): kth component of the lth spectral vector in the
spectrogram
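
Linear interpolation along either axis can be sketched with `numpy.interp`. The layout (rows are spectral vectors l, columns are frequency components k), the boolean mask convention (True means observed), and the function name `interpolate_missing` are assumptions made for illustration. Missing elements at the edges, which have an observed neighbour on one side only, take the nearest observed value, because `numpy.interp` holds the end values constant.

```python
import numpy as np

def interpolate_missing(spectrogram, mask, along="time"):
    """Fill missing elements of s(l, k) by linear interpolation between the
    nearest observed elements along the time or the frequency axis."""
    out = np.array(spectrogram, dtype=float)      # copy; rows = frames l
    mask = np.asarray(mask, dtype=bool)
    if along == "time":
        rows, row_masks = out.T, mask.T           # one row per frequency channel
    elif along == "frequency":
        rows, row_masks = out, mask               # one row per spectral vector
    else:
        raise ValueError("along must be 'time' or 'frequency'")

    for row, observed in zip(rows, row_masks):    # rows are views into `out`
        if observed.any() and not observed.all():
            positions = np.arange(row.size)
            row[~observed] = np.interp(positions[~observed],
                                       positions[observed], row[observed])
    return out
```

A row with no observed element at all is left unchanged, which reflects the drawback that interpolation has nothing to work from when too much is missing.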
38
Spectrogram reconstruction methods for missing
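data (cont.): Geometrical reconstruction methods

Nonlinear interpolation (1) with polynomial
functions
Lagrange's formula: given a set of L points on a
plane, (x1,y1), (x2,y2), ..., (xL,yL), the polynomial of degree at most L-1
that passes through all of them is
$$P(x) = \sum_{i=1}^{L} y_i \prod_{j \ne i} \frac{x - x_j}{x_i - x_j}$$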
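
Lagrange's formula can be evaluated directly. The sketch below is plain Python with an illustrative function name; it assumes the x values are distinct. For reconstruction, the x values would be the positions of observed elements and the y values their spectral values.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate at `x` the unique polynomial of degree at most len(xs) - 1
    that passes through the points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)   # Lagrange basis factor
        total += term
    return total

# Three points on y = x**2 determine the parabola exactly.
value = lagrange_interpolate([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], 1.5)
```

Here `value` is 2.25, since a quadratic is recovered exactly from three points. With many points the polynomial can oscillate strongly between them.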

39
Spectrogram reconstruction methods for missing
data (cont.): Geometrical reconstruction methods

Nonlinear interpolation (1) with polynomial
functions (cont.)
Nonlinear along frequency
Nonlinear along time

s(l,k): kth component of the lth spectral vector in the
spectrogram
40
Spectrogram reconstruction methods for missing
data (cont.): Geometrical reconstruction methods

Experimental results
Using mean squared error (MSE) to measure the
accuracy of the reconstructed spectrogram
The greater the MSE, the greater the divergence
between the reconstructed and uncorrupted
spectrograms

True uncorrupted spectrogram
Reconstructed spectrogram
The number of missing elements in the spectrogram
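
The MSE equation itself did not survive in this transcript. A plausible reading of the three labels above is the squared difference between the true and the reconstructed spectrogram, summed over the missing elements and divided by their number. The sketch below implements that reading as an assumption; the function name and the mask convention (True means missing) are illustrative.

```python
import numpy as np

def reconstruction_mse(true_spec, reconstructed_spec, missing):
    """Mean squared error over the missing elements only (assumed form)."""
    true_spec = np.asarray(true_spec, dtype=float)
    reconstructed_spec = np.asarray(reconstructed_spec, dtype=float)
    missing = np.asarray(missing, dtype=bool)
    n_missing = int(missing.sum())
    if n_missing == 0:
        return 0.0                      # nothing was reconstructed
    diff = true_spec[missing] - reconstructed_spec[missing]
    return float(np.sum(diff ** 2) / n_missing)
```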
41
Spectrogram reconstruction methods for missing
data (cont.): Geometrical reconstruction methods

Experimental results (cont.)

Figure panels: original; randomly 50% deleted; linear interpolation along time; linear interpolation along frequency; nonlinear interpolation (1) along frequency; nonlinear interpolation (1) along time; nonlinear interpolation (2) along frequency; nonlinear interpolation (2) along time
42
Spectrogram reconstruction methods for missing
data (cont.): Geometrical reconstruction methods

Experimental results (cont.)

MSE along time
MSE along frequency
Accuracy along time
Accuracy along frequency
43
Spectrogram reconstruction methods for missing
data (cont.): Geometrical reconstruction methods

Summary
Linear interpolation estimation can be quite
effective
More detailed models are more likely to be
erroneous
Interpolation along time is generally more
effective than interpolation along frequency
Not enough frequency components
Several drawbacks
When the fraction of missing elements is very
high
there might not be sufficient information
remaining in the picture to reconstruct the
missing elements properly
If the observed elements in the spectrogram are
distorted,
all missing elements reconstructed from them
would be distorted similarly

44
Spectrogram reconstruction methods for missing
data (cont.): Cluster-based reconstruction methods

Use the vector statistics of the spectral vectors
for reconstruction of the complete spectrogram
Spectral vectors are assumed to be segregated
into a set of clusters

MAP estimate for the missing component
complete component
observed component
45
Spectrogram reconstruction methods for missing
data (cont.): Cluster-based reconstruction methods

Single Cluster-based reconstruction methods
The tth spectral vector in the spectrogram is denoted
S(t)
The missing component of the tth spectral vector
is denoted Sm(t)
The observed component of the tth spectral vector
is denoted So(t)
S(t) = At [So(t), Sm(t)], where At is the
permutation matrix

46
Spectrogram reconstruction methods for missing
data (cont.): Cluster-based reconstruction methods

Single Cluster-based reconstruction methods
(cont.)
Experimental results

Figure panels: original; randomly 50% deleted; reconstructed (complete spectrogram)
47
Spectrogram reconstruction methods for missing
data (cont.): Cluster-based reconstruction methods

Experimental results (cont.)

Accuracy
MSE
48
Spectrogram reconstruction methods for missing
data (cont.): Cluster-based reconstruction methods

Multiple Cluster-based reconstruction methods
Two steps to estimate the missing portions of an
incomplete vector
Cluster membership of the vector
Decide which cluster the vector belongs to
Once the cluster membership of the vector is
established the distribution of that cluster is
used to obtain MAP estimates for the missing
components

49
Spectrogram reconstruction methods for missing
data (cont.): Cluster-based reconstruction methods

Multiple Cluster-based reconstruction methods
(cont.)
Step 1: Decide which cluster the vector belongs
to

The cluster membership
kth cluster
a priori probability, P(k)
negative of the log-likelihood
50
Spectrogram reconstruction methods for missing
data (cont.): Cluster-based reconstruction methods

Multiple Cluster-based reconstruction methods
(cont.)
Step 2: MAP estimates
Experimental results (oracle experimental upper
bound)

Figure panels: original; randomly 70% deleted; reconstructions with codebook sizes 1, 8, 64, and 512
Codebook size is the number of clusters used in
the representation
51
Spectrogram reconstruction methods for missing
data (cont.): Cluster-based reconstruction methods

Experimental results (cont.)

Accuracy
MSE
52
Spectrogram reconstruction methods for missing
data (cont.): Cluster-based reconstruction methods

Cluster marginal reconstruction: identifying
cluster membership based on observed components
alone
Because we have no knowledge about the entire S(t)
when some components in S(t) are missing
Step 1: Decide which cluster the vector belongs
to

Identify the cluster membership of the vector
based on the observed components of the vector
alone
marginalization
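
The two steps of cluster marginal reconstruction can be sketched as follows: score each cluster by its prior plus the Gaussian log-likelihood of the observed components only (for a Gaussian, marginalizing out the missing components amounts to dropping the corresponding rows and columns), then apply the Gaussian MAP estimate using the winning cluster. The sketch assumes NumPy, full-covariance Gaussian clusters, and at least one observed component; the function name and argument layout are illustrative.

```python
import numpy as np

def cluster_marginal_reconstruct(vector, observed, priors, means, covs):
    """Return (cluster index, reconstructed vector) for one spectral vector.
    `observed` is a boolean mask; missing entries of `vector` are ignored."""
    vector = np.asarray(vector, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    o = np.flatnonzero(observed)
    m = np.flatnonzero(~observed)

    # Step 1: pick the cluster with the highest marginal score.
    best_k, best_score = -1, -np.inf
    for k, (prior, mean, cov) in enumerate(zip(priors, means, covs)):
        mean = np.asarray(mean, dtype=float)
        cov_oo = np.asarray(cov, dtype=float)[np.ix_(o, o)]
        diff = vector[o] - mean[o]
        _, logdet = np.linalg.slogdet(cov_oo)
        loglik = -0.5 * (diff @ np.linalg.solve(cov_oo, diff)
                         + logdet + o.size * np.log(2.0 * np.pi))
        score = np.log(prior) + loglik
        if score > best_score:
            best_k, best_score = k, score

    # Step 2: MAP estimate of the missing components from that cluster.
    mean = np.asarray(means[best_k], dtype=float)
    cov = np.asarray(covs[best_k], dtype=float)
    estimate = vector.copy()
    estimate[m] = mean[m] + cov[np.ix_(m, o)] @ np.linalg.solve(
        cov[np.ix_(o, o)], vector[o] - mean[o])
    return best_k, estimate
```

With a single cluster this reduces to single-cluster reconstruction, since step 1 has only one candidate.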
53
Spectrogram reconstruction methods for missing
data (cont.): Cluster-based reconstruction methods

Cluster marginal reconstruction: identifying
cluster membership based on observed components
alone (cont.)
Experimental results

Figure panels: original; randomly 70% deleted; reconstructions with codebook sizes 1, 8, 64, and 512
54
Spectrogram reconstruction methods for missing
data (cont.): Cluster-based reconstruction methods

Experimental results (cont.)

Wrongly identified cluster
MSE
Accuracy
55
Spectrogram reconstruction methods for missing
data (cont.): Cluster-based reconstruction methods

Summary
Cluster based reconstruction methods can be very
effective in reconstructing missing regions of
spectrogram
When cluster memberships are identified based
only on the observed components, the result is
similar to single-cluster based reconstruction
Single Gaussian model for the distribution is a
good method

56
Spectrogram reconstruction methods for missing
data (cont.): Covariance-based reconstruction

Consider the sequence of spectral vectors that
constitute a spectrogram to be the output of a
Gaussian wide-sense stationary (WSS) random
process
The mean of the spectral vectors and the covariances
between elements in the spectrogram are
independent of their position in the spectrogram
WSS gives us the following properties
The mean does not depend on where it occurs
The covariance between the components of two vectors
depends only on the distance between them
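
Under the WSS assumption, one mean per frequency component and one covariance matrix per time lag are enough to describe the whole spectrogram. The sketch below estimates the lag-dependent covariance from a single complete training spectrogram; it assumes NumPy, rows as spectral vectors, and a non-negative lag smaller than the number of frames. The function name `lagged_covariance` is illustrative.

```python
import numpy as np

def lagged_covariance(spectrogram, lag):
    """Estimate C[k1, k2] = Cov(S(t, k1), S(t + lag, k2)), assumed to be the
    same for every t under wide-sense stationarity."""
    spec = np.asarray(spectrogram, dtype=float)   # shape (frames, components)
    n_frames = spec.shape[0]
    centered = spec - spec.mean(axis=0)           # subtract mu(k), independent of t
    earlier = centered[:n_frames - lag]
    later = centered[lag:]
    return earlier.T @ later / (n_frames - lag)
```

At lag 0 this is the ordinary covariance matrix of the spectral vectors; larger lags capture how strongly a vector predicts its neighbours in time.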

57
Spectrogram reconstruction methods for missing
data (cont.): Covariance-based reconstruction
apply WSS properties
58
Spectrogram reconstruction methods for missing
data (cont.): Covariance-based reconstruction

Example

Apply E[S(t,k)] = µ(k)
The MAP estimate of Sm can then be computed
A 4-second utterance has 400 frames. Each spectral
vector has 20 frequency components, so there are 8000
components in all in the spectrogram
so the computational cost is very high
59
Spectrogram reconstruction methods for missing
data (cont.): Covariance-based reconstruction

Reconstructing missing elements individually

Let S(t,k) be an element of the vector of
missing components
Com(t,k) is the cross-covariance between So and Sm
expected value of S(t,k)
Not all components of So contribute equally to the
estimate of S(t,k)
60
Spectrogram reconstruction methods for missing
data (cont.): Covariance-based reconstruction

Jointly reconstructing all missing elements in a
vector

The missing element vector for the second
spectral vector is constructed as
The neighborhood of observed vector for Sm(2)
The mean vector for So(2) and Sm(2)
The cross covariance between Sm(2) and So(2)
The autocovariance matrix of So(2)
The second spectral vector would be obtained as
61
Spectrogram reconstruction methods for missing
data (cont.): Covariance-based reconstruction

Result

original
randomly 90% deleted
individually
vector jointly
MSE
Accuracy
62
Estimating the location of corrupt regions in
spectrograms

It is a difficult task to estimate the reliable
and unreliable regions
Use a spectrographic mask to distinguish the regions
Binary information about every element in the
spectrogram
The effectiveness of missing-feature methods depends
critically on the accuracy of the spectrographic
masks used
False alarm: a reliable element declared
unreliable
Miss: an unreliable element tagged as reliable
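
Given an oracle mask and an estimated mask, the two error types defined above can be counted directly. This sketch assumes NumPy and masks in which True means reliable. The choice of normalization (false alarms as a fraction of truly reliable elements, misses as a fraction of truly unreliable ones) is an assumption, as is the function name.

```python
import numpy as np

def mask_error_rates(oracle_reliable, estimated_reliable):
    """Return (false alarm rate, miss rate) of an estimated mask."""
    oracle = np.asarray(oracle_reliable, dtype=bool)
    estimated = np.asarray(estimated_reliable, dtype=bool)

    false_alarms = oracle & ~estimated     # reliable, but declared unreliable
    misses = ~oracle & estimated           # unreliable, but tagged as reliable

    n_reliable = max(int(oracle.sum()), 1)         # avoid division by zero
    n_unreliable = max(int((~oracle).sum()), 1)
    return false_alarms.sum() / n_reliable, misses.sum() / n_unreliable
```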

63
Estimating the location of corrupt regions in
spectrograms (cont.)

The recognition performance degrades very quickly
with an increasing fraction of false alarms
Missing-feature methods are much less sensitive to
misses

64
Estimating the location of corrupt regions in
spectrograms (cont.)

Using spectral subtraction

Typical values of ? and β are 0.95 and 2
The initial portion of any utterance is assumed
to contain only noise
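
The update equation for this slide did not survive in the transcript, so the following is only a sketch of one common way to build a mask from spectral subtraction, not necessarily the rule shown on the slide. It assumes a recursive noise estimate initialized from the first frames, a smoothing factor `lam` and an update threshold `beta` (hypothetical names, defaulted to the two typical values quoted above under the assumption that they map this way), and a 0 dB threshold on the estimated SNR.

```python
import numpy as np

def spectral_subtraction_mask(noisy_power, n_init=10, lam=0.95, beta=2.0,
                              snr_threshold_db=0.0, floor=1e-10):
    """Estimate a reliability mask (True = reliable) from a noisy power
    spectrogram of shape (frames, components)."""
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise = noisy_power[:n_init].mean(axis=0)      # initial frames: noise only
    mask = np.zeros(noisy_power.shape, dtype=bool)

    for t, frame in enumerate(noisy_power):
        # Track the noise only where the frame looks like noise (assumed rule).
        looks_like_noise = frame < beta * noise
        noise = np.where(looks_like_noise, lam * noise + (1.0 - lam) * frame, noise)

        # Spectral subtraction gives a clean-power estimate, hence an SNR estimate.
        clean_estimate = np.maximum(frame - noise, floor)
        snr_db = 10.0 * np.log10(clean_estimate / np.maximum(noise, floor))
        mask[t] = snr_db >= snr_threshold_db
    return mask
```

Because the noise estimate adapts slowly, a sketch like this is better suited to stationary noise than to the non-stationary cases discussed earlier.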
65
Estimating the location of corrupt regions in
spectrograms (cont.)
spectrographic mask estimated using
spectral subtraction for speech corrupted to 10
dB
oracle spectrographic mask
spectrographic masks estimated by
spectral subtraction
oracle spectrographic mask
66
Estimating the location of corrupt regions in
spectrograms (cont.)

Using a Bayesian classifier

classification vector
67
Estimating the location of corrupt regions in
spectrograms (cont.)
spectrographic mask estimated using a classifier
for speech corrupted to 10 dB
oracle spectrographic mask
spectrographic masks estimated by a classifier
oracle spectrographic mask
