Blind Source Separation of Acoustic Signals Based on Multistage Independent Component Analysis

1
Blind Source Separation of Acoustic Signals Based
on Multistage Independent Component Analysis
  • Hiroshi SARUWATARI,
  • Tsuyoki NISHIKAWA, and Kiyohiro SHIKANO
  • Graduate School of Information Science
  • Nara Institute of Science and Technology

2
Contents
  • Background of blind source separation (BSS)
  • Overview of Independent Component Analysis (ICA)
  • Time-domain ICA (TDICA)
  • Frequency-domain ICA (FDICA)
  • Disadvantages of TDICA and FDICA
  • Proposal of Multistage ICA
  • Experimental results under real acoustic
    conditions
  • Conclusion

3
Background
Hands-free speech recognition system
[Figure: target speech ("Is it fine tomorrow?") is picked up by a microphone and passed to the speech recognition system]
4
Background (Cont'd)
  • Goal
    • Realization of a high-quality hands-free speech interface system
  • Microphone array
    • Receiver which consists of multiple elements
    • Used to enhance target speech or reduce interference
  • Problem of microphone array processing
    • A priori information is required:
      • Directions of arrival of the sound sources
      • Breaks of target speech for filter adaptation

5
Problem of Microphone Array
  • Delay-and-sum: to produce a narrow beam pattern
  • Adaptive: to update filters to reduce the noise

[Figure: beam patterns with the labels "Target", "Set target direction", and "Null"]
High sidelobes pick up noise.
It is necessary to observe only noise for filter learning.
6
Blind Source Separation (BSS)
  • Approach that estimates source signals only
    from the observed mixed signals.
  • No information about source directions or
    acoustic conditions is required.
  • Independent Component Analysis (ICA) is mainly
    used.
  • Previous work on ICA
    • J. Cardoso, 1989
    • C. Jutten, 1990 (higher-order decorrelation)
    • P. Comon, 1994 (defined the term ICA)
    • A. Bell et al., 1995 (infomax)

7
ICA-Based BSS
No a priori information (unsupervised adaptive
filtering)
[Figure: Speaker 1 (Source 1, "Good Morning!") and Speaker 2 (Source 2, "Hello!") are recorded by Microphone 1 and Microphone 2 as Observed signal 1 and Observed signal 2]
8
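BSS for Instantaneous Mixture
Linear mixing process: the observed signals are the source signals multiplied by the mixing matrix.
Separation process: the separated signals are the observed signals multiplied by the unmixing matrix, which is optimized.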
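The mixing and separation steps on this slide can be illustrated with a toy two-source, two-microphone example. This is only a sketch: the matrix values, the sample values, and the helper names `matvec` and `inverse_2x2` are made up for illustration, and the unmixing matrix is computed from the known mixing matrix, which a blind method cannot do.

```python
# Toy instantaneous mixture: 2 sources, 2 microphones (all numbers are made up).
A = [[1.0, 0.6],
     [0.5, 1.0]]          # mixing matrix, known here only for illustration


def matvec(M, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]


def inverse_2x2(M):
    """Invert a 2x2 matrix with the closed-form adjugate formula."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


sources = [[0.3, -1.2], [1.0, 0.4], [-0.7, 0.9]]   # source samples at t = 0, 1, 2
observed = [matvec(A, s) for s in sources]          # observed = A * source

W = inverse_2x2(A)                                  # ideal unmixing matrix
separated = [matvec(W, x) for x in observed]        # separated = W * observed
```

In ICA the matrix `W` is not obtained by inversion; it is optimized from the observed signals alone, so the sources are recovered only up to ordering and scaling.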
9
Various Criteria for ICA
Criteria applied to the separated signals:
  • Decorrelation: to minimize correlation among signals in multiple time durations
  • Nonlinear function 1: to minimize higher-order correlation
  • Nonlinear function 2: to assume the p.d.f. of the sources (sigmoid function)
10
Cost Function for Nonlinear Function 2
Kullback-Leibler (KL) divergence between two probability density functions [expressions not preserved]
11
Derivation for Nonlinear Function 2
Update along the negative gradient of the cost function.
Nonlinear function 2 → matrix to be diagonalized [equation not preserved]
The nonlinear function can be approximated by a sigmoid function for speech signals.
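A gradient update with a sigmoid-shaped nonlinearity can be sketched for the real-valued instantaneous case as follows. Everything here is an assumption for illustration rather than the slide's own equation: the natural-gradient form W ← W + μ(I − ⟨φ(y)yᵀ⟩)W, the choice φ = tanh, Laplacian random numbers as a rough stand-in for speech, and the values of `A`, `mu`, `N`, and the iteration count.

```python
import math
import random

random.seed(0)
N = 2000                       # number of samples (assumed)


def laplace_sample():
    """Exponential magnitude with a random sign gives a Laplacian sample."""
    return random.expovariate(1.0) * random.choice((-1.0, 1.0))


s = [[laplace_sample() for _ in range(N)] for _ in range(2)]   # independent sources
A = [[1.0, 0.6], [0.5, 1.0]]                                   # hypothetical mixing matrix
x = [[A[i][0] * s[0][t] + A[i][1] * s[1][t] for t in range(N)] for i in range(2)]

W = [[1.0, 0.0], [0.0, 1.0]]   # start from the identity
mu = 0.1                       # step size (assumed)

for _ in range(300):
    # Current separated signals y = W x
    y = [[W[i][0] * a + W[i][1] * b for a, b in zip(x[0], x[1])] for i in range(2)]
    phi = [[math.tanh(v) for v in row] for row in y]
    # C[i][j] is the time average of phi(y_i) * y_j
    C = [[sum(p * q for p, q in zip(phi[i], y[j])) / N for j in range(2)]
         for i in range(2)]
    # Natural-gradient step: W <- W + mu * (I - C) W
    D = [[(1.0 if i == j else 0.0) - C[i][j] for j in range(2)] for i in range(2)]
    W = [[W[i][j] + mu * (D[i][0] * W[0][j] + D[i][1] * W[1][j]) for j in range(2)]
         for i in range(2)]

# Global system G = W A; each row should be dominated by a single entry.
G = [[W[i][0] * A[0][j] + W[i][1] * A[1][j] for j in range(2)] for i in range(2)]
leakage = [min(abs(g) for g in row) / max(abs(g) for g in row) for row in G]
```

The `leakage` values measure how much of the unwanted source remains in each output relative to the wanted one; the update stops changing `W` once the averaged matrix `C` is close to the identity.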
12
Why Instantaneous Mixture Model?
  • ICA for the instantaneous mixture model
    → The mixing matrix is real-valued.
    → Time delays among microphones and room
      reflections are not taken into account.

Is it applicable to sound signals in a real
acoustic environment?
NO!
It is a useful example to show how ICA can work,
but it is only a mathematical toy model.
13
ICA for Convolutive Mixture Model (1)
  • In application to a microphone array,
    • each received signal has a time delay which
      corresponds to the sound direction and the position
      of each element;
    • the mixing matrix A is not a matrix of simple scalar-valued
      coefficients, but is represented as convolution filters.

Mixing process in a real environment: the observed signals are the source signals convolved with a mixing matrix of FIR filters.
→ We should use FIR filters for the separation process.
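A convolutive mixture of this kind can be sketched by giving each source-to-microphone path its own FIR filter. The filter taps, the test signals, and the function names `fir` and `convolutive_mix` are hypothetical; real room responses are far longer.

```python
def fir(h, s):
    """Convolve signal s with FIR filter h, keeping the first len(s) samples."""
    return [sum(h[k] * s[t - k] for k in range(len(h)) if t - k >= 0)
            for t in range(len(s))]


def convolutive_mix(a, sources):
    """Microphone j receives the sum over sources i of (a[j][i] convolved with s_i)."""
    n = len(sources[0])
    mixed = []
    for row in a:
        parts = [fir(h, s) for h, s in zip(row, sources)]
        mixed.append([sum(p[t] for p in parts) for t in range(n)])
    return mixed


# Hypothetical 2x2 mixing filters: a direct path plus one delayed, attenuated path.
a = [[[1.0], [0.0, 0.5]],
     [[0.0, 0.0, 0.4], [1.0]]]
s1 = [1.0, 0.0, 0.0, 0.0]      # unit impulse from source 1
s2 = [0.0, 2.0, 0.0, 0.0]      # delayed impulse from source 2
x = convolutive_mix(a, [s1, s2])
```

With these inputs, microphone 1 receives source 1 directly and source 2 one sample late at half amplitude, which is the kind of delay an instantaneous (scalar) mixing matrix cannot represent.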
14
ICA for Convolutive Mixture Model (2)
  • To simplify the convolutive mixture down to
    instantaneous mixtures by the frequency
    transform

Mixing process in the frequency domain: the observed signals are the source signals multiplied by a complex-valued mixing matrix.
→ We only need to solve a complex-valued instantaneous
mixture in each subband independently.
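The reason the frequency transform helps is that convolution in time becomes a multiplication in each frequency bin. A minimal check for a single source-to-microphone path, assuming circular convolution within one frame and made-up values for the filter `h` and the frame `s`:

```python
import cmath


def dft(x):
    """Plain discrete Fourier transform (O(n^2), fine for a short frame)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]


def circular_convolve(h, s):
    """Circular convolution of filter h with frame s."""
    n = len(s)
    return [sum(h[k] * s[(t - k) % n] for k in range(len(h))) for t in range(n)]


h = [1.0, 0.5, 0.25]                                  # hypothetical short filter
s = [0.2, -1.0, 0.7, 0.0, 0.3, -0.4, 0.9, 0.1]        # one frame of a source
x = circular_convolve(h, s)                           # observed frame

H = dft(h + [0.0] * (len(s) - len(h)))                # zero-pad filter to frame length
S, X = dft(s), dft(x)

# In every bin f, X[f] equals H[f] * S[f]: a complex scalar (instantaneous) mixture.
max_error = max(abs(X[f] - H[f] * S[f]) for f in range(len(s)))
```

With real recordings the convolution is linear rather than circular, so the per-bin product holds only approximately and the frame has to be long relative to the room response.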
15
Permutation Problem in FDICA
ICA is conducted in each subband independently.
The ordering and scaling of the outputs are arbitrary in ICA.
Permutation and gain indeterminacy
16
Solutions for Permutation Problem
  • To use correlation among output waveforms (Murata et al., 1998)
  • To use the directivity pattern of W (Kurita and Saruwatari, 2000)
  • To use correlation among unmixing matrices in
    neighboring frequency bins (Parra et al., 2000;
    Asano et al., 2001)
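The first idea, correlation among output waveforms, can be sketched in a heavily simplified form: compare the amplitude envelopes of the two outputs in one bin with those in a reference bin and swap the outputs if that raises the correlation. This is not the cited authors' algorithm; the envelopes are made up and the names `correlation` and `needs_swap` are hypothetical.

```python
def correlation(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    std_u = sum((a - mean_u) ** 2 for a in u) ** 0.5
    std_v = sum((b - mean_v) ** 2 for b in v) ** 0.5
    return cov / (std_u * std_v)


def needs_swap(ref, cur):
    """ref and cur each hold two amplitude envelopes, one per output channel."""
    keep = correlation(ref[0], cur[0]) + correlation(ref[1], cur[1])
    swap = correlation(ref[0], cur[1]) + correlation(ref[1], cur[0])
    return swap > keep


# Made-up envelopes over five frames: source 1 is active early, source 2 late.
ref_bin = [[0.9, 0.8, 0.1, 0.0, 0.1], [0.1, 0.0, 0.7, 0.9, 0.8]]
next_bin = [[0.0, 0.1, 0.6, 0.8, 0.9], [0.8, 0.9, 0.2, 0.1, 0.0]]   # outputs arrive swapped

swap_next = needs_swap(ref_bin, next_bin)
swap_same = needs_swap(ref_bin, ref_bin)
```

Here `swap_next` is true because the outputs of `next_bin` follow the opposite activity pattern, while `swap_same` is false.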
17
Problem in ICA-Based BSS
Separation performance degrades significantly under
reverberant conditions.
Why?
Reverberation in a typical room: 2400-tap
FIR filter at 8 kHz sampling
→ The number of parameters is too large.
A BSS method that is robust against
reverberation is needed.
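The 2400-tap figure is consistent with the 300 ms reverberation time given in the experimental setup of this deck. A small worked calculation, assuming a two-source, two-microphone system with one FIR filter per source-microphone pair (the variable names are illustrative):

```python
sampling_rate_hz = 8000
reverberation_ms = 300           # reverberation time from the experimental setup

# Filter length needed to span the reverberation time, in samples (taps).
taps = sampling_rate_hz * reverberation_ms // 1000

n_sources = n_mics = 2           # assumed 2x2 setup
n_coefficients = n_sources * n_mics * taps
```

Under these assumptions `taps` is 2400 and a full separation system has 9600 filter coefficients to estimate.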
18
Conventional Approaches
  • Frequency-Domain ICA (FDICA)
  • To estimate the separation coefficients for every
    frequency bin in the frequency domain
  • Time-Domain ICA (TDICA)
  • To estimate the separation FIR filter in the time
    domain

19
FDICA-Based BSS
20
Performance of FDICA
  • In conventional dereverberation processing,
    dereverberation performance improves as the
    number of subbands (filter length) is increased.

<Speculation>
In FDICA, does source-separation performance also improve
as the number of subbands is increased?
To investigate the relation between the number of
subbands and source-separation performance
21
Experimental Setup
  • Interelement spacing is 4 cm.
  • Reverberation time is 300 ms.
  • Sound source: speech of 2 male and 2 female speakers from the ASJ corpus
    (12 combinations)
  • Evaluation: Noise Reduction Rate =
    Output SNR [dB] − Input SNR [dB]
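The evaluation score can be computed as below, treating the noise reduction rate as the output SNR minus the input SNR, both in dB. The function names and the example power values are made up.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels from two power values."""
    return 10.0 * math.log10(signal_power / noise_power)


def noise_reduction_rate(in_signal, in_noise, out_signal, out_noise):
    """Noise reduction rate: output SNR minus input SNR, in dB."""
    return snr_db(out_signal, out_noise) - snr_db(in_signal, in_noise)


# Example: equal-power target and interference at the input (0 dB),
# interference power reduced to one tenth at the output (10 dB).
nrr = noise_reduction_rate(1.0, 1.0, 1.0, 0.1)
```

For two-source separation, each source is usually taken in turn as the "signal" with the other as the "noise".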

22
FDICA Algorithm
  • Iterative learning algorithm (Amari, 1996)
    [Equation not preserved; it includes a step-size parameter]
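One form of this iterative update that appears in the FDICA literature is given below as a sketch. The symbols are assumptions, not taken from the slide: W_i(f) is the unmixing matrix at iteration i in frequency bin f, Y(f,t) the separated signal vector, ⟨·⟩_t a time average, η the step-size parameter, and Φ a sigmoid applied separately to the real part Y^(R) and the imaginary part Y^(I).

```latex
\mathbf{W}_{i+1}(f) = \eta \left[ \mathrm{diag}\!\left( \left\langle \boldsymbol{\Phi}(\mathbf{Y}(f,t))\, \mathbf{Y}^{\mathrm{H}}(f,t) \right\rangle_t \right)
  - \left\langle \boldsymbol{\Phi}(\mathbf{Y}(f,t))\, \mathbf{Y}^{\mathrm{H}}(f,t) \right\rangle_t \right] \mathbf{W}_i(f) + \mathbf{W}_i(f)

\boldsymbol{\Phi}(\mathbf{Y}) = \frac{1}{1 + \exp\!\left(-\mathbf{Y}^{(\mathrm{R})}\right)}
  + j\, \frac{1}{1 + \exp\!\left(-\mathbf{Y}^{(\mathrm{I})}\right)}
```

The bracketed term vanishes when the averaged matrix is diagonal, so the update stops once the off-diagonal (cross-channel) terms have been driven to zero.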
23
Relation between Number of Subbands and
Separation Performance
24
Narrow-Band Signals
[Figure: real part of the 1 kHz narrow-band component of Signal 1 and Signal 2, shown for 32 subbands and for 2048 subbands]
25
Relation between Number of Subbands and
Correlation
Higher correlation
26
Trade-off Relation between Independence and
Robustness against Reverberation
[Figure: noise reduction rate versus number of subbands]
  • Small number of subbands: high independence, poor against reverberation
  • Large number of subbands: low independence, robust against reverberation
27
TDICA-Based BSS
28
Performance of TDICA
  • In conventional dereverberation processing,
    dereverberation performance improves as the
    filter length is increased.

<Speculation>
In TDICA, does source-separation performance also improve
as the filter length is increased?
To investigate the relation between the filter
length and source-separation performance
29
TDICA Algorithm
  • Cost function to be minimized [equation not preserved]
30
TDICA Algorithm (Cont'd)
  • Iterative equation of separation filter
  • Natural gradient of Q(w(z))

where a is the step-size parameter
31
Proposed Iterative Equation of TDICA
  • Iterative equation of separation filter (TDICA1)

32
Results of TDICA1 and TDICA2
TDICA 1 (only correlation of same time)
TDICA 2 (correlation of different time)
33
Problems and Solution
FDICA
  • Advantages
    • Simplifies the convolutive mixture down to
      instantaneous mixtures
    • Easy to converge the separation filter with high
      stability
  • Disadvantages
    • Independence assumption collapses in each
      narrow band.
    • Separation performance saturates before
      reaching a sufficient level.

TDICA
  • Advantages
    • Treats the fullband speech signals, where the
      independence assumption of sources usually holds
    • High convergence possibility near the optimal
      point
  • Disadvantages
    • Iterative rule for filter learning is
      complicated.
    • Convergence degrades under reverberant conditions.

34
Separation Procedure of MSICA
[Figure: mixing system → FDICA → TDICA]
  • Separated signals of FDICA are regarded
    as the input signals for TDICA.
  • Residual cross-talk components of FDICA
    can be removed by TDICA.
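The two-stage structure can be sketched as two separators applied in series, with the second stage seeing only the outputs of the first. The stages below are toy instantaneous stand-ins, not FDICA or TDICA: `W1` is a deliberately imperfect unmixing matrix, and `W2` is computed from known quantities only to show that a second stage can remove what the first leaves behind. All names and values are hypothetical.

```python
def matmul(P, Q):
    """Product of two 2x2 matrices."""
    return [[P[i][0] * Q[0][j] + P[i][1] * Q[1][j] for j in range(2)] for i in range(2)]


def inverse_2x2(M):
    """Closed-form inverse of a 2x2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]


def separate(M, frames):
    """Apply a 2x2 matrix to every two-channel frame."""
    return [[M[0][0] * a + M[0][1] * b, M[1][0] * a + M[1][1] * b] for a, b in frames]


A = [[1.0, 0.6], [0.5, 1.0]]          # toy mixing system
W1 = [[1.0, -0.5], [-0.4, 1.0]]       # stage 1 stand-in: leaves residual cross-talk
residual = matmul(W1, A)              # what remains after stage 1
W2 = inverse_2x2(residual)            # stage 2 stand-in: removes the residual


def two_stage(frames):
    """Stage 2 takes the separated signals of stage 1 as its input."""
    return separate(W2, separate(W1, frames))


sources = [[0.3, -1.2], [1.0, 0.4], [-0.7, 0.9]]
observed = separate(A, sources)
stage1_out = separate(W1, observed)
final_out = two_stage(observed)
```

After stage 1 the off-diagonal entries of `residual` are small but nonzero, which corresponds to the residual cross-talk that the second stage is meant to remove.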

35
Comparison among Each ICA
  • To investigate the relation between the filter
    length and source-separation performance of TDICA
    part in MSICA
  • Comparison among separation performances of
    TDICA, FDICA, and MSICA

36
Relation between Filter Length and Separation
Performance
[Figure: separation performance versus filter length; annotated value 9.4]
37
Comparison among Each ICA
  • To investigate the relation between the filter
    length and source-separation performance of TDICA
    part in MSICA
  • Comparison among separation performances of
    TDICA, FDICA, and MSICA

38
Comparison Results
39
Spectral Distortion (TDICA)
10 taps No whitening, 1000 taps whitening
40
Spectral Distortion (FDICA, MSICA)
FDICA No whitening, MSICA No whitening
41
Sound Demonstration of MSICA
  • Reverberation time 300 ms
  • Mixed speech (Female + Male)
  • Separated speech (Female)
  • Separated speech (Male)

42
Conclusion
  • Disadvantages of FDICA and TDICA
  • Separation performance of FDICA is saturated
    before reaching a sufficient performance.
  • Source separation using long filter fails in
    TDICA because the iterative rule for FIR-filter
    learning is complicated.
  • MSICA combining FDICA and TDICA
  • In TDICA part in MSICA, separation performance is
    improved even with the long filter.
  • Separation performance of MSICA is superior to
    those of TDICA and FDICA.

43
Future Work
  • Further evaluation in real environments
  • Robustness under reverberant conditions
  • Larger arrays with more than two elements
  • Applying BSS to a speech recognition system
  • Improvement of convergence speed
  • On-line and real-time algorithm

44
ICA and BSS: Where do we go?
We should go to NARA-city!
Why?
45
4th International Symposium on ICA and BSS
(ICA2003), in NARA
  • Date: April 1-4, 2003
  • Place: Nara, JAPAN
  • Scientific areas
    • ICA and factor analysis, PCA, etc.
    • Blind source separation
    • Blind and semi-blind equalization and
      deconvolution
    • Blind identification
    • Any signal processing application related to
      ICA
  • URL: http://ica2003.jp/

46
Analysis Conditions
TDICA, TDICA part in MSICA
  • Filter length: 102000 taps
  • Number of blocks B (block length): 110 (data length/B)
  • ICA iterations: 500
FDICA part in MSICA
  • Number of subbands: 1024 points
  • Frame shift: 16 points
  • ICA iterations: 30
47
Analysis Conditions
TDICA, TDICA part in MSICA
  • Filter length: 10 (TDICA), 1000 (MSICA)
  • Number of blocks B: 3 (TDICA), 9 (MSICA)
  • ICA iterations: 500
FDICA, FDICA part in MSICA
  • Number of subbands: 1024 points
  • Frame shift: 16 points
  • ICA iterations: 30