System architecture for Pattern Recognition in Eco Systems

1
System architecture for Pattern Recognition in
Eco Systems
  • Benjamín Dugnol Álvarez
  • Carlos Fernández García
  • Departamento de Matemáticas, Universidad de
    Oviedo

2
Abstract: The purpose of the present work is to
count and classify the individuals in a wolf pack
by processing its audio signals, assuming that we
have recordings of sufficient duration, obtained
with a single microphone. We set out an
architecture that includes the treatment of the
environmental background noise, the separation of
the signals and their classification.
Keywords: biological signals, wolf pack size,
spectral subtraction, monaural signal separation,
signal features, cepstral analysis, signal
classification.
3
1. Motivation of the proposal.
2. The music of the wolves.
3. Introduction and state of the art.
4. Proposed architecture.
5. Processing of the background noise.
6. Classification of the signal segments.
7. Processing segments of one voice.
8. Processing segments of two voices.
9. Characterization, recognition and estimation.
10. Conclusions and future work.
11. References.
4
Motivation: The problem
Wolves usually live in packs of between 2 and 10
individuals. How many wolves are there in the
pack? It is hard to estimate the pack size
directly, because humans can hardly approach the
group without disturbing its integrity.
[Figure: wolf pack]
5
Motivation: The problem
The wolf pack generates acoustic information when
it wants to communicate with another pack.
6
Motivation: The answer
Wolves communicate mainly through different
sounds such as howls, barks, whimpers and
growls. The howls of pack members can be heard
about 15 km away.
Can an individual wolf be recognized by its howl?
7
Motivation: The answer
A solution can be found using digital signal
processing (DSP). The sound emitted by the wolf
pack is processed in order to extract the part of
the sound that corresponds to each animal.
8
Motivation: The answer
First, the biologists capture the sound.
9
Motivation: The answer
Then, the engineers digitize the signal.
10
Motivation: The answer
Finally, the engineers (using Matlab) process
the signal.
Result: pack size estimation.
11
Motivation: The answer
Wolf howls function both to decrease and to
increase the distance between communicating
individuals and, as such, might be expected to
provide information on individual identity.
12
Motivation: The answer
Authors disagree on the presence of vocal
signatures in wolf howls.
But I can recognize my dog by its sound!
13
Motivation: The answer
14
The music of the wolves
The adults: The howl signals are simple. They
can be broadly described as loud, continuous,
tonal sounds with a fundamental frequency between
150 and 800 Hz. Often the pitch is constant or
piecewise constant over long time segments, or
varies smoothly within each piece. Sometimes a
howl is composed of several consecutive segments,
either connected or separated by very brief
interruptions.
15
The music of the wolves
16
The music of the wolves
The pups: Pups (4-6 months old) howl very often,
answering any other howl they hear. Their
signals are usually made up of shorter segments
than those of the adults, with higher average
fundamental frequencies.
17
The music of the wolves
18
The music of the wolves
The chorus: It begins with a single howl, which is
relatively simple in structure. After a second or
two, a second wolf joins, followed by one or two
more before the rest of the pack follows
virtually en masse. This accelerating start makes
it possible to pick out the first three or four
individuals but, after that, too many begin
howling at once to count them.
19
The music of the wolves
20
Proposed architecture
21
Proposed architecture
The recording obtained with a single microphone
is digitized at 44100 Hz. The background noise
is then reduced. Finally, the signal is segmented
(Hanning window with 50% overlap).
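The segmentation step can be sketched in Python as follows. This is only an illustration under stated assumptions: the slides give the sampling rate, the window type and the overlap, but not the frame length, so the 1024-sample frame, the function name frame_signal and the 400 Hz test tone are hypothetical choices. The original system was implemented in Matlab.

```python
import numpy as np

def frame_signal(x, frame_len=1024, overlap=0.5):
    """Split x into Hanning-windowed frames with a fractional overlap.

    frame_len is an assumed value; the slides do not state it.
    """
    hop = int(frame_len * (1.0 - overlap))
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return frames, hop

# One second of a synthetic 400 Hz tone sampled at 44100 Hz.
fs = 44100
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 400.0 * t)
frames, hop = frame_signal(x)
print(frames.shape, hop)
```

Each row of frames is one windowed segment; consecutive rows start hop samples apart, so half of each frame is shared with its neighbour.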
22
Proposed architecture
The output is a collection of signal segments
classified following a content-based criterion:
  • Type 0: segments without underlying signal,
    with residual background noise only.
  • Type I: segments that possibly contain one
    individual voice and residual background noise.
  • Type II: segments that possibly contain a
    mixture of two individual voices and residual
    background noise.
  • Type III: segments that contain a mixture of
    three or more individual voices.
23
Proposed architecture
Separation of the signals contained in each
segment:
  • Type 0: nothing
  • Type I: one signal
  • Type II: two signals
  • Type III: nothing
24
Proposed architecture
Signals -> Features -> Classification -> Pack
size estimation
25
Processing of the background noise
We use the Boll algorithm for the suppression of
acoustic noise in speech signals. This algorithm
performs spectral subtraction in an efficient
way.
[Boll79] Boll, S.F. Suppression of Acoustic
Noise in Speech Using Spectral Subtraction. IEEE
Trans. Acoust., Speech and Signal Processing,
vol. 27, pp. 113-120, 1979.
26
Processing of the background noise
  • Estimation of the signal spectrum using the
    short-time Fourier transform (STFT),
  • Estimation of the noise spectrum,
  • Spectral subtraction,
  • Reduction of the residual noise,
  • Suppression of the noise in segments of
    inactivity of the underlying signal,
  • Synthesis of the clean signal.
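
The steps above can be sketched in Python with NumPy. This is a simplified illustration, not the authors' Matlab implementation: it assumes the first few frames contain noise only, and it replaces the residual-noise reduction and inactivity-suppression steps of Boll's method with a single spectral floor. The function name and all parameter values are hypothetical.

```python
import numpy as np

def spectral_subtraction(y, noise_frames=10, frame_len=512, floor=0.02):
    """Simplified magnitude spectral subtraction.

    Assumes the first `noise_frames` frames of y are noise only.
    """
    hop = frame_len // 2
    n = np.arange(frame_len)
    # Periodic Hann window: overlapped copies sum to 1 at 50% overlap.
    window = 0.5 - 0.5 * np.cos(2 * np.pi * n / frame_len)
    n_frames = 1 + (len(y) - frame_len) // hop

    # Short-time spectrum of the noisy signal.
    frames = np.stack(
        [y[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)

    # Noise magnitude spectrum, averaged over the leading frames.
    noise_mag = mag[:noise_frames].mean(axis=0)

    # Subtraction, limited below by a small fraction of the noise level.
    clean_mag = np.maximum(mag - noise_mag, floor * noise_mag)

    # Synthesis by overlap-add, reusing the noisy phase.
    clean_frames = np.fft.irfft(clean_mag * np.exp(1j * phase),
                                n=frame_len, axis=1)
    out = np.zeros(len(y))
    for i in range(n_frames):
        out[i * hop : i * hop + frame_len] += clean_frames[i]
    return out

# Synthetic test: a 400 Hz tone that starts at 0.3 s, in white noise.
rng = np.random.default_rng(0)
fs = 44100
t = np.arange(fs) / fs
tone = np.where(t > 0.3, np.sin(2 * np.pi * 400.0 * t), 0.0)
noisy = tone + 0.3 * rng.standard_normal(len(t))
clean = spectral_subtraction(noisy)
```

In the tone-free part of the recording the output power should be clearly lower than the input power; the samples after the last complete frame are left at zero in this sketch.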

27
Processing of the background noise
Matlab Demo
28
Classification of the segments
Type 0 segments
First, voice activity detection (VAD) detects the
segments without underlying signal.
Types I, II and III segments
Next, the remaining segments are classified using
a multipitch estimation technique.
[Tolonen00] Tolonen, T., Karjalainen, M. A
Computationally Efficient Multipitch Analysis
Model. IEEE Transactions on Speech and Audio
Processing, vol. 8, no. 6, November 2000.
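The slides do not say which VAD is used, so the following Python sketch only illustrates the idea with a simple energy threshold: a frame is marked as type 0 when its energy is within a few decibels of an assumed noise floor. The function name, the 6 dB threshold and the synthetic frames are hypothetical, and the multipitch stage for types I to III is not sketched here.

```python
import numpy as np

def detect_inactive_frames(frames, noise_energy, threshold_db=6.0):
    """Return a boolean mask: True where a frame looks like type 0.

    A frame is considered inactive when its mean energy is less than
    threshold_db above the given noise energy (an assumed criterion).
    """
    energy = np.mean(frames ** 2, axis=1)
    ratio_db = 10.0 * np.log10(np.maximum(energy, 1e-12) / noise_energy)
    return ratio_db < threshold_db

# Three noise-only frames followed by two frames with a 400 Hz tone.
rng = np.random.default_rng(0)
fs, frame_len = 44100, 1024
t = np.arange(frame_len) / fs
noise_std = 0.05
noise_only = noise_std * rng.standard_normal((3, frame_len))
howl = np.sin(2 * np.pi * 400.0 * t) + noise_std * rng.standard_normal((2, frame_len))
frames = np.vstack([noise_only, howl])
is_type0 = detect_inactive_frames(frames, noise_energy=noise_std ** 2)
print(is_type0)
```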
29
Processing segments with one voice
  • The mathematical model adopted to describe the
    signal in these segments is
    [equation not preserved in the transcript]

30
Processing segments with one voice
  • The signal extraction results from combining
    both:
  • State estimation using a Kalman recursion,
  • Model parameter estimation using the
    Expectation-Maximization (EM) algorithm.
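
The model equation is not preserved in this transcript, so the sketch below rests on an assumption: the voice is taken to be an autoregressive (AR) signal observed in additive white noise, which is consistent with the AR features and the "linear model" mentioned later in the slides but is not confirmed by them. Only the Kalman state-estimation half is shown, with the model parameters taken as known; the EM parameter estimation is omitted. All names and numerical values are hypothetical.

```python
import numpy as np

def kalman_ar_filter(y, ar_coeffs, q, r):
    """Kalman filter for s[n] = sum_k a_k s[n-k] + e[n], y[n] = s[n] + v[n].

    q is the variance of the excitation e, r the variance of the noise v.
    """
    p = len(ar_coeffs)
    A = np.zeros((p, p))          # companion-form transition matrix
    A[0, :] = ar_coeffs
    A[1:, :-1] = np.eye(p - 1)
    H = np.zeros(p)               # observation picks the current sample
    H[0] = 1.0
    Q = np.zeros((p, p))
    Q[0, 0] = q
    x = np.zeros(p)
    P = np.eye(p)
    s_hat = np.empty(len(y))
    for n, yn in enumerate(y):
        # Prediction step.
        x = A @ x
        P = A @ P @ A.T + Q
        # Update step with the new observation.
        gain = P @ H / (H @ P @ H + r)
        x = x + gain * (yn - H @ x)
        P = P - np.outer(gain, H @ P)
        s_hat[n] = x[0]
    return s_hat

# Synthetic AR(2) resonance near 400 Hz at 44100 Hz, in white noise.
rng = np.random.default_rng(1)
a = np.array([1.9768, -0.9801])
q, r = 1.0, 100.0
s = np.zeros(2000)
for n in range(2, len(s)):
    s[n] = a[0] * s[n - 1] + a[1] * s[n - 2] + np.sqrt(q) * rng.standard_normal()
y = s + np.sqrt(r) * rng.standard_normal(len(s))
s_hat = kalman_ar_filter(y, a, q, r)
```

In a complete scheme, the EM algorithm would re-estimate ar_coeffs, q and r from the filtered states and the two steps would be iterated.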

31
Processing segments with one voice
32
Processing segments with two voices
  • Now we have two signals, s1 and s2.
  • For both we assume a Gaussian model.
  • We know the observed data y = s1 + s2.
  • We calculate the MAP estimate for s1.

[Godsill97] Godsill, S.J., Tan, C.H. Removal of
low frequency transient noise from old recordings
using model-based signal separation techniques.
In Proc. IEEE Workshop on Audio and Acoustics,
Mohonk, NY State, October 1997.
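For zero-mean Gaussian signals with known covariance matrices C1 and C2, the MAP estimate of s1 given y = s1 + s2 has the closed form C1 (C1 + C2)^-1 y. The Python sketch below illustrates that formula only; the covariance matrices used here are hypothetical first-order autoregressive models chosen for the demonstration, not the models fitted by the authors.

```python
import numpy as np

def map_separate(y, cov1, cov2):
    """MAP estimates of two zero-mean Gaussian signals from their sum."""
    s1_hat = cov1 @ np.linalg.solve(cov1 + cov2, y)
    return s1_hat, y - s1_hat

def ar1_covariance(n, rho, sigma2=1.0):
    """Covariance of a stationary AR(1) process: sigma2 * rho^|i-j|."""
    idx = np.arange(n)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])

# Two synthetic sources with different spectral shapes.
rng = np.random.default_rng(2)
n = 200
cov1 = ar1_covariance(n, 0.95)    # slowly varying source
cov2 = ar1_covariance(n, -0.5)    # rapidly alternating source
s1 = np.linalg.cholesky(cov1) @ rng.standard_normal(n)
s2 = np.linalg.cholesky(cov2) @ rng.standard_normal(n)
y = s1 + s2
s1_hat, s2_hat = map_separate(y, cov1, cov2)
```

The two estimates always add up to the observation, and the more the two covariance models differ, the better the separation.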
33
Characterization
  • A1: Linear prediction coefficients of the AR
    model, or of the WLPC (warped) model,
  • A2: Cepstral coefficients, also with the Mel
    frequency scale,
  • A3: Delta-cepstral coefficients,
  • A4: Impulse response h(n) of the AR model,
  • A5: Spectral centroid,
  • A6: Onset segment duration,
  • A7: Amplitude envelope,
  • A8: Amplitude modulation,
  • A9: Fundamental frequency,
  • A10: 1st and 2nd harmonic,
  • A11: Frequency modulation,
  • A12: Spectral ratio,
  • A13: Normalized energy.
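
Three of the listed features can be sketched in Python: the spectral centroid (A5), cepstral coefficients (A2, here on a linear rather than Mel frequency scale) and the fundamental frequency (A9). The slides do not give the exact definitions used, so these are common textbook versions; the function names, the autocorrelation pitch method and the 441 Hz test tone are assumptions. The 150-800 Hz search range is the howl range quoted earlier in the slides.

```python
import numpy as np

def spectral_centroid(frame, fs):
    """Magnitude-weighted mean frequency of the frame, in Hz."""
    mag = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    return float(np.sum(freqs * mag) / np.sum(mag))

def real_cepstrum(frame, n_coeffs=13):
    """First n_coeffs coefficients of the real cepstrum (linear scale)."""
    log_mag = np.log(np.abs(np.fft.rfft(frame)) + 1e-12)
    return np.fft.irfft(log_mag, n=len(frame))[:n_coeffs]

def fundamental_frequency(frame, fs, fmin=150.0, fmax=800.0):
    """Pitch from the strongest autocorrelation peak in [fmin, fmax]."""
    frame = frame - frame.mean()
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min, lag_max = int(fs / fmax), int(fs / fmin)
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return fs / lag

# Synthetic 441 Hz tone (period of exactly 100 samples at 44100 Hz).
fs = 44100
n = np.arange(4096)
frame = np.sin(2 * np.pi * 441.0 * n / fs)
windowed = frame * np.hanning(len(frame))   # reduces spectral leakage

f0 = fundamental_frequency(frame, fs)
centroid = spectral_centroid(windowed, fs)
cepstrum = real_cepstrum(windowed)
```

For a pure tone both f0 and the centroid should fall close to the tone frequency; for real howls the centroid also reflects the energy in the harmonics.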

34
Classification
  • Based on the features, the signal
    classification proceeds in two steps.
  • First, we design a two-class mechanism: adults
    and pups.
  • Next, we build an initial class and then a new
    class for each individual that does not belong
    to an existing class.
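
The second step can be sketched in Python as an incremental grouping: each feature vector joins the nearest existing class if it is close enough, and otherwise starts a new class. The slides do not specify the distance measure or the decision rule, so the Euclidean distance, the threshold and the feature values below are hypothetical, and the adult/pup split is not included.

```python
import numpy as np

def assign_classes(features, threshold):
    """Assign each feature vector to an existing class or open a new one.

    Returns the class label of every vector and the number of classes.
    """
    centroids, counts, labels = [], [], []
    for f in features:
        if centroids:
            dists = [np.linalg.norm(f - c) for c in centroids]
            k = int(np.argmin(dists))
        if not centroids or dists[k] > threshold:
            # No existing class is close enough: create a new one.
            centroids.append(np.array(f, dtype=float))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            # Update the running mean of the matched class.
            counts[k] += 1
            centroids[k] += (f - centroids[k]) / counts[k]
            labels.append(k)
    return labels, len(centroids)

# Made-up two-dimensional feature vectors for six separated howls.
features = np.array([
    [300.0, 1.0],
    [305.0, 1.1],
    [520.0, 2.0],
    [298.0, 0.9],
    [515.0, 2.1],
    [700.0, 3.0],
])
labels, n_classes = assign_classes(features, threshold=50.0)
print(labels, n_classes)
```

With these made-up values the six howls fall into three classes, so the number of classes returned plays the role of the estimated number of individuals.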

35
Estimation
  • The number of different classes found is the
    pack size estimate.
  • The number of individuals in the pup and adult
    classes shows the pack structure.

36
Future work
  • Separating audio signals recorded on a single
    channel is a very complex problem.

37
Future work
  • 1. Capturing the signal with two microphones
    would be an important improvement. It reduces
    the indeterminacy of the problem, simplifies the
    background noise processing and allows the echo
    signal to be processed.
  • 2. The background noise processing can be done
    with the wavelet transform.
  • 3. The background noise processing can be
    done with a Kalman recursion.
  • 4. The separation task, the most difficult step,
    can be performed with overcomplete signal
    dictionaries, using methods such as Best Basis,
    Basis Pursuit or Matching Pursuit.

38
Future work
  • 5. Following the human auditory system, Stéphane
    Maes [Maes96] suggests a method of nonlinear
    squeezing to derive the amplitude and phase
    components of the signal and then to derive the
    signal features ('wastrum' instead of cepstrum).
  • 6. The linear model selected to describe the
    signals can be enhanced by considering a more
    sophisticated representation of the excitation
    signal.

39
System architecture for Pattern Recognition in
Eco Systems
  • (The end)
  • You can go to http://coco.ccu.univi.es