1
Computer Science Department
A Speech/Music Discriminator using RMS and Zero-crossings
Costas Panagiotakis and George Tziritas
Department of Computer Science, University of Crete, Heraklion, Greece
2
Presentation Organization
  • I. Introduction
  • II. Segmentation
  • III. Classification
  • IV. Results
  • V. Conclusion

3
Introduction (1/3)
Input
Figure 1: Original sound signal (44100 or 22050 Hz sample rate)
Output
Figure 2: Real-time segmentation and classification (speech, music, silence)
4
Introduction (2/3)
Approaches
  • Feature extraction (energy, frequency)
  • Feature-based segmentation and classification
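The two feature families named above (energy and frequency) correspond to the RMS and zero-crossing measures in the title. A minimal sketch of per-frame extraction follows. The 20 msec frame length is an assumption inferred from "1 sec frames (50 RMS values)", and the function name `frame_features` is hypothetical, not taken from the presentation.

```python
import math

def frame_features(samples, sample_rate, frame_sec=0.02):
    """Return (rms, zero_crossings), one value per non-overlapping frame.

    frame_sec=0.02 is an assumption: 50 RMS values per 1 sec frame.
    """
    frame_len = round(sample_rate * frame_sec)
    rms, zc = [], []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Energy feature: root mean square amplitude of the frame.
        rms.append(math.sqrt(sum(x * x for x in frame) / frame_len))
        # Frequency feature: sign changes between consecutive samples.
        zc.append(sum(1 for a, b in zip(frame, frame[1:])
                      if (a >= 0) != (b >= 0)))
    return rms, zc

# Usage: one second of a 440 Hz tone sampled at 22050 Hz.
rate = 22050
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
rms, zc = frame_features(tone, rate)
```

Both features need only one pass over the samples, which fits the low-computation goal of a real-time system.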

Basic purpose
  • Real-time segmentation and classification
  • Algorithmic and computational constraints
  • Small number of features
  • Low error in locating changes (20 msec)
  • Short minimum distance between two successive changes (1 sec)
  • High accuracy (95%)


5
6
Segmentation (1/3)
Basic characteristics: RMS based
χ² distribution fits the RMS histograms well

Γ(a + 1)
μ: mean, σ²: variance
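The density formula on this slide did not survive extraction; only the Γ(a + 1) term and the mean and variance symbols remain. The sketch below therefore assumes a gamma-type form, p(x) = x^a e^(-x/b) / (b^(a+1) Γ(a+1)), with a and b matched to the sample mean and variance. This parameterization is an assumption consistent with the surviving terms, not necessarily the authors' exact formula, and the function names are hypothetical.

```python
import math

def fit_by_moments(values):
    """Return (a, b) matched to the sample mean and variance."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n
    a = mu * mu / var - 1.0   # shape minus one, so that mean = (a + 1) * b
    b = var / mu              # scale, so that variance = (a + 1) * b ** 2
    return a, b

def density(x, a, b):
    """Assumed form: x^a * exp(-x / b) / (b^(a + 1) * Gamma(a + 1)) for x > 0."""
    if x <= 0:
        return 0.0  # simplification: the point x = 0 is treated as zero density
    return x ** a * math.exp(-x / b) / (b ** (a + 1) * math.gamma(a + 1))

# Usage: fit a small set of RMS values and evaluate the density at the mean.
a, b = fit_by_moments([1.0, 2.0, 3.0])
p_at_mean = density(2.0, a, b)
```

Two moments per frame are enough to describe its RMS distribution under this assumption, which keeps the per-frame cost low.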
Two stage algorithm
  • Stage 1
  • 1 sec accuracy (low computation cost)
  • Stage 2
  • 20 msec accuracy (high computation cost)

7
Segmentation (2/3)
  • Stage 1
  • Partitioning into 1 sec frames (50 RMS values each)
  • Change in frame i ⇒ frames i-1 and i+1 have to differ
  • Computation of frame distance D (Matusita distance) using frame similarity (ρ)
  • Frame i is a candidate for Stage 2 (there is a change) if D(i) > threshold and D(i) is a local maximum

ρ(p1, p2)
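The Stage 1 rule can be sketched as follows. This is a simplified illustration under stated assumptions: each 1 sec frame is summarized by a normalized histogram of its 50 RMS values rather than by a fitted distribution, D is the classical Matusita distance sqrt(Σ(√p1 - √p2)²), and the bin count, threshold and function names are hypothetical.

```python
import math

def histogram(values, bins, upper):
    """Normalized histogram of non-negative values over [0, upper]."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v / upper * bins), bins - 1)] += 1
    return [c / len(values) for c in counts]

def matusita(p1, p2):
    """Classical Matusita distance between two discrete distributions."""
    return math.sqrt(sum((math.sqrt(x) - math.sqrt(y)) ** 2
                         for x, y in zip(p1, p2)))

def stage1_candidates(rms, per_frame=50, bins=10, threshold=0.5):
    """Indices of 1 sec frames that may contain a change."""
    upper = max(rms) or 1.0
    frames = [rms[k:k + per_frame]
              for k in range(0, len(rms) - per_frame + 1, per_frame)]
    hists = [histogram(f, bins, upper) for f in frames]
    # D(i) compares the neighbours of frame i: frames i-1 and i+1.
    dist = [0.0] * len(hists)
    for i in range(1, len(hists) - 1):
        dist[i] = matusita(hists[i - 1], hists[i + 1])
    # Keep frame i when D(i) exceeds the threshold and is a local maximum.
    return [i for i in range(1, len(hists) - 1)
            if dist[i] > threshold
            and dist[i] >= dist[i - 1] and dist[i] >= dist[i + 1]]

# Usage: synthetic RMS track with a level change at 4.5 sec (index 225).
rms_track = [0.1] * 225 + [0.9] * 175
candidates = stage1_candidates(rms_track)
```

On this synthetic track only frame 4, the frame that contains the change, is kept as a candidate.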

Figure: RMS versus time in 1 sec frames and the distance D, with a change in frame i
8
Segmentation (3/3)
  • Stage 2
  • 20 msec accuracy
  • For each candidate frame i from Stage 1:
  • 1. Slide two successive 1 sec frames over the region located before and after frame i
  • 2. Find the time instant where the two successive frames have the maximum Matusita distance in RMS distribution
  • Possible oversegmentation
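A minimal sketch of the Stage 2 refinement, under the same assumptions as a histogram-based Stage 1: each RMS value covers 20 msec, distributions are approximated by normalized histograms, and the distance is the classical Matusita distance. The helper and function names are hypothetical.

```python
import math

def _hist(values, bins, upper):
    """Normalized histogram of non-negative values over [0, upper]."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v / upper * bins), bins - 1)] += 1
    return [c / len(values) for c in counts]

def _matusita(p1, p2):
    """Classical Matusita distance between two discrete distributions."""
    return math.sqrt(sum((math.sqrt(x) - math.sqrt(y)) ** 2
                         for x, y in zip(p1, p2)))

def stage2_refine(rms, frame_index, per_frame=50, bins=10):
    """Return the RMS index (20 msec units) of the change near a candidate."""
    upper = max(rms) or 1.0
    first = frame_index * per_frame  # first RMS value of the candidate frame
    best_t, best_d = first, -1.0
    # Try every boundary t inside the candidate frame, one RMS value at a time.
    for t in range(first, first + per_frame + 1):
        if t - per_frame < 0 or t + per_frame > len(rms):
            continue  # not enough data for two full 1 sec frames
        before = _hist(rms[t - per_frame:t], bins, upper)
        after = _hist(rms[t:t + per_frame], bins, upper)
        d = _matusita(before, after)
        if d > best_d:
            best_t, best_d = t, d
    return best_t

# Usage: refine candidate frame 4 of a track whose level changes at index 225.
rms_track = [0.1] * 225 + [0.9] * 175
change_index = stage2_refine(rms_track, 4)
change_time_sec = change_index * 0.02
```

The search evaluates about 50 boundaries per candidate, which is why this stage is reserved for the few frames that pass Stage 1.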


Figure 10: The RMS data and the distance D
Figure 11: The segmentation result and the RMS data
9
10
11
12
Classification (4/4)
Silence segment recognition
Segment is silence ⇔ E < threshold
Decision-making algorithm
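The silence rule can be sketched in a few lines. The slide does not define E, so the mean squared amplitude used here is an assumption, as are the threshold value and the function name.

```python
def is_silence(samples, threshold=1e-4):
    """True when the segment energy E is below the threshold."""
    if not samples:
        return True  # an empty segment carries no energy
    # Assumed definition of E: mean squared amplitude of the segment.
    energy = sum(x * x for x in samples) / len(samples)
    return energy < threshold

# Usage: a flat segment is silence, a square wave at half scale is not.
quiet = is_silence([0.0] * 100)
loud = is_silence([0.5, -0.5] * 50)
```

Checking silence first is a natural design choice, since it needs only the energy and removes those segments from the speech/music decision.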
13
Results
Data
  • 11.328 sec speech
  • 3.131 sec music
Data source
  • 70% audio CDs
  • 15% WWW
  • 15% recordings
Segmentation performance
  • 97% detection probability
  • Change accuracy 0.2 sec
Figure: Actual features performance (accuracy by feature set: σ² and Cz; ZC0 and σ²; Fu and σ²; All; Cz; ZC0; σ²)
14
15
Segmentation - Classification Demo
16
Sound Player Demo