Singer similarity / identification

Slides: 19

Provided by: Thomas1286

Tags: identification | similarity | polyphase | singer

Transcript and Presenter's Notes

1
Singer similarity / identification

2
Introduction

Relatively easy for humans to identify a singing
voice in various contexts
Difficult to find time/environment-invariant
features for robust automatic identification
Growing demand for such systems as networked
databases keep expanding

3
Background (1)

Significant research in speaker identification,
but those systems perform poorly with singing voice
(inadequate training)
Singer identification research can draw much from
automatic instrument recognition systems
Artist / singer identification much harder than
song identification (due to necessity of
context-invariant features)

4
Background (2)

Often builds on speech / music discrimination
systems
Acoustical features heavily used to create an
N-dimensional Euclidean space: loudness, pitch,
brightness, bandwidth, harmonicity
Often uses the same tools as style identification
because each singer corresponds to a micro-style
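
As a rough illustration of the feature-space idea above, the sketch below computes three of the listed features for one audio frame and measures a Euclidean distance between two feature vectors. The definitions used here are assumptions chosen for illustration (RMS amplitude for loudness, spectral centroid for brightness, spectral spread for bandwidth); pitch and harmonicity are left out to keep it short, and the function names are hypothetical.

```python
import numpy as np

def acoustic_features(frame, sample_rate):
    """Return [loudness, brightness, bandwidth] for one mono audio frame.

    Assumed definitions: loudness = RMS amplitude, brightness = spectral
    centroid in Hz, bandwidth = power-weighted spread around the centroid.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    total = power.sum() + 1e-12  # guard against silent frames
    loudness = np.sqrt(np.mean(frame ** 2))
    brightness = (freqs * power).sum() / total
    bandwidth = np.sqrt((((freqs - brightness) ** 2) * power).sum() / total)
    return np.array([loudness, brightness, bandwidth])

def feature_distance(a, b, scale):
    """Euclidean distance after dividing each dimension by a scale factor,
    so that features measured in Hz do not swamp the amplitude feature."""
    return float(np.linalg.norm((np.asarray(a) - np.asarray(b)) / scale))
```

The per-dimension scaling is a design choice of this sketch: without some normalization, the distance would be dominated by whichever feature has the largest numeric range.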

5
Kim and Whitman overview

6
K&W feature extraction

Determine formant location and amplitude with a
12-pole linear predictor using the
autocorrelation method
Increases low-frequency resolution without
increasing model order by warping the frequency
representation with a function approximating the
Bark scale
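
A minimal sketch of the autocorrelation method for linear prediction, solved with the Levinson-Durbin recursion, is given below. It is a generic textbook formulation, not the presenters' code: the Hamming window, the function names, and the pole-to-formant conversion are assumptions, and the Bark-scale frequency warping mentioned above is not included.

```python
import numpy as np

def lpc_autocorrelation(frame, order=12):
    """Return (a, err): prediction-error filter a with a[0] == 1,
    and the residual energy, via the Levinson-Durbin recursion."""
    x = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:
            break  # silent or degenerate frame
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def formants_from_lpc(a, sample_rate):
    """Estimate resonance frequencies (Hz) and pole radii from the LPC
    polynomial; a radius closer to 1 indicates a sharper, stronger peak."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]  # keep one of each conjugate pair
    freqs = np.angle(roots) * sample_rate / (2 * np.pi)
    idx = np.argsort(freqs)
    return freqs[idx], np.abs(roots)[idx]
```

With `order=12`, at most six conjugate pole pairs are available, so at most six resonances can be read off per frame.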

7
K&W classification

8
K&W results

9
K&W Experimental results

10
Liu and Huang overview

Singer classification of MP3 files
First segment audio into phonemes
Calculate feature vector and store phoneme
feature vector with associated singer for
training set
Above feature vectors are used as discriminators
for classification of unknown MP3 music objects

11
L&H System Architecture

12
L&H segmentation features

Phoneme segmentation is derived from polyphase
filter coefficients by obtaining a frame energy
measurement
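
One way to picture the frame-energy step is sketched below. It assumes the polyphase filter-bank output has already been decoded into an array of shape (frames, subbands), 32 subbands in MPEG audio, and it places segment boundaries at low-energy local minima. The boundary rule and the `ratio` parameter are illustrative assumptions; the slide does not state the rule actually used.

```python
import numpy as np

def frame_energy(subband_samples):
    """Energy per frame: sum of squared subband coefficients.

    subband_samples: array of shape (n_frames, n_subbands).
    """
    s = np.asarray(subband_samples, dtype=float)
    return np.sum(s ** 2, axis=1)

def segment_boundaries(energy, ratio=0.5):
    """Indices of local energy minima that fall below ratio * mean energy.

    Assumed heuristic: a dip in energy marks the gap between two phonemes.
    """
    energy = np.asarray(energy, dtype=float)
    threshold = ratio * energy.mean()
    boundaries = []
    for i in range(1, len(energy) - 1):
        is_local_min = energy[i] <= energy[i - 1] and energy[i] < energy[i + 1]
        if is_local_min and energy[i] < threshold:
            boundaries.append(i)
    return boundaries
```

Working on subband coefficients rather than decoded PCM avoids running the synthesis filter bank, which is presumably the appeal of this approach for MP3 input.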

13
L&H phoneme database

14
L&H phoneme features

15
L&H classification (1)

Compares phoneme features with those in the
phoneme database
Discriminating radius (a Euclidean distance)
determines the uniqueness of a phoneme
Number of neighbors by the same singer within the
discriminating radius is called frequency (w)
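
The two quantities above can be sketched as follows. The slide does not say how the discriminating radius is chosen, so this example assumes it is the distance from a phoneme to the nearest database phoneme sung by a different singer; the frequency w is then the count of same-singer phonemes strictly inside that radius. Function and variable names are hypothetical.

```python
import numpy as np

def radius_and_frequency(features, singers):
    """Return (radii, freqs) for every phoneme in the database.

    Assumption: radius = distance to the nearest phoneme by another singer.
    freqs[i] = number of other phonemes by the same singer within radii[i].
    """
    X = np.asarray(features, dtype=float)
    singers = np.asarray(singers)
    # Pairwise Euclidean distances, shape (n, n)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    radii = np.empty(len(X))
    freqs = np.empty(len(X), dtype=int)
    for i in range(len(X)):
        other = singers != singers[i]
        same = ~other
        same[i] = False  # a phoneme is not its own neighbor
        radii[i] = dists[i, other].min() if other.any() else np.inf
        freqs[i] = int(np.sum(dists[i, same] < radii[i]))
    return radii, freqs
```

Under this assumption, a phoneme with a large radius and a high frequency sits in a region of feature space occupied by one singer only, which makes it a strong discriminator.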

16
L&H classification (2)

kNN classifier used to guess the artist of unknown
MP3 songs
For efficiency, uses only the first N phonemes of
the unknown MP3
Find the k closest neighbors in the database and
allow each to vote if its distance is within a threshold
Each neighbor casts a weighted vote that depends
on its frequency and distance

where w is frequency and
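
The voting procedure can be sketched as below. The weighting formula itself is not preserved in this transcript, so the sketch substitutes an assumed weight of w / (1 + d), where d is the Euclidean distance to the neighbor; this stand-in only respects the stated dependence on frequency and distance and should not be read as the presenters' formula. All names and default parameter values are hypothetical.

```python
import numpy as np
from collections import defaultdict

def classify_song(query_phonemes, db_features, db_singers, db_frequency,
                  k=3, threshold=1.0, n_first=None):
    """Guess the singer of a song from its phoneme feature vectors.

    Each of the k nearest database phonemes votes for its singer if it
    lies within `threshold`. ASSUMED weight: frequency / (1 + distance).
    Returns None when no neighbor is close enough to vote.
    """
    Q = np.asarray(query_phonemes, dtype=float)
    if n_first is not None:
        Q = Q[:n_first]  # only the first N phonemes, for efficiency
    D = np.asarray(db_features, dtype=float)
    votes = defaultdict(float)
    for q in Q:
        dists = np.linalg.norm(D - q, axis=1)
        for idx in np.argsort(dists)[:k]:
            if dists[idx] <= threshold:
                votes[db_singers[idx]] += db_frequency[idx] / (1.0 + dists[idx])
    if not votes:
        return None
    return max(votes, key=votes.get)
```

For example, `classify_song(song, feats, singers, w, k=5, threshold=0.8, n_first=50)` would classify using only the first 50 phonemes of the song.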

17
L&H results

18
Other works

Minnowmatch: MIR engine including artist
classification using NN and SVM (Whitman, Flake,
Lawrence (NEC))
Quest for ground truth in musical artist
similarity: determining an accurate measure of
similarity given the subjective nature of artist
classification (Ellis, Whitman, Berenzweig,
Lawrence)
