Title: Realtime Recognition of Orchestral Instruments
1Realtime Recognition of Orchestral Instruments
- Ichiro Fujinaga
- McGill University
2Overview
- Introduction
- Lazy learning (exemplar-based learning)
- k-NN classifier
- Genetic algorithm
- Features
- Results
- Conclusions
3Introduction
- Realtime recognition of isolated monophonic orchestral instruments
- Spectrum analysis by Miller Puckette's fiddle
- Adaptive system based on an exemplar-based classifier and a genetic algorithm
4Overall Architecture
- Input: live mic or sound file
- Data acquisition and data analysis (fiddle)
- Recognition: k-NN classifier
- Output: instrument name
- Knowledge base: feature vectors
- Off-line: genetic algorithm with k-NN classifier, yielding the best weight vector
5Exemplar-based learning
- The exemplar-based learning model is based on the idea that objects are categorized by their similarity to one or more stored examples
- There is much evidence from psychological studies to support exemplar-based categorization by humans
- This model differs from both rule-based and prototype-based (neural nets) models of concept formation in that it assumes no abstraction or generalization of concepts
- This model can be implemented using a k-nearest-neighbor classifier and is further enhanced by application of a genetic algorithm
6Exemplar-based categorization
- Objects are categorized by their similarity to one or more stored examples
- No abstraction or generalization, unlike rule-based or prototype-based models of concept formation
- Can be implemented using a k-nearest-neighbor classifier
- Slow and large storage requirements?
8K-nearest-neighbor classifier
- Determine the class of a given sample by its feature vector
- Distances between feature vectors of an unclassified sample and previously classified samples are calculated
- The class represented by the majority of the k nearest neighbors is then assigned to the unclassified sample
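The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the system's actual code: the function names, the plain Euclidean distance, and the two-feature exemplars are all assumptions made for the example.

```python
import math
from collections import Counter

def euclidean(x, y):
    """Plain Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def knn_classify(sample, exemplars, k=3):
    """Return the majority class among the k exemplars nearest to sample.

    exemplars is a list of (feature_vector, label) pairs.
    """
    # Sort all stored exemplars by distance and keep the k closest.
    nearest = sorted(exemplars, key=lambda ex: euclidean(sample, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical exemplars; the feature values are made up for illustration.
exemplars = [
    ((0.10, 0.20), "flute"),
    ((0.15, 0.25), "flute"),
    ((0.20, 0.10), "flute"),
    ((0.80, 0.90), "trumpet"),
    ((0.85, 0.80), "trumpet"),
]
print(knn_classify((0.2, 0.2), exemplars, k=3))  # flute
```

Every stored exemplar is compared against the new sample, which is why storage and speed are the usual concerns with this kind of classifier.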
9Example of k-NN classifier
13Distance measures
- The distance in an N-dimensional feature space between two vectors X and Y can be defined as
- A weighted distance can be defined as
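Assuming the usual Euclidean form for k-NN (an assumption; other metrics are possible), with x_i and y_i the components of X and Y and w_i a non-negative weight for feature i, the two definitions can be written as:

```latex
d(X, Y) = \sqrt{\sum_{i=1}^{N} (x_i - y_i)^2}
\qquad
d_w(X, Y) = \sqrt{\sum_{i=1}^{N} w_i \, (x_i - y_i)^2}
```

Setting every w_i to 1 reduces the weighted distance to the unweighted one; setting a weight to 0 removes that feature from the comparison.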
14Genetic algorithms
- Optimization based on biological evolution
- Maintenance of population using selection, crossover, and mutation
- Chromosomes: weight vectors
- Fitness function: recognition rate
- Leave-one-out cross validation
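One way these pieces might fit together is sketched below. It is an illustration under stated assumptions, not the system's implementation: tournament selection, single-point crossover, resampling mutation, elitism, the population size, and the toy data are all choices made for this example. Each chromosome is a weight vector, and its fitness is the leave-one-out recognition rate of a weighted nearest-neighbor classifier.

```python
import math
import random
from collections import Counter

def weighted_distance(x, y, w):
    """Euclidean distance with one non-negative weight per feature."""
    return math.sqrt(sum(wi * (a - b) ** 2 for a, b, wi in zip(x, y, w)))

def loo_recognition_rate(weights, data, k=1):
    """Leave-one-out: classify each exemplar against all the others."""
    correct = 0
    for i, (x, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        nearest = sorted(
            rest, key=lambda ex: weighted_distance(x, ex[0], weights))[:k]
        guess = Counter(lab for _, lab in nearest).most_common(1)[0][0]
        correct += guess == label
    return correct / len(data)

def evolve_weights(data, n_features, pop_size=20, generations=30,
                   mutation_rate=0.1, seed=0):
    """Return (best_weights, best_rate) found by a small generational GA."""
    rng = random.Random(seed)
    # The all-ones vector is seeded so that, with elitism, the result is
    # never worse than the unweighted classifier on this data.
    pop = [[1.0] * n_features]
    pop += [[rng.random() for _ in range(n_features)]
            for _ in range(pop_size - 1)]
    for _ in range(generations):
        scored = [(loo_recognition_rate(w, data), w) for w in pop]
        best = max(scored, key=lambda s: s[0])[1]

        def select():
            # Tournament selection: the fitter of two random chromosomes.
            a, b = rng.sample(scored, 2)
            return a[1] if a[0] >= b[0] else b[1]

        children = [best[:]]  # elitism: keep the best chromosome unchanged
        while len(children) < pop_size:
            p1, p2 = select(), select()
            cut = rng.randrange(1, n_features) if n_features > 1 else 0
            child = p1[:cut] + p2[cut:]  # single-point crossover
            child = [rng.random() if rng.random() < mutation_rate else g
                     for g in child]  # mutation: resample a gene
            children.append(child)
        pop = children
    scored = [(loo_recognition_rate(w, data), w) for w in pop]
    rate, best = max(scored, key=lambda s: s[0])
    return best, rate

# Hypothetical data: feature 0 separates the classes, feature 1 is noise.
data = [
    ((0.10, 0.90), "oboe"), ((0.20, 0.10), "oboe"), ((0.30, 0.50), "oboe"),
    ((0.60, 0.85), "cello"), ((0.70, 0.15), "cello"), ((0.80, 0.50), "cello"),
]
best_weights, best_rate = evolve_weights(data, n_features=2)
```

Because the fitness evaluation reclassifies every exemplar for every chromosome, this search is expensive, which fits running it off-line and using only the resulting weight vector at recognition time.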
15Features
- Static features (per window)
  - pitch
  - mass, or the integral of the curve (zeroth-order moment)
  - centroid (first-order moment)
  - variance (second-order central moment)
  - skewness (third-order central moment)
  - amplitudes of the harmonic partials
  - number of strong harmonic partials
  - spectral irregularity
  - tristimulus
- Dynamic features
  - means and velocities of static features over time
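The moment-based features and the dynamic features can be illustrated with a short sketch. The exact definitions used in the system are not given here, so the following are assumptions: the spectrum is a list of (frequency, amplitude) peaks, the moments are amplitude-weighted and normalized by the mass, and velocity is the average change per window. Pitch, irregularity, and tristimulus are not covered.

```python
def spectral_moments(peaks):
    """Moments of one analysis window.

    peaks is a list of (frequency_hz, amplitude) pairs.
    """
    mass = sum(a for _, a in peaks)  # zeroth-order moment
    if mass == 0:
        return {"mass": 0.0, "centroid": 0.0,
                "variance": 0.0, "third_moment": 0.0}
    centroid = sum(f * a for f, a in peaks) / mass  # first-order moment
    variance = sum(a * (f - centroid) ** 2 for f, a in peaks) / mass
    # Third central moment, left unstandardized; a standardized skewness
    # would divide this by variance ** 1.5.
    third = sum(a * (f - centroid) ** 3 for f, a in peaks) / mass
    return {"mass": mass, "centroid": centroid,
            "variance": variance, "third_moment": third}

def mean_and_velocity(values):
    """Mean of a static feature over time and its average change per window."""
    mean = sum(values) / len(values)
    if len(values) < 2:
        return mean, 0.0
    # The average of successive differences telescopes to (last - first) / (n - 1).
    return mean, (values[-1] - values[0]) / (len(values) - 1)

# Hypothetical peaks for one window.
m = spectral_moments([(100.0, 1.0), (200.0, 1.0), (300.0, 2.0)])
print(m["centroid"])                             # 225.0
print(mean_and_velocity([200.0, 225.0, 250.0]))  # (225.0, 25.0)
```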
16Data
- Original source: McGill Master Samples
- Over 1300 notes from 39 different timbres (23 orchestral instruments)
- Spectrum analysis by fiddle (2048 points)
- First 46–232 ms of attack (1–9 windows)
- Each analysis window (46 ms) consists of a list
of amplitudes and frequencies of the peaks in the
spectra
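The 46 ms window length is consistent with a 2048-point analysis if the recordings are sampled at 44.1 kHz, which is an assumption here since the sample rate is not stated. With the further assumption of 50% overlap between successive windows, nine windows span roughly 232 ms. A short check of that arithmetic:

```python
SAMPLE_RATE_HZ = 44_100          # assumption: CD-rate recordings
WINDOW_POINTS = 2048             # analysis size given for fiddle
HOP_POINTS = WINDOW_POINTS // 2  # assumption: 50% overlap between windows

def window_ms(points=WINDOW_POINTS, rate=SAMPLE_RATE_HZ):
    """Duration of one analysis window in milliseconds."""
    return 1000.0 * points / rate

def span_ms(n_windows, points=WINDOW_POINTS, hop=HOP_POINTS,
            rate=SAMPLE_RATE_HZ):
    """Total time covered by n overlapping windows, in milliseconds."""
    return 1000.0 * (points + (n_windows - 1) * hop) / rate

print(round(window_ms(), 1))  # 46.4
print(round(span_ms(9), 1))   # 232.2
```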
17Results
- Experiment I
  - SHARC data
  - static features
- Experiment II
  - fiddle
  - dynamic features
- Experiment III
  - more features
  - redefinition of attack point
18Conclusions
- Realtime timbre recognition system
- Analysis by Puckette's fiddle
- Recognition using dynamic features
- Adaptive recognizer by k-NN classifier enhanced with genetic algorithm
- A successful implementation of an exemplar-based classifier in a time-critical environment
19Future research
- Performer identification
- Speaker identification
- Tone-quality analysis
- Multi-instrument recognition
- Expert recognition of timbre
20Recognition rate for different lengths of analysis window
21Comparison with Human Performance