Title: Learning to Align Polyphonic Music
1Learning to Align Polyphonic Music
- Shai Shalev-Shwartz
- Hebrew University, Jerusalem
- Joint work with
- Yoram Singer, Google Inc.
- Joseph Keshet, Hebrew University
2Motivation
Two ways for representing music
Symbolic representation
Acoustic representation
3Symbolic Representation
symbolic representation
- pitch
- start-time
4Acoustic Representation
acoustic signal
Feature Extraction (e.g. Spectral Analysis)
acoustic representation
5The Alignment Problem Setting
actual start-time
6The Alignment Problem Setting
- Goal: learn an alignment function
actual start-times
alignment function
7Previous Work
- Dynamic Programming (rule based)
- Dannenberg 1984
- Soulez et al. 2003
- Orio & Schwarz 2001
- Generative Approaches
- Raphael 1999
- Durey & Clements 2001
- Shalev-Shwartz et al. 2002
8Our Solution
Discriminative Learning from examples
Training Set → Discriminative Learning Algorithm → Alignment function
9Why Discriminative Learning?
"When solving a given problem, try to avoid a more general problem as an intermediate step."
(Vladimir Vapnik's principle for solving problems using a restricted amount of information)
Or: if you would like to visit Barcelona, buy a ticket! Don't waste so much time on writing a paper for ISMIR 2004.
10Outline of Solution
- Define a quantitative assessment of alignments
- Define a hypotheses class: what is the form of our alignment functions?
- Map all possible alignments into vectors in an abstract vector-space
- Find a projection in the vector-space which ranks alignments according to their quality
- Suggest a learning algorithm
11Assessing alignments
e.g.
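As one illustration of a quantitative assessment, the sketch below scores a suggested alignment by the mean absolute deviation between its start-times and the actual start-times. This particular cost, and the name `alignment_cost`, are assumptions made for illustration; the slide's own example is not available here.

```python
from typing import Sequence


def alignment_cost(actual: Sequence[float], suggested: Sequence[float]) -> float:
    """Mean absolute deviation (in seconds) between two start-time sequences.

    Hypothetical cost: 0.0 means a perfect alignment, larger means worse.
    """
    if len(actual) != len(suggested):
        raise ValueError("both alignments must cover the same number of notes")
    if not actual:
        return 0.0
    total = sum(abs(a - s) for a, s in zip(actual, suggested))
    return total / len(actual)


# Three notes; the suggested alignment is off by 0.1 s and 0.2 s on two of them.
cost = alignment_cost([0.0, 1.0, 2.0], [0.1, 1.0, 2.2])
```

A cost of this kind makes it possible to distinguish a slightly incorrect alignment from a grossly incorrect one, rather than treating every error as equal.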
12Feature Functions for Alignment
A feature function for alignment takes the acoustic and symbolic representation together with a suggested alignment (actual start-times) and assesses the quality of that suggested alignment.
13Feature Functions for Alignment
Mapping all possible alignments into a vector
space
slightly incorrect alignment
correct alignment
grossly incorrect alignment
14Main Solution Principle
Find a linear projection that ranks alignments
according to their quality
slightly incorrect alignment
correct alignment
grossly incorrect alignment
15Main Solution Principle (cont.)
An example of a projection with low confidence
slightly incorrect alignment
correct alignment
grossly incorrect alignment
16Main Solution Principle (cont.)
An example of an incorrect projection
slightly incorrect alignment
correct alignment
grossly incorrect alignment
17Hypotheses class
- The form of our alignment functions: predict the alignment which attains the highest projection
- defines the direction of projection
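The prediction rule for this hypotheses class can be sketched as an argmax over candidate alignments. Everything named here (`predict_alignment`, `toy_features`, the weight vector `w`, and the list of detected onsets standing in for the acoustic representation) is a hypothetical illustration, and the brute-force search over an explicit candidate list is a simplification; a practical system would presumably search the space of alignments more efficiently.

```python
from typing import Callable, List, Sequence, Tuple

Alignment = Tuple[float, ...]  # one start-time per note, in seconds


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def predict_alignment(
    w: Sequence[float],
    candidates: List[Alignment],
    features: Callable[[Alignment], List[float]],
) -> Alignment:
    """Return the candidate whose feature vector has the highest projection on w."""
    return max(candidates, key=lambda y: dot(w, features(y)))


# Toy stand-in for the acoustic representation: onset times found in the signal.
detected_onsets = [0.0, 0.5, 1.1]


def toy_features(y: Alignment) -> List[float]:
    """One feature: negative total distance from each start-time to its nearest onset."""
    return [-sum(min(abs(t - o) for o in detected_onsets) for t in y)]


candidates = [(0.0, 0.5, 1.1), (0.2, 0.7, 1.3), (0.0, 0.9, 1.1)]
best = predict_alignment([1.0], candidates, toy_features)
```

With a positive weight on this single feature, the candidate that sits exactly on the detected onsets receives the highest projection.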
18Learning algorithm
- Optimization Problem
- Given a training set
- Find a projection and a maximal confidence scalar such that the data is ranked correctly
19Algorithmic aspects
- Iterative algorithm
- Works on one alignment example at a time
- The algorithm works in polynomial time although the number of constraints is exponentially large
- Simple to implement
- Convergence
- Converges to a high confidence solution
- The number of iterations depends on the best attainable confidence
- Generalization
- The gap between test and train error decreases with the number of examples. The gap is bounded above by
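To make the idea of an iterative algorithm that works on one alignment example at a time more concrete, here is a minimal sketch. The slides do not spell out the update rule, so a passive-aggressive style update is swapped in as an assumption; the function `train`, the `Example` layout, and the explicit list of incorrect candidates (instead of a search over the exponentially many alignments) are all hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
# One training example: the feature vector of the correct alignment, plus
# (feature vector, cost) pairs for some incorrect candidate alignments.
Example = Tuple[Vector, List[Tuple[Vector, float]]]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def train(examples: List[Example], dim: int, epochs: int = 10) -> Vector:
    """Learn a projection w that ranks correct alignments above incorrect ones."""
    w = [0.0] * dim
    for _ in range(epochs):
        for phi_correct, wrong in examples:
            # Find the most violated constraint: the incorrect alignment whose
            # margin falls furthest short of its cost.
            worst_loss, worst_phi = 0.0, None
            for phi_wrong, cost in wrong:
                margin = dot(w, phi_correct) - dot(w, phi_wrong)
                loss = cost - margin
                if loss > worst_loss:
                    worst_loss, worst_phi = loss, phi_wrong
            if worst_phi is None:
                continue  # every candidate is already ranked with enough margin
            diff = [a - b for a, b in zip(phi_correct, worst_phi)]
            norm_sq = dot(diff, diff)
            if norm_sq == 0.0:
                continue  # identical feature vectors cannot be separated
            tau = worst_loss / norm_sq  # smallest step that removes the violation
            w = [wi + tau * d for wi, d in zip(w, diff)]
    return w


# One example with a slightly incorrect (cost 0.1) and a grossly incorrect
# (cost 1.0) candidate alignment.
examples = [([1.0, 0.0], [([0.8, 0.1], 0.1), ([0.0, 1.0], 1.0)])]
w = train(examples, dim=2)
```

Requiring a larger margin for costlier mistakes is one way to encourage the projection to place grossly incorrect alignments further from the correct one than slightly incorrect ones.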
20Experimental Results
- Task: alignment of polyphonic piano music
- Dataset: 12 musical pieces where sound and MIDI were both recorded, plus other performances of the same pieces in MIDI format
- Features: see the paper
- Algorithms
- Discriminative method
- Generative method: Generalized Hidden Markov Model (GHMM)
- Using the same features as in the discriminative method
- Using different numbers of Gaussians (1, 3, 5, 7)
21Experimental Results (Cont.)
Our discriminative method outperforms GHMM
22The End