Title: An Efficient Online Algorithm for Hierarchical Phoneme Classification
1An Efficient Online Algorithm for Hierarchical
Phoneme Classification
- Joseph Keshet
- joint work with Ofer Dekel and Yoram Singer
- The Hebrew University, Israel
MLMI 04 Martigny, Switzerland
2Motivation
Phonetic transcription of DECEMBER
Gross errors
d ix CH eh m bcl b er
Minor errors
d AE s eh m bcl b er
d ix s eh NASAL bcl b er
3Hierarchical Classification
- Goal: spoken phoneme recognition
PHONEMES
- Obstruents
  - Plosives: b, d, g, k, p, t
  - Fricatives: f, v, s, z, sh, zh, th, dh
  - Affricates: jh, ch
- Silences
- Sonorants
  - Nasals: n, m, ng
  - Liquids: l, r, y, w
  - Vowels (Front, Center, Back): iy, ih, ey, eh, ae, oy, ow, uh, uw, ay, aa, ao, er, aw
4Metric Over Phonetic Tree
- A given hierarchy induces a metric over the set of phonemes: the tree distance
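To make the tree distance concrete, here is a minimal sketch. It assumes the distance is the number of edges on the unique path between two nodes, and it uses a small hypothetical hierarchy (`PARENT`) rather than the full phonetic tree.

```python
# Hypothetical toy hierarchy: child -> parent (the root maps to None).
PARENT = {
    "phonemes": None,
    "obstruents": "phonemes",
    "sonorants": "phonemes",
    "plosives": "obstruents",
    "fricatives": "obstruents",
    "nasals": "sonorants",
    "b": "plosives",
    "d": "plosives",
    "s": "fricatives",
    "m": "nasals",
}

def path_to_root(node, parent=PARENT):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path

def tree_distance(a, b, parent=PARENT):
    """Number of edges on the unique path between a and b (assumed definition)."""
    steps_from_a = {n: i for i, n in enumerate(path_to_root(a, parent))}
    for steps_from_b, n in enumerate(path_to_root(b, parent)):
        if n in steps_from_a:  # first common ancestor
            return steps_from_a[n] + steps_from_b
    raise ValueError("nodes are not in the same tree")

print(tree_distance("b", "d"))         # siblings: 2
print(tree_distance("b", "plosives"))  # parent: 1
print(tree_distance("b", "m"))         # different top-level branches: 6
```

Under this definition, sibling confusions and parent predictions get small distances, while confusions across top-level branches get large ones.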
6Metric Over Phonemes
- Metric semantics: $\gamma(a,b)$ is the severity of predicting phoneme group $b$ instead of the correct phoneme $a$
- Our high-level goal:
  - Tolerate minor errors
    - Sibling errors
    - Under-confident predictions (predicting a parent)
  - But avoid major errors
7Hierarchical Classifier
- Assume and
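- Associate a prototype with each phoneme
- Score of a phoneme: the inner product of its prototype with the acoustic vector
- Classification rule: predict the phoneme with the highest score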
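A minimal sketch of a prototype-based classifier of this kind, assuming the score of a phoneme is the inner product between its prototype and the acoustic vector and the rule picks the highest-scoring phoneme. The prototypes and the input vector are hypothetical values chosen for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def classify(prototypes, x):
    """Return the label whose prototype has the largest inner product with x."""
    return max(prototypes, key=lambda label: dot(prototypes[label], x))

# Hypothetical 2-dimensional prototypes for three phonemes.
prototypes = {"b": [1.0, 0.0], "d": [0.8, 0.6], "m": [-1.0, 0.2]}
x = [0.5, 1.0]

print(classify(prototypes, x))  # d
```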
8Hierarchical Classifier
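- Goal: maintain each phoneme's prototype close to its parent's prototype
- Define the difference between a node's prototype and its parent's prototype
- Goal: maintain small differences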
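One way to read the goals on this slide is sketched below. It assumes each node stores a difference vector (its prototype minus its parent's prototype), so a prototype is the sum of the difference vectors along the path from the root. The tree, the vectors, and the helper names are hypothetical.

```python
# Hypothetical toy hierarchy: child -> parent (the root maps to None).
PARENT = {"root": None, "plosives": "root", "b": "plosives", "d": "plosives"}

# Hypothetical per-node difference vectors (node prototype minus parent prototype).
DIFF = {
    "root": [0.0, 0.0],
    "plosives": [1.0, 0.0],
    "b": [0.1, 0.2],
    "d": [-0.1, 0.3],
}

def prototype(node):
    """Sum the difference vectors on the path from `node` up to the root."""
    total = [0.0] * len(DIFF[node])
    while node is not None:
        total = [t + d for t, d in zip(total, DIFF[node])]
        node = PARENT[node]
    return total

def sq_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v))

# Small difference vectors keep siblings close to their shared parent,
# and therefore close to each other.
print(prototype("b"), prototype("d"))
print(sq_distance(prototype("b"), prototype("plosives")))
```

With this parameterization, keeping the difference vectors small is the same as keeping every prototype near its parent's prototype.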
9Online Learning
- For each round:
  - Receive an acoustic vector
  - Predict a phoneme
  - Receive the correct phoneme
  - Suffer a tree-based penalty
  - Apply the update rule to obtain the next set of prototypes
Goal: suffer a small cumulative tree error
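The protocol above can be sketched as a short loop. The update rule's formula is not given in this text, so the step used here is an assumption: a passive-aggressive-style correction that moves only the vectors on the parts of the two tree paths that differ. The toy tree, the data, and all names are hypothetical.

```python
import math

# Hypothetical toy hierarchy: child -> parent (the root maps to None).
PARENT = {"root": None, "obs": "root", "son": "root",
          "b": "obs", "s": "obs", "m": "son"}

def path(v):
    """Set of nodes on the path from the root to v, inclusive."""
    nodes = set()
    while v is not None:
        nodes.add(v)
        v = PARENT[v]
    return nodes

def tree_distance(a, b):
    """Edges between a and b: nodes lying on exactly one of the two paths."""
    return len(path(a) ^ path(b))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

class TreeLearner:
    def __init__(self, dim):
        self.w = {v: [0.0] * dim for v in PARENT}  # one vector per tree node

    def score(self, v, x):
        # Score of node v: sum of inner products along its path from the root.
        return sum(dot(self.w[u], x) for u in path(v))

    def predict(self, x):
        return max(PARENT, key=lambda v: self.score(v, x))

    def update(self, x, y, y_hat):
        gamma = tree_distance(y, y_hat)
        if gamma == 0:
            return  # correct prediction: leave the vectors unchanged
        # ASSUMED step: hinge-style loss with margin sqrt(gamma).
        loss = max(0.0, self.score(y_hat, x) - self.score(y, x) + math.sqrt(gamma))
        alpha = loss / (gamma * dot(x, x))
        for v in path(y) - path(y_hat):      # promote the correct path
            self.w[v] = [wi + alpha * xi for wi, xi in zip(self.w[v], x)]
        for v in path(y_hat) - path(y):      # demote the predicted path
            self.w[v] = [wi - alpha * xi for wi, xi in zip(self.w[v], x)]

def run_online(learner, examples):
    """One pass of the protocol; returns the cumulative tree error."""
    cumulative = 0
    for x, y in examples:                      # receive an acoustic vector
        y_hat = learner.predict(x)             # predict a phoneme
        cumulative += tree_distance(y, y_hat)  # suffer the tree-based penalty
        learner.update(x, y, y_hat)            # apply the update rule
    return cumulative

examples = [([1.0, 0.0, 0.0], "b"), ([0.0, 1.0, 0.0], "s"), ([0.0, 0.0, 1.0], "m")]
learner = TreeLearner(dim=3)
errors_per_pass = [run_online(learner, examples) for _ in range(2)]
print(errors_per_pass)  # [6, 0]
```

On this toy data every first-round prediction is the root (distance 2 from each leaf), and a single update per example is enough for the second pass to be error-free.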
10Tree Loss
- Difficult to minimize directly
- Instead, upper bound it by a surrogate based on $[z]_+ = \max\{0, z\}$, also known as the hinge loss
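A small numeric check of how a hinge-based surrogate can upper bound the tree loss. The exact bound is not given in this text, so the form used here is an assumption: the hinge of the score gap plus a margin equal to the square root of the tree distance. When the predicted label has the highest score, the squared surrogate is then at least the tree distance.

```python
import math

def hinge(z):
    """The hinge function [z]_+ = max(0, z)."""
    return max(0.0, z)

def surrogate_loss(score_true, score_pred, tree_dist):
    """ASSUMED surrogate: hinge of the score gap plus a sqrt(tree distance) margin."""
    return hinge(score_pred - score_true + math.sqrt(tree_dist))

# Hypothetical (true score, predicted score, tree distance) triples with
# score_pred >= score_true, as is the case when the prediction is the arg-max.
cases = [(0.2, 0.9, 4), (1.0, 1.0, 1), (-0.5, 0.3, 6)]

for score_true, score_pred, dist in cases:
    loss = surrogate_loss(score_true, score_pred, dist)
    print(dist, round(loss ** 2, 3), loss ** 2 >= dist)
```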
11Online Update
[Figure: a tree whose nodes are labeled with the vectors w0 through w10]
12Loss Bound Theorem
- sequence of examples
- satisfies
- Then
- where and
13Extension Kernels
- Since
- Note that
- Therefore
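A minimal sketch of the usual kernel argument, under the assumption that each vector is built only by adding scaled copies of observed instances. A score can then be computed from kernel evaluations against the stored instances, without forming the vector explicitly. The class `KernelPrototype` and its methods are hypothetical names.

```python
import math

def linear_kernel(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def rbf_kernel(u, v, sigma=1.0):
    """Gaussian RBF kernel; sigma is a hypothetical width parameter."""
    sq = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    return math.exp(-sq / (2.0 * sigma ** 2))

class KernelPrototype:
    """A vector kept implicitly as a weighted sum of stored instances."""

    def __init__(self, kernel):
        self.kernel = kernel
        self.support = []  # list of (coefficient, stored instance)

    def add(self, coeff, x):
        """Record an update of the form w <- w + coeff * x."""
        self.support.append((coeff, x))

    def score(self, x):
        """Inner product with x, expressed through kernel evaluations only."""
        return sum(c * self.kernel(s, x) for c, s in self.support)

# With the linear kernel, the implicit score matches the explicit inner product.
proto = KernelPrototype(linear_kernel)
proto.add(0.5, [1.0, 2.0])
proto.add(-0.25, [0.0, 1.0])
explicit_w = [0.5, 0.75]  # 0.5*[1, 2] - 0.25*[0, 1]
x = [2.0, 4.0]
print(proto.score(x), linear_kernel(explicit_w, x))  # 4.0 4.0
```

Swapping `linear_kernel` for `rbf_kernel` changes only the constructor argument; the stored coefficients and instances are used in the same way.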
14Experiments
- Synthetic data
  - Symmetric tree of depth 4, fan-out 3, 121 labels
  - Prototypes: an orthogonal set, with Gaussian noise
  - 100 train instances and 50 test instances per label
- Phoneme recognition
  - Subset of the TIMIT corpus
  - 55 phonemes and phoneme groups
  - MFCC+Δ+ΔΔ front-end, concatenation of 5 frames
  - RBF kernel
  - 2000 train vectors and 500 test vectors per phoneme
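The label count of the synthetic tree can be checked directly: a symmetric tree of depth 4 with fan-out 3 has 1 + 3 + 9 + 27 + 81 = 121 nodes. The sketch below builds such a tree as a child-to-parent map; the function name and the integer node ids are illustrative choices.

```python
def build_symmetric_tree(depth, fan_out):
    """Return (parent map, leaves) for a complete tree with the given shape."""
    parent = {0: None}  # node 0 is the root
    frontier = [0]
    next_id = 1
    for _level in range(depth):
        new_frontier = []
        for node in frontier:
            for _child in range(fan_out):
                parent[next_id] = node
                new_frontier.append(next_id)
                next_id += 1
        frontier = new_frontier
    return parent, frontier

parent, leaves = build_symmetric_tree(depth=4, fan_out=3)
print(len(parent), len(leaves))  # 121 81
```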
15Experiments
- Multiclass: ignore the hierarchy
16Results
                               Averaged Tree Error    Multiclass Error (%)
Synthetic data (tree)          0.05                   5
Synthetic data (multiclass)    0.11                   8.6
Synthetic data (greedy)        0.52                   34.9
Phonemes (tree)                1.3                    40.6
Phonemes (multiclass)          1.41                   41.8
Phonemes (greedy)              2.48                   58.2
17Results
[Figure: difference between the tree error rates of the tree algorithm and the multiclass (MC) algorithm. Axis label: Tree err - MC err. Annotations: minor errors, gross errors. Panels: Synthetic data, Phonemes.]
18Tree vs. Multiclass Online Learning
- Similarity between the prototypes in Multiclass
and Tree training
19Thanks!