Objective intelligibility assessment of pathological speakers


1
  • Objective intelligibility assessment of
    pathological speakers
  • Catherine Middag, Gwen Van Nuffelen,
  • Jean-Pierre Martens, Marc De Bodt

2
Introduction
  • Intelligibility = popular measure for
    pathological speech assessment
  • Perceptual assessment affected by non-speech
    information
      • familiarity with speaker and type of disorder
      • usage of linguistic context
  • Word intelligibility tests designed to eliminate
    bias due to linguistic context
  • Replacing the human listener by an automatic
    speech recognizer (ASR) can solve the other
    problems, but is the ASR sufficiently reliable?
  • Test case: automation of the Dutch
    Intelligibility Assessment (DIA)

3
Dutch Intelligibility Assessment (DIA)
  • 50 isolated CVC words
  • intelligibility = percentage of phonemes correct
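
The score above can be sketched in a few lines. This is only an illustration: the slides do not describe the DIA scoring procedure, so the position-by-position comparison of target and perceived phonemes, the function name and the toy words are all assumptions.

```python
def percent_phonemes_correct(targets, responses):
    """Percentage of target phonemes reproduced in the same position."""
    total = 0
    correct = 0
    for target, response in zip(targets, responses):
        # Each item is a (consonant, vowel, consonant) triple.
        for expected, heard in zip(target, response):
            total += 1
            correct += int(expected == heard)
    return 100.0 * correct / total


# Hypothetical CVC targets and what a listener wrote down.
targets = [("p", "o", "l"), ("b", "a", "k"), ("t", "i", "n")]
responses = [("p", "o", "l"), ("p", "a", "k"), ("t", "e", "m")]

score = percent_phonemes_correct(targets, responses)
print(f"{score:.1f} % phonemes correct")
```

Here 6 of the 9 target phonemes match, so the toy speaker scores about 66.7 %.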

4
How to apply ASR in the DIA?
  • Two approaches
      • let ASR recognize the words and count the
        percentage of correct decisions
      • let ASR check how well the acoustics match
        the phonetic transcription of the target word
        (alignment)
  • Our experience
      • intelligibility emerging from the first approach
        is insufficiently reliable
      • therefore we developed a system based on alignment

5
System architecture (flow chart): speech aligner → speaker features → Intelligibility Prediction Model → objective score

6
System architecture (flow chart): speech aligner
  • Two systems
      • complex state-of-the-art HMM-based system
        (ASR-ESAT)
      • simple system with phonological layer
        (ASR-ELIS), which points more directly to
        articulatory problems

7
System architecture (flow chart): speech aligner → speaker features → Intelligibility Prediction Model → objective score
  • Two feature sets
      • Phonemic features (patient has trouble
        pronouncing a certain phoneme)
      • Phonological features (patient has problems with
        voicing, manner or place of articulation)

8
Extraction of phonemic features (PMF)
Speech aligner: ASR-ESAT
Phonemic features
  • (0.7 + 0.5 + 0.3) / 3
  • /p/: (0.4 + 0.8) / 2
  • /o/: (0.6 + 0.8) / 2
  • /l/: 0.6
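
The averaging shown on this slide can be sketched as follows. The sketch assumes that the aligner returns one score per phoneme occurrence and that a phonemic feature is the mean of those scores; the function name and the input layout are hypothetical, and only the labelled values from the slide are used.

```python
from collections import defaultdict


def phonemic_features(aligned_scores):
    """Average the per-occurrence aligner scores of each phoneme."""
    per_phoneme = defaultdict(list)
    for phoneme, score in aligned_scores:
        per_phoneme[phoneme].append(score)
    return {p: sum(s) / len(s) for p, s in per_phoneme.items()}


# (phoneme, score) pairs in the order the aligner would emit them.
aligned_scores = [("p", 0.4), ("o", 0.6), ("l", 0.6), ("p", 0.8), ("o", 0.8)]

pmf = phonemic_features(aligned_scores)
for phoneme, value in sorted(pmf.items()):
    print(f"/{phoneme}/ {value:.2f}")
```

Each speaker ends up with one value per phoneme, which is what makes the feature vector comparable across speakers.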

9
Extraction of phonological features (PLF)
Speech aligner: ASR-ELIS
Phonological features
  • Burst: 0.6
  • Back: (0.7 + 0.9) / 2
  • Voiced: (0.8 + 0.6 + 0.5) / 3

10
Extraction of phonological features (PLF)
Speech aligner: ASR-ELIS
Phonological features
  • Not burst: (0.2 + 0.1 ...
  • Not back: (0.1 + 0.1 ...
  • Not voiced: (0.1 + 0.1 ...

11
Extraction of phonological features (PLF)
Speech aligner: ASR-ELIS
Phonological features
Irrelevant features for these phones
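
The three PLF slides (feature expected, feature expected to be absent, feature irrelevant) can be summarised in one sketch. It assumes that ASR-ELIS yields a score per phone for each phonological feature and that each phone carries a canonical value for that feature: True, False, or None when the feature is irrelevant. The tuple layout and all names are hypothetical. The positive scores come from the slide; the negative scores are invented for illustration because the slide values are truncated.

```python
from collections import defaultdict


def phonological_features(observations):
    """Mean score per feature, split into expected-present and expected-absent."""
    buckets = defaultdict(list)
    for name, expected, score in observations:
        if expected is None:
            continue  # feature is irrelevant for this phone
        key = name if expected else f"not_{name}"
        buckets[key].append(score)
    return {key: sum(v) / len(v) for key, v in buckets.items()}


observations = [
    ("burst", True, 0.6),
    ("back", True, 0.7), ("back", True, 0.9),
    ("voiced", True, 0.8), ("voiced", True, 0.6), ("voiced", True, 0.5),
    ("burst", False, 0.25), ("burst", False, 0.05),  # invented values
    ("back", None, 0.4),                             # irrelevant, skipped
]

plf = phonological_features(observations)
for key, value in sorted(plf.items()):
    print(f"{key}: {value:.2f}")
```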
12
System architecture (flow chart): speech aligner → speaker features → Intelligibility Prediction Model → objective score

13
Intelligibility prediction model (IPM)
  • Objective
      • map speaker features (PMF, PLF or
        combinations) to speaker intelligibility score
  • Model training
      • train on DIA recordings
      • pathological speakers (+ some normal control
        speakers)
  • Model type and size
      • limited number of pathological speakers
      • high number of features
      • → linear regression model
      • → feature selection
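
A minimal sketch of a linear regression model with feature selection follows. The slides do not say which selection procedure was used, so greedy forward selection on the training error is an assumption here, as are all function names and the toy data (three features per speaker, scores generated from features 0 and 2 only). A real system would select on held-out data.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(rows, y, cols):
    """Least-squares weights (intercept first) using only the columns in cols."""
    x = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(cols) + 1
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


def sse(rows, y, cols):
    """Sum of squared training errors of the model restricted to cols."""
    w = fit(rows, y, cols)
    preds = [w[0] + sum(wi * row[c] for wi, c in zip(w[1:], cols)) for row in rows]
    return sum((p - t) ** 2 for p, t in zip(preds, y))


def forward_select(rows, y, max_features):
    """Greedily add the feature that lowers the training error most."""
    selected = []
    remaining = list(range(len(rows[0])))
    while remaining and len(selected) < max_features:
        best = min(remaining, key=lambda c: sse(rows, y, selected + [c]))
        selected.append(best)
        remaining.remove(best)
    return selected, fit(rows, y, selected)


# Toy data: one row of three speaker features per speaker.
rows = [(0.9, 0.2, 0.8), (0.5, 0.7, 0.9), (0.7, 0.1, 0.3),
        (0.2, 0.6, 0.5), (0.4, 0.9, 0.1), (0.8, 0.4, 0.6)]
y = [90, 68, 68, 42, 46, 80]  # generated as 20 + 60 * f0 + 20 * f2

selected, weights = forward_select(rows, y, max_features=2)
print(selected, [round(w, 3) for w in weights])
```

Capping the number of selected features is one way to keep the model small when speakers are few and features are many, which is the trade-off the slide points at.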

14
Reference material (DIA)
  • 211 speakers
      • 51 normals
      • 60 dysarthric
      • 12 clefts
      • 42 hearing impaired
      • 37 with laryngectomy
      • 7 with dysphonia
      • 2 others
  • Pathological speakers: mean intelligibility of 78.7%
  • Normals: mean intelligibility of 93.3%
  • Few with very low score

15
Results: individual systems
  • Based on five-fold cross-validation
  • Measure: Pearson correlation coefficient (PCC)

  • ASR-ELIS, PLF: PCC = 0.78
  • ASR-ESAT, PMF: PCC = 0.80

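
The evaluation protocol can be sketched as below. It assumes that the held-out predictions of all five folds are pooled and that a single PCC is computed against the perceptual scores; the slides do not say whether the PCC was pooled or averaged per fold. A one-feature regression stands in for the real model, and the data and names are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one feature."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def cross_validated_pcc(x, y, folds=5):
    """Predict every speaker from a model trained without that speaker's fold."""
    predictions = [0.0] * len(x)
    for k in range(folds):
        train = [i for i in range(len(x)) if i % folds != k]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in range(k, len(x), folds):  # speakers held out in fold k
            predictions[i] = slope * x[i] + intercept
    return pearson(predictions, y)


# Invented data: one speaker feature and the perceptual score per speaker.
feature = [0.2, 0.3, 0.4, 0.5, 0.55, 0.6, 0.7, 0.8, 0.85, 0.9]
perceived = [35, 48, 52, 63, 70, 71, 80, 88, 95, 97]

pcc = cross_validated_pcc(feature, perceived)
print(f"PCC = {pcc:.2f}")
```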
16
Results: combined system
  • PMF + PLF
  • PCC = 0.86

17
Results: pathology-specific IPM
  • Instead of creating one general IPM, one can
    create IPMs for specific pathologies
      • still trained on all speakers (enough speakers)
      • model selection based on performance on speakers
        of that pathology (importance of features depends
        on type of disorder)

18
Results: pathology-specific IPM
  • Dysarthria: PCC = 0.94 (red circles)
  • Dispersion of other speakers is increased
  • Largest deviations in low-intelligibility area
      • scarce data in that area
      • can be solved by adding more weight to patients
        with very low intelligibility
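
One way to add more weight to patients with very low intelligibility is weighted least squares, sketched here for a single feature. The slides do not state how the weighting would be done, so the threshold of 60, the weight of 3 and all names are hypothetical, and the data are invented.

```python
def weighted_fit(x, y, w):
    """Weighted least-squares slope and intercept for one feature."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    var = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    slope = cov / var
    return slope, my - slope * mx


def low_score_weights(y, threshold=60.0, boost=3.0):
    """Give speakers below the threshold a larger weight (assumed scheme)."""
    return [boost if score < threshold else 1.0 for score in y]


def low_score_error(x, y, model, threshold=60.0):
    """Sum of squared errors over the low-intelligibility speakers only."""
    slope, intercept = model
    return sum((slope * xi + intercept - yi) ** 2
               for xi, yi in zip(x, y) if yi < threshold)


# Invented data: few speakers with very low scores, as on the slide.
feature = [0.1, 0.2, 0.6, 0.7, 0.8, 0.9]
perceived = [20, 35, 80, 85, 90, 93]

plain = weighted_fit(feature, perceived, [1.0] * len(feature))
boosted = weighted_fit(feature, perceived, low_score_weights(perceived))
print(low_score_error(feature, perceived, plain),
      low_score_error(feature, perceived, boosted))
```

On this toy set the boosted model fits the two low-scoring speakers more closely, at the cost of a slightly worse fit elsewhere.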

19
Development of DIA-tool
  • PMF and PLF can predict intelligibility of
    pathological speech
  • Combining PMF and PLF yields high PCCs
      • 0.86 for general model
      • over 0.91 for pathology-specific model
  • PCCs for specific pathologies compete with
    subjective inter-rater agreements (0.91)
  • This opens up possibilities for development of an
    automated version of the DIA (see demonstration
    later) based on PLF + PMF

20
New feature set: context-dependent phonological
features (CD-PLF)
  • Until now
      • PMF: Does the patient have trouble pronouncing a
        certain phoneme?
      • PLF: Does the patient have problems with
        voicing, manner or place of articulation?
  • New: Does the patient have problems with a
    desired change of voicing, manner or place of
    articulation?
  • → CD-PLFs: how well is a change in PLF realized?
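
The slides do not specify how CD-PLFs are extracted, so the following is only one possible reading of "how well is a change in PLF realized": for each pair of neighbouring phones where the canonical value of a feature flips, score how strongly the acoustics support the new value, then average per feature and direction. The data layout, the key format and all names are assumptions.

```python
from collections import defaultdict


def cd_plf(phones):
    """Mean realisation score of every requested change in a feature.

    phones: time-ordered list of dicts mapping a feature name to
    (expected, score), where score is the evidence that the feature is
    present. Features irrelevant for a phone are left out of its dict.
    """
    buckets = defaultdict(list)
    for prev, cur in zip(phones, phones[1:]):
        for name, (expected, score) in cur.items():
            if name not in prev or prev[name][0] == expected:
                continue  # no change requested for this feature
            # Evidence for the new target value of the feature.
            realised = score if expected else 1.0 - score
            direction = "off->on" if expected else "on->off"
            buckets[f"{name}:{direction}"].append(realised)
    return {key: sum(v) / len(v) for key, v in buckets.items()}


# Invented example for a three-phone word.
word = [
    {"voiced": (False, 0.1)},
    {"voiced": (True, 0.8), "back": (True, 0.7)},
    {"voiced": (True, 0.6), "back": (False, 0.2)},
]

features = cd_plf(word)
print(features)
```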

21
Extraction of context-dependent phonological
features (CD-PLF)
Speech aligner: ASR-ELIS
CD-PLF features

22
Results for CD-PLF
  • CD-PLFs alone compete with previous best
    (PLF + PMF: 0.86)
  • CD-PLF + PMF: 0.90 → new best!
  • Pathology-specific results for CD-PLF + PMF

23
Conclusions and future work
  • PMF, PLF and CD-PLF can predict intelligibility
    of pathological speech
  • CD-PLFs seem to play an important role
      • CD-PLF: PCC = 0.87
      • CD-PLF + PMF: PCC = 0.90
      • → not the articulation pattern but the change in
        the articulation pattern matters?
  • More research is needed before adding this
    feature set to the tool
  • High PCCs open up new possibilities for
      • more profound articulatory assessment, which is
        directly related to determination of appropriate
        therapy
      • monitoring of effectiveness of chosen therapy →
        tool
      • using more natural speech (words, phrases) in
        tests

24
  • Questions?