Synthesis - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Synthesis and evaluation of prosodically
exaggerated utterances: A preliminary study
  • Kyuchul Yoon
  • Division of English
  • Kyungnam University
  • Spring 2008 Joint Conference of KSPS & KASS

2
Contents
  • Synthesis and evaluation of human utterances with
    exaggerated prosody
  • Synthesis of exaggerated prosody
  • Useful for native utterances
  • The definition of prosody exaggeration
  • The algorithm
  • Evaluation of exaggerated prosody
  • Useful for evaluating learner utterances
  • The algorithm and an experiment

3
Teaching and evaluating prosody
  • Teaching language prosody
  • The need for exaggeration of native utterances
  • How to define exaggeration
  • Evaluating language prosody
  • Given the native version of an utterance,
    evaluate learners' utterances with atypical prosody
  • How to measure the differences between the native
    and learner utterances

4
Exaggerating native prosody
  • Exaggeration of the F0 contour
  • One way would be to make the pitch peaks/valleys
    higher/lower
  • Exaggeration of the intensity contour
  • One way would be to manipulate the intensity
    contour of the pitch peaks/valleys
  • Exaggeration of the segmental durations
  • One way would be to manipulate the segmental
    durations of the pitch peaks/valleys

5
Exaggerating native prosody
F0
The fundamental frequency (F0) contour of the
utterance "Marianna!".
6
Exaggerating native prosody
Intensity
The intensity contour of the utterance "Marianna!".
7
Exaggerating native prosody
Duration
The segmental durations of the utterance "Marianna!"
before and after the exaggeration.
8
Algorithm: prosody exaggeration
  • Definition of prosody exaggeration
  • F0 contour
  • Make pitch peaks/valleys higher/lower in Hz
    values
  • Intensity contour
  • Make pitch peaks higher in dB values
  • Segmental durations
  • Make pitch peaks longer, as a multiple of the
    original duration
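
The F0 part of this definition can be sketched in Python. This is only an illustration under stated assumptions, not the Praat script used in the study: the contour is assumed to be a plain list of Hz values for voiced frames, peaks and valleys are assumed to be simple local extrema, and the function name `exaggerate_f0` and parameter `shift_hz` are hypothetical.

```python
def exaggerate_f0(contour_hz, shift_hz):
    """Raise local pitch peaks and lower local valleys by shift_hz (in Hz).

    Assumption: a peak/valley is a frame strictly higher/lower than both
    of its neighbours. The first and last frames are left unchanged.
    """
    out = list(contour_hz)
    for i in range(1, len(contour_hz) - 1):
        prev, cur, nxt = contour_hz[i - 1], contour_hz[i], contour_hz[i + 1]
        if cur > prev and cur > nxt:        # local peak: make it higher
            out[i] = cur + shift_hz
        elif cur < prev and cur < nxt:      # local valley: make it lower
            out[i] = max(cur - shift_hz, 0.0)  # keep F0 non-negative
    return out


contour = [180.0, 220.0, 190.0, 150.0, 170.0]
exaggerated = exaggerate_f0(contour, 20.0)
```

Here the peak at 220 Hz is raised and the valley at 150 Hz is lowered, while the other frames keep their values. The same pattern could be applied to dB values for the intensity contour.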

9
Algorithm: prosody exaggeration
F0
10
Algorithm: prosody exaggeration
Intensity
11
Algorithm: prosody exaggeration
Durations
12
How the Praat script works
13
How the Praat script works
Panel labels: F0, Intensity, Durations
14
How the Praat script works
Panel labels: Original, F0, Durations, F0, Durations, Intensity
15
Evaluating learner prosody
  • Assumes the existence of the native version
  • Evaluates the learner versions
  • Evaluation of the F0 and intensity contours
  • Is preceded by duration manipulation
  • The durations of the matching segments of the two
    utterances are made identical [3]
  • Is preceded by F0/intensity normalization and F0
    smoothing
  • The mean difference is added to or subtracted from
    the learner utterance
  • Is followed by pitch/intensity point-to-point
    comparison
  • Evaluation of segmental durations
  • Done without any duration manipulation;
    segment-to-segment comparison
  • Evaluation measure: Euclidean distance metric
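
The normalization step can be illustrated with a short Python sketch. It assumes that both contours are lists of equal length (that is, duration manipulation has already been applied) and that "mean difference" means the difference between the two contour means; the function name `normalize_to_native` is hypothetical and smoothing is not shown.

```python
def normalize_to_native(native, learner):
    """Shift the learner contour so that its mean equals the native mean.

    Assumption: the mean difference (native mean minus learner mean) is
    added to every learner point; a negative difference subtracts.
    """
    offset = sum(native) / len(native) - sum(learner) / len(learner)
    return [value + offset for value in learner]


native_f0 = [200.0, 220.0, 180.0]    # Hz, illustrative values only
learner_f0 = [150.0, 160.0, 140.0]   # Hz, illustrative values only
normalized = normalize_to_native(native_f0, learner_f0)

# Point-to-point differences that remain after normalization
differences = [n - l for n, l in zip(native_f0, normalized)]
```

After the shift, only differences in contour shape remain, which is what the point-to-point comparison is meant to capture.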

16
Algorithm: prosody evaluation
Before and after duration manipulation
native
learner before
learner after
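
As a rough Python sketch of the duration manipulation, one could compute a scale factor for each matching segment so that the learner's durations become identical to the native ones. The segment alignment is assumed to be given, the values are invented for illustration, and `duration_scale_factors` is a hypothetical name; how the factors are applied to the audio (for example with PSOLA-style resynthesis, cf. [2]) is outside this sketch.

```python
def duration_scale_factors(native_durs, learner_durs):
    """Return per-segment factors that map learner durations onto native ones.

    Assumption: both lists hold the durations (in seconds) of the same
    segments in the same order.
    """
    if len(native_durs) != len(learner_durs):
        raise ValueError("segment counts must match")
    return [n / l for n, l in zip(native_durs, learner_durs)]


native = [0.10, 0.20, 0.15]    # seconds, illustrative values only
learner = [0.20, 0.10, 0.15]   # seconds, illustrative values only
factors = duration_scale_factors(native, learner)

# Scaling each learner segment by its factor reproduces the native durations
learner_after = [l * f for l, f in zip(learner, factors)]
```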
17
Algorithm: prosody evaluation
F0 point-to-point comparison between native and
learner
native
learner after
18
Algorithm: prosody evaluation
Intensity point-to-point comparison between native
and learner
native
learner after
19
Algorithm: prosody evaluation
Duration segment-to-segment comparison between
native and learner
native
learner before
Evaluation measure: the Euclidean distance between
$P = (p_1, p_2, p_3, \ldots, p_n)$ and $Q = (q_1, q_2, q_3, \ldots, q_n)$
in Euclidean n-space, $d(P, Q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$
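
A minimal Python sketch of the Euclidean distance metric, assuming P and Q are equal-length lists of native and learner values (F0 points, intensity points, or segment durations). The function name `euclidean_distance` and the sample values are illustrative only.

```python
import math


def euclidean_distance(p, q):
    """Euclidean distance between two points given as equal-length lists."""
    if len(p) != len(q):
        raise ValueError("P and Q must have the same number of values")
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


native_points = [200.0, 220.0, 180.0]    # illustrative values only
learner_points = [203.0, 216.0, 180.0]   # illustrative values only
distance = euclidean_distance(native_points, learner_points)
```

A distance of zero means the two contours coincide at every compared point; larger values mean the learner version lies further from the native one.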
20
A pilot experiment
native
learner after
Euclidean distance should be minimum
21
A pilot experiment
native
learner after
  • F0: -100 Hz to 100 Hz with a 10 Hz interval → 21
    stimuli
  • Intensity: -25 dB to 25 dB with a 5 dB interval →
    11 stimuli
  • Duration: 0.25, 0.50, 0.75, 1.00, 1.50, 2.00, 2.50,
    3.00 times the original → 8 stimuli
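
The stimulus counts follow directly from the ranges and intervals given. A small Python sketch that enumerates the manipulation values (the variable names are illustrative, and the synthesis of the stimuli themselves is not shown):

```python
# F0 shifts: -100 Hz to 100 Hz in 10 Hz steps (both ends included)
f0_shifts_hz = list(range(-100, 101, 10))

# Intensity shifts: -25 dB to 25 dB in 5 dB steps (both ends included)
intensity_shifts_db = list(range(-25, 26, 5))

# Duration factors, as multiples of the original duration
duration_factors = [0.25, 0.50, 0.75, 1.00, 1.50, 2.00, 2.50, 3.00]

stimulus_counts = (
    len(f0_shifts_hz),
    len(intensity_shifts_db),
    len(duration_factors),
)
```

Each series includes the unmodified case (a shift of 0 Hz, 0 dB, or a factor of 1.00), for which the distance to the native version would be expected to be smallest.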
22
Results and conclusion
23
Results and conclusion
24
Results and conclusion
25
Results and conclusion
  • Prosody exaggeration
  • Can be a tool for teaching language prosody
  • Can be used to test measures for evaluating
    prosody
  • Limitation of the current prosody evaluation
  • Native utterances should exist to yield measures
  • TTS systems with advanced prosody models could be
    helpful
  • Weights of the three separate measures
    (F0/intensity/duration) need to be determined
  • Experiments with human evaluators could provide
    the weights
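
One possible way to combine the three measures once weights are available is a weighted sum, sketched below in Python. This is an assumption for illustration only: the presentation does not specify how the measures would be combined, the equal default weights are placeholders rather than results, and the three distances are in different units (Hz, dB, seconds), so some rescaling would likely be needed in practice.

```python
def combined_score(d_f0, d_intensity, d_duration,
                   weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the three distance measures (hypothetical scheme).

    The default equal weights are placeholders; real weights would have
    to come from experiments with human evaluators.
    """
    w_f0, w_intensity, w_duration = weights
    return w_f0 * d_f0 + w_intensity * d_intensity + w_duration * d_duration


# Illustrative distances and weights only
score = combined_score(3.0, 6.0, 9.0, weights=(0.5, 0.25, 0.25))
```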

26
References
[1] Boersma, Paul. 2001. Praat, a system for doing phonetics by computer. Glot International 5(9/10). pp.341-345.
[2] Moulines, E. & F. Charpentier. 1990. Pitch synchronous waveform processing techniques for text-to-speech synthesis using diphones. Speech Communication 9. pp.453-467.
[3] Yoon, K. 2007. Imposing native speakers' prosody on non-native speakers' utterances: The technique of cloning prosody. Journal of the Modern British & American Language & Literature 25(4). pp.197-215.