Synthesis - PowerPoint PPT Presentation

1
Synthesis & evaluation of prosodically
exaggerated utterances: A preliminary study

Kyuchul Yoon
Division of English
Kyungnam University
Spring 2008 Joint Conference of KSPS & KASS

2
Contents

Synthesis & evaluation of human utterances with
exaggerated prosody
Synthesis of exaggerated prosody
Useful for native utterances
The definition of prosody exaggeration
The algorithm
Evaluation of exaggerated prosody
Useful for evaluating learner utterances
The algorithm & an experiment

3
Teaching & evaluating prosody

Teaching language prosody
The need for exaggeration of native utterances
How to define exaggeration
Evaluating language prosody
Given the native version of an utterance,
evaluate learners' utterances w/ atypical prosody
How to measure the differences btw/ the native
and learner utterances

4
Exaggerating native prosody

Exaggeration of the F0 contour
One way would be to make the pitch peaks/valleys
higher/lower
Exaggeration of the intensity contour
One way would be to manipulate the intensity
contour of the pitch peaks/valleys
Exaggeration of the segmental durations
One way would be to manipulate the segmental
durations of the pitch peaks/valleys

5
Exaggerating native prosody
F0
The fundamental frequency (F0) contour of the
utterance "Marianna!"
6
Exaggerating native prosody
Intensity
The intensity contour of the utterance "Marianna!"
7
Exaggerating native prosody
Duration
The segmental durations of the utterance "Marianna!"
before and after the exaggeration.
8
Algorithm: prosody exaggeration

Definition of prosody exaggeration
F0 contour
Make pitch peaks/valleys higher/lower in Hz
values
Intensity contour
Make pitch peaks higher in dB values
Segmental durations
Make pitch peaks longer in "times" values (multiples of the original duration)
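
The F0 part of the definition above can be illustrated with a minimal Python sketch. It is not the Praat script used in the presentation: it assumes the contour is already available as a plain list of Hz values, treats a "peak" or "valley" as a simple local maximum or minimum, and uses a hypothetical function name (exaggerate_f0) and a hypothetical shift of 20 Hz.

```python
import numpy as np

def exaggerate_f0(f0_hz, shift_hz):
    """Raise local peaks and lower local valleys of an F0 contour by shift_hz.

    Assumption: a peak/valley is a sample strictly higher/lower than both
    of its neighbours. Other samples are left unchanged.
    """
    f0 = np.asarray(f0_hz, dtype=float)
    out = f0.copy()
    for i in range(1, len(f0) - 1):
        if f0[i] > f0[i - 1] and f0[i] > f0[i + 1]:
            out[i] = f0[i] + shift_hz   # local peak: make it higher
        elif f0[i] < f0[i - 1] and f0[i] < f0[i + 1]:
            out[i] = f0[i] - shift_hz   # local valley: make it lower
    return out

# Toy contour in Hz (illustrative values, not data from the study)
contour = [180.0, 220.0, 190.0, 150.0, 170.0]
exaggerated = exaggerate_f0(contour, 20.0)
print(exaggerated)
```

The same pattern could in principle be applied to intensity (raising peak values in dB) and to durations (multiplying the length of a peak segment by a factor), but the actual resynthesis would still have to be done in a tool such as Praat.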

9
Algorithm: prosody exaggeration
F0
10
Algorithm: prosody exaggeration
Intensity
11
Algorithm: prosody exaggeration
Durations
12
How the Praat script works
13
How the Praat script works
[Figure panels: F0, Intensity, Durations]
14
How the Praat script works
[Figure panels: Original, F0, Durations, F0, Durations, Intensity]
15
Evaluating learner prosody

Assumes the existence of the native version
Evaluates the learner versions
Evaluation of the F0 & intensity contours
Is preceded by duration manipulation
The durations of the matching segments of the two
utterances are made identical [3]
Is preceded by F0/intensity normalization & F0
smoothing
The mean difference is added/subtracted to/from
the learner utterance
Is followed by pitch/intensity point-to-point
comparison
Evaluation of segmental durations
Done without any duration manipulation.
Segment-to-segment comparison
Evaluation measure: Euclidean distance metric
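
The evaluation steps listed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the native and learner contours are assumed to be already time-aligned and sampled at the same points (that is, the duration manipulation has been done), F0 smoothing is omitted, and the function names and numeric values are hypothetical rather than taken from the presentation.

```python
import numpy as np

def normalize_to_native(native, learner):
    """Shift the learner contour by the mean difference between the contours,
    so that both end up with the same mean (the normalization step)."""
    native = np.asarray(native, dtype=float)
    learner = np.asarray(learner, dtype=float)
    return learner + (native.mean() - learner.mean())

def euclidean_distance(p, q):
    """Point-to-point (or segment-to-segment) Euclidean distance."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("both sequences must have the same length")
    return float(np.sqrt(np.sum((p - q) ** 2)))

# F0 comparison: normalize first, then compare point by point
native_f0 = [200.0, 240.0, 180.0]    # Hz, illustrative values
learner_f0 = [150.0, 170.0, 160.0]
shifted_f0 = normalize_to_native(native_f0, learner_f0)
f0_score = euclidean_distance(native_f0, shifted_f0)

# Duration comparison: segment by segment, without any manipulation
native_dur = [0.10, 0.25, 0.15]      # seconds, illustrative values
learner_dur = [0.12, 0.20, 0.15]
duration_score = euclidean_distance(native_dur, learner_dur)

print(f0_score, duration_score)
```

A smaller score means the learner contour is closer to the native one. The intensity contour would be handled like F0, with dB values in place of Hz.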

16
Algorithm: prosody evaluation
Before & after duration manipulation
[Figure panels: native, learner before, learner after]
17
Algorithm: prosody evaluation
F0 point-to-point comparison btw/ native and
learner
[Figure panels: native, learner after]
18
Algorithm: prosody evaluation
Intensity point-to-point comparison btw/ native
and learner
[Figure panels: native, learner after]
19
Algorithm: prosody evaluation
Duration segment-to-segment comparison btw/
native and learner
[Figure panels: native, learner before]
Euclidean distance metric as the evaluation measure
P = (p1, p2, p3, ..., pn) and Q = (q1, q2, q3, ..., qn)
in Euclidean n-space
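
For reference, the standard Euclidean distance between two such points P and Q is given below. The formula itself did not survive in this transcript, so this is the textbook definition rather than a copy of the slide:

```latex
d(P, Q) = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + \cdots + (p_n - q_n)^2}
        = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}
```

Here each p_i would be a native value (F0 in Hz, intensity in dB, or a segment duration) and q_i the corresponding learner value.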
20
A pilot experiment
[Figure panels: native, learner after]
Euclidean distance should be minimum
21
A pilot experiment
[Figure panels: native, learner after]
F0: -100Hz to 100Hz with a 10Hz interval → 21 stimuli
Intensity: -25dB to 25dB with a 5dB interval → 11 stimuli
Duration: 0.25, 0.50, 0.75, 1.00, 1.50, 2.00, 2.50, 3.00 times the original → 8 stimuli
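
The stimulus counts above follow directly from the ranges and step sizes. A short Python check (variable names are hypothetical) reproduces the three grids:

```python
# F0 shifts: -100 Hz to 100 Hz in 10 Hz steps (both ends included)
f0_shifts_hz = list(range(-100, 101, 10))

# Intensity shifts: -25 dB to 25 dB in 5 dB steps (both ends included)
intensity_shifts_db = list(range(-25, 26, 5))

# Duration factors, as multiples of the original duration
duration_factors = [0.25, 0.50, 0.75, 1.00, 1.50, 2.00, 2.50, 3.00]

print(len(f0_shifts_hz), len(intensity_shifts_db), len(duration_factors))
```

Each grid contains the unmodified condition (a shift of 0 Hz, a shift of 0 dB, a factor of 1.00).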
22
Results & Conclusion
23
Results & Conclusion
24
Results & Conclusion
25
Results & Conclusion

Prosody exaggeration
Can be a tool for teaching language prosody
Can be used to test measures for evaluating
prosody
Limitation of the current prosody evaluation
Native utterances should exist to yield measures
TTS systems with advanced prosody models could be
helpful
Weights of the three separate measures
(F0/intensity/duration) need to be determined
Experiments with human evaluators could provide
the weights

26
References
[1] Boersma, Paul. 2001. Praat, a system for doing phonetics by computer. Glot International 5(9/10). pp.341-345.
[2] Moulines, E. & F. Charpentier. 1990. Pitch synchronous waveform processing techniques for text-to-speech synthesis using diphones. Speech Communication 9. pp.453-467.
[3] Yoon, K. 2007. Imposing native speakers' prosody on non-native speakers' utterances: The technique of cloning prosody. Journal of the Modern British & American Language & Literature 25(4). pp.197-215.
