Fundamental Frequency Contour Synthesis for Turkish Text to Speech

1
Fundamental Frequency Contour Synthesis for
Turkish Text to Speech
  • Erkan Abdullahbese

2
Content
  • TTS systems and prosody
  • Turkish Intonation, Stress
  • Observations on Collected Data
  • Methodology
  • Improvements on Methodology
  • Discussion
  • Conclusion

3
Introduction to Text to Speech (TTS) Systems
  • Text -> speech signal
  • Widespread applications
  • Message to speech generation
  • Man-machine dialogue
  • Multimedia applications
  • Talking aids for handicapped

CHALLENGE: Machine accent -> natural speech
SOLUTION: Prosody generation modules
4
What is Prosody?
  • Properties of speech that cannot be derived from
    the phoneme sequence
  • Modulation of voice pitch
  • Rhythm, changes in durations
  • Fluctuations of loudness
  • Related to domains larger than one phoneme
  • (supra-segmental properties)

5
Basic Acoustic Parameters
  • Fundamental Frequency F0 (pitch)
  • Duration
  • Intensity

Prosodic Phenomena
  • Modulate the basic acoustic parameters
  • Modulation of fundamental frequency
  • Intonation
  • Stress (accent)

6
Intonation
  • Ensemble of pitch variations
  • Perceived as speech melody

Stress
  • Modulate all the basic acoustic parameters
  • Increase in F0 and intensity (loudness)
  • Lengthening in duration
  • Three types
  • Word stress
  • Phrase stress
  • Sentence stress
  • Stress on a single syllable
  • Phrase and sentence stress coincide with word
    stress

7
Prosody Generation Modules in TTS
  • Prosodic description
  • Prosodic phrasing -> phrase boundaries
  • Accent labeling -> accents on syllables
  • Prosodic labels -> F0 contour

PROBLEMS
  • Complex linguistic processing units (morphology,
    syntax, semantics)
  • Speaker-dependence
  • Articulation-related problems: microprosody vs.
    macroprosody

8
Basic Intonation Models
  • Tone Sequence Models: pitch contour as a
    sequence of fluctuations generated by local
    accents
  • Pierrehumbert: a sequence of independent H and L
    tones (orthography)
  • Pitch accent -> pitch movements on stressed
    syllables
  • Boundary tone -> at phrase boundaries
  • Phrase accent -> between stressed syllable and
    phrase boundary
  • Superposition Models: pitch contour as the
    superposition of several components with
    different domains: syllables, words, phrases,
    sentences, paragraphs, whole text
  • Fujisaki: purely mathematical model -> parametric
  • A basic F0
  • A phrase component (response of a critically
    damped second-order system to an impulse)
  • An accent component (response of a critically
    damped second-order system to a rectangular input)
  • Optimization of parameter values w.r.t. F0
    (analysis by synthesis)
  • Möbius -> Fujisaki + linguistics -> German

9
Approaches
  • Perform an analysis on a speech corpus
  • Transcribe the corpus
  • Define F0 labels (rise, fall, peak etc.) and
    boundary labels (minor, major etc.)
  • Labeling
  • By hand
  • Examination -> rules -> automatic
  • Automatic learning of labels -> F0 values (or
    parametrized)
  • Neural Networks
  • Stochastic methods
  • Intonation pattern dictionary (from natural
    speech)
  • Store pitch values in ST and key information
    (labels) for each pattern
  • For the patterns in input sentence -> compare key
    info -> find closest pattern from dictionary ->
    apply pitch

10
Approaches
  • For integration into TTS (labeling input sentence
    from text)
  • Complex linguistic processing units
  • Morphology
  • Syntax
  • Semantics
  • Stochastic methods
  • Syntax -> most probable label sequence

11
Sentence Intonation Types
  • Terminal intonation
  • pitch decreases at the end -> message completed
  • Interrogative intonation
  • pitch slightly increases on the last syllable ->
    waiting for response
  • Progressive intonation
  • pitch either increases slightly or does not show
    any lowering at the end -> message not completed
    yet

12
Turkish Intonation
  • Classification of sentences
  • Type
  • Declaratives(?)
  • wh-questions(?)
  • yes-no questions(?)
  • Structure
  • Simple
  • Compound (?) at the end of subordinate
  • Mesgul oldugundan(?) bizimle sinemaya gelemedi(?).

13
Turkish Intonation
  • Tone groups (phrase or segment)
  • Division into tone groups
  • / Oraya varinca beni arayin. /
  • / Oraya varinca / beni arayin. /
  • Focus (new information) in each tone group
  • / Oraya varinca beni arayin. /
  • / Oraya varinca beni arayin. /
  • / Oraya varinca beni arayin. /
  • / Oraya varinca beni arayin. /
  • Pitch variations on focus

14
Turkish Intonation
  • Four levels of pitch low(1), mid(2), high(3),
    extra high(4)
  • gi2di3yoru1m
  • sa2hi4 mi1
  • Speech melody <-> musical melody (Nash)
  • Hierarchy of intonation units (phrase -> text)
  • Each intonation unit -> melody
  • Successive intonation units related by motifs ->
    melody of the upper level
  • Music: reiteration of motifs -> musical melody

15
Turkish Stress
Word Stress
  • Fixed (bound) stress vs. free stress (Turkish)
  • Stress on a single syllable of a word in Turkish
  • Effect of suffixes on stress
  • Stress on final syllable of root + stressable
    suffix
  • yolcu + -lar -> yolcular
  • Stress on final syllable of root, unstressable
    suffix involved
  • oku + -yor -> okuyor + -lar ->
    okuyorlar
  • Stress on non-final syllable of root
  • karinca + -lar -> karincalar
  • May disappear in sentence

16
Turkish Stress
Sentence Stress
  • Signals the prominence of the most
    information-bearing element in a sentence
  • Types
  • Unmarked (preverbal position)
  • Yarin Istanbula gidiyorlar.
  • marked (any position)
  • Yarin Istanbula gidiyorlar.
  • Focusing elements
  • Precede focus: sadece, daha
  • Mehmet daha bugün ödevine baslayabildi.
  • Follow focus: -mi, da, bile
  • Ayla mi bugün Ankaradan dönüyor?

17
Turkish Stress
Phrase Stress
  • Phrase: modifier or complement + head
  • Phrase stress on modifier in Turkish
  • Types
  • Phrases used as nouns
  • telefon ahizesi
  • güzel çiçekler
  • Phrases used as verbs
  • hizli kos
  • severek yasa
  • Others
  • senin için
  • yarindan sonra
  • Preserved in the sentence

18
Motivation
  • Nevin bugün menemen yemeli. (template)
  • N Z F V
  • Nevin menemen yemeli.
  • N F V
  • Bizim Nevin domatesli menemen yemeli.
  • P N A F V
  • Nalan yarin ayna aliyor.
  • N Z F V
  • Nalan ayna aliyor.
  • N F V
  • Kardesim Nalan yeni ayna aliyor.
  • N N A F V

19
Nevin bugün menemen yemeli.
Nevin menemen yemeli.
20
Nevin bugün menemen yemeli.
Bizim Nevin domatesli menemen yemeli.
21
Nevin bugün menemen yemeli.
Nalan yarin ayna aliyor.
22
Nevin bugün menemen yemeli.
Nalan ayna aliyor.
23
Nevin bugün menemen yemeli.
Kardesim Nalan yeni ayna aliyor.
24
Sentences
  • 100 database sentences
  • 19 close test sentences (add/remove categories)
  • 18 random test sentences
  • Syllable-based handlabeling
  • Pitch extraction

25
Observations
Declaratives
  • Pitch decrease at the end (terminal intonation)
  • Division into phrases
  • Pitch increase on the phrase-final syllable
    (progressive intonation)

Nevin/bugün/menemen yemeli.
26
Observations
Declaratives
  • Pitch decrease at the end (terminal intonation)
  • Division into phrases
  • Pitch increase on the phrase-final syllable
    (progressive intonation)

Evvelki gün/ikimiz de/kuyumcu Aliye ugradik.
27
Observations
Wh-questions
  • Pitch increase on the last syllable
    (interrogative intonation)
  • Evident pitch increase on the stressed syllable
    of the wh-word
  • No division into phrases
  • Word stress often disappears

Dün neden zamanimi aldin?
28
Observations
Wh-questions
  • Pitch increase on the last syllable
    (interrogative intonation)
  • Evident pitch increase on the stressed syllable
    of the wh-word
  • No division into phrases
  • Word stress often disappears

Kimler yarin sinif gezisine katilacaklar?
29
Observations
Yes-no questions
  • Pitch decrease at the end
  • Evident pitch increase on the stressed syllable
    of the word before -mi
  • No division into phrases
  • Word stress often disappears

Oralari yine eskisi gibi güzel mi?
30
Observations
Yes-no questions
  • Pitch decrease at the end
  • Evident pitch increase on the stressed syllable
    of the word before -mi
  • No division into phrases
  • Word stress often disappears

Mudanyada bu sene de çok yagmur yagiyor mu?
31
Observations
Conditionals
  • Pitch decrease at the end (terminal intonation)
  • Division into phrases
  • Pitch increase on the phrase-final syllable
    (progressive intonation)
  • -se always a phrase-final syllable

Insan azimliyse herseyi basarabilir.
32
Observations
Conditionals
  • Pitch decrease at the end (terminal intonation)
  • Division into phrases
  • Pitch increase on the phrase-final syllable
    (progressive intonation)
  • -se always a phrase-final syllable

Babam keyifsizse ona konuyu bu aksam anlatamam.
33
Observations
Imperatives
  • Pitch decrease at the end (terminal intonation)
  • Division into phrases
  • Pitch increase on the phrase-final syllable
    (progressive intonation)

Aksam yemegi için çarsidan birseyler alsinlar.
34
Observations
Imperatives
  • Pitch decrease at the end (terminal intonation)
  • Division into phrases
  • Pitch increase on the phrase-final syllable
    (progressive intonation)

Sevgiyi ve mutlulugu yarinlara erteleme.
35
Observations
Exclamations
  • Diverse
  • Pitch decrease at the end (terminal intonation)
  • Evident pitch increase on the stressed syllable
    of interjection or of another word

Aman büyüklerine bir saygisizlik yapma!
36
Observations
Exclamations
  • Diverse
  • Pitch decrease at the end (terminal intonation)
  • Evident pitch increase on the stressed syllable
    of interjection or of another word

Haydi bugün hep birlikte piknige gidelim!
37
Local Observations
  • At most single stressed syllable excluding
    phrase-final increase
  • Stress within the sentence coincides with the
    word stress
  • Phrase stress preserved

Ekonomik kriz / her kesimden insani / olumsuz
etkiledi.
38
Local Observations
  • At most single stressed syllable excluding
    phrase-final increase
  • Stress within the sentence coincides with the
    word stress
  • Phrase stress preserved

Evvelki gün / ikimiz de / kuyumcu Aliye ugradik.
39
Local Observations
  • Word stress may disappear

Beden sagligimiz için aksamlari erken yatmaliyiz.
Mehmet daha bugün ödevine baslayabildi.
40
Local Observations
  • Word stress disappears at the end of positives
    (terminal intonation)

Nevin bugün menemen yemeli.
Merve evine zamaninda dönemez.
41
Local Observations
  • Sentence stress (stress on focus)

Nevin bugün menemen yemeli.
Mehmet daha bugün ödevine baslayabildi.
42
Local Observations
  • Effects on neighbour syllables
  • Unstressed stressed (nevin)
  • Stressed stressed
  • nevinbugün

Nevin bugün menemen yemeli.
43
Local Observations
  • Effects on neighbour syllables
  • Stressed stressed (Partiyegelmeyecegim)

Ben aksam partiye gelmeyecegim.
44
Local Observations
  • Effects on neighbour syllables
  • Stressed unstressed (Gecerüyasinda)

Kardesim beni dün gece rüyasinda görmüs.
45
Local Observations
  • Effects on neighbour syllables
  • Stressed unstressed (neyle)

Bu geç vakitte sizin eve neyle dönecegiz?
46
Local Observations
  • Effects on neighbour syllables
  • Stressed unstressed (last syllable, terminal
    intonation) (degildi)

Aksamki yemek pek güzel degildi.
47
Local Observations
  • Effects on neighbour syllables
  • Stressed unstressed (last syllable, terminal
    intonation)
  • (güzelmi)

Oralari yine eskisi gibi güzel mi?
48
Methodology
Overview
  • Choose best sentence from a sentence database
  • Apply its pitch to the matching regions of input
    sentence
  • Compression / Stretching
  • Interpolation
  • Fit data to remaining regions using interpolation

49
Methodology
Read Files
  • Input information used for sentences
  • Sentence type (declarative, wh-question, yes-no
    question, conditional, imperative, exclamation)
  • Sentence state (positive or negative)
  • Categories of each word
  • Number of syllables of each word
  • The index of the syllable bearing word stress,
    for each word (stress in sentence coincides with
    word stress)

50
Methodology
Read Files
  • Word categories rely mainly on part-of-speech
    (POS) categories

51
Methodology
Choose Best Sentence
  • Search in database to find the best sentence
  • Search the template sentences with the same
  • Type
  • State
  • as the input sentence
  • Two different approaches for
  • Sentences other than questions
  • Question sentences

52
Sentences other than Questions
  • Calculate sentence resemblance scores based on
    word resemblance scores (WRS)
  • Choose the template sentence having the maximum
    sentence resemblance score

Word Resemblance Score (WRS)
  • Measure of resemblance of two words
  • Consists of
  • Regional resemblance score (RRS) -> word stress
    information
  • Category match score (CMS) -> word categories
  • WRS = RRS + CMS

53
Regional Resemblance Score (RRS)
  • Makes use of the four regions defined for every
    word
  • Region before the stressed syllable
  • Stressed syllable
  • Region after the stressed syllable
  • Phrase-final syllable
  • Measure of resemblance of any two words in terms
    of these regions
  • Based on number of syllables in each region
  • Consists of
  • Score of existing regions
  • Score of lacking regions
  • RRS = 0.9 x ERS + 0.1 x LRS

54
Calculation of ERS and LRS
score = ERS = LRS = 0 (initialization)
for all regions
    if the region exists in both words
        score = min( 1 , (NSRW1 / NSRW2) )
        ERS = ERS + score
    else
        if region lacks in both words
            LRS = LRS + 1
        else
            LRS = LRS - 1
        endif
    endif
endfor
ERS: score of existing regions
LRS: score of lacking regions
NSRW1: number of syllables in related region for first word
NSRW2: number of syllables in related region for second word
55
Category Match Score (CMS)
  • Category match -> CMS
  • CMS = 3.7 (maximum possible value of RRS)

Example: calculation of WRS for the words Istanbul
and Ankara
ERS = 1/1 + 1/2 = 3/2
LRS = -1 + 1 = 0
RRS = 0.9 x 3/2 + 0.1 x 0 = 1.35
CMS = 3.7
WRS = 1.35 + 3.7 = 5.05
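The Istanbul/Ankara calculation can be reproduced with a short sketch. The region names, the dictionary layout, and the syllable counts assigned to the two words are assumptions chosen so that the figures above come out (stress taken on the second syllable of Istanbul and the first of Ankara, no phrase-final syllable in either). The slide gives only the CMS value for a category match, so the value 0 for a mismatch is also an assumption.

```python
# Region names are illustrative labels for the four regions of a word.
REGIONS = ("before_stress", "stressed", "after_stress", "phrase_final")
CMS_MATCH = 3.7  # category match score stated on the slide


def region_scores(word1, word2):
    """Return (ERS, LRS) for two words given as {region: syllable count}."""
    ers = 0.0
    lrs = 0
    for region in REGIONS:
        n1 = word1.get(region, 0)
        n2 = word2.get(region, 0)
        if n1 > 0 and n2 > 0:      # region exists in both words
            ers += min(1.0, n1 / n2)
        elif n1 == 0 and n2 == 0:  # region lacks in both words
            lrs += 1
        else:                      # region exists in only one word
            lrs -= 1
    return ers, lrs


def rrs(word1, word2):
    ers, lrs = region_scores(word1, word2)
    return 0.9 * ers + 0.1 * lrs


def wrs(word1, word2, same_category):
    # Assumption: CMS is 0 when the word categories differ.
    return rrs(word1, word2) + (CMS_MATCH if same_category else 0.0)


# Assumed region layout for the two example words.
istanbul = {"before_stress": 1, "stressed": 1, "after_stress": 1}
ankara = {"stressed": 1, "after_stress": 2}

print(region_scores(istanbul, ankara))        # ERS and LRS
print(round(wrs(istanbul, ankara, True), 2))  # WRS
```

With these assumed inputs the sketch yields ERS = 1.5, LRS = 0 and WRS = 5.05, matching the example.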
56
Sentence Resemblance Score
  • I1, I2, ..., IN: words of the input sentence
  • D1, D2, ..., DM: words of the template sentence
  • MxN score matrix S with elements Si,j, where
    Si,j = WRS of the pair (Di, Ij)
  • Path: (Da, Ib), (Dc, Id), ..., (De, If)
  • with 1 ≤ a < c < ... < e ≤ M and
    1 ≤ b < d < ... < f ≤ N
  • Score of the path = sum of WRSs of its pairs
  • TASK: find the path with the maximum score
    (maximum score path)
  • Score of maximum score path = sentence
    resemblance score
  • Optimum combination of word pairings preserving
    order

57
EXAMPLE
TEMPLATE: Geçen aksam hepimiz müzigin büyüsüne kapilmistik.
INPUT: Büyük dayimiz Kadiköydeki evinde senelerdir yalniz oturuyor.
(aksam, Büyük), (müzigin, dayimiz), (kapilmistik, evinde): valid
(hepimiz, dayimiz), (geçen, evinde), (büyüsüne, yalniz): invalid
(aksam, evinde), (müzigin, dayimiz), (kapilmistik, oturuyor): invalid
(geçen, dayimiz), (hepimiz, dayimiz), (kapilmistik, oturuyor): invalid
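The validity condition (word indices strictly increasing in both sentences) can be checked mechanically. The sketch below is illustrative only: the function names, the 1-based indexing and the lowercase word lookup are choices made here, not part of the presentation.

```python
template = "Geçen aksam hepimiz müzigin büyüsüne kapilmistik".split()
inp = "Büyük dayimiz Kadiköydeki evinde senelerdir yalniz oturuyor".split()


def to_indices(word_pairs):
    """Map (template word, input word) pairs to 1-based index pairs."""
    t_words = [w.lower() for w in template]
    i_words = [w.lower() for w in inp]
    return [(t_words.index(t.lower()) + 1, i_words.index(i.lower()) + 1)
            for t, i in word_pairs]


def is_valid_path(path):
    """True if both indices strictly increase along the path."""
    return all(a < c and b < d
               for (a, b), (c, d) in zip(path, path[1:]))


paths = [
    [("aksam", "Büyük"), ("müzigin", "dayimiz"), ("kapilmistik", "evinde")],
    [("hepimiz", "dayimiz"), ("geçen", "evinde"), ("büyüsüne", "yalniz")],
    [("aksam", "evinde"), ("müzigin", "dayimiz"), ("kapilmistik", "oturuyor")],
    [("geçen", "dayimiz"), ("hepimiz", "dayimiz"), ("kapilmistik", "oturuyor")],
]
results = [is_valid_path(to_indices(p)) for p in paths]
print(results)
```

Only the first path keeps the word order of both sentences; the second breaks the template order, the third breaks the input order, and the fourth uses the input word dayimiz twice.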
58
Procedure
  • MxN MPS: maximum path scores matrix
  • MxNx2 CMPS: maximum path scores coordinates
    matrix
  • MPSi,j contains the score of the maximum score
    path beginning with the pair (Di, Ij)
  • CMPSi,j,k contains the indices of the next pair
    in the same path (for example, if the max score
    path of (Di, Ij) is (Di, Ij), (Dm, In), ..., (Dp,
    Iq), then CMPSi,j,1 = m and CMPSi,j,2 = n)
  • Recursive generation of MPS from itself and S
  • CMPS generated from MPS

59
Procedure
for i = M, M-1, ..., 1
    for j = N, N-1, ..., 1
        if (i = M) or (j = N)
            MPSi,j = Si,j
            CMPSi,j,1 = CMPSi,j,2 = EMPTY
        else
            MPSi,j = Si,j + value of the max element of MPSp,q,
                     i+1 ≤ p ≤ M and j+1 ≤ q ≤ N
            CMPSi,j,1 = first index of max element of MPSp,q,
                        i+1 ≤ p ≤ M and j+1 ≤ q ≤ N
            CMPSi,j,2 = second index of max element of MPSp,q,
                        i+1 ≤ p ≤ M and j+1 ≤ q ≤ N
        endif
    endfor
endfor
60
(No Transcript)
61
Finding the maximum score path from MPS and CMPS
  • Sentence resemblance score = maxi,j(MPSi,j) =
    MPSa,b, for example
  • MPSa,b -> max score path begins with (Da, Ib)
  • Look up CMPSa,b,1 and CMPSa,b,2 to obtain the
    second pair of the path
  • If, for example, CMPSa,b,1 = c and CMPSa,b,2 = d ->
    (Dc, Id) is the second pair
  • Similarly, look up CMPSc,d,1 and CMPSc,d,2 to
    obtain the third pair of the path, etc.
  • Entire path is obtained
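The MPS/CMPS procedure and the path recovery can be sketched as follows. This is a minimal illustration with 0-based indices and a small made-up score matrix; the function name and the example values are assumptions, not data from the presentation.

```python
def max_score_path(S):
    """Return (score, path) for an M x N matrix S of word resemblance scores.

    S[i][j] is the score of template word i paired with input word j
    (0-based). The path is a list of (i, j) pairs with both indices
    strictly increasing.
    """
    M, N = len(S), len(S[0])
    MPS = [[0] * N for _ in range(M)]      # best score of a path starting at (i, j)
    CMPS = [[None] * N for _ in range(M)]  # next pair on that path, or None

    for i in range(M - 1, -1, -1):
        for j in range(N - 1, -1, -1):
            best = None
            # Search the block below and to the right; it is empty on the
            # last row or column, which leaves CMPS[i][j] empty.
            for p in range(i + 1, M):
                for q in range(j + 1, N):
                    if best is None or MPS[p][q] > MPS[best[0]][best[1]]:
                        best = (p, q)
            MPS[i][j] = S[i][j] + (MPS[best[0]][best[1]] if best else 0)
            CMPS[i][j] = best

    # The sentence resemblance score is the largest entry of MPS.
    start = max(((i, j) for i in range(M) for j in range(N)),
                key=lambda ij: MPS[ij[0]][ij[1]])
    path, node = [], start
    while node is not None:
        path.append(node)
        node = CMPS[node[0]][node[1]]
    return MPS[start[0]][start[1]], path


# Made-up scores for a 3-word template and a 3-word input.
S = [[1, 5, 2],
     [4, 1, 3],
     [2, 6, 1]]
score, path = max_score_path(S)
print(score, path)
```

For this matrix the best order-preserving pairing is template word 1 with input word 0 and template word 2 with input word 1, for a score of 10.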

62
We obtained answers to the following questions
  • What is the maximum resemblance capacity of the
    template sentence to the input sentence?
  • Answer: sentence resemblance score (score of the
    max score path)
  • How to arrive at this maximum capacity, i.e. how to
    match the words and choose the pairs?
  • Answer: as in the max score path

63
Question Sentences
  • Pitch curve of a question <-> pitch curve of a
    word
  • Whole question regarded as a word
  • Use the same regions defined for words
  • Region before the stressed syllable
  • Stressed syllable (stressed syllable of the
    wh-word or question suffix word)
  • Region after the stressed syllable
  • Phrase-final syllable (exists for wh-questions)
  • Use the same procedure assigning RRS to words to
    assign sentence resemblance score to the questions

64
EXAMPLE
Sentences:
Ayse bugün evde hangi yemegi yapti?
Bu su sesi yukaridan mi geliyor?
Regions
65
Methodology
Generate Regional Durations
  • Region -> one or more syllables
  • Inputs (related to input and template sentences)
  • The label files
  • The number of syllables for each word
  • The index of the syllable bearing word stress,
    for each word
  • The information whether the last syllable shows a
    pitch rise or not, for each word (conditional,
    wh-question)
  • Assumes a perfect duration analysis for the input
    sentence (label file of input sentence)
  • Determines the duration of each region (onset
    and end) for each word in both sentences

66
Methodology
Apply Pitch
  • Inputs
  • Regional durations generated by the previous block
  • Pitch contour of the template sentence
  • The max score path pertaining to the input and
    template sentences
  • For all pairs of the path, the pitch of the
    template sentence is applied to the input
    sentence, for the regions existing in both
    elements of a pair
  • Usage of spline interpolation
  • Stretching / compression in time
  • Data fitting for nonexisting regions
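Stretching or compressing a template region in time amounts to resampling its pitch samples to the number of samples of the matching input region. The sketch below uses linear interpolation as a simple stand-in for the spline interpolation named on the slide; the function name and the sample values are hypothetical.

```python
def resample_region(pitch, target_len):
    """Stretch or compress a list of pitch samples to target_len samples.

    Linear interpolation is used here for simplicity; the presentation
    uses spline interpolation for this step.
    """
    if target_len <= 0 or not pitch:
        return []
    if len(pitch) == 1 or target_len == 1:
        return [pitch[0]] * target_len
    scale = (len(pitch) - 1) / (target_len - 1)
    out = []
    for k in range(target_len):
        pos = k * scale                    # position in the template region
        lo = int(pos)
        hi = min(lo + 1, len(pitch) - 1)
        frac = pos - lo
        out.append(pitch[lo] * (1 - frac) + pitch[hi] * frac)
    return out


stretched = resample_region([100, 200], 3)                   # 2 -> 3 samples
compressed = resample_region([100, 150, 200, 250, 300], 3)   # 5 -> 3 samples
print(stretched, compressed)
```

The first and last samples of the region are preserved, so the contour shape is kept while its duration follows the input sentence.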

67
Improvements
Discarding Unvoiced Regions
  • Problem
  • unvoiced regions of template sentence + spline
    -> distortions
  • Example
  • Input: Yildizlar dünyadan gündüz görülmez
  • Template: Zamanimi televizyonun karsisinda bos
    yere harcayamam
  • Path: (zamanimi, yildizlar), (karsisinda,
    dünyadan), (yere, gündüz), (harcayamam, görülmez)
  • Problematic pairs: (karsisinda, dünyadan) and
    (yere, gündüz)
  • unvoiced regions in karsisinda (/k/, /s/ and /s/)
    and yere
  • Solution: discard zero samples (unvoiced) and
    then apply
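Discarding the zero samples before interpolation can be sketched as below. It assumes the pitch track stores 0 for unvoiced frames, as the slide implies; the function name and the frame period are hypothetical.

```python
def voiced_points(pitch, frame_period=0.01):
    """Return (times, values) of the nonzero (voiced) pitch samples.

    frame_period is an assumed spacing between pitch frames, in seconds.
    """
    times, values = [], []
    for k, f0 in enumerate(pitch):
        if f0 > 0:  # zero marks an unvoiced frame
            times.append(k * frame_period)
            values.append(f0)
    return times, values


times, values = voiced_points([0, 120, 0, 0, 130], frame_period=1.0)
print(times, values)
```

Only the voiced points are then passed to the interpolation, so the curve is no longer pulled towards zero inside unvoiced stretches.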

68
Yildizlar dünyadan gündüz görülmez.
69
Improvements
  • Problem: poor performance of spline outside the
    borders of data points to be interpolated
  • Example
  • Input: Didem her aksam odasinda günlük gazeteleri
    okur
  • Template: Annem bize her zaman çok lezzetli
    yemekler pisirir
  • Problematic pairs: (annem, didem) and (pisirir,
    okur)
  • Solution: applying the value of the outermost
    data point to the whole region, if the region
    goes beyond this data point
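Holding the outermost value beyond the data points can be sketched as follows. Inside the data range the sketch interpolates linearly as a stand-in for the spline; the function name and the sample values are hypothetical.

```python
def interpolate_clamped(times, values, t):
    """Evaluate a pitch curve at time t without extrapolating.

    times must be increasing. Outside the range of data points the value
    of the outermost point is returned; inside, linear interpolation
    stands in for the spline used in the presentation.
    """
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    for k in range(1, len(times)):
        if t <= times[k]:
            t0, t1 = times[k - 1], times[k]
            w = (t - t0) / (t1 - t0)
            return values[k - 1] + w * (values[k] - values[k - 1])


times = [1.0, 3.0]
values = [100.0, 140.0]
samples = [interpolate_clamped(times, values, t) for t in (0.0, 2.0, 5.0)]
print(samples)
```

Times before the first data point and after the last one receive the boundary pitch values instead of an extrapolated curve.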

70
Didem her aksam odasinda günlük gazeteleri okur.
71
Improvements
  • Problem: spline sometimes yields unsatisfactory
    results within the data points
  • Example
  • Input: Çocuklar yazin günesin altinda fazla
    kalmamali.
  • Problematic region: /zin/ of yazin generated by
    spline

Çocuklar yazin günesin altinda fazla kalmamali.
72
Improvements
  • Solution: check spline; spline -> linear
    interpolation when necessary
  • Spline check: linear regression line, upper
    threshold and lower threshold lines for the pitch
    of template sentence
  • If spline exceeds the threshold lines: spline ->
    linear

Linear regression and the two threshold lines.
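A possible form of the spline check is sketched below. The slides do not state how the threshold lines are placed, so the sketch assumes they run parallel to the regression line at a fixed distance `margin`; that parameter, the function names and the sample values are all assumptions.

```python
def regression_line(times, values):
    """Least-squares line through the template pitch points: (slope, intercept)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def spline_is_acceptable(times, values, curve_times, curve_values, margin):
    """True if the interpolated curve stays between the two threshold lines.

    Assumption: thresholds are the regression line shifted by +/- margin.
    """
    slope, intercept = regression_line(times, values)
    return all(abs(v - (slope * t + intercept)) <= margin
               for t, v in zip(curve_times, curve_values))


t_pts, f_pts = [0.0, 1.0, 2.0], [100.0, 110.0, 120.0]
ok = spline_is_acceptable(t_pts, f_pts, [0.5, 1.5], [105.0, 118.0], margin=20.0)
bad = spline_is_acceptable(t_pts, f_pts, [0.5, 1.5], [105.0, 150.0], margin=20.0)
print(ok, bad)
```

When the check fails, the region would be regenerated with linear interpolation instead of the spline.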
73
Çocuklar yazin günesin altinda fazla kalmamali.
74
Discussion
Performance at sentence ends
  • good -> choosing from same type and state ->
    expected
  • microprosody degrades performance (unvoiced
    regions of input sentence unknown)

Kuzenim Nalan Oyaya yarin aliyor.
75
Discussion
Performance at sentence ends
  • good -> choosing from same type and state ->
    expected
  • microprosody degrades performance (unvoiced
    regions of input sentence unknown)

Marsta hayat var midir?
76
Discussion
Performance at sentence ends
  • erroneous endings (increase instead of decrease)
    due to template pitch

77
Discussion
Performance at sentence ends
  • erroneous endings (increase instead of decrease)
    due to template pitch

78
Discussion
Performance at movements (rises and falls)
  • limited since
  • the method is confined to
  • the capacity of the database (same type, state)
  • the capacity of the template sentence
  • prosodic boundaries (yazin) and accented
    syllables unknown

Çocuklar yazin günesin altinda fazla kalmamali.
79
Discussion
Performance at movements (rises and falls)
  • limited since
  • the slope of the rise or fall may differ in input
    and template sentences (bizim)

Bizim Nevin domatesli menemen yemeli.
80
Discussion
Performance at movements (rises and falls)
  • limited since
  • there may be an absolute difference between pitch
    values of both sentences (gündüz)

Yildizlar genellikle gündüz görülmez.
81
Discussion
Performance at movements (rises and falls)
  • limited since
  • microprosodic effects (kardesim)

Kardesim Nalan yeni ayna aliyor.
82
Discussion
Performance at movements (rises and falls)
  • limited since
  • effects of rises and falls on neighbouring
    syllables are handled partially (only within
    words)
  • Example
  • Input Merve bu sefer zamaninda dönemez
  • Template Aksamki yemek pek güzel degildi
  • Merve from yemek (/ye/ of yemek affected by /ki/
    of aksamki)

83
Aksamki yemek pek güzel degildi.
Merve bu sefer zamaninda dönemez.
84
Discussion
Performance at questions
  • High success due to their simple nature

Niçin sorularima cevap vermiyorsun?
85
Discussion
Performance at questions
  • High success due to their simple nature

Önce nereye bilgi verilmeli?
86
Discussion
Performance at questions
  • High success due to their simple nature

Ona bu güzel kolyeyi satin almayacak misin?
87
Discussion
Objective Evaluation
  • Pitch -> speech melody, human perception -> ST
    (semitone) scale
  • distance d in ST between two frequencies f1 and
    f2 is given as
  • d = 12 x log2 (f1 / f2)
  • metrics
  • mean squared distance between original and
    synthesized in ST
  • proportion of distances < 2 ST
  • compare with baseline solution constructed as
  • 6 types x 2 states -> 12 groups of DB sentences
  • for each sentence -> median of nonzero pitch
  • average of the sentence medians in each group ->
    12 baselines
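The semitone distance, the two metrics and the baseline construction can be sketched as follows. Comparing the contours frame by frame and skipping frames where either pitch value is zero are assumptions; the function names and sample values are hypothetical.

```python
import math
from statistics import median


def semitone_distance(f1, f2):
    """Distance in semitones between two frequencies: 12 * log2(f1 / f2)."""
    return 12 * math.log2(f1 / f2)


def evaluate(original, synthesized, limit_st=2.0):
    """Return (mean squared distance in ST, proportion of distances below limit_st).

    Assumption: frames where either contour is zero (unvoiced) are skipped.
    """
    dists = [semitone_distance(o, s)
             for o, s in zip(original, synthesized) if o > 0 and s > 0]
    msd = sum(d * d for d in dists) / len(dists)
    within = sum(1 for d in dists if abs(d) < limit_st) / len(dists)
    return msd, within


def group_baseline(sentences):
    """Average over sentences of the median nonzero pitch (one baseline per group)."""
    medians = [median(f for f in s if f > 0) for s in sentences]
    return sum(medians) / len(medians)


print(semitone_distance(200, 100))                 # one octave
print(evaluate([100, 200, 0], [100, 100, 150]))
print(group_baseline([[0, 100, 120, 140], [0, 0, 200]]))
```

An octave corresponds to 12 ST, so the second frame pair in the example contributes a squared distance of 144 and falls outside the 2 ST limit.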

88
Discussion
Objective Evaluation
89
Discussion
Objective Evaluation
90
Discussion
Objective Evaluation
Results
  • ANOVA (analysis of variance)
  • p: the probability that the means of the
    methods are equal
  • p < 0.10 or 0.05 or 0.01 -> difference between
    averages statistically significant
  • Method better than baseline in general
  • Performance at close test sentences > performance
    at random test sentences
  • best results in questions
  • similar results in both metrics

91
Conclusion
  • Intonation and stress -> fundamental frequency
  • Analysis of pitch contours
  • Method based on syntactic structure in terms of
    word categories and word stress information
  • Automatic generation of these inputs from text is
    relatively easy.
  • Makes use of
  • a sentence database (corpus of natural speech)
  • interpolation
  • Recordings of a single speaker

92
Future Work
  • Inclusion of other speakers
  • A further categorization of words instead of POS
    categories -> subcategories -> more complex
    syntactic structures -> larger database for
    efficiency
  • Other inputs
  • prosodic boundaries
  • accented syllables
  • and their automatic generation from input text
    (prosodic description)
  • Handling microprosody