Chap 16. Speech Synthesis

Provided by: yihru

1
Chap 16. Speech Synthesis
  • Goal: generate natural, human-like speech.
  • Only the concatenative method is introduced here.
  • Basic idea: concatenate synthesis units, modifying
    parameters such as their energy and length.
  • Factors to consider:
    • Timbre (spectrum)
    • Prosody
      • Linguistic level: stress, intonation, rhythm,
        tone, ...
      • Acoustic level: pitch (F0), duration (timing),
        amplitude (energy, intensity)
  • Challenges: text understanding, prosody generation,
    and the synthesis method.

2
Text-to-speech system model
3
Overlap and Add method
  • SOLA (Synchronous Overlap and Add)

[Figure: SOLA example. Labels: Hamming window, length 2N = 330; epoch. Both duration and pitch frequency are changed in this example.]
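
The overlap-and-add idea on this slide can be sketched as follows. This is a minimal illustration, not the slide author's implementation: it is plain overlap-add without the synchronization step that SOLA adds, it changes duration only, and the names `hamming`, `overlap_add`, `half_len`, and `analysis_hop` are hypothetical. The window length 2N = 330 is taken from the slide's figure label.

```python
import math


def hamming(length):
    # Periodic Hamming window; copies shifted by length/2 sum to a constant (1.08).
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / length) for n in range(length)]


def overlap_add(x, half_len, analysis_hop):
    """Read frames of length 2*half_len every analysis_hop samples and
    write them every half_len samples, normalizing by the summed windows.

    analysis_hop > half_len shortens the signal, analysis_hop < half_len
    lengthens it. No cross-correlation alignment is done (assumption:
    plain OLA, which is simpler than SOLA).
    """
    win_len = 2 * half_len
    w = hamming(win_len)
    n_frames = max(0, (len(x) - win_len) // analysis_hop + 1)
    out_len = (n_frames - 1) * half_len + win_len if n_frames else 0
    out = [0.0] * out_len
    norm = [0.0] * out_len
    for k in range(n_frames):
        src = k * analysis_hop   # where the frame is read
        dst = k * half_len       # where the frame is written
        for n in range(win_len):
            out[dst + n] += w[n] * x[src + n]
            norm[dst + n] += w[n]
    # The Hamming window never reaches zero, so norm is positive everywhere.
    return [o / g for o, g in zip(out, norm)]


if __name__ == "__main__":
    signal = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(8000)]
    shorter = overlap_add(signal, half_len=165, analysis_hop=330)
    print(len(signal), len(shorter))
```

With `analysis_hop` equal to `half_len` the input is reproduced unchanged, which is a quick sanity check of the windowing and normalization.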
4
PSOLA (Pitch Synchronous Overlap And Add)
  • The overlap-and-add units are pitch-synchronous.
  • Pitch marks are placed at the glottal closure
    instants (GCI).
5
  • PSOLA three-frame example

[Figure: three windowed frames are overlapped at spacing Tn (the new pitch period); the result of the addition is the synthesized speech.]
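
The overlap step in this example can be sketched in code. The sketch assumes the pitch marks are already known, uses a constant new pitch period Tn, and weights each two-period segment with an asymmetric raised-cosine window; the function name `psola_repitch` and its parameters are hypothetical, and the window choice is an assumption rather than something stated on the slide.

```python
import math


def psola_repitch(x, marks, new_period):
    """Re-space pitch-synchronous segments so they are new_period samples apart.

    x          : list of samples
    marks      : increasing sample indices of pitch marks (at least 3)
    new_period : desired spacing Tn between synthesized pitch marks
    """
    out = [0.0] * len(x)
    interior = range(1, len(marks) - 1)
    t = marks[1]                      # first synthesis pitch mark
    while t <= marks[-2]:
        # Use the analysis segment whose pitch mark is closest to t.
        i = min(interior, key=lambda j: abs(marks[j] - t))
        left = marks[i] - marks[i - 1]
        right = marks[i + 1] - marks[i]
        for n in range(-left, right + 1):
            span = left if n < 0 else right
            # Raised cosine: 1 at the pitch mark, 0 at the neighbouring marks.
            w = 0.5 + 0.5 * math.cos(math.pi * n / span)
            pos = t + n
            if 0 <= pos < len(out):
                out[pos] += w * x[marks[i] + n]
        t += new_period
    return out


if __name__ == "__main__":
    # Toy input: one impulse per pitch period, period 100 samples.
    marks = list(range(100, 1000, 100))
    x = [0.0] * 1000
    for m in marks:
        x[m] = 1.0
    y = psola_repitch(x, marks, new_period=80)
    print([i for i, v in enumerate(y) if v > 0.5])
```

Choosing the nearest analysis mark for each synthesis mark keeps the overall duration roughly unchanged while the spacing between segments, and therefore the pitch, changes.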
6
  • To change speech duration, insert or remove whole
    pitch periods of the speech signal.
  • Changing the sampling rate instead would also
    change the spectrum.

[Figure: insert pitch periods to increase duration; delete pitch periods to reduce duration.]
7
  • Example: change 5 pitch periods into 3 pitch
    periods

[Figure: pitch marks, windowed signals, and the signal obtained by adding them.]
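
The 5-to-3 example can be sketched as an index mapping that decides which pitch periods are kept or repeated. This is a simplified illustration: it concatenates whole periods directly instead of windowing and overlap-adding them as in the figure, and the names `select_periods` and `change_duration` are hypothetical.

```python
def select_periods(n_in, n_out):
    """Pick n_out period indices spread evenly over n_in input periods."""
    return [k * n_in // n_out for k in range(n_out)]


def change_duration(x, marks, n_out):
    """Rebuild x from n_out of its pitch periods.

    marks are the sample indices that delimit the periods, so there are
    len(marks) - 1 input periods. Periods are dropped when n_out is
    smaller and repeated when it is larger.
    """
    periods = [x[marks[i]:marks[i + 1]] for i in range(len(marks) - 1)]
    out = []
    for i in select_periods(len(periods), n_out):
        out.extend(periods[i])
    return out


if __name__ == "__main__":
    print(select_periods(5, 3))   # [0, 1, 3]
    print(select_periods(3, 5))   # [0, 0, 1, 1, 2]
```

Because only whole periods are removed or duplicated, the waveform inside each period, and so the spectrum, is left untouched.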
8
PSOLA Synthesis
  • Voiced part: modification
    • Changing the pitch contour: change the offset
      between segments when overlapping.
    • Changing the length of speech: insert or delete
      pitch periods (change the number of periods).
    • Changing the energy: multiply the amplitude by a
      gain factor.
  • Unvoiced part
    • Copy; no pitch change is needed.
    • Delete or duplicate segments to change the length.
  • Open questions: how should the desired pitch mark
    be placed on the vertical (y) axis of the waveform
    in the previous slide? What is the energy gain of
    each epoch?
  • The pitch scaling factor should be between 0.7 and
    1.5.
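
Two of the modifications listed above reduce to simple arithmetic, sketched here. The sketch assumes the pitch scaling factor multiplies F0, so the new pitch period is the old period divided by the factor; the slide does not state this convention explicitly. The names `target_period` and `apply_gain` are hypothetical.

```python
def target_period(period, pitch_scale, lo=0.7, hi=1.5):
    """Return the new pitch period in samples for a given pitch scaling factor.

    Assumption: pitch_scale multiplies F0, so the period is divided by it.
    Factors outside [lo, hi] are rejected, following the range on the slide.
    """
    if not lo <= pitch_scale <= hi:
        raise ValueError("pitch scaling factor outside the usable range")
    return round(period / pitch_scale)


def apply_gain(segment, gain):
    """Change the energy of a segment by multiplying every sample by gain."""
    return [gain * s for s in segment]


if __name__ == "__main__":
    print(target_period(100, 1.25))       # 80
    print(apply_gain([1.0, -2.0], 0.5))   # [0.5, -1.0]
```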

9
Analysis WaveTable
  • Problems
    • Speech corpus: sentence, word, syllable
    • Determining the synthesis unit: syllable, diphone,
      etc.
  • Process
    • Voiced/unvoiced determination
    • Pitch marking
    • Store all the speech pieces to create the unit
      database.
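
The voiced/unvoiced determination step can be sketched with two common frame-level features, short-time energy and zero-crossing rate. The slide does not say which method is used, so this is one possible approach; the thresholds and the names `short_time_energy`, `zero_crossing_rate`, and `is_voiced` are hypothetical and would need tuning on real data.

```python
import math


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def is_voiced(frame, energy_thresh=0.01, zcr_thresh=0.25):
    """Label a frame voiced when it is energetic and crosses zero rarely.

    Both thresholds are illustrative assumptions, not values from the slides.
    """
    return (short_time_energy(frame) > energy_thresh
            and zero_crossing_rate(frame) < zcr_thresh)


if __name__ == "__main__":
    # 100 Hz tone sampled at 8 kHz: high energy, few zero crossings.
    tone = [0.5 * math.sin(2 * math.pi * 100 * n / 8000) for n in range(240)]
    # Rapidly alternating samples: a crude stand-in for a noisy frame.
    hiss = [0.5 if n % 2 == 0 else -0.5 for n in range(240)]
    print(is_voiced(tone), is_voiced(hiss))
```

Frames labeled voiced would then go on to pitch marking, while unvoiced frames can be stored without pitch marks.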

10
Other synthesis techniques
  • Frequency-domain PSOLA
  • Using PSOLA on the LPC (RPE-LP) residual signal only
    • The spectral information (LPC or LSP) should be
      interpolated.
  • Synthesis unit selection
    • Larger concatenation units (words or even phrases)
    • Corpus-based TTS; size of WaveTable (20G?)
  • Good text analysis (understanding)
  • Good prosodic information
  • Emotion?
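
The remark in this slide that the spectral information (LPC or LSP) should be interpolated can be sketched for the LSP case. The sketch assumes plain linear interpolation between the LSP vectors of two adjacent units, which the slide does not specify; the name `interpolate_lsp` and the example values are hypothetical.

```python
def interpolate_lsp(lsp_a, lsp_b, n_frames):
    """Linearly interpolate between two LSP vectors over n_frames frames.

    The first frame equals lsp_a and the last equals lsp_b. If both inputs
    are sorted in increasing order, every interpolated frame is sorted too.
    """
    if n_frames < 2:
        raise ValueError("need at least two frames")
    frames = []
    for k in range(n_frames):
        t = k / (n_frames - 1)   # 0.0 at the first frame, 1.0 at the last
        frames.append([(1 - t) * a + t * b for a, b in zip(lsp_a, lsp_b)])
    return frames


if __name__ == "__main__":
    # Made-up LSP frequencies in radians, for illustration only.
    left_unit = [0.3, 0.8, 1.5]
    right_unit = [0.5, 1.0, 2.0]
    for frame in interpolate_lsp(left_unit, right_unit, 3):
        print(frame)
```

Interpolating in the LSP domain is a common choice because the ordering of the frequencies is preserved along the way, whereas interpolating raw LPC coefficients gives no such guarantee.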