Speech Synthesis Markup Language Aim at Extension

1
Speech Synthesis Markup Language: Aim at Extension
  • Dr. Jianhua Tao

National Laboratory of Pattern Recognition
(NLPR), Institute of Automation, Chinese Academy
of Sciences
2
Brief Introduction to Evolution of SSML
  • The original SSML (not W3C SSML)
  • STML
  • JSML
  • SABLE
  • W3C SSML

3
The original SSML
  • Mark phrase boundaries
  • Emphasize words
  • Specify pronunciations
  • Include other sound files

4
STML
  • Developed by Edinburgh and Bell Labs
  • Based on the original SSML
  • Aimed at giving listeners the same basic
    impression on different systems, rather than
    sounding identical

5
JSML
  • Developed by Sun
  • XML based
  • Includes
    • Elements to mark paragraphs and sentences
    • Elements to control pronunciation
    • Elements to represent markers

6
SABLE
  • Developed by Edinburgh and Bell Labs
  • Based on STML and JSML
  • The stated aims
    • Synthesizer control
    • Text structure
    • Speech pronunciation
    • Multilinguality
    • Ease of use
    • Portability
    • Extensibility

7
W3C SSML
  • Key design criteria
    • Consistency
    • Interoperability
    • Generality
    • Internationalization
    • Generation and Readability
    • Implementable

8
What we want from a markup language
  • Controlling
  • Sharing
  • Extension to multimedia

9
Which level should we focus on?
  • Text analysis module
  • Prosody module
  • Acoustic module

10
Sharing
[Diagram: Sys1 and Sys2, with text-analysis, prosody-analysis, and acoustic modules, linked by SSML]

11
Text level for Mandarin
  • Word boundary
  • Pronunciation with tone
  • POS
  • Dialect?

12
Prosody level for Mandarin
  • Tone sandhi
  • Rhythm?
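As an illustration of the tone sandhi item, here is a deliberately simplified sketch of the well-known third-tone rule: a third tone followed by another third tone is realized as a second tone. The function name is hypothetical, and the sketch ignores prosodic grouping, which affects how longer runs of third tones are actually realized.

```python
def apply_third_tone_sandhi(tones: list) -> list:
    """Return surface tones after a simplified third-tone sandhi pass.

    Each third tone (3) that is immediately followed by an underlying
    third tone becomes a second tone (2). Other tones are unchanged.
    """
    surface = list(tones)
    for i in range(len(tones) - 1):
        # Compare against the underlying tones, not the already modified ones.
        if tones[i] == 3 and tones[i + 1] == 3:
            surface[i] = 2
    return surface


# ni3 hao3 is pronounced ni2 hao3
print(apply_third_tone_sandhi([3, 3]))
```

A markup language that carries only lexical tones leaves this step to each synthesizer, which is one reason the prosody level matters for sharing.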

13
Extensions to expressive synthesis
  • Emotion and Style
  • Others

14
Current elements related to prosody and style in
SSML
  • 3.2.1 "voice" Element
  • 3.2.2 "emphasis" Element
  • 3.2.3 "break" Element
  • 3.2.4 "prosody" Element
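The four elements listed above can be combined in one document. The following is a minimal sketch that builds such a document with Python's standard `xml.etree.ElementTree`; the sentence text and the chosen attribute values are illustrative assumptions, not taken from the slides.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml() -> str:
    """Build a small SSML document using voice, emphasis, break, and prosody."""
    speak = ET.Element(
        "speak", {"version": "1.0", "xmlns": SSML_NS, "xml:lang": "en-US"}
    )
    # voice: select a speaker by gender
    voice = ET.SubElement(speak, "voice", {"gender": "female"})
    voice.text = "The weather today is "
    # emphasis: stress a single word
    emphasis = ET.SubElement(voice, "emphasis", {"level": "strong"})
    emphasis.text = "very"
    emphasis.tail = " good."
    # break: insert a pause of a given duration
    ET.SubElement(voice, "break", {"time": "500ms"})
    # prosody: change rate and pitch for the enclosed text
    prosody = ET.SubElement(voice, "prosody", {"rate": "slow", "pitch": "high"})
    prosody.text = "Enjoy your day."
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml()
print(ssml)
```

None of these elements names an emotion or a speaking style directly; any such effect has to be approximated through pitch, rate, pauses, and emphasis.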

15
Emotion and Style
  • Emotion
    • Anger, happiness, surprise, sadness, fear, ...
    • Depends on the speaker's psychological and
      physical states
    • Local effects on prosody
  • Style
    • News, comments, ...
    • Depends on the semantics of sentences
    • Global effects on prosody

16
Personalized Voice
  • Element <voice>
    • gender
    • age
    • name
    • variant
    • sample
  • … <voice gender="male">…</voice>
  • … <voice gender="female">…</voice>
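A small sketch of how the "voice" attributes might be assembled programmatically. It covers gender, age, name, and variant; "sample" is left out because the slide does not say how it is meant to be used. The helper name `voice_element` and the example text are assumptions.

```python
import xml.etree.ElementTree as ET
from typing import Optional

ALLOWED_GENDERS = ("male", "female", "neutral")


def voice_element(
    text: str,
    gender: Optional[str] = None,
    age: Optional[int] = None,
    name: Optional[str] = None,
    variant: Optional[int] = None,
) -> ET.Element:
    """Create a <voice> element, emitting only the attributes that were given."""
    attrs = {}
    if gender is not None:
        if gender not in ALLOWED_GENDERS:
            raise ValueError(f"gender must be one of {ALLOWED_GENDERS}")
        attrs["gender"] = gender
    if age is not None:
        attrs["age"] = str(age)
    if name is not None:
        attrs["name"] = name
    if variant is not None:
        attrs["variant"] = str(variant)
    element = ET.Element("voice", attrs)
    element.text = text
    return element


male = voice_element("Hello.", gender="male")
female = voice_element("Hello.", gender="female", age=30)
print(ET.tostring(male, encoding="unicode"))
print(ET.tostring(female, encoding="unicode"))
```

These attributes describe who is speaking; they do not say how the speaker feels, which is the gap the emotion and style extensions aim at.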

17
Extension?
  • To make it more expressive
  • Background music
  • VTTS
  • Combined with talking head and some other media
    information
  • We can only see the "mark" element

18
Thanks!
19
  • Element <Structure>
  • Level 0-.. paragraph, phrase,
  • POS
  • <Structure level="paragraph">
  • <Structure level="sentence">
  • <Structure level="phrase">
  • <Structure level="word">
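A sketch of how the proposed "Structure" element might nest, from paragraph down to word. This element is the extension suggested on the slide, not part of W3C SSML. The `pos` attribute name, the tag values, the strict coarse-to-fine nesting order, and the helper names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Levels from coarsest to finest, as listed on the slide.
LEVELS = ["paragraph", "sentence", "phrase", "word"]


def structure(level, text=None, pos=None, children=()):
    """Create a <Structure> element for one level of text structure."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    attrs = {"level": level}
    if pos is not None:
        attrs["pos"] = pos  # hypothetical attribute carrying the POS tag
    element = ET.Element("Structure", attrs)
    element.text = text
    element.extend(children)
    return element


def is_well_nested(element) -> bool:
    """Check that every child level is strictly finer than its parent's."""
    rank = LEVELS.index(element.get("level"))
    return all(
        LEVELS.index(child.get("level")) > rank and is_well_nested(child)
        for child in element
    )


doc = structure("paragraph", children=[
    structure("sentence", children=[
        structure("phrase", children=[
            structure("word", "今天", pos="noun"),
            structure("word", "天气", pos="noun"),
        ]),
        structure("phrase", children=[
            structure("word", "很", pos="adverb"),
            structure("word", "好", pos="adjective"),
        ]),
    ]),
])
print(ET.tostring(doc, encoding="unicode"))
```

Explicit word-level elements carry the word boundaries that Mandarin text does not mark with spaces, so a second system can reuse the first system's segmentation.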