Title: Speech Synthesis Methods, Implementation and A Coprocessor with Leon
1Speech Synthesis - Methods, Implementation and
A Coprocessor with Leon
- Rajeev Bakshi
- M.Tech., Computer Science and Engineering
2Outline
- Speech synthesis technologies and vocal system
- Klatt's speech synthesizer
- Architecture design approach
- CEERI's approach
- FPGA design flow
- Methodology to attach a coprocessor at the FPU interface of the Leon processor
3TTS: Text-To-Speech Synthesizer
- Automatic generation of speech signals by computer is commonly called speech synthesis.
- Input can take many forms: written text, optical characters, parameters, etc.
4Applications
- Telecommunication Services
- Language Education
- Aid to handicapped persons
- Talking books and toys
- Vocal Monitoring
- Multimedia, man-machine communication.
- Fundamental and Applied Research
5Three Broad Classes
- Speech synthesis technologies can be broadly grouped into three categories:
- Concatenative synthesis
- Formant synthesis
- Articulatory synthesis
6Vocal System
7Concatenative Speech Synthesis
- Waveform segments are stored in a database.
- For a given text, these segments are joined according to a set of joining rules.
- Transitions between segments can cause audible glitches.
- Efficient lookup and searching are necessary to locate the segments.
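A minimal sketch of the lookup-and-join idea, assuming a hypothetical in-memory database that maps unit names to lists of samples. The function names, the linear crossfade, and the overlap length are illustrative assumptions, not the joining rules of any particular system; the crossfade is just one simple way to soften the glitch at a segment boundary.

```python
def crossfade_join(left, right, overlap):
    """Join two sample lists, linearly crossfading over `overlap` samples."""
    overlap = max(0, min(overlap, len(left), len(right)))
    if overlap == 0:
        return list(left) + list(right)
    head = list(left[:len(left) - overlap])
    tail = list(right[overlap:])
    mixed = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of the incoming segment
        old = left[len(left) - overlap + i]
        mixed.append((1.0 - w) * old + w * right[i])
    return head + mixed + tail


def synthesize(unit_names, database, overlap=2):
    """Look up each unit (hypothetical dict database) and join the segments."""
    out = []
    for name in unit_names:
        segment = database[name]  # dictionary lookup, O(1) on average
        out = crossfade_join(out, segment, overlap) if out else list(segment)
    return out


# Toy database with made-up constant segments, for illustration only.
toy_database = {"a": [1.0, 1.0, 1.0, 1.0], "b": [0.0, 0.0, 0.0, 0.0]}
joined = synthesize(["a", "b"], toy_database, overlap=2)
```

With the toy data, the two overlapping samples ramp from the level of the first segment toward the level of the second instead of jumping.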
8Formant Speech Synthesis
- Formants: resonant frequencies that occur at the main resonant areas of the vocal tract for a given sound.
- Formant synthesis artificially reconstructs the formant characteristics of the sound to be produced.
- This is done by exciting a set of resonators with a voicing source or noise generator to achieve the desired speech spectrum.
- Adding a set of anti-resonators further allows the simulation of nasal tract effects, fricatives and plosives.
9Articulatory Speech Synthesis
- Articulators are the speech organs.
- It is based on computational models of the human vocal tract and the articulation processes occurring there.
- In principle, the most natural-sounding form of speech synthesis.
- Attempts to describe the actual speech production mechanism.
10Comparison of Speech Synthesis Techniques
12Klatt's Model
- Cascade/parallel formant synthesizer
- Source filter model
- Voicing source: impulsive model
- Turbulent noise: random number generator
- Vocal tract transfer function: resonator combination
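The difference between the cascade and parallel resonator combinations can be sketched structurally as follows. This is only an illustration under stated assumptions: `cascade`, `parallel`, and the `one_pole` stand-in stage are hypothetical names, and in a real formant synthesizer each stage would be a digital resonator rather than this toy filter.

```python
def cascade(stages, x):
    """Series connection: the output of each stage feeds the next one."""
    signal = list(x)
    for stage in stages:
        signal = stage(signal)
    return signal


def parallel(stages, gains, x):
    """Parallel connection: every stage sees the same input; the
    gain-weighted outputs are summed sample by sample."""
    outputs = [stage(list(x)) for stage in stages]
    return [
        sum(g * out[n] for g, out in zip(gains, outputs))
        for n in range(len(x))
    ]


def one_pole(coefficient):
    """Hypothetical stand-in stage: y[n] = x[n] + coefficient * y[n-1]."""
    def run(x):
        y, previous = [], 0.0
        for sample in x:
            previous = sample + coefficient * previous
            y.append(previous)
        return y
    return run


impulse = [1.0, 0.0, 0.0]
stages = [one_pole(0.5), one_pole(0.5)]
series_out = cascade(stages, impulse)
summed_out = parallel(stages, [1.0, 1.0], impulse)
```

In the cascade form the relative formant amplitudes follow from the chain itself, while the parallel form needs an explicit gain per branch.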
13Klatt's Speech Synthesizer
14Overall System
- Pipeline: words -> rules -> .par -> klatt -> .raw
- words: input text
- rules: essentially a lookup table; private to an organization (CEERI); language dependent (Hindi, ..)
- .par: parameter file
- klatt: cascade/parallel formant synthesizer; available in the public domain; language independent
- .raw: waveforms (ASCII file / binary file)
15Source Filter Model (klatt)
- Source: impulses (f0) or white noise
- Vocal tract filter
- Output: speech
16Source Filter Model
- Production of speech:
- Generation of a sound source at the glottis or at some point along the length of the vocal tract
- Filtering of these sources by the vocal tract
- Sound sources:
- Quasi-periodic sources
- Turbulence noise sources
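A quasi-periodic (voiced) source under the impulsive model can be sketched as a train of unit impulses spaced one pitch period apart. The function name and the numeric values below are illustrative assumptions; the fundamental frequency is held fixed here, whereas in real speech it varies over time.

```python
def impulse_train(f0_hz, sample_rate_hz, num_samples):
    """Return unit impulses spaced one pitch period apart, zeros elsewhere."""
    period = max(1, round(sample_rate_hz / f0_hz))  # pitch period in samples
    return [1.0 if n % period == 0 else 0.0 for n in range(num_samples)]


# Hypothetical values: 100 Hz pitch at a 10 kHz sampling rate.
voicing = impulse_train(f0_hz=100, sample_rate_hz=10000, num_samples=300)
```

This sequence would then be passed through the vocal tract filter to shape its spectrum.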
17Source Filter Model
- Turbulence noise
- Turbulence occurs due to rapid air flow at a constriction. Turbulence noise can be produced at a constriction at the glottis, or at a constriction made with the tongue or lips above the glottis.
- Aspiration noise is produced at a glottal constriction, e.g. h.
- Frication noise is produced at a supraglottal constriction, e.g. s, f, v.
- Noise is modeled by a pseudo-random number generator.
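One simple way to obtain such a noise source is a linear congruential generator scaled to the range [-1, 1). The sketch below is an assumption for illustration: the class name is hypothetical, and the constants are commonly cited LCG parameters rather than those of any particular synthesizer.

```python
class LcgNoise:
    """Pseudo-random noise source based on a linear congruential generator."""

    MULTIPLIER = 1103515245  # commonly cited LCG constants (an assumption)
    INCREMENT = 12345
    MODULUS = 2 ** 31

    def __init__(self, seed=1):
        self.state = seed % self.MODULUS

    def next_sample(self):
        """Advance the generator and return a sample in [-1.0, 1.0)."""
        self.state = (self.MULTIPLIER * self.state + self.INCREMENT) % self.MODULUS
        return 2.0 * self.state / self.MODULUS - 1.0


noise_source = LcgNoise(seed=1)
noise = [noise_source.next_sample() for _ in range(1000)]
```

Because the sequence depends only on the seed, the same noise can be reproduced exactly, which is convenient when comparing a software model against a hardware implementation.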
18Source Filter Model
19Resonator: band-pass filter. Antiresonator: band-stop filter.
20Digital Resonator
- Input sequence: $x(nT)$
- Output sequence: $y(nT)$
- $y(nT) = A\,x(nT) + B\,y(nT - T) + C\,y(nT - 2T)$, where $n$ is an integer
21Digital Resonator
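- The input-output characteristics of the resonator are specified by:
- Resonant frequency (formant), $F$
- Resonance bandwidth, $BW$
- The coefficients follow from $F$, $BW$ and the sampling period $T$:
- $C = -\exp(-2\pi \, BW \, T)$
- $B = 2\exp(-\pi \, BW \, T)\cos(2\pi F T)$
- $A = 1 - B - C$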
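The coefficient formulas and the resonator difference equation can be sketched together as follows. The function names and the numeric values (a 500 Hz formant, 60 Hz bandwidth, 10 kHz sampling rate) are illustrative assumptions. Because A = 1 - B - C, the filter has unity gain at zero frequency, which the step-response check at the end relies on.

```python
import math


def resonator_coefficients(f_hz, bw_hz, sample_rate_hz):
    """Compute (A, B, C) from formant frequency F, bandwidth BW, and T = 1/fs."""
    t = 1.0 / sample_rate_hz
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * f_hz * t)
    a = 1.0 - b - c
    return a, b, c


def resonate(x, a, b, c):
    """Apply y(nT) = A x(nT) + B y(nT - T) + C y(nT - 2T) with zero initial state."""
    y1 = y2 = 0.0  # previous two output samples
    out = []
    for sample in x:
        y = a * sample + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


# Hypothetical example values, not taken from the slides.
a, b, c = resonator_coefficients(f_hz=500.0, bw_hz=60.0, sample_rate_hz=10000.0)
step_response = resonate([1.0] * 5000, a, b, c)
```

For a constant input the output settles to the input level, since the transient decays at a rate set by the bandwidth.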
23Architecture Design Approach
- Criteria: performance, space optimization, semantic gap, cost, etc.
- ASICs
- Disadvantages: higher cost, no flexibility
- DSPs
- Disadvantages: higher cost
24Architecture Design Approach
- CISCs
- Disadvantages: relatively slower, higher power consumption
- RISCs
- Disadvantages: large semantic gap with the application
- ASIPs
- Advantages: reduced semantic gap, reduced power consumption
26CEERI's Work
- ASIP for Hindi TTS Voice Chip
- Rule chip and voice chip (SW and FPGA)
- Rule Chip (SW)
- Database lookup and concatenation rules to generate parameters for a given sentence
- Voice Chip (HW)
- Klatt's model of the vocal cord, which takes parameters and produces amplitude samples
28FPGA Design Approach
- HDL compilation
- Logic Optimization
- Technology Mapping
- Placement
- Routing
- Static timing analysis
- FPGA configuration file generation
29FPGA Design Flow - 1
30FPGA Design Flow - 2
32Methodology
- The Leon core provides two interfaces:
- Coprocessor interface
- FPU interface
- FPU interface: fpu_core, meiko_fpu
33Methodology
34Thanks
- Project Home Page: http://karnali.cse.iitd.ernet.in/srijan/text2speech/