VoiceDriven Email System - PowerPoint PPT Presentation


Provided by: bishadg


Transcript and Presenter's Notes


1
Voice-Driven Email System

Bingchen Li
Bishad Ghimire
Puneet Batra

2
Project Discussions

The discussion is divided into the following main parts:
General Introduction
Hardware Specifications
Software Specifications

3
General Introduction

Artificial Intelligence research
Speech recognition technologies
The purpose of our design

4
Design Specifications

The main purpose is to design and develop a
voice-driven email system
Its usefulness
HW, SW and Platform

5
Decomposition

Makes systems development more efficient and
less complicated.
Enables a better estimate of the
current system.
Is done as:
Client-side Interface
Speech and Text based HW
Interface on the Backend

6
The Process

The SW process is a structured set of activities
to develop a complicated system
SW Process Models
The Rapid Application Development Model (RAD)
The Fourth Generation Paradigms
The Concurrent Model
Project Scheduling

7
Hardware
8
HW Specifications

Foundational description of TTS
Our Project is designed to yield an experimental
but fully operational text/voice system
Based on a standardized modular interface within
the design.
Usefulness

9
Functional Decomposition

The chipset provides TTS conversion with its
integrated text-to-speech synthesizer
Any English text written to the chip is converted
into speech
All data is sent to the chip through the serial
port.
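As a rough illustration of the serial data path described above, here is a minimal Python sketch. The function name `send_text`, the carriage-return terminator and the in-memory stand-in port are all assumptions made for the example; the slides do not name the chipset's actual framing or port settings.

```python
import io


def send_text(port, text, terminator=b"\r"):
    """Write ASCII text to a serial-like object and return the byte count.

    `port` is anything with a write(bytes) method. The terminator byte is
    an assumption for this sketch, not something taken from the slides.
    """
    payload = text.encode("ascii", errors="replace") + terminator
    port.write(payload)
    return len(payload)


# An in-memory buffer stands in for the real serial port so the sketch
# runs without hardware attached.
fake_port = io.BytesIO()
sent = send_text(fake_port, "Welcome to the Read-Email System Model.")
print(sent, fake_port.getvalue())
```

With real hardware, a pySerial `serial.Serial` object could be passed in place of the buffer, since it also exposes `write(bytes)`.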

10
Setting the Chipset

Programming the control panel
Setting the mode
Voice type
Audio Control Registers
Protocol Options Registers

11
Manufacturer's Specifications

Absolute Maximum Ratings
Supply voltage, Vcc and AVcc: -0.3 V to 6.5 V
DC input voltage, Vi: -0.3 V to Vcc + 0.3 V
Operating temperature, TA: 0 °C to 70 °C

12
Manufacturer's Specifications (contd.)

DC Characteristics
TA = 0 °C to 70 °C
VCC = AVCC = AVREF = 5 V ± 10%
VSS = AVSS = 0 V
XIN = 7.35 MHz

13
Chipset Operation

Modes
Text
Character
Phoneme
Real time audio
Prerecorded audio
Tone Generator mode
Idle and standby

14
Exception Dictionary

Makes it possible to alter the way the TTS
synthesizer pronounces text
It can be created and edited for future use
E.g.
C(O)N = AA
O after C and before N gets the pronunciation
AA (the vowel sound in the word cot).
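The rule above can be read as "left context, (text fragment), right context, phoneme". A minimal Python sketch of that idea follows. The function `apply_rule` and its token-list output are hypothetical and only illustrate context-sensitive substitution; they are not the chipset's real dictionary format or compiler.

```python
def apply_rule(word, left, fragment, right, phonemes):
    """Replace `fragment` with `phonemes` wherever it sits between
    the `left` and `right` context letters.

    Returns a list of tokens: untouched letters stay as single
    characters, matched fragments become the phoneme string.
    """
    word = word.upper()
    target = left + fragment + right
    out = []
    i = 0
    while i < len(word):
        if word.startswith(target, i):
            out.extend(left)        # context letters pass through unchanged
            out.append(phonemes)    # the fragment gets the new pronunciation
            i += len(left) + len(fragment)  # right context is not consumed
        else:
            out.append(word[i])
            i += 1
    return out


print(apply_rule("con", "C", "O", "N", "AA"))  # rule fires
print(apply_rule("cot", "C", "O", "N", "AA"))  # rule does not fire
```

Only the first call matches the context, so the O in "con" is replaced while "cot" passes through letter by letter.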

15
How it works

Some demonstrations
Text
Welcome to the Read-Email System Model.
Phonetic
w eh l k ah m t uw dh ax r eh d iy m
ey l s ix s t ax m m aa d ax l

16
Teaching the chip

Defining the mode
Text whass
Phonetic w ae s ah p
Then,
Phonetic equivalent w ah ts ah p
Then create, compile and download.

17
Features of the Chipset

A full-featured voice synthesizer with unlimited
vocabulary
Programmable voice parameters: voice, pitch,
rate, volume and so on.
Non-volatile storage of user settings and dictionary
DTMF (touch tone) Dialer
Analog and Digital outputs

18
Features of the Chipset (contd.)

On-chip A/D converter
Very low current drain
Integrated serial data port.
Upgradeable and programmable via serial port

19
Systematic flow

Refresh
Start with User Inputs
Authorizing and verifying Entered Information
Connecting with the Mail Server
Chipset operation
Audio output
Stop and/or restart

20
Conclusion

The chosen chipset has been a good fit
Successful completion even after programming the
chips in the phonetic mode
Working Demo

21
Software
22
NLP in Information Retrieval (IR)

What is wrong with today's IR systems?
What is Natural Language Processing (NLP)?
Different levels of understanding text
Typical IR system
Where does NLP come in?

23
Today's Information Retrieval Systems
google.com, altavista.com, lycos.com
Mostly Boolean logic (AND, OR, NOT)
Some systems use probabilistic and statistical
logic
Not very intelligent
Look at this scenario:
Search phrase: America AND Russia AND Battle
Search result: a document titled "America beats
Russia in battle for gold at Summer Olympics"
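The scenario above can be reproduced with a few lines of Python. This is a minimal sketch of pure Boolean AND matching over document titles; the helper names `tokens` and `boolean_and` and the second document are made up for the example.

```python
import re


def tokens(text):
    """Lower-case a text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def boolean_and(query_terms, documents):
    """Return the documents that contain every query term."""
    wanted = {term.lower() for term in query_terms}
    return [doc for doc in documents if wanted <= tokens(doc)]


docs = [
    "America beats Russia in battle for gold at summer Olympics",
    "Weather report for the weekend",  # hypothetical filler document
]
hits = boolean_and(["America", "Russia", "Battle"], docs)
print(hits)
```

The sports headline is returned because all three words occur in it, even though a user typing that query was presumably looking for something else. Word matching alone carries no notion of meaning.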

24
What is Natural Language Processing?
Understanding text comes naturally to human
beings
We understand text on a number of levels

25
Different levels of understanding text
Phonetic
Morphological
Syntactic
Semantic
Discourse
Pragmatic

26
A typical IR system
27

Where does NLP come in?
Traditional IR systems use Boolean and
probabilistic logic
In addition, different levels of understanding
can be incorporated
Number of levels of understanding combined

28
(No Transcript)
29

Speech Recognition Algorithm
Hidden Markov model
Time Sliced Paradigm

30
Hidden Markov model

What is HMM?
A triple (π, A, B)
Usages
Three problems solved
The evaluation problem and Forward algorithm
The decoding problem and Viterbi algorithm
The learning problem

31
The evaluation problem and Forward algorithm

What is the evaluation problem?
How does Forward Algorithm work?
$$\alpha_{t+1}(j) = b_j(o_{t+1}) \sum_{i=1}^{N} \alpha_t(i)\, a_{ij}, \qquad 1 \le j \le N,\ 1 \le t \le T-1$$
What is the benefit?
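A minimal Python sketch of the forward recursion on this slide follows. The initialization α_1(i) = π_i b_i(o_1) and the final sum over α_T(i) are the standard textbook steps and are not shown on the slide; the two-state model values are invented purely for illustration.

```python
def forward(pi, A, B, obs):
    """Return P(obs | model) for an HMM given as the triple (pi, A, B).

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[j][k] : probability that state j emits symbol k
    obs     : list of observed symbol indices o_1 .. o_T
    """
    n = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    # Induction: alpha_{t+1}(j) = b_j(o_{t+1}) * sum_i alpha_t(i) * a_ij
    for o in obs[1:]:
        alpha = [
            B[j][o] * sum(alpha[i] * A[i][j] for i in range(n))
            for j in range(n)
        ]
    # Termination: sum over the final alpha values
    return sum(alpha)


# Hypothetical two-state, two-symbol model
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
print(forward(pi, A, B, [0, 1]))
```

Each time step costs on the order of N² multiplications, so the whole sequence costs roughly N²T operations rather than enumerating all N^T state paths.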

32
The decoding problem and Viterbi algorithm

What is the decoding problem?
How does Viterbi Algorithm work?
$$\delta_{t+1}(j) = b_j(o_{t+1}) \max_{1 \le i \le N} \delta_t(i)\, a_{ij}, \qquad 1 \le j \le N,\ 1 \le t \le T-1$$
What is the benefit?
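The Viterbi recursion on this slide has the same shape as the forward recursion, with the sum replaced by a max. Here is a minimal Python sketch, assuming the standard initialization δ_1(i) = π_i b_i(o_1) and back-pointer tracing, neither of which appears on the slide. The model values are again invented for illustration.

```python
def viterbi(pi, A, B, obs):
    """Return (best_state_path, probability_of_that_path)."""
    n = len(pi)
    # Initialization: delta_1(i) = pi_i * b_i(o_1)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    backptr = []
    for o in obs[1:]:
        # For each state j, remember the best predecessor state i
        prev = [
            max(range(n), key=lambda i: delta[i] * A[i][j])
            for j in range(n)
        ]
        # delta_{t+1}(j) = b_j(o_{t+1}) * max_i delta_t(i) * a_ij
        delta = [B[j][o] * delta[prev[j]] * A[prev[j]][j] for j in range(n)]
        backptr.append(prev)
    # Pick the best final state, then follow the back-pointers
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for prev in reversed(backptr):
        path.append(prev[path[-1]])
    path.reverse()
    return path, delta[last]


# Hypothetical two-state, two-symbol model
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
path, prob = viterbi(pi, A, B, [0, 1])
print(path, prob)
```

The returned path is the single most likely state sequence, whereas the forward algorithm sums over all of them.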

33
The learning problem

What is the learning problem?
Two main optimization criteria
Maximum Likelihood (ML)
Maximum Mutual Information (MMI)

34
Time-Sliced Paradigm

Introduction
Objective
Processes

35
Process Step One

The spectrum of the input signal is divided into
slices of equal length along the time-axis, which
are passed sequentially through the
recognition system.
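Step one can be sketched in Python as below. The sketch assumes the spectrum is already available as a list of frames ordered in time, and it drops a final slice that is too short. Both choices, and the name `slice_frames`, are assumptions made for the example, not details given in the slides.

```python
def slice_frames(frames, slice_len):
    """Yield consecutive, non-overlapping slices of `slice_len` frames.

    A trailing remainder shorter than `slice_len` is dropped so that
    every slice has equal length.
    """
    for start in range(0, len(frames) - slice_len + 1, slice_len):
        yield frames[start:start + slice_len]


# Seven placeholder frames stand in for a real spectrogram.
frames = list(range(7))
for time_slice in slice_frames(frames, 3):
    # Each slice would be handed to the recognizer in order.
    print(time_slice)
```

A generator keeps the slices sequential, which matches the description of feeding them through the recognition system one after another.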

36
Process Step Two
37
Process Step Three
38
Conclusion