Speech Enabling Mobile Devices using a Java Based Distributive Speech Recognition System presentation

1
Speech Enabling Mobile Devices using a Java Based
Distributive Speech Recognition System

Speech Technology and Research Group
Department of Electrical Engineering
University of Cape Town
Dale Isaacs

2
Overview

Motivation for Research
Applications
Background
Current DSR Standards
Implementation of J2ME Front-End
Implementation of SPHINX4 Back-End
Preliminary Results
Summary

3
Motivation for Research

Increased performance of mobile phones
Advances in wireless technology
Rapid evolution in mobile phone services
Ubiquitous computing
Users forced to use small keypads and screens
Speech is the most natural form of communication
Advantageous for blind users

4
Motivation for Research

In 2005: 2,168,433,600 mobile devices worldwide
708 million Java-equipped handsets
Currently there are 2.5 billion mobile devices
Number of Java-equipped phones has increased
Current working DSR implementations are written
only in C
Implement a Speech Recognition System using:
J2ME Front-End
Sphinx4 Back-End (Java based)

5
Applications

Hands-free communication with devices
Voice navigation, Voice Dialling
Dictate outgoing SMSs
Replay incoming SMSs
Security feature: Speaker Identification /
Speaker Verification
Access to network based services
Allow easy access for blind users with new
devices

6
Background

Speech Technology
Speech Synthesizer
Text-to-Speech
Speech Recognizer
Automatic Speech Recognition (Speech-to-Text)
Speaker Identification: identify persons by the
sound of their voice
Speaker Verification: is the person who he/she
claims to be?

7
Background

3 ways to implement an Automatic Speech
Recognition (ASR) System on a mobile device:
Embedded Speech Recognition
Network Speech Recognition
Distributed Speech Recognition

8
Embedded Speech Recognition
Diagram: Mobile Device with built-in Speech Recognizer
Speech Input -> Feature Extractor -> Features -> Pattern Recognizer -> Recognition Decision

All components sit on mobile device
Mobile devices still lack processing power
Suited to Voice Navigation / Command and Control
(small vocabulary)

9
Network Speech Recognition
Diagram:
Client: Speech Input -> Speech Encoder
Server: Speech Decoder -> Feature Extraction -> Pattern Recognizer -> Recognition Decision

Client Server architecture
High bit-rate requirement

10
Distributed Speech Recognition
Diagram:
Client (Front-End): Speech Input -> Feature Extraction -> Encoder
Server (Back-End): Speech Decoder -> Pattern Recognizer -> Recognition Decision

Also Client Server architecture
Feature Extraction done on client side
Workload evenly distributed
Low bit-rate requirement
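
The bit-rate contrast between sending speech and sending features can be made concrete with some rough arithmetic. The sketch below is illustrative only: the 8 kHz / 16-bit PCM figures, the 10 ms frame shift and the 44 bits per compressed feature frame are assumptions chosen for the example, not numbers from the slides, and a real network recognizer would normally send codec-compressed speech rather than raw PCM.

```java
// Back-of-the-envelope bit rates for the two client-server architectures.
// All constants are illustrative assumptions, not figures from the slides.
final class BitRateSketch {

    // Uncompressed speech: samples per second times bits per sample.
    static int pcmBitsPerSecond(int sampleRateHz, int bitsPerSample) {
        return sampleRateHz * bitsPerSample;
    }

    // Compressed feature stream: a fixed number of bits per feature frame.
    static int featureBitsPerSecond(int bitsPerFrame, int frameShiftMs) {
        int framesPerSecond = 1000 / frameShiftMs;
        return bitsPerFrame * framesPerSecond;
    }

    public static void main(String[] args) {
        int pcm = pcmBitsPerSecond(8000, 16);       // 128000 bit/s
        int features = featureBitsPerSecond(44, 10); // 4400 bit/s before framing overhead
        System.out.println("Raw PCM speech:      " + pcm + " bit/s");
        System.out.println("Compressed features: " + features + " bit/s");
    }
}
```

Under these assumptions the feature stream needs only a few percent of the raw speech rate, which is the motivation for doing feature extraction on the client.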

11
Current DSR Standards

In 2000, the European Telecommunications Standards
Institute (ETSI) released the first standard for
feature extraction in Front-Ends: ES 201 108
In 2002, released ES 202 050, adding noise reduction
In 2003, released ES 202 211 and ES 202 212,
allowing reconstruction of intelligible speech
All of the above standards are implemented only in C
Any speech recognizer can be used at the
Back-End: ISIP, HTK or SPHINX

12
Implementation of J2ME Front-End

Based on ES 201 108

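Diagram:
Speech In -> ADC -> OFFCOM -> FRAMING -> PE -> W -> FFT -> MF -> LOG -> DCT -> 13 Cepstral Coefficients
FRAMING -> LogE
13 Cepstral Coefficients + LogE -> FEATURE COMPRESSION -> BIT STREAM FORMATTING / FRAMING -> 14 Compressed Cepstral Coefficients to Transmission Channel
(OFFCOM: offset compensation, PE: pre-emphasis, W: windowing, MF: mel filtering, LogE: log energy)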
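
A few of the front-end stages named in the diagram can be sketched in plain Java. This is a minimal illustration, not the author's J2ME code: the frame length and shift, the 0.999 and 0.97 filter coefficients, the Hamming window and the energy floor are assumed values in the style of a typical MFCC front-end, and the FFT and mel filterbank stages are left out, so `dct` simply takes whatever log filterbank outputs it is given. It is written for a desktop JVM; a handset port may have to supply its own logarithm if the device profile's `Math` class lacks one.

```java
// Sketch of selected front-end stages (OFFCOM, FRAMING, LogE, PE, W, DCT).
// All constants are assumptions for illustration, not values from the slides.
final class FrontEndSketch {
    static final int FRAME_LENGTH = 200;          // 25 ms at 8 kHz (assumed)
    static final int FRAME_SHIFT = 80;            // 10 ms at 8 kHz (assumed)
    static final double LOG_ENERGY_FLOOR = -50.0; // assumed lower bound for LogE

    // OFFCOM: notch filter removing a DC offset,
    // y[n] = x[n] - x[n-1] + 0.999 * y[n-1].
    static double[] offsetCompensation(double[] x) {
        double[] y = new double[x.length];
        double prevX = 0.0;
        double prevY = 0.0;
        for (int n = 0; n < x.length; n++) {
            y[n] = x[n] - prevX + 0.999 * prevY;
            prevX = x[n];
            prevY = y[n];
        }
        return y;
    }

    // FRAMING: copy FRAME_LENGTH samples starting at frameIndex * FRAME_SHIFT.
    static double[] frame(double[] signal, int frameIndex) {
        double[] f = new double[FRAME_LENGTH];
        System.arraycopy(signal, frameIndex * FRAME_SHIFT, f, 0, FRAME_LENGTH);
        return f;
    }

    // LogE: natural log of the frame energy, floored so silence stays finite.
    static double logEnergy(double[] frame) {
        double energy = 0.0;
        for (int i = 0; i < frame.length; i++) {
            energy += frame[i] * frame[i];
        }
        return Math.max(Math.log(energy), LOG_ENERGY_FLOOR);
    }

    // PE: pre-emphasis y[n] = x[n] - 0.97 * x[n-1], applied within one frame
    // (the sample before the frame is treated as zero, a simplification).
    static double[] preEmphasis(double[] frame) {
        double[] y = new double[frame.length];
        double prev = 0.0;
        for (int n = 0; n < frame.length; n++) {
            y[n] = frame[n] - 0.97 * prev;
            prev = frame[n];
        }
        return y;
    }

    // W: multiply the frame by a Hamming window.
    static double[] hammingWindow(double[] frame) {
        int len = frame.length;
        double[] y = new double[len];
        for (int n = 0; n < len; n++) {
            double w = 0.54 - 0.46 * Math.cos(2.0 * Math.PI * n / (len - 1));
            y[n] = w * frame[n];
        }
        return y;
    }

    // DCT: turn log filterbank outputs into numCoeffs cepstral coefficients.
    static double[] dct(double[] logMel, int numCoeffs) {
        int m = logMel.length;
        double[] c = new double[numCoeffs];
        for (int i = 0; i < numCoeffs; i++) {
            double sum = 0.0;
            for (int j = 0; j < m; j++) {
                sum += logMel[j] * Math.cos(Math.PI * i * (j + 0.5) / m);
            }
            c[i] = sum;
        }
        return c;
    }

    public static void main(String[] args) {
        // One second of a synthetic 440 Hz tone with a DC offset (made-up input).
        double[] speech = new double[8000];
        for (int n = 0; n < speech.length; n++) {
            speech[n] = 500.0 + 1000.0 * Math.sin(2.0 * Math.PI * 440.0 * n / 8000.0);
        }
        double[] clean = offsetCompensation(speech);
        double[] f = frame(clean, 10);
        double logE = logEnergy(f);
        double[] windowed = hammingWindow(preEmphasis(f));
        System.out.println("LogE of frame 10: " + logE);
        System.out.println("First windowed sample: " + windowed[0]);
    }
}
```

LogE is taken from the frame before pre-emphasis and windowing, matching the separate LogE branch in the diagram.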

13
Implementation of SPHINX4 Back-End

Diagram:
Compressed Features from Transmission Channel -> Speech Decoder / Feature Reconstructor -> Decoder -> Result -> Application
Decoder: Search Manager (ActiveList, Pruner, Scorer)
Linguist: AcousticModel, Dictionary, LanguageModel; builds the Search Graph
Also shown: Application, Control, Tools and Utilities, Configuration Manager
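
The hand-off between feature compression on the handset and the Feature Reconstructor on the server can be illustrated with a toy split vector quantizer. This is a sketch under stated assumptions, not the scheme used in the presented system: the class name, the pairing of consecutive features, the plain Euclidean distance and the tiny codebooks are all hypothetical, and no Sphinx4 API is used.

```java
// Toy split vector quantizer: the client sends codebook indices,
// the server looks them up to rebuild approximate features.
// Codebooks and pairing are hypothetical, for illustration only.
final class SplitVqSketch {

    // Nearest-codeword search for one (a, b) feature pair; returns the index.
    static int encodePair(double a, double b, double[][] codebook) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int k = 0; k < codebook.length; k++) {
            double da = a - codebook[k][0];
            double db = b - codebook[k][1];
            double dist = da * da + db * db;
            if (dist < bestDist) {
                bestDist = dist;
                best = k;
            }
        }
        return best;
    }

    // Front-End side: quantize consecutive feature pairs, one codebook per pair.
    static int[] encode(double[] features, double[][][] codebooks) {
        int[] indices = new int[codebooks.length];
        for (int p = 0; p < codebooks.length; p++) {
            indices[p] = encodePair(features[2 * p], features[2 * p + 1], codebooks[p]);
        }
        return indices;
    }

    // Back-End side: look the indices up to rebuild an approximate feature vector.
    static double[] decode(int[] indices, double[][][] codebooks) {
        double[] features = new double[2 * indices.length];
        for (int p = 0; p < indices.length; p++) {
            features[2 * p] = codebooks[p][indices[p]][0];
            features[2 * p + 1] = codebooks[p][indices[p]][1];
        }
        return features;
    }

    public static void main(String[] args) {
        double[][] toy = { {0.0, 0.0}, {1.0, 1.0}, {2.0, 2.0} };
        double[][][] codebooks = { toy, toy };
        double[] features = { 0.1, -0.1, 1.9, 2.2 };
        int[] sent = encode(features, codebooks);
        double[] rebuilt = decode(sent, codebooks);
        System.out.println("Indices sent: " + java.util.Arrays.toString(sent));
        System.out.println("Rebuilt features: " + java.util.Arrays.toString(rebuilt));
    }
}
```

Only the small integer indices cross the transmission channel, so both sides must hold identical codebooks; the rebuilt features are approximations of the originals, which is one source of the accuracy loss expected once the Front-End is linked in.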

14
Preliminary Results

WER (Word Error Rate): lower is better
Tested the Back-End (stand-alone) using existing
speech databases
Values are what we expect from this recognizer
Once we link the Front-End, expect results to be
acceptable but less impressive due to noise and
errors over the transmission channel
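
For reference, WER is usually computed as (substitutions + deletions + insertions) divided by the number of words in the reference transcript, using a word-level edit distance. The sketch below shows that calculation; the class and method names and the example sentences are made up for illustration and are not part of the presented system or its test data.

```java
// Word error rate via word-level Levenshtein distance.
// Names and example sentences are hypothetical.
final class WerSketch {

    // Split on whitespace; an empty or blank string yields zero words.
    private static String[] tokenize(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Minimum number of substitutions, deletions and insertions
    // needed to turn the reference words into the hypothesis words.
    static int editDistance(String[] ref, String[] hyp) {
        int[][] d = new int[ref.length + 1][hyp.length + 1];
        for (int i = 0; i <= ref.length; i++) {
            d[i][0] = i; // delete all remaining reference words
        }
        for (int j = 0; j <= hyp.length; j++) {
            d[0][j] = j; // insert all remaining hypothesis words
        }
        for (int i = 1; i <= ref.length; i++) {
            for (int j = 1; j <= hyp.length; j++) {
                int sub = d[i - 1][j - 1] + (ref[i - 1].equals(hyp[j - 1]) ? 0 : 1);
                int del = d[i - 1][j] + 1;
                int ins = d[i][j - 1] + 1;
                d[i][j] = Math.min(sub, Math.min(del, ins));
            }
        }
        return d[ref.length][hyp.length];
    }

    // WER = edit distance / number of reference words.
    static double wordErrorRate(String reference, String hypothesis) {
        String[] ref = tokenize(reference);
        if (ref.length == 0) {
            throw new IllegalArgumentException("reference must contain at least one word");
        }
        return (double) editDistance(ref, tokenize(hypothesis)) / ref.length;
    }

    public static void main(String[] args) {
        // One substituted word out of three reference words.
        System.out.println(wordErrorRate("call home now", "call phone now"));
    }
}
```

Because insertions count as errors, WER can exceed 1.0 when the recognizer outputs many more words than the reference contains.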

15
Summary

Implementing speech recognition will improve users'
experience with their devices
DSR is the best solution for implementing speech
recognition on mobile devices
More Java-enabled handsets
Implement a Java-based system
Preliminary results look promising at the Back-End
Aim is to match baseline results of the C equivalent