Speech Interface to Virtual Reality Applications - PowerPoint PPT Presentation


Title: Speech Interface to Virtual Reality Applications

1
Speech Interface to Virtual Reality Applications
Authors: Wauchope, K., S. Everett, D. Tate, T. Maney;
M. Cernak, A. Sannier

Reporter
Chun-Feng Liao

2
References
This report discusses two implementations of a speech
interface to virtual reality applications.

M. Cernak, A. Sannier, "Command Speech Interface to
Virtual Reality Applications," Technical Report,
Virtual Reality Applications Center at Iowa State
University of Science and Technology, June 2002.
Wauchope, K., S. Everett, D. Tate, T. Maney,
"Speech-Interactive Virtual Environments for Ship
Familiarization," 2nd International
EuroConference on Computer and IT Applications in
the Maritime Industries (COMPIT '03), Hamburg,
Germany, May 14-17, 2003, pp. 70-83.

4
Agenda

Introduction
Paper I
Paper II
Conclusion
System design Discussion

5
Introduction

Both papers are recent (2002, 2003).
These two papers address technical details of
speech-VR integration.
The second paper takes a more modern approach.
Both use a similar architecture (which is also
similar to ours!).

Ex: Choosing the VRML + Java Speech API platform led to
several difficult problems, such as Java security
constraints, and forced the use of the browser as a
standalone application instead of as an applet.
6
Paper I

M. Cernak, A. Sannier, "Command Speech Interface to
Virtual Reality Applications," Technical Report,
Virtual Reality Applications Center at Iowa State
University of Science and Technology, June 2002.

7
Purposes of this paper

Describe an approach to controlling VR applications
with a multimodal command speech interface (CSI)
based on dialog modeling.
Used to improve the usability of VRAC's C6.

VRAC: Virtual Reality Applications Center. C6 is
a virtual reality system developed by VRAC.
8
Multimodal Interaction
Command addressing is used to trigger the system to start
recording the user's voice for recognition.

U: MoleBio
S: Yes
U: (Targets atom 512 with the mouse)
U: Go there!
S: OK (goes to atom number 512).

U = User, S = System
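
The exchange above can be sketched as a small state machine. This is only an illustration, not the authors' implementation: the class name, the replies "Where?" and "Sorry?", and the choice to stop listening after one command are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CommandSpeechInterface:
    """Toy model of command addressing plus pointing (all names hypothetical)."""
    address_word: str = "molebio"          # keyword that addresses the system
    listening: bool = False                # True once the system was addressed
    pointed_target: Optional[str] = None   # last object targeted with the mouse

    def point_at(self, target: str) -> None:
        # Mouse modality: remember what the user is targeting.
        self.pointed_target = target

    def hear(self, utterance: str) -> str:
        # Normalize case and drop trailing punctuation and spaces.
        text = utterance.strip().lower().rstrip("!. ")
        if not self.listening:
            # Ignore all speech until the system is addressed by name.
            if text == self.address_word:
                self.listening = True
                return "Yes"
            return ""
        if text == "go there":
            if self.pointed_target is None:
                return "Where?"
            self.listening = False  # assumption: one command per addressing
            return f"OK (goto {self.pointed_target})"
        return "Sorry?"


csi = CommandSpeechInterface()
print(csi.hear("MoleBio"))     # prints "Yes"
csi.point_at("atom 512")
print(csi.hear("Go There !"))  # prints "OK (goto atom 512)"
```

The spoken word "there" carries no target on its own; it is resolved from the mouse modality, which is what makes the interaction multimodal.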
9
System Architecture
Dialog Management and Speech facilities
VR System
10
System Architecture

VR: VRAC's C6
TTS: Festival
SR: CSLU Toolkit
Platform: Windows OS on a PII 400

11
Three Main Components(1)

Speech Synthesis (TTS): Festival.

12
Three Main Components(2)

CSLU Toolkit: dialog modeling, speech recognition,
and natural language processing.
The CSLU Toolkit is implemented in C and Tcl/Tk and was
developed at OGI (Oregon Graduate Institute).

CSLU: Center for Spoken Language Understanding
14
Three Main Components(3)

Communication bridge to the VR application.
Integrates CSLU (speech) with C6 (VR).

15
How to Integrate CSLU and C6

Initial attempt: CORBA.
C6 supports CORBA.
Tried to use Combat, a Tcl extension, as the CORBA
client, but this failed.
Tried to use Tcl Blend:
Tcl -> Java -> CORBA -> C6 (efficiency problems).
Result: use a TCP socket.
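
A minimal sketch of such a socket bridge, assuming a newline-terminated text protocol. The paper's actual message format is not given here, so the command string, the "OK" reply, and the function names are hypothetical, and a thread on the loopback interface stands in for the VR side.

```python
import socket
import threading


def vr_server(listener: socket.socket, log: list) -> None:
    """Stand-in for the VR side: accept one client and ack each command line."""
    conn, _ = listener.accept()
    with conn:
        with conn.makefile("r", encoding="utf-8") as reader:
            with conn.makefile("w", encoding="utf-8") as writer:
                for line in reader:          # loop ends when the client closes
                    command = line.strip()
                    log.append(command)
                    writer.write(f"OK {command}\n")
                    writer.flush()


def send_command(reader, writer, command: str) -> str:
    """Speech side: send one command line and wait for the one-line reply."""
    writer.write(command + "\n")
    writer.flush()
    return reader.readline().strip()


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
listener.listen(1)
received = []
server = threading.Thread(target=vr_server, args=(listener, received))
server.start()

client = socket.create_connection(listener.getsockname())
with client:
    with client.makefile("r", encoding="utf-8") as reader:
        with client.makefile("w", encoding="utf-8") as writer:
            reply = send_command(reader, writer, "goto atom 512")
server.join()
listener.close()
print(reply)  # prints "OK goto atom 512"
```

A plain socket keeps the speech side independent of the VR side's language and middleware, which is the property the CORBA attempts failed to deliver in practice.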

16
Natural Language Processing

Instead of using standard JSGF, the authors defined
a custom grammar and wrote a dedicated parser to
evaluate it.
The custom grammar is very similar to JSGF.
We will not discuss the custom grammar in detail
here.
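
To give a feel for what a JSGF-style command rule expresses, here is a toy sketch. It is not the authors' grammar or parser: the rule, the words in it, and the helper names are invented for illustration, and only alternatives and optional words are modeled.

```python
from itertools import product

# Hypothetical command rule. In JSGF notation it would read roughly:
#   public <command> = (go | move) [to] (there | home);
COMMAND_RULE = [
    ("go", "move"),     # alternatives, like (go | move)
    ("to", ""),         # optional word, like [to]; "" means "omitted"
    ("there", "home"),  # alternatives, like (there | home)
]


def expand(rule):
    """Return every phrase the rule accepts, as a set of strings."""
    phrases = set()
    for choice in product(*rule):
        phrases.add(" ".join(word for word in choice if word))
    return phrases


def accepts(rule, utterance):
    """True if the utterance, lowercased and whitespace-normalized, matches."""
    return " ".join(utterance.lower().split()) in expand(rule)


print(accepts(COMMAND_RULE, "Go there"))    # prints True
print(accepts(COMMAND_RULE, "fly there"))   # prints False
```

Enumerating all phrases only works for small, non-recursive command grammars; a real grammar parser would match the rule structure directly.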

17
CSI Test Environment

A RAD (GUI) tool that helps developers quickly
build the dialog flow.

18
Paper I Conclusion

The major advantage of this system is quick
deployment.
The problem area is speech recognition accuracy
(provided by CSLU), which was poor.
The US Navy has also developed a speech interface to a VR
system; the authors plan to improve interaction with
VR along the lines of that method.

19
Future Work

Change TTS and SR to IBM ViaVoice.
Support JSAPI (Java Speech API).
Java makes it easier to communicate with C6 via CORBA.

21
Paper II

Wauchope, K., S. Everett, D. Tate, T. Maney,
"Speech-Interactive Virtual Environments for Ship
Familiarization," 2nd International
EuroConference on Computer and IT Applications in
the Maritime Industries (COMPIT '03), Hamburg,
Germany, May 14-17, 2003, pp. 70-83.

22
Introduction

This paper introduces two systems that help
newly embarked crew members of US Navy ships
become familiar with their environment quickly.

User: Tell me where Room 101 is!
23
Motivation

Architects of US Navy ships make heavy use of CAD tools
to design ship models.
CAD files can be converted to a 3D model format
with little effort.
According to the authors' previous research, this
virtual environment did shorten crews' learning
time.

24
Systems introduced

Two systems:
MSFT (Multimodal Ship Familiarization Tool)
ISFS (Interactive Ship Familiarization System)
ISFS is a recent transition of MSFT.

25
System Architecture: MSFT
Components run as different processes
26
MSFT

The VE viewer component and the speech interface run as
two separate processes.
The speech interface uses an all-IBM solution:
ViaVoice.
IBM's SMAPI.
IBM's SRCL grammar.

Platform: PIII 500 MHz
27
ISFS

A recent transition of MSFT.
Uses VRML as the 3D modeling language.
Uses JSAPI as the interface to the speech engine.
ViaVoice fully supports JSAPI.
VRML supports Java as a scripting language.
The rest of the structure is identical to the MSFT system.

Platform: Xeon 2.0 GHz -> needs more computing
power!
28
Why Choose a Standalone VRML Browser?

Security limitations (details are discussed
later).
VM limitations (details are discussed later).
Provides opportunities to customize the interface to
the VRML browser.

In my personal experience, the system usually becomes
unstable when the speech engine works with the VRML
plug-in via EAI's Java interface.
29
Security Limitations

The JRE imposes security limitations on Java applets.
JSAPI was unable to establish a connection with
the speech engine unless the security settings were
explicitly reconfigured.

30
Limited VM

Most VRML browsers' EAI implementations use
ActiveX and thus support only Microsoft's old VM,
which doesn't support most modern Java
features.
Ex: This may force us to use Java AWT instead of
Swing, which provides a better GUI.

31
Providing GUI as VUI Fallback

The GUI provides a fallback in case the speech
recognizer has trouble accurately
transcribing the user's voice.
The GUI is adjusted dynamically to provide one-to-one
correspondence with the VUI.
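
One way to keep the GUI and the VUI in one-to-one correspondence is to derive both from a single command table per dialog state. The sketch below only illustrates that idea; the states, phrases, and action names are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional

# Hypothetical dialog states and commands; not taken from the paper.
COMMANDS: Dict[str, Dict[str, str]] = {
    "navigate": {"go to room": "GOTO_ROOM", "stop": "STOP"},
    "confirm": {"yes": "CONFIRM", "no": "CANCEL"},
}


def active_phrases(state: str) -> List[str]:
    """Phrases the recognizer should accept in this state (the VUI side)."""
    return sorted(COMMANDS[state])


def button_labels(state: str) -> List[str]:
    """Buttons to show in this state (the GUI side), one per active phrase."""
    return [phrase.title() for phrase in active_phrases(state)]


def handle(state: str, user_input: str) -> Optional[str]:
    """Map a recognized phrase or a clicked button label to the same action."""
    return COMMANDS[state].get(user_input.strip().lower())


print(button_labels("navigate"))             # prints ['Go To Room', 'Stop']
print(handle("navigate", "stop"))            # spoken phrase -> STOP
print(handle("navigate", "Stop"))            # clicked button -> STOP
```

Because the buttons are regenerated from the same table whenever the dialog state changes, every utterance the recognizer might miss has a clickable equivalent.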

32
Paper II Conclusion

The speech interface is needed because the GUI and the VE
viewer both rely on direct manipulation and keep
our hands too busy.
As HCI becomes increasingly multimodal, care must
be taken to integrate the modalities in a natural manner.

33
Future Work