Title: Speech Interface to Virtual Reality Applications
1Speech Interface to Virtual Reality Applications
Authors: Wauchope, K., S. Everett, D. Tate, T. Maney; M. Cernak, A. Sannier
2References
This report discusses two implementations of a speech interface to virtual reality applications.
- M. Cernak, A. Sannier, Technical Report, "Command Speech Interface to Virtual Reality Applications," Virtual Reality Applications Center at Iowa State University of Science and Technology, June 2002.
- Wauchope, K., S. Everett, D. Tate, T. Maney, "Speech-Interactive Virtual Environments for Ship Familiarization," 2nd International EuroConference on Computer and IT Applications in the Maritime Industries (COMPIT '03), Hamburg, Germany, May 14-17, 2003, pp. 70-83.
4Agenda
- Introduction
- Paper I
- Paper II
- Conclusion
- System design Discussion
5Introduction
- Both papers are recent (2002 and 2003).
- Both address the technical details of speech-VR integration.
- The second paper takes a more modern approach.
- Both use a similar architecture (which is also similar to ours!).
Ex: The VRML and Java Speech API platform was chosen, and several difficult problems were encountered, such as Java security constraints, which forced the use of the browser as a standalone application instead of as an applet.
6Paper I
- M. Cernak, A. Sannier, Technical Report, "Command Speech Interface to Virtual Reality Applications," Virtual Reality Applications Center at Iowa State University of Science and Technology, June 2002.
7Purposes of this paper
- Describe an approach to controlling VR applications through a multimodal command speech interface (CSI) based on dialog modeling.
- The interface is used to improve the usability of VRAC's C6.
VRAC: Virtual Reality Applications Center. C6 is a virtual reality system developed by VRAC.
8Multimodal Interaction
Command addressing is used to trigger the system to start recording the user's voice for recognition.
- U: MoleBio
- S: Yes
- U: (targets atom 512 with the mouse)
- U: Go there!
- S: OK (goto atom number 512).
U: User, S: System
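The exchange above can be read as a small state machine: an address word ("MoleBio" in the example) opens a listening window, a mouse pick sets the current target, and a deictic command such as "Go there" is resolved against that target. A minimal Python sketch of this idea follows. The class name, method names, and the replies "Where?" and "Sorry?" are hypothetical and are not taken from the paper.

```python
class CommandSpeechInterface:
    """Toy model of command addressing combined with mouse targeting."""

    def __init__(self, address_word):
        self.address_word = address_word.lower()
        self.listening = False   # True once the user has addressed the system
        self.target = None       # last object picked with the mouse

    def on_mouse_pick(self, object_id):
        # Pointing modality: remember what the user targeted.
        self.target = object_id

    def on_utterance(self, text):
        words = text.lower().strip(" !.?")
        if not self.listening:
            # Everything is ignored until the address word is heard.
            if words == self.address_word:
                self.listening = True
                return "Yes"
            return None
        if words == "go there":
            if self.target is None:
                return "Where?"          # hypothetical reply: nothing targeted
            self.listening = False       # command done, wait to be addressed again
            return f"OK (goto {self.target})"
        return "Sorry?"                  # hypothetical reply: unknown command


if __name__ == "__main__":
    csi = CommandSpeechInterface("MoleBio")
    print(csi.on_utterance("MoleBio"))
    csi.on_mouse_pick("atom 512")
    print(csi.on_utterance("Go There !"))
```

Keeping the mouse target as plain state means the speech side never needs to know how the object was selected, only that something is currently targeted.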
9System Architecture
Dialog Management and Speech facilities
VR System
10System Architecture
- VR: VRAC's C6
- TTS: Festival
- SR: CSLU Toolkit
- Platform: Windows OS on a PII 400
11Three Main Components(1)
- Speech synthesis (TTS): Festival.
12Three Main Components(2)
- CSLU Toolkit: dialog modeling, speech recognition, and natural language processing.
- The CSLU Toolkit is implemented in C and Tcl/Tk and was developed at OGI (Oregon Graduate Institute).
CSLU: Center for Spoken Language Understanding
14Three Main Components(3)
- Communication bridge to the VR application.
- Integrates CSLU (speech) with C6 (VR).
15How to Integrate CSLU and C6
- Initial attempt: CORBA
- C6 supports CORBA.
- Tried to use Combat, a Tcl extension, as the CORBA client, but this failed.
- Tried to use Tcl Blend
- Tcl -> Java -> CORBA -> C6 (efficiency problems)
- Result: use a TCP socket.
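A plain TCP bridge of the kind the authors settled on can be sketched as follows. This is only an illustration in Python under stated assumptions: the newline-terminated text protocol, the "OK ..." reply, and the function names are all hypothetical, and a local stand-in server takes the place of the VR application (no real C6 endpoint is modeled).

```python
import socket
import threading


def serve_once(server_sock, handler):
    """VR side: accept one connection and answer each newline-terminated command."""
    conn, _ = server_sock.accept()
    with conn, conn.makefile("rw", encoding="utf-8", newline="\n") as stream:
        for line in stream:
            stream.write(handler(line.strip()) + "\n")
            stream.flush()


def send_command(host, port, command):
    """Speech side: send one command and wait for the one-line reply."""
    with socket.create_connection((host, port)) as sock:
        with sock.makefile("rw", encoding="utf-8", newline="\n") as stream:
            stream.write(command + "\n")
            stream.flush()
            return stream.readline().strip()


def demo():
    # Stand-in for the VR application, listening on the loopback interface.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(
        target=serve_once, args=(server, lambda cmd: "OK " + cmd), daemon=True)
    worker.start()
    reply = send_command("127.0.0.1", port, "goto 512")
    worker.join(timeout=2)
    server.close()
    return reply


if __name__ == "__main__":
    print(demo())
```

A line-oriented text protocol keeps both ends independent of language and middleware, which is the practical appeal of a socket over the Tcl -> Java -> CORBA chain.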
16Natural Language Processing
- Instead of using standard JSGF, the authors defined a custom grammar and wrote a dedicated parser to evaluate it.
- The custom grammar is very similar to JSGF.
- We will not discuss the custom grammar in detail here.
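To give a feel for what a JSGF-like command grammar and a hand-written parser involve, here is a minimal Python sketch. It is not the authors' grammar or parser, which are not reproduced in these slides. It borrows only three JSGF conventions: parentheses for grouping, `|` for alternatives, and square brackets for optional parts. The rule names and phrases are hypothetical.

```python
import re


def tokenize(rule):
    # Words plus the structural symbols ( ) [ ] |
    return re.findall(r"[()\[\]|]|[^\s()\[\]|]+", rule)


def expand(rule):
    """Return the set of word tuples accepted by a JSGF-like rule."""
    tokens = tokenize(rule)
    pos = 0

    def alternatives(closer):
        nonlocal pos
        results = set()
        while True:
            results |= sequence()
            if pos < len(tokens) and tokens[pos] == "|":
                pos += 1
                continue
            break
        if closer is not None:
            if pos >= len(tokens) or tokens[pos] != closer:
                raise ValueError(f"expected {closer!r}")
            pos += 1
        return results

    def sequence():
        nonlocal pos
        phrases = {()}
        while pos < len(tokens) and tokens[pos] not in ("|", ")", "]"):
            tok = tokens[pos]
            pos += 1
            if tok == "(":
                choices = alternatives(")")
            elif tok == "[":
                choices = alternatives("]") | {()}   # optional group
            else:
                choices = {(tok.lower(),)}
            phrases = {p + c for p in phrases for c in choices}
        return phrases

    phrases = alternatives(None)
    if pos != len(tokens):
        raise ValueError("unbalanced rule")
    return phrases


def parse(utterance, rules):
    """Return the name of the first rule that accepts the utterance, else None."""
    words = tuple(utterance.lower().split())
    for name, rule in rules.items():
        if words in expand(rule):
            return name
    return None


# Hypothetical command rules, for illustration only.
RULES = {
    "goto": "[please] go (there | home)",
    "select": "select (atom | bond)",
}

if __name__ == "__main__":
    print(parse("please go there", RULES))
```

Expanding every rule into its full phrase set is only workable for small, non-recursive command grammars; a larger grammar would need a proper matcher.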
17CSI Test Environment
- A RAD (GUI) tool that helps developers quickly build the dialog flow.
18Paper I Conclusion
- The major advantage of this system is quick deployment.
- The problematic area is speech recognition accuracy (provided by CSLU), which was poor.
- The US Navy has also developed a speech interface to a VR system; the authors intend to improve the interaction with VR along the lines of that method.
19Future Work
- Switch TTS and SR to IBM ViaVoice.
- Support JSAPI (Java Speech API).
- Java communicates with C6 via CORBA more easily.
21Paper II
- Wauchope, K., S. Everett, D. Tate, T. Maney,
"Speech-Interactive Virtual Environments for Ship
Familiarization," 2nd International
EuroConference on Computer and IT Applications in
the Maritime Industries (COMPIT '03), Hamburg,
Germany, May 14-17, 2003, pp. 70-83.
22Introduction
- This paper introduces two systems that help crews newly aboard US Navy ships become familiar with their environment quickly.
User: Tell me where Room 101 is!
23Motivation
- Architects of US Navy ships rely heavily on CAD tools to design ship models.
- CAD files can be converted to a 3D model format with little effort.
- According to the authors' previous research, this virtual environment did shorten crews' learning time.
24Systems introduced
- 2 Systems
- MSFT (Multimodal Ship Familiarization Tool)
- ISFS (Interactive Ship Familiarization System)
- ISFS is a recent transition of MSFT.
25System Architecture: MSFT
Run as separate processes
26MSFT
- The VE viewer component and the speech interface run as two separate processes.
- The speech interface uses an all-IBM solution:
- ViaVoice.
- IBM's SMAPI.
- IBM's SRCL grammar.
Platform: PIII 500 MHz
27ISFS
- A recent transition of MSFT.
- Uses VRML as the 3D modeling language.
- Uses JSAPI as the interface to the speech engine.
- ViaVoice fully supports JSAPI.
- VRML supports Java as a scripting language.
- The rest of the structure is identical to the MSFT system.
Platform: Xeon 2.0 GHz -> needs more computing power!
28Why Choose a Standalone VRML Browser?
- Security limitations (details discussed later).
- VM limitations (details discussed later).
- It provides opportunities to customize the interface to the VRML browser.
In my personal experience, the system usually becomes unstable when the speech engine works with a VRML plug-in via the EAI's Java interface.
29Security Limitations
- The JRE imposes security limitations on Java applets.
- JSAPI was unable to establish a connection with the speech engine unless the security settings were explicitly reconfigured.
30Limited VM
- Most VRML browsers' EAIs were implemented using ActiveX and thus support only Microsoft's old VM, which doesn't support most modern Java features.
- Ex: This may force us to use Java AWT instead of Swing, which provides a better GUI.
31Providing GUI as VUI Fallback
- The GUI provides a fallback in case the speech recognizer has trouble accurately transcribing the user's voice.
- The GUI is adjusted dynamically to maintain a one-to-one correspondence with the VUI.
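One way to keep a GUI in one-to-one correspondence with the voice interface is to derive both the active phrase list and the button labels from the same command table for each dialog state. The Python sketch below illustrates that design under stated assumptions: the class names, the example commands, and the confidence threshold are hypothetical and do not come from the paper, and no real GUI toolkit or recognizer is involved.

```python
class DialogState:
    """A dialog state with the commands that are valid while it is active."""

    def __init__(self, name, commands):
        self.name = name
        self.commands = commands   # maps phrase -> zero-argument action


class Frontend:
    """Keeps the voice phrases and the GUI buttons in sync."""

    def __init__(self, state):
        self.enter(state)

    def enter(self, state):
        # Both modalities are rebuilt from the same table on every state change.
        self.state = state
        self.active_phrases = sorted(state.commands)
        self.buttons = list(self.active_phrases)

    def on_speech(self, phrase, confidence, threshold=0.5):
        # Hypothetical rejection rule: low-confidence results are dropped,
        # and the user can fall back to the matching button.
        if confidence < threshold or phrase not in self.state.commands:
            return None
        return self.state.commands[phrase]()

    def on_click(self, label):
        return self.state.commands[label]()


if __name__ == "__main__":
    deck = DialogState("deck", {
        "show room": lambda: "showing room",
        "go back": lambda: "going back",
    })
    ui = Frontend(deck)
    print(ui.buttons)
    print(ui.on_speech("show room", confidence=0.2))   # rejected
    print(ui.on_click("show room"))                    # same action via the GUI
```

Because the buttons are regenerated whenever the state changes, anything the user can say at a given moment can also be clicked.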
32Paper 2 Conclusion
- The speech interface is needed because the GUI and the VE viewer both rely on direct manipulation and keep the user's hands too busy.
- As HCI becomes increasingly multimodal, care must be taken to integrate the modalities in a natural manner.
33Future Work
- VRML is closer to an object-oriented, tree-structured model.
- Such models are hard to represent in an RDBMS.
- Some way must be found to store model data easily and efficiently.
Personal thought: use an XML database.
34Discussions
35Q & A