Speech Interface to Virtual Reality Applications - PowerPoint PPT Presentation

1
Speech Interface to Virtual Reality Applications
Authors: Wauchope, K., S. Everett, D. Tate, T. Maney;
M. Cernak, A. Sannier
  • Reporter: Chun-Feng Liao

2
References
This report discusses two implementations of a speech
interface to virtual reality applications.
  • M. Cernak, A. Sannier, "Command Speech Interface to
    Virtual Reality Applications," Technical Report,
    Virtual Reality Applications Center at Iowa State
    University of Science and Technology, June 2002.
  • Wauchope, K., S. Everett, D. Tate, T. Maney,
    "Speech-Interactive Virtual Environments for Ship
    Familiarization," 2nd International
    EuroConference on Computer and IT Applications in
    the Maritime Industries (COMPIT '03), Hamburg,
    Germany, May 14-17, 2003, pp. 70-83.

3
(No Transcript)
4
Agenda
  • Introduction
  • Paper I
  • Paper II
  • Conclusion
  • System Design Discussion

5
Introduction
  • Both papers are recent (2002, 2003).
  • Both address the technical details of speech-VR
    integration.
  • The second paper takes a more modern approach.
  • Both use a similar architecture (which is also
    similar to ours).

Example: after choosing a VRML and Java Speech API
platform, the second paper's authors encountered
several difficult problems, such as Java security
constraints, and were forced to run the browser as a
standalone application instead of as an applet.
6
Paper I
  • M. Cernak, A. Sannier, "Command Speech Interface to
    Virtual Reality Applications," Technical Report,
    Virtual Reality Applications Center at Iowa State
    University of Science and Technology, June 2002.

7
Purposes of this paper
  • Describe an approach to controlling VR applications
    with a multimodal command speech interface (CSI)
    based on dialog modeling.
  • The CSI is used to improve the usability of VRAC's
    C6.

VRAC: Virtual Reality Applications Center. C6 is a
virtual reality system developed by VRAC.
8
Multimodal Interaction
Command addressing is used to trigger the system to
start recording the user's voice for recognition.
  • U: MoleBio
  • S: Yes
  • U: (targets atom 512 with the mouse)
  • U: Go there!
  • S: OK (goes to atom number 512).

U = User, S = System
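
The exchange above combines two modalities: the mouse supplies the target and the spoken command supplies the action. A minimal sketch of that fusion step follows. The `PointerState` class, the `resolve_command` function, and the reply strings are hypothetical names chosen for illustration, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PointerState:
    """Last object the user targeted with the mouse (hypothetical structure)."""
    target_id: Optional[int] = None


def resolve_command(utterance: str, pointer: PointerState) -> str:
    """Combine a recognized utterance with the current pointing target."""
    # Normalize "Go There !" to "go there".
    text = utterance.strip().lower().rstrip("!").strip()
    if text == "go there":
        if pointer.target_id is None:
            # The deictic word "there" has no referent yet.
            return "Which atom?"
        return f"goto atom {pointer.target_id}"
    return "Sorry, I did not understand."


pointer = PointerState()
pointer.target_id = 512  # the user targets atom 512 with the mouse
print(resolve_command("Go There !", pointer))
```

The point of the sketch is only that "there" cannot be resolved from speech alone; the pointing state has to be kept available to the dialog manager.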
9
System Architecture
Dialog Management and Speech facilities
VR System
10
System Architecture
  • VR: VRAC's C6
  • TTS: Festival
  • SR: CSLU Toolkit
  • Platform: Windows OS on a PII 400

11
Three Main Components (1)
  • Speech synthesis (TTS): Festival.

12
Three Main Components (2)
  • CSLU Toolkit: dialog modeling, speech recognition,
    and natural language processing.
  • The CSLU Toolkit is implemented in C and Tcl/Tk and
    was developed at OGI (Oregon Graduate Institute).

CSLU: Center for Spoken Language Understanding
13
(No Transcript)
14
Three Main Components (3)
  • Communication bridge to the VR application.
  • Integrates CSLU (speech) with C6 (VR).

15
How to Integrate CSLU and C6
  • Initial attempt: CORBA
  • C6 supports CORBA.
  • Tried Combat, a Tcl extension, as the CORBA
    client, but this failed.
  • Tried Tcl Blend:
  • Tcl -> Java -> CORBA -> C6 (efficiency problems)
  • Result: use a TCP socket.
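
A plain TCP bridge of the kind the authors settled on can be sketched as below. This is only an illustration in Python over the loopback interface; the newline-terminated text protocol, the `OK` reply, and the function names `serve_once` and `send_command` are assumptions, not details from the paper.

```python
import socket
import threading


def serve_once(server: socket.socket, received: list) -> None:
    """Stand-in for the VR side: accept one connection, read one command, acknowledge."""
    conn, _ = server.accept()
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        received.append(reader.readline().strip())
        conn.sendall(b"OK\n")


def send_command(host: str, port: int, command: str) -> str:
    """Stand-in for the speech side: send one command line and wait for the reply."""
    with socket.create_connection((host, port), timeout=5) as sock, \
            sock.makefile("r", encoding="utf-8") as reader:
        sock.sendall((command + "\n").encode("utf-8"))
        return reader.readline().strip()


server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

received = []
worker = threading.Thread(target=serve_once, args=(server, received))
worker.start()
reply = send_command("127.0.0.1", port, "goto atom 512")
worker.join()
server.close()
print(received, reply)
```

Compared with the Tcl -> Java -> CORBA chain, both sides here only need a socket library, which is presumably why this route was the one that worked.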

16
Natural Language Processing
  • Instead of the standard JSGF, the authors use a
    custom grammar and wrote a dedicated parser to
    evaluate it.
  • The custom grammar is very similar to JSGF.
  • We will not discuss the custom grammar in detail
    here.
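
For orientation only, the sketch below shows how a small command grammar with alternatives and rule references (the core ideas of JSGF) can be evaluated against an utterance. The rules, phrases, and function names are hypothetical; this is not the authors' grammar or parser.

```python
from typing import Dict, List, Set

# Hypothetical grammar: each rule maps to a list of alternatives, and each
# alternative is a token sequence. Tokens in <...> refer to other rules.
GRAMMAR: Dict[str, List[List[str]]] = {
    "command": [["go", "there"], ["rotate", "<direction>"], ["zoom", "<zoom>"]],
    "direction": [["left"], ["right"]],
    "zoom": [["in"], ["out"]],
}


def match_rule(rule: str, words: List[str], start: int) -> Set[int]:
    """Return every position where `rule` can end when matched from `start`."""
    ends: Set[int] = set()
    for alternative in GRAMMAR[rule]:
        positions = {start}
        for token in alternative:
            next_positions: Set[int] = set()
            for pos in positions:
                if token.startswith("<") and token.endswith(">"):
                    # Rule reference: match the referenced rule recursively.
                    next_positions |= match_rule(token[1:-1], words, pos)
                elif pos < len(words) and words[pos] == token:
                    # Literal word: consume it.
                    next_positions.add(pos + 1)
            positions = next_positions
        ends |= positions
    return ends


def accepts(utterance: str) -> bool:
    """True if the whole utterance is covered by the <command> rule."""
    words = utterance.lower().split()
    return len(words) in match_rule("command", words, 0)


print(accepts("rotate left"), accepts("rotate up"))
```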

17
CSI Test Environment
  • A RAD (GUI) tool that helps developers quickly
    build the dialog flow.

18
Paper I Conclusion
  • The major advantage of this system is quick
    deployment.
  • The main problem is that the speech recognition
    accuracy (provided by CSLU) was poor.
  • The US Navy has also developed a speech interface
    to a VR system; the authors plan to improve
    interaction with VR along the lines of that method.

19
Future Work
  • Change TTS and SR to IBM ViaVoice.
  • Support JSAPI (Java Speech API).
  • Java makes it easier to communicate with C6 via
    CORBA.

20
(No Transcript)
21
Paper II
  • Wauchope, K., S. Everett, D. Tate, T. Maney,
    "Speech-Interactive Virtual Environments for Ship
    Familiarization," 2nd International
    EuroConference on Computer and IT Applications in
    the Maritime Industries (COMPIT '03), Hamburg,
    Germany, May 14-17, 2003, pp. 70-83.

22
Introduction
  • This paper introduces two systems that help
    newly embarked crew members of US Navy ships
    become familiar with their environment quickly.

User: Tell me where Room 101 is!
23
Motivation
  • Architects of US Navy ships rely heavily on CAD
    tools to design ship models.
  • CAD files can be converted to a 3D model format
    with little effort.
  • According to the authors' previous research, this
    virtual environment did shorten crews' learning
    time.

24
Systems introduced
  • Two systems:
  • MSFT (Multimodal Ship Familiarization Tool)
  • ISFS (Interactive Ship Familiarization System)
  • ISFS is a recent transition of MSFT.

25
System Architecture: MSFT
The components run as separate processes.
26
MSFT
  • The VE viewer component and the speech interface
    run as two separate processes.
  • The speech interface uses an all-IBM solution:
  • ViaVoice
  • IBM's SMAPI
  • IBM's SRCL grammar

Platform: PIII 500 MHz
27
ISFS
  • A recent transition of MSFT.
  • Uses VRML as the 3D modeling language.
  • Uses JSAPI as the interface to the speech engine.
  • ViaVoice fully supports JSAPI.
  • VRML supports Java as a scripting language.
  • The rest of the structure is identical to the MSFT
    system.

Platform: Xeon 2.0 GHz -> needs more computing
power!
28
Why Choose a Standalone VRML Browser?
  • Security limitations (details discussed later).
  • VM limitations (details discussed later).
  • Provides opportunities to customize the interface
    to the VRML browser.

In my personal experience, the system usually becomes
unstable when the speech engine works with a VRML
plug-in via the EAI's Java interface.
29
Security Limitations
  • The JRE imposes security limitations on Java
    applets.
  • JSAPI was unable to establish a connection with
    the speech engine unless the security settings
    were explicitly reconfigured.

30
Limited VM
  • Most VRML browsers' EAIs were implemented using
    ActiveX and thus only support Microsoft's old VM,
    which doesn't support most modern features of
    Java.
  • Example: this may force us to use Java AWT instead
    of Swing, which provides a better GUI.

31
Providing GUI as VUI Fallback
  • The GUI provides a fallback in case the speech
    recognizer has trouble accurately transcribing
    the user's voice.
  • The GUI is adjusted dynamically to maintain a
    one-to-one correspondence with the VUI.
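
One way to keep the GUI and the VUI in one-to-one correspondence is to derive both from the same table of currently active commands. The sketch below assumes this design; the dialog states, command phrases, and function names are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical dialog states and the commands that are valid in each one.
ACTIVE_COMMANDS: Dict[str, List[str]] = {
    "main": ["find compartment", "show deck plan"],
    "find compartment": ["cancel"],
}


def gui_buttons(state: str) -> List[str]:
    """Buttons shown in `state`: exactly the phrases the recognizer accepts there."""
    return list(ACTIVE_COMMANDS[state])


def dispatch(state: str, command: str) -> Tuple[str, bool]:
    """Shared handler for spoken and clicked commands; returns (next_state, accepted)."""
    if command not in ACTIVE_COMMANDS[state]:
        return state, False
    # Commands that name a state open it; all others return to the main state.
    next_state = command if command in ACTIVE_COMMANDS else "main"
    return next_state, True


state = "main"
print(gui_buttons(state))
state, accepted = dispatch(state, "find compartment")  # spoken or clicked
print(state, accepted, gui_buttons(state))
```

Because a click and a recognized phrase go through the same `dispatch` call, the user can fall back to the GUI at any point without the dialog state diverging.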

32
Paper II Conclusion
  • The speech interface is needed because the GUI and
    the VE viewer both rely on direct manipulation and
    keep the user's hands too busy.
  • As HCI becomes increasingly multimodal, care must
    be taken to integrate the modalities in a natural
    manner.

33
Future Work
  • VRML is closer to an object-oriented,
    tree-structured model.
  • Such models are hard to represent in an RDBMS.
  • A way must be found to store model data easily and
    efficiently.

Personal thought: use an XML database.
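
To illustrate why XML suits this kind of data, the sketch below serializes a small tree-structured scene to XML and queries it by path, using Python's standard `xml.etree.ElementTree` module. The scene contents and the `to_xml` helper are hypothetical, and an in-memory XML tree stands in for an actual XML database.

```python
import xml.etree.ElementTree as ET

# Hypothetical scene graph: nested (node type, name, children) tuples.
SCENE = ("Transform", "ship", [
    ("Transform", "deck1", [
        ("Shape", "room101", []),
        ("Shape", "room102", []),
    ]),
])


def to_xml(node) -> ET.Element:
    """Convert one scene node and its subtree into nested XML elements."""
    node_type, name, children = node
    element = ET.Element(node_type, {"name": name})
    for child in children:
        element.append(to_xml(child))
    return element


root = to_xml(SCENE)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)

# A path query finds a node anywhere in the tree without join tables.
room = root.find(".//Shape[@name='room101']")
print(room.get("name"))
```

The nesting of the model is preserved directly in the document, whereas a relational schema would need parent-child tables and recursive queries to express the same hierarchy.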
34
Discussions
35
Q & A