Speech Interfaces to Virtual Reality


1
Speech Interfaces to Virtual Reality
  • Scott McGlashan
  • Swedish Institute of Computer Science
  • Reporter: try

2
Agenda
  • Purposes of this paper
  • Speech Interface
  • Problems of Integration
  • DIVERSE System
  • Enhancing DIVERSE
  • Conclusion

3
Purposes of this paper
  • Analyze the technical and design issues of
    combining a virtual world with a speech interface.
  • Describe the architecture of the DIVERSE system.
  • Enhance DIVERSE to allow users to talk directly
    to agents in the virtual world.

4
Integrating Speech Interface
  • Combine speech and direct manipulation to form
    a multimodal interface.
  • Users are free to decide which modality to use.
  • In a multimodal user interface, users can issue
    more commands with less effort.

5
Benefits of Speech Interface
  • Naturalness
  • Hands / eyes freedom
  • Beyond the here and now: users can refer to
    objects that are not present in their current
    view.

6
Successful Speech Interface System
  • Speech interfaces are beneficial when they are
    more efficient than the alternatives.
  • They have been successfully deployed as part of
    interactive dialogue systems in limited task
    domains.
  • Ex: banking services, travel services.

7
Features of Successful Speech Interface
  • Restricted Language
  • Incremental Information Transfer
  • Feedback
  • Dialogue Management

8
Problems of Integration
  • Speech Recognition: limited vocabulary to gain
    accuracy.
  • Language Understanding: limited knowledge to
    maximize understanding.
  • Interaction Metaphor: who does the user talk to?

9
Speech Recognition
  • The most successful method for acoustic-phonetic
    modelling is the hidden Markov model (HMM).
  • An N-gram language model is also integrated into
    the recognition process.
  • http://java.cc.nccu.edu.tw/pr/report.htm
  • The result is a speaker-independent, continuous
    speech system.
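
As a rough illustration of how an N-gram language model can steer recognition, the sketch below trains a bigram model on a tiny, made-up command corpus and uses it to rank two acoustically similar hypotheses. The corpus, the add-one smoothing, and all function names are assumptions for illustration; the slides do not describe the recognizer at this level of detail, and the HMM acoustic model is left out entirely.

```python
import math
from collections import Counter

# Hypothetical toy corpus of commands; a real system trains on far more data.
CORPUS = [
    "move the red box",
    "bring me the red box",
    "move the blue cube",
    "bring me that box",
]

def train_bigrams(sentences):
    """Count unigrams and bigrams, with <s> marking the sentence start."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def log_prob(sentence, unigrams, bigrams):
    """Log-probability of a sentence under an add-one smoothed bigram model."""
    vocab_size = len(unigrams)
    words = ["<s>"] + sentence.split()
    total = 0.0
    for prev, word in zip(words, words[1:]):
        numerator = bigrams[(prev, word)] + 1
        denominator = unigrams[prev] + vocab_size
        total += math.log(numerator / denominator)
    return total

unigrams, bigrams = train_bigrams(CORPUS)
hypotheses = ["move the red box", "move the read box"]
best = max(hypotheses, key=lambda h: log_prob(h, unigrams, bigrams))
print(best)
```

Because "the read" and "read box" never occur in the corpus, the model prefers the hypothesis containing "red", even though the two sound alike.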

12
Language Understanding

13
Interaction Metaphor
  • Direct manipulation: personal presence.
  • Various metaphors for spoken interaction have
    been proposed:
  • Proxy
  • Divinity
  • Telekinesis
  • Interface Agent

14
DIVE - Virtual Reality System
  • DIVE (Distributed Interactive Virtual Environment)
    is a multi-user virtual environment.
  • DIVE allows users and the environment to interact
    in real time.
  • DIVE contains a database of hierarchically
    organized objects.

15
DIVERSE
  • Augments DIVE by adding a multimodal interface.
  • The speech interface is part of the DIVERSE system.
  • Its focus was the multimodal interface, not the
    speech interface.

16
The DIVERSE System
17
DIVERSE Features
  • SR: Woodland; output is a text string.
  • TTS: INFOVOX.
  • Does not manage the interaction for users.
  • Does not confirm information.
  • Does not correct errors when they arise.
  • Always updates the world according to the
    recognition results.
  • No dialogue management.

18
Reference Resolution of DIVERSE
  • Object Focus
  • Property Perception

19
Object Focus
  • A combination of parameters.
  • These parameters have priorities and may
    persist/decay over time.
  • Ex: an object that is being pointed at is more in
    focus than one that is merely in the visual field.

Bring me that box!
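
One way to picture object focus is as a score built from prioritized cues that fade over time. The sketch below is a minimal illustration under stated assumptions: the cue names, their priorities, and the exponential half-life are invented here, not taken from DIVERSE.

```python
from dataclasses import dataclass, field

# Hypothetical priorities: pointing outweighs merely being in the visual field.
PRIORITY = {"pointed_at": 3.0, "mentioned": 2.0, "in_view": 1.0}
HALF_LIFE = 5.0  # seconds until a cue loses half its weight (assumed value)

@dataclass
class FocusObject:
    name: str
    # Maps a cue name to the time (in seconds) at which it was last observed.
    cues: dict = field(default_factory=dict)

    def focus(self, now):
        """Sum the priority of each cue, decayed by the time since it occurred."""
        score = 0.0
        for cue, seen_at in self.cues.items():
            age = max(0.0, now - seen_at)
            score += PRIORITY[cue] * 0.5 ** (age / HALF_LIFE)
        return score

def resolve(objects, now):
    """Return the object with the highest focus score."""
    return max(objects, key=lambda obj: obj.focus(now))

# Both boxes are in view, but box_a was pointed at one second ago.
box_a = FocusObject("box_a", {"in_view": 10.0, "pointed_at": 9.0})
box_b = FocusObject("box_b", {"in_view": 10.0})
print(resolve([box_a, box_b], now=10.0).name)
```

With these values, "that box" resolves to the box the user pointed at, and its advantage shrinks as the pointing gesture ages.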
21
Property Perception
  • A property holds of an object if the semantic
    value of the property is the best fit for that
    object.
  • Ex: "Move the red object."
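
A best-fit reading of "the red object" can be sketched as a nearest-prototype search: the object whose colour lies closest to a prototype for "red" is chosen, even if nothing in the scene is pure red. The prototype table, the RGB distance, and the scene below are assumptions for illustration only.

```python
# Hypothetical prototype RGB values for a few colour words.
COLOR_PROTOTYPES = {
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
}

def distance(rgb_a, rgb_b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((a - b) ** 2 for a, b in zip(rgb_a, rgb_b))

def best_fit(color_word, objects):
    """Return the name of the object whose colour is closest to the prototype."""
    target = COLOR_PROTOTYPES[color_word]
    return min(objects, key=lambda name: distance(objects[name], target))

scene = {
    "crate": (200, 40, 30),    # reddish, but not pure red
    "barrel": (30, 60, 200),   # bluish
    "lamp": (120, 120, 120),   # grey
}
print(best_fit("red", scene))
```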

22
Enhancing DIVERSE
  • SR and Language Understanding
  • Reference Resolution
  • Discourse Modeling
  • Robustness
  • Confirmation and Error Handling
  • Talking Agents

23
SR and Language Understanding
  • One of the main weaknesses of DIVERSE is its SR
    accuracy.
  • Switch to a better SR engine.
  • Use pre-defined grammars to increase accuracy.
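
The effect of a pre-defined grammar can be sketched as a filter over the recognizer's candidate strings. A real engine would use the grammar to constrain its search directly; the version below only checks an n-best list afterwards, which is a simplification. The vocabulary and the command pattern are hypothetical.

```python
import re

# Hypothetical command grammar: VERB "the" [COLOR] OBJECT
VERBS = ("move", "bring", "delete")
COLORS = ("red", "green", "blue")
OBJECTS = ("box", "cube", "tank")

GRAMMAR = re.compile(
    r"^(?P<verb>{})\s+the\s+(?:(?P<color>{})\s+)?(?P<object>{})$".format(
        "|".join(VERBS), "|".join(COLORS), "|".join(OBJECTS)
    )
)

def parse(utterance):
    """Return the parsed slots, or None if the utterance is out of grammar."""
    match = GRAMMAR.match(utterance.strip().lower())
    return match.groupdict() if match else None

def pick_hypothesis(n_best):
    """Take the first recognizer hypothesis that the grammar accepts."""
    for text in n_best:
        slots = parse(text)
        if slots is not None:
            return slots
    return None

print(pick_hypothesis(["move the read box", "move the red box"]))
```

The misrecognized "read" is outside the grammar, so the first hypothesis is rejected and the second one is used.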

24
Discourse Modeling
  • The search is inefficient if the search space is
    always the whole virtual environment.
  • With discourse modeling, we can constrain the
    search space.
  • Ex: for "Bring me the cube.", the reference to the
    cube should be resolved only within the user's
    field of view.
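
The idea of constraining the search space can be sketched as follows: a definite reference is looked up only among the objects currently visible to the user, never in the whole world. The `WorldObject` type and its `visible` flag are assumptions; in a real system visibility would come from the renderer.

```python
from dataclasses import dataclass

@dataclass
class WorldObject:
    name: str
    kind: str
    visible: bool  # assumed to be supplied by the renderer

def resolve_reference(kind, world):
    """Resolve a definite reference such as "the cube" among visible objects only."""
    candidates = [obj for obj in world if obj.visible]
    matches = [obj for obj in candidates if obj.kind == kind]
    # Exactly one match resolves the reference; otherwise the caller
    # should ask the user to clarify.
    return matches[0] if len(matches) == 1 else None

world = [
    WorldObject("cube_1", "cube", visible=True),
    WorldObject("cube_2", "cube", visible=False),
    WorldObject("box_1", "box", visible=True),
]
print(resolve_reference("cube", world).name)
```

Searching the whole world would find two cubes and leave the reference ambiguous; restricting it to the field of view leaves a single candidate.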

25
Talking Agents
  • Interaction metaphor: Interface Agent.
  • Talking agents and non-talking agents.
  • Interaction metaphor: proxy.

26
Agent Modeling Framework
  • The parent agent consists of basic functions.
  • We can define more specific agents by extending
    the parent agent.
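
In object-oriented terms, this framework resembles a base class with specialized subclasses. The sketch below is only an analogy under that assumption: the `Agent` and `DoorAgent` classes and their methods are hypothetical and are not the framework's actual interface.

```python
class Agent:
    """Parent agent: holds the functions every agent shares."""

    def __init__(self, name):
        self.name = name

    def greet(self):
        return f"{self.name} here."

    def handle(self, utterance):
        """Default behaviour for utterances no subclass understands."""
        return "Sorry, I did not understand."


class DoorAgent(Agent):
    """A more specific agent that adds its own commands."""

    def __init__(self, name):
        super().__init__(name)
        self.is_open = False

    def handle(self, utterance):
        if utterance == "open":
            self.is_open = True
            return "Door opened."
        if utterance == "close":
            self.is_open = False
            return "Door closed."
        # Fall back to the parent agent for everything else.
        return super().handle(utterance)


door = DoorAgent("Door 1")
print(door.greet())
print(door.handle("open"))
```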

27
Example: Launcher Agent
  • Launcher: Launcher 476 here.
  • User: Target red tank.
  • Launcher: Please specify coordinates of red tank.
  • User: 437,342.
  • Launcher: Targeted red tank at 437,342. Launch
    missile?
  • User: Confirm missile launch.
  • Launcher: Missile launched. Over and out?
  • User: Over and out.
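
A dialogue like this one, which collects missing information and confirms before acting, can be sketched as a small state machine. The class name, the state names, and the "cancel" command are assumptions added for illustration; only the prompts mirror the example above.

```python
import re

class LauncherDialogue:
    """Tiny state machine: collect a target, coordinates, then a confirmation."""

    def __init__(self):
        self.state = "await_target"
        self.target = None
        self.coordinates = None
        self.launched = False

    def respond(self, utterance):
        text = utterance.strip().lower()
        if self.state == "await_target" and text.startswith("target "):
            self.target = text[len("target "):]
            self.state = "await_coordinates"
            return f"Please specify coordinates of {self.target}."
        if self.state == "await_coordinates":
            match = re.fullmatch(r"(\d+)\s*,\s*(\d+)", text)
            if match:
                self.coordinates = (int(match.group(1)), int(match.group(2)))
                self.state = "await_confirmation"
                x, y = self.coordinates
                return f"Targeted {self.target} at {x},{y}. Launch missile?"
        if self.state == "await_confirmation":
            if text == "confirm missile launch":
                self.launched = True
                self.state = "done"
                return "Missile launched. Over and out?"
            if text == "cancel":  # assumed escape hatch, not in the example
                self.state = "await_target"
                return "Launch cancelled."
        # Anything unexpected is rejected instead of changing the world.
        return "Please repeat."

dialogue = LauncherDialogue()
for line in ["target red tank", "437,342", "confirm missile launch"]:
    print(dialogue.respond(line))
```

Unlike the original DIVERSE behaviour of always updating the world from the recognition result, nothing is launched here until the user has explicitly confirmed.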

28
Conclusion
  • Combining speech with virtual worlds provides
    natural interaction.
  • Speech interfaces work well when they cooperate
    with other user interfaces.
  • The authors enhance DIVERSE to gain further
    benefits.

29
Q & A
30
Backup
31
DIVERSE System Architecture