Designing Speech Interfaces for Kiosks - PowerPoint PPT Presentation

Learn more at: https://people.csail.mit.edu

1
Designing Speech Interfaces for Kiosks
Max Van Kleek, Buddhika Kottahachchi, Tyler Horton, Paul Cavallaro
2
AGENDA

Background
Motivation
Design
Current Implementation
Demo (Video)
Evaluation
Conclusions and Future Work

3
(No Transcript)
4
BACKGROUND - Smart Kiosk Information Navigation and Noteposting Interface (SKINNI)

Provide timely, relevant information to visitors and members of the CSAIL community through a touchscreen GUI
7
MOTIVATION

Searching for specific information via touchscreen GUIs feels tedious and error-prone
  - more time consuming than desirable
  - poor pointing accuracy
  - widgets behave differently on touchscreens
  - no tactile feedback
Optimizing the GUI for touchscreens, and adding shortcuts to allow searching/rapid information access, yielded limited success
  - screen clutter
  - new vs experienced users
  - forced user to use attached keyboard

8
(No Transcript)
9
(No Transcript)
10
(No Transcript)
11
DESIGN - Speech Challenges

Robustness
  - Speaker independence
  - Speech disfluencies and accents
  - Signal capture in noisy environments
  ...achieving good recognition accuracy.
Usability
  - Low threshold of use
  - Initial learning curve
  - Visibility of system state
  - Handling misrecognition errors gracefully
  - Managing user expectations

Related work
  - ESPRIT MASK project, Gavin et al. (1996)
  - Smart Kiosk project, Christian et al. (2000)
12
DESIGN - Galaxy

Galaxy gives us...
  - Speaker independence
  - Handling of speech disfluencies/accents
SpeechBuilder gives us...
  - Ease of speech domain definition/manipulation
Distributed architecture lends itself well to kiosks
  - Thin clients dependent on more powerful servers

13
IMPLEMENTATION - Architecture
14
IMPLEMENTATION - Speech Domain

Constrained domain
  - Only directory field and map queries
Iterative design
  - Initial domain extended through informal user survey

<opt speechbuilder="4.0">
<class type="Action" name="show_room">
  <entry>where is room thirty two two two six A</entry>
  <entry>can you please (show me | tell me) a map of where room thirty two two two six A</entry>
  <entry>can you please (show me | tell me) a map of where is Ben Bitdiddle office is</entry>
  <entry>Do you know where is Ben Bitdiddle office is</entry>
</class>
<class type="Key" name="Person">
  <entry>Hal Abelson</entry>
  <entry>Bryan Adams</entry>
  <entry>Edward Adelson</entry>
  . .
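
To make the structure of the domain definition concrete, here is a minimal Python sketch that reads a domain of this shape and expands parenthesized alternatives into individual example sentences. It is an illustration only, not SpeechBuilder's own tooling: the closing tags, the `|` separator inside parentheses, and the helper names `load_domain` and `expand_alternatives` are all assumptions made for this example, since the slide excerpt is truncated.

```python
import itertools
import re
import xml.etree.ElementTree as ET

# Well-formed stand-in for the slide's excerpt. The closing tags and the "|"
# separator inside parentheses are assumptions; the slide itself is truncated.
DOMAIN_XML = """
<opt speechbuilder="4.0">
  <class type="Action" name="show_room">
    <entry>where is room thirty two two two six A</entry>
    <entry>can you please (show me | tell me) a map of where is Ben Bitdiddle office is</entry>
  </class>
  <class type="Key" name="Person">
    <entry>Hal Abelson</entry>
    <entry>Bryan Adams</entry>
    <entry>Edward Adelson</entry>
  </class>
</opt>
"""


def load_domain(xml_text):
    """Return {class type: {class name: [entry text, ...]}} (hypothetical helper)."""
    domain = {}
    for cls in ET.fromstring(xml_text).iter("class"):
        entries = [entry.text.strip() for entry in cls.iter("entry")]
        domain.setdefault(cls.get("type"), {})[cls.get("name")] = entries
    return domain


def expand_alternatives(sentence):
    """Expand each "(a | b)" group into one concrete sentence per choice."""
    parts = re.split(r"\(([^()]*)\)", sentence)
    # With a capturing group, odd-indexed parts hold the group contents.
    choices = [
        [alt.strip() for alt in part.split("|")] if index % 2 else [part]
        for index, part in enumerate(parts)
    ]
    # Join each combination and normalize whitespace.
    return [" ".join("".join(combo).split()) for combo in itertools.product(*choices)]


if __name__ == "__main__":
    domain = load_domain(DOMAIN_XML)
    print(domain["Key"]["Person"])
    for template in domain["Action"]["show_room"]:
        for sentence in expand_alternatives(template):
            print(sentence)
```

Keeping the "Key" classes (such as Person) separate from the "Action" example sentences is what lets the directory be extended with new names without rewriting the example queries.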
15
IMPLEMENTATION - Innovation

Speech state feedback GUI
  - Provides immediate visual feedback of the system state
  - What was recognized?
  - Is the system ready for interaction?
  - Is the system busy?
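
The three questions the feedback GUI answers can be modeled as a small piece of state. The Python sketch below is a hypothetical illustration, not the kiosk's actual code: the names `SpeechState` and `FeedbackPanel`, the event methods, and the message wording are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class SpeechState(Enum):
    # Message wording is illustrative, not taken from the kiosk.
    READY = "Ready: please speak"
    BUSY = "Busy: processing your request"


@dataclass
class FeedbackPanel:
    """Tracks what the user should see: system state and last recognized text."""

    state: SpeechState = SpeechState.READY
    last_recognized: str = ""

    def on_utterance_captured(self):
        # Audio has been handed to the recognizer; tell the user to wait.
        self.state = SpeechState.BUSY

    def on_recognition_result(self, text):
        # Show what was recognized and signal readiness for the next query.
        self.last_recognized = text
        self.state = SpeechState.READY

    def render(self):
        heard = self.last_recognized or "(nothing yet)"
        return f"[{self.state.name}] {self.state.value} | Heard: {heard}"


if __name__ == "__main__":
    panel = FeedbackPanel()
    print(panel.render())
    panel.on_utterance_captured()
    print(panel.render())
    panel.on_recognition_result("where is room thirty two two two six A")
    print(panel.render())
```

Echoing the recognized text back to the user is what lets them see why a query failed, for example when a name was misheard.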

16
IMPLEMENTATION - Innovation

Advantages
  - User is made aware of what the system is trying to do
  - Reasons for recognition failures can be determined
  - Initial familiarization process is much smoother
  - User retention increases
Disadvantages
  - Isn't helpful for visually impaired users
  - Takes up display space

17
DEMO
18
EVALUATION - Methodology

Informal user study
10 subjects (lab members, not representative)
Task
  - Look up the phone number for 18 randomly selected lab members
  - First 6 using the speech interface
  - Second 6 using the touchscreen interface
  - Final 6 using the preferred interface
Metric
  - Time taken
  - From when the name to be looked up is provided to the subject
  - To when the subject retrieves the number from the kiosk
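
The protocol above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `plan_trials` and `mean_time_by_interface`, the record layout, and the timings used in the usage example are hypothetical and are not data from the study.

```python
import random
from statistics import mean


def plan_trials(lab_members, preferred, seed=None):
    """Pick 18 random names: 6 speech, 6 touchscreen, 6 on the preferred interface."""
    rng = random.Random(seed)
    names = rng.sample(lab_members, 18)
    interfaces = ["speech"] * 6 + ["touchscreen"] * 6 + [preferred] * 6
    return list(zip(names, interfaces))


def mean_time_by_interface(trials):
    """trials: iterable of (interface, start_seconds, end_seconds).

    start is when the name is given to the subject; end is when the subject
    has retrieved the phone number from the kiosk.
    """
    durations = {}
    for interface, start, end in trials:
        durations.setdefault(interface, []).append(end - start)
    return {interface: mean(times) for interface, times in durations.items()}


if __name__ == "__main__":
    members = [f"member{i}" for i in range(30)]  # placeholder names
    for name, interface in plan_trials(members, preferred="speech", seed=1):
        print(interface, name)
    # Made-up timings, purely to show the calculation.
    print(mean_time_by_interface([
        ("speech", 0.0, 4.0),
        ("speech", 10.0, 16.0),
        ("touchscreen", 0.0, 9.0),
    ]))
```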

19
EVALUATION - Results

Subjects were not aware of supported query forms
  - recognition rate in the first 2 queries: 50%
  - thereafter: 72%
8/10 subjects preferred the speech interface
When recognition was successful, performance was consistently better!

20
CONCLUSIONS

Users are receptive to using speech interfaces
Failed recognition imposes severe penalties on performance
Ramp-up time can be reduced and user retention increased by providing appropriate feedback

21
FUTURE WORK

Improve recognition rates
  - Improve speech domain
  - Update voice models (current ones from phone data)
Further evaluation
Extend speech interface to support all functionality exposed via touchscreen interface
Conversation support
  - dialog and discourse management
Multi-language support
  - Stata visitors come from all over the world