Designing Speech Interfaces for Kiosks

Transcript and Presenter's Notes

Title: Designing Speech Interfaces for Kiosks


1
Designing Speech Interfaces for Kiosks
Max Van Kleek, Buddhika Kottahachchi, Tyler Horton, Paul Cavallaro
2
AGENDA
  • Background
  • Motivation
  • Design
  • Current Implementation
  • Demo (Video)
  • Evaluation
  • Conclusions and Future Work

3
(No Transcript)
4
BACKGROUND - Smart Kiosk Information Navigation
and Noteposting Interface (SKINNI)
Provide timely, relevant information to visitors
and members of the CSAIL community through a
touchscreen GUI
7
MOTIVATION
  • Searching for specific information via
    touchscreen GUIs feels tedious and error-prone
      - more time-consuming than desirable
      - poor pointing accuracy
      - widgets behave differently on touchscreens
      - no tactile feedback
  • Optimizing the GUI for touchscreens and adding
    shortcuts for searching and rapid information
    access yielded limited success
      - screen clutter
      - new vs. experienced users
      - forced users to use the attached keyboard

8
(No Transcript)
9
(No Transcript)
10
(No Transcript)
11
DESIGN - Speech Challenges
  • Robustness
      - Speaker independence
      - Speech disfluencies and accents
      - Signal capture in noisy environments
    ...achieving good recognition accuracy.
  • Usability
      - Low threshold of use
      - Initial learning curve
      - Visibility of system state
      - Handling misrecognition errors gracefully
      - Managing user expectations

Related work:
  • ESPRIT MASK project, Gavin et al. (1996)
  • Smart Kiosk project, Christian et al. (2000)
12
DESIGN - Galaxy
  • Galaxy gives us...
      - Speaker independence
      - Handling of speech disfluencies/accents
  • SpeechBuilder gives us...
      - Ease of speech domain definition/manipulation
  • Distributed architecture lends itself well to
    kiosks
      - Thin clients dependent on more powerful servers

13
IMPLEMENTATION - Architecture
14
IMPLEMENTATION - Speech Domain
  • Constrained domain
      - Only directory field and map queries
  • Iterative design
      - Initial domain extended through informal user survey

<opt speechbuilder="4.0">
  <class type="Action" name="show_room">
    <entry>where is room thirty two two two six A</entry>
    <entry>can you please (show me | tell me) a map of where room thirty two two two six A</entry>
    <entry>can you please (show me | tell me) a map of where is Ben Bitdiddle office is</entry>
    <entry>Do you know where is Ben Bitdiddle office is</entry>
  </class>
  <class type="Key" name="Person">
    <entry>Hal Abelson</entry>
    <entry>Bryan Adams</entry>
    <entry>Edward Adelson</entry>
    . .
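
To make the example sentences above concrete, here is a minimal Python sketch of how one entry with a parenthesized group could be expanded into the individual sentences it covers. It assumes the group lists alternatives separated by `|` (the separator may not survive in this transcript); the function name `expand` is hypothetical and is not part of SpeechBuilder or Galaxy.

```python
import itertools
import re

# Matches one non-nested group of alternatives, e.g. "(show me | tell me)".
# The "|" separator is an assumption about the entry syntax.
GROUP = re.compile(r"\(([^()]*)\)")


def expand(template: str) -> list[str]:
    """Return every sentence a template with (a | b) groups can produce."""
    # With one capture group, re.split alternates literal text (even
    # indices) and group bodies (odd indices).
    parts = GROUP.split(template)
    choices = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            choices.append([part])
        else:
            choices.append([alt.strip() for alt in part.split("|")])
    # Join each combination and normalize whitespace.
    return [" ".join("".join(combo).split())
            for combo in itertools.product(*choices)]


if __name__ == "__main__":
    for sentence in expand("can you please (show me | tell me) a map"):
        print(sentence)
```

A template with no group simply expands to itself, so plain entries and entries with alternatives can be handled by the same call.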
15
IMPLEMENTATION - Innovation
  • Speech state feedback GUI
      - Provides immediate visual feedback of the system state
      - What was recognized?
      - Is the system ready for interaction?
      - Is the system busy?
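
The three questions above map naturally onto a small state machine. The following Python sketch is only an illustration of that idea, not the kiosk's actual implementation: the state names, event names, status strings, and the `FeedbackPanel` class are all hypothetical.

```python
from enum import Enum


class SpeechState(Enum):
    READY = "ready"            # system is ready for interaction
    BUSY = "busy"              # system is processing speech
    RECOGNIZED = "recognized"  # system is showing what it recognized


# Hypothetical events and the state changes they trigger.
TRANSITIONS = {
    (SpeechState.READY, "speech_started"): SpeechState.BUSY,
    (SpeechState.BUSY, "result"): SpeechState.RECOGNIZED,
    (SpeechState.BUSY, "failed"): SpeechState.READY,
    (SpeechState.RECOGNIZED, "displayed"): SpeechState.READY,
}


class FeedbackPanel:
    """Tracks the speech state and produces the text shown to the user."""

    def __init__(self) -> None:
        self.state = SpeechState.READY
        self.last_heard = ""

    def handle(self, event: str, text: str = "") -> str:
        key = (self.state, event)
        if key in TRANSITIONS:  # events that do not apply are ignored
            self.state = TRANSITIONS[key]
            if event == "result":
                self.last_heard = text
        return self.status_line()

    def status_line(self) -> str:
        if self.state is SpeechState.READY:
            return "Ready: please speak"
        if self.state is SpeechState.BUSY:
            return "Busy: processing speech"
        return f'Recognized: "{self.last_heard}"'
```

Keeping the transitions in one table makes it easy to see that every state leads back to READY, so the display cannot get stuck showing "busy" after a failed recognition.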

16
IMPLEMENTATION - Innovation
  • Advantages
      - User is made aware of what the system is trying to do
      - Reasons for recognition failures can be determined
      - Initial familiarization process is much smoother
      - User retention increases
  • Disadvantages
      - Isn't helpful for visually impaired users
      - Takes up display space

17
DEMO
18
EVALUATION - Methodology
  • Informal user study
  • 10 subjects (lab members; not representative)
  • Task
      - Look up the phone number for 18 randomly selected lab members
      - First 6 using the speech interface
      - Second 6 using the touchscreen interface
      - Final 6 using the preferred interface
  • Metric
      - Time taken
      - From when the name to be looked up is provided to the subject
      - To when the subject retrieves the number from the kiosk
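
The time-taken metric described above can be sketched in a few lines of Python. The `Trial` record and `mean_time_by_interface` helper are hypothetical names introduced for illustration; the slides do not say how the timings were recorded or aggregated, so averaging per interface is an assumption.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trial:
    interface: str               # e.g. "speech" or "touchscreen"
    name_given_at: float         # seconds: name provided to the subject
    number_retrieved_at: float   # seconds: subject retrieves the number

    @property
    def time_taken(self) -> float:
        """Elapsed time between the two events that define the metric."""
        return self.number_retrieved_at - self.name_given_at


def mean_time_by_interface(trials: list[Trial]) -> dict[str, float]:
    """Average the time taken separately for each interface."""
    grouped: dict[str, list[float]] = {}
    for trial in trials:
        grouped.setdefault(trial.interface, []).append(trial.time_taken)
    return {name: mean(times) for name, times in grouped.items()}
```

Grouping by interface keeps the speech and touchscreen lookups comparable even when subjects choose different interfaces for the final six lookups.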

19
EVALUATION - Results
  • Subjects were not aware of supported query forms
      - recognition rate in the first 2 queries: 50%
      - thereafter: 72%
  • 8/10 subjects preferred the speech interface
  • When recognition was successful, performance
    was consistently better!

20
CONCLUSIONS
  • Users are receptive to using speech interfaces
  • Failed recognition imposes severe penalties on
    performance
  • Ramp-up time can be reduced and user
    retention increased by providing appropriate
    feedback

21
FUTURE WORK
  • Improve recognition rates
      - Improve speech domain
      - Update voice models (current ones are from phone data)
  • Further evaluation
  • Extend speech interface to support all
    functionality exposed via the touchscreen interface
  • Conversation support
      - dialog and discourse management
  • Multi-language support
      - Stata visitors come from all over the world