SpeechBuilder: Facilitating Spoken Dialogue System Creation

1
SpeechBuilder: Facilitating Spoken Dialogue
System Creation
  • Eugene Weinstein
  • Project Oxygen Core Team
  • MIT Laboratory for Computer Science
  • ecoder_at_mit.edu

2
Bridging the Experience Gap
  • Developing robust, mixed-initiative spoken
    dialogue systems is difficult
  • Complex systems can be created by human-language
    technology experts
  • Novice developers must overcome a considerable
    technical challenge
  • SpeechBuilder aims to help novices rapidly create
    speech-based systems
  • Uses intuitive methods for specifying
    domain-specific constraints
  • Automatically configures HLT components using MIT
    GALAXY architecture
  • Leverages future technical advances
  • Encourages research on portability

3
Baseline Configuration
  • Gives developer total control over application
    functionality
  • Communication with Galaxy via simple HTTP protocol

[Diagram label: Developer Application]
4
Modified Baseline Configuration (this class)
  • Still gives developer total control over
    application functionality
  • Frame Relay server exposes Galaxy meaning
    representation to app

[Diagram label: Developer Application]
5
Database Access Configuration
  • For a speech-based interface to structured data
  • No programming required: specify table(s) and
    constraints

6
Creating a Speech-Based Application
Step 1: Off-line creation and compilation
Step 2: On-line deployment
7
Human Language Technologies
8
Extracting Database Information
What is the phone number for Victor Zue?
  • Some columns are used to access entries (e.g.,
    Name)
  • Column entries must be incorporated into ASR
    and NLU
  • Some columns are only used in responses (e.g.,
    Phone)
  • Column names must be incorporated into ASR and NLU
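
The column roles described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table rows, the placeholder values, and the helper names `build_vocabulary` and `answer` are all hypothetical and are not part of SpeechBuilder.

```python
# Hypothetical directory table; the values are placeholders, not real data.
ROWS = [
    {"Name": "Victor Zue", "Phone": "PHONE-A", "Email": "EMAIL-A"},
    {"Name": "Jim Glass", "Phone": "PHONE-B", "Email": "EMAIL-B"},
]

# Assumption: which columns play which role is chosen by the developer.
ACCESS_COLUMNS = ["Name"]              # entries are spoken by the user
RESPONSE_COLUMNS = ["Phone", "Email"]  # only the column names are spoken


def build_vocabulary(rows, access_columns, response_columns):
    """Collect the words that recognition and understanding would need."""
    concepts = {}
    for column in access_columns:
        # Column entries become the values of a concept named after the column.
        concepts[column.lower()] = sorted({row[column] for row in rows})
    # Column names become the values of a single "property" concept.
    concepts["property"] = [column.lower() for column in response_columns]
    return concepts


def answer(rows, name, prop):
    """Look up one response column for the row whose Name matches."""
    for row in rows:
        if row["Name"].lower() == name.lower():
            return row[prop.capitalize()]
    return None


vocabulary = build_vocabulary(ROWS, ACCESS_COLUMNS, RESPONSE_COLUMNS)
reply = answer(ROWS, "victor zue", "phone")
```

Here a query such as "What is the phone number for Victor Zue?" needs "Victor Zue" (a column entry) and "phone" (a column name) to both be in the vocabulary before the lookup can run.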

9
Knowledge Representation
  • Concepts and actions form basis for understanding
  • Concepts become key/value entries in meaning
    representation
  • city: Boston, New York; day: Monday, Tuesday
  • Actions provide sentence-level patterns of
    specific queries
  • I want to fly from Boston to Taipei →
    action = lookup_flight
  • Action text can be bracketed to define
    hierarchical concepts
  • I want to fly source=(from Boston)
    destination=(to Taipei)
  • source = Boston, destination = Taipei
  • Concepts and actions used to configure the
    following components
  • Speech Recognition
  • Natural Language Understanding
  • Discourse
  • Database columns define basic concepts
  • Column names can be grouped into concepts
  • property: phone, email; weather: snow, rain
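
As a rough illustration of how a bracketed action example could turn into key/value entries, here is a minimal Python sketch. The `name=(...)` spelling of the brackets, the `CONCEPTS` table, and the function `parse_action_example` are assumptions made for this example; SpeechBuilder's own parser is not shown in these slides.

```python
import re

# Assumption: hierarchical concepts are written as name=(words ...).
BRACKET = re.compile(r"(\w+)=\(([^)]*)\)")

# Hypothetical concept table in the spirit of "city: Boston, New York".
CONCEPTS = {"city": ["Boston", "New York", "Taipei"]}


def parse_action_example(example, action, concepts):
    """Build a flat key/value frame from one bracketed example sentence."""
    frame = {"action": action}
    for role, span in BRACKET.findall(example):
        for values in concepts.values():
            for value in values:
                # The first concept value found inside the bracket fills the role.
                if value.lower() in span.lower() and role not in frame:
                    frame[role] = value
    return frame


frame = parse_action_example(
    "I want to fly source=(from Boston) destination=(to Taipei)",
    "lookup_flight",
    CONCEPTS,
)
```

The resulting frame carries the action name plus one entry per bracketed role, which mirrors the source/destination pair in the bullet above.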

10
Language Modeling and Understanding
  • By default, concepts are used for language
    modeling, parsing grammar, and meaning
    representation

Will it snow? → weather: snow
  • Concept usage can be fine-tuned to improve
    performance
  • For language modeling and parsing grammar only
    (i.e., no meaning)
  • For keyword spotting only (i.e., no role in
    language modeling)
  • For fine-grained language modeling with coarser
    meaning representation

[Figure: example weather vocabulary (snowfall, sprinkles, snowstorm, breezy, showers, accumulation, snowy, thunderstorm, flurries, blizzard, rainy, rainfall) alongside the meaning representation weather: snow]
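
The last option, fine-grained language modeling with a coarser meaning representation, can be pictured as a many-to-one mapping. The sketch below is an assumption-laden illustration: the grouping of words into "snow", "rain", and "wind" and the function `meaning` are invented here and do not come from SpeechBuilder.

```python
import re

# Assumption: each fine-grained word is tied to one coarse concept value.
FINE_TO_COARSE = {
    "snow": "snow", "snowfall": "snow", "snowstorm": "snow", "snowy": "snow",
    "flurries": "snow", "blizzard": "snow", "accumulation": "snow",
    "rain": "rain", "rainy": "rain", "rainfall": "rain", "showers": "rain",
    "sprinkles": "rain", "thunderstorm": "rain",
    "breezy": "wind",
}


def meaning(utterance):
    """Return a coarse key/value frame for the weather words in an utterance."""
    frame = {}
    for word in re.findall(r"[a-z]+", utterance.lower()):
        if word in FINE_TO_COARSE:
            # Many distinct surface words collapse onto one concept value.
            frame["weather"] = FINE_TO_COARSE[word]
    return frame


snow_frame = meaning("Will it snow?")
flurries_frame = meaning("Any flurries tomorrow?")
```

The recognizer can still model every individual word, while the application only ever sees the coarse value.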
11
Current Status
  • SpeechBuilder has been operational for over two
    years
  • Used by over 50 developers from MIT and elsewhere
  • Used in undergraduate classes at MIT and
    Georgetown University
  • ASR capabilities benchmarked against main systems
  • Achieves same ASR performance as MIT Jupiter
    weather information system (6.8% word error rate
    on clean data) (phone)
  • Several prototype systems have been developed
  • Information about faculty, staff and students at
    LCS and AI Labs (phone, email, room, voice
    messages, transfer, etc.)
  • Application to control the various physical items
    in a typical office (lights, curtains, TV, VCR,
    projector, etc.)
  • Others include TV schedules, real-time weather
    forecasts, hotel and restaurant information, etc.
  • SpeechBuilder used for initial design of many
    more complex domains
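
For readers unfamiliar with the word error rate figure quoted above, the standard definition is (substitutions + deletions + insertions) divided by the number of reference words. The following is a minimal sketch of that computation; the function name and the example sentences are chosen for illustration and say nothing about how the MIT benchmark was actually run.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] is the edit distance between the ref prefix seen so far
    # and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            current.append(min(
                previous[j] + 1,                          # deletion
                current[j - 1] + 1,                       # insertion
                previous[j - 1] + (ref_word != hyp_word)  # substitution or match
            ))
        previous = current
    return previous[-1] / len(ref)


# One substituted word out of eight reference words.
wer = word_error_rate(
    "what is the phone number for victor zue",
    "what is the phone number for victor zoo",
)
```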

12
Ongoing and Future Work
  • Increase sophistication of discourse and dialogue
    manager to handle more complex dialogues
  • Enable finer specification of discourse
    capabilities
  • Add generic capabilities for times, dates, etc.
  • Incorporate confidence scoring and implement
    unsupervised training of acoustic and language
    models
  • Create functionality to allow developers to
    create domain-specific concatenative speech
    synthesis
  • Create alternative methods of domain
    specifications to streamline development
  • Advanced developers don't necessarily use web
    interface
  • Allow for more efficient automatic generation of
    SpeechBuilder domains

13
Acknowledgements
  • Issam Bazzi
  • Scott Cyphers
  • Ed Filisko
  • Jim Glass
  • TJ Hazen
  • Lee Hetherington
  • Joe Polifroni
  • Stephanie Seneff
  • Michelle Spina
  • Eugene Weinstein
  • Jon Yi
  • Misha Zitser

14
SpeechBuilder Hands-on Activity
  • Eugene Weinstein
  • Project Oxygen Core Team
  • MIT Laboratory for Computer Science
  • ecoder_at_mit.edu

15
Modified Baseline Configuration (this class)
  • Still gives developer total control over
    application functionality
  • Frame Relay server exposes Galaxy meaning
    representation to app

[Diagram labels: Jaim, Developer Application, Semantic Frame]
16
SpeechBuilder API
  • Galaxy meaning representation provided through
    frame relay
  • Applications connect via TCP sockets
  • API provided in Perl, Python, and Java
  • This class: Python API

[Diagram: Galaxy Frame Relay, TCP Socket, Python API, Application]
Python class galaxy.server.Server methods: Constructor(machine, port, ID), connect(), processMessage(blocking), disconnect()
Python class galaxy.frame.Frame methods: getAction(), getAttribute(attr_name), getText(), toString()
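
To show how an application loop might use the method names listed on this slide, here is a self-contained sketch. The `Frame` and `Server` classes below are stand-ins written for this example: they copy only the method names from the slide, replay canned frames instead of opening a TCP socket, and assume that `processMessage(blocking)` returns a frame or `None`. The real galaxy.server and galaxy.frame modules may behave differently.

```python
class Frame:
    """Stand-in for galaxy.frame.Frame (method names taken from the slide)."""

    def __init__(self, action, attributes, text):
        self._action = action
        self._attributes = dict(attributes)
        self._text = text

    def getAction(self):
        return self._action

    def getAttribute(self, attr_name):
        return self._attributes.get(attr_name)

    def getText(self):
        return self._text

    def toString(self):
        return "{action=%s %s}" % (self._action, sorted(self._attributes.items()))


class Server:
    """Stand-in for galaxy.server.Server; no network traffic is involved."""

    def __init__(self, machine, port, ID, canned_frames=()):
        self.machine, self.port, self.ID = machine, port, ID
        self._pending = list(canned_frames)  # hypothetical test input
        self.connected = False

    def connect(self):
        self.connected = True

    def processMessage(self, blocking):
        # Assumption: returns the next frame, or None when nothing is left.
        return self._pending.pop(0) if self._pending else None

    def disconnect(self):
        self.connected = False


def handle(frame):
    """Application logic: turn one meaning frame into a reply string."""
    if frame.getAction() == "lookup_flight":
        return "Looking up flights from %s to %s" % (
            frame.getAttribute("source"), frame.getAttribute("destination"))
    return "Sorry, I did not understand: " + frame.getText()


def run(server):
    """Connect, handle frames until none remain, then disconnect."""
    replies = []
    server.connect()
    try:
        while True:
            frame = server.processMessage(True)
            if frame is None:
                break
            replies.append(handle(frame))
    finally:
        server.disconnect()
    return replies


demo = Server("localhost", 0, "demo", [
    Frame("lookup_flight", {"source": "Boston", "destination": "Taipei"},
          "I want to fly from Boston to Taipei"),
])
replies = run(demo)
```

Keeping the dialogue logic in `handle` and the connection lifecycle in `run` means the same handler could be pointed at the real frame relay classes once their exact behaviour is confirmed.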