NAME-AWARE SPEECH RECOGNITION FOR INTERACTIVE QUESTION ANSWERING

Svetlana Stoyanchev³, Gokhan Tur², Dilek Hakkani-Tür¹
¹ International Computer Science Institute (ICSI), Speech Group, Berkeley, CA, USA
² SRI International, Speech Technology and Research (STAR) Lab., Menlo Park, CA, USA
³ State University of New York (SUNY), Stony Brook, NY, USA
svetastenchikova@gmail.com, gokhan@speech.sri.com, dilek@icsi.berkeley.edu
INTRODUCTION
Question Answering (QA) is a natural-language interface for information retrieval. TREC is a yearly competition: participants are given a document set and a set of questions, which are factoid (who, what, when, where), list, or "other" (find other relevant information on the given topic).
DATASET
40 questions from the TREC 2007 competition; the AQUAINT corpus (a 3-gigabyte document collection).
APPROACH
Goal: improve speech recognition in a voice-enabled question-answering application through interactivity.
  • Allow interactivity in specifying a named entity.
  • Recognize the named entity using grammars and retrieve matching
    documents.
  • Build a question-specific language model.
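The second approach step, recognizing the named entity against a grammar built from a database of named entities, can be sketched as follows. This is a hypothetical illustration: the entity list, the helper names, and the longest-substring-match recognizer are stand-ins, since the poster does not specify the grammar formalism.

```python
# Hypothetical sketch: the entity database and matching strategy below
# are illustration-only stand-ins, not the authors' implementation.
def build_grammar(entity_db):
    """Map lowercased entity strings to their canonical forms."""
    return {name.lower(): name for name in entity_db}

def recognize_entity(utterance, grammar):
    """Return the longest grammar entry found in the utterance, if any."""
    text = utterance.lower()
    matches = [entry for entry in grammar if entry in text]
    return grammar[max(matches, key=len)] if matches else None

grammar = build_grammar(["Rush Limbaugh", "Michael Brown", "FEMA"])
recognize_entity("uh michael brown please", grammar)  # "Michael Brown"
```

A production system would compile the same entity list into a speech-recognition grammar (e.g., an FSG or SRGS grammar) rather than matching transcribed text, but the mapping from database entries to recognizable forms is the same idea.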
EXPERIMENTS AND RESULTS
Evaluation: 3 speakers read 40 questions twice.
  • Set 3: 40 questions with a named entity (NE).
  • Set 4: 40 questions without a named entity.
CONTROL FLOW
  • The system asks the user to specify a named entity (1). The named
    entity is recognized using a grammar constructed from a database of
    named entities (2). Next, documents matching the named entity are
    retrieved (3), and a named-entity-specific language model is built
    from these documents (4). In parallel, a language model is built
    from the TREC questions (5). The two language models are merged (6).
  • In the final step, the question is recognized using the new
    language model (7).
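Steps (4)-(6) can be sketched as linear interpolation of two simple language models. This is a minimal sketch, not the authors' implementation: the toy corpora, the unigram estimator, and the 0.5 interpolation weight are all assumptions made for illustration.

```python
from collections import Counter

def unigram_lm(corpus):
    """Maximum-likelihood unigram model over a list of token lists."""
    counts = Counter(tok for sent in corpus for tok in sent)
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}

def merge_lms(lm_a, lm_b, weight=0.5):
    """Step (6): linear interpolation of two unigram models."""
    vocab = set(lm_a) | set(lm_b)
    return {tok: weight * lm_a.get(tok, 0.0)
                 + (1 - weight) * lm_b.get(tok, 0.0)
            for tok in vocab}

# Step (4): NE-specific model from documents matching the named entity.
ne_docs = [["brown", "resigned", "as", "fema", "head"],
           ["brown", "led", "fema"]]
# Step (5): question-style model from TREC questions.
trec_questions = [["on", "what", "date", "did", "brown", "resign"]]

merged = merge_lms(unigram_lm(ne_docs), unigram_lm(trec_questions))
```

The interpolated model keeps the question-style words from the TREC model while boosting the content words that co-occur with the named entity in the retrieved documents.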

On what date did Michael Brown resign as head of FEMA?
  • Identify that the target type is DATE; identify the named entities
    Michael Brown and FEMA.
  • NE: Michael Brown, FEMA. Phrase: "head of FEMA". Verb: "resigned".
  • Search (the web or a document collection) for candidate sentences
    and identify DATEs in them: "On September 12, 2005, in the wake of
    what was widely believed to be feckless handling of the aftermath
    of Hurricane Katrina and facing allegations that he had falsified
    portions of his résumé, Brown resigned" (it is mentioned earlier in
    the document that Michael Brown was head of FEMA); "After his
    September 12 resignation, Brown continued working for FEMA."
  • Candidate answers: "September 12, 2005", "September 12". Identify
    the match between "September 12, 2005" and "September 12".
  • Cheating model (contains the questions).
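The final matching step, pairing the full date "September 12, 2005" with the partial mention "September 12", can be sketched as a compatibility check over parsed date fields: two mentions match when every field present in both agrees. The regex, field representation, and function names are hypothetical; the poster does not describe how this matching is implemented.

```python
import re

# Hypothetical sketch of candidate-answer matching for DATE targets.
MONTHS = {m: i + 1 for i, m in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"])}

def parse_date(text):
    """Extract (month, day, year-or-None) from a simple date mention."""
    m = re.match(r"(\w+)\s+(\d{1,2})(?:,\s*(\d{4}))?", text.strip())
    if not m or m.group(1).lower() not in MONTHS:
        return None
    return (MONTHS[m.group(1).lower()], int(m.group(2)),
            int(m.group(3)) if m.group(3) else None)

def compatible(a, b):
    """Mentions match if month and day agree and no year conflicts."""
    da, db = parse_date(a), parse_date(b)
    if da is None or db is None:
        return False
    return (da[0] == db[0] and da[1] == db[1]
            and (da[2] is None or db[2] is None or da[2] == db[2]))

compatible("September 12, 2005", "September 12")  # True
```

Grouping compatible mentions lets the partial mention "September 12" add evidence for the fully specified answer "September 12, 2005".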
  • Motivation
  • The word error rate of state-of-the-art open-domain speech
    recognition technology is around 25-30%.
  • Performance is known to be even lower for names and rare words.
  • Named entities (NEs) are strongly associated with content words.
    For example, for the target name Gordon Gekko, one question used in
    the TREC 2004 evaluations is "In what film is Gordon Gekko the main
    character?", which includes non-function words related to the movie
    industry, such as "film" or "character".
  • The goal is to capture content words using the documents in which
    the named entity appears frequently.
  • Conclusions
  • The question-specific model achieves a 32.2% reduction in word
    error rate from the baseline (from 58.36% to 26.02%), using
    questions in which pronominal references are resolved.
  • Name-specific language models were used in a question-answering
    task where the target name in the question is asked of the user
    beforehand.
  • TREC benchmark questions were used on the AQUAINT corpus.
  • Future Work
  • Grounding process for the named entity:
  • ask the user for the type of the named entity (e.g., person,
    organization, location, movie) and associations;
  • build a focused grammar for the entity the user may be specifying,
    e.g., "Orhan Pamuk, a Turkish writer."

Related Work
  • R. Iyer and M. Ostendorf, "Modeling long distance dependencies in
    language: Topic mixtures versus dynamic cache models," IEEE
    Transactions on Speech and Audio Processing.
  • D. Gildea and T. Hofmann, "Topic-based language models using EM,"
    in Proceedings of Eurospeech.
  • F. Bechet, G. Riccardi, and D. Hakkani-Tur, "Mining spoken dialog
    corpora for system evaluation and modeling."