Title: Analysis for Speech Translation Using Grammar-Based Parsing and Automatic Classification
1. Analysis for Speech Translation Using
Grammar-Based Parsing and Automatic Classification
- Chad Langley
- Language Technologies Institute
- Carnegie Mellon University
2. NESPOLE! System Overview
- Human-to-human spoken language translation for e-commerce applications (e.g. travel and tourism) (Lavie et al., 2002)
- English, German, Italian, and French
- Translation via interlingua
- Translation servers for each language exchange interlingua to perform translation
- Speech recognition (Speech → Text)
- Analysis (Text → Interlingua)
- Generation (Interlingua → Text)
- Synthesis (Text → Speech)
3. Interchange Format
- Interchange Format (IF) is a shallow semantic interlingua for task-oriented domains
- Utterances represented as sequences of semantic dialog units (SDUs)
- IF representation consists of four parts:
- Speaker
- Speech Act
- Concepts
- Arguments
- Layout: speaker : speech act + concepts (arguments)
- Domain Action (DA) = speech act + concepts
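The four-part structure can be sketched as a small data type. This is an illustration only: the class name, the field names, and the `speaker:speech-act+concept (name=value, ...)` rendering are assumptions made for this example, with the notation taken to be the conventional IF form in which `:` follows the speaker tag, `+` joins the speech act and concepts, and `=` binds argument values.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class IFRepresentation:
    """One semantic dialog unit (SDU); a hypothetical container, not the NESPOLE! code."""
    speaker: str                                   # speaker tag, e.g. "c"
    speech_act: str                                # e.g. "give-information"
    concepts: List[str] = field(default_factory=list)
    arguments: Dict[str, str] = field(default_factory=dict)

    @property
    def domain_action(self) -> str:
        """Speech act followed by the concept sequence, joined with '+'."""
        return "+".join([self.speech_act, *self.concepts])

    def render(self) -> str:
        """Render in the assumed 'speaker:DA (args)' notation."""
        head = f"{self.speaker}:{self.domain_action}"
        if not self.arguments:
            return head
        args = ", ".join(f"{name}={value}" for name, value in self.arguments.items())
        return f"{head} ({args})"


greeting = IFRepresentation("c", "greeting", arguments={"greeting": "hello"})
trip = IFRepresentation(
    "c",
    "give-information",
    ["disposition", "trip"],
    {
        "disposition": "(who=i, desire)",
        "visit-spec": "(identifiability=no, vacation)",
        "location": "(place-name=val_di_fiemme)",
    },
)
print(greeting.render())
print(trip.domain_action)
```

Nested argument values are kept as plain strings here to keep the sketch short; a fuller model would represent them recursively.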
4. Example
- Hello. I would like to take a vacation in Val di Fiemme.
- hello i would like to take a vacation in val di fiemme
- c:greeting (greeting=hello)
- c:give-information+disposition+trip (disposition=(who=i, desire), visit-spec=(identifiability=no, vacation), location=(place-name=val_di_fiemme))
- ENG: Hello! I want to travel for a vacation at Val di Fiemme.
- ITA: Salve. Io vorrei una vacanza in Val di Fiemme.
5. Why Hybrid Analysis?
- Goal: A portable and robust analyzer for task-oriented human-to-human speech-to-speech MT
- Earlier MT systems used full semantic grammars to parse complete DAs
- Useful for parsing spoken language in restricted domains
- Difficult to port to new domains
- Continue to use semantic grammars to parse
domain-independent phrase-level arguments and
train classifiers to identify DAs
6. Hybrid Analysis Approach
Use a combination of grammar-based phrase-level
parsing and machine learning to produce
interlingua (IF) representations
7. Hybrid Analysis Approach
Hello. I would like to take a vacation in Val di Fiemme.
c:greeting (greeting=hello)
c:give-information+disposition+trip (disposition=(who=i, desire), visit-spec=(identifiability=no, vacation), location=(place-name=val_di_fiemme))
8. Argument Parsing
- Goal: Parse IF arguments and domain-independent DAs using phrase-level grammars
- SOUP Parser (Gavaldà, 2000): Stochastic, chart-based, top-down robust parser designed for real-time analysis of spoken language
- Result of argument parsing is a ranked list of phrase-level parses
- Prototype analyzer uses only the best-ranked argument parse
9. Segmentation
- Goal: Identify SDU boundaries in input utterances
- No punctuation or case information since input comes from automatic speech recognizer
- Argument parse provides useful information for segmentation
- Argument parses may contain cross-domain trees
- SDU boundaries not allowed within a parse tree
- Parse tree labels may be used in segmentation
models in addition to words in the input utterance
10. Segmentation
- Drop unparsed words
- Consider SDU boundary between each pair of adjacent parse trees
- Insert boundary if either tree is from cross-domain grammar
- Otherwise use statistical model
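The segmentation steps above can be sketched as follows. This is a minimal illustration, not the NESPOLE! implementation: `ParseTree`, `segment`, the `boundary_prob` callback, and the 0.5 threshold are all hypothetical names and values, and the statistical model is left as a pluggable function because the slides do not specify it.

```python
from typing import Callable, List, NamedTuple


class ParseTree(NamedTuple):
    label: str          # root label of the phrase-level parse tree
    words: List[str]    # words covered by the tree
    cross_domain: bool  # True if the tree came from the cross-domain grammar


def segment(
    trees: List[ParseTree],
    boundary_prob: Callable[[ParseTree, ParseTree], float],
    threshold: float = 0.5,  # assumed cut-off, not from the slides
) -> List[List[ParseTree]]:
    """Group parse trees into SDUs; unparsed words are assumed already dropped."""
    if not trees:
        return []
    sdus = [[trees[0]]]
    for left, right in zip(trees, trees[1:]):
        if left.cross_domain or right.cross_domain:
            boundary = True  # rule: always split next to a cross-domain tree
        else:
            boundary = boundary_prob(left, right) >= threshold  # statistical model
        if boundary:
            sdus.append([right])
        else:
            sdus[-1].append(right)
    return sdus


trees = [
    ParseTree("greeting", ["hello"], True),
    ParseTree("disposition", ["i", "would", "like", "to"], False),
    ParseTree("visit-spec", ["take", "a", "vacation"], False),
    ParseTree("location", ["in", "val", "di", "fiemme"], False),
]
# Stand-in model that never proposes a boundary on its own.
sdus = segment(trees, boundary_prob=lambda left, right: 0.0)
print([[tree.label for tree in sdu] for sdu in sdus])
```

Because boundaries are only considered between trees, the constraint that an SDU boundary may not fall inside a parse tree holds by construction.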
11. Domain Action Classification
- Goal: Identify the domain action for each SDU in an input utterance
- Use trainable classifiers to identify the domain action for each (non-cross-domain) SDU
- Use IF specification to aid classification and to ensure that the DA and arguments combine to produce a well-formed IF representation
12. Domain Action Classification
- Use two memory-based classifiers (TiMBL, Daelemans et al., 2000) to identify the DA
- First classifier identifies the speech act
- Second classifier identifies the complete concept sequence
- Classifier features
- Binary features indicate presence or absence of particular arguments from the argument parse
- Concept sequence classifier also uses speech act as feature
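A rough sketch of the two-classifier setup is given below. It uses a plain nearest-neighbour classifier with a feature-overlap count as a stand-in for memory-based learning; it does not use TiMBL or reproduce its API or weighting schemes. The argument inventory, training examples, and labels are toy values invented for illustration.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple

# Hypothetical argument inventory; the real one would come from the argument grammars.
ARGUMENT_NAMES = ["disposition", "visit-spec", "location", "price"]


def encode(arguments_present: Iterable[str]) -> Tuple[int, ...]:
    """Binary features: 1 if the argument occurs in the argument parse, else 0."""
    present = set(arguments_present)
    return tuple(int(name in present) for name in ARGUMENT_NAMES)


class MemoryBasedClassifier:
    """k-nearest-neighbour stand-in: store all examples, vote among the closest."""

    def __init__(self, k: int = 1) -> None:
        self.k = k
        self.memory: List[Tuple[Sequence[object], str]] = []

    def fit(self, examples: Iterable[Tuple[Sequence[object], str]]) -> "MemoryBasedClassifier":
        self.memory = list(examples)
        return self

    def predict(self, features: Sequence[object]) -> str:
        def overlap(example: Tuple[Sequence[object], str]) -> int:
            return sum(a == b for a, b in zip(example[0], features))

        nearest = sorted(self.memory, key=overlap, reverse=True)[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


# Toy training data: (arguments present, speech act, concept sequence).
TRAIN = [
    ({"disposition", "visit-spec", "location"}, "give-information", "disposition+trip"),
    ({"price"}, "request-information", "price"),
    ({"price", "location"}, "request-information", "price+trip"),
]

speech_act_clf = MemoryBasedClassifier().fit(
    (encode(args), act) for args, act, _ in TRAIN
)
# The concept-sequence classifier sees the speech act as an extra feature.
concept_clf = MemoryBasedClassifier().fit(
    (encode(args) + (act,), concepts) for args, act, concepts in TRAIN
)

features = encode({"disposition", "location"})
speech_act = speech_act_clf.predict(features)
concepts = concept_clf.predict(features + (speech_act,))
print(speech_act, concepts)
```

Feeding the predicted speech act into the second classifier mirrors the feature design on the slide; whether training used gold or predicted speech acts is not stated there.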
13. Domain Action Classification
- Using IF specification constraints
- Check if best speech act, concept sequence, and arguments form a legal DA
- Fallback strategy: Find the best DA that licenses the most arguments
- Trust the argument parser to produce reliable argument labels
- Better to select an alternative DA than to remove arguments and lose detailed information from an utterance
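One way to read the fallback strategy is sketched below. The function name, the representation of the IF specification as a mapping from DA to licensed argument names, and the toy entries are all assumptions for this example. Dropping any arguments that even the chosen DA does not license is likewise an assumption; the slides do not say how that case is handled.

```python
from typing import Dict, List, Sequence, Set, Tuple


def choose_domain_action(
    ranked_das: Sequence[str],   # candidate DAs, best-scoring first
    arguments: Sequence[str],    # argument names from the argument parse
    spec: Dict[str, Set[str]],   # toy IF specification: DA -> licensed arguments
) -> Tuple[str, List[str]]:
    """Return a DA and the arguments it licenses, preferring to keep arguments."""
    best = ranked_das[0]
    if set(arguments) <= spec.get(best, set()):
        return best, list(arguments)  # top-ranked DA is legal with all arguments

    def licensed(da: str) -> List[str]:
        return [arg for arg in arguments if arg in spec.get(da, set())]

    # max() returns the first maximal element, so ties go to the better-ranked DA.
    fallback = max(ranked_das, key=lambda da: len(licensed(da)))
    return fallback, licensed(fallback)


# Toy specification and classifier ranking, invented for illustration.
spec = {
    "give-information+trip": {"visit-spec", "location"},
    "give-information+disposition+trip": {"disposition", "visit-spec", "location"},
}
ranked = ["give-information+trip", "give-information+disposition+trip"]
arguments = ["disposition", "visit-spec", "location"]

da, kept = choose_domain_action(ranked, arguments, spec)
print(da, kept)
```

Here the top-ranked DA would force the `disposition` argument to be discarded, so the second-ranked DA, which licenses all three arguments, is selected instead.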
14. End-to-End Translation Evaluation
- English-to-English and English-to-Italian
- Training set: 3350 SDUs from NESPOLE!
- Test set: 151 utterances (4 dialogs), 332 SDUs
- 40 SDUs from test set could not be assigned a valid IF under the current IF definition
- Uses IF specification fallback strategy
- Each SDU graded as perfect, ok, or bad
15. End-to-End Translation Evaluation
16. Future Work
- Alternative segmentation models
- Richer feature sets for classifiers
- Alternative classification methods
- Multiple argument parses
- Evaluation of portability to a new domain
18. Research Context
- Spoken language translation systems
- NESPOLE!
- C-STAR
- LingWear
- Translation via a task-oriented interlingua (Interchange Format)
- Speech recognition
- Analysis
- Generation
- Synthesis
19. Interchange Format
- IF specification encodes how speech acts, concepts, and arguments are allowed to combine to form legal IF representations
- IF specification contains
- 62 speech acts
- 103 concepts
- 147 argument names
20. Interchange Format
- I would like to take a vacation in Val di Fiemme.
- c:give-information+disposition+trip (disposition=(who=i, desire), visit-spec=(identifiability=no, vacation), location=(place-name=val_di_fiemme))
- Hello.
- c:greeting (greeting=hello)
- Thank you very much.
- a:thank
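To show how such representations decompose into speaker, speech act, concepts, and arguments, here is a small parsing sketch. It assumes the conventional IF notation, in which `:` follows the speaker tag, `+` joins the speech act and concepts, and `=` binds argument values. The function names are hypothetical, and only top-level arguments are separated; nested values are kept as strings.

```python
from typing import Dict, List


def split_top_level(text: str) -> List[str]:
    """Split on commas that are not inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_if(text: str) -> Dict[str, object]:
    """Parse 'speaker:speech-act+concept... (arg=value, ...)' into its four parts."""
    speaker, rest = text.split(":", 1)
    domain_action, _, arg_text = rest.strip().partition(" ")
    speech_act, *concepts = domain_action.split("+")
    arg_text = arg_text.strip()
    if arg_text.startswith("(") and arg_text.endswith(")"):
        arg_text = arg_text[1:-1]  # remove the outer argument-list parentheses
    arguments = {}
    for item in split_top_level(arg_text):
        name, _, value = item.partition("=")
        arguments[name] = value
    return {
        "speaker": speaker.strip(),
        "speech_act": speech_act,
        "concepts": concepts,
        "arguments": arguments,
    }


trip = parse_if(
    "c:give-information+disposition+trip (disposition=(who=i, desire), "
    "visit-spec=(identifiability=no, vacation), location=(place-name=val_di_fiemme))"
)
thanks = parse_if("a:thank")
print(trip["speech_act"], trip["concepts"], sorted(trip["arguments"]))
print(thanks)
```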
21. The SOUP Parser
- Stochastic, chart-based, top-down parser designed for real-time analysis of spoken language
- Context-free grammars encoded as probabilistic recursive transition networks
- Useful features:
- Allows configurable word skipping
- Returns sequence of parse trees
- Supports modular grammar development and parsing with multiple grammars
- Can provide a ranked list of interpretations
22. Grammars
- Argument grammar
- Phrase-level rules for parsing top-level IF arguments
- currency, time, location, etc.
- Pseudo-argument grammar
- Phrase-level rules for parsing common phrases that can be grouped into classes
- all booked up, full, sold out, etc.
23. Grammars
- Cross-domain grammar
- Rules for parsing complete domain-independent domain actions
- Greetings: Hello, Good bye, Nice to meet you, etc.
- Thanks: Thanks, Thank you very much, You've been a big help, etc.
- Shared grammar
- Library of common rules for use by any other
grammar
24. Data Ablation Experiment
- 16-fold cross-validation setup
- Test set size (# SDUs): 400
- Training set sizes (# SDUs): 500, 1000, 2000, 3000, 4000, 5000, 6009 (all data)
- Data from previous C-STAR system
- No use of IF specification
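The experimental setup can be sketched as a generator of train/test splits. This is only an illustration under stated assumptions: the corpus size of 6,409 SDUs is inferred from 6,009 training plus 400 test SDUs, and the shuffling, the contiguous held-out blocks, and the prefix subsampling of the training pool are choices made for this sketch rather than details given on the slide.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def ablation_splits(
    sdus: Sequence[object],
    n_folds: int = 16,
    test_size: int = 400,
    train_sizes: Tuple[int, ...] = (500, 1000, 2000, 3000, 4000, 5000),
    seed: int = 0,
) -> Iterator[Tuple[int, int, List[object], List[object]]]:
    """Yield (fold, training size, training SDUs, test SDUs) for every condition."""
    data = list(sdus)
    random.Random(seed).shuffle(data)
    for fold in range(n_folds):
        start = fold * test_size
        test = data[start:start + test_size]
        pool = data[:start] + data[start + test_size:]
        # The final condition uses the whole remaining pool ("all data").
        for size in (*train_sizes, len(pool)):
            yield fold, size, pool[:size], test


# Integers stand in for SDUs; 6409 is an inferred corpus size (see above).
splits = list(ablation_splits(range(6409)))
print(len(splits))                        # 16 folds x 7 training sizes = 112
print(max(size for _, size, _, _ in splits))
```

Classification accuracy for each training size would then be averaged over the 16 folds.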
25. Data Ablation Experiment
26. Prototype System
- Argument Parsing
- Four basic grammars
- No word skipping within parse trees
- Default SOUP disambiguation weights and heuristics
- Only best-ranked parse used
27. Hybrid Analysis Approach
Hello. I would like to take a vacation in Val di Fiemme.
c:greeting (greeting=hello)
c:give-information+disposition+trip (disposition=(who=i, desire), visit-spec=(identifiability=no, vacation), location=(place-name=val_di_fiemme))
[Figure: analysis stages for the input "hello i would like to take a vacation in val di fiemme". The argument parse yields the trees greeting, disposition, visit-spec, and location; segmentation groups them into SDU1 (greeting) and SDU2 (disposition, visit-spec, location); the domain actions assigned are greeting and give-information+disposition+trip.]
28. Objective
- Provide fast, accurate, and portable analysis for task-oriented speech-to-speech machine translation
- Must be robust to speech recognition errors, spoken language disfluencies, and ungrammatical input
- Must produce analyses in (near) real time
- Must produce interlingua representations
- Should be easily portable to new domains and
languages
29. Hybrid Analysis Approach