Title: Quasi-Synchronous Grammars
1. Quasi-Synchronous Grammars
- Based on key observations in MT:
  - Translated sentences often have some isomorphic syntactic structure, but usually not in its entirety.
  - The strictness of the isomorphism may vary across words or syntactic rules.
- Key idea:
  - Unlike stricter, more rigid synchronous grammars (e.g., SCFG), a QG defines a monolingual grammar for the target tree, inspired by the source tree.
2. Quasi-Synchronous Grammars
- In other words, we model the generation of the target tree as influenced by the source tree (and their alignment).
- QA can be thought of as extremely free monolingual translation.
- The linkage between question and answer trees in QA is looser than in MT, which gives QG a bigger edge.
3. Model
- Works on labeled dependency parse trees.
- Learns the hidden structure (the alignment between the Q and A trees) by summing out ALL possible alignments.
- One particular alignment tells us both the syntactic configurations and the word-to-word semantic correspondences.
- An example: [Figure: a question parse tree, an answer parse tree, and an alignment between them.]
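To make these objects concrete, here is a minimal Python sketch (not the authors' code) of a labeled dependency node and of the alignment space the model sums over. `DepNode`, `nodes`, and `all_alignments` are names invented for this sketch; the explicit enumeration is shown only for clarity, since it is exponential, and the model instead sums alignments out with dynamic programming (slide 6).

```python
from dataclasses import dataclass, field
from itertools import product

@dataclass(eq=False)           # identity-based hashing, so nodes can key dicts
class DepNode:
    word: str
    pos: str                   # part-of-speech tag, e.g. "NNP"
    label: str = ""            # dependency relation to the parent, e.g. "subj"
    ne: str = ""               # named-entity tag, e.g. "person"
    children: list = field(default_factory=list)

def nodes(tree):
    """Yield all nodes of a dependency tree, root first."""
    yield tree
    for child in tree.children:
        yield from nodes(child)

def all_alignments(answer_root, question_root):
    """Every mapping from answer words to a question word or to None
    (unaligned). Exponential in |A|; the model never enumerates this
    explicitly but sums it out with dynamic programming."""
    a_nodes = list(nodes(answer_root))
    options = list(nodes(question_root)) + [None]
    for choice in product(options, repeat=len(a_nodes)):
        yield dict(zip(a_nodes, choice))
```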
4-5. [Figure: dependency parse trees for the question "Who is the leader of France?" and the answer "Bush met with French President Jacques Chirac", shown side by side with an alignment between them. Nodes carry POS tags (e.g., VBD, NNP) and named-entity tags (person, location, qword); edges carry dependency labels (subj, obj, det, nmod, with, of).]
6. [Figure: the same Q/A dependency trees, with one answer word and its parent highlighted.]
Our model makes local Markov assumptions to allow efficient computation via dynamic programming (details in the paper): given its parent, a word is independent of all other words (including siblings). A sketch of the resulting recursion follows.
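Here is a minimal sketch of the inside (sum-product) recursion that this independence assumption licenses, reusing `DepNode` and `nodes` from the earlier sketch. `edge_score` is a hypothetical stand-in for the model's probability of an answer dependency edge given the question nodes its endpoints align to; it is not the paper's exact parameterization. With memoization the recursion runs in O(|A| * |Q|^2).

```python
def sentence_score(a_root, q_root, edge_score):
    """Sum over ALL alignments of the answer tree to the question tree.

    edge_score(a_child, a_parent, q_child, q_parent) is a hypothetical
    local score; a_parent and q_parent are None at the root."""
    q_options = list(nodes(q_root)) + [None]   # None = unaligned
    memo = {}

    def inside(a_node, q_state):
        # Summed score of the answer subtree under a_node, given that
        # a_node aligns to q_state (a question node or None).
        key = (id(a_node), id(q_state))
        if key in memo:
            return memo[key]
        total = 1.0
        for child in a_node.children:
            # Given its parent, a child is independent of everything
            # else, so its alignment choices can be summed locally.
            total *= sum(
                edge_score(child, a_node, q_child, q_state) * inside(child, q_child)
                for q_child in q_options
            )
        memo[key] = total
        return total

    # Finally, sum over the root's own alignment choices.
    return sum(edge_score(a_root, None, q, None) * inside(a_root, q)
               for q in q_options)
```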
7-10. [Figure: four slides building up the word-to-word alignment between the question and answer trees, one aligned pair at a time.]
11. 6 types of syntactic configurations
12-13. Parent-child configuration. [Figure: the Q/A trees, with an aligned pair illustrating this configuration highlighted.]
14. 6 types of syntactic configurations
15-16. Same-word configuration. [Figure: the Q/A trees, with an aligned pair illustrating this configuration highlighted.]
17. 6 types of syntactic configurations
- Parent-child
- Same-word
- Grandparent-child
18-19. Grandparent-child configuration. [Figure: the Q/A trees, with an aligned pair illustrating this configuration highlighted.]
20. 6 types of syntactic configurations
- Parent-child
- Same-word
- Grandparent-child
- Child-parent
- Siblings
- C-command
- (Same as D. Smith & Eisner '06; a classification sketch follows.)
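As an illustration, here is a hedged sketch of how these configurations could be detected for the pair of question nodes that an answer parent-child edge aligns to, reusing `nodes` from the first sketch. The six labels follow the list above, but the ordering of the tests and the use of c-command as the catch-all are assumptions of this sketch, not the paper's exact definition.

```python
def parent_map(root):
    """Map each node of a dependency tree to its parent."""
    parents = {}
    for node in nodes(root):
        for child in node.children:
            parents[child] = node
    return parents

def configuration(q_parent, q_child, parents):
    """Name the syntactic configuration of the question nodes that an
    answer parent and its child align to (assumes both are aligned)."""
    if q_parent is q_child:
        return "same-word"
    if parents.get(q_child) is q_parent:
        return "parent-child"
    if parents.get(q_parent) is q_child:
        return "child-parent"
    if parents.get(q_child) is not None and parents.get(q_child) is parents.get(q_parent):
        return "siblings"
    if parents.get(parents.get(q_child)) is q_parent:
        return "grandparent-child"
    return "c-command"   # catch-all here; the paper's exact test may differ
```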
22. Modeling alignment
23-24. [Figure: the Q/A trees once more, illustrating the word-to-word alignment that the following model scores.]
25. Modeling alignment (cont.)
- Base model
- Log-linear model
  - Lexical-semantic features from WordNet: identity, hypernym, synonym, entailment, etc.
- Mixture model (sketched below)
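A minimal sketch of this scoring scheme under stated assumptions: the `wn_*` helpers are stubs standing in for real WordNet lookups, the feature set is illustrative, and the normalization direction (over question words given an answer word) is my assumption rather than the paper's specification.

```python
import math

def wn_synonym(q, a):    # stub: a real system would query WordNet synsets
    return False

def wn_hypernym(q, a):   # stub: hypernym relation lookup
    return False

def wn_entails(q, a):    # stub: verb entailment lookup
    return False

def features(q_word, a_word):
    """Binary lexical-semantic features for a question/answer word pair."""
    return {
        "identity": float(q_word.lower() == a_word.lower()),
        "synonym":  float(wn_synonym(q_word, a_word)),
        "hypernym": float(wn_hypernym(q_word, a_word)),
        "entails":  float(wn_entails(q_word, a_word)),
    }

def loglinear_prob(q_word, a_word, weights, q_vocab):
    """Log-linear distribution over question words for an answer word."""
    def unnorm(q):
        return math.exp(sum(weights[f] * v for f, v in features(q, a_word).items()))
    return unnorm(q_word) / sum(unnorm(q) for q in q_vocab)

def mixture_prob(q_word, a_word, base, weights, q_vocab, lam):
    """Mixture of the base multinomial and the log-linear model;
    lam is the learned mixture coefficient in [0, 1]."""
    return (lam * base.get((q_word, a_word), 0.0)
            + (1 - lam) * loglinear_prob(q_word, a_word, weights, q_vocab))
```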
26. Parameter estimation
- Things to be learnt:
  - Multinomial distributions in the base model
  - Log-linear model feature weights
  - Mixture coefficient
- Training involves summing out hidden structures and is therefore non-convex.
- Solved using conditional Expectation-Maximization (a toy sketch of the mixture update follows).
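To show the flavor of one such update, here is a self-contained toy EM that re-estimates only a mixture coefficient, with the two components' per-observation likelihoods held fixed. The real training loop also re-fits the multinomials and log-linear weights, conditions on the question, and runs the inside DP in its E-step, none of which this toy includes.

```python
def em_mixture_coefficient(p1, p2, lam=0.5, n_iters=50):
    """Toy EM: re-estimate the coefficient of a two-component mixture.
    p1, p2: per-observation likelihoods under each (fixed) component."""
    for _ in range(n_iters):
        # E-step: posterior responsibility of component 1 per observation.
        resp = [lam * a / (lam * a + (1 - lam) * b) for a, b in zip(p1, p2)]
        # M-step: the new coefficient is the average responsibility.
        lam = sum(resp) / len(resp)
    return lam

# Observations better explained by component 1 push the coefficient up.
print(em_mixture_coefficient([0.9, 0.8, 0.2], [0.1, 0.3, 0.4]))
```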
27. Experiments
- TREC 8-12 data set for training
- TREC 13 questions for development and testing
28. Candidate answer generation
- For each question, we take all documents from the TREC document pool and extract the sentences that contain at least one non-stopword keyword from the question.
- For computational reasons (parsing speed, etc.), we only kept answer sentences of fewer than 40 words.
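A minimal sketch of this filter, with a deliberately tiny stopword list and whitespace tokenization standing in for the real preprocessing:

```python
# Simplified stand-ins; the real system used a full stopword list
# and proper tokenization.
STOPWORDS = {"the", "a", "an", "of", "is", "was", "who", "what", "when", "where"}

def keywords(question):
    """Non-stopword keywords of a question, lowercased."""
    return {t.lower() for t in question.split() if t.lower() not in STOPWORDS}

def candidate_sentences(question, doc_sentences, max_len=40):
    """Yield candidate answers: sentences of fewer than max_len words
    that share at least one keyword with the question."""
    keys = keywords(question)
    for sentence in doc_sentences:
        tokens = sentence.split()
        if len(tokens) >= max_len:        # skip long sentences (parsing cost)
            continue
        if keys & {t.lower() for t in tokens}:
            yield sentence
```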
29. Dataset statistics
- Manually labeled 100 questions for training
  - 348 positive Q/A pairs in total
- 84 questions for dev
  - 1415 Q/A pairs in total (3.1+, 17.1- per question)
- 100 questions for testing
  - 1703 Q/A pairs in total (3.6+, 20.0- per question)
- Automatically labeled another 2193 questions to create a noisy training set, for evaluating model robustness
30. Experiments (cont.)
- Each question and answer sentence is tokenized, POS-tagged (MXPOST), parsed (MSTParser), and labeled with named-entity tags (IdentiFinder).
31. Baseline systems (replications)
- Cui et al. (SIGIR '05)
  - The algorithm behind one of the best-performing systems in the TREC evaluations.
  - Uses a mutual-information-inspired score computed over dependency trees and a single fixed alignment between them.
- Punyakanok et al. (NLE '04)
  - Measures the similarity between Q and A by computing tree edit distance (see the sketch below).
- Both baselines are high-performing, syntax-based, and the most straightforward to replicate.
- We further enhanced both algorithms by augmenting them with WordNet.
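For concreteness, here is a hedged sketch of the core tree-edit-distance recurrence such a baseline rests on: the classic ordered-forest recursion with unit costs, memoized, over `DepNode` trees from the first sketch. The actual Punyakanok et al. cost model, and our WordNet augmentation of it, differ from these unit costs.

```python
def forest_size(forest):
    """Number of nodes in a tuple of trees."""
    return sum(1 + forest_size(tuple(n.children)) for n in forest)

def ted(f1, f2, memo=None):
    """Edit distance between two ordered forests (tuples of DepNodes),
    with unit costs for insert, delete, and relabel."""
    if memo is None:
        memo = {}
    key = (tuple(id(n) for n in f1), tuple(id(n) for n in f2))
    if key in memo:
        return memo[key]
    if not f1 and not f2:
        d = 0
    elif not f1:
        d = forest_size(f2)              # insert everything that remains
    elif not f2:
        d = forest_size(f1)              # delete everything that remains
    else:
        v, w = f1[-1], f2[-1]            # rightmost roots of each forest
        d = min(
            ted(f1[:-1] + tuple(v.children), f2, memo) + 1,     # delete v
            ted(f1, f2[:-1] + tuple(w.children), memo) + 1,     # insert w
            ted(f1[:-1], f2[:-1], memo)                         # match v with w,
            + ted(tuple(v.children), tuple(w.children), memo)   # align subtrees,
            + (0 if v.word == w.word else 1),                   # pay relabel cost
        )
    memo[key] = d
    return d

# Usage: ted((question_root,), (answer_root,)) on DepNode trees.
```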
32. Results
[Table: Mean Average Precision and Mean Reciprocal Rank of Top 1 for each system; the reported values include 28.2, 41.2, 30.3, and 23.9. Scores that are statistically significantly better than the 2nd-best score in their column are marked.]
33. Summing vs. Max
34. Switching back