Title: Linguistics 187/287 Week 9
1. Linguistics 187/287 Week 9
Question answering, knowledge mapping, translation
- Ron Kaplan and Tracy King
2. Question Answering
- Unindexed documents in an information repository
  - Example: technical manuals, business procedures
- Find specific facts in documents to answer specific questions
  - How do I make the paperclip go away?
  - Who authorizes foreign travel?
- Documents may not contain the exact words
  - The Help Assistant may be closed by clicking the Option menu
- Exact words may be misleading
  - Will Gore raise taxes?
  - Bush promised Gore to raise taxes: promise(Bush, Gore, raise(Bush, taxes))
  - Bush persuaded Gore to raise taxes: persuade(Bush, Gore, raise(Gore, taxes))
- XLE: check for overlapping meanings
3. Pipeline for Meaning Overlap
[Diagram: the document collection and the question are each parsed with the English grammar, their semantics and meanings are computed, and an overlap detector compares the two meanings.]
- High precision and recall, but heavy computation
4. A lighter-weight approach to QA
- Analyze the question, then anticipate and search for possible answer phrases (question → f-structure → queries)
- Question: What is the graph partitioning problem?
  - Generated queries: "The graph partitioning problem is ..."
  - Answer (Google): "The graph partitioning problem is defined as dividing a graph into disjoint subsets of nodes"
- Question: When were the Rolling Stones formed?
  - Generated queries: "The Rolling Stones were formed ...", "... formed the Rolling Stones"
  - Answer (Google): "Mick Jagger, Keith Richards, Brian Jones, Bill Wyman, and Charlie Watts formed the Rolling Stones in 1962."
5. Pipeline for Answer Anticipation
[Diagram: the question is parsed with the English grammar into question f-structures, which are converted into answer f-structures; the generator (using the same English grammar) produces answer phrases, which are sent to a search engine (Google, ...).]
6. Who won the oscar in 1968?
?X. win(X, oscar) ∧ in(win, 1968)
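The answer-anticipation idea on the preceding slides can be illustrated with a small sketch. This is a minimal Python illustration, not the XLE pipeline: the real system converts question f-structures into answer f-structures and generates from them, whereas the string patterns and the anticipate helper below are invented purely for this example.

# Minimal sketch of answer-phrase anticipation (not the XLE pipeline).
# A wh-question is turned into declarative search strings that a likely
# answer sentence would contain; the patterns below are illustrative only.

def anticipate(question: str) -> list[str]:
    q = question.rstrip("?").strip()
    words = q.split()
    queries = []
    wh = words[0].lower()
    if wh == "what" and words[1].lower() in {"is", "are"}:
        # "What is X?" -> "X is ..."
        rest = " ".join(words[2:])
        queries.append(f"{rest} {words[1].lower()}")
    elif wh == "when" and words[1].lower() in {"was", "were"}:
        # "When were X formed?" -> "X were formed" and "formed X"
        rest, verb = words[2:-1], words[-1]
        queries.append(f"{' '.join(rest)} {words[1].lower()} {verb}")
        queries.append(f"{verb} {' '.join(rest)}")
    return queries

print(anticipate("What is the graph partitioning problem?"))
# ['the graph partitioning problem is']
print(anticipate("When were the Rolling Stones formed?"))
# ['the Rolling Stones were formed', 'formed the Rolling Stones']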
7. Mapping to knowledge representations via semantics
8. Knowledge Mapping Architecture
[Diagram of the pipeline: text → word structure and entity recognition → XLE/LFG parsing (LFG grammar and lexicon) → c-structure and f-structure → glue derivation with the semantic lexicon (VerbNet, WordNet) → linguistic semantics → term rewriting with mapping rules (Cyc) → Abstract KR → target KRR (CycL, AnsProlog).]
9. Abstract Knowledge Representation
- Encode different aspects of meaning
  - Asserted content: concepts and arguments, relations among objects
  - Contexts: author commitment, belief, report, denial, prevent, ...
  - Temporal relations: qualitative relations among time intervals and events
- Translate to various target KRs, e.g. CycL, Description Logic, AnsProlog
- Capture meaning ambiguity
10. Abstract KR: Ed fired the boy.
Conceptual:
- subconcept(Ed3 Person)
- subconcept(boy2 MaleChild)
- subconcept(fire_ev1 DischargeWithPrejudice)
- role(fire_ev1 performedBy Ed3)
- role(fire_ev1 objectActedOn boy2)
Contextual:
- context(t)
- instantiable(Ed3 t)
- instantiable(boy2 t)
- instantiable(fire_ev1 t)
Temporal:
- temporalRel(startsAfterEndingOf Now fire_ev1)
11. Concept-role disambiguation
- Mapping to conceptual structure
- The ontology constrains the verb concept and the argument roles
- Templates for rewriting rules:

  verb_concept_map(fire, V-SUBJ-OBJ, DischargeWithPrejudice,
                   subj, Agent-Generic, doneBy, obj, Person, objectActedOn)
    Ed fired the boy.

  verb_concept_map(fire, V-SUBJ-OBJ, ShootingAGun,
                   subj, Person, doneBy, obj, Gun, deviceUsed)
    Ed fired the gun.

- Ed fired the goofball. (ambiguous between the two mappings; see the next slide)
12. Concept-role ambiguation
Ed fired the goofball.
- subconcept(Ed3 Person)
- role(fire_ev1 doneBy Ed3)
- context(t)
- instantiable(Ed3 t)
- instantiable(goofball2 t)
- instantiable(fire_ev1 t)
- A1: subconcept(goofball2 Person)
- A1: subconcept(fire_ev1 DischargeWithPrejudice)
- A1: role(fire_ev1 objectActedOn goofball2)
- A2: subconcept(goofball2 Gun)
- A2: subconcept(fire_ev1 ShootingAGun)
- A2: role(fire_ev1 deviceUsed goofball2)
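A rough Python sketch of how verb_concept_map templates could be applied to produce the A1/A2 alternatives above. This is illustrative only, not XLE's template machinery; the data structures and the disambiguate helper are assumptions made for this example.

# Minimal sketch of concept-role disambiguation.  Each verb_concept_map
# entry constrains the verb concept and the concepts of its arguments;
# applying every compatible entry to an ambiguous object like "goofball"
# yields one labeled alternative (A1, A2, ...) per mapping.

VERB_CONCEPT_MAP = [
    # (verb, verb_concept, subj_concept, subj_role, obj_concept, obj_role)
    ("fire", "DischargeWithPrejudice", "Agent-Generic", "doneBy", "Person", "objectActedOn"),
    ("fire", "ShootingAGun", "Person", "doneBy", "Gun", "deviceUsed"),
]

def disambiguate(verb, event, subj, obj, obj_concepts):
    """Return one fact set per mapping whose object concept is compatible
    with a possible concept of the object.  (The subject-concept
    constraint is ignored here for brevity.)"""
    alternatives = []
    for v, ev_c, subj_c, subj_r, obj_c, obj_r in VERB_CONCEPT_MAP:
        if v == verb and obj_c in obj_concepts:
            alternatives.append([
                f"subconcept({event} {ev_c})",
                f"role({event} {subj_r} {subj})",
                f"role({event} {obj_r} {obj})",
            ])
    return alternatives

# "Ed fired the goofball": the goofball could be a Person or (jokingly) a Gun.
for i, alt in enumerate(disambiguate("fire", "fire_ev1", "Ed3", "goofball2",
                                     {"Person", "Gun"}), start=1):
    print(f"A{i}:", alt)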
13. Embedded contexts: The man said that Ed fired a boy.
- subconcept(say_ev19 Inform-CommunicationAct)
- role(say_ev19 senderOfInfo man24)
- role(say_ev19 infoTransferred comp21)
- subconcept(fire_ev20 DischargeWithPrejudice)
- role(fire_ev20 objectActedOn boy22)
- role(fire_ev20 performedBy Ed23)
- context(t)
- context(comp21)
- context_lifting_rules(averidical t comp21)
- instantiable(man24 t)
- instantiable(say_ev19 t)
- instantiable(Ed23 t)
- instantiable(boy22 comp21)
- instantiable(fire_ev20 comp21)
14. Contextual Structure
- The senator prevented a war
  - role(preventRelation action1 ctx2)
  - role(doneBy action1 senator3)
  - subconcept(action1 Eventuality)
  - subconcept(senator3 USSenator)
  - subconcept(war4 WagingWar)
  - instantiable(t senator3)
  - instantiable(t action1)
  - uninstantiable(t war2)
  - uninstantiable(ctx2 action1)
  - instantiable(ctx2 war2)
- More details on contextual structure to follow
15. Abstract KR Canonicalization
- Ed cooled the room
  - subconcept(cool_ev18 scalar-state-change)
  - decreasesCausally(cool_ev18 room20 temperatureOfObject)
  - role(doneBy cool_ev18 Ed21)
  - subconcept(room20 RoomInAConstruction)
- Ed lowered the temperature of the room
  - subconcept(lower_ev22 scalar-state-change)
  - decreasesCausally(lower_ev22 room24 temperatureOfObject)
  - role(doneBy lower_ev22 Ed23)
  - subconcept(room24 RoomInAConstruction)
- The room cooled
  - subconcept(cool_ev25 scalar-state-change)
  - decreasesCausally(cool_ev25 room26 temperatureOfObject)
  - subconcept(room26 RoomInAConstruction)
16. Term-Rewriting for KR Mapping
- Rule forms
  - <Input terms> ==> <Output terms>  (obligatory)
  - <Input terms> ?=> <Output terms>  (optional; optional rules introduce new choices)
- Input patterns allow
  - Consuming a term if it matches: Term
  - Testing a term without consuming it: +Term
  - Testing that a term is missing: -Term
  - Procedural attachment: ProcedureCall
- Example rule (mapping hire):
  pred(E, hire), subj(E, X), obj(E, Y),
  subconcept(X, CX), subconcept(Y, CY),
  genls(CX, Organization), genls(CY, Person)
  ==>
  subconcept(E, EmployingEvent), performedBy(E, X), personEmployed(E, Y).
- Ordered rule application (a toy version is sketched below)
  - Rule1 is applied in all possible ways to the input to produce Output1
  - Rule2 is applied in all possible ways to Output1 to produce Output2
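A minimal Python sketch of ordered term rewriting over a flat set of facts, assuming tuple-encoded terms and "?"-prefixed variables. It is not the XLE transfer system, and for brevity each rule is applied only one way rather than in all possible ways, so no choice space is built.

# Minimal sketch of ordered term rewriting (illustrative only).  Facts and
# patterns are tuples; strings starting with "?" are variables.  Each rule
# consumes the facts matched by its left-hand side and adds its
# instantiated right-hand side; rules apply in order, each to the output
# of the previous one.

def is_var(x):
    return isinstance(x, str) and x.startswith("?")

def match(pattern, fact, env):
    if len(pattern) != len(fact):
        return None
    env = dict(env)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if env.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return env

def match_all(patterns, facts, env=None):
    """Match every pattern against a distinct fact; return (facts, env)."""
    env = {} if env is None else env
    if not patterns:
        return [], env
    for fact in facts:
        e = match(patterns[0], fact, env)
        if e is not None:
            rest = match_all(patterns[1:], facts - {fact}, e)
            if rest is not None:
                return [fact] + rest[0], rest[1]
    return None

def rewrite(rules, facts):
    for lhs, rhs in rules:                       # ordered rule application
        m = match_all(lhs, facts)
        if m is not None:
            consumed, env = m
            produced = {tuple(env.get(x, x) for x in term) for term in rhs}
            facts = (facts - set(consumed)) | produced
    return facts

# The "hire" rule from the slide, in this toy notation:
hire_rule = (
    [("pred", "?E", "hire"), ("subj", "?E", "?X"), ("obj", "?E", "?Y"),
     ("subconcept", "?X", "?CX"), ("subconcept", "?Y", "?CY"),
     ("genls", "?CX", "Organization"), ("genls", "?CY", "Person")],
    [("subconcept", "?E", "EmployingEvent"),
     ("performedBy", "?E", "?X"), ("personEmployed", "?E", "?Y")],
)

facts = {("pred", "e1", "hire"), ("subj", "e1", "x1"), ("obj", "e1", "y1"),
         ("subconcept", "x1", "FinancialInstitution"),
         ("subconcept", "y1", "MaleHuman"),
         ("genls", "FinancialInstitution", "Organization"),
         ("genls", "MaleHuman", "Person")}

print(rewrite([hire_rule], facts))
# -> only the three produced facts remain, since the rule consumed
#    everything it matched.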
17. Term Rewriting for Semantics
- Canonicalizing syntactic constructions
- undo passive
- resolve relative and null pronouns
- appositives
- comparatives
- Introduce contextual structure
- Flatten facts
- Semantic constructions
- negation
- coordination
18. Example semantics rules: Depassivisation
VTYPE(V, …), PASSIVE(V, +), SUBJ(V, LogicalOBJ)
  ==> OBJ(V, LogicalOBJ).
VTYPE(V, …), PASSIVE(V, +), OBL-AG(V, LogicalSUBJ), PFORM(LogicalSUBJ, …)
  ==> SUBJ(V, LogicalSUBJ).
VTYPE(V, …), PASSIVE(V, +), -SUBJ(V, …)
  ==> SUBJ(V, AgtPro), PRED(AgtPro, agent_pro), PRON-TYPE(AgtPro, null_agent).
(The first rule turns the surface subject into the logical object, the second promotes an agentive by-phrase to subject, and the third introduces a null agent pronoun when no overt agent is present.)
19. Term-rewriting can introduce ambiguity
- Alternative lexical mappings (permanent background facts mapping a predicate to a Cyc concept):
  - cyc_concept_map(bank, FinancialInstitution).
  - cyc_concept_map(bank, SideOfRiver).
- Rule:
  is_true_in(Ctx, P(Arg)), cyc_concept_map(P, Concept) ==> subconcept(Arg, Concept).
- Input term: is_true_in(c0, bank(b1))
- Alternative rule applications produce different outputs; the rewrite system represents this ambiguity by a new choice (sketched below)
- Output:
  - C1: subconcept(b1, FinancialInstitution)
  - C2: subconcept(b1, SideOfRiver)
  - (C1 xor C2) = 1
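A small Python sketch of packing the alternative outputs into a choice space. It is purely illustrative: the labels, the apply_ambiguous helper, and the tuple encoding are assumptions, not XLE's choice machinery.

# Minimal sketch of how alternative rule applications can be packed into a
# choice space.  Each alternative output is tagged with a choice label,
# and the labels in one ambiguity are related by exclusive disjunction.

from itertools import count

_labels = count(1)

def apply_ambiguous(fact, concepts):
    """One input fact plus several applicable mappings yields one labeled
    output per mapping, plus an xor constraint over the labels."""
    outputs, labels = [], []
    for concept in concepts:
        label = f"C{next(_labels)}"
        labels.append(label)
        outputs.append((label, ("subconcept", fact[1], concept)))
    return outputs, ("xor", tuple(labels))

cyc_concept_map = {"bank": ["FinancialInstitution", "SideOfRiver"]}
facts, constraint = apply_ambiguous(("bank", "b1"), cyc_concept_map["bank"])
print(facts)       # [('C1', ('subconcept', 'b1', 'FinancialInstitution')),
                   #  ('C2', ('subconcept', 'b1', 'SideOfRiver'))]
print(constraint)  # ('xor', ('C1', 'C2'))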
20. Term-rewriting can prune ill-formed mappings
- The bank hired Ed
- Rule for mapping hire:
  pred(E, hire), subj(E, X), obj(E, Y), subconcept(X, CX), subconcept(Y, CY),
  genls(CX, Organization), genls(CY, Person)
  ==> subconcept(E, EmployingEvent), performedBy(E, X), personEmployed(E, Y).
- From Cyc:
  - genls(FinancialInstitution, Organization) is true
  - genls(SideOfRiver, Organization) is false
- If bank is mapped to SideOfRiver, the rule will not fire. This leads to a failure to consume the subject.
- A later rule, subj(X, Y) ==> stop., prunes this analysis from the choice space.
- In general, later rewrites prune analyses that don't consume grammatical roles.
21. Answers from text: linguistic entailment and knowledge-based reasoning
- Linguistic context reflects author commitments
  - Did Iran enrich uranium?
    - Iran managed to enrich uranium. → Yes
    - Iran failed to enrich uranium. → No
    - Iran was able to enrich uranium. → Unknown
- Language interacts with world knowledge
  - Iran managed to enrich uranium.
  - Does Iran have a centrifuge? → Yes
    (Enriching uranium requires a centrifuge → Iran has a centrifuge)
22. Answering questions through linguistic entailment
1. Convert questions to declarative form (a toy converter is sketched below)
   - Did Iran enrich uranium? → Iran enriched uranium.
   - Who enriched uranium? → Some entity enriched uranium.
2. Select a candidate answer source (e.g. by search)
3. Compare the candidate and the declarative form
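A toy Python sketch of step 1, converting yes/no and who-questions to declarative form. This is string-level illustration only; the actual system works over f-structures, and the regular-expression patterns and naive past-tense rule below are assumptions.

# Minimal sketch of converting a question into a declarative form to match
# against candidate answer text (illustrative only).

import re

def to_declarative(question: str) -> str:
    q = question.rstrip("?").strip()
    m = re.match(r"(?i)did\s+(.+?)\s+(\w+)\s+(.+)", q)
    if m:
        subj, verb, rest = m.groups()
        return f"{subj} {verb}ed {rest}."       # naive past tense
    m = re.match(r"(?i)who\s+(.+)", q)
    if m:
        return f"Some entity {m.group(1)}."
    return q + "."

print(to_declarative("Did Iran enrich uranium?"))   # Iran enriched uranium.
print(to_declarative("Who enriched uranium?"))      # Some entity enriched uranium.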
23. Linguistic knowledge is necessary
Iran bought yellowcake from Niger.
- Relation of terms (WordNet, Cyc)
  - Did Iran purchase yellowcake? → Yes
- Relation of roles (KR rules, VerbNet, Cyc)
  - Was yellowcake sold to Iran? → Yes
- Implications and presuppositions (KR rules)
  - Did Iran fail to buy yellowcake? → No
24. Awareness of alternative interpretations
Iran didn't wait to buy yellowcake.
- Did Iran purchase yellowcake?
  - Unknown (ambiguous)
  - Yes, if they did the buying immediately
  - No, if they did something else instead (e.g. they stole it, they bought anthrax)
25. Access to world knowledge: Cyc
- KR rules translate to Cyc. Source: Iran enriched uranium.
  (and (isa P25 PurificationProcess)
       (isa U59 Uranium)
       (objectOfStateChange P25 U59)
       (performedBy P25 Iran))
- Question: Does Iran have a centrifuge?
  (exists ?E (and (isa ?E Centrifuge) (ownedBy Iran ?E)))
- Cyc knows:
  (implies (and (isa ?P PurificationProcess)
                (isa ?U Uranium)
                (objectOfStateChange ?P ?U)
                (performedBy ?P ?X))
           (exists ?C (and (isa ?C Centrifuge) (ownedBy ?X ?C))))
- Cyc deduces: Yes
26. NYT 2000: top of the chart
27. Verbs differ as to speaker commitments
- Bush realized that the US Army had to be transformed to meet new threats
  - The US Army had to be transformed to meet new threats
- Bush said that Khan sold centrifuges to North Korea
  - Khan sold centrifuges to North Korea
28. Implicative verbs carry commitments
- Commitment depends on the verb context
  - Positive (+) or negative (-)
- Speaker commits to a truth-value of the complement
  - True (+) or false (-)
29. ++/-- Implicative
- Positive context → positive commitment; negative context → negative commitment
- Ed managed to leave
  → Ed left
- Ed didn't manage to leave
  → Ed didn't leave
30. +-/-+ Implicative
- Positive context → negative commitment; negative context → positive commitment
- Ed forgot to leave
  → Ed didn't leave
- Ed didn't forget to leave
  → Ed left
31. ++ Implicative
- Positive context only → positive commitment
- Ann forced Ed to leave
  → Ed left
- Ann didn't force Ed to leave
  → Ed left / didn't leave (no commitment)
32. +- Implicative
- Positive context only → negative commitment
- Ed refused to leave
  → Ed didn't leave
- Ed didn't refuse to leave
  → Ed left / didn't leave (no commitment)
33. -+ Implicative
- Negative context only → positive commitment
- Ed hesitated to leave
  → Ed left / didn't leave (no commitment)
- Ed didn't hesitate to leave
  → Ed left
34. -- Implicative
- Negative context only → negative commitment
- Ed attempted to leave
  → Ed left / didn't leave (no commitment)
- Ed didn't attempt to leave
  → Ed didn't leave
35. + Factive
- Positive commitment no matter what the context
- Ann realized that Ed left / Ann didn't realize that Ed left
  → Ed left
36. - Factive
- Negative commitment no matter what the context
- Ann pretended that Ed left / Ann didn't pretend that Ed left
  → Ed didn't leave
- Like a +-/-- implicative, but with additional commitments
37. Coding of implicative verbs
- Implicative properties of complement-taking verbs
  - Not available in any external lexical resource
  - No obvious machine-learning strategy
- 1250 embedded-clause verbs in the PARC lexicon
  - 400 currently examined
  - Considered in BNC frequency order
  - Google search when intuitions are unclear
  - 1/3 are implicative, classified in the Unified Lexicon
38. Matching for implicative entailments
- Term rewriting promotes commitments in the Abstract KR
Ed managed to leave. (before promotion):
- context(cx_leave7)
- context(t)
- context_lifting_rules(veridical t cx_leave7)
- instantiable(Ed0 t)
- instantiable(leave_ev7 cx_leave7)
- instantiable(manage_ev3 t)
- role(objectMoving leave_ev7 Ed0)
- role(performedBy manage_ev3 Ed0)
- role(whatsSuccessful manage_ev3 cx_leave7)
Ed managed to leave. (after promotion):
- context(cx_leave7)
- context(t)
- instantiable(leave_ev7 t)
- instantiable(Ed0 t)
- instantiable(leave_ev7 cx_leave7)
- instantiable(manage_ev3 t)
- role(objectMoving leave_ev7 Ed0)
- role(performedBy manage_ev3 Ed0)
- role(whatsSuccessful manage_ev3 cx_leave7)
Ed left.:
- context(t)
- instantiable(Ed0 t)
- instantiable(leave_ev1 t)
- role(objectMoving leave_ev1 Ed0)
The promoted facts instantiable(leave_ev7 t) and role(objectMoving leave_ev7 Ed0) match the facts for "Ed left." up to renaming of the event.
39. Embedded examples in real text
- From Google:
  "Song, Seoul's point man, did not forget to persuade the North Koreans to make a strategic choice of returning to the bargaining table..."
  → Song persuaded the North Koreans
40. Promotion for a (simple) embedding
- Ed did not forget to force Dave to leave
  → Dave left. (a toy polarity computation is sketched below)
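A minimal Python sketch of how commitments could be propagated through stacked implicatives like this one. It is illustrative only: the SIGNATURES table and the commitment helper are assumptions, and the real system rewrites Abstract KR contexts rather than computing a single polarity.

# Minimal sketch of propagating commitments through stacked implicatives.
# A verb's signature maps the polarity of its own context to the polarity
# it commits the speaker to for its complement; None means no commitment.

SIGNATURES = {
    "forget": {"+": "-", "-": "+"},   # +-/-+ implicative
    "manage": {"+": "+", "-": "-"},   # ++/-- implicative
    "force":  {"+": "+"},             # ++ implicative (one-way)
    "refuse": {"+": "-"},             # +- implicative (one-way)
}

def commitment(verbs, negated=False):
    """Polarity committed to for the innermost complement, or None."""
    polarity = "-" if negated else "+"
    for verb in verbs:
        polarity = SIGNATURES.get(verb, {}).get(polarity)
        if polarity is None:
            return None               # commitment is lost at this level
    return polarity

# "Ed did not forget to force Dave to leave":
#   not    -> context polarity "-"
#   forget -> "-" maps to "+"
#   force  -> "+" maps to "+"
# so the speaker is committed to "Dave left".
print(commitment(["forget", "force"], negated=True))   # '+'
print(commitment(["forget", "force"], negated=False))  # None (force has no "-" entry)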
41. [Diagram: the structure of "Ed did not forget to force Dave to leave": not scopes over forget, whose comp is force (subj Ed, obj Dave), whose comp is leave (subj Dave).]
42. [Diagram: the same structure with implicative classes marked: forget is a +-/-+ implicative and force is a ++ implicative, with comp and subj/obj links to Ed and Dave.]
43. [Diagram: polarities propagated through the structure: the negation contributes a negative context, forget (+-/-+) maps it to a positive commitment, and force (++) passes it down to leave.]
44. Matches "Dave left."
[Diagram: after promotion, the embedded "Dave leave" substructure (subj Dave) matches the structure of "Dave left."]
45. DEMO
46. Grammatical Machine Translation
- Stefan Riezler and John Maxwell
47. Translation System
[Diagram: German input is parsed with XLE and the German LFG grammar into f-structures; transfer rules (plus lots of statistics) map these to English f-structures; XLE generation with the English LFG grammar produces the output string.]
48. Transfer-Rule Induction from aligned bilingual corpora
- Use standard techniques to find many-to-many candidate word alignments in source-target sentence pairs
- Parse the source and target sentences using LFG grammars for German and English
- Select the most similar f-structures in source and target
- Define many-to-many correspondences between substructures of the f-structures, based on the many-to-many word alignment
- Extract primitive transfer rules directly from aligned f-structure units (a toy extraction is sketched below)
- Create the powerset of possible combinations of basic rules and filter it according to contiguity and type-matching constraints
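A rough Python sketch of the primitive-rule extraction step, using toy f-structures for the example on the following slides. The dictionary encoding, the assumed node alignment, and the primitive_rules helper are illustrative assumptions, not the actual induction algorithm.

# Minimal sketch of extracting primitive transfer rules from aligned
# f-structures.  Each node carries a PRED and its governed grammatical
# functions; an aligned source/target node pair yields one primitive rule
# mapping the source snippet to the target snippet.

# Toy f-structures for "Dafür bin ich zutiefst dankbar." /
# "I have a deep appreciation for that."  Node ids and the ADJ/MOD
# attribute names are arbitrary; the alignment is a subset for brevity.
src = {1: {"PRED": "sein", "SUBJ": 2, "XCOMP": 3},
       2: {"PRED": "ich"},
       3: {"PRED": "dankbar", "ADJ": 4},
       4: {"PRED": "zutiefst"}}
tgt = {1: {"PRED": "have", "SUBJ": 2, "OBJ": 3},
       2: {"PRED": "I"},
       3: {"PRED": "appreciation", "MOD": 4},
       4: {"PRED": "deep"}}
alignment = [(1, 1), (2, 2)]   # assumed f-structure alignment

def primitive_rules(src, tgt, alignment):
    rules = []
    for s, t in alignment:
        lhs = [f"PRED(X{s}, {src[s]['PRED']})"] + \
              [f"{gf}(X{s}, X{v})" for gf, v in src[s].items() if gf != "PRED"]
        rhs = [f"PRED(X{t}, {tgt[t]['PRED']})"] + \
              [f"{gf}(X{t}, X{v})" for gf, v in tgt[t].items() if gf != "PRED"]
        rules.append((lhs, rhs))
    return rules

for lhs, rhs in primitive_rules(src, tgt, alignment):
    print(", ".join(lhs), "==>", ", ".join(rhs))
# PRED(X1, sein), SUBJ(X1, X2), XCOMP(X1, X3) ==> PRED(X1, have), SUBJ(X1, X2), OBJ(X1, X3)
# PRED(X2, ich) ==> PRED(X2, I)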
49. Induction
- Example sentences: Dafür bin ich zutiefst dankbar. / I have a deep appreciation for that.
- Many-to-many word alignment:
  Dafür{6,7} bin{2} ich{1} zutiefst{3,4,5} dankbar{5}
- F-structure alignment
50. Extracting Primitive Transfer Rules
- Rule (1) maps lexical predicates:
  (1) PRED(X1, ich) ==> PRED(X1, I)
- Rule (2) maps lexical predicates and interprets the subj-to-subj link as an indication to map the SUBJ of the source predicate to the SUBJ of the target, and the XCOMP of the source to the OBJ of the target
- X1, X2, X3, ... are variables over f-structures
  (2) PRED(X1, sein),
      SUBJ(X1, X2),
      XCOMP(X1, X3)
      ==>
      PRED(X1, have),
      SUBJ(X1, X2),
      OBJ(X1, X3)
51. Extracting Primitive Transfer Rules, cont.
- Primitive transfer rule (3) maps a single source f-structure into a target preposition-plus-object unit:
  (3) PRED(X1, dafür)
      ==>
      PRED(X1, for),
      OBJ(X1, X2),
      PRED(X2, that)
52. Extracting Complex Transfer Rules
- Complex rules are created by taking all combinations of primitive rules and filtering them:
  (4) sein zutiefst dankbar ==> have a deep appreciation
  (5) sein zutiefst dankbar dafür ==> have a deep appreciation for that
  (6) sein ich zutiefst dankbar dafür ==> have I a deep appreciation for that
53. Transfer Contiguity Constraint
- Transfer contiguity constraint:
  - The source and target f-structures each have to be connected
  - F-structures in the transfer source can only be aligned with f-structures in the transfer target, and vice versa (a toy check is sketched below)
- Analogous to the constraint on contiguous, alignment-consistent phrases in phrase-based SMT
- Prevents extraction of a rule that would translate dankbar directly into appreciation, since appreciation is also aligned to zutiefst
- Transfer contiguity allows learning idioms like "es gibt - there is" from configurations that are local in the f-structure but non-local in the string, e.g. "es scheint ... zu geben - there seems to be"
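A tiny Python sketch of the alignment-consistency part of the constraint. The alignment_consistent helper and the link set are assumptions for the dankbar/appreciation example; the connectedness check and the real rule-filtering pipeline are not shown.

# Minimal sketch of alignment consistency: a candidate rule pairs a set of
# source nodes with a set of target nodes and is rejected if any alignment
# link leaves the candidate on one side but not the other.

def alignment_consistent(src_nodes, tgt_nodes, links):
    """links: set of (source_node, target_node) alignment pairs."""
    for s, t in links:
        if (s in src_nodes) != (t in tgt_nodes):
            return False
    return True

# dankbar is aligned to appreciation, but zutiefst is aligned to
# appreciation as well, so {dankbar} -> {appreciation} is not allowed:
links = {("dankbar", "appreciation"), ("zutiefst", "appreciation"),
         ("zutiefst", "deep"), ("zutiefst", "a")}
print(alignment_consistent({"dankbar"}, {"appreciation"}, links))    # False
print(alignment_consistent({"dankbar", "zutiefst"},
                           {"a", "deep", "appreciation"}, links))    # True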
54. Linguistic Filters on Transfer Rules
- Morphological stemming of PRED values
- (Optional) filtering of f-structure snippets based on consistency of linguistic categories
  - Extracting a snippet that translates zutiefst dankbar into a deep appreciation maps incompatible categories (adjectival and nominal) that are valid in the string-based world
  - The translation of sein to have might be discarded because of the adjectival vs. nominal types of their arguments
  - The larger rule mapping sein zutiefst dankbar to have a deep appreciation is fine, since the verbal types match
55. Transfer
- Parallel application of transfer rules in a non-deterministic fashion
  - Unlike the XLE ordered-rule rewrite system
- Each fact must be transferred by exactly one rule
  - A default rule transfers any fact as itself (see the sketch below)
- Transfer works on a chart, using the parser's unification mechanism for consistency checking
- Selection of the most probable transfer output is done by beam decoding on the transfer chart
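A simplified Python sketch of the "each fact transferred by exactly one rule, with an identity default" regime. It is greedy and deterministic, unlike the chart-based, non-deterministic transfer described above; the rule and fact encodings are assumptions for the running example.

# Minimal sketch of fact coverage in transfer.  rules: list of
# (set_of_source_facts, list_of_target_facts).  Any fact not covered by
# some rule is copied unchanged by the default rule.

def transfer(facts, rules):
    remaining, output = set(facts), []
    for covered, produced in rules:
        if covered <= remaining:          # rule applies to uncovered facts
            remaining -= covered
            output.extend(produced)
    output.extend(sorted(remaining))      # default rule: fact ==> fact
    return output

rules = [({"PRED(X1, sein)", "SUBJ(X1, X2)", "XCOMP(X1, X3)"},
          ["PRED(X1, have)", "SUBJ(X1, X2)", "OBJ(X1, X3)"]),
         ({"PRED(X2, ich)"}, ["PRED(X2, I)"])]
facts = ["PRED(X1, sein)", "SUBJ(X1, X2)", "XCOMP(X1, X3)",
         "PRED(X2, ich)", "TENSE(X1, pres)"]
print(transfer(facts, rules))
# ['PRED(X1, have)', 'SUBJ(X1, X2)', 'OBJ(X1, X3)', 'PRED(X2, I)', 'TENSE(X1, pres)']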
56. Generation
- Bi-directionality allows the same grammar to be used for parsing the training data and for generation in the translation application
- The generator has to be fault-tolerant in cases where the transfer system operates on a FRAGMENT parse or produces invalid f-structures from valid input f-structures
- Robust generation from unknown (e.g., untranslated) predicates and from unknown f-structures
57. Robust Generation
- Generation from unknown predicates
  - An unknown German word such as Hunde is analyzed by the German grammar to extract a stem and features (e.g., PRED Hund, NUM pl) and is then inflected using English default morphology (Hunds)
- Generation from unknown constructions
  - A default grammar that allows any attribute to be generated in any order is mixed in as a suboptimal option in the standard English grammar; e.g., if a SUBJ cannot be generated as a sentence-initial NP, it will be generated in any position as any category
  - An extension/combination of set-gen-adds and OT ranking
58. Statistical Models
- Log-probability of source-to-target transfer rules, where the probability r(e | f) of a rule that transfers source snippet f into target snippet e is estimated by relative frequency
- Log-probability of target-to-source transfer rules, also estimated by relative frequency
59. Statistical Models, cont.
- Log-probability of lexical translations l(e | f) from source to target snippets, estimated from Viterbi alignments a between source word positions i = 1, ..., n and target word positions j = 1, ..., m for stems f_i and e_j in snippets f and e, with relative word-translation frequencies t(e_j | f_i)
- Log-probability of lexical translations from target to source snippets
60. Statistical Models, cont.
- Number of transfer rules
- Number of transfer rules with frequency 1
- Number of default transfer rules
- Log-probability of strings of predicates from the root to the frontier of the target f-structure, estimated from predicate trigrams in English f-structures
- Number of predicates in the target f-structure
- Number of constituent movements during generation, based on the original order of the head predicates of the constituents
61. Statistical Models, cont.
- Number of generation repairs
- Log-probability of the target string, as computed by a trigram language model
- Number of words in the target string
(A sketch of combining these features is given below.)
62. Experimental Evaluation
- Experimental setup
  - German-to-English on the Europarl parallel corpus (Koehn 02)
  - Training and evaluation on sentences of length 5-15, for quick experimental turnaround
  - Resulting in a training set of 163,141 sentences, a development set of 1,967 sentences, and a test set of 1,755 sentences (as used in Koehn et al. HLT03)
  - Improved bidirectional word alignment based on GIZA++ (Och et al. EMNLP99)
  - LFG grammars for German and English (Butt et al. COLING02; Riezler et al. ACL02)
  - SRI trigram language model (Stolcke 02)
- Comparison with PHARAOH (Koehn et al. HLT03) and IBM Model 4 as produced by GIZA++ (Och et al. EMNLP99)
63. Experimental Evaluation, cont.
- Around 700,000 transfer rules extracted from f-structures chosen by a dependency similarity measure
- The system operates on n-best lists of parses (n = 1), transferred f-structures (n = 10), and generated strings (n = 1,000)
- Selection of the most probable translations in two steps:
  - The most probable f-structure is chosen by beam search (n = 20) on the transfer chart, using features 1-10
  - The most probable string is selected from the strings generated from the selected n-best f-structures, using features 11-13
- Feature weights for the two modules are trained by minimum error rate (MER) training on 750 in-coverage sentences of the development set
64. Automatic Evaluation
- NIST scores (ignoring punctuation); Approximate Randomization for significance testing (see above)
- 44% of sentences in coverage of the grammars; 51% FRAGMENT parses and/or generation repairs; 5% timeouts
- In coverage: the difference between the LFG system and PHARAOH is not significant
- Suboptimal robustness techniques decrease overall quality
65. Manual Evaluation
- Closer look at in-coverage examples
- Random selection of 500 in-coverage examples
- Two independent judges indicated a preference for LFG or PHARAOH, or equality, in a blind test
- Separate evaluation under the criteria of grammaticality/fluency and translational/semantic adequacy
- Significance assessed by Approximate Randomization via stratified shuffling of preference ratings between systems
66. Manual Evaluation, cont.
- Result: differences on agreed-on ratings are statistically significant at p < 0.0001
- The net improvement in translational adequacy on agreed-on examples is 11.4% of the 500 sentences (57/500), amounting to a 5% overall improvement for the hybrid system (44% of 11.4%)
- The net improvement in grammaticality on agreed-on examples is 15.4% of the 500 sentences, amounting to a 6.7% overall improvement for the hybrid system
67. Examples: LFG > PHARAOH
- src: in diesem fall werde ich meine verantwortung wahrnehmen
- ref: then i will exercise my responsibility
- LFG: in this case i accept my responsibility
- P: in this case i shall my responsibilities

- src: die politische stabilität hängt ab von der besserung der lebensbedingungen
- ref: political stability depends upon the improvement of living conditions
- LFG: the political stability hinges on the recovery the conditions
- P: the political stability is rejects the recovery of the living conditions
68. Examples: PHARAOH > LFG
- src: das ist schon eine seltsame vorstellung von gleichheit
- ref: a strange notion of equality
- LFG: equality that is even a strange idea
- P: this is already a strange idea of equality

- src: frau präsidentin ich beglückwünsche herrn nicholson zu seinem ausgezeichneten bericht
- ref: madam president I congratulate mr nicholson on his excellent report
- LFG: madam president I congratulate mister nicholson on his report excellented
- P: madam president I congratulate mr nicholson for his excellent report
69. Discussion
- High percentage of out-of-coverage examples
  - Accumulation of 2 x 20% error rates in parsing the training data
  - Errors in rule extraction
  - Together these result in ill-formed transfer rules, causing a high number of generation failures/repairs
- Propagation of errors through the system also affects in-coverage examples
  - Error analysis: 69% transfer errors, 10% due to parse errors
- Discrepancy between NIST and manual evaluation
- Suboptimal integration of the generator, making training and translation with large n-best lists infeasible
- Language and distortion models applied only after generation
70. Conclusion
- Integrating a grammar-based generator into a dependency-based SMT system achieves state-of-the-art NIST scores and improved grammaticality and adequacy on in-coverage examples
- A hybrid system is possible, since it is determinable when sentences are within the coverage of the system
71. Applications: Conclusions
- Use large-scale robust LFG grammars as the base for deep linguistic processing
- The ordered rewrite system provides an integrated method to manipulate output and incorporate external resources
- Stochastic and shallow techniques can be combined with deep processing