Title: Linguistics 187/287 Week 9
1. Linguistics 187/287 Week 9
Question answering, knowledge mapping, translation
- Ron Kaplan and Tracy King
2. Question Answering
- Unindexed documents in an information repository
  - Example: technical manuals, business procedures
- Find specific facts in documents to answer specific questions
  - How do I make the paperclip go away?
  - Who authorizes foreign travel?
- Documents may not contain the exact words
  - The Help Assistant may be closed by clicking the Option menu
- Exact words may be misleading
  - Will Gore raise taxes?
  - Bush promised Gore to raise taxes: promise(Bush, Gore, raise(Bush, taxes))
  - Bush persuaded Gore to raise taxes: persuade(Bush, Gore, raise(Gore, taxes))
- XLE: check for overlapping meanings
3. Pipeline for Meaning Overlap
[Diagram: the document collection and the question are each parsed with the English grammar, their semantics and meanings are computed, and an overlap detector compares the two meanings.]
- High precision and recall, but heavy computation
4. A lighter-weight approach to QA
- Analyze the question, then anticipate and search for possible answer phrases (question → f-structure → queries)
- Question: What is the graph partitioning problem?
  - Generated queries: "The graph partitioning problem is ..."
  - Answer (Google): "The graph partitioning problem is defined as dividing a graph into disjoint subsets of nodes"
- Question: When were the Rolling Stones formed?
  - Generated queries: "The Rolling Stones were formed ...", "... formed the Rolling Stones"
  - Answer (Google): "Mick Jagger, Keith Richards, Brian Jones, Bill Wyman, and Charlie Watts formed the Rolling Stones in 1962."
5. Pipeline for Answer Anticipation
[Diagram: the question is parsed with the English grammar into question f-structures, which are converted into answer f-structures; the generator (using the same English grammar) produces answer phrases, which are sent to a search engine (Google, ...).]
6. Who won the oscar in 1968?
?X. win(X, oscar) ∧ in(win, 1968)
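The answer-anticipation idea on the preceding slides can be illustrated with a small sketch. This is a minimal Python illustration, not the XLE pipeline: the real system converts question f-structures into answer f-structures and generates from them, whereas the string patterns and the anticipate helper below are invented purely for this example.

# Minimal sketch of answer-phrase anticipation (not the XLE pipeline).
# A wh-question is turned into declarative search strings that a likely
# answer sentence would contain; the patterns below are illustrative only.

def anticipate(question: str) -> list[str]:
    q = question.rstrip("?").strip()
    words = q.split()
    queries = []
    wh = words[0].lower()
    if wh == "what" and words[1].lower() in {"is", "are"}:
        # "What is X?" -> "X is ..."
        rest = " ".join(words[2:])
        queries.append(f"{rest} {words[1].lower()}")
    elif wh == "when" and words[1].lower() in {"was", "were"}:
        # "When were X formed?" -> "X were formed" and "formed X"
        rest, verb = words[2:-1], words[-1]
        queries.append(f"{' '.join(rest)} {words[1].lower()} {verb}")
        queries.append(f"{verb} {' '.join(rest)}")
    return queries

print(anticipate("What is the graph partitioning problem?"))
# ['the graph partitioning problem is']
print(anticipate("When were the Rolling Stones formed?"))
# ['the Rolling Stones were formed', 'formed the Rolling Stones']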
7. Mapping to knowledge representations via semantics
8. Knowledge Mapping Architecture
[Diagram of the pipeline: text → word structure and entity recognition → XLE/LFG parsing (LFG grammar and lexicon) → c-structure and f-structure → glue derivation with the semantic lexicon (VerbNet, WordNet) → linguistic semantics → term rewriting with mapping rules (Cyc) → Abstract KR → target KRR (CycL, AnsProlog).]
9. Abstract Knowledge Representation
- Encode different aspects of meaning
  - Asserted content: concepts and arguments, relations among objects
  - Contexts: author commitment, belief, report, denial, prevent, ...
  - Temporal relations: qualitative relations among time intervals and events
- Translate to various target KRs, e.g. CycL, Description Logic, AnsProlog
- Capture meaning ambiguity
10. Abstract KR: Ed fired the boy.
Conceptual:
- subconcept(Ed3 Person)
- subconcept(boy2 MaleChild)
- subconcept(fire_ev1 DischargeWithPrejudice)
- role(fire_ev1 performedBy Ed3)
- role(fire_ev1 objectActedOn boy2)
Contextual:
- context(t)
- instantiable(Ed3 t)
- instantiable(boy2 t)
- instantiable(fire_ev1 t)
Temporal:
- temporalRel(startsAfterEndingOf Now fire_ev1)
11. Concept-role disambiguation
- Mapping to conceptual structure
- The ontology constrains the verb concept and the argument roles
- Templates for rewriting rules:

  verb_concept_map(fire, V-SUBJ-OBJ, DischargeWithPrejudice,
                   subj, Agent-Generic, doneBy, obj, Person, objectActedOn)
    Ed fired the boy.

  verb_concept_map(fire, V-SUBJ-OBJ, ShootingAGun,
                   subj, Person, doneBy, obj, Gun, deviceUsed)
    Ed fired the gun.

- Ed fired the goofball. (ambiguous between the two mappings; see the next slide)
12. Concept-role ambiguation
Ed fired the goofball.
- subconcept(Ed3 Person)
- role(fire_ev1 doneBy Ed3)
- context(t)
- instantiable(Ed3 t)
- instantiable(goofball2 t)
- instantiable(fire_ev1 t)
- A1: subconcept(goofball2 Person)
- A1: subconcept(fire_ev1 DischargeWithPrejudice)
- A1: role(fire_ev1 objectActedOn goofball2)
- A2: subconcept(goofball2 Gun)
- A2: subconcept(fire_ev1 ShootingAGun)
- A2: role(fire_ev1 deviceUsed goofball2)
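A rough Python sketch of how verb_concept_map templates could be applied to produce the A1/A2 alternatives above. This is illustrative only, not XLE's template machinery; the data structures and the disambiguate helper are assumptions made for this example.

# Minimal sketch of concept-role disambiguation.  Each verb_concept_map
# entry constrains the verb concept and the concepts of its arguments;
# applying every compatible entry to an ambiguous object like "goofball"
# yields one labeled alternative (A1, A2, ...) per mapping.

VERB_CONCEPT_MAP = [
    # (verb, verb_concept, subj_concept, subj_role, obj_concept, obj_role)
    ("fire", "DischargeWithPrejudice", "Agent-Generic", "doneBy", "Person", "objectActedOn"),
    ("fire", "ShootingAGun", "Person", "doneBy", "Gun", "deviceUsed"),
]

def disambiguate(verb, event, subj, obj, obj_concepts):
    """Return one fact set per mapping whose object concept is compatible
    with a possible concept of the object.  (The subject-concept
    constraint is ignored here for brevity.)"""
    alternatives = []
    for v, ev_c, subj_c, subj_r, obj_c, obj_r in VERB_CONCEPT_MAP:
        if v == verb and obj_c in obj_concepts:
            alternatives.append([
                f"subconcept({event} {ev_c})",
                f"role({event} {subj_r} {subj})",
                f"role({event} {obj_r} {obj})",
            ])
    return alternatives

# "Ed fired the goofball": the goofball could be a Person or (jokingly) a Gun.
for i, alt in enumerate(disambiguate("fire", "fire_ev1", "Ed3", "goofball2",
                                     {"Person", "Gun"}), start=1):
    print(f"A{i}:", alt)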
13. Embedded contexts: The man said that Ed fired a boy.
- subconcept(say_ev19 Inform-CommunicationAct)
- role(say_ev19 senderOfInfo man24)
- role(say_ev19 infoTransferred comp21)
- subconcept(fire_ev20 DischargeWithPrejudice)
- role(fire_ev20 objectActedOn boy22)
- role(fire_ev20 performedBy Ed23)
- context(t)
- context(comp21)
- context_lifting_rules(averidical t comp21)
- instantiable(man24 t)
- instantiable(say_ev19 t)
- instantiable(Ed23 t)
- instantiable(boy22 comp21)
- instantiable(fire_ev20 comp21)
14. Contextual Structure
- The senator prevented a war
  - role(preventRelation action1 ctx2)
  - role(doneBy action1 senator3)
  - subconcept(action1 Eventuality)
  - subconcept(senator3 USSenator)
  - subconcept(war4 WagingWar)
  - instantiable(t senator3)
  - instantiable(t action1)
  - uninstantiable(t war2)
  - uninstantiable(ctx2 action1)
  - instantiable(ctx2 war2)
- More details on contextual structure to follow
15. Abstract KR Canonicalization
- Ed cooled the room
  - subconcept(cool_ev18 scalar-state-change)
  - decreasesCausally(cool_ev18 room20 temperatureOfObject)
  - role(doneBy cool_ev18 Ed21)
  - subconcept(room20 RoomInAConstruction)
- Ed lowered the temperature of the room
  - subconcept(lower_ev22 scalar-state-change)
  - decreasesCausally(lower_ev22 room24 temperatureOfObject)
  - role(doneBy lower_ev22 Ed23)
  - subconcept(room24 RoomInAConstruction)
- The room cooled
  - subconcept(cool_ev25 scalar-state-change)
  - decreasesCausally(cool_ev25 room26 temperatureOfObject)
  - subconcept(room26 RoomInAConstruction)
16. Term-Rewriting for KR Mapping
- Rule forms
  - <Input terms> ==> <Output terms>  (obligatory)
  - <Input terms> ?=> <Output terms>  (optional; optional rules introduce new choices)
- Input patterns allow
  - Consuming a term if it matches: Term
  - Testing a term without consuming it: +Term
  - Testing that a term is missing: -Term
  - Procedural attachment: ProcedureCall
- Example rule (mapping hire):
  pred(E, hire), subj(E, X), obj(E, Y),
  subconcept(X, CX), subconcept(Y, CY),
  genls(CX, Organization), genls(CY, Person)
  ==>
  subconcept(E, EmployingEvent), performedBy(E, X), personEmployed(E, Y).
- Ordered rule application (a toy version is sketched below)
  - Rule1 is applied in all possible ways to the input to produce Output1
  - Rule2 is applied in all possible ways to Output1 to produce Output2
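A minimal Python sketch of ordered term rewriting over a flat set of facts, assuming tuple-encoded terms and "?"-prefixed variables. It is not the XLE transfer system, and for brevity each rule is applied only one way rather than in all possible ways, so no choice space is built.

# Minimal sketch of ordered term rewriting (illustrative only).  Facts and
# patterns are tuples; strings starting with "?" are variables.  Each rule
# consumes the facts matched by its left-hand side and adds its
# instantiated right-hand side; rules apply in order, each to the output
# of the previous one.

def is_var(x):
    return isinstance(x, str) and x.startswith("?")

def match(pattern, fact, env):
    if len(pattern) != len(fact):
        return None
    env = dict(env)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if env.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return env

def match_all(patterns, facts, env=None):
    """Match every pattern against a distinct fact; return (facts, env)."""
    env = {} if env is None else env
    if not patterns:
        return [], env
    for fact in facts:
        e = match(patterns[0], fact, env)
        if e is not None:
            rest = match_all(patterns[1:], facts - {fact}, e)
            if rest is not None:
                return [fact] + rest[0], rest[1]
    return None

def rewrite(rules, facts):
    for lhs, rhs in rules:                       # ordered rule application
        m = match_all(lhs, facts)
        if m is not None:
            consumed, env = m
            produced = {tuple(env.get(x, x) for x in term) for term in rhs}
            facts = (facts - set(consumed)) | produced
    return facts

# The "hire" rule from the slide, in this toy notation:
hire_rule = (
    [("pred", "?E", "hire"), ("subj", "?E", "?X"), ("obj", "?E", "?Y"),
     ("subconcept", "?X", "?CX"), ("subconcept", "?Y", "?CY"),
     ("genls", "?CX", "Organization"), ("genls", "?CY", "Person")],
    [("subconcept", "?E", "EmployingEvent"),
     ("performedBy", "?E", "?X"), ("personEmployed", "?E", "?Y")],
)

facts = {("pred", "e1", "hire"), ("subj", "e1", "x1"), ("obj", "e1", "y1"),
         ("subconcept", "x1", "FinancialInstitution"),
         ("subconcept", "y1", "MaleHuman"),
         ("genls", "FinancialInstitution", "Organization"),
         ("genls", "MaleHuman", "Person")}

print(rewrite([hire_rule], facts))
# -> only the three produced facts remain, since the rule consumed
#    everything it matched.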
17. Term Rewriting for Semantics
- Canonicalizing syntactic constructions
- undo passive
- resolve relative and null pronouns
- appositives
- comparatives
- Introduce contextual structure
- Flatten facts
- Semantic constructions
- negation
- coordination
18. Example semantics rules: Depassivisation
VTYPE(V, …), PASSIVE(V, +), SUBJ(V, LogicalOBJ)
  ==> OBJ(V, LogicalOBJ).
VTYPE(V, …), PASSIVE(V, +), OBL-AG(V, LogicalSUBJ), PFORM(LogicalSUBJ, …)
  ==> SUBJ(V, LogicalSUBJ).
VTYPE(V, …), PASSIVE(V, +), -SUBJ(V, …)
  ==> SUBJ(V, AgtPro), PRED(AgtPro, agent_pro), PRON-TYPE(AgtPro, null_agent).
(The first rule turns the surface subject into the logical object, the second promotes an agentive by-phrase to subject, and the third introduces a null agent pronoun when no overt agent is present.)
19. Term-rewriting can introduce ambiguity
- Alternative lexical mappings (permanent background facts mapping a predicate to a Cyc concept):
  - cyc_concept_map(bank, FinancialInstitution).
  - cyc_concept_map(bank, SideOfRiver).
- Rule:
  is_true_in(Ctx, P(Arg)), cyc_concept_map(P, Concept) ==> subconcept(Arg, Concept).
- Input term: is_true_in(c0, bank(b1))
- Alternative rule applications produce different outputs; the rewrite system represents this ambiguity by a new choice (sketched below)
- Output:
  - C1: subconcept(b1, FinancialInstitution)
  - C2: subconcept(b1, SideOfRiver)
  - (C1 xor C2) = 1
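A small Python sketch of packing the alternative outputs into a choice space. It is purely illustrative: the labels, the apply_ambiguous helper, and the tuple encoding are assumptions, not XLE's choice machinery.

# Minimal sketch of how alternative rule applications can be packed into a
# choice space.  Each alternative output is tagged with a choice label,
# and the labels in one ambiguity are related by exclusive disjunction.

from itertools import count

_labels = count(1)

def apply_ambiguous(fact, concepts):
    """One input fact plus several applicable mappings yields one labeled
    output per mapping, plus an xor constraint over the labels."""
    outputs, labels = [], []
    for concept in concepts:
        label = f"C{next(_labels)}"
        labels.append(label)
        outputs.append((label, ("subconcept", fact[1], concept)))
    return outputs, ("xor", tuple(labels))

cyc_concept_map = {"bank": ["FinancialInstitution", "SideOfRiver"]}
facts, constraint = apply_ambiguous(("bank", "b1"), cyc_concept_map["bank"])
print(facts)       # [('C1', ('subconcept', 'b1', 'FinancialInstitution')),
                   #  ('C2', ('subconcept', 'b1', 'SideOfRiver'))]
print(constraint)  # ('xor', ('C1', 'C2'))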
20. Term-rewriting can prune ill-formed mappings
- The bank hired Ed
- Rule for mapping hire:
  pred(E, hire), subj(E, X), obj(E, Y), subconcept(X, CX), subconcept(Y, CY),
  genls(CX, Organization), genls(CY, Person)
  ==> subconcept(E, EmployingEvent), performedBy(E, X), personEmployed(E, Y).
- From Cyc:
  - genls(FinancialInstitution, Organization) is true
  - genls(SideOfRiver, Organization) is false
- If bank is mapped to SideOfRiver, the rule will not fire. This leads to a failure to consume the subject.
- A later rule, subj(X, Y) ==> stop., prunes this analysis from the choice space.
- In general, later rewrites prune analyses that don't consume grammatical roles.
21. Answers from text: linguistic entailment and knowledge-based reasoning
- Linguistic context reflects author commitments
  - Did Iran enrich uranium?
    - Iran managed to enrich uranium. → Yes
    - Iran failed to enrich uranium. → No
    - Iran was able to enrich uranium. → Unknown
- Language interacts with world knowledge
  - Iran managed to enrich uranium.
  - Does Iran have a centrifuge? → Yes
    (Enriching uranium requires a centrifuge → Iran has a centrifuge)
22. Answering questions through linguistic entailment
1. Convert questions to declarative form (a toy converter is sketched below)
   - Did Iran enrich uranium? → Iran enriched uranium.
   - Who enriched uranium? → Some entity enriched uranium.
2. Select a candidate answer source (e.g. by search)
3. Compare the candidate and the declarative form
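A toy Python sketch of step 1, converting yes/no and who-questions to declarative form. This is string-level illustration only; the actual system works over f-structures, and the regular-expression patterns and naive past-tense rule below are assumptions.

# Minimal sketch of converting a question into a declarative form to match
# against candidate answer text (illustrative only).

import re

def to_declarative(question: str) -> str:
    q = question.rstrip("?").strip()
    m = re.match(r"(?i)did\s+(.+?)\s+(\w+)\s+(.+)", q)
    if m:
        subj, verb, rest = m.groups()
        return f"{subj} {verb}ed {rest}."       # naive past tense
    m = re.match(r"(?i)who\s+(.+)", q)
    if m:
        return f"Some entity {m.group(1)}."
    return q + "."

print(to_declarative("Did Iran enrich uranium?"))   # Iran enriched uranium.
print(to_declarative("Who enriched uranium?"))      # Some entity enriched uranium.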
23. Linguistic knowledge is necessary
Iran bought yellowcake from Niger.
- Relation of terms (WordNet, Cyc)
  - Did Iran purchase yellowcake? → Yes
- Relation of roles (KR rules, VerbNet, Cyc)
  - Was yellowcake sold to Iran? → Yes
- Implications and presuppositions (KR rules)
  - Did Iran fail to buy yellowcake? → No
24. Awareness of alternative interpretations
Iran didn't wait to buy yellowcake.
- Did Iran purchase yellowcake?
  - Unknown (ambiguous)
  - Yes, if they did the buying immediately
  - No, if they did something else instead (e.g. they stole it, they bought anthrax)
25. Access to world knowledge: Cyc
- KR rules translate to Cyc. Source: Iran enriched uranium.
  (and (isa P25 PurificationProcess)
       (isa U59 Uranium)
       (objectOfStateChange P25 U59)
       (performedBy P25 Iran))
- Question: Does Iran have a centrifuge?
  (exists ?E (and (isa ?E Centrifuge) (ownedBy Iran ?E)))
- Cyc knows:
  (implies (and (isa ?P PurificationProcess)
                (isa ?U Uranium)
                (objectOfStateChange ?P ?U)
                (performedBy ?P ?X))
           (exists ?C (and (isa ?C Centrifuge) (ownedBy ?X ?C))))
- Cyc deduces: Yes
26. NYT 2000: top of the chart
27. Verbs differ as to speaker commitments
- Bush realized that the US Army had to be transformed to meet new threats
  - The US Army had to be transformed to meet new threats
- Bush said that Khan sold centrifuges to North Korea
  - Khan sold centrifuges to North Korea
28. Implicative verbs carry commitments
- Commitment depends on the verb context
  - Positive (+) or negative (-)
- Speaker commits to a truth-value of the complement
  - True (+) or false (-)
29. ++/-- Implicative
- Positive context → positive commitment; negative context → negative commitment
- Ed managed to leave
  → Ed left
- Ed didn't manage to leave
  → Ed didn't leave
30. +-/-+ Implicative
- Positive context → negative commitment; negative context → positive commitment
- Ed forgot to leave
  → Ed didn't leave
- Ed didn't forget to leave
  → Ed left
31. ++ Implicative
- Positive context only → positive commitment
- Ann forced Ed to leave
  → Ed left
- Ann didn't force Ed to leave
  → Ed left / didn't leave (no commitment)
32. +- Implicative
- Positive context only → negative commitment
- Ed refused to leave
  → Ed didn't leave
- Ed didn't refuse to leave
  → Ed left / didn't leave (no commitment)
33. -+ Implicative
- Negative context only → positive commitment
- Ed hesitated to leave
  → Ed left / didn't leave (no commitment)
- Ed didn't hesitate to leave
  → Ed left
34. -- Implicative
- Negative context only → negative commitment
- Ed attempted to leave
  → Ed left / didn't leave (no commitment)
- Ed didn't attempt to leave
  → Ed didn't leave
35. + Factive
- Positive commitment no matter what the context
- Ann realized that Ed left / Ann didn't realize that Ed left
  → Ed left
36. - Factive
- Negative commitment no matter what the context
- Ann pretended that Ed left / Ann didn't pretend that Ed left
  → Ed didn't leave
- Like a +-/-- implicative, but with additional commitments
37. Coding of implicative verbs
- Implicative properties of complement-taking verbs
  - Not available in any external lexical resource
  - No obvious machine-learning strategy
- 1250 embedded-clause verbs in the PARC lexicon
  - 400 currently examined
  - Considered in BNC frequency order
  - Google search when intuitions are unclear
  - 1/3 are implicative, classified in the Unified Lexicon
38. Matching for implicative entailments
- Term rewriting promotes commitments in the Abstract KR
Ed managed to leave. (before promotion):
- context(cx_leave7)
- context(t)
- context_lifting_rules(veridical t cx_leave7)
- instantiable(Ed0 t)
- instantiable(leave_ev7 cx_leave7)
- instantiable(manage_ev3 t)
- role(objectMoving leave_ev7 Ed0)
- role(performedBy manage_ev3 Ed0)
- role(whatsSuccessful manage_ev3 cx_leave7)
Ed managed to leave. (after promotion):
- context(cx_leave7)
- context(t)
- instantiable(leave_ev7 t)
- instantiable(Ed0 t)
- instantiable(leave_ev7 cx_leave7)
- instantiable(manage_ev3 t)
- role(objectMoving leave_ev7 Ed0)
- role(performedBy manage_ev3 Ed0)
- role(whatsSuccessful manage_ev3 cx_leave7)
Ed left.:
- context(t)
- instantiable(Ed0 t)
- instantiable(leave_ev1 t)
- role(objectMoving leave_ev1 Ed0)
The promoted facts instantiable(leave_ev7 t) and role(objectMoving leave_ev7 Ed0) match the facts for "Ed left." up to renaming of the event.
39. Embedded examples in real text
- From Google:
  "Song, Seoul's point man, did not forget to persuade the North Koreans to make a strategic choice of returning to the bargaining table..."
  → Song persuaded the North Koreans
40. Promotion for a (simple) embedding
- Ed did not forget to force Dave to leave
  → Dave left. (a toy polarity computation is sketched below)
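A minimal Python sketch of how commitments could be propagated through stacked implicatives like this one. It is illustrative only: the SIGNATURES table and the commitment helper are assumptions, and the real system rewrites Abstract KR contexts rather than computing a single polarity.

# Minimal sketch of propagating commitments through stacked implicatives.
# A verb's signature maps the polarity of its own context to the polarity
# it commits the speaker to for its complement; None means no commitment.

SIGNATURES = {
    "forget": {"+": "-", "-": "+"},   # +-/-+ implicative
    "manage": {"+": "+", "-": "-"},   # ++/-- implicative
    "force":  {"+": "+"},             # ++ implicative (one-way)
    "refuse": {"+": "-"},             # +- implicative (one-way)
}

def commitment(verbs, negated=False):
    """Polarity committed to for the innermost complement, or None."""
    polarity = "-" if negated else "+"
    for verb in verbs:
        polarity = SIGNATURES.get(verb, {}).get(polarity)
        if polarity is None:
            return None               # commitment is lost at this level
    return polarity

# "Ed did not forget to force Dave to leave":
#   not    -> context polarity "-"
#   forget -> "-" maps to "+"
#   force  -> "+" maps to "+"
# so the speaker is committed to "Dave left".
print(commitment(["forget", "force"], negated=True))   # '+'
print(commitment(["forget", "force"], negated=False))  # None (force has no "-" entry)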
41. [Diagram: the structure of "Ed did not forget to force Dave to leave": not scopes over forget, whose comp is force (subj Ed, obj Dave), whose comp is leave (subj Dave).]
42. [Diagram: the same structure with implicative classes marked: forget is a +-/-+ implicative and force is a ++ implicative, with comp and subj/obj links to Ed and Dave.]
43. [Diagram: polarities propagated through the structure: the negation contributes a negative context, forget (+-/-+) maps it to a positive commitment, and force (++) passes it down to leave.]
44. Matches "Dave left."
[Diagram: after promotion, the embedded "Dave leave" substructure (subj Dave) matches the structure of "Dave left."]
45. DEMO
46. Grammatical Machine Translation
- Stefan Riezler and John Maxwell
47. Translation System
[Diagram: German input is parsed with XLE and the German LFG grammar into f-structures; transfer rules (plus lots of statistics) map these to English f-structures; XLE generation with the English LFG grammar produces the output string.]
48. Transfer-Rule Induction from aligned bilingual corpora
- Use standard techniques to find many-to-many candidate word alignments in source-target sentence pairs
- Parse the source and target sentences using LFG grammars for German and English
- Select the most similar f-structures in source and target
- Define many-to-many correspondences between substructures of the f-structures, based on the many-to-many word alignment
- Extract primitive transfer rules directly from aligned f-structure units (a toy extraction is sketched below)
- Create the powerset of possible combinations of basic rules and filter it according to contiguity and type-matching constraints
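A rough Python sketch of the primitive-rule extraction step, using toy f-structures for the example on the following slides. The dictionary encoding, the assumed node alignment, and the primitive_rules helper are illustrative assumptions, not the actual induction algorithm.

# Minimal sketch of extracting primitive transfer rules from aligned
# f-structures.  Each node carries a PRED and its governed grammatical
# functions; an aligned source/target node pair yields one primitive rule
# mapping the source snippet to the target snippet.

# Toy f-structures for "Dafür bin ich zutiefst dankbar." /
# "I have a deep appreciation for that."  Node ids and the ADJ/MOD
# attribute names are arbitrary; the alignment is a subset for brevity.
src = {1: {"PRED": "sein", "SUBJ": 2, "XCOMP": 3},
       2: {"PRED": "ich"},
       3: {"PRED": "dankbar", "ADJ": 4},
       4: {"PRED": "zutiefst"}}
tgt = {1: {"PRED": "have", "SUBJ": 2, "OBJ": 3},
       2: {"PRED": "I"},
       3: {"PRED": "appreciation", "MOD": 4},
       4: {"PRED": "deep"}}
alignment = [(1, 1), (2, 2)]   # assumed f-structure alignment

def primitive_rules(src, tgt, alignment):
    rules = []
    for s, t in alignment:
        lhs = [f"PRED(X{s}, {src[s]['PRED']})"] + \
              [f"{gf}(X{s}, X{v})" for gf, v in src[s].items() if gf != "PRED"]
        rhs = [f"PRED(X{t}, {tgt[t]['PRED']})"] + \
              [f"{gf}(X{t}, X{v})" for gf, v in tgt[t].items() if gf != "PRED"]
        rules.append((lhs, rhs))
    return rules

for lhs, rhs in primitive_rules(src, tgt, alignment):
    print(", ".join(lhs), "==>", ", ".join(rhs))
# PRED(X1, sein), SUBJ(X1, X2), XCOMP(X1, X3) ==> PRED(X1, have), SUBJ(X1, X2), OBJ(X1, X3)
# PRED(X2, ich) ==> PRED(X2, I)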
49. Induction
- Example sentences: Dafür bin ich zutiefst dankbar. / I have a deep appreciation for that.
- Many-to-many word alignment:
  Dafür{6,7} bin{2} ich{1} zutiefst{3,4,5} dankbar{5}
- F-structure alignment
50. Extracting Primitive Transfer Rules
- Rule (1) maps lexical predicates:
  (1) PRED(X1, ich) ==> PRED(X1, I)
- Rule (2) maps lexical predicates and interprets the subj-to-subj link as an indication to map the SUBJ of the source predicate to the SUBJ of the target, and the XCOMP of the source to the OBJ of the target
- X1, X2, X3, ... are variables over f-structures
  (2) PRED(X1, sein),
      SUBJ(X1, X2),
      XCOMP(X1, X3)
      ==>
      PRED(X1, have),
      SUBJ(X1, X2),
      OBJ(X1, X3)
51. Extracting Primitive Transfer Rules, cont.
- Primitive transfer rule (3) maps a single source f-structure into a target preposition-plus-object unit:
  (3) PRED(X1, dafür)
      ==>
      PRED(X1, for),
      OBJ(X1, X2),
      PRED(X2, that)
52. Extracting Complex Transfer Rules
- Complex rules are created by taking all combinations of primitive rules and filtering them:
  (4) sein zutiefst dankbar ==> have a deep appreciation
  (5) sein zutiefst dankbar dafür ==> have a deep appreciation for that
  (6) sein ich zutiefst dankbar dafür ==> have I a deep appreciation for that
53. Transfer Contiguity Constraint
- Transfer contiguity constraint:
  - The source and target f-structures each have to be connected
  - F-structures in the transfer source can only be aligned with f-structures in the transfer target, and vice versa (a toy check is sketched below)
- Analogous to the constraint on contiguous, alignment-consistent phrases in phrase-based SMT
- Prevents extraction of a rule that would translate dankbar directly into appreciation, since appreciation is also aligned to zutiefst
- Transfer contiguity allows learning idioms like "es gibt - there is" from configurations that are local in the f-structure but non-local in the string, e.g. "es scheint ... zu geben - there seems to be"
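A tiny Python sketch of the alignment-consistency part of the constraint. The alignment_consistent helper and the link set are assumptions for the dankbar/appreciation example; the connectedness check and the real rule-filtering pipeline are not shown.

# Minimal sketch of alignment consistency: a candidate rule pairs a set of
# source nodes with a set of target nodes and is rejected if any alignment
# link leaves the candidate on one side but not the other.

def alignment_consistent(src_nodes, tgt_nodes, links):
    """links: set of (source_node, target_node) alignment pairs."""
    for s, t in links:
        if (s in src_nodes) != (t in tgt_nodes):
            return False
    return True

# dankbar is aligned to appreciation, but zutiefst is aligned to
# appreciation as well, so {dankbar} -> {appreciation} is not allowed:
links = {("dankbar", "appreciation"), ("zutiefst", "appreciation"),
         ("zutiefst", "deep"), ("zutiefst", "a")}
print(alignment_consistent({"dankbar"}, {"appreciation"}, links))    # False
print(alignment_consistent({"dankbar", "zutiefst"},
                           {"a", "deep", "appreciation"}, links))    # True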
54. Linguistic Filters on Transfer Rules
- Morphological stemming of PRED values
- (Optional) filtering of f-structure snippets based on consistency of linguistic categories
  - Extracting a snippet that translates zutiefst dankbar into a deep appreciation maps incompatible categories (adjectival and nominal) that are valid in the string-based world
  - The translation of sein to have might be discarded because of the adjectival vs. nominal types of their arguments
  - The larger rule mapping sein zutiefst dankbar to have a deep appreciation is fine, since the verbal types match
55. Transfer
- Parallel application of transfer rules in a non-deterministic fashion
  - Unlike the XLE ordered-rule rewrite system
- Each fact must be transferred by exactly one rule
  - A default rule transfers any fact as itself (see the sketch below)
- Transfer works on a chart, using the parser's unification mechanism for consistency checking
- Selection of the most probable transfer output is done by beam decoding on the transfer chart
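A simplified Python sketch of the "each fact transferred by exactly one rule, with an identity default" regime. It is greedy and deterministic, unlike the chart-based, non-deterministic transfer described above; the rule and fact encodings are assumptions for the running example.

# Minimal sketch of fact coverage in transfer.  rules: list of
# (set_of_source_facts, list_of_target_facts).  Any fact not covered by
# some rule is copied unchanged by the default rule.

def transfer(facts, rules):
    remaining, output = set(facts), []
    for covered, produced in rules:
        if covered <= remaining:          # rule applies to uncovered facts
            remaining -= covered
            output.extend(produced)
    output.extend(sorted(remaining))      # default rule: fact ==> fact
    return output

rules = [({"PRED(X1, sein)", "SUBJ(X1, X2)", "XCOMP(X1, X3)"},
          ["PRED(X1, have)", "SUBJ(X1, X2)", "OBJ(X1, X3)"]),
         ({"PRED(X2, ich)"}, ["PRED(X2, I)"])]
facts = ["PRED(X1, sein)", "SUBJ(X1, X2)", "XCOMP(X1, X3)",
         "PRED(X2, ich)", "TENSE(X1, pres)"]
print(transfer(facts, rules))
# ['PRED(X1, have)', 'SUBJ(X1, X2)', 'OBJ(X1, X3)', 'PRED(X2, I)', 'TENSE(X1, pres)']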
56. Generation
- Bi-directionality allows the same grammar to be used for parsing the training data and for generation in the translation application
- The generator has to be fault-tolerant in cases where the transfer system operates on a FRAGMENT parse or produces invalid f-structures from valid input f-structures
- Robust generation from unknown (e.g., untranslated) predicates and from unknown f-structures
57. Robust Generation
- Generation from unknown predicates
  - An unknown German word such as Hunde is analyzed by the German grammar to extract a stem and features (e.g., PRED Hund, NUM pl) and is then inflected using English default morphology (Hunds)
- Generation from unknown constructions
  - A default grammar that allows any attribute to be generated in any order is mixed in as a suboptimal option in the standard English grammar; e.g., if a SUBJ cannot be generated as a sentence-initial NP, it will be generated in any position as any category
  - An extension/combination of set-gen-adds and OT ranking
58. Statistical Models
- Log-probability of source-to-target transfer rules, where the probability r(e | f) of a rule that transfers source snippet f into target snippet e is estimated by relative frequency
- Log-probability of target-to-source transfer rules, also estimated by relative frequency
59. Statistical Models, cont.
- Log-probability of lexical translations l(e | f) from source to target snippets, estimated from Viterbi alignments a between source word positions i = 1, ..., n and target word positions j = 1, ..., m for stems f_i and e_j in snippets f and e, with relative word-translation frequencies t(e_j | f_i)
- Log-probability of lexical translations from target to source snippets
60. Statistical Models, cont.
- Number of transfer rules
- Number of transfer rules with frequency 1
- Number of default transfer rules
- Log-probability of strings of predicates from the root to the frontier of the target f-structure, estimated from predicate trigrams in English f-structures
- Number of predicates in the target f-structure
- Number of constituent movements during generation, based on the original order of the head predicates of the constituents
61. Statistical Models, cont.
- Number of generation repairs
- Log-probability of the target string, as computed by a trigram language model
- Number of words in the target string
(A sketch of combining these features is given below.)
62. Experimental Evaluation
- Experimental setup
  - German-to-English on the Europarl parallel corpus (Koehn 02)
  - Training and evaluation on sentences of length 5-15, for quick experimental turnaround
  - Resulting in a training set of 163,141 sentences, a development set of 1,967 sentences, and a test set of 1,755 sentences (as used in Koehn et al. HLT03)
  - Improved bidirectional word alignment based on GIZA++ (Och et al. EMNLP99)
  - LFG grammars for German and English (Butt et al. COLING02; Riezler et al. ACL02)
  - SRI trigram language model (Stolcke 02)
- Comparison with PHARAOH (Koehn et al. HLT03) and IBM Model 4 as produced by GIZA++ (Och et al. EMNLP99)
63. Experimental Evaluation, cont.
- Around 700,000 transfer rules extracted from f-structures chosen by a dependency similarity measure
- The system operates on n-best lists of parses (n = 1), transferred f-structures (n = 10), and generated strings (n = 1,000)
- Selection of the most probable translations in two steps:
  - The most probable f-structure is chosen by beam search (n = 20) on the transfer chart, using features 1-10
  - The most probable string is selected from the strings generated from the selected n-best f-structures, using features 11-13
- Feature weights for the two modules are trained by minimum error rate (MER) training on 750 in-coverage sentences of the development set
64. Automatic Evaluation
- NIST scores (ignoring punctuation); Approximate Randomization for significance testing (see above)
- 44% of sentences in coverage of the grammars; 51% FRAGMENT parses and/or generation repairs; 5% timeouts
- In coverage: the difference between the LFG system and PHARAOH is not significant
- Suboptimal robustness techniques decrease overall quality
65. Manual Evaluation
- Closer look at in-coverage examples
- Random selection of 500 in-coverage examples
- Two independent judges indicated a preference for LFG or PHARAOH, or equality, in a blind test
- Separate evaluation under the criteria of grammaticality/fluency and translational/semantic adequacy
- Significance assessed by Approximate Randomization via stratified shuffling of preference ratings between systems
66. Manual Evaluation, cont.
- Result: differences on agreed-on ratings are statistically significant at p < 0.0001
- The net improvement in translational adequacy on agreed-on examples is 11.4% of the 500 sentences (57/500), amounting to a 5% overall improvement for the hybrid system (44% of 11.4%)
- The net improvement in grammaticality on agreed-on examples is 15.4% of the 500 sentences, amounting to a 6.7% overall improvement for the hybrid system
67. Examples: LFG > PHARAOH
- src: in diesem fall werde ich meine verantwortung wahrnehmen
- ref: then i will exercise my responsibility
- LFG: in this case i accept my responsibility
- P: in this case i shall my responsibilities

- src: die politische stabilität hängt ab von der besserung der lebensbedingungen
- ref: political stability depends upon the improvement of living conditions
- LFG: the political stability hinges on the recovery the conditions
- P: the political stability is rejects the recovery of the living conditions
68. Examples: PHARAOH > LFG
- src: das ist schon eine seltsame vorstellung von gleichheit
- ref: a strange notion of equality
- LFG: equality that is even a strange idea
- P: this is already a strange idea of equality

- src: frau präsidentin ich beglückwünsche herrn nicholson zu seinem ausgezeichneten bericht
- ref: madam president I congratulate mr nicholson on his excellent report
- LFG: madam president I congratulate mister nicholson on his report excellented
- P: madam president I congratulate mr nicholson for his excellent report
69. Discussion
- High percentage of out-of-coverage examples
  - Accumulation of 2 x 20% error rates in parsing the training data
  - Errors in rule extraction
  - Together these result in ill-formed transfer rules, causing a high number of generation failures/repairs
- Propagation of errors through the system also affects in-coverage examples
  - Error analysis: 69% transfer errors, 10% due to parse errors
- Discrepancy between NIST and manual evaluation
- Suboptimal integration of the generator, making training and translation with large n-best lists infeasible
- Language and distortion models applied only after generation
70. Conclusion
- Integrating a grammar-based generator into a dependency-based SMT system achieves state-of-the-art NIST scores and improved grammaticality and adequacy on in-coverage examples
- A hybrid system is possible, since it is determinable when sentences are within the coverage of the system
71. Applications: Conclusions
- Use large-scale robust LFG grammars as the base for deep linguistic processing
- The ordered rewrite system provides an integrated method to manipulate output and incorporate external resources
- Stochastic and shallow techniques can be combined with deep processing