Title: Computational Semantics, Day II: A Modular Architecture
1. Computational Semantics
http://www.coli.uni-sb.de/cl/projects/milca/esslli
Day II: A Modular Architecture
- Aljoscha Burchardt,
- Alexander Koller,
- Stephan Walter,
- Universität des Saarlandes,
- Saarbrücken, Germany
- ESSLLI 2004, Nancy, France
2. Computing Semantic Representations
- Yesterday:
  - λ-calculus is a nice tool for systematic meaning construction.
  - We saw a first, sketchy implementation.
  - Some things still to be done.
- Today:
  - Let's fix the problems.
  - Let's build nice software.
3. Yesterday: λ-Calculus
- Semantic representations are constructed along the syntax tree. How to get there?
- By using functional application.
- λs help to guide arguments into the right place on β-reduction:
λx.love(x,mary)@john
love(john,mary)
4. Yesterday's disappointment
- Our first idea for NPs with determiners didn't work out:
- "A man" ~> ∃z.man(z)
- "A man loves Mary" ~> love(∃z.man(z),mary)
But what was the idea after all? Nothing!
∃z.man(z) just isn't the meaning of "a man". If anything, it translates the complete sentence "There is a man".
Let's try again, systematically.
5. A solution
- What we want is:
- "A man loves Mary" ~> ∃z(man(z) ∧ love(z,mary))
What we have is: "man" ~> λy.man(y), "loves Mary" ~> λx.love(x,mary)
How about:
∃z(man(z) ∧ love(z,mary))
∃z(λy.man(y)(z) ∧ love(z,mary))
∃z(λy.man(y)(z) ∧ λx.love(x,mary)(z))
Remember: we can use variables for any kind of term. So next:
λQ.∃z(λy.man(y)(z) ∧ Q(z)) @ λx.love(x,mary)
λP(λQ.∃z(P(z) ∧ Q(z))) @ λy.man(y) @ λx.love(x,mary)
"A" ~> λP(λQ.∃z(P(z) ∧ Q(z)))
6. But...
"A man loves Mary":
λP(λQ.∃z(P(z) ∧ Q(z))) @ λy.man(y) @ λx.love(x,mary)
λQ.∃z(man(z) ∧ Q(z)) @ λx.love(x,mary)
∃z(man(z) ∧ λx.love(x,mary)(z))
∃z(man(z) ∧ love(z,mary))
Here the subject NP, λP(λQ.∃z(P(z) ∧ Q(z))) @ λy.man(y), is applied to the VP, λx.love(x,mary): fine!
"John loves Mary":
- λx.love(x,mary) @ john (VP applied to NP): not systematic!
- john @ λx.love(x,mary) (NP applied to VP): not reducible!
- ? @ λx.love(x,mary) with ? = λP.P@john: better!
λP.P@john @ λx.love(x,mary)
λx.love(x,mary)@john
love(john,mary)
So: "John" ~> λP.P(john)
7. Transitive Verbs
What about transitive verbs (like "love")?
"loves" ~> λyλx.love(x,y) ???
This won't do:
"Mary" ~> λQ.Q(mary)
"loves Mary" ~> λyλx.love(x,y)@λQ.Q(mary)
λx.love(x,λQ.Q(mary))
How about something a little more complicated:
"loves" ~> λRλx(R@λy.love(x,y))
The only way to understand this is to see it in action...
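8. "John loves Mary" again...
"loves" ~> λRλx(R@λy.love(x,y))
"Mary" ~> λP.P(mary)
"John" ~> λP.P(john)
"loves Mary": λRλx(R@λy.love(x,y)) @ λP.P(mary)
λx(λP.P(mary) @ λy.love(x,y))
λx(λy.love(x,y)(mary))
λx.love(x,mary)
"John loves Mary": λP.P(john) @ λx.love(x,mary)
λx.love(x,mary)(john)
love(john,mary)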
9. Summing up
- nouns: "man" ~> λx.man(x)
- intransitive verbs: "smoke" ~> λx.smoke(x)
- determiner: "a" ~> λP(λQ.∃z(P(z) ∧ Q(z)))
- proper names: "mary" ~> λP.P(mary)
- transitive verbs: "love" ~> λRλx(R@λy.love(x,y))
10. Today's first success
- What we can do now (and could not do yesterday):
  - Complex NPs (with determiners)
  - Transitive verbs
  - ... and all in the same way.
- Key ideas:
  - Extra λs for NPs
  - Variables for predicates
  - Apply subject NP to VP
11. Yesterday's implementation
- s(VP@NP) --> np(NP),vp(VP).
- np(john) --> [john].
- np(mary) --> [mary].
- tv(lambda(X,lambda(Y,love(Y,X)))) --> [loves], {vars2atoms(X),vars2atoms(Y)}.
- iv(lambda(X,smoke(X))) --> [smokes], {vars2atoms(X)}.
- iv(lambda(X,snore(X))) --> [snorts], {vars2atoms(X)}.
- vp(TV@NP) --> tv(TV),np(NP).
- vp(IV) --> iv(IV).
- This doesn't work!
- np(exists(X,man(X))) --> [a],[man], {vars2atoms(X)}.
Was this a good implementation?
12. A Nice Implementation
- What is a nice implementation? It should be:
  - Scalable: if it works with five examples, upgrading to 5000 shouldn't be a great problem (e.g. new constructions in the grammar, more words...).
  - Re-usable: small changes in our ideas about the system shouldn't lead to complex changes in the implementation (e.g. a new representation language).
13. Solution: Modularity
- Think about your problem in terms of interacting conceptual components.
- Encapsulate these components into modules of your implementation, with clean and abstract pre-defined interfaces to each other.
- Extend or change modules to scale / adapt the implementation.
14. Another look at yesterday's implementation
- Okay, because it was small.
- Not modular at all: all linguistic functionality in one file, packed inside the DCG.
- E.g. scalability of the lexicon: we always have to write new rules, like
- tv(lambda(X,lambda(Y,visit(Y,X)))) --> [visit], {vars2atoms(X),vars2atoms(Y)}.
- Changing parts for adaptation? Change every single rule!
- Let's modularize!
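15. Semantic Construction: Conceptual Components
[Diagram: semantic construction as a black box.]
16. Semantic Construction: Inside the Black Box
[Diagram: the black box has two columns, Syntax and Semantics, and two rows. Phrases (combinatorial): DCG and combine-rules. Words (lexical): lexicon-facts.]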
17. DCG
- The DCG-rules tell us what phrases are acceptable (mainly). Their basic structure is:
- s(...) --> np(...), vp(...), {...}.
- np(...) --> det(...), noun(...), {...}.
- np(...) --> pn(...), {...}.
- vp(...) --> tv(...), np(...), {...}.
- vp(...) --> iv(...), {...}.
- (The gaps will be filled later on.)
18. combine-rules
- The combine-rules encode the actual semantic construction process. That is, they glue representations together using @:
- combine(s:(NP@VP),[np:NP,vp:VP]).
- combine(np:(DET@N),[det:DET,n:N]).
- combine(np:PN,[pn:PN]).
- combine(vp:IV,[iv:IV]).
- combine(vp:(TV@NP),[tv:TV,np:NP]).
19. Lexicon
The lexicon-facts hold the elementary information connected to words:
lexicon(noun,bird,[bird]).
lexicon(pn,anna,[anna]).
lexicon(iv,purr,[purrs]).
lexicon(tv,eat,[eats]).
- Their slots contain
  - the syntactic category
  - the constant / relation symbol ("core" semantics)
  - the surface form of the word.
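20. Interfaces
[Diagram: the same Syntax / Semantics layout, now with the interfaces between the components. Phrases (combinatorial): the DCG is linked to the combine-rules by combine-calls. Words (lexical): the DCG is linked to the lexicon-facts by lexicon-calls and to the semantics by semantic macros.]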
21. Interfaces in the DCG
Information is transported between the three components of our system by additional calls and variables in the DCG:
- Lexical rules are now fully abstract. We have one for each category (iv, tv, n, ...). The DCG uses lexicon-calls and semantic macros like this:
- iv(IV)--> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.
- pn(PN)--> {lexicon(pn,Sym,Word),pnSem(Sym,PN)}, Word.
- The combinatorial rules use combine-calls like this:
- vp(VP)--> iv(IV), {combine(vp:VP,[iv:IV])}.
- s(S)--> np(NP), vp(VP), {combine(s:S,[np:NP,vp:VP])}.
22. Interfaces: How they work
iv(IV)--> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.
When this rule applies, the syntactic analysis component
- looks up the Word found in the string (e.g. "smokes"), ...
- ... checks that its category is iv, ...
lexicon(iv, smoke, [smokes])
- ... and retrieves the relation symbol Sym to be used in the semantic construction.
So we have: Word = [smokes], Sym = smoke
23. Interfaces: How they work II
iv(IV)--> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.
Then, the semantic construction component
- takes the symbol Sym = smoke ...
- ... and uses the semantic macro ivSem ...
ivSem(Sym,IV)
ivSem(smoke,IV)
ivSem(smoke,lambda(X, smoke(X)))
- ... to transfer it into a full semantic representation for an intransitive verb.
The DCG-rule is now fully instantiated and looks like this:
iv(lambda(X, smoke(X)))--> {lexicon(iv,smoke,[smokes]), ivSem(smoke,lambda(X, smoke(X)))}, [smokes].
24. What's inside a semantic macro?
Semantic macros simply specify how to make a valid semantic representation out of a naked symbol. The one we've just seen in action for the verb "smokes" was:
ivSem(Sym,lambda(X,Formula)):- compose(Formula,Sym,[X]).
compose builds a first-order formula out of Sym and a new variable X: Formula = smoke(X). This is then embedded into a λ-abstraction over the same X: lambda(X, smoke(X)).
Another one, without compose:
pnSem(Sym,lambda(P,P@Sym)).
john ~> lambda(P,P@john)
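The macros above can be tried out in isolation. The following is a minimal sketch, not the course code: it assumes that compose takes its arguments as a list and is defined via =.., that @ is declared as an infix operator of priority 950, and it adds a hypothetical tvSem macro modelled on the transitive-verb entry from the "Summing up" slide.

```prolog
:- op(950, yfx, @).   % assumed declaration of the application operator

% compose(-Formula, +Sym, +Args): Formula is the term Sym(Args...).
% Assumed definition; the slides only show how it is called.
compose(Formula, Sym, Args) :- Formula =.. [Sym|Args].

% Macros as shown on the slide.
ivSem(Sym, lambda(X, Formula)) :- compose(Formula, Sym, [X]).
pnSem(Sym, lambda(P, P@Sym)).

% Hypothetical macro for transitive verbs: lambda R. lambda X. (R @ lambda Y. Sym(X,Y)).
tvSem(Sym, lambda(R, lambda(X, R@lambda(Y, Formula)))) :-
    compose(Formula, Sym, [X, Y]).

% Example queries:
% ?- ivSem(smoke, IV).
% ?- pnSem(john, PN).
% ?- tvSem(love, TV).
```

Because the macro head and the compose call share the variable X, the abstraction and the formula body are guaranteed to talk about the same variable.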
25. Syntax and Semantics: "John smokes"
Phrases (combinatorial):
s(S)--> np(NP), vp(VP), {combine(s:S,[np:NP,vp:VP])}.
np(NP) --> ..., pn(PN)      NP = lambda(P,P@john)
vp(VP) --> ..., iv(IV)      VP = lambda(X,smoke(X))
pn(PN) --> ..., [john]      PN = lambda(P,P@john), from pnSem(Sym,PN) with Sym = john, Word = [john]
iv(IV) --> ..., [smokes]    IV = lambda(X,smoke(X)), from ivSem(Sym,IV) with Sym = smoke, Word = [smokes]
Words (lexical):
lexicon(pn,john,[john]).
lexicon(iv,smoke,[smokes]).
Input: "John smokes"
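The interplay of lexicon, macros, combine-rules and DCG for "John smokes" can be sketched in one small file. This is an illustration under assumptions, not the online code: the lexicon stores surface forms as token lists, the extra DCG goals are wrapped in braces, combine takes Category:Semantics pairs, and ivSem builds its formula directly with =.. instead of calling compose.

```prolog
:- op(950, yfx, @).   % assumed declaration of the application operator

% Words (lexical): category, semantic symbol, surface form.
lexicon(pn, john, [john]).
lexicon(pn, mary, [mary]).
lexicon(iv, smoke, [smokes]).

% Semantic macros: turn a naked symbol into a lambda term.
pnSem(Sym, lambda(P, P@Sym)).
ivSem(Sym, lambda(X, Formula)) :- Formula =.. [Sym, X].

% combine-rules: glue daughter representations together.
combine(s:(NP@VP), [np:NP, vp:VP]).
combine(np:PN, [pn:PN]).
combine(vp:IV, [iv:IV]).

% Phrases (combinatorial): DCG rules with combine-calls.
s(S)   --> np(NP), vp(VP), {combine(s:S, [np:NP, vp:VP])}.
np(NP) --> pn(PN), {combine(np:NP, [pn:PN])}.
vp(VP) --> iv(IV), {combine(vp:VP, [iv:IV])}.

% Lexical rules: one abstract rule per category.
pn(PN) --> {lexicon(pn, Sym, Word), pnSem(Sym, PN)}, Word.
iv(IV) --> {lexicon(iv, Sym, Word), ivSem(Sym, IV)}, Word.

% Example query:
% ?- phrase(s(S), [john, smokes]).
% S is the unreduced term lambda(P, P@john) @ lambda(X, smoke(X)).
```

Adding a new intransitive verb or proper name now only needs one more lexicon fact; none of the rules change.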
26. A look at combine
combine(s:(NP@VP),[np:NP,vp:VP]).
S = NP@VP, NP = lambda(P,P@john), VP = lambda(X,smoke(X))
So:
S = lambda(P,P@john)@lambda(X,smoke(X))
That's almost all, folks...
betaConvert(lambda(P,P@john)@lambda(X,smoke(X)), Converted)
Converted = smoke(john)
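To make the last step concrete, here is a minimal substitution-based β-conversion sketch. It is not the betaConvert from the course code. It assumes that variables have already been turned into atoms (as the name vars2atoms in the demo suggests) and that all bound variables carry distinct names, so no renaming is attempted. The helper subst is hypothetical.

```prolog
:- op(950, yfx, @).   % assumed declaration of the application operator

% subst(+Var, +Value, +Term, -Result): replace the atom Var by Value in Term.
subst(Var, Value, Var, Value) :- !.
subst(_, _, Term, Term) :- atomic(Term), !.
subst(Var, Value, Term, Result) :-
    Term =.. [F|Args],
    maplist(subst(Var, Value), Args, NewArgs),
    Result =.. [F|NewArgs].

% betaConvert(+Term, -Normal): reduce applications of lambda terms,
% then convert the remaining subterms.
betaConvert(Fun@Arg, Result) :-
    betaConvert(Fun, lambda(X, Body)), !,
    subst(X, Arg, Body, Reduced),
    betaConvert(Reduced, Result).
betaConvert(Term, Term) :- atomic(Term), !.
betaConvert(Term, Result) :-
    Term =.. [F|Args],
    maplist(betaConvert, Args, NewArgs),
    Result =.. [F|NewArgs].

% Example queries:
% ?- betaConvert(lambda(p, p@john) @ lambda(x, smoke(x)), C).
% C = smoke(john).
% ?- betaConvert(lambda(p, p@john) @
%                  (lambda(r, lambda(x, r@lambda(y, love(x,y)))) @ lambda(q, q@mary)), C).
% C = love(john, mary).
```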
27. Little Cheats
A few "special" words are dealt with in a somewhat different manner:
- Determiners ("every man")
  - No semantic Sym in the lexicon:
  - lexicon(det,_,[every],uni).
  - Semantic representation generated by the macro alone:
  - detSem(uni,lambda(P,lambda(Q,forall(X,(P@X)>(Q@X))))).
- Negation: same thing ("does not walk")
  - No semantic Sym in the lexicon:
  - lexicon(mod,_,[does,not],neg).
  - Representation solely from the macro:
  - modSem(neg,lambda(P,lambda(X,~(P@X)))).
28. The code that's online (http://www.coli.uni-sb.de/cl/projects/milca/esslli)
- lexicon-facts have a fourth argument for any kind of additional information (e.g. fin/inf, gender):
- lexicon(tv,eat,[eats],fin).
- iv/tv have an additional argument for inf/fin (e.g. "eat" vs. "eats"):
- iv(I,IV)--> {lexicon(iv,Sym,Word,I),...}, Word.
- limited coordination, hence doubled categories (e.g. "talks and walks"):
- vp2(VP2)--> vp1(VP1A), coord(C), vp1(VP1B), {combine(vp2:VP2,[vp1:VP1A,coord:C,vp1:VP1B])}.
- vp1(VP1)--> v2(fin,V2), {combine(vp1:VP1,[v2:V2])}.
29. A demo
- lambda :-
-   readLine(Sentence),
-   parse(Sentence,Formula),
-   resetVars, vars2atoms(Formula),
-   betaConvert(Formula,Converted),
-   printRepresentations(Converted).
30. Evaluation
- Our new program has become much bigger, but it's:
  - Modular: everything's in its right place:
    - Syntax in englishGrammar.pl
    - Semantics (macros & combine) in lambda.pl
    - Lexicon in lexicon.pl
  - Scalable: e.g. extend the lexicon by adding facts to lexicon.pl.
  - Re-usable: e.g. change only lambda.pl and keep the rest for changing the semantic construction method (e.g. to CLLS on Thursday).
31. What we've done today
- Complex NPs, PNs and TVs in λ-based semantic construction
- A clean semantic construction framework in Prolog
- Its instantiation for λ-based semantic construction
32. Ambiguity
- Some sentences have more than one reading, i.e. more than one semantic representation.
- Standard example: "Every man loves a woman"
- Reading 1: the women may be different
- ∀x(man(x) -> ∃y(woman(y) ∧ love(x,y)))
- Reading 2: there is one particular woman
- ∃y(woman(y) ∧ ∀x(man(x) -> love(x,y)))
- What does our system do?
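One way to approach this question is to run the construction by hand. The derivation below is a sketch. It assumes the lexical entries from the "Summing up" slide, the universal determiner from the detSem macro (written here with an arrow for the implication), and that the subject NP is applied to the VP as before.

```latex
\begin{align*}
\text{a woman} &:\; \lambda P.\lambda Q.\exists y\,(P(y) \wedge Q(y)) \,@\, \lambda w.\mathit{woman}(w)
  \;\Rightarrow_\beta\; \lambda Q.\exists y\,(\mathit{woman}(y) \wedge Q(y)) \\
\text{loves a woman} &:\; \lambda R.\lambda x.(R \,@\, \lambda v.\mathit{love}(x,v)) \,@\, \lambda Q.\exists y\,(\mathit{woman}(y) \wedge Q(y)) \\
  &\;\Rightarrow_\beta\; \lambda x.\exists y\,(\mathit{woman}(y) \wedge \mathit{love}(x,y)) \\
\text{every man} &:\; \lambda P.\lambda Q.\forall u\,(P(u) \rightarrow Q(u)) \,@\, \lambda m.\mathit{man}(m)
  \;\Rightarrow_\beta\; \lambda Q.\forall u\,(\mathit{man}(u) \rightarrow Q(u)) \\
\text{every man loves a woman} &:\; \lambda Q.\forall u\,(\mathit{man}(u) \rightarrow Q(u)) \,@\, \lambda x.\exists y\,(\mathit{woman}(y) \wedge \mathit{love}(x,y)) \\
  &\;\Rightarrow_\beta\; \forall u\,(\mathit{man}(u) \rightarrow \exists y\,(\mathit{woman}(y) \wedge \mathit{love}(u,y)))
\end{align*}
```

Under these assumptions the construction is deterministic and yields only Reading 1: the object NP is consumed inside the VP, so its quantifier cannot end up outside the subject's.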
33. Excursion: lambda, variables and atoms
- Question yesterday: Why don't we use Prolog variables for FO-variables?
- Advantage (at first sight): β-reduction as unification:
- betaReduce(lambda(X, F)@X,F).
- Now: X = john, F = walk(X) ("John walks"):
betaReduce(lambda(john,walk(john))@john, walk(john))
F = walk(john)
Nice, but...
34. Problem: Coordination
- "John and Mary":
- (λX.λY.λP((X@P) ∧ (Y@P)) @ λQ.Q(john)) @ λR.R(mary)
λP((λQ.Q(john)@P) ∧ (λR.R(mary)@P))
λP(P(john) ∧ P(mary))
"John and Mary walk":
λP(P(john) ∧ P(mary)) @ λx.walk(x)
λx.walk(x)@john ∧ λx.walk(x)@mary
lambda(X,walk(X))@john
lambda(X,walk(X))@mary
β-reduction as unification: X = john, X = mary ?
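The clash can be reproduced directly. The sketch below uses the betaReduce fact from the excursion slide and assumes @ is declared as an infix operator of priority 950. The predicate betaReduceCopy is a hypothetical workaround based on copy_term/2, shown only for contrast; it is not necessarily how the course code solves the problem.

```prolog
:- op(950, yfx, @).   % assumed declaration of the application operator

% Beta reduction as unification, as on the excursion slide.
betaReduce(lambda(X, F)@X, F).

% Hypothetical workaround: reduce a fresh copy of the function term,
% so that every application gets its own bound variable.
betaReduceCopy(Fun@Arg, F) :- copy_term(Fun, lambda(Arg, F)).

% A single application works:
% ?- betaReduce(lambda(X, walk(X))@john, R).
% R = walk(john).
%
% Using the same term twice fails, because X is already bound to john
% when it is unified with mary:
% ?- W = lambda(X, walk(X)), betaReduce(W@john, R1), betaReduce(W@mary, R2).
% false.
%
% With fresh copies both applications succeed:
% ?- W = lambda(X, walk(X)), betaReduceCopy(W@john, R1), betaReduceCopy(W@mary, R2).
% R1 = walk(john), R2 = walk(mary).
```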