Title: CSA350: NLP Algorithms
1. CSA350 NLP Algorithms
- Sentence Parsing I
- The Parsing Problem
- Parsing as Search
- Top Down/Bottom Up Parsing
- Strategies
2. References
- This lecture is largely based on material found
  in Jurafsky & Martin, chapter 13.
3. Handling Sentences
- Sentence boundary detection.
- Finite state techniques are fine for certain
  kinds of analysis:
  - named entity recognition
  - NP chunking
- But FS techniques are of limited use when trying
  to compute grammatical relationships between
  parts of sentences.
- We need these to get at meanings.
4. Grammatical Relationships, e.g. subject
- Wikipedia definition:
- The subject has the grammatical function in a
  sentence of relating its constituent (a noun
  phrase) by means of the verb to any other
  elements present in the sentence, i.e. objects,
  complements and adverbials.
5. Grammatical Relationships, e.g. subject
- The dictionary helps me find words.
- Ice cream appeared on the table.
- The man that is sitting over there told me that
  he just bought a ticket to Tahiti.
- Nothing else is good enough.
- That nothing else is good enough shouldn't come
  as a surprise.
- To eat six different kinds of vegetables a day is
  healthy.
6. Why not use FS techniques for describing NL
sentences?
- Descriptive Adequacy
  - Some NL phenomena cannot be described within the FS
    framework, e.g. central embedding.
- Notational Efficiency
  - The notation does not facilitate 'factoring out'
    the similarities: to describe sentences of the form
    subject-verb-object using an FSA, we must separately
    describe possible subjects and objects, even though
    almost all phrases that can appear as one can equally
    appear as the other.
7. Central Embedding
- The following sentences (indices pair each subject
  with its verb):
  - The cat spat (1 ... 1)
  - The cat the boy saw spat (1 2 ... 2 1)
  - The cat the boy the girl liked saw spat (1 2 3 ... 3 2 1)
- require at least a grammar of the form S → AⁿBⁿ.
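The counting in AⁿBⁿ is exactly what a finite-state machine cannot do. As a minimal illustration (not from the lecture), a recursive procedure implementing the context-free rules S → a S b | ε recognises aⁿbⁿ, because recursion depth does the counting that any fixed set of states cannot:

```python
def matches_anbn(s: str) -> bool:
    """Recognise strings of the form a^n b^n (n >= 0)."""
    if s == "":
        return True                   # S -> '' (empty production)
    if s.startswith("a") and s.endswith("b"):
        return matches_anbn(s[1:-1])  # S -> a S b: strip one a/b pair
    return False

print(matches_anbn("aaabbb"))  # True  (n = 3)
print(matches_anbn("aaabb"))   # False (unbalanced)
```

The centre-embedded sentences follow the same pattern: n noun phrases followed by n matching verbs.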
8. DCG-style Grammar/Lexicon
- GRAMMAR
- s --> np, vp.
- s --> aux, np, vp.
- s --> vp.
- np --> det, nom.
- nom --> noun.
- nom --> noun, nom.
- nom --> nom, pp.
- pp --> prep, np.
- np --> pn.
- vp --> v.
- vp --> v, np.
- LEXICON
- d --> that | this | a.
- n --> book | flight | meal | money.
- v --> book | include | prefer.
- aux --> does.
- prep --> from | to | on.
- pn --> Houston | TWA.
9. Definite Clause Grammars
- Prolog based.
- LHS --> RHS1, RHS2, ..., {code}.
- s(s(NP,VP)) --> np(NP), vp(VP), {mk_subj(NP)}.
- Rules are translated into an executable Prolog
  program.
- No clear distinction between rules for grammar
  and lexicon.
10. Parsing Problem
- Given grammar G and sentence A, discover all valid
  parse trees for G that exactly cover A.
- Example parse of "book that flight":
  [S [VP [V book] [NP [Det that] [Nom [N flight]]]]]
11. The elephant is in the trousers
- [S [NP I] [VP shot [NP [NP an elephant] [PP in my trousers]]]]
12. I was wearing the trousers
- [S [NP I] [VP shot [NP an elephant] [PP in my trousers]]]
13. Parsing as Search
- Search within a space defined by
  - Start State
  - Goal State
  - State to state transformations
- Two distinct parsing strategies
  - Top down
  - Bottom up
- Different parsing strategy, different state
  space, different problem.
- N.B. Parsing strategy ≠ search strategy.
14. Top Down
- Each state comprises
  - a tree
  - an open node
  - an input pointer
- Together these encode the current state of the
  parse.
- A top down parser tries to build from the root node
  S down to the leaves by replacing nodes carrying
  non-terminal labels with the RHS of corresponding
  grammar rules.
- Nodes with pre-terminal (word class) labels are
  compared to input words.
15. Top Down Search Space
(Diagram: top down search space, indicating the start node and the goal node.)
16. Bottom Up
- Each state is a forest of trees.
- The start node is a forest of nodes labelled with
  pre-terminal categories (word classes derived
  from the lexicon).
- Transformations look for places where the RHS of a
  rule can fit.
- Any such place is replaced with a node labelled
  with the LHS of the rule.
17. Bottom Up Search Space
(Diagram: bottom up search space, including a failed BU derivation.)
18. Top Down vs Bottom Up Search Spaces
- Top down
  - For: space excludes trees that cannot be derived
    from S.
  - Against: space includes trees that are not
    consistent with the input.
- Bottom up
  - For: space excludes states containing trees that
    cannot lead to input text segments.
  - Against: space includes states containing
    subtrees that can never lead to an S node.
19. Top Down Parsing - Remarks
- Top-down parsers do well if there is useful
  grammar driven control: search can be directed by
  the grammar, provided there are
  - not too many different rules for the same
    category
  - not too much distance between non-terminal and
    terminal categories.
- Top-down is unsuitable for rewriting parts of
  speech (preterminals) with words (terminals). In
  practice that is always done bottom-up, as lexical
  lookup.
20. Bottom Up Parsing - Remarks
- It is data-directed: it attempts to parse the
  words that are there.
- Does well, e.g. for lexical lookup.
- Does badly if there are many rules with similar
  RHS categories.
- Inefficient when there is great lexical ambiguity
  (grammar driven control might help here).
- Empty categories: a termination problem unless
  rewriting of empty constituents is somehow
  restricted (but then it's generally incomplete).
21. Basic Parsing Algorithms
- Top Down
- Bottom Up
- see Jurafsky & Martin, Ch. 10
22. Top Down Algorithm
23. Recoding the Grammar/Lexicon
- Grammar
- rule(s,[np,vp]).
- rule(np,[d,n]).
- rule(vp,[v]).
- rule(vp,[v,np]).
- Lexicon
- word(d,the).
- word(n,dog).
- word(n,cat).
- word(n,dogs).
- word(n,cats).
- word(v,chase).
- word(v,chases).
24. Top Down Depth First Recognition in Prolog
- parse(C,[Word|S],S) :-
    word(C,Word).            % e.g. word(noun,cat).
- parse(C,S1,S) :-
    rule(C,Cs),              % e.g. rule(s,[np,vp]).
    parse_list(Cs,S1,S).
- parse_list([],S,S).
- parse_list([C|Cs],S1,S) :-
    parse(C,S1,S2),
    parse_list(Cs,S2,S).
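For readers without a Prolog system to hand, the same top down, depth first recognizer can be sketched in Python (a hypothetical port, not lecture code). `parse` yields every remainder of the token list left after recognising a category, mirroring the Prolog difference-list pair (S1, S); a sentence is accepted if some derivation leaves no remainder.

```python
RULES = {                 # the rule/2 facts from the recoded grammar
    "s":  [["np", "vp"]],
    "np": [["d", "n"]],
    "vp": [["v"], ["v", "np"]],
}
WORDS = {                 # the word/2 facts from the recoded lexicon
    "the": ["d"], "dog": ["n"], "cat": ["n"], "dogs": ["n"],
    "cats": ["n"], "chase": ["v"], "chases": ["v"],
}

def parse(cat, tokens):
    # lexical clause: parse(C,[Word|S],S) :- word(C,Word).
    if tokens and cat in WORDS.get(tokens[0], []):
        yield tokens[1:]
    # rule clause: parse(C,S1,S) :- rule(C,Cs), parse_list(Cs,S1,S).
    for rhs in RULES.get(cat, []):
        yield from parse_list(rhs, tokens)

def parse_list(cats, tokens):
    if not cats:                      # parse_list([],S,S).
        yield tokens
        return
    for rest in parse(cats[0], tokens):
        yield from parse_list(cats[1:], rest)

def recognise(sentence):
    return any(rest == [] for rest in parse("s", sentence.split()))

print(recognise("the dog chases the cat"))  # True
```

Generators give the backtracking that Prolog provides for free: each `yield` is one choice point, and the caller resumes it to explore alternatives.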
25. Derivation: top down, left-to-right, depth first
26. Bottom Up Shift/Reduce Algorithm
- Two data structures
  - input string
  - stack
- Repeat until input is exhausted
  - Shift word to stack
  - Reduce stack using grammar and lexicon until no
    further reductions are possible
- Unlike top down, the algorithm does not require the
  category to be specified in advance. It simply
  finds all possible trees.
27. Shift/Reduce Operation
- Step 0 (start): stack = []; input = the dog barked
- Step 1 (shift): stack = [the]; input = dog barked
- Step 2 (reduce): stack = [d]; input = dog barked
- Step 3 (shift): stack = [dog, d]; input = barked
- Step 4 (reduce): stack = [n, d]; input = barked
- Step 5 (reduce): stack = [np]; input = barked
- Step 6 (shift): stack = [barked, np]; input = (empty)
- Step 7 (reduce): stack = [v, np]
- Step 8 (reduce): stack = [vp, np]
- Step 9 (reduce): stack = [s]
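The loop traced above can be sketched in Python (an illustrative reconstruction, not the lecture's code). One difference: the slides push words onto the front of the stack, while this sketch keeps the stack top at the right, so rule right-hand sides match as suffixes.

```python
LEXICON = {"the": "d", "dog": "n", "barked": "v"}
RULES = [("s", ["np", "vp"]),    # s  -> np vp
         ("np", ["d", "n"]),     # np -> d n
         ("vp", ["v"]),          # vp -> v
         ("vp", ["v", "np"])]    # vp -> v np

def shift_reduce(words):
    stack = []
    for word in words:
        stack.append(word)                  # shift
        while True:                         # reduce until nothing fits
            if stack[-1] in LEXICON:        # lexical reduction
                stack[-1] = LEXICON[stack[-1]]
                continue
            for lhs, rhs in RULES:
                if stack[-len(rhs):] == rhs:
                    stack[-len(rhs):] = [lhs]   # replace RHS with LHS
                    break
            else:
                break                       # no rule applied
    return stack

print(shift_reduce("the dog barked".split()))  # ['s']
```

Taking the first applicable rule is a crude, fixed conflict-resolution policy, so this returns a single analysis rather than all possible trees; the Prolog version on the next slide recovers the alternatives through backtracking.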
28. Shift/Reduce Implementation
- parse(S,Res) :- sr(S,[],Res).
- sr(S,Stk,Res) :-
    shift(Stk,S,NewStk,S1),
    reduce(NewStk,RedStk),
    sr(S1,RedStk,Res).
- sr([],Res,Res).
- shift(X,[H|Y],[H|X],Y).
- reduce(Stk,RedStk) :-
    brule(Stk,Stk2),
    reduce(Stk2,RedStk).
- reduce(Stk,Stk).
- % grammar
- brule([vp,np|X],[s|X]).
- brule([n,d|X],[np|X]).
- brule([np,v|X],[vp|X]).
- brule([v|X],[vp|X]).
- % interface to lexicon
- brule([Word|X],[C|X]) :-
    word(C,Word).
- shift's four arguments are: stack, sent, nstack, nsent.
29. Shift/Reduce Operation
- Words are shifted to the beginning of the stack,
  which thus ends up in reverse order.
- The reduce step is simplified if we also store
  the rules backward, so that the rule s → np vp is
  stored as the fact brule([vp,np|X],[s|X]).
- The term [a,b|X] matches any list whose first and
  second elements are a and b respectively.
- The first argument directly matches the stack to
  which the rule applies.
- The second argument is what the stack becomes
  after reduction.
30. Shift Reduce Parser
- Standard implementations do not perform
  backtracking (e.g. NLTK).
- Only one result is returned even when the sentence
  is ambiguous.
- May fail even when the sentence is grammatical,
  because of unresolved
  - Shift/Reduce conflicts
  - Reduce/Reduce conflicts
31. Handling Conflicts
- Shift-reduce parsers may employ policies for
  resolving such conflicts, e.g.
  - For Shift/Reduce conflicts:
    - prefer shift, or
    - prefer reduce.
  - For Reduce/Reduce conflicts:
    - choose the reduction which removes most
      elements from the stack.
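The reduce/reduce policy above can be sketched as a small selection function (a hypothetical illustration; names and rules are invented for the example). Given two rules that both match the top of the stack, it picks the one with the longer right-hand side:

```python
# Two rules for the same category create a reduce/reduce conflict
# whenever a d n sequence sits on top of the stack (top at the right).
RULES = [("np", ["n"]), ("np", ["d", "n"])]

def pick_reduction(stack):
    """Return the (lhs, rhs) of the longest matching rule, or None."""
    matches = [(lhs, rhs) for lhs, rhs in RULES
               if stack[-len(rhs):] == rhs]
    if not matches:
        return None
    # policy: the reduction removing most stack elements wins
    return max(matches, key=lambda m: len(m[1]))

print(pick_reduction(["d", "n"]))  # ('np', ['d', 'n'])
```

With stack [d, n] both rules match, and the policy chooses np → d n, consuming both elements rather than leaving a stranded determiner.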