Title: CSA350: NLP Algorithms
1. CSA350 NLP Algorithms
- Sentence Parsing I
- The Parsing Problem
- Parsing as Search
- Top Down/Bottom Up Parsing
- Strategies
2. References
- This lecture is largely based on material found
  in Jurafsky & Martin, chapter 13.
3. Handling Sentences
- Sentence boundary detection.
- Finite state techniques are fine for certain
  kinds of analysis:
  - named entity recognition
  - NP chunking
- But FS techniques are of limited use when trying
  to compute grammatical relationships between
  parts of sentences.
- We need these to get at meanings.
4. Grammatical Relationships, e.g. subject
- Wikipedia definition:
- The subject has the grammatical function in a
  sentence of relating its constituent (a noun
  phrase) by means of the verb to any other
  elements present in the sentence, i.e. objects,
  complements and adverbials.
5. Grammatical Relationships, e.g. subject
- The dictionary helps me find words.
- Ice cream appeared on the table.
- The man that is sitting over there told me that
  he just bought a ticket to Tahiti.
- Nothing else is good enough.
- That nothing else is good enough shouldn't come
  as a surprise.
- To eat six different kinds of vegetables a day is
  healthy.
6. Why not use FS techniques for describing NL
sentences?
- Descriptive Adequacy
  - Some NL phenomena cannot be described within the FS
    framework, e.g. central embedding.
- Notational Efficiency
  - The notation does not facilitate 'factoring out'
    the similarities: to describe sentences of the form
    subject-verb-object using an FSA, we must separately
    describe possible subjects and objects, even though
    almost all phrases that can appear as one can equally
    appear as the other.
7. Central Embedding
- The following sentences (indices pair each subject
  with its verb):
  - The cat spat (1 ... 1)
  - The cat the boy saw spat (1 2 ... 2 1)
  - The cat the boy the girl liked saw spat (1 2 3 ... 3 2 1)
- require at least a grammar of the form S → AⁿBⁿ.
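The counting in AⁿBⁿ is exactly what a finite-state machine cannot do. As a minimal illustration (not from the lecture), a recursive procedure implementing the context-free rules S → a S b | ε recognises aⁿbⁿ, because recursion depth does the counting that any fixed set of states cannot:

```python
def matches_anbn(s: str) -> bool:
    """Recognise strings of the form a^n b^n (n >= 0)."""
    if s == "":
        return True                   # S -> '' (empty production)
    if s.startswith("a") and s.endswith("b"):
        return matches_anbn(s[1:-1])  # S -> a S b: strip one a/b pair
    return False

print(matches_anbn("aaabbb"))  # True  (n = 3)
print(matches_anbn("aaabb"))   # False (unbalanced)
```

The centre-embedded sentences follow the same pattern: n noun phrases followed by n matching verbs.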
8. DCG-style Grammar/Lexicon
- GRAMMAR
- s --> np, vp.
- s --> aux, np, vp.
- s --> vp.
- np --> det, nom.
- nom --> noun.
- nom --> noun, nom.
- nom --> nom, pp.
- pp --> prep, np.
- np --> pn.
- vp --> v.
- vp --> v, np.
- LEXICON
- d --> that | this | a.
- n --> book | flight | meal | money.
- v --> book | include | prefer.
- aux --> does.
- prep --> from | to | on.
- pn --> Houston | TWA.
9. Definite Clause Grammars
- Prolog based.
- LHS --> RHS1, RHS2, ..., {code}.
- s(s(NP,VP)) --> np(NP), vp(VP), {mk_subj(NP)}.
- Rules are translated into an executable Prolog
  program.
- No clear distinction between rules for grammar
  and lexicon.
10. Parsing Problem
- Given grammar G and sentence A, discover all valid
  parse trees for G that exactly cover A.
- Example parse of "book that flight":
  [S [VP [V book] [NP [Det that] [Nom [N flight]]]]]
11. The elephant is in the trousers
- [S [NP I] [VP shot [NP [NP an elephant] [PP in my trousers]]]]
12. I was wearing the trousers
- [S [NP I] [VP shot [NP an elephant] [PP in my trousers]]]
13. Parsing as Search
- Search within a space defined by
  - Start State
  - Goal State
  - State to state transformations
- Two distinct parsing strategies
  - Top down
  - Bottom up
- Different parsing strategy, different state
  space, different problem.
- N.B. Parsing strategy ≠ search strategy.
14. Top Down
- Each state comprises
  - a tree
  - an open node
  - an input pointer
- Together these encode the current state of the
  parse.
- A top down parser tries to build from the root node
  S down to the leaves by replacing nodes carrying
  non-terminal labels with the RHS of corresponding
  grammar rules.
- Nodes with pre-terminal (word class) labels are
  compared to input words.
15. Top Down Search Space
(Diagram: top down search space, indicating the start node and the goal node.)
16. Bottom Up
- Each state is a forest of trees.
- The start node is a forest of nodes labelled with
  pre-terminal categories (word classes derived
  from the lexicon).
- Transformations look for places where the RHS of a
  rule can fit.
- Any such place is replaced with a node labelled
  with the LHS of the rule.
17. Bottom Up Search Space
(Diagram: bottom up search space, including a failed BU derivation.)
18. Top Down vs Bottom Up Search Spaces
- Top down
  - For: space excludes trees that cannot be derived
    from S.
  - Against: space includes trees that are not
    consistent with the input.
- Bottom up
  - For: space excludes states containing trees that
    cannot lead to input text segments.
  - Against: space includes states containing
    subtrees that can never lead to an S node.
19. Top Down Parsing - Remarks
- Top-down parsers do well if there is useful
  grammar driven control: search can be directed by
  the grammar, provided there are
  - not too many different rules for the same
    category
  - not too much distance between non-terminal and
    terminal categories.
- Top-down is unsuitable for rewriting parts of
  speech (preterminals) with words (terminals). In
  practice that is always done bottom-up, as lexical
  lookup.
20. Bottom Up Parsing - Remarks
- It is data-directed: it attempts to parse the
  words that are there.
- Does well, e.g. for lexical lookup.
- Does badly if there are many rules with similar
  RHS categories.
- Inefficient when there is great lexical ambiguity
  (grammar driven control might help here).
- Empty categories: a termination problem unless
  rewriting of empty constituents is somehow
  restricted (but then it's generally incomplete).
21. Basic Parsing Algorithms
- Top Down
- Bottom Up
- see Jurafsky & Martin, Ch. 10
22. Top Down Algorithm
23. Recoding the Grammar/Lexicon
- Grammar
- rule(s,[np,vp]).
- rule(np,[d,n]).
- rule(vp,[v]).
- rule(vp,[v,np]).
- Lexicon
- word(d,the).
- word(n,dog).
- word(n,cat).
- word(n,dogs).
- word(n,cats).
- word(v,chase).
- word(v,chases).
24. Top Down Depth First Recognition in Prolog
- parse(C,[Word|S],S) :-
    word(C,Word).            % e.g. word(noun,cat).
- parse(C,S1,S) :-
    rule(C,Cs),              % e.g. rule(s,[np,vp]).
    parse_list(Cs,S1,S).
- parse_list([],S,S).
- parse_list([C|Cs],S1,S) :-
    parse(C,S1,S2),
    parse_list(Cs,S2,S).
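For readers without a Prolog system to hand, the same top down, depth first recognizer can be sketched in Python (a hypothetical port, not lecture code). `parse` yields every remainder of the token list left after recognising a category, mirroring the Prolog difference-list pair (S1, S); a sentence is accepted if some derivation leaves no remainder.

```python
RULES = {                 # the rule/2 facts from the recoded grammar
    "s":  [["np", "vp"]],
    "np": [["d", "n"]],
    "vp": [["v"], ["v", "np"]],
}
WORDS = {                 # the word/2 facts from the recoded lexicon
    "the": ["d"], "dog": ["n"], "cat": ["n"], "dogs": ["n"],
    "cats": ["n"], "chase": ["v"], "chases": ["v"],
}

def parse(cat, tokens):
    # lexical clause: parse(C,[Word|S],S) :- word(C,Word).
    if tokens and cat in WORDS.get(tokens[0], []):
        yield tokens[1:]
    # rule clause: parse(C,S1,S) :- rule(C,Cs), parse_list(Cs,S1,S).
    for rhs in RULES.get(cat, []):
        yield from parse_list(rhs, tokens)

def parse_list(cats, tokens):
    if not cats:                      # parse_list([],S,S).
        yield tokens
        return
    for rest in parse(cats[0], tokens):
        yield from parse_list(cats[1:], rest)

def recognise(sentence):
    return any(rest == [] for rest in parse("s", sentence.split()))

print(recognise("the dog chases the cat"))  # True
```

Generators give the backtracking that Prolog provides for free: each `yield` is one choice point, and the caller resumes it to explore alternatives.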
25. Derivation: top down, left-to-right, depth first
26. Bottom Up Shift/Reduce Algorithm
- Two data structures
  - input string
  - stack
- Repeat until input is exhausted
  - Shift word to stack
  - Reduce stack using grammar and lexicon until no
    further reductions are possible
- Unlike top down, the algorithm does not require the
  category to be specified in advance. It simply
  finds all possible trees.
27. Shift/Reduce Operation
- Step 0 (start): stack = []; input = the dog barked
- Step 1 (shift): stack = [the]; input = dog barked
- Step 2 (reduce): stack = [d]; input = dog barked
- Step 3 (shift): stack = [dog, d]; input = barked
- Step 4 (reduce): stack = [n, d]; input = barked
- Step 5 (reduce): stack = [np]; input = barked
- Step 6 (shift): stack = [barked, np]; input = (empty)
- Step 7 (reduce): stack = [v, np]
- Step 8 (reduce): stack = [vp, np]
- Step 9 (reduce): stack = [s]
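The loop traced above can be sketched in Python (an illustrative reconstruction, not the lecture's code). One difference: the slides push words onto the front of the stack, while this sketch keeps the stack top at the right, so rule right-hand sides match as suffixes.

```python
LEXICON = {"the": "d", "dog": "n", "barked": "v"}
RULES = [("s", ["np", "vp"]),    # s  -> np vp
         ("np", ["d", "n"]),     # np -> d n
         ("vp", ["v"]),          # vp -> v
         ("vp", ["v", "np"])]    # vp -> v np

def shift_reduce(words):
    stack = []
    for word in words:
        stack.append(word)                  # shift
        while True:                         # reduce until nothing fits
            if stack[-1] in LEXICON:        # lexical reduction
                stack[-1] = LEXICON[stack[-1]]
                continue
            for lhs, rhs in RULES:
                if stack[-len(rhs):] == rhs:
                    stack[-len(rhs):] = [lhs]   # replace RHS with LHS
                    break
            else:
                break                       # no rule applied
    return stack

print(shift_reduce("the dog barked".split()))  # ['s']
```

Taking the first applicable rule is a crude, fixed conflict-resolution policy, so this returns a single analysis rather than all possible trees; the Prolog version on the next slide recovers the alternatives through backtracking.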
28. Shift/Reduce Implementation
- parse(S,Res) :- sr(S,[],Res).
- sr(S,Stk,Res) :-
    shift(Stk,S,NewStk,S1),
    reduce(NewStk,RedStk),
    sr(S1,RedStk,Res).
- sr([],Res,Res).
- shift(X,[H|Y],[H|X],Y).
- reduce(Stk,RedStk) :-
    brule(Stk,Stk2),
    reduce(Stk2,RedStk).
- reduce(Stk,Stk).
- % grammar
- brule([vp,np|X],[s|X]).
- brule([n,d|X],[np|X]).
- brule([np,v|X],[vp|X]).
- brule([v|X],[vp|X]).
- % interface to lexicon
- brule([Word|X],[C|X]) :-
    word(C,Word).
- shift's four arguments are: stack, sent, nstack, nsent.
29. Shift/Reduce Operation
- Words are shifted to the beginning of the stack,
  which thus ends up in reverse order.
- The reduce step is simplified if we also store
  the rules backward, so that the rule s → np vp is
  stored as the fact brule([vp,np|X],[s|X]).
- The term [a,b|X] matches any list whose first and
  second elements are a and b respectively.
- The first argument directly matches the stack to
  which the rule applies.
- The second argument is what the stack becomes
  after reduction.
30. Shift Reduce Parser
- Standard implementations do not perform
  backtracking (e.g. NLTK).
- Only one result is returned even when the sentence
  is ambiguous.
- May fail even when the sentence is grammatical,
  because of unresolved
  - Shift/Reduce conflicts
  - Reduce/Reduce conflicts
31. Handling Conflicts
- Shift-reduce parsers may employ policies for
  resolving such conflicts, e.g.
  - For Shift/Reduce conflicts:
    - prefer shift, or
    - prefer reduce.
  - For Reduce/Reduce conflicts:
    - choose the reduction which removes most
      elements from the stack.
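The reduce/reduce policy above can be sketched as a small selection function (a hypothetical illustration; names and rules are invented for the example). Given two rules that both match the top of the stack, it picks the one with the longer right-hand side:

```python
# Two rules for the same category create a reduce/reduce conflict
# whenever a d n sequence sits on top of the stack (top at the right).
RULES = [("np", ["n"]), ("np", ["d", "n"])]

def pick_reduction(stack):
    """Return the (lhs, rhs) of the longest matching rule, or None."""
    matches = [(lhs, rhs) for lhs, rhs in RULES
               if stack[-len(rhs):] == rhs]
    if not matches:
        return None
    # policy: the reduction removing most stack elements wins
    return max(matches, key=lambda m: len(m[1]))

print(pick_reduction(["d", "n"]))  # ('np', ['d', 'n'])
```

With stack [d, n] both rules match, and the policy chooses np → d n, consuming both elements rather than leaving a stranded determiner.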