Title: Basic Parsing with Context-Free Grammars
1- Basic Parsing with Context-Free Grammars
Slides adapted from Julia Hirschberg and Dan
Jurafsky
2Homework Getting Started
- Data
- News articles in TDT4
- Make sure you are looking in ENG sub-directory
- You need to represent each article in .arff form
- You need to write a program that will extract features from each article
- (Note that files with POS tags are now available in Eng_POSTAGGED)
- The .arff file contains independent variables as well as the dependent variable
3An example
- Start with classifying into topic
- Suppose you want to start with just the words
- Two approaches
- Use your intuition to choose a few words that might disambiguate
- Start with all words
4What would your .arff file look like?
- Words are the attributes. What are the values?
- Binary: present or not
- Frequency: how many times it occurs
- TF-IDF: how many times it occurs in this document (TF, term frequency) divided by how many times it occurs in all documents (DF, document frequency)
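The three value types above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `feature_values` and the toy documents are hypothetical, tokens are assumed to be already extracted and lowercased, and the third value follows the slide's own definition (count in this document divided by count in all documents). Many TF-IDF variants instead use the log of the inverse number of documents containing the word.

```python
from collections import Counter

def feature_values(docs):
    """docs maps a document name to its list of tokens (hypothetical input format).

    Returns, per document, the three kinds of attribute values from the slide.
    """
    vocab = sorted({w for toks in docs.values() for w in toks})
    # Occurrences of each word across all documents (the slide's "DF").
    total = Counter(w for toks in docs.values() for w in toks)
    rows = {}
    for name, toks in docs.items():
        tf = Counter(toks)  # occurrences in this document (TF)
        rows[name] = {
            "binary": {w: int(tf[w] > 0) for w in vocab},
            "frequency": {w: tf[w] for w in vocab},
            "tf_df": {w: tf[w] / total[w] for w in vocab},
        }
    return rows

# Toy usage with two made-up documents.
rows = feature_values({
    "doc_a": ["policy", "policy", "doctors"],
    "doc_b": ["policy", "refugees"],
})
print(rows["doc_a"]["frequency"]["policy"])  # 2
print(rows["doc_a"]["binary"]["refugees"])   # 0
```

Each inner dictionary corresponds to one row of the .arff file; the class label (for example BT_9) would be added as the dependent variable.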
5news_2865.input
- <DOC>
- <SOURCE_TYPE>NWIRE</SOURCE_TYPE>
- <SOURCE_LANG>ENG</SOURCE_LANG>
- <SOURCE_ORG>NYT</SOURCE_ORG>
- <DOC_DATE>20010101</DOC_DATE>
- <BROAD_TOPIC>BT_9</BROAD_TOPIC>
- <NARROW_TOPIC>NT_34</NARROW_TOPIC>
- <TEXT>
- Reversing a policy that has kept medical errors
secret for more than two decades, federal
officials say they will soon allow Medicare
beneficiaries to obtain data about doctors who
botched their care. Tens of thousands of Medicare
patients file complaints each year about the
quality of care they receive from doctors and
hospitals. But in many cases, patients get no
useful information because doctors can block the
release of assessments of their performance.
Under a new policy, officials said, doctors will
no longer be able to veto disclosure of the
findings of investigations. Federal law has for
many years allowed for review of care received
by Medicare patients, and the law says a peer
review organization must ``inform the patient of
the final disposition of the complaint'' in
each case. But the federal rules used to carry
out the law say the peer review organization may
disclose information about a doctor only ``with
the consent of that practitioner.'' The federal
manual for peer review organizations includes
similar language about disclosure. Under the new
plan, investigators will have to tell patients
whether their care met ``professionally
recognized standards of health care'' and inform
them of any action against the doctor or the
hospital. Patients could use such information in
lawsuits and other actions against doctors and
hospitals that provided substandard care. The new
policy came in response to a lawsuit against the
government
6All words for news_2865.input
- Class BT_9 dependent
- Reversing 1 independent
- A 100
- Policy 20
- That 50
- Has 75
- Kept 3
- Commonwealth 0 (news_2816.input)
- Independent 0
- States 0
- Preemptive 0
- Refugees 0
7Try it!
- Open your file
- Select attributes using Chi-square
- You can cut and paste resulting attributes to a file
- Classify
- How does it work?
- Try n-grams, POS or date next in same way
- How many features would each give you?
8CFG Example
- Many possible CFGs for English, here is an example (fragment)
- S -> NP VP
- VP -> V NP
- NP -> DetP N | AdjP NP
- AdjP -> Adj | Adv AdjP
- N -> boy | girl
- V -> sees | likes
- Adj -> big | small
- Adv -> very
- DetP -> a | the
the very small boy likes a girl
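To experiment with this fragment, here is a minimal sketch of a recognizer, reading each rule's right-hand side as a set of alternatives (NP rewrites as DetP N or as AdjP NP, and so on). The names `GRAMMAR` and `generates` are illustrative, and the approach (try every way of splitting a span among the right-hand-side symbols, with memoization) is just one simple option that assumes no empty productions and no unary cycles.

```python
from functools import lru_cache

GRAMMAR = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP")],
    "NP": [("DetP", "N"), ("AdjP", "NP")],
    "AdjP": [("Adj",), ("Adv", "AdjP")],
    "N": [("boy",), ("girl",)],
    "V": [("sees",), ("likes",)],
    "Adj": [("big",), ("small",)],
    "Adv": [("very",)],
    "DetP": [("a",), ("the",)],
}

def generates(sentence, start="S"):
    """Return True if the grammar derives the whitespace-tokenized sentence."""
    words = tuple(sentence.split())

    @lru_cache(maxsize=None)
    def derives(symbol, i, j):
        # Does `symbol` derive exactly words[i:j]?
        if symbol not in GRAMMAR:  # terminal symbol: must match a single word
            return j == i + 1 and words[i] == symbol
        return any(covers(rhs, i, j) for rhs in GRAMMAR[symbol])

    @lru_cache(maxsize=None)
    def covers(rhs, i, j):
        # Does the symbol sequence `rhs` derive exactly words[i:j]?
        if len(rhs) == 1:
            return derives(rhs[0], i, j)
        # The first symbol takes words[i:k]; each remaining symbol needs >= 1 word.
        return any(derives(rhs[0], i, k) and covers(rhs[1:], k, j)
                   for k in range(i + 1, j - len(rhs) + 2))

    return derives(start, 0, len(words))

print(generates("the boy likes a girl"))             # True
print(generates("the very small boy likes a girl"))  # False
print(generates("very small the boy likes a girl"))  # True
```

Under this reading, the fragment accepts "very small the boy likes a girl" but not "the very small boy likes a girl", because DetP must be followed directly by N.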
9Modify the grammar
12Derivations of CFGs
- String rewriting system: we derive a string (derived structure)
- But derivation history is represented by a phrase-structure tree (derivation structure)!
13Formal Definition of a CFG
- G = (V, T, P, S)
- V: finite set of nonterminal symbols
- T: finite set of terminal symbols; V and T are disjoint
- P: finite set of productions of the form A -> α, with A ∈ V and α ∈ (T ∪ V)*
- S ∈ V: start symbol
14Context?
- The notion of context in CFGs has nothing to do with the ordinary meaning of the word context in language
- All it really means is that the non-terminal on the left-hand side of a rule is out there all by itself (free of context)
- A -> B C
- Means that I can rewrite an A as a B followed by a C regardless of the context in which A is found
15Key Constituents (English)
- Sentences
- Noun phrases
- Verb phrases
- Prepositional phrases
16Sentence-Types
- Declaratives: I do not.
- S -> NP VP
- Imperatives: Go around again!
- S -> VP
- Yes-No Questions: Do you like my hat?
- S -> Aux NP VP
- WH Questions: What are they going to do?
- S -> WH Aux NP VP
20NPs
- NP -> Pronoun
- I came, you saw it, they conquered
- NP -> Proper-Noun
- New Jersey is west of New York City
- Lee Bollinger is the president of Columbia
- NP -> Det Noun
- The president
- NP -> Det Nominal
- Nominal -> Noun Noun
- A morning flight to Denver
21PPs
- PP -> Preposition NP
- Over the house
- Under the house
- To the tree
- At play
- At a party on a boat at night
27Recursion
- We'll have to deal with rules such as the following, where the non-terminal on the left also appears somewhere on the right (directly)
- NP -> NP PP (The flight to Boston)
- VP -> VP PP (departed Miami at noon)
28Recursion
- Of course, this is what makes syntax interesting
- Flights from Denver
- Flights from Denver to Miami
- Flights from Denver to Miami in February
- Flights from Denver to Miami in February on a Friday
- Flights from Denver to Miami in February on a Friday under $300
- Flights from Denver to Miami in February on a Friday under $300 with lunch
29Recursion
- Flights from Denver
- Flights from Denver to Miami
- Flights from Denver to Miami in February
- Flights from Denver to Miami in February on a Friday
- Etc.
- NP -> NP PP
30Implications of recursion and context-freeness
- If you have a rule like
- VP -> V NP
- It only cares that the thing after the verb is an NP
- It doesn't have to know about the internal affairs of that NP
31The point
- VP -> V NP
- (I) hate
- flights from Denver
- flights from Denver to Miami
- flights from Denver to Miami in February
- flights from Denver to Miami in February on a Friday
- flights from Denver to Miami in February on a Friday under $300
- flights from Denver to Miami in February on a Friday under $300 with lunch
32Grammar Equivalence
- Can have different grammars that generate same set of strings (weak equivalence)
- Grammar 1: NP -> DetP N and DetP -> a | the
- Grammar 2: NP -> a N | the N
- Can have different grammars that have same set of derivation trees (strong equivalence)
- With CFGs, possible only with useless rules
- Grammar 2: NP -> a N | the N
- Grammar 3: NP -> a N | the N, DetP -> many
- Strong equivalence implies weak equivalence
33Normal Forms etc.
- There are weakly equivalent normal forms (Chomsky Normal Form, Greibach Normal Form)
- There are ways to eliminate useless productions and so on
34Chomsky Normal Form
- A CFG is in Chomsky Normal Form (CNF) if all productions are of one of two forms
- A -> B C, with A, B, C nonterminals
- A -> a, with A a nonterminal and a a terminal
- Every CFG has a weakly equivalent CFG in CNF (leaving aside the empty string)
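The two permitted production shapes are easy to check mechanically. The following is a small sketch, not a conversion procedure; the function name `is_cnf` and the representation of productions as (left-hand side, right-hand-side tuple) pairs are assumptions made for illustration.

```python
def is_cnf(productions, nonterminals):
    """Check that every production is A -> B C (nonterminals) or A -> a (terminal).

    productions: iterable of (lhs, rhs) pairs, rhs being a tuple of symbols.
    nonterminals: set of nonterminal symbols; anything else counts as a terminal.
    """
    for lhs, rhs in productions:
        binary = len(rhs) == 2 and all(sym in nonterminals for sym in rhs)
        lexical = len(rhs) == 1 and rhs[0] not in nonterminals
        if lhs not in nonterminals or not (binary or lexical):
            return False
    return True

NONTERMINALS = {"S", "NP", "VP", "V", "AdjP", "Adj"}

# A -> B C and A -> a only: in CNF.
print(is_cnf([("S", ("NP", "VP")), ("VP", ("V", "NP")),
              ("NP", ("girl",)), ("V", ("likes",))], NONTERMINALS))  # True

# AdjP -> Adj is a unary rule over nonterminals: not in CNF.
print(is_cnf([("AdjP", ("Adj",)), ("Adj", ("big",))], NONTERMINALS))  # False
```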
35Generative Grammar
- Formal languages: formal device to generate a set of strings (such as a CFG)
- Linguistics (Chomskyan linguistics in particular): approach in which a linguistic theory enumerates all possible strings/structures in a language (competence)
- Chomskyan theories do not really use formal devices: they use CFG + informally defined transformations
36Nobody Uses Simple CFGs (Except Intro NLP Courses)
- All major syntactic theories (Chomsky, LFG, HPSG, TAG-based theories) represent both phrase structure and dependency, in one way or another
- All successful parsers currently use statistics about phrase structure and about dependency
- Derive dependency through head percolation: for each rule, say which daughter is head
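Head percolation can be sketched as a recursive walk over a phrase-structure tree. Everything specific here is an assumption for illustration: the tiny `HEAD_TABLE` (one head daughter per parent label), the tuple encoding of trees, and the function name `lexical_head`. Real head tables are larger and use ordered search rules.

```python
# Hypothetical head table: for each parent label, the label of its head daughter.
HEAD_TABLE = {"S": "VP", "VP": "V", "NP": "N", "PP": "P"}

def lexical_head(tree, deps):
    """Return the head word of `tree`, appending (head, dependent) pairs to `deps`.

    A tree is (label, child, child, ...); a preterminal is (pos_tag, word).
    """
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return children[0]  # preterminal: the word is its own head
    heads = [lexical_head(child, deps) for child in children]
    head_idx = next(i for i, child in enumerate(children)
                    if child[0] == HEAD_TABLE[label])
    # Every non-head daughter's head word depends on the head daughter's head word.
    for i, word in enumerate(heads):
        if i != head_idx:
            deps.append((heads[head_idx], word))
    return heads[head_idx]

tree = ("S",
        ("NP", ("DetP", "the"), ("N", "boy")),
        ("VP", ("V", "likes"),
               ("NP", ("DetP", "a"), ("N", "girl"))))
deps = []
root = lexical_head(tree, deps)
print(root)          # likes
print(sorted(deps))  # [('boy', 'the'), ('girl', 'a'), ('likes', 'boy'), ('likes', 'girl')]
```

The head word of each constituent percolates up to its parent, and every non-head daughter contributes one dependency arc.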
37Penn Treebank (PTB)
- Syntactically annotated corpus of newspaper texts (phrase structure)
- The newspaper texts are naturally occurring data, but the PTB is not!
- PTB annotation represents a particular linguistic theory (but a fairly vanilla one)
- Particularities
- Very indirect representation of grammatical relations (need for head percolation tables)
- Completely flat structure in NP (brown bag lunch, pink-and-yellow child seat)
- Has flat Ss, flat VPs
38Example from PTB
- ( (S (NP-SBJ It)
- (VP 's
- (NP-PRD (NP (NP the latest investment craze)
- (VP sweeping
- (NP Wall Street)))
-
- (NP (NP a rash)
- (PP of
- (NP (NP new closed-end country funds)
- ,
- (NP (NP those
- (ADJP publicly traded)
- portfolios)
- (SBAR (WHNP-37 that)
- (S (NP-SBJ *T*-37)
- (VP invest
- (PP-CLR in
- (NP (NP stocks)
- (PP of
39Types of syntactic constructions
- Is this the same construction?
- An elf decided to clean the kitchen
- An elf seemed to clean the kitchen
- An elf cleaned the kitchen
- Is this the same construction?
- An elf decided to be in the kitchen
- An elf seemed to be in the kitchen
- An elf was in the kitchen
40Types of syntactic constructions (ctd)
- Is this the same construction?
- There is an elf in the kitchen
- There decided to be an elf in the kitchen
- There seemed to be an elf in the kitchen
- Is this the same construction?
- It is raining/it rains
- ??It decided to rain/be raining
- It seemed to rain/be raining
41Types of syntactic constructions (ctd)
- Conclusion
- to seem: whatever is embedded surface subject can appear in upper clause
- to decide: only full nouns that are referential can appear in upper clause
- Two types of verbs
42Types of syntactic constructions Analysis
[Tree diagrams, side by side: (S (NP an elf) (VP (V to decide) (S (NP an elf) (VP (V to be) (PP in the kitchen))))) and (S (VP (V to seem) (S (NP an elf) (VP (V to be) (PP in the kitchen)))))]
43Types of syntactic constructions Analysis
[Tree diagram: (S (VP (V seemed) (S (NP an elf) (VP (V to be) (PP in the kitchen))))), with "an elf" also shown in the upper clause]
45Types of syntactic constructions Analysis
[Tree diagram: (S (NPi an elf) (VP (V seemed) (S (NP ti) (VP (V to be) (PP in the kitchen))))), where the lower copy of "an elf" is replaced by the trace ti]
46Types of syntactic constructions Analysis
- to seem: lower surface subject raises to upper clause; raising verb
- seems (there to be an elf in the kitchen)
- there seems (t to be an elf in the kitchen)
- it seems (there is an elf in the kitchen)
47Types of syntactic constructions Analysis (ctd)
- to decide: subject is in upper clause and co-refers with an empty subject in lower clause; control verb
- an elf decided (an elf to clean the kitchen)
- an elf decided (PRO to clean the kitchen)
- an elf decided (he cleans/should clean the kitchen)
- it decided (an elf cleans/should clean the kitchen)
48Lessons Learned from the Raising/Control Issue
- Use distribution of data to group phenomena into classes
- Use different underlying structure as basis for explanations
- Allow things to move around from underlying structure -> transformational grammar
- Check whether explanation you give makes predictions
49The Big Picture
[Diagram: three boxes connected by labeled arrows]
- Empirical Matter: Maud expects there to be a riot; Teri promised there to be a riot; Maud expects the shit to hit the fan; Teri promised the shit to hit the
- Formalisms: data structures, formalisms, algorithms, distributional models
- Linguistic Theory: content (relate morphology to semantics), surface representation (eg, ps), deep representation (eg, dep), correspondence
- Arrow labels: "descriptive theory is about", "predicts", "uses", "explanatory theory is about", "or"
50Syntactic Parsing
51Syntactic Parsing
- Declarative formalisms like CFGs, FSAs define the legal strings of a language, but only tell you "this is a legal string of the language X"
- Parsing algorithms specify how to recognize the strings of a language and assign each string one (or more) syntactic analyses
52Parsing as a Form of Search
- Searching FSAs
- Finding the right path through the automaton
- Search space defined by structure of FSA
- Searching CFGs
- Finding the right parse tree among all possible parse trees
- Search space defined by the grammar
- Constraints provided by the input sentence and the automaton or grammar
53CFG for Fragment of English
S -> NP VP          VP -> V
S -> Aux NP VP      PP -> Prep NP
NP -> Det Nom       N -> old | dog | footsteps | young
NP -> PropN         V -> dog | include | prefer
Nom -> Adj Nom      Aux -> does
Nom -> N Nom        Prep -> from | to | on | of
Nom -> N            PropN -> Bush | McCain | Obama
Nom -> Nom PP       Det -> that | this | a | the
VP -> V NP
54Parse Tree for "The old dog the footsteps of the young" for Prior CFG
(S (NP (DET The) (NOM (N old))) (VP (V dog) (NP (DET the) (NOM (N footsteps) (PP of the young)))))
55Top-Down Parser
- Builds from the root S node to the leaves
- Expectation-based
- Common search strategy
- Top-down, left-to-right, backtracking
- Try first rule with LHS S
- Next expand all constituents in these trees/rules
- Continue until leaves are POS
- Backtrack when candidate POS does not match input
string
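The top-down, left-to-right, backtracking strategy can be sketched as follows, using the CFG fragment from the earlier slide. Two things are assumptions rather than part of the slides: the function names, and the pruning test that abandons a hypothesis once it needs more words than remain. That test is what keeps the left-recursive rule Nom -> Nom PP from expanding forever, and it is only valid because this grammar has no empty productions.

```python
GRAMMAR = {
    "S": [("NP", "VP"), ("Aux", "NP", "VP")],
    "NP": [("Det", "Nom"), ("PropN",)],
    "Nom": [("Adj", "Nom"), ("N", "Nom"), ("N",), ("Nom", "PP")],
    "VP": [("V", "NP"), ("V",)],
    "PP": [("Prep", "NP")],
}
LEXICON = {  # part-of-speech category -> words
    "N": {"old", "dog", "footsteps", "young"},
    "V": {"dog", "include", "prefer"},
    "Aux": {"does"},
    "Prep": {"from", "to", "on", "of"},
    "PropN": {"Bush", "McCain", "Obama"},
    "Det": {"that", "this", "a", "the"},
}

def top_down(symbols, words):
    """Can the pending symbols (leftmost first) derive exactly the remaining words?"""
    if not symbols:
        return not words  # success only if the input is used up too
    if len(symbols) > len(words):
        return False  # assumption: every symbol yields at least one word
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        # Expand the leftmost nonterminal; any() backtracks over the alternatives.
        return any(top_down(rhs + rest, words) for rhs in GRAMMAR[first])
    # Leaf is a POS category: it must match the next input word.
    return words[0] in LEXICON.get(first, ()) and top_down(rest, words[1:])

def recognize(sentence):
    return top_down(("S",), tuple(sentence.split()))

print(recognize("the old dog the footsteps of the young"))  # True
print(recognize("does Obama prefer the dog"))               # True
print(recognize("dog the"))                                 # False
```

On the first sentence the recognizer initially tries "old dog" as a Nom and has to backtrack before it finds the analysis in which "dog" is the verb.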
56Rule Expansion
- The old dog the footsteps of the young.
- Where does backtracking happen?
- What are the computational disadvantages?
- What are the advantages?
57Bottom-Up Parsing
- Parser begins with words of input and builds up trees, applying grammar rules whose RHS matches
- Det N V Det N Prep Det N
- The old dog the footsteps of the young.
- Det Adj N Det N Prep Det N
- The old dog the footsteps of the young.
- Parse continues until an S root node reached or no further node expansion possible
58What's right/wrong with...
- Top-Down parsers: they never explore illegal parses (e.g. which can't form an S), but waste time on trees that can never match the input
- Bottom-Up parsers: they never explore trees inconsistent with input, but waste time exploring illegal parses (with no S root)
- For both: find a control strategy, i.e. how to explore the search space efficiently?
- Pursuing all parses in parallel or backtrack or ...?
- Which rule to apply next?
- Which node to expand next?
59Some Solutions
- Dynamic Programming Approaches: Use a chart to represent partial results
- CKY Parsing Algorithm
- Bottom-up
- Grammar must be in Normal Form
- The parse tree might not be consistent with linguistic theory
- Earley Parsing Algorithm
- Top-down
- Expectations about constituents are confirmed by input
- A POS tag for a word that is not predicted is never added
- Chart Parser
60Earley Parsing
- Allows arbitrary CFGs
- Fills a table in a single sweep over the input words
- Table is length N+1; N is number of words
- Table entries represent
- Completed constituents and their locations
- In-progress constituents
- Predicted constituents
61States
- The table-entries are called states and are represented with dotted-rules.
- S -> . VP (A VP is predicted)
- NP -> Det . Nominal (An NP is in progress)
- VP -> V NP . (A VP has been found)
62States/Locations
- It would be nice to know where these things are in the input so...
- S -> . VP [0,0] (A VP is predicted at the start of the sentence)
- NP -> Det . Nominal [1,2] (An NP is in progress; the Det goes from 1 to 2)
- VP -> V NP . [0,3] (A VP has been found starting at 0 and ending at 3)
63Graphically
64Earley
- As with most dynamic programming approaches, the answer is found by looking in the table in the right place.
- In this case, there should be an S state in the final column (the last of the N+1 chart entries) that spans from 0 to N and is complete.
- If that's the case you're done.
- S -> α . [0,N]
65Earley Algorithm
- March through chart left-to-right.
- At each step, apply 1 of 3 operators
- Predictor
- Create new states representing top-down expectations
- Scanner
- Match word predictions (rule with word after dot) to words
- Completer
- When a state is complete, see what rules were looking for that completed constituent
66Predictor
- Given a state
- With a non-terminal to right of dot
- That is not a part-of-speech category
- Create a new state for each expansion of the non-terminal
- Place these new states into same chart entry as the generating state, beginning and ending where generating state ends.
- So predictor looking at
- S -> . VP [0,0]
- results in
- VP -> . Verb [0,0]
- VP -> . Verb NP [0,0]
67Scanner
- Given a state
- With a non-terminal to right of dot
- That is a part-of-speech category
- If the next word in the input matches this part-of-speech
- Create a new state with dot moved over the non-terminal
- So scanner looking at
- VP -> . Verb NP [0,0]
- If the next word, book, can be a verb, add new state
- VP -> Verb . NP [0,1]
- Add this state to chart entry following current one
- Note: Earley algorithm uses top-down input to disambiguate POS! Only POS predicted by some state can get added to chart!
68Completer
- Applied to a state when its dot has reached right end of rule.
- Parser has discovered a category over some span of input.
- Find and advance all previous states that were looking for this category
- copy state, move dot, insert in current chart entry
- Given
- NP -> Det Nominal . [1,3]
- VP -> Verb . NP [0,1]
- Add
- VP -> Verb NP . [0,3]
69Earley: how do we know we are done?
- How do we know when we are done?
- Find an S state in the final column (the last of the N+1 chart entries) that spans from 0 to N and is complete.
- If that's the case you're done.
- S -> α . [0,N]
70Earley
- More specifically
- 1. Predict all the states you can upfront
- 2. Read a word
- 3. Extend states based on matches
- 4. Add new predictions
- 5. Go to 2
- 6. Look at the last (N+1th) chart entry to see if you have a winner
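The three operators can be put together in a compact recognizer. This is a sketch under stated assumptions, not a reference implementation: the toy grammar and lexicon cover only "book that flight", the dummy start rule named GAMMA is a common textbook device added here, a state stores only its start position (its end position is the index of the chart entry holding it), and the scanner moves the dot over the POS category directly.

```python
from collections import namedtuple

# A dotted rule plus the position where the constituent starts.
State = namedtuple("State", "lhs rhs dot start")

GRAMMAR = {
    "S": [("VP",)],
    "VP": [("Verb",), ("Verb", "NP")],
    "NP": [("Det", "Nominal")],
    "Nominal": [("Noun",)],
}
POS_TAGS = {"Verb", "Noun", "Det"}
LEXICON = {"book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"}}

def earley_recognize(words):
    n = len(words)
    chart = [[] for _ in range(n + 1)]  # N+1 chart entries

    def add(state, k):
        if state not in chart[k]:
            chart[k].append(state)

    add(State("GAMMA", ("S",), 0, 0), 0)  # dummy start state (assumption)
    for k in range(n + 1):
        for state in chart[k]:  # chart[k] may grow while we iterate over it
            if state.dot == len(state.rhs):
                # Completer: advance states that were waiting for this category.
                for prev in chart[state.start]:
                    if prev.dot < len(prev.rhs) and prev.rhs[prev.dot] == state.lhs:
                        add(prev._replace(dot=prev.dot + 1), k)
            elif state.rhs[state.dot] in POS_TAGS:
                # Scanner: only a predicted POS can be matched against the word.
                if k < n and state.rhs[state.dot] in LEXICON.get(words[k], ()):
                    add(state._replace(dot=state.dot + 1), k + 1)
            else:
                # Predictor: expand the non-terminal to the right of the dot.
                nonterminal = state.rhs[state.dot]
                for rhs in GRAMMAR[nonterminal]:
                    add(State(nonterminal, rhs, 0, k), k)
    # Done if the dummy rule is complete over the whole input, i.e. an S spans 0..N.
    return State("GAMMA", ("S",), 1, 0) in chart[n]

print(earley_recognize(["book", "that", "flight"]))  # True
print(earley_recognize(["book", "that"]))            # False
print(earley_recognize(["that", "flight"]))          # False
```

Because each state is added to a chart entry at most once, left-recursive rules do not cause the loop to run forever.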
71Example
- Book that flight
- We should find an S from 0 to 3 that is a
completed state
72Example
73Example
74Example
75Details
- What kind of algorithms did we just describe?
- Not parsers: recognizers
- The presence of an S state with the right attributes in the right place indicates a successful recognition.
- But no parse tree... no parser
- That's how we solve (not) an exponential problem in polynomial time
76Converting Earley from Recognizer to Parser
- With the addition of a few pointers we have a parser
- Augment the Completer to point to where we came from.
77Augmenting the chart with structural information
[Figure: chart states annotated with pointers to the states (S8 through S13) that completed them]
78Retrieving Parse Trees from Chart
- All the possible parses for an input are in the table
- We just need to read off all the backpointers from every complete S in the last column of the table
- Find all the S -> X . [0,N]
- Follow the structural traces from the Completer
- Of course, this won't be polynomial time, since there could be an exponential number of trees
- So we can at least represent ambiguity efficiently
79Left Recursion vs. Right Recursion
- Depth-first search will never terminate if grammar is left recursive (e.g. NP -> NP PP)
80- Solutions
- Rewrite the grammar (automatically?) to a weakly equivalent one which is not left-recursive
- e.g. The man on the hill with the telescope
- NP -> NP PP (wanted: Nom plus a sequence of PPs)
- NP -> Nom PP
- NP -> Nom
- Nom -> Det N
- becomes
- NP -> Nom NP'
- Nom -> Det N
- NP' -> PP NP' (wanted: a sequence of PPs)
- NP' -> e
- Not so obvious what these rules mean
- Not so obvious what these rules mean
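The rewrite can be done automatically for immediate left recursion with the standard textbook transformation: A -> A α | β becomes A -> β A' and A' -> α A' | e. Below is a minimal sketch; the function name and the convention of writing the new non-terminal as the old name plus a prime are assumptions, and the empty tuple stands for e.

```python
def remove_immediate_left_recursion(lhs, alternatives):
    """Rewrite the rules for `lhs` so that none starts with `lhs` itself.

    alternatives: list of right-hand sides (tuples of symbols) for `lhs`.
    Returns a dict mapping non-terminals to their new right-hand sides.
    """
    # Split A -> A alpha (keep alpha) from A -> beta.
    recursive = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == lhs]
    others = [rhs for rhs in alternatives if not rhs or rhs[0] != lhs]
    if not recursive:
        return {lhs: list(alternatives)}
    new = lhs + "'"
    return {
        lhs: [beta + (new,) for beta in others],                  # A  -> beta A'
        new: [alpha + (new,) for alpha in recursive] + [()],      # A' -> alpha A' | e
    }

rules = remove_immediate_left_recursion(
    "NP", [("NP", "PP"), ("Nom", "PP"), ("Nom",)])
print(rules["NP"])   # [('Nom', 'PP', "NP'"), ('Nom', "NP'")]
print(rules["NP'"])  # [('PP', "NP'"), ()]
```

The mechanical result keeps NP -> Nom PP NP' alongside NP -> Nom NP'. The first alternative is redundant, since NP' can already start with a PP, which is presumably why the hand-written version on the slide lists only NP -> Nom NP'.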
81- Harder to detect and eliminate non-immediate left recursion
- NP -> Nom PP
- Nom -> NP
- Fix depth of search explicitly
- Rule ordering: non-recursive rules first
- NP -> Det Nom
- NP -> NP PP
82Another Problem Structural ambiguity
- Multiple legal structures
- Attachment (e.g. I saw a man on a hill with a telescope)
- Coordination (e.g. younger cats and dogs)
- NP bracketing (e.g. Spanish language teachers)
83NP vs. VP Attachment
84- Solution?
- Return all possible parses and disambiguate using
other methods
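To get a feel for how many parses "all possible parses" can mean, the sketch below counts them with a CKY-style chart instead of enumerating trees. The small CNF grammar and lexicon are hypothetical, written only to cover the attachment example, and `count_parses` is an illustrative name.

```python
from collections import defaultdict

# Hypothetical CNF grammar: (left child, right child) -> possible parent labels.
BINARY = {
    ("NP", "VP"): {"S"},
    ("V", "NP"): {"VP"},
    ("VP", "PP"): {"VP"},   # PP attaches to the verb phrase
    ("NP", "PP"): {"NP"},   # PP attaches to the noun phrase
    ("Det", "N"): {"NP"},
    ("P", "NP"): {"PP"},
}
LEXICON = {
    "I": {"NP"}, "saw": {"V"}, "a": {"Det"},
    "man": {"N"}, "hill": {"N"}, "telescope": {"N"},
    "on": {"P"}, "with": {"P"},
}

def count_parses(words, start="S"):
    """Number of distinct parse trees rooted in `start` that cover all the words."""
    n = len(words)
    counts = defaultdict(int)  # (i, j, label) -> number of trees over words[i:j]
    for i, word in enumerate(words):
        for label in LEXICON[word]:
            counts[i, i + 1, label] += 1
    for width in range(2, n + 1):          # shorter spans are filled in first
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):      # split point
                for (left, right), parents in BINARY.items():
                    ways = counts[i, k, left] * counts[k, j, right]
                    if ways:
                        for parent in parents:
                            counts[i, j, parent] += ways
    return counts[0, n, start]

print(count_parses("I saw a man".split()))                               # 1
print(count_parses("I saw a man on a hill".split()))                     # 2
print(count_parses("I saw a man on a hill with a telescope".split()))    # 5
```

With this grammar each extra prepositional phrase multiplies the attachment choices, so the chart stays small while the number of trees it represents grows quickly.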
85Summing Up
- Parsing is a search problem which may be implemented with many control strategies
- Top-Down or Bottom-Up approaches each have problems
- Combining the two solves some but not all issues
- Left recursion
- Syntactic ambiguity
- Next time: Making use of statistical information about syntactic constituents
- Read Ch 14