Title: How to perform tree surgery
1. How to perform tree surgery
- Anna Rafferty
- Marie-Catherine de Marneffe
2. Tsurgeon by Roger Levy
- What?
  - performs operations on syntactic (parse) trees
- How?
  - based on Tregex syntax
- Where?
  - JavaNLP, package edu.stanford.nlp.trees.tregex.tsurgeon
3. How? Tregex
- utility for identifying patterns in trees
- (like regular expressions for strings)
- node descriptions and relationships between nodes
- Example: NP < /NN/
    matches an NP that immediately dominates a node whose label contains NN
4. Tsurgeon syntax
- Define a pattern to be matched on the trees
    VBZ=vbz $ NP
- Define one or several operation(s)
    relabel vbz VBZ_TRANSITIVE
5. Delete
- Tree:
    (ROOT
      (SBARQ
        (SQ (NP (NNS Cats))
          (VP (VBP do)
            (VP (WHNP what)
              (VB eat))))
        (PUNCT ?)))
- Pattern: PUNCT=punct > SBARQ
- Operation: delete punct
- Syntax: delete <name1> ... <nameN>
- Deletes each named node and everything below it.
7. Excise
- Tree:
    (ROOT
      (SBARQ
        (SQ (NP (NNS Cats))
          (VP (VBP do)
            (VP (WHNP what)
              (VB eat))))))
- Pattern: SBARQ=sbarq > ROOT
- Operation: excise sbarq sbarq
- Result:
    (ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat)))))
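- Syntax: excise <name1> <name2>
- <name1> must either be <name2> or dominate <name2>. Everything from <name1> down to <name2> is removed, and all children of <name2> are attached to the parent of <name1>, in the position where <name1> was.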
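The excise rule above can be sketched on a toy tree. This is only an illustration of the described behaviour, not Tsurgeon's implementation: the `Node` class, `excise` and `demo` are hypothetical names written for this example, and the sketch assumes the first node is not the root.

```java
import java.util.ArrayList;
import java.util.List;

class ExciseSketch {
    /** Toy tree node (hypothetical; not Tsurgeon's own Tree class). */
    static class Node {
        final String label;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) { k.parent = this; children.add(k); }
        }

        @Override public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(" + label);
            for (Node k : children) sb.append(' ').append(k);
            return sb.append(')').toString();
        }
    }

    /** excise top bottom: the children of bottom take top's place under top's parent. */
    static void excise(Node top, Node bottom) {
        Node p = top.parent;                  // assumption: top is not the root
        int i = p.children.indexOf(top);
        p.children.remove(i);
        top.parent = null;
        List<Node> kids = new ArrayList<>(bottom.children);
        bottom.children.clear();
        for (Node k : kids) k.parent = p;
        p.children.addAll(i, kids);           // inserted where top used to be
    }

    /** Builds the slide's tree and excises from SBARQ down to SBARQ itself or down to SQ. */
    static String demo(boolean downToSq) {
        Node sq = new Node("SQ",
            new Node("NP", new Node("NNS", new Node("Cats"))),
            new Node("VP", new Node("VBP", new Node("do")),
                new Node("VP", new Node("WHNP", new Node("what")),
                               new Node("VB", new Node("eat")))));
        Node sbarq = new Node("SBARQ", sq);
        Node root = new Node("ROOT", sbarq);
        excise(sbarq, downToSq ? sq : sbarq);
        return root.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // like: excise sbarq sbarq
        System.out.println(demo(true));  // like: excise sbarq sq
    }
}
```

With `demo(false)` only SBARQ disappears and SQ moves up under ROOT; with `demo(true)` both SBARQ and SQ disappear and the NP and VP become daughters of ROOT.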
9. Prune
- Syntax: prune <name1> ... <nameN>
- Different from delete: if pruning leaves the parent with no children, the parent is pruned too.
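The difference between delete and prune may be easier to see on a toy tree. The sketch below only mimics the behaviour described on the slides; `Node`, `delete`, `prune` and the sample tree are hypothetical names made up for this example, and the sketch stops at the root instead of removing it.

```java
import java.util.ArrayList;
import java.util.List;

class DeleteVsPrune {
    /** Toy tree node (hypothetical; not Tsurgeon's own Tree class). */
    static class Node {
        final String label;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) { k.parent = this; children.add(k); }
        }

        @Override public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(" + label);
            for (Node k : children) sb.append(' ').append(k);
            return sb.append(')').toString();
        }
    }

    /** delete: remove the node and everything below it (assumes n is not the root). */
    static void delete(Node n) {
        n.parent.children.remove(n);
        n.parent = null;
    }

    /** prune: like delete, but ancestors left without children are removed as well. */
    static void prune(Node n) {
        Node p = n.parent;
        delete(n);
        // Assumption of this sketch: an emptied root is kept rather than removed.
        if (p.children.isEmpty() && p.parent != null) prune(p);
    }

    static Node sample() {
        return new Node("S",
            new Node("NP", new Node("NN", new Node("cats"))),
            new Node("VP", new Node("VB", new Node("eat"))));
    }

    static String afterDelete() {
        Node t = sample();
        delete(t.children.get(0).children.get(0)); // the NN node
        return t.toString();
    }

    static String afterPrune() {
        Node t = sample();
        prune(t.children.get(0).children.get(0));  // the NN node
        return t.toString();
    }

    public static void main(String[] args) {
        System.out.println(afterDelete()); // (S NP (VP (VB eat)))
        System.out.println(afterPrune());  // (S (VP (VB eat)))
    }
}
```

After delete, the NP survives as a childless node; after prune, the emptied NP is removed too.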
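10. Insert
- Tree:
    (ROOT
      (SQ (NP (NNS Cats))
        (VP (VBP do)
          (VP (WHNP what)
            (VB eat)))))
- Pattern: SQ=sq > ROOT !<- /PUNCT/
- Operation: insert (PUNCT .) >-1 sq
    where (PUNCT .) is the <tree> and >-1 sq is the <position>
- Result:
    (ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (WHNP what) (VB eat))) (PUNCT .)))
- Caveat: rules are applied cyclically, i.e. again and again for as long as the pattern matches. Without the !<- /PUNCT/ condition the insertion would repeat forever.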
11. Position for insert and move
- insert <name> <position>
- insert <tree> <position>
- <position> := <relation> <name>
- <relation>:
  - $+   the left sister of the named node
  - $-   the right sister of the named node
  - >i   the i-th daughter of the named node
  - >-i  the i-th daughter, counting from the right, of the named node
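As a rough illustration of the four relations, here is a toy sketch that places a node at a given position and treats move as "detach, then insert". It is not Tsurgeon's code: `Node`, `insert`, `move`, `attach` and `demo` are hypothetical names for this example, and the relation is passed as a plain string.

```java
import java.util.ArrayList;
import java.util.List;

class InsertPositions {
    /** Toy tree node (hypothetical; not Tsurgeon's own Tree class). */
    static class Node {
        final String label;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) { k.parent = this; children.add(k); }
        }

        @Override public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(" + label);
            for (Node k : children) sb.append(' ').append(k);
            return sb.append(')').toString();
        }
    }

    static void attach(Node parent, int index, Node n) {
        n.parent = parent;
        parent.children.add(index, n);
    }

    /** Places n at the position given by relation and target. */
    static void insert(Node n, String relation, Node target) {
        if (relation.equals("$+")) {             // n becomes the left sister of target
            Node p = target.parent;
            attach(p, p.children.indexOf(target), n);
        } else if (relation.equals("$-")) {      // n becomes the right sister of target
            Node p = target.parent;
            attach(p, p.children.indexOf(target) + 1, n);
        } else if (relation.startsWith(">-")) {  // i-th daughter, counting from the right
            int i = Integer.parseInt(relation.substring(2));
            attach(target, target.children.size() - i + 1, n);
        } else if (relation.startsWith(">")) {   // i-th daughter, counting from the left
            int i = Integer.parseInt(relation.substring(1));
            attach(target, i - 1, n);
        } else {
            throw new IllegalArgumentException("unknown relation: " + relation);
        }
    }

    /** move: detach the node from its parent, then insert it at the position. */
    static void move(Node n, String relation, Node target) {
        n.parent.children.remove(n);
        n.parent = null;
        insert(n, relation, target);
    }

    static String demo() {
        Node wh = new Node("WHNP", new Node("what"));
        Node vb = new Node("VB", new Node("eat"));
        Node sq = new Node("SQ",
            new Node("NP", new Node("NNS", new Node("Cats"))),
            new Node("VP", wh, vb));
        insert(new Node("PUNCT", new Node(".")), ">-1", sq); // like: insert (PUNCT .) >-1 sq
        move(vb, "$+", wh);                                   // like: move vb $+ wh
        return sq.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // (SQ (NP (NNS Cats)) (VP (VB eat) (WHNP what)) (PUNCT .))
    }
}
```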
12. Move
- Tree:
    (ROOT
      (SQ
        (NP (NNS Cats))
        (VP (VBP do)
          (VP (WHNP what)
            (VB eat)))
        (PUNCT .)))
- Pattern: VP < (/WH/=wh $++ /VB/=vb)
- Operation: move vb $+ wh
    where $+ wh is the <position>
- Syntax: move <name> <position>
- Moves the named node into the specified position.
- Result:
    (ROOT (SQ (NP (NNS Cats)) (VP (VBP do) (VP (VB eat) (WHNP what))) (PUNCT .)))
14. Adjoin
- Tree:
    (ROOT
      (SQ
        (NP (NNS Cats))
        (VP (VBP do)
          (VP (VB eat)
            (WHNP what)))
        (PUNCT .)))
- Pattern: VP=vp > SQ !> (__ << usually)
- Operation: adjoin (VP (ADVP (ADV usually)) VP@) vp
- Result:
    (ROOT (SQ (NP (NNS Cats)) (VP (ADVP (ADV usually)) (VP (VBP do) (VP (VB eat) (WHNP what)))) (PUNCT .)))
15. Adjoin syntax
- adjoin <auxiliary_tree> <name>
- Adjoins the specified auxiliary tree into the named node.
- The daughters of the target node become the daughters of the foot of the auxiliary tree.
- Example: adjoin (VP (ADVP (ADV usually)) VP@) vp
    where VP@ marks the foot
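The adjoin behaviour described above can be sketched on a toy tree as follows. This is an illustration under simplifying assumptions, not Tsurgeon's implementation: `Node`, `adjoin` and `demo` are hypothetical names, and the foot is passed in explicitly instead of being marked with @.

```java
import java.util.ArrayList;
import java.util.List;

class AdjoinSketch {
    /** Toy tree node (hypothetical; not Tsurgeon's own Tree class). */
    static class Node {
        final String label;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) { k.parent = this; children.add(k); }
        }

        @Override public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(" + label);
            for (Node k : children) sb.append(' ').append(k);
            return sb.append(')').toString();
        }
    }

    /** Adjoins the auxiliary tree (root auxRoot, foot node foot) into target. */
    static void adjoin(Node auxRoot, Node foot, Node target) {
        // The daughters of the target become the daughters of the foot.
        for (Node k : target.children) { k.parent = foot; foot.children.add(k); }
        target.children.clear();
        // The auxiliary tree takes the target's place under the target's parent.
        Node p = target.parent;               // assumption: target is not the root
        p.children.set(p.children.indexOf(target), auxRoot);
        auxRoot.parent = p;
        target.parent = null;
    }

    static String demo() {
        Node vp = new Node("VP", new Node("VBP", new Node("do")),
            new Node("VP", new Node("VB", new Node("eat")),
                           new Node("WHNP", new Node("what"))));
        Node root = new Node("ROOT", new Node("SQ",
            new Node("NP", new Node("NNS", new Node("Cats"))),
            vp,
            new Node("PUNCT", new Node("."))));
        Node foot = new Node("VP");           // plays the role of VP@
        Node aux = new Node("VP",
            new Node("ADVP", new Node("ADV", new Node("usually"))), foot);
        adjoin(aux, foot, vp);
        return root.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The new VP sits where the old one was, and the old VP's daughters (VBP and the inner VP) now hang under the foot.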
16. On the command line
- java edu.stanford.nlp.trees.tregex.tsurgeon.Tsurgeon -treeFile <aFile> <operationFile>
- <aFile>: a file containing the trees to be transformed
- <operationFile>: a file containing
  - a pattern (Tregex expression)
  - an empty line
  - the operation(s), one per line
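17. How to use the Tsurgeon class
    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.List;
    import edu.stanford.nlp.trees.Tree;
    import edu.stanford.nlp.trees.tregex.TregexPattern;
    import edu.stanford.nlp.trees.tregex.tsurgeon.Tsurgeon;
    import edu.stanford.nlp.trees.tregex.tsurgeon.TsurgeonPattern;

    // lTrees is the collection of trees to be transformed
    TregexPattern matchPattern = TregexPattern.compile("SQ=sq < (/WH/ $++ VP)");
    List<TsurgeonPattern> ps = new ArrayList<TsurgeonPattern>();
    TsurgeonPattern p = Tsurgeon.parseOperation("relabel sq S");
    ps.add(p);
    Collection<Tree> result = Tsurgeon.processPatternOnTrees(matchPattern, Tsurgeon.collectOperations(ps), lTrees);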
18. To become a specialist
- See Roger's README!
- Practice tree surgery!