Unambiguous + Unlimited = Unsupervised: Using the Web for Natural Language Processing Problems - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Unambiguous + Unlimited = Unsupervised: Using the Web for Natural Language Processing Problems


1
Unambiguous + Unlimited = Unsupervised: Using the Web for Natural Language Processing Problems
  • Marti Hearst
  • School of Information, UC Berkeley
  • UCB Neyman Seminar
  • October 25, 2006

This research supported in part by NSF DBI-0317510
2
Natural Language Processing
  • The ultimate goal: write programs that read and understand stories and conversations.
  • This is too hard! Instead we tackle sub-problems.
  • There have been notable successes lately:
  • Machine translation is vastly improved
  • Speech recognition is decent in limited
    circumstances
  • Text categorization works with some accuracy

3
Automatic Help Desk Translation at MS
4
Why is text analysis difficult?
  • One reason: enormous vocabulary size.
  • The average English speaker's vocabulary is around 50,000 words,
  • Many of these can be combined with many others,
  • And they mean different things when they do!

5
How can a machine understand these differences?
  • Get the cat with the gloves.

6
How can a machine understand these differences?
  • Get the sock from the cat with the gloves.
  • Get the glove from the cat with the socks.

7
How can a machine understand these differences?
  • Decorate the cake with the frosting.
  • Decorate the cake with the kids.
  • Throw out the cake with the frosting.
  • Throw out the cake with the kids.

8
Why is this difficult?
  • Same syntactic structure, different meanings.
  • Natural language processing algorithms have to
    deal with the specifics of individual words.
  • Enormous vocabulary sizes.
  • The average English speaker's vocabulary is around 50,000 words,
  • Many of these can be combined with many others,
  • And they mean different things when they do!

9
How to tackle this problem?
  • The field was stuck for quite some time:
  • Hand-enter all semantic concepts and relations
  • A new approach started around 1990:
  • Get large text collections
  • Compute statistics over the words in those collections
  • There are many different algorithms.

10
Size Matters
  • Recent realization: bigger is better than smarter!
  • Banko and Brill '01, "Scaling to Very, Very Large Corpora for Natural Language Disambiguation," ACL

11
Example Problem
  • Grammar checker example
  • Which word to use?
  • <principal> vs. <principle>
  • Solution: use well-edited text and look at which words surround each use
  • I am in my third year as the principal of Anamosa
    High School.
  • School-principal transfers caused some upset.
  • This is a simple formulation of the quantum
    mechanical uncertainty principle.
  • Power without principle is barren, but principle
    without power is futile. (Tony Blair)

12
Using Very, Very Large Corpora
  • Keep track of which words are the neighbors of each spelling in well-edited text, e.g.
  • Principal: high school
  • Principle: rule
  • At grammar-check time, choose the spelling best predicted by the surrounding words (a minimal sketch follows below).
  • Surprising results:
  • Log-linear improvement even to a billion words!
  • Getting more data is better than fine-tuning algorithms!
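A minimal sketch of this idea in Python, assuming a plain-text well-edited corpus file and a whitespace tokenizer (both hypothetical placeholders); it counts the neighbors of each spelling in the corpus and then picks the spelling whose neighbors best match a new context:

    from collections import Counter

    CONFUSABLES = ("principal", "principle")

    def neighbor_counts(tokens, window=2):
        # Count words appearing within +/- window positions of each confusable spelling.
        counts = {w: Counter() for w in CONFUSABLES}
        for i, tok in enumerate(tokens):
            if tok in counts:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                counts[tok].update(t for j, t in enumerate(tokens[lo:hi], lo) if j != i)
        return counts

    def choose_spelling(context_words, counts):
        # Pick the spelling whose well-edited-text neighbors best match the context.
        return max(CONFUSABLES, key=lambda w: sum(counts[w][c] for c in context_words))

    # Hypothetical usage: build counts from a large edited corpus, then check a context.
    tokens = open("well_edited_corpus.txt").read().lower().split()
    counts = neighbor_counts(tokens)
    print(choose_spelling(["high", "school"], counts))   # expected: principal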

13
The Effects of LARGE Datasets
  • From Banko & Brill '01

14
How to Extend this Idea?
  • This is an exciting result
  • BUT relies on having huge amounts of text that
    has been appropriately annotated!

15
How to Avoid Manual Labeling?
  • Web as a baseline (Lapata & Keller '04, '05)
  • Main idea: apply web-determined counts to every problem imaginable.
  • Example: for t in {<principal>, <principle>}
  • Compute f(w-1, t, w+1)
  • The largest count wins

16
Web as a Baseline
  • Works very well in some cases:
  • machine translation candidate selection
  • article generation
  • noun compound interpretation
  • noun compound bracketing
  • adjective ordering
  • But lacking in others:
  • spelling correction
  • countability detection
  • prepositional phrase attachment
  • How to push this idea further?

Significantly better than the best supervised
algorithm.
Not significantly different from the best
supervised.
17
Using Unambiguous Cases
  • The trick: look for unambiguous cases to start
  • Use these to improve the results beyond what co-occurrence statistics indicate.
  • An early example:
  • Hindle and Rooth, "Structural Ambiguity and Lexical Relations," ACL '90, Computational Linguistics '93
  • Problem: prepositional phrase attachment
  • I eat/v spaghetti/n1 with/p a fork/n2.
  • I eat/v spaghetti/n1 with/p sauce/n2.
  • Question: does n2 attach to v or to n1?

18
Using Unambiguous Cases
  • How to do this with unlabeled data?
  • First try:
  • Parse some text into phrase structure
  • Then compute certain co-occurrences:
  • f(v, n1, p), f(n1, p), f(v, n1)
  • Problem: results not accurate enough
  • The trick: look for unambiguous cases
  • Spaghetti with sauce is delicious.  (pre-verbal)
  • I eat with a fork.  (no direct object)
  • Use these to improve the results beyond what
    co-occurrence statistics indicate.

19
Using Unambiguous Cases
  • Hindle & Rooth, final algorithm:
  • Parse text into phrase structure.
  • Create bigram counts (v, p) and (n1, p) as follows:
  • First, use unambiguous cases to populate the bigram table
  • Then, for the ambiguous cases:
  • Compute a Lexical Association score comparing (v, n1, p) to (n1, p, n2) (a rough sketch follows below)
  • If this is greater than a threshold, update the bigram table with the assumed attachment
  • Else split the score and assign it to both attachments
  • The bigram table is used for further computations of the Lexical Association score.
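A rough sketch of the association comparison, assuming the bigram tables have already been populated from unambiguous cases; the smoothing constants and the threshold here are illustrative simplifications of Hindle & Rooth's estimator, not the exact values they used:

    import math
    from collections import Counter

    # Hypothetical bigram tables filled from unambiguous cases:
    #   verb_p[(v, p)] / noun_p[(n, p)] = how often preposition p was seen with v / n
    verb_p, noun_p = Counter(), Counter()
    verb_total, noun_total = Counter(), Counter()

    def lexical_association(v, n1, p):
        # Log-ratio of how strongly p associates with the verb vs. with the noun.
        # Positive favors verb attachment, negative favors noun attachment.
        p_given_v = (verb_p[(v, p)] + 0.5) / (verb_total[v] + 1.0)
        p_given_n = (noun_p[(n1, p)] + 0.5) / (noun_total[n1] + 1.0)
        return math.log2(p_given_v / p_given_n)

    def attach(v, n1, p, threshold=2.0):
        score = lexical_association(v, n1, p)
        if abs(score) >= threshold:
            return "verb" if score > 0 else "noun"
        return "split"   # in the full algorithm the count is shared between both attachments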

20
Unambiguous + Unlimited = Unsupervised
  • Apply the unambiguous-case idea to the very, very large corpora idea
  • The potential of these approaches is not fully realized
  • Our work (with Preslav Nakov):
  • Structural ambiguity decisions:
  • PP-attachment
  • Noun compound bracketing
  • Coordination grouping
  • Semantic relation acquisition:
  • Hypernym (ISA) relations
  • Verbal relations between nouns
  • SAT analogy problems

21
Structural Ambiguity Problems
  • Apply the U U U idea to structural ambiguity
  • Noun compound bracketing
  • Prepositional Phrase attachment
  • Noun Phrase coordination
  • Motivation: BioText project
  • In eukaryotes, the key to transcriptional
    regulation of the Heat Shock Response is the Heat
    Shock Transcription Factor (HSF).
  • Open-labeled long-term study of the subcutaneous
    sumatriptan efficacy and tolerability in acute
    migraine treatment.
  • BimL protein interact with Bcl-2 or Bcl-XL, or
    Bcl-w proteins (Immuno-precipitation (anti-Bcl-2
    OR Bcl-XL or Bcl-w)) followed by Western blot
    (anti-EEtag) using extracts human 293T cells
    co-transfected with EE-tagged BimL and (bcl-2 or
    bcl-XL or bcl-w) plasmids)

22
Applying U U U to Structural Ambiguity
  • We introduce the use of (nearly) unambiguous features:
  • Surface features
  • Paraphrases
  • Combined with n-grams from very, very large corpora
  • Achieve state-of-the-art results without labeled
    examples.

23
Noun Compound Bracketing
  • (a) liver cell antibody (left
    bracketing)
  • (b) liver cell line (right
    bracketing)
  • In (a), the antibody targets the liver cell.
  • In (b), the cell line is derived from the liver.

24
Dependency Model
  • Right bracketing: w1 (w2 w3)
  • w2 w3 is a compound (modified by w1)
  • home health care
  • or: w1 and w2 independently modify w3
  • adult male rat
  • Left bracketing: (w1 w2) w3
  • only one modificational choice is possible
  • law enforcement officer

25
Related Work
  • Marcus (1980), Pustejovsky et al. (1993), Resnik (1993):
  • adjacency model: Pr(w1, w2) vs. Pr(w2, w3)
  • Lauer (1995):
  • dependency model: Pr(w1, w2) vs. Pr(w1, w3)
  • Keller & Lapata (2004):
  • use the Web
  • unigrams and bigrams
  • Girju et al. (2005):
  • supervised model
  • bracketing in context
  • requires WordNet senses to be given
  • Our approach:
  • Web as data
  • χ², n-grams
  • paraphrases
  • surface features

26
Our U U U Algorithm
  • Compute bigram estimates
  • Compute estimates from surface features
  • Compute estimates from paraphrases
  • Combine these scores with a voting algorithm to choose left or right bracketing (a minimal voting sketch follows below).
  • We use the same general approach for two other structural ambiguity problems.
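A minimal sketch of the combination step, assuming each information source has already produced a vote of "left", "right", or None (abstain); the function name and the tie-breaking default are illustrative assumptions, not the exact rule from the talk:

    from collections import Counter

    def majority_vote(votes, default="right"):
        # Combine per-source bracketing votes; abstentions (None) are ignored.
        tally = Counter(v for v in votes if v in ("left", "right"))
        if tally["left"] == tally["right"]:
            return default                      # fall back to an assumed prior on ties
        return "left" if tally["left"] > tally["right"] else "right"

    # Hypothetical usage: one vote per source (chi-squared, surface features, paraphrases).
    print(majority_vote(["left", "left", None]))   # -> "left"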

27
Computing Bigram Statistics
  • Dependency model, frequencies:
  • Compare #(w1, w2) to #(w1, w3)
  • Dependency model, probabilities:
  • Pr(left) = Pr(w1→w2 | w2) · Pr(w2→w3 | w3)
  • Pr(right) = Pr(w1→w3 | w3) · Pr(w2→w3 | w3)
  • So we compare Pr(w1→w2 | w2) to Pr(w1→w3 | w3)

28
Using n-grams to estimate probabilities
  • Using page hits as a proxy for n-gram counts
  • Pr(w1→w2 | w2) = #(w1, w2) / #(w2)
  • #(w2): word frequency; query for "w2"
  • #(w1, w2): bigram frequency; query for "w1 w2"
  • smoothed by 0.5
  • Use χ² to determine if w1 is associated with w2 (thus indicating left bracketing), and the same for w1 with w3 (a small sketch follows below)
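A small sketch of the estimate, assuming a hypothetical hit_count(query) helper that returns a search engine's page-hit count for an exact-phrase query (any n-gram count source would do); the 0.5 smoothing from the slide is applied to the numerator:

    def hit_count(query):
        # Hypothetical stub: return the number of page hits for an exact-phrase query.
        raise NotImplementedError("plug in a search API or an n-gram corpus here")

    def p_modifies(w_a, w_b):
        # Estimate Pr(w_a -> w_b | w_b): how often w_a modifies w_b when w_b occurs.
        return (hit_count(f'"{w_a} {w_b}"') + 0.5) / max(hit_count(f'"{w_b}"'), 1)

    def bracket(w1, w2, w3):
        # Dependency model: compare Pr(w1->w2 | w2) with Pr(w1->w3 | w3).
        return "left" if p_modifies(w1, w2) >= p_modifies(w1, w3) else "right"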

29
Association Models: χ² (Chi-Squared)
  • A = #(wi, wj)
  • B = #(wi) - #(wi, wj)
  • C = #(wj) - #(wi, wj)
  • D = N - (A + B + C)
  • N = 8 trillion (= A + B + C + D); a short χ² sketch follows below

(8 billion Web pages x 1,000 words per page)
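A short sketch of the χ² score over this 2x2 contingency table, using the standard formula for a 2x2 table with cells A, B, C, D as defined above (the count source, e.g. web hit counts, is assumed):

    def chi_squared(cooc, count_i, count_j, n=8e12):
        # chi^2 for the 2x2 table: cooc = #(wi, wj), count_i = #(wi), count_j = #(wj).
        a = cooc
        b = count_i - cooc
        c = count_j - cooc
        d = n - (a + b + c)
        num = n * (a * d - b * c) ** 2
        den = (a + b) * (a + c) * (b + d) * (c + d)
        return num / den if den else 0.0

    # A large chi^2 for ("law", "enforcement") and a small one for ("law", "officer")
    # would support left bracketing of "law enforcement officer".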
30
Our U U U Algorithm
  • Compute bigram estimates
  • Compute estimates from surface features
  • Compute estimates from paraphrases
  • Combine these scores with a voting algorithm to
    choose left or right bracketing.

31
Web-derived Surface Features
  • Authors often disambiguate noun compounds using
    surface markers, e.g.
  • amino-acid sequence → left
  • brain stem's cell → left
  • brain's stem cell → right
  • The enormous size of the Web makes these frequent
    enough to be useful.

32
Web-derived Surface Features: Dash (Hyphen)
  • Left dash:
  • cell-cycle analysis → left
  • Right dash:
  • donor T-cell → right
  • Double dash:
  • T-cell-depletion → unusable

33
Web-derived Surface Features: Possessive Marker
  • Attached to the first word:
  • brain's stem cell → right
  • Attached to the second word:
  • brain stem's cell → left
  • Combined features:
  • brain's stem-cell → right

34
Web-derived Surface Features: Capitalization
  • anycase - lowercase - uppercase:
  • Plasmodium vivax Malaria → left
  • plasmodium vivax Malaria → left
  • lowercase - uppercase - anycase:
  • brain Stem cell → right
  • brain Stem Cell → right
  • Disable this on:
  • Roman digits
  • Single-letter words, e.g., vitamin D deficiency

35
Web-derived Surface Features: Embedded Slash
  • Left embedded slash:
  • leukemia/lymphoma cell → right

36
Web-derived Surface Features: Parentheses
  • Single-word:
  • growth factor (beta) → left
  • (brain) stem cell → right
  • Two-word:
  • (growth factor) beta → left
  • brain (stem cell) → right

37
Web-derived Surface Features: Comma, Dot, Semicolon
  • Following the first word:
  • home. health care → right
  • adult, male rat → right
  • Following the second word:
  • health care, provider → left
  • lung cancer patients → left

38
Web-derived Surface Features: Dash to External Word
  • External word to the left:
  • mouse-brain stem cell → right
  • External word to the right:
  • tumor necrosis factor-alpha → left

39
Web-derived Surface Features: Problems and Solutions
  • Problem: search engines ignore punctuation in queries
  • "brain-stem cell" does not work as a query
  • Solution:
  • query for "brain stem cell"
  • obtain 1,000 document summaries
  • scan for the features in these summaries (a minimal sketch follows below)
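A minimal sketch of that scan, assuming a hypothetical fetch_snippets(query, n) helper that returns up to n result summaries; only the hyphen features are checked here, as an illustration:

    import re

    def fetch_snippets(query, n=1000):
        # Hypothetical stub: return up to n search-result summaries for the query.
        raise NotImplementedError("plug in a search API here")

    def hyphen_votes(w1, w2, w3):
        # Count left/right bracketing evidence from hyphenation in result snippets.
        query = f'"{w1} {w2} {w3}"'
        e1, e2, e3 = map(re.escape, (w1, w2, w3))
        left_dash = re.compile(rf"\b{e1}-{e2}\s+{e3}\b", re.I)    # e.g. cell-cycle analysis
        right_dash = re.compile(rf"\b{e1}\s+{e2}-{e3}\b", re.I)   # e.g. donor T-cell
        votes = {"left": 0, "right": 0}
        for snippet in fetch_snippets(query):
            votes["left"] += len(left_dash.findall(snippet))
            votes["right"] += len(right_dash.findall(snippet))
        return votes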

40
Other Web-derived Features: Possessive Marker
  • We can also query directly for possessives:
  • Yes, brain stems cell sort of works.
  • Search engines:
  • drop the possessive marker
  • but the s is kept
  • Still, we cannot query for brain stems cell.

41
Other Web-derived Features: Abbreviation
  • After the second word:
  • tumor necrosis (TN) factor → left
  • After the third word:
  • tumor necrosis factor (NF) → right
  • We query for, e.g., "tumor necrosis tn factor"
  • Problems:
  • Roman digits: IV, VI
  • States: CA
  • Short words: me

42
Other Web-derived Features: Concatenation
  • Consider "health care reform":
  • healthcare: 79,500,000
  • carereform: 269
  • healthreform: 812
  • Adjacency model:
  • healthcare vs. carereform
  • Dependency model:
  • healthcare vs. healthreform
  • Triples:
  • "healthcare reform" vs. "health carereform"

43
Other Web-derived Features: Using Google's * Operator
  • Each * allows a one-word wildcard
  • Single star:
  • health care * reform → left
  • health * care reform → right
  • More stars and/or reverse order:
  • care reform * health → right

44
Other Web-derived Features: Reorder
  • Reorders for "health care reform":
  • "care reform health" → right
  • "reform health care" → left

45
Other Web-derived Features: Internal Inflection Variability
  • Vary the inflection of the second word:
  • tyrosine kinase activation
  • tyrosine kinases activation

46
Other Web-derived Features: Switch the First Two Words
  • Predict right, if we can reorder
  • adult male rat as
  • male adult rat

47
Our U U U Algorithm
  • Compute bigram estimates
  • Compute estimates from surface features
  • Compute estimates from paraphrases
  • Combine these scores with a voting algorithm to
    choose left or right bracketing.

48
Paraphrases
  • The semantics of a noun compound is often made overt by a paraphrase (Warren, 1978)
  • Prepositional:
  • stem cells in the brain → right
  • cells from the brain stem → right
  • Verbal:
  • virus causing human immunodeficiency → left
  • Copula:
  • office building that is a skyscraper → right

49
Paraphrases
  • Lauer (1995), Keller & Lapata (2003), Girju et al. (2005) predict NC semantics by choosing the most likely preposition:
  • of, for, in, at, on, from, with, about, (like)
  • This can be problematic when more than one preposition is possible
  • In contrast:
  • we try to predict syntax, not semantics
  • we do not disambiguate, just add up all counts
  • cells in (the) bone marrow → left
  • cells from (the) bone marrow → left

50
Paraphrases
  • Prepositional paraphrases (a small query-generation sketch follows below):
  • We use 150 prepositions
  • Verbal paraphrases:
  • We use: associated with, caused by, contained in, derived from, focusing on, found in, involved in, located at/in, made of, performed by, preventing, related to, and used by/in/for.
  • Copula paraphrases:
  • We use is/was and that/which/who
  • Optional elements:
  • articles: a, an, the
  • quantifiers: some, every, etc.
  • pronouns: this, these, etc.
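A small sketch of generating prepositional paraphrase queries for a compound such as "brain stem cell"; the preposition list is a tiny illustrative subset of the 150 used, the optional articles follow the slide, and inflection is ignored here:

    from itertools import product

    PREPS = ["of", "in", "from", "for", "with"]   # illustrative subset of the 150 prepositions
    DETS = ["", "the ", "a "]                      # optional articles

    def paraphrase_queries(w1, w2, w3):
        # Yield (predicted bracketing, exact-phrase query) pairs whose hit counts vote.
        #   "w3 PREP (DET) w1 w2" -> left   (e.g. "cell from the brain stem")
        #   "w2 w3 PREP (DET) w1" -> right  (e.g. "stem cell in the brain")
        for p, d in product(PREPS, DETS):
            yield ("left",  f'"{w3} {p} {d}{w1} {w2}"')
            yield ("right", f'"{w2} {w3} {p} {d}{w1}"')

    for label, query in list(paraphrase_queries("brain", "stem", "cell"))[:4]:
        print(label, query)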

51
Our U U U Algorithm
  • Compute bigram estimates
  • Compute estimates from surface features
  • Compute estimates from paraphrases
  • Combine these scores with a voting algorithm to
    choose left or right bracketing.

52
Evaluation Datasets
  • Lauer set:
  • 244 noun compounds (NCs)
  • from Grolier's encyclopedia
  • inter-annotator agreement: 81.5%
  • Biomedical set:
  • 430 NCs
  • from MEDLINE
  • inter-annotator agreement: 88% (κ = .606)

53
Evaluation Experiments
  • Exact phrase queries
  • Limited to English
  • Inflections:
  • Lauer set: Carroll's morphological tools
  • Biomedical set: UMLS Specialist Lexicon

54
Co-occurrence Statistics
  • Lauer set
  • Bio set

55
Paraphrase and Surface Features Performance
  • Lauer Set
  • Biomedical Set

56
Individual Surface Features Performance: Bio
57
Individual Surface Features Performance: Bio
58
Results: Lauer
59
Results: Comparing with Others
60
Results: Bio
61
Results for Noun Compound Bracketing
  • Introduced search engine statistics that go beyond the n-gram (applicable to other tasks):
  • surface features
  • paraphrases
  • Obtained new state-of-the-art results on NC bracketing:
  • more robust than Lauer (1995)
  • more accurate than Keller & Lapata (2004)

62
Prepositional Phrase Attachment
  • Problem:
  • (a) Peter spent millions of dollars.  (noun attach)
  • (b) Peter spent time with his family.  (verb attach)
  • Which attachment for the quadruple (v, n1, p, n2)?
  • Results:
  • Much simpler than other algorithms
  • As good as or better than the best unsupervised approaches, and better than some supervised ones

63
Related Work
  • Supervised:
  • (Brill & Resnik, '94): transformation-based learning, WordNet classes, P=82%
  • (Ratnaparkhi et al., '94): maximum entropy, word classes (MI), P=81.6%
  • (Collins & Brooks, '95): back-off, P=84.5%
  • (Stetina & Makoto, '97): decision trees, WordNet, P=88.1%
  • (Toutanova et al., '04): morphology, syntax, WordNet, P=87.5%
  • Unsupervised:
  • (Hindle & Rooth, '93): partially parsed corpus, lexical associations over subsets of (v, n1, p), P=80%, R=80%
  • (Ratnaparkhi, '98): POS-tagged corpus, unambiguous cases for (v, n1, p) and (n1, p, n2), classifier, P=81.9%
  • (Pantel & Lin, '00): collocation database, dependency parser, large corpus (125M words), P=84.3%

Unsup. state-of-the-art
64
PP-attachment: Our Approach
  • Unsupervised
  • (v, n1, p, n2) quadruples, Ratnaparkhi test set
  • Google and MSN Search
  • Exact phrase queries
  • Inflections: WordNet 2.0
  • Adding determiners where appropriate
  • Models:
  • n-gram association models
  • Web-derived surface features
  • paraphrases

65
N-gram models
  • (i) Pr(p | n1) vs. Pr(p | v)
  • (ii) Pr(p, n2 | n1) vs. Pr(p, n2 | v)
  • I eat/v spaghetti/n1 with/p a fork/n2.
  • I eat/v spaghetti/n1 with/p sauce/n2.
  • Pr or # (frequency)
  • smoothing as in (Hindle & Rooth, '93)
  • back-off from (ii) to (i)
  • N-grams are unreliable if n1 or n2 is a pronoun.
  • MSN Search: no rounding of n-gram estimates

66
Web-derived Surface Features
(P = precision %, R = recall %)
  • Example features:
  • open the door / with a key → verb (100.00, 0.13)
  • open the door (with a key) → verb (73.58, 2.44)
  • open the door with a key → verb (68.18, 2.03)
  • open the door , with a key → verb (58.44, 7.09)
  • eat Spaghetti with sauce → noun (100.00, 0.14)
  • eat ? spaghetti with sauce → noun (83.33, 0.55)
  • eat , spaghetti with sauce → noun (65.77, 5.11)
  • eat spaghetti with sauce → noun (64.71, 1.57)
  • Summing achieves high precision, low recall.

(Verb-attachment feature counts are summed, noun-attachment feature counts are summed, and the two sums are compared.)
67
Paraphrases
  • v n1 p n2 →
  • v n2 n1 (noun)
  • v p n2 n1 (verb)
  • p n2 * v n1 (verb)
  • n1 p n2 v (noun)
  • v PRONOUN p n2 (verb)
  • BE n1 p n2 (noun)

68
Evaluation
  • Ratnaparkhi dataset:
  • 3097 test examples, e.g.:
  • prepare dinner for family → V
  • shipped crabs from province → V
  • n1 or n2 is a bare determiner: 149 examples
  • problem for unsupervised methods
  • left chairmanship of the → N
  • is the of kind → N
  • acquire securities for an → N
  • special symbols (%, /, etc.): 230 examples
  • problem for Web queries
  • buy % for 10 → V
  • beat S&P-down from → V
  • is 43%-owned by firm → N

69
Results
For prepositions other than OF (of → noun attachment).
Models in bold are combined in a majority vote.
Simpler, but not significantly different from 84.3% (Pantel & Lin, '00).
70
Noun Phrase Coordination
  • (Modified) real sentence:
  • The Department of Chronic Diseases and Health
    Promotion leads and strengthens global efforts to
    prevent and control chronic diseases or
    disabilities and to promote health and quality of
    life.

71
NC Coordination: Ellipsis
  • Ellipsis:
  • car and truck production
  • means car production and truck production
  • No ellipsis:
  • president and chief executive
  • All-way coordination:
  • Securities and Exchange Commission

72
NC Coordination: Ellipsis
  • Quadruple: (n1, c, n2, h)
  • Penn Treebank annotations:
  • ellipsis:
  • (NP car/NN and/CC truck/NN production/NN)
  • no ellipsis:
  • (NP (NP president/NN) and/CC (NP chief/NN executive/NN))
  • all-way: can be annotated either way
  • This is a problem a parser must deal with.

Collins' parser always predicts ellipsis, but other parsers (e.g., Charniak's) try to solve it.
73
Results: 428 examples from the Penn Treebank
74
Semantic Relation Detection
  • Goal: automatically augment a lexical database
  • Many potential relation types:
  • ISA (hypernymy/hyponymy)
  • Part-Of (meronymy)
  • Idea: find unambiguous contexts which (nearly) always indicate the relation of interest

75
Lexico-Syntactic Patterns
76
Lexico-Syntactic Patterns
77
Adding a New Relation
78
Semantic Relation Detection
  • Lexico-syntactic patterns:
  • Should occur frequently in text
  • Should (nearly) always suggest the relation of
    interest
  • Should be recognizable with little pre-encoded
    knowledge.
  • These patterns have been used extensively by
    other researchers.

79
Semantic Relation Detection
  • What relationship holds between two nouns?
  • olive oil: oil comes from olives
  • machine oil: oil used on machines
  • Assigning the meaning relations between these terms has been seen as a very difficult problem
  • Our solution:
  • Use clever queries against the web to figure out the relations.

80
Queries for Semantic Relations
  • Convert the noun-noun compound into a query of the form:
  • "noun2 that noun1"
  • oil that olive(s)
  • This returns search result snippets containing interesting verbs (a rough extraction sketch follows below).
  • In this case:
  • Come from
  • Be obtained from
  • Be extracted from
  • Made from
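A rough sketch of the extraction step, reusing the hypothetical fetch_snippets helper from the earlier bracketing sketch; it pulls out the words between "that" and the first noun as a candidate relation phrase (a real implementation would POS-tag and lemmatize the snippets):

    import re
    from collections import Counter

    def candidate_relations(noun1, noun2, snippets):
        # Count phrases linking noun2 back to noun1, as in "oil that comes from olives"
        # for the compound "olive oil" (noun1 = olive, noun2 = oil).
        pattern = re.compile(
            rf"\b{re.escape(noun2)} that ((?:\w+ ){{1,4}}?){re.escape(noun1)}s?\b",
            re.IGNORECASE)
        counts = Counter()
        for snippet in snippets:
            for phrase in pattern.findall(snippet):
                counts[phrase.strip().lower()] += 1
        return counts

    # Hypothetical usage for "olive oil":
    # snippets = fetch_snippets('"oil that" olives')
    # print(candidate_relations("olive", "oil", snippets).most_common(5))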

81
Uncovering Semantic Relations
  • More examples:
  • Migraine drug → treat, be used for, reduce, prevent
  • Wrinkle drug → treat, be used for, reduce, smooth
  • Printer tray → hold, come with, be folded, fit under, be inserted into
  • Student protest → be led by, be sponsored by, pit, be, be organized by

82
Application: SAT Analogy Problems
83
Tackling the SAT Analogy Problem
  • First, issue queries to find the relations (features) that hold between each word pair.
  • Compare the features for each answer pair to those of the question pair.
  • Weight the features with term counts and document counts.
  • Compare the weighted feature sets using the Dice coefficient.

84
Queries for SAT Analogy Problem
85
Extract Features from Retrieved Text
  • Verb:
  • The committee includes many members.
  • This is a committee, which includes many members.
  • This is a committee, including many members.
  • Verb + Preposition:
  • The committee consists of many members.
  • Preposition:
  • He is a member of the committee.
  • Coordinating conjunction:
  • the committee and its members

86
Most Frequent Features for committee member
87
TF.IDF Weighting
  • TF.IDF, classic
  • TF.IDF with add-one smoothing (standard forms are sketched below)
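The formulas themselves were images on the original slide; a standard formulation consistent with these labels (the exact smoothed variant used in the talk may differ) is, in LaTeX:

    % Classic TF.IDF weight of feature f for a word pair, with N word pairs in total:
    w(f) = \mathrm{tf}(f) \cdot \log \frac{N}{\mathrm{df}(f)}

    % A common add-one-smoothed variant of the document-frequency term:
    w(f) = \mathrm{tf}(f) \cdot \log \frac{N + 1}{\mathrm{df}(f) + 1}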

88
Similarity Measure: Dice Coefficient
  • Dice coefficient for sets
  • Dice coefficient extended to frequencies (both shown below)
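The slide showed these as images; in standard form (the frequency extension given here is the usual generalization via minimum counts, assumed rather than read off the slide), in LaTeX:

    \mathrm{Dice}(A, B) = \frac{2\,|A \cap B|}{|A| + |B|}

    \mathrm{Dice}(x, y) = \frac{2 \sum_f \min(x_f, y_f)}{\sum_f x_f + \sum_f y_f}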

89
SAT Results: Nouns Only
90
Conclusions
  • The enormous size of the web opens new
    opportunities for text analysis
  • There are many words, but they are more likely to
    appear together in a huge dataset
  • This allows us to do word-specific analysis
  • To counter the labeled-data roadblock, we start
    with unambiguous features that we can find
    naturally.
  • We've applied this to structural and semantic language problems.
  • These are stepping stones towards sophisticated
    language understanding.

91
Conclusions
  • Tapping the potential of very large corpora for unsupervised algorithms
  • Go beyond n-grams:
  • Surface features
  • Paraphrases
  • Results competitive with the best unsupervised algorithms
  • Results can rival supervised algorithms
  • Future work:
  • Unambiguous + Unlimited = Unsupervised
  • How to extend to other problems?

92
Thank you!
  • http://biotext.berkeley.edu
  • Supported in part by NSF DBI-0317510

93
What about Search?
  • Web search currently does not use very much language analysis.
  • Queries are very short (2.1 words on average), so most queries match many pages.
  • Improvements in ranking make use of the massive size of the web:
  • Anchor text (the words on links pointing to pages)
  • Which hits users clicked on (starting to use this)
  • As well as the structure of language:
  • Where query terms occur (title, etc.)
  • How close together query words occur

94
Using n-grams to make predictions
  • Say we are trying to distinguish:
  • (home health) care
  • home (health care)
  • Main idea: compare these co-occurrence probabilities:
  • "home health" vs.
  • "health care"

95
Using n-grams to make predictions
  • Use search engines' page hits as a proxy for n-gram counts
  • compare Pr(w1→w2 | w2) to Pr(w1→w3 | w3)
  • Pr(w1→w2 | w2) = #(w1, w2) / #(w2)
  • #(w2): word frequency; query for "w2"
  • #(w1, w2): bigram frequency; query for "w1 w2"

96
Probabilities: Why? (1)
  • Why should we use
  • (a) Pr(w1→w2 | w2), rather than
  • (b) Pr(w2→w1 | w1)?
  • Keller & Lapata (2004) calculate:
  • AltaVista queries:
  • (a) 70.49%
  • (b) 68.85%
  • British National Corpus:
  • (a) 63.11%
  • (b) 65.57%

97
Probabilities: Why? (2)
  • Why should we use
  • (a) Pr(w1→w2 | w2), rather than
  • (b) Pr(w2→w1 | w1)?
  • Maybe to introduce a bracketing prior.
  • Just like Lauer (1995) did.
  • But otherwise, no reason to prefer either one.
  • Do we need probabilities? (association is OK)
  • Do we need a directed model? (symmetry is OK)

98
Adjacency vs. Dependency (2)
  • Right bracketing: w1 (w2 w3)
  • w2 w3 is a compound (modified by w1), or
  • w1 and w2 independently modify w3
  • Adjacency model:
  • Is w2 w3 a compound?
  • (vs. w1 w2 being a compound)
  • Dependency model:
  • Does w1 modify w3?
  • (vs. w1 modifying w2)

99
Paraphrases: pattern (1)
  • v n1 p n2 → v n2 n1 (noun)
  • Can we turn "n1 p n2" into a noun compound "n2 n1"?
  • meet/v demands/n1 from/p customers/n2 →
  • meet/v the customer/n2 demands/n1
  • Problem: ditransitive verbs like give
  • gave/v an apple/n1 to/p him/n2 →
  • gave/v him/n2 an apple/n1
  • Solution:
  • no determiner before n1
  • determiner before n2 is required
  • the preposition cannot be "to"

100
Paraphrases: pattern (2)
  • v n1 p n2 → v p n2 n1 (verb)
  • If "p n2" is an indirect object of v, then it could be switched with the direct object n1.
  • had/v a program/n1 in/p place/n2 →
  • had/v in/p place/n2 a program/n1

Determiner before n1 is required to prevent n2
n1 from forming a noun compound.
101
Paraphrases: pattern (3)
  • v n1 p n2 → p n2 * v n1 (verb)
  • * indicates a wildcard position (up to three intervening words are allowed)
  • Looks for appositions where the PP has moved in front of the verb, e.g.:
  • I gave/v an apple/n1 to/p him/n2 →
  • to/p him/n2 I gave/v an apple/n1

102
Paraphrases: pattern (4)
  • v n1 p n2 → n1 p n2 v (noun)
  • Looks for appositions where "n1 p n2" has moved in front of v
  • shaken/v confidence/n1 in/p markets/n2 →
  • confidence/n1 in/p markets/n2 shaken/v

103
Paraphrases: pattern (5)
  • v n1 p n2 → v PRONOUN p n2 (verb)
  • n1 is a pronoun → verb (Hindle & Rooth, '93)
  • Pattern (5) substitutes n1 with a dative pronoun (him or her), e.g.:
  • put/v a client/n1 at/p odds/n2 →
  • put/v him at/p odds/n2

104
Paraphrases: pattern (6)
  • v n1 p n2 → BE n1 p n2 (noun)
  • BE is typically used with a noun attachment
  • Pattern (6) substitutes v with a form of "to be" (is or are), e.g.:
  • eat/v spaghetti/n1 with/p sauce/n2 →
  • is spaghetti/n1 with/p sauce/n2

105
Related Work
  • (Resnik, '99): similarity of form and meaning, conceptual association, decision tree, P=80%, R=100%
  • (Rus et al., '02): deterministic, rule-based bracketing in context, P=87.42%, R=71.05%
  • (Chantree et al., '05): distributional similarities from the BNC, Sketch Engine (frequencies, object/modifier, etc.), P=80.3%, R=53.8%

106
N-gram models
  • (n1, c, n2, h)
  • (i) #(n1, h) vs. #(n2, h)
  • (ii) #(n1, h) vs. #(n1, c, n2)

107
Surface Features
(As before, the feature counts favoring each reading are summed and the two sums are compared.)
108
Paraphrases
  • n1 c n2 h →
  • n2 c n1 h (ellipsis)
  • n2 h c n1 (NO ellipsis)
  • n1 h c n2 h (ellipsis)
  • n2 h c n1 h (ellipsis)