Title: Collocations
1Collocations
Some typical collocations:
- as long as
- with respect to
- familial high-density lipoprotein deficiency
- point the way
2Collocations
- Collocations are recognised at three levels:
  - Token stream
  - Word level
  - Structure level
- The point of recognition depends on how variable
the collocation is and how ambiguous it might be
3Token Stream
the disease known as familial high-density
lipoprotein deficiency is a result
An unconditional collocation (that is, one that
can't be anything else) is found by dictionary lookup
while the text is being turned into tokens, and
its tokens are combined into a single token:
the disease known as FHLD is a result
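The token-stream step above can be sketched as a longest-match dictionary lookup over the token list. This is only an illustration under stated assumptions: the `UNCONDITIONAL` table, the `merge_collocations` function, and whitespace tokenisation are hypothetical and are not taken from the system the slides describe.

```python
# Hypothetical lexicon of unconditional collocations:
# a tuple of tokens maps to the single token that replaces it.
UNCONDITIONAL = {
    ("familial", "high-density", "lipoprotein", "deficiency"): "FHLD",
}
MAX_LEN = max(len(key) for key in UNCONDITIONAL)


def merge_collocations(tokens):
    """Replace each unconditional collocation with its single-token form."""
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first so longer entries win.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            replacement = UNCONDITIONAL.get(tuple(tokens[i:i + n]))
            if replacement is not None:
                out.append(replacement)
                i += n
                break
        else:
            # No entry starts at this position: keep the token as it is.
            out.append(tokens[i])
            i += 1
    return out


text = ("the disease known as familial high-density "
        "lipoprotein deficiency is a result")
print(" ".join(merge_collocations(text.split())))
# prints: the disease known as FHLD is a result
```

Because the lookup is unconditional, no context is consulted; that is what makes it cheap enough to run while the text is still being tokenised.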
4Word Level
The collocation is seen at the word level, when
structure building has begun but before the PARSE
operators have put out symbols. For example, "as long as"
can be caught by knowing only the preceding symbol.
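A word-level check of this kind might look like the sketch below, which collapses "as long as" into one symbol unless the preceding word suggests a literal length comparison. The slides do not state the actual rule, so the `COMPARISON_CUES` set, the `as_long_as` symbol name, and the function itself are assumptions made purely for illustration, and the rule is deliberately crude.

```python
# Hypothetical cue words: if one of these precedes "as long as", treat the
# phrase as a literal comparison ("twice as long as the pole") and leave it.
# The real condition used by the system is not given in the source.
COMPARISON_CUES = {"twice", "half", "not", "just", "almost", "nearly"}


def collapse_as_long_as(words):
    """Collapse 'as long as' to one symbol, looking only at the word before it."""
    out = []
    i = 0
    while i < len(words):
        is_match = [w.lower() for w in words[i:i + 3]] == ["as", "long", "as"]
        previous = words[i - 1].lower() if i > 0 else None
        if is_match and previous not in COMPARISON_CUES:
            out.append("as_long_as")  # single symbol replaces three words
            i += 3
        else:
            out.append(words[i])
            i += 1
    return out


print(collapse_as_long_as("you may stay as long as you behave".split()))
print(collapse_as_long_as("the rope is twice as long as the pole".split()))
```

The point of the sketch is the cost profile: one neighbouring symbol is inspected and no structure is built, so the word sequence is shortened before parsing proper begins.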
5Word Level
The result of recognising the collocation at this
point is that the parse chain is shrunk by
pruning the structure before parsing becomes
active
6Variable Collocations
- A collocation like "point the way" can appear in
many forms, making it unsuitable for recognition
at the word level:
  - he was pointing the way
  - the sign pointed the way
  - the way forward was pointed
- Other collocations allow variable symbols within
them, so they need to be replaced by a structure.
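The three example forms show why a fixed word sequence is not enough. In the sketch below, a literal search for "point the way" misses all of them, while a loose check on lemmas finds them. The `LEMMAS` table and both functions are hypothetical and cover only these sentences; the lemma check is not the structure-level method itself, only a demonstration of the variability.

```python
# Hypothetical lemma table, covering only the forms used in the examples.
LEMMAS = {"pointing": "point", "pointed": "point", "points": "point"}


def lemma(word):
    """Return the base form of a word, or the lowercased word if unknown."""
    return LEMMAS.get(word.lower(), word.lower())


def fixed_match(sentence):
    """Literal search for the fixed word sequence."""
    return "point the way" in sentence.lower()


def lemma_match(sentence):
    """Loose check: both content words occur somewhere, in any order."""
    lemmas = {lemma(w) for w in sentence.split()}
    return {"point", "way"} <= lemmas


examples = [
    "he was pointing the way",
    "the sign pointed the way",
    "the way forward was pointed",
]
for s in examples:
    print(fixed_match(s), lemma_match(s), s)
```

The loose check also accepts sentences such as "he pointed out that the way was blocked", where the collocation is not present. Ruling those out requires knowing how the words relate to each other, which is what recognition at the structure level provides.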
7Structure Level
This structure recognises a collocation and
turns it into a structure. Detection of
collocations at this level is much more
expensive, but also much more discriminating.
8Which Level
- Unconditional collocations can be caught at the
token level, but there are relatively few of these.
- Some collocations only need a light dusting
of analysis using neighbouring symbols; these are
caught at the word level.
- Some collocations are worth catching, but involve
a lot of structure building to be sure they are
what is meant; these are caught at the structure level.