Slides: 26
Provided by: Har134
1
Concordances, collocations and connotation
  • Barnbrook G (1996) Language and Computers.
    Edinburgh: EUP. Chapters 3, 4, 5
  • Partington A (1998) Patterns and Meanings.
    Amsterdam: John Benjamins. Chapters 1, 2, 4

2
Lexical information in corpora
  • Start looking at the kind of information (about
    individual words) that can be got from corpora
  • Simple frequency information
  • Distribution information
  • Collocation (co-occurrence information)
  • Connotation (semantic prosody)
  • Introduce basic ideas
  • Future topics
  • Statistics
  • Case studies

3
Frequency information
  • Most banal information: counting how many times a
    word (type) appears in a text
  • Most frequent words will be function words, so
    often f counts exclude words listed in a stop
    list
  • Should you count words or lemmas?
  • Should you distinguish alternate meanings of
    ambiguous word forms (if you can)?
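
A minimal sketch of frequency counting with a stop list. The tokenizer, the tiny stop list and the sample sentence are all assumptions made for illustration; a real study would use a fuller stop list and would have to decide the word/lemma question raised above (this sketch counts word forms, not lemmas).

```python
import re
from collections import Counter

# Tiny illustrative stop list (an assumption); real stop lists are much longer.
STOP_WORDS = frozenset({"the", "a", "of", "and", "to", "in", "on"})

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def frequency_counts(text, stop_words=frozenset()):
    """Count how often each word form occurs, skipping stop-list words."""
    return Counter(tok for tok in tokenize(text) if tok not in stop_words)

sample = "The cat sat on the mat and the dog sat on the cat"
print(frequency_counts(sample).most_common(3))              # function words dominate
print(frequency_counts(sample, STOP_WORDS).most_common(3))  # content words only
```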

4
Frequency information
  • Frequency information on its own is not
    particularly interesting
  • Quite useful to compare f of related words
  • eg alternative readings of a given word form
    (already seen in probability calculations in
    tagging)
  • or comparing near synonyms, especially if we can
    take context into account (see later)
  • f of a given word in a given context can be
    indicative, eg pronouns more frequent as subject
    or 1st word of sentence

5
Types and tokens
  • Remember distinction between tokens (words) and
    types (different words)
  • Type count gives a measure of how many DIFFERENT
    words are used
  • Type-token ratio gives a measure of vocabulary
    richness
  • If vocabulary is very varied, TTR will be higher
  • TTR is very sensitive to overall text length, so
    it is not meaningful to compare TTRs for texts of
    different lengths
  • Standardized TTR is the average of the TTR for
    each sequence of n words (typical default n = 1000)
    in a text or corpus
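
The two measures can be sketched as follows. This is an illustration rather than any particular tool's implementation: it assumes the text is already tokenized, uses consecutive non-overlapping chunks, and drops a final chunk shorter than n (tools differ on that last choice).

```python
def type_token_ratio(tokens):
    """Number of distinct types divided by number of tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def standardized_ttr(tokens, n=1000):
    """Mean TTR over consecutive, non-overlapping chunks of n tokens.

    Assumption: a trailing chunk shorter than n is ignored. If the text
    is shorter than n, fall back to the plain TTR.
    """
    chunks = [tokens[i:i + n] for i in range(0, len(tokens) - n + 1, n)]
    if not chunks:
        return type_token_ratio(tokens)
    return sum(type_token_ratio(chunk) for chunk in chunks) / len(chunks)

tokens = "a b a c a b d a".split()
print(type_token_ratio(tokens))       # 4 types / 8 tokens
print(standardized_ttr(tokens, n=4))  # mean of the TTRs of two 4-token chunks
```

Because every chunk has the same length, standardized TTRs can be compared across texts of different sizes, which the plain TTR does not allow.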

6
Vocabulary growth curve
  • Plotting types against tokens for a given text
    shows us how the vocabulary grows (and so how the
    TTR changes) as the text gets longer
  • Typically, the curve starts steeply and then
    flattens sooner or later, reflecting homogeneity
    (or otherwise) of the text
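
The points of such a curve can be computed in one pass over the tokens. A minimal sketch, assuming a pre-tokenized text; the function name and the toy token list are illustrative, and plotting is left to whatever tool is at hand.

```python
def vocabulary_growth(tokens):
    """Return (tokens_so_far, types_so_far) points for a vocabulary growth curve."""
    seen = set()
    points = []
    for position, token in enumerate(tokens, start=1):
        seen.add(token)  # a repeated token leaves the type count unchanged
        points.append((position, len(seen)))
    return points

tokens = "a b a c a b d a".split()
for n_tokens, n_types in vocabulary_growth(tokens):
    print(n_tokens, n_types)
```

Plotting the second value against the first gives the curve; the flatter it becomes, the fewer new words the text is introducing.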

VGC for Macbeth in Basic English. Source:
http://web.missouri.edu/youmansc/vmp/help/Youmans-TypeToken.pdf
7
Vocabulary growth curve
  • Comparative VGC for four texts
  • Simple measure used in some literary studies

(a) Longfellow (b) Hemingway (c) Basic English
(Macbeth) (d) Bible (Genesis 2)
8
Vocabulary in context
  • Concordance, also known as KWIC list (key word
    in context)
  • Allows us to see the (immediate) environment in
    which a word appears
  • Listings can be customised to show what you want
    more clearly, eg
  • sorted according to next or previous word
  • showing more or less context
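
A rough sketch of a KWIC listing with adjustable context and sorting by the next word. It assumes a pre-tokenized text; the function names, the `width` parameter and the sample sentence are invented for illustration.

```python
def kwic(tokens, keyword, width=3):
    """Return (left context, key word, right context) for each hit on keyword."""
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

def sort_by_next_word(lines):
    """Order concordance lines alphabetically by their right-hand context."""
    return sorted(lines, key=lambda line: line[2].lower())

tokens = "it was sheer luck that the sheer drop did not end in sheer folly".split()
for left, key, right in sort_by_next_word(kwic(tokens, "sheer")):
    print(f"{left:>20}  {key}  {right}")
```

Changing `width` shows more or less context; sorting on the left context instead would group lines by the previous word.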

9
Source: Partington A (1998) Patterns and Meanings.
Amsterdam: John Benjamins
10
CIWK search
  • inverted KWIC
  • specify the context and look to see what words
    occur in it
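
One simple way to picture a CIWK search: fix the words on either side of a slot and count what fills it. The one-word context on each side, the function name and the toy sentence are assumptions for illustration; real tools allow richer context patterns.

```python
from collections import Counter

def ciwk(tokens, left, right):
    """Count the words that occur between a fixed left and right context word."""
    fillers = Counter()
    for i in range(1, len(tokens) - 1):
        if tokens[i - 1] == left and tokens[i + 1] == right:
            fillers[tokens[i]] += 1
    return fillers

tokens = "the end of the day and the end of the road at the top of the hill".split()
print(ciwk(tokens, "the", "of").most_common())
```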

11
Collocation
  • Term coined by J R Firth (1957) to characterise
    (part of) his theory of meaning
  • "You shall know a word by the company it keeps"
  • "The occurrence of two or more words within a
    short space of each other in a text" (Sinclair
    1991)
  • "The relationship a lexical item has with items
    that appear with greater than random probability
    in its (textual) context" (Hoey 1991, emphasis
    added)

12
Collocation, text type and style
  • Distinguish between general and more usual
    collocations vs technical and more personal ones
  • eg in a general corpus time collocates with save,
    spend, waste, fritter away,
  • but in a corpus of sports reports time collocates
    with half, full, extra, injury, first, second,
    third,

13
Collocation and idiom
  • Listing collocations will often reveal idioms and
    cliches
  • Important to think of collocation as extending
    beyond neighbouring words (which can be captured
    by simple concordances)

14
Collecting collocations
  • If we are to look beyond neighbouring words, what
    constraints might we impose?
  • Collocation means co-occurrence within some
    defined context
  • possibly a window of n words to left and/or
    right
  • if corpus is tagged/parsed, we can look at
    collocations within structures
  • or we can define the window in terms of
    constituents rather than words
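
The word-window option can be sketched like this: collect every word within n tokens to the left and/or right of the node word. This is an illustrative sketch on an untagged, pre-tokenized text; the function name, parameters and toy sentence are assumptions, and windows over structures or constituents would need a tagged or parsed corpus.

```python
from collections import Counter

def window_collocates(tokens, node, left=3, right=3):
    """Count words occurring within `left`/`right` tokens of each node occurrence."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            counts.update(tokens[max(0, i - left):i])    # left-hand window
            counts.update(tokens[i + 1:i + 1 + right])   # right-hand window
    return counts

tokens = "we waste time and they save time but you spend time well".split()
print(window_collocates(tokens, "time", left=1, right=0))
print(window_collocates(tokens, "time", left=2, right=1))
```

Setting `left` or `right` to 0 restricts the search to one side of the node.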

15
Measuring significance
  • The significance of any co-occurrences needs to be
    established
  • Raw co-occurrence frequency counts mean nothing
  • Need to be compared to something else
  • Need to compare a given co-occurrence with random
    chance, or with some other co-occurrence
  • More detail next time
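
As a rough illustration of comparing a co-occurrence with random chance (not necessarily the measures this course goes on to use): if words were scattered independently, a collocate with corpus frequency f(c) would be expected to turn up about f(node) × span × f(c) / N times inside the windows around the node, where N is the corpus size and span is the window size in tokens. The function names and all figures below are made up for the example.

```python
def expected_cooccurrences(f_node, f_collocate, corpus_size, span):
    """Co-occurrences expected by chance if words were distributed independently."""
    return f_node * span * f_collocate / corpus_size

def observed_expected_ratio(observed, f_node, f_collocate, corpus_size, span):
    """How many times more often the pair co-occurs than chance would predict."""
    return observed / expected_cooccurrences(f_node, f_collocate, corpus_size, span)

# Invented figures: node occurs 100 times, collocate 50 times, in a corpus of
# 1,000,000 tokens; 10 co-occurrences were observed within a 4-token window.
expected = expected_cooccurrences(100, 50, 1_000_000, 4)
ratio = observed_expected_ratio(10, 100, 50, 1_000_000, 4)
print(expected, ratio)
```

A ratio well above 1 suggests the pair occurs together more often than chance alone would explain; a proper significance test needs more than this ratio.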

16
Collocation and synonymy
  • Collocation is good evidence in discussing (near)
    synonymy
  • Lots of studies take near synonyms and look to
    see if the nature of their relationship can be
    characterised by their distribution
  • In other words, what words does each member of
    the synonym set collocate with?
  • Especially useful for language learners
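
A toy sketch of the comparison: gather the words that follow each near synonym and see which are shared and which are exclusive to one of them. The token list is invented for illustration and is not corpus data; the function names are assumptions, and only the immediately following word is considered.

```python
from collections import Counter

def right_collocates(tokens, node):
    """Count the words that immediately follow each occurrence of node."""
    return Counter(tokens[i + 1] for i in range(len(tokens) - 1) if tokens[i] == node)

def compare_collocates(tokens, word_a, word_b):
    """Split the two words' collocates into shared and exclusive sets."""
    a = set(right_collocates(tokens, word_a))
    b = set(right_collocates(tokens, word_b))
    return {"shared": a & b, word_a: a - b, word_b: b - a}

# Invented toy data, not taken from any corpus.
tokens = "sheer luck sheer weight sheer joy pure gold pure joy pure chance".split()
print(compare_collocates(tokens, "sheer", "pure"))
```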

17
Example of sheer and synonyms
  • (from Partington book)
  • three senses (LDOCE)
  • pure, nothing but, eg sheer luck
  • steep, sheer drop
  • thin, sheer stockings
  • (Cobuild) use sheer to emphasize completeness of
    state
  • 92 occurrences of sheer (in meaning 1) in his
    corpus

18
collocations of sheer
  • expression of magnitude of weight or volume to
    right (20)
  • volume, weight, numbers, mass, scale, quantity,
    size
  • almost always with article the
  • expression of force, strength or energy (22)
  • energy, exertion, force, muscle, strength, power,
    pressure, fury, pace, intensity
  • usually with the, or a preposition but no article
  • expression of persistence (14)
  • persistence, irreversibility, obstinacy,
    indomitability, insistence, reliability,
    integrity, hard work
  • left context: through, because of, out of,
    expressing causation, but not the

19
collocations of sheer
  • nouns expressing strong emotion (11)
  • fun, joy, panic, inspiration, enjoyment, terror
  • nouns expressing extreme personal qualities (11)
  • beauty, glamour, brutality, thuggery, madness,
    folly
  • nouns expressing extreme ability or lack of same
    (8)
  • expertise, competence, virtuosity, gamesmanship

20
Synonyms of sheer - pure
  • LDOCE definitions, 5 meanings of which two
    overlap
  • not mixed with anything
  • complete, thorough
  • Corpus has 135 examples
  • Larger variety of syntactic environments (sheer
    was always modifying a noun) including
    predicative, which sheer does not occur in
  • ? The drop was sheer
  • His fury was sheer

21
Synonyms of sheer - pure
  • Religious-moral context sense of unmixed
  • doctrine, faith, goodness; chemicals, gold
  • But many examples where it has an emphasizing
    function, like sheer
  • accident, chance, comedy, guesswork, honesty,
    idiocy, malice, nostalgia, pleasure, selfishness,
    talent, theatre, vulnerability, whim, wickedness
  • often with proper nouns (unlike sheer)
  • No examples of pure collocating with items
    expressing magnitude, force or persistence
  • Some overlap with sheer
  • personal qualities, emotion (though generally
    less extreme ones)
  • Only a few examples of pure in a prepositional
    phrase expressing causation: causes can be sheer,
    but states are pure

22
Other synonyms of sheer
  • Partington does similar analysis of complete and
    absolute
  • Shows that each of the synonyms has more
    typical uses and patterns, though there is some
    overlap
  • But there is also clear evidence of complementary
    usage

23
Connotation and semantic prosody
  • Collocation can also be used to illustrate
    connotation
  • secondary implications of a word (Lyons 1977)
  • Three distinct uses of the term
  • marker of a particular speech variety (eg lovely)
  • cultural implications (words used to describe
    women show what society thinks of them)
  • marker of speaker's evaluation (firm vs stubborn)
  • Semantic prosody (Sinclair 1987)
  • use of a certain word spreads its connotation
    over the whole utterance

24
Some examples
  • object of commit is often something bad (foul,
    deception, offence)
  • if something is described as rife, it is not good
    (crime, disease, mistakes), and describing it as
    rife expresses a negative connotation
    (speculation is rife)
  • both the above exemplify unfavourable prosody,
    but other prosodies are possible
  • good example: claim vs admit responsibility for an
    atrocity

25
More power to your elbow
  • Examples given in last few slides were largely
    subjective
  • More interesting if we can back up observations
    with calculations of statistical significance
  • Next time we will look at some simple statistical
    measures