Title: Key Cluster Patterns in Shakespeare
1Key Cluster Patterns in Shakespeare
- 2009 Aston Symposium
- 22 May 2009
- Mike Scott
2in pursuit of the
- "cunning'st pattern of excelling nature" (Othello)
3or
- but sound and fury signifying nothing?
4Abstract
- Key words (KWs) in Shakespeare plays have been
shown to belong to certain category-types such as
theme-related KWs and character-related KWs.
- Other KWs, generally the more interesting ones,
seem to be pointers to other patterns indicative
of quite specific features of the language, or of
the status of characters, or of individual
sub-themes.
- It may be that there is a tension between global
KWs and much more localised, "bursty" ones in
this regard.
- The presentation turns attention now to key word
clusters, that is, n-grams which are shown to
occur distinctively in each individual play, or
in the speeches of an individual character. The
diverse types of patterns are what will be
explored here.
- Are n-grams a mere coincidence of relatively
frequent words co-occurring frequently, so that
they are but sound and fury signifying nothing?
5- Alas poor Yorick!
- Double, double toil and trouble
- And thereby hangs a tale
- Friends, Romans, countrymen, lend me your ears
- A blinking idiot
- Beggar'd all description
6yet
- Crystal & Crystal (2002) list only one-word
headwords
7Aims
- take previous key word (KW) analysis of
Shakespeare plays up one level
- by examining KW clusters
8 a proviso
keyness
- no claim to illuminate understanding of the
plays,
- the objective being to understand more about
keyness and key words
clusters
9Clusters
- sequences of consecutive words repeatedly found
in corpora
- Biber's "bundles"
- n-grams
- no guarantee they are "phrases"
- In WordSmith, n is between 2 and 8
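A minimal sketch of what counting clusters involves: every run of n consecutive tokens is counted, for n from 2 to 8, and only the repeated runs are kept. The regular-expression tokeniser and the function names `tokenize` and `clusters` are assumptions made for illustration; this is not how WordSmith itself is implemented.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens,
    keeping word-internal apostrophes (e.g. hang'd)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def clusters(tokens, n):
    """Count every run of n consecutive tokens (an n-gram)."""
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )

text = "Never, never, never, never, never! Look there, look there!"
tokens = tokenize(text)

# n runs from 2 to 8; keep only clusters found at least twice
for n in range(2, 9):
    repeated = {c: f for c, f in clusters(tokens, n).items() if f >= 2}
    print(n, repeated)
```

Nothing here checks that a cluster is a "phrase": any repeated run of words qualifies, including ones that straddle a sentence boundary.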
10Why bother?
- increasing awareness that words don't act alone
but hang about in gangs
- and anyway some inconsistencies, e.g.
- "behind" v. "in front of"
- "France" v. "Saudi Arabia" v. "United Arab
Emirates"
11So how should we think about words?
- When you pick up a word,
- you pick up another two
- or three.
12Keyness
- A word is said to be "key" if
- a) it occurs in the text at least as many
times as the user has specified as a Minimum
Frequency
- b) its frequency in the text when compared
with its frequency in a reference corpus is such
that the statistical probability as computed by
an appropriate procedure is smaller than or equal
to a p value specified by the user.
- (WordSmith manual)
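The definition leaves the "appropriate procedure" open. One procedure often used for this kind of comparison is Dunning's log-likelihood statistic; the formulation below assumes that choice, which the slide itself does not state. The symbols are introduced here for illustration: a is the item's frequency in the text, b its frequency in the reference corpus, c and d the total number of tokens in each.

```latex
E_1 = c\,\frac{a+b}{c+d}, \qquad
E_2 = d\,\frac{a+b}{c+d}, \qquad
G^2 = 2\left( a \ln\frac{a}{E_1} + b \ln\frac{b}{E_2} \right)
```

Under this assumption G^2 is referred to a chi-square distribution with one degree of freedom, so a p value of 0.001, for instance, corresponds to G^2 of about 10.83. A term whose frequency is zero contributes nothing to the sum.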
13KW Clusters
- re-interpreting "word" to include "cluster"
- so the questions are
- How much overlap is there between KWs and KW
clusters?
- What (if anything) do key clusters show that KWs
don't?
14Procedures
- with the 1916 OUP Shakespeare corpus at my site
- build one overall "index" which knows the
positions and neighbours of each word in all 37
plays
- compute 2-word clusters using the index
- build one individual index for each of the plays
- compute 2-word clusters for each play using its
index
15Procedures (cont.)
- repeat previous steps for all lengths of cluster,
2 to 5
- result: 38 indexes
- 37 × 4 = 148 individual play cluster wordlists
- 4 cluster wordlists for the set of 37 plays
(152 cluster wordlists in all)
16single-word list (all the plays)
pure grammar
172-word clusters
I AUX incomplete prepositional phrases
183-word clusters
negatives
194-word clusters
requesting etc., social interactions
205-word clusters
social formulae
21Procedures (cont.)
- compare the 2-cluster wordlists of each play with
the 2-cluster wordlist of all the plays
- repeat for 3-, 4- and 5-word clusters
- 37 × 4 = 148 key cluster lists
22KW settings
- p value 0.001
- minimum frequency 2
- negative KW clusters excluded
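A minimal sketch of how these three settings might be applied when one play's cluster wordlist is compared with the wordlist for all the plays. It assumes a log-likelihood test with one degree of freedom as the keyness procedure, which the slides do not specify; the function names and the toy frequencies are hypothetical and are not results from the talk.

```python
import math

def log_likelihood(a, b, c, d):
    """G2 for an item with frequency a in a text of c tokens
    and frequency b in a reference list of d tokens."""
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    if a:
        g2 += a * math.log(a / e1)
    if b:
        g2 += b * math.log(b / e2)
    return max(2 * g2, 0.0)  # guard against tiny negative rounding errors

def p_value(g2):
    """Upper-tail probability of chi-square with 1 degree of freedom."""
    return math.erfc(math.sqrt(g2 / 2))

def key_items(study, reference, p_max=0.001, min_freq=2):
    """Return positively key items, most key first."""
    c, d = sum(study.values()), sum(reference.values())
    keys = {}
    for item, a in study.items():
        if a < min_freq:
            continue                  # below the minimum frequency
        b = reference.get(item, 0)
        if a / c <= b / d:
            continue                  # negative (or neutral) key: excluded
        g2 = log_likelihood(a, b, c, d)
        if p_value(g2) <= p_max:
            keys[item] = g2
    return dict(sorted(keys.items(), key=lambda kv: -kv[1]))

# Toy frequencies; "other" pads each list out to a realistic total
play = {"the foul fiend": 12, "i pray you": 3, "other": 985}
all_plays = {"the foul fiend": 12, "i pray you": 90, "other": 29898}
print(key_items(play, all_plays))
```

In the toy data only "the foul fiend" survives: "i pray you" is no more frequent in the play than in the plays overall, and "other" is relatively rarer, so both are dropped before the significance test is applied.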
23Key 3-clusters in Lear
24just a title
25repetition!
- When we are born, we cry that we are come
- To this great stage of fools. This' a good block!
- It were a delicate stratagem to shoe
- A troop of horse with felt I'll put it in proof,
- And when I have stol'n upon these sons-in-law,
- Then, kill, kill, kill, kill, kill, kill!
(Lear)
26more repetition!
- And my poor fool is hang'd! No, no, no life!
- Why should a dog, a horse, a rat, have life,
- And thou no breath at all? Thou'lt come no more,
- Never, never, never, never, never!
- Pray you, undo this button thank you, sir.
- Do you see this? Look on her, look, her lips,
- Look there, look there!
- </LEAR>
- <STAGE DIR>
- <Dies.>
- </STAGE DIR>
27Character-specific
- the foul fiend (Edgar)
- Tom's a cold (Edgar)
- i' the middle (Fool)
28theme of the play
- dost thou know?
- thou know me?
29speech-specific, rhythmic
- Have more than thou showest,
- Speak less than thou knowest,
- Lend less than thou owest,
- Ride more than thou goest,
- Learn more than thou trowest,
- Set less than thou throwest
- Leave thy drink and thy whore,
- And keep in-a-door,
- And thou shalt have more
- Than two tens to a score
30RQ 1 (How much overlap is there between KWs and
KW clusters?) Procedure
- For selected plays (Hamlet, Romeo, Henry IV part
1, As You Like It)
- Save the column of single word KWs as a plain
text file
- Save the column of 2-cluster KWs as a separate
file too
- Save the columns of 3-, 4- and 5-cluster KWs
likewise
- Make wordlists of these "texts"
- Compute "detailed consistency" of these wordlists
- Use "Set" function to classify items which appear
in various listings
- Identify the percentage of words which appear in
the KW-cluster lists but not in the single word
KW listings, and vice versa
- Identify items which appear in numerous listings.
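The classification at the end of this procedure amounts to set operations over the word types in each listing. The sketch below uses plain Python sets rather than WordSmith's own "detailed consistency" and "Set" functions, and the listings are invented toy entries, not results from the talk.

```python
from collections import Counter

def types_in(items):
    """Set of word types occurring in a list of KWs or KW clusters."""
    return {word for item in items for word in item.upper().split()}

# Toy listings keyed by cluster length (1 = single-word KWs); hypothetical
listings = {
    1: ["NUNNERY", "HAMLET", "WAGER", "LORD"],
    2: ["MY LORD", "GOOD HAMLET"],
    3: ["TO A NUNNERY"],
}

type_sets = {n: types_in(items) for n, items in listings.items()}
single = type_sets[1]
in_clusters = set().union(*(s for n, s in type_sets.items() if n > 1))

cluster_only = in_clusters - single   # in KW clusters, not single KWs
single_only = single - in_clusters    # in single KWs, not in any cluster
all_types = single | in_clusters

pct_cluster_only = round(100 * len(cluster_only) / len(all_types))

# In how many listings does each type appear?
spread = Counter(word for s in type_sets.values() for word in s)
widespread = sorted(word for word, k in spread.items() if k >= 2)

print(len(all_types), "-", len(single), "=", len(cluster_only))
print(pct_cluster_only, "% of types come only from the clusters")
print("single-word KWs missing from clusters:", sorted(single_only))
print("in two or more listings:", widespread)
```

The first printed line shows the shape of the calculation: all types, minus those in the single-word KW list, leaves the types contributed only by the clusters.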
31Romeo and Juliet
- 43% (207 - 117 = 90) of the KWs come into the
2-, 3-, 4- or 5-word KW clusters but are absent
from the single KW list.
- 2s not found in the single KW list include high
frequency grammar items (THE, MY, AT, TO etc.)
- 2s which are not found elsewhere in any cluster
include SHALL
- 3s not found elsewhere include TELL, WHERE
- 4s not found elsewhere include COMMEND
32types in KW list but not in KW clusters (A-C)
- AH, ALACK, AN, APOTHECARY, BED, BENVOLIO,
CAPULET, CLOUDS, CORDS, CORSE
33Common to 4 or 5 KW listings
- HER, O, SILVER, A, ART, BOTH, JULE, LADY, PLAGUE,
SOUND, THOU, THY, WITH, YOUR
34As You Like It
- 48% (190 - 98 = 92) of the KWs come into the
2-, 3-, 4- or 5-word KW clusters but are absent
from the single KW list.
- 2s not found in the single KW list include high
freq. grammar items (THE, OF, FOR, AND)
- 2s which are not found elsewhere include HIM, WHO
- 3s not found elsewhere include AT, WOULD
35types in KW list but not in KW clusters (A-C)
- ADAM, ALIENA, AMBLES, AUDREY, BEARDS, CELIA,
CHARLES, CLOWN, COUNTERFEITED, COURTIER'S,
COVERED, COZ, CURED
36Henry IV part 1
- 43% (204 - 117 = 87) of the KWs come into the
2-, 3-, 4- or 5-word KW clusters but are absent
from the single KW list.
- 2s not found in the single KW list include high
frequency grammar items (IN, TO, YOU) but also
SIR, TRUE
- 2s which are not found elsewhere include TWO,
FEAR, FIRE, CUDGEL
- 3s not found elsewhere include WELL, WHY, FATHER
- 4s not found elsewhere include GIVE, ARE, DOOR,
LET
37types in KW list but not in KW clusters (A-C)
- AFOOT, BANISH, BARDOLPH, CLIFTON, COMPULSION,
COUNTERFEIT, COWARD
38Hamlet
- 44% (140 - 79 = 61) of the KWs come into the
2-, 3-, 4- or 5-word KW clusters but are absent
from the single KW list.
- 2s not found in the single KW list include high
freq. grammar items (MY, AND, OF) but also GOOD
- 2s which are not found elsewhere include FROM, O,
OUR, IS, IN
- 3s not found elsewhere include HOW, LIFE, EXCEPT,
YOUR, REVENGE, NOT, OWN
39types in KW list but not in KW clusters (A-C)
- ACT, ARGAL, BERNARDO, CLOSES, CUSTOM
40Common to 3 or 4 KW listings
- NUNNERY, A, HAMLET, HAVE, I, IT, LORD, OPHELIA,
THE, TO, WAGER
41RQ 1 How much overlap is there between KWs and
KW clusters?
- More than 50% of the single-word KWs are in the
clusters
- but the clusters add some 40% or more extra words
- not all additions are grammatical
- Key clusters tail off at 4 or 5
42at 4 Kws, which play is this?
midsummer night's dream
all's well that ends well
anthony cleopatra
"bursty" keyness?
43bursts (1)
midsummer night's dream
44bursts (2)
julius caesar
45bursts (3)
macbeth
46bursts of burstiness
as you like it
47compare burstinesses?
king lear 2s (part)
483s and 4s
king lear
49Conclusions
- How much overlap is there between KWs and KW
clusters?
- Only a moderate amount: they highlight different
aspects of the play
- What (if anything) do key clusters show that KWs
don't?
- At the extremes they may highlight songs and very
localised bursts in the play, but by no means
always or only this
50- <SHALLOW>
- It is well said, in faith, sir and it is well
said indeed too. 'Better accommodated!' it is
good yea indeed, is it good phrases are surely
and ever were, very commendable. Accommodated! it
comes of accommodo very good a good phrase.
- </SHALLOW>
- <BARDOLPH>
- Pardon me, sir I have heard the word. 'Phrase,'
call you it? By this good day, I know not the
phrase but I will maintain the word with my
sword to be a soldier-like word, and a word of
exceeding good command, by heaven. Accommodated
that is, when a man is, as they say,
accommodated or, when a man is, being, whereby,
a' may be thought to be accommodated, which is an
excellent thing.
- </BARDOLPH>
51References
- Crystal, David & Ben Crystal, 2002. Shakespeare's
Words. London: Penguin.
52Join us in Liverpool