Title: MA in English Linguistics: Experimental design and statistics
1 MA in English Linguistics: Experimental design and statistics
Sean Wallis, Survey of English Usage, University College London
s.wallis_at_ucl.ac.uk
2 Outline
- What is a research question?
- Choice and baselines
- Making sense of probability
- Observing change in a corpus
- Drawing inferences to larger populations
- Estimating error in observations
- Testing results for significance
3 What is a research question?
- You may have heard this phrase last term
- What do you think we mean by a research question?
- Can you think of any examples?
9 Examples
- Some example research questions
  - smoking is good for you
  - dropped objects accelerate toward the ground at 9.8 metres per second squared
  - 's is a clitic rather than a word
  - the word shall is used less often in recent years
  - the degree of preference for shall rather than will has declined in British English over the period 1960s-1990s
10 Testable hypotheses
- A hypothesis = a testable research question
- Compare
  - the word shall is used less in recent years
- to
  - the degree of preference for shall rather than will has declined in British English over the period 1960s-1990s
- How could you test these hypotheses?
11 Questions of choice
- Suppose we wanted to test the following hypothesis using DCPSE
  - the word shall is used less in recent years
- When we say the word shall is used less...
  - ...less compared to what?
- Traditionally corpus linguists have normalised data as a proportion of words (so we might say shall is used less frequently per million words)
- But what might this mean?
15 Questions of choice
- From the speaker's perspective
  - The probability of a speaker using a word like shall depends on whether they had the opportunity to say it in the first place
  - They were about to say will, but said shall instead
- Per million words might still be relevant from the hearer's perspective
- If we can identify all points where the choice arose, we have an ideal baseline for studying linguistic choices made by speakers/writers.
  - Can all cases of will be replaced by shall?
  - What about second or third person shall?
16 Baselines
- The baseline is a central element of the hypothesis
  - Changes are always relative to something
- You can get different results with different baselines
  - Different baselines imply different conclusions
- We have seen two different kinds of baselines
  - A word baseline
    - shall per million words
  - A choice baseline (an alternation experiment)
    - shall as a proportion of the choice shall vs. will (including 'll), when the choice arises
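The contrast between the two baselines can be sketched in a few lines of Python. This is only an illustration: the function names and all counts are made up for the example and are not DCPSE figures. The counts are chosen so that shall rises per million words while falling as a proportion of shall + will.

```python
def per_million_words(hits: int, corpus_words: int) -> float:
    """Word baseline: frequency of a form per million words of text."""
    return hits / corpus_words * 1_000_000


def choice_proportion(hits: int, alternative_hits: int) -> float:
    """Choice baseline: share of one form among the cases where the choice arose."""
    return hits / (hits + alternative_hits)


# Hypothetical counts for two periods (NOT real DCPSE figures).
early = {"shall": 120, "will": 180, "words": 400_000}
late = {"shall": 90, "will": 210, "words": 200_000}

for name, period in (("early", early), ("late", late)):
    pmw = per_million_words(period["shall"], period["words"])
    p = choice_proportion(period["shall"], period["will"])
    print(f"{name}: {pmw:.0f} per million words, p(shall) = {p:.2f}")
```

With these invented numbers the word baseline suggests an increase and the choice baseline a decrease, which is why the baseline has to be stated as part of the hypothesis.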
20 Baselines
- In many cases it is very difficult to identify all cases where the choice arises
  - e.g. studying modal verbs
- You may need to pick a different baseline
  - Be as specific as you can
  - words → VPs → tensed VPs → alternating modals
- Other hypotheses imply different baselines
  - Different meanings of the same word
  - e.g. uses of very, as a proportion of all cases of very
    - very N: the very person
    - very ADJ: the very tall person
    - very ADV: very slightly moving
- alternation = different words, same meaning
- semasiological variation (different meanings of the same word)
22 Probability
- We are used to concepts like these being expressed as numbers
  - length (distance, height)
  - area
  - volume
  - temperature
  - wealth (income, assets)
- We are going to discuss another concept
  - probability (proportion, percentage)
29 Probability
- Based on another, even simpler, idea
  - probability p = x / n
  - e.g. the probability that the speaker says will instead of shall
- where
  - frequency x (often, f)
    - the number of times something actually happens
    - the number of hits in a search
  - baseline n is
    - the number of times something could happen
    - the number of hits
      - in a more general search
      - in several alternative patterns (alternate forms)
- Probability can range from 0 to 1
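As a minimal sketch of the definition above, p = x / n can be computed directly from two search counts. The function name and the counts are hypothetical and only illustrate the arithmetic.

```python
def proportion(x: int, n: int) -> float:
    """Return p = x / n: hits x as a proportion of the baseline n."""
    if n <= 0:
        raise ValueError("baseline n must be positive")
    if not 0 <= x <= n:
        raise ValueError("x must lie between 0 and n")
    return x / n


# Hypothetical hit counts (illustration only, not corpus results).
will_hits = 70
shall_hits = 30

# The baseline n is the number of times the choice arose: will + shall.
p_will = proportion(will_hits, will_hits + shall_hits)
print(f"p(will) = {p_will:.2f}")
```

The range check mirrors the point that a probability must lie between 0 and 1: x can never exceed the baseline n.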
30 A simple research question
- What happens to modal shall vs. will over time in British English?
  - Does shall increase or decrease?
- What do you think?
- How might we find out?
32 Let's get some data
- Open DCPSE with ICECUP
  - FTF query for first person declarative shall
  - repeat for will
- Corpus Map
  - DATE
- Do the first set of queries and then drop into Corpus Map
34 Modal shall vs. will over time
- Plotting probability of speaker selecting modal shall out of shall/will over time (DCPSE)
- Is shall going up or down?
[Figure: p(shall | {shall, will}) plotted against year, 1955-1995; vertical axis from 0.0 (shall = 0%) to 1.0 (shall = 100%). Source: Aarts et al., 2013]
37 Is shall going up or down?
- Whenever we look at change, we must ask ourselves two things
- What is the change relative to?
  - What is our baseline for comparison?
  - In this case we ask
    - Does shall decrease relative to shall + will?
- How confident are we in our results?
  - Is the change big enough to be reproducible?
41 The sample and the population
- The corpus is a sample
- If we ask questions about the proportions of certain words in the corpus
  - We ask questions about the sample
  - Answers are statements of fact
- Now we are asking about British English
- We want to draw an inference
  - from the sample (in this case, DCPSE)
  - to the population (similarly-sampled BrE utterances)
- This inference is a best guess
- This process is called inferential statistics
43 Basic inferential statistics
- Suppose we carry out an experiment
  - We toss a coin 10 times and get 5 heads
  - How confident are we in the results?
- Suppose we repeat the experiment
  - Will we get the same result again?
- Let's try
  - You should have one coin
  - Toss it 10 times
  - Write down how many heads you get
  - Do you all get the same results?
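If no coin is to hand, the classroom exercise can be imitated in software. The sketch below assumes a fair coin (probability of heads 0.5) and eight participants. The helper name, the seed and the number of participants are arbitrary choices for the illustration.

```python
import random


def count_heads(tosses: int, rng: random.Random) -> int:
    """Toss a fair coin `tosses` times and return the number of heads."""
    return sum(rng.random() < 0.5 for _ in range(tosses))


rng = random.Random(1)  # fixed seed so the run can be repeated; the value is arbitrary
results = [count_heads(10, rng) for _ in range(8)]  # eight people, ten tosses each
print(results)
```

Each entry is one person's count of heads. Changing the seed stands in for repeating the whole exercise on another day.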
44 The Binomial distribution
- Repeated sampling tends to form a Binomial distribution around the expected mean X
- We toss a coin 10 times, and get 5 heads
[Figure: frequency F of samples against number of heads x, with the expected mean X marked; N = 1 sample]
45 The Binomial distribution
- Repeated sampling tends to form a Binomial distribution around the expected mean X
- Due to chance, some samples will have a higher or lower score
[Figure: the same plot of frequency F against x, built up over slides 45-50 for N = 4, 8, 12, 16, 20 and 26 samples]
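The shape that the repeated samples build towards can be computed exactly. The following sketch evaluates the Binomial probability of each possible number of heads in 10 tosses, assuming a fair coin (P = 0.5). The function name is chosen for the illustration.

```python
from math import comb


def binomial_pmf(x: int, n: int, P: float = 0.5) -> float:
    """Probability of exactly x heads in n tosses when each toss has probability P."""
    return comb(n, x) * P**x * (1 - P) ** (n - x)


n = 10
distribution = [binomial_pmf(x, n) for x in range(n + 1)]

# Print a crude text histogram: one '#' per percentage point.
for x, prob in enumerate(distribution):
    print(f"{x:2d} heads: {prob:.4f} " + "#" * round(prob * 100))
```

The peak falls at the expected mean of 5 heads, yet 5 heads is the outcome in only about a quarter of samples, so neighbouring scores are common.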
51 The Binomial distribution
- It is helpful to express x as the probability of choosing a head, p, with expected mean P
  - p = x / n
  - n = max. number of possible heads (10)
- Probabilities are in the range 0 to 1
  - percentages (0 to 100%)
[Figure: frequency F against p, with the expected mean P marked]
53 The Binomial distribution
- Take-home point
  - A single observation, say x hits (or p as a proportion of n possible hits) in the corpus, is not guaranteed to be correct in the world!
- Estimating the confidence you have in your results is essential
- We want to make predictions about future runs of the same experiment
[Figure: frequency F against p, showing an observation p and the expected mean P]
54 Binomial → Normal
- The Binomial (discrete) distribution is close to the Normal (continuous) distribution
[Figure: frequency F against x]
55 Binomial → Normal
- Any Normal distribution can be defined by only two variables and the Normal function z
  - population mean P
  - standard deviation $S = \sqrt{P(1 - P) / n}$
- With more data in the experiment, S will be smaller
[Figure: Normal curve of frequency F against p (axis ticks 0.1, 0.3, 0.5, 0.7), with the distance z·S marked either side of P]
56 Binomial → Normal
- Any Normal distribution can be defined by only two variables and the Normal function z
  - population mean P
  - standard deviation $S = \sqrt{P(1 - P) / n}$
- 95% of the curve is within 2 standard deviations of the expected mean
  - the correct figure is 1.95996!
  - the critical value of z for an error level of 0.05.
[Figure: Normal curve of frequency F against p (axis ticks 0.1, 0.3, 0.5, 0.7), with 95% of the area within z·S of P and 2.5% in each tail]
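Both quantities on this slide can be checked numerically. The sketch below uses Python's standard library to obtain the two-tailed critical value of z and to compute S for the coin example (P = 0.5, n = 10). The function names are chosen for the illustration, and n = 100 is added only to show S shrinking with more data.

```python
from math import sqrt
from statistics import NormalDist


def binomial_sd(P: float, n: int) -> float:
    """Standard deviation S = sqrt(P(1 - P) / n) of the Normal approximation."""
    return sqrt(P * (1 - P) / n)


def critical_z(alpha: float = 0.05) -> float:
    """Two-tailed critical value of z for error level alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)


z = critical_z(0.05)
print(f"z = {z:.5f}")  # about 1.95996, the figure quoted on the slide

for n in (10, 100):
    S = binomial_sd(0.5, n)
    print(f"n = {n}: S = {S:.4f}, interval = {0.5 - z * S:.4f} to {0.5 + z * S:.4f}")
```

The loop makes the earlier point concrete: multiplying n by 10 divides S, and hence the width of the interval, by the square root of 10.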
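57 The single-sample z test...
- Is an observation p more than z standard deviations from the expected (population) mean P?
  - If yes, p is significantly different from P
[Figure: Normal curve of frequency F against p (axis ticks 0.1, 0.3, 0.5, 0.7) about the expected mean P, with labels z·S and 0.25 on each side of P and an observation p marked]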
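A minimal sketch of this test follows, assuming the Normal approximation described on the previous slides. The function name and the example observations are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def single_sample_z_test(p: float, P: float, n: int, alpha: float = 0.05):
    """Return (z score of p, True if p is significantly different from P)."""
    S = sqrt(P * (1 - P) / n)                    # standard deviation about P
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.95996 for alpha = 0.05
    z_obs = (p - P) / S                          # distance of p from P in units of S
    return z_obs, abs(z_obs) > z_crit


# Hypothetical observations: 2 heads and then 1 head out of 10 tosses of a fair coin.
for heads in (2, 1):
    z_obs, significant = single_sample_z_test(heads / 10, P=0.5, n=10)
    print(f"{heads} heads: z = {z_obs:.2f}, significant = {significant}")
```

With n as small as 10 the Normal curve is only a rough stand-in for the Binomial, so the result for such small samples should be read as an illustration of the logic rather than a precise test.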
58 ...gives us a confidence interval
- $P \pm z \cdot S$ is the confidence interval for P
- We want to plot the interval about p
[Figure: Normal curve of frequency F against p (axis ticks 0.1, 0.3, 0.5, 0.7) about P, with labels z·S and 0.25 on each side of P]
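60 ...gives us a confidence interval
- The interval about p is called the Wilson score interval
- This interval reflects the Normal interval about P
  - If P is at the upper limit of p, p is at the lower limit of P
[Figure: frequency F against p (axis ticks 0.1, 0.3, 0.5, 0.7), showing an observation p with Wilson interval bounds w− and w+ alongside the Normal interval about P (labels 0.25 on each side). Source: Wallis, 2013]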
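The Wilson score interval has a closed form (Wilson, 1927), so it can be computed without a search. The sketch below is one way to write it. The function name and the example values are chosen for illustration, and no continuity correction is applied.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(p: float, n: int, alpha: float = 0.05):
    """Return (w_minus, w_plus), the Wilson score interval about an observed p."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = p + z * z / (2 * n)
    spread = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    denominator = 1 + z * z / n
    return (centre - spread) / denominator, (centre + spread) / denominator


# Example values only: p = 0.5 observed from n = 10, and a skewed p = 0 from n = 10.
for p in (0.5, 0.0):
    w_minus, w_plus = wilson_interval(p, 10)
    print(f"p = {p}: ({w_minus:.4f}, {w_plus:.4f})")
```

Unlike p ± z·S, the interval never extends below 0 or above 1, and it becomes asymmetric when p is close to either end of the range.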
62 Modal shall vs. will over time
- Simple test
  - Compare p for
    - all LLC texts in DCPSE (1956-77) with
    - all ICE-GB texts (early 1990s)
- We get the following data
- We may plot the probability of shall being selected, with Wilson intervals
- May be input in a 2 × 2 chi-square test
  - or you can check Wilson intervals
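For readers who want to see what the 2 × 2 chi-square test does with such a table, here is a minimal sketch. The four counts are invented for the example and are NOT the LLC or ICE-GB figures. The function name is hypothetical, and the version shown applies no continuity correction.

```python
from statistics import NormalDist


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Chi-square for the table [[a, b], [c, d]], e.g. rows = samples, columns = shall, will."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Hypothetical counts: (shall, will) in an earlier and a later sample.
earlier = (60, 40)
later = (40, 60)

chi2 = chi_square_2x2(*earlier, *later)
# With one degree of freedom the critical value is z squared (about 3.84 at 0.05).
critical = NormalDist().inv_cdf(0.975) ** 2
print(f"chi-square = {chi2:.2f}, critical value = {critical:.2f}, significant = {chi2 > critical}")
```

The critical value is the square of the z figure used earlier, which is one way to see that the 2 × 2 test and the z-based intervals rest on the same Normal approximation.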
65 Modal shall vs. will over time
- Plotting modal shall/will over time (DCPSE)
  - Small amounts of data / year
- Confidence intervals identify the degree of certainty in our results
- Highly skewed p in some cases
  - p = 0 or 1 (circled)
[Figure: p(shall | {shall, will}) against year, 1955-1995, vertical axis 0.0 to 1.0, with confidence intervals]
66 Modal shall vs. will over time
- Plotting modal shall/will over time (DCPSE)
  - Small amounts of data / year
- Confidence intervals identify the degree of certainty in our results
- We can now estimate an approximate downwards curve (Aarts et al., 2013)
67 Recap
- Whenever we look at change, we must ask ourselves two things
- What is the change relative to?
  - Is our observation higher or lower than we might expect?
  - In this case we ask
    - Does shall decrease relative to shall + will?
- How confident are we in our results?
  - Is the change big enough to be reproducible?
68 Conclusions
- An observation is not the actual value
  - Repeating the experiment might get different results
- The basic idea of inferential statistics is
  - Predict range of future results if experiment was repeated
  - Significant = effect > 0 (e.g. 19 times out of 20)
- Based on the Binomial distribution
  - Approximated by Normal distribution: many uses
- Plotting confidence intervals
- Use goodness of fit or single-sample z tests to compare an observation with an expected baseline
- Use 2 × 2 tests or independent-sample z tests to compare two observed samples
69 References
- Aarts, B., Close, J., and Wallis, S.A. 2013. Choices over time: methodological issues in investigating current change. Chapter 2 in Aarts, B., Close, J., Leech, G., and Wallis, S.A. (eds.) The Verb Phrase in English. Cambridge University Press.
- Wallis, S.A. 2013. Binomial confidence intervals and contingency tests. Journal of Quantitative Linguistics 20:3, 178-208.
- Wilson, E.B. 1927. Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association 22: 209-212.
- NOTE: Statistics papers, more explanation, spreadsheets etc. are published on the corp.ling.stats blog: http://corplingstats.wordpress.com