Title: Statistical Inference, Hypothesis testing and Estimation
1. BITS & PIECES FROM PREVIOUS LECTURES
2. T-TESTS
T-tests are used to compare the mean of the same
CONTINUOUS variable in two groups (more than two groups call for analysis of variance).
3. CHI-SQUARED TESTS
Chi-squared tests are used to check whether or
not two CATEGORICAL variables are associated.
If they are not associated, they are
independent.
As we are working with categorical variables the
data for the two variables is summarised in a
contingency table (a crosstabulation in SPSS).
Under the assumption of independence (no
association between the 2 variables) the expected
frequencies for each cell are easily calculated.
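As a minimal sketch of that calculation: under independence, each expected frequency is (row total x column total) / grand total. The counts below are hypothetical, chosen only for illustration, and `expected_frequencies` is an illustrative helper name, not an SPSS function.

```python
def expected_frequencies(table):
    """Expected cell counts under independence for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Each cell: (its row total * its column total) / grand total
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Hypothetical 2 x 2 table (rows: group 1, group 2; columns: yes, no)
observed = [[30, 20],
            [10, 40]]
expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

The row and column totals of the expected table always match those of the observed table, which is a quick check on the arithmetic.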
4. Chi-squared test for association
H0: there is no association between the 2 categorical
variables
- Continuity correction
  - for 2 x 2 tables
- Fisher's exact test
  - if more than 20% of expected values are less
    than 5, or any single expected value is 1 or less
    (calculated for 2 x 2 tables only)
- Pearson
  - for tables that have more than 2 rows or columns
- Mantel-Haenszel test for trend (chi-squared test
  for trend)
  - when one of the variables is ordinal
5. Independent Groups: Example in Lecture 5
H0: the variables smoking status and gender are
independent
P = 0.376 (continuity-corrected chi-squared test,
SPSS)
DO NOT REJECT H0
6. Related Groups 1: Example in Lecture 5
H0: endometrial ablation has no effect on the
symptoms of women.
Pairs of reports of symptoms of discomfort for
each woman: one before the operation (pre-op) and
the other after the operation (post-op)
P = 0.291 (McNemar test, SPSS)
DO NOT REJECT H0
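McNemar's test uses only the discordant pairs, i.e. the women whose report changed between pre-op and post-op. The sketch below shows the exact (binomial) form of the test. The discordant counts are hypothetical, not the Lecture 5 data, and `mcnemar_exact_p` is an illustrative name; SPSS may use a different variant of the test, so the result need not match its output.

```python
from math import comb

def mcnemar_exact_p(b, c):
    """Two-sided exact (binomial) McNemar P-value from the two discordant counts."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical discordant pairs: 3 women changed one way, 7 the other
p_value = mcnemar_exact_p(3, 7)
print(round(p_value, 3))
```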
7. Outbreak of influenza A (H3N2) in a
highly-vaccinated religious community: a
retrospective cohort study. Nicholls S et al.
Communicable Disease and Public Health, Dec 2004;
7(4): 272-277
The rate of influenza was significantly
associated with the age of the subject (P = 0.04).
The test used is the chi-squared test for trend:
the percentage of each age group
with the disease decreases with increasing age.
Chi-squared test for linear trend: $\chi^2 = 4.0$ (df =
1), P = 0.04
8.
- State the research question of interest,
- Summarise the information in the abstract in an
  appropriate contingency table,
- (Calculate expected frequencies for each cell),
- Write down the null and alternative hypotheses
  which you think are associated with the given
  P-value.
- What are the conclusions?
9. Communicable Disease and Public Health (Dec
2004)
Research question: within US drug
users, does area of residence affect take-up of
Hepatitis B vaccination (HBV)?
Contingency table with respective proportions
(by row) and expected frequencies.
Red text shows expected count
10.
H0: there is no association between area and HBV
H1: there is an association between area and HBV
$\chi^2$ test statistic = 17.754 (calculated in
SPSS)
$$\chi^2 = \frac{(|140 - 116.9| - 0.5)^2}{116.9} + \dots = \frac{(22.6)^2}{116.9} + \dots \approx 4.38 + 5.11 + 3.81 + 4.45 = 17.75$$
REJECT H0
P value = $P(\chi^2 \ge 17.754) < 0.01$, and hence $< 0.05$
There is a significant association between area and
the take-up of Hepatitis B vaccine
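The hand calculation can be checked with a short sketch. It uses only the two figures that survive on the slide (observed 140, expected 116.9) and the reported statistic 17.754. The function names are illustrative, and treating the statistic as chi-squared with 1 degree of freedom assumes a 2 x 2 table, which the continuity correction suggests.

```python
from math import erfc, sqrt

def yates_contribution(observed, expected):
    """One cell's contribution to the continuity-corrected chi-squared statistic."""
    return (abs(observed - expected) - 0.5) ** 2 / expected

def chi2_p_value_df1(statistic):
    """Upper-tail P-value for a chi-squared statistic with 1 degree of freedom."""
    # A chi-squared variable with 1 df is the square of a standard Normal variable
    return erfc(sqrt(statistic / 2))

first_cell = yates_contribution(140, 116.9)
# Comes out near 4.37; the slide's 4.38 presumably uses the unrounded expected count
print(round(first_cell, 2))

p_value = chi2_p_value_df1(17.754)
print(p_value < 0.01)
```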
11.
Difference in proportion taking up HBV is 0.199,
or 20% higher take-up of HBV in the Bronx.
95% CI for $(p_1 - p_2)$, where $p_1$ and $p_2$ are the proportions:
$$(p_1 - p_2) \pm 1.96 \times SE(p_1 - p_2) = (0.110, 0.288)$$
This 95% CI excludes 0: 11% to
29% higher take-up of HBV in the Bronx
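A minimal sketch of this interval follows. The standard error formula on the slide did not survive, so the usual large-sample formula, sqrt(p1(1 - p1)/n1 + p2(1 - p2)/n2), is assumed here. The counts are hypothetical, not the study's, and `diff_proportions_ci` is an illustrative name.

```python
from math import sqrt

def diff_proportions_ci(x1, n1, x2, n2, z=1.96):
    """Difference p1 - p2 with an approximate large-sample confidence interval."""
    p1, p2 = x1 / n1, x2 / n2
    # Assumed standard error of the difference between two independent proportions
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff, (diff - z * se, diff + z * se)

# Hypothetical counts: 60 of 100 vaccinated in one area, 40 of 100 in the other
diff, (lower, upper) = diff_proportions_ci(60, 100, 40, 100)
print(round(diff, 3), round(lower, 3), round(upper, 3))
```

If the interval excludes 0, as it does for the slide's (0.110, 0.288), the difference is significant at the 5% level.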
12. Non-parametric methods
13. Six main tests
14. Non-parametric methods
- Many of the statistical methods encountered so
  far require certain assumptions about the data
- These tests may give misleading results if their
  assumptions do not hold
- When the data do not follow certain
  distributional assumptions:
  - use non-parametric methods
  - transform the data
  - ...
15. Parametric v non-parametric tests
- Parametric tests (e.g. t-tests) assume that the
  data follow a particular distribution (e.g. Normal
  distribution)
- Non-parametric tests do not assume a particular
  distribution of the data
- These methods are distribution-free because they
  are based on the analysis of ranks and not the
  actual values
- The averages tested are usually the medians
16. Non-parametric methods
- Advantages
  - no parametric assumptions about underlying
    distribution required
  - can be used on ranked data
  - mathematical concepts are simpler than for
    parametric tests
- Disadvantages
  - less discriminating (less powerful)
  - although simple, arithmetic can be lengthy
  - do not easily provide magnitude of differences
17. Introduction to ranking
- 29 birthweights
- The smallest actual value is given the rank of 1,
  the largest is given the rank of 29
- If some values occur more than
  once, add up the ranks those observations would occupy
  and divide by the number of observations with the
  same value; each tied observation gets this average rank
18. Example
Birthweight   Order   Rank
2.34          1       1
2.38          2       2.5
2.38          3       2.5
3.5           4       4
3.8           5       5
4.0           6       7
4.0           7       7
4.0           8       7
Try the exercise on pg 43 (bottom)
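The ranking rule, including the averaging of tied ranks, can be sketched as below, using the eight birthweights from the example. `average_ranks` is an illustrative helper name.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2  # mean of positions start+1 .. end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks

birthweights = [2.34, 2.38, 2.38, 3.5, 3.8, 4.0, 4.0, 4.0]
print(average_ranks(birthweights))
# [1.0, 2.5, 2.5, 4.0, 5.0, 7.0, 7.0, 7.0]
```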
19. Distribution of birthweights
20. Class exercise pg 43
21. When to use non-parametric tests?
- Data have been measured on an ordinal scale
  (suggest at least 5 ordered categories),
- Rank-ordered data (e.g. placing in a race: 3rd,
  7th),
- Small sample sizes (consider if n < 30, but
  not an absolute rule),
- Continuous/discrete data which do not follow a
  Normal distribution,
- Unequal variances across groups.
22. Analysis of continuous data
- Comparison of two related groups
  - Paired t-test
  - Assumption that the differences came from a
    population following a Normal distribution
- Comparison of two independent groups
  - Independent groups t-test
  - Assumption that data came from populations
    following a Normal distribution
23. Comparison of two related groups
- Wilcoxon matched pairs test
  - Data are continuous (interval) but assumptions of
    the paired t-test are not satisfied
  - Data are ordinal (ranked scale)
  - H0: no tendency for the first outcome to be
    higher or lower than the second
24. Example: Wilcoxon matched pairs test
- A crossover trial of pronethalol versus placebo
  for the prevention of angina was carried out.
  The outcome of interest was the number of angina
  attacks experienced. Twelve patients took part
  in the trial.
H0: no tendency for the number of angina attacks
when on placebo to be higher or lower than when
on the active drug.
H0 is essentially saying that the distribution of
the differences in the number of angina attacks
between the placebo and the active drug is
located around zero (no difference).
25. Example data: Wilcoxon matched pairs
Number of angina attacks experienced,
ordered from smallest difference to largest
(ignoring positive or negative sign)
26. SPSS: Wilcoxon matched pairs test
Mean rank = sum of ranks / number of ranks: 67/11 = 6.09
27. SPSS: Wilcoxon matched pairs test
- Statistically significant result (P = 0.028)
- Patients tend to have fewer attacks whilst on the
  active drug than when taking the placebo
- Note the median of the differences is equal to 7
- So the median patient had seven fewer attacks on the active
  drug than on placebo
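The mechanics of the Wilcoxon matched pairs test can be sketched as follows. The differences are hypothetical, since the trial's data table is not reproduced in these notes, and `wilcoxon_signed_rank` is an illustrative name. The P-value uses a plain Normal approximation with no tie or continuity correction, which is an assumption of this sketch; SPSS output may differ slightly, and exact tables are preferable for samples this small.

```python
from math import erfc, sqrt

def wilcoxon_signed_rank(differences):
    """Return (W+, W-, two-sided P) using a simple Normal approximation.

    Zero differences are dropped; no tie or continuity correction is applied.
    """
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    if n == 0:
        return 0.0, 0.0, 1.0
    # Rank the absolute differences, averaging ranks within ties
    ordered = sorted(nonzero, key=abs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j + 2) / 2  # mean of positions i+1 .. j+1
        i = j + 1
    w_pos = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_neg = sum(r for d, r in zip(ordered, ranks) if d < 0)
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_pos, w_neg) - mean) / sd
    p_value = erfc(abs(z) / sqrt(2))  # two-sided Normal tail area
    return w_pos, w_neg, p_value

# Hypothetical differences (placebo minus active drug), not the trial's data
diffs = [5, -2, 8, 3, -1, 7, 4, 6]
w_pos, w_neg, p_value = wilcoxon_signed_rank(diffs)
print(w_pos, w_neg, round(p_value, 3))
```

Under H0 the positive and negative rank sums should be similar; a large imbalance, as here, points to a shift in one direction.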
28. Comparison of two independent groups
- Mann-Whitney U test
  - Data are continuous/discrete, but assumptions for
    the independent t-test are not satisfied
  - Data are ordinal (ranked)
  - H0: the distribution of the two populations is
    the same
  - (i.e. the two distributions do not differ in
    location)
29. Example: Mann-Whitney test
- Bicep skinfold thickness has been measured in
  patients with two different types of intestinal
  disease.
- Research question
  - Is there a difference in the median skinfold
    thickness between the two groups of patients?
30. Example data
Crohn's disease (Group A, n = 20)    Coeliac disease (Group B, n = 9)
1.8   2.8   4.2    6.2               1.8   3.8
2.2   3.2   4.4    6.6               2.0   4.2
2.4   3.6   4.8    7.0               2.0   5.4
2.5   3.8   5.6   10.0               2.0   7.6
2.8   4.0   6.0   10.4               3.0
31. Computation (order observations from smallest to
largest, 1st to 29th)
ORDER            1    2    3    4    5    6    7    8
OBSERVED VALUE   1.8  1.8  2.0  2.0  2.0  2.2  2.4  2.5
RANK             1.5  1.5  4    4    4    6    7    8
GROUP            A    B    B    B    B    A    A    A
32. SPSS output: Mann-Whitney
33. SPSS output: Mann-Whitney
- The difference in distribution of bicep skinfold
  thickness was not found to be statistically
  significant (P = 0.15)
- Therefore the null hypothesis of equal
  distributions cannot be rejected
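The ranking and rank sums behind this result can be reproduced from the skinfold data given earlier. This is a sketch, not SPSS's procedure: `rank_sums` is an illustrative helper, and the P-value uses a Normal approximation without tie or continuity correction, which is an assumption here.

```python
from math import erfc, sqrt

crohns = [1.8, 2.2, 2.4, 2.5, 2.8, 2.8, 3.2, 3.6, 3.8, 4.0,
          4.2, 4.4, 4.8, 5.6, 6.0, 6.2, 6.6, 7.0, 10.0, 10.4]   # Group A
coeliac = [1.8, 2.0, 2.0, 2.0, 3.0, 3.8, 4.2, 5.4, 7.6]         # Group B

def rank_sums(group_a, group_b):
    """Rank the pooled data (average ranks for ties) and return each group's rank sum."""
    pooled = sorted([(v, "A") for v in group_a] + [(v, "B") for v in group_b])
    sums = {"A": 0.0, "B": 0.0}
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        shared = (i + j + 2) / 2  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            sums[pooled[k][1]] += shared
        i = j + 1
    return sums["A"], sums["B"]

n_a, n_b = len(crohns), len(coeliac)
r_a, r_b = rank_sums(crohns, coeliac)
u_b = r_b - n_b * (n_b + 1) / 2   # Mann-Whitney U for Group B
u_a = n_a * n_b - u_b             # the two U values always sum to n_a * n_b

# Normal approximation for U under H0
mean_u = n_a * n_b / 2
sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
z = (u_b - mean_u) / sd_u
p_value = erfc(abs(z) / sqrt(2))  # two-sided
print(r_a, r_b, u_b, round(p_value, 2))
```

The approximate P-value rounds to 0.15, in line with the SPSS result quoted above.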
34. Presentation of data
35. Comparison of more than two independent groups
- Kruskal-Wallis test for k independent groups (3
  or 4 or 5 independent groups)
- Non-parametric equivalent of one-way analysis of
  variance (ANOVA)
- H0: there is no difference in the distribution of
  values across the groups (in the
  population)
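A minimal sketch of the Kruskal-Wallis calculation follows: pool and rank all observations, sum the ranks in each group, then compute H = 12 / (N(N + 1)) x sum(R_i^2 / n_i) - 3(N + 1). The data are hypothetical and `kruskal_wallis_h` is an illustrative name. No tie correction is applied, and the P-value shortcut shown is valid only for exactly three groups (chi-squared with 2 degrees of freedom).

```python
from math import exp

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic and per-group rank sums (no tie correction)."""
    pooled = sorted((value, g) for g, values in enumerate(groups) for value in values)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += (i + j + 2) / 2  # average rank within ties
        i = j + 1
    n_total = len(pooled)
    h = 12 / (n_total * (n_total + 1)) * sum(
        r ** 2 / len(values) for r, values in zip(rank_sums, groups)
    ) - 3 * (n_total + 1)
    return h, rank_sums

# Hypothetical scores for three groups, not data from any study in these notes
groups = [[10, 35, 20, 55], [40, 60, 75, 50], [80, 65, 90, 95]]
h, rank_sums = kruskal_wallis_h(groups)
# With 3 groups, H is referred to chi-squared with 2 df, whose upper tail is exp(-H/2)
p_value = exp(-h / 2)
print(rank_sums, round(h, 3), round(p_value, 3))
```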
36. Example: Kruskal-Wallis
- Randomised comparison of three treatments for
  children who suffer from frequent and severe
  migraine.
- 18 children randomised (6 per treatment group)
- Headache activity after treatment was expressed
  as a percentage reduction relative to baseline
- (100 indicates complete absence of headaches,
  negative values indicate an increase in headaches)
37. Treatment for headaches: data
38. SPSS: Kruskal-Wallis
39. SPSS: Kruskal-Wallis
- P value = 0.06
- Borderline significance (officially not
  significant)
- Close to the cut-off point for statistical
  significance
- Suggests that there may be a difference in the
  effectiveness of the treatments
40. Multiple comparisons
- It is possible (as in ANOVA) to examine which
  groups differ from one another
- Unfortunately these tests are not available in
  SPSS
- Options (not examinable)
  - Pairwise Mann-Whitney tests (with some adjustment
    for multiple testing)
  - Dunn multiple comparison (can be computed by
    hand)
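The notes do not say which adjustment for multiple testing to use. One common choice is the Bonferroni correction, sketched here as an assumption: each pairwise P-value is multiplied by the number of comparisons and capped at 1. The P-values below are hypothetical.

```python
def bonferroni(p_values):
    """Multiply each P-value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical P-values from three pairwise Mann-Whitney tests
pairwise = {"A v B": 0.30, "A v C": 0.01, "B v C": 0.04}
adjusted = dict(zip(pairwise, bonferroni(list(pairwise.values()))))
for pair, p in adjusted.items():
    print(pair, round(p, 2))
```

After adjustment, only comparisons whose adjusted P-value is still below 0.05 would be called significant.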
41. Summary: parametric v non-parametric tests
- If the assumptions for the parametric test are
  met by the data then the parametric test is more
  powerful and should be used
- Using a parametric test when the assumptions are
  not fully met can result in serious errors
42. Choice of statistical method
- For every research question there are a number of
  options for the statistical analysis
  (e.g. independent t-test, or transform data and
  then independent t-test, or Mann-Whitney).
- The decision as to which statistical method to
  use remains with the researcher after exploring
  the suitability of each method.
43. Next Friday (10 Nov)
- Prepare two answers from the first exam paper in
  your handbook (January 2003): Question 2 and Question 4.
- You will need to look
  ahead in the notes to do parts 4, 8 and 9 of Q4.
- Room to be announced next week. Please come at
  the correct time: 9-10, 10-11 or 11-12.
  - 9-10 in 3rd floor conference room 3.052.
  - 10-12 room(s) not yet confirmed.
- We will be going through answers as a classroom
  exercise.
- Everyone gains more if each person has done the
  preparation.
- You will learn more if you write answers in
  sentences.
- If short of time, write brief notes.