Title: The t-test, the paired t-test, and introduction to non-parametric tests, July 8, 2004
Slide 1: The t-test, the paired t-test, and introduction to non-parametric tests, July 8, 2004
Slide 2: The t-test for comparing means (averages)
Slide 3: Comparing two means
- Is the difference in means that we observe between two groups more than we'd expect to see based on chance alone?
Slide 4: Are the two means different enough to conclude that the observed difference is greater than would be expected by chance?
Slide 6: Comparing two means
- What is the sampling distribution of the difference in the means of two samples?
- First we need to know: what is the distribution of a difference between two normally distributed random variables?
Slide 7: N(12,25)
Simulation of 500 averages of 30 from a normal distribution with mean 12 and standard deviation 5 (variance 25).
Most experiments will yield a mean between 10 and 14 (the true mean ± 2 standard errors).
Slide 8: N(8,25)
Simulation of 500 averages of 30 from a normal distribution with mean 8 and standard deviation 5 (variance 25).
Most experiments will yield a mean between 6 and 10 (the true mean ± 2 standard errors).
Slide 9: Distribution of the difference
simulation of 500 differences between means from
above distributions
Notice that most experiments will yield a
difference value between 1 and 7 (wider than the
above sampling distributions!)
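The simulations described on slides 7 to 9 can be sketched in a few lines of Python. This is a minimal illustration, not the code used to produce the original figures; the seed and the helper name `simulate_means` are assumptions made here.

```python
import random
import statistics

random.seed(2004)  # arbitrary seed (an assumption) so the run is repeatable


def simulate_means(mu, sigma, n, reps):
    """Return `reps` sample means, each computed from n draws of N(mu, sigma^2)."""
    return [
        statistics.fmean([random.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]


means_a = simulate_means(12, 5, 30, 500)  # slide 7: N(12, 25)
means_b = simulate_means(8, 5, 30, 500)   # slide 8: N(8, 25)
diffs = [a - b for a, b in zip(means_a, means_b)]  # slide 9

se_a = statistics.stdev(means_a)      # theory: 5 / sqrt(30), about 0.91
se_b = statistics.stdev(means_b)      # theory: 5 / sqrt(30), about 0.91
se_diff = statistics.stdev(diffs)     # theory: sqrt(25/30 + 25/30), about 1.29
mean_diff = statistics.fmean(diffs)   # theory: 12 - 8 = 4

print(f"spread of means A: {se_a:.2f}, means B: {se_b:.2f}")
print(f"differences: mean {mean_diff:.2f}, spread {se_diff:.2f}")
```

The spread of the differences comes out larger than the spread of either set of means, which is the widening noted on slide 9.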
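Slide 10: Distribution of differences
- If X and Y are independent, with $X \sim N(\mu_x, \sigma_x^2)$ and $Y \sim N(\mu_y, \sigma_y^2)$
- (recall that averages are normally distributed if n is large enough, by the central limit theorem)
- then $X - Y \sim N(\mu_x - \mu_y,\ \sigma_x^2 + \sigma_y^2)$ and $X + Y \sim N(\mu_x + \mu_y,\ \sigma_x^2 + \sigma_y^2)$. The variances add in both cases.
- Therefore, if $\bar{X}$ and $\bar{Y}$ are the averages of n and m subjects, respectively: $\bar{X} - \bar{Y} \sim N\left(\mu_x - \mu_y,\ \frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}\right)$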
Slide 11: Example
- A particular IQ test is designed to have a range
of 0 to 200 with a standard deviation of 10 when
given to U.S. adults. You suspect that female
doctors have higher IQs than male doctors. To
test this hypothesis, you take a random sample of
30 female doctors and 30 male doctors. The women
score an average of 152 and the men an average of
149. What is your conclusion?
Slide 12: Recall steps of a hypothesis test
- 1. Define your hypotheses (null, alternative)
- 2. Specify your null distribution
- 3. Do an experiment
- 4. Calculate the p-value of what you observed
- 5. Reject or fail to reject (accept) the null
hypothesis
Slide 13: 1. Define your hypotheses (null, alternative)
- H0: mean female-doctor IQ = mean male-doctor IQ ($\mu_F - \mu_M = 0$)
- Ha: mean female-doctor IQ ≠ mean male-doctor IQ ($\mu_F - \mu_M \neq 0$), two-sided
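Slide 14: 2. Specify your null distribution
- Under H0, the difference in sample means follows $N\left(0,\ \frac{10^2}{30} + \frac{10^2}{30}\right)$, so the standard error of the difference is $\sqrt{100/30 + 100/30} \approx 2.58$.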
Slide 15: 3. Do an experiment
- Observed difference in our experiment = 152 - 149 = 3.0 IQ points
Slide 16: 4. Calculate the p-value of what you observed
- Z = 3/2.58 = 1.16
- p-value for this Z (from SAS):
  data _null_;
    x = (1 - probnorm(1.16)) * 2;
    put x;
  run;
- 0.2460488061 (two-sided p-value)
Slide 17: 5. Reject or fail to reject (accept) the null hypothesis
- Not enough evidence to reject at the .05 significance level (.24 > .05).
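The five steps for the doctor IQ example can be reproduced in Python with the standard library's `NormalDist`, as a rough parallel to the SAS `probnorm` call. The variable names are chosen here for illustration. Because this sketch keeps Z unrounded (about 1.162 rather than 1.16), its p-value lands slightly below the 0.246 shown on the slide.

```python
from math import sqrt
from statistics import NormalDist

sigma = 10.0               # known population standard deviation of the IQ test
n_women = n_men = 30       # sample sizes
observed_diff = 152 - 149  # women's mean minus men's mean

# Standard error of a difference between two independent means:
# the variances of the two sample means add.
se_diff = sqrt(sigma**2 / n_women + sigma**2 / n_men)

z = observed_diff / se_diff

# Two-sided p-value: area in both tails beyond |z| under the standard normal.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"SE = {se_diff:.2f}, Z = {z:.2f}, two-sided p = {p_two_sided:.3f}")
print("reject H0 at .05" if p_two_sided < 0.05 else "fail to reject H0 at .05")
```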
Slide 18: Complication 1
- The harsh reality is, we hardly ever know the true standard deviation a priori. If we knew that much, we probably wouldn't need to run an experiment! In most cases, we must use the sample standard deviation as a stand-in for the truth. However, by estimating the population standard deviation we are adding more uncertainty to our experiment. The null distribution is then slightly wider than a normal curve; it is called a t-distribution.
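Slide 19: Recall sample variance and standard deviation
- Sample variance: $s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$
- Sample standard deviation: $s = \sqrt{s^2}$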
Slide 20: Example calculation of sample standard deviation
- Systolic blood pressures: 104, 114, 120, 148, 130, 132, 143, 152, 133, 124
- Mean = 1300/10 = 130
- Sample standard deviation: $s = \sqrt{2058/9} \approx 15.1$
- Estimated standard error of the mean: $s/\sqrt{n} = 15.1/\sqrt{10} \approx 4.8$
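The blood pressure calculation can be checked with Python's `statistics` module, whose `stdev` function uses the n - 1 denominator of the sample standard deviation. A short sketch, with variable names chosen here:

```python
import math
import statistics

systolic = [104, 114, 120, 148, 130, 132, 143, 152, 133, 124]

mean_bp = statistics.mean(systolic)

# Sum of squared deviations from the mean (numerator of the sample variance).
sum_sq_dev = sum((x - mean_bp) ** 2 for x in systolic)

# Sample standard deviation (divides by n - 1, not n).
sd_bp = statistics.stdev(systolic)

# Estimated standard error of the mean.
se_bp = sd_bp / math.sqrt(len(systolic))

print(f"mean = {mean_bp}, sum of squared deviations = {sum_sq_dev}")
print(f"sample sd = {sd_bp:.2f}, standard error = {se_bp:.2f}")
```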
Slide 21: Complication 1
- The null distribution is slightly wider than a normal curve; it is called a t-distribution.
Slide 22: The t probability density function
Slide 23: The t distributions
- The t distribution depends on the degrees of freedom.
- Degrees of freedom here = number of observations used to calculate the standard deviation (n) minus the number of sample means (1 or 2) used in calculation of the sample standard deviation.
Slide 24: The t distributions
- The t distribution is just a slightly flattened version of the normal curve.
- The t distribution is actually a family of distributions that comes closer and closer to the normal probability distribution as degrees of freedom increase.
- With n of about 30 or more, the t distribution is approximately normal.
The t-function in SAS is probt(t-statistic, df)
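To see the "flattened normal" shape numerically, the sketch below evaluates the standard Student's t density at the center (x = 0) and in the tail (x = 3) for several degrees of freedom and compares it with the standard normal. The function `t_pdf` is written here from the textbook density formula; it is not part of the slides or of SAS.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t with `df` degrees of freedom at x."""
    # Work on the log scale so the gamma functions cannot overflow.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


normal = NormalDist()  # standard normal, mean 0 and sd 1

print("df      density at 0   density at 3")
for df in (2, 10, 30, 100):
    print(f"{df:<7} {t_pdf(0, df):<14.4f} {t_pdf(3, df):.4f}")
print(f"normal  {normal.pdf(0):<14.4f} {normal.pdf(3):.4f}")
```

As the degrees of freedom grow, the peak rises toward the normal peak and the tail density shrinks toward the normal tail.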
Slide 25: Example
- A one-sample test when the standard deviation is
unknown (one-sample t-test)
Slide 26: Example: One sample t-test
- A British sleep researcher claims that the British sleep an average of 6.0 hours a night. If you ask 30 Brits how many hours they sleep per night and your sample average is 6.9 hours with a sample standard deviation of 3.0, do you think the researcher was mistaken in his claim?
- 1. Specify hypothesis
- H0: average hours = 6.0
- Ha: average hours ≠ 6.0 (two-sided)
Slide 27: One sample t-test
- 2. Specify null distribution.
- The null distribution here actually follows a
t-distribution with 29 (n-1) degrees of
freedom (the higher the number of degrees of
freedom, the more the t-distribution looks like a
normal curve).
Slide 28: One sample t-test
- 3. Observed data: sample mean of 6.9 hours with a sample standard deviation of 3.0
Slide 29: One sample t-test
- 4. USE SAS TO CALCULATE p-value
- t = (6.9 - 6.0) / (3.0/√30) = 0.9/0.55 = 1.64
  data _null_;
    pval = 1 - probt(1.64, 29);
    put pval;
  run;
- 0.0559046876
- For a two-sided test, multiply by 2: p-value = .11
- This gives just a slightly higher answer than the Z-test (Z = 1.64), which yields a two-sided p-value of .10. The certainty is diminished because the standard deviation was estimated.
Slide 30: One sample t-test
- 5. .11 > .05: do not reject the null at a significance level of .05
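The sleep example can be worked end to end in Python. The standard library has no t-distribution CDF, so this sketch integrates the t density numerically with Simpson's rule; the helper `t_upper_tail` is an illustrative stand-in for SAS's `1 - probt(t, df)`, not a library function, and its step count is an arbitrary choice.

```python
import math


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for Student's t by Simpson's rule on [0, t]."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # The density is symmetric, so P(T > 0) = 0.5; subtract the area on [0, t].
    return 0.5 - total * h / 3


n, sample_mean, sample_sd, null_mean = 30, 6.9, 3.0, 6.0

se = sample_sd / math.sqrt(n)           # estimated standard error of the mean
t_stat = (sample_mean - null_mean) / se
df = n - 1
p_two_sided = 2 * t_upper_tail(abs(t_stat), df)

print(f"t = {t_stat:.2f} with {df} df, two-sided p = {p_two_sided:.2f}")
print("reject H0 at .05" if p_two_sided < 0.05 else "do not reject H0 at .05")
```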
Slide 31: Example: two-sample t-test
- In 1980, some researchers reported that "men have more mathematical ability than women" as evidenced by the 1979 SATs, where a sample of 30 random male adolescents had a mean score ± 1 standard deviation of 436 ± 77 and 30 random female adolescents scored lower: 416 ± 81 (genders were similar in educational backgrounds, socio-economic status, and age). Do you agree with the authors' conclusions?
Slide 32: Two-sample t-test
- 1. Define your hypotheses (null, alternative)
- H0: (male - female) mean math SAT = 0
- Ha: (male - female) mean math SAT ≠ 0 (two-sided)
Slide 33: Two-sample t-test
- 2. Specify your null distribution
- F and M have approximately equal standard
deviations/variances, so make a pooled estimate
of variance.
Slide 34: Two-sample t-test
- 3. Observed difference in our experiment = 436 - 416 = 20 points
Slide 35: Two-sample t-test
- 4. Calculate the p-value of what you observed
  data _null_;
    pval = (1 - probt(.98, 58)) * 2;
    put pval;
  run;
- 0.3311563454
- 5. Do not reject null! No evidence that men are better in math.
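The t-statistic of .98 and the 58 degrees of freedom passed to `probt` come from the pooled-variance calculation, which can be sketched as follows. The variable names are chosen here; the formula is the usual pooled estimate, which weights each group's variance by its degrees of freedom.

```python
import math

# Summary statistics from the 1979 SAT example.
n_men, mean_men, sd_men = 30, 436, 77
n_women, mean_women, sd_women = 30, 416, 81

# Pooled variance: each sample variance weighted by its degrees of freedom.
df = n_men + n_women - 2
pooled_var = ((n_men - 1) * sd_men**2 + (n_women - 1) * sd_women**2) / df

# Standard error of the difference in means, using the pooled variance.
se_diff = math.sqrt(pooled_var / n_men + pooled_var / n_women)

t_stat = (mean_men - mean_women) / se_diff

print(f"pooled variance = {pooled_var:.0f}, SE = {se_diff:.1f}")
print(f"t = {t_stat:.2f} with {df} df")
```

The resulting t and df are the two arguments that go into the t-distribution lookup in step 4.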
Slide 36: Example 2
- Example: Rosenthal, R. and Jacobson, L. (1966) Teachers' expectancies: Determinants of pupils' I.Q. gains. Psychological Reports, 19, 115-118.
Slide 37: The Experiment (note: exact numbers have been altered)
- Grade 3 at Oak School were given an IQ test at the beginning of the academic year (n = 90).
- Classroom teachers were given a list of names of students in their classes who had supposedly scored in the top 20 percent; these students were identified as "academic bloomers" (n = 18).
- BUT: the children on the teachers' lists had actually been randomly assigned to the list.
- At the end of the year, the same I.Q. test was re-administered.
Slide 38: The results
- Children who had been randomly assigned to the "top-20 percent" list had a mean I.Q. increase of 12.2 points (sd = 2.0), whereas children in the control group had an increase of only 8.2 points (sd = 2.5).
- Is this a statistically significant difference? Give a confidence interval for this difference.
Slide 39: 1. Hypotheses
- H0: mean change (gifted) - mean change (control) = 0
- Ha: mean change (gifted) - mean change (control) ≠ 0
Slide 40: 2. Null distribution
- Null distribution of difference of two means
Slide 41: 3. Empirical data
- Observed difference in our experiment = 12.2 - 8.2 = 4.0
Slide 42: 4. P-value
- The t-curve with 88 df's has slightly wider cut-offs for 95% area (t = 1.99) than a normal curve (Z = 1.96).
- The observed t = 4.0/.64 ≈ 6.3 lies far beyond the cut-off, so p < .05.
Slide 43: 5. Reject null!
- Conclusion: I.Q. scores can bias expectancies in the teachers' minds and cause them to unintentionally treat "bright" students differently from those seen as less bright.
Slide 44: Confidence interval (more information!!)
- 95% CI for the difference: 4.0 ± 1.99(.64) = (2.7, 5.3)
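The standard error of .64 and the confidence interval can be reproduced from the summary statistics. This sketch assumes the control group is the remaining 90 - 18 = 72 children (consistent with the 88 degrees of freedom quoted on slide 42) and takes the critical value 1.99 directly from the slides rather than computing it.

```python
import math

# Summary statistics (the slides note the exact numbers have been altered).
n_bloom, gain_bloom, sd_bloom = 18, 12.2, 2.0
n_ctrl, gain_ctrl, sd_ctrl = 72, 8.2, 2.5  # assumption: 90 - 18 = 72 controls

df = n_bloom + n_ctrl - 2  # 88
pooled_var = ((n_bloom - 1) * sd_bloom**2 + (n_ctrl - 1) * sd_ctrl**2) / df
se_diff = math.sqrt(pooled_var / n_bloom + pooled_var / n_ctrl)

diff = gain_bloom - gain_ctrl
t_stat = diff / se_diff

t_crit = 1.99  # 95% two-sided cut-off for 88 df, as given on the slide
ci_lower = diff - t_crit * se_diff
ci_upper = diff + t_crit * se_diff

print(f"SE = {se_diff:.2f}, t = {t_stat:.2f} with {df} df")
print(f"95% CI for the difference: ({ci_lower:.1f}, {ci_upper:.1f})")
```

Because the interval excludes 0, it leads to the same decision as the hypothesis test while also showing the plausible size of the effect.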
Slide 45: 2. The paired T-test
Slide 46: The Paired T-test
- Paired data: either the same person on different occasions or pairs of people who are more similar to each other than to individuals from other pairs (husband-wife pairs, twin pairs, matched cases and controls, etc.)
- For example, the paired t-test evaluates whether an observed change in mean (before vs. after) represents a true improvement (or decrease).
- Null hypothesis: difference (after - before) = 0
Slide 47: Did the control group in the previous experiment improve at all during the year?
p-value
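A paired t-test is a one-sample t-test on the within-person differences. The sketch below applies it to the control group's summary numbers, under two assumptions that the slides do not state outright: the control group has 90 - 18 = 72 children, and the reported sd of 2.5 is the standard deviation of each child's change score (after minus before).

```python
import math

n_ctrl = 72        # assumption: 90 children minus the 18 "bloomers"
mean_change = 8.2  # mean of (after - before) in the control group
sd_change = 2.5    # assumption: sd of the individual change scores

# Standard error of the mean change, then the paired t-statistic
# against the null hypothesis that the mean change is 0.
se_change = sd_change / math.sqrt(n_ctrl)
t_stat = (mean_change - 0) / se_change
df = n_ctrl - 1

print(f"SE of mean change = {se_change:.3f}")
print(f"paired t = {t_stat:.1f} with {df} df")
```

A t-statistic this far above 2 corresponds to a very small p-value, so under these assumptions the control group's improvement is not plausibly due to chance.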
Slide 48: Summary
Equal variances are pooled
Unequal variances (unpooled)
Slide 49: Non-parametric tests
- t-tests require your outcome variable to be normally distributed (or close enough).
- Non-parametric tests are based on RANKS instead of means and standard deviations ("population parameters").
Slide 50: Example: non-parametric tests
10 dieters following Atkins diet vs. 10 dieters following Jenny Craig.
Hypothetical RESULTS: Atkins group loses an average of 34.5 lbs. J. Craig group loses an average of 18.5 lbs.
Conclusion: Atkins is better?
Slide 51: Example: non-parametric tests
BUT, take a closer look at the individual data:
Atkins, change in weight (lbs): 4, 3, 0, -3, -4, -5, -11, -14, -15, -300
J. Craig, change in weight (lbs): -8, -10, -12, -16, -18, -20, -21, -24, -26, -30
Slide 52: Jenny Craig
[Histogram: percent of dieters (y-axis, 0 to 30) by weight change (x-axis, -30 to 20)]
Slide 53: Atkins
[Histogram: percent of dieters (y-axis, 0 to 30) by weight change (x-axis, -300 to 20)]
Slide 54: t-test doesn't work
- Comparing the mean weight loss of the two groups is not appropriate here.
- The data do not appear to be normally distributed.
- Moreover, there is an extreme outlier (this outlier influences the mean a great deal).
Slide 55: Statistical tests to compare ranks
- The Wilcoxon Mann-Whitney test is the analogue of the two-sample t-test.
Slide 56: Wilcoxon Mann-Whitney test
- RANK the values, 1 being the least weight loss and 20 being the most weight loss.
- Atkins
- Values: 4, 3, 0, -3, -4, -5, -11, -14, -15, -300
- Ranks: 1, 2, 3, 4, 5, 6, 9, 11, 12, 20
- J. Craig
- Values: -8, -10, -12, -16, -18, -20, -21, -24, -26, -30
- Ranks: 7, 8, 10, 13, 14, 15, 16, 17, 18, 19
Slide 57: Wilcoxon Mann-Whitney test
- Sum of Atkins' ranks:
- 1 + 2 + 3 + 4 + 5 + 6 + 9 + 11 + 12 + 20 = 73
- Sum of Jenny Craig's ranks:
- 7 + 8 + 10 + 13 + 14 + 15 + 16 + 17 + 18 + 19 = 137
- Jenny Craig clearly ranked higher!
- P-value (from computer) = .018
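The ranking and rank sums above can be reproduced in a few lines of Python. For the p-value, this sketch uses the large-sample normal approximation with a continuity correction, which is an assumption made here: the slides do not say which method the computer used, so the result is close to, but not necessarily identical to, the .018 reported.

```python
from math import sqrt
from statistics import NormalDist

atkins = [4, 3, 0, -3, -4, -5, -11, -14, -15, -300]
jenny_craig = [-8, -10, -12, -16, -18, -20, -21, -24, -26, -30]

# Rank 1 = least weight loss (largest change), rank 20 = most weight loss.
# There are no tied values in these data, so a plain sort assigns the ranks.
pooled = sorted(atkins + jenny_craig, reverse=True)
rank = {value: position + 1 for position, value in enumerate(pooled)}

atkins_rank_sum = sum(rank[v] for v in atkins)
jc_rank_sum = sum(rank[v] for v in jenny_craig)

# Under H0 (no difference between diets), the Atkins rank sum has this
# mean and standard deviation.
n, m = len(atkins), len(jenny_craig)
expected = n * (n + m + 1) / 2
sd = sqrt(n * m * (n + m + 1) / 12)

# Normal approximation with a 0.5 continuity correction (an assumption).
z = (abs(atkins_rank_sum - expected) - 0.5) / sd
p_two_sided = 2 * (1 - NormalDist().cdf(z))

print(f"rank sums: Atkins = {atkins_rank_sum}, Jenny Craig = {jc_rank_sum}")
print(f"z = {z:.2f}, approximate two-sided p = {p_two_sided:.3f}")
```

The -300 outlier contributes only its rank (20) to the test, so it no longer dominates the comparison the way it dominates the group mean.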