Title: The t-distribution
1. The t-distribution
2. General comment on z and t
3. Moving from z to t
- Same concept, different assumptions
- Can only use z-tests if you know the population SD (σ)
- So when you have to estimate σ, use the t-distribution
- The t-test estimates the population SD from the sample SD
- The t-test is more robust against departures from
normality (they don't affect the accuracy of the
estimated p-value as much)
4. Calculating the t-statistic
- We don't know anything about the population
Step 1: Estimate σ with the sample SD: $s = \sqrt{\sum (x_i - \bar{x})^2 / (n - 1)}$
5. Calculating the t-statistic
Step 2: Use s to estimate the SEM: $s_{\bar{x}} = s / \sqrt{n}$
6. Calculating the t-statistic
Step 3: Use these in the t-statistic: $t = (\bar{x} - \mu) / s_{\bar{x}}$
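The three steps can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `one_sample_t` and the data in `scores` are invented for the example and are not from the slides.

```python
import math
import statistics

def one_sample_t(sample, mu):
    """Return (s, sem, t) for a one-sample t-test of the sample mean against mu."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)   # Step 1: sample SD (n - 1 in the denominator)
    sem = s / math.sqrt(n)         # Step 2: estimated standard error of the mean
    t = (mean - mu) / sem          # Step 3: t-statistic
    return s, sem, t

# Hypothetical data, invented for illustration only.
scores = [12.0, 15.0, 11.0, 14.0, 13.0]
s, sem, t = one_sample_t(scores, mu=10.0)
print(f"s = {s:.3f}, SEM = {sem:.3f}, t = {t:.3f}, df = {len(scores) - 1}")
```

The resulting t is then compared against the t-distribution with n - 1 degrees of freedom.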
7. t-statistic: factors in significance
- Size of the estimated SE depends on both the SD
of the sample and the sample size
- Thus, the factors affecting the size of the calculated t are
the mean difference, sample SD, and sample size
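A small numerical sketch of how each of the three factors moves t. The helper `t_from_summary` and all the numbers are hypothetical, chosen so the arithmetic is easy to follow.

```python
import math

def t_from_summary(mean_diff, sd, n):
    """t = mean difference divided by the estimated SE (sd / sqrt(n))."""
    return mean_diff / (sd / math.sqrt(n))

# Hypothetical baseline: mean difference 2, sample SD 4, n = 16.
base = t_from_summary(mean_diff=2.0, sd=4.0, n=16)

# Change one factor at a time.
bigger_diff = t_from_summary(mean_diff=4.0, sd=4.0, n=16)  # larger mean difference
smaller_sd = t_from_summary(mean_diff=2.0, sd=2.0, n=16)   # less variable sample
larger_n = t_from_summary(mean_diff=2.0, sd=4.0, n=64)     # larger sample

print(base, bigger_diff, smaller_sd, larger_n)
```

Doubling the mean difference, halving the SD, or quadrupling n each doubles t in this example, because n enters through its square root.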
8. Sampling distribution of t
- The t ratio requires 2 sample statistics to be
used to estimate population parameters (sample
mean and sample standard error)
- The z ratio only required one (sample mean)
9. Sampling distribution of t
- So, sampling variation in the z-distribution
reflected variability with respect to the sample mean
- BUT sampling variation in the t-distribution reflects
variability with respect to the sample mean and
the standard error of the mean
- So as the sample gets smaller (and the standard
error of the mean then increases) we'd expect the
sampling distribution of t to differ from that of
z
- The good old 1.96 for 95% is toast
10. Sampling distribution of t
Large n → t-dist pretty much like the
z-dist (because sample SD is a good estimate of
pop SD, sample SE is a good estimate of pop SE)
11. Sampling distribution of t
Small n → t-dist departs from the z-dist (because
sample SD is a poor estimate of pop SD, sample
SE is a poor estimate of pop SE)
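One way to see the large-n versus small-n contrast is a quick simulation. This is a rough sketch, not part of the original slides: it draws repeated samples from a standard normal population (so H0 is true), computes the t ratio for each, and counts how often |t| exceeds 1.96. The function name, the number of repetitions, and the seed are arbitrary choices.

```python
import math
import random

def share_beyond_1_96(n, reps=5000, seed=1):
    """Share of simulated t ratios with |t| > 1.96 when H0 is true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
        sem = math.sqrt(variance / n)   # estimated standard error of the mean
        if abs(mean / sem) > 1.96:      # population mean is 0 under H0
            hits += 1
    return hits / reps

small_n = share_beyond_1_96(n=5)
large_n = share_beyond_1_96(n=100)
print(f"n = 5:   share beyond 1.96 = {small_n:.3f}")
print(f"n = 100: share beyond 1.96 = {large_n:.3f}")
```

With n = 100 the share should land close to the 5% that the z-distribution predicts; with n = 5 it should be clearly larger, which is why 1.96 no longer works as the cutoff for small samples.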
12. Sampling distribution of t
α (significance level)
df = n - 1
Because the distribution gets flatter as n gets
smaller, the critical t needed for significance gets
bigger as n gets smaller
13. (Digression: degrees of freedom)
- Degrees of freedom
- The number of independent pieces of information a
sample of observations can provide for purposes
of statistical inference
- E.g. 3 numbers in a sample: 2, 2, 5
- Sample mean = 3; deviations are -1, -1, 2
- Are these independent?
- No: when you know two, you'll know the other,
because the deviations from the mean must sum to zero
- In other words, for any sample of size n you
have n - 1 things that are free to vary; the
other one is fixed
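The 2, 2, 5 example can be checked directly. A minimal sketch (the variable names are just for illustration):

```python
sample = [2, 2, 5]
n = len(sample)
mean = sum(sample) / n
deviations = [x - mean for x in sample]

# Deviations from the sample mean always sum to zero ...
total = sum(deviations)

# ... so the last deviation can be recovered from the first n - 1 of them.
implied_last = -sum(deviations[:-1])

df = n - 1  # only n - 1 deviations are free to vary
print(deviations, total, implied_last, df)
```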
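14. Confidence intervals for t
$CI = \bar{x} \pm t_{\alpha,\,df} \cdot s_{\bar{x}}$
- $\bar{x}$: sample mean
- $t_{\alpha,\,df}$: critical t-statistic (changes with different df and α)
- $s_{\bar{x}}$: standard error of the mean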
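A sketch of a 95% confidence interval built from these three pieces. The critical values in `T_CRIT_05` are two-tailed α = .05 entries copied from a standard t table for a few df only; the function name and the data in `scores` are hypothetical.

```python
import math
import statistics

# Two-tailed critical t values for alpha = .05 (standard t table, selected df).
T_CRIT_05 = {4: 2.776, 9: 2.262, 19: 2.093, 29: 2.045}
Z_CRIT_05 = 1.96  # the z value that t approaches as df grows

def confidence_interval_95(sample):
    """95% CI for the mean: sample mean +/- critical t * SEM."""
    n = len(sample)
    df = n - 1
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(n)
    margin = T_CRIT_05[df] * sem   # raises KeyError for df not in the small table
    return mean - margin, mean + margin

# Hypothetical data, invented for illustration only.
scores = [12.0, 15.0, 11.0, 14.0, 13.0]
low, high = confidence_interval_95(scores)
print(f"95% CI: {low:.2f} to {high:.2f}")
```

The table also shows the critical t shrinking toward 1.96 as df increases, so the same SEM gives a wider interval when the sample is small.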
15. Independent t-tests
- Uses a sampling distribution of differences
between means
16. Let's think
- Comparing 2 means from the same population
- If we sampled enough, what would we expect the
mean difference to be?
- What would influence the accuracy of this
expectation?
17. Comparing 2 means
- Estimating the mean of the sampling distribution
of differences in the means
- Estimating the standard error of the differences
in the means
- Getting a little complicated
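18. Comparing 2 means
- Estimating the standard error of the differences
in the means
- Must assume equal population variance for the two
samples
- So this is an assumption of independent t-tests
that must be tested
- Then pool the two sample variances:
$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
- Know that
$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$
is the SD of the distribution of differences between
2 sample means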
19. Comparing 2 means
- t-test for 2 independent samples:
$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}}$
Note:
- Numerator: difference between sample means
- Denominator: SEM of the difference between sample means
20. Recall
- Larger sample size, and less variability in the
population, imply...
- ...reduced variability in the distribution of
sample means
21. Extending to 2 sample means
- With larger samples, it is less likely that the
observed difference in sample means is
attributable to random sampling error
- With reduced variability among the cases in each
sample, it is less likely that the observed
difference in sample means is attributable to
random sampling error
- With a larger observed difference between two
sample means, it is less likely that the observed
difference in sample means is attributable to
random sampling error
- See applet:
http://physics.ubishops.ca/phy101/lectures/Beaver/twoSampleTTest.html
22. Evaluating t-observed for 2 samples
- The df changes from the one-sample case
- df = n - 1 for one sample; comparing two independent means it
becomes df = n1 + n2 - 2
- If the 2 groups are of equal size n: df = 2(n - 1)
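As a worked check, the independent t-test can be computed from summary statistics alone. The sketch below assumes the equal-variance (pooled) form of the test and uses the time-to-exhaustion figures reported in these slides (n = 10 per group; means 38.9 and 44.2; SDs 3.54 and 2.86). The function name `pooled_t` is made up for the example.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance independent t-test from summary statistics."""
    df = n1 + n2 - 2
    # Pool the two sample variances, weighting each by its own df.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two sample means.
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se_diff
    return t, df, se_diff

# Group 1 = regular diet, group 2 = CHO supplement (values from the slides).
t, df, se_diff = pooled_t(38.9, 3.54, 10, 44.2, 2.86, 10)
print(f"t({df}) = {t:.2f}, SE of difference = {se_diff:.3f}")
```

The sign of t only reflects which group is subtracted from which; swapping the two groups gives the same magnitude with the opposite sign.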
23. Reporting t-test in text
Descriptive statistics for the time to exhaustion
for the two diet groups are presented in Table 1
and graphically in Figure 1. A t-test for
independent samples indicated that the 44.2 (±
2.9) minute time to exhaustion for the CHO group
was significantly longer than the 38.9 (± 3.5)
minutes for the regular diet group (t(18) = -3.68,
p ≤ 0.05). This represents a 13.6% (5.3-minute) increase in
time to exhaustion with the CHO supplementation
diet.
In discussion, address whether the statistically
significant difference is meaningful
24. Reporting t-test in table
- Descriptives of time to exhaustion (in minutes)
for the 2 diets.
Group      n    Mean   SD
Reg Diet   10   38.9   3.54
CHO sup    10   44.2   2.86
Note: * indicates significant difference, p ≤ 0.05
25. Reporting t-test graphically
Figure 1. Mean time to exhaustion with different
diets.
27. Summary of the independent t-test
- Use when the assumption of no correlation
between the groups is valid
- Compares the difference in means by evaluating
the observed magnitude of the mean difference against
the expected variability in the magnitude of the
mean differences when H0 is true
28. t-tests in SPSS
- First, note the data format: one continuous
variable (in this case, age)
29. t-tests in SPSS
- Second, run the procedure:
drag the test variable over
and specify μ (the test value)
30. t-tests in SPSS
N, Mean, SD, SEM
significance (if α = .05, then < .05 is
significant)
df = n - 1 = 19
31. Independent t-tests in SPSS
One grouping variable
One test variable
33. Independent t-tests in SPSS
- Second, run the procedure
1. slide variables over
2. click define groups
3. define groups
34. Independent t-tests in SPSS
- Third, examine the output
N, Mean, SD, SEM
test for equal variances (p > .05 is good)
significance (if α = .05, then < .05 is
significant)