Title: If you think you made a lot of mistakes in the survey project
1. If you think you made a lot of mistakes in the survey project
- Think of how much you accomplished and the mistakes you did not make
2.
- Went from not knowing much about surveys to having designed, deployed, and completed one in 1½ months
- Actually got people to respond!
- Did not end up with 100 open-ended responses which you had to content analyze!
3. One-Tailed and Two-Tailed Tests
One-tailed tests
- Based on a uni-directional hypothesis
- Example: effect of training on problems using PowerPoint (PP)
- Population figures for usability of PP are known
- Hypothesis: training will decrease the number of problems with PP
Two-tailed tests
- Based on a bi-directional hypothesis
- Hypothesis: training will change the number of problems with PP
4. If we know the population mean
Identify the rejection region
Unidirectional hypothesis, .05 level
Bidirectional hypothesis, .05 level
5.
- What does it mean if our significance level is .05?
- For a uni-directional hypothesis
- For a bi-directional hypothesis
- PowerPoint example
- Unidirectional
- If we set the significance level at .05:
- If training has no effect, 5% of the time we will still find a mean that extreme in the predicted direction by chance
- So a significant difference in that direction is unlikely to be due to chance alone
- Bidirectional
- If we set the significance level at .05:
- If training has no effect, 2.5% of the time we will find a significantly higher mean by chance
- 2.5% of the time we will find a significantly lower mean by chance
- So a significant difference in either direction is unlikely to be due to chance alone
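The two rejection regions can be made concrete with a small sketch. It assumes (the slides do not say this) that the test statistic is a z score, which applies when both the population mean and the population standard deviation are known. The variable names are illustrative.

```python
from statistics import NormalDist

ALPHA = 0.05               # significance level used on the slide
std_normal = NormalDist()  # standard normal: mean 0, SD 1

# One-tailed (unidirectional): the whole 5% sits in one tail.
one_tailed_cutoff = std_normal.inv_cdf(1 - ALPHA)

# Two-tailed (bidirectional): 2.5% sits in each tail.
two_tailed_cutoff = std_normal.inv_cdf(1 - ALPHA / 2)

print(f"one-tailed cutoff:  z = {one_tailed_cutoff:.3f}")
print(f"two-tailed cutoffs: z = +/-{two_tailed_cutoff:.3f}")
```

For a hypothesis that training decreases the number of problems, the one-tailed region lies in the lower tail, so the cutoff would be used with a negative sign.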
6. Changing significance levels
- What happens if we relax our significance level from .01 to .05?
- Probability of finding differences that don't exist goes up (criterion becomes more lenient)
- What happens if we tighten our significance level from .01 to .001?
- Probability of not finding differences that exist goes up (criterion becomes more conservative)
7.
- PowerPoint example
- If we set the significance level at .05:
- When there is no true difference, 5% of the time we will find one by chance
- The other 95% of the time we will correctly find no difference
- If we set the significance level at .01:
- When there is no true difference, 1% of the time we will find one by chance
- The other 99% of the time we will correctly find no difference
- For usability, if you set out to find problems, setting lenient criteria might work better (you will identify more problems)
8.
- Effect of relaxing the significance level from .01 to .05
- Probability of finding differences that don't exist goes up (criterion becomes more lenient)
- Also called Type I error (alpha)
- Effect of tightening the significance level from .01 to .001
- Probability of not finding differences that exist goes up (criterion becomes more conservative)
- Also called Type II error (beta)
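The link between the significance level and the Type I error rate can be checked by simulation. The sketch below is only an illustration under stated assumptions: it draws samples from a population in which the null hypothesis is true (normal, mean 0, SD 1), uses a two-tailed z test with the SD treated as known, and counts how often the test rejects anyway. The function name and defaults are hypothetical.

```python
import random
from statistics import NormalDist

def false_positive_rate(alpha, n=9, trials=20000, seed=1):
    """Share of two-tailed z tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    cutoff = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = 1 / n ** 0.5  # population SD is 1 by construction
    rejections = 0
    for _ in range(trials):
        # Every sample comes from a population where there is no effect.
        sample_mean = sum(rng.gauss(0, 1) for _ in range(n)) / n
        if abs(sample_mean / standard_error) > cutoff:
            rejections += 1
    return rejections / trials

for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: rejected a true H0 in "
          f"{false_positive_rate(alpha):.3%} of samples")
```

The observed rate should track alpha closely. The sketch does not simulate Type II errors; doing so would require assuming a specific true effect size.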
9. Degrees of Freedom
- The number of independent pieces of information remaining after estimating one or more parameters
- Example list: 1, 2, 3, 4; average = 2.5
- For the average to remain the same, three of the numbers can be anything you want; the fourth is fixed
- New list: 1, 5, 2.5, __; average = 2.5
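The idea that the last value is not free can be shown in a few lines. This is a minimal sketch with a hypothetical helper and deliberately different free values from the slide's exercise.

```python
def fixed_last_value(free_values, target_mean):
    """Return the one value that forces the full list to have target_mean."""
    n = len(free_values) + 1          # length of the completed list
    required_total = target_mean * n  # the sum the completed list must have
    return required_total - sum(free_values)

original = [1, 2, 3, 4]
target = sum(original) / len(original)  # 2.5

# Pick any three numbers freely; the fourth is then determined.
free_choice = [10, -4, 7]
last = fixed_last_value(free_choice, target)
completed = free_choice + [last]
print(completed, "mean =", sum(completed) / len(completed))
```

With four numbers and one estimated parameter (the mean), three values are free to vary, which is the n - 1 rule.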
10. Major Points
- t tests: are differences significant?
- One sample t test: comparing one mean to the population
- Within subjects test: comparing mean in condition 1 to mean in condition 2
- Between subjects test: comparing mean in condition 1 to mean in condition 2
11. Effect of training on PowerPoint use
- Does training lead to fewer problems with PP?
- 9 subjects were trained on the use of PP.
- They then designed a presentation with PP.
- The number of problems they had was the dependent variable (DV)
12. PowerPoint study data
13. Results of PowerPoint study
- Results
- Mean number of problems = 23.89
- Assume we know that without training the mean would be 30, but not the standard deviation
- Population mean = 30
- Is 23.89 enough smaller than 30 to conclude that training affected results?
14. One sample t test cont.
- Assume mean of population known, but standard deviation (SD) not known
- Substitute sample SD for population SD when computing the standard error
- Gives you the t statistic
- Compare t to tabled values which show critical values of t
15. t Test for One Mean
- Get mean difference between sample and population mean
- Use sample SD as the variability metric: s = 4.40
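The formula from this slide did not survive in the text. The standard one-sample t statistic is given here as a reconstruction, with symbols that are assumptions rather than the slide's own notation: sample mean \bar{X}, population mean \mu, sample SD s, and sample size n.

```latex
t = \frac{\bar{X} - \mu}{s_{\bar{X}}} = \frac{\bar{X} - \mu}{s / \sqrt{n}}
```

The numerator is the mean difference from the first bullet, and the denominator is the standard error built from the sample SD in the second.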
16. Degrees of Freedom
- Skewness of sampling distribution of variance decreases as n increases
- t will differ from z less as sample size increases
- Therefore need to adjust t accordingly
- df = n - 1
- t based on df
17. Looking up critical t (Table E.6)
18. Conclusions
- Critical t: n = 9 (df = 8), t.025 = 2.306 (two-tailed significance at .05)
- If |t| > 2.306, reject H0
- Conclude that training leads to fewer problems
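The one-sample test can be worked through with the figures the slides report (sample mean 23.89, population mean 30, sample SD 4.40, n = 9). The slides do not show the resulting t, so the value below is computed here rather than quoted; the variable names are illustrative.

```python
import math

sample_mean = 23.89     # mean number of problems after training
population_mean = 30.0  # assumed mean without training
sample_sd = 4.40        # sample SD, standing in for the unknown population SD
n = 9                   # number of trained subjects

standard_error = sample_sd / math.sqrt(n)
t = (sample_mean - population_mean) / standard_error
df = n - 1

print(f"t({df}) = {t:.2f}")
```

The sign is negative because the sample mean is below the population mean. Its absolute value is what gets compared with the tabled critical t.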
19. Factors Affecting t
- Difference between sample and population means
- Magnitude of sample variance
- Sample size
20. Factors Affecting Decision
- Significance level α
- One-tailed versus two-tailed test
21. Sampling Distribution of the Mean
- We need to know what kinds of sample means to expect if training has no effect.
- i.e., what kinds of sample means to expect if the population mean = 30
- Recall the sampling distribution of the mean.
22. Sampling Distribution of the Mean--cont.
- The sampling distribution of the mean depends on
- Mean of sampled population
- St. dev. of sampled population
- Size of sample
24. Sampling Distribution of the Mean--cont.
- Also depends on the shape of the sampled population
- The sampling distribution approaches normal
- Rate of approach depends on sample size
- Rate also depends on the shape of the population distribution
25. Implications of the Central Limit Theorem
- Given a population with mean μ and standard deviation σ, the sampling distribution of the mean (the distribution of sample means) has a mean μ and a standard deviation σ/√n.
- The distribution approaches normal as n, the sample size, increases.
26. Demonstration
- Let population be very skewed
- Draw samples of 3 and calculate means
- Draw samples of 10 and calculate means
- Plot means
- Note changes in means, standard deviations, and shapes
27. Parent Population
28. Sampling Distribution, n = 3
29. Sampling Distribution, n = 10
30. Demonstration--cont.
- Means have stayed at 3.00 throughout, except for minor sampling error
- Standard deviations have decreased appropriately
- Shapes have become more normal; see superimposed normal distribution for reference
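The demonstration can be approximated in code. The population used on the slides is not preserved, so this sketch substitutes a hypothetical skewed population: an exponential distribution with mean 3 (and therefore SD 3). All names are illustrative.

```python
import random
from statistics import mean, pstdev

rng = random.Random(0)
POP_MEAN = 3.0  # matches the mean of 3.00 reported in the demonstration
POP_SD = 3.0    # an exponential distribution's SD equals its mean

def draw():
    # Hypothetical skewed parent population, NOT the one from the slides.
    return rng.expovariate(1 / POP_MEAN)

def sample_means(n, reps=20000):
    """Means of `reps` random samples, each of size n."""
    return [sum(draw() for _ in range(n)) / n for _ in range(reps)]

for n in (3, 10):
    means = sample_means(n)
    print(f"n = {n}: mean of means = {mean(means):.2f}, "
          f"SD of means = {pstdev(means):.2f}, "
          f"predicted SD = {POP_SD / n ** 0.5:.2f}")
```

The mean of the sample means should stay near 3 while their SD shrinks toward σ/√n. A histogram of `means` would show the shape becoming more normal as n grows.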
31. Within subjects t tests
- Related samples
- Difference scores
- t tests on difference scores
- Advantages and disadvantages
32. Related Samples
- The same participants give us data on two measures
- e.g., before and after treatment
- Usability problems before training on PP and after training
- With related samples, someone high on one measure is probably high on the other (individual variability).
33. Related Samples--cont.
- Correlation between before and after scores
- Causes a change in the statistic we can use
- Sometimes called matched samples or repeated measures
34. Difference Scores
- Calculate difference between first and second score
- e.g., Difference = Before - After
- Base subsequent analysis on difference scores
- Ignoring Before and After data
35. Effect of training
36. Results
- The training decreased the number of problems with PowerPoint
- Was this enough of a change to be significant?
- Before and After scores are not independent.
- See raw data
- r = .64
37. Results--cont.
- If no change, mean of differences should be zero
- So, test the obtained mean of difference scores against μ = 0.
- Use same test as in one sample test
38. t test
D̄ and sD = mean and standard deviation of differences.
df = n - 1 = 9 - 1 = 8
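The formula that accompanied these definitions is missing from the text. Since the slides describe this as the one-sample test applied to difference scores and tested against zero, a reasonable reconstruction (an assumption, not the slide's own rendering) is:

```latex
t = \frac{\bar{D} - 0}{s_{\bar{D}}} = \frac{\bar{D}}{s_D / \sqrt{n}}, \qquad df = n - 1
```

Here \bar{D} is the mean of the difference scores, s_D their standard deviation, and n the number of participants.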
39. t test--cont.
- With 8 df, t.025 = 2.306 (Table E.6)
- We calculated t = 6.85
- Since 6.85 > 2.306, reject H0
- Conclude that the mean number of problems after training was less than the mean number before training
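The within-subjects procedure (difference scores, then a one-sample test against zero) can be sketched as follows. The study's raw data are not preserved, so the scores below are hypothetical and will not reproduce t = 6.85; `paired_t` is an illustrative helper, not a library function.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """t statistic and df for testing mean(before - after) against 0."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # SE of the mean difference
    return mean(diffs) / standard_error, n - 1

# Hypothetical problem counts for 5 users; NOT the study's data.
before = [30, 28, 35, 32, 27]
after = [25, 26, 30, 29, 24]

t, df = paired_t(before, after)
print(f"t({df}) = {t:.2f}")
```

Only the difference scores enter the calculation, which is why the Before and After columns can be ignored once the differences are computed.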
40. Advantages of Related Samples
- Eliminate subject-to-subject variability
- Control for extraneous variables
- Need fewer subjects
41. Disadvantages of Related Samples
- Order effects
- Carry-over effects
- Subjects no longer naïve
- Change may just be a function of time
- Sometimes not logically possible
42. Between subjects t test
- Distribution of differences between means
- Heterogeneity of Variance
- Nonnormality
43. PowerPoint training again
- Effect of training on problems using PowerPoint
- Same study as before, almost
- Now we have two independent groups
- Trained versus untrained users
- We want to compare mean number of problems between groups
44. Effect of training
45. Differences from within subjects test
- Cannot compute pairwise differences, since there is no meaningful way to pair two unrelated people
- We want to test differences between the two sample means (not between a sample and a population)
46. Analysis
- How are sample means distributed if H0 is true?
- Need sampling distribution of differences between means
- Same idea as before, except the statistic is (X̄1 - X̄2) (mean 1 - mean 2)
47. Sampling Distribution of Mean Differences
- Mean of sampling distribution = μ1 - μ2
- Standard deviation of sampling distribution (standard error of mean differences)
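The expression for this standard error is missing from the text. For two independent samples the standard result, given here as a reconstruction with assumed notation (σ1², σ2² for the population variances and n1, n2 for the group sizes), is:

```latex
\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
```

In practice the sample variances s1² and s2² stand in for the unknown population variances.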
48. Sampling Distribution--cont.
- Distribution approaches normal as n increases.
- Later we will modify this to pool variances.
49. Analysis--cont.
- Same basic formula as before, but with accommodation to 2 groups.
- Note parallels with earlier t
50. Degrees of Freedom
- Each group has 9 subjects.
- Each group has n - 1 = 9 - 1 = 8 df
- Total df = (n1 - 1) + (n2 - 1) = n1 + n2 - 2 = 9 + 9 - 2 = 16 df
- t.025(16) = 2.12 (approx.)
51. Conclusions
- t = 4.13
- Critical t = 2.12
- Since 4.13 > 2.12, reject H0.
- Conclude that those who get training have fewer problems than those without training
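A between-subjects t test can be sketched along the same lines as the earlier tests. The group data from the study are not preserved, so the numbers below are hypothetical and will not reproduce t = 4.13; `independent_t` is an illustrative helper. It uses the unpooled standard error, which equals the pooled version when the two groups are the same size.

```python
import math
from statistics import mean, variance

def independent_t(group1, group2):
    """t statistic (unpooled standard error) and n1 + n2 - 2 df."""
    n1, n2 = len(group1), len(group2)
    # Standard error of the difference between the two sample means.
    standard_error = math.sqrt(variance(group1) / n1 + variance(group2) / n2)
    return (mean(group1) - mean(group2)) / standard_error, n1 + n2 - 2

# Hypothetical problem counts; NOT the study's data.
trained = [20, 22, 24, 26, 28]
untrained = [27, 29, 31, 33, 35]

t, df = independent_t(trained, untrained)
print(f"t({df}) = {t:.2f}")
```

The sign depends only on which group is listed first. The n1 + n2 - 2 degrees of freedom rely on the two populations having equal variances.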
52. Assumptions
- Two major assumptions
- Both groups are sampled from populations with the same variance
- Homogeneity of variance
- Both groups are sampled from normal populations
- Assumption of normality
- Frequently violated with little harm.
53. Heterogeneous Variances
- Refers to case of unequal population variances.
- We don't pool the sample variances.
- We adjust df and look t up in tables for adjusted df.
- Minimum df = smaller n - 1.
- Most software calculates optimal df.
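The slides do not say which df adjustment is meant. One widely used choice is the Welch-Satterthwaite approximation, sketched below as an assumption about what "optimal df" refers to; `welch_df` is a hypothetical helper that takes sample variances and group sizes.

```python
def welch_df(var1, n1, var2, n2):
    """Welch-Satterthwaite approximate df for two groups with unequal variances."""
    a = var1 / n1  # squared standard error contributed by group 1
    b = var2 / n2  # squared standard error contributed by group 2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Equal variances and equal n: the adjustment changes nothing (n1 + n2 - 2).
print(welch_df(10.0, 5, 10.0, 5))

# Very unequal variances: df drops toward the smaller n - 1.
print(welch_df(100.0, 5, 1.0, 20))
```

The adjusted df always falls between the smaller n - 1 and n1 + n2 - 2, which is consistent with the minimum stated on the slide.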