Title: S519: Evaluation of Information Systems
1. S519: Evaluation of Information Systems
- Social Statistics
- Inferential Statistics
- Chapter 10 t test
2. t test for dependent samples
- A repeated-measures study (a.k.a. a dependent study)
is one in which a single sample of individuals is
measured more than once on the same dependent
variable.
- Main benefit: both sets of data come from the same
subjects.
3. Example
- Three professors at the University of Alabama studied
the effects of resource and regular classrooms on
the reading achievement of learning-disabled
children. A group of children was tested before
they received 1 year of daily instruction and again
after they had received it.
- Which statistical test should we use?
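4. t test for dependent samples
$$t = \frac{\sum D}{\sqrt{\dfrac{n\sum D^2 - \left(\sum D\right)^2}{n - 1}}}$$
- $\sum D$: the sum of all the differences between groups
- $\sum D^2$: the sum of the differences squared between groups
- $n$: the number of pairs of observations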
5. t test for dependent samples
Pre-test Post-test D D²
3 7 4 16
5 8 3 9
4 6 2 4
6 7 1 1
5 8 3 9
5 9 4 16
4 6 2 4
5 6 1 1
3 7 4 16
6 8 2 4
7 8 1 1
8 7 -1 1
7 9 2 4
6 10 4 16
7 9 2 4
8 9 1 1
8 8 0 0
9 8 -1 1
9 4 -5 25
8 4 -4 16
7 5 -2 4
7 6 -1 1
6 9 3 9
7 8 1 1
8 12 4 16
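The D and D² columns above can be totalled and turned into a t value with a short script. This is a minimal sketch, assuming the usual computational formula for the dependent-samples t test, t = ΣD / sqrt((nΣD² - (ΣD)²) / (n - 1)), with D taken as post-test minus pre-test; the function name `dependent_t` is just an illustrative choice.

```python
import math

# Pre-test and post-test scores for the 25 children, copied from the table above.
pre = [3, 5, 4, 6, 5, 5, 4, 5, 3, 6, 7, 8, 7, 6, 7, 8, 8, 9, 9, 8, 7, 7, 6, 7, 8]
post = [7, 8, 6, 7, 8, 9, 6, 6, 7, 8, 8, 7, 9, 10, 9, 9, 8, 8, 4, 4, 5, 6, 9, 8, 12]

def dependent_t(before, after):
    """Return (sum of D, sum of D squared, n, t) for paired scores."""
    diffs = [a - b for b, a in zip(before, after)]  # D = post - pre
    n = len(diffs)
    sum_d = sum(diffs)
    sum_d2 = sum(d * d for d in diffs)
    t = sum_d / math.sqrt((n * sum_d2 - sum_d ** 2) / (n - 1))
    return sum_d, sum_d2, n, t

sum_d, sum_d2, n, t = dependent_t(pre, post)
print(f"sum D = {sum_d}, sum D^2 = {sum_d2}, n = {n}, t = {t:.2f}")
# prints: sum D = 30, sum D^2 = 180, n = 25, t = 2.45
```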
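6. t test for dependent samples
- Step 1: a statement of the null and research
hypotheses
- Null hypothesis: there is no difference between the
mean pre-test and post-test scores
- Research hypothesis: the mean post-test score is
higher than the mean pre-test score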
7. t test for dependent samples
- Step 2: setting the level of risk (the level of
significance, or Type I error) associated with
the null hypothesis
- 0.05
8. t test for dependent samples
- Step 3: selection of the appropriate test
statistic
- Following Figure 10.1
- The t test for dependent samples is also called the
t test for paired samples or the t test for
correlated samples
9. t test for dependent samples
- Step 4: computation of the test statistic value
- t = 2.45
10. t test for dependent samples
- Step 5: determination of the value needed for
rejection of the null hypothesis
- Table B2
- df = n - 1 = 25 - 1 = 24
- One-tailed, because the research hypothesis is
directional
11. t test for dependent samples
- Step 6: a comparison of the t value and the
critical value
- 2.45 > 1.711
- Reject the null hypothesis
12. t test for dependent samples
- Steps 7 and 8: time for a decision
- There is a difference between the pre-test and
post-test: the post-test scores are higher than
the pre-test scores.
13. Excel TTEST function
- TTEST(array1, array2, tails, type)
- array1: the cell address for the first set of
data
- array2: the cell address for the second set of
data
- tails: 1 = one-tailed, 2 = two-tailed
- type: 1 = a paired t test; 2 = a two-sample test
(independent with equal variances); 3 = a
two-sample test with unequal variances
14. Excel TTEST()
- It does not compute the t value
- It returns the likelihood that the resulting t
value is due to chance
- There is about a 1% probability (one-tailed
p = 0.011) that the two tests differ by chance
alone, so the difference is most likely due to
something other than chance.
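The probability that TTEST returns can be approximated without Excel. The sketch below is not Excel's implementation: it integrates the Student's t density numerically (Simpson's rule) for the unrounded t value 2.449489743 with df = 24, both taken from the worked example. The helper names `t_pdf` and `one_tailed_p` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def one_tailed_p(t, df, steps=2000):
    """P(T >= |t|): Simpson's rule on [0, |t|], then use symmetry around 0."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

p_one = one_tailed_p(2.449489743, 24)
p_two = 2 * p_one
print(f"one-tailed p = {p_one:.4f}, two-tailed p = {p_two:.4f}")
```

For a paired test, the one-tailed value corresponds to `tails = 1, type = 1` in TTEST and the two-tailed value to `tails = 2, type = 1`.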
15. Excel ToolPak
- "t-Test: Paired Two Sample for Means" option

t-Test: Paired Two Sample for Means
                               pretest        posttest
Mean                           6.32           7.52
Variance                       2.976666667    3.343333333
Observations                   25             25
Pearson Correlation            0.050718341
Hypothesized Mean Difference   0
df                             24
t Stat                         -2.449489743
P(T<=t) one-tail               0.010991498
t Critical one-tail            1.710882067
P(T<=t) two-tail               0.021982997
t Critical two-tail            2.063898547
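The t Stat in this output can be recovered from the summary rows alone. This is a sketch under one assumption, namely that the ToolPak takes the difference as pretest minus posttest (which would explain the negative sign): the variance of the difference scores is var(pre) + var(post) - 2·r·sd(pre)·sd(post), and t is the mean difference divided by its standard error.

```python
import math

# Values copied from the ToolPak output above.
mean_pre, mean_post = 6.32, 7.52
var_pre, var_post = 2.976666667, 3.343333333
r = 0.050718341  # Pearson correlation between pretest and posttest
n = 25

# Variance of the difference scores D = pretest - posttest.
var_d = var_pre + var_post - 2 * r * math.sqrt(var_pre * var_post)

# Paired t: mean difference divided by its standard error.
t_stat = (mean_pre - mean_post) / math.sqrt(var_d / n)
df = n - 1
print(f"var(D) = {var_d:.3f}, t = {t_stat:.4f}, df = {df}")
# prints: var(D) = 6.000, t = -2.4495, df = 24
```

Because the correlation between the two measurements is close to zero here, pairing removes very little variance in this particular data set.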
16. Advantages of the Repeated-Samples Design
- A repeated-measures design reduces or limits the
variance by eliminating the individual
differences between samples.
17. Problems With the Repeated-Samples Design
- Carryover effect (specifically associated with
repeated-measures designs): a subject's score on the
second measurement is altered by a lingering
aftereffect from the first measurement.
18. Types of t test
19. Example I
- A researcher is interested in a new technique to
improve SAT verbal scores. It is known that SAT
verbal scores have μ = 500, σ = 100.
- She randomly selects n = 30 students from this
population, and has them undergo her training
technique. Students are given analogy questions,
and are shocked each time they get an answer
wrong.
- The sample then writes the SAT, and gets M = 560.
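One possible way to work Example I, offered as a sketch rather than as the intended answer: if the value 100 is read as the known population standard deviation, a one-sample z test applies instead of a t test. The function names and the one-tailed direction (scores improve) are assumptions made for illustration.

```python
import math

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """z = (M - mu) / (sigma / sqrt(n))."""
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))

def upper_tail_p(z):
    """One-tailed P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z = z_statistic(sample_mean=560, pop_mean=500, pop_sd=100, n=30)
p = upper_tail_p(z)
print(f"z = {z:.2f}, one-tailed p = {p:.4f}")
```

The resulting z of about 3.29 is well beyond the one-tailed critical value of 1.645 at the 0.05 level.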
20. Example II
- A social psychologist is interested in whether
people feel more or less hopeful following a
devastating flood in a small rural community. He
randomly selects n = 10 people and asks them to
report how hopeful they feel using a 7-point scale
from extremely hopeful (1) to neutral (4) to
extremely unhopeful (7).
- The researcher is interested in whether the
responses are consistently above or below the
midpoint (4) on the scale, but has no hypothesis
about what direction they are likely to go.
- His sample reports M = 4.7, s = 1.89.
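Example II compares one sample mean against a fixed value (the scale midpoint) with only the sample standard deviation available, so a one-sample t test is a natural candidate. The sketch below assumes a significance level of 0.05, two-tailed because no direction is hypothesized; `one_sample_t` is a hypothetical helper, and the critical value is the standard tabled t for df = 9.

```python
import math

def one_sample_t(sample_mean, test_value, sample_sd, n):
    """Return (t, df) for a one-sample t test computed from summary statistics."""
    se = sample_sd / math.sqrt(n)  # standard error of the mean
    return (sample_mean - test_value) / se, n - 1

t, df = one_sample_t(sample_mean=4.7, test_value=4, sample_sd=1.89, n=10)

# Assumption: alpha = .05, two-tailed; 2.262 is the tabled critical t for df = 9.
critical = 2.262
reject = abs(t) > critical
print(f"t({df}) = {t:.2f}, reject H0: {reject}")
# prints: t(9) = 1.17, reject H0: False
```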
21. Example III
- To test the hypothesis that people give out more
candy to kids in cute costumes than scary ones, I
hire 20 kids to work for me. Ten are randomly
assigned to wear cute bunny costumes, and the
other ten wear Darth Vader costumes.
- I drop the kids off in random parts of the city,
and count the total pieces of candy each has
after 1 hour of trick-or-treating.
- Cute bunnies: M = 120, s = 10
- Darth Vaders: M = 112, s = 12
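Example III has two separate groups of kids, so an independent-samples t test is one reasonable reading. The sketch below assumes equal population variances (pooled variance), a significance level of 0.05, and a one-tailed test because the hypothesis is directional; `independent_t` is a hypothetical helper, and 1.734 is the standard tabled one-tailed critical t for df = 18.

```python
import math

def independent_t(m1, s1, n1, m2, s2, n2):
    """Return (t, df) for an independent-samples t test with pooled variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))  # standard error of the difference
    return (m1 - m2) / se, df

# Group 1: cute bunnies; group 2: Darth Vaders.
t, df = independent_t(m1=120, s1=10, n1=10, m2=112, s2=12, n2=10)

# Assumption: alpha = .05, one-tailed; 1.734 is the tabled critical t for df = 18.
critical = 1.734
reject = t > critical
print(f"t({df}) = {t:.2f}, reject H0: {reject}")
# prints: t(18) = 1.62, reject H0: False
```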
22. Example IV
- We are testing the effects of moderate amounts of
alcohol on driving performance. We make the
hypothesis that even a small amount of beer will
degrade driving performance (an increase in
obstacles hit).
- To test our hypothesis, we have n = 5 subjects
drive Big Wheels around a course covered with
cardboard cutouts of children and furry animals,
and we record the number of cutouts they hit.
Then, they drink one beer and do the course
again; again we record the number of cutouts hit.
- What is a potential confound with this experiment?
23. Example V
- We want to determine if IU SLIS faculty publish
more than the national average of 4 papers per
year (per person). We take a random sample of
n = 12 IU SLIS profs and survey the number of
papers each has published, obtaining M = 6.3,
s = 1.13.
24. Example VI
- I want to know which dog is responsible for the
holes in my yard. I buy 10 German Shepherds and 10
Beagles, and randomly assign each dog to its own
yard. At the end of the day, the Beagles have dug
M = 11.3 holes, s = 2.1, and the Shepherds have dug
M = 5.4 holes, s = 1.9. Test my hypothesis that
Beagles dig more holes than German Shepherds.
25. Example VII
- We want to know if noise affects surgery
performance. We randomly select a sample of 9
surgeons, and have them perform a hand-eye
coordination task (not while performing surgery,
of course). The surgeons first perform the task
in a quiet condition, and then we have them
perform the same task under a noisy condition.
Test the hypothesis that noise will cause poorer
performance on the task.
26. Example VIII
- ETS reports that GRE quantitative scores for
people who have not taken a training course are
μ = 555, σ = 139. We take a sample of 10 people from
this population and give them a new preparation
course. Test the hypothesis that their test
scores differ from the population.