Title: Inferential Statistics
1. Inferential Statistics: Test of Significance
2. Confidence Interval (CI)
$$ CI = \bar{Y} \pm Z \, \sigma_{\bar{Y}} $$
where \bar{Y} = mean, Z = Z score related with a 95% CI, \sigma_{\bar{Y}} = standard error
3. Building a CI
4. CI
Why do we use 1.96?
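A minimal sketch of where 1.96 comes from, assuming a standard normal sampling distribution. It uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative only. A 95% interval leaves 2.5% in each tail, so the cutoff is the 97.5th percentile of the standard normal.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
std_normal = NormalDist(mu=0.0, sigma=1.0)

# A 95% interval leaves 2.5% in each tail, so look up the 97.5th percentile.
confidence = 0.95
tail = (1 - confidence) / 2
z_critical = std_normal.inv_cdf(1 - tail)

# Share of the distribution that lies within +/- 1.96 standard errors.
coverage = std_normal.cdf(1.96) - std_normal.cdf(-1.96)

print(f"z for a 95% CI: {z_critical:.2f}")
print(f"area within +/-1.96: {coverage:.3f}")
```

Changing `confidence` to 0.99 gives the larger cutoff used for a 99% CI.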
5. Source: Knoke & Bohrnstedt (1991: 167)
Is there a sample that is different from the mean?
6. Significance Testing
- When we explain some phenomenon, we move beyond description to inferential statistics and hypothesis testing.
- Tests of significance allow us to test hypotheses and, when we find a relationship between variables, to reject the null hypothesis.
7. Hypothesis testing
- Hypothesis testing means that we are testing our null hypothesis (Ho) against some competing or alternative hypothesis (H1).
- Normally we choose statements such as:
- Ho: µy = 100
- H1: µy ≠ 100
- Or
- H1: µy > 100
- Or
- H1: µy < 100
8. Significance Testing
- Even with high-powered statistical measures, some results will arise by chance. If we ran our models a thousand times, or even fewer, we would likely see some results that do not stem from systematic processes.
- Thus, we need to determine at what level of significance we are willing to frame our results. We can never be 100% confident.
- Conventional levels of significance at which we reject the null hypothesis are usually .05 or .01. A probability of .10 is considered weakly significant.
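The point about chance results can be sketched with a small simulation. This is an illustration under stated assumptions, not part of the original slides: the population is normal with mean 100 (so Ho: µy = 100 is true), and the standard deviation of 15, sample size of 30, and function name `false_positive_rate` are hypothetical choices. With a cutoff of 1.96, roughly 5% of samples should still reject the true null.

```python
import random
from math import sqrt

def false_positive_rate(trials=2000, n=30, mu=100.0, sigma=15.0, z_crit=1.96, seed=1):
    """Share of samples that reject a TRUE null hypothesis (Ho: mu = 100)."""
    rng = random.Random(seed)
    standard_error = sigma / sqrt(n)  # population sigma is assumed known here
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu) / standard_error
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

rate = false_positive_rate()
print(f"rejected a true null in {rate:.1%} of samples")
```

Raising `z_crit` to the .01 cutoff lowers the share of chance rejections, at the cost of making real effects harder to detect.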
9. Significance Testing
- When you erroneously reject the null hypothesis when it is true, you make a Type I error. This means you are accepting a false positive result.
- Think of this as a fiancé test: the chance of a Type I error is the chance of rejecting, or saying no to, Mr. or Miss Right.
10. Significance Testing
- A Type II error occurs when you accept the null hypothesis when it is not true.
- This is a false negative, as when you say yes to Mr. or Miss Wrong.
- Type II errors in statistical testing can result from too little data, omitted variable bias, and multicollinearity.
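The link between too little data and Type II errors can be illustrated with a simulation. This is a sketch under assumptions that are not in the slides: the true mean is 105 while the null claims 100, the standard deviation is 15, and `miss_rate` is a hypothetical helper. The null is false in every trial, so each failure to reject is a Type II error.

```python
import random
from math import sqrt

def miss_rate(n, trials=2000, null_mu=100.0, true_mu=105.0, sigma=15.0,
              z_crit=1.96, seed=2):
    """Share of samples that fail to reject a FALSE null (Ho: mu = 100)."""
    rng = random.Random(seed)
    standard_error = sigma / sqrt(n)
    misses = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mu, sigma) for _ in range(n)) / n
        z = (sample_mean - null_mu) / standard_error
        if abs(z) <= z_crit:  # not extreme enough, so the false null survives
            misses += 1
    return misses / trials

miss_small = miss_rate(n=10)
miss_large = miss_rate(n=100)
print(f"Type II error rate with n=10:  {miss_small:.1%}")
print(f"Type II error rate with n=100: {miss_large:.1%}")
```

The same real difference is missed far more often in the small sample than in the large one.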
11. Other distributions
- The normal distribution assumes that we know the population standard deviation, and therefore the standard error. Often, however, we don't know it.
- The t-distribution becomes the best alternative when we don't know the population standard deviation but we know the sample standard deviation.
- As the sample gets bigger, the t-distribution approaches the normal distribution.
- There are other distributions, such as chi-square, that we will discuss later.
12. t-Distribution and Normal Distribution
The form of the t-distribution depends on the sample size. As the sample gets larger, the difference between the normal and the t-distribution disappears.
Source: Gujarati (1992: 76)
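One way to see the t-distribution converge on the normal is to simulate t statistics and count how often they fall beyond the normal cutoff of 1.96. This is a rough sketch, assuming normally distributed data with a true null. The function `exceed_rate` and its defaults are hypothetical. With a small sample the statistic lands beyond ±1.96 noticeably more often than 5%. With a large sample the share is close to 5%.

```python
import random
from math import sqrt

def exceed_rate(n, trials=4000, mu=0.0, sigma=1.0, cutoff=1.96, seed=3):
    """Share of samples whose t statistic falls beyond +/- cutoff when Ho is true."""
    rng = random.Random(seed)
    beyond = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        m = sum(sample) / n
        # Sample standard deviation (n - 1 in the denominator).
        s = sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))
        t = (m - mu) / (s / sqrt(n))
        if abs(t) > cutoff:
            beyond += 1
    return beyond / trials

rate_n5 = exceed_rate(n=5)
rate_n100 = exceed_rate(n=100)
print(f"beyond +/-1.96 with n=5:   {rate_n5:.1%}")
print(f"beyond +/-1.96 with n=100: {rate_n100:.1%}")
```

This is why small samples need a larger critical value than 1.96.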
13. The t formula
For α = .05 and N = 30, t = 2.045
14. 95% CI using t-test
20 ± 2.093 (5/√20): upper = 22.34, lower = 17.66
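The interval can be checked with a few lines of arithmetic. This is a minimal sketch that takes the slide's numbers as given (mean 20, standard deviation 5, N = 20, t = 2.093) and treats 5 as the sample standard deviation. The function name `t_confidence_interval` is illustrative only.

```python
from math import sqrt

def t_confidence_interval(sample_mean, sample_sd, n, t_critical):
    """Return (lower, upper) for mean +/- t * (s / sqrt(n))."""
    margin = t_critical * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Values from the slide: mean 20, s = 5, N = 20, t = 2.093 (19 degrees of freedom).
lower, upper = t_confidence_interval(20, 5, 20, 2.093)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

The margin of error is the critical t value multiplied by the standard error, so both bounds sit the same distance from the mean.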
15. Why do we care about CI?
- We use CIs for hypothesis testing.
- For instance, we want to know if there is a difference in home values between El Paso and Boston.
- We want to know whether or not taking a class at Kaplan makes a difference in our GRE scores.
- We want to know if there is a difference between the treatment and control groups.
16. Mean Difference testing
[Figure: home values in Boston, Las Cruces, and El Paso compared with the USA mean]
18. T-Tests of Independence
- Used to test whether there is a significant difference between the means of two samples.
- We are testing for independence, meaning whether or not the two samples are related.
- This is a one-time test, not a test over time with multiple observations.
- Example: the values of homes in El Paso and Boston.
19. T-Test of Independence
- Useful in experiments where people are assigned to two groups that should not differ at the outset. We then introduce an independent variable (the treatment) to one group to see whether the groups show real differences, which would be attributable to the introduced X variable. A real difference implies that the samples are from different populations (with different µ).
- This is the Completely Randomized Two-Group Design.
20. T-Test of Independence
- For example, we can take a random sample of high school students and divide it into two groups. One gets tutoring for the SAT and the other does not.
- Ho: µ1 = µ2
- H1: µ1 ≠ µ2
- After one group gets tutoring, but not the other, we compare the scores. Suppose we find that the group exposed to tutoring significantly outperformed the other group. We then conclude that tutoring makes a difference.
21. Positive increments at a different rate
[Figure: pre-test and post-test scores for the treatment and control groups]
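22. Two Sample Difference of Means T-Test
$$ t = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} $$
Pooled variance of the two groups:
$$ S_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} $$
where S_p is the common standard deviation of the two groups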
23. Two Sample Difference of Means T-Test
- The numerator of the equation captures the difference in means, while the denominator captures the variation within each group, pooled across both groups.
- The important point of interest is the difference between the sample means, not between sample and population means. However, rejecting the null means that the two groups under analysis have different population means.
24. An example
- Test on GRE verbal test scores by gender
- Females: mean = 50.9, variance = 47.553, n = 6
- Males: mean = 41.5, variance = 49.544, n = 10
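The obtained value for this example can be worked out from the summary statistics alone. This is a sketch assuming the pooled-variance (equal variances) form of the two-sample t statistic. The function name `pooled_t` is hypothetical.

```python
from math import sqrt

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t statistic with a pooled variance; returns (t, degrees of freedom)."""
    df = n1 + n2 - 2
    # Weight each group's variance by its own degrees of freedom.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# Females: mean 50.9, variance 47.553, n 6. Males: mean 41.5, variance 49.544, n 10.
t_obtained, df = pooled_t(50.9, 47.553, 6, 41.5, 49.544, 10)
print(f"t = {t_obtained:.3f} with {df} degrees of freedom")
```

The result is then compared against the critical t value for the same degrees of freedom.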
25. Now what do we do with this obtained value?
26. Steps of Testing and Significance
- Statement of the null hypothesis: if there is not one, then how can you be wrong?
- Set the alpha level of risk: .10, .05, .01.
- Selection of the appropriate test statistic, such as the t-test.
- Computation of the statistical value: get the obtained value.
- Compare the obtained value to the critical value: done for you for most methods in most statistical packages.
27. Steps of Testing and Significance
- Comparison of the obtained and critical values.
- If the obtained value is more extreme than the critical value, you may reject the null hypothesis. In other words, you have significant results.
- If the previous point is not true, and the obtained value is less extreme than the critical value, then the null is not rejected.
28. GRE Verbal Example
- Obtained value: 2.605
- Critical value?
- Degrees of freedom: number of cases left after subtracting 1 for each sample (6 + 10 - 2 = 14).
- Ho: µf = µm
- H1: µf ≠ µm
- Is the null hypothesis (Ho) supported?
- Answer: No. Women have higher verbal scores, and the difference is statistically significant. This means that the mean scores of the two genders as populations are different.
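The comparison of obtained and critical values can be sketched as a simple decision rule. The critical values below are assumptions taken from a standard two-tailed t table for 14 degrees of freedom, not from the slides. The `decide` helper is hypothetical.

```python
def decide(obtained, critical):
    """Reject Ho when the obtained value is more extreme than the critical value."""
    return abs(obtained) > abs(critical)

obtained_t = 2.605  # from the GRE verbal example

# Assumed two-tailed critical values for 14 degrees of freedom (standard t table).
critical_by_alpha = {0.10: 1.761, 0.05: 2.145, 0.01: 2.977}

decisions = {alpha: decide(obtained_t, crit) for alpha, crit in critical_by_alpha.items()}
for alpha, reject in decisions.items():
    print(f"alpha = {alpha:.2f}: {'reject Ho' if reject else 'do not reject Ho'}")
```

Under these assumed cutoffs the null is rejected at the .10 and .05 levels but not at .01, which is why the alpha level has to be set before looking at the result.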
29. Paired T-Tests
- We use paired t-tests, a test of dependence, to examine a single sample of subjects/units under two conditions, such as a pretest-posttest experiment.
- For example, we can examine whether a group of students improves if they retake the GRE exam. The t-test examines whether there is any significant difference between the two sets of scores. If so, then possibly something like studying more made a difference.
30. Paired T-Tests
- Unlike a test for independence, this test requires that the two groups/samples being evaluated are dependent upon each other.
- For example, we can use a paired t-test to examine two sets of scores across time as long as they come from the same students.
- This is appropriate for a pre-test post-test research design.
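31. Paired t-test formula
$$ t = \frac{\sum D}{\sqrt{\frac{n \sum D^2 - \left( \sum D \right)^2}{n - 1}}} $$
where \sum D = sum of the differences between the paired scores (it also appears squared), \sum D^2 = sum of the squared differences, n = number of pairs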
32. Comparing Test Scores
Midterm Final
48 71.2
69 73.3
95 96
87 94.2
50 81.4
75 86.7
74 72.8
88 88
92 95
69 88
75 91.8
86 93.6
73 71.8
60 80.1
33. What can we conclude?
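One way to approach the question is to run a paired t-test on the midterm and final scores. This is a sketch, assuming the sum-of-differences form of the paired t statistic, t = ΣD / sqrt((n ΣD² - (ΣD)²) / (n - 1)), and a two-tailed critical value of 2.160 for 13 degrees of freedom at α = .05 taken from a standard t table. Neither the cutoff nor the variable names come from the slides.

```python
from math import sqrt

midterm = [48, 69, 95, 87, 50, 75, 74, 88, 92, 69, 75, 86, 73, 60]
final = [71.2, 73.3, 96, 94.2, 81.4, 86.7, 72.8, 88, 95, 88, 91.8, 93.6, 71.8, 80.1]

# Difference for each student (final minus midterm); the pairs must stay aligned.
diffs = [f - m for m, f in zip(midterm, final)]
n = len(diffs)
sum_d = sum(diffs)
sum_d_sq = sum(d * d for d in diffs)

t_obtained = sum_d / sqrt((n * sum_d_sq - sum_d ** 2) / (n - 1))
df = n - 1
critical_t = 2.160  # assumed: two-tailed, alpha = .05, 13 degrees of freedom

reject_null = abs(t_obtained) > critical_t
print(f"mean gain = {sum_d / n:.2f}, t = {t_obtained:.2f}, df = {df}")
print("reject Ho" if reject_null else "do not reject Ho")
```

Under these assumptions the obtained value exceeds the critical value, so the gain from midterm to final would be treated as statistically significant.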