Title: Chapter Thirteen Part I
1Chapter Thirteen Part I
- Hypothesis Testing
- Basic Concepts and Tests of Association,
- Chi-Square Tests
2Basic concepts - Example
- GEICO feels that consumers are bored with the gecko ad campaign (mean liking = 2 on a scale from 1 (strongly dislike) to 5 (strongly like)).
- GEICO wants to verify this feeling, so they take a sample and measure liking levels. The mean in the sample is 4.
- Should GEICO conclude that their feeling is wrong, or that the sample mean is a function of chance?
3Hypothesis Testing Basic Concepts
- Hypothesis: An assumption made about a population parameter (not a sample statistic)
- E.g. mean attitudes are 2, measured on a 1-5 scale
- Purpose of Hypothesis Testing: To make a judgment about the difference between the sample statistic and the population parameter
- The mechanism adopted to make this objective judgment is the core of hypothesis testing
4Hypothesis testing Logic
- Is the sample statistic a function of chance or luck rather than an accurate representation of the population parameter?
- Example
- Hypothesized mean attitudes are 2 (on a 1-5 scale)
- Observed mean sample attitudes are 4 (on a 1-5 scale)
- Is the difference between the two a chance event, or are we really wrong about our hypothesis?
- This is statistically evaluated.
5Problem Definition
- Clearly state the null and alternative hypotheses
- Choose the relevant test and the appropriate probability distribution
- Determine the significance level
- Determine the degrees of freedom
- Decide if one- or two-tailed test
- Choose the critical value
- Compute the relevant test statistic
- Compare test statistic and critical value
- Does the test statistic fall in the critical region?
- No: Do not reject null
- Yes: Reject null
61. Formulate Null and Alternative hypotheses
- Null hypothesis (Ho)
- The hypothesis of no difference between the population parameter and the sample statistic
- OR no relationship between two sample statistics
- A mirror image of the alternative (research) hypothesis
- Alternative hypothesis (Ha or H1): the hypothesis of differences or relationships
- Example
- Ho: Mean population attitudes = 2
- Ha: Mean population attitudes are not 2
72. Choose appropriate test and probability distribution
- Depends on whether we are
- Comparing means (Z distribution if population standard deviation is known; t distribution if population standard deviation is not known)
- Comparing frequencies (chi-square distribution)
83. Determine significance level
- The level at which we want to make a judgment about the population parameter (the null hypothesis)
- Generally 10%, 5%, 1% (corresponding to 90%, 95% and 99% confidence levels) in social sciences
- The level at which the critical test statistic is identified
94. Determine degrees of freedom
- Number of bits of unconstrained data available to calculate a sample statistic
- E.g. for X bar, d.f. is n; for s, d.f. is n-1, since 1 d.f. is lost due to the restriction that we need to calculate the mean first to calculate the standard deviation
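The n versus n-1 distinction can be illustrated with a small sketch. The scores below are hypothetical and serve only to show which divisor each statistic uses.

```python
import math
import statistics

# Hypothetical liking scores on a 1-5 scale (illustrative data only).
scores = [4, 5, 3, 4, 4, 5, 2, 4]
n = len(scores)

# The sample mean uses all n observations.
mean = sum(scores) / n

# Once the mean is fixed, only n - 1 deviations are free to vary,
# so the sample standard deviation divides by n - 1.
sum_sq_dev = sum((x - mean) ** 2 for x in scores)
s = math.sqrt(sum_sq_dev / (n - 1))

# statistics.stdev from the standard library uses the same n - 1 divisor.
print(n, mean, round(s, 3), round(statistics.stdev(scores), 3))
```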
105. Decide if it is a one / two tailed test
- One-Tailed test: If the Research Hypothesis is expressed directionally
- E.g. Head-On wants to test if consumers dislike their ad campaign (mean liking < 3 on a scale from 1 (strongly dislike) to 5 (strongly like)).
- Ho: Population mean attitudes are greater than or equal to 3.0
- Ha: Population mean attitudes are less than 3.0
- For confirmation of H1, look in the tail in the direction of the Research Hypothesis
115. Decide if it is a one / two tailed test
- Two-Tailed test: If the Research Hypothesis is expressed without direction
- E.g. Head-On wants to test if consumers feel differently about their ad campaign than they felt a year ago (mean liking = 4.5 on a scale from 1 (strongly dislike) to 5 (strongly like)).
- Ho: Population mean attitudes = 4.5
- Ha: Population mean attitudes are not equal to 4.5
- For confirmation of H1, look in the tails on both sides of the distribution
126. Find the critical test statistic
- Critical z value requires knowledge of level of significance
- Critical t value requires knowledge of level of significance and degrees of freedom
- Critical chi-square requires knowledge of level of significance and degrees of freedom
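As a rough illustration of how the significance level and the one- or two-tailed choice fix the critical z value, the sketch below uses Python's standard-library `statistics.NormalDist`. The helper name `critical_z` is an assumption made for this example. Critical t and chi-square values also depend on degrees of freedom and are usually read from a table.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed=True):
    """Critical z value for significance level alpha (hypothetical helper)."""
    # A two-tailed test splits alpha equally between both tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

# Print one-tailed and two-tailed critical values for common levels.
for alpha in (0.10, 0.05, 0.01):
    one_tailed = critical_z(alpha, two_tailed=False)
    two_tailed = critical_z(alpha)
    print(alpha, round(one_tailed, 3), round(two_tailed, 3))
```

At the same significance level, the two-tailed critical value is always larger than the one-tailed one, because only half of alpha sits in each tail.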
137. Criteria for rejecting / not rejecting H0
- Compute observed test statistic
- Compare critical test statistic with observed test statistic
- If the absolute value of the observed test statistic is greater than the critical test statistic, reject Ho
- If the absolute value of the observed test statistic is smaller than the critical test statistic, then Ho cannot be rejected.
- Regions of rejection / acceptance
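The rejection rule above can be sketched for a two-tailed one-sample z test. The hypothesized mean of 4.5 echoes the two-tailed example, but the sample mean, population standard deviation, and sample size are hypothetical values chosen only for illustration, and `z_test` is an assumed helper name.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed one-sample z test; returns (z, critical z, reject?)."""
    # Observed test statistic: distance from mu0 in standard-error units.
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Critical value with alpha split across both tails.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Reject Ho only if |observed| exceeds the critical value.
    return z, z_crit, abs(z) > z_crit

# Ho: population mean = 4.5. The sample figures below are hypothetical.
z, z_crit, reject = z_test(sample_mean=4.2, mu0=4.5, sigma=1.0, n=100)
print(round(z, 2), round(z_crit, 2), "reject Ho" if reject else "do not reject Ho")
```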
14Type 1 and Type 2 errors
                                Null hypothesis in population is
Data analysis conclusion is     True                  False
Reject null hypothesis          Type 1 error (alpha)  Correct decision
Do not reject null hypothesis   Correct decision      Type 2 error (beta)
15Type 1 and Type 2 errors
- The lower the confidence level, the greater the risk of rejecting a true H0: Type 1 error (alpha)
- i.e. if you reduce the confidence level from 95% to 90%, the chances of you declaring that the effect observed in the sample actually prevails in the population are higher.
- If the effect in reality does not exist in the population, then you increase the risk of committing a Type 1 error.
- Therefore, in a Type 1 error you declare an effect which does not exist
16Type 1 and Type 2 errors
- The higher the confidence level, the greater the risk of accepting a false H0: Type 2 error (beta)
- i.e. if you increase the confidence level from 95% to 99%, the chances that you miss an effect which may actually be there in the population are higher.
- The power of the test to spot the effect is reduced
- Therefore power = 1 - beta
- Therefore, in a Type 2 error you miss an effect which exists
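The trade-off between alpha and beta can be made concrete with a small sketch. It assumes a one-tailed (upper) z test for simplicity. The effect size, standard deviation, and sample size are hypothetical, and `power_upper_tail` is an assumed helper name.

```python
import math
from statistics import NormalDist

def power_upper_tail(effect, sigma, n, alpha):
    """Power (1 - beta) of a one-tailed upper z test against a true shift."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under the alternative, the z statistic is centred on effect / standard error.
    shift = effect / (sigma / math.sqrt(n))
    # beta: probability the statistic stays below the critical value (effect missed).
    beta = std_normal.cdf(z_crit - shift)
    return 1 - beta

# Hypothetical scenario: the true mean is 0.3 above the hypothesized mean.
for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(power_upper_tail(effect=0.3, sigma=1.0, n=50, alpha=alpha), 3))
```

In this sketch, moving alpha from .10 to .01 (a higher confidence level) lowers the power, which matches the point that a stricter test is more likely to miss an effect that exists.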
17Hypothesis Testing
- Tests in this class, with the statistical test used for each
- Frequency Distributions: χ²
- Means (one): z (if σ is known), t (if σ is unknown)
- Means (two): t
- Means (more than two): ANOVA
18Chi-Square as a test of independence
- Statistical Independence: if knowledge of one does not influence the outcome of the other
- E.g. Affiliation to school (nominally scaled) does not influence decision to eat at the student union
- Expected Value: The average value in a cell if the sampling procedure is repeated many times
- Observed Value: The value in the cell in one sampling procedure
- Only nominal / categorical variables
19Chi-square Step-by-Step
20Chi-Square As a Test of Independence
- Null Hypothesis Ho
- Two (nominally scaled) variables are statistically independent
- There is no relationship between school affiliation and decision to eat at the student union
- Alternative Hypothesis Ha
- The two variables are not independent
- School affiliation does influence the decision to eat at the student union
21Chi-square As a Test of Independence (Contd.)
- Chi-square Distribution
- A probability distribution for categorical data
- Total area under the curve is 1.0
- A different chi-square distribution is associated
with different degrees of freedom
22The chi-square distribution
23Chi-square Step-by-Step
- 1) Formulate Hypotheses
- 2) Calculate row and column totals
- 3) Calculate row and column proportions
- 4) Calculate expected frequencies (Ei)
- 5) Calculate χ² statistic
24Chi-square Statistic (χ²)
- Measures the difference between the actual numbers observed in cell i (Oi) and the number expected (Ei) under independence if the null hypothesis were true
- $\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$
- With (r-1)(c-1) degrees of freedom
- r = number of rows, c = number of columns
- Expected frequency in each cell: $E_i = p_c \times p_r \times n$
- Where pc and pr are the column and row proportions for the independent variables and n is the total number of observations
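The statistic and its degrees of freedom can be sketched in a few lines of plain Python. The function name `chi_square` and the 2 x 2 table are assumptions made for illustration. Expected frequencies follow Ei = pc × pr × n as defined above.

```python
def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Ei = pr * pc * n, with pr = row total / n and pc = column total / n.
            e = (row_totals[i] / n) * (col_totals[j] / n) * n
            stat += (o - e) ** 2 / e
    # Degrees of freedom: (rows - 1) * (columns - 1).
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df

# Hypothetical 2 x 2 table, used only to exercise the function.
stat, df = chi_square([[10, 20], [20, 10]])
print(round(stat, 3), df)
```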
25Chi-square Step-by-Step
- 1) Formulate Hypotheses
- 2) Calculate row and column totals
- 3) Calculate row and column proportions
- 4) Calculate expected frequencies (Ei)
- 5) Calculate χ² statistic
- 6) Calculate degrees of freedom
26Chi-square As a Test of Independence (Contd.)
- Degrees of Freedom
- v = (r - 1)(c - 1)
- r = number of rows in contingency table
- c = number of columns
27Chi-square Step-by-Step
- 1) Formulate Hypotheses
- 2) Calculate row and column totals
- 3) Calculate row and column proportions
- 4) Calculate expected frequencies (Ei)
- 5) Calculate χ² statistic
- 6) Calculate degrees of freedom
- 7) Obtain Critical Value from table
28The chi-square distribution
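[Figure: chi-square distribution curve for df = 4, with χ² on the horizontal axis and F(χ²) on the vertical axis; the critical value 9.49 marks off the upper 5% of the area under the curve (α = .05)]
- Ex: Significance level = .05
- Degrees of freedom = 4
- CVχ² = 9.49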
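The tabled critical value can be checked numerically. The sketch below is not how the table was produced: it assumes the closed-form upper-tail area that holds for even degrees of freedom and finds the cut-off by bisection. Both function names are hypothetical.

```python
import math

def chi2_upper_tail(x, df):
    """Area to the right of x under the chi-square curve (even df only)."""
    if df % 2 != 0:
        raise ValueError("this closed form only covers even degrees of freedom")
    half = x / 2
    # For df = 2m the upper-tail area is exp(-x/2) * sum_{k<m} (x/2)^k / k!.
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

def chi2_critical(alpha, df, hi=200.0):
    """Find the value whose upper-tail area equals alpha, by bisection."""
    lo = 0.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_upper_tail(mid, df) > alpha:
            lo = mid  # too much area to the right: move the cut-off up
        else:
            hi = mid
    return (lo + hi) / 2

print(round(chi2_critical(0.05, 4), 2))
```

For a significance level of .05 and 4 degrees of freedom, the result rounds to the 9.49 quoted above.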
29Chi-square Step-by-Step
- 1) Formulate Hypotheses
- 2) Calculate row and column totals
- 3) Calculate row and column proportions
- 4) Calculate expected frequencies (Ei)
- 5) Calculate χ² statistic
- 6) Calculate degrees of freedom
- 7) Obtain Critical Value from table
- 8) Make decision regarding the Null-hypothesis
30Example of Chi-square as a Test of Independence
Observed frequencies (eat at the student union: Y = eat, N = don't eat)
School    Y    N
A        10    8
B        20   16
C        45   18
D        16    6
E         9    2
- Each entry in the table is a cell; the number in it is the observed value
31Chi-square example
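- Expected frequency for the cell (School B, Y): E = pr × pc × n = 0.24 × 0.67 × 150 ≈ 24
- pr = 36/150 = 0.24 (row proportion for School B); pc = 100/150 ≈ 0.67 (column proportion for Y)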
32Chi-square example
- Observed chi-square = (10 - 12)²/12 + (8 - 6)²/6 + (20 - 24)²/24 + ... + (2 - 4)²/4 = 5.42
- d.f. = (r-1)(c-1) = (5-1)(2-1) = 4
- Critical chi-square at 5% level of significance at 4 degrees of freedom = 9.49
- Since observed chi-square (5.42) < critical chi-square (9.49), the null hypothesis cannot be rejected
- Hence the decision to eat / not eat at the student union is statistically independent of school affiliation. In other words, there is no relationship between the decision to eat at the SU and the school they are in.
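The worked example can be reproduced from the observed counts in the example table. This sketch computes the statistic twice: once with exact expected frequencies and once with each expected frequency rounded to a whole number. The second calculation appears consistent with the 5.42 quoted above. The function name and the rounding comparison are assumptions made for illustration.

```python
# Observed counts from the example: rows are schools A-E, columns are (Y, N).
observed = {"A": (10, 8), "B": (20, 16), "C": (45, 18), "D": (16, 6), "E": (9, 2)}

col_totals = [sum(counts[j] for counts in observed.values()) for j in range(2)]
n = sum(col_totals)

def chi_square_total(round_expected):
    """Sum (O - E)^2 / E over all cells, optionally rounding each E first."""
    total = 0.0
    for counts in observed.values():
        row_total = sum(counts)
        for j, o in enumerate(counts):
            # Expected frequency: row total * column total / n.
            e = row_total * col_totals[j] / n
            if round_expected:
                e = round(e)
            total += (o - e) ** 2 / e
    return total

exact = chi_square_total(round_expected=False)
rounded = chi_square_total(round_expected=True)
df = (len(observed) - 1) * (2 - 1)
critical = 9.49  # table value quoted above for alpha = .05, df = 4

print(round(exact, 2), round(rounded, 2), df, exact < critical)
```

With exact expected frequencies the statistic is about 5.14 rather than 5.42. Both values fall below 9.49, so the decision not to reject the null hypothesis is the same either way.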
33The chi-square distribution
- The decision rule when testing hypotheses by means of the chi-square distribution is:
- If χ² > CVχ², reject H0
- If χ² ≤ CVχ², accept H0