Introduction to Formal Inference - PowerPoint PPT Presentation

Provided by: amy53

1
Chapter 6
  • Introduction to Formal Inference
  • Part II Hypothesis Tests for µ

2
5.5 Functions of Several Random Variables
  • Let X1, X2, ..., Xn be iid N(5, 10) random
    variables (mean 5, variance 10). Then
  • $\bar{X} \sim N(5, 10/n)$
  • What is the probability of observing a sample
    mean that is between 4 and 6 in a sample of size
    n = 10?

3
5.5 Functions of Several Random Variables
  • What is the probability of observing a sample
    mean that is between 4 and 6 in a sample of size
    n = 90?

4
5.5 Functions of Several Random Variables
  • If you had taken a sample of 90 people and found
    that their sample mean was less than 4 or greater
    than 6, what might you conclude?
  • There was a 99.74% chance of being between 4 and
    6
  • Maybe I just got a REALLY rare sample of 90
    people
  • -OR-
  • Maybe the population mean I started with doesn't
    reflect the population like I thought it did
    (more likely scenario)
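
The two probabilities asked for on the preceding slides can be checked numerically. This is a minimal sketch using only Python's standard library; the helper name `prob_mean_between` is made up for illustration, and the second parameter of N(5, 10) is treated as a variance, which is what N(5, 10/n) implies.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_between(lo, hi, mu, var, n):
    """P(lo < Xbar < hi) when Xbar ~ N(mu, var / n)."""
    # NormalDist takes a standard deviation, so convert the variance.
    xbar_dist = NormalDist(mu, sqrt(var / n))
    return xbar_dist.cdf(hi) - xbar_dist.cdf(lo)

p10 = prob_mean_between(4, 6, mu=5, var=10, n=10)
p90 = prob_mean_between(4, 6, mu=5, var=10, n=90)
print(f"n = 10: P(4 < Xbar < 6) = {p10:.4f}")
print(f"n = 90: P(4 < Xbar < 6) = {p90:.4f}")
```

For n = 10 the probability is about 0.68, and for n = 90 it is about 0.997. The unrounded value for n = 90 differs from the slide's figure only in the fourth decimal place, which is consistent with the slide having used rounded table values.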

5
6.2 Significance (Hypothesis) Testing
  • Motivating Example (continued)
  • Recall: A simple random sample of n = 25
    sections of ¼-inch wire rope yielded 15 tons as
    the sample average of the breaking strength.
    Also, we somehow know or believe that σ² = 36
    (i.e. σ = 6 tons).
  • How would you answer a buyer who asks, "Are you
    sure that the breaking strength isn't 10 tons?"

6
6.2 Significance (Hypothesis) Testing
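  • Here's how we might convince them:
  • Suppose the true mean breaking strength is 10
    tons, i.e. µ = 10. Then the probability of
    obtaining an SRS with a mean of 15 or higher is
  • $P(\bar{X} \ge 15 \mid \mu = 10) = P\left(Z \ge \frac{15 - 10}{6/\sqrt{25}}\right) = P(Z \ge 4.17) \approx 0$
  • (It is actually 1 - 0.9999848 ≈ 1.523 × 10^-5)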
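
As a hedged check of this tail probability, the sketch below recomputes it with the standard library. It assumes σ = 6 (the square root of the stated 36) and a normally distributed sample mean; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 10.0     # breaking strength assumed by the buyer (tons)
sigma = 6.0    # assumed known population sd, sqrt(36)
n = 25         # number of rope sections sampled
xbar = 15.0    # observed sample mean (tons)

# Standardize the sample mean under the assumption mu = mu0.
z = (xbar - mu0) / (sigma / sqrt(n))
# Upper-tail probability of a standard normal beyond z.
tail = 1.0 - NormalDist().cdf(z)
print(f"z = {z:.2f}, P(Xbar >= 15 | mu = 10) = {tail:.2e}")
```

The unrounded z is about 4.17 and the tail probability is on the order of 1.5 × 10^-5. The slide's figure appears to come from rounding z to 4.17 first, so the last digits can differ slightly.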

7
6.2 Significance (Hypothesis) Testing
  • So we have in fact observed an event (i.e. a
    sample mean of 15 or higher) that is very rare if
    the assumption that µ = 10 is true. Therefore,
    this assumption is probably not plausible.
  • Definition: Significance testing is the use of
    data in the quantitative assessment of the
    plausibility of a proposed value for a parameter

8
6.2 Significance (Hypothesis) Testing
  • Definition: The Null Hypothesis (null, H0) is a
    statement that the parameter is equal to some
    number (usually a number that we want to
    disprove, chosen based on history or experience)
  • Note: "Null" refers to the fact that the
    hypothesis is a statement of no difference
  • E.g. H0: µ = 0
  • or H0: µ = 100 (which is equivalent to
    H0: µ - 100 = 0)

9
6.2 Significance (Hypothesis) Testing
  • Definition: the Alternative Hypothesis
    (alternative, HA, H1) is a statement of
    opposition to the null hypothesis (usually
    reflects what we really believe is true about the
    parameter compared to H0)
  • Note: the alternative is a statement that creates
    a one-sided or two-sided test
  • E.g. H0: µ = 0 vs HA: µ ≠ 0 is a two-sided test
  • H0: µ = 100 vs HA: µ < 100 is a one-sided
    test
  • H0: µ = 7 vs HA: µ > 7 is a one-sided test

10
6.2 Significance (Hypothesis) Testing
  • Determine the null and alternative hypotheses
    for the following.
  • 1. According to the United States Department of
    Agriculture, the mean farm rent in Indiana was
    $89.00 per acre in 1995. A researcher for the
    USDA claims that the mean rent has decreased
    since then.
  • H0: µ = 89 (i.e. µ0 = 89) vs HA: µ < 89
  • 2. According to the United States Energy
    Information Administration, the mean expenditure
    for residential energy consumption was $1338 in
    1997. An economist claims that the mean
    expenditure for residential energy is different
    today.
  • H0: µ = 1338 (i.e. µ0 = 1338) vs HA: µ ≠ 1338

11
6.2 Significance (Hypothesis) Testing
  • Definition: a test statistic is a formula that
    summarizes the data under the null hypothesis
    (i.e. assuming that H0 is true)
  • E.g. for testing H0: µ = 0 vs HA: µ ≠ 0, we might
    use the sample mean, $\bar{X}$, since it is our
    estimate of µ, or
  • $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
  • where µ0 is the value of µ specified by H0

12
6.2 Significance (Hypothesis) Testing
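  • Definition: the Reference (or null) Distribution
    is the probability distribution of the test
    statistic under the null hypothesis
  • e.g.
  • $\bar{X} \sim N(\mu_0, \sigma^2/n)$ for the test statistic $\bar{X}$
  • $Z \sim N(0, 1)$ for the test statistic $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$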

13
6.2 Significance (Hypothesis) Testing
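  • Definition: a p-value is the (conditional)
    probability of observing a test statistic that is
    as extreme or more extreme than what is actually
    observed, given that the null hypothesis is true
  • What this means depends on HA
  • E.g. for testing H0: µ = 0 vs HA: µ ≠ 0 using
    $Z = \frac{\bar{X} - 0}{\sigma/\sqrt{n}}$, the
    p-value is $P(|Z| > |z| \mid \mu = 0)$
  • where Z ~ N(0,1) and z is the observed value of
    the test statistic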

14
6.2 Significance (Hypothesis) Testing
  • P-value illustration
  • [Figure: standard normal curve with both tails
    shaded. Values below -z are "more extreme in the
    negative end"; values above z are "more extreme
    in the positive end".]

15
6.2 Significance (Hypothesis) Testing
  • P-value calculations for three alternative
    hypotheses
  • H0: µ = µ0 vs HA: µ ≠ µ0 (two-sided alternative)
  • [Figure: both tails shaded, below -z and above z]
  • P-value = $P(|Z| > |z| \mid \mu = \mu_0)$
  • = $2P(Z < -|z|)$
  • = $2P(Z > |z|)$

16
6.2 Significance (Hypothesis) Testing
  • P-value calculations for three alternative
    hypotheses
  • H0: µ = µ0 vs HA: µ > µ0 (one-sided alternative)
  • [Figure: upper tail shaded, above z]
  • Note: if HA: µ > µ0, then the test statistic had
    better be positive (i.e. z > 0) for the data to
    support HA
  • P-value = $P(Z > z \mid \mu = \mu_0)$

17
6.2 Significance (Hypothesis) Testing
  • P-value calculations for three alternative
    hypotheses
  • H0: µ = µ0 vs HA: µ < µ0 (one-sided alternative)
  • [Figure: lower tail shaded, below z]
  • Note: if HA: µ < µ0, then the test statistic had
    better be negative (i.e. z < 0) for the data to
    support HA
  • P-value = $P(Z < z \mid \mu = \mu_0)$
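
The three p-value rules can be collected into one small helper. This is a sketch rather than anything from the slides: the function names `z_statistic` and `p_value` and the labels `"two-sided"`, `"greater"` and `"less"` are made up for illustration, and a known σ with a N(0,1) reference distribution is assumed.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(xbar, mu0, sigma, n):
    """Observed z for H0: mu = mu0 with known sigma."""
    return (xbar - mu0) / (sigma / sqrt(n))

def p_value(z, alternative):
    """P-value of an observed z under Z ~ N(0, 1).

    alternative: "two-sided" (HA: mu != mu0),
                 "greater"   (HA: mu > mu0),
                 "less"      (HA: mu < mu0).
    """
    std = NormalDist()
    if alternative == "two-sided":
        return 2.0 * std.cdf(-abs(z))   # both tails beyond |z|
    if alternative == "greater":
        return 1.0 - std.cdf(z)         # upper tail beyond z
    if alternative == "less":
        return std.cdf(z)               # lower tail below z
    raise ValueError(f"unknown alternative: {alternative!r}")

z = 1.5  # hypothetical observed value
for alt in ("two-sided", "greater", "less"):
    print(f"{alt:>9}: p-value = {p_value(z, alt):.4f}")
```

Note that the same z gives very different p-values depending on HA: a positive z is evidence for µ > µ0 but none at all for µ < µ0.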

18
6.2 Significance (Hypothesis) Testing
  • Example: Suppose we know that the standard
    deviation for the weight of a bag of M&Ms
    (labeled 10 oz) is 1.0 ounce. We want to test
    the hypothesis that the mean weight of all the
    M&M bags under consideration is 10 ounces. So we
    randomly sampled 30 bags. The mean weight of the
    bags is 10.2 ounces.
  • So we know
  • µ0 = 10, and we want to test if this is true
  • σ = 1
  • $\bar{x} = 10.2$
  • n = 30

19
6.2 Significance (Hypothesis) Testing
  • State the null hypothesis and alternative
    hypothesis and calculate the observed value of
    the test statistic.
  • H0: µ = 10 vs HA: µ ≠ 10 - or -
  • H0: µ - 10 = 0 vs HA: µ - 10 ≠ 0
  • $z = \frac{10.2 - 10}{1/\sqrt{30}} \approx 1.095$

20
6.2 Significance (Hypothesis) Testing
  • Calculate the probability of observing a value
    larger than the test statistic in magnitude, i.e.
    the p-value.
  • We have a two-sided alternative so
  • P-value = $P(|Z| > |z|)$ where Z ~ N(0,1)
  • = $2P(Z < -|z|)$
  • = 2P(Z < -1.095) (round to 2 digits)
  • = 2P(Z < -1.10)
  • = 2(0.1357) = 0.2714

21
6.2 Significance (Hypothesis) Testing
  • What would you be willing to conclude? Why?
  • p-value = 0.2714
  • 27.14% chance of observing a more
    extreme value than what we did observe based
    on our sample of 30 bags (if µ = 10)
  • µ = 10 is fairly plausible

22
6.2 Significance (Hypothesis) Testing
  • Would it make a difference if the mean weight for
    the sample of 30 bags was 10.4 ounces instead?
  • P-value = $2P(Z < -|z|)$
  • = 2P(Z < -2.19)
  • = 2(0.0143) = 0.0286
  • 2.86% chance of observing a more
    extreme value than what we observed (if µ =
    10)
  • µ = 10 is probably not a plausible value
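
Both versions of the M&M example (sample mean 10.2 and 10.4) can be recomputed without a z-table. The sketch below assumes σ = 1 is known and uses only the standard library; the `results` dictionary is just a convenience for this illustration.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 10.0, 1.0, 30   # hypothesized mean, known sd, sample size

results = {}
for xbar in (10.2, 10.4):
    z = (xbar - mu0) / (sigma / sqrt(n))
    # Two-sided alternative: count both tails beyond |z|.
    p = 2.0 * NormalDist().cdf(-abs(z))
    results[xbar] = (z, p)
    print(f"xbar = {xbar}: z = {z:.3f}, p-value = {p:.4f}")
```

The slides round z to two digits before the table lookup, so the unrounded p-values printed here differ from 0.2714 and 0.0286 in the third decimal place. The conclusions are unchanged.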

23
6.2 Significance (Hypothesis) Testing
  • This is the general process used for doing a
    hypothesis test
  • Recall from 10.1 that for large n, we can use the
    sample standard deviation in place of the
    population standard deviation (s instead of σ)

24
6.2 Significance (Hypothesis) Testing
  • Five-Step Format for Significance Testing (NOT
    the same as in the book!)
  • Step 1: State the null and alternative hypotheses
  • Step 2: Compute the test statistic and state its
    distribution
  • Step 3: Compute the p-value
  • Step 4: Make a decision (compare steps 2 and 3)
  • Step 5: State your conclusion

25
6.2 Significance (Hypothesis) Testing
  • How do you make a decision?
  • Researchers typically think of p-value in terms
    of their level of significance (this gets hazy)
  • In terms of hypothesis testing, we'll want to
    compare the p-value to α

26
6.2 Significance (Hypothesis) Testing
  • Example 2 (continued): repeat part (d), but
  • Use n = 100 (sample mean still 10.4)
  • p-value = 2P(Z < -4) ≈ 0
  • Use a mean of 10.05 and n = 4,000
  • p-value = 2P(Z < -3.16)
  • = 0.0016

27
6.2 Significance (Hypothesis) Testing
  • Note that with a very large sample, one can
    make any difference significant according to
    the chart. We must ask ourselves, is the
    difference meaningful?
  • Rather than try to figure out a scale for the
    p-value, compare it to the level of significance,
    α
  • Reject H0 (and conclude µ = µ0 is not reasonable)
    if p-value < α
  • Fail to reject H0 (FTR, and conclude µ = µ0 is
    reasonable) otherwise
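
To illustrate how sample size drives significance, the sketch below applies the decision rule to the same small difference (a sample mean of 10.05 against µ0 = 10, σ = 1) at two sample sizes. The n = 4,000 case is from the example above; the n = 30 case and the helper names `two_sided_p` and `decide` are assumptions added for comparison.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(xbar, mu0, sigma, n):
    """Two-sided p-value for H0: mu = mu0 with known sigma."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return 2.0 * NormalDist().cdf(-abs(z))

def decide(p_value, alpha=0.05):
    """Reject H0 when the p-value is below alpha, else fail to reject."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

mu0, sigma, xbar = 10.0, 1.0, 10.05
for n in (30, 4000):   # n = 30 is a hypothetical comparison case
    p = two_sided_p(xbar, mu0, sigma, n)
    print(f"n = {n:>4}: p-value = {p:.4f} -> {decide(p)}")
```

The observed difference of 0.05 ounces is the same in both runs. Only the sample size changes the decision, which is why the question "is the difference meaningful?" has to be asked separately.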

28
6.2 Significance (Hypothesis) Testing
                        Reality: Null Hypothesis is
  Action                True                False
  Reject H0             Type I Error        Correct decision
  Fail to Reject H0     Correct decision    Type II Error

The level of significance, α, is the probability
of making a Type I error.

29
6.2 Significance (Hypothesis) Testing
  • Definition: a Type I Error is rejecting H0 when
    H0 is true
  • The probability of making a Type I Error is
    typically denoted α; it is the cutoff that the
    p-value is compared to, not the p-value itself
  • We want this to be low, typically α = 0.01, 0.05
    or 0.10 (similar to CIs)
  • If no value is specified, use α = 0.05
  • Definition: a Type II Error is failing to reject
    H0 when H0 is false
  • Generally considered not as bad as a Type I
    Error (hence why we compare everything to α)

30
6.2 Significance (Hypothesis) Testing
  • Criminal Justice Analogy for Understanding Type I
    vs. Type II Error
  •   In a criminal trial, we assume that the
    defendant is innocent until proven guilty.
    Therefore, we have H0: The defendant is innocent
    vs.
  • HA: The defendant is guilty
  • A Type I Error would occur if the defendant was
    innocent but found guilty.
  • A Type II Error would occur if the defendant was
    guilty but was found not guilty.

31
6.2 Significance (Hypothesis) Testing
  • Note that in the judicial system, we never say
    that a defendant was found innocent. Instead,
    our system decides if they are proven guilty, and
    if not, we say that they are not guilty.
  • Similarly, statisticians say "fail to reject the
    null hypothesis" instead of saying "accept the
    null hypothesis".

32
6.2 Significance (Hypothesis) Testing
  • Since we do not want an innocent defendant to be
    found guilty, strong evidence has to be presented
    to convict them. In the same way, we don't want
    to reject the null hypothesis unless there is
    strong evidence to reject it.
  • However, the stronger the evidence we require for
    convicting a defendant, the more likely a guilty
    defendant will walk away after being declared not
    guilty. Similarly, as we lower the risk of
    making a Type I Error, we increase the risk of
    making a Type II Error.
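
The trade-off can be made concrete for a one-sided z-test of H0: µ = µ0 vs HA: µ > µ0. The numbers below (µ0 = 10, a true mean of 10.5, σ = 1, n = 10) are hypothetical, chosen only to illustrate the effect, and `type2_error` is a name made up for this sketch.

```python
from math import sqrt
from statistics import NormalDist

def type2_error(alpha, mu0, mu_true, sigma, n):
    """P(fail to reject H0 | mu = mu_true) for H0: mu = mu0 vs HA: mu > mu0.

    The test rejects when z exceeds the upper-alpha critical value.
    """
    std = NormalDist()
    z_crit = std.inv_cdf(1.0 - alpha)
    # Under mu_true, the z statistic is N(shift, 1) instead of N(0, 1).
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    return std.cdf(z_crit - shift)

betas = {}
for alpha in (0.10, 0.05, 0.01):
    betas[alpha] = type2_error(alpha, mu0=10.0, mu_true=10.5, sigma=1.0, n=10)
    print(f"alpha = {alpha:.2f} -> P(Type II Error) = {betas[alpha]:.3f}")
```

With the sample size held fixed, each reduction in α raises the probability of a Type II Error, just as demanding stronger evidence lets more guilty defendants go free.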

33
6.2 Significance (Hypothesis) Testing
  • Example (using the 5 steps) A researcher claims
    that the average age of a woman before she has
    her first child is greater than the 1990 mean age
    of 24.6 years, on the basis of data obtained from
    the National Vital Statistics Report, Vol. 48,
    No. 14. She obtains a simple random sample of 40
    women who gave birth to their first child in 1999
    and finds the sample mean age to be 27.1 years.
    Assume that the population standard deviation is
    6.4 years. Test the researcher's claim, using
    the classical approach at the α = 0.05 level of
    significance.

34
6.2 Significance (Hypothesis) Testing
  • Step 1: H0: µ = 24.6 vs HA: µ > 24.6
  • Step 2: $z = \frac{27.1 - 24.6}{6.4/\sqrt{40}} \approx 2.47$, where Z ~ N(0,1)
  • Step 3: one-sided alternative so
  • p-value = P(Z > z) = P(Z > 2.47)
  • = 1 - 0.9932 = 0.0068

35
6.2 Significance (Hypothesis) Testing
  • Step 4: p-value = 0.0068, α = 0.05
  • p-value < α, so Reject H0
  • Step 5: At the α = 0.05 level of significance,
    there is significant evidence to reject H0 and
    conclude the average age of a woman when she has
    her first child has increased since 1990
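
The five steps of this example can be traced in a few lines. This is a sketch assuming the known σ = 6.4 and a normal reference distribution; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: H0: mu = 24.6 vs HA: mu > 24.6
mu0, sigma, n, xbar, alpha = 24.6, 6.4, 40, 27.1, 0.05

# Step 2: test statistic, with reference distribution N(0, 1)
z = (xbar - mu0) / (sigma / sqrt(n))

# Step 3: one-sided (upper-tail) p-value
p_value = 1.0 - NormalDist().cdf(z)

# Step 4: decision by comparing the p-value to alpha
decision = "reject H0" if p_value < alpha else "fail to reject H0"

# Step 5: report
print(f"z = {z:.2f}, p-value = {p_value:.4f}, alpha = {alpha}: {decision}")
```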

36
6.2 Significance (Hypothesis) Testing
  • Example The thickness of metal wires used in the
    manufacture of silicon wafers is assumed to be
    normally distributed with mean µ . To monitor the
    production process, the thickness of 40 wires is
    taken. The output is considered unacceptable if
    the mean differs from the target value of 10. The
    40 measurements yield a sample mean of 10.2 and
    sample standard deviation of 1.2. Conduct the
    appropriate statistical test and state its
    implications for the problem. Use an α = 0.1
    level of significance.

37
6.2 Significance (Hypothesis) Testing
  • Step 1: H0: µ = 10 vs HA: µ ≠ 10
  • Step 2: $z = \frac{10.2 - 10}{1.2/\sqrt{40}} \approx 1.05$, where Z ~ N(0,1)
  • Step 3: two-sided alternative so
  • p-value = $P(|Z| > |z|)$ = 2P(Z < -1.05)
  • = 2(0.1469) = 0.2938

38
6.2 Significance (Hypothesis) Testing
  • Step 4: p-value = 0.2938, α = 0.10
  • p-value > α, so FTR H0
  • Step 5: At the α = 0.10 level of significance,
    there is not enough significant evidence to
    reject H0 and conclude the output is unacceptable
  • (i.e. the output is acceptable)
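
A short numerical check of the wire-thickness example follows. It is a sketch that uses the sample standard deviation s = 1.2 in place of σ, relying on the large-n approximation mentioned earlier (n = 40); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, s, n, xbar, alpha = 10.0, 1.2, 40, 10.2, 0.10

# Large-n z statistic with s standing in for the unknown sigma.
z = (xbar - mu0) / (s / sqrt(n))
# Two-sided alternative (HA: mu != 10): both tails count.
p_value = 2.0 * NormalDist().cdf(-abs(z))
decision = "reject H0" if p_value < alpha else "fail to reject H0"

print(f"z = {z:.3f}, p-value = {p_value:.4f}, alpha = {alpha}: {decision}")
```

The unrounded p-value is slightly below the table-based 0.2938 because z is not rounded to 1.05 first. It is still well above α = 0.10.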

39
6.2 Significance (Hypothesis) Testing
  • Question: Does hypothesis testing have anything
    in common with confidence intervals?
  • Answer: YES!
  • Suppose we are considering a confidence level of
    95%. Then all values located within this
    confidence interval,
  • $\bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}}$,
  • are values that, when assumed to be the true
    mean value (think null hypothesis), will produce a
    two-sided p-value of 0.05 or greater.

40
6.2 Significance (Hypothesis) Testing
  • In other words, H0: µ = µ0, where µ0 is some
    number in the interval above, implies
    p-value ≥ 0.05
  • So any value not in the interval is going to
    provide enough evidence against the null
    hypothesis. If a certain value is in the
    interval, we will have little or no evidence
    against the null hypothesis.
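
The equivalence between "inside the interval" and "p-value at least α" can be written out for the two-sided z-test. In this sketch, $z_{\alpha/2}$ denotes the upper α/2 critical value of N(0,1) (1.96 for α = 0.05); that symbol is notation introduced here, not taken from the slides.

```latex
\begin{align*}
\text{p-value} \ge \alpha
  &\iff 2\,P(Z > |z|) \ge \alpha \\
  &\iff |z| \le z_{\alpha/2} \\
  &\iff \left| \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \right| \le z_{\alpha/2} \\
  &\iff \bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}
        \;\le\; \mu_0 \;\le\;
        \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}
\end{align*}
```

The last line says exactly that µ0 lies in the (1-α)100% confidence interval, so failing to reject H0 at level α and µ0 being inside the interval are the same event.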

41
6.2 Significance (Hypothesis) Testing
  • Since repeated applications of forming
    (1-α)100% C.I.s result in the true mean being
    bracketed by the C.I. (1-α)100% of the time, we
    know that (α)100% of the time, the C.I. will not
    capture the true mean. When the true mean is not
    captured by the C.I., we have evidence against
    H0, which will lead to rejecting the null
    hypothesis when the null hypothesis is in fact
    true (Type I Error).
  • So, for a (1-α)100% confidence interval, α
    represents the probability of making a Type I
    Error.
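
The repeated-sampling argument can be checked by simulation. The sketch below draws many samples with H0 true and counts how often the interval misses the true mean. All settings (µ0 = 10, σ = 1, n = 10, 20,000 trials, the seed) and the function name `type1_error_rate` are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def type1_error_rate(mu0, sigma, n, alpha, trials, seed=0):
    """Fraction of samples, drawn with H0 true, whose two-sided
    (1 - alpha) z-interval fails to capture mu0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z_crit * sigma / sqrt(n)
    misses = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        xbar = fmean(sample)
        # The interval xbar +/- half_width misses mu0 exactly
        # when the two-sided test would reject H0.
        if abs(xbar - mu0) > half_width:
            misses += 1
    return misses / trials

rate = type1_error_rate(mu0=10.0, sigma=1.0, n=10, alpha=0.05, trials=20_000)
print(f"Observed Type I Error rate: {rate:.4f} (nominal alpha = 0.05)")
```

The observed rate should land close to the nominal α, with the remaining gap due to simulation noise.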

42
6.2 Significance (Hypothesis) Testing
  • Example: Air bags were tested to determine the
    pressure present in the air bags 40 milliseconds
    after releasing the air bag. Suppose 50 bags
    were tested, the mean pressure is 6.5 psi and the
    standard deviation is 0.25 psi.
  • Determine plausible (likely) values for the
    population mean given an 80% confidence level.

43
6.2 Significance (Hypothesis) Testing
  • Is there clear evidence that the mean pressure
    for all air bags under consideration is not 6.5
    psi?
  • Since 6.5 is in the computed 80% confidence
    interval, at the α = 0.2 level of significance
    we would conclude it is a plausible value for
    the mean, so we would FTR H0: µ = 6.5
  • Is there clear evidence that the mean pressure
    for all air bags under consideration is not 6
    psi?
  • Since 6 is not in the confidence interval, at
    the α = 0.2 level of significance we would
    conclude it is not a plausible value for the
    mean, so we would Reject H0: µ = 6
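
The 80% interval for the air bag example is not shown in the transcript, so here is a sketch of how it could be computed. It assumes the large-sample z-interval with the sample standard deviation 0.25 standing in for σ (n = 50); the helper `is_plausible` is a made-up name.

```python
from math import sqrt
from statistics import NormalDist

xbar, s, n, conf = 6.5, 0.25, 50, 0.80
alpha = 1.0 - conf

# Two-sided interval: alpha is split between the two tails.
z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
half_width = z_crit * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
print(f"80% CI for the mean pressure: ({lower:.3f}, {upper:.3f}) psi")

def is_plausible(mu0):
    """True when mu0 lies in the interval, i.e. FTR H0: mu = mu0 at level alpha."""
    return lower <= mu0 <= upper

for mu0 in (6.5, 6.0):
    verdict = "FTR H0" if is_plausible(mu0) else "Reject H0"
    print(f"H0: mu = {mu0}: {verdict}")
```

Under these assumptions the interval is roughly 6.45 to 6.55 psi, which contains 6.5 and excludes 6, matching the two conclusions above.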

44
6.2 Significance (Hypothesis) Testing
  • Going the opposite way, suppose we had performed
    the hypothesis test for H0: µ = 6.475 vs
    HA: µ ≠ 6.475 and we compute a p-value of 0.15.
  • Would we conclude that 6.475 would be inside an
    80% CI?
  • No, because α = 0.2 > p-value = 0.15, so we would
    reject H0 and conclude 6.475 is not a plausible
    value for the mean and thus would not be inside
    an 80% CI.
  • What about a 90% confidence interval?
  • Yes, because α = 0.1 < p-value = 0.15, so we would
    FTR H0 and conclude 6.475 is a plausible value
    for the mean, so it would be contained in a 90%
    CI.

45
6.2 Significance (Hypothesis) Testing
  • Note: you have to be careful what type of interval
    you are using with what type of alternative
    hypothesis
  • HA: µ ≠ µ0: α is split in two, so we would
    compare it to a two-sided confidence interval
    (upper and lower bounds)
  • HA: µ < µ0 or HA: µ > µ0: α is NOT split in
    two, so we would compare it to a one-sided
    confidence bound (lower bound for HA: µ > µ0 and
    upper bound for HA: µ < µ0)