Sampling Distributions, Confidence Intervals, and Hypothesis Tests

Provided by: michaelm99
1
Sampling Distributions, Confidence
Intervals, and Hypothesis Tests
2
Sampling Distributions
  • Parameter estimation is the process of estimating
    the value of a parameter (e.g., mean or variance)
    associated with a population or process through
    sampling values of that population/process
  • The sampling could involve i.i.d. repetitions of
    the process or sampling without replacement from
    the population
  • If the population is large with respect to the
    sample size, then the sampling can be regarded as
    nearly i.i.d., otherwise correction factors can
    be used

3
Sampling Distributions
  • To evaluate the goodness of the parameter
    estimate, something about the sampling
    distribution of the estimation procedure needs to
    be known
  • For example, given an i.i.d. sample from a
    Gaussian process with mean $\mu$ and variance
    $\sigma^2$, the sampling distribution of the
    sample mean is normal:
    $\bar{X} \sim N(\mu, \sigma^2/n)$

4
Sampling Distributions
  • Given an i.i.d. sample from a process, not
    necessarily Gaussian, with mean $\mu$ and
    variance $\sigma^2$, the sampling distribution
    of the sample mean is approximately normal for
    large n (central limit theorem): $\bar{X}$ is
    approximately $N(\mu, \sigma^2/n)$
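
The large-n claim above can be checked with a small simulation. The sketch below is illustrative only: the Exponential(1) distribution, the sample size of 50, the number of repetitions, and the seed are assumptions chosen for the demonstration, not values from the slides.

```python
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` i.i.d. samples of size n from Exponential(1); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]

means = simulate_sample_means(n=50, reps=5000)

# Exponential(1) has mean 1 and standard deviation 1, so the sample means
# should centre near 1 with a spread near 1 / sqrt(50).
print(statistics.fmean(means), statistics.stdev(means), 1 / 50 ** 0.5)
```

A histogram of `means` should look close to a bell curve even though the underlying exponential distribution is strongly skewed.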

5
Sampling Distributions
  • Given an i.i.d. sample from a Gaussian process
    with variance $\sigma^2$, the sampling
    distribution of the sample variance $S^2$ is
    $(n-1) S^2 / \sigma^2 \sim \chi^2_{n-1}$
  • Given an i.i.d. sample from a Gaussian process
    with unknown variance, then
    $(\bar{X} - \mu) / (S/\sqrt{n}) \sim t_{n-1}$

6
Confidence Intervals
  • A confidence interval for a parameter estimate
    provides a measure of the accuracy of the
    estimate
  • An x% confidence interval is a random interval
    (derived from the sample) that has an x%
    probability of containing the population parameter
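
The "random interval" reading can be made concrete by simulation: draw many samples, build an interval from each, and count how often the interval contains the true mean. This is a minimal sketch assuming normal data with known standard deviation; the values of `mu`, `sigma`, `n`, and the seed are hypothetical.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, level, reps, seed=1):
    """Fraction of simulated known-sigma z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps

# The observed fraction should be close to the nominal level of 0.95.
print(coverage(mu=10.0, sigma=2.0, n=25, level=0.95, reps=4000))
```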

7
Components of a Confidence Interval Calculation
  • Sample statistic. The sample statistic serves as
    a point estimate for the corresponding population
    parameter
  • Population variance. A large population variance
    implies a wider confidence interval; a small
    variance implies a narrower confidence interval
  • If population variance is known, it will be used
    in the confidence interval calculation
  • If the population variance is not known, then
    sample variance (with appropriate correction
    factor) will be used to estimate population
    variance

8
Components of a Confidence Interval Calculation
  • Standard error. The standard error of the sample
    statistic is the standard deviation of the
    sampling distribution
  • Confidence Level. The confidence level indicates
    the probability that the confidence interval
    contains the population parameter
  • This is often just an estimate, since the
    sampling assumptions are usually not met

9
CI Sample Mean
  • General form of a confidence interval is
  • (sample statistic) ± (sampling distribution
    score)(SE)
  • For estimating the population mean with known
    $\sigma$, this is $\bar{x} \pm z \, \sigma/\sqrt{n}$

10
CI Sample Mean
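  • For $0 < \alpha < 1$, let $z_{\alpha}$ be that value
    such that $P(Z > z_{\alpha}) = \alpha$, where $Z \sim N(0, 1)$
  • So $P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha$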

11
CI Sample Mean
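  • The $100(1-\alpha)\%$ confidence interval
    for the mean, assuming simple random
    sampling and large n, is given by
    $\bar{x} \pm z_{\alpha/2} \, s/\sqrt{n}$
  • The $100(1-\alpha)\%$ confidence
    interval for the proportion $p$, assuming
    simple random sampling, is given by
    $\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n}$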
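
As a sketch of how these intervals might be computed, the helpers below assume the usual large-sample forms: sample mean plus or minus z times s/sqrt(n), and sample proportion plus or minus z times sqrt(p(1-p)/n). The function names and the example inputs are illustrative assumptions.

```python
from statistics import NormalDist

def z_critical(level):
    """Two-sided standard normal critical value for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(0.5 + level / 2)

def mean_ci(xbar, s, n, level=0.95):
    """Large-sample interval for a mean: xbar +/- z * s / sqrt(n)."""
    half = z_critical(level) * s / n ** 0.5
    return xbar - half, xbar + half

def proportion_ci(p_hat, n, level=0.95):
    """Large-sample interval for a proportion: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    half = z_critical(level) * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - half, p_hat + half

# Illustrative inputs only.
print(mean_ci(xbar=6.0, s=1.0, n=300))
print(proportion_ci(p_hat=0.5, n=2000))
```

Raising the confidence level or shrinking n widens both intervals, since the half-width scales with z and with 1/sqrt(n).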

12
Examples
  • Chapter 7 problems
  • 3, 4, 5, 9, 10, 16, 17, 18

13
Tests for Significance
  • Tests for significance, or hypothesis tests,
    address the question of whether an observed
    difference from what would be expected under a
    specified model is real or just due to chance
    variation
  • For example, is there a statistically significant
    difference between the response rate of one type
    of cancer treatment versus another type of
    treatment, or is the difference just due to
    chance variation?

14
Components of a Significance Test
  • Null and alternative hypothesis
  • The null hypothesis says that an observed
    difference reflects chance variation. Denoted by
    H0.
  • The alternative hypothesis says that the observed
    difference is real. Denoted by HA

15
Components of a Significance Test
  • Test statistic
  • A test statistic is used to measure the
    difference between the data and what is expected
    according to the null hypothesis
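  • To perform the hypothesis test, assumptions are
    made about the sampling distribution of the test
    statistic if the null hypothesis were true
  • The test statistic often has the general form
    (observed - expected) / SE, where the expected
    value and the SE are computed under the null
    hypothesis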

16
Components of a Significance Test
  • Significance level
  • The observed significance level is the chance of
    getting a test statistic as extreme or more
    extreme than the observed one
  • This chance (p-value) is computed on the basis
    that the null hypothesis is correct
  • The smaller this chance is, the stronger the
    evidence against H0
  • The p-value is not the chance that H0 is right
  • A significance level that must be achieved to
    reject the null hypothesis is generally set
    before performing a significance test
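
To make the p-value concrete, here is a minimal sketch that assumes the test statistic has the common form (observed - expected) / SE and follows a standard normal distribution under the null hypothesis. The function names and the numbers in the example are made up for illustration.

```python
from statistics import NormalDist

def z_statistic(observed, expected, se):
    """Distance between the data and the null expectation, in standard-error units."""
    return (observed - expected) / se

def p_value(z, alternative="two-sided"):
    """Chance, under a standard normal null, of a statistic at least as extreme as z."""
    cdf = NormalDist().cdf
    if alternative == "greater":
        return 1 - cdf(z)
    if alternative == "less":
        return cdf(z)
    return 2 * (1 - cdf(abs(z)))

alpha = 0.05  # significance level fixed before looking at the data
z = z_statistic(observed=52.0, expected=50.0, se=1.0)  # made-up numbers
print(z, p_value(z), p_value(z) < alpha)
```

The comparison `p_value(z) < alpha` is the decision rule; the p-value itself is computed assuming H0 is true and is not the probability that H0 is true.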

17
Type I and Type II Errors
  • If the alternative hypothesis is accepted
    whenever the observed p-value is below the
    specified significance level, $\alpha$, then $\alpha$
    represents how often this would be done when in
    fact the null hypothesis holds (Type I error)
  • This is balanced against the probability, $\beta$, of
    not rejecting the null hypothesis when it is false
    (Type II error)
  • Commonly take $\alpha = 5\%$ (or $1\%$)
  • $\beta$ is often not straightforward to calculate
  • Note that $\beta$ increases as $\alpha$ decreases
  • Ideally, experiments are designed, ahead of time,
    to achieve a given power, $1 - \beta$
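
The trade-off between the two error rates can be seen in a case where the Type II error probability is easy to compute: a one-sided z-test with known standard deviation against one specific alternative mean. This is a sketch under those assumptions; `mu0`, `mu1`, `sigma`, and `n` are hypothetical values.

```python
from statistics import NormalDist

def type_ii_error(mu0, mu1, sigma, n, alpha):
    """Type II error rate for a one-sided z-test of H0: mu = mu0 when the true mean is mu1 > mu0."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)    # rejection cutoff on the z scale
    shift = (mu1 - mu0) / (sigma / n ** 0.5)   # true effect in standard-error units
    return std_normal.cdf(z_alpha - shift)     # chance of failing to reject H0

for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(mu0=0.0, mu1=0.5, sigma=1.0, n=25, alpha=alpha)
    # Columns: Type I error rate, Type II error rate, power.
    print(alpha, round(beta, 3), round(1 - beta, 3))
```

Each printed row has a larger Type II error rate than the one before it, and increasing `n` is the way to lower it without raising the Type I error rate.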

18
Notes
  • Required significance levels are somewhat
    arbitrary
  • Even if statistically significant, results can
    still be due to chance
  • With large samples, even a small difference can
    be statistically significant. That does not
    necessarily make it important. Conversely, an
    important difference may not be statistically
    significant for a small sample

19
Notes
  • Every legitimate test of significance involves a
    chance model. The test addresses whether the
    observed difference is real or just a chance
    variation.
  • If the entire population has been surveyed, then
    a significance test is irrelevant
  • If the sample can not be viewed as a random
    sample of the population, then a significance
    test is inappropriate

20
Test on the Mean of a Sample
  • A common problem is determining whether the group
    represented by a sample data set is significantly
    different from a specified population
  • Or whether data obtained from an experiment
    represent a significant departure from a
    hypothesis
  • This type of problem is often phrased in terms of
    the difference of the sample mean and a
    hypothesized mean

21
Test on the Mean of a Sample
  • For a large sample, the central limit theorem
    gives the sampling distribution for the sample
    mean
  • With known population mean $\mu$ and standard
    deviation $\sigma$,
    $Z = (\bar{X} - \mu) / (\sigma/\sqrt{n})$ is
    approximately $N(0, 1)$

22
Test on the Mean of a Sample
  • In practice, as long as the sample is
    approximately normal and the sample size is
    large, it is often assumed that the z-test
    statistic is $z = (\bar{x} - \mu_0) / (s/\sqrt{n})$,
    with the sample standard deviation $s$ in place
    of $\sigma$
  • A test on the mean is then conducted as if $z$
    were standard normal, so the probability that
    the observed mean would have come from a
    population with mean $\mu_0$ can be calculated
    accordingly

23
Test on the Mean of a Sample
  • Thus, for large n, could test the hypothesis
    below by calculating the probability that the
    observed mean would have come from a population
    with mean $\mu_0$

H0: Data came from a population with mean $\mu_0$
Ha: Data came from a population with mean > $\mu_0$ (< $\mu_0$)
24
Test on the Mean of a Sample
  • A frequent gambler at the Sands Casino in Las
    Vegas believes that one of the casino's roulette
    wheels is not balanced. The gambler records 2000
    plays of the wheel. 1000 of the plays
    come up black. Is there statistically
    significant evidence that the wheel is
    unbalanced?
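
One way to work this example is a large-sample z-test on the proportion of blacks. The slide does not say what kind of wheel is involved, so the sketch below assumes an American wheel with 18 black pockets out of 38; a different wheel layout would change the null proportion `p0` and the conclusion.

```python
from statistics import NormalDist

n, blacks = 2000, 1000  # counts from the slide
p0 = 18 / 38            # ASSUMPTION: American wheel, 18 black pockets out of 38

p_hat = blacks / n
se = (p0 * (1 - p0) / n) ** 0.5  # SE of the sample proportion under H0
z = (p_hat - p0) / se
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.2f}, two-sided p-value = {p_two_sided:.4f}")
```

Under this assumption z comes out at roughly 2.36 and the two-sided p-value is a little under 0.02, which would count as significant at the 5% level but not at the 1% level.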

25
Test on the Mean of a Sample
  • The national average number of years that it now
    takes a college student to graduate is 5.5 years.
    A sample of 300 UNT student records shows an
    average graduation time of 6 years with a
    standard deviation of 1 year. Is the difference
    between the national average of 5.5 years and the
    UNT average of 6 years real?
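
This example can be worked with the large-sample z-test, using the sample standard deviation in the standard error. The sketch treats the 300 records as a simple random sample, which is an assumption the slide does not state explicitly.

```python
import math

mu0, xbar, s, n = 5.5, 6.0, 1.0, 300  # values from the slide

se = s / math.sqrt(n)  # estimated standard error of the sample mean
z = (xbar - mu0) / se
# erfc(|z| / sqrt(2)) equals 2 * P(Z > |z|) and stays accurate for large |z|.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))
print(f"SE = {se:.4f}, z = {z:.2f}, two-sided p-value = {p_two_sided:.3g}")
```

The statistic is between 8 and 9 standard errors from the national average, so chance variation is not a plausible explanation for the difference under this model.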

26
Test on the Mean of a Sample
  • If the underlying distribution is known to be
    normal, then for known population mean $\mu$,
    $t = (\bar{X} - \mu) / (s/\sqrt{n})$ has a t
    distribution with n-1 degrees of freedom,
  • where s is the sample standard deviation
  • Note that for large n, the sampling distribution
    is approximately normal
  • So, again, can test a hypothesis on the mean by
    calculating the probability that the observed
    mean would have come from a population with mean
    $\mu$

27
Chi-Square Test for Homogeneity
  • Consider the contingency table
  • Was the treatment really effective? Or, was the
    treatment useless and the results were merely the
    result of chance?
  • Chi-square test for homogeneity can be applied to
    provide an answer to these questions

28
Chi-square Test for Homogeneity
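  • The null hypothesis of the chi-square test for
    homogeneity is that the row distributions are the
    same and are given by the column marginal
  • That is, H0: $\pi_{ji} = \pi_{jk}$ for
    all i, k, and j, where $\pi_{ji}$ is the
    probability of column j within row i
  • Under the null hypothesis, the maximum likelihood
    estimates for $\pi_{ji}$ are obtained from the data by
    $\hat{\pi}_{ji} = n_{\cdot j} / n_{\cdot\cdot}$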

29
Chi-square Test for Homogeneity
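  • The chi-square test statistic is given by
    $X^2 = \sum_{i=1}^{I} \sum_{j=1}^{J} (n_{ij} - E_{ij})^2 / E_{ij}$,
    where $n_{ij}$ is the observed count in row i,
    column j and $E_{ij} = n_{i\cdot} n_{\cdot j} / n_{\cdot\cdot}$
    is the count expected under the null hypothesis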
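
As a sketch, the usual Pearson form of the statistic (the sum over cells of (observed - expected)^2 / expected, with expected counts built from the row and column totals) can be computed as follows. The function name and the 2x2 table are made up for illustration and are not the data from the slides.

```python
import math

def chi_square_homogeneity(table):
    """Return (X^2, degrees of freedom) for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count expected under H0
            x2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return x2, df

# Made-up 2x2 counts for illustration only.
x2, df = chi_square_homogeneity([[10, 20], [20, 10]])
# With exactly 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(x2 / 2))
print(x2, df, p_value)
```

The closed-form tail probability used here only holds for 1 degree of freedom; larger tables need a chi-square table or a statistics library.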

30
Chi-square Test for Homogeneity
  • The distribution of $X^2$ under the null hypothesis
    is chi-square with (I-1)(J-1) degrees of freedom
  • For the polio example the $X^2$ statistic is
    calculated using

31
Chi-square Test for Homogeneity
  • The chi-square test yields

32
Hypothesis Test and Confidence Intervals
  • One way to view a hypothesis test is as a check
    whether the value for a parameter specified in
    the null hypothesis lies in the $100(1-\alpha)\%$
    confidence region (interval)
  • In other words, a $100(1-\alpha)\%$
    confidence region for a parameter, $\theta$, consists
    of those values, $\theta_0$, for which a null
    hypothesis that $\theta = \theta_0$ will not be
    rejected at the $\alpha$ significance level

33
Hypothesis Test and Confidence Intervals
  • Example
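  • Suppose X1, X2, ..., Xn is a sample from a normal
    distribution having unknown mean $\mu$ and known
    variance $\sigma^2$. Suppose
  • H0: $\mu = \mu_0$
  • HA: $\mu \neq \mu_0$
  • Let $\alpha$ be the specified significance level
  • Then a hypothesis test would accept the null
    hypothesis if
    $|\bar{X} - \mu_0| \le z_{\alpha/2} \, \sigma/\sqrt{n}$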
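
The link between the test and the interval can be checked numerically. The sketch below assumes the standard two-sided z-test with known standard deviation, which accepts H0 when the sample mean is within z times sigma/sqrt(n) of the hypothesized mean; all numeric values are hypothetical.

```python
from statistics import NormalDist

def z_test_accepts(xbar, mu0, sigma, n, alpha):
    """Two-sided known-sigma test: accept H0 when |xbar - mu0| is within the critical distance."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(xbar - mu0) <= z * sigma / n ** 0.5

def confidence_interval(xbar, sigma, n, alpha):
    """100(1 - alpha)% interval for the mean with known sigma."""
    half = NormalDist().inv_cdf(1 - alpha / 2) * sigma / n ** 0.5
    return xbar - half, xbar + half

xbar, sigma, n, alpha = 10.3, 2.0, 100, 0.05  # hypothetical values
low, high = confidence_interval(xbar, sigma, n, alpha)
for mu0 in (9.8, 10.0, 10.5, 10.8):
    # The two columns agree: H0 is accepted exactly when mu0 lies in the interval.
    print(mu0, z_test_accepts(xbar, mu0, sigma, n, alpha), low <= mu0 <= high)
```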

34
Hypothesis Test and Confidence Intervals
35
Test of Equality of Means from Two Independent
Samples
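  • Suppose that X1, X2, ..., Xn is an independent
    sequence of $N(\mu_X, \sigma^2)$ random variables
    and Y1, Y2, ..., Ym is an independent sequence of
    $N(\mu_Y, \sigma^2)$ random variables. Then the
    statistic
    $t = [(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)] / (s_p \sqrt{1/n + 1/m})$
  • has a t distribution with m+n-2 d.f., for
    $s_p^2 = [(n-1) S_X^2 + (m-1) S_Y^2] / (m+n-2)$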
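
A minimal sketch of the equal-variance two-sample statistic follows, assuming the usual pooled-variance form and evaluating it under the null hypothesis that the two means are equal. The function name and data are made up; the Python standard library has no t distribution, so the result would be compared against a critical value from a t table.

```python
from statistics import fmean, variance

def pooled_t(x, y):
    """Pooled two-sample t statistic for H0: mu_X = mu_Y, with its degrees of freedom."""
    n, m = len(x), len(y)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    sp2 = ((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2)
    se = (sp2 * (1 / n + 1 / m)) ** 0.5
    return (fmean(x) - fmean(y)) / se, n + m - 2

# Made-up measurements for illustration only.
x = [5.1, 4.9, 5.6, 5.8, 5.2]
y = [4.4, 4.7, 4.1, 4.9, 4.5, 4.6]
t, df = pooled_t(x, y)
print(t, df)
```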

36
Test of Equality of Means from Two Independent
Samples
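  • Thus, the $100(1-\alpha)\%$ confidence
    interval on the difference between the means is
    $(\bar{X} - \bar{Y}) \pm t_{m+n-2,\alpha/2} \, s_p \sqrt{1/n + 1/m}$,
    with $s_p$ the pooled standard deviation and
    $t_{m+n-2,\alpha/2}$ the upper $\alpha/2$ point of the
    t distribution with m+n-2 d.f.
  • Accordingly, would reject the null hypothesis
    that $\mu_X = \mu_Y$
  • (i.e., $\mu_X - \mu_Y = 0$), at the $\alpha$
    significance level for the
    two-sided test ($\alpha/2$ level if one-sided
    test of $\mu_X > \mu_Y$ or
    $\mu_X < \mu_Y$), if 0 was outside of the confidence interval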

37
Test of Equality of Means from Two Independent
Samples
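  • That is, would reject the hypothesis that $\mu_X = \mu_Y$
    (two-sided) at the $\alpha$ level if
    $|\bar{X} - \bar{Y}| > t_{m+n-2,\alpha/2} \, s_p \sqrt{1/n + 1/m}$,
    with $s_p$ the pooled standard deviation
  • Would reject the hypothesis that $\mu_X = \mu_Y$
    (one-sided) and accept the alternative $\mu_X > \mu_Y$
    at the $\alpha$ level if
    $\bar{X} - \bar{Y} > t_{m+n-2,\alpha} \, s_p \sqrt{1/n + 1/m}$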

38
Test of Equality of Means from Two Independent
Samples
  • If the variances of the populations are not
    assumed equal, then the sampling distribution of the
    difference of the sample means is no longer a t
    distribution
  • However, if the SE of the sample means difference
    is estimated by
    $\sqrt{S_X^2/n + S_Y^2/m}$

39
Test of Equality of Means from Two Independent
Samples
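  • Then, the statistic
    $t = (\bar{X} - \bar{Y}) / \sqrt{S_X^2/n + S_Y^2/m}$
  • is approximately t with degrees of freedom equal
    to
    $(S_X^2/n + S_Y^2/m)^2 / [(S_X^2/n)^2/(n-1) + (S_Y^2/m)^2/(m-1)]$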
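
The unequal-variance case is commonly handled with Welch's statistic and the Satterthwaite approximation for the degrees of freedom. The sketch below assumes that standard form; the function name and data are illustrative.

```python
from statistics import fmean, variance

def welch_t(x, y):
    """Welch statistic for H0: mu_X = mu_Y and the Satterthwaite degrees of freedom."""
    n, m = len(x), len(y)
    vx, vy = variance(x) / n, variance(y) / m  # S_X^2 / n and S_Y^2 / m
    t = (fmean(x) - fmean(y)) / (vx + vy) ** 0.5
    df = (vx + vy) ** 2 / (vx ** 2 / (n - 1) + vy ** 2 / (m - 1))
    return t, df

# Made-up measurements: the second sample is deliberately more spread out.
x = [5.1, 4.9, 5.6, 5.8, 5.2]
y = [3.0, 6.5, 4.1, 7.2, 2.4, 5.9]
print(welch_t(x, y))
```

The approximate degrees of freedom are usually not an integer; rounding down before consulting a t table is a common, conservative choice.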