Estimation in Sampling - PowerPoint PPT Presentation

Slides: 39
Provided by: lax3

1
Estimation in Sampling
  • GTECH 201
  • Lecture 15

2
Conceptual Setting
  • How do we come to conclusions from empirical
    evidence?
  • Isn't common sense enough?
  • Why?
  • Systematic methods for drawing conclusions from
    data
  • Statistical inference
  • Inductive versus Deductive Reasoning

3
Drawing Conclusions
  • Statistical inference
  • Based on the laws of probability
  • What would happen if...
  • You ran your experiment hundreds of times
  • You repeated your survey over and over again
  • Statistic and Parameter
  • Parameter: the proportion of the population who
    are <disabled>, usually denoted by p
  • Statistic: in an SRS of 1000 people, the proportion
    of the sample who are <disabled>, usually denoted
    by $\hat{p}$ (p-hat)

4
Estimating with Confidence
  • Say you are conducting an opinion poll
  • SRS of 1000 adult television viewers
  • You ask these folks if they trust Walter Cronkite
    when he delivers the nightly news
  • Out of 1000, 570 say they trust him
  • 57% of the people trust Walter
  • $\hat{p}$ is 0.57
  • If you collect another set of 1000 television
    viewers, what will the rating be?

5
Confidence Statement
  • We need to add a confidence statement
  • We need to say something about the margin of
    error
  • Confidence statements are based on the
    distribution of the values of the sample
    proportion that would occur if many
    independent SRS were taken from the same
    population
  • The sampling distribution of the statistic
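
The "repeat your survey over and over" idea can be tried out in a short simulation. This is a minimal sketch, not part of the lecture: it assumes a hypothetical true proportion of 0.57 (in practice the population value is unknown), draws many independent samples of 1000, and looks at how the sample proportions spread out. The function name and the seed are illustrative choices.

```python
import random
import statistics

def simulate_sample_proportions(p_true, n, repeats, seed=0):
    """Draw `repeats` independent samples of size n; return each sample proportion."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(repeats):
        # Count respondents who answer "yes" in one simulated sample.
        yes = sum(1 for _ in range(n) if rng.random() < p_true)
        proportions.append(yes / n)
    return proportions

# Assumption: the true population proportion is 0.57 (hypothetical value).
p_hats = simulate_sample_proportions(p_true=0.57, n=1000, repeats=500)
mean_p_hat = statistics.mean(p_hats)
sd_p_hat = statistics.stdev(p_hats)
print(f"mean of p-hat: {mean_p_hat:.3f}, spread (sd) of p-hat: {sd_p_hat:.4f}")
```

The list `p_hats` approximates the sampling distribution of the statistic: the values cluster around the assumed population proportion, and their spread is what a margin of error summarizes.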

6
Terminology Review
  • Sample
  • Population
  • Statistic
  • a numerical characteristic associated with a
    sample
  • Parameter
  • A numerical characteristic associated with the
    population
  • Sampling error
  • The need for interval estimation

7
Point Estimation
  • Point estimation of a parameter is the value of a
    statistic that is used to estimate the parameter
  • Compute statistic (e.g., mean)
  • Use it to estimate corresponding population
    parameter
  • Point Estimators of Population Parameters (see
    next slide)

8
Point Estimators for Population Parameters
Population parameter | Sample statistic | Calculating formula
9
Interval Estimation
  • Sample point estimators are usually not
    absolutely precise
  • How close or how distant is the calculated sample
    statistic from the population parameter
  • We can say that the sample statistic is within a
    certain range or interval of the population
    parameter.
  • The determination of this range is the basis for
    interval estimation

10
Interval Estimation (2)
  • A confidence interval (CI) represents the level
    of precision associated with a population
    estimate
  • Width of the interval is determined by
  • Sample size,
  • variability of the population, and
  • the probability level or the level of confidence
    selected

11
Sampling Distribution of the Mean
  • The distribution of all possible sample means for
    a sample of a given size
  • Use the mean of a sample to estimate and draw
    conclusions about the mean of that entire
    population
  • So we have samples of a particular size
  • We need formulas to determine the mean and the
    standard deviation of all possible sample means
    for samples of a given size from a population

12
Sample and Population Mean
  • For samples of size n, the mean of the variable
    $\bar{x}$ is equal to the mean of the variable under
    consideration: $\mu_{\bar{x}} = \mu$
  • Mean of all possible sample means is equal to the
    population mean

13
Sample Standard Deviation
  • For samples of size n, the standard deviation of
    the variable $\bar{x}$ is equal to the standard
    deviation of the variable under consideration,
    divided by the square root of the sample size:
    $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
  • For each sample size, the standard deviation of
    all possible sample means equals the population
    standard deviation divided by the square root of
    the sample size

14
Central Limit Theorem
  • Suppose all possible random samples of size n are
    drawn from an infinitely large, normally
    distributed population having a mean $\mu$ and a
    standard deviation $\sigma$
  • The frequency distribution of these sample means
    will have
  • A mean of $\mu$ (the population mean)
  • A normal distribution around this population mean
  • A standard deviation of $\sigma / \sqrt{n}$
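
The three properties listed on this slide can be checked numerically. The sketch below is an illustration under assumed values (a normal population with mean 10 and standard deviation 3, samples of size 50, 2000 repetitions, fixed seed); none of these numbers come from the lecture.

```python
import math
import random
import statistics

def sample_means(mu, sigma, n, repeats, seed=1):
    """Return the mean of each of `repeats` random samples of size n."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(repeats)
    ]

# Assumed population values, chosen only for illustration.
mu, sigma, n = 10.0, 3.0, 50
means = sample_means(mu, sigma, n, repeats=2000)

observed_mean = statistics.fmean(means)   # should be close to mu
observed_sd = statistics.stdev(means)     # should be close to sigma / sqrt(n)
expected_sd = sigma / math.sqrt(n)
print(f"mean of sample means: {observed_mean:.3f} (population mean {mu})")
print(f"sd of sample means:   {observed_sd:.3f} (sigma/sqrt(n) = {expected_sd:.3f})")
```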

15
Sampling Error
  • Standard Error of the mean (SEM) is a basic
    measure for the amount of sampling error
  • SEM indicates how much a typical sample mean is
    likely to differ from a true population mean
  • Sample size and population standard deviation
    both affect the sampling error

16
Sampling Error (2)
  • The larger the sample size, the smaller the
    amount of sampling error
  • The larger the standard deviation, the greater
    the amount of sampling error
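
Both statements follow from the standard error of the mean being the population standard deviation divided by the square root of the sample size. A small sketch with made-up values (the standard deviations and sample sizes below are illustrative, not from the lecture):

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: population sd divided by sqrt(sample size)."""
    return sigma / math.sqrt(n)

# Larger samples shrink the sampling error (sigma held at 3.0).
by_sample_size = {n: standard_error(3.0, n) for n in (25, 100, 400)}

# More variable populations inflate it (n held at 100).
by_sigma = {sigma: standard_error(sigma, 100) for sigma in (1.0, 3.0, 9.0)}

print(by_sample_size)
print(by_sigma)
```

Note that quadrupling the sample size only halves the standard error, because n enters through its square root.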

17
Finite Population Correction Factor
  • The frequency distribution of the sample means is
    approximately normal if the sample size is large
  • n < 30 (small sample), n > 30 (large sample)
  • If you have a finite population, then you need to
    introduce a correction, i.e., the fpc rule/factor
    in the estimation process
  • $fpc = \sqrt{(N - n) / (N - 1)}$
  • where fpc = finite population correction
  • n = sample size
  • N = population size

18
Standard Error of the Mean for Finite Populations
  • When including the fpc, the standard error of the
    mean should be
    $\sigma_{\bar{x}} = (\sigma / \sqrt{n}) \sqrt{(N - n) / (N - 1)}$
  • In general, you include the fpc in the
    population estimates only when the ratio of
    sample size to population size exceeds 5%, or
  • when n / N > 0.05
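
The 5% rule can be written as a small helper. This sketch assumes the common textbook form of the correction, sqrt((N - n) / (N - 1)); the function names, the `threshold` parameter, and the population sizes in the usage lines are illustrative assumptions.

```python
import math

def fpc(n, N):
    """Finite population correction, assuming the form sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))

def standard_error_of_mean(sigma, n, N=None, threshold=0.05):
    """Standard error of the mean; the fpc is applied only when n / N > threshold."""
    se = sigma / math.sqrt(n)
    if N is not None and n / N > threshold:
        se *= fpc(n, N)
    return se

# Hypothetical populations: the first sample is 0.5% of N, the second 12.5%.
se_large_pop = standard_error_of_mean(3.0, 50, N=10_000)  # fpc skipped
se_small_pop = standard_error_of_mean(3.0, 50, N=400)     # fpc applied
print(f"{se_large_pop:.4f} {se_small_pop:.4f}")
```

Because the correction is always below 1 when n > 1, applying it can only narrow the resulting interval.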

19
Constructing Confidence Intervals
  • A random sample of 50 commuters reveals that
    their average journey-to-work distance was 9.6
    miles
  • A recent study has determined that the std.
    deviation of journey-to-work distance is
    approximately 3 miles
  • What is the CI around this sample mean of 9.6
    that guarantees with 90% certainty that the true
    population mean is enclosed within that interval?

20
Confidence Interval for the Mean
  • Z value associated with a 90% confidence level
    (Z = 1.65)
  • The sample mean is the best estimate of the true
    population mean
  • CI
  • 9.6 + 1.65 (3/$\sqrt{50}$) = 10.30 miles
  • 9.6 - 1.65 (3/$\sqrt{50}$) = 8.90 miles
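
The same arithmetic as a short script, using the numbers from the commuter example (mean 9.6 miles, standard deviation 3 miles, n = 50, Z = 1.65). The function name is an illustrative choice.

```python
import math

def z_confidence_interval(sample_mean, sigma, n, z):
    """Return (lower, upper) for mean +/- z * sigma / sqrt(n)."""
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

lower, upper = z_confidence_interval(sample_mean=9.6, sigma=3.0, n=50, z=1.65)
print(f"{lower:.2f} to {upper:.2f} miles")  # 8.90 to 10.30 miles
```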

21
Confidence Interval
  • We say that the sample statistic is within a
    certain range or interval of the population
    parameter
  • e.g., in our sample, 57% of the viewers thought
    Walter Cronkite is trustworthy
  • In the general population, between 54% and 60% of
    viewers think that Walter Cronkite is trustworthy
  • Or, in our sample, the average commuting distance
    was 9.6 miles
  • In the population, we calculated that the average
    commute is likely to be somewhere between 8.9
    miles and 10.3 miles
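
The slides do not say how the 54 to 60 range for the opinion poll was obtained. One way to reproduce it, offered here as an assumption rather than the lecture's stated method, is a 95% interval (Z = 1.96) with the usual normal-approximation standard error for a proportion, sqrt(p-hat (1 - p-hat) / n):

```python
import math

def proportion_confidence_interval(p_hat, n, z):
    """Normal-approximation interval for a proportion (an assumed method)."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

# 570 of 1000 viewers; Z = 1.96 assumes a 95% confidence level.
lower, upper = proportion_confidence_interval(p_hat=0.57, n=1000, z=1.96)
print(f"{lower:.1%} to {upper:.1%}")
```

Rounded to whole percentages, the result matches the 54% and 60% quoted for the general population.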

22
Confidence Level
  • Gives you an understanding of how reliable your
    previous statement regarding the confidence
    interval is
  • The probability that the interval actually
    includes the population parameter
  • For example, the confidence level refers to the
    probability that the interval (8.9 miles to 10.3
    miles) actually encompasses the TRUE population
    mean (90%, 95%, 99.7%)
  • Confidence level probability is $1 - \alpha$

23
Significance Level
  • $\alpha$ (alpha)
  • The probability that the interval that surrounds
    the sample statistic DOES NOT include the
    population parameter
  • E.g., the probability that the average commuting
    distance does not fall between 8.9 miles and 10.3
    miles
  • $\alpha$ = 0.10 (90%), 0.05 (95%), 0.003 (99.7%)
  • As $\alpha$ decreases, the confidence interval
    width increases
24
Sampling Error
  • Total sampling error = $\alpha$
  • Probability that the sample statistic will fall
    into either tail of the distribution is
  • $\alpha / 2$
  • If you want 99.7% confidence (i.e., low error),
    then you have to settle for giving a less precise
    estimate (the CI is wider)

25
If the Standard Deviation is Unknown
  • If we don't know the population mean, it's likely
    we don't know the standard deviation
  • What you are likely to have is the variance and
    standard deviation of your sample
  • Also, you have a small population, so you have to
    use the finite population correction factor that
    was discussed earlier
  • Once you have the formula for standard error,
    then you can proceed as before to determine the
    confidence interval

26
Standard Error
27
Student's t Distribution
  • William Gosset (1876-1937)
  • Published his contributions to statistical theory
    under a pseudonym
  • Student's t distribution is used in performing
    inferences for a population mean when
  • The population being sampled is approximately
    normally distributed
  • The population standard deviation is unknown
  • And the sample size is small (n < 30)

28
Characteristics of the t - Distribution
  • A t curve is symmetric, bell shaped
  • Exact shape of distribution varies with sample
    size
  • When n nears 30, the value of t approaches the
    standard normal Z value
  • A particular distribution is identified by
    defining its degrees of freedom (df)
  • For a t distribution, df = n - 1

29
Properties of t Curves
  • The total area under a t curve = 1
  • A t curve extends indefinitely in both
    directions, approaching, but never touching the
    horizontal axis
  • A t-curve is symmetrical about 0
  • As the degrees of freedom become larger, t
    curves look increasingly like the standard normal
    curve
  • We need to use a t-table and look for values of
    t, instead of Z to determine the confidence
    interval
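
To see t values approach Z as the degrees of freedom grow, the sketch below compares a few two-sided 90% critical values with the standard normal value. The t values are copied from a standard t-table rather than computed, and the choice of rows is illustrative.

```python
from statistics import NormalDist

# Two-sided 90% critical values copied from a standard t-table (df -> t).
T_TABLE_90 = {5: 2.015, 10: 1.812, 20: 1.725, 29: 1.699}

# Standard normal value for the same confidence level (about 1.645).
z_90 = NormalDist().inv_cdf(0.95)

for df, t in sorted(T_TABLE_90.items()):
    print(f"df={df:2d}  t={t:.3f}  t - Z = {t - z_90:.3f}")
```

Every t value is larger than Z, so a t-based interval is wider than a Z-based one for the same data, and the gap shrinks as df increases.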

30
Calculating various CIs
  • Sampling
  • SRS, systematic, or stratified
  • Parameters
  • Mean, total, or proportion
  • Six situations
  • Consider whether to use fpc
  • when n/N > 0.05
  • Consider whether to use Z or t
  • when n < 30
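
The two checks on this slide can be sketched as a small decision helper. The function name, the returned dictionary, and the example population sizes are illustrative assumptions; the thresholds (5% and 30) are the ones given in the lecture.

```python
def choose_method(n, N, fpc_threshold=0.05, small_sample=30):
    """Decide whether to apply the fpc and whether to use t or Z."""
    return {
        "use_fpc": n / N > fpc_threshold,                     # sample is a large share of N
        "distribution": "t" if n < small_sample else "Z",    # small samples use t
    }

print(choose_method(n=25, N=3500))  # small sample from a large population
print(choose_method(n=50, N=400))   # larger sample from a small population
```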

31
If Random or Systematic Sample
  • Estimate of Population Mean
  • Best estimate is $\bar{x}$ (the sample mean)
  • Estimate of sampling error
  • Standard error of the mean (inc. fpc)

32
If Stratified Sample
  • Estimate of population mean
  • Still equal to sample mean but
  • Std. Error of the mean (inc. fpc)

where m = number of strata and i refers to a
particular stratum
33
Minimum Sample Size
  • Before going out to the field, you want to know
    how big the sample ought to be for your research
    problem
  • Sample must be large enough to achieve precision
    and CI width that you desire
  • Formulas to determine the three basic population
    parameters with random sampling

34
Sample Size Selection - Mean
  • Your goal is to determine the minimum sample size
  • You want to situate the estimated population
    mean, in a specified CI

$n = (Z \sigma / E)^2$, where E = amount of error you are willing to tolerate
37
Example 1
  • We are looking at Neighborhood X
  • 3,500 households
  • Sample size 25 households
  • Sample mean 2.73
  • Sample variance 2.6
  • Confidence level 90%
  • Find the mean number of people per household
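
One way to work this example, sketched under the rules given earlier in the lecture: the sample is under 5% of the population, so the fpc is skipped, and n < 30 with an unknown population standard deviation points to the t distribution with 24 degrees of freedom. The critical value 1.711 is taken from a standard t-table; the variable names are illustrative.

```python
import math

N, n = 3500, 25
sample_mean, sample_variance = 2.73, 2.6

# Two-sided 90% critical value for df = n - 1 = 24, from a standard t-table.
T_90_DF24 = 1.711

use_fpc = n / N > 0.05                      # 25 / 3500 is about 0.007
se = math.sqrt(sample_variance / n)         # s / sqrt(n)
if use_fpc:
    se *= math.sqrt((N - n) / (N - 1))      # assumed form of the fpc

margin = T_90_DF24 * se
lower, upper = sample_mean - margin, sample_mean + margin
print(f"90% CI: {lower:.2f} to {upper:.2f} people per household")
```

Under these assumptions the interval runs from about 2.18 to 3.28 people per household.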

38
Example 2
  • Sample of 30 households
  • Sample standard deviation is 1.25
  • What sample size is needed to estimate the mean
    number of persons per household in neighborhood X
  • and be 90% confident that your estimate will be
    within 0.3 persons of the true population mean?
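
A sketch of one possible solution, assuming the common sample-size formula for a mean, n = (Z s / E)^2, with the sample standard deviation standing in for the unknown population value and Z = 1.65 for 90% confidence as used earlier in the lecture. The function name is illustrative.

```python
import math

def min_sample_size_for_mean(z, sd, error):
    """Smallest whole n satisfying n >= (z * sd / error) ** 2."""
    return math.ceil((z * sd / error) ** 2)

# Sample sd 1.25 persons, tolerated error 0.3 persons, Z = 1.65 (90% level).
needed = min_sample_size_for_mean(z=1.65, sd=1.25, error=0.3)
print(needed)  # 48
```

The raw value is about 47.3, and it is rounded up because a sample size must be a whole number of households that still meets the error target.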