Estimation in Sampling

Provided by: juliocr

1
Estimation in Sampling
  • How many cases do I need?
  • How good is my sample?
  • References Julio Rivera

2
Stop for a Moment
  • What is our goal in sampling?
  • We want to make an inference about something
    using as little information as prudently possible
  • Sometimes we want to be more certain than others
  • How do you guarantee accuracy of plus or minus 3%?

3
Stop for a Moment
  • How do you know how many observations to make?
  • How do you know how many points to sample?
  • How do you know how many people to question?

4
Two types of Estimation
  • Point and Interval Estimation

5
Point Estimation
  • Simple concept
  • Statistic is calculated from a sample
  • Then you estimate the population parameter
  • For a point estimate of the mean, use the sample
    mean
  • For a point estimate of the standard deviation,
    use the sample standard deviation

6
  • In your book, note that the formula for the
    standard deviation uses n-1 in the denominator.
  • For samples smaller than 30 always use n-1. For
    larger samples n gives nearly the same value,
    although n-1 is still the standard divisor for
    sample data.
  • For small samples the n-1 formula gives a better
    estimate of the population standard deviation
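
As a rough illustration of the two divisors, the sketch below compares them on a small made-up sample. The household sizes are hypothetical and are not data from the slides; `statistics.stdev` divides by n-1 and `statistics.pstdev` divides by n.

```python
import statistics

# Hypothetical household sizes, invented for illustration only.
sample = [1, 2, 2, 3, 4, 4, 5, 7]

s_n_minus_1 = statistics.stdev(sample)   # sample formula, divides by n - 1
s_n = statistics.pstdev(sample)          # population formula, divides by n

# With only 8 cases the two values differ noticeably;
# the gap shrinks as the sample grows.
print(round(s_n_minus_1, 3), round(s_n, 3))
```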

7
How good are point estimates
  • Always some error.
  • Unlikely that the sample statistic is going to
    equal the population parameter
  • But we want to know how good this sample
    statistic is
  • Therefore we construct a confidence interval

8
Interval Estimates
  • An interval bounded by two values used to
    estimate the value of a population parameter.
  • The values that bound this interval are
    calculated from the sample that is being used as
    the basis for the estimation

9
Level of Confidence
  • 1 - α (a probability)
  • The probability that the sample selected yields
    boundary values that lie on opposite sides of
    the parameter being estimated
  • For instance
  • The probability that the population mean is
    between the two boundaries

10
Confidence Interval
  • An interval estimate with a specified level of
    confidence
  • In other words, it expresses the precision
    associated with the sample estimate
  • Its width is determined by sample size, variability
    of the sample and the confidence level selected
  • But before we look at CIs more closely we need to
    go back and look at the Central Limit Theorem

11
Central Limit Theorem
  • If all possible random samples, each of size n,
    are taken from any population with a mean μ and a
    standard deviation σ, the sampling distribution
    of sample means will
  • Have a mean of μ
  • Have a standard deviation of σ/√n
  • Be normally distributed when the parent population
    is normal, or approximately normal for samples of
    30 or more when it is not

12
Qualities of the Central Limit Theorem
  • In essence this is the mean of the means
  • The larger the sample, the closer the sample mean
    tends to be to the population mean and the smaller
    the amount of sampling error
  • The larger the standard deviation, the larger the
    amount of sampling error

13
The standard error of the mean
  • is the standard deviation of these sample means
  • Divide the population standard deviation by the
    square root of the sample size

14
Why are these valuable?
  • We often deal with populations which are not
    normally distributed
  • However, the means of samples drawn from those
    populations are approximately normally distributed
    (n ≥ 30)
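
A small simulation can make this concrete. The sketch below is only an illustration under assumed settings: the exponential population, the seed, and the number of repeated samples are choices made here, not part of the slides. It draws many samples of size 30 from a skewed population and compares the spread of the sample means with σ/√n.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

POP_MEAN = 1.0     # an exponential(1) population has mean 1 ...
POP_SD = 1.0       # ... and standard deviation 1 (strongly skewed)
n = 30             # size of each sample
num_samples = 5000 # number of repeated samples (assumed, for illustration)

# Mean of each repeated sample.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)  # should sit near POP_MEAN
sd_of_means = statistics.stdev(sample_means)    # should sit near sigma / sqrt(n)
standard_error = POP_SD / n ** 0.5

print(round(mean_of_means, 3), round(sd_of_means, 3), round(standard_error, 3))
```

Even though the population is far from normal, the mean of the sample means lands close to the population mean and their standard deviation lands close to the standard error.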

16
The Central Limit Theorem allows us to put a
confidence interval around our sample mean
  • It allows us to decide how good our sample is

17
Confidence Intervals and Estimation
  • You want to know that the population mean falls
    within a certain range with a level of confidence
  • You are trying to estimate the population mean
    using your sample mean
  • A confidence interval is constructed by marking
    off a margin on either side of the sample mean:
    the z-score for the chosen confidence level times
    the standard error

18
Confidence Interval
  • Suppose you want a 95% CI around a sample mean of
    35.6
  • 95% corresponds to a z-score of 1.96 (Table A)
  • We think the population standard deviation
    is 16
  • Our n = 100

19
Confidence Interval
  • What we have suggested here is that the
    probability is 95% that the true population mean
    is between 32.46 and 38.74
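
The arithmetic behind these bounds can be sketched as follows, assuming the interval is the sample mean plus or minus z times σ/√n. The helper name `z_interval` is made up for this illustration.

```python
import math

def z_interval(mean, sigma, n, z):
    """Return (lower, upper) for mean +/- z * sigma / sqrt(n)."""
    margin = z * sigma / math.sqrt(n)  # z times the standard error
    return mean - margin, mean + margin

# Values from the slides: mean 35.6, sigma 16, n 100, z 1.96 for 95%.
lower, upper = z_interval(mean=35.6, sigma=16, n=100, z=1.96)
print(round(lower, 2), round(upper, 2))  # 32.46 38.74
```

The standard error is 16/10 = 1.6, so the margin is 1.96 × 1.6 = 3.136 on each side of 35.6.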

20
Confidence Interval
  • In a 95% CI, the probability of being wrong is 5%
    (α)
  • This alpha error is divided in half and
    distributed out on the tails of the normal
    distribution (α/2)
  • The case at right is from your book and uses a 90%
    interval
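
Splitting α across the two tails is also how the z-score for a given confidence level is looked up. As a sketch, Python's standard-library `statistics.NormalDist` can stand in for the printed table; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(confidence):
    """z-score that leaves alpha/2 in each tail of the standard normal."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```

The 90% value comes out near 1.645, which printed tables often round to 1.65.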

21
Some things to note
  • If you don't know the population standard
    deviation, use the sample standard deviation
  • Likewise, if you don't know the population
    variance, use the sample variance

22
Being Rigorous
  • Generally you shouldn't set a confidence level
    below .95
  • This keeps the risk of missing the population
    parameter at 5% or less
  • If you can, set it at .99 (Often this is too
    expensive and gives too wide an interval)
  • However, there are some types of data for which
    you may be able to set a lower confidence level
  • Do this on a case by case basis

23
Using t instead of z
  • For sample sizes less than 30 use the Student's t
    value rather than z
  • Distribution is symmetric and bell
    shaped--approaches normal near n = 30

24
Middletown Example
  • Planners are asking several questions because
    they believe the census data is out of date
  • What is the mean number of people in each
    household?
  • How many people are in the community?
  • What proportion of households have a child under
    18 years of age?

25
First step
  • How many people per household
  • Small random sample
  • Planners sample 25 households
  • Set a 90% confidence level
  • Important values are at left

26
  • Ignore the finite population correction
  • Resulting interval is a little wide
  • The planners decide to increase the sample size
    to 250
  • They decide to keep the level of confidence at .90

27
  • The new mean is 2.68
  • Variance is 4.3
  • n = 250
  • We use z (1.65) rather than t
  • Finite population correction factor is used
  • Changes narrow the CI
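
A minimal sketch of this interval follows. Two things are assumptions: the number of households N = 3500 is taken from the later sample-size slide, and the finite population correction is written as √(1 - n/N), since the book's exact formula is not in the transcript.

```python
import math

n = 250         # households sampled (from the slide)
N = 3500        # households in Middletown (assumed, from the later slide)
mean = 2.68     # sample mean persons per household (from the slide)
variance = 4.3  # sample variance (from the slide)
z = 1.65        # 90% confidence (from the slide)

fpc = math.sqrt(1 - n / N)                      # assumed form of the correction
standard_error = math.sqrt(variance / n) * fpc  # corrected standard error
margin = z * standard_error

lower, upper = mean - margin, mean + margin
print(round(lower, 2), round(upper, 2))
```

Under these assumptions the interval runs from about 2.47 to 2.89 persons per household.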

28
How Many People are in Middletown?
  • Make an estimate by multiplying the number of
    households by the mean number of persons you just
    calculated

29
How good is this estimate?
  • Let's construct the confidence interval
  • Use the formulas for random and systematic
    samples from your book
  • 90% probability that the true total is between
    8650.16 and 10,109.84
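
These bounds can be approximately reproduced by scaling the interval for the mean up by the number of households. The sketch assumes N = 3500 households (from the later sample-size slide) and a finite population correction of √(1 - n/N); the book's exact formula is not in the transcript.

```python
import math

N = 3500        # households in Middletown (assumed, from the later slide)
n = 250         # households sampled
mean = 2.68     # mean persons per household
variance = 4.3  # sample variance
z = 1.65        # 90% confidence

total = N * mean  # point estimate of the population

# Standard error of the mean with the assumed correction, scaled by N.
standard_error = math.sqrt(variance / n) * math.sqrt(1 - n / N)
margin_total = N * z * standard_error

lower_total = total - margin_total
upper_total = total + margin_total
print(round(total), round(lower_total, 2), round(upper_total, 2))
```

The point estimate is 9380 persons, and the bounds agree with the slide's 8650.16 and 10,109.84 to within a cent or so of rounding.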

30
Children
  • How many households have children?
  • Of the 250 households surveyed, 105 have children
  • Our best estimate of the proportion is .42
  • Once again, we want to construct a CI--use
    formulas for random and systematic samples

31
  • Plug in the values
  • The probability that the true proportion of
    households with children is between 0.37 and
    0.47 is 90%
  • This translates into numbers of households
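
As a sketch of the calculation, the code below uses the common form √(p(1-p)/n) for the standard error of a proportion, with an assumed finite population correction of √(1 - n/N) and an assumed N = 3500 households from the later sample-size slide. The book's formula may differ in detail.

```python
import math

N = 3500   # households in Middletown (assumed, from the later slide)
n = 250    # households surveyed
with_children = 105
z = 1.65   # 90% confidence

p = with_children / n  # 0.42
standard_error = math.sqrt(p * (1 - p) / n) * math.sqrt(1 - n / N)
margin = z * standard_error

lower, upper = p - margin, p + margin
print(round(lower, 2), round(upper, 2))   # 0.37 0.47

# Converting the proportions into household counts.
print(round(N * lower), round(N * upper))
```

Multiplying the bounds by N gives a range of roughly 1300 to 1650 households with children.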

32
How large should my n be?
  • There are a number of methods for determining
    what size n you should do for your research
  • We will go over a few of the more standard ways
  • This information is important because you can
    save yourself time and money if you know how to
    predict the sample size you need

33
Think before you sample
  • What kind of sample
  • What population parameter are you estimating
    (mean, total, proportion)
  • What level of precision (width of your confidence
    interval that can be tolerated)
  • Level of confidence to be obtained from sample
  • Types of statistical tests you will use (more
    important later)

34
Determining Sample Size
  • Use these formulas when you want to determine
    sample size for a confidence interval around a
    mean

35
Determining Sample Size
  • A sample of what size would be needed to estimate
    a population mean within six units with 98%
    confidence if the population has a standard
    deviation of 16?

36
Determining Sample Size
  • Plug in the values
  • Round up to 39
  • Most of the time we don't know the standard
    deviation
  • Do a pilot of n = 30
  • Use this and then take additional sample units
  • This is two-stage sampling
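
The sample-size formula itself is not in the transcript, so the sketch below assumes the usual form n = (zσ/E)², with z = 2.33 as the table value for 98% confidence. The function name is made up for this illustration.

```python
import math

def sample_size_for_mean(z, sigma, error):
    """Smallest n with z * sigma / sqrt(n) <= error, i.e. (z*sigma/error)^2 rounded up."""
    return math.ceil((z * sigma / error) ** 2)

# Within six units, 98% confidence, standard deviation 16.
n = sample_size_for_mean(z=2.33, sigma=16, error=6)
print(n)  # 39
```

The unrounded value is about 38.6, and sample sizes are always rounded up, which gives the 39 quoted on the slide.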

37
  • Suppose you want to estimate the total
    population of Middletown to within 1000 persons
  • How many households would you need to sample?
  • N = 3500, z = 1.65, s = 2.05
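
The slide's formula and answer did not survive in the transcript. One common form for a total, assumed here, is n = (N·z·s/E)², which ignores the finite population correction; the book may use a different version, so treat the result as illustrative only.

```python
import math

def sample_size_for_total(N, z, s, error):
    """Assumed formula: n = (N * z * s / error)^2, rounded up, no finite population correction."""
    return math.ceil((N * z * s / error) ** 2)

# Values from the slide: N = 3500 households, z = 1.65, s = 2.05, within 1000 persons.
n = sample_size_for_total(N=3500, z=1.65, s=2.05, error=1000)
print(n)
```

Under this assumed formula the requirement works out to 141 households.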

38
Sample Size using proportion
  • A pilot study suggests that 36% of households
    contain one or more children
  • We want to know how many households to sample

39
A Political Poll
  • The Rivera Opinion Poll Company wishes to
    determine the percentage of eligible voters who
    would approve the levying of additional property
    taxes to finance public schools in Milwaukee
  • The probability should be about 99.7% that the
    percentage estimated would be within plus or minus 3%.
  • How large a sample do we need?

40
  • The most conservative assumption for p and 1-p is
    .50 for each (it's a mathematical property: p(1-p)
    is largest when p = .50)
  • We need to sample 2500 people for this level of
    accuracy

41
  • If you keep plus or minus 3%
  • And change the probability to 95% (z = 1.96)

42
Estimation in Sampling
  • What a nice thing to know how to make these
    estimates
  • They will save you (if you do graduate research)
    time and money
  • They will save your employer time and money (this
    will make you look good)