Title: Sampling Variability and Sampling Distributions
Chapter 8: Sampling Variability and Sampling Distributions
- Inferential statistics uses information from a sample to reach conclusions about one or more characteristics of the population.
- For example, to learn the true mean fat content of hamburgers marketed by a fast-food chain, a sample of 50 hamburgers was obtained, and the mean fat content of the sample was 28.4 grams.
- How close is the sample mean to the population mean?
8.1 Statistics and Sampling Variability
- Can we estimate the population mean µ using a sample mean x̄?
- It is quite unlikely that x̄ would be exactly µ.
- Also, the values of x̄ for different samples usually differ from one another.
- This sample-to-sample variability makes it challenging to generalize from a sample to the population from which it is selected.
Example: Constructing a Sampling Distribution
- Consider a small population consisting of the board of directors of a day care center. Let x = the number of children for each board member.

Member       Jay  Carol  Allison  Teresa  Lygia  Bob  Roxy  Kyle
No. of kids  2    2      0        0       2      2    0     3

- What is the average number of children for the entire group of eight (the population)?
- What are the sample means when a sample of size 2 is selected?
Answer: 1. µ = 1.38. 2. See next slide.
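The construction can be sketched in a few lines of Python. This is a minimal sketch, assuming samples of size 2 are drawn without replacement and that every unordered pair of board members is equally likely; the helper name `sampling_distribution` is made up for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Number of children for each of the eight board members (from the example).
kids = {"Jay": 2, "Carol": 2, "Allison": 0, "Teresa": 0,
        "Lygia": 2, "Bob": 2, "Roxy": 0, "Kyle": 3}

def sampling_distribution(values, n):
    """Exact distribution of the sample mean over all unordered samples of size n.

    Assumption: sampling without replacement, every sample equally likely.
    """
    means = [Fraction(sum(s), n) for s in combinations(values, n)]
    counts = Counter(means)
    return {m: Fraction(c, len(means)) for m, c in sorted(counts.items())}

mu = Fraction(sum(kids.values()), len(kids))          # population mean, 11/8
dist = sampling_distribution(list(kids.values()), 2)  # built from 28 possible samples

for mean, prob in dist.items():
    print(f"x-bar = {float(mean):.1f}   probability = {prob}")

# The mean of the sampling distribution equals the population mean.
mean_of_means = sum(m * p for m, p in dist.items())
print(float(mu), float(mean_of_means))
```

Exact fractions are used so that the probabilities add to 1 and the mean of the sample means matches µ without rounding error.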
Sampling Distribution of Sample Mean
- What is the probability of observing a sample mean that is within 0.5 of µ = 1.38? This probability can be read from the table on the previous slide.
- The sampling distribution of x̄, displayed as a density histogram or in tabular form, provides important information about the behavior of the statistic.
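One way to check such a probability is to count the samples directly. The sketch below assumes samples of size 2 drawn without replacement from the eight board members, all equally likely; under a different sampling scheme (for example, with replacement) the number would differ.

```python
from itertools import combinations

# Children per board member, in the order listed in the example.
kids = [2, 2, 0, 0, 2, 2, 0, 3]
mu = sum(kids) / len(kids)  # 1.375, reported as 1.38 in the example

# Assumption: samples of size 2 drawn without replacement, all equally likely.
sample_means = [(a + b) / 2 for a, b in combinations(kids, 2)]

# Keep the sample means that lie within 0.5 of the population mean.
within = [m for m in sample_means if abs(m - mu) <= 0.5]
prob = len(within) / len(sample_means)
print(f"{len(within)} of {len(sample_means)} samples -> P = {prob:.3f}")
```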
Definitions
- Any quantity computed from values in a sample is called a statistic.
- The observed value of a statistic depends on the particular sample selected from the population; typically, it varies from sample to sample.
- This variability is called sampling variability.
Sampling Distribution
- To estimate the population mean µ we must use the sample mean.
- There are many, many different possible random samples of size n.
- The population of samples consists of all possible samples of a given size n.
- The distribution formed by considering the value of a sample statistic for every possible different sample of size n from the population is called its sampling distribution.
8.2 The Sampling Distribution of a Sample Mean
- Example (Blood Platelet Size): The distribution of platelet size for patients with noncardiac chest pain is approximately normal with mean µ = 8.25 and standard deviation σ = 0.75.
- Select 500 random samples from this normal distribution, with each sample consisting of n = 5, 10, 20, and 30 observations. The resulting histograms of the x̄ values are displayed on the next slide.
- What do you notice about (1) the shape of the histograms? (2) the center of each histogram? And (3) their spread relative to one another?
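The experiment can be imitated with a short simulation. This is a rough sketch using only the Python standard library; the seed and the function name `simulate_sample_means` are arbitrary choices, and the numbers it prints will not match the original histograms exactly.

```python
import random
import statistics

MU, SIGMA = 8.25, 0.75      # population mean and standard deviation from the example
NUM_SAMPLES = 500           # number of random samples per sample size

rng = random.Random(8)      # arbitrary seed, only to make the run repeatable

def simulate_sample_means(n, num_samples=NUM_SAMPLES):
    """Draw num_samples samples of size n and return their sample means."""
    return [statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(n))
            for _ in range(num_samples)]

results = {}
for n in (5, 10, 20, 30):
    means = simulate_sample_means(n)
    # Center and spread of the 500 simulated x-bar values.
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    center, spread = results[n]
    print(f"n = {n:2d}: center = {center:.3f}, spread = {spread:.3f}")
```

The centers should all sit near 8.25, while the spread should shrink as n grows.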
When the population is normally distributed, the sampling distribution of x̄ is normal regardless of the sample size.
What can we notice from the histograms?
- To a reasonable approximation, each of the four histograms looks like a normal curve.
- Each histogram is centered approximately at 8.25, the mean of the population being sampled.
- The smaller the value of n, the greater the extent to which the sampling distribution spreads out about the population mean value. (As n increases, the histogram becomes narrower.)
- The sample mean x̄ based on a large sample size n will tend to be closer to µ than will x̄ based on a small n.
Example: Length of Overtime Period in Hockey
- In hockey, the overtime period ends as soon as
one of the teams scores a goal. 251 NHL play-off
games between 1970 and 1993 went into overtime.
The following histogram displays the distribution
of the length of the overtime period for all the
251 games (which we consider as a population).
Even when the population is not normal, the sampling distribution of x̄ is approximately normal if the sample size is large enough.
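A small simulation can illustrate this behavior. The sketch below does not use the hockey data; it assumes a hypothetical right-skewed population (exponential with mean 10) and tracks the skewness of the simulated sample means, which should move toward 0 as n increases.

```python
import random
import statistics

rng = random.Random(13)  # arbitrary seed for repeatability

def skewness(values):
    """Sample skewness: the average cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def sample_means(n, num_samples=2000):
    """Means of num_samples samples of size n from a right-skewed population.

    Assumption: an exponential population with mean 10, used only as a
    hypothetical skewed stand-in; it is not the hockey data.
    """
    return [statistics.fmean(rng.expovariate(1 / 10) for _ in range(n))
            for _ in range(num_samples)]

skew_by_n = {n: skewness(sample_means(n)) for n in (2, 10, 50)}
for n, skew in skew_by_n.items():
    print(f"n = {n:2d}: skewness of the sample means = {skew:.2f}")
```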
Properties of the Sampling Distribution
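The examples that follow rely on the standard facts about x̄ for random samples of size n from a population with mean µ and standard deviation σ. A compact statement, written in LaTeX with the same symbols, and assuming the sample is a small fraction of the population for the standard deviation formula:

```latex
\begin{align*}
\mu_{\bar{x}} &= \mu \\
\sigma_{\bar{x}} &= \frac{\sigma}{\sqrt{n}} \\
z &= \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}
   = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}
\end{align*}
```

When the population is normal, x̄ is normal for any n; otherwise x̄ is approximately normal when n is large, with n ≥ 30 as the usual rule of thumb.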
Example: Bluethroat Song Duration
- Male bluethroats have a complex song, which is thought to be used to attract female birds. Let x denote the duration of a randomly selected song (in seconds) for a male bluethroat. Suppose the mean value of song duration is µ = 13.8 sec and that the standard deviation of song duration is σ = 11.8 sec.
- Find the mean and standard deviation of the sampling distribution of x̄ for samples of size n = 25.
- Can we assume that the sampling distribution of x̄ is normally distributed?
Solution on next slide
Solution of Bluethroat Song Duration
- The sampling distribution of x̄ has mean value µ_x̄ = µ = 13.8 sec.
- The standard deviation of x̄ is σ_x̄ = σ/√n = 11.8/√25 = 2.36 sec.
- The sampling distribution of x̄ is not necessarily normal because we do not know if the population distribution of x is normal, and also the sample size is less than 30.
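The arithmetic can be checked with a few lines of Python. This is only a sketch; the variable names are illustrative, and the n ≥ 30 threshold is the rule of thumb mentioned in the solution.

```python
import math

mu, sigma, n = 13.8, 11.8, 25        # values given in the example

mean_of_xbar = mu                    # mean of the sampling distribution of x-bar
sd_of_xbar = sigma / math.sqrt(n)    # standard deviation of x-bar
large_sample = n >= 30               # rule-of-thumb check for approximate normality

print(f"mean of x-bar = {mean_of_xbar} sec")
print(f"sd of x-bar   = {sd_of_xbar:.2f} sec")
print(f"n >= 30?        {large_sample}")
```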
Example: Soda Volumes
- A soft-drink bottler claims that, on average, cans contain 12 oz of soda. Let x denote the actual volume of soda in a randomly selected can. Suppose that x is normally distributed with σ = 0.16 oz. Sixteen cans are to be selected, and the soda volume will be determined for each one.
- Find the probability that a randomly selected can contains at most 11.9 oz.
- Find the probability that the mean soda volume of a sample of size 16 is at most 11.9 oz.
- Find the probability that the sample mean soda volume is between 11.96 oz and 12.08 oz.
Solution on next slide
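Solution of Soda Volumes Example
- Let x denote the actual volume of soda in a randomly selected can; then x̄ represents the mean soda volume of the sample.
- Given µ = 12 oz, σ = 0.16 oz and n = 16. Therefore, µ_x̄ = µ = 12 oz and σ_x̄ = σ/√n = 0.16/√16 = 0.04 oz.
- The probability that the soda volume of a randomly selected can is at most 11.9 oz is P(x ≤ 11.9) = P(z ≤ (11.9 - 12)/0.16) = P(z ≤ -0.625) ≈ 0.266.
- The probability that the sample mean soda volume is at most 11.9 oz is P(x̄ ≤ 11.9) = P(z ≤ (11.9 - 12)/0.04) = P(z ≤ -2.5) ≈ 0.0062.
- The third probability is left as an exercise.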
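These normal probabilities can also be computed directly rather than looked up in a z table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 12.0, 0.16, 16                  # values given in the example

single_can = NormalDist(mu, sigma)             # distribution of x
sample_mean = NormalDist(mu, sigma / sqrt(n))  # distribution of x-bar (sd = 0.04)

p_can = single_can.cdf(11.9)                   # P(x <= 11.9)
p_mean = sample_mean.cdf(11.9)                 # P(x-bar <= 11.9)
# P(11.96 <= x-bar <= 12.08) as a difference of two cumulative probabilities
p_between = sample_mean.cdf(12.08) - sample_mean.cdf(11.96)

print(f"P(x <= 11.9)               = {p_can:.4f}")
print(f"P(x-bar <= 11.9)           = {p_mean:.4f}")
print(f"P(11.96 <= x-bar <= 12.08) = {p_between:.4f}")
```

The same shortfall of 0.1 oz is far less likely for the mean of 16 cans than for a single can, because x̄ has a standard deviation of 0.04 oz rather than 0.16 oz.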
Exercise: Fat Content of Hot Dogs
- A hot dog manufacturer asserts that one of its brands of hot dogs has an average fat content of µ = 18 g per hot dog. Let x denote the fat content of a randomly selected hot dog and suppose that σ, the standard deviation of the x distribution, is 1. (a) What is the probability that a randomly selected hot dog has a fat content of more than 18.4 g? (b) What is the probability that a sample of 36 hot dogs has an average fat content of more than 18.4 g? (c) Does the result cast substantial doubt on the manufacturer's claim?
Answer: (a) P(x > 18.4) ≈ .34 (b) P(x̄ > 18.4) ≈ .0082 (c) If the company's claim is correct, values of x̄ as large as 18.4 will be observed only about 0.82% of the time with the given population mean and standard deviation. The value 18.4 exceeds 18 by enough to cast substantial doubt on the manufacturer's claim.
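The two probabilities in the answer can be reproduced as follows. This sketch assumes that x is normally distributed for part (a), which the exercise does not state explicitly; for part (b), n = 36 is large enough for x̄ to be treated as approximately normal.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 18.0, 1.0, 36    # claimed mean, standard deviation, sample size

# Assumption: x is treated as normal for part (a); for part (b) the sample
# size of 36 makes x-bar approximately normal with sd = sigma / sqrt(n).
p_single = 1 - NormalDist(mu, sigma).cdf(18.4)           # P(x > 18.4)
p_mean = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(18.4)   # P(x-bar > 18.4)

print(f"(a) P(x > 18.4)     = {p_single:.4f}")
print(f"(b) P(x-bar > 18.4) = {p_mean:.4f}")
```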