Title: Topic VI: Sampling Distributions
1Topic VI Sampling Distributions
2Importance of Samples and Sampling
- Sampling is important in Statistics because
statistics which are calculated from samples are
used to make estimates about the values of
population parameters.
3Sampling Distributions
Sampling Distributions
Sampling Distribution of the Mean
Sampling Distribution of the Proportion
4Sampling Distributions
- A sampling distribution is a distribution of all
of the possible values of a statistic for a given
sample size selected from a population
5Developing a Sampling Distribution
- Assume there is a population
- Population size N = 4
- Random variable, X, is the age of individuals
- Values of X: 18, 20, 22, 24 (years) for individuals A, B, C, D
6Developing a Sampling Distribution
(continued)
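Summary Measures for the Population Distribution
- $\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$
- $\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$
[Figure: Uniform Distribution. P(x) against x = 18, 20, 22, 24 (individuals A, B, C, D)]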
7Developing a Sampling Distribution
(continued)
- Now consider all possible samples of size n = 2
- 16 possible samples (sampling with replacement)
- 16 sample means, one for each sample
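The enumeration described on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming ordered samples drawn with replacement from the four ages given earlier; the variable names are illustrative and not from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

ages = [18, 20, 22, 24]  # population values of X from the slides

# All ordered samples of size 2 drawn with replacement: 4 * 4 = 16.
samples = list(product(ages, repeat=2))

# Mean of each sample (Fraction keeps the arithmetic exact).
sample_means = [Fraction(a + b, 2) for a, b in samples]

# Relative frequency of each distinct sample mean.
counts = Counter(sample_means)
distribution = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

for mean, prob in distribution.items():
    print(f"sample mean = {float(mean):4.1f}   P = {float(prob):.4f}")
```

The printed probabilities rise towards the middle value and fall away again, which is why the distribution of sample means is no longer uniform.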
8Sampling Distribution of All Sample Means
Developing a Sampling Distribution
(continued)
[Figure: Sample Means Distribution for the 16 sample means. $P(\bar{X})$ against $\bar{X}$ = 18, 19, 20, 21, 22, 23, 24; the distribution is no longer uniform]
9Summary Measures of this Sampling Distribution
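Developing a Sampling Distribution
(continued)
- $\mu_{\bar{X}} = \frac{\sum \bar{X}_i}{16} = \frac{18 + 19 + 19 + \cdots + 24}{16} = 21$
- $\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2}{16}} = 1.58$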
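As a rough check of the summary measures for this small example, the sketch below recomputes them from the 16 sample means and compares the result with $\sigma/\sqrt{n}$. It assumes the same four ages and sampling with replacement; `pstdev` is the population (divide-by-N) standard deviation from Python's standard library.

```python
from itertools import product
from math import isclose, sqrt
from statistics import fmean, pstdev

ages = [18, 20, 22, 24]  # population values of X from the slides
n = 2                    # sample size

# Population summary measures.
mu = fmean(ages)
sigma = pstdev(ages)

# Summary measures of the sampling distribution of the mean.
sample_means = [fmean(sample) for sample in product(ages, repeat=n)]
mu_xbar = fmean(sample_means)
sigma_xbar = pstdev(sample_means)

print(f"population:      mean = {mu:.2f}, std dev = {sigma:.3f}")
print(f"sample means:    mean = {mu_xbar:.2f}, std dev = {sigma_xbar:.3f}")
print(f"sigma / sqrt(n): {sigma / sqrt(n):.3f}")
```

The standard deviation of the sample means agrees with $\sigma/\sqrt{n}$, which is the standard error of the mean.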
10Comparing the Population with its Sampling
Distribution
[Figure: two panels. Population (N = 4): P(X) against X = 18, 20, 22, 24 (individuals A, B, C, D). Sample Means Distribution (n = 2): $P(\bar{X})$ against $\bar{X}$ = 18, 19, 20, 21, 22, 23, 24]
11Sampling Distribution of the Mean
Sampling Distributions
Sampling Distribution of the Mean
Sampling Distribution of the Proportion
12Standard Error of the Mean
- Different samples of the same size from the same
population will yield different sample means
- A measure of the variability in the mean from
sample to sample is given by the Standard Error
of the Mean: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
- (This assumes that sampling is with replacement,
or that sampling is without replacement from an infinite
population)
- Note that the standard error of the mean
decreases as the sample size increases
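A small sketch of how the standard error shrinks with sample size. The value `sigma = 3.0` and the sample sizes are illustrative assumptions, and `standard_error` is a helper name introduced here.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

sigma = 3.0  # illustrative population standard deviation (an assumption)

for n in (4, 16, 36, 100):
    print(f"n = {n:3d}   standard error = {standard_error(sigma, n):.3f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.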
13If the Population is Normal
- If a population is normal with mean $\mu$ and
standard deviation $\sigma$, the sampling distribution
of $\bar{X}$ is also normally distributed with
- $\mu_{\bar{X}} = \mu$ and
- $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
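14Z-value for Sampling Distribution of the Mean
- Z-value for the sampling distribution of $\bar{X}$:
$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
where $\bar{X}$ = sample mean, $\mu$ = population mean,
$\sigma$ = population standard deviation, n = sample size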
15Sampling Distribution Properties
[Figure: Normal Population Distribution and Normal Sampling Distribution (has the same mean)]
16Sampling Distribution Properties
(continued)
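- As n increases, $\sigma_{\bar{X}}$ decreases
[Figure: sampling distributions of $\bar{X}$ for a smaller sample size and a larger sample size; the larger sample size gives the narrower curve]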
17If the Population is not Normal
- We can apply the Central Limit Theorem
- Even if the population is not normal,
- sample means from the population will be
approximately normal as long as the sample size
is large enough.
- Properties of the sampling distribution:
- $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
18Central Limit Theorem
- As the sample size gets large enough (n increases),
the sampling distribution becomes almost normal
regardless of shape of population
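The theorem can be illustrated with a small simulation. The sketch below is not from the slides: it assumes a strongly skewed population (exponential with mean 1 and standard deviation 1) and uses a fixed random seed, and the helper names are illustrative. It checks that the sample means centre on the population mean, that their spread is close to $\sigma/\sqrt{n}$, and that their skewness shrinks as n grows.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(0)  # fixed seed so the run is repeatable

def simulate_sample_means(n, repetitions=10_000):
    """Means of `repetitions` samples of size n from an Exp(1) population."""
    return [fmean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(repetitions)]

def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric distribution."""
    m, s = fmean(values), pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])

means = {n: simulate_sample_means(n) for n in (2, 30)}

for n, xs in means.items():
    print(f"n = {n:2d}  mean = {fmean(xs):.3f}  "
          f"std dev = {pstdev(xs):.3f}  (sigma/sqrt(n) = {1 / sqrt(n):.3f})  "
          f"skewness = {skewness(xs):.2f}")
```

Skewness is used here only as a simple numeric stand-in for "looks normal"; a histogram of each list would show the same thing visually.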
19Effect of Sample Size on Sampling Distribution
20Sampling Distributions I
- The Central Limit Theorem
- If x1, x2, x3, ..., xn is a random sample taken
from a population with mean $\mu$ and variance $\sigma^2$,
then
- $E(\bar{X}) = \mu$
- $Var(\bar{X}) = \sigma^2 / n$
- where $\bar{X}$ is the sample mean
21If the Population is not Normal
(continued)
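Sampling distribution properties
- Central Tendency: $\mu_{\bar{X}} = \mu$
- Variation: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
[Figure: Population Distribution and the Sampling Distribution (becomes normal as n increases), drawn for a smaller sample size and a larger sample size]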
22How Large is Large Enough?
- For most distributions, n > 30 will give a
sampling distribution that is nearly normal
- For fairly symmetric distributions, n > 15
- For normal population distributions, the sampling
distribution of the mean is always normally
distributed
23Example
- Suppose a population has mean $\mu$ = 8 and standard
deviation $\sigma$ = 3. Suppose a random sample of size
n = 36 is selected.
- What is the probability that the sample mean is
between 7.8 and 8.2?
24Example
(continued)
- Solution
- Even if the population is not normally
distributed, the central limit theorem can be
used (n > 30)
- so the sampling distribution of $\bar{X}$ is
approximately normal
- with mean $\mu_{\bar{X}}$ = 8
- and standard deviation $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$
25Example
(continued)
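- $P(7.8 < \bar{X} < 8.2) = P\left(\frac{7.8 - 8}{3/\sqrt{36}} < Z < \frac{8.2 - 8}{3/\sqrt{36}}\right) = P(-0.4 < Z < 0.4) = 0.1554 + 0.1554 = 0.3108$
[Figure: Population Distribution; (Sample) Sampling Distribution of $\bar{X}$ with the area between 7.8 and 8.2 shaded; (Standardize) Standard Normal Distribution with the area between Z = -0.4 and Z = 0.4 shaded, 0.1554 on each side of the mean]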
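The same probability can be checked numerically instead of with a table. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later), with the values $\mu$ = 8, $\sigma$ = 3 and n = 36 from the example; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8, 3, 36

# Standard error of the mean: sigma / sqrt(n).
standard_error = sigma / sqrt(n)

# Approximate sampling distribution of the sample mean (by the CLT).
sampling_dist = NormalDist(mu=mu, sigma=standard_error)

# P(7.8 < sample mean < 8.2)
probability = sampling_dist.cdf(8.2) - sampling_dist.cdf(7.8)

print(f"standard error = {standard_error:.2f}")
print(f"P(7.8 < mean < 8.2) = {probability:.4f}")
```

The result may differ from the table value in the fourth decimal place because the table rounds Z to two decimals.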
26Example
- For each of the following three populations,
indicate what the sampling distribution for
samples of 25 would consist of.
- Travel expense vouchers for a university in an
academic year
- Absentee records (days absent per year) in 2002
for employees in a large manufacturing company
- Yearly sales (in gallons) of unleaded gasoline at
service stations located in a particular parish
27Example
- The diameter of Ping-Pong balls manufactured at a
large factory is expected to be approximately
normally distributed with a mean of 1.30 inches
and a standard deviation of 0.04 inch. What is
the probability that a randomly selected
Ping-Pong ball will have a diameter
- (1) Less than 1.28 inches?
- (2) Between 1.31 and 1.33 inches?
- (3) Between what two values (symmetrically
distributed around the mean) will 60% of the
Ping-Pong balls fall (in terms of diameter)?
28Example (cont'd)
- If many random samples of 16 Ping-Pong balls are
selected,
- (4) What will be the values of the population mean
and standard error of the mean?
- (5) What distribution will the sample mean follow?
- (6) What proportion of the sample means will be less
than 1.28 inches?
- (7) What proportion of the sample means will be
between 1.31 and 1.33 inches?
- (8) Between what two values symmetrically distributed
around the mean will 60% of the sample means be?
- (9) Compare the answers of (1) with (6) and (2) with
(7). Discuss.
- (10) Explain the difference in the results of (3) and
(8).
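One way to explore this exercise numerically is sketched below. It assumes the stated normal population (mean 1.30, standard deviation 0.04) and samples of n = 16, and it applies the same three calculations first to single balls and then to sample means. The helper `summarize` is a name introduced here, and the code is a checking aid rather than a model answer.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.30, 0.04, 16

individual = NormalDist(mu, sigma)             # diameter of one ball
sample_mean = NormalDist(mu, sigma / sqrt(n))  # mean diameter of 16 balls

def summarize(dist):
    """Return P(X < 1.28), P(1.31 < X < 1.33) and the central 60% interval."""
    below = dist.cdf(1.28)
    between = dist.cdf(1.33) - dist.cdf(1.31)
    middle_60 = (dist.inv_cdf(0.20), dist.inv_cdf(0.80))
    return below, between, middle_60

for label, dist in (("individual balls", individual),
                    ("sample means", sample_mean)):
    below, between, (low, high) = summarize(dist)
    print(f"{label}: P(< 1.28) = {below:.4f}, "
          f"P(1.31 to 1.33) = {between:.4f}, "
          f"central 60% = ({low:.4f}, {high:.4f})")
```

The interval for the sample means is a quarter as wide as the one for individual balls, since the standard error is $\sigma/\sqrt{16} = \sigma/4$.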
29Sampling Distribution of the Proportion
Sampling Distributions
Sampling Distribution of the Mean
Sampling Distribution of the Proportion
30Population Proportions
- $\pi$ = the proportion of the population having
some characteristic
- Sample proportion (p) provides an estimate
of $\pi$:
- p = X / n = (number of items in the sample having the characteristic) / (sample size)
- $0 \le p \le 1$
- p has a binomial distribution
- (assuming sampling with replacement from a
finite population or without replacement from an
infinite population)
31Sampling Distribution of p
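- Approximated by a normal distribution if
- $n\pi \ge 5$ and $n(1 - \pi) \ge 5$
- where $\mu_p = \pi$ and $\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}$
(where p = sample proportion and $\pi$ = population proportion)
[Figure: Sampling Distribution. P(p) against p from 0 to 1]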
32Z-Value for Proportions
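Standardize p to a Z value with the formula
$Z = \frac{p - \pi}{\sigma_p} = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}}$
(where p = sample proportion and $\pi$ = population proportion)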
33Example
- If the true proportion of voters who support
Proposition A is $\pi$ = 0.4, what is the
probability that a sample of size 200 yields a
sample proportion p between 0.40 and 0.45?
- i.e. if $\pi$ = 0.4 and n = 200, what is
- $P(0.40 \le p \le 0.45)$ ?
34Example
(continued)
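- if $\pi$ = 0.4 and n = 200, what is
- $P(0.40 \le p \le 0.45)$ ?
Find $\sigma_p$: $\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}} = \sqrt{\frac{0.4(1 - 0.4)}{200}} = 0.03464$
Convert to standard normal: $P(0.40 \le p \le 0.45) = P\left(\frac{0.40 - 0.40}{0.03464} \le Z \le \frac{0.45 - 0.40}{0.03464}\right) = P(0 \le Z \le 1.44)$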
35Example
(continued)
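- if $\pi$ = 0.4 and n = 200, what is
- $P(0.40 \le p \le 0.45)$ ?
Use standard normal table: $P(0 \le Z \le 1.44) = 0.4251$
[Figure: Sampling Distribution of p with the area between 0.40 and 0.45 shaded; (Standardize) Standardized Normal Distribution with the area between Z = 0 and Z = 1.44 shaded; shaded area = 0.4251]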
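The table lookup can be cross-checked with a short calculation. This sketch uses the normal approximation with population proportion 0.4 and n = 200 from the example; `pi_pop` is an illustrative variable name for the population proportion.

```python
from math import sqrt
from statistics import NormalDist

pi_pop, n = 0.4, 200  # population proportion and sample size

# The normal approximation is reasonable when both counts are at least 5.
assert n * pi_pop >= 5 and n * (1 - pi_pop) >= 5

# Standard error of the sample proportion.
sigma_p = sqrt(pi_pop * (1 - pi_pop) / n)

# Standardize the upper limit; the lower limit 0.40 standardizes to Z = 0.
z_upper = (0.45 - pi_pop) / sigma_p

standard_normal = NormalDist()
probability = standard_normal.cdf(z_upper) - standard_normal.cdf(0.0)

print(f"sigma_p = {sigma_p:.5f}, Z = {z_upper:.2f}, P = {probability:.4f}")
```

The computed probability may differ slightly from 0.4251 because the table rounds Z to 1.44 before the lookup.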
36Example
- A recent survey has indicated that 20% of
fine-dining restaurants have instituted policies
restricting the use of cell phones. If random
samples of 100 fine-dining restaurants are
selected,
- What proportion of samples are likely to have
between 15% and 25% that have established
policies restricting use?
- Within what symmetrical limits of the population
percentage will 90% of the sample percentages
fall? 95%?
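A sketch of how this exercise could be set up numerically, assuming the normal approximation to the sampling distribution of the proportion with population proportion 0.20 and n = 100. The helper `symmetric_limits` is a name introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist

pi_pop, n = 0.20, 100  # population proportion and sample size

# Sampling distribution of the sample proportion (normal approximation).
sigma_p = sqrt(pi_pop * (1 - pi_pop) / n)
sampling_dist = NormalDist(pi_pop, sigma_p)

# Proportion of samples with a sample proportion between 15% and 25%.
prob_between = sampling_dist.cdf(0.25) - sampling_dist.cdf(0.15)

def symmetric_limits(dist, coverage):
    """Limits, symmetric about the mean, that contain `coverage` of the values."""
    tail = (1 - coverage) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)

print(f"sigma_p = {sigma_p:.3f}")
print(f"P(0.15 < p < 0.25) = {prob_between:.4f}")
for coverage in (0.90, 0.95):
    low, high = symmetric_limits(sampling_dist, coverage)
    print(f"{coverage:.0%} of sample proportions lie in ({low:.4f}, {high:.4f})")
```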
37Sampling Distributions: Means and Proportions - Review
38Difference Between 2 Means or Proportions
- In some cases we may be interested in examining
the sampling distribution of the difference
between two means or two proportions
- The main assumption is that the samples are taken
from large populations
39Difference Between 2 Means or Proportions
- The sampling distribution of the difference
between means can be thought of as the
distribution that would result if we repeated the
following three steps over and over again:
- (1) Sample n1 scores from Population 1 and n2 scores
from Population 2
- (2) Compute the means of the two samples
- (3) Compute the difference between means
- The distribution of the differences between
means is the sampling distribution of the
difference between means.
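The three steps can be simulated directly. In the sketch below, the two normal populations and their parameters are hypothetical values chosen only for illustration; the samples are assumed independent and the random seed is fixed. The simulated differences are compared with $\mu_1 - \mu_2$ and $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical populations (assumptions, not taken from the slides).
mu1, sigma1, n1 = 50.0, 10.0, 25
mu2, sigma2, n2 = 45.0, 8.0, 30

def one_difference():
    # Step 1: sample n1 scores from Population 1 and n2 from Population 2.
    sample1 = [random.gauss(mu1, sigma1) for _ in range(n1)]
    sample2 = [random.gauss(mu2, sigma2) for _ in range(n2)]
    # Steps 2 and 3: compute both means, then their difference.
    return fmean(sample1) - fmean(sample2)

# Repeat the three steps many times.
differences = [one_difference() for _ in range(10_000)]

expected_mean = mu1 - mu2
expected_se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)

print(f"simulated mean = {fmean(differences):.3f}  (expected {expected_mean:.3f})")
print(f"simulated SD   = {pstdev(differences):.3f}  (expected {expected_se:.3f})")
```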
40Difference: 2 Means and 2 Proportions
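The body of this slide did not survive extraction. For reference, the standard results for two independent samples are sketched below; the subscripted symbols are assumed notation, with $\pi_1, \pi_2$ the population proportions and $p_1, p_2$ the sample proportions.

```latex
\begin{align*}
\mu_{\bar{X}_1 - \bar{X}_2} &= \mu_1 - \mu_2, &
\sigma_{\bar{X}_1 - \bar{X}_2} &= \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \\
\mu_{p_1 - p_2} &= \pi_1 - \pi_2, &
\sigma_{p_1 - p_2} &= \sqrt{\frac{\pi_1(1 - \pi_1)}{n_1} + \frac{\pi_2(1 - \pi_2)}{n_2}}
\end{align*}
```

With large samples, each difference is approximately normally distributed, so a Z-value is formed by subtracting the mean of the difference and dividing by its standard error.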
41Sampling Distributions - Example I
- A statistics entrance examination consists of two
sections, I and II. The maximum mark that can be
achieved on each section is 50. The population
mean and standard deviation for both sections are:
-              Mean   Std. Dev
- Section I     36     5.5
- Section II    42     4.5
- If a random sample of 100 candidates' marks from
Section I and 50 candidates' marks from Section
II are taken, find the probability that the mean
mark obtained from Section I will be at least 5 marks
less than the mean mark from Section II.
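One possible way to set up this example is sketched below. It assumes the two samples are independent and that both are large enough for the difference of sample means to be treated as normal. The event "Section I mean is at least 5 marks less than Section II mean" is written as $\bar{X}_1 - \bar{X}_2 \le -5$.

```python
from math import sqrt
from statistics import NormalDist

mu1, sigma1, n1 = 36, 5.5, 100  # Section I
mu2, sigma2, n2 = 42, 4.5, 50   # Section II

# Mean and standard error of the difference (Section I minus Section II).
mean_diff = mu1 - mu2
se_diff = sqrt(sigma1**2 / n1 + sigma2**2 / n2)

diff_dist = NormalDist(mean_diff, se_diff)

# P(difference <= -5)
probability = diff_dist.cdf(-5)

print(f"mean difference = {mean_diff}, standard error = {se_diff:.4f}")
print(f"P(Section I mean at least 5 below Section II mean) = {probability:.4f}")
```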
42Sampling Distributions - Example II
- It is estimated that 65% of the people who live
in Portmore are comfortable in their homes. The
corresponding figure for the people of Duhaney
Park is 70%. A random sample of 50 is taken from
each of these two communities. Determine the
probability that the sample suggests that the
proportion in Portmore who are comfortable with
their homes is greater than that of Duhaney Park.
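A sketch of one way to approach this example, assuming independent samples and the normal approximation to the difference of two sample proportions. The variable names are illustrative, and the event of interest is written as $p_1 - p_2 > 0$ with Portmore as community 1.

```python
from math import sqrt
from statistics import NormalDist

pi1, n1 = 0.65, 50  # Portmore
pi2, n2 = 0.70, 50  # Duhaney Park

# Mean and standard error of the difference in sample proportions.
mean_diff = pi1 - pi2
se_diff = sqrt(pi1 * (1 - pi1) / n1 + pi2 * (1 - pi2) / n2)

diff_dist = NormalDist(mean_diff, se_diff)

# P(p1 - p2 > 0): the Portmore sample proportion exceeds Duhaney Park's.
probability = 1 - diff_dist.cdf(0)

print(f"mean difference = {mean_diff:.2f}, standard error = {se_diff:.4f}")
print(f"P(Portmore sample proportion is greater) = {probability:.4f}")
```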