Title: Sampling Distributions
Chapter 6
Those who jump off a bridge in Paris are in Seine. A backward poet writes inverse. A man's home is his castle, in a manor of speaking.
Sampling
- The Need
- Get information about a population without checking the entire population
- Advantages
- Cost
- Time
- Accuracy (can be achieved with low cost)
- Destruction is sometimes involved, so checking all is not possible.
- Insert Excel Simulation here
Distribution of Means
Visual Mean of Means
Distribution of Sample Means
- Many different sample means are possible.
- The sample means cluster closer to the population mean than the population values do.
- The larger the sample, the closer they cluster around the population mean.
- Therefore the likelihood of a single sample mean being close to the true mean is high.
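The clustering described in these bullets can be checked with a short simulation. This is a minimal sketch, not part of the original slides: the population (normal with mean 15 and standard deviation 3), the sample size of 9, the seed, and the helper name `sample_means` are all assumptions chosen for illustration.

```python
import random
import statistics

def sample_means(mu, sigma, n, reps, rng):
    """Draw `reps` samples of size n from Normal(mu, sigma); return their means."""
    return [statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
            for _ in range(reps)]

rng = random.Random(1)  # fixed seed so the run is repeatable

# 2000 individual population values versus 2000 means of samples of size 9
population_values = [rng.gauss(15, 3) for _ in range(2000)]
means_of_9 = sample_means(15, 3, n=9, reps=2000, rng=rng)

spread_values = statistics.stdev(population_values)
spread_means = statistics.stdev(means_of_9)
print(f"spread of individual values:  {spread_values:.2f}")
print(f"spread of sample means (n=9): {spread_means:.2f}")
```

The second spread should come out near one third of the first, since the sample means vary less than the individual values do.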
Distribution of Sample Means
- When trying to use a sample to estimate a population mean, we know we won't get the exact value.
- We want some way of managing the error so as to be as close as we need to be.
- We can decide on a margin of error that we are willing to accept (for polls, typically 2 to 4 percentage points).
- We cannot eliminate the possibility of getting a value outside that range, but we can keep it small by adjusting the sample size.
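One way to make "adjusting the sample size" concrete is to ask how large n must be for the sample mean to fall within a chosen margin of the true mean with a chosen probability. The sketch below assumes the sample mean is normally distributed and that the population standard deviation is known; the function name `required_sample_size` and the numbers used (standard deviation 3, margin 1, probability 0.95) are illustrative assumptions, not values prescribed by the slides.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n such that the sample mean lands within `margin` of the
    true mean with probability >= `confidence`, assuming the sample mean
    is Normal(mu, sigma / sqrt(n))."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return math.ceil((z * sigma / margin) ** 2)

n_needed = required_sample_size(sigma=3, margin=1, confidence=0.95)
print(n_needed)
```

Halving the margin roughly quadruples the required sample size, because n grows with the square of 1/margin.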
How Close Can We Get?
- The variance of the sample mean is the population variance divided by n (the sample size).
- Thus larger values of n bring smaller variances.
- Let's look at an example. In order to understand the process, we will assume we actually know the true mean and variance. Each of the following graphs is from a computer simulation of taking 100 samples from a normal population with µ = 15 and σ = 3, but with different sample sizes.
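The first bullet follows from the independence of the observations in a random sample. A short derivation, writing X_1, ..., X_n for the n independent observations (notation assumed here, not taken from the slides):

```latex
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{n\sigma^{2}}{n^{2}}
  = \frac{\sigma^{2}}{n},
\qquad
\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}.
```

With a population standard deviation of 3, this gives a standard deviation of the sample mean of 3, 1.5, 1, 0.75, 0.6, and 0.5 for n = 1, 4, 9, 16, 25, and 36.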
µ = 15, σ = 3: number of sample means (out of 100) observed in [14, 16]
- Sample size 1: 30
- Sample size 4: 52
- Sample size 9: 74
- Sample size 16: 81
- Sample size 25: 90
- Sample size 36: 97
[Figure: Number in [14, 16] vs. Sample Size]
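The observed counts can be compared with what the normal model predicts, and the experiment can be rerun. The following is a sketch under the stated population values (mean 15, standard deviation 3); the seed and the helper names `theoretical_prob` and `simulated_count` are assumptions, and a rerun will not reproduce the slide counts exactly because the original random draws are unknown.

```python
import random
from statistics import NormalDist, fmean

MU, SIGMA = 15, 3            # population parameters from the slides
LOW, HIGH = 14, 16           # interval of interest
SIZES = [1, 4, 9, 16, 25, 36]

def theoretical_prob(n):
    """P(LOW < sample mean < HIGH) when the mean is Normal(MU, SIGMA/sqrt(n))."""
    dist = NormalDist(MU, SIGMA / n ** 0.5)
    return dist.cdf(HIGH) - dist.cdf(LOW)

def simulated_count(n, reps, rng):
    """Number of sample means (out of `reps`) that land in [LOW, HIGH]."""
    count = 0
    for _ in range(reps):
        xbar = fmean([rng.gauss(MU, SIGMA) for _ in range(n)])
        if LOW <= xbar <= HIGH:
            count += 1
    return count

rng = random.Random(2024)    # arbitrary seed
for n in SIZES:
    expected = 100 * theoretical_prob(n)
    print(f"n={n:2d}  expected per 100: {expected:5.1f}  "
          f"simulated: {simulated_count(n, 100, rng)}")
```

The expected counts per 100 samples are roughly 26, 50, 68, 82, 90, and 95, which is in line with the observed 30, 52, 74, 81, 90, and 97.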
So What?
- In real life, we don't know the true mean and variance. We want to estimate them.
- Furthermore, we will only take one sample, which represents just one data point from the distributions we have illustrated.
- We will probably NEVER know where in the distribution that data point is coming from.
- Under these conditions, how can we provide an estimate that is trustworthy?
- Clearly, the sample size directly affects the likelihood that the sample mean will be close to the true mean.
Which one would you like to pick from?
- The situation: You have 100 balls in an urn (left). Each has an odd number on it, from 7 to 25, but you don't know how many of each there are. You will draw one ball and record its number. If this number matches the mean of the distribution, your company will make lots of money and you will get a promotion. However, you have the opportunity, for a sizable fee, to trade in the urn for the one on the right. If you do so, and are wrong, you will be fired because of the excessive expense you incurred.
Does the name Pavlov ring a bell? Reading while sunbathing makes you well red. When two egotists meet, it's an I for an I.
- Notes
- 1. x̄: the sample mean.
- 2. σ_x̄: the standard deviation of the sample means.
- 3. The theory involved with sampling distributions described in the remainder of this chapter requires random sampling.
- Random Sample: A sample obtained in such a way that each possible sample of a fixed size n has an equal probability of being selected.
- (Example: Every possible handful of size n has the same probability of being selected.)
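As an illustration of the definition, Python's standard `random.sample` draws without replacement in such a way that every subset of the requested size is equally likely. The population of ten labelled items and the sample size of 3 below are hypothetical, chosen only to keep the count of possible samples small.

```python
import random
from itertools import combinations
from math import comb

population = list(range(1, 11))   # hypothetical population: items labelled 1..10
n = 3                             # fixed sample size

rng = random.Random(7)            # arbitrary seed
sample = rng.sample(population, n)   # simple random sample, no repeats

# Every distinct "handful" of size n is one of these equally likely outcomes
all_samples = list(combinations(population, n))
num_possible = comb(len(population), n)

print("sample drawn:", sample)
print("possible samples:", num_possible)
print("probability of each:", 1 / num_possible)
```

With 10 items and n = 3 there are 120 possible samples, so each has probability 1/120 of being the one selected.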
The Central Limit Theorem
- The most important idea in all of statistics.
- Describes the sampling distribution of the sample mean.
- Examples suggest the sample mean (and sample total) tend to be normally distributed.
- Distribution of Sample Means
- If all possible random samples of a particular size n are taken from any population with a mean µ and a standard deviation σ, the distribution of sample means will
- 1. have a mean equal to µ.
- 2. have a standard deviation equal to σ/√n.
- Further, if the sampled population has a normal distribution, then the sampling distribution of x̄ will also be normal for samples of all sizes.
- Central Limit Theorem
- The distribution of sample means will come closer to normal as the sample size increases.
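These statements can be explored numerically with a population that is clearly not normal. The sketch below assumes an exponential population with mean 1 and standard deviation 1 (a choice made here for illustration, not one from the slides) and uses skewness as a rough measure of how far the distribution of sample means is from the symmetric normal shape; the helper names are hypothetical.

```python
import random
from statistics import fmean, pstdev

def skewness(values):
    """Average cubed z-score; 0 for a perfectly symmetric distribution."""
    m, s = fmean(values), pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])

def means_from_exponential(n, reps, rng):
    """Means of `reps` samples of size n from an exponential population (mean 1)."""
    return [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

rng = random.Random(42)           # arbitrary seed
results = {}
for n in (2, 10, 30):
    means = means_from_exponential(n, 5000, rng)
    results[n] = (fmean(means), pstdev(means), skewness(means))
    mean_of_means, sd_of_means, skew = results[n]
    print(f"n={n:2d}  mean={mean_of_means:.3f}  sd={sd_of_means:.3f}  "
          f"(sigma/sqrt(n)={1 / n ** 0.5:.3f})  skewness={skew:.2f}")
```

The mean of the sample means should stay near 1, their standard deviation should track 1/√n, and the skewness should shrink toward 0 as n grows.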
- Graphical Illustration of the Central Limit Theorem
[Figure panels: Original Population; Distribution of x̄, n = 2; Distribution of x̄, n = 10; Distribution of x̄, n = 30]
- Example: Consider a normal population with µ = 50 and σ = 15. Suppose a sample of size 9 is selected at random. Find:
- 1.
- 2.
- Solution
- Since the original population is normal, the distribution of the sample mean is also (exactly) normal, with mean 50 and standard deviation 15/√9 = 5.
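The setup of this example can be sketched in code. The population values and sample size come from the example; the two events evaluated below are hypothetical, chosen only to show the mechanics, since the quantities the slide asks for did not survive in the transcript.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50, 15, 9                # values stated in the example
std_error = sigma / sqrt(n)             # 15 / 3 = 5
xbar_dist = NormalDist(mu, std_error)   # exact, because the population is normal

# Hypothetical events, used only to illustrate the calculation
p_between = xbar_dist.cdf(60) - xbar_dist.cdf(45)   # P(45 < sample mean < 60)
p_below = xbar_dist.cdf(47.5)                       # P(sample mean <= 47.5)

print(f"standard deviation of the sample mean: {std_error}")
print(f"P(45 < mean < 60) = {p_between:.4f}")
print(f"P(mean <= 47.5)   = {p_below:.4f}")
```

Each probability is found by converting the endpoint to a z-score with the standard deviation of the sample mean (5), not the population standard deviation (15).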
- Example: A report stated that the day-care cost per week in Boston is $109. Suppose we accept this as the true (population) mean cost per week, and also know that the standard deviation is $20.
- 1. Find the probability that a sample of 50 day-care centers would show a mean cost of $105 or less per week.
- 2. Suppose the sample of 50 day-care centers results in a sample mean of $120. Does this provide evidence to refute the claim that the true mean is $109?
- Solution
- The shape of the original distribution is unknown, but the sample size, n = 50, is large, so the CLT applies.
- The distribution of x̄ is approximately normal.
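Part 1 can be worked numerically as follows. This is a sketch that relies on the CLT approximation stated in the solution; the variable names are arbitrary.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 109, 20, 50              # claimed mean, standard deviation, sample size
std_error = sigma / sqrt(n)             # standard deviation of the sample mean, about 2.83
xbar_dist = NormalDist(mu, std_error)   # approximate, justified by the CLT

z_105 = (105 - mu) / std_error          # about -1.41
p_105_or_less = xbar_dist.cdf(105)      # P(sample mean <= 105)

print(f"standard error = {std_error:.3f}")
print(f"z = {z_105:.2f}")
print(f"P(mean <= 105) = {p_105_or_less:.4f}")
```

Under the claim, a sample mean of 105 or less would occur in roughly 8% of samples of 50 centers.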
- To investigate the claim, we need to examine how likely a sample mean of $120 is, if the claim is true.
- Consider how far out in the tail of the sample mean distribution the value 120 is found.
- Compute the tail probability.
- Since the tail probability is so small, the observation of 120 would be very rare if the mean cost were really $109.
- There is evidence to suggest the claim of µ = 109 is wrong.
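The tail probability in this argument can be computed directly. The sketch below uses the figures from the day-care example (claimed mean 109, standard deviation 20, n = 50) and the normal approximation for the sample mean; the variable names are arbitrary.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 109, 20, 50
std_error = sigma / sqrt(n)                       # about 2.83

z_120 = (120 - mu) / std_error                    # how many standard errors above the claim
p_tail = 1 - NormalDist(mu, std_error).cdf(120)   # P(sample mean >= 120) if the claim holds

print(f"z = {z_120:.2f}")
print(f"P(mean >= 120) = {p_tail:.6f}")
```

A sample mean nearly four standard errors above the claimed value has a tail probability on the order of 5 in 100,000, which is why the observation counts as evidence against the claim.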
In democracy your vote counts. In feudalism your count votes. She was engaged to a boyfriend with a wooden leg but broke it off. A chicken crossing the road is poultry in motion.