Probability and Sampling Distributions

Slides: 59

Provided by: melanie58

Transcript and Presenter's Notes

1
Chapter 4

Probability and Sampling Distributions

2
Random Variable

Definition A random variable is a variable whose
value is a numerical outcome of a random
phenomenon.
The statistic calculated from a randomly chosen
sample is an example of a random variable.
We don't know the exact outcome beforehand.
A statistic from a random sample will take
different values if we take more samples from the
same population.

3
Section 4.4

The Sampling Distribution of a Sample Mean

4
Introduction

A statistic from a random sample will take
different values if we take more samples from the
same population
The values of a statistic do not vary haphazardly
from sample to sample but have a regular pattern
in many samples
We already saw the sampling distribution
We're going to discuss an important sampling
distribution: the sampling distribution of the
sample mean, x-bar

5
Example

Suppose that we are interested in the workout
times of ISU students at the Recreation center.
Let's assume that µ is the average workout time
of all ISU students
To estimate µ, let's take a simple random sample of
100 students at ISU
We will record each student's workout time (x)
Then we find the average workout time for the 100
students
The population mean µ is the parameter of
interest.
The sample mean, x-bar, is the statistic (which is
a random variable).
Use x-bar to estimate µ (This seems like a sensible
thing to do).

6
Example

A SRS should be a fairly good representation of
the population, so x-bar should be somewhere
near µ.
x-bar from a SRS is an unbiased estimate of µ due
to the randomization
We don't expect x-bar to be exactly equal to µ
There is variability in x-bar from sample to
sample
If we take another simple random sample (SRS) of
100 students, then x-bar will probably be
different.
Why, then, can I use the results of one sample to
estimate µ?

7
Statistical Estimation

If x-bar is rarely exactly right and varies from
sample to sample, why is x-bar a reasonable
estimate of the population mean µ?
Answer: if we keep on taking larger and larger
samples, the statistic x-bar is guaranteed to get
closer and closer to the parameter µ
We have the comfort of knowing that if we can
afford to keep on measuring more subjects,
eventually we will estimate the mean amount of
workout time for ISU students very accurately

8
The Law of Large Numbers

Law of Large Numbers (LLN)
Draw independent observations at random from any
population with finite mean µ
As the number of observations drawn increases,
the mean x-bar of the observed values gets closer
and closer to the mean µ of the population
If n is the sample size, then x-bar → µ as n gets large
The Law of Large Numbers holds for any
population, not just for special classes such as
Normal distributions

9
Example

Suppose we have a bowl with 21 small pieces of
paper inside. Each paper is labeled with a number
0-20. We will draw several random samples out of
the bowl of size n and record the sample means,
x-bar for each sample.
What is the population?
Since we know the values for each individual in
the population (i.e. for each paper in the bowl),
we can actually calculate the value of µ, the
true population mean: µ = 10
Draw a random sample of size n = 1.
Calculate x-bar for this sample.

10
Example

Draw a second random sample of size n = 5.
Calculate x-bar for this sample.
Draw a third random sample of size n = 10.
Calculate x-bar for this sample.
Draw a fourth random sample of size n = 15.
Calculate x-bar for this sample.
Draw a fifth random sample of size n = 20.
Calculate x-bar for this sample.
What can we conclude about the value of x-bar as
the sample size increases?
THIS IS CALLED THE LAW OF LARGE NUMBERS.
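
The bowl experiment can be tried on a computer. The sketch below assumes the slips are drawn without replacement (the slides do not say) and uses an arbitrary fixed seed so the run is repeatable; the name sample_mean is just an illustrative helper.

```python
import random

# The bowl from the example: 21 slips of paper labeled 0 through 20.
population = list(range(21))
mu = sum(population) / len(population)  # true population mean

rng = random.Random(0)  # arbitrary fixed seed, an assumption for repeatability

def sample_mean(n):
    """Draw n slips without replacement and return their mean."""
    draw = rng.sample(population, n)
    return sum(draw) / n

# Same sample sizes as in the example above.
for n in (1, 5, 10, 15, 20):
    print(f"n = {n:2d}  x-bar = {sample_mean(n):.2f}  (mu = {mu})")
```

With n = 20 only one slip is left in the bowl, so x-bar can be off from µ by at most 0.5; with n = 1 it can land anywhere from 0 to 20.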

11
Another Example

Example: Suppose we know that the average height
of all high school students in Iowa is 5.70
feet.
We take SRSs from the population and calculate
the mean height.

[Figure: mean of first n observations]

12
Example 4.21 From Book

Sulfur compounds such as dimethyl sulfide (DMS)
are sometimes present in wine
DMS causes "off-odors" in wine, so winemakers
want to know the odor threshold
What is the lowest concentration of DMS that the
human nose can detect?
Different people have different thresholds, so we
start by asking about the mean threshold µ in the
population of all adults
µ is a parameter that describes this population

13
Example 4.21 From Text

To estimate µ, we present tasters with both
natural wine and the same wine spiked with DMS at
different concentrations to find the lowest
concentration at which they can identify the
spiked wine
The odor thresholds for 10 randomly chosen
subjects (in micrograms/liter)
28 40 28 33 20 31 29 27 17 21
The mean threshold for these subjects is 27.4
x-bar is a statistic calculated from this sample
A statistic, such as the mean of a random sample
of 10 adults, is a random variable.

14
Example

Suppose µ = 25 is the true value of the parameter
we seek to estimate
The first subject had threshold 28, so the line
starts there
The second point is the mean of the first two
subjects
This process continues many, many times, and our
line begins to settle around µ = 25
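
The first few points of that line can be computed from the 10 thresholds listed in Example 4.21. This is a small sketch; the variable names are illustrative, and only the 10 observations from the slides are used, so the line has not yet had time to settle near 25.

```python
from itertools import accumulate

# Odor thresholds (micrograms/liter) for the 10 subjects in Example 4.21.
thresholds = [28, 40, 28, 33, 20, 31, 29, 27, 17, 21]

# Running mean: the mean of the first k observations, for k = 1, 2, ..., 10.
running_means = [
    total / k for k, total in enumerate(accumulate(thresholds), start=1)
]

for k, m in enumerate(running_means, start=1):
    print(f"after {k:2d} subjects: mean = {m:.2f}")
```

The first value is 28, the second is (28 + 40)/2 = 34, and the last is the sample mean 27.4.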

15
Example 4.21 From Book

The law of large numbers in action: as we take
more observations, the sample mean always
approaches the mean of the population

16
The Law of Large Numbers

The law of large numbers is the foundation of
business enterprises such as casinos and
insurance companies
The winnings (or losses) of a gambler on a few
plays are uncertain -- that's why gambling is
exciting(?)
But, the house plays tens of thousands of times
So the house, unlike individual gamblers, can
count on the long-run regularity described by the
Law of Large Numbers
The average winnings of the house on tens of
thousands of plays will be very close to the mean
of the distribution of winnings
Hence, as long as the mean of that distribution favors the house, the LLN all but guarantees the house a profit!

17
Thinking about the Law of Large Numbers

The Law of Large Numbers says broadly that the
average results of many independent observations
are stable and predictable
A grocery store deciding how many gallons of milk
to stock and a fast-food restaurant deciding how
many beef patties to prepare can predict demand
even though their customers make independent
decisions
The Law of Large Numbers says that the many
individual decisions will produce a stable result

18
The Law of Small Numbers or Averages

The Law of Large Numbers describes the regular
behavior of chance phenomena in the long run
Many people believe in an incorrect law of small
numbers
We falsely expect even short sequences of random
events to show the kind of average behavior that
in fact appears only in the long run

19
The Law of Small Numbers or Averages

Example: Pretend you have an average free throw
success rate of 70%. One day on the free throw
line, you miss 8 shots in a row. Should you hit
the next shot by the mythical law of averages?
No. The law of large numbers tells us that the
long run average will be close to 70%. Missing 8
shots in a row simply means you are having a bad
day. 8 shots is hardly the long run.
Furthermore, the law of large numbers says
nothing about the next event. It only tells us
what will happen if we keep track of the long run
average.

20
The Hot Hand Debate

In some sports, if a player makes several
consecutive good plays, like a few good golf
shots in a row, they often claim to have the hot
hand, which generally implies that their next
shot is likely to be a good one.
There have been studies that suggest that runs
of good or bad golf shots are no more frequent
than would be expected if each shot were
independent of the player's previous shots
Players perform consistently, not in streaks
Our perception of hot or cold streaks simply
shows that we don't perceive random behavior very
well!

21
The Gambling Hot Hand

Gamblers often follow the hot-hand theory,
betting that a lucky run will continue
At other times, however, they draw the opposite
conclusion when confronted with a run of outcomes
If a coin gives 10 straight heads, some gamblers
feel that it must now produce some extra tails to
get back into the average of half heads and half
tails
Not true! If the next 10,000 tosses give about
50% tails, those 10 straight heads will be
swamped by the later thousands of heads and
tails.
No short run compensation is needed to get back
to the average in the long run.
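
A quick piece of arithmetic shows the swamping. The sketch assumes, purely for illustration, that the next 10,000 tosses split exactly evenly between heads and tails.

```python
# Start with the unusual run described above: 10 heads in 10 tosses.
heads, tosses = 10, 10

# Assumption for illustration: the next 10,000 tosses split exactly 50/50.
heads += 5_000
tosses += 10_000

proportion_heads = heads / tosses
print(f"proportion of heads after {tosses} tosses: {proportion_heads:.4f}")
```

The proportion of heads is 5010/10010, about 0.5005, even though no extra tails appeared to "make up" for the early run.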

22
Need for Law of Large Numbers

Our inability to accurately distinguish random
behavior from systematic influences points out
the need for statistical inference to supplement
exploratory analysis of data
Probability calculations can help verify that
what we see in the data is more than a random
pattern

23
How Large is a Large Number?

The Law of Large Numbers says that the actual
mean outcome of many trials gets close to the
distribution mean µ as more trials are made
It doesn't say how many trials are needed to
guarantee a mean outcome close to µ
That depends on the variability of the random
outcomes
The more variable the outcomes, the more trials
are needed to ensure that the mean outcome x-bar
is close to the distribution mean µ

24
More Laws of Large Numbers

The Law of Large Numbers is one of the central
facts about probability
LLN explains why gambling casinos and insurance
companies make money
LLN assures us that statistical estimation will
be accurate if we can afford enough observations
The basic Law of Large Numbers applies to
independent observations that all have the same
distribution
Mathematicians have extended the law to many more
general settings

25
What if Observations are not Independent?

You are in charge of a process that manufactures
video screens for computer monitors
Your equipment measures the tension on the metal
mesh that lies behind each screen and is critical
to its image quality
You want to estimate the mean tension µ for the
process by the average x-bar of the measurements
The tension measurements are not independent

26
AYK 4.82

Use the Law of Large Numbers applet on the
textbook website

27
Sampling Distributions

The Law of Large Numbers assures us that if we
measure enough subjects, the statistic x-bar will
eventually get very close to the unknown
parameter µ

28
Sampling Distributions

What if we don't have a large sample?
Take a large number of samples of the same size
from the same population
Calculate the sample mean for each sample
Make a histogram of the sample means
The histogram of values of the statistic
approximates the sampling distribution that we
would see if we kept on sampling forever
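
These steps can be sketched in a short simulation. The population here is an assumption: a Normal population with mean 25 and standard deviation 7, with samples of size 10 repeated 1000 times and an arbitrary fixed seed. The histogram is a crude text one.

```python
import random
import statistics

rng = random.Random(1)       # arbitrary fixed seed for repeatability
MU, SIGMA = 25, 7            # assumed population: Normal(25, 7)
N, NUM_SAMPLES = 10, 1000    # illustrative sample size and number of samples

# Take many samples of the same size and record each sample mean.
sample_means = [
    statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(NUM_SAMPLES)
]

# Crude text histogram of the sample means, in bins 1 unit wide.
counts = {}
for m in sample_means:
    counts[int(m)] = counts.get(int(m), 0) + 1
for lo in sorted(counts):
    print(f"{lo:2d}-{lo + 1:2d} {'*' * (counts[lo] // 5)}")

print("mean of sample means:", round(statistics.fmean(sample_means), 2))
print("sd of sample means:  ", round(statistics.stdev(sample_means), 2))
```

The histogram piles up around the population mean, and the sample means are visibly less spread out than individual observations would be.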

29
Sampling Distributions

The idea of a sampling distribution is the
foundation of statistical inference
The laws of probability can tell us about
sampling distributions without the need to
actually choose or simulate a large number of
samples

30
Mean and Standard Deviation of a Sample Mean

Suppose that x-bar is the mean of a SRS of size n
drawn from a large population with mean µ and
standard deviation σ
The mean of the sampling distribution of x-bar is
µ and its standard deviation is σ/√n
Notice: averages are less variable than
individual observations!

31
Mean and Standard Deviation of a Sample Mean

The mean of the statistic x-bar is always the
same as the mean µ of the population
the sampling distribution of x-bar is centered at
µ
in repeated sampling, x-bar will sometimes fall
above the true value of the parameter µ and
sometimes below, but there is no systematic
tendency to overestimate or underestimate the
parameter
because the mean of x-bar is equal to µ, we say
that the statistic x-bar is an unbiased estimator
of the parameter µ

32
Mean and Standard Deviation of a Sample Mean

An unbiased estimator is correct on the average
in many samples
how close the estimator falls to the parameter in
most samples is determined by the spread of the
sampling distribution
if individual observations have standard
deviation σ, then sample means x-bar from samples
of size n have standard deviation σ/√n
Again, notice that averages are less variable
than individual observations

33
Mean and Standard Deviation of a Sample Mean

Not only is the standard deviation of the
distribution of x-bar smaller than the standard
deviation of individual observations, but it gets
smaller as we take larger samples
The results of large samples are less variable
than the results of small samples
Remember, we divided by the square root of n

34
Mean and Standard Deviation of a Sample Mean

If n is large, the standard deviation of x-bar is
small and almost all samples will give values of
x-bar that lie very close to the true parameter µ
The sample mean from a large sample can be
trusted to estimate the population mean
accurately
Notice that the standard deviation of the sampling
distribution gets smaller only at the rate √n
To cut the standard deviation of x-bar in half,
we must take four times as many observations, not
just twice as many (square root of 4 is 2)
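
The square-root rate is easy to check numerically. In this sketch the population standard deviation of 7 and the sample sizes are illustrative assumptions, and sd_of_sample_mean is a hypothetical helper name.

```python
import math

def sd_of_sample_mean(sigma, n):
    """Standard deviation of x-bar for a SRS of size n: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 7  # illustrative population standard deviation (an assumption)

# Each step multiplies n by 4 except the first, which only doubles it.
for n in (10, 20, 40, 160):
    print(f"n = {n:3d}  sd of x-bar = {sd_of_sample_mean(sigma, n):.3f}")
```

Doubling n from 10 to 20 shrinks the standard deviation by a factor of only about 1.41; going from 10 to 40 is what cuts it in half.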

35
Example

Suppose we take samples of size 15 from a
distribution with mean 25 and standard deviation
7
For the distribution of x-bar:
the mean of x-bar is µ = 25
the standard deviation of x-bar is σ/√n = 7/√15 ≈ 1.80739

36
What About Shape?

We have described the center and spread of the
sampling distribution of a sample mean x-bar, but
not its shape
The shape of the distribution of x-bar depends on
the shape of the population distribution

37
Sampling Distribution of a Sample Mean

If a population has the N(µ, σ) distribution,
then the sample mean x-bar of n independent
observations has the N(µ, σ/√n)
distribution

38
Example

Adults differ in the smallest amount of dimethyl
sulfide they can detect in wine
Extensive studies have found that the DMS odor
threshold of adults follows roughly a Normal
distribution with mean µ = 25 micrograms per
liter and standard deviation σ = 7 micrograms per
liter

39
Example

Because the population distribution is Normal,
the sampling distribution of x-bar is also Normal
If n = 10, what is the distribution of x-bar?
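
One way to work this out is with Python's standard-library NormalDist, using the population values from the example (mean 25, standard deviation 7). The 95% range printed at the end is an illustrative extra that is not in the slides.

```python
import math
from statistics import NormalDist

mu, sigma, n = 25, 7, 10  # population values from the example; n = 10 subjects

# For a Normal population, x-bar is Normal with mean mu and sd sigma / sqrt(n).
xbar_dist = NormalDist(mu, sigma / math.sqrt(n))
print(f"x-bar ~ N({xbar_dist.mean}, {xbar_dist.stdev:.2f})")

# Illustrative extra: the central 95% of sample means under this model.
low, high = xbar_dist.inv_cdf(0.025), xbar_dist.inv_cdf(0.975)
print(f"about 95% of sample means fall between {low:.1f} and {high:.1f}")
```

The standard deviation of x-bar is 7/√10, about 2.21 micrograms per liter.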

40
What if the Population Distribution is not Normal?

As the sample size increases, the distribution of
x-bar changes shape
The distribution looks less like that of the
population and more like a Normal distribution
When the sample is large enough, the distribution
of x-bar is very close to Normal
This result is true no matter what the shape of the
population distribution, as long as the population
has a finite standard deviation σ

41
Central Limit Theorem

Draw a SRS of size n from any population with
mean µ and finite standard deviation σ
When n is large, the sampling distribution of the
sample mean x-bar is approximately Normal
x-bar is approximately N(µ, σ/√n)
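
A small simulation can make the theorem concrete. This sketch assumes a strongly right-skewed Exponential population with mean 1, an arbitrary fixed seed, and 5000 repeated samples per sample size; skewness and simulated_means are illustrative helper names. Skewness near 0 indicates a symmetric, more Normal-looking shape.

```python
import random
import statistics

rng = random.Random(2)  # arbitrary fixed seed for repeatability

def skewness(values):
    """Average cubed z-score: 0 for a perfectly symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def simulated_means(n, reps=5000):
    """Means of `reps` samples of size n from an Exponential(mean 1) population."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]

for n in (1, 10, 70):
    means = simulated_means(n)
    print(f"n = {n:2d}  mean = {statistics.fmean(means):.3f}  "
          f"sd = {statistics.stdev(means):.3f}  skewness = {skewness(means):.2f}")
```

As n grows, the mean of the sample means stays near 1, their standard deviation shrinks roughly like 1/√n, and the skewness moves toward 0.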

42
Central Limit Theorem

More general versions of the central limit
theorem say that the distribution of a sum or
average of many small random quantities is close
to Normal
The central limit theorem suggests why the Normal
distributions are common models for observed data

43
How Large a Sample is Needed?

The sample size needed depends on whether the population
distribution is close to Normal
We require more observations if the shape of the
population distribution is far from Normal

44
Example

The time X that a technician requires to perform
preventive maintenance on an air-conditioning
unit is governed by the Exponential distribution
(figure 4.17 (a)) with mean time µ = 1 hour and
standard deviation σ = 1 hour
Your company operates 70 of these units
The distribution of the mean time your company
spends on preventive maintenance is approximately N(1, 1/√70) ≈ N(1, 0.12)

45
Example

What is the probability that your company's units'
average maintenance time exceeds 50 minutes?
50/60 ≈ 0.83 hour
So we want to know P(x-bar > 0.83)
Use Normal distribution calculations we learned
in Chapter 2!
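
The calculation can be sketched with the standard-library NormalDist, assuming the central limit theorem approximation that x-bar is roughly N(1, 1/√70) for the 70 units. The threshold is kept as the exact fraction 50/60 rather than the rounded 0.83, so the result may differ slightly from a table lookup.

```python
import math
from statistics import NormalDist

mu, sigma, n = 1.0, 1.0, 70   # hours; values from the example

# Central limit theorem approximation for the mean of 70 maintenance times.
xbar_dist = NormalDist(mu, sigma / math.sqrt(n))

threshold = 50 / 60           # 50 minutes expressed in hours
p_exceeds = 1 - xbar_dist.cdf(threshold)
print(f"P(x-bar > {threshold:.2f} hour) is approximately {p_exceeds:.3f}")
```

The probability comes out at roughly 0.92, so an average above 50 minutes is the usual case.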

46
4.86 ACT scores

The scores of students on the ACT college
entrance examination in a recent year had the
Normal distribution with mean µ = 18.6 and
standard deviation σ = 5.9

47
4.86 ACT scores

What is the probability that a single student
randomly chosen from all those taking the test
scores 21 or higher?

48
4.86 ACT scores

About 34% of students (from this population)
scored a 21 or higher on the ACT
The probability that a single student randomly
chosen from this population would have a score of
21 or higher is 0.34
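
The 0.34 can be reproduced with the standard-library NormalDist, using the population values given for the ACT scores (mean 18.6, standard deviation 5.9). The variable names are illustrative.

```python
from statistics import NormalDist

# Population model from the example: ACT scores are Normal(18.6, 5.9).
act = NormalDist(mu=18.6, sigma=5.9)

# Probability that one randomly chosen student scores 21 or higher.
z = (21 - act.mean) / act.stdev
p_single = 1 - act.cdf(21)
print(f"z = {z:.2f}, P(X >= 21) is approximately {p_single:.2f}")
```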

49
4.86 ACT scores

Now take a SRS of 50 students who took the test.
What are the mean and standard deviation of the
sample mean score x-bar of these 50 students?
Mean = 18.6 (same as µ)
Standard deviation = σ/√50 = 5.9/√50 ≈ 0.8344

50
4.86 ACT scores

What is the probability that the mean score x-bar
of these students is 21 or higher?

51
4.86 ACT scores

About 0.2% of all random samples of size 50
(from this population) would have a mean score
x-bar of 21 or higher.
The probability of having a mean score x-bar of
21 or higher from a sample of 50 students (from
this population) is 0.002.
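
The same kind of check works for the sample mean. The only change from the single-student calculation is that the standard deviation becomes 5.9/√50; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 18.6, 5.9, 50  # population values and sample size from the example

# Sampling distribution of x-bar for a SRS of 50 students.
xbar_dist = NormalDist(mu, sigma / math.sqrt(n))

p_mean = 1 - xbar_dist.cdf(21)
print(f"sd of x-bar = {xbar_dist.stdev:.4f}")
print(f"P(x-bar >= 21) is approximately {p_mean:.3f}")
```

A score of 21 is unremarkable for one student but is almost 3 standard deviations above the mean for an average of 50 students.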

52
Section 4.4 Summary

When we want information about the population
mean µ for some variable, we often take a SRS and
use the sample mean x-bar to estimate the unknown
parameter µ.

53
Section 4.4 Summary

The Law of Large Numbers states that the actually
observed mean outcome x-bar must approach the
mean µ of the population as the number of
observations increases.

54
Section 4.4 Summary

The sampling distribution of x-bar describes how
the statistic x-bar varies in all possible
samples of the same size from the same population.

55
Section 4.4 Summary

The mean of the sampling distribution is µ, so
that x-bar is an unbiased estimator of µ.

56
Section 4.4 Summary

The standard deviation of the sampling
distribution of x-bar is sigma over the square
root of n for a SRS of size n if the population
has standard deviation sigma. That is, averages
are less variable than individual observations.

57
Section 4.4 Summary

If the population has a Normal distribution, so
does x-bar.

58
Section 4.4 Summary

The Central Limit Theorem states that for large n
the sampling distribution of x-bar is
approximately Normal for any population with
finite standard deviation sigma. That is,
averages are more Normal than individual
observations. We can use the fact that x-bar has
a known Normal distribution to calculate
approximate probabilities for events involving
x-bar.
