Objectives

1
Objectives

6.1 Estimating with confidence
Statistical confidence
Confidence intervals
Confidence interval for a population mean
How confidence intervals behave
Choosing the sample size

Methods for drawing conclusions about a
population from sample data are called
statistical inference.
So we'll use data to make inferences, i.e., draw
conclusions about populations from data in our
samples or from our experiments.
We'll consider two types:
Confidence interval estimation
Tests of significance
In both of these cases, we'll consider our data
as either being a random sample from a population
or as data from a randomized experiment.
Start with estimation. There are two situations
we'll consider:
estimating the mean µ of a population of
measurements
estimating the proportion p of successes (S's) in
a population of successes and failures (S's and F's)

In either case, we'll construct a confidence
interval of the form estimate ± M.O.E., where
M.O.E. = margin of error of the estimator.
The MOE gives information on how good the
estimate is through the variation in the
estimator (its standard error) and through the
level of confidence in the confidence interval
(through a tabulated value).
The standard error of an estimator is its
estimated standard deviation (treating the
estimator as a statistic with a sampling
distribution).
Best estimator of µ is x̄, and we know from
the previous chapter that x̄ is approximately
N(µ, σ/√n).
Best estimator of p is p̂ (phat), and we know from
the last chapter that p̂ is approx.
N(p, √(p(1 - p)/n)).

4
Statistical confidence

Although the sample mean, x̄, is a unique number
for any particular sample, if you pick a
different sample you will probably get a
different sample mean.
In fact, you could get many different values for
the sample mean, and virtually none of them would
actually equal the true population mean, µ.

But the sampling distribution of x̄ is narrower
than the population distribution, by a factor of
√n (its standard deviation is σ/√n).
Thus, the estimates gained from our
samples tend to be relatively close to the
population parameter µ.

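[Figure: the population distribution (x,
individual subjects) and the narrower sampling
distribution of sample means (x̄, n subjects),
both centered at µ]
If the population is normally distributed N(µ, σ),
so is the sampling distribution of x̄: N(µ, σ/√n).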
6
95% of all sample means will be within the MOE
(roughly 2σ/√n) of the population parameter µ
(MOE = margin of error). Distances are
symmetrical, which implies that the population
parameter µ must be within roughly 2 standard
deviations (2σ/√n) of the sample average x̄ in 95%
of all samples.
[Figure: sampling distribution of x̄; each red dot
is the mean value of an individual sample]
This reasoning is the essence of statistical
inference. Know and understand this figure!
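
The "roughly 2 standard deviations" figure can be checked against the standard normal curve. The following is a minimal sketch using Python's standard library; the helper name area_within is made up for this illustration and is not from the slides. It suggests that 2 standard deviations cover slightly more than 95% of the area, while 1.96 gives almost exactly 95%.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal curve N(0, 1)

def area_within(k):
    """Area under N(0, 1) between -k and +k (hypothetical helper)."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Area within 2 standard deviations vs. within 1.96 standard deviations
print(f"within 2.00 sd: {area_within(2.00):.4f}")
print(f"within 1.96 sd: {area_within(1.96):.4f}")
```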
7
Confidence intervals

The confidence interval is a range of values with
an associated probability or confidence level C.
The probability quantifies the chance that an
interval built this way contains the true
population parameter.

x̄ ± 4.2 is a 95% confidence interval for the
population parameter µ. This expression says that
in 95% of the cases, the actual value of µ will
be within 4.2 units of the value of x̄.
8
Reworded

With 95% confidence, we can say that µ should be
within roughly 2 standard deviations (2σ/√n)
of our sample mean x̄.
In 95% of all possible samples of this size n, µ
will indeed fall in our confidence interval.
In only 5% of samples would x̄ be farther from µ.

A confidence interval can be expressed as:
Sample mean ± MOE (MOE is called the margin of
error m): µ within x̄ ± m. Example: 120 ± 6

Two endpoints of an interval: µ within (x̄ - MOE)
to (x̄ + MOE), e.g., 114 to 126

A confidence level C (in %) indicates the degree
of confidence that µ falls within the
interval. It represents the area under the
normal curve within ± MOE of the center of the
curve.
10
Review: standardizing the normal curve using z
z = (x̄ - µ) / (σ/√n)
[Figure: normal curves labeled N(64.5, 2.5) and
N(µ, σ/√n), standardized to N(0, 1); axis:
standardized height (no units)]
Here, we work with the sampling distribution of
the sample mean, and σ/√n is its standard
deviation (spread). Remember that σ is the
standard deviation of the original population.
11
Varying confidence levels

Confidence intervals contain the population mean
µ in C% of samples, in the long run. Different
areas under the curve give different confidence
levels C.

Practical use of z: z*
z* is related to the chosen confidence level C.
C is the area under the standard normal curve
between -z* and z*.

[Figure: standard normal curve with area C
between -z* and z*]
Example: For an 80% confidence level C, 80% of
the normal curve's area is contained in the
interval.
12
How do we find specific z values?

We can use a table of z (Table A) or t values
(Table D). In Table D, for a particular
confidence level C, the appropriate z* value is
just above it.

Example: For a 98% confidence level, z* = 2.326.
We can use software. In JMP: create a new
column, choose Edit Formula, and choose Normal
Quantile(p) under Probability, where p = (1 - C)/2
is the area to the left of -z*. Since we want the
middle C probability, the probability we require
is (1 - C)/2. Example: for a 98% confidence level,
Normal Quantile(.01) = -2.326349 (= neg. z*).
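
For readers without JMP, the same lookup can be sketched in Python. This is only an illustration, not part of the original slides: the function name z_star is an assumption, and NormalDist.inv_cdf from the standard library plays the role of JMP's Normal Quantile.

```python
from statistics import NormalDist

def z_star(confidence):
    """Critical value z* with middle area `confidence` under N(0, 1).

    `z_star` is a hypothetical helper name used for this sketch.
    """
    left_tail = (1 - confidence) / 2          # area to the left of -z*
    return -NormalDist().inv_cdf(left_tail)   # quantile is -z*, so flip the sign

# Critical values for a few confidence levels
for c in (0.80, 0.95, 0.98):
    print(f"C = {c:.2f}   z* = {z_star(c):.3f}")
```

The sign flip mirrors the JMP example: the quantile at (1 - C)/2 is the negative critical value.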
13
Link between confidence level and margin of error

The confidence level C determines the value of z*
(in Table A or D).
The margin of error m also depends on z*:
m = z*σ/√n.

Higher confidence C implies a larger margin of
error m (thus less precision in our
estimates). A lower confidence level C produces
a smaller margin of error m (thus better
precision in our estimates).
14
Different confidence intervals for the same set
of measurements
Density of bacteria in solution: Measurement
equipment has standard deviation σ = 1 × 10^6
bacteria/ml fluid. Three measurements: 24, 29,
and 31 × 10^6 bacteria/ml fluid. Mean x̄ = 28 ×
10^6 bacteria/ml. Find the 96% and 70% CI.

96% confidence interval for the true density: z*
= 2.054, and write
x̄ ± z*σ/√n = 28 ± 2.054(1/√3)
= (28 ± 1.19) × 10^6
bacteria/ml

70% confidence interval for the true density: z*
= 1.036, and write
x̄ ± z*σ/√n = 28 ± 1.036(1/√3)
= (28 ± 0.60) × 10^6
bacteria/ml
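
The two intervals above can be reproduced with a short Python sketch. The function name z_interval is an assumption made for this illustration, and the code takes σ as known, as the example does. Values are in units of 10^6 bacteria/ml.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_interval(data, sigma, confidence):
    """Return (low, high) for x-bar +/- z* sigma/sqrt(n), with sigma known.

    `z_interval` is a hypothetical helper name used for this sketch.
    """
    z = NormalDist().inv_cdf((1 + confidence) / 2)  # critical value z*
    m = z * sigma / sqrt(len(data))                 # margin of error
    center = mean(data)                             # sample mean x-bar
    return center - m, center + m

measurements = [24, 29, 31]  # units of 10^6 bacteria/ml
for c in (0.96, 0.70):
    low, high = z_interval(measurements, sigma=1, confidence=c)
    print(f"{c:.0%} CI: {low:.2f} to {high:.2f} (x 10^6 bacteria/ml)")
```

The same three measurements give a wider interval at 96% than at 70%; only the critical value changes.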

15
Properties of Confidence Intervals

User chooses the confidence level
Margin of error follows from this choice

We want
high confidence
small margins of error

The margin of error, m = z*σ/√n, is smaller when
z* (and thus the confidence level C) gets smaller
σ is smaller
n is larger

16
Impact of sample size

The spread in the sampling distribution of the
mean is a function of the number of individuals
per sample.
The larger the sample size, the smaller the
standard deviation (spread) of the sample mean
distribution.
But the spread only decreases in proportion to
1/√n, so quadrupling n only halves the spread.

[Figure: standard error σ/√n plotted against
sample size n]
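
A quick numerical sketch of the 1/√n behavior follows. The value σ = 1 and the function name standard_error are assumptions chosen only for illustration. Each row uses four times as many individuals as the previous one, and the spread of the sample mean is cut in half each time.

```python
from math import sqrt

SIGMA = 1.0  # assumed population standard deviation (illustrative value)

def standard_error(n, sigma=SIGMA):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / sqrt(n)

# Quadrupling n halves the standard error
for n in (1, 4, 16, 64, 256):
    print(f"n = {n:3d}   sigma/sqrt(n) = {standard_error(n):.4f}")
```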
17
Sample size and experimental design

You may need a certain margin of error (e.g.,
drug trial, manufacturing specs). In many cases,
the population variability (σ) is fixed, but we
can choose the number of measurements (n).
So plan ahead what sample size to use to achieve
that margin of error.

Remember, though, that sample size is not always
stretchable at will. There are typically costs
and constraints associated with large samples.
The best approach is to use the smallest sample
size that can give you useful results.
18
What sample size for a given margin of error?
Density of bacteria in solution: Measurement
equipment has standard deviation σ = 1 × 10^6
bacteria/ml fluid. How many measurements should
you make to obtain a margin of error of at most
0.5 × 10^6 bacteria/ml with a confidence level of
95%? For a 95% confidence interval, z* = 1.96.
Solving m = z*σ/√n for n gives
n = (z*σ/m)^2 = (1.96 × 1/0.5)^2 = 15.37.
Using only 15 measurements will not be enough to
ensure that m is no more than 0.5 × 10^6.
Therefore, we need at least 16 measurements.
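
The sample-size calculation can be sketched in Python as follows. The function name required_n is an assumption for this illustration; it rounds up because n must be a whole number of measurements and rounding down would leave the margin of error slightly too large.

```python
from math import ceil
from statistics import NormalDist

def required_n(sigma, margin, confidence):
    """Smallest n with z* sigma/sqrt(n) <= margin, for known sigma.

    `required_n` is a hypothetical helper name used for this sketch.
    """
    z = NormalDist().inv_cdf((1 + confidence) / 2)  # critical value z*
    return ceil((z * sigma / margin) ** 2)          # always round up

# Bacteria example, in units of 10^6 bacteria/ml
print(required_n(sigma=1.0, margin=0.5, confidence=0.95))
```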
19
Cautions about using x̄ ± z*σ/√n

Data must be an SRS from the population.
The formula is not correct for other sampling
designs.
Inference cannot rescue badly produced data.
Confidence intervals are not resistant to
outliers.
If n is small (< 15) and the population is not
normal, the true confidence level will be
different from C.
The standard deviation σ of the population must
be known.
The margin of error in a confidence interval
covers only random sampling errors!

20
Interpretation of Confidence Intervals

Conditions under which an inference method is
valid are never fully met in practice.
Exploratory data analysis and judgment should be
used when deciding whether or not to use a
statistical procedure.
Any individual confidence interval either will or
will not contain the true population mean. It is
wrong to say that the probability is 95% that the
true mean falls in the confidence interval.
The correct interpretation of a 95% confidence
interval is that we are 95% confident that the
true mean falls within the interval. The
confidence interval was calculated by a method
that gives correct results in 95% of all
possible samples.
In other words, if many such confidence
intervals were constructed, 95% of these
intervals would contain the true mean.
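
The long-run interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal with made-up values µ = 28 and σ = 1, samples have n = 3, and the function name coverage is hypothetical. The fraction of intervals that capture µ should come out close to the confidence level.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage(mu, sigma, n, confidence, reps, seed=0):
    """Fraction of simulated intervals x-bar +/- z* sigma/sqrt(n) containing mu."""
    rng = random.Random(seed)                       # fixed seed for repeatability
    z = NormalDist().inv_cdf((1 + confidence) / 2)  # critical value z*
    m = z * sigma / sqrt(n)                         # same margin for every sample
    hits = 0
    for _ in range(reps):
        # Draw one sample of size n and compute its mean
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - m <= mu <= xbar + m:
            hits += 1
    return hits / reps

# Assumed population values, chosen only for illustration
print(coverage(mu=28.0, sigma=1.0, n=3, confidence=0.95, reps=5000))
```

Each simulated interval either contains µ or it does not; the 95% describes the method across many samples, not any single interval.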
HW: Read Introduction to Chapter 6 and Section
6.1; do 6.1-6.8, 6.10-6.18, 6.27, 6.28, 6.34,
6.35
