Ch 8, Confidence

Slides: 23

Provided by: johnt1

Transcript and Presenter's Notes

1
Ch 8, Confidence

In Ch7, we saw that the average of a sample has a
normal distn
The mean of the avg is the mean of the population
We can find how far the avg might be from the
mean with a certain probability

2
Ch 8, Confidence

This leads us to say that we can be, say, 95%
confident that the mean is within a certain
distance of the avg
Note that the mean might be unknown, but it is
not random
It does not change if we take a different sample
So we cannot assign probabilities to the mean

3
Ch 8, Confidence

If we give a 95% confidence interval for the
mean, we are 95% confident that the true mean
lies within this interval
The interval will be based on an interval that
has a 95% probability of containing the avg
(Since the avg IS random, we can assign
probabilities to it.)
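
The idea that the avg is the random quantity can be checked by simulation. The sketch below is only an illustration: the population mean of 50, SD of 10, sample size of 6, and the helper name coverage are made-up assumptions, not values from the slides. It draws many samples and counts how often the avg lands within the 95% distance of the fixed mean.

```python
import math
import random
from statistics import NormalDist

def coverage(mean, sd, n, conf=0.95, trials=4000, seed=1):
    """Fraction of simulated samples whose avg falls within the
    conf-level distance of the (fixed, non-random) mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + conf / 2)   # about 1.96 for 95%
    half_width = z * sd / math.sqrt(n)         # SD is assumed known here
    hits = 0
    for _ in range(trials):
        avg = sum(rng.gauss(mean, sd) for _ in range(n)) / n
        if abs(avg - mean) <= half_width:
            hits += 1
    return hits / trials

# Hypothetical population: mean 50, SD 10, samples of size 6
print(coverage(mean=50.0, sd=10.0, n=6))
```

The printed fraction should be close to 0.95. The mean never changes between trials; only the avg (and so the interval built around it) moves.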

4
Ch 8, Confidence

Suppose we have a normal distn with SD = 11.5
Consider the avg of N = 6 values from this
population
In what interval (around the mean) will the avg
fall with probability 95%?

5
Ch 8, Confidence
6
Ch 8, Confidence

So 95% of the time, the avg will be within ±9.2
of the mean
We now reverse this and say that we are 95%
confident that the mean is within ±9.2 of the
avg
If the avg turns out to be 63.7, then the 95% CI
for the mean is 63.7 ± 9.2
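
The arithmetic behind the 9.2 is not in the transcript (the slide that showed it appears to have been a figure). A plausible reconstruction, assuming the usual 97.5% standard normal quantile of 1.96 was used:

```latex
\[
\frac{\mathrm{SD}}{\sqrt{N}} = \frac{11.5}{\sqrt{6}} \approx 4.69,
\qquad
1.96 \times 4.69 \approx 9.2
\]
\[
P\bigl(\lvert \text{avg} - \text{mean} \rvert \le 9.2\bigr) \approx 0.95
\quad\Longrightarrow\quad
\text{95\% CI} = 63.7 \pm 9.2 = (54.5,\ 72.9)
\]
```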

7
Ch 8, Confidence
8
Ch 8, Confidence

Procedure
For a given confidence, find how many SDs (of the
avg) apart the avg and mean might be
We are confident that the mean is within this far
of the avg

9
Ch 8, Confidence

Example
We have a normal population with SD = 8.4
The avg of a sample of size 7 is 33.5
Find the 90% CI for the mean of the population

10
Ch 8, Confidence
11
Ch 8, Confidence

So the avg might be 1.645 × 8.4/sqrt(7) away from
the mean
We are 90% confident that the mean is no higher
than 33.5 + 1.645 × 8.4/sqrt(7) and no lower than
33.5 - 1.645 × 8.4/sqrt(7)
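
The same calculation can be sketched in a few lines of Python. The function name normal_ci is a made-up helper for this illustration; NormalDist comes from the Python standard library and stands in for the normal table or spreadsheet used in the course.

```python
import math
from statistics import NormalDist

def normal_ci(avg, sd, n, conf):
    """CI for the mean when the population SD is known."""
    # Two-sided: put half of the leftover probability in each tail.
    z = NormalDist().inv_cdf(0.5 + conf / 2)   # about 1.645 for 90%
    half_width = z * sd / math.sqrt(n)
    return avg - half_width, avg + half_width

low, high = normal_ci(avg=33.5, sd=8.4, n=7, conf=0.90)
print(round(low, 2), round(high, 2))
```

With the numbers from the example this gives roughly 33.5 ± 5.2, that is, an interval from about 28.3 to about 38.7.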

12
Ch 8, Confidence

In the previous problems, we had to assume we
knew the value for the SD
This is generally not true
Generally we have to estimate SD from the data
Replace the normal distn with Student's t distn

13
Ch 8, Confidence

Student's t is similar to normal, but has an
extra parameter
Degrees of freedom (df)
Measures how good our est of SD is
FOR THESE PROBLEMS, df = N-1
For large df, t is essentially normal
For smaller df, the quantiles are a little larger
for t than for normal

14
Ch 8, Confidence

(See new PROBCALC.XLS)
For normal, 90% CI is ±1.645 SD
For t with 5 df, 90% is ±2.015 SD
For df = 15, 90% is ±1.753 SD
For df = 75, 90% is ±1.665 SD
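
For readers without the spreadsheet, the multipliers above can be approximated from first principles. This is a rough sketch using only the Python standard library, not the method used in PROBCALC.XLS; the helper names t_pdf, t_cdf, and t_quantile are made up for this illustration. It integrates the t density numerically and searches for the 95th percentile, which is the multiplier for a two-sided 90% CI.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: Simpson's rule on [0, x] plus symmetry."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Quantile for p > 0.5, found by bisection on t_cdf."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# A two-sided 90% CI leaves 5% in each tail, so use the 0.95 quantile.
print("normal", round(NormalDist().inv_cdf(0.95), 3))
for df in (5, 15, 75):
    print("df", df, round(t_quantile(0.95, df), 3))
```

The results should agree with the values listed on the slide, and they show the multiplier shrinking toward the normal value as df grows.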

15
Ch 8, Assumptions

We use the t distn when we estimate the SD from
the data
We are still assuming that the data comes from a
normal population
There are two issues

16
Ch 8, Assumptions

There may be outliers in the sample
This means that some of the observations should
not be considered to be representative of the
underlying population
For our purposes, we will only consider an
observation to be an outlier if it is WAY outside
the others

17
Ch 8, Assumptions

Boxplots may be helpful in suggesting outliers
These are marked with asterisks
But simply because a value is far from the other
values does not make it an outlier
There should be some reason to think that this
point should not have been in the sample
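
The slides do not say how the boxplot decides which points get an asterisk. A common convention, assumed here, is to mark values more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies that rule to made-up data; the helper name boxplot_candidates is hypothetical.

```python
from statistics import quantiles

def boxplot_candidates(data, k=1.5):
    """Values beyond k * IQR from the quartiles (assumed boxplot rule).
    These are only candidates; a reason is still needed to call one
    an outlier."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low_fence or x > high_fence]

sample = [10, 12, 13, 14, 15, 16, 18, 95]   # made-up values
print(boxplot_candidates(sample))           # [95]
```

The function only suggests points to look at. Whether 95 should be dropped still depends on whether there is a reason to think it does not belong in the sample.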

18
Ch 8, Assumptions

Suppose we take a sample of cities
New York might be an outlier because it is so
different from other cities
If we consider the stock market, we might want to
eliminate 9/11
If we consider Olympic records, we might ignore
Bob Beamon's 1968 record

19
Ch 8, Assumptions

The other assumption is that the underlying distn
is normal
A good way to assess whether the distn is normal
is to use a normal probability plot
This plots the sorted data vs percentiles of the
(standard) normal
If the plot is approx a straight line, then the
data appears to come from a normal distn
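
A normal probability plot can be built by hand, as in the sketch below. The plotting position (i - 0.5)/n is one common choice and is an assumption here, as are the made-up data sets and the helper names normal_plot_points and straightness. The correlation is only a rough numeric stand-in for judging by eye whether the plot is a straight line.

```python
import math
from statistics import NormalDist

def normal_plot_points(data):
    """Pairs (standard normal quantile, sorted value) for the plot."""
    n = len(data)
    std_normal = NormalDist()
    # (i - 0.5) / n is an assumed plotting position, not from the slides.
    return [(std_normal.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(sorted(data), start=1)]

def straightness(points):
    """Correlation of the plotted pairs; 1.0 means a perfect line."""
    n = len(points)
    mq = sum(q for q, _ in points) / n
    mx = sum(x for _, x in points) / n
    sqx = sum((q - mq) * (x - mx) for q, x in points)
    sqq = sum((q - mq) ** 2 for q, _ in points)
    sxx = sum((x - mx) ** 2 for _, x in points)
    return sqx / math.sqrt(sqq * sxx)

roughly_normal = [41, 45, 47, 49, 50, 51, 53, 55, 59]   # made-up values
skewed = [1, 2, 4, 8, 16, 32, 64, 128, 256]             # made-up values
print(round(straightness(normal_plot_points(roughly_normal)), 3))
print(round(straightness(normal_plot_points(skewed)), 3))
```

The first data set gives a value very close to 1, while the strongly skewed one is visibly lower, which is the kind of obvious departure from a line that would raise concern.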

20
Ch 8, Assumptions

How close to a straight line?
We will only be concerned if the plot is
OBVIOUSLY not a straight line

21
Ch 8, Assumptions

What to do if the distn is NOT normal?
This changes the question
The mean is the obvious way to describe the
normal distn
For a different distn, it is less clear what to
use to describe

22
Ch 8, Assumptions

We might consider using the median to describe
the distn
If we consider a possible value for the median
and count the number in our sample that are above
(or below) this value, we get the binomial distn
Not so simple to form confidence intervals
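
One standard way to turn the binomial counting argument into an interval is to use order statistics. This is a sketch of that approach, not necessarily the method intended in the course; the helper name median_ci and the data are made up. The number of sample values below the true median is Binomial(n, 1/2), so the chance that the median lies between the (k+1)-th smallest and (k+1)-th largest values can be computed exactly.

```python
from math import comb

def median_ci(data, conf=0.95):
    """Order-statistic CI for the median and its actual coverage."""
    xs = sorted(data)
    n = len(xs)

    def lower_tail(j):
        # P(at most j of the n values fall below the median)
        return sum(comb(n, i) for i in range(j + 1)) / 2 ** n

    k = 0
    # Move inward from min/max while coverage stays at or above conf.
    while k + 1 < n - 2 - k and 1 - 2 * lower_tail(k + 1) >= conf:
        k += 1
    coverage = 1 - 2 * lower_tail(k)
    return xs[k], xs[n - 1 - k], coverage

sample = [12, 15, 9, 21, 18, 30, 11, 14, 25, 17]   # made-up values
low, high, cov = median_ci(sample)
print(low, high, round(cov, 4))   # 11 25 0.9785
```

Because the binomial is discrete, the coverage cannot be set to exactly 95%; here the closest interval that reaches at least 95% actually has about 97.9% coverage. For very small samples even the min-to-max interval may fall short of the requested level, in which case the function returns that interval with its lower coverage.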
