Inference on averages - PowerPoint PPT Presentation

Slides: 21
Provided by: rset

Transcript and Presenter's Notes


1
Inference on averages
  • Data are collected to learn about certain
    numerical characteristics of a process or
    phenomenon that in most cases are unknown.
  • Example: A study was conducted to analyze women's
    bone health. Data on the daily intakes of calcium
    (in milligrams) were collected for 36 women
    between the ages of 18 and 24 years. What is the
    estimated average calcium intake for women in
    this age range?
  • The sample average is an estimate of the average
    calcium intake for women between the ages of 18
    and 24 years.
  • Population: all women aged 18-24 years.
  • Sample: 36 women aged 18-24 years, selected
    at random.

2
Estimating the population average
  • To estimate the population average:
  • Select a simple random sample of size n from
    the population of interest, so that each unit in
    the population has the same probability of being
    selected.
  • Collect data from the sample.
  • Compute the sample average and the standard
    deviation.
  • The sample average x̄ is an estimate of the
    population average.
  • How accurate is such an estimate?
  • A measure of the accuracy is given by the
    standard error (S.E.) of the sample average:
    S.E. = s / sqrt(n),
    where s is the standard deviation of the
    observations and n is the sample size. The larger
    the sample, the smaller the standard error, and
    the more accurate the average is as an estimate
    of the population average.
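
The computation described on this slide can be sketched in a few lines of Python. This is only a minimal illustration: the function name `standard_error` and the data values are hypothetical and are not the calcium data from the study.

```python
import math
import statistics


def standard_error(observations):
    """Return (sample average, sample standard deviation, standard error)."""
    n = len(observations)
    average = statistics.mean(observations)
    # statistics.stdev uses the n-1 denominator (sample standard deviation).
    s = statistics.stdev(observations)
    return average, s, s / math.sqrt(n)


# Hypothetical daily calcium intakes in mg (illustrative values only).
intakes = [820, 1010, 760, 1130, 905, 640, 980, 1200, 870]
average, s, se = standard_error(intakes)
print(f"average = {average:.2f} mg, s = {s:.2f} mg, S.E. = {se:.2f} mg")
```

Because the standard error divides s by sqrt(n), quadrupling the sample size halves the standard error when s stays the same.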

3
What is the distribution of the sample average?
If the investigators take several samples of
size n and compute the average of each sample,
the sample averages will be scattered around
the population average:
sample average = population average µ + sampling error
[Figure: distribution of the sample average, with µ and S.E. marked]
4
What is the shape of the sampling distribution?
If the sample size n is large (n > 50), the
distribution of the sample average is approximately
normal, with mean equal to the population mean and
standard deviation equal to the standard error of
the sample average.
  • The larger the sample, the more accurate the
    normal approximation is.
  • If the distribution of the population is not
    symmetric, the normal approximation is less
    accurate, and you need a larger sample.
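
A small simulation can make these statements about the sampling distribution concrete. The sketch below assumes a hypothetical, deliberately skewed population (exponential values); the population, the sample size, and the number of repeated samples are all illustrative choices, not part of the original study.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical skewed population: exponential values with mean close to 1.
population = [random.expovariate(1.0) for _ in range(100_000)]
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

n = 50               # size of each sample
num_samples = 2_000  # number of repeated samples

# Average of each simple random sample of size n.
sample_means = [
    statistics.mean(random.sample(population, n)) for _ in range(num_samples)
]

print(f"population average:          {pop_mean:.3f}")
print(f"average of sample averages:  {statistics.mean(sample_means):.3f}")
print(f"sd of sample averages:       {statistics.stdev(sample_means):.3f}")
print(f"population sd / sqrt(n):     {pop_sd / math.sqrt(n):.3f}")
```

The sample averages should cluster around the population average, with a spread close to the population standard deviation divided by sqrt(n), even though the population itself is not symmetric.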

5
Confidence Intervals for averages
Problem: We want to estimate the unknown
population mean µ.
Answer: We compute a confidence interval for µ,
that is, the set of plausible values for µ in
light of the data.
A 95% confidence interval for µ is defined as
sample average ± margin of error,
where the margin of error indicates how accurate
our estimate is.
6
Confidence Intervals
In samples of size n, a level C confidence
interval for the population average is
sample average ± t_{α/2} × S.E.,
where t_{α/2} is the critical value such that the
area between -t_{α/2} and t_{α/2} under the curve
of the t-distribution with n-1 degrees of freedom
is C = 1 - α.
[Figure: t-distribution curve with central area 0.95 between -t_{α/2} and t_{α/2}]
The value of t_{α/2} is computed using the Excel
function TINV(α, df), where df = sample size - 1.
7
Example
  • Data on the daily intakes of calcium (in
    milligrams) were collected for 36 women between
    the ages of 18 and 24 years.
  • The sample average is x̄ = 898.44 mg.
  • The standard deviation is s = 422 mg.
  • The sample size is n = 36.
  • The standard error is S.E. = 422/sqrt(36) = 70.33 mg.
  • The 95% confidence interval is
    (898.44 - t_{0.025} × 70.33, 898.44 + t_{0.025} × 70.33).
  • The value of t_{0.025} is 2.03, thus a 95% C.I. for µ
    is (755.66 mg, 1041.23 mg).
  • We are 95% confident that the true average
    calcium intake is a value between 755.66 mg and
    1041.23 mg.
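
The same interval can be reproduced outside a spreadsheet. The sketch below uses only the Python standard library, so the critical value is obtained numerically (Simpson's rule for the area under the t curve, then bisection) as a stand-in for Excel's TINV. The helper names `t_pdf`, `t_cdf`, `t_critical`, and `t_confidence_interval` are hypothetical, and this numerical approach is an assumption of the sketch, not the method used in the slides.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, integrating the density with Simpson's rule."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(alpha, df):
    """Two-sided critical value t_{alpha/2}, analogous to TINV(alpha, df)."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # expand until the bracket holds the root
        hi *= 2
    for _ in range(60):            # bisection on the bracket [lo, hi]
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_confidence_interval(average, s, n, level=0.95):
    """Return (lower, upper) = average -/+ t_{alpha/2} * s / sqrt(n)."""
    margin = t_critical(1 - level, n - 1) * s / math.sqrt(n)
    return average - margin, average + margin


# Summary statistics from the calcium example.
lower, upper = t_confidence_interval(898.44, 422, 36, level=0.95)
print(f"t = {t_critical(0.05, 35):.3f}, 95% C.I. = ({lower:.2f}, {upper:.2f}) mg")
```

Passing a different `level`, for example 0.90, changes only the critical value; the sample average and the standard error stay the same.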

8
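Excel formulas used in the worksheet:
  • Sample size n: COUNT(data)
  • Standard error: B4/sqrt(B5), i.e. stdev/sqrt(n)
  • Degrees of freedom: B5-1, i.e. n-1
  • Critical value: TINV((1-B6), B10), i.e. TINV(alpha, df)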
9
Understanding a 95% confidence interval
For about 95 out of 100 samples, the population
average µ lies in the associated 95% confidence
interval. Suppose we take 25 samples of 36
women between 18 and 24 years of age, and for each
sample we compute the sample average and the 95%
C.I.
[Figure: distribution of sample averages, with the true value µ marked]
Why do the intervals move around? How many
intervals contain the true value µ?
In the long run, 95% of all samples will
produce an interval that contains the true value
µ. Be careful, though: the C.I. computed from the
sample collected in the study might NOT contain
the true average value!
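
The long-run interpretation can be checked with a simulation. The sketch below assumes a hypothetical normal population whose mean and standard deviation are known (the values 900 and 420 are invented for illustration), draws many samples of 36, and counts how often the 95% interval contains the true mean. The critical value 2.03 is the one used in the calcium example.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population of intakes: normal with known mean and sd,
# so that we can check whether each interval captures the true mean.
TRUE_MEAN = 900.0
TRUE_SD = 420.0
N = 36
T_CRIT = 2.03  # t_{0.025} with 35 degrees of freedom, as in the example


def interval_covers_mean():
    """Draw one sample of size N; report whether its 95% C.I. contains TRUE_MEAN."""
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    average = statistics.mean(sample)
    margin = T_CRIT * statistics.stdev(sample) / math.sqrt(N)
    return average - margin <= TRUE_MEAN <= average + margin


num_samples = 2_000
hits = sum(interval_covers_mean() for _ in range(num_samples))
coverage = hits / num_samples
print(f"{hits} of {num_samples} intervals contain the true mean ({coverage:.1%})")
```

The fraction of intervals that contain the true mean should be close to 95%, while any single interval either contains it or does not.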
10
What is the t-distribution?
The t-distribution with n-1 degrees of freedom is
a symmetric distribution with center at 0. For
large n, the t-distribution is close to the
standard normal distribution.
11
Comparing the t-distribution curve and the
standard normal curve
[Figure: t-distribution curves for d.f. = 5, 15, and 30, each compared with the standard normal curve]
The t-distribution curve has fatter tails than the
standard normal curve. For d.f. around 30, the
t-distribution curve is very similar to the
standard normal curve.
12
A different confidence level
  • Suppose we want to compute a 90% confidence
    interval for the average calcium intake.
  • We will use the same formula, with a different
    critical value t.
  • The sample average is 898.44 mg.
  • The standard deviation is s = 422 mg.
  • The sample size is n = 36.
  • The standard error is S.E. = 422/sqrt(36) = 70.33 mg.
  • The confidence level is C = 0.90, so α = 1 - C = 0.10.
  • The 90% confidence interval is
    (898.44 - t_{0.05} × 70.33, 898.44 + t_{0.05} × 70.33).

13
The critical value is t_{0.05} = 1.690 (with
n-1 = 35 degrees of freedom). The C.I. is
(898.44 - 1.690 × 70.33, 898.44 + 1.690 × 70.33)
= (779.58 mg, 1017.30 mg).
With 90% confidence, we state that the average
calcium intake is between 779.58 mg and 1017.30 mg.
14
Approximate Confidence Intervals
The normal approximation can be used to compute
approximate confidence intervals if the sample
size is large (n > 30).
[Figure: normal curve; the area between µ - 1.96 S.E. and µ + 1.96 S.E. is 95%]
Margin of error for each confidence level:
  • 90% confidence interval: 1.64 × S.E.
  • 95% confidence interval: 1.96 × S.E.
  • 99% confidence interval: 2.57 × S.E.
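
The multipliers of the standard error come from the standard normal curve and can be recomputed with Python's standard library. This is a minimal sketch; the function name `z_critical` is hypothetical. The printed values carry more decimal places than the multipliers quoted on the slide.

```python
from statistics import NormalDist


def z_critical(level):
    """Multiplier z such that the central area under the standard normal curve is `level`."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: margin of error = {z_critical(level):.3f} x S.E.")
```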
15
Expressions for C.I.s
x̄ is the sample average of n observations in a
simple random sample of size n, where n is large
(n > 30).
s is the standard deviation of the n
observations.
The 90% C.I. for the population mean: x̄ ± 1.64 × s/sqrt(n)
The 95% C.I. for the population mean: x̄ ± 1.96 × s/sqrt(n)
The 99% C.I. for the population mean: x̄ ± 2.57 × s/sqrt(n)
16
General remarks on C.I.s
  • The purpose of a C.I. is to estimate an unknown
    parameter with an indication of how accurate the
    estimate is and of how confident we are that the
    result is correct.
  • The methods used here rely on the assumption that
    the sample is randomly selected.
  • Any confidence interval has two parts:
    estimate ± margin of error.
  • The confidence level states the probability that
    the method will give a correct answer, i.e. that
    the confidence interval contains the true value
    of the parameter.
  • The margin of error of a confidence interval
    decreases as:
  • the confidence level decreases;
  • the sample size n increases.
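
Both effects on the margin of error can be tabulated directly. The sketch below uses the normal-curve approximation and the standard deviation s = 422 from the calcium example; the function name `margin_of_error` and the larger sample sizes are hypothetical choices for illustration.

```python
import math
from statistics import NormalDist


def margin_of_error(s, n, level):
    """Approximate (normal-curve) margin of error: z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return z * s / math.sqrt(n)


s = 422  # standard deviation from the calcium example
for n in (36, 144, 576):
    row = ", ".join(
        f"{level:.0%}: {margin_of_error(s, n, level):7.2f}"
        for level in (0.90, 0.95, 0.99)
    )
    print(f"n = {n:3d} -> {row}")
```

Reading across a row shows the margin growing with the confidence level; reading down a column shows it halving each time the sample size is quadrupled.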

17
Remarks
  • Notice the trade-off between the margin of error
    and the confidence level. The greater the
    confidence you want to place in your estimate,
    the larger the margin of error is (and hence the
    less informative your interval becomes).
  • A C.I. gives the range of values for the unknown
    population average that are plausible in
    light of the observed sample average. The
    confidence level says how plausible.
  • A C.I. is defined for the population parameter,
    NOT the sample statistic.
  • To make the margin of error smaller, you can take
    a larger sample!
  • Use the t-distribution in small samples (n < 30).
    For large samples, the t-distribution is
    very close to the standard normal distribution.

18
Testing hypotheses
  • The recommended daily allowance (RDA) of calcium
    for women between 18 and 24 years of age is 1300
    milligrams. A health organization claims that,
    on average, women in this age range consume less
    calcium than the RDA level.
  • Using the collected data, what can we conclude
    regarding the claim of the health organization?

19
Testing hypotheses
Confidence intervals can be used to test
conjectures or hypotheses about a certain
characteristic of interest.
A trucking firm doubts the claim that the average
lifetime of certain tires is at least 28,000
miles. To check the claim, the firm puts 80 of
these tires on its trucks and gets an average
lifetime of 27,563 miles with a standard deviation
of 1,348 miles. What can you conclude from the
data?
We can construct a confidence interval and check
whether it contains the value of 28,000 miles. If
it does, we could conclude that 28,000 is
plausible in light of the data!
20
Testing hypotheses
  • A 95% C.I. for the average lifetime is
  • (Are we using the t-distribution or the normal
    curve?)
  • 27,563 ± 1.96 × 1,348/sqrt(80) = 27,563 ± 295.39
    miles = (27,267.61, 27,858.39).
  • The entire confidence interval lies below 28,000
    miles, so 28,000 is not a plausible value for the
    average lifetime. The data suggest that the tires
    last a shorter time than claimed.
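
The arithmetic for the tire example can be checked with a short script. This is a minimal sketch using the summary statistics from the slides and the normal-curve multiplier 1.96; the variable names are hypothetical.

```python
import math

# Summary statistics from the tire example.
n = 80
average = 27_563.0  # sample average lifetime in miles
s = 1_348.0         # sample standard deviation in miles
claimed = 28_000.0  # claimed average lifetime in miles
z = 1.96            # normal-curve multiplier for 95% confidence (n is large)

margin = z * s / math.sqrt(n)
lower, upper = average - margin, average + margin
claim_is_plausible = lower <= claimed <= upper

print(f"95% C.I. = ({lower:.2f}, {upper:.2f}) miles")
print(f"28,000 miles inside the interval? {claim_is_plausible}")
```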