MATH 401 Probability and Statistics - PowerPoint PPT Presentation

Provided by: mathG9

1
MATH 401 Probability and Statistics
  • Spring 2009

2
Basis for Inferential Statistics
  • We assume that an unknown population is described
    by a random variable.
  • In that case, a histogram and an ogive of
    relative frequencies based on a sample
    approximate the shape of the PDF and of the
    cumulative distribution function, respectively.
  • Other important characteristics of a random
    variable are the expectation and the variance.
  • Summarizing data is essentially an initial
    attempt to estimate these parameters.
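
As a rough illustration of the histogram and ogive idea, here is a minimal Python sketch. The data, the bin edges, and the helper name `relative_frequencies` are all made up for this example and do not come from the lecture.

```python
from bisect import bisect_right
from itertools import accumulate

def relative_frequencies(sample, edges):
    """Relative frequency of each half-open bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for x in sample:
        # Index of the bin whose left edge is the largest edge <= x
        i = bisect_right(edges, x) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    return [c / len(sample) for c in counts]

sample = [1.2, 1.9, 2.1, 2.4, 2.8, 3.0, 3.3, 3.7, 4.1, 4.6]  # made-up data
edges = [1, 2, 3, 4, 5]                                      # made-up bin edges

hist = relative_frequencies(sample, edges)  # heights of the relative-frequency histogram
ogive = list(accumulate(hist))              # cumulative relative frequencies

print(hist)
print(ogive)
```

The histogram heights play the role of the PDF and the running totals play the role of the cumulative distribution function; the last ogive value is 1 whenever every observation falls inside the bins.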

3
Population Mean
  • For a population of size $N$, its mean, $\mu$, is given
    by
    $$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

4
Sample Mean
  • For a sample of size $n$, the sample mean is given
    by
    $$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

5
The Population Variance
  • The population variance is given by
    $$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$$

6
Sample Variance
  • For a better estimate (???), the sample variance
    is defined by
    $$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$$
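
The four definitions above can be checked numerically. The sketch below uses Python's standard `statistics` module; the data are invented, and treating the first four values as "the sample" is an assumption made only for illustration.

```python
import statistics

data = [21, 23, 22, 25, 24, 23, 26, 22]  # made-up values, treated as the whole population

mu = statistics.fmean(data)              # population mean: divides by N
sigma2 = statistics.pvariance(data)      # population variance: divides by N

sample = data[:4]                        # hypothetical sample of size n = 4
x_bar = statistics.fmean(sample)         # sample mean: divides by n
s2 = statistics.variance(sample)         # sample variance: divides by n - 1

# The same sample variance written out from the definition
s2_manual = sum((x - x_bar) ** 2 for x in sample) / (len(sample) - 1)

print(mu, sigma2, x_bar, s2, s2_manual)
```

Note that `pvariance` and `variance` differ only in the divisor, which mirrors the difference between the population and sample formulas.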

7
Parameter Estimation
  • Lecture 9

8
Parameters and Statistics
  • A parameter is a population measure (e.g. $\mu$, $\sigma^2$).
  • A statistic is a function of the sample (e.g. the
    sample mean, the sample variance).
  • Hence, statistics may be regarded as random
    variables.
  • Statistics are used to estimate parameters and
    are called point estimators.
  • A point estimate of a parameter is a single
    numerical value of a respective estimator.
  • The standard deviation of an estimator is called
    the standard error.
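
To see a statistic behaving as a random variable, one can draw many samples and look at how the point estimates spread out. The following is a simulation sketch only: the normal population and the values of `mu`, `sigma` and `n` are hypothetical choices, not taken from the lecture.

```python
import random
import statistics

random.seed(0)
mu, sigma, n = 10.0, 2.0, 25   # hypothetical population parameters and sample size

def sample_mean():
    """Draw one sample of size n and return its mean (one point estimate of mu)."""
    return statistics.fmean([random.gauss(mu, sigma) for _ in range(n)])

# Each repetition gives a different point estimate of mu
estimates = [sample_mean() for _ in range(5000)]

# The standard error is the standard deviation of the estimator across samples
standard_error = statistics.stdev(estimates)

print(statistics.fmean(estimates), standard_error)
```

With these inputs the simulated standard error should come out close to $\sigma/\sqrt{n} = 2/\sqrt{25} = 0.4$, the value used for the sample mean later in the lecture.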

9
Good Point Estimator
  • Let $\alpha$ be a parameter and $A_n$ a statistic
    estimating $\alpha$ (based on a sample of size $n$). $A_n$ is
    a good estimator of $\alpha$ if:
  • It is unbiased: $E(A_n) = \alpha$.
  • It is consistent: $A_n$ converges to $\alpha$ in
    probability as $n \to \infty$.
  • It is relatively efficient: it has the smallest
    variance.

10
Estimating the Mean and Variance
  • Given a population of size $N$, we need to find its
    mean $\mu$ and variance $\sigma^2$.
  • The number $N$ is too big, so we pick a sample of
    reasonable (?) size $n$.
  • Find the sample mean $\bar{x}$ and sample variance $s^2$.
  • How good is $\bar{x}$ as an estimate of $\mu$?
  • How good is $s^2$ as an estimate of $\sigma^2$?

11
Reminder 1: Scaling of Expectation and Variance
  • Let $a$ be a real number.
  • Then $aX$ is a new random variable whose
    distribution is a rescaled version of that of $X$.
  • We observe that
    $$E(aX) = a\,E(X), \qquad \mathrm{Var}(aX) = a^2\,\mathrm{Var}(X)$$

12
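Reminder 2: Sum of Independent RV
  • Let $X_k$, $k = 1, \dots, n$, be independent RV. Then
    $$E\left(\sum_{k=1}^{n} X_k\right) = \sum_{k=1}^{n} E(X_k), \qquad \mathrm{Var}\left(\sum_{k=1}^{n} X_k\right) = \sum_{k=1}^{n} \mathrm{Var}(X_k)$$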

13
Conclusion
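  • Let $X_k$, $k = 1, \dots, n$, be independent RV. Then
    $$E\left(\frac{1}{n}\sum_{k=1}^{n} X_k\right) = \frac{1}{n}\sum_{k=1}^{n} E(X_k), \qquad \mathrm{Var}\left(\frac{1}{n}\sum_{k=1}^{n} X_k\right) = \frac{1}{n^2}\sum_{k=1}^{n} \mathrm{Var}(X_k)$$
  • In particular, if every $X_k$ has mean $\mu$ and
    variance $\sigma^2$, then $E(\bar{X}) = \mu$ and
    $\mathrm{Var}(\bar{X}) = \sigma^2/n$.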

14
Analysis of Sample Variance
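  • Let $X_1, \dots, X_n$ be independent, identically
    distributed normal random variables (NRV). One can
    show that
    $$\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} (X_i - \mu)^2 - n(\bar{X} - \mu)^2$$

15
Analysis of Sample Variance
  • We compute the expectation of the sum
    $$E\left[\sum_{i=1}^{n} (X_i - \bar{X})^2\right] = \sum_{i=1}^{n} E\left[(X_i - \mu)^2\right] - n\,E\left[(\bar{X} - \mu)^2\right]$$

16
Analysis of Sample Variance
  • It remains to notice that
    $$E\left[(X_i - \mu)^2\right] = \sigma^2, \qquad E\left[(\bar{X} - \mu)^2\right] = \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$$
  • Hence the expectation of the sum is
    $n\sigma^2 - \sigma^2 = (n-1)\sigma^2$, and so
    $E(S^2) = \sigma^2$: the sample variance is unbiased.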
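
The effect of dividing by $n-1$ rather than $n$ can also be seen in a simulation. This is only an illustrative sketch: the normal population, the seed, and the values of `sigma`, `n` and `trials` are assumptions chosen for the example.

```python
import random
import statistics

random.seed(1)
mu, sigma, n = 0.0, 3.0, 5   # hypothetical normal population and a small sample size
trials = 20000

unbiased, biased = [], []
for _ in range(trials):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(xs) / n
    ss = sum((x - x_bar) ** 2 for x in xs)  # sum of squared deviations from the sample mean
    unbiased.append(ss / (n - 1))           # sample variance S^2
    biased.append(ss / n)                   # same sum divided by n instead

mean_unbiased = statistics.fmean(unbiased)  # expected to be near sigma**2 = 9
mean_biased = statistics.fmean(biased)      # expected to be near (n - 1) / n * sigma**2 = 7.2

print(mean_unbiased, mean_biased)
```

Averaged over many samples, the $n-1$ version should land near $\sigma^2$, while the version with divisor $n$ systematically underestimates it by the factor $(n-1)/n$.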

17
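Reminder 3: Normal Distribution
  • A continuous RV is said to be normally
    distributed if its PDF is given by
    $$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty$$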

18
Reminder 4: Standard Normal Distribution
  • A non-standard ND can be standardized by
    $$Z = \frac{X - \mu}{\sigma}$$
  • That is, if $X \sim N(\mu, \sigma^2)$, then $Z \sim N(0, 1)$.

19
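Reminder 5: Sum of Normal Distributions
  • Let $X_k \sim N(\mu_k, \sigma_k^2)$, $k = 1, \dots, n$, be
    independent normally distributed RV. Then
    $$\sum_{k=1}^{n} X_k \sim N\left(\sum_{k=1}^{n} \mu_k,\ \sum_{k=1}^{n} \sigma_k^2\right)$$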

20
Distribution of a Normal Sample Mean
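  • Suppose all $X_i$ are independent and identically
    normally distributed, $X_i \sim N(\mu, \sigma^2)$.
  • Then the sample mean $\bar{X}$ is normally
    distributed with
    $$E(\bar{X}) = \mu, \qquad \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$$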

21
Standardization of Sample Mean
  • Hence, the random variable
    $$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
  • has a standard normal distribution (SND).

22
Interval Estimates
  • An interval estimate of a parameter is an
    interval within which the parameter is estimated
    to lie.
  • The confidence level of an interval estimate is
    the probability that the interval contains the
    parameter.
  • Notation: An interval estimate with confidence
    level $1-\alpha$ is referred to as a
    $100(1-\alpha)\%$ confidence interval.

23
Interval Estimates on the Population Mean
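  • An interval estimate on the mean is an interval
    centered at the sample mean:
    $(\bar{x} - \varepsilon,\ \bar{x} + \varepsilon)$.
  • $\varepsilon$ is the maximum error of estimation.
  • Saying that $|\bar{x} - \mu| \le \varepsilon$ is
    equivalent to saying that
    $\mu - \varepsilon \le \bar{x} \le \mu + \varepsilon$.
  • How confident we are in this statement depends on
    $1-\alpha$, the confidence level of the interval.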

24
The Error and the Confidence Level
  • Recall that
    $$1 - \alpha = P(\bar{X} - \varepsilon \le \mu \le \bar{X} + \varepsilon)$$
  • It is clear that $1-\alpha$ grows as $\varepsilon$ increases.
  • You have a better chance of hitting the
    population mean if you widen the interval around
    the sample mean.
  • We would like to know the exact relation between
    $\alpha$ and $\varepsilon$.
  • For example, what would the error be if you would
    like to be 99% confident in your interval
    estimate of $\mu$?

25
Note

26
Observations
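  • If the original variable $X$ is normally
    distributed, then the sample mean $\bar{X}$ is normally
    distributed with mean $\mu$ and variance $\sigma^2/n$.
  • We're interested in
    $$P(x_1 \le \bar{X} \le x_2)$$
  • where $x_1 = \mu - \varepsilon$ and $x_2 = \mu + \varepsilon$.
  • The corresponding z-values are
    $$z_1 = \frac{x_1 - \mu}{\sigma/\sqrt{n}} = -\frac{\varepsilon\sqrt{n}}{\sigma}, \qquad z_2 = \frac{x_2 - \mu}{\sigma/\sqrt{n}} = \frac{\varepsilon\sqrt{n}}{\sigma}$$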

27
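The Relation between $\alpha$ and $\varepsilon$
  • Thus,
    $$1 - \alpha = P\left(-\frac{\varepsilon\sqrt{n}}{\sigma} \le Z \le \frac{\varepsilon\sqrt{n}}{\sigma}\right) = 2\,P\left(0 \le Z \le \frac{\varepsilon\sqrt{n}}{\sigma}\right)$$
  • That is, $1-\alpha$ is twice the area under the
    SND-curve between 0 and $\varepsilon\sqrt{n}/\sigma$.
  • Hence, $\varepsilon\sqrt{n}/\sigma = z_{\alpha/2}$, the
    $100(1-\alpha/2)$-percentage point of the standard ND
    variable, so that $\varepsilon = z_{\alpha/2}\,\sigma/\sqrt{n}$.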
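
The relation between $\alpha$ and the maximum error can be evaluated without a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `max_error` and the inputs passed to it are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def max_error(alpha, sigma, n):
    """Return (z_{alpha/2}, epsilon) for a 1 - alpha interval on the mean, sigma known."""
    # z_{alpha/2} is the 100(1 - alpha/2)-percentage point of N(0, 1)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z * sigma / sqrt(n)

z, eps = max_error(alpha=0.05, sigma=1.0, n=100)  # hypothetical inputs

# Check the "twice the area between 0 and z" statement
area = 2 * (NormalDist().cdf(z) - 0.5)

print(z, eps, area)
```

Here `area` recovers $1-\alpha$, and `z` is the familiar table value of about 1.96 for $\alpha = 0.05$.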

28
The Central Limit Theorem
  • Let $X_1, X_2, \dots, X_n$ be a sequence of
    independent identically distributed random
    variables, each having mean $\mu$ and variance $\sigma^2$.
  • Then, for large values of $n$, the distribution of
  • $X = X_1 + X_2 + \dots + X_n$
  • is approximately normal
  • with mean $n\mu$ and variance $n\sigma^2$.

29
Implications of the Central Limit Theorem
  • For large n,
  • The distribution of the sum of independent
    identically distributed random variables is
    approximately normal, although the variables
    themselves need not be normally distributed.
  • The distribution of the sample means is
    approximately normal, with mean $\mu$ and variance
    $\sigma^2/n$.
  • In many practical examples a sample of size 40 or
    more will be sufficient for the normal
    approximation to work well. In some cases the
    Central Limit Theorem will work even if $n < 40$.
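
A quick simulation can make the theorem concrete. The sketch below sums $n = 40$ exponential variables, which are clearly not normal; the choice of the Exponential(1) distribution, the seed, and the number of trials are assumptions made for this example.

```python
import random
from math import sqrt

random.seed(2)
n, trials = 40, 10000
mu, sigma2 = 1.0, 1.0   # mean and variance of an Exponential(1) variable

# Each entry is one realization of X = X_1 + ... + X_n
sums = [sum(random.expovariate(1.0) for _ in range(n)) for _ in range(trials)]

# Standardize with mean n*mu and variance n*sigma^2, as in the theorem
z_scores = [(s - n * mu) / sqrt(n * sigma2) for s in sums]

# For a standard normal variable, about 95% of values fall within +/- 1.96
coverage = sum(abs(z) <= 1.96 for z in z_scores) / trials

print(sum(sums) / trials, coverage)
```

If the normal approximation is working, the average of the sums should be near $n\mu = 40$ and `coverage` should be close to 0.95, even though each summand is strongly skewed.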

30
Example
  • The president of a large university wishes to
    estimate the average age of students presently
    enrolled. From past studies, the standard
    deviation is known to be 2 years. A sample of 50
    students is selected, and the mean is found to be
    23.2 years. Find the 95% confidence interval for
    the population mean.

31
Solution
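  • We need to find $\varepsilon$ such that
  • $P(23.2 - \varepsilon \le \mu \le 23.2 + \varepsilon) = 0.95 = 1-\alpha$
  • Hence, $\alpha = 0.05$ and $\alpha/2 = 0.025$.
  • Thus, we need to find $z$ (a.k.a. $z_{\alpha/2}$) such that
    $$P(0 \le Z \le z) = \frac{1-\alpha}{2} = 0.475$$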

32
Solution
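  • From the standard normal distribution table, we
    get $z_{\alpha/2} = z_{0.025} = 1.96$, so that
    $$\varepsilon = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = 1.96 \cdot \frac{2}{\sqrt{50}} \approx 0.554,$$
    which is about 0.6 when rounded to one decimal place.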

33
Solution
  • Hence, the 95% confidence interval for the
    population mean is
  • $(23.2 - 0.6,\ 23.2 + 0.6)$
  • $= (22.6,\ 23.8)$
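
The worked example can be reproduced in a few lines of Python. This is a sketch under the same assumptions as the example (known $\sigma$, normal approximation); the function name `mean_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(x_bar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    eps = z * sigma / sqrt(n)                # maximum error of estimation
    return x_bar - eps, x_bar + eps

# Numbers from the example: sample mean 23.2, sigma = 2, n = 50, 95% confidence
low, high = mean_confidence_interval(x_bar=23.2, sigma=2, n=50, confidence=0.95)

print(low, high)
```

Without intermediate rounding the maximum error is about 0.554, so the interval is roughly (22.65, 23.75); rounding the endpoints to one decimal place gives the (22.6, 23.8) quoted above.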

34
Example
  • A college president wishes to estimate the
    average age of students presently enrolled. How
    large a sample is necessary? The president would
    like to be 99% confident that the estimate is
    accurate to within 1 year. From a previous study
    the standard deviation of the ages is known to be
    3 years.

35
Solution
  • Here we are given the following
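  • $1-\alpha = 0.99$
  • $\sigma = 3$
  • $\varepsilon = 1$
  • We would like to know the sample size, $n$, such
    that
    $$\varepsilon = z\,\frac{\sigma}{\sqrt{n}}, \quad \text{that is,} \quad n = \left(\frac{z\,\sigma}{\varepsilon}\right)^2,$$
  • where $z = z_{\alpha/2}$.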

36
Solution
  • From the table, $z_{\alpha/2} = z_{0.005} = 2.58$.
  • Thus, $n = \left(\frac{(2.58)(3)}{1}\right)^2 \approx 59.9$,
  • which is rounded up to 60.
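
The sample-size calculation can be sketched the same way. The function name `required_sample_size` is hypothetical, and the code uses the exact normal quantile rather than the rounded table value 2.58.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, eps, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= eps, sigma known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    # Always round up: rounding down would give an error larger than eps
    return ceil((z * sigma / eps) ** 2)

# Numbers from the example: sigma = 3, maximum error 1 year, 99% confidence
n = required_sample_size(sigma=3, eps=1, confidence=0.99)

print(n)
```

The exact quantile is slightly smaller than 2.58, so the unrounded value of $n$ is a little below 59.9, but it still rounds up to the same sample size of 60.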

37
Estimating the Variance
  • Another parameter which often needs to be
    estimated is the variance $\sigma^2$.
  • Its natural estimator is the sample variance $S^2$.
  • In order to construct an interval estimate on the
    population variance we shall require a more
    detailed analysis of S2.

38
Analysis of Sample Variance
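  • Let $X_1, \dots, X_n$ be independent, identically
    distributed normal random variables (NRV). We showed
    that
    $$(n-1)S^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} (X_i - \mu)^2 - n(\bar{X} - \mu)^2$$

39
Analysis of Sample Variance
  • Standardization of ND implies that
    $$\frac{(n-1)S^2}{\sigma^2} = \sum_{i=1}^{n} \left(\frac{X_i - \mu}{\sigma}\right)^2 - \left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right)^2,$$
    where every term in parentheses on the right is a
    standard normal variable.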

40
Chi-Square Distribution
  • Let $Z_1, \dots, Z_n$ be independent standard
    normally-distributed random variables.
  • The random variable
    $$\chi^2 = Z_1^2 + Z_2^2 + \dots + Z_n^2$$
  • is said to have a chi-square distribution with $n$
    degrees of freedom (d.f.).
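
The definition translates directly into a simulation: square $n$ standard normal draws and add them up. The sketch below is illustrative only; the helper name `chi_square_draw`, the seed, and the choice of 4 degrees of freedom are assumptions.

```python
import random
import statistics

random.seed(3)
n, trials = 4, 20000   # hypothetical degrees of freedom and number of simulated values

def chi_square_draw(df):
    """One realization of Z_1^2 + ... + Z_df^2 with independent standard normal Z_i."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(df))

draws = [chi_square_draw(n) for _ in range(trials)]
mean_draw = statistics.fmean(draws)

print(min(draws), mean_draw)
```

Every simulated value is non-negative, and the average should be close to $n$, since each $Z_i^2$ has expectation 1.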

41
Future Plans
  • In the next meeting we are going to study the
    chi-square distribution in detail.
  • This will enable us to construct confidence
    intervals on the population variance.
  • So the next lecture is on
  • CONFIDENCE INTERVALS ON VARIANCE.

42
Thank you