45-733: lecture 7 (chapter 6)

Description:

There is some population we are interested in: Families in the US ... Take 1 out of each 10,000 units off our prod line. 11/11/09 ... – PowerPoint PPT presentation

Slides: 61
Provided by: Andre7

1
45-733 lecture 7 (chapter 6)
  • Sampling Distributions

2
Samples from populations
  • There is some population we are interested in
  • Families in the US
  • Products coming off our assembly line
  • Consumers in our product's market segment
  • Employees

3
Samples from populations
  • We are interested in some quantitative
    information (called variables) about these
    populations
  • Income of families in the US
  • Defects in products coming off our assembly line
  • Perception of consumers of our product
  • Productivity of our employees

4
Samples from populations
  • All the information (accessible to statistics)
    about a quantity in a population is contained in
    its distribution function
  • Real-world distribution functions are complicated
    things
  • In real life, we usually know little or nothing
    about the distribution functions of the variables
    we are interested in

5
Samples from populations
  • Because distribution functions are complex, we
    only try to find out about certain aspects of
    them (parameters)
  • Average income of families in the US
  • Rate of defects coming off our production line
  • Percentage of customers who view our product favorably
  • Average pieces/hour finished by a worker

6
Samples from populations
  • Of course, we do not begin by knowing even these
    quantities
  • One possibility is to measure the whole
    population
  • Allows us to answer any question about the
    distribution or parameters, using the techniques
    of chapter 2
  • However, this is almost always expensive and
    often infeasible

7
Samples from populations
  • Instead, we take a sample
  • Taking a sample
  • We select only a few of the members of the
    population
  • We measure the variables of interest for those
    members we select
  • Examples
  • Phone survey
  • Take 1 out of each 10,000 units off our prod line

8
Samples from populations
  • The whole of statistics is figuring out what we
    can learn about the population from a sample
  • What can we say about the distribution of a
    variable from the information in a sample?
  • What can we say about the parameters we are
    interested in from our sample?
  • How good is the information in our sample about
    the population?

9
Samples from populations
  • Example
  • We are interested in how favorably our product is
    viewed by customers
  • We do a phone survey of our 5 good friends and
    ask them if they view our product favorably or
    unfavorably
  • All 5 say favorably
  • What can we conclude?

10
Samples from populations
  • Example
  • We are interested in how favorably our product is
    viewed by customers
  • We do a phone survey of 500 people who have
    purchased our product before and ask them if they
    view our product favorably or unfavorably
  • 466 say they view our product favorably
  • What can we conclude?

11
Samples from populations
  • Example
  • We are interested in how favorably our product is
    viewed by customers
  • We do a phone survey of 500 random adults and
    ask them if they view our product favorably or
    unfavorably
  • 351 say they view our product favorably
  • What can we conclude?
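
As a point of reference, the sample proportions in the three example surveys can be computed directly. This is only a sketch of the arithmetic, with illustrative variable names. The numbers alone say nothing about how well each sample represents our customers, which is the question the examples raise.

```python
# Favorable responses and sample sizes from the three example surveys.
surveys = {
    "5 good friends": (5, 5),
    "500 past purchasers": (466, 500),
    "500 random adults": (351, 500),
}

# Sample proportion = favorable responses / sample size.
proportions = {name: fav / n for name, (fav, n) in surveys.items()}

for name, p_hat in proportions.items():
    print(f"{name}: {p_hat:.3f}")
```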

12
Samples and statistics
  • As a practical matter, we are usually interested
    in using our sample to say something about a
    parameter of the distribution we care about
  • To get at this parameter, we construct a variable
    called an estimator or statistic

13
Samples and statistics
  • Example
  • If we want to know the average income of families
    in the US, we draw a sample from a random phone
    survey of 1000 families
  • We ask, among other things, for their family
    income
  • To estimate E(I), we calculate the estimator or
    statistic called the sample mean

14
Samples and statistics
  • Example
  • But, what does the sample mean of income tell us
    about E(I)?
  • Answering this question is the subject of the
    rest of the course, and of statistics in general

15
Random sampling
  • There are different ways to sample a population,
    different sampling schemes
  • The simplest sampling scheme is called simple
    random sampling or just random sampling
  • If there is a population of size N from which we
    are to draw a sample of size n, random sampling
    means that on each draw every one of the N members
    of the population has probability 1/N of being
    drawn, and that the draws are independent
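
A minimal sketch of this sampling scheme, assuming the population fits in a list. The function name `simple_random_sample` and the population of unit IDs are hypothetical. Draws are made with replacement, which is what keeps every draw at probability 1/N and independent of the others.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n members independently, each draw uniform over the population.

    Draws are made with replacement, so every member has probability 1/N
    on every draw and the draws are independent.
    """
    rng = random.Random(seed)
    return [rng.choice(population) for _ in range(n)]

# Hypothetical population: unit IDs 0..9999 coming off a production line.
population = list(range(10_000))
sample = simple_random_sample(population, n=5, seed=0)
print(sample)
```

Python's `random.sample` draws without replacement instead. For a large population and a small sample the difference is negligible.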

16
Statistic or estimator
  • A statistic (or estimator) is any function of a
    sample
  • It is an algorithm which tells us what we would
    do given a sample
  • Example
  • Sample mean
  • Sample variance
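
Both examples can be written as plain functions of a sample, which is all a statistic is. The sketch below uses hypothetical data and assumes the usual n - 1 divisor for the sample variance. The standard library's `statistics` module is used only as a cross-check.

```python
import statistics

def sample_mean(xs):
    """Sum of the observations divided by the sample size."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

sample = [2.0, 4.0, 4.0, 6.0]  # hypothetical observations

print(sample_mean(sample), sample_variance(sample))
# Cross-check against the standard library implementations.
print(statistics.mean(sample), statistics.variance(sample))
```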

17
Statistic as random variable
  • A statistic is a random variable!!

18
Statistic as random variable
  • A simple example
  • Consider the Bernoulli random variable X with
    parameter p
  • We are interested in p, the probability of a
    success
  • To estimate p, we will calculate the sample mean
    of X

19
Statistic as random variable
  • A simple example
  • First, with a sample size of n = 1

20
Statistic as random variable
  • A simple example
  • Next, with a sample size of n = 2

21
Statistic as random variable
  • A simple example
  • Next, with a sample size of n = 3
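
For these three sample sizes, the sample mean of the Bernoulli draws would be written as follows, assuming the draws are labeled X_1, X_2, X_3:

```latex
\begin{align*}
n = 1:&\quad \bar{X} = X_1 \\
n = 2:&\quad \bar{X} = \tfrac{1}{2}(X_1 + X_2) \\
n = 3:&\quad \bar{X} = \tfrac{1}{3}(X_1 + X_2 + X_3)
\end{align*}
```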

22
Statistic as random variable
  • The statistic is a random variable
  • It has a distribution
  • Probability function or density
  • Cumulative distribution function
  • It has an expectation
  • It has a variance / standard deviation

23
Statistic as random variable
  • For the Bernoulli example
  • Expectation, variance with n = 1

24
Statistic as random variable
  • For the Bernoulli example
  • Expectation, variance with n = 2

25
Statistic as random variable
  • For the Bernoulli example
  • Expectation, variance with n = 3
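
A sketch of these quantities, assuming an independent sample so that the standard results for a Bernoulli variable with parameter p apply:

```latex
\begin{align*}
n = 1:&\quad E(\bar{X}) = p, \quad \mathrm{Var}(\bar{X}) = p(1-p) \\
n = 2:&\quad E(\bar{X}) = p, \quad \mathrm{Var}(\bar{X}) = \frac{p(1-p)}{2} \\
n = 3:&\quad E(\bar{X}) = p, \quad \mathrm{Var}(\bar{X}) = \frac{p(1-p)}{3}
\end{align*}
```

The expectation stays at p while the variance falls as n grows.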

26
Statistic as random variable
  • For the Bernoulli example
  • Probability function, n = 1

[Figure: probability function of the sample mean for n = 1, with mass 1-p at 0 and mass p at 1]

27
Statistic as random variable
  • For the Bernoulli example
  • Probability function, n = 2

[Figure: probability function of the sample mean for n = 2, with possible values 0, 1/2, and 1]

28
Statistic as random variable
  • For the Bernoulli example
  • Probability function, n = 3

[Figure: probability function of the sample mean for n = 3, with possible values 0, 1/3, 2/3, and 1]

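
The probability function of the Bernoulli sample mean can be computed exactly, since k successes in n draws give a sample mean of k/n with the Binomial(n, p) probability. The sketch below assumes independent draws. The function name `sample_mean_pmf` and the value p = 0.3 are illustrative only.

```python
from fractions import Fraction
from math import comb

def sample_mean_pmf(n, p):
    """Return {value: probability} for the mean of n independent Bernoulli(p) draws."""
    # k successes out of n give sample mean k/n with Binomial(n, p) probability.
    return {
        Fraction(k, n): comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
    }

p = 0.3  # hypothetical success probability
for n in (1, 2, 3):
    pmf = sample_mean_pmf(n, p)
    print(n, {str(value): round(prob, 4) for value, prob in pmf.items()})
```

The possible values become more numerous as n grows, and the probabilities always sum to 1.
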
29
Sample mean
  • As we have discussed before, the sample mean of a
    random variable X from a sample of size n is
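
In the usual notation, with X_1, ..., X_n denoting the n observations in the sample:

```latex
\[
\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i
\]
```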

30
Sample mean
  • The sample mean is a random variable!!
  • The sample mean is made out of n random variables;
    therefore, it is a random variable

31
Sample mean
  • Let's suppose X is a random variable with mean $\mu_X$
    and standard deviation $\sigma_X$, and let's consider the
    sample mean

32
Sample mean
  • Since the sample mean is a random variable, we
    can ask about its expectation

33
Sample mean
  • Since the sample mean is a random variable, we
    can ask about its expectation
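
A sketch of the calculation, assuming each observation X_i has the same mean as X (written \mu_X here) and using linearity of expectation:

```latex
\begin{align*}
E(\bar{X}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(X_i)
            = \frac{1}{n}\, n\,\mu_X
            = \mu_X
\end{align*}
```

This step does not require independence.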

34
Sample mean
  • The expectation of the sample mean is equal to
    the expectation of the underlying random variable
  • On average, the sample mean is equal to the mean
    of the underlying random variable

35
Sample mean
  • We can also ask about the variance of the sample
    mean

36
Sample mean
  • If it is an independent, random sample then the
    covariances are all zero
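
A sketch of the variance calculation, assuming each X_i has variance \sigma_X^2. The covariance sum runs over all ordered pairs with i different from j, and it vanishes for an independent sample:

```latex
\begin{align*}
\mathrm{Var}(\bar{X})
  &= \mathrm{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^2}\left[\sum_{i=1}^{n}\mathrm{Var}(X_i)
     + \sum_{i \neq j}\mathrm{Cov}(X_i, X_j)\right] \\
  &= \frac{1}{n^2}\, n\,\sigma_X^2
   = \frac{\sigma_X^2}{n}
\end{align*}
```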

37
Sample mean
  • The variance of the sample mean is less than the
    variance of the underlying random variable
  • The variance of the sample mean gets smaller as
    the sample size increases
  • The variance of the sample mean goes to zero as
    the sample size goes to infinity

38
Sample mean
  • Our two results

39
Sample mean
  • Say that
  • On average, the sample mean is equal to the mean
    of the underlying random variable, regardless of
    sample size
  • As the sample size grows, the variance of the
    sample mean shrinks, eventually approaching zero

40
Sample mean
  • What would happen if the sample size went to
    infinity?
  • Then the variance of the sample mean would be zero:
    the sample mean would no longer be random, and
    would equal the population mean, E(X)

41
Sample mean
  • Suppose $X \sim N(1, 1)$

[Figure: densities of the sample mean for n = 1 and n = 100]

42
Sample mean
  • Suppose $X \sim N(1, 1)$

[Figure: densities of the sample mean for n = 1, n = 100, and n = 1000]

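
The narrowing of the sample mean's distribution can be checked by simulation. The sketch below draws repeated samples from N(1, 1) for n = 1, 100, and 1000 and compares the empirical variance of the sample mean with 1/n. The number of repetitions and the seed are arbitrary choices, not taken from the lecture.

```python
import random
import statistics

def sample_means(n, reps, rng):
    """Return `reps` sample means, each computed from n independent N(1, 1) draws."""
    return [sum(rng.gauss(1.0, 1.0) for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(0)  # fixed seed so the run is repeatable
results = {}
for n in (1, 100, 1000):
    means = sample_means(n, reps=500, rng=rng)
    # Store the average and the variance of the simulated sample means.
    results[n] = (statistics.fmean(means), statistics.variance(means))
    print(f"n={n}: mean={results[n][0]:.3f}, var={results[n][1]:.5f}, 1/n={1 / n:.5f}")
```

The average of the sample means should stay near 1 for every n, while their variance should track 1/n.
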
43
Sample mean
  • Finite sample correction
  • What has gone before has assumed either that you
    sample with replacement or that the population
    you are sampling from is very large (infinite)
  • Just as we needed to use the hypergeometric rather
    than the binomial when sampling from a small
    population without replacement, here we need a
    correction to the variance of the sample mean

44
Sample mean
  • Finite sample correction
  • For a population of size N, sampled without
    replacement by a sample of size n
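
The standard form of this correction, given here as the usual textbook result rather than as a transcription of the slide, multiplies the variance of the sample mean by (N - n)/(N - 1):

```latex
\[
\mathrm{Var}(\bar{X}) = \frac{\sigma_X^2}{n}\cdot\frac{N - n}{N - 1}
\]
```

The factor is close to 1 when N is much larger than n, and it equals 0 when n = N, since the sample is then the whole population.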

45
Sample mean
  • Normal variables and X-bar
  • If X is normal, then so is X-bar
  • If X is normal, then
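
Combining normality with the mean and variance of X-bar, and assuming an independent random sample, the standard statement is:

```latex
\[
\bar{X} \sim N\!\left(\mu_X,\ \frac{\sigma_X^2}{n}\right)
\qquad\text{so that}\qquad
Z = \frac{\bar{X} - \mu_X}{\sigma_X / \sqrt{n}} \sim N(0, 1)
\]
```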

46
Sample mean
  • Central limit theorem and X-bar
  • As long as X comes from an independent random
    sample
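
The central limit theorem says that for large n the sample mean is approximately normal even when X is not. A small simulation can illustrate this. The sketch below uses Exponential(1) draws, a skewed distribution chosen only for illustration, and checks how often the standardized sample mean falls within +/- 1.96. The sample size, repetition count, and seed are arbitrary.

```python
import math
import random

def standardized_means(n, reps, rng):
    """Standardized sample means of n independent Exponential(1) draws.

    Exponential(1) has mean 1 and standard deviation 1, so the sample
    mean has mean 1 and standard deviation 1 / sqrt(n).
    """
    out = []
    for _ in range(reps):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        out.append((xbar - 1.0) / (1.0 / math.sqrt(n)))
    return out

rng = random.Random(0)  # fixed seed so the run is repeatable
z = standardized_means(n=50, reps=4000, rng=rng)

# For a standard normal, about 95% of the mass lies within +/- 1.96.
share_within = sum(abs(v) <= 1.96 for v in z) / len(z)
print(f"share within +/-1.96: {share_within:.3f}")
```

If the normal approximation is good, the printed share should be close to 0.95.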

47
Sample proportion
  • Consider W a Bernoulli and an independent random
    sample of size n
  • Observe that $X = W_1 + W_2 + \dots + W_n$ is
    distributed Binomial (and therefore approx normal)

48
Sample proportion
  • The sample mean (i.e., sample proportion) is
  • Just a binomial divided by n
  • Also approx normal

49
Sample proportion
  • To emphasize that we are estimating the p
    parameter of the Bernoulli, we may write

50
Sample proportion
  • Just as before, the sample mean has the same
    expectation as the underlying Bernoulli random
    variable

51
Sample proportion
  • Just as before, the sample mean has the variance
    of the underlying Bernoulli random variable over
    n
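
In symbols, writing the sample proportion as \hat{p} (a common notation, assumed here) and X for the binomial count of successes:

```latex
\begin{align*}
\hat{p} &= \bar{W} = \frac{X}{n} \\
E(\hat{p}) &= p \\
\mathrm{Var}(\hat{p}) &= \frac{p(1-p)}{n}
\end{align*}
```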

52
Sample proportion
  • Just as before, if there is a finite population
    sampled w/o replacement
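
A minimal sketch of the variance of the sample proportion, with the finite-population factor (N - n)/(N - 1) applied only when a population size is supplied. The function name and the example numbers are hypothetical.

```python
def proportion_variance(p, n, population_size=None):
    """Variance of the sample proportion for a sample of size n.

    If population_size (N) is given, apply the factor (N - n) / (N - 1)
    for sampling without replacement from a finite population.
    """
    var = p * (1 - p) / n
    if population_size is not None:
        var *= (population_size - n) / (population_size - 1)
    return var

# Hypothetical numbers: p = 0.5, sample of 100, population of 1000.
print(proportion_variance(0.5, 100))
print(proportion_variance(0.5, 100, population_size=1000))
```

With the correction the variance is smaller, and it drops to zero if the sample is the whole population.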

53
Sample variance
  • As we have discussed before, the sample variance
    and sample standard deviation are given by
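
In the usual notation, with the n - 1 divisor:

```latex
\[
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2,
\qquad
s = \sqrt{s^2}
\]
```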

54
Sample variance
  • Sometimes these are written

55
Sample variance
  • It turns out that

56
Sample variance
  • It turns out that

57
Sample variance
  • It turns out that, if X is distributed normal
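
Two standard results about the sample variance are the likely content here; that identification is an assumption. The first holds for any independent random sample with variance \sigma_X^2, and the second additionally requires X to be normal:

```latex
\begin{align*}
E(s^2) &= \sigma_X^2 \\
\frac{(n-1)\,s^2}{\sigma_X^2} &\sim \chi^2_{n-1}
\end{align*}
```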

58
Sample variance
  • It turns out that (by the CLT), if X is from an
    independent random sample

59
Sample variance
  • Discuss Chi-Squared distribution

60
Sample variance
  • Example (problem 46, page 251)
  • A drug company manufactures pills
  • These pills have normally distributed weight
  • The drug co wants the variance of weight to be
    smaller than 1.5 milligrams squared
  • Drug co collects a sample of size 20
  • The sample variance is 2.05
  • How likely is it that a sample variance this
    high or higher would be found if the true
    variance is 1.5?
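
One way to work this example, assuming the relevant result is that (n - 1)s^2/sigma^2 follows a chi-squared distribution with n - 1 degrees of freedom when weights are normal. The sketch computes the upper-tail probability with a series for the regularized incomplete gamma function, using only the standard library. The helper names are hypothetical.

```python
import math

def regularized_lower_gamma(a, x):
    """Regularized lower incomplete gamma function P(a, x), by power series."""
    if x <= 0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 1
    # Series: sum over n of x^n / (a (a+1) ... (a+n)).
    while abs(term) > 1e-15 * abs(total):
        term *= x / (a + n)
        total += term
        n += 1
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def chi2_sf(x, df):
    """P(chi-squared variable with df degrees of freedom >= x)."""
    return 1.0 - regularized_lower_gamma(df / 2.0, x / 2.0)

n = 20             # sample size
sample_var = 2.05  # observed sample variance
true_var = 1.5     # hypothesized true variance

statistic = (n - 1) * sample_var / true_var
p_value = chi2_sf(statistic, n - 1)
print(f"statistic = {statistic:.3f}, P(s^2 >= 2.05) = {p_value:.3f}")
```

The statistic is 19 x 2.05 / 1.5, about 25.97. A chi-squared table with 19 degrees of freedom places the probability between 0.10 and 0.25. If SciPy is available, `scipy.stats.chi2.sf(statistic, n - 1)` should give the same tail probability.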