Statistical Sampling - PowerPoint PPT Presentation

Description:

... the probability (likelihood) that the interval contains the ... 95% Confidence Intervals. 0.95. z.025= -1.96. z.025= 1.96. Dr. C. Ertuna. 27. CI for Proportions ... – PowerPoint PPT presentation

Slides: 46
Provided by: zeli1

Transcript and Presenter's Notes

1
Statistical Sampling Analysis of Sample Data
  • (Lesson - 04/A)
  • Understanding the Whole from Pieces

2
Sampling
  • Sampling is
  • Collecting sample data from a population and
  • Estimating population parameters
  • Sampling is an important tool in business
    decisions since it is an effective and efficient
    way of obtaining information about the population.

3
Sampling (Cont.)
  • How good is the estimate obtained from the
    sample?
  • The means of multiple samples of a fixed size (n)
    from some population will form a distribution
    called the sampling distribution of the mean
  • The standard deviation of the sampling
    distribution of the mean is called the standard
    error of the mean

4
Sampling (Cont.)
  • Standard Error of the mean: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
  • Estimates from larger sample sizes provide more
    accurate results
  • If the sample size is large enough the sampling
    distribution of the mean is approximately normal,
    regardless of the shape of the population
    distribution - Central Limit Theorem

5
Sampling Distribution of the Mean
THE CENTRAL LIMIT THEOREM: For samples of n
observations taken from a population with mean $\mu$
and standard deviation $\sigma$, regardless of the
population's distribution, provided the sample
size is sufficiently large, the distribution of
the sample mean $\bar{x}$ will be approximately normal
with a mean equal to the population mean
($\mu_{\bar{x}} = \mu$). Further, the standard deviation
will equal the population standard deviation divided by the
square root of the sample size ($\sigma_{\bar{x}} = \sigma / \sqrt{n}$).
The larger the sample size, the better the
approximation to the normal distribution.
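
The theorem can be checked with a small simulation. The sketch below is only an illustration: the exponential population, the sample size of 50, and the number of trials are assumptions chosen for the demo, not values from the slides.

```python
import random
import statistics

def sample_means(population, n, trials, seed=0):
    """Draw `trials` samples of size n (with replacement) and return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(trials)]

# A deliberately skewed (exponential) population; the parameters are illustrative.
pop_rng = random.Random(42)
population = [pop_rng.expovariate(1.0) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 50
means = sample_means(population, n, trials=4_000)

print(f"population mean      = {mu:.3f}")
print(f"mean of sample means = {statistics.fmean(means):.3f}")
print(f"sigma / sqrt(n)      = {sigma / n ** 0.5:.3f}")
print(f"std of sample means  = {statistics.stdev(means):.3f}")
```

The two pairs of printed values should be close to each other even though the population itself is far from normal.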
6
Sampling Statistics
  • Sampling statistics are statistics computed from
    values obtained by repeated sampling from a
    population,
  • such as
  • Mean of the sample means
  • Standard Error of the sample mean
  • Sampling distribution of the mean

7
Sampling Key Issues
  • Key Sampling issues are
  • Sample Design (Planning)
  • Sampling Methods (Schemes)
  • Sampling Error
  • Sample Size Determination.

8
Sampling Design
  • Sample Design (Sample Planning) describes
  • Objective of Sampling
  • Target Population
  • Population Frame
  • Method of Sampling
  • Statistical tools for Data Analysis

9
Sampling Methods
Sampling Methods (Sampling Schemes)
  • Subjective Methods
  • Judgment Sampling
  • Convenience Sampling
  • Probabilistic Methods
  • Simple Random Sampling
  • Systematic Sampling
  • Stratified Sampling
  • Cluster Sampling

10
Sampling Methods (Cont.)
  • Simple Random Sampling Method
  • refers to a method of selecting items from a
    population such that every possible sample of a
    specified size has an equal chance of being
    selected
  • with or without replacement
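
A minimal sketch of the two variants, using Python's standard `random` module. The population of 100 item IDs and the sample size of 10 are hypothetical.

```python
import random

rng = random.Random(7)             # fixed seed so the draw is repeatable
population = list(range(1, 101))   # hypothetical frame of 100 item IDs

# Without replacement: an item can appear at most once in the sample.
without_replacement = rng.sample(population, k=10)

# With replacement: the same item may be drawn more than once.
with_replacement = rng.choices(population, k=10)

print(without_replacement)
print(with_replacement)
```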

11
Sampling Methods (Cont.)
  • Stratified Sampling Method
  • Population is divided into natural subsets
    (Strata)
  • Items are randomly selected from each stratum
  • in proportion to the size of the stratum.
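
Proportional allocation can be sketched as follows. The function name `stratified_sample`, the three strata, and their sizes are assumptions made for illustration.

```python
import random

def stratified_sample(strata, total_n, seed=0):
    """Sample each stratum in proportion to its share of the population.

    Rounding each allocation can make the overall sample size differ
    slightly from total_n when the shares are not whole numbers.
    """
    rng = random.Random(seed)
    population_size = sum(len(items) for items in strata.values())
    sample = {}
    for name, items in strata.items():
        k = round(total_n * len(items) / population_size)
        sample[name] = rng.sample(items, k)  # simple random sample within the stratum
    return sample

# Hypothetical strata of 50, 150, and 300 institutions.
strata = {
    "large": [f"L{i}" for i in range(50)],
    "medium": [f"M{i}" for i in range(150)],
    "small": [f"S{i}" for i in range(300)],
}
sample = stratified_sample(strata, total_n=50)
print({name: len(items) for name, items in sample.items()})
```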

12
Stratified Sampling Example
Population: Cash holdings of all financial
institutions in the country
Stratified sample: Cash holdings of financial
institutions (figure not reproduced)
13
Cluster Sampling
  • Cluster sampling refers to a method by which the
    population is divided into groups, or clusters,
    that are each intended to be mini-populations. A
    random sample of m clusters is selected.
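
A minimal sketch of selecting m clusters at random and keeping every member of each chosen cluster (one-stage cluster sampling, which is an assumption here; the slide does not say whether items are subsampled within clusters). The locations and manager IDs are hypothetical.

```python
import random

def cluster_sample(clusters, m, seed=0):
    """Pick m whole clusters at random and return all of their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)   # sample cluster names, not items
    return {name: clusters[name] for name in chosen}

# Hypothetical clusters: managers grouped by office location.
clusters = {
    "Location A": ["a1", "a2", "a3"],
    "Location B": ["b1", "b2"],
    "Location C": ["c1", "c2", "c3", "c4"],
    "Location D": ["d1", "d2"],
}
selected = cluster_sample(clusters, m=2)
print(selected)
```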

14
Cluster Sampling Example
Mid-Level Managers by Location for a Company
15
Sampling Error
  • SAMPLING ERROR - SINGLE MEAN
  • The difference between a value (a statistic)
    computed from a sample and the corresponding
    value (a parameter) computed from a population:
    Sampling Error = $\bar{x} - \mu$
  • Where $\bar{x}$ = sample mean and $\mu$ = population mean

16
Sampling Error (Cont.)
  • Sampling Error is inherent in any sampling
    process due to the fact that samples are only a
    subset of the total population.
  • Sampling Error depends on the size of the sample
    relative to the population
  • Sampling Errors can be minimized but not
    eliminated.

17
Sampling Error (Cont.)
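  • If the sample size is more than 5% of the
    population
  • The with-replacement assumption of the Central Limit
    Theorem, and hence the Standard Error calculation,
    is violated
  • Correction by the following factor (the finite
    population correction) is needed:
    $\sqrt{(N - n)/(N - 1)}$, where N = population size
    and n = sample size.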
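
As a rough sketch, the correction can be applied to the standard error like this. It assumes the usual finite population correction factor sqrt((N - n)/(N - 1)); the function name and the example numbers are illustrative.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the mean, with the finite population correction
    applied when the sample exceeds 5% of a population of known size N."""
    se = sigma / math.sqrt(n)
    if N is not None and n / N > 0.05:
        se *= math.sqrt((N - n) / (N - 1))   # assumed correction factor
    return se

print(standard_error(10, 100))            # population size not given: no correction
print(standard_error(10, 100, N=1000))    # sample is 10% of the population: corrected
print(standard_error(10, 100, N=10000))   # sample is 1% of the population: no correction
```

The corrected standard error is always smaller, because sampling a large share of a finite population leaves less of it unobserved.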

18
Sampling Size
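  • Sample Size Determination:
    $n = \dfrac{z_{\alpha/2}^2 \, \sigma^2}{E^2}$

where
  n = sample size
  z = z-score, a factor representing probability in terms of standard deviation
  $\alpha$ = 100% - confidence level
  $\sigma$ = population standard deviation
  E = interval on either side of the mean (margin of error)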
19
Estimation
  • Estimation (Inference) is assessing the value
    of a population parameter using sample data
  • Two types of estimation
  • Point Estimates
  • Interval Estimates

20
Estimation
FOR ESTIMATION, ALWAYS USE THE z OR t DISTRIBUTION
21
Estimation (Cont.)
  • Most common point estimates are the descriptive
    statistical measures.
  • If the expected value of an estimator equals
    the population parameter, then the estimator is
    called unbiased.

22
Estimation (Cont.)
That means that we can use unbiased sample estimates
in place of population parameters without
committing a systematic error; any single estimate
still carries sampling error.
23
Estimation (Cont.)
  • An Interval Estimate provides a range within which
    the population parameter falls with a certain
    likelihood.
  • Confidence Level is the probability (likelihood)
    that the interval contains the population
    parameter. The most commonly used confidence levels
    are 90%, 95%, and 99%.

24
Confidence Interval
  • Confidence Interval (CI) is an interval estimate
    specified from the perspective of the point
    estimate.
  • In other words, a CI is
  • an interval on either side (+/-) of the point
    estimate
  • based on a multiple (t or z-score) of the Std.
    Dev. of the point estimate

25
Confidence Intervals
Lower Confidence Limit
Upper Confidence Limit
Point Estimate
26
95% Confidence Intervals
Area between the critical values: 0.95
$-z_{.025} = -1.96$
$z_{.025} = 1.96$
27
CI for Proportions
  • For categorical variables having only two
    possible outcomes, proportions are important.
  • An unbiased estimate of the population proportion
    (p) is the sample statistic

$\hat{p} = x/n$, where x = number of observations in the
sample with the desired characteristic and n = sample size
28
Confidence Interval - From General to Specific
Format -
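
The formulas on this slide did not survive in the transcript. As a hedged reconstruction, the standard textbook forms are shown below, moving from the general format to the specific cases used elsewhere in this lesson. The symbols $s$ (sample standard deviation) and $\hat{p}$ (sample proportion) are assumptions about the notation.

```latex
\begin{align*}
\text{General format:}\quad
  & \text{point estimate} \pm (\text{critical value}) \times (\text{standard error}) \\
\text{Mean, } \sigma \text{ known:}\quad
  & \bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \\
\text{Mean, } \sigma \text{ unknown:}\quad
  & \bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \\
\text{Proportion:}\quad
  & \hat{p} \pm z_{\alpha/2}\,\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}
\end{align*}
```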
29
CI of the Mean (Cont.)
where E = Margin of Error
30
Confidence Interval - From Statistical Expression
to Excel Formula -
  • Where
  • $z_{\alpha/2}$ = NORMSINV(1 - $\alpha$/tails)
  • and when n < 30 AND $\sigma$ is not known, then z is
    replaced by t, so
  • $t_{\alpha/2,\,n-1}$ = TINV(2$\alpha$/tails, n-1)
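
For readers working outside Excel, the z critical value can be obtained in Python with `statistics.NormalDist().inv_cdf`, which plays the role of NORMSINV. The helper name `z_critical` is an assumption for this sketch. The standard library has no inverse t-distribution, so a t critical value would need a third-party package (for example SciPy's `scipy.stats.t.ppf`).

```python
from statistics import NormalDist

def z_critical(confidence, tails=2):
    """Critical z for a given confidence level, mirroring NORMSINV(1 - alpha/tails)."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / tails)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} two-tailed: z = {z_critical(level):.3f}")
```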

31
CI of the Mean (Cont.)
where
  z = z-score: a critical factor representing probability in terms of
      Standard Deviation (for sampling: Standard Error); valid for the
      normal distribution (critical value)
  t = t-score: a factor representing probability in terms of standard
      deviation (or Std. Error); valid for the t distribution (critical value)
  $\alpha$ = 100% - confidence level
32
Confidence Interval
  • PHStat2

33
Z-score
  • A z-score is a critical factor, indicating how
    many standard deviations (standard errors for
    sampling) away from the mean a value should be to
    observe a particular (cumulative) probability.
  • There is a relationship between z-score and
    probability: p(x) = (1 - NORMSDIST(z)) * tails
    and
  • There is a relationship between z-score and the
    value of the random variable: $z = (x - \mu)/\sigma$
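
Both relationships can be sketched in Python, assuming the usual definition z = (x - mean) / standard deviation and using `NormalDist().cdf` in place of NORMSDIST. Taking the absolute value of z is an added assumption so that the function also works for values below the mean.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Distance of x from the mean, in standard deviations."""
    return (x - mu) / sigma

def tail_probability(z, tails=1):
    """(1 - NORMSDIST(|z|)) * tails: the area beyond z in one or both tails."""
    return (1 - NormalDist().cdf(abs(z))) * tails

z = z_score(x=130, mu=100, sigma=15)   # illustrative numbers
print(f"z = {z:.2f}")
print(f"one-tailed p = {tail_probability(z):.4f}")
print(f"two-tailed p = {tail_probability(z, tails=2):.4f}")
```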

34
Z-score (Cont.)
  • Since the z-score is a measure of distance from
    the mean in terms of Standard Deviation (Standard
    Error for sampling), it provides us with
    information that a cumulative probability could
    not. For example, the larger the z-score, the more
    unusual the observation.

35
Student's t-Distribution
The t-distribution is a family of distributions
that is bell-shaped and symmetric like the
Standard Normal Distribution but with greater
area in the tails. Each distribution in the
t-family is defined by its degrees of freedom.
As the degrees of freedom increase, the
t-distribution approaches the normal distribution.
36
Degrees of freedom
Degrees of freedom (df) refers to the number of
independent data values available to estimate the
population's standard deviation. If k parameters
must be estimated before the population's
standard deviation can be calculated from a
sample of size n, the degrees of freedom are
equal to n - k.
37
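Example of a Confidence Interval Estimate for $\mu$
  • A sample of 100 cans, from a population with $\sigma$ =
    0.20, produced a sample mean equal to 12.09. A
    95% confidence interval would be
    $12.09 \pm 1.96 \, (0.20/\sqrt{100}) = 12.09 \pm 0.039$,
    i.e. 12.051 to 12.129 ounces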

38
Example of Impact of Sample Size on Confidence
Intervals
  • Suppose that, instead of a sample of 100 cans, a
    sample of 400 cans, from a population with $\sigma$ =
    0.20, produced a sample mean equal to 12.09. A
    95% confidence interval would be
    $12.09 \pm 1.96 \, (0.20/\sqrt{400}) = 12.09 \pm 0.0196$

n = 400: 12.0704 ounces to 12.1096 ounces
n = 100: 12.051 ounces to 12.129 ounces
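
The two intervals can be reproduced with a short sketch that uses the numbers from the example (sample mean 12.09, population standard deviation 0.20). The function name `mean_ci` is an assumption, and the population standard deviation is treated as known so that the z distribution applies.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(x_bar, sigma, n, confidence=0.95):
    """Two-sided z-based confidence interval for the mean (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

for n in (100, 400):
    low, high = mean_ci(12.09, 0.20, n)
    print(f"n={n}: {low:.4f} to {high:.4f} ounces")
```

Quadrupling the sample size halves the margin of error, because the standard error shrinks with the square root of n.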
39
Example of CI for Proportion
  • 62 out of a sample of 100 individuals who were
    surveyed by Quick-Lube returned within one month
    to have their oil changed. Find a 90%
    confidence interval for the true proportion of
    customers who actually returned.
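
The worked result is missing from the transcript. A sketch of the calculation, assuming the standard large-sample (normal approximation) interval for a proportion, is given below; the function name `proportion_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(x, n, confidence=0.90):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = x / n                                     # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)        # z times the standard error
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(x=62, n=100, confidence=0.90)
print(f"90% CI for the proportion: {low:.3f} to {high:.3f}")
```

Under these assumptions the interval runs from roughly 0.54 to 0.70.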

40
From Margin of Error to Sampling Size
41
Sampling Size
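  • Sample Size Determination:
    $n = \dfrac{z_{\alpha/2}^2 \, \sigma^2}{E^2}$

where
  n = sample size
  z = z-score, a factor representing probability in terms of standard deviation
  $\alpha$ = 100% - confidence level
  $\sigma$ = population standard deviation
  E = interval on either side of the mean (margin of error)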
42
Sampling Size
  • PHStat2

43
Pilot Samples
A pilot sample is a sample taken from the
population of interest to provide an estimate
of the population standard deviation. Normally
its size is smaller than the anticipated sample
size.
44
Example of Determining Required Sample Size
  • The manager of the Georgia Timber Mill wishes to
    construct a 90% confidence interval with a margin
    of error of 0.50 inches in estimating the mean
    diameter of logs. A pilot sample of 100 logs
    yielded a sample standard deviation of 4.8 inches.
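
The slide's solution is not in the transcript. A sketch of the calculation, assuming the standard formula n = (z * sigma / E)^2 with the pilot standard deviation standing in for sigma, is shown below; the function name `required_sample_size` is illustrative.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence):
    """Smallest n whose z-based margin of error does not exceed `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)   # always round up to a whole item

n_required = required_sample_size(sigma=4.8, margin=0.50, confidence=0.90)
print(n_required)
```

With these inputs the formula gives a value just above 249, so the required sample size rounds up to 250 logs.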

45
Next Lesson
  • (Lesson - 04/B)
  • Hypothesis Testing