Statistical Sampling presentation

Transcript and Presenter's Notes

Title: Statistical Sampling

1
Statistical Sampling: Analysis of Sample Data

(Lesson - 04/A)
Understanding the Whole from Pieces

2
Sampling

Sampling is
Collecting sample data from a population and
Estimating population parameters
Sampling is an important tool in business
decisions since it is an effective and efficient
way of obtaining information about the population.

3
Sampling (Cont.)

How good is the estimate obtained from the
sample?
The means of multiple samples of a fixed size (n)
from some population will form a distribution
called the sampling distribution of the mean
The standard deviation of the sampling
distribution of the mean is called the standard
error of the mean

4
Sampling (Cont.)

Standard Error of the Mean: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$

Estimates from larger sample sizes provide more
accurate results
If the sample size is large enough, the sampling
distribution of the mean is approximately normal,
regardless of the shape of the population
distribution (the Central Limit Theorem)

5
Sampling Distribution of the Mean
THE CENTRAL LIMIT THEOREM For samples of n
observations taken from a population with mean $\mu$
and standard deviation $\sigma$, regardless of the
population's distribution, provided the sample
size is sufficiently large, the distribution of
the sample mean, $\bar{x}$, will be approximately
normal with a mean equal to the population mean
($\mu_{\bar{x}} = \mu$). Further, the standard deviation will equal the
population standard deviation divided by the
square root of the sample size ($\sigma_{\bar{x}} = \sigma / \sqrt{n}$).
The larger the sample size, the better the
approximation to the normal distribution.
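
The theorem can be checked with a small simulation. The sketch below is illustrative only: the exponential population, the sample size n = 50, and the 2,000 repeated samples are assumptions chosen for the demonstration, not values from the slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical, clearly non-normal population: exponential with mean 1.
population = [random.expovariate(1.0) for _ in range(100_000)]
mu = statistics.fmean(population)        # population mean
sigma = statistics.pstdev(population)    # population standard deviation

n = 50               # size of each sample (assumed)
num_samples = 2000   # number of repeated samples (assumed)

# Draw repeated samples (with replacement) and record each sample mean.
sample_means = [
    statistics.fmean(random.choices(population, k=n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)   # should be close to mu
observed_se = statistics.stdev(sample_means)     # spread of the sample means
theoretical_se = sigma / n ** 0.5                # sigma / sqrt(n)

print(f"population mean      = {mu:.4f}")
print(f"mean of sample means = {mean_of_means:.4f}")
print(f"observed std. error  = {observed_se:.4f}")
print(f"sigma / sqrt(n)      = {theoretical_se:.4f}")
```

Even though the population is strongly skewed, the mean of the sample means should land close to the population mean, and their standard deviation close to sigma / sqrt(n).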
6
Sampling Statistics

Sampling statistics are statistics that are based
on values that are created by repeated sampling
from a population,
such as
Mean of the sampling means
Standard Error of the sampling mean
Sampling distribution of the means

7
Sampling Key Issues

Key Sampling issues are
Sample Design (Planning)
Sampling Methods (Schemes)
Sampling Error
Sample Size Determination.

8
Sampling Design

Sample Design (Sample Planning) describes
Objective of Sampling
Target Population
Population Frame
Method of Sampling
Statistical tools for Data Analysis

9
Sampling Methods
Sampling Methods (Sampling Schemes)

Subjective Methods
Judgment Sampling
Convenience Sampling

Probabilistic Methods
Simple Random Sampling
Systematic Sampling
Stratified Sampling
Cluster Sampling

10
Sampling Methods (Cont.)

Simple Random Sampling Method
refers to a method of selecting items from a
population such that every possible sample of a
specified size has an equal chance of being
selected
with or without replacement
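
As a minimal sketch of the two variants, Python's standard `random` module can draw a simple random sample with or without replacement. The population of 20 item IDs and the sample size of 5 are hypothetical.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

population = list(range(1, 21))  # hypothetical population of 20 item IDs
n = 5                            # sample size (assumed)

# Without replacement: an item can appear at most once in the sample.
without_replacement = random.sample(population, k=n)

# With replacement: the same item may be drawn more than once.
with_replacement = random.choices(population, k=n)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```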

11
Sampling Methods (Cont.)

Stratified Sampling Method
The population is divided into natural subsets
(strata)
Items are randomly selected from each stratum,
in proportion to the size of the stratum.
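
Proportional allocation can be sketched as follows. The three strata, their sizes, and the total sample size of 50 are hypothetical. With other sizes, rounding may make the allocated total differ slightly from the target.

```python
import random

random.seed(2)  # fixed seed so the sketch is reproducible

# Hypothetical strata: institution size class -> item IDs in that stratum.
strata = {
    "large": [f"L{i}" for i in range(50)],
    "medium": [f"M{i}" for i in range(150)],
    "small": [f"S{i}" for i in range(300)],
}
total_sample_size = 50  # assumed overall sample size
population_size = sum(len(items) for items in strata.values())

stratified_sample = {}
for name, items in strata.items():
    # Allocate sample slots in proportion to the stratum's share of the population.
    k = round(total_sample_size * len(items) / population_size)
    # Simple random sample (without replacement) inside the stratum.
    stratified_sample[name] = random.sample(items, k)

for name, chosen in stratified_sample.items():
    print(name, len(chosen))
```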

12
Stratified Sampling Example
Population: Cash Holdings of All Financial
Institutions in the Country
Stratified Sample of Cash Holdings of Financial
Institutions
13
Cluster Sampling

Cluster sampling refers to a method by which the
population is divided into groups, or clusters,
that are each intended to be mini-populations. A
random sample of m clusters is selected.
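
A minimal sketch of the idea, assuming that every item in a selected cluster is then included in the sample (the slide only states that m clusters are chosen). The locations and manager IDs are hypothetical.

```python
import random

random.seed(3)  # fixed seed so the sketch is reproducible

# Hypothetical clusters: office location -> mid-level managers at that location.
clusters = {
    "Boston": ["B1", "B2", "B3"],
    "Chicago": ["C1", "C2"],
    "Denver": ["D1", "D2", "D3", "D4"],
    "Seattle": ["S1", "S2", "S3"],
}
m = 2  # number of clusters to select (assumed)

# Step 1: draw a simple random sample of m clusters.
chosen_clusters = random.sample(sorted(clusters), k=m)

# Step 2 (assumption): take every item in each chosen cluster.
cluster_sample = [person for name in chosen_clusters for person in clusters[name]]

print("chosen clusters:", chosen_clusters)
print("sample:", cluster_sample)
```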

14
Cluster Sampling Example
Mid-Level Managers by Location for a Company
15
Sampling Error

SAMPLING ERROR - SINGLE MEAN
The difference between a value (a statistic)
computed from a sample and the corresponding
value (a parameter) computed from a population:
Sampling error = $\bar{x} - \mu$
where $\bar{x}$ is the sample mean and $\mu$ is the
population mean.

16
Sampling Error (Cont.)

Sampling Error is inherent in any sampling
process due to the fact that samples are only a
subset of the total population.
Sampling error depends on the relative size of
the sample
Sampling Errors can be minimized but not
eliminated.

17
Sampling Error (Cont.)

If the sample size is more than 5% of the
population,
the with-replacement assumption of the Central Limit
Theorem, and hence the Standard Error calculation,
is violated
A correction by the finite population correction
factor, $\sqrt{(N - n)/(N - 1)}$, is needed, where
N is the population size.
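
The correction can be sketched in a few lines, using the standard finite population correction factor sqrt((N - n) / (N - 1)). The function name and the example figures (sigma = 10, n = 100, N = 1,000) are hypothetical.

```python
import math

def standard_error(sigma, n, population_size=None):
    """Standard error of the mean, corrected when n exceeds 5% of the population."""
    se = sigma / math.sqrt(n)
    if population_size is not None and n / population_size > 0.05:
        # Finite population correction factor.
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se

# Hypothetical figures: the sample is 10% of the population, so the factor applies.
corrected = standard_error(sigma=10, n=100, population_size=1000)
uncorrected = standard_error(sigma=10, n=100)

print(f"uncorrected = {uncorrected:.4f}")
print(f"corrected   = {corrected:.4f}")
```

The corrected standard error is smaller, because a sample that covers a large share of a finite population leaves less room for sampling error.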

18
Sampling Size

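Sample Size Determination.

$n = \dfrac{z_{\alpha/2}^{2} \, \sigma^{2}}{E^{2}}$

where
n = sample size
z = z-score, a factor representing probability in
terms of standard deviation
$\sigma$ = population standard deviation
$\alpha$ = 100% - confidence level
E = interval on either side of the mean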
19
Estimation

Estimation (Inference) is assessing the value
of a population parameter using sample data
Two types of estimation
Point Estimates
Interval Estimates

20
Estimation
FOR ESTIMATION, ALWAYS USE THE z OR t DISTRIBUTION
21
Estimation (Cont.)

Most common point estimates are the descriptive
statistical measures.
If the expected value of an estimator equals
the population parameter, then the estimator is
called unbiased.

22
Estimation (Cont.)
That means that an unbiased sample estimate can
stand in for the population parameter without
systematic error; any single estimate still
carries sampling error.
23
Estimation (Cont.)

An Interval Estimate provides a range within which
the population parameter falls with a certain
likelihood.
Confidence Level is the probability (likelihood)
that the interval contains the population
parameter. The most commonly used confidence levels
are 90%, 95%, and 99%.
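
One way to read the confidence level is as a long-run coverage rate: if the sampling were repeated many times, about 95% of the 95% intervals would contain the parameter. The simulation below is a sketch under assumed values (a normal population with mean 50 and standard deviation 8, n = 40, and 5,000 trials), none of which come from the slides.

```python
import random
import statistics

random.seed(4)  # fixed seed so the sketch is reproducible

mu, sigma = 50.0, 8.0  # hypothetical population parameters
n = 40                 # sample size (assumed)
trials = 5000          # number of repeated samples (assumed)

z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% level
margin = z * sigma / n ** 0.5               # half-width of each interval

hits = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = statistics.fmean(sample)
    # Count the interval as a hit when it contains the true mean.
    if x_bar - margin <= mu <= x_bar + margin:
        hits += 1

coverage = hits / trials
print(f"share of intervals containing the mean: {coverage:.3f}")
```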

24
Confidence Interval

Confidence Interval (CI) is an interval estimate
specified from the perspective of the point
estimate.
In other words, a CI is
an interval on either side (+/-) of the point
estimate
based on a multiple (t- or z-score) of the standard
deviation (standard error) of the point estimate

25
Confidence Intervals
Lower Confidence Limit
Upper Confidence Limit
Point Estimate
26
95% Confidence Intervals
Central area = 0.95
$-z_{.025} = -1.96$
$z_{.025} = 1.96$
27
CI for Proportions

For categorical variables having only two
possible outcomes, proportions are important.
An unbiased estimator of the population proportion
(p) is the sample statistic

$\bar{p} = x/n$, where x = number of observations in
the sample with the desired characteristic and
n = sample size
28
Confidence Interval - From General to Specific
Format
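
The formulas for this slide did not survive in the transcript. As a sketch, assuming the usual textbook forms, the general format and its specific version for a mean with known population standard deviation can be written as:

```latex
% General format
\text{point estimate} \;\pm\; (\text{critical value}) \times (\text{standard error})

% Specific format for the mean, population standard deviation known
\bar{x} \;\pm\; z_{\alpha/2} \, \frac{\sigma}{\sqrt{n}}
```

Here $\bar{x}$ is the point estimate, $z_{\alpha/2}$ the critical value, and $\sigma / \sqrt{n}$ the standard error of the mean.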
29
CI of the Mean (Cont.)
where E = Margin of Error
30
Confidence Interval - From Statistical Expression
to Excel Formula

Where
$z_{\alpha/2}$ = NORMSINV(1 - $\alpha$/tails)
and when n < 30 AND $\sigma$ is not known, then
z -> t, so
$t_{\alpha/2,\,n-1}$ = TINV(2$\alpha$/tails, n-1)
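
For readers working outside Excel, the z critical value can be sketched with Python's standard library. The function name `z_critical` is hypothetical. The t counterpart needs a t-distribution quantile function, which the standard library does not provide (a third-party routine such as `scipy.stats.t.ppf` would be one option), so it is not shown.

```python
from statistics import NormalDist

def z_critical(alpha, tails=2):
    """Counterpart of Excel's NORMSINV(1 - alpha/tails)."""
    return NormalDist().inv_cdf(1 - alpha / tails)

# Two-tailed critical values for the most common confidence levels.
for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    print(f"{confidence:.0%} confidence: z = {z_critical(alpha):.3f}")
```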

31
CI of the Mean (Cont.)
where
z = z-score: a critical factor representing
probability in terms of Standard Deviation
(Standard Error for sampling); valid for the
normal distribution (critical value)
t = t-score: a factor representing probability in
terms of standard deviation (or Std. Error);
valid for the t distribution (critical value)
$\alpha$ = 100% - confidence level
32
Confidence Interval

PHStat2

33
Z-score

A z-score is a critical factor, indicating how
many standard deviations (standard errors for
sampling) away from the mean a value should be to
observe a particular (cumulative) probability.
There is a relationship between the z-score and
probability: p(x) = (1 - NORMSDIST(z)) * tails
and
There is a relationship between the z-score and the
value of the random variable: $z = (x - \mu) / \sigma$
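
Both relationships can be sketched with the standard library. The function names and the example values (x = 110, mean 100, standard deviation 5) are hypothetical. Taking the absolute value of z is an added convenience so that the function also works for values below the mean.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Distance of x from the mean, in standard deviations."""
    return (x - mu) / sigma

def tail_probability(z, tails=1):
    """Counterpart of (1 - NORMSDIST(z)) * tails, using |z|."""
    return (1 - NormalDist().cdf(abs(z))) * tails

z = z_score(x=110, mu=100, sigma=5)       # hypothetical values
one_tail = tail_probability(z, tails=1)
two_tail = tail_probability(z, tails=2)

print(f"z = {z:.2f}")
print(f"one-tailed probability = {one_tail:.4f}")
print(f"two-tailed probability = {two_tail:.4f}")
```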

34
Z-score (Cont.)

Since the z-score is a measure of distance from
the mean in terms of Standard Deviation (Standard
Error for sampling), it provides us with
information that a cumulative probability could
not. For example, the larger the z-score, the more
unusual the observation.

35
Student's t-Distribution
The t-distribution is a family of distributions
that is bell-shaped and symmetric like the
Standard Normal Distribution but with greater
area in the tails. Each distribution in the
t-family is defined by its degrees of freedom.
As the degrees of freedom increase, the
t-distribution approaches the normal distribution.
36
Degrees of freedom
Degrees of freedom (df) refers to the number of
independent data values available to estimate the
population's standard deviation. If k parameters
must be estimated before the population's
standard deviation can be calculated from a
sample of size n, the degrees of freedom are
equal to n - k.
37
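Example of a Confidence Interval Estimate for $\mu$

A sample of 100 cans, from a population with $\sigma$ =
0.20, produced a sample mean equal to 12.09. A
95% confidence interval would be
$12.09 \pm 1.96 \, (0.20 / \sqrt{100}) = 12.09 \pm 0.0392$,
that is, 12.051 to 12.129 ounces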

38
Example of Impact of Sample Size on Confidence
Intervals

If, instead of a sample of 100 cans, a
sample of 400 cans, from a population with $\sigma$ =
0.20, produced a sample mean equal to 12.09, a
95% confidence interval would be
$12.09 \pm 1.96 \, (0.20 / \sqrt{400}) = 12.09 \pm 0.0196$

n = 400: 12.0704 ounces to 12.1096 ounces
n = 100: 12.051 ounces to 12.129 ounces
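
The two intervals can be reproduced with a short sketch. The function name `mean_confidence_interval` is hypothetical; the inputs (sample mean 12.09, standard deviation 0.20, n = 100 and n = 400) are the ones given in the example.

```python
from statistics import NormalDist

def mean_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """z-based interval for a mean when the population sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    margin = z * sigma / n ** 0.5            # margin of error
    return x_bar - margin, x_bar + margin

intervals = {n: mean_confidence_interval(12.09, 0.20, n) for n in (100, 400)}

for n, (lower, upper) in intervals.items():
    print(f"n = {n}: {lower:.4f} to {upper:.4f} ounces (width {upper - lower:.4f})")
```

Quadrupling the sample size halves the width of the interval, because the standard error shrinks with the square root of n.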
39
Example of CI for Proportion

62 out of a sample of 100 individuals who were
surveyed by Quick-Lube returned within one month
to have their oil changed. To find a 90%
confidence interval for the true proportion of
customers who actually returned, use
$\bar{p} \pm z_{\alpha/2} \sqrt{\bar{p}(1 - \bar{p})/n}$ with $\bar{p}$ = 62/100 = 0.62.
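
A minimal sketch of the calculation, assuming the usual large-sample interval for a proportion, in which the standard error is sqrt(p(1 - p)/n) evaluated at the sample proportion. Variable names are illustrative.

```python
from statistics import NormalDist

x, n = 62, 100      # customers who returned, and sample size (from the example)
confidence = 0.90

p_bar = x / n                                        # sample proportion
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.645
std_error = (p_bar * (1 - p_bar) / n) ** 0.5         # standard error of p_bar
margin = z * std_error

lower, upper = p_bar - margin, p_bar + margin
print(f"p_bar = {p_bar:.2f}, margin of error = {margin:.4f}")
print(f"90% CI: {lower:.3f} to {upper:.3f}")
```

Under these assumptions the interval runs from roughly 0.54 to 0.70.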

40
From Margin of Error to Sampling Size
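
The link between the two can be sketched by solving the margin-of-error expression for n. This assumes the z-based margin of error for a mean with known (or pilot-estimated) standard deviation.

```latex
E = z_{\alpha/2} \, \frac{\sigma}{\sqrt{n}}
\quad\Longrightarrow\quad
\sqrt{n} = \frac{z_{\alpha/2} \, \sigma}{E}
\quad\Longrightarrow\quad
n = \frac{z_{\alpha/2}^{2} \, \sigma^{2}}{E^{2}}
```

Because n must be a whole number, the result is rounded up.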
41
Sampling Size

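Sample Size Determination.

$n = \dfrac{z_{\alpha/2}^{2} \, \sigma^{2}}{E^{2}}$

where
n = sample size
z = z-score, a factor representing probability in
terms of standard deviation
$\sigma$ = population standard deviation
$\alpha$ = 100% - confidence level
E = interval on either side of the mean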
42
Sampling Size

PHStat2

43
Pilot Samples
A pilot sample is a sample taken from the
population of interest to provide an estimate
of the population standard deviation. Normally
its size is smaller than the anticipated sample
size.
44
Example of Determining Required Sample Size

The manager of the Georgia Timber Mill wishes to
construct a 90% confidence interval with a margin
of error of 0.50 inches in estimating the mean
diameter of logs. A pilot sample of 100 logs
yields a sample standard deviation of 4.8 inches.
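
The required sample size can be sketched with the formula n = z^2 * sigma^2 / E^2, using the pilot standard deviation in place of the unknown sigma. The function name is hypothetical, and counting the pilot logs toward the total is an assumption about how the pilot data are reused.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence):
    """Smallest n whose z-based margin of error does not exceed `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)  # round up to a whole log

n = required_sample_size(sigma=4.8, margin=0.50, confidence=0.90)
pilot_size = 100
additional = max(n - pilot_size, 0)  # assumes the pilot logs count toward n

print(f"required sample size: {n}")
print(f"additional logs beyond the pilot sample: {additional}")
```

Under these assumptions the mill needs about 250 logs in total, or 150 more than the pilot sample.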
