Title: Sampling and Statistical Inference: A Short Review
1. Sampling and Statistical Inference: A Short Review
- MS 205
- Instructor: Dr. Vince Yen
2. Populations and Samples
- Population: all items of interest for a particular study or investigation
  - All married drivers in the U.S. over age 25
  - All individuals who do not own a cell phone
- Sample: a subset of a population
  - Nielsen samples of TV viewers
  - Samples of invoices for audits
- Samples are used
  - To reduce costs of data collection
  - When a full census cannot be taken
3. Concept of Population and Sample
Population: distribution of X is Normal(µ, σ²) (mean, variance)
Draw a sample of size n: X1, X2, . . . , Xn
Sample: (x̄, s²) (mean, variance)
4. Arithmetic Mean
- Population mean: µ = (x1 + x2 + . . . + xN) / N
- Sample mean: x̄ = (x1 + x2 + . . . + xn) / n
- Excel function: AVERAGE(range)
N = population size
n = sample size
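As a minimal sketch of the population mean and the sample mean, the Python snippet below computes both with the standard library. The data values and the names `population` and `sample` are made up for illustration; `statistics.mean` plays the role that AVERAGE(range) plays in Excel.

```python
import random
import statistics

# Hypothetical population of N = 10 values (illustrative data only).
population = [12, 15, 11, 18, 20, 14, 16, 13, 17, 14]
N = len(population)

# Population mean: sum of all N values divided by N.
mu = sum(population) / N

# Draw a simple random sample of size n without replacement.
random.seed(1)  # fixed seed so the draw is repeatable
n = 4
sample = random.sample(population, n)

# Sample mean: the same formula applied to the n sampled values.
x_bar = statistics.mean(sample)

print(f"population mean = {mu}, sample mean = {x_bar}")
```

The sample mean will usually differ from the population mean, and it changes from one sample to the next; that variation is what the sampling distribution of the mean describes.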
5. Variance
- Population variance: σ² = Σ (xi − µ)² / N
- Sample variance: s² = Σ (xi − x̄)² / (n − 1)
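6. Standard Deviation
- Population: σ = √(σ²), the square root of the population variance
- Sample: s = √(s²), the square root of the sample variance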
- The standard deviation has the same units of
measurement as the original data, unlike the
variance
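The difference between the population and sample versions comes down to the divisor: N for a population, n − 1 for a sample. The sketch below computes both by hand on one small, made-up data set (treated first as a whole population, then as a sample, purely to contrast the two divisors) and compares the results with the standard library's `statistics.pvariance` and `statistics.variance`.

```python
import math
import statistics

data = [4, 8, 6, 5, 3, 7, 9]  # hypothetical measurements (illustrative only)
n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean, shared by both formulas.
squared_dev = sum((x - mean) ** 2 for x in data)

# Treating the data as an entire population: divide by N.
pop_var = squared_dev / n
pop_sd = math.sqrt(pop_var)

# Treating the data as a sample: divide by n - 1.
samp_var = squared_dev / (n - 1)
samp_sd = math.sqrt(samp_var)

print("population:", pop_var, pop_sd)
print("sample:    ", samp_var, samp_sd)
print("library:   ", statistics.pvariance(data), statistics.variance(data))
```

If the data were measured in meters, the variances would be in square meters while both standard deviations would be back in meters.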
7. Theoretical Properties of the Sampling Distribution of the Mean
- Expected value of the sample mean is the population mean, µ
- Variance of the sample mean is σ²/n, where σ² is the variance of the population
- Standard deviation of the sample mean, called the standard error of the mean, is σ/√n
(Figure: sampling distribution of the sample mean x̄, centered at µ)
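To make the standard error concrete, here is a small sketch that evaluates the variance and standard deviation of the sample mean for a few sample sizes. The helper name `standard_error` and the value of `sigma` are assumptions chosen for illustration.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)

sigma = 10.0  # assumed (hypothetical) population standard deviation

for n in (4, 25, 100):
    var_of_mean = sigma ** 2 / n
    se = standard_error(sigma, n)
    print(f"n = {n:3d}: variance of mean = {var_of_mean:6.2f}, standard error = {se:5.2f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.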
8. Central Limit Theorem: Sampling Distribution of the Mean
- If the sample size is large enough (generally at least 30, but this depends on the actual distribution), the sampling distribution of the mean is approximately normal, regardless of the shape of the population distribution (provided its variance is finite).
- If the population is normal, then the sampling distribution of the mean is exactly normal for any n.
X ~ any distribution with mean µ and variance σ², n ≥ 30: x̄ is approximately N(µ, σ²/n)
X ~ N(µ, σ²), any n: x̄ ~ N(µ, σ²/n)
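The central limit theorem can be checked empirically. The simulation below is only a sketch under stated assumptions: the population is taken to be exponential with mean 1 (a strongly skewed, non-normal choice made for illustration), the sample size is 30, and the seed and number of repetitions are arbitrary.

```python
import math
import random
import statistics

random.seed(42)  # arbitrary seed so the run is repeatable

# Skewed, non-normal population: exponential with mean 1 and standard deviation 1.
mu, sigma = 1.0, 1.0
n = 30              # size of each sample
num_samples = 5000  # number of repeated samples

# Draw many samples and record the mean of each one.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

se = sigma / math.sqrt(n)                      # theoretical standard error
mean_of_means = statistics.mean(sample_means)  # should be close to mu
sd_of_means = statistics.stdev(sample_means)   # should be close to se

# If the sample means are roughly normal, about 95% of them
# should fall within 1.96 standard errors of mu.
within = sum(abs(m - mu) <= 1.96 * se for m in sample_means) / num_samples

print(f"mean of sample means = {mean_of_means:.4f} (mu = {mu})")
print(f"sd of sample means   = {sd_of_means:.4f} (sigma/sqrt(n) = {se:.4f})")
print(f"share within 1.96 SE = {within:.3f}")
```

Rerunning with a smaller n (say 5) should show the share within 1.96 standard errors drifting away from 0.95, which is the "depends on the actual distribution" caveat in practice.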
9. Standard Normal Distribution: Probability Table Lookup
- Transformation from N(µ, σ²) to N(0, 1): z = (x − µ) / σ
- Standard normal: mean 0, variance 1, denoted as N(0, 1)
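A table lookup amounts to standardizing a value with z = (x − µ)/σ and reading off the cumulative probability under N(0, 1). The sketch below does the same with Python's `statistics.NormalDist`; the parameter values are hypothetical.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0  # hypothetical population mean and standard deviation
x = 120.0                # value whose cumulative probability we want

# Standardize: how many standard deviations x lies above the mean.
z = (x - mu) / sigma

# P(Z <= z) under N(0, 1), which is what a standard normal table gives.
p_standard = NormalDist().cdf(z)

# The same probability computed directly on the original scale.
p_direct = NormalDist(mu, sigma).cdf(x)

print(f"z = {z:.4f}, P(Z <= z) = {p_standard:.4f}, P(X <= x) = {p_direct:.4f}")
```

The two probabilities agree, which is why a single N(0, 1) table serves every normal distribution.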
10. Statistical Analysis of Sample Data
- Estimation of population parameters
- Confidence intervals for population parameters
- Hypothesis testing to draw conclusions about
population parameters or differences between them
11. Theoretical Issues: What Are Good Estimators?
- Unbiased estimator: one for which the expected value equals the population parameter it is intended to estimate
- The sample variance (computed with divisor n − 1) is an unbiased estimator of the population variance
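Unbiasedness can be verified exactly on a tiny example by enumerating every possible sample. The sketch below assumes a hypothetical population (the six faces of a fair die) sampled with replacement with n = 2, and compares the average of the sample variance using divisor n − 1 against the average using divisor n.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]  # hypothetical population: faces of a fair die
N = len(population)

mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance (divisor N)

def variance_with_divisor(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    x_bar = Fraction(sum(sample), len(sample))
    return sum((x - x_bar) ** 2 for x in sample) / divisor

n = 2
# All equally likely samples of size n drawn with replacement.
samples = list(product(population, repeat=n))

# Expected value of each estimator = average over all possible samples.
expected_unbiased = sum(variance_with_divisor(s, n - 1) for s in samples) / len(samples)
expected_biased = sum(variance_with_divisor(s, n) for s in samples) / len(samples)

print("population variance:   ", sigma2)
print("E[s^2] with divisor n-1:", expected_unbiased)
print("E[s^2] with divisor n:  ", expected_biased)
```

Exact fractions are used so the comparison is not blurred by rounding: the n − 1 version averages to the population variance, while the divisor-n version comes out too small by the factor (n − 1)/n.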
12. Confidence Intervals
- Confidence interval (CI): an interval estimate that specifies the likelihood that the interval contains the true population parameter
- Level of confidence (1 − α): the probability, over repeated sampling, that the CI contains the true population parameter, usually expressed as a percentage (90%, 95%, and 99% are most common)
- α is called the level of significance
13. Confidence Interval for the Mean: σ Known
- A 100(1 − α)% CI is x̄ ± z_{α/2} (σ/√n)
z_{α/2} may be found from the Normal Probability Table or using the Excel function NORMSINV(1 − α/2)
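As a worked sketch of this interval, the snippet below computes a confidence interval for the mean when σ is known. The function name `mean_ci_sigma_known` and the input values are hypothetical; `NormalDist().inv_cdf` is used as the Python counterpart of the NORMSINV lookup.

```python
import math
from statistics import NormalDist

def mean_ci_sigma_known(x_bar, sigma, n, confidence=0.95):
    """Return (lower, upper) bounds of the CI for the mean, sigma known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    margin = z * sigma / math.sqrt(n)        # critical value times standard error
    return x_bar - margin, x_bar + margin

# Hypothetical inputs: sample mean 50 from n = 25 observations, sigma = 10.
lower, upper = mean_ci_sigma_known(x_bar=50.0, sigma=10.0, n=25, confidence=0.95)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

Raising the confidence level widens the interval, and increasing n narrows it through the standard error.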
14. Confidence Interval for µ (σ Known)
- Assumptions
  - Population standard deviation is known
  - Population is normally distributed
  - If population is not normal, use large sample
- Confidence Interval Estimate
  - x̄ ± z_{α/2} (σ/√n)
  - Critical value: z_{α/2}
  - Standard error: σ/√n