Title: Chapter 1 Statistics, Data, and Statistical Thinking
Slide 1. Chapter 1: Statistics, Data, and Statistical Thinking
- Statistics: the science of data.
- Statistical methods: descriptive statistics (collecting, presenting, and characterizing data) and inferential statistics (estimation, hypothesis testing)
- Population, sample
- Quantitative data, qualitative data
Slide 2. Chapter 1: Statistics, Data, and Statistical Thinking
- Data sources (published source, designed experiment, survey, observational study)
- Random sampling (representative; equally likely to be selected)
- Sources of error in survey data (selection bias, nonresponse bias, measurement error)
Slide 3. Chapter 2: Methods for Describing Sets of Data
- Graphical methods: qualitative data (bar graph, pie chart); quantitative data (dot plot, stem-and-leaf display, histogram)
- Summation: $\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n$
- Central tendency (mean, median, mode)
- Shape: left-skewed (mean < median < mode); symmetric (mean = median = mode); right-skewed (mode < median < mean)
Slide 4. Chapter 2: Methods for Describing Sets of Data
- Variability
- Range: the largest measurement minus the smallest measurement
- Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
- Sample standard deviation ($s$): the positive square root of the sample variance, $s = \sqrt{s^2}$
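The three measures of variability above can be sketched in a few lines of Python. The data values are hypothetical and chosen only for illustration; the function name `sample_variance` is likewise an assumption, not something from the slides.

```python
import math

def sample_variance(data):
    """Sample variance s^2, using the n - 1 divisor."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

data = [5, 7, 1, 2, 4]                 # hypothetical measurements
data_range = max(data) - min(data)     # largest minus smallest
s2 = sample_variance(data)             # sample variance
s = math.sqrt(s2)                      # sample standard deviation

print(f"range = {data_range}, s^2 = {s2:.2f}, s = {s:.3f}")
```

The standard library functions `statistics.variance` and `statistics.stdev` use the same n - 1 divisor and should give the same values.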
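Slide 5. Chapter 2: Methods for Describing Sets of Data
- Chebyshev's rule and the Empirical Rule
- pth percentile (25th, 50th, and 75th percentiles)
- z-score: $z = \frac{x - \bar{x}}{s}$ for a sample, or $z = \frac{x - \mu}{\sigma}$ for a population
- Outliers: detected with box plots and z-scores.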
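As a rough illustration of outlier detection with z-scores, the sketch below flags any observation whose sample z-score exceeds 3 in absolute value. The cutoff of 3 is a common rule of thumb and is an assumption here, as are the data values.

```python
import statistics

def z_scores(data):
    """Sample z-score, (x - mean) / s, for every observation."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)   # sample standard deviation (n - 1 divisor)
    return [(x - mean) / s for x in data]

# Hypothetical measurements with one suspiciously large value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 40]

# Assumed rule of thumb: |z| > 3 marks a possible outlier.
flagged = [x for x, z in zip(data, z_scores(data)) if abs(z) > 3]
print(flagged)
```

In small samples a single extreme value inflates both the mean and s, so a z-score rule can miss outliers that a box plot would show.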
Slide 6. Chapter 3: Probability
- Events, sample space, and probability
- Probability of an event: the sum of the probabilities of all sample points in the collection for the event
- Combination rule: $\binom{N}{n} = \frac{N!}{n!\,(N - n)!}$
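Slide 7. Chapter 3: Probability
- Intersection (multiplicative rule): $P(A \cap B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B)$
- If A and B are independent events: $P(A \cap B) = P(A)\,P(B)$
- Unions (additive rule): $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
- Complementary event: $P(A) + P(A^c) = 1$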
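Slide 8. Chapter 3: Probability
- Mutually exclusive events: $P(A \cap B) = 0$, so $P(A \cup B) = P(A) + P(B)$
- Conditional probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
- Events A and B are independent if $P(A \mid B) = P(A)$ and $P(B \mid A) = P(B)$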
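The rules for unions, complements, conditional probability, and independence can be checked on a small sample space. The sketch below assumes one roll of a fair die with equally likely sample points; the events A and B are hypothetical.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}    # assumed: one roll of a fair die

def prob(event):
    """Probability of an event: sum over its equally likely sample points."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}    # hypothetical event: the roll is even
B = {1, 2}       # hypothetical event: the roll is at most 2

p_union = prob(A) + prob(B) - prob(A & B)            # additive rule
p_complement = 1 - prob(A)                           # complementary event
p_a_given_b = prob(A & B) / prob(B)                  # conditional probability
independent = prob(A & B) == prob(A) * prob(B)       # independence check

print(p_union, p_complement, p_a_given_b, independent)
```

Using `Fraction` keeps the arithmetic exact, so the independence check is a true equality test and not a floating-point comparison.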
Slide 9. Chapter 3: Probability
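Slide 10. Chapter 4: Discrete Random Variables
- Two types of random variables (discrete, continuous)
- Two requirements for discrete random variables: $p(x) \ge 0$ for all values of $x$, and $\sum p(x) = 1$
- Mean: $\mu = E(x) = \sum x\,p(x)$
- Variance: $\sigma^2 = E[(x - \mu)^2] = \sum (x - \mu)^2\,p(x)$
- Standard deviation: $\sigma = \sqrt{\sigma^2}$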
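A short sketch of these calculations for a discrete random variable, assuming the usual definitions (mean as the probability-weighted sum of values, variance as the probability-weighted sum of squared deviations). The distribution is a hypothetical example: the number of heads in two tosses of a fair coin.

```python
import math

# Hypothetical distribution: x = number of heads in two fair coin tosses.
dist = {0: 0.25, 1: 0.50, 2: 0.25}

# The two requirements for a discrete probability distribution.
valid = all(p >= 0 for p in dist.values()) and math.isclose(sum(dist.values()), 1.0)

mu = sum(x * p for x, p in dist.items())                 # mean
var = sum((x - mu) ** 2 * p for x, p in dist.items())    # variance
sigma = math.sqrt(var)                                   # standard deviation

print(valid, mu, var, round(sigma, 4))
```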
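Slide 11. Chapter 4: Discrete Random Variables
- Binomial: $p(x) = \binom{n}{x} p^x q^{n-x}$ for $x = 0, 1, \ldots, n$, where $q = 1 - p$
- Mean: $\mu = np$
- Variance: $\sigma^2 = npq$
- Standard deviation: $\sigma = \sqrt{npq}$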
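The binomial quantities can be computed directly with the standard library. This is a minimal sketch; the values n = 10 and p = 0.3 are hypothetical, and `binomial_pmf` is an illustrative name.

```python
import math

def binomial_pmf(x, n, p):
    """P(x successes in n independent trials, success probability p)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.3                     # hypothetical number of trials and success probability
mean = n * p                       # np
variance = n * p * (1 - p)         # npq
std_dev = math.sqrt(variance)      # sqrt(npq)

total = sum(binomial_pmf(x, n, p) for x in range(n + 1))   # should be 1
print(round(binomial_pmf(3, n, p), 4), mean, variance, round(total, 6))
```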
Slide 12. Chapter 5: Continuous Random Variables
- Probability: areas under the curve
- and
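- Uniform pdf: $f(x) = \frac{1}{d - c}$ for $c \le x \le d$
- $P(a < x < b) = \frac{b - a}{d - c}$ for $c \le a < b \le d$
- Mean: $\mu = \frac{c + d}{2}$
- Standard deviation: $\sigma = \frac{d - c}{\sqrt{12}}$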
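Slide 13. Chapter 5: Continuous Random Variables
- Normal: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$
- Standard normal: $\mu = 0$ and $\sigma = 1$
- Use the normal table to get $P(c < z < d)$
- Normal: convert to standard normal using $z = \frac{x - \mu}{\sigma}$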
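Instead of a printed normal table, the same lookup can be sketched with Python's `statistics.NormalDist`. The function name `normal_prob` and the example values are assumptions for illustration.

```python
from statistics import NormalDist

def normal_prob(a, b, mu, sigma):
    """P(a < x < b) for a normal variable, by converting to z-scores."""
    z_a = (a - mu) / sigma         # standardize the lower limit
    z_b = (b - mu) / sigma         # standardize the upper limit
    std_normal = NormalDist()      # standard normal: mean 0, standard deviation 1
    return std_normal.cdf(z_b) - std_normal.cdf(z_a)

# Hypothetical example: x is normal with mean 100 and standard deviation 10.
print(round(normal_prob(90, 110, 100, 10), 4))
```

The call `std_normal.cdf(z)` plays the role of the table lookup: it returns the area under the standard normal curve to the left of z.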
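Slide 14. Chapter 5: Continuous Random Variables
- Approximation of a binomial distribution
- 1. Calculate the interval $\mu \pm 3\sigma = np \pm 3\sqrt{npq}$
- If the interval lies in the range 0 to n, the normal approximation can be used
- 2. Express the binomial probability in the form $P(x \le a)$ or $P(x \le b) - P(x \le a)$
- 3. For each value $a$, use the continuity-corrected $z = \frac{(a + 0.5) - \mu}{\sigma}$
- 4. Use the standard normal table to find the probability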
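The four steps above can be sketched as follows, assuming the usual continuity correction of 0.5 and the check that np plus or minus 3 standard deviations stays inside 0 to n. The function names and the example values (n = 100, p = 0.5, a = 45) are hypothetical.

```python
import math
from statistics import NormalDist

def normal_approx_cdf(a, n, p):
    """Approximate binomial P(x <= a) with the normal distribution."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    # Step 1: the interval mu +/- 3 sigma must lie within 0 to n.
    if mu - 3 * sigma < 0 or mu + 3 * sigma > n:
        raise ValueError("normal approximation is not appropriate")
    # Step 3: continuity-corrected z value.
    z = (a + 0.5 - mu) / sigma
    # Step 4: area to the left of z under the standard normal curve.
    return NormalDist().cdf(z)

def exact_cdf(a, n, p):
    """Exact binomial P(x <= a), for comparison."""
    return sum(math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(a + 1))

approx = normal_approx_cdf(45, 100, 0.5)
exact = exact_cdf(45, 100, 0.5)
print(round(approx, 4), round(exact, 4))
```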
Slide 15. Chapter 6: Sampling Distributions
- Concepts: parameter, sample statistic, sampling distribution
- Sampling distribution of $\bar{x}$
- Mean of the sampling distribution equals the mean of the sampled population: $\mu_{\bar{x}} = \mu$
- Standard deviation of the sampling distribution equals the standard deviation of the sampled population divided by the square root of the sample size: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
Slide 16. Chapter 6: Sampling Distributions
- Central Limit Theorem
- In a population with standard deviation $\sigma$ and mean $\mu$, the distribution of sample means from samples of $n$ observations will approach a normal distribution with standard deviation $\sigma/\sqrt{n}$ and mean $\mu$ as $n$ gets larger ($n \ge 30$).
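A small simulation can make the Central Limit Theorem concrete. The sketch below assumes a skewed population (exponential with mean 1 and standard deviation 1), samples of size n = 30, and 5000 repetitions; all of these choices are illustrative, and the seed is fixed only so the run is repeatable.

```python
import math
import random
import statistics

rng = random.Random(0)       # fixed seed for a repeatable run
mu, sigma = 1.0, 1.0         # exponential(1): mean 1, standard deviation 1
n = 30                       # sample size
repetitions = 5000           # number of samples drawn

sample_means = [
    statistics.mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(repetitions)
]

mean_of_means = statistics.mean(sample_means)    # should be close to mu
sd_of_means = statistics.stdev(sample_means)     # should be close to sigma / sqrt(n)
print(round(mean_of_means, 3), round(sd_of_means, 3), round(sigma / math.sqrt(n), 3))
```

A histogram of `sample_means` would look roughly bell-shaped even though the sampled population is strongly right-skewed.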
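Slide 17. Chapter 7: Inferences Based on a Single Sample: Estimation with Confidence Intervals
- Concepts: confidence interval, confidence coefficient $(1 - \alpha)$, confidence level $100(1 - \alpha)\%$
- Confidence interval for $\mu$
  - Known $\sigma$: $\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
  - Unknown $\sigma$, large sample ($n \ge 30$): $\bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}$
  - Unknown $\sigma$, small sample ($n < 30$): $\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$, with $t_{\alpha/2}$ based on $n - 1$ degrees of freedom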
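A minimal sketch of the large-sample interval for a population mean, assuming s is used in place of the unknown population standard deviation. The function name `z_interval` and the data are hypothetical. The small-sample case needs a t critical value, which the Python standard library does not provide, so it is not covered here.

```python
import math
import statistics
from statistics import NormalDist

def z_interval(data, confidence=0.95):
    """Large-sample confidence interval for the mean: xbar +/- z * s / sqrt(n)."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)                          # sample standard deviation
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)             # z_{alpha/2}
    half_width = z * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

data = list(range(1, 37))       # hypothetical sample of n = 36 measurements
lower, upper = z_interval(data)
print(round(lower, 2), round(upper, 2))
```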
Slide 18. Chapter 7: Inferences Based on a Single Sample: Estimation with Confidence Intervals
- Confidence interval for a proportion $p$
- Sample statistic: $\hat{p}$
- Mean of the sampling distribution of $\hat{p}$ is $p$
- Standard deviation of the sampling distribution is $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$, where $q = 1 - p$
- For large samples, the sampling distribution of $\hat{p}$ is approximately normal
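Slide 19. Chapter 7: Large-Sample Confidence Interval for a Population Proportion
- Sample size $n$ is large if $\hat{p} \pm 3\sigma_{\hat{p}}$ falls between 0 and 1
- Confidence interval is calculated as $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$
- where $\hat{p} = \frac{x}{n}$ and $\hat{q} = 1 - \hat{p}$
- $x$ = number of successes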
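The large-sample interval for a proportion can be sketched like this, assuming the standard error is estimated from the sample proportion and the sample-size check uses the same estimate. The function name and the counts (120 successes in 400 trials) are hypothetical.

```python
import math
from statistics import NormalDist

def proportion_interval(x, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = x / n                            # sample proportion of successes
    q_hat = 1 - p_hat
    se = math.sqrt(p_hat * q_hat / n)        # estimated standard deviation of p_hat
    # Large-sample check: p_hat +/- 3 standard deviations must fall between 0 and 1.
    if p_hat - 3 * se <= 0 or p_hat + 3 * se >= 1:
        raise ValueError("sample is too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return p_hat - z * se, p_hat + z * se

lower, upper = proportion_interval(120, 400)    # hypothetical: 120 successes in 400 trials
print(round(lower, 4), round(upper, 4))
```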
Slide 20. Chapter 7: Inferences Based on a Single Sample: Estimation with Confidence Intervals
- Determining the sample size
- Sampling error (SE): half the width of the confidence interval
- To estimate $\mu$ with sampling error SE and $100(1 - \alpha)\%$ confidence: $n = \frac{(z_{\alpha/2})^2\,\sigma^2}{(SE)^2}$
- where $\sigma$ is estimated by $s$ or $R/4$. Round the value of $n$ upward to ensure the sample size is large enough.
Slide 21. Chapter 7: Inferences Based on a Single Sample: Estimation with Confidence Intervals
- Sample size can also be determined for estimating a population proportion $p$: $n = \frac{(z_{\alpha/2})^2\,pq}{(SE)^2}$
- If $pq$ is unknown, it can be estimated by using the sample proportion $\hat{p}$ from a prior sample.
- If $pq$ is unknown and there is no information about $p$, use $p = .5$, since the value of $pq$ is at its maximum when $p = .5$.
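Both sample-size calculations can be sketched as below, assuming the usual formulas (z squared times the variance, divided by SE squared) and rounding up. Function names and example inputs are hypothetical.

```python
import math
from statistics import NormalDist

def n_for_mean(sigma, se, confidence=0.95):
    """Sample size to estimate a mean to within +/- se (sigma is a planning estimate)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / se) ** 2)        # always round up

def n_for_proportion(se, p=0.5, confidence=0.95):
    """Sample size to estimate a proportion to within +/- se.

    p = 0.5 is the conservative default because p * (1 - p) is largest there.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z**2 * p * (1 - p) / se**2)   # always round up

print(n_for_mean(sigma=15, se=3))      # hypothetical: sigma guessed as 15
print(n_for_proportion(se=0.03))       # hypothetical: no prior information about p
```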
Slide 22. Chapter 8: Inferences Based on a Single Sample: Tests of Hypothesis
- Seven elements of a test of hypothesis
  - The null hypothesis $H_0$ ($=$, $\le$, or $\ge$)
  - The alternative, or research, hypothesis $H_a$ ($\ne$, $>$, or $<$)
  - The test statistic
  - The rejection region
  - The assumptions
  - The experiment and test statistic calculation
  - The conclusion
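Slide 23. Chapter 8: Large-Sample Test of Hypothesis about $\mu$
- One-tailed test
  - $H_0$: $\mu = \mu_0$
  - $H_a$: $\mu < \mu_0$ (or $H_a$: $\mu > \mu_0$)
  - Test statistic: $z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
  - Rejection region: $z < -z_{\alpha}$ (or $z > z_{\alpha}$ when $H_a$: $\mu > \mu_0$)
- Two-tailed test
  - $H_0$: $\mu = \mu_0$
  - $H_a$: $\mu \ne \mu_0$
  - Test statistic: $z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
  - Rejection region: $|z| > z_{\alpha/2}$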
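The one-tailed and two-tailed large-sample tests can be sketched in one function. This assumes s stands in for the population standard deviation, and the `tail` argument with its values "lower", "upper", and "two" is a hypothetical interface chosen for illustration.

```python
import math
from statistics import NormalDist

def large_sample_z_test(xbar, s, n, mu0, alpha=0.05, tail="two"):
    """Return (z, reject_H0) for a large-sample test about a population mean."""
    z = (xbar - mu0) / (s / math.sqrt(n))          # test statistic
    std_normal = NormalDist()
    if tail == "lower":                            # Ha: mu < mu0
        reject = z < -std_normal.inv_cdf(1 - alpha)
    elif tail == "upper":                          # Ha: mu > mu0
        reject = z > std_normal.inv_cdf(1 - alpha)
    else:                                          # Ha: mu != mu0
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    return z, reject

# Hypothetical summary statistics: xbar = 52, s = 6, n = 36, testing mu0 = 50.
z, reject = large_sample_z_test(52, 6, 36, 50)
print(z, reject)
```

With these numbers z is 2.0, which falls in the two-tailed rejection region at the .05 level but not at the .01 level.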
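Slide 24. Chapter 8: Small-Sample Test of Hypothesis about $\mu$
- One-tailed test
  - $H_0$: $\mu = \mu_0$
  - $H_a$: $\mu < \mu_0$ (or $H_a$: $\mu > \mu_0$)
  - Test statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
  - Rejection region: $t < -t_{\alpha}$ (or $t > t_{\alpha}$ when $H_a$: $\mu > \mu_0$)
- Two-tailed test
  - $H_0$: $\mu = \mu_0$
  - $H_a$: $\mu \ne \mu_0$
  - Test statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
  - Rejection region: $|t| > t_{\alpha/2}$
- where $t_{\alpha}$ and $t_{\alpha/2}$ are based on $(n - 1)$ degrees of freedom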
Slide 25. Chapter 8: Large-Sample Test of Hypothesis about a Population Proportion
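Slide 26. Chapter 9: Inferences Based on Two Samples: Confidence Intervals and Tests of Hypothesis
- Comparing two population means: independent sampling
- Large-sample confidence interval for $\mu_1 - \mu_2$: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$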
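Slide 27. Chapter 9: Comparing Two Population Means: Independent Sampling
- Large-sample test of hypothesis for $\mu_1 - \mu_2$: with $H_0$: $(\mu_1 - \mu_2) = D_0$, the test statistic is $z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$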
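Slide 28. Chapter 9: Comparing Two Population Means: Independent Sampling
- Small-sample ($n_1 < 30$, $n_2 < 30$) confidence interval for $\mu_1 - \mu_2$ (assume $\sigma_1^2 = \sigma_2^2$): $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$
- where $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ is the pooled estimate of $\sigma^2$
- and $t_{\alpha/2}$ is based on $(n_1 + n_2 - 2)$ degrees of freedom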
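Slide 29. Chapter 9: Comparing Two Population Means: Independent Sampling
- Small-sample test of hypothesis for $\mu_1 - \mu_2$: with $H_0$: $(\mu_1 - \mu_2) = D_0$, the test statistic is $t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, based on $(n_1 + n_2 - 2)$ degrees of freedom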
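Slide 30. Chapter 9: Comparing Two Population Means: Independent Sampling
- Small samples: what to do when $\sigma_1^2 \ne \sigma_2^2$
- When sample sizes are equal ($n_1 = n_2 = n$)
- Confidence interval: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2 + s_2^2}{n}}$
- Test statistic for $H_0$: $(\mu_1 - \mu_2) = 0$: $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{(s_1^2 + s_2^2)/n}}$
- where $t$ is based on $\nu = n_1 + n_2 - 2 = 2(n - 1)$ degrees of freedom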
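Slide 31. Chapter 9: Comparing Two Population Means: Independent Sampling
- Small samples: what to do when $\sigma_1^2 \ne \sigma_2^2$
- When sample sizes are not equal ($n_1 \ne n_2$)
- Confidence interval: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
- Test statistic for $H_0$: $(\mu_1 - \mu_2) = 0$: $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
- where $t$ is based on $\nu = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$ degrees of freedom
- Round $\nu$ down to the nearest integer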
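For the unequal-variance, unequal-sample-size case, the test statistic and the approximate degrees of freedom can be sketched as below. This assumes the usual approximation for the degrees of freedom, rounded down, and a null hypothesis of no difference between the means. The function name and the data are hypothetical.

```python
import math
import statistics

def unequal_variance_t(sample1, sample2):
    """t statistic and approximate degrees of freedom for H0: mu1 - mu2 = 0."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1) / n1     # s1^2 / n1
    v2 = statistics.variance(sample2) / n2     # s2^2 / n2
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, math.floor(df)                   # round df down to an integer

sample1 = [10.0, 12.0, 14.0, 16.0, 18.0]       # hypothetical measurements
sample2 = [9.0, 10.0, 11.0]                    # hypothetical measurements
t, df = unequal_variance_t(sample1, sample2)
print(round(t, 4), df)
```

The resulting t value is compared with a table value for the rounded-down degrees of freedom.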
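Slide 32. Chapter 9: Comparing Two Population Means: Paired Difference Experiments
- Paired difference confidence interval for $\mu_d = \mu_1 - \mu_2$
- Large sample ($n_d \ge 30$): $\bar{x}_d \pm z_{\alpha/2}\,\frac{\sigma_d}{\sqrt{n_d}} \approx \bar{x}_d \pm z_{\alpha/2}\,\frac{s_d}{\sqrt{n_d}}$
- Small sample ($n_d < 30$): $\bar{x}_d \pm t_{\alpha/2}\,\frac{s_d}{\sqrt{n_d}}$
- where $t_{\alpha/2}$ is based on $(n_d - 1)$ degrees of freedom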
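Slide 33. Chapter 9: Comparing Two Population Means: Paired Difference Experiments
- Paired difference test of hypothesis for $\mu_d = \mu_1 - \mu_2$, large sample: with $H_0$: $\mu_d = D_0$, the test statistic is $z = \frac{\bar{x}_d - D_0}{\sigma_d/\sqrt{n_d}} \approx \frac{\bar{x}_d - D_0}{s_d/\sqrt{n_d}}$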
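Slide 34. Chapter 9: Comparing Two Population Means: Paired Difference Experiments
- Paired difference test of hypothesis for $\mu_d = \mu_1 - \mu_2$, small sample: with $H_0$: $\mu_d = D_0$, the test statistic is $t = \frac{\bar{x}_d - D_0}{s_d/\sqrt{n_d}}$, based on $(n_d - 1)$ degrees of freedom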
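Slide 35. Chapter 9: Comparing Two Population Proportions: Independent Sampling
- Large-sample $100(1 - \alpha)\%$ confidence interval for $(p_1 - p_2)$: $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$
- Large-sample condition: both intervals, $\hat{p}_1 \pm 3\sigma_{\hat{p}_1}$ and $\hat{p}_2 \pm 3\sigma_{\hat{p}_2}$, fall between 0 and 1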
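Slide 36. Chapter 9: Comparing Two Population Proportions: Independent Sampling
- Large-sample test of hypothesis about $(p_1 - p_2)$: with $H_0$: $(p_1 - p_2) = 0$, the test statistic is $z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$ and $\hat{q} = 1 - \hat{p}$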
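Slide 37. Chapter 9: Determining the Sample Size
- For estimating $\mu_1 - \mu_2$ (assuming $n_1 = n_2 = n$): $n = \frac{(z_{\alpha/2})^2(\sigma_1^2 + \sigma_2^2)}{(SE)^2}$ (rounding up)
- The estimates of $\sigma_1^2$ and $\sigma_2^2$ might be the sample variances $s_1^2$ and $s_2^2$ from prior sampling, or based on the range, $s \approx R/4$.
- For estimating $p_1 - p_2$ (assuming $n_1 = n_2 = n$): $n = \frac{(z_{\alpha/2})^2(p_1 q_1 + p_2 q_2)}{(SE)^2}$ (rounding up)
- The estimates of $p_1$ and $p_2$ might be based on prior samples, or on an educated guess.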
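Slide 38. Chapter 11: Simple Linear Regression
- Model: $y = \beta_0 + \beta_1 x + \varepsilon$
- Least squares line: $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$
- Slope: $\hat{\beta}_1 = \frac{SS_{xy}}{SS_{xx}}$; y-intercept: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x}$
- where $SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$ and $SS_{xx} = \sum (x_i - \bar{x})^2$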
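The least squares slope and y-intercept can be computed from the sums of squares of deviations. The sketch below assumes the standard estimates (slope as SSxy over SSxx, intercept as ybar minus slope times xbar); the function name and the five data points are hypothetical.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line through the data."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    ss_xy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    ss_xx = sum((x - xbar) ** 2 for x in xs)
    slope = ss_xy / ss_xx
    intercept = ybar - slope * xbar
    return intercept, slope

xs = [1, 2, 3, 4, 5]       # hypothetical x values
ys = [1, 1, 2, 2, 4]       # hypothetical y values
intercept, slope = least_squares(xs, ys)
print(round(intercept, 3), round(slope, 3))
```

For these points SSxy is 7 and SSxx is 10, so the fitted line has slope 0.7 and intercept -0.1.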
Slide 39. Chapter 11: Model Assumptions
- Mean of the probability distribution of $\varepsilon$ is 0
- Variance of the probability distribution of $\varepsilon$ is constant for all values of $x$
- Probability distribution of $\varepsilon$ is normal
- Values of $\varepsilon$ are independent of each other
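Slide 40. Chapter 11: Simple Linear Regression
- Estimator of $\sigma^2$: $s^2 = \frac{SSE}{n - 2}$, where $SSE = \sum (y_i - \hat{y}_i)^2$
- Sampling distribution of $\hat{\beta}_1$: normal with mean $\beta_1$ and standard deviation $\sigma_{\hat{\beta}_1} = \frac{\sigma}{\sqrt{SS_{xx}}}$
- We estimate $\sigma_{\hat{\beta}_1}$ by $s_{\hat{\beta}_1} = \frac{s}{\sqrt{SS_{xx}}}$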
Slide 41. Chapter 11: Simple Linear Regression
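Slide 42. Chapter 11: Simple Linear Regression
- A $100(1 - \alpha)\%$ confidence interval for $\beta_1$: $\hat{\beta}_1 \pm t_{\alpha/2}\,s_{\hat{\beta}_1}$
- where $s_{\hat{\beta}_1} = \frac{s}{\sqrt{SS_{xx}}}$ and $t_{\alpha/2}$ is based on $(n - 2)$ degrees of freedom
- Coefficient of correlation: $r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}}$
- Coefficient of determination: $r^2 = \frac{SS_{yy} - SSE}{SS_{yy}} = 1 - \frac{SSE}{SS_{yy}}$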
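Slide 43. Chapter 11: Using the Model for Estimation and Prediction
- $100(1 - \alpha)\%$ confidence interval for the mean value of $y$ at $x = x_p$: $\hat{y} \pm t_{\alpha/2}\,s\sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}$
- $100(1 - \alpha)\%$ prediction interval for an individual new value of $y$ at $x = x_p$: $\hat{y} \pm t_{\alpha/2}\,s\sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}$
- where $t_{\alpha/2}$ is based on $(n - 2)$ degrees of freedom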
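Slide 44. Chapter 13: Testing Category Probabilities: One-Way Table
- $H_0$: $p_1 = p_{1,0},\ p_2 = p_{2,0},\ \ldots,\ p_k = p_{k,0}$
- $H_a$: at least one of the multinomial probabilities does not equal its hypothesized value
- Test statistic: $\chi^2 = \sum \frac{[n_i - E(n_i)]^2}{E(n_i)}$
- where $E(n_i) = n\,p_{i,0}$ is the expected cell count
- Rejection region: $\chi^2 > \chi^2_{\alpha}$
- where $\chi^2_{\alpha}$ has $(k - 1)$ df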
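The one-way chi-square statistic can be sketched as follows, assuming each expected cell count is the total sample size times the hypothesized probability for that cell. The counts and hypothesized probabilities are hypothetical.

```python
def chi_square_one_way(observed, hypothesized_probs):
    """Return (chi-square statistic, degrees of freedom) for a one-way table."""
    n = sum(observed)
    expected = [n * p for p in hypothesized_probs]      # expected cell counts
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1                      # k - 1 degrees of freedom

observed = [30, 50, 20]                 # hypothetical cell counts
probs = [0.25, 0.50, 0.25]              # hypothetical null-hypothesis probabilities
stat, df = chi_square_one_way(observed, probs)

# Critical value for alpha = .05 with 2 df, taken from a chi-square table.
critical = 5.991
print(stat, df, stat > critical)
```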
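Slide 45. Chapter 13: Testing Category Probabilities: Two-Way (Contingency) Table
- Used when classifying with two qualitative variables
- $H_0$: The two classifications are independent
- $H_a$: The two classifications are dependent
- Test statistic: $\chi^2 = \sum \frac{[n_{ij} - \hat{E}(n_{ij})]^2}{\hat{E}(n_{ij})}$, where $\hat{E}(n_{ij}) = \frac{R_i C_j}{n}$ ($R_i$ = total for row $i$, $C_j$ = total for column $j$)
- Rejection region: $\chi^2 > \chi^2_{\alpha}$, where $\chi^2_{\alpha}$ has $(r - 1)(c - 1)$ df ($r$ rows, $c$ columns)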
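The contingency-table test can be sketched as below, assuming each estimated expected count is the row total times the column total divided by the overall sample size. The 2 x 2 table of counts is hypothetical.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n    # estimated expected count
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)      # (r - 1)(c - 1)
    return stat, df

table = [[20, 30],      # hypothetical counts: rows are one classification,
         [30, 20]]      # columns are the other
stat, df = chi_square_independence(table)

# Critical value for alpha = .05 with 1 df, taken from a chi-square table.
critical = 3.841
print(stat, df, stat > critical)
```

Here every expected count is 25, so the statistic is 4.0 with 1 df, which exceeds the tabled value and leads to rejecting independence at the .05 level.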