Title: Comparison of 2 Population Means
1. Comparison of 2 Population Means
- Goal: To compare 2 populations/treatments with respect to a numeric outcome
- Sampling Design: Independent Samples (Parallel Groups) vs Paired Samples (Crossover Design)
- Data Structure: Normal vs Non-normal
- Sample Sizes: Large ($n_1, n_2 > 20$) vs Small
2. Independent Samples
- Units in the two samples are different
- Sample sizes may or may not be equal
- Large-sample inference is based on the Normal distribution (Central Limit Theorem)
- Small-sample inference depends on the distribution of individual outcomes (Normal vs non-Normal)
3. Parameters/Estimates (Independent Samples)
- Parameter: $\mu_1 - \mu_2$
- Estimator: $\bar{y}_1 - \bar{y}_2$
- Estimated standard error: $\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
- Shape of sampling distribution:
  - Normal if data are normal
  - Approximately normal if $n_1, n_2 > 20$
  - Non-normal otherwise (typically)
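The estimate and its standard error can be sketched in a few lines of Python. This is a minimal illustration only: the function name `diff_and_se` and the data values are made up, and it assumes the unpooled standard error $\sqrt{s_1^2/n_1 + s_2^2/n_2}$ for independent samples.

```python
import math
import statistics


def diff_and_se(sample1, sample2):
    """Return (ybar1 - ybar2, estimated standard error) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    # statistics.variance uses the n - 1 denominator (sample variance).
    se = math.sqrt(statistics.variance(sample1) / n1
                   + statistics.variance(sample2) / n2)
    return diff, se


# Hypothetical data, for illustration only.
group1 = [12.0, 15.0, 11.0, 14.0, 13.0]
group2 = [10.0, 9.0, 12.0, 11.0, 8.0]

estimate, std_error = diff_and_se(group1, group2)
print(estimate, std_error)
```

With samples this small the large-sample normal approximation would not apply; the data are kept short only so the arithmetic is easy to follow by hand.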
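4. Large-Sample Test of $\mu_1 - \mu_2$
- Null hypothesis: The population means differ by $\Delta_0$ (which is typically 0), i.e. $H_0: \mu_1 - \mu_2 = \Delta_0$
- Alternative Hypotheses:
  - 1-Sided: $H_A: \mu_1 - \mu_2 > \Delta_0$
  - 2-Sided: $H_A: \mu_1 - \mu_2 \ne \Delta_0$
- Test Statistic: $z_{obs} = \dfrac{(\bar{y}_1 - \bar{y}_2) - \Delta_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$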
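5. Large-Sample Test of $\mu_1 - \mu_2$
- Decision Rule:
  - 1-sided alternative
    - If $z_{obs} \ge z_{\alpha}$, conclude $\mu_1 - \mu_2 > \Delta_0$
    - If $z_{obs} < z_{\alpha}$, do not reject $\mu_1 - \mu_2 = \Delta_0$
  - 2-sided alternative
    - If $z_{obs} \ge z_{\alpha/2}$, conclude $\mu_1 - \mu_2 > \Delta_0$
    - If $z_{obs} \le -z_{\alpha/2}$, conclude $\mu_1 - \mu_2 < \Delta_0$
    - If $-z_{\alpha/2} < z_{obs} < z_{\alpha/2}$, do not reject $\mu_1 - \mu_2 = \Delta_0$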
6. Large-Sample Test of $\mu_1 - \mu_2$
- Observed Significance Level (P-Value):
  - 1-sided alternative: $P = P(z \ge z_{obs})$ (from the standard Normal distribution)
  - 2-sided alternative: $P = 2P(z \ge |z_{obs}|)$ (from the standard Normal distribution)
- If P-Value $\le \alpha$, then reject the null hypothesis
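The large-sample test can be sketched from summary statistics as follows. This is an illustrative sketch, not a reference implementation: the function name `large_sample_z_test` and the summary values are hypothetical, and the test statistic is assumed to be $z_{obs} = (\bar{y}_1 - \bar{y}_2 - \Delta_0) / \sqrt{s_1^2/n_1 + s_2^2/n_2}$.

```python
import math
from statistics import NormalDist


def large_sample_z_test(ybar1, ybar2, s1, s2, n1, n2, delta0=0.0, alpha=0.05):
    """Large-sample z-test of H0: mu1 - mu2 = delta0 from summary statistics."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    z_obs = (ybar1 - ybar2 - delta0) / se
    std_normal = NormalDist()  # mean 0, standard deviation 1
    p_one_sided = 1.0 - std_normal.cdf(z_obs)                # P(z >= z_obs)
    p_two_sided = 2.0 * (1.0 - std_normal.cdf(abs(z_obs)))   # 2 P(z >= |z_obs|)
    return {
        "z_obs": z_obs,
        "p_one_sided": p_one_sided,
        "p_two_sided": p_two_sided,
        "reject_one_sided": p_one_sided <= alpha,
        "reject_two_sided": p_two_sided <= alpha,
    }


# Hypothetical summary statistics, for illustration only.
result = large_sample_z_test(ybar1=5.2, ybar2=4.6, s1=1.5, s2=1.2, n1=50, n2=50)
print(result)
```

Rejecting when the P-value is at most $\alpha$ is equivalent to comparing $z_{obs}$ with the critical value $z_{\alpha}$ or $z_{\alpha/2}$, so either form of the decision rule gives the same conclusion.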
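7. Large-Sample $(1-\alpha)100\%$ Confidence Interval for $\mu_1 - \mu_2$
- Confidence Coefficient $(1-\alpha)$ refers to the proportion of times this rule would provide an interval that contains the true parameter value $\mu_1 - \mu_2$ if it were applied over all possible samples
- Rule: $(\bar{y}_1 - \bar{y}_2) \pm z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$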
8. Large-Sample $(1-\alpha)100\%$ Confidence Interval for $\mu_1 - \mu_2$
- For 95% Confidence Intervals, $z_{.025} = 1.96$
- Confidence Intervals and 2-sided tests give identical conclusions at the same $\alpha$-level:
  - If the entire interval is above $\Delta_0$, conclude $\mu_1 - \mu_2 > \Delta_0$
  - If the entire interval is below $\Delta_0$, conclude $\mu_1 - \mu_2 < \Delta_0$
  - If the interval contains $\Delta_0$, do not reject $\mu_1 - \mu_2 = \Delta_0$
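The interval rule and its link to the 2-sided test can be sketched as below. The function names `large_sample_ci` and `conclusion` and the summary values are hypothetical, and the interval is assumed to have the form $(\bar{y}_1 - \bar{y}_2) \pm z_{\alpha/2}\sqrt{s_1^2/n_1 + s_2^2/n_2}$.

```python
import math
from statistics import NormalDist


def large_sample_ci(ybar1, ybar2, s1, s2, n1, n2, alpha=0.05):
    """Return the (1 - alpha)100% large-sample interval for mu1 - mu2."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 when alpha = 0.05
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = ybar1 - ybar2
    return diff - z_crit * se, diff + z_crit * se


def conclusion(lower, upper, delta0=0.0):
    """Translate the interval into the matching 2-sided test conclusion."""
    if lower > delta0:
        return "mu1 - mu2 > delta0"
    if upper < delta0:
        return "mu1 - mu2 < delta0"
    return "do not reject mu1 - mu2 = delta0"


# Hypothetical summary statistics, for illustration only.
lower, upper = large_sample_ci(ybar1=5.2, ybar2=4.6, s1=1.5, s2=1.2, n1=50, n2=50)
print(lower, upper, conclusion(lower, upper))
```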
9. Example: Vitamin C for Common Cold
- Outcome: Number of Colds During Study Period for Each Student
- Group 1: Given Placebo
- Group 2: Given Ascorbic Acid (Vitamin C)
Source: Pauling (1971)
10. 2-Sided Test to Compare Groups
- $H_0: \mu_1 - \mu_2 = 0$ (No difference in treatment effects)
- $H_A: \mu_1 - \mu_2 \ne 0$ (Difference in treatment effects)
- Test Statistic: $z_{obs} = 25.3$
- Decision Rule ($\alpha = 0.05$):
  - Conclude $\mu_1 - \mu_2 > 0$ since $z_{obs} = 25.3 > z_{.025} = 1.96$
11. 95% Confidence Interval for $\mu_1 - \mu_2$
- Point Estimate: $0.30$
- Estimated Std. Error: $0.0119$
- Critical Value: $z_{.025} = 1.96$
- 95% CI: $0.30 \pm 1.96(0.0119) = 0.30 \pm 0.023$
- $= (0.277, 0.323)$; the entire interval is $> 0$
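The interval arithmetic on this slide can be checked directly. The sketch below uses only the three numbers quoted above (point estimate 0.30, standard error 0.0119, critical value 1.96); the variable names are illustrative.

```python
point_estimate = 0.30   # estimated difference in means, from the slide
std_error = 0.0119      # estimated standard error, from the slide
z_crit = 1.96           # z_.025

margin = z_crit * std_error
lower = point_estimate - margin
upper = point_estimate + margin

# Rounded to three decimals, matching the precision used on the slide.
print(round(margin, 3), round(lower, 3), round(upper, 3))
```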
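12. Small-Sample Test for $\mu_1 - \mu_2$: Normal Populations
- Case 1: Common Variances ($\sigma_1^2 = \sigma_2^2 = \sigma^2$)
- Null Hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$
- Alternative Hypotheses:
  - 1-Sided: $H_A: \mu_1 - \mu_2 > \Delta_0$
  - 2-Sided: $H_A: \mu_1 - \mu_2 \ne \Delta_0$
- Test Statistic (where $S_p^2$ is a pooled estimate of $\sigma^2$): $t_{obs} = \dfrac{(\bar{y}_1 - \bar{y}_2) - \Delta_0}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, with $S_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$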
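13. Small-Sample Test for $\mu_1 - \mu_2$: Normal Populations
- Decision Rule (based on the t-distribution with $\nu = n_1 + n_2 - 2$ df):
  - 1-sided alternative
    - If $t_{obs} \ge t_{\alpha,\nu}$, conclude $\mu_1 - \mu_2 > \Delta_0$
    - If $t_{obs} < t_{\alpha,\nu}$, do not reject $\mu_1 - \mu_2 = \Delta_0$
  - 2-sided alternative
    - If $t_{obs} \ge t_{\alpha/2,\nu}$, conclude $\mu_1 - \mu_2 > \Delta_0$
    - If $t_{obs} \le -t_{\alpha/2,\nu}$, conclude $\mu_1 - \mu_2 < \Delta_0$
    - If $-t_{\alpha/2,\nu} < t_{obs} < t_{\alpha/2,\nu}$, do not reject $\mu_1 - \mu_2 = \Delta_0$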
14. Small-Sample Test for $\mu_1 - \mu_2$: Normal Populations
- Observed Significance Level (P-Value):
  - Special tables are needed; P-values are printed by statistical software packages
  - 1-sided alternative: $P = P(t \ge t_{obs})$ (from the $t_{\nu}$ distribution)
  - 2-sided alternative: $P = 2P(t \ge |t_{obs}|)$ (from the $t_{\nu}$ distribution)
- If P-Value $\le \alpha$, then reject the null hypothesis
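A minimal sketch of the pooled-variance test follows. The function name `pooled_t_test` and the data are hypothetical. Because the Python standard library has no t-distribution, the critical value is passed in from a t table (here $t_{.025,8} = 2.306$) and no P-value is computed. The pooled variance is assumed to be $S_p^2 = [(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2] / (n_1 + n_2 - 2)$.

```python
import math
import statistics


def pooled_t_test(sample1, sample2, t_crit, delta0=0.0):
    """2-sided pooled-variance t-test of H0: mu1 - mu2 = delta0.

    t_crit is the tabled value t_{alpha/2, nu} with nu = n1 + n2 - 2.
    """
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * statistics.variance(sample1)
           + (n2 - 1) * statistics.variance(sample2)) / df
    se = math.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))
    t_obs = (statistics.mean(sample1) - statistics.mean(sample2) - delta0) / se
    if t_obs >= t_crit:
        decision = "conclude mu1 - mu2 > delta0"
    elif t_obs <= -t_crit:
        decision = "conclude mu1 - mu2 < delta0"
    else:
        decision = "do not reject mu1 - mu2 = delta0"
    return t_obs, df, decision


# Hypothetical data, for illustration only.
group1 = [12.0, 15.0, 11.0, 14.0, 13.0]
group2 = [10.0, 9.0, 12.0, 11.0, 8.0]

t_obs, df, decision = pooled_t_test(group1, group2, t_crit=2.306)
print(t_obs, df, decision)
```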
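15. Small-Sample $(1-\alpha)100\%$ Confidence Interval for $\mu_1 - \mu_2$ - Normal Populations
- Confidence Coefficient $(1-\alpha)$ refers to the proportion of times this rule would provide an interval that contains the true parameter value $\mu_1 - \mu_2$ if it were applied over all possible samples
- Rule: $(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2,\nu}\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$, with $\nu = n_1 + n_2 - 2$
- Interpretations are the same as for large-sample CIs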
16. Small-Sample Inference for $\mu_1 - \mu_2$: Normal Populations
- Case 2: $\sigma_1^2 \ne \sigma_2^2$
- Don't pool variances
- Use adjusted degrees of freedom (Satterthwaite's Approximation)
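The unequal-variance case can be sketched as below. The slide does not show the formula, so this assumes the usual Welch-Satterthwaite form $\nu^* = (v_1 + v_2)^2 / \left[v_1^2/(n_1 - 1) + v_2^2/(n_2 - 1)\right]$ with $v_i = s_i^2/n_i$; the function name `welch_t` and the summary values are hypothetical.

```python
import math


def welch_t(ybar1, ybar2, s1, s2, n1, n2, delta0=0.0):
    """Unpooled t statistic with Satterthwaite-adjusted degrees of freedom."""
    v1 = s1 ** 2 / n1   # estimated variance of ybar1
    v2 = s2 ** 2 / n2   # estimated variance of ybar2
    t_obs = (ybar1 - ybar2 - delta0) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_obs, df


# Hypothetical summary statistics, for illustration only.
t_obs, df = welch_t(ybar1=20.0, ybar2=17.0, s1=2.0, s2=4.0, n1=10, n2=12)
print(t_obs, df)
```

The adjusted degrees of freedom are generally not an integer and fall between $\min(n_1, n_2) - 1$ and $n_1 + n_2 - 2$; a common convention when using printed tables is to round down.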
17. Example - Scalp Wound Closure
- Groups: Stapling ($n_1 = 15$) / Suturing ($n_2 = 16$)
- Outcome: Physician-Reported VAS Score at 1 Year
- Conduct a 2-sided test of whether mean scores differ
- Construct a 95% Confidence Interval for the true difference
Source: Khan, et al (2002)
18. Example - Scalp Wound Closure
$H_0: \mu_1 - \mu_2 = 0$    $H_A: \mu_1 - \mu_2 \ne 0$    ($\alpha = 0.05$)
No significant difference between the 2 methods
19. Small-Sample Test to Compare Two Medians - Nonnormal Populations
- Two Independent Samples (Parallel Groups)
- Procedure (Wilcoxon Rank-Sum Test):
  - Rank measurements across samples from smallest (1) to largest ($n_1 + n_2$). Ties take average ranks.
  - Obtain the rank sum for each group ($T_1$, $T_2$)
  - 1-sided tests: Conclude $H_A: M_1 > M_2$ if $T_2 \le T_0$
  - 2-sided tests: Conclude $H_A: M_1 \ne M_2$ if $\min(T_1, T_2) \le T_0$
  - Values of $T_0$ are given in many texts for various sample sizes and significance levels. P-values are printed by statistical software packages.
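The ranking and rank-sum steps can be sketched as follows. The helper names `average_ranks` and `rank_sums` and the data are hypothetical; the critical value $T_0 = 26$ is the one quoted in the Levocabastine example for $n_1 = n_2 = 6$ and must otherwise be taken from a table.

```python
def average_ranks(values):
    """Rank values from smallest (1) to largest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sums(sample1, sample2):
    """Return (T1, T2), the rank sums of each group in the combined ranking."""
    ranks = average_ranks(list(sample1) + list(sample2))
    t1 = sum(ranks[:len(sample1)])
    t2 = sum(ranks[len(sample1):])
    return t1, t2


# Hypothetical data with one tie (9.4 appears in both groups).
sample1 = [8.1, 9.4, 10.2, 11.5, 12.3, 13.0]
sample2 = [3.2, 4.5, 5.1, 6.8, 7.7, 9.4]

t1, t2 = rank_sums(sample1, sample2)
t0 = 26  # tabled value for n1 = n2 = 6, 2-sided alpha = 0.05
medians_differ = min(t1, t2) <= t0
print(t1, t2, medians_differ)
```

A quick check on any ranking is that $T_1 + T_2 = (n_1 + n_2)(n_1 + n_2 + 1)/2$, which is 78 for twelve observations.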
20. Example - Levocabastine in Renal Patients
- 2 Groups: Non-Dialysis/Hemodialysis ($n_1 = n_2 = 6$)
- Outcome: Levocabastine AUC (1 Outlier/Group)
2-sided Test: Conclude Medians differ if $\min(T_1, T_2) \le 26$
Source: Zagornik, et al (1993)
21. Computer Output - SPSS
22. Inference Based on Paired Samples (Crossover Designs)
- Setting: Each treatment is applied to each subject or pair (preferably in random order)
- Data: $d_i$ is the difference in scores (Trt1 - Trt2) for subject (pair) $i$
- Parameter: $\mu_D$, the population mean difference
- Sample Statistics: $\bar{d} = \dfrac{1}{n}\sum_{i=1}^{n} d_i$ and $s_d = \sqrt{\dfrac{\sum_{i=1}^{n}(d_i - \bar{d})^2}{n - 1}}$
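23. Test Concerning $\mu_D$
- Null Hypothesis: $H_0: \mu_D = \Delta_0$ (almost always 0)
- Alternative Hypotheses:
  - 1-Sided: $H_A: \mu_D > \Delta_0$
  - 2-Sided: $H_A: \mu_D \ne \Delta_0$
- Test Statistic: $t_{obs} = \dfrac{\bar{d} - \Delta_0}{s_d / \sqrt{n}}$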
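24. Test Concerning $\mu_D$
- Decision Rule (based on the t-distribution with $\nu = n - 1$ df):
  - 1-sided alternative
    - If $t_{obs} \ge t_{\alpha,\nu}$, conclude $\mu_D > \Delta_0$
    - If $t_{obs} < t_{\alpha,\nu}$, do not reject $\mu_D = \Delta_0$
  - 2-sided alternative
    - If $t_{obs} \ge t_{\alpha/2,\nu}$, conclude $\mu_D > \Delta_0$
    - If $t_{obs} \le -t_{\alpha/2,\nu}$, conclude $\mu_D < \Delta_0$
    - If $-t_{\alpha/2,\nu} < t_{obs} < t_{\alpha/2,\nu}$, do not reject $\mu_D = \Delta_0$
- Confidence Interval for $\mu_D$: $\bar{d} \pm t_{\alpha/2,\nu}\dfrac{s_d}{\sqrt{n}}$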
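The paired analysis can be sketched as below. The function name `paired_t` and the scores are hypothetical. The critical value comes from a t table ($t_{.025,4} = 2.776$ for $n = 5$), and the test statistic is assumed to be $t_{obs} = (\bar{d} - \Delta_0)/(s_d/\sqrt{n})$.

```python
import math
import statistics


def paired_t(trt1, trt2, t_crit, delta0=0.0):
    """Paired t statistic, degrees of freedom, and confidence interval for mu_D.

    t_crit is the tabled value t_{alpha/2, nu} with nu = n - 1.
    """
    diffs = [a - b for a, b in zip(trt1, trt2)]  # d_i = Trt1 - Trt2
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # s_d / sqrt(n)
    t_obs = (d_bar - delta0) / se
    ci = (d_bar - t_crit * se, d_bar + t_crit * se)
    return t_obs, n - 1, ci


# Hypothetical scores for 5 subjects under each treatment.
trt1 = [7.0, 5.0, 8.0, 6.0, 9.0]
trt2 = [5.0, 4.0, 5.0, 5.0, 6.0]

t_obs, df, ci = paired_t(trt1, trt2, t_crit=2.776)
print(t_obs, df, ci)
```

Here the whole interval lies above 0 and $t_{obs}$ exceeds the critical value, so the interval and the 2-sided test agree, as they do for independent samples.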
25. Example - Evaluation of Transdermal Contraceptive Patch In Adolescents
- Subjects: Adolescent Females on O.C. who then received Ortho Evra Patch
- Response: 5-point scores on ease of use for each type of contraception (1 = Strongly Agree)
- Data: $d_i$ = difference (O.C. - EVRA) for subject $i$
- Summary Statistics:
Source: Rubinstein, et al (2004)
26. Example - Evaluation of Transdermal Contraceptive Patch In Adolescents
- 2-sided test for differences in ease of use ($\alpha = 0.05$)
- $H_0: \mu_D = 0$    $H_A: \mu_D \ne 0$
Conclusion: Mean scores are higher for O.C., so girls find the Patch easier to use (low scores are better)
27. Small-Sample Test For Nonnormal Data
- Paired Samples (Crossover Design)
- Procedure (Wilcoxon Signed-Rank Test):
  - Compute differences $d_i$ (as in the paired t-test) and obtain their absolute values $|d_i|$ (ignoring 0s)
  - Rank the observations by $|d_i|$ (smallest = 1), averaging ranks for ties
  - Compute $T^+$ and $T^-$, the rank sums for the positive and negative differences, respectively
  - 1-sided tests: Conclude $H_A: M_1 > M_2$ if $T^- \le T_0$
  - 2-sided tests: Conclude $H_A: M_1 \ne M_2$ if $\min(T^+, T^-) \le T_0$
  - Values of $T_0$ are given in many texts for various sample sizes and significance levels. P-values are printed by statistical software packages.
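The signed-rank steps can be sketched as follows. The helper names and the data are hypothetical; the data are constructed so that all seven differences are negative, and $T_0 = 2$ is the tabled 2-sided value for $n = 7$, $\alpha = 0.05$ quoted in the MRI example.

```python
def average_ranks(values):
    """Rank values from smallest (1) upward; tied values share the average rank."""
    ranks = []
    for v in values:
        less = sum(1 for w in values if w < v)
        equal = sum(1 for w in values if w == v)
        # Tied values occupy positions less+1 .. less+equal; take their average.
        ranks.append(less + (equal + 1) / 2.0)
    return ranks


def signed_rank_sums(trt1, trt2):
    """Return (T_plus, T_minus) for the paired differences d_i = trt1_i - trt2_i."""
    diffs = [a - b for a, b in zip(trt1, trt2) if a != b]  # ignore zero differences
    ranks = average_ranks([abs(d) for d in diffs])
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return t_plus, t_minus


# Hypothetical paired measurements for 7 subjects.
previous = [10.0, 12.5, 9.0, 11.0, 13.5, 8.5, 10.5]
new = [14.0, 15.0, 13.5, 16.5, 19.5, 15.5, 18.5]

t_plus, t_minus = signed_rank_sums(previous, new)
t0 = 2  # tabled value for n = 7, 2-sided alpha = 0.05
medians_differ = min(t_plus, t_minus) <= t0
print(t_plus, t_minus, medians_differ)
```

The quadratic-time ranking helper is chosen for readability; for the small samples this test targets, the cost is negligible.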
28. Example - New MRI for 3D Coronary Angiography
- Previous vs new Magnetization Prep Schemes ($n = 7$)
- Response: Blood/Myocardium Contrast-Noise-Ratio
- All differences are negative: $T^- = 1 + 2 + \cdots + 7 = 28$, $T^+ = 0$
- From tables for 2-sided tests, $n = 7$, $\alpha = 0.05$: $T_0 = 2$
- Since $\min(0, 28) \le 2$, conclude the scheme means differ
Source: Nguyen, et al (2004)
29. Computer Output - SPSS
Note that SPSS is taking NEW - PREVIOUS in the top table