Title: Classroom Simulation: Are Variance-Stabilizing Transformations Really Useful?
1 Classroom Simulation: Are Variance-Stabilizing Transformations Really Useful?
2 Bruce E. Trumbo, Eric A. Suess, Rebecca E. Brafman
- Department of Statistics
- California State University, Hayward
- Presentation, JSM 2004, Toronto
- btrumbo_at_csuhayward.edu
3 Introduction to One-way ANOVA
- In a one-way ANOVA, we test the null hypothesis that all group means $\mu_i$ are equal against the alternative hypothesis that not all group means are equal.
- ANOVA Table
  Source   DF       SS         MS         F-Ratio
  Factor   I - 1    SS(Fact)   MS(Fact)   MS(Fact)/MS(Err)
  Error    IJ - I   SS(Err)    MS(Err)
  Total    IJ - 1
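The entries of the ANOVA table can be computed directly from the data. The following is a minimal Python sketch, assuming a balanced design (every group has the same number J of observations); the function name `one_way_anova` is ours and is not taken from the presentation.

```python
def one_way_anova(groups):
    """Return (df_fact, df_err, ss_fact, ss_err, f_ratio) for a balanced layout."""
    I = len(groups)        # number of groups
    J = len(groups[0])     # observations per group (balanced design assumed)
    group_means = [sum(g) / J for g in groups]
    grand_mean = sum(group_means) / I
    # Between-group (Factor) and within-group (Error) sums of squares
    ss_fact = J * sum((m - grand_mean) ** 2 for m in group_means)
    ss_err = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_fact, df_err = I - 1, I * J - I
    f_ratio = (ss_fact / df_fact) / (ss_err / df_err)   # MS(Fact) / MS(Err)
    return df_fact, df_err, ss_fact, ss_err, f_ratio


# Small made-up dataset with I = 3 groups and J = 3 observations per group
print(one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

For this toy dataset the group means are 2, 3 and 6, so SS(Fact) = 26 on 2 DF and SS(Err) = 6 on 6 DF, giving an F-ratio of 13.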
4 Model and Assumptions
- We use the model $X_{ij}$ i.i.d. $\mathrm{NORM}(\mu_i, \sigma^2)$, for $i = 1, \dots, I$ and $j = 1, \dots, J$.
- Assumptions
- normal data
- independent groups
- independent observations within groups
- equal variances
5 When Data Are Not Normal
- If H0 is true: distributional difficulties arise
- MS(Factor) and MS(Error) not chi-squared
- MS(Factor) and MS(Error) not independent
- F-ratio not distributed as F
- If H0 is false: different means may imply different variances
6 Commonly Recommended Method for Transforming Data to Stabilize Variances
- Based on two-term Taylor-series approximations.
- Given a relationship between mean and variance: $\sigma^2 = \varphi(\mu)$.
- The following transformation makes variances approximately equal even if means differ: $Y = f(X)$, where $f'(\mu) = [\varphi(\mu)]^{-1/2}$.
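The Taylor-series argument behind this recipe can be written out as follows. This is a sketch of the standard delta-method reasoning, not a derivation taken from the slides; here $\varphi$ denotes the assumed mean-variance function and $f$ the transformation.

```latex
\begin{align*}
Y = f(X) &\approx f(\mu) + f'(\mu)\,(X - \mu)
  && \text{(two-term Taylor expansion about } \mu\text{)} \\
\operatorname{Var}(Y) &\approx [f'(\mu)]^2 \operatorname{Var}(X)
  = [f'(\mu)]^2\, \varphi(\mu) \\
\operatorname{Var}(Y) \approx 1
  &\iff f'(\mu) = [\varphi(\mu)]^{-1/2}
  \iff f(\mu) = \int \frac{d\mu}{\sqrt{\varphi(\mu)}} \\[4pt]
\text{Poisson: } \varphi(\mu) = \mu
  &\;\Rightarrow\; f(x) = 2\sqrt{x} \\
\text{Binomial proportion: } \varphi(p) = p(1-p)/n
  &\;\Rightarrow\; f(x) = 2\sqrt{n}\,\arcsin\sqrt{x} \\
\text{Exponential: } \varphi(\mu) = \mu^2
  &\;\Rightarrow\; f(x) = \log x
\end{align*}
```

Constant multipliers do not affect the F-test, which is why the transformations are usually quoted simply as square root, arcsine of square root, and log.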
7 Some Types of Nonnormal Data and Their Variance-Stabilizing Transformations
8 Square Root Transformations (Right) of Three Poisson Samples Have Similar Variances
9 Arcsine of Square Root Transformations (Right) of Three Binomial Samples Have Similar Variances
10 Log Transformations (Right) of Three Exponential Samples Have Similar Variances
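The claim that the transformed samples have similar variances can be checked numerically. The sketch below uses only the Python standard library; the sample size, the seed, and the particular means and proportions are arbitrary choices of ours, not the values used in the presentation's figures.

```python
import math
import random
from statistics import variance

rng = random.Random(2004)   # arbitrary seed, for reproducibility only


def draw_poisson(mean):
    """Poisson variate by Knuth's multiplication method (fine for small means)."""
    limit, k, p = math.exp(-mean), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def draw_binomial_proportion(p, n=30):
    """Observed proportion of successes in n Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n)) / n


def draw_exponential(mean):
    return rng.expovariate(1.0 / mean)


def arcsine_sqrt(x):
    return math.asin(math.sqrt(x))


def variances(draw, transform, param, size=10000):
    """Sample variance before and after applying the transformation."""
    xs = [draw(param) for _ in range(size)]
    return variance(xs), variance([transform(x) for x in xs])


results = {
    "poisson": [variances(draw_poisson, math.sqrt, m) for m in (5, 10, 20)],
    "binomial": [variances(draw_binomial_proportion, arcsine_sqrt, p)
                 for p in (0.1, 0.3, 0.5)],
    "exponential": [variances(draw_exponential, math.log, m) for m in (1, 10, 100)],
}
for name, rows in results.items():
    for raw, stabilized in rows:
        print(f"{name:12s} raw variance {raw:12.4f}   transformed {stabilized:8.4f}")
```

Within each family the raw variances change with the mean, while the transformed variances should stay roughly constant (close to 1/4 for Poisson, 1/(4n) for binomial proportions, and pi^2/6 for exponential data).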
11 Additional Transformations
- We also consider rank transformations for exponential data.
- Possible future work (no results given here): Box-Cox transformations of the type $Y = X^a$, where a is based on the data.
- Examples
- Square root if a = 1/2
- Reciprocal if a = -1
- Interpreted as log transformation if a = 0
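The reason a = 0 is read as the log transformation is clearest in the scaled Box-Cox form (x^a - 1)/a, which tends to log(x) as a approaches 0. A minimal Python sketch follows; the function name `power_transform` is ours, and choosing a from the data (the part described as future work) is not attempted here.

```python
import math


def power_transform(x, a):
    """Scaled Box-Cox power transformation of a positive value x.

    Returns (x**a - 1) / a for a != 0 and log(x) for a == 0, the limiting case.
    """
    if x <= 0:
        raise ValueError("power transformations here require x > 0")
    if a == 0:
        return math.log(x)
    return (x ** a - 1.0) / a


for a in (0.5, -1, 0, 1e-6):
    print(a, power_transform(4.0, a))
```

For any fixed a other than 0, the scaled form is an increasing linear function of x^a with the orientation of x preserved, so it leads to the same F-ratio as the unscaled power.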
12 Simulation Study
- 1. Simulations are based on data with known distributions: Poisson, binomial, or exponential.
- 2. Use R, S-Plus, and Minitab. (SAS can also be used but is very time consuming.)
- 3. In each simulation we generate 20,000 datasets from the nonnormal distribution under study.
- 4. Each dataset consists of I = 3 groups, usually with J = 5 or 10 observations per group.
- 5. For each distribution: datasets under H0, and for a variety of cases with Ha.
13 Comparisons to Judge Usefulness of Transformations
- All tests have nominal size $\alpha$ = 5%.
- P{Rej} is estimated as the proportion of 20,000 simulated datasets in which H0 is rejected.
- With and without transformation
- When H0 is true, does P{Rej} = 5%?
- For various alternatives: when is P{Rej} larger, with or without transformation?
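When judging whether an estimated rejection probability differs from 5%, the Monte Carlo error of the estimate matters. With N = 20,000 simulated datasets it can be gauged from the binomial standard error (a standard calculation of ours, not a figure from the slides):

```latex
\[
\mathrm{SE}(\hat p) = \sqrt{\frac{p\,(1-p)}{N}},
\qquad
p = 0.05,\; N = 20000
\;\Rightarrow\;
\mathrm{SE}(\hat p) = \sqrt{\frac{0.05 \times 0.95}{20000}} \approx 0.0015 .
\]
```

So an estimated size between roughly 4.7% and 5.3% is within about two standard errors of the nominal 5%.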
14 R / S-Plus Code for Exponential Simulation
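The procedure described in the simulation study can be sketched as follows for exponential data. This is a Python sketch of ours, not the authors' R / S-Plus code. Assumptions: the rank transformation ranks all IJ observations jointly and then applies the usual F-test to the ranks; the critical value uses a closed form that holds only for I = 3 groups (2 numerator degrees of freedom); the names `f_ratio`, `f_critical_2df`, `joint_ranks`, and `rejection_rates` are hypothetical.

```python
import math
import random


def f_ratio(groups):
    """One-way ANOVA F-ratio for a balanced layout."""
    I, J = len(groups), len(groups[0])
    means = [sum(g) / J for g in groups]
    grand = sum(means) / I
    ms_fact = J * sum((m - grand) ** 2 for m in means) / (I - 1)
    ms_err = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (I * J - I)
    return ms_fact / ms_err


def f_critical_2df(df_err, alpha=0.05):
    """Upper-alpha point of F(2, df_err), from P(F > f) = (1 + 2f/df_err)^(-df_err/2)."""
    return (df_err / 2.0) * (alpha ** (-2.0 / df_err) - 1.0)


def joint_ranks(groups):
    """Replace each observation by its rank among all observations (no ties expected)."""
    flat = [x for g in groups for x in g]
    order = sorted(range(len(flat)), key=flat.__getitem__)
    ranks = [0.0] * len(flat)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = float(rank)
    J = len(groups[0])
    return [ranks[i * J:(i + 1) * J] for i in range(len(groups))]


def rejection_rates(means, J=5, n_datasets=20000, alpha=0.05, seed=1):
    """Estimate P{Rej} for original, log-transformed, and rank-transformed data."""
    if len(means) != 3:
        raise ValueError("closed-form critical value assumes I = 3 groups")
    rng = random.Random(seed)
    crit = f_critical_2df(3 * J - 3, alpha)
    counts = {"original": 0, "log": 0, "rank": 0}
    for _ in range(n_datasets):
        data = [[rng.expovariate(1.0 / m) for _ in range(J)] for m in means]
        versions = {
            "original": data,
            "log": [[math.log(x) for x in g] for g in data],
            "rank": joint_ranks(data),
        }
        for name, d in versions.items():
            if f_ratio(d) > crit:
                counts[name] += 1
    return {name: c / n_datasets for name, c in counts.items()}


print("H0 (means 1, 1, 1):   ", rejection_rates((1, 1, 1)))
print("Ha (means 1, 10, 100):", rejection_rates((1, 10, 100)))
```

The first call estimates the actual size of each test under H0; the second estimates power for widely separated means with J = 5 observations per group.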
15 Summary of Findings
- Within the limited scope of our study:
- For Poisson data, the square root transformation seems ineffective.
- For binomial data, the arcsine transformation seems ineffective.
- For exponential data, both the log and the rank transformations seem to be useful in some cases, particularly for small samples.
16 Some Specific Results: P{Rej} for Poisson Data
Three groups, each with 5 observations
17 Some Specific Results: P{Rej} for Binomial Proportions
Three groups, each with 5 observations
18 For Exponential Data: Log and Rank Transformations Sometimes Useful
Power = P{Rej | Ha} often larger for transformed data (one borderline exceptional case shown)
19 Exponential: Power Against Ha: 1, 10, 100 for Various Numbers r of Replications
Log and rank transformations work well when r is small and population means are widely separated.
[Figure: power against number of replications r. Legend: O = Original; Log Transf.; Rank Transf.]
20 Exponential: Power Against Ha: 1, 2, 4 for Various Numbers r of Replications
When means are not so widely separated, log and rank transformations do some harm unless r is small.
[Figure: power against number of replications r. Legend: O = Original; Log Transf.; Rank Transf.]
21 Exponential: Power for Various Alternatives
When M = 1, H0 is true; when M = 2, the group means are 1, 2, 4; when M = 4, the group means are 1, 4, 16; etc. For r = 5 and M > 2, transformations are useful.
[Figure: power curves against M. Solid = Original; Dotted = Log Transf.; Dashed = Rank Transf.]
22 Exponential: Power for Various Alternatives
When M = 1, H0 is true; when M = 2, the group means are 1, 2, 4; when M = 4, the group means are 1, 4, 16; etc. For r = 20, transformations may be harmful.
[Figure: power curves against M. Solid = Original; Dotted = Log Transf.; Dashed = Rank Transf.]
23 References / Acknowledgments
REFERENCES ON VARIANCE-STABILIZING TRANSFORMATIONS
- G. Oehlert: A First Course in Design and Analysis of Experiments, Freeman (2000), Chapter 6.
- D. Montgomery: Design and Analysis of Experiments, 5th ed., Wiley (2001), Chapter 3.
- K. Brownlee: Statistical Theory and Methodology in Science and Engineering, 2nd ed., Wiley (1965), Chapter 3.
- H. Scheffé: The Analysis of Variance, Wiley (1959), Chapter 10.
- G. Snedecor and W. Cochran: Statistical Methods, 7th ed., Iowa State Univ. Press (1980), Chapter 15.
WEB PAGES including computer code and results for this paper: www.sci.csuhayward.edu/btrumbo/JSM2004/simtrans/
THANKS TO Jaimyoung Kwan (UC Berkeley/CSU Hayward) for suggestions, especially concerning the inclusion of power curves. Rebecca Brafman's graduate study supported by NSF Graduate Research Fellowship.
24 About the Authors
- Rebecca E. Brafman, presenting this poster at JSM 2004 in Toronto, has recently completed her M.S. in Statistics from CSU Hayward.
- Eric A. Suess received his Ph.D. in Statistics from U.C. Davis and is Associate Professor of Statistics at CSU Hayward. His interests include statistical computation, time series and Bayesian statistics. esuess_at_csuhayward.edu
- Bruce E. Trumbo is a fellow of ASA and IMS and has been a professor in the Statistics Department at California State University, Hayward for over 30 years. btrumbo_at_csuhayward.edu