Title: Nonparametric Methods III
1Nonparametric Methods III
- Henry Horng-Shing Lu
- Institute of Statistics
- National Chiao Tung University
- hslu_at_stat.nctu.edu.tw
- http://tigpbp.iis.sinica.edu.tw/courses.htm
2PART 4 Bootstrap and Permutation Tests
- Introduction
- References
- Bootstrap Tests
- Permutation Tests
- Cross-validation
- Bootstrap Regression
- ANOVA
3References
- Efron, B., & Tibshirani, R. (1993). An Introduction to the Bootstrap. Chapman & Hall/CRC.
- http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-bootstrapping.pdf
- http://cran.r-project.org/bin/macosx/2.1/check/bootstrap-check.ex
- http://bcs.whfreeman.com/ips5e/content/cat_080/pdf/moore14.pdf
4Hypothesis Testing (1)
- A statistical hypothesis test is a method of making statistical decisions from and about experimental data.
- Null-hypothesis testing just answers the question of how well the findings fit the possibility that chance factors alone might be responsible.
- This is done by asking and answering a hypothetical question.
http://en.wikipedia.org/wiki/Statistical_hypothesis_testing
5Hypothesis Testing (2)
- Hypothesis testing is largely the product of
Ronald Fisher, Jerzy Neyman, Karl Pearson and
(son) Egon Pearson. Fisher was an agricultural
statistician who emphasized rigorous experimental
design and methods to extract a result from few
samples assuming Gaussian distributions. Neyman
(who teamed with the younger Pearson) emphasized
mathematical rigor and methods to obtain more
results from many samples and a wider range of
distributions. Modern hypothesis testing is an
(extended) hybrid of the Fisher vs.
Neyman/Pearson formulation, methods and
terminology developed in the early 20th century.
6Hypothesis Testing (3)
7Hypothesis Testing (4)
8Hypothesis Testing (5)
9Hypothesis Testing (7)
- Parametric Tests
- Nonparametric Tests
- Bootstrap Tests
- Permutation Tests
10Confidence Intervals vs. Hypothesis Testing (1)
- Interval estimation ("Confidence Intervals") and
point estimation ("Hypothesis Testing") are two
different ways of expressing the same information.
http://www.une.edu.au/WebStat/unit_materials/c5_inferential_statistics/confidence_interv_hypo.html
11Confidence Intervals vs. Hypothesis Testing (2)
- If the exact p-value is reported, then the relationship between confidence intervals and hypothesis testing is very close. However, the objective of the two methods is different:
- Hypothesis testing relates to a single conclusion of statistical significance vs. no statistical significance.
- Confidence intervals provide a range of plausible values for your population.
http://www.nedarc.org/nedarc/analyzingData/advancedStatistics/convidenceVsHypothesis.html
12Confidence Intervals vs. Hypothesis Testing (3)
- Which one?
- Use hypothesis testing when you want to do a strict comparison with a pre-specified hypothesis and significance level.
- Use confidence intervals to describe the magnitude of an effect (e.g., mean difference, odds ratio, etc.) or when you want to describe a single sample.
http://www.nedarc.org/nedarc/analyzingData/advancedStatistics/convidenceVsHypothesis.html
13P-value
http://bcs.whfreeman.com/ips5e/content/cat_080/pdf/moore14.pdf
14Achieved Significance Level (ASL)
https://www.cs.tcd.ie/Rozenn.Dahyot/453Bootstrap/05_Permutation.pdf
15Bootstrap Tests
- Methodology
- Flowchart
- R code
16Bootstrap Tests
- Beran (1988) showed that bootstrap inference is refined when the quantity bootstrapped is asymptotically pivotal.
- It is often used as a robust alternative to inference based on parametric assumptions.
http://socserv.mcmaster.ca/jfox/Books/Companion/appendix-bootstrapping.pdf
17Hypothesis Testing by a Pivot
http://en.wikipedia.org/wiki/Pivotal_quantity
18One Sample Bootstrap Tests
- The t statistic is a pivot when the data are normally distributed, and can be regarded as asymptotically pivotal otherwise.
- Bootstrap t tests can be applied when the data are not normally distributed.
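A one-sample bootstrap t test can be sketched as follows. This is a minimal illustration in Python (standard library only) rather than the R code shown on the original slides, which is not part of this transcript. It assumes the common recipe of shifting the data so that the null hypothesis holds, resampling B times, and comparing the bootstrap t statistics with the observed one. The function names `t_statistic` and `bootstrap_t_test` and the sample values are hypothetical.

```python
import math
import random


def t_statistic(sample, mu0):
    """One-sample t statistic: (mean - mu0) / (s / sqrt(n))."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((v - mean) ** 2 for v in sample) / (n - 1)
    diff = mean - mu0
    if var == 0.0:
        # Degenerate resample (all values equal): avoid dividing by zero.
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(var / n)


def bootstrap_t_test(x, mu0, n_boot=2000, seed=0):
    """Two-sided bootstrap t test of H0: mean = mu0.

    Returns the observed t statistic and the estimated ASL.
    """
    rng = random.Random(seed)
    t_obs = t_statistic(x, mu0)
    xbar = sum(x) / len(x)
    # Shift the data so that the empirical distribution has mean mu0.
    shifted = [v - xbar + mu0 for v in x]
    extreme = 0
    for _ in range(n_boot):
        resample = rng.choices(shifted, k=len(x))  # draw with replacement
        if abs(t_statistic(resample, mu0)) >= abs(t_obs):
            extreme += 1
    return t_obs, extreme / n_boot


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    data = [9.2, 10.1, 9.8, 10.6, 9.5, 10.3, 9.9, 10.8, 9.4, 10.0]
    print(bootstrap_t_test(data, mu0=9.5))
```

Bootstrapping the studentized statistic rather than the raw mean is what makes use of the pivot property mentioned above.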
19Bootstrap T tests
20Flowchart of Bootstrap T Tests
Bootstrap B times
21Bootstrap T Tests by R
22An Example of Bootstrap T Tests by R
23Bootstrap Tests by The BCa
- The BCa percentile method is an efficient method to generate bootstrap confidence intervals.
- There is a correspondence between confidence intervals and hypothesis testing.
- So, we can use the BCa percentile method to test whether H0 is true.
- Example: use BCa to calculate a p-value
24BCa Confidence Intervals
- Use the function boot.ci in the R package boot
- Use the function bcanon in the R package bootstrap
- http://qualopt.eivd.ch/stats/?pagebootstrap
- http://www.stata.com/capabilities/boot.html
25http://finzi.psych.upenn.edu/R/library/boot/DESCRIPTION
26An Example of boot.ci(boot) in R
27http://finzi.psych.upenn.edu/R/library/bootstrap/DESCRIPTION
28An example of bcanon(bootstrap) in R
29BCa by http://qualopt.eivd.ch/stats/?pagebootstrap
30Use BCa to calculate p-value by R
31Two Sample Bootstrap Tests
32Flowchart of Two-Sample Bootstrap Tests
Combine the two samples (m + n = N), then bootstrap B times
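The flowchart steps (combine the two samples, then bootstrap B times) can be sketched as follows. This is a Python illustration (standard library only), not the R code from the slides. It assumes the test statistic is the difference in sample means and that the ASL is estimated as the share of bootstrap differences at least as large as the observed one. The function name and the data values are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def two_sample_bootstrap_test(x, y, n_boot=2000, seed=0):
    """One-sided bootstrap test of H0: both samples share one distribution.

    Returns the observed mean difference and the estimated ASL.
    """
    rng = random.Random(seed)
    m = len(x)
    pooled = list(x) + list(y)  # combined sample of size N = m + n
    t_obs = mean(x) - mean(y)
    extreme = 0
    for _ in range(n_boot):
        # Draw N values with replacement from the pooled sample;
        # the first m play the role of x*, the remaining n of y*.
        resample = rng.choices(pooled, k=len(pooled))
        t_star = mean(resample[:m]) - mean(resample[m:])
        if t_star >= t_obs:
            extreme += 1
    return t_obs, extreme / n_boot


if __name__ == "__main__":
    # Hypothetical treatment and control values, for illustration only.
    treatment = [94, 197, 16, 38, 99, 141, 23]
    control = [52, 104, 146, 10, 51, 30, 40, 27, 46]
    print(two_sample_bootstrap_test(treatment, control))
```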
33Two-Sample Bootstrap Tests by R
34Output (1)
35Output (2)
36Permutation Tests
- Methodology
- Flowchart
- R code
37Permutation
- In several fields of mathematics, the term
permutation is used with different but closely
related meanings. They all relate to the notion
of (re-)arranging elements from a given finite
set into a sequence.
http://en.wikipedia.org/wiki/Permutation
38Permutation Tests
- A permutation test is also called a randomization test, re-randomization test, or an exact test.
- If the labels are exchangeable under the null hypothesis, then the resulting tests yield exact significance levels.
- Confidence intervals can then be derived from the tests.
- The theory has evolved from the works of R.A. Fisher and E.J.G. Pitman in the 1930s.
http://en.wikipedia.org/wiki/Pitman_permutation_test
39Applications of Permutation Tests (1)
- We can use a permutation test only when we can
see how to resample in a way that is consistent
with the study design and with the null
hypothesis.
http://bcs.whfreeman.com/ips5e/content/cat_080/pdf/moore14.pdf
40Applications of Permutation Tests (2)
- Two-sample problems when the null hypothesis says that the two populations are identical. We may wish to compare population means, proportions, standard deviations, or other statistics.
- Matched pairs designs when the null hypothesis says that there are only random differences within pairs. A variety of comparisons is again possible.
- Relationships between two quantitative variables when the null hypothesis says that the variables are not related. The correlation is the most common measure of association, but not the only one.
http://bcs.whfreeman.com/ips5e/content/cat_080/pdf/moore14.pdf
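The matched-pairs and correlation settings listed above each have a natural way of resampling under the null hypothesis. The Python sketch below (standard library only, not taken from the cited reference) assumes random sign flips of the within-pair differences for matched pairs, and random shuffling of one variable for the correlation. The function names and data are hypothetical.

```python
import math
import random


def paired_sign_flip_test(diffs, n_perm=2000, seed=0):
    """Matched pairs: under H0 each within-pair difference is equally
    likely to be positive or negative, so flip signs at random."""
    rng = random.Random(seed)
    n = len(diffs)
    obs = sum(diffs) / n
    extreme = 0
    for _ in range(n_perm):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if flipped >= obs - 1e-12:  # small tolerance for rounding
            extreme += 1
    return obs, extreme / n_perm


def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_permutation_test(xs, ys, n_perm=2000, seed=0):
    """Correlation: under H0 the variables are unrelated, so shuffling
    ys against xs leaves the distribution of r unchanged."""
    rng = random.Random(seed)
    r_obs = pearson_r(xs, ys)
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= abs(r_obs) - 1e-12:
            extreme += 1
    return r_obs, extreme / n_perm


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    print(paired_sign_flip_test([1.2, -0.4, 0.9, 0.3, 1.1, -0.2, 0.7]))
    print(correlation_permutation_test([1, 2, 3, 4, 5, 6], [2.1, 1.9, 3.8, 4.2, 4.9, 6.3]))
```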
41Inference by Permutation Tests
https://www.cs.tcd.ie/Rozenn.Dahyot/453Bootstrap/05_Permutation.pdf
42Flowchart of The Permutation Test for Mean Shift in One Sample
Partition the sample into 2 subsets (treatment group and control group) B times
43An Example for One Sample Permutation Test by R
http://mason.gmu.edu/csutton/EandTCh15a.txt
45An Example of Output Results
46Flowchart of The Permutation Test for Mean Shift in Two Samples
Combine the two samples (m + n = N), then partition into a treatment subgroup and a control subgroup B times
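The two-sample flowchart (combine, then repartition B times) can be sketched as follows. This is a Python illustration (standard library only), not the R code from the slides. It assumes the statistic is the difference in means and uses B random partitions instead of enumerating all of them. The function name and data are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def two_sample_permutation_test(x, y, n_perm=2000, seed=0):
    """One-sided permutation test for a mean shift between two samples.

    Returns the observed mean difference and the estimated ASL.
    """
    rng = random.Random(seed)
    m = len(x)
    pooled = list(x) + list(y)  # combined sample of size N = m + n
    t_obs = mean(x) - mean(y)
    extreme = 0
    for _ in range(n_perm):
        # Random partition without replacement: the first m values form the
        # treatment subgroup, the remaining n the control subgroup.
        rng.shuffle(pooled)
        t_star = mean(pooled[:m]) - mean(pooled[m:])
        if t_star >= t_obs - 1e-12:  # small tolerance for rounding
            extreme += 1
    return t_obs, extreme / n_perm


if __name__ == "__main__":
    # Hypothetical treatment and control values, for illustration only.
    treatment = [94, 197, 16, 38, 99, 141, 23]
    control = [52, 104, 146, 10, 51, 30, 40, 27, 46]
    print(two_sample_permutation_test(treatment, control))
```

The only difference from a pooled two-sample bootstrap is the sampling step: shuffling draws without replacement, whereas the bootstrap draws with replacement.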
47Bootstrap Tests vs. Permutation Tests
- The permutation test and the bootstrap test give very similar results.
- ASL_perm is the exact probability when the null hypothesis of identical populations holds.
- ASL_boot is not an exact probability but is guaranteed to be accurate as an estimate of the ASL, as the sample size B goes to infinity.
https://www.cs.tcd.ie/Rozenn.Dahyot/453Bootstrap/05_Permutation.pdf
48Cross-validation
49Cross-validation
- Cross-validation, sometimes called rotation estimation, is the statistical practice of partitioning a sample of data into subsets such that the analysis is initially performed on a single subset, while the other subset(s) are retained for subsequent use in confirming and validating the initial analysis.
- The initial subset of data is called the training set.
- The other subset(s) are called validation or testing sets.
http://en.wikipedia.org/wiki/Cross-validation
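A k-fold version of this idea can be sketched as follows. This is a Python illustration (standard library only), not the `crossval` function of the R package bootstrap. It assumes a simple least-squares line as the model and mean squared prediction error as the criterion; each fold serves once as the validation set while the remaining folds form the training set. The names `fit_line` and `k_fold_cv_mse` and the data are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def k_fold_cv_mse(xs, ys, k=5, seed=0):
    """Cross-validated mean squared prediction error of the fitted line."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal validation sets
    sq_err = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        b0, b1 = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Predict only the observations that were not used for fitting.
        sq_err += sum((ys[i] - (b0 + b1 * xs[i])) ** 2 for i in fold)
    return sq_err / len(xs)


if __name__ == "__main__":
    # Hypothetical data: a noisy linear trend, for illustration only.
    noise = random.Random(1)
    xs = [float(i) for i in range(30)]
    ys = [1.0 + 2.0 * x + noise.gauss(0.0, 1.0) for x in xs]
    print(k_fold_cv_mse(xs, ys, k=5))
```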
50Overfitting Problems
- In statistics, overfitting is fitting a statistical model that has too many parameters.
- When the degrees of freedom in parameter selection exceed the information content of the data, this leads to arbitrariness in the final (fitted) model parameters which reduces or destroys the ability of the model to generalize beyond the fitting data.
- The concept of overfitting is important also in machine learning.
- In both statistics and machine learning, in order to avoid overfitting, it is necessary to use additional techniques (e.g. cross-validation, early stopping, Bayesian priors on parameters or model comparison), that can indicate when further training is not resulting in better generalization.
- http://en.wikipedia.org/wiki/Overfitting
51library(bootstrap) ?crossval
52An Example of Cross-validation by R
53output
54Bootstrap Regression
- Bootstrapping pairs
- Resample from the sample pairs (x_i, y_i).
- Bootstrapping residuals
- 1. Fit the regression model to the original sample and obtain the residuals.
- 2. Resample from the residuals.
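Both resampling schemes can be sketched for a simple linear regression as follows. This is a Python illustration (standard library only), not the R code from the referenced lab page. It assumes a least-squares line and collects the bootstrap distribution of the slope; the names `fit_line`, `bootstrap_pairs`, and `bootstrap_residuals` and the data are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def bootstrap_pairs(xs, ys, n_boot=1000, seed=0):
    """Resample (x_i, y_i) pairs with replacement and refit each time."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # all x equal: the slope is undefined, skip this draw
        slopes.append(fit_line(bx, [ys[i] for i in idx])[1])
    return slopes


def bootstrap_residuals(xs, ys, n_boot=1000, seed=0):
    """Keep x fixed, resample residuals, rebuild y*, and refit."""
    rng = random.Random(seed)
    b0, b1 = fit_line(xs, ys)
    fitted = [b0 + b1 * x for x in xs]
    resid = [y - f for y, f in zip(ys, fitted)]
    slopes = []
    for _ in range(n_boot):
        e_star = rng.choices(resid, k=len(xs))
        y_star = [f + e for f, e in zip(fitted, e_star)]
        slopes.append(fit_line(xs, y_star)[1])
    return slopes


if __name__ == "__main__":
    # Hypothetical data: a noisy linear trend, for illustration only.
    noise = random.Random(1)
    xs = [float(i) for i in range(25)]
    ys = [3.0 + 0.5 * x + noise.gauss(0.0, 1.0) for x in xs]
    for name, slopes in (("pairs", bootstrap_pairs(xs, ys)),
                         ("residuals", bootstrap_residuals(xs, ys))):
        m = sum(slopes) / len(slopes)
        se = (sum((s - m) ** 2 for s in slopes) / (len(slopes) - 1)) ** 0.5
        print(name, "bootstrap SE of slope:", se)
```

Bootstrapping pairs treats the x values as random, while bootstrapping residuals keeps the design fixed and relies on the fitted model having exchangeable errors.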
55Bootstrapping Pairs by R
http://www.stat.uiuc.edu/babailey/stat328/lab7.html
56Output
57Bootstrapping Residuals by R
http://www.stat.uiuc.edu/babailey/stat328/lab7.html
58Bootstrapping residuals
59ANOVA
- When random errors follow a normal distribution
- When random errors do not follow a normal distribution
- Bootstrap tests
- Permutation tests
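When normal errors cannot be assumed, a permutation version of the one-way ANOVA F test is one option. The Python sketch below (standard library only, not the R code from the slides) assumes the usual F statistic and recomputes it after randomly reassigning the pooled observations to groups of the original sizes. The function names and the data are hypothetical.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F: between-group over within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        gm = sum(g) / len(g)
        ss_between += len(g) * (gm - grand) ** 2
        ss_within += sum((v - gm) ** 2 for v in g)
    if ss_within == 0.0:
        # No within-group variation: avoid dividing by zero.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_anova(groups, n_perm=2000, seed=0):
    """Permutation p-value for H0: all groups share one distribution."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    pooled = [v for g in groups for v in g]
    f_obs = f_statistic(groups)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_groups, start = [], 0
        for s in sizes:
            perm_groups.append(pooled[start:start + s])
            start += s
        if f_statistic(perm_groups) >= f_obs * (1 - 1e-12):
            extreme += 1
    return f_obs, extreme / n_perm


if __name__ == "__main__":
    # Hypothetical weight gains under three diets, for illustration only.
    diets = [[8.0, 16.0, 9.0, 11.0, 12.0],
             [9.0, 16.0, 21.0, 11.0, 18.0, 14.0],
             [15.0, 10.0, 17.0, 6.0, 13.0]]
    print(permutation_anova(diets))
```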
60An Example of ANOVA by R (1)
- Example
- Twenty lambs are randomly assigned to three different diets. The weight gain (in two weeks) is recorded. Is there a difference among the diets?
- Reference
- http://mcs.une.edu.au/stat261/Bootstrap/bootstrap.R
61An Example of ANOVA by R (1)
62An Example of ANOVA by R (2)
63An Example of ANOVA by R (3)
64Output (1)
65Output (2)
66Output (3)
67Output (4)
68Output (5)
69Output (6)
70Output (7)
71The Second Example of ANOVA by R (1)
- Data source
- http://finzi.psych.upenn.edu/R/library/rpart/html/kyphosis.html
- Reference
- http://www.stat.umn.edu/geyer/5601/examp/parm.html
- Kyphosis is a misalignment of the spine. The data are on 83 laminectomy (a surgical procedure involving the spine) patients. The predictor variables are age and age^2 (that is, a quadratic function of age), the number of vertebrae involved in the surgery, and start, the vertebra number of the first vertebra involved. The response is presence or absence of kyphosis after the surgery (and perhaps caused by it).
72The Second Example of ANOVA by R (2)
73The Second Example of ANOVA by R (3)
74The Second Example of ANOVA by R (4)
75Output (1)
Data kyphosis
76Output (2)
77Output (3)
78Output (4)
79Output (5)
deviance
p-value
80Output (6)
81Exercises
- Write your own programs similar to those examples presented in this talk.
- Write programs for those examples mentioned at the reference web pages.
- Write programs for the other examples that you know.
- Practice Makes Perfect!