Title: Brownian Bridge and nonparametric rank tests
1 Brownian Bridge and nonparametric rank tests
- Olena Kravchuk
- School of Physical Sciences
- Department of Mathematics
- UQ
2 Lecture outline
- Definition and important characteristics of the Brownian bridge (BB)
- Interesting measurable events on the BB
- Asymptotic behaviour of rank statistics
- Cramer-von Mises statistic
- Small and large sample properties of rank statistics
- Some applications of rank procedures
- Useful references
3 Definition of Brownian bridge
4 Construction of the BB
5 Varying the coefficients of the bridge
6 Two useful properties
7 Ranks and anti-ranks
             First sample       Second sample
Index        1     2     3      4     5     6
Data         5     7     0      3     1     4
Rank         5     6     1      3     2     4
Anti-rank    3     5     4      6     1     2
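The table can be reproduced with a short sketch. It assumes there are no ties in the data, and the function name `ranks_and_antiranks` is introduced here for illustration only. The rank of an observation is its position in the sorted pooled sample; the k-th anti-rank is the index of the observation that has rank k.

```python
def ranks_and_antiranks(data):
    """Return 1-based (ranks, antiranks) for data without ties."""
    n = len(data)
    # order[k] is the 0-based index of the (k+1)-th smallest observation
    order = sorted(range(n), key=lambda i: data[i])
    antiranks = [i + 1 for i in order]
    ranks = [0] * n
    for k, i in enumerate(order):
        ranks[i] = k + 1
    return ranks, antiranks


data = [5, 7, 0, 3, 1, 4]
ranks, antiranks = ranks_and_antiranks(data)
print(ranks)      # [5, 6, 1, 3, 2, 4]
print(antiranks)  # [3, 5, 4, 6, 1, 2]
```

Ranks and anti-ranks are inverse permutations of each other: the observation with index `antiranks[k - 1]` has rank `k`.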
8 Simple linear rank statistic
- Any simple linear rank statistic is a linear combination of the scores, a's, and the constants, c's.
- When the constants are standardised, the first moment is zero and the second moment is expressed in terms of the scores alone.
- The limiting distribution is normal by a central limit theorem (CLT).
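The moment claims above can be checked by brute force on a small example. The sketch below makes several assumptions that are not stated on the slide: the statistic is written as S = sum of c_i * a(R_i), the constants are the two-sample constants with zero sum and unit sum of squares, and the scores are the Wilcoxon scores a(i) = i. Under the null hypothesis every permutation of the ranks is equally likely, so the exact moments follow from enumerating all permutations.

```python
from itertools import permutations
from math import sqrt


def standardised_constants(m, n):
    """Two-sample constants with zero sum and unit sum of squares."""
    N = m + n
    return [sqrt(n / (m * N))] * m + [-sqrt(m / (n * N))] * n


def wilcoxon_score(i):
    """Assumed score function for this illustration: a(i) = i."""
    return i


def linear_rank_statistic(constants, ranks, score):
    """S = sum_i c_i * a(R_i)."""
    return sum(c * score(r) for c, r in zip(constants, ranks))


m, n = 3, 3
N = m + n
c = standardised_constants(m, n)

# Exact null distribution: enumerate all N! rank vectors.
values = [linear_rank_statistic(c, p, wilcoxon_score)
          for p in permutations(range(1, N + 1))]
mean = sum(values) / len(values)
variance = sum((v - mean) ** 2 for v in values) / len(values)

# Second moment expressed through the scores only.
a = [wilcoxon_score(i) for i in range(1, N + 1)]
a_bar = sum(a) / N
variance_formula = sum((x - a_bar) ** 2 for x in a) / (N - 1)

# Value of the statistic for the ranks 5 6 1 | 3 2 4.
observed = linear_rank_statistic(c, [5, 6, 1, 3, 2, 4], wilcoxon_score)
print(mean, variance, variance_formula, observed)
```

With standardised constants the enumerated variance agrees with the sum of squared centred scores divided by N - 1, which is the sense in which the second moment depends on the scores only.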
9 Constrained random walk on pooled data
- Combine all the observations from the two samples into the pooled sample, N = m + n.
- Permute the vector of constants according to the anti-ranks of the observations and walk on the permuted constants, linearly interpolating the walk Z between the steps.
- Pin down the walk by normalizing the constants (they sum to zero, so the walk returns to zero at the last step).
- This random bridge Z converges in distribution to the Brownian bridge as the size of the smaller sample increases.
10 From real data to the random bridge
                First sample           Second sample
Index, i        1      2      3        4      5      6
Data, X         5      7      0        3      1      4
Constant, c     0.41   0.41   0.41     -0.41  -0.41  -0.41
Rank, R         5      6      1        3      2      4
Anti-rank, D    3      5      4        6      1      2
Bridge, Z       0.41   0      -0.41    -0.82  -0.41  0
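The Bridge row can be recomputed as a cumulative sum of the constants taken in anti-rank order. This is a minimal sketch, assuming no ties and assuming the two-sample constants sqrt(n/(mN)) and -sqrt(m/(nN)), which give the value 0.41 for m = n = 3; the function name `random_bridge` is introduced here for illustration.

```python
from math import sqrt


def random_bridge(first, second):
    """Values of the random bridge Z after each step of the walk."""
    m, n = len(first), len(second)
    N = m + n
    pooled = list(first) + list(second)
    # Standardised constants: zero sum, unit sum of squares.
    constants = [sqrt(n / (m * N))] * m + [-sqrt(m / (n * N))] * n
    # 0-based anti-ranks: indices of the pooled data in increasing order.
    order = sorted(range(N), key=lambda i: pooled[i])
    bridge, total = [], 0.0
    for i in order:
        total += constants[i]  # step k adds the constant c_{D_k}
        bridge.append(total)
    return bridge


z = random_bridge([5, 7, 0], [3, 1, 4])
print([round(v, 2) for v in z])  # agrees with the Bridge, Z row up to rounding
```

The walk starts at zero and, because the constants sum to zero, also ends at zero; intermediate values are joined by linear interpolation when the bridge is plotted.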
11 Symmetric distributions and the BB
12 Random walk model: no difference in distributions
13 Location and scale alternatives
14 Random walk: location and scale alternatives
Scale 2
Shift 2
15 Simple linear rank statistic again
- The simple linear rank statistic is expressed in terms of the random bridge.
- Although the small sample properties are investigated in the usual manner, the large sample properties are governed by the properties of the Brownian bridge.
- It is easy to visualise a linear rank statistic in such a way that the shape of the bridge suggests a particular type of statistic.
16 Trigonometric scores rank statistics
- The Cramer-von Mises statistic
- The first and second Fourier coefficients
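As background for the two bullets above, here is a sketch of the standard decomposition of the Cramer-von Mises functional on the bridge. The symbols are assumptions introduced here rather than the slide's notation: B is the Brownian bridge on [0, 1] and the Z_j are independent standard normal variables. The first and second terms of the sum correspond to the first and second Fourier coefficients.

```latex
\begin{align*}
B(t) &= \sum_{j=1}^{\infty} \frac{\sqrt{2}\,\sin(j\pi t)}{j\pi}\, Z_j,
&
Z_j &= \sqrt{2}\, j\pi \int_0^1 B(t)\,\sin(j\pi t)\,dt, \\
W^2 &= \int_0^1 B(t)^2\,dt = \sum_{j=1}^{\infty} \frac{Z_j^2}{j^2\pi^2}.
\end{align*}
```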
17 Combined trigonometric scores rank statistics
- The first and second coefficients are uncorrelated
- Fast convergence to the asymptotic distribution
- The Lepage test is a common test of the combined alternative (S_W is the Wilcoxon statistic and S_A-B is the Ansari-Bradley, or adapted Wilcoxon, statistic)
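For reference, the Lepage statistic is usually written as the sum of the two squared standardised statistics. In the sketch below, E_0 and Var_0 denote the mean and variance under the hypothesis of no difference; this notation is an assumption introduced here.

```latex
L = \frac{\left(S_W - \mathrm{E}_0 S_W\right)^2}{\mathrm{Var}_0\, S_W}
  + \frac{\left(S_{A\text{-}B} - \mathrm{E}_0 S_{A\text{-}B}\right)^2}{\mathrm{Var}_0\, S_{A\text{-}B}}
```

Under the null hypothesis L is asymptotically chi-squared with two degrees of freedom, so large values indicate a difference in location, in scale, or in both.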
18 Percentage points for the first component (one-sample)
- Durbin and Knott, Components of Cramer-von Mises statistics
19 Percentage points for the first component (two-sample)
- Kravchuk, Rank test of location optimal for the hyperbolic secant distribution (HSD)
20 Some tests of location
21 Trigonometric scores rank estimators
- Location estimator of the HSD (Vaughan)
- Scale estimator of the Cauchy distribution (Rublik)
- Trigonometric scores rank estimator (Kravchuk)
22 Optimal linear rank test
- An optimal test of location may be found in the class of simple linear rank tests by an appropriate choice of the score function, a.
- Assume that the score function is differentiable.
- An optimal test statistic may be constructed by selecting the coefficients, b's.
23 Functionals on the bridge
- When the score function is defined and differentiable, it is easy to derive the corresponding functional.
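One way to see the correspondence is summation by parts. The sketch assumes that Z_k is the value of the random bridge after k steps, with Z_0 = Z_N = 0, that D_k is the k-th anti-rank, and that the scores are a(k) = \varphi(k/(N+1)) for a differentiable function \varphi; the symbol \varphi is introduced here and is not the slide's notation.

```latex
\begin{align*}
S &= \sum_{i=1}^{N} c_i\, a(R_i)
   = \sum_{k=1}^{N} a(k)\, c_{D_k}
   = \sum_{k=1}^{N} a(k)\,\bigl(Z_k - Z_{k-1}\bigr) \\
  &= -\sum_{k=1}^{N-1} Z_k\,\bigl(a(k+1) - a(k)\bigr)
   \approx -\frac{1}{N+1} \sum_{k=1}^{N-1} \varphi'\!\left(\frac{k}{N+1}\right) Z_k .
\end{align*}
```

The last expression is a Riemann sum, so for large N the statistic behaves like the integral of -\varphi'(t) Z(t) over [0, 1], a linear functional of the bridge.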
24 Result 4: trigonometric scores estimators
- Efficient location estimator for the HSD
- Efficient scale estimator for the Cauchy distribution
- Easy to establish exact confidence level
- Easy to encode into automatic procedures
25 Numerical examples: test of location
- Normal, N(500, 100^2)
- Normal, N(580, 100^2)
           t-test           Wilcoxon         S1
p-value    0.150            0.162            0.154
95% CI     (-172.4, 28.6)   (-185.0, 25.0)   (-183.0, 25.0)
26 Numerical examples: test of scale
- Normal, N(300, 200^2)
- Normal, N(300, 100^2)
           F-test   Siegel-Tukey   S2
p-value    0.123    0.064          0.054
27 Numerical examples: combined test
- Normal, N(580, 200^2)
- Normal, N(500, 100^2)
           F-test   t-test   S1^2 + S2^2   Lepage   CM
p-value    0.021    0.174    0.018         0.035    0.010
28 Application: palette-based images
29 Application: grey-scale images
30 Application: grey-scale images, histograms
31 Application: colour images
32 Useful books
- H. Cramer. Mathematical Methods of Statistics. Princeton University Press, Princeton, 19th edition, 1999.
- G. Grimmett and D. Stirzaker. Probability and Random Processes. Oxford University Press, N.Y., 1982.
- J. Hajek, Z. Sidak and P.K. Sen. Theory of Rank Tests. Academic Press, San Diego, California, 1999.
- F. Knight. Essentials of Brownian Motion and Diffusion. AMS, Providence, R.I., 1981.
- K. Knight. Mathematical Statistics. Chapman & Hall, Boca Raton, 2000.
- J. Maritz. Distribution-free Statistical Methods. Monographs on Applied Probability and Statistics. Chapman & Hall, London, 1981.
33 Interesting papers
- J. Durbin and M. Knott. Components of Cramer-von Mises statistics. Part 1. Journal of the Royal Statistical Society, Series B, 1972.
- K.M. Hanson and D.R. Wolf. Estimators for the Cauchy distribution. In G.R. Heidbreder, editor, Maximum entropy and Bayesian methods, Kluwer Academic Publisher, Netherlands, 1996.
- N. Henze and Ya.Yu. Nikitin. Two-sample tests based on the integrated empirical processes. Communications in Statistics - Theory and Methods, 2003.
- A. Janssen. Testing nonparametric statistical functionals with application to rank tests. Journal of Statistical Planning and Inference, 1999.
- F. Rublik. A quantile goodness-of-fit test for the Cauchy distribution, based on extreme order statistics. Applications of Mathematics, 2001.
- D.C. Vaughan. The generalized secant hyperbolic distribution and its properties. Communications in Statistics - Theory and Methods, 2002.