Non-parametric methods

Provided by: janl4

1
Non-parametric methods
The t-test (and similar tests) tests hypotheses about the parameters of a distribution (in the t-test, about µ as a parameter of the normal distribution).
There are other approaches too.
2
What to do if the data are not normally distributed, and the departure from normality is so large that I cannot rely on the robustness of the test?

There are transformations that improve normality and homoscedasticity (we will go through them later).
If the data follow a distribution that can be approximated by one of a few selected types of distribution, special methods developed for them can be used (generalized linear models).
We use non-parametric tests.

3
Non-parametric methods

Most often:
Permutation tests (commonly called randomization tests)
Rank-based tests

4
Permutation tests

Basic idea (for the t-test):
The achieved significance level is the probability that I would get samples this different just by chance if they came from one population. So I can try it: I put all the observations from both groups together and then randomly assign their group membership (e.g. by drawing from a hat or with a computer random number generator).

5
And so on, at least a thousand times.
I look at how often t from the randomly generated groups is bigger than t from the data.
So, I try to simulate it here.
I don't trust this P, as I don't know whether the assumptions are fulfilled.
6
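The achieved significance level (P) is then computed as
$P = \frac{\text{number of random permutations where } t_{permut} > t_{data}}{\text{total number of random permutations}}$
that is, the share of random permutations in which the result was more extreme than in the data.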
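The procedure from the last three slides can be sketched in Python roughly as follows. This is only an illustration under my own assumptions: the function names, the pooled-variance t statistic, the use of absolute values (a two-tailed test), counting ties with the data as "at least as extreme", and the example numbers are all made up and are not taken from the slides.

```python
import random
from math import sqrt
from statistics import mean, variance

def t_statistic(a, b):
    """Classic two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))

def permutation_p(a, b, n_perm=1000, seed=0):
    """Share of random regroupings whose |t| is at least as large as in the data."""
    rng = random.Random(seed)          # fixed seed only to make the sketch repeatable
    t_data = abs(t_statistic(a, b))
    pooled = list(a) + list(b)         # put all observations together
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)            # randomly reassign group membership
        t_perm = abs(t_statistic(pooled[:len(a)], pooled[len(a):]))
        if t_perm >= t_data:
            hits += 1
    return hits / n_perm

# Hypothetical measurements, for illustration only
group_a = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5]
group_b = [7.9, 8.3, 7.4, 8.8, 8.1, 7.7]
print(permutation_p(group_a, group_b))
```

With only a thousand permutations the result is itself an estimate, so repeated runs with different seeds will give slightly different P values.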
7
Attention

I test the hypothesis that both samples come from one and the same population. If I want to interpret the test as a location test, I have to add the assumption that both populations have the same distribution shape. If they then differ, they can differ only in the location parameter.

8
Rank-based tests

Basic idea: we don't know what the distribution is, so we forget the real values and replace them with their ranks.
Many parametric methods have their non-parametric counterparts.

9
Mann-Whitney test: non-parametric analogue of the two-sample t-test

All values from both samples are ranked together (so they get numbers from 1 to n, where $n = n_1 + n_2$).
It doesn't matter whether the ranking is done from the top or from the bottom, but I must pay attention to it if one-tailed tests are used.

10
Compute U: it gives an especially high value if the ranks in the first group are low,
or U': it gives an especially high value if the ranks in the second group are low.
It holds that $U + U' = n_1 n_2$.
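The formulas for the two statistics are not reproduced in this transcript. A common textbook form that matches the description above (high value when the ranks of the respective group are low, and the two values adding up to n1·n2) is U = n1·n2 + n1(n1+1)/2 − R1, where R1 is the sum of ranks in the first group, and analogously for U'. The sketch below assumes exactly this form and ranking from the smallest value upward; the function names and the data are made up.

```python
from collections import defaultdict

def average_ranks(values):
    """Rank from 1 (smallest) to n; tied values share the mean of their ranks."""
    positions = defaultdict(list)
    for position, value in enumerate(sorted(values), start=1):
        positions[value].append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def mann_whitney_u(sample1, sample2):
    """Return (U, U') using the assumed textbook formulas."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))  # rank both samples together
    r1 = sum(ranks[:n1])   # rank sum of the first group
    r2 = sum(ranks[n1:])   # rank sum of the second group
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u_prime = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u, u_prime

# Hypothetical, completely separated samples
print(mann_whitney_u([1, 2, 3], [4, 5, 6, 7]))  # (12.0, 0.0)
```

In the example the first group holds the three lowest ranks, so U reaches its maximum n1·n2 = 12 and U' is 0.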
11
H0: Male and female students have the same height.
HA: Male and female students do not have the same height.
[Table: height of males, height of females, rank of male heights, rank of female heights]
Conclusion: we reject H0.
Mann-Whitney test for non-parametric testing of the two-tailed hypothesis that there is no difference between the heights of male and female students.
12
Attention
All sorts of values are tabulated, so pay attention to what is tabulated and how. Statistica prints "2*1sided exact p" (if I want a one-tailed test and the deviation goes in the right direction, I divide it by two).
13
Normal approximation: if there is a large number of observations, it holds that
$Z = (U - \mu_U) / \sigma_U$ has a nearly normal distribution. It is then easy to find the corresponding p. Statistica prints it. Attention: if I have the exact p, this value is no longer of interest.
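The slide does not spell out the mean and standard deviation of U. The sketch below assumes the usual large-sample values under the null hypothesis, µ_U = n1·n2/2 and σ_U = sqrt(n1·n2·(n1+n2+1)/12), without any correction for ties or continuity. The function name is made up.

```python
from math import erfc, sqrt

def mann_whitney_z(u, n1, n2):
    """Normal approximation for U (no tie or continuity correction)."""
    mu_u = n1 * n2 / 2
    sigma_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu_u) / sigma_u
    p_two_tailed = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_two_tailed

# Hypothetical value of U for two samples of 10 observations each
z, p = mann_whitney_z(u=85, n1=10, n2=10)
print(round(z, 3), round(p, 4))
```

For a one-tailed test with the deviation in the expected direction, the two-tailed p would be halved, as noted for the exact p above.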
14
Similar to permutation test

Even M-W has its assumptions.
It is either a test of the null hypothesis that the samples come from the same population,
or, if it is formulated as a location test, there is an assumption that the samples have the same distribution shape.

15
It is thus absurd to write:

"As we did not have homogeneity of variances, we had to use a non-parametric test."
1. Testing whether it is the same population, when I have already demonstrated inhomogeneity of variance, doesn't make any sense.
2. For a location test, inhomogeneity of variance is the same problem for M-W as for the t-test.

16
Another assumption: the data can be ranked.
Tied values get averaged ranks. This deviation from the original assumption can cause problems; some tests use a correction for ties.
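How tied values get averaged ranks can be shown with a few lines of Python. The function name and the numbers are made up for illustration.

```python
from collections import defaultdict

def ranks_with_ties(values):
    """Rank from 1 (smallest) to n; tied values share the mean of their ranks."""
    positions = defaultdict(list)
    for position, value in enumerate(sorted(values), start=1):
        positions[value].append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]

# The two values equal to 1 would occupy ranks 1 and 2, so each gets 1.5
print(ranks_with_ties([3, 1, 4, 1, 5]))  # [3.0, 1.5, 4.0, 1.5, 5.0]
```

Averaging keeps the total of all ranks at n(n+1)/2, here 15.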
17
Median test

I compute the median of all observations and count how many observations in each group lie above and how many below this median. I then analyse the counts with a classic 2 x 2 table. So it is a test about the overall median, and it has no further assumptions, but it is very weak.
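A minimal sketch of this procedure, under assumptions of my own: observations exactly equal to the overall median are left out of the table, and the 2 x 2 table is analysed with a plain chi-square statistic with one degree of freedom and no continuity correction (an exact test on the table would be another option). The function names and data are made up.

```python
from math import erfc, sqrt
from statistics import median

def median_test_table(sample1, sample2):
    """Rows = groups, columns = (count above, count below) the overall median."""
    overall_median = median(list(sample1) + list(sample2))
    table = []
    for sample in (sample1, sample2):
        above = sum(1 for x in sample if x > overall_median)
        below = sum(1 for x in sample if x < overall_median)
        table.append((above, below))
    return table

def chi_square_2x2(table):
    """Chi-square statistic and p (1 df); assumes no row or column total is zero."""
    (a, b), (c, d) = table
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 degree of freedom
    return chi2, p

# Hypothetical data
table = median_test_table([1, 2, 3, 4], [5, 6, 7, 8])
print(table)                  # [(0, 4), (4, 0)]
print(chi_square_2x2(table))
```

Only the side of the median is used, not the distance from it, which is one way to see why the test is weak.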

18
Wilcoxon test

Analogue of the paired t-test.
Attention: several tests are called Wilcoxon, so it is sometimes written as the Wilcoxon test for paired observations.

19
Wilcoxon test

First, we compute the differences between the paired observations, then we rank them by the size of their absolute value, from the smallest to the largest. After that we sum the ranks of the positive differences and the ranks of the negative differences (denoted $T_+$ and $T_-$). (As the sum of the integers from 1 to n is $n(n+1)/2$, we can easily compute $T_+ = n(n+1)/2 - T_-$.)

Thus, the test reflects both the number and the size of the positive and negative differences.
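The steps above can be sketched in Python as follows. The function name and the data are made up; dropping pairs with a zero difference and averaging the ranks of tied absolute differences are my assumptions, not something stated on the slide.

```python
from collections import defaultdict

def wilcoxon_t(first, second):
    """Return (T+, T-): rank sums of positive and negative paired differences."""
    diffs = [a - b for a, b in zip(first, second) if a != b]  # zero differences dropped
    abs_diffs = [abs(d) for d in diffs]
    # Rank the absolute differences from smallest to largest, averaging ties
    positions = defaultdict(list)
    for position, value in enumerate(sorted(abs_diffs), start=1):
        positions[value].append(position)
    ranks = [sum(positions[v]) / len(positions[v]) for v in abs_diffs]
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return t_plus, t_minus

# Hypothetical paired measurements (integers, to avoid floating-point ties)
first = [142, 140, 144, 144, 142]
second = [138, 136, 147, 139, 143]
print(wilcoxon_t(first, second))  # (12.0, 3.0)
```

Here the differences are 4, 4, -3, 5, -1, so T+ + T- = 15 = 5·6/2, in line with the identity above.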
20
H0: The length of the foreleg and the hind leg is the same in roe-deer.
HA: The length of the foreleg and the hind leg is not the same in roe-deer.
[Table: roe-deer, hind leg length, foreleg length, difference, rank, signed rank]
Conclusion: H0 is rejected.
Wilcoxon paired test applied to data on roe-deer leg lengths.
21
The normal approximation can be used again (for large samples); from it we compute Z.
Attention: Statistica shows just the normal approximation and does not print the exact p; look for it in tables if needed. Tables can be found here:
http://fsweb.berry.edu/academic/education/vbissonnette/tables/wilcox_t.pdf
The test assumes a symmetric distribution of the differences.
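The approximation formulas are not reproduced in this transcript. The sketch below assumes the usual large-sample mean and standard deviation of T under the null hypothesis, µ_T = n(n+1)/4 and σ_T = sqrt(n(n+1)(2n+1)/24), where n is the number of non-zero differences, and applies no tie or continuity correction. The function name and the numbers are made up.

```python
from math import erfc, sqrt

def wilcoxon_z(t_plus, n):
    """Normal approximation for the Wilcoxon paired test (assumed standard formulas)."""
    mu_t = n * (n + 1) / 4
    sigma_t = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (t_plus - mu_t) / sigma_t
    p_two_tailed = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_two_tailed

# Hypothetical: 20 non-zero differences with T+ = 160
z, p = wilcoxon_z(t_plus=160, n=20)
print(round(z, 3), round(p, 4))
```

Because T+ and T- are symmetric around µ_T, using T- instead of T+ only flips the sign of Z.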
22
Sign test
Compares the numbers of positive and negative differences. It has no assumptions, but it is very weak.
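One common way to evaluate the sign test is an exact two-tailed binomial test with probability 0.5 for each sign; the slide does not say how the counts are evaluated, so this choice, the function name, and the example counts are my assumptions.

```python
from math import comb

def sign_test_p(n_positive, n_negative):
    """Exact two-tailed p: binomial with p = 0.5, zero differences already excluded."""
    n = n_positive + n_negative
    k = min(n_positive, n_negative)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n  # P(X <= k)
    return min(1.0, 2 * tail)

# Hypothetical counts: 9 positive and 1 negative difference
print(sign_test_p(9, 1))  # 0.021484375
```

Only the signs enter the calculation, so a difference of 0.1 counts the same as a difference of 100.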
23
Non-parametric tests

If the assumptions of a parametric test are fulfilled, non-parametric tests are weaker than the corresponding parametric test.
The common idea that non-parametric tests have no assumptions is not true.
Generally, the more observations I have, the more robust parametric tests tend to be to violations of their assumptions.
The stronger the assumptions that are fulfilled, the more powerful the test I can usually use.