NonParametric Tests - PowerPoint PPT Presentation

1
Non-Parametric Tests
  • Vasileios Hatzivassiloglou
  • University of Texas at Dallas

2
Non-parametric tests
  • Non-parametric methods remove distributional
    assumptions
  • Often such methods are based on
    • combinatorial analysis and/or
    • comparisons (greater than/less than) between
      two variables
  • Tradeoff
    • Less power/sensitivity than comparable parametric
      methods at the same sample size

3
The sign test
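  • Given paired samples (xi, yi) from two populations,
    are the means the same?
    • Strictly, the sign test concerns the median of the
      differences xi - yi, which coincides with the mean
      when the differences are symmetrically distributed
  • Count cases where xi > yi
  • The distribution of the number of such cases is
    binomial
  • If the means are the same, p would be 0.5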
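
The counting procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the function name `sign_test_p_value` is hypothetical, the test is taken to be two-sided, and pairs with xi = yi are simply dropped (an assumption made here for brevity).

```python
from math import comb


def sign_test_p_value(xs, ys):
    """Two-sided sign test for paired samples (ties dropped, by assumption)."""
    wins = sum(1 for x, y in zip(xs, ys) if x > y)
    losses = sum(1 for x, y in zip(xs, ys) if x < y)
    n = wins + losses
    if n == 0:
        return 1.0  # no informative pairs, so no evidence against p = 0.5
    k = min(wins, losses)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Eight pairs in which x is always larger: strong evidence against p = 0.5
p = sign_test_p_value([2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5, 6, 7, 8])
print(p)
```

A small p-value suggests that the count of xi > yi cases is unlikely under a binomial distribution with p = 0.5.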

4
Handling matches
  • There is no provision in the binomial
    distribution for xi = yi
  • Two possible solutions
    • Ignore such cases (reducing the total number of
      samples, and thus the power of the test)
    • Assign half of them to each outcome
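
The two options for matched pairs can be compared side by side. The sketch below is illustrative only: `sign_counts` and its `ties` argument are hypothetical names, and the "split" option may produce half-counts when the number of ties is odd.

```python
def sign_counts(xs, ys, ties="drop"):
    """Return (wins, losses) for a sign test under a chosen tie policy."""
    pairs = list(zip(xs, ys))
    wins = sum(1 for x, y in pairs if x > y)
    losses = sum(1 for x, y in pairs if x < y)
    tied = sum(1 for x, y in pairs if x == y)
    if ties == "drop":
        # Ignore ties: the effective sample size shrinks to wins + losses
        return wins, losses
    if ties == "split":
        # Assign half of the ties to each outcome: sample size is preserved
        return wins + tied / 2, losses + tied / 2
    raise ValueError("ties must be 'drop' or 'split'")


xs = [3, 1, 2, 2, 5]
ys = [1, 3, 2, 2, 4]
print(sign_counts(xs, ys, ties="drop"))
print(sign_counts(xs, ys, ties="split"))
```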

5
The chi-square test
  • Consider two random variables with categorical
    outcomes (each has a finite set of values)
  • Then, we can form a contingency table that
    contains the various combinations of the outcomes
    in the data
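
As a rough sketch of how such a table might be built from raw data, the following counts each combination of outcomes. The function name `contingency_table` and the toy observations are invented for illustration.

```python
from collections import Counter


def contingency_table(pairs):
    """Count each (row outcome, column outcome) combination in a list of pairs."""
    counts = Counter(pairs)
    rows = sorted({a for a, _ in pairs})
    cols = sorted({b for _, b in pairs})
    # Counter returns 0 for combinations that never occur in the data
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


# Made-up observations of two categorical variables
observations = [("a", "x"), ("a", "y"), ("b", "x"), ("a", "x")]
rows, cols, table = contingency_table(observations)
print(rows, cols, table)
```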

6
Example contingency table

7
Expected cell values
  • Under the assumption of independence between the
    two random variables (rows and columns), we can
    calculate the expected frequency of any
    combination from the marginal frequencies of the
    outcomes
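
Under independence, the expected count for cell (i, j) is the product of the i-th row total and the j-th column total, divided by the grand total N. A minimal sketch, with the hypothetical function name `expected_counts` and a made-up 2x2 table:

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Made-up observed counts
observed = [[10, 20], [30, 40]]
print(expected_counts(observed))
```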

8
Pearson's chi-square statistic
  • The chi-square statistic X2 sums the squared differences
    between observed values Oij and expected values
    Eij, each divided by Eij, across all cells:
    X^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}
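
A minimal sketch of the computation, assuming the standard form of Pearson's statistic: each cell's expected count comes from the margins, and the cell contributes (O - E)^2 / E to the total. The function name `pearson_chi_square` and the example table are illustrative, and every row and column is assumed to have a non-zero total.

```python
def pearson_chi_square(table):
    """Pearson's X2 for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / n
            x2 += (observed - expected) ** 2 / expected
    return x2


print(pearson_chi_square([[10, 20], [30, 40]]))
print(pearson_chi_square([[10, 20], [20, 40]]))  # rows proportional: X2 is 0
```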

9
Distribution of X2 statistic
  • With certain assumptions, the distribution of X2
    asymptotically approaches a χ2 distribution with
    density
    f(x; k) = \frac{x^{k/2 - 1} e^{-x/2}}{2^{k/2} \Gamma(k/2)}, x > 0
  • k is the single parameter (degrees of freedom);
    the mean is k and the variance is 2k
  • In our case k is (the number of rows minus one)
    times (the number of columns minus one)
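
For reference, the χ2 density with k degrees of freedom is x^(k/2 - 1) e^(-x/2) / (2^(k/2) Γ(k/2)) for x > 0. The sketch below evaluates it with the standard library and computes k for a table; the function names are hypothetical, and the density at exactly x = 0 is treated as 0 for simplicity.

```python
from math import exp, gamma


def chi2_density(x, k):
    """Density of the chi-square distribution with k degrees of freedom."""
    if x <= 0:
        return 0.0  # simplification: the boundary value at x = 0 is not modelled
    return x ** (k / 2 - 1) * exp(-x / 2) / (2 ** (k / 2) * gamma(k / 2))


def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom for an n_rows x n_cols contingency table."""
    return (n_rows - 1) * (n_cols - 1)


print(degrees_of_freedom(3, 4))
print(chi2_density(2.0, 2))
```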

10
The gamma function
  • Extends the factorial function to non-integers:
    Γ(z + 1) = z! for integer z ≥ 0
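
The relationship between the gamma function and the factorial can be checked numerically with Python's `math.gamma`. The helper name `gamma_matches_factorial` is made up for this illustration.

```python
from math import factorial, gamma, isclose, pi, sqrt


def gamma_matches_factorial(max_z):
    """Check gamma(z + 1) == z! for every integer z from 0 to max_z."""
    return all(isclose(gamma(z + 1), factorial(z)) for z in range(max_z + 1))


print(gamma_matches_factorial(10))
# A non-integer argument: gamma(1/2) equals the square root of pi
print(isclose(gamma(0.5), sqrt(pi)))
```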

11
X2 Approximation Assumptions
  • A reasonable number of samples (N ≥ 50)
  • A reasonable expected value for each cell (Eij ≥ 5)
  • Of the two constraints
    • the second is much harder to satisfy because it
      depends on the marginal probabilities
    • it is sometimes relaxed to Eij ≥ 5 in 80% of the cells
      and Eij ≥ 1 elsewhere (for large tables)
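
These rules of thumb can be checked mechanically before the χ2 approximation is trusted. The sketch below is only an illustration: the function name and result keys are hypothetical, and the thresholds (50, 5, 80%, 1) are read as "at least" conditions, which is an assumption about the slide's lost comparison symbols.

```python
def check_chi_square_assumptions(table, min_n=50):
    """Report which rule-of-thumb conditions a table of observed counts meets."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [r * c / n for r in row_totals for c in col_totals]
    n_large = sum(1 for e in expected if e >= 5)
    return {
        "sample_size_ok": n >= min_n,
        # Strict rule: every expected count is at least 5
        "strict_cells_ok": n_large == len(expected),
        # Relaxed rule: at least 80% of cells reach 5 and none falls below 1
        "relaxed_cells_ok": 5 * n_large >= 4 * len(expected)
        and all(e >= 1 for e in expected),
    }


print(check_chi_square_assumptions([[10, 20], [30, 40]]))
print(check_chi_square_assumptions([[1, 2], [3, 4]]))
```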

12
Applying Pearson's chi-square
  • Collocation discovery, as mentioned earlier
  • Other cases of association measurement
  • Example
    • Bilingual word mapping in machine translation

13
Aligned French-English words
  • Uses an aligned corpus at the sentence level
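
One way a sentence-aligned corpus might feed a chi-square test is sketched below: for a candidate French-English word pair, count the aligned sentence pairs that contain both words, only one of them, or neither, then score the resulting 2x2 table. Everything here is an assumption for illustration (the function names, the whitespace tokenization, and the four-sentence toy corpus); the slide's own table is not preserved in the transcript.

```python
def cooccurrence_table(aligned_pairs, french_word, english_word):
    """2x2 counts over aligned sentence pairs: [[both, f only], [e only, neither]]."""
    both = f_only = e_only = neither = 0
    for french, english in aligned_pairs:
        has_f = french_word in french.split()
        has_e = english_word in english.split()
        if has_f and has_e:
            both += 1
        elif has_f:
            f_only += 1
        elif has_e:
            e_only += 1
        else:
            neither += 1
    return [[both, f_only], [e_only, neither]]


def chi_square_2x2(table):
    """Pearson's X2 for a 2x2 table, using the closed form N(ad - bc)^2 / margins."""
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


# Invented toy corpus, far too small for the X2 approximation to be reliable
corpus = [
    ("le chat dort", "the cat sleeps"),
    ("le chien dort", "the dog sleeps"),
    ("le chat mange", "the cat eats"),
    ("un chien mange", "a dog eats"),
]
print(chi_square_2x2(cooccurrence_table(corpus, "chat", "cat")))     # 4.0
print(chi_square_2x2(cooccurrence_table(corpus, "chat", "sleeps")))  # 0.0
```

Higher scores indicate word pairs that co-occur in aligned sentences more often than independence would predict, which makes them candidate translations.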

14
Reading
  • Section 5.3.3 on chi-square tests