One- and Two-Sample Tests

1
One- and Two-Sample Tests
  • Vasileios Hatzivassiloglou
  • University of Texas at Dallas

2
Planet Nibiru
  • Sample context / domain for Homework 2
  • Planet Nibiru is supposedly speeding towards
    Earth; when it arrives on December 21, 2012, the
    Earth's axis will shift / Earth's rotation will stop
  • Proof: Sumerians / Mayans / Exodus / aliens from
    Zeta Reticuli told a psychic
  • More than 300 books on Amazon.com

3
Sample YouTube comments
  • dont under estimate the power of 13 zodiac sign
    coz were the one who handles our life if we want
    to go straight then we are.. if we think in a bad
    way in what they want to proposed in year 2012
    then where all in bad ways.. im surely have those
    things,,, we want to top thinking for this for
    the serpent not to born in ourselves.... that's
    gonna be...

4
Sample YouTube comments
  • Buy/Sell (Rev 13) has nothing to do with the
    physical, but the spiritual. Once we compare
    scripture with scripture, as we are commanded to
    do by God, we find it totally identifies with
    those who share the gospel. However, in our time
    of the Great Tribulation (May 21, 1988 - May 21,
    2011), those in the churches are
    "buying/selling", bringing their own man-made
    gospels, under Satan's (The beast) authority.

5
One-sample t-test
  • We assume the two words occur with probabilities
    $p_1$ and $p_2$
  • We estimate $p_i$ as $\mathrm{count}(w_i)/N$ (maximum likelihood
    estimate)
  • Under $H_0$ (independence), $P(w_1 w_2) = p_1 p_2$
  • The numbers of occurrences of $w_1$, $w_2$, and $w_1 w_2$
    follow binomial distributions
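
A minimal sketch of these estimates, assuming the corpus is available as a list of tokens. The function name `mle_estimates` and the toy corpus are illustrative and do not come from the slides.

```python
from collections import Counter

def mle_estimates(tokens, w1, w2):
    """Return (p1, p2, p_ind, p_obs) for the bigram "w1 w2".

    p_ind is the bigram probability expected under H0 (independence);
    p_obs is the maximum likelihood estimate from the bigram count.
    Following the slides, every count is divided by the corpus size N.
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))  # adjacent word pairs
    p1 = unigrams[w1] / n
    p2 = unigrams[w2] / n
    p_ind = p1 * p2
    p_obs = bigrams[(w1, w2)] / n
    return p1, p2, p_ind, p_obs

# Toy corpus (hypothetical), 9 tokens long.
toy = "new companies hire new staff and new companies grow".split()
p1, p2, p_ind, p_obs = mle_estimates(toy, "new", "companies")
```

On a corpus this small the estimates are meaningless; the point is only the bookkeeping of unigram counts, bigram counts, and N.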

6
Estimating mean and variance
  • The probability of the collocation under
    independence is $p = p_1 p_2$
  • We estimate $\mu = p$, $s^2 = p(1-p) \approx p$,
    because for word bigrams $p$ is very small
  • The observed sample mean is $\bar{x} = \mathrm{count}(w_1 w_2)/N$
  • Then

7
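Formula for t
  • $t = \dfrac{\bar{x} - \mu}{\sqrt{s^2 / N}}$, where $\bar{x}$ is the
    observed sample mean, $\mu$ the mean under $H_0$, $s^2$ the
    estimated variance, and $N$ the sample size
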
8
Example one-sample test
  • In a corpus of 14,307,668 words,
  • new appears 15,828 times
  • companies appears 4,675 times
  • new companies appears 8 times
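  • $p_1 = c(\text{new})/N \approx 0.0011$
  • $p_2 = c(\text{companies})/N \approx 3.267 \times 10^{-4}$
  • $p_{\text{ind}} = p_1 p_2 \approx 3.615 \times 10^{-7}$
  • $p_{\text{obs}} = c(\text{new companies})/N \approx 5.591 \times 10^{-7}$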

9
Example one-sample test
  • t statistic
  • Is this significant? Critical values for t
    (infinite degrees of freedom):
  • $\alpha = 0.0005$ (99.9% area): $t = 3.291$
  • $\alpha = 0.005$ (99% area): $t = 2.576$
  • $\alpha = 0.01$ (98% area): $t = 2.326$
  • $\alpha = 0.05$ (90% area): $t = 1.645$
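
The example can be checked numerically. The sketch below is not the slides' own computation: the transcript does not show the t value or which estimate of $s^2$ was plugged in, so both candidates are evaluated (the observed bigram probability and the independence probability). With infinite degrees of freedom the t distribution is the standard normal, so the critical values are recomputed with `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

# Counts from the example slide.
N = 14_307_668
c_new, c_companies, c_bigram = 15_828, 4_675, 8

p_ind = (c_new / N) * (c_companies / N)  # mean under H0 (independence)
p_obs = c_bigram / N                     # observed sample mean

def one_sample_t(x_bar, mu, s2, n):
    """t = (x_bar - mu) / sqrt(s2 / n)."""
    return (x_bar - mu) / math.sqrt(s2 / n)

# Assumption: s^2 ~ p, where p is either the observed or the H0 probability.
t_obs_var = one_sample_t(p_obs, p_ind, p_obs, N)
t_ind_var = one_sample_t(p_obs, p_ind, p_ind, N)

# One-tailed critical values of the standard normal for each alpha.
critical = {a: NormalDist().inv_cdf(1 - a) for a in (0.0005, 0.005, 0.01, 0.05)}

significant = max(t_obs_var, t_ind_var) > critical[0.05]
```

Under either variance estimate the statistic (about 1.0 and about 1.24 respectively) stays below 1.645, so independence of "new" and "companies" cannot be rejected even at $\alpha = 0.05$.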

10
Two-sample t-test
  • Given samples with estimated means $\bar{x}_1$ and $\bar{x}_2$,
    and estimated standard deviations $s_1$ and $s_2$, are
    the true means the same?

11
Derivation of two-sample formula
  • We are comparing the distribution of the
    difference of two random variables to the
    hypothesized mean 0
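  • Mean of the difference: $E[X_1 - X_2] = E[X_1] - E[X_2] = \mu_1 - \mu_2$
  • Variance of the difference:
    $\mathrm{Var}(X_1 - X_2) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) - 2\,\mathrm{Cov}(X_1, X_2)$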

12
Alternate formula for variance
  • Recall that for any random variable X,
    $\mathrm{Var}(X) = E[X^2] - (E[X])^2$

13
Variance of difference
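
The formula on this slide is missing from the transcript. A sketch of the standard derivation, using the alternate variance formula $\mathrm{Var}(X) = E[X^2] - (E[X])^2$ and writing $X_1$, $X_2$ for the two random variables with means $\mu_1$, $\mu_2$ (symbol names assumed here):

```latex
\begin{align*}
\mathrm{Var}(X_1 - X_2)
  &= E[(X_1 - X_2)^2] - (E[X_1 - X_2])^2 \\
  &= E[X_1^2] - 2E[X_1 X_2] + E[X_2^2] - (\mu_1 - \mu_2)^2 \\
  &= (E[X_1^2] - \mu_1^2) + (E[X_2^2] - \mu_2^2) - 2(E[X_1 X_2] - \mu_1 \mu_2) \\
  &= \mathrm{Var}(X_1) + \mathrm{Var}(X_2) - 2\,\mathrm{Cov}(X_1, X_2)
\end{align*}
```

When the two variables are independent the covariance term vanishes and the variances simply add.
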
14
Two-sample t statistic (paired samples)
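  • The t statistic is $t = \dfrac{\bar{d}}{s_d / \sqrt{N}}$, where
    $\bar{d}$ and $s_d$ are the mean and standard deviation of
    the $N$ pairwise differences
  • It follows the same t distribution with $N-1$
    degrees of freedom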
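
A minimal sketch of the paired test in its standard textbook form: it reduces to a one-sample test on the pairwise differences against a hypothesized mean of 0. The function name `paired_t` and the sample values are illustrative, not from the slides.

```python
import math
from statistics import mean, stdev

def paired_t(xs, ys):
    """Return (t, degrees_of_freedom) for paired samples xs[i], ys[i]."""
    if len(xs) != len(ys):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    # stdev() is the sample standard deviation (divides by n - 1).
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical measurements on the same four items.
t_paired, df_paired = paired_t([5, 7, 9, 6], [3, 4, 8, 5])
```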

15
Two-sample t statistic (unpaired samples)
  • The t statistic is
    $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2 / N_1 + s_2^2 / N_2}}$
  • The number of degrees of freedom is a variance-weighted
    average of $N_1$ and $N_2$ (minus one)
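
A sketch of the unpaired test, assuming the usual Welch form of the statistic. The transcript does not show the exact degrees-of-freedom formula, so the Welch-Satterthwaite approximation is used here as the standard reading of "variance-weighted average". The function name `welch_t` and the numbers are illustrative.

```python
import math

def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t, df) for two independent samples with unequal variances."""
    a = var1 / n1  # squared standard error of the first mean
    b = var2 / n2  # squared standard error of the second mean
    t = (mean1 - mean2) / math.sqrt(a + b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

# Hypothetical summary statistics.
t_welch, df_welch = welch_t(10.0, 4.0, 16, 8.0, 9.0, 9)

# With equal variances and equal sizes, df reduces to 2 * (n - 1).
_, df_equal = welch_t(1.0, 2.0, 10, 0.0, 2.0, 10)
```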

16
Applying the two-sample test
  • Consider two words $v_1$ and $v_2$: which other words $w$
    behave most differently when associating
    with $v_1$ versus $v_2$?
  • Example: strong versus powerful
  • We calculate $P(v_1 w) = \mathrm{count}(v_1 w)/N$,
    $P(v_2 w) = \mathrm{count}(v_2 w)/N$, $s_i^2 \approx P(v_i w)$
  • This is an example of paired samples, $N_1 = N_2 = N$

17
Two-sample t formula
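
The formula on this slide is missing from the transcript. One way to carry out the comparison, assuming the two variances simply add (any covariance between the two samples is ignored) and $s_i^2 \approx P(v_i w)$, is sketched below. Under those assumptions the statistic simplifies to $(c_1 - c_2)/\sqrt{c_1 + c_2}$, where $c_i = \mathrm{count}(v_i w)$. The function name and the counts are hypothetical.

```python
import math

def collocation_diff_t(count_v1w, count_v2w, n):
    """t statistic for how differently w associates with v1 versus v2."""
    if count_v1w + count_v2w == 0:
        raise ValueError("w never occurs with v1 or v2")
    p1 = count_v1w / n  # P(v1 w)
    p2 = count_v2w / n  # P(v2 w)
    # Assumption: s_i^2 ~ P(v_i w) and N1 = N2 = N, so the variance of
    # the difference of means is (p1 + p2) / n.
    return (p1 - p2) / math.sqrt((p1 + p2) / n)

# Hypothetical counts: w follows v1 ten times and never follows v2.
t_diff = collocation_diff_t(10, 0, 1_000_000)
```

Because N cancels, the ranking of candidate words $w$ depends only on the two bigram counts.
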
18
T-test assumptions
  • In both the one- and two-sample cases, the t-test
    relies on distributional assumptions
  • Specifically, the population(s) are assumed to be
    normally distributed
  • This may not be correct with our text data

19
Reading
  • Section 5.3.1 on the one-sample t-test
  • Section 5.3.2 on the two-sample t-test