Chi2 Tests - PowerPoint PPT Presentation

Provided by: johnt1

Transcript and Presenter's Notes

1
Chi2 Tests
  • Recall that the Chi2 distn with df = N can be
    defined as the sum of squares of N ind. standard
    normals (p. 252)
  • The Chi2 distn is a special form of the Gamma
    distn (p. 198)
  • Write as χ²_N
  • Mean(χ²_N) = N
  • SD(χ²_N) = √(2N)
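
The definition above can be checked by simulation. The following is a minimal Python sketch, not part of the original slides: it sums the squares of N independent standard normals many times and compares the sample mean and SD with N and √(2N). The choice N = 6, the seed, and the number of draws are assumptions made for illustration.

```python
import random
import statistics

def chi2_draw(n, rng):
    """One Chi2 draw with df = n: sum of squares of n ind. standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))

N = 6                    # degrees of freedom (illustrative choice)
rng = random.Random(0)   # fixed seed so the run is repeatable
draws = [chi2_draw(N, rng) for _ in range(20000)]

sample_mean = statistics.fmean(draws)
sample_sd = statistics.stdev(draws)

print("mean:", sample_mean, "theory:", N)
print("SD:  ", sample_sd, "theory:", (2 * N) ** 0.5)
```

Both simulated values should land close to the theoretical ones, with small differences due to sampling noise.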

2
Chi2 Tests
  • To compare two things (means or proportions), we
    can subtract
  • To compare more than 2 things, we can't use
    subtraction
  • Can use the sum of squares
  • Sum of squares is used in the sample SD
    (variance) to measure how spread out the data
    is
  • We can use SS to measure how spread out several
    probabilities are, for instance

3
Chi2 Tests
  • A number of χ² tests use the following formula
  • Σ (obs-exp)²/exp
  • For a number of different reasons, this has
    (approximately) a χ²_N distn
  • Often a result of the Central Limit Thm causing
    obs to be (approx) normal
  • The exact defn of obs and exp varies from problem
    to problem

4
Chi2 Tests
  • In SM239, we developed the distn for the
    difference of two proportions (normal)
  • Suppose we need to extend this to determining
    whether 3 or more probabilities are the same
  • Need χ²

5
Chi2 Tests
  • See p. 472, Ex29 and Fig 10.28
  • Compare level of allergic reaction for 3 drugs

6
Chi2 Tests
7
Chi2 Tests
  • Matlab note
  • The Excel files have the data in one long column
  • Call it x
  • x2=zeros(4,3); x2(:)=x
  • This will make it into a matrix
  • BE SURE TO USE THE RIGHT SHAPE
  • If we had used x2=zeros(3,4), it would have
    produced a matrix, but the wrong one.
  • The first argument is how often each category is
    repeated
  • The second argument is how many categories
    (Drugs) there are
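
The shape warning above can be illustrated with a small Python sketch that mimics Matlab's column-by-column fill. This is an illustration only: the helper name reshape_column_major is hypothetical, and the ordering of x (the four reaction-level counts for the first drug, then the second, then the third, using the counts from the obs table on slide 9) is an assumption about how the Excel column is laid out.

```python
def reshape_column_major(x, nrows, ncols):
    """Fill an nrows-by-ncols table column by column, like Matlab's x2(:) = x."""
    if len(x) != nrows * ncols:
        raise ValueError("length of x must equal nrows * ncols")
    return [[x[c * nrows + r] for c in range(ncols)] for r in range(nrows)]

# Assumed layout: 4 reaction-level counts for Drug A, then Drug B, then Drug C.
x = [11, 30, 36, 23, 8, 31, 25, 36, 13, 28, 28, 31]

right = reshape_column_major(x, 4, 3)  # one column per drug
wrong = reshape_column_major(x, 3, 4)  # no error, but drugs are mixed together

for row in right:
    print(row)
print()
for row in wrong:
    print(row)
```

With the (4,3) shape each column holds one drug's counts; with (3,4) the call still succeeds, which is why the mistake is easy to miss.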

8
Chi2 Tests
  • H0: Reaction levels are the same for each drug
  • Since we had equal numbers of the 3 drugs, we
    would expect equal numbers in each level of
    reaction
  • Does not say that each level occurs ¼ of the time
  • Expected should be 1/3 of the total in each
    category, since 1/3 of the people were given each
    drug

9
Chi2 Tests
  • >> obs=[11 30 36 23
  •         8 31 25 36
  •        13 28 28 31]
  • >> csum=sum(obs)
  • >> rsum=(sum(obs'))'
  • >> expt=rsum*csum/sum(rsum)

10
Chi2 Tests
  • Sum(rsum) is the total, 300 in this case
  • Rsum/N is the fraction in each row
  • Mult by Csum to get expected in each cell
  • Rsum, Csum are the marginals (often written in
    the margins)
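
The marginal calculation described above can be sketched in plain Python as follows. This is an illustrative translation, not the course code; the variable names follow the slides, and the nested-list layout (rows are drugs, columns are reaction levels) is taken from the obs table on slide 9.

```python
obs = [[11, 30, 36, 23],
       [8, 31, 25, 36],
       [13, 28, 28, 31]]

rsum = [sum(row) for row in obs]         # row totals (one per drug)
csum = [sum(col) for col in zip(*obs)]   # column totals (one per reaction level)
total = sum(rsum)                        # grand total

# expected count in a cell = (row total) * (column total) / (grand total)
expt = [[r * c / total for c in csum] for r in rsum]

print("rsum:", rsum, "csum:", csum, "total:", total)
for row in expt:
    print([round(e, 4) for e in row])
```

Because each drug was given to the same number of people, every row of expt is identical here.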

11
Chi2 Tests
  • Then compute χ²
  • >> c2a=(obs-expt).^2./expt
  • c2a =
  • 0.0104 0.0037 1.3521 1.6333
  • 0.6667 0.0599 0.7341 1.2000
  • 0.5104 0.0936 0.0936 0.0333
  • >> sum(sum(c2a))
  • 6.3912
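
The same statistic can be reproduced outside Matlab. The sketch below is a plain-Python version under the same setup as the slides; the function name chi2_terms is hypothetical and simply returns the table of (obs-expt)²/expt terms.

```python
def chi2_terms(obs):
    """Return the table of (obs - expt)^2 / expt terms for an r-by-c table."""
    rsum = [sum(row) for row in obs]
    csum = [sum(col) for col in zip(*obs)]
    total = sum(rsum)
    terms = []
    for r, row in zip(rsum, obs):
        expt_row = [r * c / total for c in csum]
        terms.append([(o - e) ** 2 / e for o, e in zip(row, expt_row)])
    return terms

obs = [[11, 30, 36, 23],
       [8, 31, 25, 36],
       [13, 28, 28, 31]]

c2a = chi2_terms(obs)
chi2 = sum(sum(row) for row in c2a)

for row in c2a:
    print([round(t, 4) for t in row])
print(round(chi2, 4))
```

The printed total should agree with the 6.3912 shown on the slide.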

12
Chi2 Tests
  • Exercises compute Chi2
  • P. 472, Fig 10.29-30

13
Chi2 Tests
  • Need df
  • For these problems, df = (#rows - 1)(#cols - 1)
  • df = (3-1)(4-1) = 6
  • Can think of it as the number of relevant numbers
    that can vary in the problem
  • Margins are not relevant
  • Also, one row and one col cannot vary (determined
    by margins)
  • Or just learn the formula

14
Chi2 Tests
  • To compute p-value, have to determine what "more
    extreme" means
  • If obs ≈ expt, then it appears that H0 is true
  • This would be when χ² is small
  • More extreme would be larger values
  • I.e., prob (6.3912 or greater)

15
Chi2 Tests
  • function y=chiprob(df,lo,hi)
  • >> chiprob(6,6.3912,99)
  • 0.3808
  • This is not a small probability
  • Would not reject H0
  • Data does not dispute the reaction levels being
    the same for all 3 drugs
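
Since chiprob is a course helper whose code is not shown, the upper-tail probability can be cross-checked independently. The sketch below uses the closed-form expression for the Chi2 upper tail that holds when df is even; the function name is hypothetical, and this is not a claim about how chiprob itself is implemented.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(Chi2_df >= x) for positive even df, via the closed-form finite sum."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = x / 2.0
    term = 1.0    # (x/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

p_value = chi2_upper_tail_even_df(6.3912, 6)
print(round(p_value, 4))
```

The result should match the 0.3808 reported on the slide, which supports not rejecting H0.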

16
Chi2 Tests
  • For significance testing, how large would
    statistic have to be?
  • For α = 0.05,
  • >> bisect(@(x) chiprob(6,x,99),0.05,6,99,.0001)
  • 12.5917
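
The critical-value search can be sketched with a simple bisection. The code below is an assumption-laden stand-in: the course's bisect function is not shown, so this bisect is a hypothetical version whose argument order (function, target, lo, hi, tolerance) merely mirrors the call on the slide, and the tail function is the closed form for df = 6 only.

```python
import math

def chi2_upper_tail_df6(x):
    """P(Chi2_6 >= x), closed form valid for df = 6."""
    h = x / 2.0
    return math.exp(-h) * (1.0 + h + h * h / 2.0)

def bisect(f, target, lo, hi, tol):
    """Solve f(x) = target on [lo, hi], assuming f is decreasing there."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) > target:
            lo = mid    # tail still too large: move right
        else:
            hi = mid    # tail at or below target: move left
    return (lo + hi) / 2.0

crit = bisect(chi2_upper_tail_df6, 0.05, 6.0, 99.0, 1e-4)
print(round(crit, 3))
```

Any statistic above this value would be significant at the 5% level with 6 df; the observed 6.3912 is well below it.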

17
Chi2 Tests
  • Exercises
  • P. 472, Fig 10.29-30
  • Also p. 480, 10.4.7, 10.4.9, 10.4.10

18
Chi2 Tests
  • Consider 10.2.3, p 455
  • System A: 35/44 successful
  • B: 36/52
  • Test H0: p1 = p2 vs Ha: not equal
  • >> f1=35/44; f2=36/52; f0=(35+36)/(44+52);
  • >> pv=2*nprob(0,sqrt(f0*(1-f0)*(1/44+1/52)),f1-f2,
    99)
  • 0.2512

19
Chi2 Tests
  • Or, make a table

20
Chi2 Tests
  • d =
  • 35 36
  • 9 16
  • >> y=rbyc(d)
  • 1.3166
  • >> chiprob(1,y,99)
  • 0.2512
  • So we get the same (two-sided) answer by either
    method
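
The agreement between the two methods can be verified directly. The sketch below, written in plain Python rather than with the course's nprob and rbyc helpers, computes the pooled two-proportion z statistic and the 2x2 Chi2 statistic from the same counts. It relies on the fact that a Chi2 variable with 1 df is the square of a standard normal, so both p-values come from the normal tail via math.erfc.

```python
import math

# Two-proportion z test with pooled proportion f0
f1, f2 = 35 / 44, 36 / 52
f0 = (35 + 36) / (44 + 52)
se = math.sqrt(f0 * (1 - f0) * (1 / 44 + 1 / 52))
z = (f1 - f2) / se
p_z = math.erfc(abs(z) / math.sqrt(2))       # two-sided normal p-value

# Chi2 test on the 2x2 table (rows: success/failure, cols: System A/B)
d = [[35, 36],
     [9, 16]]
rsum = [sum(row) for row in d]
csum = [sum(col) for col in zip(*d)]
total = sum(rsum)
chi2 = sum((o - r * c / total) ** 2 / (r * c / total)
           for r, row in zip(rsum, d)
           for o, c in zip(row, csum))
p_chi2 = math.erfc(math.sqrt(chi2 / 2))      # upper tail of Chi2 with df = 1

print(round(z * z, 4), round(chi2, 4))
print(round(p_z, 4), round(p_chi2, 4))
```

The squared z statistic equals the Chi2 statistic, which is why the two p-values coincide.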

21
Chi2 Tests
  • Recall that Chi2 arises as sums of squares of
    normals
  • CLT says that many quantities are approximately
    normal
  • Guideline: Need all the expected values to be at
    least 5 for normal approx and, hence, for Chi2
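
The guideline is easy to check mechanically before running a test. The following Python sketch is illustrative only; the function names are hypothetical, and the table used is the drug data from slide 9 rather than the textbook exercise, whose data are not reproduced in these slides.

```python
def expected_counts(obs):
    """Expected counts from the marginals of an r-by-c table."""
    rsum = [sum(row) for row in obs]
    csum = [sum(col) for col in zip(*obs)]
    total = sum(rsum)
    return [[r * c / total for c in csum] for r in rsum]

def min_expected(obs):
    """Smallest expected count in the table."""
    return min(min(row) for row in expected_counts(obs))

obs = [[11, 30, 36, 23],
       [8, 31, 25, 36],
       [13, 28, 28, 31]]

smallest = min_expected(obs)
chi2_ok = smallest >= 5   # guideline: all expected values at least 5

print(round(smallest, 4), chi2_ok)
```

For the drug data the smallest expected count is above 5, so the Chi2 approximation is considered acceptable under the guideline.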

22
Chi2 Tests
  • Consider 10.4.5, p 480 (DS10.4.5)
  • Confirm that ALL expected values for Not
    Satisfied are < 5
  • According to guideline, Chi2 is not appropriate
    here

23
Chi2 Tests
  • Small sample problem can be solved using Fisher's
    Exact Test
  • Based on hypergeometric distn
  • Won't deal with here

24
Chi2 Tests
  • Could express hypotheses in terms of independence
  • H0: rows and cols are independent
  • Ha: probabilities not all equal
  • Ha might have to say "probability distributions
    not all equal"

25
Chi2 Tests
  • "Not all the same" does not mean "all are
    different"
  • Could have some the same and some different

26
Chi2 Tests
  • A new use for hypothesis tests
  • Tells us something about the underlying process
  • Suppose grade distns for classes meeting at
    different times are the same
  • I.e., grades are ind. of time
  • Tells us about the relationship (or lack) between
    grades and time

27
Chi2 Tests
  • What about when we reject H0?
  • Might like to know more about where the
    differences lie
  • Not always a simple answer
  • Sometimes, the differences can be clear

28
Chi2 Tests
  • Suppose the drug data had been

29
Chi2 Tests
30
Chi2 Tests
  • Chi2 value is 36.12, which is very large
  • The terms in Chi2 are

31
Chi2 Tests
32
Chi2 Tests
  • Suggests that "No" for Drug C is quite different
    from what we expected
  • Can "pool" to test this
  • BIG SECRET: The df are not changed when we pool
    as a result of how the data looked

33
Chi2 Tests
  • Pooling programs
  • cpool(table,c1,c2)
  • rpool(table,r1,r2)
  • Adds 2 cols/rows and zeros the second one
  • Have to write chi2() so you don't get divide by
    zero
  • c2a=(obs-expt).^2./max(expt,expt==0)

34
Chi2 Tests
  • >> o2=rpool(obs,1,2)
  • 19 70 71 40
  • 0 0 0 0
  • 13 19 18 50
  • >> [y,expt,c2a,rsum,csum]=rbyc(o2),
  • y =
  • 33.4005

35
Chi2 Tests
  • Compare Chi2 for pooled table (33.4005) to
    original value (36.1246)
  • Can show that Chi2 cannot increase (and almost
    always goes down) after pooling
  • Trick is that it doesn't go down very much if you
    do it right
  • Remember that df are not changing, so if Chi2
    goes down too much, it will no longer be
    significant

36
Chi2 Tests
  • >> o3=cpool(o2,2,1)
  • 0 89 71 40
  • 0 0 0 0
  • 0 32 18 50
  • >> y=rbyc(o3)
  • 29.4647
  • Still quite large

37
Chi2 Tests
  • >> o4=cpool(o3,3,2)
  • 0 0 160 40
  • 0 0 0 0
  • 0 0 50 50
  • >> y=rbyc(o4)
  • 28.5714
  • Still very large
  • Cannot pool any more
  • Conclude that the differences in drug reactions
    are due to No reaction to Drug C
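
The pooling sequence on slides 34 to 37 can be replayed in plain Python. This is a sketch under stated assumptions, not the course's cpool, rpool or rbyc code: the convention that the first index receives the sum and the second is zeroed is inferred from the slide outputs, indices are 1-based to match the Matlab calls, and cells with zero expected count are skipped rather than divided by 1 (which gives the same total). The starting table is the already row-pooled o2 from slide 34, since the unpooled data are not reproduced in the transcript.

```python
def rpool(table, r1, r2):
    """Add row r2 into row r1 and zero row r2 (1-based, like the slides)."""
    out = [row[:] for row in table]
    out[r1 - 1] = [a + b for a, b in zip(out[r1 - 1], out[r2 - 1])]
    out[r2 - 1] = [0] * len(out[r2 - 1])
    return out

def cpool(table, c1, c2):
    """Add column c2 into column c1 and zero column c2 (1-based)."""
    out = [row[:] for row in table]
    for row in out:
        row[c1 - 1] += row[c2 - 1]
        row[c2 - 1] = 0
    return out

def chi2(obs):
    """Chi2 statistic; cells with zero expected count contribute nothing."""
    rsum = [sum(row) for row in obs]
    csum = [sum(col) for col in zip(*obs)]
    total = sum(rsum)
    stat = 0.0
    for r, row in zip(rsum, obs):
        for o, c in zip(row, csum):
            e = r * c / total
            if e > 0:             # skip zeroed rows/cols to avoid 0/0
                stat += (o - e) ** 2 / e
    return stat

CRIT = 12.5917   # 5% critical value for df = 6 (slide 16); df unchanged by pooling

o2 = [[19, 70, 71, 40], [0, 0, 0, 0], [13, 19, 18, 50]]
o3 = cpool(o2, 2, 1)
o4 = cpool(o3, 3, 2)

for name, table in (("o2", o2), ("o3", o3), ("o4", o4)):
    y = chi2(table)
    print(name, round(y, 4), "significant" if y > CRIT else "not significant")
```

rpool is included only for completeness, since the table it would be applied to is not available here. Each pooled statistic stays well above the critical value for 6 df, matching the slides' conclusion.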

38
Chi2 Tests
  • It is possible to come to different conclusions
    from the same data
  • Depends on what you choose to pool
  • Can be guided by what makes sense
  • Also look at (obs-expt)./expt (no ^2)
  • This will tell which cells are above and below
    expt and about how much
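
The signed relative deviations mentioned above can be computed as in this short Python sketch. It is illustrative only and uses the drug table from slide 9, since the table that led to rejection is not reproduced in the transcript.

```python
obs = [[11, 30, 36, 23],
       [8, 31, 25, 36],
       [13, 28, 28, 31]]

rsum = [sum(row) for row in obs]
csum = [sum(col) for col in zip(*obs)]
total = sum(rsum)

# (obs - expt) / expt: the sign shows above/below expected, the size shows by how much
rel = [[(o - r * c / total) / (r * c / total) for o, c in zip(row, csum)]
       for r, row in zip(rsum, obs)]

for row in rel:
    print([round(v, 3) for v in row])
```

Positive entries mark cells above their expected count, negative entries mark cells below it.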