Understanding P-values and Confidence Intervals

1
Understanding P-values and Confidence Intervals
  • Thomas B. Newman, MD, MPH

20 Nov 08
2
Announcements
  • Optional reading about P-values and Confidence
    Intervals on the website
  • Exam questions due Monday 11/24/08 5:00 PM
  • Next week (11/27) is Thanksgiving
  • Following week Physicians and Probability
    (Chapter 12) and Course Review
  • Final exam to be distributed in SECTION 12/4 and
    posted on web
  • Exam due 12/11 8:45 AM
  • Key will be posted shortly thereafter

3
Overview
  • Introduction and justification
  • What P-values and Confidence Intervals don't mean
  • What they do mean: analogy between diagnostic
    tests and clinical research
  • Useful confidence interval tips
  • CI for negative studies: absolute vs. relative
    risk
  • Confidence intervals for small numerators

4
Why cover this material here?
  • P-values and confidence intervals are ubiquitous
    in clinical research
  • Widely misunderstood and mistaught
  • Pedagogical argument
  • Is it important?
  • Can you handle it?

5
Example: Douglas Altman Definition of 95%
Confidence Intervals
  • "A strictly correct definition of a 95% CI is,
    somewhat opaquely, that 95% of such intervals
    will contain the true population value.
  • Little is lost by the less pure interpretation
    of the CI as the range of values within which we
    can be 95% sure that the population value lies."

Quoted in Guyatt, G., D. Rennie, et al. (2002).
Users' guides to the medical literature:
essentials of evidence-based clinical practice.
Chicago, IL, AMA Press.
6
Understanding P-values and confidence intervals
is important because
  • It explains things which otherwise do not make
    sense, e.g. the need to state hypotheses in
    advance and correction for multiple hypothesis
    testing
  • You will be using them all the time
  • You are future leaders in clinical research

7
You can handle it because
  • We have already covered the important concepts at
    length earlier in this course
  • Prior probability
  • Posterior probability
  • What you thought before + new information = what
    you think now
  • We will support you through the process

8
Review of traditional statistical significance
testing
  • State null (H0) and alternative (Ha) hypotheses
  • Choose α
  • Calculate value of test statistic from your data
  • Calculate P-value from test statistic
  • If P-value < α, reject H0
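
The five steps above can be traced in a few lines of Python. This is only a sketch under stated assumptions: it uses a Pearson chi-square test without continuity correction and α = 0.05, and it borrows the bacteremia counts from the amoxicillin example later in these slides (19/507 vs 8/448). The function name is made up for illustration, and the test the trial authors actually used is not stated in the slides.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
    [[a, b], [c, d]]; returns (statistic, two-sided P-value)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))

# Step 1: H0 = same bacteremia risk in both groups; Ha = the risks differ.
alpha = 0.05                                    # Step 2: choose alpha
# Rows are cases / noncases, columns are amoxicillin / placebo.
stat, p = chi2_2x2(19, 8, 507 - 19, 448 - 8)    # Steps 3 and 4
reject_h0 = p < alpha                           # Step 5
print(f"chi2 = {stat:.2f}, P = {p:.3f}, reject H0: {reject_h0}")
```

With these counts the P-value comes out near 0.07, in line with the value quoted on the results slide, so H0 is not rejected at α = 0.05.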

9
Problem
  • Traditional statistical significance testing has
    led to widespread misinterpretation of P-values

10
What P-values don't mean
  • If the P-value is 0.05, there is a 95%
    probability that
  • The results did not occur by chance
  • The null hypothesis is false
  • There really is a difference between the groups

11
So if P = 0.05, what IS there a 95% probability
of?
12
White board
  • 2x2 tables and false positive confusion
  • Analogy with diagnostic tests
  • (This is covered step-by-step in the course book.)
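
The diagnostic-test analogy can be made concrete with Bayes' rule. In the sketch below a study is treated like a test: α plays the role of the false-positive rate (1 - specificity) and power plays the role of sensitivity. The function name, the prior probabilities, and the power of 0.8 are hypothetical values chosen only for illustration; they do not come from the slides.

```python
def prob_true_given_significant(prior, power, alpha=0.05):
    """Posterior probability that a real difference exists, given P < alpha."""
    true_pos = prior * power            # real effect and a significant result
    false_pos = (1 - prior) * alpha     # no effect, significant by chance
    return true_pos / (true_pos + false_pos)

for prior in (0.5, 0.1, 0.01):
    post = prob_true_given_significant(prior, power=0.8)
    print(f"prior = {prior:<4}  P(real difference | P < 0.05) = {post:.2f}")
```

The same "P < 0.05" maps to very different posterior probabilities depending on the prior, which is why a P-value of 0.05 cannot by itself mean a 95% probability that the difference is real.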

13
Analogy between diagnostic tests and research
studies
14
Analogy between diagnostic tests and research
studies
15
Extending the Analogy
  • Intentionally ordered tests and hypotheses stated
    in advance
  • Multiple tests and multiple hypotheses
  • Laboratory error and bias
  • Alternative diagnoses and confounding

16
Bonferroni
  • Inequality: If we do k different tests, each with
    significance level α, the probability that one or
    more will be significant is less than or equal to
    k × α
  • Correction: If we test k different hypotheses and
    want our total Type 1 error rate to be no more
    than α, then we should reject H0 only if P <
    α/k
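
A short numerical check of the inequality and the correction. The sketch assumes the k tests are independent and every null hypothesis is true, so the exact probability of at least one false positive is 1 - (1 - α)^k. The independence assumption and the function names are illustrative, not part of the slides.

```python
def familywise_error_independent(alpha, k):
    """P(at least one false positive) for k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k

def bonferroni_threshold(alpha, k):
    """Per-test P-value threshold that keeps the total Type 1 error <= alpha."""
    return alpha / k

alpha = 0.05
for k in (1, 2, 5, 10):
    exact = familywise_error_independent(alpha, k)
    bound = min(1.0, k * alpha)         # Bonferroni inequality
    print(f"k = {k:2d}: independent tests {exact:.3f} <= bound {bound:.3f}; "
          f"reject only if P < {bonferroni_threshold(alpha, k):.4f}")
```

For 10 independent tests at 0.05, the chance of at least one false positive is about 0.40, below the bound of 0.50; testing each at 0.005 brings the total back under 0.05.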

17
Derivation
  • Let A & B = the events of a Type 1 error for
    hypotheses A and B
  • P(A or B) = P(A) + P(B) - P(A & B)
  • Under H0, P(A) = P(B) = α
  • So P(A or B) = α + α - P(A & B) = 2α - P(A & B).
  • Of course, it is possible to falsely reject 2
    different null hypotheses, so P(A & B) > 0.
    Therefore, the probability of falsely rejecting
    either of the null hypotheses must be less than
    2α.
  • Note that often A & B are not independent, in
    which case Bonferroni will be even more
    excessively conservative

18
Problems with Bonferroni correction
  • Overly conservative (especially when hypotheses
    are not independent)
  • Maintains specificity at the expense of
    sensitivity
  • Does not take prior probability into account
  • Not clear when to use it
  • BUT can be useful if results still significant

19
CONFIDENCE INTERVALS
20
What Confidence Intervals don't mean
  • There is a 95% chance that the true value is
    within the interval
  • If you conclude that the true value is within the
    interval you have a 95% chance of being right
  • The range of values within which we can be 95%
    sure that the population value lies

21
One source of confusion: Statistical "confidence"
  • (Some) statisticians say "You can be 95%
    confident that the population value is in the
    interval."
  • This is NOT the same as "There is a 95%
    probability that the population value is in the
    interval."
  • "Confidence" is tautologously defined by
    statisticians as what you get from a confidence
    interval

22
Illustration
  • If a 95% CI has a 95% chance of containing the
    true value, then a 90% CI should have a 90%
    chance and a 40% CI should have a 40% chance.
  • Study: 4 deaths in 10 subjects in each group
  • RR = 1.0 (95% CI: 0.34 to 2.9)
  • 40% CI: 0.75 to 1.33
  • Conclude from this study that there is a 60% chance
    that the true RR is < 0.75 or > 1.33?
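
Both intervals on this slide can be reproduced with the usual normal approximation on ln(RR). That method is an assumption on my part (the slide does not say how its intervals were computed), and the function name is illustrative.

```python
import math
from statistics import NormalDist

def rr_ci(a, n1, b, n0, level=0.95):
    """Risk ratio and CI from the normal approximation on ln(RR).
    a/n1 = events/total in group 1, b/n0 = events/total in group 0."""
    rr = (a / n1) / (b / n0)
    se = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n0)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # e.g. 1.96 for 95%
    return rr, rr * math.exp(-z * se), rr * math.exp(z * se)

for level in (0.95, 0.40):
    rr, lo, hi = rr_ci(4, 10, 4, 10, level)
    print(f"{level:.0%} CI: RR = {rr:.2f} ({lo:.2f} to {hi:.2f})")
```

The only thing that changes between the two intervals is the multiplier z, which shrinks from about 1.96 to about 0.52; nothing about the data or the plausibility of RR values has changed.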

23
Confidence Intervals apply to a Process
  • Consider a bag with 19 white and 1 pink
    grapefruit
  • The process of selecting a grapefruit at random
    has a 95% probability of yielding a white one
  • But once I've selected one, does it still have a
    95% chance of being white?
  • You may have prior knowledge that changes the
    probability (e.g., pink grapefruit have thinner
    peel, are denser, etc.)
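
The "process" reading can be checked by simulation. The sketch below repeatedly draws samples from a normal population with a known standard deviation, builds a 95% CI for the mean each time, and counts how often the interval contains the true mean. The population parameters, sample size, seed, and function name are all hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist

def coverage(true_mean=10.0, sigma=2.0, n=25, trials=10_000, level=0.95, seed=1):
    """Fraction of simulated CIs for a mean (known sigma) that contain true_mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        hits += (sample_mean - half_width <= true_mean <= sample_mean + half_width)
    return hits / trials

print(f"Share of 95% intervals containing the true mean: {coverage():.3f}")
```

About 95% of the intervals cover the true mean, which is a statement about the procedure. Any single interval either contains the true value or does not, just as a grapefruit already drawn is either white or pink.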

24
Confidence Intervals for negative studies: 5
levels of sophistication
  • Example 1: Oral amoxicillin to treat possible
    occult bacteremia in febrile children
  • Randomized, double-blind trial
  • 3-36 month old children with T ≥ 39°C (N = 955)
  • Treatment: Amox 125 mg tid (≤ 10 kg) or 250 mg
    tid (> 10 kg)
  • Outcome: major infectious morbidity

Jaffe et al., New Engl J Med 1987;317:1175-80
25
Amoxicillin for possible occult bacteremia 2:
Results
  • Bacteremia in 19/507 (3.7%) with amox, vs 8/448
    (1.8%) with placebo (P = 0.07)
  • Major Infectious Morbidity: 2/19 (10.5%) with
    amox vs 1/8 (12.5%) with placebo (P = 0.9)
  • Conclusion: Data do not support routine use of
    standard doses of amoxicillin

26
5 levels of sophistication
  • Level 1: P > 0.05 = treatment does not work
  • Level 2: Look at power for study. (Authors
    reported power = 0.24 for OR = 4. Therefore, study
    underpowered and negative study uninformative.)

27
5 levels of sophistication, cont'd
  • Level 3: Look at 95% CI!
  • Authors calculated OR = 1.2 (95% CI: 0.02 to 30.4)
  • This is based on 1/8 (12.5%) with placebo vs 2/19
    (10.5%) with amox
  • (They put placebo on top)
  • (Silly to use OR)
  • With amox on top, RR = 0.84 (95% CI: 0.09 to
    8.0)
  • This was level of TBN in letter to the editor
    (1987)

28
5 levels of sophistication, cont'd
  • Level 4: Make sure you do an "intention to
    treat" analysis!
  • It is not OK to restrict attention to bacteremic
    patients
  • So it should be 2/507 (0.39%) with amox vs 1/448
    (0.22%) with placebo
  • RR = 1.8 (95% CI: 0.16 to 19.4)
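
The Level 3 and Level 4 risk ratios can be recomputed from the counts on the slides. This sketch assumes the standard 95% interval built on ln(RR) with a multiplier of 1.96; the function name is illustrative.

```python
import math

def rr_ci_95(a, n1, b, n0):
    """Risk ratio (group 1 over group 0) with a 95% CI from ln(RR) +/- 1.96 SE."""
    rr = (a / n1) / (b / n0)
    se = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n0)
    return rr, rr * math.exp(-1.96 * se), rr * math.exp(1.96 * se)

# Level 3: bacteremic children only (amoxicillin 2/19, placebo 1/8).
print("Bacteremic only:    RR = %.2f (%.2f to %.1f)" % rr_ci_95(2, 19, 1, 8))
# Level 4: everyone randomized (amoxicillin 2/507, placebo 1/448).
print("Intention to treat: RR = %.2f (%.2f to %.1f)" % rr_ci_95(2, 507, 1, 448))
```

The first line gives roughly 0.84 (0.09 to 8.0). The second gives roughly 1.77 (0.16 to 19.4), the same risk ratio and limits that appear in the Stata output for this table.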

29
Level 5: the clinically relevant quantity is the
Absolute Risk Reduction (ARR)!
  • 2/507 (0.39%) with amox vs 1/448 (0.22%) with
    placebo
  • ARR = -0.17% (amoxicillin worse)
  • 95% CI (-0.9% [harm] to +0.5% [benefit])
  • Therefore, LOWER limit of 95% CI for benefit
    (i.e., best case) is NNT = 1/0.5% = 200
  • So this study suggests need to treat at least 200
    children to prevent Major Infectious Morbidity
    in one
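
The ARR, its interval, and the best-case NNT follow directly from the two risks. The sketch assumes a simple Wald interval for the difference of two proportions (it reproduces the risk-difference limits in the Stata output); the function and variable names are illustrative.

```python
import math

def arr_ci_95(events_treated, n_treated, events_control, n_control):
    """Absolute risk reduction (control risk minus treated risk) with a Wald 95% CI."""
    p_t = events_treated / n_treated
    p_c = events_control / n_control
    arr = p_c - p_t
    se = math.sqrt(p_t * (1 - p_t) / n_treated + p_c * (1 - p_c) / n_control)
    return arr, arr - 1.96 * se, arr + 1.96 * se

arr, worst, best = arr_ci_95(2, 507, 1, 448)
print(f"ARR = {arr:.2%} (95% CI {worst:.2%} to {best:.2%})")
# The best-case limit of the ARR gives the smallest NNT compatible with the data.
print(f"Best-case NNT = {1 / best:.0f}")
```

Unrounded, the best-case benefit is about 0.53%, so the NNT works out to about 189; the slide's figure of 200 comes from rounding that limit to 0.5% first.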

30
Stata output
. csi 2 1 505 447

                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |         2           1  |          3
        Noncases |       505         447  |        952
-----------------+------------------------+------------
           Total |       507         448  |        955
            Risk |  .0039448    .0022321  |   .0031414
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |         .0017126       |    -.005278    .0087032
      Risk ratio |         1.767258       |    .1607894    19.42418
 Attr. frac. ex. |         .4341518       |   -5.219315    .9485178
 Attr. frac. pop |         .2894345       |
                 +-------------------------------------------------
                               chi2(1) =     0.22  Pr>chi2 = 0.6369

31
Example 2: Pyelonephritis and new renal scarring
in the International Reflux Study in Children
  • RCT of ureteral reimplantation vs prophylactic
    antibiotics for children with vesicoureteral
    reflux
  • Overall result: surgery group had fewer episodes of
    pyelonephritis (8% vs 22%; NNT = 7; P < 0.05) but
    more new scarring (31% vs 22%; P = .4)
  • This raises questions about whether new scarring
    is caused by pyelonephritis

Weiss et al. J Urol 1992;148:1667-73
32
Within groups: no association between new pyelo
and new scarring
  • Trend goes in the OPPOSITE direction

RR = 0.28; 95% CI (0.09-1.32). Weiss, J Urol
1992;148:1672
33
Stata output to get 95% CI
. csi 2 18 28 58

                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |         2          18  |         20
        Noncases |        28          58  |         86
-----------------+------------------------+------------
           Total |        30          76  |        106
            Risk |  .0666667    .2368421  |   .1886792
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |        -.1701754       |   -.3009557   -.0393952
      Risk ratio |         .2814815       |     .069523     1.13965
 Prev. frac. ex. |         .7185185       |   -.1396499     .930477
 Prev. frac. pop |         .2033543       |
                 +-------------------------------------------------
                               chi2(1) =     4.07  Pr>chi2 = 0.0437
34
Conclusions
  • No evidence that new pyelonephritis causes
    scarring
  • Some evidence that it does not
  • P-values and confidence intervals are
    approximate, especially for small sample sizes
  • There is nothing magical about 0.05
  • Key concept: calculate 95% CI for negative
    studies
  • ARR for clinical questions (less generalizable)
  • RR for etiologic questions

35
Confidence intervals for small numerators
36
When P-values and Confidence Intervals Disagree
  • Usually P < 0.05 means 95% CI excludes null
    value.
  • But both 95% CI and P-values are based on
    approximations, so this may not be the case
  • Illustrated by IRSC slide above
  • If you want 95% CI and P-values to agree, use
    test-based confidence intervals (see next slide)
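
The disagreement in the IRSC table (2/30 vs 18/76) can be reproduced by computing the P-value and the interval separately. The sketch assumes a Pearson chi-square without continuity correction and the usual 95% CI on ln(RR); both reproduce the figures in the Stata output above, but the variable names are my own.

```python
import math

# IRSC within-group table from the slides: new scarring in 2/30 children with
# new pyelonephritis vs 18/76 without.
a, n1, b, n0 = 2, 30, 18, 76
n = n1 + n0

# Pearson chi-square (1 df, no continuity correction) and its P-value.
c, d = n1 - a, n0 - b
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * n1 * n0)
p = math.erfc(math.sqrt(chi2 / 2))

# Usual 95% CI for the risk ratio, built on ln(RR).
rr = (a / n1) / (b / n0)
se = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n0)
lo, hi = rr * math.exp(-1.96 * se), rr * math.exp(1.96 * se)

print(f"P = {p:.4f} (below 0.05: {p < 0.05})")
print(f"RR = {rr:.2f}, 95% CI {lo:.3f} to {hi:.2f} (includes 1: {lo < 1 < hi})")
```

The two answers come from different approximations (one on the chi-square statistic, one on the log risk ratio), so P falls just under 0.05 while the interval still reaches past 1.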

37
Alternative Stata output: Test-based CI
. csi 2 18 28 58,tb

                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |         2          18  |         20
        Noncases |        28          58  |         86
-----------------+------------------------+------------
           Total |        30          76  |        106
            Risk |  .0666667    .2368421  |   .1886792
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |        -.1701754       |   -.3363063   -.0040446 (tb)
      Risk ratio |         .2814815       |    .0816554    .9703199 (tb)
 Prev. frac. ex. |         .7185185       |    .0296801    .9183446 (tb)
 Prev. frac. pop |         .2033543       |
                 +-------------------------------------------------
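
Test-based limits can be sketched by raising the risk ratio to the power 1 ± z/χ, and multiplying the risk difference by 1 ± z/χ. The version below assumes χ² is the Pearson statistic multiplied by (N - 1)/N. That choice is my assumption, adopted because it matches the limits printed above to about three decimals; the slides do not say which chi-square Stata uses.

```python
import math
from statistics import NormalDist

a, n1, b, n0 = 2, 30, 18, 76           # cases / total: exposed, unexposed
c, d, n = n1 - a, n0 - b, n1 + n0

# Assumed: chi-square with the (N - 1) factor; not stated in the slides.
chi2 = (n - 1) * (a * d - b * c) ** 2 / ((a + b) * (c + d) * n1 * n0)
ratio = NormalDist().inv_cdf(0.975) / math.sqrt(chi2)    # z / chi

rr = (a / n1) / (b / n0)
rd = a / n1 - b / n0
rr_limits = sorted(rr ** (1 + s * ratio) for s in (-1, 1))
rd_limits = sorted(rd * (1 + s * ratio) for s in (-1, 1))

print(f"RR {rr:.3f}, test-based 95% CI {rr_limits[0]:.3f} to {rr_limits[1]:.3f}")
print(f"RD {rd:.3f}, test-based 95% CI {rd_limits[0]:.3f} to {rd_limits[1]:.3f}")
```

Because the limits are derived from the test statistic itself, the interval excludes the null value whenever that statistic is significant at 0.05, so here the upper RR limit stays just below 1.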