TROUBLESOME CONCEPTS IN STATISTICS: r² AND POWER

1
TROUBLESOME CONCEPTS IN STATISTICS: r² AND POWER
  • N. Scott Urquhart
  • Director, STARMAP
  • Department of Statistics
  • Colorado State University
  • Fort Collins, CO 80523-1877

2
STARMAP FUNDING
Space-Time Aquatic Resources Modeling and Analysis Program
The work reported here today was developed under
the STAR Research Assistance Agreement CR-829095
awarded by the U.S. Environmental Protection
Agency (EPA) to Colorado State University. This
presentation has not been formally reviewed by
EPA. The views expressed here are solely those
of the presenter and STARMAP, the Program he
represents. EPA does not endorse any products or
commercial services mentioned in this
presentation.
3
INTENT FOR TODAY
  • To discuss two topics which have given some of
    you a bit of confusion:
      • r² in regression
      • Power in the context of tests of hypotheses
  • Thanks to Ann Brock and Harriett Bassett for
    suggesting these topics
  • Approach:
      • Visually illustrate the idea,
      • Then talk about the concepts illustrated
  • The sequences of graphs are available on
    the internet right now (address is at the end of
    this handout)
  • Questions are welcome

4
r² IN REGRESSION
  • r² provides a summary of the strength of a
    (linear) regression which reflects
      • The relative size of the residual variability,
      • The slope of the fitted line, and
      • How good the observed values of the
        predictor variable are for prediction
          • Mainly the range of the Xs
  • Let's see these features in action, then
  • Look at the formulas

5
WHAT MAKES r² TICK?
(Varying one thing, leaving the remaining things
fixed)
r² increases as residual variation decreases
r² increases as the slope increases
r² increases as the range of x increases
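
The three statements above can be checked with a small simulation. The sketch below is illustrative only: the baseline settings (slope 1, residual standard deviation 2, x spread evenly over 0 to 10, 200 points, intercept 2) and the function names are assumptions made for this example, not values taken from the original graphs.

```python
import random


def r_squared(x, y):
    """Squared correlation between x and y (r^2 for a simple linear regression)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def simulate_r2(slope, resid_sd, x_range, n=200, seed=1):
    """Simulate y = 2 + slope * x + noise on evenly spaced x in [0, x_range]."""
    rng = random.Random(seed)  # fixed seed so the comparison is repeatable
    x = [x_range * i / (n - 1) for i in range(n)]
    y = [2.0 + slope * xi + rng.gauss(0.0, resid_sd) for xi in x]
    return r_squared(x, y)


# Change one thing at a time, leaving the remaining things fixed.
baseline = simulate_r2(slope=1.0, resid_sd=2.0, x_range=10.0)
less_noise = simulate_r2(slope=1.0, resid_sd=1.0, x_range=10.0)
steeper = simulate_r2(slope=2.0, resid_sd=2.0, x_range=10.0)
wider = simulate_r2(slope=1.0, resid_sd=2.0, x_range=20.0)

print(f"baseline               r^2 = {baseline:.3f}")
print(f"less residual variation r^2 = {less_noise:.3f}")
print(f"steeper slope          r^2 = {steeper:.3f}")
print(f"wider range of x       r^2 = {wider:.3f}")
```

Each of the three changes raises r² relative to the baseline, because each one makes the variation explained by the line larger relative to the residual variation.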
6
WHAT IS r²?
  • r² provides a measure of the fit of a line to a
    set of data which incorporates
      • The amount of residual variation,
      • The strength of the line (slope), and
      • How good the set of values of x are
        for estimating the line
  • Some areas of endeavor tend to overuse it!

7
HOW DOES r² TELL US ABOUT VARIATION?
  • The following graph illustrates this
      • The data scatter has r² = 0.5 (approximately)
      • The red points have the same values, but are
        all concentrated at X = 5.
  • Strictly speaking, the above formulas
    apply only in the case of bivariate regression.
  • Estimation formulas involve factors of n-1
    and n-2.

8
(Graph: data scatter with r² of approximately 0.5; the same values are shown in red, all concentrated at X = 5)
9
FORMULAS FOR r²
  • But these have little intuitive appeal!
  • We'll decompose observations into parts:
      • Mean
      • Regression
      • Residual
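
For reference, the usual textbook forms of r² for a simple linear regression are sketched below. These are standard formulas, offered as an assumption about the kind of expression this slide showed; the notation (x_i, y_i, fitted values \hat{y}_i, means \bar{x} and \bar{y}) is chosen for this sketch.

```latex
r^2
  = \frac{\left[\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})\right]^2}
         {\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}
  = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}
         {\sum_{i=1}^{n} (y_i - \bar{y})^2}
  = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
             {\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

The second and third forms are the ones tied to the mean, regression, and residual parts listed above.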

10
DECOMPOSING REGRESSION
  • This is really n equations
  • Square each of these equations and add them up
    across i.
  • The three cross product terms will each add to
    zero. (Try it!)
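
The "Try it!" can be done numerically. The sketch below assumes the usual decomposition y_i = mean + (fitted_i - mean) + (y_i - fitted_i), that is, mean part plus regression part plus residual part, and uses a small made-up data set; the data values and variable names are illustrative assumptions, not taken from the presentation.

```python
# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.3, 6.9]
n = len(x)

# Least-squares line.
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * xi for xi in x]

# The three parts of each observation: y_i = mean + regression + residual.
mean_part = [y_bar] * n
regression_part = [f - y_bar for f in fitted]
residual_part = [yi - f for yi, f in zip(y, fitted)]


def dot(u, v):
    """Sum of elementwise products across i."""
    return sum(a * b for a, b in zip(u, v))


# The three cross-product sums; each is zero up to rounding error.
cross_mean_reg = dot(mean_part, regression_part)
cross_mean_res = dot(mean_part, residual_part)
cross_reg_res = dot(regression_part, residual_part)

# So the sum of squared observations splits into three sums of squares.
total_ss = dot(y, y)
pieces_ss = (dot(mean_part, mean_part)
             + dot(regression_part, regression_part)
             + dot(residual_part, residual_part))

ss_reg = dot(regression_part, regression_part)
ss_res = dot(residual_part, residual_part)
r2 = ss_reg / (ss_reg + ss_res)

print(cross_mean_reg, cross_mean_res, cross_reg_res)
print(total_ss, pieces_ss)
print(f"r^2 = {r2:.4f}")
```

Here r² is the regression sum of squares as a fraction of the regression plus residual sums of squares, which is how the decomposition connects back to r².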

11
DECOMPOSING REGRESSION (continued)
12
POWER OF A TEST OF HYPOTHESIS
  • Power = Prob(Being right)
    = Prob(Rejecting a false hypothesis)
  • Power depends on two main things:
      • The difference between the hypothesized and
        true situations, and
      • The strength of the information for making
        the test
          • Sample size is a very important factor
  • In regression it depends on the same factors as
    the ones which increase r².
  • Again, see it, then talk about it

Power increases as Δ = μ1 - μ2 increases
13
POWER VARIES WITH DIFFERENCE (Δ = μ1 - μ2) AND
SAMPLE SIZE (n)
14
ON TESTS OF HYPOTHESES (ON THE WAY TO POWER)
                                      TRUE SITUATION
ACTION                                HYPOTHESIS TRUE    HYPOTHESIS FALSE
FAIL TO REJECT THE NULL HYPOTHESIS    CORRECT ACTION     TYPE II ERROR
REJECT THE NULL HYPOTHESIS            TYPE I ERROR       CORRECT ACTION
Tests of hypotheses are designed to control
α = Prob(Type I Error)
while getting Power = 1 - Prob(Type II Error) as
large as possible
15
ON TESTS OF HYPOTHESES (AN ASIDE)
  • Which is worse,
      • a type I error, or
      • a type II error?
  • It depends tremendously on perspective
  • Consider the criminal justice system
      • Truth: Accused is innocent (H0) or guilty (HA)
      • Action: Accused is acquitted or convicted
      • Type I error: Convict an innocent person
      • Type II error: Acquit a guilty person
  • Which is worse?
      • Consider the difference in view of the
          • Accused
          • Society, especially if the accused is a terrorist

16
COMPUTING THE CRITICAL REGION
  • Consider a simple case: X ~ N(μ, 1)
  • H0: μ = 4 versus HA: μ ≠ 4
  • Critical Region (CR) is
  • X ≤ l or X ≥ u, so
  • 0.025 = P(X ≤ l) = P((X - 4)/1 ≤ (l - 4)/1)
    = P(Z ≤ -1.96)
  • l = 4 - 1.96 = 2.04; similarly,
  • u = 4 + 1.96 = 5.96
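
The critical region can be reproduced with Python's standard library. This is a minimal sketch assuming a two-sided test at α = 0.05 (0.025 in each tail, as on the slide); the variable names are chosen for this example.

```python
from statistics import NormalDist

alpha = 0.05   # two-sided test: 0.025 in each tail
mu0 = 4.0      # hypothesized mean under H0
sigma = 1.0    # known standard deviation of X

# Standard normal quantile that leaves alpha/2 in the upper tail (about 1.96).
z = NormalDist().inv_cdf(1 - alpha / 2)

lower = mu0 - z * sigma   # l on the slide
upper = mu0 + z * sigma   # u on the slide

print(f"Reject H0 when X <= {lower:.2f} or X >= {upper:.2f}")
```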

17
COMPUTING POWER
  • Consider a simple case: X ~ N(μ, 1)
  • H0: μ = 4 versus HA: μ ≠ 4
  • Power (at μ = 5) = ?
  • = Prob(XA in CR | μ = 5)
  • XA ~ N(5, 1)
  • = Prob(XA ≤ 2.04) + Prob(XA ≥ 5.96)
  • = Prob(Z ≤ -2.96) + Prob(Z ≥ 0.96)
  • = 0.0015 + 0.1685 = 0.1700
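
The same power calculation can be checked numerically. The sketch below takes the critical region limits 2.04 and 5.96 from the slides as given; the variable names are chosen for this example.

```python
from statistics import NormalDist

# Critical region limits from the previous slide.
lower, upper = 2.04, 5.96

# Distribution of X when the true mean is 5 (the alternative considered here).
x_alt = NormalDist(mu=5.0, sigma=1.0)

p_low = x_alt.cdf(lower)          # Prob(XA <= 2.04) = Prob(Z <= -2.96)
p_high = 1.0 - x_alt.cdf(upper)   # Prob(XA >= 5.96) = Prob(Z >= 0.96)
power = p_low + p_high

print(f"{p_low:.4f} + {p_high:.4f} = {power:.4f}")
```

The result agrees with the slide's 0.17 to within rounding: with a single observation, the test detects a true mean of 5 only about one time in six.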

18
POWER VARIES WITH DIFFERENCE (Δ = μ1 - μ2) AND
SAMPLE SIZE (n)
19
COMPUTING POWER USING A MEAN BASED ON n = 2
OBSERVATIONS
  • Consider a simple case
  • When X ~ N(μ, 1),
  • the mean of two observations follows
    X̄ ~ N(μ, 1/2), with standard deviation 1/√2 = 0.707
  • H0: μ = 4 versus HA: μ ≠ 4
  • Power (at μ = 5) = ?
  • Critical Region (CR) is
  • X̄ ≤ l or X̄ ≥ u, so
  • 0.025 = P(X̄ ≤ l) = P((X̄ - 4)/0.707 ≤ (l -
    4)/0.707) = P(Z ≤ -1.96)
  • So l = 4 - (1.96)(0.707) = 2.61; similarly, u =
    5.39
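
The calculation generalizes to a mean of n observations by replacing the standard deviation 1 with 1/√n. The sketch below is one way to write that; the function name `power_two_sided` and its default arguments are assumptions made for this example, set to match the slides' H0: μ = 4, σ = 1, and α = 0.05.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided(mu_true, n, mu0=4.0, sigma=1.0, alpha=0.05):
    """Power of the two-sided z test of H0: mu = mu0, using the mean of n observations."""
    se = sigma / sqrt(n)                      # standard deviation of the sample mean
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = 0.05
    lower = mu0 - z * se                      # critical region: mean <= lower
    upper = mu0 + z * se                      # or mean >= upper
    xbar = NormalDist(mu=mu_true, sigma=se)   # distribution of the mean at the true mu
    return xbar.cdf(lower) + (1.0 - xbar.cdf(upper))


# Power at mu = 5 for the sample sizes used in this presentation.
for n in (1, 2, 4):
    print(f"n = {n}: power at mu = 5 is {power_two_sided(5.0, n):.3f}")

# When H0 is true, the rejection probability is alpha itself.
print(f"n = 2: rejection probability at mu = 4 is {power_two_sided(4.0, 2):.3f}")
```

Evaluating the function over a grid of true means traces out a power curve for each n, and larger n gives higher power at every alternative.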

20
COMPUTING POWER USING A MEAN BASED ON n = 2
OBSERVATIONS (continued)
21
POWER VARIES WITH DIFFERENCE (Δ = μ1 - μ2) AND
SAMPLE SIZE (n)
22
COMPUTING POWER USING A MEAN BASED ON n = 4
OBSERVATIONS (continued)
(This page is not in the handout, so that the
handout would fit on one page)
23
POWER VARIES WITH DIFFERENCE (Δ = μ1 - μ2) AND
SAMPLE SIZE (n)
24
POWER VARIES WITH DIFFERENCE (Δ = μ1 - μ2) AND
SAMPLE SIZE (n)
25
DIRECTIONAL NOTE
  • As the alternative has been two-sided throughout
    this presentation, the power curves are symmetric
    about the vertical axis.
  • By examining only the positive side, we can see
    the curves twice as large.

26
YOU HAVE ACCESS TO THESE PRESENTATIONS
  • You can find each of the slide shows shown here
    today at
      • http://www.stat.colostate.edu/starmap/learning.html
  • Each show begins with authorship and funding
    slides
  • You are welcome to use them, and adapt them
      • But, please always acknowledge source and
        funding
      • You are free to reorder the graphs if it
        makes more sense for r² to decrease than
        increase.
  • Urquhart is available to talk to AP Stat
    classes about statistics as a profession.
      • See content on the web site above.