Statistics of Illumination - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Statistics of Illumination


1
Statistics of Illumination
  • Beth Chance
  • Roxy Peck
  • Cal Poly, San Luis Obispo

2
STATISTICS SAY
  • Increasingly daily life involves statistical
    information
  • interpretations of graphical and numerical
    summaries
  • comparisons of groups
  • poll results from random samples
  • conclusions from randomized experiments
  • predictions of future outcomes

3
Most people use statistics as a drunkard uses a
lamppost: more for support than for illumination.
4
Predicting Variable Behavior
5
Predicting Variable Behavior
  • (a) Height of students in this class
  • (b) Students' preference for Coca-Cola vs.
    Pepsi-Cola
  • (c) Number of siblings of individuals
  • (d) Amount paid for last haircut
  • (e) Gender breakdown
  • (f) Students' guesses of my age

6
Matching Variables to Graphs
7
Matching Variables to Graphs
  • Think about context!
  • Anticipate patterns and variations
  • variable intuition
  • graph-sense

8
STATISTICS SAY
  • Students' heights would show more variability
    than guesses of my age
  • KDC Pursues High-Return, Low-Risk Strategy

9
What is Variability?

10
What is Variability?
            Class F  Class G  Class H  Class I  Class J
range       6        8        8        8        8
IQR         2.75     3        0        8        5
Std. Dev.   1.769    2.041    1.180    4.000    2.657
11
Describing Variability
  • The bumpiness of a histogram does not determine
    the variability of the observations
  • The number of distinct values the variable takes
    does not determine the variability of the
    observations
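
The two points above can be checked with a small Python sketch. The two data sets are hypothetical (the transcript does not include the Class F to J data); they are chosen so that the set with fewer distinct values has the same range but the larger standard deviation.

```python
from statistics import pstdev

# Hypothetical scores, NOT the class data from the slides.
many_values = [1, 3, 5, 7, 9] * 2      # five distinct values, flat histogram
two_values = [1] * 5 + [9] * 5         # only two distinct values, far apart


def value_range(data):
    """Range = largest observation minus smallest observation."""
    return max(data) - min(data)


sd_many = pstdev(many_values)          # population standard deviation
sd_two = pstdev(two_values)

for name, data, sd in [("many_values", many_values, sd_many),
                       ("two_values", two_values, sd_two)]:
    print(name, "distinct:", len(set(data)),
          "range:", value_range(data), "sd:", round(sd, 3))
```

Variability here is about how far observations sit from the center, not about how many different values appear or how jagged the histogram looks.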

12
STATISTICS SAY
  • 5236 drivers aged 65 and over were involved in
    fatal accidents, compared to only 2900 drivers
    aged 16 and 17, so young people are safer
    drivers...
  • 65% of motorcycle fatalities occurred in states
    with mandatory helmet laws...

13
Counts Versus Ratios
  • Simple counts are often not a good basis for
    comparison of two or more groups.
  • Group size isn't always obvious: two groups of 25
    U.S. states may have very different sizes even
    though both include the same number of states.
  • Deciding on a sensible basis for comparison
    requires thought!
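
A minimal sketch of the count-versus-rate comparison follows. The fatal-accident counts are the ones quoted on slide 12; the numbers of licensed drivers are placeholder values assumed purely for illustration, not real figures, and the function name is hypothetical.

```python
def rate_per_100k(count, group_size):
    """Events per 100,000 members of the group."""
    return count / group_size * 100_000


# Counts quoted on the slide.
fatal_counts = {"65 and over": 5236, "16 and 17": 2900}

# ASSUMED group sizes (placeholders, not real data).
licensed_drivers = {"65 and over": 18_000_000, "16 and 17": 2_000_000}

rates = {group: rate_per_100k(fatal_counts[group], licensed_drivers[group])
         for group in fatal_counts}

for group, rate in rates.items():
    print(f"{group}: {fatal_counts[group]} drivers, "
          f"{rate:.1f} per 100,000 licensed drivers")
```

With group sizes this different, the group with the larger count can have the smaller rate, which is why the raw counts alone do not support the "safer drivers" claim.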

14
STATISTICS SAY
  • 85% of software developers predicted that
    Microsoft's integration of Internet functions
    into Windows would help their company

15
Some Simple Questions
  • Question 1
  • Lost ticket: Yes 6, No 9
  • Lost $20: Yes 8, No 6

16
Some Simple Questions
  • People are more likely to say "yes" when they
    have lost a $20 bill
  • People tend to answer "not surprising" to both
    expressions
  • People are more likely to choose program A with
    the "save" version and program B with the "die"
    version

17
Some Simple Questions
  • Be careful when wording survey questions; ask to
    see the phrasing!
  • Bill Gates: "It would help me IMMENSELY to have a
    survey showing that 90% of developers believe
    putting the browser into the operating system is
    a good idea"
  • "Browser" vs. "browser technologies"

18
STATISTICS SAY
  • Researchers in Philadelphia investigated whether
    pamphlets containing information for cancer
    patients are written at a level that the cancer
    patients can comprehend
  • Median reading levels are equal

19
Readability of Cancer Pamphlets
20
Readability of Cancer Pamphlets
  • Graphs can illuminate
  • Look at the data!
  • Think about the question

21
STATISTICS SAY
  • American men were randomly selected for the 1970
    draft
  • Draft numbers (1-366) were assigned to birthdates

22
Draft Lottery
  • Calculate the median draft number for each month
  • 31 days: 16th value
  • 30 days: average of 15th and 16th values
  • 29 days: 15th value
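
The position rule above can be sketched in Python as follows. The function is a plain illustration of the rule, and the test lists (the integers 1 to n) are stand-ins chosen so that the returned value equals the position in the sorted list; they are not draft numbers.

```python
def median(values):
    """Middle value of the sorted data (average of the two middle
    values when the number of observations is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: e.g. n = 31 gives index 15, the 16th value.
        return ordered[mid]
    # Even n: e.g. n = 30 averages the 15th and 16th values.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Stand-in data: the values 1..n, so the result shows the position used.
for days in (31, 30, 29):
    print(days, "days ->", median(range(1, days + 1)))
```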

23
Draft Lottery
  month       median
  January     211.0
  February    210.0
  March       256.0
  April       225.0
  May         226.0
  June        207.5
  July        188.0
  August      145.0
  September   168
  October     201
  November    131.5
  December    100
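
One simple way to summarize the monthly medians is to compare the first and second halves of the year. The sketch below uses the medians listed above; the half-year split is just one illustrative choice of summary, not necessarily the one used in the presentation.

```python
# Median draft number by month, as listed on the slide.
monthly_medians = {
    "January": 211.0, "February": 210.0, "March": 256.0,
    "April": 225.0, "May": 226.0, "June": 207.5,
    "July": 188.0, "August": 145.0, "September": 168.0,
    "October": 201.0, "November": 131.5, "December": 100.0,
}

values = list(monthly_medians.values())   # insertion order = calendar order
first_half = sum(values[:6]) / 6          # January through June
second_half = sum(values[6:]) / 6         # July through December

print(f"Mean of monthly medians, Jan-Jun: {first_half:.2f}")
print(f"Mean of monthly medians, Jul-Dec: {second_half:.2f}")
```

Birthdays late in the year tended to receive lower draft numbers, which is not what a well-mixed lottery over the numbers 1 to 366 would be expected to produce.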

24
Draft Lottery
25
Draft Lottery
26
Draft Lottery
  • Statistics matter
  • Summaries can illuminate
  • Randomization can be difficult

27
STATISTICS SAY
  • The average time between eruptions of the Old
    Faithful Geyser is 71 minutes
  • August, 1985

28
Geyser Eruptions
29
Geyser Eruptions
  • Looks can be deceiving!
  • Use the graph that summarizes without losing
    important details

30
STATISTICS SAY
  • The average major league baseball salary in the
    United States is about $1.5 million

31
Rowers' Weights
  • 2000 Men's Olympic Rowing Team

32
Rowers' Weights
33
Rowers' Weights
                                          Mean     Median
  Full data set                           197.29   207.5
  Without coxswain                        200.11   210.00
  Without coxswain or lightweight rowers  210.57   210.00
  With heaviest at 320                    215.33   210.00
  • Resistance....
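
The resistance of the median can be illustrated with a short sketch. The weights below are hypothetical (the transcript does not include the rowers' individual weights); the first value plays the role of a light coxswain, and the last is later replaced by 320 to mimic an extreme heavy value.

```python
from statistics import mean, median

# Hypothetical weights in pounds, NOT the Olympic team data.
crew = [121, 188, 195, 205, 210, 212, 215, 220]   # first value: coxswain
without_cox = crew[1:]
with_heavy = without_cox[:-1] + [320]             # heaviest replaced by 320

for label, data in [("full crew", crew),
                    ("without coxswain", without_cox),
                    ("heaviest set to 320", with_heavy)]:
    print(f"{label:20s} mean={mean(data):7.2f} median={median(data):7.2f}")

mean_shift = abs(mean(without_cox) - mean(crew))
median_shift = abs(median(without_cox) - median(crew))
```

Removing or inflating a single extreme value moves the mean noticeably, while the median barely changes or does not change at all.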

34
Rowers' Weights
  • Know what your numerical summary is measuring
  • Investigate causes for unusual observations
  • Baseball: median salary $500,000

35
STATISTICS SAY
People live longer in countries with more
televisions
36
Televisions and Life Expectancy
  • Buy another television?
  • Association is not causation

37
STATISTICS SAY
  • Overall survival rates
  • A 80%, B 90%
  • Fair condition
  • A 98.3%, B 96.7%
  • Poor condition
  • A 52.5%, B 30.0%

38
Hospital Recovery Rates
  • Simpson's Paradox
  • Hospital A gets most of the poor condition cases
  • Patients in poor condition are less likely to
    survive
  • Thus Hospital A has the lower survival rate
    despite being the better choice for either
    condition
  • Beware of lurking variables
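
The reversal can be reproduced with a small table of counts. The patient counts below are an assumption: the transcript gives only percentages, and these counts were chosen because they reproduce every rate quoted on slide 37.

```python
# (survived, total) by hospital and condition.
# ASSUMED counts, chosen to match the percentages on the slide.
counts = {
    "A": {"fair": (590, 600), "poor": (210, 400)},
    "B": {"fair": (870, 900), "poor": (30, 100)},
}


def pct(survived, total):
    """Survival rate as a percentage."""
    return 100 * survived / total


def overall(hospital):
    """Survival rate pooled over both conditions."""
    survived = sum(s for s, _ in counts[hospital].values())
    total = sum(t for _, t in counts[hospital].values())
    return pct(survived, total)


for hospital in counts:
    fair = pct(*counts[hospital]["fair"])
    poor = pct(*counts[hospital]["poor"])
    print(f"Hospital {hospital}: fair {fair:.1f}%, poor {poor:.1f}%, "
          f"overall {overall(hospital):.1f}%")
```

Under these counts Hospital A treats 400 poor-condition patients out of 1000 while Hospital B treats only 100 out of 1000, so A's overall rate is pulled down even though it does better within each condition.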

39
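Hospital Recovery Rates (cont.)
[Figure: percent surviving, 0 to 100, at Hospital A and Hospital B; Fair condition]
40
Hospital Recovery Rates (cont.)
[Figure: percent surviving, 0 to 100, at Hospital A and Hospital B; Fair and Poor conditions]
41
Hospital Recovery Rates (cont.)
[Figure: percent surviving, 0 to 100, at Hospital A and Hospital B; Fair and Poor conditions]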
42
STATISTICS SAY
  • Taking an aspirin each day reduces the risk of
    heart attack for men, but less so for women

43
How Experiments Take Variability Into Account
  • Direct control
  • Blocking
  • Randomization

44
Randomization
1 2
3 4
5 6
7 8
45
Blocking Scheme A
1 2
3 4
5 6
7 8
46
Blocking Scheme B
1 2
3 4
5 6
7 8
47
Results from 100 Trials
[Figure panels: First Blocking Scheme, Completely Randomized, Second Blocking Scheme]
48
Controlling for Variability
  • Blocking reduces variability in the estimated
    mean difference
  • Homogeneous blocks are desirable
  • Randomization evens out the effects of extraneous
    variables
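
The effect of blocking on the estimated mean difference can be sketched as below. Several things here are assumptions, since the transcript does not preserve the actual blocking layouts or yields: the eight plots are taken to lie in four rows of two, fertility is assumed to vary by row, and the treatment effect is set to 5. Instead of 100 simulated trials, the sketch enumerates every possible assignment under each design.

```python
from itertools import combinations, product
from statistics import mean, pstdev

# ASSUMED baseline yields: plots 1-8 in four rows of two; rows differ a lot,
# the two plots within a row differ very little.
baseline = {1: 10, 2: 11, 3: 20, 4: 21, 5: 30, 6: 31, 7: 40, 8: 41}
EFFECT = 5                       # assumed true treatment effect
plots = sorted(baseline)


def estimate(treated):
    """Mean yield of treated plots minus mean yield of control plots."""
    treated = set(treated)
    t = [baseline[p] + EFFECT for p in plots if p in treated]
    c = [baseline[p] for p in plots if p not in treated]
    return mean(t) - mean(c)


# Completely randomized: any 4 of the 8 plots are treated.
crd = [estimate(t) for t in combinations(plots, 4)]

# Blocks = rows (homogeneous): one treated plot in each row.
rows = [(1, 2), (3, 4), (5, 6), (7, 8)]
row_blocked = [estimate(t) for t in product(*rows)]

# Blocks = columns (heterogeneous): two treated plots in each column.
cols = [(1, 3, 5, 7), (2, 4, 6, 8)]
col_blocked = [estimate(a + b)
               for a in combinations(cols[0], 2)
               for b in combinations(cols[1], 2)]

for label, est in [("completely randomized", crd),
                   ("blocked by row", row_blocked),
                   ("blocked by column", col_blocked)]:
    print(f"{label:22s} mean={mean(est):.2f} sd={pstdev(est):.2f}")
```

Under these assumptions all three designs are centered on the true effect, but only the blocks made of similar plots shrink the spread of the estimate; blocks that mix very different plots give no such benefit.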

49
STATISTICS SAY
  • A log was selected at random

50
Sampling Logs
  • Does choosing times at random result in a random
    sample of logs?
  • _______________________________ ?

51
Estimating Mean String Length
  • Does the sampling procedure produce a simple
    random sample?
  • How is this related to the log problem?
  • Can you suggest a better sampling method?

52
Selecting a Sample
  • Random sampling eliminates human selection bias
    so the sample will be fair and
    unbiased/representative of the population.
  • While increasing the sample size improves
    precision, this does not decrease bias.
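
The distinction between precision and bias can be worked out exactly for a small example. The string lengths and the selection mechanism below are assumptions made for illustration: each string is taken to be picked with probability proportional to its length (longer strings are easier to grab), sampling with replacement.

```python
from math import sqrt

# Hypothetical string lengths, NOT data from the presentation.
lengths = [1, 2, 3, 4, 10]
true_mean = sum(lengths) / len(lengths)

# ASSUMED selection: probability proportional to length.
total = sum(lengths)
probs = [length / total for length in lengths]

# Expected value and variance of one length-biased draw.
biased_mean = sum(p * x for p, x in zip(probs, lengths))
biased_var = sum(p * (x - biased_mean) ** 2 for p, x in zip(probs, lengths))

bias = biased_mean - true_mean       # does not depend on the sample size


def standard_error(n):
    """Standard deviation of the sample mean of n length-biased draws."""
    return sqrt(biased_var / n)


for n in (5, 50, 500):
    print(f"n={n:3d}  bias={bias:.2f}  standard error={standard_error(n):.3f}")
```

The standard error shrinks as n grows, but the bias stays the same: a larger sample from a biased procedure gives a more precise estimate of the wrong value.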

53
STATISTICS SAY
  • 45% +/- 1% of people surveyed claim to prefer
    watching soccer to baseball
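
For a sense of scale, the sketch below computes the usual large-sample margin of error for a proportion. It assumes a simple random sample and a 95% confidence level (z = 1.96); neither is stated on the slide, and the function names are hypothetical.

```python
from math import ceil, sqrt

Z_95 = 1.96   # ASSUMED 95% confidence level


def margin_of_error(p_hat, n, z=Z_95):
    """Approximate margin of error for a sample proportion."""
    return z * sqrt(p_hat * (1 - p_hat) / n)


def sample_size_for(p_hat, margin, z=Z_95):
    """Smallest n whose margin of error is at most `margin`."""
    return ceil(z ** 2 * p_hat * (1 - p_hat) / margin ** 2)


n_needed = sample_size_for(0.45, 0.01)
print("Sample size for +/- 1%:", n_needed)
print("Margin with n = 25:", round(margin_of_error(0.45, 25), 3))
```

Under these assumptions a margin of one percentage point requires a sample of several thousand people, so the sample size behind such a claim is worth asking about.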

54
Reese's Pieces
55
Reese's Pieces
  • Take sample of 25 candies
  • Sort by color
  • Calculate the proportion of orange candies in
    your sample
  • Construct a dotplot of the distribution of sample
    proportions
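
The classroom activity above can be imitated with a short simulation. The sample size of 25 comes from the slide; the true proportion of orange candies (0.45), the number of samples (200), and the random seed are assumptions chosen only for illustration.

```python
import random
from collections import Counter

SAMPLE_SIZE = 25           # from the slide
ASSUMED_P_ORANGE = 0.45    # hypothetical population proportion
NUM_SAMPLES = 200          # hypothetical number of samples drawn

rng = random.Random(2024)  # fixed seed so the run is repeatable


def sample_proportion(n, p):
    """Proportion of orange candies in one random sample of n candies."""
    orange = sum(rng.random() < p for _ in range(n))
    return orange / n


proportions = [sample_proportion(SAMPLE_SIZE, ASSUMED_P_ORANGE)
               for _ in range(NUM_SAMPLES)]

# Text dotplot: one dot per sample at each observed proportion.
for value, count in sorted(Counter(proportions).items()):
    print(f"{value:.2f} " + "." * count)
```

The samples do not all give the same proportion, but the dotplot piles up around the population value, which is the pattern the discussion questions on the next slide ask about.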

56
Reese's Pieces
  • Did everyone obtain the same sample result?
  • Is there a pattern to the sample results?
  • Is it possible to make predictions about the
    population based on only one sample?
  • Can you be confident of your prediction?

57
Thank You
  • bchance_at_calpoly.edu
  • rpeck_at_calpoly.edu