Lecture 5: How much variety is there? Measures of variation


1
Lecture 5: How much variety is there? Measures of variation
  • Sociology 549
  • Paul von Hippel

2
Measures of variation
  • Sensitive (to extreme values)
      • standard deviation ($s$)
      • variance ($s^2$)
      • range
  • Robust
      • interquartile range (IQR)
  • Standard deviation
      • is basis for standardization

3
Center vs. variation
  • Two distributions can have same center
  • But differ with respect to variation
  • 2 basketball teams
      • Clippers
      • Knicks
  • Similar mean height, between 6'6" and 6'7"
  • But they don't match up well

4
Basketball teams: Mean heights
5
Variance and standard deviation
  • Most common measures are
      • Variance ($s^2$)
      • Standard deviation ($s$)
  • To understand them
      • Must understand deviation

6
Deviations from the mean
  • Deviation from the mean is $Y - \bar{Y}$
      • $Y$ is the value for a particular case
      • $Y = 65$ inches for Earl Boykins
      • $\bar{Y}$ ("Y bar") is the mean over all the cases
  • Deviation $= 65 - 78.54 = -13.54$ for Earl Boykins
      • Interpretation: He is 13.54 inches shorter than the
        team mean

7
Variance: Calculation
  • Variance is the mean of the squared deviations.
  • Formula: $s^2 = \frac{\sum (Y - \bar{Y})^2}{N - 1}$
  • Steps
      • Calculate the deviation for each case
      • Square each deviation
      • Sum the squared deviations
      • Divide by N-1 (not N)
  • Remember: N is the sample size, the number of cases
  • Why N-1?
      • If N = 1
      • you can't see variation
      • and you can't divide by N-1 = 0
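The four steps above can be sketched in Python. This is a minimal illustration, not the lecture's own code: the function name `sample_variance` and the five heights are hypothetical (they are not the actual Clippers roster).

```python
import statistics


def sample_variance(values):
    """Sample variance: sum of squared deviations divided by N - 1."""
    n = len(values)
    if n < 2:
        # With N = 1 there is no variation to see, and N - 1 = 0.
        raise ValueError("need at least two cases")
    mean = sum(values) / n
    deviations = [y - mean for y in values]   # step 1: deviation for each case
    squared = [d ** 2 for d in deviations]    # step 2: square each deviation
    total = sum(squared)                      # step 3: sum the squared deviations
    return total / (n - 1)                    # step 4: divide by N - 1 (not N)


# Hypothetical heights in inches, chosen only for illustration.
heights = [65, 75, 78, 80, 84]
result = sample_variance(heights)
print(result)
# Cross-check against the standard library, which also divides by N - 1.
print(statistics.variance(heights))
```

The standard library's `statistics.variance` uses the same N-1 divisor, so the two printed values should agree up to rounding.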

8
Variance: Example
  • Note influence of extreme cases (esp. Boykins)

9
Variance: Interpretation
  • More variety → larger variance
  • Beyond that, not easy to interpret
  • Variance is in squared units
  • Need to un-square them

10
Standard deviation: Calculation
  • How can we un-square the variance?
  • Take the square root!
  • Square root of variance is standard deviation
  • Example: For the Clippers,
      • $s = (23.10)^{1/2} = 4.81$ inches
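The Clippers example can be checked in a couple of lines. The variance 23.10 comes from the slide; the variable names are illustrative only.

```python
import math

variance = 23.10           # Clippers variance, in inches squared (from the slide)
sd = math.sqrt(variance)   # the square root puts it back in inches
print(round(sd, 2))
```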

11
Standard deviation: Interpretation
  • Variance is in squared units
      • Variance of Clipper heights is 23.10
        inches-squared
  • Standard deviation is in original units
      • SD of Clipper heights = 4.81 inches
  • Deviations from mean also in inches
      • Boykins's deviation = -13.54 inches
      • Can compare
  • Standard deviation is a "standard" to which
    deviations are compared

12
Standardization: Deviation vs. standard deviation
  • Earl Boykins has a deviation of -13.54 inches
  • The standard deviation is 4.81 inches
  • So Earl Boykins is -13.54/4.81 = -2.81 standard
    deviations from the mean height for his team
  • This is a standard score, or Z score
  • General formula: $Z = \frac{Y - \bar{Y}}{s_Y}$
  • Interpretation
      • The case is Z standard deviations from the mean
      • E.g., Boykins is 2.81 standard deviations below
        the mean height
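As a small sketch of standardization, the snippet below recomputes the Boykins example from the slide's numbers (height 65, team mean 78.54, SD 4.81). The helper name `z_score` is hypothetical.

```python
def z_score(y, mean, sd):
    """Standard score: how many standard deviations y lies from the mean."""
    return (y - mean) / sd


# Values taken from the slides: Boykins's height, team mean, team SD (inches).
z = z_score(65, 78.54, 4.81)
print(round(z, 2))  # negative, because he is below the team mean
```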

13
Standard scores: Interpretation
  • Extreme values → extreme standard scores
  • It's rare to find Z > 2 or Z < -2

14
Exercise
  • For exam scores below
      • mean = 70.2
      • standard deviation = 25
  • Calculate and interpret the standard score of
    the most extreme value.

15
Reversing standardization
  • Given
      • standard score $Z$
      • mean $\bar{Y}$
      • standard deviation $s_Y$
  • You can get back the raw score: $Y = \bar{Y} + Z s_Y$
  • This is just a rearrangement of the
    standardization formula

16
Reversing standardization: Example
  • Earl Boykins is 2.81 standard deviations below
    the mean for his team.
  • His team has a mean height of 78.54 inches and a
    standard deviation of 4.81 inches.
  • What is Earl Boykins's height again?
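One way to work this example is to rearrange the standardization formula so that the raw score equals the mean plus Z times the standard deviation. The function name `raw_score` is hypothetical; the numbers are the ones on the slide.

```python
def raw_score(z, mean, sd):
    """Reverse standardization: raw score = mean + z * sd."""
    return mean + z * sd


# "2.81 standard deviations below the mean" means z = -2.81.
height = raw_score(-2.81, 78.54, 4.81)
print(round(height, 1))  # close to 65 inches; the small gap is rounding in z
```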

17
Dummy variables: review
  • Suppose Y is a dummy variable
      • e.g. Y = 1 if a student is female, Y = 0 if male
  • Some proportion p have Y = 1 (female)
  • p is also the mean, i.e. $\bar{Y} = p$

18
Dummy variables: variance & SD
  • Can calculate variance (& SD) in usual way
  • But there's a shortcut
      • $s^2 \approx p(1-p)$ (exact when dividing by N rather than N-1)
      • $s = (s^2)^{1/2}$
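The shortcut can be compared with the usual calculation on a small made-up sample. The data below (6 women, 4 men) are hypothetical. The sketch shows that p(1-p) matches the variance exactly when dividing by N, and differs by a factor of N/(N-1) from the N-1 version, a gap that shrinks as N grows.

```python
import statistics

# Hypothetical sample: 1 = female, 0 = male.
y = [1] * 6 + [0] * 4
n = len(y)

p = sum(y) / n                        # proportion with Y = 1, also the mean
shortcut = p * (1 - p)                # shortcut variance
divide_by_n = statistics.pvariance(y)       # squared deviations divided by N
divide_by_n_minus_1 = statistics.variance(y)  # squared deviations divided by N - 1

print(p, shortcut, divide_by_n, divide_by_n_minus_1)
print(shortcut ** 0.5)                # shortcut standard deviation
```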

19
Dummy variance & SD: Examples
Makes sense: colleges with more gender variety
have larger variance (& SD)
20
Other measures of variation
  • In addition to variance and SD
  • Range
  • Inter-Quartile Range (IQR)

21
Range: Calculation
  • Largest minus smallest value
  • E.g., Clippers
      • shortest = 65 inches (Boykins)
      • tallest = 84 inches (Olowokandi)
      • Range = 84 - 65 = 19 inches

22
Range: Interpretation
  • Really easy
  • All the player heights fit in a 19-inch range,
    from 65 to 84 inches.
  • But
  • Sensitive to extreme values
  • Uses only extreme values!
  • Increases with N

23
Interquartile range: Motivation
  • Less sensitive to extreme values
  • Could be called a "trimmed range"
      • Range ignoring extreme scores
  • Recipe
      • 3rd quartile - 1st quartile

24
Finding the quartiles
  • Quartiles split the distribution into quarters
  • Split the distribution in half
  • at the median (2nd quartile)
  • 1st quartile = median of smaller half
  • 3rd quartile = median of larger half
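The median-of-halves recipe can be sketched as follows. The function name `quartiles` and the heights are hypothetical. For odd N, this sketch leaves the middle case out of both halves; that is an assumption about the recipe, and other conventions exist.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, median, Q3) using medians of the smaller and larger halves.

    Assumption: for odd N the middle case belongs to neither half.
    """
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two cases")
    half = n // 2
    smaller_half = data[:half]
    larger_half = data[n - half:]
    return median(smaller_half), median(data), median(larger_half)


# Hypothetical heights in inches (odd N, then even N).
odd = [72, 75, 77, 78, 80, 81, 83]
even = [72, 75, 77, 78, 80, 81, 83, 85]

q1, q2, q3 = quartiles(odd)
print(q1, q2, q3, "IQR =", q3 - q1)

q1_even, q2_even, q3_even = quartiles(even)
print(q1_even, q2_even, q3_even, "IQR =", q3_even - q1_even)
```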

25
IQR: Example with odd N
[Figure: median marked]
IQR = 4.5 inches
Interpretation: About half the players (7 of 13)
have heights within a 4.5-inch range, between 77
inches (6'5") and 81.5 inches (6'9.5").
26
IQR: Example with even N
[Figure: median marked]
IQR = 81 - 75 = 6 inches
Interpretation: About half the players (6 or 8 of
14) have heights within a 6-inch range, between 75
inches (6'3") and 81 inches (6'9").
27
Comparing measures of variation
  • Using range, variance, or SD, the Clippers look more
    variable.
  • But using IQR, the Knicks look more variable.
  • Why?

28
Influence of extreme values
One extreme height (Boykins) expands the range and
variance of the Clippers, but can't affect the IQR.
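The effect of a single extreme case can be illustrated with two made-up rosters that differ only in their shortest player. The heights are hypothetical. The quartiles here come from `statistics.quantiles` with its default "exclusive" method, which is an assumption of this sketch and can differ slightly from the median-of-halves recipe on other data.

```python
import statistics


def summarize(values):
    """Return (range, sample variance, IQR) for a list of numbers."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return max(values) - min(values), statistics.variance(values), q3 - q1


# Hypothetical heights in inches; the second list swaps in one very short player.
typical = [75, 76, 77, 78, 79, 80, 81, 82, 84]
with_extreme = [65, 76, 77, 78, 79, 80, 81, 82, 84]

range_a, var_a, iqr_a = summarize(typical)
range_b, var_b, iqr_b = summarize(with_extreme)

print("range:   ", range_a, "->", range_b)
print("variance:", round(var_a, 2), "->", round(var_b, 2))
print("IQR:     ", iqr_a, "->", iqr_b)
```

The range and variance grow when the extreme case is added, while the IQR stays the same because the quartiles do not depend on the smallest value.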
29
Formulas for frequency tables
30
IQR from a frequency table
  • Tricky to get a recipe that's always right.
  • Rough method, usually right for large N.
      • Q1 = first value with cumulative percent (c%) > 25%
      • Q3 = first value with cumulative percent (c%) > 75%
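The rough method can be sketched as a single pass over the table, accumulating the cumulative percent. The function name `quartiles_from_table` and the household counts are hypothetical; the table is stored as (value, frequency) pairs.

```python
def quartiles_from_table(table):
    """Rough quartiles from (value, frequency) pairs.

    Q1 is the first value whose cumulative percent exceeds 25;
    Q3 is the first value whose cumulative percent exceeds 75.
    """
    rows = sorted(table)
    total = sum(freq for _value, freq in rows)
    q1 = q3 = None
    cumulative = 0
    for value, freq in rows:
        cumulative += freq
        cumulative_percent = 100 * cumulative / total
        if q1 is None and cumulative_percent > 25:
            q1 = value
        if q3 is None and cumulative_percent > 75:
            q3 = value
    return q1, q3


# Hypothetical table: (number of residents, number of households).
households = [(1, 30), (2, 50), (3, 12), (4, 8)]
q1, q3 = quartiles_from_table(households)
print("Q1 =", q1, "Q3 =", q3, "IQR =", q3 - q1)
```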

31
IQR from a frequency table: Example
  • Q1 = 1, Q3 = 2, IQR = Q3 - Q1 = 1
  • Interpretation
      • More than 50% of surveyed households had 1 to 2
        residents.

32
Variance from a frequency table
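  • Data set
      • Mean: $\bar{Y} = \frac{\sum Y}{N}$
      • Variance (mean squared deviation): $s^2 = \frac{\sum (Y - \bar{Y})^2}{N - 1}$
  • Frequency table (f = frequency of each value Y, so $N = \sum f$)
      • Mean: $\bar{Y} = \frac{\sum f\,Y}{N}$
      • Variance (mean squared deviation): $s^2 = \frac{\sum f\,(Y - \bar{Y})^2}{N - 1}$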

33
Variance from frequency table: Example
  • Same answer as from raw data.
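A short check that the frequency-table route and the raw-data route agree. It assumes the usual weighted formulas: each value's contribution is multiplied by its frequency f, and the divisor stays N-1. The table and the function name `mean_and_variance_from_table` are hypothetical.

```python
import statistics


def mean_and_variance_from_table(table):
    """Mean and sample variance from (value, frequency) pairs."""
    n = sum(freq for _value, freq in table)
    mean = sum(freq * value for value, freq in table) / n
    variance = sum(freq * (value - mean) ** 2 for value, freq in table) / (n - 1)
    return mean, variance


# Hypothetical table: (number of residents, number of households).
households = [(1, 3), (2, 5), (3, 1), (4, 1)]

# Expand the table back into one row per household for comparison.
raw = [value for value, freq in households for _ in range(freq)]

table_mean, table_var = mean_and_variance_from_table(households)
print(table_mean, table_var)
print(statistics.mean(raw), statistics.variance(raw))
```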

34
Summary
  • "What's typical?" isn't the whole story
      • Lots of cases aren't typical
      • Some important cases may be very atypical
  • Measures of variation
      • Variance $s^2$, standard deviation $s$
      • IQR
      • Variance and s.d. are sensitive, IQR is robust
  • Remaining lectures use variance and s.d.
  • S.D. is basis for standardization

35
Bonus slides
36
Exercise
  • Given exam scores
  • Calculate variance, standard deviation, range and
    IQR
  • Interpret range and IQR

37
Answer
Q1 = (31 + 65)/2 = 48
Q3 = (88 + 95)/2 = 91.5
IQR = 91.5 - 48 = 43.5
  • Interpretation
  • All the scores fit in a 64-point range (31 to
    95).
  • But over half the scores fit in a 43.5-point
    range.
  • (Here even the IQR is influenced by the lowest
    score.)

38
Warning
  • There are other formulas for IQR!
  • Your textbook's is the worst (not symmetric).
  • For Excel's formula, see
    http://www.staff.city.ac.uk/r.j.gerrard/excelfaq/faq.html#qtls
  • But we won't get fussy
      • Discrepancies are small in large samples

39
Variance of a dummy: Proof
p is the proportion with Y = 1; (1-p) is the proportion with
Y = 0
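A sketch of one standard derivation, assuming the variance is taken as the plain mean of squared deviations (dividing by N rather than N-1); it may differ in detail from the argument on the original slide. A proportion p of cases has deviation 1-p, and a proportion 1-p has deviation 0-p:

```latex
\begin{align*}
\bar{Y} &= p \cdot 1 + (1-p) \cdot 0 = p \\
\frac{1}{N}\sum (Y - \bar{Y})^2
  &= p\,(1-p)^2 + (1-p)\,(0-p)^2 \\
  &= p\,(1-p)\,\bigl[(1-p) + p\bigr] \\
  &= p\,(1-p)
\end{align*}
```

With the N-1 divisor the result is multiplied by N/(N-1), which is close to 1 in large samples.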