Summary Statistics: Mean, Median, Standard Deviation, and More

Title: Introduction to Decision Analysis
Subject: Decision Analysis
Author: Michael Monticino
Description: Course to Denton Utilities

Slides: 28
Provided by: Michael4092
Learn more at: http://www.math.unt.edu
Transcript and Presenter's Notes

1
Summary Statistics: Mean, Median, Standard Deviation, and More
  • Seek simplicity and then distrust it.
  • (Dr. Monticino)

2
Assignment Sheet
  • Read Chapter 4
  • Homework 3 due Wednesday, Feb. 9th
  • Chapter 4
    • Exercise set A: 1-6, 8, 9
    • Exercise set C: 1, 2, 3
    • Exercise set D: 1-4, 8
    • Exercise set E: 4, 5, 7, 8, 11, 12
  • Quiz 2 will be over Chapter 2
  • Quiz 3 on basic summary statistic calculations:
    mean, median, standard deviation, IQR, SD units
  • If you'd like a copy of the notes, email me

3
Overview
  • Measures of central tendency
    • Mean (average)
    • Median
    • Outliers
  • Measures of dispersion
    • Standard deviation
    • Standard deviation units
    • Range
    • IQR
  • Review and applications

4
Central Tendency
  • Measures of central tendency - mean and median -
    are useful in obtaining a single number summary
    of a data set
  • Mean is the arithmetic average
  • Median is a value such that at least 50% of the
    data is less than or equal to it and at least 50%
    is greater than or equal to it

5
Example
  • Calculate the mean and median for the following data sets
    • Data set 1: 37 44 55 78 100 111 125 151 161
    • Data set 2: 37 44 55 69 90 120 125 152 157 161
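
The calculation asked for on this slide can be checked with a short sketch. It assumes the run of numbers is two data sets, split where the sorted values restart at 37; the names `data_a` and `data_b` are illustrative and not from the slides.

```python
from statistics import mean, median

# Assumed split of the slide's numbers into two data sets.
data_a = [37, 44, 55, 78, 100, 111, 125, 151, 161]
data_b = [37, 44, 55, 69, 90, 120, 125, 152, 157, 161]

for name, data in (("A", data_a), ("B", data_b)):
    # mean() is the arithmetic average; median() returns the middle value,
    # or the average of the two middle values when the count is even.
    print(name, "mean:", round(mean(data), 2), "median:", median(data))
```

Under that split, set A has mean 862/9 (about 95.78) and median 100, while set B has mean 101 and median 105.
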
6
Outliers and Robustness
  • Mean can be sensitive to outliers in data set
  • Not robust to data collection errors or a single
    unusual measurement
  • Blind calculation can give misleading results

Mean = 170.35
Median = 151

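
The data plotted on this slide are not in the transcript, so the sketch below illustrates the same point with the first data set from the earlier example plus one hypothetical recording error (161 typed as 1610). The error is invented purely for illustration.

```python
from statistics import mean, median

clean = [37, 44, 55, 78, 100, 111, 125, 151, 161]

# Hypothetical data collection error: the last value is mistyped as 1610.
with_error = clean[:-1] + [1610]

# A single bad value pulls the mean far upward but leaves the median alone.
print("clean:     ", round(mean(clean), 2), median(clean))
print("with error:", round(mean(with_error), 2), median(with_error))
```

The mean jumps from about 95.78 to about 256.78, while the median stays at 100. That is the sense in which the median is robust and the mean is not.
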
7
Outliers and Robustness
  • Always a good idea to plot data in the order that
    it was collected
  • Spot outliers
  • Identify possible data collection errors

Mean without outliers = 150.14
Median without outliers = 149

8
Outliers and Robustness
  • Median can be a more robust measure of central
    tendency than mean
  • Life expectancy
    • U.S. males: mean 80.1, median 83
    • U.S. females: mean 84.3, median 87
  • Household income
    • Mean $51,855, median $38,885
    • 0.3% account for 12% of income
  • Net worth
    • Mean $282,500, median $71,600

9
Which Central Tendency Measure?
  • Calculate mean, median and mode
  • Plot data
  • Create histogram to inspect mode(s)
  • Do not delete data points
  • If you analyze the data without outliers, report
    and explain the outliers
  • Many statistical studies involve studying the
    difference between population means
  • Reporting the mean may be dictated by objective
    of study

10
Which Central Tendency Measure?
  • If data is
  • Unimodal
  • Fairly symmetric
  • Mean is approximately equal to median
  • Then mean is a reasonable measure of central
    tendency

11
Which Central Tendency Measure?
  • If data is
  • Unimodal
  • Asymmetric
  • Then report both median and mean
  • Difference between mean and median indicates
    asymmetry
  • Median will usually be the more reasonable
    summary of central tendency

12
Which Central Tendency Measure?
  • If data is
  • Not unimodal
  • Then report the modes and, cautiously, the mean and median
  • Analyze data for differences in groups around the
    modes

13
Limitations of Central Tendency
  • Any single number summary may not adequately
    represent data and may hide differences between
    data sets
  • Example

14
Measures of Dispersion
  • Including an additional statistic - a measure of
    dispersion - can help distinguish between data
    sets which have similar central tendencies
  • Range = max - min
  • Standard deviation = root mean square difference
    from the mean
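
A minimal sketch of the two definitions above. The data sets `tight` and `wide` are hypothetical, chosen to have the same mean but different spread. The SD here divides by n, matching "root mean square difference from the mean"; some software divides by n - 1 instead.

```python
from math import sqrt

def data_range(values):
    """Range = max - min."""
    return max(values) - min(values)

def sd(values):
    """Root mean square difference from the mean (divides by n)."""
    m = sum(values) / len(values)
    return sqrt(sum((x - m) ** 2 for x in values) / len(values))

# Hypothetical data sets: both have mean 50, but very different dispersion.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

for name, data in (("tight", tight), ("wide", wide)):
    print(name, "range:", data_range(data), "SD:", round(sd(data), 2))
```

The mean alone cannot tell these two sets apart. The range (4 versus 80) and the SD (about 1.41 versus about 28.28) can.
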

15
Measures of Dispersion
  • Examples
  • Range

16
Measures of Dispersion
  • Examples
  • Standard deviation

17
Measures of Dispersion
  • Both range and standard deviation can be
    sensitive to outliers
  • However, many data sets can be characterized by
    mean and SD
  • If the values of the data set are distributed in
    an approximately bell shape, then about 68% of the
    data will be within 1 SD unit of the mean, 95% will
    be within 2 SD units, and nearly all will be within
    3 SD units
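
The percentages in this rule can be checked against an exact normal distribution. This is a sketch using Python's standard `statistics.NormalDist`; the helper name `fraction_within` is illustrative. Real data that is only approximately bell shaped will match these figures only approximately.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1.
standard = NormalDist(mu=0.0, sigma=1.0)

def fraction_within(k):
    """Fraction of a normal distribution within k SD units of the mean."""
    return standard.cdf(k) - standard.cdf(-k)

for k in (1, 2, 3):
    print(k, "SD units:", round(fraction_within(k), 4))
```

This gives roughly 0.6827, 0.9545 and 0.9973, which the slide rounds to 68%, 95% and "nearly all".
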

18
Measures of Dispersion
  • Example
  • Suppose data set has mean 35 and SD 7
  • How many SD units away from the mean is 42?
  • How many SD units away from the mean is 38?
  • How many SD units away from the mean is 30?
  • Assuming a bell-shaped distribution, 95% of the
    data are between what two values?
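
The questions on this slide can be worked as follows. The function name `sd_units` is illustrative. The 95% interval uses the bell-shape rule of thumb: mean plus or minus 2 SD.

```python
def sd_units(value, mean_value, sd_value):
    """Signed number of SD units that value lies from the mean."""
    return (value - mean_value) / sd_value

mean_value, sd_value = 35, 7

for v in (42, 38, 30):
    # Positive means above the mean, negative means below it.
    print(v, "is", round(sd_units(v, mean_value, sd_value), 2), "SD units from the mean")

# Rule of thumb for bell-shaped data: about 95% lies within 2 SD units.
low, high = mean_value - 2 * sd_value, mean_value + 2 * sd_value
print("about 95% between", low, "and", high)
```

So 42 is 1 SD unit above the mean, 38 is about 0.43 above, 30 is about 0.71 below, and about 95% of the data lie between 21 and 49.
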

19
Measures of Dispersion
  • A robust measure of dispersion is the
    interquartile range
  • Q1 = value such that 25% of the data is less than
    it and 75% is greater
  • Q3 = value such that 75% of the data is less than
    it and 25% is greater
  • IQR = Q3 - Q1

20
Example
  • Calculate range, standard deviation and
    interquartile range for the following data sets

    • Data set 1: 1 98 99 100 100 100 102 102 104 107
    • Data set 2: 95 98 99 100 100 100 102 102 104 107

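
A sketch of this exercise, assuming the numbers form two data sets of ten values each that differ only in the first value (1 versus 95). Quartile conventions vary between textbooks and software. The helper `quartiles_by_halves` takes the medians of the lower and upper halves of the sorted data, which is one common convention and not necessarily the one used in the course.

```python
from statistics import median, pstdev

def quartiles_by_halves(values):
    """Q1 and Q3 as medians of the lower and upper halves (needs 2+ values).

    With an odd count, the middle value is left out of both halves.
    """
    s = sorted(values)
    half = len(s) // 2
    return median(s[:half]), median(s[-half:])

set_1 = [1, 98, 99, 100, 100, 100, 102, 102, 104, 107]
set_2 = [95, 98, 99, 100, 100, 100, 102, 102, 104, 107]

for name, data in (("set 1", set_1), ("set 2", set_2)):
    q1, q3 = quartiles_by_halves(data)
    # pstdev divides by n: the root mean square difference from the mean.
    print(name, "range:", max(data) - min(data),
          "SD:", round(pstdev(data), 2), "IQR:", q3 - q1)
```

The single outlier moves the range from 12 to 106 and the SD from about 3.13 to about 30.20, while the IQR is 3 for both sets under this convention.
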
21
Assignment, Discussion, Evaluation
  • Read Chapter 4
  • Discussion problems
  • Chapter 4
    • Exercise set A: 1-6, 8, 9
    • Exercise set C: 1, 2, 3
    • Exercise set D: 1-4, 8
    • Exercise set E: 4, 5, 7, 8, 11, 12
  • Quiz 3 on basic summary statistic calculations:
    mean, median, standard deviation, IQR, SD units

22
Review of Definitions
  • Measures of central tendency
  • Mean (average)
  • Median
  • If odd number of data points, middle value
  • If even number of data points, average of two
    middle values
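
The odd and even cases of the median definition can be written out directly. This is a minimal sketch. The function name `median_of` and the sample values are hypothetical.

```python
def median_of(values):
    """Middle value if the count is odd, else average of the two middle values."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2

print(median_of([5, 1, 3]))      # odd count: the middle value of 1, 3, 5
print(median_of([4, 1, 3, 2]))   # even count: average of 2 and 3
```

The data must be sorted first. Taking the middle element of unsorted data is a common mistake.
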

23
Questions and Examples
  • Can mean be larger than median? Can median be
    larger than mean?
  • Give examples
  • Can mean be a negative number? Can the median?
  • The average height of three men is 69 inches.
    Two other men, of heights 73 and 70 inches, enter
    the room. What is the average height of all five
    men?
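
The height question can be worked by recovering the total from the average. This is a sketch, and the function name `combined_average` is illustrative.

```python
def combined_average(avg_existing, n_existing, new_values):
    """Average after adding new_values to a group known only by its average."""
    # The existing total is the average times the count: 69 * 3 = 207.
    total = avg_existing * n_existing + sum(new_values)
    return total / (n_existing + len(new_values))

print(combined_average(69, 3, [73, 70]))  # (207 + 73 + 70) / 5
```

The individual heights of the first three men are not needed. Their average and count determine their total, so the new average is 350 / 5 = 70 inches.
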

24
Questions and Examples
  • The average of a data set is 30.
  • A value of 8 is added to each element in the data
    set. What is the new average?
  • Each element of the data set is increased by 5%.
    What is the new average?
  • Suppose that data consists of only 1s and 0s
  • What does the average represent?
  • Application: an experiment is performed and only
    two outcomes can occur
  • Label one type of outcome 1 and the other 0
  • For the data set 31, 45, 72, 86, 62, 78, 50, find
    the median, Q1 (25th percentile) and Q3 (75th
    percentile)
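
A sketch working through the questions on this slide. The data sets `data` and `outcomes` are hypothetical stand-ins, since the slide gives only the average. The second question is treated here as a 5% increase, which is an assumption about the slide's intent. The quartiles use the median-of-halves convention, one of several in use.

```python
from statistics import mean, median

# Hypothetical data set whose average is 30.
data = [20, 30, 40]
shifted = [x + 8 for x in data]     # add 8 to every element
scaled = [x * 1.05 for x in data]   # assumed reading: a 5% increase
print(mean(shifted))                # the average shifts by the same 8
print(round(mean(scaled), 2))       # the average scales by the same 1.05

# Hypothetical 0/1 outcomes: the average is the fraction of 1s.
outcomes = [1, 0, 1, 1, 0]
print(mean(outcomes))

def quartiles_by_halves(values):
    """Q1 and Q3 as medians of the lower and upper halves (needs 2+ values)."""
    s = sorted(values)
    half = len(s) // 2
    return median(s[:half]), median(s[-half:])

sample = [31, 45, 72, 86, 62, 78, 50]
q1, q3 = quartiles_by_halves(sample)
print(q1, median(sample), q3)
```

Under these assumptions the new averages are 38 and 31.5, the 0/1 average is the proportion of outcomes labeled 1, and the sample has Q1 = 45, median = 62, Q3 = 78.
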

25
Review of Definitions
  • Measures of dispersion
  • Standard deviation
  • Range = max - min
  • IQR = Q3 - Q1

26
Questions and Examples
  • Can the SD be negative? Can the range? Can the
    IQR?
  • Can the SD equal 0?
  • For the data set 3,1,5,2,1,6 find the SD, range
    and IQR
  • The average weight for U.S. men is 175 lbs and
    the standard deviation is 20 lbs
  • If a man weighs 190 lbs., how many standard
    deviation units away from the mean weight is he?
  • Assuming a normal (bell-shaped) distribution for
    weight, ninety-five percent of U.S. men weigh
    between what two values?
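
The numerical questions on this slide can be checked with a short sketch. It assumes the SD divides by n (`statistics.pstdev`) and that quartiles are medians of the lower and upper halves. Other conventions give slightly different IQR values.

```python
from statistics import median, pstdev

def quartiles_by_halves(values):
    """Q1 and Q3 as medians of the lower and upper halves (needs 2+ values)."""
    s = sorted(values)
    half = len(s) // 2
    return median(s[:half]), median(s[-half:])

data = [3, 1, 5, 2, 1, 6]
q1, q3 = quartiles_by_halves(data)
sd = pstdev(data)                   # root mean square difference from the mean
data_range = max(data) - min(data)
iqr = q3 - q1
print(round(sd, 2), data_range, iqr)

# The SD is 0 exactly when every value is the same; it is never negative.
constant_sd = pstdev([4, 4, 4])

# Weight example: mean 175 lbs, SD 20 lbs.
mean_weight, sd_weight = 175, 20
units_190 = (190 - mean_weight) / sd_weight
low, high = mean_weight - 2 * sd_weight, mean_weight + 2 * sd_weight
print(units_190, low, high)
```

Here the SD is about 1.91, the range is 5 and the IQR is 4. A 190 lb man is 0.75 SD units above the mean, and about 95% of weights fall between 135 and 215 lbs under the bell-shape rule of thumb.
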

27
Questions and Examples
  • The average of a data set is 23 and the standard
    deviation is 5
  • A value of 8 is added to each element in the data
    set. What is the new standard deviation?
  • Each element of the data set is increased by 5%.
    What is the new standard deviation?
  • (Dr. Monticino)
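
A sketch of how the SD responds to shifting and scaling. The data set `[18, 28]` is hypothetical, chosen so that the mean is 23 and the SD (dividing by n) is 5. The second question is treated as a 5% increase, which is an assumption about the slide's intent.

```python
from statistics import mean, pstdev

# Hypothetical data with mean 23 and SD 5.
data = [18, 28]

shifted = [x + 8 for x in data]     # add 8 to every element
scaled = [x * 1.05 for x in data]   # assumed reading: a 5% increase

print(mean(data), pstdev(data))
print(pstdev(shifted))              # a shift moves the mean but not the spread
print(round(pstdev(scaled), 2))     # scaling multiplies the SD by the same factor
```

Adding a constant leaves the SD at 5, while a 5% increase multiplies it to 5.25. Under the alternative reading, a flat increase of 5, the SD would again stay at 5.
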