Measures of Variability - PowerPoint PPT Presentation

Slides: 38
Provided by: chris185

1
Measures of Variability
  • Range
  • Interquartile range
  • Variance
  • Standard deviation
  • Coefficient of variation

2
Consider the sample of starting salaries of
business grads. We would be interested in knowing
if there was a low or high degree of variability
or dispersion in starting salaries received.
3
Range
  • Range is simply the difference between the
    largest and smallest values in the sample
  • Range is the simplest measure of variability.
  • Note that range is highly sensitive to the
    largest and smallest values.

4
Example: Apartment Rents
Seventy studio apartments were randomly
sampled in a small college town. The monthly
rent prices for these apartments are listed in
ascending order on the next slide.
5
Range
Range = largest value - smallest value
Range = 615 - 425 = 190
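As a minimal sketch of the range calculation in Python: the 70 rent values are not reproduced in this transcript, so the example assumes the twelve starting salaries listed on the z-score slide later in the presentation. The function name `sample_range` is illustrative only.

```python
# Starting salaries taken from the z-score slide of this presentation
# (the apartment-rent data set is not reproduced in the transcript).
salaries = [2850, 2950, 3050, 2880, 2755, 2710,
            2890, 3130, 2940, 3325, 2920, 2880]

def sample_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)

print(sample_range(salaries))  # 3325 - 2710 = 615
```

Because only the two extreme values enter the calculation, changing any other salary leaves the result unchanged.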
6
Interquartile Range
  • The interquartile range of a data set is the
    difference between the third quartile and the
    first quartile.
  • It is the range for the middle 50% of the data.
  • It overcomes the sensitivity to extreme data
    values.

7
Interquartile Range
3rd Quartile (Q3) = 525
1st Quartile (Q1) = 445
Interquartile Range = Q3 - Q1 = 525 - 445 = 80
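The quartile calculation can be sketched in Python under an assumed percentile convention: compute the index i = (p/100)n on the sorted data, round i up when it is not an integer, and average the values at positions i and i + 1 when it is. Other conventions exist and give slightly different quartiles. The rent data are not in this transcript, so the sketch uses the starting salaries from the z-score slide; `percentile` and `interquartile_range` are illustrative names.

```python
salaries = [2850, 2950, 3050, 2880, 2755, 2710,
            2890, 3130, 2940, 3325, 2920, 2880]

def percentile(values, p):
    """pth percentile (integer p, 0 < p < 100) using the index i = (p/100) * n.

    Assumed convention: a non-integer i is rounded up; an integer i means
    averaging the values at positions i and i + 1 (1-based positions).
    """
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    data = sorted(values)
    whole, remainder = divmod(p * len(data), 100)  # integer arithmetic, no float error
    if remainder:
        return data[whole]                      # rounded-up position whole + 1
    return (data[whole - 1] + data[whole]) / 2  # positions i and i + 1

def interquartile_range(values):
    return percentile(values, 75) - percentile(values, 25)

print(percentile(salaries, 25), percentile(salaries, 75))  # 2865.0 3000.0
print(interquartile_range(salaries))                       # 135.0
```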
8
Variance
  • The variance is a measure of variability that
    uses all the data.
  • The variance is based on the difference between
    each observation (xi) and the mean (x̄ for
    the sample and µ for the population).

9
The variance is the average of the squared
differences between the observations and the mean
value.
For the population: σ² = Σ(xi - µ)² / N
For the sample: s² = Σ(xi - x̄)² / (n - 1)
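A short sketch of both variance definitions, written out from first principles. It assumes the starting salaries from the z-score slide as example data; the function names are illustrative. Note that the sample version divides by n - 1 rather than n, so it is not quite a plain average.

```python
salaries = [2850, 2950, 3050, 2880, 2755, 2710,
            2890, 3130, 2940, 3325, 2920, 2880]

def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

print(round(population_variance(salaries), 2))
print(round(sample_variance(salaries), 2))
```

The sample variance is always the larger of the two for the same data, because the same sum of squares is divided by a smaller number.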
10
Standard Deviation
  • The Standard Deviation of a data set is the
    square root of the variance.
  • The standard deviation is measured in the same
    units as the data, making it easy to interpret.

11
Computing a standard deviation
For the population: σ = √σ²
For the sample: s = √s²
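In Python, the standard library's `statistics` module provides both versions directly. The sketch below assumes the starting salaries from the z-score slide as example data and checks that each standard deviation is the square root of the matching variance.

```python
import math
import statistics

salaries = [2850, 2950, 3050, 2880, 2755, 2710,
            2890, 3130, 2940, 3325, 2920, 2880]

s = statistics.stdev(salaries)       # sample: divides by n - 1
sigma = statistics.pstdev(salaries)  # population: divides by N

# Each standard deviation is the square root of the matching variance.
print(math.isclose(s, math.sqrt(statistics.variance(salaries))))        # True
print(math.isclose(sigma, math.sqrt(statistics.pvariance(salaries))))   # True
print(round(s, 2))  # 165.65, in the same units as the salaries
```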
12
Coefficient of Variation
Just divide the standard deviation by the mean
and multiply by 100.
Computing the coefficient of variation
For the population: (σ / µ) × 100
For the sample: (s / x̄) × 100
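A minimal sketch of the coefficient of variation, again assuming the starting salaries from the z-score slide as example data. The function name and its `population` flag are illustrative choices, not part of the presentation.

```python
import statistics

salaries = [2850, 2950, 3050, 2880, 2755, 2710,
            2890, 3130, 2940, 3325, 2920, 2880]

def coefficient_of_variation(values, population=False):
    """Standard deviation expressed as a percentage of the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values) if population else statistics.stdev(values)
    return sd / mean * 100

cv = coefficient_of_variation(salaries)
print(f"{cv:.1f}%")  # the sample standard deviation is about 5.6% of the mean
```

Because the result is a unitless percentage, it allows data sets with different means or different units to be compared.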
13
The heights (in inches) of 25 individuals were
recorded and the following statistics were
calculated: mean = 70, range = 20, mode = 73,
variance = 784, median = 74. The coefficient of
variation equals
  1. 11.2%
  2. 1120%
  3. 0.4%
  4. 40%

14
If index i (which is used to determine the
location of the pth percentile) is not an
integer, its value should be
  1. squared
  2. divided by (n - 1)
  3. rounded down
  4. rounded up

15
Which of the following symbols represents the
variance of the population?
  1. σ²
  2. σ
  3. µ

16
Which of the following symbols represents the
size of the sample
  1. s2
  2. s
  3. N
  4. n

17
The symbol s is used to represent
  1. the variance of the population
  2. the standard deviation of the sample
  3. the standard deviation of the population
  4. the variance of the sample

18
The numerical value of the variance
  1. is always larger than the numerical value of the
    standard deviation
  2. is always smaller than the numerical value of the
    standard deviation
  3. is negative if the mean is negative
  4. can be larger or smaller than the numerical value
    of the standard deviation

19
If the coefficient of variation is 40% and the
mean is 70, then the variance is
  1. 28
  2. 2800
  3. 1.75
  4. 784

20
Problem 22, page 94
21
Broker-Assisted: 100 Shares at 50 per Share
Range                      45.05
Interquartile Range        23.98
Variance                   190.67
Standard Deviation         13.8
Coefficient of Variation   38.02

25th percentile index (i)  6
75th percentile index (i)  18
First quartile (Q1)        24.995
Third quartile (Q3)        48.975
Mean                       36.32
22
Online: 500 Shares at 50 per Share
Range                      57.50
Interquartile Range        11.475
Variance                   140.633
Standard Deviation         11.859
Coefficient of Variation   57.949

25th percentile index (i)
75th percentile index (i)
First quartile (Q1)        13.475
Third quartile (Q3)        24.95
Mean                       20.46
23
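Measured by the variance, standard deviation, and
interquartile range, the variability of commissions
is greater for broker-assisted trades. Relative to
the mean, however, online commissions vary more
(coefficient of variation 57.949 versus 38.02).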
24
Using Excel to Compute the Sample Variance,
Standard Deviation, and Coefficient of Variation
  • Formula Worksheet

Note: Rows 8-71 are not shown.
25
Using Excel to Compute the Sample Variance,
Standard Deviation, and Coefficient of Variation
  • Value Worksheet

Note: Rows 8-71 are not shown.
26
Using Excel's Descriptive Statistics Tool
Step 4: When the Descriptive Statistics dialog
box appears:
  • Enter B1:B71 in the Input Range box
  • Select Grouped By Columns
  • Select Labels in First Row
  • Select Output Range
  • Enter D1 in the Output Range box
  • Select Summary Statistics
  • Click OK
27
Using Excel's Descriptive Statistics Tool
  • Descriptive Statistics Dialog Box

28
Using Excel's Descriptive Statistics Tool
  • Value Worksheet (Partial)

Note: Rows 9-71 are not shown.
29
Using Excel's Descriptive Statistics Tool
  • Value Worksheet (Partial)

Note: Rows 1-8 and 17-71 are not shown.
30
Measures of Relative Location and Detecting
Outliers
  • z-scores
  • Chebyshev's Theorem
  • Detecting Outliers

By using the mean and standard deviation
together, we can learn more about the relative
location of observations in a data set
31
z-score
Here we compare the deviation from the mean of a
single observation to the standard deviation.
The z-score is computed for each xi:
zi = (xi - x̄) / s
where zi is the z-score for xi, x̄ is the sample
mean, and s is the sample standard deviation.
32
The z-score can be interpreted as the number of
standard deviations xi is from the sample mean
33
Z-scores for the starting salary data
Graduate  Starting Salary  xi - x̄  z-score
       1             2850     -90   -0.543
       2             2950      10    0.060
       3             3050     110    0.664
       4             2880     -60   -0.362
       5             2755    -185   -1.117
       6             2710    -230   -1.388
       7             2890     -50   -0.302
       8             3130     190    1.147
       9             2940       0    0.000
      10             3325     385    2.324
      11             2920     -20   -0.121
      12             2880     -60   -0.362
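The table above can be reproduced with a few lines of Python. This is a sketch using the standard `statistics` module; the function name `z_scores` is illustrative.

```python
import statistics

salaries = [2850, 2950, 3050, 2880, 2755, 2710,
            2890, 3130, 2940, 3325, 2920, 2880]

def z_scores(values):
    """z_i = (x_i - sample mean) / sample standard deviation."""
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)
    return [(x - x_bar) / s for x in values]

# Print graduate number, salary, deviation from the mean, and z-score.
x_bar = statistics.mean(salaries)
for graduate, (x, z) in enumerate(zip(salaries, z_scores(salaries)), start=1):
    print(f"{graduate:>2}  {x}  {x - x_bar:>5}  {z:7.3f}")
```

The z-scores always sum to zero, because the deviations from the mean do.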
34
Chebyshev's Theorem
At least (1 - 1/z²) of the data values must be
within z standard deviations of the mean, where z
is any value greater than 1.
This theorem enables us to make statements about
the proportion of data values that must be within
a specified number of standard deviations from
the mean.
35
Implications of Chebyshev's Theorem
  • At least .75, or 75 percent, of the data values
    must be within 2 (z = 2) standard deviations of
    the mean.
  • At least .89, or 89 percent, of the data values
    must be within 3 (z = 3) standard deviations of
    the mean.
  • At least .94, or 94 percent, of the data values
    must be within 4 (z = 4) standard deviations of
    the mean.

Note: z must be greater than 1 but need not be
an integer.
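The three figures above follow directly from the formula 1 - 1/z². A minimal sketch, with `chebyshev_lower_bound` as an illustrative name:

```python
def chebyshev_lower_bound(z):
    """Minimum proportion of values within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z ** 2

for z in (2, 3, 4):
    print(z, round(chebyshev_lower_bound(z), 2))  # 0.75, 0.89, 0.94
```

The bound holds for any data set regardless of its shape, which is why it is often far below the proportion actually observed.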
36
Chebyshev's Theorem
  • For example, let z = 1.5:

At least (1 - 1/(1.5)²) = 1 - 0.44 = 0.56, or 56%,
of the rent values must be between
x̄ - 1.5s = 409 and x̄ + 1.5s = 573.
(Actually, 86% of the rent values are between
409 and 573.)
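The same kind of comparison between the guaranteed and the observed proportion can be sketched in Python. The rent values are not reproduced in this transcript, so the example assumes the starting salaries from the z-score slide instead; `proportion_within` is an illustrative name.

```python
import statistics

salaries = [2850, 2950, 3050, 2880, 2755, 2710,
            2890, 3130, 2940, 3325, 2920, 2880]

def proportion_within(values, z):
    """Share of values lying within z sample standard deviations of the mean."""
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)
    inside = [x for x in values if abs(x - x_bar) <= z * s]
    return len(inside) / len(values)

z = 1.5
guaranteed = 1 - 1 / z ** 2                # Chebyshev's lower bound
observed = proportion_within(salaries, z)  # what the data actually show
print(f"guaranteed at least {guaranteed:.2f}, observed {observed:.2f}")
```

For the salary data, 11 of the 12 values fall inside the interval, comfortably above the guaranteed minimum.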
37
Detecting Outliers
You can use z-scores to detect extreme values in
the data set, or outliers. In the case of very
high z-scores (absolute values) it is a good idea
to recheck the data for accuracy.
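A sketch of z-score screening in Python. The slide does not state a cutoff, so the default threshold of 3 standard deviations is an assumption based on a commonly used convention. The data are the starting salaries from the z-score slide, and `find_outliers` is an illustrative name.

```python
import statistics

salaries = [2850, 2950, 3050, 2880, 2755, 2710,
            2890, 3130, 2940, 3325, 2920, 2880]

def find_outliers(values, threshold=3.0):
    """Return (value, z) pairs whose |z| exceeds the threshold.

    The default cutoff of 3 is an assumed convention; the slide only
    refers to "very high" z-scores.
    """
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)
    return [(x, (x - x_bar) / s) for x in values
            if abs(x - x_bar) / s > threshold]

print(find_outliers(salaries))               # no salary is more than 3 s from the mean
print(find_outliers(salaries, threshold=2))  # flags only the 3325 salary
```

A flagged value is a prompt to recheck the data, not proof of an error.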