Title: Central Tendency
1Central Tendency
2Numerical Data Properties & Measures
- Central Tendency: Mean, Median, Mode
- Variation: Range, Interquartile Range, Variance, Standard Deviation
- Relative Standing: Percentiles, Z-scores
3Mean
- Measure of central tendency
- Most common measure
- Acts as balance point
- Affected by extreme values (outliers)
- Formula (sample mean): $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
4Mean Example
- Raw Data 10.3 4.9 8.9 11.7 6.3 7.7
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + X_3 + X_4 + X_5 + X_6}{6} = \frac{10.3 + 4.9 + 8.9 + 11.7 + 6.3 + 7.7}{6} = 8.30$$
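The mean calculation for this raw data can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `sample_mean` and the extra value `100.0` used to show the effect of an outlier are assumptions made for the example.

```python
# Sample mean: sum of the observations divided by the sample size n.
raw_data = [10.3, 4.9, 8.9, 11.7, 6.3, 7.7]


def sample_mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


mean_value = sample_mean(raw_data)
print(round(mean_value, 2))

# Hypothetical outlier (not in the slide data) to show how one extreme
# value pulls the mean away from the bulk of the observations.
mean_with_outlier = sample_mean(raw_data + [100.0])
print(round(mean_with_outlier, 2))
```

Adding a single extreme value moves the mean well above every original observation, which is what "affected by extreme values" means in practice.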
6Median
- Measure of central tendency
- Middle value in ordered sequence
- If n is odd, middle value of sequence
- If n is even, average of 2 middle values
- Position of median in ordered sequence: Positioning Point $= \frac{n + 1}{2}$
- Not affected by extreme values
7Median Example Odd-Sized Sample
- Raw Data 24.1 22.6 21.5 23.7 22.6
- Ordered 21.5 22.6 22.6 23.7 24.1
- Position 1 2 3 4 5
$$\text{Positioning Point} = \frac{n + 1}{2} = \frac{5 + 1}{2} = 3.0$$
$$\text{Median} = 22.6$$
8Median Example Even-Sized Sample
- Raw Data 10.3 4.9 8.9 11.7 6.3 7.7
- Ordered 4.9 6.3 7.7 8.9 10.3 11.7
- Position 1 2 3 4 5 6
$$\text{Positioning Point} = \frac{n + 1}{2} = \frac{6 + 1}{2} = 3.5$$
$$\text{Median} = \frac{7.7 + 8.9}{2} = 8.30$$
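The odd-sized and even-sized cases can be handled by one small routine. The sketch below follows the positioning-point rule $(n + 1)/2$ from the slides; the function name `median_by_position` is an assumption made for illustration.

```python
def median_by_position(values):
    """Return (positioning point, median) for a non-empty sequence.

    The positioning point (n + 1) / 2 is 1-based, as on the slides.
    """
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2
    if n % 2 == 1:
        # Odd n: the positioning point is a whole number, so take that value.
        return position, ordered[int(position) - 1]
    # Even n: the positioning point falls between two values, so average them.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return position, (lower + upper) / 2


odd_sample = [24.1, 22.6, 21.5, 23.7, 22.6]
even_sample = [10.3, 4.9, 8.9, 11.7, 6.3, 7.7]

odd_position, odd_median = median_by_position(odd_sample)
even_position, even_median = median_by_position(even_sample)
print(odd_position, odd_median)
print(even_position, round(even_median, 2))
```

Python's standard library also provides `statistics.median`, which applies the same odd/even rule.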
10Mode
- Measure of central tendency
- Value that occurs most often
- Not affected by extreme values
- May be no mode or several modes
- May be used for quantitative or qualitative data
11Mode Example
- No Mode: Raw Data 10.3 4.9 8.9 11.7 6.3 7.7
- One Mode: Raw Data 6.3 4.9 8.9 6.3 4.9 4.9 (mode = 4.9)
- More Than 1 Mode: Raw Data 21 28 28 41 43 43 (modes = 28 and 43)
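The three cases above can be reproduced by counting occurrences. This is a minimal sketch: the function name `modes` is hypothetical, and it adopts the convention that a data set in which every value occurs exactly once has no mode.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the most frequent values.

    Assumes a non-empty input. Returns an empty list when every value
    occurs only once (treated here as "no mode").
    """
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(value for value, count in counts.items() if count == highest)


no_mode = modes([10.3, 4.9, 8.9, 11.7, 6.3, 7.7])
one_mode = modes([6.3, 4.9, 8.9, 6.3, 4.9, 4.9])
two_modes = modes([21, 28, 28, 41, 43, 43])
print(no_mode, one_mode, two_modes)
```

Because the routine only counts equal values, it would work for qualitative data (for example, a list of category labels that can be sorted) as well as for numbers.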
12Thinking Challenge
- You're a financial analyst for Prudential-Bache Securities. You have collected the following closing stock prices of new stock issues: 17, 16, 21, 18, 13, 16, 12, 11.
- Describe the stock prices in terms of central tendency.
13Central Tendency Solution
- Mean
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_8}{8} = \frac{17 + 16 + 21 + 18 + 13 + 16 + 12 + 11}{8} = 15.5$$
14Central Tendency Solution
- Median
- Raw Data 17 16 21 18 13 16 12 11
- Ordered 11 12 13 16 16 17 18 21
- Position 1 2 3 4 5 6 7 8
$$\text{Positioning Point} = \frac{n + 1}{2} = \frac{8 + 1}{2} = 4.5$$
$$\text{Median} = \frac{16 + 16}{2} = 16$$
15Central Tendency Solution
- Mode
- Raw Data 17 16 21 18 13 16 12 11
- Mode = 16
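The three answers for the stock prices can be checked with Python's standard `statistics` module. This is a quick verification sketch rather than part of the original solution; the variable names are chosen for illustration.

```python
import statistics

# Closing stock prices from the Thinking Challenge.
prices = [17, 16, 21, 18, 13, 16, 12, 11]

mean_price = statistics.mean(prices)      # sum of prices divided by n
median_price = statistics.median(prices)  # average of the 4th and 5th ordered values
mode_price = statistics.mode(prices)      # most frequent value

print(mean_price, median_price, mode_price)
```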
16Summary of Central Tendency Measures
Measure | Formula | Description
Mean | $\sum X_i / n$ | Balance point
Median | Position $(n + 1)/2$ | Middle value when ordered
Mode | none | Most frequent
17Shape
18Shape
- Describes how data are distributed
- Measures of Shape
- Skew (symmetry)
- Left-Skewed: Mean < Median
- Symmetric: Mean = Median
- Right-Skewed: Median < Mean
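The relationship between mean and median gives a rough hint about the direction of skew. The sketch below is only a rule-of-thumb check, not a formal skewness measure; the function name `skew_hint` and the two small illustrative data sets are assumptions, while the stock prices come from the Thinking Challenge.

```python
import statistics


def skew_hint(values):
    """Compare mean and median to suggest the direction of skew."""
    mean_value = statistics.mean(values)
    median_value = statistics.median(values)
    if mean_value < median_value:
        return "left-skewed"
    if mean_value > median_value:
        return "right-skewed"
    return "symmetric"


# Stock prices from the Thinking Challenge: mean 15.5, median 16.
stock_hint = skew_hint([17, 16, 21, 18, 13, 16, 12, 11])

# Hypothetical data sets, for illustration only.
symmetric_hint = skew_hint([1, 2, 3, 4, 5])
right_hint = skew_hint([1, 2, 3, 4, 20])

print(stock_hint, symmetric_hint, right_hint)
```

With small samples the difference between mean and median can be slight, so the result is best read as a suggestion to look at a plot of the data.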
19Variation
21Range
- Measure of dispersion
- Difference between largest and smallest observations: Range $= X_{largest} - X_{smallest}$
- Ignores how data are distributed
- [Figure: two data sets plotted on a scale from 7 to 10 with different distributions; in both, Range = 10 - 7 = 3]
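A short sketch shows why the range ignores how the data are distributed. The two data sets below are hypothetical (the slide's plotted values are not recoverable from the text); they share the same smallest and largest observations but are spread very differently.

```python
def data_range(values):
    """Range = largest observation minus smallest observation."""
    return max(values) - min(values)


# Hypothetical data sets, both spanning 7 to 10.
evenly_spread = [7, 8, 8, 9, 9, 10]
clustered_high = [7, 10, 10, 10, 10, 10]

range_even = data_range(evenly_spread)
range_clustered = data_range(clustered_high)
print(range_even, range_clustered)
```

Both calls return the same value even though the second data set is concentrated at 10, which is the limitation the slide points out.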
23Variance & Standard Deviation
- Measures of dispersion
- Most common measures
- Consider how data are distributed
[Figure: observations plotted on a scale from 4 to 12, with the mean $\bar{X} = 8.3$ marked]
24Sample Variance Formula
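$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n - 1}$$
- $n - 1$ in denominator! (Use $N$ if population variance.)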
25Sample Standard Deviation Formula
$$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n - 1}}$$
26Variance Example
- Raw Data 10.3 4.9 8.9 11.7 6.3 7.7
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \quad \text{where} \quad \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = 8.3$$
$$S^2 = \frac{(10.3 - 8.3)^2 + (4.9 - 8.3)^2 + \cdots + (7.7 - 8.3)^2}{6 - 1} = 6.368$$
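The variance example can be traced step by step in code. This is a minimal sketch using the slide's raw data; the variable names are chosen for illustration, and the comparison against `statistics.variance` (which also divides by n - 1) is included only as a cross-check.

```python
import statistics

raw_data = [10.3, 4.9, 8.9, 11.7, 6.3, 7.7]

n = len(raw_data)
mean_value = sum(raw_data) / n

# Squared deviation of each observation from the sample mean.
squared_deviations = [(x - mean_value) ** 2 for x in raw_data]

# Sample variance divides by n - 1; a population variance would divide by N.
sample_variance = sum(squared_deviations) / (n - 1)
population_style_variance = sum(squared_deviations) / n

print(round(sample_variance, 3))
print(round(statistics.variance(raw_data), 3))
```

Dividing by n instead of n - 1 gives a smaller number, which is why the slides stress the denominator.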
27Thinking Challenge
- You're a financial analyst for Prudential-Bache Securities. You have collected the following closing stock prices of new stock issues: 17, 16, 21, 18, 13, 16, 12, 11.
- What are the variance and standard deviation of the stock prices?
28Variation Solution
- Sample Variance
- Raw Data 17 16 21 18 13 16 12 11
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \quad \text{where} \quad \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = 15.5$$
$$S^2 = \frac{(17 - 15.5)^2 + (16 - 15.5)^2 + \cdots + (11 - 15.5)^2}{8 - 1} = 11.14$$
29Variation Solution
- Sample Standard Deviation
$$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} = \sqrt{11.14} = 3.34$$
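Both answers for the stock prices can be verified with the standard `statistics` module, whose `variance` and `stdev` functions use the sample (n - 1) denominator. This is a verification sketch; the variable names are illustrative.

```python
import statistics

# Closing stock prices from the Thinking Challenge.
prices = [17, 16, 21, 18, 13, 16, 12, 11]

price_variance = statistics.variance(prices)  # sample variance, divides by n - 1
price_std_dev = statistics.stdev(prices)      # square root of the sample variance

print(round(price_variance, 2), round(price_std_dev, 2))
```

The standard deviation is in the same units as the prices themselves, while the variance is in squared units, which is one reason the standard deviation is usually the figure reported.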
30Summary of Variation Measures
Measure | Formula | Description
Range | $X_{largest} - X_{smallest}$ | Total spread
Standard Deviation (Sample) | $\sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$ | Dispersion about sample mean
Standard Deviation (Population) | $\sqrt{\frac{\sum (X_i - \mu)^2}{N}}$ | Dispersion about population mean
Variance (Sample) | $\frac{\sum (X_i - \bar{X})^2}{n - 1}$ | Squared dispersion about sample mean
31Interpreting Standard Deviation
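32Interpreting Standard Deviation: Chebyshev's Theorem
- Applies to a data set of any shape
- For any $k > 1$, at least $1 - \frac{1}{k^2}$ of the measurements lie within $k$ standard deviations of the mean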
34Chebyshev's Theorem Example
- Previously we found that the mean closing stock price of new stock issues is 15.5 and the standard deviation is 3.34.
- Use this information to form an interval that will contain at least 75% of the closing stock prices of new stock issues.
35Chebyshev's Theorem Example
- At least 75% of the closing stock prices of new stock issues will lie within 2 standard deviations of the mean.
- $\bar{x} = 15.5$, $s = 3.34$
- $(\bar{x} - 2s,\ \bar{x} + 2s) = (15.5 - 2 \times 3.34,\ 15.5 + 2 \times 3.34) = (8.82,\ 22.18)$
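The interval calculation generalizes to any number of standard deviations. The sketch below uses the standard statement of Chebyshev's theorem (for k > 1, at least 1 - 1/k² of the data lie within k standard deviations of the mean); the function name `chebyshev_interval` is hypothetical.

```python
def chebyshev_interval(mean, std_dev, k):
    """Return ((low, high), minimum fraction) for k standard deviations.

    Chebyshev's bound 1 - 1/k**2 is only informative for k > 1.
    """
    if k <= 1:
        raise ValueError("k must be greater than 1")
    min_fraction = 1 - 1 / k ** 2
    return (mean - k * std_dev, mean + k * std_dev), min_fraction


# Values from the stock price example: mean 15.5, standard deviation 3.34.
(low, high), fraction = chebyshev_interval(15.5, 3.34, 2)
print(round(low, 2), round(high, 2), fraction)
```

With k = 2 the guaranteed fraction is 0.75, which matches the "at least 75%" interval on the slide.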
36Interpreting Standard Deviation: Empirical Rule
- Applies to data sets that are mound shaped and symmetric
- Approximately 68% of the measurements lie in the interval $\mu - \sigma$ to $\mu + \sigma$
- Approximately 95% of the measurements lie in the interval $\mu - 2\sigma$ to $\mu + 2\sigma$
- Approximately 99.7% of the measurements lie in the interval $\mu - 3\sigma$ to $\mu + 3\sigma$
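37Interpreting Standard Deviation: Empirical Rule
[Figure: mound-shaped, symmetric distribution with the horizontal axis marked at $\mu - 3\sigma$, $\mu - 2\sigma$, $\mu - \sigma$, $\mu$, $\mu + \sigma$, $\mu + 2\sigma$, $\mu + 3\sigma$]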
38Empirical Rule Example
- Previously we found that the mean closing stock price of new stock issues is 15.5 and the standard deviation is 3.34. If we can assume the data are symmetric and mound shaped, calculate the percentage of the data that lie within the intervals $\bar{x} \pm s$, $\bar{x} \pm 2s$, and $\bar{x} \pm 3s$.
39Empirical Rule Example
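One way to work this example is to compute each interval and pair it with the approximate percentage given by the Empirical Rule. This is a sketch under the stated assumption that the data are mound shaped and symmetric; the names `EMPIRICAL_SHARES` and `empirical_intervals` are hypothetical.

```python
# Approximate percentages from the Empirical Rule, keyed by the number
# of standard deviations on each side of the mean.
EMPIRICAL_SHARES = {1: 68.0, 2: 95.0, 3: 99.7}


def empirical_intervals(mean, std_dev):
    """Return {k: (low, high, approximate percent)} for k = 1, 2, 3."""
    return {
        k: (mean - k * std_dev, mean + k * std_dev, share)
        for k, share in EMPIRICAL_SHARES.items()
    }


# Values from the stock price example: mean 15.5, standard deviation 3.34.
intervals = empirical_intervals(15.5, 3.34)
for k, (low, high, share) in intervals.items():
    print(f"within {k} s: ({low:.2f}, {high:.2f}) holds about {share}%")
```

Unlike Chebyshev's theorem, these percentages are approximations that depend on the shape assumption, so they should not be applied to strongly skewed data.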