Title: Section 6A Characterizing a Data Distribution
1. Section 6A: Characterizing a Data Distribution, pages 380-391
2. Definition: The distribution of a variable (or data set) describes the values taken on by the variable and the frequency (or relative frequency) of these values.
ex1/381 Eight grocery stores sell the PR energy bar for the following prices: $1.09, $1.29, $1.35, $1.79, $1.49, $1.59, $1.39, $1.29
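The definition above can be made concrete with a short Python sketch that tabulates the frequency and relative frequency of each price in ex1/381. The variable names are illustrative assumptions, not part of the original slides.

```python
from collections import Counter

# Prices (in dollars) from ex1/381: eight grocery stores.
prices = [1.09, 1.29, 1.35, 1.79, 1.49, 1.59, 1.39, 1.29]

# Frequency: how many stores charge each price.
frequency = Counter(prices)

# Relative frequency: the fraction of stores charging each price.
relative_frequency = {p: n / len(prices) for p, n in frequency.items()}

for price in sorted(frequency):
    print(f"{price:.2f}  frequency={frequency[price]}  "
          f"relative={relative_frequency[price]:.3f}")
```

Only 1.29 appears more than once, so it has frequency 2 and relative frequency 2/8 = 0.25.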
3. How do we characterize a data distribution?
Average
- Mean
- Median
- Mode
- Effect of an Outlier
- Confusion
Shape of a Distribution
- Number of Peaks
- Symmetry or Skewness
- Variation (more in section 6B)
4. What do we mean by AVERAGE?
The mean is what we most commonly call the average value: the sum of all values divided by the number of values.
The median is the middle value in the sorted data set (or halfway between the two middle values when the number of values is even).
The mode is the most common value (or group of values).
5. ex1/381 (continued) Prices of the PR energy bar at eight grocery stores
sorted: $1.09, $1.29, $1.29, $1.35, $1.39, $1.49, $1.59, $1.79
median: $1.37 (halfway between the two middle values, $1.35 and $1.39)
mode: $1.29 (the only price that occurs twice)
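As a check on the worked example, here is a minimal sketch using Python's standard `statistics` module. It also computes the mean, which the slide text does not show. The variable names are assumptions made for illustration.

```python
import statistics

# Prices (in dollars) from ex1/381.
prices = [1.09, 1.29, 1.35, 1.79, 1.49, 1.59, 1.39, 1.29]

mean_price = statistics.mean(prices)      # sum of values / number of values
median_price = statistics.median(prices)  # halfway between the two middle values
mode_price = statistics.mode(prices)      # most common value

print(f"mean   = {mean_price:.2f}")   # 1.41
print(f"median = {median_price:.2f}") # 1.37
print(f"mode   = {mode_price:.2f}")   # 1.29
```

The prices sum to 11.28, so the mean is 11.28 / 8 = 1.41.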
6. 17/389 High temperatures (°F) during a 15-day period in Alaska in March: 15, 11, 10, 9, 0, 2, 4, 5, 5, 7, 10, 12, 15, 18, 19
sorted: 0, 2, 4, 5, 5, 7, 9, 10, 10, 11, 12, 15, 15, 18, 19
median: 10 (°F)
modes: 5, 10, 15 (trimodal)
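When a data set has several modes, `statistics.mode` returns only one of them. The sketch below assumes Python 3.8 or later, where `statistics.multimode` returns every value tied for most common.

```python
import statistics

# High temperatures (°F) from 17/389, in the order given.
temps = [15, 11, 10, 9, 0, 2, 4, 5, 5, 7, 10, 12, 15, 18, 19]

# With 15 values, the median is the 8th value of the sorted list.
median_temp = statistics.median(temps)

# multimode lists all values tied for the highest count; sort for display.
modes = sorted(statistics.multimode(temps))

print(median_temp)  # 10
print(modes)        # [5, 10, 15]
```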
7. 17/389 (continued) High temperatures (°F) during a 15-day period in Alaska in March
Mean: balancing point
Median: middle point
Mode: high point(s)
9. The Effect of an Outlier
Definition: An outlier is a data value that is much higher or much lower than almost all other values.
ex/382 Five graduating seniors on a college basketball team receive the following first-year contract offers to play in the National Basketball Association: $0, $0, $0, $0, $3,500,000
???
Including an outlier can pull the mean significantly upward or downward.
Including an outlier has little or no effect on the median.
Including an outlier does not affect the mode.
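The contract-offer example can be worked through in a few lines of Python. This is a sketch with illustrative variable names, not part of the original slides.

```python
import statistics

# First-year contract offers (in dollars) from ex/382.
offers = [0, 0, 0, 0, 3_500_000]

mean_offer = statistics.mean(offers)      # pulled far upward by the one large offer
median_offer = statistics.median(offers)  # middle value of the sorted list
mode_offer = statistics.mode(offers)      # most common value

print(mean_offer)    # 700000
print(median_offer)  # 0
print(mode_offer)    # 0
```

The single large offer raises the mean to 3,500,000 / 5 = 700,000, while the median and the mode both stay at 0.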
10. The Effect of an Outlier
ex2/383 A track coach wants to determine an appropriate heart rate for her athletes during their workouts. In the middle of the workout, she reads the following heart rates (beats/min) from five athletes: 130, 135, 140, 145, 325
sorted: 130, 135, 140, 145, 325
median: 140 bpm
mode: none
Clearly 325 is an outlier. Clearly 325 is a mistake (faulty heart monitor?)
Throw out the outlier?
sorted: 130, 135, 140, 145
median: 137.5 bpm
mode: none
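To see how much the suspect reading matters, the sketch below compares the mean and median with and without the 325 bpm value. The helper `summarize` is a hypothetical name introduced here for illustration.

```python
import statistics

def summarize(values):
    """Return (mean, median) for a list of numbers."""
    return statistics.mean(values), statistics.median(values)

# Heart rates (beats/min) from ex2/383.
heart_rates = [130, 135, 140, 145, 325]

# Drop the reading the slide identifies as an outlier.
without_outlier = [r for r in heart_rates if r != 325]

mean_all, median_all = summarize(heart_rates)
mean_kept, median_kept = summarize(without_outlier)

print(mean_all, median_all)    # 175 140
print(mean_kept, median_kept)  # 137.5 137.5
```

Removing the outlier moves the mean by 37.5 bpm but the median by only 2.5 bpm.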
12. Confusion about Average
ex3/383 A newspaper surveys wages for assembly workers and reports an average of $22 per hour. The workers at one large firm immediately request a pay raise, claiming that they work as hard as workers at other companies but their average wage is only $19. The management rejects their request, telling them that they are overpaid because their average wage, in fact, is $23 per hour. Can they both be right?
One possibility: median $19, mean $23 (the workers quote the median; management quotes the mean)
salaries: 19, 19, 19, 19, outlier
salaries: 19, 19, 19, 19, 39
13. Confusion about Average
ex3/383 (continued) The same claims also hold with the two averages reversed: the workers quote the mean and management quotes the median.
median $23, mean $19
salaries: outlier, 20, 23, 23, 23
salaries: 6, 20, 23, 23, 23
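Both salary lists from ex3/383 can be checked directly. In this sketch the scenario labels are assumptions added for readability.

```python
import statistics

# Hourly wages (in dollars) from the two slides for ex3/383.
scenarios = {
    "one high outlier": [19, 19, 19, 19, 39],
    "one low outlier": [6, 20, 23, 23, 23],
}

results = {}
for label, wages in scenarios.items():
    results[label] = (statistics.mean(wages), statistics.median(wages))
    print(f"{label}: mean={results[label][0]}, median={results[label][1]}")
```

In the first list the mean is 23 and the median is 19. In the second the mean is 19 and the median is 23. Either way, both sides can truthfully report an "average" of 19 or 23, depending on which measure they choose.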
14. Confusion about Average
ex4/383 All 100 first-year students at a small
college take three courses in the Core Studies
Program. The first two courses are taught in
large lectures, with all 100 students in a single
class. The third course is taught in ten classes
of 10 students each. The students claim that the
mean size of their Core Studies classes is 70.
The administrators claim that the mean class size
is only 25 students. Explain.
Students say: my average class size is (100 + 100 + 10) / 3 = 70
(mean class size per student)
Administrators say: the average Core Studies class size is (2 × 100 + 10 × 10) / 12 = 300 / 12 = 25
(mean number of students per class)
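The two class-size figures come from averaging over different things: students versus classes. A minimal sketch with illustrative variable names:

```python
import statistics

# The 12 Core Studies classes: two lectures of 100, ten sections of 10.
class_sizes = [100, 100] + [10] * 10

# Administrators: average over classes.
mean_per_class = statistics.mean(class_sizes)

# Students: each student sits in two classes of 100 and one class of 10.
one_student_schedule = [100, 100, 10]
mean_per_student = statistics.mean(one_student_schedule)

# Same student figure, computed as a mean weighted by enrollment:
# each class counts once for every student sitting in it.
weighted_mean = sum(s * s for s in class_sizes) / sum(class_sizes)

print(mean_per_class)    # 25
print(mean_per_student)  # 70
print(weighted_mean)     # 70.0
```

The weighted form shows why students see the larger number: large classes are counted once for each of their many students.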
16. Shape of a Distribution: Use a smooth curve
17. Shape of a Distribution: Number of Peaks
19. Shape of a Distribution: Symmetry and Skewness
A distribution is symmetric if its left half is a mirror image of its right half. (Note the positioning of the mean, median, and mode: in a symmetric single-peaked distribution they coincide.)
20. Shape of a Distribution: Symmetry and Skewness
A distribution is left-skewed if its values are more spread out on the left (outliers?). (Note the positioning of the mean, median, and mode: the low values in the left tail typically pull the mean below the median.)
21. Shape of a Distribution: Symmetry and Skewness
A distribution is right-skewed if its values are more spread out on the right (outliers?). (Note the positioning of the mean, median, and mode: the high values in the right tail typically pull the mean above the median.)
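Comparing the mean with the median gives a quick rule of thumb for the direction of skew. The sketch below is a heuristic under that assumption, not a definition from this section, and it can mislead for multi-peaked or very small data sets. The function name `skew_direction` is hypothetical.

```python
import statistics

def skew_direction(values):
    """Guess the skew direction by comparing the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right-skewed"      # high values pull the mean upward
    if mean < median:
        return "left-skewed"       # low values pull the mean downward
    return "roughly symmetric"     # mean and median agree

# Data sets taken from earlier slides.
print(skew_direction([19, 19, 19, 19, 39]))   # right-skewed
print(skew_direction([6, 20, 23, 23, 23]))    # left-skewed
print(skew_direction([130, 135, 140, 145]))   # roughly symmetric
```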
22. Shape of a Distribution: Symmetry and Skewness
ex6/387 Do you expect the distribution of heights of 100(20) women to be symmetric, left-skewed, or right-skewed? Explain.
ex6/387 Do you expect the distribution of speeds of cars on a road where a visible patrol car is using radar to be symmetric, left-skewed, or right-skewed? Explain.
24. Shape of a Distribution: Variation
Variation describes how widely data values are spread out about the center of the distribution.
ex7/388 How would you expect the variation to
differ between times in the Olympic marathon and
times in the New York Marathon? Explain.
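One simple way to put a number on variation is the range, the highest value minus the lowest. That choice is an assumption made here for illustration, since this section describes variation only qualitatively. The sketch reuses the heart-rate data from ex2/383, and `value_range` is a hypothetical helper name.

```python
def value_range(values):
    """Range: the gap between the highest and lowest value."""
    return max(values) - min(values)

# Heart rates (beats/min) from ex2/383, with and without the outlier.
with_outlier = [130, 135, 140, 145, 325]
without_outlier = [130, 135, 140, 145]

range_with = value_range(with_outlier)
range_without = value_range(without_outlier)

print(range_with)     # 195
print(range_without)  # 15
```

A single extreme value makes the data look far more spread out than the other four readings suggest.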
25. Shape of a Distribution: Number of Peaks, Symmetry/Skewness, Variation
- number of peaks
- symmetric, left-skewed, or right-skewed
- small or large variation.
27/389 The exam scores on a 100-point exam where
50 students got an A, 20 students got a B, and 5
students got a C.
ex5/385 The heights of all students at Virginia
Tech.
ex5/385 The numbers of people with a particular
last digit (0 through 9) in their Social Security
Number.
27. Homework: Pages 388-391: 14, 16, 18, 20, 28, 29, 30, 31