Box Plots - PowerPoint PPT Presentation

About This Presentation
Title:

Box Plots

Slides: 28
Provided by: pat8169

Transcript and Presenter's Notes

Title: Box Plots


1
Box Plots
  • Lesson 2.1

2
  • In this chapter you will graph data sets in
    several different ways. You'll also study some
    numerical measures that help you better
    understand what a data set tells you.
  • A good description of a data set includes not
    only a measure of central tendency, such as the
    mean, median, or mode, but the spread and
    distribution of the data as well. This is often
    done with a set of summary values or a graph.

3
Example A
  • Owen is a member of the student council and wants
    to present data about backpack safety to the
    school board. He collects these data on the
    weights of backpacks of 30 randomly chosen
    students. Owen wants to present a graph that
    shows the distribution and shape of the backpack
    data. Create a box plot of the data.

4
A box plot (or box-and-whisker plot) can be
created from the five-number summary of the
data.
5
Make a list of all the weights.
3, 4, 4, 4, 6, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9,
9, 10, 10, 10, 10, 10, 10, 13, 15, 15, 16, 17,
20, 33
6
Find the minimum, maximum, and median.
Minimum: 3   Median: 9   Maximum: 33
3, 4, 4, 4, 6, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9,
9, 10, 10, 10, 10, 10, 10, 13, 15, 15, 16, 17,
20, 33
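These three values can be checked with a short
Python sketch. The variable names are assumptions
made for illustration; the numbers are the 30
backpack weights from the slide.

```python
from statistics import median

# Backpack weights in pounds, copied from the slide.
weights = [3, 4, 4, 4, 6, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9,
           9, 10, 10, 10, 10, 10, 10, 13, 15, 15, 16, 17,
           20, 33]

minimum = min(weights)
maximum = max(weights)
# With 30 values, the median is the mean of the 15th and 16th
# values in the ordered list.
middle = median(weights)

print(minimum, middle, maximum)
```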
7
Find the first quartile (Q1) and the third
quartile (Q3). Q1 is the median of the 15 values
below the median, and Q3 is the median of the 15
values above it.
Minimum: 3   Q1: 7   Median: 9   Q3: 10   Maximum: 33
3, 4, 4, 4, 6, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9,
9, 10, 10, 10, 10, 10, 10, 13, 15, 15, 16, 17,
20, 33
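One way to sketch the whole five-number summary in
Python is to take the median of each half of the
ordered list. The function name
`five_number_summary` is hypothetical, and the
handling of an odd count (the middle value is left
out of both halves) is an assumption; other
quartile conventions exist and can give slightly
different answers.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) by the median-of-halves method.

    Assumption: for an odd count, the middle value is left out of
    both halves.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]                  # lower half of the data
    upper = ordered[len(ordered) - half:]   # upper half of the data
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])

# Backpack weights in pounds, copied from the slide.
weights = [3, 4, 4, 4, 6, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9,
           9, 10, 10, 10, 10, 10, 10, 13, 15, 15, 16, 17,
           20, 33]

summary = five_number_summary(weights)
print(summary)
```

For the backpack data the count is even, so the
odd-count assumption does not come into play.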
8
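Begin to make your box-and-whisker plot.
Minimum: 3   Q1: 7   Median: 9   Q3: 10   Maximum: 33
3, 4, 4, 4, 6, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9,
9, 10, 10, 10, 10, 10, 10, 13, 15, 15, 16, 17,
20, 33
[Figure: box plot of the backpack weights on an
axis labeled 3, 8, 13, 18, 23, 28, 33, with 7, 9,
and 10 marked on the box]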
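As a rough stand-in for the drawing, the sketch
below prints a one-line text box plot from the
five-number summary. The function `text_box_plot`
and its symbols are hypothetical choices for
illustration, not the figure from the slide, and
it assumes whole-number values.

```python
def text_box_plot(minimum, q1, med, q3, maximum):
    """Draw a one-line box plot with one character per whole unit."""
    cells = []
    for x in range(minimum, maximum + 1):
        if x == med:
            cells.append("|")   # median line inside the box
        elif q1 <= x <= q3:
            cells.append("#")   # the box spans Q1 to Q3
        else:
            cells.append("-")   # whiskers reach the minimum and maximum
    return "".join(cells)

plot = text_box_plot(3, 7, 9, 10, 33)
print(plot)
```

The long run of whisker characters on the right
side shows how far the heaviest backpack sits from
the box.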
9
The data set did not include every student in the
school, so it may or may not tell much about all
student backpack weights.
If Owen took his sample from the first 30
students who arrived to a single class, then the
data set might be biased, or unfair. It could
represent students who hurry to class because
their backpacks are too heavy. How might the
information be biased if Owen took the sample
from the first 30 volunteers?
Assume that Owen's data are from a simple random
sample of the population. This means that every
student is equally likely to be selected. Under
that assumption, you can treat results for the
sample data, such as a median of 9 lb, as
reasonable estimates for all backpacks in the
school.
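A simple random sample can be sketched in Python
with the standard `random` module. The roster of
600 student ID numbers is hypothetical; it only
stands in for a real school list.

```python
import random

# Hypothetical roster: ID numbers standing in for 600 students.
student_ids = list(range(1, 601))

# sample() chooses without replacement, and every student has the
# same chance of being chosen.
rng = random.Random(2)   # fixed seed only so the sketch is repeatable
chosen = rng.sample(student_ids, 30)

print(sorted(chosen))
```

Taking the first 30 students to arrive, or the
first 30 volunteers, would skip this random step,
which is where bias can enter.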
10
The range is the difference between the maximum
and the minimum. In this case the range is
33 - 3, or 30.
The interquartile range (IQR) is the difference
between the third quartile, Q3, and the first
quartile, Q1, or the length of the box in the
box plot. In this case it is 10 - 7, or 3. The
IQR is less affected than the range by extreme
values in the data.
Can you create two data sets with the same range
where one has an IQR half as big as the other?
One answer: 2, 3, 4, 5, 6, 7, 8, 9, 10 and
2, 5, 5, 5, 6, 7, 7, 7, 10. Both have range 8.
The IQR for the first data set is 4; the IQR for
the second data set is 2.
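The two example sets can be checked with a short
Python sketch. The helper names and the
`include_median` switch are assumptions for
illustration. With nine values, the quartiles
depend on whether the middle value is counted in
both halves: counting it gives the IQRs of 4 and 2
stated above, while leaving it out gives 5 for the
first set.

```python
from statistics import median

def data_range(values):
    return max(values) - min(values)

def iqr(values, include_median=True):
    """Q3 - Q1 by the median-of-halves method.

    include_median controls whether, for an odd count, the middle
    value is counted in both halves (True) or in neither (False).
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    if n % 2 == 1 and include_median:
        lower, upper = ordered[:half + 1], ordered[half:]
    else:
        lower, upper = ordered[:half], ordered[n - half:]
    return median(upper) - median(lower)

first = [2, 3, 4, 5, 6, 7, 8, 9, 10]
second = [2, 5, 5, 5, 6, 7, 7, 7, 10]

print(data_range(first), data_range(second))
print(iqr(first), iqr(second))
```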
11
You can use a graph of data to look for clusters,
gaps, and extreme values in the sample. One
backpack in Owen's sample weighed 33 lb, far more
than the next-largest weight of 20 lb. Would
the sample be more representative of the
population if that very heavy backpack were
omitted?
12
Extreme values are called outliers when there is
a gap between them and the rest of the data. A
modified box plot can be used to show these gaps.
In a modified box plot, any values that are more
than 1.5 times the IQR from the ends of the box
are plotted as separate points.
13
Example B
  • Use the backpack data from Example A to answer
    each question.
  • a. Find the range and the interquartile range.
  • b. Create a modified box plot showing the
    outliers.

The range is equal to the maximum minus the
minimum: 33 - 3 = 30 lb.
The IQR is Q3 - Q1 = 10 - 7 = 3 lb.
14
Decide on the largest and smallest value to be
included on the whiskers.
Minimum: 3   Q1: 7   Median: 9   Q3: 10   Maximum: 33
3, 4, 4, 4, 6, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9,
9, 10, 10, 10, 10, 10, 10, 13, 15, 15, 16, 17,
20, 33
1.5 x IQR = 1.5 x 3 = 4.5, so the whiskers include
nothing below 7 - 4.5 = 2.5 or greater than
10 + 4.5 = 14.5.
The range is 30, not "3 to 33."
The IQR is 3, not "7 to 10."
[Figure: box plot of the backpack weights on an
axis labeled 3, 8, 13, 18, 23, 28, 33, with 7, 9,
and 10 marked on the box]
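The 1.5 x IQR rule can be sketched in Python as
follows. The variable names are assumptions for
illustration; the quartiles are the values of 7
and 10 found in Example A.

```python
# Backpack weights in pounds, copied from the slide.
weights = [3, 4, 4, 4, 6, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9,
           9, 10, 10, 10, 10, 10, 10, 13, 15, 15, 16, 17,
           20, 33]

q1, q3 = 7, 10             # quartiles from Example A
step = 1.5 * (q3 - q1)     # 1.5 x IQR
low_fence = q1 - step      # values below this are outliers
high_fence = q3 + step     # values above this are outliers

outliers = [w for w in weights if w < low_fence or w > high_fence]
inside = [w for w in weights if low_fence <= w <= high_fence]
# The whiskers stop at the most extreme values that are not outliers.
whisker_ends = (min(inside), max(inside))

print(low_fence, high_fence)
print(outliers)
print(whisker_ends)
```

Under this rule the whiskers run from 3 to 13, and
every weight from 15 lb up is plotted as a
separate point.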
15
Statisticians often talk about the shape of a
data set.
The shape describes how the data are distributed
relative to the center.
A symmetric data set is balanced, or nearly
balanced, at the center.
Skewed data are spread out more on one side of
the center than on the other side.
The backpack data provide an example of skewed
data. For now, a box plot can be a good
indicator of shape because the median is clearly
visible as the center.
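One crude way to read shape from a box plot is to
compare how far the data reach on each side of the
median. The function `skew_direction` below is a
hypothetical helper that sketches this idea; it is
a rough heuristic, not a formal test of skewness.

```python
def skew_direction(minimum, med, maximum):
    """Guess the shape from how far the data reach on each side."""
    left_reach = med - minimum
    right_reach = maximum - med
    if right_reach > left_reach:
        return "skewed right"
    if left_reach > right_reach:
        return "skewed left"
    return "roughly symmetric"

# Backpack data: minimum 3, median 9, maximum 33.
shape = skew_direction(3, 9, 33)
print(shape)
```

For the backpack data the reach above the median
(24 lb) is much longer than the reach below it
(6 lb).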
16
Pulse Rates
  • Pulse rate is often used as a measure of whether
    or not a person is in good physical condition. In
    this investigation you will practice making box
    plots, compare box plots, and draw some
    conclusions about pulse rates.

17
  • What do you think a data set of all of our pulse
    rates would look like?

18
Do you think the pulse rates will be skewed left
or right, or will they be symmetric?
19
Step 1
  • Measure and record your resting pulse for 15
    seconds.
  • Multiply this value by 4 to get the number of
    beats per minute.
  • Pool data from the entire class.
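The conversion in Step 1 is a single
multiplication. In this sketch the function name
and the three counts are made up for illustration;
they are not class data.

```python
def beats_per_minute(beats_in_15_seconds):
    # 15 seconds is one quarter of a minute, so multiply by 4.
    return beats_in_15_seconds * 4

# Hypothetical 15-second counts, made up for illustration only.
counts = [16, 18, 21]
rates = [beats_per_minute(c) for c in counts]
print(rates)
```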

20
Step 2
  • Exercise for 2 min by doing jumping jacks or by
    running in place.
  • Afterward, measure and record your exercise pulse
    rate. Pool your data.

21
Step 3
  • Order each set of data.
  • Calculate the five-number summaries for your
    class's resting pulse rates and for your
    exercise pulse rates.

22
Step 4
  • Prepare a box plot of the resting pulse rates and
    a box plot of the exercise pulse rates.
  • Determine a range suitable for displaying both of
    these graphs on a single axis.
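One way to choose a shared axis is to take the
smallest and largest values across both data sets
and round outward. The pulse rates below are
hypothetical, made up only to show the arithmetic;
use your own class data in their place.

```python
# Hypothetical pulse rates in beats per minute, made up for illustration.
resting = [60, 64, 68, 72, 72, 76, 80, 84]
exercise = [96, 104, 112, 120, 124, 132, 140, 156]

# One axis must reach from the smallest value in either set to the
# largest value in either set.
axis_low = min(min(resting), min(exercise))
axis_high = max(max(resting), max(exercise))

# Round outward to multiples of 10 so the tick marks are easy to read.
axis_low = (axis_low // 10) * 10
axis_high = -(-axis_high // 10) * 10   # ceiling division

print(axis_low, axis_high)
```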

23
A sample set of data
24
Step 5
  • Draw conclusions about pulse rates by comparing
    these two graphs. Be sure to compare not only
    centers but also spreads and shapes.
  • Could your conclusion apply to a larger
    population?
  • Describe the population and explain how your
    class is representative of that population.

25
Students should see
  • The range of resting pulse rates is less than the
    range of exercise pulse rates.
  • The pulse rate rose from resting to exercise,
    which can be expressed as a percent increase.
  • Using the medians of the sample data sets,
    students might say that one's pulse rate should
    increase with exercise.
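A percent increase between the two medians could
be computed as in this sketch. The function name
and the pulse rates are hypothetical, made up for
illustration; they are not results from any class.

```python
from statistics import median

def percent_increase(before, after):
    # Change relative to the starting value, as a percentage.
    return (after - before) / before * 100

# Hypothetical pulse rates in beats per minute, made up for illustration.
resting = [60, 64, 68, 72, 72, 76, 80, 84]
exercise = [96, 104, 112, 120, 124, 132, 140, 156]

change = percent_increase(median(resting), median(exercise))
print(round(change, 1))
```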

26
  • Answers about larger population will depend on
    your class.
  • If all students are the same age, but diverse in
    other characteristics, the class might be
    representative of a population of the same age.
  • If the class is heterogeneous in age, it might
    represent a sample of the school, but students
    might note that there is a smaller or larger
    percentage of athletes in the class than in the
    school, more or fewer girls or boys, and so
    forth.

27
  • If your sample is representative of a larger
    population, then the shape and spread of your
    sample data will be like the shape and spread of
    the entire population.
  • In general you can draw conclusions about the
    population by describing the sample.
  • What factors will influence how confident you are
    in your conclusions?
  1. the size of the sample
  2. how well the sample represents the population
  3. how well the conclusion applies to the sample