Title: Excursions in Modern Mathematics Sixth Edition
Chapter 14: Descriptive Statistics
- Graphing and Summarizing Data
Descriptive Statistics: Outline/Learning Objectives
- To interpret and produce an effective graphical summary of a data set.
- To identify various types of numerical variables.
- To interpret and produce numerical summaries of data, including
percentiles and five-number summaries.
- To describe the spread of a data set using range, interquartile range,
and standard deviation.
- 14.1 Graphical Descriptions of Data
- Data set
- A collection of data values. The number of values in the data set is
denoted by N.
- Data points
- Individual data values in a data set.
- Stat 101 Test Scores Part 1
- Professor Blackbeard has posted the results of the Stat 101 midterm in
the hallway outside his office. The data set consists of N = 75 data
points (the number of students who took the test). Each data point is a
raw score on the midterm between 0 and 25. Each student has one question
on their mind: How did I do? It's the next question that is
statistically more interesting: How did the class as a whole do?
- Stat 101 Test Scores Part 2
- The first step in summarizing the information is to organize the
scores in a frequency table. In this table, the number below each score
gives the frequency of the score, that is, the number of students
getting that particular score.
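As a minimal sketch of how a frequency table can be built from raw scores, the Python below counts how often each score occurs. The list `sample_scores` is hypothetical and is not the Stat 101 data; the function name `frequency_table` is likewise only illustrative.

```python
from collections import Counter

def frequency_table(scores):
    """Return (score, frequency) pairs sorted by score."""
    counts = Counter(scores)
    return sorted(counts.items())

# Hypothetical raw scores, not the actual Stat 101 data.
sample_scores = [10, 9, 11, 10, 12, 9, 10, 24, 1, 11]

table = frequency_table(sample_scores)
for score, freq in table:
    print(f"score {score:>2}: frequency {freq}")
```

The frequencies always add up to N, the number of data points, which is a quick check that no score was dropped.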
- The same information can be shown in a more visual way with a bar
graph. With a bar graph, it is easy to detect outliers, extreme data
points that do not fit into the overall pattern of the data (here, the
scores of 1 and 24).
- Sometimes it is more convenient to express the bar graph in terms of
relative frequencies, that is, the frequencies given as percentages of
the total population.
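A short sketch of the conversion from frequencies to relative frequencies, assuming the frequency table is stored as a Python dictionary. The table `freq` below is hypothetical (N = 20) and chosen so the percentages come out as round numbers.

```python
def relative_frequencies(freq):
    """Convert a {value: count} table into {value: percent of total}."""
    total = sum(freq.values())
    return {value: 100 * count / total for value, count in freq.items()}

# Hypothetical frequency table with N = 20 data points.
freq = {8: 2, 9: 5, 10: 8, 11: 4, 12: 1}

rel = relative_frequencies(freq)
for value, percent in rel.items():
    print(f"score {value}: {percent:.1f}%")
```

The relative frequencies should total 100%, apart from rounding.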
- Frequency charts that use icons or pictures instead of bars to show
the frequencies are commonly referred to as pictograms.
- Variable
- Any characteristic that varies with the members of a population.
- Numerical (Quantitative) Variable
- A variable that represents a measurable quantity.
- Continuous
- A numerical variable is continuous when the difference between its
values can be arbitrarily small.
- Discrete
- A numerical variable is discrete when its possible values change by
minimum increments.
- Categorical (Qualitative) Variables
- Variables that describe characteristics that cannot be measured
numerically.
- Pie Chart
- Another commonly used way to describe the relative frequencies of the
categories when the number of categories is small.
- Stat 101 Test Scores Part 3
- The process of converting test scores (a numerical variable) into
grades (a categorical variable) requires setting up class intervals for
the various letter grades.
- The grade distribution in the Stat 101 midterm can now be seen by
means of a bar graph.
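The conversion from scores to grades can be sketched as a lookup against class intervals. The cutoffs in `GRADE_CUTOFFS` are purely hypothetical, since the slides do not reproduce the intervals Professor Blackbeard used, and `sample_scores` is likewise made up.

```python
from collections import Counter

# Hypothetical class intervals on the 0-25 scale: (lowest score, grade).
# Listed from highest to lowest so the first match wins.
GRADE_CUTOFFS = [(18, "A"), (14, "B"), (11, "C"), (9, "D"), (0, "F")]

def to_grade(score):
    """Map a numerical score to the letter grade of its class interval."""
    for low, grade in GRADE_CUTOFFS:
        if score >= low:
            return grade
    raise ValueError(f"score {score} is below every class interval")

# Hypothetical scores, not the actual Stat 101 data.
sample_scores = [1, 8, 9, 10, 10, 11, 12, 13, 14, 16, 24]

grade_counts = Counter(to_grade(s) for s in sample_scores)
for grade in "ABCDF":
    print(grade, grade_counts[grade])
```

The resulting counts per grade are exactly the frequencies a bar graph of the grade distribution would display.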
- Histograms
- When a numerical variable is continuous, its possible values can vary
by infinitesimally small increments. As a consequence, there are no gaps
between the class intervals, and the bars of a histogram are drawn
touching one another.
- 14.3 Numerical Summaries of Data
- Measures of Location
- The mean (or average), the median, and the quartiles are numbers that
provide information about the values of the data.
- Measures of Spread
- The range, the interquartile range, and the standard deviation are
numbers that provide information about the spread within the data set.
- Stat 101 Test Scores Part 4
- The average of a set of N numbers is found by adding the numbers and
dividing the total by N. With a frequency table, each score d is
multiplied by its frequency f.
- Step 1. Find the sum: Sum = d_1 × f_1 + d_2 × f_2 + ... + d_k × f_k
= (1 × 1) + (6 × 1) + ... + (24 × 1) = 814
- Step 2. Find N: N = f_1 + f_2 + ... + f_k = 75
- Step 3. Find A: A = Sum/N = 814/75 ≈ 10.85
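The three steps above can be sketched in Python. The frequency table itself did not survive in these slides, so the dictionary `FREQ` below is an assumed table, chosen only because it reproduces the totals quoted in the text (N = 75 and Sum = 814); it should not be read as the official data.

```python
# Assumed frequency table {score: frequency}; an illustration that is
# consistent with N = 75 and Sum = 814, not a table taken from the slides.
FREQ = {1: 1, 6: 1, 7: 2, 8: 6, 9: 10, 10: 16, 11: 13,
        12: 9, 13: 8, 14: 5, 15: 2, 16: 1, 24: 1}

def mean_from_frequencies(freq):
    """Return (Sum, N, A) for a {value: frequency} table."""
    total = sum(d * f for d, f in freq.items())  # Step 1: sum of d_i * f_i
    n = sum(freq.values())                       # Step 2: N is the sum of f_i
    return total, n, total / n                   # Step 3: A = Sum / N

total, n, average = mean_from_frequencies(FREQ)
print(total, n, round(average, 2))
```

Working from the frequency table avoids writing out all 75 scores: each distinct score is counted once, weighted by its frequency.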
- Percentile
- The pth percentile of a data set is a value such that p percent of the
numbers fall at or below this value and the rest fall at or above it.
- Locator
- The locator is p percent of N and is denoted by L: L = (p/100) × N
- Finding the pth Percentile of a Data Set
- Step 0. Sort the data set. Let d_1, d_2, d_3, ..., d_N represent the
sorted data set.
- Step 1. Find the locator: L = (p/100) × N
- Step 2. Find the pth percentile: If L is a whole number, the pth
percentile is given by d_{L.5}, the average of d_L and d_{L+1}. If L is
not a whole number, the pth percentile is given by d_{L+}, where L+ is L
rounded up.
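The procedure above can be sketched as a small Python function. This is one possible implementation of the locator method, assuming 0 < p < 100; the name `percentile` and the data in `sample` are illustrative only. Note that statistical software often uses other percentile conventions, so library results may differ slightly.

```python
def percentile(data, p):
    """pth percentile by the locator method, L = (p/100) * N."""
    if not data:
        raise ValueError("data set is empty")
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    d = sorted(data)                            # Step 0: sort the data
    whole, remainder = divmod(p * len(d), 100)  # Step 1: L = whole + remainder/100
    whole = int(whole)
    if remainder == 0:
        # L is a whole number: average d_L and d_{L+1}.
        # The list is 0-indexed, so d_L is d[whole - 1].
        return (d[whole - 1] + d[whole]) / 2
    # L is not a whole number: round up to L+ = whole + 1, i.e. d[whole].
    return d[whole]

# Hypothetical data set with N = 8.
sample = [3, 7, 8, 5, 12, 14, 21, 13]
print(percentile(sample, 25))  # L = 2, average of d_2 and d_3
print(percentile(sample, 90))  # L = 7.2, rounded up to d_8
```

Computing p × N before dividing by 100 keeps the whole-number test exact for integer p, which a direct floating-point computation of (p/100) × N would not guarantee.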
- The 50th percentile of a data set is known as the median and denoted
by M.
- Finding the Median of a Data Set
- Sort the data set. Let d_1, d_2, d_3, ..., d_N represent the sorted
data set.
- If N is odd, the median is d_{(N+1)/2}. If N is even, the median is
the average of d_{N/2} and d_{(N/2)+1}.
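A minimal sketch of the median rule, with the odd and even cases handled separately. The function name and the two small data sets are illustrative assumptions.

```python
def median(data):
    """Median of a data set using the odd/even rule."""
    d = sorted(data)
    n = len(d)
    if n == 0:
        raise ValueError("data set is empty")
    if n % 2 == 1:
        # N odd: the median is d_{(N+1)/2}, which is d[(n - 1) // 2] 0-indexed.
        return d[(n - 1) // 2]
    # N even: average of d_{N/2} and d_{(N/2)+1}.
    return (d[n // 2 - 1] + d[n // 2]) / 2

odd_median = median([5, 1, 9])       # sorted: 1, 5, 9
even_median = median([4, 1, 7, 10])  # sorted: 1, 4, 7, 10
print(odd_median, even_median)
```

Python's standard library offers `statistics.median`, which follows the same odd/even rule.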
- After the median, the next most commonly used percentiles are the
first and third quartiles. The first quartile (denoted by Q1) is the
25th percentile, and the third quartile (denoted by Q3) is the 75th
percentile.
- Stat 101 Test Scores Part 5
- We will now find the median and quartile scores for Stat 101.
- Since N = 75 is odd, the median is d_{(75+1)/2} = d_38. Tallying the
frequencies, we conclude that the 38th test score is 11. Thus, M = 11.
- The locator for the first quartile is L = (0.25) × 75 = 18.75. Since L
is not a whole number, we round up to 19 and tally from left to right.
Thus, Q1 = d_19 = 9.
- Since the first and third quartiles are at an equal distance from the
two ends of the sorted data set, a quick way to locate the third
quartile is to count 19 scores from right to left. Thus, Q3 = 12.
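These values can be checked with a short Python sketch. Because the frequency table is not reproduced in these slides, `FREQ` is an assumed table, chosen to be consistent with N = 75 and Sum = 814 from Part 4; treat it as an illustration rather than the official data.

```python
import math

# Assumed frequency table {score: frequency}, consistent with N = 75.
FREQ = {1: 1, 6: 1, 7: 2, 8: 6, 9: 10, 10: 16, 11: 13,
        12: 9, 13: 8, 14: 5, 15: 2, 16: 1, 24: 1}

# Expand the table into the sorted data set d_1, ..., d_N.
scores = sorted(s for s, f in FREQ.items() for _ in range(f))
n = len(scores)

def d(k):
    """k-th value of the sorted data set, counting from 1."""
    return scores[k - 1]

median_score = d((n + 1) // 2)  # N is odd, so M = d_38
q1 = d(math.ceil(0.25 * n))     # L = 18.75, rounded up to 19
q3 = d(math.ceil(0.75 * n))     # L = 56.25, rounded up to 57
print(median_score, q1, q3)
```

Counting 19 scores in from the right end lands on position 75 - 19 + 1 = 57, which is why the right-to-left shortcut agrees with the locator for Q3.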
- A common way to summarize a large data set is by means of its
five-number summary. The five-number summary is given by the smallest
value in the data set (called the Min), the first quartile (Q1), the
median (M), the third quartile (Q3), and the largest value in the data
set (called the Max). These five numbers together often tell us a great
deal about the data.
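A sketch of how the five-number summary might be computed, reusing the locator method for the quartiles and the median. The helper names and the nine-value data set `sample` are hypothetical.

```python
def percentile(data, p):
    """Locator method on the sorted data; assumes integer p, 0 < p < 100."""
    d = sorted(data)
    whole, remainder = divmod(p * len(d), 100)
    if remainder == 0:
        return (d[whole - 1] + d[whole]) / 2  # average of d_L and d_{L+1}
    return d[whole]                           # d at L rounded up

def five_number_summary(data):
    """Return (Min, Q1, M, Q3, Max)."""
    return (min(data), percentile(data, 25), percentile(data, 50),
            percentile(data, 75), max(data))

# Hypothetical data set with N = 9.
sample = [3, 7, 8, 5, 12, 14, 21, 13, 18]
summary = five_number_summary(sample)
print(summary)
```

Here the locators are 2.25, 4.5, and 6.75, so Q1, M, and Q3 are the 3rd, 5th, and 7th values of the sorted data.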
- Stat 101 Test Scores Part 6
- For the Stat 101 data set, the five-number summary is Min = 1, Q1 = 9,
M = 11, Q3 = 12, and Max = 24.
- What useful information can we get out of this?
- The big picture we get from the five-number summary is that there was
a lot of bunching up in a narrow band of scores: the middle half of the
scores falls between 9 and 12.
- In general, this type of lumpy distribution of test scores is
indicative of a test with an uneven level of difficulty.
- Range
- The difference between the highest and lowest values of the data,
denoted by R. Thus, R = Max - Min.
- Interquartile Range
- The difference between the third quartile and the first quartile
(IQR = Q3 - Q1). It tells us how spread out the middle 50% of the data
values are.
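Both measures follow directly from a five-number summary. The sketch below uses the Stat 101 summary quoted in Part 6 (Min = 1, Q1 = 9, M = 11, Q3 = 12, Max = 24); the function name `spread` is an illustrative choice.

```python
def spread(five_number_summary):
    """Return (R, IQR) from a (Min, Q1, M, Q3, Max) tuple."""
    minimum, q1, _median, q3, maximum = five_number_summary
    return maximum - minimum, q3 - q1

# Five-number summary of the Stat 101 scores, as given in Part 6.
stat101_summary = (1, 9, 11, 12, 24)

data_range, iqr = spread(stat101_summary)
print(data_range, iqr)
```

The range of 23 is driven by the two outliers, while the interquartile range of 3 shows how tightly the middle half of the scores is bunched.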
- The Standard Deviation of a Data Set
- Let A denote the mean of the data set. For each number x in the data
set, compute its deviation from the mean (x - A), and square each of
these numbers. These are called the squared deviations.
- Find the average of the squared deviations. This number is called the
variance V.
- The standard deviation is the square root of the variance (σ = √V).
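The steps above translate almost line for line into Python. This sketch averages the squared deviations over N, as the slide describes (the population standard deviation); the data set `sample` is hypothetical and chosen so the arithmetic is easy to follow by hand.

```python
import math

def standard_deviation(data):
    """Square root of the average squared deviation from the mean."""
    n = len(data)
    mean = sum(data) / n                                  # A
    squared_deviations = [(x - mean) ** 2 for x in data]  # (x - A)^2
    variance = sum(squared_deviations) / n                # V
    return math.sqrt(variance)                            # sigma = sqrt(V)

# Hypothetical data set: mean 5, squared deviations sum to 32, V = 4.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
sigma = standard_deviation(sample)
print(sigma)
```

Python's `statistics.pstdev` computes the same quantity. Many calculators and libraries default to dividing by N - 1 instead (the sample standard deviation), which gives a slightly larger value.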
Descriptive Statistics: Conclusion
- Basic concepts in statistics
- Graphical summaries
- Numerical summaries