Describing Quantitative Data Numerically - PowerPoint PPT Presentation

About This Presentation

Title:

Describing Quantitative Data Numerically

Description:

Describing Quantitative Data Numerically. Symmetric Distributions: Mean, Variance, and Standard Deviation

Slides: 22

Provided by: DavidD296

Transcript and Presenter's Notes

1
Describing Quantitative Data Numerically
Symmetric Distributions: Mean, Variance, and
Standard Deviation
2
Symmetric Distributions

When the distribution is at least approximately
symmetric, we are free to choose our measure of
center when describing a typical value for a set
of data.
We can use either of the following:
Mean
Median

3
Finding the Mean of a Distribution

The mean of a set of numbers is the arithmetic
average. We find this value by adding together
each value and then dividing by the number of
values we added together.
The formula for the mean is
$\bar{x} = \frac{\sum x}{n}$

4
Let's See the Formula in Action

Consider Babe Ruth's home run (HR) data.
A check of a dotplot indicates that the
distribution is approximately symmetric.

54 59 35 41 46 25 47 60
54 46 49 46 41 34 22
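
As a rough numerical companion to the dotplot check, the sketch below compares the mean and median of the 15 values above using Python's standard `statistics` module. For an approximately symmetric distribution the two should land fairly close together. The variable name `home_runs` is only an illustrative choice, and this check supplements the plot rather than replacing it.

```python
import statistics

# Babe Ruth's home run totals as listed on the slide (15 seasons).
home_runs = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

mean_hr = statistics.mean(home_runs)      # arithmetic average
median_hr = statistics.median(home_runs)  # middle value of the sorted data

print(f"mean   = {mean_hr:.4f}")
print(f"median = {median_hr}")
```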
5

So the first step is to add all the values:
54 + 59 + 35 + 41 + 46 + 25 + 47 + 60 + 54 + 46
+ 49 + 46 + 41 + 34 + 22
= 659
Now we need to divide that sum by the number of
values we added together:
$\bar{x} = \frac{659}{15} \approx 43.9333$

So the mean of the data is 43.9333. Now, if we
wish to talk about the typical number of home
runs for Babe Ruth (and we ALWAYS wish to talk
about the context of our data!), we could say
something like:
"On average, Babe Ruth hit approximately 44 home
runs per season during the 15 seasons in this data set."
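
The two steps just described (add the values, then divide by how many there are) can be sketched in a few lines of Python. The names `home_runs`, `total`, `n`, and `mean` are illustrative and not part of the original slides.

```python
# Babe Ruth's home run totals as listed on the slide.
home_runs = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

total = sum(home_runs)  # step 1: add all the values together
n = len(home_runs)      # the number of values we added
mean = total / n        # step 2: divide the sum by that count

print(f"sum = {total}, n = {n}, mean = {mean:.4f}")
```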

Remember that although the center is a very
important part of our description, we also need
to look at the spread of the distribution.
When we use the mean as our measure of center, we
use the standard deviation as our measure of
spread.
We can think of standard deviation as an average
distance of values from the mean.
To calculate the standard deviation by hand,
we'll make a data table.

8
Calculating Standard Deviation
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
9
$x$   $\bar{x}$   $x - \bar{x}$             $(x - \bar{x})^2$
54    43.9333     10.0667                   101.3384
59    43.9333     15.0667                   227.0054
35    43.9333     -8.9333                   79.8038
41    43.9333     -2.9333                   8.6042
46    43.9333     2.0667                    4.2712
25    43.9333     -18.9333                  358.4698
47    43.9333     3.0667                    9.4046
60    43.9333     16.0667                   258.1388
54    43.9333     10.0667                   101.3384
46    43.9333     2.0667                    4.2712
49    43.9333     5.0667                    25.6714
46    43.9333     2.0667                    4.2712
41    43.9333     -2.9333                   8.6042
34    43.9333     -9.9333                   98.6704
22    43.9333     -21.9333                  481.0696
SUM               0.0005 (essentially 0)    1770.9333
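
For readers who would rather not build the table by hand, here is a minimal Python sketch that produces the same four columns. It keeps the mean at full precision, so the last digit or two of the squared deviations may differ slightly from the slide, which squares deviations already rounded to four decimal places. The names `rows` and `home_runs` are illustrative.

```python
# Babe Ruth's home run totals as listed on the slide.
home_runs = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]
mean = sum(home_runs) / len(home_runs)

# One row per season: (x, mean, deviation, squared deviation).
rows = [(x, mean, x - mean, (x - mean) ** 2) for x in home_runs]

print(f"{'x':>4} {'mean':>9} {'x - mean':>10} {'(x - mean)^2':>14}")
for x, m, dev, sq in rows:
    print(f"{x:>4} {m:>9.4f} {dev:>10.4f} {sq:>14.4f}")
```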
10
Creating the Data Table
$x - \bar{x}$
54 - 43.9333 = 10.0667
15.0667
-8.9333
-2.9333
2.0667
-18.9333
3.0667
16.0667
10.0667
2.0667
5.0667
2.0667
-2.9333
-9.9333
-21.9333

The first part of our formula indicates that we
need to find the distance from the mean for each
of our values, $(x - \bar{x})$.

Now that we know the individual distances for
each value, we want to find an average of those
distances.
To find an average, we first have to add all the
values together.
We find, though, that the sum of those distances
is always zero.
Why? Because some of the values are above the
mean (positive values) and some are below
(negative). The positives and negatives cancel
each other out.
So what values can we use to find the average
distance from the mean for a set of values?
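
The cancellation described above can be checked directly. The sketch below uses Python's `fractions.Fraction` so the mean is held exactly as 659/15 instead of a rounded decimal. With exact arithmetic the deviations sum to exactly zero, which suggests that the .0005 in the hand-built table is only a rounding leftover. Variable names are illustrative.

```python
from fractions import Fraction

# Babe Ruth's home run totals as listed on the slide.
home_runs = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

# Exact mean (659/15), avoiding any decimal rounding.
mean = Fraction(sum(home_runs), len(home_runs))

# Signed distance of each value from the mean.
deviations = [x - mean for x in home_runs]

print("sum of deviations:", sum(deviations))
```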

One way to get rid of the negative values in
these distances is to square each of the values.
That's exactly what our formula tells us to do:
$(x - \bar{x})^2$
Once we have these values, to find the average we
must add them together.

$(x - \bar{x})^2$
101.3384
227.0054
79.8038
8.6042
4.2712
358.4698
9.4046
258.1388
101.3384
4.2712
25.6714
4.2712
8.6042
98.6704
481.0696
SUM 1770.9333
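
The squaring-and-summing step can be sketched in Python as follows. Squaring makes every term non-negative, so nothing cancels this time. The names `squared_deviations` and `sum_of_squares` are illustrative.

```python
# Babe Ruth's home run totals as listed on the slide.
home_runs = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]
mean = sum(home_runs) / len(home_runs)

# Square each distance from the mean so negatives cannot cancel positives.
squared_deviations = [(x - mean) ** 2 for x in home_runs]
sum_of_squares = sum(squared_deviations)

print(f"sum of squared deviations = {sum_of_squares:.4f}")
```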
13

The final step in finding an average is to divide
by the number of values we added together, but
our formula is a little different here.

Instead of dividing by the total number of values
we added together, we divide by 1 less than the
total.
Why? We have taken a sample of the data
instead of every piece of data in the population.
Since another sample would produce a slightly
different mean, it would also produce a slightly
different standard deviation. Dividing by 1 less
than the total number of values added together
will give us a slightly larger spread to account
for this sampling variation.
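
To see the effect of dividing by n - 1 rather than n, the sketch below compares the two results using Python's standard `statistics` module: `stdev` divides by n - 1 (the sample formula used in these slides), while `pstdev` divides by n. Variable names are illustrative.

```python
import statistics

# Babe Ruth's home run totals as listed on the slide.
home_runs = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

sample_sd = statistics.stdev(home_runs)       # divides by n - 1
population_sd = statistics.pstdev(home_runs)  # divides by n

print(f"dividing by n - 1: {sample_sd:.4f}")
print(f"dividing by n:     {population_sd:.4f}")
```

Dividing by the smaller number gives the slightly larger spread that the slide describes.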

So, we divide the sum of the squared deviations
by n - 1.
We have now calculated everything inside the
square root sign.
This value is an important one. It is called the
variance, $s^2$.

Since the variance is in squared units rather
than our original units, we have one more
calculation we must make.
The square root of the variance will restore the
original units and give us the average distance
from the mean: the standard deviation.
$s = 11.2470$
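
Plugging in the numbers from the data table gives a worked version of these last two steps. Here n = 15, so n - 1 = 14, and intermediate values are rounded to four decimal places:

```latex
s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{1770.9333}{14} \approx 126.4952
\qquad
s = \sqrt{s^2} = \sqrt{126.4952} \approx 11.2470
```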

16
TI-Tips: Mean, Variance, Standard Deviation