Title: GTECH 201 Lecture 12
1. GTECH 201 Lecture 12
Intro to Descriptive Statistics
2. Topics for Today
- Measures of Central Tendency
- Mean, Median, Mode
- Sample and Population Mean
- Weighted Means
- Selecting Appropriate Measures of Central Tendency
- Measures of Dispersion
- Variance
- Standard Deviation
3. Descriptive vs. Inferential
- Descriptive Statistics
- Methods for organizing and summarizing information
- Inferential Statistics
- Methods for drawing and measuring the reliability of conclusions about a population based on information obtained from a sample of the population
4. Looking at This Data Set
5. Overview
- Mean
- Median
- Mode
- Sample and Population Mean
- Weighted Means
- Selecting Appropriate Measures of Central Tendency
- Applying these measures
6. Mean
- The mean of a set of n observations is the arithmetic average
- Mean of n observations $x_1, x_2, x_3, \dots, x_n$ is
$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{\sum x}{n}$
- In Excel, AVERAGE(insert range)
7. Median
- The data value that is exactly in the middle of an ordered list if the number of pieces of data is odd
- The mean of the two middle pieces of data in an ordered list if the number of pieces of data is even
- The median is a typical value; it is the midpoint of the observations when they are arranged in ascending or descending order
8. Mode
- The most frequent data value, i.e., any value having the highest frequency among the observations
- In Excel, you use the functions
- MEDIAN(insert range)
- MODE(insert range)
- Unimodal, Bimodal, Multimodal data sets
- Outliers
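As a rough illustration of the same three measures outside Excel, the sketch below uses Python's standard `statistics` module. The data are the five basketball heights from the worked example later in this lecture; the variable names and the small bimodal list are illustrative only.

```python
import statistics

# Heights (inches) of five players, taken from this lecture's later example.
heights = [67, 72, 76, 76, 84]

mean_height = statistics.mean(heights)      # arithmetic average: sum / n
median_height = statistics.median(heights)  # middle value of the ordered list
modes = statistics.multimode(heights)       # every value with the highest frequency

print(mean_height, median_height, modes)

# A made-up bimodal data set: two values tie for the highest frequency.
bimodal = statistics.multimode([1, 1, 2, 3, 3])
print(bimodal)
```

`multimode` is used rather than `mode` so that bimodal and multimodal data sets return all of their modes.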
9. Sample and Population Means
- Mean of a data set
- Population mean if the data set includes the entire population
- Sample mean if the data set is only a sample of the population
10. Weighted Means
- To calculate the mean when your information is available only in the form of summary data

C Interval     Freq
25 - 29.9      4
30 - 34.9      5
35 - 39.9      12
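One common way to get a mean from summary data like this is to weight a representative value for each interval by its frequency, $\bar{x}_w = \frac{\sum f m}{\sum f}$. The sketch below assumes the interval midpoint $m$ is that representative value, which the slide itself does not state; `grouped_mean` is a hypothetical helper name.

```python
def grouped_mean(classes):
    """Approximate the mean of grouped data.

    classes: list of (lower, upper, frequency) tuples.
    Assumption: each interval is represented by its midpoint.
    """
    total_freq = sum(freq for _, _, freq in classes)
    weighted_sum = sum(((lo + hi) / 2) * freq for lo, hi, freq in classes)
    return weighted_sum / total_freq


# Intervals and frequencies from the weighted-means slide.
table = [(25, 29.9, 4), (30, 34.9, 5), (35, 39.9, 12)]
approx_mean = grouped_mean(table)
print(round(approx_mean, 2))
```

Because the third interval holds 12 of the 21 observations, the result sits closer to that interval than a plain average of the three midpoints would.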
11. Skewed Distributions
12. Skewed Distributions
- When there is one mode and the distribution is symmetric
- mean, median, mode are the same
- Positive skew
- mean moves towards the positive tail
- median also moves towards the positive tail
- Negative skew
- mean moves towards the negative tail
- median also moves towards the negative tail
13. Selecting Appropriate Measures
- Mean
- affected by extreme values
- includes all observations, therefore comprehensive (useful for interval/ratio data)
- Median
- not affected by the number of observations
- reveals typical situations (used for ordinal data)
- Mode
- useful for nominal variables
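To make the contrast between mean and median concrete, the sketch below compares both before and after one value is replaced by an extreme one. The outlier value is made up for illustration and is not from the lecture.

```python
import statistics

typical = [67, 72, 76, 76, 84]
with_outlier = [67, 72, 76, 76, 840]  # hypothetical data-entry error

mean_before = statistics.mean(typical)
mean_after = statistics.mean(with_outlier)

median_before = statistics.median(typical)
median_after = statistics.median(with_outlier)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

The mean shifts sharply because it includes every observation, while the median stays at the same middle value.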
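14. Other Useful Calculations
- In addition to the sum of the data, $\sum x$, we need to be able to calculate the sum of the squares of the data values, $\sum x^2$, and the square of the sum of the data values, $(\sum x)^2$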
15. Variability or Spread
- Mean and the median - limits
- Range: a coarse measure of variability
- Percentiles
- The kth percentile is the point at which k percent of the numbers fall below it and the rest fall above it
- 25th percentile (lower quartile)
- 50th percentile (median)
- 75th percentile (upper quartile)
- Interquartile range (difference between the 25th percentile value and the 75th percentile value)
16. Describing the Spread
- A five number summary
- Median
- Quartiles
- Extremes
- Variance and Standard Deviation
- Measures spread about the mean
- Standard deviation cannot be discussed without the mean
17. Calculating Percentiles
- In the list of twelve observations
- 2 4 7 11 11 11 11 14 16 16 24 29
- Compute the median, 25th and 75th percentiles
The lower quartile is the median of the 6 observations that fall below the median
The upper quartile is the median of the 6 observations that fall above the median
18. Five Number Summary
- Median: 11
- Lower Quartile: 9
- Upper Quartile: 16
- Extremes are 2 and 29
- Can compute the range: 29 - 2 = 27
- In a symmetric distribution, the lower and upper quartiles are equally distant from the median
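The summary can be checked with a short sketch that follows the lecture's rule: each quartile is the median of one half of the ordered list. The list is the twelve observations from the Calculating Percentiles slide, with the smallest value taken to be 2 as the stated extremes imply. `five_number_summary` is a hypothetical helper, and leaving the middle value out of both halves when the count is odd is an assumption, since the lecture only covers the even case.

```python
from statistics import median


def five_number_summary(values):
    """Return min, lower quartile, median, upper quartile, max.

    Quartiles are the medians of the lower and upper halves of the
    ordered data. Assumes at least two observations; for an odd count
    the middle value is left out of both halves (an assumption).
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return {
        "min": data[0],
        "q1": median(lower),
        "median": median(data),
        "q3": median(upper),
        "max": data[-1],
    }


observations = [2, 4, 7, 11, 11, 11, 11, 14, 16, 16, 24, 29]
summary = five_number_summary(observations)
data_range = summary["max"] - summary["min"]
iqr = summary["q3"] - summary["q1"]
print(summary, data_range, iqr)
```

Spreadsheet and library percentile functions often interpolate between observations, so their quartiles can differ slightly from this median-of-halves rule.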
19. Variance
- Is the mean of the squares of the deviations of the observations from their mean
- Population variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
- Sample variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
20. Example
The heights, in inches, of five starting players on a men's college basketball team are 67, 72, 76, 76, 84. Compute the mean and standard deviation.
$\bar{x} = \frac{67 + 72 + 76 + 76 + 84}{5} = \frac{375}{5} = 75$
21. Standard Deviation
- Standard deviation is the positive square root of the variance
- Variance in our basketball example
$s^2 = \frac{(-8)^2 + (-3)^2 + 1^2 + 1^2 + 9^2}{5 - 1} = \frac{156}{4} = 39$
22. Formulas: Standard Deviation
Standard deviation of a sample: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
Standard deviation of a population: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
23. Example (Continued)
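The basketball example can be worked step by step as in the sketch below, which computes the deviations from the mean, squares and sums them, and then divides by n - 1 (sample) or N (population). It reproduces the variance of 39 given on the Standard Deviation slide; the variable names are illustrative.

```python
import math

heights = [67, 72, 76, 76, 84]
n = len(heights)
mean = sum(heights) / n                      # 375 / 5

deviations = [x - mean for x in heights]     # distance of each value from the mean
sum_sq_dev = sum(d ** 2 for d in deviations)

sample_variance = sum_sq_dev / (n - 1)       # divide by n - 1 for a sample
population_variance = sum_sq_dev / n         # divide by N for a whole population

sample_sd = math.sqrt(sample_variance)
population_sd = math.sqrt(population_variance)

print(sample_variance, round(sample_sd, 2))
print(population_variance, round(population_sd, 2))
```

Treating the five players as a sample gives a standard deviation of about 6.24 inches; treating them as the whole population gives a slightly smaller value because the divisor is larger.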
24. Short Cut: Simpler Formula
Standard deviation of a sample: $s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}}$
$\sum x^2$: sum of the squares of the data values, i.e., you square each data value and then sum those squared values
$(\sum x)^2$: square of the sum of the data values, i.e., you sum all the data values and then square that sum
25. Example (using the short cut)
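A sketch of the short cut applied to the basketball heights follows. It assumes the short cut takes the usual computational form $s = \sqrt{\frac{\sum x^2 - (\sum x)^2 / n}{n - 1}}$, built from the two quantities the previous slide describes; the variable names are illustrative.

```python
import math

heights = [67, 72, 76, 76, 84]
n = len(heights)

sum_x = sum(heights)                       # sum of the data values
sum_x_sq = sum(x ** 2 for x in heights)    # sum of the squared data values

# Short cut: no need to compute each deviation from the mean first.
variance_shortcut = (sum_x_sq - sum_x ** 2 / n) / (n - 1)
sd_shortcut = math.sqrt(variance_shortcut)

print(sum_x, sum_x_sq, variance_shortcut, round(sd_shortcut, 2))
```

The result matches the variance of 39 obtained from the definition, which is a useful check that both routes were applied correctly.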
26. Interpreting Std. Deviation
- s and s² will be small when all the data are close together
- The deviations from the mean
- Will be both positive and negative
- Sum will always be 0
- s is always 0 or a positive number
- s = 0 means no spread; as the value of s increases, the spread of the data increases
- The units of s are the same as the original observations
- s is heavily influenced by outliers
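These properties can be checked numerically. The sketch below uses the basketball heights plus two made-up data sets: one with identical values and one with a hypothetical outlier.

```python
import statistics

heights = [67, 72, 76, 76, 84]
mean = statistics.mean(heights)

deviations = [x - mean for x in heights]   # mix of negative and positive values
total_deviation = sum(deviations)          # the deviations cancel out

no_spread_sd = statistics.stdev([70, 70, 70, 70])          # identical values
sd_typical = statistics.stdev(heights)
sd_with_outlier = statistics.stdev([67, 72, 76, 76, 120])  # hypothetical outlier

print(deviations, total_deviation)
print(no_spread_sd, round(sd_typical, 2), round(sd_with_outlier, 2))
```

With a mean that is not a whole number, floating-point rounding can leave the summed deviations a tiny distance from zero rather than exactly zero.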
27. Coefficient of Variation
CV is the standard deviation expressed as a percent of the mean
$CV = \frac{s}{\bar{x}} \times 100\%$
CV is useful when comparing different sets of
data where sample size and standard deviation are
different
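As a small illustration, the sketch below computes the CV for the basketball heights. It assumes the sample standard deviation is the one intended and that the mean is not zero; `coefficient_of_variation` is a hypothetical helper name.

```python
import statistics


def coefficient_of_variation(sample):
    """Return the sample standard deviation as a percent of the mean."""
    return statistics.stdev(sample) / statistics.mean(sample) * 100


heights = [67, 72, 76, 76, 84]
cv_heights = coefficient_of_variation(heights)
print(f"{cv_heights:.1f}%")
```

Because the CV is a unitless percentage, the same function can be applied to a second data set measured on a different scale and the two results compared directly.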