Title: Statistics for Managers Using Microsoft Excel, 2/e
1 Online Course 1
Descriptive Statistics
Roger L. Brown, Ph.D.
Medical Research Consulting, Middleton, WI
2 This online course is a FREE service to all MRC clients
3 Purpose of this series
- To assist researchers in the interpretation and
application of statistical analyses
4 Statistics?
The science of collecting, organizing, analyzing,
interpreting, and presenting data
5 Topics we will review
- Descriptive Statistics
- Frequency Distributions and Histograms
- Relative / Cumulative Frequency
- Measures of Central Tendency
- Mean, Median, Mode, Midrange
6 Topics (continued)
- Measures of Dispersion (Variation)
- Range, Standard Deviation, Variance, and Coefficient of Variation
- Shape
- Symmetric, Skewed, using Box-and-Whisker Plots
- Quartile
- Statistical Relationships
- Correlation, Covariance
7 Descriptive Statistics
A collection of quantitative measures and ways of
describing data. This includes frequency
distributions, histograms, measures of central
tendency, and measures of dispersion.
8 Descriptive Statistics
- Collect Data, e.g. Survey
- Present Data, e.g. Tables and Graphs
- Characterize Data, e.g. Mean
A characteristic of a population is a parameter;
a characteristic of a sample is a statistic.
9 Collection of Data
- Survey/questionnaires/interviews
- Direct observation
- Secondary data source (e.g., Medical charts)
10 Presenting Data: Graphics
The visual representation of data may be used not
only to present results/findings in the data, but
may also be used to learn about the data.
11 Summary Measures in Descriptive Statistics
Summary Measures
- Central Tendency: Mean, Median, Mode, Midrange
- Quartile
- Variation: Range, Variance, Standard Deviation, Coefficient of Variation
12 Measures of Central Tendency
Central Tendency: Mean, Median, Mode, Midrange
13 The Mean (Arithmetic Average)
- It is the Arithmetic Average of data values
- The Most Common Measure of Central Tendency
- Affected by Extreme Values (Outliers)
Sample Mean: $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
[Figure: two number lines, one from 0 to 10 with Mean = 5 and one extended to 14 with Mean = 6]
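The effect of an extreme value on the mean can be sketched in Python. The two data sets below are hypothetical (the points plotted on the slide are not recoverable from the text); they were chosen so that the means come out to 5 and 6.

```python
from statistics import mean

# Hypothetical data sets: identical except for the last value.
without_outlier = [1, 3, 5, 7, 9]
with_outlier = [1, 3, 5, 7, 14]

print(mean(without_outlier))  # 5
print(mean(with_outlier))     # 6
```

Changing a single observation from 9 to 14 moves the mean from 5 to 6.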
14 The Median
- Important Measure of Central Tendency
- In an ordered array, the median is the middle number.
- If n is odd, the median is the middle number.
- If n is even, the median is the average of the 2 middle numbers.
- Not Affected by Extreme Values
[Figure: two number lines, one from 0 to 10 and one extended to 14; Median = 5 in both]
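A short Python sketch of the odd-n and even-n rules, again with hypothetical data chosen so that the median is 5 with or without an extreme value:

```python
from statistics import median

# Hypothetical data sets (not taken from the slide's figure).
odd_n = [1, 3, 5, 7, 9]            # n = 5: the middle value is the median
odd_n_outlier = [1, 3, 5, 7, 14]   # same data with one extreme value
even_n = [1, 3, 5, 7]              # n = 4: average of the 2 middle values

print(median(odd_n))          # 5
print(median(odd_n_outlier))  # 5
print(median(even_n))         # 4.0
```

The extreme value 14 leaves the median unchanged because only the ordering of the values matters, not their size.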
15 The Mode
- A Measure of Central Tendency
- Value that Occurs Most Often
- Not Affected by Extreme Values
- There May Not be a Mode
- There May be Several Modes
- Used for Either Numerical or Categorical Data
[Figure: two number lines, one from 0 to 6 labeled No Mode and one from 0 to 14 labeled Mode = 9]
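The three cases listed above (no mode, one mode, several modes) can be sketched with a small helper. The function name `modes` and the sample data are illustrative assumptions, and the "no mode when nothing repeats" rule follows the slide's convention.

```python
from collections import Counter

def modes(values):
    """Return every most-frequent value, or [] when no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once: treated as "no mode"
    return [value for value, count in counts.items() if count == highest]

print(modes([0, 1, 2, 3, 4, 5, 6]))      # []
print(modes([1, 4, 9, 9, 9, 12]))        # [9]
print(modes(["A", "B", "A", "B", "O"]))  # ['A', 'B']
```

The last call shows that the mode also works for categorical data, where a mean or median would not be defined.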
16 Midrange
- A Measure of Central Tendency
- Average of Smallest and Largest Observation
- Affected by Extreme Values
Midrange: $\text{Midrange} = \frac{X_{\text{smallest}} + X_{\text{largest}}}{2}$
[Figure: two number lines from 0 to 10; Midrange = 5 in both]
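A minimal Python sketch of the midrange, using hypothetical data, shows why it is sensitive to extreme values: it depends only on the two end points.

```python
def midrange(values):
    """Average of the smallest and largest observation."""
    return (min(values) + max(values)) / 2

# Hypothetical data sets: identical except for the largest value.
print(midrange([1, 3, 5, 7, 9]))   # 5.0
print(midrange([1, 3, 5, 7, 14]))  # 7.5
```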
17 Summary Measures in Descriptive Statistics
Summary Measures
- Central Tendency: Mean, Median, Mode, Midrange
- Quartile
- Variation: Range, Variance, Standard Deviation, Coefficient of Variation
18 Quartiles
- Not a Measure of Central Tendency
- Split Ordered Data into 4 Quarters: 25% | Q1 | 25% | Q2 | 25% | Q3 | 25%
- Position of i-th Quartile: position of $Q_i = \frac{i(n+1)}{4}$
Data in Ordered Array: 11 12 13 16 16 17 18 21 22
Position of $Q_1 = \frac{1(9+1)}{4} = 2.50$, so $Q_1 = \frac{12 + 13}{2} = 12.5$
19 Quartiles
- Not a Measure of Central Tendency
- Split Ordered Data into 4 Quarters: 25% | Q1 | 25% | Q2 | 25% | Q3 | 25%
- Position of i-th Quartile: position of $Q_i = \frac{i(n+1)}{4}$
Data in Ordered Array: 11 12 13 16 16 17 18 21 22
Position of $Q_3 = \frac{3(9+1)}{4} = 7.50$, so $Q_3 = \frac{18 + 21}{2} = 19.5$
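The position rule i(n+1)/4 can be sketched in Python as follows. The function name `quartile` is illustrative, and the use of linear interpolation for positions that are not whole numbers is an assumption; it reproduces the slide's averaging of the two neighbouring values when the position ends in .5.

```python
def quartile(values, i):
    """i-th quartile (i = 1, 2, 3) using the position rule i(n + 1) / 4."""
    ordered = sorted(values)
    n = len(ordered)
    position = i * (n + 1) / 4   # 1-based position in the ordered array
    lower = int(position)        # whole-number part of the position
    fraction = position - lower
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)

data = [11, 12, 13, 16, 16, 17, 18, 21, 22]  # data from the slide
print(quartile(data, 1))  # 12.5
print(quartile(data, 2))  # 16.0
print(quartile(data, 3))  # 19.5
```

Other software may use different position rules, so quartiles from another package can differ slightly for the same data.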
20 Summary Measures
Summary Measures
- Central Tendency: Mean, Median, Mode, Midrange
- Quartile
- Variation: Range, Variance, Standard Deviation, Coefficient of Variation
21 Measures of Dispersion (Variation)
Variation
- Range
- Variance: Population Variance, Sample Variance
- Standard Deviation: Population Standard Deviation, Sample Standard Deviation
- Coefficient of Variation
22 Understanding Variation
- The more spread out or dispersed the data, the larger the measures of variation
- The more concentrated or homogeneous the data, the smaller the measures of variation
- If all observations are equal, the measures of variation = zero
- All measures of variation are nonnegative
23 The Range
- Measure of Variation
- Difference Between Largest and Smallest Observations: $\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$
- Ignores How Data Are Distributed
[Figure: two data sets plotted on number lines from 7 to 12; Range = 12 - 7 = 5 for each]
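A small Python sketch illustrates why the range ignores how data are distributed. Both data sets are hypothetical; they share the end points 7 and 12 but are spread very differently in between.

```python
def data_range(values):
    """Largest observation minus smallest observation."""
    return max(values) - min(values)

# Hypothetical data sets with the same smallest and largest values.
evenly_spread = [7, 8, 9, 10, 11, 12]
clustered_low = [7, 7, 7, 8, 8, 12]

print(data_range(evenly_spread))  # 5
print(data_range(clustered_low))  # 5
```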
24 Variance
- Important Measure of Variation
- Shows Variation About the Mean
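- For the Population: $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$
- For the Sample: $s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$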
For the Population use N in the denominator.
For the Sample use n - 1 in the denominator.
25 Standard Deviation
- Most Important Measure of Variation
- Shows Variation About the Mean
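- For the Population: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$
- For the Sample: $s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$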
For the Population use N in the denominator.
For the Sample use n - 1 in the denominator.
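The two denominators can be sketched from first principles in Python. The function names, the `sample` flag, and the data set are illustrative assumptions; the data were picked so that the population results are whole numbers.

```python
import math

def variance(values, sample=True):
    """Sum of squared deviations from the mean, divided by n - 1 or N."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    denominator = n - 1 if sample else n  # sample: n - 1, population: N
    return squared_deviations / denominator

def standard_deviation(values, sample=True):
    """Square root of the variance."""
    return math.sqrt(variance(values, sample=sample))

# Hypothetical data: mean = 5, sum of squared deviations = 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data, sample=False))            # 4.0
print(standard_deviation(data, sample=False))  # 2.0
print(variance(data, sample=True))             # 32 / 7, about 4.571
```

Because the sample version divides by a smaller number, it is always at least as large as the population version for the same data.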
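26 Sample Standard Deviation
For the Sample use n - 1 in the denominator: $s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
Data: 10 12 14 15 17 18 18 24
n = 8, Mean = 16
$s = \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095$
27 Comparing Standard Deviations
Data: 10 12 14 15 17 18 18 24
N = 8, Mean = 16
Sample: $s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{130}{7}} = 4.3095$
Population: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} = \sqrt{\frac{130}{8}} = 4.0311$
The value of the standard deviation is larger when the data
are treated as a sample.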
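The comparison can be checked with Python's standard `statistics` module, where `stdev` divides by n - 1 and `pstdev` divides by N. This is a sketch for verification, not part of the original course material.

```python
from statistics import mean, pstdev, stdev

data = [10, 12, 14, 15, 17, 18, 18, 24]  # data from the slide

# Sum of squared deviations about the mean, shared by both formulas.
center = mean(data)
sum_sq = sum((x - center) ** 2 for x in data)

print(center)                  # 16
print(sum_sq)                  # 130
print(round(stdev(data), 4))   # 4.3095 (sample, n - 1 in the denominator)
print(round(pstdev(data), 4))  # 4.0311 (population, N in the denominator)
```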
28 Comparing Standard Deviations
[Figure: three data sets plotted on number lines from 11 to 21]
- Data A: Mean = 15.5, s = 3.338
- Data B: Mean = 15.5, s = 0.9258
- Data C: Mean = 15.5, s = 4.57
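The plotted points for Data A, B, and C are not recoverable from the slide text. The Python sketch below uses illustrative data sets, chosen as an assumption so that they reproduce the slide's summary values, to show how three sets with the same mean can have very different standard deviations.

```python
from statistics import mean, stdev

# Illustrative data sets (assumed, not taken from the slide's figure).
data_a = [11, 12, 13, 16, 16, 17, 18, 21]  # spread fairly evenly
data_b = [14, 15, 15, 15, 16, 16, 16, 17]  # concentrated near the mean
data_c = [11, 11, 11, 12, 19, 20, 20, 20]  # concentrated at the extremes

for name, data in [("A", data_a), ("B", data_b), ("C", data_c)]:
    print(name, mean(data), round(stdev(data), 4))
```

All three means equal 15.5, while the sample standard deviation is smallest for the concentrated set and largest for the set whose values sit far from the mean.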
29 Coefficient of Variation
- Measure of Relative Variation
- Always a Percentage (%)
- Shows Variation Relative to Mean
- Used to Compare 2 or More Groups
- Formula (for Sample): $CV = \left( \frac{s}{\bar{X}} \right) \cdot 100\%$
30 Comparing Coefficient of Variation
- Group A: Average Health Measure = 50, Standard Deviation = 5
- Group B: Average Health Measure = 100, Standard Deviation = 5
Coefficient of Variation: Group A CV = 10%, Group B CV = 5%
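The two CV values follow directly from dividing each standard deviation by its mean. A minimal Python sketch, with an illustrative function name:

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return 100.0 * std_dev / mean

print(coefficient_of_variation(5, 50))   # 10.0 (Group A)
print(coefficient_of_variation(5, 100))  # 5.0  (Group B)
```

Both groups have the same standard deviation, but relative to its mean Group A varies twice as much as Group B.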
31 Shape
- Describes How Data Are Distributed
- Measures of Shape
- Symmetric or skewed
32 Shape
- Describes How Data Are Distributed
- Measures of Shape
- Symmetric or skewed
- Symmetric (skewness near 0, between -0.5 and 0.5): Mean = Median = Mode
33 Shape
- Describes How Data Are Distributed
- Measures of Shape
- Symmetric or skewed
- Left-Skewed (skewness < -1): Mean < Median < Mode
- Symmetric (skewness near 0, between -0.5 and 0.5): Mean = Median = Mode
34 Shape
- Describes How Data Are Distributed
- Measures of Shape
- Symmetric or skewed
- Left-Skewed, or Negatively Skewed (skewness < -1): Mean < Median < Mode
- Symmetric (skewness near 0, between -0.5 and 0.5): Mean = Median = Mode
- Right-Skewed, or Positively Skewed (skewness > 1): Mode < Median < Mean
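The slide does not say which skewness statistic its cut-offs refer to. The Python sketch below assumes the adjusted sample skewness, n / ((n - 1)(n - 2)) times the sum of cubed standardized deviations, and applies the cut-offs of -1, -0.5, 0.5, and 1; the function names and data are illustrative.

```python
import math

def sample_skewness(values):
    """Adjusted sample skewness (assumed statistic, see lead-in)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    cubed = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

def describe_shape(skewness):
    """Label a skewness value using the cut-offs shown on the slide."""
    if skewness < -1:
        return "left-skewed"
    if skewness > 1:
        return "right-skewed"
    if -0.5 < skewness < 0.5:
        return "symmetric"
    return "between the slide's cut-offs"  # the slide gives no label here

balanced = [1, 2, 3, 4, 5]           # hypothetical, mirror-image around 3
long_right_tail = [1, 1, 2, 2, 3, 10]  # hypothetical, one large value

print(describe_shape(sample_skewness(balanced)))         # symmetric
print(describe_shape(sample_skewness(long_right_tail)))  # right-skewed
```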
35 Box-and-Whisker Plot
- Graphical Display of Data Using 5-Number Summary: $X_{\text{smallest}}$, $Q_1$, Median, $Q_3$, $X_{\text{largest}}$
[Figure: box-and-whisker plot on an axis from 4 to 12]
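The five numbers behind a box-and-whisker plot can be computed with Python's standard library. This sketch uses `statistics.quantiles` with `method="exclusive"`, which places the i-th quartile at position i(n + 1)/4 and interpolates linearly between neighbours; the data are the ordered array from the quartile slides, and the function name is illustrative.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest)."""
    q1, q2, q3 = quantiles(values, n=4, method="exclusive")
    return min(values), q1, q2, q3, max(values)

data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(five_number_summary(data))  # (11, 12.5, 16.0, 19.5, 22)
```

The box spans Q1 to Q3 with a line at the median, and the whiskers extend to the smallest and largest values.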
36 Distribution Shape and Box-and-Whisker Plots
[Figure: three box-and-whisker plots, labeled Left-Skewed, Symmetric, and Right-Skewed, each marking Q1, Median, and Q3]
37 Summary
- Discussed Measures of Central Tendency
- Mean, Median, Mode, Midrange
- Quartiles
- Addressed Measures of Variation
- The Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
- Determined Shape of Distributions
- Symmetric, Skewed, Box-and-Whisker Plot