Title: Descriptive statistics using Excel
1Descriptive statistics using Excel
- Displaying Data
- Calculating Measures of Central Tendency and
Dispersion - Classification of data
- Index Numbers and Correlation
- Bell Shaped distribution
Kari Peisa, Ramk/Teli 2006
2Displaying Data
- Types of Data and Measurement Scales
Scale Data Kvalitative Kvantitative
Nominal Nominal X
Ordinal Ordinal X X
Interval Interval X
Ratio Ratio X
3Constructing Diagrams in Excel
- General instructions
- 1. Construct an array where data values form
continues series either in rows or in columns - 2. Write titles for each series
- 3. Paint (activate) the series which you want
to include in diagram (data titles), select
Chart Wizard, and follow the instructions
4Examples of Displaying Data Nominal Scale and
Pie Chart
Observations can be assigned to different
categories which cant be placed in a meaningful
order
5Examples of Displaying Data Ordinal Scale and
Bar Chart
Classes can be rank-ordered from highest to
lowest. Classes may be represented also
numerically when the measures of central tendency
have a real value. But, do they actually have a
real meaning?
6Examples of Displaying Data Interval Scale and
Line Chart
Addition and subtraction but not multiplication
or division can be performed on data to compare
observations. We can say that temperature at
1500 is 1.1 ?C warmer than at 1200, but we
cant say for example that -6?C is twice as warm
as -12 ?C or vice versa.
7Examples of Displaying Data Ratio Scale and Line
Chart
Ratio data has an absolute beginning point (true
0 point). All mathematical operations can be
performed to compare values. We can say that the
length of the plant in Septemper is more than
twice as tall as in June.
8Measures of Central Tendency
- Average (Mean)
-
- Mode
- Median
- Quartiles and Percentiles
The most frequently occurred
The occurrence in the middle of a set of ordered
occurrences
The occurrence at the first (25), second (50)
or third (75) quartile or at the given
percentile (p ) of a set of ordered occurrences
9Measures of Central Tendency in Excel
Write the formula into a cell
AVERAGE(A1A4B2C1C2E1E4)
AVERAGE(B2E5)
A colon () stands between the upper left corner
and the lower right corner of an array Empty
cells dont effect on the value of mean The
reference to an array is made by painting the
array The semicolon () connects separate arrays
10Measures of Central Tendency in Excel
Or, use the wizard Insert function
Select Category Statistics Function Average
Activate the command line in the box and paint an
array in Excel sheet
11Measures of Central Tendency in Excel
- Mode
- Median
- Quartiles
- Percentiles
MODE(B2B5D2E4)
1., 2. or 3. quartile (the 2. Median)
MEDIAN(B2B5D2E4)
QUARTILE((B2B5D2E4)1)
PERCENTILE((B2B5D2E4)0.35))
12Measures of Dispersion
- Average deviation
- Variance
-
- Standard Deviation
-
- Skewness
13Measures of Dispersion in Excel
- Average deviation
- Variance
-
- Standard Deviation
-
- Skewness
AVEDEV(B2B5D2E4)
VAR(B2B5D2E4)
STDEV(B2B5D2E4)
SKEW(B2B5D2E4)
14Classification (Grouping) of Data
In classification we arrange a large sample of
data into classes
There are some rules usually followed when
arranging classes
- The classes should be of equal size (if
possible) - All data values from the original table need to
be included in one and only in one class - The number of classes should be between 5 and 15.
15Classification in Excel
The frequencies indicate the number of
observations in the data array that are more than
the upper limit in the previous row but less than
or equal to the upper limit in this row
Activate the (whole) frequency column and write
the formula FREQUENCY(databins) into the
first cell. Remark! This is an array formula,
which means that we have to accept the formula by
pressing shift ctrl enter
16Index Numbers
- Index numbers are used to display data in
proportional form
Source CIA/ World Factbook 2002 https//www.cia.g
ov/cia/publications/factbook/index.html
17Writing own formulas in Excel
Fill right the formula
Formula C2/B2B6
- Some fundamental rules for writing formulas
- Starts with sign
- A pair ColumnRow refers a cell in a relative
position - A pair ColumnRow refers a cell in an absolute
position - In arrays you should be able to fill right or
fill down formulas that are correctly formed
Press F4 for switching between relative,
absolute, and mixed references
18Displaying Correlation in Scattered Chart
In scattered chart two data series are displayed
in correspondence. The correlation coefficient
between two measurement variables and different
fitted curves are associated with this type of
chart.
19Bell-Shaped model
When the sample size is reasonably large and tha
data are not too skewed, we can estimate the real
distribution by a mathematical model called
bell-shaped model (or Gaussian shape or Normal
Distribution).
The model needs to know the mean and the standard
deviation of the original data
20Example of using Bell-Shaped Model
- How many percentage of the students in the
previous frequency distribution got the mark at
least 4? - From original data
- With the bell-shaped model, where the average
2.697 and standard deviation 1.1674 of the
original data, we have to calculate the total
area under the model curve after the point x3.5 - In Excel the formula is
- And it returns 0.246
1-NORMDIST(3.52.6971.16771)
Total area 1
Returns the area before x3.5