Title: Mean or average
1. Measure of central location (contd.)
- Mean or average
  - Arithmetic mean
  - Geometric mean
  - Harmonic mean
- Median
- Mode
- Mid-range
- Mid-hinge
- Quartiles
- Percentiles
Remaining from the last session
2. Measure of central location (contd.)
- Mid-range: the average of the smallest and largest observations.
- Mid-hinge: the average of the first and third quartiles.
3. Quartiles: observations that divide the data into four equal parts.
4. Measure of central location (contd.)
First Quartile (Q1)
- Value such that 25% of the ordered observations are smaller and 75% of the ordered observations are larger
Third Quartile (Q3)
- Value such that 75% of the ordered observations are smaller and 25% of the ordered observations are larger
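The quartile definitions can be tried out with a short Python sketch. The seven values are illustrative and not taken from the slides, and `statistics.quantiles` uses one of several quartile conventions (its default "exclusive" method), so other software may give slightly different Q1 and Q3 for the same data.

```python
import statistics

# Illustrative values only (not from the slides), already sorted.
data = [1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]

# With n=4 the function returns the three cut points: Q1, median, Q3.
q1, median, q3 = statistics.quantiles(data, n=4)

mid_range = (min(data) + max(data)) / 2  # average of smallest and largest
mid_hinge = (q1 + q3) / 2                # average of first and third quartiles

print(f"Q1={q1:.2f} median={median:.2f} Q3={q3:.2f}")
print(f"mid-range={mid_range:.2f} mid-hinge={mid_hinge:.2f}")
```

For symmetric data like these, the mid-range, mid-hinge and median coincide; with skewed data they generally differ.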
5. Measure of other locations (contd.)
Other fractiles or quantiles:
- Quintiles: 5 equal parts
- Deciles: 10 equal parts
- Percentiles: 100 equal parts
- A fractile relates to a given % of the total observed frequency values. For example:
- 25th percentile = 1st quartile
- 50th percentile = median
- 75th percentile = 3rd quartile
6. Measure of other locations (contd.)
- It is easier to find the value of the required fractile item in a grouped frequency distribution. (The following method is important if you collect secondary data for your thesis.)
- Find the fractile item by multiplying a fraction by the total number of observations
- Example 1: the third quartile of students in the Biometry class = ¾ × 36 = 27th item
- Example 2: the 60th percentile of the class would be 60/100 × 36 = 21.6, i.e. the 22nd item (rounded off)
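The two class examples can be reproduced with a minimal Python sketch. The function name `fractile_item` is hypothetical, and rounding half up to the nearest whole item is an assumption about what "round off" means here. Exact fractions are used so that 60/100 × 36 is exactly 21.6.

```python
from fractions import Fraction
import math

def fractile_item(fraction, n_observations):
    """Return (exact position, rounded item number) for a fractile."""
    position = Fraction(fraction) * n_observations
    # Assumed rounding rule: round half up to the nearest whole item.
    item = math.floor(position + Fraction(1, 2))
    return position, item

n = 36  # students in the Biometry class
q3_pos, q3_item = fractile_item(Fraction(3, 4), n)
p60_pos, p60_item = fractile_item(Fraction(60, 100), n)

print(float(q3_pos), q3_item)    # 27.0 27
print(float(p60_pos), p60_item)  # 21.6 22
```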
7. Measure of other locations (contd.)
- Applications
- LC50: lethal concentration (ml/L etc.) of certain medicines or chemicals at which 50% of the animals die in a certain period of time
- LD50: lethal dose (mg, g/kg animal etc.) at which 50% of the organisms die in a certain period of time, such as 30 min, 1 hr, etc.
- ED50: effective dose, at which 50% of the animals are cured/recovered
- In these cases, the death of or effect on 50% of the organisms is adequate to see the effects; the rest are not necessary.
8. Measure of other locations (contd.)
- Percentiles can be used as cut-off values, e.g. values lower than the 2.5th and above the 97.5th percentile of the distribution are considered extreme values and thus disregarded. The investigator is interested only in the middle 95% of values, without considering the first and last 2.5% of values in the tail areas.
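As a rough illustration of percentile cut-offs, the sketch below keeps only the middle 95% of a data set. The data are simulated, not taken from the slides, and the interpolation rule used by `statistics.quantiles` is one convention among several.

```python
import random
import statistics

random.seed(1)  # fixed seed so the illustration is repeatable
# Hypothetical data: 1000 simulated measurements (not from the slides).
values = [random.gauss(50, 10) for _ in range(1000)]

# n=40 splits the data into 40 equal parts of 2.5% each, so the first
# and last cut points are the 2.5th and 97.5th percentiles.
cuts = statistics.quantiles(values, n=40)
lower, upper = cuts[0], cuts[-1]

# Keep the middle 95%; values in the two tails are disregarded.
middle = [v for v in values if lower <= v <= upper]

print(f"cut-offs: {lower:.1f} to {upper:.1f}")
print(f"kept {len(middle)} of {len(values)} values")
```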
9. Measure of dispersion/variability
A measurement of how scattered or clustered the data are around the center or central location.
Remember the definition of statistics:
- The scientific study of numerical data based on variation in nature
- The science of analyzing data and drawing conclusions, taking variation into account
10. Measure of dispersion/variability
- Parameters/statistics of dispersion
- Range
- Quartile range/Quartile deviation
- Mean deviation (MD)
- Standard deviation (SD)
- Variance (Var)
- Standard error (SE)
- Coefficient of variation (CV)
11. Measure of dispersion/variability
- Range: the difference between the largest and the smallest observations in a set of data
- $\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$
- The simplest, rough measure of dispersion
- Affected by a single extreme/outlying observation
- Affected by sample size
- It considers only how far apart the two extreme values are and doesn't take into account how and where the other observations are clustered or dispersed
12. Measure of dispersion/variability: inter-quartile range/deviation (mid-spread)
The difference between the third and the first quartiles; it therefore considers the central half of the data and ignores the extreme values.
- Inter-quartile range = $Q_3 - Q_1$
- Quartile deviation = $(Q_3 - Q_1)/2$
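The contrast between the range and the inter-quartile range can be sketched in Python. The data values and the added outlier are illustrative and not from the slides. The helper name `spread` is hypothetical, and the quartiles follow the default convention of `statistics.quantiles`.

```python
import statistics

def spread(data):
    """Return (range, inter-quartile range, quartile deviation)."""
    data_range = max(data) - min(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    return data_range, iqr, iqr / 2

# Illustrative values (not from the slides).
data = [1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]
with_outlier = data + [9.0]  # one extreme observation added

r1, iqr1, qd1 = spread(data)
r2, iqr2, qd2 = spread(with_outlier)

print(f"without outlier: range={r1:.2f} IQR={iqr1:.2f} QD={qd1:.2f}")
print(f"with outlier:    range={r2:.2f} IQR={iqr2:.2f} QD={qd2:.2f}")
```

A single extreme value changes the range considerably, while the inter-quartile range, which uses only the central half of the data, barely moves.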
13. Measure of dispersion/variability
- Mean deviation (MD)
- Each observation deviates from the mean
- The average of these deviations is the mean deviation or average deviation, e.g. $MD = \sum (y - \bar{y}) / n$
- But the deviations above the mean cancel those below it (in a normal population 50% of observations are higher than the mean and 50% are lower), so the sum of these deviations is zero (0). Therefore the absolute difference (ignoring the + or − signs) was later used: $MD = \sum |y - \bar{y}| / n$
- It was popular during the early 20th century
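A short Python sketch shows why the signed deviations are of no use and the absolute ones are. The seven values are illustrative and not from the slides.

```python
# Illustrative values (not from the slides).
data = [1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]
n = len(data)
mean = sum(data) / n

# Signed deviations cancel: their sum is zero (up to rounding error).
signed_sum = sum(y - mean for y in data)

# Mean deviation: average of the absolute deviations from the mean.
mean_deviation = sum(abs(y - mean) for y in data) / n

print(f"sum of signed deviations = {signed_sum:.6f}")
print(f"mean deviation = {mean_deviation:.2f}")
```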
14. Variance (Var)
- Now variance is more popular and widely used
- It has become the basis for analysis and therefore has fundamental importance
- To eliminate the negative signs, the deviations are squared (giving squared units, e.g. m²)
- Variance is the average of the squared deviations, i.e. $\text{Variance} = \sum (y - \bar{y})^2 / n$
- Standard deviation (SD)
- The positive square root of the variance
- $SD = \sqrt{\sum (y - \bar{y})^2 / n}$
15. Population parameters and sample statistics
- If we are working with samples, dividing by n under-estimates the variance and SD, i.e. the estimate is biased
- Therefore, instead of n, n − 1 (the degrees of freedom) is used for a sample, e.g.
Population SD: $\sigma = \sqrt{\sum (y - \mu)^2 / n}$
Sample SD: $s = \sqrt{\sum (y - \bar{y})^2 / (n - 1)}$
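The effect of dividing by n versus n − 1 can be checked with Python's `statistics` module: `pvariance` and `pstdev` divide by n, while `variance` and `stdev` divide by n − 1. The data values below are illustrative and not from the slides.

```python
import statistics

# Illustrative values (not from the slides).
data = [1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]

# Treating the data as a whole population: divide by n.
pop_var = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# Treating the data as a sample: divide by n - 1 (degrees of freedom).
sample_var = statistics.variance(data)
sample_sd = statistics.stdev(data)

print(f"population: variance={pop_var:.3f} SD={pop_sd:.3f}")
print(f"sample:     variance={sample_var:.3f} SD={sample_sd:.3f}")
```

The sample figures are always the larger of the two, and the gap shrinks as n grows.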
16. Variance and standard deviation are useful for probability and hypothesis testing and are therefore widely used, unlike the mean deviation.
Working formula:
Variance: $S^2 = \dfrac{\sum X_i^2 - (\sum X_i)^2 / n}{n - 1}$
SD: $S = \sqrt{\dfrac{\sum X_i^2 - (\sum X_i)^2 / n}{n - 1}}$
17. Example 1
n = 7
Range = 2.4 − 1.2 = 1.2 g
Mean deviation = 2.4/7 = 0.34 g
$s^2 = \sum (y - \bar{y})^2 / (n - 1) = 1.12/6 = 0.187$ g²
$s = \sqrt{0.187} = 0.43$ g
18. Example 2: Using the working formula
n = 7
Range = 2.4 − 1.2 = 1.2 g
$s^2 = \dfrac{\sum x_i^2 - (\sum x_i)^2 / n}{n - 1} = \dfrac{23.8 - (12.6)^2 / 7}{6} = 1.12/6 = 0.187$ g²
SD (s) = $\sqrt{0.187} = 0.43$ g
Coefficient of variation (CV) = (SD / Mean) × 100 = (0.43/1.8) × 100 = 24%
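The arithmetic of the working formula can be checked with a few lines of Python. The sketch uses only the totals quoted in the example (n = 7, Σx = 12.6, Σx² = 23.8); the variable names are illustrative.

```python
import math

# Totals quoted in the worked example.
n = 7
sum_x = 12.6    # sum of the observations (g)
sum_x2 = 23.8   # sum of the squared observations (g^2)

mean = sum_x / n
# Working formula: no need to compute each deviation from the mean.
s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)
s = math.sqrt(s2)
cv = s / mean * 100  # coefficient of variation, in percent

print(f"mean = {mean:.1f} g")   # mean = 1.8 g
print(f"s2 = {s2:.3f} g^2")     # s2 = 0.187 g^2
print(f"s = {s:.2f} g")         # s = 0.43 g
print(f"CV = {cv:.0f}%")        # CV = 24%
```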
19. Standard error (SE)
- It has become popular recently
- Researchers often misunderstand and misuse SE
- The variability of observations is the SD, while the variability of 2 or more sample means is the SE
- Therefore SE is often called the standard error of the mean(s), whereas SD refers to a set of observations or a population
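The slides do not state a formula for SE. The sketch below assumes the usual definition, SE = s/√n, with s the sample SD. The function name `standard_error` and the data values are illustrative and not from the slides.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean, assuming SE = s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Illustrative values (not from the slides).
data = [1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]

sd = statistics.stdev(data)  # variability of the observations
se = standard_error(data)    # variability of the sample mean

print(f"SD = {sd:.3f}")
print(f"SE = {se:.3f}")
```

Under this definition SE is smaller than SD for any sample with more than one observation, and it shrinks as the sample size grows.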
21. Coefficient of variation (CV)
Measure of dispersion/variability
- A relative measure of variation (data scatter) about the mean
- Expressed as a percentage: CV = (SD / Mean) × 100
22. Understanding the variation
- The more the data are spread out, the larger the range, variance, SD and SE (low precision and accuracy)
- The more concentrated the data (precise or homogeneous), the smaller the range, variance and standard deviation (high precision and accuracy)
- If all the observations are the same, the range, variance and standard deviation = 0
- None of these measures can be negative
- Two distant means with little variation are more likely to be significantly different, and vice versa
23. Example: there is more chance of a significant difference in Group A than in Group B.
[Figure: Group A and Group B]
24. Example: In which group do Trt I and II seem to be significantly different?
[Figure: Group A and Group B; error bars represent ±1 SE]
25. Conclusion
Means without variability (SD or SE) have no meaning. SD or SE are used as measures of variability and should be presented along with the means when presenting results, in both tabular and graphical form. For example:
Table 1. Means for Treatment I and II (± SE).
26. Example 2: Graphical presentation. SD or SE are shown using error bars.
[Figure: error bars]
27. See you in the Computer Lab after 15 min
Thank you