Title: FAQs about StyleAdvisor Math
Zephyr Associates, Inc. 2000
Purpose
"The purpose of computing is insight, not numbers." Richard Wesley Hamming
- Frequently asked questions often hint at deeper, conceptual issues that cannot be explained in a tech support conversation.
- The purpose of this talk is to explore at least one of your frequently asked questions to its true depth.
Overview
FAQs that I will try to address in this talk:
- I have encountered two different definitions of standard deviation. Which is the correct one, and why?
- I have encountered two different definitions of annualized excess return. Which is the correct one, and why?
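Two Standard Deviations?
Which of the following two frequently encountered definitions of standard deviation is the correct one, or which is the better one under what circumstances?
$$\mathrm{stddev}(x_1, \ldots, x_n) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$
$$\mathrm{stddev}(x_1, \ldots, x_n) = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$
where $\bar{x}$ is the mean of $x_1, \ldots, x_n$.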
An Easy Example
Suppose we wish to gather statistical data on the height of the people in this room. We would ask everybody for their height to obtain a set of numbers $x_1, \ldots, x_n$. The first and most obvious statistic to compute is the mean, or average, of the data:
$$\bar{x} = \mathrm{mean}(x_1, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^{n} x_i$$
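Variability (Dispersion)
[Figure: data points and their distances from the mean]
Variance is the mean of the squares of these distances:
$$\mathrm{var}(x_1, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
where $\bar{x}$ is the mean of the data.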
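As a small illustration of the two statistics introduced so far, the sketch below computes the mean and the variance (the mean of the squared distances from the mean) for a short list of heights. The heights are made up, and the helper names `mean` and `variance` are chosen here for illustration only.

```python
def mean(xs):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(xs) / len(xs)


def variance(xs):
    """Mean of the squared distances from the mean (denominator n)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


# Hypothetical heights in centimeters, not real data.
heights = [160, 170, 170, 180]

print(mean(heights))      # 170.0
print(variance(heights))  # 50.0
```

Two of the four values sit exactly on the mean and the other two are 10 cm away, so the mean squared distance is (100 + 0 + 0 + 100) / 4 = 50.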
Sampling
Now suppose we wish to obtain statistical data about the heights of the entire US population. It is not realistic to obtain the height of every person in the US. Instead, we must take one or more random samples, calculate our statistics on the sample data, and then draw conclusions about the statistics for the entire population.
Big question: What conclusions can be drawn from sample data? That is a very difficult question, and it leads to a lot of highly non-trivial mathematical theory.
Unbiased Estimators
Suppose we were to take every possible sample of size 50 out of the entire US population. Suppose further that we were to calculate the average height for each of these samples, say $m_1, m_2, \ldots$. Then the mean of all those sample means is the true average height of the entire population. In other words, if we measure people's average height on a larger and larger number of samples of size 50, then in the long run the average of those sample means will approach the true average height of the population.
Mathematically: The expected value of the sample mean is the population mean. The sample mean is an unbiased estimator of the population mean.
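The claim can be checked by brute force on a toy case. The sketch below is only an illustration under stated assumptions: the four-element "population" is made up, the sample size is 2 rather than 50 to keep the enumeration small, and samples are drawn with replacement (for a population as large as the US, the difference from sampling without replacement is negligible). Exact fractions are used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction
from itertools import product

# Hypothetical tiny "population" of heights (not real data).
population = [150, 160, 170, 190]
n = 2  # sample size; the slides use 50, 2 keeps the enumeration small


def mean(xs):
    """Exact arithmetic mean as a Fraction."""
    return Fraction(sum(xs), len(xs))


# Every possible sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

population_mean = mean(population)
mean_of_sample_means = mean(sample_means)

print(mean_of_sample_means == population_mean)  # True
```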
Back To Variance
Suppose we were to take every possible sample of size 50 out of the entire US population. Suppose further that we were to calculate the variance of people's heights for each of these samples, say $v_1, v_2, \ldots$. Then the mean of all those sample variances is not the true variance of the entire population. In other words, if we take a larger and larger number of samples of size 50 and calculate the variances, then on average, we will not approach the true variance of the population.
The expected value of the sample variance is not the population variance. The sample variance is not an unbiased estimator of the population variance.
Sample Vs. Population Variance
$$E(\mathrm{var}_s) = \frac{n-1}{n} \, \mathrm{var}_P$$
where
- $n$ = sample size
- $\mathrm{var}_s$ = variance of samples
- $\mathrm{var}_P$ = population variance
In other words, if we keep taking samples of size $n$ and calculating the variance on these samples, then we will, on average, approach a number that is a little smaller than the actual variance of the entire population.
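One way to see where the factor $(n-1)/n$ comes from is the following short derivation. It is a sketch under the assumption that the sample values $X_1, \ldots, X_n$ are drawn independently from a population with mean $\mu$ and variance $\mathrm{var}_P$; the symbols $X_i$, $\bar{X}$ (the sample mean), and $\mu$ are introduced here and do not appear on the slides.

$$\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} (X_i - \mu)^2 - n \, (\bar{X} - \mu)^2$$

$$E\left[\sum_{i=1}^{n} (X_i - \mu)^2\right] = n \, \mathrm{var}_P, \qquad E\left[(\bar{X} - \mu)^2\right] = \frac{\mathrm{var}_P}{n}$$

$$E(\mathrm{var}_s) = \frac{1}{n} \left( n \, \mathrm{var}_P - n \cdot \frac{\mathrm{var}_P}{n} \right) = \frac{n-1}{n} \, \mathrm{var}_P$$

The shortfall arises because the distances are measured from the sample mean, which by construction lies closer to the sample points than the population mean does.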
Unbiased Estimator For Variance
$$E(\mathrm{var}_s) = \frac{n-1}{n} \, \mathrm{var}_P \;\Longrightarrow\; \frac{n}{n-1} \, E(\mathrm{var}_s) = \mathrm{var}_P \;\Longrightarrow\; E\left(\frac{n}{n-1} \, \mathrm{var}_s\right) = \mathrm{var}_P$$
In other words, if we keep taking samples of size $n$ and calculating $n/(n-1)$ times the variance on these samples, then we will, on average, approach the actual variance of the entire population. The statistic $n/(n-1)$ times the variance is an unbiased estimator for the population variance.
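The bias and its correction can also be verified exactly on a toy case. The sketch below makes the same kind of assumptions as any small demonstration: a made-up four-element population, sample size 2, and samples drawn with replacement so that the draws are independent. It averages the denominator-n variance over every possible sample and compares the result with the population variance.

```python
from fractions import Fraction
from itertools import product

# Hypothetical tiny "population" of heights (not real data).
population = [150, 160, 170, 190]
n = 2  # sample size, kept small so all samples can be enumerated


def mean(xs):
    """Exact arithmetic mean as a Fraction."""
    return Fraction(sum(xs), len(xs))


def variance(xs):
    """Mean of squared distances from the mean (denominator n)."""
    m = mean(xs)
    return mean([(x - m) ** 2 for x in xs])


# Every possible sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))
expected_sample_var = mean([variance(s) for s in samples])
var_p = variance(population)

# The plain sample variance undershoots by the factor (n - 1) / n ...
print(expected_sample_var == Fraction(n - 1, n) * var_p)  # True
# ... and multiplying by n / (n - 1) removes the bias.
print(Fraction(n, n - 1) * expected_sample_var == var_p)  # True
```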
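Putting It All Together
The one and only true definition of variance is
$$\mathrm{var}(x_1, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
When the variance is estimated via sampling, then the statistic that must be calculated on the sample is
$$\frac{n}{n-1} \, \mathrm{var}(x_1, \ldots, x_n) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$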
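In practice, both statistics are usually available side by side. As a hedged illustration, Python's standard `statistics` module provides `pvariance` (denominator n) and `variance` (denominator n - 1), with `pstdev` and `stdev` as the matching standard deviations. The data below is made up.

```python
import math
import statistics

# Hypothetical heights in centimeters, not real data.
data = [160.0, 170.0, 170.0, 180.0]
n = len(data)

pop_var = statistics.pvariance(data)  # denominator n
samp_var = statistics.variance(data)  # denominator n - 1

print(pop_var)  # 50.0
# The two differ exactly by the factor n / (n - 1).
print(math.isclose(samp_var, n / (n - 1) * pop_var))  # True

# The same relationship carries over to the standard deviations.
pop_sd = statistics.pstdev(data)
samp_sd = statistics.stdev(data)
print(math.isclose(samp_sd, math.sqrt(n / (n - 1)) * pop_sd))  # True
```

For large n the factor n / (n - 1) is close to 1, so the choice matters most for short data series.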
Variance In Portfolio Theory
When analyzing historical data of a portfolio (money manager, mutual fund), which of the two formulas should we use? The most widely held point of view is that the sample variance should be used (denominator $n - 1$). However, there isn't really any right or wrong here. The decision of which formula to use is a philosophical one.
Returns: Sample or Population?
The crucial question is: When we look at historical return data of a manager or mutual fund, is this data
a) the entire statistical population, or
b) a sample of a larger population of data that extends indefinitely into the future?
If your answer is a), then you must use the formula for the actual variance (denominator $n$). Otherwise, you must use the formula for the sample variance (denominator $n - 1$).
Annualized Excess Return
The FAQ was: I have encountered two different definitions of annualized excess return. Which is the correct one, and why?
Annualization
Let $r_1, \ldots, r_n$ be a return series of any periodicity. Then the total return (cumulative return, compound return) of the series is
$$r = (1 + r_1)(1 + r_2) \cdots (1 + r_n) - 1$$
The annualized return of the series is defined as the constant annual return that would result in the same compound return over the time period covered by the series.
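The definition can be sketched in a few lines of code. The function names and the `periods_per_year` parameter are assumptions made for this illustration (12 for monthly data, 4 for quarterly data, and so on); the series is taken to cover `len(returns) / periods_per_year` years.

```python
import math


def total_return(returns):
    """Compound (cumulative) return of a series of periodic returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def annualized_return(returns, periods_per_year):
    """Constant annual return that compounds to the same total return."""
    years = len(returns) / periods_per_year
    return (1.0 + total_return(returns)) ** (1.0 / years) - 1.0


# Hypothetical monthly series covering two years: +1% every month.
monthly = [0.01] * 24

ann = annualized_return(monthly, periods_per_year=12)
# Compounding the annualized return over two years reproduces the total return.
print(math.isclose((1.0 + ann) ** 2 - 1.0, total_return(monthly)))  # True
```

For a series shorter than one year the exponent is larger than 1, which is why a two-month series is annualized by raising its growth factor to the sixth power.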
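Annualized Excess Return
Let $r_1, \ldots, r_n$ and $s_1, \ldots, s_n$ be return series. Then the excess return series is simply the pointwise difference of the two:
$$(e_1, \ldots, e_n) = (r_1 - s_1, \ldots, r_n - s_n)$$
The following two definitions of annualized excess return are encountered:
X  AnnExRtn = annualized return of the excess return series $e_1, \ldots, e_n$
!  AnnExRtn = (annualized return of $r_1, \ldots, r_n$) $-$ (annualized return of $s_1, \ldots, s_n$)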
The Simplest Example Ever
Let us consider the monthly returns of a manager and a benchmark for two months:
            Month 1   Month 2
Manager       -9%      +10%
Benchmark     +1%       -1%
Obviously, the annualized excess return of the manager over the benchmark must be positive: the manager made money, the benchmark lost money!
Example (Cont.)
X  $\mathrm{AnnExRtn} = (0.9 \times 1.11)^6 - 1 = -0.005985$  (negative!)
!  $\mathrm{AnnExRtn} = ((0.91 \times 1.1)^6 - 1) - ((1.01 \times 0.99)^6 - 1) = 0.0066$
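The two calculations can be reproduced with a short script. This is a sketch for checking the arithmetic, not a reference implementation: the helper `annualized_return` and its `periods_per_year` parameter are names chosen here, and the monthly returns are read off the growth factors in the example (0.91 and 1.1 for the manager, 1.01 and 0.99 for the benchmark).

```python
def annualized_return(returns, periods_per_year=12):
    """Constant annual return that compounds to the same total return."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth ** (periods_per_year / len(returns)) - 1.0


# Monthly returns implied by the growth factors in the example.
manager = [-0.09, 0.10]
benchmark = [0.01, -0.01]
excess = [r - s for r, s in zip(manager, benchmark)]

# Definition marked X: annualize the excess return series.
ann_of_excess = annualized_return(excess)
# Definition marked !: difference of the two annualized returns.
diff_of_ann = annualized_return(manager) - annualized_return(benchmark)

print(round(ann_of_excess, 6))  # -0.005985
print(round(diff_of_ann, 4))    # 0.0066
```

Compounding the excess series treats the pointwise differences as if they were returns on an investment, which they are not. That is how a manager who gained while the benchmark lost can still end up with a negative number under the first definition.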