Title: Noise sensitivity of risk measures
1Noise sensitivity of risk measures
- Imre Kondor
- Collegium Budapest and Eötvös University, Budapest, Hungary
- Institute for Theoretical Sciences, a Notre Dame University and Argonne National Laboratory collaboration, August 17, 2005
2Contents
- I. Preliminaries
- the problem of noise, risk measures, noisy covariance matrices
- II. Noise sensitivity of Gaussian portfolios
- III. Alternative risk measures (mean absolute deviation, expected shortfall, worst loss), their sensitivity to noise, the feasibility problem
3Coworkers
- Szilárd Pafka and Gábor Nagy (CIB Bank, Budapest)
- Richárd Karádi (Institute of Physics, Budapest University of Technology, now at Procter & Gamble)
- Balázs Janecskó, András Szepessy, Tünde Ujvárosi (Raiffeisen Bank, Budapest)
- István Varga-Haszonits (Eötvös University, Budapest)
4I. PRELIMINARIES
5Preliminary considerations 1
- Portfolio selection: a tradeoff between risk and reward
6Reward vs. risk
Efficient frontier
Set of all possible portfolios in the economy
7Preliminary considerations 2
- There is more or less general agreement on what we mean by reward in a finance context:
- relative price change
- log return
- but the status of risk measures is controversial
8- For optimal portfolio selection we have to know what we want to optimize
- The chosen risk measure should respect some obvious mathematical requirements, must be stable, and easy to implement in practice
10The problem of noise
- Even if returns formed a clean, stationary stochastic process, we could only observe finite time segments, therefore we never have sufficient information to completely reconstruct the underlying process. Our estimates will always be noisy.
- Mean returns (on short time horizons) are particularly hard to measure on the market with any precision
- Even if we disregard returns and go for the minimal risk portfolio, lack of sufficient information will introduce noise, i.e. error, into our decision.
- The problem of noise is more severe for large portfolios (size N) and relatively short time series (length T) of observations, and different risk measures are sensitive to noise to a different degree.
- We have to know how the decision error depends on N and T for a given risk measure
11Some elementary criteria on risk measures
- A risk measure is a quantitative characterization of our intuitive concept of risk (fear of uncertainty and loss).
- Risk is related to the stochastic nature of returns. It is (or should be) a functional of the pdf of returns.
- Any reasonable risk measure must satisfy
- - convexity
- - invariance under addition of a risk-free asset
- - monotonicity and assigning zero risk to a zero position
- The appropriate choice may depend on the nature of the data (e.g. on their asymptotics) and on the context (investment, risk management, benchmarking, tracking, regulation, capital allocation)
12A more elaborate set of risk measure axioms
- Coherent risk measures (P. Artzner, F. Delbaen, J.-M. Eber, D. Heath, Risk 10, 33-49 (1997); Mathematical Finance 9, 203-228 (1999)). Required properties: monotonicity, subadditivity, positive homogeneity, and translational invariance. Subadditivity and homogeneity imply convexity. (Homogeneity is questionable for very large positions. Multiperiod risk measures?)
- Spectral measures (C. Acerbi, in Risk Measures for the 21st Century, ed. G. Szegö, Wiley, 2004): a special subset of coherent measures, with an explicit representation. They are parametrized by a spectral function that reflects the risk aversion of the investor.
13Convexity
- Convexity is extremely important.
- A non-convex risk measure
- - penalizes diversification (without convexity, risk can be reduced by splitting the portfolio in two or more parts)
- - does not allow risk to be correctly aggregated
- - cannot provide a basis for rational pricing of risk (the efficient set may not be convex)
- - cannot serve as a basis for a consistent limit system
- In short, a non-convex risk measure is really not
a risk measure at all.
14II. NOISE SENSITIVITY OF GAUSSIAN PORTFOLIOS
15A classical risk measure: the variance
- When we use variance as a risk measure we assume
that the underlying process is essentially
multivariate normal or close to it.
16Portfolios
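- Consider a linear combination of returns $r_i$, $r_P = \sum_i w_i r_i$, with weights $w_i$. The weights add up to unity: $\sum_i w_i = 1$. The portfolio's expectation value is $\mu_P = \sum_i w_i \mu_i$, with variance $\sigma_P^2 = \sum_{i,j} w_i \sigma_{ij} w_j$,
- where $\sigma_{ij} = \sigma_i C_{ij} \sigma_j$ is the covariance matrix, $\sigma_i$ the standard deviation of return $r_i$, and $C_{ij}$ the correlation matrix.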
17Level surfaces of risk measured in variance
- The covariance matrix is positive definite. It follows that the level surfaces (iso-risk surfaces) of variance are (hyper)ellipsoids in the space of weights. The convex iso-risk surfaces reflect the fact that the variance is a convex measure.
- The principal axes are inversely proportional to the square root of the eigenvalues of the covariance matrix.
- Small eigenvalues thus correspond to long axes.
- The risk-free asset would correspond to an infinite axis, and the corresponding ellipsoid would be deformed into an elliptical cylinder.
18The Markowitz problem
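- According to Markowitz' classical theory, the tradeoff between risk and reward can be realized by minimizing the variance $\sigma_P^2 = \sum_{i,j} w_i \sigma_{ij} w_j$ (with $\sigma_{ij}$ the covariance matrix)
- over the weights $w_i$, for a given expected return $\mu = \sum_i w_i \mu_i$
- and budget $\sum_i w_i = 1$.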
19- Geometrically, this means that we have to blow up the risk ellipsoid until it touches the intersection of the two planes corresponding to the return and budget constraints, respectively. The point of tangency is the solution to the problem.
- As the solution is the point of tangency of a convex surface with a linear one, the solution is unique.
- There is a certain continuity or stability in the solution: a small misspecification of the risk ellipsoid leads to a small shift in the solution.
20- Covariance matrices corresponding to real markets tend to have mostly positive elements.
- A large matrix with nonzero average elements will have a large (Frobenius-Perron) eigenvalue, with the corresponding eigenvector having all positive components. This will be the direction of the shortest principal axis of the risk ellipsoid.
- Then the solution will also have all positive components. Fluctuations in the small eigenvalue sectors may have a relatively mild effect on the solution.
21The minimal risk portfolio
- Expected returns are hardly possible (on efficient markets, impossible) to determine with any precision.
- In order to get rid of the uncertainties in the returns, we confine ourselves to considering the minimal risk portfolio only, that is, for the sake of simplicity, we drop the return constraint.
- Minimizing the variance of a portfolio without considering return does not, in general, make much sense. In some cases (index tracking, benchmarking), however, this is precisely what one has to do.
22The weights of the minimal risk portfolio
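- Analytically, the minimal variance portfolio corresponds to the weights $w_i$ for which $\sigma_P^2 = \sum_{i,j} w_i \sigma_{ij} w_j$ is minimal, given $\sum_i w_i = 1$ (with $\sigma_{ij}$ the covariance matrix).
- The solution is $w_i^* = \sum_j \sigma^{-1}_{ij} / \sum_{j,k} \sigma^{-1}_{jk}$, where $\sigma^{-1}_{ij}$ are the elements of the inverse covariance matrix.
- Geometrically, the minimal risk portfolio is the point of tangency between the risk ellipsoid and the plane of the budget constraint.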
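The closed-form minimal variance weights can be sketched in a few lines of NumPy. This is only an illustration: the function name `min_variance_weights` and the 3-asset covariance matrix are hypothetical and are not taken from the talk.

```python
import numpy as np

def min_variance_weights(cov):
    """Weights minimizing w' cov w subject to sum(w) == 1."""
    ones = np.ones(cov.shape[0])
    # Solve cov @ x = 1 rather than forming the inverse explicitly.
    x = np.linalg.solve(cov, ones)
    # Normalizing enforces the budget constraint.
    return x / x.sum()

# Hypothetical 3-asset covariance matrix, for illustration only.
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])
w = min_variance_weights(cov)
min_var = w @ cov @ w  # variance of the minimal risk portfolio
```

Solving the linear system instead of inverting the matrix is a numerical convenience; the result is the same normalized row sum of the inverse covariance matrix.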
23Empirical covariance matrices
- The covariance matrix has to be determined from measurements on the market. From the returns $x_{it}$ observed at time $t$ (measured from their mean) we get the estimator $\hat\sigma_{ij} = \frac{1}{T}\sum_{t=1}^{T} x_{it} x_{jt}$.
- For a portfolio of N assets the covariance matrix has O(N²) elements. The time series of length T for N assets contain NT data. In order for the measurement to be precise, we need N << T. Bank portfolios may contain hundreds of assets, and it is hardly meaningful to use time series longer than 4 years (T ≈ 1000). Therefore, N/T << 1 rarely holds in practice. As a result, there will be a lot of noise in the estimate, and the error will scale in N/T.
24Fighting the curse of dimensions
- Economists have been struggling with this problem for ages. Since the root of the problem is lack of sufficient information, the remedy is to inject external information into the estimate. This means imposing some structure on σ. This introduces bias, but the beneficial effect of noise reduction may compensate for this.
- Examples:
- single-index models (β's)
- multi-index models
- grouping by sectors
- principal component analysis
- Bayesian shrinkage estimators, etc.
- All these help to various degrees. Most studies are based on empirical data.
25An intriguing observation
- L. Laloux, P. Cizeau, J.-P. Bouchaud, M. Potters, PRL 83, 1467 (1999) and Risk 12, No. 3, 69 (1999)
- and
- V. Plerou, P. Gopikrishnan, B. Rosenow, L.A.N. Amaral, H.E. Stanley, PRL 83, 1471 (1999)
- noted that there is such a huge amount of noise in empirical covariance matrices that it may render them useless.
- A paradox: covariance matrices are in widespread use and banks still survive?!
26Laloux et al. 1999
The spectrum of the covariance matrix obtained from the time series of the S&P 500 with N = 406, T = 1308, i.e. N/T = 0.31, compared with that of a completely random matrix (solid curve). Only about 6% of the eigenvalues lie beyond the random band.
27Remarks on the paradox
- The number of junk eigenvalues may not necessarily be a proper measure of the effect of noise: the small eigenvalues and their eigenvectors fluctuate a lot, but perhaps they have a relatively minor effect on the optimal portfolio, whereas the large eigenvalues and their eigenvectors are fairly stable.
- The investigated portfolio was too large compared to the length of the time series.
- Working with real, empirical data, it is hard to distinguish the effect of insufficient information from other parasitic effects, like nonstationarity.
28A filtering procedure suggested by RMT
- The appearance of random matrices in the context of portfolio selection triggered a lot of activity, mainly among physicists. Laloux et al. and Plerou et al. proposed a filtering method based on random matrix theory (RMT). This has been further developed and refined by many workers.
- The proposed filtering consists basically in discarding as pure noise that part of the spectrum that falls below the upper edge of the random spectrum. Information is carried only by the eigenvalues and their eigenvectors above this edge. Optimization should be carried out by projecting onto the subspace of large eigenvalues, and replacing the small ones by a constant chosen so as to preserve the trace. This would then drastically reduce the effective dimensionality of the problem.
29- Interpretation of the large eigenvalues: the largest one is the market, the other big eigenvalues correspond to the main industrial sectors.
- The method can be regarded as a systematic version of principal component analysis, with an objective criterion on the number of principal components.
- According to our comparative studies, the method works consistently well
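The filtering idea described above can be sketched as follows. This is a minimal illustration, not the procedure of any particular paper. It assumes a correlation matrix estimated from standardized returns and the Marchenko-Pastur upper edge (1 + sqrt(N/T))² for unit-variance iid data. The name `rmt_filter` and the one-factor test data are hypothetical.

```python
import numpy as np

def rmt_filter(corr, n_obs):
    """Flatten the eigenvalues of `corr` that lie below the random-matrix upper edge."""
    n_assets = corr.shape[0]
    # Upper edge of the Marchenko-Pastur spectrum (assumption: unit variances).
    lam_max = (1.0 + np.sqrt(n_assets / n_obs)) ** 2
    eigvals, eigvecs = np.linalg.eigh(corr)
    noise = eigvals < lam_max
    filtered = eigvals.copy()
    if noise.any():
        # Replacing the noisy eigenvalues by their mean preserves the trace.
        filtered[noise] = eigvals[noise].mean()
    # Rebuild the matrix from the original eigenvectors: V diag(filtered) V'.
    return (eigvecs * filtered) @ eigvecs.T

rng = np.random.default_rng(0)
n_assets, n_obs = 50, 200
# Hypothetical test data: one common "market" factor plus idiosyncratic noise.
market = rng.standard_normal(n_obs)
returns = 0.5 * market[:, None] + rng.standard_normal((n_obs, n_assets))
corr = np.corrcoef(returns, rowvar=False)
corr_filtered = rmt_filter(corr, n_obs)
```

In this toy setting only the market eigenvalue survives the filter. Note that the diagonal of the filtered matrix is no longer exactly 1; whether to renormalize it is a design choice that this sketch leaves open.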
30A measure of the effect of noise
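- Assume we know the true covariance matrix $\sigma^{(0)}_{ij}$ and the noisy one $\sigma^{(1)}_{ij}$. Then a natural, though not unique, measure of the impact of noise is
  $q_0^2 = \frac{\sum_{i,j} w_i^{(1)*} \sigma^{(0)}_{ij} w_j^{(1)*}}{\sum_{i,j} w_i^{(0)*} \sigma^{(0)}_{ij} w_j^{(0)*}}$,
- where $w^{(0)*}$ and $w^{(1)*}$ are the optimal weights corresponding to $\sigma^{(0)}$ and $\sigma^{(1)}$, respectively.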
31To test the noise sensitivity of various risk
measures we use simulated data
- The rationale behind this is that in order to be able to compare the sensitivity of risk measures to noise, we had better get rid of other sources of uncertainty, like non-stationarity. This can be achieved by using artificial data where we have total control over the underlying stochastic process.
32The model-simulation approach
- Our strategy is to choose various model covariance matrices $\sigma^{(0)}$ and generate long simulated time series by them. Then we cut out segments of length T from these time series, as if observing them on the market, and try to reconstruct the covariance matrices from them. We optimize a portfolio both with the true covariance matrix $\sigma^{(0)}$ and with the observed one $\sigma^{(1)}$, and determine the measure $q_0$.
33Model 1: the unit matrix
- Spectrum: λ = 1, N-fold degenerate
- Noise will split this into a band
(Figure: the model matrix C, with 1 on the diagonal and 0 elsewhere)
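34The empirical covariance matrix corresponding to Model 1 is the Wishart matrix
- If N and T go to infinity such that their ratio N/T is fixed, < 1, then the spectrum of this empirical covariance matrix is given by the Wishart or Marchenko-Pastur spectrum (eigenvalue distribution)
  $\rho(\lambda) = \frac{T}{N}\,\frac{\sqrt{(\lambda_{\max}-\lambda)(\lambda-\lambda_{\min})}}{2\pi\lambda}$,
  where $\lambda_{\max,\min} = 1 + \frac{N}{T} \pm 2\sqrt{\frac{N}{T}}$.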
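As a rough numerical check of the Marchenko-Pastur band, one can diagonalize a simulated Wishart matrix and compare its eigenvalues with the band edges. The sketch below assumes iid standard normal returns with known zero mean. The sizes N = 200, T = 800 and the function name `marchenko_pastur_pdf` are illustrative choices.

```python
import numpy as np

def marchenko_pastur_pdf(lam, ratio):
    """Marchenko-Pastur density for unit-variance iid data, ratio = N/T < 1."""
    lam_min = (1.0 - np.sqrt(ratio)) ** 2
    lam_max = (1.0 + np.sqrt(ratio)) ** 2
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    pdf = np.zeros_like(lam)
    inside = (lam > lam_min) & (lam < lam_max)
    pdf[inside] = np.sqrt((lam_max - lam[inside]) * (lam[inside] - lam_min)) / (
        2.0 * np.pi * ratio * lam[inside])
    return pdf

rng = np.random.default_rng(2)
n_assets, n_obs = 200, 800
ratio = n_assets / n_obs
# Model 1: the true covariance is the unit matrix, so returns are iid N(0, 1).
returns = rng.standard_normal((n_obs, n_assets))
eigvals = np.linalg.eigvalsh(returns.T @ returns / n_obs)
lam_min = (1.0 - np.sqrt(ratio)) ** 2
lam_max = (1.0 + np.sqrt(ratio)) ** 2
```

For finite N and T the extreme eigenvalues fluctuate slightly around the theoretical edges, so the empirical spectrum should be compared with the band only up to a small tolerance.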
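35Model 2: single-index
- Singlet: $\lambda_1 = 1 + \rho(N-1) \sim O(N)$, eigenvector (1,1,1,...)
- $\lambda_2 = 1 - \rho \sim O(1)$, (N-1)-fold degenerate
(Figure: the model matrix, with 1 on the diagonal and ρ everywhere else)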
36- The spectrum of the empirical covariance matrix corresponding to Model 2 is still the Marchenko-Pastur spectrum, plus an isolated, large, Frobenius-Perron eigenvalue (the market).
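The spectrum of the single-index model matrix is easy to verify numerically. The sketch below builds a matrix with 1 on the diagonal and a common correlation ρ elsewhere; the values N = 100 and ρ = 0.3 are arbitrary illustrative choices.

```python
import numpy as np

n_assets, rho = 100, 0.3  # hypothetical size and common correlation
# 1 on the diagonal, rho everywhere else.
model2 = (1.0 - rho) * np.eye(n_assets) + rho * np.ones((n_assets, n_assets))
eigvals = np.linalg.eigvalsh(model2)   # returned in ascending order
market_eigenvalue = eigvals[-1]        # the singlet, 1 + rho * (N - 1)
bulk_eigenvalues = eigvals[:-1]        # the (N - 1)-fold degenerate value 1 - rho
```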
37Model 3: market + sectors
- Spectrum: a singlet and two degenerate groups of eigenvalues
- This structure has also been studied by economists
38- The spectrum of the empirical covariance matrix corresponding to Model 3 consists of the Marchenko-Pastur spectrum, the large market eigenvalue, and a number of eigenvalues in between. If the sectors are not equivalent, we can, by an appropriate choice of the parameters, tune the model so as to mimic the empirical covariance matrices observed on the market (Noh model)
39Model 4: semi-empirical
- Suppose we have very long time series (T') for many assets (N').
- Choose N < N' time series randomly and derive C⁰ from these data. Generate time series of length T << T' from C⁰.
- The error due to T is much larger than that due to T'.
40- We look for the minimal risk portfolio for both the true and the empirical covariances and determine the measure $q_0$
41Numerically we get the following scaling result for Model 1
42This confirms the expected scaling in N/T. The corresponding analytic result
- $q_0 = \frac{1}{\sqrt{1 - N/T}}$
- can easily be derived for Model 1. It is valid within O(1/N) corrections also for more general models. This simple result does not seem to have been noticed earlier
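The model-simulation procedure for Model 1 can be sketched as follows. The sketch assumes Gaussian returns with known zero mean and takes q0 to be the ratio of the true standard deviations of the portfolio optimized on the sample and the one optimized on the true covariance matrix. The comparison value 1/sqrt(1 - N/T) is the large-N, large-T form, treated here as an assumption to be checked. The names `min_variance_weights` and `q0` and the sizes are illustrative.

```python
import numpy as np

def min_variance_weights(cov):
    """Weights minimizing w' cov w subject to sum(w) == 1."""
    x = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return x / x.sum()

def q0(true_cov, n_obs, rng):
    """Risk of the sample-optimal portfolio relative to the truly optimal one."""
    n_assets = true_cov.shape[0]
    chol = np.linalg.cholesky(true_cov)
    # Draw a segment of length T from the model process.
    returns = rng.standard_normal((n_obs, n_assets)) @ chol.T
    est_cov = returns.T @ returns / n_obs
    w_true = min_variance_weights(true_cov)
    w_est = min_variance_weights(est_cov)
    # Both portfolios are evaluated with the TRUE covariance matrix.
    return np.sqrt((w_est @ true_cov @ w_est) / (w_true @ true_cov @ w_true))

rng = np.random.default_rng(1)
n_assets, n_obs = 100, 400
true_cov = np.eye(n_assets)  # Model 1
q0_mean = np.mean([q0(true_cov, n_obs, rng) for _ in range(20)])
q0_theory = 1.0 / np.sqrt(1.0 - n_assets / n_obs)
```

By construction q0 is at least 1, since no portfolio can have lower true variance than the truly optimal one. Averaging over independent samples reduces the sample-to-sample scatter.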
43The derivation
44III. ALTERNATIVE RISK MEASURES
45Risk measures in practice: VaR
- VaR (Value at Risk) is a high (95% or 99%) quantile, a threshold beyond which a given fraction (5% or 1%) of the statistical weight resides.
- Its merits (relative to the Greeks, e.g.):
- - universal: can be applied to any portfolio
- - probabilistic content: associated to the distribution
- - expressed in money
- Widespread across the whole industry and regulation. Has been promoted from a diagnostic tool to a decision tool.
- Its lack of convexity prompted the search for coherence
46Mean absolute deviation (MAD)
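Some methodologies (e.g. Algorithmics) use the mean absolute deviation rather than the standard deviation to characterize the fluctuation of portfolios. The objective function to minimize is then
$\frac{1}{T}\sum_t \left|\sum_i w_i x_{it}\right|$
instead of
$\sqrt{\frac{1}{T}\sum_t \left(\sum_i w_i x_{it}\right)^2}$,
where $x_{it}$ is the return of asset $i$ at time $t$ and $w_i$ are the portfolio weights.
The iso-risk surfaces of MAD are polyhedra. This is a feature MAD shares with some regulatory risk measures.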
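To make the two objective functions concrete, the sketch below evaluates both for a fixed portfolio on simulated data. It assumes zero-mean iid normal returns. The function names, the sample size and the equal weights are illustrative only.

```python
import numpy as np

def mean_absolute_deviation(weights, returns):
    """(1/T) * sum_t |sum_i w_i x_it| for a T x N array of returns."""
    return np.mean(np.abs(returns @ weights))

def standard_deviation(weights, returns):
    """sqrt((1/T) * sum_t (sum_i w_i x_it)^2), assuming zero-mean returns."""
    return np.sqrt(np.mean((returns @ weights) ** 2))

rng = np.random.default_rng(3)
returns = rng.standard_normal((100_000, 5))  # hypothetical iid normal returns
weights = np.full(5, 0.2)                    # equally weighted portfolio
mad = mean_absolute_deviation(weights, returns)
std = standard_deviation(weights, returns)
```

For Gaussian returns the two measures differ only by the constant factor sqrt(2/π) in the large-sample limit, so differences between MAD-optimized and variance-optimized portfolios on such data come from the finite sample rather than from the measure itself.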
47Effect of noise on absolute deviation-optimized
portfolios
We generate artificial time series (say, iid normal), determine the true absolute deviation and compare it to the measured one.
We get
48Noise sensitivity of MAD
- The result scales in T/N (same as with the variance). The optimal portfolio is, other things being equal, more risky than in the variance-based optimization.
- Geometrical interpretation: the level surfaces of the variance are ellipsoids. The optimal portfolio is found as the point where this risk ellipsoid first touches the plane corresponding to the budget constraint. In the absolute deviation case the ellipsoid is replaced by a polyhedron, and the solution occurs at one of its corners. A small error in the specification of the polyhedron makes the solution jump to another corner, thereby increasing the fluctuation in the portfolio.
50Expected shortfall (ES) optimization
- ES is the mean loss beyond a high threshold defined in probability (not in money). For continuous pdfs it is the same as the conditional expectation beyond the VaR quantile. ES is coherent (in the sense of Artzner et al.) and as such it is strongly promoted by a group of academics. In addition, Uryasev and Rockafellar have shown that its optimization can be reduced to linear programming, for which extremely fast algorithms exist.
- ES-optimized portfolios tend to be much noisier than either of the previous ones. One reason is the instability related to the (piecewise) linear risk measure, the other is that a high quantile sacrifices most of the data. The noise sensitivity of ES appears to be non-monotonic as a function of the threshold.
- In addition, ES optimization is not always feasible!
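The link between ES and linear programming can be illustrated on a fixed sample of losses. The sketch below assumes the Rockafellar-Uryasev auxiliary function, threshold + mean(max(loss - threshold, 0)) / tail fraction, whose minimum over the threshold equals the empirical ES. The function names and the simulated losses are hypothetical.

```python
import numpy as np

def expected_shortfall(losses, tail_fraction):
    """Mean of the worst `tail_fraction` of the losses."""
    k = int(round(tail_fraction * len(losses)))
    return np.sort(losses)[-k:].mean()

def rockafellar_uryasev(losses, tail_fraction, threshold):
    """Auxiliary function whose minimum over `threshold` is the expected shortfall."""
    excess = np.maximum(losses - threshold, 0.0)
    return threshold + excess.mean() / tail_fraction

rng = np.random.default_rng(4)
losses = rng.standard_normal(1000)  # hypothetical portfolio losses
tail_fraction = 0.05                # i.e. ES at the 95% level
es_direct = expected_shortfall(losses, tail_fraction)
# The auxiliary function is convex and piecewise linear in the threshold,
# with kinks at the sample points, so scanning those points finds the minimum.
es_via_min = min(rockafellar_uryasev(losses, tail_fraction, c) for c in losses)
```

When the losses are linear in the portfolio weights, the same auxiliary function is piecewise linear and convex in the weights and the threshold jointly, which is what allows the optimization to be written as a linear program. Note that at the 95% level only 50 of the 1000 observations enter the estimate.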
51Before turning to the discussion of the feasibility problem, let us compare the noise sensitivity of the following risk measures: standard deviation, absolute deviation and expected shortfall (the latter at 95%). For the sake of comparison we use the same (Gaussian) input data of length T for each, determine the minimal risk portfolio under these risk measures and compare the error due to noise.
52The next slides show
- plots of wi (portfolio weights) as a function of i
- display of q0 (ratio of the risk of the optimal portfolio determined from time series information vs. full information)
- results show that the effect of estimation noise can be significant and that more advanced risk measures pose a higher demand for input information (in a portfolio optimization context)
57- the suboptimality (q0) scales in T/N (for large N
and T)
58The feasibility problem
- For T < N, there is no solution to the portfolio optimization problem under any of the risk measures considered here.
- For T > N, there always is a solution under the variance and MAD, even if it is bad for T not large enough. In contrast, under ES (and worst loss, WL, to be considered later), there may or may not be a solution for T > N, depending on the sample. The probability of the existence of a solution goes to 1 only for T/N going to infinity.
- The problem does not appear if short selling is banned
59Feasibility of optimization under ES
Probability of the existence of an optimum under CVaR. Φ is the standard normal distribution. Note the scaling in N/√T.
60A pessimistic risk measure: worst loss
- In order to better understand the feasibility problem, select the worst return in time and minimize this over the weights:
  $\min_{w}\,\max_{t}\left(-\sum_i w_i x_{it}\right)$
- subject to
  $\sum_i w_i = 1$,
  where $x_{it}$ is the return of asset $i$ at time $t$.
- This risk measure is coherent, one of Acerbi's spectral measures.
- For T < N there is no solution
- The existence of a solution for T > N is a probabilistic issue again, depending on the time series sample
61Why is the existence of an optimum a random event?
- To get a feeling, consider N = T = 2.
- The two planes (one for each observation time)
- intersect the plane of the budget constraint in two straight lines. If one of these is decreasing while the other is increasing with $w_1$, then there is a solution; if both increase or both decrease, there is not. It is easy to see that for elliptical distributions the probability of there being a solution is ½.
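The N = T = 2 argument can be checked by a small Monte Carlo experiment. The sketch assumes iid standard normal returns (one example of an elliptical distribution) and eliminates the second weight through the budget constraint. The function name `has_minimax_solution` and the sample size are illustrative.

```python
import random

def has_minimax_solution(x):
    """x[t][i] is the return of asset i at time t, for N = T = 2.

    With w_2 = 1 - w_1, the loss at time t is a straight line in w_1 with
    slope -(x[t][0] - x[t][1]). The maximum of the two lines has a finite
    minimum only if the two slopes have opposite signs.
    """
    slope_0 = -(x[0][0] - x[0][1])
    slope_1 = -(x[1][0] - x[1][1])
    return slope_0 * slope_1 < 0

random.seed(5)
n_samples = 100_000
hits = sum(
    has_minimax_solution([[random.gauss(0, 1) for _ in range(2)] for _ in range(2)])
    for _ in range(n_samples)
)
frequency = hits / n_samples  # fraction of samples with a finite optimum
```

The two slopes are independent and symmetric around zero, so they have opposite signs in about half of the samples.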
62Probability of the feasibility of the minimax
problem
- For T > N the probability of a solution (for an elliptical underlying pdf) is
  $p = \frac{1}{2^{T-1}}\sum_{k=N-1}^{T-1}\binom{T-1}{k}$
- (The problem is isomorphic to some problems in operations research and random geometry: Todd, M.J. (1991), Probabilistic models for linear programming, Math. Oper. Res. 16, 671-693.)
- For N and T large, p goes over into the error function and scales in N/√T. For T → ∞, p → 1.
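The feasibility probability can be evaluated exactly with integer arithmetic. The sketch assumes the binomial-sum form p = 2^-(T-1) Σ_{k=N-1}^{T-1} C(T-1, k). The function name and the sample values of N and T are illustrative.

```python
from math import comb

def minimax_feasibility_probability(n_assets, n_obs):
    """p = 2^-(T-1) * sum_{k=N-1}^{T-1} C(T-1, k); zero when T < N."""
    if n_obs < n_assets:
        return 0.0
    total = sum(comb(n_obs - 1, k) for k in range(n_assets - 1, n_obs))
    return total / 2 ** (n_obs - 1)

p_two_by_two = minimax_feasibility_probability(2, 2)   # the N = T = 2 case
p_midpoint = minimax_feasibility_probability(50, 98)   # T = 2(N - 1)
p_long_series = minimax_feasibility_probability(50, 400)
```

Under this form the sum covers exactly half of the binomial weights when T = 2(N - 1), so the probability is ½ there, and it approaches 1 as the series gets long relative to N.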
63Probability of the existence of a solution under maximum loss. Φ is the standard normal distribution. Scaling is in N/√T again.
74Concluding remarks
- Due to the large number of assets in typical bank portfolios and the limited amount of data, noise is an all-pervasive problem in portfolio theory.
- It can be efficiently filtered by a variety of techniques from portfolios optimized under variance.
- Unfortunately, variance is not an adequate risk measure for fat-tailed pdfs.
- Piecewise linear risk measures show instability (jumps) in a noisy environment.
- Risk measures focusing on the far tails show additional sensitivity to noise, due to loss of data.
- The two coherent measures we have studied display large sample-to-sample fluctuations and feasibility problems under noise. This may cast a shadow of doubt on their applications.
75Some references
- Physica A 299, 305-310 (2001)
- European Physical Journal B 27, 277-280 (2002)
- Physica A 319, 487-494 (2003)
- Physica A 343, 623-634 (2004)
- submitted to Quantitative Finance, e-print
cond-mat/0402573
76Benchmark tracking
- The goal can be (e.g. in benchmark tracking or index replication) to minimize the risk (e.g. standard deviation) relative to a benchmark
- Portfolio
- Benchmark
- Relative portfolio
77- Therefore the relevant problems are of similar structure but with returns relative to the benchmark
- For example, to minimize risk relative to the benchmark means minimizing the standard deviation of the return relative to the benchmark
- with the usual budget constraint (no condition on expected returns!)
78The economic content of the single-index model
- return; market return with standard deviation σ
- The covariance matrix implied by the above
- The assumed structure reduces the number of parameters to N.
- If nothing depends on i then this is just the caricature Model 2.
79Risk measures implied by regulation
- Banks are required to set aside capital as a cushion against risk
- Minimal capital requirements are fixed by international regulation (Basel I and II, Capital Adequacy Directive of the EEC): the magic 8%
- Standard model vs. internal models
- Capital charges assigned to various positions in the standard model purport to cover the risk in those positions; therefore, they must be regarded as some kind of implied risk measures
- These measures are trying to mimic variance by piecewise linear approximants. They are quite arbitrary, sometimes concave and unstable
80An example: specific risk of bonds
CAD, Annex I, 14: The capital requirement of the specific risk (due to issuer) of bonds is
Iso-risk surface of the specific risk of bonds
81Another example: foreign exchange
According to Annex III, 1 (CAD 1993, Official Journal of the European Communities, L14, 1-26), the capital requirement is given in terms of the gross and the net position.
The iso-risk surface of the foreign exchange portfolio