1
Statistical physics and finance
  • I. Kondor
  • Collegium Budapest and Eötvös University
  • Seminar talk at Morgan Stanley Fixed Income
  • Budapest, March 1, 2007

2
Coworkers
  • Sz. Pafka (ELTE → CIB Bank → Paycom, Santa Monica)
  • G. Nagy (Debrecen University → CIB Bank)
  • R. Karádi (Budapest University of Technology → Procter & Gamble)
  • N. Gulyás (ELTE → Budapest Bank → Lombard Leasing → ELTE → Collegium Budapest)
  • I. Varga-Haszonits (ELTE → Morgan Stanley)
  • G. Papp (ELTE)
  • A. Ciliberti (Roma → Science & Finance, Paris)
  • M. Mézard (Orsay)

3
Contents
  • Links between economics and physics
  • What can physics offer to finance that
    mathematics might not?
  • Three examples: random matrices, phase transitions and replicas

4
Early links
  • The physics-complex of classical economics
  • Maxwell
  • Bachelier

5
Physicists in finance
  • From the early nineties on, financial institutions have been hiring more and more physicists.
  • Some 30-35% of the invited speakers at risk management conferences are ex-physicists.
  • Today finance is one of the standard fields of employment for physics graduates and PhDs (EU document on the harmonization of Bologna-type higher education curricula: Tuning Educational Structures in Europe, http://tuning.unideusto.org/tuningeu/).

6
Econophysics: is there such a thing?
  • The term was introduced by H. E. Stanley; it is not universally beloved, but widespread.
  • Do these two disciplines have anything to do with
    each other?
  • A trivial answer: we are dealing with stochastic
    processes in finance, and statistical physics is
    their main field of application.
  • But the theory of stochastic processes in its
    pure form belongs to probability theory.

7
So the question is: Why do banks hire not only
probabilists, applied mathematicians, computer
scientists, statisticians, etc., but also
physicists?
  • What is the special knowledge or skill, if any,
    that physicists can bring into finance? What can
    physics offer to finance? (Stanley at the Nikkei
    conference)
  • A common, albeit vague, answer: modeling skills,
    creative use of mathematics, knowledge of a
    wide spectrum of approximation and numerical
    methods, etc. may contribute to the market value
    of physicists.

8
A bit deeper
  • Physics has got the farthest in the understanding
    of strongly interacting systems and collective
    phenomena.
  • Textbook economics is, at best, at the conceptual level of mean-field theory even today (the representative agent).
  • The building up of structures and new qualities from simple interactions, emergence, collective coordinates, averaging over microscopic degrees of freedom, etc.: these conceptual tools are not known in finance or economics at large (cf. Basel II).

9
Therefore I think that
  • some knowledge of quantum mechanics, the many-body problem, field theory, renormalisation, phase transitions, nonlinear and complex systems, etc., although neither necessary nor sufficient, may be useful (as a conceptual introduction, or just as a source of metaphors) in the understanding of social phenomena, including the market.

10
In this talk I will illustrate the use of conceptual tools imported from physics with the following three examples:
  • Random matrices
  • Phase transitions and critical phenomena
  • Replica method

11
The concrete field of application will be the
problem of portfolio selection
  • The basic question: How to distribute our wealth over the set of possible investment instruments so as to earn the highest return at the lowest risk?
  • Here I will focus my attention on the minimal
    risk portfolio, irrespective of the return.

12
The original formulation of the problem
  • The returns $x_i$, $i = 1, 2, \dots, N$, are random variables drawn from a known (say, multivariate normal) distribution, with covariance matrix $\sigma_{ij} = \sigma_i \sigma_j C_{ij}$ ($C_{ij}$ is the correlation matrix, $\sigma_i$ the standard deviation of $x_i$).
  • Find the weights $w_i$, $\sum_{i=1}^{N} w_i = 1$, for which the variance $\sigma_P^2 = \sum_{ij} w_i \sigma_{ij} w_j$ of the portfolio $P = \sum_i w_i x_i$ is minimal.

13
Unconstrained short selling
  • We have not stipulated that the weights be positive; they can be of either sign, with arbitrarily large absolute values. This is obviously unrealistic, for, among other things, liquidity reasons. Nevertheless, it is useful to consider the problem first in this idealised form (just as the finance textbooks do), because then the optimal weights can be calculated analytically: $w_i^* = \sum_j \sigma^{-1}_{ij} \big/ \sum_{jk} \sigma^{-1}_{jk}$ (a numerical sketch follows below).
  • If we ban short selling, the task becomes one in quadratic programming.
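Below is a minimal numerical sketch (my addition, not part of the original slides) of the analytic minimum-variance solution $w^* = \sigma^{-1}\mathbf{1} / (\mathbf{1}^T \sigma^{-1} \mathbf{1})$; the 3×3 covariance matrix is made up for illustration:

import numpy as np

def min_variance_weights(sigma):
    """Weights minimizing w^T sigma w subject to sum(w) = 1 (shorting allowed)."""
    ones = np.ones(sigma.shape[0])
    x = np.linalg.solve(sigma, ones)   # sigma^{-1} 1, avoiding explicit inversion
    return x / x.sum()                 # normalize so the weights sum to one

sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
w = min_variance_weights(sigma)
print(w, w.sum())  # some weights may be negative: short positions are allowed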

14
Infinite volume limit
  • Allowing unlimited short selling makes the domain
    of the optimization task infinite. This is not an
    innocent idealisation, because, as we will see,
    the solution vector can show huge fluctuations,
    and the restriction on the domain could bound
    these fluctuations.
  • Similarly to the theory of phase transitions,
    however, it is expedient to understand the
    essence of the phenomenon in the limit of
    infinite volume, and take into account the
    finite-volume effects only later.

15
Variants of the problem
  • When we use the standard deviation as a risk
    measure, we are assuming that the underlying
    process is normal, or has some similarly
    concentrated distribution. Typically, financial
    processes are not like this.
  • Alternative risk measures: mean absolute deviation (MAD), average loss above a high threshold (ES), maximal loss (ML), or, indeed, any homogeneous convex functional defined over the distribution of the losses.

16
Empirical covariance matrices
  • The covariance matrix has to be determined from measurements on the market. From the returns $x_{it}$ observed at times $t = 1, \dots, T$ we have the estimator $\hat{\sigma}_{ij} = \frac{1}{T} \sum_{t=1}^{T} x_{it} x_{jt}$.
  • The number of covariance matrix elements of a portfolio composed of N instruments is O(N²). In the time series of length T of N instruments we have NT data points. In order to have a precise estimate, we should have N ≪ T. Large portfolios can contain hundreds of instruments, while it is hardly meaningful to use data older than, say, 4 years, that is T ≈ 1000. Therefore the inequality N/T ≪ 1 almost never holds in reality. Thus our estimates will contain a lot of noise, and the estimation error will depend on the scaling variable N/T.
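A small simulation (my illustration, using made-up iid data) of how noisy this estimator is when N/T is not small; the true covariance matrix is the identity:

import numpy as np

rng = np.random.default_rng(0)
N, T = 100, 250                      # N/T = 0.4, far from the ideal N << T
X = rng.standard_normal((T, N))      # returns with true covariance = identity

sigma_hat = X.T @ X / T              # the estimator (1/T) sum_t x_it x_jt
err = np.linalg.norm(sigma_hat - np.eye(N)) / np.linalg.norm(np.eye(N))
print(f"relative estimation error: {err:.2f}")  # of order sqrt(N/T), not small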

17
Information deficit
  • Thus the Markowitz problem suffers from the "curse of dimensions", that is, from an information deficit.
  • The estimates will contain error and the
    resulting portfolios will be suboptimal
  • How serious is this effect?
  • How sensitive are the various risk measures to
    this kind of error?
  • How can we reduce the error?

18
Fighting the curse of dimensions
  • Economists have been struggling with this problem for ages. Since the root of the problem is the lack of sufficient information, the remedy is to inject external information into the estimate. This means imposing some structure on σ. This introduces bias, but the beneficial effect of noise reduction may compensate for it.
  • Examples (all of these help to various degrees; most studies are based on empirical data):
  • single-index models (β's)
  • multi-index models
  • grouping by sectors
  • principal component analysis
  • Bayesian shrinkage estimators, etc.
  • random matrix theory

19
Random matrices
20
Origins of random matrix theory (RMT)
  • Wigner, Dyson 1950s
  • Originally meant to describe (to a zeroth
    approximation) the spectral properties of heavy
    atomic nuclei
  • - on the grounds that something that is
    sufficiently complex is almost random
    - this fits into the picture of a complex system as one with a large number of degrees of freedom, without symmetries, hence irreducible, quasi-random
  • - markets, by the way, are considered stochastic
    for similar reasons

21
RMT
  • Later found applications in a wide range of
    problems, from quantum gravity through quantum
    chaos, mesoscopics, random systems, etc., etc.
  • Has developed into a rich field with a huge set
    of results for the spectral properties of various
    classes of random matrices
  • They can be thought of as a set of central limit
    theorems for matrices

22
Wigner semi-circle law
  • $M_{ij}$: symmetric N×N matrix with i.i.d. elements (the distribution has zero mean and finite second moment $\sigma^2$)
  • $\lambda_k$: eigenvalues of $M / \sqrt{N}$
  • The density of the eigenvalues $\lambda_k$ (normed by N) goes to the Wigner semi-circle for $N \to \infty$ with probability 1:
  • $\rho(\lambda) = \frac{1}{2\pi\sigma^2}\sqrt{4\sigma^2 - \lambda^2}$ for $|\lambda| < 2\sigma$,
  • $\rho(\lambda) = 0$ otherwise
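A quick numerical check of the semi-circle law (my sketch; the parameters are arbitrary):

import numpy as np

rng = np.random.default_rng(1)
N = 2000
M = rng.standard_normal((N, N))
M = (M + M.T) / np.sqrt(2)            # symmetric, off-diagonal variance 1
lam = np.linalg.eigvalsh(M / np.sqrt(N))

hist, edges = np.histogram(lam, bins=50, density=True)
centers = (edges[:-1] + edges[1:]) / 2
rho = np.sqrt(np.maximum(4 - centers**2, 0)) / (2 * np.pi)  # semi-circle, sigma = 1
print(f"max deviation from the semi-circle: {np.abs(hist - rho).max():.3f}")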

23
Remarks on the semi-circle law
  • Can be proved by the method of moments (as done
    originally by Wigner) or by the resolvent method
    (Marchenko and Pastur and countless others)
  • Holds also for slightly dependent or
    non-homogeneous entries
  • The convergence is fast (believed to be of order 1/N, but proved only at a lower rate), especially as concerns the support

24
Wishart matrices
  • Generate very long time series for N iid random
    variables, with an arbitrary distribution of
    finite variance, and cut out samples of length T
    from these, as if making empirical observations.
  • The true covariance matrix of these variables
    is the unit matrix, but if we try to reconstruct
    this from the simulated samples we will not
    recover the unit matrix for any finite T.
    Instead, we will have an empirical covariance
    matrix.

25
Correlation matrix of iid normal random variables
  • The true correlation matrix is the unit matrix, C = 1; its spectrum consists of a single, N-fold degenerate eigenvalue λ = 1.
  • The noise lifts the degeneracy and makes a band out of the single eigenvalue.
26
The corresponding empirical covariance matrix
is the Wishart matrix
  • If N and T → ∞ such that their ratio r = N/T is fixed, r < 1, then the spectrum of this empirical covariance matrix will be the Wishart or Marchenko–Pastur spectrum (eigenvalue distribution):
  • $\rho(\lambda) = \frac{1}{2\pi r \lambda}\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}$ for $\lambda_- \le \lambda \le \lambda_+$, and zero otherwise,
  • where $\lambda_\pm = (1 \pm \sqrt{r})^2$
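A numerical illustration (mine, not from the slides) that the band edges of the empirical spectrum indeed sit near $\lambda_\pm = (1 \pm \sqrt{r})^2$:

import numpy as np

rng = np.random.default_rng(2)
N, T = 500, 1000                     # r = N/T = 0.5
X = rng.standard_normal((T, N))      # iid data, true covariance = identity
lam = np.linalg.eigvalsh(X.T @ X / T)

r = N / T
lam_plus, lam_minus = (1 + r**0.5)**2, (1 - r**0.5)**2
print(f"smallest eigenvalue {lam.min():.3f} vs lambda_- = {lam_minus:.3f}")
print(f"largest  eigenvalue {lam.max():.3f} vs lambda_+ = {lam_plus:.3f}")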

27
Remarks
  • The theorem also holds when the (average) sample
    covariance matrix is of finite rank
  • The assumption that the entries are identically
    distributed is not necessary
  • If T < N the distribution is the same, with an extra point of mass 1 − T/N at the origin
  • If T = N the Marchenko–Pastur law is the squared Wigner semi-circle
  • The proof extends to slightly dependent and inhomogeneous entries
  • The convergence is fast, believed to be of order 1/N, but proved only at a lower rate

28
N = 1000, T/N = 2
29
  • If the matrix elements are not centered but have,
    say, a common mean, one large eigenvalue breaks
    away, the rest stay in the random band
  • Eigenvector components: just as in the Wigner case, the eigenvectors in the bulk are random, while the one outside is delocalized (has nonzero entries everywhere)
  • There is a lot of fluctuation, level crossing,
    random rotation of eigenvectors taking place in
    the random band
  • The eigenvector belonging to the large eigenvalue
    (when there is one) is much more stable. The
    larger the eigenvalue, the more so.

30
An intriguing observation
  • L.Laloux, P. Cizeau, J.-P. Bouchaud, M. Potters,
    PRL 83 1467 (1999) and Risk 12 No.3, 69 (1999)
  • and
  • V. Plerou, P. Gopikrishnan, B. Rosenow, L.A.N.
    Amaral, H.E. Stanley, PRL 83 1471 (1999)
  • noted that there is such a huge amount of noise
    in empirical covariance matrices that it may be
    enough to make them useless.
  • A paradox: covariance matrices are in widespread use and banks still survive?!

31
Laloux et al. 1999
The spectrum of the covariance matrix obtained from the time series of the S&P 500 with N = 406, T = 1308, i.e. N/T ≈ 0.31, compared with that of a completely random matrix (solid curve). Only about 6% of the eigenvalues lie beyond the random band.
32
Remarks on the paradox
  • The number of "junk" eigenvalues may not necessarily be a proper measure of the effect of noise: the small eigenvalues and their eigenvectors fluctuate a lot, indeed, but perhaps they have a relatively minor effect on the optimal portfolio, whereas the large eigenvalues and their eigenvectors are fairly stable.
  • The investigated portfolio was too large compared
    with the length of the time series (although it
    is hard to find a better ratio in practice).
  • Working with real, empirical data, it is hard to
    distinguish the effect of insufficient
    information from other parasitic effects, like
    nonstationarity (which is why we prefer to work
    with simulated data for the purposes of
    theoretical studies).

33
A filtering procedure suggested by RMT
  • The appearance of random matrices in the context of portfolio selection triggered a lot of activity, mainly among physicists. Laloux et al. and Plerou et al. subsequently proposed a filtering method based on random matrix theory (RMT). This has been further developed and refined by many workers.
  • The proposed filtering consists basically in
    discarding as pure noise that part of the
    spectrum that falls below the upper edge of the
    random spectrum. Information is carried only by
    the eigenvalues and their eigenvectors above this
    edge. Optimization should be carried out by
    projecting onto the subspace of large
    eigenvalues, and replacing the small ones by a
    constant chosen so as to preserve the trace. This
    would then drastically reduce the effective
    dimensionality of the problem.
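A sketch of this clipping procedure (my reading of the filter described above; it assumes standardized returns, so that the Marchenko–Pastur edge $(1 + \sqrt{N/T})^2$ applies directly; for a covariance rather than correlation matrix the edge must be rescaled by the average variance):

import numpy as np

def rmt_filter(sigma_hat, N, T):
    """Keep eigenvalues above the random band edge; replace the rest by their
    mean, which preserves the trace of the matrix."""
    lam, V = np.linalg.eigh(sigma_hat)
    lam_plus = (1 + np.sqrt(N / T))**2      # upper edge of the random band
    noise = lam < lam_plus                  # the part treated as pure noise
    lam_filtered = lam.copy()
    if noise.any():
        lam_filtered[noise] = lam[noise].mean()
    return (V * lam_filtered) @ V.T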

34
  • Interpretation of the large eigenvalues: the largest one is the market, the other big eigenvalues correspond to the main industrial sectors.
  • The method can be regarded as a systematic
    version of principal component analysis, with an
    objective criterion on the number of principal
    components.
  • In order to better understand this novel filtering method, I introduce a simple market model:

35
Simple model: market sectors
[Figure: the model's true spectrum — a single large (market) eigenvalue and degenerate groups of smaller (sector) eigenvalues]
36
  • The empirical covariance matrix corresponding to this model consists of the Marchenko–Pastur spectrum, a large (Frobenius–Perron) eigenvalue (the whole market), and a number of medium-sized eigenvalues.
  • If we lift the equivalence of the sectors, then with appropriate tuning of the parameters we can mimic the spectrum observed in real markets (Noh model).

37
  • We have made extensive studies of RMT-based filtering and found that it performs consistently well compared with other, more conventional methods.
  • An additional advantage is that the method can be
    tuned according to the assumed structure of the
    market.
  • There are attempts to extract information from
    below the random band edge.

38
Divergent sampling error: an algorithmic phase transition
39
A measure of the effect of noise
  • Assume we know the true covariance matrix $\sigma^{(0)}_{ij}$ and the noisy one $\hat{\sigma}_{ij}$. Then a natural, though not unique, measure of the impact of noise is
  • $q_0^2 = \frac{\sum_{ij} w^*_i \sigma^{(0)}_{ij} w^*_j}{\sum_{ij} w^{(0)}_i \sigma^{(0)}_{ij} w^{(0)}_j}$
  • where $w^*$ and $w^{(0)}$ are the optimal weights corresponding to $\hat{\sigma}$ and $\sigma^{(0)}$, respectively.

40
The model-simulation approach
  • For the purposes of our numerical calculations we chose various model covariance matrices $\sigma^{(0)}$ and generated long simulated time series with them. Then we cut out segments of length T from these time series, as if observing them on the market, and tried to reconstruct the covariance matrices from them. We optimized a portfolio both with the "true" and with the "observed" covariance matrix and determined the measure $q_0$ (a sketch of this procedure follows below).
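A sketch of this procedure in the simplest case (my illustration): with the true covariance matrix equal to the identity, the true optimal weights are $w^{(0)}_i = 1/N$, and $q_0^2$ reduces to $N \, w^{*T} w^*$:

import numpy as np

rng = np.random.default_rng(3)

def q0(N, T):
    X = rng.standard_normal((T, N))            # one simulated observation window
    sigma_hat = X.T @ X / T                    # noisy covariance estimate
    x = np.linalg.solve(sigma_hat, np.ones(N))
    w = x / x.sum()                            # noisy optimal weights
    return np.sqrt(N * (w @ w))                # q0 against the true optimum

print(np.mean([q0(100, 500) for _ in range(20)]))  # close to 1/sqrt(1 - 0.2) ~ 1.12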

41
Fluctuations over the samples
  • The relative error $q_0$ refers to a given sample, so it is a random variable that fluctuates from sample to sample.
  • Likewise, there are strong fluctuations in the weights of the optimal portfolio.

42
The distribution of $q_0$ over the samples
43
The average of $q_0$ as a function of N/T
44
The divergent error signals an algorithmic phase transition (I.K., Sz. Pafka, G. Nagy)
  • The rank of the covariance matrix is min(N, T)
  • In the limit N/T → 1 the lower band edge of the eigenvalue spectrum goes to zero; around the lower edge there are many small eigenvalues, i.e. many "soft modes"
  • N/T = 1 is the critical point of the problem
  • Upon approaching the critical point we find scaling laws: e.g. the expectation value of the portfolio error diverges as $\langle q_0 \rangle \sim (1 - N/T)^{-1/2}$, while the standard deviation of $q_0$ over the samples also diverges (see the sketch below)
  • For T < N zero modes appear and the optimization becomes meaningless
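A sketch probing the scaling law numerically (my illustration; $q_0$ is computed as above, for an identity true covariance matrix):

import numpy as np

rng = np.random.default_rng(4)
N = 100

def q0(T):
    X = rng.standard_normal((T, N))
    x = np.linalg.solve(X.T @ X / T, np.ones(N))
    w = x / x.sum()
    return np.sqrt(N * (w @ w))

for T in (1000, 200, 125, 110):               # N/T approaching the critical point
    q = [q0(T) for _ in range(50)]
    print(f"N/T = {N/T:.2f}  <q0> = {np.mean(q):.2f}  "
          f"(1 - N/T)^(-1/2) = {(1 - N/T)**-0.5:.2f}  std = {np.std(q):.2f}")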

45
Fluctuations of the weights: the distribution of the weights of a portfolio consisting of N = 100 iid normal variables, in a given sample, for T = 500
46
Sample-to-sample fluctuation of the weight of a given instrument, non-overlapping windows, N = 100, T = 500
47
Fluctuation of the weight of a given instrument, step-by-step moving average, N = 100, T = 500
48
After RMT filtering the error drops to an acceptable level, and we can even penetrate the region T < N
49
Finite volume
  • A ban on short selling, any other constraint that renders the domain of optimization finite, or filtering will all suppress the infinite fluctuations. However, the weights will keep fluctuating wildly as we approach N/T = 1, and an increasing fraction of them will stick to the "walls" of the allowed region (see the sketch below). These zero weights will belong to different instruments in different samples. If we are not sufficiently far from the critical point, the solution of the Markowitz problem cannot serve as the basis of rational decision making.
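A sketch (mine) of this effect: with a no-short-selling constraint, the task is a quadratic program, and near the critical point a large fraction of the optimal weights ends up exactly at zero. The solver choice (SciPy's SLSQP) is incidental:

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
N, T = 50, 60                                  # N/T close to the critical point
X = rng.standard_normal((T, N))
sigma_hat = X.T @ X / T                        # noisy covariance estimate

res = minimize(lambda w: w @ sigma_hat @ w,    # portfolio variance
               np.full(N, 1 / N),              # start from equal weights
               bounds=[(0, None)] * N,         # ban on short selling
               constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1}])
print((res.x < 1e-6).sum(), "of", N, "weights are stuck at zero")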

50
Universality
  • We have studied a number of different market
    models, different risk measures and different
    underlying processes (including fat tailed ones,
    and, with István Varga-Haszonits, also
    autoregressive, GARCH-like processes). The value
    of the critical point and the coefficients can
    change, but we have not yet found convincing evidence for any change in the critical exponents: we have not yet discovered the boundaries of the universality class.

51
How come these phenomena have not been noticed
earlier?
  • Somehow the scaling has escaped attention
  • Typically, econometricians study the limit N fixed, T → ∞, rather than N/T fixed, N, T → ∞.
  • The instability of the weights is an everyday
    experience, but the idea that their fluctuations
    can actually diverge has not arisen. If one
    insists on using empirical data, one cannot study
    the fluctuations over the samples, because there
    are not enough of them.
  • Random matrices, critical phenomena, zero modes,
    etc. are mostly unknown in finance.
  • The different aspects of the problem have not been integrated into a coherent picture, which can only be achieved on the basis of the phase transition concept.

52
Replicas
53
Optimization and statistical mechanics
  • Any convex optimization problem can be
    transformed into a problem in statistical
    mechanics, by promoting the objective function
    into a Hamiltonian, and introducing a fictitious
    temperature. At the end we can recover the
    original problem in the limit of zero
    temperature.
  • Averaging over the time series segments (samples) is similar to what is called quenched averaging in the statistical physics of random systems: one has to average the logarithm of the partition function (i.e. the cumulant generating function).
  • Averaging can then be performed by the replica trick: a heuristic, but very powerful, method that is on its way to being firmly established by mathematicians (Guerra and Talagrand).

54
The first application of replicas in a finance context: the ES phase boundary (A. Ciliberti, I.K., M. Mézard)
  • ES is the average loss above a high threshold β (a conditional expectation value). It is very popular among academics and slowly spreading in practice. In addition, as shown by Uryasev and Rockafellar, the optimization of ES can be reduced to linear programming, for which very fast algorithms exist.
  • Portfolios optimized under ES are much more noisy than those optimized under either the variance or absolute deviation. The critical point of ES is always below N/T = 1/2 and depends on the threshold, so it defines a phase boundary in the N/T–β plane.
  • The measure ES can become unbounded from below with a certain probability for any finite N and T, and then the optimization is not feasible!
  • The transition for finite N, T is smooth; for N, T → ∞ it becomes a sharp phase boundary that separates the region where the optimization is feasible from that where it is not.

55
Formulation of the problem
  • The time series of returns: $x_{it}$, $i = 1, \dots, N$, $t = 1, \dots, T$
  • The objective function (Rockafellar–Uryasev): $F_\beta(w, v) = v + \frac{1}{(1-\beta)T} \sum_{t=1}^{T} \big[ -v - \sum_i w_i x_{it} \big]^{+}$
  • The variables: the weights $w_i$, the threshold $v$, and auxiliary variables $u_t \ge 0$
  • The linear programming problem: minimize $v + \frac{1}{(1-\beta)T} \sum_t u_t$ subject to $u_t \ge 0$ and $u_t + v + \sum_i w_i x_{it} \ge 0$
  • Normalization: $\sum_i w_i = 1$
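A sketch of this linear program with SciPy (my implementation of the Rockafellar–Uryasev construction above; the returns are simulated iid data):

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(6)
N, T, beta = 10, 200, 0.95
X = rng.standard_normal((T, N))                # simulated returns x_it

# variables z = (w_1..w_N, v, u_1..u_T)
c = np.concatenate([np.zeros(N), [1.0], np.full(T, 1 / ((1 - beta) * T))])
A_ub = np.hstack([-X, -np.ones((T, 1)), -np.eye(T)])  # -u_t - v - sum_i w_i x_it <= 0
b_ub = np.zeros(T)
A_eq = np.concatenate([np.ones(N), [0.0], np.zeros(T)]).reshape(1, -1)
b_eq = [1.0]                                   # normalization sum_i w_i = 1
bounds = [(None, None)] * (N + 1) + [(0, None)] * T   # w, v free; u_t >= 0

res = linprog(c, A_ub, b_ub, A_eq, b_eq, bounds=bounds)
print("feasible:", res.success, " minimal ES:", res.fun)

When N/T exceeds the critical value, the LP becomes unbounded and res.success comes back False; this is exactly the feasibility transition discussed above.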

56
Associated statistical mechanics problem
  • Partition function: $Z_\gamma = \int \mathrm{d}w \, e^{-\gamma E[w]}$, with the objective function E[w] promoted to a Hamiltonian and γ an inverse temperature
  • Free energy: $f(\gamma) = -\frac{1}{\gamma N} \ln Z_\gamma$
  • The optimal value of the objective function is recovered in the zero-temperature limit $\gamma \to \infty$

57
The partition function
  • Lagrange multipliers

58
Replicas
  • Trivial identity: $\ln Z = \lim_{n \to 0} \frac{Z^n - 1}{n}$
  • We consider n identical replicas
  • The probability distribution of the n-fold
    replicated system
  • At an appropriate moment we have to analytically continue to real values of n
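A worked form of this identity and of the quenched average it enables (standard replica-method steps, written out by me, not recovered from the slide images):

\[
\langle \ln Z \rangle = \lim_{n \to 0} \frac{\langle Z^n \rangle - 1}{n},
\qquad
\langle Z^n \rangle = \left\langle \int \prod_{a=1}^{n} \mathrm{d}w^{a} \; e^{-\gamma \sum_{a=1}^{n} E[w^{a}]} \right\rangle
\]
% For integer n, Z^n is the partition function of n independent copies
% ("replicas") of the system with the same disorder (the same sample).
% The average over samples couples the replicas; the result is then
% continued analytically to real n -> 0.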

59
Averaging over the random samples
  • where

60
Replica-symmetric Ansatz
  • By symmetry considerations
  • Saddle point condition
  • where

61
Condition for the existence of a solution to the
linear programming problem
  • The meaning of the parameter
  • Equation of the phase boundary

62
(No Transcript)
63
The limit β → 1
  • The problem goes over into the minimax problem of maximal loss: $\min_w \max_t \big( -\sum_i w_i x_{it} \big)$
  • In this limit the phase boundary can be
    determined by a direct geometric argument (I.K.,
    Sz. Pafka, G. Nagy)

64
The probability of the solvability of the minimax
problem
  • For T > N the probability of a solution (for any elliptical underlying process) is given by a closed combinatorial formula (see the sketch below).
  • (The problem is isomorphic to some operations research and random geometry tasks: Todd, M. J. (1991), Probabilistic models for linear programming, Math. Oper. Res. 16, 671-693.)
  • For large N and T, p goes over into the error function.
  • For N, T → ∞, the transition becomes sharp at N/T = 1/2.
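The formula itself was an image on the original slide; below is a sketch assuming a Wendel-type expression, $p = 2^{-(T-1)} \sum_{k=N}^{T-1} \binom{T-1}{k}$, which reproduces the properties quoted above (an error-function shape for large N and T, and a sharp step at N/T = 1/2):

from math import comb

def p_solvable(N, T):
    """Assumed probability that the minimax problem has a finite solution."""
    return sum(comb(T - 1, k) for k in range(N, T)) / 2 ** (T - 1)

for N in (20, 45, 50, 55, 80):
    print(f"N = {N:3d}, T = 100:  p = {p_solvable(N, 100):.3f}")
# p ~ 1 well below N/T = 1/2 and ~ 0 well above; the step sharpens as N and T
# grow at fixed N/T, approaching an error function.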

65
(No Transcript)
66
(No Transcript)
67
Conclusions
  • P. W. Anderson: "The fact is that the techniques which were developed for this apparently very specialized problem of a rather restricted class of special phase transitions and their behavior in a restricted region are turning out to be something which is likely to spread over not just the whole of physics but the whole of science."

68
In a similar spirit...
  • I think the phenomenon treated here, that is, the sampling error catastrophe due to the lack of sufficient information, appears in a much wider set of problems than just the problem of investment decisions (e.g. multivariate regression, all sorts of linear programming based optimization problems in technology and economics, microarrays, etc.).
  • Whenever a phenomenon is influenced by a large number of factors, but we have only a limited amount of information about this dependence, we have to expect that the estimation error will diverge and fluctuations over the samples will be huge.