POLYNOMIAL TIME HEURISTIC OPTIMIZATION METHODS APPLIED TO PROBLEMS IN COMPUTATIONAL FINANCE

1
POLYNOMIAL TIME HEURISTIC OPTIMIZATION METHODS APPLIED TO PROBLEMS IN COMPUTATIONAL FINANCE
Ph.D. dissertation of Fogarasi Norbert, M.Sc.
Supervisor: Dr. Levendovszky János, D.Sc., Doctor of the Hungarian Academy of Sciences
Department of Telecommunications, Budapest University of Technology and Economics
Budapest, 25 February 2014
2
Outline of Presentation

Introduction
Computational Finance and Recent Industry Trends
My contribution
Thesis Group I. Mean reverting portfolio
selection
Thesis Group II. Optimal scheduling on identical
machines
Summary of results and contributions
Questions and Answers

3
Computational Finance and Recent Industry Trends

Relatively new branch of computer science (Markowitz, 1950s: Modern Portfolio Theory; Nobel Prize in 1990)
Numerical methods and algorithms with a strong focus on applicability (quantitative study of markets, arbitrage, options pricing, mortgage securitization)
Recent focus: algorithmic trading, quantitative investing, high-frequency trading
After the 2008 financial crisis:
Regulatory pressure (timely reporting, transparency)
High-frequency trading (flash crashes)
Unprecedented focus on cost and efficiency
Finding quick (polynomial time) approximate solutions to difficult (exponential, NP-hard) problems is a key focus

4
Computational Finance Open Issues
Challenges

My Contribution

Polynomial time approximation using stochastic
optimization
Real-time portfolio identification
Overnight Monte-Carlo risk calculation scheduling
Polynomial time heuristic scheduling algorithms
5
My Contribution

Finding polynomial time approximate solutions to NP-hard problems
Mean Reverting Portfolio selection (Thesis Group
I)
Task Scheduling on Identical machines (Thesis
Group II)
Show measurable improvement to existing
approximate methods
Prove practical applicability in real world
settings
Have very quick runtime characteristics for high
frequency trading, timely regulatory reporting
and hardware cost savings
5 refereed journal publications, 1 conference
presentation
1. Fogarasi, N., Levendovszky, J. (2012) A simplified approach to parameter estimation and selection of sparse, mean reverting portfolios. Periodica Polytechnica, 56/1, 21-28.
2. Fogarasi, N., Levendovszky, J. (2012) Improved parameter estimation and simple trading algorithm for sparse, mean-reverting portfolios. Annales Univ. Sci. Budapest., Sect. Comp., 37, 121-144.
3. Fogarasi, N., Tornai, K., Levendovszky, J. (2012) A novel Hopfield neural network approach for minimizing total weighted tardiness of jobs scheduled on identical machines. Acta Univ. Sapientiae, Informatica, 4/1, 48-66.
4. Tornai, K., Fogarasi, N., Levendovszky, J. (2013) Improvements to the Hopfield neural network solution to the total weighted tardiness scheduling problem. Periodica Polytechnica, 57/1, 1-8.
5. Fogarasi, N., Levendovszky, J. (2013) Sparse, mean reverting portfolio selection using simulated annealing. Algorithmic Finance, 2/3-4, 197-211.
6. Fogarasi, N., Levendovszky, J. (2012) Combinatorial methods for solving the generalized eigenvalue problem with cardinality constraint for mean reverting trading. 9th Joint Conf. on Math and Comp. Sci., February 2012, Siófok, Hungary.

6
Summary of numerical results on real world
problems
Field | Real world problem | Average performance of traditional approaches | Average performance of the proposed new method | Impact on computational finance (improvement)
Portfolio optimization | Convergence trading on US S&P 500 stock data | 11.6% (S&P 500 index return) | 34% | 22.4 percentage points
Schedule optimization | Morgan Stanley overnight scheduling problem | 24709 (LWPF performance) | 22257 (PSHNN performance) | 10%
7
Thesis Group I. Mean reverting portfolio
selection

Modern Portfolio Theory (MPT): maximize expected return for a given amount of risk
Profitability vs. predictability

Mean-reverting portfolios have a large degree of predictability
Therefore, we can develop profitable convergence trading strategies (35% annual return on a portfolio selected from the S&P 500)

8
Intuitive task description
My contribution: developing novel algorithms for identifying mean-reverting portfolios with cardinality constraints, together with trading and performance analysis
9
Thesis Group I. Problem Description
How to identify mean-reverting portfolios based on multivariate historical time series?
Constraint: sparse portfolio (limited transaction costs, strategy easier to understand and interpret)
d'Aspremont, A. (2011) Identifying small mean-reverting portfolios. Quantitative Finance, 11(3), 351-364 (Ecole Polytechnique, Paribas London, PhD Stanford, postdoc Berkeley, Princeton)
10
Thesis Group I. Summary
Thesis I.1: New numerical method for estimating the covariance matrix of a VAR(1) process (Periodica Polytechnica 2011)
Thesis I.2: Adapted simulated annealing to the problem of maximizing mean reversion under a cardinality constraint (Algorithmic Finance 2013)
Thesis I.3: Novel mean estimation technique for O-U processes using pattern matching (Annales Univ. Sci. Budapest. 2012)
Thesis I.4: Simple trading strategy based on a decision theoretic formulation (Joint Conf. on Math and Comp. Sci. 2012)
11
Thesis Group I. The model
12
The discrete model: VAR(1), a first-order vector autoregressive process
13
Optimal portfolio as a generalized eigenvalue
problem

Problem: develop a fast solution to the generalized eigenvalue problem under the cardinality constraint (NP-hard → polynomial time?)
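
The formula on this slide did not survive in the transcript. A hedged reconstruction is sketched below. It assumes the column-vector convention s_t = A s_{t-1} + W_t with stationary covariance G, and the symbols x (portfolio weights), k (cardinality limit) and ν (predictability) are notation chosen here for illustration. The slides' greedy description selects the largest eigenvalue, so the sketch maximizes the ratio.

```latex
% Assumed notation: s_t = A s_{t-1} + W_t, G = Cov(s_t), x = portfolio weights.
% Predictability of the portfolio p_t = x^T s_t as a generalized Rayleigh quotient:
\nu(x) = \frac{x^{T} A G A^{T} x}{x^{T} G x}

% Cardinality-constrained selection; card(x) counts the non-zero entries of x:
x_{\mathrm{opt}} = \arg\max_{x \in \mathbb{R}^{n},\ \operatorname{card}(x) \le k} \nu(x)
```

Without the cardinality constraint, the maximizer is the eigenvector belonging to the largest eigenvalue of the generalized problem A G A^T x = λ G x.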
14
Thesis I.1 Estimation of Model Parameters

Given n x T historical VAR(1) data s_t, we need to estimate A, K (covariance matrix of W) and G (covariance matrix of s_t)
A and K can be estimated using maximum likelihood
G can be estimated using the sample covariance. Classical research focuses on regularization techniques (Dempster 1972, Banerjee et al. 2008, d'Aspremont et al. 2008, Rothman et al. 2008)
My novel approach: use the sample covariance and an iterative recursive estimate in tandem to approximate G.
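
As a small illustration of the estimation step, the sketch below treats the scalar special case n = 1, where the model reduces to s_t = a s_{t-1} + w_t. It is not the thesis implementation: the function name fit_ar1 and the toy series are assumptions made here. For Gaussian noise, the conditional maximum likelihood estimate of a coincides with least squares.

```python
def fit_ar1(series):
    """Fit s_t = a * s_{t-1} + w_t by least squares (Gaussian conditional ML)."""
    prev, curr = series[:-1], series[1:]
    a = sum(c * p for c, p in zip(curr, prev)) / sum(p * p for p in prev)
    residuals = [c - a * p for c, p in zip(curr, prev)]
    k = sum(r * r for r in residuals) / len(residuals)       # noise variance (scalar K)
    mean = sum(series) / len(series)
    g = sum((s - mean) ** 2 for s in series) / len(series)   # sample variance (scalar G)
    return a, k, g

# Toy, noise-free series that halves at every step (hypothetical data).
series = [8.0, 4.0, 2.0, 1.0, 0.5]
a_hat, k_hat, g_hat = fit_ar1(series)
```

In the multivariate case a becomes the matrix A, and k and g become the covariance matrices K and G.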

15
Thesis I.1 Estimation of covariance

From the definition of VAR(1), the Lyapunov relationship holds in the stationary case
However, the solution may be non-positive definite, so we introduce a numerical method that ensures positive definiteness
Start with G(0) = sample covariance
This also gives a measure of goodness of model fit: it is 0 for generated VAR(1) data and shows how well the VAR(1) assumption works for real data.
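
The iteration formula itself was lost from the slide. As a stand-in, the sketch below uses the plain fixed-point iteration G ← A G Aᵀ + K for the stationary Lyapunov relationship G = A G Aᵀ + K, assuming s_t = A s_{t-1} + W_t. Every iterate stays positive definite when K is positive definite and the starting point is positive semidefinite, and the iteration converges when all eigenvalues of A have modulus below 1. The matrices and the helper names are made up for illustration.

```python
def matmul(X, Y):
    return [[sum(X[i][m] * Y[m][j] for m in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def lyapunov_fixed_point(A, K, G0, steps=200):
    """Iterate G <- A G A^T + K, starting from G0 (e.g. the sample covariance)."""
    G = [row[:] for row in G0]
    At = transpose(A)
    n = len(K)
    for _ in range(steps):
        AGAt = matmul(matmul(A, G), At)
        G = [[AGAt[i][j] + K[i][j] for j in range(n)] for i in range(n)]
    return G

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(len(X)) for j in range(len(X)))

A = [[0.5, 0.1], [0.0, 0.25]]          # hypothetical estimate of A
K = [[1.0, 0.0], [0.0, 1.0]]           # hypothetical noise covariance
G_sample = [[1.4, 0.1], [0.1, 1.1]]    # hypothetical sample covariance
G = lyapunov_fixed_point(A, K, G_sample)
fit_gap = max_abs_diff(G, G_sample)    # one possible goodness-of-fit number
```

The gap between the sample covariance and the Lyapunov solution is used here as one possible fit measure; the measure defined in the thesis may differ.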

16
Thesis I.1 Numerical results
[Figure: average norm of the estimation error vs. t for n = 8 and σ = 0.1, 0.3, 0.5; 100 independent time series were generated for each t]
17
Cardinality reduction by exhaustive search
18
Polynomial Time Heuristic Approaches

Greedy Method (d'Aspremont 2011): on each iteration, consider adding each of the remaining n - k dimensions and choose the one that yields the largest maximum eigenvalue.
Truncation Method (Fogarasi et al. 2012): compute the unconstrained solution, then use the k heaviest dimensions to solve the constrained problem. Very fast heuristic (only 2 eigenvalue computations)

19
Greedy Solution (d'Aspremont 2011)

Simple iterative heuristic that works very well in practice. d'Aspremont couldn't consistently outperform it.
Let I_k be the set of indices belonging to the k non-zero components of x.
On each iteration, we consider adding each of the remaining n - k dimensions and we choose the one that yields the largest maximum eigenvalue.
This amounts to solving (n - k) generalized eigenvalue problems of size k + 1. Polynomial runtime
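
The greedy loop described above can be sketched in plain Python as follows. This is an illustrative sketch, not the published code. M stands for the numerator matrix (A G Aᵀ under the assumed convention) and G for the covariance; the example matrices are made up. The largest generalized eigenvalue is approximated by power iteration on G⁻¹M, which assumes M is positive semidefinite and G positive definite.

```python
import math

def solve(G, b):
    """Solve G y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(G[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        y[r] = (aug[r][n] - sum(aug[r][j] * y[j] for j in range(r + 1, n))) / aug[r][r]
    return y

def quad(M, x):
    return sum(x[i] * M[i][j] * x[j] for i in range(len(x)) for j in range(len(x)))

def max_gen_eig(M, G, iters=500):
    """Approximate the largest lambda with M x = lambda G x (power iteration)."""
    n = len(M)
    x = [1.0] * n
    for _ in range(iters):
        Mx = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        y = solve(G, Mx)
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:
            return 0.0
        x = [v / norm for v in y]
    return quad(M, x) / quad(G, x)

def sub(M, idx):
    return [[M[i][j] for j in idx] for i in idx]

def greedy_support(M, G, k):
    """Grow the index set one asset at a time, keeping the best eigenvalue."""
    support, best = [], 0.0
    for _ in range(k):
        scored = [(max_gen_eig(sub(M, support + [i]), sub(G, support + [i])), i)
                  for i in range(len(M)) if i not in support]
        best, chosen = max(scored)
        support.append(chosen)
    return sorted(support), best

M = [[2.0, 0.5, 0.1], [0.5, 1.0, 0.3], [0.1, 0.3, 1.5]]   # hypothetical A G A^T
G = [[1.0, 0.2, 0.0], [0.2, 1.0, 0.1], [0.0, 0.1, 1.0]]   # hypothetical covariance
support, value = greedy_support(M, G, 2)
```

Because each step only enlarges the index set, the best eigenvalue found cannot decrease as k grows.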

20
Thesis I.2 Application of SA by random
projection

Restrict the portfolio vector x to have only integer values
Consider the energy function to be minimized
At each step of the algorithm, we consider a neighboring state w' of the current state w and decide between moving to w' or staying in w
As T is decreased (cooling), this procedure has been proven to converge (in distribution) to the optimal solution.

21
Thesis I.2 Application of SA by random
projection

Cardinality constraint can easily be built into
the neighbor function
Starting point can be selected as Greedy solution
Memory feature can be built in to ensure solution
is at least as good as starting point
Periodically reverting to the starting point improves performance
The cooling schedule can be set to be fast enough for the specific application
The procedure can be stopped at any point; an adaptive stopping condition has also been developed.
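
A minimal simulated annealing sketch along these lines is given below. Several details are assumptions made here rather than taken from the thesis: the energy is the negative ratio −(xᵀMx)/(xᵀGx), the neighbor function keeps the number of non-zero integer weights fixed, acceptance follows the Metropolis rule, and cooling is geometric. The periodic revert and the adaptive stopping rule are not included.

```python
import math
import random

def energy(x, M, G):
    """E(x) = -(x^T M x) / (x^T G x); lower energy means a better portfolio."""
    n = len(x)
    num = sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))
    den = sum(x[i] * G[i][j] * x[j] for i in range(n) for j in range(n))
    return -num / den

def neighbor(x, rng, max_abs=3):
    """Random move that leaves the number of non-zero entries unchanged."""
    y = list(x)
    support = [i for i, v in enumerate(y) if v != 0]
    zeros = [i for i, v in enumerate(y) if v == 0]
    i = rng.choice(support)
    if zeros and rng.random() < 0.5:
        j = rng.choice(zeros)              # move a weight to an unused asset
        y[j], y[i] = y[i], 0
    else:                                  # resample one integer weight, never 0
        y[i] = rng.choice([v for v in range(-max_abs, max_abs + 1) if v != 0])
    return y

def anneal(x0, M, G, steps=2000, t0=1.0, cooling=0.995, seed=0):
    rng = random.Random(seed)
    current, e_cur = list(x0), energy(x0, M, G)
    best, e_best = list(current), e_cur    # memory: never worse than the start
    t = t0
    for _ in range(steps):
        cand = neighbor(current, rng)
        e_cand = energy(cand, M, G)
        if e_cand <= e_cur or rng.random() < math.exp(-(e_cand - e_cur) / t):
            current, e_cur = cand, e_cand
        if e_cur < e_best:
            best, e_best = list(current), e_cur
        t *= cooling                       # geometric cooling (assumed schedule)
    return best, e_best

M = [[2.0, 0.5, 0.1], [0.5, 1.0, 0.3], [0.1, 0.3, 1.5]]   # hypothetical A G A^T
G = [[1.0, 0.2, 0.0], [0.2, 1.0, 0.1], [0.0, 0.1, 1.0]]   # hypothetical covariance
x_start = [1, 1, 0]                                        # e.g. a greedy solution
x_best, e_best = anneal(x_start, M, G)
```

Because the best state seen so far is stored separately from the current state, the returned portfolio is at least as good as the starting point regardless of the cooling schedule.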

22
Thesis I.2 Numerical Results

For n = 10, k = 5, Greedy and SA both find the theoretical best in 70% of cases, and SA outperforms Greedy in 11% of cases (out of the remaining 30%).
For larger problem sizes, SA performs even better (e.g. for n = 20, k = 10 it outperforms Greedy in 25% of the cases)

23
Thesis I.2 Runtime Analysis

Truncation method: sub-second portfolio selection, can be used in real-time algorithmic trading
Greedy: seconds to compute, can be used in intraday trading
Simulated Annealing: minutes to compute, improves upon Greedy, can be used to fine-tune intraday trading
Exhaustive: impractical for n > 20, can be used for low-frequency trading

CPU runtime (in seconds) versus total number of
assets n, to compute a full set of sparse
portfolios, with cardinality ranging from 1 to n
24
Thesis I.3 Portfolio mean estimation

Given historical portfolio valuations p_t, and assuming they follow an O-U process, estimate µ.
Classical methods in the literature:
Sample mean estimate
Least squares regression
Maximum likelihood estimator (numerically complex)
I developed a novel mean estimation method based on pattern matching and decision theory

25
Thesis I.3 Novel portfolio parameter estimation
using pattern matching

Starting from the definition of the Ornstein-Uhlenbeck process
Taking the expected value of the above
Typical tendencies of µ(t) below (ideal portfolio value without noise, given µ(0) and the long-term µ)
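
The equations on this slide were lost in the transcript. In standard notation they would read as below. The symbols λ (speed of mean reversion), σ (volatility) and W(t) (Wiener process) are assumed here, following the usual form of the Ornstein-Uhlenbeck process.

```latex
% Ornstein-Uhlenbeck dynamics of the portfolio value p(t), long-term mean \mu:
dp(t) = \lambda\,(\mu - p(t))\,dt + \sigma\,dW(t)

% Taking expectations, with \mu(t) = E[p(t)]:
\frac{d\mu(t)}{dt} = \lambda\,(\mu - \mu(t))
\quad\Longrightarrow\quad
\mu(t) = \mu + (\mu(0) - \mu)\,e^{-\lambda t}
```

The noise-free path therefore decays exponentially from µ(0) towards µ, rising when µ(0) < µ, falling when µ(0) > µ, and staying flat when µ(0) = µ.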

26
Thesis I.3 Portfolio mean estimation

Use p_t and maximum likelihood estimation techniques to decide which pattern it matches best, and determine the long-term µ
where U is the time correlation matrix of p_t
This estimate is more accurate than the sample mean and more resilient to small λ than linear regression.

27
Thesis I.3 Portfolio mean estimation
28
Thesis I.4 Simple Convergence Trading Model

We decide whether µ(t) < µ by observing only p(t), using an approach based on decision theory
We can use this simplified model to prove the
economic viability of our algorithms and compare
them to each other.

29
Thesis I.4 Simple Convergence Trading Model
As a result, for a given rate of acceptable error ε, we can select an α for which
Thus the trading strategy can be summarized as follows
Observed sample Accepted hypothesis Error probability Action (Cash / Portfolio)
Buy / Hold
No Action / Sell
No Action / Sell
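
The first three columns of this table were lost, so the exact decision thresholds are not recoverable from the transcript. The sketch below assumes one plausible reading: the hypothesis µ(t) < µ is accepted when the observed price falls below µ − α, which triggers Buy (holding cash) or Hold (holding the portfolio); in every other case the action is No Action (cash) or Sell (portfolio). The function name trade and the sample prices are illustrative.

```python
def trade(prices, mu, alpha, cash=100.0):
    """All-in / all-out convergence trading on a single portfolio price series."""
    units = 0.0
    for p in prices:
        below = p < mu - alpha               # assumed rule for accepting mu(t) < mu
        if below and units == 0.0:
            units, cash = cash / p, 0.0      # holding cash: Buy
        elif not below and units > 0.0:
            cash, units = units * p, 0.0     # holding portfolio: Sell
        # remaining cases: Hold (portfolio, below) or No Action (cash, not below)
    return cash + units * prices[-1]         # mark any open position to market

final_value = trade([8.0, 9.5, 10.2, 12.0], mu=10.0, alpha=1.0)
```

With these prices the strategy buys at 8.0, sells at 9.5 once the price is back inside the band, and then stays in cash.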
30
Convergence Trading Simulation on synthetic data

Generated VAR(1) sequence
All four methods produced average profits of the same order of magnitude, and the distributions of trading gains were very similar.
The exhaustive method produced mean reversion coefficients that were on average 15 times those produced by the truncation method and 3 times those produced by the greedy method and simulated annealing
Profits reached by this simple convergence trading strategy are not directly proportional to the lambda produced by the portfolio selection method.
This implies that more complex objective functions could be developed

Total number of assets (n): 10
Cardinality constraint (k): 5
Number of data points for portfolio selection (T): 20
Number of data points in trading window: 250 (number of trading days in a year)
Number of simulations run (rep): 2000
Initial cash: 100
Maximum cash: 14M
Percentage of repeats profitable: 97%
31
Convergence Trading Simulation

Typical example of convergence trading over 250 time steps on a sparse mean-reverting portfolio. A profit of 1440 was achieved from an initial cash of 100 with 85 transactions.

32
Convergence Trading Simulation Histogram of
profits generated by SA

Histogram of profits achieved over 2000 different
generated VAR(1) series by simulated annealing.

33
Thesis Group I. S&P 500 Test

Consider the 500 stocks that made up the S&P 500 during 2009-2010 and select the K = 4 stock portfolio that maximizes mean reversion.
Repeat for 250 trading days (1 year)
The S&P 500 went up by 11.6%; our method generated a 34% return
Minimum, maximum, average and final portfolio values, starting from 100.

34
Thesis Group I. Conclusions and directions for
future research

Novel parameter estimation techniques introduced
Successfully applied simulated annealing with
random projections to meet cardinality
constraints
Outperformed the greedy method in 10% of the cases for 10 assets and in 25% for 20 assets on generated data (good scaling features)
Simple trading model developed for testing
economic viability, proved significant
improvement over other methods
More factors need to be considered for objective
function more closely linked to trading profits
Simulated Annealing approach can be used for
these too!
More complex trading model can be developed to
increase profits

35
Thesis Group II. Optimal Scheduling

Scheduling problems arise in manufacturing, pharmaceutical, biological and financial computations.
Complex portfolios are evaluated and risk managed using Monte-Carlo simulations at many financial institutions (e.g. Morgan Stanley)
Each night a changed portfolio needs to be evaluated and risk managed with new market data and model parameters
Need a quick way to schedule tens of thousands of jobs on tens of thousands of machines in a near-optimal way
Why? 10M/year spend on hardware, timely response to clients and regulators regarding portfolio values and VaR.
My novel method saved 53 minutes on top priority jobs running for 12 hours overnight, compared to the next best heuristic.

36
Thesis Group II. Problem Formulation

Scheduling jobs on a finite number (V) of identical processors under constraints on the completion times
Given n users/jobs of sizes x = (x_1, ..., x_n)
Cutoff times K = (K_1, ..., K_n)
Weights/priorities w = (w_1, ..., w_n)
Scheduling matrix C
where C_ij = 1 if job i is processed at time step j, and 0 otherwise.
Jobs can stop and restart on a different machine (preemption)
For example: V = 2, n = 3, x = (2, 3, 1), K = (3, 3, 3).
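
To make the scheduling matrix concrete, the sketch below checks one feasible schedule for this example. The matrix C shown is just one valid choice, and the helper names are made up for illustration. Rows are jobs, columns are time steps, each job must receive exactly its size in time steps, and at most V jobs may run in any single step.

```python
def is_valid_schedule(C, x, V):
    """C[i][j] == 1 when job i runs in time step j (columns are 0-based here)."""
    binary_ok = all(c in (0, 1) for row in C for c in row)
    rows_ok = all(sum(row) == size for row, size in zip(C, x))   # job fully processed
    cols_ok = all(sum(col) <= V for col in zip(*C))              # at most V machines busy
    return binary_ok and rows_ok and cols_ok

def finishing_times(C):
    """1-based index of the last time step in which each job runs."""
    return [max(j + 1 for j, c in enumerate(row) if c) for row in C]

V, x, K = 2, [2, 3, 1], [3, 3, 3]
C = [[1, 1, 0],     # job 1 runs in steps 1 and 2
     [1, 1, 1],     # job 2 runs in steps 1, 2 and 3
     [0, 0, 1]]     # job 3 runs in step 3
valid = is_valid_schedule(C, x, V)
on_time = all(f <= k for f, k in zip(finishing_times(C), K))
```

Here the total work (6 units) exactly fills 2 machines over 3 time steps, so every job can meet its cutoff.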

37
Thesis Group II. Problem Formulation

Define Tardiness as
where is the finishing time of job i as
per C.
Minimizing Total Weighted Tardiness (TWT) is
stated as
Under the following constraints
For example: V = 2, n = 3, x = (2, 3, 2), K = (3, 3, 3), w = (3, 2, 1). Not all jobs can complete before their cutoff times, but the optimal TWT solution is
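
The formulas and the solution matrix on this slide were not preserved. As an illustration only, the sketch below assumes the standard definition of tardiness, max(0, finishing time − cutoff time), and finds a minimum-TWT schedule for the example by brute force over all binary scheduling matrices with 4 time steps. The horizon of 4 and all function names are assumptions made here.

```python
from itertools import product

def total_weighted_tardiness(C, K, w):
    """Sum of weight * max(0, finishing step - cutoff) over all jobs."""
    twt = 0
    for row, cutoff, weight in zip(C, K, w):
        finish = max(j + 1 for j, c in enumerate(row) if c)
        twt += weight * max(0, finish - cutoff)
    return twt

def best_schedule(x, K, w, V, horizon):
    """Exhaustive search; only practical for tiny instances."""
    n = len(x)
    best = None
    for bits in product((0, 1), repeat=n * horizon):
        C = [bits[i * horizon:(i + 1) * horizon] for i in range(n)]
        if any(sum(row) != size for row, size in zip(C, x)):
            continue                      # some job not fully processed
        if any(sum(col) > V for col in zip(*C)):
            continue                      # more than V machines needed in a step
        twt = total_weighted_tardiness(C, K, w)
        if best is None or twt < best[0]:
            best = (twt, [list(row) for row in C])
    return best

twt, C_opt = best_schedule(x=[2, 3, 2], K=[3, 3, 3], w=[3, 2, 1], V=2, horizon=4)
```

For this instance the search reports a minimum TWT of 1: the 7 units of work cannot fit into the 6 machine slots available before the cutoff, so the lowest-weight job finishes one step late.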

38
Heuristic Approaches to TWT

1990: Du and Leung prove that TWT is NP-hard
1979: Dogramaci, Surkis: simple heuristic
1983: Rachamadugu: myopic heuristic, compared to Earliest Due Date (EDD) and Weighted Shortest Processing Time (WSPT)
1998: Azizoglu: branch and bound heuristic, too slow for > 15 jobs
1994: Koulamas: KPM algorithm
2000: Armentano: tabu search
1995: Guinet: simulated annealing, lower bound
2002: Sen; 2008: Biskup: surveys of existing methods
2000: Artificial Neural Network approach to scheduling problems
2004: Maheswaran: Hopfield Neural Network approach to single machine TWT on a specific 10-job problem.

39
Thesis Group II. Optimal Scheduling
Thesis II.1: I converted the TWT problem to quadratic form, including the constraints with heuristic constants (Acta Univ. Sapientiae 2012)
Thesis II.2: I applied the Hopfield Neural Network (HNN) and found approximate solutions in polynomial time (Acta Univ. Sapientiae 2012)
Thesis II.3: I showed that the HNN solution outperforms other simple heuristics on a large set of random problems (Acta Univ. Sapientiae 2012)
Thesis II.4: I improved HNN by intelligent selection of the starting point and random perturbations (Periodica Polytechnica 2013)
40
Thesis II.1 TWT to QP

The HNN is a recurrent neural network that is well suited to solving quadratic optimization problems of the form
It has been applied successfully to find good approximate solutions to the Travelling Salesman Problem.
Our task is to transform the TWT problem into a quadratic optimization problem.
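
The quadratic form itself is missing from the transcript. In the notation commonly used for Hopfield networks it would read as below. The symbols y, W and b are assumed here for illustration and may differ from those on the original slide.

```latex
% Assumed notation: y \in \{-1,+1\}^{N} state vector,
% W symmetric weight matrix with zero diagonal, b bias vector.
f(y) = -\tfrac{1}{2}\, y^{T} W y + b^{T} y

% Hopfield recursion, updating one component at a time:
y_i(k+1) = \operatorname{sgn}\Big( \sum_{j=1}^{N} W_{ij}\, y_j(k) - b_i \Big)
```

Each single-component update leaves f(y) unchanged or lowers it, which is why the recursion can serve as a descent method for the quadratic objective.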

41
Thesis II.1 TWT to QP

Move the constraints into the objective function
Each term of the above sum can be converted to quadratic Lyapunov form separately to bring the expression into the form

42
Thesis II.1 TWT to QP

Results of the matrix conversions

43
Thesis II.2 Applying HNN

Hopfield (1982) proved that the recursion converges to a fixed point, which is a local minimum of a quadratic Lyapunov function
I implemented this in MATLAB, including systematic selection of the heuristic constants α, β and γ. I also developed algorithms to validate and correct the resulting schedule matrix if needed.
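
The convergence property can be illustrated with a small Python sketch (the thesis implementation is in MATLAB and is not reproduced here). It assumes states in {−1, +1}, a symmetric weight matrix W with zero diagonal, and the energy −½ yᵀWy + bᵀy. The example W, b and starting state are made up.

```python
def energy(W, b, y):
    n = len(y)
    quad = sum(W[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    return -0.5 * quad + sum(b[i] * y[i] for i in range(n))

def hopfield(W, b, y0, max_sweeps=100):
    """Asynchronous updates y_i <- sgn(sum_j W_ij y_j - b_i) until a fixed point."""
    y = list(y0)
    trace = [energy(W, b, y)]              # energy after every state change
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(y)):
            h = sum(W[i][j] * y[j] for j in range(len(y))) - b[i]
            new = y[i] if h == 0 else (1 if h > 0 else -1)   # keep state on ties
            if new != y[i]:
                y[i] = new
                changed = True
                trace.append(energy(W, b, y))
        if not changed:
            break                          # fixed point reached
    return y, trace

W = [[0.0, 1.0, -2.0],
     [1.0, 0.0, 1.0],
     [-2.0, 1.0, 0.0]]                     # symmetric, zero diagonal (hypothetical)
b = [0.5, -0.5, 0.0]
y_final, trace = hopfield(W, b, [1, -1, 1])
```

The recorded energies never increase from one update to the next, but the fixed point reached depends on the starting state, which is why the choice of starting point matters.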

44
Thesis II.3 HNN outperforms other simple
heuristics

For each problem size (number of jobs), 100 random problems were generated and the average TWT was computed and plotted

45
Thesis II.3 HNN outperforms other simple
heuristics

HNN consistently outperforms the simple heuristics from the literature over a broad spectrum of problems (LWPF: Largest Weighted Process First, WSPT: Weighted Shortest Processing Time, EDD: Earliest Due Date)

Job size | 5 | 10 | 15 | 20 | 30 | 40 | 50 | 75 | 100
Outperformance (%) | 99.9 | 100 | 100 | 99.5 | 99.2 | 99.6 | 99.3 | 98.6 | 98.8
46
Thesis II.4 Further improving HNN

Smart HNN (SHNN)
Use the result of Largest Weighted Process First (LWPF) as the starting point for HNN rather than random starting points
Speeds up HNN due to the single starting point, but still requires multiple iterations due to the setting of the heuristic constants.
Perturbed Smart HNN (PSHNN)
Consider random perturbations of the LWPF solution as starting points for HNN, in order to avoid getting stuck in local minima

47
Thesis II.4 Further improving HNN

Perturbed Largest Weighted Process First (PLWPF)
Simple, but surprisingly well-performing heuristic
The idea is to avoid getting stuck in local minima by trying starting points near the LWPF solution

48
Thesis II.4 Further improving HNN

For small job sizes, we compare performance to the theoretical best (found by exhaustive search) over 100 randomly generated problems per job size

PSHNN consistently outperforms other methods, but
there is room for improvement

49
Thesis II.4 Further improving HNN

For small job sizes, we compare performance to the theoretical best (found by exhaustive search) over 100 randomly generated problems per job size

PSHNN outperforms other methods by increasing
margin as job size grows

50
Thesis Group II. Practical Application

Overnight scheduling of Monte Carlo simulation based risk calculations at Morgan Stanley, for trading and regulatory reporting
100 portfolios, 556 jobs, 792 seconds average size
7% improvement over HNN, 10% over LWPF (best method in the literature prior to my study).
53 minutes saved on the top 3 priority jobs compared to the next best heuristic

Weight | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | SUM | Increment to PSHNN (%)
PSHNN | 4401 | 11116 | 4020 | 1620 | 1092 | 8 | 0 | 0 | 22257 | 0
PLWPF | 3513 | 9624 | 5130 | 1788 | 490 | 312 | 2304 | 190 | 23351 | 5
HNN | 4404 | 11040 | 4735 | 1824 | 1092 | 456 | 468 | 0 | 24019 | 7
LWPF | 4404 | 11140 | 5470 | 2472 | 1183 | 40 | 0 | 0 | 24709 | 10
EDD | 4401 | 9940 | 1770 | 636 | 1134 | 464 | 22752 | 1430 | 42527 | 48
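
The arithmetic behind the last two columns can be re-checked with a few lines of Python. The sketch recomputes each total from the per-weight columns and assumes that the increment is measured relative to the compared method's own total, i.e. (total − PSHNN total) / total. That formula is inferred from the printed percentages, not stated on the slide.

```python
# Per-weight tardiness values for weights 3..10, copied from the table above.
tardiness_by_weight = {
    "PSHNN": [4401, 11116, 4020, 1620, 1092, 8, 0, 0],
    "PLWPF": [3513, 9624, 5130, 1788, 490, 312, 2304, 190],
    "HNN":   [4404, 11040, 4735, 1824, 1092, 456, 468, 0],
    "LWPF":  [4404, 11140, 5470, 2472, 1183, 40, 0, 0],
    "EDD":   [4401, 9940, 1770, 636, 1134, 464, 22752, 1430],
}

totals = {name: sum(values) for name, values in tardiness_by_weight.items()}

def gap_to_pshnn(name):
    """Percentage by which PSHNN improves on the named method (assumed formula)."""
    return 100.0 * (totals[name] - totals["PSHNN"]) / totals[name]

for name in tardiness_by_weight:
    print(name, totals[name], round(gap_to_pshnn(name)))
```

The same formula gives the 10% quoted for LWPF in the summary of results.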
51
Summary of numerical results on real world
problems
Field | Real world problem | Average performance of traditional approaches | Average performance of the proposed new method | Impact on computational finance (improvement)
Portfolio optimization | Convergence trading on US S&P 500 stock data | 11.6% (S&P 500 index return) | 34% | 22.4 percentage points
Schedule optimization | Morgan Stanley overnight scheduling problem | 24709 (LWPF performance) | 22257 (PSHNN performance) | 10%
52
Summary of my Contribution
Found a generic approach to approximating NP-hard problems in polynomial time using heuristic methods
Proved the practical effectiveness and applicability on real-world problems for 2 very difficult open problems
This can speed up financial calculations and their scheduling

Provides faster, more timely data to banks, clients and financial regulators → improves society as a whole

53
Thank You For Your Attention!
Questions and Answers
54
Questions and Answers

In the VAR(1) parameter estimation, could you use a stability theorem for the solution of the Lyapunov equation to ensure G is positive definite, rather than using the numerical method suggested?
The Lyapunov equation is
If K and A are positive definite and A has eigenvalues with modulus < 1, then G is guaranteed to be positive definite
When estimating A from real-world data using maximum likelihood, I found that the best estimate often had eigenvalues with modulus > 1, and G became non-positive-definite, hence my suggested method.
One could research new ways to estimate A to ensure the eigenvalue moduli are < 1.