Title: Genetic Programming for Financial Trading
1. Genetic Programming for Financial Trading
- Nicolas NAVET
- INRIA, France
- AIECON NCCU, Taiwan
- http://www.loria.fr/nnavet
Tutorial at CIEF 2006, Kaohsiung, Taiwan, 08/10/2006
2. Outline of the talk (1/2)
- PART 1: Genetic programming (GP)
  - GP among machine learning techniques
  - GP on the symbolic regression problem
  - Pitfalls of GP
- PART 2: GP for financial trading
  - Various schemes
  - How to implement it?
  - Experiments: GP at work
3. Outline of the talk (2/2)
- PART 3: Analyzing GP results
  - Why are GP results usually inconclusive?
  - Benchmarking with:
    - zero-intelligence trading strategies
    - lottery trading
  - Answering the questions:
    - is there anything to learn in the data at hand?
    - is GP effective at this task?
- PART 4: Perspectives
4. GP is a Machine Learning technique
- The ultimate goal of machine learning is automatic programming, that is, computers programming themselves ...
- A more achievable goal: build computer-based systems that can adapt and learn from their experience
- ML algorithms originate from many fields: mathematics (logic, statistics), bio-inspired techniques (neural networks), evolutionary computing (Genetic Algorithms, Genetic Programming), swarm intelligence (ants, bees)
5. Evolutionary Computing
- Algorithms that make use of mechanisms inspired by natural evolution, such as:
  - survival of the fittest among an evolving population of solutions
  - reproduction and mutation
- Prominent representatives:
  - Genetic Algorithms (GA)
  - Genetic Programming (GP): GP is a branch of GA where the genetic code of a solution is of variable length
- Over the last 50 years, evolutionary algorithms have proved to be very efficient at finding approximate solutions to algorithmically complex problems
6. Two main problems in Machine Learning
- Classification: the model output is a prediction of whether the input belongs to some particular class
  - Examples: recognizing human beings in image analysis, spam detection, credit scoring, market timing decisions
- Regression: prediction of the system's output for a specific input
  - Example: predict tomorrow's opening price for a stock given the closing price, the market trend, other stock exchanges, ...
7. Functioning scheme of ML
Learning on a training interval
8. GP basics
9. Genetic programming
- GP is the process of evolving a population of computer programs, which are the candidate solutions, according to evolutionary principles (e.g. survival of the fittest)
- First step: generate a population of random programs
10. In GP, programs are represented by trees (1/3)
- Trees are a very general representation form
11. In GP, programs are represented by trees (2/3)
12. In GP, programs are represented by trees (3/3)
- Trading rule formula: BUY IF (VOL > 10) AND (MovingAverage(25) > MovingAverage(45))
(Picture from [BhPiZu02])
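Such a rule tree can be sketched in a few lines of code. The following is a minimal illustration, not the tutorial's implementation: the nested-tuple encoding and the helper names (`evaluate`, `moving_average`, the `market` dictionary) are my own.

```python
def moving_average(prices, n):
    """Arithmetic mean of the last n prices."""
    return sum(prices[-n:]) / n

def evaluate(node, market):
    """Recursively evaluate a rule encoded as a nested tuple."""
    op, *args = node
    if op == "and":
        return evaluate(args[0], market) and evaluate(args[1], market)
    if op == ">":
        return evaluate(args[0], market) > evaluate(args[1], market)
    if op == "vol":
        return market["volume"]
    if op == "ma":
        return moving_average(market["prices"], args[0])
    if op == "const":
        return args[0]
    raise ValueError("unknown node: %r" % op)

# BUY IF (VOL > 10) AND (MovingAverage(25) > MovingAverage(45))
rule = ("and",
        (">", ("vol",), ("const", 10)),
        (">", ("ma", 25), ("ma", 45)))

market = {"volume": 42.0,
          "prices": [100.0 + 0.5 * t for t in range(60)]}  # an uptrend
print(evaluate(rule, market))  # True: the rule signals a buy
```

Because any function/terminal combination can appear at any node, the same interpreter evaluates whatever trees the evolution produces.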
13. Preliminary steps of GP
- The user has to define:
  - the set of terminals
  - the set of functions
  - how to evaluate the quality of an individual: the fitness measure
  - the parameters of the run, e.g. the number of individuals in the population
  - the termination criterion
14. Symbolic regression: a problem GP is good at
- Symbolic regression: find a function that fits a set of experimental data points well
- "Symbolic" means that one looks for both:
  - the functional form
  - the values of the parameters
- Differs from other regressions, where one solely looks for the best coefficient values for a pre-fixed model. Usually the choice of the model is the most difficult issue!
15. Symbolic regression
- Find the function f such that f(x_i) is as close as possible to the observed y_i over all data points (x_i, y_i)
- Possible fitness function: the sum of squared errors, sum_i (f(x_i) - y_i)^2
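A fitness function for symbolic regression can be sketched directly. This assumes a sum-of-squared-errors fitness (a common choice; the slide's exact formula was lost in extraction), with the candidate program represented as an ordinary callable.

```python
import math

def fitness(candidate, points):
    """Sum of squared errors between the candidate program's output
    and the target data points (lower is better)."""
    return sum((candidate(x) - y) ** 2 for x, y in points)

# Data points sampled from the target curve y = sin(x) on [0, 6.2]
points = [(x / 10.0, math.sin(x / 10.0)) for x in range(63)]

exact = math.sin           # a perfect candidate program
linear = lambda x: x       # degree-1 Taylor approximation, good only near 0

print(fitness(exact, points), fitness(linear, points))  # 0.0 vs a large error
```

In a GP run, `candidate` would be the interpreted tree of each individual, and this score would drive selection.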
16. GP operators are biologically inspired
- Recombination (aka crossover): two individuals share genetic material and create one or several offspring
- Mutation: introduces genetic diversity through random changes in the genetic code
- Reproduction: an individual survives as-is into the next generation
17. Selection operators for crossover/reproduction
- General principle in GP: the fittest individuals should have a greater chance to survive and transmit their genetic code
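Tournament selection is one common way to implement this principle (the tutorial does not say which selection scheme it uses, so take this as a generic sketch): fitter individuals win more tournaments, but every individual keeps some chance of being picked.

```python
import random

def tournament_select(population, fitness, k=3, rng=random):
    """Draw k individuals uniformly at random and return the fittest one."""
    return max(rng.sample(population, k), key=fitness)

random.seed(42)
population = list(range(100))  # toy individuals; here fitness = the value itself
wins = [tournament_select(population, lambda ind: ind, k=5)
        for _ in range(1000)]
print(sum(wins) / len(wins))   # far above the population average of 49.5
```

Increasing `k` raises the selection pressure: with a larger tournament, weak individuals almost never win.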
18. Standard recombination (aka crossover)
- Standard recombination: exchange two randomly chosen sub-trees between the parents
19. Mutation operator 1: standard mutation
- Standard mutation: replacement of a sub-tree with a randomly generated one
20. Mutation operator 2: swap sub-tree mutation
- Swap sub-tree mutation: swap two sub-trees of an individual
21. Mutation operator 3: shrink mutation
- Shrink mutation: replace a branch (a node with one or more arguments) with one of its child nodes
22. Other mutation operators
- Swap mutation (not to be confused with swap sub-tree mutation): exchange the function associated with a node for one having the same number of arguments
- Headless chicken crossover: mutation implemented as a crossover between a program and a newly generated random program
- ...
23. Reproduction / elitism operators
- Reproduction: an individual is reproduced in the next generation without any modification
- Elitism: the best n individuals are kept in the next generation
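The tree operators of the preceding slides can be sketched on a nested-tuple tree encoding. This is an illustrative toy, not the tutorial's code: the encoding, the helper names and the terminal set are my own assumptions.

```python
import random

def subtrees(tree, path=()):
    """Enumerate (path, subtree) pairs of a nested-tuple expression tree."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace(tree, path, new):
    """Return a copy of `tree` with the subtree at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(parent_a, parent_b, rng=random):
    """Standard recombination: graft a random subtree of parent_b
    onto a random point of parent_a."""
    point, _ = rng.choice(list(subtrees(parent_a)))
    _, donor = rng.choice(list(subtrees(parent_b)))
    return replace(parent_a, point, donor)

def mutate(tree, terminals=("x", 0, 1), rng=random):
    """Standard mutation reduced to its simplest form: replace a random
    subtree with a random terminal."""
    point, _ = rng.choice(list(subtrees(tree)))
    return replace(tree, point, rng.choice(terminals))

random.seed(0)
a = ("add", ("mul", "x", 2), 3)      # x*2 + 3
b = ("sub", "x", ("mul", "x", "x"))  # x - x*x
print(crossover(a, b))
print(mutate(a))
```

Note that crossover on variable-length trees is exactly what distinguishes GP from a fixed-length GA chromosome: the offspring's size differs from both parents'.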
24. GP is no silver bullet
25. GP issue 1: how to choose the function set?
- The problem cannot be solved if the set of functions is not sufficient
- But non-relevant functions uselessly increase the search space
- Problem: there is no automatic way to decide a priori which functions are relevant and to build a sufficient function set
26. The problem cannot be solved if the set of functions is not sufficient: illustration
27. Results with sin(x) in the function set?
Typical outcome
28. Results without sin(x) in the function set?
Typical outcome
29. Yes, sin(x) can be approximated by its Taylor series ...
sin(x) and its Taylor approximations of degree 1, 3, 5, 7, 9, 11, 13 (image: Wikipedia)
- Problem 1: there is little hope of GP discovering that ...
- Problem 2: what happens outside the training interval?
30. Composition of the function set is crucial: illustration
- The subset shown is extraneous in this context
- Same experimental setup as before
31. Function set containing redundant functions? (1/2)
Typical outcome
32. Function set containing redundant functions? (2/2)
- On average, with the extraneous functions, the best solution is 10% farther from the curve in the training interval (much more outside!)
- With the extraneous functions, the average solution is better ... because the tree is more likely to contain a trigonometric function
33. GP issue 2: code bloat
- Solutions increase in size over the generations
Same experimental setup as before
34. GP issue 2: code bloat
- Much of the genetic code has no influence on the fitness ... but may constitute a useful reserve of genetic material
Non-effective code (aka introns)!
35. Code bloat: why is it a problem?
- Solutions are hard to understand
  - Learning something from huge solutions is almost impossible ...
  - One has no confidence using programs one does not understand!
- Much of the computing power is spent manipulating non-contributing code, which may slow down the search
36. Countermeasures ... (1/2)
- Static limit on the tree depth
- Dynamic maximum tree depth [SiAl03]: the limit is increased each time an outstanding individual deeper than the current limit is found
- Limit the probability of longer-than-average individuals being chosen by reducing their fitness
- Apply operators that ensure limited code growth
- Discard newly created individuals whose behavior is too close to that of their parents (e.g. the behavior for a regression problem could be the position of the points [Str03])
- ...
37. Countermeasures ... (2/2)
- Possible symbolic simplification of the tree: an expression can be rewritten into an equivalent, shorter form
- This needs to be further investigated! Preliminary experiments [TeHe04] show that simplification does not necessarily help (introns may constitute a useful reserve of genetic material)
38. GP issue 3: GP can be disappointing outside the training set
... and such behavior can hardly be predicted
39. GP issue 3: explanation (1/2)
- Usually GP functions are implemented to have the closure property: each function must be able to handle every possible input value
- What to do with:
  - division by 0?
  - sqrt(x) with x < 0?
  - ...
- Solution: protected operators, e.g. the division:
  if (abs(denominator) < value-near-0) return 1
40. GP issue 3: explanation (2/2)
- In our case: a fragment of the best GP tree
- Why did it not occur on the training interval?
- Because no training point was chosen such that the problematic case arises
41. GP issue 4: standard GP is not good at finding numerical constants (1/3)
- Where do numerical values come from?
  - Ephemeral random constants: random values inserted at the leaves of the GP trees during the creation of the initial population
  - Use of arithmetic operators on existing numerical constants
  - Generation by combination of variables/functions
- Lately, many studies have shown that standard GP is not good at finding constants
42. GP issue 4: standard GP is not good at finding numerical constants (2/3)
- Experiment: find a constant function equal to the numeric constant 3.141592
43. GP issue 4: standard GP is not good at finding numerical constants (3/3)
- There are several more efficient schemes for constant generation in GP [Dem05]:
  - local optimization [ZuPiMa01]
  - numeric mutation [EvFe98]
  - ...
- One of them should be implemented, otherwise 1) computation time is lost searching for constants and 2) solutions may tend to be bigger
44. Some (personal) conclusions on GP (1/3)
- GP is undoubtedly a powerful technique
- Efficient for predicting / classifying ... but not more so than other techniques
- The symbolic representation of the created solutions may help give good insight into the system under study ... not only the best solutions are interesting, but also how the population has evolved over time
- GP is a tool for discovering knowledge
45. Some (personal) conclusions on GP (2/3)
- A powerful tool, but ...
  - good knowledge of the application field is required for choosing the right function set
  - prior experience with GP is mandatory to avoid common mistakes: there is no theory to tell us what to do!
  - it tends to create solutions too big to be analyzable -> countermeasures should be implemented
  - fine-tuning the GP parameters is very time-consuming
46. Some (personal) conclusions on GP (3/3)
- How to analyze the results of GP?
  - Efficiency can hardly be predicted; it varies:
    - from problem to problem
    - and from GP run to GP run
  - If the results are not very positive:
    - is it because there is no good solution?
    - or is GP not effective, so that further work is needed?
- There are solutions: part 3 of the talk
47. Part 2: GP for financial trading
48. Why is GP an appealing technique for financial trading?
- Easy to implement / robust evolutionary technique
- Trading rules (TR) should adapt to a changing environment: GP may simulate this evolution
- Solutions are produced in a symbolic form that can be understood and analyzed
- GP may serve as a knowledge discovery tool (e.g. on the evolution of the market)
49. GP for financial trading
- GP for composing portfolios (not discussed here, see [Wag03])
- GP for evolving the structure of neural networks used for prediction (not discussed here, see [GoFe99])
- GP for predicting price evolution (briefly discussed here, see [Kab02])
- Most common: GP for inducing technical trading rules
50. Predicting price evolution: general comments ...
- Long-term forecasts of stock prices remain a fantasy [Kab02]
- Swing trading or intraday trading: see CIEF Tutorial 1 by Prof. Fyfe, today at 1:30 pm!
- Two excellent starting points:
  - [Kab02]: a single-day trading strategy based on the forecasted spread
  - [SaTe01]: winner of the CEC2000 Dow Jones prediction competition; predictions at t+1, t+2, t+3, ..., t+h, where a solution has one tree per forecast horizon
51. Predicting price evolution: fitness function
- The definition of the fitness function has been shown to be crucial (e.g. [SaTe01]); there are many possibilities:
  - (normalized) mean square error
  - mean absolute percentage error (MAPE)
  - the statistic 1 - MAPE / MAPE-of-a-random-walk
  - directional symmetry index (DS)
  - DS weighted by the direction and amplitude of the error
  - ...
- Issue: a meaningful fitness function is not always GP-friendly
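Three of the candidate fitness measures above can be sketched directly; the formulas are the standard textbook definitions, and the sample series are made up for illustration.

```python
def mse(actual, predicted):
    """Mean square error of the forecasts."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error (in percent)."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)

def directional_symmetry(actual, predicted):
    """Percentage of steps where the predicted direction of change
    matches the actual direction."""
    hits = sum(
        (a1 - a0) * (p1 - p0) > 0
        for a0, a1, p0, p1 in zip(actual, actual[1:], predicted, predicted[1:])
    )
    return 100.0 * hits / (len(actual) - 1)

actual    = [100.0, 102.0, 101.0, 105.0, 104.0]
predicted = [100.0, 103.0, 100.0, 104.0, 106.0]
print(mse(actual, predicted),
      mape(actual, predicted),
      directional_symmetry(actual, predicted))
```

The example illustrates why the choice matters: the forecasts above get 3 of 4 directions right even though their point errors are non-trivial, so an MSE-driven and a DS-driven evolution would reward different individuals.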
52. Inducing technical trading rules
53. Steps of the algorithm (1/3)
1. Extract the training time series from the database
2. Preprocessing: cleaning, sampling, averaging, normalizing, ...
54. Steps of the algorithm (2/3)
3. GP on the training set:
  3.1 Creation of the individuals
  3.2 Evaluation (trading rules interpreter + trading sequence simulator)
  3.3 Selection of the individuals
4. Analysis of the evolution: statistics, HTML files
55. Steps of the algorithm (3/3)
5. Evaluate the selected individuals on the validation set
6. Evaluate the best individual out-of-sample
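The "trading sequence simulator" of step 3.2 can be sketched as follows. This is a deliberately minimal long/flat backtest of my own devising (the tutorial's simulator is not shown): it compounds the price relatives while a position is held and charges a proportional cost on every position change.

```python
def simulate(prices, positions, cost=0.001):
    """Accumulated return of a long(1)/flat(0) position sequence; a
    proportional cost is charged each time the position changes."""
    wealth = 1.0
    for t in range(1, len(prices)):
        if positions[t - 1] == 1:              # long over [t-1, t]
            wealth *= prices[t] / prices[t - 1]
        if positions[t] != positions[t - 1]:   # entering or exiting a position
            wealth *= 1.0 - cost
    return wealth - 1.0

prices = [100.0, 101.0, 99.0, 103.0, 104.0]
positions = [1, 1, 0, 1, 1]  # long two days, step aside, re-enter
print(simulate(prices, positions))
```

In the full algorithm, `positions` would be produced by applying the evolved rule (via the trading rules interpreter) to each day of the series, and the accumulated return would feed the fitness.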
56. GP at work: demo on the Taiwan Capitalization Weighted Stock Index
57. Part 3: Analyzing GP results
58. One may cast doubts on GP efficiency ...
- Highly heuristic, no theory! There are problems on which GP has been shown not to be significantly better than random search
- Few clear-cut successes reported in the financial literature
- GP embeds little domain-specific knowledge yet ...
- Doubts about how efficiently GP uses the available computing time:
  - code bloat
  - bad at finding numerical constants
  - the best solutions are sometimes found very early in the run ...
- Variability of the results! e.g. returns: -0.160993, 0.0526153, 0.0526153, 0.0526153, 0.0526153, -0.0794787, 0.0526153, -0.0794787, 0.132354, 0.364311, -0.0990995, -0.0794787, -0.0855786, -0.094433, 0.0464288, -0.140719, 0.0526153, 0.0526153, -0.0746189, 0.418075, ...
59. Possible pretests: measures of predictability of the financial time series
- The actual question: how predictable, for a given horizon, with a given cost function?
- Serial correlation
- Kolmogorov complexity
- Lyapunov exponent
- Unit root analysis
- Comparison with results on surrogate data / shuffled series (e.g. Kaboudan statistics)
- ...
60. In practice, some predictability does not imply profitability ...
- The prediction horizon must be large enough!
- Volatility may not be sufficient to cover round-trip transaction costs!
- One may not have the right trading instrument at hand ... typically, short selling is not available
61. Pretest methodology
- Compare GP with several variants of:
  - random search algorithms: Zero-Intelligence Strategies (ZIS)
  - random trading behaviors: Lottery Trading (LT)
- Issue: how best to constrain randomness?
- Statistical hypothesis testing:
  - Null: GP does not outperform ZIS
  - Null: GP does not outperform LT
62. Pretest 1: GP versus zero-intelligence strategies (equivalent-search-intensity random search (ERS) with a validation stage)
- Null hypothesis H1,0: GP does not outperform equivalent random search
- Alternative hypothesis: H1,1
63. Pretest 1: GP vs zero-intelligence strategies (ERS)
- If H1,0 cannot be rejected, the interpretation is: either there is nothing to learn, or GP is not very effective
64. Pretest 4: GP vs lottery trading
- Lottery trading (LT): random trading behavior according to the outcome of a random variable (e.g. a Bernoulli law)
- Issue 1: if LT tends to hold positions (short, long) for less time than GP, transaction costs may advantage GP ...
- Issue 2: it might be an advantage or a disadvantage for LT to trade much less or much more than GP
  - e.g. a downward-oriented market with no short selling
65. Frequency and intensity of a trading strategy
- Frequency: average number of transactions per unit of time
- Intensity: proportion of time a position is held
- For pretest 4:
  - We impose that the average frequency and intensity of LT equal those of GP
  - Implementation: generate random trading sequences having the right characteristics, e.g.
0,0,1,1,1,0,0,0,0,0,1,1,1,1,1,1,0,0,1,1,0,1,0,0,0,0,0,0,1,1,1,1,1,1,...
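One simple way to generate such constrained random sequences is rejection sampling: draw Bernoulli position sequences at the target intensity and keep only those with the required number of position changes. This is my own sketch of the idea; a production implementation could construct a valid sequence directly instead of resampling.

```python
import random

def lottery_sequence(length, intensity, n_changes, rng=random,
                     max_tries=100_000):
    """Draw Bernoulli(intensity) 0/1 position sequences until one
    exhibits exactly n_changes position changes (rejection sampling)."""
    for _ in range(max_tries):
        seq = [1 if rng.random() < intensity else 0 for _ in range(length)]
        changes = sum(a != b for a, b in zip(seq, seq[1:]))
        if changes == n_changes:
            return seq
    raise RuntimeError("no matching sequence found")

random.seed(7)
seq = lottery_sequence(length=40, intensity=0.5, n_changes=18)
print(seq)
```

Matching both constraints is what makes the LT benchmark fair: the random trader holds positions about as often, and switches about as often, as the GP rules it is compared against.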
66. Pretest 4: implementation
67. Answering question 1: is there anything to learn from the training data at hand?
68. Question 1: pretests involved
- Starting point: if a set of search algorithms does not outperform LT, this gives evidence that there is nothing to learn ...
- Pretest 4: GP vs lottery trading
  - Null hypothesis H4,0: GP does not outperform LT
- Pretest 5: equivalent random search (ZIS) vs lottery trading
  - Null hypothesis H5,0: ERS does not outperform LT
69. Question 1: some answers ...
- ¬R means that the null hypothesis Hi,0 cannot be rejected; R means we should favor Hi,1

  Case    H4,0  H5,0  Interpretation
  Case 1  ¬R    ¬R    there is nothing to learn
  Case 2  R     R     there is something to learn
  Case 3  R     ¬R    there may be something to learn; ERS might not be powerful enough
  Case 4  ¬R    R     there may be something to learn; the GP evolution process is detrimental
70. Answering question 2: is GP effective?
71. Question 2: some answers ...
- Question 2 cannot be answered if there is nothing to learn (case 1)
- Case 4 provides us with a negative answer ...
- In cases 2 and 3, run pretest 1: GP vs equivalent random search
  - Null hypothesis H1,0: GP does not outperform ERS
  - If one cannot reject H1,0, GP shows no evidence of effectiveness
72. Pretests at work: methodology. Draw conclusions from the pretests using our own programs, and compare with the results in the literature [ChKuHo06] on the same time series
73. Setup: GP control parameters (same as in [ChKuHo06])
74. Setup: statistics, data, trading scheme
- Hypothesis testing with Student's t-test at a 95% confidence level
- Pretests with samples made of 50 GP runs, 50 ERS runs and 100 LT runs
- Data: the indexes of 3 stock exchanges: Canada, Taiwan and Japan
- Daily trading with short selling
- Training on 3 years, validation on 2 years
- Out-of-sample periods: 1999-2000, 2001-2002, 2003-2004
- Data normalized with a 250-day moving average
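The two-sample test behind each pretest can be sketched as follows. The slide specifies a Student's t-test; I use the Welch form (which allows the unequal sample sizes and variances of the 50 GP vs 100 LT runs), and the Gaussian return samples are synthetic, for illustration only.

```python
import math
import random

def welch_t(sample_a, sample_b):
    """Welch's t statistic for the difference of two sample means
    (unequal variances and sample sizes allowed)."""
    na, nb = len(sample_a), len(sample_b)
    mean_a, mean_b = sum(sample_a) / na, sum(sample_b) / nb
    var_a = sum((x - mean_a) ** 2 for x in sample_a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in sample_b) / (nb - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / na + var_b / nb)

random.seed(0)
gp_returns = [random.gauss(0.05, 0.10) for _ in range(50)]   # 50 GP runs
lt_returns = [random.gauss(0.00, 0.10) for _ in range(100)]  # 100 LT runs
t = welch_t(gp_returns, lt_returns)
print(t)  # reject "GP does not outperform LT" when t exceeds the
          # one-sided 5% critical value (about 1.66 at ~100 degrees of freedom)
```

Each null hypothesis Hi,0 of the pretests is a one-sided test of this form on the corresponding pair of return samples.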
75. Results on actual data (1/2)
- Evidence that there is something to learn: 4 markets out of 9 (C3, J2, T1, T3)
- Experiments in [ChKuHo06], with another GP implementation, show that GP performs very well on these 4 markets
- Evidence that there is nothing to learn: 3 markets (C1, J3, T2)
- In [ChKuHo06], there is only one of these (C1) where GP has a positive return (but less than buy-and-hold)
76. Results on actual data (2/2)
- GP effective: 3 markets out of 6
- In these 3 markets, GP outperforms buy-and-hold: the same outcome as in [ChKuHo06]
- Preliminary conclusion: one can rely on the pretests ...
  - When there is nothing to learn, no GP implementation did well (except in one case)
  - When there is something to learn, at least one implementation did well (always)
  - When our GP is effective, the GP in [ChKuHo06] is effective too (always)
77. Further conclusions
- Our GP implementation:
  - is more efficient than random search: there is no case where ERS outperforms LT and GP does not
  - but only slightly more efficient: one would expect many more cases where GP does better than LT while ERS does not
- Our GP is actually able to take advantage of regularities in the data, but only of simple ones
78. Part 4: Perspectives in the field of GP for financial trading
79. Rethinking fitness functions
(From [LaPo02])
- Fitness functions: accumulated return, risk-adjusted return, ...
- Issue: on some problems [LaPo02], GP is only marginally better than random search because the fitness function induces a "difficult" landscape
- Come up with GP-friendly fitness functions
80. Preprocessing of the data: still an open issue
- Studies in forecasting show the importance of preprocessing for GP; often, normalization with MA(250) is used, with benefits [ChKuHo06]
- Should the length of the MA change according to market volatility, regime changes, etc.?
- Why not consider MACD, exponential MA, differencing, rate of change, log values, FFT, wavelets, ...?
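The MA(250) normalization mentioned above can be sketched in a few lines; the exact windowing convention (here: the average of the 250 preceding days, first 250 observations used only as warm-up) is my assumption.

```python
def normalize_with_ma(prices, window=250):
    """Divide each price by the moving average of the `window` preceding
    days; the first `window` observations are consumed as warm-up."""
    normalized = []
    for t in range(window, len(prices)):
        ma = sum(prices[t - window:t]) / window
        normalized.append(prices[t] / ma)
    return normalized

prices = [100.0 + 0.1 * t for t in range(300)]  # a steadily rising series
norm = normalize_with_ma(prices)
print(len(norm))  # 50: 300 observations minus the 250-day warm-up
```

The point of the normalization is that the evolved rules then compare the price to its own recent level (values around 1.0) rather than to absolute price levels, which makes rules transferable across periods and markets.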
81. Data division scheme
- There is evidence that GP performs poorly when the characteristics of the training interval are very different from those of the out-of-sample interval
- Characterization of the current market condition: mean reverting, trend following, ...
- Relearning on a smaller interval if needed?
82. More extensive tests are needed ... automating the tests
- A comprehensive test on daily indexes was done in [ChKuHo06]; none exists for individual stocks and intraday data
- Automated testing on several hundred stocks is fully feasible but requires a software infrastructure and much computing power
83. Ensemble methods: combining trading rules
- In ML, ensemble methods have proven to be very effective
- A majority rule was tested in [ChKuHo06] with some success
- Efficiency requirements: accuracy (better than random) and diversity (uncorrelated errors): what do these mean for trading rules?
- More fine-grained selection / weighting schemes may lead to better results
84. Embed more domain-specific knowledge
- Black-box algorithms are usually outperformed by domain-specific algorithms
- The domain-specific language is limited as yet:
  - enrich the primitive set with volume, indexes, bid/ask spread, ...
  - enrich the function set with cross-correlation, predictability measures, ...
85. References (1/2)
- [ChKuHo06] S.-H. Chen, T.-W. Kuo and K.-M. Hoi, "Genetic Programming and Financial Trading: How Much about 'What We Know'", 4th NTU International Conference on Economics, Finance and Accounting, April 2006.
- [ChNa06] S.-H. Chen and N. Navet, "Pretests for genetic-programming evolved trading programs: zero-intelligence strategies and lottery trading", Proc. ICONIP 2006.
- [SiAl03] S. Silva and J. Almeida, "Dynamic Maximum Tree Depth - A Simple Technique for Avoiding Bloat in Tree-Based GP", GECCO 2003, LNCS 2724, pp. 1776-1787, 2003.
- [Str03] M.J. Streeter, "The Root Causes of Code Growth in Genetic Programming", EuroGP 2003, pp. 443-454, 2003.
- [TeHe04] M.D. Terrio and M.I. Heywood, "On Naïve Crossover Biases with Reproduction for Simple Solutions to Classification Problems", GECCO 2004, 2004.
- [ZuPiMa01] G. Zumbach, O.V. Pictet and O. Masutti, "Genetic Programming with Syntactic Restrictions applied to Financial Volatility Forecasting", Olsen & Associates Research Report, 2001.
- [EvFe98] M. Evett and T. Fernandez, "Numeric Mutation Improves the Discovery of Numeric Constants in Genetic Programming", Genetic Programming 1998: Proceedings of the Third Annual Conference, 1998.
86. References (2/2)
- [Kab02] M. Kaboudan, "GP Forecasts of Stock Prices for Profitable Trading", Evolutionary Computation in Economics and Finance, 2002.
- [SaTe01] M. Santini and A. Tettamanzi, "Genetic Programming for Financial Series Prediction", Proceedings of EuroGP'2001, 2001.
- [BhPiZu02] S. Bhattacharyya, O.V. Pictet and G. Zumbach, "Knowledge-Intensive Genetic Discovery in Foreign Exchange Markets", IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, April 2002.
- [LaPo02] W.B. Langdon and R. Poli, Foundations of Genetic Programming, Springer Verlag, 2002.
- [Kab00] M. Kaboudan, "Genetic Programming Prediction of Stock Prices", Computational Economics, vol. 16, 2000.
- [Wag03] L. Wagman, "Stock Portfolio Evaluation: An Application of Genetic-Programming-Based Technical Analysis", Genetic Algorithms and Genetic Programming at Stanford 2003, 2003.
- [GoFe99] W. Golubski and T. Feuring, "Evolving Neural Network Structures by Means of Genetic Programming", Proceedings of EuroGP'99, 1999.
- [Dem05] I. Dempsey, "Constant Generation for the Financial Domain using Grammatical Evolution", Proceedings of the 2005 Workshops on Genetic and Evolutionary Computation, pp. 350-353, Washington, June 25-26, 2005.
87. ?