Genetic Programming for Financial Trading (presentation transcript)
1
Genetic Programming for Financial Trading
  • Nicolas NAVET
  • INRIA, France
  • AIECON NCCU, Taiwan
  • http://www.loria.fr/nnavet

Tutorial at CIEF 2006, Kaohsiung, Taiwan,
08/10/2006
2
Outline of the talk (1/2)
  • PART 1: Genetic programming (GP)
  • GP among machine learning techniques
  • GP on the symbolic regression problem
  • Pitfalls of GP
  • PART 2: GP for financial trading
  • Various schemes
  • How to implement it?
  • Experiments: GP at work

3
Outline of the talk (2/2)
  • PART 3: Analyzing GP results
  • Why are GP results usually inconclusive?
  • Benchmarking with:
  • Zero-intelligence trading strategies
  • Lottery trading
  • Answering the questions:
  • Is there anything to learn in the data at hand?
  • Is GP effective at this task?
  • PART 4: Perspectives

4
GP is a Machine Learning technique
  • The ultimate goal of machine learning is automatic programming, that is, computers programming themselves
  • A more achievable goal: build computer-based systems that can adapt and learn from their experience
  • ML algorithms originate from many fields: mathematics (logic, statistics), bio-inspired techniques (neural networks), evolutionary computing (Genetic Algorithms, Genetic Programming), swarm intelligence (ants, bees)

5
Evolutionary Computing
  • Algorithms that make use of mechanisms inspired by natural evolution, such as:
  • survival of the fittest among an evolving population of solutions
  • reproduction and mutation
  • Prominent representatives:
  • Genetic Algorithms (GA)
  • Genetic Programming (GP): GP is a branch of GA where the genetic code of a solution is of variable length
  • Over the last 50 years, evolutionary algorithms have proved to be very efficient at finding approximate solutions to algorithmically complex problems

6
Two main problems in Machine Learning
  • Classification: the model output is a prediction of whether the input belongs to some particular class
  • Examples: human recognition in image analysis, spam detection, credit scoring, market timing decisions
  • Regression: prediction of the system's output for a specific input
  • Example: predict tomorrow's opening price for a stock given the closing price, the market trend, other stock exchanges, ...

7
Functioning scheme of ML
Learning on a training interval
8
GP basics
9
Genetic programming
  • GP is the process of evolving a population of computer programs (the candidate solutions) according to evolutionary principles (e.g. survival of the fittest)

Generate a population of random programs
10
In GP, programs are represented by trees (1/3)
  • Trees are a very general representation form
  • Formula

11
In GP, programs are represented by trees (2/3)
  • Logical formula

12
In GP, programs are represented by trees (3/3)
  • Trading rule formula: BUY IF (VOL > 10) AND (Moving Average(25) > Moving Average(45))

Picture from [BhPiZu02]
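To make the tree representation concrete, here is a minimal Python sketch (not from the slides; the encoding and function names are assumptions) of how such a rule tree could be stored and evaluated:

    # One possible encoding of the rule above: each node is a tuple whose first
    # element names the operator and whose remaining elements are the children.
    rule = ("and",
            (">", ("vol",), ("const", 10)),
            (">", ("ma", 25), ("ma", 45)))

    def moving_average(prices, n):
        """Simple moving average of the last n closing prices."""
        return sum(prices[-n:]) / n

    def evaluate(node, prices, volume):
        """Recursively evaluate a rule tree; returns True when the rule says BUY."""
        op = node[0]
        if op == "and":
            return evaluate(node[1], prices, volume) and evaluate(node[2], prices, volume)
        if op == ">":
            return evaluate(node[1], prices, volume) > evaluate(node[2], prices, volume)
        if op == "const":
            return node[1]
        if op == "vol":
            return volume
        if op == "ma":
            return moving_average(prices, node[1])
        raise ValueError("unknown operator: %s" % op)

GP itself only manipulates the tree structure; an interpreter such as evaluate() turns it into a buy / do-nothing decision for each day.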
13
Preliminary steps of GP
  • The user has to define:
  • the set of terminals
  • the set of functions
  • how to evaluate the quality of an individual: the fitness measure
  • the parameters of the run, e.g. the number of individuals in the population
  • the termination criterion

14
Symbolic regression: a problem GP is good at
  • Symbolic regression: find a function that fits a set of experimental data points well
  • Symbolic means that one looks for both
  • the functional form
  • the value of the parameters
  • This differs from other regressions, where one solely looks for the best coefficient values for a pre-fixed model. Usually the choice of the model is the most difficult issue!

15
Symbolic regression
  • Given a set of points (x_i, y_i)
  • Find the function f such that f(x_i) is as close as possible to y_i
  • Possible fitness function (see the sketch below)
  • GP functions
  • GP terminals
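The formulas on this slide were shown as images; as a hedged illustration only, a typical choice of fitness is the sum of squared errors over the data points, e.g. in Python:

    def fitness(candidate, points):
        """Lower is better: sum of squared errors of the candidate function
        over the (x, y) training points."""
        return sum((candidate(x) - y) ** 2 for x, y in points)

    # Example: distance of f(x) = x*x from points sampled on y = x*x + 1
    points = [(x, x * x + 1.0) for x in range(-5, 6)]
    print(fitness(lambda x: x * x, points))   # 11.0 (11 points, each off by 1)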

16
GP Operators biologically inspired
  • Recombination (aka crossover): two individuals share genetic material and create one or several offspring
  • Mutation: introduces genetic diversity through random changes in the genetic code
  • Reproduction: an individual survives as-is in the next generation

17
Selection Operators for Crossover/reproduction
  • General principle in GP: the fittest individuals should have a greater chance to survive and transmit their genetic code
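The slide states the principle only; tournament selection, sketched below in Python, is one common way to implement it (the tournament size k is an assumption):

    import random

    def tournament_select(population, fitnesses, k=3):
        """Pick k individuals at random and return the fittest one
        (lower fitness = better); fitter individuals are more likely
        to be chosen as parents, but selection stays stochastic."""
        contenders = random.sample(range(len(population)), k)
        best = min(contenders, key=lambda i: fitnesses[i])
        return population[best]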

18
Standard Recombination (aka crossover)
  • Standard recombination: exchange two randomly chosen sub-trees between the parents
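A minimal sketch of standard subtree crossover, assuming trees are stored as nested Python lists such as ["+", ["*", "x", "x"], 1.0] (this representation is an assumption, not the one used on the slides):

    import copy
    import random

    def subtree_paths(tree, path=()):
        """Yield the path (tuple of child indices) of every subtree."""
        yield path
        if isinstance(tree, list):
            for i, child in enumerate(tree[1:], start=1):
                yield from subtree_paths(child, path + (i,))

    def get_subtree(tree, path):
        for i in path:
            tree = tree[i]
        return tree

    def set_subtree(tree, path, new):
        if not path:
            return new
        get_subtree(tree, path[:-1])[path[-1]] = new
        return tree

    def crossover(parent_a, parent_b):
        """Exchange two randomly chosen subtrees; returns one offspring."""
        a = copy.deepcopy(parent_a)
        cut_a = random.choice(list(subtree_paths(a)))
        cut_b = random.choice(list(subtree_paths(parent_b)))
        donated = copy.deepcopy(get_subtree(parent_b, cut_b))
        return set_subtree(a, cut_a, donated)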


19
Mutation operator 1: standard mutation
  • Standard mutation: replacement of a sub-tree with a randomly generated one

20
Mutation operator 2: swap sub-tree mutation
  • Swap sub-tree mutation: swap two sub-trees of an individual

21
Mutation operator 3: shrink mutation
  • Shrink mutation: replace a branch (a node with one or more arguments) with one of its child nodes

22
Other mutation operators
  • Swap mutation (not to be confused with swap sub-tree mutation): exchange the function associated with a node for one having the same number of arguments
  • Headless chicken crossover: mutation implemented as a crossover between a program and a newly generated random program
  • ...

23
Reproduction / Elitism Operators
  • Reproduction: an individual is reproduced in the next generation without any modification
  • Elitism: the best n individuals are kept in the next generation

24
GP is no silver bullet
25
GP Issue 1: how to choose the function set?
  • The problem cannot be solved if the set of functions is not sufficient
  • But non-relevant functions uselessly increase the search space
  • Problem: there is no automatic way to decide a priori which functions are relevant and to build a sufficient function set

26
The problem cannot be solved if the set of functions is not sufficient: illustration
  • Generating function

27
Results with sin(x) in the function set ?
Typical outcome
28
Results without sin(x) in the function set ?
Typical outcome
29
Yes, sin(x) can be approximated by its Taylor series ...
sin(x) and Taylor approximations of degree 1, 3, 5, 7, 9, 11, 13 (image: Wikipedia)
  • Problem 1: there is little hope of discovering that
  • Problem 2: what happens outside the training interval?

30
Composition of the function set is crucial: illustration
  • GP functions
  • Subset is
    extraneous in this context
  • Same experimental setup as before

31
Function set containing redundant functions ?
(1/2)
Typical outcome
32
Function set containing redundant functions ?
(2/2)
  • On average, with the extraneous functions, the best solution is 10% farther from the curve on the training interval (much more outside it!)
  • With the extraneous functions, the average solution is better ... because the tree is more likely to contain a trigonometric function

33
GP Issue 2: code bloat
  • Solutions increase in size over generations

Same experimental setup as before
34
GP Issue 2: code bloat
  • Much of the genetic code has no influence on the fitness ... but may constitute a useful reserve of genetic material

Non-effective code, aka introns
35
Code bloat: why is it a problem?
  • Solutions are hard to understand
  • learning something from huge solutions is almost impossible
  • one has no confidence using programs one does not understand!
  • Much of the computing power is spent manipulating non-contributing code, which may slow down the search

36
Countermeasures ... (1/2)
  • Static limit on the tree depth (see the sketch after this list)
  • Dynamic maximum tree depth [SiAl03]: the limit is increased each time an outstanding individual deeper than the current limit is found
  • Limit the probability of longer-than-average individuals being chosen, by reducing their fitness
  • Apply operators that ensure limited code growth
  • Discard newly created individuals whose behavior is too close to that of their parents (e.g., for a regression problem, the behavior could be the position of the points) [Str03]
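A hedged Python sketch of the first two countermeasures (the depth limit of 17 and the acceptance logic are illustrative assumptions, loosely following the dynamic maximum tree depth idea of [SiAl03]):

    MAX_DEPTH = 17   # static limit (a common default, assumed here)

    def depth(tree):
        """Depth of a nested-list tree; a leaf counts as depth 1."""
        if not isinstance(tree, list):
            return 1
        return 1 + max(depth(child) for child in tree[1:])

    def accept(offspring, parent, dyn_limit, offspring_fit, best_fit):
        """Return the surviving individual and the (possibly raised) dynamic limit.
        The dynamic limit is raised only when a new best-of-run individual
        exceeds it; otherwise too-deep offspring are replaced by their parent."""
        d = depth(offspring)
        if d > MAX_DEPTH:                      # static limit always applies
            return parent, dyn_limit
        if d > dyn_limit:
            if offspring_fit < best_fit:       # lower fitness = better
                return offspring, d            # outstanding individual: raise limit
            return parent, dyn_limit
        return offspring, dyn_limit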

37
Countermeasures .. (2/2)
  • Possible symbolic simplification of the tree

can be simplified into
  • Needs to be further investigated! Preliminary experiments [TeHe04] show that simplification does not necessarily help (introns may constitute a useful reserve of genetic material)

38
GP Issue 3: GP can be disappointing outside the training set
and such behavior can hardly be predicted
39
GP Issue 3: explanation (1/2)
  • Usually, GP functions are implemented to have the closure property: each function must be able to handle every possible value
  • What to do with:
  • division by 0?
  • sqrt(x) with x < 0?
  • Solution: protected operators, e.g. the division:
  • if (abs(denominator) < value-near-0) return 1
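In Python, the protected division quoted above could look like this (the threshold value is an assumption):

    EPSILON = 1e-6   # the "value near 0"

    def protected_div(numerator, denominator):
        """Return 1 when the denominator is (almost) zero, the normal quotient otherwise."""
        if abs(denominator) < EPSILON:
            return 1.0
        return numerator / denominator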

40
GP Issue 3: explanation (2/2)
  • In our case, a fragment of the best GP tree
  • Why did it not occur on the training interval?
  • no training point was chosen such that ...
41
GP Issue 4: standard GP is not good at finding numerical constants (1/3)
  • Where do numerical values come from?
  • Ephemeral random constants: random values inserted at the leaves of the GP trees during the creation of the initial population (sketched below)
  • Use of arithmetic operators on existing numerical constants
  • Generation by combination of variables/functions
  • Lately, many studies have shown that standard GP is not good at finding constants
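A minimal sketch of the first mechanism, ephemeral random constants (the terminal set and constant range are assumptions):

    import random

    TERMINALS = ["x"]            # problem variables (assumed)

    def random_leaf(erc_range=(-5.0, 5.0)):
        """Return either a variable or a freshly drawn ephemeral random constant.
        The constant is drawn once, at creation time, and is then only copied
        around (or combined arithmetically) by the genetic operators."""
        if random.random() < 0.5:
            return random.choice(TERMINALS)
        return random.uniform(*erc_range)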

42
GP Issue 4: standard GP is not good at finding numerical constants (2/3)
  • Experiment: find a constant function equal to the numeric constant 3.141592

43
GP Issue 4: standard GP is not good at finding numerical constants (3/3)
  • There are several more efficient schemes for constant generation in GP [Dem05]:
  • local optimization [ZuPiMa01]
  • numeric mutation [EvFe98]
  • One of them should be implemented, otherwise 1) computation time is lost searching for constants and 2) solutions may tend to be bigger

44
Some (personal) conclusions on GP (1/3)
  • GP is undoubtedly a powerful technique
  • Efficient for predicting / classifying ... but not more so than other techniques
  • The symbolic representation of the created solutions may help give good insight into the system under study; not only the best solutions are interesting, but also how the population has evolved over time
  • GP is a tool for discovering knowledge

45
Some (personal) conclusions on GP (2/3)
  • A powerful tool, but ...
  • a good knowledge of the application field is required for choosing the right function set
  • prior experience with GP is mandatory to avoid common mistakes: there is no theory to tell us what to do!
  • it tends to create solutions too big to be analyzable -> countermeasures should be implemented
  • fine-tuning the GP parameters is very time-consuming

46
Some (personal) conclusions on GP (3/3)
  • How to analyze the results of GP?
  • efficiency can hardly be predicted, it varies
  • from problem to problem
  • and from GP run to GP run
  • if the results are not very positive:
  • is it because there is no good solution?
  • or because GP is not effective and further work is needed?
  • There are solutions: part 3 of the talk

47
Part 2: GP for financial trading
48
Why is GP an appealing technique for financial trading?
  • Easy to implement / robust evolutionary technique
  • Trading rules (TR) should adapt to a changing environment: GP may simulate this evolution
  • Solutions are produced in a symbolic form that can be understood and analyzed
  • GP may serve as a knowledge discovery tool (e.g. on the evolution of the market)

49
GP for financial trading
  • GP for composing portfolios (not discussed here, see [Wag03])
  • GP for evolving the structure of neural networks used for prediction (not discussed here, see [GoFe99])
  • GP for predicting price evolution (briefly discussed here, see [Kab02])
  • Most common: GP for inducing technical trading rules

50
Predicting price evolution: general comments
  • Long-term forecasting of stock prices remains a fantasy [Kab02]
  • Swing trading or intraday trading

CIEF Tutorial 1 by Prof. Fyfe, today at 1:30 pm!
  • Two excellent starting points:
  • [Kab02]: a single-day trading strategy based on the forecasted spread
  • [SaTe01]: winner of the CEC2000 Dow-Jones Prediction competition; prediction at t+1, t+2, t+3, ..., t+h; a solution has one tree per forecast horizon

51
Predicting price evolution: fitness function
  • The definition of the fitness function has been shown to be crucial (e.g. [SaTe01]); there are many possibilities (two are sketched below):
  • (Normalized) mean square error
  • Mean Absolute Percentage Error (MAPE)
  • The statistic 1 - MAPE / MAPE of a random walk
  • Directional symmetry index (DS)
  • DS weighted by the direction and amplitude of the error
  • Issue: a meaningful fitness function is not always GP-friendly
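Two of the listed measures, sketched in Python with standard textbook formulas (the slide's exact definitions were on an image; the directional-symmetry variant below, which compares each forecast to the last observed value, is one common choice):

    def mape(actual, forecast):
        """Mean Absolute Percentage Error, in percent (assumes actual values
        are nonzero, which holds for prices)."""
        return 100.0 / len(actual) * sum(
            abs((a - f) / a) for a, f in zip(actual, forecast))

    def directional_symmetry(actual, forecast):
        """Fraction of periods where the forecast points in the same direction
        as the actual move from the previous observed value."""
        hits = sum(1 for t in range(1, len(actual))
                   if (actual[t] - actual[t - 1]) * (forecast[t] - actual[t - 1]) > 0)
        return hits / (len(actual) - 1)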

52
Inducing technical trading rules
53
Steps of the algorithm (1/3)
1. Extracting training time series from the database
2. Preprocessing: cleaning, sampling, averaging, normalizing, ...
54
Steps of the algorithm (2/3)
3. GP on the training set
3.1 Creation of the individuals
3.2 Evaluation (Trading Rules Interpreter, Trading Sequence Simulator; see the sketch below)
3.3 Selection of the individuals
4. Analysis of the evolution: statistics, HTML files
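As a hedged illustration of what the trading sequence simulator computes, here is a minimal Python sketch: it turns a daily long/flat position sequence into an accumulated return with a proportional one-way transaction cost (the cost level and the long/flat restriction are assumptions):

    def simulate(prices, positions, cost=0.001):
        """prices[t] is the closing price of day t; positions[t] is 1 (long)
        or 0 (out of the market) at the close of day t. Returns the
        accumulated return of following that position sequence."""
        wealth = 1.0
        for t in range(1, len(prices)):
            if positions[t - 1] == 1:            # exposed to the market over day t
                wealth *= prices[t] / prices[t - 1]
            if positions[t] != positions[t - 1]: # pay the cost on every entry/exit
                wealth *= 1.0 - cost
        return wealth - 1.0

    # Example: long for the first two days, then out of the market
    print(simulate([100.0, 101.0, 102.0, 100.0], [1, 1, 0, 0]))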
55
Steps of the algorithm (3/3)
5. Evaluate the selected individuals on the validation set
6. Evaluate the best individual out-of-sample
56
GP at work: demo on the Taiwan Capitalization Weighted Stock Index
57
Part 3: Analyzing GP results
58
One may cast doubts on GP efficiency ...
  • Highly heuristic, no theory! There are problems on which GP has been shown not to be significantly better than random search
  • Few clear-cut successes reported in the financial literature
  • GP embeds little domain-specific knowledge yet ...
  • Doubts about how efficiently GP uses the available computing time:
  • code bloat
  • bad at finding numerical constants
  • the best solutions are sometimes found very early in the run ...
  • Variability of the results! e.g. returns:

-0.160993, 0.0526153, 0.0526153, 0.0526153,
0.0526153, -0.0794787, 0.0526153, -0.0794787,
0.132354, 0.364311, -0.0990995, -0.0794787,
-0.0855786, -0.094433, 0.0464288, -0.140719,
0.0526153, 0.0526153, -0.0746189, 0.418075, .
59
Possible pretests: measures of predictability of the financial time series
  • Actual question: how predictable, for a given horizon, with a given cost function?
  • Serial correlation (sketched below)
  • Kolmogorov complexity
  • Lyapunov exponent
  • Unit root analysis
  • Comparison with results on surrogate data / shuffled series (e.g. Kaboudan statistics)
  • ...
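For the simplest item of this list, serial correlation, a Python sketch of the lag-1 autocorrelation (applied to daily log-returns, a value close to 0 is evidence, not proof, of little linear structure to exploit):

    def autocorrelation(series, lag=1):
        """Sample autocorrelation of a series at the given lag."""
        n = len(series)
        mean = sum(series) / n
        var = sum((x - mean) ** 2 for x in series)
        cov = sum((series[t] - mean) * (series[t - lag] - mean)
                  for t in range(lag, n))
        return cov / var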

60
In practice, some predictability does not imply profitability ...
  • The prediction horizon must be large enough!
  • Volatility may not be sufficient to cover round-trip transaction costs!
  • The right trading instrument may not be at hand ... typically, short selling is not available

61
Pretest methodology
  • Compare GP with several variants of:
  • random search algorithms
  • Zero-Intelligence Strategies (ZIS)
  • random trading behaviors
  • Lottery Trading (LT)

Issue: how to best constrain randomness?
  • Statistical hypothesis testing:
  • Null: GP does not outperform ZIS
  • Null: GP does not outperform LT

62
Pretest 1: GP versus Zero-Intelligence Strategies (Equivalent-search-intensity Random Search (ERS) with a validation stage)
  • Null hypothesis H1,0: GP does not outperform equivalent random search; the alternative hypothesis is H1,1

63
Pretest 1: GP vs zero-intelligence strategies (ERS)
  • If H1,0 cannot be rejected, the interpretation is:
  • there is nothing to learn, or GP is not very effective

64
Pretest 4: GP vs lottery trading
  • Lottery trading (LT): random trading behavior driven by the outcome of a random variable (e.g. a Bernoulli law)
  • Issue 1: if LT tends to hold positions (short, long) for less time than GP, transaction costs may favor GP ...
  • Issue 2: it might be an advantage or a disadvantage for LT to trade much less or much more than GP
  • e.g. a downward-oriented market with no short selling

65
Frequency and intensity of a trading strategy
  • Frequency: average number of transactions per unit of time
  • Intensity: proportion of time during which a position is held
  • For pretest 4:
  • we impose that the average frequency and intensity of LT be equal to those of GP
  • Implementation: generate random trading sequences having the right characteristics (see the sketch after the example sequence below)
0,0,1,1,1,0,0,0,0,0,1,1,1,1,1,1,0,0,1,1,0,1,0,0,0,
0,0,0,1,1,1,1,1,1,
66
Pretest 4 implementation
67
Answering question 1: is there anything to learn in the training data at hand?
68
Question 1: pretests involved
  • Starting point: if a set of search algorithms does not outperform LT, this gives evidence that there is nothing to learn ...
  • Pretest 4: GP vs Lottery Trading
  • Null hypothesis H4,0: GP does not outperform LT
  • Pretest 5: Equivalent Random Search (ZIS) vs Lottery Trading
  • Null hypothesis H5,0: ERS does not outperform LT

69
Question 1: some answers ...
  • ¬R means that the null hypothesis Hi,0 cannot be rejected; R means we should favor Hi,1

          H4,0   H5,0   Interpretation
Case 1    ¬R     ¬R     there is nothing to learn
Case 2    R      R      there is something to learn
Case 3    R      ¬R     there may be something to learn; ERS might not be powerful enough
Case 4    ¬R     R      there may be something to learn; the GP evolution process is detrimental
70
Answering question 2: is GP effective?
71
Question 2: some answers ...
  • Question 2 cannot be answered if there is nothing to learn (case 1)
  • Case 4 provides us with a negative answer ...
  • In cases 2 and 3, run pretest 1: GP vs Equivalent Random Search
  • Null hypothesis H1,0: GP does not outperform ERS
  • If one cannot reject H1,0, GP shows no evidence of efficiency

72
Pretests at work. Methodology: draw conclusions from the pretests using our own programs and compare with the results reported in the literature [ChKuHo06] on the same time series
73
Setup: GP control parameters, same as in [ChKuHo06]
74
Setup: statistics, data, trading scheme
  • Hypothesis testing with Student's t-test at a 95% confidence level (see the sketch after this list)
  • Pretests with samples made of 50 GP runs, 50 ERS runs and 100 LT runs
  • Data: the indexes of 3 stock exchanges: Canada, Taiwan and Japan
  • Daily trading with short selling
  • Training on 3 years, validation on 2 years
  • Out-of-sample periods: 1999-2000, 2001-2002, 2003-2004
  • Data normalized with a 250-day moving average
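A sketch of one pretest decision in Python with SciPy (assumes SciPy >= 1.6 for the alternative argument; gp_returns and lt_returns are hypothetical arrays holding the returns of the 50 GP runs and the 100 LT runs; Welch's unequal-variance variant is used here as a robust default, which may differ from the original analysis):

    from scipy import stats

    def gp_outperforms(gp_returns, lt_returns, alpha=0.05):
        """One-sided two-sample t-test of H0: mean(GP) <= mean(LT).
        Returns True when H0 is rejected at level alpha, i.e. when the
        data support 'GP outperforms LT'."""
        t_stat, p_value = stats.ttest_ind(gp_returns, lt_returns,
                                          equal_var=False,
                                          alternative="greater")
        return p_value < alpha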

75
Results on actual data (1/2)
  • Evidence that there is something to learn: 4 markets out of 9 (C3, J2, T1, T3)
  • Experiments in [ChKuHo06], with another GP implementation, show that GP performs very well on these 4 markets
  • Evidence that there is nothing to learn: 3 markets (C1, J3, T2)
  • In [ChKuHo06], there is only one of them (C1) where GP has a positive return (but less than buy-and-hold)

76
Results on actual data (2/2)
  • GP effective: 3 markets out of 6
  • In these 3 markets, GP outperforms buy-and-hold: same outcome as in [ChKuHo06]
  • Preliminary conclusion: one can rely on the pretests ...
  • when there is nothing to learn, no GP implementation did well (except in one case)
  • when there is something to learn, at least one implementation did well (always)
  • when our GP is effective, the GP in [ChKuHo06] is effective too (always)

77
Further conclusion
  • Our GP implementation is more efficient than random search: there is no case where ERS outperforms LT and GP does not
  • But it is only slightly more efficient: one would expect many more cases where GP does better than LT while ERS does not
  • Our GP is actually able to take advantage of regularities in the data, but only of simple ones

78
Part 4: Perspectives in the field of GP for financial trading
79
Rethinking fitness functions
(From [LaPo02])
  • Fitness functions: accumulated return, risk-adjusted return, ...
  • Issue: on some problems [LaPo02], GP is only marginally better than random search because the fitness function induces a "difficult" landscape
  • Come up with GP-friendly fitness functions

80
Preprocessing of the data: still an open issue
  • Studies in forecasting show the importance of preprocessing for GP; often, normalization with MA(250) is used, with benefits [ChKuHo06] (see the sketch below)
  • Should the length of the MA change according to market volatility, regime changes, etc.?
  • Why not consider MACD, exponential MA, differencing, rate of change, log values, FFT, wavelets, ...
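A minimal sketch of the MA(250) normalization mentioned above (the exact convention, dividing each price by the average of the preceding 250 closes, is an assumption):

    def normalize_ma(prices, window=250):
        """Return the prices from index `window` on, each divided by the
        moving average of the `window` prices that precede it."""
        normalized = []
        for t in range(window, len(prices)):
            ma = sum(prices[t - window:t]) / window
            normalized.append(prices[t] / ma)
        return normalized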

81
Data division scheme
  • There is evidence that GP performs poorly when the characteristics of the training interval are very different from those of the out-of-sample interval
  • Characterization of the current market condition: mean-reverting, trend-following, ...
  • Relearning on a smaller interval if needed?

82
More extensive tests are needed ... automating the tests
  • A comprehensive test on daily indexes was done in [ChKuHo06]; none exists for individual stocks and intraday data
  • Automated testing on several hundred stocks is fully feasible but requires a software infrastructure and much computing power

83
Ensemble methods: combining trading rules
  • In ML, ensemble methods have proven to be very effective
  • The majority rule was tested in [ChKuHo06] with some success (see the sketch below)
  • Efficiency requirements: accuracy (better than random) and diversity (uncorrelated errors); what does this mean for trading rules?
  • A more fine-grained selection / weighting scheme may lead to better results
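A sketch of the majority rule over several evolved trading rules (the tie-breaking choice, staying out of the market on a tie, is an assumption):

    def majority_position(signals):
        """signals: one 0/1 decision per trading rule for a single day.
        Go long only if a strict majority of the rules says BUY."""
        return 1 if 2 * sum(signals) > len(signals) else 0

    # Example: three rules vote [1, 0, 1] -> position 1 (long)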

84
Embed more domain-specific knowledge
  • Black-box algorithms are usually outperformed by domain-specific algorithms
  • The domain-specific language is limited as yet
  • Enrich the primitive set with volume, indexes, bid/ask spread, ...
  • Enrich the function set with cross-correlation, predictability measures, ...

85
References (1/2)
  • [ChKuHo06] S.-H. Chen, T.-W. Kuo and K.-M. Hoi, "Genetic Programming and Financial Trading: How Much About 'What We Know'?", 4th NTU International Conference on Economics, Finance and Accounting, April 2006.
  • [ChNa06] S.-H. Chen and N. Navet, "Pretests for Genetic-Programming Evolved Trading Programs: Zero-Intelligence Strategies and Lottery Trading", Proc. ICONIP 2006.
  • [SiAl03] S. Silva and J. Almeida, "Dynamic Maximum Tree Depth - A Simple Technique for Avoiding Bloat in Tree-Based GP", GECCO 2003, LNCS 2724, pp. 1776-1787, 2003.
  • [Str03] M.J. Streeter, "The Root Causes of Code Growth in Genetic Programming", EuroGP 2003, pp. 443-454, 2003.
  • [TeHe04] M.D. Terrio and M.I. Heywood, "On Naïve Crossover Biases with Reproduction for Simple Solutions to Classification Problems", GECCO 2004, 2004.
  • [ZuPiMa01] G. Zumbach, O.V. Pictet and O. Masutti, "Genetic Programming with Syntactic Restrictions Applied to Financial Volatility Forecasting", Olsen & Associates, Research Report, 2001.
  • [EvFe98] M. Evett and T. Fernandez, "Numeric Mutation Improves the Discovery of Numeric Constants in Genetic Programming", Genetic Programming 1998: Proceedings of the Third Annual Conference, 1998.

86
References (2/2)
  • [Kab02] M. Kaboudan, "GP Forecasts of Stock Prices for Profitable Trading", Evolutionary Computation in Economics and Finance, 2002.
  • [SaTe01] M. Santini and A. Tettamanzi, "Genetic Programming for Financial Series Prediction", Proceedings of EuroGP'2001, 2001.
  • [BhPiZu02] S. Bhattacharyya, O.V. Pictet and G. Zumbach, "Knowledge-Intensive Genetic Discovery in Foreign Exchange Markets", IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, April 2002.
  • [LaPo02] W.B. Langdon and R. Poli, Foundations of Genetic Programming, Springer Verlag, 2002.
  • [Kab00] M. Kaboudan, "Genetic Programming Prediction of Stock Prices", Computational Economics, vol. 16, 2000.
  • [Wag03] L. Wagman, "Stock Portfolio Evaluation: An Application of Genetic-Programming-Based Technical Analysis", Genetic Algorithms and Genetic Programming at Stanford 2003, 2003.
  • [GoFe99] W. Golubski and T. Feuring, "Evolving Neural Network Structures by Means of Genetic Programming", Proceedings of EuroGP'99, 1999.
  • [Dem05] I. Dempsey, "Constant Generation for the Financial Domain using Grammatical Evolution", Proceedings of the 2005 Workshops on Genetic and Evolutionary Computation, pp. 350-353, Washington, June 25-26, 2005.

87
?