Title: Time Series Prediction: Forecasting the Future and Understanding the Past. Santa Fe Institute Proceedings on the Studies in the Sciences of Complexity. Edited by Andreas Weigend and Neil Gershenfeld
1 Time Series Prediction: Forecasting the Future and Understanding the Past
Santa Fe Institute Proceedings on the Studies in the Sciences of Complexity
Edited by Andreas Weigend and Neil Gershenfeld
- NIST Complex System Program
- Perspectives on Standard Benchmark Data
- In Quantifying Complex Systems
- Vincent Stanford
- Complex Systems Test Bed project
- August 31, 2007
2 Chaos in Nature, Theory, and Technology
Rings of Saturn
Lorenz Attractor
Aircraft dynamics at high angles of attack
3 Time Series Prediction: A Santa Fe Institute competition using standard data sets
- Santa Fe Institute (SFI) was founded in 1984 to focus the tools of traditional scientific disciplines and emerging computer resources on the multidisciplinary study of complex systems
- This book is the result of an unsuccessful joke. Out of frustration with the fragmented and anecdotal literature, we made what we thought was a humorous suggestion: run a competition. No one laughed.
- Time series from physics, biology, economics, ..., beg the same questions
- What happens next?
- What kind of system produced this time series?
- How much can we learn about the producing system?
- Quantitative answers can permit direct comparisons
- Make some standard data sets in consultation with subject matter experts in a variety of areas.
- Very NISTY, but we are in a much better position to do this in the age of Google and the Internet.
4 Selecting benchmark data sets for inclusion in the book
- Subject matter expert advisor group
- Biology
- Economics
- Astrophysics
- Numerical Analysis
- Statistics
- Dynamical Systems
- Experimental Physics
5 The Data Sets
- Far-infrared laser excitation
- Sleep Apnea
- Currency exchange rates
- Particle driven in nonlinear multiple well potentials
- Variable star data
- J. S. Bach fugue notes
6 J.S. Bach benchmark
- Dynamic, yes.
- But is it an iterative map?
- Is it amenable to time delay embedding?
7 Competition Tasks
- Predict the withheld continuations of the data sets provided for training, and measure the errors
- Characterize the systems as to
- Degrees of Freedom
- Predictability
- Noise characteristics
- Nonlinearity of the system
- Infer a model for the governing equations
- Describe the algorithms employed
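The slide does not say how the prediction errors were scored. As a minimal sketch, assuming a normalized mean squared error (squared forecast error divided by the total squared deviation of the withheld data from its mean), the measurement could look like this. The function name `nmse` and the sample numbers are hypothetical.

```python
def nmse(observed, predicted):
    """Normalized mean squared error: squared error over the spread of `observed`."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    mean = sum(observed) / len(observed)
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total_variation = sum((o - mean) ** 2 for o in observed)
    if total_variation == 0:
        raise ValueError("observed series is constant; NMSE is undefined")
    return squared_error / total_variation

withheld = [1.0, 2.0, 4.0, 3.0]        # hypothetical withheld continuation
perfect = nmse(withheld, withheld)      # exact forecast
baseline = nmse(withheld, [2.5] * 4)    # always predict the mean of the withheld data
```

Under this assumed metric an exact forecast scores 0 and always predicting the mean scores 1, which gives entries on different data sets a common scale.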
8 Complex Time Series Benchmark Taxonomy
- Natural vs. synthetic
- Stationary vs. nonstationary
- Low dimensional vs. stochastic
- Clean vs. noisy
- Short vs. long
- Documented vs. blind
- Linear vs. nonlinear
- Scalar vs. vector
- One trial vs. many trials
- Continuous vs. discontinuous (switching, catastrophes, episodes)
9 Time honored linear models
- Auto Regressive Moving Average (ARMA)
- Many linear estimation techniques based on Least Squares, or Least Mean Squares
- Power spectra and autocorrelation characterize such linear systems
- Randomness comes only from forcing function x(t)
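To make the linear picture concrete, here is a minimal sketch that covers only the autoregressive half of ARMA, at order one: y[t] = a * y[t-1] + x[t], with Gaussian noise as the forcing function x(t). The coefficient 0.8, the series length, and the function names are assumptions for illustration.

```python
import random

def simulate_ar1(a, n, seed=0):
    """Generate y[t] = a * y[t-1] + x[t] with Gaussian forcing x[t]."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(a * y[-1] + rng.gauss(0.0, 1.0))
    return y

def fit_ar1(y):
    """Least-squares estimate of a: minimizes the sum of (y[t] - a * y[t-1])**2."""
    num = sum(cur * prev for prev, cur in zip(y, y[1:]))
    den = sum(prev * prev for prev in y[:-1])
    return num / den

series = simulate_ar1(a=0.8, n=5000)
a_hat = fit_ar1(series)   # should land close to the true coefficient 0.8
```

All the irregularity in `series` comes from the random forcing; with the forcing switched off, the model simply decays to zero.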
10 Simple nonlinear systems can exhibit chaotic behavior
- Spectrum and autocorrelation characterize linear systems, not these
- Deterministic chaos looks random to linear analysis methods
- Logistic map is an early example (Ulam 1957).
Logistic map, $2.9 < r < 3.99$
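A minimal sketch of the logistic map x -> r * x * (1 - x) at the two ends of the plotted range of r. The starting value 0.3, the perturbation of 1e-10, and the iteration counts are arbitrary choices for illustration.

```python
def logistic_orbit(r, x0, n):
    """Iterate x -> r * x * (1 - x) and return the n + 1 visited values."""
    xs = [x0]
    for _ in range(n):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# r = 2.9: the orbit settles on the fixed point 1 - 1/r.
settled = logistic_orbit(2.9, 0.3, 1000)[-1]

# r = 3.99: two orbits that start 1e-10 apart drift far from each other.
a = logistic_orbit(3.99, 0.3, 200)
b = logistic_orbit(3.99, 0.3 + 1e-10, 200)
max_gap = max(abs(u - v) for u, v in zip(a, b))
```

No noise term appears anywhere in the code, yet the orbit at r = 3.99 has the irregular look that linear tools would attribute to randomness.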
11 Understanding and learning: comments from SFI
- Weak to strong models: many parameters to few
- Data poor to data rich
- Theory poor to theory rich
- Weak models progress to strong, e.g. planetary motion
- Tycho Brahe observes and records raw data
- Kepler: equal areas swept in equal time
- Newton: universal gravitation, mechanics, and calculus
- Poincaré fails to solve three body problem
- Sussman and Wisdom: chaos ensues with computational solution!
- Is that a simplification?
12 Discovering properties of data and inferring (complex) models
- Can't decompose an output into the product of input and transfer function, $Y(z) = H(z)X(z)$, by doing a Z, Laplace, or Fourier transform.
- Linear perceptrons were shown to have severe limitations by Minsky and Papert
- Perceptrons with non-linear threshold logic can solve XOR and many classifications not available with the linear version
- But according to SFI: Learning XOR is as interesting as memorizing the phone book. More interesting - and more realistic - are real-world problems, such as prediction of financial data.
- Many approaches are investigated
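The XOR point can be checked directly. The sketch below assumes the usual reading of the limitation: a single threshold unit cannot reproduce XOR, while two layers of threshold units can. The weights are hand-set rather than learned, and the grid search only illustrates the single-unit case over a coarse range of weights; it is not a proof.

```python
from itertools import product

def step(v):
    """Threshold nonlinearity: 1 if the weighted sum is positive, else 0."""
    return 1 if v > 0 else 0

def single_unit(w1, w2, bias, x1, x2):
    """One threshold unit acting directly on the two inputs."""
    return step(w1 * x1 + w2 * x2 + bias)

def two_layer_xor(x1, x2):
    """Hand-set weights: hidden OR and NAND units feed an AND output unit."""
    h_or = step(x1 + x2 - 0.5)
    h_nand = step(-x1 - x2 + 1.5)
    return step(h_or + h_nand - 1.5)

inputs = list(product([0, 1], repeat=2))
xor_truth = [x1 ^ x2 for x1, x2 in inputs]

# Search a coarse grid of weights for a single threshold unit that matches XOR.
grid = [k / 2 for k in range(-8, 9)]
single_unit_solutions = [
    (w1, w2, b)
    for w1, w2, b in product(grid, repeat=3)
    if [single_unit(w1, w2, b, x1, x2) for x1, x2 in inputs] == xor_truth
]
two_layer_outputs = [two_layer_xor(x1, x2) for x1, x2 in inputs]
```

The search comes back empty because XOR is not linearly separable, while the two-layer network reproduces the truth table with three units.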
13 Time delay embedding: Differs from traditional experimental measurements
- Provides detailed information about degrees of freedom beyond the scalar measured
- Rests on probabilistic assumptions, though not guaranteed to be valid for any particular system
- Reconstructed dynamics are seen through an unknown smooth transformation
- Therefore allows precise questions only about invariants under smooth transformations
- It can still be used for forecasting a time series and characterizing essential features of the dynamics that produced it
14 Time delay embedding theorems: The most important phase space reconstruction technique is the method of delays
Vector Sequence
Scalar Measurement
Time delay Vectors
- Assuming the dynamics $f(X)$ on a $V$ dimensional manifold has a strange attractor $A$ with box counting dimension $d_A$
- $s(X)$ is a twice differentiable scalar measurement giving $s_n = s(X_n)$
- $M$ is called the embedding dimension
- $\tau$ is generally referred to as the delay, or lag
- Embedding theorems: if $s_n$ consists of scalar measurements of the state of a dynamical system then, under suitable hypotheses, the time delay embedding $S_n$ is a one-to-one transformed image of the $X_n$, provided $M > 2 d_A$. (e.g. Takens 1981, Lecture Notes in Mathematics, Springer-Verlag, or Sauer and Yorke, J. of Statistical Physics, 1991)
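The slide's equations did not survive, so the following is a sketch under an assumed convention: each time delay vector looks backwards in time, S_n = (s_n, s_{n - tau}, ..., s_{n - (M - 1) tau}), with an integer delay tau measured in samples. The function name `delay_vectors` is hypothetical.

```python
def delay_vectors(s, M, tau):
    """Return S_n = (s[n], s[n - tau], ..., s[n - (M - 1) * tau]) for every valid n."""
    if M < 1 or tau < 1:
        raise ValueError("M and tau must be positive integers")
    first = (M - 1) * tau   # earliest n with a full history behind it
    return [tuple(s[n - j * tau] for j in range(M)) for n in range(first, len(s))]

samples = list(range(10))   # stand-in for the scalar measurements s_n
vectors = delay_vectors(samples, M=3, tau=2)
# vectors[0] is (4, 2, 0) and vectors[-1] is (9, 7, 5)
```

A series of length N yields N - (M - 1) * tau vectors, so large embedding dimensions or long delays use up the start of a short record.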
15 Time series prediction: Many different techniques thrown at the data to see if anything sticks
- Examples
- Delay coordinate embedding: short term prediction by filtered delay coordinates and reconstruction with local linear models of the attractor (T. Sauer).
- Neural networks with internal delay lines: performed well on data set A (E. Wan), (M. Mozer)
- Simple architectures for fast machines: know the data and your modeling technique (X. Zhang and J. Hutchinson)
- Forecasting pdfs using HMMs with mixed states: capturing Embedology (A. Fraser and A. Dimitriadis)
- More
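As a rough illustration of prediction in delay coordinates, the sketch below uses a simpler stand-in for the local linear models mentioned above: it averages the successors of the k nearest delay vectors (a local constant model). It is not a reconstruction of any entrant's method. The test series is a logistic map with r = 3.9, and M, tau, and k are arbitrary assumptions.

```python
def logistic_series(r, x0, n):
    """Return n values of the map x -> r * x * (1 - x), starting at x0."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

def predict_next(history, M=2, tau=1, k=3):
    """Average the successors of the k delay vectors closest to the latest one."""
    first = (M - 1) * tau
    query = [history[-1 - j * tau] for j in range(M)]
    scored = []
    for n in range(first, len(history) - 1):   # the last point has no known successor
        vec = [history[n - j * tau] for j in range(M)]
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, history[n + 1]))
    scored.sort()
    nearest = scored[:k]
    return sum(nxt for _, nxt in nearest) / len(nearest)

series = logistic_series(3.9, 0.3, 2100)
# One-step forecasts for the last 100 points, each using only earlier data.
errors = [abs(predict_next(series[:t]) - series[t]) for t in range(2000, 2100)]
mean_abs_error = sum(errors) / len(errors)
```

Replacing the average with a least-squares linear fit over the neighbours would move this toward a local linear model, at the cost of needing more neighbours than embedding dimensions.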
16 Time series characterization: Many different techniques thrown at the data to see if anything sticks
- Examples
- Stochastic and deterministic modeling: local linear approximation to attractors (M. Casdagli and A. Weigend)
- Estimating dimension and choosing time delays: box counting (F. Pineda and J. Sommerer)
- Quantifying chaos using information-theoretic functionals: mutual information and nonlinearity testing (M. Palus)
- Statistics for detecting deterministic dynamics: coarse grained flow averages (D. Kaplan)
- More
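Box counting, one of the techniques listed above, can be sketched in a few lines: cover the points with a grid of side eps, count the occupied boxes N(eps), and estimate the dimension from how N grows as eps shrinks. The two-scale slope used here is a simplifying assumption (in practice one would fit over many scales), and the test set is a synthetic diagonal line, not competition data.

```python
import math

def box_count(points, eps):
    """Number of grid boxes of side eps that contain at least one point."""
    return len({tuple(math.floor(c / eps) for c in p) for p in points})

def box_dimension(points, eps_coarse, eps_fine):
    """Two-scale slope estimate of log N(eps) against log(1 / eps)."""
    n_coarse = box_count(points, eps_coarse)
    n_fine = box_count(points, eps_fine)
    return math.log(n_fine / n_coarse) / math.log(eps_coarse / eps_fine)

# Known test set: 10,000 points on the diagonal of the unit square (dimension 1).
diagonal = [((i + 0.5) / 10000,) * 2 for i in range(10000)]
dim_estimate = box_dimension(diagonal, eps_coarse=1 / 8, eps_fine=1 / 32)
```

The same two functions accept tuples of any length, so they can be applied to time delay vectors built from a measured series, provided there are enough points to populate the finer grid.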
17 What to make of this? Handbook for the corpus driven study of nonlinear dynamics
- Very NISTY
- Convene a panel of leading researchers
- Identify areas of interest where improved characterization and predictive measurements can be of assistance to the community
- Identify standard reference data sets
- Development corpora
- Test sets
- Develop metrics for prediction and characterization
- Evaluate participants
- Is there a sponsor?
- Are there areas of special importance to communities we know? For example, predicting catastrophic failures of machines from sensors.
18 Ideas?