Time Series Prediction as a Problem of Missing Values: Application to ESTSP2007 and NN3 Competition Benchmarks


1
Time Series Prediction as a Problem of Missing
Values: Application to ESTSP2007 and NN3
Competition Benchmarks
  • Antti Sorjamaa and Amaury Lendasse
  • Time Series Prediction and ChemoInformatics Group
  • Adaptive Informatics Research Centre
  • Helsinki University of Technology

2
Outline
  • Time Series Prediction vs. Missing Values
  • Global methodology
  • Self-Organizing Maps (SOM)
  • Empirical Orthogonal Functions (EOF)
  • Results

3
Missing Values
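  • A database with missing values (?)
         1   9   ?  11  76   2
         4  13   7   ?   ?   3
         7   ?   0   8  12  21
        10   2   ?   1   ?   ?
        12   ?   3   ?   5   6
         ?   5   8   ?   ?  11
         9   6   7   2  90   6
         3   ?  21   ?   2   0
  • A time series with an unknown future (time runs downwards)
        47
        48
        49
        50
         ?
         ?
         ?
         ?
  • The same series as a regressor matrix (time runs downwards)
        42  43  44  45  46  47
        43  44  45  46  47  48
        44  45  46  47  48  49
        45  46  47  48  49  50
        46  47  48  49  50   ?
        47  48  49  50   ?   ?
        48  49  50   ?   ?   ?
        49  50   ?   ?   ?   ?
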
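The regressor matrix on this slide can be reproduced with a few lines of NumPy. This is only a minimal sketch: the function name `regressor_matrix` and its parameters `window` and `horizon` are illustrative choices, not names from the presentation, and NaN stands in for the "?" entries.

```python
import numpy as np

def regressor_matrix(series, window, horizon):
    """Stack sliding windows of `series`; the unknown future is padded with NaN."""
    padded = np.concatenate([np.asarray(series, dtype=float),
                             np.full(horizon, np.nan)])
    n_rows = len(padded) - window + 1
    # Each row is one window of `window` consecutive time steps.
    return np.stack([padded[i:i + window] for i in range(n_rows)])

# Known values 42..50, four future values to predict (as on the slide).
series = np.arange(42, 51)
X = regressor_matrix(series, window=6, horizon=4)
print(X)
```

Any method that fills the NaN entries of `X` then yields, as a side effect, a prediction of the future values of the series.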
4
Time Series Prediction vs. Missing Values
  • Methods designed for finding Missing Values in
    temporally related databases
  • Time series is such a database
  • Unknown future can be considered as a set of
    missing values
  • → The same methods can be applied

5
Global Methodology
  • Based on two methods
  • SOM
      • Nonlinear projection / interpolation
      • Topology preservation on a low-dimensional grid
  • EOF
      • Linear projection
      • Projection to high-dimensional output space
      • Needs initialization

6
SOM
7
SOM Interpolation
  • SOM learning is done with known data
  • Missing values are left out
  • Approach proposed by Cottrell and Letrémy
    (in Applied Stochastic Models and Data Analysis
    2005)
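One way to read "missing values are left out" is that distances to the SOM prototypes are computed over the known components only, and the missing components are then read off the best-matching prototype. The sketch below illustrates only that step. It assumes an already trained codebook (the training loop is omitted), and the names `masked_bmu` and `fill_from_codebook` are hypothetical, not taken from the slides or from the cited paper.

```python
import numpy as np

def masked_bmu(sample, codebook):
    """Index of the closest prototype, comparing only the known components."""
    known = ~np.isnan(sample)
    diffs = codebook[:, known] - sample[known]
    return int(np.argmin(np.sum(diffs ** 2, axis=1)))

def fill_from_codebook(X, codebook):
    """Replace each NaN with the same component of the row's best-matching prototype."""
    filled = X.copy()
    for i, row in enumerate(X):
        missing = np.isnan(row)
        if missing.any():
            filled[i, missing] = codebook[masked_bmu(row, codebook), missing]
    return filled

# Hypothetical two-prototype codebook standing in for a trained SOM.
codebook = np.array([[0.0, 0.0, 0.0],
                     [10.0, 10.0, 10.0]])
X = np.array([[9.0, np.nan, 11.0],
              [1.0, 0.5, np.nan]])
filled = fill_from_codebook(X, codebook)
print(filled)
```

Because every filled value is copied from one of a finite set of prototypes, the result is discrete, which is the limitation the Summary slide mentions.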

8
EOF Projection
  • Based on Singular Value Decomposition (SVD)
  • Only q Singular Values and Vectors are used
  • q is smaller than K (the rank of X)
  • Larger singular values carry more signal than smaller ones
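The projection described above amounts to a truncated SVD reconstruction. A minimal sketch with NumPy follows; the function name `eof_reconstruct` and the random test matrix are illustrative assumptions.

```python
import numpy as np

def eof_reconstruct(X, q):
    """Rebuild X from its q largest singular values and vectors."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # np.linalg.svd returns singular values in descending order,
    # so the first q entries are the q largest.
    return (U[:, :q] * s[:q]) @ Vt[:q, :]

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 6))  # complete matrix, no missing values
errors = {q: float(np.linalg.norm(X - eof_reconstruct(X, q))) for q in (1, 3, 6)}
print(errors)
```

The reconstruction error shrinks as q grows and vanishes (up to rounding) when q equals the rank of X.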

9
EOF Projection (2)
  • SVD cannot deal with missing values
  • Initialization is crucial!
  • Decomposition with SVD and reconstruction
  • q largest singular values and vectors are used in
    the reconstruction
  • Original data is not modified!
  • q is selected using validation

10
EOF Projection (3)
  • Data with missing values
         1   9   ?  11  76   2
         4  13   7   ?   ?   3
         7   ?   0   8  12  21
        10   2   ?   1   ?   ?
        12   ?   3   ?   5   6
         ?   5   8   ?   ?  11
         9   6   7   2  90   6
         3   ?  21   ?   2   0
  • Initialization
         1   9   5  11  76   2
         4  13   7   5   5   3
         7   5   0   8  12  21
        10   2   5   1   5   5
        12   5   3   5   5   6
         5   5   8   5   5  11
         9   6   7   2  90   6
         3   5  21   5   2   0
  • Round 1
         1   9   4  11  76   2
         4  13   7   6  11   3
         7   2   0   8  12  21
        10   2   3   1   8  10
        12   5   3   3   5   6
         7   5   8   2   5  11
         9   6   7   2  90   6
         3   8  21   1   2   0
  • Round 2
         1   9   4  11  76   2
         4  13   7   9  21   3
         7   4   0   8  12  21
        10   2   1   1   9  12
        12   5   3   3   5   6
         9   5   8   2   5  11
         9   6   7   2  90   6
         3   8  21   1   2   0
  • Round 3
         1   9   4  11  76   2
         4  13   7   9  22   3
         7   5   0   8  12  21
        10   2   1   1   9  13
        12   4   3   3   5   6
        10   5   8   2   5  11
         9   6   7   2  90   6
         3   8  21   1   2   0
  • ...
  • n. Done!
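The rounds shown on this slide can be sketched as a loop: initialize the missing entries, decompose with SVD, rebuild from the q largest components, and overwrite only the missing entries. The code below is a hedged illustration, not the authors' implementation. The name `eof_fill`, the stopping rule (`n_iter`, `tol`), and the default column-mean initialization are assumptions; the example call passes a constant initialization of 5 to mirror the slide.

```python
import numpy as np

def eof_fill(X, q, init=None, n_iter=100, tol=1e-6):
    """Iteratively fill the NaN entries of X with a rank-q SVD reconstruction."""
    missing = np.isnan(X)
    filled = X.copy()
    if init is None:
        # Assumed default: start from the column means of the known values.
        col_means = np.nanmean(X, axis=0)
        filled[missing] = col_means[np.where(missing)[1]]
    else:
        filled[missing] = init[missing]
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        recon = (U[:, :q] * s[:q]) @ Vt[:q, :]
        change = np.max(np.abs(recon[missing] - filled[missing])) if missing.any() else 0.0
        # Known values are never modified; only the missing entries are updated.
        filled[missing] = recon[missing]
        if change < tol:
            break
    return filled

nan = np.nan
X_slide = np.array([
    [1, 9, nan, 11, 76, 2],
    [4, 13, 7, nan, nan, 3],
    [7, nan, 0, 8, 12, 21],
    [10, 2, nan, 1, nan, nan],
    [12, nan, 3, nan, 5, 6],
    [nan, 5, 8, nan, nan, 11],
    [9, 6, 7, 2, 90, 6],
    [3, nan, 21, nan, 2, 0],
])
filled = eof_fill(X_slide, q=2, init=np.full(X_slide.shape, 5.0))
print(np.round(filled, 1))
```

The value q=2 is arbitrary here; the slides select q by validation. In the combined methodology, the `init` matrix would come from the SOM step rather than from a constant.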

11
Global Methodology (2)
  • [Flow diagram: Missing Data → SOM → EOF (with EOF iteration) → Data with filled values; diagram labels: SOM grid size, Number of EOF]
12
ESTSP2007 Competition Data
  • [Figure: competition series, with parts labeled Learning and Validation]
13
Results, Regressor size 11
  • EOF
  • SOM
  • SOMEOF

14
Results (2)
  • EOF
  • SOM
  • SOMEOF

  • [Figure annotations: 2, 18, 174]
15
Prediction
16
NN3 Competition
  • Prediction of 111 time series
  • A single, automatic methodology for predicting all
    the series
  • Prediction of 18 values into the future for each
    series
  • All series are rather short, which makes the
    prediction tricky
  • The competition evaluates the mean SMAPE over
    all series
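For reference, SMAPE (symmetric mean absolute percentage error) can be computed as below. Several variants of the formula exist, and the slide does not state which one the competition used, so the definition here (absolute error divided by the mean of the absolute actual and forecast values, in percent) is an assumption. The function names are illustrative.

```python
import numpy as np

def smape(actual, forecast):
    """SMAPE in percent for one series (one common definition, assumed here)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    # Undefined when an actual value and its forecast are both zero.
    denom = (np.abs(actual) + np.abs(forecast)) / 2.0
    return float(100.0 * np.mean(np.abs(actual - forecast) / denom))

def mean_smape(actuals, forecasts):
    """Average the per-series SMAPE over a collection of series."""
    return float(np.mean([smape(a, f) for a, f in zip(actuals, forecasts)]))

print(smape([100.0, 100.0], [100.0, 50.0]))
print(mean_smape([[100.0], [100.0]], [[100.0], [50.0]]))
```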

17
NN3 Long Series
Validation MSE 0.1559
Validation MSE 0.0076
18
NN3 Short Series
Validation MSE 0.3493
19
NN3 Validation Errors
Long
Short
20
Summary
  • Time Series Prediction can be viewed as a problem
    of Missing Values
  • The SOMEOF methodology works well, better than
    either method alone
  • SOM projection is discrete
  • EOF needs a sufficiently good initialization
  • → The methods complement each other

21
Further Work
  • Improvements to the methodology
      • The selection of singular values and vectors
      • Convergence criterion
      • How to guarantee quick convergence?
  • Applying the methodology to data sets from other
    fields
      • Climatology, finance, process data

22
Questions?
Time Series Prediction as a Problem of Missing
Values: Application to ESTSP2007 and NN3
Competition Benchmarks
  • Antti.Sorjamaa_at_hut.fi
  • Lendasse_at_cis.hut.fi
  • http://www.cis.hut.fi/projects/tsp