Title: Ensembles of Nearest Neighbor Forecasts
1. Ensembles of Nearest Neighbor Forecasts
- Dragomir Yankov, Eamonn Keogh
- Dept. of Computer Science Eng.
- University of California Riverside
- Dennis DeCoste
- Yahoo! Research
2. Outline
- Problem formulation
- NN forecasting framework
- Stability of the forecasts
- Ensembles of NN forecasts
- Experimental evaluation
3. Problem formulation
- Predict the number of impressions to be observed
for a specific website
4. Forecasting framework overview
5. Forecasting framework formalization
- Formalization
- Direct forecasts
- Given a query q, find its k nearest neighbors
  q_1, ..., q_k among the training windows
- Estimate the query continuation as the (weighted)
  average of the neighbors' continuations (see the
  sketch at the end of this slide)
- Other approaches: iterative forecasts, mutually
  validating forecasts
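A minimal sketch of the direct forecast, under the assumption that the training series is cut into overlapping windows whose continuations serve as candidate forecasts. Names such as `knn_forecast` are illustrative rather than from the paper, plain Euclidean distance stands in for the standardized distance defined on the next slide, and uniform weights are used as stated there.

```python
import numpy as np

def knn_forecast(train_series, query, k=6, horizon=30):
    """Direct k-NN forecast: average the continuations of the k training
    windows most similar to the query (uniform weights)."""
    q_len = len(query)
    windows, continuations = [], []
    # Every training window of length q_len is paired with the
    # `horizon` observations that follow it.
    for start in range(len(train_series) - q_len - horizon + 1):
        windows.append(train_series[start:start + q_len])
        continuations.append(train_series[start + q_len:start + q_len + horizon])
    windows = np.asarray(windows)
    continuations = np.asarray(continuations)

    # Distance from the query to every candidate window.
    dists = np.linalg.norm(windows - query, axis=1)
    nearest = np.argsort(dists)[:k]

    # Uniform-weight average of the neighbors' continuations.
    return continuations[nearest].mean(axis=0)

# Toy usage: forecast 30 steps ahead of a noisy sine wave.
rng = np.random.default_rng(0)
series = np.sin(np.linspace(0, 60, 2000)) + 0.1 * rng.standard_normal(2000)
forecast = knn_forecast(series[:-64], series[-64:], k=6, horizon=30)
print(forecast[:5])
```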
6. Forecasting framework components
- Similarity measure
- Standardized Euclidean distance
  d(x, q) = ||x_s - q_s||, where x_s denotes the window x
  standardized by its mean and standard deviation
- Prediction accuracy
- Prediction root mean square error (RMSE) over the
  forecast horizon
- Weighting function: uniform weights
  (a sketch of the distance and the RMSE follows this slide)
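A short sketch of the first two components, assuming the standardized Euclidean distance is the Euclidean distance between windows that have each been z-normalized by their own mean and standard deviation (the paper's exact standardization may differ):

```python
import numpy as np

def standardize(x):
    """Z-normalize a window by its own mean and standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def standardized_euclidean(x, q):
    """Distance between the standardized versions of two windows."""
    return np.linalg.norm(standardize(x) - standardize(q))

def rmse(forecast, actual):
    """Prediction root mean square error over the forecast horizon."""
    forecast, actual = np.asarray(forecast), np.asarray(actual)
    return np.sqrt(np.mean((forecast - actual) ** 2))

# Two windows with the same shape but different scale are close
# under the standardized distance.
print(standardized_euclidean([1, 2, 3, 4], [10, 20, 30, 40]))  # ~0
print(rmse([1.1, 1.9, 3.2], [1.0, 2.0, 3.0]))
```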
7. Stability of the forecasts
- Stability with respect to the training data
- NN is stable in the case of classification and
  majority voting (Breiman 96)
- Here: extrapolation plus regression. Changing one
  neighbor can change the forecast significantly
- Stability with respect to the input parameters
- Parameters: k, weights of the different neighbors,
  query length, prediction horizon
- Different combinations lead to different forecasts
8. Ensembles of NN forecasts
- Main idea: rather than tuning the best parameters
  for the entire dataset, select for each query the
  model that will predict it best
- Issues
- What base models to use
- How to select among them
9. Ensembles of NN forecasts
- Base models to use
- We focus on pairs of NN learners, in which the base
  models differ in the number of neighbors used
- The optimal single predictors and the suitable
  ensembles are determined on a validation set using
  an oracle (see the sketch after the table below)
k     RMSE (k-NN)   (k1, k2)   RMSE (Ensemble)
1     2.0447        (1, 20)    1.5829
2     1.9504        (2, 40)    1.5996
6     1.8321        (6, 1)     1.6305
10    1.8387        (10, 1)    1.5953
100   2.9608        (100, 1)   1.6095
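For concreteness, a sketch of how an oracle might score a (k1, k2) pair on the validation set: each validation query keeps whichever base forecast is closer to the truth, and the per-query errors are averaged. The array layout and this exact aggregation are assumptions, not details from the slides.

```python
import numpy as np

def oracle_rmse(forecasts_k1, forecasts_k2, actuals):
    """Oracle ensemble error: for every validation query, keep the RMSE of
    whichever base model (k1-NN or k2-NN) did better on that query."""
    f1, f2, y = map(np.asarray, (forecasts_k1, forecasts_k2, actuals))
    err1 = np.sqrt(np.mean((f1 - y) ** 2, axis=1))   # per-query RMSE, model 1
    err2 = np.sqrt(np.mean((f2 - y) ** 2, axis=1))   # per-query RMSE, model 2
    return np.minimum(err1, err2).mean()             # best model per query

# Toy example: 5 validation queries with a forecast horizon of 3.
rng = np.random.default_rng(1)
actuals = rng.standard_normal((5, 3))
f_k1 = actuals + 0.5 * rng.standard_normal((5, 3))
f_k2 = actuals + 0.8 * rng.standard_normal((5, 3))
print(oracle_rmse(f_k1, f_k2, actuals))
```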
10. Ensembles of NN forecasts
- Selecting among the base models
- Learn a classifier that selects the more suitable
  model for each individual query (an SVM with a
  Gaussian kernel; see the sketch at the end of this slide)
- Note: the classifier does not need to be perfect.
  What matters is that it identifies the bad cases
  for each base learner
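A hypothetical sketch of the selector using scikit-learn's RBF-kernel (Gaussian) SVC. The feature matrix below is a random placeholder standing in for the per-query features listed on the next slide, and the labels mark which base model the oracle preferred on the validation set; nothing here beyond "SVM with Gaussian kernel" comes from the slides.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(2)

# Placeholder training data: one feature vector per validation query and a
# 0/1 label saying which base model (k1-NN or k2-NN) the oracle preferred.
X = rng.standard_normal((200, 12))
y = (X[:, 0] + 0.3 * rng.standard_normal(200) > 0).astype(int)

# Gaussian-kernel SVM; features are scaled first, which RBF kernels expect.
selector = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma="scale", C=1.0))
selector.fit(X, y)

# At forecast time the classifier routes each new query to a base model.
new_query_features = rng.standard_normal((1, 12))
print(selector.predict(new_query_features)[0])   # 0 -> k1-NN, 1 -> k2-NN
```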
11. Ensembles of NN forecasts
- Selecting among the base models
- Extracted features (see the sketch at the end of this slide)
- Statistics of the query and its nearest neighbors:
  mean, median, variance, amplitude
- Statistics of the models' forecasts:
  mean, median, variance, amplitude
- Distances between the forecasts of the individual
  neighbors
- Performance of the models on the query's nearest
  neighbors
- Step-back forecasts (good for short horizons)
12. Experimental evaluation
13. Experimental evaluation
- Website impressions
- Computing the optimal single predictors
- Comparison with the accuracy of the ensemble
approach
Horizon   Predictor            Test RMSE   Std
h = 30    10-NN (optimal k)    1.123       0.644
h = 30    Ens (10-NN, 1-NN)    1.021       0.452
h = 60    8-NN (optimal k)     1.549       0.862
h = 60    Ens (10-NN, 1-NN)    1.412       0.685
h = 100   6-NN (optimal k)     1.867       1.183
h = 100   Ens (10-NN, 1-NN)    1.688       0.961
14. Experimental evaluation
15. Experimental evaluation
- Bias-variance improvement
- We compute the bias² and variance terms in the
  error decomposition for h = 100 steps ahead
- The statistics are recorded over 50 random
  subsamples from the original training set
  (a decomposition sketch follows the table below)
Predictor            Bias²    Variance
6-NN (optimal k)     5.042    0.638
Ens (10-NN, 1-NN)    3.721    0.204
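A sketch of how the bias² and variance terms could be estimated from forecasts produced on repeated random subsamples of the training set, using the standard squared-error decomposition; the helper name and array shapes are assumptions.

```python
import numpy as np

def bias_variance(subsample_forecasts, actual):
    """Estimate bias^2 and variance from forecasts made by models trained
    on different random subsamples of the training data."""
    forecasts = np.asarray(subsample_forecasts)   # (n_subsamples, horizon)
    mean_forecast = forecasts.mean(axis=0)
    bias2 = np.mean((mean_forecast - np.asarray(actual)) ** 2)
    variance = np.mean(forecasts.var(axis=0))
    return bias2, variance

# Toy example: 50 subsample forecasts of a 100-step horizon,
# deliberately biased and noisy.
rng = np.random.default_rng(4)
actual = np.sin(np.linspace(0, 6, 100))
forecasts = actual + 0.2 + 0.3 * rng.standard_normal((50, 100))
print(bias_variance(forecasts, actual))
```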
16. Conclusions and future directions
- The proposed technique significantly improves the
  prediction accuracy of the single NN forecasting models
- It outlines a principled solution to the bias-variance
  problem of NN forecasts
- It is a data-specific rather than a generic approach
- Combining more models and varying other parameters
  would require selecting different features
Thank you!