Value of Unlabeled Data in Regression - PowerPoint PPT Presentation

Department of Computer Science and Engineering
College of Engineering
Semi-supervised Learning with Data Calibration
for Long-Term Time Series Forecasting
Haibin Cheng and Pang-Ning Tan
  • MOTIVATION
  • Long-term time series forecasting is needed for a
    broad range of applications, including climate
    impact assessments and urban planning.
  • CHALLENGES IN LONG-TERM FORECASTING
  • Most time series prediction work has focused on
    single-step or short-term prediction, because it
    is inherently difficult to control the propagation
    of errors from one prediction step to the next.
  • An extensive amount of historical data is needed
    for reliable prediction, and such data is
    expensive to obtain.
  • Concept drift may be present in the modeling
    domain.
  • CONTRIBUTIONS
  • Developed a semi-supervised time series
    regression approach for long-term forecasting by
    incorporating future data from model simulations
    (e.g., global climate models for impact
    assessments) with historical observations
  • Developed a covariance-preserving data
    calibration approach to align historical
    observations with model simulation data.
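
The poster does not spell out how its covariance-preserving calibration works, so the following is only a stand-in that illustrates the general idea of aligning two data sets through their second-order statistics. It applies a whitening-and-recoloring linear map so that the source data takes on the sample mean and covariance of the target data. The function name match_moments, its arguments, and the eps regularizer are assumptions made for this sketch, not names from the poster.

```python
import numpy as np

def match_moments(X_src, X_tgt, eps=1e-8):
    """Linearly map X_src so its sample mean and covariance match X_tgt.

    Hypothetical helper: a simple moment-matching transform, not the
    calibration procedure used by the poster's authors.
    """
    mu_s = X_src.mean(axis=0)
    mu_t = X_tgt.mean(axis=0)
    d = X_src.shape[1]
    # A small ridge term keeps the Cholesky factorizations well defined.
    cov_s = np.cov(X_src, rowvar=False) + eps * np.eye(d)
    cov_t = np.cov(X_tgt, rowvar=False) + eps * np.eye(d)
    L_s = np.linalg.cholesky(cov_s)
    L_t = np.linalg.cholesky(cov_t)
    # Whiten with inv(L_s), then recolor with L_t.
    A = L_t @ np.linalg.inv(L_s)
    return (X_src - mu_s) @ A.T + mu_t
```

In the poster's setting, X_src and X_tgt would be the model simulation data and the historical observations (in whichever direction the alignment is applied); the calibrated output keeps the number of rows of X_src.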

Semi-HMMR algorithm
Input: Historical data L = (Xl, Yl) and future unlabeled data Xu
Output: Future response Yu
Method:
  1. Train an initial HMMR model ?0 (?, A, ?, C) using the training data L.
  2. Perform local estimation of Yu.
  3. Perform global estimation of Yu using the current parameters in ?.
  4. Calculate the final estimation of Yu.
  5. Calculate the confidence of the predicted values in Yu.
  6. Combine the predicted values and confidences estimated in steps 4 and 5
     with the training data L to re-train the HMMR model ?' (?', A', ?', C).
  7. Repeat steps 3-6 until convergence (?' - ? << ?).
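
The control flow of the seven steps can be sketched as a self-training loop. This is not HMMR: to keep the example short and runnable, the base learner is swapped for weighted ridge regression, the local estimate is a k-nearest-neighbor average, the final estimate is the plain mean of the local and global estimates, and the confidence is exp(-|local - global|). All of these choices, and every function name below, are assumptions made for illustration only.

```python
import numpy as np

def fit_weighted_ridge(X, y, w, lam=1e-3):
    """Weighted ridge regression with an intercept (stand-in for HMMR training)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    A = Xb.T @ (w[:, None] * Xb) + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ (w * y))

def predict(theta, X):
    """Global estimate from the current parameters."""
    return np.hstack([X, np.ones((len(X), 1))]) @ theta

def knn_estimate(X_l, y_l, X_u, k=5):
    """Local estimate: mean response of the k nearest labeled points."""
    d2 = ((X_u[:, None, :] - X_l[None, :, :]) ** 2).sum(axis=2)
    idx = np.argsort(d2, axis=1)[:, :k]
    return y_l[idx].mean(axis=1)

def self_train(X_l, y_l, X_u, k=5, tol=1e-6, max_iter=50):
    theta = fit_weighted_ridge(X_l, y_l, np.ones(len(X_l)))  # step 1
    y_local = knn_estimate(X_l, y_l, X_u, k)                 # step 2
    X_all = np.vstack([X_l, X_u])
    for _ in range(max_iter):
        y_global = predict(theta, X_u)                       # step 3
        y_u = 0.5 * (y_local + y_global)                     # step 4 (assumed rule)
        conf = np.exp(-np.abs(y_local - y_global))           # step 5 (assumed rule)
        y_all = np.concatenate([y_l, y_u])
        w_all = np.concatenate([np.ones(len(X_l)), conf])
        theta_new = fit_weighted_ridge(X_all, y_all, w_all)  # step 6
        converged = np.linalg.norm(theta_new - theta) < tol  # step 7
        theta = theta_new
        if converged:
            break
    return y_u, conf, theta
```

Labeled points keep weight 1 while pseudo-labeled points are weighted by their confidence, so predictions on which the local and global estimates disagree have less influence on re-training.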
  • Value of Unlabeled Data in Regression
  • Assumptions
  • Model assumptions match well with underlying
    data.
  • Labeled and unlabeled data must be generated from
    the same distribution.
  • Experimental Evaluation
  • 1. Performance comparison in terms of average
    root mean square error (RMSE)
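
For reference, the metric can be computed as below. The poster does not say what the average is taken over, so this sketch assumes one RMSE per time series, averaged across series; the function names are illustrative.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    sq_err = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sq_err / len(y_true))

def average_rmse(pairs):
    """Mean of per-series RMSE values; pairs is a list of (y_true, y_pred)."""
    scores = [rmse(t, p) for t, p in pairs]
    return sum(scores) / len(scores)
```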

3. Application to statistical downscaling for future climate scenario
projections (60 randomly selected locations in North America)
4. Effect of covariance-preserving data calibration on semi-supervised HMMR
  • Conclusions
  • Unlabeled data (e.g., from model simulations) can
    be used in a semi-supervised learning framework
    to improve long-term time series forecasting.
  • Covariance-preserving data calibration helps
    improve semi-supervised learning by reducing the
    inconsistencies between historical observations
    and model simulation data

2. Value of unlabeled data (Y-axis: error rate; X-axis: labeled/unlabeled data)
Semi-supervised HMMR effectively utilizes the unlabeled data to improve its
prediction, especially when labeled data is scarce.
5. Effect of covariance-preserving data calibration on loss of neighborhood
information