Title: Estimating variable structure and dependence in multi-task learning via gradients
1
Estimating variable structure and dependence in
multi-task learning via gradients
  • By Justin Guinney, Qiang Wu and Sayan Mukherjee
  • Presented by John Paisley

2
Outline
  • Outline of presentation
  • General problem
  • Review of single-task solution
  • Extension to multi-task
  • Experiments

3
General problem
  • Have a small number of high-dimensional data points, x,
    with corresponding response variables, y (fully
    supervised)
  • Want to simultaneously build a classification or
    regression function and learn the important features,
    as well as the correlations between features (to know
    whether two features are important in the same way)
  • Xuejun presented their single-task solution; this
    paper extends it to the multi-task setting.

4
Single-Task Solution (classification)
  • By a first-order Taylor expansion, estimate the
    classification function locally as
    f(x) ≈ f(x0) + ∇f(x0) · (x − x0)
  • Seek to minimize the expected error of this local
    approximation (a sketch of the objective follows)
  • Here w is a weight function that localizes the Taylor
    expansion (e.g., a Gaussian weight on the distance
    between points) and phi is a convex loss function
  • To solve this, regularize f and its gradient estimate
    in an RKHS
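  A sketch of the regularized objective, assuming the standard
  single-task learning-gradients formulation (the slide's equations
  were not transcribed, so the notation below is reconstructed):

    % f is the classifier, \vec{f} models its gradient, w^s_{i,j} is the
    % (e.g., Gaussian) localizing weight, and \phi is the convex loss
    \min_{f,\,\vec{f} \in \mathcal{H}_K}\;
      \frac{1}{n^2} \sum_{i,j=1}^{n} w^{s}_{i,j}\,
        \phi\!\left( y_j \left[ f(x_i) + \vec{f}(x_i) \cdot (x_j - x_i) \right] \right)
      \;+\; \lambda_1 \lVert f \rVert_K^2
      \;+\; \lambda_2 \lVert \vec{f} \rVert_K^2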

5
Single-Task (regression)
  • Use the observed response variable y_i in place of f(x_i)
    for each input, so only the gradient needs to be learned
    (a sketch of the resulting objective follows).
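  A sketch of the regression objective under the same learning-gradients
  setup (notation reconstructed, not transcribed from the slide):

    % only the gradient estimate \vec{f} is learned; the responses y_i
    % stand in for the unknown function values
    \min_{\vec{f} \in \mathcal{H}_K}\;
      \frac{1}{n^2} \sum_{i,j=1}^{n} w^{s}_{i,j}
        \left( y_i - y_j + \vec{f}(x_i) \cdot (x_j - x_i) \right)^2
      \;+\; \lambda \lVert \vec{f} \rVert_K^2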

6
Single-Task (solution and value of interest)
  • By the representer theorem, the solution is a kernel
    expansion over the training points (see the sketch below)
  • The gradient outer product (GOP) is the matrix that carries
    all of the feature information. It is approximated from the
    estimated gradient evaluated at the training points
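  A sketch of the solution form and the GOP estimate, assuming the
  standard learning-gradients notation (not transcribed from the slide):

    % representer-theorem form of the gradient estimate
    \vec{f}(x) = \sum_{i=1}^{n} c_i \, K(x, x_i), \qquad c_i \in \mathbb{R}^p
    % empirical gradient outer product approximating E[ \nabla f(x) \nabla f(x)^T ]
    \hat{\Gamma} = \frac{1}{n} \sum_{i=1}^{n} \vec{f}(x_i) \, \vec{f}(x_i)^{\top}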

[Slide content comparing this paper, Xuejun's paper, and Matlab not transcribed.]
7
GOP
  • This matrix is central to their paper because it carries all
    the information about the importance of each feature. The
    diagonal can be used to rank each feature's importance, and
    the off-diagonal entries tell how features are correlated
    (so if two features are important in the same way, only one
    need be selected; a small numerical sketch follows this
    slide's notes).
  • My confusion
  • I take this to mean that [equation not transcribed], which
    would resolve the previous page
  • However, constructing a discrete Gaussian kernel in Matlab,
    this isn't true (and it makes no sense to me why it should
    be true).
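  A minimal numerical sketch (assumed, not from the authors' code) of
  how the GOP is formed from gradient estimates and how its diagonal
  and off-diagonal entries are used:

    import numpy as np

    # Toy stand-in for the RKHS gradient estimates: grads[i] plays the
    # role of the estimated gradient at training point x_i.
    rng = np.random.default_rng(0)
    n, p = 50, 5
    grads = rng.normal(size=(n, p))

    # Empirical gradient outer product: (1/n) * sum_i grad_i grad_i^T
    gop = grads.T @ grads / n

    # Diagonal entries rank individual feature importance.
    importance = np.diag(gop)
    ranking = np.argsort(importance)[::-1]

    # Off-diagonal entries, normalized to correlations, indicate when two
    # features carry the same information (so only one need be selected).
    d = np.sqrt(importance)
    feature_corr = gop / np.outer(d, d)

    print("feature ranking (most to least important):", ranking)
    print("feature correlation matrix:\n", np.round(feature_corr, 2))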

8
Extension to multi-task
  • A very natural extension: assume a shared base function and
    a task-specific correction for each task (a sketch follows
    this list).
  • Classification: RKHS regularization of the shared and
    task-specific parts
  • Regression: RKHS regularization of the shared and
    task-specific parts
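  A sketch of the regression form, assuming the decomposition stated
  above (a shared gradient \vec{f}_0 plus a task-specific correction
  \vec{g}_t for each task; the slide's exact penalties were not
  transcribed):

    \min_{\vec{f}_0,\,\{\vec{g}_t\}}\;
      \sum_{t=1}^{T} \frac{1}{n_t^2} \sum_{i,j=1}^{n_t} w^{s}_{i,j}
        \left( y^t_i - y^t_j
          + \bigl[ \vec{f}_0(x^t_i) + \vec{g}_t(x^t_i) \bigr] \cdot (x^t_j - x^t_i)
        \right)^2
      \;+\; \lambda_0 \lVert \vec{f}_0 \rVert_K^2
      \;+\; \lambda_1 \sum_{t=1}^{T} \lVert \vec{g}_t \rVert_K^2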

9
Experiments