Slide 1: Smooth ε-Insensitive Regression by Loss Symmetrization
- Ofer Dekel, Shai Shalev-Shwartz, Yoram Singer
- School of Computer Science and Engineering
- The Hebrew University
- oferd,shais,singer_at_cs.huji.ac.il
- COLT 2003: The Sixteenth Annual Conference on Learning Theory
Slide 2: Before We Begin
- Linear regression: given $\{(\mathbf{x}_i, y_i)\}_{i=1}^{m}$ with $\mathbf{x}_i \in \mathbb{R}^n$ and $y_i \in \mathbb{R}$, find $\boldsymbol{\lambda} \in \mathbb{R}^n$ such that $\boldsymbol{\lambda} \cdot \mathbf{x}_i \approx y_i$
- Least Squares: minimize $\sum_{i=1}^{m} (\boldsymbol{\lambda} \cdot \mathbf{x}_i - y_i)^2$
- Support Vector Regression: minimize $\tfrac{1}{2}\|\boldsymbol{\lambda}\|^2 + C \sum_{i=1}^{m} (\xi_i + \xi_i^*)$ s.t. $\boldsymbol{\lambda} \cdot \mathbf{x}_i - y_i \le \epsilon + \xi_i$, $\; y_i - \boldsymbol{\lambda} \cdot \mathbf{x}_i \le \epsilon + \xi_i^*$, $\; \xi_i, \xi_i^* \ge 0$ (a small comparison of the two loss terms is sketched below)
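As a quick, hypothetical comparison (a made-up one-dimensional example; the ε-insensitive loss here is the standard non-smooth SVR loss, not yet the smooth variant this talk constructs):

```python
import numpy as np

# Made-up 1-D data and a candidate weight, only to compare the two loss terms.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 0.9, 2.2, 2.8, 4.3])
lam, eps = 1.0, 0.2

delta = lam * x - y                                              # discrepancies
squared = np.sum(delta ** 2)                                     # least-squares term
eps_insensitive = np.sum(np.maximum(np.abs(delta) - eps, 0.0))   # SVR loss term
print(squared, eps_insensitive)
```

The ε-insensitive loss vanishes inside the tube of width ε but is not differentiable at the tube's boundary; the symmetrized losses on the next slides keep the insensitivity while being smooth.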
Slide 3: Loss Symmetrization
- Loss functions used in classification and Boosting, e.g. the exp-loss and the log-loss
- Symmetric versions of these losses can be used for regression (written out below)
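For concreteness, writing the discrepancy as $\delta = \boldsymbol{\lambda} \cdot \mathbf{x} - y$ and the insensitivity parameter as $\epsilon$, the symmetrized versions of the two classification losses can be written as follows (this is my rendering of the construction; notation may differ slightly from the slides):

```latex
% Symmetrized classification losses applied to the regression discrepancy delta
\[
  L_{\mathrm{exp}}(\delta) \;=\; e^{\,\delta - \epsilon} \;+\; e^{-\delta - \epsilon},
  \qquad
  L_{\mathrm{log}}(\delta) \;=\; \log\!\bigl(1 + e^{\,\delta - \epsilon}\bigr)
                          \;+\; \log\!\bigl(1 + e^{-\delta - \epsilon}\bigr).
\]
```

Each loss is a sum of two mirrored classification losses, one penalizing predictions that are too high and one penalizing predictions that are too low, which is what makes the construction symmetric.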
Slide 4: A General Reduction
- Begin with a regression training set $\{(\mathbf{x}_i, y_i)\}_{i=1}^{m}$, where $\mathbf{x}_i \in \mathbb{R}^n$ and $y_i \in \mathbb{R}$
- Generate 2m classification training examples of dimension n+1
- Learn an augmented weight vector in $\mathbb{R}^{n+1}$ while maintaining its last coordinate fixed
- Do so by minimizing a margin-based classification loss (one concrete construction is sketched below)
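A minimal sketch of one way to realize the reduction. The exact signs and shifts are my assumption, chosen so that the two classification log-losses add up to the symmetric log-loss on the discrepancy; the talk's construction may use a different but equivalent convention:

```python
import numpy as np

def reduce_to_classification(X, y, eps):
    """Turn m regression examples (X, y) into 2m classification examples of
    dimension n+1.  Assumed convention: the augmented weight vector is
    (lambda, -1), so its inner product with (x_i, t) equals lambda.x_i - t."""
    m, n = X.shape
    Z = np.zeros((2 * m, n + 1))
    labels = np.zeros(2 * m)
    # Example penalizing over-prediction: target shifted up by eps, label -1.
    Z[:m, :n] = X
    Z[:m, n] = y + eps
    labels[:m] = -1.0
    # Example penalizing under-prediction: target shifted down by eps, label +1.
    Z[m:, :n] = X
    Z[m:, n] = y - eps
    labels[m:] = +1.0
    return Z, labels

def symmetric_log_loss(lam, X, y, eps):
    """Symmetric log-loss; equals the sum of classification log-losses
    log(1 + exp(-label * w.z)) with w = (lam, -1) over the 2m examples above."""
    delta = X @ lam - y
    return np.sum(np.log1p(np.exp(delta - eps)) + np.log1p(np.exp(-delta - eps)))
```

With this convention, any classifier that minimizes the log-loss on (Z, labels) while pinning the last weight to -1 is minimizing the symmetric log-loss of the original regression problem.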
Slide 5: A Batch Algorithm
- An illustration of a single batch iteration
- Simplifying assumptions (just for the demo): instances are one-dimensional, ε is held fixed, and the symmetric log-loss is used
Slide 6: A Batch Algorithm
Calculate discrepancies and weights
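A minimal sketch of this step for the symmetric log-loss. The weight formulas are simply the derivatives of the two loss terms; the variable names are mine:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def discrepancies_and_weights(lam, X, y, eps):
    """Per-example discrepancies and the two example weights obtained by
    differentiating log(1 + exp(delta - eps)) + log(1 + exp(-delta - eps))."""
    delta = X @ lam - y              # discrepancies
    w_plus = sigmoid(delta - eps)    # weight of the "predicted too high" term
    w_minus = sigmoid(-delta - eps)  # weight of the "predicted too low" term
    return delta, w_plus, w_minus
```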
Slide 7: A Batch Algorithm
Cumulative weights
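Continuing the sketch (again with my own naming), the per-feature cumulative weights aggregate the example weights through the feature values:

```python
def cumulative_weights(X, w_plus, w_minus):
    """Per-feature sums of the example weights, weighted by feature values.
    Assuming non-negative features, W_plus[j] collects the pull toward
    decreasing lambda_j and W_minus[j] the pull toward increasing it."""
    W_plus = X.T @ w_plus
    W_minus = X.T @ w_minus
    return W_plus, W_minus
```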
Slide 8: Two Batch Algorithms
Update the regressor
- Log-Additive update (sketched below)
- Additive update
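A minimal sketch of one full iteration with the log-additive update, reusing the helpers from the two previous sketches. It assumes non-negative features with each row of X summing to at most one (the usual boosting-style normalization); the step form and sign convention follow my reading, so treat it as illustrative rather than the talk's precise rule:

```python
import numpy as np

def log_additive_update(lam, X, y, eps):
    """One batch iteration: discrepancies -> example weights -> per-feature
    cumulative weights -> parallel update of every coordinate of lambda by
    half the log-ratio of the cumulative weights."""
    delta, w_plus, w_minus = discrepancies_and_weights(lam, X, y, eps)
    W_plus, W_minus = cumulative_weights(X, w_plus, w_minus)
    # The step shrinks whichever side currently dominates and is zero exactly
    # when W_plus == W_minus, i.e. when that gradient coordinate vanishes.
    tiny = 1e-12  # numerical guard for the demo
    return lam + 0.5 * np.log((W_minus + tiny) / (W_plus + tiny))
```

The additive update, roughly, replaces the log-ratio step with a plain gradient-style step proportional to W_minus − W_plus.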
Slide 9: Progress Bounds
- Theorem (Log-Additive update): a lower bound on the decrease in loss obtained at each iteration
- Theorem (Additive update): an analogous lower bound for the additive update
- Lemma: both bounds are non-negative and equal zero only at the optimum
Slide 10: Boosting Regularization
- A new form of regularization for regression and classification Boosting
- Can be implemented by adding pseudo-examples (one possible construction is sketched below)
- Communicated by Rob Schapire
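One plausible way to realize the pseudo-example idea (this particular construction is my assumption, not necessarily the one communicated in the talk): for every coordinate j, add an example supported only on feature j with target 0, so that its symmetric loss term depends on λ_j alone and pulls it toward zero.

```python
import numpy as np

def add_regularization_pseudo_examples(X, y, nu):
    """Append n pseudo-examples: for each feature j, one example nu*e_j with
    target 0.  Its symmetric loss term is a function of lambda_j alone and is
    minimized at lambda_j = 0, so it pulls every coordinate toward zero.
    The construction and the strength parameter nu are assumptions made
    for illustration only."""
    m, n = X.shape
    X_reg = np.vstack([X, nu * np.eye(n)])
    y_reg = np.concatenate([y, np.zeros(n)])
    return X_reg, y_reg
```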
Slide 11: Regularization Contd.
- Regularization ⇒ compactness of the feasible set for λ
- Regularization ⇒ a unique attainable optimizer of the loss function

Proof of Convergence
- Progress + compactness + uniqueness ⇒ asymptotic convergence to the optimum
Slide 12: Exp-loss vs. Log-loss
[Plot comparing the symmetric Log-loss and Exp-loss]
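With the definitions given earlier (my notation), the practical difference is the growth rate in the discrepancy: the log-loss grows roughly linearly in |δ| while the exp-loss grows exponentially, making the exp-loss far more sensitive to outliers.

```latex
% Asymptotic behaviour of the two symmetric losses for large discrepancies
\[
  L_{\mathrm{log}}(\delta) \;\approx\; |\delta| - \epsilon,
  \qquad
  L_{\mathrm{exp}}(\delta) \;\approx\; e^{\,|\delta| - \epsilon},
  \qquad \text{for } |\delta| \gg \epsilon .
\]
```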
Slide 13: Extensions
- Parallel vs. Sequential updates
- Parallel: update all elements of λ in parallel
- Sequential: update the weight of a single weak regressor on each round (like classic boosting); a generic sketch follows below
- Another loss function: the Combined Loss
[Plot comparing the Log-loss, Exp-loss, and Comb-loss]
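A generic sketch of a sequential (single-coordinate) round, reusing the helpers from the batch sketches above. The feature-selection criterion used here (largest value of a boosting-style progress proxy) is my assumption, not necessarily the talk's exact rule:

```python
import numpy as np

def sequential_round(lam, X, y, eps):
    """Update a single coordinate of lambda per round.  The coordinate is
    chosen to maximize (sqrt(W_plus) - sqrt(W_minus))^2, an assumed progress
    proxy borrowed from boosting-style progress bounds."""
    _, w_plus, w_minus = discrepancies_and_weights(lam, X, y, eps)
    W_plus, W_minus = cumulative_weights(X, w_plus, w_minus)
    progress = (np.sqrt(W_plus) - np.sqrt(W_minus)) ** 2
    j = int(np.argmax(progress))              # pick the single best feature
    lam = lam.copy()
    lam[j] += 0.5 * np.log((W_minus[j] + 1e-12) / (W_plus[j] + 1e-12))
    return lam
```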
Slide 14: On-line Algorithms
- GD (gradient descent) and EG (exponentiated gradient) online algorithms for the Log-loss (a GD sketch follows)
- Relative loss bounds
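A minimal sketch of the online GD variant on the symmetric log-loss, assuming one example is revealed per round; the learning rate eta and its fixed schedule are my placeholders:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def online_gd(stream, n, eps, eta=0.1):
    """Online gradient descent on the symmetric log-loss.  `stream` yields
    (x_t, y_t) pairs; each round, lambda takes a step against the gradient
    of the per-round loss log(1+exp(d-eps)) + log(1+exp(-d-eps))."""
    lam = np.zeros(n)
    for x_t, y_t in stream:
        d = lam @ x_t - y_t                              # discrepancy this round
        grad = (sigmoid(d - eps) - sigmoid(-d - eps)) * x_t
        lam -= eta * grad                                # GD step
    return lam
```

An EG variant would instead update each coordinate multiplicatively, roughly lam_j ∝ lam_j · exp(-eta · grad_j) followed by normalization, which is the setting in which the relative loss bounds are stated.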
Future Directions
- Regression tree learning
- Solving one-class and various ranking problems using similar constructions
- Regression generalization bounds based on natural regularization