Title: Regression
1. Regression
- CS294 Practical Machine Learning
- Romain Thibaux
- 02/07
2. Regression
- A new perspective on freedom
3. Outline
- Nearest Neighbor, Kernel Regression
- Linear regression
- Derivation from minimizing the sum of squares
- Probabilistic interpretation
- Online version (LMS)
- Overfitting and Regularization
- L1 Regression
- Spline Regression
4. Where are we?
5. Classification

6. [Figure: labeled examples, Cat vs. Dog]

7. [Figure: the same examples on Cleanliness vs. Size axes]

8. [Figure: a query point marked "?"]
9. Regression

10. [Figure: Price (y) against Top speed (x)]
11. Regression
Data: (x_1, y_1), ..., (x_n, y_n). Goal: given a new x, predict y, i.e. find a prediction function f with y ≈ f(x).
12. Examples
- Voltage → Temperature
- Processes, memory → Power consumption
- Protein structure → Energy
- Robot arm controls → Torque at effector
- Location, industry, past losses → Premium
13. (section divider)

14. Part 1: Nearest Neighbor, Kernel Regression
15. Nearest neighbor
[Figure: nearest-neighbor fit on 1-D data]
16. Voronoi Diagram
(Wikipedia)

17. Voronoi Diagram
http://www.qhull.org/html/qvoronoi.htm
18. Nearest neighbor
- To predict at x:
  - Find the training point x_i closest to x
  - Predict y = y_i
- Pros: no training
- Cons: finding the closest point can be expensive; prone to overfitting
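The rule on this slide fits in a few lines. A minimal NumPy sketch (the toy 1-D data is made up for illustration):

```python
import numpy as np

def nearest_neighbor_predict(X_train, y_train, x):
    """Predict at x using the label of the closest training point."""
    dists = np.sum((X_train - x) ** 2, axis=1)  # squared distance to each point
    return y_train[np.argmin(dists)]

# Toy 1-D data
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([0.0, 1.0, 4.0, 9.0])
print(nearest_neighbor_predict(X_train, y_train, np.array([1.9])))  # closest point is x=2, so 4.0
```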
19. Kernel Regression
- To predict at x:
  - Give each data point x_i weight w_i = K(x, x_i), e.g. a Gaussian kernel K(x, x_i) = exp(-||x - x_i||^2 / (2 sigma^2))
  - Normalize the weights: w_i := w_i / sum_j w_j
  - Let y-hat = sum_i w_i y_i
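A minimal sketch of the weighted-average rule above (Nadaraya-Watson form, assuming a Gaussian kernel; the data is illustrative):

```python
import numpy as np

def kernel_regression(X_train, y_train, x, sigma=1.0):
    """Predict at x as a kernel-weighted average of training labels."""
    # Gaussian kernel weight for each training point
    w = np.exp(-np.sum((X_train - x) ** 2, axis=1) / (2 * sigma ** 2))
    w = w / w.sum()            # normalize the weights
    return float(w @ y_train)  # y-hat = sum_i w_i y_i

X_train = np.array([[0.0], [1.0], [2.0]])
y_train = np.array([0.0, 1.0, 4.0])
```

A narrow kernel behaves like nearest neighbor; a very wide one averages all the labels.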
20. Kernel Regression
[Figure: kernel-regression fit on the same 1-D data; Matlab demo]
21. Kernel Regression
- Pros: no training; smooth prediction
- Cons: slower than nearest neighbor; must choose the kernel width sigma
22. Part 2: Linear regression
23. Linear regression
24. Linear regression
[Figure: plane fit to 3-D Temperature data; start Matlab demo lecture2.m]
25. Linear regression

26. Linear Regression
- Prediction: y-hat_i = w^T x_i
- Error or residual: e_i = y_i - y-hat_i (observation minus prediction)
- Sum squared error: sum_i (y_i - w^T x_i)^2
27-30. Learning as Optimization
Find the weights w that minimize the sum squared error J(w) = sum_i (y_i - w^T x_i)^2.
31. Linear Regression
With X the n-by-d matrix of inputs and y the vector of outputs, solve the normal equations X^T X w = X^T y (it's better to solve the system than to invert the matrix).
32. Minimize the sum squared error
- Sum squared error: J(w) = ||Xw - y||^2
- Setting the gradient to zero gives a linear equation: 2 X^T (Xw - y) = 0
- i.e. the linear system X^T X w = X^T y
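As a sketch, the solve the slides recommend, using np.linalg.solve on the normal equations rather than forming an inverse (the toy data is made up):

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares weights: solve (X^T X) w = X^T y, no explicit inverse."""
    return np.linalg.solve(X.T @ X, X.T @ y)

# Noise-free data from y = 3x, so the fit recovers the slope exactly
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([3.0, 6.0, 9.0])
w = fit_linear(X, y)
```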
33. LMS Algorithm (Least Mean Squares)
Update after each example: w <- w + alpha (y_i - w^T x_i) x_i, where alpha is the step size. Online algorithm.
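A minimal sketch of the online update, assuming a fixed step size alpha (data and parameter values are illustrative):

```python
import numpy as np

def lms(X, y, alpha=0.01, epochs=200):
    """Least Mean Squares: one small gradient step per example,
    w <- w + alpha * (y_i - w.x_i) * x_i  (an online algorithm)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            w = w + alpha * (y_i - w @ x_i) * x_i
    return w

# Noise-free data from y = 2x; LMS converges to slope 2
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
w = lms(X, y)
```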
34. Online Learning
35. Beyond lines and planes
Replace x with features Phi(x), e.g. Phi(x) = (1, x, x^2, ...): the model is still linear in w, and everything is the same with X built from the Phi(x_i).
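A sketch of the idea: map x to polynomial features, then run the same linear regression on those features (the quadratic data is made up for illustration):

```python
import numpy as np

def poly_features(x, degree):
    """Columns (1, x, x^2, ..., x^degree): nonlinear in x, still linear in w."""
    return np.vander(x, degree + 1, increasing=True)

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1 + 2 * x + 3 * x ** 2                   # exactly quadratic
Phi = poly_features(x, 2)
w = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)  # same normal equations as before
```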
36. Geometric interpretation
[Figure: 3-D view of the fit; Matlab demo]
37. Linear Regression summary
Given examples (x_i, y_i), i = 1..n. Let X be the n-by-d matrix with rows x_i^T (for example, rows Phi(x_i)^T) and y the vector of outputs. Minimize ||Xw - y||^2 by solving X^T X w = X^T y. Predict y-hat = w^T x.
38. Probabilistic interpretation
Model y_i = w^T x_i + epsilon_i with Gaussian noise; the likelihood of the data is maximized by the least-squares weights.
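The step connecting the likelihood to least squares, under the standard assumption of i.i.d. Gaussian noise with fixed variance:

```latex
y_i = w^\top x_i + \epsilon_i, \qquad \epsilon_i \sim \mathcal{N}(0, \sigma^2)

\log p(y \mid X, w)
  = \sum_{i=1}^n \log \mathcal{N}(y_i \mid w^\top x_i, \sigma^2)
  = -\frac{1}{2\sigma^2} \sum_{i=1}^n \left(y_i - w^\top x_i\right)^2 + \text{const}
```

So maximizing the likelihood over w is exactly minimizing the sum squared error.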
39. Assumptions vs. Reality
[Figure: Voltage vs. Temperature, Intel sensor network data]

40. Assumptions vs. Reality
[Figure: requests per minute over two days]
41. Overfitting
[Figure: high-degree polynomial fit oscillating between the data points; Matlab demo]
42. Ridge Regression (Regularization)
Minimize ||Xw - y||^2 + lambda ||w||^2 with small lambda > 0.
[Figure: effect of regularization on a degree-19 fit]
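A sketch of the regularized solve: the same normal equations with lambda * I added (the data and lambda values here are illustrative):

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Ridge regression: minimize ||Xw - y||^2 + lam * ||w||^2,
    i.e. solve (X^T X + lam * I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 4.0, 7.0])
w_small = fit_ridge(X, y, 1e-6)  # ~ ordinary least squares
w_large = fit_ridge(X, y, 10.0)  # weights shrunk toward zero
```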
43. Probabilistic interpretation
With a Gaussian prior on w, the posterior (proportional to likelihood times prior) is maximized by the ridge-regression weights.
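The corresponding step for ridge, assuming a zero-mean Gaussian prior on the weights:

```latex
w \sim \mathcal{N}(0, \tau^2 I)

\log p(w \mid X, y)
  = \log p(y \mid X, w) + \log p(w) + \text{const}
  = -\frac{1}{2\sigma^2} \sum_{i=1}^n \left(y_i - w^\top x_i\right)^2
    - \frac{1}{2\tau^2} \|w\|^2 + \text{const}
```

Maximizing the posterior is ridge regression with lambda = sigma^2 / tau^2.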
44. Part 3: Locally Linear Regression
45. Locally Linear Regression
46. Global temperature increase
[Figure: global temperature anomaly, 1840-2020]
source: http://www.cru.uea.ac.uk/cru/data/temperature
47. Locally Linear Regression
- To predict at x:
  - Give each data point x_i weight w_i = K(x, x_i), e.g. a Gaussian kernel
  - Let W = diag(w_1, ..., w_n)
  - Fit a linear model weighted by W (next slide)
48. Locally Linear Regression
To predict at x: minimize sum_i w_i (y_i - beta^T x_i)^2 where w_i = K(x, x_i); solve the weighted system X^T W X beta = X^T W y; predict y-hat = beta^T x.
- Good even at the boundary (more important in high dimension)
- Must solve a linear system for each new prediction
- Must choose the kernel width sigma
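A sketch of one locally linear prediction with Gaussian weights. It assumes X already contains a constant feature if an intercept is wanted; the data is illustrative:

```python
import numpy as np

def locally_linear_predict(X, y, x, sigma=1.0):
    """Fit a weighted linear model around the query x, then predict there.
    One linear solve per query, as the slide notes."""
    w = np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * sigma ** 2))  # kernel weights
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)  # weighted normal equations
    return float(beta @ x)

# Data exactly on y = 1 + 2t, with a constant feature in the first column
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
```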
49. Locally Linear Regression, Gaussian kernel
[Figure: smoothed temperature fit]
source: http://www.cru.uea.ac.uk/cru/data/temperature

50. Locally Linear Regression, Laplacian kernel
[Figure: smoothed temperature fit]
source: http://www.cru.uea.ac.uk/cru/data/temperature
51. Part 4: L1 Regression
52. L1 Regression
53. Sensitivity to outliers
Squared error gives high weight to outliers: a point's influence grows with its residual.
[Figure: influence function]
54. L1 Regression
Minimize sum_i |y_i - w^T x_i| instead; this can be solved as a linear program, and each point's influence is bounded.
[Figure: influence function]
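The slide solves the L1 problem exactly as a linear program; a common numerical stand-in that is easy to sketch is iteratively reweighted least squares (each point weighted by 1/|residual|, so outliers count less). This is an approximation, not the LP:

```python
import numpy as np

def l1_regression(X, y, iters=50, eps=1e-8):
    """Approximate argmin_w sum_i |y_i - w.x_i| by reweighted least squares."""
    w = np.linalg.solve(X.T @ X, X.T @ y)  # start from the least-squares fit
    for _ in range(iters):
        r = np.abs(y - X @ w) + eps        # residual magnitudes (eps avoids /0)
        Wt = np.diag(1.0 / r)
        w = np.linalg.solve(X.T @ Wt @ X, X.T @ Wt @ y)
    return w

# Line y = 2x plus one gross outlier: the L1 fit stays near slope 2,
# while plain least squares is pulled toward the outlier
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 100.0])
w_l1 = l1_regression(X, y)
w_ls = np.linalg.solve(X.T @ X, X.T @ y)
```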
55. Spline Regression: regression on each interval
[Figure: piecewise fits over intervals of the input range]
56. Spline Regression: with equality constraints
57. Spline Regression: with L1 cost
58. Summary
- Nearest Neighbors
- Kernel Regression
- Locally Linear Regression / Spline Regression
- Linear Regression
- Preventing overfitting: regularization
- Robustness to outliers: L1 regression
59. To learn more
- The Elements of Statistical Learning, Hastie,
Tibshirani, Friedman, Springer
60. Further topics
- Feature Selection: future lecture
- Generalized Linear Models
- Gaussian process regression