1
760 Lecture 4
  • Today's lecture:

Modern Regression
2
Today's Agenda
  • Give you a brief overview of
  • Modern regression models
  • PPR
  • Neural nets
  • Regression trees
  • MARS

3
Modern Regression
  • Topics
  • Smoothing
  • Loess
  • Splines
  • Curse of dimensionality
  • Regression models in high dimensions
  • PPR
  • Neural nets
  • Trees
  • MARS (not discussed)

4
Smoothing
  • Common methods
  • Loess (see VR p 230, also 20x)
  • Smoothing Splines (see VR p 230, and below)
  • Regression splines (see VR p 228)
  • Supersmoother (see VR p 231)

5
Loess
  • Review 20x
  • At each data point, fit a weighted regression,
    with weights given by a kernel
  • Regression can be quadratic or linear
  • Use the regression to predict the fitted value
    at that point
  • Repeat for every point (or a subset, if there
    are too many); a minimal R sketch follows
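
A minimal sketch in R (my own illustration, using the built-in cars data; the slide itself gives no code):

  # local quadratic regression; span sets the kernel width
  fit <- loess(dist ~ speed, data = cars, span = 0.75, degree = 2)
  plot(cars$speed, cars$dist)
  lines(cars$speed, predict(fit))   # fitted value at each data point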

6
Smoothing splines
  • Data (x_i, y_i), i = 1, ..., n, with x_i
    low-dimensional
  • Choose the function f to minimise (for fixed λ)
    Σ_i (y_i − f(x_i))² + λ ∫ f''(t)² dt
  • The first term measures closeness to the data,
    the second measures smoothness, and λ controls
    the tradeoff between smoothness and closeness
    to data
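
In R this criterion is implemented by smooth.spline(); a brief sketch (my own, since the slide shows no code):

  # by default lambda is chosen by generalised cross-validation;
  # pass lambda = ... to fix the tradeoff by hand, or cv = TRUE
  # for ordinary leave-one-out cross-validation
  fit <- smooth.spline(cars$speed, cars$dist)
  fit$lambda                  # the chosen smoothing parameter
  plot(cars$speed, cars$dist)
  lines(fit)                  # the fitted object holds (x, y) pairs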
7
Cross-validation
  • The parameter λ is chosen by cross-validation
  • Divide the data into two sets, a training set
    and a test set
  • For fixed λ, find the function f_λ that
    minimises the criterion on the training set
  • Calculate Σ (y − f_λ(x))², summed over the test
    set
  • Repeat and average
  • The result is an estimate of the error Error(λ)
    we get if we use f_λ to model new data
  • Choose λ to minimise this error
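
A sketch of one training/test split over a grid of λ values (my own illustration; "Repeat and average" would wrap this in a further loop over random splits):

  set.seed(1)
  test <- sample(nrow(cars), nrow(cars) / 2)   # hold out half as the test set
  lambdas <- 10^seq(-4, 0, by = 0.5)
  err <- sapply(lambdas, function(l) {
    f <- smooth.spline(cars$speed[-test], cars$dist[-test], lambda = l)
    mean((cars$dist[test] - predict(f, cars$speed[test])$y)^2)
  })
  lambdas[which.min(err)]                      # λ with the smallest test error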

8
Regression splines
  • Piecewise cubics
  • Choose knots, e.g. equally spaced
  • Generate a spline basis in R
  • Fit using linear regression

9
Example
  • Average height and weight of American women
    aged 30-39
  • > women
  • height weight
  • 1 58 115
  • 2 59 117
  • 3 60 120
  • 4 61 123
  • 5 62 126
  • 6 63 129
  • 7 64 132
  • 8 65 135
  • 9 66 139
  • 10 67 142
  • 11 68 146
  • 12 69 150
  • 13 70 154
  • 14 71 159
  • 15 72 164

10
R code
  • > library(splines)   # bs() lives here
  • > round(bs(women$height, df = 5), 5)
  •              1       2       3       4       5
  •   [1,] 0.00000 0.00000 0.00000 0.00000 0.00000
  •   [2,] 0.45344 0.05986 0.00164 0.00000 0.00000
  •   [3,] 0.59694 0.20335 0.01312 0.00000 0.00000
  •   [4,] 0.53380 0.37637 0.04428 0.00000 0.00000
  •   [5,] 0.36735 0.52478 0.10496 0.00000 0.00000
  •   [6,] 0.20016 0.59503 0.20472 0.00009 0.00000
  •   [7,] 0.09111 0.56633 0.33673 0.00583 0.00000
  •   [8,] 0.03125 0.46875 0.46875 0.03125 0.00000
  •   [9,] 0.00583 0.33673 0.56633 0.09111 0.00000
  •  [10,] 0.00009 0.20472 0.59503 0.20016 0.00000
  •  [11,] 0.00000 0.10496 0.52478 0.36735 0.00292
  •  [12,] 0.00000 0.04428 0.37637 0.53380 0.04555
  •  [13,] 0.00000 0.01312 0.20335 0.59694 0.18659
  •  [14,] 0.00000 0.00164 0.05986 0.45344 0.48506
  •  [15,] 0.00000 0.00000 0.00000 0.00000 1.00000
  • > reg.spline <- lm(weight ~ bs(height, df = 5), data = women)
  • > plot(women$height, women$weight)
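
To overlay the fitted spline on that plot, one further line (my addition, not on the original slide) suffices:

  lines(women$height, fitted(reg.spline))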

11
[Figure: scatterplot of weight against height for the women data]
12
The regression problem
  • The data follow a model
    y = f(x_1, ..., x_K) + error
  • where f is an unknown smooth function and K is
    possibly large
  • We want to find an automatic estimate of f, to
    use for prediction
  • If K is small (1 or 2) we can use a smoother, but
    what if K is large?

13
Curse of dimensionality
  • Consider n points scattered at random in a
    K-dimensional unit sphere
  • Let D be the distance from the centre of the
    sphere to the closest point

[Figure: D, the distance from the centre to the nearest point, shown for K = 1 (an interval) and K = 2 (a disc)]
14
Curse of dimensionality (cont)
  • Median of the distribution of D:
    median(D) = (1 - (1/2)^(1/n))^(1/K)
  • For n = 100 points in K = 10 dimensions this is
    already about 0.61: even the closest point is
    more than halfway to the boundary
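
A quick simulation (my own sketch) agrees with the formula:

  median_D <- function(n, K) (1 - 0.5^(1 / n))^(1 / K)   # theoretical median

  sim_D <- function(n, K) {
    z <- matrix(rnorm(n * K), n, K)                 # random directions
    x <- z / sqrt(rowSums(z^2)) * runif(n)^(1 / K)  # n uniform points in the unit ball
    min(sqrt(rowSums(x^2)))                         # distance to the closest point
  }
  median(replicate(2000, sim_D(100, 10)))           # about 0.61 = median_D(100, 10)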

15
Moral
  • Smoothing doesn't work in high dimensions:
    points are too far apart
  • Solution: pick an estimate of f from a class of
    functions that is flexible enough to match f
    reasonably closely
  • We need to be able to compute the estimate easily

16
Familiar classes of functions
  • Linear functions
  • Additive functions
  • fitted by smoothing in one or two dimensions
    using the backfitting algorithm
  • These are easy to fit but not flexible enough
  • We need something better; we will look at some
    other classes of functions
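
For the additive class, a brief sketch (my own; the lecture names no software) using mgcv::gam, which fits a sum of smooth one-dimensional terms (by penalised splines rather than literal backfitting):

  library(mgcv)
  fit <- gam(Ozone ~ s(Wind) + s(Temp), data = airquality)
  plot(fit, pages = 1)   # one panel per fitted smooth term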

17
Projection pursuit
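
PPR models f(x) as a sum of ridge functions, f(x) = Σ_m g_m(α_m'x), each a smooth function of a one-dimensional projection of x. A minimal sketch (my own, on made-up data) with R's built-in ppr():

  set.seed(1)
  x1 <- runif(200); x2 <- runif(200)
  y <- sin(3 * (x1 + x2)) + rnorm(200, sd = 0.1)   # a single ridge function plus noise
  fit <- ppr(y ~ x1 + x2, nterms = 1, max.terms = 3)
  summary(fit)   # shows the estimated projection direction and ridge term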
18
Neural network
[Figure: a feed-forward neural network with inputs x1, ..., x6 feeding a hidden layer and an output unit]
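
A matching sketch with the nnet package (by VR, the book cited earlier); the data are made up:

  library(nnet)
  set.seed(1)
  x <- matrix(runif(200 * 6), 200, 6)   # six inputs, as in the diagram
  y <- sin(3 * x[, 1]) + x[, 2]^2 + rnorm(200, sd = 0.1)
  # size = hidden units; linout = TRUE gives a linear output unit for
  # regression; decay is a weight penalty that limits complexity
  fit <- nnet(x, y, size = 4, linout = TRUE, decay = 1e-3, maxit = 500)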
19
Trees
  • These functions are multi-dimensional step
    functions
  • Can be described by regression trees

20
Trees (cont)
[Figure: a regression tree with yes/no splits on X1 and X2 (split points at 5, 6 and 7) and the corresponding partition of the (X1, X2) plane into rectangles with constant fitted values y1, y3, y4, y7]
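
A minimal sketch (my choice of package; the lecture names none) with rpart:

  library(rpart)
  # method = "anova" gives a regression tree; each leaf predicts a constant
  fit <- rpart(Mileage ~ Weight + HP, data = car.test.frame, method = "anova")
  plot(fit); text(fit)   # draw the tree with its split rules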
21
Relationships
[Diagram: relationships among the model classes. Linear models are a special case of additive models, which in turn are a special case of PPR; single-hidden-layer neural nets are PPR models with sigmoidal ridge functions. All of these are smooth functions, while trees are step functions and form a separate class]
22
Complexity
  • For PPR, measured by number of ridge functions
  • For Neural nets, measured by the number of units
    in the hidden layer
  • For trees, by the number of branches
  • Caution: models that are too complex will not be
    good for prediction, since they model noise as
    well as signal

23
Overfitting
[Figure: training-set and test-set prediction error plotted against model complexity, with the correct model marked; training error keeps falling while test error rises past the correct model]
24
Model choice
  • We want to choose the model that predicts new
    data well
  • Divide the existing data into a training set
    (50%), a test set (25%) and a validation set (25%)
  • Fit models using training set
  • Estimate their prediction errors (sum of squares
    of prediction errors) using the test set or CV
  • Choose model having smallest prediction error on
    test set
  • Use validation set to get final estimate of
    prediction error
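
A sketch of the split (my own; 'dat' is a placeholder name for the available data frame):

  set.seed(1)
  n <- nrow(dat)
  grp <- sample(cut(seq_len(n), n * c(0, 0.50, 0.75, 1),
                    labels = c("train", "test", "validation")))
  train <- dat[grp == "train", ]
  test  <- dat[grp == "test", ]
  valid <- dat[grp == "validation", ]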