Title: Chapter 6 Transformed Linear Regression
Chapter 6: Transformed Linear Regression
- www.mysmu.edu/faculty/zlyang
Contents
1. Introduction
2. Transformed Linear Regression
3. Estimating the Transformation Parameter
4. Applications
6.1 Introduction
Recall the required conditions for the linear regression model:
- The error ε_i is normally distributed.
- The standard deviation σ of ε_i is constant for all values of Y_i.
- The errors are independent.
6.1 Introduction
- When one or more of these conditions are not met, one may consider transforming the response Y.
- The aims of transformation (Box and Cox 1964):
  - to induce near normality,
  - to achieve constancy of the error variances,
  - to obtain a simpler model (e.g., to remove interactions).
6.1 Introduction
- A brief list of transformations (a small simulated illustration follows the list):
  - Y → log Y (for Y > 0)
    - Use when σ increases with Y, or
    - use when the error distribution is positively skewed.
  - Y → Y²
    - Use when σ² is proportional to E(Y), or
    - use when the error distribution is negatively skewed.
  - Y → Y^(1/2) (for Y > 0)
    - Use when σ² is proportional to E(Y).
  - Y → 1/Y
    - Use when σ² increases significantly when Y increases beyond some critical value.
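A minimal simulated illustration of the first rule (all names and numbers here are hypothetical, not from the text): when the error standard deviation grows with the mean, the spread of Y fans out, and taking logs roughly stabilizes it.

set.seed(1)
x  <- runif(100, 1, 10)
mu <- exp(0.2 + 0.3*x)                  # mean of Y increases with x
yy <- mu * exp(rnorm(100, sd = 0.25))   # multiplicative error, so sd(yy) grows with the mean
plot(x, yy)                             # spread fans out as the mean increases
plot(x, log(yy))                        # spread is roughly constant after the log transformation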
6.1 Introduction
- All of the above transformations are special cases of the Box-Cox power transformation:
  h(Y, λ) = (Y^λ − 1)/λ if λ ≠ 0, and h(Y, λ) = log Y if λ = 0.
- Note: if λ = 2, then h(Y, λ) is Y² up to a linear rescaling; if λ = 1/2, it is Y^(1/2) up to a linear rescaling; if λ = −1, it is 1/Y up to a linear rescaling, etc.
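A minimal R sketch of this transformation (the function name bc_transform is ours, introduced for later illustrations; it is not part of the original script):

bc_transform <- function(y, lambda) {
  # Box-Cox power transformation: (y^lambda - 1)/lambda, with log(y) as the lambda = 0 limit
  if (lambda == 0) log(y) else (y^lambda - 1)/lambda
}
# e.g., bc_transform(y, -1) is a linear function of 1/y, bc_transform(y, 2) of y^2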
6.2 Transformed Linear Regression
Assume there exists a λ such that the transformed responses satisfy the usual linear regression model:
  h(Y_i, λ) = β₀ + β₁X_i1 + β₂X_i2 + ... + β_p X_ip + ε_i, i = 1, ..., n,
such that the errors ε_i
- are normally distributed,
- have constant variance, and
- are independent.
Such a model is called a transformed linear regression model.
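If λ were known, the model could be fit by ordinary least squares on the transformed responses. A minimal sketch using the bc_transform() helper above (x1 and x2 are placeholder predictors; 0.5 is an arbitrary candidate value of λ):

ly  <- bc_transform(y, 0.5)   # transform the response with a candidate lambda
fit <- lm(ly ~ x1 + x2)       # ordinary linear regression on the transformed scale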
How to find λ?
- λ is estimated by the maximum likelihood estimation (MLE) method.
- Since ε_i ~ N(0, σ²), the probability density function (pdf) of Z_i = h(Y_i, λ) is
  f(z_i) = (2πσ²)^(−1/2) exp{−(z_i − μ_i)²/(2σ²)},
  where μ_i = β₀ + β₁X_i1 + β₂X_i2 + ... + β_p X_ip.
- An application of the change-of-variable technique gives the pdf of Y_i as
  f(y_i) = (2πσ²)^(−1/2) exp{−(h(y_i, λ) − μ_i)²/(2σ²)} · y_i^(λ−1),
  since the Jacobian is ∂h(y_i, λ)/∂y_i = y_i^(λ−1).
How to find λ?
This leads to the log-likelihood function
  l(β, σ², λ) = −(n/2) log(2πσ²) − (1/(2σ²)) Σ_{i=1}^{n} [h(y_i, λ) − μ_i]² + (λ − 1) Σ_{i=1}^{n} log y_i.
For a given λ, the function is maximized at
  β̂(λ) = (X'X)^{−1} X' h(y, λ) and σ̂²(λ) = h(y, λ)'(I − P) h(y, λ)/n,
where I is the n×n identity matrix and P = X(X'X)^{−1}X' is the hat matrix.
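A minimal R sketch of these two estimates for one fixed λ, assuming a design matrix X (with an intercept column) and its hat matrix P are already available, and using the bc_transform() helper introduced in Section 6.1:

ly        <- bc_transform(y, lambda)                        # h(y, lambda) for a fixed lambda
beta_hat  <- solve(t(X) %*% X) %*% t(X) %*% ly              # beta-hat(lambda) = (X'X)^{-1} X' h(y, lambda)
sigma2hat <- as.numeric(t(ly) %*% (diag(n) - P) %*% ly)/n   # sigma^2-hat(lambda) = h'(I - P)h / n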
How to find λ?
Substituting β̂(λ) and σ̂²(λ) back into the log-likelihood function gives the partially maximized (profile) log-likelihood function
  l_max(λ) = −(n/2) log[h(y, λ)'(I − P) h(y, λ)] + (λ − 1) Σ_{i=1}^{n} log y_i + constant.
Finally, maximizing this function gives the optimal value of λ, which can be used to transform the Y_i's. Then one can perform the regular regression analysis on the transformed Y_i's.
Note: maximization of l_max(λ) can only be done numerically using computer software, e.g., R.
How to find λ?
- Steps for finding the value of λ that maximizes l_max(λ):
  - Choose a grid of values λ_1, λ_2, ..., λ_K from [−2, 2].
  - Compute l_max(λ_k) for k = 1, 2, ..., K.
  - Find the λ_k whose l_max(λ_k) is the maximum among all K values; this λ_k is the optimal value.
- This method is called the grid search method.
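The same grid search is also available in the MASS package; a minimal sketch for cross-checking (fit, y, x1, x2 are placeholders for an already-fitted model and its data):

library(MASS)
fit <- lm(y ~ x1 + x2)                               # ordinary linear model for the untransformed y
bc  <- boxcox(fit, lambda = seq(-2, 2, by = 0.01))   # profile log-likelihood over the grid (also plotted)
lambda_hat <- bc$x[which.max(bc$y)]                  # grid value with the largest log-likelihood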
6.3 Applications
- Example 6.1 (The Salary Survey Data)
  - Developed from a salary survey of computer professionals.
  - Aims: to identify and quantify the variables determining salary differentials, and to check whether the corporation's salary administration guidelines were being followed.
  - Variables:
    - S = Salary (the response variable)
    - X = Years of experience
    - E = Education (1 = high school, 2 = bachelor's degree, 3 = advanced degree)
    - M = whether the person has management responsibility
Example 6.1, cont'd: Regular Linear Regression
R code for linear regression:
salary <- read.table("P122.txt", header=TRUE)
y  <- salary[,1]       # Salary
x1 <- salary[,2]       # Years of experience
x2 <- salary[,3]==1    # Dummy for high school diploma
x3 <- salary[,3]==2    # Dummy for bachelor's degree
x4 <- salary[,4]==1    # Dummy for management responsibility
Fit <- lm(y ~ x1 + x2 + x3 + x4)
( summary(Fit) )
( anova(Fit) )
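Equivalently, R can build the education and management dummies itself via factor(); a sketch assuming the same column layout of P122.txt (the baseline categories then differ from the manual coding above, but the fitted model is the same):

FitF <- lm(salary[,1] ~ salary[,2] + factor(salary[,3]) + factor(salary[,4]))
summary(FitF)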
Example 6.1, cont'd: Regular Linear Regression
According to the R² value, the model fits the data very well, but the residual analysis below tells a different story.
Example 6.1, cont'd: Regular Linear Regression
R code for residual analysis:
n  <- length(y)
p  <- 4
x0 <- matrix(1, n, 1)
X  <- cbind(x0, x1, x2, x3, x4)
SSE  <- deviance(Fit)          # Sum of squares due to errors
sigh <- sqrt(SSE/(n-p-1))      # Estimate of the error standard deviation
y_hat <- predict(Fit)          # Predicted values
e_hat <- residuals(Fit)        # OLS residuals
# Compute the hat matrix
P <- X %*% solve(t(X) %*% X) %*% t(X)
h <- diag(P)                   # Diagonal elements of P (the leverages)
Example 6.1, cont'd: Regular Linear Regression
R code for residual analysis (cont'd):
# The (internally) studentized residuals
r_hat <- e_hat/(sigh*sqrt(1-h))
# The externally studentized residuals
r_star <- r_hat*sqrt((n-p-2)/(n-p-1-r_hat^2))
plot(x1, y)       # Scatter plot of y versus x1
plot(e_hat)       # Index plot of the OLS residuals
plot(r_hat)       # Index plot of the studentized residuals
plot(x1, r_hat)   # Studentized residuals against x1
hist(r_hat)       # Histogram of the studentized residuals
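For reference, R's built-in extractor functions should reproduce these quantities (a cross-check, not part of the original script):

h_chk      <- hatvalues(Fit)   # leverages, same as diag(P) above
r_hat_chk  <- rstandard(Fit)   # internally studentized residuals
r_star_chk <- rstudent(Fit)    # externally studentized residuals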
Example 6.1, cont'd: Box-Cox Linear Regression
R code for Box-Cox linear regression:
# Search for the lambda value that maximizes the profile likelihood function
I <- diag(1, n, n)             # n x n identity matrix
lam <- seq(-2, 2, by=0.0001)   # grid of lambda values
ns <- length(lam)
llik <- matrix(0, 1, ns)
for (i in 1:ns) {
  if (lam[i] == 0) ly <- log(y) else ly <- (y^lam[i]-1)/lam[i]
  llik[i] <- -n*log(t(ly) %*% (I-P) %*% ly)/2 + (lam[i]-1)*sum(log(y))
}
ind  <- order(llik, lam, decreasing=TRUE)
lmax <- rbind(llik, lam)[, ind]
( c("lambda hat =", lmax[2,1]) )
Example 6.1, cont'd: Box-Cox Linear Regression
R code for Box-Cox linear regression (cont'd):
# Perform regression analysis based on the Box-Cox transformation
lamh <- lmax[2,1]
if (lamh == 0) ly <- log(y) else ly <- (y^lamh - 1)/lamh
fitbc <- lm(ly ~ x1 + x2 + x3 + x4)
( summary(fitbc) )
( anova(fitbc) )
e_hat <- residuals(fitbc)
sigh  <- sqrt(deviance(fitbc)/(n-p-1))   # error SD estimate from the Box-Cox fit
r_hat <- e_hat/(sigh*sqrt(1-h))          # leverages h are unchanged since X is unchanged
r_star <- r_hat*sqrt((n-p-2)/(n-p-1-r_hat^2))
plot(x1, r_hat)   # Studentized residuals against x1
hist(r_hat)       # Histogram of the studentized residuals
Example 6.1, cont'd: Box-Cox Linear Regression
The model fit improved, and the problem of heteroscedasticity is alleviated.
Example 6.2: Box-Cox Linear Regression for the Education Expenditure Data (1960, 1970, 1975)
- Y = Per capita expenditure on public education
- X1 = Per capita personal income
- X2 = Number of residents per thousand under 18 years of age
- X3 = Number of people per thousand residing in urban areas
- G = Geographical region (1 = Northeast, 2 = North Central, 3 = South, 4 = West)
Example 6.2, cont'd
- Questions:
  - How is the education expenditure related to the other variables?
  - Does the education expenditure differ among the regions?
  - Is the expenditure relationship stable with respect to time?
  - Is heteroscedasticity an issue?
- We use the Box-Cox linear regression technique to address these issues. (Download the R script for this analysis.)
Example 6.2, cont'd: residual plots (figures not reproduced here)
- Response Y: years 1960, 1970, and 1975.
- Response h(Y, λ̂): λ̂ = −0.1352 for 1960, λ̂ = 0.1939 for 1970, λ̂ = −1.3395 for 1975.