Title: Aravali College of Engineering and Management, Faridabad (7)
1. Program Name: B.Tech CSE
Semester: 5th
Course Name: Machine Learning
Course Code: PEC-CS-D-501 (I)
Facilitator Name: Aastha
8. Introduction to Regression Analysis
- Regression analysis is used to:
  - Predict the value of a dependent variable based on the value of at least one independent variable
  - Explain the impact of changes in an independent variable on the dependent variable
- Dependent variable: the variable we wish to predict or explain
- Independent variable: the variable used to explain the dependent variable
9. Simple Linear Regression Model
- Only one independent variable, X
- Relationship between X and Y is described by a linear function
- Changes in Y are assumed to be caused by changes in X
10. Types of Relationships
[Figure: plots of Y against X illustrating linear relationships and curvilinear relationships]
11. Types of Relationships (continued)
[Figure: plots of Y against X illustrating strong relationships and weak relationships]
12. Types of Relationships (continued)
[Figure: plots of Y against X illustrating no relationship]
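13. Simple Linear Regression Model
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
- $Y_i$: dependent variable
- $\beta_0$: population Y intercept
- $\beta_1$: population slope coefficient
- $X_i$: independent variable
- $\varepsilon_i$: random error term
- Linear component: $\beta_0 + \beta_1 X_i$
- Random error component: $\varepsilon_i$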
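14. Simple Linear Regression Model (continued)
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
[Figure: the regression line drawn on Y against X axes, with intercept $\beta_0$ and slope $\beta_1$. At $X_i$, the observed value of Y differs from the predicted value of Y on the line by the random error $\varepsilon_i$ for this X value.]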
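The decomposition of each observation into a linear component and a random error component can be illustrated with a small simulation. This is only a sketch: the parameter values, the normal error distribution with standard deviation `sigma`, and the function name `simulate_population` are assumptions made for illustration and do not come from the slides.

```python
import random

def simulate_population(xs, beta0, beta1, sigma, seed=0):
    """Generate Y_i = beta0 + beta1 * X_i + eps_i with eps_i drawn from Normal(0, sigma)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    rows = []
    for x in xs:
        linear = beta0 + beta1 * x    # linear component
        eps = rng.gauss(0.0, sigma)   # random error component
        rows.append((x, linear, eps, linear + eps))
    return rows

# Hypothetical parameter values, chosen only for illustration.
rows = simulate_population(xs=[1000, 1500, 2000, 2500],
                           beta0=100.0, beta1=0.1, sigma=20.0)

for x, linear, eps, y in rows:
    print(f"X={x:5d}  linear={linear:7.2f}  error={eps:7.2f}  Y={y:7.2f}")
```

Each printed Y is the point on the line for that X plus a random error, which is what the figure on this slide depicts.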
15. Simple Linear Regression Equation (Prediction Line)
The simple linear regression equation provides an estimate of the population regression line:
$\hat{Y}_i = b_0 + b_1 X_i$
- $\hat{Y}_i$: estimated (or predicted) Y value for observation i
- $b_0$: estimate of the regression intercept
- $b_1$: estimate of the regression slope
- $X_i$: value of X for observation i
The individual random error terms $\varepsilon_i$ have a mean of zero.
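The slide does not state how the estimates are computed. Assuming ordinary least squares (the method the pitfalls slide refers to), the standard closed-form estimates for one independent variable are given below, where $\bar{X}$ and $\bar{Y}$ denote the sample means and $n$ the number of observations (notation introduced here, not taken from the slides):

```latex
b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
b_0 = \bar{Y} - b_1 \bar{X}
```

These values minimise the sum of squared differences between the observed $Y_i$ and the predicted $\hat{Y}_i = b_0 + b_1 X_i$.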
16. Sample Data for House Price Model
House Price in 1000s (Y) Square Feet (X)
245 1400
312 1600
279 1700
308 1875
199 1100
219 1550
405 2350
324 2450
319 1425
255 1700
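As a worked example, the prediction line for the sample data above can be estimated in a few lines of Python. This is a minimal sketch that assumes ordinary least squares is the intended fitting method; the function and variable names are illustrative and not part of the slides.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (b0, b1), the least-squares intercept and slope for one independent variable."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-products and sum of squared deviations of X.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1

# Sample data from the slide: square feet (X) and house price in 1000s (Y).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
house_price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

b0, b1 = fit_simple_linear_regression(square_feet, house_price)
print(f"b0 = {b0:.3f}, b1 = {b1:.5f}")
```

For this sample the fitted line is approximately Ŷ = 98.248 + 0.10977 X, so each additional square foot is associated with an estimated increase of about 0.11 (in 1000s) in house price.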
17. Regression Using Excel
- Tools / Data Analysis / Regression
18. Assumptions of Regression
- Use the acronym LINE:
  - Linearity: the underlying relationship between X and Y is linear
  - Independence of Errors: error values are statistically independent
  - Normality of Error: error values (ε) are normally distributed for any given value of X
  - Equal Variance (Homoscedasticity): the probability distribution of the errors has constant variance
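One informal way to start evaluating these assumptions is to inspect the residuals, the differences between observed and predicted Y values. The sketch below refits the house price sample by least squares and lists the residuals in order of X. It is an illustration under that assumption, not a formal diagnostic test, and the helper name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for one independent variable."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_mean - b1 * x_mean, b1

# House price sample: square feet (X) and price in 1000s (Y).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
house_price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

b0, b1 = fit_line(square_feet, house_price)

# Residual e_i = observed Y_i minus predicted Y_i, sorted by X for easier inspection.
residual_by_x = sorted((x, y - (b0 + b1 * x))
                       for x, y in zip(square_feet, house_price))

for x, e in residual_by_x:
    print(f"X={x:5d}  residual={e:8.2f}")

# With an intercept in the model, least-squares residuals average to zero
# (up to floating-point rounding).
mean_residual = sum(e for _, e in residual_by_x) / len(residual_by_x)
print(f"mean residual = {mean_residual:.6f}")
```

A curved pattern in the residuals would cast doubt on linearity, and a spread that widens or narrows with X would cast doubt on equal variance. With only ten observations, such a listing is suggestive at best.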
19. Pitfalls of Regression Analysis
- Lacking an awareness of the assumptions underlying least-squares regression
- Not knowing how to evaluate the assumptions
- Not knowing the alternatives to least-squares regression if a particular assumption is violated
- Using a regression model without knowledge of the subject matter
- Extrapolating outside the relevant range
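The last pitfall, extrapolation, can be guarded against in code by checking a new X value against the range observed in the sample before predicting. The sketch below uses the house price sample and hard-codes rounded least-squares estimates for that sample. The function name `predict_with_range_check` and the choice to warn rather than raise an error are assumptions made for illustration.

```python
import warnings

def predict_with_range_check(x_new, b0, b1, x_min, x_max):
    """Predict Y for x_new, warning when x_new lies outside the observed X range."""
    if not (x_min <= x_new <= x_max):
        warnings.warn(
            f"X={x_new} is outside the observed range [{x_min}, {x_max}]; "
            "the prediction is an extrapolation.",
            stacklevel=2,
        )
    return b0 + b1 * x_new

# Square feet values from the house price sample.
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

# Rounded least-squares estimates for that sample (price in 1000s).
b0, b1 = 98.248, 0.10977

x_min, x_max = min(square_feet), max(square_feet)
print(predict_with_range_check(2000, b0, b1, x_min, x_max))  # inside 1100..2450
print(predict_with_range_check(5000, b0, b1, x_min, x_max))  # triggers a warning
```

The prediction for 2000 square feet falls inside the relevant range of 1100 to 2450. The one for 5000 square feet is still computed, but it is flagged because the sample says nothing about houses of that size.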
20. Aravali College of Engineering and Management
Jasana, Tigaon Road, Neharpar, Faridabad, Delhi NCR
Toll Free Number: 91-8527538785
Website: www.acem.edu.in