Title: WELCOME TO
1 WELCOME TO
THETOPPERSWAY.COM
2 Regression Analysis
Regression is a technique of statistical analysis by which the value of one
variable can be estimated when the value of another is known. In regression
analysis, the variable that is known is called the independent variable, and
the variable that is to be estimated is called the dependent variable.
3 Nondependent and Dependent Relationships
- Types of Relationship
  - Nondependent (correlation) -- neither variable is the target.
    Example: protein and fat intake.
  - Dependent (regression) -- the value of one variable is used to predict
    the value of another variable. Example: ACT and MCAT scores for medical
    applicants; MCAT is the dependent variable and ACT is the independent
    variable.
- Statistical Expressions
  - Correlation Coefficient -- index of a nondependent relationship
  - Regression Coefficient -- index of a dependent relationship
4 Regression: 3 Main Purposes
- To describe (or model)
- To predict (or estimate)
- To control (or administer)
5 Simple Linear Regression
- Statistical method for finding
- the line of best fit
- for one response (dependent) numerical variable
- based on one explanatory (independent) variable.
6 Difference between Correlation and Regression
1. Degree and nature: Correlation studies the relationship between two or
more series, whereas regression analysis measures the degree and extent of
this relationship, thereby providing a basis for estimation.
7 2. Cause and effect relationship: Correlation specifies the relationship
between two series, but it cannot specify which series is the cause and
which is the effect. In regression, the series whose value is known is
called the independent series, and the series whose value is to be predicted
is called the dependent series. The independent series is treated as the
cause and the dependent series as the effect.
8 3. Limit of coefficient: The coefficient of correlation always lies
between -1 and +1, but an individual regression coefficient is not limited
in this way. However, the product of the two regression coefficients cannot
be greater than 1.
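The bound on the product can be seen in one line, assuming the definitions of the two regression coefficients given later in this deck (b_xy = r σx/σy and b_yx = r σy/σx):

```latex
b_{xy} \cdot b_{yx}
  = \left( r \frac{\sigma_x}{\sigma_y} \right)
    \left( r \frac{\sigma_y}{\sigma_x} \right)
  = r^2 \le 1
  \qquad \text{because } -1 \le r \le 1 .
```

One coefficient may therefore exceed 1 in magnitude only if the other is correspondingly small.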
9 Regression Lines
The regression analysis between two related series of data is usually done
with the help of diagrams. On the scatter diagram obtained by plotting the
values of the related series X and Y, two lines of best fit are drawn
through the points. These lines are called regression lines.
10 Why two regression lines?
- When there are two series, there are also two lines of regression.
- If the variables of the two series are named X and Y, then one regression
  line is called X on Y and the other is called Y on X.
11 Deviations Taken From Arithmetic Mean
- (i) Regression Equation of X on Y
- $X - \bar{X} = r \frac{\sigma_x}{\sigma_y} (Y - \bar{Y})$
- Here $r \frac{\sigma_x}{\sigma_y}$ is known as the regression coefficient
  of X on Y.
- It is also denoted by $b_{xy}$ or $b_1$.
- It measures the change in X corresponding to a unit change in Y.
- Also, $b_{xy} = \sum xy / \sum y^2$
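12 (ii) Regression Equation of Y on X
$Y - \bar{Y} = r \frac{\sigma_y}{\sigma_x} (X - \bar{X})$
Here $r \frac{\sigma_y}{\sigma_x}$ is known as the regression coefficient of
Y on X. It is also denoted by $b_{yx}$ or $b_2$. It measures the change in Y
corresponding to a unit change in X.
Also, $b_{yx} = \sum xy / \sum x^2$
where $x = X - \bar{X}$ (deviation from the mean of the X series) and
$y = Y - \bar{Y}$ (deviation from the mean of the Y series).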
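13 Also
Regression equation of X on Y: $(X - \bar{X}) = b_1 (Y - \bar{Y})$
Regression equation of Y on X: $(Y - \bar{Y}) = b_2 (X - \bar{X})$
Here r is the coefficient of correlation between the X and Y series, and
$r = \sqrt{b_1 \times b_2}$, taken with the common sign of $b_1$ and $b_2$.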
14 Least square method
X on Y:
  $\sum X = n a + b \sum Y$
  $\sum XY = a \sum Y + b \sum Y^2$
  $X = a + bY$
Y on X:
  $\sum Y = n a + b \sum X$
  $\sum XY = a \sum X + b \sum X^2$
  $Y = a + bX$
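The two normal equations for each line can be solved directly for a and b. The sketch below is one way to do that in plain Python. The function name `fit_by_normal_equations` and the five data points are hypothetical, chosen only so the arithmetic is easy to check by hand.

```python
def fit_by_normal_equations(u, v):
    """Fit v = a + b*u by solving the two normal equations:

        sum(v)   = n*a      + b*sum(u)
        sum(u*v) = a*sum(u) + b*sum(u**2)
    """
    n = len(u)
    su, sv = sum(u), sum(v)
    suv = sum(ui * vi for ui, vi in zip(u, v))
    suu = sum(ui * ui for ui in u)
    det = n * suu - su * su  # zero only when every u value is identical
    if det == 0:
        raise ValueError("all predictor values are equal; the line is undefined")
    b = (n * suv - su * sv) / det
    a = (sv - b * su) / n
    return a, b


# Hypothetical data, not taken from the slides.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

a_yx, b_yx = fit_by_normal_equations(X, Y)  # Y on X: Y = a + bX
a_xy, b_xy = fit_by_normal_equations(Y, X)  # X on Y: X = a + bY

print(f"Y on X: Y = {a_yx:.2f} + {b_yx:.2f} X")
print(f"X on Y: X = {a_xy:.2f} + {b_xy:.2f} Y")
```

For this data the two fits are Y = 2.20 + 0.60 X and X = -1.00 + 1.00 Y. Swapping the argument order is all that distinguishes the X on Y line from the Y on X line.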
15 Example
Calculate the regression equations, taking deviations of items from the
means of the X and Y series.

 X    Y
 6    9
 2   11
10    5
 4    8
 8    7
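A possible worked solution for this example is sketched below. It assumes the flattened table reads as the pairs (6, 9), (2, 11), (10, 5), (4, 8), (8, 7), and it applies the deviation formulas b_xy = Σxy / Σy² and b_yx = Σxy / Σx² from the earlier slides.

```python
from math import sqrt

# Data as read from the slide (assumed pairing of the flattened table).
X = [6, 2, 10, 4, 8]
Y = [9, 11, 5, 8, 7]

n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n

# Deviations from the arithmetic means: x = X - mean(X), y = Y - mean(Y)
x = [xi - mean_x for xi in X]
y = [yi - mean_y for yi in Y]

sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)
sum_y2 = sum(yi * yi for yi in y)

b_xy = sum_xy / sum_y2  # regression coefficient of X on Y
b_yx = sum_xy / sum_x2  # regression coefficient of Y on X

# r carries the same sign as the two regression coefficients.
r = sqrt(b_xy * b_yx) * (1 if b_xy >= 0 else -1)

print(f"X on Y: X - {mean_x:g} = {b_xy:g} (Y - {mean_y:g})")
print(f"Y on X: Y - {mean_y:g} = {b_yx:g} (X - {mean_x:g})")
print(f"r = {r:.4f}")
```

Under that reading the means are 6 and 8, so the equations are X - 6 = -1.3 (Y - 8) and Y - 8 = -0.65 (X - 6), with r of about -0.92.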
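16 Deviations taken from assumed mean
- Regression Equation of X on Y
- $X - \bar{X} = r \frac{\sigma_x}{\sigma_y} (Y - \bar{Y})$
- Here $r \frac{\sigma_x}{\sigma_y} = \frac{N \sum d_x d_y - (\sum d_x)(\sum d_y)}{N \sum d_y^2 - (\sum d_y)^2}$
- Regression Equation of Y on X
- $Y - \bar{Y} = r \frac{\sigma_y}{\sigma_x} (X - \bar{X})$
- Here $r \frac{\sigma_y}{\sigma_x} = \frac{N \sum d_x d_y - (\sum d_x)(\sum d_y)}{N \sum d_x^2 - (\sum d_x)^2}$
- Here $d_x$ and $d_y$ are the deviations of X and Y from their assumed means.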
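As a consistency check, the assumed-mean formula should reduce to the earlier mean-deviation formula when the assumed means happen to equal the true means. In that case the deviations d_x and d_y sum to zero, and the correction terms vanish:

```latex
\sum d_x = \sum d_y = 0
\;\Longrightarrow\;
b_{xy}
  = \frac{N \sum d_x d_y - \left(\sum d_x\right)\left(\sum d_y\right)}
         {N \sum d_y^2 - \left(\sum d_y\right)^2}
  = \frac{N \sum xy}{N \sum y^2}
  = \frac{\sum xy}{\sum y^2}
```

The same argument applies to b_yx, with the sum of squared d_x in the denominator.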
17 Example
From the following data on rainfall and production of rice, find
(i) the most likely production corresponding to a rainfall of 40 cm
(ii) the most likely rainfall corresponding to a production of 45 kg.

                 Rainfall (cm)   Production (kg)
Mean                  35               50
Std deviation          5                8

Coefficient of correlation between rainfall and production = 0.8
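This example needs only the summary statistics, because the regression coefficients follow from r σy/σx and r σx/σy. A minimal sketch, taking X as rainfall and Y as production (the function names are illustrative, not from the slides):

```python
# Summary statistics given on the slide (X = rainfall in cm, Y = production in kg).
mean_x, mean_y = 35.0, 50.0
sd_x, sd_y = 5.0, 8.0
r = 0.8

b_yx = r * sd_y / sd_x  # regression coefficient of Y on X
b_xy = r * sd_x / sd_y  # regression coefficient of X on Y


def production_for(rainfall):
    """Y on X: Y - mean_y = b_yx (X - mean_x)."""
    return mean_y + b_yx * (rainfall - mean_x)


def rainfall_for(production):
    """X on Y: X - mean_x = b_xy (Y - mean_y)."""
    return mean_x + b_xy * (production - mean_y)


print(f"(i)  production at 40 cm of rainfall: {production_for(40):.1f} kg")
print(f"(ii) rainfall for 45 kg of production: {rainfall_for(45):.1f} cm")
```

With b_yx = 1.28 and b_xy = 0.5, the estimates are 56.4 kg for part (i) and 32.5 cm for part (ii). Each question uses the line whose dependent variable is the quantity being estimated.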
18 Example
Obtain the lines of regression.

X   Y
5   2
6   4
5   8
7   5
2   1
19 Example
From the following series X and Y, find out the value of:
(i) Two regression coefficients.
(ii) Two regression equations.
(iii) Most likely value of X when Y is 34.
(iv) Most likely value of Y when X is 47.
20
Series X   Series Y
   48         36
   50         32
   53         33
   49         38
   51         37
   55         31
   53         35
   49         30
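One way to work this example is with the assumed-mean formulas from slide 16. The sketch below assumes the table reads as the pairs (48, 36), (50, 32), (53, 33), (49, 38), (51, 37), (55, 31), (53, 35), (49, 30). The assumed means 50 and 35 are arbitrary picks, and any other convenient values should give the same coefficients.

```python
# Data as read from the slide (assumed pairing of the flattened table).
X = [48, 50, 53, 49, 51, 55, 53, 49]
Y = [36, 32, 33, 38, 37, 31, 35, 30]
n = len(X)

# Assumed means: arbitrary convenient values, not given on the slide.
A_x, A_y = 50, 35
dx = [xi - A_x for xi in X]
dy = [yi - A_y for yi in Y]

s_dx, s_dy = sum(dx), sum(dy)
s_dxdy = sum(a * b for a, b in zip(dx, dy))
s_dx2 = sum(a * a for a in dx)
s_dy2 = sum(b * b for b in dy)

numerator = n * s_dxdy - s_dx * s_dy
b_xy = numerator / (n * s_dy2 - s_dy ** 2)  # regression coefficient of X on Y
b_yx = numerator / (n * s_dx2 - s_dx ** 2)  # regression coefficient of Y on X

# True means recovered from the assumed means.
mean_x = A_x + s_dx / n
mean_y = A_y + s_dy / n


def x_given_y(y):
    return mean_x + b_xy * (y - mean_y)


def y_given_x(x):
    return mean_y + b_yx * (x - mean_x)


print(f"(i)   b_xy = {b_xy:.4f}, b_yx = {b_yx:.4f}")
print(f"(ii)  X - {mean_x:g} = {b_xy:.4f} (Y - {mean_y:g})")
print(f"      Y - {mean_y:g} = {b_yx:.4f} (X - {mean_x:g})")
print(f"(iii) X when Y = 34: {x_given_y(34):.2f}")
print(f"(iv)  Y when X = 47: {y_given_x(47):.2f}")
```

Under that reading the means are 51 and 34, so part (iii) returns the mean of X, since Y = 34 is the mean of the Y series.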
21 Example
The two regression equations are as follows:
20X - 3Y = 975 ... (i)
4Y - 15X + 530 = 0 ... (ii)
Find out:
(i) The mean values of X and Y.
(ii) The coefficient of correlation between X and Y.
(iii) The estimated value of Y when X = 90 and that of X when Y = 130.
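A sketch of how this kind of problem can be worked is given below. It assumes the two equations read 20X - 3Y = 975 and 4Y - 15X + 530 = 0, since the operators are lost in the source. It relies on two facts from earlier slides: both regression lines pass through the means, and the product of the regression coefficients cannot exceed 1, which decides which equation is X on Y. The helper names are illustrative.

```python
from math import sqrt

# Each line is stored as (p, q, c), meaning p*X + q*Y = c.
eq1 = (20.0, -3.0, 975.0)   # assumed reading: 20X - 3Y = 975
eq2 = (-15.0, 4.0, -530.0)  # assumed reading: 4Y - 15X + 530 = 0


def solve(e1, e2):
    """Intersection of the two lines (Cramer's rule): the point (mean X, mean Y)."""
    p1, q1, c1 = e1
    p2, q2, c2 = e2
    det = p1 * q2 - p2 * q1
    return (c1 * q2 - c2 * q1) / det, (p1 * c2 - p2 * c1) / det


def slope_x_on_y(e):
    p, q, _ = e
    return -q / p  # from X = (c - q*Y) / p


def slope_y_on_x(e):
    p, q, _ = e
    return -p / q  # from Y = (c - p*X) / q


mean_x, mean_y = solve(eq1, eq2)

# Try eq1 as X on Y and eq2 as Y on X; swap if the product exceeds 1.
x_on_y, y_on_x = eq1, eq2
if slope_x_on_y(x_on_y) * slope_y_on_x(y_on_x) > 1:
    x_on_y, y_on_x = eq2, eq1

b_xy = slope_x_on_y(x_on_y)
b_yx = slope_y_on_x(y_on_x)
r = sqrt(b_xy * b_yx) * (1 if b_xy >= 0 else -1)

y_at_90 = mean_y + b_yx * (90 - mean_x)
x_at_130 = mean_x + b_xy * (130 - mean_y)

print(f"means: X = {mean_x:g}, Y = {mean_y:g}")
print(f"b_xy = {b_xy:g}, b_yx = {b_yx:g}, r = {r:g}")
print(f"Y at X = 90: {y_at_90:g};  X at Y = 130: {x_at_130:g}")
```

Under these assumptions the means are 66 and 115, the coefficients are b_xy = 0.15 and b_yx = 3.75, r = 0.75, and the estimates are Y = 205 and X = 68.25.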
22 Example
- For a certain X and Y series, the two lines of regression are given below:
  - 6Y = 5X + 90
  - 15X = 8Y + 130
- Variance of X series is 16.
- Find:
  - The mean values of the X and Y series.
  - The coefficient of correlation between the X and Y series.
  - The standard deviation of the Y series.
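A possible worked solution, assuming the two lines read 6Y = 5X + 90 and 15X = 8Y + 130 (the operators are lost in the source). The first line is taken as Y on X and the second as X on Y, because the opposite choice gives a product of regression coefficients of 2.25, which is greater than 1:

```latex
\begin{aligned}
\text{Means: }
  & 6\bar{Y} = 5\bar{X} + 90, \quad 15\bar{X} = 8\bar{Y} + 130
    \;\Rightarrow\; \bar{X} = 30, \ \bar{Y} = 40 \\
\text{Coefficients: }
  & b_{yx} = \tfrac{5}{6}, \quad b_{xy} = \tfrac{8}{15}, \quad
    b_{yx}\, b_{xy} = \tfrac{4}{9} \le 1 \\
\text{Correlation: }
  & r = +\sqrt{\tfrac{4}{9}} = \tfrac{2}{3} \\
\text{Std. deviation of } Y \text{: }
  & \sigma_x = \sqrt{16} = 4, \quad
    b_{yx} = r \frac{\sigma_y}{\sigma_x}
    \;\Rightarrow\;
    \sigma_y = \frac{b_{yx}\, \sigma_x}{r}
             = \frac{(5/6)(4)}{2/3} = 5
\end{aligned}
```

The sign of r is positive because both regression coefficients are positive.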
23 Real Life Applications
- Estimating Seasonal Sales for Department Stores
(Periodic)
24 Real Life Applications
- Predicting Student Grades Based on Time Spent
Studying
25 Practice Problems
- Measure Height vs. Arm Span
- Find the line of best fit for height.
- Predict height for one student not in the data set.
- Check the predictability of the model.
26 Practice Problems
- Is there any correlation between shoe size and height?
- Does gender make a difference in this analysis?
27 Practice Problems
- Can the number of points scored in a basketball game be predicted by
  - the time a player plays in the game?
  - the player's height?
28 Questions?
29 THANK YOU
FOR VISITING
THETOPPERSWAY.COM