Bivariate Analysis: Measures of Association

Provided by: brianj80

Transcript and Presenter's Notes

1
Bivariate Analysis: Measures of Association
  • Chapter 17

2
Correlation Analysis
  • In a simple two-variable correlation, you need
    to know both the strength and the direction of
    the relationship

Scatterplots help illustrate the relationships
between variables
3
Correlation: Linear Association
  • The purpose of correlation analysis is to
    measure the degree of linear association
    (correlation) between two variables
  • The Pearson correlation coefficient (r) is
    commonly used for this purpose
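
As a rough illustration of what the Pearson coefficient measures, the sketch below computes it from its textbook definition (the covariation of the two variables divided by the square root of the product of their variations). The function name `pearson_r` and the sample data are made up for this example and do not come from the presentation.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: five paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))
```

A statistical package would normally be used instead; the hand-written version is only meant to make the definition concrete.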

4
Interpreting Correlation
  • Values range from -1 to 1
  • A correlation of 0 means that there is no linear
    relationship between the variables (a nonlinear
    relationship may still exist)
  • The extreme values (-1 and 1) indicate a perfect
    linear relationship
  • Correlations of -0.7 and 0.7 are equal in
    strength, but opposite in direction
  • A score of 0.5 means that there is a linear
    trend with considerable deviation from the line
  • The sign gives the direction of the relationship
  • Positive correlations mean that as one variable
    increases, the other tends to increase
  • Negative correlations mean that as one variable
    increases, the other tends to decrease
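
The interpretation rules above can be checked on small, made-up data sets. The sketch below (hypothetical helper `pearson_r`, invented data) shows a perfectly rising line, a perfectly falling line, and a U-shaped pattern whose correlation is 0 even though the two variables are clearly related, because the relationship is not linear.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]
r_rising = pearson_r(x, [2 * v + 1 for v in x])   # points on a rising line
r_falling = pearson_r(x, [-3 * v for v in x])     # points on a falling line
r_curved = pearson_r(x, [v ** 2 for v in x])      # U-shaped, not linear
print(r_rising, r_falling, r_curved)
```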

5
Regression Analysis
  • Using one or more variables to predict another
    variable
  • Example: predicting fast food purchases based on
    gender, age, income, and education
  • Bivariate (simple) regression: a single
    predictor
  • Multiple regression (often loosely called
    multivariate regression): two or more predictors

6
Components of Regression
  • Independent variables: variables that drive
    other variables
  • Dependent variable: the outcome, or predicted,
    variable

Independent variables:
  • Highway conditions
  • Average temperature of previous three days
  • Local weather forecast
  • Newspaper space devoted to advertising
  • Average attendance over the previous three weeks
Dependent variable: Ski resort weekend attendance
7
Regression Formula
  • $Y = \alpha + \beta x_1 + \varepsilon$
  • $\alpha$ = mean of the population when $x_1 = 0$
  • $\beta$ = change in the population mean of $Y$
    per unit change in $x_1$
  • $\varepsilon$ = error term, drawn independently
    from a normally distributed universe; the error
    is independent of $x_1$
  • Since $\varepsilon$ is impossible to predict and
    averages to 0, it is typically left out of the
    prediction equation
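
To make the roles of the intercept and slope concrete, here is a minimal sketch of fitting them by ordinary least squares for a single predictor. Least squares is assumed here as the estimation method; the slides do not say how the coefficients are obtained. The function name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # estimated change in y per unit change in x
    a = mean_y - b * mean_x    # estimated value of y when x is 0
    return a, b

# Hypothetical data: five paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = fit_line(x, y)

# The error term is dropped when predicting, since it averages to 0.
predicted = a + b * 6
print(a, b, predicted)
```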

8
Strength of Association
  • The total variation in Y splits into two parts
  • Variation explained by the regression (sum of
    squares due to regression)
  • Variation not explained by the regression (sum of
    squares of deviations from the regression)
  • The coefficient of determination ($r^2$)
    measures the proportion of variation in Y
    explained by changes in X
  • Value is between 0 and 1
  • Is a measure of the strength of association, not
    of statistical significance

9
Strength of Association
  • The equation for $r^2$ is given by
    $r^2 = 1 - \frac{\sum (Y_i - \hat{Y}_i)^2}{\sum (Y_i - \bar{Y})^2}$,
    where the top half of the fraction is the
    deviation from the regression (unexplained
    variation) and the bottom half is the total
    variation
  • The closer $r^2$ gets to 1, the better the
    regression is at predicting Y. If $r^2 = 1$,
    then the regression predicts every observed
    value exactly
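
The split into explained and unexplained variation can be worked through numerically. The sketch below fits a least-squares line (an assumption, as the slides do not name the fitting method) and then computes the coefficient of determination as one minus unexplained variation over total variation. The function name `r_squared` and the data are invented for illustration.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a least-squares line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Variation not explained by the regression: squared deviations
    # of the observed values from the fitted line.
    unexplained = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation: squared deviations of the observed values from their mean.
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - unexplained / total

# Hypothetical data: five paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r2 = r_squared(x, y)
print(round(r2, 4))
```

For a single predictor this value equals the square of the Pearson correlation between the two variables.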

10
Rank Correlations
  • Rank correlations measure the association
    between variables when the data are given as
    rankings (ordinal scales) rather than on
    interval scales

11
Spearman Rank Correlation
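  • The best-known and easiest technique is the
    Spearman rank correlation ($r_s$)
  • $r_s$ is given by the equation
    $r_s = 1 - \frac{6 \sum d_i^2}{N(N^2 - 1)}$,
    where $d_i$ is the difference between the
    rankings of item $i$ under the two ranking
    methods and $N$ is the number of items ranked
  • When $N \geq 10$, $r_s$ can be used to calculate
    a t-score with the equation
    $t = r_s \sqrt{\frac{N - 2}{1 - r_s^2}}$, and
    the resulting t-score (with $N - 2$ degrees of
    freedom) is used in a two-tailed test of
    significance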
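
The Spearman calculation can be sketched in a few lines using the standard no-ties formula, one minus six times the sum of squared rank differences over N(N² - 1), followed by the usual conversion to a t-score. The function names and the two rankings below are hypothetical, and the sketch assumes there are no tied ranks.

```python
import math

def spearman_rs(ranks_a, ranks_b):
    """Spearman rank correlation for two rankings without ties."""
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

def spearman_t(rs, n):
    """t-score for rs, to be compared against a t table with n - 2 df."""
    return rs * math.sqrt((n - 2) / (1 - rs ** 2))

# Hypothetical example: ten items ranked by two different methods.
method_1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
method_2 = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

rs = spearman_rs(method_1, method_2)
t = spearman_t(rs, len(method_1))
print(round(rs, 4), round(t, 2))
```

The t-score would then be compared with the critical value for a two-tailed test at the chosen significance level; that lookup is left to a table or a statistical package.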

12
Kendall Rank Correlation Coefficient (τ)
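  • More complicated to compute than the Spearman
    rank correlation
  • τ compares two sets of rankings; when three or
    more sets of rankings are compared, the related
    Kendall coefficient of concordance (W) is used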
  • Calculated as the proportion of concordant pairs
    minus the proportion of discordant pairs
  • Take any two bivariate observations, $(x_i, y_i)$
    and $(x_j, y_j)$
  • The pair is concordant when
    $(x_i - x_j)(y_i - y_j)$ is positive
  • The pair is discordant when
    $(x_i - x_j)(y_i - y_j)$ is negative
  • Scores range from -1 to 1
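
The concordant/discordant definition translates directly into code. The sketch below counts both kinds of pairs and takes the difference of their proportions; it assumes no tied values (a tied pair would count as neither concordant nor discordant). The function name `kendall_tau` and the data are made up for this example.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Proportion of concordant pairs minus proportion of discordant pairs."""
    concordant = 0
    discordant = 0
    total_pairs = 0
    # Visit every unordered pair of observations exactly once.
    for (xi, yi), (xj, yj) in combinations(zip(xs, ys), 2):
        total_pairs += 1
        product = (xi - xj) * (yi - yj)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / total_pairs

# Hypothetical rankings of five items by two judges.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [3, 1, 2, 5, 4]
tau = kendall_tau(judge_1, judge_2)
print(tau)
```

Of the ten possible pairs here, seven are concordant and three are discordant, which gives the difference in proportions.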

13
Goodman and Kruskal's Lambda (λ)
  • λ is used when nominal scales are used
  • Spearman rank and τ won't work because the
    ordering element is missing with nominal scales
  • λ can be calculated by statistical packages
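
The slides leave the calculation to software, but the idea behind lambda can be sketched by hand. The version below is the asymmetric form: it measures how much the errors in guessing the outcome category shrink once the predictor category is known. The function name, the table layout (rows are predictor categories, columns are outcome categories), and the counts are all assumptions made for this illustration.

```python
def goodman_kruskal_lambda(table):
    """Asymmetric lambda for predicting the column category from the row category.

    table[i][j] is the count of cases in predictor category i and
    outcome category j. Assumes the outcome varies (more than one
    non-empty column), otherwise the denominator would be zero.
    """
    n = sum(sum(row) for row in table)
    column_totals = [sum(col) for col in zip(*table)]
    # Errors when always guessing the overall most common outcome.
    errors_without = n - max(column_totals)
    # Errors when guessing the most common outcome within each row.
    errors_with = sum(sum(row) - max(row) for row in table)
    return (errors_without - errors_with) / errors_without

# Hypothetical cross-tabulation: three predictor groups, two outcomes.
counts = [
    [30, 10],
    [10, 30],
    [20, 20],
]
lam = goodman_kruskal_lambda(counts)
print(round(lam, 4))
```

A value of 0 means the predictor does not reduce guessing errors at all, and 1 means it removes them entirely.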

14
Final Note
  • Because these calculations can be run simply with
    statistical software packages like SPSS and SAS,
    and even Excel, it is more important to
    understand the significance of results than to
    memorize equations