Title: Examining the Relationship Between Two Variables
1. Examining the Relationship Between Two Variables
2. What type of analysis?
- We have two variables, X and Y, and we are interested in describing how a response (Y) is related to an explanatory variable (X).
- What graphical displays do we use to show the relationship between X and Y?
- What statistical analyses do we use to summarize, describe, and make inferences about the relationship?
3. Type of Displays
- Y continuous, X continuous: Scatterplot
- Y continuous, X ordinal or nominal: Comparative Boxplot
- Y ordinal or nominal, X continuous: Logistic Plot
- Y ordinal or nominal, X ordinal or nominal: 2-D Mosaic Plot
4. Fit Y by X in JMP
In the lower left corner of the Fit Y by X dialog box you will see a graphic that shows the same grid as the more stylized version on the previous slide.
[Figure: Fit Y by X data-type grid. One axis is labeled "Y Variable/Response Data Type"; the other is labeled "X Variable/Predictor Data Type".]
6. Type of Analyses
- Y continuous, X continuous: Correlation and Regression (parametric or nonparametric).
- Y continuous, X ordinal or nominal: if X has k = 2 levels, use a Two-Sample t-Test or Wilcoxon Rank Sum Test; if X has k > 2 levels, use Oneway ANOVA or the Kruskal-Wallis Test.
- Y ordinal or nominal, X continuous: if Y has 2 levels, use Logistic Regression; if Y has more than 2 levels, use Polytomous Logistic Regression.
- Y ordinal or nominal, X ordinal or nominal: if both X and Y have two levels, use Fisher's Exact Test, RR/OR, and Risk Difference/AR; if either X or Y has more than two levels, use a Chi-square Test; for dependent samples, use McNemar's Test.
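The decision table above can be sketched as a small lookup function. This is only an illustration of the slide's logic, not anything JMP exposes; the function name `choose_analysis`, its parameters, and the "categorical" label (standing in for "ordinal or nominal") are all hypothetical.

```python
def choose_analysis(y_type, x_type, y_levels=None, x_levels=None, paired=False):
    """Return the analyses the slide suggests for a response Y and predictor X.

    y_type, x_type: "continuous" or "categorical" (ordinal or nominal).
    y_levels, x_levels: number of categories; required for categorical variables.
    paired: True when the two categorical samples are dependent.
    """
    valid = {"continuous", "categorical"}
    if y_type not in valid or x_type not in valid:
        raise ValueError("types must be 'continuous' or 'categorical'")
    if y_type == "categorical" and (y_levels is None or y_levels < 2):
        raise ValueError("a categorical Y needs y_levels >= 2")
    if x_type == "categorical" and (x_levels is None or x_levels < 2):
        raise ValueError("a categorical X needs x_levels >= 2")

    if y_type == "continuous" and x_type == "continuous":
        return ["correlation", "regression"]
    if y_type == "continuous":  # X is categorical
        if x_levels == 2:
            return ["two-sample t-test", "Wilcoxon rank sum test"]
        return ["one-way ANOVA", "Kruskal-Wallis test"]
    if x_type == "continuous":  # Y is categorical
        if y_levels == 2:
            return ["logistic regression"]
        return ["polytomous logistic regression"]
    # Both Y and X are categorical.
    if paired:
        return ["McNemar's test"]
    if y_levels == 2 and x_levels == 2:
        return ["Fisher's exact test", "relative risk", "odds ratio",
                "risk difference"]
    return ["chi-square test"]


# Birthweight (continuous) vs. mother's smoking status (2 levels).
print(choose_analysis("continuous", "categorical", x_levels=2))
```

The four branches correspond to the four cells of the table; the level counts only matter inside a cell.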
7. Fit Y by X in JMP
[Figure: JMP Fit Y by X data-type grid. Y axis labels: Y nominal/ordinal, Y continuous. X axis labels: X continuous, X nominal/ordinal.]
8. Example: Low Birthweight Study (Note: this is not the NC one)
- List of Variables
- id: ID for infant and mother
- headcir: head circumference (in.)
- leng: length of infant (in.)
- weight: birthweight (lbs.)
- gest: gestational age (weeks)
- mage: mother's age
- mnocig: mother's cigarettes/day
- mheight: mother's height (in.)
- mppwt: mother's pre-pregnancy weight (lbs.)
- fage: father's age
- fedyrs: father's education (yrs.)
- fnocig: father's cigarettes/day
- fheight: father's height
- lowbwt: low birth weight indicator (1 = yes, 0 = no)
- mage35: mother's age over 35? (1 = yes, 0 = no)
- smoker: mother smoked during pregnancy (1 = yes, 0 = no)
- Smoker: mother's smoking status (Smoker or Non-smoker)
- Low Birth Weight: birth weight (Low, Normal)
Data types: Continuous (the measured variables, headcir through fheight) and Nominal (the indicator and status variables, lowbwt through Low Birth Weight).
9. Example: Low Birthweight Study (Birthweight vs. Gestational Age)
Y = birthweight (lbs.), Continuous
X = gestational age (weeks), Continuous
10. Regression and Correlation Analysis from Fit Y by X
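For readers who want to see what the correlation and regression summary computes, here is a minimal sketch in plain Python. The helper `pearson_and_line` is a hypothetical name, and the gestational age and birthweight values are invented for illustration; they are not the study data.

```python
import math


def pearson_and_line(x, y):
    """Return (r, slope, intercept) for the least-squares line y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return r, slope, intercept


# Invented illustrative values, NOT the study data.
gest = [34, 36, 37, 38, 39, 40, 41]             # gestational age (weeks)
weight = [4.9, 5.8, 6.4, 6.9, 7.4, 7.6, 8.1]    # birthweight (lbs.)

r, slope, intercept = pearson_and_line(gest, weight)
print(f"r = {r:.3f}, weight = {intercept:.2f} + {slope:.3f} * gest")
```

The slope is read as the estimated change in birthweight (lbs.) per additional week of gestation, and r squared is the share of variation in Y explained by the line.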
11. Example: Low Birthweight Study (Birthweight vs. Mother's Smoking Status)
Y = birthweight (lbs.), Continuous
X = mother's smoking status (Smoker vs. Non-smoker), Nominal
12. Independent Samples t-Test from Fit Y by X
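As a rough sketch of what an independent samples t-test computes, the following plain-Python code gives the test statistic and degrees of freedom for both the pooled (equal variance) and Welch (unequal variance) versions. The slides do not say which version the report uses, so both are shown; the function names and the birthweights are invented for illustration.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances. Returns (t, df)."""
    na, nb = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = math.sqrt(sp2 * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, na + nb - 2


def welch_t(a, b):
    """Two-sample t statistic without assuming equal variances. Returns (t, df)."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Invented birthweights (lbs.), NOT the study data.
nonsmoker = [7.9, 7.2, 8.1, 6.8, 7.5, 7.7]
smoker = [6.9, 6.4, 7.3, 6.1, 7.0]

print("pooled:", pooled_t(nonsmoker, smoker))
print("welch: ", welch_t(nonsmoker, smoker))
```

Turning the statistic into a p-value requires the t distribution, which the standard library does not provide; `scipy.stats.ttest_ind` (with `equal_var=True` or `equal_var=False`) covers both versions when SciPy is available.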
13. Example: Low Birthweight Study (Birthweight Status vs. Mother's Cigs/Day)
Y = birthweight status (Low, Normal), Nominal
X = mother's cigs./day, Continuous
The logistic plot shows P(Low | Cigs/Day).
14. Logistic Regression from Fit Y by X
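To make the fitted curve concrete, here is a minimal sketch of simple logistic regression fitted by Newton-Raphson in plain Python. It assumes the modeled event is Low coded as 1 and Normal as 0, which the slides do not state; `fit_logistic` is a hypothetical helper and the data are invented.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by Newton-Raphson. Returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), entries h00, h01, h11.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information) * step = gradient.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Invented data, NOT the study data: cigarettes/day and low-birthweight flag.
cigs = [0, 0, 0, 0, 5, 5, 10, 10, 15, 20, 20, 30]
low = [0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(cigs, low)
for c in (0, 10, 20):
    print(f"P(Low | {c} cigs/day) = {sigmoid(b0 + b1 * c):.3f}")
print(f"odds ratio per additional cigarette/day = {math.exp(b1):.3f}")
```

A positive slope `b1` means the estimated probability of low birthweight rises with cigarettes per day, and `exp(b1)` is the multiplicative change in the odds per additional cigarette.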
15. Example: Low Birthweight Study (Birthweight Status vs. Mother's Smoking Status)
Y = birthweight status (Low, Normal), Nominal
X = mother's smoking status (Smoker, Non-smoker), Nominal
16. Independent Samples p1 vs. p2: Fisher's Exact, Chi-square, Risk Difference, RR, OR
The arrows are skipped this time; everything should be self-explanatory. Notice that the OR is upside-down and needs reciprocation: OR = 1/0.342 ≈ 2.92.
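The reciprocation step can be checked with a short sketch. The function `two_by_two` is a hypothetical helper and the counts are invented; only the value 0.342 comes from the slide. Swapping which group is listed first turns the odds ratio into its reciprocal, which is why an "upside-down" OR is fixed by taking 1/OR.

```python
def two_by_two(a, b, c, d):
    """Summaries for a 2x2 table.

    a, b: events and non-events in group 1 (e.g. smokers: Low, Normal)
    c, d: events and non-events in group 2 (e.g. non-smokers: Low, Normal)
    """
    p1 = a / (a + b)  # risk in group 1
    p2 = c / (c + d)  # risk in group 2
    return {
        "risk_difference": p1 - p2,
        "relative_risk": p1 / p2,
        "odds_ratio": (a * d) / (b * c),
    }


# Invented counts, NOT the study data.
smokers_first = two_by_two(10, 90, 5, 95)
nonsmokers_first = two_by_two(5, 95, 10, 90)

print(smokers_first)
# Reversing the group order inverts the odds ratio.
print(nonsmokers_first["odds_ratio"], 1 / smokers_first["odds_ratio"])

# The slide's reported value, reciprocated.
print(round(1 / 0.342, 2))  # 2.92
```

Note that the risk difference changes sign and the relative risk also inverts when the groups are swapped, so the direction of comparison should always be stated alongside the estimate.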
17. Summary
- In summary, we have seen how bivariate relationships work in JMP and in statistics in general.
- We know that the type of analysis that is appropriate depends entirely on the data type of the response (Y) and the explanatory variable or predictor (X).