Stats 330: Lecture 31

1
Stats 330 Lecture 31
Course Summary
2
The Exam!
  • 25 multiple choice questions, similar to term
    test (60)
  • 3 long answer questions, similar to past exams
  • You have to do all the multiple choice
    questions, and 2 out of 3 long answer questions
    (3 out of 3 for 762)
  • Held on the afternoon of Monday 16th Nov 2009

3
STATS 330 Course Summary
  • The course was about
  • Graphics for data analysis
  • Regression models for data analysis

4
Graphics
  • Important ideas
  • Visualizing multivariate data
  • Pairs plots
  • 3d plots
  • Coplots
  • Trellis plots
  • Same scales
  • Plots in rows and columns
  • Diagnostic plots for model criticism

5
Regression models
  • We studied 3 types of regression
  • Ordinary (normal, least squares) regression for
    continuous responses
  • Logistic regression for binomial responses
  • Poisson regression for count responses
    (log-linear models)

6
Normal regression
  • Response is assumed to be $N(\mu, \sigma^2)$
  • Mean is a linear function of the covariates
  • $\mu = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$
  • Covariates can be either continuous or
    categorical
  • Observations independent, same variance
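
The linear-mean assumption can be illustrated with a single covariate, where the least-squares estimates have a closed form. This is a minimal sketch, not the course's own code: the function name `fit_simple_ols` and the data are hypothetical, chosen so the points lie exactly on a line.

```python
def fit_simple_ols(x, y):
    """Least-squares estimates (b0, b1) for the line mean = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squares of x and cross-product of x and y about their means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx           # slope
    b0 = y_bar - b1 * x_bar  # intercept
    return b0, b1


# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
b0, b1 = fit_simple_ols(x, y)
print(b0, b1)
```

With real data the points scatter about the line, and the same formulas give the best-fitting line in the least-squares sense.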

7
Logistic regression
  • Response (s successes out of n) is assumed to
    be Binomial, $\mathrm{Bin}(n, \pi)$
  • Logit of the probability, $\log(\pi/(1-\pi))$, is a linear
    function of the covariates
  • $\log(\pi/(1-\pi)) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$
  • Covariates can be either continuous or
    categorical
  • Observations independent
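
The logit link and its inverse can be sketched in a few lines. The function names `logit` and `inv_logit` are chosen here for illustration; the inverse is what turns a fitted linear predictor back into a probability.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Probability corresponding to a log-odds value eta."""
    return 1.0 / (1.0 + math.exp(-eta))


# A probability of 0.5 corresponds to log-odds of 0, and the two
# functions undo each other.
print(logit(0.5), inv_logit(0.0), inv_logit(logit(0.3)))
```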

8
Poisson regression
  • Response is assumed to be Poisson($\mu$)
  • Log of the mean, $\log(\mu)$, is a linear function of the
    covariates (log-linear models)
  • $\log(\mu) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$
  • Or, equivalently,
    $\mu = \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)$
  • Covariates can be either continuous or
    categorical
  • Observations independent

9
Interpretation of $\beta$-coefficients
  • For continuous covariates
  • In normal regression, $\beta$ is the increase in mean
    response associated with a unit increase in x
  • In logistic regression, $\beta$ is the increase in log
    odds associated with a unit increase in x
  • In Poisson regression, $\beta$ is the increase in log
    mean associated with a unit increase in x
  • In logistic regression, if x is increased by 1,
    the odds are multiplied by a factor of $\exp(\beta)$
  • In Poisson regression, if x is increased by 1,
    the mean is multiplied by a factor of $\exp(\beta)$
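
The multiplicative interpretation can be checked numerically. The sketch below uses made-up coefficients `b0` and `b1` (hypothetical values, not from any fitted model) and confirms that a unit increase in x multiplies the odds (logistic) or the mean (Poisson) by exp(b1).

```python
import math


def inv_logit(eta):
    """Probability corresponding to a log-odds value eta."""
    return 1.0 / (1.0 + math.exp(-eta))


def odds(p):
    """Odds of an event with probability p."""
    return p / (1.0 - p)


# Hypothetical coefficients and covariate value, for illustration only.
b0, b1 = -1.0, 0.5
x = 2.0

# Logistic regression: ratio of odds at x + 1 to odds at x.
p_x = inv_logit(b0 + b1 * x)
p_x1 = inv_logit(b0 + b1 * (x + 1.0))
odds_ratio = odds(p_x1) / odds(p_x)

# Poisson regression: ratio of mean at x + 1 to mean at x.
mu_x = math.exp(b0 + b1 * x)
mu_x1 = math.exp(b0 + b1 * (x + 1.0))
mean_ratio = mu_x1 / mu_x

print(odds_ratio, mean_ratio, math.exp(b1))
```

The ratio does not depend on the starting value of x, which is why a single number summarises the effect.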

10
Interpretation of $\beta$-coefficients
  • For categorical covariates
  • In normal regression, $\beta$ is the increase in mean
    response relative to the baseline
  • In logistic regression, $\beta$ is the increase in log
    odds relative to the baseline
  • In logistic regression, if we change from
    baseline to some level, the odds are multiplied by
    a factor of exp(parameter for that level)
    relative to the baseline
  • In Poisson regression, if we change from baseline
    to some level, the mean is multiplied by a
    factor of exp(parameter for that level) relative
    to the baseline

11
Measures of Fit
  • $R^2$ (for normal regression)
  • Residual deviance (for logistic and Poisson
    regression)
  • But not for ungrouped data in logistic regression
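
For normal regression, $R^2$ is one minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch, with a hypothetical helper `r_squared` that takes the observed responses and the fitted values from any model with an intercept:

```python
def r_squared(y, fitted):
    """Proportion of variation in y explained by the fitted values."""
    y_bar = sum(y) / len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual SS
    tss = sum((yi - y_bar) ** 2 for yi in y)                # total SS
    return 1.0 - rss / tss


# A perfect fit gives 1; fitting only the overall mean gives 0.
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
print(perfect, mean_only)
```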

12
Prediction
  • For normal regression,
  • Predict response at covariates x1, . . . ,xk
  • Estimate mean response at covariates x1, . . .
    ,xk
  • For logistic regression,
  • estimate log-odds at covariates x1, . . . ,xk
  • Estimate probability of success at covariates
    x1, . . . ,xk
  • For Poisson regression,
  • Estimate mean at covariates x1, . . . ,xk
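
For normal regression, predicting a response and estimating the mean response give the same point value but different uncertainty. A brief sketch of the standard result, assuming the new observation is independent of the data used to fit the model; the symbols $x_{0j}$, $\hat{y}_0$ and $s^2$ (the residual mean square) are introduced here for illustration:

$$
\hat{y}_0 = \hat\beta_0 + \hat\beta_1 x_{01} + \cdots + \hat\beta_k x_{0k},
\qquad
\operatorname{se}_{\text{pred}}^2 = s^2 + \operatorname{se}_{\text{mean}}^2
$$

The extra $s^2$ term is the variability of a single new response about its mean, so a prediction interval is always wider than the confidence interval for the mean at the same covariates.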

13
Inference
  • Summary table
  • Estimates of regression coefs
  • Standard errors
  • Test statistics for coef = 0
  • $R^2$ etc. (normal regression)
  • F-test for null model
  • Null and residual deviances (logistic/Poisson)

14
Testing model vs sub-model
  • Use and interpretation of both forms of anova
  • Comparing model with a sub-model
  • Adding successive terms to a model

15
Topics specific to normal regression
  • Collinearity
  • VIFs
  • Correlation
  • Added variable plots
  • Model selection
  • Stepwise procedures: forward selection (FS),
    backward elimination (BE), stepwise
  • All possible regressions approach
  • AIC, BIC, $C_p$, adjusted $R^2$, CV
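
The variance inflation factor for covariate j is $1/(1 - R_j^2)$, where $R_j^2$ comes from regressing that covariate on the others. With only two covariates, $R_j^2$ is the squared correlation between them, which allows a short sketch. The function names and data below are hypothetical.

```python
import math


def correlation(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def vif_two_covariates(x1, x2):
    """VIF shared by both covariates when the model has exactly two."""
    r = correlation(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Nearly collinear covariates give a large VIF; uncorrelated ones give 1.
vif_collinear = vif_two_covariates([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
vif_orthogonal = vif_two_covariates([-1.0, 0.0, 1.0], [1.0, -2.0, 1.0])
print(vif_collinear, vif_orthogonal)
```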

16
Factors (categorical explanatory variables)
  • Factors
  • Baselines
  • Levels
  • Factor level combinations
  • Interactions
  • Dummy variables
  • Know how to express interactions in terms of
    means, means in terms of interactions
  • Know how to interpret zero interactions
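
Moving between cell means and interactions can be sketched for two factors with two levels each. The sketch assumes baseline (treatment) coding, with the first level of each factor as baseline; the function names and the cell means are hypothetical.

```python
def effects_from_means(m):
    """Baseline, main effects and interaction from a 2x2 table of means.

    m[i][j] is the mean at level i of factor A and level j of factor B,
    with index 0 taken as the baseline level of each factor.
    """
    baseline = m[0][0]
    a2 = m[1][0] - m[0][0]  # effect of A at the baseline level of B
    b2 = m[0][1] - m[0][0]  # effect of B at the baseline level of A
    ab = m[1][1] - m[1][0] - m[0][1] + m[0][0]  # interaction
    return baseline, a2, b2, ab


def means_from_effects(baseline, a2, b2, ab):
    """Rebuild the 2x2 table of means from the parameters."""
    return [
        [baseline, baseline + b2],
        [baseline + a2, baseline + a2 + b2 + ab],
    ]


# Zero interaction: the effect of A is the same at both levels of B.
additive = effects_from_means([[10, 12], [15, 17]])
# Non-zero interaction: the effect of A differs between levels of B.
means = [[10, 12], [15, 20]]
non_additive = effects_from_means(means)
print(additive, non_additive)
```

A zero interaction means the lines joining the cell means are parallel, so the effect of one factor does not depend on the level of the other.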

17
Fitting and Choosing models
  • Fit a separate plane (mean if no continuous
    covariates) to each combination of factor levels
  • Search for a simpler submodel (with some
    interactions zero) using stepwise and anova

18
Diagnostics
  • For non-planar data
  • Plot residuals vs fitted values, residuals vs x's,
    partial residual plots, gam plots, Box-Cox plot
  • Transform either x's or response, fit polynomial
    terms
  • For unequal variance
  • Plot residuals vs fitted values, look for funnel effect
  • Weighted least squares
  • Transform response

19
Diagnostics (2)
  • For outliers and high-leverage points
  • Hat matrix diagonals
  • Standardised residuals
  • Leave-one-out diagnostics
  • Independent observations
  • Acf plots
  • Residual/previous residual
  • Time series plot of residuals
  • Durbin-Watson test
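
The Durbin-Watson statistic compares successive residuals: it is the sum of squared differences between each residual and the previous one, divided by the sum of squared residuals. A minimal sketch, assuming the residuals are in time order; the residual sequences are made up for illustration.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals given in time order."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    return num / den


# Residuals that change sign slowly (positive autocorrelation) give a
# value below 2; residuals that alternate in sign give a value above 2.
dw_positive = durbin_watson([1.0, 1.0, -1.0, -1.0])
dw_negative = durbin_watson([1.0, -1.0, 1.0, -1.0])
print(dw_positive, dw_negative)
```

The statistic lies between 0 and 4, and values near 2 are consistent with no lag-1 autocorrelation.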

20
Diagnostics (3)
  • Normality
  • Normal plot
  • Weisberg-Bingham test
  • Box-Cox (select power)
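
The Box-Cox family indexes power transformations of a positive response by a single power, with the log as the limiting case at zero. A minimal sketch of the transformation itself (choosing the power from the Box-Cox plot is not shown); the function name `box_cox` is chosen here for illustration.

```python
import math


def box_cox(y, lam):
    """Box-Cox transformation of a positive value y with power lam."""
    if lam == 0:
        return math.log(y)  # limiting case as lam tends to 0
    return (y ** lam - 1.0) / lam


# lam = 1 only shifts the response, lam = 0.5 acts like a square root,
# and lam = 0 is the log transformation.
print(box_cox(4.0, 1), box_cox(4.0, 0.5), box_cox(math.e, 0))
```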

25
Topics specific to Poisson regression
  • Offsets
  • Interpretation of regression coefficients
  • (same as for odds in logistic regression)
  • Correspondence between Poisson regression
    (Log-linear models) and the multinomial model
    for contingency tables
  • The Poisson trick
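
An offset can be sketched as follows, assuming each count is observed over a known exposure $t$ (a symbol introduced here for illustration, for example a population size or a length of time), so that the mean count is the exposure times a rate:

$$
\log(\mu) = \log(t) + \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k
\quad\Longleftrightarrow\quad
\frac{\mu}{t} = \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)
$$

The term $\log(t)$ enters the linear predictor with its coefficient fixed at 1 rather than estimated, so the regression coefficients describe the rate $\mu/t$ instead of the raw count.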

27
Contingency tables (2)
  • A model for the table is anything that
    specifies the form of the probabilities,
    possibly up to k unknown parameters
  • Test if the model is OK by
  • Calculate deviance $= 2(\log L_{MAX} - \log L_{MOD})$
  • $\log L_{MAX}$: replace $\pi$'s with table frequencies
  • $\log L_{MOD}$: replace $\pi$'s with estimated $\pi$'s from the
    model
  • Model OK if deviance is small (p-value > 0.05)
  • Degrees of freedom $= m - 1 - k$
  • k = number of parameters in the model
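
The deviance calculation can be sketched for the independence model in a two-way table. This assumes m is the number of cells in the table (not stated on the slide) and counts k as the free row and column probabilities; the function name and the tables are hypothetical.

```python
import math


def independence_deviance(table):
    """Deviance and degrees of freedom for independence in a two-way table."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    dev = 0.0
    for i, row in enumerate(table):
        for j, y in enumerate(row):
            # Fitted count under independence: n * (row prob) * (col prob).
            fitted = row_tot[i] * col_tot[j] / n
            if y > 0:  # a zero cell contributes nothing to the sum
                dev += 2.0 * y * math.log(y / fitted)
    n_rows, n_cols = len(table), len(table[0])
    m = n_rows * n_cols              # number of cells (assumed meaning of m)
    k = (n_rows - 1) + (n_cols - 1)  # free parameters under independence
    return dev, m - 1 - k


# Rows exactly proportional: independence fits perfectly, deviance 0.
dev_indep, df_indep = independence_deviance([[10, 20], [20, 40]])
# Counts concentrated on the diagonal: deviance is positive.
dev_assoc, df_assoc = independence_deviance([[10, 5], [5, 10]])
print(dev_indep, df_indep, dev_assoc, df_assoc)
```

The p-value then comes from comparing the deviance with a chi-squared distribution on the stated degrees of freedom.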

28
Independence models
  • Correspond to interactions being zero
  • Fit a saturated model using Poisson regression
  • Use anova, stepwise to see which interactions are
    zero
  • Identify the appropriate model
  • Models can be represented by graphs

29
Odds Ratios
  • Definition and interpretation
  • Connection to independence
  • Connection with interactions
  • Relationship between conditional ORs and
    interactions
  • Homogeneous association model
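
For a 2x2 table, the odds ratio and its link to the interaction can be sketched directly. Under baseline coding, the interaction in the saturated log-linear model equals the log of the odds ratio, so a zero interaction corresponds to an odds ratio of 1, that is, independence. The function names and counts below are hypothetical.

```python
import math


def odds_ratio(table):
    """Odds ratio of a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


def loglinear_interaction(table):
    """Interaction of the saturated log-linear model, baseline coding."""
    (a, b), (c, d) = table
    return math.log(d) - math.log(c) - math.log(b) + math.log(a)


or_indep = odds_ratio([[10, 20], [20, 40]])  # independent table
or_assoc = odds_ratio([[10, 5], [5, 10]])    # associated table
inter_assoc = loglinear_interaction([[10, 5], [5, 10]])
print(or_indep, or_assoc, inter_assoc)
```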

30
Association graphs
  • Each node is a factor
  • Factors joined by lines if an interaction
    between them
  • Interpretation in terms of conditional
    independence
  • Interpretation in terms of collapsibility

31
Contingency tables: final topics
  • Association reversal
  • Simpson's paradox
  • When can you collapse?
  • Product multinomial
  • Comparing populations
  • Populations the same if certain interactions are
    zero
  • Goodness of fit to a distribution
  • Special case of 1-dimensional table
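
Association reversal can be illustrated with a small numerical sketch. The counts below are hypothetical, chosen only to show the effect: within each stratum treatment A has the higher odds of success, but after collapsing over strata treatment B appears better.

```python
def odds_ratio(success_a, fail_a, success_b, fail_b):
    """Odds of success under A divided by odds of success under B."""
    return (success_a * fail_b) / (fail_a * success_b)


# Hypothetical counts per stratum:
# (successes A, failures A, successes B, failures B).
strata = {
    "stratum 1": (81, 6, 234, 36),
    "stratum 2": (192, 71, 55, 25),
}

# Conditional odds ratios, one per stratum.
conditional = {name: odds_ratio(*counts) for name, counts in strata.items()}

# Collapse the table by summing counts over strata.
collapsed = tuple(sum(counts[i] for counts in strata.values()) for i in range(4))
marginal = odds_ratio(*collapsed)

print(conditional, marginal)
```

Both conditional odds ratios exceed 1 while the marginal odds ratio is below 1, so collapsing over the stratifying factor is not safe for these counts.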