Regression Analysis of Count Data and Development of Statistical Models - PowerPoint PPT Presentation



Slides: 43

Provided by: dlo4


Transcript and Presenter's Notes


1
Regression Analysis of Count Data and Development of Statistical Models

Lecture 7
Part II

2
Statistical Models For Crash Data
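Recap: Poisson Model
The PDF of the Poisson regression for yi given xi is
$f(y_i \mid x_i) = \frac{e^{-\mu_i}\,\mu_i^{y_i}}{y_i!}, \quad y_i = 0, 1, 2, \ldots$
The mean and variance are given by
$E[y_i \mid x_i] = \mathrm{Var}[y_i \mid x_i] = \mu_i$
The mean function is given by
$\mu_i = \exp(x_i'\beta)$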
3
Statistical Models For Crash Data
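Recap: Poisson-gamma Model (NB)
The PDF of the Poisson-gamma regression for yi given xi is
$f(y_i \mid x_i) = \frac{\Gamma(y_i + \phi)}{\Gamma(\phi)\, y_i!} \left(\frac{\phi}{\mu_i + \phi}\right)^{\phi} \left(\frac{\mu_i}{\mu_i + \phi}\right)^{y_i}$
The mean and variance are given by
$E[y_i \mid x_i] = \mu_i, \quad \mathrm{Var}[y_i \mid x_i] = \mu_i + \frac{\mu_i^2}{\phi}$
The mean function is given by
$\mu_i = \exp(x_i'\beta)$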
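As a quick numerical check of the two recap models, the sketch below evaluates the Poisson and Poisson-gamma probabilities and recovers their mean and variance by direct summation. It assumes the parameterization in which the Poisson-gamma variance is mu + mu^2/phi; the function names and the values of `mu` and `phi` are illustrative only.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability of observing count y when the mean is mu."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def nb_pmf(y, mu, phi):
    """Poisson-gamma (NB) probability, assuming Var(y) = mu + mu**2 / phi."""
    log_p = (math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
             + phi * math.log(phi / (mu + phi))
             + y * math.log(mu / (mu + phi)))
    return math.exp(log_p)

def moments(pmf, upper=500):
    """Sum a pmf over 0..upper-1 to recover its mean and variance numerically."""
    probs = [pmf(y) for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

mu, phi = 3.9, 2.4683  # illustrative values, not fitted here
pois_mean, pois_var = moments(lambda y: poisson_pmf(y, mu))
nb_mean, nb_var = moments(lambda y: nb_pmf(y, mu, phi))
print(pois_mean, pois_var)  # both close to mu
print(nb_mean, nb_var)      # mean close to mu, variance close to mu + mu**2 / phi
```

The Poisson-gamma variance exceeds the Poisson variance for any finite phi, which is why the model is used for over-dispersed crash counts.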
Note
4
Statistical Models For Crash Data
Statistical fit (Goodness of fit)
There are various methods for assessing the
statistical fit of models. The most common one is
to compute the deviance between the observed
and the predicted values. This is similar to the
sum of squared errors described before for
multivariate linear regression with a normal
error structure. The methods include:
Deviance statistic
Scaled deviance
Akaike's Information Criterion
Pseudo R-squared
5
Statistical Models For Crash Data
Statistical fit (Goodness of fit)
The deviance statistic is defined as twice the
difference between the maximum log-likelihood
achievable and the log-likelihood of the fitted
model:
$D = 2\,[\ell(y; y) - \ell(\hat{\mu}; y)]$
When competing models are compared, the model
with the lowest deviance offers the best
statistical fit. A note of caution: this is only
valid when the dispersion parameter φ is the same
for each competing model.
6
Statistical Models For Crash Data
Statistical fit (Goodness of fit)
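The deviance statistic for the Poisson model is
the following
$D = 2 \sum_{i} \left[ y_i \ln\!\left(\frac{y_i}{\hat{\mu}_i}\right) - (y_i - \hat{\mu}_i) \right]$
The deviance statistic for the Poisson-gamma
model is the following
$D = 2 \sum_{i} \left[ y_i \ln\!\left(\frac{y_i}{\hat{\mu}_i}\right) - (y_i + \phi) \ln\!\left(\frac{y_i + \phi}{\hat{\mu}_i + \phi}\right) \right]$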
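The two deviance statistics can be sketched in a few lines. This is a minimal illustration that assumes the standard textbook forms of the Poisson and Poisson-gamma deviance with a known dispersion parameter `phi`, and the usual convention that y ln(y/mu) is zero when y = 0; the data values are made up.

```python
import math

def _ylogy(y, mu):
    """y * ln(y / mu), taken as 0 when y == 0."""
    return y * math.log(y / mu) if y > 0 else 0.0

def poisson_deviance(y, mu):
    """Poisson deviance for observed counts y and fitted means mu."""
    return 2.0 * sum(_ylogy(yi, mi) - (yi - mi) for yi, mi in zip(y, mu))

def nb_deviance(y, mu, phi):
    """Poisson-gamma (NB) deviance, assuming Var(y) = mu + mu**2 / phi."""
    return 2.0 * sum(
        _ylogy(yi, mi) - (yi + phi) * math.log((yi + phi) / (mi + phi))
        for yi, mi in zip(y, mu)
    )

observed = [0, 2, 5, 1]          # hypothetical crash counts
fitted = [0.5, 2.0, 3.5, 1.5]    # hypothetical fitted means
print(poisson_deviance(observed, fitted))
print(nb_deviance(observed, fitted, phi=2.4683))
```

As phi grows large the Poisson-gamma deviance approaches the Poisson deviance, which is one reason deviances are only comparable between models that share the same dispersion parameter.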
7
Statistical Models For Crash Data
Statistical fit (example)
Regression Analysis
Response variate: Y
Distribution: Negative binomial with parameter k = 2.4683
Link function: Log
Fitted terms: Constant, L_F1, L_F2
Residual d.f. 252, deviance 285.4

Summary of analysis
             d.f.   deviance   mean deviance   deviance ratio
Regression      2       88.2          44.085            38.93
Residual      252      285.4           1.133
Total         254      373.6           1.471
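The columns of the summary table are related by simple arithmetic: the mean deviance is the deviance divided by its degrees of freedom, and the deviance ratio is the regression mean deviance divided by the residual mean deviance. The sketch below recomputes them from the rounded figures printed above, so small differences in the last digit are expected.

```python
# (d.f., deviance) pairs copied from the summary of analysis
rows = {
    "Regression": (2, 88.2),
    "Residual": (252, 285.4),
    "Total": (254, 373.6),
}

# mean deviance = deviance / d.f.
mean_deviance = {name: dev / df for name, (df, dev) in rows.items()}

# deviance ratio = regression mean deviance / residual mean deviance
deviance_ratio = mean_deviance["Regression"] / mean_deviance["Residual"]

for name, value in mean_deviance.items():
    print(f"{name:<11}{value:10.3f}")
print(f"deviance ratio {deviance_ratio:.2f}")
```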
8
Statistical Models For Crash Data
Statistical fit (example)
Estimates of parameters
            estimate     s.e.   t(252)
Constant       -9.81     1.56    -6.29
L_F1           0.816    0.150     5.45
L_F2          0.3732   0.0629     5.93
MESSAGE: s.e.s are based on the residual deviance
Aggregation parameter
        k     s.e.   -2 x log-likelihood
   2.4683   0.3645
The values above indicate: β0 = exp(-9.81) ≈ 0.000055, β1 = 0.816, β2 = 0.3732
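To show how the estimates translate into a prediction, the sketch below evaluates the fitted mean under the assumption that L_F1 and L_F2 are the natural logarithms of two traffic flows F1 and F2, so that the log-link model is equivalent to the product of β0 and the two flows raised to β1 and β2. The function name and the flows used in the call are hypothetical.

```python
import math

# Estimates copied from the parameter table above
CONSTANT = -9.81
BETA_1 = 0.816
BETA_2 = 0.3732

def predicted_mean(f1, f2):
    """Fitted mean under a log link, assuming L_F1 = ln(F1) and L_F2 = ln(F2)."""
    eta = CONSTANT + BETA_1 * math.log(f1) + BETA_2 * math.log(f2)
    return math.exp(eta)

def predicted_mean_product_form(f1, f2):
    """Same prediction written as beta0 * F1**beta1 * F2**beta2."""
    return math.exp(CONSTANT) * f1 ** BETA_1 * f2 ** BETA_2

# Hypothetical flows, chosen only for illustration
print(predicted_mean(10_000, 1_000))
print(predicted_mean_product_form(10_000, 1_000))
```

Both functions return the same value, which is why the constant is reported on the exponentiated scale as β0.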
9
Statistical Models For Crash Data
Confidence Intervals
Confidence intervals for GLMs can be computed
using various approaches. They are generally a
little more complicated than the method used for
linear models, and some of these approaches are
approximations. One common approach is to use the
delta method.
The standard error of the estimate of the mean
response at x0 is
10
Statistical Models For Crash Data
Confidence Intervals (cont'd)
Where,
This is similar to the variance-covariance matrix
of linear models
Again, this is in essence the values at x0.
11
Statistical Models For Crash Data
Confidence Intervals (cont'd)
Confidence interval on the mean response is
Confidence interval for predicted values is
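The delta-method idea can be sketched for a log-link model as follows. This is not necessarily the exact expression used on the slides: it assumes the variance of the linear predictor at x0 is x0' Cov(β) x0 and that the standard error of the mean response is approximately the mean times the standard error of the linear predictor. The coefficients, covariance matrix, and x0 are hypothetical.

```python
import math

def linear_predictor_variance(x0, cov):
    """Quadratic form x0' * Cov(beta_hat) * x0."""
    n = len(x0)
    return sum(x0[i] * cov[i][j] * x0[j] for i in range(n) for j in range(n))

def mean_response_interval(x0, beta, cov, z=1.96):
    """Delta-method interval for the mean response of a log-link model at x0."""
    eta = sum(b * x for b, x in zip(beta, x0))
    mu = math.exp(eta)
    se_eta = math.sqrt(linear_predictor_variance(x0, cov))
    se_mu = mu * se_eta  # d(mu)/d(eta) = mu under a log link
    return mu, se_mu, (mu - z * se_mu, mu + z * se_mu)

# Hypothetical two-parameter model (intercept and one covariate)
beta = [0.5, 0.2]
cov = [[0.04, -0.01],
       [-0.01, 0.01]]
x0 = [1.0, 2.0]

mu, se_mu, interval = mean_response_interval(x0, beta, cov)
print(mu, se_mu, interval)
```

The covariance matrix of the estimates is normally reported by the statistical package that fits the model.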
12
Statistical Models For Crash Data
Confidence Intervals
A recent paper by Wood (2004) provides a direct
way of estimating the confidence intervals for
Poisson and Poisson-gamma models, specifically
for crash data.
Poisson Model
The 95% confidence interval on the mean response µ is
given by
13
Statistical Models For Crash Data
Confidence Intervals
Poisson Model
The 95% confidence interval on the predicted response
y is given by
This matrix can be provided by computer programs
where ⌊x⌋ denotes the largest integer less than or equal to x
14
Statistical Models For Crash Data
Confidence Intervals
Poisson-gamma Model
The 95% confidence interval on the mean response µ is
given by
15
Statistical Models For Crash Data
Confidence Intervals
Poisson-gamma Model
The 95% confidence interval on the mean response m
(the mean of the gamma distribution) is given by
Remember
16
Statistical Models For Crash Data
Confidence Intervals
Poisson-gamma Model
The 95% confidence interval on the predicted response
y is given by
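The intervals on the preceding slides can be sketched in code. The formulas below are an assumption: they follow the form in which Wood's intervals are commonly quoted in the crash-modeling literature (a multiplicative interval for µ, and one-sided bounds using the square root of 19 and the floor function for predictions) and should be checked against the original paper before use. `var_eta` stands for the variance of the linear predictor at the site of interest and `phi` for the dispersion parameter; all names and input values are illustrative.

```python
import math

def mu_interval(mu_hat, var_eta, z=1.96):
    """95% interval on the mean response mu (assumed same form for both models)."""
    factor = math.exp(z * math.sqrt(var_eta))
    return mu_hat / factor, mu_hat * factor

def poisson_prediction_interval(mu_hat, var_eta):
    """95% interval on a predicted count y under the Poisson model (assumed form)."""
    var_y = mu_hat ** 2 * var_eta + mu_hat
    return 0, math.floor(mu_hat + math.sqrt(19.0) * math.sqrt(var_y))

def gamma_mean_interval(mu_hat, var_eta, phi, z=1.96):
    """95% interval on the gamma mean m under the Poisson-gamma model (assumed form)."""
    var_m = mu_hat ** 2 * var_eta + (mu_hat ** 2 + mu_hat ** 2 * var_eta) / phi
    sd = math.sqrt(var_m)
    return max(0.0, mu_hat - z * sd), mu_hat + z * sd

def nb_prediction_interval(mu_hat, var_eta, phi):
    """95% interval on a predicted count y under the Poisson-gamma model (assumed form)."""
    var_y = (mu_hat ** 2 * var_eta
             + (mu_hat ** 2 + mu_hat ** 2 * var_eta) / phi
             + mu_hat)
    return 0, math.floor(mu_hat + math.sqrt(19.0) * math.sqrt(var_y))

# Illustrative inputs only
mu_hat, var_eta, phi = 3.9, 0.01, 2.4683
print(mu_interval(mu_hat, var_eta))
print(poisson_prediction_interval(mu_hat, var_eta))
print(gamma_mean_interval(mu_hat, var_eta, phi))
print(nb_prediction_interval(mu_hat, var_eta, phi))
```

With these forms the intervals widen from µ to m to y, since each step adds a further source of variation.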
17
Statistical Models For Crash Data
Confidence Intervals
Example
This example is taken from Wood (2004). The
following model predicts the number of rear-end
collisions at signalized intersections
18
Statistical Models For Crash Data
Confidence Intervals
In this example, the matrix (D'D)^-1 is equal to
Compute for F = 10,000
19
Statistical Models For Crash Data
Confidence Intervals
Compute the confidence interval for µ.
For the confidence interval is
Compute the confidence interval for m.
For the confidence
interval is
Compute the confidence interval for y.
For the confidence
interval is
See figure on next page
20
Statistical Models For Crash Data
Confidence Intervals
21
Statistical Models For Crash Data

Crash data often have the characteristic that
the mean µ can be very low (below 0.5)
This creates problems with goodness-of-fit and
prediction
Read the paper by Wood, G.R. (2004), Generalised
Linear Models and Goodness of Fit Testing,
Accident Analysis & Prevention, Vol. 34, pp.
417-427.

22
Statistical Models For Crash Data
Low Mean Issue
23
Statistical Models For Crash Data
Time Trend Effects
24
Statistical Models For Crash Data
Time Trend Effects
Goal: capture changes that vary from year to year
directly in the model.
The model structure is given by the following
Time trend: captured with the intercept (i.e., one
intercept for each year)
Characteristic: each year is defined as a
different observation.
Issues: since each site is observed at several
points in time, temporal serial correlation
exists and affects the statistical inferences
drawn from the models. Therefore, you need to
account for this correlation in the model.
Modeling approach: Generalized Estimating
Equations (GEE), random-effects models, etc.
25
Bayes Methods

Originally presented
Highway Safety Manual
"After the Crashes Are Counted" Workshop
Sunday, January 11th, 2004

26
Introduction

The Bayes method approaches the analysis of data
differently from the classical (frequentist)
method
Subjective judgment is more easily incorporated
with the observed data and models
Unknown coefficients of regression models are
treated as random variables
Data analysis is less limited by the number of
observations (can be supplemented with subjective
judgment)
Computationally intensive (no longer an issue)

27
Important Characteristics

The Bayes method makes inferences from data using
probability models for quantities that are
observed and for quantities one wishes to
learn about
Bayesian data analysis can be divided into three
steps:
Setting up a full probability model: provide a
joint probability distribution for all observable
and unobservable quantities
Conditioning on observed data: calculating and
interpreting the appropriate posterior
distribution (conditional probability
distribution)
Evaluating the fit of the model and the
implications of the posterior distribution
Emphasis is placed on interval estimation
(confidence interval) rather than hypothesis
testing

28
Basic Principles
Venn Diagram
[Figure: Venn diagram showing event A and the events E1 to E5]
29
Basic Principles
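Total Probability Theorem
[Figure: Venn diagram showing event A and the events E1 to E5]
For mutually exclusive and collectively exhaustive events E1, ..., E5:
$P(A) = \sum_{i} P(A \mid E_i)\, P(E_i)$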
30
Basic Principles
Bayes Theorem
If event A occurred, what is the probability that
event Ei also occurred?
[Figure: Venn diagram showing event A and the events E1 to E5]
31
Basic Principles
Bayes Theorem
If event A occurred, what is the probability that
event Ei also occurred?
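Given the multiplication rule, it can be shown
that
$P(A \cap E_i) = P(A \mid E_i)\, P(E_i) = P(E_i \mid A)\, P(A)$
Therefore, we obtain
$P(E_i \mid A) = \frac{P(A \mid E_i)\, P(E_i)}{P(A)}$
Using the Total Probability Theorem for P(A), we
get
$P(E_i \mid A) = \frac{P(A \mid E_i)\, P(E_i)}{\sum_{j} P(A \mid E_j)\, P(E_j)}$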
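A small numerical illustration of the Total Probability Theorem and Bayes Theorem for five events follows. The probabilities are hypothetical and chosen only so that the P(Ei) sum to one.

```python
def total_probability(priors, likelihoods):
    """P(A) = sum over i of P(A | E_i) * P(E_i)."""
    return sum(p * l for p, l in zip(priors, likelihoods))

def bayes_posteriors(priors, likelihoods):
    """P(E_i | A) for every i, using the total probability theorem for P(A)."""
    p_a = total_probability(priors, likelihoods)
    return [p * l / p_a for p, l in zip(priors, likelihoods)]

priors = [0.10, 0.20, 0.30, 0.25, 0.15]       # hypothetical P(E_i), summing to 1
likelihoods = [0.50, 0.10, 0.05, 0.20, 0.40]  # hypothetical P(A | E_i)

p_a = total_probability(priors, likelihoods)
posteriors = bayes_posteriors(priors, likelihoods)
print(p_a)
print(posteriors)
```

The posterior probabilities always sum to one, because the denominator is exactly the sum of the numerators.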
32
Bayes Model
In modeling and data analysis, the Bayes method
can be expressed by the following equation
$p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)}$
where
y = the observed data (can be defined as a vector)
θ = unobserved quantity
33
Bayes Model
Terminology
p(θ | y): posterior probability distribution, conditional on y
(obtained from the joint probability distribution of θ and y)
p(θ): prior distribution (can be informative,
non-informative, etc.)
p(y | θ): likelihood function, when it is regarded as a
function of θ for a fixed y
p(y): prior predictive distribution (also called the
marginal distribution of y)
e.g., Poisson distribution: p(y | θ) ~ Po(θ)
34
Bayes Model

Hierarchical Models (aka multilevel models)
They are used when information is available on
different levels of observational units
Hierarchical models allow the modeler to
structure some dependence between the parameters
under study in a logical manner
Observable outcomes are modeled conditionally on
certain parameters, which are themselves given a
probability distribution in terms of further
parameters known as hyperparameters
Such hierarchical thinking helps in understanding
multiparameter problems and plays an important
role in developing computational strategies

35
Bayes Model
Example: hierarchical model for crash data, p(α, β, θ | y)
Hyperparameters (α, β): assume α and β (known or unknown)
Parameter (θ): mean θ ~ Gamma(α, β)
Observable quantity (y): Poisson distribution with mean θ
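When the hyperparameters are treated as known, this hierarchy has a closed-form posterior: a Gamma prior combined with Poisson counts gives a Gamma posterior. The sketch below assumes the second Gamma parameter is a rate, so that the prior mean is the shape divided by the rate; the hyperparameters and counts are hypothetical.

```python
def gamma_poisson_update(alpha, beta, counts):
    """Posterior Gamma(shape, rate) for a Poisson mean with a Gamma(alpha, beta) prior.

    Assumes beta is a rate parameter and the counts are independent
    Poisson observations sharing the same mean.
    """
    shape = alpha + sum(counts)
    rate = beta + len(counts)
    return shape, rate

alpha, beta = 2.0, 1.0     # hypothetical hyperparameters
counts = [3, 5, 4, 6, 2]   # hypothetical yearly crash counts at one site

shape, rate = gamma_poisson_update(alpha, beta, counts)
prior_mean = alpha / beta
sample_mean = sum(counts) / len(counts)
posterior_mean = shape / rate
print(prior_mean, sample_mean, posterior_mean)
```

The posterior mean falls between the prior mean and the sample mean, which is the same shrinkage behavior that the empirical Bayes estimate exploits.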
36
Bayes Model

Estimation of the posterior distribution, p(θ | y),
can be accomplished by integrating the full Bayes
equation (the likelihood and prior probability
functions)
The estimation can also be performed by
simulating the posterior distribution
Markov Chain Monte Carlo (MCMC) simulation
techniques are now frequently used for estimating
the posterior distribution
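To make the simulation idea concrete, here is a minimal random-walk Metropolis sampler for the mean of a Poisson model with a Gamma prior. It is a sketch of one MCMC technique, not the method used in any particular software package; the hyperparameters, counts, step size, and iteration counts are hypothetical choices.

```python
import math
import random

def log_posterior(theta, y, alpha, beta):
    """Unnormalized log posterior: Gamma(alpha, beta) prior times Poisson likelihood."""
    if theta <= 0:
        return -math.inf
    log_prior = (alpha - 1) * math.log(theta) - beta * theta
    log_lik = sum(yi * math.log(theta) - theta for yi in y)
    return log_prior + log_lik

def metropolis(y, alpha, beta, n_iter=20000, burn_in=2000, step=0.8, seed=1):
    """Random-walk Metropolis draws from the posterior of the Poisson mean."""
    rng = random.Random(seed)
    theta = max(sum(y) / len(y), 0.1)  # start at the sample mean
    current = log_posterior(theta, y, alpha, beta)
    draws = []
    for i in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, y, alpha, beta)
        # Accept with probability min(1, posterior ratio); 1 - random() lies in (0, 1]
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        if i >= burn_in:
            draws.append(theta)
    return draws

counts = [3, 5, 4, 6, 2]  # hypothetical crash counts
draws = metropolis(counts, alpha=2.0, beta=1.0)
simulated_mean = sum(draws) / len(draws)
analytic_mean = (2.0 + sum(counts)) / (1.0 + len(counts))
print(simulated_mean, analytic_mean)
```

For this conjugate case the posterior is known exactly, so the simulated mean can be compared with the analytic one; MCMC becomes necessary when no such closed form exists.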

37
Empirical Bayes Model

The empirical Bayes (EB) method is usually
employed to simplify the computational difficulty
associated with the full Bayes method
The name empirical Bayes arises from the fact
that the prior distribution is estimated from
actual data
The data is used for estimating the
hyperparameters through maximum likelihood
estimation (MLE) or the method of moments (MM)
For the EB, the data is actually used twice:
Once to estimate the hyperparameters
Once to estimate the posterior distribution

38
Empirical Bayes Model

For the EB method, different weights are assigned
to the prior estimate and to the standard
estimate
In safety analyses, the weights are estimated
with the assumption that the mean (θ) for each
site follows a Gamma distribution
The EB estimate has been found to outperform
other estimates, such as the MLE
The EB framework is presented on the next overhead

39
Empirical Bayes Model
Expected number of crashes
$\hat{\mu}_{EB} = w\, \hat{\mu} + (1 - w)\, y$
where
$\hat{\mu}_{EB}$ = EB estimate of the expected number of crashes
$\hat{\mu}$ = maximum likelihood estimate
y = observed number of crashes
w = weight factor
40
Empirical Bayes Model
Alternative formulation
$w = \frac{1}{1 + \hat{\mu}/\phi}$
where
$\hat{\mu}$ = mean of the Poisson-gamma regression
φ = dispersion parameter of the NB regression
41
Empirical Bayes Model
Using the same example shown earlier
F1 = 24,164, F2 = 3,392, y = 10
The values are estimated as follows
MLE estimate: 3.9 crashes per year
EB estimate: 7.63 crashes per year
42
Empirical Bayes Model
[Figure: crashes per year plotted by year, showing the observed value (10), the EB estimate (7.63), and the MLE estimate (3.9)]
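The example values can be reproduced with a short calculation. The sketch assumes the commonly used form of the weight, w = 1 / (1 + µ/φ), with φ taken as the aggregation parameter k = 2.4683 from the fitted model, and uses the rounded MLE estimate of 3.9 quoted in the example, so the result agrees with 7.63 only to rounding. The function names are illustrative.

```python
def eb_weight(mu, phi):
    """Weight given to the model prediction, assuming w = 1 / (1 + mu / phi)."""
    return 1.0 / (1.0 + mu / phi)

def eb_estimate(mu, y, phi):
    """EB estimate: weighted average of the model prediction and the observed count."""
    w = eb_weight(mu, phi)
    return w * mu + (1.0 - w) * y

mu_hat = 3.9     # MLE (model) estimate quoted in the example, crashes per year
y_obs = 10       # observed crashes per year
phi = 2.4683     # aggregation parameter k from the fitted model

w = eb_weight(mu_hat, phi)
eb = eb_estimate(mu_hat, y_obs, phi)
print(round(w, 3), round(eb, 2))
```

The EB estimate lies between the model prediction and the observed count, pulled toward the observation because the weight on the model is well below one half.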
