Title: Categorical Data Analysis Part II April 27
1Categorical Data Analysis (Part II) April 27
- Dingcai Cao
- d-cao_at_uchicago.edu
2Probability Distributions
ANOVA or linear regression assumes a normal
distribution of the error: $Y = X\beta + \varepsilon$,
$\varepsilon \sim N(0, \sigma^2)$, where Y is a continuous
outcome variable, X is a matrix of independent variables,
and $\varepsilon$ is the error with a normal distribution
(mean zero and variance $\sigma^2$). For a normal
distribution with mean $\mu$ and variance $\sigma^2$, the
probability density function is
$f(y) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y-\mu)^2}{2\sigma^2}\right)$
3Normal Distribution
Probability density function
Cumulative distribution function
Bell Shape
4Probability Distributions For Categorical Data
Two key distributions:
- Binomial distribution (for logistic regression)
- Poisson distribution (for Poisson regression)
5Binomial Distribution
Example: Assume 5% of the population is
green-eyed. If we pick 500 people randomly, how
likely is it that we get 30 or more green-eyed
people? The number of green-eyed people we pick
is a random variable Y which follows a binomial
distribution with N = 500 and $\pi$ = 0.05 (when
picking the people with replacement). We are
interested in the probability $P(Y \ge 30)$.
Probability Mass Function
Let $\pi$ denote the probability of success for a
given trial. Let Y denote the number of
successes out of the N trials. The probability of
outcome y for Y equals
$P(Y = y) = \binom{N}{y} \pi^{y} (1-\pi)^{N-y}, \quad y = 0, 1, \ldots, N$
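The green-eyed example can be checked numerically. The sketch below assumes the standard binomial probability mass function, C(N, y) π^y (1 − π)^(N − y), and reads "30 or more" as P(Y ≥ 30) = 1 − P(Y ≤ 29). The function names are illustrative and are not from the slides.

```python
from math import comb


def binomial_pmf(y: int, n: int, pi: float) -> float:
    """P(Y = y) for Y ~ Binomial(n, pi)."""
    return comb(n, y) * pi**y * (1.0 - pi) ** (n - y)


def binomial_tail(y_min: int, n: int, pi: float) -> float:
    """P(Y >= y_min), computed as 1 - P(Y <= y_min - 1)."""
    return 1.0 - sum(binomial_pmf(y, n, pi) for y in range(y_min))


# Green-eyed example from the slide: N = 500, pi = 0.05, 30 or more people.
p_30_or_more = binomial_tail(30, 500, 0.05)
print(f"P(Y >= 30) = {p_30_or_more:.4f}")
```

Summing the lower tail and subtracting from 1 keeps the loop short (30 terms instead of 471).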
6Binomial Distribution
The binomial distribution also has a bell shape. In
fact, with large N, the binomial distribution is
close to a normal distribution (the normal
approximation is good when both $N\pi$ and $N(1-\pi)$
are greater than 5).
A special case of the binomial distribution with N = 1
is called the Bernoulli distribution.
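A rough way to see how good the normal approximation is for the green-eyed example (N = 500, success probability 0.05) is to compare the exact binomial tail with the tail of a normal distribution that has the same mean and variance. The continuity correction (evaluating the normal at 29.5 rather than 30) is an added assumption, not something the slides mention.

```python
from math import comb, erf, sqrt


def binomial_cdf(y: int, n: int, pi: float) -> float:
    """P(Y <= y) for Y ~ Binomial(n, pi)."""
    return sum(comb(n, k) * pi**k * (1.0 - pi) ** (n - k) for k in range(y + 1))


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) for X ~ Normal(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


n, pi = 500, 0.05
mu = n * pi                        # binomial mean
sigma = sqrt(n * pi * (1.0 - pi))  # binomial standard deviation

exact = 1.0 - binomial_cdf(29, n, pi)      # P(Y >= 30), exact
approx = 1.0 - normal_cdf(29.5, mu, sigma) # normal tail with continuity correction

print(f"exact  P(Y >= 30) = {exact:.4f}")
print(f"normal approx     = {approx:.4f}")
```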
7Binomial Distribution
Example: N = 500 and $\pi$ = 0.05
8Poisson Distribution
- In the binomial distribution, the number of trials N
is fixed. In some situations, N is random.
- For instance, the number of car accidents on the
Dan Ryan Expressway (I-90/94) in a week.
- We are interested in the number of fatal
accidents on the Dan Ryan Expressway (I-90/94) in a
week.
- The probability of the number of fatal accidents
can be described by a Poisson distribution.
Probability mass function
$P(Y = y) = \frac{e^{-\lambda} \lambda^{y}}{y!}, \quad y = 0, 1, 2, \ldots$
where $\lambda$ is the mean number of fatal accidents in
a week.
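As a small illustration, the Poisson probabilities can be evaluated directly from the standard probability mass function, exp(−mean) · mean^y / y!. The weekly mean of 2 fatal accidents used below is a made-up value for the sketch, not a figure from the slides.

```python
from math import exp, factorial


def poisson_pmf(y: int, lam: float) -> float:
    """P(Y = y) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam**y / factorial(y)


def poisson_cdf(y: int, lam: float) -> float:
    """P(Y <= y) for a Poisson distribution with mean lam."""
    return sum(poisson_pmf(k, lam) for k in range(y + 1))


lam = 2.0  # hypothetical mean number of fatal accidents per week

for y in range(6):
    print(f"P(Y = {y}) = {poisson_pmf(y, lam):.4f}   P(Y <= {y}) = {poisson_cdf(y, lam):.4f}")
```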
9Poisson Distribution
Probability mass function
Cumulative probability function
10Likelihood Function
A probability mass function (for discrete events)
or probability density function (for continuous
measures) allows us to calculate the probability
of the data from the distribution parameters. In
experiments, we often observe the data and
need to estimate the parameters of the
distribution.
Likelihood Function
The probability of the observed data, expressed
as a function of the parameters, is called a
likelihood function ($\ell$).
Example: Suppose in a binomial distribution N = 10
and the observed number of successes is y = 3. Then
the likelihood function of the data is
$\ell(\pi) = \binom{10}{3} \pi^{3} (1-\pi)^{7}$
11Maximum Likelihood Estimation
The maximum likelihood estimate of the parameter
is defined to be the parameter value for which
the probability of the observed data has the
greatest value.
[Figure: Likelihood Function. Likelihood plotted against $\pi$.]
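The definition can be illustrated with the binomial example of N = 10 trials and y = 3 successes: evaluate the likelihood over a grid of candidate success probabilities and keep the one with the largest value. The grid search is only a sketch for illustration; the grid spacing of 0.001 is an arbitrary choice.

```python
from math import comb


def binomial_likelihood(pi: float, n: int, y: int) -> float:
    """Likelihood of success probability pi given y successes in n trials."""
    return comb(n, y) * pi**y * (1.0 - pi) ** (n - y)


n, y = 10, 3

# Candidate values 0.001, 0.002, ..., 0.999 for the success probability.
grid = [k / 1000 for k in range(1, 1000)]

# The maximum likelihood estimate is the candidate with the largest likelihood.
best_pi = max(grid, key=lambda p: binomial_likelihood(p, n, y))

print(f"grid maximum at pi = {best_pi}")
print(f"y / N             = {y / n}")
```

The grid maximum coincides with the sample proportion y/N, which is the closed-form maximum likelihood estimate for the binomial.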
12Binomial Distribution Maximum Likelihood Estimation
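Probability Mass Function
$P(Y = y) = \binom{N}{y} \pi^{y} (1-\pi)^{N-y}$
The probability mass function allows us to calculate
the probability. However, in experiments, we run
N trials and record y successes. We need
to estimate $\pi$, the success probability.
Likelihood Function
$\ell(\pi) = \binom{N}{y} \pi^{y} (1-\pi)^{N-y}$
Log Likelihood Function
$L(\pi) = \log \ell(\pi) = \log\binom{N}{y} + y \log \pi + (N-y) \log(1-\pi)$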
13Binomial Distribution Maximum Likelihood Estimation
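Maximum Likelihood (ML)
Setting the derivative of the binomial log likelihood
with respect to $\pi$ to zero,
$\frac{y}{\pi} - \frac{N-y}{1-\pi} = 0$, gives $\hat{\pi} = y/N$
Logistic regression! It assumes a binomial
distribution and uses the maximum likelihood
estimation method.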
14Poisson Distribution Maximum Likelihood Estimation
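Given a sample of n measured values $y_i$, we wish to
estimate the value of the parameter $\lambda$ of the
Poisson population from which the sample was
drawn.
Likelihood Function
$\ell(\lambda) = \prod_{i=1}^{n} \frac{e^{-\lambda} \lambda^{y_i}}{y_i!}$
Log Likelihood Function
$L(\lambda) = -n\lambda + \left(\sum_{i=1}^{n} y_i\right) \log \lambda - \sum_{i=1}^{n} \log(y_i!)$
Maximum Likelihood (ML)
$-n + \frac{1}{\lambda}\sum_{i=1}^{n} y_i = 0$, which gives $\hat{\lambda} = \frac{1}{n}\sum_{i=1}^{n} y_i = \bar{y}$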
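For the Poisson distribution, the maximum likelihood estimate of the mean is the sample mean. The sketch below checks this numerically on a small made-up sample of weekly counts (hypothetical data, not from the slides) by comparing the log likelihood at the sample mean with its value at nearby candidates.

```python
from math import lgamma, log


def poisson_log_likelihood(lam: float, ys: list[int]) -> float:
    """Log likelihood of mean lam for independent Poisson counts ys."""
    # lgamma(y + 1) equals log(y!), which avoids large factorials.
    return sum(-lam + y * log(lam) - lgamma(y + 1) for y in ys)


ys = [2, 0, 3, 1, 4, 2, 1, 3]  # hypothetical weekly counts

lam_hat = sum(ys) / len(ys)  # sample mean, the closed-form ML estimate

for candidate in (lam_hat - 0.5, lam_hat, lam_hat + 0.5):
    print(f"lam = {candidate:.2f}  log likelihood = {poisson_log_likelihood(candidate, ys):.4f}")
```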
15Normal Distribution Maximum Likelihood Estimation
In-class exercise: Given a sample of n measured
values $y_i$, estimate the values of the parameters $\mu$
and $\sigma^2$ of the normal distribution.
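One way to check an answer to the exercise numerically: for the normal distribution, the standard maximum likelihood estimates are the sample mean for the mean and the average squared deviation (dividing by n, not n − 1) for the variance. The sample below is made up for illustration.

```python
import math
from statistics import fmean


def normal_log_likelihood(mu: float, sigma2: float, ys: list[float]) -> float:
    """Log likelihood of (mu, sigma2) for independent normal observations ys."""
    n = len(ys)
    squared_error = sum((y - mu) ** 2 for y in ys)
    return -0.5 * n * math.log(2.0 * math.pi * sigma2) - squared_error / (2.0 * sigma2)


ys = [4.1, 5.3, 4.8, 6.0, 5.2, 4.6]  # hypothetical measurements

mu_hat = fmean(ys)                                          # ML estimate of the mean
sigma2_hat = sum((y - mu_hat) ** 2 for y in ys) / len(ys)   # ML estimate of the variance

print(f"mu_hat     = {mu_hat:.4f}")
print(f"sigma2_hat = {sigma2_hat:.4f}")
print(f"log likelihood at the estimates = {normal_log_likelihood(mu_hat, sigma2_hat, ys):.4f}")
```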
16Logistic Regression Model
Y: Binary outcome variable (i.e., a categorical
variable with only two levels).
Example: Smoking Status (Yes/No), Application Status
(Admitted/Denied)
Assumption: Y has a binomial distribution
X: Explanatory variable
$\pi(x)$: the probability of success when X takes value
x. $\pi(x)$ is the parameter for the binomial
distribution.
17Logistic Regression Model
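Model
$\text{logit}(\pi(x)) = \log\frac{\pi(x)}{1-\pi(x)} = \alpha + \beta x$, or equivalently $\pi(x) = \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}}$
[Figure: two panels. $\pi(x)$ plotted against x, with the median effect level (EL50) marked where $\pi(x) = 0.5$; and $\text{logit}(\pi(x))$ plotted against x.]
$\beta = 0$ indicates that Y is independent of x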
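A minimal sketch of the model, assuming the usual form logit(π(x)) = α + βx with intercept α and slope β. The parameter values are hypothetical and chosen only for illustration. It shows that the logit and its inverse undo each other, and that a zero slope makes the success probability the same for every x.

```python
from math import exp, log


def logit(p: float) -> float:
    """Log odds of a probability p in (0, 1)."""
    return log(p / (1.0 - p))


def inv_logit(eta: float) -> float:
    """Inverse of the logit: maps a log odds value back to a probability."""
    return 1.0 / (1.0 + exp(-eta))


def success_probability(x: float, alpha: float, beta: float) -> float:
    """pi(x) under the model logit(pi(x)) = alpha + beta * x."""
    return inv_logit(alpha + beta * x)


alpha, beta = -3.0, 0.5  # hypothetical intercept and slope

xs = [0, 2, 4, 6, 8, 10]
probabilities = [success_probability(x, alpha, beta) for x in xs]
for x, p in zip(xs, probabilities):
    print(f"x = {x:2d}  pi(x) = {p:.4f}  logit(pi(x)) = {logit(p):+.2f}")

# With a zero slope, pi(x) does not depend on x.
print(success_probability(1, alpha, 0.0) == success_probability(9, alpha, 0.0))
```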
18Logistic Regression Model
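Odds
$\frac{\pi(x)}{1-\pi(x)} = e^{\alpha + \beta x}$
Odds ratio
$\frac{\pi(x+1)/(1-\pi(x+1))}{\pi(x)/(1-\pi(x))} = e^{\beta}$
[Figure: odds $\pi(x)/(1-\pi(x))$ plotted against x, with the level 1 marked.]
$\beta$ is the change in the log odds for a one-unit
change in x, so $e^{\beta}$ is the odds ratio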
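The relationship between the slope and the odds ratio can be checked numerically. The sketch assumes the model logit(π(x)) = α + βx with hypothetical values for the intercept α and slope β: the ratio of the odds at x + 1 to the odds at x equals exp(β) regardless of x.

```python
from math import exp


def odds(x: float, alpha: float, beta: float) -> float:
    """Odds pi(x) / (1 - pi(x)) under logit(pi(x)) = alpha + beta * x."""
    p = 1.0 / (1.0 + exp(-(alpha + beta * x)))
    return p / (1.0 - p)


alpha, beta = -3.0, 0.5  # hypothetical intercept and slope

# Odds ratio for a one-unit increase in x, evaluated at two different x values.
odds_ratio_at_2 = odds(3.0, alpha, beta) / odds(2.0, alpha, beta)
odds_ratio_at_7 = odds(8.0, alpha, beta) / odds(7.0, alpha, beta)

print(f"odds ratio at x = 2: {odds_ratio_at_2:.6f}")
print(f"odds ratio at x = 7: {odds_ratio_at_7:.6f}")
print(f"exp(beta)          : {exp(beta):.6f}")
```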
19Logistic Regression Model
$\alpha$ is the log odds value when x = 0. $\beta$ is the
change in log odds (the log odds ratio) with one unit
of change in x. $x = -\alpha/\beta$ is the median effective
level (probability = 0.5).
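To tie the interpretation to maximum likelihood estimation, here is a sketch that fits the intercept and slope of logit(π(x)) = α + βx to a small made-up binary data set and reports the estimated median effective level. Newton-Raphson is used as the optimizer; the slides do not say which algorithm is used, so this is an assumption, and the data and the fixed iteration count are hypothetical.

```python
from math import exp


def inv_logit(eta: float) -> float:
    """Map a log odds value to a probability."""
    return 1.0 / (1.0 + exp(-eta))


def fit_logistic(xs, ys, iterations=25):
    """Maximize the Bernoulli log likelihood in (alpha, beta) by Newton-Raphson."""
    alpha, beta = 0.0, 0.0
    for _ in range(iterations):
        ps = [inv_logit(alpha + beta * x) for x in xs]
        # Score: gradient of the log likelihood with respect to (alpha, beta).
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum((y - p) * x for x, y, p in zip(xs, ys, ps))
        # Information matrix (negative Hessian), symmetric 2x2.
        ws = [p * (1.0 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: solve information * step = score for the 2x2 system.
        alpha += (h11 * g0 - h01 * g1) / det
        beta += (h00 * g1 - h01 * g0) / det
    return alpha, beta


# Hypothetical data: x is a dose-like variable, y is a binary outcome.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

alpha_hat, beta_hat = fit_logistic(xs, ys)
median_effective_level = -alpha_hat / beta_hat

print(f"alpha_hat = {alpha_hat:.4f} (log odds at x = 0)")
print(f"beta_hat  = {beta_hat:.4f} (change in log odds per unit of x)")
print(f"median effective level = {median_effective_level:.4f}")
```

The data are not perfectly separated (the outcomes at x = 4 and x = 5 overlap), which is what keeps the maximum likelihood estimates finite.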