Markov Chains: Transitional Modeling - PowerPoint PPT Presentation

1
Markov Chains: Transitional Modeling
  • Qi Liu

2
Contents
  • Terminology
  • Transitional Models without Explanatory Variables
  • Inference for Markov chains
  • Data Analysis Example 1 (ignoring explanatory
    variables)
  • Transitional Models with Explanatory Variables
  • Data Analysis Example 2 (with explanatory
    variables)

3
Terminology
  • Transitional models
  • Markov chain
  • kth-order Markov chain
  • Transitional probabilities and transition matrix

4
Transitional models
  • Let y0, y1, ..., y(t-1) denote the responses
    observed previously. Our focus is on the
    dependence of Yt on y0, y1, ..., y(t-1) as well
    as any explanatory variables. Models of this
    type are called transitional models.

5
Markov chain
  • A stochastic process is a Markov chain if, for
    all t, the conditional distribution of Y(t+1)
    given Y0, Y1, ..., Yt is identical to the
    conditional distribution of Y(t+1) given Yt
    alone; i.e., given Yt, Y(t+1) is conditionally
    independent of Y0, Y1, ..., Y(t-1). So, knowing
    the present state of a Markov chain, information
    about the past states does not help us predict
    the future.
  • P(Y(t+1) | Y0, Y1, ..., Yt) = P(Y(t+1) | Yt)
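Because of the Markov property, the behavior of the chain is determined by the one-step transition probabilities and the initial state. A minimal Python sketch of this (the two-state transition matrix below is illustrative, not from any data in these slides):

```python
# Two-state Markov chain with illustrative transition probabilities:
# P[i][j] = P(Y_{t+1} = j | Y_t = i).
P = [[0.9, 0.1],
     [0.2, 0.8]]

def step(dist, P):
    """Propagate a marginal distribution one step through the chain."""
    return [sum(dist[i] * P[i][j] for i in range(len(P)))
            for j in range(len(P[0]))]

# Start surely in state 0; the distribution of Y_t needs only P and Y_0.
dist = [1.0, 0.0]
for _ in range(2):
    dist = step(dist, P)
print(dist)  # marginal distribution of Y_2
```

Conditioning on any longer history (Y0, ..., Yt) gives the same one-step predictions as conditioning on Yt alone, which is exactly the displayed identity.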

6
K th-order Markov chain
  • For all t, the conditional distribution of
    Y(t+1) given Y0, Y1, ..., Yt is identical to the
    conditional distribution of Y(t+1) given
    (Yt, ..., Y(t-k+1))
  • P(Y(t+1) | Y0, Y1, ..., Yt) =
    P(Y(t+1) | Y(t-k+1), Y(t-k+2), ..., Yt)
  • i.e., given the states at the previous k times,
    the future behavior of the chain is independent
    of past behavior before those k times. What we
    discuss here is the first-order Markov chain,
    with k = 1.

7
Transitional probabilities
8
Transitional Models without Explanatory Variables
  • At first, we ignore explanatory variables. Let
    f(y0, ..., yT) denote the joint probability mass
    function of (Y0, ..., YT). Transitional models
    use the factorization
  • f(y0, ..., yT) = f(y0) f(y1 | y0) f(y2 | y0, y1)
    ... f(yT | y0, y1, ..., y(T-1))
  • This model is conditional on the previous
    responses.
  • For Markov chains,
  • f(y0, ..., yT) = f(y0) f(y1 | y0) f(y2 | y1) ...
    f(yT | y(T-1))    (*)
  • From (*), a Markov chain depends only on the
    one-step transition probabilities and the
    marginal distribution for the initial state. It
    also follows that the joint distribution
    satisfies the loglinear model (Y0Y1, Y1Y2, ...,
    Y(T-1)YT).
  • For a sample of realizations of a stochastic
    process, a contingency table displays counts of
    the possible sequences. A test of fit of this
    loglinear model checks whether the process
    plausibly satisfies the Markov property.
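The factorization (*) can be evaluated directly: the probability of a whole sequence is the initial-state probability times a product of one-step transition probabilities. A sketch with illustrative numbers (pi0 and P are assumptions for illustration, not estimates from these slides):

```python
# Illustrative initial distribution and transition matrix.
pi0 = [0.5, 0.5]               # P(Y0 = 0), P(Y0 = 1)
P = [[0.9, 0.1],
     [0.2, 0.8]]               # P[i][j] = P(Y_{t+1} = j | Y_t = i)

def joint_prob(seq, pi0, P):
    """f(y0,...,yT) = f(y0) * product over t of f(y_t | y_{t-1})."""
    prob = pi0[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        prob *= P[prev][cur]
    return prob

p = joint_prob([0, 0, 1, 0], pi0, P)
print(p)  # 0.5 * 0.9 * 0.1 * 0.2
```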

9
Inference for Markov chains
10
Inference for Markov chains (continued)
11
Example 1 (ignoring explanatory variables): A
study at Harvard of the effects of air pollution
on respiratory illness in children. The children
were examined annually at ages 9 through 12 and
classified according to the presence or absence
of wheeze. Let Yt denote the binary response at
age t, t = 9, 10, 11, 12.
  • 1 = wheeze, 2 = no wheeze

y9 y10 y11 y12 count y9 y10 y11 y12 count
1 1 1 1 94 2 1 1 1 19
1 1 1 2 30 2 1 1 2 15
1 1 2 1 15 2 1 2 1 10
1 1 2 2 28 2 1 2 2 44
1 2 1 1 14 2 2 1 1 17
1 2 1 2 12 2 2 1 2 42
1 2 2 1 12 2 2 2 1 35
1 2 2 2 63 2 2 2 2 572
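For orientation, the marginal prevalence of wheeze at each age can be recovered from these 16 counts. A pure-Python sketch using the counts exactly as printed in the table above:

```python
# Wheeze counts, key = (y9, y10, y11, y12); 1 = wheeze, 2 = no wheeze.
counts = {
    (1,1,1,1): 94, (1,1,1,2): 30, (1,1,2,1): 15, (1,1,2,2): 28,
    (1,2,1,1): 14, (1,2,1,2): 12, (1,2,2,1): 12, (1,2,2,2): 63,
    (2,1,1,1): 19, (2,1,1,2): 15, (2,1,2,1): 10, (2,1,2,2): 44,
    (2,2,1,1): 17, (2,2,1,2): 42, (2,2,2,1): 35, (2,2,2,2): 572,
}
n = sum(counts.values())

# Proportion classified as wheeze (code 1) at ages 9, 10, 11, 12.
prevalence = {
    age + 9: sum(c for seq, c in counts.items() if seq[age] == 1) / n
    for age in range(4)
}
print(n, prevalence)
```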
12
Code of Example 1
  • Code of 11.7
  • data breath;
  • input y9 y10 y11 y12 count;
  • datalines;
  • 1 1 1 1 94
  • 1 1 1 2 30
  • 1 1 2 1 15
  • 1 1 2 2 28
  • 1 2 1 1 14
  • 1 2 1 2 9
  • 1 2 2 1 12
  • 1 2 2 2 63
  • 2 1 1 1 19
  • 2 1 1 2 15
  • 2 1 2 1 10
  • 2 1 2 2 44
  • 2 2 1 1 17
  • 2 2 1 2 42
  • 2 2 2 1 35
  • 2 2 2 2 572
  • ;
  • run;

13
Data analysis
  • The loglinear model (Y9Y10, Y10Y11, Y11Y12)
    represents a first-order Markov chain:
    P(Y11 | Y9, Y10) = P(Y11 | Y10)
  • P(Y12 | Y10, Y11) = P(Y12 | Y11)
  • G² = 122.9025, df = 8, with p-value < 0.0001, so
    the model fits poorly. Thus, given the state at
    time t, the classification at time t+1 depends
    on states at times previous to time t.
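For this loglinear model the ML fitted counts have a closed form, mu(y9,y10,y11,y12) = n(y9,y10) n(y10,y11) n(y11,y12) / (n(y10) n(y11)), so the deviance can be checked without fitting software. A pure-Python sketch (counts taken from the SAS listing above; a reimplementation for illustration, not the original analysis):

```python
import math

# Observed counts, key = (y9, y10, y11, y12), from the SAS data step.
obs = {
    (1,1,1,1): 94, (1,1,1,2): 30, (1,1,2,1): 15, (1,1,2,2): 28,
    (1,2,1,1): 14, (1,2,1,2):  9, (1,2,2,1): 12, (1,2,2,2): 63,
    (2,1,1,1): 19, (2,1,1,2): 15, (2,1,2,1): 10, (2,1,2,2): 44,
    (2,2,1,1): 17, (2,2,1,2): 42, (2,2,2,1): 35, (2,2,2,2): 572,
}

def margin(positions, values):
    """Marginal count over the given (position, value) pairs."""
    return sum(c for k, c in obs.items()
               if all(k[p] == v for p, v in zip(positions, values)))

fitted, g2 = [], 0.0
for (a, b, c, d), o in obs.items():
    # Closed-form fitted value under the first-order Markov model.
    mu = (margin((0, 1), (a, b)) * margin((1, 2), (b, c))
          * margin((2, 3), (c, d))) / (margin((1,), (b,)) * margin((2,), (c,)))
    fitted.append(mu)
    g2 += 2 * o * math.log(o / mu)

print(round(g2, 4))  # compare with the G2 reported on this slide
```

The fitted counts sum to the sample size, as they must for an ML fit of this decomposable loglinear model.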

14
Data analysis (cont)
  • Then we consider the model (Y9Y10Y11,
    Y10Y11Y12), a second-order Markov chain, which
    asserts conditional independence of the states
    at ages 9 and 12, given the states at ages 10
    and 11.
  • This model also fits poorly, with G² = 23.8632,
    df = 4, and p-value < 0.001.

15
Data analysis (cont)
  • The loglinear model (Y9Y10, Y9Y11, Y9Y12,
    Y10Y11, Y10Y12, Y11Y12) that permits association
    for each pair of ages fits well, with
    G² = 1.4585, df = 5, and p-value = 0.9178.

    Parameter  Estimate  Std Error  Wald Limits      Chi-Square  Pr > ChiSq
    y9y10      1.8064    0.1943     1.4263  2.1888   86.42       <.0001
    y9y11      0.9478    0.2123     0.5282  1.3612   19.94       <.0001
    y9y12      1.0531    0.2133     0.6323  1.4696   24.37       <.0001
    y10y11     1.6458    0.2093     1.2356  2.0569   61.85       <.0001
    y10y12     1.0742    0.2205     0.6393  1.5045   23.74       <.0001
    y11y12     1.8497    0.2071     1.4449  2.2574   79.81       <.0001
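The estimates above are conditional log odds ratios; exponentiating converts them to conditional odds ratios:

```python
import math

# Estimated conditional log odds ratios from the table above.
log_or = {"y9y10": 1.8064, "y9y11": 0.9478, "y9y12": 1.0531,
          "y10y11": 1.6458, "y10y12": 1.0742, "y11y12": 1.8497}

odds_ratio = {pair: math.exp(est) for pair, est in log_or.items()}
for pair, value in odds_ratio.items():
    print(f"{pair}: {value:.2f}")
```

The adjacent-age pairs (y9y10, y10y11, y11y12) have noticeably larger odds ratios than the pairs more than one year apart.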

16
Data analysis (cont)
  • From the above, the association appears similar
    for pairs of ages 1 year apart, and somewhat
    weaker for pairs of ages more than 1 year apart.
    So we consider the simpler model in which the
    association parameters are equal for pairs of
    ages 1 year apart and equal for pairs more than
    1 year apart.
  • It also fits well, with G² = 2.3, df = 9, and
    p-value = 0.9858.

17
Estimated Conditional Log Odds Ratios
18
Transitional Models with Explanatory Variables
19
(No Transcript)
20
Data Analysis
  • Example 2 (with explanatory variables)
  • At ages 7 to 10, children were evaluated annually
    on the presence of respiratory illness. A
    predictor is maternal smoking at the start of the
    study, where s = 1 for smoking regularly and
    s = 0 otherwise.

21
Child's Respiratory Illness by Age and Maternal
Smoking
22
Data analysis (cont)
23
Code of Example 2
  • data illness;
  • input t tp ytp yt s count;
  • datalines;
  • 8 7 0 0 0 266
  • 8 7 0 0 1 134
  • 8 7 0 1 0 28
  • 8 7 0 1 1 22
  • 8 7 1 0 0 32
  • 8 7 1 0 1 14
  • 8 7 1 1 0 24
  • 8 7 1 1 1 17
  • 9 8 0 0 0 274
  • 9 8 0 0 1 134
  • 9 8 0 1 0 24
  • 9 8 0 1 1 14
  • 9 8 1 0 0 26
  • 9 8 1 0 1 18
  • 9 8 1 1 0 26
  • 9 8 1 1 1 21
  • 10 9 0 0 0 283
  • 10 9 0 0 1 140
  • 10 9 0 1 0 17
  • 10 9 0 1 1 12
  • 10 9 1 0 0 30
  • 10 9 1 0 1 21
  • 10 9 1 1 0 20
  • 10 9 1 1 1 14
  • ;
  • run;
  • proc logistic descending;
  • freq count;
  • model yt = t ytp s / scale=none aggregate;
  • run;

24
Output from SAS
  • Deviance and Pearson Goodness-of-Fit Statistics

    Criterion  DF  Value   Value/DF  Pr > ChiSq
    Deviance   8   3.1186  0.3898    0.9267
    Pearson    8   3.1275  0.3909    0.9261

  • Analysis of Maximum Likelihood Estimates

    Parameter  DF  Estimate  Standard Error  Wald Chi-Square  Pr > ChiSq
    Intercept  1   -0.2926   0.8460          0.1196           0.7295
    t          1   -0.2428   0.0947          6.5800           0.0103
    ytp        1    2.2111   0.1582          195.3589         <.0001
    s          1    0.2960   0.1563          3.5837           0.0583

25
Analysis
26
  • The model fits well, with G² = 3.1186, df = 8,
    p-value = 0.9267.
  • The coefficient of ytp is 2.2111 with SE 0.1582,
    chi-square statistic 195.3589, and p-value
    < .0001, which shows that the previous
    observation has a strong positive effect: a
    child who had illness at age t-1 is more likely
    to have illness at age t than a child who did
    not have illness at age t-1.
  • The coefficient of s is 0.2960; the chi-square
    test of H0: beta_s = 0 gives 3.5837, df = 1,
    with p-value 0.0583. There is slight evidence of
    a positive effect of maternal smoking.
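Using the estimates above, the fitted model is logit P(Yt = 1) = -0.2926 - 0.2428 t + 2.2111 y(t-1) + 0.2960 s, so fitted probabilities follow directly. A small sketch (the specific covariate values chosen are illustrative):

```python
import math

# Coefficients from the SAS output.
b0, b_t, b_ytp, b_s = -0.2926, -0.2428, 2.2111, 0.2960

def p_illness(t, ytp, s):
    """Fitted P(Y_t = 1) given age t, previous response ytp, smoking s."""
    eta = b0 + b_t * t + b_ytp * ytp + b_s * s
    return 1.0 / (1.0 + math.exp(-eta))

# Age 9, ill at age 8: smoking vs non-smoking mother.
print(p_illness(9, 1, 1))
print(p_illness(9, 1, 0))
```

The previous response ytp moves the fitted probability far more than maternal smoking s, matching the relative sizes of the two coefficients.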

27
Interpretation of Parameters β
28
  • Thank you !