Title: Markov Chains: Transitional Modeling
Contents
- Terminology
- Transitional Models without Explanatory Variables
- Inference for Markov Chains
- Data Analysis Example 1 (ignoring explanatory variables)
- Transitional Models with Explanatory Variables
- Data Analysis Example 2 (with explanatory variables)
Terminology
- Transitional models
- Markov chain
- kth-order Markov chain
- Transition probabilities and transition matrix
Transitional models
- Let $y_0, y_1, \ldots, y_{t-1}$ be the responses observed previously. Our focus is on the dependence of $Y_t$ on $y_0, y_1, \ldots, y_{t-1}$ as well as on any explanatory variables. Models of this type are called transitional models.
Markov chain
- A stochastic process is a Markov chain if, for all $t$, the conditional distribution of $Y_{t+1}$ given $Y_0, Y_1, \ldots, Y_t$ is identical to the conditional distribution of $Y_{t+1}$ given $Y_t$ alone. That is, given $Y_t$, $Y_{t+1}$ is conditionally independent of $Y_0, Y_1, \ldots, Y_{t-1}$. So, once the present state of a Markov chain is known, information about past states does not help predict the future.
- $P(Y_{t+1} \mid Y_0, Y_1, \ldots, Y_t) = P(Y_{t+1} \mid Y_t)$
kth-order Markov chain
- For all $t$, the conditional distribution of $Y_{t+1}$ given $Y_0, Y_1, \ldots, Y_t$ is identical to the conditional distribution of $Y_{t+1}$ given $(Y_t, \ldots, Y_{t-k+1})$.
- $P(Y_{t+1} \mid Y_0, Y_1, \ldots, Y_t) = P(Y_{t+1} \mid Y_{t-k+1}, Y_{t-k+2}, \ldots, Y_t)$
- That is, given the states at the previous $k$ times, the future behavior of the chain is independent of past behavior before those $k$ times. Here we discuss the first-order Markov chain, with $k = 1$.
Transition probabilities
Transitional Models without Explanatory Variables
- At first, we ignore explanatory variables. Let $f(y_0, \ldots, y_T)$ denote the joint probability mass function of $(Y_0, \ldots, Y_T)$. Transitional models use the factorization
- $f(y_0, \ldots, y_T) = f(y_0)\, f(y_1 \mid y_0)\, f(y_2 \mid y_0, y_1) \cdots f(y_T \mid y_0, y_1, \ldots, y_{T-1})$
- This model is conditional on the previous responses.
- For Markov chains,
- $f(y_0, \ldots, y_T) = f(y_0)\, f(y_1 \mid y_0)\, f(y_2 \mid y_1) \cdots f(y_T \mid y_{T-1})$  (*)
- From (*), a Markov chain depends only on the one-step transition probabilities and the marginal distribution of the initial state. It also follows that the joint distribution satisfies the loglinear model $(Y_0Y_1, Y_1Y_2, \ldots, Y_{T-1}Y_T)$.
- For a sample of realizations of a stochastic 
 process, a contingency table displays counts of
 the possible sequences. A test of fit of this
 loglinear model checks whether the process
 plausibly satisfies the Markov property.
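To see why the Markov factorization leads to this loglinear model, one can take logs of the factorization for Markov chains. The derivation below is a sketch and is not taken from the slides: the symbols $\mu$ (expected cell count), $n$ (sample size), and the $\lambda$ terms (loglinear parameters) are notation assumed here.

```latex
\begin{align*}
\log \mu_{y_0 \cdots y_T}
  &= \log n + \log f(y_0) + \sum_{t=1}^{T} \log f(y_t \mid y_{t-1}) \\
  &= \lambda + \sum_{t=0}^{T} \lambda^{Y_t}_{y_t}
     + \sum_{t=1}^{T} \lambda^{Y_{t-1} Y_t}_{y_{t-1} y_t}
\end{align*}
```

Each term on the first line involves at most one adjacent pair $(y_{t-1}, y_t)$, so the loglinear form needs no association terms between non-adjacent times and no three-factor terms.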
Inference for Markov chains
Inference for Markov chains (continued)
Example 1 (ignoring explanatory variables)
The data come from a study at Harvard of the effects of air pollution on respiratory illness in children. The children were examined annually at ages 9 through 12 and classified according to the presence or absence of wheeze. Let $Y_t$ denote the binary response at age $t$, $t = 9, 10, 11, 12$.
y9  y10  y11  y12  count      y9  y10  y11  y12  count
1   1    1    1     94        2   1    1    1     19
1   1    1    2     30        2   1    1    2     15
1   1    2    1     15        2   1    2    1     10
1   1    2    2     28        2   1    2    2     44
1   2    1    1     14        2   2    1    1     17
1   2    1    2      9        2   2    1    2     42
1   2    2    1     12        2   2    2    1     35
1   2    2    2     63        2   2    2    2    572
Code of Example 1
- Code for 11.7
    /* Wheeze data: one row per response sequence (y9, y10, y11, y12) with its count */
    data breath;
      input y9 y10 y11 y12 count;
      datalines;
    1 1 1 1 94
    1 1 1 2 30
    1 1 2 1 15
    1 1 2 2 28
    1 2 1 1 14
    1 2 1 2 9
    1 2 2 1 12
    1 2 2 2 63
    2 1 1 1 19
    2 1 1 2 15
    2 1 2 1 10
    2 1 2 2 44
    2 2 1 1 17
    2 2 1 2 42
    2 2 2 1 35
    2 2 2 2 572
    ;
    run;
Data analysis
- The loglinear model $(Y_9Y_{10}, Y_{10}Y_{11}, Y_{11}Y_{12})$ represents a first-order Markov chain:
- $P(Y_{11} \mid Y_9, Y_{10}) = P(Y_{11} \mid Y_{10})$
- $P(Y_{12} \mid Y_{10}, Y_{11}) = P(Y_{12} \mid Y_{11})$
- With $G^2 = 122.9025$, $df = 8$, and p-value < 0.0001, it fits poorly. So, given the state at time $t$, the classification at time $t+1$ still depends on the states at times before $t$.
Data analysis (cont.)
- Next we consider the model $(Y_9Y_{10}Y_{11}, Y_{10}Y_{11}Y_{12})$, a second-order Markov chain, under which the responses at ages 9 and 12 are conditionally independent given the states at ages 10 and 11.
- This model also fits poorly, with $G^2 = 23.8632$, $df = 4$, and p-value < 0.001.
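The degrees of freedom quoted for the two Markov models can be checked by counting parameters. A brief sketch, assuming four binary responses (so $2^4 = 16$ cells) and one nonredundant parameter per loglinear term:

```latex
\begin{align*}
\text{first order } (Y_9Y_{10},\, Y_{10}Y_{11},\, Y_{11}Y_{12}):\quad
  & df = 16 - (1 + 4 + 3) = 8 \\
\text{second order } (Y_9Y_{10}Y_{11},\, Y_{10}Y_{11}Y_{12}):\quad
  & df = 16 - (1 + 4 + 5 + 2) = 4
\end{align*}
```

Here 1 counts the intercept, 4 the main effects, 3 or 5 the two-factor terms, and 2 the three-factor terms.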
Data analysis (cont.)
- The loglinear model $(Y_9Y_{10}, Y_9Y_{11}, Y_9Y_{12}, Y_{10}Y_{11}, Y_{10}Y_{12}, Y_{11}Y_{12})$, which permits association between each pair of ages, fits well, with $G^2 = 1.4585$, $df = 5$, and p-value = 0.9178086.

Parameter  Estimate  Std. Error  Lower Limit  Upper Limit  Chi-Square  Pr > ChiSq
y9y10      1.8064    0.1943      1.4263       2.1888       86.42       <.0001
y9y11      0.9478    0.2123      0.5282       1.3612       19.94       <.0001
y9y12      1.0531    0.2133      0.6323       1.4696       24.37       <.0001
y10y11     1.6458    0.2093      1.2356       2.0569       61.85       <.0001
y10y12     1.0742    0.2205      0.6393       1.5045       23.74       <.0001
y11y12     1.8497    0.2071      1.4449       2.2574       79.81       <.0001
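For readers who prefer odds ratios, the estimates above can be exponentiated. This is a worked calculation rather than output from the original analysis, and it assumes each estimate is a conditional log odds ratio between the two responses named in the parameter (which is the case for binary variables under 0/1 reference coding):

```latex
\begin{align*}
\exp(1.8064) &\approx 6.09 & \exp(0.9478) &\approx 2.58 & \exp(1.0531) &\approx 2.87 \\
\exp(1.6458) &\approx 5.19 & \exp(1.0742) &\approx 2.93 & \exp(1.8497) &\approx 6.36
\end{align*}
```

Under that assumption, the estimated conditional odds ratios are roughly 5 to 6 for ages 1 year apart and just under 3 for ages further apart.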
Data analysis (cont.)
- From the estimates above, the association seems similar for pairs of ages 1 year apart and somewhat weaker for pairs of ages more than 1 year apart. So we consider the simpler model in which the three association parameters for ages 1 year apart are equal and the three for ages more than 1 year apart are equal.
- It also fits well, with $G^2 = 2.3$, $df = 9$, and p-value = 0.9857876.
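The four loglinear models discussed in this example could be fitted along the following lines. This is a sketch, not the original analysis code: it assumes PROC GENMOD with a Poisson log link, uses the count 9 for the sequence (1, 2, 1, 2) as in the code listing, and introduces hypothetical names (`breath2`, `a9` to `a12`, `lag1`, `lag2`) for the constrained model.

```sas
/* Wheeze data: four binary responses and the 16 cell counts. */
data breath;
  input y9 y10 y11 y12 count @@;   /* @@ reads several cells per line */
  datalines;
1 1 1 1 94   1 1 1 2 30   1 1 2 1 15   1 1 2 2 28
1 2 1 1 14   1 2 1 2  9   1 2 2 1 12   1 2 2 2 63
2 1 1 1 19   2 1 1 2 15   2 1 2 1 10   2 1 2 2 44
2 2 1 1 17   2 2 1 2 42   2 2 2 1 35   2 2 2 2 572
;
run;

/* First-order Markov model (Y9Y10, Y10Y11, Y11Y12). */
proc genmod data=breath;
  class y9 y10 y11 y12;
  model count = y9 y10 y11 y12 y9*y10 y10*y11 y11*y12
        / dist=poisson link=log;
run;

/* Second-order Markov model (Y9Y10Y11, Y10Y11Y12). */
proc genmod data=breath;
  class y9 y10 y11 y12;
  model count = y9 y10 y11 y12
                y9*y10 y9*y11 y10*y11 y10*y12 y11*y12
                y9*y10*y11 y10*y11*y12
        / dist=poisson link=log;
run;

/* All pairwise associations: main effects plus every two-factor term. */
proc genmod data=breath;
  class y9 y10 y11 y12;
  model count = y9|y10|y11|y12 @2 / dist=poisson link=log;
run;

/* Constrained model: one common association for ages 1 year apart,
   another for ages more than 1 year apart (hypothetical variable names). */
data breath2;
  set breath;
  a9 = (y9 = 1); a10 = (y10 = 1); a11 = (y11 = 1); a12 = (y12 = 1);
  lag1 = a9*a10 + a10*a11 + a11*a12;   /* pairs 1 year apart */
  lag2 = a9*a11 + a9*a12 + a10*a12;    /* pairs more than 1 year apart */
run;

proc genmod data=breath2;
  model count = a9 a10 a11 a12 lag1 lag2 / dist=poisson link=log;
run;
```

In each fit, the deviance in the goodness-of-fit table is the $G^2$ statistic to compare with the values quoted in the slides.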
Estimated Conditional Log Odds Ratios
Transitional Models with Explanatory Variables
Data Analysis
- Example 2 (with explanatory variables)
- At ages 7 to 10, children were evaluated annually for the presence of respiratory illness. A predictor is maternal smoking at the start of the study, where $s = 1$ for smoking regularly and $s = 0$ otherwise.
Child's Respiratory Illness by Age and Maternal Smoking
Data analysis (cont.)
Code of Example 2
    /* One row per combination of age t, previous age tp, previous response ytp,
       current response yt, and maternal smoking s, with its count */
    data illness;
      input t tp ytp yt s count;
      datalines;
    8 7 0 0 0 266
    8 7 0 0 1 134
    8 7 0 1 0 28
    8 7 0 1 1 22
    8 7 1 0 0 32
    8 7 1 0 1 14
    8 7 1 1 0 24
    8 7 1 1 1 17
    9 8 0 0 0 274
    9 8 0 0 1 134
    9 8 0 1 0 24
    9 8 0 1 1 14
    9 8 1 0 0 26
    9 8 1 0 1 18
    9 8 1 1 0 26
    9 8 1 1 1 21
    10 9 0 0 0 283
    10 9 0 0 1 140
    10 9 0 1 0 17
    10 9 0 1 1 12
    10 9 1 0 0 30
    10 9 1 0 1 21
    10 9 1 1 0 20
    10 9 1 1 1 14
    ;
    run;

    /* Logistic model for yt given age, previous response, and smoking;
       "descending" makes yt = 1 the modeled event */
    proc logistic data=illness descending;
      freq count;
      model yt = t ytp s / scale=none aggregate;
    run;
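The MODEL statement above corresponds to a first-order transitional logistic regression. Written out, with the symbols $\alpha$, $\beta_t$, $\beta_y$, and $\beta_s$ introduced here only for illustration (the slides' own notation is not shown in this transcript):

```latex
\operatorname{logit} P(Y_t = 1 \mid y_{t-1}, s)
  = \alpha + \beta_t\, t + \beta_y\, y_{t-1} + \beta_s\, s,
  \qquad t = 8, 9, 10
```

In the code, `ytp` plays the role of $y_{t-1}$, and the rows for $t = 8, 9, 10$ are pooled into a single fit.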
Output from SAS

Deviance and Pearson Goodness-of-Fit Statistics

Criterion  DF  Value   Value/DF  Pr > ChiSq
Deviance   8   3.1186  0.3898    0.9267
Pearson    8   3.1275  0.3909    0.9261

Analysis of Maximum Likelihood Estimates

Parameter  DF  Estimate  Standard Error  Wald Chi-Square  Pr > ChiSq
Intercept  1   -0.2926   0.8460          0.1196           0.7295
t          1   -0.2428   0.0947          6.5800           0.0103
ytp        1    2.2111   0.1582          195.3589         <.0001
s          1    0.2960   0.1563          3.5837           0.0583
Analysis
- The model fits well, with $G^2 = 3.1186$, $df = 8$, and p-value = 0.9267.
- The coefficient of $y_{t-1}$ (ytp in the code) is 2.2111, with SE 0.1582, Wald chi-square statistic 195.3589, and p-value < 0.0001, which shows that the previous observation has a strong positive effect. A child who had the illness at age $t-1$ is therefore more likely to have it at age $t$ than a child who did not.
- The coefficient of $s$ is 0.2960. The Wald chi-square statistic for the null hypothesis that this coefficient is 0 is 3.5837, with $df = 1$ and p-value 0.0583. There is slight evidence of a positive effect of maternal smoking.
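Exponentiating the estimates turns them into odds ratios, which may be easier to read. A short worked calculation from the estimates in the SAS output, rounded to two decimals:

```latex
\begin{align*}
\exp(2.2111) &\approx 9.13 && \text{(illness at age } t-1 \text{ versus none)} \\
\exp(0.2960) &\approx 1.34 && \text{(maternal smoking versus not)} \\
\exp(-0.2428) &\approx 0.78 && \text{(per additional year of age)}
\end{align*}
```

So, holding the other predictors fixed, the estimated odds of illness at age $t$ are about nine times higher for a child who was ill the previous year, and about 34% higher when the mother smoked regularly.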
Interpretation of Parameters β