

1
The Relative Entropy Rate of Two Hidden Markov
Processes
Or Zuk, Dept. of Physics of Complex Systems, Weizmann
Institute of Science, Rehovot, Israel
2
Overview
  • Introduction
  • Distance Measures and the Relative Entropy Rate
  • Results: Generalization from the Entropy Rate
  • Future Directions

3
Introduction
  • Hidden Markov Processes are relevant in many fields:
  • Error Correction (Markovian source + noise)
  • Signal Processing, Speech Recognition
  • Experimental Physics - telegraph noise, TLS noise,
    quantum jumps
  • Bioinformatics - biological sequences, gene
    expression

[Figure: a Markov chain passed through a noisy transmission channel yields an HMP (binary 1/0 signal shown vs. time t)]
4
HMP - Definitions
Models are denoted by λ and µ.
  • Markov Process:
  • X - Markov process
  • Mλ - transition matrix
  • mλ(i,j) = Pr(X_{n+1} = j | X_n = i)
  • Hidden Markov Process:
  • Y - noisy observation of X
  • Rλ - noise/emission matrix
  • rλ(i,j) = Pr(Y_n = j | X_n = i)
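As a concrete reading of these definitions, here is a minimal Python sketch of sampling an HMP path. The function name and the choice of starting from the stationary distribution are illustrative assumptions, not part of the slides.

```python
# Minimal HMP sampler, assuming row-stochastic M (m(i,j) = Pr(X_{n+1}=j | X_n=i))
# and emission matrix R (r(i,j) = Pr(Y_n=j | X_n=i)); names are illustrative.
import numpy as np

def sample_hmp(M, R, N, rng=None):
    """Return hidden states X and noisy observations Y of length N."""
    rng = rng or np.random.default_rng()
    # Start X from the stationary distribution of M (left Perron eigenvector).
    evals, evecs = np.linalg.eig(M.T)
    pi = np.real(evecs[:, np.argmax(np.real(evals))])
    pi = pi / pi.sum()
    X = np.empty(N, dtype=int)
    Y = np.empty(N, dtype=int)
    X[0] = rng.choice(M.shape[0], p=pi)
    Y[0] = rng.choice(R.shape[1], p=R[X[0]])
    for n in range(1, N):
        X[n] = rng.choice(M.shape[0], p=M[X[n - 1]])  # Markov step
        Y[n] = rng.choice(R.shape[1], p=R[X[n]])      # noisy emission
    return X, Y
```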

5
Example Binary HMP
[Figure: state diagrams of the transition and emission matrices of the binary HMP]
6
Example Binary HMP (Cont.)
  • A simple, symmetric binary HMP:
  • M = [[1-p, p], [p, 1-p]],  R = [[1-ε, ε], [ε, 1-ε]]
  • All properties of the process depend on two
    parameters, p and ε. Assume w.l.o.g. p, ε < ½.
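In code, the example amounts to two 2x2 matrices. A hedged sketch (the function name is ours; the last line assumes the sample_hmp sketch from the definitions slide is in scope):

```python
import numpy as np

def binary_symmetric_hmp(p, eps):
    """Symmetric binary HMP: the state flips w.p. p, the observation w.p. eps."""
    M = np.array([[1 - p, p],
                  [p, 1 - p]])      # transition matrix
    R = np.array([[1 - eps, eps],
                  [eps, 1 - eps]])  # emission (binary symmetric channel)
    return M, R

M, R = binary_symmetric_hmp(p=0.2, eps=0.1)
# X, Y = sample_hmp(M, R, N=10_000)  # using the earlier sampler sketch
```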

7
Overview
  • Introduction
  • Distance Measures and the Relative Entropy Rate
  • Results: Generalization from the Entropy Rate
  • Future Directions

8
Distance Measures for Two HMPs
  • Why is it important?
  • Often, one learns an HMP from data. It is
    important to know how different the learned
    model is from the true model.
  • Sometimes, many HMPs may represent different
    sources (e.g. different authors, different
    protein families, etc.), and we wish to know which
    sources are similar.
  • What distance measure should we use?
  • Look at the joint distributions of N consecutive Y
    symbols, P_λ^(N) and P_µ^(N).

9
Relative Entropy (RE) Rate
  • Notation: P_λ^(N), P_µ^(N) - the N-symbol distributions of Y under the two models.
  • Relative Entropy for finite (N-symbol)
    distributions:
  • D(P_λ^(N) || P_µ^(N)) = Σ_y P_λ^(N)(y) log [P_λ^(N)(y) / P_µ^(N)(y)]
  • Take the limit to get the RE-rate:
  • D(λ||µ) = lim_{N→∞} (1/N) D(P_λ^(N) || P_µ^(N))
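For small N, the finite-N quantity can be computed by brute force with the standard forward recursion. A sketch (all function names are ours); the cost grows as s^N, so this is only a sanity-check tool:

```python
# Brute-force (1/N) * D(P_lambda^(N) || P_mu^(N)): enumerate all s^N
# observation sequences and evaluate each probability by the forward recursion.
import itertools
import numpy as np

def stationary(M):
    """Stationary distribution of a row-stochastic matrix M."""
    evals, evecs = np.linalg.eig(M.T)
    pi = np.real(evecs[:, np.argmax(np.real(evals))])
    return pi / pi.sum()

def seq_prob(y, M, R):
    """P(y_1..y_N) via the forward recursion, started at stationarity."""
    alpha = stationary(M) * R[:, y[0]]
    for sym in y[1:]:
        alpha = (alpha @ M) * R[:, sym]
    return alpha.sum()

def finite_re_rate(M1, R1, M2, R2, N):
    s = R1.shape[1]  # observation alphabet size
    total = 0.0
    for y in itertools.product(range(s), repeat=N):
        p, q = seq_prob(y, M1, R1), seq_prob(y, M2, R2)
        total += p * np.log(p / q)
    return total / N  # in nats; converges to D(lambda||mu) as N grows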

10
Relative Entropy (RE) Rate
  • First proposed for HMPs by [Juang & Rabiner 85].
  • Not a norm (not symmetric, no triangle
    inequality).
  • Still, it has several natural interpretations:
  • - If one generates data from λ and gives a
    likelihood score to µ, then D(λ||µ) is the
    average likelihood loss per symbol (compared to
    the optimal model λ).
  • - If one compresses data generated by λ, assuming
    erroneously it was generated by µ, then one
    loses on average D(λ||µ) per symbol.

11
Relative Entropy (RE) Rate
  • For HMPs, D(λ||µ) is difficult to compute. So
    far, only bounds [Silva & Narayanan] or
    approximation algorithms
    [Li et al. 05, Do 03, Mohammad & Tranter 05] are
    known.
  • D(λ||µ) generalizes the concept of the Shannon
    entropy rate, via
  • H(λ) = log s - D(λ||u)
  • where u is the uniform model and s is the alphabet
    size of Y.
  • The entropy rate H of an HMP is a Lyapunov
    exponent, which is hard to compute in general
    [Jacquet et al. 04].
  • What is known (for H)? A Lyapunov exponent
    representation, analyticity, and asymptotic
    expansions in different regimes.
  • Goal: generalize these results and techniques to the RE-rate.
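As a quick numeric illustration of H(λ) = log s - D(λ||u), the check below takes u to be a model emitting i.i.d. uniform symbols and reuses the binary_symmetric_hmp, finite_re_rate and seq_prob sketches from the earlier slides; the parameter values are arbitrary.

```python
import itertools
import numpy as np

M, R = binary_symmetric_hmp(p=0.2, eps=0.1)   # model lambda
Mu = np.array([[0.5, 0.5], [0.5, 0.5]])       # "uniform" model u:
Ru = np.full((2, 2), 0.5)                     # i.i.d. uniform output

N = 8
D_N = finite_re_rate(M, R, Mu, Ru, N)         # (1/N) D(P_lambda || P_u)
H_N = -sum(seq_prob(y, M, R) * np.log(seq_prob(y, M, R))
           for y in itertools.product(range(2), repeat=N)) / N
print(H_N, np.log(2) - D_N)  # equal at every finite N, since P_u(y) = 2^(-N)
```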

12
Why is calculating D(λ||µ) difficult?
13
Overview
  • Introduction
  • Distance Measures and the Relative Entropy Rate
  • Results: Generalization from the Entropy Rate
  • Future Directions

14
RE-Rate and Lyapunov Exponents
  • What is a Lyapunov exponent?
  • Arises in dynamical systems, control theory,
    statistical physics, etc. Measures the stability
    of the system.
  • Take two (square) matrices A, B. Choose each time
    at random A (with prob. p) or B (w.p. 1-p). Look
    at the norm
  • (1/N) log ||A·B·B·B·A·A·B·A·B·B·A···||
  • The limit as N → ∞:
  • - Exists a.s. [Furstenberg & Kesten 60]
  • - Is called the top Lyapunov exponent.
  • - Is independent of the matrix norm chosen.
  • The HMP entropy rate is given as a Lyapunov exponent
    [Jacquet et al. 04].
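A Monte Carlo sketch of this definition (our code, not from the talk): track the growth of a vector under a random product of A and B, renormalizing at each step to avoid overflow. For a generic starting vector this converges a.s. to the top Lyapunov exponent.

```python
import numpy as np

def top_lyapunov(A, B, p, N=100_000, seed=0):
    """Estimate (1/N) log ||product of N random factors A (w.p. p) or B||."""
    rng = np.random.default_rng(seed)
    v = np.ones(A.shape[0])
    log_growth = 0.0
    for _ in range(N):
        v = (A if rng.random() < p else B) @ v
        norm = np.linalg.norm(v)
        log_growth += np.log(norm)  # accumulate the log-norm of the product
        v /= norm                   # renormalize to keep v finite
    return log_growth / N
```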

15
RE-Rate and Lyapunov Exponents
  • What about the RE-rate?
  • It is given as the difference of two Lyapunov
    exponents.

- The G's are random matrices, obtained simply
from M and R using the forward equations.
- Different matrices appear in the two Lyapunov
exponents, but the probabilities selecting the
matrices are the same.
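The likelihood-loss reading of the RE-rate suggests a simple simulation, which mirrors the two random matrix products: sample a long Y from λ and average log P_λ(Y) - log P_µ(Y), with each normalized forward step applying one forward-equation matrix. A sketch under our own naming, reusing stationary and sample_hmp from the earlier sketches:

```python
import numpy as np

def log_likelihood(Y, M, R):
    """log P(Y) via the normalized forward recursion."""
    alpha = stationary(M) * R[:, Y[0]]
    ll = 0.0
    for sym in Y[1:]:
        s = alpha.sum()
        ll += np.log(s)                      # log P(y_n | y_1..y_{n-1})
        alpha = ((alpha / s) @ M) * R[:, sym]
    return ll + np.log(alpha.sum())

def re_rate_mc(M1, R1, M2, R2, N=200_000):
    """Monte Carlo estimate of D(lambda||mu) as the average likelihood loss."""
    _, Y = sample_hmp(M1, R1, N)
    return (log_likelihood(Y, M1, R1) - log_likelihood(Y, M2, R2)) / N
```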
16
Analyticity of the RE-Rate
  • Is the RE-rate continuous, smooth, or even
    analytic in the parameters governing the HMPs?
  • For Lyapunov exponents: analyticity is known in the
    matrix entries [Ruelle 79] and in their
    probabilities [Peres 90, 91], separately.
  • For the HMP entropy rate, analyticity was recently
    shown by [Han & Marcus 05].

17
Analyticity of the RE-Rate
  • Using both results, we are able to show:
  • Thm: The RE-rate is analytic in the HMPs'
    parameters.
  • Analyticity is shown only in the interior of the
    parameter domain (i.e. strictly positive
    probabilities).
  • Behavior on the boundaries is more complicated.
    Sometimes analyticity persists on the boundaries
    (and beyond); sometimes we encounter
    singularities. A full characterization is still
    lacking [Marcus & Han 05].

18
RE-Rate Taylor Series Expansion
  • While in general the RE-rate is not known in
    closed form, there are specific parameter values
    for which it is easily given (e.g. for
    Markov chains). Perhaps we can expand around
    these values and get asymptotic results near
    them.
  • A similar approach was used for Lyapunov exponents
    [Derrida] and for the HMP entropy rate [Jacquet et
    al. 04, Weissman & Ordentlich 04, Zuk et al. 05],
    giving first-order asymptotics in various
    regimes.

19
Different Regimes: The Binary Case
  • p → 0, p → ½ (ε fixed)
  • ε → 0, ε → ½ (p fixed)
  • We concentrate on the high-SNR regime ε → 0
  • and the almost-memoryless regime p → ½.

20
RE-Rate Taylor Series Expansion
  • In [Zuk, Domany, Kanter & Aizenman 06] we give a
    procedure for calculating the full Taylor series
    expansion of the HMP entropy rate in the high-SNR
    and almost-memoryless regimes.
  • Main observation: finite systems give the
    correct RE-rate up to a given order.
  • This was discovered using computer experiments
    (symbolic computation in Maple).
  • A stronger result holds for the entropy rate
  • (orders settle for N ≥ (k+3)/2).
  • This does not hold in every regime. For some regimes
    (e.g. p → 0), even the first order never settles.
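A small numeric probe loosely related to the settling observation (our experiment, not the paper's proof): the conditional increments N·D_N - (N-1)·D_{N-1} of the brute-force finite-N relative entropy converge to the RE-rate, and at small ε they stabilize already at small N. This reuses binary_symmetric_hmp and finite_re_rate from the earlier sketches; the parameter values are arbitrary.

```python
M1, R1 = binary_symmetric_hmp(p=0.2, eps=0.01)  # model lambda
M2, R2 = binary_symmetric_hmp(p=0.3, eps=0.01)  # model mu
prev = 0.0
for N in range(1, 9):
    D_N = finite_re_rate(M1, R1, M2, R2, N)
    print(N, N * D_N - (N - 1) * prev)  # conditional RE increment
    prev = D_N
```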

21
Proof Outline (with M. Aizenman)
[Diagram: proof outline relating H(p,ε) computed up to O(ε^k), the entropy rate H(λ), and the RE-rate D(λ||µ)]
22
Overview
  • Introduction
  • Distance Measures and the Relative Entropy Rate
  • Results: Generalization from the Entropy Rate
  • Future Directions

23
RE-Rate Taylor Series Expansion
  • First order:
  • Higher orders were computed for the binary
    symmetric case.
  • Similar results hold for the almost-memoryless
    regime.
  • The radius of convergence seems larger for the latter
    expansion, although no rigorous results are known.

24
Future Directions
  • Study other regimes (e.g. two close models).
  • Behavior of the EM algorithm.
  • Generalizations (e.g. different alphabet sizes,
    the continuous case).
  • Physical realizations of HMPs (mesoscopic systems,
    quantum jumps).
  • Domain of analyticity - radius of convergence.

25
Thanks
  • Eytan Domany (Weizmann Inst.)
  • Ido Kanter (Bar-Ilan Univ.)
  • Michael Aizenman (Princeton Univ.)
  • Libi Hertzberg (Weizmann Inst.)