Title: Capacity of Finite-State Channels: Lyapunov Exponents and Shannon Entropy
1. Capacity of Finite-State Channels: Lyapunov Exponents and Shannon Entropy
- Tim Holliday
- Peter Glynn
- Andrea Goldsmith
- Stanford University
2. Introduction
- We show that the entropies H(X), H(Y), H(X,Y), and H(Y|X) for finite-state Markov channels are Lyapunov exponents.
- This result provides an explicit connection between dynamical systems theory and information theory.
- It also clarifies information-theoretic connections to hidden Markov models.
- This allows novel proof techniques from other fields to be applied to information theory problems.
3. Finite-State Channels
- Channel state Z_n ∈ {c_0, c_1, ..., c_d} is a Markov chain with transition matrix R(c_j, c_k).
- States correspond to distributions on the input/output symbols: P(X_n = x, Y_n = y) = q(x, y | z_n, z_{n+1}).
- Commonly used to model ISI channels, magnetic recording channels, etc.
[Figure: channel state transition diagram, edges labeled by the transition probabilities R(c_i, c_j)]
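A minimal numerical stand-in for such a channel (the two states and all probability values below are illustrative choices, not parameters from the talk):

```python
import numpy as np

# Illustrative two-state channel: R(c_j, c_k) is the state transition matrix and
# q[(x, y)][c0, c1] = P(X_n = x, Y_n = y | Z_n = c0, Z_{n+1} = c1) for binary X, Y.
R = np.array([[0.9, 0.1],
              [0.2, 0.8]])
q = {(0, 0): np.array([[0.90, 0.45], [0.90, 0.45]]),
     (0, 1): np.array([[0.05, 0.05], [0.05, 0.05]]),
     (1, 0): np.array([[0.02, 0.30], [0.02, 0.30]]),
     (1, 1): np.array([[0.03, 0.20], [0.03, 0.20]])}

# For every state transition (c0, c1), the symbol probabilities sum to one.
assert np.allclose(sum(q.values()), np.ones((2, 2)))
```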
4. Time-varying Channels with Memory
- We consider finite-state Markov channels with no channel state information.
- Time-varying channels with finite memory induce infinite memory in the channel output.
- Capacity for time-varying, infinite-memory channels is defined in terms of a limit.
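The limit itself is not preserved in this transcript; the standard limiting definition being referred to is, as a reconstruction:

```latex
% Capacity of a channel with memory (standard limiting definition, reconstructed):
C = \lim_{n \to \infty} \frac{1}{n} \sup_{P(X_1^n)} I\bigl(X_1^n ; Y_1^n\bigr)
```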
5. Previous Research
- Mutual information for the Gilbert-Elliott channel
  - Mushkin and Bar-David, 1989
- Finite-state Markov channels with i.i.d. inputs
  - Goldsmith and Varaiya, 1996
- Recent research on simulation-based computation of mutual information for finite-state channels
  - Arnold, Vontobel, Loeliger, Kavcic, 2001, 2002, 2003
  - Pfister, Siegel, 2001, 2003
6. Symbol Matrices
- For each symbol pair (x, y) ∈ X × Y, define a |Z| × |Z| matrix G_(x,y).
- Each element corresponds to the joint probability of the symbols and the channel transition, where (c_0, c_1) are the channel states at times (n, n+1):

G_(x,y)(c_0, c_1) = R(c_0, c_1) q(x, y | c_0, c_1),  for all (c_0, c_1) ∈ Z × Z
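A sketch of this construction for the toy channel defined after slide 3 (parameters repeated here so the block runs on its own):

```python
import numpy as np

# Toy channel repeated from the earlier sketch (illustrative values).
R = np.array([[0.9, 0.1],
              [0.2, 0.8]])
q = {(0, 0): np.array([[0.90, 0.45], [0.90, 0.45]]),
     (0, 1): np.array([[0.05, 0.05], [0.05, 0.05]]),
     (1, 0): np.array([[0.02, 0.30], [0.02, 0.30]]),
     (1, 1): np.array([[0.03, 0.20], [0.03, 0.20]])}

# Symbol matrices: G_(x,y)(c0, c1) = R(c0, c1) * q(x, y | c0, c1), one per symbol pair.
G = {xy: R * q[xy] for xy in q}

# Sanity check: summing G over all symbol pairs recovers the transition matrix R.
assert np.allclose(sum(G.values()), R)
```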
7. Probabilities as Matrix Products
- Let μ be the stationary distribution of the channel.
- The matrices G_(x,y) are deterministic functions of the random symbol pair (X, Y).
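The displayed formula on this slide is not preserved in the transcript; the standard matrix-product identity that follows from the definitions above (a reconstruction) is:

```latex
% \mu is the stationary row vector of R; \mathbf{1} is the all-ones column vector.
P\bigl(X_1^n = x_1^n,\; Y_1^n = y_1^n\bigr)
  \;=\; \mu \, G_{(x_1,y_1)} G_{(x_2,y_2)} \cdots G_{(x_n,y_n)} \, \mathbf{1}
```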
8. Entropy as a Lyapunov Exponent
- The Shannon entropy is equivalent to the Lyapunov exponent for the random matrix product of the G_(X,Y).
- Similar expressions exist for H(X), H(Y), and H(X,Y).
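The slide's displayed equation is not reproduced in the transcript; in the standard formulation (a reconstruction, using the convention that the exponent of the decaying matrix product is negated to give entropy in nats):

```latex
\lambda(X,Y) \;=\; \lim_{n \to \infty} \frac{1}{n}
   \log \bigl\| G_{(X_1,Y_1)} G_{(X_2,Y_2)} \cdots G_{(X_n,Y_n)} \bigr\|,
\qquad
H(X,Y) \;=\; -\,\lambda(X,Y)
```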
9. Growth Rate Interpretation
- The typical set A_n is the set of sequences x_1, ..., x_n satisfying the AEP condition (see below).
- By the AEP, P(A_n) > 1 − ε for sufficiently large n.
- The Lyapunov exponent is the average rate of growth of the probability of a typical sequence.
- In order to compute λ(X) we need information about the direction of the system.
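The condition defining the typical set is not preserved in this transcript; the standard AEP form is:

```latex
A_\epsilon^{(n)} \;=\; \Bigl\{ (x_1,\dots,x_n) \;:\;
   \bigl| -\tfrac{1}{n} \log P(x_1,\dots,x_n) - H(X) \bigr| \le \epsilon \Bigr\}
```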
10. Lyapunov Direction Vector
- The vector p_n is the direction associated with λ(X) for any initial distribution μ.
- It also defines the conditional channel state probability.
- The vector has a number of interesting properties:
  - It is the standard prediction filter in hidden Markov models.
  - p_n is a Markov chain if μ is the stationary distribution for the channel.
p_n = P(Z_{n+1} = · | X_1, X_2, ..., X_n) = (μ G_{X_1} G_{X_2} ⋯ G_{X_n}) / (μ G_{X_1} G_{X_2} ⋯ G_{X_n} 1)
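A minimal numerical sketch of this filter recursion, using the toy channel from the earlier sketches (here A_x = Σ_y G_(x,y) are the input-marginal symbol matrices; all values are illustrative):

```python
import numpy as np

# Toy channel repeated from the earlier sketches (illustrative values).
R = np.array([[0.9, 0.1],
              [0.2, 0.8]])
q = {(0, 0): np.array([[0.90, 0.45], [0.90, 0.45]]),
     (0, 1): np.array([[0.05, 0.05], [0.05, 0.05]]),
     (1, 0): np.array([[0.02, 0.30], [0.02, 0.30]]),
     (1, 1): np.array([[0.03, 0.20], [0.03, 0.20]])}
G = {xy: R * q[xy] for xy in q}
A = {x: G[(x, 0)] + G[(x, 1)] for x in (0, 1)}   # input-marginal matrices A_x

# Stationary distribution mu of R (left eigenvector for eigenvalue 1).
w, V = np.linalg.eig(R.T)
mu = np.real(V[:, np.argmax(np.real(w))])
mu /= mu.sum()

def filter_step(p, x):
    """One prediction-filter update: p <- p A_x / (p A_x 1)."""
    v = p @ A[x]
    return v / v.sum()

p = mu.copy()
for x in (0, 1, 1, 0, 1):     # an arbitrary observed input sequence
    p = filter_step(p, x)
print(p)                      # P(Z_{n+1} = . | X_1..X_n); entries sum to 1
```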
11. Random Perron-Frobenius Theory
- The vector p is the random Perron-Frobenius eigenvector associated with the random matrix G_X.
- For all n, the matrix product can be decomposed through the filter vectors p_0, ..., p_{n-1}.
- For the stationary version of p, the Lyapunov exponent we wish to compute reduces to an expectation (see the reconstruction below).
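The formulas on this slide are not preserved in the transcript; the standard identities behind these bullets (a reconstruction, writing p_0 = μ and 1 for the all-ones vector) are:

```latex
\frac{1}{n}\log\bigl(\mu\, G_{X_1} \cdots G_{X_n}\,\mathbf{1}\bigr)
  \;=\; \frac{1}{n}\sum_{k=1}^{n} \log\bigl(p_{k-1}\, G_{X_k}\,\mathbf{1}\bigr),
\qquad
\lambda(X) \;=\; \mathbb{E}\bigl[\log\bigl(p\, G_{X}\,\mathbf{1}\bigr)\bigr]
```

where the expectation is taken under the stationary joint law of the filter p and the next symbol X.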
12. Technical Difficulties
- The Markov chain p_n is not irreducible if the input/output symbols are discrete!
- Standard existence and uniqueness results cannot be applied in this setting.
- We have shown that p_n possesses a unique stationary distribution if the matrices G_X are irreducible and aperiodic.
- The proof exploits the contraction property of positive matrices.
13. Computing Mutual Information
- Compute the Lyapunov exponents λ(X), λ(Y), and λ(X,Y) as expectations (a deterministic computation).
- The mutual information can then be expressed in terms of these exponents (see below).
- We also prove continuity of the Lyapunov exponents in the channel parameters (q, R).
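The slide's expression is not preserved; with the convention that each entropy is the negative of the corresponding Lyapunov exponent, it reads (as a reconstruction):

```latex
I(X;Y) \;=\; H(X) + H(Y) - H(X,Y)
       \;=\; \lambda(X,Y) \;-\; \lambda(X) \;-\; \lambda(Y)
```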
14. Simulation-Based Computation (Previous Work)
- Step 1: Simulate a long sequence of input/output symbols.
- Step 2: Estimate the entropy from the simulated sequence (see the sketch below).
- Step 3: For sufficiently large n, assume that the sample-based entropy has converged.
- Problems with this approach:
  - Need to characterize initialization bias and confidence intervals.
  - Standard theory doesn't apply for discrete symbols.
15. Simulation Traces for Computation of H(X,Y)
16. Rigorous Simulation Methodology
- We prove a new functional central limit theorem for sample entropy with discrete symbols.
- A new confidence-interval methodology for simulated estimates of entropy:
  - How good is our estimate?
- A method for bounding the initialization bias in sample-entropy simulations:
  - How long do we have to run the simulation?
- Proofs involve techniques from stochastic processes and random matrix theory.
17. Computational Complexity of Lyapunov Exponents
- Lyapunov exponents are notoriously difficult to compute, regardless of the computation method.
- NP-complete problem (Tsitsiklis, 1998).
- Dynamical systems driven by random matrices typically possess poor convergence properties.
- Initial transients in simulations can linger for extremely long periods of time.
18. Conclusions
- Lyapunov exponents are a powerful new tool for computing the mutual information of finite-state channels.
- Results permit rigorous computation, even in the case of discrete inputs and outputs.
- Computational complexity is high, but multiple computation methods are available.
- The new connection between information theory and dynamical systems provides information theorists with a new set of tools to apply to challenging problems.