1. Learning Approach to Link Adaptation in WirelessMAN
- Avinash Prasad
- Supervisor: Prof. Saran
2. Outline of Presentation
- Introduction
- Problem Definition
- Proposed Solution: Learning Automaton
- Requirements
- About Implementation
- Results
- Conclusions
- References
3. Introduction (Link Adaptation)
- Definition
  - Link adaptation refers to a set of techniques where the modulation, coding rate, and/or other signal transmission parameters are changed on the fly to better adjust to changing channel conditions.
4. Introduction (WirelessMAN)
- WirelessMAN requires high data rates over
  - channel conditions that vary across different links,
  - channel conditions that vary over time.
- Link adaptation on a per-link basis is the most fundamental step this BWA system uses to respond to these link-to-link variations and variations over time. There is an elaborate message-passing mechanism for exchanging channel information at the MAC layer.
5. Problem Definition
- Link adaptation requires us to know which channel condition changes call for a change in transmission parameters.
- The most commonly identified problem of link adaptation: how do we calculate the threshold values for the various channel estimation parameters that signal the need for a change in transmission parameters?
6. Problem Definition (Current Approaches)
- Current methods for threshold estimation
  - Model based
    - Requires analytical modeling.
    - How reliable is the model?
    - Is an appropriate model even available for the wireless scenario?
  - Statistical methods
    - Hard to obtain for a given set of channel conditions.
    - The values are fixed and do not change with time.
    - Even a change in season may affect the best appropriate values.
  - Heuristics based
    - Scope limited to very few scenarios.
7. Proposed Solution (Aim)
- Come up with a machine learning based method such that
  - it learns the optimal threshold values as we operate the network,
  - it needs no analytical modeling in its operation,
  - it can handle noisy feedback from the environment,
  - it is generic enough to learn different parameters without much change to the core.
8. Proposed Solution (Idea)
- Use a stochastic learning automaton.
- Informally: it essentially simulates animal learning. Repeatedly make decisions based on your current knowledge, then refine those decisions according to the response from the environment.
- Mathematically: it modifies the probability of selecting each action based on how much reward we get from the environment.
9. Proposed Solution (Exp. Setup)
- Experimental setup used to study the stochastic learning methods.
- We learn the optimal SNR threshold values for switching among coding profiles, such that throughput is maximized (a selection sketch follows this slide).
- Threshold Ti decides when to switch from profile i to profile i+1.
- The possible values for Ti have been restricted to a limited set, to speed up learning by reducing the number of options.
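A minimal sketch (in C++, the language of the learning setup) of how the learned thresholds could map an SNR estimate to a coding profile; the function and variable names are illustrative, not taken from the actual implementation.

```cpp
#include <cstddef>
#include <vector>

// Illustrative sketch: pick the coding profile for the current SNR estimate, given
// thresholds T[0..N-2] sorted in increasing order. Threshold T[i] marks the switch
// from profile i to profile i+1.
std::size_t selectProfile(double snrEstimate, const std::vector<double>& thresholds)
{
    std::size_t profile = 0;
    while (profile < thresholds.size() && snrEstimate >= thresholds[profile])
        ++profile;   // SNR clears T[profile]: move to the next, higher-rate profile
    return profile;  // value in 0..N-1 for N profiles
}
```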
10. Proposed Solution (Exp. Setup) (Mathematical Formulation)
- For N different profiles in use, (N-1) thresholds need to be determined/learned.
- At any instance these (N-1) thresholds Ti, i = 1, ..., N-1, form the input to the environment.
- In return the environment gives the reward
  - β(SNR estimate, <T1, ..., T(N-1)>) = (1 - SER) * (K / Kmax)
- K represents the information block size fed to the RS encoder in the selected profile.
- Kmax is the maximum possible value of K over all profiles. This makes the reward value lie in the range [0, 1].
- Clearly, β is a measure of normalized throughput.
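A minimal sketch of the reward computation described above; the clamp to [0, 1] is an added safeguard for noisy SER estimates and is an assumption, not part of the stated formula.

```cpp
#include <algorithm>

// Normalized-throughput reward: beta = (1 - SER) * (K / Kmax), kept in [0, 1].
double rewardBeta(double symbolErrorRate, double blockSizeK, double blockSizeKmax)
{
    double beta = (1.0 - symbolErrorRate) * (blockSizeK / blockSizeKmax);
    return std::clamp(beta, 0.0, 1.0);  // guard against noisy SER estimates
}
```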
11. Proposed Solution (Learning Automaton) (Formal Definition)
- A learning automaton is completely given by (A, B, LA, P(k)).
- Action set A = {a1, a2, ..., ar}; we shall always assume this set to be finite in our discussion.
- Set of rewards B = [0, 1].
- The learning algorithm LA.
- State information P(k) = [p1(k), p2(k), ..., pr(k)], the action probability vector.
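A minimal sketch of the state one automaton carries under this definition, assuming a uniform initial probability vector; the type and field names are illustrative.

```cpp
#include <utility>
#include <vector>

// Illustrative sketch of the tuple (A, B, LA, P(k)) carried by one automaton.
struct Automaton {
    std::vector<double> actions;  // finite action set A = {a1, ..., ar}: candidate thresholds
    std::vector<double> prob;     // state P(k) = [p1(k), ..., pr(k)], the action probabilities
};

// Start with a uniform probability vector over the r actions.
Automaton makeAutomaton(std::vector<double> actionSet)
{
    Automaton a;
    a.prob.assign(actionSet.size(), 1.0 / actionSet.size());
    a.actions = std::move(actionSet);
    return a;
}
```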
12. Proposed Solution (Learning Automaton) (Why?)
- Advantages
  - Complete generality of the action set.
  - We can have an entire set of automata, each working on a different variable of a multivariable problem, and yet they arrive at a Nash equilibrium at which the overall function is maximized, much faster than a single automaton could.
  - It can handle noisy reward values from the environment.
    - It performs long-time averaging as it learns.
    - But it therefore needs the environment to be stationary.
13. Proposed Solution (Learning Automaton) (Solution to Exp. Setup)
- Each threshold is learnt by an independent automaton in the group (game) of automata that solves the problem.
- For each automaton we choose the smallest possible action set that covers all possible variations in channel conditions in the setup, i.e. we decide the possible range of threshold values.
- We decide on the learning algorithm to use.
14. Proposed Solution (Learning Automaton) (Solution to Exp. Setup)
- For k being the instance of playoff, we do the following (see the sketch after this slide):
  - Each automaton selects an action (threshold) based on its state P(k), the probability vector.
  - Based on these threshold values, we select a profile for channel transmission.
  - We get feedback from the channel in the form of the normalized throughput defined earlier.
  - We use the learning algorithm to calculate the new state, the set of probabilities P(k+1).
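A minimal sketch of one playoff instance; the Automaton type from the earlier sketch is repeated so the block stands alone, and channelReward and updateLRI are hypothetical helpers standing in for the environment feedback and the learning rule sketched on the next slide.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Illustrative types/helpers so the sketch is self-contained.
struct Automaton {
    std::vector<double> actions;  // candidate threshold values
    std::vector<double> prob;     // action probability vector P(k)
};
double channelReward(const std::vector<double>& thresholds);   // assumed environment feedback, beta in [0,1]
void updateLRI(Automaton& a, std::size_t chosen, double beta, double lambda);  // learning rule (next slide)

// One playoff instance k of the game of automata.
void playoffStep(std::vector<Automaton>& automata, std::mt19937& rng)
{
    const double lambda = 0.0015;  // rate constant, as in the LRI runs shown in the results
    std::vector<double> thresholds;
    std::vector<std::size_t> chosen;

    // 1. Each automaton picks a threshold according to its probability vector P(k).
    for (auto& a : automata) {
        std::discrete_distribution<std::size_t> draw(a.prob.begin(), a.prob.end());
        std::size_t idx = draw(rng);
        chosen.push_back(idx);
        thresholds.push_back(a.actions[idx]);
    }

    // 2. The thresholds fix the transmission profile; the environment (here, the network
    //    model built from the PHY simulations) returns the normalized throughput beta.
    double beta = channelReward(thresholds);

    // 3. Each automaton updates its state P(k) -> P(k+1).
    for (std::size_t i = 0; i < automata.size(); ++i)
        updateLRI(automata[i], chosen[i], beta, lambda);
}
```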
15. Proposed Solution (Learning Automaton) (Learning Algorithms)
- We have explored two different algorithms.
- LRI (Linear Reward-Inaction)
  - Very much Markovian; it just updates P(k+1) based on the last action/reward pair:
    - for a(k) = ai:  pi(k+1) = pi(k) + λ β(k) (1 - pi(k))
    - otherwise:      pj(k+1) = pj(k) - λ β(k) pj(k)
  - λ is a rate constant.
- Pursuit Algorithm
  - Uses the entire history of selections and rewards to calculate the average reward estimates for all actions.
  - Aggressively tries to move towards the simplex solution, which has probability 1 for the action with the highest reward estimate, say action aM:
    - P(k+1) = P(k) + λ (eM(k) - P(k)), where eM(k) is the unit vector for action aM.
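Minimal sketches of the two update rules; the Automaton type from the earlier sketch is repeated for self-containment, and for the pursuit rule the running reward estimates are assumed to be maintained elsewhere from the selection/reward history.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Illustrative type repeated so the sketch is self-contained.
struct Automaton {
    std::vector<double> actions;
    std::vector<double> prob;  // action probability vector P(k)
};

// L_RI update: reward the chosen action, shrink the rest (lambda is the rate constant,
// beta the normalized-throughput reward in [0,1]).
void updateLRI(Automaton& a, std::size_t chosen, double beta, double lambda)
{
    for (std::size_t j = 0; j < a.prob.size(); ++j) {
        if (j == chosen)
            a.prob[j] += lambda * beta * (1.0 - a.prob[j]);
        else
            a.prob[j] -= lambda * beta * a.prob[j];
    }
}

// Pursuit update: move P(k) towards the unit vector e_M of the action with the highest
// running reward estimate.
void updatePursuit(Automaton& a, const std::vector<double>& rewardEstimate, double lambda)
{
    auto best = static_cast<std::size_t>(std::distance(
        rewardEstimate.begin(),
        std::max_element(rewardEstimate.begin(), rewardEstimate.end())));
    for (std::size_t j = 0; j < a.prob.size(); ++j) {
        double eM = (j == best) ? 1.0 : 0.0;
        a.prob[j] += lambda * (eM - a.prob[j]);
    }
}
```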
16. Proposed Solution (Learning Automaton) (Learning Algorithms cont.)
- The two algorithms differ in
  - the speed of convergence to the optimal solution,
  - the amount of storage required by each,
  - how decentralized the learning setup (game) can be,
  - the way they approach their convergence point.
- Being a greedy method, the pursuit algorithm shows a lot of deviation in the evolution phase.
17. Requirements
- 802.16 OFDM Physical layer
- Channel model (SUI model used)
- Learning Setup
18. About Implementation (802.16 OFDM Physical Layer)
- Implements the OFDM physical layer from 802.16d.
- Coded in Matlab.
- Complies fully with the standard; operations tested against the example pipeline given in the standard.
- No antenna diversity used, and perfect channel impulse response estimation assumed.
19. About Implementation (Channel Model)
- We have implemented the complete set of SUI models for the omni antenna case.
- The complete channel model consists of one of the SUI models plus an AWGN model for noise.
- Coded in Matlab, thus completing the entire channel coding pipeline.
- Results from this data transmission pipeline are presented later.
20. About Implementation (Learning Setup)
- We implemented both algorithms for comparison.
- Coded in C/C++.
- A network model was constructed using the symbol error rate plots obtained from the PHY layer simulations to estimate the reward values (a sketch follows this slide).
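A minimal sketch of how such a network model might store and query the SER-vs-SNR curves; the linear interpolation and the data layout are assumptions, not details from the implementation.

```cpp
#include <cstddef>
#include <vector>

// Illustrative sketch: one SER-vs-SNR curve per profile, taken from the PHY simulations
// and interpolated to produce the SER that feeds the reward beta = (1 - SER) * (K / Kmax).
struct SerCurve {
    std::vector<double> snrDb;  // sampled SNR points, increasing
    std::vector<double> ser;    // symbol error rate at each SNR point

    double lookup(double snr) const
    {
        if (snr <= snrDb.front()) return ser.front();
        if (snr >= snrDb.back())  return ser.back();
        std::size_t i = 1;
        while (snrDb[i] < snr) ++i;
        double t = (snr - snrDb[i - 1]) / (snrDb[i] - snrDb[i - 1]);
        return ser[i - 1] + t * (ser[i] - ser[i - 1]);  // linear interpolation
    }
};
```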
21. Results (PHY layer): BER plots for different SUI models
22. Results (PHY layer): SER plots for different SUI models
23. Results (PHY layer): BER plots for different profiles at SUI-2
24. Results (PHY layer): SER plots for different profiles at SUI-2
25. Results (PHY layer): Reward metric for learning automaton
26. Results (Learning): Convergence curve, LRI (rate = 0.0015)
27. Results (Learning): Convergence curve, Pursuit (rate = 0.0017)
28. Results (Learning): Convergence curve, LRI (rate = 0.0032)
29. Results (Learning, 4 actions per threshold): Convergence curve, LRI (rate = 0.0015)
30. Conclusions
- Our plots suggest the following:
  - Learning methods are indeed capable of arriving at the optimal values for parameters under the type of channel conditions faced in WirelessMAN.
  - The rate of convergence depends on
    - the rate factor (λ),
    - the size of the action set,
    - how much the actions differ in terms of the reward they get from the environment,
    - the learning algorithm.
  - Although we have worked with a relatively simple setup, under the assumption that the SNR estimate is perfect and available, the complete generality of the action set ensures that we can work with other channel estimation parameters as well.
31. References
- V. Erceg and K. V. Hari, "Channel Models for Fixed Wireless Applications," IEEE 802.16 Broadband Wireless Access Working Group, 2001.
- D. S. Baum, "Simulating the SUI Channel Models," IEEE 802.16 Broadband Wireless Access Working Group, 2000.
- M. A. L. Thathachar and P. S. Sastry, Networks of Learning Automata: Techniques for Online Stochastic Optimization, Kluwer Academic Publishers, 2003.
32. Thanks