1
Learning Approach to Link Adaptation in
WirelessMAN
  • Avinash Prasad
  • Supervisor: Prof. Saran

2
Outline of Presentation
  • Introduction
  • Problem Definition
  • Proposed Solution: Learning Automaton
  • Requirements
  • About Implementation
  • Results
  • Conclusions
  • References

3
Introduction (Link Adaptation)
  • Definition
  • Link adaptation refers to a set of techniques
    where modulation, coding rate and/or other signal
    transmission parameters are changed on the fly to
    better adjust to the changing channel conditions.

4
Introduction (WirelessMAN)
  • WirelessMAN requires high data rates over
    channel conditions that vary across different
    links and over time.
  • Link adaptation on a per-link basis is the most
    fundamental mechanism this BWA system uses to
    respond to these link-to-link and time
    variations. An elaborate message-passing
    mechanism exchanges channel information at the
    MAC layer.

5
Problem Definition
  • Link adaptation requires us to know which
    channel-condition changes call for a change in
    transmission parameters.
  • The most commonly identified problem of link
    adaptation:
  • How do we calculate the threshold values for the
    various channel-estimation parameters that
    signal the need for a change in transmission
    parameters?

6
Problem Definition (Current Approaches)
  • Current methods for threshold estimation
  • Model based
  • Requires analytical modeling.
  • How reliable is the model?
  • Is an appropriate model available for the
    wireless scenario?
  • Statistical methods
  • Hard to obtain for the channel conditions at
    hand.
  • Do not change with time; the thresholds stay
    fixed.
  • Even a change in season may affect the best
    appropriate values.
  • Heuristics based
  • Scope limited to very few scenarios.

7
Proposed Solution (Aim)
  • Come up with a machine-learning-based method
    that:
  • Learns the optimal threshold values as we
    operate the network.
  • Needs no analytical modeling in its operation.
  • Can handle noisy feedback from the environment.
  • Is generic enough to learn different parameters
    without many changes to the core.

8
Proposed Solution (Idea)
  • Use a stochastic learning automaton.
  • Informally: it essentially simulates animal
    learning. Repeatedly make decisions based on
    your current knowledge, then refine those
    decisions according to the response from the
    environment.
  • Mathematically: it modifies the probability of
    selecting each action based on how much reward
    we get from the environment.

9
Proposed Solution (Exp. Setup)
  • Experimental setup used to study the stochastic
    learning methods.
  • We learn the optimal SNR threshold values for
    switching among coding profiles, such that
    throughput is maximized.
  • Threshold Ti decides when to switch from profile
    i to profile i+1 (a small sketch of this mapping
    follows below).
  • The possible values for Ti have been restricted
    to a limited set, to facilitate faster learning
    by reducing the number of options.
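  • A minimal sketch (not from the deck) of how the
    learned thresholds could map an SNR estimate to a
    coding profile; the function name and the
    assumption that the thresholds are sorted in
    increasing order are illustrative only.

    // Hypothetical helper: profile 0 is the most robust; threshold T_i marks
    // the SNR at which we switch from profile i to the more efficient i+1.
    #include <vector>

    int selectProfile(double snrEstimate, const std::vector<double>& thresholds) {
        int profile = 0;
        for (double t : thresholds) {
            if (snrEstimate >= t) ++profile;   // above T_i: move on to profile i+1
            else break;                        // thresholds assumed sorted ascending
        }
        return profile;                        // value in 0 .. N-1
    }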

10
Proposed Solution (Exp. Setup)
(Mathematical Formulation)
  • For N different profiles in use, (N-1)
    thresholds need to be determined/learned.
  • At any instant these (N-1) thresholds Ti,
    i = 1,..,N-1, form the input to the environment.
  • In return the environment returns the reward
  • ß(SNR estimate, <T1,..,TN-1>) = (1 - SER) ·
    (K / Kmax)
  • K represents the information block size fed to
    the RS encoder in the selected profile.
  • Kmax is the maximum possible value of K for any
    profile. This makes the reward value lie in the
    range [0,1].
  • Clearly ß is a measure of normalized throughput
    (a small sketch of this computation follows
    below).
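  • As an illustration (not from the deck), the reward
    could be computed directly from the measured SER
    and the selected profile's RS block size; the
    names below are hypothetical.

    // Hypothetical sketch: beta = (1 - SER) * (K / Kmax), which lies in [0, 1].
    double rewardBeta(double symbolErrorRate, double K, double Kmax) {
        return (1.0 - symbolErrorRate) * (K / Kmax);
    }
    // Example: SER = 0.02, K = 36, Kmax = 108  ->  beta = 0.98 * (1/3) ~ 0.327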

11
Proposed Solution (Learning Automaton)
(Formal Definition)
  • A learning automaton is completely given by
    (A, B, LA, Pk).
  • Action set A = {a1, a2, .., ar}; we shall always
    assume this set to be finite in our discussion.
  • Set of rewards B = [0,1].
  • The learning algorithm LA.
  • State information Pk = (p1(k), p2(k), .., pr(k)),
    the action-probability vector (a data-structure
    sketch follows below).
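  • A minimal data-structure sketch of the 4-tuple
    (illustrative only; the field names are not taken
    from the project's code).

    #include <vector>

    struct LearningAutomaton {
        std::vector<double> actions;   // A = {a1, .., ar}: finite action set
                                       //   (here, candidate threshold values)
        // B = [0, 1]: rewards are real numbers supplied by the environment.
        std::vector<double> prob;      // Pk = (p1(k), .., pr(k)): action
                                       //   probabilities, initialized to 1/r
        double lambda;                 // rate constant used by the learning
                                       //   algorithm LA
    };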

12
Proposed Solution (Learning Automaton)
(Why?)
  • Advantages
  • Complete generality of the action set.
  • We can have an entire set of automata, each
    working on a different variable of a
    multivariable problem, and yet they arrive at a
    Nash equilibrium that maximizes the overall
    function, much faster than a single automaton
    would.
  • It can handle noisy reward values from the
    environment.
  • It performs long-time averaging as it learns,
    but thus needs the environment to be stationary.

13
Proposed Solution (Learning Automaton)
(Solution to Exp. Setup)
  • Each threshold is learnt by an independent
    automaton in the group (game) of automata that
    solves the problem.
  • For each automaton we choose the smallest
    possible action set that covers all possible
    variations in channel conditions in the setup,
    i.e., we decide the possible range of threshold
    values.
  • We decide on the learning algorithm to use.

14
Proposed Solution (Learning Automaton)
(Solution to Exp. Setup)
  • At each playoff instance k we do the following
    (a code sketch of one round follows below):
  • Each automaton selects an action (threshold)
    based on its state Pk, the probability vector.
  • Based on these threshold values, we select a
    profile for channel transmission.
  • Get feedback from the channel in the form of the
    value of normalized throughput defined earlier.
  • Use the learning algorithm to calculate the new
    state, the set of probabilities Pk+1.
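  • A minimal sketch of one playoff round, reusing the
    LearningAutomaton struct sketched earlier. The
    real channel pipeline lives in the Matlab PHY
    simulation, so the environment and the probability
    update are reduced to caller-supplied callbacks;
    all names here are illustrative.

    #include <functional>
    #include <random>
    #include <vector>

    // Sample an action index according to the automaton's probability vector Pk.
    int chooseAction(const std::vector<double>& prob, std::mt19937& rng) {
        std::discrete_distribution<int> dist(prob.begin(), prob.end());
        return dist(rng);
    }

    void playoffRound(std::vector<LearningAutomaton>& automata,
                      const std::function<double(const std::vector<double>&)>& environment,
                      const std::function<void(LearningAutomaton&, int, double)>& update,
                      std::mt19937& rng) {
        std::vector<int> chosen;
        std::vector<double> thresholds;
        for (auto& la : automata) {              // 1. each automaton picks a threshold
            int a = chooseAction(la.prob, rng);
            chosen.push_back(a);
            thresholds.push_back(la.actions[a]);
        }
        double beta = environment(thresholds);   // 2-3. transmit with the selected profile
                                                 //      and get normalized throughput back
        for (size_t i = 0; i < automata.size(); ++i)
            update(automata[i], chosen[i], beta);   // 4. compute P(k+1) for each automaton
    }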

15
Proposed Solution (Learning Automaton)
(Learning Algorithms)
  • We have explored two different algorithms.
  • LRI, linear reward-inaction.
  • Very much Markovian: just update Pk+1 based on
    the last action/reward pair.
  • For the selected action a(k) = ai:
    pi(k+1) = pi(k) + λß(k)(1 - pi(k))
  • For every other action aj (j ≠ i):
    pj(k+1) = pj(k) - λß(k) pj(k)
  • λ is a rate constant (an LRI update sketch is
    given after this slide).
  • Pursuit algorithm.
  • Uses the entire history of selections and rewards
    to calculate average reward estimates for all
    actions.
  • Aggressively tries to move towards the simplex
    solution, which has probability 1 for the action
    with the highest reward estimate, say action aM:
    P(k+1) = P(k) + λ(eM(k) - P(k))
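  • A minimal sketch of the LRI update above; it could
    be passed as the update callback in the
    playoff-round sketch. The selected action's
    probability rises in proportion to the reward, all
    others fall, and the vector stays normalized.

    #include <vector>

    // la.lambda is the rate constant; beta is the reward in [0, 1].
    void lriUpdate(LearningAutomaton& la, int selected, double beta) {
        for (int j = 0; j < static_cast<int>(la.prob.size()); ++j) {
            if (j == selected)
                la.prob[j] += la.lambda * beta * (1.0 - la.prob[j]);  // reward the chosen action
            else
                la.prob[j] -= la.lambda * beta * la.prob[j];          // shrink the others
        }
    }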

16
Proposed Solution (Learning Automaton)
(Learning Algorithms cont.)
  • The two algorithms differ in
  • the speed of convergence to the optimal
    solution,
  • the amount of storage required by each,
  • how decentralized the learning setup (game) can
    be, and
  • the way they approach their convergence point.
  • Being a greedy method, the pursuit algorithm
    shows a lot of deviation in the evolution phase.

17
Requirements
  • 802.16 OFDM Physical layer
  • Channel model (SUI model used)
  • Learning Setup

18
About Implementation (802.16 OFDM Physical Layer)
  • Implements the OFDM physical layer from 802.16d.
  • Coded in Matlab™.
  • Complies fully with the standard; operations
    tested against the example pipeline given in the
    standard.
  • No antenna diversity used, and perfect channel
    impulse response estimation assumed.

19
About Implementation (Channel Model)
  • We have implemented the complete set of SUI
    models for the omni-antenna case.
  • The complete channel model consists of one of the
    SUI models plus an AWGN model for noise.
  • Coded in Matlab™, thus completing the entire
    channel-coding pipeline.
  • Results from this data transmission pipeline are
    presented later.

20
About Implementation (Learning Setup)
  • We implemented both algorithms for comparison.
  • Coded in C/C++.
  • A network model was constructed using the symbol
    error rate plots obtained from the PHY-layer
    simulations to estimate the reward values.

21
Results (PHY layer): BER plots for different SUI models
22
Results (PHY layer): SER plots for different SUI models
23
Results (PHY layer): BER plots for different profiles at SUI-2
24
Results (PHY layer): SER plots for different profiles at SUI-2
25
Results (PHY layer): Reward metric for the learning automaton
26
Results (Learning): Convergence curve, LRI (rate = 0.0015)
27
Results (Learning): Convergence curve, Pursuit (rate = 0.0017)
28
Results (Learning): Convergence curve, LRI (rate = 0.0032)
29
Results (Learning, 4 actions per threshold): Convergence curve, LRI (rate = 0.0015)
30
Conclusions
  • Our plots suggest the following
  • Learning methods are indeed capable of arriving
    at the optimal values for parameters in the type
    of channel conditions faced in WirelessMAN.
  • The rate of convergence depends on
  • the rate factor (λ),
  • the size of the action set,
  • how much the actions differ in the reward they
    get from the environment, and
  • the learning algorithm.
  • Although we have worked with a relatively simple
    setup, with the assumption that the SNR estimate
    is perfect and always available, the complete
    generality of the action set ensures that we can
    work with other channel-estimation parameters as
    well.

31
References
  • V. Erceg and K. V. Hari, "Channel Models for
    Fixed Wireless Applications," IEEE 802.16
    Broadband Wireless Access Working Group, 2001.
  • D. S. Baum, "Simulating the SUI Models," IEEE
    802.16 Broadband Wireless Access Working Group,
    2000.
  • M. A. L. Thathachar and P. S. Sastry, Networks of
    Learning Automata: Techniques for Online
    Stochastic Optimization, Kluwer Academic
    Publishers, 2003.

32
Thanks