Title: Lecture VI: Adaptive Systems


1
Lecture VI: Adaptive Systems
  • Zhixin Liu
  • Complex Systems Research Center,
  • Academy of Mathematics and Systems Sciences, CAS

2
In the last lecture, we talked about
  • Game Theory
  • An embodiment of the complex interactions among
    individuals
  • Nash equilibrium
  • Evolutionarily stable strategy

3
In this lecture, we will talk about
  • Adaptive Systems

4
Adaptation
  • To adapt: to change oneself to conform to a new
    or changed circumstance.
  • What do we know from the new circumstance?
    Adaptive estimation, learning, identification.
  • How should we respond?
    Control and decision making.

5
Why Adaptation?
  • Uncertainties always exist in modeling of
    practical systems.
  • Adaptation can reduce the uncertainties by using
    the system information.
  • Adaptation is an important embodiment of human
    intelligence.

6
Framework of Adaptive Systems
Environment

7
Two levels of adaptation
  • Individual level: individuals learn and adapt
  • Population level
  • Death of old individuals
  • Creation of new individuals
  • Hierarchy

8
Some Examples
  • Adaptive control systems:
    adaptation in a single agent
  • Iterated prisoner's dilemma:
    adaptation among agents

10
Adaptation In A Single Agent
Environment

11
Information
[Figure: a dynamical system with input ut, output
yt, and disturbance wt]
  • Information = prior + posterior = I0 + I1
  • I0: prior knowledge about the system
  • I1: posterior knowledge about the system,
    {u0, u1, ..., ut; y0, y1, ..., yt} (observations)
  • The posterior information can be used to reduce
    the uncertainties of the system.
12
Uncertainty
[Figure: a model with external uncertainty and
internal uncertainty]
  • External uncertainty: noise/disturbance
  • Internal uncertainty:
    parameter uncertainty, signal uncertainty,
    functional uncertainty

13
Adaptation
  • To adapt: to change oneself to conform to a new
    or changed circumstance.
  • What do we know from the new circumstance?
    Adaptive estimation, learning, identification.
  • How should we respond?
    Control and decision making.

14
  • Adaptive Estimation

15
Adaptive Estimation
  • Adaptive estimation: a parameter or structure
    estimator that is updated based on the
    on-line observations.

[Figure: the input ut drives the system, whose
output yt is compared with the prediction of the
adaptive estimator; the prediction error e is fed
back to the estimator.]
  • Example: in the parametric case, the parameter
    estimator can be obtained by minimizing a certain
    prediction error.

16
Adaptive Estimation
  • Parameter estimation
  • Consider a linear regression model, which
    relates the output to an unknown parameter
    vector, a regression vector, and a noise
    sequence.
  • Remarks:
  • A linear regression model may be nonlinear in
    the signals; it is linear only in the parameters.
  • A linear system can be rewritten as a linear
    regression model.
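The model equation itself is not preserved in this transcript. A plausible reconstruction, assuming the standard form used in the Lei Guo references listed at the end of the lecture (the symbols \theta, \varphi_t and w_t are assumed names for the unknown parameter vector, the regression vector, and the noise sequence):

```latex
y_{t+1} = \theta^{\top} \varphi_t + w_{t+1}, \qquad t \ge 0 .
% Illustration of the second remark: the scalar linear system
%   y_{t+1} = a\, y_t + b\, u_t + w_{t+1}
% fits this form with
\theta = \begin{bmatrix} a \\ b \end{bmatrix}, \qquad
\varphi_t = \begin{bmatrix} y_t \\ u_t \end{bmatrix} .
```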

17
Least Squares (LS) Algorithm
  • Gauss, 1795: the least squares algorithm
  • The number of equations is greater than the
    number of unknown parameters.
  • The data contain noise.
  • Minimize the following prediction error

18
Recursive Form of LS
  • where Pt is the following estimation covariance
    matrix
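The recursion itself is not preserved in this transcript. As a minimal sketch, assuming the standard textbook form of recursive least squares for a regression y(t+1) = theta' phi(t) + w(t+1), the estimate and the matrix Pt can be updated one observation at a time. The names `rls_step`, `theta`, `P`, `phi` and the noise-free test data are hypothetical and chosen only for illustration.

```python
import math


def rls_step(theta, P, phi, y_next):
    """One recursive least-squares update (standard form, assumed here).

    theta_{t+1} = theta_t + a_t * (P_t phi_t) * (y_{t+1} - phi_t' theta_t)
    P_{t+1}     = P_t - a_t * (P_t phi_t)(P_t phi_t)'
    a_t         = 1 / (1 + phi_t' P_t phi_t)
    """
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(n))
    error = y_next - sum(theta[i] * phi[i] for i in range(n))
    theta_new = [theta[i] + P_phi[i] * error / denom for i in range(n)]
    # P is symmetric, so P phi phi' P equals the outer product of P_phi.
    P_new = [[P[i][j] - P_phi[i] * P_phi[j] / denom for j in range(n)]
             for i in range(n)]
    return theta_new, P_new


# Illustrative data: a known parameter vector and noise-free observations.
true_theta = [0.5, 2.0]
theta = [0.0, 0.0]                      # initial estimate
P = [[1e6, 0.0], [0.0, 1e6]]            # large P_0 = weak prior knowledge
for t in range(50):
    phi = [math.sin(t), math.cos(2 * t)]
    y_next = true_theta[0] * phi[0] + true_theta[1] * phi[1]
    theta, P = rls_step(theta, P, phi, y_next)

print(theta)  # close to [0.5, 2.0]
```

Each step costs a fixed amount of work regardless of how many observations have been seen, which is the point of the recursive form.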

A basic problem
19
Recursive Form of LS
Theorem (T.L. Lai and C.Z. Wei)
Under the above assumption, if the following
condition holds,
then the LS estimate is strongly consistent.
20
Weighted Least Squares (WLS)
  • Minimize the following prediction error
  • Recursive form of WLS

21
Self-Convergence of WLS
  • Take the weight as follows

Theorem: Under Assumption 1, for any initial
value and any regression vector, the WLS estimate
will converge to some vector almost surely.
Lei Guo, 1996, IEEE TAC
22
Adaptation
  • To adapt: to change oneself to conform to a new
    or changed circumstance.
  • What do we know from the new circumstance?
    Adaptive estimation, learning, identification.
  • How should we respond?
    Control and decision making.

23
  • Adaptive Control

24
Adaptive Control
Adaptive control: a controller with adjustable
parameters (or structures) together with a
mechanism for adjusting them.
[Figure: block diagram with reference r, adaptive
controller, control input u, plant, output y, and
adaptive estimator]
25
Robust Control
  • Model = Nominal + Ball (of radius r)
  • Cannot reduce uncertainty!
26
Adaptive Control
  • An example
  • Consider the following linear regression model

where a and b are unknown parameters, and yt,
ut, and wt are the output, input, and white
noise sequences, respectively.
Objective: design a control law to minimize the
following average tracking error
27
Adaptive Control
  • If (a,b) is known, we can get the optimal
    controller

Certainty Equivalence Principle: replace
the unknown parameters in a non-adaptive
controller by their online estimates.
If (a,b) is unknown, the adaptive controller can
be taken as
28
Adaptive Control
  • If (a,b) is unknown, the adaptive controller can
    be taken as

where (at, bt), the estimate of (a, b) at time t,
can be obtained by LS
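The plant and controller equations are not preserved in this transcript. The sketch below assumes the scalar plant y(t+1) = a*y(t) + b*u(t) + w(t+1) and the certainty-equivalence controller u(t) = (r - at*y(t)) / bt for a constant reference r, with (at, bt) updated by recursive least squares. The function names, the initial guesses, and the guard that keeps bt away from zero are all assumptions made for illustration, not part of the lecture.

```python
import random


def rls_update(theta, P, phi, y_next):
    """Standard recursive least-squares step for two parameters (assumed form)."""
    P_phi = [P[0][0] * phi[0] + P[0][1] * phi[1],
             P[1][0] * phi[0] + P[1][1] * phi[1]]
    denom = 1.0 + phi[0] * P_phi[0] + phi[1] * P_phi[1]
    error = y_next - (theta[0] * phi[0] + theta[1] * phi[1])
    theta_new = [theta[i] + P_phi[i] * error / denom for i in range(2)]
    P_new = [[P[i][j] - P_phi[i] * P_phi[j] / denom for j in range(2)]
             for i in range(2)]
    return theta_new, P_new


def simulate_adaptive_control(a, b, sigma, steps, seed=0, reference=1.0):
    """Return the tracking errors y(t+1) - reference of the closed loop."""
    rng = random.Random(seed)
    theta = [0.0, 1.0]                       # initial guesses for (a, b)
    P = [[100.0, 0.0], [0.0, 100.0]]
    y = 0.0
    errors = []
    for _ in range(steps):
        a_hat, b_hat = theta
        # Hypothetical safeguard: avoid dividing by an estimate near zero.
        if abs(b_hat) < 1e-3:
            b_hat = 1e-3 if b_hat >= 0 else -1e-3
        u = (reference - a_hat * y) / b_hat  # certainty-equivalence control
        y_next = a * y + b * u + rng.gauss(0.0, sigma)
        theta, P = rls_update(theta, P, [y, u], y_next)
        errors.append(y_next - reference)
        y = y_next
    return errors


noise_free_errors = simulate_adaptive_control(a=0.8, b=1.0, sigma=0.0, steps=60)
print(max(abs(e) for e in noise_free_errors[-20:]))  # small once the estimates settle
```

Calling `simulate_adaptive_control(0.8, 1.0, 0.1, 300)` runs the same loop with noise; the controller never sees the true (a, b), only the running estimates.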
29
Adaptive Control
  • The closed-loop system

30
Theoretical Problems
  • a) Stability
  • b) Optimality
31
Theoretical Obstacles
  • Controller

32
Theoretical Obstacles
  • 1) The closed-loop system is a very complicated
    nonlinear stochastic dynamical system.
  • 2) No useful statistical properties of the
    system signals, such as stationarity or
    independence, are available.
  • 3) No properties of (at, bt) are known a priori.
33
Theorem
Assumptions:
  • 1) The noise sequence is a martingale difference
    sequence, and there exists a constant such that
  • 2) The regression vector is an adapted
    sequence, i.e.,
  • 3) The reference signal is a deterministic
    bounded signal.
  • Theorem
  • Under the above assumptions, the closed-loop
    system is stable and optimal.

Lei Guo, Automatica, 1995
34
Some Examples
  • Adaptive control systems:
    adaptation in a single agent
  • Iterated prisoner's dilemma:
    adaptation among agents

35
Prisoner's Dilemma
  • The story of the prisoner's dilemma
  • Players: two prisoners
  • Actions: cooperate (C), defect (D)
  • Payoff matrix (payoff to A, payoff to B):

                   Prisoner B
                   C        D
  Prisoner A  C  (3,3)    (0,5)
              D  (5,0)    (1,1)
36
Prisoner's Dilemma
  • No matter what the other does, the best choice
    is D.
  • (D,D) is a Nash equilibrium.
  • But if both choose D, both will do worse than
    if both select C.

                   Prisoner B
                   C        D
  Prisoner A  C  (3,3)    (0,5)
              D  (5,0)    (1,1)
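The two claims on this slide can be checked mechanically from the payoff matrix. The sketch below is only an illustration; the names `PAYOFF`, `is_nash` and `d_dominates` are hypothetical.

```python
from itertools import product

ACTIONS = ("C", "D")

# PAYOFF[(a, b)] = (payoff to prisoner A, payoff to prisoner B),
# copied from the payoff matrix on the slide.
PAYOFF = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}


def is_nash(a, b):
    """True if neither player gains by unilaterally changing its action."""
    pay_a, pay_b = PAYOFF[(a, b)]
    a_cannot_improve = all(PAYOFF[(x, b)][0] <= pay_a for x in ACTIONS)
    b_cannot_improve = all(PAYOFF[(a, y)][1] <= pay_b for y in ACTIONS)
    return a_cannot_improve and b_cannot_improve


nash_equilibria = [(a, b) for a, b in product(ACTIONS, repeat=2) if is_nash(a, b)]

# D strictly dominates C for A if it pays more against every action of B.
d_dominates = all(PAYOFF[("D", b)][0] > PAYOFF[("C", b)][0] for b in ACTIONS)

print(nash_equilibria)  # [('D', 'D')]
print(d_dominates)      # True
```

Mutual defection is the only equilibrium even though (C,C) pays each player 3 instead of 1, which is exactly the dilemma.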
37
Iterated Prisoner's Dilemma
  • The individuals
  • Meet many times
  • Can recognize a previous interactant
  • Remember the prior outcome
  • Strategy: specifies the probabilities of
    cooperation and defection based on the history
  • P(C) = f1(History)
  • P(D) = f2(History)

38
Strategies
  • Tit For Tat - Cooperate on the first move, then
    repeat the opponent's last choice.
    Player A (Tit For Tat): C D D C C C C C D D D D C
    Player B:               D D C C C C C D D D D C
  • Tit For Tat and Random - Repeat opponent's last
    choice skewed by random setting.
  • Tit For Two Tats and Random - Like Tit For Tat
    except that opponent must make the same choice
    twice in a row before it is reciprocated. Choice
    is skewed by random setting.
  • Tit For Two Tats - Like Tit For Tat except that
    opponent must make the same choice twice in a row
    before it is reciprocated.
  • Naive Prober (Tit For Tat with Random Defection)
    - Repeat opponent's last choice (i.e. Tit For Tat),
    but sometimes probe by defecting in lieu of
    cooperating.
  • Remorseful Prober (Tit For Tat with Random
    Defection) - Repeat opponent's last choice (i.e.
    Tit For Tat), but sometimes probe by defecting in
    lieu of cooperating. If the opponent defects in
    response to probing, show remorse by cooperating
    once.
  • Naive Peace Maker (Tit For Tat with Random
    Co-operation) - Repeat opponent's last choice (i.e.
    Tit For Tat), but sometimes make peace by
    co-operating in lieu of defecting.
  • True Peace Maker (hybrid of Tit For Tat and Tit
    For Two Tats with Random Cooperation) - Cooperate
    unless opponent defects twice in a row, then
    defect once, but sometimes make peace by
    cooperating in lieu of defecting.
  • Random - always set at 50% probability
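A few of the deterministic strategies above can be written as small functions of the two move histories. This is a sketch under assumptions: the function names and the `play` helper are hypothetical, Tit For Two Tats is read in its common form (defect only after two consecutive defections), and the payoffs are the ones from the Prisoner's Dilemma slides.

```python
# PAYOFF[(a, b)] = (payoff to A, payoff to B)
PAYOFF = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}


def tit_for_tat(own, opp):
    """Cooperate first, then repeat the opponent's last choice."""
    return opp[-1] if opp else "C"


def tit_for_two_tats(own, opp):
    """Defect only after the opponent has defected twice in a row."""
    return "D" if opp[-2:] == ["D", "D"] else "C"


def always_defect(own, opp):
    return "D"


def scripted(moves):
    """Build a strategy that plays a fixed sequence of moves."""
    def strategy(own, opp):
        return moves[len(own)]
    return strategy


def play(strategy_a, strategy_b, rounds):
    """Play an iterated game; return both move strings and both scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return "".join(hist_a), "".join(hist_b), score_a, score_b


print(play(tit_for_tat, always_defect, 5))       # ('CDDDD', 'DDDDD', 4, 9)
print(play(tit_for_two_tats, always_defect, 5))  # ('CCDDD', 'DDDDD', 3, 13)
# Player B's moves from the slide, answered by Tit For Tat:
print(play(tit_for_tat, scripted("DDCCCCCDDDDC"), 12)[0])  # CDDCCCCCDDDD
```

The last line reproduces the first twelve moves of Player A in the example sequence on the slide.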

40
Strategies
  • Always Defect
  • Always Cooperate
  • Grudger (Co-operate, but only be a sucker once) -
    Cooperate until the opponent defects. Then always
    defect unforgivingly.
  • Pavlov (repeat last choice if good outcome) - If
    5 or 3 points scored in the last round then
    repeat last choice.
  • Pavlov / Random (repeat last choice if good
    outcome and Random) - If 5 or 3 points scored in
    the last round then repeat last choice, but
    sometimes make random choices.
  • Adaptive - Starts with c,c,c,c,c,c,d,d,d,d,d and
    then takes choices which have given the best
    average score re-calculated after every move.
  • Gradual - Cooperates until the opponent defects,
    in such case defects the total number of times
    the opponent has defected during the game.
    Followed up by two co-operations.
  • Suspicious Tit For Tat - As for Tit For Tat
    except begins by defecting.
  • Soft Grudger - Cooperates until the opponent
    defects, in such case opponent is punished with
    d,d,d,d,c,c.
  • Customised strategy 1 - default setting is T=1,
    P=1, R=1, S=0, B=1: always co-operate unless
    sucker (i.e. 0 points scored).
  • Customised strategy 2 - default setting is T=1,
    P=1, R=0, S=0, B=0: always play alternating
    defect/cooperate.

41
Iterated Prisoner's Dilemma
  • Which strategy can thrive? What is a good
    strategy?
  • Robert Axelrod, 1980s
  • A computer round-robin tournament
  • The first round
  • The second round

Axelrod, R. 1987. The evolution of strategies in
the iterated Prisoner's Dilemma. In L. Davis,
editor, Genetic Algorithms and Simulated
Annealing. Morgan Kaufmann, Los Altos, CA.
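A round-robin tournament can be sketched in a few lines. This is a toy version with four deterministic strategies and 10 moves per game, not a reproduction of Axelrod's tournament; the strategy set, the game length, and the choice to let each strategy also play a copy of itself are assumptions of this example.

```python
from itertools import combinations_with_replacement

# PAYOFF[(a, b)] = (payoff to A, payoff to B)
PAYOFF = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}


def tit_for_tat(own, opp):
    return opp[-1] if opp else "C"


def always_cooperate(own, opp):
    return "C"


def always_defect(own, opp):
    return "D"


def grudger(own, opp):
    """Cooperate until the opponent defects once, then always defect."""
    return "D" if "D" in opp else "C"


STRATEGIES = {
    "TFT": tit_for_tat,
    "ALLC": always_cooperate,
    "ALLD": always_defect,
    "Grudger": grudger,
}


def match(strategy_a, strategy_b, rounds):
    """Return the total scores of both players over one iterated game."""
    hist_a, hist_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        total_a += pay_a
        total_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total_a, total_b


def round_robin(strategies, rounds):
    """Every strategy meets every strategy, including a copy of itself."""
    totals = {name: 0 for name in strategies}
    for name_a, name_b in combinations_with_replacement(strategies, 2):
        total_a, total_b = match(strategies[name_a], strategies[name_b], rounds)
        totals[name_a] += total_a
        if name_a != name_b:
            totals[name_b] += total_b
    return totals


print(round_robin(STRATEGIES, 10))
# {'TFT': 99, 'ALLC': 90, 'ALLD': 88, 'Grudger': 99}
```

In this small field Always Defect wins every individual game it plays against another strategy yet finishes last overall, because it earns little against the strategies that retaliate.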
42
Characteristics of Good Strategies
  • Goodness: never defect first
  • First round: the top eight strategies all
    have goodness
  • Second round: fourteen of the top fifteen
    strategies have goodness
  • Forgiveness: may retaliate, but the memory is
    short.
  • Grudger is not a forgiving strategy.
  • Goodness and forgiveness are a kind of
    collective behavior.
  • For a single agent, defection is the best
    strategy.

43
Evolution of the Strategies
  • Evolve good strategies by genetic algorithm
    (GA)

44
Some Notations in GA
  • String: an individual; it represents the
    chromosome in genetics.
  • Population: the set of individuals
  • Population size: the number of individuals
  • Gene: an element of the string
  • E.g., S = 1011, where 1, 0, 1, 1 are called
    genes.
  • Fitness: how well the individual is adapted to
    its circumstances

From Jing Han's PPT
45
How Does GA Work?
  • Represent a solution of the problem by a
    chromosome, i.e., a string.
  • Randomly generate some chromosomes as the
    initial solutions.
  • According to the principle of "survival of the
    fittest", chromosomes with high fitness
    reproduce; crossover and mutation then produce
    the new generation.
  • The chromosome with the highest fitness may be
    the solution of the problem.

From Jing Han's PPT
46
GA
[Figure: natural selection creates a new
generation through crossover and mutation]
  • choose an initial population
  • determine the fitness of each individual
  • perform selection
  • repeat
      • perform crossover
      • perform mutation
      • determine the fitness of each individual
      • perform selection
  • until some stopping criterion applies

From Jing Han's PPT
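The loop above can be sketched on a toy problem. The choices below are assumptions made for illustration and do not come from the lecture: the fitness function counts the ones in a bit string ("OneMax"), selection is fitness-proportional, crossover is single-point, the population size is even, and the stopping criterion is a fixed number of generations.

```python
import random


def fitness(chromosome):
    """Toy fitness (assumption): the number of 1 genes in the string."""
    return sum(chromosome)


def select(population, rng):
    """Fitness-proportional selection; +1 keeps every weight positive."""
    weights = [fitness(c) + 1 for c in population]
    return rng.choices(population, weights=weights, k=len(population))


def crossover(parent_1, parent_2, rng):
    """Single-point crossover producing two offspring."""
    cut = rng.randrange(1, len(parent_1))
    return parent_1[:cut] + parent_2[cut:], parent_2[:cut] + parent_1[cut:]


def mutate(chromosome, rate, rng):
    """Flip each gene independently with a small probability."""
    return [1 - g if rng.random() < rate else g for g in chromosome]


def run_ga(length=20, pop_size=20, generations=50, rate=0.01, seed=0):
    """Return the best chromosome seen and the best-so-far fitness history."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):          # stopping criterion: fixed budget
        parents = select(population, rng)
        children = []
        for i in range(0, pop_size - 1, 2):
            child_1, child_2 = crossover(parents[i], parents[i + 1], rng)
            children.append(mutate(child_1, rate, rng))
            children.append(mutate(child_2, rate, rng))
        population = children
        best = max(population + [best], key=fitness)
        history.append(fitness(best))
    return best, history


best, history = run_ga()
print(fitness(best), "out of", len(best))
```

Because `best` is tracked across generations, `history` never decreases even though the population itself has no elitism.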
47
Some Remarks On GA
  • GA searches for the optimal solution starting
    from a set of solutions rather than from a
    single solution.
  • The search space is large: {0,1}^L
  • GA is a randomized algorithm, since selection,
    crossover and mutation are all random operations.
  • Suitable for the following situations:
  • There is structure in the search space but it is
    not well understood.
  • The inputs are non-stationary (i.e., the
    environment is changing).
  • The goal is not global optimization, but finding
    a reasonably good solution quickly.

48
Evolution of Strategies By GA
  • Each chromosome represents one strategy
  • The strategy is deterministic and is determined
    by the previous moves.
  • E.g., if the strategy is determined by a
    one-step history, there are four possible
    histories:
    Player I:   C  D  D  C
    Player II:  D  D  C  C
  • The number of possible strategies is
    2×2×2×2 = 16.
  • TFT: F(CC)=C, F(CD)=D, F(DC)=C, F(DD)=D
  • Always cooperate: F(CC)=F(CD)=F(DC)=F(DD)=C
  • Always defect: F(CC)=F(CD)=F(DC)=F(DD)=D
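The one-step encoding can be made concrete: a chromosome is a string of four genes, one per history, read in the order CC, CD, DC, DD (own last move first, opponent's last move second). The gene ordering and the names `HISTORIES` and `decode` are assumptions of this sketch.

```python
from itertools import product

# History = (own last move, opponent's last move), in a fixed gene order.
HISTORIES = ("CC", "CD", "DC", "DD")


def decode(chromosome):
    """Turn a 4-gene string such as 'CDCD' into the rule table F."""
    if len(chromosome) != len(HISTORIES) or set(chromosome) - {"C", "D"}:
        raise ValueError("chromosome must be four genes drawn from C and D")
    return dict(zip(HISTORIES, chromosome))


# Enumerate every chromosome: 2 choices for each of the 4 genes.
ALL_CHROMOSOMES = ["".join(genes) for genes in product("CD", repeat=len(HISTORIES))]

TFT = decode("CDCD")               # F(CC)=C, F(CD)=D, F(DC)=C, F(DD)=D
ALWAYS_COOPERATE = decode("CCCC")
ALWAYS_DEFECT = decode("DDDD")

print(len(ALL_CHROMOSOMES))  # 16
print(TFT["CD"])             # D
```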

49
Evolution of the Strategies
  • Strategies use the outcomes of the three previous
    moves to determine the current move.
  • The number of possible histories is
    4×4×4 = 64.
    Player I:   CCC CCD CDC CDD DCC DCD ... DDD DDD
    Player II:  CCC CCC CCC CCC CCC CCC ... DDC DDD
[Figure: an example chromosome, a string of C and
D genes]
  • The initial premise is three hypothetical moves.
  • The length of the chromosome is 70.
  • The total number of strategies is 2^70 ≈ 10^21.
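The counts on this slide can be checked with a few lines of arithmetic. One assumption is labeled in the code: the 6 extra genes are taken to encode the three hypothetical initial moves of both players, which is consistent with the stated chromosome length of 70.

```python
outcomes_per_move = 4                      # CC, CD, DC, DD
histories = outcomes_per_move ** 3         # three previous moves
premise_genes = 2 * 3                      # assumption: 3 hypothetical moves x 2 players
chromosome_length = histories + premise_genes
number_of_strategies = 2 ** chromosome_length

print(histories)                       # 64
print(chromosome_length)               # 70
print(number_of_strategies)            # 1180591620717411303424
print(f"{number_of_strategies:.2e}")   # 1.18e+21
```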

50
Evolution of good strategy
  • Five steps of evolving good strategies by GA:
  • 1) An initial population is chosen.
  • 2) Each individual is run in the current
    environment to determine its effectiveness.
  • 3) The relatively successful individuals are
    selected to have more offspring.
  • 4) The successful individuals are randomly paired
    off to produce two offspring per mating.
    Crossover: a way of constructing the chromosomes
    of the two offspring from the chromosomes of the
    two parents.
    Mutation: randomly changing a very small
    proportion of the Cs to Ds and vice versa.
  • 5) A new population is generated.

51
Evolution of the Strategies
  • Some parameters
  • The population size in each generation is 20.
  • Each game consists of 151 moves.
  • Each individual meets eight representatives,
    which makes about 24,000 moves per generation.
  • A run consists of 50 generations.
  • Forty runs were conducted.
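The figure of about 24,000 moves follows directly from the parameters above, assuming one game per individual per representative:

```latex
\underbrace{20}_{\text{individuals}} \times
\underbrace{8}_{\text{representatives}} \times
\underbrace{151}_{\text{moves per game}}
= 24{,}160 \approx 24{,}000 \ \text{moves per generation.}
```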

52
Results
  • The median member is as successful as TFT
  • Most of the strategies resemble TFT.
  • Some of them share the following patterns with
    TFT:
  • Do not rock the boat: continue to cooperate after
    mutual cooperation.
  • Be provocable: defect when the other player
    defects out of the blue.
  • Accept an apology: continue to cooperate after
    cooperation has been restored.
  • Forget: cooperate when mutual cooperation has
    been restored after an exploitation.
  • Accept a rut: defect after three mutual
    defections.

53
What is a good strategy?
  • Is TFT a good strategy?
  • Tit For Two Tats may be the best strategy in the
    first round, but it is not a good strategy in the
    second round.
  • Good strategy depends on other strategies,
    i.e., environment.

Evolutionarily stable strategy
54
Evolutionarily stable strategy (ESS)
  • Introduced by John Maynard Smith and George R.
    Price in 1973
  • ESS stands for evolutionarily stable strategy: a
    strategy such that, if all members of the
    population adopt it, no mutant strategy
    can invade the population under the influence
    of natural selection.
  • An ESS is robust under evolution: it cannot be
    invaded by mutants.

John Maynard Smith, Evolution and the Theory of
Games
55
Definition of ESS
  • A strategy x is an ESS if, for all y ≠ x, the
    following inequality
  • holds for small positive ε.
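The inequality itself is not preserved in this transcript. A standard way to write the definition, assuming u(a, b) denotes the payoff to strategy a when played against a population using (mixed) strategy b, is:

```latex
u\bigl(x,\ \varepsilon y + (1-\varepsilon)x\bigr)
\;>\;
u\bigl(y,\ \varepsilon y + (1-\varepsilon)x\bigr)
\qquad \text{for all sufficiently small } \varepsilon > 0 .
% When u is linear in its second argument, this is equivalent to:
%   either  u(x,x) > u(y,x),
%   or      u(x,x) = u(y,x)  and  u(x,y) > u(y,y).
```

In words: a small fraction ε of mutants playing y earns strictly less than the incumbents playing x in the mixed population.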

56
ESS in IPD
  • Tit For Tat cannot be invaded by wily
    strategies, such as Always Defect.
  • TFT can be invaded by goodness strategies, such
    as Always Cooperate, Tit For Two Tats and
    Suspicious Tit For Tat.
  • Tit For Tat is not a strict ESS.
  • Always Cooperate can be invaded by Always
    Defect.
  • Always Defect is an ESS.

57
Other Adaptive Systems
  • Complex adaptive systems
  • John Holland, Hidden Order, 1996
  • Examples:
  • stock markets, social insects, ant colonies,
    the biosphere, the brain, the immune system,
    cells, developing embryos, ...
  • Evolutionary algorithms:
  • genetic algorithms, neural networks, ...

58
References
  • Lei Guo, Self-convergence of weighted
    least-squares with applications to stochastic
    adaptive control, IEEE Trans. Automat. Contr.,
    41(1): 79-89, 1996.
  • Lei Guo, Convergence and logarithm laws of
    self-tuning regulators, Automatica, 31(3):
    435-450, 1995.
  • Lei Guo, Adaptive systems theory: some basic
    concepts, methods and results, Journal of Systems
    Science and Complexity, 16(3): 293-306.
  • Drew Fudenberg and Jean Tirole, Game Theory, The
    MIT Press, 1991.
  • Robert Axelrod, The evolution of strategies in
    the iterated Prisoner's Dilemma. In L. Davis,
    editor, Genetic Algorithms and Simulated
    Annealing. Morgan Kaufmann, Los Altos, CA, 1987.
  • Richard Dawkins, The Selfish Gene, Oxford
    University Press.
  • John Holland, Hidden Order, 1996.

59
  • Adaptation in games

Adaptation in a single agent
60
Summary
In these six lectures, we have talked about
  • Complex Networks
  • Collective Behavior of MAS
  • Game Theory
  • Adaptive Systems
61
Summary
In these six lectures, we have talked about
  • Complex Networks: Topology
  • Collective Behavior of MAS
  • Game Theory
  • Adaptive Systems
62
Three concepts
  • Average path length ⟨l⟩,
    where dij is the shortest distance between i
    and j.
  • Clustering coefficient C = ⟨C(i)⟩
  • Degree distribution:
    P(k) = probability that a randomly chosen
    node i has exactly k neighbors

  • Short average path length
  • Large clustering coefficient
  • Power-law degree distribution
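The three quantities can be computed directly on a small graph. The formulas in the code are the usual textbook definitions and are assumptions here, since the slide's own formulas are not preserved: the average path length is the mean of dij over all pairs of distinct nodes in a connected graph, and C(i) is the fraction of pairs of neighbors of i that are themselves linked (taken as 0 for nodes with fewer than two neighbors). The example graph is made up.

```python
from collections import Counter, deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_path_length(adj):
    """Mean of d_ij over ordered pairs i != j; assumes a connected graph."""
    nodes = list(adj)
    total = 0
    for s in nodes:
        dist = bfs_distances(adj, s)
        total += sum(dist[t] for t in nodes if t != s)
    n = len(nodes)
    return total / (n * (n - 1))


def clustering_coefficient(adj):
    """Average over nodes of the local clustering coefficient C(i)."""
    values = []
    for node, neighbors in adj.items():
        k = len(neighbors)
        if k < 2:
            values.append(0.0)
            continue
        links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
        values.append(2 * links / (k * (k - 1)))
    return sum(values) / len(values)


def degree_distribution(adj):
    """P(k): fraction of nodes that have exactly k neighbors."""
    counts = Counter(len(neighbors) for neighbors in adj.values())
    return {k: c / len(adj) for k, c in counts.items()}


# Made-up example: a triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}

print(average_path_length(adj))     # 4/3
print(clustering_coefficient(adj))  # 7/12
print(degree_distribution(adj))
```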
63
Regular Graphs
  • Regular graphs: graphs where each vertex has
    the same number of neighbors.
  • Examples

64
Random Graph
  • ER random graph model G(N,p)
  • Given N nodes
  • Add an edge between each pair of nodes
    independently with probability p
  • Homogeneous nature: each node has roughly the
    same number of edges

65
Small World Networks
  • WS model

Introduce pNK/2 long-range edges
A few long-range links are sufficient to
decrease l, but will not significantly change C.
66
Scale Free Networks
  • Some observations
  • A breakthrough: Barabási and Albert, 1999, Science
  • Generating process of the BA model:
  • 1) Start with a network of m0 nodes.
  • 2) Growth: at each step, add a new node
    with m (≤ m0) edges that link the new node to m
    different nodes already present in the network.
  • 3) Preferential attachment: when choosing
    the nodes to which the new node connects, we
    assume that the probability Π that the new node
    will be connected to node i depends on the
    degree ki of node i, such that
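The attachment rule is not preserved in this transcript; the usual form in the BA model is Π(ki) = ki / Σj kj, which is assumed in the sketch below. Two further assumptions of this example: the initial network is a complete graph on m0 nodes (so every initial degree is positive), and the m distinct targets are drawn by repeated sampling, which makes the probabilities only approximately proportional to degree after the first pick. The function name `barabasi_albert` is hypothetical.

```python
import random


def barabasi_albert(n, m, m0, seed=0):
    """Grow a network to n nodes; return its list of undirected edges."""
    if m0 < 2 or not 1 <= m <= m0 or n < m0:
        raise ValueError("need 1 <= m <= m0, m0 >= 2 and n >= m0")
    rng = random.Random(seed)
    # Assumed starting point: a complete graph on nodes 0 .. m0-1.
    edges = [(i, j) for i in range(m0) for j in range(i + 1, m0)]
    # Each node appears in `stubs` once per incident edge, so a uniform draw
    # from `stubs` picks node i with probability k_i / sum_j k_j.
    stubs = [node for edge in edges for node in edge]
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:          # m different existing nodes
            targets.add(rng.choice(stubs))
        for target in targets:
            edges.append((new, target))
            stubs.extend((new, target))
    return edges


edges = barabasi_albert(n=50, m=2, m0=3, seed=1)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1

print(len(edges))            # 97: 3 initial edges + 47 new nodes x 2 edges
print(max(degree.values()))  # early nodes tend to collect the most links
```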

67
Summary
In these six lectures, we have talked about
  • Complex Networks: Topology
  • Collective Behavior of MAS: More is different
  • Game Theory
  • Adaptive Systems
68
Multi-Agent System (MAS)
  • MAS
  • Many agents
  • Local interactions between agents
  • Collective behavior in the population level
  • "More is different." (Philip Anderson, 1972)
  • e.g., phase transition, coordination,
    synchronization, consensus, clustering,
    aggregation, ...
  • Examples
  • Physical systems
  • Biological systems
  • Social and economic systems
  • Engineering systems

69
Vicsek Model
Neighbors
70
Theorem 2 (Jadbabaie et al., 2003)
Joint connectivity of the neighbor graphs on each
time interval [th, (t+1)h], with h > 0,
implies synchronization of the linearized Vicsek
model.
Related result: J.N. Tsitsiklis et al., IEEE
TAC, 1984
71
Theorem 7: High Density Implies Synchronization
  • For any given system parameters, when the
    number of agents n is large, the Vicsek model
    will synchronize almost surely.

This theorem is consistent with the simulation
result.
72
Theorem 8: High density with short-distance
interaction
Let
and the velocity satisfy
Then, for a large population, the MAS will
synchronize almost surely.
73
Soft Control
  • Key points
  • Different from the distributed control approach:
    intervention in the distributed system.
  • Do not change the local rule of the existing
    agents.
  • Add one (or a few) special agents, called shills,
    that use the system state information to
    intervene in the collective behavior.
  • The shill is controlled by us, but is treated
    as an ordinary agent by all other agents.
  • A shill is not a leader; this is not a
    leader-follower model.
  • Feedback intervention by shill(s).

This page is very important!
From Jing Han's PPT
74
Leader-Follower Model
  • Key points
  • Not to change the local rule of the existing
    agents.
  • Add some (usually not very few) information
    agents, called leaders, to control or
    intervene in the MAS; the existing agents
    treat them as ordinary agents.
  • The proportion of the leaders is controlled by us
    (if the number of leaders is small,
    connectivity may not be guaranteed).
  • Open-loop intervention by leaders.

75
Summary
In these six lectures, we have talked about
  • Complex Networks: Topology
  • Collective Behavior of MAS: More is different
  • Game Theory: Interactions
  • Adaptive Systems
76
Definition of Nash Equilibrium
  • Nash Equilibrium (NE): a solution concept of a
    game
  • (N, S, u): a game
  • Si: strategy set for player i
  • S: set of strategy profiles
  • u: payoff function
  • s-i: strategy profile of all players except
    player i
  • A strategy profile s is called a Nash
    equilibrium if
  • where si is any pure strategy of player i.
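The defining inequality is not preserved in this transcript. In the notation of this slide, the standard condition reads as follows (the star marking the equilibrium profile is an assumed notation):

```latex
u_i\bigl(s_i^{*},\, s_{-i}^{*}\bigr) \;\ge\; u_i\bigl(s_i,\, s_{-i}^{*}\bigr)
\qquad \text{for every player } i \text{ and every } s_i \in S_i .
```

That is, no player can increase its own payoff by deviating alone while all other players keep their equilibrium strategies.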

77
Definition of ESS
  • A strategy x is an ESS if, for all y ≠ x, the
    following inequality
  • holds for small positive ε.

78
Summary
In these six lectures, we have talked about
  • Complex Networks: Topology
  • Collective Behavior of MAS: More is different
  • Game Theory: Interactions
  • Adaptive Systems: Adaptation
79
Other Topics
  • Self-organized criticality
  • Earthquakes, fire, sand pile model, Bak-Sneppen
    model
  • Nonlinear dynamics
  • chaos, bifurcation, ...
  • Artificial life
  • Tierra model, gene pool, game of life, ...
  • Evolutionary dynamics
  • genetic algorithm, neural network, ...

80
Complex systems
  • Not a mature subject
  • No unified framework or universal methods

81
THE END