Satisfaction Equilibrium - PowerPoint PPT Presentation

About This Presentation

Title:

Satisfaction Equilibrium

Description:

Satisfaction Equilibrium. Stéphane Ross. Canadian AI 2006. Problem ... Agents generally do not know the preferences (rewards) of their opponents ...

Provided by: stphan5

Learn more at: http://www.cs.cmu.edu

Transcript and Presenter's Notes

Title: Satisfaction Equilibrium

1
Satisfaction Equilibrium

Stéphane Ross

2
Problem

In real-life multiagent systems
Agents generally do not know the preferences
(rewards) of their opponents
Agents may not observe the actions of their
opponents
In this context, most game theoretic solution
concepts are hardly applicable
We may try to define equilibrium concepts that
do not require complete information
are achievable through learning, over repeated
play

3
Plan

Game model
Satisfaction Equilibrium
Satisfaction Equilibrium Learning
Results
Conclusion
Questions

5
Game model

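$n$: number of agents
$A = A_1 \times \dots \times A_n$: joint action
space
$\Omega$: set of possible outcomes
$O : A \to \Omega$, the outcome function.
$R_i : \Omega \to \mathbb{R}$, agent $i$'s reward function.
Agent $i$ only knows $A_i$, $\Omega$ and $R_i$.
After each turn, every agent observes an outcome
$o \in \Omega$.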
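The game model above can be sketched in code. This is only an illustration under stated assumptions: the names `Game`, `AgentView`, `outcome_fn` and `reward_fns`, and the toy two-agent game itself, are hypothetical and do not come from the slides. The point it shows is the information split: the game holds the outcome function, while each agent only holds its own actions and its own reward function.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

JointAction = Tuple[str, ...]


@dataclass
class Game:
    """Hypothetical container for the game model (names are illustrative)."""
    action_sets: Sequence[Sequence[str]]          # one action set per agent
    outcome_fn: Callable[[JointAction], str]      # joint action -> outcome
    reward_fns: Sequence[Callable[[str], float]]  # outcome -> reward, per agent

    def play(self, joint_action: JointAction) -> str:
        """Return the outcome that every agent observes after the turn."""
        return self.outcome_fn(joint_action)


@dataclass
class AgentView:
    """What one agent can use: its own actions and its own reward function."""
    actions: Sequence[str]
    reward_fn: Callable[[str], float]

    def reward(self, outcome: str) -> float:
        return self.reward_fn(outcome)


def agent_view(game: Game, i: int) -> AgentView:
    """Restrict the game to the part agent i is assumed to know."""
    return AgentView(game.action_sets[i], game.reward_fns[i])


# Toy game, assumed for illustration: the outcome is the joint action
# written as a string, and each agent looks up its own reward.
toy = Game(
    action_sets=[["a", "b"], ["a", "b"]],
    outcome_fn=lambda joint: "".join(joint),
    reward_fns=[
        lambda o: {"aa": 1.0, "ab": 0.0, "ba": 0.0, "bb": 2.0}[o],
        lambda o: {"aa": 2.0, "ab": 0.0, "ba": 0.0, "bb": 1.0}[o],
    ],
)

outcome = toy.play(("b", "b"))
rewards = [agent_view(toy, i).reward(outcome) for i in range(2)]
```

An agent built on `AgentView` cannot call `outcome_fn` or see the other agent's reward function, which is why it cannot compute a best response.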

6
Game model

Observations
The agents do not know the game matrix
They are unable to compute best responses and
Nash Equilibrium.
They can only reason about their history of actions
and rewards.

8
Satisfaction Equilibrium

Since the agents can only reason about their history
of payoffs, we may adopt satisfaction-based
reasoning
If an agent is satisfied by its current reward,
it should keep playing the same strategy
An unsatisfied agent may decide to change its
strategy according to some exploration function
An equilibrium will arise when all agents are
satisfied.

9
Satisfaction Equilibrium

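Formally
$S_i : \mathbb{R} \to \{0, 1\}$ is the satisfaction
function of agent $i$
$S_i(r_i) = 1$ if $r_i \geq \sigma_i$ (agent $i$ is
satisfied)
$S_i(r_i) = 0$ if $r_i < \sigma_i$ (agent $i$ is not
satisfied)
$\sigma_i$ is the satisfaction threshold of agent $i$
A joint strategy $s$ is a satisfaction equilibrium
if $S_i(r_i) = 1$ for every agent $i$, where $r_i$ is the reward agent $i$ receives under $s$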
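A minimal sketch of this definition, assuming an agent is satisfied exactly when its reward is at least its threshold. The function names and the small reward table are hypothetical, chosen only to make the check concrete.

```python
from typing import Dict, Sequence, Tuple

JointStrategy = Tuple[str, ...]


def satisfaction(reward: float, threshold: float) -> int:
    """Return 1 if the reward meets the threshold (satisfied), else 0."""
    return 1 if reward >= threshold else 0


def is_satisfaction_equilibrium(
    joint_strategy: JointStrategy,
    rewards: Dict[JointStrategy, Sequence[float]],
    thresholds: Sequence[float],
) -> bool:
    """True when every agent is satisfied by its reward under the joint strategy."""
    return all(
        satisfaction(r, t) == 1
        for r, t in zip(rewards[joint_strategy], thresholds)
    )


# Hypothetical reward table: joint strategy -> (reward of agent 1, reward of agent 2).
REWARDS = {
    ("a", "a"): (1.0, 2.0),
    ("a", "b"): (0.0, 0.0),
    ("b", "a"): (0.0, 0.0),
    ("b", "b"): (2.0, 1.0),
}

low_bar = is_satisfaction_equilibrium(("a", "a"), REWARDS, thresholds=(1.0, 1.0))
high_bar = is_satisfaction_equilibrium(("a", "a"), REWARDS, thresholds=(2.0, 1.0))
```

The same joint strategy is an equilibrium under one set of thresholds and not under another, so the thresholds decide which outcomes count as equilibria.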

10
Example

Prisoner's dilemma
Possible satisfaction matrix

Dominant strategy: D
Nash equilibrium: (D,D)
Pareto-optimal: (C,C), (D,C), (C,D)
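The matrices on this slide did not survive in the transcript, so the sketch below builds one possible satisfaction matrix from scratch. The payoffs (5, 3, 1, 0) are the usual textbook values and the thresholds of 3 are an assumption; neither is taken from the slides.

```python
from itertools import product

ACTIONS = ("C", "D")

# Assumed textbook payoffs (agent 1, agent 2); not taken from the slides.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

# Assumed thresholds: each agent wants at least the mutual-cooperation reward.
THRESHOLDS = (3, 3)


def satisfaction_matrix(payoffs, thresholds):
    """Map each joint strategy to a tuple of 0/1 satisfaction values."""
    return {
        joint: tuple(int(r >= t) for r, t in zip(rewards, thresholds))
        for joint, rewards in payoffs.items()
    }


matrix = satisfaction_matrix(PAYOFFS, THRESHOLDS)

# A joint strategy is a satisfaction equilibrium when every entry is 1.
equilibria = [joint for joint in product(ACTIONS, repeat=2) if all(matrix[joint])]
```

With these assumed numbers the only satisfaction equilibrium is (C,C), while the Nash equilibrium (D,D) leaves both agents unsatisfied.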
11
Satisfaction Equilibrium

However, even if a satisfaction equilibrium
exists, it may be unreachable
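One way to see unreachability is a small reachability search. The satisfaction matrix below is a made-up example, not the one from the presentation: agent 2 is satisfied whenever it plays B, so it never moves away from B, while the only equilibrium (A,A) needs it to play A. The search assumes that satisfied agents keep their action and unsatisfied agents may switch to any action.

```python
from itertools import product

ACTIONS = ("A", "B")

# Hypothetical satisfaction matrix (agent 1, agent 2); 1 means satisfied.
SATISFACTION = {
    ("A", "A"): (1, 1),
    ("A", "B"): (0, 1),
    ("B", "A"): (0, 0),
    ("B", "B"): (0, 1),
}


def successors(joint):
    """Joint strategies that may follow: satisfied agents stay put,
    unsatisfied agents may pick any of their actions."""
    options = [
        (action,) if satisfied else ACTIONS
        for action, satisfied in zip(joint, SATISFACTION[joint])
    ]
    return set(product(*options))


def reachable(start):
    """All joint strategies that can be visited starting from `start`."""
    seen, frontier = {start}, [start]
    while frontier:
        for nxt in successors(frontier.pop()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


from_ab = reachable(("A", "B"))
from_ba = reachable(("B", "A"))
```

Here `from_ab` never contains (A,A), because agent 2 stays satisfied with B, whereas `from_ba` does contain it.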

13
Satisfaction Equilibrium Learning

If the satisfaction thresholds are fixed, we only
need to apply the satisfaction-based reasoning
Choose a strategy randomly
If satisfied, keep playing the same strategy
Else choose a new strategy randomly
We can also use other exploration functions which
favour actions that have not been explored often
Ex
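The steps above can be sketched as a short simulation. Everything specific here is an assumption: the prisoner's dilemma payoffs are textbook values, the thresholds are fixed at 3, and `count_biased_explore` is one hypothetical way to favour rarely played actions (weight 1 / (1 + times played)); the exploration formula from the slide is not preserved in the transcript.

```python
import random

ACTIONS = ("C", "D")

# Assumed textbook payoffs (agent 1, agent 2); not taken from the slides.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
THRESHOLDS = (3, 3)  # assumed fixed satisfaction thresholds


def uniform_explore(rng, counts):
    """Choose a new strategy uniformly at random."""
    return rng.choice(ACTIONS)


def count_biased_explore(rng, counts):
    """Hypothetical bias: weight each action by 1 / (1 + times played)."""
    weights = [1.0 / (1 + counts[a]) for a in ACTIONS]
    return rng.choices(ACTIONS, weights=weights, k=1)[0]


def learn(explore, seed=0, max_turns=10_000):
    """Run satisfaction-based play; return (final joint strategy, turns used)."""
    rng = random.Random(seed)
    counts = [{a: 0 for a in ACTIONS} for _ in range(2)]
    joint = [rng.choice(ACTIONS) for _ in range(2)]  # choose a strategy randomly
    for turn in range(max_turns):
        for i, action in enumerate(joint):
            counts[i][action] += 1
        rewards = PAYOFFS[tuple(joint)]
        satisfied = [r >= t for r, t in zip(rewards, THRESHOLDS)]
        if all(satisfied):
            return tuple(joint), turn
        # Each agent uses only its own satisfaction: keep if satisfied, else explore.
        joint = [
            action if ok else explore(rng, counts[i])
            for i, (action, ok) in enumerate(zip(joint, satisfied))
        ]
    return tuple(joint), max_turns


uniform_result, _ = learn(uniform_explore)
biased_result, _ = learn(count_biased_explore)
```

Note that each agent's decision depends only on its own reward and its own play counts, never on the other agent's action or payoff.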

14
Satisfaction Equilibrium Learning

We use a simple update rule
When the agent is satisfied, we increment its
satisfaction threshold by some variable $\delta_i$
If the agent is unsatisfied, we decrement its
satisfaction threshold by $\delta_i$
$\delta_i$ is multiplied by a factor $\gamma_i \in (0, 1)$ each
turn such that it converges to 0
We also use a limited history of our previous
satisfaction states and thresholds for each
action to bound the value of the satisfaction
threshold
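A minimal sketch of the increment/decrement part of this rule, under stated assumptions: the class name `ThresholdLearner` and its field names are hypothetical, satisfaction is checked against the threshold before it is updated, and the history-based bound is left out because the slide does not give enough detail to reproduce it.

```python
from dataclasses import dataclass


@dataclass
class ThresholdLearner:
    """Hypothetical per-agent state for learning a satisfaction threshold."""
    threshold: float  # current satisfaction threshold
    step: float       # size of the next increment or decrement
    decay: float      # factor in (0, 1) applied to the step every turn

    def update(self, reward: float) -> bool:
        """Check satisfaction against the current threshold, then adjust it."""
        satisfied = reward >= self.threshold
        if satisfied:
            self.threshold += self.step  # satisfied: ask for more next time
        else:
            self.threshold -= self.step  # unsatisfied: lower expectations
        self.step *= self.decay          # shrink the step so it converges to 0
        return satisfied


learner = ThresholdLearner(threshold=0.0, step=1.0, decay=0.5)
history = [learner.update(r) for r in (3.0, 1.0, 1.0)]
```

After the three rewards above the threshold has moved 0 → 1 → 1.5 → 1.25, and the step has shrunk from 1 to 0.125.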

16
Results

Fixed satisfaction thresholds
In simple games, we were always able to reach a
satisfaction equilibrium.
Using a biased exploration improves the speed of
convergence of the algorithm.
Learning the satisfaction thresholds
We are generally able to learn the optimal
satisfaction equilibrium in simple games.
Using a biased exploration improves the
convergence percentage of the algorithm.
The decay factor and the history size affect the
convergence of the algorithm and need to be
adjusted to get optimal results.

17
Results: Prisoner's dilemma

19
Conclusion

It is possible to learn stable outcomes without
observing anything but our own rewards
Satisfaction equilibria can be defined on any
Pareto-Optimal solution.
However, satisfaction equilibria are not always
reachable
The proposed learning algorithms achieve good
performance in simple games
However, they require game-specific adjustments
for optimal performance
