Title: Online Planning for Resource Production in Real-Time Strategy Games
1Online Planning for Resource Production in
Real-Time Strategy Games
- Hei Chan, Alan Fern, Soumya Ray,
- Nick Wilson, Chris Ventura
- School of EECS
- Oregon State University
2RTS Game Example Wargus
(Wargus screenshot with labeled objects: peasant, goldmine, townhall, forest)
3RTS Game Example Wargus
4Resource Production in RTS Games
- Produce (or gather) various raw materials, buildings, civilian or military units
- Build economy and military power to prepare for tactical battles
- Ability to do this in a short amount of time is key to winning the RTS game
- Plans with short makespan are desirable
5Challenging Features
- Actions have durations
- Resources are numeric
- Objects are created during game-play
- Plans typically contain many actions
- Concurrency is necessary for short makespan
- Actions must be selected in real-time
- Many of the hardest aspects of AI planning!
6How do we cope?
- Give up on optimality
- Comparable with human expert players
- Reason about number of objects of each type
- Do not reason about each individual object
- Leverage special structure of RTS game actions
- Only a subset of full PDDL language
7Solution Architecture
- Wargus engine: provides the current game state
- Abstract game state: resource amount of each type (e.g. # of peasants), actions currently being executed
- Online planner: outputs the actions to be executed now, e.g. 2 collect-gold, 1 collect-wood
- Action dispatcher: outputs ground actions, e.g. Peasant 1 collect-gold, Peasant 2 collect-gold, Peasant 3 collect-wood
- Ground actions are sent to the Wargus engine
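The dispatcher step in this architecture can be illustrated with a minimal sketch. It assumes the planner hands over a count per abstract action and that idle peasants are known by name; the function `dispatch` and its greedy first-come assignment are hypothetical and not taken from the slides.

```python
def dispatch(action_counts, idle_peasants):
    """Turn abstract actions (counts per action type) into ground actions.

    Hypothetical helper: assigns idle peasants in list order, which ignores
    distance and any other spatial consideration.
    """
    free = list(idle_peasants)
    ground = []
    for action, count in action_counts.items():
        for _ in range(count):
            if not free:
                return ground  # not enough idle peasants; the rest waits
            ground.append((free.pop(0), action))
    return ground


# Example from the slide: 2 collect-gold, 1 collect-wood, three idle peasants.
ground_actions = dispatch(
    {"collect-gold": 2, "collect-wood": 1},
    ["Peasant 1", "Peasant 2", "Peasant 3"],
)
for peasant, action in ground_actions:
    print(peasant, action)
```

Keeping the planner at the level of counts and leaving object identity to the dispatcher mirrors the "reason about number of objects of each type" idea stated earlier.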
8What about PDDL planners?
- We can express our domain in PDDL
    (:durative-action collect-gold
      :parameters ()
      :duration (= ?duration 300)
      :condition
        (and (over all (> (total-townhall) 0))
             (at start (> (avail-peasant) 0)))
      :effect
        (and (at start (decrease (avail-peasant) 1))
             (at end (increase (avail-peasant) 1))
             (at end (increase (total-gold) 100))
             (at end (increase (time) ?duration))))
- Several PDDL planners can handle our features
- SAPA, MIPS-XXL, SG-Plan, LPG-td, TM-LPSAT, etc.
- Do not return good plans
- Often slow
9Action Language
- Define RTS resource production actions in a specialized language
- Captures most of the actions in RTS domain
- Can be translated to a restricted subset of PDDL
- Based on four resource tags according to relationships between resource and action
- Require
- Borrow
- Consume
- Produce
10Action Language
- action build-townhall
- duration 1530
- require 1 lumbermill
- borrow 1 peasant
- consume 1200 gold 800 wood
- produce 1 townhall
Resource tags:
- Require: needs to be present during the action, but is not locked up
- Borrow: locked up during the action, so other actions cannot borrow it
- Consume: deducted from the game state at the start of the action
- Produce: added to the game state at the end of the action
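The semantics of the four tags can be sketched in Python as follows. This is an illustrative model only: the class `Action` and its methods `applicable`, `start`, and `finish` are hypothetical names, and the game state is assumed to be a plain dictionary of resource counts.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """Hypothetical model of an action in the four-tag language."""
    name: str
    duration: int
    require: dict = field(default_factory=dict)
    borrow: dict = field(default_factory=dict)
    consume: dict = field(default_factory=dict)
    produce: dict = field(default_factory=dict)

    def applicable(self, state):
        # Every required, borrowed, and consumed resource must be available.
        return all(
            state.get(res, 0) >= n
            for tag in (self.require, self.borrow, self.consume)
            for res, n in tag.items()
        )

    def start(self, state):
        # Borrowed resources are locked up; consumed resources are deducted.
        new = dict(state)
        for tag in (self.borrow, self.consume):
            for res, n in tag.items():
                new[res] = new.get(res, 0) - n
        return new

    def finish(self, state):
        # Borrowed resources are released; produced resources are added.
        new = dict(state)
        for tag in (self.borrow, self.produce):
            for res, n in tag.items():
                new[res] = new.get(res, 0) + n
        return new


build_townhall = Action(
    name="build-townhall",
    duration=1530,
    require={"lumbermill": 1},
    borrow={"peasant": 1},
    consume={"gold": 1200, "wood": 800},
    produce={"townhall": 1},
)

state = {"lumbermill": 1, "peasant": 1, "gold": 1200, "wood": 800}
during = build_townhall.start(state)
after = build_townhall.finish(during)
```

Note that the required lumbermill is never touched by `start` or `finish`, which is what distinguishes Require from Borrow.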
11Online Planner Architecture
- Input: abstract game state
- Sequential planner: produces a sequential plan (Collect-gold, Collect-wood, Build-farm)
- Heuristic scheduler: produces a concurrent plan (Collect-gold, Collect-wood, Build-farm)
- Executable action selector: produces the actions to be executed now (Collect-gold, Collect-wood)
12Sequential Planner
- Means-Ends Analysis (MEA) is used to find a sequential plan to achieve the goal
- MEA iteratively constructs a plan to satisfy each resource goal
- Recursively finds a sub-plan which satisfies the preconditions of an action which produces the resource
13Means-Ends Analysis
Example (animation steps):
- Initial state: 1 peasant, 1 townhall. Goal: 400 gold, 1 supply.
- Plan: Collect-gold x 4. State: 1 peasant, 1 townhall, 400 gold. Goal: 400 gold, 1 supply.
- Plan: Collect-gold x 4, Build-farm (pending). Subgoal: 400 gold, 200 wood. State: 1 peasant, 1 townhall, 400 gold.
- Plan: Collect-gold x 4, Collect-wood x 2, Build-farm. State: 1 peasant, 1 townhall, 4 supply. Goal: 400 gold, 1 supply.
- Final plan: Collect-gold x 4, Collect-wood x 2, Build-farm, Collect-gold x 4. Final state: 1 peasant, 1 townhall, 400 gold, 4 supply.
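The example can be reproduced with a small sketch of means-ends analysis. The action models in `PRODUCER` are assumptions inferred from the numbers in the example (100 gold or wood per collect action, a farm costing 400 gold and 200 wood and giving 4 supply); the function names are hypothetical, and the sketch relies on each resource having exactly one producing action.

```python
# Hypothetical action models, keyed by the resource they produce.
PRODUCER = {
    "gold": {"name": "collect-gold", "require": {"townhall": 1},
             "borrow": {"peasant": 1}, "consume": {}, "produce": {"gold": 100}},
    "wood": {"name": "collect-wood", "require": {"townhall": 1},
             "borrow": {"peasant": 1}, "consume": {}, "produce": {"wood": 100}},
    "supply": {"name": "build-farm", "require": {},
               "borrow": {"peasant": 1}, "consume": {"gold": 400, "wood": 200},
               "produce": {"supply": 4}},
}


def preconditions(action):
    """Resources that must be available when the action starts."""
    pre = {}
    for tag in ("require", "borrow", "consume"):
        for res, n in action[tag].items():
            pre[res] = max(pre.get(res, 0), n)
    return pre


def apply_sequentially(action, state):
    """Net effect of running the action to completion with nothing else active."""
    new = dict(state)
    for res, n in action["consume"].items():
        new[res] = new.get(res, 0) - n
    for res, n in action["produce"].items():
        new[res] = new.get(res, 0) + n
    return new


def mea(state, goal):
    """Return (sequential plan, resulting state) that satisfies the goal."""
    state = dict(state)
    plan = []
    while True:
        missing = [r for r, n in goal.items() if state.get(r, 0) < n]
        if not missing:
            return plan, state
        action = PRODUCER[missing[0]]  # KeyError if nothing produces it
        # Recursively achieve the action's preconditions, then apply it.
        sub_plan, state = mea(state, preconditions(action))
        plan += sub_plan
        state = apply_sequentially(action, state)
        plan.append(action["name"])


plan, final_state = mea({"peasant": 1, "townhall": 1}, {"gold": 400, "supply": 1})
print(plan)
```

Because Build-farm consumes the gold gathered first, the outer loop finds the gold goal unsatisfied again and appends four more collect-gold actions, as in the last step of the example.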
14Means-Ends Analysis
- MEA is guaranteed to return the plan with the minimal set of actions, in time polynomial in the number of actions in the plan, if
- Each resource is produced by exactly one action
- Initial state guarantees no cyclical dependency among resource goals
- Can be verified by examining the resource dependency graph
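One way to examine the resource dependency graph is a depth-first search for cycles. The sketch below is an assumption about how such a check could look: the graph `DEPS` is hypothetical, and treating every resource present in the initial state as already available is a simplification, not the authors' stated criterion.

```python
# Hypothetical dependency graph: resource -> resources its producing action needs.
DEPS = {
    "gold": ["townhall", "peasant"],
    "wood": ["townhall", "peasant"],
    "supply": ["peasant", "gold", "wood"],
    "townhall": ["lumbermill", "peasant", "gold", "wood"],
    "peasant": ["townhall", "gold"],
}


def has_cycle(deps, available):
    """Detect a dependency cycle among resources not already available."""
    IN_PROGRESS, DONE = 1, 2
    status = {}

    def visit(res):
        if res in available:
            return False  # the initial state breaks the dependency here
        if status.get(res) == IN_PROGRESS:
            return True   # reached a resource that is still being expanded
        if status.get(res) == DONE:
            return False
        status[res] = IN_PROGRESS
        for dep in deps.get(res, ()):
            if visit(dep):
                return True
        status[res] = DONE
        return False

    return any(visit(res) for res in deps)


print(has_cycle(DEPS, set()))
print(has_cycle(DEPS, {"peasant", "townhall"}))
```

With nothing available, gold needs a townhall and a townhall needs gold, so a cycle exists; starting with a peasant and a townhall removes it.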
15Heuristic Scheduler
- Reschedule actions from the sequential plan to allow concurrency and decrease makespan
- Each action is moved repeatedly to an earlier time, until its preconditions no longer hold
- Rescheduling can be done in quadratic time in the number of actions in the plan
- Not optimal, but suitable for online planning
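The rescheduling idea can be sketched as follows. This is a simple illustration, not the authors' implementation: it re-simulates the whole timeline for every candidate start time, so it does not achieve the quadratic bound mentioned above. The durations of collect-wood and build-farm are hypothetical, and Require is only checked at an action's start.

```python
def feasible(schedule, initial):
    """Simulate the timeline; fail if a resource is missing or goes negative."""
    events = []
    for start, a in schedule:
        events.append((start, 1, a))                  # 1 = start event
        events.append((start + a["duration"], 0, a))  # 0 = end event
    events.sort(key=lambda e: (e[0], e[1]))           # ends before starts
    level = dict(initial)
    for _, kind, a in events:
        if kind == 0:
            for tag in ("borrow", "produce"):
                for res, n in a[tag].items():
                    level[res] = level.get(res, 0) + n
        else:
            if any(level.get(r, 0) < n for r, n in a["require"].items()):
                return False
            for tag in ("borrow", "consume"):
                for res, n in a[tag].items():
                    level[res] = level.get(res, 0) - n
                    if level[res] < 0:
                        return False
    return True


def reschedule(plan, initial):
    """Place each action at the end, then move it earlier while still feasible."""
    sched = []
    for a in plan:
        times = sorted({0} | {s for s, _ in sched}
                       | {s + b["duration"] for s, b in sched}, reverse=True)
        t = times[0]  # after everything scheduled so far, as in the sequential plan
        for cand in times[1:]:
            if not feasible(sched + [(cand, a)], initial):
                break
            t = cand
        sched.append((t, a))
    return sched


def makespan(sched):
    return max(s + a["duration"] for s, a in sched)


# Durations of collect-wood and build-farm are assumptions for illustration.
GOLD = {"name": "collect-gold", "duration": 300, "require": {"townhall": 1},
        "borrow": {"peasant": 1}, "consume": {}, "produce": {"gold": 100}}
WOOD = {"name": "collect-wood", "duration": 400, "require": {"townhall": 1},
        "borrow": {"peasant": 1}, "consume": {}, "produce": {"wood": 100}}
FARM = {"name": "build-farm", "duration": 500, "require": {},
        "borrow": {"peasant": 1}, "consume": {"gold": 400, "wood": 200},
        "produce": {"supply": 4}}

plan = [GOLD] * 4 + [WOOD] * 2 + [FARM]
one_peasant = makespan(reschedule(plan, {"peasant": 1, "townhall": 1}))
two_peasants = makespan(reschedule(plan, {"peasant": 2, "townhall": 1}))
print(one_peasant, two_peasants)
```

With a single peasant every action borrows the same unit, so nothing can be moved earlier; with two peasants the gathering actions pair up and the makespan shrinks.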
16However,
- The above planner (MEA plus rescheduling) only creates the minimal resources to achieve the goal
- If there is currently one peasant, the planner will not create any additional peasants
- No concurrency possible
- Creating more peasants may decrease makespan
- A good planner must create a close-to-optimal number of renewable resources, e.g. peasants
(Figure: plan with Collect-gold, Collect-wood, Build-farm)
17Creating Renewable Resources
- Find plans that satisfy intermediate goals
- ICAPS Planning in Games Workshop paper
- Search over a variable but bounded set of
intermediate goals
- Goal: Plan0
- Create one extra peasant, then Goal: Plan1
- Create one extra barracks, then Goal: Plan2
- Create one extra townhall, then Goal: Plan3
- Chosen plan: Min of Plan0, Plan1, Plan2, Plan3
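The selection step can be illustrated with a toy model. It is only a sketch under strong assumptions: instead of calling the real planner for each intermediate goal, it uses a hypothetical closed-form makespan estimate for a gold-only goal, the intermediate goals are restricted to "create k extra peasants", and all constants are made up for illustration.

```python
import math

# Hypothetical constants, chosen only for illustration.
TRIP_TIME = 300      # duration of one collect-gold action
TRIP_GOLD = 100      # gold produced per collect-gold action
PEASANT_COST = 400   # gold consumed to create one peasant
TRAIN_TIME = 200     # duration of creating one peasant


def gather_time(gold, peasants):
    """Time for `peasants` working in parallel to gather `gold`."""
    trips = math.ceil(gold / TRIP_GOLD)
    return math.ceil(trips / peasants) * TRIP_TIME


def estimated_makespan(goal_gold, peasants, extra):
    """Create `extra` peasants one after another, then gather the goal gold."""
    t = 0
    for _ in range(extra):
        t += gather_time(PEASANT_COST, peasants) + TRAIN_TIME
        peasants += 1
    return t + gather_time(goal_gold, peasants)


def choose_plan(goal_gold, peasants, max_extra):
    """Search a bounded set of intermediate goals and keep the minimum."""
    return min(range(max_extra + 1),
               key=lambda k: estimated_makespan(goal_gold, peasants, k))


best_extra = choose_plan(goal_gold=2000, peasants=1, max_extra=3)
print(best_extra, estimated_makespan(2000, 1, best_extra))
```

In this toy setting, plans with zero, one, two, and three extra peasants are compared, and the one with the smallest estimated makespan is chosen, in the spirit of the Min node above.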
18Empirical Evaluation
- Compare our online planner in Wargus resource production against
- An expert human player via the game-playing interface (2 trials)
- Best of 2 plans written by the human player
- Other state-of-the-art PDDL planners
19Experimental Results
[Results chart; axis label: # of peasants]
20Experimental Results
[Results chart; axis label: # of peasants]
21Limitations
- Actions have constant durations and effects
- Durations vary for actions such as collect-wood
- Effects can change due to new technology
- Unable to infer object identity
- Needed for some actions such as repair
- No spatial reasoning
- Needed for building placement
- Actions cannot be stopped
- Necessary if goals or environment change
22Future Work
- Less restrictive action language
- Provide guarantees when using inaccurate models
- Adaptive model learning
- Integrate into a full RTS game-playing system
23Summary
- We presented an approach to solving large resource production problems in RTS games
- Our approach works in an online setting, as it uses a computationally efficient action selection mechanism
- Our planner is competitive with a human expert player and performs significantly better than state-of-the-art planners in this domain