Title: Dynamics of Learning
1. Dynamics of Learning: Distributed Adaptation
Santa Fe Institute, James P. Crutchfield, P.I.
- Multi-Agent System Science (MASS) Dimension
  - Agents learn complex environment ab initio
  - Synchronization of agent to environment
  - Agents adapt to nonstationary environment
  - Strategies for agent-agent coordination
- Metrics for large-scale MASs
  - Statistical Complexity: amount of structure and organization in the environment
  - Mutuality: individual agent knowledge vs. group knowledge; architecture of information flow
  - Lyapunov Spectra: degrees of stability and instability
  - Causal Synchrony: detect coherent subgroup behavior
- CAHDE REF
  - ACFC: Adapting to instabilities in air flow control
  - AirOps: Emergence of spontaneous leadership
- Solution
  - Interacting reinforcement and ε-machine learning agents solve a group task
- Approach
  - Pattern Discovery: beyond pattern recognition
  - Design and analysis based on sound principles of learning
  - Metrics for cooperation in large-scale systems
Future Plans (6 months out)
- New problems
  - Continuous-state and continuous-time agents
  - Adaptation to active, pattern-forming environments
  - Dynamical theory of how learning and adaptation occur
- Anticipated results
  - Monitor emergence of cooperation in agent collectives
  - Measure mutuality in interacting reinforcement learners
  - Test on in-house autonomous robotic vehicle collectives
- Analytical tools
  - Predict whether or not group cooperation can occur
  - Agent intelligence versus group size
  - Prediction of the rate of adaptation during a collective task
- Prototype models
  - Solvable MAS models
- Software tools
  - Ab Initio Learning Algorithms
  - Library for Estimating MASS Metrics
  - Enterprise Java Platform for Robot Collectives
- Results To Date
  - Predictive theory of agent learning
    - Quantify agent modeling capacity
    - Data set size vs. prediction error vs. model complexity
  - Pattern Discovery: the Aha Effect
    - Incremental learning algorithm
    - Quantify structure in the environment
    - How structure leads to unpredictability for the agent
  - Define synchronization for chaotic environments
    - Predict required data and time to synchronize
    - Periodic case solved in closed form
    - Transient information: a new metric of synchronization
  - Dynamics of reinforcement-learning agents (see the sketch after this list)
    - Nash equilibria vs. oscillation vs. chaos
    - Dependence on system architecture and initial state
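A minimal simulation sketch of this last result is below; the game (rock-paper-scissors), the softmax value-update learning rule, and all parameter values are illustrative assumptions, not the project's actual agent models. Depending on the learning rate, the softmax temperature, and the initial state, the two learners' mixed strategies either settle near the mixed Nash equilibrium or keep cycling around it.

```python
# Minimal sketch: two reinforcement learners playing rock-paper-scissors.
# Whether their mixed strategies settle near the Nash equilibrium or keep
# cycling depends on the learning rate, temperature, and initial state.
# All parameters are illustrative, not the project's agent models.
import numpy as np

PAYOFF = np.array([[0, -1, 1],
                   [1, 0, -1],
                   [-1, 1, 0]])  # row player's payoff: rock, paper, scissors

def softmax(q, beta):
    z = np.exp(beta * (q - q.max()))
    return z / z.sum()

def run(steps=20000, alpha=0.05, beta=2.0, seed=0):
    rng = np.random.default_rng(seed)
    q1 = rng.normal(0.0, 0.1, 3)   # action values of player 1
    q2 = rng.normal(0.0, 0.1, 3)   # action values of player 2
    traj = []
    for _ in range(steps):
        p1, p2 = softmax(q1, beta), softmax(q2, beta)
        a1 = rng.choice(3, p=p1)
        a2 = rng.choice(3, p=p2)
        r1 = PAYOFF[a1, a2]
        r2 = -r1                          # zero-sum game
        q1[a1] += alpha * (r1 - q1[a1])   # simple value-update rule
        q2[a2] += alpha * (r2 - q2[a2])
        traj.append(p1)
    return np.array(traj)

if __name__ == "__main__":
    traj = run()
    # Distance of player 1's strategy from the mixed Nash point (1/3, 1/3, 1/3):
    dist = np.abs(traj - 1.0 / 3).sum(axis=1)
    print("mean |p - Nash| over the last 1000 steps:", dist[-1000:].mean())
```

Plotting the stored trajectory, rather than the single summary number printed here, shows the convergent versus oscillatory regimes directly.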
2. Dynamics of Learning: Distributed Adaptation
PI: James P. Crutchfield; Post-Doc: Cosma Shalizi
Santa Fe Institute
TASK PI Meeting, 9-11 January 2002, D.C.
- Dynamics of Learning: Single-Agent Adaptability
  - REF Scenario: ACFC
  - Adapting to Instabilities in Air Flow Control
- Emergence of Distributed Adaptation
  - REF Scenario: AirOps
  - Spontaneous Leadership
3. Adapting to Instabilities in Air Flow Control
- REF Scenario: Air Corridor Flow Control (ACFC)
- Adapting to Instabilities in Air Flow Control
  - Novel air vehicle or vehicle behavior appears in the environment
  - Unexpected weather patterns (e.g., central Asian dust storms)
- Emergent Novelty
  - Pattern formation in intruder MAS
  - Pattern formation in own MAS
- Conclusion
  - Cannot design in all potentialities
  - Need to adapt to novelty
4. Adapting to Instabilities in Air Flow Control
- Problem Statement
  - Agent learns patterns in a heterogeneous environment
  - Even an agent with a good model will meet unforeseen events and patterns in the environment
  - Where to start the base design for agents?
- Conclusion: Need Tabula Rasa Learning for
  - Agents, to dynamically adapt
  - Designers, to create initial models and MAS architecture
5. Dynamics of Learning Solution: Tabula Rasa Learning using the State-Splitting ε-Machine-Learning Algorithm (with C. Shalizi, SFI, and K. Klinkner, USF)
- The Goal in Learning: estimate the ε-machine of the environment
  - Optimal predictors: agent prediction error as small as possible
  - Minimal size: number of effective environment states as small as possible
  - Unique representation
- Start from the Fair Coin (single-state model)
- Split states when doing so
  - Is statistically justified (enough data), and
  - Will significantly reduce the prediction error
- Convergence from below (C_μ grows from 0 toward the true C_μ)
- Previous algorithms
  - Converged from above
  - Sometimes gave nondeterministic ε-machines
- New algorithm (sketched below)
  - Produces deterministic ε-machines
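A much-simplified sketch of the state-splitting idea follows. It starts from a single state (the fair-coin model) and splits a history out of its state when that history's empirical next-symbol distribution differs significantly from the rest of the state's, per a chi-square test. The history length, significance level, and minimum-count threshold are illustrative assumptions, and the sketch omits determinization and transition structure; it is not the full algorithm.

```python
# Simplified sketch of state splitting: start with one state (the
# fair-coin model) and split out histories whose empirical next-symbol
# distributions differ significantly from their state's. Illustrative
# only; not the full epsilon-machine-learning algorithm.
from collections import Counter, defaultdict
from scipy.stats import chi2_contingency

def next_symbol_counts(data, max_len=3):
    """Next-symbol counts for every history of length 0..max_len."""
    counts = defaultdict(Counter)
    for L in range(max_len + 1):
        for i in range(L, len(data)):
            counts[tuple(data[i - L:i])][data[i]] += 1
    return counts

def split_states(counts, alphabet=(0, 1), alpha=0.01, min_count=20):
    """Group histories into candidate causal states by future statistics."""
    states = [set(counts)]                 # all histories start in one state
    changed = True
    while changed:
        changed = False
        for state in states:
            if len(state) == 1:
                continue
            pooled = Counter()             # pooled futures of this state
            for h in state:
                pooled.update(counts[h])
            for h in list(state):
                if sum(counts[h].values()) < min_count:
                    continue               # too little data to judge
                rest = pooled - counts[h]
                table = [[counts[h][s] for s in alphabet],
                         [rest[s] for s in alphabet]]
                if min(sum(table[0]), sum(table[1])) == 0:
                    continue
                if any(table[0][j] + table[1][j] == 0
                       for j in range(len(alphabet))):
                    continue
                _, p, _, _ = chi2_contingency(table)
                if p < alpha:              # futures differ: split h off
                    state.remove(h)
                    states.append({h})
                    changed = True
                    break                  # recompute pooled counts
    return states
```

On fair-coin data this typically leaves the single starting state intact (no overfitting of randomness), while on structured sequences it splits states out, which is the behavior summarized on the next several slides.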
6. Tabula Rasa Learning: State-Splitting ε-Machine-Learning Algorithm
- Definitions (S = the set of causal states); computed in the sketch below
  - Entropy Rate: h_μ = H[Pr(s | S)]
  - Statistical Complexity: C_μ = H[Pr(S)]
  - Empirical Error Rate: D_emp(L) = D[Pr_S(s^L) || Pr_emp(s^L)]
  - Generalization Error Rate: D_gen(L) = D[Pr_S(s^L) || Pr_true(s^L)]
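Assuming an estimated machine is given as a stationary causal-state distribution together with each state's next-symbol distribution (a data layout chosen here for illustration), the definitions above translate directly into code:

```python
# Sketch of the slide's definitions, assuming the estimated machine is
# given as a stationary state distribution pi[state] and per-state
# next-symbol distributions T[state][symbol] (illustrative layout).
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def entropy_rate(pi, T):
    """h_mu = H[Pr(s | S)]: state-averaged next-symbol uncertainty."""
    return sum(pi[i] * entropy(T[i]) for i in range(len(pi)))

def statistical_complexity(pi):
    """C_mu = H[Pr(S)]: entropy of the causal-state distribution."""
    return entropy(pi)

def divergence(p, q, eps=1e-12):
    """D(p || q) over length-L word distributions, as used for
    D_emp(L) and D_gen(L)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float((p[mask] * np.log2(p[mask] / np.maximum(q[mask], eps))).sum())

# Example: a noisy period-2 machine with two recurrent states A and B.
pi = np.array([0.5, 0.5])
T = np.array([[0.5, 0.5],   # state A emits 0 or 1 with equal probability
              [0.0, 1.0]])  # state B always emits 1
print("h_mu =", entropy_rate(pi, T), "bits/symbol")
print("C_mu =", statistical_complexity(pi), "bits")
```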
7. Tabula Rasa Learning: State-Splitting ε-Machine-Learning Algorithm (Fair Coin)
[Plots: estimated h_μ, D(L), and C_μ for the Fair Coin process]
- No overfitting of randomness
- Better model of process than raw data
8. Tabula Rasa Learning: State-Splitting ε-Machine-Learning Algorithm (Noisy Period-2)
[Plots: estimated h_μ, D(L), and C_μ for the Noisy Period-2 process]
9. Tabula Rasa Learning: State-Splitting ε-Machine-Learning Algorithm (Even Process)
- A type of infinite-range memory
- No finite-order Markov model can work (see the generator sketch below)
[Plots: estimated h_μ, D(L), and C_μ for the Even Process]
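For concreteness, here is a generator sketch for the Even Process, assuming the usual two-state presentation with an equal-probability choice in the first state: 1s occur only in blocks of even length, separated by 0s, so predicting the symbol that follows a run of 1s requires the parity of the whole run. No finite-order Markov model over the symbols captures this, while two causal states suffice.

```python
# Generator sketch for the Even Process: blocks of 1s have even length
# and are separated by 0s (two-state presentation, with an assumed
# 1/2-1/2 choice in state A). Useful as a test environment for the
# learner; the final block may be cut off at length n.
import numpy as np

def even_process(n, seed=0):
    rng = np.random.default_rng(seed)
    out, state = [], "A"
    while len(out) < n:
        if state == "A":
            if rng.random() < 0.5:
                out.append(0)          # emit 0, stay in A
            else:
                out.append(1)          # emit 1, start an even block of 1s
                state = "B"
        else:
            out.append(1)              # second 1 of the pair, back to A
            state = "A"
    return out

print("".join(map(str, even_process(40))))
```

Sequences like this serve as test input for the state-splitting learner sketched earlier.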
10. Tabula Rasa Learning: State-Splitting ε-Machine-Learning Algorithm (Misiurewicz Process)
[Plots: estimated h_μ, D(L), and C_μ for the Misiurewicz Process]
11. Tabula Rasa Learning: State-Splitting ε-Machine-Learning Algorithm (Simple Nondeterministic Process)
- Infinite memory: infinitely many causal states
- No finite-order Markov model can work
[Plots: estimated h_μ, D(L), and C_μ for the Simple Nondeterministic Process]
12. Tabula Rasa Learning: State-Splitting ε-Machine-Learning Algorithm (Consequences)
- New algorithm working well; preserves desired properties
- Yields
  - Systematic approximation to environmental unpredictability and complexity
  - Model for prediction, decision-making, and planning
- To do
  - Theoretical characterization of convergence in prediction error and statistical complexity
  - Adapt to be explicitly dynamical: what if the environment is nonstationary?
  - Include state merging
  - Integrate as the learning algorithm for various kinds of agents
13. Adapting to Instabilities in Air Flow Control
- Results
  - Inferring new environment states is the discovery of new patterns and significant events
  - More complex models are built only when statistically justified
  - Metric for adaptation given by the change in model complexity (see the sketch after this list)
  - Can now explicitly trade off
    - Amount of data versus model complexity
    - Model complexity versus prediction error
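One way to realize this adaptation metric is to re-estimate the model over a sliding window and track the change in its statistical complexity. The sketch below uses a crude stand-in for the full ε-machine estimate: length-k contexts are grouped by their empirical next-symbol probabilities, and the entropy over those groups stands in for C_μ. The window size, k, the grouping tolerance, and the switching test environment are all illustrative assumptions.

```python
# Sketch of using the change in estimated model complexity as an
# adaptation / change signal. The complexity estimate here is a crude
# stand-in for the epsilon-machine estimate, chosen only to keep the
# example self-contained.
import numpy as np

def approx_complexity(window, k=2, tol=0.2):
    """Entropy over groups of predictively similar length-k contexts."""
    stats = {}                              # context -> (count, ones)
    for i in range(k, len(window)):
        ctx = tuple(window[i - k:i])
        n, ones = stats.get(ctx, (0, 0))
        stats[ctx] = (n + 1, ones + window[i])
    groups = []                             # [weight, representative P(next=1)]
    for n, ones in stats.values():
        p1 = ones / n
        for g in groups:
            if abs(g[1] - p1) < tol:        # predictively similar: merge
                g[0] += n
                break
        else:
            groups.append([n, p1])
    w = np.array([g[0] for g in groups], dtype=float)
    w /= w.sum()
    return float(-(w * np.log2(w)).sum())

def complexity_trace(data, window=1000, step=250):
    return [approx_complexity(data[i:i + window])
            for i in range(0, len(data) - window + 1, step)]

# Environment that switches from a fair coin to a period-2 pattern:
# the estimated complexity shifts from about 0 to about 1 bit.
rng = np.random.default_rng(1)
data = rng.integers(0, 2, 3000).tolist() + [i % 2 for i in range(3000)]
print(["%.2f" % c for c in complexity_trace(data)])
```

A persistent shift in this trace, beyond its baseline fluctuations, is the change signal; the full version would use the actual ε-machine estimator in place of the stand-in.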
14. Tabula Rasa Learning: Phenomenal Patterns and the Aha Effect
- Learning complex environments (with Chris Douglas)
- Learning paradigm: three phases
  - Memorization
  - Aha!
  - Refinement
15. Adapting to Instabilities in Air Flow Control
- Applications to REF
  - Pattern discovery means that agents
    - Adapt to novelty
    - Automatically capture emergent behaviors in the environment
  - Patterns learned tell us the structure of the environment
    - Design
    - Diagnosis
    - Real-time monitoring
      - When the environment has changed significantly
      - Performance of system adaptation
16. Spontaneous Leadership
- REF Scenario: AirOps, Spontaneous Leadership
  - Emergence of control levels in MAS
    - Electing officers
    - Authority hierarchy
  - Recovery from loss of central control
  - Spontaneous task specialization
  - Develop control structures appropriate to group tasks
17. Spontaneous Leadership
- Problem Statement
  - MAS Analysis
    - How to detect emergent control architecture in own or opponent forces?
  - MAS Design
    - Rules for learning and interaction that produce cooperative group behavior
    - Must be able to measure emergence of cooperation
- Conclusion: Need to detect hidden coherent behaviors
18. Spontaneous Leadership
- Dynamics of Learning Solution: Causal Synchrony (see the sketch after this list)
  - Builds on generalized synchrony of earlier work
  - Applies to any network of dynamical nodes
  - Do causal-state reconstruction on each node in the network
  - For any two nodes, find the mutual information between their causal states
    - Amount of common causal information
  - Normalize the MI to get a score between 0 and 1
    - Degree of coherence in future behavior
  - Display as a weighted graph showing clusters of synchrony
  - Average over the network
  - O(N) reconstructions, O(N^2) calculations of MI
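A sketch of the pairwise causal-synchrony score is below, assuming each node's causal-state label series has already been produced by causal-state reconstruction. Normalizing the mutual information by the smaller of the two state entropies is one common choice and is an assumption here; the slide specifies only that the MI is normalized to lie between 0 and 1.

```python
# Sketch of the causal-synchrony score. Input: one causal-state label
# series per node (assumed already produced by causal-state
# reconstruction). Output: an N x N matrix of normalized pairwise MI.
# Normalizing by the smaller marginal entropy is an assumed choice.
import numpy as np
from collections import Counter

def entropy(labels):
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def mutual_information(x, y):
    joint = Counter(zip(x, y))
    p = np.array(list(joint.values()), dtype=float)
    p /= p.sum()
    # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return entropy(x) + entropy(y) + float((p * np.log2(p)).sum())

def causal_synchrony_matrix(state_series):
    """O(N^2) pairwise normalized-MI scores over N nodes."""
    n = len(state_series)
    score = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            denom = min(entropy(state_series[i]), entropy(state_series[j]))
            mi = mutual_information(state_series[i], state_series[j])
            score[i, j] = score[j, i] = mi / denom if denom > 0 else 0.0
    return score  # weighted adjacency matrix over the node network

# Toy example: nodes 0 and 1 share causal states, node 2 is independent.
rng = np.random.default_rng(0)
a = rng.integers(0, 2, 5000)
series = [a, a.copy(), rng.integers(0, 2, 5000)]
print(np.round(causal_synchrony_matrix(series), 2))
```

Thresholding or clustering the resulting matrix gives the weighted synchrony graph and its coherent subgroups; averaging the off-diagonal entries gives the network-level score.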
19. Spontaneous Leadership
- Observations and Results
  - No assumptions about internal node dynamics
  - Heterogeneous networks OK
  - Robust against noise and over-fitting
  - More informative than the MI or covariance of the raw time series
  - Hierarchical decomposition à la C. Alexander (design patterns)
  - Overall causal synchrony
    - Amount of information in the network rather than in individual agents
20. Spontaneous Leadership
- Application to REF
- When does synchronized behavior emerge?
- When do we get internally-synchronized fragments?
- When is each behavior desirable?