Title: The Particle Swarm: Theme and Variations on Computational Social Learning
1 The Particle Swarm: Theme and Variations on
Computational Social Learning
James Kennedy, Washington, DC
Kennedy.Jim_at_gmail.com
2 The Particle Swarm
- A stochastic, population-based algorithm for problem-solving
- Based on a social-psychological metaphor
- Used by engineers, computer scientists, applied mathematicians, etc.
- First reported in 1995 by Kennedy and Eberhart
- Constantly evolving
3 The Particle Swarm Paradigm is a Particle Swarm
A kind of program comprising a population of
individuals that interact with one another
according to simple rules in order to solve
problems, which may be very complex.
It is an appropriate kind of description of the
process of science.
4 Two Spaces
- The social network topology, and
- The state of the individual as a point in a Cartesian coordinate system
- Moving point = a particle
- A bunch of them = a swarm
- Note memes: evolution of ideas vs. changes in people who hold ideas
5 Cognition as Optimization
- Minimizing or maximizing a function result by adjusting parameters
- Cognitive consistency theories, incl. dissonance
- Parallel constraint satisfaction
- Feedforward neural nets
- Particle swarm describes the dynamics of the network, as opposed to its equilibrium properties
6 Mind and Society
- Minsky: "Minds are simply what brains do."
- No, minds are what socialized human brains do.
- Solipsism
7 Dynamic Social Impact Theory
Nowak, A., Szamrej, J., & Latané, B. (1990). From private attitude to public opinion: A dynamic theory of social impact. Psychological Review, 97, 362-376.
- i = f(SIN): impact is a function of Strength, Immediacy, and Number of sources
- Computer simulation: 2-d CA
- Each individual is both a target and source of influence
- Euclidean neighborhoods
- Binary, univariate individuals
- Strength randomly assigned
- Polarization
8 Particle Swarms: The Population
- To understand the particle swarm, you have to understand the interactions of the particles
- One particle is the stupidest thing in the world
- The population learns
- Every particle is a teacher and a learner
- Social learning, norms, conformity, group dynamics, social influence, persuasion, self-presentation, cognitive dissonance, symbolic interactionism, cultural evolution
9 Neighborhoods: Population topology
[Figure panels: Gbest; Lbest (all N = 20)]
Particles learn from one another. Their
communication structure determines how solutions
propagate through the population.
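The two topologies named on this slide can be sketched as functions that return the indices a particle is allowed to learn from. This is a minimal sketch, assuming particles are indexed 0 to N-1 and that Lbest is a ring in which each particle sees itself and one neighbor on each side (the slide does not state the ring radius). The names `gbest_neighbors` and `lbest_neighbors` are hypothetical.

```python
def gbest_neighbors(i, n):
    """Gbest: every particle is informed by the whole population."""
    return list(range(n))


def lbest_neighbors(i, n, radius=1):
    """Lbest (assumed ring): particle i plus `radius` neighbors on each side."""
    return [(i + k) % n for k in range(-radius, radius + 1)]


N = 20  # population size shown on the slide
ring = lbest_neighbors(0, N)       # indices wrap around the ring
everyone = gbest_neighbors(0, N)   # all 20 particles
```

In a ring, a good solution found by one particle reaches distant particles only after several iterations, whereas in Gbest it is visible to everyone at once.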
10 von Neumann (Square) Topology
Regular, easy to understand, works adequately with a variety of versions; perhaps not the best for any version, but not the worst.
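A von Neumann neighborhood can be sketched by laying the population out on a grid and connecting each particle to the cells above, below, left, and right. The sketch below assumes a grid that wraps around at the edges and a 4 x 5 layout for 20 particles; both are illustrative choices, not taken from the slide, and `von_neumann_neighbors` is a hypothetical name.

```python
def von_neumann_neighbors(i, rows, cols):
    """Indices of particle i and its up/down/left/right neighbors on a wrapped grid."""
    r, c = divmod(i, cols)
    return [
        i,
        ((r - 1) % rows) * cols + c,   # up
        ((r + 1) % rows) * cols + c,   # down
        r * cols + (c - 1) % cols,     # left
        r * cols + (c + 1) % cols,     # right
    ]


corner = von_neumann_neighbors(0, 4, 5)
```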
11 The Particle Swarm: Pseudocode
Initialize Population and constants
Repeat
  Do i = 1 to population size
    CurrentEval_i = eval(x_i)
    If CurrentEval_i < pbest_i then do
      pbest_i = CurrentEval_i
      For d = 1 to Dimension
        p_id = x_id
      Next d
      If CurrentEval_i < pbest_gbest then gbest = i
    End if
    g = best neighbor's index
    For d = 1 to Dimension
      v_id = W*v_id + U(0, AC)*(p_id - x_id) + U(0, AC)*(p_gd - x_id)
      x_id = x_id + v_id
    Next d
  Next i
Until termination criterion
12 Pseudocode Chunk 1
CurrentEval_i = eval(x_i)
If CurrentEval_i < pbest_i then do
  pbest_i = CurrentEval_i
  For d = 1 to Dimension
    p_id = x_id
  Next d
  If CurrentEval_i < pbest_gbest then gbest = i
End if
- Can come at top or bottom of the loop
- In gbest topology, g = gbest
- It's useful to track the population best
13 Pseudocode Chunk 2
g = best neighbor's index
For d = 1 to Dimension
  v_id = W*v_id + U(0, AC)*(p_id - x_id) + U(0, AC)*(p_gd - x_id)
  x_id = x_id + v_id
Next d
- Constriction Type 1: W = 0.7298, AC = W*2.05 = 1.496
- Clerc 2006: W = 0.7, AC = 1.43
- Inertia: W might vary with time, in (0.4, 0.9), etc.; AC = 2.0 typically
- Note three components of velocity
- Vmax: not necessary, might help
Q: Are these formulas arbitrary?
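The two pseudocode chunks can be put together into a small runnable program. What follows is a minimal sketch, not the author's implementation: it assumes a gbest topology, zero initial velocities, uniform initialization in [-5, 5], a fixed iteration count as the termination criterion, and the constriction values W = 0.7298 and AC = 1.496 from the slide. The function name `pso` and its parameters are hypothetical.

```python
import random


def pso(f, dim, n=20, iters=200, lo=-5.0, hi=5.0, W=0.7298, AC=1.496, seed=0):
    """Minimize f with a gbest particle swarm; returns (best_value, best_position)."""
    rng = random.Random(seed)
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    v = [[0.0] * dim for _ in range(n)]          # assumed: start at rest
    p = [xi[:] for xi in x]                      # previous best positions
    pbest = [f(xi) for xi in x]                  # previous best values
    gbest = min(range(n), key=lambda i: pbest[i])
    for _ in range(iters):
        for i in range(n):
            # Chunk 1: evaluate, then update personal and population bests
            current = f(x[i])
            if current < pbest[i]:
                pbest[i] = current
                p[i] = x[i][:]
                if current < pbest[gbest]:
                    gbest = i
            # Chunk 2: move toward own best and the best neighbor's best
            g = gbest                            # gbest topology: g = gbest
            for d in range(dim):
                v[i][d] = (W * v[i][d]
                           + rng.uniform(0, AC) * (p[i][d] - x[i][d])
                           + rng.uniform(0, AC) * (p[g][d] - x[i][d]))
                x[i][d] += v[i][d]
    return pbest[gbest], p[gbest]


def sphere(x):
    return sum(xi * xi for xi in x)


best_value, best_position = pso(sphere, dim=5)
```

For another topology, replace `g = gbest` with the index of the best previous-best value among particle i's neighbors.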
14 Some Standard Test Functions
- Rosenbrock
- Sphere
- Rastrigin
- Schaffer's f6
- Griewank
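The slide shows these functions only as plots. As a point of reference, the sketch below gives their usual textbook definitions, written for minimization; it is an assumption that these are the exact forms used for the results in this deck. Schaffer's f6 is two-dimensional; the others accept a vector of any length.

```python
import math


def sphere(x):
    return sum(xi ** 2 for xi in x)


def rosenbrock(x):
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
               for i in range(len(x) - 1))


def rastrigin(x):
    return sum(xi ** 2 - 10.0 * math.cos(2.0 * math.pi * xi) + 10.0 for xi in x)


def griewank(x):
    s = sum(xi ** 2 for xi in x) / 4000.0
    prod = math.prod(math.cos(xi / math.sqrt(i + 1)) for i, xi in enumerate(x))
    return s - prod + 1.0


def schaffer_f6(x):
    r2 = x[0] ** 2 + x[1] ** 2
    return 0.5 + (math.sin(math.sqrt(r2)) ** 2 - 0.5) / (1.0 + 0.001 * r2) ** 2
```

Each function has a global minimum of 0: at the origin for Sphere, Rastrigin, Griewank, and f6, and at (1, ..., 1) for Rosenbrock.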
15 Test Functions: Typical Results
16 Step-Size Depends on Neighbors
Movement of the particle through the search space is centered on the mean of pi and pg on each dimension, and its amplitude is scaled to their difference. Exploration vs. exploitation is automatic.
[Figure panels: pi = 0, pg = 0; pi = 2, pg = -2; pi = 0.1, pg = -0.1; "the box"]
17 Search Distribution: Scaled to Neighborhood
- Previous bests constant at 10
- A million iterations
Q: What is the distribution of points that are tested by the particle?
18 Bare Bones particle swarm
x = G((pi + pg)/2, abs(pi - pg))
- G(mean, s.d.) is a Gaussian RNG
- Simplified (!)
- Works pretty well, but not as well as canonical.
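The bare-bones rule replaces the velocity formula with a single Gaussian draw per dimension. A minimal sketch, assuming positions are plain Python lists; the function name `bare_bones_step` is hypothetical.

```python
import random


def bare_bones_step(p_i, p_g, rng=random):
    """Sample a new position, dimension by dimension, from G(mean, s.d.)."""
    return [rng.gauss((a + b) / 2.0, abs(a - b)) for a, b in zip(p_i, p_g)]


# On the third dimension the two bests agree, so the s.d. is 0 and the
# particle lands exactly on them: the step size shrinks as neighbors converge.
x_new = bare_bones_step([2.0, 0.1, 5.0], [-2.0, -0.1, 5.0], random.Random(1))
```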
19 Kurtosis
Empirical observations with ps held constant
[Figure panels: "Peaked -- fat tails"; "Tails trimmed"; "Not trimmed"]
20 Kurtosis
High peak, fat tails
Mean moments of the canonical particle swarm algorithm with previous bests set at 20, varying the number of iterations.
Iterations   Mean      S.D.      Skewness   Kurtosis
1,000         0.0970   37.7303   -0.0617      8.008
3,000         0.0214   41.5281    0.0814     18.813
10,000       -0.0080   41.6614   -0.0679     40.494
100,000       0.0022   41.7229    0.2116    170.204
1,000,000     0.0080   41.3048    0.3808    342.986
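An experiment of this kind can be sketched by running a single one-dimensional particle with its previous bests frozen and computing the moments of the positions it visits. Several things here are assumptions rather than details from the slide: the bests are placed at -20 and +20 (one reading of "set at 20" that is consistent with a mean near 0), the constriction coefficients are used, the particle starts at rest at 0, and kurtosis is computed as the plain fourth standardized moment. The numbers it produces are not expected to reproduce the table.

```python
import random


def trajectory_moments(p_i=-20.0, p_g=20.0, iters=10_000,
                       W=0.7298, AC=1.496, seed=0):
    """Run one 1-D particle with fixed bests; return mean, s.d., skewness, kurtosis."""
    rng = random.Random(seed)
    x, v, xs = 0.0, 0.0, []
    for _ in range(iters):
        v = W * v + rng.uniform(0, AC) * (p_i - x) + rng.uniform(0, AC) * (p_g - x)
        x += v
        xs.append(x)
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((a - mean) ** 2 for a in xs) / n   # central moments
    m3 = sum((a - mean) ** 3 for a in xs) / n
    m4 = sum((a - mean) ** 4 for a in xs) / n
    sd = m2 ** 0.5
    return mean, sd, m3 / sd ** 3, m4 / m2 ** 2


mean, sd, skew, kurt = trajectory_moments()
```

Calling `trajectory_moments(iters=...)` with increasing iteration counts is one way to explore how the fourth moment behaves as the sample grows.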
21 Bursts of Outliers
Volatility clustering seems to typify the particle's trajectory
22 Adding Bursts of Outliers to Bare Bones PSO
Center = (p_id + p_gd)/2
SD = abs(p_id - p_gd)
x_id = G(0,1)
If Burst = 0 and U(0,1) < PBurstStart then
  Burst = U(0, maxpower)
Else If Burst > 0 and U(0,1) < PBurstEnd then
  Burst = 0
End If
If Burst > 0 then x_id = x_id ^ Burst
x_id = Center + x_id * SD
[Figure panels: Griewank30, Sphere, Rosenbrock, Griewank10, f6, Rastrigin. Bubbled line is canonical PS.]
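The burst rule can be sketched as a small sampler that remembers whether a burst is in progress. This is a hedged reading of the slide's pseudocode: it assumes that during a burst the standard Gaussian draw is raised to the power Burst, and, because a negative number cannot be raised to a fractional power in real arithmetic, it keeps the sign and raises only the magnitude. The class name and the default values of `p_start`, `p_end`, and `maxpower` are placeholders, not values from the slide.

```python
import math
import random


class BurstSampler:
    """Bare-bones sampler for one dimension with occasional bursts of outliers."""

    def __init__(self, p_start=0.1, p_end=0.5, maxpower=3.0, seed=0):
        # p_start, p_end, maxpower are illustrative placeholder values
        self.p_start, self.p_end, self.maxpower = p_start, p_end, maxpower
        self.rng = random.Random(seed)
        self.burst = 0.0

    def sample(self, p_id, p_gd):
        center = (p_id + p_gd) / 2.0
        sd = abs(p_id - p_gd)
        g = self.rng.gauss(0.0, 1.0)
        if self.burst == 0 and self.rng.random() < self.p_start:
            self.burst = self.rng.uniform(0.0, self.maxpower)   # burst begins
        elif self.burst > 0 and self.rng.random() < self.p_end:
            self.burst = 0.0                                    # burst ends
        if self.burst > 0:
            # Assumption: keep the sign, raise the magnitude to the power Burst
            g = math.copysign(abs(g) ** self.burst, g)
        return center + g * sd


sampler = BurstSampler()
points = [sampler.sample(1.0, -1.0) for _ in range(1000)]
```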
23 The Box
Where the particle goes next depends on which way it was already going, the random numbers, and the sign and magnitude of the differences.
v_id = W*v_id + U(0, AC)*(p_id - x_id) + U(0, AC)*(p_gd - x_id)
x_id = x_id + v_id
The area where it can go is crisply bounded, but the probability inside the box is not uniformly dense.
24 Empirical distribution of means of random numbers from different ranges
Simulate with uniform RNG, trim tails
25 TUPS: Truncated-Uniform Particle Swarm
- Start at current position: x(t)
- Move weighted amount same direction: W1*(x(t) - x(t-1))
- Find midpoint of extremes, and difference between them, on each dimension (the sides of the box)
- Weight that, add it to where you are, call it the center
- Generate uniformly distributed random number around the center, range slightly less than the length of the side
- That's x(t+1)
26 TUPS: Truncated-Uniform Particle Swarm
x(t+1) = x(t) + W1*(x(t) - x(t-1)) + W2*((U(-1,1)*(width/2.5)) + center)
W1 = 0.729, W2 = 1.494
- Width is the difference between the highest and lowest (p - x)
- Center is width/2
- Generates a point less than Width/2 from the center
27 Binary Particle Swarms
S(x) = 1 / (1 + exp(-x))
v = [the usual]
if rand() < S(v) then x = 1 else x = 0
- Transform velocity with sigmoid function in (0 .. 1)
- Use it as a probability threshold
Though this is a radically different concept,
the principles of particle interaction still
apply (because the power is in the interactions).
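The binary rule can be sketched as follows: the velocity is updated exactly as in the real-valued swarm, and the position on each dimension becomes a coin flip whose probability of coming up 1 is S(v). The `vmax` clamp is an added assumption (it keeps the probability away from exactly 0 or 1 and avoids overflow in `exp`), and the function names are hypothetical.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def binary_update(x, v, p_i, p_g, rng, W=0.7298, AC=1.496, vmax=4.0):
    """One binary-swarm step: update v as usual, then draw each bit with P(1) = S(v)."""
    for d in range(len(x)):
        v[d] = (W * v[d]
                + rng.uniform(0, AC) * (p_i[d] - x[d])
                + rng.uniform(0, AC) * (p_g[d] - x[d]))
        v[d] = max(-vmax, min(vmax, v[d]))        # assumed clamp, not on the slide
        x[d] = 1 if rng.random() < sigmoid(v[d]) else 0
    return x, v


rng = random.Random(0)
bits, vel = binary_update([0, 1, 0, 1], [0.0] * 4, [1, 1, 0, 0], [1, 0, 0, 1], rng)
```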
28 FIPS
-- The fully-informed particle swarm (Rui Mendes)
v(t+1) = W1*v(t) + Σ_k( rand()*(W2/K)*(p_k - x(t)) )
x(t+1) = x(t) + v(t+1)
(K = number of neighbors, k = index of neighbor, W2 is a sum.)
- Note that pi is not a source of influence in FIPS.
- Doesn't select best neighbor.
- Orbits around the mean of neighborhood bests.
- This version is more dependent on topology.
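A FIPS update can be sketched by summing a randomly weighted pull toward every neighbor's previous best, each scaled by W2/K. The default W2 = 2.992 is an assumption: it reads "W2 is a sum" as the sum of the two canonical acceleration constants (2 x 1.496). The function name `fips_step` is hypothetical.

```python
import random


def fips_step(x, v, neighbor_bests, rng, W1=0.7298, W2=2.992):
    """One FIPS update; neighbor_bests is a list of K previous-best positions."""
    K = len(neighbor_bests)
    new_x, new_v = [], []
    for d in range(len(x)):
        # Every neighbor contributes; no best neighbor is selected
        influence = sum(rng.random() * (W2 / K) * (p_k[d] - x[d])
                        for p_k in neighbor_bests)
        vd = W1 * v[d] + influence
        new_v.append(vd)
        new_x.append(x[d] + vd)
    return new_x, new_v


rng = random.Random(0)
x1, v1 = fips_step([0.0, 0.0], [0.0, 0.0],
                   [[1.0, 2.0], [3.0, 2.0], [2.0, 2.0]], rng)
```

Because every neighbor contributes, changing the neighbor list changes the update directly, which is one way to see why this version depends more on topology.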
29 Deconstructing Velocity
Because x(t+1) = x(t) + v(t+1), we know that on the previous iteration, x(t) = x(t-1) + v(t).
So we can find v(t): v(t) = x(t) - x(t-1), and can substitute that into the formula, to put it all in one line:
x_id(t+1) = x_id(t) + W1*(x_id(t) - x_id(t-1))
            + rand()*W2*(p_id - x_id(t))
            + rand()*W2*(p_gd - x_id(t))
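The substitution can be checked numerically: given the same random numbers, the velocity form and the one-line position form should produce the same next position. A minimal sketch in one dimension; the values of `p_id`, `p_gd`, and the starting positions are arbitrary illustrations.

```python
import random

W1, W2 = 0.7298, 1.496
p_id, p_gd = 1.0, -2.0          # arbitrary previous bests for the check


def velocity_form(x, v, r1, r2):
    v_next = W1 * v + r1 * W2 * (p_id - x) + r2 * W2 * (p_gd - x)
    return x + v_next


def position_form(x, x_prev, r1, r2):
    return x + W1 * (x - x_prev) + r1 * W2 * (p_id - x) + r2 * W2 * (p_gd - x)


rng = random.Random(0)
x_prev, x = 0.5, 0.8
v = x - x_prev                  # v(t) = x(t) - x(t-1)
r1, r2 = rng.random(), rng.random()
a = velocity_form(x, v, r1, r2)
b = position_form(x, x_prev, r1, r2)
```

The two results agree up to floating-point rounding, so the velocity variable carries no information beyond the previous position.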
30 Generalization and Verbal Representation
We can generalize the canonical and FIPS versions:
x_id(t+1) = x_id(t) + W1*(x_id(t) - x_id(t-1)) + Σ_k( rand()*(W2/K)*(p_kd - x_id(t)) )
or in words:
NEW POSITION = CURRENT POSITION + PERSISTENCE + SOCIAL INFLUENCE
31 Social Influence
Social influence has two components: Central Tendency and Dispersion
NEW POSITION = CURRENT POSITION + PERSISTENCE + SOCIAL CENTRAL TENDENCY + SOCIAL DISPERSION
Hmmm, this gives us something to play with!
32 Gaussian Essential Particle Swarm
Note that only the last term has randomness in it; the rest is deterministic.
meanp = (p_id + p_gd)/2
disp = abs(p_id - p_gd)/2
x_id(t+1) = x_id(t) + W1*(x_id(t) - x_id(t-1)) + W2*(meanp - x_id) + G(0,1)*disp
NEW POSITION = CURRENT POSITION + PERSISTENCE + SOCIAL CENTRAL TENDENCY + SOCIAL DISPERSION
G(0,1) is a Gaussian RNG
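The three terms of the Gaussian essential update can be sketched directly. The slide gives no values for W1 and W2 here, so the defaults below borrow W1 = 0.729 and W2 = 1.494 from the TUPS slide as an assumption; the function name is hypothetical.

```python
import random


def gaussian_essential_step(x, x_prev, p_i, p_g, rng, W1=0.729, W2=1.494):
    """New position = current + persistence + central tendency + dispersion."""
    new_x = []
    for d in range(len(x)):
        meanp = (p_i[d] + p_g[d]) / 2.0
        disp = abs(p_i[d] - p_g[d]) / 2.0
        persistence = W1 * (x[d] - x_prev[d])         # deterministic
        central = W2 * (meanp - x[d])                 # deterministic
        dispersion = rng.gauss(0.0, 1.0) * disp       # the only random term
        new_x.append(x[d] + persistence + central + dispersion)
    return new_x


rng = random.Random(0)
x_next = gaussian_essential_step([0.0, 1.0], [0.2, 1.0], [1.0, 1.0], [-1.0, 1.0], rng)
```

On the second dimension of the example, the particle, its previous position, and both bests coincide, so every term is zero and the particle stays put.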
33 Gaussian Essential Particle Swarm
x_id(t+1) = x_id(t) + W1*(x_id(t) - x_id(t-1)) + W2*(meanp - x_id) + G(0,1)*disp
Function     Trials   Canonical   Gaussian
F6           20       0.0015      13E-10
GRIEWANK     20       0.0086      0.0103
GRIEWANK10   20       0.0508      0.045
RASTRIGIN    20       56.862      49.151
ROSENBROCK   20       41.836      44.197
SPHERE       20       88E-16      38E-24
34 Various Probability Distributions
Function     Trials   Canonical   Gaussian   Triangular   Double-Exponential   Cauchy
F6           20       0.0015      13E-10     0.0057       0.001                0.0029
GRIEWANK     20       0.0086      0.0103     0.0275       0.0149               0.0253
GRIEWANK10   20       0.0508      0.045      0.0694       0.0541               0.0768
RASTRIGIN    20       56.862      49.151     140.94       47.26                33.829
ROSENBROCK   20       41.836      44.197     67.894       41.308               42.054
SPHERE       20       88E-16      38E-24     1.70E-18     2.40E-22             1.60E-17
There is clearly room for exploration here.
35 Gaussian FIPS
Fully-informed: uses all neighbors
FIPScenter = mean of (p_kd - x_id)
FIPSrange = mean of abs(p_id - p_kd)
x_id(t+1) = x_id(t) + W1*(x_id(t) - x_id(t-1)) + W2*FIPScenter + G(0,1)*(FIPSrange/2)
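The Gaussian FIPS update can be sketched by averaging over all K neighbors for both the center and the range. The default weights W1 = 0.729 and W2 = 1.494 are borrowed from the TUPS slide as an assumption, since none are given here, and the function name is hypothetical.

```python
import random


def gaussian_fips_step(x, x_prev, p_i, neighbor_bests, rng, W1=0.729, W2=1.494):
    """One Gaussian FIPS update using every neighbor's previous best."""
    K = len(neighbor_bests)
    new_x = []
    for d in range(len(x)):
        # FIPScenter: mean offset from the particle to its neighbors' bests
        center = sum(p_k[d] - x[d] for p_k in neighbor_bests) / K
        # FIPSrange: mean distance from the particle's own best to theirs
        spread = sum(abs(p_i[d] - p_k[d]) for p_k in neighbor_bests) / K
        new_x.append(x[d] + W1 * (x[d] - x_prev[d]) + W2 * center
                     + rng.gauss(0.0, 1.0) * (spread / 2.0))
    return new_x


rng = random.Random(0)
x_next = gaussian_fips_step([0.0], [0.1], [0.5], [[1.0], [-1.0], [0.5]], rng)
```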
36 Gaussian FIPS
Gaussian FIPS compared to Canonical PSO, square topology.
Function     Trials   Canonical   Gaussian FIPS
F6           20       0.0015      0.001
GRIEWANK     20       0.0086      0.0007
GRIEWANK10   20       0.0508      0.0215
RASTRIGIN    20       56.862      36.858
ROSENBROCK   20       41.836      40.365
SPHERE       20       88E-16      41E-29
37 Gaussian FIPS vs. Canonical PSO
t(38), alpha = 0.05
Modified Bonferroni correction
func         t      p-value   rank   inv. rank   New alpha   Sig.
SPHERE       3.17   0.0030    1      6           0.008333
GRIEWANK10   2.99   0.0048    2      5           0.010000
GRIEWANK     2.64   0.0118    3      4           0.012500
RASTRIGIN    2.21   0.0333    4      3           0.016667    .
F6           0.45   0.6583    5      2           0.025000    .
ROSENBROCK   0.15   0.8782    6      1           0.050000    .
Ref: Jaccard, J., & Wan, C. K. (1996). LISREL approaches to interaction effects in multiple regression. Thousand Oaks, CA: Sage Publications.
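The "New alpha" column follows from dividing alpha by the inverse rank of each p-value. The sketch below recomputes it from the p-values in the table. It assumes a step-down rule in which testing stops at the first non-significant result; the slide says only "Modified Bonferroni correction", so that stopping rule is an assumption, and the function name is hypothetical.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return {name: (new_alpha, significant)} using alpha / inverse rank."""
    ordered = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(ordered)
    results, still_rejecting = {}, True
    for rank, (name, p) in enumerate(ordered, start=1):
        new_alpha = alpha / (m - rank + 1)       # inverse rank = m - rank + 1
        # Assumed step-down rule: once one test fails, the rest fail too
        still_rejecting = still_rejecting and p < new_alpha
        results[name] = (new_alpha, still_rejecting)
    return results


p_values = {"SPHERE": 0.0030, "GRIEWANK10": 0.0048, "GRIEWANK": 0.0118,
            "RASTRIGIN": 0.0333, "F6": 0.6583, "ROSENBROCK": 0.8782}
results = sequential_bonferroni(p_values)
for name, (new_alpha, sig) in results.items():
    print(f"{name:<11} {new_alpha:.6f} {'significant' if sig else 'n.s.'}")
```

Under this rule the first three functions fall below their adjusted thresholds and the last three do not, which matches the three rows marked "." in the table.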
38 Mendes: Two Measures of Performance
Color and shape indicate parameters of the social network: degree, clustering, etc.
39 Best Topologies
FIPS versions
Best-neighbor versions
40 Worst FIPS Sociometries and Proportions Successful
41 Understanding the Particle Swarm
- Lots of variations in particle movement formulas
- Teachers and learners
- Propagation of knowledge is central
- Interaction method and social network topology interact
- It's simple, but difficult to understand
42 In Sum
- There is some fundamental process
  - Uses information from neighbors
  - Seems to require balance between persistence and influence
- Decomposing a version we know is OK
  - We can understand it
  - We can improve it
- Particle Trajectories
  - Arbitrary? -- not quite
  - Can be replaced by RNG (trajectory is not the essence)
- How it works
  - It works by sharing successes among individuals
  - Need to look more closely at the sharing phenomenon itself
43 Send me a note: Jim Kennedy, Kennedy.Jim_at_gmail.com