Title: Ferromagnetic Clustering
1. Ferromagnetic Clustering
- Data clustering using a model granular magnet
2. Introduction
- In recent years there has been significant interest in adapting numerical as well as analytic techniques from statistical physics to provide algorithms and estimates for good approximate solutions to hard optimization problems.
- Cluster analysis is an important technique in exploratory data analysis.
3. Introduction
4. Potts Model
- The Potts model was introduced as a generalization of the Ising model. The idea came from the representation of the Ising model as interacting spins which can be either parallel or antiparallel. An obvious generalization was to extend the number of directions of the spins. Such a model was proposed by C. Domb as a PhD thesis topic for his student R. Potts in 1952.
- The original model considered an interaction of spins which point to one of s equally spaced directions in a plane, defined by the angles $\theta_n = 2\pi n / s$, $n = 0, 1, \ldots, s-1$, of the spin at each site.
- This model, now known as the vector Potts model or the clock model, was solved in two dimensions by Potts for s = 1, 2, 3, 4.
5. Visualizations of Spin Models of Magnetism
- Spin models provide simple models of magnetism, and in particular of the transition between magnetic and non-magnetic phases as a magnetic material is heated past its critical temperature. In a spin model of magnetism, variables representing magnetic spins are associated with sites in a regular crystalline lattice.
- Different spin models are specified by different types of spin variables, interactions between the spins, and lattice geometry.
- The Ising model is the simplest spin model, where the spins have only two states (+1 and -1), representing magnetic spins that are up or down.
6. Ferromagnetic Ising Model
- A configuration of the standard ferromagnetic Ising model with two spin states (represented by black and white) at the critical temperature, where there is a phase transition between the low-temperature ordered or magnetic phase (where the spins align) and the high-temperature disordered or non-magnetic phase (where the spins are more randomized).
- As the temperature increases from zero (where all the spins are the same), "domains" of opposite spin begin to appear and start to disorder the system.
- At the phase transition, domains or clusters of spins appear in all shapes and sizes.
- As the temperature increases further, the domains start to break up into random individual spins.
7. Introduction
- We present a new approach to clustering, based on the physical properties of an inhomogeneous ferromagnet.
- No assumption is made regarding the underlying distribution of the data.
8. Introduction
- We assign a Potts spin to each data point and introduce an interaction between neighboring points, whose strength is a decreasing function of the distance between the neighbors.
- This magnetic system exhibits three phases.
- At very low temperatures it is completely ordered, i.e., all spins are aligned.
- At very high temperatures the system does not exhibit any ordering.
- In an intermediate regime, clusters of relatively strongly coupled spins become ordered, whereas different clusters remain uncorrelated.
- The spin-spin correlation function (measured by Monte Carlo) is used to partition the spins, and the corresponding data points, into clusters.
9. Some Physics Background: Potts Model
- The energy of a configuration $S = (s_1, \ldots, s_N)$, with $s_i \in \{1, \ldots, q\}$, is given by the Hamiltonian

  $$H(S) = \sum_{\langle i,j \rangle} J_{ij} \left( 1 - \delta_{s_i, s_j} \right) \qquad (1)$$

- The interactions $J_{ij} \geq 0$ are a decreasing function of the distance $d_{ij}$: the closer two points are to each other, the more they "like" to be in the same state.
10. Some Physics Background: Potts Model
- In order to calculate the thermodynamic average of a physical quantity A at a fixed temperature T, one has to calculate the sum

  $$\langle A \rangle = \sum_{S} A(S)\, P(S) \qquad (2)$$

- where the Boltzmann factor

  $$P(S) = \frac{1}{Z} \exp\!\left( -\frac{H(S)}{T} \right) \qquad (3)$$

- plays the role of the probability density which gives the statistical weight of each spin configuration S in thermal equilibrium, and Z is a normalization constant (the partition function),

  $$Z = \sum_{S} \exp\!\left( -\frac{H(S)}{T} \right)$$
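For intuition, the sums (2)-(3) can be evaluated exactly for a toy system by enumerating all q^N configurations; the following brute-force sketch (illustrative names, feasible only for very small N) is precisely what the Monte Carlo method of slide 15 replaces:

```python
import itertools
import numpy as np

def thermal_average(observable, energy, N, q, T):
    """Evaluate <A> = sum_S A(S) P(S) with P(S) = exp(-H(S)/T) / Z, eqs. (2)-(3).

    Feasible only for tiny systems: the sum runs over all q**N configurations.
    observable, energy : callables mapping a configuration array to a float
    """
    values, weights = [], []
    for config in itertools.product(range(q), repeat=N):
        s = np.array(config)
        values.append(observable(s))
        weights.append(np.exp(-energy(s) / T))  # Boltzmann factor
    weights = np.array(weights)
    return np.dot(values, weights) / weights.sum()  # weights.sum() is Z
```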
11. Potts Model
- The order parameter of the system is $\langle m \rangle$, where the magnetization m(S) associated with a spin configuration S is defined (Chen, Ferrenberg and Landau 1992) as

  $$m(S) = \frac{q\, N_{\max}(S) - N}{(q - 1)\, N} \qquad (4)$$

- with

  $$N_{\max}(S) = \max\{ N_1(S), \ldots, N_q(S) \}$$

- where $N_\mu(S)$ is the number of spins with the value $\mu$.
- The thermal average of $\delta_{s_i, s_j}$ is called the spin-spin correlation function,

  $$G_{ij} = \langle \delta_{s_i, s_j} \rangle \qquad (5)$$

- which is the probability of the two spins $s_i$ and $s_j$ being aligned.
12. Potts Model
- A Potts system is homogeneous when the spins are on a lattice and all nearest-neighbor couplings are equal.
- At high temperatures the system is paramagnetic or disordered: $\langle m \rangle \approx 0$, indicating that $N_{\max}(S) \approx N/q$ for all statistically significant configurations.
- As the temperature is lowered, the system undergoes a sharp transition to an ordered, ferromagnetic phase: the magnetization jumps to a nonzero value.
- At very low temperatures $\langle m \rangle \approx 1$ and $G_{ij} \approx 1$.
- The variance of the magnetization is related to a relevant thermal quantity, the susceptibility,

  $$\chi = \frac{N}{T} \left( \langle m^2 \rangle - \langle m \rangle^2 \right) \qquad (6)$$
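A small sketch of sample-based estimators for (4) and (6); the array conventions are assumptions of this illustration:

```python
import numpy as np

def magnetization(spins, q):
    """m(S) = (q * N_max - N) / ((q - 1) * N), eq. (4)."""
    N = spins.size
    n_max = np.bincount(spins, minlength=q).max()  # N_max(S)
    return (q * n_max - N) / ((q - 1) * N)

def susceptibility(m_samples, N, T):
    """chi = (N/T) * (<m^2> - <m>^2), eq. (6), from Monte Carlo samples of m."""
    m = np.asarray(m_samples)
    return (N / T) * (np.mean(m**2) - np.mean(m)**2)
```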
13. Potts Model
- Strongly inhomogeneous Potts models: the spins form magnetic grains, with very strong couplings between neighbors that belong to the same grain, and very weak interactions between all other pairs.
- At low temperatures such a system is also ferromagnetic.
- As the temperature is raised, the system may exhibit an intermediate, super-paramagnetic phase, in which strongly coupled grains are aligned internally, while there is no relative ordering of different grains.
14. Potts Model
- There can be a sequence of several transitions within the super-paramagnetic phase: as the temperature is raised, the system may first break into two clusters, each of which breaks into more sub-clusters, and so on.
- Such a hierarchical structure of the magnetic clusters reflects a hierarchical organization of the data into categories and sub-categories.
- Working in the super-paramagnetic phase of the model, we use the values of the pair correlation function of the Potts spins to decide whether a pair of spins do or do not belong to the same grain, and we identify these grains as the clusters of our data.
15. Monte Carlo Simulation of Potts Models: the Swendsen-Wang Method
- The aim is to evaluate sums such as (2) for models with $N \gg 1$ spins.
- Direct evaluation of sums like (2) is impractical, since the number of configurations S increases exponentially with the system size N.
- Monte Carlo simulation methods overcome this problem by generating a characteristic subset of configurations which are used as a statistical sample.
- A set of spin configurations $\{S_n\}_{n=1}^{M}$ is generated according to the Boltzmann probability distribution (3). Then expression (2) is reduced to a simple arithmetic average,

  $$\langle A \rangle \approx \frac{1}{M} \sum_{n=1}^{M} A(S_n) \qquad (7)$$

- where the number of configurations in the sample, M, is much smaller than $q^N$, the total number of configurations.
16. Monte Carlo Simulation of Potts Models: the Swendsen-Wang Method
- 1. First visit all pairs of spins $\langle i, j \rangle$ that interact, i.e., have $J_{ij} > 0$. Two spins are frozen together with probability $p^f_{i,j}$: if in our current configuration $S_n$ the two spins are in the same state, $s_i = s_j$, then sites i and j are frozen with probability

  $$p^f_{i,j} = 1 - \exp\!\left( -\frac{J_{ij}}{T} \right) \qquad (8)$$

- 2. Identify the SW-clusters of spins: an SW-cluster contains all spins which have a path of frozen bonds connecting them. According to (8), only spins of the same value can be frozen into the same SW-cluster.
17. Monte Carlo Simulation of Potts Models: the Swendsen-Wang Method
- The final step of the procedure is the generation of a new spin configuration $S_{n+1}$, by drawing, independently for each SW-cluster, a random value $s \in \{1, \ldots, q\}$, which is assigned to all its spins.
- The physical quantities that we are interested in are the magnetization (4) and its square value (for the calculation of the susceptibility $\chi$), and the spin-spin correlation function (5).
- At temperatures where large regions of correlated spins occur, local methods (such as Metropolis), which flip one spin at a time, become very slow. The SW procedure overcomes this difficulty by flipping large clusters of aligned spins simultaneously.
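Putting slides 16-17 together, a self-contained sketch of one Swendsen-Wang sweep; the graph layout matches the earlier energy sketch, and scipy's connected-components routine stands in for the frozen-bond cluster search:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def sw_sweep(spins, pairs, couplings, T, q, rng):
    """One Swendsen-Wang update: freeze bonds, find SW-clusters, redraw spins.

    spins     : (N,) int array in {0, ..., q-1}
    pairs     : (P, 2) int array of interacting pairs <i, j>
    couplings : (P,) float array of J_ij > 0
    Returns the new configuration and the SW-cluster label of each site.
    """
    N = spins.size
    i, j = pairs[:, 0], pairs[:, 1]
    # Eq. (8): an aligned pair is frozen with probability 1 - exp(-J_ij / T).
    p_freeze = 1.0 - np.exp(-couplings / T)
    frozen = (spins[i] == spins[j]) & (rng.random(couplings.size) < p_freeze)
    # SW-clusters = connected components of the frozen-bond graph.
    adj = csr_matrix((np.ones(frozen.sum()), (i[frozen], j[frozen])), shape=(N, N))
    n_clusters, labels = connected_components(adj, directed=False)
    # Assign one independent random Potts value per SW-cluster.
    new_values = rng.integers(0, q, size=n_clusters)
    return new_values[labels], labels
```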
18. Clustering of Data
- Description of the Algorithm
19. Clustering of Data: Description of the Algorithm
- Assume that our data consists of N patterns or measurements, specified by N corresponding vectors $v_i$, embedded in a D-dimensional metric space.
- The method consists of three stages.
20. Clustering of Data: Description of the Algorithm
- 1. Construct the physical analog Potts-spin problem:
- (a) Associate a Potts spin variable $s_i \in \{1, \ldots, q\}$ with each point $v_i$.
- (b) Identify the neighbors of each point according to a selected criterion.
- (c) Calculate the interaction $J_{ij}$ between neighboring points $v_i$ and $v_j$.
21. Clustering of Data: Description of the Algorithm
- 2. Locate the super-paramagnetic phase:
- (a) Estimate the (thermal) average magnetization, $\langle m \rangle$, for different temperatures.
- (b) Use the susceptibility $\chi$ to identify the super-paramagnetic phase.
22. Clustering of Data: Description of the Algorithm
- 3. In the super-paramagnetic regime:
- (a) Measure the spin-spin correlation, $G_{ij}$, for all neighboring points $v_i$, $v_j$.
- (b) Construct the data-clusters.
23. Description of the Algorithm: Construction of the Physical Analog Potts-Spin Problem
- 1.a The value of q does not imply any assumption about the number of clusters present in the data. q determines mainly the sharpness of the transitions and the temperatures at which they occur.
- 1.b Since the data do not form a regular lattice, one has to supply some reasonable definition for neighbors. We use Delaunay triangulation over other graph structures when the patterns are embedded in a low-dimensional ($D \leq 3$) space. When $D > 3$: $v_i$ and $v_j$ have a mutual neighborhood value K if and only if $v_i$ is one of the K nearest neighbors of $v_j$ and $v_j$ is one of the K nearest neighbors of $v_i$ (see the sketch below).
24. Description of the Algorithm: Construction of the Physical Analog Potts-Spin Problem
- 1.c We need strong interactions between spins that correspond to data from a high-density region, and weak interactions between neighbors that are in low-density regions:

  $$J_{ij} = \frac{1}{\hat{K}} \exp\!\left( -\frac{d_{ij}^2}{2 a^2} \right)$$ for neighboring $v_i$, $v_j$, and $J_{ij} = 0$ otherwise.

- a is the average of all distances $d_{ij}$ between neighboring pairs $v_i$ and $v_j$.
- $\hat{K}$ is the average number of neighbors.
- This normalization of the interaction strength enables us to estimate the temperature corresponding to the highest super-paramagnetic transition.
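A sketch of the coupling computation under the formula above; taking $\hat{K} = 2P/N$ for P neighboring pairs is an assumption of this illustration:

```python
import numpy as np

def couplings(points, pairs):
    """J_ij = (1 / K_hat) * exp(-d_ij**2 / (2 * a**2)) for each neighboring pair.

    a     : average distance between neighboring pairs
    K_hat : average number of neighbors per point (2P / N for P pairs)
    """
    N = points.shape[0]
    d = np.linalg.norm(points[pairs[:, 0]] - points[pairs[:, 1]], axis=1)
    a = d.mean()
    K_hat = 2 * len(pairs) / N
    return np.exp(-d**2 / (2 * a**2)) / K_hat
```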
25. Description of the Algorithm: Locating Super-Paramagnetic Regions
- The various temperature intervals in which the system self-organizes into different partitions of the data into clusters are identified by measuring the susceptibility $\chi$ as a function of temperature.
- Starting from the estimate of the highest transition temperature, one can take increasingly refined temperature scans and calculate the function $\chi(T)$ by Monte Carlo simulation.
26. Description of the Algorithm: Locating Super-Paramagnetic Regions
- 1. Choose the number of iterations M to be performed.
- 2. Generate the initial configuration by assigning a random value to each spin.
- 3. Assign a frozen bond between nearest-neighbor points $v_i$ and $v_j$ with probability $p^f_{i,j} = 1 - \exp(-J_{ij}/T)$ if $s_i = s_j$, as in (8), and never otherwise.
- 4. Find the connected subgraphs, the SW-clusters.
- 5. Assign new random values to the spins (spins that belong to the same SW-cluster are assigned the same value). This is the new configuration of the system.
27. Description of the Algorithm: Locating Super-Paramagnetic Regions
- 6. Calculate the value assumed by the physical quantities of interest in the new spin configuration.
- 7. Go to step 3, unless the maximal number of iterations, M, was reached.
- 8. Calculate the averages (7). A sketch of this temperature-scan loop follows below.
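Combining steps 1-8 with the sw_sweep, magnetization and susceptibility sketches above gives a hedged driver for the susceptibility scan (no equilibration/burn-in phase is shown, for brevity):

```python
import numpy as np

def chi_of_T(points, pairs, J, temperatures, q=20, M=500, seed=0):
    """Estimate the susceptibility chi(T) at each temperature (steps 1-8)."""
    rng = np.random.default_rng(seed)
    N = points.shape[0]
    chi = []
    for T in temperatures:
        spins = rng.integers(0, q, size=N)          # step 2: random start
        m_samples = []
        for _ in range(M):                          # steps 3-7: M SW iterations
            spins, _ = sw_sweep(spins, pairs, J, T, q, rng)
            m_samples.append(magnetization(spins, q))
        chi.append(susceptibility(m_samples, N, T)) # step 8, via eqs. (4), (6)
    return np.array(chi)
```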
28. Description of the Algorithm: Locating Super-Paramagnetic Regions
- We measure the susceptibility $\chi$ at different temperatures in order to locate these different regimes.
- The aim is to identify the temperatures at which the system changes its structure.
29. Description of the Algorithm: Identifying Data Clusters
- Select one temperature in each region of interest.
- Each sub-phase characterizes a particular type of partition of the data, with new clusters merging or breaking.
30. Description of the Algorithm: Identifying Data Clusters
- The Swendsen-Wang method provides an improved estimator of the spin-spin correlation function.
- It calculates the two-point connectedness $C_{ij}$, the probability that sites $v_i$ and $v_j$ belong to the same SW-cluster, which is estimated by the average (7) of the following indicator function:

  $$c_{ij} = \begin{cases} 1 & \text{if } v_i \text{ and } v_j \text{ belong to the same SW-cluster} \\ 0 & \text{otherwise} \end{cases}$$

- $C_{ij} = \langle c_{ij} \rangle$ is the probability of finding sites $v_i$ and $v_j$ in the same SW-cluster.
- Then the spin-spin correlation function is

  $$G_{ij} = \frac{(q - 1)\, C_{ij} + 1}{q}$$
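A sketch of this improved estimator, accumulating the indicator c_ij over the SW-cluster labels returned by the sw_sweep sketch above:

```python
import numpy as np

def correlation_estimate(label_samples, q):
    """Estimate G_ij from M samples of SW-cluster labels.

    label_samples : list of (N,) int arrays, one per SW sweep
    Returns the (N, N) matrix G of spin-spin correlations.
    """
    M = len(label_samples)
    N = label_samples[0].size
    C = np.zeros((N, N))
    for labels in label_samples:
        # c_ij = 1 exactly when i and j carry the same SW-cluster label.
        C += (labels[:, None] == labels[None, :])
    C /= M                        # C_ij = <c_ij>, the two-point connectedness
    return ((q - 1) * C + 1) / q  # convert connectedness to G_ij
```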
31. Description of the Algorithm: Identifying Data Clusters
- 1. Build the cluster cores: if $G_{ij} > 0.5$, a link is set between the neighboring data points $v_i$ and $v_j$.
- 2. Capture points lying on the periphery of the clusters by linking each point $v_i$ to its neighbor of maximal correlation $G_{ij}$.
- 3. Data clusters are identified as the linked components of the graph obtained in steps 1-2 (see the sketch below).
32. Applications
33. The Iris Data
- It consists of measurements of four quantities, performed on each of 150 flowers. The specimens were chosen from three species of Iris. The data constitute 150 points in four-dimensional space.
34. The Iris Data
- We determined neighbors in the D = 4 dimensional space according to the mutual K nearest neighbors definition, with K = 5.
- We observe that there is a well-separated cluster (corresponding to the Iris Setosa species), while the clusters corresponding to Iris Virginica and Iris Versicolor do overlap.
- We applied the SPC (super-paramagnetic clustering) method and obtained the susceptibility curve.
(Figure: Projection of the iris data on the plane spanned by its two principal components.)
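For a reproducible starting point, a short sketch of this setup using scikit-learn for the data and the PCA projection, together with the mutual-KNN and coupling sketches above (scikit-learn is my addition; it is not part of the original work):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

iris = load_iris()
X = iris.data                       # 150 points in D = 4 dimensions

# Neighbor graph: mutual K-nearest neighbors with K = 5, as in the slide.
pairs = mutual_knn_pairs(X, K=5)
J = couplings(X, pairs)

# Projection onto the first two principal components (for the figure).
X2 = PCA(n_components=2).fit_transform(X)
```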
35. The Iris Data
- The susceptibility curve of Fig. (a) clearly shows two peaks.
- When heated, the system first breaks into two clusters at T = 0.01 (Fig. b).
- At T = 0.02 we obtain two clusters, of sizes 80 and 40.
- At T = 0.06 another transition occurs, where the larger cluster splits in two.
- At T = 0.07 we identified clusters of sizes 45, 40 and 38, corresponding to the species Iris Versicolor, Iris Virginica and Iris Setosa respectively.
36. The Iris Data
- The Iris data breaks into clusters in two stages.
- This reflects the fact that two of the three species are closer to each other than to the third one.
- The SPC method clearly handles such hierarchical organization of the data very well.
37. The Iris Data
- 125 samples were classified correctly (as compared with manual classification).
- 25 were left unclassified.
- No further breaking of clusters was observed.
- All three clusters disorder at T = 0.08.
38. The Iris Data
- Among all the clustering algorithms applied to this example, the minimal spanning tree procedure obtained the most accurate result, followed by the SPC method, while the remaining clustering techniques failed to provide a satisfactory result.
39. The Iris Data
40. Advantages of the SPC Algorithm vs. Other Methods
41. Other Methods and Their Disadvantages
- These methods employ a local criterion, against which some attribute of the local structure of the data is tested, to construct the clusters.
- Typical examples are hierarchical techniques such as the agglomerative and divisive methods.
- All these algorithms tend to create clusters even when no natural clusters exist in the data.
42. Other Methods and Their Disadvantages
- (a) high sensitivity to initialization
- (b) poor performance when the data contains overlapping clusters
- (c) inability to handle variability in cluster shapes, cluster densities and cluster sizes
- (d) none of these methods provides an index that could be used to determine the most significant partitions among those obtained in the entire hierarchy
43. Advantages of the Method
- provides information about the different self-organizing regimes of the data
- the number of macroscopic clusters is an output of the algorithm
- the hierarchical organization of the data is reflected in the manner the clusters merge or split when a control parameter (the physical temperature) is varied
44. Advantages of the Method
- the results are completely insensitive to the initial conditions
- the algorithm is robust against the presence of noise
- the algorithm is computationally efficient: the equilibration time of the spin system scales with N, the number of data points, and is independent of the embedding dimension D