1
Ferromagnetic Clustering
  • Data clustering using a model granular magnet

2
Introduction
  • In recent years there has been significant
    interest in adapting numerical as well as
    analytic techniques from statistical physics to
    provide algorithms and estimates for good
    approximate solutions to hard optimization
    problems.
  • Cluster analysis is an important technique in
    exploratory data analysis.

3
Introduction
4
Potts Model
  • The Potts model was introduced as a
    generalization of the Ising model. The idea came
    from the representation of the Ising model as
    interacting spins which can be either parallel or
    antiparallel. An obvious generalization was to
    extend the number of directions of the spins.
    Such a model was proposed by C. Domb as a PhD
    thesis topic for his student R. Potts in 1952.
  • The original model considered an interaction of
    spins which point to one of s equally spaced
    directions in a plane, defined by the angles
    θ_n = 2πn/s of the spin at each site.
  • This model, now known as the vector Potts model
    or the clock model, was solved in 2 dimensions by
    Potts for s = 2, 3, 4.

5
Visualizations of Spin Models of Magnetism
  • Spin models provide simple models of magnetism,
    and in particular of the transition between
    magnetic and non-magnetic phases as a magnetic
    material is heated past its critical temperature.
    In a spin model of magnetism, variables
    representing magnetic spins are associated with
    sites in a regular crystalline lattice.
  • Different spin models are specified by different
    types of spin variables, interactions between the
    spins, and lattice geometry.
  • The Ising model is the simplest spin model, where
    the spins have only two states (1 and -1),
    representing magnetic spins that are up or down.

6
Ferromagnetic Ising model
  • A configuration of the standard ferromagnetic
    Ising model with two spin states (represented by
    black and white) at the critical temperature,
    where there is a phase transition between the low
    temperature ordered or magnetic phase (where the
    spins align) and the high temperature disordered
    or non-magnetic phase (where the spins are more
    randomized).
  • As the temperature increases from zero (where
    all the spins are the same), "domains" of
    opposite spin begin to appear and start to
    disorder the system.
  • At the phase transition, domains or clusters of
    spins appear in all shapes and sizes.
  • As the temperature increases further, the domains
    start to break up into random individual spins.

7
Introduction
  • We present a new approach to clustering, based on
    the physical properties of an inhomogeneous
    ferromagnet.
  • No assumption is made regarding the underlying
    distribution of the data.

8
Introduction
  • We assign a Potts spin to each data point and
    introduce an interaction between
    neighboring points, whose strength is a
    decreasing function of the distance between the
    neighbors.
  • This magnetic system exhibits three phases.
  • At very low temperatures it is completely
    ordered i.e. all spins are aligned.
  • At very high temperatures the system does not
    exhibit any ordering.
  • In an intermediate regime clusters of relatively
    strongly coupled spins become ordered, whereas
    different clusters remain uncorrelated.
  • The spin-spin correlation function (measured by
    Monte Carlo) is used to partition the spins and
    the corresponding data points into clusters.

9
Some Physics Background: Potts Model
  • The energy of a spin configuration S = (s_1, ..., s_N),
    with s_i ∈ {1, ..., q}, is given by the Hamiltonian
    H(S) = \sum_{\langle i,j \rangle} J_{ij} (1 - \delta_{s_i, s_j})    (1)
  • The interactions J_{ij} are a decreasing function of the
    distance d_{ij}: the closer two points are to each other,
    the more they "like" to be in the same state.
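The Hamiltonian (1) can be sketched in a few lines of Python; the point indices, spin values, and coupling strengths below are invented for illustration:

```python
def potts_energy(spins, couplings):
    """H(S) = sum over interacting pairs <i,j> of J_ij * (1 - delta(s_i, s_j))."""
    # couplings maps a pair (i, j) to its interaction strength J_ij;
    # a bond costs J_ij only when the two spins disagree.
    return sum(J * (spins[i] != spins[j]) for (i, j), J in couplings.items())

# Two strongly coupled points and one weakly coupled outlier (toy values).
J = {(0, 1): 1.0, (1, 2): 0.1}
print(potts_energy([1, 1, 2], J))  # only the weak bond is unsatisfied -> 0.1
```

Note that only disagreeing pairs contribute, so a fully aligned configuration has zero energy regardless of the couplings.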

10
Some Physics Background: Potts Model
  • In order to calculate the thermodynamic average
    of a physical quantity A at a fixed temperature
    T, one has to calculate the sum
    \langle A \rangle = \sum_S A(S) P(S)    (2)
  • where the Boltzmann factor
    P(S) = \frac{1}{Z} e^{-H(S)/T}    (3)
  • plays the role of the probability density which gives the
    statistical weight of each spin configuration S in thermal
    equilibrium, and Z = \sum_S e^{-H(S)/T} is a normalization
    constant (the partition function).
11
Potts Model
  • The order parameter of the system is \langle m \rangle, where
    the magnetization, m(S), associated with a spin
    configuration S is defined (Chen, Ferrenberg and
    Landau 1992) as
    m(S) = \frac{q N_{max}(S) - N}{(q - 1) N}    (4)
  • with N_{max}(S) = max{N_1(S), ..., N_q(S)}, where N_\mu(S)
    is the number of spins with the value \mu.
  • The thermal average of \delta_{s_i, s_j} is called
    the spin-spin correlation function,
    G_{ij} = \langle \delta_{s_i, s_j} \rangle    (5)
  • which is the probability of the two spins s_i
    and s_j being aligned.
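Definition (4) is directly computable from a spin configuration. A minimal sketch (the configurations below are illustrative):

```python
from collections import Counter

def magnetization(spins, q):
    """m(S) = (q * N_max - N) / ((q - 1) * N), eq. (4)."""
    N = len(spins)
    N_max = max(Counter(spins).values())  # size of the largest spin population
    return (q * N_max - N) / ((q - 1) * N)

print(magnetization([1, 1, 1, 1], q=4))  # fully ordered -> 1.0
print(magnetization([1, 2, 3, 4], q=4))  # all values equally represented -> 0.0
```

A fully ordered configuration gives m = 1, while a configuration where every spin value is equally populated gives m = 0.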

12
Potts Model
  • A Potts system is homogeneous when the spins are on
    a lattice and all nearest-neighbor couplings are
    equal.
  • At high temperatures the system is paramagnetic
    or disordered: \langle m \rangle ≈ 0, indicating that
    G_{ij} ≈ 1/q for all statistically significant
    configurations.
  • As the temperature is lowered, the system
    undergoes a sharp transition to an ordered,
    ferromagnetic phase; the magnetization jumps to
    \langle m \rangle ≠ 0.
  • At very low temperatures \langle m \rangle → 1 and
    G_{ij} → 1.
  • The variance of the magnetization is related to a
    relevant thermal quantity, the susceptibility,
    \chi = \frac{N}{T} (\langle m^2 \rangle - \langle m \rangle^2)    (6)
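The susceptibility estimator is just the sample variance of the magnetization scaled by N/T. A sketch, assuming m has already been measured on a set of Monte Carlo configurations:

```python
def susceptibility(m_samples, N, T):
    """chi = (N / T) * (<m^2> - <m>^2), estimated from samples of m."""
    M = len(m_samples)
    mean = sum(m_samples) / M
    mean_sq = sum(m * m for m in m_samples) / M
    return (N / T) * (mean_sq - mean ** 2)

# No fluctuations in m -> chi = 0 (toy numbers).
print(susceptibility([0.5, 0.5, 0.5], N=100, T=0.1))  # 0.0
```

Peaks of this quantity as a function of T are what the algorithm later uses to locate the phase transitions.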

13
Potts Model
  • Strongly inhomogeneous Potts models - spins form
    magnetic grains, with very strong couplings
    between neighbors that belong to the same grain,
    and very weak interactions between all other
    pairs.
  • At low temperatures such a system is also
    ferromagnetic.
  • At intermediate temperatures the system may exhibit
    a super-paramagnetic phase, in which strongly coupled
    grains are aligned, while there is no relative
    ordering of different grains.

14
Potts Model
  • There can be a sequence of several transitions
    within the super-paramagnetic phase: as the
    temperature is raised, the system may break
    first into two clusters, each of which breaks
    into more sub-clusters, and so on.
  • Such a hierarchical structure of the magnetic
    clusters reflects a hierarchical organization of
    the data into categories and sub-categories.
  • Working in the super-paramagnetic phase of the
    model, we use the values of the pair correlation
    function of the Potts spins to decide whether a
    pair of spins does or does not belong to the same
    grain, and we identify these grains as the
    clusters of our data.

15
Monte Carlo simulation of Potts models: the
Swendsen-Wang method
  • The aim is to evaluate sums such as (2) for
    models with N ≫ 1 spins.
  • Direct evaluation of sums like (2) is
    impractical, since the number of configurations S
    increases exponentially with the system size N.
  • Monte Carlo simulation methods overcome this
    problem by generating a characteristic subset of
    configurations which are used as a statistical
    sample.
  • A set of spin configurations {S_1, ..., S_M}
    is generated according to the Boltzmann
    probability distribution (3). Then, expression
    (2) is reduced to a simple arithmetic average
    \langle A \rangle ≈ \frac{1}{M} \sum_{n=1}^{M} A(S_n)    (7)
  • where the number of configurations in the
    sample, M, is much smaller than q^N, the total
    number of configurations.

16
Monte Carlo simulation of Potts models: the
Swendsen-Wang method
  • 1. First visit all pairs of spins \langle i, j \rangle
    that interact, i.e. have J_{ij} > 0. If in the current
    configuration S_n the two spins are in the same state,
    s_i = s_j, then sites i and j are frozen together
    with probability
    p^f_{ij} = 1 - e^{-J_{ij}/T}    (8)
  • 2. Identify the SW clusters of spins: an SW cluster
    contains all spins which have a path of frozen bonds
    connecting them. According to (8), only spins of the
    same value can be frozen into the same SW cluster.

17
Monte Carlo simulation of Potts models: the
Swendsen-Wang method
  • The final step of the procedure is the generation of a
    new spin configuration S_{n+1}, by drawing independently
    for each SW cluster a random value s = 1, ..., q, which
    is assigned to all its spins.
  • The physical quantities that we are interested in
    are the magnetization (4) and its square value
    (needed for the calculation of the susceptibility (6)),
    and the spin-spin correlation function (5).
  • At temperatures where large regions of correlated
    spins occur, local methods (such as Metropolis),
    which flip one spin at a time, become very slow.
    The SW procedure overcomes this difficulty by
    flipping large clusters of aligned spins
    simultaneously.
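A single Swendsen-Wang update, as described above, can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name and the union-find bookkeeping are ours:

```python
import math
import random

def sw_step(spins, couplings, T, q):
    """One SW update: freeze aligned bonds via eq. (8), find SW clusters, redraw."""
    parent = list(range(len(spins)))      # union-find forest over frozen bonds

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for (i, j), J in couplings.items():
        # Only aligned neighbors may freeze, with probability 1 - exp(-J/T).
        if spins[i] == spins[j] and random.random() < 1 - math.exp(-J / T):
            parent[find(i)] = find(j)

    # Each SW cluster (connected component of frozen bonds) gets one fresh value.
    fresh = {}
    return [fresh.setdefault(find(i), random.randint(1, q)) for i in range(len(spins))]
```

At very low T all aligned bonds freeze, so whole aligned regions flip together; at high T almost nothing freezes and spins are redrawn nearly independently.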

18
Clustering of Data
  • Description of the Algorithm

19
Clustering of Data Description of the Algorithm
  • Assume that our data consist of N patterns or
    measurements, specified by N corresponding vectors
    x_1, ..., x_N, embedded in a D-dimensional metric space.
  • The method consists of three stages.

20
Clustering of Data Description of the Algorithm
  • 1. Construct the physical analog Potts-spin
    problem:
  • (a) Associate a Potts spin variable s_i = 1, ..., q
    to each point x_i.
  • (b) Identify the neighbors of each point
    according to a selected criterion.
  • (c) Calculate the interaction J_{ij} between
    neighboring points x_i and x_j.

21
Clustering of Data Description of the Algorithm
  • 2. Locate the super-paramagnetic phase:
  • (a) Estimate the (thermal) average magnetization,
    \langle m \rangle, for different temperatures.
  • (b) Use the susceptibility \chi to identify
    the super-paramagnetic phase.

22
Clustering of Data Description of the Algorithm
  • 3. In the super-paramagnetic regime:
  • (a) Measure the spin-spin correlation G_{ij} for
    all pairs of neighboring points x_i, x_j.
  • (b) Construct the data clusters.

23
Description of the Algorithm: Construction of the
physical analog Potts-spin problem
  • 1.a The value of q does not imply any assumption
    about the number of clusters present in the data;
    q mainly determines the sharpness of the transitions
    and the temperatures at which they occur.
  • 1.b Since the data do not form a regular lattice,
    one has to supply some reasonable definition of
    neighbors. We use Delaunay triangulation over
    other graph structures when the patterns are
    embedded in a low-dimensional (D ≤ 3) space.
  • When D > 3, v_i and v_j are assigned a mutual
    neighborhood value K if and only if v_i is one of
    the K nearest neighbors of v_j and v_j is one of
    the K nearest neighbors of v_i.
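The mutual K-nearest-neighbor criterion of step 1.b can be sketched as follows (a brute-force O(N^2) distance computation; the function name and sample coordinates are illustrative):

```python
import numpy as np

def mutual_knn_pairs(X, K):
    """Pairs (i, j) where each point is among the K nearest neighbors of the other."""
    # All pairwise Euclidean distances, with self-distances excluded.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)
    knn = np.argsort(D, axis=1)[:, :K]           # K nearest neighbors of each point
    neighbor_sets = [set(row) for row in knn]
    return [(i, int(j)) for i in range(len(X)) for j in neighbor_sets[i]
            if i < j and i in neighbor_sets[j]]

X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
print(mutual_knn_pairs(X, K=1))  # only the two close points pick each other: [(0, 1)]
```

The distant point lists a neighbor, but is not listed back, so it contributes no edge; mutuality is what keeps the neighbor graph sparse in low-density regions.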

24
Description of the Algorithm: Construction of the
physical analog Potts-spin problem
  • 1.c We need strong interactions between spins that
    correspond to data from a high-density region, and
    weak interactions between neighbors that are in
    low-density regions:
    J_{ij} = \frac{1}{\hat{K}} \exp(-\frac{d_{ij}^2}{2 a^2})
    if v_i and v_j are neighbors, and J_{ij} = 0 otherwise.
  • a is the average of all distances d_{ij}
    between neighboring pairs v_i and v_j.
  • \hat{K} is the average number of neighbors per site.
  • This normalization of the interaction
    strength enables us to estimate the temperature
    corresponding to the highest super-paramagnetic
    transition.
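A sketch of this coupling rule, assuming the neighbor pairs have already been determined (the helper name and sample points are ours):

```python
import numpy as np

def interactions(X, pairs):
    """J_ij = (1 / K_hat) * exp(-d_ij^2 / (2 a^2)) over the given neighbor pairs."""
    d = {(i, j): np.linalg.norm(X[i] - X[j]) for i, j in pairs}
    a = np.mean(list(d.values()))        # average distance between neighboring pairs
    K_hat = 2 * len(pairs) / len(X)      # average number of neighbors per point
    return {p: float(np.exp(-d[p] ** 2 / (2 * a ** 2)) / K_hat) for p in pairs}

X = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
J = interactions(X, [(0, 1), (1, 2)])
print(J[(0, 1)] > J[(1, 2)])  # the closer pair is coupled more strongly: True
```

Because distances enter through d_ij^2 / a^2, the couplings adapt to the overall scale of the data rather than to absolute units.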

25
Description of the Algorithm: Locating
super-paramagnetic regions
  • The various temperature intervals in which the
    system self-organizes into different partitions
    into clusters are identified by measuring the
    susceptibility \chi as a function of temperature.
  • Starting from the estimate of the highest transition
    temperature, one can take increasingly refined
    temperature scans and calculate the function \chi(T)
    by Monte Carlo simulation.

26
Description of the Algorithm: Locating
super-paramagnetic regions
  • 1. Choose the number of iterations M to be
    performed.
  • 2. Generate the initial configuration by
    assigning a random value to each spin.
  • 3. Assign a frozen bond between nearest-neighbor
    points v_i and v_j with probability p^f_{ij} of Eq. (8).
  • 4. Find the connected subgraphs, the SW clusters.
  • 5. Assign new random values to the spins (spins
    that belong to the same SW cluster are assigned
    the same value). This is the new configuration
    of the system.

27
Description of the Algorithm: Locating
super-paramagnetic regions
  • 6. Calculate the value assumed by the physical
    quantities of interest in the new spin
    configuration.
  • 7. Go to step 3, unless the maximal number of
    iterations M has been reached.
  • 8. Calculate the averages (7).
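The eight steps above can be assembled into a self-contained sketch. This is a toy-scale illustration (a short chain of spins, illustrative couplings and temperatures), not the authors' code:

```python
import math
import random
from collections import Counter

def sw_sweep(spins, couplings, T, q):
    """Steps 3-5: freeze bonds via eq. (8), find SW clusters, redraw cluster values."""
    parent = list(range(len(spins)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for (i, j), J in couplings.items():
        if spins[i] == spins[j] and random.random() < 1 - math.exp(-J / T):
            parent[find(i)] = find(j)
    fresh = {}
    return [fresh.setdefault(find(i), random.randint(1, q)) for i in range(len(spins))]

def estimate_m(couplings, N, q, T, M):
    """Steps 1-8: run M Swendsen-Wang iterations and average the magnetization (4)."""
    spins = [random.randint(1, q) for _ in range(N)]    # step 2: random start
    total = 0.0
    for _ in range(M):                                  # steps 3-7
        spins = sw_sweep(spins, couplings, T, q)
        total += (q * max(Counter(spins).values()) - N) / ((q - 1) * N)
    return total / M                                    # step 8: average (7)

random.seed(0)                                          # reproducible sketch
chain = {(i, i + 1): 1.0 for i in range(7)}             # toy chain of 8 coupled spins
m_low = estimate_m(chain, N=8, q=5, T=0.05, M=300)
m_high = estimate_m(chain, N=8, q=5, T=5.0, M=300)
print(m_low > m_high)  # low temperature gives the larger average magnetization
```

Measuring such averages over a scan of temperatures, and the variance of m for the susceptibility (6), is exactly what locates the transition temperatures.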

28
Description of the Algorithm: Locating
super-paramagnetic regions
  • We measure the susceptibility \chi at different
    temperatures in order to locate these different
    regimes.
  • The aim is to identify the temperatures at which
    the system changes its structure.

29
Description of the Algorithm: Identifying data
clusters
  • Select one temperature in each region of
    interest.
  • Each sub-phase characterizes a particular type of
    partition of the data, with new clusters merging
    or breaking.

30
Description of the Algorithm: Identifying data
clusters
  • The Swendsen-Wang method provides an improved
    estimator of the spin-spin correlation function.
  • It calculates the two-point connectedness C_{ij},
    the probability that sites v_i and v_j belong to
    the same SW cluster, which is estimated by the
    average (7) of the following indicator function:
    c_{ij} = 1 if v_i and v_j belong to the same SW cluster,
    and c_{ij} = 0 otherwise.
  • C_{ij} = \langle c_{ij} \rangle is the probability of
    finding sites v_i and v_j in the same SW cluster.
  • Then the spin-spin correlation function is
    G_{ij} = \frac{(q - 1) C_{ij} + 1}{q}
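The mapping from connectedness to correlation is a one-line formula; a sketch with illustrative values:

```python
def correlation_from_connectedness(C_ij, q):
    """G_ij = ((q - 1) * C_ij + 1) / q: SW connectedness -> spin-spin correlation."""
    return ((q - 1) * C_ij + 1) / q

# Never in the same SW cluster -> independent spins, G = 1/q; always together -> G = 1.
print(correlation_from_connectedness(0.0, q=20))  # 0.05
print(correlation_from_connectedness(1.0, q=20))  # 1.0
```

The offset 1/q reflects the chance that two independently redrawn spins happen to agree, which is why G never drops below 1/q.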

31
Description of the Algorithm: Identifying data
clusters
  • 1. Build the cluster cores: if G_{ij} > 0.5, a
    link is set between the neighboring data points
    v_i and v_j.
  • 2. Capture points lying on the periphery of the
    clusters by linking each point v_i to the neighbor
    v_j of maximal correlation G_{ij}.
  • 3. Data clusters are identified as the linked
    components of the graph obtained in steps 1 and 2.
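The three linking steps above can be sketched as follows; the correlation values are invented for the example, and the function name is ours:

```python
def build_clusters(N, G, theta=0.5):
    """Step 1: threshold links; step 2: max-correlation neighbor; step 3: components."""
    links = {(i, j) for (i, j), g in G.items() if g > theta}     # cluster cores
    for i in range(N):                                           # periphery capture
        nbrs = [(g, p) for p, g in G.items() if i in p]
        if nbrs:
            _, best = max(nbrs)       # link i to its maximal-correlation neighbor
            links.add(best)
    parent = list(range(N))           # union-find over the link graph
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for a, b in links:
        parent[find(a)] = find(b)
    return [find(i) for i in range(N)]  # component label per point

# Hypothetical correlations: points 0-2 strongly correlated, 3-4 a separate pair.
G = {(0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.1, (3, 4): 0.7}
labels = build_clusters(5, G)
print(labels[0] == labels[1] == labels[2], labels[2] != labels[3])
```

The weak (2, 3) correlation is below the threshold and is not the maximal-correlation link of either endpoint, so the two groups stay separate.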

32
Applications
  • The Iris Data

33
The Iris Data
  • It consists of measurements of four quantities,
    performed on each of 150 flowers. The specimens
    were chosen from three species of Iris. The data
    constitute 150 points in four-dimensional space.

34
The Iris Data
  • We determined neighbors in the D = 4 dimensional
    space according to the mutual K (K = 5) nearest
    neighbors definition.
  • We observe that there is a well-separated cluster
    (corresponding to the Iris Setosa species), while
    the clusters corresponding to Iris Virginica and
    Iris Versicolor do overlap.
  • We applied the SPC method and obtained the
    susceptibility curve.

Projection of the iris data on the plane spanned
by its two principal components.
35
The Iris Data
  • The susceptibility curve of Fig. (a) clearly shows
    two peaks.
  • When heated, the system first breaks into two
    clusters at T = 0.01 (Fig. b).
  • At T = 0.02 we obtain two clusters, of sizes 80 and
    40.
  • At T = 0.06 another transition occurs,
    where the larger cluster splits in two.
  • At T = 0.07 we identified clusters of sizes 45, 40
    and 38, corresponding to the species Iris
    Versicolor, Virginica and Setosa respectively.

36
The Iris Data
  • Iris data breaks into clusters in two stages.
  • This reflects the fact that two of the three
    species are closer to each other than to the
    third one.
  • The SPC method handles such hierarchical
    organization of the data very well.

37
The Iris Data
  • 125 samples were classified correctly (as
    compared with manual classification).
  • 25 were left unclassified.
  • No further breaking of clusters was observed.
  • All three clusters disorder at T = 0.08.

38
The Iris Data
  • Among all the clustering algorithms used in this
    example, the minimal spanning tree procedure
    obtained the most accurate result, followed by the
    SPC method, while the remaining clustering
    techniques failed to provide a satisfactory
    result.

39
The Iris Data
40
Advantages of the SPC Algorithm vs. Other Methods

41
Other methods and their disadvantages
  • These methods employ a local criterion, against
    which some attribute of the local structure of
    the data is tested, to construct the clusters.
  • Typical examples are hierarchical techniques such
    as the agglomerative and divisive methods.
  • All these algorithms tend to create clusters even
    when no natural clusters exist in the data.

42
Other methods and their disadvantages
  • (a) high sensitivity to initialization
  • (b) poor performance when the data contains
  • overlapping clusters
  • (c) inability to handle variability in cluster
    shapes,
  • cluster densities and cluster sizes
  • (d) none of these methods provides an index that
  • could be used to determine the most
    significant
  • partitions among those obtained in the
    entire
  • hierarchy.

43
Advantages of the Method
  • Provides information about the different
    self-organizing regimes of the data.
  • The number of macroscopic clusters is an output
    of the algorithm.
  • Hierarchical organization of the data is
    reflected in the manner the clusters merge or
    split when a control parameter (the physical
    temperature) is varied.

44
Advantages of the Method
  • The results are completely insensitive to the
    initial conditions.
  • The algorithm is robust against the presence of
    noise.
  • The algorithm is computationally efficient: the
    equilibration time of the spin system scales with
    N, the number of data points, and is independent
    of the embedding dimension D.