RAIN: Data Clustering using RAndomized INteraction of Data Points

1
RAIN: Data Clustering using RAndomized
INteraction of Data Points
  • Jonatan Gómez, Universidad Nacional de Colombia
  • jgomezpe_at_unal.edu.co
  • Olfa Nasraoui, Elizabeth León, University of
    Louisville
  • Olfa.nasraou, eleon _at_ louisville.edu

2
Outline
  • Introduction
  • Original Gravitational Clustering
  • Randomized Gravitational Clustering
  • RAIN
  • Experiments
  • Results
  • Conclusions

3
Introduction
  • Data Clustering
  • Hierarchical
      • Agglomerative
      • Divisive
  • Partitional
  • Unsupervised
  • Robust

4
Original Gravitational Clustering
  • Based on Newton's law of universal gravitation.
  • Each data point (data record) is
    treated as an object in an n-dimensional space.
  • Each data point is assigned a mass of 1.
  • Points are moved according to the gravitational
    law (simulation).
  • Hierarchical clustering technique.
  • Time complexity: O(n^3).

5
Randomized Gravitational Clustering
  • Each point is moved by the gravitational force
    that a single other, randomly selected object exerts
    on it, following Newton's second law of motion
  • d is the (distance) vector defined by y-x
  • G is the gravitational constant
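
The equation on this slide did not survive extraction. As a hedged reconstruction, the derivation below shows how a single move could follow from the two laws named above. It assumes unit masses, zero initial velocity, a unit time step, and the constant factor 1/2 absorbed into G. The symbols x(t), m_x, m_y, a and Δt are introduced here for illustration and are not taken from the slides.

```latex
\begin{align*}
\vec{d} &= y - x \\
\vec{F} &= \frac{G\, m_x m_y}{\lVert \vec{d} \rVert^{2}} \cdot \frac{\vec{d}}{\lVert \vec{d} \rVert}
         = \frac{G\, \vec{d}}{\lVert \vec{d} \rVert^{3}}
         && \text{(universal gravitation, } m_x = m_y = 1\text{)} \\
\vec{a} &= \frac{\vec{F}}{m_x} = \vec{F}
         && \text{(Newton's second law)} \\
x(t + \Delta t) &= x(t) + \tfrac{1}{2}\, \vec{a}\, \Delta t^{2}
         && \text{(zero initial velocity)} \\
x(t + 1) &= x(t) + \frac{G\, \vec{d}}{\lVert \vec{d} \rVert^{3}}
         && \text{(} \Delta t = 1,\ \tfrac{1}{2} \text{ absorbed into } G\text{)}
\end{align*}
```

Under these assumptions the displacement grows as 1/‖d‖², so nearby points attract each other far more strongly than distant ones.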

6
Randomized Gravitational Clustering (contd.)
  • Time complexity: O(n√n) (super-linear)
  • Non-hierarchical approach
  • Robust and Unsupervised
  • Cooling factor
  • Uses an optimal union-find disjoint set structure
  • Extraction phase based on cluster size
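
The union-find structure and the size-based extraction mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that "optimal" refers to the standard combination of path compression and union by size, and the names `DisjointSet`, `extract_clusters` and `min_size` are hypothetical.

```python
class DisjointSet:
    """Union-find with path compression and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root
        self.size = [1] * n           # size of the set rooted at each index

    def find(self, i):
        # Locate the root of i's set.
        root = i
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[i] != root:
            nxt = self.parent[i]
            self.parent[i] = root
            i = nxt
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        # Union by size: attach the smaller tree under the larger one.
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return ra


def extract_clusters(ds, n, min_size):
    """Group indices by root and keep groups with at least min_size members.

    The size threshold is an assumed reading of "extraction phase based
    on cluster size"; smaller groups are treated as noise here.
    """
    groups = {}
    for i in range(n):
        groups.setdefault(ds.find(i), []).append(i)
    return [members for members in groups.values() if len(members) >= min_size]
```

With both heuristics, a sequence of `find` and `union` calls runs in near-constant amortized time per operation, which keeps the merging step from dominating the clustering cost.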

7
Randomized Gravitational Clustering Algorithm

8
Randomized Gravitational Clustering Algorithm
(contd.)

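
The algorithm listing on these two slides was not preserved. The sketch below is one possible reading of the bullets on the preceding slides (random pairwise moves, merging of points that come close, a cooling factor applied to G). It is not the authors' listing: the function name `rgc_sketch`, all parameters and defaults, the merge threshold, the choice to move both points, and the cap on the step size are assumptions made for illustration.

```python
import math
import random


def rgc_sketch(points, G=1e-4, cooling=0.01, merge_eps=1e-4,
               iterations=100, seed=0):
    """Return one cluster label (a root index) per input point."""
    rng = random.Random(seed)
    pts = [list(p) for p in points]   # work on a copy of the data
    n = len(pts)
    parent = list(range(n))           # minimal union-find, no size/rank

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for _ in range(iterations):
        for j in range(n):
            k = rng.randrange(n)      # pick one random partner for point j
            if k == j:
                continue
            d = [b - a for a, b in zip(pts[j], pts[k])]   # d = y - x
            dist = math.sqrt(sum(c * c for c in d))
            if dist <= merge_eps:
                parent[find(j)] = find(k)   # close enough: same cluster
                continue
            # Gravitational step, capped at 0.5 so the two points meet at
            # their midpoint at most and never cross (an assumption).
            step = min(G / dist ** 3, 0.5)
            pts[j] = [a + step * c for a, c in zip(pts[j], d)]
            pts[k] = [b - step * c for b, c in zip(pts[k], d)]
        G *= (1.0 - cooling)          # cooling: weaken the pull over time
    return [find(i) for i in range(n)]
```

Each pass touches every point once and looks at a single partner, so the cost of one pass is linear in the number of points; how many passes are needed is not stated in the surviving text.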
9
RAIN: RAndomized INteraction of Data Points
  • A generalization of RGC
  • Reduces the effect of data size and data dimension
    using a notion of Maximum Distance Between
    Closest Points
  • Moves points using not only the gravitational
    function but also other possible interaction functions
  • Defines a heuristic for setting the Initial
    Interaction Strength (Gravitational force)
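
The idea of swapping the gravitational function for other interaction functions can be sketched as a move that takes the function as a parameter. This is an illustration under assumptions: the names `interact`, `gravitational` and `gaussian` are hypothetical, and the Gaussian-style decay is only an example of an alternative, not a function confirmed by the slides.

```python
import math


def gravitational(r):
    """Gravitational interaction: decays with the cube of the distance."""
    return 1.0 / r ** 3


def gaussian(r):
    """Example alternative (assumed): smooth exponential decay."""
    return math.exp(-r * r)


def interact(x, y, strength, f):
    """Move x toward y by strength * f(||d||) * d, with d = y - x."""
    d = [yi - xi for xi, yi in zip(x, y)]
    r = math.sqrt(sum(c * c for c in d))
    if r == 0.0:
        return list(x)          # coincident points: nothing to do
    scale = strength * f(r)
    return [xi + scale * c for xi, c in zip(x, d)]
```

For instance, `interact([0.0], [2.0], 1.0, gravitational)` moves the point by (1/8)·2 = 0.25, which matches the purely gravitational case.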

10
RAIN: RAndomized INteraction of Data Points
  • Maximum Distance Between Closest Points
  • Interaction Functions
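
The formulas on this slide were not preserved. Read literally, the maximum distance between closest points is the largest nearest-neighbor distance in the data set, and the brute-force sketch below computes exactly that. The function name is hypothetical, and the slides do not say whether the authors compute this quantity exactly or estimate it.

```python
import math


def max_closest_distance(points):
    """Largest nearest-neighbor distance over all points (needs >= 2 points).

    Brute force, O(n^2) distance evaluations; meant only to make the
    definition concrete.
    """
    best = 0.0
    for i, p in enumerate(points):
        # Distance from p to its closest other point.
        nearest = min(math.dist(p, q)
                      for j, q in enumerate(points) if j != i)
        best = max(best, nearest)
    return best
```

For the points (0, 0), (1, 0) and (5, 0), the nearest-neighbor distances are 1, 1 and 4, so the function returns 4.0. Dividing pairwise distances by such a value is one plausible way to make the interaction strength less sensitive to data size and dimension, though that use is an assumption here.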

11
RAIN: Initial Interaction Strength Setting

12
RAIN Algorithm
Demo

13
Experiment Data Set

14
RAIN Evolution using

15
RAIN Results using

16
RAIN Evolution using

17
RAIN Results using

18
Conclusions
  • We have developed a heuristic mechanism for
    determining the initial interaction strength
    parameter
  • Gravitational clustering was extended to use
    different interaction functions
  • The effect of data size and dimensionality is
    reduced using a notion of maximum-minimum
    distance between data points
  • Our results show that RAIN works well on a
    variety of data sets

19
Questions?