Defining and identifying communities in networks


1
Defining and identifying communities in networks
  • Authors: F. Radicchi, C. Castellano, F. Cecconi,
    V. Loreto, D. Parisi
  • Presentation by Levent Özgür

2
Goal of the paper
  • Investigate community structure in networks as
    precisely and as quickly as possible
  • Fast algorithm
  • Self-contained algorithm that decides whether the
    resulting subgraphs are communities or not

3
Statistical properties of network systems
  • 1. Small world
  • 2. Power law
  • 3. Network transitivity
  • This and related papers add a fourth:
  • 4. Community structure

4
What is a community?
  • A subset of nodes within the graph such that
    connections between the nodes are denser than
    connections with the rest of the network.
  • Mapping the network into a tree (dendrogram).
    Leaves are nodes, branches join nodes
    (hierarchical structure).

5
How to find communities?
  • Agglomerative
  • Traditional method to find communities
  • Start with all nodes and no edges -> add edges
    until the communities emerge
  • Have serious drawbacks
  • Fails, with some frequency, to find the correct
    communities in networks!
  • Usually finds only the cores of communities and
    leaves out the periphery!

6
How to find communities?
  • Divisive Algorithms
  • Start with all nodes and edges -> remove edges
    until only the communities remain
  • Newer algorithms compared to agglomerative ones
  • Divide network into smaller subnetworks
  • Edges are removed iteratively according to
    different algorithms

7
GN Algorithm
  • 1. Calculate the edge betweenness for all edges
  • 2. Remove the edge with the highest betweenness
  • 3. Recalculate the betweenness values
  • 4. Repeat from step 2 until no edges remain
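
The four steps above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the graph is assumed to be undirected and unweighted, stored as a dict of neighbor sets, and the helper names `edge_betweenness` and `girvan_newman` are hypothetical. Betweenness is accumulated with a Brandes-style breadth-first search, where several equally short paths between a pair share that pair's unit of weight.

```python
from collections import deque

def edge_betweenness(adj):
    """Shortest-path betweenness of every edge (undirected, unweighted)."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma, preds = {s: 0}, {s: 1.0}, {s: []}
        order, queue = [], deque([s])
        while queue:                          # BFS from s, counting shortest paths
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    sigma[v] = 0.0
                    preds[v] = []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {u: 0.0 for u in order}
        for w in reversed(order):             # push path weight back toward s
            for u in preds[w]:
                c = sigma[u] / sigma[w] * (1.0 + delta[w])
                score[frozenset((u, w))] += c
                delta[u] += c
    # every pair of nodes was visited from both ends, so halve the totals
    return {e: b / 2.0 for e, b in score.items()}

def girvan_newman(adj):
    """Return edges in the order they are removed (steps 1 to 4)."""
    work = {u: set(vs) for u, vs in adj.items()}   # do not mutate the input
    removed = []
    while any(work.values()):                      # until no edges remain
        scores = edge_betweenness(work)            # (re)calculate
        edge = max(scores, key=lambda e: (scores[e], tuple(sorted(e))))
        u, v = sorted(edge)
        work[u].discard(v)
        work[v].discard(u)
        removed.append((u, v))
    return removed

# Demo graph (an assumption for illustration): two triangles joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

removal_order = girvan_newman(adj)
print(removal_order[0])   # the bridge between the two triangles goes first
```

Recomputing all scores after every removal is what makes the method expensive. The removal order is enough to rebuild the dendrogram, since each removal that disconnects a subgraph corresponds to one branch point.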

8
GN Algorithm: edge betweenness
  • Selection of the edges to be cut is based on this
    value
  • A generalization of betweenness centrality from
    nodes to edges.
  • Betweenness of an edge is the number of shortest
    paths running through that edge.
  • Idea: when a graph is made of tightly bound
    clusters, loosely interconnected, all shortest
    paths between clusters have to go through the few
    intercluster connections.
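
One common way to write this definition down is given here as a sketch. The symbols are assumptions, not taken from the slides: sigma_st is the number of shortest paths between nodes s and t, and sigma_st(e) is the number of those that pass through edge e. When a pair has several equally short paths, each path carries an equal share of that pair's weight.

```latex
b(e) \;=\; \sum_{s < t} \frac{\sigma_{st}(e)}{\sigma_{st}}
```

If every pair has a unique shortest path, this reduces to a plain count of the shortest paths running through e.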

9
A sample dendrogram and edge betweenness

10
GN Algorithm: future work
  • Hope to generalize to both weighted and directed
    graphs.
  • Time complexity: O(m^2 n) in the worst case, with
    m edges and n vertices -> not practical for very
    large graphs.
  • Strength of community structure (according to
    this strength, choose which networks to divide)

11
Alternatives for GN Algorithm
  • Random walk betweenness
  • Current flow betweenness
  • Information centrality

12
This study ...
  • Tries to find an algorithm that is faster than GN
    and as successful as GN.
  • Provides an alternative divisive algorithm that
    considers local quantities only (and so is fast)
    --> edge clustering coefficient (number of
    triangles (or higher-order cycles) to which a
    given edge belongs / number of triangles that
    might potentially include it)

13
This study...
  • Tries to cover "community sense" (strong sense,
    weak sense) --> quantitative definitions of
    community (how much of a community?)
  • Self-contained algorithm --> accept a division
    only if two or more groups fulfill the definition
    of community!

14
Quantitative Definitions of Community
  • Could the algorithm produce a false community?
  • We need a precise definition of community
  • Define a community as a set of nodes whose
    internal connections are denser than its external
    ones.

15
Two definitions for community
  • Community in a strong sense: each node in a
    strong community has more connections within the
    community than with the rest of the graph
  • Community in a weak sense: the sum of all degrees
    within the subgraph V is larger than the sum of
    all degrees toward the rest of the network.
  • Strong sense -> weak sense (every strong community
    is also a weak one)
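
Both definitions can be checked directly from node degrees. The sketch below assumes an undirected graph stored as a dict of neighbor sets; the function names `is_strong` and `is_weak` and the demo graph are hypothetical, chosen only to show a subgraph that passes the weak test but fails the strong one.

```python
def in_out_degree(node, community, adj):
    """Links from `node` into the community and out to the rest of the graph."""
    k_in = sum(1 for nb in adj[node] if nb in community)
    return k_in, len(adj[node]) - k_in

def is_strong(community, adj):
    """Strong sense: every member has more internal than external links."""
    return all(k_in > k_out
               for k_in, k_out in (in_out_degree(n, community, adj)
                                   for n in community))

def is_weak(community, adj):
    """Weak sense: the summed internal degree exceeds the summed external degree."""
    pairs = [in_out_degree(n, community, adj) for n in community]
    return sum(k_in for k_in, _ in pairs) > sum(k_out for _, k_out in pairs)

# Demo graph (an assumption): triangle 0-1-2, triangle 3-4-5,
# and node 2 linked to every node of the second triangle.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5),
         (2, 3), (2, 4), (2, 5)]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

left, right = {0, 1, 2}, {3, 4, 5}
print(is_strong(left, adj), is_weak(left, adj))     # node 2 breaks the strong test
print(is_strong(right, adj), is_weak(right, adj))
```

Node 2 has two links inside its triangle and three outside it, so the left triangle is a community only in the weak sense, while the right triangle satisfies both definitions.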

16
Community type formulas
  • Strong Sense Community
  • Weak Sense Community
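
The formulas themselves did not survive in this transcript. A reconstruction from the verbal definitions on the previous slide is given below, assuming A is the adjacency matrix and V the candidate subgraph; the notation is a sketch and may differ from what the original slide showed.

```latex
k_i^{\mathrm{in}}(V) = \sum_{j \in V} A_{ij}, \qquad
k_i^{\mathrm{out}}(V) = \sum_{j \notin V} A_{ij}

\text{Strong sense:}\quad k_i^{\mathrm{in}}(V) > k_i^{\mathrm{out}}(V) \quad \forall\, i \in V

\text{Weak sense:}\quad \sum_{i \in V} k_i^{\mathrm{in}}(V) > \sum_{i \in V} k_i^{\mathrm{out}}(V)
```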

17
Self Contained Algorithms
  • Decide whether subgraphs fulfill the definition
    of community
  • If the criterion is not met, do not split
    into those subgraphs
  • This selection is performed iteratively by a
    simple algorithm

18
Self Contained Algorithms
  • 1. Choose a definition of community
  • 2. Compute the edge betweenness for all edges and
    remove those with the highest scores
  • 3. If the removal does not split the sub-graph,
    go to 2
  • 4. If there is a split, test if at least two of the
    resulting subgraphs fulfill the definition
  • 5. Go to 2 for all the subgraphs until no edges are
    left in the network
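
The control flow of these steps can be sketched as a recursive function. Several things here are assumptions made for brevity: the weak definition is used in step 1, a rejected split simply ends the recursion for that subgraph, and the edge-picking rule is passed in as a function. The demo uses a hypothetical stand-in rule (`fewest_common_neighbours`) rather than edge betweenness, so only the accept/reject logic should be read as illustrating the slide.

```python
from collections import deque
from itertools import combinations

def components(nodes, adj):
    """Connected components of the subgraph induced on `nodes`."""
    seen, parts = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        part, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in nodes and v not in seen:
                    seen.add(v)
                    part.add(v)
                    queue.append(v)
        parts.append(part)
    return parts

def is_weak(part, full_adj):
    """Weak sense, measured on the ORIGINAL graph."""
    k_in = sum(1 for u in part for v in full_adj[u] if v in part)
    k_out = sum(1 for u in part for v in full_adj[u] if v not in part)
    return k_in > k_out

def fewest_common_neighbours(nodes, adj):
    """Stand-in edge rule (hypothetical): endpoints sharing the fewest neighbours."""
    edges = [(u, v) for u in nodes for v in adj[u] if u < v]
    return min(edges, key=lambda e: (len(adj[e[0]] & adj[e[1]]), e), default=None)

def divide(nodes, work_adj, full_adj, pick_edge, accept=is_weak):
    """Steps 2 to 5: remove edges until a split, then accept or reject it."""
    while True:
        edge = pick_edge(nodes, work_adj)
        if edge is None:                       # no edges left in this subgraph
            return [nodes]
        u, v = edge
        work_adj[u].discard(v)
        work_adj[v].discard(u)
        parts = components(nodes, work_adj)
        if len(parts) > 1:                     # step 3: loop until a split occurs
            break
    if sum(accept(p, full_adj) for p in parts) < 2:
        return [nodes]                         # step 4 failed: keep subgraph whole
    result = []
    for p in parts:                            # step 5: recurse on each part
        result.extend(divide(p, work_adj, full_adj, pick_edge, accept))
    return result

# Demo graph (an assumption): two 4-node cliques joined by the edge 3-4.
edges = list(combinations(range(4), 2)) + list(combinations(range(4, 8), 2)) + [(3, 4)]
full = {}
for a, b in edges:
    full.setdefault(a, set()).add(b)
    full.setdefault(b, set()).add(a)
work = {u: set(vs) for u, vs in full.items()}

communities = divide(set(full), work, full, fewest_common_neighbours)
print(sorted(sorted(c) for c in communities))
```

On this graph the bridge is removed first and both cliques pass the weak test, so the split is accepted. Splitting a clique further would isolate a single node, which fails the test, so each clique is reported as one community.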

19
A Fast Algorithm
  • An alternative divisive method is introduced
  • Requires the consideration of local quantities only
  • Much faster than the GN algorithm
  • Edge clustering coefficient
  • Number of triangles to which a given edge belongs,
    divided by the number of triangles that might
    potentially include it, given the degrees of the
    adjacent nodes.
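
A minimal sketch of the triangle-based coefficient follows, assuming an undirected graph stored as a dict of neighbor sets. Two details are assumptions of this sketch and should be checked against the paper: one is added to the triangle count so that edges in no triangle still get a non-zero score, and an edge with an endpoint of degree one gets an infinite score so that it is never the first to be removed.

```python
import math

def edge_clustering(u, v, adj):
    """Triangles containing edge (u, v) over the triangles it could belong to."""
    triangles = len(adj[u] & adj[v])                     # common neighbours
    possible = min(len(adj[u]) - 1, len(adj[v]) - 1)     # limited by the smaller degree
    if possible == 0:
        return math.inf                                  # pendant edge: never cut first
    return (triangles + 1) / possible

# Demo graph (an assumption): two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

scores = {e: edge_clustering(e[0], e[1], adj) for e in edges}
weakest = min(scores, key=scores.get)
print(weakest, scores[weakest])   # the inter-community edge has the lowest score
```

The divisive algorithm then removes the edge with the lowest coefficient, which plays the role that the highest betweenness plays in GN. Only the neighborhoods of the two endpoints are inspected, which is why the quantity is called local.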

20
Formula
  • Edge clustering coefficient for triangles
  • A more general formula (s = number of possible
    cyclic structures, given the degrees of the nodes)
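
The formulas are missing from this transcript. A reconstruction consistent with the verbal description is sketched below, assuming z is the number of cycles of order g that contain the edge (i, j), k_i is the degree of node i, and s is the number of cycles of order g that could be built given the degrees of the two nodes. The "+1" in the numerator is taken from the published paper and is not visible in the slide text.

```latex
C^{(3)}_{i,j} \;=\; \frac{z^{(3)}_{i,j} + 1}{\min\left[(k_i - 1),\,(k_j - 1)\right]}

C^{(g)}_{i,j} \;=\; \frac{z^{(g)}_{i,j} + 1}{s^{(g)}_{i,j}}
```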

21
Edge betweenness vs edge clustering coefficient

22
Edge betweenness vs edge clustering coefficient
  • Anticorrelation exists between two quantities,
    but not perfect
  • Edges with low values of one quantity tend to have
    high values of the other quantity
  • Expect two algorithms to yield similar community
    structures, although not perfectly coinciding.

23
Time complexities

24
Related research
  • Relevant for social tasks, biological inquiries or
    technological problems.
  • Collaboration network of early jazz musicians
  • E-mail message network in a university -- found
    formal and informal levels of organization
  • Email network in a large company -- organizations
    become communities!

25
Conclusion
  • Two main improvements
  • Self-contained algorithm that decides whether the
    split subgraphs are communities
  • Less time-consuming algorithm based on local
    quantities that is as successful as the GN algorithm

26
The End...
  • Thank you for your attention...
  • Any Questions ?