Title: Overview Analysis of Social Networks
1Overview Analysis of Social Networks
2Plan
- Introduction
- Analysis of Social Network
- Properties of network
- The small-world effect
- Clustering coefficient
- Scale-free network
- Betweenness centrality
- Community structure
- Finding community structure
- Conclusion
3Introduction
- Social Network
- Collaboration networks
- Film Actors
- Telephone call graph
- Webograph (WebOfPeople)
- Technological Network
- Internet, the World-Wide Web
- Software packages
- Biological Network
- Metabolic network
- Protein interactions
- Neural network
4Analysis of Social Network
- Social Networks
- Social entities: persons, organizations, things, cities (Actor/Node/Point/Agent)
- Binary relations: social relations, dependencies, exchange (Tie, Link, Edge, Line, Arc)
- Directed or undirected, weighted or unweighted
- Weight: increasing or decreasing the tie between the two entities
- A labeled directed graph G (or matrices)
- Vertex or edge attribute: a partial function assigning nominal or numerical values to vertices or edges
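A minimal sketch of the model above, assuming an adjacency-dictionary representation. The class name `SocialGraph`, the method names, and the sample actors are hypothetical and only illustrate directed or undirected ties, weights, partial attribute functions, and the matrix view.

```python
from dataclasses import dataclass, field


@dataclass
class SocialGraph:
    """Directed, weighted graph with optional vertex and edge attributes."""
    ties: dict = field(default_factory=dict)          # actor -> {neighbor: weight}
    vertex_attrs: dict = field(default_factory=dict)  # partial function: actor -> value
    edge_attrs: dict = field(default_factory=dict)    # partial function: (u, v) -> value

    def add_tie(self, u, v, weight=1.0, directed=True):
        """Add a tie u -> v; an undirected tie is stored in both directions."""
        self.ties.setdefault(u, {})[v] = weight
        self.ties.setdefault(v, {})
        if not directed:
            self.ties[v][u] = weight

    def to_matrix(self):
        """Return (sorted actor list, adjacency matrix of weights)."""
        actors = sorted(self.ties)
        index = {a: i for i, a in enumerate(actors)}
        matrix = [[0.0] * len(actors) for _ in actors]
        for u, nbrs in self.ties.items():
            for v, w in nbrs.items():
                matrix[index[u]][index[v]] = w
        return actors, matrix


g = SocialGraph()
g.add_tie("ann", "bob", weight=2.0)        # directed, weighted tie
g.add_tie("bob", "cat", directed=False)    # undirected tie
g.vertex_attrs["ann"] = "organizer"        # attribute defined for one vertex only
actors, matrix = g.to_matrix()
```

Storing attributes in separate dictionaries keeps them partial: a vertex or edge without an entry simply has no value.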
5Analysis of Social Network
- Purpose
- Identify important vertices, crucial relationships, subgroups, roles
- Answer questions about structures
- Interest
- Element properties (absolute and relative)
- Single actors, links, incidences
- Group: classifying the elements of networks and properties of subnetworks
- Actor equivalence classes, cluster identification
- Network: connectivity or balance
6Analysis of Social Network
- What makes a vertex important or central?
- Centrality of a vertex
- Do networks tend to build clusters?
- Clustering coefficient
- How do networks evolve?
- Degree distribution
- What is the overall structure?
- Small-world phenomenon
7Properties of networks
- The small-world effect
- Clustering coefficient
- Degree distribution
- Scale-free network
- Betweenness Centrality
- Community structure
8The small world effect
- Milgram's experiment (1967)
- The participants could only pass the letters (by hand) to personal acquaintances who they thought might be able to reach the target, whether directly or via a "friend of a friend"
- Letters passed from person to person were able to reach a designated target individual in only a small number of steps
Small-world phenomenon: it is said that all strangers can be linked through six degrees of separation
9Online experiment
Home page of http://smallworld.columbia.edu/
10Clustering coefficient
- Determine whether or not a graph is a small-world network (Watts and Strogatz)
- λG(v): number of subgraphs with 3 edges and 3 vertices (triangles), one of which is v
- τG(v): number of subgraphs (not necessarily induced) with 2 edges and 3 vertices, one of which is v, such that v is incident to both edges
- Clustering coefficient of v: the ratio λG(v) / τG(v), averaged over all vertices
- If this average is higher than in a random graph with the same vertex set: small-world
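The two counts above can be sketched as follows, assuming an undirected graph stored as a dictionary of neighbor sets. The function names and the small sample graph are illustrative assumptions.

```python
from itertools import combinations


def clustering_coefficient(adj, v):
    """Triangles through v divided by triples centered at v (0.0 if degree < 2)."""
    nbrs = adj[v]
    triples = len(nbrs) * (len(nbrs) - 1) // 2
    if triples == 0:
        return 0.0
    # A pair of neighbors of v closes a triangle when the two are adjacent.
    triangles = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return triangles / triples


def average_clustering(adj):
    """Mean of the per-vertex coefficients over all vertices."""
    return sum(clustering_coefficient(adj, v) for v in adj) / len(adj)


# Hypothetical sample: a triangle a-b-c with a pendant vertex d attached to a.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
```

Here vertex a has one triangle out of three triples, b and c each have one out of one, and d has none, so the average is 7/12.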
11Scale Free Network
- Power law distribution
- in number of connections between nodes
- A few nodes
- Extremely high connectivity
- Essentially scale-free
- Vast majority
- Relatively poorly connected
12Scale Free Network
- Some scale-free network graphs
- Protein interaction networks
- Protein binding relation
- Metabolic pathway
- Enzymes and substrates linked by chemical interactions
- WWW or weblog links
- Web page links pointing from one page to another
- Actor collaborations
- Actors in the same movies
- Airline traffic routes
13Scale Free Network
- Main properties
- Have a scaling (power-law) degree distribution
- Have growth and preferential attachment
- Have highly connected hubs which hold the network together
- Self-similar
- Universal in the sense of not depending on domain-specific details
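Growth and preferential attachment can be illustrated with a small generator. This is a sketch under assumptions: the function name, the complete starting core, and the parameters `n`, `m`, `seed` are choices made here for illustration, not details from the slides.

```python
import random


def preferential_attachment(n, m, seed=0):
    """Grow a graph to n nodes; each new node attaches to m distinct existing
    nodes chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    edges = []
    # Start from a small complete core of m + 1 nodes.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            edges.append((u, v))
    # Each node appears in `stubs` once per incident edge, so uniform
    # sampling from `stubs` is degree-proportional sampling.
    stubs = [x for e in edges for x in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in sorted(targets):
            edges.append((new, t))
            stubs.extend((new, t))
    return edges
```

Because early nodes accumulate stubs, they are picked more often as the network grows, which is how a few highly connected hubs emerge.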
14Comparing Random and Scale-Free Distribution
- Figure: degree distribution of a random network compared with that of a scale-free network
- Source: the journal Nature
15Scale Free Network
- Cumulative degree distributions for six different networks [1]
16Scale Free Network
- Strengths and weaknesses
- Extremely tolerant of random failures
- Inhomogeneity of the nodes in the network
- Extremely vulnerable to intentional attacks on their hubs
- Hubs are important
- Extremely vulnerable to epidemics
- Critical threshold (number of nodes infected)
17Betweenness centrality
- Vertex betweenness
- Determine the role of each actor (node) in a social network
- Based on shortest paths
- Edge betweenness
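Vertex betweenness counts, for each vertex, the share of shortest paths between other pairs that pass through it. The sketch below follows the accumulation idea of Brandes' algorithm [4] for an unweighted, undirected graph; the function name and the dictionary-of-sets input format are assumptions made here.

```python
from collections import deque


def vertex_betweenness(adj):
    """Betweenness for an unweighted, undirected graph given as
    {vertex: set_of_neighbors}. Each unordered pair is counted once."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and predecessors.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate dependencies from the farthest vertices.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every pair (s, t) was visited from both endpoints.
    return {v: b / 2.0 for v, b in score.items()}
```

On a path a-b-c-d the two inner vertices each lie on two shortest paths between other pairs, while the end vertices lie on none.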
18Modularity
- Measures the quality of a specific proposed division of the network into communities
- A division is good when there are many edges within communities and only a few between them
- NC: total number of clusters in a given set C
- dc: number of edges between nodes of a given cluster c
- lc: total degree of nodes in cluster c
- L: total number of edges in the whole network
- Used for detecting community structure in social networks
- Optimal set of clusters: the one with the highest modularity during cluster building
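A sketch of the computation, assuming the standard Newman-Girvan form of modularity written with the symbols defined above: Q is the sum over the NC clusters of dc / L - (lc / (2 L))^2. The function name and the input format (an edge list plus a list of vertex sets) are assumptions.

```python
def modularity(edges, communities):
    """Q = sum over clusters c of d_c / L - (l_c / (2 L)) ** 2.

    edges: list of (u, v) pairs of an undirected graph, each edge listed once.
    communities: list of disjoint vertex sets covering every vertex.
    """
    L = len(edges)
    label = {v: i for i, c in enumerate(communities) for v in c}
    d = [0] * len(communities)   # d_c: edges inside cluster c
    l = [0] * len(communities)   # l_c: total degree of the nodes in cluster c
    for u, v in edges:
        l[label[u]] += 1
        l[label[v]] += 1
        if label[u] == label[v]:
            d[label[u]] += 1
    return sum(dc / L - (lc / (2 * L)) ** 2 for dc, lc in zip(d, l))


# Hypothetical sample: two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
```

Splitting the sample into its two triangles gives Q = 5/14, while putting every vertex in one cluster gives Q = 0, so the first division is the better one.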
19Communities in network
- Social networks display a community structure
- Families, groups of close friends, etc.
- Community: a subgraph V with internal connections denser than the external ones
- How to detect the presence of communities?
- How to find the members of possible communities?
20Communities and Clustering
- Two concepts
- Very closely related but different
- Cluster
- Part of a graph with more internal edges than external ones
- Community
- Set of vertices sharing the same topological properties
- Community Clusters
- Same set of edges
Two different communities (a bipartite clique) that are not represented by clustered subgraphs [1]
21Community identification
22Problem (1)
- Network
- n vertices with no prior information
- Only structural information is known (edge link connectivity)
- Many domains are related
- Computer Science
- Mathematics
- Sociology
- Physics
23Problem (2)
- How to recognize communities within a network?
- No exact definition of a cluster
- A lot of methods
- Each has its own advantages and drawbacks
- Each is suited to different data structures
- Mathematical viewpoint
- Clustering: an optimization procedure according to a clustering criterion
- Various clustering approaches represent different types of knowledge concerning the clustering criterion
24Problem (3)
- Clustering the graph
- Reduce visual complexity
- Relatively highly connected nodes and their associated edges are grouped to form a sub-graph, represented by one abstract node
- Finding the cliques
- A clique: a maximal complete subgraph (there is an edge between every pair of vertices in the subgraph)
- Computational issues
- Use cliques to determine communities
- The connectivity required in a clique is too strong
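To make the clique requirement concrete, the sketch below enumerates maximal cliques with the Bron-Kerbosch recursion (without pivoting). The algorithm choice, the function name, and the sample graph are assumptions made for illustration, since the slides do not name a method.

```python
def maximal_cliques(adj):
    """Enumerate all maximal cliques of an undirected graph given as
    {vertex: set_of_neighbors}, using Bron-Kerbosch without pivoting."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already explored.
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Hypothetical sample: a triangle a-b-c, with d attached to c by one edge.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
found = sorted(sorted(c) for c in maximal_cliques(adj))
```

Vertex d is excluded from the triangle's clique because it lacks edges to a and b, which shows why a clique-based definition of community is often too strict. Enumeration can also take exponential time in the worst case.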
25Approaches to finding communities in networks (1)
- Divisive Method
- Girvan-Newman algorithm [10]
- 1. Calculate the betweenness score for each of the edges
- 2. Remove the edge with the highest score
- 3. Compute the modularity of the network
- 4. Go back to step 1 until all edges of the network are removed, resulting in N non-connected nodes
- The best division is the one where the highest modularity value is obtained
- Community detection is not graph partitioning
- Effective for obtaining communities in several types of networks
- Computational cost of order O(m^2 n) for m edges and n vertices
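The divisive loop above can be sketched as follows. This is a simplified illustration under assumptions: the graph is unweighted and undirected, the edge list has no duplicates or self-loops, and all function names are chosen here rather than taken from [10].

```python
from collections import deque


def edge_betweenness(adj):
    """Shortest-path edge betweenness (each pair counted from both ends)."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    return score


def components(adj):
    """Connected components as a list of vertex sets."""
    seen, parts = set(), []
    for s in adj:
        if s in seen:
            continue
        part, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in part:
                part.add(v)
                stack.extend(adj[v] - part)
        seen |= part
        parts.append(part)
    return parts


def modularity(edges, parts):
    """Modularity of `parts`, always measured on the original edge list."""
    m = len(edges)
    label = {v: i for i, p in enumerate(parts) for v in p}
    inside, degree = [0] * len(parts), [0] * len(parts)
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            inside[label[u]] += 1
    return sum(i / m - (d / (2 * m)) ** 2 for i, d in zip(inside, degree))


def girvan_newman(edges):
    """Remove highest-betweenness edges one by one; keep the best division."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    best_parts = components(adj)
    best_q = modularity(edges, best_parts)
    for _ in range(len(edges)):
        scores = edge_betweenness(adj)      # step 1: recompute betweenness
        u, v = max(scores, key=scores.get)  # step 2: remove the top edge
        adj[u].discard(v)
        adj[v].discard(u)
        parts = components(adj)
        q = modularity(edges, parts)        # step 3: score the division
        if q > best_q:
            best_q, best_parts = q, parts
    return best_parts, best_q
```

Recomputing betweenness after every removal is what makes the method expensive. On two triangles joined by one edge, the bridge is removed first and the two triangles are returned as the best division.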
26Approaches to finding communities in networks (2)
- Agglomerative hierarchical clustering
- Modularity optimization algorithm [7]
- Start from a state in which each vertex is the sole member of one of n communities
- Repeatedly join communities together in pairs
- At each step, choose the join that results in the greatest increase (or smallest decrease) in modularity
- Can be applied to very large networks
Visualization of the community structure at maximum modularity [7]
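The greedy joining procedure can be sketched in a naive form. This is not the fast implementation of [7], which relies on specialized data structures; it is a plain version that rescans all connected pairs at each step. It assumes an undirected edge list without self-loops and vertex identifiers that can be compared with each other (used only to break ties).

```python
def greedy_modularity(edges):
    """Agglomerative modularity maximization; returns (communities, Q)."""
    m = len(edges)
    deg = {}       # community id -> total degree of its members
    links = {}     # community id -> {neighbor community: number of edges}
    members = {}   # community id -> set of vertices
    for u, v in edges:
        for x in (u, v):
            deg[x] = deg.get(x, 0) + 1
            links.setdefault(x, {})
            members.setdefault(x, {x})
        links[u][v] = links[u].get(v, 0) + 1
        links[v][u] = links[v].get(u, 0) + 1
    # Modularity when every vertex is alone in its own community.
    q = -sum((d / (2 * m)) ** 2 for d in deg.values())
    best_q, best = q, [set(c) for c in members.values()]
    while any(links.values()):
        # Merging a and b changes Q by e_ab / m - deg_a * deg_b / (2 m^2).
        gain, a, b = max(
            (links[a][b] / m - deg[a] * deg[b] / (2 * m * m), a, b)
            for a in links for b in links[a]
        )
        # Merge community b into community a.
        for c, w in links.pop(b).items():
            del links[c][b]
            if c != a:
                links[a][c] = links[a].get(c, 0) + w
                links[c][a] = links[c].get(a, 0) + w
        deg[a] += deg.pop(b)
        members[a] |= members.pop(b)
        q += gain
        if q > best_q:
            best_q, best = q, [set(c) for c in members.values()]
    return best, best_q
```

Only connected pairs are considered, since joining two communities with no edge between them can never increase modularity. The loop continues past the maximum (accepting the smallest decrease) and reports the best state seen.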
27Approaches to finding communities in networks (3)
- Agglomerative hierarchical clustering
- Single linkage (nearest neighbor) methods
- Merge clusters iteratively
- Develop a measure of similarity between pairs of vertices
- Many different such similarity measures are possible, for example the structural equivalence of vertices
- Starting with an empty network of n vertices and no edges, one adds edges between pairs of vertices in order of decreasing similarity, starting with the pair with the strongest similarity
- Weaknesses
- Does not scale well: time complexity at least O(n^2)
- Merges cannot be undone
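A sketch of the single linkage procedure, under stated assumptions: the similarity measure is the Jaccard overlap of closed neighborhoods (one possible stand-in for structural equivalence, chosen here), merging stops once `k` clusters remain, and vertex identifiers are sortable. None of these choices come from the slides.

```python
from itertools import combinations


def jaccard(adj, u, v):
    """Neighborhood overlap, counting each vertex as its own neighbor."""
    a, b = adj[u] | {u}, adj[v] | {v}
    return len(a & b) / len(a | b)


def single_linkage(adj, k):
    """Link pairs in order of decreasing similarity until k clusters remain."""
    parent = {v: v for v in adj}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Scoring every pair is what makes the method at least quadratic in n.
    pairs = sorted(combinations(sorted(adj), 2),
                   key=lambda p: jaccard(adj, *p), reverse=True)
    clusters = len(adj)
    for u, v in pairs:
        if clusters <= k:
            break
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv      # a merge is never revisited
            clusters -= 1
    groups = {}
    for v in adj:
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())
```

The connected components of the growing "similarity network" are the clusters, so tracking them with union-find is equivalent to adding the edges explicitly.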
28Approaches to finding communities in networks (4)
- Hierarchical Growth Method [6]
- Expanding neighborhoods
- First neighborhood: vertices at a distance of one edge
- Second neighborhood
- Use a threshold to obtain the best value of the modularity
- Consideration of successive neighborhoods of a set of seeds
- Start from a vertex -> links of its successive neighborhoods
- Verify whether they belong to the same community as the seed
- Inter-community edges removed -> network split into communities
- Zachary Karate Club Network
- Simple benchmark for community-finding methodologies
29Hierarchical Growth Method
30Approaches to finding communities in networks (5)
- Others
- SFClusters [9]
- Considers the characteristics of the scale-free (SF) network graph
- A lot of useful information in the graph network (large clustering coefficients or clustered regions of the graph)
- Finds local clusters based on the local density and vertex neighborhood
- Time complexity
- Polynomial: O(nml^3)
- n: number of vertices
- m: number of edges
- l: vertex size of the average modified Gabriel influence region of the graph
31Conclusion
- Brief overview of some basic indices for the analysis of social networks
- One of the difficult problems
- Relations in a social network
- Strong / Weak
- Algorithmic aspects of network analysis
- Concern the fast computation of such indices
- Particular analysis methods need
- A priori knowledge of the number of expected communities
32Conclusion
- Network analysis demands
- Visualizations
- Two obvious criteria for the quality
- Is the information manifest in the network represented accurately?
- Is this information conveyed efficiently?
- Creating network visualizations
- Think through three aspects
- The substantive aspect the viewer is interested in
- The design (for example, the mapping of data to graphical variables)
- The algorithm employed to realize the design
33References
- [1] M. E. J. Newman. The structure and function of complex networks
- [2] Analysis and Visualization of Social Networks
- [3] Guido Caldarelli. Scale-Free Networks: Complex Webs in Natural, Technological and Social Sciences. Oxford University Press
- [4] U. Brandes. A faster algorithm for betweenness centrality
- [5] Jan Rupnik. Finding community structure in social network analysis: Overview
- [6] F. A. Rodrigues, G. Travieso, L. da F. Costa. Fast Community Identification by Hierarchical Growth
- [7] A. Clauset, M. E. J. Newman, and C. Moore. Finding community structure in very large networks
- [8] M. Girvan and M. E. J. Newman. Community structure in social and biological networks
- [9] Xiaohua Hu, Jianchao Han. Discovering Clusters from Large Scale-Free Network Graph
- [10] M. E. J. Newman and M. Girvan. Finding and evaluating community structure in networks
- [11] M. E. J. Newman. Fast algorithm for detecting community structure in networks
34Thank you very much