Title: Social Network Analysis
1Social Network Analysis Network Optimization
_at_ Dept. of Computer Communication Engineering,
University of Thessaly _at_ Dept. of Informatics,
Aristotle University
- Dimitrios Katsaros, Ph.D.
Koblenz, February 18th, 2008
2Outline of the talk
- A few words about my research
- My latest results Social Network Analysis for
Network Optimization - Web (2nd round review _at_ IEEE Transactions on
Knowledge Data Engineering) - PRIMITIVE Community Identification
- PROTOCOL Content Outsourcing
- GOAL Latency Reduction
- Wireless Multimedia Sensor Nets (2nd round review
_at_ ACM Mobile Networks Applications) - PRIMITIVE Important Sensor Nodes Identification
- PROTOCOL Cooperative Caching
- GOAL Latency Reduction
- Why Collective Intelligence?
3My Research Areas (chronological info)
- WIRELESS NETWORKS
- Mobile Pervasive Computing
- Data Management
- Caching (04)
- Air-Indexing (07)
- Data Dissemination
- Broadcast Scheduling (04)
- Prediction
- Mobility Prediction (0308)
- Prefetching (03)
- Mobile Ad Hoc Networks
- Content-based Multimedia Retrieval (0508)
- Broadcasting (0608)
- Wireless Sensor Networks
- Sensor Network Clustering (07)
- (DistrLocal) Data Indexing (0608)
- Cooperative Caching (0708)
- Data Dissemination (08)
- WIRED NETWORKS
- Conventional and Streaming Media Distribution in
the Web - Replication (03)
- Prefetching (010203)
- Caching (04)
- Overlay and P2P Networks
- Content Distribution Networks (0506)
- Content Placement in CDNs (0708)
- Indexing Query Routing in P2P (progress)
- Distributed Structures over P2P (progress)
- Web Information Retrieval and Data Mining
- Web Link Mining (05)
- Web Ranking (0708)
- Rank Aggregation (0708)
- Social Network Analysis (0708)
- Bibliometrics (060708)
4Research areas Ultimately ? ???
Mobile/Pervasive Computing
Web
Pervasive Web
Overlay Nets
Caching Air-Indexing
Peer-to-Peer Networks
Caching Prefetching Replication
Semistructured Data
Web views
Webcasting
Content Distribution Networks
Location Tracking
Ad Hoc
Content-Based MIR
Broadcasting Data Dissemination
Web Ranking Search Engines
Cooperative Caching Sensor Node Clustering
Distributed Indexing Coverage/Connectivity
Flash storage
Social Network Analysis
Information Retrieval
Sensors
5Social Network Analysis
- A social network is a social structure to
describe social relations (wikipedia) - The history of Social Network is older than
everybody who is here - More than 100 years (Cooley 1909, Durkheim 1893)
- Focusing on small groups
- Information Techniques give it a new life
- book Stanley Wasserman Katherine Faust
- Mathematical Representation
- Structural Locational Properties
- Roles Positions
- Dyadic Triadic Methods
6Social Network Analysis
- Stanley Wasserman Katherine Faust
- Mathematical Representation
- Structural Locational Properties
- Centrality
- Betweenness Centrality
- Roles Positions
- Dyadic Triadic Methods
7Betweenness Centrality
- Let suw swu denote the number of shortest paths
from u ? V to w ? V (by definition, suu 0) - Let suw(v) denote the number of shortest paths
from u to w that some vertex v ? V lies on - The Betweenness Centrality index NI(v) of a
vertex v is defined as - Large values for the NI index of a node v
indicate that this node can reach others on
relatively short paths, or that v lies on
considerable fractions of shortest paths
connecting others
8The NI index in sample graphs
In parenthesis, the NI index of the respective
node i.e., 7(156) node with ID 7 has NI equal
to 156.
- Nodes with large NI
- Articulation nodes (in bridges), e.g., 3, 4, 7,
16, 18 - With large fanout, e.g., 14, 8, U
- Therefore geodesic nodes
9The NI index in a localized algorithm
- For any node v, the NI indexes of the nodes in
N12(v) calculated only for the subgraph of the
2-hop (in general, k-hop) neighborhood reveal the
relative importance of the nodes in N12 - For a node u (of the 2-hop neighbourhood of a
node v), the NI index of u will be denoted as
NIv(u)
10Betweenness Centrality in
- WEB Performing graph clustering and recognizing
communities in Web site graphs - WIRELESS MULTIMEDIA SENSOR NETWORKS Recognizing
(in a distributed fashion) important sensor
nodes, the mediators, that coordinate cooperative
caching decisions
11Community Identification Content Outsourcing
for the Web
12The need for content outsourcing
13CiBC Method
- Target is true
- CiBC method
- Building cliques and clusters around
representative (pole) nodes (with low CB) - Earlier methods have
- Defined hard communities ?node
deg(inCom)gtdeg(outCom) - exploited edge betweenness to perform
hierarchical agglomerative clustering
14CiBC Method
Phase 1 NI Computation -O(nm) Phase 2
Initialization of cliques O(n)
ID NI index
10 20.68
2 19.61
6 11.38
1 10.28
7 2.06
0 1.73
9 0.99
8 0.99
4 0.75
5 0.00
11 0.00
15CiBC Method
Phase 2 Initialization of cliques O(n)
ID NI index
10 20.68
2 19.61
6 11.38
1 10.28
7 2.06
0 1.73
9 0.99
8 0.99
4 0.75
5 0.00
11 0.00
16CiBC Method
Phase 2 Initialization of cliques O(n)
ID NI index
10 20.68
2 19.61
6 11.38
1 10.28
7 2.06
0 1.73
9 0.99
8 0.99
4 0.75
5 0.00
11 0.00
17CiBC Method
Phase 2 Initialization of cliques O(n)
ID NI index
10 20.68
2 19.61
6 11.38
1 10.28
7 2.06
0 1.73
9 0.99
8 0.99
4 0.75
5 0.00
11 0.00
18CiBC Method
Phase 2 Initialization of cliques O(n)
ID NI index
10 20.68
2 19.61
6 11.38
1 10.28
7 2.06
0 1.73
9 0.99
8 0.99
4 0.75
5 0.00
11 0.00
19CiBC Method
A B C D
A 3 3 0 0
B 3 3 1 1
C 0 1 3 4
D 0 1 4 3
A
Phase 3 Clique Merging Creation of
Communities
B
Complexity O(l2) l is the number of cliques
C
D
20CiBC Method
A B C D
A 3 3 0 0
B 3 3 1 1
C 0 1 3 4
D 0 1 4 3
A
Phase 3 Clique Merging Creation of
Communities
B
4 3
C
D
21CiBC Method
A B C
A 3 3 0
B 3 3 2
C 0 2 10
A
Phase 3 Clique Merging Creation of
Communities
B
C
22CiBC Method
A B C
A 3 3 0
B 3 3 2
C 0 2 10
A
Phase 3 Clique Merging Creation of
Communities
B
C
23CiBC Method
A C
A 9 2
C 2 10
A
Phase 3 Clique Merging Creation of
Communities
Phase 4 Check constraints
C
24CiBC vs. Clique Percolation Method, LRU
25Cooperative Caching in Wireless Multimedia Sensor
Networks
26The NICoCa protocol
- Each node is aware of its 2-hop neighborhood
- Uses NI to characterize some neighbors as
mediators - A node can be either a mediator or an ordinary
node - Each sensor node stores
- the dataID, and the actual multimedia datum
- the data size, TTL interval
- for each cached item, the timestamps of the K
most recent accesses - each cached item is characterized either as O
(i.e., own) or H (i.e., hosted)
27The cache discovery protocol (1/2)
- A sensor node issues a request for a multimedia
item - Searches its local cache and if it is found
(local cache hit) then the K most recent access
timestamps are updated - Otherwise (local cache miss), the request is
broadcasted and received by the mediators - These check the 2-hop neighbors of the requesting
node whether they cache the datum (proximity hit) - If none of them responds (proximity cache miss),
then the request is directed to the Data Center
28The cache discovery protocol (2/2)
- When a mediator receives a request, searches its
cache - If it deduces that the request can be satisfied
by a neighboring node (remote cache hit),
forwards the request to the neighboring node with
the largest residual energy - If the request can not be satisfied by this
mediator node, then it does not forward it
recursively to its own mediators, since this will
be done by the routing protocol, e.g., AODV - If none of the nodes can help, then requested
datum is served by the Data Center (global hit )
29Cache vs. hits (MB files uniform access) in a
sparse WMSN (d 4)
HYBRID appears at L. Yin and G. Cao,
Supporting cooperative caching in ad hoc
networks, IEEE Transactions on Mobile Computing,
5(1)77-89, 2006
30Cache vs. hits (MB files uniform access) in a
dense WMSN (d 7)
HYBRID appears at L. Yin and G. Cao,
Supporting cooperative caching in ad hoc
networks, IEEE Transactions on Mobile Computing,
5(1)77-89, 2006
31Evolution of cyberspace
- Semantic Web Pervasive Computing
- WWW Broadband WIFI grid computing
- Unicode XML RDF Ontologies
- Internet Multimedia URL HTTP HTML
- Servers Telecom Networks PCs TCP-IP
e-mail FTP - Computers Micro-chips Application Software
WYSIWYG Interfaces
Collective Intelligence Net
Semantic Web
WWW
Internet
PC
Computer
32Why Collective Intelligence?
- Users/ devices generate data at an unprecedented
rate - Blogs
- Tags
- Sensor measurements
- Web pages
- Rankings by search engines
- They are opinions or votes
- Under some conditions group IQ gt individual IQ
- So far Opinion/Vote fusion
- PageRank (i.e., collective linking preferences)
- Metasearching (ranked list merging)
- Collaborative filtering (what is interesting from
what other people say, what people like you say) - ..
33Collective Intelligence challenges
- Statistical analysis of social networks
- Identification of influential opinions and/or
producers (e.g., bloggers) - Discover social context to provide
personalization searching - Opinion spam
- Bias filtering
34Collective Intelligence challenges
- Finding high-quality content
- Opinion mining
- Dealing with controversies (e.g., in Wikipedia)
- Metadata from data analysis
- Storage of metadata
- MOST IMPORTANTLY
- In Centralized and/or Distributed settings
35Thank you for your attention!
36References
- Our work
- D. Katsaros?, G. Pallis, K. Stamos, A.
Sidiropoulos, A. Vakali, Y. Manolopoulos. CDNs
Content Outsourcing via Generalized Communities.
IEEE Transactions on Knowledge and Data
Engineering, (under second round review),
December, 2007. - N. Dimokas, D. Katsaros, and Y. Manolopoulos,
Cooperative Caching in Wireless Multimedia
Sensor Networks ACM Mobile Networks and
Applications, (under second round review),
February, 2008. - Competing methods
- CPM community identification method G. Palla,
I.Derenyi, I.Farkas, and T.Vicsek. Uncovering the
overlapping community structure of complex
networks in nature and society. Nature,
435(7043)814818, 2005. - Hybrid cooperative caching method L. Yin and G.
Cao. Supporting cooperative caching in ad hoc
networks. IEEE Transactions on Mobile Computing,
5(1)7789, 2006.