Title: Organization of Complex Networks
1Organization of Complex Networks
- Maksim Kitsak
- Advisor H. Eugene Stanley
Three questions addressed in the thesis:
Q1: Betweenness centrality of fractal and non-fractal networks?
Q2: Leadership and structure of industry networks?
Q3: What (who) are the most efficient spreaders?
Ph.D. Final Oral Examination, April 15, 2009
2Collaborators
F. Liljeros (Stockholm University)
A.V. Goltsev, S.N. Dorogovtsev, J.F.F.
Mendes (University of Aveiro)
G. Paul (Boston University)
S. V. Buldyrev (Yeshiva University)
M. Riccaboni, F. Pammolli (University of Florence)
L.K. Gallos, H.A. Makse (City College of New
York)
S. Havlin (Bar-Ilan University)
3Networks Definitions
1) A network is a set of nodes (objects) connected
by edges (relations).
2) The degree (k) of a node is the number of edges
connected to it.
3) The degree distribution P(k) is the probability
that a randomly chosen node has degree k.
4) The betweenness centrality C(i) of node i is
(approximately) the total number of shortest
paths passing through this node (important for transport).
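The three quantities defined above can be computed directly on a small graph. The following is a minimal sketch, not the code used in the thesis: it assumes an undirected, unweighted graph stored as a dictionary of neighbor sets, and it uses Brandes' algorithm for betweenness, in which a pair of nodes joined by several shortest paths contributes fractionally to each. The toy graph `star` is hypothetical.

```python
from collections import Counter, deque

def degrees(adj):
    """Degree k of every node: the number of edges attached to it."""
    return {node: len(nbrs) for node, nbrs in adj.items()}

def degree_distribution(adj):
    """P(k): fraction of nodes that have degree k."""
    counts = Counter(degrees(adj).values())
    n = len(adj)
    return {k: c / n for k, c in sorted(counts.items())}

def betweenness(adj):
    """Betweenness centrality via Brandes' algorithm (undirected, unweighted)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate path dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every undirected pair was counted once from each endpoint.
    return {v: b / 2 for v, b in bc.items()}

# Hypothetical toy graph: hub 0 connected to leaves 1..4.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(degrees(star))
print(degree_distribution(star))
print(betweenness(star))
```

In the star, every shortest path between two leaves passes through the hub, so the hub carries all the betweenness while the leaves carry none.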
4Scale-Free Degree Distribution
[Figure: degree distribution of the US airline network, log P(k) vs. log k]
A scale-free degree distribution favors the
existence of hubs (nodes with many connections)!
5Structure Analysis K-cores and K-shells
The k-core is the largest sub-graph in which every node has degree at
least k within the sub-graph.
Pruning rule (extracting the 2-core):
1) Remove all nodes with k ≤ 1.
   Some remaining nodes may now have k ≤ 1.
2) Repeat until there are no nodes with k ≤ 1.
3) The remaining network forms the 2-core.
4) Repeat the process for higher k to extract
   the other cores.
S. B. Seidman, Social Networks, 5, 269 (1983).
The k-shell is the set of nodes that belong to the
k-core but NOT to the (k+1)-core.
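The pruning rule can be sketched in a few lines. This is an illustrative implementation under simple assumptions (an undirected graph stored as a dictionary of neighbor sets), not the code used in the thesis; the function name `k_shells` and the toy graph are hypothetical.

```python
def k_shells(adj):
    """Return {node: k_s}, the index of the k-shell each node belongs to."""
    # Work on a copy so the caller's graph is left untouched.
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    shell = {}
    k = 0
    while remaining:
        # Peel off nodes of degree <= k; removing a node can push its
        # neighbors down to degree <= k, so repeat until none are left.
        to_remove = [v for v, nbrs in remaining.items() if len(nbrs) <= k]
        while to_remove:
            for v in to_remove:
                shell[v] = k
                for w in remaining.pop(v):
                    if w in remaining:
                        remaining[w].discard(v)
            to_remove = [v for v, nbrs in remaining.items() if len(nbrs) <= k]
        # Whatever survives belongs to the (k+1)-core.
        k += 1
    return shell

# Hypothetical toy graph: triangle 0-1-2 with a chain 0-3-4 attached.
toy = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {3}}
print(k_shells(toy))
```

Node 3 has degree 2 but still lands in the 1-shell, because it loses a neighbor once node 4 is pruned. This is why the k-shell index can differ from the degree.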
6Leadership and structure of Industry Networks
We consider 2 sectors of industry:
1) Life Sciences (LS)
2) Information and Communication Technologies (ICT)
Q1: Which firms are the industry leaders?
Q2: How long does the leader maintain its leading position?
Q3: What is the structure of industry networks?
Q4: What are the growth principles of industry networks?
M. Kitsak, M. Riccaboni, S. Havlin, F. Pammolli
and H.E. Stanley, Phys. Rev. E (submitted) (2009).
7Leadership in Industry Networks
Q: What is an industry network?
A: An industry network is a network of firms (nodes) connected
by business contracts (edges).
Q: Should one distinguish among sectors of industry?
A: Yes! We analyze the Information and Communication Technology (ICT)
sector (8000 firms) and the Life Sciences (LS) sector (7000 firms).
A schematic picture by www.optimice.com.au
Q: Which firms are the leaders of a given industry sector?
A1: Firms that sign the most business contracts (largest degree).
A2: Firms that sign business contracts with the most leaders
(largest k-core value).
8How are k-shell layers connected within the LS
network?
Jellyfish Structure
Nucleus (0.5%): market leaders (Pfizer, GSK, Novartis...).
Tendrils (10%): small-degree nodes connected exclusively to the nucleus.
Bulk body (90%): the remaining firms.
Tendrils are start-up firms that sell their
products exclusively to leaders.
9How are k-shell layers connected in the ICT
network?
Jellyfish Structure
The ICT sector is highly competitive and unstable
(Riccaboni et al. 2002). In the ICT sector there
is an emphasis on integrating technologies. There are more
deals among small-degree companies, which
results in a smaller number of k-shell layers.
Q: Other networks with jellyfish structure?
- The Internet at the Autonomous System level (S. Carmi et al. 2007)
- The network of workplaces in Sweden (Y. Chen et al. 2007)
- Configurational model with scale-free connectivity (M. Kitsak et al. 2008)
10Growth principles of the LS and the ICT
industries?
Cumulative degree distribution
1. LS grows linearly while ICT expands exponentially.
2. Both networks are scale-free (jellyfish structure).
Growth principles:
a) Growth, power-law connectivity:
   preferential attachment? (Barabasi et al. 1999)
b) Offset in the degree distribution, c > 0:
   exclusive deals? (random links) (R. Albert et al. 2000)
c) K-shell structure of the LS and the ICT
   networks: ???
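To make candidate principles (a) and (b) concrete, here is a hedged sketch of a growth model that mixes preferential attachment with random linking. It is a generic textbook-style construction, not the model fitted in the thesis; the function name `grow_network` and the parameters `m` (links per new node) and `p_random` (share of random links) are assumptions introduced for illustration.

```python
import random

def grow_network(n, m, p_random, seed=0):
    """Grow an undirected graph to n nodes; each new node adds m links.

    Each link target is drawn uniformly at random with probability
    p_random, otherwise with probability proportional to its degree.
    """
    rng = random.Random(seed)
    # Seed graph: a complete graph on m + 1 nodes.
    adj = {i: set(range(m + 1)) - {i} for i in range(m + 1)}
    # Each node appears in `stubs` once per edge end, so a uniform draw
    # from this list is a degree-proportional draw over nodes.
    stubs = [v for v, nbrs in adj.items() for _ in nbrs]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            if rng.random() < p_random:
                targets.add(rng.randrange(new))   # random link
            else:
                targets.add(rng.choice(stubs))    # preferential attachment
        adj[new] = set(targets)
        for t in targets:
            adj[t].add(new)
            stubs.extend((t, new))
    return adj

net = grow_network(n=200, m=2, p_random=0.3)
print(max(len(nbrs) for nbrs in net.values()))  # degree of the largest hub
```

With `p_random = 0` this reduces to pure preferential attachment. Raising `p_random` makes the attachment probability depend on the degree plus a constant, which is one way an offset in the degree distribution can arise.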
11Leadership and structure of Industry Networks
Q1: Which firms are the industry leaders?
A1: Firms in the innermost k-cores.
Q2: How long does the leader maintain its leading position?
A2: (Not shown here) The Life Sciences sector is very stable, while the
Information and Communication Technologies sector is highly competitive.
Q3: What is the structure of industry networks?
A3: Both sectors have scale-free connectivity. The Life Sciences sector has
jellyfish structure.
Q4: What are the growth principles of industry networks?
A4: Preferential attachment, random linking ????
M. Kitsak, M. Riccaboni, S. Havlin, F. Pammolli
and H.E. Stanley, Phys. Rev. E (submitted) (2009).
12Who (What) are the most efficient spreaders?
- Epidemics and information spread fast in social
  and technological systems.
- It is hard to eradicate a virus after it has
  spread to a significant area.
- Hubs are infected early during epidemic outbreaks
  and contribute the most to the spread of an epidemic.
Q1: Do all hubs conduct epidemics similarly?
Q2: Are shortest paths important in spreading?
Q3: Does spreading depend on the network
structure?
Nodes in the highest k-shell layers are the most
efficient spreaders!
M. Kitsak, L.K. Gallos, F. Liljeros, S. Havlin,
H.E. Stanley and H.A. Makse (in preparation)
(2009).
13A Tool for Epidemics Studies The SIR Model
- All nodes of the network are initially in the
  susceptible (S) state, except for one or several
  nodes that are initially in the infective (I) state.
- At every time step, every infective (I) node
  attempts to infect each of its susceptible (S)
  neighbors (with probability β).
- After T_R time steps the infective node
  recovers. Recovered (R) nodes can no longer be
  infected.
SIR: Susceptible, Infected, Recovered
The number of infected individuals: M = 7
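The SIR rules above translate into a short simulation. This is a minimal sketch rather than the simulation code behind the slides: the graph representation (dictionary of neighbor sets), the function names, the synchronous update order, and the choice to count the origin in M are all assumptions.

```python
import random

def sir_infected_mass(adj, origin, beta, t_r, rng):
    """Run one SIR outbreak from `origin` and return M, the number of
    nodes ever infected (the origin is counted too, by assumption)."""
    # infective[v] = number of time steps v will remain infective.
    infective = {origin: t_r}
    recovered = set()
    while infective:
        newly_infected = set()
        for v in infective:
            for w in adj[v]:
                susceptible = (w not in infective and w not in recovered
                               and w not in newly_infected)
                if susceptible and rng.random() < beta:
                    newly_infected.add(w)
        # Age the current infectives; those whose time is up recover.
        for v in list(infective):
            infective[v] -= 1
            if infective[v] == 0:
                del infective[v]
                recovered.add(v)
        for w in newly_infected:
            infective[w] = t_r
    return len(recovered)

def average_infected_mass(adj, origin, beta, t_r, runs=1000, seed=0):
    """Average M over many outbreaks that all start at `origin`."""
    rng = random.Random(seed)
    total = sum(sir_infected_mass(adj, origin, beta, t_r, rng)
                for _ in range(runs))
    return total / runs

# Hypothetical toy graph: a path 0 - 1 - 2.
path = {0: {1}, 1: {0, 2}, 2: {1}}
print(average_infected_mass(path, origin=0, beta=0.5, t_r=1))
```

Repeating `average_infected_mass` for every possible origin gives a per-node spreading efficiency that can then be compared against degree, betweenness centrality, or k-shell index.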
14Identifying efficient spreaders in the Hospital
network.
1) Inpatients are connected if they shared
the same ward for at least 20 days.
2) For every node i in the system, calculate the average
number of nodes infected, M, when the virus
originates from node i.
3) Group nodes based on
their degree (k), betweenness centrality (C) and
k-shell value (k_s).
A lot of hubs are inefficient spreaders!
1) M is correlated with the k-shell index of the
infection origin but not with degree or
betweenness centrality!
2) Identification of the
most efficient spreaders by their k-shell values
is 80% efficient (degree: 60%, betweenness
centrality: 20%).
15An epidemic starts at a randomly chosen node!
Probability that a particular node gets infected?
Typical time it takes for the virus to reach a particular node?
1) Count the fraction of times a given node gets
infected: E.
2) Measure the average inverse time it takes for
the virus to reach a given node: <T^-1>.
M, E and <T^-1> are strongly correlated for
all values of β and T_R.
16Why are M, E and <T^-1> correlated?
[Figure: epidemic path through nodes i, j and k, starting at the epidemic origin]
Neighbors are infected with probability β per
time step.
Probability that the virus will not propagate along
a given edge:
in 1 day: (1-β)
in 2 days: (1-β)(1-β)
in T_R days: (1-β)^T_R
Probability that the virus will propagate
along a given edge in T_R days: 1-(1-β)^T_R
SIR spreading can be mapped onto the edge
percolation problem! M.E.J. Newman, Phys. Rev. E
66, 016128 (2002).
Pre-select (mark) edges at random with
probability 1-(1-β)^T_R.
Virus originating at node i infects node j in l_ij
time steps. l_ij is the length of the shortest path between
nodes i and j along marked edges.
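The mapping onto edge percolation can be illustrated as follows. This is a sketch under stated assumptions, not the thesis code: the helper names are hypothetical, node labels are assumed to be comparable (for example integers) so that each undirected edge is visited once, and the path lengths follow the simplified picture given on this slide.

```python
import random
from collections import deque

def transmissibility(beta, t_r):
    """Probability that the virus crosses a given edge within t_r steps."""
    return 1 - (1 - beta) ** t_r

def mark_edges(adj, p, rng):
    """Pre-select each undirected edge independently with probability p."""
    marked = {v: set() for v in adj}
    for v in adj:
        for w in adj[v]:
            # v < w ensures every undirected edge is tested exactly once.
            if v < w and rng.random() < p:
                marked[v].add(w)
                marked[w].add(v)
    return marked

def path_lengths(marked, origin):
    """l_ij for i = origin: shortest-path lengths along marked edges.
    Nodes missing from the result are never reached by the outbreak."""
    dist = {origin: 0}
    queue = deque([origin])
    while queue:
        v = queue.popleft()
        for w in marked[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

# Hypothetical toy graph: a path 0 - 1 - 2.
path = {0: {1}, 1: {0, 2}, 2: {1}}
p = transmissibility(beta=0.5, t_r=2)
reached = path_lengths(mark_edges(path, p, random.Random(0)), origin=0)
print(p, len(reached))  # edge probability and one sample of the infected mass
```

In this picture the infected mass M of one outbreak is the size of the marked cluster containing the origin, so averaging `len(reached)` over many edge markings estimates the average infected mass.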
17Why are M, E and <T^-1> correlated?
- The epidemic centrality of a node, E, is the
  probability that this node is infected during an
  epidemic outbreak.
- It is relatively easy to measure how often each
  individual gets sick.
Are the most efficient spreaders those who get
sick often? (Further analysis is necessary in the
case of an inhomogeneous society: disorder in
transmissibility, etc.)
The most efficient spreaders are located in the
highest k-shell layers. The most efficient
spreaders are also likely to be infected early
during an epidemic outbreak.
18Summary
1) k-shell decomposition allows partitioning the
network into subgraphs with different properties
(jellyfish structure).
2) Nodes in the highest k-shell layers are the
most effective in spreading epidemics.
3) (Not discussed here) The k-shell organization of a
network is highly sensitive to even minor changes
in network structure.
20Acknowledgements
I thank my wife Evgeniya for
letting me be part of her life and for her
endless love, support and patience during the
years of my PhD studies. I gratefully thank H.
Eugene Stanley for being my advisor during my PhD
studies at Boston University. I thank Gene for
his patience, guidance and for teaching me to
express scientific ideas clearly. I am thankful
to Shlomo Havlin for being my main collaborator
and mentor during the last 4 years of my studies
at Boston University. I greatly appreciate all
the knowledge and experience Shlomo has
generously shared with me. I thank Hernan A.
Makse for his guidance and welcoming me to
collaborate with his group at the CCNY. I greatly
appreciate insightful discussions I had with
Hernan and everything he has taught me. I would
like to express my gratitude to all my
collaborators (in alphabetical order) Sergey V.
Buldyrev, Reuven Cohen, Sergey N. Dorogovtsev,
Lazaros K. Gallos, Alexander V. Goltsev, Dmitri
Gorbach, Fredrik Liljeros, Jose F. Mendes, Omar
Ormachea, Fabio Pammolli, Massimo Riccaboni and
Jia Shao for working with me and everything I
have learnt from them. I thank Albert-Laszlo
Barabasi, Lidia Braunstein, Lou Chitkushev, Paul
Krapivsky, Gerry Paul, Orion Penner, Hernan D.
Rozenfeld, Diego Rybsky and Chaoming Song for
valuable discussions that helped me in my
research.
21Why are M_i and <T^-1>_i correlated?
The epidemic centrality of a node is the relative
frequency with which this node is infected. Are the most
efficient spreaders those who get sick very
often? (Further analysis is necessary in the case
of an inhomogeneous society: disorder in
transmissibility, etc.)
22How does Infected Mass depend on contagiousness
of epidemics?
[Figure: average infected mass vs. contagiousness]
The virus needs to be significantly more
contagious to reach the same epidemic
level when it originates from low k-shell
nodes. Even a weakly contagious virus can infect a
large portion of the network, provided it
originates from a high k-shell node.
23Take Home Messages
Structure of networks is extremely important in
spreading processes!
K-shell index adequately describes the efficiency
of a node in epidemics!
Immunization strategies should be optimized
according to the particular structure of the network.
24SIR Epidemics Model Average Infected Mass vs.
Infection Origin
The average infected mass is strongly correlated
with the k-shell index of the infection origin
but not with its degree or betweenness
centrality!
27Transition from Fractal to Non-Fractal Behavior
Analytical Consideration
28Centrality distribution Analytical Consideration