Title:
1Adversarial Deletion in Scale Free Random Graph
Process by A.D. Flaxman et al.
- Hammad Iqbal
- CS 3150
- 24 April 2006
2Talk Overview
- Background
- Large graphs
- Modeling large graphs
- Robustness and Vulnerability
- Problem and Mechanism
- Main Results
- Adversarial Deletions During Graph Generation
- Results
- Graph Coupling
- Construction of the proofs
3Large Graphs
- Modeling of large graphs has recently generated interest (since the 1990s)
- Driven by the computerization of data acquisition and greater computing power
- Theoretical models are still being developed
- Modeling difficulties include
- Heterogeneity of elements
- Non-local interactions
4Large Graphs Examples
- Hollywood graph: 225,000 actors as vertices; an edge connects two actors if they were cast in the same movie
- World Wide Web: 800 million pages as vertices; links from one page to another are the edges
- Citation pattern of scientific publications
- Electrical power grid of the US
- Nervous system of the nematode worm
Caenorhabditis elegans
5Small World of Large Graphs
- Large naturally occurring graphs tend to show
- Sparsity
- Hollywood graph has 13 million edges (25 billion for a clique of 225,000 vertices)
- Clustering
- In WWW, two pages that are linked to the same page have a higher prob of including a link to one another
- Small Diameter
- Diameter grows like log n
- D.J. Watts and S.H. Strogatz, Collective dynamics
of 'small-world' networks, Nature (1998)
7Erdős-Rényi Random Graphs
- Developed around 1960 by Hungarian mathematicians Paul Erdős and Alfréd Rényi
- Traditional models of large scale graphs
- G(n,p): a graph on n vertices where each pair is joined independently with prob p
- Weaknesses
- Fixed number of vertices
- No clustering
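The G(n,p) definition above can be sketched in a few lines. This is a minimal illustration only; the function name `sample_gnp` and the `seed` parameter are assumptions made for the example, not part of the talk.

```python
import itertools
import random

def sample_gnp(n, p, seed=None):
    """Return the edge list of one G(n, p) sample on vertices 0..n-1."""
    rng = random.Random(seed)
    # Each of the n*(n-1)/2 unordered pairs is joined independently with prob p.
    return [(u, v) for u, v in itertools.combinations(range(n), 2)
            if rng.random() < p]

edges = sample_gnp(200, 0.05, seed=1)
# The expected mean degree is p * (n - 1) = 9.95; the sample fluctuates around it.
mean_degree = 2 * len(edges) / 200
```

The number of vertices is fixed up front, which is the first weakness listed on the slide.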
8Watts-Strogatz Model
- Starting from a ring lattice with n vertices and k edges per vertex, rewire each edge with prob p to a randomly chosen destination
- A good model for the Hollywood graph
- Web is also shown to fit the small world model
- Weakness: constant n
10Barabasi model
- Incorporates growth and preferential attachment
- Evolves to a steady scale-free state: the distribution of node degrees doesn't change over time
- Prob of finding a vertex with k edges $\sim k^{-3}$
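Growth plus preferential attachment can be sketched as follows. This is a simplified variant under stated assumptions: the seed graph is taken to be a clique on m + 1 vertices and repeated targets are rejected, neither of which is specified on the slide. The name `preferential_attachment` is illustrative.

```python
import random

def preferential_attachment(n, m, seed=None):
    """Grow a graph to n vertices; each new vertex sends m edges to
    existing vertices chosen with probability proportional to degree."""
    rng = random.Random(seed)
    # Assumed seed graph: a clique on m + 1 vertices.
    edges = [(u, v) for u in range(m + 1) for v in range(u)]
    # Each vertex appears in `endpoints` once per incident edge, so a
    # uniform draw from the list is a degree-proportional draw.
    endpoints = [x for e in edges for x in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:          # reject repeats (assumption)
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges

edges = preferential_attachment(100, 3, seed=0)
```

Keeping one list entry per edge endpoint avoids recomputing degree weights at every step.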
11Degree Distribution
- Scale Free
- $P[X = k] \sim c\,k^{-\alpha}$
- Power Law distributed
- Heavy Tail
- Erdős-Rényi Graphs
- $P[X = k] = e^{-\lambda} \lambda^{k} / k!$
- $\lambda$ depends on N
- Poisson distributed
- Decays rapidly for large k
- $P[X = k] \to 0$ for large k
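To make the contrast between the two tails concrete, the sketch below evaluates both probability mass functions at a few degrees. The parameter values (mean 4, exponent 3) and the truncation point `k_max` are arbitrary assumptions chosen for illustration.

```python
import math

def poisson_pmf(k, lam):
    """P[X = k] for a Poisson(lam) degree, as in Erdos-Renyi graphs."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def power_law_pmf(k, alpha, k_max=100_000):
    """P[X = k] = c * k**(-alpha) on k = 1..k_max, with c chosen to normalise."""
    c = 1.0 / sum(j ** -alpha for j in range(1, k_max + 1))
    return c * k ** -alpha

for k in (1, 10, 100):
    print(k, poisson_pmf(k, 4.0), power_law_pmf(k, 3.0))
```

At k = 100 the Poisson value is many orders of magnitude below the power-law value, which is the heavy-tail effect the slide refers to.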
12Power Law distributions
- Also referred to as heavy-tail, Pareto, Zipfian distributions
- Pervasive in many naturally occurring phenomena
- Scale-free graphs have power law distributions: $P[X = k] \sim c\,k^{-\alpha}$, $c > 0$ and $\alpha > 0$
13Exponential (ER) vs Scale Free
[Figure: exponential (ER) and scale-free networks, 130 vertices and 430 edges. Red: the 5 most highly connected vertices. Green: neighbors of red. Source: Albert, Jeong, Barabasi 2000]
14Degree Sequence of WWW
- In-degree for WWW pages is power-law distributed with $x^{-2.1}$
- Out-degree: $x^{-2.45}$
- Av. path length between nodes: 16
16Robustness and Vulnerability
- Many complex systems display inherent tolerance against random failures
- Examples: genetic systems, communication systems (Internet)
- Redundant wiring is common but not the only factor
- This tolerance is only shown by scale-free graphs (Albert, Jeong, Barabasi 2000)
17Inverse Bond Percolation
- What happens when a fraction p of edges are removed from a graph?
- Threshold prob $p_c(N)$
- Connected if edge removal probability $p < p_c(N)$
- Infinite-dimensional percolation
- Worse for node removal
18General Mechanism
- Barabasi (2000): networks with the same number of nodes and edges, differing only in degree distribution
- Two types of node removals
- Randomly selected nodes
- Highly connected nodes (worst case)
- Study parameters
- Size of the largest remaining cluster (giant component) S
- Average path length l
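The experiment described above (delete nodes after generation, then measure S) can be sketched as below. This is an illustrative setup, not the one used in the cited study: the graph size, the 5% deletion fraction, the clique seed graph, and the helper names `grow_graph` and `largest_component` are all assumptions.

```python
import random
from collections import defaultdict

def grow_graph(n, m, rng):
    """Preferential-attachment adjacency sets (assumed seed: clique on m + 1)."""
    adj = defaultdict(set)
    endpoints = []                      # one entry per edge endpoint
    def link(u, v):
        adj[u].add(v)
        adj[v].add(u)
        endpoints.extend((u, v))
    for u in range(m + 1):
        for v in range(u):
            link(u, v)
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            link(new, t)
    return adj

def largest_component(adj, removed):
    """Size S of the largest cluster after deleting the vertices in `removed`."""
    seen, best = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:                    # iterative depth-first search
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

rng = random.Random(0)
adj = grow_graph(2000, 2, rng)
k = 100                                 # delete 5% of the vertices
random_removed = rng.sample(sorted(adj), k)
targeted_removed = sorted(adj, key=lambda u: len(adj[u]), reverse=True)[:k]
S_random = largest_component(adj, random_removed)
S_targeted = largest_component(adj, targeted_removed)
print(S_random, S_targeted)
```

Comparing `S_random` with `S_targeted` across several seeds and deletion fractions gives a rough picture of the two removal types; average path length l is not computed here.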
19Main Results (Deletion occurs after generation)
Why is this important?
[Figure legend: random node removal; preferential node removal]
21Main Result
- Time steps 1, ..., n
- New vertex with m edges using preferential att.
- Total deleted vertices at most $\delta n$ (adversarially)
- $m \gg \delta$
- w.h.p. a component of size at least n/30
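The process on this slide can be simulated roughly as follows. This is a toy sketch under several assumptions: the adversary here is one fixed heuristic (delete the current highest-degree vertex at regular intervals), whereas the result covers an arbitrary adversary; repeated targets collapse into a single edge; and the values of n, m and the deletion fraction are small illustrative numbers that are not claimed to meet the "sufficiently large m" condition. The names `grow_with_deletions` and `largest_component` are hypothetical.

```python
import random

def grow_with_deletions(n, m, delta, seed=None):
    """Preferential attachment for n steps while a heuristic adversary
    deletes about delta * n vertices (always the current highest degree)."""
    rng = random.Random(seed)
    adj = {0: set()}                       # surviving vertex -> neighbours
    budget = int(delta * n)                # total deletions allowed
    period = max(1, n // max(1, budget))   # spread deletions over time
    for t in range(1, n):
        alive = list(adj)
        weights = [len(adj[v]) for v in alive]
        if sum(weights) == 0:
            weights = None                 # no edges yet: uniform choice
        targets = set(rng.choices(alive, weights=weights, k=m)) if alive else set()
        adj[t] = set()
        for v in targets:
            adj[t].add(v)
            adj[v].add(t)
        if budget > 0 and t % period == 0:
            victim = max(adj, key=lambda v: len(adj[v]))
            for u in adj.pop(victim):      # remove victim and its edges
                adj[u].discard(victim)
            budget -= 1
    return adj

def largest_component(adj):
    """Size of the largest connected component of the surviving graph."""
    seen, best = set(), 0
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        stack, size = [s], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

n = 2000
adj = grow_with_deletions(n, m=10, delta=0.05, seed=0)
print(len(adj), largest_component(adj), n / 30)
```

The printed values let the largest surviving component be compared against the n/30 figure for this one heuristic adversary.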
22Formal Statements
- Theorem 1
- For any sufficiently small constant $\delta$ there exists a sufficiently large constant $m = m(\delta)$ and a constant $\theta = \theta(\delta, m)$ such that whp $G_n$ has a giant connected component with size at least $\theta n$
23Graph Coupling
[Figure: random graph G(n,p); red: vertices of the induced graph, $\Gamma_n$]
24Informal Proof Construction
- A random graph can be tightly coupled with the scale free graph on the induced subset (Theorem 2)
- Deleting few edges from a random graph with relatively many edges will leave a giant connected component (Lemma 1)
- There will be a sufficient number of vertices for the construction of the induced subset w.h.p. (Lemma 2)
25Formal Statements
- Theorem 2
- We can couple the construction of $G_n$ and a random graph $H_n$ such that $H_n \sim G(\Gamma_n, p)$ and whp
- $e(H_n \setminus G_n) \le A e^{-Bm} n$
- Difference in edge sets of $G_n$ and $H_n$ decreases exponentially with the number of edges m per new vertex
26Induced Sub-graph Properties
- Vertex classification at each time step t
- Good if
- Created after t/2
- Number of original edges that remain undeleted $\ge m/6$
- Bad otherwise
- $\Gamma_t$: set of good vertices at time t
- Good vertex can become bad
- Bad vertex remains bad
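The classification rule can be written as a small predicate. This sketch assumes the edge threshold reads "at least m/6" and that "created after t/2" is a strict inequality; the function and argument names are illustrative.

```python
def is_good(created_at, surviving_original_edges, t, m):
    """Classify a vertex at time t using the two conditions on the slide.

    created_at: time step at which the vertex was added
    surviving_original_edges: how many of its m original edges are undeleted
    """
    return created_at > t / 2 and surviving_original_edges >= m / 6

# A vertex created at step 60 with 2 of its 12 original edges left, seen at t = 100:
print(is_good(60, 2, 100, 12))
```

Because t only grows and deleted edges never return, both conditions can switch from true to false but not back, which matches "good vertex can become bad, bad vertex remains bad".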
27Proof of Theorem 2: Construction
- $H_{n/2} \sim G(\Gamma_{n/2}, p)$
- For $k > n/2$, both $G_k$ and $H_k$ are constructed inductively
- $G_k$ is generated by the preferential attachment model
- $H_k$ is constructed by connecting a new vertex with the vertices that are good in $G_k$
- A difference will only happen in case of failure
28Proof of Theorem 2: Type 0 failure
- If not enough good vertices in $G_k$
- Lemma 2: whp $\gamma_t \ge t/10$
- Prob of occurrence is therefore o(1)
- Generate $G_n$ and $H_n$ independently if this occurs
29Proof of Theorem 2: Type 1 failure
- If not enough good vertices are chosen by $x_{k+1}$ in $G_k$
- r: number of good vertices selected
- Let P[a given vertex is good] $\ge \epsilon_0$
- Failure if $r \le (1-\delta)\,\epsilon_0 m$
- Upper bound
30Proof of Theorem 2: Type 2 failure
- If the number of good vertices chosen by $x_{k+1}$ in $G_k$ is less than the number of random vertices generated in $H_k$
- $X \sim Bi(r, \epsilon_0)$ and $Y \sim Bi(\gamma_k, p)$
- Failure if $Y > X$
- Upper bound on type 2 failure prob: $A e^{-Bm}$
31Proof of Theorem 2: Coupling and deletion
- Take a random subset of size Y of the good chosen vertices in $G_k$ and connect them with the new vertex in $H_k$
- Delete vertices in $H_k$ that are deleted by the adversary in $G_k$
- $H_n \sim G(\Gamma_n, p)$
- Difference can only occur due to failure
32Proof of Theorem 2: Bound on failures
- Prob of failure at each step $\le A e^{-Bm}$
- Total number of misplaced edges added: M
- $E[M] \le A e^{-Bm} n$
33Lemma 1: Statement
- Let G be obtained by deleting fewer than n/100 edges from a realization of $G_{n,c/n}$. If $c \ge 10$ then whp G has a component of size at least n/3
34Proof of Lemma 1
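- If G has no component of size n/3, then $G_{n,c/n}$ contains a set S of size s, $n/3 \le s \le n/2$, with at most n/100 edges joining S to its complement
- P[at most n/100 edges joining S to its complement] is small
- E[number of edges across this cut] $= s(n-s)c/n$
- Pick some $\epsilon$ so that $n/100 \le (1-\epsilon)\,s(n-s)c/n$
[Figure: cut between S (s vertices) and its complement (n - s vertices), crossed by at most n/100 edges]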
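A quick numeric check of the cut argument, as a sketch: it computes the expected number of edges across the cut, s(n-s)c/n, and the largest slack parameter (called `eps` here) for which n/100 stays below (1 - eps) times that mean. The values n = 3000 and c = 10 are assumptions chosen for illustration.

```python
def expected_cut_edges(n, s, c):
    """Expected number of G(n, c/n) edges between a set of size s and the rest."""
    return s * (n - s) * c / n

n, c = 3000, 10
for s in (n // 3, n // 2):
    mean = expected_cut_edges(n, s, c)
    eps = 1 - (n / 100) / mean   # largest eps with n/100 <= (1 - eps) * mean
    print(s, mean, round(eps, 4))
```

For these values the mean is more than 200 times larger than n/100 at both ends of the range of s, so the event "at most n/100 crossing edges" sits far in the lower tail.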
35Proof of Lemma 1
36Proof of Lemma 2: Statement and Notation
- whp $\gamma_t \ge t/10$ for $n/2 < t \le n$
- Let
- $z_t$: number of deleted vertices
- $\gamma_t$: number of vertices in $\Gamma_t$
- It is sufficient to show that
37Proof of Lemma 2: Coupling
- Couple two generative processes
- $P$: adversary deletes vertices at each time step
- $P^{\star}$: no vertices are deleted until t and then the same vertices are deleted as in $P$
- Difference can only occur because of failure
- Upper bound on $z_t(P)$
38Theorem 1: Statement
- For any sufficiently small constant $\delta$ there exists a sufficiently large constant $m = m(\delta)$ and a constant $\theta = \theta(\delta, m)$ such that whp $G_n$ has a giant connected component with size at least $\theta n$
39Proof of Theorem 1
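- Let $G_1 = G_n$ and $G_2 \sim G(\Gamma_n, p)$
- Let $G = G_1 \cap G_2$
- $e(G_2 \setminus G) \le A e^{-Bm} n$ by Theorem 2
- whp $|G| = \gamma_n \ge n/10$ by Lemma 2
- Let m be large so that $p > 10/\gamma_n$
- Proof by Lemma 1: G, and hence $G_n$, has a component of size at least $\gamma_n/3 \ge n/30$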
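The chain of steps above can be written out as one derivation. This is a sketch that writes $\gamma_n$ for the number of good vertices and $\Gamma_n$ for the set of good vertices, and assumes m is large enough that $A e^{-Bm} < 1/1000$, so that the number of missing edges falls below the $\gamma_n/100$ allowed by Lemma 1.

```latex
\begin{align*}
  p > \frac{10}{\gamma_n}
    &\;\Longrightarrow\; G_2 \sim G(\Gamma_n, p) = G_{\gamma_n,\, c/\gamma_n}
       \text{ with } c = p\,\gamma_n > 10, \\
  e(G_2 \setminus G) \le A e^{-Bm} n < \frac{n}{1000} \le \frac{\gamma_n}{100}
    &\;\Longrightarrow\; \text{Lemma 1 applies to } G, \\
  \text{largest component of } G
    &\;\ge\; \frac{\gamma_n}{3} \;\ge\; \frac{n}{30}.
\end{align*}
```

Since G is a subgraph of $G_1 = G_n$, the same component sits inside $G_n$, which gives the n/30 bound from the main result.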