Author: ladamic. Last modified by: ladamic. Created: 11/14/2005 1:35:40 PM. Presentation format: On-screen Show. Company: University of Michigan.


Title: SI 614 Directed


1
SI 614: Directed and weighted networks, minimum
spanning trees, flow
Lecture 12. Instructor: Lada Adamic
2
Outline
  • directed networks
  • prestige
  • weighted networks
  • minimum spanning trees
  • flow

3
Review of centrality in undirected
networks: Comparison
  • Comparing across these 3 centrality values
    (degree, closeness, betweenness)
  • Generally, the 3 centrality types will be
    positively correlated
  • When they are not (or only weakly) correlated, it
    probably tells you something interesting about the
    network.

 
High Degree, Low Closeness: Embedded in cluster that is far from the rest of the network
High Degree, Low Betweenness: Ego's connections are redundant - communication bypasses him/her
High Closeness, Low Degree: Key player tied to important/active alters
High Closeness, Low Betweenness: Probably multiple paths in the network, ego is near many people, but so are many others
High Betweenness, Low Degree: Ego's few ties are crucial for network flow
High Betweenness, Low Closeness: Very rare cell. Would mean that ego monopolizes the ties from a small number of people to many others.
 
slide Jim Moody
4
Centrality in Social Networks Power / Eigenvalue
Bonacich Power Centrality: An actor's centrality
(prestige) is equal to a function of the prestige
of those they are connected to. Thus, actors who
are tied to very central actors should have
higher prestige/centrality than those who are
not.
  • a is a scaling constant, which is set to normalize
    the score.
  • b reflects the extent to which you weight the
    centrality of people ego is tied to.
  • R is the adjacency matrix (can be valued)
  • I is the identity matrix (1s down the diagonal)
  • 1 is a column vector of all ones.

slide Jim Moody
5
Centrality in Social Networks Power / Eigenvalue
Bonacich Power Centrality
The magnitude of b reflects the radius of power.
Small values of b weight local structure, larger
values weight global structure. If b is
positive, then ego has higher centrality when
tied to people who are central. If b is
negative, then ego has higher centrality when
tied to people who are not central. As b
approaches zero, you get degree centrality.
slide Jim Moody
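The formula itself did not survive in this transcript. Assuming the standard Bonacich form c(a, b) = a (I - bR)^-1 R 1, which matches the symbols a, b, R, I and 1 listed on the slide, the scores satisfy c_i = a * sum_j R_ij (1 + b * y_j) with y = c / a. The sketch below iterates that fixed point in plain Python. The function name, the tolerance, and the star-shaped test network are illustrative assumptions, and the iteration only converges when |b| is smaller than 1 over the largest eigenvalue of R.

```python
def bonacich_power(R, b, a=1.0, tol=1e-12, max_iter=10000):
    """Iterate y = R(1 + b*y); the centrality is c = a*y.

    R is a square adjacency matrix given as a list of lists.
    Converges only if |b| < 1 / (largest eigenvalue of R).
    """
    n = len(R)
    y = [0.0] * n
    for _ in range(max_iter):
        new_y = [sum(R[i][j] * (1.0 + b * y[j]) for j in range(n))
                 for i in range(n)]
        if max(abs(p - q) for p, q in zip(new_y, y)) < tol:
            return [a * v for v in new_y]
        y = new_y
    raise ValueError("no convergence: |b| is probably too large for this network")


# Hypothetical 4-node star: node 0 is the hub, nodes 1-3 are leaves.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

degree_like = bonacich_power(star, b=0.0)    # b = 0 reduces to degree
positive_b = bonacich_power(star, b=0.2)     # ties to central actors help
negative_b = bonacich_power(star, b=-0.2)    # ties to central actors hurt
```

With b = 0 the scores are simply the degrees; a positive b raises the leaves' scores because they are tied to the hub, and a negative b lowers them.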
6
Centrality in Social Networks Power / Eigenvalue
Bonacich Power Centrality
b = 0.23
slide Jim Moody
7
Centrality in Social Networks Power / Eigenvalue
Bonacich Power Centrality
b = -0.35
b = 0.35
slide Jim Moody
8
Centrality in Social Networks Power / Eigenvalue
Bonacich Power Centrality
b = 0.23
b = -0.23
slide Jim Moody
9
Examples of directed networks?
  • WWW
  • food webs
  • population dynamics
  • influence
  • hereditary
  • citation
  • transcription regulation networks
  • neural networks

10
Prestige in directed social networks
  • when prestige may be the right word:
      • admiration
      • influence
      • gift-giving
      • trust
  • directionality is especially important in instances
    where ties may not be reciprocated (e.g. a dining
    partner choice network)
  • when prestige may not be the right word:
      • gives advice to (can reverse direction)
      • gives orders to (same)
      • lends money to (same)
      • dislikes
      • distrusts

11
Extensions of undirected degree centrality -
prestige
  • degree centrality
  • indegree centrality
  • a paper that is cited by many others has high
    prestige
  • a person nominated by many others for an award
    has high prestige


12
Extensions of undirected closeness centrality
  • closeness centrality usually implies
      • all paths should lead to you
  • and usually not
      • paths should lead from you to everywhere else
  • usually consider only vertices from which the
    node i in question can be reached


13
Influence range
  • The influence range of i is the set of vertices
    that are reachable from node i
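A reachable set like this can be found with a breadth-first search. The following is a minimal sketch, assuming the network is stored as a dictionary mapping each vertex to the list of vertices it points to; the function names and the small example graph are hypothetical. Running the same search on the transposed graph gives the vertices that can reach i instead.

```python
from collections import deque


def reachable_from(adj, source):
    """Return the set of vertices reachable from `source` along directed arcs."""
    seen = {source}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj.get(v, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    seen.discard(source)  # the vertex itself is not counted
    return seen


def transpose(adj):
    """Reverse the direction of every arc."""
    rev = {v: [] for v in adj}
    for v, neighbours in adj.items():
        for w in neighbours:
            rev.setdefault(w, []).append(v)
    return rev


# Hypothetical example: a -> b -> c, and d -> b.
example = {"a": ["b"], "b": ["c"], "c": [], "d": ["b"]}
forward = reachable_from(example, "a")               # vertices a can reach
backward = reachable_from(transpose(example), "c")   # vertices that can reach c
```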

14
Extending betweenness centrality to directed
networks
  • We now consider the fraction of all directed
    shortest paths between any two vertices that pass
    through a node

C_B(i) = \sum_{j \neq k} \frac{g_{jk}(i)}{g_{jk}}
where C_B(i) is the betweenness of vertex i, g_{jk}(i) is the number of
shortest paths between j and k that pass through i, and g_{jk} is the number
of all shortest paths between j and k.
  • Only modification: when normalizing, we have
    (N-1)(N-2) instead of (N-1)(N-2)/2, because we
    have twice as many ordered pairs as unordered
    pairs
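One way to compute this quantity is Brandes' accumulation scheme, which the slide does not name; it is used here only as a convenient sketch. The code assumes an unweighted directed graph stored as a dictionary in which every vertex appears as a key, and it applies the (N-1)(N-2) normalization described above. The function name and both example graphs are hypothetical.

```python
from collections import deque


def directed_betweenness(adj, normalized=True):
    """Betweenness on directed shortest paths (Brandes-style accumulation)."""
    nodes = list(adj)
    cb = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in nodes}
        sigma[s] = 1
        preds = {v: [] for v in nodes}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest vertices, accumulating dependencies.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                cb[w] += delta[w]
    if normalized:
        n = len(nodes)
        scale = (n - 1) * (n - 2)  # ordered pairs, so no division by 2
        if scale > 0:
            cb = {v: c / scale for v, c in cb.items()}
    return cb


# Hypothetical directed path a -> b -> c: only b lies between other vertices.
path = {"a": ["b"], "b": ["c"], "c": []}
# Hypothetical directed cycle j -> i -> k -> j: i lies on the geodesic
# from j to k, but not on the geodesic from k to j.
cycle = {"j": ["i"], "i": ["k"], "k": ["j"]}
```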

15
Directed geodesics
  • A node does not necessarily lie on a geodesic
    from j to k if it lies on a geodesic from k to j

[Figure: directed graph showing the geodesics between vertices j and k.]
16
Prestige in Pajek
  • Calculating the indegree prestige
  • Net > Partition > Degree > Input
  • to view, select File > Partition > Edit
  • if you need to reverse the direction of each tie
    first (e.g. lends money to -> borrows from):
    Net > Transform > Transpose
  • Influence range (a.k.a. input domain)
  • Net > k-Neighbours > Input
  • enter the number of the vertex, and 0 to consider
    all vertices that eventually lead to your chosen
    vertex
  • to find out the size of the input domain, select
    Info > Partition
  • Calculate the size of the input domains for all
    vertices
  • Net > Partitions > Domain > Input
  • Can also limit to only neighbors within some
    distance

17
Proximity prestige in Pajek
  • Direct nominations (choices) should count more
    than indirect ones
  • Nominations from second degree neighbors should
    count more than third degree ones
  • So consider proximity prestige
  • Cp(ni) = (fraction of all vertices that are in i's
    input domain) / (average distance from i to the
    vertices in its input domain)
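The ratio can be sketched as follows, assuming the input domain of i is the set of vertices from which i can be reached, that distances are counted along arcs pointing toward i, and that the fraction is taken over the other N - 1 vertices. The function name and the example network are hypothetical.

```python
from collections import deque


def proximity_prestige(adj, i):
    """(share of other vertices in i's input domain) / (their mean distance to i)."""
    nodes = set(adj) | {w for neighbours in adj.values() for w in neighbours}
    # Reverse every arc so a search from i walks back along incoming ties.
    rev = {v: [] for v in nodes}
    for v, neighbours in adj.items():
        for w in neighbours:
            rev[w].append(v)
    dist = {i: 0}
    queue = deque([i])
    while queue:
        v = queue.popleft()
        for u in rev[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    domain = [v for v in dist if v != i]
    if not domain:
        return 0.0  # nobody nominates i, directly or indirectly
    fraction = len(domain) / (len(nodes) - 1)
    average_distance = sum(dist[v] for v in domain) / len(domain)
    return fraction / average_distance


# Hypothetical nominations: a -> c, b -> c, d -> b.
nominations = {"a": ["c"], "b": ["c"], "d": ["b"], "c": []}
prestige_c = proximity_prestige(nominations, "c")
```

Here c is reached by all three other vertices at distances 1, 1 and 2, so its score is 1 / (4/3) = 0.75, while a vertex that nobody points to scores 0.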
18
Weighted networks
  • Examples
  • email communication
  • sports matches
  • packet transfer
  • population movement
  • co-authorship
  • food webs
  • Weighted treatment of data/algorithms usually
    left for future work

19
But what are weights good for?
  • Defining thresholds
  • Shortest paths that don't take long
  • Flow/capacity of a network

20
Food webs
  • Food webs
  • usually considered as binary networks
  • problems in defining threshold fluxes
  • do killer whales who eat bears count?
  • weights
  • interaction frequency
  • acts of predation per hectare per day
  • carbon flow (prey to predator)
  • grams of Carbon per meter squared per year
  • interaction strength (predator on prey)
  • (carbon flow of prey to predator)/ (biomass of
    predator)

Lake carbon flow
21
Co-authorship networks
  • The weight assigned to each edge is the sum, over
    all papers that two people co-authored, of one
    divided by the total number of people on that
    paper
  • a large-scale high energy physics collaboration
    producing a paper with 100 authors is less
    evidence of direct collaboration than an article
    in Social Networks with only two co-authors.
  • Should we normalize?
  • all weights from i to other nodes should sum to
    1? (probably not)

w_{ij} = \sum_{k \in P_{ij}} \frac{1}{n_k}
where P_{ij} is the set of all papers where i and j were coauthors and
n_k is the number of authors of paper k.
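As a small illustration of this weighting, the sketch below assumes each paper is given as a list of author names and adds 1/n_k to every co-author pair on paper k. The function name and the two example papers are made up for the illustration; exact fractions are used so the sums are easy to check by hand.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations


def coauthor_weights(papers):
    """Map each unordered author pair to the sum of 1/n_k over shared papers."""
    weights = defaultdict(Fraction)
    for authors in papers:
        unique = sorted(set(authors))
        n_k = len(unique)
        for i, j in combinations(unique, 2):
            weights[(i, j)] += Fraction(1, n_k)
    return dict(weights)


# Hypothetical data: one two-author paper and one three-author paper.
papers = [["i", "j"], ["i", "j", "k"]]
weights = coauthor_weights(papers)
# The pair (i, j) gets 1/2 + 1/3 = 5/6; pairs involving k get 1/3.
```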
22
Symmetry in normalization
  • If normalizing by the sum of values for each node

Assume simple weighting: number of papers co-authored.
[Figure: i and j joined by an edge of weight 3; j has additional weighted edges to other co-authors.]
w_ij = 3/3 = 1, w_ji = 3/15 = 1/5
  • Cosine similarity: symmetric values
  • assume the weight for each paper is w_k = 1/(n_k - 1)
  • i and j each have vectors of 0s and w's,
    depending on whether they authored paper k
  • normalize by the length of both vectors
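A minimal sketch of this cosine weighting, assuming papers are given as lists of author names: each author gets a vector with entry w_k = 1/(n_k - 1) for papers they wrote and 0 otherwise, and the dot product is divided by both vector lengths. The function name and the three example papers are hypothetical.

```python
import math


def cosine_coauthor_similarity(papers, i, j):
    """Cosine of the angle between the paper-weight vectors of authors i and j."""
    vec_i, vec_j = [], []
    for authors in papers:
        n_k = len(set(authors))
        w_k = 1.0 / (n_k - 1) if n_k > 1 else 0.0
        vec_i.append(w_k if i in authors else 0.0)
        vec_j.append(w_k if j in authors else 0.0)
    dot = sum(x * y for x, y in zip(vec_i, vec_j))
    norm = (math.sqrt(sum(x * x for x in vec_i))
            * math.sqrt(sum(y * y for y in vec_j)))
    return dot / norm if norm else 0.0


# Hypothetical papers: weights are 1, 1/2 and 1 respectively.
papers = [["i", "j"], ["i", "j", "k"], ["j", "k"]]
sim_ij = cosine_coauthor_similarity(papers, "i", "j")
sim_ji = cosine_coauthor_similarity(papers, "j", "i")
```

Unlike normalizing by each node's own total, this value is the same in both directions.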

23
Other similarity Measures
  • Simple matching
  • Dice's Coefficient
  • Jaccard's Coefficient
  • Cosine Coefficient
  • Overlap Coefficient

[Figure: bipartite graph linking authors a1, a2, a3 to papers p1-p11.]
Q = set of papers authored by a1; D = set of papers authored by a2
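The formulas for these measures are missing from the transcript. The sketch below uses the usual set-based definitions, which are an assumption about what the slide showed: simple matching |Q ∩ D|, Dice 2|Q ∩ D| / (|Q| + |D|), Jaccard |Q ∩ D| / |Q ∪ D|, cosine |Q ∩ D| / sqrt(|Q| |D|), and overlap |Q ∩ D| / min(|Q|, |D|). The paper sets in the example are invented and do not reproduce the figure.

```python
import math


def similarity_measures(Q, D):
    """Set-based similarity between two non-empty sets of papers."""
    Q, D = set(Q), set(D)
    common = len(Q & D)
    return {
        "simple_matching": common,
        "dice": 2 * common / (len(Q) + len(D)),
        "jaccard": common / len(Q | D),
        "cosine": common / math.sqrt(len(Q) * len(D)),
        "overlap": common / min(len(Q), len(D)),
    }


# Hypothetical paper sets for two authors sharing two papers.
papers_a1 = {"p1", "p2", "p3"}
papers_a2 = {"p2", "p3", "p4", "p5"}
scores = similarity_measures(papers_a1, papers_a2)
```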
24
Weighted shortest paths
  • Routes
  • shortest route from Chicago to Boston
  • vertex = intersection
  • edge weights = road distances
  • alternative weights: expected time traveled, gas
    consumed
  • usually sum the weights from each segment

[Figure: alternative routes from start to finish.]
surface road, 25 mph: 50 miles / 25 mph = 2 hours
freeway, 70 mph: 30 miles / 70 mph = 26 minutes
freeway, 65 mph: 40 miles / 65 mph = 37 minutes
25
Reliable paths through social networks
  • The probability of transmitting a message or
    infectious agent could be related to the strength
    of the tie
  • e.g. rather than summing the weights, we might
    multiply the probabilities of getting through

[Figure: organizational network with tie probabilities p = 1, p = 0.001, p = 0.05, p = 0.5, p = 0.5.]
Probability of getting an idea through to the
head of labs: via CEO, 0.001 × 1 = 0.001; via
direct manager, 0.5 × 0.5 = 0.25
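Multiplying probabilities can be turned back into a shortest-path problem by summing -log(p) along each path, so that the path with the smallest sum is the most reliable one. The sketch below does this with a priority queue; the function name and the vertex labels are assumptions, and the edge list covers only the two routes described above (the p = 0.05 tie from the figure is left out because its endpoints are not recoverable).

```python
import heapq
import math


def most_reliable_path(edges, source, target):
    """edges maps (u, v) to a transmission probability; returns (probability, path)."""
    adj = {}
    for (u, v), p in edges.items():
        if p > 0:
            adj.setdefault(u, []).append((v, -math.log(p)))  # product -> sum
    best = {source: 0.0}
    prev = {}
    done = set()
    heap = [(0.0, source)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, w in adj.get(u, []):
            new_cost = cost + w
            if new_cost < best.get(v, math.inf):
                best[v] = new_cost
                prev[v] = u
                heapq.heappush(heap, (new_cost, v))
    if target not in best:
        return 0.0, []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return math.exp(-best[target]), path


# Hypothetical labels for the two routes on the slide.
ties = {
    ("you", "CEO"): 0.001,
    ("CEO", "head of labs"): 1.0,
    ("you", "manager"): 0.5,
    ("manager", "head of labs"): 0.5,
}
probability, route = most_reliable_path(ties, "you", "head of labs")
```

On this input the route through the direct manager wins, with probability 0.25 against 0.001 through the CEO.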
26
Shortest Path Problem
  • Given a weighted graph and two vertices u and v,
    we want to find a path of minimum total weight
    between u and v.
  • Length of a path is the sum of the weights of its
    edges.
  • Example
  • Shortest path between Providence and Honolulu
  • Applications
  • Internet packet routing
  • Flight reservations
  • Driving directions

[Figure: weighted graph of airports PVD, ORD, SFO, LGA, HNL, LAX, DFW and MIA, with a weight on each edge.]
slide by Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/
27
Negative weights
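  • Shortest paths are usually undefined in graphs
    with negative edge weights if negative cycles are
    present (each trip around such a cycle lowers the
    total weight further)

[Figure: graph with edge weights 2, 4, 3 and -3.]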
28
Shortest Path Properties
  • Property 1
  • A subpath of a shortest path is itself a
    shortest path
  • Property 2
  • There is a tree of shortest paths from a start
    vertex to all the other vertices
  • Example
  • Tree of shortest paths from Providence

[Figure: the same airport graph (PVD, ORD, SFO, LGA, HNL, LAX, DFW, MIA) with the tree of shortest paths from PVD.]
slide by Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/
29
Dijkstra's Algorithm
  • The distance of a vertex v from a vertex s is the
    length of a shortest path between s and v
  • Dijkstra's algorithm computes the distances of
    all the vertices from a given start vertex s
  • Assumptions
  • the graph is connected
  • the edges are undirected
  • the edge weights are nonnegative
  • We grow a cloud of vertices, beginning with s
    and eventually covering all the vertices
  • We store with each vertex v a label d(v)
    representing the distance of v from s in the
    subgraph consisting of the cloud and its adjacent
    vertices
  • At each step
  • We add to the cloud the vertex u outside the
    cloud with the smallest distance label, d(u)
  • We update the labels of the vertices adjacent to
    u

slide by Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/
30
Edge Relaxation
  • Consider an edge e = (u,z) such that
  • u is the vertex most recently added to the cloud
  • z is not in the cloud
  • The relaxation of edge e updates distance d(z) as
    follows
  • d(z) ← min{d(z), d(u) + weight(e)}

[Figure, before relaxation: d(u) = 50, d(z) = 75, weight(e) = 10.]
[Figure, after relaxation: d(u) = 50, d(z) = 60, weight(e) = 10.]

slide by Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/
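The cloud-growing procedure and the relaxation rule can be sketched with a priority queue as follows. This is a minimal illustration rather than the slides' own code: it assumes an undirected graph stored as a dictionary of dictionaries with nonnegative weights, and the three-vertex example simply reuses the numbers from the relaxation figure (d(u) = 50, d(z) = 75, weight(e) = 10).

```python
import heapq


def dijkstra(graph, s):
    """Return d(v), the shortest distance from s, for every vertex reachable from s."""
    d = {s: 0}
    cloud = set()
    heap = [(0, s)]
    while heap:
        dist_u, u = heapq.heappop(heap)
        if u in cloud:
            continue  # stale entry: u already joined the cloud with a smaller label
        cloud.add(u)  # u is the outside vertex with the smallest label
        for z, weight in graph[u].items():
            if z in cloud:
                continue
            # Edge relaxation: d(z) <- min{d(z), d(u) + weight(e)}
            if dist_u + weight < d.get(z, float("inf")):
                d[z] = dist_u + weight
                heapq.heappush(heap, (d[z], z))
    return d


# Undirected edges listed in both directions: s-u (50), s-z (75), u-z (10).
graph = {
    "s": {"u": 50, "z": 75},
    "u": {"s": 50, "z": 10},
    "z": {"s": 75, "u": 10},
}
distances = dijkstra(graph, "s")
```

When u joins the cloud, relaxing the edge (u, z) lowers d(z) from 75 to 60, as in the figure.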
31
Example
[Figure: snapshots of Dijkstra's algorithm on a graph with vertices A-F, starting from A; each vertex is labeled with its current distance d(v).]
slide by Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/
32
Example (cont.)
[Figure: final snapshots of Dijkstra's algorithm on the same graph with vertices A-F.]
slide by Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/
33
Minimum spanning trees
  • Connect all vertices with a single tree
  • Consider a communications company, such as AT&T
    or GTE, that needs to build a communication
    network that connects n different users. The
    cost of making a link joining i and j is c_ij.
    What is the minimum cost of connecting all of the
    users?

Common assumption: the only links possible are
the ones directly joining two nodes.
web.mit.edu/jorlin/www/15.082/Lectures/16_Spanning_Trees.ppt
34
Electronic Circuitry
  • Consider a system with a number of electronic
    components. In order to make two pins i and j of
    different components electrically equivalent, one
    can connect i and j by a wire. How can we
    connect n different pins in this way to make them
    electrically equivalent to each other so as to
    minimize the total wire length?

web.mit.edu/jorlin/www/15.082/Lectures/16_Spanning_Trees.ppt
35
Minimum Cost Spanning Tree Problem
  • Undirected network G = (N, A).
  • (i, j) is the same arc as (j, i).
  • We associate with each arc (i, j) ∈ A a cost
    c_ij.
  • A spanning tree T of G is a connected acyclic
    subgraph that spans all the nodes. A connected
    graph with n nodes and n - 1 arcs is a spanning
    tree.
  • The minimum cost spanning tree problem is to find
    a spanning tree of minimum cost.

web.mit.edu/jorlin/www/15.082/Lectures/16_Spanning_Trees.ppt
36
A Minimum Cost Spanning Tree Problem
[Figure: undirected network with a cost on each arc.]
web.mit.edu/jorlin/www/15.082/Lectures/16_Spanning_Trees.ppt
37
A Minimum Cost Spanning Tree
[Figure: the same network with a minimum cost spanning tree highlighted.]
web.mit.edu/jorlin/www/15.082/Lectures/16_Spanning_Trees.ppt
38
Prim-Jarnik Algorithm
  • Vertex-based algorithm
  • Grows one tree T, one vertex at a time
  • A cloud covers the portion of T already
    computed
  • Label the vertices v outside the cloud with
    key[v], the minimum weight of an edge connecting
    v to a vertex in the cloud; key[v] = ∞ if no
    such edge exists

www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
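The cloud-growing idea can be sketched as below. This is a "lazy" variant, chosen for brevity: instead of maintaining key[v] labels with decrease-key operations, it keeps candidate edges in a heap and discards those whose far end is already in the cloud. The graph format (dictionary of dictionaries, undirected, connected) and the example weights are assumptions.

```python
import heapq


def prim_mst(graph, start):
    """Return the tree edges (u, v, weight) in the order they are added."""
    cloud = {start}
    tree = []
    heap = [(w, start, v) for v, w in graph[start].items()]
    heapq.heapify(heap)
    while heap and len(cloud) < len(graph):
        w, u, v = heapq.heappop(heap)
        if v in cloud:
            continue  # both endpoints already in the cloud
        cloud.add(v)  # cheapest edge leaving the cloud pulls v in
        tree.append((u, v, w))
        for x, wx in graph[v].items():
            if x not in cloud:
                heapq.heappush(heap, (wx, v, x))
    return tree


# Hypothetical network: a-b (1), b-c (2), a-c (3), c-d (4).
graph = {
    "a": {"b": 1, "c": 3},
    "b": {"a": 1, "c": 2},
    "c": {"a": 3, "b": 2, "d": 4},
    "d": {"c": 4},
}
tree = prim_mst(graph, "a")
total_weight = sum(w for _, _, w in tree)
```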
39
Prim Example
www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
40
Prim Example (2)
www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
41
Prim Example (3)
www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
42
Kruskal's Algorithm
  • The algorithm adds the cheapest edge that
    connects two trees of the forest

MST-Kruskal(G,w)
  A ← ∅
  for each vertex v ∈ V[G] do
    Make-Set(v)
  sort the edges of E by non-decreasing weight w
  for each edge (u,v) ∈ E, in order by non-decreasing weight do
    if Find-Set(u) ≠ Find-Set(v) then
      A ← A ∪ {(u,v)}
      Union(u,v)
  return A
www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
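The pseudocode above translates fairly directly into Python. In this sketch the disjoint sets are a parent dictionary with path compression, edges are (weight, u, v) tuples so that sorting orders them by weight, and the example network is hypothetical.

```python
def kruskal_mst(vertices, edges):
    """edges is an iterable of (weight, u, v); returns tree edges as (u, v, weight)."""
    parent = {v: v for v in vertices}  # Make-Set for every vertex

    def find_set(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path compression
            v = parent[v]
        return v

    tree = []
    for w, u, v in sorted(edges):  # non-decreasing weight
        root_u, root_v = find_set(u), find_set(v)
        if root_u != root_v:       # edge joins two different trees
            tree.append((u, v, w))
            parent[root_u] = root_v  # Union
    return tree


# Hypothetical network: a-b (1), b-c (2), a-c (3), c-d (4).
vertices = ["a", "b", "c", "d"]
edges = [(3, "a", "c"), (1, "a", "b"), (4, "c", "d"), (2, "b", "c")]
tree = kruskal_mst(vertices, edges)
total_weight = sum(w for _, _, w in tree)
```

The edge a-c is skipped because a and c are already in the same tree by the time it is considered.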
43
Kruskal Example
www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
44
Kruskal Example (2)
www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
45
Kruskal Example (3)
www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
46
Kruskal Example (4)
www.cs.earlham.edu/celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
47
Network flow
  • Applications
  • traffic transportation
  • maximum number of cars that can commute from
    Berkeley to San Francisco during rush hour
  • fluid networks: pipes that carry liquids
  • computer networks: packets traveling along fiber
  • extended applications (from Kleinberg & Tardos,
    Algorithm Design)
  • bipartite matching problem
  • number of disjoint paths between two vertices
  • survey design
  • airline scheduling
  • image segmentation
  • baseball elimination

48
Max flow problem: how much stuff can we get from
source to sink per unit time?
[Figure: flow network from Source to Sink; each edge is labeled with its capacity (e.g. 7).]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
49
Equivalent tasks
  • Find a cut with minimum capacity
  • Find maximum flow from source to sink

www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
50
A Flow
[Figure: a flow on the network, shown next to its residual graph.]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
51
Augmenting Paths
  • A path from source to sink in the residual graph
    of a given flow
  • If there is an augmenting path in the residual
    graph, we can push more flow

www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
52
Ford-Fulkerson Method
  • initialize total flow to 0
  • residual graph G' = G
  • while an augmenting path exists in G'
      • pick an augmenting path P in G'
      • m = bottleneck capacity of P
      • add m to total flow
      • push flow of m along P
      • update G'

www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
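The method above leaves open how the augmenting path is picked. The sketch below picks a shortest one with breadth-first search (the Edmonds-Karp choice, which is one instance of the method and not something the slide prescribes). It assumes capacities are given as a dictionary of dictionaries; it also returns the vertices still reachable from the source in the final residual graph, which form the source side of a minimum cut. Names and the example network are hypothetical.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Return (total flow, source side of a minimum cut)."""
    # Residual graph G' starts as a copy of G plus zero-capacity reverse arcs.
    residual = {source: {}, sink: {}}
    for u, neighbours in capacity.items():
        for v, c in neighbours.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)
    total = 0
    while True:
        # Breadth-first search for an augmenting path P in G'.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total, set(parent)  # no augmenting path is left
        # m = bottleneck capacity of P
        m = float("inf")
        v = sink
        while parent[v] is not None:
            m = min(m, residual[parent[v]][v])
            v = parent[v]
        # Push m along P and update G'.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= m
            residual[v][u] += m
            v = u
        total += m


# Hypothetical network: s->a (3), s->b (2), a->b (1), a->t (2), b->t (1).
capacity = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 1}}
flow, source_side = max_flow(capacity, "s", "t")
cut_capacity = sum(c for u, nbrs in capacity.items() for v, c in nbrs.items()
                   if u in source_side and v not in source_side)
```

On this input the maximum flow and the capacity of the cut found are both 3, which illustrates the equivalence of the two tasks.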
53
Example
[Figure: flow network with edge capacities (Ford-Fulkerson example).]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
54
Example
[Figure: flow network with edge capacities (Ford-Fulkerson example, continued).]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
55
Example
[Figure: flow network with edge capacities (Ford-Fulkerson example, continued).]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
56
Example
[Figure: flow network with edge capacities (Ford-Fulkerson example, continued).]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
57
Example
[Figure: flow network with edge capacities (Ford-Fulkerson example, continued).]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
58
Example
[Figure: flow network with edge capacities (Ford-Fulkerson example, continued).]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
59
Example
[Figure: flow network with edge capacities (Ford-Fulkerson example, continued).]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
60
Answer: Max Flow = 4
[Figure: the final flow on each edge of the network.]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
61
Answer: Minimum Cut = 4
[Figure: the network with its edge capacities and the minimum cut.]
www.comp.nus.edu.sg/ooiwt/slides/2004-cs3233-graph2.ppt
62
project status report
  • worth 5% of your grade, meant to keep you on
    track
  • 2-3 weeks later: in-class presentation
  • 1 month later: final project report due
  • what it should do
  • include part of your project proposal as intro
  • include result summaries (including figures and
    tables).
  • be 4-6 pages
  • include references to and briefly (paragraph or
    2) discuss some related work.
  • include a plan of remaining work.
  • It is graded on a 0-5 scale
  • 5 - same as 4, but very complete and already
    shows interesting new insights
  • 4 - data, more than basic analysis (e.g. looked
    at robustness, community structure, centrality,
    etc. if applicable)
  • 3 - some data, preliminary analysis (imported
    data into Pajek or GUESS, counted things up,
    visualized, if possible)
  • 2 - some data, no results
  • 1 - attempts made to get project started, but
    nothing worked out (no data, no results)
  • 0 - no work done

63
GUESS installation
  • Windows
  • unzip the files into a folder
  • edit the guess.bat (a batch executable file) so
    that
  • @rem set GUESS_HOME=c:\program files\GUESS
    becomes @set GUESS_HOME=C:\PROGRA~1\GUESS
    if you installed into c:\Program Files\GUESS
  • else you can try installing into a directory with
    no spaces in the name and have (e.g.) @set
    GUESS_HOME=C:\apps\GUESS