Title: Networked Life and Social Networks
1Networked Life and Social Networks
- Thanks to Michael Kearns, James Moody, Anna
Nagurney
2Networked Life
- Physical, social, biological, etc
- Hybrids
- Static vs dynamic
- Local vs global
- Measurable and reproducible
3Internet, Router Level
- A purely technological network?
- Points are physical machines
- Links are physical wires
- Interaction is electronic
- What more is there to say?
4North American Power Grid
- Points: power stations
- Operated by companies
- Connections embody business relationships
- Food for thought: 2003 Northeast blackout
5Gnutella Peers
- Points are still machines but are associated with people
- Links are still physical but may depend on preferences
- Interaction: content exchange
- Food for thought: free riding
6Foreign Exchange
- Points: sovereign nations
- Links: exchange volume
- A purely virtual network
7The Human Brain
- Purely biological network
- Links are physical
- Interaction is electrical
- Food for thought: do neurons cooperate or compete?
8The Premise of Networked Life
- It makes sense to study these diverse networks together.
- The Commonalities
- Formation (distributed, bottom-up, organic, ...)
- Structure (individuals, groups, overall connectivity, robustness)
- Decentralization (control, administration, protection, ...)
- Strategic Behavior (economic, free riding, Tragedies of the Commons)
- An Emerging Science
- Examining apparent similarities between many human and technological systems and organizations
- Importance of network effects in such systems
- How things are connected matters greatly
- Details of interaction matter greatly
- The metaphor of viral spread
- Dynamics of economic and strategic interaction
- Qualitative and quantitative; can be very subtle
- A revolution of measurement, theory, and breadth of vision
9Who's Doing All This?
- Computer and Information Scientists
- Understand and design complex, distributed networks
- View competitive decentralized systems as economies
- Social Scientists, Behavioral Psychologists, Economists
- Understand human behavior in simple settings
- Revised views of economic rationality in humans
- Theories and measurement of social networks
- Physicists and Mathematicians
- Interest and methods in complex systems
- Theories of macroscopic behavior (phase transitions)
- All parties are interacting and collaborating
10Examples
- Theories
- Apps in all areas
11The Networked Nature of Society
- Networks as a collection of pairwise relations
- Examples of (un)familiar and important networks
- social networks
- content networks
- technological networks
- biological networks
- economic networks
- The distinction between structure and dynamics
A network-centric overview of modern society.
12Contagion, Tipping and Networks
- Epidemic as metaphor
- The three laws of Gladwell
- Law of the Few (connectors in a network)
- Stickiness (power of the message)
- Power of Context
- The importance of psychology
- Perceptions of others
- Interdependence and tipping
- Paul Revere, Sesame Street, Broken Windows, the
Appeal of Smoking, and Suicide Epidemics
13Graph Network Theory
- Networks of vertices and edges
- Graph properties
- cliques, independent sets, connected components, cuts, spanning trees, ...
- social interpretations and significance
- Special graphs
- bipartite, planar, weighted, directed, regular, ...
- Computational issues at a high level
14Social Network Theory
- Metrics of social importance in a network
- degree, closeness, betweenness, clustering
- Local and long-distance connections
- SNT universals
- small diameter
- clustering
- heavy-tailed distributions
- Models of network formation
- random graph models
- preferential attachment
- affiliation networks
- Examples from society, technology and fantasy
15The Web as a Network
- Empirical web structure and components
- Web and blog communities
- Web search
- hubs and authorities
- the PageRank algorithm
- The Main Streets and dark alleys of the web
The algorithmic and social implications of
network structure.
16Towards Rationality: Emergence of Global from Local
- Beyond the dynamics of transmission
- Context, motivation and influence
- The madness/wisdom of crowds
- thresholds and cascades
- mathematical models of tipping
- the market for lemons
- private preferences and global segregation
17Interdependent Security and Networks
- Security investment and Tragedies of the Commons
- Catastrophic events: you can only die once
- Fire detectors, airline security, Arthur Andersen, ...
Blending network, behavior and dynamics.
18Network Economics
- Buying and selling on a network
- Modeling constraints on trading partners
- Local imbalances of supply and demand
- Preferential attachment, price variation, and the
distribution of wealth
The effects of network structure on economic
outcomes.
19Modern Financial Markets
- Stock market networks
- correlation of returns
- Market microstructure
- limit and market orders
- order books and electronic crossing networks
- network, connectivity and data issues
- Quantitative trading
- VWAP trading, market making
- limit order power laws
- Herd behavior in trading
- Economic theory and financial markets
- Behavioral economics and finance
- Impacts of the Internet on financial markets
A study of the network that runs the world.
20Definition of Social Networks
- A social network is a set of actors that may
have relationships with one another. Networks can
have few or many actors (nodes), and one or more
kinds of relations (edges) between pairs of
actors. (Hanneman, 2001)
21History (based on Freeman, 2000)
- 17th century: Spinoza developed first model
- 1937: J.L. Moreno introduced sociometry; he also invented the sociogram
- 1948: A. Bavelas founded the group networks laboratory at MIT; he also specified centrality
22History (based on Freeman, 2000)
- 1949: A. Rapoport developed a probability-based model of information flow
- 50s and 60s: Distinct research by individual researchers
- 70s: Field of social network analysis emerged.
- New features in graph theory: more general structural models
- Better computer power: analysis of complex relational data sets
23Foundations Theory
Structural Analysis: from method and metaphor to theory and substance.
H. White: "The presently existing, largely categorical descriptions of social structure have no solid theoretical grounding; furthermore, network concepts may provide the only way to construct a theory of social structure." (p.25)
Integration of large-scale social systems
Form vs. Content
24Introduction
- Social network analysis is
- a set of relational methods for systematically understanding and identifying connections among actors.
- SNA
- is motivated by a structural intuition based on ties linking social actors
- is grounded in systematic empirical data
- draws heavily on graphic imagery
- relies on the use of mathematical and/or computational models.
- Social Network Analysis embodies a range of theories relating types of observable social spaces and their relation to individual and group behavior.
25Introduction
What are social relations?
A social relation is anything that links two actors. Examples include: Kinship, Co-membership, Friendship, Talking with, Love, Hate, Exchange, Trust, Coauthorship, Fighting
26Introduction
What properties of relations are studied?
The substantive topics cross all areas of sociology. But we can identify types of questions that social network researchers ask:
1) Social network analysts often study relations as systems. That is, what is of interest is how the pattern of relations among actors affects individual behavior or system properties.
27Introduction
High Schools as Networks
30Introduction
Why do Networks Matter?
Local vision
32Representation of Social Networks
[Figure: sociogram of four actors: Ann, Sue, Nick, Rob]
33Graphs - Sociograms (based on Hanneman, 2001)
- Labeled circles represent actors
- Line segments represent ties
- Graph may represent one or more types of relations
- Each tie can be directed or show co-occurrence
- Arrows represent directed ties
34Graphs Sociograms (based on Hanneman, 2001)
- Strength of ties
- Nominal
- Signed
- Ordinal
- Valued
35Visualization Software: KrackPlot
36Connections
- Size
- Number of nodes
- Density
- Number of ties that are present relative to the number of ties that could be present
- Out-degree
- Sum of connections from an actor to others
- In-degree
- Sum of connections to an actor
- Diameter
- The greatest geodesic (least) distance between any actor and another
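A minimal Python sketch of the measures on this slide, assuming a small directed network stored as a dictionary of outgoing ties. The actor names are borrowed from the sociogram slide; the ties themselves and every function name are hypothetical, chosen only for illustration.

```python
from collections import deque

# Hypothetical directed network: each actor maps to the actors it sends ties to.
ties = {
    "Ann": ["Sue", "Nick"],
    "Sue": ["Ann"],
    "Nick": ["Rob"],
    "Rob": ["Ann"],
}

def size(g):
    """Number of nodes."""
    return len(g)

def density(g):
    """Ties present divided by ties possible (n * (n - 1) for directed ties)."""
    n = len(g)
    present = sum(len(out) for out in g.values())
    return present / (n * (n - 1))

def out_degree(g, actor):
    """Number of ties the actor sends."""
    return len(g[actor])

def in_degree(g, actor):
    """Number of ties the actor receives."""
    return sum(actor in out for out in g.values())

def geodesics_from(g, source):
    """Geodesic (shortest-path) distance from source to every reachable actor."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in g[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def diameter(g):
    """Largest geodesic distance over all reachable ordered pairs."""
    return max(max(geodesics_from(g, a).values()) for a in g)

print(size(ties), density(ties), out_degree(ties, "Ann"),
      in_degree(ties, "Ann"), diameter(ties))
```

Note that this `diameter` silently ignores pairs with no connecting path; in a disconnected network one would have to decide how to treat them.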
37Some Measures of Distance
- Walk (path)
- A sequence of actors and relations that begins and ends with actors
- Geodesic distance (shortest path)
- The number of relations in the shortest possible walk from one actor to another
- Maximum flow
- The number of different actors in the neighborhood of a source that lead to pathways to a target
38Some Measures of Power (based on Hanneman, 2001)
- Degree
- Sum of connections from or to an actor
- Closeness centrality
- Distance of one actor to all others in the network
- Betweenness centrality
- Number that represents how frequently an actor lies on the geodesic paths between other actors
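The three measures can be sketched in Python as follows. This is an illustration under stated assumptions, not the deck's own method: the undirected five-actor graph is hypothetical, closeness is taken as (n - 1) divided by the sum of geodesic distances, and betweenness counts, for each pair of other actors, the share of their geodesics that pass through the actor.

```python
from collections import deque

# Hypothetical undirected network with ties a-b, b-c, c-d, c-e, d-e.
graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d", "e"],
    "d": ["c", "e"],
    "e": ["c", "d"],
}

def degree(g, v):
    """Number of ties attached to v."""
    return len(g[v])

def bfs_counts(g, s):
    """Return (dist, sigma): geodesic distance from s and number of geodesics."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in g[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma

def closeness(g, v):
    """(n - 1) / total geodesic distance from v; assumes a connected graph."""
    dist, _ = bfs_counts(g, v)
    return (len(g) - 1) / sum(dist.values())

def betweenness(g, v):
    """Sum over unordered pairs (s, t) of the share of s-t geodesics through v."""
    info = {s: bfs_counts(g, s) for s in g}
    dist_v, sigma_v = info[v]
    others = [u for u in g if u != v]
    total = 0.0
    for i, s in enumerate(others):
        dist_s, sigma_s = info[s]
        for t in others[i + 1:]:
            if t not in dist_s or v not in dist_s:
                continue
            if dist_s[v] + dist_v.get(t, float("inf")) == dist_s[t]:
                total += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return total

for actor in graph:
    print(actor, degree(graph, actor), closeness(graph, actor),
          betweenness(graph, actor))
```

In this toy graph actor c comes out highest on all three measures, which matches the visual intuition that c bridges the pendant chain a-b and the d-e pair.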
39Cliques and Social Roles (based on Hanneman, 2001)
- Cliques
- Sub-set of actors
- More closely tied to each other than to actors who are not part of the sub-set
- Social roles
- Defined by regularities in the patterns of relations among actors
40SNA applications
- Many new unexpected applications plus many of the old ones
- Marketing
- Advertising
- Economic models and trends
- Political issues
- Organization
- Services to social network actors
- Travel guides
- Jobs
- Advice
- Human capital analysis and predictions
- Medical
- Epidemiology
- Defense (terrorist networks)
41Examples of Applications (based on Freeman, 2000)
- Visualizing networks
- Studying differences of cultures and how they can be changed
- Intra- and interorganizational studies
- Spread of illness, especially HIV
42Foundations Data
The units of interest in a network are the combined sets of actors and their relations. We represent actors with points and relations with lines.
Actors are referred to variously as: nodes, vertices, actors or points
Relations are referred to variously as: edges, arcs, lines, ties
Example
[Figure: example graph on five nodes a, b, c, d, e]
43Foundations Data
- Social network data consist of two linked classes of data
- a) Nodes: information on the individuals (actors, nodes, points, vertices)
- Network nodes are most often people, but can be any other unit capable of being linked to another (schools, countries, organizations, personalities, etc.)
- The information about nodes is what we usually collect in standard social science research: demographics, attitudes, behaviors, etc.
- Often includes dynamic information about when the node is active
- b) Edges: information on the relations among individuals (lines, edges, arcs)
- Records a connection between the nodes in the network
- Can be valued or binary, directed (arcs) or undirected (edges)
- One-mode (direct ties between actors) or two-mode (actors share membership in an organization)
- Includes the times when the relation is active
- Graph theory notation: G(V,E)
44Foundations Data
In general, a relation can be (1) Binary or
Valued (2) Directed or Undirected
The social process of interest will often
determine what form your data take. Almost all
of the techniques and measures we describe can be
generalized across data format.
45Foundations Data and social science
Global-Net
46Foundations Data
We can examine networks across multiple levels:
1) Ego-network
- Have data on a respondent (ego) and the people they are connected to (alters)
- May include estimates of connections among alters
- Example: terrorist networks
2) Partial network
- Ego networks plus some amount of tracing to reach contacts of contacts
- Something less than full account of connections among all pairs of actors in the relevant population
- Example: CDC contact tracing data
47Foundations Data
We can examine networks across multiple levels:
3) Complete or Global data
- Data on all actors within a particular (relevant) boundary
- Never exactly complete (due to missing data), but boundaries are set
- Example: coauthorship data among all writers in the social sciences, friendships among all students in a classroom
48Foundations Graphs
Working with pictures. There is no standard way to draw a sociogram: which of these are equal?
49Foundations Graphs
Network visualization helps build intuition, but you have to keep the drawing algorithm in mind.
Tree-based layouts: Most effective for very sparse, regular graphs. Very useful when relations are strongly directed, such as organization charts or internet connections.
Spring-embedder layouts: Most effective with graphs that have a strong community structure (clustering, etc). Provides a very clear correspondence between social distance and plotted distance.
[Figure: two images of the same network]
51Foundations Graphs
Network visualization helps build intuition, but you have to keep the drawing algorithm in mind.
Hierarchy / tree models: Use optimization routines to add meaning to the Y-axis of the plot. This makes it possible to easily see who is most central because of who is on the top of the figure. Usually includes some routine for minimizing line-crossing.
Spring-embedder layouts: Work on an analogy to a physical system: nodes connected by a tie have springs that pull them together, while unconnected nodes have springs that push them apart. The resulting image reflects the balance of these two features. This usually creates a correspondence between physical closeness and network distance.
52Foundations Graphs
53Foundations Graphs
Using colors to code attributes makes it simpler
to compare attributes to relations. Here we can
assess the effectiveness of two different
clustering routines on a school friendship
network.
54Foundations Graphs
As networks increase in size, the effectiveness of a point-and-line display diminishes: you run out of plotting dimensions. Some insight can still be gained by treating the overlap that results from a space-based layout as information. Here you see the clustering evident in movie co-starring for about 8000 actors.
55Foundations Graphs
This figure contains over 29,000 social science
authors. The two dense regions reflect different
topics.
56Foundations Graphs
As networks increase in size, the effectiveness of a point-and-line display diminishes, because you simply run out of plotting dimensions. I've found that you can still get some insight by using the overlap that results from a space-based layout as information. This figure contains over 29,000 social science authors. The two dense regions reflect different topics.
57Foundations Graphs and time
Adding time to social networks is also complicated: you run out of space to put time in most network figures. One solution: animate the network - make a movie! Here we see streaming interaction in a classroom, where the teacher (yellow square) has trouble maintaining order.
The SoNIA software program (McFarland and Bender-deMoll)
58Foundations Methods
Graphs are cumbersome to work with analytically,
though there is a great deal of good work to be
done on using visualization to build network
intuition. Recommendation: use layouts that
optimize on the feature you are most interested
in.
59A graph is vertices and edges
- A graph is vertices joined by edges
- i.e. a set of vertices V and a set of edges E
- A vertex is defined by its name or label
- An edge is defined by the two vertices which it connects, plus optionally
- An order of the vertices (direction)
- A weight (usually a number)
- Two vertices are adjacent if they are connected by an edge
- A vertex's degree is the number of its edges
60Directed graph (digraph)
- Each edge is an ordered pair of vertices, to indicate direction
- Lines become arrows
- The indegree of a vertex is the number of incoming edges
- The outdegree of a vertex is the number of outgoing edges
[Figure: weighted graph on vertices E, M, B, L, P with edge weights 210, 450, 190, 60, 200, 130]
61Traversing a graph (1)
- A path between two vertices exists if you can traverse along edges from one vertex to another
- A path is an ordered list of vertices
- length: the number of edges in the path
- cost: the sum of the weights on each edge in the path
- cycle: a path that starts and finishes at the same vertex
- An acyclic graph contains no cycles
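The definitions of length, cost and cycle can be made concrete with a short Python sketch. The vertex names and weights echo the figure on the digraph slide, but which weight belongs to which edge is not recoverable from the transcript, so the edge set below is an assumption, as are the function names.

```python
# Hypothetical weighted digraph: (from, to) -> weight.
weights = {
    ("E", "M"): 210,
    ("M", "B"): 190,
    ("B", "L"): 200,
    ("L", "P"): 130,
    ("P", "M"): 60,
    ("M", "L"): 450,
}

def is_path(vertices):
    """True if every consecutive pair of vertices is joined by an edge."""
    return all((u, v) in weights for u, v in zip(vertices, vertices[1:]))

def length(vertices):
    """Number of edges in the path."""
    return len(vertices) - 1

def cost(vertices):
    """Sum of the weights on each edge in the path."""
    return sum(weights[(u, v)] for u, v in zip(vertices, vertices[1:]))

def is_cycle(vertices):
    """A path with at least one edge that starts and finishes at the same vertex."""
    return len(vertices) > 1 and is_path(vertices) and vertices[0] == vertices[-1]

route = ["E", "M", "B", "L"]
loop = ["M", "B", "L", "P", "M"]
print(length(route), cost(route), is_cycle(route))
print(length(loop), cost(loop), is_cycle(loop))
```

Because this hypothetical graph contains the cycle M, B, L, P, M, it is not acyclic.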
62Traversing a graph (2)
- Undirected graphs are connected if there is a path between any pair of vertices
- Digraphs are usually either densely or sparsely connected
- Densely: the ratio of number of edges to number of vertices is large
- Sparsely: the above ratio is small
[Figure: graph on vertices E, M, B, L, P]
63Two graph representations: adjacency matrix and adjacency list
- Adjacency matrix
- n vertices need an n x n matrix (where n = |V|, i.e. the number of vertices in the graph); can store as an array
- Each position in the matrix is 1 if the two vertices are connected, or 0 if they are not
- For weighted graphs, the position in the matrix is the weight
- Adjacency list
- For each vertex, store a linked list of adjacent vertices
- For weighted graphs, include the weight in the elements of the list
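Both representations can be built from the same edge list, as in the Python sketch below. The vertex labels follow the example slides, but the assignment of weights to edges is an assumption, and ordinary Python lists stand in for the linked lists mentioned on the slide.

```python
vertices = ["E", "M", "B", "L", "P"]

# Hypothetical undirected weighted edges (u, v, weight).
edges = [
    ("E", "M", 210),
    ("M", "B", 190),
    ("M", "L", 450),
    ("B", "L", 200),
    ("L", "P", 130),
    ("M", "P", 60),
]

def adjacency_matrix(vertices, edges):
    """n x n matrix; entry [i][j] holds the edge weight, or 0 if there is no edge."""
    index = {name: i for i, name in enumerate(vertices)}
    n = len(vertices)
    matrix = [[0] * n for _ in range(n)]
    for u, v, w in edges:
        matrix[index[u]][index[v]] = w
        matrix[index[v]][index[u]] = w  # undirected: mirror the entry
    return matrix

def adjacency_list(vertices, edges):
    """For each vertex, a list of (neighbour, weight) pairs."""
    adj = {name: [] for name in vertices}
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))  # undirected: store both directions
    return adj

matrix = adjacency_matrix(vertices, edges)
adj = adjacency_list(vertices, edges)
for name, row in zip(vertices, matrix):
    print(name, row)
for name in vertices:
    print(name, adj[name])
```

The matrix always holds n x n entries, while the list holds one entry per edge endpoint, which is why the list tends to be the more compact choice for sparse graphs.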
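64Representing an unweighted, undirected graph (example)
[Figure: vertices indexed 0 = E, 1 = M, 2 = B, 3 = L, 4 = P]
65Representing a weighted, undirected graph (example)
[Figure: vertices indexed 0 = E, 1 = M, 2 = B, 3 = L, 4 = P, with edge weights 210, 450, 190, 60, 200, 130]
66Representing an unweighted, directed graph (example)
[Figure: vertices indexed 0 = E, 1 = M, 2 = B, 3 = L, 4 = P]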
67Comparing the two representations
- Space complexity
- Adjacency matrix is O(V^2)
- Adjacency list is O(V + E)
- E is the number of edges in the graph
- Static versus dynamic representation
- An adjacency matrix is a static representation: the graph is built in one go, and is difficult to alter once built
- An adjacency list is a dynamic representation: the graph is built incrementally, thus is more easily altered during run-time
68Algorithms involving graphs
- Graph traversal
- Shortest path algorithms
- In an unweighted graph: shortest length between two vertices
- In a weighted graph: smallest cost between two vertices
- Minimum Spanning Trees
- Using a tree to connect all the vertices at lowest total cost
69Graph traversal algorithms
- When traversing a graph, we must be careful to avoid going round in circles!
- We do this by marking the vertices which have already been visited
- Breadth-first search uses a queue to keep track of which adjacent vertices might still be unprocessed
- Depth-first search keeps trying to move forward in the graph, until reaching a vertex with no outgoing edges to unmarked vertices
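The two traversals might be sketched in Python like this. The adjacency-list graph is hypothetical, and the recursive form of depth-first search is one of several reasonable choices (an explicit stack would do as well).

```python
from collections import deque

# Hypothetical undirected graph stored as adjacency lists.
graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d", "e"],
    "d": ["c", "e"],
    "e": ["c", "d"],
}

def bfs(g, start):
    """Breadth-first search: a queue holds marked vertices not yet processed."""
    marked = {start}
    order = [start]
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in g[u]:
            if w not in marked:  # marking prevents going round in circles
                marked.add(w)
                order.append(w)
                queue.append(w)
    return order

def dfs(g, start):
    """Depth-first search: keep moving forward until no unmarked neighbour remains."""
    marked = set()
    order = []

    def visit(u):
        marked.add(u)
        order.append(u)
        for w in g[u]:
            if w not in marked:
                visit(w)

    visit(start)
    return order

print(bfs(graph, "c"))
print(dfs(graph, "c"))
```

Starting from c, breadth-first search reaches all of c's neighbours before a, whereas depth-first search follows the branch through b down to a before returning to d and e.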
70Shortest path (unweighted)
- The problem: find the shortest path from a vertex v to every other vertex in a graph
- The unweighted path measures the number of edges, ignoring the edges' weights (if any)
71Shortest unweighted path: simple algorithm
For a vertex v, d_v is the distance between a starting vertex and v
- 1 Mark all vertices with d_v = infinity
- 2 Select a starting vertex s, and set d_s = 0, and set shortest = 0
- 3 For all vertices v with d_v = shortest, scan their adjacency lists for vertices w where d_w is infinity
- For each such vertex w, set d_w to shortest + 1
- 4 Increment shortest and repeat step 3, until step 3 finds no such vertices w
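The four steps translate almost line for line into Python. The sketch below follows the slide's level-by-level scan rather than the queue-based breadth-first search; the example graph, including the isolated vertex f, is hypothetical.

```python
import math

# Hypothetical graph as adjacency lists; f is isolated on purpose.
adj = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d", "e"],
    "d": ["c", "e"],
    "e": ["c", "d"],
    "f": [],
}

def unweighted_distances(adj, s):
    d = {v: math.inf for v in adj}  # step 1: mark every vertex with infinity
    d[s] = 0                        # step 2: the start vertex is at distance 0
    shortest = 0
    while True:
        found = False
        for v in adj:               # step 3: scan vertices at the current distance
            if d[v] == shortest:
                for w in adj[v]:
                    if d[w] == math.inf:
                        d[w] = shortest + 1
                        found = True
        if not found:               # no new vertex was reached: stop
            break
        shortest += 1               # step 4: move on to the next distance
    return d

print(unweighted_distances(adj, "a"))
```

Each pass rescans every vertex, so this version can take time proportional to V^2 in the worst case; keeping the newly reached vertices in a queue avoids the rescans. Vertices that cannot be reached, such as f here, keep the value infinity.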
72Foundations Build a socio-matrix
From pictures to matrices
Undirected, binary
Directed, binary
73Foundations Methods
From matrices to lists
Arc List: a b, b a, b c, c b, c d, c e, d c, d e, e c, e d
Adjacency List: a: b; b: a, c; c: b, d, e; d: c, e; e: c, d
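Converting between the two list formats is mechanical. The Python sketch below reads the slide's arc list as ten (sender, receiver) pairs, which is an interpretation of the flattened text, and groups them into an adjacency list; the function names are hypothetical.

```python
# Arc list read from the slide as (sender, receiver) pairs.
arcs = [
    ("a", "b"), ("b", "a"), ("b", "c"), ("c", "b"), ("c", "d"),
    ("c", "e"), ("d", "c"), ("d", "e"), ("e", "c"), ("e", "d"),
]

def arcs_to_adjacency(arcs):
    """Group receivers under each sender; receivers with no out-ties get an empty list."""
    adj = {}
    for sender, receiver in arcs:
        adj.setdefault(sender, []).append(receiver)
        adj.setdefault(receiver, [])
    return adj

def adjacency_to_matrix(adj, order):
    """Binary socio-matrix with rows as senders and columns as receivers."""
    return [[1 if r in adj[s] else 0 for r in order] for s in order]

adj = arcs_to_adjacency(arcs)
matrix = adjacency_to_matrix(adj, sorted(adj))
for node in sorted(adj):
    print(node, adj[node])
for row in matrix:
    print(row)
```

Every arc in this example appears in both directions, so the resulting matrix is symmetric, i.e. the network is effectively undirected.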
74Foundations Basic Measures
Basic Measures. For greater detail, see http://www.analytictech.com/networks/graphtheory.htm
Volume
The first measure of interest is the simple
volume of relations in the system, known as
density, which is the average relational value
over all dyads. Under most circumstances, it is
calculated as
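\[ \Delta = \frac{\sum_{i \neq j} X_{ij}}{N(N-1)} \]
where X is the adjacency matrix and N is the number of actors.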
75Foundations Basic Measures
Volume
At the individual level, volume is the number of relations, sent or received, equal to the row and column sums of the adjacency matrix (out-degree and in-degree respectively).
Node   In-Degree   Out-Degree
a      1           1
b      2           1
c      1           3
d      2           0
e      1           2
Mean   7/5         7/5
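The row and column sums are easy to check in Python. The sketch assumes the five-actor directed graph with arcs a to b, b to a, c to b, c to d, c to e, e to c and e to d, which reproduces the degrees in the table; density is taken as ties present over the N(N-1) ordered pairs. Function names are hypothetical.

```python
nodes = ["a", "b", "c", "d", "e"]

# Assumed adjacency matrix: rows are senders, columns are receivers.
X = [
    [0, 1, 0, 0, 0],  # a -> b
    [1, 0, 0, 0, 0],  # b -> a
    [0, 1, 0, 1, 1],  # c -> b, d, e
    [0, 0, 0, 0, 0],  # d sends no ties
    [0, 0, 1, 1, 0],  # e -> c, d
]

def out_degrees(X):
    """Row sums: ties sent by each actor."""
    return [sum(row) for row in X]

def in_degrees(X):
    """Column sums: ties received by each actor."""
    return [sum(col) for col in zip(*X)]

def density(X):
    """Average tie value over all ordered pairs of distinct actors."""
    n = len(X)
    total = sum(X[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))

for name, i, o in zip(nodes, in_degrees(X), out_degrees(X)):
    print(name, i, o)
print("mean degree:", sum(out_degrees(X)) / len(X))
print("density:", density(X))
```

In-degrees and out-degrees always have the same mean, because every tie is sent by one actor and received by another.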
76Foundations Data
Basic Measures
Reachability
Indirect connections are what make networks
systems. One actor can reach another if there is
a path in the graph connecting them.
[Figure: example graph illustrating reachability among nodes a through f]
77Foundations Basic Matrix Operations
One of the key advantages to storing networks as matrices is that we can use all of the tools from linear algebra on the socio-matrix. Some of the basic matrix manipulations that we use are as follows.
- Definition
- A matrix is any rectangular array of numbers. We refer to the matrix dimension as the number of rows and columns
[Figure: example matrices of dimension (5 x 5), (5 x 2), and (5 x 1)]
78Foundations Basic Matrix Operations
Matrix operations work on the elements of the matrix in particular ways. To do so, the matrices must be conformable. That means the sizes allow the operation. For addition (+), subtraction (-), or elementwise multiplication (*), both matrices must have the same number of rows and columns. For these operations, the matrix value is the operation applied to the corresponding cell values.
Example (entries listed in reading order):
A = 1 3 4 7 2 5
B = 2 3 7 1 0 4
A+B = 3 6 11 8 2 9
A-B = -1 0 -3 6 2 1
A*B = 2 9 28 7 0 20
Multiplication by a scalar: 3A = 3 9 12 21 6 15
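The elementwise operations can be reproduced with plain Python lists. The values of A and B are those on the slide; arranging them as 3 x 2 matrices is an assumption (the transcript does not preserve the shape), and the helper names are hypothetical.

```python
import operator

# Slide values, assumed here to be laid out as 3 x 2 matrices.
A = [[1, 3], [4, 7], [2, 5]]
B = [[2, 3], [7, 1], [0, 4]]

def shape(M):
    """(rows, columns) of a rectangular matrix."""
    return (len(M), len(M[0]))

def elementwise(op, A, B):
    """Apply op to corresponding cells; both matrices must have the same shape."""
    if shape(A) != shape(B):
        raise ValueError("matrices are not conformable")
    return [[op(x, y) for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]

def scale(k, A):
    """Multiply every cell by the scalar k."""
    return [[k * x for x in row] for row in A]

print(elementwise(operator.add, A, B))
print(elementwise(operator.sub, A, B))
print(elementwise(operator.mul, A, B))
print(scale(3, A))
```

The shape check is the code counterpart of conformability: an operation on matrices of different sizes is rejected rather than silently truncated.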
79Matrix properties
- Addition contributes to the actors' relations
- Multiplication sums over a trait.
- Negative values can occur
- (friend, don't care, enemy) = (1, 0, -1)
- Interpret operations carefully
80Foundations Basic Matrix Operations
The transpose (' or T) of a matrix reverses the row and column dimensions: (A^T)_{ij} = A_{ji}. So an M x N matrix becomes an N x M matrix.
Example:
A =
a b
c d
e f
A^T =
a c e
b d f
81Foundations Basic Matrix Operations
The matrix multiplication (x) of two matrices involves all elements of the matrix, and will often result in a matrix of new dimensions. In general, to be conformable, the inner dimensions of the two matrices must match. So A(3x2) x B(2x3) = C(3x3), but A(3x3) x B(2x3) is not defined (actually a tensor).
Substantively, adding names to the dimensions will help us keep track of what the resulting multiplications mean. So multiplying (send x receive) x (send x receive) = (send x receive), giving us the two-step distances (the sender's recipients' receivers).
82Foundations Basic Matrix Operations
The multiplication of two matrices A (m x n) and B (n x q) results in C (m x q).
(2x2) x (2x2) = (2x2):
[a b; c d] x [e f; g h] = [ae+bg af+bh; ce+dg cf+dh]
(3x2) x (2x3) = (3x3):
[a b; c d; e f] x [g h i; j k l] = [ag+bj ah+bk ai+bl; cg+dj ch+dk ci+dl; eg+fj eh+fk ei+fl]
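The row-by-column rule can be written as a short Python function. The numeric matrices below are hypothetical stand-ins for the symbolic entries a through l on the slide, and `matmul` is an illustrative name.

```python
def matmul(A, B):
    """Multiply an (m x n) matrix by an (n x q) matrix, giving (m x q)."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions do not match")
    # Entry (i, j) is the sum of products of row i of A and column j of B.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Hypothetical numbers in place of a..f and g..l.
A = [[1, 2], [3, 4], [5, 6]]    # 3 x 2
B = [[7, 8, 9], [10, 11, 12]]   # 2 x 3

C = matmul(A, B)                # 3 x 3
for row in C:
    print(row)
```

Calling `matmul` with a 3 x 3 matrix on the left and the 2 x 3 matrix B on the right raises the `ValueError`, mirroring the non-conformable case.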
83Foundations Basic Matrix Operations
The powers (square, cube, etc) of a matrix are just the matrix times itself that many times: A^2 = AA or A^3 = AAA. We often use matrix multiplication to find types of people one is tied to, since the 1 in the adjacency matrix effectively captures just the people each row is connected to.
84Foundations Data
Basic Measures
Reachability
The distance from one actor to another is the
shortest path between them, known as the geodesic
distance. If there is at least one path
connecting every pair of actors in the graph, the
graph is connected and is called a component.
Two paths are independent if they only have the
two end-nodes in common. If a graph has two
independent paths between every pair, it is
biconnected, and called a bicomponent. Similarly
for three paths, four, etc.
85Foundations Data
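Calculate reachability through matrix multiplication (see p.162 of WF).
Total number of directed walks of length n: the entries of the n-th power of the adjacency matrix
Minimal distance from one node to another: the smallest power at which the corresponding entry becomes non-zero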
Mixing patterns
Matrices make it easy to look at mixing patterns: connections among types of nodes. Simply multiply an indicator of category by the adjacency matrix.
[Figure: example network on nodes a, b, c, d, e, f, each of type B or G]
Mixing counts:
B to selves: 4
B to G: 2
G to B: 2
G to selves: 6
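One way to sketch the mixing calculation in Python is to pre- and post-multiply the adjacency matrix by a node-by-category indicator matrix, i.e. C^T A C. The six-node network below is hypothetical: the figure's actual ties are not recoverable from the transcript, so the ties and the B/G assignment were chosen only so that the result matches the counts quoted on the slide.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

nodes = ["a", "b", "c", "d", "e", "f"]
category = {"a": "B", "b": "B", "c": "B", "d": "G", "e": "G", "f": "G"}

# Hypothetical undirected ties.
ties = [("a", "b"), ("b", "c"), ("d", "e"), ("e", "f"), ("d", "f"),
        ("c", "d"), ("a", "f")]

index = {name: i for i, name in enumerate(nodes)}
A = [[0] * len(nodes) for _ in nodes]
for u, v in ties:
    A[index[u]][index[v]] = 1
    A[index[v]][index[u]] = 1

# Indicator matrix: one row per node, one column per category.
categories = ["B", "G"]
C = [[1 if category[n] == c else 0 for c in categories] for n in nodes]

# Entry (r, s) counts ties sent from category r to category s.
mixing = matmul(matmul(transpose(C), A), C)
for name, row in zip(categories, mixing):
    print(name, row)
```

Because the ties are undirected, each within-category tie is counted twice on the diagonal (once from each end), which is why two B-B ties show up as 4.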
87Foundations Data
Matrix manipulations allow you to look at direction of ties, and distinguish symmetric from asymmetric ties.
To transform an asymmetric graph to a symmetric graph, add it to its transpose.
X =
0 1 0 0 0
1 0 0 0 0
0 1 0 1 1
0 0 0 0 0
0 0 1 1 0
XT =
0 1 0 0 0
1 0 1 0 0
0 0 0 0 1
0 0 1 0 1
0 0 1 0 0
X + XT =
0 2 0 0 0
2 0 1 0 0
0 1 0 1 2
0 0 1 0 1
0 0 2 1 0
Max Sym =
0 1 0 0 0
1 0 1 0 0
0 1 0 1 1
0 0 1 0 1
0 0 1 1 0
Min Sym =
0 1 0 0 0
1 0 0 0 0
0 0 0 0 1
0 0 0 0 0
0 0 1 0 0
Interpretation?
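One possible reading, sketched in Python: in X + XT a 2 marks a reciprocated tie, a 1 an unreciprocated tie, and a 0 no tie in either direction. "Max Sym" then keeps a tie if either actor sent it, and "Min Sym" only if both did. The matrix X is the 5 x 5 example as read row by row from the slide; the helper names are hypothetical.

```python
X = [
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
]

def transpose(M):
    return [list(col) for col in zip(*M)]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

XT = transpose(X)
S = add(X, XT)  # 2 = reciprocated, 1 = one-way, 0 = absent

# Max symmetrization: a tie exists if either direction is present.
max_sym = [[1 if s > 0 else 0 for s in row] for row in S]
# Min symmetrization: a tie exists only if both directions are present.
min_sym = [[1 if s == 2 else 0 for s in row] for row in S]

for name, M in (("X + XT", S), ("Max Sym", max_sym), ("Min Sym", min_sym)):
    print(name)
    for row in M:
        print(row)
```

Which symmetrization is appropriate depends on the substantive question: the max rule treats any nomination as a relationship, the min rule only mutual ones.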
88Graphs / matrices
- Analysis of social structure
- Visualization tools
- Other methods
- Statistics
- Power laws
- Bayesian graphs
89Global web Winners take all
- Pr(page has k inlinks) ∝ k^(-γ), γ ≈ 2.1
- Popular few receive disproportionate share of links
- ↑ traffic, ↑ prob SE indexing, ↑ SE ranking
90Category-specific web: Winners don't (quite) take all
- All US company homepages
- histogram w/ exponentially increasing buckets (constant on log scale)
- Strong deviation from pure power law
- Unimodal (≈ log-normal) body, power law tail
- Less skewed: many fare well against mode
Pennock, Giles, et al., PNAS 2002
91Applications of Networked Life
- Social structure in organizations
- Economic and business behavior
- Epidemiology
- Information discovery
- Design and robustness of networks
-
92SNA disciplines
- More diverse than expected!
- Sociology
- Political Science
- Business
- Economics
- Sciences
- Computer science
- Information science
- Others?
93SN on the web - services
- "A social network service uses software to build online social networks for communities of people who share interests and activities or who are interested in exploring the interests and activities of others." (Wikipedia)
- Friending
- Facebook
- MySpace
- LinkedIn
- Second life
- This will only increase!
- Large complex, heterogeneous networks
- Latour's actor-network model
- Different entities connect actors
- Coauthorship network connected by papers
94example networks of people and articles (e.g.,
citation and co-authorship networks)
this image is from the system ReferralWeb by Henry Kautz et al. at AT&T Research http://foraker.research.att.com/refweb/version2/RefWeb.html
95SNA and the Web 2.0
- Wikis
- Blogs
- Folksonomies
- Collaboratories
96Computational SNA Models
- New models are emerging
- Very large network analysis is possible!
- Deterministic - algebraic
- Early models still useful
- Statistical
- Descriptive using many features
- Diameter, betweenness, ...
- Probabilistic graphs
- Generative
- Creates SNA based on agency, documents, geography, etc.
- Community discovery and prediction
97Graphical models
- Modeling the document generation
Three existing generative models. Three variables in the generation of documents are considered: (1) authors, (2) words, and (3) topics (latent variable)
98Theories used in SNA
- Graph/network
- Heterogeneous graphs
- Hypergraphs
- Probabilistic graphs
- Economics/game theory
- Optimization
- Visualization/HCI
- Actor/Network
- Many more
99Big questions
- Scalability of investigations
- Data acquisition and data rights
- Heterogeneous network analysis
- Integration into decision making