Analysing Social Networks Via the Internet


1
Analysing Social Networks Via the Internet
  • Bernie Hogan
  • PhD Candidate, Department of Sociology
  • Research Coordinator, NetLab

2
As we may think
  • Wholly new forms of encyclopedias will appear,
    ready made with a mesh of associative trails
    running through them
  • Vannevar Bush, 1945

3
60 years later
  • We have no shortage of associative trails. But
    they are not confined to information.
  • When computer networks link people as well as
    machines, they become social networks (Wellman
    et al. 1996)

4
Why do networks matter?
  • Google succeeded through a social network
    algorithm.
  • MySpace and Facebook are the largest explicit
    social networks ever created.
  • We can show how the rich get richer: preferential
    attachment (Barabasi and Albert 1999),
  • and how everyone is only six degrees apart
    (Milgram 1967; Watts 2001).

5
The Oracle of Kevin Bacon: The Original Online Network
The Importance of Being Earnest
Where the Truth Lies
84 Charing Cross Road
A Few Good Men
Mission Impossible II
6
What are networks?
  • Relationships between actors
      • Friendships
      • Partnerships
      • Hyperlinks
  • Plus information about actors
      • People
      • Businesses
      • Webpages

7
Nodes
  • Generally constrained to well-defined types,
    e.g. people to people (not to orgs and teams).
  • More than one type is included in affiliation
    networks, which link people as one set to events
    as another set.

8
Links can be
  • Directed links: arcs (from me to you)
  • Undirected links: edges (me and you)
  • Valued (I sent 3 messages to you)
  • Signed (I like him; I dislike her)
  • Multiplex (I link to her blog, know her email and
    am on her MySpace page)
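
The link properties listed above could be stored along these lines. This is only a sketch: the `Tie` class, the `add_tie` helper and the sample names are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Tie:
    """One link between two actors (hypothetical structure for illustration)."""
    directed: bool = True                     # arc if True, edge if False
    value: float = 1.0                        # valued: e.g. number of messages
    sign: int = 1                             # signed: +1 like, -1 dislike
    media: set = field(default_factory=set)   # multiplex: channels in use


ties = {}


def add_tie(source, target, **attrs):
    """Store a tie; undirected ties share one key regardless of order."""
    tie = Tie(**attrs)
    key = (source, target) if tie.directed else tuple(sorted((source, target)))
    ties[key] = tie
    return key


add_tie("ann", "bob", value=3)                 # valued arc: 3 messages sent
add_tie("dan", "cat", directed=False)          # undirected edge
add_tie("ann", "cat", sign=-1)                 # signed arc: dislike
add_tie("ann", "eve", media={"blog", "email", "MySpace"})  # multiplex arc
```

Keeping one record per pair of actors makes it easy to ask later how many channels a tie uses, or to filter on sign or value.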

9
Some Network Types
Users of a web forum
Subset of political blogs
Friend pages on MySpace
10
Where to find networks online?
Social networking
Email
Social news
Web links
Blogs
Message boards
Instant messengers
Games
11
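Networks as data
Adjacency matrix (rows are From, columns are To; diagonal cells blank)
          To
          A  B  C  D
From  A   -  1  0  0
      B   1  -  0  1
      C   0  1  -  1
      D   0  0  0  -
[Figure: graph with nodes A, B, C and D]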
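
A small sketch of how such a matrix might be turned into an edge list. It reads the slide's rows as From and columns as To, and treats the three values in each row as the off-diagonal cells. That reading is an assumption, as are the function and variable names.

```python
nodes = ["A", "B", "C", "D"]

# Rows are senders ("From"), columns are receivers ("To").
# Diagonal cells are set to 0 on the assumption that self-ties are not recorded.
matrix = [
    [0, 1, 0, 0],  # A
    [1, 0, 0, 1],  # B
    [0, 1, 0, 1],  # C
    [0, 0, 0, 0],  # D
]


def matrix_to_edges(nodes, matrix):
    """Return a list of (from, to) pairs for every non-zero cell."""
    return [
        (nodes[i], nodes[j])
        for i, row in enumerate(matrix)
        for j, cell in enumerate(row)
        if cell
    ]


def edges_to_matrix(nodes, edges):
    """Rebuild the 0/1 adjacency matrix from an edge list."""
    index = {name: i for i, name in enumerate(nodes)}
    result = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        result[index[source]][index[target]] = 1
    return result


edges = matrix_to_edges(nodes, matrix)
```

Edge lists are usually the more compact form for sparse online networks, while the matrix form is what many analysis packages expect.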
12
Networks as data II
13
Capturing this data online
  • Scraping pages
      • Using scripting languages (Python, Perl)
      • Using scraping software
  • APIs (Application Programming Interfaces)
      • Again using scripting languages
      • Out-of-the-box software
      • Online applications
  • More on this tomorrow!
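
As a rough illustration of scraping with a scripting language, the sketch below pulls hyperlinks out of an HTML string using only the Python standard library and records them as (page, target) edges. The `LinkExtractor` class, the `page_to_edges` helper and the example.com URLs are hypothetical; a real scraper would also need to fetch pages and respect each site's terms of use.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def page_to_edges(page_url, html):
    """Return one directed edge from the page to each link it contains."""
    parser = LinkExtractor(page_url)
    parser.feed(html)
    return [(page_url, target) for target in parser.links]


# Made-up page content standing in for a downloaded profile page.
sample_html = (
    '<p><a href="/friends/ann">Ann</a> and '
    '<a href="http://example.org/bob">Bob</a></p>'
)
edges = page_to_edges("http://example.com/me", sample_html)
```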

14
Analysing Data
  • Software applications
      • UCInet: powerful, social-science oriented, quirky
        interface
      • Pajek: powerful, strange interface, comprehensive
      • Others (Egotistics, NetMiner, Visualyzer,
        NetWorkBench)
  • Software environments
      • JUNG (Java Universal Network/Graph Framework)
      • R (sna package)
      • igraph (Python)

15
Common metrics I: Centrality
  • Who is the most connected?
  • Simple question, complex answer
  • Degree: number of links
  • Betweenness: shortest paths
  • PageRank: links to high-degree nodes
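
To make the three measures concrete, here is a minimal pure-Python sketch for a small undirected network. It assumes the graph is stored as a dictionary of neighbour lists. The betweenness function follows Brandes' algorithm for unweighted graphs, and the PageRank function is a plain power iteration with a damping factor of 0.85. The function names and the sample graph are illustrative, not taken from the slides or from any of the packages named earlier.

```python
from collections import deque


def degree(adj):
    """Degree: number of links per node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def betweenness(adj):
    """Unnormalised betweenness via Brandes' algorithm (unweighted, undirected)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in order of distance from s
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):        # farthest nodes first
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every unordered pair was counted once from each end.
    return {v: b / 2 for v, b in score.items()}


def pagerank(adj, damping=0.85, iterations=100):
    """PageRank by power iteration; rank is shared equally among neighbours."""
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iterations):
        new = {v: (1 - damping) / n for v in adj}
        for v, nbrs in adj.items():
            targets = nbrs if nbrs else list(adj)   # isolated node: spread evenly
            share = damping * rank[v] / len(targets)
            for w in targets:
                new[w] += share
        rank = new
    return rank


# Hypothetical star network: b is tied to a, c and d.
star = {"a": ["b"], "b": ["a", "c", "d"], "c": ["b"], "d": ["b"]}
```

On this star, all three measures pick out `b`, but on larger networks they often disagree, which is why the simple question has a complex answer.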
16
Common metrics II: Sub-groups
  • Interested in group structure
  • Again, many applicable measures
  • Components
      • Number of disconnected sets
      • Strong: there must be a path of arcs to every
        node from every other node
  • Community detection
      • See Mark Newman's work (such as the Girvan-Newman
        algorithm)
  • Special Ks: k-shell, k-core, k-plex
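
Two of these measures are simple enough to sketch directly. The code below finds the components of an undirected network and extracts a k-core by repeatedly dropping nodes with fewer than k remaining ties. It assumes the graph is a dictionary of neighbour sets; the names and the sample graph are hypothetical, and strong components of directed networks are not covered here.

```python
def components(adj):
    """Connected components of an undirected graph, as a list of sets."""
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v in group:
                continue
            group.add(v)
            stack.extend(w for w in adj[v] if w not in group)
        seen |= group
        result.append(group)
    return result


def k_core(adj, k):
    """Nodes left after repeatedly removing nodes with fewer than k ties."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            if sum(1 for w in adj[v] if w in alive) < k:
                alive.discard(v)
                changed = True
    return alive


# Hypothetical network: a triangle (a, b, c), a pendant d, and a separate pair.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
    "e": {"f"},
    "f": {"e"},
}
```

Here `components(graph)` separates the pair `e`, `f` from the rest, and `k_core(graph, 2)` keeps only the triangle.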

17
World Wide Web K-shells
  • http://xavier.informatics.indiana.edu/lanet-vi

18
Community Detection: Political Blogs
  • Adamic and Glance. 2004. The Political Blogosphere
    and the 2004 U.S. Election: Divided They Blog.

19
Visualizing Data
  • Applications
      • GUESS: great for tweaking based on attribute
        data. Technical, but powerful.
      • NetDraw: straightforward, integrates with UCInet
      • Pajek: fast, draws large networks, pretty
      • More coming out every week (see the work of
        Martin Wattenberg, Danyel Fisher and Fernanda
        Viegas)
  • Environments / Packages
      • JUNG, Prefuse, Piccolo, R (gplot)

20
Visualization Best Practices
  • General
      • Do NOT show a graph for graph's sake.
      • Huge networks often give cluttered pictures.
      • De-clutter by trimming to symmetric ties.
  • Drawing nodes
      • Size can often represent log(continuous
        variable).
      • Tint can represent a categorical or continuous
        variable.
      • Do not show ego in an egonet.
      • Only use labels on small graphs (n < 50).
  • Layout
      • Spring-embedder layouts work nicely.
      • Post-layout touch-ups are possible using bin
        packing (in GUESS).

Most important: be graph literate. Otherwise
you'll be impressed with the first thing you
draw, regardless of its quality.
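
Several of these practices are data-preparation steps that can be done before any drawing tool is opened. The sketch below shows one possible way to trim a directed network to symmetric ties, scale node sizes by the log of a continuous variable, and decide whether to draw labels. The function names, the size constants and the sample data are assumptions made for illustration.

```python
import math


def symmetric_ties(arcs):
    """Keep only reciprocated ties, returned as sorted undirected pairs."""
    arc_set = set(arcs)
    pairs = {
        tuple(sorted((a, b)))
        for a, b in arc_set
        if a != b and (b, a) in arc_set
    }
    return sorted(pairs)


def node_sizes(values, base_size=5.0, scale=4.0):
    """Node size grows with log(1 + value) so large values do not dominate."""
    return {n: base_size + scale * math.log1p(v) for n, v in values.items()}


def show_labels(node_count, limit=50):
    """Only label small graphs (n < 50, as suggested on the slide)."""
    return node_count < limit


# Hypothetical arcs and message counts.
arcs = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d"), ("d", "c"), ("e", "e")]
trimmed = symmetric_ties(arcs)
sizes = node_sizes({"a": 0, "b": 99})
```

The `log1p` form is used instead of a bare log so that a value of zero still gets a finite size; that choice is a convenience, not something stated in the talk.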
21
Visualization Demo: Email Subgroups in JUNG
22
Example - Digg.com
Popular Stories
Stories from Friends
Today's Top Stories
23
Digg: Using Networks to Predict the News
  • Data gathered in early March
  • All Digg users with 7 or more top stories (subset
    of top 1000 Diggers) as of Feb 27
  • Mapped symmetric ties
  • Node size is log(stories - 6), brightness is
    degree.
  • Calculated number of ties (for links to top
    diggers and links to other diggers)
      • In to node: Fans
      • Symmetric: Friends
      • Out from node: Watched
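
The tie counts described above might be computed roughly as follows. The sketch assumes that a reciprocated tie counts only as a friend, so fans and watched are the unreciprocated incoming and outgoing ties. The slides do not say whether the categories overlap, so that split is an assumption, as are the function names and the sample users.

```python
import math
from collections import defaultdict


def tie_counts(arcs):
    """Count fans (in only), friends (both ways) and watched (out only)."""
    arc_set = set(arcs)
    counts = defaultdict(lambda: {"fans": 0, "friends": 0, "watched": 0})
    for source, target in arc_set:
        if source == target:
            continue                            # ignore self-ties
        if (target, source) in arc_set:
            counts[source]["friends"] += 1      # seen once from each end
        else:
            counts[source]["watched"] += 1
            counts[target]["fans"] += 1
    return dict(counts)


def node_size(top_stories):
    """Node size as on the slide: log(stories - 6), defined for 7+ stories."""
    return math.log(top_stories - 6)


# Hypothetical arcs: (watcher, watched user).
arcs = [("ann", "bob"), ("bob", "ann"), ("cat", "ann"), ("ann", "dan")]
counts = tie_counts(arcs)
```

With the 7-story cut-off, the smallest users in the sample get a size of log(1) = 0, so a drawing tool would need a minimum size added on top.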

24
.
25
Regression Output - Predicting Popular Stories
Effect of fans in high places
Very strong models
26
Online networks in Context
Media multiplexity: there is a positive
relationship between the number of ways in which
people connect and tie strength
(Haythornthwaite 1999)
27
Networks in a pinch
  • The number of ties is often the most significant.
  • Just ask.
  • Specify boundary conditions (e.g. people you have
    emailed in the past month)
  • Categories help them remember and give you
    extra data points (e.g. friends / workmates /
    relatives).
  • With a roster, you can get people to select from
    a list.

28
Summary
  • Network analysis: because sociology wasn't nerdy
    enough already.
  • Involves a disparate suite of programs for
    capture, analysis and visualization.
  • Compelling visual imagery - maps of
    relationships.
  • Strong explanatory power in online spaces.
  • A host of meaningful metrics to choose from
  • Sometimes, the number of ties is enough.

29
Many Thanks
  • Bernie Hogan
  • bernie.hogan_at_utoronto.ca
  • PhD Candidate, Department of Sociology
  • Research Coordinator, NetLab
  • Graduate Fellow, Knowledge Media Design Institute
  • University of Toronto
  • P.S. Ask me about my scripts and tools