CS 599: Social Media Analysis

1
Information Diffusion in Social Media
  • Kristina Lerman
  • University of Southern California

2
Information diffusion on Twitter follower graph
3
Diffusion on networks
  • The spread of diseases, ideas, and behaviors on a
    network can be described as a contagion process
    in which an active node (infected/informed/adopted)
    activates its non-active neighbors with some
    probability
  • This creates a cascade on the network
  • How large do cascades become?
  • What determines their growth?

4
Diffusion models
  • Complex response: infection requires multiple
    exposures.
  • Non-monotonic exposure response

[Figure: exposure response functions for the threshold model and for complex contagion. x-axis: number of infected neighbors; y-axis: infection probability (maximum 1); the threshold model is marked at φ_i k_i.]
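
A minimal sketch of a threshold-model exposure response, assuming the usual rule that a node becomes infected once its number of infected neighbors reaches a fraction phi of its degree. The names threshold_response, phi and degree are illustrative and do not come from the slides.

```python
def threshold_response(infected_neighbors: int, degree: int, phi: float) -> float:
    """Infection probability under a threshold rule.

    Returns 0 while fewer than phi * degree neighbors are infected,
    and 1 once that count is reached (a step function).
    """
    if degree <= 0:
        return 0.0
    return 1.0 if infected_neighbors >= phi * degree else 0.0


# Example: a node with 4 neighbors and threshold fraction 0.5.
curve = [threshold_response(k, degree=4, phi=0.5) for k in range(5)]
print(curve)
```

With these example values the response stays at 0 for zero or one infected neighbor and jumps to 1 from two infected neighbors onward.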
5
Epidemic diffusion model
  • Infected nodes propagate contagion to susceptible
    neighbors with probability m (transmissibility or
    virality of contagion)

[Figure: exposure response function for the epidemic model. x-axis: number of infected neighbors; y-axis: infection probability (maximum 1).]
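
A hedged sketch of the exposure response implied by this model, assuming each infected neighbor transmits independently with probability m. Under that assumption the chance of infection after k exposures is 1 - (1 - m)^k. The function name is illustrative.

```python
def independent_exposure_response(infected_neighbors: int, m: float) -> float:
    """Probability of infection when each infected neighbor
    independently transmits with probability m (an assumption)."""
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must be a probability between 0 and 1")
    return 1.0 - (1.0 - m) ** infected_neighbors


# The response grows with every additional infected neighbor.
for k in range(6):
    print(k, round(independent_exposure_response(k, m=0.1), 4))
```

The key property is that every extra exposure adds to the infection probability, which is the behavior later slides compare against observed responses.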
6
Epidemic threshold
  • Epidemic threshold t
  • For m < t, localized cascades (epidemic dies out)
  • For m > t, global cascades
  • Epidemic threshold depends on topology only: it is
    set by the largest eigenvalue of the adjacency
    matrix of the network
  • True for any network
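
A sketch of how such a threshold could be computed, assuming the commonly cited form t = 1 / (largest eigenvalue of the adjacency matrix). The slide only states the dependence on the largest eigenvalue, so the exact formula is an assumption here. The eigenvalue is estimated by power iteration on A + I; the shift avoids the oscillation that plain power iteration shows on bipartite graphs. All function names are illustrative.

```python
def largest_adjacency_eigenvalue(adjacency, iterations=500):
    """Estimate the largest eigenvalue of a symmetric 0/1 adjacency
    matrix (list of lists) by power iteration on A + I."""
    n = len(adjacency)
    if n == 0:
        return 0.0
    vector = [1.0] * n
    shifted = 1.0
    for _ in range(iterations):
        # Multiply by (A + I): the node's own value plus its neighbors' values.
        product = [
            vector[i] + sum(adjacency[i][j] * vector[j] for j in range(n))
            for i in range(n)
        ]
        shifted = max(product)  # converges to (largest eigenvalue + 1)
        vector = [x / shifted for x in product]
    return shifted - 1.0


def epidemic_threshold(adjacency):
    """Assumed form: threshold = 1 / largest eigenvalue."""
    eigenvalue = largest_adjacency_eigenvalue(adjacency)
    return float("inf") if eigenvalue <= 0.0 else 1.0 / eigenvalue


# Example: a star with one hub and four leaves.
star = [
    [0, 1, 1, 1, 1],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
print(epidemic_threshold(star))
```

Denser or more hub-dominated graphs have a larger leading eigenvalue and therefore, under this assumed form, a lower threshold.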

7
Differences in the Mechanics of Information
Diffusion across Topics: Idioms, Political
Hashtags, and Complex Contagion on Twitter
  • Daniel M Romero, Brendan Meeder and Jon Kleinberg

Presentation by Aswin Rajkumar
8
Motivation and Contribution
  • Information diffusion and topics
    - E.g., controversial political topics have high
      information diffusion.
    - Scientific study of the variation in diffusion
      mechanics across topics.
  • Contribution of the paper
    - Empirical analysis of real-world data
    - Observation that the mechanics of spread can be
      described using two variables, stickiness and
      persistence.
    - Confirmation of sociological theories found in
      the offline world (diffusion of innovations)

9
The Study: How?
  • Twitter dataset: a snapshot covering a large
    number of tweets over a period of several months
    (Aug 2009 to Jan 2010)
  • 3 billion messages from over 60 million users
  • Hashtag tokens, top 500 hashtags
  • @Mention network. Neighbor set: at least t
    mentions from X to Y, with t = 3. Why? It shows
    X's attention to Y.
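
A minimal sketch of building such a mention network, assuming the rule is "X is connected to Y if X mentioned Y at least t times" and that the input is a list of (sender, target) pairs. Both the input format and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def build_mention_network(mentions, t=3):
    """Return {X: set of users Y that X mentioned at least t times}.

    mentions: iterable of (sender, target) pairs, one per @-mention.
    """
    counts = Counter(mentions)  # (sender, target) -> number of mentions
    neighbors = defaultdict(set)
    for (sender, target), n in counts.items():
        if n >= t and sender != target:
            neighbors[sender].add(target)
    return dict(neighbors)


# Example: x mentions y three times (kept) and z twice (dropped).
mentions = [("x", "y")] * 3 + [("x", "z")] * 2 + [("y", "x")] * 4
print(build_mention_network(mentions, t=3))
```

The edges are directed: an edge from X to Y records X's attention to Y, not the reverse.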

10
The Study: What?
  • Adoption and spread of hashtags (diffusion)
  • Topics: Politics, Celebrity, Music, Movies,
    Games, Idioms, Sports and Technology
  • Stickiness: the probability that a piece of
    information will pass from a person who knows or
    mentions it to another person who is exposed to
    it.
  • Persistence and complex contagion, a principle
    from sociology. Persistence: the relative extent
    to which repeated exposures to a hashtag continue
    to have significant marginal effects on
    adoption (rate of decay).

11
Complex Contagion
Complex contagion refers to the phenomenon in
social networks in which multiple sources of
exposure to an innovation are required before an
individual adopts the change of behavior. -
Wikipedia
12
[Figure: exposure curve P(K), annotated with stickiness and persistence]
13
Analysis: Stickiness and Persistence
  • Take the top 500 hashtags
  • Classify them into 8 topics or categories
  • Construct p(k) curves for each hashtag and
    average them separately within each category
  • Compare the shapes
    - Political hashtags: high stickiness and
      persistence
    - Twitter idioms: high stickiness, low persistence
  • mw2, mafiawars
  • lost, newmoon
  • mj, brazilwantsjb
  • pandora, thisiswar
  • obama, hcr
  • cricket, nhl
  • photoshop, digg

14
Twitter Idioms
cantlivewithout
musicmonday
iloveitwhen
followfriday
15
Analysis: Subgraph Structure
  • Interconnections among early adopters
  • Subgraphs for political hashtags: high
    in-degree, large number of triangles.
  • Tie strength: strong, weak.

Credit: Bridge-talent.com
16
Exposure Curve - Definitions
  • k-exposed: a user is k-exposed to a tag h if he
    has not used h, but is connected to k other users
    who have used h in the past.
  • What's the probability that a k-exposed user u
    will use hashtag h in the future?
    1) Ordinal time estimate: probability of a
       k-exposed user u using hashtag h before
       becoming (k+1)-exposed.
       P(k) = I(k) / E(k)
       E(k) = number of k-exposed users
       I(k) = number of k-exposed users who used h
       before becoming (k+1)-exposed.
    2) Snapshot estimate: similar, but based on time.
       E(k) = number of users k-exposed at t1
       I(k) = number of users k-exposed at t1 who
       used h before t2
       P(k) = I(k) / E(k) -> exposure curve
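
The ordinal time estimate can be sketched as follows for a single hashtag. This is an illustration under stated assumptions, not the authors' code: the input is assumed to be a time-ordered list of first uses plus a dictionary of who each user is connected to, users who adopt with zero exposures are left out of the curve, and all names are hypothetical.

```python
from collections import defaultdict


def ordinal_exposure_curve(adoptions, connected_to):
    """Estimate P(k) = I(k) / E(k) for one hashtag.

    adoptions: users in the order they first used the hashtag.
    connected_to: dict mapping a user to the set of users whose
        hashtag use counts as an exposure for that user.
    """
    # Invert the edges: who becomes exposed when a given user adopts.
    audience = defaultdict(set)
    for user, sources in connected_to.items():
        for source in sources:
            audience[source].add(user)

    exposure = defaultdict(int)       # current k of each non-adopter
    adopted = set()
    exposed_count = defaultdict(int)  # E(k): users who were ever k-exposed
    adopted_count = defaultdict(int)  # I(k): adopted while k-exposed

    for user in adoptions:
        if user in adopted:
            continue
        k = exposure[user]
        if k >= 1:
            adopted_count[k] += 1
        adopted.add(user)
        # Every non-adopter connected to this user moves from k to k+1.
        for other in audience[user]:
            if other not in adopted:
                exposure[other] += 1
                exposed_count[exposure[other]] += 1

    return {k: adopted_count[k] / exposed_count[k] for k in sorted(exposed_count)}


# Example: b, c and d are connected to a; c is also connected to b.
connected_to = {"b": {"a"}, "c": {"a", "b"}, "d": {"a"}}
print(ordinal_exposure_curve(["a", "b", "c"], connected_to))
```

In the example, three users become 1-exposed and one of them adopts before a second exposure, so P(1) is 1/3; the single user who becomes 2-exposed adopts, so P(2) is 1.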

17
Comparison Parameters
  • Persistence parameter: F(P) = A(P) / R(P)
    A(P) = area under the P curve.
    R(P) = area of the rectangle of length K and
    height max(P(k)).
    Curve comparisons:
    - "increases rapidly and falls" vs "increases
      slowly and saturates"
    - "increases slowly and saturates" vs "rapid
      increase"
  • Stickiness parameter: M(P) = max(P(k))
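
A small sketch of the two parameters, assuming the exposure curve is given as a list of values P(0), ..., P(K) and that the area is computed by joining the points with straight lines (trapezoid rule). Those conventions and the function names are assumptions made for illustration.

```python
def stickiness(p):
    """M(P): the maximum of the exposure curve."""
    return max(p)


def persistence(p):
    """F(P) = A(P) / R(P) for a curve sampled at k = 0..K.

    A(P) is the trapezoid-rule area under the curve; R(P) is the area
    of the rectangle of length K and height max(P(k)).
    """
    K = len(p) - 1
    peak = max(p)
    if K < 1 or peak <= 0:
        raise ValueError("need at least two points and a positive peak")
    area = sum((p[k] + p[k + 1]) / 2.0 for k in range(K))
    return area / (K * peak)


# Same peak, different shapes: one curve stays high, one falls off quickly.
saturating = [0.0, 0.02, 0.02, 0.02, 0.02]
spike = [0.0, 0.02, 0.0, 0.0, 0.0]
print(stickiness(saturating), persistence(saturating))
print(stickiness(spike), persistence(spike))
```

Both example curves have the same stickiness, but the one that stays near its peak has a much higher persistence than the one that falls off after the first exposure.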

18
Plots
F(P) = A(P) / R(P) -> persistence parameter
M(P) = max(P(k)) -> stickiness parameter
19
Improvements and Related Work
  • @Mention network is not very representative.
    Also, attention should be from Y to X.
  • Considers only average persistence. Median and
    variance should be analyzed too.
  • Other types of networks, e.g., blogs. Gruhl, Guha,
    Liben-Nowell, Tomkins: Information Diffusion
    through Blogspace.
  • Influence on online behavior, e.g., games. Woo,
    Kang, Kim: The Contagion of Malicious Behaviors
    in Online Games.
  • Network structure is dynamic in real life. Bano,
    Holthoefer, Wang, Moreno, Bailon: Diffusion
    Dynamics with Changing Network Composition.

20
Conclusion
  • Hashtags of different topics exhibit different
    mechanics of spread. Politically controversial
    hashtags have the highest diffusion.
  • Information diffusion depends on the probability
    of users adopting a hashtag after repeated
    exposure to it. Depends on the magnitude of the
    probabilities as well as the rate of decay
  • Confirms the sociological theory of complex
    contagion
  • Higher in-degree and stronger ties result in
    better spread.

21
Questions?
22
What Stops Social Epidemics? (Ver Steeg et al.)
  • Why do information cascades in social media
    - grow quickly initially,
    - but remain much smaller than predicted by
      epidemic models?
  • Information cascades differ from viral contagion
    - Response to repeated exposure is important on
      Digg (and Twitter)
    - Drastically alters predictions about size of
      epidemics

23
Social news
  • Users submit or vote for (infected by) news
    stories
  • Social network
    - Users follow friends to see
      stories friends submit and
      stories friends vote for
  • Trending stories
    - Digg promotes most popular stories to its Top
      News page

24
How large are cascades in social media?
Number of people who share a message (with a URL)
Twitter: 70K URLs, 700K users, 36M edges
Digg: 3.5K URLs, 258K users, 1.7M edges
Most cascades are less than 1% of total network size!
25
Why are these cascades so small?
Standard model of epidemic growth (heterogeneous
mean field theory, SIR model, same degree
distribution as Digg)
[Figure: x-axis: transmissibility m; annotation: most cascades fall in this range]
Transmissibility of almost all Digg stories falls
within the width of this line?!
26
Maybe graph structure is responsible?
[Figure: x-axis: transmissibility m, with the epidemic threshold marked. Curves: mean field prediction (same degree dist.); simulated cascades on a random graph with same degree dist.; simulated cascades on the observed Digg graph]
  • clustering reduces epidemic threshold and
    cascade size,
  • but not enough!

27
What about the spreading mechanism?
[Figure legend: infected, not infected, ?]
28
Are repeat exposures a big effect?
Yes, more than half of the users are exposed to
the same information more than once!
29
How do people respond to repeated exposure?
Exposure response
Not much. We have similar results for Twitter.
Also noted by Romero et al., WWW 2011
30
Big consequences for cascade growth
  • Most people are exposed to a story more than once
  • Repeated exposures have little effect
  • Growth of epidemics is severely curtailed
    (especially compared to the Independent Cascade
    Model)
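
A toy simulation of this argument, offered as a sketch under strong assumptions rather than a reproduction of the study. It compares an independent-cascade rule, where every exposure is a fresh chance to infect, with a rule where only the first exposure counts. The graph (a ring with random shortcuts), the parameter values and all names are invented for illustration.

```python
import random


def ring_with_shortcuts(n, shortcuts, rng):
    """Illustrative undirected graph: a ring of n nodes plus random extra edges."""
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        j = (i + 1) % n
        neighbors[i].add(j)
        neighbors[j].add(i)
    for _ in range(shortcuts):
        a, b = rng.sample(range(n), 2)
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def cascade_size(neighbors, seed_node, m, repeats_matter, rng):
    """Spread from seed_node; each newly infected node exposes its neighbors once.

    repeats_matter=True: every exposure infects with probability m.
    repeats_matter=False: only a node's first exposure can infect it.
    """
    infected = {seed_node}
    already_exposed = set()
    frontier = [seed_node]
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in sorted(neighbors[node]):
                if neighbor in infected:
                    continue
                if not repeats_matter and neighbor in already_exposed:
                    continue  # repeated exposures have no effect
                already_exposed.add(neighbor)
                if rng.random() < m:
                    infected.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return len(infected)


rng = random.Random(0)
graph = ring_with_shortcuts(n=500, shortcuts=1000, rng=rng)
for repeats_matter in (True, False):
    sizes = [
        cascade_size(graph, rng.randrange(500), 0.3, repeats_matter, rng)
        for _ in range(50)
    ]
    print("repeats matter:", repeats_matter, "mean size:", sum(sizes) / len(sizes))
```

Running both rules on the same graph gives a rough feel for how much of the predicted cascade size depends on repeated exposures continuing to have an effect.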

31
Weak response to repeated exposures suppresses
outbreaks
[Figure: x-axis: transmissibility m. Labels: take effect of repeat exposure into account; actual Digg cascades; result of simulations. Annotation: epidemic threshold unchanged]
32
How Limited Visibility and Divided Attention
Constrain Social Contagion (Hodas & Lerman, 2012)
  • Questions
    - How do people respond to exposures to
      information by friends on social media?
    - What role does content play in information
      diffusion?
  • Findings
    - Users have finite ability to process information
    - Most recently received messages are retweeted,
      the rest are overlooked
    - Highly connected users (hubs) are far less
      likely to retweet any message they receive than
      poorly connected people
    - Reduced susceptibility of hubs to infections
      explains why cascades are small

33
Mechanics of information diffusion
User must see an item and find it interesting
before he/she can spread it (e.g., by retweeting
it, voting for or liking it, ...)
[Diagram labels: cognitive, tastes, retweet, content, interface]
34
Cognitive factors: Position bias
  • People pay more attention to items at the top of
    the screen or a list of items

Payne, The Art of Asking Questions (1951)
Counts & Fisher, ICWSM'11
Buscher et al., CHI'09
limits how far down the list/page the user
navigates
35
Measuring position bias
  • Amazon Mechanical Turk experiments
  • Users were asked to recommend science stories
  • We controlled the order stories were presented to
    users

Position bias: stories at top list positions
received more recommendations
Lerman & Hogg (2014), Leveraging position bias
to improve peer recommendation, PLoS ONE.
36
Position bias creates a limited attention
post visibility
new post at top of user's screen
post near the top is most likely to be seen
37
Position bias creates a limited attention
some time later newer posts appear at the top
post is less likely to be seen
38
Position bias and number of friends
many friends
few friends
some time later newer posts appear at the top
post is less likely to be seen
same age post is even less visible to a highly
connected user
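
A rough sketch of the argument on these slides, under loudly stated assumptions: new posts are assumed to arrive at a rate proportional to the number of friends, each new post pushes older ones down by one position, and visibility is assumed to fall off as a power of position. None of these functional forms or parameter values comes from the slides; they are only there to make the qualitative point concrete.

```python
def expected_position(age_hours, num_friends, posts_per_friend_per_hour=0.5):
    """Assumed queue position of a post: 1 (top) plus the number of
    newer posts that arrived since it was posted."""
    return 1.0 + num_friends * posts_per_friend_per_hour * age_hours


def visibility(position, decay=1.0):
    """Assumed visibility: a power-law decay with list position."""
    return position ** (-decay)


# Same post age, different numbers of friends.
for friends in (10, 1000):
    pos = expected_position(age_hours=1.0, num_friends=friends)
    print(friends, "friends -> position", pos, "visibility", visibility(pos))
```

Under these assumptions a post of the same age sits far deeper in a hub's stream, so it is much less visible there than in the stream of a user with few friends.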
39
Friends are a source of distraction
users with more friends are more active
users with more friends are distracted by more
content
[Figure axis: n_f, number of friends]
  • Limited attention makes hubs less susceptible to
    infection

40
Users retweet most recent messages
high connectivity users
Time Response Function
low connectivity users
  • Users retweet newest messages (at the top of
    their screen)
  • Hubs are much less likely to retweet an older
    message

41
Does content matter?
[Figure labels: visibility; probability to tweet a message; virality]
42
Do viral messages spread farther?
ln(virality)
viral messages can reach many or few people
43
How do people respond to multiple exposures?
Exposure response
Number of tweeting friends
  • Is this evidence for complex contagion?

44
Complex contagion: an artifact of heterogeneity
low connectivity users
high connectivity users
  • Breaking down exposure response by different
    sub-populations, separated according to number of
    friends they follow, reveals simple, monotonic
    response
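
A small numerical illustration of how this kind of artifact can arise. All numbers are invented for illustration and are not from the study: two sub-populations each have a strictly increasing exposure response, but users with few friends dominate the low-exposure counts while hubs dominate the high-exposure counts, so the pooled curve is not monotonic.

```python
def low_connectivity_response(k):
    """Invented monotonic response for users who follow few friends."""
    return 0.010 + 0.002 * k


def high_connectivity_response(k):
    """Invented monotonic (but much weaker) response for hubs."""
    return 0.001 + 0.0005 * k


responses = {"low": low_connectivity_response, "high": high_connectivity_response}

# Invented counts of users observed with k exposures in each group.
exposed_users = {
    "low": {1: 1000, 2: 500, 3: 100, 4: 10, 5: 0},
    "high": {1: 100, 2: 300, 3: 500, 4: 700, 5: 900},
}


def aggregate_response(k):
    """Pooled exposure response: adopters over exposed users, all groups."""
    total = sum(exposed_users[group][k] for group in exposed_users)
    adopters = sum(
        exposed_users[group][k] * responses[group](k) for group in exposed_users
    )
    return adopters / total


aggregate = [aggregate_response(k) for k in range(1, 6)]
for k, value in enumerate(aggregate, start=1):
    print(k, round(value, 5))
```

The pooled curve falls as k grows even though each group's own response rises, which is why the breakdown by number of friends matters before reading the aggregate as evidence of complex contagion.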

45
Summary
  • A meme is not a virus
    - Information spread ≠ disease spread
    - Big consequences for modeling information
      spread in social media
  • Highly connected people (hubs) act as firewalls
    to information spread
    - They have a hard time finding messages in their
      stream
    - People have a finite capacity to process
      information: the more messages they receive,
      the less likely they are to respond to any
      given one
  • Information overload actually reduces the size of
    information cascades