A Study of social influence in diffusion of innovation over Facebook

Title: A Study of social influence in diffusion of innovation over Facebook


1
A Study of social influence in diffusion of
innovation over Facebook
  • Shaomei Wu
  • sw475_at_cornell.edu
  • Information Science
  • Cornell University
  • Information Science Breakfast, Dec 5, 2008

2
Diffusion of Innovation
  • "Diffusion is the process in which an innovation
    is communicated through certain channels over
    time among the members of a social system."
    (Everett M. Rogers)
  • Innovation: Friendship Quiz, a Facebook
    application
  • Communicated: invitations among Facebook
    friends
  • Time: September 25, 2008 to now
  • Social system: Facebook

Rogers, Everett M. (2003). Diffusion of
Innovations, 5th ed. New York, NY: Free Press,
pp. 5-6.
3
Basic Diffusion Models
Threshold Model
Cascade Model
?
Statistically Equivalent
David Kempe, Jon Kleinberg, Eva Tardos.
Maximizing the Spread of Influence through a
Social Network. KDD, 2003
4
Cascade Model
  • Each recommendation will succeed with certain
    probability.

[Figure: example network with nodes a through l.
Edge labels: p_ab, p_ac, p_ad, p_ae, p_af, p_ag,
p_di, p_dj, p_gk, p_gl. Legend: non-adopter,
adopter, social link, recommendation.]
Question: how to estimate p_uv?
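The cascade model on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: it assumes the independent cascade formulation from Kempe et al., in which each new adopter gets one attempt per neighbor. The function name `simulate_cascade`, the toy edges, and the probability values are all hypothetical; the probabilities are set to 0 or 1 only so that the outcome is deterministic.

```python
import random

def simulate_cascade(p, seeds, rng=None):
    """Run one independent-cascade simulation.

    p     -- dict mapping a directed pair (u, v) to the probability that a
             recommendation from u to v succeeds (hypothetical values).
    seeds -- iterable of initial adopters.
    Returns the set of all adopters when the cascade stops.
    """
    rng = rng or random.Random()
    # Group outgoing recommendations by sender for quick lookup.
    out = {}
    for (u, v), prob in p.items():
        out.setdefault(u, []).append((v, prob))

    adopters = set(seeds)
    frontier = list(adopters)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, prob in out.get(u, []):
                # Each new adopter gets a single attempt per neighbor.
                if v not in adopters and rng.random() < prob:
                    adopters.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return adopters

# Toy edges loosely named after the slide's labels; probabilities are made up.
p_example = {
    ("a", "b"): 1.0,
    ("a", "c"): 0.0,
    ("a", "d"): 1.0,
    ("d", "i"): 1.0,
    ("d", "j"): 0.0,
}
result = simulate_cascade(p_example, seeds=["a"], rng=random.Random(0))
```

With realistic probabilities between 0 and 1, the result varies from run to run, so the expected spread is usually estimated by averaging over many simulations.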
5
Question: how to estimate p_uv?
  • Current practice
  • Constant [1]
  • Based ONLY on network structure (e.g.,
    in/out-degree) [2]

Do individuals and the social relationships among
them matter?
[1] Jure Leskovec, Mary McGlohon, Christos
Faloutsos, Natalie Glance, Matthew Hurst.
Cascading Behavior in Large Blog Graphs. SDM
2007.
[2] Jure Leskovec, Lada Adamic, Bernardo
Huberman. The Dynamics of Viral Marketing. ACM
Conference on Electronic Commerce (EC) 2006.
6
Theories from Empirical Diffusion Research
  • Opinion leaders have greater exposure to
    mass media than their followers, are more
    cosmopolite, have greater social participation,
    have higher socioeconomic status, and are
    more innovative (Rogers 2003, pp. 316-318).
  • Heterophily between participants on certain
    attributes (i.e., education and socioeconomic
    status) is important in determining the
    efficiency of diffusion, even though more
    effective communication occurs when two or
    more individuals are homophilous (Rogers 2003,
    p. 19).

7
This project aims to
  • Model p_uv for the cascade model
  • Identify the most influential factors in
    determining p_uv
  • Predict the success of contagion
  • Exploit Facebook data
  • A real-world, ongoing diffusion instance
  • Rich and (most of the time) trustworthy profile
    information of individuals and their social
    connections/activities
  • Precisely timestamped diffusion process, a
    complete log of events

8
Status
  • Launched Sep 25, 2008.
  • Data currently in use runs through Nov 25, 2008.
  • 216 adopters,
  • 375 individuals,
  • 737 edges between 266 pairs of people,
  • 90 successful infections,
  • 178 failed infections.
  • Network evolution (in the first month after
    release)
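As a point of reference for the "constant" practice mentioned earlier, a single probability can be estimated from the counts above as the fraction of successful infections. This is a rough sketch, assuming each successful or failed infection corresponds to one recommendation attempt (the slides do not state this); `constant_probability` is a hypothetical helper name.

```python
def constant_probability(successes, failures):
    """Maximum-likelihood estimate of a single success probability."""
    attempts = successes + failures
    if attempts == 0:
        raise ValueError("no attempts observed")
    return successes / attempts

# Counts from the Status slide: 90 successful and 178 failed infections.
p_constant = constant_probability(90, 178)
print(f"{p_constant:.3f}")  # 0.336
```

Such a constant ignores who sends and who receives each invitation, which is the gap the per-pair p_uv model is meant to address.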

9
(No Transcript)
10
Predict the success of invitations with SVM
  • A binary classifier
  • Each invitation is either successful or failed.
  • Features
  • Individual features
  • Pair features (homophily/heterophily)

11
Individual Features
  • # of events attended/invited
  • # of photos tagged
  • # of wall posts
  • # of networks
  • # of groups participated
  • # of notes
  • Religion
  • Political View
  • Gender
  • Age
  • Culture Background
  • Relationship Status
  • Work Info
  • Education Info
Categories: Social Activeness, Innovativeness,
Socioeconomics, Education
12
Pair-wise Features
  • Age difference
  • Same gender?
  • Same political view?
  • Same religion?
  • Same culture background?
  • # of same networks
  • # of photos both tagged
  • # of groups both participated
  • # of events both attended
  • Same education level?
  • Same high school?
  • Same college?
  • Same workplace?
  • Same current city?
Categories: Biological traits, Belief,
Socioeconomics, Proximity
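As a rough illustration of how a few of the pair-wise features above might be computed, here is a sketch that assumes each profile is a plain dict. The keys `age`, `gender`, `groups`, and `networks`, the helper name `pair_features`, and the sample values are all invented for illustration and do not reflect the study's actual data schema.

```python
def pair_features(sender, receiver):
    """Compute a few homophily/heterophily features for one invitation."""
    return {
        "age_difference": abs(sender["age"] - receiver["age"]),
        # Yes/no questions are encoded as 1/0.
        "same_gender": int(sender["gender"] == receiver["gender"]),
        # Counts of shared memberships use set intersection.
        "common_group_count": len(set(sender["groups"]) & set(receiver["groups"])),
        "common_network_count": len(set(sender["networks"]) & set(receiver["networks"])),
    }

# Hypothetical profiles.
sender = {"age": 24, "gender": "female",
          "groups": ["chess", "hiking"], "networks": ["Cornell"]}
receiver = {"age": 21, "gender": "male",
            "groups": ["hiking"], "networks": ["Cornell", "Ithaca"]}
features = pair_features(sender, receiver)
```

Real profiles often have missing fields, so a fuller version would need a policy for absent values.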
13
Each invitation is a training example for machine
learning.
Training Data
All numerical features are normalized across
examples.
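The slide does not say which normalization was used. One common choice is min-max scaling, sketched below under that assumption; `min_max_normalize` and the sample rows are hypothetical.

```python
def min_max_normalize(rows):
    """Scale each column of a list of equal-length numeric rows to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in rows:
        normalized.append([
            # A constant column carries no information; map it to 0.0.
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, lows, highs)
        ])
    return normalized

# Hypothetical examples: [sender_wall_post_count, receiver_age]
examples = [[10, 20], [30, 25], [50, 30]]
scaled = min_max_normalize(examples)
```

Putting features on a common scale keeps large-valued counts from dominating the SVM's distance computations.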
14
AdaBoost (with DecisionStump): a popular way
to do feature selection.
  • Selected Features
  • sender wall post count
  • sender group count
  • sender network count
  • receiver age
  • receiver group count
  • sender-receiver common group count
  • Performance (10-fold cross validation)
  • Accuracy: 83.6%
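
To show why boosting over decision stumps doubles as feature selection, here is a minimal pure-Python sketch. Each round picks the single-feature threshold rule with the lowest weighted error, so the features the stumps use form the selected set. This is not the tool used in the study; the helper names and the toy data are assumptions.

```python
import math

def best_stump(X, y, w):
    """Find the single-feature threshold rule with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] > thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def adaboost_select(X, y, rounds):
    """Run AdaBoost with stumps; return the feature index used in each round.

    X -- list of numeric feature rows; y -- labels in {+1, -1}.
    """
    n = len(X)
    w = [1.0 / n] * n
    chosen = []
    for _ in range(rounds):
        err, j, thr, sign = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        chosen.append(j)
        # Up-weight misclassified examples, down-weight correct ones.
        preds = [sign if row[j] > thr else -sign for row in X]
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return chosen

# Toy data: feature 1 separates the classes perfectly, feature 0 does not.
X = [[1, 0], [2, 0], [1, 5], [2, 5]]
y = [-1, -1, 1, 1]
selected = adaboost_select(X, y, rounds=2)
```

On this toy data every round picks feature 1, the only informative column.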

15
SVM performance
  • SVM-light (10-fold cross-validation)
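
Cross-validation splits the examples into k folds and tests on each fold in turn while training on the rest. Below is a minimal sketch of the index bookkeeping, assuming contiguous folds without shuffling (the slides do not say how folds were drawn). `k_fold_indices` is a hypothetical helper; the usage lines apply it to the 209 records and 5 folds reported on the Result slide.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over n examples."""
    base, extra = divmod(n, k)
    start = 0
    for fold in range(k):
        # Spread the remainder over the first `extra` folds.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

folds = list(k_fold_indices(209, 5))
fold_sizes = [len(test) for _, test in folds]
print(fold_sizes)  # [42, 42, 42, 42, 41]
```

A 42-example fold matches the test-set size quoted on the Result slide.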

16
Weights from SVM
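For readers unfamiliar with reading SVM weights: assuming a linear kernel, the classifier scores an example as the weighted sum of its features plus a bias, so features with larger absolute weight move the score more. The sketch below uses feature names that appear in these slides, but the weight, bias, and input values are made up for illustration.

```python
def decision_value(weights, bias, x):
    """Linear SVM score: positive predicts success, negative predicts failure."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def rank_features(names, weights):
    """Sort (name, weight) pairs by absolute weight, largest first."""
    return sorted(zip(names, weights), key=lambda pair: abs(pair[1]), reverse=True)

names = ["receiver_age", "sender_friend_count", "sender_wall_post_count"]
weights = [0.1, 0.8, -0.5]  # made-up values, not the study's weights
ranking = rank_features(names, weights)
score = decision_value(weights, bias=-0.2, x=[0.3, 1.0, 0.5])
```

A negative weight, as on the wall post count here, means larger feature values push the prediction toward failure.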
17
Result
  • SVM-light performance
  • 209 records into 5 folds, 4 for training, 1 for
    testing.
  • Performance on the testing set
  • Accuracy: 71.43% (30 correct, 12 incorrect, 42
    total)
  • Precision/recall: 55.56%/38.46%
  • Feature weights distribution

Top weighted features:
  • 8. sender_events_invited
  • 4. sender_friend_count
  • 11. sender_gender
  • 35. receiver_is_It's Complicated
  • 5. sender_wall_post_count
  • 9. sender_note_count
  • 27. sender_is_In a Relationship
So, the story can be: when a sender who has been
invited to a greater number of events on Facebook,
has more friends, wrote more Facebook notes (blog
entries), is female, has fewer wall posts, and is
in a relationship tries to infect a person whose
relationship status is "It's Complicated", the
infection is more likely to happen than in other
cases.
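The accuracy, precision, and recall reported on this slide can be reproduced from a confusion matrix. The slide gives only the totals (30 correct, 12 incorrect, 42 total). The counts below are inferred as the one split consistent with all three reported percentages; they are not stated in the slides. `classification_metrics` is a hypothetical helper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, and recall as percentages."""
    total = tp + fp + fn + tn
    accuracy = 100.0 * (tp + tn) / total
    precision = 100.0 * tp / (tp + fp) if tp + fp else 0.0
    recall = 100.0 * tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Inferred counts: 5 true positives, 4 false positives,
# 8 false negatives, 25 true negatives (30 correct, 12 incorrect).
accuracy, precision, recall = classification_metrics(tp=5, fp=4, fn=8, tn=25)
print(f"{accuracy:.2f} {precision:.2f} {recall:.2f}")  # 71.43 55.56 38.46
```

Under this reading the classifier caught only 5 of the 13 successful invitations in the test fold, which is why recall sits well below accuracy.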
18
SVM with features selected by AdaBoost
19
Background
  • Diffusion of Innovation
  • Question
  • How does it work in large online social networks?
  • What are the key factors in determining the
    success of infection?
  • Can we predict the propagation path?

20
Hypothesis
  • Social influence depends on 5 dimensions of
    similarity
  • Geographical distance: current location
    (country/state/city), current school, current
    major, year of class, current workplace,
    current courses enrolled
  • Background similarity: sex, sexual preference,
    dating interest, relationship interest,
    relationship status, birthday, political view,
    religious view, hometown address, previous
    school, previous workplace
  • Social similarity: number of mutual networks
    they belong to, number of mutual friends
  • Interest similarity: activities, favorite
    books, favorite music, favorite movies,
    favorite TV shows, favorite quotes
  • Social status distance: difference in number
    of friends, difference in wall post counts,
    difference in counts of messages sent and
    received, difference in counts of notes

21
Project Description
  • Objectives
  • Identify the key factors for social influence
  • Predict occurrence of adoption based on the key
    factors.
  • Friendship Quiz
  • A Facebook application we developed
  • Enables users to make quizzes and send them to
    their friends (take a peek!)
  • We track the spread of the application.

22
Highlights
  • A real-world diffusion of innovation
  • Rich and (most of the time) trustworthy profile
    information of individuals and their social
    connections/activities
  • Precisely timestamped diffusion process, a
    complete log of events
  • Ongoing diffusion process

23
Backup: Threshold Model