Predictively Modeling Social Text - PowerPoint PPT Presentation

About This Presentation
Title:

Predictively Modeling Social Text

Description:

Predictively Modeling Social Text William W. Cohen Machine Learning Dept. and Language Technologies Institute School of Computer Science Carnegie Mellon University – PowerPoint PPT presentation

Slides: 76
Learn more at: http://www.cs.cmu.edu

Transcript and Presenter's Notes

Title: Predictively Modeling Social Text


1
Predictively Modeling Social Text
  • William W. Cohen
  • Machine Learning Dept. and Language Technologies
    Institute
  • School of Computer Science
  • Carnegie Mellon University
  • Joint work with Amr Ahmed, Andrew Arnold,
    Ramnath Balasubramanyan, Frank Lin, Matt Hurst
    (MSFT), Ramesh Nallapati, Noah Smith, Eric Xing,
    Tae Yano

2
Newswire Text
  • Formal
  • Primary purpose: inform typical reader about recent events
  • Broad audience
  • Explicitly establish shared context with reader
  • Ambiguity often avoided
Social Media Text
  • Informal
  • Many purposes: entertain, connect, persuade
  • Narrow audience: friends and colleagues
  • Shared context already established
  • Many statements are ambiguous out of social
    context

3
Newswire Text
  • Goals of analysis: extract information about events from text
  • Understanding text requires understanding the
    typical reader: conventions for communicating
    with him/her, prior knowledge, background, ...
Social Media Text
  • Goals of analysis: very diverse
  • Evaluation is difficult, and requires revisiting
    often as goals evolve
  • Often understanding social text requires
    understanding a community

4
Outline
  • Tools for analysis of text
  • Probabilistic models for text, communities, and
    time
  • Mixture models and LDA models for text
  • LDA extensions to model hyperlink structure
  • LDA extensions to model time
  • Alternative framework based on graph analysis to
    model time and community
  • Preliminary results and tradeoffs
  • Discussion of results and challenges

5
Introduction to Topic Models
  • Mixture model: unsupervised naïve Bayes model
  • Joint probability of words and classes
  • But classes are not visible

[Plate diagram: nodes C, Z, W; plates N and M; parameter β]
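The joint probability named on this slide can be written out as below. This is a sketch in standard mixture-of-multinomials notation, which is assumed here rather than taken from the slide: \pi is the class prior, \beta_c is the word distribution of class c, and w_1, ..., w_N are the words of one document.

```latex
P(c, w_1, \dots, w_N) = \pi_c \prod_{n=1}^{N} \beta_{c, w_n}
\qquad
P(w_1, \dots, w_N) = \sum_{c} \pi_c \prod_{n=1}^{N} \beta_{c, w_n}
```

Because the class c is not observed, learning has to work with the second (marginal) form.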
6
Introduction to Topic Models
7
Introduction to Topic Models
  • Probabilistic Latent Semantic Analysis (PLSA) model
  • Select document d ~ Mult(π)
  • For each position n = 1, ..., N_d
      • generate z_n ~ Mult(θ_d)
      • generate w_n ~ Mult(β_{z_n})
  • θ_d: topic distribution of document d
  • Mixture model: each document is generated by a
    single (unknown) multinomial distribution of
    words; the corpus is mixed by π
  • PLSA model: each word is generated by a single
    unknown multinomial distribution of words; each
    document is mixed by θ_d

[Plate diagram: d, θ_d, z, w; plates N and M; parameters π, β]
8
Introduction to Topic Models
  • PLSA topics (TDT-1 corpus)

9
Introduction to Topic Models
JMLR, 2003
10
Introduction to Topic Models
  • Latent Dirichlet Allocation
  • For each document d = 1, ..., M
      • Generate θ_d ~ Dir(α)
      • For each position n = 1, ..., N_d
          • generate z_n ~ Mult(θ_d)
          • generate w_n ~ Mult(β_{z_n})

[Plate diagram: α, θ, z, w; plates N and M; parameter β]
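The generative steps on this slide can be sketched in a few lines of NumPy. The function name `generate_lda_corpus` and its arguments are hypothetical; `alpha` is the Dirichlet prior and `beta` is assumed to be a K x V matrix whose row k is the word distribution of topic k.

```python
import numpy as np

def generate_lda_corpus(n_docs, doc_len, alpha, beta, seed=0):
    """Sample documents from the LDA generative process (illustrative sketch).

    alpha: (K,) Dirichlet prior over topics.
    beta:  (K, V) topic-word distributions, each row summing to 1.
    """
    rng = np.random.default_rng(seed)
    n_topics, vocab_size = beta.shape
    docs = []
    for _ in range(n_docs):
        theta = rng.dirichlet(alpha)                     # theta_d ~ Dir(alpha)
        z = rng.choice(n_topics, size=doc_len, p=theta)  # z_n ~ Mult(theta_d)
        # w_n ~ Mult(beta[z_n]): one word drawn from the chosen topic
        w = np.array([rng.choice(vocab_size, p=beta[k]) for k in z])
        docs.append((theta, z, w))
    return docs
```

Each returned tuple holds the document's topic mixture, the per-position topic assignments, and the generated word ids.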
11
Introduction to Topic Models
  • Latent Dirichlet Allocation
  • Overcomes some technical issues with PLSA
      • PLSA only estimates mixing parameters for
        training docs
  • Parameter learning is more complicated
      • Gibbs sampling: easy to program, often slow
      • Variational EM
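To illustrate why Gibbs sampling is described as easy to program but often slow, here is a minimal sketch of the widely used collapsed Gibbs sampler for LDA. It is not necessarily the sampler used in the work discussed in this talk; the function name `lda_gibbs` and the hyperparameters `alpha` and `eta` are assumptions for illustration.

```python
import numpy as np

def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, eta=0.01, n_iters=50, seed=0):
    """Collapsed Gibbs sampling for LDA; docs are lists of word ids."""
    rng = np.random.default_rng(seed)
    ndk = np.zeros((len(docs), n_topics))   # topic counts per document
    nkw = np.zeros((n_topics, vocab_size))  # word counts per topic
    nk = np.zeros(n_topics)                 # total tokens per topic
    z = [rng.integers(n_topics, size=len(doc)) for doc in docs]  # random init
    for d, doc in enumerate(docs):
        for w, k in zip(doc, z[d]):
            ndk[d, k] += 1
            nkw[k, w] += 1
            nk[k] += 1
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                k = z[d][n]
                # Remove the token's current assignment from the counts.
                ndk[d, k] -= 1
                nkw[k, w] -= 1
                nk[k] -= 1
                # Conditional distribution over topics for this token.
                p = (ndk[d] + alpha) * (nkw[:, w] + eta) / (nk + vocab_size * eta)
                k = rng.choice(n_topics, p=p / p.sum())
                z[d][n] = k
                ndk[d, k] += 1
                nkw[k, w] += 1
                nk[k] += 1
    # Smoothed estimate of each document's topic mixture.
    theta = (ndk + alpha) / (ndk + alpha).sum(axis=1, keepdims=True)
    return theta, nkw
```

The sampler is only a few lines long, but every sweep touches every token in the corpus one at a time, which is where the cost comes from.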

12
Introduction to Topic Models
  • Perplexity comparison of various models

[Figure: perplexity of the unigram, mixture model, PLSA, and LDA models; lower is better]
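For reference, perplexity is the exponentiated negative average log-probability that a model assigns to held-out tokens. A minimal sketch follows; the function name and the input format (one natural-log probability per token) are assumptions.

```python
import math

def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities of held-out text."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))
```

A model that spreads probability uniformly over four words has perplexity 4 under this definition, so lower values mean the model is less surprised by the held-out text.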
13
Introduction to Topic Models
  • Prediction accuracy for classification, using
    topic-model output as features for learning

[Figure: classification accuracy; higher is better]
14
Outline
  • Tools for analysis of text
  • Probabilistic models for text, communities, and
    time
  • Mixture models and LDA models for text
  • LDA extensions to model hyperlink structure
  • LDA extensions to model time
  • Alternative framework based on graph analysis to
    model time and community
  • Preliminary results and tradeoffs
  • Discussion of results and challenges

15
Hyperlink modeling using PLSA
16
Hyperlink modeling using PLSA (Cohn and Hofmann,
NIPS, 2001)
  • Select document d ~ Mult(π)
  • For each position n = 1, ..., N_d
      • generate z_n ~ Mult(θ_d)
      • generate w_n ~ Mult(β_{z_n})
  • For each citation j = 1, ..., L_d
      • generate z_j ~ Mult(θ_d)
      • generate c_j ~ Mult(γ_{z_j})

[Plate diagram: d, θ_d, z, w, c; plates N, L, and M; parameters π, β, γ]
17
Hyperlink modeling using PLSA (Cohn and Hofmann,
NIPS, 2001)
[Figure: plate diagram from the previous slide, shown with the PLSA likelihood and the new likelihood]
  • Learning using EM
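Written out from the generative steps for words and citations, the two likelihoods take roughly the following form. This is a sketch derived from those steps, not a copy of the slide's equations; \theta_d (topic mixture of document d), \beta_z (word distribution of topic z), and \gamma_z (citation distribution of topic z) are assumed notation.

```latex
% PLSA likelihood of the words in document d
P(w_{1:N_d} \mid d) = \prod_{n=1}^{N_d} \sum_{z} \theta_{d,z}\, \beta_{z, w_n}

% New likelihood: words and citations share the same mixture \theta_d
P(w_{1:N_d}, c_{1:L_d} \mid d) =
  \prod_{n=1}^{N_d} \sum_{z} \theta_{d,z}\, \beta_{z, w_n}
  \;\times\;
  \prod_{j=1}^{L_d} \sum_{z} \theta_{d,z}\, \gamma_{z, c_j}
```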
18
Hyperlink modeling using PLSA (Cohn and Hofmann,
NIPS, 2001)
  • Heuristic: weight the content term of the
    likelihood by α and the hyperlink term by (1 − α)
  • 0 ≤ α ≤ 1 determines the relative importance of
    content and hyperlinks
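One way to write this heuristic is as a weighted log-likelihood. The form below is a simplified sketch: it assumes the weight is called \alpha, uses \theta, \beta, and \gamma as assumed names for the topic mixture, word distributions, and citation distributions, and omits any per-document count normalization that the original paper may apply.

```latex
\mathcal{L} = \sum_{d} \Big[
  \alpha \sum_{n=1}^{N_d} \log \sum_{z} \theta_{d,z}\, \beta_{z, w_n}
  + (1 - \alpha) \sum_{j=1}^{L_d} \log \sum_{z} \theta_{d,z}\, \gamma_{z, c_j}
\Big]
```

With \alpha = 1 only the content term contributes; with \alpha = 0 only the hyperlink term does.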
19
Hyperlink modeling using PLSA (Cohn and Hofmann,
NIPS, 2001)
  • Experiments: text classification
  • Datasets
      • WebKB: 6000 CS dept web pages with hyperlinks;
        6 classes (faculty, course, student, staff, etc.)
      • Cora: 2000 machine learning abstracts with
        citations; 7 classes (sub-areas of machine
        learning)
  • Methodology
      • Learn the model on complete data and obtain θ_d
        for each document
      • Test documents classified into the label of the
        nearest neighbor in training set
      • Distance measured as cosine similarity in the θ
        space
      • Measure the performance as a function of α
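The nearest-neighbor step of this methodology can be sketched as follows, assuming each document is represented by its topic-mixture vector. The function and argument names are hypothetical.

```python
import numpy as np

def nearest_neighbor_labels(theta_train, labels_train, theta_test):
    """Label each test document with the label of its most similar training
    document, using cosine similarity between topic-mixture vectors."""
    a = theta_train / np.linalg.norm(theta_train, axis=1, keepdims=True)
    b = theta_test / np.linalg.norm(theta_test, axis=1, keepdims=True)
    sims = b @ a.T  # sims[i, j] = cosine similarity of test doc i and train doc j
    return [labels_train[j] for j in sims.argmax(axis=1)]
```

Repeating this for models trained with different weightings of content and hyperlinks gives the performance curve the last bullet refers to.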

20
Hyperlink modeling using PLSA (Cohn and Hofmann,
NIPS, 2001)
  • Classification performance

[Figure: two plots of classification performance, each with an axis running between content and hyperlink]
21
Hyperlink modeling using LDA
22
Hyperlink modeling using LinkLDA (Erosheva,
Fienberg, Lafferty, PNAS, 2004)
  • For each document d = 1, ..., M
      • Generate θ_d ~ Dir(α)
      • For each position n = 1, ..., N_d
          • generate z_n ~ Mult(θ_d)
          • generate w_n ~ Mult(β_{z_n})
      • For each citation j = 1, ..., L_d
          • generate z_j ~ Mult(θ_d)
          • generate c_j ~ Mult(γ_{z_j})
  • Learning using variational EM

[Plate diagram: α, θ, z, w, c; plates N, L, and M; parameters β, γ]
23
Hyperlink modeling using LDA (Erosheva, Fienberg,
Lafferty, PNAS, 2004)
24
Newswire Text
  • Goals of analysis: extract information about events from text
  • Understanding text requires understanding the
    typical reader: conventions for communicating
    with him/her, prior knowledge, background, ...
Social Media Text
  • Goals of analysis: very diverse
  • Evaluation is difficult, and requires revisiting
    often as goals evolve
  • Often understanding social text requires
    understanding a community

Science as a testbed for social text: an open
community which we understand
25
Author-Topic Model for Scientific Literature


26
Author-Topic Model for Scientific Literature
(Rosen-Zvi, Griffiths, Steyvers, Smyth, UAI, 2004)
  • For each author a = 1, ..., A
      • Generate θ_a ~ Dir(α)
  • For each topic k = 1, ..., K
      • Generate φ_k ~ Dir(β)
  • For each document d = 1, ..., M
      • For each position n = 1, ..., N_d
          • Generate author x ~ Unif(a_d)
          • generate z_n ~ Mult(θ_x)
          • generate w_n ~ Mult(φ_{z_n})

[Plate diagram: α, a_d, x, z, θ, w, φ, β; plates A, N, M, and K]
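The per-document part of this generative process can be sketched as below. The function name and arguments are hypothetical: `theta` is assumed to be an A x K matrix of author-topic distributions and `phi` a K x V matrix of topic-word distributions.

```python
import numpy as np

def generate_author_topic_doc(authors, theta, phi, doc_len, rng):
    """Sample one document from the author-topic generative process (sketch).

    authors: ids of the document's authors (the set a_d).
    theta:   (A, K) author-topic distributions.
    phi:     (K, V) topic-word distributions.
    """
    # For each position, pick one of the document's authors uniformly.
    x = rng.choice(authors, size=doc_len)
    # The topic is drawn from the chosen author's topic distribution.
    z = np.array([rng.choice(theta.shape[1], p=theta[a]) for a in x])
    # The word is drawn from the chosen topic's word distribution.
    w = np.array([rng.choice(phi.shape[1], p=phi[k]) for k in z])
    return x, z, w
```

Unlike plain LDA, the topic mixture here belongs to an author rather than to a document, so a document's topics are determined by who wrote it.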
27
Author-Topic Model for Scientific Literature
(Rosen-Zvi, Griffiths, Steyvers, Smyth, UAI, 2004)
  • Learning: Gibbs sampling

[Plate diagram as on the previous slide]
28
Author-Topic Model for Scientific Literature
(Rosen-Zvi, Griffiths, Steyvers, Smyth, UAI, 2004)
  • Perplexity results

29
Author-Topic Model for Scientific Literature
(Rosen-Zvi, Griffiths, Steyvers, Smyth, UAI, 2004)
  • Topic-Author visualization

30
Author-Topic Model for Scientific Literature
(Rosen-Zvi, Griffiths, Steyvers, Smyth, UAI, 2004)
  • Application 1: Author similarity

31
Author-Topic Model for Scientific Literature
(Rosen-Zvi, Griffiths, Steyvers, Smyth, UAI, 2004)
  • Application 2: Author entropy
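One plausible reading of author entropy is the Shannon entropy of an author's topic distribution: authors who write across many topics score high, specialists score low. The sketch below assumes that reading; the function name is hypothetical.

```python
import numpy as np

def topic_entropy(theta):
    """Shannon entropy (in bits) of a topic distribution."""
    theta = np.asarray(theta, dtype=float)
    nonzero = theta[theta > 0]  # treat 0 * log(0) as 0
    return float(-(nonzero * np.log2(nonzero)).sum())
```

An author split evenly between two topics has entropy of one bit, while an author who writes on a single topic has entropy zero.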

32
Author-Topic-Recipient model for email data
(McCallum, Corrada-Emmanuel, Wang, IJCAI 2005)
33
Author-Topic-Recipient model for email data
(McCallum, Corrada-Emmanuel, Wang, IJCAI 2005)
  • Learning: Gibbs sampling
34
Author-Topic-Recipient model for email data
(McCallum, Corrada-Emmanuel, Wang, IJCAI 2005)
  • Datasets
      • Enron email data: 23,488 messages between 147
        users
      • McCallum's personal email: 23,488(?) messages
        with 128 authors

35
Author-Topic-Recipient model for email data
(McCallum, Corrada-Emmanuel, Wang, IJCAI 2005)
  • Topic visualization: Enron set

36
Author-Topic-Recipient model for email data
(McCallum, Corrada-Emmanuel, Wang, IJCAI 2005)
  • Topic visualization: McCallum's data

37
Author-Topic-Recipient model for email data
(McCallum, Corrada-Emmanuel, Wang, IJCAI 2005)
38
Modeling Citation Influences
39
Modeling Citation Influences (Dietz, Bickel,
Scheffer, ICML 2007)
  • Copycat model of citation influence
  • LDA model for cited papers
  • Extended LDA model for citing papers
  • For each word, depending on coin flip c, you
    might choose to copy a word from a cited paper
    instead of generating the word
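The coin-flip idea in the last bullet can be illustrated with a deliberately simplified sketch. It follows the wording of the slide (copy a literal word from a cited paper, or generate one) and is not the published model; the function name, the single `copy_prob` parameter, and the use of one fixed word distribution for the citing paper are all assumptions.

```python
import numpy as np

def generate_citing_words(cited_docs, own_word_dist, doc_len, copy_prob, rng):
    """Generate a citing document word by word (simplified illustration).

    cited_docs:    list of cited documents, each a list of word ids.
    own_word_dist: word distribution used when the word is not copied.
    copy_prob:     probability that the coin flip c says "copy".
    """
    words = []
    for _ in range(doc_len):
        if rng.random() < copy_prob:  # coin flip c
            cited = cited_docs[rng.integers(len(cited_docs))]
            words.append(int(cited[rng.integers(len(cited))]))
        else:
            words.append(int(rng.choice(len(own_word_dist), p=own_word_dist)))
    return words
```

Inferring which cited paper each word was copied from is what lets a model of this kind estimate how strongly each citation influenced the citing paper.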

40
Modeling Citation Influences (Dietz, Bickel,
Scheffer, ICML 2007)
  • Citation influence model

41
Modeling Citation Influences (Dietz, Bickel,
Scheffer, ICML 2007)
  • Citation influence graph for LDA paper

42
Models of hypertext for blogs (ICWSM 2008)
Ramesh Nallapati
me
43
Link-PLSA-LDA
  • LinkLDA model for citing documents
  • Variant of PLSA model for cited documents
  • Topics are shared between citing, cited
  • Links depend on topics in two documents
44
Experiments
  • 8.4M blog postings in Nielsen/Buzzmetrics corpus
  • Collected over three weeks in summer 2005
  • Selected all postings with >2 inlinks or >2
    outlinks
  • 2248 citing (2 outlinks), 1777 cited documents
    (2 inlinks)
  • Only 68 in both sets, which are duplicated
  • Fit model using variational EM

45
Topics in blogs
Model can answer questions like which blogs are
most likely to be cited when discussing topic z?
46
Topics in blogs
Model can be evaluated by predicting which links
an author will include in an article

[Figure: link-prediction results for Link-LDA and Link-PLSA-LDA; lower is better]
47
Another model: Pairwise Link-LDA
  • LDA for both cited and citing documents
  • Generate an indicator for every pair of docs
      • Vs. generating pairs of docs
  • Link depends on the mixing components (θs)
      • stochastic block model

48
Pairwise Link-LDA supports new inferences
but doesn't perform better on link prediction
49
Outline
  • Tools for analysis of text
  • Probabilistic models for text, communities, and
    time
  • Mixture models and LDA models for text
  • LDA extensions to model hyperlink structure
  • Observation: these models can be used for many
    purposes
  • LDA extensions to model time
  • Alternative framework based on graph analysis to
    model time and community
  • Discussion of results and challenges

50
(No Transcript)
51
Authors are using a number of clever tricks for
inference.
52
(No Transcript)
53
(No Transcript)
54
(No Transcript)
55
Predicting Response to Political Blog Posts with
Topic Models (NAACL 2009)
Noah Smith
Tae Yano
56
Political blogs and comments
  • Posts are often coupled with comment sections
  • Comment style is casual, creative, less carefully
    edited
57
Political blogs and comments
  • Most of the text associated with large A-list
    community blogs is comments
  • 5-20x as many words in comments as in text for
    the 5 sites considered in Yano et al.
  • A large part of socially-created commentary in
    the blogosphere is comments.
  • Not blog → blog hyperlinks
  • Comments do not just echo the post

58
Modeling political blogs
Our political blog model: CommentLDA
  • z, z′: topic
  • w: word (in post); w′: word (in comments)
  • u: user
  • D: # of documents; N: # of words in post;
    M: # of words in comments
59
Modeling political blogs
Our proposed political blog model
  • LHS is vanilla LDA
  • D: # of documents; N: # of words in post;
    M: # of words in comments
60
Modeling political blogs
Our proposed political blog model
  • RHS to capture the generation of reaction
    separately from the post body
  • Two chambers share the same topic-mixture
  • Two separate sets of word distributions
  • D: # of documents; N: # of words in post;
    M: # of words in comments
61
Modeling political blogs
Our proposed political blog model
  • User IDs of the commenters as a part of comment
    text
  • Generate the words in the comment section
  • D: # of documents; N: # of words in post;
    M: # of words in comments
62
Modeling political blogs
Another model we tried
  • Took out the words from the comment section!
  • The model is structurally equivalent to the
    LinkLDA from (Erosheva et al., 2004)
  • This is a model agnostic to the words in the
    comment section!
  • D: # of documents; N: # of words in post;
    M: # of words in comments
63
Topic discovery - Matthew Yglesias (MY) site
64
Topic discovery - Matthew Yglesias (MY) site
65
Topic discovery - Matthew Yglesias (MY) site
66
Comment prediction
  • LinkLDA and CommentLDA consistently outperform
    baseline models
  • Neither consistently outperforms the other.

[Figure: user prediction, precision at top 10, for the MY, CB, and RS sites. Bars from left to right: Link LDA (-v, -r, -c), Comment LDA (-v, -r, -c), baselines (Freq, NB). Annotated values: 20.54 (Comment LDA (R)), 16.92, and 32.06, with labels Link LDA (R) and Link LDA (C).]
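The evaluation measure named here, precision at top 10, can be computed as in the sketch below. The function name and input format are assumptions: `ranked_users` is the model's ranking of users most likely to comment, and `actual_commenters` is the set of users who did comment on the post.

```python
def precision_at_k(ranked_users, actual_commenters, k=10):
    """Fraction of the top-k predicted users who actually commented."""
    top = ranked_users[:k]
    hits = sum(1 for user in top if user in actual_commenters)
    return hits / k
```

For example, if two of the top four predicted users did comment, `precision_at_k(ranking, commenters, k=4)` is 0.5.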
67
From Episodes to Sagas: Temporally Clustering
News Via Social-Media Commentary
Noah Smith
Matthew Hurst
Frank Lin
Ramnath Balasubramanyan
68
Motivation
  • News-related blogosphere is driven by recency
  • Some recent news is better understood based on
    context of sequence of related stories
  • Some readers have this context; some don't
  • To reconstruct the context, reconstruct the
    sequence of related stories (saga)
  • Similar to retrospective event detection
  • First efforts
      • Find related stories
      • Cluster by time
  • Evaluation: agreement with human annotators

69
Clustering results on Democratic-primary-related
documents
  • k-walks (more later)
  • SpeCluster + time: mixture-of-multinomials
    model for general text + timestamp from a Gaussian
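A clustering model that combines a mixture of multinomials for text with a Gaussian for the timestamp scores a document against each cluster by adding three log terms. The sketch below shows that scoring step only; all names are hypothetical, and fitting the parameters (for example with EM) is not shown.

```python
import numpy as np

def cluster_log_scores(word_ids, timestamp, log_prior, log_word_probs,
                       time_mean, time_std):
    """Unnormalized log-posterior of each cluster for one document.

    log_prior:      (K,) log cluster priors.
    log_word_probs: (K, V) log word probabilities per cluster.
    time_mean, time_std: (K,) parameters of each cluster's Gaussian over time.
    """
    # Multinomial text term: sum of log word probabilities under each cluster.
    text_ll = log_word_probs[:, word_ids].sum(axis=1)
    # Gaussian timestamp term.
    time_ll = (-0.5 * np.log(2 * np.pi * time_std ** 2)
               - (timestamp - time_mean) ** 2 / (2 * time_std ** 2))
    return log_prior + text_ll + time_ll
```

When two clusters explain the words equally well, the timestamp term breaks the tie in favor of the cluster whose time window is closer, which is what pulls stories from the same episode together.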
70
Clustering results on Democratic-primary-related
documents
  • Also had three human annotators build
    gold-standard timelines
  • hierarchical
  • annotated with names of events, times,
  • Can evaluate a machine-produced timeline by
    tree-distance to gold-standard one

71
Clustering results on Democratic-primary-related
documents
  • Issue: divergence of opinion with human
    annotators
      • Is modeling community interests the problem?
      • How much of what we want is actually in the
        data?
      • Should this task be supervised or unsupervised?

72
More sophisticated time models
  • Hierarchical LDA Over Time model
  • LDA to generate text
  • Also generate a timestamp for each document from
    topic-specific Gaussians
  • Non-parametric model
  • Number of clusters is also generated (not
    specified by user)
  • Allows use of user-provided prototypes
  • Evaluated on liberal/conservative blogs and ML
    papers from NIPS conferences

Ramnath Balasubramanyan
73
Results with HOTS model - unsupervised
74
Results with HOTS model - human guidance
  • Adding human seeds for some key events improves
    performance on all events.
  • Allows a user to partially specify a timeline of
    events and have the system complete it.

75
Comments
  • Probabilistic models can model many aspects of
    social text
      • Community (links, comments)
      • Time
  • Evaluation
      • introspective, qualitative on communities we
        understand (scientific communities)
      • quantitative on predictive tasks (link
        prediction, user prediction, ...)
      • against gold-standard visualization (sagas)
Social Media Text
  • Goals of analysis: very diverse
  • Evaluation is difficult, and requires revisiting
    often as goals evolve
  • Often understanding social text requires
    understanding a community