First International Conference on

1
Structural Link Analysis from User Profiles and
Friends Networks: A Feature Construction Approach

William H. Hsu, Joseph Lancaster, Martin S. R.
Paradesi, Tim Weninger
Monday, 26 March 2007
Laboratory for Knowledge Discovery in Databases
Kansas State University
http://www.kddresearch.org/KSU/CIS/ICWSM-20070326.ppt

2
Link Analysis in Social Networks: The K-State Corpus
3
Outline

Background, Related Work and Rationale
Technical Objective: Link Mining in Social Networks
Methodology: Graph Feature Extraction
Experimental Results: K-State LJMiner Corpus
Continuing Work: Statistical Relational Models

4
Problem Statement: Link Mining in Social Networks

Problem Definition
Given: records of users of a weblog or social network service
Discover:
Features of entities: users, communities
Relationships: friendship, membership, moderatorship
Explanations and predictions for relationships
Goals
Boost precision and recall of link existence prediction
Find relevant features
Significance: Recommendations (Friendship, Membership)

5
Related Work: Link Mining

Getoor and Diehl (2005) - Graphical model representations of link structure
Ketkar et al. (2005) - Data mining techniques vs graph-based representation
Sarkar & Moore (2005) - Change in link structure across discrete time steps
Popescul & Ungar (2003) - ER model to predict links
Hill (2003), Bhattacharya & Getoor (2004) - Statistical Relational Learning to resolve identity uncertainty
Resig et al. (2004) - Predicting IM online times using friends graph degree
McCallum et al. (2005) - Inferring roles and topic categories based on link analysis

6
Rationale

Limitations of Current State of the Art
Do not take graph features into account
Limited ability to select, extract features
Novel Contribution: Link Mining System
Extracts, computes features of network model
Towards dependent types for relational link mining
Rationale
Desired functionality: infer new links from old
Evaluation: precision, recall for link existence
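
The evaluation criterion named on this slide can be illustrated with a minimal sketch. It assumes links are represented as (source, target) pairs and that a set of ground-truth links is available; the function name and the toy data are hypothetical and do not come from the LJMiner system.

```python
def precision_recall(predicted, actual):
    """Precision and recall of predicted directed links against known links."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)  # links predicted and really present
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical toy data: each link is a (source, target) pair.
actual_links = {("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")}
predicted_links = {("a", "b"), ("b", "c"), ("d", "a")}

precision, recall = precision_recall(predicted_links, actual_links)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Because the links are directed, the predicted pair ("d", "a") does not count as a match for the actual pair ("a", "d").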

7
Outline

Background, Related Work and Rationale
Technical Objective: Link Mining in Social Networks
Methodology: Graph Feature Extraction
Experimental Results: K-State LJMiner Corpus
Continuing Work: Statistical Relational Models

8
K-State Test Bed: LJMiner Corpus
9
LiveJournal Topology 1: Tools and Security Model
10
LiveJournal Topology 2: Definitions
11
Outline

Background, Related Work and Rationale
Technical Objective: Link Mining in Social Networks
Methodology: Graph Feature Extraction
Experimental Results: K-State LJMiner Corpus
Continuing Work: Statistical Relational Models

12
Graph Features 1: Node, Pair, Link-Dependent
Node-Dependent: Features specific to one node (vertex) within candidate pair
Indegree (v): Target popularity
Indegree (u): Source popularity
Outdegree (u): Source fertility
Outdegree (v): Target fertility
Pair-Dependent: Features specific to one candidate pair of nodes (vertices)
Link-Dependent: Features specific to one link (edge) in directed graph
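
The four node-dependent features listed on this slide can be computed from a directed edge list. The sketch below assumes u is the source and v the target of the candidate pair, and that the edge list contains no duplicate edges; the function name, dictionary keys and toy graph are hypothetical, not taken from LJMiner.

```python
from collections import Counter


def node_features(edges, u, v):
    """Node-dependent features for a candidate pair (u, v) in a directed graph."""
    indegree = Counter(target for _, target in edges)    # incoming links per node
    outdegree = Counter(source for source, _ in edges)   # outgoing links per node
    return {
        "indegree_u": indegree[u],    # source popularity
        "indegree_v": indegree[v],    # target popularity
        "outdegree_u": outdegree[u],  # source fertility
        "outdegree_v": outdegree[v],  # target fertility
    }


# Hypothetical toy friends graph: (source, target) means source lists target as a friend.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a"), ("d", "c")]
features = node_features(edges, "a", "c")
print(features)
```

A Counter returns 0 for nodes that never appear, so users with no incoming or outgoing links need no special handling.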
13
Graph Features 2: Node and Pair Features in LJMiner
14
LJCrawler

System Design
Data acquisition: client, injector, parser
Ancillary issues
Multi-threading
Distribution
Storage
Analytical postprocessing: LJClipper, LJStats
Distinguishing features of LJCrawler
Results
200 users/second maximum, 5 users/second allowed
Approximately 2 million pages crawled
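
The gap between the crawler's maximum rate and the allowed rate implies some form of request throttling. The slides do not say how LJCrawler enforced the limit, so the following is only a sketch of one common approach, with hypothetical class names. The clock and sleep functions are injectable so that the demonstration runs on a simulated clock instead of actually waiting.

```python
import time


class Throttle:
    """Spaces out calls so that at most `max_per_second` proceed per second."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next call may proceed

    def wait(self):
        """Block until the next request is allowed, then reserve the next slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval


class FakeClock:
    """Simulated clock so the demo finishes instantly (hypothetical helper)."""

    def __init__(self):
        self.t = 0.0

    def now(self):
        return self.t

    def sleep(self, seconds):
        self.t += seconds


fake = FakeClock()
throttle = Throttle(max_per_second=5, clock=fake.now, sleep=fake.sleep)
for _ in range(6):   # six simulated user fetches at the allowed rate
    throttle.wait()
elapsed = fake.t     # simulated seconds spent waiting between fetches
print(f"simulated wait: {elapsed:.1f} s")
```

In real use the default arguments would apply, and `throttle.wait()` would be called before each request.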

15
Outline

Background, Related Work and Rationale
Technical Objective: Link Mining in Social Networks
Methodology: Graph Feature Extraction
Experimental Results: K-State LJMiner Corpus
Continuing Work: Statistical Relational Models

16
Network Statistics: Graph Distance
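
Graph distance between two users can be read as the length of the shortest directed path between them. The transcript does not show how LJMiner computes it, so this is a sketch under that assumption using breadth-first search; the adjacency-dictionary format, function name and toy graph are hypothetical.

```python
from collections import deque


def graph_distance(adjacency, source, target):
    """Length of the shortest directed path from source to target, or None."""
    if source == target:
        return 0
    visited = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None  # target is unreachable from source


# Hypothetical toy friends graph: each user maps to the users they list as friends.
friends = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": []}

distance_a_to_d = graph_distance(friends, "a", "d")
distance_d_to_a = graph_distance(friends, "d", "a")
print(distance_a_to_d, distance_d_to_a)
```

Because the graph is directed, the distance from one user to another need not equal the distance in the opposite direction.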
17
Interpretation of Results

941-node graph (Hsu et al., 2006): LJCrawler v1 output
1000-4000 node graphs: LJCrawler v2 output

18
Outline

Background, Related Work and Rationale
Technical Objective: Link Mining in Social Networks
Methodology: Graph Feature Extraction
Experimental Results: K-State LJMiner Corpus
Continuing Work: Statistical Relational Models

19
Results

Establishing an Interdisciplinary Research
Initiative
K-State / KU / UNL collaboration
Resources: Linguistic Data Consortium
NIST evaluations
Involving End Users of Machine Translation
Document users
Machine learning, data mining, info extraction
researchers
Novel Applications
Social networks and collaborative recommendation
Gisting and beyond

20
Continuing Work

Information Extraction and Intelligent IR
Learning models for IE ontologies
Latent semantic analysis
Machine Learning
Natural language learning
Time series learning and understanding
Relational and first-order models
Automated Reasoning
Probabilistic
Case-based and analogical
Data Mining and Warehousing
Grid Computing
