Social Networks and Surveillance: Evaluating Suspicion by Association

About This Presentation

Slides: 21

Provided by: ryanla

Transcript and Presenter's Notes

1
Social Networks and Surveillance: Evaluating
Suspicion by Association

Ryan P. Layfield
Dr. Bhavani Thuraisingham
Dr. Latifur Khan
Dr. Murat Kantarcioglu
The University of Texas at Dallas

{layfield, bxt043000, lkhan, muratk}@utdallas.edu
2
Overview

Introduction
Our Goal
System Design
Social Networks
Threat Detection
Correlation Analysis
The Experiment
Setup
Current Results
Issues
Future Work

3
Introduction

Automated message surveillance is essential to
communication monitoring
Widespread use of electronic communication
Exponential data growth
Impossible to sift through all by hand
Going beyond basic surveillance
Identifying groups rather than individuals
Monitoring conversations rather than messages

4
Our Goal

Design new techniques and apply existing
algorithms to
Create a machine-understandable model of existing
social networks
Identify abnormal conversations and behavior
Monitor a given communications system in
real-time
Continuously learn and adapt to a dynamic
environment

5
System Design

Three major components
Social Network Modeler
Initial Activity Detector
Correlated Activity Investigator

6
Social Networks

Individuals engaged in suspicious or undesirable
behavior rarely act alone
We can infer that those associated with a person
positively identified as suspicious have a high
probability of being either
Accomplices (participants in suspicious activity)
Witnesses (observers of suspicious activity)
Making these assumptions, we create a context of
association between users of a communication
network

7
Social Networks

Within our model
Every node is a unique user
Every message creates or strengthens a link
between nodes
Over time, the network changes
Frequent communication leads to stronger links
Intermittent messaging implies weakening social
ties
The strength of the link implies how strong an
association between individuals is
From this data, we can theoretically identify
Hubs
Groups
Liaisons
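
The model on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name, the unit increment per message, and the multiplicative `decay` step are hypothetical choices, since the slides do not say how link strength is updated or weakened.

```python
from collections import defaultdict


class SocialNetwork:
    """Users are nodes; link weights grow with messages and fade over time."""

    def __init__(self, decay=0.9):
        # decay is a hypothetical per-step multiplier, not a value from the slides
        self.decay = decay
        self.weights = defaultdict(float)

    def record_message(self, sender, recipients):
        # Every message creates a link or strengthens an existing one.
        for recipient in recipients:
            if recipient != sender:
                self.weights[frozenset((sender, recipient))] += 1.0

    def tick(self):
        # One time step passes: every link weakens unless refreshed by messages.
        for link in self.weights:
            self.weights[link] *= self.decay

    def strength(self, a, b):
        return self.weights.get(frozenset((a, b)), 0.0)

    def hubs(self, top=1):
        # Rank users by the total strength of their links (ties broken by name).
        totals = defaultdict(float)
        for link, weight in self.weights.items():
            for user in link:
                totals[user] += weight
        return sorted(totals, key=lambda u: (-totals[u], u))[:top]


net = SocialNetwork(decay=0.5)
net.record_message("alice", ["bob", "carol"])
net.record_message("alice", ["bob"])
net.tick()
net.record_message("bob", ["alice"])
```

Here the alice-bob link stays strong because it is refreshed after the time step, while the alice-carol link fades. Ranking by total link strength is one simple way to surface hubs; finding groups and liaisons would need clustering on top of this structure.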

8
Social Networks
9
Threat Detection

Every message sent is scrutinized in the interest
of identifying suspicious communication
Keywords analysis
Prior context (i.e. previous message content)
When a detection algorithm yields a strong
result, a token is created
The token is created at the origin and passed to
the recipient(s)
Existing tokens, if any, are cloned instead
The result is a web that potentially reflects the
dissemination of suspicious information
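
A minimal sketch of the token mechanism follows. It rests on assumptions: the keyword list and the `is_suspicious` detector are hypothetical stand-ins for the detection algorithms, and "cloned" is read as "the sender's existing tokens are copied to each recipient instead of a new token being created".

```python
import itertools
from dataclasses import dataclass

# Hypothetical keyword list; the slides do not name the terms that were used.
SUSPICIOUS_KEYWORDS = {"payload", "detonator"}

_next_id = itertools.count(1)


@dataclass(frozen=True)
class Token:
    token_id: int
    origin: str  # user whose message first triggered detection


def is_suspicious(text):
    # Stand-in detector: flags a message if any word is a listed keyword.
    return any(w.strip(".,!?") in SUSPICIOUS_KEYWORDS for w in text.lower().split())


def process_message(sender, recipients, text, held):
    """Update held (user -> set of Tokens) for one message; return the tokens passed on."""
    if not is_suspicious(text):
        return set()
    tokens = held.setdefault(sender, set())
    if not tokens:
        # No existing token at the sender: create one at the origin.
        tokens.add(Token(next(_next_id), sender))
    passed = set(tokens)
    for recipient in recipients:
        # Existing tokens are cloned to each recipient rather than recreated.
        held.setdefault(recipient, set()).update(passed)
    return passed
```

Because a token keeps its `origin`, following which users hold the same token traces the web of dissemination back to the first flagged message.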

10
Correlation Analysis

Future messages with similar suspicious topics
are not always identifiable with the same
initial techniques
Quick replies
Pronoun use
Assumption that recipient is aware of topic
If a token is present at the sender when a
message is sent
The message the token is associated with and the
new message are analyzed
If analysis yields a strong match, the token is
further cloned and passed to recipient
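
The follow-up check can be illustrated as below. This is a sketch, not the presenters' analysis: Jaccard word overlap stands in for the unspecified correlation technique, and `held`, `token_messages`, and the `threshold` value are hypothetical names and settings.

```python
def words(text):
    # Lowercased word set with surrounding punctuation stripped.
    return {w.strip(".,!?").lower() for w in text.split()} - {""}


def similarity(a, b):
    # Jaccard overlap of the two word sets (a stand-in for the real analysis).
    wa, wb = words(a), words(b)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def correlate(sender, recipients, text, held, token_messages, threshold=0.2):
    """held: user -> set of token ids; token_messages: token id -> flagged text.

    Returns the token ids cloned to the recipients of this message.
    """
    passed = set()
    for token_id in held.get(sender, set()):
        # Compare the new message with the message the token is associated with.
        if similarity(token_messages[token_id], text) >= threshold:
            passed.add(token_id)
    if passed:
        for recipient in recipients:
            held.setdefault(recipient, set()).update(passed)
    return passed


held = {"bob": {1}}
token_messages = {1: "The shipment arrives Friday at the dock."}
cloned = correlate("bob", ["carol"], "Is the dock ready Friday?", held, token_messages)
```

The reply shares only "the", "dock", and "Friday" with the flagged message, which is enough to clear the illustrative threshold even though no flagged keyword is required to reappear. Plain word overlap would still miss replies that lean entirely on pronouns, which is the case the slide singles out as hard.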

11
The Experiment

Rare words shared between two or more messages
are candidates for keyword analysis, but they
are not always easily sifted from noise
Noise within text-based messages comes in a
variety of forms
Misspelled words
Unusual word choice
Incompatible variations of the same language
(e.g. British vs. American English)
Unexpected language
However, we do not want to eliminate potential
keywords
Document names
Terminology specific to a subject
Buzz words

12
The Experiment

We proposed an experiment that attempts to
eliminate false positives due to noisy data while
strengthening and expanding our correlation
techniques

13
Setup

Tools
Running word rank database
Implementation of word set theory infrastructure
JAMA Matrix Library
Singular Value Decomposition
Our Approach
Apply SVD noise filtering based on 100 messages
Analyze word frequency correlation between
current message and prior suspicious messages
Generate a score based on the results

14
Setup

Construct a matrix based on the last 100 messages

[Figure: matrix with one axis labeled "messages" and the other "words", ordered from more common to less common]
15
Setup

Decompose and rebuild

A = U Σ V^T
Eliminate weak singular values
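
Written out, the decompose-and-rebuild step is the standard truncated SVD. The cutoff k below is an assumption for illustration; the slides do not say how many singular values are kept or how "weak" is decided.

```latex
A = U \Sigma V^{T}, \qquad
\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_r), \quad
\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r \ge 0

% Keep the k strongest singular values and rebuild:
A_k = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{T}
```

Here u_i and v_i are the i-th columns of U and V. By the Eckart-Young theorem, A_k is the closest rank-k matrix to A in the Frobenius norm, which is why dropping small singular values is a common way to suppress noise in a word-by-message matrix.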
16
Setup
[Score formula not preserved in this transcript; the annotations on its terms were:]
Pulled from messages j and k
Raw total score for word w_i
Pulled from running word database
Counts only intersection of words
Predefined fixed threshold
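
Since the formula itself is missing, the following is only a guess at its general shape, built from the annotations: a per-word weight drawn from the running word database, summed over the words two messages share, and compared with a fixed threshold. The log-based `rarity` weight, the function names, and the sample counts are all hypothetical.

```python
import math


def rarity(word, word_counts, total):
    # Add-one smoothed weight: words rarer in the running database score higher.
    return math.log((total + 1) / (word_counts.get(word, 0) + 1))


def correlation_score(words_j, words_k, word_counts):
    # Only words in the intersection of the two messages contribute.
    total = sum(word_counts.values())
    shared = set(words_j) & set(words_k)
    return sum(rarity(w, word_counts, total) for w in shared)


def is_correlated(words_j, words_k, word_counts, threshold):
    # Predefined fixed threshold, as in the slide annotation.
    return correlation_score(words_j, words_k, word_counts) >= threshold


# Hypothetical running word counts.
word_counts = {"the": 900, "meeting": 90, "bluebird": 9}
rare_pair = correlation_score(["the", "bluebird"], ["the", "bluebird", "file"], word_counts)
common_pair = correlation_score(["the", "meeting"], ["the", "meeting"], word_counts)
```

With these counts, two messages sharing the rare word "bluebird" score higher than two sharing the more common "meeting", so a threshold between the two scores would flag only the first pair.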
17
Current Results