Social Networks and Surveillance: Evaluating Suspicion by Association

1
Social Networks and Surveillance: Evaluating
Suspicion by Association
  • Ryan P. Layfield
  • Dr. Bhavani Thuraisingham
  • Dr. Latifur Khan
  • Dr. Murat Kantarcioglu
  • The University of Texas at Dallas

{layfield, bxt043000, lkhan, muratk}@utdallas.edu
2
Overview
  • Introduction
  • Our Goal
  • System Design
  • Social Networks
  • Threat Detection
  • Correlation Analysis
  • The Experiment
  • Setup
  • Current Results
  • Issues
  • Future Work

3
Introduction
  • Automated message surveillance is essential to
    communication monitoring
  • Widespread use of electronic communication
  • Exponential data growth
  • Impossible to sift through all by hand
  • Going beyond basic surveillance
  • Identifying groups rather than individuals
  • Monitoring conversations rather than messages

4
Our Goal
  • Design new techniques and apply existing
    algorithms to:
      • Create a machine-understandable model of
        existing social networks
      • Identify abnormal conversations and behavior
      • Monitor a given communications system in
        real time
      • Continuously learn and adapt to a dynamic
        environment

5
System Design
  • Three major components
  • Social Network Modeler
  • Initial Activity Detector
  • Correlated Activity Investigator

6
Social Networks
  • Individuals engaged in suspicious or undesirable
    behavior rarely act alone
  • We can infer that those associated with a person
    positively identified as suspicious have a high
    probability of being either:
      • Accomplices (participants in suspicious activity)
      • Witnesses (observers of suspicious activity)
  • Making these assumptions, we create a context of
    association between users of a communication
    network

7
Social Networks
  • Within our model:
      • Every node is a unique user
      • Every message creates or strengthens a link
        between nodes
  • Over time, the network changes:
      • Frequent communication leads to stronger links
      • Intermittent messaging implies weakening social
        ties
  • The strength of the link implies how strong an
    association between individuals is
  • From this data, we can theoretically identify:
      • Hubs
      • Groups
      • Liaisons
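
The link model on this slide can be sketched as a weighted graph. This is a minimal illustration, not the authors' implementation: the class name, the `boost` increment, and the multiplicative `decay` factor are all assumptions, since the slides do not say how link strength is computed or how it weakens.

```python
from collections import defaultdict


class SocialNetwork:
    """Weighted graph: nodes are users, link weights track association strength."""

    def __init__(self, boost=1.0, decay=0.9):
        # boost and decay are illustrative parameters, not values from the slides.
        self.boost = boost
        self.decay = decay
        self.links = defaultdict(float)  # frozenset({a, b}) -> strength

    def record_message(self, sender, recipients):
        """Every message creates or strengthens a link between two users."""
        for recipient in recipients:
            if recipient != sender:
                self.links[frozenset((sender, recipient))] += self.boost

    def tick(self):
        """Advance one time step: every link decays, so only new messages offset it."""
        for pair in self.links:
            self.links[pair] *= self.decay

    def strength(self, a, b):
        """Current strength of the link between a and b (0.0 if none exists)."""
        return self.links.get(frozenset((a, b)), 0.0)

    def hubs(self, top=1):
        """Rank users by the total strength of their links (a simple hub measure)."""
        totals = defaultdict(float)
        for pair, weight in self.links.items():
            for user in pair:
                totals[user] += weight
        return sorted(totals, key=totals.get, reverse=True)[:top]
```

Calling `record_message` for each observed message and `tick` at a fixed interval makes frequently used links dominate while idle ones fade. Groups and liaisons would need additional graph analysis that this sketch does not attempt.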

8
Social Networks
9
Threat Detection
  • Every message sent is scrutinized in the interest
    of identifying suspicious communication
  • Keyword analysis
  • Prior context (i.e., previous message content)
  • When a detection algorithm yields a strong
    result, a token is created
  • The token is created at the origin and passed to
    the recipient(s)
  • Existing tokens, if any, are cloned instead
  • The result is a web that potentially reflects the
    dissemination of suspicious information
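
The token mechanism described above might look roughly like the following. Everything specific here is hypothetical: the watch list, the keyword-fraction score, the `threshold` value, and the function names are stand-ins, because the slides do not specify the detection algorithm. Cloning is modeled by sharing an immutable token value among holders.

```python
import itertools
from dataclasses import dataclass

SUSPICIOUS_KEYWORDS = {"detonator", "payload"}  # hypothetical watch list
_token_ids = itertools.count(1)


@dataclass(frozen=True)
class Token:
    token_id: int
    origin: str   # user whose message first triggered detection
    message: str  # text of the message that triggered it


def keyword_score(text):
    """Fraction of words in the message that appear on the watch list."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in SUSPICIOUS_KEYWORDS for w in words) / len(words)


def process_message(held, sender, recipients, text, threshold=0.2):
    """Update `held` (user -> set of Tokens) for one message.

    Returns the set of tokens passed to the recipients (empty if the
    message did not score at or above the assumed threshold).
    """
    if keyword_score(text) < threshold:
        return set()
    # Existing tokens at the sender are cloned rather than replaced.
    tokens = set(held.get(sender, set()))
    if not tokens:
        # No prior token: create one at the origin.
        tokens = {Token(next(_token_ids), sender, text)}
        held.setdefault(sender, set()).update(tokens)
    for recipient in recipients:
        held.setdefault(recipient, set()).update(tokens)
    return tokens
```

Following which users hold the same `token_id` then traces the web of dissemination that the slide refers to.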

10
Correlation Analysis
  • Future messages with similar suspicious topics
    are not always identifiable with the same
    initial techniques
  • Quick replies
  • Pronoun use
  • Assumption that recipient is aware of topic
  • If a token is present at the sender when a
    message is sent:
      • The message the token is associated with and
        the new message are analyzed
      • If the analysis yields a strong match, the
        token is further cloned and passed to the
        recipient
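
A rough sketch of this correlation step follows. It substitutes plain Jaccard word-set overlap for the authors' analysis, which the slides only describe later in terms of word-frequency correlation, so the similarity measure, the `threshold`, and the data layout (tokens as `(token_id, source_text)` pairs) are assumptions made for illustration.

```python
def jaccard(a, b):
    """Word-set overlap between two messages, in [0, 1]."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def correlate(held, sender, recipients, text, threshold=0.25):
    """held maps user -> list of (token_id, source_text) pairs.

    For each token at the sender, compare the message it came from with
    the new message; on a strong match, clone the token to every
    recipient. Returns the ids of the tokens that were passed on.
    """
    passed = []
    for token in held.get(sender, []):
        token_id, source_text = token
        if jaccard(source_text, text) >= threshold:
            for recipient in recipients:
                bucket = held.setdefault(recipient, [])
                if token not in bucket:
                    bucket.append(token)
            passed.append(token_id)
    return passed
```

A short reply such as "ok the warehouse tonight works" contains no watch-list keyword, yet it overlaps enough with a tokened message like "meet at the warehouse tonight" to carry the token forward.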

11
The Experiment
  • Rare words shared between two or more messages
    are candidates for keyword analysis, but they
    are not always easily sifted from noise
  • Noise within text-based messages comes in a
    variety of forms
  • Misspelled words
  • Unusual word choice
  • Incompatible variations of the same language
    (e.g., British vs. American English)
  • Unexpected language
  • However, we do not want to eliminate potential
    keywords
  • Document names
  • Terminology specific to a subject
  • Buzz words

12
The Experiment
  • We proposed an experiment that attempts to
    eliminate false positives due to noisy data while
    strengthening and expanding our correlation
    techniques

13
Setup
  • Tools
  • Running word rank database
  • Implementation of word set theory infrastructure
  • JAMA Matrix Library
  • Singular Value Decomposition
  • Our Approach
  • Apply SVD noise filtering based on 100 messages
  • Analyze word frequency correlation between
    current message and prior suspicious messages
  • Generate a score based on the results

14
Setup
  • Construct a matrix based on the last 100 messages

[Figure: matrix with axes labeled "messages" and "words"; words are ordered from more common to less common]
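
The matrix construction on this slide can be sketched as below. The orientation (one row per word, one column per message), whitespace tokenization, and raw counts as entries are assumptions; the slide only indicates a messages axis and a words axis ordered by how common each word is.

```python
from collections import Counter, deque


def build_matrix(messages):
    """Return (vocabulary, matrix): one row per word, one column per message.

    Rows are ordered from the most to the least frequent word in the window.
    """
    counts = [Counter(m.lower().split()) for m in messages]
    totals = Counter()
    for c in counts:
        totals.update(c)
    # Descending total count; ties broken alphabetically for determinism.
    vocabulary = sorted(totals, key=lambda w: (-totals[w], w))
    matrix = [[c[w] for c in counts] for w in vocabulary]
    return vocabulary, matrix


# Sliding window that keeps only the last 100 messages seen.
window = deque(maxlen=100)
```

Appending each incoming message to `window` and calling `build_matrix(window)` yields the matrix that would then be handed to the decomposition step.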
15
Setup
  • Decompose and rebuild

$A = U \Sigma V^{T}$
Eliminate weak singular values
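
For reference, the standard form of this decompose-and-rebuild step is the truncated SVD. The cutoff $k$ below is an assumption; the slides do not say how many singular values are kept or how "weak" is defined.

```latex
% Full decomposition, singular values in descending order
A = U \Sigma V^{T}, \qquad
\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_r), \quad
\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r \ge 0

% Rebuild from the k strongest singular values only
A_k = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{T}, \qquad k < r

% A_k is the best rank-k approximation of A in the Frobenius norm
\lVert A - A_k \rVert_F = \sqrt{\sum_{i=k+1}^{r} \sigma_i^{2}}
```

Here $u_i$ and $v_i$ are the $i$-th columns of $U$ and $V$. Dropping the terms with small $\sigma_i$ discards the directions that contribute least to $A$, which is the sense in which the rebuilt matrix is a noise-filtered version of the original.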
16
Setup
[Equation not preserved in this transcript; its annotations read:]
  • Pulled from messages j and k
  • Raw total score for word w_i
  • Pulled from running word database
  • Counts only intersection of words
  • Predefined fixed threshold
17
Current Results
  • Method is not currently accurate
  • Large fluctuations
  • Correlation easily swayed by a plethora of common
    words
  • Uncommon words not given enough weight

18
Current Results
1000 messages evaluated; first 100 used to seed word ranks.
19
Issues
  • Word frequencies fluctuate wildly during the
    beginning of the experiment (0.0 to 10.0)
  • Extreme cost for current construction methods and
    computation
  • Filtering context limited to recent global
    history
  • Affected by large bodies of text

20
Future Work
  • Tap potential of existing matrix for further
    analysis
  • Adaptive filtering feedback algorithms
  • Speed improvements to accommodate real-time
    streams
  • Flexible communication platform monitoring
  • Addition of pipe architecture for modular threat
    detection and correlation