Title: Detecting Hidden Deceiving Groups in Social Networks
1Detecting Hidden Deceiving Groups in Social
Networks
- Kishore Ekula, Prasanth Kalakota and Naveen
Santhapuri - 04/18/2006
2Outline
- Motivation
- References
- Deception Detection
- Social Network Analysis
- Detecting Hidden Groups
- Deception Detection using Verbal Cues
- Project idea
3Motivation
- Imperative to
- Filter and distinguish deceptive information
- Identify hidden malicious groups
- Interested parties
- Individuals, Law Enforcement
- Corporate and Social Networks
- Sparse research in combining the recent advances
in both domains
4References
- Lina Zhou (UMD), Douglas P. Twitchell, Tiantian
Qin (Uarizona), Judee K. Burgoon, Jay F.
Nunamaker, Jr., An Exploratory Study into
Deception Detection in Text-based
Computer-Mediated Communication, Proc. of 36th
Annual Hawaii International Conference on System
Sciences (HICSS'03)
- Diesner, J., Frantz, T., Carley, K.M. (2005).
Communication Networks from the Enron Email
Corpus. Journal of Computational and Mathematical
Organization Theory 11, 201-228
5Deception
- Deception
- Active transmission of messages and information
to create a false conclusion
- Reasons for deception
- Self-preservation
- Self-presentation
- Gain
- Altruistic (social) lies
6Deception Methods - Offensive
- Concealment
- Falsification
- Misdirecting
- Half-concealment
- Incorrect inference dodge
- Social Engineering
- Electronic Deceptions like Spam, Phishing, Trojan
Horse attacks
7Detecting Deception
(Source: http://www.cs.nps.navy.mil/people/faculty/rowe/virtcomm162.htm)
8Detecting Deception
- High-Level Clues
- Discrepancies in Information presented
- Logical Fallacies
- Inconsistency in tone
9Social Network Analysis (SNA)
- Social network analysis is the mapping and
measuring of relationships and flows between
people, groups, and organizations
- The nodes in the network are the people and
groups while the links show relationships or
flows between the nodes.
- SNA provides both a visual and a mathematical
analysis of human relationships
10Social Network Analysis
Source: http://www.research.ibm.com/thinkresearch/pages/2005/20050706_think.shtml
11Social Network Analysis
- Popular Individual Network Measures
- Degree Centrality
- Betweenness Centrality
- Closeness Centrality
- Important Ties between Individuals and Groups
- Direct or Indirect
- Strong or Weak
- One-way or Two-way
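The three individual measures listed above can be sketched in plain Python. This is a minimal illustration on a made-up five-person undirected network (the names and ties are hypothetical), assuming a connected, unweighted graph; betweenness is computed with Brandes' algorithm and left unnormalized.

```python
from collections import deque

# Hypothetical undirected network: A-B, B-C, C-D, C-E
graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D", "E"],
    "D": ["C"],
    "E": ["C"],
}

def degree_centrality(graph):
    """Fraction of the other nodes each node is directly tied to."""
    n = len(graph)
    return {v: len(graph[v]) / (n - 1) for v in graph}

def closeness_centrality(graph):
    """Reachable nodes divided by the sum of hop distances to them."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result

def betweenness_centrality(graph):
    """Brandes' algorithm: number of shortest paths passing through each node."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}  # count of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair is visited from both endpoints, so halve the totals.
    return {v: score[v] / 2 for v in score}

degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
betweenness = betweenness_centrality(graph)
print(degree["C"], closeness["C"], betweenness["C"])
```

In this toy network C scores highest on all three measures because every path between the two sides of the graph runs through it.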
12The Enron Case: Enron - What happened?
- Enron was formed in 1985
- Within 15 years became the nation's seventh-biggest
company in revenue
- In 1999, Enron officials began to separate losses
from equity and derivative trades into special
purpose entities (SPE)
- On October 31, 2001, the Securities and Exchange
Commission (SEC) started an inquiry into Enron
13Research on Enron Case
- Management Institute of Paris (MIP) identified
Enron's and Andersen's senior managers as
responsible for Enron's failure
- Enron's management misled the public, lacked
moral leadership and ethics, and created an
organizational culture of greed and secrecy
14Data
- Federal Energy Regulatory Commission (FERC)
originally posted the Enron email database on the
internet in May of 2002
- FERC collected a total of 619,449 emails from 158
employees; each email contains the email addresses
of sender and receiver, date, time, subject, and
body text
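The fields listed above can be pulled out of a raw message with Python's standard `email` package. The sketch below assumes RFC 822 style headers; the sample message, addresses, and the `extract_record` helper are invented for illustration and are not taken from the corpus.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical message in RFC 822 format (not a real corpus entry)
RAW_MESSAGE = """\
Message-ID: <0001.example@enron.example>
Date: Mon, 15 Oct 2001 09:30:00 -0700
From: alice@enron.example
To: bob@enron.example, carol@enron.example
Subject: Quarterly numbers

Please review the attached figures before Friday.
"""

def extract_record(raw):
    """Return sender, recipients, timestamp, subject, and body of one email."""
    msg = message_from_string(raw)
    sender = getaddresses([msg.get("From", "")])[0][1]
    recipients = [addr for _, addr in getaddresses(msg.get_all("To", []))]
    return {
        "sender": sender,
        "recipients": recipients,
        "sent_at": parsedate_to_datetime(msg["Date"]),
        "subject": msg.get("Subject", ""),
        "body": msg.get_payload().strip(),
    }

record = extract_record(RAW_MESSAGE)
print(record["sender"], "->", record["recipients"], record["sent_at"].date())
```

Each (sender, recipient, timestamp) triple obtained this way is one candidate edge for the communication network.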
15Database Refinement and Extraction of Relational
Data
- Data in the corpus is multi-mode (e.g. work
relationship, friendship), multi-link
(connections across various meta-matrix entities)
and multi-time period
- Nodes and edges can have multiple attributes such
as the position and location of an employee or
the types of relationships between two
communication partners (multi-mode)
16Database Refinement
- DyNetML: Interchange Format for Rich Social
Network Data
- Files require data from three tables: the message
ID, which includes time information, the sender,
and the recipient
- ISI position file lists the names of 161 Enron
employees and provides position information for
132 of them
18Methodology
- ORA (Organization Risk Analyzer) was used to
analyze the communication networks
- Position information on agents used to compare
formal and informal organizational structure
- Explored changes in the network over time
- Comparing a network from a month during the Enron
crisis with a network from a month in which no
major negative happenings are reported
19Methodology
- October 2000 and October 2001 were picked for this
comparison
- First the ORA Intel report was run, then the ORA
context report, which compares graph-level measures
from the Intel report for Enron with values for real
networks stored in a CASOS database
- The ORA risk report was run, which identifies critical
individuals who bear risk for the organization
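The month-to-month comparison can be illustrated without ORA. The sketch below is not the ORA Intel or context report; it only groups hypothetical (date, sender, recipient) records by month and compares two simple graph-level measures, message volume and directed density. All names and records are invented.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical message records: (date sent, sender, recipient)
MESSAGES = [
    (date(2000, 10, 3), "alice", "bob"),
    (date(2000, 10, 9), "bob", "carol"),
    (date(2001, 10, 2), "alice", "bob"),
    (date(2001, 10, 2), "alice", "carol"),
    (date(2001, 10, 5), "carol", "alice"),
    (date(2001, 10, 8), "bob", "alice"),
    (date(2001, 10, 8), "alice", "bob"),
]

def monthly_networks(messages):
    """Map (year, month) to a Counter of directed edges weighted by message count."""
    networks = defaultdict(Counter)
    for sent_on, sender, recipient in messages:
        networks[(sent_on.year, sent_on.month)][(sender, recipient)] += 1
    return networks

def density(edge_weights, node_count):
    """Distinct directed edges divided by the number of possible directed edges."""
    possible = node_count * (node_count - 1)
    return len(edge_weights) / possible if possible else 0.0

networks = monthly_networks(MESSAGES)
agents = {name for _, s, r in MESSAGES for name in (s, r)}

for month in [(2000, 10), (2001, 10)]:
    edges = networks[month]
    print(month, "messages:", sum(edges.values()),
          "density:", round(density(edges, len(agents)), 3))
```

Using the same agent set for both months keeps the densities comparable; a real analysis would also need to decide how to treat employees who were absent in one of the periods.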
20Results
21Comparison of networks
22Variation in email frequency
23Limitations
- Main limitation of the study is that the extracted
relational data is not validated
- Analyzed only two time points and a subset of 227
people
- Only the message flow was taken as analysis
criteria
- The content of the messages was not considered
- Content is required for knowing the deceptive
indices of various players
24Deception Detection in CMC
- Involves researching two areas
- Deception theory
- Detection theory
- Deception theory
- Media richness, Channel Expansion and
Interpersonal Deception theory (IDT), Models of
Deceptive Communication
- Detection Theory
- Criteria based Content Analysis (CBCA)
- Reality Monitoring (RM)
- Scientific Content Analysis (SCAN)
- Verbal Immediacy (VI)
25Media Richness theory
- Majority of daily information
- Involves some form of deceit
- Uses rich media (face-to-face, voice)
- Non-verbal cues, e.g. polygraph testing
Figure credit: Blue water Business Solutions Inc.
26Channel Expansion and IDT
- Experience increases the perceived richness of
media
- Experienced user can transmit more deception cues
- Able to strategically hide possible deception
cues
- IDT
- Deceiver will engage in
- Modifications of behavior in response to
suspicions
- Displaying indicators of deception
- Findings useful for low level channels like CMC
27Detection Theories
- CBCA and RM
- Hypothesis: A statement derived from actual
memory will differ in content and quality from a
statement derived from fantasy
- The former contains more perceptual information
and the latter more cognitive operations
- SCAN
- The absence of some criteria indicates deception
- Pronouns, first person singular, connection
28Verbal Immediacy Theory
- "We": immediate
- "You and I": non-immediate
- Deception is associated with negative affect
- Non-immediacy is referred to as an indication of
separation
- Variations in immediacy include verbal forms such
as pronouns and tense
- Immediacy is assessed by literal interpretation
of words, not connotative meaning
- VI is easier to operationalize than the other theories
29Verbal Cues
- Extract some cues from existing criteria
- New cues based on
- Observations of experimental deceptive messages
- Knowledge of linguistics
- Deceivers have cognitive anxiety
- May unintentionally adopt higher degree of
non-immediacy
- To enhance impression, are likely to display
higher expressiveness of language
30Hypothesis and cues
- Deceptive senders display
- Higher quantity, complexity, non-immediacy,
expressiveness, informality, affect
- Less diversity and specificity of language
- Total 27 linguistic cues
- Diversity: ratio of total number of different
words / total number of words
- Expressiveness: (adjectives + adverbs) / (nouns +
verbs)
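The two ratio cues on this slide can be computed as follows. This is a minimal sketch: diversity is taken as distinct words over total words, and expressiveness as the count of adjectives and adverbs divided by the count of nouns and verbs. Part-of-speech tags are assumed to be supplied by some external tagger; the tag names, the sample sentence, and its hand-assigned tags are hypothetical.

```python
import re

def lexical_diversity(text):
    """Number of distinct words divided by the total number of words."""
    words = re.findall(r"[a-z']+", text.lower())
    return len(set(words)) / len(words) if words else 0.0

def expressiveness(tagged_tokens):
    """(adjectives + adverbs) / (nouns + verbs) over (word, tag) pairs."""
    counts = {"ADJ": 0, "ADV": 0, "NOUN": 0, "VERB": 0}
    for _, tag in tagged_tokens:
        if tag in counts:
            counts[tag] += 1
    denominator = counts["NOUN"] + counts["VERB"]
    return (counts["ADJ"] + counts["ADV"]) / denominator if denominator else 0.0

# Hypothetical message with hand-assigned tags (a tagger would normally supply these)
message = "We really need the mirror and we really need the water"
tagged = [
    ("we", "PRON"), ("really", "ADV"), ("need", "VERB"), ("the", "DET"),
    ("mirror", "NOUN"), ("and", "CONJ"), ("we", "PRON"), ("really", "ADV"),
    ("need", "VERB"), ("the", "DET"), ("water", "NOUN"),
]

diversity_score = lexical_diversity(message)
expressiveness_score = expressiveness(tagged)
print(round(diversity_score, 3), expressiveness_score)
```

Under the stated hypothesis, a deceptive sender would tend toward a lower diversity score and a higher expressiveness score than a truthful one.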
31Cues
- Complexity
- Informality: typo ratio
- Non-immediacy
- Passive voice
- Modal Verbs
- Can, could, might etc.
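Two of these cues are simple enough to sketch directly. The slide does not give exact definitions, so the code below assumes both are ratios over the total word count; the modal verb list, the tiny vocabulary used to flag typos, and the sample message are all illustrative assumptions.

```python
import re

# Assumed modal verb list; the slide names only "can, could, might etc."
MODAL_VERBS = {"can", "could", "may", "might", "must",
               "shall", "should", "will", "would"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def modal_verb_ratio(text):
    """Share of words that are modal verbs (a non-immediacy cue)."""
    words = tokenize(text)
    return sum(w in MODAL_VERBS for w in words) / len(words) if words else 0.0

def typo_ratio(text, vocabulary):
    """Share of words missing from a known vocabulary (an informality cue)."""
    words = tokenize(text)
    return sum(w not in vocabulary for w in words) / len(words) if words else 0.0

# Hypothetical vocabulary and message
VOCABULARY = {"we", "could", "take", "the", "mirror", "it", "might", "help"}
message = "We could take teh mirror, it might help"

modal_score = modal_verb_ratio(message)
typo_score = typo_ratio(message, VOCABULARY)
print(modal_score, typo_score)
```

A realistic typo check would use a full dictionary or spell checker rather than a hand-written word set.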
32Experimental methodology
- 60 students (30 dyads) over 4 days
- 2 (deceptive vs. truthful) x 2 (sender vs.
receiver) x 3 (time periods)
- Desert survival problem
- Achieve an agreeable ranking of items for
survival
- Deceivers were selected randomly
- Participants were asked to give reasons if they
re-rank the items sent by their partner
- One item was rendered useless and participants
were asked to re-rank items on day 3
- Each of the partners was given a questionnaire
as to how much they trusted their partner
33Analysis of Results
- Ratio of generalizing terms in deceptive
condition was surprisingly higher than truthful
condition
- More words, group references, and affective
information
- Deceivers gave elaborate reasons to boost
credibility
- Discrepancy indicates cues tend to differ based
on intent (cover up requires less detail)
34Conclusions
- Detecting deception requires the knowledge of the
two ends of the communication channel and also
the content of the communication
- Network analysis techniques have a fair degree of
sophistication but text analysis for deception is
still in a nascent stage
- Identifying promising verbal cues is the key
- Cues differ across contexts
35Project Idea
- Members in a hidden group
- Tend to deceive people outside the group
- Be honest within the group
- A topic/category based model
- Observe the pattern of deception for each topic
- Use the patterns to identify a possible group for
each topic
- Derive the hidden group's members from the set of
possible groups
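One possible reading of this idea is sketched below; it is an interpretation, not a worked-out method from the slides. It assumes that a per-topic deceptive/honest label is already available for each sender-recipient pair, treats people who deceive someone yet are honest with each other as a candidate group for that topic, and keeps members who recur across topics. The data, the `min_topics` threshold, and all names are hypothetical.

```python
from collections import Counter

# Hypothetical labels: topic -> {(sender, recipient): message judged deceptive?}
OBSERVATIONS = {
    "trading": {
        ("ann", "ben"): False, ("ben", "ann"): False,
        ("ann", "dan"): True, ("ben", "dan"): True,
        ("dan", "eve"): False,
    },
    "accounting": {
        ("ann", "ben"): False, ("ben", "cat"): False, ("cat", "ann"): False,
        ("ann", "eve"): True, ("cat", "dan"): True, ("ben", "eve"): True,
    },
}

def candidate_groups(ties):
    """Groups of deceivers linked to one another by honest communication."""
    deceivers = {s for (s, _), deceptive in ties.items() if deceptive}
    adjacency = {person: set() for person in deceivers}
    for (s, r), deceptive in ties.items():
        if not deceptive and s in deceivers and r in deceivers:
            adjacency[s].add(r)
            adjacency[r].add(s)
    groups, seen = [], set()
    for start in sorted(deceivers):
        if start in seen:
            continue
        # Depth-first search collects one connected component of honest ties.
        component, stack = set(), [start]
        while stack:
            person = stack.pop()
            if person in component:
                continue
            component.add(person)
            stack.extend(adjacency[person] - component)
        seen |= component
        if len(component) > 1:
            groups.append(component)
    return groups

def hidden_group_members(observations, min_topics=2):
    """People who appear in a candidate group for at least min_topics topics."""
    appearances = Counter()
    for ties in observations.values():
        for member in set().union(*candidate_groups(ties)):
            appearances[member] += 1
    return {person for person, n in appearances.items() if n >= min_topics}

members = hidden_group_members(OBSERVATIONS)
print(sorted(members))
```

The threshold on recurring topics is a design choice: a lower value flags more people per topic, while a higher value keeps only those whose deceive-outside, honest-inside pattern is consistent.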