Cyber-TA: Massive and Distributed Data Correlation - PowerPoint PPT Presentation

About This Presentation

Slides: 23
Provided by: PhilP
Learn more at: http://www.cyber-ta.org

Transcript and Presenter's Notes

Title: Cyber-TA: Massive and Distributed Data Correlation


1
Introduction Approaches to Privacy-Preserving
Correlation A Cyber-TA Distributed Correlation
Example botHunter
Cyber-TA: Massive and Distributed Data Correlation
Phillip Porras - porras_at_csl.sri.com
Computer Science Laboratory, SRI International
www.cyber-ta.org
28 September 2006
2
Massive Data Correlation Data Analysis
Approaches Stealth Threats Massive PPDM
Shifts and Spikes Highly Predictive
Blacklists Distributed Correlation Techniques
Massive Data Correlation Group
  • Massive Data Correlation Group: examining
    strategies to collect and analyze local network
    events in search of large-scale attack
    phenomena, emerging malware threats, and stealth
    activity across large-scale networks
  • Contributors: SRI, Yale, SANS Institute, NCSU, UC
    Davis, GA-Tech, and others
  • Perspectives
      • Massive/Passive Analysis Methods: examining
        large-scale data correlation strategies to
        apply to incoming security log data from the
        repository
          • Data utility requirements for data privacy
            services
          • Optimal data sources
          • New (and current) correlation strategies
            must address data anonymization
      • Distributed Analysis Methods: distribute
        attack detection logic to producers, collect
        result abstractions, and conduct group
        consensus analyses

3
Data Analysis Approaches
  • Massive/Passive Analysis Methods
      • low-rate (stealth) pattern/sequence detection
        in massive data stores
      • massive privacy-preserving data mining
        strategies (Massive PPDM)
      • fast entropy-shift detection in high-volume
        data streams
      • Highly-Predictive Blacklist (HPB) production
  • Distributed Analysis Methods
      • producer-side behavior-based malware
        correlation (botHunter v0.9)
      • summary statistics, consensus attack
        detection, and trend analyses

4
Isolating Stealthy Actions in Massive Data Volumes
  • Objective: "stealth" in this context means
    seeking long-duration or short-sequence
    deterministic behavior patterns in massive data
    streams
  • Current detection methods lack computational
    and memory efficiency in processing massive data
    stores
  • Current coordinated attack discovery methods
    (e.g., attack collaboration) have not been
    applied in repository-scale applications
  • We seek data pruning techniques and optimal data
    attribute selections that will facilitate various
    deterministic behavior pattern analyses
      • Low-speed scanning, common malware
        communication patterns, long-duration
        propagation analyses, regularities in IDS log
        production patterns that indicate detection
        redundancies
  • Employ massive-data analysis techniques from
    areas such as streaming algorithmics, very large
    databases, and distributed data mining

5
Example: Low-density pattern analyzer, port N-Grams
  • Provides a basis for
      • automated discovery of emerging malware scan
        patterns
      • comparing local systems to global N-Gram
        patterns

Data set: 300M connections over 56K unused IP
addresses
FOUND: On days 1-3, 160-200 sources per day probed
the following 10-port combination (all MS B.O.
targets):
80, 135, 139, 445, 1025, 1433, 2745, 3127, 5000, 6129
Sources per day (days 1-2-3): 195-200-160

[Figure: Dst_Port N-Grams annotated with common
SRC_IP counts. N-Grams shown:
80-135-139-445-1025-2745-3127-5000-6129 (no 1433),
80-139-445-1025-1433-2745-3127-5000-6129 (no 135),
139-445-1025-1433-2745-3127-5000-6129 (no 80, no 135)]

Port legend
  80    Web Server
  135   MS DCE Locator Service (DHCP, DNS, WINS)
  139   MS NetBios
  445   MS Win2K SMB
  1025  CAN-2003-0533 MS LSASRV.DLL B.O.
  1433  MS SQL-Server B.O.
  2745  MS Bagle Virus Backdoor
  3127  MS MyDoom Backdoor
  5000  BioNet, Bubble, Blazer, ICKiller Backdoors
  6129  MS Dameware Remote Admin
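
The port N-Gram idea on this slide can be sketched in a few lines. This is a minimal illustration, not the analyzer used in the talk: it assumes connection records are available as (source IP, destination port) pairs, treats the sorted set of distinct ports probed by a source as that source's N-Gram, and counts how many sources share each N-Gram. The function names and sample records are hypothetical.

```python
from collections import Counter, defaultdict


def port_ngrams(connections):
    """Map each source IP to the sorted tuple of distinct ports it probed."""
    ports_by_src = defaultdict(set)
    for src_ip, dst_port in connections:
        ports_by_src[src_ip].add(dst_port)
    return {src: tuple(sorted(ports)) for src, ports in ports_by_src.items()}


def common_source_counts(connections, min_ports=2):
    """Count the distinct sources behind each destination-port combination.

    min_ports is an assumed cut-off that drops single-port probes.
    """
    counts = Counter(port_ngrams(connections).values())
    return {ngram: n for ngram, n in counts.items() if len(ngram) >= min_ports}


# Hypothetical records: two sources probe the same three ports.
sample = [
    ("10.0.0.1", 80), ("10.0.0.1", 135), ("10.0.0.1", 445),
    ("10.0.0.2", 445), ("10.0.0.2", 80), ("10.0.0.2", 135),
    ("10.0.0.3", 1433),
]
result = common_source_counts(sample)
print(result)
```

Comparing a local sensor's N-Gram counts against repository-wide counts would then be a dictionary lookup per N-Gram.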
6
Massive PPDM Strategies
  • Current PPDM Methods
      • Peer-based shared encryption schemes (e.g.,
        homomorphic encryption)
      • Example capabilities
      • Privacy-Preserving Set Intersection: all
        parties want the intersection of their private
        datasets revealed, without gaining/revealing
        non-intersecting data
      • Privacy-Preserving Set Matching: each member
        Pi wants to know which values in its set
        intersect with values of the other members'
        sets, without gaining/revealing non-matchers
      • Solutions are traced to the 2-party case of
        private equality testing, among other
        techniques
  • Massive PPDM
      • PPDM in non-peer-based environments (e.g.,
        large-scale sensor grids)
      • PPDM computational scalability and lightweight
        key coordination schemes
      • Usage concept: N coalition partners wish to
        compare netflow/intrusion/FW logs to find
        common attack sources, but have insufficient
        trust to openly share unrelated connection
        histories
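
The set-intersection capability can be illustrated with a toy two-party sketch. It uses Diffie-Hellman-style commutative blinding, a well-known route to private set intersection that is swapped in here for brevity; it is not the homomorphic-encryption scheme the slide mentions. The modulus, function names, and sample addresses are assumptions, both parties are simulated in one process, and the parameters are far too small for real use.

```python
import hashlib
import math
import secrets

# Toy modulus (the Mersenne prime 2**127 - 1); NOT a secure parameter choice.
P = 2**127 - 1


def hash_to_group(value):
    """Hash a string to an integer in [1, P - 1]."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def new_key():
    """Pick a secret exponent coprime to P - 1, so blinding is a bijection."""
    while True:
        key = secrets.randbelow(P - 3) + 2
        if math.gcd(key, P - 1) == 1:
            return key


def blind(values, key):
    """First blinding pass: hash each item, then raise it to the secret key."""
    return [pow(hash_to_group(v), key, P) for v in values]


def reblind(blinded, key):
    """Second pass applied by the other party to already-blinded values."""
    return [pow(b, key, P) for b in blinded]


def private_intersection(set_a, set_b):
    """Return the items of set_a that also appear in set_b.

    Only doubly blinded values are compared: H(x)^(a*b) == H(y)^(b*a)
    exactly when x == y (barring hash collisions).
    """
    a_items, b_items = sorted(set_a), sorted(set_b)
    key_a, key_b = new_key(), new_key()
    a_twice = reblind(blind(a_items, key_a), key_b)       # A blinds, B re-blinds
    b_twice = set(reblind(blind(b_items, key_b), key_a))  # B blinds, A re-blinds
    return {item for item, tag in zip(a_items, a_twice) if tag in b_twice}


# Hypothetical attack-source lists held by two coalition partners.
common = private_intersection({"192.0.2.7", "198.51.100.9"},
                              {"198.51.100.9", "203.0.113.4"})
print(common)
```

In a real deployment each party would keep its key private and exchange only the blinded lists; scaling this to N partners and massive logs is the open problem the slide points at.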

7
Massive Data: Efficient Change/Shift Detection
Entropy: LET'S TALK
8
Highly-Predictive Blacklisting (HPB) - Concept
Sensor Repository
  • S. Katti, B. Krishnamurthy, D. Katabi,
    "Collaborating Against Common Enemies," ACM
    SIGCOMM Internet Measurement Conference, 2005.
      • Surveyed data from 1700 DShield sensors
      • Introduced Highly Collaborative Groups
      • Relatively small membership sizes
      • Correlated attacks appear at corr_group
        networks within small time frames
      • Group relations are long lasting
      • Cross-group relations have small intersections
  • Implications
      • blacklist sharing among groups may yield
        higher relevance rates and more manageable
        sizes

9
Contributor Pool Cluster Details
  • Clustering Logic
      • Each node corresponds to a /24 subnet.
      • Different colors represent different prefixes.
      • Two nodes are connected if more than 10% of
        the attacks that target one node also go to
        the other.
      • The nodes within a cluster are highly
        connected, while there is little or no
        connection between nodes in different
        clusters.
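
The clustering rule above can be sketched as a small graph computation. This is an illustration under stated assumptions, not the code behind the slide: attacks are approximated by sets of attacker source IPs per /24 subnet, two subnets are linked when the shared fraction exceeds the threshold in either direction, and clusters are taken to be connected components. All names and sample data are hypothetical.

```python
from itertools import combinations


def build_clusters(attackers_by_subnet, threshold=0.10):
    """Group subnets whose attacker sets overlap by more than `threshold`."""
    nodes = list(attackers_by_subnet)
    neighbors = {node: set() for node in nodes}
    for a, b in combinations(nodes, 2):
        seen_a, seen_b = attackers_by_subnet[a], attackers_by_subnet[b]
        shared = len(seen_a & seen_b)
        # Assumption: exceeding the threshold from either side links the pair.
        linked = (seen_a and shared / len(seen_a) > threshold) or \
                 (seen_b and shared / len(seen_b) > threshold)
        if linked:
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Connected components via an iterative depth-first search.
    clusters, visited = [], set()
    for start in nodes:
        if start in visited:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in visited:
                continue
            visited.add(node)
            component.add(node)
            stack.extend(neighbors[node] - visited)
        clusters.append(component)
    return clusters


# Hypothetical observations: the first two subnets share attackers.
observed = {
    "192.0.2.0/24": {"a", "b", "c"},
    "198.51.100.0/24": {"b", "c", "d"},
    "203.0.113.0/24": {"x", "y"},
}
clusters = build_clusters(observed)
print(clusters)
```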

10
HPB Example: Data Assessment
  • Clusters are constructed using day one's alert
    reports
  • On day one
      • attackers observed by the repository: 976,997
      • attackers observed by the cluster: 10,106
  • On day two
      • over 50% of the attackers seen by any node in
        the cluster can be predicted from day one's
        observations from the cluster

[Figure labels: day-two attacks; day-one repository
observations; day-one attacks on the cluster]
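
The prediction figure quoted for day two amounts to a set-overlap ratio. A minimal sketch, assuming each day's observations are available as sets of attacker IPs (the function name and sample values are hypothetical, and the sample numbers are not the talk's data):

```python
def prediction_hit_rate(day_one_cluster_attackers, day_two_attackers):
    """Fraction of day-two attackers already seen by the cluster on day one."""
    if not day_two_attackers:
        return 0.0
    predicted = day_one_cluster_attackers & day_two_attackers
    return len(predicted) / len(day_two_attackers)


# Hypothetical observations for one cluster.
day_one = {"a", "b", "c"}
day_two = {"b", "c", "d", "e"}
rate = prediction_hit_rate(day_one, day_two)
print(rate)
```

The same ratio computed against the full repository's day-one list would show the trade-off the slide implies: a cluster-derived blacklist is far smaller while still covering a large share of next-day attackers.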
11
  • bØtHunt3r
  • A behavior-based correlation framework for
    botnet detection

12
What is botHunter? A Real Case Study Behavior-based
Correlation Architectural Overview
botHunter Sensors Correlation Framework Example
botHunter Output Cyber-TA Integration
What is botHunter?
botHunter is a passive bot detection system,
consisting of
  • Snort-based sensor suite specialized in
    malware-specific event detection
      • malware-specific inbound scan detection using
        a TRW variant
      • comprehensive remote-to-local exploit
        detection, emphasizing the most common methods
      • PAYL-based session anomaly detection system
        detecting payload exploits over key TCP
        protocols
      • botnet-specific egg download banners and bot
        registration acknowledgements
      • victim-to-C&C communication exchanges,
        particularly for IRC bot protocols
      • inbound-to-outbound scan monitoring system
  • Cyber-TA-based plugin correlator
      • combines information from sensors to
        recognize bots that infect and coordinate with
        your internal network assets
      • submits bot-detection profiles to the Cyber-TA
        repository infrastructure

13
Bot infection case study: Phatbot
  • An example lifecycle of a Phatbot infection
    captured in a controlled VMWare environment
  • A: Attacker, V: Victim, C: C&C Server
  • E1: A.* → V.{2745, 135, 1025, 445, 3127, 6129,
    139, 5000} (Bagle, DCOM2, DCOM, NETBIOS, DOOM,
    DW, NETBIOS, UPNP; TCP connections w/out content
    transfers)
  • E2: A.* → V.135 (Windows DCE RPC exploit in
    payload)
  • E3: V.* → A.31373 (transfer a relatively large
    file via a random A port specified by the
    exploit)
  • E4: V.* → C.6668 (connect to an IRC server)
  • E5: V.* → V'.{2745, 135, 1025, 445, 3127, 6129,
    139, 5000} (V begins search for new infection
    targets, listens on 11759 for future egg
    downloads)

14
A Behavioral-based Approach
botHunter abstracts the infection lifecycle into
5 possible stages
[Figure: infection lifecycle dialog model. Labels:
A-2-V, V-2-A, V-2-C, V-2-, Type I, Type II]
  • Search for duplex communication sequences that
    are indicative of the
    infection-coordination-infection lifecycle
  • Under a weighted correlation scheme, external
    stimulus alone is not enough to declare a bot
  • Stimulus does not require strict ordering, but
    does require temporal locality

15
Botnets Architecture Overview
System Requirements: Snort 2.6.0; OS: Linux,
MacOS, Win, FreeBSD, Solaris; Java 1.4.2
[Figure: architecture diagram. Labels: Span Port to
Ethernet Device; Snort 2.6.0; SCADE (spp_scade.ch):
e1 Inbound Malware Scans, e5 Outbound Scans; SLADE:
e2 Payload Anomalies; Signature Engine: e2 Exploits,
e3 Egg Downloads, e4 C&C Traffic; botHunter
Correlator (Java 1.4.2); CTA Anonymizer Plugin]
  • bot Infection Profile
      • Confidence Score
      • Victim IP
      • Attacker IP List (by confidence)
      • Coordination Center IP (by confidence)
      • Full Evidence Trail: Sigs, Scores, Ports
      • Infection Time Range

16
botHunter Sensor Suite: SCADE
SCADE: ./snort-2.6.0/src/preprocessors/spp_scade.c
  • Custom malware-specific weighted scan detection
    system for inbound and outbound sources
  • Inbound (E1: Initial Scan Phase)
      • suspicious port scan: weighted TRW score
      • failed connection to vulnerable port: high
        weight
      • failed connection to other port: median weight
      • successful connection to vulnerable port: low
        weight
  • Outbound (E5: Victim Outbound Scan)
      • S1: scan rate of V over time t
      • S2: scan failed-connection rate of V over t
      • S3: scan target entropy (low revisit rate
        implies bot search) over t
      • A majority voting scheme combines the model
        assessments
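
The SCADE scoring ideas above can be sketched as follows. This is not the spp_scade.c implementation: the inbound score is a plain weighted sum rather than a true TRW sequential test, and every numeric weight and limit is an assumed placeholder, since the slide only ranks the weights as high, median, and low. Function and parameter names are hypothetical.

```python
import math
from collections import Counter

# Assumed numeric weights; the slide only says high / median / low.
WEIGHTS = {
    "failed_vulnerable": 3.0,
    "failed_other": 2.0,
    "success_vulnerable": 1.0,
}


def inbound_scan_score(events):
    """Sum the weight of each inbound connection event from one source."""
    return sum(WEIGHTS[kind] for kind in events)


def target_entropy(targets):
    """Shannon entropy (bits) of contacted targets; high means few revisits."""
    counts = Counter(targets)
    total = len(targets)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def outbound_scan_vote(targets, failures, window_seconds,
                       rate_limit=1.0, fail_limit=0.5, entropy_limit=3.0):
    """Majority vote over three models (S1 rate, S2 failures, S3 entropy).

    All three limits are illustrative assumptions.
    """
    if not targets:
        return False
    s1 = len(targets) / window_seconds > rate_limit
    s2 = failures / len(targets) > fail_limit
    s3 = target_entropy(targets) > entropy_limit
    return sum([s1, s2, s3]) >= 2


# Hypothetical victim contacting 30 distinct hosts in 10 seconds.
scanner_targets = [f"10.0.0.{i}" for i in range(30)]
is_scanner = outbound_scan_vote(scanner_targets, failures=28, window_seconds=10)
is_quiet = outbound_scan_vote(["10.0.0.1"] * 4, failures=0, window_seconds=60)
print(is_scanner, is_quiet)
```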

17
botHunter Sensor Suite: SLADE
SLADE: ./snort-2.6.0/src/preprocessors/spp_slade.c
  • Suspicious payload detection: modified PAYL
    3-gram byte distribution analyzer over a limited
    set of network services
  • Implements a lossy data structure to capture the
    3-gram hash space; default vector size 2048
    (versus the full n = 3 space, 256^3 = 2^24 = 16M)
  • Current SLADE port set: 21, 53, 80, 135, 1025,
    445 (TCP)
  • Auto-transition from train to detect mode
    enabled
  • Current status: in development to enable
    per-port auto-threshold selection
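
The lossy 3-gram structure can be pictured as a hashed histogram. The sketch below is an assumption-laden illustration, not spp_slade.c: it hashes each byte 3-gram into 2048 buckets with CRC32 (the actual hash is not given in the slides) and compares payloads with a simple L1 distance, which stands in for PAYL's own distance measure. All function names are hypothetical.

```python
import zlib

VECTOR_SIZE = 2048  # default vector size quoted on the slide


def ngram_vector(payload, n=3, size=VECTOR_SIZE):
    """Count byte n-grams in a fixed-size vector; hash collisions are accepted."""
    vector = [0] * size
    for i in range(len(payload) - n + 1):
        bucket = zlib.crc32(payload[i:i + n]) % size
        vector[bucket] += 1
    return vector


def normalize(vector):
    """Turn bucket counts into relative frequencies."""
    total = sum(vector)
    return [v / total for v in vector] if total else [0.0] * len(vector)


def train(payloads):
    """Build a profile as the mean of the normalized training vectors."""
    profile = [0.0] * VECTOR_SIZE
    for payload in payloads:
        for i, freq in enumerate(normalize(ngram_vector(payload))):
            profile[i] += freq / len(payloads)
    return profile


def l1_distance(profile, sample):
    """Anomaly score: L1 distance between profile and sample frequencies."""
    return sum(abs(p - s) for p, s in zip(profile, sample))


# Hypothetical training traffic and a NOP-sled-like test payload.
profile = train([b"GET /index.html HTTP/1.1", b"GET /about.html HTTP/1.1"])
normal_score = l1_distance(profile, normalize(ngram_vector(b"GET /index.html HTTP/1.1")))
odd_score = l1_distance(profile, normalize(ngram_vector(b"\x90" * 64)))
print(normal_score, odd_score)
```

The fixed 2048-entry vector trades precision for memory: collisions blur some 3-grams together, but the profile stays small enough to keep per port.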

18
botHunter Sensor Suite: Signature Engine
  • botHunter Signature Set: replaces all standard
    Snort rules with five custom rulesets,
    e1-5.rules
  • Scope: known worm/bot exploit general traffic
    signatures, shell/code/script exploits,
    update/download/registered rules, C&C command
    exchanges, outbound scans and malware exploits
  • Rule sources
      • Bleeding Edge malware rulesets
      • Snort Community Rules
      • Snort Registered Free Set
      • Cyber-TA custom bot-specific rules
  • Current set: 237 rules, operating on SRI/CSL and
    GA-Tech networks, with a relatively low false
    positive rate

19
botHunter - Correlation Framework
  • Characteristics of Bot Declarations
      • states are triggered in any order, but the
        pruning timer reinitializes row state once an
        InitTime Trigger is activated
      • external stimulus alone cannot trigger a bot
        alert
      • 2 x internal bot behavior triggers a bot alert
      • when a bot alert is declared, IP addresses are
        assigned responsibility based on raw
        contribution

Bot-State Correlation Data Structure
VictimIP | E1 | E2 | E3 | E4 | E5 | Score

Rows: valid internal Home_Net IPs
Columns: bot infection stages
Entry: IP addresses that contributed alerts to the
E-column
Score column: cumulative score per row
Threshold: (row_score > threshold) ? declare bot
InitTime Triggers: an event that initiates the
pruning timer
Pruning Timer: seconds remaining until a row is
reinitialized

Defaults
  E1 Inbound scan detected     weight .25
  E2 Inbound exploit detected  weight .25
  E3 Egg download detected     weight .50
  E4 C&C channel detected      weight .50
  E5 Outbound scan detected    weight .50
  Threshold                    1.0
  Pruning Interval             120 seconds
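
The bot-state table and default weights can be sketched as a small class. This is an illustration, not the botHunter correlator: it assumes each stage counts once per row, that the first alert in an empty row starts the pruning timer (the slide does not list which events are InitTime Triggers), and that reaching the threshold is enough to declare a bot. The slide writes the test as "greater than", but it also says two internal behaviors (.50 + .50) trigger an alert, so the sketch uses greater-or-equal. Class and method names are hypothetical.

```python
WEIGHTS = {"E1": 0.25, "E2": 0.25, "E3": 0.50, "E4": 0.50, "E5": 0.50}
THRESHOLD = 1.0
PRUNE_SECONDS = 120


class BotStateTable:
    """One row per internal victim IP, one column per infection stage."""

    def __init__(self):
        # victim_ip -> {"start": timestamp, "stages": {stage: set of IPs}}
        self.rows = {}

    def add_alert(self, victim_ip, stage, contributor_ip, now):
        """Record an alert; return True if the row now qualifies as a bot."""
        row = self.rows.get(victim_ip)
        if row is None or now - row["start"] > PRUNE_SECONDS:
            # Assumption: the first alert in a fresh row starts the timer.
            row = {"start": now, "stages": {}}
            self.rows[victim_ip] = row
        row["stages"].setdefault(stage, set()).add(contributor_ip)
        return self.score(victim_ip) >= THRESHOLD

    def score(self, victim_ip):
        """Cumulative row score; each triggered stage counts once."""
        return sum(WEIGHTS[stage] for stage in self.rows[victim_ip]["stages"])


# Replay of a Phatbot-like sequence with hypothetical timestamps (seconds).
table = BotStateTable()
victim = "192.168.166.40"
after_scan = table.add_alert(victim, "E1", "192.168.166.20", now=0.0)
after_exploit = table.add_alert(victim, "E2", "192.168.166.20", now=1.3)
after_cc = table.add_alert(victim, "E4", "192.168.166.10", now=18.0)
table.add_alert(victim, "E5", "192.168.166.20", now=83.5)
final_score = table.score(victim)
print(after_scan, after_exploit, after_cc, final_score)
```

With these defaults, the two external stages (E1 + E2 = .50) can never reach the threshold on their own, which matches the rule that external stimulus alone cannot trigger an alert.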
20
Implementation Status and Example Output

  ./Run_botHunter.csh c ./config/phatbot.config
  Starting program...
  Score: 1.5 (> 1.0)
  Infect Target: 192.168.166.40
  Infector List: 192.168.166.20
  C&C List: 192.168.166.10 (25), 192.168.166.20 (3)
  Start: 06/22/2006 16:42:23.33 PDT
  Report End: 06/22/2006 16:44:38.54 PDT

  INBOUND SCAN
    192.168.166.20 (16:42:23 PDT)
      E1 scade detected host 192.168.166.40 scanned
         by 192.168.166.20 at ports 2745 3127 6129
  EXPLOIT
    192.168.166.20 (2) (16:42:24.67 PDT)
      E2 SHELLCODE x86 NOOP 135<-4819
         (16:42:24.67 PDT)
      E2 SHELLCODE x86 0x90 unicode NOOP 135<-4819
  EGG DOWNLOAD
  C and C TRAFFIC
    192.168.166.10 (25) (16:42:41.34 PDT-16:43:31.20 PDT)
      E4 COMMUNITY BOT Internal IRC server detected
      E4 BLEEDING-EDGE TROJAN BOT - potential
         scan/exploit command 1037<-6668
      E4 COMMUNITY BOT GTBot scan command 1037<-6668
  OUTBOUND SCAN
    192.168.166.20 (16:43:46.85 PDT)
      E5 scade detected suspicious scanner
         192.168.166.40 scanning 30 IPs at ports
         0 2745

Example: VMWare Phatbot Experiment
  Coordination Center:  192.168.166.10
  Initial Bot Infector: 192.168.166.20
  Victim System:        192.168.166.40
21
botHunter - born a Cyber-TA plugin
[Figure: Cyber-TA integration diagram. Components:
Snort Alerts; botHunter Correlator (Java 1.4.2);
CTA Anonymizer Plugin; CTA Anonymizer;
AnonymizationService; MIXNET Deliver Daemon;
Cyber-TA RDBMS Manager; Bot Profile Repository;
Cyber-TA Threat Ops Center. Transport labels:
Delivery Ack, TLS Session, TOR Circuit, TCP/IP]
22
  • END