Title: NetViewer: A Network Traffic Visualization and Analysis Tool
1NetViewer: A Network Traffic Visualization and Analysis Tool
- Seong Soo Kim
- L. Narasimha Reddy
- Electrical and Computer Engineering
- Texas A&M University
2Contents
- Introduction and Motivation
- Our Approach
- NetViewer's Architecture
- NetViewer's Functionality
- Evaluation of NetViewer
- Conclusion
3Attack/ Anomaly
- Single attacker (DoS)
- Multiple Attackers (DDoS)
- Multiple Victims (Worms, viruses)
- Aggregate Packet header data as signals
- Image based anomaly/attack detectors
4Motivation (1)
- Previous studies looked at the behavior of individual flows
- These become ineffective with DDoS → Aggregate analysis
- Link speeds are increasing
- Currently at Gb/s, soon to be at 10-100 Gb/s
- Need simple, effective mechanisms
- Packet inspection can't be expensive
- Can we make them simple enough to implement them at line speeds?
5Motivation (2)
- Signature (rule)-based approaches are tailored to known attacks
- Become ineffective when traffic patterns or attacks change
- New threats are constantly emerging
- Quick identification of network anomalies is necessary to contain threats
- Can we design general mechanisms for attack detection that work in real time?
6Our Approach (1)
- Look at aggregate information of traffic
- Collect data over a large duration (order of seconds)
- Can be higher if necessary
- Use sampling to reduce the cost of processing
- Process aggregate data to detect anomalies
- Individual flows may look normal → look at the aggregate picture
7Our Approach (2) - Environment
8NetViewer's Architecture
- Packet Parser: Collects and filters raw packets and traffic data from packet header traces or NetFlow records.
- Signal Computing Engine: Analyzes the statistical properties of aggregate traffic distributions.
- Detection Engine: Sets thresholds through statistical measures of the traffic signal.
- Visualization Engine: Applies image processing and displays traffic signals and images.
- Alerting Engine: Detects and identifies attacks and anomalies in real time.
9Packet Parser (1)
- Packet headers carry a rich set of information
- Data: Packet counts, byte counts, the number of flows
- Domain: Source/destination address, source/destination port numbers, protocol numbers
- Processing traffic headers poses challenges.
- Discrete spaces
- Large domains
- 2^32 IPv4 addresses
- 2^16 port numbers
- Need mechanisms to reduce the domain size
- Need mechanisms to generate useful signals
10Packet Parser (2): Data structure for reducing domain size
- 2-dimensional array count[i][j]
- To record the packet count for the address j in the ith field of the IP address
- Normalized packet counts
- Effects
- Constant, small memory regardless of the number of packets: 2^32 (4G) → 4×256 (1K)
- Running time: O(n) to O(lg n)
- Somewhat reversible hash function
11Packet Parser (3): Data structure for reducing domain size
- Simple example
- IP of Flow1 = 165.91.212.255, Packet1 = 3
- IP of Flow2 = 64.58.179.230, Packet2 = 2
- IP of Flow3 = 216.239.51.100, Packet3 = 1
- IP of Flow4 = 211.40.179.102, Packet4 = 10
- IP of Flow5 = 203.255.98.2, Packet5 = 2
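The byte-wise counting scheme can be traced with the five example flows. This is a minimal sketch, not NetViewer's code: it assumes every packet increments one counter per address byte, and that normalization divides each counter by the total for its byte position. The function names `build_counts` and `normalize` are hypothetical.

```python
def build_counts(flows):
    """Accumulate count[i][j]: packets whose i-th address byte equals j."""
    count = [[0] * 256 for _ in range(4)]  # 4 x 256 counters, fixed size
    for ip, packets in flows:
        for i, byte in enumerate(int(part) for part in ip.split(".")):
            count[i][byte] += packets
    return count


def normalize(count):
    """Assumed normalization: divide each row by its total packet count."""
    result = []
    for row in count:
        total = sum(row)
        result.append([c / total if total else 0.0 for c in row])
    return result


# The five flows from the slide: (address, packet count).
flows = [
    ("165.91.212.255", 3),
    ("64.58.179.230", 2),
    ("216.239.51.100", 1),
    ("211.40.179.102", 10),
    ("203.255.98.2", 2),
]

count = build_counts(flows)
p = normalize(count)
print(count[2][179])        # 12: Flow2 and Flow4 share third byte 179
print(sum(count[0]))        # 18: every packet is counted once per byte position
print(round(p[0][211], 3))  # 0.556: Flow4 carries 10 of the 18 packets
```

Memory stays at 4×256 counters however many flows arrive, which is the point of the reduction.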
13Signal Computing Engine
- Correlation
- Measures the strength of the linear relationship between adjacent sampling instants
- Delta
- The difference in traffic intensity between adjacent sampling instants
- It is pronounced at the instants when attacks begin and end
- Scene change analysis
- Variance of pixel intensities in the image
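The two signals can be sketched as follows. The slide's formulas did not survive extraction, so this assumes the correlation is the Pearson coefficient between the distributions at two adjacent sampling instants and that delta is the element-wise difference; both function names are hypothetical.

```python
import math


def correlation(prev, curr):
    """Pearson correlation between two equally long distributions."""
    n = len(prev)
    mean_p = sum(prev) / n
    mean_c = sum(curr) / n
    cov = sum((a - mean_p) * (b - mean_c) for a, b in zip(prev, curr))
    var_p = sum((a - mean_p) ** 2 for a in prev)
    var_c = sum((b - mean_c) ** 2 for b in curr)
    if var_p == 0 or var_c == 0:
        return 0.0  # undefined for a flat distribution; 0 is chosen here
    return cov / math.sqrt(var_p * var_c)


def delta(prev, curr):
    """Element-wise change in traffic intensity between adjacent samples."""
    return [b - a for a, b in zip(prev, curr)]


# Toy distributions over a four-value domain (illustrative numbers only).
steady = [0.4, 0.3, 0.2, 0.1]
attack = [0.05, 0.05, 0.85, 0.05]  # traffic suddenly concentrates on one value

print(correlation(steady, steady))  # close to 1: nothing changed
print(correlation(steady, attack))  # negative: the distribution shifted
print(delta(steady, attack))        # large jump at index 2
```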
14Detecting Engine: Threshold setting
- From generated distribution signals (Ss), derive statistical thresholds
- High threshold T_H: Traffic distribution less correlated than usual
- Low threshold T_L: Traffic distribution more uniform than usual
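One way to derive such thresholds is sketched below. It assumes T_L and T_H sit at the mean minus and plus three standard deviations of recent signal values, following the 3σ level used in the evaluation slides; the exact rule NetViewer applies is not given here, and the names are hypothetical.

```python
import statistics


def thresholds(signal_history, k=3.0):
    """Return (T_L, T_H) as mean -/+ k standard deviations (assumed rule)."""
    mu = statistics.mean(signal_history)
    sigma = statistics.pstdev(signal_history)
    return mu - k * sigma, mu + k * sigma


def classify(value, t_low, t_high):
    """Label a new signal value relative to the two thresholds."""
    if value > t_high:
        return "above T_H"
    if value < t_low:
        return "below T_L"
    return "normal"


history = [10.0, 11.0, 9.0, 10.5, 9.5]  # illustrative signal values
t_low, t_high = thresholds(history)
print(classify(10.2, t_low, t_high))  # normal
print(classify(15.0, t_low, t_high))  # above T_H
print(classify(5.0, t_low, t_high))   # below T_L
```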
15Visualization Engine
- Treat the traffic data as images
- Apply image processing based analysis
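Treating traffic as an image can be sketched as below. The layout is an assumption: rows index a source-address byte, columns index a destination-address byte, and pixel intensity is the packet count scaled to 0..255. The variance helper corresponds to the scene change measure named earlier; all function names are hypothetical.

```python
def make_image(packets, size=256):
    """Build a size x size grid: rows = source byte, columns = destination byte."""
    image = [[0] * size for _ in range(size)]
    for src_byte, dst_byte, n in packets:
        image[src_byte][dst_byte] += n
    return image


def to_grayscale(image):
    """Scale counts to 0..255 intensities relative to the largest count."""
    peak = max(max(row) for row in image)
    if peak == 0:
        return [[0] * len(row) for row in image]
    return [[round(255 * c / peak) for c in row] for row in image]


def pixel_variance(image):
    """Variance of pixel intensities over the whole image."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


# (source byte, destination byte, packet count); illustrative values only.
packets = [(165, 64, 2), (211, 64, 8), (203, 216, 6)]
gray = to_grayscale(make_image(packets))
print(gray[211][64], gray[165][64], gray[203][216])  # 255 64 191
print(pixel_variance(gray) > 0)                      # True
```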
16Image Generation
18Generated various traffic Images
- Image reveals the characteristics of traffic
- Normal behavior mode
- A single target (DoS)
- Semi-random target: a subnet is fixed and the other portion of the address changes (prefix-based attacks)
- Random target
- Horizontal scan (worm) and vertical scan (DDoS)
19Alerting Engine
- Scrutinize the statistical quantities: correlation and delta
- Identify the IP addresses of suspicious attackers and victims
- Lead to some form of a detection signal
- Generate the detection report
20NetViewer's Functionality
- Traffic Profiling
- General information of current network traffic
- Monitoring
- Monitor traffic distribution signal (Ss) over the latest time-window
- Anomaly Reporting
- Image-based traffic in the source/destination IP address domain and the 2-dimensional domain
- Auxiliary Function
- Multidimensional Image
- Attack Tracking
- Automatic Spoofed Address Masking
21Traffic Profiling Function (1)
22Traffic Profiling Function (2)
- Understanding the general nature of the traffic at the monitoring point
- Bandwidth: in Kbps and Kpps (packets per sec.)
- Protocol: the proportion occupied by each traffic protocol in percent
- Top 5 flows: the topmost 5 flows in packet count or byte count or flow number
- Based on an LRU (Least Recently Used) policy cache
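A top-flows table backed by an LRU cache might look like the sketch below. The class name `FlowCache`, its capacity, and the flow keys are hypothetical; the slides state only that the top 5 list relies on an LRU policy cache.

```python
from collections import OrderedDict


class FlowCache:
    """Bounded flow table that evicts the least recently used flow."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.flows = OrderedDict()  # flow key -> packet count

    def record(self, flow, packets=1):
        if flow in self.flows:
            self.flows[flow] += packets
            self.flows.move_to_end(flow)        # mark as most recently used
        else:
            if len(self.flows) >= self.capacity:
                self.flows.popitem(last=False)  # evict least recently used
            self.flows[flow] = packets

    def top(self, n=5):
        """Return the n flows with the highest packet counts."""
        ranked = sorted(self.flows.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:n]


cache = FlowCache(capacity=3)
cache.record("a", 5)
cache.record("b", 1)
cache.record("c", 2)
cache.record("a", 1)  # "a" becomes most recently used
cache.record("d", 7)  # table is full, so "b" (least recently used) is evicted
print(cache.top(2))   # [('d', 7), ('a', 6)]
```

The cache bounds memory, at the cost of forgetting a flow that goes quiet and then resumes.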
23Monitoring Function (1)
24Monitoring Function (2)
- Traffic distribution signal (Ss) over the latest time-window
- 3 kinds of selected signals: Ss of packet count, Ss of byte count, Ss of flow count
- Source IP: packet count distribution signal in the source IP address domain
- Source FLOW: the number-of-flows distribution signal in the source IP address domain
- Source PORT: packet count distribution signal in the source port domain
- MULTIDIMENSIONAL: multiple components of the above signals in the source domain
- Pr: the anomalous probability of current traffic under a Gaussian distribution
- Signal: the computed distribution signal
- Illustrated with dotted vertical lines at the 3σ level
- μ and σ: mean value and standard deviation of the distribution signal using EWMA
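The EWMA statistics and a Gaussian probability can be sketched as follows. The smoothing factor, the variance update, and the two-sided tail probability are assumptions; the slides do not say how NetViewer maps the Gaussian model to its displayed Pr.

```python
import math


class EwmaTracker:
    """Exponentially weighted moving mean and variance of a signal."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha  # assumed smoothing factor
        self.mean = None
        self.var = 0.0

    def update(self, x):
        if self.mean is None:
            self.mean = x  # first sample initializes the mean
            return
        diff = x - self.mean
        self.mean += self.alpha * diff
        self.var = (1 - self.alpha) * (self.var + self.alpha * diff * diff)

    @property
    def std(self):
        return math.sqrt(self.var)


def tail_probability(x, mean, std):
    """Two-sided Gaussian probability of a deviation at least as large as x's."""
    if std == 0:
        return 1.0 if x == mean else 0.0
    z = abs(x - mean) / std
    return math.erfc(z / math.sqrt(2))


tracker = EwmaTracker()
for sample in [10.0, 10.4, 9.7, 10.1, 9.9, 10.2]:  # illustrative values
    tracker.update(sample)
print(tracker.mean, tracker.std)
print(tail_probability(14.0, tracker.mean, tracker.std))  # small: likely anomalous
print(round(tail_probability(3.0, 0.0, 1.0), 4))          # 0.0027 at the 3-sigma level
```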
25Anomaly Reporting Function (1)
26Anomaly Reporting Function (2): normal network traffic
- Use variance of pixel intensities
- Distribution of traffic over the observed domain
- During anomalies, the traffic distributions differ from normal traffic
- Higher correlation (DoS)
- Lower correlation (worms)
27Anomaly Reporting Function (3): semi-random targeted attacks
28Anomaly Reporting Function (4): random targeted attacks
- Worm propagation type attack
- DDoS propagation type attack
29Anomaly Reporting Function (5): complicated attacks
- Complicated and mixed attack pattern
- The horizontal (dotted or solid) line → a specific source scanning destination addresses.
- The vertical line → random sources assailing a specific destination
30Anomaly Reporting Function (6): Summary of visual representation of traffic
- Worm attacks: horizontal line in 2D image
- DDoS attacks: vertical line in 2D image
- Line detection algorithm
- Visual images look different in different traffic modes
- Motion prediction can lead to attack prediction
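A very simple line detector illustrates the idea. It assumes rows index source addresses and columns index destination addresses, and flags any row or column whose fraction of non-zero pixels reaches a hypothetical `min_fill` threshold. The slides do not specify NetViewer's actual line detection algorithm.

```python
def find_lines(image, min_fill=0.5):
    """Return (horizontal_rows, vertical_columns) that look like lines."""
    rows, cols = len(image), len(image[0])
    horizontal = [
        r for r in range(rows)
        if sum(1 for v in image[r] if v > 0) >= min_fill * cols
    ]
    vertical = [
        c for c in range(cols)
        if sum(1 for r in range(rows) if image[r][c] > 0) >= min_fill * rows
    ]
    return horizontal, vertical


# 8 x 8 toy image: source 2 scans every destination (worm-like),
# and every source hits destination 5 (DDoS-like).
image = [[0] * 8 for _ in range(8)]
for c in range(8):
    image[2][c] = 1
for r in range(8):
    image[r][5] = 1

print(find_lines(image))  # ([2], [5])
```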
31Anomaly Reporting Function (7)
32Anomaly Reporting Function (7)- Identification
Time: Tue 10-14-2003 05:12:00
--------------------------------------------------------------
Source IP1  134.  correlation 17.48  possession 18.77  delta 2.50  S
Source IP1  141.  correlation  4.33  possession  3.94  delta 0.79  S
Source IP1  155.  correlation 58.20  possession 56.80  delta 2.84  S
Source IP1  210.  correlation  5.66  possession  6.51  delta 1.60  S
Source IP2   75.  correlation 17.47  possession 18.77  delta 2.51  S
Source IP2  110.  correlation  4.62  possession  5.25  delta 1.21  S
Source IP2  223.  correlation  4.31  possession  3.94  delta 0.78  S
Source IP2  230.  correlation 58.21  possession 56.84  delta 2.76  S
Source IP3    7.  correlation 15.59  possession 17.02  delta 2.74  S
Source IP3   14.  correlation 53.99  possession 52.31  delta 3.41  S
Source IP4   41   correlation 15.16  possession 16.36  delta 2.30  S
Source IP4   50   correlation 52.58  possession 50.83  delta 3.54  S
--------------------------------------------------------------
Identified No.: 1st 4, 2nd 4, 3rd 2, 4th 2

Destination IP1   18.  correlation  4.37  possession  3.88  delta 1.01  S
Destination IP1  128.  correlation  6.08  possession  7.01  delta 1.75  S
Destination IP1  131.  correlation 53.65  possession 52.33  delta 2.67  S
Destination IP2  181.  correlation 56.03  possession 54.00  delta 4.15  S
Destination IP4   26   correlation  3.89  possession  3.58  delta 0.65  S
--------------------------------------------------------------
Identified No.: 1st 3, 2nd 1, 3rd 0, 4th 1

Identified Suspicious Source IP address(es):
134.75.7.41      correlation 17.48  possession 18.77  delta 2.50  S
141.223.xxx.xxx  correlation  4.33  possession  3.94  delta 0.79  S
155.230.14.50    correlation 58.20  possession 56.80  delta 2.84  S
210.xxx.xxx.xxx  correlation  5.66  possession  6.51  delta 1.60  S
-------------------------
Identified Suspicious Destination IP address(es):
18.xxx.xxx.xxx   correlation  4.37  possession  3.88  delta 1.01
128.xxx.xxx.xxx  correlation  6.08  possession  7.01  delta 1.75  S
131.181.xxx.xxx  correlation 53.65  possession 52.33  delta 2.67

The detection report of anomaly identification.
- Identify IP using statistical measures
- Black list
33Flow-based Network Traffic
- Visual representation based on the number of flows
- The number of flows in the address domain.
- The black lines illustrate more concentrated traffic intensity.
- This analysis is effective for revealing flood types of attacks.
34Port-based Network Traffic
- Port number based visual representation
- Normalized packet counts in port-number domain.
- This analysis is effective for revealing portscan types of attacks.
- Attack traffic: SQL Slammer worm
- Port 0d1434 = 0x059A → bytes 0d5 and 0d154
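The port decomposition above can be checked with a few lines. The sketch assumes the port-number domain is reduced the same way as addresses, by splitting the 16-bit port into a high and a low byte; the function name `port_bytes` is hypothetical.

```python
def port_bytes(port):
    """Split a 16-bit port number into (high byte, low byte)."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port must fit in 16 bits")
    return port >> 8, port & 0xFF


high, low = port_bytes(1434)  # SQL Slammer's port
print(hex(1434), high, low)   # 0x59a 5 154
```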
35Multidimensional Visualization
- Study multi-dimensional signals in IP address
- i) packet counts → R
- ii) number of flows → G
- iii) the correlation of packet counts → B
- Comprehensive characteristics.
- Diverse analysis.
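Mapping three signals onto colour channels can be sketched as below. The sketch assumes each signal is non-negative and is scaled independently to 0..255 before being combined into (R, G, B) pixels; the function names and sample values are hypothetical.

```python
def scale(values):
    """Scale non-negative values to 0..255 relative to their maximum."""
    peak = max(values)
    if peak == 0:
        return [0] * len(values)
    return [round(255 * v / peak) for v in values]


def compose_rgb(packet_counts, flow_counts, correlations):
    """Combine three per-address signals into one (R, G, B) pixel each."""
    return list(zip(scale(packet_counts), scale(flow_counts), scale(correlations)))


pixels = compose_rgb(
    packet_counts=[0, 10, 40],    # goes to the red channel
    flow_counts=[0, 5, 5],        # goes to the green channel
    correlations=[0.0, 0.2, 0.8], # goes to the blue channel
)
print(pixels)  # [(0, 0, 0), (64, 255, 64), (255, 255, 255)]
```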
36Evaluation in Address-based signals
Time       D.        TP β [1]        FP α [2]        NP β [3]  NP α [4]  LR [5]          NLR [6]
Real-time  SA        81.5 (637/782)  0.06 (2/3563)   76.3      0.15      1451.2 / 508.7  0.19 / 0.24
           DA        87.1 (681/782)  0.42 (15/3563)  88.4      0.15      206.9 / 589.3   0.13 / 0.12
           (SA, DA)  94.2 (737/782)  0.48 (17/3563)  _         _         197.5           0.06
1. True positive rate by 3σ: the number of detections / the number of anomalies.
2. False positive rate by 3σ.
3. Expected true positive rate by NP test.
4. Expected false positive rate by NP test.
5. Likelihood ratio in measurement by 3σ / LR in NP test.
6. Negative likelihood ratio by 3σ / NLR in NP test.
- The NP test shows slightly higher performance than 3σ
- 2-dimensional is better than 1-dimensional.
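The LR and NLR columns follow from the raw counts. The sketch below assumes the standard definitions LR = TPR / FPR and NLR = (1 - TPR) / (1 - FPR), which reproduce the 3σ figures in the SA row; the function name is hypothetical.

```python
def likelihood_ratios(tp, positives, fp, negatives):
    """Return (TPR, FPR, LR, NLR) from raw detection counts."""
    tpr = tp / positives
    fpr = fp / negatives
    lr = tpr / fpr
    nlr = (1 - tpr) / (1 - fpr)
    return tpr, fpr, lr, nlr


# SA row: 637 of 782 anomalies detected, 2 false alarms in 3563 normal samples.
tpr, fpr, lr, nlr = likelihood_ratios(637, 782, 2, 3563)
print(f"TP {tpr:.1%}  FP {fpr:.2%}  LR {lr:.1f}  NLR {nlr:.2f}")
# TP 81.5%  FP 0.06%  LR 1451.2  NLR 0.19
```

A higher LR means an alarm is stronger evidence of an anomaly; a lower NLR means a quiet detector is stronger evidence of normal traffic.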
37Port-based signals
Time       D.        TP β            FP α           NP β  NP α  LR              NLR
Real-time  SP        83.4 (652/782)  0.14 (5/3563)  94.9  0.07  594.1 / 1428.8  0.17 / 0.05
           DP        96.2 (752/782)  0.17 (6/3563)  90.5  0.14  571.1 / 630.4   0.04 / 0.09
           (SP, DP)  96.8 (757/782)  0.25 (9/3563)  _     _     383.2           0.03
- Port-based signal could be a powerful signal
- Particularly useful for probing/scanning attacks
38Multidimensional signals
Time         D.      TP β            FP α            LR     NLR
Real-time    (S, D)  97.1 (759/782)  0.62 (22/3563)  157.2  0.03
Post mortem  (S, D)  97.4 (762/782)  0.34 (12/3563)  289.3  0.03
- Combined with three distinct image-based signals: address-based, flow-based and port-based
- Improve the detection rates considerably
- It is possible to detect complicated attacks using various signals
39Attack Tracking - Motion prediction
40Automatic Spoofed address Masking
- Addresses unassigned by IANA, especially in the 1st byte
- Blue-colored polygons indicate the reserved IP addresses; there should be no pixels matching the unassigned space
- Destination IP: normal traffic
- Source IP: SQL Slammer using (randomly) address-spoofed traffic
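The masking check can be sketched as a first-byte lookup. The set below is a placeholder, not the IANA unassigned list that NetViewer uses: it contains only 0 ("this network"), 127 (loopback) and 224-255 (multicast and reserved), which should not appear as source addresses on a public link. The names are hypothetical.

```python
# Placeholder first-byte values; a real deployment would load the current
# IANA allocation table instead.
MASKED_FIRST_BYTES = {0, 127} | set(range(224, 256))


def spoofed_candidates(first_byte_counts):
    """Return the masked first bytes that nevertheless carry traffic."""
    return {
        byte: packets
        for byte, packets in first_byte_counts.items()
        if packets > 0 and byte in MASKED_FIRST_BYTES
    }


# Source first byte -> packet count (illustrative values only).
observed = {127: 4, 165: 9, 240: 1, 0: 0}
print(spoofed_candidates(observed))  # {127: 4, 240: 1}
```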
41Comparison with IDS
- An intrusion detection system (IDS) is signature-based, whereas our approach is measurement-based.
- Compares with predefined rules
- Needs to be updated with the latest rules.
- Snort as representative IDS.
- Both show similar detection on the TAMU trace.
- Snort is superior in identification
- But missed heavy traffic sources and new patterns
- Required more processing time.
42Advantages
- Not looking for specific known attacks
- Generic mechanism
- Works in real-time
- Latencies of a few samples
- Simple enough to be implemented inline
- Windows and Unix versions are released at http://dropzone.tamu.edu/skim/netviewer.html
- Comments to seongsoo1.kim_at_samsung.com or reddy_at_ece.tamu.edu
43Conclusion
- We studied the feasibility of analyzing packet header data as images for detecting traffic anomalies.
- We evaluated the effectiveness of our approach for real-time modes by employing network traffic.
- Real-time traffic analysis and monitoring is feasible
- Simple enough to be implemented inline
- Can rely on many tools from the image processing area
- More robust offline analysis possible
- Concise for logging and playback
45Identification (2) Entire IP address level
- Step 1: Employ 4 independent hash functions as a Bloom filter, h1(am), h2(am), h3(am), h4(am).
- Step 2: Concatenation of suspicious IP bytes using ε-vicinity.
- Continue to the 4th byte.
- Step 3: Membership query of the generated 4-byte IP address
- Automatic containment for identified attacks
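The three steps can be sketched as below. Several simplifications are assumed: the four hash functions are stood in for by salted SHA-256 digests, the filter size is arbitrary, and step 2 enumerates every combination of suspicious bytes instead of grouping them by ε-vicinity. The class and variable names are hypothetical.

```python
import hashlib
from itertools import product


class BloomFilter:
    """Bit-array membership filter with several hash positions per item."""

    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Salting one hash with k stands in for 4 independent hash functions.
        for k in range(self.num_hashes):
            digest = hashlib.sha256(f"{k}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Step 1: record the addresses seen in the traffic.
seen = BloomFilter()
for address in ("134.75.7.41", "155.230.14.50"):
    seen.add(address)

# Step 2: build candidate addresses from the suspicious bytes at each position.
suspicious_bytes = [(134, 155), (75, 230), (7, 14), (41, 50)]
candidates = (".".join(map(str, combo)) for combo in product(*suspicious_bytes))

# Step 3: keep the candidates that pass the membership query.
confirmed = [c for c in candidates if c in seen]
print(confirmed)
```

A Bloom filter never misses an address that was added, but it can occasionally accept one that was not, so confirmed candidates are suspects, not proof.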
46Processing and memory complexity
- Two samples of packet header data: 2P, where P is the size of the sample data
- Summary information (DCT coefficients etc.) over samples: S
- Total space requirement: O(P+S)
- P is 2^32 → 4×256 = 1024 (1D), 2^64 → 256K (2D)
- S is 32×32 → 16
- Memory requires 258K
- Processing: O(P+S)
- Update 4 counters per domain
- Per-packet data-plane cost is low.