Network Security Monitoring and Analysis based on Big Data Technologies - PowerPoint PPT Presentation

About This Presentation
Title:

Network Security Monitoring and Analysis based on Big Data Technologies

Description:

Title: A Supervised Machine Learning Approach to Classify Host Roles On Line Using sFlow
Author: Bingdong Li
Last modified by: Mehmet H Gunes

Slides: 52
Provided by: Bingd
Learn more at: https://www.cse.unr.edu

Transcript and Presenter's Notes

1
Network Security Monitoring and Analysis based
on Big Data Technologies
  • Bingdong Li

August 26, 2013
2
Outline
  • Motivation
  • Objectives
  • System Design
  • Monitoring and Visualization
  • Network Measurement
  • Classification and Identification of Network
    Objects
  • Conclusion
  • Future Work

3
Motivation
  • Traditional security systems assume a static
    system
  • Network attacks
  • sophisticated
  • organized
  • targeted
  • persistent
  • dynamic
  • external
  • internal

4
(No Transcript)
5
Motivation
Cybersecurity, Big Data, Machine Learning
6
Motivation
  • Problem: Network security is becoming more
    challenging
  • Resource: A large amount of security data
  • Network flow
  • Firewall log
  • Application log
  • Server log
  • SNMP
  • Opportunity: Big Data technologies, machine
    learning

7
Objectives
  • A network security monitoring and analysis system
    based on Big Data technologies to provide
  • Network measurement
  • Real-time continuous monitoring and interactive
    visualization
  • Intelligent network object classification and
    identification based on role behavior as context

8
Objectives
Network Security
Big Data
Machine Learning
9
System Design
  • Components
  • Data collection
  • Data storage
  • Security gateway
  • Data processing
  • User Interfaces
  • Features
  • Monitoring and visualization
  • Measurement
  • Intelligent analysis

10
(No Transcript)
11
System Design
  • Data Collection

12
System Design
  • Online Real Time Process

13
System Design
  • NoSQL Storage

14
System Design
  • User Interfaces

15
(No Transcript)
16
System Design
  • The design supports these features
  • Real Time Continuous Monitoring and Interactive
    Visualization
  • Network Measurement
  • Classification and Identification of Network
    Objects

17
Monitoring and Visualization
  • Real Time
  • response within a time constraint
  • Interactive
  • involve user interaction
  • Continuously
  • continue to be effective over time in light
    of the inevitable changes that occur
  • (NIST)

18
Monitoring and Visualization
  • Retrieve Data
  • Web User Interfaces
  • Video Demo

19
Monitoring and Visualization
  • Data Retrieving
  • Data are stored with the IP address as the primary
    key and the time slice as the secondary key in the column
  • Accessing these data takes constant time, O(1)
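
The two-level key layout described on this slide can be sketched with nested hash maps. This is only an illustration of the lookup pattern, not the NoSQL store used in the system: the class name `FlowStore`, the 60-second slice width, and the record contents are all assumptions made for the example.

```python
from collections import defaultdict


class FlowStore:
    """Toy in-memory store keyed by IP address, then by time slice."""

    def __init__(self, slice_seconds=60):
        # Width of one time slice in seconds (assumed value).
        self.slice_seconds = slice_seconds
        # ip -> {time_slice -> record}
        self.rows = defaultdict(dict)

    def time_slice(self, timestamp):
        # Map a Unix timestamp to the index of the slice that contains it.
        return int(timestamp) // self.slice_seconds

    def put(self, ip, timestamp, record):
        self.rows[ip][self.time_slice(timestamp)] = record

    def get(self, ip, timestamp):
        # Two hash lookups, independent of how many hosts or slices are stored.
        return self.rows.get(ip, {}).get(self.time_slice(timestamp))


store = FlowStore()
store.put("10.0.0.1", 125, {"packets": 42})
record = store.get("10.0.0.1", 130)   # same 60-second slice as timestamp 125
missing = store.get("10.0.0.1", 59)   # an earlier slice with no data
```

Because both keys are resolved by hashing, a query for one host in one time slice does not scan other hosts or other slices.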

20
Real Time Querying
21
Host Network Connection
22
Network Status
23
Top N
24
Demo of Interactivity and Continuity
Video Demo
25
Network Measurement
  • A case study
  • The Anonymity Technology Usage on Campus Network
  • Using sFlow
  • Geo-Location
  • Usage of Anonymity Systems

26
Geo-location of Anonymity Usage on Campus
  • One instance: Bahamas, Belarus, Belgium,
    Bulgaria, Cambodia, Chile, Colombia, Estonia,
    Ghana, Greece, Hungary, Ireland, Israel, Jamaica,
    Jordan, Korea, Mongolia, Namibia, Nigeria,
    Pakistan, Panama, Philippines, Slovakia, Turkey,
    Ukraine, Vietnam, Zimbabwe
  • Two instances: Chad, Czech Republic, Denmark,
    Hong Kong, Iran, Japan, Kazakhstan, Poland,
    Romania, Spain, Switzerland
  • Three instances: Austria, France, Singapore
  • Four instances: Australia, Indonesia, Taiwan,
    Thailand

27
Usage of Anonymity Systems
System       Packets (%)      Traffic in MB (%)   Observed IPs (%)
Proxies      5,580 (62.65)    8.13 (43.53)        234 (3.23)
Tor          3,129 (35.13)    9.04 (48.37)        152 (0.25)
I2P          190 (2.13)       1.50 (8.02)         23 (1.01)
Commercial   7 (0.08)         0.016 (0.08)        2 (N/A)
Total        8,906 (100)      16.69 (100)         411 (N/A)
28
Classification of Host Roles
  • Data: three months of sFlow data from a large campus

Role Count
Client 5494
Server 1920
Public Place 784
Personal Office 416
College1 163
College2 253
Web Server 56
Web Email Server 25
29
Classification of Host Roles
  • Algorithms
  • Decision Tree
  • On-line SVM
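
The slides do not say which on-line SVM variant was used. As one possible reading, the sketch below trains a linear SVM one example at a time with hinge-loss stochastic gradient descent. The learning rate, the regularization strength, the two features, and the toy data are assumptions for illustration only.

```python
def train_online_svm(stream, num_features, learning_rate=0.1, reg=0.01):
    """Linear SVM updated one example at a time (hinge-loss SGD)."""
    weights = [0.0] * num_features
    bias = 0.0
    for features, label in stream:  # label is +1 or -1
        score = sum(w * x for w, x in zip(weights, features)) + bias
        margin = label * score
        # L2 regularization: shrink the weights slightly on every step.
        weights = [w * (1 - learning_rate * reg) for w in weights]
        if margin < 1:
            # The example violates the margin, so move the boundary away from it.
            weights = [w + learning_rate * label * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * label
    return weights, bias


def predict(weights, bias, features):
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score >= 0 else -1


# Hypothetical two-feature hosts, already scaled to [0, 1]: +1 = server, -1 = client.
hosts = [
    ([0.9, 0.1], 1),
    ([0.8, 0.2], 1),
    ([0.1, 0.9], -1),
    ([0.2, 0.8], -1),
]
# Replaying the four examples 50 times stands in for a longer sample stream.
weights, bias = train_online_svm(hosts * 50, num_features=2)
predictions = [predict(weights, bias, features) for features, _ in hosts]
```

The model is a weight vector and a bias, so each new sample costs a single pass over the features and no past samples need to be stored.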

30
Classification of Host Roles
  • Features
  • Ad hoc based on domain knowledge
  • Aggregating features for on-line classification
  • 24 features normalized between 0 and 1, inclusive

31
Classification of Host Roles
  • Features
  • 24 features derived from
  • src/dest IP address
  • src/dest Port number
  • TTL
  • Packet size
  • Transport protocol
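
The 24 features themselves are not listed in the slides. The sketch below only illustrates the general idea of aggregating sampled packet headers per host into values between 0 and 1. The four features shown, the field names, and the 1500-byte scaling constant are hypothetical.

```python
from collections import defaultdict

MAX_PACKET_SIZE = 1500  # assumed upper bound in bytes, used only for scaling


def host_features(samples):
    """Aggregate packet samples per source host into features in [0, 1]."""
    totals = defaultdict(lambda: {"packets": 0, "system_port": 0, "tcp": 0,
                                  "bytes": 0, "peers": set()})
    for sample in samples:
        host = totals[sample["src_ip"]]
        host["packets"] += 1
        host["system_port"] += sample["src_port"] < 1024
        host["tcp"] += sample["protocol"] == "tcp"
        host["bytes"] += sample["size"]
        host["peers"].add(sample["dst_ip"])

    features = {}
    for ip, host in totals.items():
        n = host["packets"]
        features[ip] = [
            host["system_port"] / n,                          # share sent from a system port
            host["tcp"] / n,                                  # share of TCP packets
            min(1.0, host["bytes"] / (n * MAX_PACKET_SIZE)),  # scaled mean packet size
            len(host["peers"]) / n,                           # distinct peers per packet
        ]
    return features


samples = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.1.5", "src_port": 80,
     "protocol": "tcp", "size": 1500},
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.1.6", "src_port": 80,
     "protocol": "tcp", "size": 1500},
    {"src_ip": "10.0.0.2", "dst_ip": "10.0.1.5", "src_port": 50000,
     "protocol": "udp", "size": 150},
    {"src_ip": "10.0.0.2", "dst_ip": "10.0.1.5", "src_port": 50000,
     "protocol": "udp", "size": 150},
]
features = host_features(samples)
```

Ratios of counts fall in [0, 1] by construction, which keeps the features on a common scale without a separate normalization pass over the whole data set.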

32
Classification of Host Roles
  • Ground Truth
  • Host Information in Active Directory
  • Crawler to validate its status

33
Classification of Host Roles
  • Classifying Client vs. Server
  • Classifying Web Server vs. Web Email Server
  • Classifying Hosts at Personal Office vs. Public
    Place
  • Classifying Hosts at Two Different Colleges
  • Feature Contributions

34
Classifying Client vs. Server
35
Classifying Web Server vs. Web Email Server
36
Classifying Host From Personal Office vs. Public
Place
37
Classifying Host From Two Different Colleges
38
Accuracy
  • High accuracies of Host Role Classification

Classification                                  Accuracy (%)
Clients vs. Server                              99.2
Regular web server vs. Web email server         100
Hosts from personal office vs. public places    93.3
Hosts from two different colleges               93.3
39
Feature Contribution
40
Identification of a User
  • Data: NetFlow data from a large campus

Count
College1 163
College2 253
41
Identification of a User
  • Algorithms
  • Decision Tree
  • On-line SVM
  • Ground Truth
  • Host Information in Active Directory
  • Crawler to validate its status

42
Identification of a User
  • Features
  • Discrete probability distribution function (pdf)
  • An Example
  • System port numbers: 6, 8, 9, 11, 14, 30, 80,
    1020
  • Outlier fraction (P) is 1%
  • 80 is the port of interest (S)
  • Number of bins (R) is 4

43
Identification of a User
  • An Example
  • (1 - 0.01) × 8 = 7.92, truncated to 7; the 7th value is 80
  • Bin slice size: 80 / (4 - 1) ≈ 26.7
  • 6, 8, 9, 11, 14, 30, 80, 1020
  • pdf: 0.625, 0.125, 0.125, 0.125

Bin contents: 6, 8, 9, 11, 14 | 30 | 80 | 1020
44
Identification of a User
  • An Example without P and S
  • Bin slice size is 1024 / 4 = 256
  • 6, 8, 9, 11, 14, 30, 80, 1020
  • pdf: 0.875, 0, 0, 0.125

Bin contents: 6, 8, 9, 11, 14, 30, 80 | (empty) | (empty) | 1020
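
The two worked examples can be reproduced with a short sketch. It assumes one reading of the slides: the top P fraction of the sorted values is treated as outliers, the largest remaining value bounds R - 1 equal-width bins, and the last bin collects the outliers. The slides do not spell out how the port of interest S enters the calculation, so it is not modeled here. The function names are hypothetical.

```python
import math


def outlier_cutoff(values, outlier_fraction):
    """Largest value kept after dropping the top outlier_fraction of values."""
    ordered = sorted(values)
    keep = math.floor((1 - outlier_fraction) * len(ordered))
    return ordered[keep - 1]


def binned_pdf(values, upper, num_bins, overflow_bin=False):
    """Share of values per equal-width bin over [0, upper].

    With overflow_bin=True the last bin is reserved for values above upper.
    """
    regular_bins = num_bins - 1 if overflow_bin else num_bins
    width = upper / regular_bins
    counts = [0] * num_bins
    for value in values:
        if overflow_bin and value > upper:
            index = num_bins - 1
        else:
            # Clamp so that a value equal to upper lands in the last regular bin.
            index = min(int(value / width), regular_bins - 1)
        counts[index] += 1
    return [count / len(values) for count in counts]


ports = [6, 8, 9, 11, 14, 30, 80, 1020]

# With outlier handling: P = 1%, R = 4 bins.
cutoff = outlier_cutoff(ports, 0.01)
pdf_with_outliers = binned_pdf(ports, cutoff, 4, overflow_bin=True)

# Without P and S: four equal bins over the system port range 0 to 1024.
pdf_plain = binned_pdf(ports, 1024, 4)
```

In the plain version a single large port number stretches the bin width to 256, so seven of the eight ports share one bin. The outlier-aware version keeps the resolution where most of the values lie.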
45
Identify a User Among Other Users
46
Accuracy
  • Identifying a particular user among other users
  • Decision Tree: 93.3%
  • On-line Support Vector Machine: 78.5%

47
Feature Contribution
48
Conclusion
  • Major Contributions
  • A Big Data analysis system
  • a conference paper
  • Monitoring and interactive visualization
  • Usage of anonymity technologies
  • a conference and a journal paper
  • Models for classification of host roles and
    identification of users
  • a conference paper

49
Conclusion
  • The Big Data analysis system is high-performance
    and scalable
  • Real-time continuous network monitoring and
    interactive visualization are implemented and
    supported by the high-performance system

50
Conclusion
  • Proxies and Tor are main anonymity technologies
    used on campus
  • US, Germany, and China are the top 3 countries
  • Models and Features for Classification of Host
    roles
  • client vs. server, non-web server vs. web server,
    personal office vs. public place, two
    different colleges
  • Models and features for identification of a
    particular user among other users

51
Future Work
  • Improvement to the Current Work
  • More interactive features and better user
    interfaces
  • Further analysis of user identification
    features and algorithms (such as deep learning)

52
Future Work
  • Extension to the Current Work
  • Define and filter out background traffic
  • Detection of operating system fingerprinting
  • Identity anonymity
  • Fusion with other network security data sources

53
Future Work
  • Vision
  • To provide network security as a service for
    individuals, small businesses, or government
    offices