Academic Advisor: Dr. Yuval Elovici - PowerPoint PPT Presentation

Slides: 23
Provided by: csBg
Transcript and Presenter's Notes

1
  • Academic Advisor: Dr. Yuval Elovici
  • Technical Advisor: Dr. Lidror Troyansky

2
A little bit of background
  • PortAuthority Offers Businesses the Opportunity
    to Gain Insight Into Their Information Leak
    Vulnerabilities.
  • 70% of information leaks are internal. Most
    organizations focus on preventing outside-in
    security breaches, but industry analysts argue
    that up to 70% of security breaches occur from
    the inside out. Leaks of private and
    confidential information pose a growing threat
    to organizations of any size.
  • Example of a file-sharing information leak:
    http://www.ynet.co.il/articles/0,7340,L-2875208,00.html
    Air force officer in the IDF suspended
    over sharing confidential army documents.

3
Where do we fit in?
  • P2P Networks.
  • Gnutella, Gnutella2, BitTorrent, eDonkey2000,
    Kademlia.
  • P2P networks are typically used for connecting
    nodes via largely ad hoc connections.
  • Sharing content files containing audio, video,
    data or anything in digital format is very common
    (including confidential information).
  • Real-time data, such as VoIP, is also passed
    using P2P technology.

4
Where do we fit in?
Continued
5
Our Project
  • Develop a system that will:
  • Allow the scanning parameters to be configured.
  • Scan the P2P networks.
  • Download files suspected of being confidential.
  • Analyze the material using Machine Learning.
  • Generate reports.
  • Produce statistics.

6
Architecture
7
Main Functional Requirements
  • Scanning the P2P network (Gnutella) for
    suspicious target information (e.g. information
    suspected of being confidential).

8
Main Functional Requirements
Continued
  • Downloading the suspicious target information
    (e.g. information suspected of being
    confidential) from the P2P network (Gnutella).

9
Main Functional Requirements
Continued
  • Analyzing the scanned results (determining the
    value of the documents).
  • The system will use Machine Learning, based on
    the Bayesian filtering algorithm, to classify the
    documents.

10
  • Bayesian filtering is the process of using
    Bayesian statistical methods to classify documents
    into categories.
  • Bayesian filtering gained attention when it was
    described in the paper "A Plan for Spam" by Paul
    Graham, and has become a popular mechanism for
    distinguishing illegitimate spam email from
    legitimate "ham" email.
  • Bayesian filtering takes advantage of Bayes'
    theorem, which says that the probability that a
    document belongs to a certain group (confidential
    documents), given that it contains certain words,
    is equal to the probability of finding those
    words in a document from that group
    (confidential documents), times the probability
    that any document belongs to that group
    (confidential documents), divided by the
    probability of finding those words in any
    document.
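
The rule stated above, P(confidential | words) = P(words | confidential) × P(confidential) / P(words), can be illustrated with a minimal sketch. This is not the project's classifier: the class name `NaiveBayesFilter`, the two labels, the whitespace tokenizer, the word-independence ("naive") assumption, and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it on whitespace (a simplifying assumption)."""
    return text.lower().split()


class NaiveBayesFilter:
    """Hypothetical two-class word-based Bayesian filter, for illustration only.

    Each label must be trained with at least one document before classifying.
    """

    LABELS = ("confidential", "other")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    def train(self, text, label):
        """Record one labeled training document."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def _log_joint(self, words, label):
        """log P(label) + sum of log P(word | label), with add-one smoothing."""
        vocab = set()
        for counts in self.word_counts.values():
            vocab.update(counts)
        total_docs = sum(self.doc_counts.values())
        total_words = sum(self.word_counts[label].values())
        score = math.log(self.doc_counts[label] / total_docs)
        for word in words:
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        return score

    def probability_confidential(self, text):
        """Bayes' theorem: joint for 'confidential' divided by the sum of both joints."""
        words = tokenize(text)
        log_c = self._log_joint(words, "confidential")
        log_o = self._log_joint(words, "other")
        shift = max(log_c, log_o)  # subtract the max to avoid underflow in exp()
        num = math.exp(log_c - shift)
        return num / (num + math.exp(log_o - shift))


f = NaiveBayesFilter()
f.train("secret budget plan internal", "confidential")
f.train("free music song mp3", "other")
print(f.probability_confidential("internal budget"))
```

The denominator, the probability of finding the words in any document, is computed here as the sum of the two joint probabilities, which is valid only because the sketch assumes every document is either "confidential" or "other".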

11
Main Functional Requirements
Continued
  • Statistics gathering:
  • The number of users who currently hold the
    target information.
  • Using IP geolocation to find the geographic
    location of the leaked information.
  • The history of searched-for, downloaded, and
    analyzed files.
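
The statistics listed above could be gathered along the following lines. This is only a sketch: the `Sighting` record, its field names, and the two helper functions are hypothetical, and the `country` field is assumed to have been filled in beforehand by whatever IP-geolocation lookup the system uses.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Sighting:
    """One observation of a target file on a peer (hypothetical record layout)."""
    file_hash: str
    peer_ip: str
    country: str       # assumed output of an IP-geolocation lookup
    seen_at: datetime


def holders_per_file(sightings):
    """Count the distinct peers holding each target file."""
    peers = {}
    for s in sightings:
        peers.setdefault(s.file_hash, set()).add(s.peer_ip)
    return {file_hash: len(ips) for file_hash, ips in peers.items()}


def holders_per_country(sightings, file_hash):
    """Count the distinct peers per country for one target file."""
    seen = {(s.peer_ip, s.country) for s in sightings if s.file_hash == file_hash}
    return Counter(country for _, country in seen)
```

Counting distinct peer addresses rather than raw sightings keeps a peer that is seen in several scans from being counted more than once.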

12
Main Functional Use Cases
13
Main Functional Use Cases
Continued
Scan network - Use Case Diagram
14
Main Functional Use Cases
Continued
Analyze downloaded files - Use Case Diagram
15
Main Functional Use Cases
Continued
16
Non-Functional Requirements
  • Performance constraints:
  • The system should return a search result for a
    suspicious target within 15 minutes.
  • The system timeout for downloading should be
    configurable.
  • The system should keep history results and
    statistics for no more than one year.
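
One possible way to express these constraints as configuration is sketched below. The names `ScanConfig`, `prune_history`, and `search_expired` are hypothetical, as is the 10-minute default download timeout; only the 15-minute search limit and the one-year retention period come from the requirements above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ScanConfig:
    """Hypothetical settings object for the performance constraints."""
    search_deadline: timedelta = timedelta(minutes=15)   # from the requirements
    download_timeout: timedelta = timedelta(minutes=10)  # assumed default, configurable
    history_retention: timedelta = timedelta(days=365)   # one year, from the requirements


def prune_history(records, now, config):
    """Keep only (timestamp, payload) records inside the retention window."""
    cutoff = now - config.history_retention
    return [record for record in records if record[0] >= cutoff]


def search_expired(started_at, now, config):
    """Return True once a search has run longer than the allowed deadline."""
    return now - started_at > config.search_deadline
```

A caller would change the download timeout by constructing the object with a different value, for example `ScanConfig(download_timeout=timedelta(minutes=30))`.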

17
Non-Functional Requirements
Continued
  • Safety and security:
  • The system will not be used for any purpose other
    than finding information leaks in P2P networks
    (e.g. it will not be used to find shared MP3
    files).
  • The system will not expose the confidential
    documents it downloads or the documents that
    were used in the Machine Learning algorithm.

18
Non-Functional Requirements
Continued
  • Platform constraints:
  • OS: Windows XP.
  • Database: MS SQL Server 2000.
  • Programming languages: restricted to Python,
    Java/J2EE, C++ and C#.

19
Risks
  • Mainly a research project.
  • Algorithm risk (Machine Learning).
  • Is it good for confidential documents?
  • Action to be taken
  • Feasibility Study.

20
What does successful mean?
21
Risks
  • Gnutella is an old network.
  • May not contain confidential information.
  • Action to be taken
  • Test suite.
  • Use a different P2P network.

22
Epilogue
  • www.cs.bgu.ac.il/amirf/AMOS