Academic Advisor: Dr. Yuval Elovici - PowerPoint PPT Presentation

Provided by: csBg

Transcript and Presenter's Notes

1
ADD Presentation
  • Academic Advisor: Dr. Yuval Elovici
  • Technical Advisor: Dr. Lidror Troyansky

2
Where do we fit in?
Continued
3
What are we doing?
  • Develop a system which will be able to:
      • configure the search parameters.
      • scan the P2P networks.
      • download files suspected of being confidential.
      • analyze the material using Machine Learning.
      • generate reports.
      • produce statistics.

4
Requirements
5
Functional Requirements
  • Scanning the P2P network (Gnutella) for suspicious
    target information (e.g. information suspected of
    being confidential).
  • Downloading the suspicious target information
    from the P2P network (Gnutella).
  • Analyzing the scanned results (determining the
    value of the documents).
  • The system will use a Machine Learning based
    filtering algorithm to classify the documents.
  • Statistics gathering:
      • The number of users who currently hold the
        target information.
      • The geographic location of the leaked
        information, found using IP geolocation.
      • The history of searched-for, downloaded, and
        analyzed files.
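
The first statistic above (how many users currently hold the target information) can be sketched as a count of distinct hosts among the search results. This is a minimal illustration only: the SearchHit and HolderStatistics classes and their fields are hypothetical names, not part of the presented design.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical value class: one search result, i.e. which host offers which file.
final class SearchHit {
    final String hostIp;
    final String fileName;

    SearchHit(String hostIp, String fileName) {
        this.hostIp = hostIp;
        this.fileName = fileName;
    }
}

// Hypothetical helper, not taken from the slides.
final class HolderStatistics {
    // Counts the distinct hosts that offer the given target file.
    // A host that appears in several hits for the same file is counted once.
    static int countHolders(List<SearchHit> hits, String targetFileName) {
        Set<String> hosts = new HashSet<>();
        for (SearchHit hit : hits) {
            if (hit.fileName.equals(targetFileName)) {
                hosts.add(hit.hostIp);
            }
        }
        return hosts.size();
    }
}
```

Counting by host address is itself an assumption; several users behind one address would be counted as a single holder.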

6
Non-Functional Requirements
  • Performance constraints
      • The system should return a search result for a
        suspicious target within 15 minutes.
      • The system should not hard-code a limit on the
        target download time. (Remark: the limit should
        be configurable, and a default time-out should
        always be set.)
      • The system should retain history results and
        statistics for no more than one year.
  • Safety and Security
      • The system will not be used for any purpose
        other than finding information leaks in P2P
        networks (e.g. it will not be used to find MP3
        shares).
      • The system will not expose the confidential
        documents it downloads or the documents used in
        the Machine Learning algorithm.
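
One possible way to hold the performance constraints above as configuration is sketched below. The class name InspectorSettings is hypothetical, and the 30-minute default download time-out is a placeholder value: the slides only require that a default exists and that it is configurable.

```java
import java.time.Duration;
import java.time.Period;

// Hypothetical settings holder, not taken from the slides.
final class InspectorSettings {
    // From the requirements: a search result must arrive within 15 minutes.
    static final Duration SEARCH_DEADLINE = Duration.ofMinutes(15);
    // From the requirements: history and statistics are kept for one year.
    static final Period HISTORY_RETENTION = Period.ofYears(1);
    // Placeholder value (an assumption): the slides do not state the default.
    static final Duration DEFAULT_DOWNLOAD_TIMEOUT = Duration.ofMinutes(30);

    private Duration downloadTimeout = DEFAULT_DOWNLOAD_TIMEOUT;

    Duration getDownloadTimeout() {
        return downloadTimeout;
    }

    // The time-out is configurable, but it must always be a positive duration,
    // so that some time-out is set at all times.
    void setDownloadTimeout(Duration timeout) {
        if (timeout == null || timeout.isZero() || timeout.isNegative()) {
            throw new IllegalArgumentException("Download time-out must be positive");
        }
        this.downloadTimeout = timeout;
    }
}
```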

7
System Architecture
8
System Architecture (cont'd)
  • The system is built from several components,
    which are written in different languages and
    communicate with each other in several ways.
  • All software modules reside on the same computer.
  • IGTellaHandler: The primary responsibility of
    this component is downloading documents from the
    Gnutella network. IGTellaHandler is written
    in Java and communicates with the main component
    (P2PInspectorGadget) via RMI (to increase the
    decoupling between the components).
  • IGConfClassifier: The primary responsibility of
    this component is classifying documents using
    different classification rules. The output of
    this process will be saved in the database and
    will be available for further use.
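
To illustrate how an RMI boundary can decouple the download component from the main component, here is a minimal sketch of a remote interface with an in-memory stand-in. The interface name, method names, and signatures are assumptions made for illustration; the slides do not specify the actual contract.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical remote contract for the download component.
// RMI requires the interface to extend Remote and each method
// to declare RemoteException.
interface GTellaHandlerRemote extends Remote {
    List<String> search(String keyword) throws RemoteException;

    byte[] download(String fileName) throws RemoteException;
}

// In-memory stand-in that fulfils the contract without touching a network.
// It lets the caller be exercised before a real Gnutella driver exists.
final class InMemoryGTellaHandler implements GTellaHandlerRemote {
    private final Map<String, byte[]> files = new HashMap<>();

    void share(String fileName, byte[] content) {
        files.put(fileName, content.clone());
    }

    @Override
    public List<String> search(String keyword) {
        List<String> matches = new ArrayList<>();
        for (String name : files.keySet()) {
            if (name.contains(keyword)) {
                matches.add(name);
            }
        }
        return matches;
    }

    @Override
    public byte[] download(String fileName) {
        byte[] content = files.get(fileName);
        if (content == null) {
            throw new IllegalArgumentException("Unknown file: " + fileName);
        }
        return content.clone();
    }
}
```

Because the caller depends only on the interface, the stand-in could later be swapped for an implementation exported through the standard RMI machinery (for example `UnicastRemoteObject.exportObject` and an RMI registry); that wiring is omitted here.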

9
System Architecture (cont'd)
  • IGDBHandler: The primary responsibility of this
    component is connecting to an external database
    and serving as the interface between the system's
    modules and the database. IGDBHandler will be
    written in Java and will communicate with the
    main component via RMI.
  • P2PInspectorGadget: This is the system's main
    component. It has two primary responsibilities:
    the first is interacting with the user via the
    Graphical User Interface, and the second is
    controlling the flow of the system.
    P2PInspectorGadget will be written in Java, will
    connect to the other components via the
    connections mentioned above, and will not
    communicate with any other external system.
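
The flow-control responsibility of the main component can be pictured as a search, download, classify, store loop. The sketch below is an assumption about how such a loop might look: every interface and class name in it is hypothetical, and the classifier is left abstract rather than standing in for the project's Machine Learning algorithm.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical view of the download component.
interface DocumentSource {
    List<String> search(String query);

    String fetchText(String fileName);
}

// Hypothetical view of the classification component.
interface ConfidentialityClassifier {
    boolean isConfidential(String text);
}

// Hypothetical view of the database component.
interface ResultStore {
    void save(String fileName, boolean confidential);
}

// Hypothetical flow controller: it only sequences the other components.
final class InspectionFlow {
    private final DocumentSource source;
    private final ConfidentialityClassifier classifier;
    private final ResultStore store;

    InspectionFlow(DocumentSource source, ConfidentialityClassifier classifier, ResultStore store) {
        this.source = source;
        this.classifier = classifier;
        this.store = store;
    }

    // Searches, downloads and classifies each hit, stores every verdict,
    // and returns the names of the files flagged as confidential.
    List<String> run(String query) {
        List<String> flagged = new ArrayList<>();
        for (String fileName : source.search(query)) {
            String text = source.fetchText(fileName);
            boolean confidential = classifier.isConfidential(text);
            store.save(fileName, confidential);
            if (confidential) {
                flagged.add(fileName);
            }
        }
        return flagged;
    }
}
```

Keeping the controller free of network, classification, and database details is one way to match the decoupling goal stated for the components.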

10
Main Classes
11
Main Classes (cont'd) Network handler
12
Main Classes (cont'd) Network handler
13
Searching files seq. diagram
14
Testing
  • Unit testing: All the units will be tested for
    every use case. For each use case, all of the
    possible paths will be tested. Unit testing is
    part of the design of the project, and automated
    tests run continuously while we develop the
    system. Here are some of the tests in the test
    plan:
      • Start system: Start the system with a firewall
        blocking the ports needed for P2P, and see that
        the system doesn't crash and outputs the right
        error message.
      • Scan network: Verify that this process concludes
        after a pre-defined time-out.
      • Analyze downloaded files: Verify that the system
        converts the different text formats (DOC, PPT and
        PDF) correctly into "raw" text.
      • Analyze downloaded files: Verify the accuracy of
        the algorithm (achieving the false-positive and
        true-positive standards defined in the project's
        targets).
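
The "Scan network" check above needs a scan that is guaranteed to stop at the time-out. A minimal sketch of one way to bound a task with the standard java.util.concurrent classes follows; the BoundedScan class and its method are hypothetical names, and returning null on time-out is a simplification chosen for brevity.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not taken from the slides.
final class BoundedScan {
    // Runs the scan on a worker thread and returns its result,
    // or null if the time-out elapses before the scan finishes.
    static <T> T runWithTimeout(Callable<T> scan, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<T> future = executor.submit(scan);
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return null; // the scan did not finish in time
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return null;
        } catch (ExecutionException e) {
            throw new IllegalStateException("Scan failed", e.getCause());
        } finally {
            executor.shutdownNow(); // interrupts a scan that is still running
        }
    }
}
```

A test can then pass in a deliberately slow task and check both that no result comes back and that the call returns shortly after the time-out.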

15
Testing (cont'd)
  • Acceptance tests: As part of the acceptance
    tests, all of the use cases will be fully checked
    from beginning to end. In addition, all of the
    non-functional requirements will be tested to
    make sure they meet their targets.
  • System's history: In order to verify that the
    system saves all the information for the period
    that the user has defined (default is 1 year), we
    shall manually change the system's clock to trick
    the system, and see that the data that needs to
    be saved is saved and the data that should have
    been deleted is deleted.
  • System legitimacy (non-pirate uses): The system
    will be blocked from uploading data. This will be
    checked by planting a unique media file (maybe
    MP3 or MPEG) that we composed, with a unique
    name, and trying to download the media file with
    a different client.
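
The history test above is planned as a manual change of the system clock. As an alternative technique, not the one described in the slides, the retention logic could take an injectable java.time.Clock so that a test can simulate the passage of time. The HistoryStore class below is a hypothetical sketch of that idea.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.Period;
import java.time.ZonedDateTime;
import java.util.ArrayList;
import java.util.List;

// Hypothetical history holder; it stores only timestamps to stay short.
final class HistoryStore {
    private final Clock clock;
    private final Period retention;
    private final List<Instant> entries = new ArrayList<>();

    HistoryStore(Clock clock, Period retention) {
        this.clock = clock;
        this.retention = retention;
    }

    void add(Instant timestamp) {
        entries.add(timestamp);
    }

    int size() {
        return entries.size();
    }

    // Deletes entries older than the retention period, measured from the
    // injected clock, and returns how many entries were deleted.
    int prune() {
        // ZonedDateTime is used because Instant cannot subtract a Period of years.
        Instant cutoff = ZonedDateTime.now(clock).minus(retention).toInstant();
        int before = entries.size();
        entries.removeIf(timestamp -> timestamp.isBefore(cutoff));
        return before - entries.size();
    }
}
```

A test would build the store with `Clock.fixed(...)` set to a chosen date, add one entry older than the retention period and one newer, and check that only the older entry is deleted.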

16
Testing (cont'd)
  • Content safety: To ensure content safety
    (classified documents used for the learning part
    of the algorithm will not be exposed to the P2P
    network), the learning sub-application and the
    sub-application that connects to the P2P network
    run as separate processes with different memory
    spaces. The test will be an attempt from another
    client to download the classified documents, or
    the list of the documents, from the process that
    connects to the P2P network.

17
GUI
18
GUI (cont'd)
19
Remaining tasks (till prototype)
  • Create and integrate the GUI.
  • Find a list of working Gnutella1 servers.
  • Inspect and learn the classification algorithm.
  • Integrate the algorithm written in Python into
    Java.
  • Finish the PDF-to-DOC converter.
  • Finish the Gnutella driver (able to perform search
    and download of documents).