1
Intelligent File System
  • Changgyu Oh
  • 04/02/02

2
Problem Domain(1)
  • Scalability of current decentralized P2P networks
    similar to Gnutella
  • The total number of messages generated in the
    network consumes a large share of network bandwidth.
  • According to Frode's R&D report, an employee at
    Nullsoft said in an IRC chat that GnutellaNet
    probably would not scale to more than 250 or so
    clients. Gene Kan, a high-profile spokesman in the
    Gnutella community, states that the technology was
    initially designed to support file sharing in a
    small network between friends.
  • Marius et al. claimed in [35] that a Gnutella
    network cannot exceed a few thousand
    peer nodes.

3
Problem Domain(2)
  • Violation of user anonymity
  • Anyone can learn a publisher's address from the
    queryHit packet of the Gnutella protocol.

4
Problem Domain(3)
  • Lack of information about the resources

5
Goal of the Project
  • An autonomous approach to controlling the messages
    flowing over the network: messages generated by
    peer hosts are grouped and managed using a
    caching mechanism.
  • Fast response for the desired file and for related
    information using association rules and data
    mining, which provides a flexible query mechanism.
  • An efficient data representation that represents
    pointless file association rules.
  • Stronger anonymity for decentralized systems
    through a new approach to the IP address field of
    the queryHit packet.
  • Algorithms are provided to achieve the above results.

6
Related Works
  • Anonymous Publication Service
  • The anonymity scheme of the Publius system [28]
    is based on a static, system-wide list of
    available servers. Because the list is static, it
    doesn't support adding new servers or purging
    dead ones.
  • The Eternity system [30] was based on Anderson's
    seminal paper on the Eternity Service. According
    to Anderson, the basic idea of the Eternity
    Service is to use redundancy and scattering
    techniques to distribute replicas over a large
    number of hosts. It adds an anonymity mechanism
    to drive up the cost of selective service denial
    attacks.
  • Q/A (Query and Advertising System) [29] also
    tried to fix the weaknesses of the Gnutella
    protocol, but the first level of servers still
    knows the IP addresses of clients.

7
Meta-search engines
  • EEM [1] is similar in the sense that it builds
    representatives for each database and is an
    optimizing relationship hierarchy.
  • ETCR [2] is very similar in the sense that class
    hierarchies are used for inheritance,
    classification and transitive closure reasoning.
  • BLDLC [4] uses classification hierarchies to
    increase the capabilities of data browsing in
    digital libraries.

8
Caching
  • The Distributed File System (DFS) work discusses
    detecting network failures and ensuring that
    caches remain consistent when failures occur [9].
  • Some file systems choose a simpler model in which
    failure detection is not required, such as the
    Network File System [10], whose clients poll the
    server to find out when a file was last
    modified and thereby determine whether the cached
    version is valid.
  • The Andrew File System [11] and the Sprite file
    system [12] are other file systems that use
    caching schemes. The Hint-Based Cooperative
    Caching file system was introduced to help
    clients make decisions based on local state,
    enabling a loosely coordinated system and
    reducing overhead and access latency [13].

9
Our Approach for the flexible query mechanism
  • New file association rules are introduced.
  • Every node (resource) is annotated with the data
    elements <?, R, O, ?>,
  • consisting of a Set of Number Pairs (?),
  • a Relation Type (R),
  • a Constraint Rule (O),
  • and a Hierarchy Identifier (?).
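
One way to picture the annotation is as a small record attached to each resource. The sketch below is only an illustration: the class and field names are hypothetical, the two relation types are taken from the Contained_In and IS-A slides that follow, and the slide does not say what the number pairs or the constraint rule contain, so they are modeled here as plain integer pairs and a free-form string.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Tuple


class RelationType(Enum):
    # Relation names taken from the titles of the following slides.
    CONTAINED_IN = "Contained_In"
    IS_A = "IS-A"


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record for the four data elements on a node."""
    number_pairs: FrozenSet[Tuple[int, int]]  # set of number pairs (contents assumed)
    relation: RelationType                    # relation type R
    constraint: str                           # constraint rule O (format assumed)
    hierarchy_id: str                         # hierarchy identifier


def same_hierarchy(a: Annotation, b: Annotation) -> bool:
    """Two annotations are comparable only inside one hierarchy (assumption)."""
    return a.hierarchy_id == b.hierarchy_id
```

For example, `Annotation(frozenset({(1, 4)}), RelationType.IS_A, "", "h1")` annotates one resource in a hierarchy labeled `h1`.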

10
Contained_In relationship
11
IS-A relationship
12
Message Cost Function F: the total number of replicas
  • Maximum hop count
  • Number of replicas of a message generated by peer
    hosts
  • Number of peer hosts available for message
    forwarding in the routing table of each peer host.
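
To see how these three quantities interact, the sketch below counts replicas under plain flooding. It assumes every peer has the same number of routing-table entries, forwards to all of them except the sender, and never drops duplicates. The function names and this model are illustrative assumptions, not the cost function F defined on the slide.

```python
from typing import List


def replicas_per_hop(fanout: int, max_hops: int) -> List[int]:
    """Replicas created at each hop under naive flooding (assumed model)."""
    if fanout < 1 or max_hops < 0:
        raise ValueError("fanout must be >= 1 and max_hops >= 0")
    counts = []
    current = fanout  # the originator sends to every routing-table entry
    for _ in range(max_hops):
        counts.append(current)
        # Each receiver forwards to all of its entries except the sender.
        current *= fanout - 1
    return counts


def total_replicas(fanout: int, max_hops: int) -> int:
    """Total number of replicas generated for one message."""
    return sum(replicas_per_hop(fanout, max_hops))


if __name__ == "__main__":
    print(replicas_per_hop(4, 3))  # [4, 12, 36]
    print(total_replicas(4, 7))    # 4372
```

Under this model the count grows geometrically with the maximum hop count, which is why limiting forwarding to a group and dropping already-seen messages reduces traffic.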

13
Our New Grouping Algorithm
14
Properties of Distributed Grouping Structure
  • Search is robust against node failure
  • It is completely decentralized
  • Peer hosts in the same group share the same group
    information
  • All peers serve as entry points for search
  • Division of a group occurs automatically when
    necessary.

15
Initial Grouping
17
Searching in DGS
18
Replicas generated at each hop
19
Caching mechanism adapted in IFS
  • if (CheckMessage(MsgHeader) == SeenBefore)
  • Begin
  •   Drop the message
  • End
  • // Add the Query msg to the cache.
  • cachingQuery(getMsgID(), RemoteHost)
  • // Forward the Query msg to the other connected
    // hosts in the same group, except the one that sent it.
  • decreasingTTL(msg); forwardMsg(msg, toPeerHosts)
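
The steps above can be sketched in Python as follows. This is a minimal illustration, not the IFS implementation: the `QueryMsg` and `Peer` types are hypothetical, the cache is a plain dictionary keyed by message ID, and the check that stops forwarding once the TTL is used up is an added assumption that the slide does not show.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class QueryMsg:
    msg_id: str
    ttl: int
    payload: str


@dataclass
class Peer:
    name: str
    neighbours: List[str] = field(default_factory=list)  # hosts in the same group
    seen: Dict[str, str] = field(default_factory=dict)   # msg_id -> host it came from

    def handle_query(self, msg: QueryMsg, remote_host: str) -> List[Tuple[str, QueryMsg]]:
        """Return (next_hop, message) pairs to send; empty if nothing is forwarded."""
        if msg.msg_id in self.seen:
            return []  # seen before: drop the message
        self.seen[msg.msg_id] = remote_host  # add the query to the cache
        if msg.ttl <= 1:
            return []  # assumption: an expired message is cached but not forwarded
        forwarded = QueryMsg(msg.msg_id, msg.ttl - 1, msg.payload)  # decrease TTL
        # Forward to every connected host in the group except the sender.
        return [(n, forwarded) for n in self.neighbours if n != remote_host]
```

A peer with neighbours `b`, `c` and `d` that receives a new query from `b` forwards it to `c` and `d` with the TTL reduced by one, and drops any later copy that carries the same message ID.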

20
Enhancing anonymous feature of Gnutella Protocol
  • IPClue = encodingIPClue(address)
  • // Create a queryHit message
  • qrm = CreateQueryHitMessage(MsgID of the query
    packet, ClientID(), IPClue, port, Minimum Speed,
    records)
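
The slide does not say how `encodingIPClue` works, so the sketch below only illustrates the message layout: the queryHit carries an opaque clue where the plain IP address would normally go. The `ClueTable` class, which hands out random tokens and resolves them back to addresses, is a hypothetical stand-in for whatever encoding IFS actually uses, and all names here are assumptions.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class QueryHit:
    msg_id: str         # message ID of the query being answered
    client_id: str
    ip_clue: str        # opaque clue in place of the plain IP address
    port: int
    min_speed: int
    records: List[str]


class ClueTable:
    """Hypothetical stand-in for encodingIPClue: random token -> address."""

    def __init__(self) -> None:
        self._by_clue: Dict[str, str] = {}

    def encode(self, address: str) -> str:
        clue = secrets.token_hex(8)  # 16 hex characters, unrelated to the address
        self._by_clue[clue] = address
        return clue

    def resolve(self, clue: str) -> Optional[str]:
        return self._by_clue.get(clue)


def create_query_hit(query_msg_id: str, client_id: str, address: str, port: int,
                     min_speed: int, records: List[str], table: ClueTable) -> QueryHit:
    ip_clue = table.encode(address)
    return QueryHit(query_msg_id, client_id, ip_clue, port, min_speed, records)
```

Only a party holding the table can turn the clue back into an address, so hosts that merely relay the queryHit never see the publisher's IP.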

27
Enhanced query mechanism
28
Manager Components
29
Client Part
30
Server Part
31
Comparison with Other Approaches
35
Comparison of Query Types
36
Conclusion
  • IFS is a fully decentralized, serverless, highly
    scalable, fully distributed file system that
    provides a high degree of resource availability
    and flexible queries for P2P end users. The
    system continuously maintains the entire network
    scale in a decentralized manner.

37
Discussion
  • Further research on the latency caused by
    grouping
  • File registration strategy in a heterogeneous
    environment