Title: Freenet: A Distributed Anonymous Information Storage and Retrieval System
1Freenet: A Distributed Anonymous Information Storage and Retrieval System
- Ian Clarke
- Oskar Sandberg
- Brandon Wiley
- Theodore W. Hong
2Introduction
- Networked computer systems are growing in importance.
- Current systems offer little user privacy.
- Every new data item is stored in only one or a few places.
3Freenet
- A distributed information storage and retrieval system.
- Addresses privacy concerns.
- No central points of failure.
- Operates as a distributed file system across many individual computers.
- Transparent moving, deleting, and replication of data.
4Freenet Design Goals
- Anonymity for both producers and consumers of information.
- Deniability for storers of information.
- Resistance to attempts by third parties to deny access to information.
- Efficient dynamic storage and routing of information.
- Decentralization of all network functions.
5Roadmap
- Architecture
- Keys and Searching
- Retrieving Data
- Storing Data
- Managing Data
- Adding Nodes
- Protocol Details
- Performance Analysis
- Network Convergence
- Scalability
- Fault Tolerance
- Small World Model
- Security
6Architecture (1/2)
- Freenet is implemented as an adaptive peer-to-peer network of nodes.
- Nodes query each other to store or retrieve information.
- Files are named by location-independent keys.
- Each node maintains:
- A shared datastore.
- A routing table of entries (node address, possible data keys).
7Architecture (2/2)
- Requests for keys are passed from node to node through a chain of proxy requests.
- Routes depend on the key.
- Each request is assigned a hops-to-live value.
- Each request is assigned a pseudo-unique random identifier.
- Joining the network requires discovering the address of some existing nodes.
8Keys And Searching
- Freenet data files are identified by binary file keys.
- Binary file keys are obtained with the 160-bit SHA-1 hash function.
- Three types of keys:
- Keyword-Signed Key (KSK)
- Signed-Subspace Key (SSK)
- Content Hash Key (CHK)
9Keyword-Signed Key (KSK) (1/2)
- A KSK is derived from a descriptive string of the file, chosen by the user when storing the file.
- The descriptive string is used to deterministically generate a public/private key pair.
- The public half is hashed to yield the file key.
- The private half is used to sign the file, so that a retrieved file can be checked against its file key.
10Keyword-Signed Key (KSK) (2/2)
- The user publishes only the descriptive string.
- Problem: a global namespace. Collisions and junk files under popular descriptive strings.
- The file is encrypted using the descriptive string as a key.
11Signed-Subspace Key (SSK) (1/2)
- Addresses the global namespace problem.
- A user creates a namespace by randomly generating a public/private key pair.
- File insertion requires the private half.
- File key generation process:
- The public namespace key and the descriptive string are hashed independently.
- The two hashes are XORed together.
- The XOR result is hashed to yield the file key.
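The three key-generation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ssk_file_key`, the UTF-8 encoding of the descriptive string, and the placeholder bytes standing in for a real public key are all hypothetical and not taken from the Freenet implementation.

```python
import hashlib


def sha1(data: bytes) -> bytes:
    """Return the 160-bit (20-byte) SHA-1 digest of data."""
    return hashlib.sha1(data).digest()


def ssk_file_key(public_key: bytes, description: str) -> bytes:
    """Derive a file key: hash both inputs, XOR the hashes, hash the result."""
    h_pub = sha1(public_key)
    h_desc = sha1(description.encode("utf-8"))  # encoding is an assumption
    xored = bytes(a ^ b for a, b in zip(h_pub, h_desc))
    return sha1(xored)


# Placeholder bytes, not a real public key.
demo_public_key = b"example-namespace-public-key"
file_key = ssk_file_key(demo_public_key, "text/philosophy/example")
print(file_key.hex())
```

Because the namespace key enters the hash, two users who pick the same descriptive string still obtain different file keys.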
12Signed-Subspace Key (SSK) (2/2)
- The private half is used to sign the file.
- The user publishes the descriptive string along with the subspace's public key.
- Storing data requires the private key.
- The file is encrypted using the descriptive string as a key.
13Content Hash Key (CHK)
- A content hash key is obtained by directly hashing the contents of the corresponding file.
- This assigns a pseudo-unique file key.
- Files are encrypted using a randomly generated encryption key.
- The user publishes the content hash key along with the decryption key.
- The decryption key is never stored together with the file.
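As a minimal sketch of the first two points, a content hash key can be computed directly with SHA-1 from the standard library. The function name `content_hash_key` is hypothetical, and the sketch leaves out the encryption step entirely.

```python
import hashlib


def content_hash_key(contents: bytes) -> bytes:
    """Return the 20-byte SHA-1 digest of the file contents as the file key."""
    return hashlib.sha1(contents).digest()


key_a = content_hash_key(b"first version of a document")
key_b = content_hash_key(b"second version of a document")
print(key_a.hex())
print(key_b.hex())
```

Any change to the contents yields a different key, which is why an updated file gets a new CHK rather than replacing the old one.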
14Retrieving Data (1/3)
- Downstream node: the node to which a request is passed.
- Upstream node: the node to which a reply or data is returned.
- Process of retrieving data:
- The user initiates a request of the form (binary file key, hops-to-live).
- The request is sent to the user's own node.
- If the data is found, it is returned with a note indicating which node was the source.
15Retrieving Data (2/3)
- Continued:
- If not found, the request is forwarded to the next node.
- If found at the next node, the data is returned back along the established path and cached on every intervening node.
- New route entries are created.
- Failures:
- If the downstream node is down, the current node tries its second choice.
- If hops-to-live is exceeded, a failure message is returned to the original requestor.
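The retrieval steps can be illustrated with a toy simulation. It is a sketch under several assumptions, not the Freenet implementation: keys are small integers, "closeness" is the absolute difference between keys, a downstream node that fails simply returns `None`, and the `Node` class with its `store` and `routes` attributes is hypothetical.

```python
class Node:
    """A toy node: a datastore plus a key -> node routing table."""

    def __init__(self, name):
        self.name = name
        self.store = {}    # key -> data
        self.routes = {}   # key -> Node believed to be a source for that key

    def request(self, key, htl, seen=None):
        """Return (data, source_node) on success, or None on failure."""
        seen = set() if seen is None else seen
        if self.name in seen:      # loop: this node already saw the request
            return None
        seen.add(self.name)
        if key in self.store:      # found locally, this node is the source
            return self.store[key], self
        if htl <= 0:               # hops-to-live exhausted
            return None
        # Try downstream nodes in order of key closeness; if the first
        # choice fails, fall back to the second choice, and so on.
        for known_key in sorted(self.routes, key=lambda k: abs(k - key)):
            reply = self.routes[known_key].request(key, htl - 1, seen)
            if reply is not None:
                data, source = reply
                self.store[key] = data      # cache on the return path
                self.routes[key] = source   # new route entry for this key
                return reply
        return None


a, b, c = Node("a"), Node("b"), Node("c")
a.routes[10] = b
b.routes[12] = c
c.store[12] = "doc"
data, source = a.request(12, htl=5)
print(data, source.name)  # doc c
```

After the request, both `a` and `b` hold a cached copy of the file and a route entry pointing at `c`, so a repeated request is answered locally.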
16Retrieving Data (3/3)
17Effects of the Data Retrieval Process
- After a number of queries, nodes specialize in a few sets of similar keys, where "similar" means lexicographically close.
- Nodes specialize in storing clusters of files with similar keys.
- Popular data is transparently replicated near the requesting nodes.
- As nodes process requests, new route entries are created and connectivity increases.
18Lexicographic closeness = data closeness?
- Lexicographic closeness of keys does not imply closeness of descriptive strings.
- E.g., the hash keys AH5JK2, AH5JK3, AH5JK5 will most probably refer to completely unrelated files.
- This scattering is intentional: it avoids central points of failure.
19Storing Data (1/2)
- Storing data is similar to the process of retrieving data.
- Calculate the binary file key and specify a hops-to-live value.
- Hops-to-live specifies the number of nodes on which the data will be stored.
- Nodes accept insert proposals.
- If the key is found, the node returns the pre-existing file to the requestor.
20Storing Data (2/2)
- If the key is not found, the node propagates the request to the next node, chosen by lexicographic key distance.
- When the hops-to-live limit is reached, an "all clear" message is sent to the original requestor.
- The requestor then sends the data to be stored.
- The data is cached on every node along the established path, and route entries are created.
- Failures are handled as in the retrieval process.
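The two-phase insert described on these slides can be sketched as follows. This is a simplified illustration: keys are integers compared by absolute difference, there is no backtracking, a dead end is treated like an exhausted hops-to-live, and the names `Node` and `insert` are hypothetical.

```python
class Node:
    """A toy node: a datastore plus a key -> node routing table."""

    def __init__(self, name):
        self.name = name
        self.store = {}    # key -> data
        self.routes = {}   # key -> Node


def insert(start, key, data, htl):
    """Return (accepted, data stored under key after the attempt)."""
    path, node = [], start
    # Phase 1: propagate the insert proposal and look for a collision.
    while True:
        path.append(node)
        if key in node.store:
            existing = node.store[key]
            for n in path:             # pre-existing file flows back upstream
                n.store[key] = existing
            return False, existing
        if htl == 0:
            break
        candidates = [k for k in node.routes if node.routes[k] not in path]
        if not candidates:             # dead end (simplified handling)
            break
        nearest = min(candidates, key=lambda k: abs(k - key))
        node = node.routes[nearest]
        htl -= 1
    # Phase 2: "all clear", so the data is sent down the established path.
    for n in path:
        n.store[key] = data
        if n is not start:
            n.routes[key] = start      # route entry naming the inserter
    return True, data


a, b, c = Node("a"), Node("b"), Node("c")
a.routes[10] = b
b.routes[11] = c
print(insert(a, 12, "new file", htl=2))   # (True, 'new file')
print(insert(a, 12, "junk", htl=2))       # (False, 'new file')
```

The second call shows the collision case: the junk insert is rejected and the pre-existing file is what comes back.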
21Effects of the Storing Mechanism
- New files are cached on nodes that already store files with similar keys.
- Newly added nodes can use the store mechanism to announce their existence.
- Attackers who try to insert junk files under existing keys will simply spread the pre-existing files.
22Data Management (1/2)
- Finite storage space.
- Finite routing table space.
- Storage is managed with a least-recently-used (LRU) policy.
- When a new file arrives and no space is available, the least recently used entries are deleted.
- The datastore and the routing table can become inconsistent: a route entry may outlive the evicted file.
- Routing table entries are deleted in the same fashion.
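A minimal sketch of LRU-managed storage, using `collections.OrderedDict` from the Python standard library. The class name `LRUStore` and its capacity measured in number of items are assumptions made for illustration.

```python
from collections import OrderedDict


class LRUStore:
    """A fixed-capacity store that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # oldest entry first, newest last

    def get(self, key):
        """Return the stored value, marking it as recently used."""
        if key not in self.items:
            return None
        self.items.move_to_end(key)
        return self.items[key]

    def put(self, key, value):
        """Store a value, evicting old entries if the store is full."""
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)   # drop the least recently used


store = LRUStore(capacity=2)
store.put("k1", "file 1")
store.put("k2", "file 2")
store.get("k1")              # k1 is now the most recently used
store.put("k3", "file 3")    # evicts k2
print(list(store.items))     # ['k1', 'k3']
```

The same structure could hold routing table entries, since the slides state that they are deleted in the same fashion.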
23Data Management (2/2)
- No guarantee for file lifetime.
- Nodes can decide to completely drop a data file.
- Stored files are encrypted for political and legal reasons.
24Adding Nodes (1/2)
- A new node joins the network by discovering the address of one or more existing nodes.
- New nodes must announce their existence.
- Existing nodes need to know which keys they should associate with the new node.
25Adding Nodes (2/2)
- Process of joining a Freenet network:
- The candidate node generates a random seed.
- It sends a message containing its address and the hash of the seed to an existing node.
- The node that receives this message generates its own seed, XORs it with the hash in the message, and forwards the result to a randomly chosen node.
- When hops-to-live reaches 0, all nodes reveal their seeds.
- All seeds are XORed together to produce the new node's key.
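The joining procedure can be sketched as a chain of commitments followed by a reveal. The sketch makes assumptions beyond the slides: each node re-hashes the XOR result before forwarding it, seeds are 20 bytes long to match SHA-1 digests, and all function names are hypothetical.

```python
import hashlib
import secrets


def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def commitment_chain(seeds):
    """Commitments as forwarded hop by hop, starting at the new node."""
    commitments = [sha1(seeds[0])]
    for seed in seeds[1:]:
        # Assumption: the XOR result is hashed again before forwarding.
        commitments.append(sha1(xor_bytes(seed, commitments[-1])))
    return commitments


def verify_chain(seeds, commitments):
    """After the reveal, check that every seed matches its commitment."""
    return commitment_chain(seeds) == commitments


def new_node_key(seeds):
    """XOR all revealed seeds together to obtain the new node's key."""
    key = bytes(len(seeds[0]))
    for seed in seeds:
        key = xor_bytes(key, seed)
    return key


seeds = [secrets.token_bytes(20) for _ in range(4)]   # one seed per node
commitments = commitment_chain(seeds)
print(verify_chain(seeds, commitments))   # True
print(new_node_key(seeds).hex())
```

Since every participant commits before any seed is revealed, no single node can choose the resulting key on its own.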
26Freenet Protocol
- Based on messages.
- Message form:
- <Transaction ID, Hops-To-Live, Depth counter>
- The depth counter is incremented at every hop. It is used by the replying node to ensure that the reply will reach the requestor.
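The message header can be modelled as a small immutable record. This is a sketch only: the class and field names, the 64-bit width of the random transaction ID, and the helper `reply_hops_to_live` are assumptions made for illustration.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    transaction_id: int
    hops_to_live: int
    depth: int = 0

    def forwarded(self):
        """Return the header as seen by the next node along the route."""
        if self.hops_to_live <= 0:
            raise ValueError("hops-to-live exhausted")
        return Message(self.transaction_id,
                       self.hops_to_live - 1,
                       self.depth + 1)


def new_request(hops_to_live):
    """Create a request header with a random (assumed 64-bit) identifier."""
    return Message(secrets.randbits(64), hops_to_live)


def reply_hops_to_live(request):
    """Smallest HTL that lets a reply travel back over `depth` hops."""
    return request.depth


request = new_request(hops_to_live=5)
at_third_node = request.forwarded().forwarded()
print(at_third_node.hops_to_live, at_third_node.depth)  # 3 2
```

Hops-to-live falls while depth rises, so a replying node can size its reply from the depth counter without knowing who the requestor is.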
27Request Data
- The requestor sends a Request.Data message including the search key.
- On a successful search, the source of the data responds to the upstream node with a Send.Data message.
- On an unsuccessful search, or when hops-to-live is exhausted, a Reply.NotFound message is sent.
- If the request reaches a dead end or a loop is detected while HTL is not 0, a Request.Continue message containing the remaining HTL is sent back to the upstream node.
- The upstream node sends a Request.Restart message to an upstream node.
28Store Data
- The requesting node sends a Request.Insert message containing the proposed key.
- The insert message is propagated from node to node based on route entries.
- In case of a collision, a Send.Data message or a Reply.NotFound message is sent back.
- If no more nodes can be contacted but HTL remains, a Request.Continue message is sent.
- If HTL reaches 0 without encountering a collision, a Reply.Insert message is propagated to the upstream node.
29Performance Analysis
- Network Convergence
- Scalability
- Fault Tolerance
- Small World Model
30Network Convergence (1/2)
- 1000 nodes, each with a datastore of 50 items and a routing table of 150 entries.
- Initially, each node has routing entries only for its two closest neighbours.
- Random keys were inserted at random nodes.
- Every 100 time steps, 300 random requests for previously inserted files were performed with HTL = 500.
- Request pathlength: the number of hops taken before finding the data.
31Network Convergence (2/2)
32Scalability (1/2)
- 20 nodes were used initially.
- Inserts and requests were performed randomly, as before.
- Every 5 time steps, a new node was created and added to the network.
- The announcement message was sent to a randomly chosen node.
33Scalability (2/2)
34Fault tolerance (1/2)
- Network of 1000 nodes.
- Randomly chosen nodes were progressively removed to simulate node failures.
- Freenet is extremely robust against node failures.
- The median pathlength remains below 20 even when up to 30% of the nodes have failed.
35Fault tolerance (2/2)
36Small World Networks Model
- The majority of nodes have only a few local connections to other nodes.
- A few nodes have many wide-ranging connections.
- Nodes are well connected, with short paths among them.
- Small-world networks are fault tolerant.
37Is Freenet a small world?
- There must be a scale-free power-law distribution
of links within the network.
38Security issues
- The primary goal is protecting the anonymity of both requestors and inserters of data.
- Protect the identity of the node that holds some specific data.
- A malicious user who intends to remove a data file is hindered by the anonymity of the node that holds the file.
39Basic Freenet
- Sender anonymity is exposed to a local eavesdropper.
- Sender anonymity is preserved against a set of malicious collaborating nodes.
- Receiver anonymity is in essence key anonymity.
- Key anonymity is exposed both to a local eavesdropper and to a set of malicious collaborating nodes.
40Freenet Prerouting
- Freenet messages are encrypted with a succession of public keys, which determine the route that the message will follow.
- Nodes along the route cannot determine either the originator of the message or its contents (since it is encrypted).
- After the end of the prerouting phase, the message is inserted into Freenet as if the endpoint of the preroute were its originator.
41Data Source Protection
- When a node replies to its upstream node that it is the source of some file, it can intentionally hide its address.
- A node replying with a data file is not necessarily the source; it may just be propagating the file.
- Requesting a file with HTL 1 is not a threat.
42Other security concerns
- Modification of requested files.
- A node steering all the traffic to itself by pretending it owns all the data files.
- DoS attacks:
- Attempting to exhaust the storage space.
- Countermeasure: require inserters to pay with a lengthy computation.
- Countermeasure: divide the datastore into a section for new files and a section for established files.
43Conclusions
- An effective means of anonymous information storage and retrieval.
- Highly scalable.