Title: A Distributed Search Service for P2P File Sharing in Mobile Applications
1A Distributed Search Service for P2P File Sharing
in Mobile Applications
- 4 September, 2003
- Authors - Christoph Lindemann and Oliver P.
Waldhorst, University of Dortmund, Dept. of
Computer Science
2Itinerary
- Background Studies
- Introducing Passive Distributed Indexing (PDI)
- Algorithm Details
- Performance Results
- Conclusion and Future Work
3Background Studies
A Mobile Ad-Hoc Network
- Short-range wireless, e.g. Bluetooth
- Medium-range wireless, e.g. IEEE 802.11
Such an ad-hoc network can be used for data sharing between mobile devices, e.g. documents, MP3s and video clips.
How can searching of P2P data be enabled on top of this architecture?
4Background Studies
Related Works
Solutions and Comments
- Napster: hybrid P2P using a centralized index server. Comments: no mobile device in general has the capability to act as the central server. Even if there is a central server, it may not be reachable from all clients due to the hidden-node problem.
- Gnutella: fully distributed searching using a multi-hop flooding algorithm. Comments: flooding the entire network with query messages limits its scalability.
- 7DS: first in a mobile environment, utilizing a flooding algorithm. Comments: similar drawback as Gnutella.
5Proposed Solution
- Objective: to provide a general-purpose file search service which can be used by several kinds of mobile applications running on top of it
- Passive Distributed Indexing (PDI)
  - Each device stores its local documents in a Repository
  - Documents are uniquely identified by their local path and a unique device ID, a.k.a. Document Identifier
  - A local Index Cache is maintained on each device, which forms the core component of this architecture
  - Searching is performed by keyword searches
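The building blocks listed above can be sketched as plain data structures. This is a minimal illustration, not the authors' implementation: the class names `DocumentId` and `Device`, the field layout, and the idea of tagging each repository document with a set of keywords are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DocumentId:
    """Document Identifier: unique device ID plus the document's local path."""
    device_id: str   # hypothetical format; the slides do not specify one
    local_path: str


@dataclass
class Device:
    device_id: str
    # Repository: local path -> keywords describing the document (assumed layout)
    repository: dict[str, set[str]] = field(default_factory=dict)
    # Index Cache: keyword -> identifiers of documents known to match it
    index_cache: dict[str, set[DocumentId]] = field(default_factory=dict)

    def add_document(self, local_path: str, keywords: set[str]) -> DocumentId:
        """Store a document in the Repository and return its identifier."""
        self.repository[local_path] = set(keywords)
        return DocumentId(self.device_id, local_path)

    def search_local(self, keyword: str) -> set[DocumentId]:
        """Keyword search over the local Repository and the Index Cache."""
        own = {DocumentId(self.device_id, path)
               for path, kws in self.repository.items() if keyword in kws}
        return own | self.index_cache.get(keyword, set())
```

Because `DocumentId` is a frozen dataclass it is hashable, so identifiers from different devices can be merged into one set without collisions between equal local paths.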
6Passive Distributed Indexing
Operation Scenario
Node 1: q -> d2, d3
Node 2: q -> d1, d2
Node 3: (no entry shown)
7Passive Distributed Indexing
Operation Scenario
Message: QUE q ?
Node 1: q -> d2, d3
Node 2: q -> d1, d2
Node 3: (no entry shown)
9Passive Distributed Indexing
Operation Scenario
Messages: REP q d1, d2 and REP q d2, d3
Node 1: q -> d1, d2, d3
Node 2: q -> d1, d2, d3
Node 3: q -> d1, d2
10Passive Distributed Indexing
Operation Scenario
Message: REP q d3
Node 1: q -> d1, d2, d3
Node 2: q -> d1, d2, d3
Node 3: q -> d1, d2, d3
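The scenario can be replayed with a small simulation. This is a simplified sketch under stated assumptions: all three nodes are taken to be within radio range of each other, every response is overheard by every other node, and there is no forwarding or duplicate suppression. The names `Node`, `answer`, `overhear` and `run_query` are hypothetical.

```python
class Node:
    def __init__(self, name, entries=None):
        self.name = name
        # keyword -> set of document identifiers known to this node
        self.index = {k: set(v) for k, v in (entries or {}).items()}

    def answer(self, keyword):
        """Return the documents this node knows for the keyword (may be empty)."""
        return set(self.index.get(keyword, set()))

    def overhear(self, keyword, docs):
        """Passively cache the contents of a response message."""
        self.index.setdefault(keyword, set()).update(docs)


def run_query(nodes, asker, keyword):
    """Issue QUE keyword; each REP is overheard by all nodes except its sender."""
    # Responses are built from each node's knowledge before any caching happens.
    responses = [(n, n.answer(keyword)) for n in nodes if n is not asker]
    for sender, docs in responses:
        if not docs:
            continue  # nothing to report
        for listener in nodes:
            if listener is not sender:
                listener.overhear(keyword, docs)
    return asker.answer(keyword)


n1 = Node("Node 1", {"q": {"d2", "d3"}})
n2 = Node("Node 2", {"q": {"d1", "d2"}})
n3 = Node("Node 3")
print(sorted(run_query([n1, n2, n3], n3, "q")))  # ['d1', 'd2', 'd3']
```

After the query, all three nodes hold q -> d1, d2, d3, which matches the final state of the scenario. The intermediate message order in the slides may differ from this simplification.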
11Performance Analysis
Independent Parameters
- System Param.: No. of Devices, Transmission Range, Mobility Model
- Application Param.: No. of Documents, No. of Keywords of Interest, Distribution of Keywords, Inter-request Time of Queries
- Protocol Param.: Index Cache Size, Max. TTL, (Document Timeout)
12Performance Analysis
Values for Simulation
16Performance Analysis
Performance Measure
Nall = 5
Nrep = 3
Query Hit Rate = Nrep / Nall = 3 / 5 = 0.6
(Other performance measures, e.g. system response time, are left for future work.)
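The measure is a simple ratio and can be written down directly. The function name `query_hit_rate` and the input checks are assumptions added for illustration; the values 3 and 5 are the ones shown on the slide.

```python
def query_hit_rate(n_rep: int, n_all: int) -> float:
    """Query Hit Rate = Nrep / Nall."""
    if n_all <= 0:
        raise ValueError("n_all must be positive")
    if not 0 <= n_rep <= n_all:
        raise ValueError("n_rep must lie between 0 and n_all")
    return n_rep / n_all


print(query_hit_rate(3, 5))  # 0.6
```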
17Analysis of Results
Sensitivity to System Parameters: No. of Devices and Index Cache Size
- Local Index Cache has very little impact
- Limited impact of an increase in No. of devices on PDI performance (1)
- An increase in No. of devices leads to an increase in PDI performance (2)
(1, 2) A small index cache cannot accommodate entries for all matching documents
Conclusion: Index Cache size can be small when the No. of devices is small, whereas a sufficient Index Cache size can boost performance in the case of a large No. of devices
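The effect of a cache that is too small can be illustrated with a bounded index cache. This is a sketch only: the slides do not name a replacement policy, so the least-recently-used eviction below, the class name `BoundedIndexCache`, and counting capacity in (keyword, document) entries are assumptions.

```python
from collections import OrderedDict


class BoundedIndexCache:
    """Holds at most `capacity` (keyword, document) entries.

    When full, the entry that was added or refreshed longest ago is evicted
    (an assumed LRU-style policy).
    """

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries = OrderedDict()  # (keyword, doc_id) -> None, oldest first

    def add(self, keyword, doc_id):
        key = (keyword, doc_id)
        if key in self._entries:
            self._entries.move_to_end(key)  # refresh an existing entry
            return
        self._entries[key] = None
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop the oldest entry

    def lookup(self, keyword):
        """Return all cached documents for the keyword."""
        return {doc for (kw, doc) in self._entries if kw == keyword}


small, large = BoundedIndexCache(2), BoundedIndexCache(8)
for doc in ["d1", "d2", "d3"]:
    small.add("q", doc)
    large.add("q", doc)
print(sorted(small.lookup("q")))  # ['d2', 'd3']
print(sorted(large.lookup("q")))  # ['d1', 'd2', 'd3']
```

With three matching documents, the two-entry cache has already lost d1, while the larger cache still returns all matches.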
18Analysis of Results
Sensitivity to System Parameters: No. of Devices and Forwarding TTL
- Message forwarding has very little impact
- Forwarding improves performance by 20% (1)
- The advantage vanishes when the No. of devices grows further (2)
(1) With a medium No. of devices, forwarding has a higher probability of reaching more devices
(2) A high No. of devices fills the local index cache with nearby entries, which adequately replaces message forwarding
Conclusion: Forwarding is useful in medium-density systems, but should be disabled for high-density systems to avoid unnecessary network traffic
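How the forwarding TTL bounds the reach of a message can be sketched as a hop-limited breadth-first search. The TTL semantics used here (the TTL is the maximum number of hops a message travels, so a TTL of 1 means a single broadcast with no forwarding) are an assumption, as are the function name `reached_devices` and the neighbor-map representation.

```python
from collections import deque


def reached_devices(neighbors, source, max_ttl):
    """Devices that receive a message from `source` within `max_ttl` hops.

    `neighbors` maps each device to the devices within its transmission range.
    """
    hops = {source: 0}
    queue = deque([source])
    while queue:
        device = queue.popleft()
        if hops[device] == max_ttl:
            continue  # TTL exhausted: this device does not forward
        for other in neighbors.get(device, ()):
            if other not in hops:
                hops[other] = hops[device] + 1
                queue.append(other)
    return set(hops) - {source}


# A chain of four devices: A - B - C - D
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(sorted(reached_devices(chain, "A", 1)))  # ['B']
print(sorted(reached_devices(chain, "A", 2)))  # ['B', 'C']
```

In a dense network most devices are already direct neighbors, so raising the TTL adds few new receivers while still adding traffic, which is in line with the conclusion above.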
19Analysis of Results
Sensitivity to System Parameters: Transmission Range and Index Cache Size
- Local Index Cache has very little impact (1)
- Index Cache size significantly improves performance
(1) Only a small No. of devices is reached with a very low transmission range, so an increase in cache size has no impact
Conclusion: Index Cache size can be small for short-range devices such as Bluetooth, whereas the No. of devices should be high to compensate for the low hit rate
20Analysis of Results
Sensitivity to System Parameters: Transmission Range and Forwarding TTL
- PDI with message forwarding disabled achieves the best performance for devices with a high transmission range (1)
(1) Responses for uncommon entries are still forwarded over great distances, which fills index caches with junk entries
Conclusion: When the transmission range is high, message forwarding should be disabled
21Analysis of Results
Sensitivity to App Parameters: Zipf
A Zipf-like distribution is used to model the probability distribution of search keywords.
For keyword $k_j$, $\Pr(k = k_j) \propto j^{-\alpha}$, for $0 < \alpha < 1$.
Therefore, the higher the $\alpha$, the more localized the query stream.
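A Zipf-like keyword popularity can be generated as follows. This is a sketch assuming a finite keyword set ranked 1..n, with the probabilities normalized so they sum to 1; the function names, the fixed seed, and the use of `random.choices` for sampling are illustrative choices, not taken from the paper.

```python
import random


def zipf_like_probabilities(num_keywords: int, alpha: float) -> list[float]:
    """Pr(k = k_j) proportional to j**(-alpha), normalized over j = 1..num_keywords."""
    weights = [j ** -alpha for j in range(1, num_keywords + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_keywords(num_keywords, alpha, count, seed=0):
    """Draw `count` keyword ranks (1 = most popular) from the distribution."""
    rng = random.Random(seed)
    probs = zipf_like_probabilities(num_keywords, alpha)
    return rng.choices(range(1, num_keywords + 1), weights=probs, k=count)


low = zipf_like_probabilities(100, 0.2)
high = zipf_like_probabilities(100, 0.9)
# A higher alpha puts more probability mass on the most popular keywords.
print(sum(high[:10]) > sum(low[:10]))  # True
```

The comparison at the end reflects the locality statement: with a larger exponent, a larger share of queries falls on the few top-ranked keywords.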
22Analysis of Results
Sensitivity to App Parameters: Zipf and Index Cache Size
- With a large Index Cache, PDI can achieve a hit rate of > 70% regardless of locality
- With a small Index Cache, PDI is extremely sensitive to locality in the request stream
Conclusion: For applications offering no significant locality in the request stream, the Index Cache size must be chosen adequately
23Analysis of Results
Sensitivity to App Parameters: Zipf and Forwarding TTL
- PDI gains performance improvements from packet forwarding for higher locality; 2-hop forwarding performs similarly to forwarding over more hops
- For even higher locality, 2-hop forwarding outperforms the others
Conclusion: 2-hop message forwarding should be enabled in applications offering a high degree of locality in the request stream
24Analysis of Results
Sensitivity to App Parameters: No. of Documents and Index Cache Size
- Performance decreases linearly with the No. of documents per device
- Performance increases with Index Cache size in only a log-like fashion (1)
(1) It has been shown elsewhere that this behavior is explained if a Zipf-like request distribution is assumed
Conclusion: More sophisticated forwarding strategies, rather than a larger Index Cache size, should perhaps be employed to improve the performance
25Analysis of Results
Sensitivity to App Parameters: No. of Documents and Forwarding TTL
- Performance is improved by 10% if a small No. of documents exists on each device, with near-maximal performance reached by 2-hop forwarding
- For a large No. of documents per device, there is no significant difference between forwarding strategies
Conclusion: 2-hop forwarding can improve performance for a small No. of documents per device, but forwarding yields no performance gain when the No. of documents per device is large
26Analysis of Results
Transient Behaviors
- PDI Hit Rate increases steadily after simulation start
- Real Hit Rate is constant over time
Real Hit Rate = rate of hits reported from devices that actually hold a matching document
Conclusion: The system attains its maximal performance automatically, and no initial warm-up mechanism is required
27Conclusion and Future Work
PDI:
- Is a general-purpose distributed document search service
- Utilizes local caching of query results to avoid flooding the network
- Is tunable (Cache Size, TTL, Document Timeout) to support different environments and applications
- Provides an initial filling of index caches in a very short time; no warm-up mechanism is needed
28Conclusion and Future Work
Contributions of Simulation Results
- High density, low query locality: requires a sufficiently large Index Cache size
- Medium density, medium range: 2-hop packet forwarding; it should be disabled if either the No. of devices or the transmission range is high
- Large No. of documents: requires a sufficiently large Index Cache size
29Conclusion and Future Work
Future work includes
- Investigation of the impact of document modifications on the performance of PDI, and the design of an appropriate workaround mechanism
- Evaluation of the performance of PDI considering sophisticated workload models that contain location-dependent queries
- Development of a prototype implementation of PDI and field tests
30Conclusion and Future Work
Comments
- PDI is a very simple solution for porting P2P file sharing to ad-hoc mobile networks
- The paper contains comprehensive simulation results and analysis of the PDI mechanism
- However, the authors did not suggest further modifications to the PDI mechanism based on the analyzed results
- There are also no analytical comparisons to any other similar implementations
- PDI is yet to be challenged for improvement