CBIR in P2P Systems and Percolation Search

Provided by: admisFu

1
CBIR in P2P Systems and Percolation Search
  • IEEE P2P06

2
References
  • Comparison of Image Similarity Queries in P2P
    Systems. UCLA. P2P06
  • PRISM: Indexing Multi-Dimensional Data in P2P
    Networks Using Reference Vectors. UCSB. ACM
    MM05.
  • Percolation Search in Power Law Networks: Making
    Unstructured Peer-to-Peer Networks Scalable.
    UCLA. P2P04. Best Paper!

3
Architecture
[Diagram. Labels: Full Text Search, Image Retrieval; P2P infrastructure: Unstructured P2P, Structured P2P; K Random Walk, Bloom filter, Super Peer, Percolation, PRISM]
4
Image Retrieval
[Diagram. Labels: Image Retrieval; P2P infrastructure: Unstructured P2P, Structured P2P; Percolation, PRISM, Super Peer]
5
Why apply (CB)IR to P2P?
  • The recent success of image blogs (Web 2.0) and
    sharing servers like flickr.com and del.icio.us
    has shown that people want to share and publish
    their images.
  • In the absence of an efficient algorithm for
    sophisticated image-feature operations, P2P may
    be the best one can do.

6
Annotation vs CBIR
  • Annotation
    • Subjective estimation
    • No uniform schema
    • Simple
    • Converts to text search
  • CBIR: extract feature vectors from images and
    capture similarity in a multi-dimensional space
    • Captures image statistics
    • Computationally costly
    • Semantic gap

7
Adopted Scheme
  • Integrate the two aforementioned schemes.
  • Searching by annotation is no different from
    searching text (text search was covered in my
    last presentation).
  • So the question we need to address is how to
    apply CBIR to P2P.
  • The feature vector for CBIR
    • 166-D HSV histogram
  • Similarity and matching strategy
    • Euclidean metric
    • K-NN
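
The matching strategy named above (Euclidean metric plus K-NN) can be sketched in a few lines. This is only an illustration: the 4-bin vectors are toy stand-ins for 166-D HSV histograms, and the names `euclidean`, `knn`, and `images` are assumptions, not taken from the slides.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(query, database, k):
    """Return the ids of the k database vectors closest to the query."""
    ranked = sorted(database, key=lambda image_id: euclidean(query, database[image_id]))
    return ranked[:k]

# Toy 4-bin histograms standing in for 166-D HSV histograms (assumed data).
images = {
    "sunset": [0.7, 0.2, 0.1, 0.0],
    "beach":  [0.5, 0.3, 0.1, 0.1],
    "forest": [0.0, 0.1, 0.8, 0.1],
}

nearest = knn([0.65, 0.25, 0.1, 0.0], images, k=2)
print(nearest)
```

On a single machine this is a linear scan; the rest of the talk is about how to avoid contacting every peer to do the same thing.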

8
CBIR vs Text Search
  • CBIR
    • Images are high-dimensional, so extracting
      features is very costly.
    • Indexed by a statistical feature vector.
    • Matched by similarity or distance.
  • Full Text Search
    • Indexed by an inverted file.
    • Matched by merging result lists (though the
      vector space model (VSM) was introduced to
      compare by similarity).

Mapping an inverted list to a Chord key is
straightforward, because the term is a good
representative of the whole list. For a CBIR
feature vector, such a mapping is not so
straightforward.
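
A small sketch of why this is so, assuming keys are derived by SHA-1 hashing as in Chord. The ring size `M_BITS` and the helper `chord_key` are illustrative assumptions. A term hashes to one stable key, but hashing a raw feature vector scatters near-identical vectors across the ring.

```python
import hashlib

M_BITS = 16  # assumed identifier-ring size: 2**16 positions

def chord_key(data: bytes, m_bits: int = M_BITS) -> int:
    """Hash arbitrary bytes onto an m-bit identifier ring."""
    digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest, "big") % (2 ** m_bits)

# A term is a natural key for its whole inverted list.
term_key = chord_key("sunset".encode("utf-8"))

# Two almost identical feature vectors hash to unrelated positions,
# so plain hashing does not preserve similarity.
v1 = [0.70, 0.20, 0.10]
v2 = [0.70, 0.20, 0.11]
key1 = chord_key(repr(v1).encode("utf-8"))
key2 = chord_key(repr(v2).encode("utf-8"))
print(term_key, key1, key2)
```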
9
Adapting CBIR to a DHT
How can a key represent a feature vector with
respect to its content?
  • Requirements
    • Index distribution should be based on the same
      distance function as the retrieval.
    • Indices of similar objects should be stored at
      the same peer.
    • Given a key, it should be possible to identify
      the peers that are likely to store the indices
      of objects similar to a given query.
  • PRISM

10
PRISM
[Diagram. Labels: Image Retrieval; P2P infrastructure: Unstructured P2P, Structured P2P; PRISM, Percolation, Super Peer]
11
Vector → (PRISM) → Key
  • Compute the distance to each reference vector,
    yielding a series of numbers.
  • Sort them.
  • Obtain the index of each element in the sorted
    distance list.
  • With a threshold k, take the top k indices and
    generate several pairs.
  • Compute a key from each pair.
12
View from an example
  • A vector $X$ and several preassigned reference
    vectors $r_1, r_2, \ldots, r_n$
  • Compute the distance between $X$ and $r_i$ as
    $d_i = d(X, r_i)$
  • Sort the array $d_1, d_2, \ldots, d_n$ in
    ascending order to get $d_{l_1}, d_{l_2},
    \ldots, d_{l_n}$
  • With threshold $k$, obtain the index array
    $l_1, l_2, \ldots, l_k$
  • Pairs: $(l_1, l_2), (l_1, l_3), \ldots, (l_2, l_3), \ldots$
  • Compute a key for every pair $(l_i, l_j)$
13
How to publish and search with PRISM?
  • Each key match offers a chance to compare
    similarity.
  • Semantically, one key indicates which of two
    reference vectors is closer to the feature vector.
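
A toy sketch of publish and search, with a plain dictionary standing in for the DHT. The class `ToyIndex`, the use of the ordered index pair itself as the key, and the sample vectors are all assumptions made for illustration. An image is published under every key derived from its vector; a query collects the images stored under its own keys and only then compares actual distances.

```python
import math
from collections import defaultdict
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def prism_keys(x, references, k):
    """Ordered pairs of the k nearest reference indices; each pair acts as a key."""
    order = sorted(range(len(references)), key=lambda i: euclidean(x, references[i]))
    return list(combinations(order[:k], 2))

class ToyIndex:
    """Dictionary standing in for the DHT: key -> {image_id: vector}."""
    def __init__(self, references, k):
        self.references = references
        self.k = k
        self.buckets = defaultdict(dict)

    def publish(self, image_id, vector):
        for key in prism_keys(vector, self.references, self.k):
            self.buckets[key][image_id] = vector

    def search(self, query, top_n):
        candidates = {}
        for key in prism_keys(query, self.references, self.k):
            candidates.update(self.buckets.get(key, {}))
        # Only images sharing at least one key are compared by real distance.
        ranked = sorted(candidates, key=lambda i: euclidean(query, candidates[i]))
        return ranked[:top_n]

index = ToyIndex([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], k=3)
index.publish("a", [0.9, 0.2])
index.publish("b", [0.8, 0.1])
index.publish("c", [0.1, 0.8])
hits = index.search([0.85, 0.2], top_n=2)
print(hits)
```

Image "c" shares no key with the query, so it is never fetched or compared.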

14
Experimental measures
  • $B_{ave} = R N E_{traversed} z / N$
  • $B_{max} = R z N E_{out}$
  • $D_{ave} = C z N_{copy} / N_{store}$
  • $D_{max}$
  • $P_{ave} = R N f D_{ave} / z$
  • $P_{max} = R N f D_{max} / z$

15
Measurements for PRISM without load balancing
  • $B_{ave} = 16.1$ bps
  • $B_{max} = 100$ kbps
  • $D_{ave} = 176$ KB
  • $D_{max} = 840$ MB
  • $P_{ave} = 670$ kFlOp/s
  • $P_{max} = 55$ GFlOp/s

[Slide callouts: "recall"; "server/client fashion"]
16
Applying CBIR to Unstructured P2P
[Diagram. Labels: Image Retrieval; P2P infrastructure: Unstructured P2P, Structured P2P; Percolation, PRISM, Super Peer]
17
Percolation Search in Power Law Networks
  • Based on the power-law graph
    • $P(k) = A k^{-\tau}$
  • Corollaries ($\tau = 2$)
    • $\langle k \rangle = A \ln k_{max}$, $\langle k^2 \rangle = A k_{max}$
  • 3 steps
    • Content list implantation: short random walk
    • Query implantation: short random walk
    • Bond percolation: probabilistic broadcast scheme
    • $q \approx q_c \approx \langle k \rangle / \langle k^2 \rangle = \ln k_{max} / k_{max}$
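
As a rough numerical check of the threshold, the sketch below computes the ratio of the first two degree moments for a discrete power law with exponent 2 and compares it with the closed form ln(k_max)/k_max. It also includes a minimal probabilistic broadcast in which each edge forwards the query with probability q. The function names, the degree range 1..k_max, and the tiny star graph are assumptions for illustration, not details from the slides.

```python
import math
import random

def percolation_threshold(k_max, tau=2.0):
    """Ratio <k>/<k^2> for P(k) proportional to k**(-tau), 1 <= k <= k_max."""
    degrees = range(1, k_max + 1)
    weights = [k ** (-tau) for k in degrees]
    norm = sum(weights)
    mean_k = sum(k * w for k, w in zip(degrees, weights)) / norm
    mean_k2 = sum(k * k * w for k, w in zip(degrees, weights)) / norm
    return mean_k / mean_k2

def probabilistic_broadcast(graph, start, q, rng):
    """Forward the query over each edge independently with probability q."""
    reached = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for neighbour in graph[node]:
            if neighbour not in reached and rng.random() < q:
                reached.add(neighbour)
                frontier.append(neighbour)
    return reached

k_max = 1000
q_c = percolation_threshold(k_max)
closed_form = math.log(k_max) / k_max
print(q_c, closed_form)

star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}  # assumed toy graph
everyone = probabilistic_broadcast(star, 0, 1.0, random.Random(1))
nobody = probabilistic_broadcast(star, 0, 0.0, random.Random(1))
```

The exact sum and the closed form agree to within about ten percent at k_max = 1000, which is the sense in which the slide's approximation holds.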

18
Percolation search
19
Why $q \approx \langle k \rangle / \langle k^2 \rangle$? An experimental perspective
[Figures: lattice occupancy at p = 0.6 and p = 0.5]
  • A square lattice in which sites are occupied
    with an independent probability $p$ and unoccupied
    with probability $1 - p$.
  • A cluster is a complete collection of
    interconnected sites.
  • Threshold concentration $p_c = 0.5927$ (2D
    square lattice, site percolation)
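
The lattice experiment can be reproduced with a small simulation. This is a sketch under assumptions: the lattice size, trial count, seed, and function names are arbitrary illustrative choices, and "spanning" is taken to mean a cluster connecting the top row to the bottom row.

```python
import random
from collections import deque

def has_spanning_cluster(size, p, rng):
    """Occupy each site with probability p; test for a top-to-bottom cluster."""
    occupied = [[rng.random() < p for _ in range(size)] for _ in range(size)]
    queue = deque((0, c) for c in range(size) if occupied[0][c])
    seen = set(queue)
    while queue:
        r, c = queue.popleft()
        if r == size - 1:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < size and 0 <= nc < size
                    and occupied[nr][nc] and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False

def spanning_fraction(size, p, trials, seed=0):
    """Fraction of random lattices that contain a spanning cluster."""
    rng = random.Random(seed)
    return sum(has_spanning_cluster(size, p, rng) for _ in range(trials)) / trials

for p in (0.3, 0.5, 0.6, 0.9):
    print(p, spanning_fraction(size=20, p=p, trials=20))
```

On a finite lattice the transition is smeared out, so the fraction rises gradually around the threshold rather than jumping at exactly 0.5927.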

20
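The measurement
  • $B_{ave} = R N E q z / N = R z N A \ln^2 k_{max} / (2 k_{max}) = 560$ bps
  • $B_{max} = R z N q k_{max} = R z N \ln k_{max} = 2700$ kbps
  • $D_{ave} = C z N \log_2 N / N = C z \log_2 N = 300$ KB
  • $D_{max} = 52$ MB
  • $P_{ave} = R N f D_{ave} / z = 7.9$ MFlOp/s
  • $P_{max} = R N f D_{max} / z = 1.4$ GFlOp/s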
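
The bandwidth simplifications can be worked through step by step. The derivation below assumes that E denotes the number of edges of the graph, so that E = N⟨k⟩/2; the slides do not define E, so this reading is an assumption. It then substitutes the τ = 2 values of ⟨k⟩ and q.

```latex
\begin{align*}
B_{ave} &= \frac{R N E q z}{N} = R z E q,
  \qquad E = \frac{N \langle k \rangle}{2} = \frac{N A \ln k_{max}}{2},
  \quad q = \frac{\ln k_{max}}{k_{max}} \\
        &= R z \cdot \frac{N A \ln k_{max}}{2} \cdot \frac{\ln k_{max}}{k_{max}}
         = \frac{R z N A \ln^2 k_{max}}{2 k_{max}} \\
B_{max} &= R z N q k_{max}
         = R z N \cdot \frac{\ln k_{max}}{k_{max}} \cdot k_{max}
         = R z N \ln k_{max}
\end{align*}
```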

21
Advantages of percolation search
  • Scalable
  • Searches all resources instead of only part of
    them
  • Low latency and low bandwidth consumption
  • When $\tau = 2$, time is strictly $O(\log N)$!

22
Super Peer
[Diagram. Labels: Image Retrieval; P2P infrastructure: Unstructured P2P, Structured P2P; Super Peer, Percolation, PRISM]
23
Super peer architecture
  • Resources reside on leaf peers; the search index
    resides on super peers.
  • Widely deployed
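
A minimal sketch of this split, assuming each leaf registers its feature vectors with one super peer and a query is sent to every super peer. The class and function names, the leaf names, and the toy vectors are hypothetical and only illustrate the division of roles.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class SuperPeer:
    """Holds the search index for the leaf peers attached to it."""
    def __init__(self):
        self.index = []  # entries: (image_id, leaf_name, feature_vector)

    def register(self, leaf_name, image_id, vector):
        self.index.append((image_id, leaf_name, vector))

    def query(self, vector, top_n):
        scored = [(euclidean(vector, v), image_id, leaf)
                  for image_id, leaf, v in self.index]
        return sorted(scored)[:top_n]

def search(super_peers, vector, top_n):
    """Send the query to every super peer and merge the partial results."""
    merged = []
    for peer in super_peers:
        merged.extend(peer.query(vector, top_n))
    return [(image_id, leaf) for _, image_id, leaf in sorted(merged)[:top_n]]

sp1, sp2 = SuperPeer(), SuperPeer()
sp1.register("leaf-a", "sunset", [0.7, 0.2, 0.1])
sp1.register("leaf-b", "forest", [0.0, 0.1, 0.9])
sp2.register("leaf-c", "beach", [0.5, 0.3, 0.2])
hits = search([sp1, sp2], [0.65, 0.25, 0.1], top_n=2)
print(hits)
```

The result names the leaf that holds each image, so the image itself would still be fetched from that leaf.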

24
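Super node P2P measurement
  • $B_{ave} = R N (sN + 1) z / N \approx R z s N \approx R z N^{1/2} = 560$ bps
  • $B_{max} = R z N = 31.2$ Mbps
  • $D_{ave} = C z N \cdot 1 / N$; $C z / s$; 16KB
  • $D_{max} = C z / s \approx C z N^{1/2} = 11$ MB
  • $P_{ave} = R N f D_{ave} / z = 420$ kFlOp/s
  • $P_{max} = R N f D_{max} / z \approx R C f N N^{1/2} = 300$ MFlOp/s
  • We set $s = N^{-1/2}$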

25
Conclusion (1)
  • The performance of structured and unstructured
    systems seems to be pretty close in our
    application domain.
  • Unstructured systems have the advantage of being
    more flexible with respect to the queries they
    allow.
  • I should mention that this conclusion is similar
    to the one in my recent presentation.

26
Conclusion (2)
  • CBIR over P2P
    • Structured P2P
      • Requires devising a mechanism to convert a
        vector into a content-based key (the
        performance is unsatisfactory).
    • Unstructured P2P
      • Feasible.
      • The content-based index of an image is not
        structured!

27
Q&A
28
Thank you! Richard Tang, Nov. 2006