Title: Can Peer-to-Peer File-sharing be of Help for Research Communities ?
1Can Peer-to-Peer File-sharing be of Help for
Research Communities ?
- Julita Vassileva
- Computer Science Department (MADMUC Lab)
- University of Saskatchewan
2Outline
- Motivation
- Problems: user participation, trust
- Motivating user participation
- User modelling
- Reward with better QoS
- Social awareness (visualization)
- Ensuring trust
- Conclusions
3Motivation
- Need a search engine for locally stored papers
- Web links disappear, protected sites
- Hard disks too large
- Why P2P?
- Harvest the resources of a community of users
- Advantages of a distributed approach vs. a centralized one
4What is a P2P System?
GNUTELLA
5COMTELLA
- A P2P (Gnutella-based) system for file and service sharing
- users share academic papers, code snippets
- Non-centralized digital library for a research group / class
- Can be downloaded from
- http://bistrica.usask.ca/madmuc/news.htm
6Vassileva J. (2002) Supporting Peer-to-Peer User Communities, in R. Meersman, Z. Tari et al. (Eds.) "On the Move to Meaningful Internet Systems 2002: CoopIS, DOA, and ODBASE", Coordinated International Conferences Proceedings, Irvine, Springer LNCS 2519, 230-247.
Vassileva J. (2002) Motivating Participation in Peer to Peer Communities, in P. Peta, R. Tolksdorf, F. Zambonelli (Eds.) Engineering Societies in the Agents World III, Proceedings of the 3rd International Workshop ESAW'02, Madrid, Springer LNAI 2577, 141-155.
Bretzke H., Vassileva J. (2003) Motivating Cooperation in Peer to Peer Networks, in P. Brusilovsky, A. Corbett, F. De Rosis (Eds.) Proceedings of the 9th International Conference on User Modelling, UM03, Johnstown, PA, Springer LNCS, 218-227.
Lingling Sun: Graduate student
Christopher Cox: NSERC Summer 2002 project
Helen Bretzke: CRA-W and NSERC Summer 2002 project
Yamini Upadrashta: Graduate student
8Problems
- User Participation
- critical mass needed
- most users are free-riders
- why do people contribute?
- satisfies a need (is useful)
- doesn't cost (effort, money, inconvenience)
- there is some incentive (money, glory, power)
- serves a greater cause (e.g. cancer research, SETI@home, etc.)
- Trust
- sure that contributing won't cause harm
- able to identify trustworthy peers
9First condition: the system must be useful
- Allow searching own files
- Any file stored on disk can be found with Comtella
- Shared files can be stored anywhere on disk
- Integration with other tools
- With Browser (e.g. IE, Netscape, Mozilla, etc.)
- allows viewing files directly from Comtella
- prompts the user to share papers when a PDF file is opened
- With Word Processor (e.g. MS Word)
- generating lists of references automatically
- Additional functionality
- Adding annotations and ratings to papers
10Levels of participation
- Bring new files
- Provide disk space / processor time
- Dispatch requests
- Stay on-line
- Use and quit
11How to motivate participation?
- Why do people offer their time and resources?
- Different people have different motivations
- Some would help their friends and hope to make
new friends through helping
- Some would expect better service
12Incentives
- Micro-payments for each transaction?
- Shirky says it won't ever work (e.g. Mojo-nation)
- Flat rates work better (e.g. Internet, cable)
- How to map virtual currency into real money?
13How to motivate participation?
- Why do people offer their time and resources?
- Different people have different motivations
- Some are altruists (for the cause)
- Some would help their friends and hope to make
new friends through helping
- Some would expect better service
14Know your user!
Modelling
- User Type: Altruist? Socialist? Utilitarian?
- User Interests: What does she search / need?
- User Relationships and Community: Who shares interest with the user? Potential friends and foes.
15Modelling user interests
- Define a taxonomy of subject categories (e.g. ACM subject index)
- Keep track of the categories of queries (→ user interests)
- Keep track of resources offered by the user in each interest category
- Update user level of interest in each sub-category using reinforcement learning
- Cluster users in interest-based groups
16Computing user interests
- Reinforcement learning
- The user's strength of interest $S_a$ in an area $a$ is calculated based on how frequently and how recently the user has searched in this area.
- $S_a(e_t, t) = i \cdot S_a(e_{t-1}, t-1) + (1 - i) \cdot e_t$
- where $e_t \in [0, 1]$ is calculated as $e_t = 1/d$, and
- $d = 1 + \mathit{level\_distance}$, the distance between the level of the sub-area of the query and the level of the area $a$ in the ontology hierarchy. Currently, the ontology hierarchy has only 2 levels, so $e_t = 0.5$
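The interest update on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the weight i is exposed as an `inertia` parameter, and its default of 0.7 is an arbitrary choice, since the slides do not give a value.

```python
def evidence(level_distance: int) -> float:
    """Evidence e_t = 1/d with d = 1 + level_distance, as on the slide."""
    if level_distance < 0:
        raise ValueError("level_distance must be non-negative")
    return 1.0 / (1 + level_distance)


def update_interest(previous: float, level_distance: int,
                    inertia: float = 0.7) -> float:
    """One step of S_a(t) = i * S_a(t-1) + (1 - i) * e_t.

    `inertia` stands for the weight i; 0.7 is a hypothetical default,
    not a value taken from the slides.
    """
    e_t = evidence(level_distance)
    return inertia * previous + (1 - inertia) * e_t


# With a two-level hierarchy, a query in a sub-area has level_distance 1.
strength = 0.0
for _ in range(3):
    strength = update_interest(strength, level_distance=1, inertia=0.5)
```

Repeated queries in the same area pull the strength towards e_t, while the inertia term controls how quickly older searches stop counting.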
17Modelling user relationships
- Monitor whose files the user chooses, the quality of the files (does the user keep the files), and who downloads files offered by the user
- Represent each user relationship
- For each area of interest
- Strength: how successful service was given (reinforcement learning used, similar to user interests)
- Balance: reciprocity of services used / given
- Adapt P2P topology: form a neighborhood for search using the best relationships (friends) in the area of search
Gnutella
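As a rough sketch of how a search neighborhood could be formed from the best relationships in an area, consider the following. The `Relationship` record, its field ranges, and the `size` cut-off are assumptions made for illustration; the slides do not specify how Comtella stores or ranks relationships.

```python
from dataclasses import dataclass


@dataclass
class Relationship:
    """Hypothetical record of one relationship in one area of interest."""
    peer: str
    area: str
    strength: float  # assumed to lie in [0, 1], like interest strength
    balance: float   # assumed to lie in [-1, 1]


def search_neighborhood(relationships, area, size=3):
    """Return the peers with the strongest relationships in `area`."""
    in_area = [r for r in relationships if r.area == area]
    in_area.sort(key=lambda r: r.strength, reverse=True)
    return [r.peer for r in in_area[:size]]


rels = [
    Relationship("ann", "p2p", 0.9, 0.1),
    Relationship("bob", "p2p", 0.4, 0.0),
    Relationship("cat", "hci", 0.8, 0.0),
    Relationship("dan", "p2p", 0.7, -0.2),
]
friends = search_neighborhood(rels, "p2p", size=2)
```

Ranking by strength alone is the simplest choice; a real system might also weigh the balance of the relationship.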
18Computing the balance of a relationship
- $B_{XY} = (N_{X \to Y} - N_{Y \to X}) / (N_{X \to Y} + N_{Y \to X})$
- $B_{XY} \in [-1, 1]$
- $N_{X \to Y}$: number of times X took from Y
- $N_{Y \to X}$: number of times Y took from X
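The balance of a relationship is a simple ratio, sketched below. The function name is hypothetical, and returning 0 when no exchange has happened yet is an assumption, since the slide does not cover that case.

```python
def relationship_balance(x_took_from_y: int, y_took_from_x: int) -> float:
    """Balance B_XY = (N_xy - N_yx) / (N_xy + N_yx), a value in [-1, 1]."""
    total = x_took_from_y + y_took_from_x
    if total == 0:
        # Assumption: with no exchanges, treat the relationship as balanced.
        return 0.0
    return (x_took_from_y - y_took_from_x) / total
```

For example, if X took three files from Y and Y took one from X, the balance is (3 - 1) / (3 + 1) = 0.5.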
19Modelling user type
- Monitor user's actions regarding file sharing, relative time spent on-line, acts of interrupting service, total balance of user's giving / taking
- Update a number in [-1, 1] representing user's cooperativeness
- Motivational actions in the interface triggered by passing certain thresholds
20Computing user type
- The measure of user cooperativeness at time t: $C(w_t, t) = i \cdot C(w_{t-1}, t-1) + (1 - i) \cdot w_t$,
- $w \in [-1, 0) \cup (0, 1]$ represents the weight of evidence, where $w < 0$ is a selfish act while $w > 0$ is an altruistic act.
- $\mathit{overallBalance} = (1/n) \sum_Y B_{XY}$
- $\mathit{userType} = (C(w_t, t) + \mathit{overallBalance}) / 2$
- If userType is in $[-1, -0.5)$ then the user is selfish, if it is in $[-0.5, 0.5]$ then the user is reciprocal, and if it is in $(0.5, 1]$ then the user is altruistic.
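The user-type computation on this slide could be sketched as follows. Function names are hypothetical, `inertia` stands for the weight i with an arbitrary default, and treating a user with no relationships as having an overall balance of 0 is an assumption.

```python
def update_cooperativeness(previous: float, w: float,
                           inertia: float = 0.7) -> float:
    """One step of C(t) = i * C(t-1) + (1 - i) * w_t.

    w must be a non-zero weight in [-1, 1]: negative for a selfish act,
    positive for an altruistic one. The default inertia is hypothetical.
    """
    if w == 0 or not -1 <= w <= 1:
        raise ValueError("w must be non-zero and within [-1, 1]")
    return inertia * previous + (1 - inertia) * w


def user_type(cooperativeness: float, balances) -> float:
    """Average of cooperativeness and the mean relationship balance."""
    balances = list(balances)
    # Assumption: no relationships yet means an overall balance of 0.
    overall = sum(balances) / len(balances) if balances else 0.0
    return (cooperativeness + overall) / 2


def classify(value: float) -> str:
    """Map a userType value in [-1, 1] to the three classes on the slide."""
    if value < -0.5:
        return "selfish"
    if value <= 0.5:
        return "reciprocal"
    return "altruistic"
```

A user with cooperativeness 0.8 and balances 0.4 and 0.6 gets userType (0.8 + 0.5) / 2 = 0.65, which falls in the altruistic range.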
21Rewarding relationships
- People who share a lot of useful files and behave cooperatively will have more friends
- Friends are treated differently
- Transfers not interrupted
- Queries processed with priority
- Queries are propagated farther
- Queries sent to friends in the area
- Higher chance of having relevant files
- Faster responses
- Better quality of files
- People with more friends get better Quality of
Service!
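One way to picture the preferential treatment of friends is a query queue that serves friends first and lets their queries travel farther. This is a minimal sketch, not Comtella's actual implementation: the class, the `BASE_TTL` and `FRIEND_BONUS_TTL` constants, and the two-level priority scheme are all hypothetical.

```python
import heapq
from itertools import count

BASE_TTL = 4          # hypothetical hop limit for ordinary queries
FRIEND_BONUS_TTL = 2  # hypothetical extra hops granted to friends


class QueryQueue:
    """Serves queries from friends before queries from other peers."""

    def __init__(self, friends):
        self.friends = set(friends)
        self._heap = []
        self._order = count()  # tie-breaker keeps arrival order

    def submit(self, sender, query):
        is_friend = sender in self.friends
        priority = 0 if is_friend else 1
        ttl = BASE_TTL + (FRIEND_BONUS_TTL if is_friend else 0)
        heapq.heappush(self._heap,
                       (priority, next(self._order), sender, query, ttl))

    def next_query(self):
        _, _, sender, query, ttl = heapq.heappop(self._heap)
        return sender, query, ttl


queue = QueryQueue(friends={"alice"})
queue.submit("bob", "p2p incentives")
queue.submit("alice", "trust models")
first = queue.next_query()
second = queue.next_query()
```

Although bob's query arrived first, alice's is processed first and carries a larger hop limit, which corresponds to "processed with priority" and "propagated farther".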
22Evaluation results - simulation
- Comparing the round trip time obtained for queries without a friends list with the round trip time for queries with a friends list
23Evaluation results - user experiment
8 users over a week
24Summary of results
- The simulation results show that peers obtain results faster when searching for files in categories for which they have friends
- The user evaluation is still underway
- Does the QoS reward motivate participation?
25Social awareness
In isolation, selfishness is logical
To gain perspective, users need feedback about
their social environment
In cities, the sidewalks provide the right kinds
and numbers of interactions from which
neighborhoods emerge.
26A matter of scale
- Provides visual feedback
- Resolves scale
- Attractive, interesting
An astronomical metaphor
27Views of the community
- Connectivity (currently reachable peers)
- Ranking of peers by contribution
- number of shared files
- balance of relationships
- Papers shared by each peer
- Interests of each peer
28Architecture
Introducing a non-vital server or many servers
Server
- Collect info. from peers
- Generate community views
29Ranking of peers based on contribution
30Shared interests
31Personalized views
- Who are my friends in this area?
- How strong is my relationship with them?
- How much have they contributed?
- Do I owe them or do they owe me?
- Which files do they share?
- What have they been searching for / downloading
recently?
32Trust
- We already model the strength of relationships between users
- Based on counting downloads / uploads
- We can incorporate an explicit measure of the quality of resource
- Idea: Let users
- Rate their resources (quality of paper)
- Add annotations (summaries) of papers
33Immediate benefit
- Learning effect: compiling reviews of articles
- Visualization of document ranking in given category of interest: top 10 list
- Professor / Boss will know who has read and annotated paper and who has not → could have a motivation effect on participation.
34Reputation
- Global reputation of peers can be computed
- Ranking of peers based on
- how many highly rated papers they share
- how many times they have introduced a new paper in the system that has become highly rated
- how the user's ratings correlate with those of their peers and with high-rank peers
- Emergence of Power peers
- What extra rights will they have (reward)?
- Could have a motivational effect, as in Slashdot.com
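A possible sketch of the first two ranking criteria is shown below. It is an illustration only: the data structures, the `threshold` for "highly rated", and the equal weighting of the two counts are assumptions, and the third criterion (correlation of ratings with other peers) is left out.

```python
from statistics import mean


def rank_peers(shared, introduced_by, ratings, threshold=4.0):
    """Rank peers by highly rated papers shared plus those they introduced.

    shared:        peer -> set of paper ids the peer shares
    introduced_by: paper id -> peer who first brought it into the system
    ratings:       paper id -> list of numeric ratings
    threshold:     hypothetical cut-off for "highly rated" (mean rating)
    """
    highly_rated = {paper for paper, rs in ratings.items()
                    if rs and mean(rs) >= threshold}
    scores = {}
    for peer, papers in shared.items():
        shared_good = len(set(papers) & highly_rated)
        introduced_good = sum(1 for paper in highly_rated
                              if introduced_by.get(paper) == peer)
        # Assumption: both criteria count equally.
        scores[peer] = shared_good + introduced_good
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda peer: (-scores[peer], peer))


shared = {"ann": {"p1", "p2"}, "bob": {"p1"}, "cat": {"p3"}}
introduced_by = {"p1": "ann", "p2": "bob", "p3": "cat"}
ratings = {"p1": [5, 4], "p2": [2, 3], "p3": [4, 4]}
ranking = rank_peers(shared, introduced_by, ratings)
```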
35Community views
- Connectivity (currently reachable peers)
- What are these peers interested in / sharing
- Ranking of peers by contribution
- Shared interest clusters
- Personalized views (who are my friends?)
- Ranking of resources (papers)
- Reputation of peers
36Updating trust in peers
- Relationships → subjective trust in the source of the paper (the other peer)
- Trust depends on the evaluation criteria of the peer
- Compare own rating of paper with the rating given by the source
- If ratings are sufficiently close, increase trust in source, else decrease trust
- Trust depends on category of interest
- Combined trust measures for peers?
- Peers share their trust measures (gossip)
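The rating-comparison rule above might look like this in code. Everything numeric here is an assumption made for the sketch: trust is kept in [0, 1] with a neutral start of 0.5, and `tolerance` and `step` are hypothetical parameters for "sufficiently close" and for the size of each adjustment.

```python
class TrustTable:
    """Per-category subjective trust in other peers (illustrative only)."""

    def __init__(self, tolerance=1.0, step=0.1, initial=0.5):
        self.tolerance = tolerance  # max rating gap counted as agreement
        self.step = step            # how much trust moves per comparison
        self.initial = initial      # assumed neutral starting trust
        self._trust = {}            # (peer, category) -> value in [0, 1]

    def trust(self, peer, category):
        return self._trust.get((peer, category), self.initial)

    def record(self, peer, category, own_rating, source_rating):
        """Compare own rating with the source's and adjust trust."""
        current = self.trust(peer, category)
        if abs(own_rating - source_rating) <= self.tolerance:
            updated = min(1.0, current + self.step)
        else:
            updated = max(0.0, current - self.step)
        self._trust[(peer, category)] = updated
        return updated


table = TrustTable(tolerance=1.0, step=0.1)
after_agreement = table.record("ann", "p2p", own_rating=4, source_rating=5)
after_disagreement = table.record("ann", "p2p", own_rating=1, source_rating=5)
```

Keying the table by (peer, category) reflects the point that trust depends on the category of interest: agreement with ann on P2P papers says nothing about her ratings in another area.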
37Trust and reputation
- Yao Wang
- Ph.D. student
- Wang Y., Vassileva J. (to appear) Bayesian Network-Based Trust Model, Proc. of IEEE/WIC International Conference on Web Intelligence (WI 2003), October 13-17, 2003, Halifax, Canada.
- (best paper award nominee)
38Applying a Bayesian network trust model to
COMTELLA
[Figure: Bayesian network with nodes T, File quality, Paper category (subject area), Reliability (download), Paper rating]
39Future work
- Incorporating a trust and reputation mechanism into Comtella
- to protect from malicious file-sharers
- to ensure that users share papers with appropriate peers and benefit most from their articles and comments
40 take-home messages
- Motivating user participation is crucial
- Building in mechanisms for trust and reputation
- Encouraging contribution
- building relationships
- Rewards by better quality of service
- reputation / visibility
- Techniques
- Modeling user interests, relationships, user type
- Creating community awareness through visualization
- Will allow users to find reputable sources
- May protect community from malicious or
irresponsible peers