To trust or not, is hardly the question!

Provided by: publi5

1
Wikipedia
  • To trust or not, is hardly the question!

2
"We're never so vulnerable than when we trust
someone, but paradoxically, if we cannot trust,
neither can we find love or joy." - Walter Anderson
Trust
Quality
Popularity
Reach
How much we can trust is the right question
3
Agenda
  • Review two articles
  • Briefly summarize other publications

5
Content quality
  • What are the hallmarks of consistently good
    information?
  • Objectivity: unbiased information
  • Completeness: self-explanatory
  • Pluralism: not restricted to a particular
    viewpoint
  • Define propositions of trust

6
Propositions of trust
7
UML Model for Wikipedia
8
Macro-areas of analysis
  • Six macro-areas: quality of user, user
    distribution and leadership, stability,
    controllability, quality of editing, and
    importance of an article.
  • Using the ten propositions, 50 sources of trust
    evidence are identified.

9
Logic conditions
  • Necessary to control the meaning of each trust
    factor in relationship to the others
  • IF stability is high AND (length is short OR edit
    is low OR importance is low) THEN warning
  • IF leadership is high AND dictatorship is
    high THEN warning
  • IF length is high AND importance is low THEN
    warning
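
The three conditions above can be sketched as a small rule check. This is only an illustration: it assumes each trust factor has already been reduced to a two-level value (high or not high), and the names `TrustFactors`, `check_warnings` and the rule labels are hypothetical, not taken from the reviewed article.

```python
from dataclasses import dataclass


@dataclass
class TrustFactors:
    # Assumption: every factor is a boolean "is high" flag.
    # "short" and "low" on the slide are treated as "not high".
    stability_high: bool
    length_high: bool
    edit_high: bool
    importance_high: bool
    leadership_high: bool
    dictatorship_high: bool


def check_warnings(f: TrustFactors) -> list:
    """Return the labels of the slide's rules that raise a warning."""
    fired = []
    # Rule 1: stable, but short OR little edited OR unimportant
    if f.stability_high and (
        not f.length_high or not f.edit_high or not f.importance_high
    ):
        fired.append("R1")
    # Rule 2: high leadership together with high dictatorship
    if f.leadership_high and f.dictatorship_high:
        fired.append("R2")
    # Rule 3: long but unimportant
    if f.length_high and not f.importance_high:
        fired.append("R3")
    return fired
```

For example, a long, stable, heavily edited article of low importance would raise the first and third warnings under these assumptions.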

10
Calculation of Trust
11
Evaluation
  • Featured articles vs. Standard articles

12
Cluster Analysis
14
Models
  • Basic
  • The better the authors, the better the article
    quality
  • PeerReview
  • Assumption: a contributor reviews the content
    before modifying it, thereby approving the
    content that he/she does not edit

15
Models
  • ProbReview
  • Improved assumption: a contributor may not review
    the entire article before modifying it
  • The farther a word is from another that the
    author has written, the lower the probability
    that he/she has read it
  • In conflicts (several of the author's words give
    different probabilities), the higher probability
    is considered
  • Probability is modeled as a monotonically
    decaying function of the distance between the
    words
  • Naïve
  • The longer the article, the better its
    quality
  • Used as a baseline for comparison
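
A minimal sketch of the ProbReview idea, assuming word positions are integer indexes into the article. The decay `1 / (1 + d)` is a placeholder chosen only because it decreases monotonically with distance; the slides do not give the actual decay schemes, and `review_probability` is a hypothetical name.

```python
def review_probability(word_pos, authored_positions,
                       decay=lambda d: 1.0 / (1.0 + d)):
    """Probability that an author has read the word at word_pos.

    authored_positions: positions of the words this author wrote.
    decay: monotonically decreasing function of word distance
           (placeholder form, an assumption of this sketch).
    """
    if not authored_positions:
        return 0.0  # the author wrote nothing in this article
    # When several authored words give different probabilities,
    # the higher probability is used, as stated on the slide.
    return max(decay(abs(word_pos - p)) for p in authored_positions)
```

With a monotonic decay, taking the maximum is the same as measuring the distance to the nearest word the author wrote.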

16
Iterative computation
  • Initialize all quality and authority values
    equally
  • For each iteration
  • Use authority values from previous iteration to
    compute quality
  • Use quality values to compute authority
  • Normalize all quality and authority values
  • Repeat step 2 until convergence (alternatively,
    repeat until the difference is very small or the
    maximum number of iterations has been reached)
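
The loop above can be sketched as follows for the Basic model. The input format (word counts per author per article), the word-count weighting, and the names `normalize` and `iterate_quality_authority` are assumptions made for illustration, not the reviewed paper's exact formulation.

```python
def normalize(values):
    """Scale values so they sum to 1 (left unchanged if the sum is 0)."""
    total = sum(values.values())
    return {k: v / total for k, v in values.items()} if total else dict(values)


def iterate_quality_authority(contrib, max_iter=100, tol=1e-9):
    """contrib[article][author] = words by that author in that article."""
    articles = list(contrib)
    authors = sorted({u for words in contrib.values() for u in words})
    # Step 1: initialize all quality and authority values equally.
    quality = {a: 1.0 / len(articles) for a in articles}
    authority = {u: 1.0 / len(authors) for u in authors}
    for _ in range(max_iter):
        # Quality from the previous iteration's authority values.
        new_quality = normalize({
            a: sum(w * authority[u] for u, w in contrib[a].items())
            for a in articles
        })
        # Authority from the freshly computed quality values.
        new_authority = normalize({
            u: sum(contrib[a].get(u, 0) * new_quality[a] for a in articles)
            for u in authors
        })
        delta = max(
            max(abs(new_quality[a] - quality[a]) for a in articles),
            max(abs(new_authority[u] - authority[u]) for u in authors),
        )
        quality, authority = new_quality, new_authority
        if delta < tol:  # stop once the change is very small
            break
    return quality, authority
```

Calling it with `{"A": {"alice": 300, "bob": 50}, "B": {"bob": 100}}` ranks article A above B and alice above bob, since quality and authority reinforce each other.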

17
Evaluation
  • Use a set of articles on countries that have been
    assigned quality labels by Wikipedia's Editorial
    Team
  • Preprocessing
  • Bot revisions were removed from the analysis.
  • Consecutive edits by the same user were collapsed
    and only the final edit was kept.
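
A rough sketch of this preprocessing, assuming the revision history is a chronological list of `(user, text)` pairs and that bot accounts are known in advance. Removing bots before collapsing consecutive edits is an assumption about the order of the two steps; `preprocess` is a hypothetical name.

```python
def preprocess(revisions, bots):
    """Drop bot revisions, then keep only the last edit of each
    run of consecutive edits by the same user.

    revisions: list of (user, text) tuples in chronological order.
    bots: set of user names treated as bots (assumed to be given).
    """
    kept = []
    for user, text in revisions:
        if user in bots:
            continue  # bot revisions are removed from the analysis
        if kept and kept[-1][0] == user:
            kept[-1] = (user, text)  # same user again: keep the later edit
        else:
            kept.append((user, text))
    return kept
```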

18
Evaluation metrics
  • Normalized discounted cumulative gain at top k
    (NDCG_at_k)
  • Suited for ranked articles that have multiple
    levels of assessment
  • Spearman's rank correlation
  • Relevant for comparing the agreement between two
    rankings of the same set of objects
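
Both metrics can be sketched in a few lines. The gain `2**label - 1` with a log2 discount is one common NDCG variant and is an assumption here, since the slides do not say which variant was used; the Spearman formula shown is the standard one for rankings without ties.

```python
import math


def dcg_at_k(labels, k):
    """Discounted cumulative gain over the first k quality labels."""
    return sum((2 ** rel - 1) / math.log2(i + 2)
               for i, rel in enumerate(labels[:k]))


def ndcg_at_k(labels_in_ranked_order, k):
    """labels_in_ranked_order: graded quality labels of the articles,
    listed in the order the model ranked them (best first)."""
    ideal = dcg_at_k(sorted(labels_in_ranked_order, reverse=True), k)
    return dcg_at_k(labels_in_ranked_order, k) / ideal if ideal > 0 else 0.0


def spearman(rank_a, rank_b):
    """Spearman's rho for two tie-free rankings of the same objects;
    rank_a[i] and rank_b[i] are the ranks of object i."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

A ranking that lists articles in descending label order scores an NDCG of 1, and two identical rankings have a Spearman correlation of 1.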

19
Results
20
Conclusions
  • ProbReview works best with decay scheme 2 or 3.
  • Article length seems to be correlated with
    article quality
  • Adding article length to the Basic and PeerReview
    models showed some improvement, but ProbReview
    did not benefit

22
Summary
  • Revision trust model may help address:
  • Article trust
  • Fragment trust
  • Author trust
  • A dynamic Bayesian network is used to model the
    evolution of article trust over revisions
  • Wikipedia featured articles, clean-up articles
    and normal articles are used for evaluation

23
Results
25
Summary
  • Uses revision history as well as the reputation
    of the contributing authors
  • Assigns trust to text

27
Summary
  • Propose the use of a trust tab in Wikipedia
  • Link-ratio: ratio between the number of citations
    and the number of non-cited occurrences of the
    encyclopedia term
  • Evaluation: compare link-ratio values for
    featured, normal and clean-up articles
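
One possible reading of the link-ratio, sketched under the assumption that a "citation" is a wiki link of the form `[[Term]]` or `[[Term|label]]` and that a non-cited occurrence is the term appearing as plain text. The function name `link_ratio` and this interpretation are assumptions, not the paper's definition.

```python
import re

# Matches [[Target]] and [[Target|label]]; group 1 is the link target.
LINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def link_ratio(term, wikitext):
    """Linked occurrences of term divided by its plain-text occurrences."""
    targets = [m.group(1).strip().lower() for m in LINK.finditer(wikitext)]
    cited = targets.count(term.lower())
    # Remove all links, then count what is left as non-cited occurrences.
    plain = LINK.sub(" ", wikitext)
    pattern = r"\b" + re.escape(term) + r"\b"
    non_cited = len(re.findall(pattern, plain, flags=re.IGNORECASE))
    if non_cited == 0:
        return float("inf") if cited else 0.0
    return cited / non_cited
```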

29
Summary
  • Propose a content-driven reputation system for
    authors
  • Authors gain reputation when their work is
    preserved by subsequent authors and lose
    reputation when edits are undone or quickly
    rolled back
  • Evaluation: edits by low-reputation authors have a
    larger-than-average probability of being judged
    poor quality by human observers and of being
    undone by later editors

31
Summary
  • A different question: which articles are
    controversial?
  • Uses edit and collaboration history
  • Two models: Basic and Contributor Rank
  • Contributor Rank model tries to differentiate
    between disputes due to the article and those due
    to the aggressiveness of the contributors, with
    the former being the one that is to be measured
  • Evaluation: identification of labeled
    controversial articles

32
Conclusions
  • Interesting area to work on
  • Different angles to consider and different
    questions too
  • Data is easily available and has lots of relevant
    features
  • Articles classified by the Wikipedia Editorial
    Team help evaluation
  • Great scope for more work in this area
  • I want to look at this from the health
    perspective

33
Thank You
  • Feb 29, 2008