Diapositiva 1 - PowerPoint PPT Presentation

Title: Diapositiva 1


1
Social Networks, the case of Wikipedia
  • A. Capocci (1), D. Donato (2), S. Leonardi (2), F. Rao (1),
    L. Salete Buriol (2), V. Zlatic (1), G.C. (1)
  • (1) CNR-INFM Centre SMC, Dep. of Physics, University
    Sapienza, Rome, Italy
  • (2) Dep. of Informatics, University Sapienza, Rome,
    Italy

PRE 74, 036116 (2006); EPL 81, 28006 (2008)
2
INTRODUCTION
3
(No Transcript)
4
INTRODUCTION
Wikipedia in other languages
You may read and edit articles in many different languages:
  • Wikipedia encyclopedia languages with over 100,000 articles:
    Deutsch (German), Français (French), Italiano (Italian),
    日本語 (Japanese), Nederlands (Dutch), Polski (Polish),
    Português (Portuguese), Svenska (Swedish)
  • Wikipedia encyclopedia languages with over 10,000 articles:
    العربية (Arabic), Български (Bulgarian), Català (Catalan),
    Česky (Czech), Dansk (Danish), Eesti (Estonian), Español
    (Spanish), Esperanto, Galego (Galician), עברית (Hebrew),
    Hrvatski (Croatian), Ido, Bahasa Indonesia (Indonesian),
    한국어 (Korean), Lietuvių (Lithuanian), Magyar (Hungarian),
    Bahasa Melayu (Malay), Norsk bokmål (Norwegian), Norsk
    nynorsk (Norwegian), Română (Romanian), Русский (Russian),
    Slovenčina (Slovak), Slovenščina (Slovenian), Српски
    (Serbian), Suomi (Finnish), Türkçe (Turkish), Українська
    (Ukrainian), 中文 (Chinese)
  • Wikipedia encyclopedia languages with over 1,000 articles:
    Alemannisch (Alemannic), Afrikaans, Aragonés (Aragonese),
    Asturianu (Asturian), Azərbaycan (Azerbaijani), Bân-lâm-gú
    (Min Nan), Беларуская (Belarusian), Bosanski (Bosnian),
    Brezhoneg (Breton), Чăваш чĕлхи (Chuvash), Corsu (Corsican),
    Cymraeg (Welsh), Ελληνικά (Greek), Euskara (Basque), فارسی
    (Persian), Føroyskt (Faroese), Frysk (Western Frisian),
    Gaeilge (Irish), Gàidhlig (Scots Gaelic), हिन्दी (Hindi),
    Interlingua, Íslenska (Icelandic), Basa Jawa (Javanese),
    ქართული (Georgian), ಕನ್ನಡ (Kannada), Kurdî / كوردی (Kurdish),
    Latina (Latin), Latviešu (Latvian), Lëtzebuergesch
    (Luxembourgish), Limburgs (Limburgish), Македонски
    (Macedonian), मराठी (Marathi), Napulitana (Neapolitan),
    Occitan, Ирон (Ossetic), Plattdüütsch (Low Saxon), Scots,
    Sicilianu (Sicilian), Simple English, Shqip (Albanian),
    Sinugboanon (Cebuano), Srpskohrvatski/Српскохрватски
    (Serbo-Croatian), தமிழ் (Tamil), Tagalog, ภาษาไทย (Thai),
    Tatarça (Tatar), తెలుగు (Telugu), Tiếng Việt (Vietnamese),
    Walon (Walloon)
Complete list, Multilingual coordination, Start a Wikipedia
in another language
5
INTRODUCTION
A Nature investigation aimed to establish whether Wikipedia
is as authoritative a source of information as
established sources such as Encyclopaedia
Britannica.
  • Among 42 entries tested, the difference in
    accuracy was not particularly great:
  • the average science entry in Wikipedia contained
    around four inaccuracies,
  • the one in Britannica about three.
  • On the other hand, the articles on Wikipedia are
    longer on average than those of Britannica. This
    would imply a lower rate of errors per unit of
    text in Wikipedia.
  • In a survey of more than 1,000 Nature authors:
  • 70% had heard of Wikipedia;
  • of those, 17% consulted it on a weekly basis;
  • fewer than 10% help to update it.

(Nature 438, 900-901, 2005)
6
(No Transcript)
7
INTRODUCTION
  • Sociological reasons: the encyclopedia collects
    pages written by a number of independent and
    heterogeneous individuals. Each of them
    autonomously decides on the content of the
    articles, with the only constraint of a predefined
    layout. This autonomy is a common feature of
    content creation on the Web. The Wikipedia
    authors' community is formed by members whose
    only wish is to make available to the world
    concepts and topics that they consider
    meaningful. In some sense, tracing the evolution
    of the Wikipedia subsets should mirror the
    development of significant trends within each
    linguistic community.
  • Time information: Wikipedia provides time
    information associated with nodes. Moreover, it
    keeps the history, that is, the time of
    creation and of each modification of every page
    in the dataset.
  • Independence from external links: Wikipedia
    articles link mainly to articles in the same
    dataset.
  • Variety of graph sizes: one graph can be collected
    per language, and the graph sizes vary
    from a few hundred pages up to half a million pages.

8
DATA
We generated six wikigraphs, wikiEN, wikiDE,
wikiFR, wikiES, wikiIT and wikiPT, from
the English, German, French, Spanish, Italian and
Portuguese datasets, respectively. The graphs
were obtained from an old dump of June 13, 2004.
We are not using the current data due to disk
space restrictions: the English dataset of June
2005 is more than 36 GB compressed, that is about
200 GB expanded.
The most visited page was the main
page for wikiEN, wikiDE, wikiFR and wikiES,
while for the datasets wikiIT and wikiPT
no visits were associated with the pages.
9
DATA
  • SCC (Strongly Connected Component) includes
    pages that are mutually reachable by traveling on
    the graph.
  • IN component is the region from which one can
    reach the SCC.
  • OUT component encompasses the pages reached from
    the SCC.
  • TENDRILS are pages reachable from the IN
    component and not pointing to the SCC or OUT region.
    TENDRILS also include those pages that point to
    the OUT region without belonging to any of the other
    defined regions.
  • TUBES connect the IN and OUT regions directly,
    bypassing the SCC.
  • DISCONNECTED regions are those isolated from the
    rest.

The bow-tie structure, found in the WWW (Broder
et al., Comp. Net. 33, 309, 2000)
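The component definitions above can be turned into a small decomposition routine. The following is only a minimal sketch, not the procedure used for the Wikigraphs: the function names `reachable` and `bow_tie` are hypothetical, the largest SCC is found with a simple quadratic search that is only practical for small graphs, and every edge endpoint is assumed to be listed in `nodes`.

```python
from collections import deque


def reachable(starts, adjacency):
    """Return every node reachable from `starts` (inclusive) by BFS."""
    seen = set(starts)
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def bow_tie(nodes, edges):
    """Split a directed graph into bow-tie regions (hypothetical helper).

    Assumes every endpoint of `edges` appears in `nodes`.
    """
    nodes = set(nodes)
    fwd, bwd = {}, {}
    for src, dst in edges:
        fwd.setdefault(src, []).append(dst)
        bwd.setdefault(dst, []).append(src)

    # Largest SCC: forward reach intersected with backward reach.
    # Quadratic in the worst case, acceptable for a small illustration.
    scc, assigned = set(), set()
    for node in nodes:
        if node in assigned:
            continue
        component = reachable([node], fwd) & reachable([node], bwd)
        assigned |= component
        if len(component) > len(scc):
            scc = component

    out_region = reachable(scc, fwd) - scc   # pages reached from the SCC
    in_region = reachable(scc, bwd) - scc    # pages that can reach the SCC
    rest = nodes - scc - in_region - out_region

    from_in = reachable(in_region, fwd) & rest   # hang off the IN region
    to_out = reachable(out_region, bwd) & rest   # point into the OUT region
    tubes = from_in & to_out                     # IN -> OUT, bypassing SCC
    tendrils = (from_in | to_out) - tubes
    disconnected = rest - from_in - to_out

    return {
        "SCC": scc, "IN": in_region, "OUT": out_region,
        "TENDRILS": tendrils, "TUBES": tubes, "DISCONNECTED": disconnected,
    }
```

Dividing the size of each returned set by the number of nodes gives the component percentages for one graph.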
10
DATA
The measure/size of the Wikigraph for the various
languages.
The percentage of the various components of the
Wikigraph for the various languages.
11
DATA
The degree distributions show fat tails that can be
approximated by a power-law function of the kind
$P(k) \propto k^{-\gamma}$, where the exponent is the same both
for in-degree and out-degree.
In the case of the WWW, $2 \le \gamma_{in} \le 2.1$.
In-degree (empty symbols) and out-degree (filled symbols)
occurrence distributions for the Wikigraph in
English and Portuguese.
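A power-law exponent of this kind can be estimated from a list of degrees. The sketch below uses the standard continuous maximum-likelihood (Hill) estimator, which is not necessarily the fit used for the plots; the function name `powerlaw_exponent_mle` and the cutoff parameter `k_min` are assumptions introduced here, and for integer degrees the continuous formula is only an approximation.

```python
import math


def powerlaw_exponent_mle(degrees, k_min):
    """Estimate gamma in P(k) ~ k^(-gamma) from the tail k >= k_min.

    Continuous maximum-likelihood estimator:
        gamma = 1 + n / sum(ln(k_i / k_min))
    Hypothetical helper; k_min must be chosen by the user.
    """
    if k_min <= 0:
        raise ValueError("k_min must be positive")
    tail = [k for k in degrees if k >= k_min]
    log_sum = sum(math.log(k / k_min) for k in tail)
    if len(tail) < 2 or log_sum == 0.0:
        raise ValueError("not enough distinct values above k_min")
    return 1.0 + len(tail) / log_sum
```

Applying it separately to the in-degree and out-degree sequences of one graph gives the two exponents to compare.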
12
DATA
As regards the assortativity (as measured by the
average degree of the neighbours of a vertex with
degree k), there is no evidence of any assortative
behaviour.
The average in-degree of the neighbours, computed along
incoming edges, as a function of the in-degree
for the English and Portuguese Wikigraphs.
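The quantity described here can be computed directly from an edge list. This is a minimal sketch under one reading of the caption: for every edge u -> v, the in-degree of the source u is averaged over all edges entering vertices v with in-degree k. The function name `average_neighbor_indegree` is hypothetical.

```python
from collections import defaultdict


def average_neighbor_indegree(edges):
    """Map in-degree k to the mean in-degree of in-neighbours.

    `edges` is an iterable of (source, target) pairs; the average is
    taken over all incoming edges of vertices whose in-degree is k.
    """
    indegree = defaultdict(int)
    predecessors = defaultdict(list)
    for src, dst in edges:
        indegree[dst] += 1
        predecessors[dst].append(src)

    totals = defaultdict(float)
    counts = defaultdict(int)
    for vertex, sources in predecessors.items():
        k = len(sources)  # in-degree of `vertex`
        for src in sources:
            totals[k] += indegree.get(src, 0)
            counts[k] += 1
    return {k: totals[k] / counts[k] for k in sorted(totals)}
```

A curve that neither rises nor falls with k corresponds to the absence of assortative behaviour reported above.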
13
MODEL
  • We introduced an evolution rule, similar to other
    rewiring models already considered.
  • At each time step, a vertex is added to the
    network. It is connected to the existing
    vertices by M oriented edges; the direction of
    each edge is drawn at random:
  • with probability R1, the edge leaves the new
    vertex, pointing to an existing one chosen with
    probability proportional to its in-degree;
  • with probability R2, the edge points to the new
    vertex, and the source vertex is chosen with
    probability proportional to its out-degree;
  • finally, with probability R3 = 1 - R1 - R2, the
    edge is added between existing vertices: the
    source vertex is chosen with probability
    proportional to its out-degree, while the
    destination vertex is chosen with probability
    proportional to its in-degree.

See for example Krapivsky, Rodgers and Redner,
PRL 86, 5401 (2001)
14
MODEL
At each time step one adds a node and M edges.
1. With probability R1, the edge leaves the new
node and points to an existing node chosen with
probability proportional to its in-degree.
2. With probability R2, the edge points to the new
node and leaves an existing node chosen with
probability proportional to its out-degree.
3. With probability R3 = 1 - R1 - R2, the edge
points to an existing node chosen with probability
proportional to its in-degree and
leaves an existing node chosen with probability
proportional to its out-degree.
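The three rules can be sketched as a short simulation. This is an illustration under stated assumptions, not the authors' code: the function name `grow_wikigraph` is hypothetical, the network starts from a single seed vertex, and attachment weights are taken as (degree + 1) rather than the bare degree so that vertices with zero in- or out-degree can still be selected. Self-loops from rule 3 are allowed for simplicity.

```python
import random


def grow_wikigraph(n_vertices, m_edges, r1, r2, seed=0):
    """Grow a directed graph with the three attachment rules.

    Returns a list of (source, target) edges. Assumption: selection
    weights are degree + 1, implemented with "urn" lists in which a
    vertex appears once plus once per incident edge of the given kind.
    """
    r3 = 1.0 - r1 - r2
    if min(r1, r2, r3) < 0.0:
        raise ValueError("R1, R2 and R3 = 1 - R1 - R2 must be non-negative")
    rng = random.Random(seed)
    edges = []
    in_urn = [0]    # vertex v appears k_in(v) + 1 times
    out_urn = [0]   # vertex v appears k_out(v) + 1 times
    for new in range(1, n_vertices):
        step_edges = []
        for _ in range(m_edges):
            x = rng.random()
            if x < r1:
                # Rule 1: new vertex -> existing vertex, by in-degree.
                step_edges.append((new, rng.choice(in_urn)))
            elif x < r1 + r2:
                # Rule 2: existing vertex, by out-degree -> new vertex.
                step_edges.append((rng.choice(out_urn), new))
            else:
                # Rule 3: existing source by out-degree,
                # existing target by in-degree.
                step_edges.append((rng.choice(out_urn), rng.choice(in_urn)))
        # Update the urns only after the step, so that "existing"
        # never includes the vertex being added.
        for src, dst in step_edges:
            out_urn.append(src)
            in_urn.append(dst)
        in_urn.append(new)
        out_urn.append(new)
        edges.extend(step_edges)
    return edges
```

A call such as `grow_wikigraph(10000, 10, 0.026, 0.091)` produces an edge list whose in- and out-degree histograms can be compared with the measured distributions.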
15
MODEL
The parameters have a physical meaning and can
be measured on real data. In the English case,
for instance, this yields R1 = 0.026 and R2 =
0.091; in the data we have, M = 10.
By approximating discrete time variations by
derivatives with respect to the continuous
variable t, one can write and solve the following
rate equations for the in- and out-degree:
$\frac{dk_{in}}{dt} = (R_1 + R_3)\, k_{in}\, t^{-1}$, $\frac{dk_{out}}{dt} = (R_2 + R_3)\, k_{out}\, t^{-1}$
16
MODEL
By solving the rate equations, one obtains the
time evolution and, with a little algebra, the
distributions of the in- and out-degree.
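The formulas shown on this slide did not survive in the transcript. A standard continuum derivation from the rate equations $dk_{in}/dt = (R_1+R_3)\,k_{in}/t$ and $dk_{out}/dt = (R_2+R_3)\,k_{out}/t$ would run as follows; the symbols $k_0$ (degree of a vertex at birth) and $t_0$ (its birth time, taken as uniformly distributed in $[0, t]$) are assumptions introduced here for the sketch.

```latex
% Assumptions: k_0 = degree at birth, t_0 = birth time, uniform in [0, t].
% Uses R_1 + R_2 + R_3 = 1.
\begin{align}
\frac{dk_{in}}{dt} = (R_1+R_3)\,\frac{k_{in}}{t}
  \;&\Rightarrow\;
  k_{in}(t) = k_0\left(\frac{t}{t_0}\right)^{R_1+R_3}
            = k_0\left(\frac{t}{t_0}\right)^{1-R_2}, \\
\frac{dk_{out}}{dt} = (R_2+R_3)\,\frac{k_{out}}{t}
  \;&\Rightarrow\;
  k_{out}(t) = k_0\left(\frac{t}{t_0}\right)^{R_2+R_3}
             = k_0\left(\frac{t}{t_0}\right)^{1-R_1}.
\end{align}
% A vertex has k_in(t) > k exactly when it was born early enough:
\begin{align}
\Pr\left[k_{in}(t) > k\right]
  = \Pr\left[t_0 < t\left(\frac{k_0}{k}\right)^{\frac{1}{1-R_2}}\right]
  = \left(\frac{k_0}{k}\right)^{\frac{1}{1-R_2}}
  \;&\Rightarrow\;
  P(k_{in}) \propto k_{in}^{-1-\frac{1}{1-R_2}}, \\
  &\phantom{\Rightarrow}\;\;
  P(k_{out}) \propto k_{out}^{-1-\frac{1}{1-R_1}}.
\end{align}
```

With the values quoted for the English Wikigraph, R1 = 0.026 and R2 = 0.091, these expressions give exponents of about 1 + 1/0.909 ≈ 2.10 for the in-degree and 1 + 1/0.974 ≈ 2.03 for the out-degree.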
17
CONCLUSION
  • We have a structure that resembles the bow-tie
    of the WWW.
  • We have a power-law decay for the degree
    distributions and also a power-law decay for
    the number of updates of a single page.
  • Preferential attachment in the rewiring seems to
    be the driving force in the evolution of the system.
  • The microscopic structure of rewiring is very
    different from that of the WWW: in principle, a user
    can change any series of edges and add as many
    pages as wanted. Still, most of the quantities
    are similar.

18
INCOMING CONFERENCE
19
INCOMING CONFERENCE
Uri Alon, Joseph Stiglitz (tbc),
Alessandro Vespignani, Alain Barrat,
Ginestra Bianconi, Dirk Brockmann,
Debora Donato, James Fowler,
Kwang-Il Goh (tbc), Shlomo Havlin, Dirk Helbing,
Matthew O. Jackson, János Kertész (tbc),
Amos Maritan, José Fernando Mendes,
Luciano Pietronero, Frank Schweitzer,
H. Eugene Stanley, Marc Vidal
20
Thanks!
http://www.guidocaldarelli.com
21
(No Transcript)