Measurement, Modeling, and Analysis of a Peer-to-Peer File-Sharing Workload presentation

1
Measurement, Modeling, and Analysis of a
Peer-to-Peer File-Sharing Workload

K. P. Gummadi, R. J. Dunn, et al.
SOSP 2003
Presented by Lu-chuan Kung kung_at_uiuc.edu

2
Outline

Trace methodology and analysis
User characteristics
Client activities
Object dynamics
Analyze why Kazaa workload is not Zipf
A model of P2P file-sharing workloads
A study of bandwidth-saving techniques
Conclusion

3
Trace Methodology

Passively collect Kazaa traffic at the border
between the campus network and the Internet
Query traffic was not captured because it is encrypted
File transfers are HTTP transfers with
Kazaa-specific headers
Summary statistics of the trace

4
Kazaa Users Are Patient

Transfer time: the difference between the start
time and the end time of a request
Small objects
Large objects: >100 MB (typically video files)

5
Users Slow Down As They Age

Do people become hungrier for content as they
gain experience with Kazaa?
Older clients requested fewer bytes because of:
Attrition: the population declines as clients age
Slowing down: older clients ask for less

6
Client Activity

It's difficult to quantify the availability of
clients in a P2P system
Client activity includes
Activity fraction: time spent in transfers /
duration of lifetime. A lower bound on availability
Average session length: the typical duration of a session
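
The activity fraction defined on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name and the (start, end) interval format are assumptions, and overlapping transfers are merged so that concurrent transfers are not counted twice.

```python
def activity_fraction(transfers, lifetime_start, lifetime_end):
    """Fraction of a client's lifetime spent in at least one transfer.

    transfers: list of (start, end) times in seconds (hypothetical format).
    """
    lifetime = lifetime_end - lifetime_start
    if lifetime <= 0:
        raise ValueError("lifetime must be positive")

    busy = 0.0
    cur_start = cur_end = None
    for start, end in sorted(transfers):
        if cur_end is None or start > cur_end:
            # A gap: close the previous merged interval and open a new one.
            if cur_end is not None:
                busy += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlap: extend the current merged interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        busy += cur_end - cur_start
    return busy / lifetime
```

Because a client may be online without transferring anything, the value returned here can only underestimate availability, which is why the slide calls it a lower bound.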

7
Object Characteristics

Kazaa is not one workload
Kazaa is a blend of workloads of different
properties
3 ranges of objects: small (<10 MB), medium (10 MB to 100 MB), and large (>100 MB)
Majority of requests are for smaller objects
Most bytes transferred are due to large objects

8
Kazaa Object Dynamics

Multimedia objects are immutable, which
affects object dynamics
Kazaa clients fetch objects at most once
A Kazaa client requests an object at most once 94% of the time
A Kazaa client requests an object at most twice 99% of
the time
Most requests are for old (repeated) objects
An object is old if at least one month has passed
since the first request of the object
72% of requests for large objects are old
52% of requests for small objects are old

9
Kazaa Object Dynamics

The popularity of Kazaa objects is often
short-lived
The most popular pages remain stable for the Web
Popularity is fleeting in Kazaa
Audio files lose popularity faster than popular
video files
The most popular Kazaa objects tend to be
recently born objects
Newly born objects: objects that did not receive any
requests during the first month of the trace

10
Kazaa Is Not Zipf

Zipf's law
The popularity of the i-th most popular object is
proportional to i^(-α), where α is the Zipf coefficient
Kazaa is not Zipf
Most popular objects are less popular than Zipf
would predict

11
Why Kazaa Is Not Zipf

Fetch-repeatedly vs. fetch-at-most-once
Simulate the two cases based on the same Zipf
distribution
The result of fetch-at-most-once is similar to
Kazaa.
Non-Zipf workloads are also observed in web proxy
caches and VoD servers
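
The comparison on this slide can be reproduced in miniature. The sketch below is an assumption-laden illustration, not the authors' simulator: the function name, parameters, and the redraw-on-repeat rule for fetch-at-most-once clients are all hypothetical choices. Both cases draw from the same Zipf distribution; only the per-client repeat rule differs.

```python
import bisect
import itertools
import random
from collections import Counter


def simulate_requests(n_clients, requests_per_client, n_objects,
                      at_most_once, alpha=1.0, seed=0):
    """Return a Counter mapping object rank (0 = most popular) to request count."""
    if at_most_once and requests_per_client > n_objects:
        raise ValueError("a client cannot fetch more distinct objects than exist")
    rng = random.Random(seed)
    # Cumulative Zipf weights: the weight of 1-based rank r is r ** -alpha.
    cum = list(itertools.accumulate(r ** -alpha for r in range(1, n_objects + 1)))

    def draw():
        # Inverse-CDF sampling from the Zipf distribution.
        return bisect.bisect_left(cum, rng.random() * cum[-1])

    counts = Counter()
    for _ in range(n_clients):
        fetched = set()  # objects this client already has
        for _ in range(requests_per_client):
            obj = draw()
            if at_most_once:
                # Assumed rule: redraw until the client picks an unseen object.
                while obj in fetched:
                    obj = draw()
                fetched.add(obj)
            counts[obj] += 1
    return counts
```

Calling it twice with the same arguments, once with `at_most_once=False` and once with `at_most_once=True`, and plotting the sorted counts on log-log axes should show the flattened head for the fetch-at-most-once case: no object can receive more requests than there are clients.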

12
A Model of P2P File-Sharing Workloads

Hypothesis: the underlying popularity of objects in a
fetch-at-most-once system is driven by Zipf's law
A client requests 2 objects per day, choosing which
object to fetch from Zipf(1)
Objects are born at rate λ_O; each new object's popularity
rank is selected from Zipf(1)
Total object population cannot be observed from
the trace. Use back-inference: given that 18,000
distinct objects are requested in the trace,
what is the total number of objects? Answer: 40,000

13
Model Structure and Notation

Parameter values are chosen to reflect the
measured data from the trace

14
File-Sharing Effectiveness

How should an organization expect bandwidth demand
to change over time, given a shared proxy server?
Hit rate of the proxy cache decreases in the
fetch-at-most-once case
Fetch-at-most-once clients consume the most
popular objects early
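
One way to explore this effect is a toy simulation of an infinite shared cache in front of a fixed client population. Everything here is a hypothetical setup chosen for illustration (function name, default parameters, no object or client arrivals); it is not the model used in the paper. A request counts as a hit when any client has fetched the object before.

```python
import bisect
import itertools
import random


def daily_hit_rates(n_clients=50, n_objects=1000, days=100,
                    requests_per_day=2, at_most_once=True, seed=0):
    """Per-day hit rate of an infinite shared cache (toy setup)."""
    if at_most_once and days * requests_per_day > n_objects:
        raise ValueError("clients would run out of unseen objects")
    rng = random.Random(seed)
    # Zipf(1) cumulative weights over object ranks 1..n_objects.
    cum = list(itertools.accumulate(1.0 / r for r in range(1, n_objects + 1)))

    def draw():
        return bisect.bisect_left(cum, rng.random() * cum[-1])

    cached = set()                                # objects held by the proxy
    fetched = [set() for _ in range(n_clients)]   # per-client history
    rates = []
    for _ in range(days):
        hits = 0
        for client in range(n_clients):
            for _ in range(requests_per_day):
                obj = draw()
                if at_most_once:
                    while obj in fetched[client]:
                        obj = draw()
                    fetched[client].add(obj)
                if obj in cached:
                    hits += 1
                cached.add(obj)
        rates.append(hits / (n_clients * requests_per_day))
    return rates
```

Comparing the series for `at_most_once=True` and `at_most_once=False` after the initial cache warm-up is a simple way to see how consuming the popular objects early pushes later requests toward less popular, less often cached objects.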

15
New Object Arrivals Improve Hit Rate

Object updates in Web lower the hit rate
New object arrivals are beneficial in a P2P system
Arrivals of popular objects increase hit rate
If no arrivals, clients are forced to choose from
the remaining unpopular objects

16
New Clients Cannot Stabilize Performance

The infusion of new clients at a constant rate
cannot compensate for the increasing number of
old clients
To keep the hit rate constant, the client arrival
rate would need to grow exponentially

17
Model Validation

Underlying Zipf assumption cannot be validated
directly.
Use the proposed model to replicate the object
popularity distribution in the trace
Estimate various parameters
Arrival rate of new objects is chosen to fit the
measured data: λ_O = 5,475 objects per year

18
Exploring Locality-aware Request Routing

A significant fraction of Internet bandwidth is
consumed by Kazaa
How would exploitation of locality help to save
bandwidth?
Different ways to exploit locality
A centralized proxy cache placed at organization
border
Request redirection: favor organization-internal
peers
Centralized request redirection
Decentralized request redirection

19
An Ideal Proxy Cache

Assume an ideal proxy: infinite capacity and
bandwidth
86% of external bandwidth would be saved
However, some may not want to store P2P
file-sharing content in a proxy server due to
legal issues

20
Benefits of Locality-Awareness

Trace-based simulation
Infinite storage capacity
At most 12 concurrent downloads
Upload bandwidth: 500 Kb/s
External bandwidth: 100 Kb/s
Clients are available only when they're
transferring (a very conservative assumption)
Cold misses: objects cannot be found in peers
Busy misses: objects found but the peer is
unavailable due to concurrent transfers
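
The two miss types can be told apart with a small classifier. This is a sketch under stated assumptions: the `Peer` record and `classify_request` function are hypothetical names, and the limit of 12 is applied here as a per-peer cap on concurrent transfers, which is one reading of the slide. Peer availability over time is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical peer record: what it stores and how busy it is."""
    objects: set = field(default_factory=set)
    active_transfers: int = 0


def classify_request(obj, peers, max_concurrent=12):
    """Return 'hit', 'cold miss', or 'busy miss' for a request for obj."""
    holders = [p for p in peers if obj in p.objects]
    if not holders:
        # No internal peer has the object at all.
        return "cold miss"
    if any(p.active_transfers < max_concurrent for p in holders):
        # At least one holder has a free transfer slot.
        return "hit"
    # The object exists internally, but every holder is saturated.
    return "busy miss"
```

For example, a request for an object held only by a peer with 12 active transfers is a busy miss, and the same request becomes a hit as soon as one of those transfers completes.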

21
Benefits of Locality-Awareness

Locality awareness obtained a 68% byte hit rate for
large objects and a 37% byte hit rate for small
objects
A substantial share of miss bytes (62% for large
objects, 43% for small objects) is due to
unavailable clients

22
Benefits of Increased Availability

Most of the bytes served and consumed come from
highly available peers
Adding availability to the most available hosts
earns a higher hit rate than adding it to the least
available hosts

23
Conclusion

P2P file-sharing workloads are different from Web
workloads
Users are patient
Aged clients demand less
Fetch-at-most-once
The proposed model suggests that client births
and object births are the fundamental forces
driving P2P workloads
There's significant locality in the Kazaa
workload
Locality-aware peers would save 63% of external
transfers even under conservative assumptions

24
Comments

Some of the observed characteristics may be
related to the design of Kazaa and the measuring
methodology and thus cannot be generalized
The lack of portal sites in P2P systems may also
be a reason that the most popular objects in P2P are
less popular than Zipf's law would predict

25
Assessing the Quality of Voice Communications
Over Internet Backbones

A.P. Markopoulou, F.A. Tobagi, M.J. Karam
IEEE/ACM Transactions on Networking, vol. 11, no. 5, Oct. 2003
Presented by Lu-chuan Kung

26
Outline

VoIP System
Playout schemes
Voice Impairment in Networks
Internet measurements
Numerical results
Discussion

27
VoIP System
28
VoIP System

Speech signal
Talkspurts have mean 352ms
Silence periods have mean 650ms
Encoding schemes
Packetizer: adds headers for different protocols
Playout buffer: packets are held for a later
playout time in order to smooth playout
Decoder: reconstructs the speech signal

29
Playout Schemes

Two types: fixed and adaptive
Fixed playout scheme
End-to-end delay p is the same for all packets
Large delay decreases packet loss due to late
arrivals, but also decreases interactivity
Adaptive playout scheme
Estimate p based on the average delay d_av and the
delay variation v
p = d_av + 4v
Estimate p
Talkspurt by talkspurt
Packet by packet
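
The adaptive rule p = d_av + 4v can be sketched with running estimates of the delay and its variation. The slide does not say how d_av and v are computed, so the exponentially weighted averages and the smoothing factor `alpha` below are assumptions (a commonly used estimator), and the function name is hypothetical. This version updates the estimate packet by packet.

```python
def adaptive_playout_delays(network_delays, alpha=0.998):
    """Return the playout delay estimate p after each packet.

    network_delays: per-packet one-way network delays (e.g. in ms).
    alpha: assumed smoothing factor for the running averages.
    """
    if not network_delays:
        raise ValueError("need at least one delay sample")

    d_av = float(network_delays[0])  # running mean delay
    v = 0.0                          # running mean absolute deviation
    estimates = []
    for n in network_delays:
        d_av = alpha * d_av + (1.0 - alpha) * n
        v = alpha * v + (1.0 - alpha) * abs(d_av - n)
        estimates.append(d_av + 4.0 * v)
    return estimates
```

A talkspurt-by-talkspurt scheme would keep updating d_av and v on every packet but apply the resulting p only at the start of each talkspurt, so that the playout timing does not change in the middle of speech.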

30
Voice Impairment in Networks

Quality of voice is affected by
Encoding
Packet loss
Network delay jitter
End-to-end delay
Echo
End-to-end delay consists of
Encoding delay
Packetization delay
Network delay
Playout buffering delay
Decoding delay

31
Assessment of Voice Communication in Packet
Networks

Mean Opinion Score (MOS): a subjective rating
given by listeners on a scale of 1 to 5
Intrinsic quality (MOS_intr): quality after
compression

32
Degradation Due to Loss

PLC: Packet Loss Concealment
Convert loss rate to MOS

33
Loss of Interactivity

Loss of interactivity due to large end-to-end
delay
NTT study
6 conversation modes (tasks), task 1 is the
hardest, task 6 is the most relaxed type

34
Echo Impairment

Echo can cause major quality degradation
The effect of echo is a function of delay and
echo losses

35
E-model

Published by ITU-T. Provides formulas to predict
MOS of voice quality
R = (R0 - Is) - Id - Ie + A
R0: basic signal-to-noise ratio
Is: impairment of the signal, e.g. sidetone and PCM
Id: impairment due to delay (echo and
interactivity)
Ie: impairment due to distortion (loss)
A: advantage factor (lenient users)
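
The rating R = (R0 - Is) - Id - Ie + A is then mapped to a MOS value. The slide does not show the mapping; the sketch below uses the R-to-MOS conversion commonly quoted from ITU-T G.107, and the function names and sample impairment values are hypothetical.

```python
def e_model_rating(r0, i_s, i_d, i_e, a=0.0):
    """Combine the E-model terms into the rating R."""
    return (r0 - i_s) - i_d - i_e + a


def rating_to_mos(r):
    """Map an E-model rating R to an estimated MOS (G.107-style mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


# Illustrative values only, not measurements from the paper.
rating = e_model_rating(r0=90.0, i_s=5.0, i_d=10.0, i_e=15.0)
mos = rating_to_mos(rating)
```

With this mapping a rating of 70 corresponds to a MOS of about 3.6.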

36
Internet Measurements

Probe measurement
5 major U.S. cities
43 paths in total
7 providers: P1, P2, ..., P7
50-byte probes sent every 10 ms

37
Observations on the Traces

Duration of the trace: 3 days
Network loss
6 out of 7 providers have outages
Outages happened at least once per day
Delay characteristics
Delay spikes
Alternation between high and low states
Periodic clustered delay spikes

38
Delay Characteristics
39
Consistent Characteristics Per Provider
40
One Example Call

Apply the E-model to the traces using different
playout buffer schemes
Example of a 15-min call

41
One Example Call

Fixed playout incurs many losses in the last 5
mins

42
How to Choose p for Fixed Scheme

Tradeoff between loss and delay
There is an optimal value of delay to achieve
maximum MOS value

43
Example Path: Many Calls

Random calls uniformly spread over an hour
150 short (3.5-min) and 50 long (10-min) calls
Plot CDF vs. MOS
[Figure panels: Fixed Playout, Adaptive Playout]

44
Discussion

Backbone networks have a wide range of
performance
Some are already able to support high quality
voice communications
Some are barely able to provide acceptable VoIP
service (MOS 3.6)
Reliability problems are more serious than QoS
service mechanisms

45
Comments

How representative are the chosen paths of the
typical paths on the Internet?
End host to end host paths have larger delay
