3. Randomization - PowerPoint PPT Presentation

Slides: 50
Provided by: jimku

Transcript and Presenter's Notes

1
3. Randomization
  • randomization is used in many protocols
  • we'll study examples:
  • Ethernet multiple access protocol
  • router (de)synchronization
  • active queue management
  • BitTorrent load balancing

2
Ethernet
  • single shared broadcast channel
  • two simultaneous transmissions by nodes →
    interference
  • only one node can send successfully at a time
  • multiple access protocol: distributed algorithm
    that determines how nodes share the channel, i.e.,
    determines when a node can transmit

Metcalfe's Ethernet sketch
3
Ethernet uses CSMA/CD
  • A: sense channel; if idle
  • then
  •   transmit and monitor channel
  •   if detect another transmission
  •   then
  •     abort and send jam signal
  •     update # collisions
  •     delay as required by exponential backoff
        algorithm
  •     goto A
  •   else done with frame; set # collisions to zero
  • else wait until ongoing transmission is over, goto
    A

4
Ethernet's CSMA/CD (more)
  • Jam signal: make sure all other transmitters are
    aware of the collision; 48 bits
  • Exponential backoff:
  • first collision for a given packet: choose K
    randomly from {0, 1}; delay is K x 512 bit
    transmission times
  • after second collision: choose K randomly from
    {0, 1, 2, 3}
  • after each subsequent collision: double the range
    of K (and keep doubling on collisions until...)
  • after ten or more collisions: choose K randomly
    from {0, 1, 2, 3, 4, ..., 1023}
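The doubling rule above can be sketched in Python (a minimal illustration; the function names are invented, not part of any standard):

```python
import random

def backoff_slots(num_collisions):
    """Pick the random backoff K after the given number of collisions.

    After the n-th collision, K is uniform over {0, ..., 2^m - 1},
    where m = min(n, 10): the interval doubles per collision and is
    capped at 1023.
    """
    m = min(num_collisions, 10)
    return random.randrange(2 ** m)

def backoff_delay_bit_times(num_collisions):
    """Delay in bit times: K slots of 512 bit times each."""
    return backoff_slots(num_collisions) * 512
```

Note how the adaptation is implicit: a node never measures the load directly; the collision count stands in for it.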

5
Ethernet's use of randomization
  • resulting behavior: the probability of a retransmission
    attempt (equivalently, the length of the randomization
    interval) is adapted to the current load
  • simple, load-adaptive, multiple access

heavier load → (most likely) more nodes trying to
send → more collisions → randomize retransmissions
over a longer time interval, to reduce collision
probability
6
Ethernet comments
  • upper bounding K at 1023 limits the maximum size of
    the backoff interval
  • could remember the last value of K when we were
    successful (analogy: TCP remembers the last value of
    its congestion window size)
  • Q: why use binary backoff rather than something
    more sophisticated such as AIMD? simplicity
  • note: Ethernet does multiplicative-increase,
    complete-decrease (why?)

7
Analyzing the CSMA/CD Protocol
  • Goal: a quantitative understanding of the performance
    of the CSMA protocol
  • fixed-length pkts
  • pkt transmission time is the unit of time
  • throughput S: number of pkts successfully
    (without collision) transmitted per unit time
  • a: end-to-end propagation time, i.e., the time
    during which collisions can occur

8
  • offered load G: number of pkt transmissions
    attempted per unit time
  • note: S <= G
  • Poisson model: probability of k pkt transmission
    attempts in t time units:
  • Prob{k trans in t} = ((Gt)^k e^(-Gt)) / k!
  • infinite population model
  • capacity of a multiple access protocol: maximum
    value of S over all values of G
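The Poisson attempt probability can be evaluated directly (a minimal sketch; the function name is illustrative):

```python
from math import exp, factorial

def p_attempts(k, t, G):
    """P{k transmission attempts in t time units} under the
    Poisson model: (G t)^k * e^(-G t) / k!"""
    return (G * t) ** k * exp(-G * t) / factorial(k)
```

Summing over all k for fixed t and G gives 1, as any probability distribution must.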

9
Analyzing CSMA/CD(cont)
  • Focus on one transmission attempt
  • I - length of idle period, B - length of busy
    period
  • C I B - length of a cycle
  • Lengths of successive cycles are independent of
    each other because of Poisson assumption,
  • S p/EC p/(EIEB)
  • where p is prob of successful trasnmission druing
    a cycle (busy period)
  • p e-?G

10
Analyzing CSMA/CD (cont.)
  • Because of the Poisson assumption, I is exponentially
    distributed with mean 1/G (E[I] = 1/G)
  • Focus on E[B]
  • Case 1, no collision (NC): E[B_NC] = 1 + a,
    and P(NC) = e^(-aG)
  • Case 2, collision (C):
  • P(C) = 1 - e^(-aG)
  • A hand-wavy argument to compute E[B_C]:
  • the last packet in the vulnerable period (length a)
    arrives, on average, approximately a/2 into the
    period
  • all packets after the initiating pkt hear the
    initiating packet within a
  • the initiating packet hears the last packet a time
    units after it begins transmission
  • an additional a time is needed for the channel to
    become idle
  • E[B_C] = 5a/2

11
Analyzing CSMA/CD (cont.)
  • E[B] = P(NC) E[B_NC] + P(C) E[B_C]
  •      = e^(-aG) (1 + a) + (1 - e^(-aG)) 5a/2
  • and
  • S = p / (E[I] + E[B])
  •   = e^(-aG) / (1/G + e^(-aG)(1 - 3a/2) + 5a/2)
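The closed form can be evaluated numerically to find the protocol's capacity (a sketch; `throughput` and `capacity` are invented names, and the maximization is a simple grid search over G):

```python
from math import exp

def throughput(G, a):
    """S = e^(-aG) / (1/G + e^(-aG)(1 - 3a/2) + 5a/2)."""
    return exp(-a * G) / (1.0 / G + exp(-a * G) * (1 - 1.5 * a) + 2.5 * a)

def capacity(a):
    """Approximate max of S over offered loads G in (0, 100]."""
    return max(throughput(g / 100.0, a) for g in range(1, 10001))
```

S vanishes both for tiny G (the channel sits idle) and for huge G (almost every cycle is a collision), with a load-dependent peak in between; the peak shrinks as the propagation time a grows.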

12
[Plot: throughput S vs. offered load G, for a = 0.01, 0.02, 0.05, 0.10]
13
The bottom line
  • Why does Ethernet use randomization? to
    desynchronize: a distributed, adaptive algorithm
    spreads load out over time when there is
    contention for the multiple access channel

14
(de)Synchronization of periodic routing updates
  • periodic losses observed in end-end Internet
    traffic

source: Floyd and Jacobson, 1994
15
Why?
16
Router update operation
receive update from neighbor → process (time TC2)
prepare own routing update (time TC)
send update (time Td to arrive at dest); start_timer (uniform Tp +/- Tr)
wait until timeout, or link fail/update
receive update from neighbor → process
17
Router synchronization
  • 20 (simulated) routers broadcasting updates to
    each other
  • x-axis: time until routing update is sent, relative
    to the start of the round
  • by t = 100,000, all router rounds are of length 120!
  • synchronization, or lack thereof, depends on system
    parameters

18
  • blowup of previous graph
  • note: expansion of the computation phase →
    increased period

19
Sync
  • coupled routers are an example of spontaneous
    synchronization
  • fireflies
  • sleep cycles
  • heartbeats
  • etc.
  • Steven Strogatz, Sync, Hyperion Books, 2003.

20
Avoiding synchronization
  • enforce a max time spent in the prepare state
  • choose the random timer component Tr large (e.g.,
    several multiples of TC)

receive update from neighbor → process (time TC2)
prepare own routing update (time TC)
send update (time Td to arrive); start_timer (uniform Tp +/- Tr)
wait
receive update from neighbor → process
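The randomized timer amounts to one line (a minimal sketch; the function name and parameter values are illustrative, not from any router implementation):

```python
import random

def next_update_timeout(Tp, Tr):
    """Next update timeout, drawn uniformly from [Tp - Tr, Tp + Tr].

    A large Tr (several multiples of the processing time TC) keeps
    routers from drifting into a common phase.
    """
    return random.uniform(Tp - Tr, Tp + Tr)
```

The jitter must be re-drawn every round; a single random offset chosen once would merely shift a router's phase, not break the coupling.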
21
Randomization in Router Queue Management
  • normally, packets are dropped only when the queue
    overflows
  • "drop-tail" queueing

[Figure: FCFS scheduler with packets P1-P6 queued at a router between two ISPs and the Internet]
22
The case against drop-tail queue management
  • large queues in routers are a bad thing
  • end-to-end latency is dominated by the length of
    queues at switches in the network
  • allowing queues to overflow is a bad thing
  • connections transmitting at high rates can starve
    connections transmitting at low rates
  • connections can synchronize their responses to
    congestion

23
Idea early random packet drop
  • when the queue length exceeds a threshold, drop packets
    with a queue-length-dependent probability
  • probabilistic packet drop: flows see the same loss
    rate
  • problem: bursty traffic (a burst arriving when the queue
    is near the threshold) can be over-penalized

24
Random early detection (RED) packet drop
[Figure: average queue length over time; below the min threshold no packets are dropped, between the min and max thresholds packets are dropped probabilistically (early drop), and above the max threshold every packet is dropped (forced drop)]
  • use an exponential average of the queue length to
    determine when to drop
  • avoids overly penalizing short-term bursts
  • reacts to longer-term trends
  • tie the drop prob. to the weighted avg. queue length
  • avoids over-reaction to mild overload conditions
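The two mechanisms, the weighted average and the drop probability, can be sketched together (class and parameter names are illustrative; real RED variants add details such as count-based spacing between drops):

```python
class RedQueue:
    """Sketch of RED's drop decision."""

    def __init__(self, min_th, max_th, max_p, weight=0.002):
        self.min_th = min_th   # below this average: never drop
        self.max_th = max_th   # above this average: always drop
        self.max_p = max_p     # drop prob. as the average reaches max_th
        self.w = weight        # EWMA weight
        self.avg = 0.0         # weighted average queue length

    def on_arrival(self, queue_len):
        # exponentially weighted moving average of instantaneous length
        self.avg = (1 - self.w) * self.avg + self.w * queue_len
        return self.drop_probability()

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0         # no drop
        if self.avg >= self.max_th:
            return 1.0         # forced drop
        # linear ramp from 0 to max_p between the two thresholds
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
```

Because the small weight `w` makes `avg` move slowly, a short burst barely raises the drop probability, while sustained overload steadily does.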

25
Random early detection (RED) packet drop
[Figure: drop probability vs. weighted average queue length; 0 below the min threshold, rising linearly to maxp at the max threshold, then jumping to 100%]
26
Random early detection (RED) packet drop
  • a large number (5) of parameters, difficult to tune
    (at least for HTTP traffic)
  • gains over drop-tail FCFS are not that significant
  • still not widely deployed

27
We will revisit!
28
RED: why probabilistic drop?
  • provides a gentle transition from no-drop to
    all-drop
  • provides gentle early warning
  • provides the same loss rate to all sessions
  • with tail-drop, low-sending-rate sessions can be
    completely starved
  • avoids synchronized loss bursts among sources
  • avoids cycles of large loss followed by
    no transmission

29
BitTorrent
30
About BitTorrent
  • P2P protocol, created in 2002
  • quite popular: approximately 35% of Internet
    traffic (according to CacheLogic)
  • Warner Brothers to distribute films through
    BitTorrent (May 2006)

31
Bittorrent Basics
  • Setting: large file (GBs), large demand (10s,
    1000s or more clients); not feasible to set up
    infrastructure for a traditional client-server
    download
  • divide the file into small pieces (256KB)
  • utilize all peers' upload capacities
32
Bittorrent schematic
[Schematic: a tracker, a seed (peer with the entire file), several peers, and a new peer joining]
33
Bittorrent: who to upload to?
  • Tit-for-tat: upload to the peers from which the most
    data was downloaded in the last 30 seconds (4 peers
    by default)
  • → an incentive to upload, in order to be chosen by
    other peers!

34
Bittorrent: what piece to send?
  • Rarest-first: upload the piece that is rarest among
    your neighbors first

[Figure: per-piece availability counts among neighbors]
35
Joining a torrent
  • Peers are divided into:
  • seeds: have the entire file
  • leechers: still downloading

1. obtain the metadata file (out of band)
2. contact the tracker
3. obtain the peer list (contains seeds, leechers)
4. contact peers from the peer list for data
36
Joining a torrent
  • Peers are divided into:
  • seeds: have the entire file
  • leechers: do not (yet)
  • Process:
  • obtain the metadata file (out of band)
  • contact the tracker
  • obtain the peer list (seeds, leechers)
  • contact peers from the peer list
37
Exchanging data
  • verify pieces using hashes
  • download sub-pieces (blocks) in parallel
  • advertise received pieces to the peers in the peer list
  • interested: need pieces that a given peer has
38
Metadata file structure
  • contains the information necessary to contact the
    tracker; describes the file in the torrent
  • URL of the tracker
  • file name
  • file length
  • piece length (typically 256KB)
  • SHA-1 hashes of the pieces, for verification
  • creation date, comment, creator, ...

39
Tracker protocol
  • communicates with clients via HTTP/HTTPS
  • client GET request:
  • info_hash: uniquely identifies the file
  • peer_id: uniquely identifies the client
  • client IP and port (typically 6881-6889)
  • numwant: how many peers to return (defaults to
    50)
  • stats: bytes uploaded, downloaded, left
  • tracker GET response:
  • interval: how often to contact the tracker
  • list of peers, containing peer id, IP and port
  • stats: complete, incomplete
  • tracker-less mode: based on the Kademlia DHT

40
Piece selection
  • when downloading starts: choose at random
  • get complete pieces as quickly as possible
  • obtain something to trade
  • after obtaining 4 pieces: (local) rarest-first
  • achieves the fastest replication of rare pieces
  • obtains something of value to trade
  • gets unique pieces from the seed
  • endgame mode
  • defense against the last-block problem
  • send requests for missing sub-pieces to all
    peers in our peer list
  • send cancel messages upon receipt of a sub-piece
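The random-first / rarest-first switch can be sketched as follows (an illustrative simplification; the function name, arguments, and the deterministic tie-break are assumptions, not the reference client's logic):

```python
import random
from collections import Counter

def choose_piece(have, neighbor_piece_sets, pieces_completed):
    """Return the index of the next piece to request, or None.

    First 4 pieces: pick uniformly at random among needed pieces
    (get something to trade quickly). Afterwards: local rarest-first,
    i.e. the needed piece held by the fewest neighbors.
    """
    counts = Counter()
    for pieces in neighbor_piece_sets:
        for p in pieces:
            if p not in have:
                counts[p] += 1          # availability among neighbors
    if not counts:
        return None                     # nothing needed is available
    if pieces_completed < 4:
        return random.choice(list(counts))          # random-first phase
    # rarest first; break ties by lowest piece index for determinism
    return min(counts, key=lambda p: (counts[p], p))
```

Rarest-first is a purely local heuristic: each peer counts availability only among its own neighbors, yet in aggregate this drives replication of globally rare pieces.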

41
Last-block problem
  • at the end of the download, a peer may have trouble
    finding the missing pieces
  • based on anecdotal evidence
  • other proposals:
  • network coding [Gkantsidis et al., Infocom '05]
  • prefer to upload to peers with similar file
    completeness; unfair to the peers with most of
    the pieces [Tian et al., Infocom '06]

42
Optimistic unchoking
  • four peers are unchoked at a time
  • every ten seconds, drop the peer with the lowest
    download rate
  • every 30 sec., unchoke a random peer
  • a multi-purpose mechanism:
  • allows bootstrapping of new clients
  • balances load among peers → lower delays
  • maintains a connected network: every peer has a
    non-zero chance of interacting with any other peer
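The two rules above can be combined in a rough sketch (an illustrative simplification, not the reference client's choking algorithm; the function and argument names are invented):

```python
import random

def recompute_unchoked(download_rates, optimistic_turn):
    """One round of a simplified choking decision.

    download_rates: dict peer -> bytes received from that peer recently.
    Keeps the three fastest uploaders-to-us; the fourth slot goes to a
    random peer on an optimistic-unchoke round, else to the next fastest.
    """
    by_rate = sorted(download_rates, key=download_rates.get, reverse=True)
    chosen = by_rate[:3]                           # tit-for-tat slots
    others = [p for p in by_rate if p not in chosen]
    if others:
        if optimistic_turn:
            chosen.append(random.choice(others))   # optimistic unchoke
        else:
            chosen.append(others[0])               # next fastest peer
    return chosen
```

The optimistic slot is what gives a brand-new peer, with nothing yet to trade, a chance to receive its first pieces.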

43
Randomized Load Balancing
44
Load Balancing: Static Case
  • problem statement:
  • n balls are dropped into n bins so as to minimize
    the maximum load
  • ideal policy: add to the least loaded bin
  • requires too much state information
  • random policy: choose a bin randomly
  • maximum load is Θ(log n / log log n) with high
    probability
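A quick simulation contrasts the plain random policy with the "best of d random bins" policy discussed next (a sketch; with d = 2 the maximum load drops to Θ(log log n) with high probability):

```python
import random

def max_load(n, d):
    """Drop n balls into n bins; each ball goes into the least loaded
    of d randomly chosen bins (d = 1 is the plain random policy)."""
    bins = [0] * n
    for _ in range(n):
        candidates = [random.randrange(n) for _ in range(d)]
        best = min(candidates, key=lambda b: bins[b])
        bins[best] += 1
    return max(bins)
```

Even d = 2 gives an exponential improvement over d = 1; going from d = 2 to d = 3 helps only by a constant factor.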

45
Load Balancing: Dynamic Case
  • model:
  • n queues with rate-1 exponential servers
  • Poisson arrivals with rate λ
  • problem: assign jobs to queues so as to minimize delays
  • random policy: choose a queue randomly
  • n independent single-server queues with Poisson
    arrivals
  • queue length distribution: P(Q >= i) = λ^i
  • clever random policy: join the shortest of d = 2
    randomly chosen queues
  • for n large, P(Q >= i) = λ^(2^i - 1)

Mitzenmacher 1996
46
A simple fluid analysis
  • n large
  • each job is assigned to the shortest of d = 2
    randomly chosen queues
  • s_i(t): fraction of queues with load at least i
    at time t
  • s_0(t) = 1, for all t
47
[Slide shows the fluid equation: ds_i/dt = λ (s_{i-1}^2 - s_i^2) - (s_i - s_{i+1})]
48
Continuing
  • at equilibrium: λ (s_{i-1}^2 - s_i^2) = s_i - s_{i+1}
  • and s_i = λ^(2^i - 1)
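The claimed fixed point can be checked by direct substitution into the balance equations (λ = 0.7 is an arbitrary example value):

```python
lam = 0.7                    # any arrival rate < 1

def s(i):
    """Claimed equilibrium point: s_i = lam^(2^i - 1), so s_0 = 1."""
    return lam ** (2 ** i - 1)

# each equation lam*(s_{i-1}^2 - s_i^2) = s_i - s_{i+1} holds exactly
for i in range(1, 6):
    lhs = lam * (s(i - 1) ** 2 - s(i) ** 2)
    rhs = s(i) - s(i + 1)
    assert abs(lhs - rhs) < 1e-12
```

The doubly exponential decay λ^(2^i - 1), versus the geometric λ^i of the purely random policy, is the payoff of sampling just two queues.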

49
Similarity to BT
  • peers act as servers
  • an arriving request selects 4 servers at random and
    drops the most heavily loaded one
  • repeats this process: randomly choose a new
    server, drop the worst of the 4