ZGP001 (zphddef.ppt - 07/15/03) - PowerPoint PPT Presentation

Slides: 54
Provided by: csee8
Learn more at: http://www.csee.usf.edu
1
Performance Evaluation of URL Routing for Content
Distribution Networks
PhD defense by Zornitza Genova Prodanoff
Committee Members
  • Dr. K. J. Christensen (Major Professor)
  • Dr. M. Varanasi
  • Dr. R. Perez
  • Dr. Chari
  • Dr. Labrador
ZGP001 (zphddef.ppt - 07/15/03)
2
Acknowledgements
  • I would like to thank
  • My major professor Dr. Ken Christensen,
  • My committee Dr. Varanasi, Dr. Perez, Dr.
    Chari, and Dr. Labrador
  • Dr. Suen for his comments at my proposal defense
  • My colleagues K. Yoshigoe, A. Aslam, G.
    Perrera, and J. Shahbazian
  • My family

ZGP002
3
Topics
  • Motivation
  • Problem and contributions
  • URL Routing
  • Improvements to URL routing
  • Evaluation of URL signatures
  • Evaluation of hashing for URL routing
  • Summary
  • List of my publications

ZGP003
4
Motivation
"2.5 Billion Hours Spent Waiting on the Web in
1998." - John Roth, chief executive of Nortel
Networks, at Telecom '99
ZGP004
5
Problem and contributions
  • Problem
  • Excessive delay in the Internet caused by the
    inability to efficiently access distributed
    content in the Web
  • My contributions
  • 1) Architected a new URL router that uses HTTP
    redirection
  • 2) Investigated new use of CRC32 for reducing the
    size of routing tables
  • 3) Investigated a new self-adjusting hashing method
    for faster URL routing look-up
  • 4) Performed the first queuing evaluation of hashing
    - effects of correlation discovered

ZGP005
6
Topics
  • Motivation
  • Problem and contributions
  • URL Routing
  • Improvements to URL routing
  • Evaluation of URL signatures
  • Evaluation of hashing for URL routing
  • Summary
  • List of my publications

ZGP006
7
URL routing
  • Next generation Internet - Content Distribution
    Networks
  • A CDN is an overlay network on the Internet
  • A CDN co-locates content throughout the world
  • CDNs are of great commercial and research
    interest
  • $15 million in NSF funding for Web services
    research
  • Akamai is one major CDN provider

ZGP007
8
URL routing continued
Global content distribution in a CDN
http://214.29.2.15/page
http://www.some.com/page
http://334.249.2.8/page
ZGP008
9
URL routing continued
  • HTTP redirection in a CDN
  • (1) HTTP request and redirect
  • (2) HTTP re-request and response

[Figure: Clients, Proxy cache, Reverse cache, Origin site, and Distributed server]
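The two-step exchange on this slide (request and redirect, then re-request and response) can be sketched as follows. This is a minimal illustration, not the router described in the defense: the function name `route`, the dictionary-based routing table, and the choice of status code 302 are all assumptions made for the example.

```python
# Hypothetical sketch of step (1): the router answers an HTTP request
# with a redirect that points at a server holding the content.
# The 302 status code and the table layout are assumptions.

def route(request_line: str, routing_table: dict, origin_host: str) -> str:
    """Build a redirect response for a request line such as 'GET /page HTTP/1.1'."""
    _method, url, _version = request_line.split(" ", 2)
    # Fall back to the origin site when the table has no entry for the URL.
    target_host = routing_table.get(url, origin_host)
    return (
        "HTTP/1.1 302 Found\r\n"
        f"Location: http://{target_host}{url}\r\n"
        "\r\n"
    )


# Example table using the host names that appear on the slides.
table = {"/page": "214.29.2.15"}
response = route("GET /page HTTP/1.1", table, "www.some.com")
```

In step (2) the client would re-issue the request to the host named in the `Location` header.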
ZGP009
10
URL routing continued
Architecture of a new URL router
[Figure: one-armed URL router attached to a Layer 3 switch; shows network links and the path of HTTP requests and redirects]
ZGP010
11
URL routing continued
  • Need to exchange routing tables (digesting)
  • Summary Cache [17]
  • Use Bloom filters to merge routing (hash)
    tables
  • Bloom filter is probabilistic and does not
    support updates
  • False positives if non-unique hashes
  • Results in a routing collision in the context
    of URLs
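The false-positive behavior described above can be seen in a small Bloom filter. This is a sketch under stated assumptions: the class name, the filter size, and the use of four 32-bit slices of a single MD5 digest as the hash functions are illustrative choices, not details taken from Summary Cache or from this work.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over URL strings (illustrative sizes)."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4):
        if not 1 <= num_hashes <= 4:
            raise ValueError("an MD5 digest yields at most four 32-bit slices")
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, url: str):
        # Assumption: each 4-byte slice of the MD5 digest acts as one hash.
        digest = hashlib.md5(url.encode("utf-8")).digest()
        for k in range(self.num_hashes):
            chunk = digest[4 * k:4 * k + 4]
            yield int.from_bytes(chunk, "big") % self.num_bits

    def add(self, url: str) -> None:
        for pos in self._positions(url):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, url: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(url))


digest = BloomFilter()
digest.add("http://www.some.com/page")
```

A URL that was never added can still test as present when its bit positions happen to be set by other URLs, which is the routing collision the slide refers to.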

ZGP011
12
URL routing continued
  • Need to do look-ups in routing tables
  • Why use hashing?
  • Build routing tables as hash tables for efficient
    look-up
  • Idea of self-adjusting hash
  • Most frequently used keys are closer to the head
  • If chained hashing, rearrange after key accesses
  • Transposition rule for lists [50, 7]
  • Move-to-front rule for lists [33]
  • Review of H1 hashing [74]
  • Self-adjusting by using transposition

ZGP012
13
URL routing continued
Chained resolution of hash table collision
[Figure: hash table with indices 0 to m-1; each index points to a chain of keys (k0 ... kn-1) and their records (r0 ... rn-1)]
The hashing collision at index 0 causes the chain
to be created
ZGP013
14
URL routing continued
H1 and Simple hashing algorithms, based on [37]
C1. [Create lists] For i ← 0 to m-1, set LIST[i] ← NULL.
C2. [Hash] Set i ← h(KEY), j ← 0.
C3. [Is there a list?] If LIST[i] = NULL, go to C6.
C4. [Compare] If KEY = LIST[i][j], terminate.
C5. [Advance to next] If LIST[i][j] ≠ NULL, set j ← j+1 and go to step C4.
C6. [Insert new key] Set LIST[i][j] ← KEY.
C4A. [Compare and transpose, H1 hashing] If KEY = LIST[i][j]: if j ≠ 0, swap LIST[i][j] with LIST[i][j-1]; terminate.
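The steps above can be sketched in Python as one table with an optional transpose step. This is an illustration only: the class and method names are hypothetical, and `zlib.crc32` stands in for the hash function h, which the slide does not specify.

```python
import zlib


class ChainedHashTable:
    """Chained hash table; transpose=True applies the H1 (transposition) rule."""

    def __init__(self, m: int, transpose: bool = False):
        self.m = m
        self.transpose = transpose
        self.lists = [[] for _ in range(m)]          # C1: create empty lists

    def access(self, key: str) -> int:
        """Find key, inserting it if absent.

        Returns the 1-based chain position where the key was found or inserted.
        """
        i = zlib.crc32(key.encode("utf-8")) % self.m  # C2: hash (assumed h)
        chain = self.lists[i]
        for j, stored in enumerate(chain):            # C4/C5: compare, advance
            if stored == key:
                if self.transpose and j > 0:          # C4A: swap with predecessor
                    chain[j - 1], chain[j] = chain[j], chain[j - 1]
                return j + 1
        chain.append(key)                             # C6: insert new key
        return len(chain)


# With m = 1 every key collides, which makes the chain order easy to inspect.
h1 = ChainedHashTable(m=1, transpose=True)
for url in ("/a", "/b", "/c"):
    h1.access(url)
h1.access("/c")   # "/c" moves one position toward the head
```

With `transpose=False` the same class behaves as the Simple chained table: a hit never changes the chain order.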
ZGP014
15
URL routing continued
Now begin my contributions in digesting and
hashing (and evaluation thereof)
ZGP015
16
Topics
  • Motivation
  • Problem and contributions
  • URL routing
  • Improvements to URL routing
  • Evaluation of URL signatures
  • Evaluation of hashing for URL routing
  • Summary
  • List of my publications

ZGP016
17
Improvements to URL routing
  • Open problems
  • Select best source based on state (and location
    of client)
  • Reduce the size of the routing table to
    update/share
  • Perform fast routing look-ups

My problems
ZGP027
18
Improvements to URL routing continued
  • My idea
  • Use CRC32 for URL signatures
  • CRC32 circuitry is already part of an Ethernet
    adapter
  • Serial shift-register with wrapped XOR terms
  • Use to get CRC32 signatures for URL in HTTP
    request header
  • Need to calculate a CRC32 over a subfield [53]
  • The subfield is the URL in an HTTP request header
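As a software illustration of the idea, a 32-bit signature can be computed over just the URL field of a request line. This sketch uses Python's `zlib.crc32` rather than adapter hardware, and the function name `url_signature` and the simple space-separated parsing are assumptions for the example.

```python
import zlib


def url_signature(request_line: str) -> int:
    """Return the CRC32 of the URL subfield of an HTTP request line."""
    # A request line has the form: METHOD SP URL SP VERSION
    _method, url, _version = request_line.split(" ", 2)
    return zlib.crc32(url.encode("ascii")) & 0xFFFFFFFF


signature = url_signature("GET /page HTTP/1.1")
# The signature always fits in 4 bytes, regardless of URL length.
```

Storing the 4-byte signature in place of the full URL string is what shrinks the routing table.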

ZGP018
19
Improvements to URL routing continued
  • Define the following:
  • P is the CRC32 generator polynomial
  • Ai, i = 1, ..., m, is a polynomial (bit sequence)
  • We store in a table (for all possible M) the
    remainders, where M is the length of the subfield

[Figure: packet divided into Packet header, Subfield, and Rest of packet, with segments labeled A0, A2, and A1]
ZGP019
20
Improvements to URL routing continued
We have the following,
Returned by adapter - from CRC32 shift register
What we want (CRC32 for subfield)
ZGP020
21
Improvements to URL routing continued
For the following
properties apply
ZGP021
22
Improvements to URL routing continued
  • Solve for RA2 as follows
  • Let A3 be A0 shifted left M bits.
  • Then
  • and

  • .

32-bit multiply
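The equations on these slides did not survive in the transcript, so the following is only a sketch of one way such a derivation can work, under loud assumptions: A0 is taken to be the bits before the subfield, A2 the subfield of length M bits, and the adapter is assumed to supply plain polynomial remainders (no initial value or final XOR as in framed CRC32). Under those assumptions, the remainder of A0 followed by A2 equals the remainder of (R(A0) times R(x^M)) XOR R(A2), so R(A2) can be solved for with one 32-bit carry-less multiply. All function names are hypothetical.

```python
# Assumed generator: the standard CRC32 polynomial including the x^32 term.
P = 0x104C11DB7


def poly_mod(a: int, p: int = P) -> int:
    """Remainder of polynomial a divided by p over GF(2)."""
    p_len = p.bit_length()
    while a.bit_length() >= p_len:
        a ^= p << (a.bit_length() - p_len)
    return a


def poly_mul(a: int, b: int) -> int:
    """Carry-less (GF(2)) product of two polynomials."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def subfield_remainder(r_prefix: int, r_header: int, m_bits: int) -> int:
    """Solve for R(subfield) given R(header + subfield) and R(header).

    r_shift = R(x^M) is the value that would be read from the precomputed
    table indexed by the subfield length M.
    """
    r_shift = poly_mod(1 << m_bits)
    # Both operands are below 2^32, so this is a 32-bit multiply.
    return r_prefix ^ poly_mod(poly_mul(r_header, r_shift))


header = int.from_bytes(b"GET ", "big")
subfield = int.from_bytes(b"/page", "big")
m_bits = 8 * len(b"/page")
prefix = (header << m_bits) | subfield   # header shifted left M bits, then subfield
recovered = subfield_remainder(poly_mod(prefix), poly_mod(header), m_bits)
```

Here `recovered` equals the remainder computed directly over the subfield, without ever running the division over the subfield bits themselves.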
ZGP022
23
Improvements to URL routing continued
  • My idea
  • Aggressive hashing to perform fast look-up
  • Self-adjusting chained collision resolution
  • Fast way to do hash table look-ups
  • Based on move-to-front rule for lists [33, 50]

ZGP023
24
Improvements to URL routing continued
  • The new Aggressive hashing algorithm

C1. [Create lists] For i ← 0 to m-1, set LIST[i] ← NULL.
C2. [Hash] Set i ← h(KEY), j ← 0.
C3. [Is there a list?] If LIST[i] = NULL, go to C6.
C4. [Compare] If KEY = LIST[i][j], terminate.
C5. [Advance to next] If LIST[i][j] ≠ NULL, set j ← j+1 and go to step C4.
C6. [Insert new key] Set LIST[i][j] ← KEY.
C4B. [Compare and move-to-front, Aggressive hashing] If KEY = LIST[i][j]: if j ≠ 0, set TEMP ← LIST[i][j], then for k ← j down to 1 set LIST[i][k] ← LIST[i][k-1], and set LIST[i][0] ← TEMP; terminate.
New
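The move-to-front step can be sketched on a single chain as follows. This is an illustrative reading of step C4B, not the evaluated implementation; the function name `aggressive_access` is hypothetical and a Python list stands in for the chain.

```python
def aggressive_access(chain: list, key: str) -> int:
    """Find key in chain, moving it to the head on a hit; insert if absent.

    Returns the 1-based position at which the key was found or inserted.
    """
    for j, stored in enumerate(chain):
        if stored == key:
            if j > 0:
                # Shift entries 0..j-1 down by one and place the hit at the head.
                chain.insert(0, chain.pop(j))
            return j + 1
    chain.append(key)
    return len(chain)


chain = ["/a", "/b", "/c"]
aggressive_access(chain, "/c")   # chain is now ["/c", "/a", "/b"]
```

Compared with transposition, a single hit is enough to bring a key to the head, so a burst of requests for one URL pays the long search only once.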
25
Topics
  • Motivation
  • Problem and contributions
  • URL routing
  • Improvements to URL routing
  • Evaluation of URL signatures
  • Evaluation of hashing for URL routing
  • Summary
  • List of my publications

ZGP025
26

Evaluation of URL signatures
Evaluation done with trace-driven simulation
Response variables
  • 1) Probability of false hits due to signature
    collisions
  • 2) CPU time required to generate URL signatures
  • 3) Reduction in processing and memory resources
    for URL look-up
ZGP026
27
Evaluation of URL signatures continued
  • Input data used in the evaluation
  • Obtained lists of URLs from 9 cache and server
    HTTP logs
  • Access lists
  • URL lists
  • CRC32 lists
  • Unique URLs range from 70 to 2.5 million (1.5
    to 146 MBytes)
  • Continuity of logs was in months
  • Full URL string or CRC32 signatures lists were
    built

generated by me
2.1 GBytes of ASCII format raw data was used
ZGP027
28
Evaluation of URL signatures continued
Input data characteristics
ZGP028
29
Evaluation of URL signatures continued
  • Experiments on the performance of CRC32
  • Experiment 1: Number of CRC32 collisions was
    measured
  • CRC32 generated for each URL
  • Non-unique CRC32s counted
  • Experiment 2: Measured CPU time to generate
    CRC32 URL list
  • Software CRC generation (8-bit look-up coded in
    C)
  • Experiment 3: Measured CPU time required for
    look-up
  • All entries from access list were looked up in
    URL list
  • URL list is a Simple chained hash table
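Experiments 1 and 2 can be approximated in a few lines. The sketch below is in Python rather than C, and it is an assumption-laden illustration: the 8-bit look-up table uses the standard reflected CRC32 polynomial, collisions are counted as URLs whose signature was already produced by a different URL, and the "theoretical" figure is the usual expected-collision count for uniformly random 32-bit values. None of these details are confirmed by the slides.

```python
import math


def make_crc_table() -> list:
    """Build the 256-entry table for byte-at-a-time CRC32 (reflected form)."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xEDB88320 if crc & 1 else crc >> 1
        table.append(crc)
    return table


CRC_TABLE = make_crc_table()


def crc32_lookup(data: bytes) -> int:
    """CRC32 of data using one table look-up per byte."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = (crc >> 8) ^ CRC_TABLE[(crc ^ b) & 0xFF]
    return crc ^ 0xFFFFFFFF


def count_collisions(urls) -> int:
    """Count unique URLs whose signature duplicates an earlier URL's signature."""
    seen = set()
    collisions = 0
    for url in set(urls):
        sig = crc32_lookup(url.encode("utf-8"))
        if sig in seen:
            collisions += 1
        else:
            seen.add(sig)
    return collisions


def expected_collisions(n: int, bits: int = 32) -> float:
    """Expected duplicates when n values are drawn uniformly from 2**bits."""
    m = 2.0 ** bits
    # n - m * (1 - (1 - 1/m)**n), written with log1p/expm1 for accuracy.
    return n + m * math.expm1(n * math.log1p(-1.0 / m))
```

For example, `expected_collisions(2_500_000)` gives the count that uniform random signatures would produce for a list of 2.5 million unique URLs, which can be compared with `count_collisions` on a real URL list.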

ZGP029
30
Evaluation of URL signatures continued
Results for experiment 1
Measured and theoretical are close
ZGP030
31
Evaluation of URL signatures continued
Results for experiment 2
Time per URL string is small (μsec)
ZGP031
32
Evaluation of URL signatures continued
Results for experiment 3
[Plot: look-up time (sec) versus H value (10 to 22) for Full URL and CRC32 URL signatures]
CRC32 URL signature is better
ZGP032
33
Evaluation of URL signatures continued
  • Experiments for CRC32 vs. MD5-Bloom filter
    digesting
  • Experiment 1: Measured digest size and
    generation CPU time
  • MD5-Bloom filter
  • CRC32
  • 32-bit checksum
  • Lempel-Ziv (LZ) compression (used pkzip25)
  • Experiment 2: Measured digest size and CPU time
  • MD5-Bloom
  • Experiment 3: Measured collisions
  • Control variable is URL length
  • MD5-Bloom vs. CRC32
  • URL length is a maximum of 25, 30, ..., 80 bytes

ZGP033
34
Evaluation of URL signatures continued
  • Experiments for CRC32 vs. MD5-Bloom filter
    digesting (continued)
  • Experiment 4: Measured digest size of the hash
    chain method
  • Based on the number of components
  • Tree structure of 32 bits for a <depth, hash
    code> pair

ZGP034
35
Evaluation of URL signatures continued
Results for experiments 1 and 2
Similar CRC32 and Bloom filter collisions
ZGP035
36
Evaluation of URL signatures continued
Results for experiment 3
[Plot: collisions (%) versus URL length (25 to 75 bytes) for MD5-Bloom and CRC32]
Collisions are same for CRC32 and Bloom filter
ZGP036
37
Evaluation of URL signatures continued
  • Results from experiment 4
  • Hash chaining results in digests an average of
    212 larger than CRC32

Substantially larger than the other methods
ZGP037
38
Evaluation of URL signatures continued
  • Discussion of results
  • CRC32 URL signatures reduce the size of URL lists
    and speed up look-up in a hash table
  • Require less network bandwidth to transfer
  • Require less memory for storage in the URL router
  • For CRC32 the number of collisions was found to
    be small
  • CRC32 digests require less CPU than MD5-Bloom
    digests and produce the same collisions

ZGP038
39
Topics
  • Motivation
  • Problem and contributions
  • URL routing
  • Improvements to URL routing
  • Evaluation of URL signatures
  • Evaluation of hashing for URL routing
  • Summary
  • List of my publications

ZGP039
40
Evaluation of hashing for URL routing continued
  • Look-up time experiments
  • Experiment 1: Effect of hash table size (K) on
    look-up time (NASA access list)
  • Experiment 2: Effect of hash table size (K)
    on look-up time (Clark.net access list)

ZGP040
41
Evaluation of hashing for URL routing continued
Hash table look-up time for experiment 1
[Plot: mean look-up time versus hash table size K (8 to 13) for Simple, Aggressive, and H1]
For dense hash tables Aggressive is better than H1
ZGP041
42
Evaluation of hashing for URL routing continued
Hash table look-up time for experiment 2
[Plot: mean look-up time versus K (8 to 13) for Simple, Aggressive, and H1]
Similar to experiment 1 results
ZGP042
43
Evaluation of hashing for URL routing continued
  • Evaluation model (single server queue)
  • Response variables
  • mean queuing delay
  • drop in utilization

[Figure: single-server queue. Arrivals are URLs to be looked up; queued URLs wait for the server; the server is a hash table look-up]
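The single-server model can be sketched with a few lines of simulation. This is a minimal illustration, assuming first-come first-served order and an infinite queue; the function name and the idea of feeding it per-request look-up times as service times are assumptions for the example, not details from the slides.

```python
def mean_queueing_delay(arrival_times, service_times) -> float:
    """Mean time requests wait before service in a FIFO single-server queue.

    arrival_times: non-decreasing arrival instants of the URL requests.
    service_times: look-up time needed by each request, in the same units.
    """
    total_wait = 0.0
    server_free_at = 0.0
    for arrive, service in zip(arrival_times, service_times):
        start = max(arrive, server_free_at)   # wait if the server is busy
        total_wait += start - arrive
        server_free_at = start + service
    return total_wait / len(arrival_times)


# Three requests one time unit apart, each needing two units of look-up time.
delay = mean_queueing_delay([0.0, 1.0, 2.0], [2.0, 2.0, 2.0])
```

In a trace-driven run, the service time of each request would be the measured (or counted) cost of its hash table look-up, so any correlation in look-up times carries directly into the queue.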
ZGP043
44
Evaluation of hashing for URL routing continued
  • Mean queue length experiments
  • Experiment 1: Effect of hash table size (K) on
    queue length (L) for utilization U = 80% (Simple
    chain) and exponential arrivals
  • Experiment 2: Effect of burstiness (Tmax) on L
    for U = 80% (Simple chain) and K = 8
  • Experiment 3: Effect of Tmax on L for U = 80%
    and K = 8
  • Experiment 4: Effect of autocorrelation
    (unshuffled and shuffled ordering of requests) on
    L for U = 80% and K = 8
  • Experiment 5: Effect of autocorrelation
    (unshuffled and shuffled ordering of requests) on
    L for U = 80% (Simple chain) and K = 8
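The shuffled-versus-unshuffled comparison in experiments 4 and 5 rests on the fact that shuffling keeps the same set of values but destroys their ordering. A minimal sketch of that effect, using a synthetic series and the standard sample autocorrelation estimator (both assumptions, not data or code from this work):

```python
import random


def autocorrelation(xs, lag: int = 1) -> float:
    """Sample autocorrelation of the sequence xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0:
        return 0.0
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return cov / var


# Synthetic look-up times: a long run of cheap look-ups, then a run of costly ones.
unshuffled = [0.0] * 50 + [10.0] * 50
shuffled = list(unshuffled)
random.Random(1).shuffle(shuffled)   # same values, ordering randomized

r_unshuffled = autocorrelation(unshuffled)
r_shuffled = autocorrelation(shuffled)
```

The unshuffled series has lag-1 autocorrelation close to 1, while the shuffled series has a value near 0 even though both have the same mean and variance.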

ZGP044
45
Evaluation of hashing for URL routing continued
Results for experiment 1
[Plot: L versus K (8 to 13) for Simple, Aggressive, and H1]
Self-adjusting methods show similar performance
ZGP045
46
Evaluation of hashing for URL routing continued
Results for experiment 2
[Plot: L versus Tmax (50 to 1000) for H1 and Aggressive; annotation: Simple hashing value range is 5500 to 34000]
H1 shows faster increase in L
ZGP046
47
Evaluation of hashing for URL routing continued
Results for experiment 3
[Plot: L (0 to 120K) versus Tmax (50 to 1000) for H1, Aggressive, and Simple]
H1 has magnitudes worse queue length
ZGP047
48
Evaluation of hashing for URL routing continued
Results for experiment 4
H1 has magnitudes worse queue length
ZGP048
49
Evaluation of hashing for URL routing continued
Results for experiment 5
ZGP049
50
Evaluation of hashing for URL routing continued
  • Discussion of results
  • Aggressive hashing improves upon H1 hashing
  • Modest look-up time improvement
  • Significant improvement from a queueing
    perspective
  • Queueing must be used for evaluating hashing
    algorithms
  • Long-range dependence (LRD) in look-up time of
    H1 results in extreme queueing delay
  • Catastrophic effects on any application

ZGP050
51
Topics
  • Motivation
  • Problem and contributions
  • URL routing
  • Improvements to URL routing
  • Evaluation of URL signatures
  • Evaluation of hashing for URL routing
  • Summary
  • List of my publications

ZGP051
52
Summary
  • In summary, I have addressed the problem of
  • Excessive delay in the Internet caused by the
    inability to efficiently access distributed
    content in the Web
  • My work has shown that
  • 1) A URL router that uses HTTP redirection is
    feasible
  • 2) CRC32 can be used for digesting of URL routing
    tables
  • 3) Aggressive hashing improves upon existing hashing
    algorithms in fast look-up
  • 4) Queueing behavior needs to be considered when
    evaluating hashing algorithms

Four publications have resulted
ZGP052
53
List of my related publications
  • Z. Genova and K. Christensen, "Managing Routing
    Tables for URL Routers in Content Distribution
    Networks," submitted to the International Journal
    of Network Management in June 2003
  • Z. Genova and K. Christensen, "Efficient
    Summarization of URLs using CRC32 for
    Implementing URL Switching," Proceedings of the
    27th IEEE Conference on Local Computer Networks
    (LCN), pp. 343-344, November 2002
  • Z. Genova and K. Christensen, "Using Signatures
    to Improve URL Routing," Proceedings of IEEE
    International Performance, Computing, and
    Communications Conference, pp. 45-52, April 2002
  • Z. Genova and K. Christensen, "Challenges in URL
    Switching for Implementing Globally Distributed
    Web Sites," Proceedings of the Workshop on
    Scalable Web Services, pp. 89-94, August 2000

ZGP053