Title: Content Distribution Network CDN Performance
1Content Distribution Network (CDN) Performance
- Punit Shah (pshah_at_cse.ogi.edu)
- CSE581 Internet Technologies
- OGI, OHSU
- 2002, Jan 16th
2Papers
- CDN, CDN Performance
- The measured performance of CDNs.
- On the use and performance of CDNs.
- Analytical model for CDN performance in
multi-level caching.
- Web caching and content distribution: A view from
the interior.
3What is a CDN?
- CDNs are a means of offloading some or all of the
content delivery burden (mainly static content)
from the origin server. A replica server that
delivers content on behalf of the origin server
is called a CDN server.
- Aimed to address:
- Client-perceived latency (e.g. web browsers).
- Capacity management of the server.
- Caching as a side effect.
4Request Redirection
- There are primarily two ways to redirect requests
to the CDN servers.
- DNS redirection
- The authoritative DNS server is controlled by the
CDN infrastructure. It distributes load across the
CDN servers according to some policy (e.g.
round-robin, least-loaded CDN server,
geographical distance) by varying the IP address
returned in the DNS response.
- URL rewriting
- The main page still comes from the origin server,
but the URLs of embedded objects (e.g. images,
clips) are rewritten to point to one of the
CDN servers. Some vendors rewrite using a hostname
and some use an IP address directly.
- Some vendors employ a combination of these two
methods.
- Finding the nearest CDN server (in terms of
latency) is not simple.
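The selection policies named on this slide can be sketched in a few lines. This is only an illustration of round-robin and least-loaded selection, not any vendor's implementation; the `CdnServer` record, its `load` field, the function names, and the addresses are all hypothetical.

```python
import itertools
from dataclasses import dataclass


@dataclass
class CdnServer:
    ip: str
    load: float  # hypothetical load metric in [0, 1]


def round_robin(servers):
    """Return a function that hands out server IPs in rotation."""
    cycle = itertools.cycle(servers)
    return lambda: next(cycle).ip


def least_loaded(servers):
    """Return the IP of the server reporting the lowest load."""
    return min(servers, key=lambda s: s.load).ip


# Illustrative addresses and loads only.
servers = [
    CdnServer("10.20.30.1", 0.7),
    CdnServer("10.20.30.2", 0.2),
    CdnServer("10.20.30.3", 0.5),
]
next_ip = round_robin(servers)
answers = [next_ip() for _ in range(4)]  # wraps around after the third answer
```

A real CDN name server would return such an answer with a short TTL so that the policy is re-evaluated often; that part is not modelled here.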
5Full Site DNS redirection example
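[Figure: full-site DNS redirection of a request for
www.yahoo.com (GET index.html). Labels in the diagram:
Origin Server, CDN-controlled DNS Server. Addresses
shown: 111.222.100.1, 10.20.30.1, 10.20.30.2,
10.20.30.3, 10.20.30.4.]
Vendors: Adero (full), Akamai and Digital Island
(partial)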
6Partial DNS redirect/URL rewriting example
index.html
<HTML> <BODY>
<A HREF="/about_us.html">About Us</A>
<IMG SRC="www.clearway1.net/www.yahoo.com/img1.gif">
<IMG SRC="www.clearway2.net/www.yahoo.com/img2.gif">
<IMG SRC="10.20.30.2/www.yahoo.com/img3.gif">
</BODY> </HTML>
Vendors: Clearway (URL rewriting)
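The rewriting shown in this example can be approximated as follows. This is a rough sketch: it assumes the origin page refers to its images by relative path and that the rewritten form is `cdn-host/origin-host/path`, as the example suggests. The function name `rewrite_images` and the regular expression are introduced here for illustration; a production rewriter would use a real HTML parser.

```python
import re

# Matches only the simplified <IMG SRC="..."> form used in the slide.
IMG_SRC = re.compile(r'(<IMG\s+SRC=")([^"]+)(")', re.IGNORECASE)


def rewrite_images(html, origin_host, cdn_hosts):
    """Point each embedded image at a CDN host, cycling through cdn_hosts."""
    counter = 0

    def replace(match):
        nonlocal counter
        cdn = cdn_hosts[counter % len(cdn_hosts)]
        counter += 1
        path = match.group(2).lstrip("/")
        return f"{match.group(1)}{cdn}/{origin_host}/{path}{match.group(3)}"

    return IMG_SRC.sub(replace, html)


page = (
    '<A HREF="/about_us.html">About Us</A> '
    '<IMG SRC="/img1.gif"> <IMG SRC="/img2.gif">'
)
rewritten = rewrite_images(
    page, "www.yahoo.com", ["www.clearway1.net", "www.clearway2.net"]
)
```

The anchor link is left alone, so the linked page is still served by the origin server while the images move to the CDN.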
7CDN performance elements
- Client-perceived latency.
- This is what most of the papers focus on,
measuring from the outside.
- Load balancing among the CDN servers.
- Number of requests offloaded from the origin
server.
8Analytical Model
- Gadde et al. derive the CDN cachable ratio as
$(C_{n_i} - C_{n_l}) / (1 - C_{n_l})$
- where
- $C_{n_i}$: CDN hit ratio for the client population
of size $n_i$ that forwards to this CDN server, for
some fixed object x
- $C_{n_l}$: cache hit ratio at a leaf node (e.g.
proxy) serving a client population of size $n_l$
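A small numeric example may make the ratio easier to read. One way to interpret it, offered here as a reading rather than as the paper's wording: of the requests that miss at the leaf cache (a fraction 1 - C_nl), it is the share that the CDN server can still serve. The function name and the input values below are made up for illustration.

```python
def cdn_cachable_ratio(c_ni, c_nl):
    """Evaluate (C_ni - C_nl) / (1 - C_nl) for hit ratios given as fractions."""
    if not 0.0 <= c_nl < 1.0:
        raise ValueError("leaf hit ratio must be in [0, 1)")
    return (c_ni - c_nl) / (1.0 - c_nl)


# Hypothetical inputs: 60% hit ratio for the larger population,
# 40% hit ratio at the leaf proxy.
example = cdn_cachable_ratio(0.60, 0.40)  # 0.20 / 0.60, i.e. about one third
```

When the leaf cache already achieves the same hit ratio as the larger population, the ratio drops to zero.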
9Model Performance
- With more clients, the CDN cache hit ratio
decreases.
- If the number of clients is increased further, the
curve takes a bell shape, indicating cache
thrashing.
- Model validated against the NLANR cache hierarchy
at the root level (treating all root-level caches
as one unified cache): 32% cache hit ratio
in Oct 1999.
10CDN Server Selection
- The primary paper, Johnson et al., focuses on how
good a CDN server (good = minimal client latency)
is selected by Akamai and Digital Island. Both
use partial-site DNS redirection.
- Used three distinct client locations in the US:
two on the east coast and one in a western state.
The clients ran different operating systems and
had different Internet bandwidth.
- Test procedure
- Determine the set of CDN servers (hostnames) used
by the particular CDN.
- Obtain the IP addresses of the CDN servers.
- Identify a GIF file (3-4 KB) and retrieve it from
each of the CDN servers 25 times, recording the
time taken. DNS lookup time is not a factor
because IP addresses are used. This test was
conducted at all three client sites.
- Fetch the same GIF via the CDN server identified
by contacting the origin server, and record the
time taken. Modified gethostbyname()? or
/etc/resolv.conf order, because the TTL was quite
small (tens of seconds). These tests were also
conducted at all three client sites for both
vendors.
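The repeated-fetch step of this procedure could be scripted roughly as below. This is a sketch under stated assumptions, not the tooling used in the paper: the helper names (`fetch_from_ip`, `time_fetches`, `summarize`) are hypothetical, and the fetch connects to the server by IP address and passes the hostname in the `Host` header so that no DNS lookup is included in the timing.

```python
import http.client
import statistics
import time


def fetch_from_ip(ip, host, path, timeout=5.0):
    """GET `path` from the server at `ip`, naming `host` in the Host header."""
    conn = http.client.HTTPConnection(ip, 80, timeout=timeout)
    try:
        conn.request("GET", path, headers={"Host": host})
        return conn.getresponse().read()
    finally:
        conn.close()


def time_fetches(fetch, repeats=25):
    """Call `fetch` `repeats` times and return the elapsed seconds per call."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fetch()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Reduce a list of timings to a few summary statistics."""
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }
```

For a real run, `time_fetches(lambda: fetch_from_ip(ip, host, path))` would be called once per CDN server IP at each client site; the IP, hostname, and path have to be supplied by whoever runs the test.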
11Results
- Both vendors showed the same results.
- The very best CDN server is not chosen at some
locations.
- Performance is highly location dependent. Some
locations performed much better than others,
which reflects CDN server placement.
- However, >90% of the time a reasonably good
server for the particular location is chosen.
- Around 10% of the time, a random choice would
have done better.
- Conclusion: the CDN does not choose an optimal CDN
server, but it avoids notably bad CDN servers.
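One way to express these comparisons in code is sketched below. The slides do not give the paper's exact criteria, so the choices here are assumptions: "optimal" means the fastest measured server, and "a random choice would have done better" is read as the mean time over all servers being lower than the chosen server's time. The function name and the timings are made up.

```python
import statistics


def evaluate_choice(times_by_server, chosen):
    """Compare the DNS-chosen server with every measured alternative."""
    chosen_time = times_by_server[chosen]
    all_times = sorted(times_by_server.values())
    return {
        "rank": all_times.index(chosen_time) + 1,  # 1 = fastest server
        "is_optimal": chosen_time == all_times[0],
        # A uniformly random pick costs the mean time on average.
        "random_would_beat": statistics.mean(all_times) < chosen_time,
    }


# Hypothetical download times in seconds for four CDN servers.
measured = {"a": 0.12, "b": 0.08, "c": 0.30, "d": 0.10}
good_pick = evaluate_choice(measured, "d")  # second fastest, beats random
bad_pick = evaluate_choice(measured, "c")   # slowest, random would do better
```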
12Another location
13Other Results
- The focus is on comparing the Sep 2000 and
Jan 2001 results.
- CDN server selection test results are identical
to what we saw earlier.
- HTTP/1.1 results are better than HTTP/1.0 with
parallel connections. HTTP/1.1 pipelining is
faster than serial requests.
14Load balancing and DNS Lookup Overhead
- Until now we ignored DNS lookup time in order to
focus on measuring the quality of the CDN server
chosen.
- However, it is not an insignificant overhead,
especially considering the very small download
times and TTLs, e.g. Adero 10 sec, Akamai and
Digital Island 20 sec. TTLs for non-CDN origin
sites: cnn.com 15 min, espn.com 6 hours.
- Bala et al. conducted a test to measure the DNS
lookup overhead (and latency) introduced by the
CDN load-balancing mechanism.
- Test procedure
- Every 8 hours, store a (fixed) IP address for
each CDN server.
- During each 8-hour period, every 30 minutes,
compare the newly returned IP address with the
previously stored (fixed) IP address.
- Measure the DNS lookup time and download time for
the newly returned IP address.
- Compare that download time with the download time
from the fixed IP address.
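A single 30-minute probe from this procedure might look like the sketch below. It is an approximation, not the authors' harness: `Probe`, `timed`, and `probe_once` are hypothetical names, the `download` callable is supplied by the caller, and the default resolver `socket.gethostbyname` may answer from a local cache, which a real measurement would have to control for.

```python
import socket
import time
from dataclasses import dataclass


@dataclass
class Probe:
    new_ip: str
    same_ip: bool          # did DNS return the stored (fixed) address?
    dns_time: float        # seconds spent on the lookup
    new_download: float    # seconds to download from the new address
    fixed_download: float  # seconds to download from the fixed address

    @property
    def new_total(self):
        """Lookup plus download: what a client using DNS actually pays."""
        return self.dns_time + self.new_download


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def probe_once(hostname, fixed_ip, download, resolve=socket.gethostbyname):
    """Resolve hostname, then time downloads from the new and fixed IPs."""
    new_ip, dns_time = timed(resolve, hostname)
    _, new_download = timed(download, new_ip)
    _, fixed_download = timed(download, fixed_ip)
    return Probe(new_ip, new_ip == fixed_ip, dns_time,
                 new_download, fixed_download)
```

With the schedule on this slide, the fixed address would be refreshed every 8 hours and `probe_once` run every 30 minutes in between, giving 16 probes per period.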
15Results
- In Jan 2001, the new IP address was the same as
the fixed one 15% (Fasttide) to 70% (Digital
Island) of the time.
- In these cases the new-IP download time is
identical to that of the fixed IP, but the DNS
lookup overhead undermines overall performance.
- 10% of the time, the download from the new IP
address is faster, but again the DNS lookup ...
- 30-40% of the time (Akamai), the new download time
is more than that of the fixed IP address; again
the DNS lookup ...
- New download times are longer than the fixed IP
address download times.
- Overall, redirection is not efficient.
16Some Facts ...
- CDNs are mainly used for image files (static
content).
- Content served by the CDNs is static in nature.
Only 0.3% of content changed for existing
URLs, and at most 13 new URLs were
introduced.
- Black-box performance testing, so there is no data
about load balancing, only latency.
- Large increase in CDN deployment between
Nov 99 (only 1-2% of the top 670 sites) and
Dec 2000 (25% of the popular sites).
- Akamai seems to be the most popular CDN vendor.
- Images are 96-98% of CDN-served content, but
only 40-46% of CDN-served bytes. Is the rest
dynamic content?
- The cache hit rate for CDN-served images is
30-80%, versus only 25-60% for non-CDN-served
images.
- IP addresses need to be mapped to geography for
better CDN server selection.
- CDNs cannot be used for content that involves
authentication etc.