Title: Globally Distributed Content Delivery
1Globally Distributed Content Delivery
- John Dilley, Bruce Maggs, Jay Parikh, Harald Prokop, Ramesh Sitaraman, Bill Weihl
- Akamai Technologies
- IEEE Internet Computing, September/October 2002
2Outline
- Introduction
- Existing approaches
- Akamai's Network Infrastructure
- Network Services
- Technical Challenges
- Related Work
- Conclusion
3Introduction
- As web sites become popular
- The flash crowd problem
- Crash a site
- Unusually high response times
- Serious problems while serving Web content from a single location
- Site scalability
- Reliability
- Performance
- Devise a system to serve from a variable number of surrogate origin servers at the network edge
- Many technical challenges
- Direct requests
- Handle failures
- Monitor and control the servers
- Update software
4Existing Approaches
- Several approaches to delivering content in a scalable and reliable way
- Local clustering
- Improves fault tolerance and scalability
- Difficult to scale clusters to thousands of servers
- Mirroring
- Deploying clusters in a few locations
- Requires synchronizing the site among the mirrors
- Multi-homing
- Using multiple ISPs to connect to the Internet
- Underlying network protocols don't converge quickly
5Akamai's Network Infrastructure
- Allocates more servers to sites experiencing high load
- The system directs client requests to the nearest available server likely to have the requested content
- Nearest
- A function of network topology and dynamic link characteristics
- Available
- A function of load and network bandwidth
- Likely
- A function of which servers carry the content for each customer in a data center
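The three criteria on this slide can be read as a filter followed by a ranking. The sketch below is only an illustration under stated assumptions: the `Server` fields, the `max_load` cutoff of 0.8, the customer names, and the use of a single latency number as a stand-in for "topology and dynamic link characteristics" are all hypothetical, not details from the paper.

```python
from dataclasses import dataclass

@dataclass
class Server:
    ip: str
    latency_ms: float      # "nearest": stand-in for topology and link measurements
    load: float            # "available": fraction of capacity in use, 0.0 to 1.0
    customers: frozenset   # "likely": customers whose content this server carries

def pick_server(servers, customer, max_load=0.8):
    """Return the lowest-latency server that is under max_load and carries the customer."""
    candidates = [s for s in servers if s.load < max_load and customer in s.customers]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.latency_ms)

servers = [
    Server("192.0.2.10", 12.0, 0.95, frozenset({"example-news"})),  # close but overloaded
    Server("192.0.2.11", 20.0, 0.40, frozenset({"example-news"})),
    Server("192.0.2.12", 8.0, 0.10, frozenset({"example-shop"})),   # closest, wrong customer
]
chosen = pick_server(servers, "example-news")
print(chosen.ip if chosen else "no server available")
```

Here the closest server is skipped because it does not carry the customer's content, and the next closest is skipped because it is overloaded, so the request goes to 192.0.2.11.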
6Akamai's Network Infrastructure: Automatic Network Control
- The direction of requests to content servers is called mapping
- Uses a dynamic, fault-tolerant DNS system
- Resolves each hostname based on
- the service requested
- user location
- network status
- Also uses DNS for network load balancing
7Akamai's Network Infrastructure: Automatic Network Control
- Akamai name servers resolve host names to IP addresses using the following criteria
- Service requested
- Server health
- Server load
- Network condition
- Client location
- Content requested
- BGP messages
- Network reachability information
- The best routing path among the Internet's autonomous systems (ASes)
- The mapping system
- Uses BGP and live network statistics, such as traceroute data
- Network structure and quality measures
8Akamai's Network Infrastructure: Automatic Network Control
9DNS Resolution
- Akamai edge servers are located using a DNS name
- Such as a7.g.akamai.net
- Process
- Resolver ------------> root name server
- a7.g.akamai.net
- <------------
- the IPs of the name servers that handle .net domain requests
- Resolver ------------> .net name servers
- <------------
- the IPs of the name servers that handle .akamai.net domain requests (the Akamai top-level name servers; top-level DNS in F.1)
10DNS Resolution
- Resolver ------------> Akamai TL DNS server
- <------------
- returns the low-level Akamai name servers (low-level DNS in F.1) for .g.akamai.net with a TTL of about one hour
- Resolver ------------> Akamai LL DNS server
- <------------
- returns the IPs of servers available to satisfy the request, with a short TTL (seconds to minutes)
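The two TTLs in the last two steps determine how often a resolver has to go back to each tier. The following is a minimal simulation of that effect, not the real resolution protocol: `ResolverCache` is a hypothetical helper, the one-hour delegation TTL comes from the slide, and the 20-second answer TTL is an assumed value inside the slide's "seconds to minutes" range.

```python
TL_TTL = 3600   # low-level name server delegation: "about one hour"
LL_TTL = 20     # edge server IPs: "seconds to minutes"; 20 s is an assumed value

class ResolverCache:
    """Tracks when cached records expire so upstream queries can be counted."""

    def __init__(self):
        self.expiry = {}                    # record name -> absolute expiry time (s)
        self.queries = {"TL": 0, "LL": 0}   # queries sent to each Akamai tier

    def lookup(self, now):
        # Ask the top-level servers again only when the delegation has expired.
        if self.expiry.get("delegation", -1) <= now:
            self.queries["TL"] += 1
            self.expiry["delegation"] = now + TL_TTL
        # Ask the low-level servers again whenever the short-lived answer has expired.
        if self.expiry.get("answer", -1) <= now:
            self.queries["LL"] += 1
            self.expiry["answer"] = now + LL_TTL

cache = ResolverCache()
for now in range(0, 3600, 5):   # one client lookup every 5 seconds for an hour
    cache.lookup(now)
print(cache.queries)
```

Under these assumptions the resolver contacts the top-level servers once and the low-level servers 180 times in the hour, which is why the low-level tier can react to load and failures within seconds while the top-level tier carries little traffic.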
11Akamai's Network Infrastructure: Network Monitoring
- DNS-based load-balancing system
- Continuously monitors the state of services, their servers, and networks
- Each content server reports its load to a monitoring application
- The monitoring application reports to the local DNS server
- The DNS server determines which IPs to return when resolving DNS names, based on load thresholds
- When a server's load exceeds a threshold, some of its allocated content is assigned to additional servers
- When the load exceeds another threshold, the server's IP is no longer available to clients
- The monitoring system also transmits data center load to the top-level DNS resolver
- Monitors the entire system end to end
- Uses agents that simulate end-user behavior
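The two-threshold behavior described above can be sketched as a small classification step over reported loads. The threshold values and the `plan` function are hypothetical; the slides say only that two thresholds exist, not what they are.

```python
SPREAD_THRESHOLD = 0.7   # assumed: above this, share the server's content with more servers
REMOVE_THRESHOLD = 0.9   # assumed: above this, stop handing out the server's IP

def plan(loads):
    """Classify each server from its reported load (0.0 to 1.0).

    Returns (IPs that may appear in DNS answers, IPs whose content should be spread).
    """
    answer_ips, spread = [], []
    for ip, load in sorted(loads.items()):
        if load > REMOVE_THRESHOLD:
            continue                  # IP withheld from DNS answers
        answer_ips.append(ip)
        if load > SPREAD_THRESHOLD:
            spread.append(ip)         # still served, but its content is spread out
    return answer_ips, spread

reports = {"192.0.2.1": 0.35, "192.0.2.2": 0.75, "192.0.2.3": 0.97}
answer_ips, spread = plan(reports)
print(answer_ips, spread)
```

With these sample reports, 192.0.2.3 drops out of DNS answers entirely, while 192.0.2.2 stays in service but is flagged for content spreading.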
12Network Services
- Static Content
- Akamai's content servers use the content type to apply lifetimes and other features
- Lifetime
- Zero to infinite (never check the object's consistency)
- Lifetime values for Akamai edge servers can differ from those used by downstream proxies and end users
- Special features
- The ability to serve secure content
- Support for alternate content
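A per-content-type lifetime amounts to a freshness check before each cache hit. In the sketch below, the `LIFETIMES` table, its content types, and its values are hypothetical; only the range "zero to infinite" comes from the slide.

```python
import math

# Hypothetical lifetimes in seconds, keyed by content type.
LIFETIMES = {
    "text/html": 0,                        # zero: always check with the origin
    "image/png": 86400,                    # one day
    "application/x-versioned": math.inf,   # infinite: never check consistency
}

def is_fresh(content_type, fetched_at, now):
    """True if the cached copy can be served without checking the origin."""
    lifetime = LIFETIMES.get(content_type, 0)   # unknown types are always revalidated
    return now - fetched_at < lifetime

print(is_fresh("text/html", 100, 100))                  # lifetime zero
print(is_fresh("image/png", 0, 3600))                   # one hour into a one-day lifetime
print(is_fresh("application/x-versioned", 0, 10**9))    # infinite lifetime
```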
13Network Services
- Dynamic Content
- Uses Edge Side Includes (ESI) technology
- Similar to server-side include languages
- Adds fault-tolerance features
- Can process XML data
- Breaks a dynamic page into fragments with independent cacheability properties
- Must fetch only noncacheable or expired fragments from the origin Web site
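The benefit of fragments with independent cacheability can be shown without any ESI markup. The sketch below uses plain Python rather than the ESI language: `FragmentCache`, `render`, the fragment names, and the TTL values are all hypothetical, and the `origin` dictionary stands in for the origin Web site.

```python
class FragmentCache:
    """Edge-side cache that stores each page fragment with its own lifetime."""

    def __init__(self, fetch):
        self.fetch = fetch        # callable: fragment name -> body (the origin)
        self.store = {}           # fragment name -> (body, expiry time)
        self.origin_fetches = 0

    def get(self, name, ttl, now):
        body, expiry = self.store.get(name, (None, -1))
        if ttl > 0 and now < expiry:
            return body           # cached and unexpired: no origin round trip
        body = self.fetch(name)
        self.origin_fetches += 1
        if ttl > 0:
            self.store[name] = (body, now + ttl)
        return body

def render(cache, layout, now):
    """layout is a list of (fragment name, ttl seconds); ttl 0 means noncacheable."""
    return "".join(cache.get(name, ttl, now) for name, ttl in layout)

origin = {"header": "<h1>Shop</h1>", "cart": "<p>2 items</p>", "footer": "<p>bye</p>"}
cache = FragmentCache(origin.__getitem__)
layout = [("header", 3600), ("cart", 0), ("footer", 3600)]

page1 = render(cache, layout, now=0)    # cold cache: all three fragments fetched
page2 = render(cache, layout, now=10)   # only the noncacheable cart is fetched again
print(cache.origin_fetches)
```

Two page views cost four origin fetches instead of six, and the saving grows with every further view while the header and footer remain unexpired.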
14Network Services
- Streaming Media
- Supports live and on-demand media in three formats
- Microsoft Windows Media, Real, and QuickTime
- Additional challenges
- The content provider typically captures and encodes a live stream and sends it to an entry-point server
- Needs a mechanism that reacts quickly to a failed entry-point server
- Delivery from the entry point to the edge servers must be resilient to failure and packet loss
- Uses information dispersal techniques over multiple redundant paths
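To see why redundant paths help with packet loss, the sketch below sends a full copy of every packet down each path and lets the receiver keep the first copy of each sequence number. This is plain replication, which is simpler than the information dispersal coding the slide mentions; the `deliver` function and the loss patterns are hypothetical.

```python
def deliver(packets, lost_on_path):
    """Reassemble a stream sent in full over several paths.

    packets: dict of sequence number -> payload.
    lost_on_path: one set of lost sequence numbers per path.
    Returns payloads in sequence order, with None where every path lost the packet.
    """
    received = {}
    for lost in lost_on_path:
        for seq, payload in packets.items():
            if seq not in lost:
                received.setdefault(seq, payload)   # keep first copy, drop duplicates
    return [received.get(seq) for seq in sorted(packets)]

packets = {0: "a", 1: "b", 2: "c", 3: "d"}
two_paths = deliver(packets, [{1, 2}, {2, 3}])
three_paths = deliver(packets, [{1, 2}, {2, 3}, {0}])
print(two_paths)
print(three_paths)
```

A packet is missing only if every path drops it: with two paths packet 2 is lost, and adding a third path recovers the whole stream.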
15Technical Challenges: System Scalability
- Monitoring and controlling servers while minimizing monitoring bandwidth
- Monitoring network conditions and generating new maps
- Minimizing the overhead added to DNS to avoid long DNS lookup times
- Dealing with incomplete and out-of-date information
- Reacting quickly to changing network conditions and changing workloads
- Measuring Internet conditions to estimate end-user performance
- Customers' problems
- Isolating customers to avoid affecting each other
- Data integrity
- Collecting logs
16Technical Challenges: System Reliability
- Akamai's monitoring and mapping software ensures that server or network failures do not affect end users
- The DNS system detects failures quickly and hands out new IPs
- For customers still using cached DNS answers
- DNS resolution returns multiple IP addresses
- A live server assumes the failed server's IP
- Prevent network failures from making sites unreachable
- Top-level (TL) DNS servers identify local DNS servers at different sites
- Prevent outages
- Avoid single points of failure
- By replicating monitoring and control mechanisms
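Returning multiple IP addresses protects clients that hold a cached answer, because they can move on to the next address when one fails. A minimal sketch of that client-side behavior follows; `first_reachable` and the `is_up` probe are hypothetical, and a real client would attempt a connection with a timeout instead of consulting a set.

```python
def first_reachable(ips, is_up):
    """Try each address from a DNS answer in order; return the first that responds."""
    for ip in ips:
        if is_up(ip):
            return ip
    return None

cached_answer = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
down = {"192.0.2.1"}                      # this server failed after the answer was cached
target = first_reachable(cached_answer, lambda ip: ip not in down)
print(target)
```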
17Technical Challenges: System Reliability
- Detect and repair software flaws
- New request and response headers can appear at any time
- Use a test tool that directs a copy of a live server's traffic to a test version of the software
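The test tool described above can be approximated as a comparison loop: users are served by the live build while a copy of each request goes to the test build. The sketch below is only an illustration; `shadow_compare`, the two handler functions, and the sample headers are hypothetical and do not describe Akamai's actual tool.

```python
def shadow_compare(requests, live, candidate):
    """Serve each request from live; replay a copy against candidate and log mismatches."""
    mismatches = []
    for req in requests:
        expected = live(req)
        try:
            actual = candidate(req)
        except Exception as exc:      # a crash in the test build must not affect users
            actual = f"error: {exc!r}"
        if actual != expected:
            mismatches.append((req, expected, actual))
    return mismatches

def live(headers):
    return "200 " + headers.get("Accept", "*/*")

def candidate(headers):
    return "200 " + headers["Accept"]   # bug: assumes the header is always present

requests = [{"Accept": "text/html"}, {"X-Unknown": "1"}]
report = shadow_compare(requests, live, candidate)
for req, expected, actual in report:
    print(req, expected, actual)
```

The second request carries a header combination the test build did not anticipate, which is the kind of flaw that real traffic exposes and hand-written tests tend to miss.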
18Technical Challenges: Software Deployment and Platform Management
- Software must evolve with new features for customers, improved performance, and better operational and monitoring capabilities
- Cannot upgrade software on the entire network atomically
- Unlikely that all edge servers (or all networks) will be available at the same time
- Inevitably miss some servers and have to update them at a later time
- Operating multiple OS platforms and services
- Requires a monitoring platform and tools that run across those platforms
- Gives access to servers' service delivery parameters for load balancing
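Because the upgrade cannot be atomic, a rollout has to tolerate missed servers and a period in which several versions coexist. A minimal sketch under stated assumptions: the `rollout` function, server names, and version strings are hypothetical.

```python
def rollout(versions, reachable, target):
    """Upgrade every reachable server to target; return the servers that were missed."""
    missed = []
    for server in sorted(versions):
        if server in reachable:
            versions[server] = target
        else:
            missed.append(server)     # left on its old version until a later pass
    return missed

versions = {"edge-a": "1.0", "edge-b": "1.0", "edge-c": "1.0"}

# First pass: edge-b is unavailable, so two versions now run side by side.
missed = rollout(versions, reachable={"edge-a", "edge-c"}, target="1.1")
mixed = dict(versions)

# Later pass: edge-b is back and gets picked up.
missed_again = rollout(versions, reachable={"edge-a", "edge-b", "edge-c"}, target="1.1")
print(missed, mixed, missed_again)
```

The intermediate `mixed` state is the important part: the software has to work correctly while old and new versions serve traffic together.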
19Technical Challenges: Content Visibility and Control
- Cache consistency
- Cacheable-object consistency
- Apply a "time to live" (TTL) to objects
- Or use a different URL for each object version
- Unique query string
- Let customers place a version or generation number in the URL
- Versioned objects often have infinite TTLs
- Uncacheable objects' performance
- An edge server sits between client and origin
- Split the TCP connection
- Edge server can react to packet loss more quickly
- Map clients to an efficient edge server
- Edge server can maintain much longer persistent connections with the client
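Giving each object version its own URL means a cached copy never goes stale, which is why such objects can carry infinite TTLs. The sketch below shows the query-string variant using only the standard library; the parameter name `v` and the `versioned_url` helper are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def versioned_url(url, version):
    """Return url with a 'v' query parameter set to version (parameter name is assumed)."""
    parts = urlsplit(url)
    # Drop any earlier version parameter, keep everything else in order.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "v"]
    query.append(("v", str(version)))
    return urlunsplit(parts._replace(query=urlencode(query)))

print(versioned_url("https://www.example.com/logo.png", 7))
print(versioned_url("https://www.example.com/logo.png?v=6&size=2", 7))
```

Publishing a new version changes the URL that pages reference, so edge servers treat it as a new object and the old one simply stops being requested.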
20Technical Challenges: Content Visibility and Control
- Lifetime control
- In some cases, certain objects must be removed from all edge servers on demand
- Akamai's edge servers support on-demand purges for changed or invalid content
- Authentication and authorization
- When serving protected content, the edge server
- Contains authorization features
- Relays authentication tokens to the origin server for authorization
- Edge server must not evict the protected content on a request authorization failure
- Akamai lets content providers authorize every user request from their own site
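Two behaviors on this slide are easy to confuse: a failed authorization denies one request but leaves the cached object in place, while a purge removes the object everywhere. The sketch below separates them; `EdgeCache`, `purge_everywhere`, and the token check are hypothetical stand-ins, with `authorize` playing the role of the content provider's own site.

```python
class EdgeCache:
    """Minimal edge cache holding protected objects keyed by URL."""

    def __init__(self):
        self.objects = {}

    def serve(self, url, token, authorize):
        """Serve a cached object only if the provider-side check accepts the token."""
        if not authorize(url, token):
            return None               # deny this request, but keep the cached object
        return self.objects.get(url)

def purge_everywhere(caches, url):
    """Remove url from every edge cache; return how many copies were dropped."""
    return sum(1 for cache in caches if cache.objects.pop(url, None) is not None)

def authorize(url, token):
    return token == "valid-token"     # placeholder for a round trip to the origin

caches = [EdgeCache(), EdgeCache()]
caches[0].objects["/report.pdf"] = b"secret"

denied = caches[0].serve("/report.pdf", "bad-token", authorize)
allowed = caches[0].serve("/report.pdf", "valid-token", authorize)
dropped = purge_everywhere(caches, "/report.pdf")
print(denied, allowed, dropped)
```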
21Technical Challenges: Content Visibility and Control
- Integrity control
- Server must ensure that each client request receives the correct response
- Detect when origin servers issue incomplete responses
- Content integrity check
- Visibility into access patterns
- Detailed content-access logs
- Aggregate individual server logs
- Real-time delivery rates and client locations rather than full log details
- Billing
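One simple way to detect an incomplete origin response is to compare the declared length with the bytes actually received, optionally followed by a digest comparison. The sketch below is an assumption about how such a check could look, not the mechanism the paper describes; `is_complete` and its parameters are hypothetical.

```python
import hashlib

def is_complete(declared_length, body, expected_sha256=None):
    """Return True if body matches its declared length and, if given, its digest."""
    if len(body) != declared_length:
        return False                  # truncated or padded response
    if expected_sha256 is not None:
        return hashlib.sha256(body).hexdigest() == expected_sha256
    return True

good_digest = hashlib.sha256(b"hello").hexdigest()
print(is_complete(5, b"hello"))                  # length matches
print(is_complete(10, b"hello"))                 # origin stopped early
print(is_complete(5, b"hellp", good_digest))     # right length, wrong content
```

An edge server applying a check like this would refuse to cache or serve a response that fails, instead of passing a truncated object on to clients.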
22Related Work in Distributed and Fault-Tolerant Systems
- Akamai uses logically centralized (but distributed and replicated) administration and control to map requests to servers and to manage the entire system
- Other systems also use logically centralized control
- The Autonet
- Uses a centralized algorithm to recompute and distribute routing tables when the network topology changes
- Web-based caching
- Update and management tools
- Software update and management in large distributed computing environments
- Depot
- Reliable system update using modular packages and internal consistency verification
- Secure data distribution by a shared global file system
23Conclusion
- Current work
- Running applications at the network's edge
- Advantages
- Capacity on demand
- Cost-effective use of the shared resources
- Respond to users without traversing long distances
- Many challenges
- Visibility into customers' running applications
- Applications' access to distributed data