The CoDeeN Content Distribution Network

1
The CoDeeN Content Distribution Network
  • Vivek S. Pai, Limin Wang, KyoungSoo Park, Ruoming
    Pang, Larry Peterson
  • Princeton University
  • August 12, 2003

2
Content Distribution Networks
  • Replicates Web content broadly
  • Redirects clients to best copy
  • Load, locality, proximity
  • Offloads work from origin servers
  • Multiplexes load spikes
  • Reduces overprovisioning
  • Ex: Akamai, Mirror Image, Speedera

3
What Does It Do?
  • An Academic Content Distribution Network
  • Redirects/caches HTTP requests
  • Based on our OSDI 2002 paper on CDN performance
  • An Open Proxy Network
  • Probably the largest in existence

4
Who Is The Target Audience?
  • Now
  • Users wanting better performance
  • People seeking anonymity
  • Next
  • Content providers seeking load sharing
  • Later
  • General support for absorbing flash crowds
  • Avoid the Slashdot Effect

5
How Does It Work?
  • Server surrogates (proxies) on most North
    American sites
  • Originally everywhere, but we cut back
  • Clients specify proxy to use
  • Cache hits served locally
  • Cache misses forwarded to CoDeeN nodes
  • Maybe forwarded to origin servers
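
The hit/miss path on this slide can be sketched roughly as follows. This is a toy model, not CoDeeN's code: the `ProxyNode` class, the `fetch_origin` callable, and the hash-based choice of which node handles a URL are all assumptions made for illustration, since the slide does not say how the forwarding target is chosen.

```python
import hashlib


class ProxyNode:
    """Toy proxy: serve from the local cache, else ask a peer, else the origin."""

    def __init__(self, name, peers, fetch_origin):
        self.name = name
        self.peers = peers                # name -> ProxyNode (set by the caller)
        self.fetch_origin = fetch_origin  # hypothetical callable: url -> body
        self.cache = {}

    def pick_node(self, url):
        # Assumption: every node ranks all nodes by a hash of (node, url),
        # so they all agree on one "owner" per URL.
        def weight(node_name):
            return hashlib.sha1((node_name + url).encode()).hexdigest()
        return max(list(self.peers) + [self.name], key=weight)

    def handle(self, url, forwarded=False):
        # Cache hits are served locally.
        if url in self.cache:
            return self.cache[url], "hit@" + self.name
        # Cache misses from clients are forwarded to another node once.
        if not forwarded:
            owner = self.pick_node(url)
            if owner != self.name:
                body, where = self.peers[owner].handle(url, forwarded=True)
                self.cache[url] = body
                return body, where
        # The node responsible for the URL goes to the origin server.
        body = self.fetch_origin(url)
        self.cache[url] = body
        return body, "origin via " + self.name
```

With three such nodes wired together, a URL requested through any of them should reach the origin only once; later requests are answered from some node's cache.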

6
Request Forwarding

7
When Will It Be Ready?
  • January: development started
  • Reliability & stability major concerns
  • March: stable enough for daily use
  • April: security problems begin
  • Shut down for one month
  • June: restarted beta
  • Expecting production soon

8
Decisions: Good & Bad
  • Use commercial proxy with API (USITS 2003)
  • Good: mostly layer 7 concerns
  • Bad: limits deployment size (donated licenses)
  • Deployment on PlanetLab
  • Good: otherwise impossible
  • Bad: vulnerable to other experiments
  • Allow open access
  • Good: generates real traffic
  • Bad: some traffic just plain mean

9
Lots of Malicious Traffic
  • Spammers
  • SMTP tunnels, POST forms, IRC channels
  • Bandwidth hogs
  • Google crawls, steganographers, X-Pacific
  • Hackers & Spreaders
  • Yahoo dictionary attacks, IIS vuln tests
  • Content thieves
  • E-journals/databases, local content
  • Callouts: restrict ports & HTTP methods; multi-scale req & bw accounting; signature database & robot test; determine location & privilege
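
The "restrict ports & HTTP methods" callout could look something like the check below. It is a minimal sketch under stated assumptions: the `admit` function and both allowlists are hypothetical, and the slide does not say which methods or ports CoDeeN actually permitted.

```python
from urllib.parse import urlsplit

# Assumed allowlists for illustration only.
ALLOWED_METHODS = {"GET", "HEAD"}
ALLOWED_PORTS = {80, 8080}


def admit(method, target):
    """Return (allowed, reason) for one proxied request line."""
    method = method.upper()
    if method not in ALLOWED_METHODS:
        # Rejects CONNECT (used for SMTP/IRC tunnels) and, here, POST.
        return False, "method %s not allowed" % method
    parts = urlsplit(target)
    if parts.scheme != "http" or parts.hostname is None:
        return False, "not an absolute http URL"
    try:
        port = parts.port if parts.port is not None else 80
    except ValueError:
        return False, "malformed port"
    if port not in ALLOWED_PORTS:
        # Rejects e.g. http://host:25/, an HTTP request aimed at a mail port.
        return False, "port %d not allowed" % port
    return True, "ok"
```

For example, `admit("CONNECT", "mail.example.com:25")` is refused on the method, and `admit("GET", "http://example.com:25/")` is refused on the port.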

10
Protecting Privilege

11
Attempted SMTP Tunnels/Day

12
By The Numbers
  • Restarted in late May
  • In continuous operation
  • Stats from first 8 weeks
  • Over 59,000 unique IPs as clients
  • Over 24 million requests serviced
  • Valid rates up to 15K reqs/hour
  • Roughly 1 million reqs/day aggregate
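
Slide 9 mentions multi-scale request and bandwidth accounting, and the rates above give a feel for the scales involved. One way to picture the idea is a limiter that checks several time windows at once, so a client can neither burst in one second nor sustain abuse over an hour. The `MultiScaleLimiter` class and any window or cap values passed to it are hypothetical; the sketch counts requests only, though bytes could be tracked the same way.

```python
from collections import deque


class MultiScaleLimiter:
    """Admit a request only if the client is under the cap in every window."""

    def __init__(self, limits):
        # limits maps window length in seconds to max requests in that window,
        # e.g. {1: 10, 60: 300, 3600: 15000} (illustrative numbers only).
        self.limits = dict(limits)
        self.longest = max(self.limits)
        self.events = {}  # client id -> deque of admitted request timestamps

    def allow(self, client, now):
        q = self.events.setdefault(client, deque())
        # Forget requests older than the longest window.
        while q and now - q[0] >= self.longest:
            q.popleft()
        # Check each scale independently; any exceeded cap rejects.
        for window, cap in self.limits.items():
            recent = sum(1 for t in q if now - t < window)
            if recent >= cap:
                return False
        q.append(now)
        return True
```

Rejected requests are not recorded, so a blocked client recovers as soon as its admitted history ages out of the offending window.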

13
More Production Info
  • About 2000 lines of code
  • About ¼ is actual decision logic
  • Uptimes limited by upgrades
  • Generally 1-2 times/week
  • Downtimes of 20 seconds/node
  • Currently on 40 nodes

14
Daily Requests (Serviced)

15
Welcome

16
Avoiding
sorted by avoiding

17
Load
sorted by load average

18
Total
sorted by total req rate

19
Users
sorted by users

20
The Troubles We've Caused
  • Routinely trigger open proxy alerts
  • Educating sysadmins, others
  • Resource checks generate noise
  • Got onto planetlab-support
  • Really good honeypots
  • 6000 SMTP flows/minute at CMU
  • Spammers do 1M HTTP ops/day

21
What We've Learned
  • Parallel ssh is a must
  • General commands/queries
  • Basis for parallel scp
  • Used to detect out-of-date files
  • Monitoring is a must
  • Too hard to see anomalies in 40 nodes
  • Almost looks like a demo
  • Be careful accepting outside requests
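
The parallel ssh idea, including its use for spotting out-of-date files, might be sketched like this. It is not the tool the authors built: `ssh_run`, `parallel_run`, and `out_of_date` are hypothetical helpers, and the sketch assumes key-based login through the system `ssh` client.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_run(host, command, timeout=30):
    """Run one command on one host; return (exit code, trimmed stdout)."""
    proc = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, command],
        capture_output=True, text=True, timeout=timeout,
    )
    return proc.returncode, proc.stdout.strip()


def parallel_run(hosts, command, runner=ssh_run, workers=16):
    """Run the same command on every host concurrently; return host -> result."""
    def one(host):
        try:
            return host, runner(host, command)
        except Exception as exc:  # timeouts, unreachable hosts, etc.
            return host, (None, str(exc))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(one, hosts))


def out_of_date(results, expected):
    """List hosts whose command failed or whose output differs from expected."""
    return sorted(
        host for host, (code, out) in results.items()
        if code != 0 or out != expected
    )
```

A typical use would be running a checksum command on every node and passing the known-good checksum as `expected`; the `runner` parameter lets the fan-out logic be exercised without real hosts.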

22
What We Still Need
  • Better layer 4 tools
  • Hard to tell why things die
  • Building complete heartbeats isn't fun
  • Better isolation on most resources
  • CPU/OS: Java, VServers, ???
  • Others: FD exhaustion, disk space

23
What We Wouldn't Mind
  • Customizable DNS mapping
  • Map project.planet-lab.org to some node
  • Projects could provide feedback
  • Node availability, utility, etc
  • Most IP geolocation seems locked up

24
More Info
  • http://codeen.cs.princeton.edu