CS193H: High Performance Web Sites Lecture 14: Rule 11

1
CS193H: High Performance Web Sites, Lecture 14
Rule 11: Avoid Redirects
  • Steve Souders
  • Google
  • souders_at_cs.stanford.edu

2
announcements
  • midterm Friday 10/31 3:15-4:05pm
  • 30-40 short answer questions
  • 10/29 3:15pm (now): check five Web 100
    Performance Profile sites
  • 11/3 Doug Crockford from Yahoo! will be guest
    lecturer talking about "Ajax Performance"

3
3xx status codes
  • "further action needs to be taken by the user
    agent in order to fulfill the request"
  • 300 Multiple Choices (based on Content-Type)
  • 301 Moved Permanently
  • 302 Moved Temporarily (aka, Found)
  • 303 See Other (clarification of 302)
  • 304 Not Modified
  • 305 Use Proxy
  • 306 (no longer used)
  • 307 Temporary Redirect (clarification of 302)
  • http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3
slide callouts
  • 301, 302: most popular
  • 304: response for conditional GET request
  • 303, 307: added in HTTP/1.1
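
The status list above can be captured in a small lookup. This is a minimal sketch, not part of the lecture: the names `STATUS_TEXT` and `isLocationRedirect` are made up for illustration, and treating only 301, 302, 303 and 307 as "follow the Location header" codes is an assumption based on how the slides use the term redirect.

```javascript
// Status text for the 3xx codes listed above, using the RFC 2616 names.
// The slide's "Moved Temporarily" for 302 is the older HTTP/1.0 wording.
const STATUS_TEXT = {
  300: "Multiple Choices",
  301: "Moved Permanently",
  302: "Found",
  303: "See Other",
  304: "Not Modified",
  305: "Use Proxy",
  307: "Temporary Redirect",
};

// Assumption: these are the codes that send the browser to a new Location.
// 304 is excluded because it answers a conditional GET instead.
const LOCATION_REDIRECTS = new Set([301, 302, 303, 307]);

function isLocationRedirect(status) {
  return LOCATION_REDIRECTS.has(status);
}

console.log(301, STATUS_TEXT[301], isLocationRedirect(301));
console.log(304, STATUS_TEXT[304], isLocationRedirect(304));
```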
4
redirect example
Request
GET / HTTP/1.1
Host: astrology.yahoo.com
Response
HTTP/1.1 301 Moved Permanently
Location: http://shine.yahoo.com/astrology/
  • go to the new location instead of the original
    one
  • why use redirects?
    • prettier URLs
    • track traffic
    • authentication
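
To make the request/response exchange above concrete, here is a rough sketch of how a client could read the status code and the `Location` header out of a raw response head. `parseRedirect` is a hypothetical helper written for this note, not an API from any library.

```javascript
// Hypothetical helper: extract the status code and Location header from a raw
// HTTP response head. Header names are case-insensitive, so compare in lowercase.
function parseRedirect(rawHead) {
  const lines = rawHead.split(/\r?\n/);
  const statusMatch = /^HTTP\/\S+ (\d{3})/.exec(lines[0]);
  if (!statusMatch) return null; // not an HTTP status line
  const status = Number(statusMatch[1]);
  let location = null;
  for (const line of lines.slice(1)) {
    const idx = line.indexOf(":");
    if (idx === -1) continue; // skip anything that is not "Name: value"
    if (line.slice(0, idx).trim().toLowerCase() === "location") {
      location = line.slice(idx + 1).trim();
    }
  }
  return { status, location };
}

const exampleHead = [
  "HTTP/1.1 301 Moved Permanently",
  "Location: http://shine.yahoo.com/astrology/",
].join("\r\n");

console.log(parseRedirect(exampleHead));
```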

5
worst blocker
  • a redirect in front of the HTML document blocks
    worse than stylesheets and scripts do
    • all resources in the page are delayed
    • the user gets very little feedback (nothing in
      the page)
    • rendering, even of the HTML text, is delayed
  • 2nd worst: redirecting to a script

6
caching redirects
Request
GET / HTTP/1.1
Host: astrology.yahoo.com
Response
HTTP/1.1 301 Moved Permanently
Date: Tue, 28 Oct 2008 07:39:53 GMT
Location: http://shine.yahoo.com/astrology/
Cache-Control: private
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8
  • "Moved Permanently": is it cached?
    • no
    • spec: "cacheable if indicated by a Cache-Control
      or Expires header field"
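
The spec wording quoted above can be read as a simple check on the response headers. The sketch below is a deliberately simplified interpretation, not how any browser decides: `redirectAllowsCaching` is a hypothetical name, headers are assumed to arrive as an object with lowercase keys, and only `max-age`, `no-cache`, `no-store` and `Expires` are considered.

```javascript
// Simplified reading of "cacheable if indicated by a Cache-Control or Expires
// header field". Assumption: headers is a plain object with lowercase keys.
function redirectAllowsCaching(headers, now = new Date()) {
  const cacheControl = (headers["cache-control"] || "").toLowerCase();
  if (/\bno-store\b|\bno-cache\b/.test(cacheControl)) return false;

  // A positive max-age is treated as an explicit freshness lifetime.
  const maxAge = /\bmax-age=(\d+)/.exec(cacheControl);
  if (maxAge) return Number(maxAge[1]) > 0;

  // Otherwise fall back to Expires: it must parse and lie in the future.
  if (headers["expires"]) {
    const expires = new Date(headers["expires"]);
    return !Number.isNaN(expires.getTime()) && expires > now;
  }
  return false; // nothing indicates the redirect may be cached
}

// The response shown above carries only "Cache-Control: private".
const sampleNow = new Date("Tue, 28 Oct 2008 07:39:53 GMT");
console.log(redirectAllowsCaching({ "cache-control": "private" }, sampleNow));
```

Under this reading the response on the slide gives no freshness information, which is consistent with the slide's answer that the redirect is not cached.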

7
caching 301, 302, Expires
301 Moved Permanently
  • past Expires: don't cache (all)
  • no Expires: don't cache IE, FF3, Safari, Opera; cache FF2, Chrome
  • future Expires: don't cache IE, FF3, Safari, Opera; cache FF2, Chrome
302 Moved Temporarily
  • past Expires: don't cache (all)
  • no Expires: don't cache IE, FF3, Safari, Opera, Chrome; cache FF2
  • future Expires: don't cache IE, FF3, Safari, Opera; cache FF2, Chrome
FF2 and Chrome are the only browsers to cache
redirects; Firefox 3.1 fixes the regression from FF2
to FF3
browsers tested: IE 7 and 8 (beta 2), Firefox 2.0 and 3.0, Safari 4,
Opera 9.61, Chrome 0.2
8
redirect alternatives
  • JavaScript
    • document.location = "destination.php"
    • what if JavaScript is disabled or not present?
  • meta refresh: put in document's HEAD
    • <meta http-equiv="refresh"
      content="0; url=destination.php">
    • in IE, causes conditional GET requests for all
      resources (similar to Reload button)

9
cache workaround
need to let the page load so it can be cached
<html>
<head>
<script type="text/javascript">
window.onload = function () {
  document.location = "destination.php";
};
</script>
<noscript>
<meta http-equiv="refresh"
  content="0; url=destination.php">
</noscript>
</head>
  • one last thing
    • make this document cacheable!
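
One way to act on "make this document cacheable" is to serve the redirect page with a future `Expires` header. The following is a sketch under assumptions: `buildRedirectPage` is a hypothetical helper, the one-day lifetime is arbitrary, and the destination is assumed to be a trusted, site-controlled URL (no attempt is made to defend against hostile input).

```javascript
// Hypothetical helper: build a cacheable response for the redirect page above.
// Assumption: destination is a trusted URL chosen by the site itself.
function buildRedirectPage(destination, now = new Date(), ttlSeconds = 86400) {
  const expires = new Date(now.getTime() + ttlSeconds * 1000);
  const jsTarget = JSON.stringify(destination); // quoted JS string literal
  const attrTarget = destination.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
  const body = [
    "<html>",
    "<head>",
    '<script type="text/javascript">',
    // Wait for onload so the browser finishes loading (and can cache) the page.
    `window.onload = function () { document.location = ${jsTarget}; };`,
    "</script>",
    "<noscript>",
    `<meta http-equiv="refresh" content="0; url=${attrTarget}">`,
    "</noscript>",
    "</head>",
    "</html>",
  ].join("\n");
  return {
    status: 200,
    headers: {
      "Content-Type": "text/html; charset=utf-8",
      "Expires": expires.toUTCString(), // future Expires makes the page cacheable
      "Cache-Control": `max-age=${ttlSeconds}`,
    },
    body,
  };
}

console.log(buildRedirectPage("destination.php").headers);
```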

10
redirects in the top 10
site                       redirects
www.aol.com                5
www.ebay.com
www.facebook.com
www.google.com/search
search.live.com/results
www.msn.com                1
www.myspace.com
en.wikipedia.org/wiki
www.yahoo.com              2
www.youtube.com
  • mostly ads

11
common uses
  1. redirect from blah.com to www.blah.com
  2. missing trailing slash
  3. tracking internal traffic
  4. tracking outbound traffic
  5. prettier URLs, preserve old URLs
  6. connecting web sites
  7. ads
  8. authentication

12
use 1: www
Request
GET / HTTP/1.1
Host: aol.com
Response
HTTP/1.x 301 Moved Permanently
Date: Tue, 28 Oct 2008 23:01:42 GMT
Expires: Tue, 28 Oct 2008 23:31:42 GMT
Location: http://www.aol.com/
  • why redirect from http://aol.com/ to
    http://www.aol.com/?
    • set cookies on www domain: non-issue
    • cache resources once regardless of which URL is
      used
      • http://aol.com/logo.gif
      • http://www.aol.com/logo.gif
13
use 1: www in the top 10
  • which top 10 sites redirect from blah.com to
    www.blah.com?

site             status  Expires
aol.com          301     30 mins
ebay.com         301
facebook.com     301     July 1997
google.com       301     30 days
search.live.com  n/a
msn.com          301
myspace.com      301
wikipedia.org    200
yahoo.com        301
youtube.com      303     Apr 1971
  • wikipedia.org, 200: how is this possible?!
  • youtube.com, 303 See Other: "MUST NOT be cached"
14
use 1: www, Wikipedia
  • all resources referenced via full URLs
  • easy, if you're doing
    • CDN
    • domain sharding
    • cookieless domain
  • another alternative
    • <base href="http://www.wikipedia.org">

15
use 2: missing trailing slash
Request
GET /msn HTTP/1.1
Host: astrocenter.astrology.msn.com
Response
HTTP/1.x 301 Moved Permanently
Location: http://astrocenter.astrology.msn.com/msn/
  • reasons to redirect for missing trailing slash
    • autoindexing
      • workaround: don't use autoindexing
    • relative URLs for resources
      • workaround: use base href, full URLs, or URLs
        relative to root
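
The relative-URL problem is easy to see with the standard `URL` constructor, which resolves a reference against a base the same way a browser does. In this sketch `logo.gif` is a made-up resource name; only the host and the `/msn` path come from the example above.

```javascript
const base = "http://astrocenter.astrology.msn.com";

// Without the trailing slash, "msn" is treated as a file name and dropped.
const withoutSlash = new URL("logo.gif", `${base}/msn`);

// With the trailing slash, "msn/" is a directory and the resource lands in it.
const withSlash = new URL("logo.gif", `${base}/msn/`);

// A root-relative URL resolves the same way regardless of the trailing slash,
// which is why it works as a replacement for the redirect.
const rootRelative = new URL("/msn/logo.gif", `${base}/msn`);

console.log(withoutSlash.href);
console.log(withSlash.href);
console.log(rootRelative.href);
```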

16
use 3: internal tracking
Request
GET /_yltAlume/http%3A//tools.search.yahoo.com/about/forsearchers.html HTTP/1.1
Host: m.www.yahoo.com
Response
HTTP/1.x 302 Moved Temporarily
Location: http://tools.search.yahoo.com/about/forsearchers.html
  • "More" link on Yahoo! front page
  • workaround: track referer [sic] on internal
    servers
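
The referer workaround amounts to counting, on the destination servers, how many requests arrived from a given page. A minimal sketch, assuming access-log entries have already been parsed into objects with `referer` and `path` fields (both field names and `countInternalClicks` are hypothetical):

```javascript
// Hypothetical helper: count requests per path whose Referer header is the
// given page, instead of routing those clicks through a tracking redirect.
function countInternalClicks(entries, fromPage) {
  const counts = new Map();
  for (const { referer, path } of entries) {
    if (referer !== fromPage) continue; // only clicks that came from fromPage
    counts.set(path, (counts.get(path) || 0) + 1);
  }
  return counts;
}

// Made-up log entries for illustration.
const sampleLog = [
  { referer: "http://www.yahoo.com/", path: "/about/forsearchers.html" },
  { referer: "http://www.yahoo.com/", path: "/about/forsearchers.html" },
  { referer: "http://example.com/", path: "/about/forsearchers.html" },
];

console.log(countInternalClicks(sampleLog, "http://www.yahoo.com/"));
```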

17
use 4: outbound tracking
Request
GET /url?url=http%3A%2F%2Fwww.npr.org/ HTTP/1.1
Host: www.google.com
Response
HTTP/1.x 302 Found
Location: http://www.npr.org/
  • clicking on a Google search result
  • workarounds
    • image beacon: race conditions
    • XMLHttpRequest readyState 2: faster, more
      complex
    • HTML 5
      • <a ping="http://...">
      • <link rel=pingback href="http://...">
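
The image beacon workaround can be sketched as follows. The endpoint `http://example.com/click` and both function names are placeholders invented for this note; the point is only that the click is logged by requesting a tiny image while the link itself goes straight to the destination.

```javascript
// Placeholder logging endpoint; not a real tracking URL.
const BEACON_ENDPOINT = "http://example.com/click";

// Build the beacon URL, percent-encoding the destination as a query value.
function beaconUrl(destination) {
  return `${BEACON_ENDPOINT}?url=${encodeURIComponent(destination)}`;
}

// Browser-only part: fire the beacon and let the link navigate normally.
// The image request may be cut off when the page unloads, which is the
// race condition the slide refers to.
function trackOutboundClick(destination) {
  if (typeof Image !== "undefined") {
    new Image().src = beaconUrl(destination);
  }
}

console.log(beaconUrl("http://www.npr.org/"));
```

In a page, `trackOutboundClick` would typically be called from a click or mousedown handler on the outbound link.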

18
use 5: prettier URLs
Request
GET / HTTP/1.1
Host: music.myspace.com
Response
HTTP/1.x 302 Moved
Location: http://profile.myspace.com/index.cfm?fuseaction=music
  • prettier URLs are easier to remember
  • also, preserve old URLs when code changes
  • workaround: mod_rewrite, cacheable redirects

19
use 6: connecting sites
  • http://toolbar.google.com/
  • http://toolbar.google.com/index.html
  • http://toolbar.google.com/T5/
  • http://toolbar.google.com/T5/intl/en/index.html
  • http://www.google.com/tools/firefox/toolbar/FT3/intl/en/index.html
  • redirects are an easy way to "integrate" separate
    teams (T4, T5), separate code bases (IE, FF),
    separate servers (toolbar, www)
  • workarounds: CNAMEs, mod_rewrite

20
use 7: ads
  • specifically, counting ad impressions
  • advertisers and publishers have a hard time
    reconciling the count
  • when do you count an ad impression?
  • when a page containing an ad is served?
  • what if the page never arrives?
  • when a page containing an ad arrives at the
    client?
  • what if the ad request fails, or the user stops
    the page?
  • when the content of the ad (image, Flash) is
    requested from the advertiser?
  • what if the user leaves the page before the
    content arrives?
  • after the content arrives?
  • is it the publisher's fault if the content is
    slow?

21
use 7: ads
  • how do you count an ad impression?
  • when a page containing an ad is served?
  • count it on the publisher's backend
  • when a page containing an ad arrives at the
    client?
  • send a beacon from the client
  • when the content of the ad (image, Flash) is
    requested from the advertiser?
  • count it on the advertiser's backend
  • after the content arrives?
  • send a beacon from the client
  • redirects can help count when content is served
    and reconcile the two parties

22
use 7: ads
  • http://ad.doubleclick.net/im51cso3fhttp://ad.doubleclick.net/dot.gif?258824979
  • http://ad.doubleclick.net/dot.gif?258824979
  • http://eatps.web.aol.com:9000/open_web_adhoc?subtype=40458
  • http://www.aolcdn.com/pops_promo/pixel
  • http://ad.doubleclick.net/ad/N553.AEAOLService/B2775919.11;dcadv=1297440;sz=1x1;ord=4613012?
  • http://m1.2mdn.net/viewad/1297440/1x1.gif
  • from http://www.aol.com/
  • double logging?

23
use 7: ads
  • http://ads.cnn.com/event.ng/TypecountClieARd
  • http://i.cdn.turner.com/cnn/images/1.gif
  • http://ad.doubleclick.net/ad/N3880.SD146.3880/B3107454.25;dcove=o;sz=1x1;ord=dwgksue,beqpWcytARh?
  • http://m1.2mdn.net/viewad/1139835/67-1x1.gif
  • from http://www.cnn.com/
  • double logging?

24
use 8: authentication
  • cookies are used for authentication
  • cookies can only be set on the page's domain
  • how do you authenticate someone on domain A if
    they're currently on domain B?
    • redirects
  • authentication is often on https servers
  • how do you authenticate someone on https if
    they're currently on http?
    • redirects

25
use 8: authentication
  • https://www.google.com/accounts/ServiceLoginBoxAuth
  • https://www.google.com/accounts/CheckCookie?continue=http%3A%2F%2Fgroups.google.com%2Fgroups%2Fauth%3F_done
  • http://groups.google.com/groups/auth?_done
  • http://groups.google.com/groups/auth?_done
  • http://groups.google.com/
  • one reason why redirects with Set-Cookie are
    sometimes not cached

26
avoid redirects
  • eliminate the need
    • base href or full URLs for resources
    • referer tracking
    • HTML 5 A ping and LINK pingback
    • CNAMEs
    • mod_rewrite
    • no autoindex
  • make them cacheable
    • 301 with future Expires
    • JavaScript or meta refresh with future Expires

27
Homework
  • study for midterm 10/31 3:15-4:05pm
  • 11/7 11:59pm: rules 4-10 applied to your
    "Improving a Top Site" class project

28
Questions
  • What's the status text for 301 and 302?
  • What HTTP response header contains the URL the
    user is redirected to?
  • Why are redirects worse than stylesheets and
    scripts in terms of blocking?
  • If a redirect is "Moved Permanently", does that
    mean it's cached?
  • Which browsers today cache redirects?
  • What are two other techniques for doing
    redirects? How do they compare to the 301/302
    status approach?