CS193H: High Performance Web Sites Lecture 16: Rule 13


1
CS193H High Performance Web Sites Lecture 16
Rule 13: Configure ETags
  • Steve Souders
  • Google
  • souders_at_cs.stanford.edu

2
announcements
  • 11/17 guest lecturer Robert Johnson
    (Facebook), "Fast Data at Massive Scale - lessons
    learned at Facebook"

3
Expires
GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 () Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip

XmoÛHþ\ÿFÖvãwØoq...

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip
Expires: Fri, 26 Sep 2008 22:00:00 GMT

XmoÛHþ\ÿFÖvãwØoq...
  • expiration date determines freshness
  • can also use Cache-Control: max-age
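The freshness check described on this slide can be sketched in a few lines of Python. This is only an illustration, not how any particular browser implements its cache: the function name `is_fresh` and the two sample clock values are assumptions made for the example, while the Expires value is the one from the response above.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh(expires_header: str, now: datetime) -> bool:
    """Return True while the cached copy can be used without asking the server."""
    expires = parsedate_to_datetime(expires_header)  # timezone-aware for "GMT" dates
    return now < expires


expires = "Fri, 26 Sep 2008 22:00:00 GMT"
# Hypothetical clock readings, one before and one after the expiration date.
before = datetime(2008, 9, 24, 22, 0, 0, tzinfo=timezone.utc)
after = datetime(2008, 9, 27, 0, 0, 0, tzinfo=timezone.utc)

print(is_fresh(expires, before))  # True
print(is_fresh(expires, after))   # False
```

Once the copy is no longer fresh, the browser has to validate it with the server, which is what the conditional GET slides cover.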

4
Conditional GET (IMS)
sometime after 3pm PT 9/24/08
GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 () Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate
If-Modified-Since: Mon, 22 Sep 2008 21:14:35 GMT

HTTP/1.1 304 Not Modified
  • IMS determines validity: does the browser's
    cached version match what's on the server?
  • the comparison is based on the resource's date
  • a 304 response is sent instead of all the data
  • IMS is used when Reload is pressed
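As a rough sketch of the date comparison above, the server side of an If-Modified-Since check might look like the following. The function name `ims_status` is hypothetical, and real servers apply more rules than this (for example, handling malformed dates); the Last-Modified value is the one from the example response.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def ims_status(if_modified_since: Optional[str], last_modified: str) -> int:
    """Return 304 if the client's cached copy is still valid, otherwise 200."""
    if if_modified_since is None:
        return 200  # unconditional GET: send the full body
    client_date = parsedate_to_datetime(if_modified_since)
    server_date = parsedate_to_datetime(last_modified)
    # The copy is valid if the resource was not modified after the client's date.
    return 304 if server_date <= client_date else 200


last_modified = "Mon, 22 Sep 2008 21:14:35 GMT"
print(ims_status("Mon, 22 Sep 2008 21:14:35 GMT", last_modified))  # 304
print(ims_status("Sun, 21 Sep 2008 00:00:00 GMT", last_modified))  # 200
print(ims_status(None, last_modified))                              # 200
```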

5
ETag Response Header
GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 () Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip

XmoÛHþ\ÿFÖvãwØoq...

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip
Expires: Fri, 26 Sep 2008 22:00:00 GMT
ETag: "19f1e-7920-4525b037f0440"

XmoÛHþ\ÿFÖvãwØoq...

6
Conditional GET (INM)
sometime after 3pm PT 9/24/08
GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 () Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate
If-Modified-Since: Mon, 22 Sep 2008 21:14:35 GMT
If-None-Match: "19f1e-7920-4525b037f0440"

HTTP/1.1 304 Not Modified
  • alternative way to test validity

7
What is an ETag
  • http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.11
  • added in HTTP/1.1
  • used by clients and servers to validate expired
    resources
  • more flexible than Last-Modified date
  • "An entity tag consists of an opaque quoted
    string"
  • " An entity tag MUST be unique across all
    versions of all entities associated with a
    particular resource."

8
If-None-Match (hit)
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.26
  • "If any of the entity tags match the entity tag
    of the entity that would have been returned in
    the response to a similar GET request (without
    the If-None-Match header) on that resource,
    then the server MUST NOT perform the requested
    method, unless required to do so because the
    resource's modification date fails to match that
    supplied in an If-Modified-Since header field in
    the request. Instead, if the request method was
    GET or HEAD, the server SHOULD respond with a 304
    (Not Modified) response,"

9
INM, IMS: hit, miss
                      If-Modified-Since
                      hit            miss
If-None-Match  hit    304            full response
               miss

10
If-None-Match (miss)
  • If none of the entity tags match, then the server
    MAY perform the requested method as if the
    If-None-Match header field did not exist, but
    MUST also ignore any If-Modified-Since header
    field(s) in the request. That is, if no entity
    tags match, then the server MUST NOT return a 304
    (Not Modified) response.

11
INM, IMS: hit, miss
                      If-Modified-Since
                      hit            miss
If-None-Match  hit    304            full response
               miss   full response  full response
If not managed properly, sending both IMS and INM lowers the chances of a
simple, small 304 response.
How could it not be managed properly?!
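The hit/miss table can be written out as a small decision function. This is a sketch of the rules quoted from RFC 2616 on the preceding slides, not code from any server; the name `validation_status` and the use of `None` to mean "header not sent" are assumptions made for illustration.

```python
from typing import Optional


def validation_status(inm_hit: Optional[bool], ims_hit: Optional[bool]) -> int:
    """Return 304 or 200 given whether each validator matched.

    True means hit, False means miss, None means the header was not sent.
    """
    if inm_hit is not None:
        if not inm_hit:
            return 200  # INM miss: If-Modified-Since must be ignored
        if ims_hit is False:
            return 200  # ETag matches but the modification date does not
        return 304
    # No If-None-Match header: fall back to the date comparison alone.
    return 304 if ims_hit else 200


for inm in (True, False):
    for ims in (True, False):
        print(f"INM hit={inm}, IMS hit={ims}: {validation_status(inm, ims)}")
```

Only the hit/hit combination yields a 304 when both headers are sent, which is why an ETag that misses needlessly costs a full response.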
12
Apache ETags
  • "19f1e-7920-4525b037f0440"
  • "inode-size-timestamp"
  • inode: used by filesystems to store file type,
    owner, group, permissions, etc.
  • the inode for the same file differs across servers
    even if the file size, timestamp, and directory are
    the same
  • http://stevesouders.com/images/arrow-right-9x13.png
  • ETag: "21f5315-d4-5d51f0c0"
  • http://1.cuzillion.com/images/arrow-right-9x13.png
  • ETag: "1ee57ec-d4-5d51f0c0"
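To see which component differs between the two ETags above, they can be split into their three fields. The sketch below assumes the default "inode-size-timestamp" layout with each field written in hexadecimal; the names `ApacheETag` and `parse_apache_etag` are hypothetical helpers for this example.

```python
from typing import NamedTuple


class ApacheETag(NamedTuple):
    inode: int
    size: int
    timestamp: int


def parse_apache_etag(etag: str) -> ApacheETag:
    """Split a default-format Apache ETag into its three hexadecimal fields."""
    inode, size, timestamp = (int(part, 16) for part in etag.strip('"').split("-"))
    return ApacheETag(inode, size, timestamp)


a = parse_apache_etag('"21f5315-d4-5d51f0c0"')  # stevesouders.com
b = parse_apache_etag('"1ee57ec-d4-5d51f0c0"')  # 1.cuzillion.com

# Size and timestamp agree; only the inode differs.
print(a.size == b.size, a.timestamp == b.timestamp, a.inode == b.inode)  # True True False
```

The same image served from two hosts is byte-for-byte identical, yet the inode field alone makes the ETags differ.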

13
IIS ETags
  • "b4f35327edac51:113f"
  • "timestamp:changenumber"
  • changenumber: counter to track IIS configuration
    changes
  • changenumber is rarely the same across servers
  • http://hp.msn.com/global/c/hpv10/favicon.ico
  • ETag: "b4f35327edac51:113f"
  • ETag: "b4f35327edac51:e6e"

14
example ETag miss
  • GET /global/c/hpv10/favicon.ico HTTP/1.1
  • Host: hp.msn.com
  • If-Modified-Since: Wed, 26 Oct 2005 22:39:58 GMT
  • If-None-Match: "b4f35327edac51:19bc"
  • HTTP/1.x 200 OK
  • Content-Length: 1406
  • Etag: "b4f35327edac51:d76"
  • Last-Modified: Wed, 26 Oct 2005 22:39:58 GMT
  • Expires: Wed, 06 Feb 2008 01:10:16 GMT

Last-Modified matches (but IMS misses)
timestamp is the same
changenumber differs, validation misses, entire
body is resent
validation miss

15
the problem with ETags
  • the default ETag syntax in Apache and IIS makes
    it unlikely that INM will match across servers,
    even when the resource is the same
  • probability of an incorrect INM miss:
    (n-1)/n, where "n" is the number of servers
  • not an issue if you just have one server
  • http://www.apacheweek.com/issues/02-01-18
  • "can cause an unnecessary performance hit as
    resources are fetched more often than is
    required"
  • http://support.microsoft.com/kb/922703
  • "IIS 6.0 sends a 200 response because it
    considers the different change numbers to mean
    that the resources are not the same versions"
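The (n-1)/n figure can be checked with a short calculation. The sketch assumes each request is routed to one of n servers with equal probability and that every server produces a different ETag for the same file, so validation only hits when the request returns to the server that issued the cached ETag; the function name `inm_miss_probability` is made up for this example.

```python
from fractions import Fraction


def inm_miss_probability(n: int) -> Fraction:
    """Chance that a revalidation lands on a server with a different ETag."""
    if n < 1:
        raise ValueError("need at least one server")
    # Only 1 of the n servers issued the ETag the browser is holding.
    return Fraction(n - 1, n)


for servers in (1, 2, 10):
    print(servers, inm_miss_probability(servers))
# 1 0
# 2 1/2
# 10 9/10
```

With a single server the probability is zero, which matches the note that the problem does not arise there.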

16
the solution for ETags
  • if you're not leveraging ETags, turn them off
  • reduces size of requests and responses
  • reduces outbound traffic from your servers
  • increases proxy cache hit rate
  • Apache: FileETag none
  • IIS: synchronize changenumber across servers
    (http://support.microsoft.com/kb/922703/)

17
ETags in the wild
site                      server            ETags?     default syntax?
www.aol.com               AOLserver         no
www.ebay.com              IIS               yes        yes
www.facebook.com          Apache            no
www.google.com/search     gws               no
search.live.com/results   ASP.NET           yes        no
www.msn.com               IIS               no
www.myspace.com           Apache            some       no
en.wikipedia.org/wiki     Apache, lighttpd  some, yes  no, ?
www.yahoo.com             YTS               no
www.youtube.com           btfe              no

18
possible uses for ETags
  • ???

19
Homework
  • 11/7 11:59pm: rules 4-10 applied to your
    "Improving a Top Site" class project
  • 11/12 3:15pm: Web 100 Double Check
  • look at your rows in Web 100 spreadsheet
  • double-check your entries for any rows in red
  • update incorrect entries
  • enter "y" in "Double Checked" column
  • read HPWS Chapter 14

20
Questions
  • Why were ETags introduced in HTTP/1.1?
  • What do "IMS" and "INM" stand for?
  • How do IMS and INM interplay during resource
    validation?
  • What's the default syntax for ETags in Apache and
    IIS?
  • What component in each default syntax hurts
    performance, and why?
  • What are three performance gains you can achieve
    by turning off ETags?