Tuesday, March 13, 2012

Web Cache Issue

Brief notes on what I took away from reading some material on Web caching,
so I can get back up to speed quickly later.

Why use a cache

To reduce latency
To reduce network traffic

Types of Web caches
Browser Caches:
Proxy Caches:
Gateway Caches: e.g., content delivery networks (CDNs)

Controlling caches from your Web pages


Add HTML meta tags to the head of the page
PRAGMA HTTP HEADERS:
<meta http-equiv="Pragma" content="no-cache">



The HTTP specification does not set any guidelines for Pragma response headers,
so not every browser honors the Pragma header.

EXPIRES HTTP HEADER:
<meta http-equiv="Expires" content="0">
Because Expires names a specific expiry time, it comes with some limitations.

First, because there’s a date involved, the clocks on the Web server and the cache must be synchronised; if they have a different idea of the time, the intended results won’t be achieved, and caches might wrongly consider stale content as fresh.
Another problem with Expires is that it’s easy to forget that you’ve set some content to expire at a particular time. If you don’t update an Expires time before it passes, each and every request will go back to your Web server, increasing load and latency.
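
To make the "absolute date" point concrete, here is a minimal sketch of how a server might compute an Expires value. The helper name expires_header and its parameters are made up for illustration; format_date_time comes from Python's standard wsgiref module.

```python
import time
from wsgiref.handlers import format_date_time


def expires_header(seconds_from_now, now=None):
    """Return an Expires header value `seconds_from_now` seconds after `now`.

    `now` is a Unix timestamp; it defaults to the server's current clock,
    which is exactly why server and cache clocks need to agree.
    """
    if now is None:
        now = time.time()
    # format_date_time renders a Unix timestamp as an HTTP date in GMT.
    return format_date_time(now + seconds_from_now)


# One hour after the Unix epoch, to keep the example deterministic.
print("Expires:", expires_header(3600, now=0))
```

Because the result is a fixed point in time, it has to be regenerated on each response (or updated regularly); a hard-coded date eventually passes.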


CACHE-CONTROL HTTP HEADERS:
<meta http-equiv="Cache-Control" content="no-cache">

When both the Cache-Control and Expires headers are present, Cache-Control takes precedence.
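
A rough sketch of that precedence rule from the cache's side: look for max-age first and fall back to Expires only when it is absent. The function name freshness_lifetime and the plain-dict headers are assumptions for illustration, not how any particular cache is implemented.

```python
import re
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return the freshness lifetime in seconds, or None if none is given."""
    cache_control = headers.get("Cache-Control", "")
    # max-age wins whenever it is present, regardless of Expires.
    match = re.search(r"\bmax-age=(\d+)", cache_control, re.IGNORECASE)
    if match:
        return int(match.group(1))
    if "Expires" in headers and "Date" in headers:
        try:
            expires = parsedate_to_datetime(headers["Expires"])
            date = parsedate_to_datetime(headers["Date"])
        except (TypeError, ValueError):
            # An unparseable value such as "0" is treated as already expired.
            return 0
        return max(0, int((expires - date).total_seconds()))
    return None


response = {
    "Date": "Tue, 24 Feb 2009 08:01:04 GMT",
    "Expires": "Tue, 24 Feb 2009 08:11:04 GMT",
    "Cache-Control": "max-age=3600",
}
print(freshness_lifetime(response))
```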

Useful Cache-Control response headers include:

  • max-age=[seconds] — specifies the maximum amount of time that a representation will be considered fresh. Like Expires it sets a freshness limit, but it is relative to the time of the request rather than absolute. [seconds] is the number of seconds from the time of the request for which you want the representation to stay fresh.
  • s-maxage=[seconds] — similar to max-age, except that it only applies to shared (e.g., proxy) caches.
  • public — marks authenticated responses as cacheable; normally, if HTTP authentication is required, responses are automatically private.
  • private — allows caches that are specific to one user (e.g., in a browser) to store the response; shared caches (e.g., in a proxy) may not.
  • no-cache — forces caches to submit the request to the origin server for validation before releasing a cached copy, every time. This is useful to assure that authentication is respected (in combination with public), or to maintain rigid freshness, without sacrificing all of the benefits of caching.
  • no-store — instructs caches not to keep a copy of the representation under any conditions.
  • must-revalidate — tells caches that they must obey any freshness information you give them about a representation. HTTP allows caches to serve stale representations under special conditions; by specifying this directive, you’re telling the cache that you want it to strictly follow your rules.
  • proxy-revalidate — similar to must-revalidate, except that it only applies to proxy caches.
For example:
Cache-Control: max-age=3600, must-revalidate
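
To see how a cache might read a header like the one above, here is a simplified sketch. The function names are hypothetical, and the parser splits on commas only, so it assumes no directive carries a quoted argument that itself contains a comma.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into a dict of directives."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        name = name.strip().lower()
        if sep:
            arg = arg.strip().strip('"')
            directives[name] = int(arg) if arg.isdigit() else arg
        else:
            # Directives without an argument are stored as flags.
            directives[name] = True
    return directives


def may_store(directives, shared):
    """Decide whether a cache may keep a copy; `shared` is True for proxies."""
    if "no-store" in directives:
        return False
    if shared and "private" in directives:
        return False
    return True


def max_age_for(directives, shared):
    """Pick the lifetime to use; shared caches prefer s-maxage when present."""
    if shared and "s-maxage" in directives:
        return directives["s-maxage"]
    return directives.get("max-age")


parsed = parse_cache_control("max-age=3600, must-revalidate")
print(parsed, may_store(parsed, shared=True), max_age_for(parsed, shared=True))
```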


VALIDATORS AND VALIDATION
Validation is used by servers and caches to communicate when a representation has changed. By using it, caches avoid having to download the entire representation when they already have a copy locally, but they’re not sure if it’s still fresh.



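  1. The common validators are the Last-Modified header and the ETag.
  2. In general, Web servers generate both headers automatically for static resources (JavaScript files, images, other files) to serve as validators. (Tip: ETag validation is also becoming prevalent.)
  3. Dynamic pages do not produce a Last-Modified header by default.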

Last-Modified HEADER
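When the browser first requests a URL, the server responds with status 200 and the requested resource, along with a Last-Modified response header that records when the file was last modified on the server.

It looks like this:
Last-Modified: Tue, 24 Feb 2009 08:01:04 GMT

When the client requests the same URL again, the browser sends an If-Modified-Since request header, as the HTTP protocol provides, to ask whether the file has been modified since that time:
If-Modified-Since: Tue, 24 Feb 2009 08:01:04 GMT
If the resource on the server has not changed, the server returns HTTP 304 (Not Modified) with an empty body, which saves the data transfer. If the server-side content has changed, or the server has been restarted, the server sends the resource again, just as it did for the first request. This way the server avoids resending a resource the client already has, while the client still gets the latest version whenever the server's copy changes.
Note: if the If-Modified-Since time is later than the server's current time (the time of the request), the request is treated as invalid.

Last-Modified saves some bandwidth, but it still costs an HTTP request, so it should be used together with Expires.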
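
The exchange above can be sketched from the server's side roughly as follows. The function conditional_get and its arguments are hypothetical, and this sketch simply ignores an If-Modified-Since date that is invalid or in the future and answers with a full 200.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(request_headers, last_modified, now=None):
    """Return (status, response_headers) for a GET on a resource.

    `last_modified` and `now` are timezone-aware UTC datetimes.
    """
    now = now or datetime.now(timezone.utc)
    # HTTP dates have one-second resolution, so drop sub-second precision.
    last_modified = last_modified.replace(microsecond=0)
    response_headers = {
        "Last-Modified": format_datetime(last_modified, usegmt=True)
    }
    since = None
    raw = request_headers.get("If-Modified-Since")
    if raw:
        try:
            since = parsedate_to_datetime(raw)
        except (TypeError, ValueError):
            since = None  # unparseable date: behave as if it was not sent
    if since is not None and since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # A date later than the server's clock is not trusted.
    if since is not None and since <= now and last_modified <= since:
        return 304, response_headers
    return 200, response_headers


modified = datetime(2009, 2, 24, 8, 1, 4, tzinfo=timezone.utc)
request = {"If-Modified-Since": "Tue, 24 Feb 2009 08:01:04 GMT"}
print(conditional_get(request, modified))
```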

ETag HEADER

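The HTTP specification defines an ETag as "the entity tag for the requested variant".
Put simply, the server tags the requested URL when it responds and sends that tag to the client in an HTTP response header.
The header returned by the server looks like this:
ETag: "5d8c72a5edda8d6a:3239"

The client's query for updates looks like this:
If-None-Match: "5d8c72a5edda8d6a:3239"
If the ETag has not changed, the server returns status 304.
That is, after the client's first request, the HTTP response header contains ETag: "5d8c72a5edda8d6a:3239",
which tells the client that the resource it received is identified by 5d8c72a5edda8d6a:3239.
The next time the client requests the same URI, the browser also sends an If-None-Match request header carrying the ETag it received on the previous visit:
If-None-Match: "5d8c72a5edda8d6a:3239"
In effect the client caches two things, the resource and its ETag, and the server compares the client's ETag with its own current one.
If the If-None-Match condition is false (the ETags match), the server returns a 304 (Not Modified) response instead of 200.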
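
A small sketch of the ETag round trip. Deriving the tag from a SHA-1 hash of the body is an assumption made for illustration (servers are free to build ETags however they like), and make_etag and respond are hypothetical names.

```python
import hashlib


def make_etag(body):
    """Build a quoted ETag from the response body (one possible scheme)."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def respond(request_headers, body):
    """Return (status, headers, body), honoring If-None-Match."""
    etag = make_etag(body)
    raw = request_headers.get("If-None-Match", "")
    candidates = [tag.strip() for tag in raw.split(",") if tag.strip()]
    # Ignore a weak-validator prefix (W/) when comparing tags.
    candidates = [t[2:] if t.startswith("W/") else t for t in candidates]
    if "*" in candidates or etag in candidates:
        # The client's copy is current: send headers only, no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


status, headers, _ = respond({}, b"hello")
print(status, headers["ETag"])
print(respond({"If-None-Match": headers["ETag"]}, b"hello")[0])
```

The first call has no If-None-Match and gets the full body; the second sends the tag back and gets an empty 304.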
The most common way to clear the cache for a JavaScript file is to append a random number to the script's URL.
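
A quick sketch of the random-number trick, written as a helper that could run wherever the page is generated. The function name bust_cache and the "_" parameter name are arbitrary choices for illustration.

```python
import random
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def bust_cache(url, param="_", rng=random):
    """Append a random query parameter so caches see a URL they have not stored."""
    parts = urlsplit(url)
    # Keep any existing query parameters and add the random one at the end.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, str(rng.randrange(10**9))))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(bust_cache("/js/app.js?v=2"))
```

Note that a fresh random value on every page load means the script is never served from cache; a version number or content hash in the same position changes only when the file does.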
 
Reference:
Caching Tutorial
JSP Servlet Filter-1
app engine 自建"Last-Modified" 的快取控制
Etag和Expires原來和設定 [轉貼 2010-03-24 13:25:52]
