Leverage Browser Caching – Website Speed Optimization
Leverage Browser Caching, as described in the Google Developers documentation on website speed optimization, is a topic of major debate among web development professionals. What we know for sure is that Google's search indexing algorithms penalize speed issues on a website. Consequently, it is essential for website developers to harness effective speed optimization techniques. The first part of website speed optimization involves the details that allow you to leverage browser caching for the essential parts of a website. Website speed optimization is an essential part of technical SEO.
What is Browser Caching? And how do you “Leverage” it?
Browser caching is the storage of a webpage's most frequently used resource files on the user's computer after the user visits the page.
“Leveraging” browser caching means instructing browsers how those resources should be handled in the user's local cache.
The most efficient server communication system is one that avoids downloading the same resource time and time again. To achieve this, a webmaster can define a caching policy for the client browser's cache using the Cache-Control HTTP header in server responses. This can eliminate network latency and excessive client-side data charges caused by downloading redundant web resources.
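As a sketch, a server response carrying such a caching policy might look like the following (the header values and resource are illustrative, not taken from any real site):

```http
HTTP/1.1 200 OK
Content-Type: text/css
Content-Length: 5120
Cache-Control: max-age=604800, public
ETag: "5f2a8c1d"
```

With these headers, the browser may reuse its stored copy of the stylesheet for up to seven days (604800 seconds) without contacting the server again.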
The Rationale Behind Why We “Leverage Browser Caching”
When a user arrives on a webpage for the first time, their browser has to load a lot of recurring components such as the logo, CSS files, JS files, and miscellaneous resources.
Note: When adding JS, remember to keep all essential resources in one file, as discussed in AMP (Accelerated Mobile Pages).
Without caching, these resources would need to be loaded again and again on every visit, resulting in high loading times.
To avoid this repeated fetching, browsers are equipped with an HTTP cache.
We can utilize this caching facility to reduce a website's loading time by providing accurate HTTP header directives with each server response. These directives leverage browser caching by telling the browser how to manage a resource in its cache and when the cached copy expires.
The Types of Caching Directives
No-cache and No-store
No-cache indicates that the returned response cannot be used to satisfy a subsequent request to the same URL without first verifying whether the response has changed. Consequently, no-cache incurs a round trip to the server (using an ETag, if present) to verify whether a new download is required. No-store works on a much simpler idea: it simply disallows the browser and all intermediate caches from storing any version of the returned response. This can be useful for one-time requests that contain sensitive data, such as financial transactions.
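For example, a sensitive endpoint can be excluded from caching entirely. A minimal .htaccess sketch (the checkout.php filename is a hypothetical example, not from the original text):

```apache
# Disallow any caching of responses for a (hypothetical) checkout page
<FilesMatch "checkout\.php$">
    Header set Cache-Control "no-store"
</FilesMatch>
```

This requires Apache's mod_headers module to be enabled.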
Public and Private
If a response is declared public, it can be cached even if it is associated with HTTP authentication or its response status code is not normally cacheable. Public is rarely needed in practice, because explicit caching information (such as max-age) already indicates that the response is cacheable. Private responses, on the other hand, may be cached by the user's browser, but they are intended for a single user and must not be cached by any intermediary cache (e.g. a CDN).
Max-age
This directive specifies the maximum time, in seconds, that the fetched response can be cached and reused.
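Putting the directives together, two alternative policies could be set like this in .htaccess (the values are illustrative, and only one policy would apply to any given resource):

```apache
# Cacheable by the browser and intermediaries (e.g. a CDN) for one hour
Header set Cache-Control "public, max-age=3600"
# Cacheable only in the user's own browser, for ten minutes
Header set Cache-Control "private, max-age=600"
```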
The Problem that was solved with ETags
If a resource is updated before its max-age has been reached, the browser will continue to load the outdated resource from the cache. To counter such occurrences, version numbers or fingerprints are embedded in resource URLs. This way, if a newer version is released, it will not be ignored just because an older version exists in the cache: the new URL forces a fresh download.
Another, more specialized mechanism is the use of ETags as checksum-like validation tokens for cached resources. When a cached page is revisited, the browser sends the stored ETag back to the server in an If-None-Match request header. If the resource has not changed, the two validation tokens, the one in the browser's store and the one the server holds for the current resource, will match, and a 304 status code (Not Modified response) will be dispatched. However, if the ETags do not match, a 200 status code (OK response) with the updated resource will be generated.
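The exchange can be sketched as follows (the resource name and ETag value are illustrative):

```http
GET /styles/main.css HTTP/1.1
If-None-Match: "5f2a8c1d"

HTTP/1.1 304 Not Modified
ETag: "5f2a8c1d"
```

Because the 304 response carries no body, only a few headers travel over the network instead of the full resource.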
The Issue with ETags
However, the problem with ETags is that they are typically constructed using attributes that make them unique to a specific web server. ETags will not match when a browser gets the original component from one server and then tries to validate that component against a different server; this can backfire for web service providers who use multiple servers to handle requests. Apache and IIS both embed server-specific data in the ETag by default. This dramatically increases the odds of checksum mismatches for the same resource fetched from a cluster of servers. The end result is unnecessary re-downloading despite the resource being stored in the cache. In such cases, it is often preferable to not use ETags at all.
This problem can be traced back to the ETag format used by Apache 1.3 and 2.x: inode-size-timestamp. A given file may reside in the same directory across multiple servers and have the same file size, permissions, and timestamp, yet its inode number is specific to the server it is stored on.
The ETag format for IIS 5.0 and 6.0 is Filetimestamp:ChangeNumber. A ChangeNumber is a counter used to track configuration changes to IIS. It is quite unlikely that the ChangeNumber is the same across all IIS servers hosting a large website, so the same ETag mismatch issue is experienced.
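If you follow the advice above and drop ETags on a multi-server Apache setup, a minimal configuration sketch is:

```apache
# Stop Apache from generating inode-based ETags
FileETag None
# Remove any ETag header another module may have set (requires mod_headers)
Header unset ETag
```

With ETags gone, browsers fall back to other validators such as Last-Modified, which do not vary between identically deployed servers.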
Note: The HTML5 Boilerplate project is a potential remedy for cross-server compatibility issues. It contains sample configuration files for all the most widely used servers, with detailed comments on individual configuration flags and settings.
A Cache-Control Directive Explained
A server returns a response with a set of HTTP headers that state the content length, content type, content location, caching directives, ETags, etc. The following is a sample of Cache-Control directives in a website's .htaccess file. Note that each policy has to be scoped to a group of files; otherwise, later Header set lines simply overwrite earlier ones for the entire site.
# BEGIN Cache-Control Headers
# Each policy is scoped to an example group of file types with FilesMatch
<FilesMatch "\.(ico|jpe?g|png|gif)$">
Header set Cache-Control "max-age=2592000, public"
</FilesMatch>
<FilesMatch "\.(css|js)$">
Header set Cache-Control "max-age=604800, public"
</FilesMatch>
<FilesMatch "\.html$">
Header set Cache-Control "max-age=600, private, must-revalidate"
</FilesMatch>
# END Cache-Control Headers
The ability to define max-ages helps tailor caching to an individual website's demands: if a webpage contains frequently changing images, it is better to define shorter max-ages. However, if the images are static and do not change over a long period, then it is preferable to let them be stored in the cache to allow faster page loading.
Note: For the latest and greatest trends in web design technologies and professional practices, visit the series on “Web Design”.
The Conclusion to Leverage Browser Caching
There is no such thing as the perfect cache policy. The best strategy depends on traffic patterns, the type of updates and their frequency, the data freshness that is a prerequisite for effective operation of the site, and the level of caching that yields significant benefits to the operation.
To sum up a long and comprehensive site optimization plan that helps a developer leverage browser caching:
- Properly configure ETag tokens across servers, or disable ETags entirely
- Use URL fingerprinting in your resources to effectively manage and version-control online assets
- Ensure that the right assets (those that are most often used) are allowed to be cached by CDNs and other intermediaries
- Determine the optimal cache age for each resource type that can be cached in a client's browser
- Identify the cache hierarchy (with proper fingerprinting) that allows you to deliver fresh content in conjunction with an effective caching strategy
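As a sketch of the fingerprinting point above: if your build tool emits assets with names like style.a1b2c3d4.css (the eight-hex-digit naming scheme is an assumption for illustration), an Apache rewrite rule can map the fingerprinted URL back to the real file on disk, while any change to the hash in the URL forces browsers to fetch the new version:

```apache
RewriteEngine On
# Map name.<8-hex-digit-hash>.css (or .js) back to name.css / name.js;
# the fingerprint in the URL, not the filename on disk, busts the cache
RewriteRule ^(.+)\.[0-9a-f]{8}\.(css|js)$ $1.$2 [L]
```

This lets you serve fingerprinted resources with a very long max-age, since a content change produces a new URL rather than a stale cache hit.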