
Save a wire, make and use cacheable RSS

This document is a short plea to make RSS an even more useful and usable contribution to the Web.

RSS (RDF Site Summary) is a widespread technology used to syndicate dated information, especially news feeds; as such, it has huge potential for content providers as a very effective tool to propagate and advertise their content, and for users as a way to follow the changes in many Web sites at once.

One flaw, though, quite frequently affects RSS feeds, and dynamically generated content more generally: they don’t send the HTTP headers (such as Last-Modified) that save bandwidth through caching.

This particular flaw creates undue network traffic that hurts both users and content providers; it is particularly noticeable with RSS, since most RSS clients send a fair number of requests per day for a given feed, even though the content of most RSS feeds doesn’t change during that time.

Besides the bandwidth consumption, this also means that RSS clients have to spend more CPU time checking whether the content of a non-cacheable RSS feed has changed.

Thus, if you maintain an RSS feed, make sure that it is HTTP-cacheable; if it is not, follow the very complete guide to making a Web site work with Web caches.
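For a feed generated by a script, the idea can be sketched as a small CGI function in shell. This is only an illustration under stated assumptions, not a complete implementation: the function name serve_feed is made up for the example, GNU date is assumed for its "-r FILE" option, and the validator is compared by exact string match, whereas a real server would also accept any later date.

```shell
#!/bin/sh
# Minimal CGI sketch: serve a static RSS file with a Last-Modified validator.
# Assumptions: GNU date (for "-r FILE"); the web server passes the request's
# If-Modified-Since header in HTTP_IF_MODIFIED_SINCE, as CGI specifies.
# serve_feed is a hypothetical name chosen for this example.

serve_feed() {
    feed=$1
    # HTTP date of the file's last modification, in the
    # "Day, DD Mon YYYY HH:MM:SS GMT" format
    lm=$(LC_ALL=C date -u -r "$feed" '+%a, %d %b %Y %H:%M:%S GMT')

    if [ "${HTTP_IF_MODIFIED_SINCE:-}" = "$lm" ]; then
        # The client's copy is still current: send headers only, no body.
        printf 'Status: 304 Not Modified\r\n'
        printf 'Last-Modified: %s\r\n\r\n' "$lm"
    else
        # First request, or the feed has changed: send the full feed.
        printf 'Content-Type: application/rss+xml\r\n'
        printf 'Last-Modified: %s\r\n\r\n' "$lm"
        cat "$feed"
    fi
}
```

A CGI wrapper would end with a call such as serve_feed /var/www/feeds/news.rss, where the path is a hypothetical one. The point of the design is that an unchanged feed costs only a few header lines per request instead of the whole document.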

As a user, check that the RSS feeds you've subscribed to correctly use the HTTP caching properties, and if not, contact the Webmasters of the sites to have them change their configuration.

As a small bonus, here is a shell script to check quickly whether a Web site is cacheable; it comes in handy especially when checking a list of sites in a loop.

 # takes an HTTP URI as parameter
 # and checks whether it supports the Last-Modified/If-Modified-Since validation
lm=$(HEAD "$1" | grep -i '^Last-Modified:' | cut -d : -f 2-)
echo "Checking with validator set to $lm" >&2
 # a 304 (Not Modified) reply to the conditional request means validation works
if [ "$(HEAD -H "If-Modified-Since: $lm" "$1" | head -1 | cut -d " " -f 1)" = "304" ]
        then echo "$1 has valid HTTP caching"
        else echo "$1 does NOT support HTTP caching"
fi

This script uses the HEAD tool, provided by the Debian package libwww-perl.
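
The loop mentioned above could look like the following sketch. It assumes the script has been saved under a name such as check-cache.sh and that the URIs sit one per line in a file such as feeds.txt; both names, as well as the check_all function, are hypothetical and chosen only for the example.

```shell
#!/bin/sh
# check_all CHECKER: run CHECKER once for every URI read from standard input.
# Blank lines and lines starting with "#" are skipped.
# check_all is a hypothetical helper name, not part of any package.

check_all() {
    checker=$1
    while IFS= read -r uri; do
        case $uri in
            ''|'#'*) continue ;;
        esac
        # </dev/null keeps the checker from consuming the rest of the list
        "$checker" "$uri" </dev/null
    done
}
```

It would then be run as check_all ./check-cache.sh < feeds.txt, printing one verdict per feed.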