Content Negotiation, or Multi-Format Processing, refers to the ability of an HTTP server to use information supplied by a user agent to determine the most appropriate file to send in response to a request.
The Apache documentation defines Content Negotiation as the server's ability to "[...] choose the best representation of a resource based on the browser-supplied preferences for media type, languages, character set and encoding."
Content negotiation is also known as multi-format processing in some web server environments (possibly just those derived from the original CERN base).
Content negotiation is typically used to provide translations of content, though it can be used to negotiate based on any Accept-* header in an HTTP request.
My interest in using content negotiation for CSS and RSS is to serve compressed files to any user agents which can accept that encoding, falling back to uncompressed files for user agents which can't.
The value in compressing these files, specifically RSS (and Atom) files, is to reduce the overall bandwidth being used, especially when confronted by automated agents which retrieve these files repeatedly over the course of the day.
Content negotiation is one means of reducing bandwidth utilization, as well as improving user experience just a teensy bit by minimizing the size of ever-growing CSS files.
As an experiment, I started compressing my RSS and Atom files on my personal site some time during the summer of 2004.
Periodically a script runs on my site which invokes gzip to compress the RSS or Atom file. This deletes the original file (eg: a request for /articles/rss.xml would result in a 404 File Not Found error) and creates a new compressed file (rss.xml.gz).
At this stage, a request for http://epcostello.net/articles/atom.xml would still result in a 404 error.
I needed to turn on content negotiation by adding Options +Multiviews to my .htaccess file for that portion of the site.
Once added, a request for http://epcostello.net/articles/atom.xml causes the server to seek a suitable response.
There is no atom.xml file, but the server finds atom.xml.gz and returns that.
Generally a good thing, except in the scenario where a user agent cannot handle the gzip encoding.
I've done this both with my RSS/Atom feeds and with my CSS style sheet on my personal site. I'm mostly satisfied with the results; however, I've noticed that there are several blog aggregators or readers which don't appear to handle the gzip encoding.
Also, surprisingly, Microsoft Internet Explorer cannot handle a gzip encoded CSS file.
Options +Multiviews is the way to turn on automatic content negotiation under Apache.
An alternative is to use a type-map file.
A type-map file allows you to perform content negotiation on a file-by-file basis, while Options +Multiviews applies to the directory (and I believe sub-directories).
The name of the type-map file is what I'm going to call the external URI of the document you want to negotiate.
Now, one minor problem I initially had was that I didn't want to change the various URLs in my documents.
So, I initially tried to have my type-map file end in .xml; however, that ruled out using .xml as an extension for any of the negotiated files.
So, I've gotten over my resistance to changing the URLs for these files throughout my site in the interest of actually getting this working.
The remainder of this article is a step by step guide to creating and using type-maps for content negotiation of RSS/Atom files.
To use a type-map for content negotiation, you have to associate an extension to the type-map handler. You can do this in your server configuration, or in a .htaccess file.
Note: The name can be configured on a server by server basis, and the ability to add handlers may be restricted on your particular server.
Now, one other thing I decided was to avoid the default type-map extension of .var.
I wanted to use a scheme where I could indicate the type of the file being negotiated in the URI.
So I am using .@extension to indicate both the MIME type of the negotiated file and that a given URL needs to be negotiated using a type-map.
Thus, a negotiated RSS feed rss.xml will have a type-map file named rss.@xml.
A negotiated CSS style file common.css will have a corresponding type-map named common.@css.
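The naming scheme can be sketched in a few lines of Python. This is only an illustration of the convention described above; the helper name typemap_name is hypothetical and nothing like it is needed on the server.

```python
from pathlib import PurePosixPath


def typemap_name(filename: str) -> str:
    """Map a negotiated file name to its type-map name, e.g. rss.xml -> rss.@xml.

    Hypothetical helper: it only illustrates the ".@extension" convention.
    """
    path = PurePosixPath(filename)
    if not path.suffix:
        raise ValueError(f"{filename!r} has no extension to mark with '@'")
    # ".xml" becomes ".@xml": same stem, extension prefixed with "@".
    return str(path.with_suffix(".@" + path.suffix[1:]))
```

For example, typemap_name("feeds/atom.xml") gives "feeds/atom.@xml".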
This resulted in adding the following to my .htaccess file:
AddHandler type-map @xml @txt @html @css
So, any request for a file ending in .@xml, .@txt, .@html or .@css will result in the type-map handler being invoked.
The type-map handler will look for a corresponding type-map file in the directory, so we need to create one.
The type-map file lists the various options to negotiate across.
A stanza consists of a URI: header specifying the file to return, and one or more additional headers which specify characteristics about the file which can be used in negotiation.
The accepted headers are:
Content-type: matched against the Accept: header in an HTTP request
Content-language: matched against the Accept-language: header in an HTTP request
Content-encoding: matched against the Accept-encoding: header in an HTTP request
Content-length: the size of the file (I'm not sure how one would negotiate on this)
Description: a textual description used in error messages if no variations are suitable
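To make the stanza format concrete, here is a minimal Python sketch that reads type-map text into one dictionary per stanza. The sample map is hypothetical: the file names follow the Atom example in this article, but the content type application/atom+xml is my assumption, and parse_typemap is an illustrative name, not anything Apache provides.

```python
# Hypothetical two-stanza type-map; the content type is an assumption.
SAMPLE_TYPEMAP = """\
URI: atom.xml
Content-type: application/atom+xml

URI: atom.xml.gz
Content-type: application/atom+xml
Content-encoding: gzip
"""


def parse_typemap(text: str) -> list[dict[str, str]]:
    """Split type-map text into stanzas, one dict of lower-cased headers each."""
    variants: list[dict[str, str]] = []
    current: dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current stanza.
            if current:
                variants.append(current)
                current = {}
            continue
        if line.startswith("#"):
            # Lines starting with '#' are comments in a type-map.
            continue
        name, _, value = line.partition(":")
        current[name.strip().lower()] = value.strip()
    if current:
        variants.append(current)
    return variants
```

Parsing SAMPLE_TYPEMAP yields two variants, and only the second one carries a content-encoding entry.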
Although the RSS and Atom feeds for this site are negotiated based on the presence and value of the Accept-encoding header in an HTTP request, I'm just going to refer to the Atom feed for this example.
The Atom feed is saved as feeds/atom.xml relative to the articles directory of this site.
A cron job is run on a regular basis which compresses atom.xml and saves it to atom.xml.gz.
atom.xml.gz can be retrieved directly; however, I want to negotiate and ideally return solely the gzip'd file.
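A job along those lines could be sketched in Python as follows. This is not the script used on this site; the function name compress_keeping_original is hypothetical. It writes the .gz file beside the original and leaves the original in place, so that both variants exist for the type-map to choose between.

```python
import gzip
import shutil
from pathlib import Path


def compress_keeping_original(source: Path) -> Path:
    """Write '<source>.gz' next to source and leave source untouched."""
    target = source.with_name(source.name + ".gz")
    # Compress into a temporary file first so a request never sees
    # a half-written atom.xml.gz.
    tmp = target.with_name(target.name + ".tmp")
    with source.open("rb") as src, gzip.open(tmp, "wb", compresslevel=9) as dst:
        shutil.copyfileobj(src, dst)
    tmp.replace(target)
    return target
```

Calling compress_keeping_original(Path("feeds/atom.xml")) from a scheduled script would refresh feeds/atom.xml.gz each time it runs.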
The applicable type-map file is named atom.@xml and consists of two stanzas.
The first line of the first stanza specifies the filename of the uncompressed file, atom.xml.
The next line specifies the content-type.
A blank line is required between stanzas.
The second stanza specifies the compressed filename using the URI header, followed again by the content-type. The content-encoding line gives the type-map handler the information it needs to differentiate between this entry and the previous entry in the file.
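The way the content-encoding line separates the two entries can be illustrated with a deliberately simplified Python sketch. It only looks at gzip in the Accept-Encoding header and ignores wildcards and every other negotiation dimension, so treat it as an approximation of the idea rather than Apache's actual algorithm. The names VARIANTS, accepts_gzip and choose_variant are hypothetical.

```python
from typing import Optional

# The two stanzas described above, reduced to the fields used here.
VARIANTS = [
    {"uri": "atom.xml"},
    {"uri": "atom.xml.gz", "content-encoding": "gzip"},
]


def accepts_gzip(accept_encoding: Optional[str]) -> bool:
    """True if the Accept-Encoding value lists gzip with a non-zero q value."""
    if not accept_encoding:
        return False
    for item in accept_encoding.split(","):
        token, _, params = item.strip().partition(";")
        if token.strip().lower() not in ("gzip", "x-gzip"):
            continue
        q = 1.0  # no q parameter means full preference
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        return q > 0
    return False


def choose_variant(variants: list, accept_encoding: Optional[str]) -> Optional[str]:
    """Prefer the gzip variant when the client accepts it, else the plain one."""
    encoded = [v for v in variants if v.get("content-encoding") == "gzip"]
    plain = [v for v in variants if "content-encoding" not in v]
    if encoded and accepts_gzip(accept_encoding):
        return encoded[0]["uri"]
    return plain[0]["uri"] if plain else None
```

With these variants, a request sending "gzip,deflate" is answered with atom.xml.gz, while a request with no Accept-Encoding header gets atom.xml.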
Again, this file is saved as atom.@xml on my system.
If you request http://artific.com/articles/feeds/atom.@xml with a typical browser you likely will just see the resulting Atom format data.
If you bring up the page properties (eg: Tools/Page Info in Firefox), you might notice the smaller size, but otherwise the fact that you received the compressed file instead of the uncompressed file should be transparent.
You could use wget or curl to see the effect of negotiation by specifying whether or not compressed data is accepted (eg: wget -S --header=Accept-Encoding:\ gzip,deflate http://artific.com/articles/feeds/atom.@xml to request the compressed version; drop the --header option to get the uncompressed version).
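The same check can be scripted. The following is a minimal sketch using Python's standard urllib; the function names build_request and served_encoding are hypothetical, and it assumes the server answers HEAD requests for the negotiated URL the same way it answers GET.

```python
import urllib.request


def build_request(url: str, accept_gzip: bool) -> urllib.request.Request:
    """Build a HEAD request that does or does not advertise gzip support."""
    # Without an explicit header, Python's HTTP client sends
    # "Accept-Encoding: identity", i.e. it asks for uncompressed data.
    headers = {"Accept-Encoding": "gzip,deflate"} if accept_gzip else {}
    return urllib.request.Request(url, headers=headers, method="HEAD")


def served_encoding(url: str, accept_gzip: bool) -> str:
    """Return the Content-Encoding the server chose ('identity' if none)."""
    request = build_request(url, accept_gzip)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("Content-Encoding", "identity")
```

Calling served_encoding("http://artific.com/articles/feeds/atom.@xml", True) and then again with False should report different encodings if negotiation is working.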
The point, basically, is to minimize the bandwidth required for these files.
If you're using PHP you can turn on gzip compression for PHP files, and in theory you could wrap some PHP around your RSS/Atom feeds, your CSS and other static files. I tried that briefly, but feel it's a waste of resources especially since the Apache server can handle this work easily.
Now, there's another option that is much easier than what I've outlined here, and that's to use mod_gzip:
<IfModule mod_gzip.c>
mod_gzip_handle_methods GET POST
mod_gzip_item_include file "\.css$"
mod_gzip_item_include file "\.html$"
mod_gzip_item_include file "\.xml$"
</IfModule>
However, I had to rule that out as my hosting provider doesn't include mod_gzip in the standard build.
So, back to what can be done with a typical user setup.
There are some drawbacks to this approach.
It's not totally transparent: you have to specify a filename which will appear odd in comparison to what the user agent is expecting, regardless of whether you use my .@xml extension or the default .var extension.
There are user agents in the world which unfortunately make decisions based on the extension of the requested document; these UAs may mistakenly treat your gzip'd file as text/plain or an octet stream, and the resulting interpretation is unlikely to be what you want.
Another drawback, specific to using this technique with a content management system like MovableType, is that there's a disconnect between the URI of the feed (eg: http://artific.com/articles/feeds/atom.@xml) and the file maintained by the CMS (in my setup, feeds/atom.xml is the only file known to MT).
So, where I've been using:
<link rel="alternate" type="application/atom+xml" href="<$MTLink template="Atom Index"$>" title="Atom Index"/>
to link to my RSS and Atom feeds, I've had to change to a hard-coded URL:
<link rel="alternate" type="application/atom+xml" href="<$MTBlogURL$>/feeds/atom.@xml" title="Atom Index"/>
It's a minor thing, but it's the sort of thing which will surface to cause problems for me down the road when I decide to move my feeds somewhere.
Basically: if you run a site which is expected to receive many visitors, or you have files like the various syndication feeds, or you're concerned about speeding up transmission and minimizing bandwidth utilization, you will want to look at compressing as much data on your site as possible.
Use this technique if mod_gzip is unavailable.
If you are serving pages up in PHP, look at using either ob_gzhandler or adding the following to your .htaccess file:
php_flag zlib.output_compression On
php_value zlib.output_compression_level 9
It occurred to me as I wrote this that it may be possible to set up a type-map for a generic feed, negotiating not only whether it's compressed or not, but whether it's RSS, Atom, or the feed around the corner.
Copyright 2002–2011 Artific Consulting LLC.
Unless otherwise noted, content is licensed for reuse under the Creative Commons Attribution-ShareAlike 3.0 License. Please read and understand the license before repurposing content from this site.