Deep Linking and Indexing AJAX Applications – Google, Hashbang and state maintenance

In AJAX applications, user interaction is handled on the fly, and content is generated and injected into the DOM without full page loads. Today this is an important step in creating responsive UIs, and the benefits are obvious.

But since users are no longer browsing actual pages, you need to take extra steps to maintain state, handle URLs and serve indexable content to crawlers. This post should give you a head start on your next fully fledged AJAX application.

State and bookmarkable URLs

Basically we want to accomplish two things here:

  1. Be able to change the browser’s current URL without causing round trips to the server.
  2. Route those URLs to JavaScript functionality and dynamic content.

 

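In modern browsers history.pushState() gives you full control, allowing you to manage history and change URLs strictly on the client. The new URL can be any valid URL on the same origin as the current page. The popstate event (window.onpopstate) then lets you react when the user moves through history with the back and forward buttons, so you can map the current URL to relevant content and functionality. Note that calling pushState() does not itself fire popstate, so you have to update the view yourself after each call.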
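As a rough illustration of the pushState() approach, here is a minimal sketch. The route table, the view names and the resolve(), render() and navigate() helpers are all made up for this example; only history.pushState(), the popstate event and location.pathname are real browser APIs.

```javascript
// Hypothetical route table: URL path -> name of the view to render.
var routes = {
  "/": "home",
  "/search": "search"
};

// Pure lookup, shared by navigate() and the popstate listener.
function resolve(path) {
  return Object.prototype.hasOwnProperty.call(routes, path) ? routes[path] : "notFound";
}

// Placeholder: a real application would inject content into the DOM here.
function render(viewName) {
  document.title = viewName;
}

// Change the URL without a round trip, then render the matching view.
// pushState() does not fire popstate, so we render explicitly.
function navigate(path) {
  history.pushState({ path: path }, "", path);
  render(resolve(path));
}

// Only wire up the listener in a browser that supports the History API.
if (typeof window !== "undefined" && window.history && window.history.pushState) {
  // popstate fires on back/forward navigation.
  window.addEventListener("popstate", function () {
    render(resolve(location.pathname));
  });
}
```

Calling navigate("/search") from a click handler would update the address bar and render the view, while the popstate listener takes care of the back and forward buttons.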

Usually, though, you need to support a few more browsers and are stuck with location.hash and the onhashchange event, which have wider support. The concept here is to use the fragment part of the URL (everything after the #, which is never sent to the server and so does not cause round trips) to emulate URL structures and/or parameters.

This could look something like this:
site.com/search.aspx#?term=foo&filter=bar

Or this perhaps prettier one:
site.com/search.aspx#/foo/bar/or/what/ever

As long as it’s a valid URL it can take whatever form you fancy, and if you include a plugin like Ben Alman’s jQuery Hashchange, this approach will work in IE6 and IE7 too.

From here on out, it’s about updating location.hash while listening for changes with onhashchange, then executing functionality accordingly. In other words, you’re now linking to specific parts of the application and allowing users to bookmark relevant URLs.
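To make that a bit more concrete, here is a small sketch of the listen-and-dispatch loop for path-style hashes like the one above. The parseHash() and dispatch() helpers and the handlers table are hypothetical names for this example, and the sketch assumes a browser with native hashchange support (for IE6 and IE7 you would go through a plugin instead).

```javascript
// Split a hash such as "#/foo/bar" (or "#!/foo/bar") into ["foo", "bar"].
function parseHash(hash) {
  var path = hash.replace(/^#!?\/?/, "");
  return path === "" ? [] : path.split("/");
}

// Hypothetical handlers, keyed by the first path segment.
var handlers = {
  search: function (segments) {
    return "search for " + segments.join(", ");
  }
};

// Look up the handler for the current hash and call it with the remaining
// segments. Returns null when nothing matches.
function dispatch(hash) {
  var segments = parseHash(hash);
  var name = segments[0];
  if (!Object.prototype.hasOwnProperty.call(handlers, name)) {
    return null;
  }
  return handlers[name](segments.slice(1));
}

// Only wire up the listener when running in a browser.
if (typeof window !== "undefined") {
  window.addEventListener("hashchange", function () {
    dispatch(location.hash);
  });
  // Handle a bookmarked or deep-linked hash on the initial page load.
  dispatch(location.hash);
}
```

Setting location.hash = "/search/kiwis" anywhere in the application would then trigger the search handler, and the same thing happens when a user opens a bookmarked URL.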

Abstracting URL handling

While it is entirely possible to do this manually, URL-to-functionality mapping can get quite tedious. Fortunately there are quite a few libraries that abstract this part. Ben Alman’s extended BBQ plugin takes hashchange a step further, extending jQuery.param and adding a jQuery.deparam method to help with building and querying URLs.

A few other examples: Basic, Advanced.
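To give an idea of what such helpers take care of, here is a plain JavaScript sketch that turns a query-style fragment, like the first example earlier, into an object. This is not how BBQ is implemented, and parseFragmentParams() is a made-up name for this example.

```javascript
// Turn "#?term=foo&filter=bar" into { term: "foo", filter: "bar" }.
// Malformed percent-escapes will make decodeURIComponent throw.
function parseFragmentParams(hash) {
  // Strip the leading "#", an optional "!" and an optional "?".
  var query = hash.replace(/^#!?\??/, "");
  var params = {};
  if (query === "") {
    return params;
  }
  query.split("&").forEach(function (pair) {
    var index = pair.indexOf("=");
    var key = index === -1 ? pair : pair.slice(0, index);
    var value = index === -1 ? "" : pair.slice(index + 1);
    // Treat "+" as a space, as in form-encoded query strings.
    params[decodeURIComponent(key)] = decodeURIComponent(value.replace(/\+/g, " "));
  });
  return params;
}
```

With the earlier example, parseFragmentParams("#?term=foo&filter=bar") gives you an object with term and filter properties that you can hand straight to your search code.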

More elaborate frameworks like Backbone.js and the lighter Spine.js also have routing modules for mapping URLs to functionality. These have the added advantage of supporting history.pushState() while falling back on location.hash in older browsers, which sounds like a win-win.

Here’s an example of routes in Backbone.js:

var Workspace = Backbone.Router.extend({
  routes: {
    "help":                 "help",    // #help
    "search/:query":        "search",  // #search/kiwis
    "search/:query/p:page": "search"   // #search/kiwis/p7
  },
  help: function() {
    // ...
  },
  search: function(query, page) {
    // ...
  }
});

 

Indexing content with Google’s Hashbang

Now that you’re about to turn your AJAX application up to 11, you need some way to dish out content to search engines to complete the scenario. If you’re able to go with the modern approach of changing proper URLs with pushState(), you just have to make sure that the server can render relevant content for those same URLs, which might require some user agent detection.

When it comes to the hash fragment things get a bit more complicated, as it is not part of the communication with the server. Google is aware of that and offers a solution with the hashbang notation, ‘#!’. With this you’re letting Google know that this ‘AJAX URL’ is indexable and that the server is able to render a snapshot of the HTML. The crawler will then make one additional request, passing the hash fragment as a query parameter.

Using one of the previous examples, this is what happens:
site.com/search.aspx#!/foo/bar/or/what/ever

Will result in this additional request:
site.com/search.aspx?_escaped_fragment_=/foo/bar/or/what/ever

The server then has to process this request and render a snapshot of the relevant content.
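On the server, the first step is simply spotting that parameter. The examples in this post use .aspx pages, but to stay in one language here is a JavaScript sketch (for something like Node) of how the fragment could be pulled out of the request URL. getEscapedFragment() is a hypothetical helper name, and rendering the actual snapshot is left out.

```javascript
// Return the fragment the crawler sent in _escaped_fragment_,
// or null when this is a normal request without the parameter.
function getEscapedFragment(requestUrl) {
  var queryStart = requestUrl.indexOf("?");
  if (queryStart === -1) {
    return null;
  }
  var pairs = requestUrl.slice(queryStart + 1).split("&");
  for (var i = 0; i < pairs.length; i++) {
    var eq = pairs[i].indexOf("=");
    var key = eq === -1 ? pairs[i] : pairs[i].slice(0, eq);
    if (key === "_escaped_fragment_") {
      // The crawler percent-escapes special characters, so decode them.
      return decodeURIComponent(eq === -1 ? "" : pairs[i].slice(eq + 1));
    }
  }
  return null;
}
```

If the function returns a string, the request came in through the crawling scheme and the server should answer with the HTML snapshot for that fragment; if it returns null, serve the regular page.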

Basically that’s it: you now have hashbang URLs in the Google index linking directly to your AJAX functionality.

Using redirects

You can use redirects to help with the server part, which can sometimes make things easier. As long as the crawler eventually ends up at a page with content, it’s perfectly safe to redirect.

Let’s say the crawler requests:
site.com?_escaped_fragment_=/foo/bar

You could redirect to this relevant content:
site.com/foo/bar

Also, if you’re using a framework like Backbone.js with pushState() for modern browsers and location.hash as a fallback for older ones, redirects will complete the cycle.

The modern browser gets a pushState() for AJAX functionality on:
site.com/foo/bar

The older browser gets a hashchange for AJAX functionality on:
site.com#!/foo/bar

The server redirects the crawler’s additional request from:
site.com?_escaped_fragment_=/foo/bar

To this:
site.com/foo/bar

So 301s or 302s?

With 301 ‘Moved Permanently’ redirects, the target URL will end up in the Google index; with 302 ‘Moved Temporarily’ redirects, it will be the #! URL.

Usually 302s straight to the AJAX experience are the way to go, but remember that users with disabilities, or users with JavaScript turned off, could hit that URL too.
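The redirect step described above could be sketched like this. It assumes a JavaScript runtime with the standard URL class (recent Node versions and modern browsers have it); crawlerRedirect() and its permanent flag are hypothetical names, and actually sending the response is left to whatever server framework you use.

```javascript
// Decide whether a request should be redirected to its "real" URL.
// Returns { status, location } for crawler requests, otherwise null.
function crawlerRedirect(requestUrl, permanent) {
  // The base is only needed to parse a relative URL; site.com is the
  // placeholder domain used in the examples above.
  var url = new URL(requestUrl, "http://site.com");
  var fragment = url.searchParams.get("_escaped_fragment_");
  if (fragment === null) {
    return null;
  }
  // Force a single leading slash so the target stays a local path.
  var path = "/" + fragment.replace(/^\/+/, "");
  return { status: permanent ? 301 : 302, location: path };
}
```

A request handler would call crawlerRedirect(request.url) first, send the returned status and Location header when it gets a result, and fall through to normal page rendering when it gets null.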

Here’s the official info on Google’s AJAX crawling; there’s some good info on creating HTML snapshots as well.

