The HTML5 History API: AJAX & URLs (or no URLs) - Part I

Amarjit Bharath   —   20 March 2013   —   Code & Drupal

The introduction of AJAX sparked a frenzy of web applications, including chats, feeds, dynamic forms and many more examples alike. It has allowed developers to create websites where the browser would never refresh or reload a page when a link is clicked. It creates a nice effect and allowed us to have faster loading sites; saying that, a major flaw was introduced, which relates to the browser address bar and history.

In this article I will explain exactly what the problem is, how we can fix it using both HTML4 and HTML5. I will introduce you to the HTML5 History API and create a working example of an AJAX site and the History API.

Firstly, let me state that it is possible for a website to access your browser history. Before you gasp in embarrassment, it can’t see what you’ve been browsing (release breath), just force your browser to jump back and forth through your browser history. It’s the same process as you clicking the ‘back’ and ‘forward’ buttons within your browser. So you can be sure your browser history is safe.

Secondly, an AJAX site usually refers to a website that loads parts or all of its content without having to refresh or redirect to another page. When new content is loaded via AJAX in the browser, the URL within the address bar will never change. It can create an effect of the website responding quickly, as the browser doesn’t need to re-render all of the content.

AJAX uses the XMLHttpRequest object to fetch remote data, although that data must adhere to the same origin policy of JavaScript. When using the AJAX functions in JQuery and other such frameworks, they are all simply using the XMLHttpRequest object in the background.

A major drawback

One of the major drawbacks of developers implementing AJAX is that they tend to ignore the fact that the URL in the address bar is never updated when they load content using the AJAX call.

Let me create an example.

Say you have website, programmingdonkey.com (that URL doesn’t exist but feel free to register it).
You have implemented your navigation region on the left and the content region occupies the remaining space to the right.

  1. The user would visit your site
  2. Click a link
  3. Your JavaScript event listener captures the event, then fetching the new data and updating the content region.
  4. The URL in the address bar has not been updated, even though new content is present on the page. See Diagram 1.1

HTML5 example no url change
Diagram 1.1 – URL in address bar not updated.

This can be a major nuisance for anyone wanting to ‘Add bookmark’. If the user decided to bookmark the page, they would only be bookmarking the page currently in the address bar – most probably the homepage or whatever page they originally landed on.

Say for example, a user navigates through several depths of links and bookmarks the page for later. They return later, firing their browser up and visiting your site via the bookmark, only to be presented with the page where they first started. If the user attempts to press the back button or view their browser history nothing would happen because the history doesn't exist for the site. This creates a frustrating user experience.

To further add to this frustration your users will not be able to share pages from within your site. This also means sharing links through social media won’t work either.

The URL within the address bar is of some significance and is used by many different technologies and services to point users to the correct resource on a website. If you are only using AJAX to load widgets then you won’t need to update the address bar. It only becomes an issue when using AJAX to reload the entire page content.

It may seem like a cool effect that content is being updated without the URL changing – but it’s just not usable.

Side note - Search engines don’t parse JavaScript. If your website only uses AJAX to load content, your content will never be indexed, with the exception of your homepage. I’ve seen it many times before, big sites using AJAX to load content that is never indexed. As a result they lose position in the rankings to their competitors.

The Fix: HTML4

Previously, to fix the issue with the address bar not being updated after state changes (after AJAX content is loaded); we appended hashes to the end of the URL. These hashes would signify what content you were currently viewing.

Say for example a user clicked a link; the URL would be updated with the hash embedded in the anchor or extracted from the link’s attribute and added to the end of the URL within the address bar. The browser as a result would not reload or redirect. If an existing hash is already appended to the URL, this would simply be replaced.
This provided a means of having unique addresses that the user could bookmark and even share.

Example of hash used in anchor: <a href="#contact-us">Contact Us</a>

Example of attribute used in anchor: <a href="/contact-us" name="contact-us">Contact Us</a>

A JavaScript click listener would then be attached to the anchor. It would extract the hash or attribute (contact-us) and fetch the content using AJAX via say ‘programmingdonkey.com/contact-us’.

Ajax url change #contact-us
Diagram 1.2 – URL in address bar updated with hash and without page loading.

If the user decided to share this link – on page load, the server or JavaScript would again extract the hash (or attribute) and display the content as if the link was clicked directly.

This implementation will work on all browsers, even IE6. Hashes within anchors have been around for a long time; so this implementation is solid, however ugly.

Although this is a good fix to our problem of the address bar not being updated, search engines will still not follow or index anchors that contain only hashes. Thus the best option here is using well formatted URLs in the ‘href’ attribute and add a ‘class’ or ‘name’ attribute containing the hash. Adding a prefix will help you distinguish between regular values and hash values used for loading content.

e.g. <a href="/contact-us" name="ajax-contact-us">Contact Us</a>

Another issue still exists; the back and forward buttons will still not function as intended. You can navigate through the site but pressing the ‘back’ button will take you to the previous site you visited. Very annoying but there isn’t really a good fix for this using HTML4.

Rounding this up, we know we can achieve unique addresses for the various content on our site (although ugly) using hashes and even provide a fallback for non-JS users and spiders. The remaining problem still remains of the back and forward buttons not working during state changes; which leads us nicely on to the HTML5 History API.

The Fix: HTML5

The History API is not a new feature in HTML. The HTML5 specification does however add functionality to allow us to do more with the browser history and URL’s within the address bar. It’s supported by all major browsers, apart from IE (including IE9).

It lets us do away with those ugly hashes by allowing us developers to rewrite the entire URL after the domain prefix. The History API is accessed via JavaScript, so it’s all client-side and always available. You can consider the HTML4 solution defunct with the exception as a fallback for IE9 and below because Microsoft has not implemented it.

Thus we no longer have to ruin our well formed source code with extra attributes or hashes within anchor tags. We only need a small snippet of JavaScript to listen for clicks and then fetch the content by parsing the URL in the address bar. Your links will remain untouched, working as normal and supported for browsers without JavaScript capabilities.

Furthermore, the History API allows us to manipulate the browser history so users can navigate back and forth between states. This means the page history will be added to the browser history, as if the user was roaming a site without AJAX.

The History API specification also allows you to modify the page title – but it’s not really supported as of yet. This is achievable through a separate line of JavaScript.

Using the History API, users can now bookmark your pages with clean URLs (no hashes – except IE of course), share and even navigate back and forth through your site freely. It also means that if your JavaScript breaks or a spider crawls your site, it will still be available in its entirety; bearing in mind, you have been developing with graceful degradation in mind.

Later on, I’ll show you how you can implement this yourself with minimal code and do away with third party libraries.

HTML5 example with full url change
Diagram 1.3 – URL in address updated with well formed URL without page loading.

Alternatives

An alternative you may consider to AJAX is having the entire content readily available on a single page – altering the visibility using JavaScript and CSS, which would work nicely on small sites where content is limited. As a result, no HTTP requests need to be sent and received between clicks, as the data has already (hopefully) downloaded and available in the DOM from the start.

History.js developed by balupton implements the HTML5 History API with fallback for IE9 and downwards. It’s a third party library, which contains our solutions for both HTML4 and HTML5. It has a few extras like active menu classes and AJAX loading of content. If you end up using it, I urge you to get an idea what lies behind the scripts.

Final Message

It is clear that well formed URLs are an important, functional part of websites. URLs give our users a visual cue to what they are currently viewing and even if a page has loaded. Some services and users even directly modify the URL within the address bar when they familiar with the website.

Spiders crawl what they can see (without the use of JavaScript) and will not index any URLs that contain hashes. So ensure you implement the HTML5 solution with fallback for IE users.

This leads me on to my next point of building your site first and then adding the AJAX functionality on top. This will ensure your site is developed in a way that is accessible for all users. Building core functionality first and adding the little extras on top is a common methodology in web development.

Most importantly, the only way users can get access to these resources is by well formed URLs. The URL acronym actually stands for ‘Uniform resource locator’, it’s in the name.

In the next article, we will create a simple sitemap and bolt on AJAX so your content is loaded dynamically. We will then implement the HTML5 History API to achieve what we have talked about today.



Our Partners / Accreditations / Networks

0161 872 3455

Sign up to our newsletter