Ed. Note, March 2019: I’ve written a more succinct version of this argument here.
In Unix, a trailing slash on a pathname identifies the path as pointing to a folder (aka directory). By convention, a pathname without a trailing slash points to a file. A folder is a ‘collection’ of files. The syntax of URIs is derived from the syntax of Unix filenames, and the concept of using trailing slashes to identify ‘collection’ resources was carried over. However, on the Web the strong delineation between folders and files does not exist; frequently a ‘collection’ resource appears similar in structure and content to a normal resource (sometimes referred to as a ‘subordinate’ resource). As a consequence, much confusion has arisen about the purpose and importance of trailing slashes on collection resources.
It is common for users to forget the trailing slash on a resource, and common for web servers to assist users who make this mistake by automatically redirecting them to the URI with the trailing slash. In fact, this practice is so pervasive that the vast majority of users treat a resource URI with a trailing slash and one without as synonyms: two URIs that point to the same resource, either of which is fine to use.
However, this understanding is not quite correct. It is more correct to understand that the resource without the trailing slash does not exist at all. But instead of being unhelpful and reporting a 404 Not Found status, web servers almost always apply Postel’s Law and tell the user where the resource they are looking for is actually located, via a permanent redirect. Read on to learn why it is important to have the correct understanding when designing RESTful APIs.
The main reason trailing slashes are so important is that they are critical to relative URIs in your API functioning correctly. There are many reasons why you should want to use relative URIs, but let’s look at just a couple:
Everywhere the fully qualified URI is repeated, it must be changed if the URI ever needs to change. With relative URIs this problem is mitigated: the location of a resource is expressed relative to the location of the current resource, and the client is able to turn that relative location into an absolute location using URI resolution. To be blunt, using fully qualified URIs when a relative URI would suffice is a violation of the Don’t Repeat Yourself (DRY) principle. By needlessly repeating the fully qualified URI, you create more work for yourself if the fully qualified URI has to change for any reason. You can also create more work for the customers of your API, by potentially preventing them from using relative URIs.
Let’s look at an example that neglects to use trailing slashes and attempts to use relative URIs, and as a consequence gets things wrong. Assume we’ve noticed that there just aren’t enough blog engines in the world, and so we’ve created the greatest blog engine ever (GBEE), and we want to expose an API to it, so that the people who make those ‘sharing’ widgets have yet another API that they need to integrate with. Here’s a rough idea of the API:
GET https://{blog-name}.gbee.io/blog/posts         # Retrieve the list of posts
GET https://{blog-name}.gbee.io/blog/posts/{id}    # Retrieve an individual post
GET https://{blog-name}.gbee.io/blog/images/{id}   # Retrieve an image linked to in a post's content
Here’s a sample of retrieving the list of blog posts:
GET /blog/posts HTTP/1.1
Host: some-blog.gbee.io
produces a listing like the following:
HTTP/1.1 200 OK
Content-Type: application/json

{
  "posts": [
    {
      "links": [{"href": "holiday"}],
      "tags": ["vacation", "mexico"],
      "summary": "Had a great trip to Mexico recently, here's a picture of where we stayed: <img src='../images/hotel.jpg'>"
    },
    ...
  ]
}
We can see that the relative URI of the first blog post is holiday, so what is the correct absolute URI of the blog post? You might expect it to be http://some-blog.gbee.io/blog/posts/holiday, but that’s not what the document above says. It actually says the absolute location is http://some-blog.gbee.io/blog/holiday. To understand why, you need to understand the algorithm for transforming relative URIs into absolute URIs. The first step is to establish the base URI of a resource; in this case the base URI is the URI of the requested resource, i.e. http://some-blog.gbee.io/blog/posts
The next step is to merge the base URI with the relative URI. RFC 3986 (section 5.2.3) describes this process; the relevant statement is:
return a string consisting of the reference’s path component appended to all but the last segment of the base URI’s path (i.e., excluding any characters after the right-most “/” in the base URI path, or excluding the entire base URI path if it does not contain any “/” characters).
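To make the quoted rule concrete, here is a minimal Python sketch of just this merge step. The function name `merge_paths` is made up for illustration, and the sketch is deliberately simplified: it assumes the base URI has an authority and a non-empty path, and it does not remove "." or ".." segments, which RFC 3986 handles in a separate step.

```python
def merge_paths(base_path: str, reference_path: str) -> str:
    """Merge a relative-path reference with a base path.

    Simplified illustration of the rule quoted from RFC 3986; it does not
    handle dot segments ("." and "..") or an empty base path.
    """
    # Keep everything up to and including the right-most "/" of the base
    # path. If the base path contains no "/", both head and slash are
    # empty, so the entire base path is excluded.
    head, slash, _last_segment = base_path.rpartition("/")
    return head + slash + reference_path


# Without a trailing slash the last segment ("posts") is discarded.
print(merge_paths("/blog/posts", "holiday"))   # /blog/holiday
# With a trailing slash the last segment is empty, so nothing is lost.
print(merge_paths("/blog/posts/", "holiday"))  # /blog/posts/holiday
```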
Therefore we must exclude any characters after the right-most “/” in http://some-blog.gbee.io/blog/posts, which gives http://some-blog.gbee.io/blog/. Finally the relative URI (holiday) is appended to this URI, giving http://some-blog.gbee.io/blog/holiday. If a client attempts to retrieve http://some-blog.gbee.io/blog/holiday they will get a 404 Not Found error, because the resource doesn’t actually exist at that location.
If it is not already clear, placing a trailing slash on a collection resource is not optional. It is critical to relative URIs being resolved correctly. RFC 3986 is one of the foundational specifications of the Web and, as the example above demonstrates, collection resources are expected to have a trailing slash. It’s not just a stylistic preference, or something that helps with SEO. It’s intrinsic to the syntax of URIs and therefore important to get right.
I’d speculate that a lack of understanding of how relative URIs are transformed into absolute URIs is a major factor in the all too common occurrence of web APIs not naming collection resources correctly, and consequently using fully qualified URIs throughout the API unnecessarily. I think developers encounter problems trying to get relative URIs working properly (through their lack of understanding of the mechanics) and then err on the side of caution and switch to using fully qualified URIs throughout.
The above ‘broken’ API also exhibits a commonly seen problem where relative URIs seem to be working in one case, but not in another. If we were to retrieve the content of the blog post it would look like this:
GET /blog/posts/holiday HTTP/1.1
Host: some-blog.gbee.io

HTTP/1.1 200 OK
Content-Type: text/html

Had a great trip to Mexico recently, here's a picture of where we stayed: <img src="../images/hotel.jpg">
Nothing unusual about this content, just regular HTML, and the relative URI of the image link looks correct:
Base URI: http://some-blog.gbee.io/blog/posts/holiday
Relative URI: ../images/hotel.jpg
Absolute URI: http://some-blog.gbee.io/blog/images/hotel.jpg
So the problem doesn’t lie here; again it lies with the /blog/posts resource. When the <img> tag is evaluated relative to /blog/posts, the wrong location is produced:
Base URI: http://some-blog.gbee.io/blog/posts
Relative URI: ../images/hotel.jpg
Absolute URI: http://some-blog.gbee.io/images/hotel.jpg
You can easily imagine many developers scratching their heads trying to figure out why the image location is not being calculated correctly for the /blog/posts resource, when it is working fine for the /blog/posts/holiday resource. The fact that the base URI is that of the document that was requested (the list of posts), not of the blog post itself, is easily missed.
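The two resolutions above can be reproduced with any RFC 3986 resolver. The sketch below uses `urljoin` from Python's standard library as a stand-in for whatever resolver a client happens to use; the choice of Python is only for illustration and is not part of the GBEE API.

```python
from urllib.parse import urljoin

IMAGE = "../images/hotel.jpg"

# Base URI is the blog post itself: the image link resolves as intended.
from_post = urljoin("http://some-blog.gbee.io/blog/posts/holiday", IMAGE)
print(from_post)     # http://some-blog.gbee.io/blog/images/hotel.jpg

# Base URI is the listing without a trailing slash: "posts" is treated as
# the last segment and dropped, so ".." climbs one level too far.
from_listing = urljoin("http://some-blog.gbee.io/blog/posts", IMAGE)
print(from_listing)  # http://some-blog.gbee.io/images/hotel.jpg
```

The same post content yields two different image locations purely because of which document it was embedded in.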
It is also worth appreciating that the author of the blog post may reasonably expect to be able to use relative URIs, since they are an intrinsic part of the Web. Users will expect the blog engine to fully support relative URIs. By choosing not to put a trailing slash on the /blog/posts resource, the blogging engine has failed to meet this expectation. The post author can reasonably view the blog engine as defective in this regard.
All of the problems outlined above can be addressed by placing a trailing slash at the end of the blog posts resource, so its URI becomes:
http://some-blog.gbee.io/blog/posts/
Now the holiday relative URI resolves correctly:
Base URI: http://some-blog.gbee.io/blog/posts/
Relative URI: holiday
Absolute URI: http://some-blog.gbee.io/blog/posts/holiday
The relative path for the image in the blog post also resolves correctly when resolved relative to the /blog/posts/ resource:
Base URI: http://some-blog.gbee.io/blog/posts/
Relative URI: ../images/hotel.jpg
Absolute URI: http://some-blog.gbee.io/blog/images/hotel.jpg
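As a quick check of the trailing-slash version, here is a small sketch that resolves both relative URIs against the collection URI. It again assumes Python's standard `urljoin` as the resolver, which is an illustrative choice rather than anything the blog engine prescribes.

```python
from urllib.parse import urljoin

# Collection URI, now with its trailing slash.
BASE = "http://some-blog.gbee.io/blog/posts/"

post_uri = urljoin(BASE, "holiday")
image_uri = urljoin(BASE, "../images/hotel.jpg")

print(post_uri)   # http://some-blog.gbee.io/blog/posts/holiday
print(image_uri)  # http://some-blog.gbee.io/blog/images/hotel.jpg
```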
Another mistake that I have sometimes seen is to get the naming of a collection resource correct, but to get the URI for queries on the collection wrong, e.g.:
http://some-blog.gbee.io/blog/posts/               # retrieve all posts
http://some-blog.gbee.io/blog/posts?tags=vacation  # retrieve posts tagged with 'vacation'
The problem here, once again, is that relative URIs returned in the http://some-blog.gbee.io/blog/posts?tags=vacation resource will be resolved relative to http://some-blog.gbee.io/blog/ rather than http://some-blog.gbee.io/blog/posts/, because the trailing slash is missing. The correct form of the URI would be:
http://some-blog.gbee.io/blog/posts/?tags=vacation
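The query string plays no part in locating the right-most “/” of the base path, which is why the slash has to come before the “?”. A minimal sketch, assuming Python's standard `urljoin` as the resolver:

```python
from urllib.parse import urljoin

# Query on the collection, with the trailing slash missing before "?".
wrong = urljoin("http://some-blog.gbee.io/blog/posts?tags=vacation", "holiday")
print(wrong)  # http://some-blog.gbee.io/blog/holiday

# Same query with the trailing slash in place.
right = urljoin("http://some-blog.gbee.io/blog/posts/?tags=vacation", "holiday")
print(right)  # http://some-blog.gbee.io/blog/posts/holiday
```

In both cases the base URI's query is discarded during resolution; only the path decides where the relative URI lands.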
Some resource formats specify ways to override the base URI of a resource. For example, HTML has the <base> element and XML has the xml:base attribute. These mechanisms exist to provide a way for an HTML or XML document to render correctly when using embedded hyperlinks that use relative URIs, regardless of whether the hosting document is correctly named with a trailing slash or not (or if the URI of the hosting document cannot be determined). I only mention these for completeness; I would not recommend their use unless required to work around an existing API that does not name collection resources correctly.
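To illustrate the effect of such an override, here is a rough sketch of how a client might pick the base URI when a document declares one. The `resolve` helper and its `base_href` parameter are hypothetical names for this example; real HTML and XML processors have more detailed rules.

```python
from typing import Optional
from urllib.parse import urljoin


def resolve(document_uri: str, reference: str,
            base_href: Optional[str] = None) -> str:
    """Resolve a reference, honouring an optional declared base URI."""
    # A declared base may itself be relative, so it is first resolved
    # against the URI the document was retrieved from.
    base = urljoin(document_uri, base_href) if base_href else document_uri
    return urljoin(base, reference)


doc = "http://some-blog.gbee.io/blog/posts"

# No override: the missing trailing slash produces the wrong location.
print(resolve(doc, "holiday"))
# http://some-blog.gbee.io/blog/holiday

# With an override equivalent to <base href="/blog/posts/">.
print(resolve(doc, "holiday", base_href="/blog/posts/"))
# http://some-blog.gbee.io/blog/posts/holiday
```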
URIs are hierarchical in nature. The path component of a URI is particularly hierarchical. Levels in the hierarchy are delimited by the slash ("/") character. Paths to the right of a slash are subordinate to paths to the left of a slash:
/a/b    # b is subordinate to a
/c/d/e  # e is subordinate to d, d is subordinate to c
Any path which occurs to the left of a slash is a collection resource; any path which occurs to the right is a subordinate. Note that a resource can be both a collection resource and a subordinate resource (in the /c/d/e example, d is both a collection resource and a subordinate resource). If a resource is a collection resource (even if it is subordinate to another resource), then its URI must have a trailing slash. If you visualize the path hierarchy of your API as a tree, then only the leaf nodes in the tree should lack a trailing slash.
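One way to apply the tree rule is to scan the list of paths an API exposes and flag any path that has subordinates but no trailing slash. The following is a minimal sketch of such a check; the function name `collections_missing_slash` and the sample paths are assumptions made up for this example.

```python
from typing import Iterable, List


def collections_missing_slash(paths: Iterable[str]) -> List[str]:
    """Return paths that have subordinate paths but lack a trailing slash."""
    all_paths = list(paths)
    missing = []
    for path in all_paths:
        if path.endswith("/"):
            continue  # already named as a collection
        prefix = path + "/"
        # Any other path beneath this one makes it a non-leaf node.
        if any(other.startswith(prefix) for other in all_paths):
            missing.append(path)
    return missing


broken = ["/blog/posts", "/blog/posts/holiday", "/blog/images/hotel.jpg"]
print(collections_missing_slash(broken))  # ['/blog/posts']

fixed = ["/blog/posts/", "/blog/posts/holiday", "/blog/images/hotel.jpg"]
print(collections_missing_slash(fixed))   # []
```

The check only sees paths it is given, so a collection that currently has no subordinates would not be flagged.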