Do not construct URLs with concatenation

I'm working on an installation of the Joomla! CMS where none of the links are working correctly. Joomla! is very sloppy with URLs. The uploads directory appears to be called images/stories but a quick grep shows that that exact string is referenced 146 times in the Joomla! installation. That's in the source code, not the database. Most of those times it is being concatenated into strings to make URLs.

I've just spent three hours working out that I have no idea what Joomla! or XHTMLSuite, the editor the client has chosen to use, is doing, and that I don't give a damn, because whatever they are doing, they are doing it wrong.

The correct way to construct a URL from a filename is not concatenation. Do not do this. It does not work properly. So to avoid any confusion let me state categorically how URLs are supposed to work.

Resolving relative URLs is the only situation where a web browser tries to interpret the path of a URL. For this purpose, the URLs http://hostname/directory and http://hostname/directory/ are not the same. The latter form is correct. The former works only because Apache works out that this is a directory and issues an HTTP redirect to "canonicalise" it. Never hard-code a URL for a "directory" without a trailing slash. If it isn't hard-coded, make sure that the application appends a trailing slash if none exists.
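To see why the trailing slash matters, here is a minimal sketch in Python using the standard library's urllib.parse. The file name file.jpg and the helper ensure_trailing_slash are my own illustrations, not anything Joomla! provides.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def ensure_trailing_slash(url):
    """Return url with a trailing slash on its path (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.path.endswith("/"):
        return url
    # Only the path changes; any query string or fragment is left alone.
    return urlunsplit(parts._replace(path=parts.path + "/"))


# Without the slash, "directory" is treated as a file name and is replaced.
without_slash = urljoin("http://hostname/directory", "file.jpg")
# With the slash, "directory/" is a directory and the file lands inside it.
with_slash = urljoin("http://hostname/directory/", "file.jpg")
# Normalising the base first gives the intended result.
fixed = urljoin(ensure_trailing_slash("http://hostname/directory"), "file.jpg")

print(without_slash)  # http://hostname/file.jpg
print(with_slash)     # http://hostname/directory/file.jpg
print(fixed)          # http://hostname/directory/file.jpg
```

The first result is what a browser does with a relative link on a page served from the slashless URL, which is why the redirect exists at all.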

There are two operations which you then need to define to be able to construct URLs:

  • Given an absolute base URL A, and an absolute or relative URL B, compute a new URL B` which is an absolute representation of B in the context of A.
  • Given an absolute or relative URL, append a query-string parameter.
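The second operation in the list above is not concatenation either, since the URL may already have a query string or a fragment. A rough sketch in Python, assuming the standard urllib.parse module; the function name append_query_param and the example parameters are hypothetical:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def append_query_param(url, name, value):
    """Append name=value to the query string of an absolute or relative URL."""
    parts = urlsplit(url)
    # Decode the existing pairs, add the new one, then re-encode the lot.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


print(append_query_param("http://mysite/index.php", "page", "2"))
# http://mysite/index.php?page=2
print(append_query_param("images/stories/?sort=name", "q", "a b&c"))
# images/stories/?sort=name&q=a+b%26c
print(append_query_param("http://mysite/a?x=1#top", "y", "2"))
# http://mysite/a?x=1&y=2#top
```

Note that this sketch re-encodes the existing parameters as well, so an unusually encoded query string may come back in a normalised form.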

The first operation is not concatenation. Learn this.

In notation, let A ~ B = B`

So say you want a URL for a specific uploaded image. Start with a base URL for your site, which here is http://mysite/.


We then have a relative URL for our image directory, relative to the base URL. Here it is images/stories/.


Then http://mysite/ ~ images/stories/ = http://mysite/images/stories/

We have the filename of our image, "Uploaded Image.jpg". First, we need to make that a relative URL. This requires URL encoding, which turns the space into %20 and gives Uploaded%20Image.jpg.


Then http://mysite/images/stories/ ~ Uploaded%20Image.jpg = http://mysite/images/stories/Uploaded%20Image.jpg
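The same three steps can be sketched in Python, where urllib.parse.urljoin plays the part of the ~ operation and urllib.parse.quote does the URL encoding. The variable names are mine, and this is an illustration of the idea rather than what any particular CMS does.

```python
from urllib.parse import quote, urljoin

base_url = "http://mysite/"

# http://mysite/ ~ images/stories/
images_url = urljoin(base_url, "images/stories/")

# Turn the filename into a relative URL. safe="" means a "/" inside a
# filename would be encoded too, rather than read as a path separator.
filename = "Uploaded Image.jpg"
relative_url = quote(filename, safe="")

# http://mysite/images/stories/ ~ Uploaded%20Image.jpg
image_url = urljoin(images_url, relative_url)

print(images_url)    # http://mysite/images/stories/
print(relative_url)  # Uploaded%20Image.jpg
print(image_url)     # http://mysite/images/stories/Uploaded%20Image.jpg
```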

At this point we have a working URL. I know, it looks like all we've done is concatenation, and that's why people appear to make this mistake time and time and time again. But it isn't concatenation. What if our base URL was http://mysite/CMS/ and our images URL was /uploads/? Or what if our images URL is http://uploads.mysite/? In both cases the correct result throws away part or all of the base URL, which concatenation can never do.
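A quick comparison makes the difference concrete. This is a sketch in Python, again assuming urllib.parse.urljoin as a stand-in for the ~ operation:

```python
from urllib.parse import urljoin

base_url = "http://mysite/CMS/"

# A root-relative images URL replaces the whole path of the base.
root_relative = urljoin(base_url, "/uploads/")
# An absolute images URL replaces the base entirely.
other_host = urljoin(base_url, "http://uploads.mysite/")

# What naive concatenation produces for the same inputs.
concat_root_relative = base_url + "/uploads/"
concat_other_host = base_url + "http://uploads.mysite/"

print(root_relative)         # http://mysite/uploads/
print(other_host)            # http://uploads.mysite/
print(concat_root_relative)  # http://mysite/CMS//uploads/
print(concat_other_host)     # http://mysite/CMS/http://uploads.mysite/
```

Neither concatenated string is the URL anybody intended, and no amount of adding or stripping slashes fixes the second one.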

More than that, using this operation doesn't let people go wrong. It discourages them from just wedging a / in there in the hope that it will make their URLs work, and prevents ambiguity about whether a piece of code works in all situations or just the way they've got it configured.

