Debunking SEO

I've discussed previously how the SEO industry constructs its advice. Now I want to take the industry to task over actual advice I've received from SEO companies. SEO companies make claims that are hard to verify scientifically, because it's very difficult to isolate the causal factors behind changes in search result positions.

To validate these claims we could imagine a study comparing the rankings of two groups of websites, distinguished only by whether they implement a given SEO suggestion. If the hypothesised recommendation does affect ranking, we would expect to see a statistically significant improvement in the ranking of the group that implements it.

I don't believe this is possible. For one thing, given the complexity of the web there are too many factors to extract a clear picture, so any results would be unlikely to be statistically significant: any effect observed would be smaller than the experiment's margin of error. The results would be muddied by independent and much more important considerations like inbound links and accessibility. You also can't get a good sense of how much a ranking is affected: you only see the order of results, not how much better one result is considered to be than the next, and statistically that should widen the margins of error further.

I am skeptical about a lot of the claims below. Given the doubts I've expressed above I don't think I can disprove them, but I do contest them: I believe they are unlikely, and I believe SEO people believe them for invalid reasons.

Using meta keywords tags increases ranking for those keywords.

No major search engine uses meta keywords. The tag is far too easy to manipulate and need not reflect the actual page content.

Using meta keywords tags increases search engine traffic if the keywords also appear in the page content.

I doubt this would predict relevance well enough to be useful. Anyone could mirror the page's content into the meta keywords and the page would rank higher, while sites that omit the tag would be penalised.
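
To illustrate (the keywords here are made up), nothing stops a page from simply restating its own content:

<meta name="keywords" content="fireplaces, cheap fireplaces, marble fireplaces, hearths">

That gives a search engine no information it couldn't get more reliably from the page text itself.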

Using meta description tags increases search engine traffic.

The meta description tag appears in place of an excerpt from the content in several major search engines. This undoubtedly increases the apparent quality and apparent relevance of a site in a search engine's result pages, and that could persuade more people to click on a link.
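
For what it's worth, the tag itself is a one-liner; here's a hypothetical example:

<meta name="description" content="Hand-carved marble fireplaces and hearths, made to order.">

The point is that you, rather than the search engine, get to choose the sentence searchers see under your link.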

Using keywords in <title> increases search engine ranking.

Unfortunately, I think this is very plausible. However, I frequently see page titles wrecked by keyword stuffing. There's a trade-off between providing a title that reflects the content of the page and making the page untidy and its search result listing unclear. For example, I would prefer to see

<title>Fireplaces - Mobstone Marble</title>

in search results to

<title>Mobstone Marble for cheap fireplaces, fireguards, hearths, gravestones and more - Call 01234 567890</title>

which is the sort of thing I've seen recommended by some SEO companies. I think there's an argument that as long as you're putting the terms "Fireplaces" and "Mobstone Marble" into the title, you've covered the relevant keywords for that page, plus the page is described clearly and unambiguously in search results.

Keywords in <h1> tags are more heavily weighted for relevance than keywords in <h2> tags and so on down to <h6> and any other tag.

This is definitely a good predictor of relevance, but it should be remembered that the idea generalises to all tags, not just <h1-6>. You could deduce a weighting scheme like this through statistical analysis of a corpus of HTML. In particular, you might find that <th> or <dt>, or maybe even just nested <b><i>, trumps <h6>.
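
As an illustration, consider the same keyword marked up in each of the following ways; a corpus-derived weighting might well rank some of the first three above the last:

<th>fireplaces</th>
<dt>fireplaces</dt>
<b><i>fireplaces</i></b>
<h6>fireplaces</h6>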

Putting keywords into bigger <h1-6> tags increases ranking for those terms.

This is the kind of thing that non-programmers/non-statisticians would assume is implied by the previous fact, but it is not. Tag weights certainly guide how a search engine apportions relevance, but it would be fairly naïve of an engine to count them as a simple boost in the rankings. Search engines strive to assess relevance from page content the way humans do, and bigger headings don't imply more relevance to humans. They catch your attention more, but you assess their relevance in a more holistic fashion.

Putting keywords into URLs makes a page appear more relevant for those keywords.

It's a fact that URLs are intended to be opaque: there's no reason to believe http://work-safe-images.org/racoon.png is not a JPEG image of a vagina. Humans don't treat them this way, of course; filenames would be useless if they didn't help us identify the content of a file. However, one problem with treating something defined as opaque as if it carried meaning is that you get a significant rate of misprediction. If you take a URL of /animals/racoon.html as lending credence to an assessment that it's a page about racoons, what happens when you discover it's not a page about racoons at all? In short, a search engine must assess the relevance of a page based on the page itself. Given that it has to do that, does it really get more information from the URL? Say URLs are relevant 70% of the time: something that is wrong 30% of the time and otherwise merely confirms what you already know is pretty worthless. I think friendly URLs are good from a usability perspective, and they confer a certain element of quality as far as I'm concerned, but I don't think it's very plausible that they affect rankings.

Putting keywords in image URLs makes the image appear more relevant for those keywords.

This is possibly a lot more plausible than it is for HTML pages. When spidering images you will really struggle to find enough information: since relevance can't be assessed directly from the image, the example above changes. The URL may still be wrong 30% of the time, but 70% of the time it carries information you could not otherwise get. Still, if I were writing a search engine, one way around this would be simply not to index images for which I couldn't gather enough information to assess relevance.
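
For context, this sort of (hypothetical) markup is roughly all an image spider has to go on: the filename, the alt text and whatever text surrounds the tag:

<img src="/images/marble-fireplace.jpg" alt="White marble fireplace with carved surround">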

Putting keywords in URLs and alt tags will get images to appear in combined searches and thereby boost conversions.

Images that are likely to produce conversions don't appear in combined search results. Have a look on Google now, if you want, and convince yourself of that. I suspect you would not be able to find any image that promotes one specific vendor. There must be some heuristic which ensures that images in combined search results are vendor-neutral encyclopaedia-type images. Googling "Britney Spears" gets you pictures of Britney Spears. Googling "Asus eee 700" doesn't get you pictures. I suspect there's a reason for that.

Providing buttons to "Bookmark this page" boosts conversions.

It's obvious that users who bookmark pages come back more than those who don't, but I doubt a great many people use these buttons, unless they visit frequently enough to know where the site's "Bookmark this page" button is better than they know where the bookmark star is in their own browser. That kind of user doesn't need the encouragement to come back.

Opening external sites in new windows encourages people to return when they have finished reading an external page.

There's no doubt that if you can keep your site in a background window, visitors can pick up where they left off when they close the foreground window. However, the most heavily used navigational tool is the back button, not desktop windows, and opening a new window disables the ability to use the back button to return. Instead of closing windows to return to what they were doing, users fall into a pattern of piling up windows and then using "Close Group" from the Windows taskbar, or simply closing a batch of windows at once. Which approach is best can be established by usability research, and on this issue usability analyst Jakob Nielsen is unambiguous: don't break the back button! Don't open new windows!
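
The markup difference is a single attribute (the URL here is a placeholder); the second form below is the one that opens a new window and so breaks the back button:

<a href="http://example.com/article">External article</a>
<a href="http://example.com/article" target="_blank">External article</a>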

Absolute URLs are better than relative URLs.

Software can convert between relative and absolute URLs as necessary, so the distinction only matters to broken software that needs to convert and doesn't. There is a lot of broken software in the world, but anything that's been tested against the wild wild web shouldn't fall into this trap. The amount of software broken in this way is negligible compared with the amount broken for hundreds of other reasons that you also need not support.
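
For example, on a hypothetical page at http://example.com/fireplaces/ these two links resolve to exactly the same resource, and any non-broken client treats them identically:

<a href="hearths.html">Hearths</a>
<a href="http://example.com/fireplaces/hearths.html">Hearths</a>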
