Distributed Website Thumbnailing

Thumbnail screenshots of websites seem to improve web usability enormously. For me, seeing a thumbnail triggers clearer and faster recognition than a domain or a name alone. Favicons also help when I've used the site enough to become accustomed to it. The GooglePreview Firefox extension is a favourite of mine for this reason.

There are now quite a few services that offer free website thumbnails. They are pretty good, and I recommend using them, but they need a huge amount of bandwidth to load the websites and serve the thumbnails, a lot of CPU time to render the pages, and a lot of storage to keep the results. All of this costs money, so the companies running them place a variety of restrictions on what can be done with the thumbnails. You also very frequently find that a thumbnail doesn't exist yet or no longer exists, and the service serves up some advertising instead, which is bad for usability. Perversely, thumbnails get purged quickest for infrequently visited sites, which are exactly the sites that are hardest to remember.

If Google or another large search engine entered this market they could make a fast and free service that would be self-supporting. They are the only people making vast amounts of money from enhancing the web, because a better web drives more business through their main search engine.

However, in the absence of that, I wonder if we shouldn't turn to distributed technologies to make the business of understanding where a link takes you an innate part of web standards, rather than a bolt-on service controlled by a vendor.

You could imagine a web standard similar to the favicon system, where thumbnails of the website are available at standard sizes, say 128x128 and 256x256, at /thumbnail128 and /thumbnail256. But this places the onus on the publisher to create the screenshots and keep them up to date. Even worse, it's not a great idea to trust the websites themselves: shock sites, porn sites or scam sites would benefit from publishing a misleading thumbnail to lure users in.
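
To make the favicon-style idea concrete, here is a minimal Python sketch of how a client might derive the thumbnail address for any page on a site. The /thumbnail128 and /thumbnail256 paths are the hypothetical convention described above, not an existing standard, and the function and constant names are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Sizes suggested in the text. The /thumbnail<size> path scheme is the
# hypothetical convention from this post, not an existing web standard.
THUMBNAIL_SIZES = (128, 256)


def thumbnail_url(page_url: str, size: int) -> str:
    """Return the well-known thumbnail URL for the site hosting page_url."""
    if size not in THUMBNAIL_SIZES:
        raise ValueError(f"unsupported thumbnail size: {size}")
    parts = urlsplit(page_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {page_url!r}")
    # Like /favicon.ico, the thumbnail lives at the site root, so the
    # page's own path, query and fragment are discarded.
    return urlunsplit((parts.scheme, parts.netloc, f"/thumbnail{size}", "", ""))


if __name__ == "__main__":
    print(thumbnail_url("https://example.com/blog/post?id=3", 128))
```

Note that nothing here addresses the trust problem: the client simply fetches whatever image the site chooses to publish at that path.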

One solution might be a distributed network for website thumbnails. A lot of research and development has gone into distributed hash tables (DHTs), particularly to improve the performance and decentralisation of peer-to-peer networks. A client could look up a URL in a DHT to obtain a URL for a thumbnail of that website.
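
As a rough illustration of the lookup, here is a toy, in-memory Python sketch. It assumes a Kademlia-style design (SHA-1 keys, XOR distance, each entry stored on the few nodes closest to its key); the post doesn't commit to any particular DHT, and the class and function names are invented for the example. A real network would add routing, node churn handling and some way of verifying entries.

```python
import hashlib


def dht_key(url: str) -> int:
    """Map a URL to a 160-bit integer key by hashing it with SHA-1."""
    return int.from_bytes(hashlib.sha1(url.encode("utf-8")).digest(), "big")


def closest_nodes(key: int, node_ids: list[int], k: int = 3) -> list[int]:
    """Return the k node IDs nearest to key under the XOR metric."""
    return sorted(node_ids, key=lambda node_id: node_id ^ key)[:k]


class ToyDHT:
    """In-memory stand-in for a real network: each node is just a dict."""

    def __init__(self, node_names):
        # Node IDs share the key space, so they are derived the same way.
        self.nodes = {dht_key(name): {} for name in node_names}

    def put(self, page_url: str, thumbnail_url: str) -> None:
        """Store the thumbnail URL on the nodes closest to the page's key."""
        key = dht_key(page_url)
        for node_id in closest_nodes(key, list(self.nodes)):
            self.nodes[node_id][key] = thumbnail_url

    def get(self, page_url: str):
        """Ask the closest nodes in turn; return None if nobody has it."""
        key = dht_key(page_url)
        for node_id in closest_nodes(key, list(self.nodes)):
            if key in self.nodes[node_id]:
                return self.nodes[node_id][key]
        return None


if __name__ == "__main__":
    dht = ToyDHT([f"node-{i}" for i in range(8)])
    dht.put("https://example.com/", "https://thumbs.example.net/abc.png")
    print(dht.get("https://example.com/"))
```

The sketch hashes the URL exactly as given, so "https://example.com" and "https://example.com/" would land on different keys; a real scheme would need to agree on URL normalisation first.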

There is also a way of generating thumbnails in a distributed manner: web browsers. There are so many web browsers visiting so many websites that if you could tap into only a tiny fraction of them - with, say, a Firefox extension that generates and uploads thumbnails using <canvas> (assuming you can work around the privacy implications) - you could get good coverage quickly. Because it piggy-backs on the normal web-browsing experience, it uses very little bandwidth beyond what users were already using.

