Tip: Tables containing people

One of the most common things to store in a database are details of people, be they users, contacts, or third parties you record details for. People are also far and away the most common source of mistakes in the design of relational database schemas and applications. Here are some pitfalls with handling names.

  1. First name, last name - Japanese people, amongst many others, disagree with the West on this ordering.
  2. Christian name - I'm not a Christian.
  3. Title as an enumeration - do you limit the choices of prefix titles people use? Have you included Dr., Sir, Prof., Prof. Sir, Rt. Hon., Cllr., Mayor, Cpt., Cmdr., M., Mme., Mlle., Sheik, Pope, HRH, President etc? Do you think you can compile an exhaustive list?
  4. Are two names required? Sting, Lulu, Bono, Pocahontas, Countess of Wessex.
  5. Guessing initialisms - If you've asked for my full name, don't assume you can address me as Daniel J. Pope or D. Pope. I very rarely use these forms. Joanne Kathleen Rowling is usually announced as J. K. Rowling. Similarly, I'd always initial myself DJP and never DP.
  6. Many US Americans (sadly lacking maps) require their generation after their surname for disambiguation. But you can't turn John Davis III into Davis III, John. You would write Davis, John or possible Davis, John III.
  7. Don't force Title Case - Mohammed al Fayed, Manfred von Karma, John Davis III and so on shouldn't be put into title case.
  8. "Abdhul" is not an informal way to address an Arab whose name begins with Abdhul ie. the first word of a given name is not necessarily an informal name.
  9. Have you stored aliases? Should your database be able to record the fact that Thomas C. Mapother IV is also Tom "Only scientologists can save the world" Cruise?
  10. I happen to have a peculiarity with my names. I am not called "Dan Pope". Dan is slightly diminuitive but "Pope" is slightly formal, and the combination shouldn't be used. People have often assumed they may refer to me this way and they are wrong. To confuse issues, in recent years I sometimes sign e-mails this way because I've found it's a good fit for the formality of the medium. But different criteria apply in the second or third person.

Names are a very individual thing and people feel strongly about how they should be addressed. How you index or sort names is your business and doesn't influence how people wish to be addressed. If you're indexing by surname, be aware that not everybody will identify themselves to you by the same surname you have indexed. If you're sorting, you should perhaps be aware that individuals' native collations may differ.

The tip, then, is this: store the forms of address you wish to use. Do not assume you can correctly derive other forms.

These are the variants I use:

Full Title: Daniel James Pope MA (Cantab.)

Full Name: Daniel James Pope

Name: Daniel Pope

Formal Name: Mr. Pope

Postal Name: Mr. Daniel Pope

Initials: DJP

Vocative: Daniel

Familiar Vocative: Dan

Nickname: Mauve

Preferred Username: mauve

application/javascript in IE

The application/javascript MIME type has stopped working in my copies of Internet Explorer (6 and 7). Both now silently ignore scripts linked with type="application/javascript", which is as far as I have since determined the behaviour most people experience. Indeed I've only found very sketchy reports of it working at all.

Because I had no idea I was unique in being able to experience scripts linked in the RFC-compliant and IANA-approved way, I have used this MIME type on quite a few sites. So it's good news, because I would never have spotted this otherwise. Happily, most of these sites are not yet deployed, and the rest only use Javascript for very trivial polish. There now begins a hunt to replace application/javascript with text/javascript on all my sites.

Obviously, I'm at a loss to explain why Microsoft has failed to offer a consistent platform for development and I feel quite aggrieved about it. If I hadn't already boycotted Internet Explorer this would be a pretty good reason to start.

On the assumption that some other piece of software has by accident or design disabled - or had previously enabled - application/javascript, this is my list of candidates:

  1. Microsoft Outlook 2007, installed for the first time on this computer this week.
  2. Word 2007, also installed this week for the first time.
  3. Some random Windows update I installed last weekend.
  4. Access 2007, installed two weeks ago (but I think I would have noticed sooner if that had caused this change).
  5. Removing something malicious with Spybot. I think it removed one or two dubious registry entries left over from an infection of Vundo/VirtuMonde last year, so it's unlikely but there's an outside chance of an effect.

Tip: Styling bullet lists

Bullet lists are a wonderful way of formatting both because they convey structure and because they serve to visually break up otherwise impenetrable paragraphs of prose.

It's possible to use a custom image for these bullets, by way of the CSS list-style-image property. This is a great way to carry your site's theme through into the content.

If you can't be bothered to draw custom images, you can at least make your bullets agree with the font using the list-style-type. With sans-serif fonts, square bullets usually look more professional. If you are using serif fonts - and this is a bad idea incidentally, because they aren't as legible on a screen - use circle or disc (ie. unfilled circular) bullets.

Inkscape 0.46 preview

Inkscape 0.46 is drawing close. There's a whole wealth of new features, some of which are incredibly exciting. I've not been using the SVN version since I replaced my desktop machine 5 months ago, so I'm only now discovering what these new features will let me achieve. It's a process of discovering not so much how each tool works, but identifing useful little ways of combining features.

To whet your appetite, here's a two-minute doodle demonstrating a turbulent displacement filter.

Watery Doodle

Update: here's a more realistic style using a specular lighting filter.

Watery Doodle 2

Tip: Don't forget the favicon

In contemporary web browsers, favicons are prominently displayed in the browser's UI. As such they have become an integral aspect of creating a site that looks professional. More importantly, they allow users to identify a website from a host of tabs or bookmarks. Alas only a third of web pages (based only on my bookmarks list, admittedly) offer a favicon.

What your favicon looks like is less relevant than having something distinctive. Favicons should trigger recognition, and need not stand up to examination.

Satchmo

Since I've been using the Django framework for web development I've been pointed in the direction of the Satchmo framework a few times. Satchmo is intended to be the framework for e-Commerce development built on top of Django. I've looked at Satchmo more than once trying to weigh up whether it would be a useful thing to use for projects, because an e-Commerce framework certainly should be useful. Satchmo doesn't satisfy this expectation.

I have written an e-Commerce shop in Django. It took three days, much of which was spent writing the backend - basket, order processing and so on. I am absolutely certain that I couldn't have done that with Satchmo. The first reason is that Satchmo isn't orthogonal; it's impossible to use Satchmo's basket/ordering system without using Satchmo's products front end. Front ends are very easy to write in Django and I needed something custom for this shop. Custom front ends are very hard to write in Satchmo - they involve writing custom extensions to the product model. The second reason is that Satchmo doesn't use Enterprise Design Patterns like Money.

Satchmo also provides code that is very much outside its remit as a framework, like this strange configuration property which embeds Google Analytics. Google Analytics code is trivial to paste into the templates should I want to do so. Why would I need this built into the platform?

Satchmo doesn't satisfy as a framework; it's more of an off-the-shelf application regardless of what the authors claim. Now I have Django web shop code, I won't be needing Satchmo.

Tip: Mock up your designs

A lot of hobbyist web designers start by creating the content for their web pages, then try to add graphics to "spice it up". This is a mistake. It's very hard to make a page look appealing in retrospect.

Start with a high-quality vector graphics tool - I recommend Inkscape - and properly design the website. To represent where to put content use Lorem Ipsum or simply rectangles to represent individual paragraphs. Once you are satisfied with a design, recreate it in HTML, exporting any graphical elements you need from the mockup.

You can use this approach even for sites that don't need graphics; it is beneficial for choosing colours, fonts and layouts as well as for creating fully graphical sites.

Be careful though - you need to design a mockup that you can successfully convert to HTML. Graphics programs are WYSIWYG but the web is not and this places constraints on what you can successfully produce. Beware in particular of how the site changes as your browser window or text size changes.

New for 2008, Web Tips

Since I started this blog I've used it to document my experiences and thoughts on the field of web development, both as a record for me, and as a resource for other web developers. As a result a lot of the content of this website deals in-depth with issues which, unless you work with the web professionally or are at least an very accompished amateur, you are unlikely to encounter.

From the new year, I will also be posting brief tips which I hope will be useful to a wider audience of amateur webmasters and hobbyists. I am going to aim to post a tip once a week, at least initially.

Please let me know what you think of my tips segment. I'm also happy to answer questions or defend my advice; drop me a comment if you want to engage in any of the aforementioned.

Web Standards in the Next Generation

With the news this week that Microsoft have a build of Internet Explorer that can pass Acid2, I wonder if I will be forced to eat my words when I suggested recently that Internet Explorer may be falling further behind with web standards, not closing the gap.

Well, we have an interesting opportunity to measure an aspect of that gap. A quick glance at Bugzilla shows that Gecko was able to generate a correct screenshot by 2006-04-17. Internet Explorer claimed correct rendering on 2007-12-12. The gap is 604 days for Gecko, but obviously, greater for other browsers who have been compliant for much longer.

If Internet Explorer 8 progresses anything like Gecko, there will be a large number of bugs still to fix. If Internet Explorer 8 progresses anything like Internet Explorer historically progresses, most of those bugs won't get fixed. In other words, I'll believe it when I see it. That might not be for 20 months and might not be available on Windows XP. In fact, if the same post on the IE blog they are keen to excuse themselves from commitment to specific web standards, offering only a general tone in favour of them but excusing themselves with respect to backward compatibility. Taken as a preamble to a compliant-looking Acid2 rendering, I take this to mean, "we may not deliver this in IE8". I think everyone hopes they will, but by comparison, some of the Acid2 patches could have been in Firefox 2 but weren't because Firefox 2 was built with a frozen earlier build of Gecko.

Meanwhile, Firefox 3 is drawing closer. My impression is that the gap between 2 and 3 is not huge, which (hear me out!) is because Firefox 2 was excellent and Firefox 3 struggles to improve upon it. The difference for users is relatively minor. Although the new approach to bookmarking is hugely refreshing I think many users, including my parents, just won't get it.

The difference for developers is significantly less marked - the difference between having functional support for a technology that isn't portable to IE and having good support for a technology that isn't portable to IE is not something that will revolutionise the web. In fact, looking at the Firefox 3 for Developers page, the changes are disappointing and even worrying. In some ways it's a return to the browser wars of the late 1990s when competition between browser vendors' extensions demolished the concept of web standards.

  • Support for aspects of HTML5 - there isn't even a first working draft of HTML5. Although it was the WHATWG spec before, complying with a specification this early will mean that the implementation may not conform to the final specification, by which time, developers will be relying on the non-standard behaviour.
  • APNG - APNG is a Mozilla-sponsored bastardisation of PNG to add animation. It doesn't subscribe to the contract of PNG (which expressly forbids animation) and it isn't negotiable properly because it hijacks PNG's MIME type, extension and magic. This spells very bad news for the PNG format. In future it will be impossible to tell if a PNG is animated or not, and of course all legacy software will believe not. Despite the best efforts of a number of people, myself included, but most particularly Glenn Randers-Pehrson, Mozilla refused to adopt amendments which would resolve the conflicting standards and the PNG group failed to ratify APNG as an official extension. Although APNG was an ad-hoc solution to offer animated UI elements in Mozilla, it is being released and promoted as the new web standard for animation and MNG support, although a superior and established format, has been canned.
  • Microformats - Firefox 3 builds-in support for Microformats, which could just as easily be a standalone Javascript library. There's no reason why this should be built-in, except to create a de-facto standard in an API which Mozilla controls. Moreover it promotes microformats as a de-facto standard, which I'm not comfortable with, because I think Microformats are an ugly hack in lieu of a proper solution.

Polymorphic Basket Pattern

I have a design pattern I use when designing an e-Commerce system. I call it the polymorphic basket and as the name suggests, it is a design pattern covering the basket. However the basket is just a special case of an order (one that is stored, typically, in a session rather than in the database), and this pattern also covers orders stored in the database.

The problem the pattern seeks to address is maintaining and pricing a list of items. The naïve solution is to record a reference to the SKU, and a number representing the quantity. This solution does not generalise well. In many shops, there is a mixture of products conforming to different conceptual models. While some products can be fully represented by an SKU code, some need bespoke customisation. It is generally an easy task to create mixed catalogues and customisation pages for these products. Data storage for these products is perhaps simplest using a Concrete Table Inheritance pattern. Even if a given merchant is only selling within one model, they may one day want to supplement their product line with perhaps just a few products sold under a different model.

The pattern is to maintain a OrderItemList of polymorphic objects conforming to a OrderItem interface. On adding an item from the catalogue, the details are copied into an OrderItem of an appropriate type. The OrderItem must encapsulate a copy of catalogue data, not references (in case that data changes or is deleted). There are no situations I have come across where we need to query the database based on the contents of the OrderItemList, so persisting the OrderItems is an ideal case for using serialisation (the Serialized LOB pattern).

The nuances of the OrderItem interface come down to experience. The OrderItemList must also be able to correctly identify and handle duplication of OrderItems - if a compatible item is added, do they stack, remain as duplicates, or refuse to be added? If items are stackable, can shoppers change the quantity they wish to purchase? Can the maximum quantity purchasable vary? An OrderItem must be queried for a price, but how is this price affected by discounts and voucher codes? How does each item affect postage and packing options and costs?

In practice even this system is not sufficient because orders are not necessarily a flat list. In some cases, OrderItems must contain child OrderItems. These are things like add-on packs and upgrades which are conceptually self-contained, but can only be ordered alongside a parent item. Child items are priced seperately but grouped with the parent item for the purposes of removing the item from the basket or changing the quantity.

I include the following example list of items (derived from experience) which a flexible e-Commerce ordering system should be able to handle within a single basket:

  • DVDs - for each product, one SKU and one price (many similar examples).
  • Clothing - for each product, SKUs corresponding to both colour and size. Some sizes may have different prices (many similar examples).
  • Groceries - for each product, SKUs corresponding to different pack sizes at different prices. (many similar examples).
  • Computers - each SKU may be upgraded with a custom combination of add-ons, at extra cost. Some of these may be available as standalone products, other times not (likewise all configurable but mass-produced goods).
  • Rope - pricing is based on the length of rope to be cut from the drum at a different rate per SKU. Users might choose length instead of quantity (likewise textiles).
  • Kitchens/Worktops - each SKU corresponds to a finish, but pricing is complicated, based on how many boards need to be cut to satisfy a layout, given tolerances for carpentry and mitres, and the labour cost of performing that carpentry (likewise anything bespoke).
  • Antiques - each antique can only be purchased by a single buyer and must then be removed unless/until the sale falls through (likewise anything second-hand).
  • Samples - given away free, but limited to one of each SKU per customer. Because they are free the normal delivery charges may not apply, and the checkout might have to be cut short because payment information is not necessary. It may not even be worth combining these into the standard order process, although it might save the merchant some overhead if the customer simply wants a few samples to be chucked in when their real order is dispatched.