Interleaving languages

Website Migration Handbook

Many of your users may speak multiple languages. If so, you may want to implement lists that prefer one language, but show another language if the content is not available in the preferred language.

Let me say right off that you may decide not to implement this functionality in a system. That said, this is a requirement that, if not built into the system from the start, may never happen. So you'll want to think this through. If you decide to implement this, make sure not to lock yourself into a system that will not allow it (and be careful about assuming that you could rewrite or extend things later to support it).

Let's start with three press releases, one of which is in two languages:

  • "David goes to market" (in English) and "Dawud mache souk" (the same press release, also about David going to the market, in phonetic Chadian Arabic)
  • "Dawud nisid kalam Arab" (a press release only in Chadian Arabic)
  • "David looking for better examples" (only in English)

The single-language lists are easy and obvious: English only ("David goes to market" and "David looking for better examples") and Chadian Arabic only ("Dawud mache souk" and "Dawud nisid kalam Arab"). But what about the two press releases that exist in just one language? If I'm on a site in Chadian Arabic, shouldn't the English-only press release be presented too (or at least offered as an option)? That way you at a minimum indicate that there are more relevant press releases and, in the best case, the reader might be able to work through the English and get the information. Of course, this may only make sense when the English content doesn't totally swamp the Chadian Arabic (so, for instance, it may be relevant on a Chad-specific site). So the "prefer Chadian Arabic, but show English if not available in Chadian Arabic" list would look like this:

  • Dawud mache souk
  • Dawud nisid kalam Arab
  • David looking for better examples (English)

If you don't decide up front that you need this type of list (or you implement an interim solution and never get around to the above), you'll probably end up with something like this if you later decide you want both languages (note that the first press release is duplicated, appearing once in English and once in Chadian Arabic):

  • David goes to market (English)
  • Dawud mache souk
  • Dawud nisid kalam Arab
  • David looking for better examples (English)

This wouldn't be the end of the world, but it's not ideal.
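
To see where the duplication comes from, here is a minimal sketch, assuming a hypothetical flat store in which every language version is its own independent record. The field names, the `both_languages` helper, and the language labels `en` and `shu` are placeholders for illustration, not part of any particular system.

```python
# Hypothetical flat store: one independent record per language version,
# with nothing linking a translation to the press release it translates.
items = [
    {"language": "en", "title": "David goes to market"},
    {"language": "shu", "title": "Dawud mache souk"},
    {"language": "shu", "title": "Dawud nisid kalam Arab"},
    {"language": "en", "title": "David looking for better examples"},
]


def both_languages(items, site_language, labels):
    """List every record, labelling those not in the site language."""
    lines = []
    for item in items:
        if item["language"] == site_language:
            lines.append(item["title"])
        else:
            lines.append(f'{item["title"]} ({labels[item["language"]]})')
    return lines


flat_list = both_languages(items, "shu", {"en": "English"})
for line in flat_list:
    print(line)
```

Because the store has no way of knowing that two of the records are the same press release, both versions of it end up in the list.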

From a technical perspective, the ideal interleaving would probably involve some sort of parent/child relationship (for instance, perhaps there's an abstract, language-less object that's about David going to market, with two children, one tagged as English with the title "David goes to market" and another tagged as Chadian Arabic with the title "Dawud mache souk"). The more obvious implementation, which would normally only support the duplicated list, would be to just have a mass of objects that indicate language and title but have no parent/child relationship. Of course, these could be related to each other (with some sort of "translation-of" property, for example), but it would then be computationally expensive to get the non-duplicated list from that structure as-is (without adding some sort of sophisticated index or the like).
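
The parent/child idea might look something like the following sketch. It assumes a hypothetical in-memory model: the `PressRelease` and `Translation` classes, the `preferred_list` helper, and the language labels `en` and `shu` are all illustrative names, and a real system would load these objects from its own storage.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Translation:
    language: str  # placeholder language label, e.g. "en" or "shu"
    title: str


@dataclass
class PressRelease:
    """Abstract, language-less parent; each child is one translation."""
    translations: list[Translation] = field(default_factory=list)

    def pick(self, preferred: str, fallbacks: list[str]) -> Translation | None:
        """Return the preferred translation, else the first available fallback."""
        by_language = {t.language: t for t in self.translations}
        for language in [preferred, *fallbacks]:
            if language in by_language:
                return by_language[language]
        return None


def preferred_list(releases, preferred, fallbacks, labels):
    """One entry per parent; fallback entries carry a language label."""
    lines = []
    for release in releases:
        chosen = release.pick(preferred, fallbacks)
        if chosen is None:
            continue  # no usable translation for this reader
        if chosen.language == preferred:
            lines.append(chosen.title)
        else:
            lines.append(f"{chosen.title} ({labels[chosen.language]})")
    return lines


releases = [
    PressRelease([Translation("en", "David goes to market"),
                  Translation("shu", "Dawud mache souk")]),
    PressRelease([Translation("shu", "Dawud nisid kalam Arab")]),
    PressRelease([Translation("en", "David looking for better examples")]),
]

titles = preferred_list(releases, "shu", ["en"], {"en": "English"})
for title in titles:
    print(title)
```

Because the list is built by walking the parents rather than the translations, each press release appears exactly once, whichever language ends up being shown.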

Another possible technical implementation would be one piece of content that has blocks within it for different translations (for instance, the first press release above is one piece of content but with two translation blocks in it).
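
As a rough sketch of this second approach, assuming a hypothetical structure in which each piece of content holds a `blocks` mapping keyed by a language label (the key names, the `title_for` helper, and the labels `en` and `shu` are illustrative only):

```python
# Hypothetical content items: one item per press release, each carrying
# its translation blocks keyed by a placeholder language label.
press_releases = [
    {"blocks": {"en": {"title": "David goes to market"},
                "shu": {"title": "Dawud mache souk"}}},
    {"blocks": {"shu": {"title": "Dawud nisid kalam Arab"}}},
    {"blocks": {"en": {"title": "David looking for better examples"}}},
]


def title_for(item, preferred, fallback):
    """Return (title, language) from the preferred block, else the fallback block."""
    blocks = item["blocks"]
    for language in (preferred, fallback):
        if language in blocks:
            return blocks[language]["title"], language
    return None  # neither language is available for this item


listing = [title_for(item, "shu", "en") for item in press_releases]
for entry in listing:
    print(entry)
```

As with the parent/child model, the list iterates over pieces of content rather than over translations, so the press release that exists in both languages is listed only once.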

First published 11 November 2007