Packaging Drupal translations: a proposal

A brand new Drupal localization site seems to be coming to life any day now, but we still need to work out some issues, such as how the translations will be packaged to support all of our existing deployment models and maybe some new ones.

We've been thinking and talking about this for some time. This post lays out my specific proposal for solving the packaging issue.

If you want to see a working implementation of translations downloaded on the fly by the installer, check out the latest (dev) version of Open Atrium. As I explain below, this is a limited use case (a Drupal distribution with a known set of modules), so we've taken some shortcuts there. Still, it is a nice proof of concept of how you can automatically download, import, and update translations. Yeah, it works!

For the drupal.org translations server, our requirements may be a bit more demanding:

The requirements

  • Downloading and importing translations on the fly from the Drupal installer.
  • Automatically updating site translations for enabled modules.
  • Being able to get translations for single modules, as we don't want to fill up the string tables with unneeded stuff.
  • Not needing additional tools or libraries on the client site, and working on small, cheap web hosting (limited performance and memory).
  • Allowing full package downloads and repackaging them for some Drupal distributions.
  • Supporting automated deployment tools like Drush or Aegir, which basically means installs and updates can be scripted.

That looks complex enough, but if we think only about the server side, we can translate our requirements into these:

Simple specs

  • Publish plain po files, so no uncompressing tools are needed, and not even a separate download step, as you can open the remote file and import it straight into the database.
  • Publish easily discoverable information about which translations are available and when they were last updated.
  • Publish many small files, one per module per language, so we can retrieve and parse just the chunks we need for our site. This also lets us break the process down into small steps to save time and memory.

The Open Atrium deployment system already covers the first two, but we don't have 3000 modules, so a single package containing all translations works for us.

The Drupal case is a bit more difficult, but not impossible. We may have thousands of sites hitting our server daily, asking for translations that may be updated weekly for 3000 packages with many versions each and almost 100 languages... This really calls for a simpler and faster solution that won't take down our server with unneeded database queries, like this:

The solution

Just publish the static po files with some known folder structure.

Our translation files, periodically exported by the server, can then be arranged this way or similar:

core/package/release/language/type/name.po

For the Spanish translation of the system module in Drupal core 6.11, this would look like:

6.x/drupal/6.11/es/modules/system.po
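To make the layout concrete, here is a minimal Python sketch that builds such a path from its parts. The function name and its parameters are made up for illustration; only the folder order comes from the structure above.

```python
from posixpath import join


def translation_path(core, package, release, language, type_, name):
    """Build the relative path of one po file in the proposed layout.

    Hypothetical helper: the argument order simply mirrors
    core/package/release/language/type/name.po
    """
    return join(core, package, release, language, type_, name + ".po")
```

Calling translation_path("6.x", "drupal", "6.11", "es", "modules", "system") gives back the example path shown above.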

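For downloading full packages/translations for off-line installs or other deployment tools, we could serve tar.gz archives compressed on the fly by Apache. Note that mod_gzip by itself only compresses individual responses, so bundling a whole folder into a tarball would also need a small handler or a periodic packaging step.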

Then, if you want to download the whole Spanish translation of Drupal core 6.11, the path would look like:

6.x/drupal/6.11/es.tar.gz

Or, if you want all available translations for the same package, it would be:

6.x/drupal/6.11.tar.gz
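If the archives were built by a packaging script instead of on the fly, the per-language one could be produced roughly like this. This is only a sketch under assumptions: the function name and the root argument are hypothetical, and it expects the po files to already sit on disk in the folder structure described above.

```python
import tarfile
from pathlib import Path


def package_language(root, core, package, release, language):
    """Pack every po file under one language folder into <language>.tar.gz.

    Hypothetical helper; `root` is wherever the exported po files live.
    """
    release_dir = Path(root) / core / package / release
    language_dir = release_dir / language
    archive = release_dir / (language + ".tar.gz")
    with tarfile.open(str(archive), "w:gz") as tar:
        for po_file in sorted(language_dir.rglob("*.po")):
            # Store names relative to the release folder,
            # e.g. es/modules/system.po
            arcname = po_file.relative_to(release_dir).as_posix()
            tar.add(str(po_file), arcname=arcname)
    return archive
```

The archive lands next to the language folder, so it ends up at the same path a client would request, such as 6.x/drupal/6.11/es.tar.gz.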

The '6.x' at the beginning lets our filesystem layout evolve with future Drupal versions, so the structure may be different under '7.x'.

Yes, but... what about discovery and update?

That is as easy as customizing Apache's directory index listings so they produce easily parseable XML instead of HTML. See mod_autoindex.
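As an illustration of what a client could do with such an index, here is a small Python sketch that finds the files needing a download. The XML format is purely an assumption (a list of file elements with name and modified attributes); nothing in this proposal fixes the actual format yet.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def outdated_files(index_xml, local_times):
    """Return names of remote files that are missing locally or newer.

    Assumes a hypothetical index such as:
      <index><file name="system.po" modified="2009-05-01T12:00:00"/></index>
    `local_times` maps file names to the datetime of the last local import.
    """
    stale = []
    for entry in ET.fromstring(index_xml).iter("file"):
        name = entry.get("name")
        remote = datetime.fromisoformat(entry.get("modified"))
        local = local_times.get(name)
        if local is None or remote > local:
            stale.append(name)
    return stale
```

Only the names returned need to be fetched again, so an update check costs one small static file per folder.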

Then, what about the client side?

Since PHP can open remote files about as easily as local files (as long as allow_url_fopen is enabled), this is really a non-issue. Our installer would work like this when a new module is installed:

  1. Search for a local file for this module and language
  2. If there is none, search for a remote file
  3. Just import the file (either local or remote) contents into the locale tables
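The steps above can be sketched as follows. This is Python for illustration only, not the actual installer code: the function names, the local folder layout, and base_url (assumed to point at a release folder on the translations server) are all hypothetical.

```python
from pathlib import Path
from urllib.request import urlopen


def fetch_remote(url):
    """Download a remote po file and return its text (assumes UTF-8)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def load_translation(module, language, local_dir, base_url, fetch=fetch_remote):
    """Return po contents for a module, preferring a local file.

    Hypothetical layout: local files live at <local_dir>/<language>/<module>.po
    and remote ones at <base_url>/<language>/modules/<module>.po
    """
    # Step 1: search for a local file for this module and language
    local_file = Path(local_dir) / language / (module + ".po")
    if local_file.is_file():
        return local_file.read_text(encoding="utf-8")
    # Step 2: no local file, so fall back to the remote one
    url = "{}/{}/modules/{}.po".format(base_url.rstrip("/"), language, module)
    return fetch(url)
```

Step 3, importing the returned contents into the locale tables, is left to the caller. The fetch argument is there so the download can be swapped out, for example in tests or behind a cache.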

So this is the idea: dead simple and lightning fast on the server side. Then we just need to build some good packaging scripts.