Setting up Gettext

In this tutorial, you will learn:

  • How to translate your program's source code using Gettext

  • How to translate other files using Intltool

This tutorial is part of the Setting up a real-life GTK application series. If you don't want to follow along with the previous parts, simply copy the app-skeleton3 directory from the tutorial's code examples. Or, you can start from the beginning.

There is one more piece of infrastructure to add before we start writing real code: support for translating the program into other languages. You might think that translation is something to be done only after the application is finished, but it's much easier to keep translation in mind while you are writing the application.

Do you have to learn thirty languages to do this? No. The way it usually works in an open source project is that you write the program in English (or, strictly speaking, in the "C locale"). Another tool, gettext, extracts all the words and phrases from the program that will be displayed to users, and puts them in a translation template, or .pot file.

You then recruit translators to your project (this is the hard part) and they translate some or all of the phrases in the translation template, producing a .po file. Generally, you give your translators commit access to your source code repository, so they just commit the .po files whenever they are done. The make install process takes care of installing the translations in the proper places in the user's system.

When the program is started, it looks at the value of the LANG environment variable. If this contains a language code for a language which the program is available in, then the program will display the translated phrases for that language. If not, the program will simply be in English. Also, if the translator did not finish translating the entire translation template, then the program will be in the other language as much as possible, and any untranslated phrases will still be in English. This allows translation to be an ongoing process, and translators can contribute as much as they have time for. Also, if you add a new message in a new version of the program, and the translator is on vacation, it won't mean the whole translation is invalidated.
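
The fallback to English described above can be seen at the level of the C library. The following is a minimal sketch, assuming a GNU libintl such as the one built into glibc; the helper lookup_message() and the domain name used in the usage note are made up for illustration. It uses dgettext() so that the text domain can be passed explicitly.

```c
#include <libintl.h>
#include <locale.h>

/* Hypothetical helper: look up msgid in the given text domain.
 * When no translation is found (no catalog for the language, or the
 * message is missing from the catalog), dgettext() returns the msgid
 * itself, so the caller always gets a usable string. */
static const char *
lookup_message (const char *domain, const char *msgid)
{
  setlocale (LC_ALL, "");   /* pick up LANG and related variables */
  return dgettext (domain, msgid);
}
```

Calling lookup_message ("no-such-domain", "Hello World") from main() gives back the English string, because no catalog exists for that domain. This is the same mechanism that leaves untranslated phrases in English in a partly translated program.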

Using gettext

First, copy the project that we have so far to a new app-skeleton4 directory, and change the version number if you like. To add gettext to our build system, add the following lines to the "Shopping List" section of configure.ac:

AM_GNU_GETTEXT([external])
AM_GNU_GETTEXT_VERSION([0.18.1])

Also, add po/Makefile.in to the AC_CONFIG_FILES call in the "Output" section, and po to the SUBDIRS variable in Makefile.am. (The po subdirectory is where the gettext files are placed by default.)

It is slightly confusing that we should put a file named Makefile.in into AC_CONFIG_FILES. Didn't configure transform Makefile.in into Makefile? In this case, gettext installs a Makefile.in.in which we transform into Makefile.in. Then, code generated by the AM_GNU_GETTEXT macro transforms the Makefile.in into Makefile automatically. One hopes that things won't get any more convoluted in the future, with tools that install a Makefile.in.in.in.in.

To install the gettext files, simply run the bootstrapping command: autoreconf -i. Then, before running configure, there are a few files we have to add to the po directory. The first is called POTFILES.in. This is a list of files that contain translatable messages for gettext to extract. The next file is called LINGUAS. This file contains a space-separated list of all available translations. Create both files and leave them empty for now; we will fill them in later.

The last file to create is called Makevars. This file contains some customizable variables that get inserted into the po directory's Makefile. After running autoreconf, there should be a template for this file called Makevars.template. Copy this to Makevars and edit it. It doesn't need much changing; I changed the COPYRIGHT_HOLDER variable to my name, and the MSGID_BUGS_ADDRESS to $(PACKAGE_BUGREPORT), so that translators report bugs in the messages to the same place that regular bugs should be reported. (In a large project, you might want to separate these.)

Autoconf sets the PACKAGE_BUGREPORT variable to the bug report address that we specified in the call to AC_INIT in configure.ac.

Next, we must mark all the user interface strings in hello-world.c for translation. First, add at the top of the file:

#include <glib/gi18n.h>

This is a special GLib header file where the interface to gettext is defined. (i18n stands for "internationalization", because there are eighteen letters between the initial I and the final N.) In particular, this header defines the _() macro with which we mark our strings. (If gettext support is disabled, _() does nothing; otherwise it is an alias for the gettext() function. It is called _ so as to save typing and not to distract the eye while reading the code.)
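
To make the role of the macro concrete, here is a simplified sketch of what such a header provides, written against plain libintl so that it compiles without GLib. The N_() macro, the button_labels array and the translated_label() helper are additions for illustration and do not appear in the tutorial's code.

```c
#include <libintl.h>

/* Simplified sketch of the marker macros:
 * _()  translates a string at run time;
 * N_() only marks a string for extraction without translating it,
 *      which is useful in static initializers, where a function
 *      call is not allowed. */
#define _(String)  gettext (String)
#define N_(String) (String)

/* Marked for extraction here, translated later at the point of use */
static const char *button_labels[] = { N_("Hello"), N_("Hello World") };

/* Hypothetical helper: return the translated form of a label */
static const char *
translated_label (int index)
{
  return _(button_labels[index]);
}
```

With no message catalog loaded, translated_label (0) simply returns the original "Hello".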

Then, add the following code to the top of the main() function, at line 40:

/* Set up internationalization */
setlocale (LC_ALL, "");
bindtextdomain (PACKAGE, LOCALEDIR);
textdomain (PACKAGE);

This sets up the program to read the translated message files. Finally, add the following line to Makefile.am so that the program knows where to find the message files:

AM_CPPFLAGS = -DLOCALEDIR=\""$(localedir)"\"
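
Put together, the setup calls and the LOCALEDIR definition interact roughly as in the sketch below. It is not the tutorial's hello-world.c: the init_i18n() helper and the fallback values for PACKAGE and LOCALEDIR are assumptions so that the fragment compiles on its own, and the call to bind_textdomain_codeset() is an optional extra that is often used with GTK, which expects UTF-8 strings.

```c
#include <libintl.h>
#include <locale.h>

/* In the real project PACKAGE comes from config.h and LOCALEDIR from
 * the -DLOCALEDIR flag; the fallbacks below are assumptions so that
 * this sketch compiles standalone. */
#ifndef PACKAGE
#define PACKAGE "app-skeleton"
#endif
#ifndef LOCALEDIR
#define LOCALEDIR "/usr/local/share/locale"
#endif

/* Hypothetical helper wrapping the setup calls.
 * Returns the directory that the text domain is now bound to. */
static const char *
init_i18n (void)
{
  const char *dir;

  setlocale (LC_ALL, "");                      /* use the user's locale */
  dir = bindtextdomain (PACKAGE, LOCALEDIR);   /* where the catalogs live */
  bind_textdomain_codeset (PACKAGE, "UTF-8");  /* optional, not in the tutorial */
  textdomain (PACKAGE);                        /* default domain for gettext() */
  return dir;
}
```

Calling init_i18n() once at the top of main() is enough; after that, every _() lookup searches the catalogs installed under the returned directory.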

There are four strings that we need to mark for translation: one in print_hello(), one in on_delete_event(), the argument of gtk_window_set_title() in main(), and the argument of gtk_button_new_with_label() in main(). Surround these strings with a call to _(), so that "Hello World" becomes _("Hello World").

Note that not all the strings should be translated! For example, the names of the signals in g_signal_connect() and the icon name in gtk_window_set_icon_name() should be left as they are. They are not displayed to the user; instead, they have an internal meaning to the program. If we were to translate the destroy signal into German, then the program would stop working, since it would try to connect to the zerstören signal, which of course it has never heard of.
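
The distinction can be sketched in plain C, without GTK. The functions below are hypothetical stand-ins: signal_exists() plays the role of the toolkit's signal lookup, which only recognizes the exact internal name, while window_title() returns text meant for the user.

```c
#include <libintl.h>
#include <string.h>

#define _(String) gettext (String)

/* Hypothetical stand-in for a signal lookup: only the exact
 * internal name is recognized. */
static int
signal_exists (const char *signal_name)
{
  return strcmp (signal_name, "destroy") == 0;
}

/* User-visible text goes through _() and may come back translated. */
static const char *
window_title (void)
{
  return _("Hello World");
}

/* Internal identifiers are passed as they are, never through _(). */
static int
connect_destroy (void)
{
  return signal_exists ("destroy");
}
```

Here connect_destroy() succeeds in every locale, whereas a lookup of a translated name such as "zerstören" would fail.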

Last of all, we need to add our source file with the strings marked for translation to POTFILES.in, so gettext knows to look there:

src/hello-world.c

If we now change to the po directory and run make update-po, an app-skeleton.pot should be generated. This is the template that translators can base their translations on.

Translating the program

Now, we will translate the program into another language and test it. If you speak a language other than English, why not try translating it into that language? As an example, I will use Dutch (language code nl) here.

The first thing to do is to create a .po file for your chosen language. Change to the po directory and type:

msginit -l nl

This will create a file named nl.po, based on the translation template. You can edit it with your favorite text editor, although for bigger projects you might want to use a dedicated translation editor such as Virtaal or Poedit. Translate the messages as you see fit, or use my translations:

app-skeleton4/po/nl.po

#: src/hello-world.c:11
msgid "Hello World\n"
msgstr "Hallo Wereld\n"

#: src/hello-world.c:27
msgid "delete event occurred\n"
msgstr "delete event heeft plaatsgevonden\n"

#: src/hello-world.c:52
msgid "Hello"
msgstr "Hallo"

#: src/hello-world.c:76
msgid "Hello World"
msgstr "Hallo Wereld"

After translating, add the language code to the LINGUAS file. For the Dutch example, the file contains just:

nl

Then change to the po directory and run make update-po again. It is important to do this whenever a translator gives you an updated .po file, otherwise the new translations will not get installed. Then install the program once again using make install.

You should now be able to run the program in Dutch (or whatever language you chose). If your system locale is set to Dutch, then simply running the program should work. If you have your system in English or a different locale, you can test the translation by running the program with the LANG environment variable set to nl. (Just type LANG=nl before the program name on the command line. On many systems the full locale name is needed instead, for example LANG=nl_NL.UTF-8, and that locale must be installed.)

Translating non-source code

The gettext program has one shortcoming: it only works in program code, where it can call the gettext() function to fetch its translations. Data files, such as a GUI definition in XML, or the desktop file, don't get translated. This is why intltool was invented. It takes translatable strings from these files and adds them to the translation template. Then, when they have been translated, it merges the strings back into the data files. We will now add intltool to our program and use it to translate the desktop file.

First of all, intltool has its own bootstrapping program, called intltoolize. It needs to be run after autoreconf. Since using intltool means we can't bootstrap the build system simply by using autoreconf -i anymore, we will make a bootstrap script. Create a file called autogen.sh, make it executable, and write in it:

#!/bin/bash

echo "Regenerating autotools files"
autoreconf --force --install || exit 1

echo "Setting up Intltool"
intltoolize --copy --force --automake || exit 1

We need the --force options because autoreconf and intltoolize overwrite each other's po/Makefile.in.in files. Go ahead and run the bootstrap script now.

Then, add to the "Shopping List" section in configure.ac:

IT_PROG_INTLTOOL([0.40])

Below the "Libraries" section, add a new section called "Variables." To use intltool, we need to define a variable called GETTEXT_PACKAGE that contains the name of the program as it is known to gettext:

# Needed by intltool
GETTEXT_PACKAGE=${PACKAGE_TARNAME}
AC_SUBST([GETTEXT_PACKAGE])

The call to AC_SUBST makes sure that any occurrences of @GETTEXT_PACKAGE@ in the makefiles are replaced with the contents of the variable. Now run make again.

Next, we need to tell intltool which data files contain translatable strings. There is actually a script, intltool-prepare, that will automate that for us. Change to the project root directory and run intltool-prepare. You will notice that it automatically adds the desktop file to po/POTFILES.in and changes the install rule for the desktop file in Makefile.am.

It also creates a new file, app-skeleton.desktop.in. If you are using a source control system for your project, remove the app-skeleton.desktop from your repository as the script suggests; it is now automatically generated from app-skeleton.desktop.in, which you should add to your repository. From now on, when you want to edit the desktop file, edit app-skeleton.desktop.in instead.

To translate the desktop file, go to the po/ directory and run make update-po once more. You will see two new untranslated strings added to the .pot and .po files. Translate them yourself, or use my Dutch translation:

#: ../app-skeleton.desktop.in.h:1
msgid "A sample application from the Advanced GTK+ Techniques tutorial"
msgstr "Een voorbeeldapplicatie uit de cursus Gevorderde GTK+ Technieken"

#: ../app-skeleton.desktop.in.h:2
msgid "App Skeleton"
msgstr "Skeletapplicatie"

Then run make update-po once more, followed by make. You can look inside the generated app-skeleton.desktop file to check that your translations have been merged. If you now run make install, the translated file will be installed.

With this, all our build infrastructure is in place. Now we can begin writing some real code. You are ready to start on the next tutorial, Writing a real-life GTK application.