Newcomer
Often it is not acceptable to provide an application in just one language; it must be possible to provide the software in many languages. But the problem does not end with translated text. Besides simple text, one must also handle date/time and number formats, for example, since they are specific to regions and languages as well. This chapter gives the reader an overview of the utilities that Zope 3 provides to solve these issues.
One of the most severe issues of Zope 2 was the lack of multi-language support. This significantly limited the adoption of Zope outside English-speaking regions. Later, support was partially added through add-on products like Localizer and ZBabel, which allowed translation of DTML and Python code (and therefore ZPT). However, these solutions could not overcome the great limitation that Zope 2 is not Unicode-aware. Several workarounds to the problem were provided, but none of them was a solid solution.
Once the internationalization effort was initiated and the i18n Page Template namespace was developed for Zope 3, it was backported to Zope 2 and a Placeless Translation Service product was provided by the community (http://www.zope.org/Members/efge/TranslationService).
When the Zope 3 development was opened to the community, it was realized that internationalization is one of the most important features, since Zope has a large market in Latin America, Asia and especially Europe. Therefore, the first public Zope 3 sprint in January 2002 was dedicated to this subject. Furthermore, Infrae paid me for two weeks to work on Zope 3’s internationalization and localization support. Since then I have maintained and updated the internationalization and localization support for Zope 3.
In the previous section I used the terms internationalization and localization, but what do they mean? Internationalization, often abbreviated as I18n, is the process of making software translatable. This includes preparing and marking strings for translation, providing utilities to represent data (like dates/times and numbers) in regional formats, and being able to recognize the region/language setting of the user. The last section of this chapter will deal in detail with how to internationalize the various components of your Zope 3 code. Localization, on the other hand, is the process of translating the software to a particular language/region. For this task, one needs a tool to extract all translatable strings and another one to aid the translation process. Localization data for number formatting, currencies, timezones and much more are luckily already compiled in large volumes of XML locale files.
There are three goals which the Zope 3 I18n support tries to accomplish:
In the Open Source world, there are two established solutions for providing I18n libraries and L10n utilities: GNU Gettext and ICU. The latter was primarily developed to replace the original Java I18n support. Gettext is the de facto standard for the Free Software world (for example KDE and Gnome), but it has some major shortcomings: it only handles the translation of messages (human-readable strings), and it does that adequately at best. On the other hand, there are many translation tools that support the gettext format, such as KBabel, a true power tool for translating message catalogs. Therefore, it is important to support the gettext message catalog format, even if it is only through import and export facilities.
ICU, in contrast, is a very extensive and well-developed framework that builds upon the experience of the Java I18n libraries. ICU provides objects for everything that you could ever imagine, including locales, object formatting and transliteration rules. Best of all, the information for over 220 locales is available in XML files. These files contain complete translations of all countries and languages; date/time formatting and parsing rules for three different formats (which follow a standard specification), including all month and weekday names and abbreviations; timezone specifications (including city names); and number formatting and parsing rules for decimal, scientific, monetary, percent and per-mille numbers.
The first decision we made concerning I18n was to make all human-readable text Unicode, so that we would not run into the same issues as Zope 2. Only the publisher would encode the Unicode text into byte strings (using UTF-8 or another encoding). The discussion and decision on this subject are immortalized in the proposal at http://dev.zope.org/Zope3/UnicodeForText.
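The "Unicode inside, encode at the edge" decision can be illustrated with a small sketch. The function name `publish` is hypothetical and only stands in for the role of the publisher; it is not Zope 3's actual publisher API.

```python
def publish(text: str, encoding: str = "utf-8") -> bytes:
    """Encode application text for the wire; the only place bytes are produced."""
    return text.encode(encoding)


# Inside the application, text stays a Unicode string.
greeting = "Grüße"

# Only the outermost layer picks an encoding for the response.
body = publish(greeting)
```

Because the encoding is chosen in exactly one place, the rest of the code never has to guess which byte encoding a string is in.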
Since the ICU framework is simply too massive to be ported to Python for Zope 3, we decided to adopt the locales support from ICU (using the XML files as data) and support the gettext message catalogs for translation, simply because the gettext tools are available as standard libraries in Python. From the XML locale files we mainly use the date/time and number patterns for formatting and parsing these data types. Two generic pattern parsing classes, one for date/time patterns and one for number patterns, have been written and can be used independently of Zope 3’s I18n framework. On top of these pattern parsing classes are the formatter and parser classes for each corresponding data type. But all this is hidden behind the Locale object, which makes all of the locale data available and provides some convenience functions.
The Locale instance for a user is available via the request object, which is always available from a view. However, one can easily test the functionality of Locale instances using the interactive Python prompt. Go to the directory ZOPE3/src and start Python. You can now use the following code to get a locale:
You can now for example retrieve the currency that is used in the US and get the symbol and name of the currency:
The more interesting tasks are formatting and parsing dates/times. There are four predefined date/time formatters that you can choose from: “short”, “medium”, “full”, and “long”. Here we just use “short”:
For numbers you can choose between “decimal”, “percent”, “scientific”, and “currency”:
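The locale session described in the preceding paragraphs might look roughly like the sketch below. It assumes the third-party `zope.i18n` package is installed and that the names used here (`locales.getLocale`, `locale.numbers.currencies`, `getFormatter`) match the installed version; they are written from knowledge of that package, not taken from this text. The helper `locale_demo` and the sample values are made up for illustration, and the import is guarded so the script still runs where the package is missing.

```python
from datetime import datetime

try:
    # Third-party package; assumed to expose a global locale provider.
    from zope.i18n.locales import locales
except ImportError:
    locales = None  # zope.i18n is not installed, so the demo is skipped


def locale_demo(language="en", territory="US"):
    """Return a few formatted values for a locale, or None without zope.i18n."""
    if locales is None:
        return None
    locale = locales.getLocale(language, territory)

    # Currency data lives under the locale's number information.
    currency = locale.numbers.currencies["USD"]

    # Formatters are looked up by category and, for dates, by length.
    date_formatter = locale.dates.getFormatter("dateTime", "short")
    number_formatter = locale.numbers.getFormatter("decimal")

    return {
        "currency_symbol": currency.symbol,
        "currency_name": currency.displayName,
        "short_datetime": date_formatter.format(datetime(2003, 1, 25, 4, 7)),
        "decimal": number_formatter.format(34000.45),
    }


result = locale_demo()
if result is not None:
    for key, value in result.items():
        print(key, "=", value)
```

The exact strings printed depend on the locale data shipped with the installed package, so no expected output is given here.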
While the object formatting is the more interesting task, the more common one is the markup and translation of message strings. In order to manage translations better, message strings are categorized in domains. There is currently only one domain for all of the Zope core, called “zope”. Products, such as ZWiki, would use a different domain, such as “zwiki”. Translatable messages are specially marked in the code (see the section below) and are translated before their final output.
All message translations for a particular language of one domain are stored in a message catalog. Therefore we have a message catalog for each language and domain pair. We differentiate between filesystem (global) and ZODB (local) product development. Global message catalogs are standard gettext PO files. The PO files for the “zope” domain are located in ZOPE3/src/zope/app/locales/<REGION>/LC_MESSAGES/zope.po, where REGION can be de, en or pt_BR.
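The one-catalog-per-pair rule and the directory layout above can be sketched with a few lines of Python. The helper `catalog_path` is hypothetical and only rebuilds the path pattern quoted in the text; it does not read or parse any PO file.

```python
from pathlib import PurePosixPath


def catalog_path(locales_root: str, language: str, domain: str) -> PurePosixPath:
    """Build the gettext-style path of the PO file for one (language, domain) pair."""
    return PurePosixPath(locales_root) / language / "LC_MESSAGES" / f"{domain}.po"


root = "ZOPE3/src/zope/app/locales"

# One catalog per (language, domain) pair.
catalogs = {
    (language, "zope"): catalog_path(root, language, "zope")
    for language in ("de", "en", "pt_BR")
}
print(catalogs[("pt_BR", "zope")])
# ZOPE3/src/zope/app/locales/pt_BR/LC_MESSAGES/zope.po
```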
Local message catalogs, on the other hand, are managed via the ZMI through local translation domains. In such a utility you can create new languages, domains and message strings, search through existing translations and make changes, import/export external message catalogs (Gettext PO files), and synchronize this translation domain with another one. Especially the synchronization between translation domain utilities is very powerful, since it allows easy translation upgrades between development and production environments.
Okay, now we know how to manage translatable strings, but how can we tell the system which strings are translatable? Translatable strings can occur in ZPT, DTML, ZCML and Python code. We noticed, however, that almost all Python-based translatable strings occur in views, which led us to the conclusion that message strings outside views are usually a sign of bad programming; we have found only a few exceptions (like interface declarations). This leads to a very important rule:
Translations of human-readable strings should be done very late in the publication process, preferably just before the final output.
In the next section we will go into some more detail on how to mark up the code in each language.
As mentioned before, Zope is not a simple application, and therefore we cannot translate a text message directly in the Python code (since we do not know the user’s locale at that point). Instead, we must mark such messages as translatable strings, which are known as MessageIds. MessageIds are created using MessageId factories. The factory takes the domain as the argument to its constructor.
Note: The _ (underscore) is a convention used by gettext to mark text as translatable.
Now you can simply mark up translatable strings using the _ function:
But this is the simple case. What if you want to include some data? Then you can use:
In this case the number is inserted after the translation. This way you can avoid having a translation for every different value of x.
For Page Templates we developed a special i18n namespace (as mentioned before), which can be used to translate messages. The namespace is well documented at http://dev.zope.org/Zope3/ZPTInternationalizationSupport and some examples can be found at http://dev.zope.org/Zope3/ZPTInternationalizationExamples.
There is no DTML tag defined for doing translations yet, but we think it will be very similar to the ZBabel and Localizer version, since they are almost the same.
I briefly described ZCML’s way of internationalizing text in the previous chapter. In the schema of each ZCML directive you can declare translatable attributes simply by making them MessageId fields. The domain for the message strings is provided by the i18n_domain attribute in the configure tag. Therefore the user only has to specify this attribute to do the I18n in ZCML.
Once the code is marked up, you must extract these strings from the code and compile message catalogs. For this task there is a tool called ZOPE3/utilities/i18nextract.py. Its functionality and options are discussed in “Internationalizing a Product”.