ZPTInternationalizationSupport

Status: IsImplementedProposal

This document is a proposal to extend Zope Page Templates to provide internationalization support. Note that statements of fact below should be read as proposals.

A new XML namespace named i18n will be added. Attributes in this namespace modify the behavior of the TAL interpreter. This document describes these attributes and the effect they have on TAL. For examples, refer to ZPTInternationalizationExamples.

The i18n namespace URI and recommended prefix are currently defined as:

  xmlns:i18n="http://xml.zope.org/namespaces/i18n"

This is not a URL, but merely a unique identifier. Do not expect a browser to resolve it successfully.

This proposal does not discuss ways to pass additional information to translation services that could make use of it.

i18n:translate

This attribute is used to mark units of text for translation. If this attribute is specified with an empty string as the value, the message ID is computed from the content of the element bearing this attribute. Otherwise, the value of the attribute gives the message ID.
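The rule above can be sketched in Python. This is only an illustration of the proposed behavior, not the interpreter's code; the function name message_id and the whitespace normalization of the element content are assumptions that the proposal does not spell out.

```python
import re

def message_id(attr_value, element_content):
    """Pick the message ID for an element carrying i18n:translate."""
    if attr_value.strip():
        # A non-empty attribute value is the explicit message ID.
        return attr_value.strip()
    # Empty value: derive the ID from the element content.
    # Collapsing runs of whitespace is an assumption, not part of the proposal.
    return re.sub(r"\s+", " ", element_content).strip()

print(message_id("", "  Hello,\n   world!  "))
print(message_id("greeting", "Hello, world!"))
```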

i18n:domain

The i18n:domain attribute is used to specify the domain to be used to get the translation. If not specified, the translation services will use a default domain. The value of the attribute is used directly; it is not a TALES expression.
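A minimal sketch of how a translation service might apply the domain, assuming a hypothetical in-memory catalog keyed by (domain, language). The names DEFAULT_DOMAIN, CATALOGS and translate are invented for illustration, and returning the message ID unchanged when no translation exists is an assumption borrowed from common gettext practice.

```python
DEFAULT_DOMAIN = "default"  # hypothetical name for the fallback domain

# Hypothetical catalogs: (domain, language) -> {message ID: translation}
CATALOGS = {
    ("EventsCalendar", "de"): {"January": "Januar"},
    (DEFAULT_DOMAIN, "de"): {"Up": "Hoch"},
}

def translate(msgid, language, domain=None):
    """Look up msgid in the given domain, or in the default domain if none is given."""
    catalog = CATALOGS.get((domain or DEFAULT_DOMAIN, language), {})
    # Assumption: an untranslated message ID is returned as-is.
    return catalog.get(msgid, msgid)

print(translate("January", "de", domain="EventsCalendar"))
print(translate("Up", "de"))
```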

i18n:source

The i18n:source attribute specifies the language of the text to be translated. The default is "nothing", which means we don't provide this information to the translation services.

i18n:target

The i18n:target attribute specifies the language of the translation we want to get. If the value is "default", the language negotiation services will be used to choose the destination language. If the value is "nothing", no translation will be performed; this can be used to suppress translation within a larger translated unit. Any other value must be a language code.

The attribute value is a TALES expression; the result of evaluating the expression is the language code or one of the reserved values.
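The three cases can be sketched as follows. The reserved values are modeled here as the plain strings "default" and "nothing", which is a simplification; a real TALES engine may represent them differently. The names resolve_target and negotiate are hypothetical.

```python
def resolve_target(value, negotiate):
    """Map the evaluated i18n:target value to a language code, or None."""
    if value == "nothing":
        return None          # suppress translation for this unit
    if value == "default":
        return negotiate()   # defer to language negotiation
    return value             # anything else is taken as a language code

# negotiate stands in for the language negotiation service.
print(resolve_target("default", lambda: "fr"))
print(resolve_target("nothing", lambda: "fr"))
print(resolve_target("es", lambda: "fr"))
```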

Note that i18n:target is primarily used for hints to text extraction tools and translation teams. If you had some text that should only be translated to e.g. German, then it probably shouldn't be wrapped in an i18n:translate span.

i18n:name

Name the content of the current element for use in interpolation within translated content. This allows a replaceable component in content to be re-ordered by translation. For example:

    <span i18n:translate=''>
      <span tal:replace='here/name' i18n:name='name' /> was born in
      <span tal:replace='here/country_of_birth' i18n:name='country' />.
    </span>

would cause this text to be passed to the translation service:

    "${name} was born in ${country}."
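To see why named placeholders allow re-ordering, here is a small Python sketch. It uses string.Template from the standard library, whose ${name} syntax happens to match the notation above; the catalog, the German wording, and the function name are illustrative assumptions. Leaving unknown names in place (safe_substitute) is one possible choice, not something this proposal specifies.

```python
from string import Template

MSGID = "${name} was born in ${country}."

# Hypothetical catalog: the translation puts ${country} before ${name}.
CATALOG = {"de": {MSGID: "In ${country} wurde ${name} geboren."}}

def translate_and_interpolate(msgid, language, mapping):
    """Translate first, then fill in the named components."""
    translated = CATALOG.get(language, {}).get(msgid, msgid)
    # safe_substitute leaves unknown ${names} untouched instead of raising.
    return Template(translated).safe_substitute(mapping)

values = {"name": "Jim", "country": "Canada"}
print(translate_and_interpolate(MSGID, "de", values))
print(translate_and_interpolate(MSGID, "en", values))
```

Because the interpolation happens after translation, the translator is free to move the placeholders anywhere in the translated string.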

i18n:attributes

This attribute will allow us to translate attributes of HTML tags, such as the alt attribute in the img tag. The i18n:attributes attribute specifies a list of attributes to be translated with optional message IDs for each; if multiple attribute names are given, they must be separated by semicolons. Message IDs used in this context must not include whitespace.

Note that the values of the particular attributes come either from the HTML attribute value itself or from the data inserted by tal:attributes.

If an attribute is to be both computed using tal:attributes and translated, the translation service is passed the result of the TALES expression for that attribute.

An example:

    <img src="http://foo.com/logo" alt="Visit us"
         tal:attributes="alt here/greeting"
         i18n:attributes="alt"
         >

In this example, let tal:attributes set the value of the alt attribute to the text "Stop by for a visit!". This text will be passed to the translation service, which uses the result of language negotiation to translate "Stop by for a visit!" into the requested language. The example text in the template, "Visit us", will simply be discarded.

Another example, with explicit message IDs:

    <img src="../icons/uparrow.png" alt="Up"
         i18n:attributes="src up-arrow-icon; alt up-arrow-alttext"
         >

Here, the message ID up-arrow-icon will be used to generate the link to an icon image file, and the message ID up-arrow-alttext will be used for the "alt" text.
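The value syntax described above is simple enough to parse in a few lines. The following Python sketch is an assumption about how a processor might do it, not the actual implementation; parse_i18n_attributes is a hypothetical name. A value of None means no explicit message ID was given, so the attribute's own value is what gets translated.

```python
def parse_i18n_attributes(value):
    """Split 'src up-arrow-icon; alt up-arrow-alttext' into {attribute: message ID}."""
    result = {}
    for part in value.split(";"):
        words = part.split()
        if not words:
            continue  # tolerate a trailing semicolon
        if len(words) > 2:
            # Message IDs used here must not include whitespace.
            raise ValueError("bad i18n:attributes entry: %r" % part)
        result[words[0]] = words[1] if len(words) == 2 else None
    return result

print(parse_i18n_attributes("src up-arrow-icon; alt up-arrow-alttext"))
print(parse_i18n_attributes("alt"))
```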

i18n:data

Since TAL always returns strings, we need a way in ZPT to translate objects, the most obvious case being DateTime objects. The data attribute will allow us to specify such an object, and i18n:translate will provide us with a legal format string for that object. If data is used, i18n:translate must be used to give an explicit message ID, rather than relying on a message ID computed from the content.
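One way to read this: the explicit message ID selects a per-language format string, which is then applied to the object. The Python sketch below uses datetime.date as a stand-in for a Zope DateTime object; the FORMATS table, the message ID short-date, and the function name are all hypothetical.

```python
from datetime import date

# Hypothetical catalog: language -> {message ID: format string}
FORMATS = {
    "en": {"short-date": "%m/%d/%Y"},
    "de": {"short-date": "%d.%m.%Y"},
}

def translate_data(obj, msgid, language):
    """Look up the format string for msgid and apply it to obj."""
    fmt = FORMATS[language][msgid]
    return obj.strftime(fmt)

when = date(2002, 2, 6)
print(translate_data(when, "short-date", "en"))
print(translate_data(when, "short-date", "de"))
```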

Relation with TAL processing

The attributes defined in the i18n namespace modify the behavior of the TAL interpreter for the tal:attributes, tal:content, tal:repeat, and tal:replace attributes, but otherwise do not affect TAL processing.

Since these attributes only affect TAL processing by causing translations to occur at specific times, using these with a TAL processor which does not support the i18n namespace degrades well; the structural expectations for a template which uses the i18n support are no different from those for a page which does not. The only difference is that translations will not be performed in a legacy processor.

Relation with METAL processing

When using translation with METAL macros, the internationalization context is considered part of the specific documents that page components are retrieved from rather than part of the combined page. This makes the internationalization context lexical rather than dynamic, making it easier for a site builder to understand the behavior of each element with respect to internationalization.

Let's look at an example to see what this means:

    <html i18n:translate='' i18n:domain='EventsCalendar'
          metal:use-macro='container/master.html/macros/thismonth'>

      <div metal:fill-slot='additional-notes'>
        <ol tal:condition="here/notes">
          <li tal:repeat="note here/notes">
             <tal:block tal:omit-tag=""
                        tal:condition="note/heading">
               <strong tal:content="note/heading">
                 Note heading goes here
               </strong>
               <br />
             </tal:block>
             <span tal:replace="note/description">
               Some longer explanation for the note goes here.
             </span>
          </li>
        </ol>
      </div>

    </html>

And the macro source:

    <html i18n:domain='CalendarService'>
      <div tal:replace='python:DateTime().Month()'
           i18n:translate=''>January</div>

      <!-- really hairy TAL code here ;-) -->

      <div metal:define-slot="additional-notes">
        Place for the application to add additional notes if desired.
      </div>

    </html>

Note that the macro is using a different domain than the application (as it should). With lexical scoping, no special markup needs to be applied to cause the slot-filler in the application to be part of the same domain as the rest of the application's page components. If dynamic scoping were used, the internationalization context would need to be re-established in the slot-filler.
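Lexical scoping can be pictured as each element remembering the domain of the document it was written in, so that filling a slot does not change the filler's domain. The Python sketch below is a toy model under that assumption; the Element class and the expand function are hypothetical and have nothing to do with the real METAL machinery.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Element:
    text: str
    domain: str                   # domain of the document the element came from
    slot: Optional[str] = None    # name of the slot this element defines, if any
    children: List["Element"] = field(default_factory=list)

def expand(macro, fillers):
    """Return (text, domain) pairs for the macro body with its slots filled."""
    out = []
    for child in macro.children:
        # A filler replaces the slot but keeps its own (lexical) domain.
        node = fillers.get(child.slot, child)
        out.append((node.text, node.domain))
    return out

macro = Element("", "CalendarService", children=[
    Element("January", "CalendarService"),
    Element("Place for notes", "CalendarService", slot="additional-notes"),
])
fillers = {"additional-notes": Element("Note heading", "EventsCalendar")}
print(expand(macro, fillers))
```

Under dynamic scoping the filler would instead inherit CalendarService from the macro, which is the case the note above says would need extra markup to undo.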

Note: It should be possible for a macro component to explicitly request that it use the dynamic context rather than the lexical context. This has not yet been discussed at any level of detail.


jim (Feb 6, 2002 9:59 am; Comment #2)
We need to add more discussion of semantics.
  • How does the presence of translate or id affect repeat, content, and replace?
  • What happens when there are nested translation and tal/metal tags? (I know the answer, but we need to document it.)
jassalasca_jape (May 16, 2002 1:05 am; Comment #5)
I am still getting a handle on Zope, but have a keen interest in multilingual content and Web-based translation. Please forgive me if this comment is off the mark.

A setup that offers slots (placeholders) within translatable content needs to be able to cope with changes in word-order. As far as I am aware, gettext() offers multiple string substitutions, but does not offer a means of altering their order within a given language environment. This is not much of a problem within the continental European context, but with Japanese (or Finnish?) it can make things pretty awkward. If message catalogs or something similar are offered, they should be able to cope with changes in word order.

Fred Drake - gettext() implementations typically do not deal with insertion of replacement text, and simple use of the gettext() primitive typically doesn't either. Our position is that there must be an outer layer that provides this service, which is handled in Page Templates by using the i18n:name attribute when re-ordering is possible.

jassalasca_jape (May 16, 2002 11:48 pm; Comment #6) Editor Remark Requested
Hmm. Having thought about this a little more, I think that the focus needs to be put on the translation process itself, because that is the bottleneck in these systems. The markup gives you a structure. The translator provides alternative text that needs to be melded into the structure. Markup should provide flags that make that melding process easy. I would suggest that TAL should include "mobile" and "fixed" declarations, which can be used by a translation editor to determine whether the tag to which they attach can shift position. With this, the translator can apply the XML grid to text after it has been translated, and fine-tune the tag positioning to suit the new context. A server-side product could be used for proof-of-concept, and client-side software could be used to make things more slick -- but you need those signals in the markup to make it work.

Put another way, XML text markup itself suffers from a mixing of structure and expression. The specific ordering of boldface markup and anchors, say, in a run of text is a fortuitous product of the language and style in which the content has been expressed. The same cannot be said of the items in a bullet list, or the paragraphs that make up a document. Tags for fixed and mobile character could be used to distinguish between these two characteristics of the XML tag critter.

Are there any ideas in this line floating around already? If not, I'd like to throw this into the mix.

jim (May 28, 2002 6:39 pm; Comment #7)
 > i18n:domain
 > 
 >   The i18n:domain attribute is used to specify the domain to be used
 >   to get the translation.  If not specified, the translation services
 >   will use a default domain.
 

For Zope, I really can't imagine where the default domain will come from. For other applications though, this makes sense.

 > i18n:source
 > 
 >   The i18n:source attribute specifies the language of the text to be
 >   translated.  The default is "nothing", which means we don't provide
 >   this information to the translation services.
 

This doesn't make sense, given the translation API.

Fred Drake - We should be able to explicitly identify the implicit source language; isn't that what we're doing here?

 > i18n:name
 > 
 >   Name the content of the current element for use in interpolation
 >   within translated content.  This allows a replaceable component in
 >   content to be re-ordered by translation.  For example::
 > 
 >     <span tal:translate=''>
 >       <span tal:replace=here/name i18n:name=name /> was born in
 >       <span tal:replace=here/country_of_birth i18n:name=country />.
 >     </span>
 > 
 >   would cause this text to be passed to the translation service::
 > 
 >     "${name} was born in ${country}."
 

You should point out that this can be used without tal attributes. For example, this would work as well:

      <span i18n:translate=''>
        <span i18n:name='name'><b>Jim</b></span> was born in
        <span i18n:name='country'>USA</span>.
      </span>

 > Relation with TAL processing
 > 
 >   The attributes defined in the i18n namespace modify the behavior
 >   of the TAL interpreter for the tal:attributes, tal:content,
 >   tal:repeat, and tal:replace attributes, but otherwise do not
 >   affect TAL processing.
 > 
 >   Since these attributes only affect TAL processing by causing
 >   translations to occur at specific times, using these with a TAL
 >   processor which does not support the i18n namespace degrades well;
 >   the structural expectations for a template which uses the i18n
 >   support is no different from those for a page which does not.  The
 >   only difference is that translations will not be performed in a
 >   legacy processor.
 

Isn't it worth noting that i18n can function independently of TAL? Interaction with TAL is an optimization, isn't it?

Fred Drake - it could be described differently, yes.

 > Relation with METAL processing
 
...
 >   Let's look at an example to see what this means::
 > 
 >     <html i18n:translate='' i18n:domain=EventsCalendar
 >           metal:use-macro='container/master.html/macros/thismonth'>
 

You don't want i18n:translate here do you?

bwarsaw (May 29, 2002 12:03 pm; Comment #8)
 > i18n:target
 > 
 >   The i18n:target attribute specifies the language of the translation
 >   we want to get.  If the value is "default", the language negotiation
 >   services will be used to choose the destination language.  If the
 >   value is "nothing", no translation will be performed; this can be
 >   used to suppress translation within a larger translated unit.
 

Does it ever make sense to supply anything other than "default" or "nothing" to an i18n:target attribute? IOW, does it ever make sense to say i18n:target="es" as in the ZPTInternationalizationExamples? I'm not sure it does. I suppose you could have a list of languages here with the semantics "only translate this to Dutch, Japanese, or German, but never anything else". I'm hard pressed to find a use case for that scenario.

Other than that, why would you give a language argument to i18n:target? If you always wanted to translate some string to e.g. Spanish, why use i18n mechanisms at all? Wouldn't simple tal be enough?

bwarsaw (May 29, 2002 12:28 pm; Comment #9)
 > i18n:name
 > 
 >   Name the content of the current element for use in interpolation
 >   within translated content.  This allows a replaceable component in
 >   content to be re-ordered by translation.  For example::
 > 
 >     <span tal:translate=''>
 >       <span tal:replace=here/name i18n:name=name /> was born in
 >       <span tal:replace=here/country_of_birth i18n:name=country />.
 >     </span>
 > 
 >   would cause this text to be passed to the translation service::
 > 
 >     "${name} was born in ${country}."
 

I'd actually like to see a shorthand for this be officially supported (you can call it out of scope for now). I'm thinking something like:

     <span i18n:translate='' tal:interpolate='on'>
        ${name} was born in ${country}.
     </span>

I'd imagine this to be a direct textual substitution for the more verbose tal:replace + i18n:name syntax. In fact, this might be a more general tal shorthand (having not much to do with internationalization).

bwarsaw (May 31, 2002 3:48 pm; Comment #10)
Here is a ChatLog31May2002 between Barry (lomax in the log), Fred (fdrake), and Stephan (srichter) discussing this proposal and the TranslationServiceInterface.
PeFu (Aug 20, 2002 3:17 am; Comment #11)
Jean-Paul Smets wrote the following Howto (patches included): http://www.nexedi.org/Members/jp/cmf-i18n.stx IMHO this is a good point to get started now.
efge (Sep 6, 2002 6:21 am; Comment #12)
Question about translation's message ids:

Suppose I have:

      <span i18n:translate="foo">bar</span>

Does that mean that the message id is the string foo, or that it is the TALES value foo? The spec is not clear on that. The current Z3 code assumes it is a TALES value but apparently Plone folks expect it to be a string.

antonio (Dec 19, 2002 2:02 pm; Comment #14)
First of all I would like to thank you for all your efforts in making Zope a Multilingual Content Management System (MCMS). If you have been working with i18n then you know how complex this is and how few CMSs really support it.

I have been looking a bit into Localizer and this proposal's examples on how to extend TAL. I have no comments on the details of implementation. I would rather like to point you to existing XML standards about i18n that could probably be integrated into the Zope XML and TAL language.

The most prominent emerging localization standard in XML is called XLIFF; you can find all the information at http://www.xliff.org. It is maintained at OASIS by members like SUN, IBM, SAP, etc. A very good short introduction is at http://www.opentag.com/xliff.htm

Why do I wish for this direction?

Well, standards always make life easier, especially if they are adopted by the industry. No need to create so many XML-format transformations and adaptations.

Saving a lot of time in inserting the translations of texts in Zope: if you have this standard implemented then you can send Zope instances as XLIFF to the translation companies (e.g. they use tools like Trados which support XLIFF or other XML) and then you can put the translated object back into Zope with a mouse click. This will save all the time needed to input the translated text manually into Localizable Zope objects or into the message catalog (also called translation memories). Then you would only need to do the translation quality check on the published content with small corrections.

I think a product that imports and exports Zope data in XLIFF for a later translation process would be of great help.

I hope this could give some more insight of the state of the art in XML standards for localizations.

Best regards Antonio

ryuch (Jan 13, 2003 12:30 pm; Comment #15) Editor Remark Requested
Does Internationalization include Translation? I have read only about Localization here, but in most cases Translation is what completes Globalization.

Domino Global Workbench (now an IBM product) supports managing multiple-language Documents and Applications. You can read the Guide Book at http://www.lotus.com/products/dmlt.nsf .

If we really want to make Zope a Multilingual Content Management System, we need to consider adopting a Translation Workbench as an essential facility. Generally, Translation Workbenches have a Translation Memory, Machine Translation adaptor, Sentence Aligner, etc.

srichter (Jan 14, 2003 8:40 am; Comment #16)
 >  If we really want to make Zope a Multilingual Content Management System,
 >    we need to consider to adopt Translation Workbench as a essential
 >    facility.
 >  Generally Translation Workbenches have Translation Memory, Machine
 >    Translation adaptor, Sentence Aligner, and etc.
 

I disagree; we do not need all that. KDE and other Open Source projects use only Gettext and they have probably the largest multilingual applications in the world.

I already went a step further and use the up-and-coming ICU standard as a base for Locales, and I hope I will be able to move the rest of the Translation Service in this direction as well, simply because I do not like gettext too much. But saying that we need all these other tools is a bit too much, I think (even though the ICU Locale XML files have definitions for these types of things).

Also, there are no people in the Zope community who know enough about these things (and are willing to contribute). I have internationalized a couple of my projects as well, and I noticed that with message translation and number/currency/date/time formatting you can cover at least 97% of all your needs.

lalo (Jan 23, 2003 7:46 pm; Comment #17)
request for clarification: what happens if some ${stuff} name is not found when interpolating? I believe it should be left as-is, but there is a case for removing it.
efge (Jan 23, 2003 7:52 pm; Comment #18)
Ah, good remark.

Currently Zope 2's implementation in my TranslationService bombs... I'll change it to keep the ${stuff} there.

LRA (Apr 30, 2003 6:05 pm; Comment #19)
 >  Does it ever make sense to supply anything other than "default" or
 >  "nothing" to an i18n:target attribute?
 

Yes, but not a constant fixed value. i18n:target="es" is not only useless, but it is also invalid, since the i18n:target value is a TALES expression.

We are using it in our DDBug project because we decided it doesn't make sense to have a bug repository with an interface in one language (i.e. browser detected) and the contents in another, so we put an i18n:target="here/getDDBugLanguage" at the outermost block in every page.


