Implemented in x

 

Official Documentation Available

This topic is now covered in Internationalization effort.

Rationale

We need to define a process to translate module messages into languages other than the default one (typically English) and to keep the translations up to date.

Goal

A mechanism to easily localize module messages and to keep track of which ones need a translation or an update.

Implementation

For each Magnolia module (or basename) and for each target language, this process should:

  • Extract messages_XX.properties files (where XX stands for the language code).
  • Detect which files need to be translated.
  • Detect which files need a translation update, that is:
    • the messages_XX.properties file exists for a given target language, but new keys have been added to the default one.
  • Export the files to be translated in a format suitable for localization tools (e.g. .properties for developers or xls for translators). A Google spreadsheet was eventually chosen to store and share the files to be localized and/or revised. This has the following advantages:
  1. It's free, as in free beer.
  2. It makes it easy for the community around the globe to collaborate on localization contributions and revisions. (Should one or more persons from the community be made responsible for a given module/locale?)
  3. It has a revision history, so one can revert to previous versions.
  4. It can use the Google Translate formula to automatically localize missing properties in a target language, greatly reducing the burden on human translators, who are nonetheless needed to revise the automatic translation.
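
The detection step in the list above (a messages_XX.properties file exists, but the default file has gained new keys) can be sketched in a few lines of Java. This is only an illustration, not the script used in the project: the class name MessageKeyDiff and its methods are hypothetical, and treating a blank value as "not yet translated" is an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of Magnolia or of the i18n scripts.
final class MessageKeyDiff {

    /** Parses the text of a .properties file into a Properties object. */
    static Properties load(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /**
     * Returns, in sorted order, the keys that exist in the default messages
     * but are absent from the translated file. Keys with a blank value are
     * also reported (assumption: blank means "not translated yet").
     */
    static Set<String> keysNeedingTranslation(Properties defaults, Properties translated) {
        Set<String> missing = new TreeSet<>();
        for (String key : defaults.stringPropertyNames()) {
            String value = translated.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }
}
```

A file whose result set is empty needs no update. A target language with no messages_XX.properties file at all can be handled by passing an empty Properties object, in which case every default key is reported.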

The following picture shows what a spreadsheet for the Arabic language and the adminInterface module looks like (note that the whole translation was done automatically by Google and has therefore been marked for revision).
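
The automatic translation mentioned above relies on the GOOGLETRANSLATE spreadsheet function, which takes a text (or a cell reference), a source language code and a target language code. The sketch below only builds the formula string for one cell. The class name TranslateFormula is hypothetical, and the idea that the default English text sits in a column such as B is an assumption about the sheet layout, not something this page specifies.

```java
// Hypothetical helper for illustration; the sheet layout it implies is an assumption.
final class TranslateFormula {

    /**
     * Builds a spreadsheet formula that translates the content of sourceCell
     * (e.g. "B2") from sourceLang (e.g. "en") to targetLang (e.g. "ar").
     */
    static String forCell(String sourceCell, String sourceLang, String targetLang) {
        return String.format("=GOOGLETRANSLATE(%s, \"%s\", \"%s\")",
                sourceCell, sourceLang, targetLang);
    }
}
```

Calling TranslateFormula.forCell("B2", "en", "ar") yields a formula that can be written into the translation cell of row 2. Cells filled this way should still be marked for human revision.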

  • Once a module's localization is done, the spreadsheet needs to be exported as xls and converted back into a .properties file, which will eventually be merged with the one under Subversion.
  • Groovy scripts will be used to perform the extraction and recreation of message files. An internal project has been created on svn to store the scripts and those of their dependencies (the Google Data API 1.4.x jars) that have no Maven artifacts. The URL of the project is http://svn.magnolia-cms.com/svn/internal/i18n-effort/
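
The recreation and merge steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual Groovy script: it reads a tab-separated export rather than xls, it assumes a header row followed by rows of the form key, default text, translation, and the class name TsvToProperties is hypothetical.

```java
import java.util.Properties;

// Hypothetical helper for illustration; the three-column TSV layout is an assumption.
final class TsvToProperties {

    /**
     * Parses rows of "key<TAB>default<TAB>translation". The first row is
     * treated as a header, and rows without a translation are skipped.
     */
    static Properties parse(String tsv) {
        Properties translated = new Properties();
        String[] rows = tsv.split("\\r?\\n");
        for (int i = 1; i < rows.length; i++) {
            // Limit -1 keeps trailing empty cells, so an empty translation is detectable.
            String[] cells = rows[i].split("\t", -1);
            if (cells.length >= 3 && !cells[2].trim().isEmpty()) {
                translated.setProperty(cells[0].trim(), cells[2].trim());
            }
        }
        return translated;
    }

    /**
     * Returns a new Properties object holding the existing entries plus the
     * translated ones. On a key conflict the translated value wins.
     */
    static Properties merge(Properties existing, Properties translated) {
        Properties merged = new Properties();
        merged.putAll(existing);
        merged.putAll(translated);
        return merged;
    }
}
```

When the merged result is written with Properties.store(OutputStream, String), Java uses ISO 8859-1 and emits Unicode escapes for characters outside that range, which matters for locales such as Arabic.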

3 Comments

  1. Message extraction in tab-separated values format is done. See the attached script exportPropertiesFilesAsTSV.groovy. Also read the script's javadoc/source for more information about what it does and how it works.

    1. Switched back to the original idea of creating one spreadsheet per locale and one worksheet per module. This came out of trying to use the Google Data and Spreadsheet APIs to upload the tsv files and automatically localize the untranslated keys, as there is one serious shortcoming:
      you cannot programmatically upload tsv files and convert them into Google spreadsheets unless you have a "Premier account". Also,
      you need to use two different APIs, the Data one for creating/uploading files and the Spreadsheet one for manipulating them, which is a bit cumbersome. Now, with one spreadsheet per locale, we can easily upload them manually. However, what we lose with this process is the automatic translation with the GoogleTranslate formula, as you can only add a formula programmatically by manipulating the cells returned by a request through Google's SpreadsheetService. If we want automatic localization by Google, we need this further step.

      1. I said: "If we want automatic localization by Google we need this further step."
        Now I can say: "Done, see http://svn.magnolia-cms.com/svn/internal/magnolia-module-localization/src/main/java/info/magnolia/module/localization/GoogleTranslateFormulaUpdate.groovy".
        End of the soliloquy.