Translation:General Information

From Apache OpenOffice Wiki

This page summarizes useful information on the translation work needed to release a localized OpenOffice.org version. To get in touch with the translations community please come and ask for some advice on the dev@l10n.openoffice.org mailing list (browse archive), or on the #openoffice.org IRC channel.


en-US Source Strings

For every milestone on the DEV300 codeline (future non-branched 3.x releases) the current en-US source strings are extracted and uploaded in SDF file format to:

Diffs of all new and changed strings from each milestone can be found at:

Removed strings are not part of those diffs.


Release schedule

Keep an eye on the due date for translation submissions.

The release schedule, with translation deadlines specific to the release you are targeting, can be found at: http://wiki.services.openoffice.org/wiki/OOoReleaseXX where XX is the OpenOffice.org release version number. You can also browse the complete list.

Translation handover dates are also announced on the dev@l10n.openoffice.org mailing list.

Teams Translating with Pootle

More and more teams are using Pootle to translate (What is Pootle?). Here is the complete list.

Pootle provides a web interface for translating, managing a translation team and reviewing new translations.

As a bonus, when you manage translations in Pootle, the Pootle administrators make sure that the content is updated with new messages for translation and delivered for integration according to the release schedule. You do not need to prepare SDF files or create an issue if you use Pootle. Unfortunately, updates are not always glitch-free: when messages are relocated inside the OpenOffice.org code base, many of them can be marked as fuzzy and need to be translated again. Pootle has no version control support for translations, so always keep offline backups of your translations.

To add a new language to Pootle, native language project leads should ask on the dev@l10n.openoffice.org list (browse archive). Then you need to register and request to be added as the admin for your team. Your translators then need to register, so that you can assign access rights and goals to them in the way that suits your project best. Translators on probation, for example, can be given only the right to "Suggest", so that their input is saved separately as draft translation strings for you to review.

Pootle Translation Process

  • Pootle administrators make sure Pootle content is updated according to the translation schedule
  • The L10n lead gives the go-ahead to start translation on Pootle (announcement on the dev@l10n list)
  • Native Language leads or the Translation lead coordinate the translation work (see the User's Guide)
  • Translation teams work with the Language team to make sure the translation is reviewed
  • Native Language/translation leads make sure the translation is complete within the deadlines
  • Native Language/translation leads communicate translation completion to the L10n lead
  • The Pootle administrator downloads the translated files and provides them to release engineering

It is recommended to use Pootle to manage the translation work. For the translation itself, it is recommended to download the files and translate them with a translation editor that supports translation memory. Linguistic review can then be performed right after translation.

Translation Notes for Pootle Users

Content on Pootle can be edited and fixed at any time. However, please subscribe to the tools@l10n.openoffice.org list (browse archive), where Pootle downtimes, Pootle maintenance and Pootle content updates are announced, and do not upload translated files before an announced content update has been carried out.

Translating using Gettext PO files

The Gettext PO file format is a popular format for editing translations and is widely used in the free software community. Many editors and other translation management or workflow tools support it.

The SDF file used by OpenOffice.org can be converted to Gettext PO files and back again using the tools from the Translate Toolkit package. The Translate Toolkit depends on the python and python-devel packages, so you must also have these installed on your system.

Documentation with examples of conversion in both directions can be found in the Translate Toolkit documentation. PO files provided on the Pootle deployment for OpenOffice.org handle duplicate messages using the msgctxt feature (oo2po --duplicates=msgctxt), which is the default and recommended setting in recent releases of the Translate Toolkit.

When translating OpenOffice.org using Gettext PO files, you start with a downloaded SDF file containing the source messages and generate a tree of empty PO templates (POT). Then you either initialize empty PO files for a new translation project or merge existing translations with the new templates. After the translation is completed, convert the PO files back to GSI/SDF format using the same downloaded SDF file.

Generate a fresh set of PO templates

To generate a fresh set of PO templates (POT) from a downloaded SDF file named en-US.sdf, run: oo2po -P -i en-US.sdf -o pot. Do not delete the SDF file, as you will need it to convert the translations back to an SDF/GSI file for delivery.

Pre-generated POT files are sometimes made available for download at: http://download.services.openoffice.org/files/extended/ooomisc/POT/

The generated pot tree is nested: files live inside directories and subdirectories. Do not change this hierarchy in any way. It is also important to remember that most of the strings are in the helpcontent2 directory, also called the "Help". You may find it useful to separate this directory from the rest (called the "GUI") when translating, so that your work on the interface files does not appear to be a tiny proportion of the whole. The interface files are essential: they must be translated first and kept at 100% if possible. You can submit this translation separately. Then work on the Help and get it done bit by bit. So don't be discouraged by the size of the tree: it is mostly "Help".

If you are just starting a new translation project, use pot2po pot po to create an empty tree of PO files for translation.

Tip: The oo2po tool can also be used to extract PO translations from a translated SDF file. For example, if you have an sr.sdf file containing Serbian Cyrillic translations, you can create a tree of translated PO files with: oo2po -l sr sr.sdf po. See the official documentation for more details.
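The three commands quoted in this subsection can be collected into small shell helpers. This is only a sketch: the function names (make_pot, init_po, import_sdf) and the default directory names are illustrative and not part of the Translate Toolkit; the oo2po and pot2po invocations are the ones given above.

```shell
# Illustrative helpers; only the oo2po/pot2po calls come from the text above.

# Generate a POT tree from the en-US SDF file (keep the SDF for later).
make_pot() {
    oo2po -P -i "${1:-en-US.sdf}" -o "${2:-pot}"
}

# Start a new project: create an empty PO tree from the POT tree.
init_po() {
    pot2po "${1:-pot}" "${2:-po}"
}

# Extract existing translations from a translated SDF file.
# $1 = language code, $2 = translated SDF, $3 = output PO directory.
import_sdf() {
    oo2po -l "$1" "$2" "${3:-po}"
}

# Usage (new project):       make_pot en-US.sdf pot && init_po pot po
# Usage (existing SDF data): import_sdf sr sr.sdf po
```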


Merge existing translations with new templates

If you already have translated PO files inside a po tree, you can update them to the new templates while preserving all previous work. To merge (migrate) translations to new templates, you can use the pomigrate2 tool from the Translate Toolkit. It is an advanced tool that tries to reuse as many messages as possible. Read the available documentation to find the combination of options that works best for you.

It is frequently used as pomigrate2 -C -F po ponew pot, which merges the existing translations in the po directory tree with the new templates from the pot directory tree into a new directory tree, ponew. This also performs fuzzy matching and uses a compendium, which helps when migrating messages that were relocated inside the OpenOffice.org source code tree.
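As a sketch, the migration command above can be wrapped so that it refuses to write into an output tree that already exists. The function name migrate_po and the existence check are additions made here for illustration; only the pomigrate2 -C -F call comes from the text.

```shell
# Illustrative wrapper around the pomigrate2 call quoted above.
# $1 = existing PO tree, $2 = new PO tree to create, $3 = new POT tree.
migrate_po() {
    old="${1:-po}"
    new="${2:-ponew}"
    tmpl="${3:-pot}"
    # Safety check (our assumption, not a pomigrate2 requirement):
    # do not mix a fresh migration into a leftover output tree.
    if [ -e "$new" ]; then
        echo "migrate_po: $new already exists, remove or rename it first" >&2
        return 1
    fi
    pomigrate2 -C -F "$old" "$new" "$tmpl"
}

# Usage: migrate_po po ponew pot
```

Writing to a separate ponew tree keeps the old po tree untouched, so the result can be compared with the previous state before it replaces it.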


Tip: This tool calls the msgmerge program from the Gettext package. It is useful to pass the --previous argument to msgmerge, which pomigrate2 does not do. With this argument, when a translation is merged as fuzzy, the old source message is preserved as a message comment. This helps when fixing fuzzy translations, because the old and new source messages can be easily compared. Some PO editors provide an in-line visual comparison.

To pass this argument when using pomigrate2 you could create a new msgmerge script as:

#!/bin/bash
# Wrapper: forward all arguments unchanged and always add --previous.
exec /usr/bin/msgmerge --previous "$@"

and place it in a directory that comes before the system-wide msgmerge tool in your executable search path (PATH).
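One possible way to set this up is sketched below. The function name install_msgmerge_wrapper, the default directory $HOME/bin and the assumption that the real program lives at /usr/bin/msgmerge are illustrative; adjust them to your system.

```shell
# Illustrative installer for a msgmerge wrapper that adds --previous.
# $1 = directory for the wrapper (assumed default: $HOME/bin)
# $2 = path of the real msgmerge (assumed default: /usr/bin/msgmerge;
#      the path must not contain spaces in this simple sketch)
install_msgmerge_wrapper() {
    wrap_dir="${1:-$HOME/bin}"
    real="${2:-/usr/bin/msgmerge}"
    mkdir -p "$wrap_dir" || return 1
    # Write the wrapper script; "$@" stays literal inside single quotes.
    printf '#!/bin/sh\nexec %s --previous "$@"\n' "$real" > "$wrap_dir/msgmerge"
    chmod +x "$wrap_dir/msgmerge"
    # Put the wrapper directory first so it shadows the system msgmerge.
    PATH="$wrap_dir:$PATH"
    export PATH
}

# Usage: install_msgmerge_wrapper "$HOME/bin" && pomigrate2 -C -F po ponew pot
```

The PATH change only lasts for the current shell session, so the wrapper affects the pomigrate2 run started from that session and nothing else.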


Translate messages

All translations must be in UTF-8 encoding (the standard encoding for translations), so make sure the Preferences in your editor are set to UTF-8.
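A quick way to spot files saved in another encoding is to let iconv try to read each PO file as UTF-8. The sketch below assumes iconv and find are available on your system; the function name find_non_utf8 is illustrative. It only checks that the bytes are valid UTF-8, not that the PO header declares the right charset.

```shell
# Illustrative check: print every .po file under a directory that is not
# valid UTF-8. $1 = PO tree (assumed default: po).
find_non_utf8() {
    dir="${1:-po}"
    find "$dir" -type f -name '*.po' | while IFS= read -r f; do
        # iconv fails when it meets a byte sequence that is not UTF-8.
        iconv -f UTF-8 -t UTF-8 "$f" > /dev/null 2>&1 || printf '%s\n' "$f"
    done
}

# Usage: find_non_utf8 po
```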

When translating, you must preserve the existing structure of each string (placeholders/variables, escape marks, XML tags, etc.). Only change the translatable text.

Look at the string IDs in the .po files: they are mostly quite informative. At least you can understand which type of GUI element is represented by that string (label, menuitem, radiobutton, pushbutton).

There is no need to keep the ~accelerator marks in the translation, as OpenOffice.org can insert those itself and will move them in case of conflicts. Accelerators are needed in translation only for:

  • top level menus (File, Edit,...)
  • general pushbuttons where you would like to select specific accelerators and to make them consistent between releases

Generate GSI/SDF file from translated PO files

When you are satisfied with your translations, use the po2oo tool from the Translate Toolkit to convert the translated PO files back into a GSI/SDF file for delivery.

You will need the downloaded en-US.sdf file that was used to generate the POT tree. With the PO tree in the po directory, run po2oo -l sr -i po -t en-US.sdf -o GSI_sr.sdf, where sr is the language code of your locale and GSI_sr.sdf is the name of the output GSI/SDF file. Substitute your own locale code for sr.

Check the output file with the Gsicheck tool (see below), compress it using bzip2 (bzip2 -k GSI_sr.sdf) and upload it to a public HTTP or FTP location, so that you can link to it in the issue requesting integration.
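The convert, check and compress steps above can be chained as in the following sketch. The function name build_gsi is illustrative. The chain assumes that gsicheck reports problems through a non-zero exit status, which is not confirmed here; read its log output in any case.

```shell
# Illustrative chain of the three steps described above.
# $1 = language code, $2 = PO tree, $3 = en-US SDF used to create the POT tree.
build_gsi() {
    lang="${1:-}"
    po_dir="${2:-po}"
    tmpl="${3:-en-US.sdf}"
    if [ -z "$lang" ]; then
        echo "usage: build_gsi LANG [PO_DIR] [TEMPLATE_SDF]" >&2
        return 2
    fi
    out="GSI_${lang}.sdf"
    # Each step runs only if the previous one succeeded.
    # Assumption: gsicheck exits non-zero when it finds format errors.
    po2oo -l "$lang" -i "$po_dir" -t "$tmpl" -o "$out" &&
        gsicheck -c "$out" &&
        bzip2 -k "$out"
}

# Usage: build_gsi sr po en-US.sdf
```

Because bzip2 is called with -k, the uncompressed GSI file is kept next to the compressed one.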

How to deliver translated files

  • provide SDF files that contain translated strings only (please remove untranslated strings from the SDF file)
  • provide a GSI / SDF file containing both the translated strings and the corresponding en-US source strings. Remove untranslated strings from the SDF file. Please note that the en-US strings have to come from the same milestone as your translation.
  • please make sure that the GSI / SDF file format is not violated (format errors such as a wrong number of tabs, shifted columns, ...) by using "Gsicheck". Please use the latest version. Usage: gsicheck -c myfile.sdf. In case of errors, please use the log file to fix them.
  • go to Issue Tracker and file an issue: assign it to "ihi@openoffice.org", cc: "vg@openoffice.org", and set Target milestone to "OOo XX", Component "l10n", Subcomponent "code", Issue type "ENHANCEMENT". The summary line should describe the type of strings (GUI, Help or both), the language and the version. For example: [VI] GUI Translation for 2.2. Please don't attach your file directly to the issue; provide a URL / link pointing to your file instead. Attach the file only if you don't have any other web space available.
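Before running Gsicheck, a rough pre-check for "wrong number of tabs" errors can be done with awk. The sketch below assumes that every SDF record has 15 tab-separated fields; that number is an assumption made here, so pass a different value if your files differ. The function name sdf_bad_lines is illustrative, and Gsicheck remains the authoritative check.

```shell
# Illustrative pre-check: report lines whose number of tab-separated fields
# differs from the expected count.
# $1 = SDF file, $2 = expected field count (assumed default: 15).
sdf_bad_lines() {
    file="$1"
    cols="${2:-15}"
    awk -F '\t' -v n="$cols" '
        NF != n { printf "%d: %d fields\n", NR, NF; bad = 1 }
        END     { exit bad }
    ' "$file"
}

# Usage: sdf_bad_lines GSI_sr.sdf || echo "fix the lines listed above"
```

The function prints one "line number: field count" entry per suspicious line and returns a non-zero status if it found any.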


What happens after I submit the issue?

Lots of interesting stuff. :) Once your issue is submitted your translation will be included in the next build targeted to integrate localization. Once announced you can download it from the download server and start testing it.

The release schedule does allow you some time to test your builds and submit language fixes for the translation. It's also important to test how the build works, and to submit issues for any problems.

Getting and distributing localized builds

For more information, see How to get OpenOffice.org released in your language.

To summarize, here are the basic steps after final localized RC (release candidate) builds are available:

  • run sanity check on RC l10n builds
  • update the test status on QATrack
  • set the status as APPROVED in QATrack
  • get an approved build distributed to the mirror network by filing an issue


Think about joining a Native-Language Confederation and further improve your native language project.

Tips and Tools

Team

Build a team of translators and 1-2 reviewers to work on the project. It is recommended to keep the number of reviewers to 1-2 people, since the more translators and reviewers work on the project, the harder it is to ensure quality and consistency.


Glossary

OpenCTI is the repository of the latest terminology used in OpenOffice.org; it replaces SunGloss. You do not need to log in to look up terms, but if you want to add or edit terms you will need an account (please register first). A Help button is available and provides instructions on how to use the tool.


Translation Memories

If you have translated other software, especially software which performs tasks similar to those of an OpenOffice.org component (e.g. Gnumeric, AbiWord, Koffice, the GIMP), we recommend you use translation memory, to avoid duplication of work, and to use existing resources as effectively as possible. Translation memories can help reduce inconsistencies in your translations.

You can create, maintain and apply your translation memory (TM) using plain gettext translation compendia (please refer to the gettext manual).
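A minimal sketch of the compendium approach, using the msgcat and msgmerge invocations described in the gettext manual, might look like this. The function names and file names are illustrative.

```shell
# Illustrative helpers for gettext translation compendia.

# Build a compendium from existing translated PO files.
# $1 = compendium to write, remaining arguments = translated PO files.
make_compendium() {
    out="$1"
    shift
    # --use-first keeps the first translation found for each message.
    msgcat --use-first -o "$out" "$@"
}

# Pre-fill a new PO file from a template using the compendium.
# $1 = compendium, $2 = POT template, $3 = PO file to write.
apply_compendium() {
    # /dev/null stands in for the (not yet existing) old translation file.
    msgmerge --compendium "$1" -o "$3" /dev/null "$2"
}

# Usage: make_compendium compendium.po other-project/*.po
#        apply_compendium compendium.po pot/example.pot po/example.po
```

Messages filled in this way come from another project, so they still deserve a review pass before delivery.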


The latest translation memories in TMX format can be found at: http://ooo.services.openoffice.org/pub/OpenOffice.org/cws/upload/localization/ under the tmxXX directory, where XX is the OpenOffice.org release version number.

TMX can also be created as follows:

  • download PO files from Pootle or extract from SDF file
  • run po2tmx
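The two steps above could be scripted as in the following sketch. The function name po_to_tmx and the default output name are illustrative. Whether po2tmx accepts a whole directory as input may depend on your Translate Toolkit version, so treat passing a directory as an assumption and check the tool's documentation.

```shell
# Illustrative wrapper around po2tmx.
# $1 = target language code, $2 = PO input (file, or tree if supported),
# $3 = TMX file to write (assumed default: LANG.tmx).
po_to_tmx() {
    lang="${1:-}"
    input="${2:-po}"
    if [ -z "$lang" ]; then
        echo "usage: po_to_tmx LANG [PO_INPUT] [TMX_OUTPUT]" >&2
        return 2
    fi
    po2tmx -l "$lang" -i "$input" -o "${3:-${lang}.tmx}"
}

# Usage: po_to_tmx sr po sr.tmx
```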


For more information on using translation memories and glossaries in Pootle, please refer to the page below: http://wiki.services.openoffice.org/wiki/Pootle_Glossary_Guide#Translation_Memory_in_Pootle

Gsicheck Tool

The Gsicheck tool should be used to make sure the translated .sdf files are not corrupted. It can be found at: http://ooo.services.openoffice.org/gsicheck/


Translation Editors

Various translation editors that support the Gettext PO file format (in alphabetic order):

PoEdit, Lokalize and WordForge run on both Linux and Windows. Lokalize is the KDE4 replacement of the old KBabel, which has long been the most popular PO editor for these platforms, but gTranslator and PoEdit have planned improvements which may make them more competitive; WordForge is a new editor which is rapidly becoming popular.

OmegaT+ is a cross-platform Java application that runs on Linux, Mac OS X, Solaris, Windows, and other supported platforms. It supports a number of document formats and TMX for translation memories, along with matching, glossary, and machine translation. It features improved reliability, speed, and user interface.

OmegaT runs on Windows, Linux and OSX. It supports a number of file formats, TMX for translation memories, TBX (beside simple TXT & CSV) for the terminology, and offers dictionary interfacing, on-the-spot spell checking and machine translation.

gTranslator runs on Linux and some BSD platforms.

LocFactoryEditor runs only on Mac OSX. It handles XLIFF natively, and supports Apple formats, gettext formats, SVN submission and submission by email to projects like the TP (TP Robot) and Debian (Debian BTS). It also converts between PO compendia and TMX.

Virtaal runs on Linux and Windows. It supports XLIFF and PO natively; since both of these formats are available to Pootle users, it is an ideal offline equivalent for Pootle. Virtaal includes Translation Memory, Machine Translation and Terminology support. It supports many other localisation formats and includes other useful features like spell checking, autocorrect and autocomplete.

Translation QA

Don't forget that spellcheckers like Aspell have a wide range of dictionaries for well over 70 languages. Spellcheckers not only check your spelling: they are great for catching typos. ;)

Another great utility in the Translate Toolkit package is the pofilter tool, which can be used to catch many types of errors, such as translated variable names, incorrect capitalization, broken XML tags and many more. The --openoffice option selects a set of standard checks for OpenOffice.org translations, but you may want to exclude some filters to reduce the number of false positives. Learn more on Automating Translation QA.
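A possible way to run these checks while excluding selected filters is sketched below. The function name run_pofilter is illustrative, and the filter names in the usage line (unchanged, doublespacing) are only examples of filters you might decide to exclude; choose your own after looking at the results.

```shell
# Illustrative wrapper around pofilter with the OpenOffice.org check set.
# $1 = PO tree to check, $2 = directory for the filtered output,
# remaining arguments = names of filters to exclude (no spaces in names).
run_pofilter() {
    in="$1"
    out="$2"
    shift 2
    args=""
    for f in "$@"; do
        args="$args --excludefilter=$f"
    done
    # $args is intentionally unquoted so each option becomes its own word.
    pofilter --openoffice $args -i "$in" -o "$out"
}

# Usage: run_pofilter po po-review unchanged doublespacing
```

The output tree contains only the messages that failed a check, so it can be opened in a PO editor for review without touching the original po tree.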


Other tools

Please make sure you are running the latest version of gettext, to benefit from its new features, like contextual handling and comparison with previous original strings.

Please add further information on these and other tools that may help other translators to perform their job.
