How to submit new Locale Data

From Apache OpenOffice Wiki
Revision as of 13:51, 28 March 2014 by Arielch (Talk | contribs)


General Information

To fully support a new language or locale, or an existing language/country combination that is not yet fully supported as a locale, OpenOffice needs a locale data file. Full support includes number formats and calendar data, and makes the locale selectable as the default document language.

Locale data files can quite easily be generated with the generator available at

For technical details and the semantics of the elements, please see the generator's documentation and the comments in the locale data DTD file. For a sample locale data file, see for example the en_US locale.

Note that several locale data elements may be inherited from another locale's data by means of the ref="..." attribute if the data is identical. This comes in handy when locales are created for the same language but different countries and differ only in a few elements, such as currency symbols. It has to be done manually, though. Doing so also reduces the memory footprint at runtime when the data libraries are loaded.
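The following is a minimal sketch of how ref="..." inheritance can be thought of, not the actual OpenOffice implementation. The XML is heavily simplified, the locale en_XX is hypothetical, and the helper resolve_section is an assumption made for illustration. A section that carries a ref attribute is looked up in the referenced locale instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified locale data. Real locale data files
# contain many more elements; see the DTD for the actual structure.
LOCALES = {
    "en_US": (
        "<Locale>"
        "<LC_CTYPE><DecimalSeparator>.</DecimalSeparator></LC_CTYPE>"
        "<LC_CURRENCY><CurrencySymbol>$</CurrencySymbol></LC_CURRENCY>"
        "</Locale>"
    ),
    # en_XX is a made-up locale that inherits LC_CTYPE from en_US
    # and only defines its own currency symbol.
    "en_XX": (
        "<Locale>"
        '<LC_CTYPE ref="en_US"/>'
        "<LC_CURRENCY><CurrencySymbol>X$</CurrencySymbol></LC_CURRENCY>"
        "</Locale>"
    ),
}


def resolve_section(locale_name, section, seen=None):
    """Return the element for `section`, following ref="..." links."""
    if seen is None:
        seen = set()
    if locale_name in seen:
        raise ValueError("circular ref chain at " + locale_name)
    seen.add(locale_name)

    root = ET.fromstring(LOCALES[locale_name])
    element = root.find(section)
    if element is None:
        raise KeyError(section + " not found in " + locale_name)

    ref = element.get("ref")
    if ref is None:
        return element  # the section is defined in this locale
    return resolve_section(ref, section, seen)  # inherited from another locale


inherited = resolve_section("en_XX", "LC_CTYPE")
own = resolve_section("en_XX", "LC_CURRENCY")
print(inherited.find("DecimalSeparator").text, own.find("CurrencySymbol").text)
```

In this sketch only the currency section is stored for en_XX; everything else is shared with en_US, which is the memory saving described above.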

If you want to create and contribute a locale data file for your locale, you can create the locale data file using the generator and then ask the OpenOffice Localization mailing list for instructions on how to submit it.


There are a few pitfalls or things to think about when generating locale data:

Index Keys and Scripts

Character Range

In section H. Enumeration and Scripts, the default entry for H1. Character range for indexes is A-Z. If your language uses other characters, for example additional accented characters or a completely different alphabet, you probably want to add them at the proper position. See the generator's documentation and/or the documentation in the DTD file mentioned above for the LC_INDEX section.
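To illustrate why the position of added characters matters, here is a small sketch that expands a character range into an ordered list of index keys. The notation used here (ranges written as X-Y, entries separated by spaces) and the function expand_index_keys are assumptions for illustration only; the DTD documentation defines the actual syntax of the LC_INDEX section.

```python
def expand_index_keys(spec):
    """Expand a space-separated spec such as 'A-Z Æ Ø Å' into a key list.

    The spec notation is hypothetical; it only illustrates the idea of
    ranges plus individually placed extra characters.
    """
    keys = []
    for token in spec.split():
        if len(token) == 3 and token[1] == "-":
            # A range like A-Z: include every code point from start to end.
            start, end = ord(token[0]), ord(token[2])
            keys.extend(chr(cp) for cp in range(start, end + 1))
        else:
            # A single key, kept at the position where it was written.
            keys.append(token)
    return keys


default_keys = expand_index_keys("A-Z")
danish_keys = expand_index_keys("A-Z Æ Ø Å")
print(len(default_keys), len(danish_keys))
```

With the default range, words starting with Æ, Ø or Å would have no index key of their own; placing the extra letters after Z matches their position in the Danish alphabet.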

Unicode Script

The preselected BasicLatin and Latin1Supplement scripts are usually sufficient for Western European languages. If your language uses additional or completely different characters, please select the appropriate Unicode script(s). For the distribution of characters across the different Unicode scripts, see The Unicode Character Code Charts By Script.
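A quick way to check whether the preselected scripts cover your language is to test some sample text against their code point ranges. The sketch below relies on the Unicode Basic Latin block (U+0000 to U+007F) and Latin-1 Supplement block (U+0080 to U+00FF); the function name and the sample words are chosen for illustration.

```python
def outside_default_blocks(text):
    """Return the sorted characters of `text` that lie outside the
    Basic Latin and Latin-1 Supplement blocks (code points above U+00FF)."""
    return sorted({ch for ch in text if ord(ch) > 0xFF})


# Only characters from the two default blocks.
print(outside_default_blocks("café über"))
# Contains characters from a further Latin block.
print(outside_default_blocks("łódź"))
```

Any character this reports belongs to another block, so an additional script has to be selected for it.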

Currency Formats

In step 3, section I. Currency, the generator offers two list boxes: I6. Currency format for positive values and I7. Currency format for negative values. The default format for positive values, $1 (currency symbol immediately preceding the amount), may be correct. The default format for negative values, ($1) (parentheses around symbol and amount, but no minus sign), is almost certainly not correct for countries other than the US. Please take the time to choose the correct entry in both list boxes.
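To see how much the choice of pattern changes the result, the sketch below renders the same amount with a few patterns written in the generator's $1 notation, where $ stands for the currency symbol and 1 for the amount. Apart from $1 and ($1), the patterns and the render function are assumptions for illustration, not a list of what the generator offers.

```python
def render(pattern, symbol, amount_text):
    """Fill a pattern such as '($1)': '$' becomes the currency symbol,
    '1' becomes the already formatted amount, everything else is kept."""
    parts = []
    for ch in pattern:
        if ch == "$":
            parts.append(symbol)
        elif ch == "1":
            parts.append(amount_text)
        else:
            parts.append(ch)
    return "".join(parts)


# The two defaults mentioned above, with a US-style symbol and amount.
print(render("$1", "$", "12.50"))
print(render("($1)", "$", "12.50"))
# Hypothetical alternatives for negative values in other locales.
print(render("-$1", "$", "12.50"))
print(render("-1 $", "€", "12,50"))
```

The parenthesized form carries no minus sign at all, which is why it should be reviewed for every locale that does not follow the US accounting convention.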
