Import of Hindi numbers from Microsoft Word documents

Specification Status
Author Henning Brinkmann
Last Change 17.09.2007
Status Preliminary


Microsoft Word tags numbers with a hint that indicates which script to use. In addition, it offers an option to display numbers as Hindi, as Arabic, by context, or as determined by the system. This specification defines how the script hint and the display option shall be handled when Microsoft Word documents are imported.


Reference Document Check Location (URL)
Specification Process Entry Check passed n/a
Product Requirement, RFE, Issue ID (required) available [1]
Product Concept Document not available
Test case specification (required) not available
IDL Specification not available
Software Specification Rules n/a n/a
Other, e.g. references to related specs


Role Name E-Mail Address
Developer Henning Brinkmann
Quality Assurance Michael Rüß
Documentation Uwe Fischer
User Experience <First Name, Last Name> <>

Acronyms and Abbreviations

Acronym / Abbreviation Definition
<WYSIWYG> <What You See Is What You Get>

Detailed Specification

When a digit is marked as CTL script in the imported Word document, it shall be imported as a Hindi digit iff the bidi language is one of the languages listed below.

Language                Language Code
Arabic (Algeria)        0x1401
Arabic (Bahrain)        0x3c01
Arabic (Egypt)          0xc01
Arabic (Iraq)           0x801
Arabic (Jordan)         0x2c01
Arabic (Kuwait)         0x3401
Arabic (Lebanon)        0x3001
Arabic (Libya)          0x1001
Arabic (Morocco)        0x1801
Arabic (Oman)           0x2001
Arabic (Qatar)          0x4001
Arabic (Saudi Arabia)   0x401
Arabic (Syria)          0x2801
Arabic (Tunisia)        0x1c01
Arabic (U.A.E)          0x3801
Arabic (Yemen)          0x2401

This feature shall be active iff the configuration item RegardHindiDigits (see below) is true.
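
The import rule above can be read as a single predicate. The following is a minimal sketch in Python, not the actual filter code: the function name, the parameter names, and the idea of passing the CTL hint as a boolean are assumptions made for illustration. Only the language codes and the role of RegardHindiDigits come from this specification.

```python
# Language codes of the Arabic locales listed in the table above.
ARABIC_LANGUAGE_CODES = frozenset({
    0x1401,  # Algeria
    0x3C01,  # Bahrain
    0x0C01,  # Egypt
    0x0801,  # Iraq
    0x2C01,  # Jordan
    0x3401,  # Kuwait
    0x3001,  # Lebanon
    0x1001,  # Libya
    0x1801,  # Morocco
    0x2001,  # Oman
    0x4001,  # Qatar
    0x0401,  # Saudi Arabia
    0x2801,  # Syria
    0x1C01,  # Tunisia
    0x3801,  # U.A.E
    0x2401,  # Yemen
})


def import_as_hindi_digit(is_ctl_script: bool,
                          bidi_language: int,
                          regard_hindi_digits: bool) -> bool:
    """Return True if a digit should be imported as a Hindi digit.

    Hypothetical helper: all three conditions of the rule must hold.
    """
    return (regard_hindi_digits
            and is_ctl_script
            and bidi_language in ARABIC_LANGUAGE_CODES)
```

For example, `import_as_hindi_digit(True, 0x0C01, True)` is true for a CTL-marked digit in Arabic (Egypt) text, while the same call with `regard_hindi_digits=False` is false.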

If the configuration item RegardHindiDigits is set to true, the following mapping between Arabic and Hindi characters applies:

Arabic (Unicode) Hindi (Unicode)
0 (U+0030) ٠ (U+0660)
1 (U+0031) ١ (U+0661)
2 (U+0032) ٢ (U+0662)
3 (U+0033) ٣ (U+0663)
4 (U+0034) ٤ (U+0664)
5 (U+0035) ٥ (U+0665)
6 (U+0036) ٦ (U+0666)
7 (U+0037) ٧ (U+0667)
8 (U+0038) ٨ (U+0668)
9 (U+0039) ٩ (U+0669)
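
Because the mapping above is a constant offset from U+0030..U+0039 to U+0660..U+0669 (the code points Unicode names ARABIC-INDIC DIGIT ZERO to NINE), it can be sketched as a translation table. This is an illustrative Python sketch only; the names `TO_HINDI_DIGITS` and `to_hindi_digits` are hypothetical and do not refer to the import filter's implementation.

```python
# Translation table built from the mapping above:
# U+0030..U+0039 (0-9) -> U+0660..U+0669.
TO_HINDI_DIGITS = {0x0030 + i: 0x0660 + i for i in range(10)}


def to_hindi_digits(text: str) -> str:
    """Replace every digit 0-9 in text with its Hindi counterpart.

    Characters that are not in the mapping are left unchanged.
    """
    return text.translate(TO_HINDI_DIGITS)


if __name__ == "__main__":
    # Only the digits change; the separators stay as they are.
    print(to_hindi_digits("17.09.2007"))
```

In a real import this conversion would apply only to digits that satisfy the conditions stated in this specification (CTL script hint, one of the listed bidi languages, RegardHindiDigits true), not to every digit in the document.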


The specified feature improves interoperability with Microsoft Word.


Configuration Group Setting Type Default Comment
Writer.xcs FilterFlags/WinWord RegardHindiDigits xs:long false If true, digits marked as CTL script are imported as Hindi digits.

File Format

This specification covers import only and thus has no consequences regarding the file format.



Open Issues

Urdu (and Sindhi) use the same Unicode code points for Extended Arabic-Indic (aka Persian) digits but have some glyph variation that is selected at the font level rather than the encoding level (using OpenType language features and the like); see the Unicode book, Ch. 8.2 Arabic. --Khaled Hosny 17:59, 23 June 2008 (CEST)