Engineering and SI Number Format in Calc

From Apache OpenOffice Wiki
Jump to: navigation, search

ux-ooo-logo-rgb-129-61.png

ux.openoffice.org

Quick Navigation

Team

Explanation of Notations

The following is a short description of all the number formats to be discussed here.

Engineering Notation

This type of notation is an offshoot of Scientific Notation. The difference is that the exponent is only shown in multiples of 3. Thus a number such as 153,100 becomes 153.1E+3 in Engineering notation[1] instead of 1.531E+5 in scientific.

SI Notation

This is what's known as SI Prefix[2] notation. Similar to Engineering Notation the exponent also runs in multiples of 3 (usually), but the difference is that instead of showing the E+/- notation a symbol is given for each exponent. E.g. 153,100 in SI notation would be 153.1 k. Note that SI explicitely forbid 153.1k with the prefix 'k' packed to the number, and require a space between [3]

Binary Prefix Notation

This is very similar to SI, but is intended for use when referring to sizes of data storage. Binary Prefix[4] Notation works in much a similar way, but is not calculated using 10 as a multiple (as with Scientific). Rather the closest matching power of 2 is used. Thus instead of 1000 we get 1024. As a convention similar symbols to SI is used, but with an "i" suffix to distinguish then. E.g. 153,100 in Binary Prefix Notation would become 149.51Ki.

Examples from other software

Example of how other software handle these formats.

Engineering Notation in Microsoft Excel 2003

Excel2003 custom notation.png

Format Strings

Each number format requires a format string to indicate not just what type of format to use, but also details as per how many decimals to display. In all numeric notations the format string could indicate how many digits to display for the fraction portion of the number. E.g. 0.00+E0 would force a 2 decimal fraction if needed or not.

Engineering Notation

The Excel format string for engineering notation is similar to the scientific notation, except that it starts with 3 hashes (#). See the sample above. The current scientific notation in both Excel and OOo Calc is the same except for this point. Both use E+ to indicate exponent notation. The digit(s) after this indicate how many digits to use for the exponent. E.g. 0.0+E000 would force 3 digits for the exponent, if not needed it becomes padded with zeros. A further use of the Excel format string is that not only normal engineering notation is possible, but also notation in other multiples. E.g. ####E+0 would format 153,100 as 15.310E+4.

An alternative mentioned would be to substitute F for E instead. In this case ###.000E+00 would be equivalent to 0.000F+0.

SI Notation

There is (yet) no sample format string for this. Therefore we need to decide on one. A proposed idea would be to use a S to force SI notation. E.g. 0.00S. There's has one problem as the s is also used to denote seconds when formatting numbers as time units. So a different character should be used. Thus not one of the following already used: A, D, E, H, I, M, P, S, Y. A suggestion would be T stemming from the 1st non-used letter in the word meTric.

Binary Prefix Notation

Simple really, use B as for SI. It's not used in any other normal format.

Language Specific

Further to the above, it may be necessary to have language specific codes for these. More discussion is necessary.

Sample Implementations

File:EngSIFormula.ods contains custom formula to convert numbers to formatted text in the engineering-, SI- and binary notations. To use enable macros when opening as the formula are written in embedded sBasic.

File Format

TODO: how to persistently store the feature in various file formats.

ODF

The ODF Specification for Scientific Numbers does not contain any way of saving a format string directly. There are only 4 parameters for a scientific format:

The number format dialog still displays the format string, but this has to be generated from the above properties. At present the following scenario applies:

Say the format string is typed as: ###.00E+00

This gets translated into the various properties. Thus a format style such as:

<number:number-style style:name="N108">
  <number:scientific-number 
    number:decimal-places="2" 
    number:min-integer-digits="3" 
    number:min-exponent-digits="2"
  />
</number:number-style>

When the file is later opened, the format string is recalculated from these properties. Thus the resulting string would become: 000.00E+00

It is clear that the current ODF format specification cannot work as is with Engineering Format, let alone SI. Thus some modifications to the spec should be designed. The following are some options.


Option 1

Make use of the grouping property as a switch for using exponent increments by the value of the min-integer-digits. This would not need the ODF spec to be changed at present, only the way OOo Calc interprets the grouping property. If used, it would disallow thousands grouping on all scientific formats. It also doesn't cater for SI formatting. Older programs would not have any dificulty opening this file, but (obviously) they would not format correctly.

Option 2

Introduce an extra property into the scientific format, named something like number:exponent-increment. Again, this only applies to engineering format. And also would not interfere with older programs, thus the ODF file can still be opened by an earlier version - just not formatted as intended.

Option 3

Add a format string property to the scientific format type, to be included as aditional to the existing properties. This would allow for numerous extra formatting without needing further ODF alterations. It would be possible to use this for engineering- as well as SI formatting. And again, older programs would not have troubles opening the file - just not format as intended.

Option 4

Create 2 new style classes called number:engineering-number and number:si-number. Used as is, this would invalidate the ODF file when opened in an older version program. A possible solution would be to have a secondary (alternative) format specification to be included into the number:number-style, which would be the current scientific number format. Thus older versions would read the scientific format and still be able to open the ODF file.

OOXML

Seeing as Excel has no SI equivalent, only the engineering format can be saved with any consistency. Thus the same format string should be used as used in Excel: ###.00E+00 as Excel saves it. The current implementation of OOXML saves into an XLSX file's sub-file \xl\styles as <numFmt numFmtId="164" formatCode="###.00E+00"/>.

For SI notation it may have to be converted into engineering notation when saving to OOXML. Otherwise there may be an error when opening in Excel.

.xls and .doc binary formats

Same as with OOXML above, except it saves to a binary field inside the file (exactly where all other format strings are saved). But still, the SI format is not implemented ans should thus also be handled as per OOXML.

References

  1. see "Engineering Notation" in Wikipedia
  2. see "SI prefix" in Wikipedia
  3. see [1]
  4. see "Binary prefix" in Wikipedia

See Also

Personal tools