Difference between revisions of "Talk:NUMBERTEXT/MONEYTEXT development"

From Apache OpenOffice Wiki
Jump to: navigation, search
(Turkish language source)
Line 226: Line 226:
:: I have integrated with some small fixes the Turkish description to Numbertext 0.7. See http://NUMBERTEXT.org, too. Thanks, [[User:Nemeth|Nemeth]] 11:45, 10 November 2009 (UTC)
:: I have integrated with some small fixes the Turkish description to Numbertext 0.7. See http://NUMBERTEXT.org, too. Thanks, [[User:Nemeth|Nemeth]] 11:45, 10 November 2009 (UTC)
Thanks Nemeth I will anounce this release numbertext at turkish openoffice.org forum [[User:Ramdem|Ramdem]] 17:27, 11 November 2009 (UTC)

Revision as of 17:27, 11 November 2009

Discussion page of NUMBERTEXT/MONEYTEXT development

Start a new section for a new theme, bug report or a language module (Soros program). See also NUMBERTEXT.org.

License requirements: Soros programs of NUMBERTEXT project are released under LGPL/BSD dual-license.

Use ~~~~ (four tilde) at the end of your comment to include your login name and a time stamp.

To indent your comment, use one or more colons at the beginning of it.

Some languages need male/female option for number to text

Hi, in Catalan de numbers 1 and 2 can be male or female, based on what's numered. Example: cotxe (car) is male and flor (flower) is female. So 1 cotxe (one car) is spelled "un cotxe" and 1 flor (one flower) is spelled "una flor". So, 1--> un (if male noun) and una (if female noun), 2 --> dos (if male noun) and dues (if female noun).

This male/female change also happens in numbers finished in 1 and 2 different that 11 and 12 (21, 22, 31, 32, ...) and also in hundreds and thousands.

Spanish also has this male/female, but only in numbers finished in 1. In Spanish 2 it's always spelled "dos".

Finally, this male/female isseu als is important for currency to text. Many currency are treated as male nouns: euro, dollar. But few currencis are "female": sterling pounds or the old spanish peseta. So, 1200 $ is spelled as "mil dos-cents dòllars", but 1200 PTA is spelled as "mil dues-centes pessetes".

I have fixed them by text converters. ca_ES uses manual arguments for the gender of the currency units and subunits, es_ES module uses automatic gender detection (feminine units end with "a" or "as"):
# masculine to feminine conversion of "un" after millions,
# if "as?$" matches currency name

f:(.*ill)(.*),(.*) \1$(f:\2,\3)		# don't modify un in millions
f:(.*un)([^a].*,|,)(.*as?) $(f:\1a\2\3)	# un libra -> una libra
f:(.*),(.*) \1 \2

"([A-Z]{3}) ([-−]?1)" $(f:|$2,$(\1:us))
"([A-Z]{3}) ([-−]?\d+0{6,})" $2 de $(\1:up)
"([A-Z]{3}) ([-−]?\d+)" $(f:|$2,$(\1:up))
Thanks for your report. Nemeth 22:12, 3 September 2009 (UTC)

Works fine with currency, thanks. But I'm thinking in some additional option in NUMBERTEX OOo Calc function. Currently we have, =NUMBERTEXT(number); =NUMBERTEXT(number,lang_code); What about? =NUMBERTEXT(number,lang_code, gender_code); Where gender_code can be: 0,1,2,.... Catalan only needs 2 variations, but may be other languages uses 3 or more variations. Of course, masculine/0 code as default.

or maybe better? =NUMBERTEXT_FEM(number); =NUMBERTEXT_FEM(number,lang_code); for "feminine" option.

Of course, we could use MONEYTEXT function with a fake currency code, with feminine tag, but empty units strings. But I think it is a workarround. --Jmontane 20:35, 6 September 2009 (UTC)

NUMBERTEXT is a string function. The numeric input converted by Calc automatically. What about
and similar expressions?
Maybe for the special handling of dates, we have to add a DATETEXT() function. Thanks for your suggestions. Nemeth 11:36, 10 November 2009 (UTC)

Minor bug in Spanish language definition

Spanish has gender variation in numbers containing the string "ientos" (doscientos/as, quinientos/as, novecientos/as, etc). It generates "doscientos libras", but the correct would be "doscientas libras". I think that this line should solve this:

f:(.*ient)o(s.*),(.*as?) $(f:\1a\2,\3)   # doscientos libra/libras -> doscientas

--Roebek 16:24, 25 September 2009 (UTC)

Thanks for your patch. There is in the new Numbertext 0.7 release. Nemeth 11:36, 10 November 2009 (UTC)

Some fixes on Catalan definition


^0 zero
1$ u
1 un
2 dos
3 tres
4 quatre
5 cinc
6 sis
7 set
8 vuit
9 nou
10 deu
11 onze
12 dotze
13 tretze
14 catorze
15 quinze
16 setze
17 disset
1(\d) di$1
20 vint
2(\d) vint-i-$1
30 trenta
40 quaranta
50 cinquanta
60 seixanta
70 setanta
80 vuitanta
90 noranta
(\d)(\d) $(\10)-$2
1(\d\d) cent $1
(\d)(\d\d) $1-cents $2
1(\d{3}) mil $1
(\d{1,3})(\d{3}) $1 mil $2
1(\d{6}) un milió $1
(\d{1,6})(\d{6}) $1 milions $2
1(\d{9}) mil milions $1
1(\d{12}) un bilió $1
(\d{1,6})(\d{12}) $1 bilions $2
1(\d{18}) un trilió $1
(\d{1,6})(\d{18}) $1 trilions $2
1(\d{24}) un quadrilió $1
(\d{1,6})(\d{24}) $1 quadrilions $2  

# negative number?

[-−](\d+) menys |$1

# decimals

"([-−]?\d+)[.,]" $1| coma
"([-−]?\d+[.,]\d*)(\d)" $1| |$2

# currency

# unit/subunit singular/plural

us:([^,]*),([^,]*),([^,]*),([^,]*) \1
up:([^,]*),([^,]*),([^,]*),([^,]*) \2
ss:([^,]*),([^,]*),([^,]*),([^,]*) \3
sp:([^,]*),([^,]*),([^,]*),([^,]*) \4
CHF:(\D+) $(\1: franc suís, francs suís, cèntim, cèntims)
EUR:(\D+) $(\1: euro, euros, cèntim, cèntims)
GBP:(\D+) $(\1: lliura esterlina, lliures esterlines, penic, penics)
JPY:(\D+) $(\1: ien, iens, sen, sen)
USD:(\D+) $(\1: dòlar EUA, dòlar EUA, cent, cents)
"([A-Z]{3}) ([-−]?1)([.,]00?)?" $2 $(\1:us)
"([A-Z]{3}) ([-−]?\d+)([.,]00?)?" $2 $(\1:up)
"(([A-Z]{3}) [-−]?\d+)[.,](01)" $1 amb $(1) $(\2:ss)
"(([A-Z]{3}) [-−]?\d+)[.,](\d)" $1 amb $(\30) $(\2:sp)
"(([A-Z]{3}) [-−]?\d+)[.,](\d\d)" $1 amb $3 $(\2:sp) 
Fixed in Numbertext 0.6. Many thanks for your help. Nemeth 22:16, 3 September 2009 (UTC)

Thanks for your work. I've updated at launchpad (bug #425374) Catalan Soros code with some additional fixes and improvements.--Jmontane 20:36, 6 September 2009 (UTC)

French numbering remarks

Congratulations for this fantastic extension ! It was needed for many years !

These remarks are still valid for version 0.7


a) Not language specific : When there is more than two decimals, MONEYTEXT rounds the value to 2 decimals, that is correct behaviour, I think. But currently it rounds up only above decimal 5, instead of from decimal 5, and not even in every cases.

Compare with the rounding of Calc when formatted with 2 decimals :

Value 9,9949 is displayed 10 by Calc, but MONEYTEXT will treat it like 9,99
MONEYTEXT produces 10 only for a value strictly greater that 9,995, for example 9,995001

Value 5,995 Euros in en-US gives : six euro and zero cents

rounding up is correct but...
the text should be : six euros
(plural for euros, no mention of cents)

Value 9,995 Euros in en-US gives : nine euro and ninety-nine cents

no round up this time ! round up occurs only with a slightly greater value.
I believe, Python (the implementation language of the Numbertext extension) uses different rounding algorithm, but I will check it. Nemeth 11:44, 10 November 2009 (UTC)

b) not language specific, case of rounding down :

MONEYTEXT value 7,004 gives in fr-FR : "sept euros et zéro centimes" instead of : "sept euros"

MONEYTEXT value 0,004 gives in fr-FR : "zéro euros et zéro centimes" instead of : "zéro euro"

BMarcelly 10:43, 10 November 2009 (UTC)

I will fix it. Many thanks for your great bug reports, especially for the previous missing 0.x decimals. It was a complementer character group bug of the interpreter. Nemeth 11:44, 10 November 2009 (UTC)

Monetary units

These monetary units are listed in file numbertext_fr_FR.py (and other french variants) but are not recognized by MONEYTEXT:


For fr-FR, fr-BE, fr-CH you should add XPF: franc Pacifique

singular : 1 franc Pacifique ; plural : 2 francs Pacifique

In file numbertext_ro_RO.py the monetary unit RON is listed but not recognized by MONEYTEXT.

BMarcelly 10:46, 10 November 2009 (UTC)

I think, this is the fault of the i18n database of OpenOffice.org. Nemeth 11:44, 10 November 2009 (UTC)

Turkish language source


First I thank to developers of this extension. I made turkish version numbertext_tr_TR.py. Here is the source

File:Numbertext tr TR.txt

I hope in newer versions turkish version adds to the project

In turkish;
Number texts written with spaces like one hundered twent five, but money texts written with deleting of spaces, like onehunderedtwentyfive turkish lira

Is it possible to do this?
Ramdem 20:01, 12 September 2009 (UTC)

Yes, it's possible by a space deletion call. I will add it, and you can check the result. Nemeth 13:09, 27 September 2009 (UTC)
I have integrated with some small fixes the Turkish description to Numbertext 0.7. See http://NUMBERTEXT.org, too. Thanks, Nemeth 11:45, 10 November 2009 (UTC)

Thanks Nemeth I will anounce this release numbertext at turkish openoffice.org forum Ramdem 17:27, 11 November 2009 (UTC)

Personal tools