Difference between revisions of "Talk:NUMBERTEXT/MONEYTEXT development"

From Apache OpenOffice Wiki
Latest revision as of 12:01, 10 December 2013

Discussion page of NUMBERTEXT/MONEYTEXT development

Start a new section for a new theme, bug report or a language module (Soros program). See also NUMBERTEXT.org.

License requirements: Soros programs of NUMBERTEXT project are released under LGPL/BSD dual-license.

Use ~~~~ (four tildes) at the end of your comment to include your login name and a time stamp.

To indent your comment, use one or more colons at the beginning of it.

Some languages need male/female option for number to text

Hi, in Catalan the numbers 1 and 2 can be masculine or feminine, depending on what is being counted. Example: cotxe (car) is masculine and flor (flower) is feminine. So 1 cotxe (one car) is spelled "un cotxe" and 1 flor (one flower) is spelled "una flor". So, 1 --> un (masculine noun) or una (feminine noun), 2 --> dos (masculine noun) or dues (feminine noun).

This masculine/feminine change also happens in numbers ending in 1 and 2 other than 11 and 12 (21, 22, 31, 32, ...) and also in hundreds and thousands.

Spanish also has this masculine/feminine variation, but only in numbers ending in 1. In Spanish, 2 is always spelled "dos".

Finally, this masculine/feminine issue is also important for currency to text. Many currencies are treated as masculine nouns: euro, dollar. But a few currencies are feminine: the pound sterling or the old Spanish peseta. So, 1200 $ is spelled as "mil dos-cents dòlars", but 1200 PTA is spelled as "mil dues-centes pessetes".

I have fixed them by text converters. ca_ES uses manual arguments for the gender of the currency units and subunits, es_ES module uses automatic gender detection (feminine units end with "a" or "as"):
# masculine to feminine conversion of "un" after millions,
# if "as?$" matches currency name

f:(.*ill)(.*),(.*) \1$(f:\2,\3)		# don't modify un in millions
f:(.*un)([^a].*,|,)(.*as?) $(f:\1a\2\3)	# un libra -> una libra
f:(.*),(.*) \1 \2

"([A-Z]{3}) ([-−]?1)" $(f:|$2,$(\1:us))
"([A-Z]{3}) ([-−]?\d+0{6,})" $2 de $(\1:up)
"([A-Z]{3}) ([-−]?\d+)" $(f:|$2,$(\1:up))
Thanks for your report. Nemeth 22:12, 3 September 2009 (UTC)
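The automatic gender detection described for es_ES can also be illustrated outside Soros. The following Python sketch is not the module's code: `is_feminine` and `agree_un` are hypothetical names, and the sketch only covers the simple "un" -> "una" case (it ignores the exception for millions handled by the first Soros rule).

```python
import re


def is_feminine(unit: str) -> bool:
    """Guess the gender from the unit name: feminine if it ends in "a" or "as"."""
    return re.search(r"as?$", unit) is not None


def agree_un(number_text: str, unit: str) -> str:
    """Turn a trailing "un" into "una" when the unit looks feminine."""
    if is_feminine(unit) and number_text.endswith("un"):
        number_text += "a"
    return f"{number_text} {unit}"


print(agree_un("un", "libra"))
print(agree_un("un", "euro"))
print(agree_un("treinta y un", "pesetas"))
```

The ending test is a heuristic, as in the Soros rule: it works only as long as no masculine unit name ends in "a" or "as".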

Works fine with currency, thanks. But I'm thinking of an additional option in the NUMBERTEXT OOo Calc function. Currently we have =NUMBERTEXT(number) and =NUMBERTEXT(number,lang_code). What about =NUMBERTEXT(number,lang_code,gender_code), where gender_code can be 0, 1, 2, ...? Catalan only needs 2 variations, but other languages may use 3 or more. Of course, masculine (code 0) would be the default.

Or maybe better: =NUMBERTEXT_FEM(number) and =NUMBERTEXT_FEM(number,lang_code) for the "feminine" option.

Of course, we could use the MONEYTEXT function with a fake currency code, with a feminine tag but empty unit strings. But I think that is a workaround. --Jmontane 20:35, 6 September 2009 (UTC)

NUMBERTEXT is a string function. Numeric input is converted by Calc automatically. What about
NUMBERTEXT("ordinal:4545")
NUMBERTEXT("feminine:564")
NUMBERTEXT("ordinal-feminine:564")
NUMBERTEXT(CONCATENATE("ordinal-feminine:";$A1))
and similar expressions?
Maybe for the special handling of dates, we have to add a DATETEXT() function. Thanks for your suggestions. Nemeth 11:36, 10 November 2009 (UTC)
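To make the prefix idea concrete, here is a small Python sketch of how such an argument could be split into options and digits. It is only an illustration of the proposed syntax, not how Numbertext processes it; `parse_prefixed` is a hypothetical helper.

```python
def parse_prefixed(arg: str):
    """Split "ordinal-feminine:564" into (["ordinal", "feminine"], "564")."""
    prefix, sep, number = arg.rpartition(":")
    # Without a colon there are no options and the whole argument is the number.
    options = prefix.split("-") if sep else []
    return options, number


print(parse_prefixed("ordinal-feminine:564"))
print(parse_prefixed("feminine:564"))
print(parse_prefixed("564"))
```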
Yes, I think it's fine. I looked at the en_US_2 code in the Numbertext IDE. But will these prefixes (ordinal, feminine, ...) be language dependent? Can we define them freely?
I think it's a good option.

--Jmontane 12:10, 28 April 2010 (UTC)

Minor bug in Spanish language definition

Spanish has gender variation in numbers containing the string "ientos" (doscientos/as, quinientos/as, novecientos/as, etc). It generates "doscientos libras", but the correct form would be "doscientas libras". I think that this line should solve it:

f:(.*ient)o(s.*),(.*as?) $(f:\1a\2,\3)   # doscientos libra/libras -> doscientas

--Roebek 16:24, 25 September 2009 (UTC)

Thanks for your patch. It is in the new Numbertext 0.7 release. Nemeth 11:36, 10 November 2009 (UTC)
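For readers who do not know Soros, the effect of the patch can be approximated in Python. This is a sketch under the same assumption as the rule (a unit ending in "a" or "as" is feminine); `feminize_hundreds` is a hypothetical name and the code is not part of Numbertext.

```python
import re


def feminize_hundreds(number_text: str, unit: str) -> str:
    """Change "...ientos" to "...ientas" when the unit ends in "a" or "as"."""
    if re.search(r"as?$", unit):
        # "ciento" (without the final s) is left alone, as in the Soros rule.
        number_text = re.sub(r"ientos", "ientas", number_text)
    return f"{number_text} {unit}"


print(feminize_hundreds("doscientos", "libras"))
print(feminize_hundreds("doscientos", "euros"))
```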

Some fixes on Catalan definition

__numbertext__ 

^0 zero
1$ u
1 un
2 dos
3 tres
4 quatre
5 cinc
6 sis
7 set
8 vuit
9 nou
10 deu
11 onze
12 dotze
13 tretze
14 catorze
15 quinze
16 setze
17 disset
1(\d) di$1
20 vint
2(\d) vint-i-$1
30 trenta
40 quaranta
50 cinquanta
60 seixanta
70 setanta
80 vuitanta
90 noranta
(\d)(\d) $(\10)-$2
1(\d\d) cent $1
(\d)(\d\d) $1-cents $2
1(\d{3}) mil $1
(\d{1,3})(\d{3}) $1 mil $2
1(\d{6}) un milió $1
(\d{1,6})(\d{6}) $1 milions $2
1(\d{9}) mil milions $1
1(\d{12}) un bilió $1
(\d{1,6})(\d{12}) $1 bilions $2
1(\d{18}) un trilió $1
(\d{1,6})(\d{18}) $1 trilions $2
1(\d{24}) un quadrilió $1
(\d{1,6})(\d{24}) $1 quadrilions $2  

# negative number?

[-−](\d+) menys |$1

# decimals

"([-−]?\d+)[.,]" $1| coma
"([-−]?\d+[.,]\d*)(\d)" $1| |$2

# currency

# unit/subunit singular/plural

us:([^,]*),([^,]*),([^,]*),([^,]*) \1
up:([^,]*),([^,]*),([^,]*),([^,]*) \2
ss:([^,]*),([^,]*),([^,]*),([^,]*) \3
sp:([^,]*),([^,]*),([^,]*),([^,]*) \4
CHF:(\D+) $(\1: franc suís, francs suís, cèntim, cèntims)
EUR:(\D+) $(\1: euro, euros, cèntim, cèntims)
GBP:(\D+) $(\1: lliura esterlina, lliures esterlines, penic, penics)
JPY:(\D+) $(\1: ien, iens, sen, sen)
USD:(\D+) $(\1: dòlar EUA, dòlar EUA, cent, cents)
"([A-Z]{3}) ([-−]?1)([.,]00?)?" $2 $(\1:us)
"([A-Z]{3}) ([-−]?\d+)([.,]00?)?" $2 $(\1:up)
"(([A-Z]{3}) [-−]?\d+)[.,](01)" $1 amb $(1) $(\2:ss)
"(([A-Z]{3}) [-−]?\d+)[.,](\d)" $1 amb $(\30) $(\2:sp)
"(([A-Z]{3}) [-−]?\d+)[.,](\d\d)" $1 amb $3 $(\2:sp) 
Fixed in Numbertext 0.6. Many thanks for your help. Nemeth 22:16, 3 September 2009 (UTC)

Thanks for your work. I've updated at launchpad (bug #425374) Catalan Soros code with some additional fixes and improvements.--Jmontane 20:36, 6 September 2009 (UTC)

French numbering remarks

Congratulations for this fantastic extension ! It was needed for many years !

These remarks are still valid for version 0.9


MONEYTEXT

a) Not language specific: when there are more than two decimals, MONEYTEXT rounds the value to 2 decimals, which is correct behaviour, I think. But currently it rounds up only above a trailing 5, instead of from 5 upward, and not even in every case.

Compare with the rounding of Calc when formatted with 2 decimals :

Value 9,9949 is displayed 10 by Calc, but MONEYTEXT will treat it like 9,99
MONEYTEXT produces 10 only for a value strictly greater than 9,995, for example 9,995001

Value 5,995 Euros in en-US gives : six euro and zero cents

rounding up is correct but...
the text should be : six euros
(plural for euros, no mention of cents)

Value 9,995 Euros in en-US gives : nine euro and ninety-nine cents

no round up this time ! round up occurs only with a slightly greater value.
I believe Python (the implementation language of the Numbertext extension) uses a different rounding algorithm, but I will check it. Nemeth 11:44, 10 November 2009 (UTC)
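One possible explanation, offered only as a guess and not confirmed for the extension's code, is binary floating point: 9.995 cannot be stored exactly and the nearest double is slightly below it, while the nearest double to 5.995 is slightly above. The Python sketch below compares the built-in `round` on floats with half-up rounding on the decimal text; `round_half_up` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(text: str, places: str = "0.01") -> Decimal:
    """Round a decimal string to the given precision, with ties going up."""
    return Decimal(text).quantize(Decimal(places), rounding=ROUND_HALF_UP)


for value in ("5.995", "9.995"):
    as_float = float(value)
    # Decimal(as_float) shows the exact binary value that is really stored.
    print(value, Decimal(as_float), round(as_float, 2), round_half_up(value))
```

Rounding the decimal text instead of the binary float gives 6.00 and 10.00 for these two values, which matches what Calc displays.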


b) not language specific, case of rounding down :

MONEYTEXT value 7,004 gives in fr-FR : "sept euros et zéro centimes" instead of : "sept euros"

MONEYTEXT value 0,004 gives in fr-FR : "zéro euros et zéro centimes" instead of : "zéro euro"

I will fix it. Many thanks for your great bug reports, especially for the earlier one about the missing 0.x decimals. That was a bug in the interpreter's handling of complemented character groups. Nemeth 11:44, 10 November 2009 (UTC)

Still existing in version 0.9 / BMarcelly 07:03, 26 May 2010 (UTC)
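The behaviour requested in b) can be sketched in Python: round first, then mention the subunit only when it is not zero. This is an illustration with hypothetical helpers (`money_parts`, `describe`), it prints digits instead of number words, and it assumes non-negative amounts and the French convention reported above that 0 and 1 take the singular.

```python
from decimal import Decimal, ROUND_HALF_UP


def money_parts(amount: str):
    """Return (units, cents) after rounding to two decimals, ties going up."""
    value = Decimal(amount)
    if value < 0:
        raise ValueError("this sketch only handles non-negative amounts")
    rounded = value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    units = int(rounded)
    cents = int((rounded - units) * 100)
    return units, cents


def describe(amount: str) -> str:
    """Build the text, leaving out the subunit part when it is zero."""
    units, cents = money_parts(amount)
    text = f"{units} {'euro' if units <= 1 else 'euros'}"
    if cents:
        text += f" et {cents} {'centime' if cents == 1 else 'centimes'}"
    return text


for amount in ("7.004", "0.004", "8.30", "9.995"):
    print(amount, "->", describe(amount))
```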

Turkish language source

Hello,

First, I thank the developers of this extension. I made a Turkish version, numbertext_tr_TR.py. Here is the source:


File:Numbertext tr TR.txt


I hope the Turkish version will be added to the project in a newer version.


In Turkish:
Number texts are written with spaces, like "one hundred twenty five", but money texts are written without spaces, like onehundredtwentyfive Turkish lira.

Is it possible to do this?
Ramdem 20:01, 12 September 2009 (UTC)

Yes, it's possible by a space deletion call. I will add it, and you can check the result. Nemeth 13:09, 27 September 2009 (UTC)
I have integrated the Turkish description, with some small fixes, into Numbertext 0.7. See http://NUMBERTEXT.org, too. Thanks, Nemeth 11:45, 10 November 2009 (UTC)
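The space deletion itself is a simple string operation. As a rough Python illustration (not the Soros call used in the module; `join_words` is a hypothetical name):

```python
def join_words(number_text: str) -> str:
    """Remove all whitespace between the words of a spelled-out number."""
    return "".join(number_text.split())


# 125 spelled out in Turkish, then joined for use in a money text.
print(join_words("yüz yirmi beş") + " Türk lirası")
```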

Thanks, Nemeth. I will announce this Numbertext release on the Turkish openoffice.org forum. Ramdem 17:27, 11 November 2009 (UTC)

Minor Bug in Thai BAHTTEXT or NUMBERTEXT/MONEYTEXT

OOo spells every number ending in '-01' with 'หนึ่ง' instead of 'เอ็ด', which is wrong. OOo spells a final 1 correctly in only 2 cases: when the number is 1, and when a non-zero digit directly precedes the 1, such as '-21' or '-51'.

The rule in Thai is that when '1' is the last digit of the integer part of a number, it is spelled 'เอ็ด', not 'หนึ่ง'. For example, 31 is spelled 'สามสิบเอ็ด', not 'สามสิบหนึ่ง'; 201 is spelled 'สองร้อยเอ็ด', not 'สองร้อยหนึ่ง'; 50001 is spelled 'ห้าหมื่นเอ็ด', not 'ห้าหมื่นหนึ่ง'; and so on.

The only case where it is spelled 'หนึ่ง' is when the number is 1.

See the issue at the OO.o Bug Tracker: http://www.openoffice.org/issues/show_bug.cgi?id=83490

And now I find that NUMBERTEXT.org also gets it wrong.

What a surprise! I have fixed it in version 0.8. Thanks for your report! László (Nemeth 06:43, 20 April 2010 (UTC))
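The rule from the report can be written down in a few lines of Python. This sketch only chooses the word for a final digit 1 of a positive integer, as stated above; it does not spell the rest of the number, and `thai_word_for_final_one` is a hypothetical name.

```python
def thai_word_for_final_one(n: int) -> str:
    """Return the Thai word used for the last digit of n, which must be 1."""
    if n <= 0 or n % 10 != 1:
        raise ValueError("n must be a positive integer ending in 1")
    # Only the number 1 itself uses หนึ่ง; every longer number uses เอ็ด.
    return "หนึ่ง" if n == 1 else "เอ็ด"


for n in (1, 21, 31, 201, 50001):
    print(n, thai_word_for_final_one(n))
```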


What is the longest string numbertext can parse?

Just for info: what scale is the limit of Numbertext? (See http://en.wikipedia.org/wiki/Long_and_short_scales.) Is there any limit on the input or output string? --Jmontane 12:14, 28 April 2010 (UTC)

There is no limitation for the input and output size (null-terminated strings). Nemeth 07:14, 30 April 2010 (UTC)

language / money codes

Hi. It works great, but where can I find the language / money codes? --Adam majewski 15:27, 30 June 2010 (UTC)

"un" [1] varies gender in french

Hello. Thanks a lot for this great and smart extension! For French, as for most Romance languages, the MONEYTEXT() function needs gender variability for 1 ("un/une"), since currencies can be masculine or feminine. However, the word ending is not significant in French.

Here is a proposal (based on fr-xx from release 0.9.3), which uses f/m attributes attached to each currency. Since I have not yet figured out all the subtleties of Soros, I guess there could be a better way to achieve this.
__numbertext__

[...]

# currency

# unit/subunit singular/plural

us:([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),m?|f \1
up:([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),m?|f \2
ud:([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),m?|f \3
ss:([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),m?|f \4
sp:([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),m?|f \5

# masculine/feminine

mf:.*(,f) e

BIF:(\D+) $(\1: franc burundais, francs burundais, de francs burundais, centime, centimes,m)
CAD:(\D+) $(\1: dollar canadien, dollars canadiens, de dollars canadiens, cent, cents,m)
CDF:(\D+) $(\1: franc congolais, francs congolais, de francs congolais, centime, centimes,m)
CHF:(\D+) $(\1: franc suisse, francs suisses, de francs suisses, centime, centimes,m)
DJF:(\D+) $(\1: franc de Djibouti, francs de Djibouti, de francs de Djibouti, centime, centimes,m)
DZD:(\D+) $(\1: dinar algérien, dinars algériens, de dinars algériens, centime, centimes,m)
EUR:(\D+) $(\1: euro, euros, d’euros, centime, centimes,)
GBP:(\D+) $(\1: livre sterling, livres sterling, de livres sterling, penny, pennies,f)
GNF:(\D+) $(\1: franc guinéen, francs guinéens, de francs guinéens,,,m)
HTF:(\D+) $(\1: gourde, gourde, de gourde, centime, centimes,f)
KMF:(\D+) $(\1: franc des Comores, francs des Comores, de francs des Comores, centime, centimes,m)
LBP:(\D+) $(\1: livre libanaise, livres libanaises, de livres libanaises,,,f)
MAD:(\D+) $(\1: dirham marocain, dirhams marocains, de dirhams marocains, centime, centimes,m)
MGA:(\D+) $(\1: ariary, ariarys, d’ariarys, iraimbilanja, iraimbilanja,m)
MRO:(\D+) $(\1: ouguiya, ouguiya, d’ouguiya, khoum, khoums,m)
MUR:(\D+) $(\1: roupie mauricienne, roupies mauriciennes, de roupies mauriciennes, cent, cents,f)
RWF:(\D+) $(\1: franc rwandais, francs rwandais, de francs rwandais, centime, centimes,m)
SCR:(\D+) $(\1: roupie seychelloise, roupies seychelloises, de roupies seychelloises, cent, cents,f)
TND:(\D+) $(\1: dinar tunisien, dinars tunisiens, de dinars tunisiens, millime, millimes,m)
USD:(\D+) $(\1: dollar américain, dollars américains, de dollars américains, cent, cents,m)
VUV:(\D+) $(\1: vatu, vatus, de vatus,,,m)
X[AO]F:(\D+) $(\1: franc CFA, francs CFA, de francs CFA, centime, centimes,m)
XPF:(\D+) $(\1: franc Pacifique, francs Pacifique, de francs Pacifique, centime, centimes,m)

"(GNF|LBP|VUV) ([-−]?[01](.0+)?)" $2 $(\1:us)
"(GNF|LBP|VUV) ([-−]?\d+0{6,})" $2 $(\1:ud)
"(GNF|LBP|VUV) ([-−]?\d+[.,]\d+)" $2 $(\1:up)

"([A-Z]{3}) ([-−]?1)([.,]00?)?" $2$(\1:mf) $(\1:us)              # un/une
"([A-Z]{3}) ([-−]?\d*[02-9]1)([.,]00?)?" $2$(\1:mf) $(\1:up)     # cent un/une mais pas cent onze
"([A-Z]{3}) ([-−]?[0])([.,]00?)?" $2 $(\1:us)
"([A-Z]{3}) ([-−]?\d+0{6,})([.,]00?)?" $2 $(\1:ud)
"([A-Z]{3}) ([-−]?\d+)([.,]00?)?" $2 $(\1:up)

"((MGA|MRO) [-−]?\d+)[.,]0" $1
"((MGA|MRO) [-−]?\d+)[.,]2" $1 et |$(1) $(\2:ss)
"((MGA|MRO) [-−]?\d+)[.,]4" $1 et |$(2) $(\2:sp)
"((MGA|MRO) [-−]?\d+)[.,]6" $1 et |$(3) $(\2:sp)
"((MGA|MRO) [-−]?\d+)[.,]8" $1 et |$(4) $(\2:sp)

"((TND) [-−]?\d+)[.,](001)" $1 et |$(1) $(\2:ss)
"((TND) [-−]?\d+)[.,](\d)" $1 et |$(\300) $(\2:sp)
"((TND) [-−]?\d+)[.,](\d\d)" $1 et |$(\30) $(\2:sp)
"((TND) [-−]?\d+)[.,](\d\d\d)" $1 et |$3 $(\2:sp)

"(([A-Z]{3}) [-−]?\d+)[.,](01)" $1 et |$(1) $(\2:ss)
"(([A-Z]{3}) [-−]?\d+)[.,](\d)" $1 et |$(\30) $(\2:sp)
"(([A-Z]{3}) [-−]?\d+)[.,](\d\d)" $1 et |$3 $(\2:sp)

[...]

jmzambon 14:42, 3 September 2010 (UTC)
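The idea of the proposal, an explicit gender attribute per currency instead of guessing from the word ending, can be restated in Python. This is a sketch only: `CURRENCY_GENDER` and `agree` are hypothetical names, the table holds just a few of the currencies listed above, and an empty attribute is treated as masculine.

```python
import re

# Gender attribute per currency code, copied from the last field of the
# Soros definitions above ("" for EUR is treated as masculine here).
CURRENCY_GENDER = {"EUR": "", "USD": "m", "GBP": "f", "MUR": "f"}


def agree(number_text: str, digits: str, currency: str) -> str:
    """Append "e" to a final "un" when the currency is feminine.

    Like the Soros patterns, this applies to 1 and to numbers ending in 1
    whose tens digit is not 1 (so 21 qualifies, 11 does not).
    """
    ends_in_one = re.fullmatch(r"(\d*[02-9])?1", digits) is not None
    if ends_in_one and CURRENCY_GENDER.get(currency) == "f":
        number_text += "e"
    return number_text


print(agree("un", "1", "GBP"), "livre sterling")
print(agree("un", "1", "EUR"), "euro")
print(agree("vingt et un", "21", "MUR"), "roupies mauriciennes")
```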

Latvian language

It would be nice to also include code for Latvian:

__numbertext__
^0 nulle
1 viens
2 divi
3 trīs
4 četri
5 pieci
6 seši
7 septiņi
8 astoņi
9 deviņi
10 desmit
11 vienpadsmit
12 divpadsmit
13 trīspadsmit
14 četrpadsmit
15 piecpadsmit
16 sešpadsmit
17 septiņpadsmit
18 astoņpadsmit
19 deviņpadsmit
([2])(\d) divdesmit $2
([23456789])(\d) $1|desmit $2
1(\d\d) simts $1
(\d)(\d\d) $1 simti $2
1(\d{3}) viens tūkstotis $1
(\d{1,3})(\d{3}) $1 tūkstoši $2
1(\d{6}) viens miljons $1
(\d{1,3})(\d{6}) $1 miljoni $2
1(\d{9}) viens miljards $1
(\d{1,3})(\d{9}) $1 miljardi $2
1(\d{12}) viens triljons $1
(\d{1,3})(\d{12}) $1 triljoni $2
1(\d{15}) viens kvadriljons $1
(\d{1,3})(\d{15}) $1 kvadriljoni $2
1(\d{18}) viens kvintiljons $1
(\d{1,3})(\d{18}) $1 kvintiljoni $2
1(\d{21}) viens sekstiljons $1
(\d{1,3})(\d{21}) $1 sekstiljoni $2
1(\d{24}) viens septiljons $1
(\d{1,3})(\d{24}) $1 septiljoni $2

# negative numbers

[-−](\d+) mīnus |$1

# decimals


([-−]?\d+)[.,] $1| komats
([-−]?\d+[.,]\d*)(\d) $1| |$2


# female conversion
f:(.*)viens viena
f:(.*)i \1as
f:(.*) \1

# currency

# unit/subunit

us:([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*) \1
up:([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*) \2
ug:([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*) \3
ss:([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*) \4
sp:([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*) \5
sg:([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*) \6

LVL:(\D+) $(\1: lats, lati,latu, santīms, santīmi, santīmu)
EUR:(\D+) $(\1: eiro, eiro, eiro, cents, centi, centu)
RUB:(\D+) $(\1: rublis, rubļi, rubļu, kapeika, kapeikas, kapeiku)
USD:(\D+) $(\1: ASV dolārs, ASV dolāri, ASV dolāru, cents, centi, centu)


"([A-Z]{3}) ([-−]?1)([.,]00?)?" $2| $(\1:us)
"([A-Z]{3}) ([-−]?\d*[02-9]1)([.,]00?)?" $2| $(\1:us)
"([A-Z]{3}) ([-−]?[23456789])([.,]00?)?" $2| $(\1:up)
"([A-Z]{3}) ([-−]?\d*[02-9][23456789])([.,]00?)?" $2| $(\1:up)
"([A-Z]{3}) ([-−]?\d+)([.,]00?)?" $2| $(\1:ug)

"((RUB) [-−]?\d+)[.,]([02-9])1" $1 $(\30) |$(f:$(1)) $(\2:ss)
"((RUB) [-−]?\d+)[.,]([02-9][23456789])" $1 $(f:$3)  $(\2:sp)

"(([A-Z]{3}) [-−]?\d+)[.,](01)" $1 |$(1) $(\2:ss)
"(([A-Z]{3}) [-−]?\d+)[.,]([02-9])1" $1 $(\30) |$(1)  $(\2:ss)

"(([A-Z]{3}) [-−]?\d+)[.,]([02-9][23456789])" $1 |$3 $(\2:sp)

"(([A-Z]{3}) [-−]?\d+)[.,](\d)" $1 |$(\30) $(\2:sg)
"(([A-Z]{3}) [-−]?\d+)[.,](\d\d)" $1 |$3 $(\2:sg)

--Asterisks 23:38, 17 March 2012 (UTC)
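The currency patterns above pick one of three unit forms (us, up, ug) from the last digits of the amount. As a reading aid, here is the same selection as a Python sketch; `latvian_form` and `LVL_FORMS` are hypothetical names, and the logic simply mirrors the order of the Soros patterns rather than claiming to cover every rule of Latvian grammar.

```python
import re

# Unit forms for LVL, taken from the currency line above.
LVL_FORMS = {"us": "lats", "up": "lati", "ug": "latu"}


def latvian_form(n: int) -> str:
    """Return "us", "up" or "ug" for the integer part n of an amount."""
    digits = str(abs(n))
    if re.fullmatch(r"(\d*[02-9])?1", digits):
        return "us"  # 1, 21, 101, ... but not 11
    if re.fullmatch(r"(\d*[02-9])?[2-9]", digits):
        return "up"  # 2-9, 22-29, ... but not 12-19
    return "ug"      # 0, 10-20, 30, 100, ...


for n in (1, 5, 10, 11, 21, 22, 100):
    print(n, LVL_FORMS[latvian_form(n)])
```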

Ukrainian language

I look forward to any feedback and to the addition of the Ukrainian language.

File:Numbertext uk.txt

Ivanmelwise (talk) 10:29, 9 January 2013 (UTC)

Chinese DAXIE (大写) BUG

There is an error in Chinese. In zh-ZH-2 (banking writing), digits 2 and 4 both give 贰. This is correct for 2 but not for 4; 4 should be 肆.
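For reference, the ten banking (大写) digits can be written as a lookup table. The Python sketch below only maps single digits and ignores the place-value characters; `DAXIE_DIGITS` and `daxie_digit` are hypothetical names, not part of the zh module.

```python
# Banking (daxie) forms of the digits 0-9, indexed by digit value.
DAXIE_DIGITS = "零壹贰叁肆伍陆柒捌玖"


def daxie_digit(d: int) -> str:
    """Return the banking form of a single digit 0-9."""
    if not 0 <= d <= 9:
        raise ValueError("d must be a single digit")
    return DAXIE_DIGITS[d]


print(daxie_digit(2), daxie_digit(4))
```

A table like this also makes the reported bug easy to test for: every digit must map to a different character.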

Greek language needs male/female option

The MONEYTEXT function needs some improvements for the Greek language. I can participate or provide more info.
