Extension Dictionaries

From Apache OpenOffice Wiki
Revision as of 13:18, 28 April 2008 by Tl (Talk | contribs)


Problem with current dictionary.lst file

To avoid the problems that the dictionary.lst file causes on upgrade (see issue 72559), we have to stop using it and instead provide the necessary information about dictionaries in the configuration.

Already provided configuration entries

Each implementation of a spell checker, hyphenator or thesaurus needs an entry in the configuration (i.e. the file Linguistic.xcu) stating its implementation name (for identification) and the file format names of the dictionaries it can handle. More than one file format can be listed if supported.

The dictionary file format names used by the current OOo linguistic components are

  • DICT_SPELL
  • DICT_HYPH and
  • DICT_THES


Thus the respective entries look like this:

  • Spell checker entry
<node oor:name="ServiceManager">
    <node oor:name="SpellCheckers">
        <node oor:name="org.openoffice.lingu.MySpellSpellChecker" oor:op="fuse">
            <prop oor:name="SupportedDictionaryFormats" oor:type="oor:string-list">
                <value>DICT_SPELL</value>
            </prop>
        </node>

        <!-- ... entries for other spell checkers ... -->
    </node>
</node>
  • Hyphenator entry
<node oor:name="ServiceManager">
    <node oor:name="Hyphenators">
        <node oor:name="org.openoffice.lingu.LibHnjHyphenator" oor:op="fuse">
            <prop oor:name="SupportedDictionaryFormats" oor:type="oor:string-list">
                <value>DICT_HYPH</value>
            </prop>
        </node>

        <!-- ... entries for other hyphenators ... -->
    </node>
</node>
  • Thesaurus entry
<node oor:name="ServiceManager">
    <node oor:name="Thesauri">
        <node oor:name="org.openoffice.lingu.new.Thesaurus" oor:op="fuse">
            <prop oor:name="SupportedDictionaryFormats" oor:type="oor:string-list">
                <value>DICT_THES</value>
            </prop>
        </node>

        <!-- ... entries for other thesauri ... -->
    </node>
</node>
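
Each of the entries above lists a single format, but SupportedDictionaryFormats is a string list, so a service that handles several formats can name all of them. A minimal sketch, assuming a hypothetical implementation name org.example.lingu.MySpellChecker and a hypothetical second format name DICT_SPELL_ALT (neither exists in OOo). List items are separated by whitespace, as in the Locations values further down this page:

```xml
<node oor:name="ServiceManager">
    <node oor:name="SpellCheckers">
        <!-- Hypothetical implementation name, for illustration only -->
        <node oor:name="org.example.lingu.MySpellChecker" oor:op="fuse">
            <prop oor:name="SupportedDictionaryFormats" oor:type="oor:string-list">
                <!-- DICT_SPELL_ALT is an invented format name -->
                <value>DICT_SPELL DICT_SPELL_ALT</value>
            </prop>
        </node>
    </node>
</node>
```

With such an entry, the service would pick up every configured dictionary whose Format is either of the two names.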


The only link between one of the above services and the dictionaries it uses is the name of the dictionary format. When invoked, a service is required to check the configuration to find out which dictionaries it can use, and thereby establish its set of dictionaries. It does so by comparing the format name of each configured dictionary with the format names it supports.

Dictionary entries (must be provided)

The entries that are still missing and need to be provided are those for the dictionaries. Each dictionary must have an entry of its own.

An entry consists of

  • a unique name (the node name in the configuration)
  • a list of file locations (only the ones actually needed by the service implementation)
  • a single format name and
  • a list of ISO-names for locales listing the languages the dictionary may be used for

Please note that the files listed in the Locations property are in no specified order. It is therefore the implementation's task to tell the files apart, and to determine how each one is used, from their names alone.


Thus a set of dictionary entries in the Linguistic.xcu provided by a single extension may look like this:

 <node oor:name="ServiceManager">
    <node oor:name="Dictionaries">
        <node oor:name="HunSpellDic_de_CH" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/de_CH.aff %origin%/de_CH.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_SPELL</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>de-CH</value>
            </prop>
        </node>
        <node oor:name="HunSpellDic_en_US" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/en_US.aff %origin%/en_US.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_SPELL</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>en-US</value>
            </prop>
        </node>
        <node oor:name="HyphDic_en_US" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/hyph_en_US.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_HYPH</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>en-US</value>
            </prop>
        </node>
        <node oor:name="HyphDic_de_CH" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/hyph_de_CH.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_HYPH</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>de-CH</value>
            </prop>
        </node>
        <node oor:name="ThesDic_de_CH" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/th_de_CH_v2.dat %origin%/th_de_CH_v2.idx</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_THES</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>de-CH</value>
            </prop>
        </node>
        <node oor:name="ThesDic_en_US" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/th_en_US_v2.dat %origin%/th_en_US_v2.idx</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_THES</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>en-US</value>
            </prop>
        </node>
    </node>
 </node>  
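
Because Locales is a list, a single dictionary entry can also serve more than one locale. A sketch, assuming purely for illustration that the de_CH spelling dictionary from the example above should also be offered for de-LI (this pairing is an assumption, not something a shipped dictionary is known to do):

```xml
<node oor:name="ServiceManager">
    <node oor:name="Dictionaries">
        <node oor:name="HunSpellDic_de_CH" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/de_CH.aff %origin%/de_CH.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_SPELL</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <!-- Whitespace-separated list; de-LI is added as an illustrative assumption -->
                <value>de-CH de-LI</value>
            </prop>
        </node>
    </node>
</node>
```

This avoids duplicating the entry, and the files, for each locale that shares the same dictionary.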


A minimal dictionary extension that provides several dictionaries at once is attached to issue 81365.

A more complete extension should probably also include a description.xml that gives a short description and a version number for the extension.

Dictionary extensions vs pre-installed flat dictionaries

There are only two ways to install dictionaries:

  • pre-installing dictionaries with the office installation or
  • downloading one or more dictionaries as an extension

Downloadable dictionaries must always be provided as extensions. In both variants, configuration entries have to be provided for each dictionary.

The configuration entries for these two variants differ only in how the required files are specified in the Locations property of the entry.


For extension dictionaries the property must look like this:

    <prop oor:name="Locations" oor:type="oor:string-list">
        <value>%origin%/en_US.aff %origin%/en_US.dic</value>
    </prop>

For pre-installed dictionaries (using plain files, as done so far) it needs to look like this:

    <prop oor:name="Locations" oor:type="oor:string-list">
        <value>$(insturl)/share/dict/ooo/en_US.aff $(insturl)/share/dict/ooo/en_US.dic</value>
    </prop>


Note: Even though flat-file installation of dictionaries is possible, the recommended way is to provide them as an extension.

A sample description.xml for a dictionary extension
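
A minimal sketch of such a file, assuming the standard extension description namespace. The identifier, version number and display name are placeholders invented for illustration and are not taken from an actual dictionary extension:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<description xmlns="http://openoffice.org/extensions/description/2006">
    <!-- Placeholder identifier; it should be unique to the extension -->
    <identifier value="org.example.dictionaries.en-us-de-ch"/>
    <!-- Version number of the extension -->
    <version value="1.0.0"/>
    <!-- Short name describing the extension -->
    <display-name>
        <name lang="en">Example en-US and de-CH dictionaries</name>
    </display-name>
</description>
```

The file is placed in the root of the extension package, next to the dictionary files and the Linguistic.xcu.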
