Extension Dictionaries

From Apache OpenOffice Wiki

Revision as of 09:56, 26 February 2008

Problem with current dictionary.lst file

To avoid the problems that the dictionary.lst file causes on upgrade (see issue 72559), we have to stop using it and provide the necessary information about dictionaries in the configuration instead.

Already provided configuration entries

Each implementation of a spell checker, hyphenator or thesaurus needs an entry in the configuration (i.e. the file Linguistic.xcu) stating its implementation name (for identification) and the names of the dictionary file formats it can handle. More than one file format can be listed if supported.

The dictionary file format names used by the current OOo linguistic components are

  • DICT_SPELL
  • DICT_HYPH and
  • DICT_THES


Thus the respective entries look like this:

  • Spell checker entry
<node oor:name="ServiceManager">
    <node oor:name="SpellCheckers">
        <node oor:name="org.openoffice.lingu.MySpellSpellChecker" oor:op="fuse">
            <prop oor:name="SupportedDictionaryFormats" oor:type="oor:string-list">
                <value>DICT_SPELL</value>
            </prop>
        </node>

        ... entries for other spell checkers ...
    </node>
</node>
  • Hyphenator entry
<node oor:name="ServiceManager">
    <node oor:name="Hyphenators">
        <node oor:name="org.openoffice.lingu.LibHnjHyphenator" oor:op="fuse">
            <prop oor:name="SupportedDictionaryFormats" oor:type="oor:string-list">
                <value>DICT_HYPH</value>
            </prop>
        </node>

        ... entries for other hyphenators ...
    </node>
</node>
  • Thesaurus entry
<node oor:name="ServiceManager">
    <node oor:name="Thesauri">
        <node oor:name="org.openoffice.lingu.new.Thesaurus" oor:op="fuse">
            <prop oor:name="SupportedDictionaryFormats" oor:type="oor:string-list">
                <value>DICT_THES</value>
            </prop>
        </node>

        ... entries for other thesauri ...
    </node>
</node>


The only link between one of the above services and the dictionaries it uses is the name of the dictionary format. When invoked, a service is required to check the configuration for the dictionaries it can make use of, thereby establishing its set of dictionaries. This is done by looking at the format names of the configured dictionaries.
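
The lookup by format name can be sketched in a few lines of Python. This is only an illustration of the matching rule described above, not the code used by the office: the function name dictionaries_for and the sample dictionary names are hypothetical.

```python
# Hypothetical sample data: dictionary node name -> configured format name.
configured_dictionaries = {
    "SpellDic_A": "DICT_SPELL",
    "HyphDic_A": "DICT_HYPH",
    "ThesDic_A": "DICT_THES",
    "SpellDic_B": "DICT_SPELL",
}


def dictionaries_for(supported_formats, configured):
    """Return the names of all dictionaries whose format the service supports."""
    supported = set(supported_formats)
    return sorted(name for name, fmt in configured.items() if fmt in supported)


# A spell checker that lists only DICT_SPELL in SupportedDictionaryFormats.
print(dictionaries_for(["DICT_SPELL"], configured_dictionaries))
```

A service that lists several formats would simply pass all of them, e.g. dictionaries_for(["DICT_HYPH", "DICT_THES"], configured_dictionaries).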

Dictionary entries (must be provided)

The entries that are still missing and need to be provided are those for the dictionaries. Each dictionary must have an entry of its own.

An entry consists of

  • a unique name (the node name in the configuration)
  • a list of file locations (only the ones actually needed by the service implementation)
  • a single format name and
  • a list of ISO-names for locales listing the languages the dictionary may be used for


Thus a set of dictionary entries in the Linguistic.xcu provided by a single extension may look like this:

 <node oor:name="ServiceManager">
    <node oor:name="Dictionaries">
        <node oor:name="HunSpellDic_de_CH" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/de_CH.aff %origin%/de_CH.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_SPELL</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>de-CH</value>
            </prop>
        </node>
        <node oor:name="HunSpellDic_en_US" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/en_US.aff %origin%/en_US.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_SPELL</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>en-US</value>
            </prop>
        </node>
        <node oor:name="HyphDic_en_US" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/hyph_en_US.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_HYPH</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>en-US</value>
            </prop>
        </node>
        <node oor:name="HyphDic_de_CH" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/hyph_de_CH.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_HYPH</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>de-CH</value>
            </prop>
        </node>
        <node oor:name="ThesDic_de_CH" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/th_de_CH_v2.dat %origin%/th_de_CH_v2.idx</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_THES</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>de-CH</value>
            </prop>
        </node>
        <node oor:name="ThesDic_en_US" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/th_en_US_v2.dat %origin%/th_en_US_v2.idx</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_THES</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>en-US</value>
            </prop>
        </node>
    </node>
 </node>  
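
To check such entries outside the office, they can be read with Python's standard xml.etree.ElementTree module. The following is a minimal sketch under some assumptions: the fragment is trimmed to one entry, the surrounding oor:component-data root element with its namespace declarations is assumed here so that the text parses on its own, and the function name read_dictionary_entries is hypothetical.

```python
import xml.etree.ElementTree as ET

OOR_NS = "http://openoffice.org/2001/registry"
NAME = f"{{{OOR_NS}}}name"  # expanded form of the oor:name attribute

# One entry from the example above; the root element is an assumption
# added so that the prefixes oor: and xs: are declared.
XCU = """<oor:component-data xmlns:oor="http://openoffice.org/2001/registry"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        oor:name="Linguistic" oor:package="org.openoffice.Office">
 <node oor:name="ServiceManager">
    <node oor:name="Dictionaries">
        <node oor:name="HunSpellDic_de_CH" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/de_CH.aff %origin%/de_CH.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_SPELL</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>de-CH</value>
            </prop>
        </node>
    </node>
 </node>
</oor:component-data>"""


def read_dictionary_entries(xcu_text):
    """Return {node name: {"Locations": [...], "Format": str, "Locales": [...]}}."""
    root = ET.fromstring(xcu_text)
    entries = {}
    for group in root.iter("node"):
        if group.get(NAME) != "Dictionaries":
            continue
        for entry in group.findall("node"):
            props = {}
            for prop in entry.findall("prop"):
                text = prop.findtext("value", default="")
                # List items are separated by whitespace inside one <value>.
                props[prop.get(NAME)] = text.split()
            entries[entry.get(NAME)] = {
                "Locations": props.get("Locations", []),
                "Format": " ".join(props.get("Format", [])),
                "Locales": props.get("Locales", []),
            }
    return entries


for name, entry in read_dictionary_entries(XCU).items():
    print(name, entry)
```

The sketch treats each Locations value as a whitespace-separated list, as written in the example; file names containing spaces are not handled.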


A very simple dictionary extension providing several dictionaries at once is attached to issue 81365.

A more complete extension will probably also want to provide a description.xml in order to give a short description and a version number for the extension.

Dictionary extensions vs pre-installed dictionaries

There are only two ways to install dictionaries. Those are

  • pre-installing dictionaries with the office installation or
  • downloading one or more dictionaries as an extension

Downloadable dictionaries must always be provided as extensions! In both variants, configuration entries have to be provided for each dictionary.

The configuration entries for the two flavors above differ only in the way the required files are denoted in the Locations property of the entry.


For extension dictionaries the property must look like

    <prop oor:name="Locations" oor:type="oor:string-list">
        <value>%origin%/en_US.aff %origin%/en_US.dic</value>
    </prop>

While for pre-installed dictionaries (using plain files as done so far) it needs to look like

    <prop oor:name="Locations" oor:type="oor:string-list">
        <value>$(insturl)/share/dict/ooo/en_US.aff $(insturl)/share/dict/ooo/en_US.dic</value>
    </prop>
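
The difference between the two notations can be illustrated with a small Python sketch. It assumes that %origin% stands for the location of the installed extension and $(insturl) for the office installation directory. How the office itself resolves these placeholders is not covered here, and the function name expand_locations and the example URLs are hypothetical.

```python
def expand_locations(value, origin_url, install_url):
    """Split a Locations value and substitute the two placeholders."""
    return [
        item.replace("%origin%", origin_url).replace("$(insturl)", install_url)
        for item in value.split()
    ]


# Hypothetical base URLs, for illustration only.
ORIGIN = "file:///home/user/extensions/dict-en"
INSTALL = "file:///opt/openoffice"

extension_value = "%origin%/en_US.aff %origin%/en_US.dic"
preinstalled_value = (
    "$(insturl)/share/dict/ooo/en_US.aff $(insturl)/share/dict/ooo/en_US.dic"
)

print(expand_locations(extension_value, ORIGIN, INSTALL))
print(expand_locations(preinstalled_value, ORIGIN, INSTALL))
```

Apart from the placeholder, both kinds of entries go through the same handling, which matches the statement that only the Locations property differs.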