Difference between revisions of "Extension Dictionaries"

From Apache OpenOffice Wiki
Problem with current dictionary.lst file

To get rid of the problems the dictionary.lst file causes at upgrade (see issue 72559, http://www.openoffice.org/issues/show_bug.cgi?id=72559), we have to abandon it and provide the necessary information about dictionaries in the configuration instead.

Revision as of 14:50, 2 February 2021

Already provided configuration entries

Each implementation of a spell checker, hyphenator or thesaurus needs an entry in the configuration (i.e. the file Linguistic.xcu) stating its implementation name (for purposes of identification) and the file format names of the dictionaries it can handle. More than one file format can be listed if supported.

The dictionary file format names for the current Apache OpenOffice linguistic components are

  • DICT_SPELL
  • DICT_HYPH and
  • DICT_THES


The respective entries look like this:

  • Spell checker entry
<node oor:name="ServiceManager">
    <node oor:name="SpellCheckers">
        <node oor:name="org.openoffice.lingu.MySpellSpellChecker" oor:op="fuse">
            <prop oor:name="SupportedDictionaryFormats" oor:type="oor:string-list">
                <value>DICT_SPELL</value>
            </prop>
        </node>
 
        <!-- ... entries for other spell checkers ... -->
    </node>
</node>
  • Hyphenator entry
<node oor:name="ServiceManager">
    <node oor:name="Hyphenators">
        <node oor:name="org.openoffice.lingu.LibHnjHyphenator" oor:op="fuse">
            <prop oor:name="SupportedDictionaryFormats" oor:type="oor:string-list">
                <value>DICT_HYPH</value>
            </prop>
        </node>
 
        <!-- ... entries for other hyphenators ... -->
    </node>
</node>
  • Thesaurus entry
<node oor:name="ServiceManager">
    <node oor:name="Thesauri">
        <node oor:name="org.openoffice.lingu.new.Thesaurus" oor:op="fuse">
            <prop oor:name="SupportedDictionaryFormats" oor:type="oor:string-list">
                <value>DICT_THES</value>
            </prop>
        </node>
 
        <!-- ... entries for other thesauri ... -->
    </node>
</node>


The only link between one of the above services and the dictionaries it uses is the name of the dictionary format. When invoked, each service is required to check the configuration for the dictionaries it can make use of, thereby establishing its set of dictionaries. It does so by looking at the format names of the configured dictionaries.
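
To make that link concrete, the following sketch places a service entry next to a dictionary entry; the shared string DICT_SPELL is the only thing connecting them. The dictionary node name and the file names are hypothetical, the Dictionaries node is described in the next section, and the enclosing root element is omitted as in the other fragments on this page.

```xml
<node oor:name="ServiceManager">
    <!-- Service side: the spell checker declares the formats it can read. -->
    <node oor:name="SpellCheckers">
        <node oor:name="org.openoffice.lingu.MySpellSpellChecker" oor:op="fuse">
            <prop oor:name="SupportedDictionaryFormats" oor:type="oor:string-list">
                <value>DICT_SPELL</value>
            </prop>
        </node>
    </node>
    <!-- Dictionary side: the Format value must match one of the names above. -->
    <node oor:name="Dictionaries">
        <!-- Hypothetical node name and file names, for illustration only. -->
        <node oor:name="ExampleSpellDic_en_US" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/en_US.aff %origin%/en_US.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_SPELL</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>en-US</value>
            </prop>
        </node>
    </node>
</node>
```

A dictionary whose Format were DICT_HYPH would be ignored by this spell checker and picked up by a hyphenator listing DICT_HYPH instead.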

Dictionary entries (must be provided)

The entries that are still missing and need to be provided are those for the dictionaries. Each dictionary must have an entry of its own.

An entry consists of

  • a unique name (the node name in the configuration)
  • a list of file locations (only the ones actually needed by the service implementation)
  • a single format name and
  • a list of ISO-names for locales listing the languages the dictionary may be used for

Please note that the files listed in the Locations property are in no specified order. It is the implementation's task to tell those files, and their potentially different uses, apart by name only.


Thus a set of dictionary entries in the Linguistic.xcu provided by a single extension may look like this:

 <node oor:name="ServiceManager">
    <node oor:name="Dictionaries">
        <node oor:name="HunSpellDic_de_CH" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/de_CH.aff %origin%/de_CH.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_SPELL</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>de-CH</value>
            </prop>
        </node>
        <node oor:name="HunSpellDic_en_US" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/en_US.aff %origin%/en_US.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_SPELL</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>en-US</value>
            </prop>
        </node>
        <node oor:name="HyphDic_en_US" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/hyph_en_US.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_HYPH</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>en-US</value>
            </prop>
        </node>
        <node oor:name="HyphDic_de_CH" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/hyph_de_CH.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_HYPH</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>de-CH</value>
            </prop>
        </node>
        <node oor:name="ThesDic_de_CH" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/th_de_CH_v2.dat %origin%/th_de_CH_v2.idx</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_THES</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>de-CH</value>
            </prop>
        </node>
        <node oor:name="ThesDic_en_US" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/th_en_US_v2.dat %origin%/th_en_US_v2.idx</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_THES</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>en-US</value>
            </prop>
        </node>
    </node>
 </node>
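
The fragment above omits the enclosing root element. As a sketch, assuming the usual registry layout (component Linguistic in package org.openoffice.Office) and the standard oor and xs namespace declarations, a complete configuration file with a single entry could look like this. The node name is hypothetical, and naming the file dictionaries.xcu is only a common convention.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed root element: component "Linguistic" in package "org.openoffice.Office". -->
<oor:component-data xmlns:oor="http://openoffice.org/2001/registry"
                    xmlns:xs="http://www.w3.org/2001/XMLSchema"
                    oor:name="Linguistic"
                    oor:package="org.openoffice.Office">
    <node oor:name="ServiceManager">
        <node oor:name="Dictionaries">
            <!-- Hypothetical node name; one entry per dictionary. -->
            <node oor:name="ExampleHyphDic_en_US" oor:op="fuse">
                <prop oor:name="Locations" oor:type="oor:string-list">
                    <value>%origin%/hyph_en_US.dic</value>
                </prop>
                <prop oor:name="Format" oor:type="xs:string">
                    <value>DICT_HYPH</value>
                </prop>
                <prop oor:name="Locales" oor:type="oor:string-list">
                    <value>en-US</value>
                </prop>
            </node>
        </node>
    </node>
</oor:component-data>
```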


About node names for the dictionaries:

If people outside the core developer circle want to provide an
extension, I recommend using reverse-domain notation. If, for example,
you are employed by the company "linguprovider" in Russia, which has the
domain linguprovider.ru, you could name your node

"ru.linguprovider.grabinski.dict_ru"

This should help to avoid name clashes. If your company is too big for
you to feel comfortable being its only "Grabinski", you can add
more "namespaces", e.g.

"ru.linguprovider.division1.grabinski.dict_ru"

Just break it down to a level where you think you can arrange everything
easily.

If you are doing your dictionary just as a private person you can use
your own domain or your e-mail address etc. This should help to keep the
probability of name clashes low.
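
As a sketch, a dictionary entry using the first node name suggested above might look like this. The file names, the format and the locale are hypothetical and only fill in the shape of an entry; the enclosing root element is omitted as in the other fragments.

```xml
<node oor:name="ServiceManager">
    <node oor:name="Dictionaries">
        <!-- Reverse-domain node name taken from the naming example above. -->
        <node oor:name="ru.linguprovider.grabinski.dict_ru" oor:op="fuse">
            <!-- Hypothetical file names, format and locale. -->
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/ru_RU.aff %origin%/ru_RU.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_SPELL</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>ru-RU</value>
            </prop>
        </node>
    </node>
</node>
```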


A very simple dictionary extension providing several dictionaries at once is attached to issue 81365 (https://bz.apache.org/ooo/show_bug.cgi?id=81365).

A more complete extension will probably also provide a description.xml in order to give the extension a short description and a version number.

A sample description.xml follows.

A sample description.xml for a dictionary extension

<?xml version="1.0" encoding="UTF-8"?>
<description xmlns="http://openoffice.org/extensions/description/2006" xmlns:d="http://openoffice.org/extensions/description/2006"  xmlns:xlink="http://www.w3.org/1999/xlink">
 
 
    <!-- SHOULD OR MUST BE PROVIDED ENTRIES FOLLOWING... -->
 
 
    <!--Here you can state the license text to be displayed during installation.
        You can provide more than one localized version if you like.
        If no matching locale was found the first one will be displayed.
        !!! Don't change the values for 'accept-by' or 'suppress-on-update' it  !!!
        !!! might be troublesome in multi-user installations if no shared-layer !!!
        !!! installation can be done.                                           !!!
 
        !!! IMPORTANT: if the dictionary is to be part of the OOo installation it !!!
        !!! MUST NOT have a registration entry with a license to be displayed.    !!!
        !!! Otherwise the installation will break!                                !!!
        !!! Thus this entry should be used for downloadable dictionaries only.    !!!
    <registration>
        <simple-license accept-by="admin" suppress-on-update="false" >
            <license-text xlink:href="LISEZMOI.txt" lang="fr-FR" />
        </simple-license>
    </registration>
    -->
 
    <!--The version of your extension. NOT the one of OpenOffice.org...
        It will also be used to automatically check if there are updates for this
        extension available. Newer versions should have higher values.
        Only digits and '.' may be used.
    -->
    <version value="1.2.1" />
 
    <!--A unique identifier for your extension.
        In order to avoid name clashes with other extensions it should probably hold
        your company name and maybe your full name along with the name of the extension in 
        a form named reversed-domain-notation which would look like this
            org.openoffice. ...
            net.MyWebpage.www.DictionaryName
        For the very same reason they should NOT start with 'org.openoffice'. That string
        should only be used for extensions shipped with OOo.
        When choosing the identifier keep in mind that others may provide a dictionary for that
        very same language as well and even then your identifier still needs to be unique!
    -->
    <identifier value="net.MyWebpage.www.MyName.OOo-Dictionaries.fr-FR" />
 
    <!--A name for the extension to be used in the UI.
        For dictionaries it should show the locales supported
        and the purpose spell checking and/or hyphenation and/or thesaurus.
        The display name can be localized and there should be at least one
        entry for each language it implements and one default English entry.
        The default entry is the one listed first.
    -->
    <display-name>
        <name lang="en">French (France) spell check dictionary</name>
        <name lang="fr">... to be done ...</name>
    </display-name>
 
    <!--Dictionaries should work with all platforms...-->
    <platform value="all" />
 
    <!--A minimal OpenOffice.org version the extension requires to be used with.
        For dictionary extensions that will be 'OpenOffice.org 3.0'
    -->
    <dependencies>
        <OpenOffice.org-minimal-version value="3.0" d:name="OpenOffice.org 3.0" />
    </dependencies>
 
 
    <!-- OPTIONAL ENTRIES FOLLOWING (may be omitted; commented out by default)... -->
 
 
    <!--If you uploaded your extension to the repository (which should be the default!) 
        you do not need to have this one.
    <update-information>
        <src xlink:href="http://extensions.openoffice.org/testarea/desktop/license/update/lic3.update.xml" />
    </update-information>
    -->
 
    <!--Check if this is already generated by repository.
        Otherwise you may like to provide it manually.
    <publisher>
        <name xlink:href="http://extensions.openoffice.org/testarea/desktop/publisher/publisher_en.html" lang="en">My dictionary extension (en)</name>
        <name xlink:href="http://extensions.openoffice.org/testarea/desktop/publisher/publisher_fr.html" lang="fr">My dictionary extension (fr)</name>
    </publisher>
    -->
 
    <!--This link will be generated by repository. Check if this already works for multiple languages. 
        If not you may provide it manually if you like.         
    <release-notes>
        <src xlink:href="http://extensions.openoffice.org/testarea/desktop/publisher/release-notes_en.txt" lang="en" />
        <src xlink:href="http://extensions.openoffice.org/testarea/desktop/publisher/release-notes_fr.txt" lang="fr" />
    </release-notes>
    -->
 
</description>

Hints for developers

If you want to test a new version of a dictionary extension before you release it, you don't need to go through the uninstall/install cycle each time. Extensions are unpacked when they are installed, and you can find your dictionary somewhere inside the $(share)/uno_packages/cache or $(user)/uno_packages/cache folder of your Apache OpenOffice installation (depending on whether the dictionary was installed for all users or for the current user only). To test a new dictionary version, just copy it over the old one, restart Apache OpenOffice (in case the dictionary was already loaded), and repeat until you are satisfied.

What you must not do!

You must have at most one dictionary of any type for each language in your extension!

That is, only one per locale for each of the formats DICT_SPELL, DICT_HYPH and DICT_THES.

Otherwise, if for example you have two different spelling dictionaries with different content, both will be used at the same time(!), which is most likely not what you want. You would thereby have taken the choice away from the user.

If you want to provide two different spellings for the locale fr-FR, e.g. 'French Reformed' and 'French Classic', you have to provide them in separate extensions! That way the user can explicitly choose which type of spelling to use.
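
As a sketch of that split, the 'French Classic' extension might ship a configuration fragment like the one below, while the 'French Reformed' extension ships an analogous fragment of its own. All node names and file names here are hypothetical.

```xml
<!-- Fragment shipped in the hypothetical "French Classic" extension. -->
<node oor:name="ServiceManager">
    <node oor:name="Dictionaries">
        <!-- The only DICT_SPELL entry for fr-FR in this extension. -->
        <node oor:name="HunSpellDic_fr_FR_classic" oor:op="fuse">
            <prop oor:name="Locations" oor:type="oor:string-list">
                <value>%origin%/fr_FR_classic.aff %origin%/fr_FR_classic.dic</value>
            </prop>
            <prop oor:name="Format" oor:type="xs:string">
                <value>DICT_SPELL</value>
            </prop>
            <prop oor:name="Locales" oor:type="oor:string-list">
                <value>fr-FR</value>
            </prop>
        </node>
    </node>
</node>
<!-- The "French Reformed" extension carries its own file with a different
     node name (e.g. HunSpellDic_fr_FR_reform) and its own .aff/.dic files. -->
```

Because each extension contributes exactly one DICT_SPELL entry for fr-FR, the user decides which spelling is active by installing or enabling one extension or the other.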

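Further readings and notes

  • Start page for extensions: https://wiki.openoffice.org/wiki/Documentation/DevGuide/Extensions/Extensions
  • Checklist for writing extensions: https://wiki.openoffice.org/wiki/Documentation/DevGuide/Extensions/Checklist_for_Writing_Extensions
  • About the description.xml: https://wiki.openoffice.org/wiki/Documentation/DevGuide/Extensions/description.xml
  • Online update of extensions: https://wiki.openoffice.org/wiki/Documentation/DevGuide/Extensions/Online_Update_of_Extensions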

Uploading and installing extensions

You can browse all available extensions in the repository (https://extensions.openoffice.org).

For a list of the dictionary extensions currently available in the repository, see https://extensions.openoffice.org/dictionary

Extensions should be created as oxt files and uploaded to the repository. Only then will checking for available updates happen automatically. Instructions for uploading your extension are at https://wiki.openoffice.org/wiki/Extensions/website/submission
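
An oxt file is a zip archive. As a sketch, assuming the configuration file is named dictionaries.xcu and sits at the top level of the archive (both are assumptions, not stated above), the archive would also contain a META-INF/manifest.xml that registers that file as configuration data:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<manifest:manifest xmlns:manifest="http://openoffice.org/2001/manifest">
    <!-- Registers the .xcu file so that its entries are merged into the configuration.
         The path is relative to the root of the archive; the file name is an assumption. -->
    <manifest:file-entry manifest:media-type="application/vnd.sun.star.configuration-data"
                         manifest:full-path="dictionaries.xcu"/>
</manifest:manifest>
```

With that layout, the dictionary files referenced through %origin% in the Locations property would sit next to dictionaries.xcu, and the description.xml would go at the top level of the archive as well.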

For installation instructions, see https://extensions.openoffice.org/resources/user/howto_install

Integration of a dictionary extension to the installation set

See Spellchecker_Integration_into_Installation_Set for how to integrate a dictionary into the installation set.
