Odt2Wiki/Features
ODT2Wiki: OpenDocument to MediaWiki conversion
There are two main open formats for storing knowledge and information in a future-proof way. One is OpenDocument, which is used by Apache OpenOffice and many other contemporary office suites. The other is the format used for editing the Wikipedia global knowledge base. The latter format is defined by the MediaWiki wiki engine.
This document in combination with the odt2wiki transformation demonstrates how to establish interoperability between these formats.
Note: The OpenDocument source of this document is attached to Issue 48409 of OpenOffice as odt2wiki-features.odt. The transformation result can be viewed online at Odt2Wiki/Features of the Apache OpenOffice wiki.
Supported Features
OpenDocument and MediaWiki are by no means equivalent. While OpenDocument focuses on sophisticated presentation and styling, the MediaWiki format is designed to be collaboratively editable in a non-WYSIWYG fashion using minimal markup. Because of these differences in objective, one cannot expect a perfect result when converting from OpenDocument to MediaWiki. However, there is a common subset of formatting that can be reliably converted from OpenDocument to MediaWiki. On the one hand, this document describes this portable subset. On the other hand, it serves as a benchmark document for the transformation, since it is itself written in OpenDocument and can therefore be transformed to MediaWiki.
Headings
While headings can be thoroughly formatted in OpenDocument, their appearance in MediaWiki is defined by the style applied by the wiki engine. During export to MediaWiki format, the style information of headings is lost and the wiki engine applies its default heading style. The headings of this document serve as examples for the transformation of headings.
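To illustrate how little of a heading survives the export, the following minimal sketch renders a heading as MediaWiki markup from its outline level and text alone. The function name `wiki_heading` is hypothetical and not part of odt2wiki, and mapping outline level n to n equals signs is an assumption about the output, not a statement of what the transformation does.

```python
def wiki_heading(level: int, text: str) -> str:
    """Render a heading as MediaWiki markup; only level and text are kept.

    Assumption: outline level n maps to n equals signs on each side.
    """
    if not 1 <= level <= 6:
        raise ValueError("MediaWiki supports heading levels 1 to 6")
    marker = "=" * level
    return f"{marker} {text.strip()} {marker}"


print(wiki_heading(2, "Supported Features"))
```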
Hyperlinks
Native OpenDocument hyperlinks are transformed into “external” wiki links during the transformation. Therefore, the built-in linking facility of OpenDocument should only be used for links that point to sites outside the wiki. For links that point to other subjects within the same wiki domain (e.g. Wikipedia), use the wiki links explained below.
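The difference between the two link kinds can be sketched with standard MediaWiki link syntax. The helper names `external_link` and `wiki_link` and the URL `https://example.org` are hypothetical and serve only as illustration; they are not taken from odt2wiki.

```python
def external_link(url: str, label: str = "") -> str:
    """External link: single brackets, URL and label separated by a space."""
    return f"[{url} {label}]" if label else f"[{url}]"


def wiki_link(page: str, label: str = "") -> str:
    """Internal wiki link: double brackets, page and label separated by a pipe."""
    return f"[[{page}|{label}]]" if label else f"[[{page}]]"


print(external_link("https://example.org", "an external site"))
print(wiki_link("OpenDocument", "the OpenDocument article"))
```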
Lists
To have lists exported reliably, make sure that a consistent list style is assigned to the whole list.
Bullet Lists
An example for an unordered list with bullets:
- The first item.
- The second item
- The third item
  - A sub-item.
  - Another sub-item
    - A sub-sub-item
  - Level two continued
- Level one continued
Pure Indented Lists
The list bullets may also be omitted. To have such lists exported reliably, make sure to use an actual list style rather than several paragraph styles with increasing indentation.
- The first list item
- The second list item
  - A sub-item
  - Another sub-item
    - A sub-sub-item
  - Continued with level two.
- Continued with level one.
Numbered Lists
An example for a numbered list:
1. The first item.
2. The second item
3. The third item
   1. A sub-item.
   2. Another sub-item
      1. A sub-sub-item
   3. Level two continued
4. Level one continued
Mixed lists
Numbered and unordered lists can be mixed within separate levels of the same list:
- The first item.
- The second item
- The third item
  - A sub-item.
  - Another sub-item
    - A sub-sub-item without bullet or numbering
    - Another sub-sub-item
  - Level two continued
- Level one continued
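The list variants above can be expressed in MediaWiki markup, where each item is prefixed with one marker per nesting level (`*` for bullets, `#` for numbers, `:` for plain indentation). The following is a minimal sketch, not the odt2wiki implementation; the function `wiki_list`, the item representation, and the choice of which level is numbered in the mixed example are assumptions made for illustration.

```python
# One MediaWiki marker per kind of list level.
MARKERS = {"bullet": "*", "number": "#", "indent": ":"}


def wiki_list(items, level_kinds):
    """Render (depth, text) items as MediaWiki list lines.

    level_kinds names the list kind used on each level, e.g.
    ["number", "bullet", "indent"]; depth is 1-based.
    """
    lines = []
    for depth, text in items:
        # The prefix repeats the markers of all enclosing levels.
        prefix = "".join(MARKERS[kind] for kind in level_kinds[:depth])
        lines.append(f"{prefix} {text}")
    return "\n".join(lines)


items = [
    (1, "The third item"),
    (2, "A sub-item."),
    (3, "A sub-sub-item without bullet or numbering"),
    (2, "Level two continued"),
    (1, "Level one continued"),
]
print(wiki_list(items, ["number", "bullet", "indent"]))
```

Keeping one list kind per level mirrors the advice to assign a consistent list style to the whole list: the marker prefix of every item can then be derived from its depth alone.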
Paragraphs
Alignment
Explicit text alignment should not be used in regular Wikipedia articles. Nevertheless, text alignment is supported by the odt2wiki transformation. An example follows:
Regular text aligned to the left.
Pre-formatted text
A paragraph style with a fixed-width font face is interpreted as pre-formatted text by the transformation. As opposed to the typewriter character style, pre-formatted text is rendered with a border by the wiki engine, as in the example below:
 Some code example.
 A paragraph with explicit line breaks
 to get better structuring
 # Some comment
   Some paragraphs with indentation
   by preceding spaces.
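In MediaWiki markup, a line that starts with a space is rendered as pre-formatted text. The sketch below shows one way to produce such output from a block of text; the function name `wiki_preformatted` is hypothetical, and whether odt2wiki uses leading spaces or another mechanism is not stated here.

```python
def wiki_preformatted(text: str) -> str:
    """Prefix every line with one space so MediaWiki renders it pre-formatted.

    Existing indentation is kept, so nested indentation survives.
    """
    return "\n".join(" " + line for line in text.splitlines())


print(wiki_preformatted("# Some comment\n  indented line"))
```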
Character styles
Character styles modify the appearance of only parts of a paragraph. The character styles supported by the transformation are discussed next.
Bold
Some text within a paragraph may be set in bold style.
Italics
Like bold, italics is also supported. In particular, the combination of both styles works as well.
Although it is of little use in real text, the transformation also supports text that joins these styles without intermediate space, which noticeably increases its complexity:
- bolditalics,
- italicsbold,
- bolditalicsbold,
- italicsbolditalics,
- italicsboldanditalicsitalics,
- boldboldanditalicsbold,
- boldanditalicsboldboldanditalics, and
- boldanditalicsitalicsboldanditalics.
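One way to handle directly adjacent styles is to emit MediaWiki quote markers (`'''` for bold, `''` for italics) only where the style changes between consecutive text runs. The following is a minimal sketch under that assumption; it is not the approach odt2wiki is known to take, `wiki_runs` is a hypothetical name, and how the wiki engine resolves the resulting apostrophe runs should be verified on the target wiki.

```python
def wiki_runs(runs):
    """Render (text, bold, italic) runs, toggling markers at style changes only."""
    out = []
    bold = italic = False
    # A trailing empty, unstyled run closes whatever is still open.
    for text, want_bold, want_italic in list(runs) + [("", False, False)]:
        marker = ""
        if want_bold != bold:
            marker += "'''"  # toggle bold
        if want_italic != italic:
            marker += "''"  # toggle italics
        out.append(marker + text)
        bold, italic = want_bold, want_italic
    return "".join(out)


print(wiki_runs([("bold", True, False), ("italics", False, True)]))
print(wiki_runs([("italics", False, True),
                 ("boldanditalics", True, True),
                 ("italics", False, True)]))
```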
Named, Linked and Nested Styles
OpenDocument styles can be linked together and multiple styled elements may be nested. The resulting formatting is the union of all these styles, where the innermost definition of a property wins.
- Style A declaring bold.
- Style B based on A declaring typewriter.
- Style C based on B declaring bold and italics.
- Style D based on C declaring normal font face.
- Style E based on D, again declaring bold.
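The chain of styles A to E can be resolved by merging each style's own properties over those inherited from its parent, so that the most specific definition of a property wins. The sketch below is illustrative only: the dictionary layout and the function `resolve` are hypothetical, and reading "normal font face" as "not typewriter" is an assumption.

```python
# Each style names its parent and the properties it declares itself.
STYLES = {
    "A": {"parent": None, "props": {"bold": True}},
    "B": {"parent": "A", "props": {"typewriter": True}},
    "C": {"parent": "B", "props": {"bold": True, "italic": True}},
    # Assumption: "normal font face" switches the typewriter face off.
    "D": {"parent": "C", "props": {"typewriter": False}},
    "E": {"parent": "D", "props": {"bold": True}},
}


def resolve(name, styles=STYLES):
    """Return the effective properties of a style, parents first."""
    style = styles[name]
    inherited = resolve(style["parent"], styles) if style["parent"] else {}
    # Own declarations override inherited ones.
    return {**inherited, **style["props"]}


for name in STYLES:
    print(name, resolve(name))
```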
Subscript
Subscript text is especially useful for setting indices in simple formulas: $x_1, x_2, \ldots, x_n$.
Superscript
Even if a math environment is preferred, the combination of italics style with subscript and superscript is very common for simple formulas in Wikipedia articles: $f_n(x) = x^n - 3x^2 + 2$.
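Outside a math environment, such formulas can be written in wiki text with the HTML-style `<sub>` and `<sup>` tags that MediaWiki accepts, combined with italics markers. The sketch below assumes that n is a subscript of f and that the exponents are n and 2; the helper names are hypothetical.

```python
def italic(text: str) -> str:
    return f"''{text}''"


def sub(text: str) -> str:
    return f"<sub>{text}</sub>"


def sup(text: str) -> str:
    return f"<sup>{text}</sup>"


# Assumed reading of the formula: f_n(x) = x^n - 3x^2 + 2
formula = (
    italic("f") + sub(italic("n")) + "(" + italic("x") + ") = "
    + italic("x") + sup(italic("n")) + " - 3" + italic("x") + sup("2") + " + 2"
)
print(formula)
```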
Typewriter
When describing programs or algorithms, a common style is to set references to program elements such as variables or classes in typewriter font. The transformation translates all font faces with fixed width into the wiki typewriter style.
Footnotes
Note: The transformation uses the newer footnote style with <ref> and <references> tags, which requires the Cite.php extension to be installed in MediaWiki. If those tags occur as plain text in the transformation result, please install this extension.
Articles may be enriched with footnotes. Footnotes are especially useful for citing the origin of some information. Referencing the same footnote twice is also supported by the transformation.
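With the Cite extension, a footnote is written as a `<ref>` element, a repeated reference reuses the footnote by name, and `<references />` marks where the footnote list is rendered. The following sketch shows this pattern; the class `Footnotes`, the footnote name `origin`, and the footnote text are hypothetical and not taken from odt2wiki.

```python
class Footnotes:
    """Emit <ref> markup, reusing a footnote when its name was seen before."""

    def __init__(self):
        self._seen = set()

    def ref(self, name: str, text: str) -> str:
        if name in self._seen:
            # Second and later uses point back to the named footnote.
            return f'<ref name="{name}" />'
        self._seen.add(name)
        return f'<ref name="{name}">{text}</ref>'

    @staticmethod
    def listing() -> str:
        """Placeholder where the wiki engine renders all footnotes."""
        return "<references />"


notes = Footnotes()
print("First use." + notes.ref("origin", "Some source"))
print("Second use." + notes.ref("origin", "Some source"))
print(notes.listing())
```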
Images
Images in general cannot be exported by a transformation producing a single file of wiki text. However, if the image is already uploaded to the target wiki domain (e.g. Wikimedia Commons), then the transformation produces a valid image tag that includes the image. Image descriptions are also supported.
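An image tag for an already uploaded file can be sketched as follows, using the MediaWiki `[[File:...]]` syntax with an optional description after a pipe. The function `wiki_image` and the file name `Example.png` are hypothetical; the exact tag that odt2wiki emits is not shown in this document.

```python
def wiki_image(file_name: str, description: str = "") -> str:
    """Build an image tag for a file that already exists on the target wiki."""
    target = f"File:{file_name}"
    return f"[[{target}|{description}]]" if description else f"[[{target}]]"


print(wiki_image("Example.png", "An example image"))
```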
Tables
Tables are a natural way of presenting multiple pieces of equally structured information.
Table Headers
Simple tables are supported well. Table headers are translated into corresponding wiki-style table headers. However, custom formatting of table borders, column sizes and background colors is ignored.
Information 1.1 | Information 1.2 | Information 1.3 |
Information 2.1 | Information 2.2 | Information 2.3 |
Joined Cells
OpenDocument, and OpenOffice in particular, represents tables with joined cells that span rows as tables containing nested tables. In contrast, the wiki table model declares column and row spans for such joined cells.
If only columns of the same row are joined, the result of the transformation resembles the source document very well:
Cell 1.1 | Cell 1.2 | Cell 1.3 |
Cell 2.1 | Joined cells 2.2 and 2.3 |
However, the transformation does not support tables with joined cells that span multiple rows. If the source document uses such tables, nested tables appear in the resulting document.
|
||||
Cell 2.1 | Cell 2.2 | Cell 2.3 |
Borders
Irrespective of custom table styles for border and background, a table is always exported as “prettytable”, which renders in the wiki engine with simple borders and bold header.
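The supported table features (header cells, cells joined across columns, and the fixed “prettytable” appearance) can be sketched in MediaWiki table markup as follows. This is an illustration under assumptions: the function `wiki_table` and the header titles are hypothetical, and applying “prettytable” as a `class` attribute is an assumption about how the export marks the table.

```python
def wiki_table(header, rows):
    """Render a table; a cell is either text or a (text, colspan) tuple."""
    lines = ['{| class="prettytable"']
    if header:
        lines.append("|-")
        lines.extend(f"! {title}" for title in header)
    for row in rows:
        lines.append("|-")
        for cell in row:
            text, colspan = cell if isinstance(cell, tuple) else (cell, 1)
            # Joined cells declare their column span as a cell attribute.
            attrs = f'colspan="{colspan}" | ' if colspan > 1 else ""
            lines.append(f"| {attrs}{text}")
    lines.append("|}")
    return "\n".join(lines)


print(wiki_table(
    ["Header 1", "Header 2", "Header 3"],
    [
        ["Cell 1.1", "Cell 1.2", "Cell 1.3"],
        ["Cell 2.1", ("Joined cells 2.2 and 2.3", 2)],
    ],
))
```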
Charset and special characters
The charset of the transformation result is fixed to UTF-8. Depending on your system, this might not be the default charset, which may cause special characters to look broken when viewed with default settings. You can fix this by switching your editor to UTF-8 encoding. If your editor does not support switching the encoding, you can display the result of the transformation in the Firefox browser and switch the encoding to UTF-8 there. You can then copy and paste the transformation result into your program of choice.
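The effect of reading UTF-8 output with a different default charset can be reproduced in a few lines. The sample text and the choice of Latin-1 as the "wrong" system default are assumptions made for illustration.

```python
# Sample text with characters outside ASCII (illustrative only).
text = "Größe: 3 µm"

# The transformation result is stored as UTF-8 bytes.
data = text.encode("utf-8")

# Decoding with another charset (here Latin-1) yields broken characters.
broken = data.decode("latin-1")

# Decoding explicitly as UTF-8 restores the original text.
fixed = data.decode("utf-8")

print(broken)
print(fixed)
```

When reading the result from a file in Python, passing `encoding="utf-8"` to `open()` avoids depending on the system default.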
Direct Wiki Input
WikiMath
WikiLink
Unsupported Features
Hyperlinks
Document-internal Links
Character Styles
Smallcaps
Underline
Strikethrough
Horizontal Rules