OpenOffice filters using the XML based file format

From Apache OpenOffice Wiki
Revision as of 14:46, 31 December 2007 by SergeMoutou (Talk | contribs)

Abstract: This document explains the implementation of OpenOffice.org import and export filter components, focusing on filter components based on the OpenOffice.org XML file format. It is intended as a brief introduction for developers who want to implement OpenOffice.org filters for foreign file formats.

Preliminaries

There are several ways to get information into or out of OpenOffice.org. You can

  1. link against the application core,
  2. use the OpenOffice.org API,
  3. use the XML file format.

Each of these ways has its own advantages and disadvantages, which I will briefly summarize:

Using the core data structures and linking against the application core is the traditional way to implement filters in OpenOffice.org. The advantages of this method are efficiency and direct access to the document. However, the core provides a very implementation-centric view of the applications. Additionally, there are a number of technical disadvantages: Every change in the core data structures or objects has to be followed up by corresponding changes in the code that uses them. Hence filters need to be recompiled to match the binary layout of the application core objects. While these things are manageable (albeit cumbersome) for closed source applications, this method is expected to create a maintenance nightmare if application and filter are developed separately, as is customary for open source applications. Simultaneous delivery of a new application build and the corresponding filters developed by outside parties looks challenging.

Using the OpenOffice.org API (based on UNO) is a much better way, since it solves the technical problems described in the previous paragraph. The UNO component technology insulates the filter from the binary layout (and from other compiler- and version-dependent issues). Additionally, the API is expected to be more stable than the core interfaces, and it even provides a shallow level of abstraction from the core applications. In fact, the native XML filter implementations largely follow this strategy and are based on the OpenOffice.org API.

The third (and possibly surprising) choice is to import and export documents using the XML-based file format. UNO-based XML import and export components feature all of the advantages of the previous method, and additionally provide the filter implementer with a clean, structured, and fully documented view of the document. Since a significant difficulty in converting between formats is the conceptual mapping from one format to the other, a clean, well-structured view of the document may turn out to be beneficial.
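To make the third approach more concrete, here is a minimal sketch of what an XML-based import filter does: it reads a foreign format (here simply lines of plain text) and describes the document as a stream of SAX events. This is only an illustration under stated assumptions, not the actual filter interface. The function names import_plain_text and to_xml_string are hypothetical, the namespace URIs and element names are assumed to follow the OpenOffice.org 1.x XML format, and Python's standard XMLGenerator stands in for the document handler that the office would supply to a real UNO component.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl

# Assumption: namespace URIs as used by the OpenOffice.org 1.x XML format.
OFFICE_NS = "http://openoffice.org/2000/office"
TEXT_NS = "http://openoffice.org/2000/text"


def import_plain_text(lines, handler):
    """Hypothetical import filter core: turn plain text lines into SAX events.

    `handler` is any SAX content handler. In a real filter it would be
    provided by the office; here it is whatever the caller passes in.
    """
    handler.startDocument()
    root_attrs = AttributesImpl({
        "xmlns:office": OFFICE_NS,
        "xmlns:text": TEXT_NS,
    })
    handler.startElement("office:document-content", root_attrs)
    handler.startElement("office:body", AttributesImpl({}))
    for line in lines:
        # Each input line becomes one text paragraph.
        handler.startElement("text:p", AttributesImpl({}))
        handler.characters(line)
        handler.endElement("text:p")
    handler.endElement("office:body")
    handler.endElement("office:document-content")
    handler.endDocument()


def to_xml_string(lines):
    """Run the sketch against a handler that serializes the events to text."""
    buffer = io.StringIO()
    import_plain_text(lines, XMLGenerator(buffer, encoding="utf-8"))
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_xml_string(["Hello & welcome", "Second paragraph"]))
```

Note that the filter logic never touches the application core or its binary layout. It only needs to know the structure of the XML format, which is the point of this approach. An export filter would work in the opposite direction, receiving such events and writing the foreign format.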
