Integrating Import and Export Filters

Apache OpenOffice provides several implementations for objects that can be displayed in a task window. In the context of the Component Framework they are called Office Components. These components can be created from content, e.g. content stored in a file on disk. Most of the time this is done by creating a document and loading the content into it using a filter. This section explains the implementation of Apache OpenOffice import and export filter components. Apache OpenOffice also allows loading content into a component directly by using a frame loader, but this section describes that only briefly.

Introduction

Inside Apache OpenOffice, a document is represented by its document service, called the model. For a list of available document services, refer to the section Document Specific Features. On disk, the same document is represented as a file or possibly as dynamically generated output, for example, of a database statement. To generalize this and abstract from single disk files, we simply call it "content". The content is a serialization of a model, e.g. the ODF or the Word model. A filter component converts between this external model and the internal model defined by the document core, as shown in the following diagram.

Import/Export Filter Process

In our API the three entities in the above diagram, content, model, and filter, are defined as UNO services. The services consist of several interfaces that map to a specific implementation, for example, using C++ or Java.

The filter implementer has to develop a class that implements the com.sun.star.document.ExportFilter or com.sun.star.document.ImportFilter service, or both in case the filter should support import and export. The filter will get a com.sun.star.document.MediaDescriptor that defines the stream the filter must use for its input or output.
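The contract described above can be pictured with a minimal sketch. It uses plain Python stand-ins rather than the real UNO bindings: ToyDocument and PlainTextFilter are hypothetical names, and the media descriptor is modelled as an ordinary dictionary. The method names setTargetDocument, setSourceDocument and filter, and the descriptor keys InputStream and OutputStream, follow the names used by the UNO filter interfaces and the MediaDescriptor service; everything else is an assumption made for illustration.

```python
import io


class ToyDocument:
    """Hypothetical stand-in for a document model: a list of paragraphs."""

    def __init__(self):
        self.paragraphs = []


class PlainTextFilter:
    """Toy filter that supports both import and export.

    It only mirrors the shape of the UNO filter contract; it is not
    a UNO component.
    """

    def __init__(self):
        self._document = None
        self._mode = None

    def setTargetDocument(self, document):
        # Import: the filter fills this document.
        self._document = document
        self._mode = "import"

    def setSourceDocument(self, document):
        # Export: the filter reads from this document.
        self._document = document
        self._mode = "export"

    def filter(self, descriptor):
        # The descriptor tells the filter which stream to use.
        if self._mode == "import":
            stream = descriptor.get("InputStream")
            if stream is None:
                return False
            text = stream.read().decode("utf-8")
            self._document.paragraphs = text.splitlines()
            return True
        if self._mode == "export":
            stream = descriptor.get("OutputStream")
            if stream is None:
                return False
            data = "\n".join(self._document.paragraphs)
            stream.write(data.encode("utf-8"))
            return True
        return False


# Import content into a document, then export it again.
doc = ToyDocument()
importer = PlainTextFilter()
importer.setTargetDocument(doc)
imported = importer.filter({"InputStream": io.BytesIO(b"first\nsecond")})

out = io.BytesIO()
exporter = PlainTextFilter()
exporter.setSourceDocument(doc)
exported = exporter.filter({"OutputStream": out})
```

A real filter would receive the descriptor as a sequence of property values and would work on an actual document service, but the division of responsibilities is the same: the caller supplies the document and the stream, and the filter performs the conversion.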

Approaches

To implement this filter class, a developer can:

  • link against the application core
  • use the document API
  • use XML-based techniques (SAX or XSLT)

Each approach has its own advantages and disadvantages, which are summarized briefly below.

Using the core data structures and linking against the application core is how all older filters (originating from the "pre-UNO" era) are implemented in Apache OpenOffice. Because the disadvantages are severe (such filters become a maintenance burden whenever core data structures or interfaces change), this approach is generally not recommended for new filter development.

Using the Apache OpenOffice API based on UNO is more advantageous, since it solves the technical problems indicated in the above paragraph. The idea is to read data from a file on loading and build up a document using the Apache OpenOffice API, and to iterate over a document model and write the corresponding data to a file on storing. The UNO component technology insulates the filter from binary layout and from other compiler- and version-dependent issues. Additionally, the API is expected to be more stable than the core interfaces, and it provides an abstraction from the core applications. The developer creating an API-based filter directly provides a filter class implementing the service com.sun.star.document.ImportFilter and/or com.sun.star.document.ExportFilter.

The third approach is to import and export documents using the XML-based file format. UNO-based XML import and export components have all the advantages of the previous method. In addition, the filter logic builds upon the ODF model, which, unlike the document API, is not bound to Apache OpenOffice, so the filter logic (though not the filter as a whole) can in theory be reused in other applications. A disadvantage is that conversions based on the ODF format can be somewhat more complicated, and can give worse results than conversions based on the document API if they require access to layout information in the source or the target format.

The developer creating an XML-based filter does not directly provide a filter class but uses a generic filter class provided by Apache OpenOffice. This filter class is the XMLFilterAdaptor of the document it works on. The filter adaptor service expects an XML-based importer or exporter UNO service to be provided by the developer. Apache OpenOffice also provides generic importer and exporter services that allow plugging in an XSLT stylesheet to carry out a transformation, which the importer or exporter then feeds into the XMLFilterAdaptor.
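The idea of an importer that emits XML events instead of building the document itself can be sketched with Python's standard xml.sax tools. This is only an analogy under stated assumptions: import_plain_text is a hypothetical developer-supplied importer, XMLGenerator from the Python standard library stands in for the handler that a generic adaptor would supply, and the element names office:text and text:p are used without the namespace declarations and surrounding structure that a complete ODF document requires.

```python
import io
from xml.sax.saxutils import XMLGenerator


def import_plain_text(lines, handler):
    """Hypothetical importer: turn foreign content into XML events.

    The importer does not touch a document directly. It only emits
    SAX-style events to the handler it was given.
    """
    handler.startDocument()
    handler.startElement("office:text", {})
    for line in lines:
        handler.startElement("text:p", {})
        handler.characters(line)  # special characters are escaped
        handler.endElement("text:p")
    handler.endElement("office:text")
    handler.endDocument()


# Stand-in for the adaptor side: collect the event stream as XML text.
out = io.StringIO()
import_plain_text(["Hello", "a < b"], XMLGenerator(out, encoding="utf-8"))
xml_text = out.getvalue()
```

Because the importer only talks to a handler, the same conversion logic could feed any consumer of the event stream, which is the reuse argument made for the XML-based approach.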

In addition to the filter itself, the developer must provide some information about it so that Apache OpenOffice can integrate it into the application framework. This information is provided as a configuration file. To better understand what needs to be in this file, let's look at how Apache OpenOffice deals with filters.

Checklist for filter developers

Integrating a filter into Apache OpenOffice requires the steps below, each of which is explained in the following sections:

  1. Implement a filter (required).
  2. Implement a com.sun.star.document.ExtendedTypeDetection service to support detection by content (optional).
  3. Implement a filter options dialog if the implemented filter requires additional parameters (optional).
  4. Register the component libraries as UNO services (required). If the filter is deployed as an extension, the registration is part of the extension. If the filter will become a part of the Apache OpenOffice installation, the registration must be done as described in the chapter Deployment Options for Components.
  5. Add configuration information to the org.openoffice.TypeDetection node of the configuration (required). If the filter is deployed as an extension, the extension will contain a configuration file. Apache OpenOffice will access it as part of the extensions layer of the Configuration Manager. If the filter will become a part of the Apache OpenOffice installation, the configuration must be integrated into the build process.
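Detection by content, the optional second step above, usually means peeking at the beginning of the input and reporting a type name only if the content is recognized. The sketch below illustrates this in plain Python under stated assumptions: the signature %TOY and the type name toy_format are invented for the example, the descriptor is modelled as a dictionary, and the function only mimics the idea of a detect method that returns an empty string when the format is not recognized.

```python
import io

MAGIC = b"%TOY"           # hypothetical file signature
TYPE_NAME = "toy_format"  # hypothetical internal type name


def detect(descriptor):
    """Return the type name if the content is recognized, else ""."""
    stream = descriptor.get("InputStream")
    if stream is None:
        return ""
    start = stream.tell()
    header = stream.read(len(MAGIC))
    stream.seek(start)  # leave the stream as we found it for the filter
    return TYPE_NAME if header == MAGIC else ""


known = io.BytesIO(b"%TOY rest of the file")
result_known = detect({"InputStream": known})
result_unknown = detect({"InputStream": io.BytesIO(b"something else")})
```

Restoring the stream position matters because the same stream is typically handed on to the filter once the type has been determined.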

It is recommended to read the following chapters before carrying out any of these steps.

Content on this page is licensed under the Public Documentation License (PDL).