Integrating Import and Export Filters

From Apache OpenOffice Wiki
Revision as of 09:25, 30 September 2008 by Mba (Talk | contribs)


This section explains how OpenOffice.org import and export filter components are implemented. It is intended as a brief introduction for developers who want to implement OpenOffice.org filters for foreign file formats.

Introduction

Inside OpenOffice.org, a document is represented by its document service, called the model. On disk, the same document is represented as a file, or possibly as dynamically generated output, for example the result of a database statement. To generalize this and abstract from single disk files, we simply call it "content". The content is a serialization of a model, for example the ODF or the Word model. A filter component converts between this serialized model and the internal model defined by the document core.

Import/Export Filter Process

In our API the three entities in the above diagram, content, model, and filter, are defined as UNO services. The services consist of several interfaces that map to a specific implementation, for example, using C++ or Java.

The filter implementer has to develop a class that implements the com.sun.star.document.ExportFilter service, the com.sun.star.document.ImportFilter service, or both if the filter should support import and export. The filter receives a com.sun.star.document.MediaDescriptor that defines the stream the filter must use for its input or output.
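The shape of such a class can be sketched in plain Python without any UNO dependency. This is only an illustration under stated assumptions: the method names are modeled loosely on the XImporter and XFilter interfaces, the descriptor is a plain dict standing in for a MediaDescriptor, the stream is a Python file-like object rather than a UNO input stream, and the "document" is a hypothetical list of paragraph strings rather than a real document model.

```python
import io


class PlainTextImportFilter:
    """Illustrative stand-in for a class offering the ImportFilter service."""

    def __init__(self):
        self._target = None       # hypothetical model: a list of paragraphs
        self._cancelled = False

    def setTargetDocument(self, document):
        # Modeled on XImporter.setTargetDocument: remember where to import to.
        self._target = document

    def filter(self, descriptor):
        # Modeled on XFilter.filter: the descriptor names the input stream.
        stream = descriptor.get("InputStream")
        if self._target is None or stream is None:
            return False
        for line in stream.read().decode("utf-8").splitlines():
            if self._cancelled:
                return False
            self._target.append(line)
        return True

    def cancel(self):
        # Modeled on XFilter.cancel: ask a running import to stop.
        self._cancelled = True


document = []  # hypothetical stand-in for a document model
importer = PlainTextImportFilter()
importer.setTargetDocument(document)
ok = importer.filter({
    "InputStream": io.BytesIO(b"first\nsecond\n"),
    "FilterName": "Hypothetical Text Filter",  # assumed name, for illustration
})
```

A real filter would be registered as a UNO component and would receive the descriptor as a sequence of property values; the sketch only shows the division of labor between receiving the target document and processing the stream.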

For a list of available document services, refer to the section Document Specific Features.

Approaches

There are several ways to get information into or out of OpenOffice.org. You can

  • link against the application core
  • use the document API
  • use the XML-based techniques (SAX or XSLT)

Each method has its own advantages and disadvantages, which are summarized briefly below.

Using the core data structures and linking against the application core is how all older filters (originating from the "pre-UNO" era) are implemented in OpenOffice.org. Because the disadvantages are serious (a maintenance nightmare whenever core data structures or interfaces change), this approach is generally not recommended for new filter development.

Using the UNO-based OpenOffice.org API is more advantageous, since it solves the technical problems described in the previous paragraph. The idea is to read data from a file on loading and build up a document using the OpenOffice.org API, and to iterate over a document model and write the corresponding data to a file on storing. The UNO component technology insulates the filter from binary layout and from other compiler- and version-dependent issues. Additionally, the API is expected to be more stable than the core interfaces, and it provides an abstraction from the core applications.
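The storing direction, iterating over a model and writing the corresponding data, might look roughly like the following. Everything here is a hypothetical stand-in: the model is a list of dicts rather than a document accessed through the OpenOffice.org API, the `store` function and the "Heading"/"Body" style names are invented for illustration, and the foreign format (headings prefixed with `# `) is made up.

```python
import io

# Hypothetical stand-in for a document model: paragraphs with a style name.
model = [
    {"style": "Heading", "text": "Title"},
    {"style": "Body", "text": "Some text."},
]


def store(model, output_stream):
    """Walk the model and write one line per paragraph to the stream."""
    for paragraph in model:
        # Map the model's notion of a heading to the foreign format's marker.
        prefix = "# " if paragraph["style"] == "Heading" else ""
        line = prefix + paragraph["text"] + "\n"
        output_stream.write(line.encode("utf-8"))


out = io.BytesIO()
store(model, out)
result = out.getvalue().decode("utf-8")
```

The filter code only touches the model through its public structure, which is the property that makes the API-based approach less fragile than linking against the core.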

The third approach is to import and export documents using the XML-based techniques (see XML Based Filter Development). UNO-based XML import and export components offer all of the advantages of the previous method, and additionally give the filter implementer a clean, structured, and fully documented view of the document. A significant difficulty in converting between formats is the conceptual mapping from one format to the other. Since OpenOffice.org 1.1.0, there are XML filter components that carry out the mapping at runtime, so that filter implementers can read from XML streams when exporting and write to XML streams when importing.
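Reading from an XML stream when exporting can be illustrated with Python's standard `xml.sax` module. This is a minimal sketch, not the OpenOffice.org filter API: the `ParagraphExporter` class and `export_plain_text` function are hypothetical names, the input is a tiny hand-written fragment that uses the ODF `text:p` element, and the target "foreign format" is simply plain text with one line per paragraph.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


class ParagraphExporter(ContentHandler):
    """Collects the text of every text:p element seen in the XML stream."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._buffer = None  # None while outside a text:p element

    def startElementNS(self, name, qname, attrs):
        if name == (TEXT_NS, "p"):
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElementNS(self, name, qname):
        if name == (TEXT_NS, "p"):
            self.lines.append("".join(self._buffer))
            self._buffer = None


def export_plain_text(xml_bytes):
    """Parse the XML stream and return one output line per paragraph."""
    handler = ParagraphExporter()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)  # report (uri, localname) names
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "\n".join(handler.lines)


SAMPLE = (
    b'<office:text'
    b' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
    b' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">'
    b'<text:p>Hello</text:p><text:p>World</text:p>'
    b'</office:text>'
)
```

Calling `export_plain_text(SAMPLE)` yields the two paragraphs as separate lines. The handler never sees the application's internal data structures, only the documented XML, which is the main appeal of this approach.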

Content on this page is licensed under the Public Documentation License (PDL).