Difference between revisions of "Documentation/DevGuide/OfficeDev/Integrating Import and Export Filters"

Revision as of 19:24, 21 January 2010



OpenOffice.org provides several implementations for objects that can be displayed in a task window. In the context of the Component Framework they are called Office Components. These components can be created from content, for example content stored in a file on disk. Most of the time this is done by creating a document and loading the content into it using a filter. This section explains the implementation of OpenOffice.org import and export filter components. OpenOffice.org also allows content to be loaded directly into a component by a frame loader, but this section covers that only briefly.

Introduction

Inside OpenOffice.org a document is represented by its document service, called the model. For a list of available document services, refer to the section Document Specific Features. On disk, the same document is represented as a file or possibly as dynamically generated output, for example, the result of a database statement. To generalize this and abstract from single disk files, we simply call it "content". The content is a serialization of a model, e.g. the ODF or the Word model. A filter component converts between this model and the internal model defined by the document core model, as shown in the following diagram.

Figure: Import/Export Filter Process

In our API the three entities in the above diagram, content, model, and filter, are defined as UNO services. The services consist of several interfaces that map to a specific implementation, for example, using C++ or Java.

The filter implementer has to develop a class that implements the com.sun.star.document.ExportFilter or com.sun.star.document.ImportFilter service, or both if the filter should support import as well as export. The filter receives a com.sun.star.document.MediaDescriptor that defines the stream the filter must use for its input or output.
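The relationship between a filter and its media descriptor can be pictured with a minimal sketch. It is written in plain Python and does not call the UNO API: the class `LineImportFilter`, the list used as a "model", and the filter name are hypothetical stand-ins. The property names `URL`, `FilterName` and `InputStream` follow the MediaDescriptor service, and the two methods only mirror the roles of `XImporter.setTargetDocument` and `XFilter.filter`.

```python
import io


class LineImportFilter:
    """Hypothetical stand-in for an ImportFilter implementation (not UNO code)."""

    def __init__(self):
        self.target = None

    def set_target_document(self, model):
        # Plays the role of XImporter.setTargetDocument: remember the model to fill.
        self.target = model

    def filter(self, descriptor):
        # Plays the role of XFilter.filter: take the stream named in the
        # descriptor, read it, and report success or failure as a boolean.
        stream = descriptor.get("InputStream")
        if stream is None or self.target is None:
            return False
        self.target.extend(stream.read().decode("utf-8").splitlines())
        return True


# The descriptor is modelled as a dict of name/value pairs.
descriptor = {
    "URL": "private:stream",
    "FilterName": "Example Text Filter",  # hypothetical filter name
    "InputStream": io.BytesIO(b"first line\nsecond line\n"),
}

model = []  # toy "document model": a list of paragraphs
flt = LineImportFilter()
flt.set_target_document(model)
ok = flt.filter(descriptor)
```

The point of the sketch is that the filter never opens a file by itself: it uses whatever stream the descriptor hands over and writes the result into the model it was given.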

Approaches

To implement said filter class, a developer can

  • link against the application core
  • use the document API
  • use the XML-based techniques (SAX or XSLT)

Each method has its own advantages and disadvantages, which are summarized briefly here:

Using the core data structures and linking against the application core is how all older filters (originating from the "pre-UNO" era) are implemented in OpenOffice.org. Because the disadvantages are considerable (a maintenance nightmare whenever core data structures or interfaces change), this approach is generally not recommended for new filter development.

Using the OpenOffice.org API based on UNO is more advantageous, since it solves the technical problems indicated in the above paragraph. The idea is to read data from a file on loading and build up a document using the OpenOffice.org API, and to iterate over a document model and write the corresponding data to a file on storing. The UNO component technology insulates the filter from binary layout and from other compiler- and version-dependent issues. Additionally, the API is expected to be more stable than the core interfaces, and it provides an abstraction from the core applications. The developer creating an API-based filter directly provides a filter class implementing the service com.sun.star.document.ImportFilter and/or com.sun.star.document.ExportFilter.
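The load and store directions described for the API-based approach can be sketched as follows. This is an illustration in plain Python under stated assumptions, not OpenOffice.org code: `ToyTextModel` is a hypothetical stand-in for a document service, and the foreign format (one paragraph per line, prefixed with `P:`) is invented for the example.

```python
import io


class ToyTextModel:
    """Hypothetical document model; stands in for a real document service."""

    def __init__(self):
        self._paragraphs = []

    def append_paragraph(self, text):
        # The only way to change the model is through its public "API".
        self._paragraphs.append(text)

    def paragraphs(self):
        return tuple(self._paragraphs)


def import_foreign(stream, model):
    # Loading: read the foreign data and build the document through the API.
    for raw in stream.read().decode("utf-8").splitlines():
        if raw.startswith("P:"):
            model.append_paragraph(raw[2:])


def export_foreign(model, stream):
    # Storing: iterate over the model and write the corresponding data.
    for text in model.paragraphs():
        stream.write(("P:" + text + "\n").encode("utf-8"))


source = io.BytesIO(b"P:Hello\nP:World\n")
doc = ToyTextModel()
import_foreign(source, doc)

out = io.BytesIO()
export_foreign(doc, out)
```

Neither function touches the internals of the model, which is the property that keeps an API-based filter independent of changes in the core data structures.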

The third approach is to import and export documents using the XML-based file format. UNO-based XML import and export components have all the advantages of the previous method, with one addition: the filter logic builds upon the ODF model, which, unlike the document API, is not bound to OpenOffice.org and so can theoretically be reused in other applications (the filter logic, not the filter as a whole). A disadvantage may be that conversions based on the ODF format can become somewhat more complicated, and can give worse results than conversions based on a document API if they require access to layout information in the source or the target format.

The developer creating an XML-based filter does not directly provide a filter class but uses a generic filter class provided by OpenOffice.org. This filter class is the XMLFilterAdaptor of the document it works on. The filter adaptor service expects the developer to provide an XML-based importer or exporter UNO service. OpenOffice.org provides generic importer and exporter services into which an XSLT stylesheet can be plugged to carry out a transformation that the importer or exporter feeds into the XMLFilterAdaptor.
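To illustrate the SAX flavour of this approach, the following sketch shows an exporter that receives the document as a stream of XML events and turns it into a foreign representation. It uses Python's standard `xml.sax` module rather than the UNO SAX interfaces, and the element names `doc` and `p` are simplified placeholders, not real ODF element names.

```python
import xml.sax


class PlainTextExporter(xml.sax.ContentHandler):
    """Hypothetical exporter: collects the text of every <p> element as one line."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._buffer = None  # None while outside a paragraph

    def startElement(self, name, attrs):
        if name == "p":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so it is collected until </p>.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "p" and self._buffer is not None:
            self.lines.append("".join(self._buffer))
            self._buffer = None


# Stand-in for the XML stream that a filter adaptor would deliver on export.
xml_stream = b"<doc><p>Hello</p><p>World</p></doc>"
exporter = PlainTextExporter()
xml.sax.parseString(xml_stream, exporter)
```

The exporter sees only XML events and never the document model itself, which is why the same filter logic could in principle be reused by another application that produces the same XML.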

In addition to the filter itself, the developer must provide some information about it so that OpenOffice.org can integrate it into the application framework. This information is provided as a configuration file. To better understand what needs to be in this file, let us look at how OpenOffice.org deals with filters.

Checklist for filter developers

Integrating a filter into OpenOffice.org requires the following steps, which are explained in the sections that follow:

  1. Implement a filter (required).
  2. Implement a com.sun.star.document.ExtendedTypeDetection service to support detection by content (optional).
  3. Implement a filter options dialog if the implemented filter requires additional parameters (optional).
  4. Register the component libraries as UNO services (required). If the filter is deployed as an extension, the registration is handled as part of the extension. If the filter will become a part of the OpenOffice.org installation, the registration must be done as described in the chapter Deployment Options for Components.
  5. Add configuration information to the org.openoffice.TypeDetection node of the configuration (required). If the filter is deployed as an extension, the extension will contain a configuration file. OpenOffice.org will access it as part of the extensions layer of the Configuration Manager. If the filter will become a part of the OpenOffice.org installation, the configuration must be integrated into the build process.

It is recommended to read the following chapters before carrying out any of these steps.
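Step 2 of the checklist, detection by content, can be pictured with a small sketch. It is plain Python, not an ExtendedTypeDetection implementation: the magic bytes `EXMPL1` and the type name `example_Text_Format` are hypothetical. The sketch assumes the usual contract of such a detector, which is to return the internal type name when the content is recognized and an empty string otherwise.

```python
import io

MAGIC = b"EXMPL1"                  # hypothetical file signature
TYPE_NAME = "example_Text_Format"  # hypothetical internal type name


def detect(descriptor):
    """Return TYPE_NAME if the stream starts with MAGIC, otherwise ""."""
    stream = descriptor.get("InputStream")
    if stream is None:
        return ""
    header = stream.read(len(MAGIC))
    stream.seek(0)  # rewind so the filter can read from the beginning later
    return TYPE_NAME if header == MAGIC else ""


known = io.BytesIO(b"EXMPL1 some payload")
unknown = io.BytesIO(b"something else")

result_known = detect({"InputStream": known})
result_unknown = detect({"InputStream": unknown})
```

Rewinding the stream after peeking at the header matters in this sketch because the same stream is later handed to the filter that does the actual import.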

Content on this page is licensed under the Public Documentation License (PDL).