Difference between revisions of "ODFDOM"

From Apache OpenOffice Wiki

Revision as of 11:30, 6 October 2008

OpenDocument API - ODFDOM

ODFDOM is the name of the new free OpenDocument framework sponsored by Sun Microsystems Inc.

Its purpose is to provide an easy, common way to create, access and manipulate OpenDocument files without requiring detailed knowledge of the OpenDocument specification.

It is the successor of AODL and Odf4j, designed together with their architects to provide the ODF developer community with an easy, lightweight programming API that is meant to be portable to any object-oriented language.

The Java 5 reference implementation of ODFDOM has been available under the LGPL3 since May 2008, together with its online JavaDoc API.

Overview

The ODFDOM project's objective is to provide an API for easily reading, writing and manipulating documents of the OpenDocument format (ODF). To achieve this, the ODFDOM library implements a layered approach through which documents are accessed. This layered design is the foundation for a modular, well-structured library.

Overview of the ODFDOM layers:

  • The bottom ODF Package / Physical Layer - provides direct access to the resources that are stored in the ODF package, such as XML streams, images or embedded objects.
  • The ODF Typed DOM / XML Layer - is concerned with the representation of the standardized ODF XML streams (e.g. content.xml, styles.xml). The layer provides a class for every ODF XML element defined by the ODF specification and its grammar (the RelaxNG schema). Instead of being written by hand, the classes are generated from the ODF grammar. Generation guarantees complete and accurate coverage of the ODF specification on the one hand and an easy upgrade to future ODF specifications on the other. The API is based on the platform- and language-independent DOM API standardized by the W3C, best known from its implementation in every web browser.
  • The ODF Document / Convenient Layer - is concerned with usability aids. The layer represents components consisting of multiple underlying XML elements. These components are not specified by the ODF standard; they are based on frequent user scenarios and common sense.
  • The Customized ODF Document / Extendable Layer - is concerned with user-defined customizations. This level is not part of the delivered library, because it describes customer sources that override ODFDOM functionality, but it is still part of the overall design.
File:ODFDOM-Layered-Model.png


ODFDOM is part of the odftoolkit project. Development is discussed on the dev@odftoolkit.openoffice.org mailing list.


The ODFDOM Layers

The ODF Package / Physical Layer

At this level, a document is represented as a package of named resources. Explicit methods are provided for the standardized file streams.

File:ODF Package.jpg

Note: All file streams apart from the /Pictures directory and its content are specified by the ODF standard. The file streams are similar for all ODF documents.

The main requirements for this layer are:

  • Zip/unzip the file streams of the package
  • List all file streams in /META-INF/manifest.xml (otherwise OOo won't save the file streams)
  • Begin the package with an uncompressed 'mimetype' file stream
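
The three requirements above can be sketched with nothing but the JDK's java.util.zip classes. This is a minimal illustration, not the ODFDOM implementation: the class name MinimalOdfPackage, its write method and the reduced manifest are assumptions made for the example, and the package carries only a content.xml stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of ODFDOM: writes a bare ODF text package.
class MinimalOdfPackage {

    static final String MIMETYPE = "application/vnd.oasis.opendocument.text";

    // Reduced manifest that lists the package root and the single XML stream.
    static final String MANIFEST =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
      + "<manifest:manifest xmlns:manifest=\"urn:oasis:names:tc:opendocument:xmlns:manifest:1.0\">\n"
      + " <manifest:file-entry manifest:media-type=\"" + MIMETYPE + "\" manifest:full-path=\"/\"/>\n"
      + " <manifest:file-entry manifest:media-type=\"text/xml\" manifest:full-path=\"content.xml\"/>\n"
      + "</manifest:manifest>\n";

    static byte[] write(byte[] contentXml) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            // Requirement: 'mimetype' is the first entry and is not compressed.
            // STORED entries need size and CRC before they are written.
            byte[] mime = MIMETYPE.getBytes(StandardCharsets.US_ASCII);
            CRC32 crc = new CRC32();
            crc.update(mime);
            ZipEntry first = new ZipEntry("mimetype");
            first.setMethod(ZipEntry.STORED);
            first.setSize(mime.length);
            first.setCompressedSize(mime.length);
            first.setCrc(crc.getValue());
            zip.putNextEntry(first);
            zip.write(mime);
            zip.closeEntry();

            // Requirement: all other streams are zipped (deflated by default).
            putDeflated(zip, "content.xml", contentXml);

            // Requirement: every stream is listed in META-INF/manifest.xml.
            putDeflated(zip, "META-INF/manifest.xml",
                        MANIFEST.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    private static void putDeflated(ZipOutputStream zip, String name, byte[] data)
            throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(data);
        zip.closeEntry();
    }

    public static void main(String[] args) throws IOException {
        byte[] pkg = write("<placeholder/>".getBytes(StandardCharsets.UTF_8));
        System.out.println("package size in bytes: " + pkg.length);
    }
}
```

The placeholder content passed in main is not a valid ODF content.xml; a real document would supply the complete content stream and further manifest entries for styles, metadata and pictures.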

All sources of the Package/Physical layer are organized under org.openoffice.odf.pkg.*

The following example illustrates how to add a graphic at the package level (OOo does not display it, because content.xml does not reference it):

import org.openoffice.odf.pkg.OdfPackage;
[...]

// loads the ODF document package from the path
OdfPackage pkg = OdfPackage.load("/home/myDocuments/myVacation.odt");

// loads the image from the URL and inserts the image in the package, adapting the manifest
pkg.insert("/myweb.org/images/myHoliday.png", "/Pictures/myHoliday.png");


The ODF Typed DOM / XML Layer

At this level, all XML file streams of the document are accessible via the W3C DOM API, but only the ODF standardized XML file streams of the document have their own classes representing their ODF XML. Foreign XML within a specified ODF file generally remains in the document model and is not discarded unless that is requested (which might become a future option).

For instance, the XML of an ODF table

File:FruitTable code.jpg

Note: In the OpenDocument ISO standard the ODF elements are reused among all document types. The above XML for a table is equally usable in Text and Spreadsheet documents.


would be mapped to a W3C-derived ODF DOM class structure:

File:Table fruits diagramm.jpg
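
Because the typed classes derive from the W3C DOM, table XML of this kind can also be walked with the plain DOM API that ships with the JDK. The sketch below assumes a small hand-written fruit table, since the original table is only available as an image; the class TableDomWalk and its methods are hypothetical and use no ODFDOM classes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical example class: reads an ODF table with the generic W3C DOM API.
class TableDomWalk {

    static final String TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    // Assumed sample data, not the table from the image above.
    static final String SAMPLE =
        "<table:table xmlns:table=\"" + TABLE_NS + "\" xmlns:text=\"" + TEXT_NS + "\""
      + " table:name=\"Fruits\">"
      + "<table:table-column table:number-columns-repeated=\"2\"/>"
      + "<table:table-row>"
      + "<table:table-cell><text:p>Apple</text:p></table:table-cell>"
      + "<table:table-cell><text:p>Red</text:p></table:table-cell>"
      + "</table:table-row>"
      + "<table:table-row>"
      + "<table:table-cell><text:p>Banana</text:p></table:table-cell>"
      + "<table:table-cell><text:p>Yellow</text:p></table:table-cell>"
      + "</table:table-row>"
      + "</table:table>";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // ODF elements are identified by namespace
        return factory.newDocumentBuilder()
                      .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Collects the text of every cell, row by row.
    static List<List<String>> cellTexts(Document doc) {
        List<List<String>> result = new ArrayList<List<String>>();
        NodeList rows = doc.getElementsByTagNameNS(TABLE_NS, "table-row");
        for (int r = 0; r < rows.getLength(); r++) {
            Element row = (Element) rows.item(r);
            NodeList cells = row.getElementsByTagNameNS(TABLE_NS, "table-cell");
            List<String> texts = new ArrayList<String>();
            for (int c = 0; c < cells.getLength(); c++) {
                texts.add(cells.item(c).getTextContent());
            }
            result.add(texts);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(cellTexts(parse(SAMPLE)));
    }
}
```

With generic DOM access every element is a plain org.w3c.dom.Element and the caller has to know names and namespaces; the typed layer replaces this knowledge with one class per ODF element.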

All sources of the typed DOM/XML layer are organized under org.openoffice.odf.dom.*

Its sources are all generated from the ODF RelaxNG schema using the following naming conventions:

  • The class name is the element's local name, with 'Odf' as a prefix and 'Element' as a suffix.
  • Element classes are placed in a sub-package named after the namespace prefix used by OOo.

Note: The element local names 'h' and 'p' have been renamed to the classes 'Heading' and 'Paragraph'

For instance, the frame element 'draw:frame' would be generated as class org.openoffice.odf.dom.draw.OdfFrameElement
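
The naming convention can be written down as a small mapping function. The sketch below follows the rules exactly as worded above; the class OdfClassNames is hypothetical, and the treatment of hyphenated local names (for example 'table-row' becoming 'TableRow') is an assumption, because the text only shows single-word names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the naming rules; not the ODFDOM code generator.
class OdfClassNames {

    static final String BASE_PACKAGE = "org.openoffice.odf.dom";

    // Local names that are renamed, as stated in the note above.
    private static final Map<String, String> RENAMED = new HashMap<String, String>();
    static {
        RENAMED.put("h", "Heading");
        RENAMED.put("p", "Paragraph");
    }

    // Maps a prefixed element name such as "draw:frame" to a class name.
    static String classNameFor(String prefixedName) {
        int colon = prefixedName.indexOf(':');
        if (colon <= 0 || colon == prefixedName.length() - 1) {
            throw new IllegalArgumentException("expected prefix:localname, got " + prefixedName);
        }
        String prefix = prefixedName.substring(0, colon);
        String local = prefixedName.substring(colon + 1);
        String simple = RENAMED.containsKey(local) ? RENAMED.get(local) : camelCase(local);
        return BASE_PACKAGE + "." + prefix + ".Odf" + simple + "Element";
    }

    // Assumption: each hyphen-separated part is capitalized and the hyphens are dropped.
    static String camelCase(String local) {
        StringBuilder name = new StringBuilder();
        for (String part : local.split("-")) {
            if (part.isEmpty()) {
                continue;
            }
            name.append(Character.toUpperCase(part.charAt(0))).append(part.substring(1));
        }
        return name.toString();
    }

    public static void main(String[] args) {
        System.out.println(classNameFor("draw:frame"));
        System.out.println(classNameFor("text:p"));
    }
}
```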

Note: Java-dependent helper classes, which are not part of the official API but are included in the module for convenience, are delivered under org.openoffice.odf.java.*

The following example illustrates how to add a graphic to the ODF document so that it is visible:

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.openoffice.odf.doc.OdfDocument;
import org.openoffice.odf.dom.element.text.OdfParagraphElement;
import org.openoffice.odf.dom.OdfNamespace;
import org.w3c.dom.Document;
[...]

// loads the ODF document from the path
OdfDocument odfDoc = OdfDocument.load("/home/myDocuments/myVacation.odt");

// get the ODF content as DOM tree representation
Document odfContent = odfDoc.getContent();

// XPath initialization (JDK5 functionality)
XPath xpath = XPathFactory.newInstance().newXPath();
xpath.setNamespaceContext(new OdfNamespace());

// receiving the first paragraph "//text:p[1]" (JDK5 functionality)
OdfParagraphElement para = (OdfParagraphElement) xpath.evaluate("//text:p[1]", odfContent, XPathConstants.NODE);

// adding an image - expecting the user to know that
// an image always consists of a 'draw:image' and a 'draw:frame' parent
// NOTE: Child access methods are not part of the first v0.6 release
para.createDrawFrame().createDrawImage("/myweb.org/images/myHoliday.png", "/Pictures/myHoliday.png");


The ODF Document / Convenient Functionality Layer

The purpose of a class from the convenient layer is to make the API easier to use. Usually such a class is derived from the XML layer class that it covers.

All sources of the ODF document / convenient functionality layer are organized under org.openoffice.odf.doc.* The name of a convenient class is the same as that of its XML parent, except that the suffix 'Element' is dropped.

For instance, a convenient class for a frame, derived from the class representing 'draw:frame', would be named org.openoffice.odf.doc.draw.OdfFrame

In contrast to the classes of the XML layer, an object of the convenient layer controls multiple XML elements. For instance, a default table with a similar number of elements to the table shown earlier can be created by a single method call, instead of creating every single element directly through the XML layer.
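
The idea of one call creating many elements can be illustrated with the plain W3C DOM API. The helper createDefaultTable below is a hypothetical stand-in for a convenient-layer method, not an ODFDOM signature; it builds the table, column, row, cell and paragraph elements that would otherwise each be created by hand.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch of a convenience method; uses only JDK classes.
class DefaultTableSketch {

    static final String TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    // One call creates the whole element structure of an empty table.
    static Element createDefaultTable(Document doc, String name, int rows, int cols) {
        Element table = doc.createElementNS(TABLE_NS, "table:table");
        table.setAttributeNS(TABLE_NS, "table:name", name);

        // A single column element repeated for all columns.
        Element column = doc.createElementNS(TABLE_NS, "table:table-column");
        column.setAttributeNS(TABLE_NS, "table:number-columns-repeated", String.valueOf(cols));
        table.appendChild(column);

        for (int r = 0; r < rows; r++) {
            Element row = doc.createElementNS(TABLE_NS, "table:table-row");
            for (int c = 0; c < cols; c++) {
                Element cell = doc.createElementNS(TABLE_NS, "table:table-cell");
                cell.appendChild(doc.createElementNS(TEXT_NS, "text:p")); // empty paragraph
                row.appendChild(cell);
            }
            table.appendChild(row);
        }
        return table;
    }

    static Document newDocument() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().newDocument();
    }

    public static void main(String[] args) throws Exception {
        Element table = createDefaultTable(newDocument(), "Table1", 2, 3);
        int cells = table.getElementsByTagNameNS(TABLE_NS, "table-cell").getLength();
        System.out.println("cells created: " + cells);
    }
}
```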


The Customized ODF Document / Extendable Layer

Although this layer is not part of the ODFDOM package, it is designed as a layer on top of ODFDOM in which customers are able to replace, override or customize ODF elements.

For instance, they could change the default size or style of their new tables.
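
In Java, such a customization would typically be an ordinary subclass that overrides the defaults. The classes TableDefaults and CompanyTableDefaults below are invented for illustration only and do not correspond to any ODFDOM class; they merely show the override pattern the layer relies on.

```java
// Hypothetical base class standing in for library-provided defaults.
class TableDefaults {
    int rows() { return 2; }
    int columns() { return 2; }
    String styleName() { return "Default"; }

    // Uses the overridable methods, so subclasses change the result.
    String describe() {
        return rows() + "x" + columns() + " table, style " + styleName();
    }
}

// Hypothetical customer class: overrides only what should differ.
class CompanyTableDefaults extends TableDefaults {
    @Override
    int columns() { return 5; }

    @Override
    String styleName() { return "CompanyBlue"; }
}

class CustomizationDemo {
    public static void main(String[] args) {
        System.out.println(new TableDefaults().describe());
        System.out.println(new CompanyTableDefaults().describe());
    }
}
```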


Current and Future Work

The current ODFDOM version is 0.6.14, offering you an online API and several downloads. ODFDOM is based on Java 5.

In contrast to its predecessor Odf4j, ODFDOM is versioned. Version 1.0 will represent a stable API, not necessarily a complete API.

The convenient layer in particular will grow on demand and will profit from the work done in Odf4j and the experience of the OOo API.

As ODFDOM is meant to be the foundation of many future ODF projects, high quality is essential. Automated tests are therefore mandatory for all new sources.

Further development is discussed on the dev@odftoolkit.openoffice.org mailing list - subscribe by sending an empty message to dev-subscribe@odftoolkit.openoffice.org.

Anyone who is interested in more information or in a real-time discussion is invited to participate in our weekly IRC meeting every Friday at 9:30 AM (CEST, UTC+2 hours) on irc://freenode/#odftoolkit . You can also join the project.


Setup ODFDOM build environment

1.) Install Java / JDK 5

2.) Install NetBeans 6.1

3.) Install Mercurial 1.x

   Setup Mercurial
   Windows: <Hg Install Dir>\Mercurial.ini or Unix: <Hg Install Dir>/.hgrc
   [ui]
   username = foo@bar.com

4.) Get ODFDOM

Unpack the ODFDOM package and start NetBeans. Open an existing project in NetBeans and choose the unpacked ODFDOM directory. As the ODFDOM source bundle comes with NetBeans project files, ODFDOM opens as a preconfigured project. You can still work with Ant directly on the command line instead of using the NetBeans GUI.


Mercurial is used for distributed revision control. Since NetBeans 6.1, the Mercurial plugin is part of the IDE; it helps you to track changes and makes it easier to provide patches.

In case you are new to NetBeans, there are several good NetBeans tutorials available.

--Svante 00:33, 24 April 2008 (CEST)
