Difference between revisions of "ODFDOM"

From Apache OpenOffice Wiki

Revision as of 12:19, 6 October 2008

OpenDocument API - ODFDOM

ODFDOM is the name of the new free OpenDocument framework sponsored by Sun Microsystems Inc.

Its purpose is to provide an easy, common way to create, access and manipulate OpenDocument files, without requiring detailed knowledge of the OpenDocument specification.

It is the successor of AODL and Odf4j, designed together with their architects to provide the ODF developer community with an easy, lightweight programming API that is portable to any object-oriented language.

The Java 5 reference implementation of ODFDOM has been available under LGPL3 since May 2008, together with its online JavaDoc API.

Overview

The ODFDOM project's objective is to provide an easy API for reading, writing and manipulating documents of the OpenDocument format (ODF). To achieve this, the ODFDOM API follows a layered approach to access documents. This layered design provides a robust foundation for a modular, well-structured architecture.

Overview of the ODFDOM layers:

  • The bottom ODF Package / Physical Layer - provides direct access to the resources stored in the ODF package, such as XML streams, images or embedded objects.
  • The ODF Typed DOM / XML Layer - represents the ODF XML elements from the standardized ODF XML streams (e.g. content.xml, styles.xml). This layer is based on the platform- and language-independent DOM API standardized by the W3C, best known from its implementation in every browser. Following the DOM concept, the layer provides a class for every ODF XML element defined by the ODF specification and its grammar (the RelaxNG schema). Rather than being written laboriously by hand, these classes are generated directly from the ODF grammar. This generation guarantees complete and accurate coverage of the ODF specification on the one hand and an easy upgrade to future ODF specifications on the other.
  • The ODF Document / Convenient Layer - is concerned with usability aids. The layer provides components whose manipulation affects multiple underlying ODF XML elements. The API for these manipulations is not specified by the ODF standard. Instead, the API is driven by frequent user scenarios, for example changing the content of a certain spreadsheet cell (e.g. adding 'Hello World' to the spreadsheet cell positioned at 'B2').
  • The Customized ODF Document / Extendable Layer - is concerned with user-defined customizations. Although this level is not delivered as part of the library, it is listed because it is still part of the overall design. The level describes the sources of an ODFDOM user who overrides ODFDOM functionality (e.g. all new tables will have a certain default size and color).
File:ODFDOM-Layered-Model.png


ODFDOM is part of the odftoolkit project. Development is discussed on the dev@odftoolkit.openoffice.org mailing list.


The ODFDOM Layers

The ODF Package / Physical Layer

At this level a document is represented by a bundle of named resources zipped to a package.

For instance, an ODF text document like 'myVacation.odt' might contain the following files:

File:ODF Package.jpg

Note: All file streams aside from the '/Pictures' directory and its content are specified by the ODF standard. The file streams are similar for all types of ODF documents.

The main requirements for this layer are:

  • Zip/unzip the file streams of the package
  • List all file streams in the /META-INF/manifest.xml (similar to an inventory)
  • Begin the package with an uncompressed 'mimetype' file stream (allowing others to easily identify the package)
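
The requirements above can be pictured with plain JDK classes. The following is only a minimal sketch, not ODFDOM's implementation: the class OdfPackageSketch and its methods are hypothetical names, and the manifest content is simply passed in by the caller. It writes an uncompressed 'mimetype' entry first, then the manifest, and reads the entry names back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of ODFDOM: illustrates the package layout only.
public class OdfPackageSketch {

    /** Writes a minimal package: uncompressed 'mimetype' first, then the manifest. */
    public static byte[] writePackage(String mimeType, String manifestXml) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            byte[] mime = mimeType.getBytes(StandardCharsets.US_ASCII);

            // A STORED (uncompressed) entry needs its size and CRC set up front.
            ZipEntry first = new ZipEntry("mimetype");
            first.setMethod(ZipEntry.STORED);
            first.setSize(mime.length);
            first.setCompressedSize(mime.length);
            CRC32 crc = new CRC32();
            crc.update(mime);
            first.setCrc(crc.getValue());
            zip.putNextEntry(first);
            zip.write(mime);
            zip.closeEntry();

            // The manifest is the inventory of the package; it is compressed as usual.
            zip.putNextEntry(new ZipEntry("META-INF/manifest.xml"));
            zip.write(manifestXml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns the entry names of the package in the order they are stored. */
    public static List<String> listEntries(byte[] pkg) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(pkg))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                names.add(e.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    /** Checks that the first entry is named 'mimetype' and is not compressed. */
    public static boolean startsWithStoredMimetype(byte[] pkg) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(pkg))) {
            ZipEntry first = zip.getNextEntry();
            return first != null
                    && "mimetype".equals(first.getName())
                    && first.getMethod() == ZipEntry.STORED;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the 'mimetype' entry is first and uncompressed, its content sits at a predictable position near the start of the file, which is what lets other tools identify the package without unzipping it.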

All sources of the Package/Physical layer are organized in ODFDOM under org.openoffice.odf.pkg.*

The following example illustrates how to add a graphic at the package level. The graphic will not be displayed by an ODF application such as OOo, because it is not referenced by the document content:

import org.openoffice.odf.pkg.OdfPackage;
[...]

// loads the ODF document package from the path
OdfPackage pkg = OdfPackage.load("/home/myDocuments/myVacation.odt");

// loads the image from the URL and inserts the image in the package, adapting the manifest
pkg.insert("/myweb.org/images/myHoliday.png", "/Pictures/myHoliday.png");


The ODF Typed DOM / XML Layer

At this level, all XML file streams of the document are accessible via the W3C DOM API, but only the XML file streams standardized by ODF have their own classes representing their ODF XML. Foreign XML within a specified ODF file generally remains in the document model and is not dropped unless desired (which might become a future option).

For instance, consider the XML of an ODF table:

File:FruitTable code.jpg

Note: In the OpenDocument ISO standard the ODF elements are reused among all document types. The above XML for a table is equally usable in Text and Spreadsheet documents.


This table XML would be mapped to a W3C-derived ODF DOM class structure:

File:Table fruits diagramm.jpg

All sources of the typed DOM/XML layer are organized under org.openoffice.odf.dom.*

Its sources are all generated from the ODF RelaxNG schema using the following naming conventions:

  • The class name is equal to the element's local name, using 'Odf' as a prefix and 'Element' as a suffix.
  • Elements are stored under a sub-package named after the namespace prefix used by OOo.

Note: The element local names 'h' and 'p' have been renamed to the classes 'Heading' and 'Paragraph'

For instance, the frame element 'draw:frame' would be generated as class org.openoffice.odf.dom.draw.OdfFrameElement
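
To make the naming conventions concrete, here is a small sketch that derives the class name from a qualified element name. It is not the actual ODFDOM generator: the class OdfClassNameSketch is hypothetical, and the camel-casing of hyphenated local names (e.g. 'table-row' to 'TableRow') is an assumption that the text above does not confirm.

```java
// Hypothetical illustration of the naming conventions; not the ODFDOM code generator.
public class OdfClassNameSketch {

    /** Maps a qualified element name such as "draw:frame" to a fully qualified class name. */
    public static String classNameFor(String qualifiedName) {
        int colon = qualifiedName.indexOf(':');
        if (colon <= 0 || colon == qualifiedName.length() - 1) {
            throw new IllegalArgumentException("Expected prefix:localName but got " + qualifiedName);
        }
        String prefix = qualifiedName.substring(0, colon);
        String local = qualifiedName.substring(colon + 1);

        String base;
        if (local.equals("h")) {
            base = "Heading";      // renamed according to the note above
        } else if (local.equals("p")) {
            base = "Paragraph";    // renamed according to the note above
        } else {
            base = camelCase(local);
        }
        // Sub-package = namespace prefix, class = 'Odf' + name + 'Element'.
        return "org.openoffice.odf.dom." + prefix + ".Odf" + base + "Element";
    }

    /** ASSUMPTION: hyphenated local names are joined in camel case. */
    static String camelCase(String local) {
        StringBuilder name = new StringBuilder();
        for (String part : local.split("-")) {
            if (part.isEmpty()) {
                continue;
            }
            name.append(Character.toUpperCase(part.charAt(0))).append(part.substring(1));
        }
        return name.toString();
    }
}
```

With this sketch, classNameFor("draw:frame") yields org.openoffice.odf.dom.draw.OdfFrameElement, matching the example given above.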

Note: Java-dependent helper classes, which are not part of the official API but are included in the module for convenience, are delivered under org.openoffice.odf.java.*

The following example illustrates how to add a graphic to the ODF document so that it is visible:

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.openoffice.odf.doc.OdfDocument;
import org.openoffice.odf.dom.element.text.OdfParagraphElement;
import org.openoffice.odf.dom.OdfNamespace;
import org.w3c.dom.Document;
[...]

// loads the ODF document from the path
OdfDocument odfDoc = OdfDocument.load("/home/myDocuments/myVacation.odt");

// get the ODF content as DOM tree representation
Document odfContent = odfDoc.getContent();

// XPath initialization (JDK5 functionality)
XPath xpath = XPathFactory.newInstance().newXPath();
xpath.setNamespaceContext(new OdfNamespace());

// receiving the first paragraph "//text:p[1]" (JDK5 functionality)
OdfParagraphElement para = (OdfParagraphElement) xpath.evaluate("//text:p[1]", odfContent, XPathConstants.NODE);

// adding an image - expecting the user to know that 
// an image consists always of a 'draw:image' and a 'draw:frame' parent
// NOTE: Child access methods are not part of the first v0.6 release
para.createDrawFrame().createDrawImage("/myweb.org/images/myHoliday.png", "/Pictures/myHoliday.png");
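
The XPath lookup above depends on a namespace context that resolves the 'text' prefix. As a rough sketch of that step using only JDK classes (no ODFDOM), the following hypothetical class FirstParagraphSketch parses an in-memory XML snippet and returns the text of the first paragraph. The anonymous NamespaceContext is an assumption about what a class like OdfNamespace has to provide, and the expression "(//text:p)[1]" is used here to select the first paragraph in document order.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical JDK-only illustration; ODFDOM supplies its own typed classes instead.
public class FirstParagraphSketch {

    /** Namespace URI of the ODF 'text' prefix. */
    public static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    /** Returns the text of the first text:p element, or null if there is none. */
    public static String firstParagraphText(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for prefix-based XPath lookups
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "text".equals(prefix) ? TEXT_NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) {
                    return TEXT_NS.equals(uri) ? "text" : null;
                }
                public Iterator<String> getPrefixes(String uri) {
                    return TEXT_NS.equals(uri)
                            ? Collections.singletonList("text").iterator()
                            : Collections.<String>emptyList().iterator();
                }
            });

            Element para = (Element) xpath.evaluate("(//text:p)[1]", doc, XPathConstants.NODE);
            return para == null ? null : para.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate the XPath expression", e);
        }
    }
}
```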


The ODF Document / Convenient Functionality Layer

The purpose of a class in the convenient layer is to offer higher usability to the API user. Usually such a class is derived from the XML layer class it covers.

All sources of the ODF document / convenient functionality layer are organized under org.openoffice.odf.doc.* The name of a convenient class is the same as that of its XML parent class, except that the suffix 'Element' is omitted.

For instance, a convenient class for a frame, derived from the class representing 'draw:frame' would be named org.openoffice.odf.doc.draw.OdfFrame

In contrast to the classes of the XML layer, an object of the convenient layer controls multiple XML elements. For instance, a default table with a similar number of elements to the table shown earlier can be created by a single method call, instead of creating every single element directly through the XML layer.
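
The idea of one call producing many XML elements can be sketched with the plain W3C DOM API. This is not the ODFDOM convenient layer: the class DefaultTableSketch and the method createDefaultTable are hypothetical, and the choice to put an empty text:p into every cell is an assumption made for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration of a convenience method; not ODFDOM API.
public class DefaultTableSketch {

    public static final String TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";
    public static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    /** Creates an empty namespace-aware DOM document to host the elements. */
    public static Document newDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** One call builds the table, its column description, rows, cells and paragraphs. */
    public static Element createDefaultTable(Document doc, int rows, int columns) {
        Element table = doc.createElementNS(TABLE_NS, "table:table");

        // A single column element describes all columns via a repeat count.
        Element column = doc.createElementNS(TABLE_NS, "table:table-column");
        column.setAttributeNS(TABLE_NS, "table:number-columns-repeated", String.valueOf(columns));
        table.appendChild(column);

        for (int r = 0; r < rows; r++) {
            Element row = doc.createElementNS(TABLE_NS, "table:table-row");
            for (int c = 0; c < columns; c++) {
                Element cell = doc.createElementNS(TABLE_NS, "table:table-cell");
                cell.appendChild(doc.createElementNS(TEXT_NS, "text:p")); // empty paragraph
                row.appendChild(cell);
            }
            table.appendChild(row);
        }
        return table;
    }
}
```

A call such as createDefaultTable(doc, 2, 3) creates 16 elements at once (one table, one column description, two rows, six cells and six paragraphs), which is the kind of bookkeeping a convenient class is meant to hide.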


The Customized ODF Document / Extendable Layer

Although this layer is not part of the ODFDOM package, it is designed as a layer on top of ODFDOM in which customers are able to replace, override or customize ODF elements.

For instance, customers could change the default size or style of their new tables.
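
One way to picture such a customization is plain subclassing. The sketch below is purely illustrative: TableDefaults stands in for defaults that a library might expose, CompanyTableDefaults is a customer-side override, and none of these names exist in ODFDOM.

```java
// Hypothetical illustration of overriding defaults; none of these classes exist in ODFDOM.
public class TableDefaultsSketch {

    /** Stand-in for defaults a library could provide for new tables. */
    public static class TableDefaults {
        public int rows() { return 2; }
        public int columns() { return 2; }
        public String backgroundColor() { return "#ffffff"; }

        /** Summarizes the effective defaults, including any overridden values. */
        public String describe() {
            return rows() + "x" + columns() + " " + backgroundColor();
        }
    }

    /** Customer-side customization: more rows and a different color, same column count. */
    public static class CompanyTableDefaults extends TableDefaults {
        @Override public int rows() { return 5; }
        @Override public String backgroundColor() { return "#ffcc00"; }
    }
}
```

Code that asks a TableDefaults object for its values picks up the customer's overrides without any change to the underlying library.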


Current and Future Work

The current ODFDOM version is 0.6.14, offering you an online API and several downloads. ODFDOM is based on Java 5.

In contrast to its predecessor Odf4j, ODFDOM is versioned. Version 1.0 will represent a stable API, not necessarily a complete one.

The convenient layer in particular will grow on demand and will profit from the work done in Odf4j and the experience gained with the OOo API.

As ODFDOM should be the foundation of many future ODF projects, high quality is desired. Therefore automated tests are obligatory for all new sources.

Further development is discussed on the dev@odftoolkit.openoffice.org mailing list - subscribe by sending an empty message to dev-subscribe@odftoolkit.openoffice.org.

Anybody who is interested in getting more information and/or having a real-time discussion is invited to participate in our weekly IRC meeting every Friday at 9.30 AM (CEST or UTC+2 hours) on irc://freenode/#odftoolkit . You can also join the project.


Setup ODFDOM build environment

1.) Install Java / JDK 5

2.) Install NetBeans 6.1

3.) Install Mercurial 1.x

   Setup Mercurial
   Windows: <Hg Install Dir>\Mercurial.ini or Unix: <Hg Install Dir>/.hgrc
   [ui]
   username = foo@bar.com

4.) Get ODFDOM

Unpack the ODFDOM package and start NetBeans. Choose to open an existing project in NetBeans and select the unpacked ODFDOM directory. As the ODFDOM source bundle comes with NetBeans project files, ODFDOM opens as a preconfigured project. You can still work with Ant directly on the command line instead of using the NetBeans IDE.


Mercurial is used for distributed revision control. Since NetBeans 6.1 the Mercurial plugin has been part of the IDE, which helps you track changes and makes it easier to provide patches.

In case you are new to NetBeans, there are several good NetBeans tutorials available.

--Svante 00:33, 24 April 2008 (CEST)
