Difference between revisions of "ODFDOM"

From Apache OpenOffice Wiki
=Overview=
 
  
The ODFDOM project's objective is to provide an easy API for reading, writing and manipulating documents of the OpenDocument format (ODF).
 
To achieve this, the ODFDOM API follows a layered approach to accessing documents. A layered design provides a robust foundation for a modular, well-structured API.
 
 
Overview of the ODFDOM layers:
 
* The bottom ODF Package / Physical Layer - provides direct access to the resources stored in the ODF package, such as XML streams, images or embedded objects.
 
* The ODF Typed DOM / XML Layer - represents the ODF XML elements from the standardized ODF XML streams (e.g. content.xml, styles.xml). This layer is based on the platform and language independent [http://www.w3.org/DOM/ DOM API standardized by the W3C], best known from its implementation in every browser. Following the DOM concept, the layer provides a class for every ODF XML element defined by the ODF specification and its grammar (the RelaxNG schema). Instead of being written laboriously by hand, these classes are generated directly from the ODF grammar. This generation guarantees complete and accurate coverage of the ODF specification on the one hand and an easy upgrade to future ODF specifications on the other.
 
* The ODF Document / Convenient Layer - is concerned with usability aids. Its classes represent components that consist of multiple underlying ODF XML elements and offer methods to manipulate them. The API for these manipulations is not specified by the ODF standard; instead, it is driven by frequent user scenarios, for example changing the content of a certain spreadsheet cell (e.g. adding 'Hello World' to the spreadsheet cell at position 'B2').
 
* The Customized ODF Document / Extendable Layer - is concerned with user-defined customizations. Although this layer is not delivered as part of the library, it is listed because it is still part of the overall design. It comprises the sources with which an ODFDOM user overrides ODFDOM functionality (e.g. all new tables will have a certain default size and color).<br>
 
 
<center> [[Image:ODFDOM-Layered-Model.png]]
 
 
<br>
 
 
ODFDOM is part of the [http://odftoolkit.openoffice.org odftoolkit project]. Development is discussed on the [http://odftoolkit.openoffice.org/servlets/SummarizeList?listName=dev dev@odftoolkit.openoffice.org mailing list].
 
 
<br>
 
 
</center>
 
  
 
=The ODFDOM Layers=
 

Revision as of 18:11, 1 December 2008

OpenDocument API - ODFDOM

ODFDOM has a new home, please visit http://odftoolkit.org/projects/odftoolkit/pages/ODFDOM

You should register at the ODF Toolkit project and subscribe yourself to the mailing lists of the projects you are interested in.

See you soon, Svante


The ODFDOM Layers

The ODF Package / Physical Layer

At this level a document is represented by a bundle of named resources zipped into a package.

For instance, an ODF text document like 'myVacation.odt' might contain the following files:

File:ODF Package.jpg

Note: All file streams apart from the '/Pictures' directory and its content are specified by the ODF standard. Furthermore, the file streams are similar for all types of ODF documents.

The main requirements for this layer are:

  • Zip/unzip the file streams of the package
  • List all file streams in /META-INF/manifest.xml (similar to an inventory)
  • Begin the package with an uncompressed 'mimetype' file stream (allowing others to easily identify the package)
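
The three requirements above can be sketched with nothing but the JDK's java.util.zip classes. This is a minimal illustration and not the ODFDOM implementation: the class name MinimalOdfPackage and its methods are hypothetical, the manifest is reduced to two entries, and the content stream is whatever string the caller passes in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of ODFDOM: writes and inspects a tiny package.
final class MinimalOdfPackage {
    static final String ODT_MIMETYPE = "application/vnd.oasis.opendocument.text";

    /** Builds an in-memory package: uncompressed 'mimetype' first, then the other streams. */
    static byte[] build(String contentXml) {
        String manifest = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<manifest:manifest xmlns:manifest=\"urn:oasis:names:tc:opendocument:xmlns:manifest:1.0\">\n"
            + " <manifest:file-entry manifest:media-type=\"" + ODT_MIMETYPE + "\" manifest:full-path=\"/\"/>\n"
            + " <manifest:file-entry manifest:media-type=\"text/xml\" manifest:full-path=\"content.xml\"/>\n"
            + "</manifest:manifest>\n";
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            // A STORED entry needs its size and CRC up front.
            byte[] mime = ODT_MIMETYPE.getBytes(StandardCharsets.US_ASCII);
            CRC32 crc = new CRC32();
            crc.update(mime);
            ZipEntry first = new ZipEntry("mimetype");
            first.setMethod(ZipEntry.STORED);
            first.setSize(mime.length);
            first.setCompressedSize(mime.length);
            first.setCrc(crc.getValue());
            zip.putNextEntry(first);
            zip.write(mime);
            zip.closeEntry();
            putCompressed(zip, "content.xml", contentXml);
            putCompressed(zip, "META-INF/manifest.xml", manifest);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    private static void putCompressed(ZipOutputStream zip, String name, String text)
            throws IOException {
        zip.putNextEntry(new ZipEntry(name)); // default method is DEFLATED
        zip.write(text.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    /** Returns "name:STORED" or "name:DEFLATED" for each entry, in package order. */
    static List<String> describeEntries(byte[] pkg) {
        List<String> result = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(pkg))) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                String method = entry.getMethod() == ZipEntry.STORED ? "STORED" : "DEFLATED";
                result.add(entry.getName() + ":" + method);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return result;
    }
}
```

Calling MinimalOdfPackage.describeEntries(MinimalOdfPackage.build(xml)) lists the 'mimetype' entry first and marks it as stored, which is what lets other tools identify the package by reading its first bytes.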

All sources of the Package/Physical layer are organized in ODFDOM under org.openoffice.odf.pkg.*

The following example illustrates how to add a graphic at the package level. An ODF application such as OOo will not display the graphic, because the document content does not reference it:

import org.openoffice.odf.pkg.OdfPackage;
[...]

// loads the ODF document package from the path
OdfPackage pkg = OdfPackage.load("/home/myDocuments/myVacation.odt");

// loads the image from the URL and inserts the image in the package, adapting the manifest
pkg.insert("/myweb.org/images/myHoliday.png", "/Pictures/myHoliday.png");


The ODF Typed DOM / XML Layer

At this level, all XML file streams of the document are accessible via the W3C DOM API, but only the ODF standardized XML file streams of the document (e.g. content.xml, meta.xml) have their own classes representing their ODF XML elements. In general, foreign XML within a specified ODF XML file remains in the document model and is not dropped; dropping it on request might become a future option.

Example of the ODF XML representing a table in ODF:

File:FruitTable code.jpg

Note: In the OpenDocument standard the ODF elements are reused among all document types. The above XML of a table is for instance equally usable in Text and Spreadsheet documents.


This XML would be mapped to an ODF class structure derived from the W3C DOM:

File:Table fruits diagramm.jpg
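
Because the typed classes derive from the W3C DOM, the same table can also be walked with the plain JDK DOM API. The sketch below assumes a small two-row fruit table as sample data (the original figure is not reproduced here, so the cell values are illustrative) and uses only standard javax.xml and org.w3c.dom calls; the class TableDomWalk is hypothetical and not part of ODFDOM.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of ODFDOM: reads cell texts via the generic DOM API.
final class TableDomWalk {
    static final String TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    // Illustrative sample data; the values are assumptions.
    static final String SAMPLE =
        "<table:table xmlns:table='" + TABLE_NS + "' xmlns:text='" + TEXT_NS + "'"
        + " table:name='Fruits'>"
        + "<table:table-column table:number-columns-repeated='2'/>"
        + "<table:table-row>"
        + "<table:table-cell><text:p>Apple</text:p></table:table-cell>"
        + "<table:table-cell><text:p>Red</text:p></table:table-cell>"
        + "</table:table-row>"
        + "<table:table-row>"
        + "<table:table-cell><text:p>Banana</text:p></table:table-cell>"
        + "<table:table-cell><text:p>Yellow</text:p></table:table-cell>"
        + "</table:table-row>"
        + "</table:table>";

    /** Returns the text of every cell, row by row. Nested tables are not handled. */
    static List<List<String>> cellTexts(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the *NS lookups below
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            List<List<String>> rows = new ArrayList<>();
            NodeList rowNodes = doc.getElementsByTagNameNS(TABLE_NS, "table-row");
            for (int r = 0; r < rowNodes.getLength(); r++) {
                Element row = (Element) rowNodes.item(r);
                NodeList cells = row.getElementsByTagNameNS(TABLE_NS, "table-cell");
                List<String> texts = new ArrayList<>();
                for (int c = 0; c < cells.getLength(); c++) {
                    texts.add(cells.item(c).getTextContent());
                }
                rows.add(texts);
            }
            return rows;
        } catch (ParserConfigurationException | SAXException | IOException ex) {
            throw new IllegalStateException("Could not parse table XML", ex);
        }
    }
}
```

A typed layer replaces the string-based lookups of element names with one class per element, but the underlying tree navigation stays the same.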

All sources of the typed DOM/XML layer are organized under org.openoffice.odf.dom.*

The sources for the ODF elements are all generated from the ODF grammar (RelaxNG schema) using the following naming conventions in the Java reference implementation:

  • The class name is equal to the element's local name, with 'Odf' as a prefix and 'Element' as a suffix (e.g. the 'draw:frame' element has the OdfFrameElement class).
  • Elements are stored in a sub-package named after the namespace prefix used by OOo. Therefore the frame element 'draw:frame' would be generated in Java as the class org.openoffice.odf.dom.draw.OdfFrameElement.

Note: The element local names 'h' and 'p' have been mapped to 'Heading' and 'Paragraph' in the class names for usability reasons.
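
The naming rules can be expressed as a small mapping function. This is only a sketch of the convention as described above, not the actual code generator: the class OdfClassNames is hypothetical, the package root is taken literally from the rule, and the camel-casing of hyphenated local names (for example 'table-cell') is an assumption that the text does not confirm.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of ODFDOM: derives a class name from an element name.
final class OdfClassNames {
    private static final String BASE_PACKAGE = "org.openoffice.odf.dom";

    // Local names that are renamed for usability reasons.
    private static final Map<String, String> RENAMED = new HashMap<>();
    static {
        RENAMED.put("h", "Heading");
        RENAMED.put("p", "Paragraph");
    }

    /** Maps a qualified element name such as "draw:frame" to a fully qualified class name. */
    static String classNameFor(String qualifiedName) {
        int colon = qualifiedName.indexOf(':');
        if (colon <= 0 || colon == qualifiedName.length() - 1) {
            throw new IllegalArgumentException("Expected prefix:localName, got " + qualifiedName);
        }
        String prefix = qualifiedName.substring(0, colon);
        String local = qualifiedName.substring(colon + 1);
        String base = RENAMED.containsKey(local) ? RENAMED.get(local) : camelCase(local);
        return BASE_PACKAGE + "." + prefix + ".Odf" + base + "Element";
    }

    // Assumption: hyphen-separated parts are capitalized and joined.
    private static String camelCase(String local) {
        StringBuilder out = new StringBuilder();
        for (String part : local.split("-")) {
            if (!part.isEmpty()) {
                out.append(Character.toUpperCase(part.charAt(0))).append(part.substring(1));
            }
        }
        return out.toString();
    }
}
```

For example, OdfClassNames.classNameFor("draw:frame") yields org.openoffice.odf.dom.draw.OdfFrameElement, matching the frame example in the list.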

The following example illustrates how to add a graphic to the ODF document so that it is visible:

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.openoffice.odf.doc.OdfDocument;
import org.openoffice.odf.dom.element.text.OdfParagraphElement;
import org.openoffice.odf.dom.OdfNamespace;
import org.w3c.dom.Document;
[...]

// loads the ODF document from the path
OdfDocument odfDoc = OdfDocument.load("/home/myDocuments/myVacation.odt");

// get the ODF content as DOM tree representation
Document odfContent = odfDoc.getContent();

// XPath initialization (JDK 5 functionality)
XPath xpath = XPathFactory.newInstance().newXPath();
xpath.setNamespaceContext(new OdfNamespace());

// retrieve the first paragraph "//text:p[1]" (JDK 5 functionality)
OdfParagraphElement para = (OdfParagraphElement) xpath.evaluate("//text:p[1]", odfContent, XPathConstants.NODE);

// adding an image - expecting the user to know that 
// an image consists always of a 'draw:image' and a 'draw:frame' parent
// NOTE: Child access methods are still not part of the v0.6.x releases
para.createDrawFrame().createDrawImage("/myweb.org/images/myHoliday.png", "/Pictures/myHoliday.png");
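
The XPath lookup in the example depends on a namespace context that resolves the 'text' prefix, a role the example gives to OdfNamespace. The following JDK-only sketch shows what such a context has to provide; the classes FirstParagraphLookup and TextNamespaceContext and the sample document are hypothetical and only illustrate the mechanism.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of ODFDOM: namespace-aware XPath with the JDK only.
final class FirstParagraphLookup {
    static final String OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    // Illustrative sample data.
    static final String SAMPLE =
        "<office:text xmlns:office='" + OFFICE_NS + "' xmlns:text='" + TEXT_NS + "'>"
        + "<text:h>Title</text:h><text:p>First</text:p><text:p>Second</text:p>"
        + "</office:text>";

    /** Resolves only the 'text' prefix; a full context would cover all ODF prefixes. */
    static final class TextNamespaceContext implements NamespaceContext {
        @Override
        public String getNamespaceURI(String prefix) {
            if (prefix == null) {
                throw new IllegalArgumentException("prefix must not be null");
            }
            return "text".equals(prefix) ? TEXT_NS : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String namespaceURI) {
            return TEXT_NS.equals(namespaceURI) ? "text" : null;
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            String prefix = getPrefix(namespaceURI);
            return prefix == null
                ? Collections.<String>emptyList().iterator()
                : Collections.singletonList(prefix).iterator();
        }
    }

    /** Returns the text of the first paragraph found, or null if there is none. */
    static String firstParagraphText(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new TextNamespaceContext());
            Node para = (Node) xpath.evaluate("//text:p[1]", doc, XPathConstants.NODE);
            return para == null ? null : para.getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException
                | XPathExpressionException ex) {
            throw new IllegalStateException("XPath lookup failed", ex);
        }
    }
}
```

Without the namespace context, the expression cannot be evaluated because the 'text' prefix is unknown to the XPath engine.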


The ODF Document / Convenient Functionality Layer

The purpose of a class from the convenient layer is to provide higher usability for the API user. To achieve this, a convenient layer class usually controls multiple XML elements that together represent a component. In contrast, the classes of the previous XML layer only control their own ODF element and its attributes.

For instance, the table class of the convenient layer manipulates not only the table element and its attributes, but may also manipulate all children of the table element. For example, adding a cell to a table would not only create a new table cell, but might also update the row and column information on the corresponding elements, all in a single method call on the table.
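
To make the idea of a single call that touches several elements concrete, here is a sketch written against the plain W3C DOM. It is not the ODFDOM table class: TableCellSketch and its methods are hypothetical, and the sketch deliberately ignores repeat attributes such as table:number-columns-repeated as well as nested tables.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical convenience helper, not part of ODFDOM.
final class TableCellSketch {
    static final String TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    /** Creates an empty table:table element in a fresh document. */
    static Element newTable() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            Element table = doc.createElementNS(TABLE_NS, "table:table");
            doc.appendChild(table);
            return table;
        } catch (ParserConfigurationException ex) {
            throw new IllegalStateException(ex);
        }
    }

    /** Sets the text of a cell such as "B2", creating columns, rows and cells as needed. */
    static void setCellText(Element table, String address, String text) {
        int split = 0;
        while (split < address.length() && Character.isLetter(address.charAt(split))) {
            split++;
        }
        if (split == 0 || split == address.length()) {
            throw new IllegalArgumentException("Not a cell address: " + address);
        }
        int col = 0; // 1-based column index: A=1, B=2, ..., AA=27
        for (char c : address.substring(0, split).toUpperCase().toCharArray()) {
            col = col * 26 + (c - 'A' + 1);
        }
        int row = Integer.parseInt(address.substring(split)); // 1-based row index
        if (row < 1) {
            throw new IllegalArgumentException("Row must be positive: " + address);
        }
        Document doc = table.getOwnerDocument();
        // Column declarations have to precede the rows.
        List<Element> rows = children(table, "table-row");
        Node firstRow = rows.isEmpty() ? null : rows.get(0);
        for (int i = children(table, "table-column").size(); i < col; i++) {
            table.insertBefore(doc.createElementNS(TABLE_NS, "table:table-column"), firstRow);
        }
        for (int i = rows.size(); i < row; i++) {
            table.appendChild(doc.createElementNS(TABLE_NS, "table:table-row"));
        }
        // Pad every row so that the table stays rectangular.
        int width = children(table, "table-column").size();
        for (Element r : children(table, "table-row")) {
            for (int i = children(r, "table-cell").size(); i < width; i++) {
                r.appendChild(doc.createElementNS(TABLE_NS, "table:table-cell"));
            }
        }
        Element cell = children(children(table, "table-row").get(row - 1), "table-cell")
            .get(col - 1);
        while (cell.hasChildNodes()) {
            cell.removeChild(cell.getFirstChild());
        }
        Element paragraph = doc.createElementNS(TEXT_NS, "text:p");
        paragraph.setTextContent(text);
        cell.appendChild(paragraph);
    }

    /** Returns the direct child elements in the table namespace with the given local name. */
    static List<Element> children(Element parent, String localName) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && TABLE_NS.equals(n.getNamespaceURI())
                    && localName.equals(n.getLocalName())) {
                result.add((Element) n);
            }
        }
        return result;
    }
}
```

A call such as TableCellSketch.setCellText(table, "B2", "Hello World") on an empty table creates two column declarations, two rows and four cells before writing the text, which is exactly the bookkeeping a caller working at the XML layer would have to do by hand.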

Usually a class from the convenient layer is derived from the corresponding XML layer class to inherit its DOM functionality.

As a naming convention, all sources of the ODF document / convenient functionality layer are organized under org.openoffice.odf.doc.*. The name of a convenient class is the same as that of its parent from the XML layer, only with the suffix 'Element' dropped. For example, the convenient class for a 'draw:frame' element would be represented in the ODFDOM Java reference implementation as the class org.openoffice.odf.doc.draw.OdfFrame.


The Customized ODF Document / Extendable Layer

Although not part of the ODFDOM package, this layer is designed to sit on top of ODFDOM, where customers can replace, override or customize ODF elements.

For instance, they could change the default size or style of their new tables.
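
One way to picture such a customization is a user-side subclass that overrides where new tables get their defaults. Every name in this sketch is hypothetical (SketchTable, SketchTableFactory, HouseStyleTableFactory, and the default values); ODFDOM does not prescribe this mechanism, and the sketch only illustrates the layering idea.

```java
/** Hypothetical stand-in for a convenience-layer table; not an ODFDOM class. */
final class SketchTable {
    final String width;
    final String backgroundColor;

    SketchTable(String width, String backgroundColor) {
        this.width = width;
        this.backgroundColor = backgroundColor;
    }
}

/** Hypothetical factory standing in for the library's built-in defaults. */
class SketchTableFactory {
    SketchTable newTable() {
        return new SketchTable("10cm", "transparent"); // assumed library defaults
    }
}

/** User-side customization: every new table gets a house size and color. */
final class HouseStyleTableFactory extends SketchTableFactory {
    @Override
    SketchTable newTable() {
        return new SketchTable("17cm", "#ffffcc"); // assumed user defaults
    }
}
```

Code that asks a SketchTableFactory for tables picks up the customized defaults as soon as it is handed a HouseStyleTableFactory instead, without any change to the library classes.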


Current and Future Work

The current ODFDOM version is 0.6.15; its reference implementation is based on Java 5. ODFDOM is currently provided with online API documentation and several downloads. An online Mercurial repository is planned.

In contrast to its predecessor Odf4j, ODFDOM has a version number. The future version 1.0 will represent a stable API, not necessarily a complete one.

The convenient layer in particular will grow on demand and will surely profit from the work done in Odf4j and from experience with the OOo API.

As ODFDOM should become the foundation of many future ODF projects, high quality is essential. Automated tests are therefore mandatory for all new sources of the Java reference implementation.

The development is being discussed on the dev@odftoolkit.openoffice.org mailing list - subscribe by sending an empty message to dev-subscribe@odftoolkit.openoffice.org.

Anybody interested in more information or a real-time discussion is invited to participate in our weekly IRC meeting every Friday at 9:30 AM CEST (UTC+2) on irc://freenode/#odftoolkit. You might also join the ODF Toolkit project.

Setup ODFDOM build environment

To establish your own ODFDOM build environment:

1.) Install Java / JDK 5

2.) Install NetBeans 6.1

3.) Install Mercurial 1.x

   Setup Mercurial
   Windows: <Hg Install Dir>\Mercurial.ini or Unix: $HOME/.hgrc
   [ui]
   username = foo@bar.com

4.) Get ODFDOM

Unpack the ODFDOM source bundle and start NetBeans. Choose to open an existing project in NetBeans and select the unpacked ODFDOM directory. As the ODFDOM source bundle ships with NetBeans project files, ODFDOM opens as a preconfigured project. Alternatively, you can work solely with Ant on the command line, without the comfort of the NetBeans IDE GUI.


Mercurial is used for distributed revision control. Since NetBeans 6.1 the Mercurial plugin has been part of the IDE, which helps you track changes and makes it easier to provide patches.

In case you are new to NetBeans, there are several nice NetBeans tutorials available.

--Svante 00:33, 24 April 2008 (CEST)
