User:DrewJensen/XML export

From Apache OpenOffice Wiki
< User:DrewJensen
Revision as of 05:46, 25 September 2006 by DrewJensen (Talk | contribs)

Jump to: navigation, search

CSV file formatted data is still commonly used and is arguably the first required format for any database export utility. However, XML is rapidly approaching a par as being the other first required data export format.

The default database , HSQLDB, as incorporated into Base currently has no XML generating capabilities

In parallel with the work on the CSV export I propose work on a simple XML export also. This effort will utilize parts of the basic library being created for the former. This effort will be therefore to produce the first implementation in OOBasic as much as a prototype as anything.

Much of the information needed to handle this function is identical to that required for any export. It will mimic the filing naming scheme used by the CSV export. The two export types will share a common export definition manager.

The XML export should support a number of different modes.

At it simplest it would generate a file containing a sequence of XML tags and values mirroring the data format in an CSV file. This would be a type of flat data dump in other words.

Another mode would be a simple group oriented nested output. This might be simply an object named <ExportObjectName><Row Name><seq>Columns</seq></RowName></ExportObjectName> format.

Finally a mode using a mapping file to represent more complex data structures. One possibility of a format for this would be an XSD file.

Existing Tools

For the first pass at this some existing, very simple, routines will be employed. These can be found at the Code Snippet list on OOoForum. http://www.oooforum.org/forum/viewtopic.phtml?t=22956

There is nothing fancy about these routines, mostly just string manipulation stuff. With one exception the use of a Basic structure to map columns to XML data members. This mapping scheme is just expansive enough to handle simple hierarchical data structures. But nothing close to what can be done with XSD dictionary. However, a simple map may be all that is necessary for the vast majority of exporting needs.

The major goal of the first pass will be to have a simple way for the user to create these map files for a particular export.

An interface needs to be designed ( function ) that would allow this map to be handed to the Export Manager being built for storage of this map file - allowing it to be used again later.

A GUI

My initial plan was to build a dialog box with buttons to move columns into either elements or attributes of elements. But even then I am still left with generating some representation of this map that can be stored.

SQL/XML

Another option would be to build a parser in OOBase, or find one in Java perhaps that can be adapted for use here, for the SQL/XML support added to SQL 2003. In the long run this would be the best approach I would think.

Really there are only a few commands that need to be recognized, the publishing functions:

  • xmlelement() Creates an XML element, allowing the name to be specified.
  • xmlattributes() Creates XML attributes from columns, using the name of each column as the name of the corresponding attribute.
  • xmlroot() Creates the root node of an XML document.
  • xmlcomment() Creates an XML comment.
  • xmlpi() Creates an XML processing instruction.
  • xmlparse() Parses a string as XML and returns the resulting XML structure.
  • xmlforest() Creates XML elements from columns, using the name of each column as the name of the corresponding element.
  • xmlconcat() Combines a list of individual XML values to create a single value containing an XML forest.
  • xmlagg() Combines a collection of rows, each containing a single XML value, to create a single value containing an XML forest.

Using the current OOoBase routnies one should be able to support a subset of these fairly quickly.

  • xmlelement()
  • xmlattributes()
  • xmlroot()
  • xmlforest()

The next step would be to add support for

  • xmlagg()

Support would be limited to producing XML. Meaning that you could not mix SQL and SQL/XML syntax and recieve a resultset with xml elements.

This statement therefor would not be accepted:

 SELECT e.id,
 XMLELEMENT (NAME "Emp",
 XMLELEMENT (NAME "name", e.FullName ),
 XMLELEMENT (NAME "dependents",
 (SELECT COUNT (*) FROM dependents d
 WHERE d.parent = e.id ) ) AS xvalue
 FROM employees e

This one would be:

 SELECT 
 XMLELEMENT (NAME "Emp",
 XMLATTRIBUTE( NAME "ID =", e.id ),
 XMLELEMENT (NAME "name", e.fullName ),
 XMLELEMENT (NAME "dependents",
 (SELECT COUNT (*) FROM dependents d
 WHERE d.parent = e.id ) ) AS xvalue
 FROM employees e
Personal tools