User:DrewJensen/XML export

From Apache OpenOffice Wiki
Jump to: navigation, search

CSV file formatted data is still commonly used and is arguably the first required format for any database export utility. However, XML is rapidly approaching a par as being the other first required data export format.

The default database , HSQLDB, as incorporated into Base currently has no XML generating capabilities

In parallel with the work on the CSV export I propose work on a simple XML export also. This effort will utilize parts of the basic library being created for the former. This effort will be therefore to produce the first implementation in OOBasic as much as a prototype as anything.

Much of the information needed to handle this function is identical to that required for any export. It will mimic the filing naming scheme used by the CSV export. The two export types will share a common export definition manager.

The XML export should support a number of different modes.

At it simplest it would generate a file containing a sequence of XML tags and values mirroring the data format in an CSV file. This would be a type of flat data dump in other words.

Another mode would be a simple group oriented nested output. This might be simply an object named <ExportObjectName><Row Name><seq>Columns</seq></RowName></ExportObjectName> format.

Finally a mode using a mapping file to represent more complex data structures. One possibility of a format for this would be an XSD file.

Existing Tools

For the first pass at this some existing, very simple, routines will be employed. These can be found at the Code Snippet list on OOoForum.

There is nothing fancy about these routines, mostly just string manipulation stuff. With one exception the use of a Basic structure to map columns to XML data members. This mapping scheme is just expansive enough to handle simple hierarchical data structures. But nothing close to what can be done with XSD dictionary. However, a simple map may be all that is necessary for the vast majority of exporting needs.

The major goal of the first pass will be to have a simple way for the user to create these map files for a particular export.

An interface needs to be designed ( function ) that would allow this map to be handed to the Export Manager being built for storage of this map file - allowing it to be used again later.


My initial plan was to build a dialog box with buttons to move columns into either elements or attributes of elements. But even then I am still left with generating some representation of this map that can be stored.


Another option would be to build a parser in OOBase, or find one in Java perhaps that can be adapted for use here, for the SQL/XML support added to SQL 2003. In the long run this would be the best approach I would think.

Really there are only a few commands that need to be recognized, the publishing functions:

  • xmlelement() Creates an XML element, allowing the name to be specified.
  • xmlattributes() Creates XML attributes from columns, using the name of each column as the name of the corresponding attribute.
  • xmlroot() Creates the root node of an XML document.
  • xmlcomment() Creates an XML comment.
  • xmlpi() Creates an XML processing instruction.
  • xmlparse() Parses a string as XML and returns the resulting XML structure.
  • xmlforest() Creates XML elements from columns, using the name of each column as the name of the corresponding element.
  • xmlconcat() Combines a list of individual XML values to create a single value containing an XML forest.
  • xmlagg() Combines a collection of rows, each containing a single XML value, to create a single value containing an XML forest.

Using the current OOoBase routnies one should be able to support a subset of these fairly quickly.

  • xmlelement()
  • xmlattributes()
  • xmlroot()
  • xmlforest()

The next step would be to add support for

  • xmlagg()

Support would be limited to producing XML. Meaning that you could not mix SQL and SQL/XML syntax and recieve a resultset with xml elements. Holding to the idea that this is an export function, and that the first support will be for disk based files thre woulld not be a resultset obeject to return.

This statement therefore would not be accepted:

 XMLELEMENT (NAME "name", e.FullName ),
 XMLELEMENT (NAME "dependents",
 (SELECT COUNT (*) FROM dependents d
 WHERE d.parent = ) ) AS xvalue
 FROM employees e

This one would be:

 XMLELEMENT (NAME "name", e.fullName ),
 XMLELEMENT (NAME "dependents",
 (SELECT COUNT (*) FROM dependents d
 WHERE d.parent = ) ) AS xvalue
 FROM employees e

Also, in keeping with the idea of simple first - like a prototype - the interface would be nothing more then a box to enter text in.

Once the simple - give me text - I give back a file, interface works it may be fun to play around with some ideas of what a GUI like the Query Designer would look like it where generating the SQL/XML syntax versus just SQL. That would have imediate payback for when Base is connected to a database engine that recognizes the syntax organically.

Designing the first interfaces

After thnking about this effort a bit further I think a combination of the first UI idea and support for the SQL/XML extensions would be the fastest approach - and offer the bang for the buck for some real usability.

I'll stick with the approach to conecntrate on exporting ResultSet objects. This means no attempt to fully understand an SQL/XML select statement. This will still allow the utility to export a Table object, a Query object and offer a way to build a function that could be called from a form - to export the resultSet of a DataForm object.




Personal tools